Oh boy, this is an interesting pickle.
For *last-access-timestamp*, I think only *event-time-of-current-record* makes sense. I’m looking at this from a GDPR/regulatory compliance perspective. If you update state, say by storing the event you just received, you want to use the exact timestamp of that event to compute its expiration. Both *max-timestamp-of-data-seen-so-far* and *last-watermark* suffer from problems in edge cases: if the timestamp of an event you receive is quite a bit earlier than other timestamps that we have seen so far (i.e. the event is late), we would artificially lengthen the TTL of that event (which is stored in state) and would therefore break regulatory requirements. Always using the timestamp of the event itself doesn’t suffer from that problem.
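To make the late-event case concrete, here is a rough sketch (not tied to any real API) that computes the expiry deadline of each stored event under the three candidate definitions. The names `Event`, `expiry_deadlines`, and `watermark_lag_ms` are made up for illustration, and the watermark is simplified to "max timestamp seen so far minus a fixed lag", which is only one possible watermark strategy.

```python
from dataclasses import dataclass


@dataclass
class Event:
    payload: str
    event_time_ms: int


def expiry_deadlines(events, ttl_ms, watermark_lag_ms=0):
    """For each event, return the expiry deadline under each candidate
    definition of last-access-timestamp (all names here are hypothetical)."""
    max_seen = float("-inf")
    rows = []
    for ev in events:
        # Track the largest event-time timestamp observed so far.
        max_seen = max(max_seen, ev.event_time_ms)
        # Assumption: watermark trails the max timestamp by a fixed lag.
        watermark = max_seen - watermark_lag_ms
        rows.append({
            "payload": ev.payload,
            "event_time": ev.event_time_ms + ttl_ms,
            "max_seen": max_seen + ttl_ms,
            "last_watermark": watermark + ttl_ms,
        })
    return rows


# An on-time event followed by a late one (8 seconds behind).
events = [Event("on-time", 10_000), Event("late", 2_000)]
rows = expiry_deadlines(events, ttl_ms=1_000, watermark_lag_ms=500)
late = rows[1]
# late["event_time"] is 3_000, late["max_seen"] is 11_000,
# late["last_watermark"] is 10_500.
```

With a TTL of 1 second, the late event should be gone at 3_000 in event time, but the other two definitions keep it around until 11_000 and 10_500 respectively, which is exactly the artificial lengthening described above.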
For *expiration-check-time*, both *last-watermark* and *current-processing-time* could make sense but I’m leaning towards *processing-time*. The reason is again the GDPR/compliance view: if we have an old savepoint with data that should have been expired by now but we re-process it with *last-watermark* expiration, this means that we will get to “see” that state even though we shouldn’t be allowed to. If we use *current-processing-time* for expiration, we wouldn’t have that problem because that old data (old according to its event-time timestamp) would be properly cleaned up and access would be prevented.
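The savepoint scenario can be sketched the same way. This is only an illustration under stated assumptions: `visible_entries` and the state layout (key mapped to a `(value, last_access_ms)` pair) are hypothetical, and event-time timestamps are assumed to be epoch milliseconds so that they are comparable to the wall clock.

```python
def visible_entries(state, ttl_ms, check_time_ms):
    """Return the entries that have not yet expired at check_time_ms.

    state maps key -> (value, last_access_ms), where last_access_ms is the
    event-time timestamp of the record that last touched the entry.
    """
    return {
        key: value
        for key, (value, last_access_ms) in state.items()
        if last_access_ms + ttl_ms > check_time_ms
    }


# State as restored from an old savepoint (hypothetical values).
restored_state = {"user-42": ("some personal data", 4_500)}
ttl_ms = 1_000

# The watermark is restored along with the savepoint, so it is still "old".
restored_watermark_ms = 5_000
# The wall clock has moved on a lot since the savepoint was taken.
current_processing_time_ms = 1_000_000

seen_with_watermark = visible_entries(restored_state, ttl_ms, restored_watermark_ms)
seen_with_proc_time = visible_entries(restored_state, ttl_ms, current_processing_time_ms)
# seen_with_watermark still contains "user-42"; seen_with_proc_time is empty.
```

Checking against the restored watermark lets the job read the entry again, while checking against the current processing time hides it immediately after restore.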
To sum up:
last-access-timestamp: event-time of event
expiration-check-time: current processing-time
What do you think?