Harsha, can you comment on this alternative approach: instead of fetching
directly from remote storage via a new API, implement something like
paging, where segments are paged in and out of cold storage based on access
frequency/recency? For example, when a remote segment is accessed, it could
first be fetched to disk and then read from there. I suppose this would
require fewer code changes, or at least fewer API changes.
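
To make the paging idea a bit more concrete, here is a rough sketch in Java of
what "fetch to disk, then read from there" could look like. RemoteStore and
PagedSegmentReader are hypothetical names I made up for illustration; they are
not part of the proposal or of any existing Kafka API, and the sketch ignores
concurrency, indexes and partial reads.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical stand-in for cold storage; not an existing Kafka interface. */
interface RemoteStore {
    byte[] fetch(String segmentName) throws IOException;
}

/** Pages a remote segment onto local disk on first access, then serves reads from disk. */
class PagedSegmentReader {
    private final Path localDir;
    private final RemoteStore remote;
    int remoteFetches = 0; // counter kept only to make the behavior observable

    PagedSegmentReader(Path localDir, RemoteStore remote) {
        this.localDir = localDir;
        this.remote = remote;
    }

    /** Not thread-safe: a real implementation would need per-segment locking. */
    byte[] read(String segmentName) throws IOException {
        Path local = localDir.resolve(segmentName);
        if (!Files.exists(local)) {
            // Page in: write to a temp file first so a reader never sees a partial segment.
            Path tmp = Files.createTempFile(localDir, segmentName, ".tmp");
            Files.write(tmp, remote.fetch(segmentName));
            Files.move(tmp, local, StandardCopyOption.ATOMIC_MOVE);
            remoteFetches++;
        }
        // From here on the segment is read like any other local file.
        return Files.readAllBytes(local);
    }
}
```

The point of the sketch is that the read path after the page-in is the ordinary
local one, which is why I suspect the API surface would stay smaller.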
And related to paging, does the proposal address what happens when a broker
runs out of local disk space? Maybe we should have a way to configure a max number
of segments or bytes stored on each broker, after which older or
least-recently-used segments are kicked out, even if they aren't expired
per the retention policy? Otherwise, I suppose tiered storage requires some
babysitting to ensure that brokers don't run out of local storage, despite
having access to potentially unbounded cold storage.
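
As a rough illustration of the byte cap, here is a minimal Java sketch of
least-recently-used eviction over locally held segments. LocalSegmentCache and
its maxBytes parameter are hypothetical names, not an existing or proposed
broker config, and a real version would also have to delete the files and
respect the retention policy.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Tracks locally held segments and evicts least-recently-used ones past a byte cap. */
class LocalSegmentCache {
    private final long maxBytes; // hypothetical per-broker limit
    private long usedBytes = 0;
    // accessOrder=true: iteration runs from least to most recently used.
    private final LinkedHashMap<String, Long> sizes = new LinkedHashMap<>(16, 0.75f, true);

    LocalSegmentCache(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Records a read of a segment, making it the most recently used. */
    void touch(String segment) {
        sizes.get(segment);
    }

    /** Registers a segment and returns the names evicted to get back under the cap. */
    List<String> add(String segment, long bytes) {
        Long old = sizes.put(segment, bytes);
        usedBytes += bytes - (old == null ? 0 : old);
        List<String> evicted = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = sizes.entrySet().iterator();
        // Never evict the last remaining entry, which is the segment just added.
        while (usedBytes > maxBytes && sizes.size() > 1 && it.hasNext()) {
            Map.Entry<String, Long> lru = it.next();
            usedBytes -= lru.getValue();
            evicted.add(lru.getKey());
            it.remove();
        }
        return evicted;
    }

    long usedBytes() {
        return usedBytes;
    }
}
```

With a cap of 100 bytes, adding three 40-byte segments a, b, c and reading a
before adding c would evict b, since it is the least recently used.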

Just some things to add to Alternatives Considered :)

On Wed, Apr 3, 2019 at 8:21 AM Viktor Somogyi-Vass <[EMAIL PROTECTED]>