In most distributed systems, the data is over-sharded. Helix seems to take
this as an assumption.
Is there any way to use Helix to manage splitting the shards, for data
I am trying to fit jump consistent hash into the Helix model.
Basically, jump hash can change from N shards to M shards, where N and M
can be any positive integers. When growing (M > N), (M-N)/M of the data on
the N existing shards is moved to the M-N new shards.
https://arxiv.org/abs/1406.2294
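
To make the N-to-M behaviour concrete, here is a minimal Python sketch of the
function from the paper linked above. The names jump_hash and moved_fraction
are my own, and keys are assumed to be 64-bit integers (hash your real keys
to 64 bits first). It is only an illustration, not anything Helix provides.

```python
MASK64 = (1 << 64) - 1  # emulate unsigned 64-bit wraparound


def jump_hash(key: int, num_shards: int) -> int:
    """Map a 64-bit integer key to a shard index in [0, num_shards)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    key &= MASK64
    b, j = -1, 0
    while j < num_shards:
        b = j
        # Linear congruential step, as in the paper's reference code.
        key = (key * 2862933555777941757 + 1) & MASK64
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b


def moved_fraction(keys, n: int, m: int) -> float:
    """Fraction of keys whose shard changes when going from n to m shards."""
    moved = sum(1 for k in keys if jump_hash(k, n) != jump_hash(k, m))
    return moved / len(keys)
```

When growing from N to M, a key either stays on its old shard or lands on
one of the new shards N..M-1; it never moves between two old shards. For
well-distributed keys, moved_fraction(keys, N, M) should come out close to
(M-N)/M.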
Jump hash needs an atomic operation to switch to the new
topology. When growing, the data queries still go to the existing shards,
until the new shards are ready. So the new servers can prepare data as fast
as possible, rather than having to throttle the data preparation.
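
The switch I have in mind could be sketched roughly as below. This is a
hypothetical router, not a Helix API: TopologyRouter, begin_grow, mark_ready
and route are names I made up, and hash_fn stands for any function that maps
(key, shard_count) to a shard, such as jump hash.

```python
import threading
from typing import Callable, Set


class TopologyRouter:
    """Hypothetical router: queries use the old shard count until every
    new shard has reported ready, then switch in one step."""

    def __init__(self, num_shards: int, hash_fn: Callable[[int, int], int]):
        self._lock = threading.Lock()
        self._hash_fn = hash_fn
        self._active = num_shards        # shard count used for queries
        self._target = num_shards        # shard count being prepared
        self._pending: Set[int] = set()  # new shards still loading data

    @property
    def active_shards(self) -> int:
        with self._lock:
            return self._active

    def begin_grow(self, new_num_shards: int) -> None:
        with self._lock:
            if self._pending:
                raise RuntimeError("a resize is already in progress")
            if new_num_shards <= self._active:
                raise ValueError("only growth is sketched here")
            self._target = new_num_shards
            self._pending = set(range(self._active, new_num_shards))

    def mark_ready(self, shard: int) -> None:
        with self._lock:
            self._pending.discard(shard)
            if not self._pending:
                # The atomic switch: one assignment under the lock.
                self._active = self._target

    def route(self, key: int) -> int:
        with self._lock:
            n = self._active
        return self._hash_fn(key, n)
```

For example, with a stand-in hash such as lambda key, n: key % n (not jump
hash, just for trying the router out), route() keeps returning shards from
the old topology after begin_grow(6) and only starts using shards 4 and 5
once both have called mark_ready.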