Subject: Process for changing producer partition assignment strategy


Well, you could always run it in an IDE and set a breakpoint where the
partition is computed to get some insight.

>  I guess another approach could be to generate a random uuid and use that for the message record key instead?

That is certainly possible. However, why not write a custom
`StreamPartitioner`? That would be the straightforward solution.
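
To make that concrete, here is a minimal sketch of a round-robin
partitioner. Some assumptions: the class name
`RoundRobinStreamPartitioner` is made up for illustration, and the
small interface at the top is only a local stand-in that mirrors the
method shape of `org.apache.kafka.streams.processor.StreamPartitioner`
in the 2.x API, so that the snippet compiles without Kafka on the
classpath. In a real project you would delete the stand-in and import
the Kafka interface instead.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Local stand-in (assumption): same method shape as Kafka Streams'
// StreamPartitioner in the 2.x API. Replace with the real import in a project.
interface StreamPartitioner<K, V> {
    Integer partition(String topic, K key, V value, int numPartitions);
}

// Hypothetical round-robin partitioner: ignores key and value and simply
// cycles through the available partitions.
class RoundRobinStreamPartitioner<K, V> implements StreamPartitioner<K, V> {
    // One counter shared by all topics this instance is used for.
    private final AtomicInteger counter = new AtomicInteger(0);

    @Override
    public Integer partition(String topic, K key, V value, int numPartitions) {
        // floorMod keeps the result non-negative even after the counter overflows.
        return Math.floorMod(counter.getAndIncrement(), numPartitions);
    }
}
```

With the real interface, an instance can be handed to `to()` via
`Produced.with(keySerde, valueSerde, partitioner)`. One caveat: records
with the same key no longer land in the same partition, so this is
probably not what you want on topics that feed a join.
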
-Matthias

On 11/15/19 2:12 AM, Mikkel Gadegaard wrote:
> Definitely not null keys. They are time based UUIDs. Basically the test set I’m running is a collection of articles stored in Cassandra and their key is the uuid generated when inserted there.
>
> I get the articles from the Bing API, and it's the same set that Bing returns in both cases (same number (67) of articles and same articles). So my theory was that the time-based UUIDs were so similar that the hash and modulo ended up being the same. But after reading your responses I’m back at just being puzzled. I guess another approach could be to generate a random uuid and use that for the message record key instead?
>
> Mikkel Gadegaard
>
>> On Nov 15, 2019, at 01:39, Matthias J. Sax <[EMAIL PROTECTED]> wrote:
>>
>> That is puzzling to me, too. Could it be that you have `null` keys for
>> the "new topic" you mentioned in your original email? For `null` keys,
>> the fallback would be round-robin.
>>
>> Or you just got lucky and the keys you write get distributed evenly "by
>> chance" -- in general, if the data is not skewed, hash partitioning
>> should result in a fairly even distribution, too.
>>
>> -Matthias
>>
>>> On 11/15/19 1:21 AM, Mikkel Gadegaard wrote:
>>> Well it definitely gives me something to move ahead with.
>>>
>>> I am however puzzled how I could observe a really even distribution over
>>> the partitions when specifying `PARTITIONER_CLASS_CONFIG`, whereas when I
>>> remove it the same set of test messages are written to only one partition.
>>>
>>> Thanks
>>> Mikkel
>>>
>>> --
>>>
>>>
>>> On Fri, Nov 15, 2019 at 12:22 AM Matthias J. Sax <[EMAIL PROTECTED]>
>>> wrote:
>>>
>>>> In Kafka Streams the producer config `PARTITIONER_CLASS_CONFIG` does not
>>>> take effect, because Kafka Streams computes and sets partition numbers
>>>> explicitly. Thus the producer never uses the partitioner to
>>>> compute a partition, but accepts whatever Kafka Streams specifies on
>>>> each `ProducerRecord`.
>>>>
>>>> If you want to change the partitioning strategy, you need to implement a
>>>> custom `StreamPartitioner` and pass it into the corresponding methods.
>>>> For example, `to()` or `through()`.
>>>>
>>>> Hope this helps.
>>>>
>>>>
>>>> -Matthias
>>>>
>>>> On 11/14/19 9:51 AM, Mikkel Gadegaard wrote:
>>>>> I've set up a POC using KafkaStreams with microservices consuming and
>>>>> producing from/to topics. In the beginning I hadn't thought about
>>>>> partition strategy, and so I was using the DefaultPartitioner for producer
>>>>> partition assignments. My messages have keys (I use these for
>>>>> forking/joining), and the keys are time based UUIDs, this causes some
>>>>> rather uneven distribution on my topics. I looked around google and
>>>>> stumbled on KIP-369 (Alternative Partitioner to Support "Always
>>>>> Round-Robin" Selection) and figured that would be what I needed, so since
>>>>> 2.4 isn't out yet I borrowed the class from the PR on github, added it to
>>>>> my project and added the property to my config, like so:
>>>>>
>>>>> streamProperties.put(ProducerConfig.PARTITIONER_CLASS_CONFIG,
>>>>> RoundRobinPartitioner.class.getCanonicalName());
>>>>>
>>>>>
>>>>> And the round robin strategy works on a newly added topic, spreading
>>>>> messages evenly over 4 partitions. But, and I'm finally getting to my
>>>>> question, it doesn't seem to have any effect on existing topics, in other
>>>>> words, it seems to be continuing to use the DefaultPartitioner for topics
>>>>> created before I added the RoundRobinPartitioner class to my
>>>>> project/properties.
>>>>>
>>>>> Is it me that just hasn't understood that it is impossible to change