The goal of this email is to summarize and unify the discussion started across several email threads (Storm 2.0 Roadmap<http://search-hadoop.com/?project=Storm&q=%22%5BDISCUSS%5D+Storm+2.0+Roadmap%22>
, 1.1.1 Release Planning<http://search-hadoop.com/m/Storm/8gnYyGagLDWv1qG?subj=Release+Planning+for+1+1+1+and+others+>
, Lag Issues<http://search-hadoop.com/m/Storm/8gnYyLmjIjYr692?subj=Lag+issues+using+Storm+1+1+1+latest+build+with+StormKafkaClient+1+1+1+vs+old+StormKafka+spouts>
) concerning the maintenance, branch support, and eventual deprecation of storm-kafka and storm-kafka-client.
An earlier discussion<http://search-hadoop.com/?project=Storm&q=%22%5BDISCUSS%5D+Storm+2.0+Roadmap%22>
proposed deprecating storm-kafka in favor of storm-kafka-client. To clarify, the idea is not to eliminate storm-kafka completely, but to keep supporting it in the 1.x branch while removing it from master (i.e. Storm 2.0 onwards). storm-kafka-client would then become the only Storm Kafka option from Storm 2.0 onwards, provided that we have enough confidence in its stability by the time of the Storm 2.0 release.
The main reason for this proposal is the fact that the Kafka community agreed<https://cwiki.apache.org/confluence/display/KAFKA/KIP-109:+Old+Consumer+Deprecation>
to deprecate the old consumer APIs starting in version 0.10.2, and will remove them in the next major version (0.12). This implies that storm-kafka will not work with Kafka 0.12 onwards. Important features missing from the old Kafka consumer are security, the new message format, and fetching offsets by timestamp (KIP-79).
In earlier discussions the Storm community has raised concerns about the performance and stability of storm-kafka-client. Those concerns are valid and were mirrored by the Kafka community in their early deprecation discussions. I agree with what was said in the Kafka discussion<http://search-hadoop.com/m/Kafka/uyzND1e4bUP1Rjq721>:
storm-kafka-client has bugs, but so does storm-kafka, and all the development is currently going into storm-kafka-client, a trend that will only strengthen as Kafka discontinues the old consumer APIs. The only way to stabilize a complex component such as storm-kafka-client is to test it extensively in all its variants, and that inevitably requires users running it. Furthermore, removing storm-kafka from Storm 2.0 does not prevent users from still referring to storm-kafka version 1.x in their topologies.
I did a quick analysis of the JIRA issues for storm-kafka and storm-kafka-client. As of July 11, 2017 there are 22 open or in-progress bugs for storm-kafka (1 blocker) and 15 for storm-kafka-client.
The recent refactoring around manual partition assignment should solve a lot of the edge-case bugs that occurred during rebalance. There are also a few open pull requests for Trident and for fixing some internal state details such as maxUncommittedOffsets, topic compaction, etc. Nevertheless, several areas still need to be addressed to stabilize and improve storm-kafka-client. Similarly to what was done for Storm SQL, I suggest that we create a wiki page where we can centralize action items such as:
Features / Stability
* Memory Footprint
* Retrial Mechanism
* Exactly once and at least once guarantees
* Kafka Lag
* Spout Internals (e.g. maxUncommittedOffsets, ack, emitted, failed, ...)
* Autocommit mode
* Run performance benchmarks
* Test for exactly once in non-failure scenarios (e.g. activate/deactivate)
* Test for at least once in failure scenarios
* Test Trident guarantees
* Identify unit test coverage and find a modular way to continually add new tests
* Pull request<https://github.com/apache/storm/pull/2174>
* Investigate gaps in the API between storm-kafka and storm-kafka-client.
* Can we discontinue the old API?
* Check for accuracy and completeness of documentation
* Make clean code snippets with examples available
 - The data was extracted from JIRA on 07/11/2017. The storm-kafka-client JIRAs were checked for a correct component label and had their status updated. None of that was done for the storm-kafka JIRAs, so some of those issues marked as open may already have been fixed. The results and charts can be found here: