As a direct Apache software product consumer and sometimes contributor, I
also experienced firsthand the pain of the project splits. It was not
possible to build an installable release. It may have been many days or
weeks before that was cured by a re-merge. I gave up after burning too many
hours on it, went back to the 1.0 code base, and came back only after the
damage was repaired.
It's also frustrating that, after we have spent months preparing to
stabilize our next production deployment on the 2.0 branch, with the
expectation that it will be the new stable, we now hear (even if it is just
one person's proposal) that maybe 0.23 will be the new stable. 0.23 is quite
backwards in comparison and is missing all of the critical HA HDFS work.
This thread seems to be becoming a competition for which is the more
radical proposal to snatch defeat from the jaws of success.
These proposals seem to be made with a total lack of care for the end user.
From my point of view, things were going reasonably well until this
sudden turn into lunacy. I am positive this kind of
"foundation" / PMC / project / administrivia tinkering is what will
fragment or disband the Hadoop community of users and contributors, not
disagreements between committers. A Hadoop competitor couldn't be happier.
On Thu, Aug 30, 2012 at 1:12 PM, Konstantin Shvachko
> On Wed, Aug 29, 2012 at 4:54 PM, Mattmann, Chris A (388J)
> <[EMAIL PROTECTED]> wrote:
> > OK I lied and said I wouldn't reply :)
> Long thread. I just picked a random email of Chris's (as the initiator) to
> reply to.
> You are basically saying there's been a history of community problems
> in Hadoop project,
> and proposing a technical solution to split the project by replicating
> the source base under three new names,
> implying that this will solve the community problems we (the Hadoop
> community) are facing.
> I see several issues.
> 1. There are other ways to split the project.
> We essentially have a "natural" split of the project already in place.
> Hadoop 1, Hadoop 2, Hadoop 0.23, the Trunk
> are in a sense competing projects by themselves, with their own contributors
> and release cycles.
> 2. From a technical (not community) viewpoint, your "svn copy" is ugly,
> as it creates a lot of code duplication and will result in a
> maintenance nightmare and/or
> will require many man-months to fix. My point is that you cannot
> neglect "technical issues" when you solve community problems.
> 3. I am as skeptical as Todd that the community problems will be
> solved by simply TLP-ing the three projects.
> Two years ago Hadoop was in crisis as vendors were producing their own
> releases calling it Hadoop.
> I think this was solved, but "poor community behavior" and contentions
> remained, embrace them or not.
> 4. Having said the above, separating the projects seems reasonable.
> (See timing though)
> HDFS will inevitably have to inherit and maintain most of Common.
> I totally understand the frustration of people who just put a huge effort
> into merging
> the sources back under common root.
> 5. Timing is important.
> Waiting until Hadoop 2 is stable as Arun suggested earlier would
> probably be too long.
> Doing it next week, without discussing and solving the technical issues
> listed in the thread would be premature.
> I think Hadoop 0.23.3 release backed by Yahoo production has a
> potential to become
> the next stable version, letting the project move ahead off the
> four-year-old code base.
> We should help that happen first, and do necessary preparations for
> the split in the meantime.
Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)