I have been working on getting Accumulo running on the IBM JDK, in preparation for including Accumulo in an upcoming version of BigInsights (IBM's Hadoop distribution). I have come across a number of issues, for which I have made some local fixes in my own environment. Since I'm a newbie in Accumulo, I wanted to make sure that the approach I have taken for resolving these issues is aligned with the design intent of Accumulo.
Some of the issues are real defects, and some are instances in which the assumption that the Sun/Oracle JDK is the JVM in use is hard-coded into the source code.
I have grouped the issues into two sections - unit test failures and Sun-specific dependencies (though there is an overlap).
1. Unit test failures - these should run consistently no matter which OS, Java vendor/version, etc.
a. org.apache.accumulo.core.util.format.ShardedTableDistributionFormatterTest.testAggregate. This fails on the IBM JRE because the test asserts the order of elements in a HashMap. It consistently passes on Sun/Oracle and consistently fails on IBM. Proposal: change ShardedTableDistributionFormatter.countsByDay to a TreeMap.
c. Both org.apache.accumulo.core.security.crypto.CryptoTest and org.apache.accumulo.core.file.rfile.RFileTest have lots of failures due to calls to SecureRandom with the random number generator provider hard-coded as Sun. The IBM JRE has its own built-in RNG provider called IBMJCE. There are two issues: hard-coded calls to SecureRandom.getInstance(<algo>,"SUN"), and the default value in the Property class, which is also "SUN". Proposal: add a mechanism to override a default Property value through a System property, via a new annotator in the Property class. The only usage will be by Property.CRYPTO_SECURE_RNG_PROVIDER.
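To illustrate item (a): a HashMap gives no guarantee about iteration order, so output built by walking one can differ between JVM implementations, while a TreeMap always iterates in key order. The sketch below is a minimal stand-alone illustration, not the actual formatter code; the class name CountsByDayOrdering, the method formatCounts, and the sample day keys are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the formatter's per-day counts; not Accumulo code.
class CountsByDayOrdering {

    // Renders one "day<TAB>count" line per entry, in the map's iteration order.
    static List<String> formatCounts(Map<String, Integer> countsByDay) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : countsByDay.entrySet()) {
            lines.add(entry.getKey() + "\t" + entry.getValue());
        }
        return lines;
    }

    public static void main(String[] args) {
        // TreeMap iterates in ascending key order on every JVM,
        // so a test may safely assert on the order of the output.
        Map<String, Integer> sorted = new TreeMap<>();
        sorted.put("20140619", 3);
        sorted.put("20140617", 1);
        sorted.put("20140618", 2);
        System.out.println(formatCounts(sorted));

        // HashMap iteration order is unspecified and may differ by vendor.
        // Map.equals compares entries only, so this check ignores order.
        Map<String, Integer> unsorted = new HashMap<>(sorted);
        System.out.println(unsorted.equals(sorted));
    }
}
```

If the map type in the formatter has to stay a HashMap, the test can instead compare the entries without regard to order, as the Map.equals call in main does.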
2. Environment/Configuration
a. The generated configuration files contain references to GC params that are specific to the Sun JVM. In accumulo-env.sh, ACCUMULO_TSERVER_OPTS contains -XX:NewSize and -XX:MaxNewSize, and ACCUMULO_GENERAL_OPTS uses -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75.
b. In bin/accumulo, I get a ClassNotFoundException due to the specification of the JAXP document builder: -Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl. The Sun implementation of the document builder factory does not exist in the IBM JDK, so a ClassNotFoundException is thrown when running the accumulo script.
c. MiniAccumuloCluster - in MiniAccumuloClusterImpl, Sun-specific GC params are passed as params to the java process (similar to item a).
Single proposal for solving all three of the issues above: enhance bootstrap_config.sh to ask for the Java vendor. The selection will set the correct values for the GC params (they differ between IBM and Sun) and control the inclusion/omission of the JAXP setting. MiniAccumuloClusterImpl can read the same env variable that was set for the GC params, and use it in the exec command.
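As a rough sketch of the vendor switch proposed above, applied on the Java side (for example where MiniAccumuloClusterImpl builds its command line): the class VendorJvmOptions and its methods are hypothetical, the check assumes that the IBM JDK reports a java.vendor value containing "IBM", and -Xgcpolicy:gencon is used only as a placeholder for whichever IBM GC settings turn out to be the right equivalent of the CMS options.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper; not part of Accumulo.
class VendorJvmOptions {

    // Assumption: the IBM JDK reports a java.vendor string containing "IBM".
    static boolean isIbmJvm(String vendor) {
        return vendor != null && vendor.toUpperCase(Locale.ROOT).contains("IBM");
    }

    // GC flags per vendor. The non-IBM values are the ones quoted from
    // accumulo-env.sh; the IBM value is an illustrative placeholder.
    static List<String> gcOptions(String vendor) {
        if (isIbmJvm(vendor)) {
            return Collections.singletonList("-Xgcpolicy:gencon");
        }
        return Arrays.asList("-XX:+UseConcMarkSweepGC",
                             "-XX:CMSInitiatingOccupancyFraction=75");
    }

    // The com.sun JAXP factory class is absent from the IBM JDK, so the
    // setting is omitted there and the JDK default factory is used instead.
    static List<String> jaxpOptions(String vendor) {
        if (isIbmJvm(vendor)) {
            return Collections.emptyList();
        }
        return Collections.singletonList(
            "-Djavax.xml.parsers.DocumentBuilderFactory="
            + "com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl");
    }

    public static void main(String[] args) {
        String vendor = System.getProperty("java.vendor");
        List<String> options = new ArrayList<>(gcOptions(vendor));
        options.addAll(jaxpOptions(vendor));
        System.out.println(vendor + " -> " + options);
    }
}
```

Passing the vendor in as a parameter, rather than reading the system property inside each method, keeps both branches testable on a single JVM.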
So far, my work has been focused on getting unit tests working for all Java vendors in a clean manner. I have not yet run intensive testing of real clusters following these changes, and would be happy to get pointers to what else might need treatment.
I would also like to hear if these changes make sense, and if so, should I go ahead and create some JIRAs, and attach my patches for commit approval?
Looking forward to hearing feedback!
Regards, Hayden Marchant Software Architect IBM BigInsights, IBM
Most of the recommendations look okay to me. Since there are many changes to be made, I think you should go ahead and create a main JIRA with multiple subtasks addressing all the changes. I am almost sure that you would run into similar issues if you ran other Java-based NoSQL distributions, i.e. HBase/Cassandra, on the IBM JDK. I personally had surprises with API calls related to ordering in my application a long time ago. Your observations look reasonable to me.
Regards, Vicky On Thu, Jun 19, 2014 at 3:47 PM, Hayden Marchant <[EMAIL PROTECTED]> wrote:
Mike
On Thu, Jun 19, 2014 at 6:14 AM, Vicky Kak <[EMAIL PROTECTED]> wrote:
This is probably a real defect. We should not be asserting order on a HashMap. Another possible solution is to change the test to check for unordered elements - Hamcrest matchers may be useful here.
This might be https://issues.apache.org/jira/browse/ACCUMULO-2774
I'm not sure about adding new annotators to Property. However, the CryptoTest should be getting the value from the conf instead of hard-coding it. Then you can specify the correct value in accumulo-site.xml.
I think another part of the issue is in CryptoModuleFactory::fillParamsObjectFromStringMap, because it looks like it ignores the default setting.
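A minimal sketch of the idea of taking the RNG algorithm and provider from configuration rather than hard-coding "SUN". This is not the CryptoModuleFactory code: the class ConfiguredSecureRandom is hypothetical, the configuration is modelled as a plain string map, and the two key names are assumed for illustration.

```java
import java.security.NoSuchAlgorithmException;
import java.security.NoSuchProviderException;
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; not part of Accumulo.
class ConfiguredSecureRandom {

    // Assumed key names, used here only for illustration.
    static final String RNG_KEY = "crypto.secure.rng";
    static final String RNG_PROVIDER_KEY = "crypto.secure.rng.provider";

    // Builds a SecureRandom from the configured algorithm and provider,
    // falling back step by step when a value is missing or unavailable.
    static SecureRandom create(Map<String, String> conf) {
        String algorithm = conf.get(RNG_KEY);
        String provider = conf.get(RNG_PROVIDER_KEY);
        if (algorithm == null || algorithm.isEmpty()) {
            return new SecureRandom(); // JVM default algorithm and provider
        }
        if (provider != null && !provider.isEmpty()) {
            try {
                return SecureRandom.getInstance(algorithm, provider);
            } catch (NoSuchProviderException | NoSuchAlgorithmException e) {
                // e.g. "SUN" requested on a JVM that only ships IBMJCE:
                // fall through and let the JVM choose a provider.
            }
        }
        try {
            return SecureRandom.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            return new SecureRandom();
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(RNG_KEY, "SHA1PRNG");
        conf.put(RNG_PROVIDER_KEY, "SUN"); // may not exist on this JVM
        SecureRandom random = create(conf);
        System.out.println(random.getAlgorithm() + " from " + random.getProvider().getName());
    }
}
```

The silent fallback is convenient for unit tests that must pass on every vendor's JVM. A production code path may prefer to fail loudly when the configured provider is missing, so that a misconfiguration is not hidden.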
I don't know enough about the IBM JDK to comment on this part intelligently. Go ahead and generate a patch, and we can use that as a starting point for discussion.
Unit tests are a good first pass. Integration tests (mvn verify) are probably the minimum that you want on your continuous integration once you have things set up.
Accumulo also comes with a set of longer-running, cluster-based tests, since we know that there are some pieces too complex for unit tests to catch. Have a look in the test module for the Continuous Ingest test. Once you get to that point, we can help you set it up if the README is unclear.
Filing JIRAs is going to be the most straightforward path, yes.
<snip/>
Yup! I actually bumped this up to 1G already after I started seeing failures (again) from the ACCUMULO-2774 patch which set a 768M heap. Pull the upstream changes and feel free to submit something to address any problem you still have.
I'm a little hesitant to remove the CMS configuration (as it really helps). My first thought about how to address this is you can submit some example Accumulo configurations that work with IBM JDK or you can add something to the configuration template/script (conf/examples and conf/templates with bin/bootstrap_config.sh, respectively). I think you're on the right path.
Likewise. Looking forward to applying some patches!
Thanks for the comments. I'll take them into account, and will create the JIRAs.
I was not intending to remove the CMS options, but rather to include them only for the JVM in which they are relevant, and to include the equivalent for other JVMs (e.g. IBM) - all through bootstrap_config.sh.
Here's my newbie question: Should I be making this patch based on 1.6.1, or should I always be working against the 'master' branch, and then backport the fix(es) to any desired older version?
From: Josh Elser <[EMAIL PROTECTED]> To: [EMAIL PROTECTED], Date: 19/06/2014 06:43 PM Subject: Re: Running Accumulo on the IBM JVM
One correction: "... the oldest *active* branch." This would be 1.5.2-SNAPSHOT
I think some things, like the configuration generator, were only introduced in 1.6.0. Ultimately, you can choose what you'd like to target as long as the changes aren't invasive/breaking for the bugfix releases (1.5.2 and 1.6.1). On Jun 23, 2014 8:46 AM, "William Slacum" <[EMAIL PROTECTED]> wrote:
One thing to note from the guide: we generally only expect the contributors to make the initial patch. If the committer who handles pushing things runs into trouble merging forward, they may ask for an additional patch as assistance.
As Josh mentioned, the configuration builder was introduced in 1.6.0. If you'd like changes in the 1.5.x branch as well, you'll need to add a configuration example.
On Mon, Jun 23, 2014 at 5:00 AM, Hayden Marchant <[EMAIL PROTECTED]> wrote: