Subject: Answers to recent questions on Hive on Spark

Hi Xuefu,

Thanks for the information. One simple question: is there any plan for when Hive on Spark can be used in a production environment?


From: Xuefu Zhang [mailto:[EMAIL PROTECTED]]
Sent: November 28, 2015 2:12
Subject: Answers to recent questions on Hive on Spark

Hi there,
There seems to be increasing interest in Hive on Spark from Hive users. I understand that a few questions and problems have been reported, and I can see some frustration at times. It's impossible for the Hive on Spark team to respond to every inquiry, even though we wish we could. However, there are a few items to note:
1. Hive on Spark is being tested as part of Precommit test.
2. Hive on Spark is supported in some distributions such as CDH.
3. I tried a couple of days ago with the latest master and branch-1, and both worked with my Spark 1.5 build.
Therefore, if you are facing a problem, it's likely due to your setup. Please refer to the wiki on how to set it up correctly. Nevertheless, I have a few suggestions:
1. Start simple. Try a CDH sandbox or distribution first and see it work in action before building your own. Comparing it with your setup may give you some clues.
2. Try with spark.master=local first, making sure that you have all the necessary dependent jars, and then move to your production setup. Please note that yarn-cluster is recommended and Mesos is not supported. I tried both yarn-cluster and local-cluster, and both worked for me.
3. Check logs beyond hive.log, such as the Spark log and the YARN log, to get more error messages.
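To illustrate step 2, here is a minimal sketch of the session-level settings for trying Hive on Spark in local mode from the Hive CLI; the exact property names and values here follow the Hive on Spark wiki, but your required spark.* settings may differ depending on your cluster:

```
-- sketch: run these in the Hive CLI before issuing queries
-- (assumes Hive was built/installed with Spark support and the
--  Spark assembly jar is on Hive's classpath)
set hive.execution.engine=spark;   -- switch from MapReduce to Spark
set spark.master=local;            -- start local; later switch to yarn-cluster
```

Once queries work with spark.master=local, change only the master (for example to yarn-cluster) so that any new failure can be attributed to the cluster setup rather than to Hive itself.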
When you report your problem, please provide as much info as possible, such as your platform, your builds, your configurations, and relevant logs so that others can reproduce.
Please note that we are not in a good position to answer questions about Spark itself, such as those about spark-shell. Not only is that beyond the scope of Hive on Spark, but the team may also not have the expertise to give you meaningful answers. One thing to emphasize: when you build your Spark jar, don't include Hive, as a version mismatch is very likely. Again, a distribution may have solved this problem for you, if you'd like to give one a try.
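As a sketch of the "don't include Hive" advice above: Spark 1.x pulls Hive classes into its assembly only when the -Phive profiles are activated, so a build for Hive on Spark simply leaves them out. The Hadoop profile below is an assumption; match it to your cluster:

```
# sketch: build Spark 1.5 from source WITHOUT bundling Hive classes
# (per the Spark 1.5 build docs; -Phadoop-2.6 is an example profile)
mvn -Pyarn -Phadoop-2.6 -DskipTests clean package

# do NOT add -Phive or -Phive-thriftserver here; bundling Hive classes
# into the Spark assembly is what causes the version-mismatch problems
```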
Hope this helps.