Your questions are too broad, and some of them are impossible to
answer.
Q. "Which is faster, X or Y?"
A. "That depends on countless variables and cannot be answered in general."

For example, even databases that are very similar in nature, like
MySQL and PostgreSQL, might execute the same query differently depending on
their query planners or even the characteristics of the data.
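
To make that concrete, here is a small sketch of how one query can get two different plans. It uses SQLite from the Python standard library purely as a stand-in (it is not MySQL, PostgreSQL, Hive, or Vertica), and the table name, index name, and data are made up for illustration. The only thing that changes between the two plans is whether an index exists.

```python
import sqlite3

def plan(conn, sql):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row is the human-readable plan step.
    return " | ".join(row[-1] for row in rows)

# Hypothetical table and data, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER)")
conn.executemany("INSERT INTO events (user_id) VALUES (?)",
                 [(i % 100,) for i in range(10_000)])

query = "SELECT COUNT(*) FROM events WHERE user_id = 7"

plan_before = plan(conn, query)   # no index yet: the planner scans the table
conn.execute("CREATE INDEX idx_events_user ON events (user_id)")
plan_after = plan(conn, query)    # index available: the planner searches it

print("before:", plan_before)
print("after: ", plan_after)
```

Same query text, different execution strategy. Other engines have their own EXPLAIN output and their own planner behavior, which is why a general "X is faster than Y" rarely holds.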

How can you show that a query is "faster than Vertica" if you do not have
access to Vertica to prove it?

I understand some of what you are trying to determine, but you should
really attempt to install these things and build a prototype to determine
what is the best fit for your application. This will grow your
understanding of the systems, help you ask better questions, and
potentially give you the ability to answer those questions yourself and
make better decisions.

The right way to ask this question might be: "Hello, I have loaded 50 million
rows of data into Hive and I am running this query: 'select X from bla
bla'. My Vertica instance runs this query in X seconds and Hive runs it
in Y seconds. Can this be optimized further?"
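
Getting numbers like those does not take much. Below is a rough sketch of a timing harness, again using SQLite only as a stand-in so that it runs anywhere; the helper name time_query, the table, and the query are my own placeholders. For a real comparison you would point the same loop at your Hive and Vertica connections and load your own data.

```python
import sqlite3
import statistics
import time

def time_query(conn, sql, runs=5):
    """Run `sql` `runs` times and return the median wall-clock seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        conn.execute(sql).fetchall()   # fetch all rows so the work is really done
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)

# Stand-in data set; a real test would use your own rows and your own engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t (x) VALUES (?)", [(i,) for i in range(50_000)])

sql = "SELECT x FROM t WHERE x % 2 = 0"
row_count = len(conn.execute(sql).fetchall())
seconds = time_query(conn, sql)
print(f"{row_count} rows, median {seconds:.4f} s over 5 runs")
```

Taking the median of several runs smooths out one-off effects such as a cold cache. Posting the query, the row count, and timings like these gives people on the list something concrete to work with.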

The software license for Impala is included here:

On Sat, Jan 3, 2015 at 3:29 PM, Shashidhar Rao <[EMAIL PROTECTED]>