1: "SQL constructs inside hive" <--use jdbc driver "describe table" read
2: "use thrift"
3: WebHCat https://cwiki.apache.org/confluence/display/Hive/WebHCat+InstallWebHCat#WebHCatInstallWebHCat-WebHCatInstalledwithHive
4: Just go to the MySQL db that backs the metastore and query it directly
That gives you 4 ways to get at Hive's metadata.
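Option 4 can be sketched with plain JDBC against the backing database. This is a rough sketch under assumptions, not a supported interface: the table and column names (DBS, TBLS, SDS, COLUMNS_V2) are the metastore's internal schema and may differ between Hive versions, the class name MetastoreColumns is made up for illustration, and the JDBC URL, user and password are placeholders passed on the command line. A MySQL JDBC driver would need to be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists a table's columns by querying the metastore db directly.
class MetastoreColumns {
    // Assumed internal metastore schema; not a stable API and may change per Hive version.
    // Partition columns are stored separately and are not returned by this query.
    static final String COLUMNS_SQL =
          "SELECT c.COLUMN_NAME, c.TYPE_NAME "
        + "FROM DBS d "
        + "JOIN TBLS t ON t.DB_ID = d.DB_ID "
        + "JOIN SDS s ON s.SD_ID = t.SD_ID "
        + "JOIN COLUMNS_V2 c ON c.CD_ID = s.CD_ID "
        + "WHERE d.NAME = ? AND t.TBL_NAME = ? "
        + "ORDER BY c.INTEGER_IDX";

    // Returns {columnName, typeName} pairs in declaration order.
    static List<String[]> columns(Connection conn, String db, String table) throws SQLException {
        List<String[]> out = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(COLUMNS_SQL)) {
            ps.setString(1, db);
            ps.setString(2, table);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    out.add(new String[] { rs.getString(1), rs.getString(2) });
                }
            }
        }
        return out;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length != 5) {
            System.err.println("usage: MetastoreColumns <jdbcUrl> <user> <password> <db> <table>");
            return;
        }
        // All connection details are placeholders supplied by the caller.
        try (Connection conn = DriverManager.getConnection(args[0], args[1], args[2])) {
            for (String[] col : columns(conn, args[3], args[4])) {
                System.out.println(col[0] + "\t" + col[1]);
            }
        }
    }
}
```

The trade-off is that this needs only a JDBC driver and no Hive jars, but it bypasses the metastore service entirely, so it is best kept read-only.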
>> "since backwards compatibility is... well let's just say lacking"
Welcome to open source software. Or all software in general really.
All I am getting at is that there are 4 ways right there to get at the metadata.
>>"but how easy is it to do this with a secure hadoop/hive ecosystem? now i
need to handle kerberos myself and somehow pass tokens into thrift i assume?"
Frankly I do not give a crud about the "secure bla bla" but I have seen
several tickets on thrift/sasl so I assume someone does.
My only point was that Hive seems to give 4 ways to get at the metadata, which
is better than, say, MySQL or Vertica, which only really give you the option
to do #1 over JDBC.
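For comparison, option 1 over JDBC might look roughly like the sketch below. It assumes a HiveServer2 endpoint reachable at a URL of the form jdbc:hive2://host:10000/default with the Hive JDBC driver on the classpath; the class name HiveDescribe and the identifier check are illustrative choices, not part of Hive. The table name cannot be bound as a parameter in DESCRIBE, so the sketch validates it before building the statement.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper: reads column names and types via "DESCRIBE db.table".
class HiveDescribe {
    // Conservative identifier check (an assumption; Hive may accept more).
    private static final Pattern IDENT = Pattern.compile("[A-Za-z0-9_]+");

    static String describeSql(String db, String table) {
        if (!IDENT.matcher(db).matches() || !IDENT.matcher(table).matches()) {
            throw new IllegalArgumentException("unexpected identifier: " + db + "." + table);
        }
        return "DESCRIBE " + db + "." + table;
    }

    // Returns column name -> type, in the order the server reports them.
    static Map<String, String> describe(Connection conn, String db, String table)
            throws SQLException {
        Map<String, String> cols = new LinkedHashMap<>();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(describeSql(db, table))) {
            while (rs.next()) {
                String name = rs.getString(1);
                // For partitioned tables the output continues with a blank row and
                // "# ..." header rows; stop at the first of those.
                if (name == null || name.trim().isEmpty() || name.startsWith("#")) {
                    break;
                }
                cols.put(name.trim(), rs.getString(2));
            }
        }
        return cols;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length != 5) {
            System.err.println("usage: HiveDescribe <jdbcUrl> <user> <password> <db> <table>");
            return;
        }
        // URL, user and password are placeholders supplied by the caller.
        try (Connection conn = DriverManager.getConnection(args[0], args[1], args[2])) {
            describe(conn, args[3], args[4])
                .forEach((name, type) -> System.out.println(name + "\t" + type));
        }
    }
}
```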
Hive actually works with avro formats where it can read the schema from the
other than pointing your "table" at a folder the metadata is magic. Which
is what you are basically describing.
So again it depends on your definition of easily accessible. But the fact
that I have a thrift API which I can use to walk through the tables in a
database seems more accessible than many other databases I am aware of.
On Sat, Jan 31, 2015 at 2:38 PM, Koert Kuipers <[EMAIL PROTECTED]> wrote:
> i would not call "SQL constructs inside hive" accessible for other
> systems. it's inside hive after all
> it is true that i can contact the metastore in java using
> HiveMetaStoreClient, but then i need to bring in a whole slew of
> dependencies (the minimum seems to be hive-metastore, hive-common,
> hive-shims, libfb303, libthrift and a few hadoop dependencies, by trial and
> error). these jars need to be "provided" and added to the classpath on the
> cluster, unless someone is willing to build versions of an application for
> every hive version out there. and even when you do all this you can only
> pray it's going to be compatible with the next hive version, since backwards
> compatibility is... well let's just say lacking. the attitude seems to be
> that hive does not have a java api, so there is nothing that needs to be
> you are right i could go the pure thrift road. i haven't tried that yet.
> that might just be the best option. but how easy is it to do this with a
> secure hadoop/hive ecosystem? now i need to handle kerberos myself and
> somehow pass tokens into thrift i assume?
> contrast all of this with an avro file on hadoop with metadata baked in,
> and i think it's safe to say hive metadata is not easily accessible.
> i will take a look at your book. i hope it has an example of using thrift
> on a secure cluster to contact hive metastore (without using the
> HiveMetaStoreClient), that would be awesome.
> On Sat, Jan 31, 2015 at 1:32 PM, Edward Capriolo <[EMAIL PROTECTED]>
>> "with the metadata in a special metadata store (not on hdfs), and its not
>> as easy for all systems to access hive metadata." I disagree.
>> Hive's metadata is not only accessible through the SQL constructs like
>> "describe table". But the entire meta-store also is actually a thrift
>> service so you have programmatic access to determine things like what
>> columns are in a table etc. Thrift creates RPC clients for almost every
>> major language.
>> In the Programming Hive book
>> there are even examples where I show how to iterate over all the tables inside
>> the database from a java client.
>> On Sat, Jan 31, 2015 at 11:05 AM, Koert Kuipers <[EMAIL PROTECTED]>
>>> yes you can run whatever you like with the data in hdfs. keep in mind