Subject: This MapR-DB Spark Connector with Secondary Indexes

First, as I understand it, MapR-DB is a proprietary (not open source) NoSQL
database that MapR offers. It is similar to HBase but is said to perform
better. There are some speculative statements such as the one below:
"MapR Data Platform offers significant advantages over any other tool on
the big data space. MapR-DB is one of the core components of the platform
and it offers state of the art capabilities that blow away most of the
NoSQL databases out there"

OK, Spark has connectors for HBase, Aerospike, MongoDB, etc., so no surprise
here. However, as I understand it, within MapR-DB one can create secondary
indexes, and Spark can take advantage of them when filtering to reduce the
amount of data loaded into the RDD.

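import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Columns to read: the row key (_id) and the column used in the filter (uid).
val schema = StructType(Seq(
  StructField("_id", StringType),
  StructField("uid", StringType)))

// loadFromMapRDB is provided by the MapR-DB connector (its import is not shown).
val data = sparkSession
  .loadFromMapRDB("/user/mapr/tables/data", schema)
  .filter("uid = '101'")

So apparently this load will be more efficient as long as a secondary
index has been created in MapR-DB on the filtering column.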
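
To make the claimed benefit concrete for myself, here is a toy sketch in plain Scala collections. It is not the MapR-DB or Spark API; the names Doc, IndexSketch, fullScan and indexLookup are made up for illustration, and the table contents are invented. It only shows why a lookup through an index on uid reads far fewer documents than a scan followed by a filter.

```scala
// Toy model only: plain Scala collections, NOT the MapR-DB or Spark API.
// All names and data here are hypothetical.
final case class Doc(id: String, uid: String)

object IndexSketch {
  // Invented table: 1000 documents, uid values cycle through 0..199.
  val table: Vector[Doc] =
    Vector.tabulate(1000)(i => Doc(s"row$i", (i % 200).toString))

  // Full scan: every document is read, then filtered on uid.
  // Returns the matches and the number of documents read.
  def fullScan(uid: String): (Vector[Doc], Int) = {
    val hits = table.filter(_.uid == uid)
    (hits, table.size)
  }

  // Stand-in for a secondary index on uid: built once, maps uid -> documents.
  // (A real index would store row keys rather than whole documents.)
  val uidIndex: Map[String, Vector[Doc]] = table.groupBy(_.uid)

  // Index lookup: only the matching documents are read.
  def indexLookup(uid: String): (Vector[Doc], Int) = {
    val hits = uidIndex.getOrElse(uid, Vector.empty)
    (hits, hits.size)
  }
}
```

Both functions return the same documents for a given uid; the difference is only in how many documents have to be read to find them, which is the saving the connector is supposed to get by pushing the filter down to the index.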

Also see this doc

Sounds like MapR-DB tries to be a third-party version of HBase and in some
way mimics HDFS as well. I just don't understand when one can use Apache
Phoenix with secondary indexes on HBase that provide a relational view of

Has anyone used this product?

There is some reference here as well

Dr Mich Talebzadeh

*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.