Subject: Reading JSON RDD in Spark Streaming


I have prices coming through Kafka in the following format

key,{JSON data}

The key is needed as part of the data posted to a NoSQL database such as Aerospike.

The following is a record from the Kafka topic:

"timeissued":"2019-06-18T22:10:26", "price":555.75}

The "key":"value" pairs inside {} are valid JSON, as confirmed by JSONLint:

 "rowkey": "ba7e6bdc-2a92-4dc3-8e28-a75e1a7d58f2",
 "ticker": "SBRY",
 "timeissued": "2019-06-18T22:10:26",
 "price": 555.75

Now I need to extract values from this JSON.

One way would be to go through the DStream and process each RDD:

    { pricesRDD =>
      if (!pricesRDD.isEmpty)  // data exists in RDD
         for(row <- pricesRDD.collect.toArray)

The results are hit and miss, as the sample below shows, with some fields parsed incorrectly:
"timeissued":"2019-06-18T22:10:26", "price":555.75})
"SBRY"  // correct
"2019-06-18T22  // missing half
555.75}  // incorrect

Is there a way to read JSON data systematically?
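
For reference, the kind of approach I have in mind is sketched below. The truncated timestamp in the sample above looks like the result of splitting on ":" and ",", so the idea is to split each line at the first comma only and hand the rest to a JSON parser. This is only a sketch: it assumes the json4s library that ships with Spark is on the classpath, the names PriceParser and parseLine are made up for illustration, and the sample record in the usage note is assembled from the fields shown above rather than copied from the topic.

```scala
import org.json4s._
import org.json4s.jackson.JsonMethods.parse
import scala.util.Try

// Hypothetical helper, not part of Spark or of my current code.
object PriceParser {
  implicit val formats: Formats = DefaultFormats

  // Input is assumed to look like: key,{"rowkey":..., "ticker":..., ...}
  // Returns (key, ticker, timeissued, price), or None if the line is malformed.
  def parseLine(line: String): Option[(String, String, String, Double)] = {
    val idx = line.indexOf(',')            // first comma separates key from JSON
    if (idx < 0) None
    else {
      val key  = line.substring(0, idx).trim
      val body = line.substring(idx + 1).trim
      Try {
        val js = parse(body)               // real JSON parsing, no manual splitting
        val ticker     = (js \ "ticker").extract[String]
        val timeissued = (js \ "timeissued").extract[String]
        val price      = (js \ "price").extract[Double]
        (key, ticker, timeissued, price)
      }.toOption                           // missing fields or bad JSON give None
    }
  }
}
```

If pricesRDD holds the raw lines as strings (an assumption on my part), then something like pricesRDD.flatMap(line => PriceParser.parseLine(line)) could replace the manual loop, and the key would remain available for the write to Aerospike.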


Dr Mich Talebzadeh

*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.