Subject: [Beginner] How to save Kafka DStream data to Parquet?


There is no good way to save DStream data to Parquet without risking
downstream consistency issues.
You could use foreachRDD to get each RDD, convert it to a DataFrame/Dataset,
and write it out as Parquet files. But you will later run into issues such as
partial files left behind by failures.
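
For reference, here is a rough PySpark sketch of the foreachRDD approach
described above. It is only an illustration under some assumptions: `stream`
is a DStream of (key, value) string pairs that you have already created from
Kafka, `spark` is an existing SparkSession, and the function names, column
names, and output path are made up for the example. It does nothing to address
the partial-file problem.

```python
def save_batch_as_parquet(spark, rdd, out_dir):
    """Write one micro-batch as Parquet; return False if the batch was empty.

    Assumes each record in `rdd` is a (key, value) tuple of strings.
    """
    # Skip empty batches so that no empty Parquet files are written.
    if rdd.isEmpty():
        return False
    # Column names are hypothetical; adjust them to your data.
    df = spark.createDataFrame(rdd, ["key", "value"])
    # Append mode adds new files per batch. If a batch fails midway,
    # partial output may be left behind, as noted above.
    df.write.mode("append").parquet(out_dir)
    return True


def attach_parquet_sink(stream, spark, out_dir):
    """Register the writer so that it runs once for every micro-batch."""
    stream.foreachRDD(lambda rdd: save_batch_as_parquet(spark, rdd, out_dir))
```

You would call something like attach_parquet_sink(stream, spark,
"/data/events_parquet") before starting the StreamingContext, where the path
is a placeholder.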
On Wed, Feb 28, 2018 at 11:09 AM, karthikus <[EMAIL PROTECTED]> wrote: