Subject: [Beginner] How to save Kafka Dstream data to parquet ?


Structured Streaming's file sink solves these problems by writing a
log/manifest of all the authoritative files written out (for any format).
So if you run batch or interactive queries on the output directory with
Spark, it will automatically read the manifest and process only the files
that are listed in it, thus skipping any partial files, etc.
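
To make the idea concrete, here is a toy sketch of the manifest pattern in
plain Python. It is not Spark's implementation or its on-disk log format.
The manifest file name, the JSON layout, and the helper names
(commit_batch, read_committed, demo) are all made up for illustration. The
point is only that a file counts as output once it is recorded in the
manifest, and readers ignore everything else.

```python
import json
import os
import tempfile

# Hypothetical manifest name for this toy; not Spark's actual log layout.
MANIFEST = "_manifest.jsonl"


def commit_batch(out_dir, batch_id, rows):
    """Write one data file, then record it in the manifest."""
    name = f"part-{batch_id:05d}.json"
    with open(os.path.join(out_dir, name), "w") as f:
        json.dump(rows, f)
    # The file becomes authoritative only once this manifest line is appended.
    with open(os.path.join(out_dir, MANIFEST), "a") as m:
        m.write(json.dumps({"batch": batch_id, "file": name}) + "\n")


def read_committed(out_dir):
    """Read only the files listed in the manifest; ignore everything else."""
    manifest_path = os.path.join(out_dir, MANIFEST)
    if not os.path.exists(manifest_path):
        return []
    rows = []
    with open(manifest_path) as m:
        for line in m:
            entry = json.loads(line)
            with open(os.path.join(out_dir, entry["file"])) as f:
                rows.extend(json.load(f))
    return rows


def demo():
    with tempfile.TemporaryDirectory() as out_dir:
        commit_batch(out_dir, 0, [{"k": "a"}, {"k": "b"}])
        commit_batch(out_dir, 1, [{"k": "c"}])
        # Simulate a writer that died mid-file: bytes on disk, no manifest entry.
        with open(os.path.join(out_dir, "part-00002.json"), "w") as f:
            f.write('[{"k": "d"')
        return read_committed(out_dir)
```

Calling demo() returns only the rows from the two committed batches. The
truncated third file is never opened, because a reader that goes through
the manifest never sees it. A reader that simply listed the directory
would have tried to parse it.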

On Fri, Mar 2, 2018 at 1:37 PM, Sunil Parmar <[EMAIL PROTECTED]> wrote: