Search results 1 to 10 of 13 (0.0s).
Saving Parquet files to S3 - Spark - [mail # user]
...Hi Ankur, I also tried setting a property to write Parquet files of 256 MB. I am using PySpark; below is how I set the property, but it's not working for me. How did you set the property? spark...
   Author: Bijay Kumar Pathak , 2016-06-10, 20:48
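The preview is truncated, so the exact property the author set is unknown. A minimal sketch of one plausible approach in PySpark of that era, assuming the Parquet row-group setting parquet.block.size is the target (256 * 1024 * 1024 bytes for ~256 MB), with hypothetical S3 paths:

    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    sc = SparkContext(appName="parquet-block-size-sketch")
    sqlContext = SQLContext(sc)

    # Assumption: ask Parquet for ~256 MB row groups through the
    # underlying Hadoop configuration.
    sc._jsc.hadoopConfiguration().setInt("parquet.block.size", 268435456)

    df = sqlContext.read.json("s3://my-bucket/input/")   # hypothetical path
    df.write.parquet("s3://my-bucket/output/")           # hypothetical path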
Error joining dataframes - Spark - [mail # user]
...Hi, Try this one: df_join = df1.join(df2, 'Id', "fullouter") Thanks, Bijay On Tue, May 17, 2016 at 9:39 AM, ram kumar  wrote: ...
   Author: Bijay Kumar Pathak , 2016-05-17, 20:52
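A self-contained sketch of the suggested full outer join, using Spark 1.x-era APIs to match the thread's timeframe (the column names other than 'Id' are hypothetical):

    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    sc = SparkContext(appName="fullouter-join-sketch")
    sqlContext = SQLContext(sc)

    df1 = sqlContext.createDataFrame([(1, "a"), (2, "b")], ["Id", "left_val"])
    df2 = sqlContext.createDataFrame([(2, "x"), (3, "y")], ["Id", "right_val"])

    # Joining on the column name (rather than an expression) keeps a
    # single 'Id' column in the result.
    df_join = df1.join(df2, 'Id', "fullouter")
    df_join.show()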
Disable parquet metadata summary in - Spark - [mail # user]
...Hi, How can we disable writing _common_metadata while saving a DataFrame in Parquet format in PySpark? I tried to set the property using the command below, but it didn't help. sparkContext._jsc.hadoopCo...
   Author: Bijay Kumar Pathak , 2016-05-06, 00:43
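The command in the preview is cut off. A sketch of one way this is commonly done in that era of Spark, assuming Parquet's documented parquet.enable.summary-metadata switch (note it suppresses both _metadata and _common_metadata); the output path is hypothetical:

    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    sc = SparkContext(appName="disable-summary-metadata-sketch")
    sqlContext = SQLContext(sc)

    # Parquet's summary-file switch: "false" skips writing _metadata
    # and _common_metadata alongside the data files.
    sc._jsc.hadoopConfiguration().set("parquet.enable.summary-metadata", "false")

    df = sqlContext.range(100)
    df.write.parquet("/tmp/no_summary_parquet")   # hypothetical output path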
SqlContext parquet read OutOfMemoryError: Requested array size exceeds VM limit error - Spark - [mail # user]
...Thanks for the suggestions and links. The problem arises when I use the DataFrame API to write, but it works fine when doing an insert overwrite into the Hive table. # Works fine hive_context.sql("insert ove...
   Author: Bijay Kumar Pathak , 2016-05-04, 22:37
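A sketch of the two write paths being contrasted, with hypothetical table and DataFrame names since the original statement is truncated:

    from pyspark import SparkContext
    from pyspark.sql import HiveContext

    sc = SparkContext(appName="write-path-sketch")
    hive_context = HiveContext(sc)

    df = hive_context.table("staging_table")   # hypothetical source table
    df.registerTempTable("tmp_src")

    # Path 1: insert overwrite through SQL (reported to work).
    hive_context.sql("insert overwrite table target_table select * from tmp_src")

    # Path 2: the DataFrame writer (reported to hit the OOM).
    df.write.mode("overwrite").saveAsTable("target_table")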
Performance with Insert overwrite into Hive Table. - Spark - [mail # user]
...Thanks Ted. This looks like the issue, since I am running it on EMR and the Hive version is 1.0.0. Thanks, Bijay On Wed, May 4, 2016 at 10:29 AM, Ted Yu  wrote: ...
   Author: Bijay Kumar Pathak , 2016-05-04, 21:22
Dataframe saves for a large set but throws OOM for a small dataset - Spark - [mail # user]
...Hi, I was facing the same issue on Spark 1.6. My data size was around 100 GB and I was writing into a partitioned Hive table. I was able to solve this issue by starting from 6G of memory and reachi...
   Author: Bijay Kumar Pathak , 2016-04-30, 21:37
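The final working value is cut off in the preview, so the numbers below are illustrative. A sketch of the kind of memory tuning described, expressed as Spark 1.x-era configuration (spark.yarn.executor.memoryOverhead was the YARN overhead key at the time):

    from pyspark import SparkConf, SparkContext

    # Illustrative values: the thread describes stepping executor memory
    # up from 6G until the partitioned-table write stopped failing.
    conf = (SparkConf()
            .setAppName("memory-tuning-sketch")
            .set("spark.executor.memory", "6g")
            .set("spark.yarn.executor.memoryOverhead", "1024"))
    sc = SparkContext(conf=conf)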
Spark SQL insert overwrite table not showing all the partition. - Spark - [mail # user]
...Hi Zhan, I tried with the IF NOT EXISTS clause, and still I cannot see the first partition; only the partition from the last insert overwrite is present in the table. Thanks, Bijay On Thu, Apr 21, 2016 at 1...
   Author: Bijay Kumar Pathak , 2016-04-22, 19:37
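For context, the statement under discussion might look like the following, using Hive's INSERT OVERWRITE ... PARTITION ... IF NOT EXISTS syntax (table, column, and partition names are hypothetical):

    from pyspark import SparkContext
    from pyspark.sql import HiveContext

    sc = SparkContext(appName="insert-overwrite-partition-sketch")
    hive_context = HiveContext(sc)

    # Hypothetical partitioned insert: IF NOT EXISTS is meant to skip
    # the overwrite when the target partition already exists.
    hive_context.sql("""
        INSERT OVERWRITE TABLE target_table
        PARTITION (ds='2016-04-21') IF NOT EXISTS
        SELECT id, value FROM source_table
    """)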
Reading conf file in Pyspark in cluster mode - Spark - [mail # user]
...Hello, I have Spark jobs packaged in a zip and deployed in cluster mode on AWS EMR. The job has to read a conf file packaged with the zip under the resources directory. I can read the conf fi...
   Author: Bijay Kumar Pathak , 2016-04-17, 00:30
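One common way to read a file bundled inside a zip shipped with --py-files is to load it off the Python path rather than the local filesystem. A sketch assuming the conf file is JSON; the package name 'myjob' and path 'resources/app.conf' are hypothetical:

    import json
    import pkgutil

    # In cluster mode the zip stays packed, so a plain open() on a
    # relative path fails; pkgutil.get_data reads the resource through
    # the zip on sys.path instead.
    raw = pkgutil.get_data("myjob", "resources/app.conf")
    conf = json.loads(raw.decode("utf-8"))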
Connection closed Exception. - Spark - [mail # user]
...Hi Rodrick, I had tried increasing memory from 6G to 9G to 12G, but I am still getting the same error. The size of the DataFrame I am trying to write is around 6-7 GB and the Hive table is in Parquet fo...
   Author: Bijay Kumar Pathak , 2016-04-11, 15:38