Results from 31 to 40 from 227 (0.0s).
[SPARK-24521] Fix ineffective test in CachedTableSuite - Spark - [issue]
...test("withColumn doesn't invalidate cached dataframe") in CachedTableSuite does not work because: The UDF is executed and test count incremented when "df.cache()" is called and the subsequ...    Author: Li Jin , 2018-06-19, 17:43
[SPARK-24563] Allow running PySpark shell without Hive - Spark - [issue]
...A previous commit: running PySpark shell without Hive. Per discu...    Author: Li Jin , 2018-06-14, 20:16
Missing HiveConf when starting PySpark from head - Spark - [mail # dev]
...Sounds good. Thanks all for the quick reply. On Thu, Jun 14, 2018 at 12:19 PM, Xiao Li wrote: > Thanks for catching this. Please feel ...
   Author: Li Jin , 2018-06-14, 18:04
[SPARK-22239] User-defined window functions with pandas udf (unbounded window) - Spark - [issue]
...Window functions are another place we can benefit from vectorized UDFs and add another useful function to the pandas_udf suite. Example usage (preliminary): w = Window.partitionBy('id').rowsBetween...    Author: Li Jin , 2018-06-14, 13:56
[SPARK-23754] StopIterator exception in Python UDF results in partial result - Spark - [issue]
...Reproduce:
    df = spark.range(0, 1000)
    from pyspark.sql.functions import udf
    def foo(x):
        raise StopIteration()
    df.withColumn('v', udf(foo)).show()
    # Results
    # +---+---+
    # | id|  v|
    # +---...
   Author: Li Jin , 2018-06-12, 09:48
Optimizer rule ConvertToLocalRelation causes expressions to be eager-evaluated in Planning phase - Spark - [mail # dev]
...Sorry I am confused now... My UDF gets executed for each row anyway (because I am doing with column and want to execute the UDF with each row). The difference is that with the optimization "Co...
   Author: Li Jin , 2018-06-08, 20:22
MatrixUDT and VectorUDT in Spark ML - Spark - [mail # dev]
...Please see Wed, May 30, 2018 at 10:40 PM Dongjin Lee wrote: > How is this issue going? Is there any Jira ticket about this? >>...
   Author: Li Jin , 2018-05-31, 12:20
[SPARK-20144] no long maintains ordering of the data - Spark - [issue]
...Hi, We are trying to upgrade Spark from 1.6.3 to 2.0.2. One issue we found is when we read parquet files in 2.0.2, the ordering of rows in the resulting dataframe is not the same as the orde...    Author: Li Jin , 2018-05-30, 08:03
[Celebrate] Arrow has reached 2000 stargeezers - Arrow - [mail # dev]
...Congrats everyone! On Mon, May 28, 2018 at 3:21 PM Jacques Nadeau wrote: > Woo! >> On Mon, May 28, 2018 at 4:50 PM, Wes McKinney wrote: >> > Congrats all! The journ...
   Author: Li Jin , 2018-05-28, 19:42
[SPARK-24334] Race condition in ArrowPythonRunner causes unclean shutdown of Arrow memory allocator - Spark - [issue]
...Currently, ArrowPythonRunner has two threads that free the Arrow vector schema root and allocator - the main writer thread and the task completion listener thread. Having both threads doing the c...    Author: Li Jin , 2018-05-28, 02:51