Results 1 to 10 of 640.
[SPARK-26901] Vectorized gapply should not prune columns - Spark - [issue]
...Currently, if some columns can be pushed, it's being pushed through FlatMapGroupsInRWithArrow. explain(count(gapply(df, ...
http://issues.apache.org/jira/browse/SPARK-26901    Author: Hyukjin Kwon , 2019-02-16, 17:19
[SPARK-26759] Arrow optimization in SparkR's interoperability - Spark - [issue]
...Arrow 0.12.0 is released and it contains an R API. We could optimize Spark DataFrame <> R DataFrame interoperability. For instance, see the examples below: dapply    df <- creat...
http://issues.apache.org/jira/browse/SPARK-26759    Author: Hyukjin Kwon , 2019-02-15, 02:31
[SPARK-26858] Vectorized gapplyCollect, Arrow optimization in native R function execution - Spark - [issue]
...Unlike gapply, gapplyCollect requires additional ser/de steps because it can omit the schema, and Spark SQL doesn't know the return type before execution actually happens. In original code pa...
http://issues.apache.org/jira/browse/SPARK-26858    Author: Hyukjin Kwon , 2019-02-15, 02:04
Vectorized R gapply[Collect]() implementation - Spark - [mail # dev]
...Thanks guys <3. FYI, I made a PR for collect and vectorized dapply too. Given my tests, they boost the speed by 1500%+ and 4600%+ respectively. https://github.com/apache/spark/pull/23760 https://gith...
   Author: Hyukjin Kwon , 2019-02-14, 10:16
[SPARK-26762] Arrow optimization for conversion from Spark DataFrame to R DataFrame - Spark - [issue]
...Like SPARK-25981, collect(rdf) can be optimized via Arrow....
http://issues.apache.org/jira/browse/SPARK-26762    Author: Hyukjin Kwon , 2019-02-14, 10:12
[SPARK-26830] Vectorized dapply, Arrow optimization in native R function execution - Spark - [issue]
...Similar to SPARK-26761. Like the pandas scalar UDF, it looks like we can do it in dapply....
http://issues.apache.org/jira/browse/SPARK-26830    Author: Hyukjin Kwon , 2019-02-14, 09:44
[SPARK-26761] Vectorized gapply, Arrow optimization in native R function execution - Spark - [issue]
...gapply is like a grouped Pandas UDF. It can likewise be optimized by Arrow when sending and receiving data between the JVM and R workers....
http://issues.apache.org/jira/browse/SPARK-26761    Author: Hyukjin Kwon , 2019-02-13, 03:20
Time to cut an Apache 2.4.1 release? - Spark - [mail # dev]
...+1 for 2.4.1. On Tue, Feb 12, 2019 at 4:56 PM, Dongjin Lee wrote: > > SPARK-23539 is a non-trivial improvement, so probably would not be > back-ported to 2.4.x. >> Got it. It seems reason...
   Author: Hyukjin Kwon , 2019-02-12, 11:54
[ARROW-4512] [R] Stream reader/writer API that takes socket stream - Arrow - [issue]
...I have been working on Spark integration with Arrow. I realised that there is no way to use a socket as input for the Arrow stream format. For instance, I want to do something like: connStream <...
http://issues.apache.org/jira/browse/ARROW-4512    Author: Hyukjin Kwon , 2019-02-11, 18:53
[VOTE] Release Apache Spark 2.3.3 (RC2) - Spark - [mail # dev]
...Sorry for the last-minute vote. +1. On Fri, Feb 8, 2019 at 10:15 AM, Takeshi Yamamuro wrote: > Thanks, all. >> Yea, I think we don't need to block the release, either. >> > Jungtaek > Th...
   Author: Hyukjin Kwon , 2019-02-09, 00:36