[SPARK-20144] spark.read.parquet no long maintains ordering of the data - Spark - [issue]
...Hi, We are trying to upgrade Spark from 1.6.3 to 2.0.2. One issue we found is when we read parquet files in 2.0.2, the ordering of rows in the resulting dataframe is not the same as the orde...
http://issues.apache.org/jira/browse/SPARK-20144    Author: Li Jin , 2018-10-15, 20:52
[ARROW-3497] [Java] Add user documentation for achieving better performance - Arrow - [issue]
http://issues.apache.org/jira/browse/ARROW-3497    Author: Li Jin , 2018-10-12, 08:45
[ARROW-3496] [Java] Add microbenchmark code to Java - Arrow - [issue]
...Animesh Trivedi has done some microbenchmarking with the Java API. Let's consider adding them to the codebase....
http://issues.apache.org/jira/browse/ARROW-3496    Author: Li Jin , 2018-10-12, 08:45
[ARROW-3495] [Java] Optimize bit operations performance - Arrow - [issue]
...From Animesh Trivedi's benchmark finding: 2) Materialize values from Validity and Value direct buffers instead of calling getInt() function on the IntVector. This is implemented as a new Unsafe...
http://issues.apache.org/jira/browse/ARROW-3495    Author: Li Jin , 2018-10-12, 08:45
[ARROW-3493] [Java] Document BOUNDS_CHECKING_ENABLED - Arrow - [issue]
...According to Animesh Trivedi, BOUNDS_CHECKING_ENABLED has significant implication on performance. We should document this better and maybe revisit the default value. https://github.com/apache...
http://issues.apache.org/jira/browse/ARROW-3493    Author: Li Jin , 2018-10-12, 08:45
[JAVA] Arrow performance measurement - Arrow - [mail # dev]
...I have created these as the first step. Animesh, feel free to submit PR for these. I will look into your micro benchmarks soon.   1. ARROW-3497 [Java] Add user d...
   Author: Li Jin , 2018-10-11, 15:55
[SPARK-25640] Clarify/Improve EvalType for grouped aggregate and window aggregate - Spark - [issue]
...Currently, grouped aggregate and window aggregate uses different EvalType, however, they map to the same user facing type PandasUDFType.GROUPED_MAP. It makes sense to have one user facing typ...
http://issues.apache.org/jira/browse/SPARK-25640    Author: Li Jin , 2018-10-10, 05:50
[ARROW-3396] VectorSchemaRoot.create(schema, allocator) doesn't create dictionary encoded vector correctly - Arrow - [issue]
http://issues.apache.org/jira/browse/ARROW-3396    Author: Li Jin , 2018-10-07, 17:31
[JAVA] Total row count of an Arrow file - Arrow - [mail # dev]
...Hi Michael, I think ArrowFileReader takes SeekableByteChannel so it's possible to only read the metadata for each record batches and skip the data. However it is not implemented. If the input Ch...
   Author: Li Jin , 2018-09-21, 14:32
[DISCUSS] PySpark Window UDF - Spark - [mail # dev]
...Thanks Wes and Felix! I have finished the initial development work and the PR is in a good state for review (have pinged a couple of people to review this too). I am excited to work with the co...
   Author: Li Jin , 2018-09-20, 17:49