Search criteria: author:"Davies Liu". Results 1 to 10 of 474 (0.0s).
[SPARK-18188] Add checksum for block of broadcast - Spark - [issue]
...There has been a long-standing issue: without any checksum for the blocks, it's very hard for us to identify where the bug cam...    Author: Davies Liu , 2018-07-20, 07:55
[SPARK-16011] SQL metrics include duplicated attempts - Spark - [issue]
...When I ran a simple scan-and-aggregate query, the number of rows in the scan could differ from run to run; the scanned result is actually correct, but the SQL metrics are wrong (should not incl...    Author: Davies Liu , 2018-09-11, 14:31
[SPARK-3554] handle large dataset in closure of PySpark - Spark - [issue]
...Sometimes a large dataset is used in a closure and the user forgets to broadcast it, so the serialized command becomes huge. py4j cannot handle large objects efficiently; we shou...    Author: Davies Liu , 2014-09-19, 01:12
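The problem SPARK-3554 describes can be illustrated without a cluster: a dataset captured directly in a closure is pickled into the serialized command, while the broadcast pattern ships only a small handle. This is a minimal stdlib sketch of that size difference; the `broadcast_id` handle is a stand-in for what `SparkContext.broadcast()` would return, not Spark's actual wire format.

```python
import pickle

# A large dataset captured directly in a closure: the whole list is
# pickled along with the function's serialized command.
big = list(range(100_000))
direct_payload = pickle.dumps(big)

# The broadcast pattern serializes only a small handle; workers fetch
# the data once out-of-band. The dict here is purely illustrative.
handle = {"broadcast_id": 42}
broadcast_payload = pickle.dumps(handle)

print(len(direct_payload), len(broadcast_payload))
# The direct payload is several hundred KB; the handle is a few dozen bytes.
```

The same asymmetry is why PySpark warns when a serialized task grows large: every task re-ships the captured data, whereas a broadcast variable is transferred to each executor only once.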
[SPARK-3592] applySchema to an RDD of Row - Spark - [issue]
...Right now, we cannot apply a schema to an RDD of Row; this should be a bug.>>> srdd = sqlCtx.jsonRDD(sc.parallelize(["""{"a":2}"""]))>>> sqlCtx.applySchema( x:x...    Author: Davies Liu , 2014-09-19, 22:33
[SPARK-3594] try more rows during inferSchema - Spark - [issue]
...If there are empty values in the first row of an RDD of Row, inferSchema will fail. It's better to try more rows and combine them together....    Author: Davies Liu , 2014-11-03, 21:18
[SPARK-3679] pickle the exact globals of functions - Spark - [issue]
...function.func_code.co_names has all the names used in the function, including names of attributes. It will pickle some unnecessary globals if there is a global with the same name as an attri...    Author: Davies Liu , 2014-09-24, 20:00
[SPARK-3681] Failed to serialize ArrayType or MapType after accessing them in Python - Spark - [issue]
... x: x.files).take(1) Also it will lose the schema after iterating an x: [f.batch for f in x.files]).take(1)...    Author: Davies Liu , 2014-09-27, 19:21
[SPARK-3463] Show metrics about spilling in Python - Spark - [issue]
...It should also show the number of bytes spilled to disk while doing aggregation in Python....    Author: Davies Liu , 2014-09-14, 05:31
[SPARK-3465] Task metrics are not aggregated correctly in local mode - Spark - [issue]
...In local mode, after onExecutorMetricsUpdate(), t.taskMetrics will be the same object as the one in TaskContext (because MetricsUpdate is not serialized in local mode), then all t...    Author: Davies Liu , 2014-09-12, 21:30
[SPARK-3478] Profile Python tasks stage by stage in worker - Spark - [issue]
...The Python code in the driver is easy for users to profile, but the code run in workers is distributed across the cluster and is not easy for users to profile. So we need a way to do the profiling in the worker a...    Author: Davies Liu , 2014-09-27, 04:35