Search criteria: author:"Josh Rosen".   Results from 1 to 10 from 94 (0.0s).
[SPARK-27940] SubtractedRDD is OOM-prone because it does not support spilling - Spark - [issue]
...SubtractedRDD, which is used to implement RDD.subtract() and PairRDDFunctions.subtractByKey(), currently buffers one partition in memory and does not support spilling:    Author: Josh Rosen , 2019-06-04, 00:54
[SPARK-27969] Non-deterministic expressions in filters or projects can unnecessarily prevent all scan-time column pruning, harming performance - Spark - [issue]
...If a scan operator is followed by a projection or filter and those operators contain any non-deterministic expressions then scan column pruning optimizations are completely skipped, harming ...    Author: Josh Rosen , 2019-06-18, 02:11
[SPARK-27684] Reduce ScalaUDF conversion overheads for primitives - Spark - [issue]
...I believe that we can reduce ScalaUDF overheads when operating over primitive types. In ScalaUDF's doGenCode we have logic to convert UDF function input types from Catalyst internal types to ...    Author: Josh Rosen , 2019-05-31, 00:10
[SPARK-27839] Improve UTF8String.replace() / StringReplace performance - Spark - [issue]
...The UTF8String.replace() function and StringReplace expression are missing a few common-case optimizations, such as avoiding copies when the replacement does not change the string and avoidi...    Author: Josh Rosen , 2019-06-19, 22:21
[SPARK-28102] Failed LZ4 JNI initialization is repeatedly re-attempted, causing lock contention issues - Spark - [issue]
...Spark's use of lz4-java ends up calling LZ4Factory.fastestInstance, which attempts to load JNI libraries and falls back on Java implementations in case the JNI library cannot be loaded or in...    Author: Josh Rosen , 2019-06-19, 22:26
[SPARK-11309] Clean up hacky use of MemoryManager inside of HashedRelation - Spark - [issue]
...In HashedRelation, there's a hacky creation of a new MemoryManager in order to handle broadcasting of BytesToBytesMap:    Author: Josh Rosen , 2019-07-09, 15:09
[SPARK-5063] Display more helpful error messages for several invalid operations - Spark - [issue]
...Spark does not support nested RDDs or performing Spark actions inside of transformations; this usually leads to NullPointerExceptions (see SPARK-718 as one example).  The confusing NPE ...    Author: Josh Rosen , 2019-08-12, 23:45
[SPARK-29310] TestMemoryManager should implement getExecutionMemoryUsageForTask() - Spark - [issue]
...Spark uses a TestMemoryManager class to mock out memory manager functionality in tests, allowing test authors to exercise control over certain behaviors (e.g. to simulate OOMs). Our tests hav...    Author: Josh Rosen , 2019-10-15, 16:15
[SPARK-27653] Add max_by() / min_by() SQL aggregate functions - Spark - [issue]
...It would be useful if Spark SQL supported the max_by() SQL aggregate function. Quoting from the Presto docs: max_by(x, y) → [same as x] Returns the value of x associated with the...    Author: Josh Rosen , 2019-10-26, 04:17
[SPARK-28702] Display useful error message (instead of NPE) for invalid Dataset operations (e.g. calling actions inside of transformations) - Spark - [issue]
...In Spark, SparkContext and SparkSession can only be used on the driver, not on executors. For example, this means that you cannot call someDataset.collect() inside of a Dataset or RDD transf...    Author: Josh Rosen , 2019-08-23, 05:16