Results 1 to 10 of 21 (0.0s).
Using existing distribution for join when subset of keys - Spark - [mail # user]
...Hey Terry, Thanks for the response! I'm not sure that it ends up working though - the bucketing still seems to require the exchange before the join. Both tables below are saved bucketed by "x":...
   Author: Patrick Woody , 2020-05-31, 21:38
[SPARK-8167] Tasks that fail due to YARN preemption can cause job failure - Spark - [issue]
...Tasks that are running on preempted executors will count as FAILED with an ExecutorLostFailure. Unfortunately, this can quickly spiral out of control if a large resource shift is occurring, ...
   Author: Patrick Woody , 2020-05-17, 17:48
[SPARK-19631] OutputCommitCoordinator should not allow commits for already failed tasks - Spark - [issue]
...This is similar to SPARK-6614, but there is a race condition where a task may fail (e.g. Executor heartbeat timeout) and still manage to go through the commit protocol successfully. After this ...
   Author: Patrick Woody , 2020-05-17, 17:48
[SPARK-23819] InMemoryTableScanExec prunes orderable complex types due to out of date ColumnStats - Spark - [issue]
...The data types that can be compared via BinaryComparison were expanded in SPARK-21110 to include Arrays/Structs/etc, but ColumnStats would still have hard coded upper/lower bounds for these ...
   Author: Patrick Woody , 2020-01-19, 00:06
[SPARK-21317] Avoid unnecessary sort in FileFormatWriter if data is already bucketed - Spark - [issue]
...When bucketing in FileFormatWriter, the partition is always sorted on bucketIdExpression, the partition id produced by the hash bucketing. If the data is already bucketed in that format, the...
   Author: Patrick Woody , 2020-01-17, 00:13
[SPARK-17170] Enable whole partition pruning for InMemoryTableScanExec - Spark - [issue]
...Currently InMemoryTableScanExec will prune cached batches executor side, possibly pruning the entire partition in the process. We should be able to leverage the same stats to determine if an...
   Author: Patrick Woody , 2019-05-21, 04:33
[SPARK-18079] CollectLimitExec.executeToIterator() should perform per-partition limits - Spark - [issue]
...Analogous PR to for executeToIterator....
   Author: Patrick Woody , 2019-05-21, 04:32
[SPARK-15038] Add ability to do broadcasts in SQL at execution time - Spark - [issue]
...Currently the auto broadcasting done in SparkSQL is asynchronous and done at query planning time. If you have a large query with many broadcasts, this can end up creating a large amount of m...
   Author: Patrick Woody , 2019-05-21, 04:32
[SPARK-24060] StreamingSymmetricHashJoinHelperSuite should initialize after SparkSession creation - Spark - [issue]
   Author: Patrick Woody , 2018-04-24, 03:45
[PARQUET-743] DictionaryFilters can re-use StreamBytesInput when compressed - Parquet - [issue]
...When using an And or Or DictionaryFilter, we re-use the BytesInput across reads. This is problematic when compressed because compressed BytesInputs get converted over to StreamBytesInputs wh...
   Author: Patrick Woody , 2018-04-21, 12:39