Results 1 to 10 of 763 (0.0s).
[SPARK-23469] HashingTF should use corrected MurmurHash3 implementation - Spark - [issue]
...SPARK-23381 added a corrected MurmurHash3 implementation but left the old implementation alone.  In Spark 2.3 and earlier, HashingTF will use the old implementation.  (We should no...    Author: Joseph K. Bradley , 2020-06-08, 17:37
[SPARK-3159] Check for reducible DecisionTree - Spark - [issue]
...Improvement: test-time computation. Currently, pairs of leaf nodes with the same parent can both output the same prediction.  This happens since the splitting criterion (e.g., Gini) is no...    Author: Joseph K. Bradley , 2020-05-24, 15:00
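The reducibility check described in the SPARK-3159 snippet can be illustrated with a minimal sketch. This is not Spark's implementation: the `Node` class and `collapse_redundant_splits` function are hypothetical names, and the sketch assumes a binary tree in which every internal node has exactly two children. It merges sibling leaves that output the same prediction, working bottom-up so that a merge can expose further merges higher in the tree.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical tree node; a node with no children is a leaf."""
    prediction: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def collapse_redundant_splits(node: Node) -> Node:
    """Return a copy of the tree with redundant splits removed.

    A split is redundant when both of its children are leaves that
    output the same prediction; it is replaced by a single leaf.
    """
    if node.is_leaf():
        return node
    # Simplify the subtrees first so merges can propagate upward.
    left = collapse_redundant_splits(node.left)
    right = collapse_redundant_splits(node.right)
    if left.is_leaf() and right.is_leaf() and left.prediction == right.prediction:
        return Node(prediction=left.prediction)
    return Node(node.prediction, left, right)
```

Applied to a tree whose left subtree splits into two leaves that both predict 1.0, the function replaces that subtree with a single leaf, so a prediction on that branch needs one fewer comparison.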
[SPARK-8542] PMML export for Decision Trees - Spark - [issue]    Author: Joseph K. Bradley , 2020-04-14, 00:09
[SPARK-23482] R support for robust regression with Huber loss - Spark - [issue]
...Add support for Huber loss for linear regression in the R API.  See linked JIRA for change in Scala/Java....    Author: Joseph K. Bradley , 2020-03-17, 01:50
[SPARK-24632] Allow 3rd-party libraries to use abstractions for Java wrappers for persistence - Spark - [issue]
...This is a follow-up for SPARK-17025, which allowed users to implement Python PipelineStages in 3rd-party libraries, include them in Pipelines, and use Pipeline persistence.  This task i...    Author: Joseph K. Bradley , 2020-03-16, 22:53
[SPARK-9623] RandomForestRegressor: provide variance of predictions - Spark - [issue]
...Variance of predicted value, as estimated from training data. Analogous to class probabilities for classification. See SPARK-3727 for discussion....    Author: Joseph K. Bradley , 2020-01-20, 00:05
[SPARK-19063] Add parameter for storage levels to LDA - Spark - [issue]
...See parent JIRA for details.  This is to address SPARK-19007....    Author: Joseph K. Bradley , 2020-01-20, 00:05
[SPARK-3162] Train DecisionTree locally when possible - Spark - [issue]
...Improvement: communication. Currently, every level of a DecisionTree is trained in a distributed manner.  However, at deeper levels in the tree, it is possible that a small set of trainin...    Author: Joseph K. Bradley , 2020-01-16, 00:08
[SPARK-10764] Add optional caching to Pipelines - Spark - [issue]
...We need to explore how to cache DataFrames during the execution of Pipelines.  It's a hard problem in general to handle automatically or manually, so we should start with some design di...    Author: Joseph K. Bradley , 2020-01-12, 23:54
[SPARK-9612] Add instance weight support for GBTs - Spark - [issue]
...GBT support for instance weights could be handled by: (1) sampling data before passing it to trees; (2) passing weights to trees (requiring weight support for trees first, but probably better in the ...    Author: Joseph K. Bradley , 2020-01-06, 02:08