[SPARK-6312] ChiSqTest should check for too few counts - Spark - [issue]
...ChiSqTest assumes that elements of the contingency matrix are large enough (have enough counts) such that the central limit theorem kicks in.  It would be reasonable to do one or more of the...
http://issues.apache.org/jira/browse/SPARK-6312    Author: Joseph K. Bradley , 2019-05-21, 05:36
[SPARK-3380] DecisionTree: overflow and precision in aggregation - Spark - [issue]
...DecisionTree does not check for overflows or loss of precision while aggregating sufficient statistics (binAggregates).  It uses Double, which may be a problem for DecisionTree regressi...
http://issues.apache.org/jira/browse/SPARK-3380    Author: Joseph K. Bradley , 2019-05-21, 05:36
[SPARK-6160] ChiSqSelector should keep test statistic info - Spark - [issue]
...It is useful to have the test statistics explaining selected features, but these data are thrown out when constructing the ChiSqSelectorModel.  The data are expensive to recompute, so t...
http://issues.apache.org/jira/browse/SPARK-6160    Author: Joseph K. Bradley , 2019-05-21, 05:36
[SPARK-5571] LDA should handle text as well - Spark - [issue]
...Latent Dirichlet Allocation (LDA) currently operates only on vectors of word counts.  It should also support training and prediction using text (Strings). This plan is sketched in the...
http://issues.apache.org/jira/browse/SPARK-5571    Author: Joseph K. Bradley , 2019-05-21, 05:36
[SPARK-3163] Separate continuous and categorical features in DecisionTree - Spark - [issue]
...Improvement: code clarity, memory usage. Currently, during DecisionTree training, some internal data structures have overloaded meanings and unused values.  These data structures are shar...
http://issues.apache.org/jira/browse/SPARK-3163    Author: Joseph K. Bradley , 2019-05-21, 05:36
[SPARK-5114] Should Evaluator be a PipelineStage - Spark - [issue]
...Pipelines can currently contain Estimators and Transformers. Question for debate: Should Pipelines be able to contain Evaluators? Pros: Schema check: Evaluators take input datasets with partic...
http://issues.apache.org/jira/browse/SPARK-5114    Author: Joseph K. Bradley , 2019-05-21, 05:36
[SPARK-3703] Ensemble learning methods - Spark - [issue]
...This is a general JIRA for coordinating on adding ensemble learning methods to MLlib.  These methods include a variety of boosting and bagging algorithms.  Below is a general desig...
http://issues.apache.org/jira/browse/SPARK-3703    Author: Joseph K. Bradley , 2019-05-21, 05:36
[SPARK-3155] Support DecisionTree pruning - Spark - [issue]
...Improvement: accuracy, computation. Summary: Pruning is a common method for preventing overfitting with decision trees.  A smart implementation can prune the tree during training in order...
http://issues.apache.org/jira/browse/SPARK-3155    Author: Joseph K. Bradley , 2019-05-21, 05:36
[SPARK-7546] Example code for ML Pipelines feature transformations - Spark - [issue]
...This should be added for Scala, Java, and Python. It should cover ML Pipelines using a complex series of feature transformations....
http://issues.apache.org/jira/browse/SPARK-7546    Author: Joseph K. Bradley , 2019-05-21, 04:38
[SPARK-11529] Add section in user guide for StreamingLogisticRegressionWithSGD - Spark - [issue]
...Jeremy Freeman, would you be able to do this for 1.6?  Or if there are others who can, could you please ping them?  Thanks!...
http://issues.apache.org/jira/browse/SPARK-11529    Author: Joseph K. Bradley , 2019-05-21, 04:37