Results 1 to 10 of 763
[SPARK-18409] LSH approxNearestNeighbors should use approxQuantile instead of sort - Spark - [issue]
...LSHModel.approxNearestNeighbors sorts the full dataset on the hashDistance in order to find a threshold.  It should use approxQuantile instead....
http://issues.apache.org/jira/browse/SPARK-18409    Author: Joseph K. Bradley, 2019-11-17, 20:17
[SPARK-9612] Add instance weight support for GBTs - Spark - [issue]
...GBT support for instance weights could be handled by: (1) sampling data before passing it to trees; (2) passing weights to trees (requiring weight support for trees first, but probably better in the ...
http://issues.apache.org/jira/browse/SPARK-9612    Author: Joseph K. Bradley, 2019-10-25, 08:52
[SPARK-13346] Using DataFrames iteratively leads to slow query planning - Spark - [issue]
...I have an iterative algorithm based on DataFrames, and the query plan grows very quickly with each iteration.  Caching the current DataFrame at the end of an iteration does not fix the ...
http://issues.apache.org/jira/browse/SPARK-13346    Author: Joseph K. Bradley, 2019-10-10, 11:13
[SPARK-8767] Abstractions for InputColParam, OutputColParam - Spark - [issue]
...I'd like to create Param subclasses for output and input columns.  These will provide easier schema checking, which could even be done automatically in an abstraction rather than in eac...
http://issues.apache.org/jira/browse/SPARK-8767    Author: Joseph K. Bradley, 2019-10-08, 05:44
[SPARK-19498] Discussion: Making MLlib APIs extensible for 3rd party libraries - Spark - [issue]
...Per the recent discussion on the dev list, this JIRA is for discussing how we can make MLlib DataFrame-based APIs more extensible, especially for the purpose of writing 3rd-party libraries w...
http://issues.apache.org/jira/browse/SPARK-19498    Author: Joseph K. Bradley, 2019-10-08, 05:44
[SPARK-7206] Gaussian Mixture Model (GMM) improvements - Spark - [issue]
...This is an umbrella JIRA for listing improvements for GMMs: planned improvements; optional/experimental work; tests for verifying scalability...
http://issues.apache.org/jira/browse/SPARK-7206    Author: Joseph K. Bradley, 2019-10-08, 05:44
[SPARK-3723] DecisionTree, RandomForest: Add more instrumentation - Spark - [issue]
...Some simple instrumentation would help advanced users understand performance, and to check whether parameters (such as maxMemoryInMB) need to be tuned. Most important instrumentation (simple)...
http://issues.apache.org/jira/browse/SPARK-3723    Author: Joseph K. Bradley, 2019-10-08, 05:44
[SPARK-14585] Provide accessor methods for Pipeline stages - Spark - [issue]
...It is currently hard to access particular stages in a Pipeline or PipelineModel.  Some accessor methods would help. Scala: class Pipeline {  /** Returns stage at index i in Pipeline ...
http://issues.apache.org/jira/browse/SPARK-14585    Author: Joseph K. Bradley, 2019-10-08, 05:44
[SPARK-15882] Discuss distributed linear algebra in spark.ml package - Spark - [issue]
...This JIRA is for discussing how org.apache.spark.mllib.linalg.distributed.* should be migrated to org.apache.spark.ml. Initial questions: Should we use Datasets or RDDs underneath? If Dataset...
http://issues.apache.org/jira/browse/SPARK-15882    Author: Joseph K. Bradley, 2019-10-08, 05:44
[SPARK-22887] ML test for StructuredStreaming: spark.ml.fpm - Spark - [issue]
...Task for adding Structured Streaming tests for all Models/Transformers in a sub-module in spark.ml. For an example, see LinearRegressionSuite.scala in https://github.com/apache/spark/pull/1984...
http://issues.apache.org/jira/browse/SPARK-22887    Author: Joseph K. Bradley, 2019-10-08, 05:44