Search criteria: author:"Xiangrui Meng".   Results 1 to 10 of 774 (0.0s).
[SPARK-5874] How to improve the current ML pipeline API? - Spark - [issue]
...I created this JIRA to collect feedback about the ML pipeline API we introduced in Spark 1.2. The target is to graduate this set of APIs in 1.4 with confidence, which requires valuable inpu...    Author: Xiangrui Meng , 2018-05-05, 03:11
[SPARK-7924] Consolidate example code in MLlib - Spark - [issue]
...This JIRA is an umbrella for consolidating example code in MLlib, now that we are able to insert code snippets from examples into the user guide.  This will contain tasks not already ha...    Author: Xiangrui Meng , 2018-05-05, 03:14
[SPARK-10383] Sync example code between API doc and user guide - Spark - [issue]
...It would be nice to provide example code in both user guide and API docs. However, it would become hard to keep the content in-sync. This JIRA is to collect approaches/processes to make it f...    Author: Xiangrui Meng , 2018-05-05, 03:17
[SPARK-15064] Locale support in StopWordsRemover - Spark - [issue]
...We support case insensitive filtering (default) in StopWordsRemover. However, case insensitive matching depends on the locale and region, which cannot be explicitly set in StopWordsRemover. ...    Author: Xiangrui Meng , 2018-06-12, 16:01
[SPARK-24477] Import submodules under by default - Spark - [issue]
...Right now, we do not import submodules under by default. So users cannot do: from pyspark import ml; kmeans = ml.clustering.KMeans(...). I create this JIRA to discuss if we should impor...    Author: Xiangrui Meng , 2018-06-08, 16:32
[SPARK-24454] ml.image doesn't have __all__ explicitly defined - Spark - [issue]
... doesn't have __all__ explicitly defined. It will import all global names by default (only ImageSchema for now), which is not a good practice. We should add __all__ to...    Author: Xiangrui Meng , 2018-06-08, 16:32
[SPARK-1485] Implement AllReduce - Spark - [issue]
...The current implementations of machine learning algorithms rely on the driver for some computation and data broadcasting. This will create a bottleneck at the driver for both computation and...    Author: Xiangrui Meng , 2018-06-04, 19:54
[SPARK-24300] generateLDAData in ml.cluster.LDASuite didn't set seed correctly - Spark - [issue]
... generateLDAData uses the same RNG in all part...    Author: Xiangrui Meng , 2018-06-04, 23:08
[SPARK-25248] Audit barrier APIs for Spark 2.4 - Spark - [issue]
...Make a pass over APIs added for barrier execution mode....    Author: Xiangrui Meng , 2018-09-04, 16:56
[SPARK-25234] SparkR:::parallelize doesn't handle integer overflow properly - Spark - [issue]
...parallelize uses integer multiplication, which cannot handle sizes over ~47000. This causes issues with lapply: SparkR:::parallelize(sc, 1:47000, 47000) Error in rep(start, end - start) : invali...    Author: Xiangrui Meng , 2018-08-24, 22:04
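
The ~47000 threshold in the SPARK-25234 snippet is consistent with a product of two values of that size exceeding the signed 32-bit integer range. The sketch below only checks that arithmetic. It assumes the overflowing product is on the order of size × size, which the truncated snippet does not confirm, and the helper names are hypothetical and not part of SparkR.

```python
import math

# Largest value a signed 32-bit integer can hold.
INT32_MAX = 2**31 - 1


def square_fits_in_int32(n: int) -> bool:
    """Return True if n * n is representable as a signed 32-bit integer.

    Assumption: the product that overflows is roughly size * size;
    the issue snippet is truncated and does not show the exact expression.
    """
    return n * n <= INT32_MAX


def largest_safe_size() -> int:
    """Largest n whose square still fits in a signed 32-bit integer."""
    return math.isqrt(INT32_MAX)


if __name__ == "__main__":
    print(largest_safe_size())          # 46340
    print(square_fits_in_int32(46340))  # True
    print(square_fits_in_int32(47000))  # False: 47000 * 47000 = 2209000000
```

Under that assumption, the largest safe size is 46340, which matches the "~47000" figure quoted in the issue.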