Search criteria: author:"Xiangrui Meng".   Results from 21 to 30 from 47 (0.0s).
[SPARK-30154] PySpark UDF to convert MLlib vectors to dense arrays - Spark - [issue]
...If a PySpark user wants to convert MLlib sparse/dense vectors in a DataFrame into dense arrays, an efficient approach is to do that in JVM. However, it requires PySpark user to write Scala c...    Author: Xiangrui Meng , 2020-01-07, 00:19
[SPARK-10413] ML models should support prediction on single instances - Spark - [issue]
...Currently models in the pipeline API only implement transform(DataFrame). It would be quite useful to support prediction on single instance. UPDATE: This issue is for making predictions with ...    Author: Xiangrui Meng , 2019-12-26, 11:19
[SPARK-14850] VectorUDT/MatrixUDT should take primitive arrays without boxing - Spark - [issue]
...In SPARK-9390, we switched to use GenericArrayData to store indices and values in vector/matrix UDTs. However, GenericArrayData is not specialized for primitive types. This might hurt MLlib ...    Author: Xiangrui Meng , 2020-04-27, 08:48
[SPARK-26410] Support per Pandas UDF configuration - Spark - [issue]
...We use a "maxRecordsPerBatch" conf to control the batch sizes. However, the "right" batch size usually depends on the task itself. It would be nice if user can configure the batch size when ...    Author: Xiangrui Meng , 2020-03-17, 09:46
[SPARK-26028] Design sketch for SPIP: Property Graphs, Cypher Queries, and Algorithms - Spark - [issue]
...Placeholder for the design discussion of SPARK-25994. The scope here is to help SPIP vote instead of the final design....    Author: Xiangrui Meng , 2020-03-17, 09:52
[SPARK-25349] Support sample pushdown in Data Source V2 - Spark - [issue]
...Support sample pushdown would help file-based data source implementation save I/O cost significantly if it can decide whether to read a file or not. cc: Wenchen Fan...    Author: Xiangrui Meng , 2020-03-17, 09:55
[SPARK-25383] Image data source supports sample pushdown - Spark - [issue]
...After SPARK-25349, we should update image data source to support sampling....    Author: Xiangrui Meng , 2020-03-17, 09:55
[SPARK-26412] Allow Pandas UDF to take an iterator of pd.DataFrames - Spark - [issue]
...Pandas UDF is the ideal connection between PySpark and DL model inference workload. However, user needs to load the model file first to make predictions. It is common to see models of size ~...    Author: Xiangrui Meng , 2020-04-09, 04:14
[SPARK-27303] Spark Graph API (Scala/Java) - Spark - [issue]
...(1) As a user, I can construct a PropertyGraph and view its nodes and relationships as DataFrames. Required: Scala API to construct a PropertyGraph. Scala API to view nodes and relationships ...    Author: Xiangrui Meng , 2020-04-03, 12:22
[SPARK-31775] Support tensor type (TensorType) in Spark SQL/DataFrame - Spark - [issue]
...More and more DS/ML workloads are dealing with tensors. For example, a decoded color image can be represented by a 3D tensor. It would be nice to natively support tensor type. A local tensor...    Author: Xiangrui Meng , 2020-05-20, 17:43