Results 1 to 10 of 946 (0.0s).
[SPARK-16824] Add API docs for VectorUDT - Spark - [issue]
...Following on the discussion here, it appears that VectorUDT is missing documentation, at least in PySpark. I'm not sure if this is intentional or not....
http://issues.apache.org/jira/browse/SPARK-16824    Author: Nicholas Chammas , 2019-05-22, 18:25
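As context for the missing docs, a minimal sketch (assuming PySpark and an existing SparkSession named `spark`) of where VectorUDT surfaces for users: it is the SQL type backing ML vector columns, so it appears whenever a vector schema is declared by hand.

from pyspark.sql.types import StructType, StructField
from pyspark.ml.linalg import Vectors, VectorUDT

# Declare a schema with a vector column explicitly.
schema = StructType([StructField("features", VectorUDT(), nullable=False)])
df = spark.createDataFrame([(Vectors.dense([1.0, 2.0]),)], schema=schema)
df.printSchema()  # features: vector (nullable = false)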
[SPARK-18277] na.fill() and friends should work on struct fields - Spark - [issue]
...It appears that you cannot use fill() and friends to quickly modify struct fields. For example: >>> df = spark.createDataFrame([Row(a=Row(b='yeah yeah'), c='alright'), Row(a=Row(b=Non...
http://issues.apache.org/jira/browse/SPARK-18277    Author: Nicholas Chammas , 2019-05-21, 07:05
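A minimal sketch of the reported limitation (assuming a SparkSession named `spark`): na.fill() replaces nulls in top-level columns but leaves nested struct fields untouched.

from pyspark.sql import Row

df = spark.createDataFrame([
    Row(a=Row(b='yeah yeah'), c='alright'),
    Row(a=Row(b=None), c=None),
])
df.na.fill('').show()  # c is filled with '', but the nested a.b stays null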
[SPARK-4868] Twitter DStream.map() throws "Task not serializable" - Spark - [issue]
...(Continuing the discussion started here on the Spark user list.) The following Spark Streaming code throws a serialization exception I do not understand. import twitter4j.auth.{Authorization, ...
http://issues.apache.org/jira/browse/SPARK-4868    Author: Nicholas Chammas , 2019-05-21, 05:37
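The original reproduction is Scala; what follows is only a hedged PySpark analogue of the same failure class: closures shipped to executors must be serializable, so capturing a non-serializable handle (here, the SparkContext itself) inside map() fails.

from pyspark import SparkContext

sc = SparkContext.getOrCreate()
rdd = sc.parallelize(range(10))
# Fails if uncommented: the closure references sc, which cannot be
# serialized and shipped to executors.
# rdd.map(lambda x: sc.parallelize([x]).count()).collect()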
[SPARK-5685] Show warning when users open text files compressed with non-splittable algorithms like gzip - Spark - [issue]
...This is a usability or user-friendliness issue. It's extremely common for people to load a text file compressed with gzip, process it, and then wonder why only 1 core in their cluster is doin...
http://issues.apache.org/jira/browse/SPARK-5685    Author: Nicholas Chammas , 2019-05-21, 05:36
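A short sketch of the behavior the proposed warning would flag (assuming a SparkContext `sc`; the file path is hypothetical): gzip is not splittable, so a single .gz file becomes a single partition, and one core does all the work until you repartition.

rdd = sc.textFile('/data/big-log.gz')  # hypothetical path
print(rdd.getNumPartitions())          # 1: the whole file on a single core
rdd = rdd.repartition(64)              # spread the lines across the cluster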
[SPARK-16921] RDD/DataFrame persist() and cache() should return Python context managers - Spark - [issue]
...Context managers are a natural way to capture closely related setup and teardown code in Python. For example, they are commonly used when doing file I/O: with open('/path/to/file') as f: ...
http://issues.apache.org/jira/browse/SPARK-16921    Author: Nicholas Chammas , 2019-05-21, 04:33
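A hedged sketch of what the proposed API could look like; PySpark does not ship this, so the wrapper below is hypothetical.

from contextlib import contextmanager

@contextmanager
def persisted(df):
    """Hypothetical helper: cache df only for the duration of a with block."""
    df.persist()
    try:
        yield df
    finally:
        df.unpersist()

# Usage sketch, assuming an existing DataFrame `df`:
# with persisted(df) as cached:
#     cached.count()  # computed and cached
#     cached.count()  # served from cache; unpersisted on exit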
[SPARK-15191] createDataFrame() should mark fields that are known not to be null as not nullable - Spark - [issue]
...Here's a brief reproduction: >>> numbers = sqlContext.createDataFrame(...     data=[(1,), (2,), (3,), (4,), (5,)],...     samplingRatio=1  # go through all t...
http://issues.apache.org/jira/browse/SPARK-15191    Author: Nicholas Chammas , 2019-05-21, 04:33
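A minimal sketch of the reported behavior (the excerpt uses sqlContext; a SparkSession named `spark` is assumed here): even with samplingRatio=1 and no nulls anywhere in the data, the inferred field is still marked nullable.

numbers = spark.createDataFrame(
    data=[(1,), (2,), (3,), (4,), (5,)],
    samplingRatio=1,  # go through all the data
)
numbers.printSchema()  # _1: long (nullable = true), despite no nulls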
[SPARK-19216] LogisticRegressionModel is missing getThreshold() - Spark - [issue]
...Say I just loaded a logistic regression model from storage. How do I check that model's threshold in PySpark? From what I can see, the only way to do that is to dip into the Java object: mode...
http://issues.apache.org/jira/browse/SPARK-19216    Author: Nicholas Chammas , 2019-05-21, 04:17
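A hedged sketch of the workaround the excerpt alludes to: with no public getThreshold() on the PySpark model, one reaches into the private Java object (`_java_obj` is internal, not a stable API; the model path is hypothetical).

from pyspark.ml.classification import LogisticRegressionModel

model = LogisticRegressionModel.load('/models/lr')  # hypothetical path
print(model._java_obj.getThreshold())  # internal workaround, not public API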
[SPARK-19553] Add GroupedData.countApprox() - Spark - [issue]
...We already have a pyspark.sql.functions.approx_count_distinct() that can be applied to grouped data, but it seems odd that you can't just get a regular approximate count for grouped data. I ima...
http://issues.apache.org/jira/browse/SPARK-19553    Author: Nicholas Chammas , 2019-05-21, 04:15
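A short sketch contrasting what exists with what the issue proposes (assuming a DataFrame `df` with columns `key` and `value`).

from pyspark.sql import functions as F

# Exists today: approximate *distinct* counts per group.
df.groupBy('key').agg(F.approx_count_distinct('value')).show()

# Proposed, hypothetical API -- a plain approximate row count per group:
# df.groupBy('key').countApprox()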
[SPARK-2141] Add sc.getPersistentRDDs() to PySpark - Spark - [issue]
...PySpark does not appear to have sc.getPersistentRDDs()....
http://issues.apache.org/jira/browse/SPARK-2141    Author: Nicholas Chammas , 2019-05-21, 04:11
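A hedged sketch of the common workaround for this gap: go through the JVM gateway to the underlying Scala SparkContext (`_jsc` is internal, not a stable API), assuming a SparkContext `sc`.

rdd = sc.parallelize(range(10)).cache()
rdd.count()  # materialize the cache
persistent = sc._jsc.sc().getPersistentRDDs()  # Scala Map[Int, RDD[_]] via Py4J
print(persistent.size())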
Suggestion on Join Approach with Spark - Spark - [mail # dev]
...This kind of question is for the User list, or for something like StackOverflow. It's not on topic here. The dev list (i.e. this list) is for discussions about the development of Spark itself....
   Author: Nicholas Chammas , 2019-05-15, 18:04