[CRUNCH-340] Create HCatSource and HCatTarget - Crunch - [issue]
...This patch adds HCatSource, which enables a Crunch pipeline to read from Hive tables. This is the very first version, leaving a few TODOs in the code. It adds a new dependency from crunch-core to hca...
   Author: Chao Shi, 2017-12-10, 16:57
Update Crunch Team List - Crunch - [mail # dev]
...Hi Micah, I would prefer to officially retire. Thank you.  2015-12-07 10:24 GMT+08:00 Micah Whitacre:  > Site is now updated with the exception of removing Chao.  Asked if he...
   Author: Chao Shi, 2015-12-07, 06:16
[CRUNCH-408] HFileSource does not estimate the size of input correctly when there is a wildcard in path - Crunch - [issue]
...The cause is that it calls FileSystem#listStatus rather than FileSystem#globStatus to retrieve the list of files under the given path. So the fix is straightforward....
   Author: Chao Shi, 2015-04-24, 20:18
Question about HBaseSourceTarget#getSize() - Crunch - [mail # user]
...Hi Nithin,  Because HBaseSourceTarget supports custom Scan criteria (i.e. you can apply filters), I think it can hardly make a guess on the resulting data size. Even HBase itself, becau...
   Author: Chao Shi, 2015-03-28, 04:27
[CRUNCH-341] Move test resources used across multiple modules to crunch-test - Crunch - [issue]
...There are duplicated test resource files in multiple modules. This patch moves them into crunch-test, which is accessible on the classpath during unit testing. chaoshi@vm3 ~/projects/crunch (mas...
   Author: Chao Shi, 2014-09-19, 00:07
[CRUNCH-351] Improve performance of Shard#shard on large records - Crunch - [issue]
...This avoids sorting on the input data, which may be long and make the shuffle phase slow. The improvement is to sort on pseudo-random numbers....
   Author: Chao Shi, 2014-06-20, 03:49
[CRUNCH-355] Rename jobs to show how many stages have done before job submission - Crunch - [issue]
...The naming mechanism introduced in CRUNCH-262 has a flaw. It adds (m/n) to the end of the job name, where m is the current stage number at planning time and n is the total number of stages. Suppo...
   Author: Chao Shi, 2014-06-20, 03:49
[CRUNCH-364] Fix failure on mvn dependency:tree - Crunch - [issue]
...I got a "NoClassDefFoundError: org/sonatype/aether/graph/DependencyNode" when running "mvn dependency:tree". According to [1], this can be fixed by simply upgrading maven-dependenc...
   Author: Chao Shi, 2014-06-20, 03:49
[CRUNCH-315] Empty collection - Crunch - [issue]
...As discussed in the mailing list [1] and [2], I'd like to add an empty collection feature. On the API side, I think we can add a new method in Pipeline to create an empty col...
   Author: Chao Shi, 2014-06-20, 03:49
[CRUNCH-368] TupleWritable.Comparator - Crunch - [issue]
...This patch should improve comparison performance on TupleWritables. It saves the deserialization overhead. It is particularly useful when the input tuples are large, e.g. contain long string...
   Author: Chao Shi, 2014-06-20, 03:49