Results 1 to 10 of 681 (0.0s).
[HDFS-14233] Implement DistributedFileSystem#listStatus(Path[]) by adding a batching listStatus RPC call to NameNode - HDFS - [issue]
...HDFS-985 fixed an important problem for HDFS where listing a huge directory takes too long and needs to be paged. This Jira proposed to solve the problem of the other extreme:  Make it more ...
http://issues.apache.org/jira/browse/HDFS-14233    Author: Zheng Shao, 2019-08-13, 18:27
[HDFS-11280] Allow WebHDFS to reuse HTTP connections to NN - HDFS - [issue]
...WebHDFSClient calls "conn.disconnect()", which disconnects from the NameNode.  When we use webhdfs as the source in distcp, this used up all ephemeral ports on the client side since all...
http://issues.apache.org/jira/browse/HDFS-11280    Author: Zheng Shao, 2019-06-20, 16:14
[HDFS-11586] Report %free, %write_locked, %read_locked for the NameNode FSNamesystemLock - HDFS - [issue]
...It's useful to understand how busy the NameNode is by providing these metrics, similar to the %util number from iostat for disks. When %free goes to close to 0, we know the NameNode is conges...
http://issues.apache.org/jira/browse/HDFS-11586    Author: Zheng Shao, 2019-01-26, 10:10
[HDFS-14229] Nonblocking HDFS create|write - HDFS - [issue]
...Right now, the create call on HDFS is blocking. The write call can also be blocking if the write buffer reached its limit. However, for most applications, the only requirement is that when "...
http://issues.apache.org/jira/browse/HDFS-14229    Author: Zheng Shao, 2019-01-25, 18:07
[MAPREDUCE-6840] Distcp to support cutoff time - MapReduce - [issue]
...To ensure consistency in the datasets on HDFS,  some projects like file formats on Hive do HDFS operations in a particular order.  For example, if a file format uses an index file,...
http://issues.apache.org/jira/browse/MAPREDUCE-6840    Author: Zheng Shao, 2018-09-25, 11:26
[HADOOP-14086] Improve DistCp Speed for small files - Hadoop - [issue]
...When using distcp to copy lots of small files, NameNode naturally becomes a bottleneck. The current distcp code did not optimize to reduce the NameNode calls. We should restructur...
http://issues.apache.org/jira/browse/HADOOP-14086    Author: Zheng Shao, 2018-09-25, 10:51
[HIVE-20358] Allow setting variable value from Hive metastore table properties - Hive - [issue]
...Hive already supports set command as well as variable substitution: set start_ds=2018-08-01; SELECT COUNT( * ) FROM t WHERE ds >= '${hiveconf:start_ds}'; Or: set start_ds='2018-08-01'; SELECT...
http://issues.apache.org/jira/browse/HIVE-20358    Author: Zheng Shao, 2018-08-09, 23:06
[HIVE-20261] Expose inputPartitionList in QueryPlan - Hive - [issue]
...Having access to the list of input partitions for all historical Hive queries in a system provides a great opportunity to insights on data access frequency and potential storage tiering. This...
http://issues.apache.org/jira/browse/HIVE-20261    Author: Zheng Shao, 2018-08-02, 11:30
SQL Query Set Analyzer - Calcite - [mail # dev]
...Hi, We are thinking about starting a project to analyze huge number of SQL queries (think millions) to identify common patterns: * Common sub queries * Common filtering conditions (columns) for ...
   Author: Zheng Shao, 2018-07-26, 04:07
Clustering and Large-scale analysis of Hive Queries - Hadoop - [mail # user]
...Hi, I am interested in working on a project that takes a large number of Hive queries (as well as their meta data like amount of resources used etc) and find out common sub queries and expensiv...
   Author: Zheng Shao, 2018-07-25, 19:13