
FirstKeyOnlyFilter: Usage and Example

 
http://blog.csdn.net/liuxiaochen123/article/details/7737718

The API documentation describes FirstKeyOnlyFilter as follows:
A filter that will only return the first KV from each row.

This filter can be used to more efficiently perform row count operations.

That is stated plainly: the filter returns only the first KV of each row, so it can be used for counting rows, and it is fast.
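As a rough mental model of what that means, here is a plain-Java sketch with no HBase dependency. The class `FirstKvModel`, its methods, and the sample rows are all hypothetical and only mimic the behavior described above: each row holds its cells in sorted column order, and only the first cell per row is passed along, so a count still sees every row while carrying far less data.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Toy model of the "first KV per row" behavior; this is NOT the HBase API.
class FirstKvModel {

    // Keep only the first (lowest-sorted) column of every non-empty row.
    static Map<String, Map.Entry<String, String>> firstKvPerRow(
            Map<String, TreeMap<String, String>> table) {
        Map<String, Map.Entry<String, String>> out = new LinkedHashMap<>();
        for (Map.Entry<String, TreeMap<String, String>> row : table.entrySet()) {
            if (!row.getValue().isEmpty()) {
                out.put(row.getKey(), row.getValue().firstEntry());
            }
        }
        return out;
    }

    // Hypothetical sample data: two rows with several columns each.
    static Map<String, TreeMap<String, String>> sampleTable() {
        Map<String, TreeMap<String, String>> table = new TreeMap<>();
        TreeMap<String, String> r1 = new TreeMap<>();
        r1.put("cf:a", "1");
        r1.put("cf:b", "2");
        r1.put("cf:c", "3");
        TreeMap<String, String> r2 = new TreeMap<>();
        r2.put("cf:x", "9");
        r2.put("cf:y", "8");
        table.put("row1", r1);
        table.put("row2", r2);
        return table;
    }

    public static void main(String[] args) {
        Map<String, Map.Entry<String, String>> filtered = firstKvPerRow(sampleTable());
        // One cell per row survives, so the row count is unchanged.
        System.out.println("rows counted: " + filtered.size());
        System.out.println("row1 first KV: " + filtered.get("row1").getKey());
    }
}
```

In this model five cells shrink to two, yet the number of rows stays the same, which is the property a row count relies on.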

The code is as follows:

Corrections and feedback are welcome.





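import java.io.IOException;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;

// Method of a class that provides the fields conf (HBase Configuration) and log.
public int getCount() {
    long bef = System.currentTimeMillis();
    int i = 0;
    ResultScanner rs = null;
    try {
        HTable tableKeyword = new HTable(conf, "tableName");
        tableKeyword.setScannerCaching(500);
        Scan s = new Scan();
        s.setCaching(500);                      // rows fetched per round trip
        s.setCacheBlocks(false);                // keep the full scan out of the block cache
        s.setFilter(new FirstKeyOnlyFilter());  // return only the first KV of each row
        rs = tableKeyword.getScanner(s);
        for (Result r : rs) {
            i++;
        }
    } catch (IOException e) {
        log.warn("Row count scan failed", e);
    } finally {
        if (rs != null) {
            rs.close();                         // rs stays null if the scanner never opened
        }
    }
    long now = System.currentTimeMillis();
    log.warn("Total rows in keyword table: " + i + ", elapsed seconds: " + (now - bef) / 1000.0);
    return i;
}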





It is best to set all three of these parameters:
tableKeyword.setScannerCaching(500);
s.setCaching(500);
s.setCacheBlocks(false);

Without them the scan slows down considerably.
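A back-of-the-envelope way to see why the caching value matters: if the scanner fetches `caching` rows per round trip, a full scan needs roughly the row count divided by `caching` fetches, rounded up. The sketch below is plain Java with no HBase dependency; the class `ScanRoundTrips`, the method `estimate`, and the table size of 1,000,000 rows are hypothetical, and the model ignores the extra calls needed to open and close a scanner.

```java
// Back-of-the-envelope model of scanner round trips; not HBase code.
class ScanRoundTrips {

    // Rows arrive in batches of `caching`, so the number of fetches is
    // the row count divided by the batch size, rounded up.
    static long estimate(long rows, int caching) {
        if (rows < 0 || caching <= 0) {
            throw new IllegalArgumentException("rows must be >= 0 and caching > 0");
        }
        return (rows + caching - 1) / caching;
    }

    public static void main(String[] args) {
        long rows = 1_000_000L; // hypothetical table size
        System.out.println("caching=1   -> " + estimate(rows, 1) + " fetches");
        System.out.println("caching=500 -> " + estimate(rows, 500) + " fetches");
    }
}
```

Under these assumptions, one row per fetch costs 1,000,000 round trips, while 500 rows per fetch costs 2,000.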

Overall, this approach can save a lot of time.
