HBase API Operation Optimization

oracle_api

Blog category: HBase

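I. Put optimization

The HBase API provides a client-side write buffer. The buffer collects put operations and then sends them to the server together in a single RPC call. The write buffer is disabled by default; call table.setAutoFlushTo(false) (table.setAutoFlush(false) in older, now deprecated, API versions) to enable it: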

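	// Assumes `conn` is an open org.apache.hadoop.hbase.client.Connection
	// and that table "t1" has a column family "cf".
	@Test
	public void testWriteBuffer() throws Exception {
		HTable table = (HTable) conn.getTable(TableName.valueOf("t1"));
		// Uncomment to turn auto-flush off, which enables the client-side write buffer.
		//table.setAutoFlushTo(false);
		long start = System.currentTimeMillis();
		for (int i = 10001; i < 20000; i++) {
			Put put = new Put(Bytes.toBytes("row" + i));
			put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("name"), Bytes.toBytes("terry" + i));
			put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("job"), Bytes.toBytes("manager" + i));
			table.put(put);
		}
		// Optional explicit flush; close() below also submits anything still buffered.
		//table.flushCommits();
		table.close();
		// Elapsed time in milliseconds, including the final flush done by close().
		System.out.println(System.currentTimeMillis() - start);
	}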

Test results (elapsed time in milliseconds):

1 With table.setAutoFlushTo(false): 864. Note that even without calling table.flushCommits(), the buffered contents are submitted when table.close() runs.

2 Without table.setAutoFlushTo(false): 25443
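Most of the gap between the two timings can be explained by the number of round trips to the server. The following is a minimal sketch of that idea in plain Java, not HBase code: the class WriteBufferSketch, its fields, and the 100-byte put size are all hypothetical, and it ignores details such as puts being grouped per region server.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a client-side write buffer (hypothetical, not the HBase implementation). */
public class WriteBufferSketch {
    private final long bufferSizeBytes;  // flush threshold, analogous to the write buffer size
    private final boolean autoFlush;     // true = every put is sent immediately
    private final List<byte[]> pending = new ArrayList<>();
    private long pendingBytes = 0;
    private int rpcCount = 0;            // simulated round trips to the server

    public WriteBufferSketch(long bufferSizeBytes, boolean autoFlush) {
        this.bufferSizeBytes = bufferSizeBytes;
        this.autoFlush = autoFlush;
    }

    /** Accepts one serialized put; flushes if auto-flush is on or the buffer is full. */
    public void put(byte[] serializedPut) {
        pending.add(serializedPut);
        pendingBytes += serializedPut.length;
        if (autoFlush || pendingBytes >= bufferSizeBytes) {
            flush();
        }
    }

    /** Sends everything that is pending in one simulated round trip. */
    public void flush() {
        if (pending.isEmpty()) {
            return;
        }
        rpcCount++;
        pending.clear();
        pendingBytes = 0;
    }

    /** Like table.close() in the test above: flushes whatever is still buffered. */
    public void close() {
        flush();
    }

    public int getRpcCount() {
        return rpcCount;
    }

    public static void main(String[] args) {
        byte[] fakePut = new byte[100];  // assumed size of one put, for illustration only
        long twoMb = 2L * 1024 * 1024;
        WriteBufferSketch unbuffered = new WriteBufferSketch(twoMb, true);
        WriteBufferSketch buffered = new WriteBufferSketch(twoMb, false);
        for (int i = 10001; i < 20000; i++) {  // same 9999 rows as the test above
            unbuffered.put(fakePut);
            buffered.put(fakePut);
        }
        unbuffered.close();
        buffered.close();
        System.out.println("auto-flush on:  " + unbuffered.getRpcCount() + " round trips");
        System.out.println("auto-flush off: " + buffered.getRpcCount() + " round trips");
    }
}
```

In this model the 9999 puts cost 9999 round trips with auto-flush on, and a single one with auto-flush off, because the simulated data never reaches the 2 MB threshold and is only sent by close().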

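The write buffer defaults to 2 MB. If you need to store a larger amount of data in one go, consider increasing this value.

Method 1: change the write buffer size temporarily, for one table instance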

table.setWriteBufferSize(writeBufferSize);

Method 2: change it once, for all clients, in hbase-site.xml (the value below is the 2 MB default, in bytes)

  <property>
    <name>hbase.client.write.buffer</name>
    <value>2097152</value>
  </property>
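One rough way to reason about a larger value is to estimate how many flushes a given workload would need at different buffer sizes. The sketch below is plain arithmetic in Java; the class WriteBufferSizing, the workload of one million puts, and the 200-byte average put size are assumptions for illustration, and real per-put memory use depends on the data.

```java
/** Back-of-the-envelope estimate of flush counts (hypothetical helper, not HBase code). */
public class WriteBufferSizing {

    /** Estimated number of flushes: total bytes written divided by the buffer size, rounded up. */
    static long estimatedFlushes(long totalPuts, long bytesPerPut, long bufferBytes) {
        long totalBytes = Math.multiplyExact(totalPuts, bytesPerPut);
        return (totalBytes + bufferBytes - 1) / bufferBytes;
    }

    public static void main(String[] args) {
        long totalPuts = 1_000_000L;   // assumed workload
        long bytesPerPut = 200L;       // assumed average size of one put
        long[] candidates = {2L * 1024 * 1024, 8L * 1024 * 1024};
        for (long bufferBytes : candidates) {
            System.out.println("buffer " + bufferBytes + " bytes -> about "
                    + estimatedFlushes(totalPuts, bytesPerPut, bufferBytes) + " flushes");
        }
    }
}
```

A larger buffer means fewer round trips but more client memory held per table instance, and more data still unsent if the client fails before a flush.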

Submitting the puts as a List also speeds up writes. The following test took 614 ms:

	@Test
	public void testPutList() throws Exception {
		HTable table = (HTable) conn.getTable(TableName.valueOf("t1"));
		List<Put> putList = new ArrayList<Put>();
		long start = System.currentTimeMillis();
		// Build all puts on the client first; nothing is sent inside the loop.
		for (int i = 30001; i < 40000; i++) {
			Put put = new Put(Bytes.toBytes("row" + i));
			put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("name"), Bytes.toBytes("terry" + i));
			put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("job"), Bytes.toBytes("manager" + i));
			putList.add(put);
		}
		// Submit the whole list in one call.
		table.put(putList);
		table.close();
		System.out.println(System.currentTimeMillis() - start);
	}
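The list in the test above holds fewer than ten thousand puts. For much larger loads, keeping every put in one list can use a lot of client memory, so one option is to submit the list in fixed-size chunks. The sketch below shows only the chunking step in plain Java; BatchChunker and the chunk size of 1000 are hypothetical, and strings stand in for Put objects so that it runs without an HBase cluster.

```java
import java.util.ArrayList;
import java.util.List;

/** Splits a list into fixed-size chunks (hypothetical helper, independent of HBase). */
public class BatchChunker {

    /** Returns consecutive sub-lists of at most chunkSize elements, in the original order. */
    static <T> List<List<T>> chunk(List<T> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int from = 0; from < items.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, items.size());
            // Copy the view so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(items.subList(from, to)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> rowKeys = new ArrayList<>();
        for (int i = 30001; i < 40000; i++) {  // same row range as the test above
            rowKeys.add("row" + i);
        }
        List<List<String>> chunks = chunk(rowKeys, 1000);
        // In real code each chunk would be a List<Put> handed to table.put(chunk).
        System.out.println(chunks.size() + " chunks, last one has "
                + chunks.get(chunks.size() - 1).size() + " rows");
    }
}
```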

II. Scan optimization

Setting the scanner caching size can improve scanner performance:

	@Test
	public void testScanCache() throws Exception {
		HTable table = (HTable) conn.getTable(TableName.valueOf("t1"));
		// Scan rows from "row0" (inclusive) up to "row999" (exclusive).
		Scan scan = new Scan(Bytes.toBytes("row0"), Bytes.toBytes("row999"));
		// Number of rows fetched from the server per RPC.
		scan.setCaching(100);
		ResultScanner rs = table.getScanner(scan);
		Iterator<Result> it = rs.iterator();
		long start = System.currentTimeMillis();
		while (it.hasNext()) {
			Result r = it.next();
			String name = Bytes.toString(r.getValue(Bytes.toBytes("cf"), Bytes.toBytes("name")));
			String job = Bytes.toString(r.getValue(Bytes.toBytes("cf"), Bytes.toBytes("job")));
			System.out.println(String.format("name=%s, job=%s", name, job));
		}
		// Release the server-side scanner before closing the table.
		rs.close();
		table.close();
		System.out.println(System.currentTimeMillis() - start);
	}

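scan.setCaching(int value): value is the number of rows fetched per RPC. The default is taken from hbase.client.scanner.caching in hbase-site.xml, which is 2147483647 (Integer.MAX_VALUE) in the HBase version used here; with that default, the amount returned per RPC is effectively bounded by hbase.client.scanner.max.result.size instead. That is why calling scan.setCaching(100) in the example above lowers performance rather than improving it.

Setting scanner caching too high has drawbacks as well, such as RPC timeouts or a response that is larger than the client's heap.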
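The trade-off can be made concrete with some simple arithmetic. The sketch below is a simplified model in plain Java, not a measurement: ScanCachingEstimate, the one million rows, and the 1 KB average row size are assumptions, and the model ignores the calls that open and close the scanner as well as any server-side limit on result size.

```java
/** Simplified model of the scanner caching trade-off (hypothetical helper, not HBase code). */
public class ScanCachingEstimate {

    /** Number of fetch calls needed to read totalRows when each call returns up to `caching` rows. */
    static long fetchCalls(long totalRows, int caching) {
        if (caching <= 0) {
            throw new IllegalArgumentException("caching must be positive");
        }
        return (totalRows + caching - 1) / caching;
    }

    /** Approximate payload of one full fetch call, in bytes. */
    static long bytesPerCall(int caching, long avgRowBytes) {
        return Math.multiplyExact((long) caching, avgRowBytes);
    }

    public static void main(String[] args) {
        long totalRows = 1_000_000L;  // assumed table size
        long avgRowBytes = 1024L;     // assumed average row size
        int[] cachingValues = {1, 100, 10_000};
        for (int caching : cachingValues) {
            System.out.println("caching=" + caching
                    + " -> " + fetchCalls(totalRows, caching) + " fetch calls, about "
                    + bytesPerCall(caching, avgRowBytes) + " bytes per call");
        }
    }
}
```

Raising the caching value cuts the number of calls but makes each response larger, which is where the timeout and heap risks come from.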

2017-03-15 17:23