Over the past few days I tuned a few HBase parameters and got some interesting results. See my emails below for the details.
For example, for a fixed amount of data, I can tune hbase.hregion.max.filesize to increase or decrease the total number of regions, right?
I want to know whether the number of regions affects performance in random read tests. In my YCSB test, I observed better throughput and lower latency with a larger HFile size.
Can anybody give me some hints? Thanks.
Tao
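To make the relationship in the question concrete, here is a minimal Python sketch of how hbase.hregion.max.filesize bounds the region count for a fixed amount of data. The 100 GB table size and the helper name `estimated_region_count` are hypothetical, chosen only for illustration. The estimate assumes every region grows all the way to the split threshold, so real clusters will usually show more regions than this.

```python
import math


def estimated_region_count(total_data_mb: float, max_filesize_mb: float) -> int:
    """Rough lower bound on the number of regions.

    Assumes each region is filled up to the split threshold
    (hbase.hregion.max.filesize). Regions are roughly half that size
    right after a split, so the real count is typically higher.
    """
    return max(1, math.ceil(total_data_mb / max_filesize_mb))


if __name__ == "__main__":
    total_mb = 100 * 1024  # hypothetical table size: 100 GB
    for max_filesize_mb in (256, 1024):
        regions = estimated_region_count(total_mb, max_filesize_mb)
        print(f"max.filesize={max_filesize_mb} MB -> at least {regions} regions")
```

Raising the threshold by a factor of four cuts the estimated region count by the same factor, which is the knob the question refers to.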
Date: January 18
Hi Tao,
I don't think the number of regions will have much impact on random read throughput and latency, but the number of generations (HFiles) per region will.
If this is the case, try running a major compaction on the table. This merges the HFile generations, so read throughput and latency should recover. You can do this from the hbase shell.
Also, you might want to increase hbase.hregion.memstore.flush.size to keep the number of HFile generations small.
Thanks,
--
Tatsuya Kawano (Mr.)
Tokyo, Japan
Thanks for the response.
I tuned dfs.block.size and hbase.hregion.max.filesize for my tests (pure read tests) and got the results below:
Test  dfs.block.size  hbase.hregion.max.filesize  requests/sec  latency
1     32              1024                        ~4000         24
2     256             256                         ~4500         22
3     1024            1024                        ~5000         20
My understanding of the results is that with fewer HDFS blocks per HFile, looking up a random row is faster because it avoids jumping from one block to another (Test 1 vs. Test 2). Is performance also better with fewer but bigger regions (Test 2 vs. Test 3)?
I agree that the number of HFiles per region has an impact, but I did run a major compaction in every test, using the command line:
major_compact 'mytable'
and checked that each region has only one storefile.
Is that correct?
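The reasoning above can be checked with some quick arithmetic. The Python sketch below takes the three rows of the results table and computes how many HDFS blocks a fully grown store file would span, plus the throughput gain relative to Test 1. It assumes both settings are expressed in the same unit and that each region holds a single store file that has reached hbase.hregion.max.filesize. The helper names are hypothetical.

```python
import math

# Figures copied from the results table. The unit of the two size
# settings is assumed to be the same (the thread does not state it).
RESULTS = [
    {"test": 1, "block_size": 32, "max_filesize": 1024, "requests_per_sec": 4000},
    {"test": 2, "block_size": 256, "max_filesize": 256, "requests_per_sec": 4500},
    {"test": 3, "block_size": 1024, "max_filesize": 1024, "requests_per_sec": 5000},
]


def blocks_per_hfile(block_size: int, max_filesize: int) -> int:
    """HDFS blocks spanned by one store file that has grown to max_filesize."""
    return max(1, math.ceil(max_filesize / block_size))


def throughput_gain_percent(baseline: float, value: float) -> float:
    """Relative throughput change versus the baseline, in percent."""
    return (value - baseline) / baseline * 100


if __name__ == "__main__":
    baseline = RESULTS[0]["requests_per_sec"]
    for row in RESULTS:
        blocks = blocks_per_hfile(row["block_size"], row["max_filesize"])
        gain = throughput_gain_percent(baseline, row["requests_per_sec"])
        print(f"Test {row['test']}: {blocks} block(s) per HFile, {gain:+.1f}% vs. Test 1")
```

Under these assumptions, Test 1 spreads each store file over 32 blocks while Tests 2 and 3 keep it in one, and throughput rises by 12.5% and 25% respectively. Because Test 3 changes both settings relative to Test 2, the table alone cannot separate the effect of block size from the effect of region size.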
Hi Tao,
Thanks for sharing the test result.
> but I truly
> all did major compaction using the command line:
> major_compact 'mytable'
> and checked each region has only one storefile.
> My understanding to the results is that, with less hdfs blocks hfile can
> speed up the lookup for a random row, avoiding jumping from one block to
> another (Test 1 vs. Test2)
Thanks,
--
Tatsuya Kawano (Mr.)
Tokyo, Japan
Along with Tatsuya, I thank you for sharing this interesting result.
I too wonder why the bigger block makes a difference -- 25%
improvement is a bunch -- since we set up a socket on each random read
and seek the block (we do not currently reuse connection if correct
block is already in the breach)?
Thanks for trying this experiment.
St.Ack