HBase -ROOT- and .META. tables: structure and stored data

chengjianxiaoxue

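0 Introduction to the HBase -ROOT- and .META. tables:

-ROOT- and .META. are HBase's built-in tables. In storage structure and in the operations they support, they are no different from any other HBase table.

What differs is what they store: they record how regions are distributed and the details of each region.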

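1 The structure of the two tables is as follows:

2 The relationship between the -ROOT- table, the .META. table, and the actual regions is as follows:

3 Data in the -ROOT- table:

4 A row record in .META.: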

5 How a row is looked up:

a) Code location:

The main code for the whole routing process is in org.apache.hadoop.hbase.client.HConnectionManager.TableServers:

private HRegionLocation locateRegion(final byte[] tableName,  
        final byte[] row, boolean useCache) throws IOException {  
    if (tableName == null || tableName.length == 0) {  
        throw new IllegalArgumentException("table name cannot be null or zero length");  
    }  
    if (Bytes.equals(tableName, ROOT_TABLE_NAME)) {  
        synchronized (rootRegionLock) {  
            // This block guards against two threads trying to find the root  
            // region at the same time. One will go do the find while the  
            // second waits. The second thread will not do find.  
            if (!useCache || rootRegionLocation == null) {  
                this.rootRegionLocation = locateRootRegion();  
            }  
            return this.rootRegionLocation;  
        }  
    } else if (Bytes.equals(tableName, META_TABLE_NAME)) {  
        return locateRegionInMeta(ROOT_TABLE_NAME, tableName, row, useCache, metaRegionLock);  
    } else {  
        // Region not in the cache - have to go to the meta RS
        return locateRegionInMeta(META_TABLE_NAME, tableName, row, useCache, userRegionLock);  
    }  
}

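For example, to read the row with RowKey RK10000 from Table2:

Find the RegionServer for RowKey RK10000 of Table2 =>
find the RegionServer for RowKey Table2,RK10000,99999999999999 of .META. =>
find the RegionServer for RowKey .META.,Table2,RK10000,99999999999999,99999999999999 of -ROOT- =>
find the RegionServer of -ROOT- =>
get the RegionServer of -ROOT- from ZooKeeper =>
in -ROOT-, find the row whose RowKey is closest to (and less than) .META.,Table2,RK10000,99999999999999,99999999999999, which gives the RegionServer of .META. =>
in .META., find the row whose RowKey is closest to (and less than) Table2,RK10000,99999999999999, which gives the RegionServer of Table2 =>
read the RK10000 row from Table2

At this point the client has completed the whole RegionServer routing process. Every step uses the same technique: append the suffix "99999999999999" and look up the closest RowKey that is less than the result. Because the suffix sorts after any real timestamp at the end of a catalog RowKey, the closest smaller key belongs to the region whose start key is at or before the requested row. It is worth thinking this through; it is not hard to understand.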
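The "append a suffix, then take the closest smaller RowKey" step can be sketched with a sorted map standing in for a catalog table. This is only a toy model under stated assumptions: the class name, the server names, and the region ids are made up for illustration, and plain string comparison replaces the byte-level comparison HBase actually uses.

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Toy model of the catalog lookup described above. A sorted in-memory map
 * stands in for .META.; this is not HBase code.
 */
class CatalogLookupSketch {
    // Suffix from the walkthrough; it sorts after the region ids used below.
    static final String MAX_SUFFIX = "99999999999999";

    /** Builds the probe key "<table>,<row>,99999999999999". */
    static String probeKey(String table, String row) {
        return table + "," + row + "," + MAX_SUFFIX;
    }

    /** Returns the server stored under the greatest key strictly less than the probe key. */
    static String locate(TreeMap<String, String> catalog, String table, String row) {
        Map.Entry<String, String> e = catalog.lowerEntry(probeKey(table, row));
        if (e == null || !e.getKey().startsWith(table + ",")) {
            throw new IllegalStateException("no region found for " + table + "/" + row);
        }
        return e.getValue();
    }

    public static void main(String[] args) {
        // Hypothetical catalog rows: key = "<table>,<startKey>,<timestamp>", value = server.
        TreeMap<String, String> meta = new TreeMap<>();
        meta.put("Table2,,1300000000000", "rs1.example.com");
        meta.put("Table2,RK05000,1300000000001", "rs2.example.com");
        meta.put("Table2,RK20000,1300000000002", "rs3.example.com");

        // RK10000 falls in the region that starts at RK05000.
        System.out.println(locate(meta, "Table2", "RK10000")); // prints rs2.example.com
    }
}
```

Note that a row equal to a region's start key (for example RK05000) still resolves to that region, because the probe key's suffix makes it sort after the stored key.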

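Finally, two things to note:

1. The Master server is not involved anywhere in the routing process. Routine HBase data operations do not need the Master, so they put no load on it.

2. The client does not repeat this whole routing process for every data operation; much of the data is cached.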
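The second point can be illustrated with a small client-side cache. This is a minimal sketch, not the HBase client's actual cache: the class, the `Region` type, and the `catalogLookup` function (which stands in for the full ZooKeeper / -ROOT- / .META. route) are all hypothetical names introduced here.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

/** Toy cache of region locations for a single table, keyed by region start key. */
class RegionLocationCacheSketch {
    /** Covers rows in [startKey, endKey); an empty endKey means "to the end of the table". */
    static final class Region {
        final String startKey;
        final String endKey;
        final String server;

        Region(String startKey, String endKey, String server) {
            this.startKey = startKey;
            this.endKey = endKey;
            this.server = server;
        }

        boolean contains(String row) {
            return row.compareTo(startKey) >= 0
                    && (endKey.isEmpty() || row.compareTo(endKey) < 0);
        }
    }

    private final TreeMap<String, Region> byStartKey = new TreeMap<>();
    private final Function<String, Region> catalogLookup; // hypothetical full routing step
    int catalogLookups = 0; // counts how often the full route was needed

    RegionLocationCacheSketch(Function<String, Region> catalogLookup) {
        this.catalogLookup = catalogLookup;
    }

    String locate(String row) {
        // Candidate: the cached region with the greatest start key <= row.
        Map.Entry<String, Region> e = byStartKey.floorEntry(row);
        if (e != null && e.getValue().contains(row)) {
            return e.getValue().server; // cache hit: no catalog traffic
        }
        catalogLookups++; // cache miss: do the full route once and remember it
        Region r = catalogLookup.apply(row);
        byStartKey.put(r.startKey, r);
        return r.server;
    }

    public static void main(String[] args) {
        RegionLocationCacheSketch cache = new RegionLocationCacheSketch(
                row -> new Region("RK05000", "RK20000", "rs2.example.com"));
        cache.locate("RK10000"); // miss, goes through the catalog
        cache.locate("RK15000"); // hit, same region
        System.out.println(cache.catalogLookups); // prints 1
    }
}
```

A real client would also have to drop stale entries when a region moves to another server; that is left out of this sketch.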

6 How to view the data in the hbase meta table (hbase:meta is the name the .META. table has in HBase 0.96 and later):

hbase(main):012:0> scan 'hbase:meta'

7 Summary of the fields in a record of the hbase meta and root tables:

rowkey: tablename+startkey+timestamp

regioninfo: startkey+endkey+family-column
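As a rough illustration of the rowkey layout, the sketch below builds and splits a key of the form tablename,startkey,timestamp. It assumes the comma separator seen in the lookup walkthrough and assumes the table name itself contains no comma; the class and method names are hypothetical, and real region names may carry extra parts that this sketch ignores.

```java
/** Toy helper for the "tablename,startkey,timestamp" rowkey layout; not HBase code. */
class MetaRowKeySketch {
    /** Joins the three parts with commas. */
    static String rowKey(String table, String startKey, long timestamp) {
        return table + "," + startKey + "," + timestamp;
    }

    /**
     * Splits a rowkey into {table, startKey, timestamp}. The start key may itself
     * contain commas, so the first and the last comma are used as boundaries.
     */
    static String[] parse(String rowKey) {
        int first = rowKey.indexOf(',');
        int last = rowKey.lastIndexOf(',');
        if (first < 0 || first == last) {
            throw new IllegalArgumentException("expected table,startKey,timestamp: " + rowKey);
        }
        return new String[] {
            rowKey.substring(0, first),
            rowKey.substring(first + 1, last),
            rowKey.substring(last + 1)
        };
    }

    public static void main(String[] args) {
        // Prints Table2,RK05000,1300000000001
        System.out.println(rowKey("Table2", "RK05000", 1300000000001L));
    }
}
```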

Reference links:

http://blog.csdn.net/chlaws/article/details/16918913

http://www.aboutyun.com/thread-9100-1-2.html (well written)