Study notes and summary below.
Part 1: Table attributes
attr | default | usage/principle | use case | note |
Bloom filter | disabled | spends some memory to improve lookup time TBD | tables read mostly by random gets rather than large range scans | one of 'ROW', 'ROWCOL', or 'NONE' |
Column families | | must be a printable string, since the name is used as the directory name under the region-name directory | | |
Maximum file size | 10G in 0.94.2 | in fact the maximum store size (maxStoreSize), i.e. the property "hbase.hregion.max.filesize" set in hbase-site.xml | | |
Read-only | false | like firmware: keeps the data safe, i.e. a 'dead' table that never changes | | |
Memstore flush size | 128m in 0.94.2 | same effect as the property 'hbase.hregion.memstore.flush.size' in hbase-site.xml | 1. this value determines how often store files are generated 2. because of 1, it affects the HLog replay time when a region server goes down | |
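Deferred log flush | false | if true, edits are buffered and synced to the HLog periodically, at the interval set by 'hbase.regionserver.optionallogflushinterval'; if false, each edit is synced as it is written | true may cause data loss, because the buffered edits live only in memory until they are synced to the filesystem | |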
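The Bloom filter row above trades a little memory for faster lookups. The sketch below is a toy model of that idea, not HBase's implementation: each store file carries a small bit array, and a get only opens the files whose filter cannot rule the row out. The names `BloomFilter`, `build_store_file`, `files_to_read`, and the store-file names `sf1`..`sf3` are all hypothetical and exist only for illustration.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


def build_store_file(rows):
    """Hypothetical store file: a filter plus the row keys it holds."""
    bloom = BloomFilter()
    for row in rows:
        bloom.add(row)
    return bloom, set(rows)


def files_to_read(store_files, row_key):
    """Names of the store files whose filter cannot exclude row_key."""
    return [name for name, (bloom, _rows) in store_files.items()
            if bloom.might_contain(row_key)]


store_files = {
    "sf1": build_store_file(["row-001", "row-002"]),
    "sf2": build_store_file(["row-100", "row-101"]),
    "sf3": build_store_file(["row-200"]),
}
candidates = files_to_read(store_files, "row-100")
print(candidates)  # always includes "sf2"; other files are skipped unless a false positive occurs
```

A range scan gets little benefit from this, since the filter answers "is this exact key here?" and not "is any key in this range here?".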
Part 2: Column Family attributes
attr | default | usage/principle | use case | note |
In-memory | false | caches some blocks of a small family in memory to speed up queries | small tables, analogous to a secondary index table | no guarantee of when, or how many, blocks are cached |
Bloom filter | see Part 1 | |||
Replication scope | 0 (disabled) | syncs local cluster data to remote clusters TBD | for load balancing by distributing requests across clusters? | |
Maximum versions | 3 | controls how many versions (changes) of a cell are kept in storage | use 1 in general. If you also want to check the previous version, 2 is a good choice. This interacts with 'Time-to-live' | |
Compression | none | compresses this family with the specified codec: SNAPPY, LZO, GZ.. | be completely clear about your requirements first, then pick the matching codec | |
Block size | 64k | a store file is split into blocks of this size, so smaller blocks make random reads faster; use larger blocks for sequential reads TBD | | |
Block cache | true | when rows are read from HBase, this determines whether they are written back to the cache to speed up later access | use 'true' if clients repeatedly read the same rows; 'false' for whole-table scans or for a system with fewer reads than writes | |
Time-to-live | max.int (in seconds) | how long a cell value is kept in storage | if this is a 'recycling' (i.e. rolling) system, use an appropriate value to keep the data size bounded | interacts with 'Maximum versions': the two attributes together control which versions of the data are kept |
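The interaction between 'Maximum versions' and 'Time-to-live' can be sketched with a toy model. This is only an illustration under simplifying assumptions (timestamps in seconds, a version expires once it is older than the TTL, and the newest surviving versions are kept up to the limit); the function name `surviving_versions` and the sample data are hypothetical, and HBase's real retention logic has more cases than this.

```python
def surviving_versions(versions, now, max_versions, ttl_seconds):
    """Return the versions a reader could still see, newest first.

    versions: list of (timestamp_seconds, value) pairs for one cell.
    """
    # Step 1: TTL drops every version older than ttl_seconds.
    fresh = [(ts, value) for ts, value in versions if now - ts <= ttl_seconds]
    # Step 2: of what remains, keep only the newest max_versions.
    fresh.sort(key=lambda pair: pair[0], reverse=True)
    return fresh[:max_versions]


history = [(0, "a"), (100, "b"), (200, "c"), (300, "d")]

# TTL is the tighter limit here: only two versions are young enough.
print(surviving_versions(history, now=350, max_versions=3, ttl_seconds=200))
# [(300, 'd'), (200, 'c')]

# With an effectively unlimited TTL, the version count is the limit.
print(surviving_versions(history, now=350, max_versions=3, ttl_seconds=2**31 - 1))
# [(300, 'd'), (200, 'c'), (100, 'b')]
```

Whichever limit is tighter decides what is kept, which is why the two attributes need to be chosen together.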
Ref:
HBase: The Definitive Guide