
HBase study notes - operation tasks

 
Node Decommissioning
1. Stop the region server process directly:
$ ./bin/hbase-daemon.sh stop regionserver

Disabling the load balancer before decommissioning a node:
hbase(main):001:0> balance_switch false
hbase(main):002:0> balance_switch true
The first command disables the balancer; the second re-enables it once the decommissioning is done.

2. Use the graceful stop script, which moves the regions off the server before stopping it:
$ ./bin/graceful_stop.sh HOSTNAME
where HOSTNAME is the host carrying the region server you want to decommission.
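Putting the pieces together, a minimal decommissioning sequence could look like this sketch (the hostname rs3.foo.com is hypothetical):
$ echo "balance_switch false" | ./bin/hbase shell
$ ./bin/graceful_stop.sh rs3.foo.com
$ echo "balance_switch true" | ./bin/hbase shell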

Rolling Restarts
1. Unpack your release, make sure of its configuration, and then rsync it across
the cluster. If you are using version 0.90.2, patch it with HBASE-3744 and
HBASE-3756.
2. Run hbck to ensure the cluster is consistent:
$ ./bin/hbase hbck
Effect repairs if it reports inconsistencies (see the hbck sketch after this list).
3. Restart the master:
$ ./bin/hbase-daemon.sh stop master; ./bin/hbase-daemon.sh start master
4. Disable the region balancer:
$ echo "balance_switch false" | ./bin/hbase shell
5. Run the graceful_stop.sh script per region server. For example:
$ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh \
--restart --reload --debug $i; done &> /tmp/log.txt &
If you are running Thrift or REST servers on the region server, pass the --thrift
or --rest option, as per the script's usage instructions (run it without any
command-line options to print them).
6. Restart the master again. This will clear out the dead servers list and re-enable the
balancer.
7. Run hbck to ensure the cluster is consistent.
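If hbck reports inconsistencies in steps 2 or 7, you can ask it to attempt repairs; a minimal sketch using its -fix option:
$ ./bin/hbase hbck -fix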

Pseudodistributed mode

Adding Servers
Starting a local backup master process is accomplished by
using the local-master-backup.sh script in the bin directory, like so:
$ ./bin/local-master-backup.sh start 1
The number at the end of the command signifies an offset that is added to the default ports of 60000 for RPC and 60010 for the web-based UI. In this example, a new master process would be started that reads the same configuration files as usual, but would listen on ports 60001 and 60011, respectively.

$ ./bin/local-master-backup.sh start 1 3 5
This starts three backup masters on ports 60001, 60003, and 60005 for RPC, plus 60011, 60013, and 60015 for the web UIs.

Stopping the backup master(s) involves the same command, but replacing the start command with the aptly named stop, like so:
$ ./bin/local-master-backup.sh stop 1


Adding a local region server.
$ ./bin/local-regionservers.sh start 1
This command will start an additional region server using port 60201 for RPC, and 60301 for the web UI.
Starting more than one region server is accomplished by adding more offsets:
$ ./bin/local-regionservers.sh start 1 2 3
Stopping any additional region server involves replacing the start command with the stop command:
$ ./bin/local-regionservers.sh stop 1

Fully distributed cluster

Adding a backup master.

The master process uses ZooKeeper to negotiate which is the currently active master:
there is a dedicated ZooKeeper znode that all master processes race to create, and the first one to create it wins. This happens at startup and the winning process moves on to become the current master. All other machines simply loop around the znode check
and wait for it to disappear—triggering the race again.
The /hbase/master znode is ephemeral, and is the same kind the region servers use to
report their presence. When the master process that created the znode fails, ZooKeeper will notice the end of the session with that server and remove the znode accordingly, triggering the election process.
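To see which process currently holds the master role, you can inspect the znode directly; a sketch using the ZooKeeper command-line client bundled with HBase (this assumes the default /hbase root znode; ZooKeeper's own zkCli.sh works the same way):
$ ./bin/hbase zkcli
[zk: localhost:2181(CONNECTED) 0] get /hbase/master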
Starting a backup master on another machine requires that it is configured just like the rest of the HBase cluster (see “Configuration” on page 63 for details). The master servers
usually share the same configuration with the other servers in the cluster. Once you
have confirmed that this is set up appropriately, you can run the following command
on a server that is supposed to host the backup master:
$ ./bin/hbase-daemon.sh start master
Assuming you already had a master running, this command will bring up the new master to the point where it waits for the znode to be removed. If you want to start many masters in an automated fashion and dedicate a specific server to host the current one, while all the others are considered backup masters, you can add the --backup
switch like so:
$ ./bin/hbase-daemon.sh start master --backup

Since HBase 0.90.x, there is also the option of creating a backup-masters file in the conf directory. This is akin to the regionservers file, listing one hostname per line that is supposed to start a backup master. For the example in “Example Configuration”
on page 65, we could assume that we have three backup masters running on the ZooKeeper servers. In that case, the conf/backup-masters file would contain these entries:
zk1.foo.com
zk2.foo.com
zk3.foo.com

Adding a region server.
The first thing you should do is to edit the regionservers
file in the conf directory, to enable the launcher scripts to automate the server start and stop procedure. Simply add a new line to the file specifying the hostname to add.

Then you have a few choices to start the new region server process. One option is to run the start-hbase.sh script on the master machine.

Another option is to use the launcher script directly on the new server. This is done like so:
$ ./bin/hbase-daemon.sh start regionserver
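To confirm that the new region server has registered with the master, you can check the master's web UI or run a quick status check from the shell; a minimal sketch:
$ echo "status 'simple'" | ./bin/hbase shell
The output lists the live servers, which should now include the new host.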

Data Tasks
Import and Export Tools
HBase ships with a handful of useful tools, two of which are the Import and Export
MapReduce jobs. They can be used to write subsets, or an entire table, to files in HDFS,
and subsequently load them again. They are contained in the HBase JAR file and you
need the hadoop jar command to get a list of the tools:
$ hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar
Adding the export program name then displays the options for its usage:
$ hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar export
You do need to specify the parameters from left to right, and you cannot omit any in between.
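For example, to restrict the export to one version per cell and a time range, the optional versions parameter must be given before the start and end times (the timestamps here are hypothetical):
$ hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar export \
testtable /user/larsgeorge/backup-testtable 1 1265875194289 1265878794289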

Running the command will start the MapReduce job and print out the progress:
$ hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar export \
testtable /user/larsgeorge/backup-testtable

Once the job is complete, you can check the filesystem for the exported data. Use the
hadoop dfs command:
$ hadoop dfs -lsr /user/larsgeorge/backup-testtable

Importing the data is the reverse operation. First we can get the usage details by invoking
the command without any parameters, and then we can start the job with the table name and inputdir (the directory containing the exported files):
$ hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar import
ERROR: Wrong number of arguments: 0
Usage: Import <tablename> <inputdir>

$ hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar import \
testtable /user/larsgeorge/backup-testtable

CopyTable Tool
Another supplied tool is CopyTable, which is primarily designed to bootstrap cluster replication. You can use it to make a copy of an existing table from the master cluster to the slave cluster. Here are its command-line options:
$ hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar copytable
Examples:
To copy 'TestTable' to a cluster that uses replication for a 1 hour window:
$ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable \
--rs.class=org.apache.hadoop.hbase.ipc.ReplicationRegionInterface \
--rs.impl=org.apache.hadoop.hbase.regionserver.replication.ReplicationRegionServer \
--starttime=1265875194289 --endtime=1265878794289 \
--peer.adr=server1,server2,server3:2181:/hbase TestTable
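The tool also accepts a --new.name option, so a simpler use is copying a table under a new name on the same cluster; a sketch (the table names are hypothetical):
$ hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar copytable \
--new.name=testtable-copy testtable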

Bulk Import
Bulk load procedure
The HBase bulk load process consists of two main steps:
1. Preparation of data: a MapReduce job writes the data in HBase's internal storage format (HFiles), for example by running importtsv with its bulk output option.
2. Load data: the completebulkload tool moves the prepared files into the regions of the running table.

Using the importtsv tool
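The importtsv tool parses tab-separated input and loads it into a table. With the importtsv.bulk.output option it writes HFiles for a later bulk load instead of inserting through the normal client API. A sketch of an invocation (the column mapping and paths are hypothetical):
$ hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar importtsv \
-Dimporttsv.columns=HBASE_ROW_KEY,colfam1:col1 \
-Dimporttsv.bulk.output=/user/larsgeorge/import-staging \
testtable /user/larsgeorge/input-data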

Using the completebulkload tool
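The completebulkload tool takes the files produced in the preparation step and moves them into the regions of the target table; a sketch continuing the hypothetical paths from above:
$ hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar completebulkload \
/user/larsgeorge/import-staging testtable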

