Hadoop2.2.0+Hbase0.96.2分布式集群搭建

qindongliang1922

浏览: 2204534 次
性别:
来自: 北京

最近访客更多访客>>

北风norther

godandghost

youhere

tanss

博主相关

博客

微博

相册

留言

关于我

博客专栏

: 证道Lucene4
浏览量：118169

: 证道Hadoop
浏览量：126557

: 证道shell编程
浏览量：60537

: ELK修真
浏览量：71790

文章分类

社区版块

存档分类

博客分类：

Hbase

hbase hadoop java linux

最近项目有用到Hbase存储数据，由于现在的hadoop
的集群是基于hadoop2.2.0的，所以不可避免的就需要使用新版的Hbase，以前和hadoop1.x的集群使用的hbase是0.94版本的，现在最新的版本是0.98的，鉴于不稳定，所以散仙就选择了0.96版的Hbase，本次搭建Hbase集群，是基于底层依赖Hadoop2.2.0的，具体的情况描述如下：

序号机器IP 角色
1 192.168.46.32 Master
2 192.168.46.11 Slave1
3 192.168.46.10 Slave2

本次的集群，散仙使用的是Hbase内置的zk，建议生产环境使用外置的zk集群，具体的配置步骤如下：
序号描述
1 Ant,Maven,JDK环境
2 配置各个机器之间SSH无密码登陆认证
3 配置底层Hadoop2.2.0的集群，注意需要编译64位的
4 下载Hbase0.96，无须编译，解压
5 进入hbase的conf下，配置hbase-env.sh文件
6 配置conf下的hbase-site.xml文件
7 配置conf下的regionservers文件
8 配置完成后，分发到各个节点上
9 先启动Hadoop集群，确定hadoop集群正常
10 启动Hbase集群
11 访问Hbase的60010的web界面，查看是否正常
12 使用命令bin/hbase shell进入hbase的shell终端，测试
13 配置Windows下的本地hosts映射（如需在win上查看Hbase）
14 屌丝软件工程师一名

hbase-env.sh里面的配置如下，需要配置的地方主要有JDK环境变量的设置，和启动Hbase自带的zk管理：

#
#/**
# * Copyright 2007 The Apache Software Foundation
# *
# * Licensed to the Apache Software Foundation (ASF) under one
# * or more contributor license agreements. See the NOTICE file
# * distributed with this work for additional information
# * regarding copyright ownership. The ASF licenses this file
# * to you under the Apache License, Version 2.0 (the
# * "License"); you may not use this file except in compliance
# * with the License. You may obtain a copy of the License at
# *
# * http://www.apache.org/licenses/LICENSE-2.0
# *
# * Unless required by applicable law or agreed to in writing, software
# * distributed under the License is distributed on an "AS IS" BASIS,
# * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# * See the License for the specific language governing permissions and
# * limitations under the License.
# */

# Set environment variables here.

# This script sets variables multiple times over the course of starting an hbase process,
# so try to keep things idempotent unless you want to take an even deeper look
# into the startup scripts (bin/hbase, etc.)

# The java implementation to use. Java 1.6 required.
export JAVA_HOME=/usr/local/jdk

# Extra Java CLASSPATH elements. Optional.
# export HBASE_CLASSPATH=

# The maximum amount of heap to use, in MB. Default is 1000.
# export HBASE_HEAPSIZE=1000

# Extra Java runtime options.
# Below are what we set by default. May only work with SUN JVM.
# For more on why as well as other possible settings,
# see http://wiki.apache.org/hadoop/PerformanceTuning
export HBASE_OPTS="-XX:+UseConcMarkSweepGC"

# Uncomment one of the below three options to enable java garbage collection logging for the server-side processes.

# This enables basic gc logging to the .out file.
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"

# This enables basic gc logging to its own file.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"

# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"

# Uncomment one of the below three options to enable java garbage collection logging for the client processes.

# This enables basic gc logging to the .out file.
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"

# This enables basic gc logging to its own file.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"

# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"

# Uncomment below if you intend to use the EXPERIMENTAL off heap cache.
# export HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize="
# Set hbase.offheapcache.percentage in hbase-site.xml to a nonzero value.

# Uncomment and adjust to enable JMX exporting
# See jmxremote.password and jmxremote.access in $JRE_HOME/lib/management to configure remote password access.
# More details at: http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html
#
# export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"
# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10101"
# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10102"
# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10103"
# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10104"
# export HBASE_REST_OPTS="$HBASE_REST_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10105"

# File naming hosts on which HRegionServers will run. $HBASE_HOME/conf/regionservers by default.
# export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers

# Uncomment and adjust to keep all the Region Server pages mapped to be memory resident
#HBASE_REGIONSERVER_MLOCK=true
#HBASE_REGIONSERVER_UID="hbase"

# File naming hosts on which backup HMaster will run. $HBASE_HOME/conf/backup-masters by default.
# export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters

# Extra ssh options. Empty by default.
# export HBASE_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HBASE_CONF_DIR"

# Where log files are stored. $HBASE_HOME/logs by default.
# export HBASE_LOG_DIR=${HBASE_HOME}/logs

# Enable remote JDWP debugging of major HBase processes. Meant for Core Developers
# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070"
# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071"
# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072"
# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073"

# A string representing this instance of hbase. $USER by default.
# export HBASE_IDENT_STRING=$USER

# The scheduling priority for daemon processes. See 'man nice'.
# export HBASE_NICENESS=10

# The directory where pid files are stored. /tmp by default.
# export HBASE_PID_DIR=/var/hadoop/pids

# Seconds to sleep between slave commands. Unset by default. This
# can be useful in large clusters, where, e.g., slave rsyncs can
# otherwise arrive faster than the master can service them.
# export HBASE_SLAVE_SLEEP=0.1

# Tell HBase whether it should manage it's own instance of Zookeeper or not.
export HBASE_MANAGES_ZK=true

# The default log rolling policy is RFA, where the log file is rolled as per the size defined for the
# RFA appender. Please refer to the log4j.properties file to see more details on this appender.
# In case one needs to do log rolling on a date change, one should set the environment property
# HBASE_ROOT_LOGGER to "<DESIRED_LOG LEVEL>,DRFA".
# For example:
# HBASE_ROOT_LOGGER=INFO,DRFA
# The reason for changing default to RFA is to avoid the boundary case of filling out disk space as
# DRFA doesn't put any cap on the log size. Please refer to HBase-5655 for more context.

hbase-site.xml里面的配置如下：

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
/**
 *
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
-->
<configuration>

  <property>
      <name>hbase.rootdir</name>
      <value>hdfs://192.168.46.32:9000/hbase</value><!--这里必须跟core-site.xml中的配置一样-->
  </property>
  <!-- 开启分布式模式 -->
  <property>
  <name>hbase.cluster.distributed</name>
   <value>true</value>
  </property>  
  <!--    这里是对的，只配置端口，为了配置多个HMaster -->
   <property>
   <name>hbase.master</name>
   <value>192.168.46.32:60000</value> 
   </property>

     <property>
     <name>hbase.tmp.dir</name>
     <value>/home/search/hbase/hbasetmp</value>
         </property>
<!-- Hbase的外置zk集群时，使用下面的zk端口 -->
     <property>
     <name>hbase.zookeeper.quorum</name>
     <value>192.168.46.32,192.168.46.11,192.168.46.10</value>
     </property>

</configuration>

regionservers里面的配置如下：

h1
h2
h3

启动后的在Master上进程如下所示：

1580 SecondaryNameNode
1289 NameNode
2662 HMaster
2798 HRegionServer
1850 NodeManager
3414 Jps
2569 HQuorumPeer
1743 ResourceManager
1394 DataNode

关闭防火墙后，在win上访问Hbase的60010端口，如下所示：

在linu的shell客户端里访问hbase的shell如下所示：

至此，我们的Hbase集群就搭建完毕，下一步我们就可以使用Hbase的shell命令，来测试Hbase的增删改查了，当然我们也可以使用Java API来和Hbase交互，下一篇散仙会给出Java API操作Hbase的一些通用代码。

查看图片附件

分享到：

如何使用Java API操作Hbase(基于0.96新的ap ... | hadoop升级之fsck命令迎战miss block警告

2014-07-23 21:39
浏览 1615
评论(0)
分类:数据库
查看更多

发表评论

您还没有登录,请您登录后再发表评论