`
qindongliang1922
  • 浏览: 2188685 次
  • 性别: Icon_minigender_1
  • 来自: 北京
博客专栏
7265517b-f87e-3137-b62c-5c6e30e26109
证道Lucene4
浏览量:117664
097be4a0-491e-39c0-89ff-3456fadf8262
证道Hadoop
浏览量:126072
41c37529-f6d8-32e4-8563-3b42b2712a50
证道shell编程
浏览量:60032
43832365-bc15-3f5d-b3cd-c9161722a70c
ELK修真
浏览量:71400
社区版块
存档分类
最新评论

Hadoop2.2.0+Hbase0.96.2分布式集群搭建

阅读更多
最近项目有用到Hbase存储数据,由于现在的hadoop
的集群是基于hadoop2.2.0的,所以不可避免的就需要使用新版的Hbase,以前和hadoop1.x的集群使用的hbase是0.94版本的,现在最新的版本是0.98的,鉴于不稳定,所以散仙就选择了0.96版的Hbase,本次搭建Hbase集群,是基于底层依赖Hadoop2.2.0的,具体的情况描述如下:


序号机器IP角色
1192.168.46.32Master
2192.168.46.11Slave1
3192.168.46.10Slave2


本次的集群,散仙使用的是Hbase内置的zk,建议生产环境使用外置的zk集群,具体的配置步骤如下:
序号描述
1Ant,Maven,JDK环境
2配置各个机器之间SSH无密码登陆认证
3配置底层Hadoop2.2.0的集群,注意需要编译64位的
4下载Hbase0.96,无须编译,解压
5进入hbase的conf下,配置hbase-env.sh文件
6配置conf下的hbase-site.xml文件
7配置conf下的regionservers文件
8配置完成后,分发到各个节点上
9先启动Hadoop集群,确定hadoop集群正常
10启动Hbase集群
11访问Hbase的60010的web界面,查看是否正常
12使用命令bin/hbase shell进入hbase的shell终端,测试
13配置Windows下的本地hosts映射(如需在win上查看Hbase)
14屌丝软件工程师一名




hbase-env.sh里面的配置如下,需要配置的地方主要有JDK环境变量的设置,和启动Hbase自带的zk管理:
#
#/**
# * Copyright 2007 The Apache Software Foundation
# *
# * Licensed to the Apache Software Foundation (ASF) under one
# * or more contributor license agreements.  See the NOTICE file
# * distributed with this work for additional information
# * regarding copyright ownership.  The ASF licenses this file
# * to you under the Apache License, Version 2.0 (the
# * "License"); you may not use this file except in compliance
# * with the License.  You may obtain a copy of the License at
# *
# *     http://www.apache.org/licenses/LICENSE-2.0
# *
# * Unless required by applicable law or agreed to in writing, software
# * distributed under the License is distributed on an "AS IS" BASIS,
# * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# * See the License for the specific language governing permissions and
# * limitations under the License.
# */

# Set environment variables here.

# This script sets variables multiple times over the course of starting an hbase process,
# so try to keep things idempotent unless you want to take an even deeper look
# into the startup scripts (bin/hbase, etc.)

# The java implementation to use.  Java 1.6 required.
 export JAVA_HOME=/usr/local/jdk

# Extra Java CLASSPATH elements.  Optional.
# export HBASE_CLASSPATH=

# The maximum amount of heap to use, in MB. Default is 1000.
# export HBASE_HEAPSIZE=1000

# Extra Java runtime options.
# Below are what we set by default.  May only work with SUN JVM.
# For more on why as well as other possible settings,
# see http://wiki.apache.org/hadoop/PerformanceTuning
export HBASE_OPTS="-XX:+UseConcMarkSweepGC"

# Uncomment one of the below three options to enable java garbage collection logging for the server-side processes.

# This enables basic gc logging to the .out file.
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"

# This enables basic gc logging to its own file.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"

# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"

# Uncomment one of the below three options to enable java garbage collection logging for the client processes.

# This enables basic gc logging to the .out file.
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"

# This enables basic gc logging to its own file.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"

# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"

# Uncomment below if you intend to use the EXPERIMENTAL off heap cache.
# export HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize="
# Set hbase.offheapcache.percentage in hbase-site.xml to a nonzero value.


# Uncomment and adjust to enable JMX exporting
# See jmxremote.password and jmxremote.access in $JRE_HOME/lib/management to configure remote password access.
# More details at: http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html
#
# export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"
# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10101"
# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10102"
# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10103"
# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10104"
# export HBASE_REST_OPTS="$HBASE_REST_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10105"

# File naming hosts on which HRegionServers will run.  $HBASE_HOME/conf/regionservers by default.
# export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers

# Uncomment and adjust to keep all the Region Server pages mapped to be memory resident
#HBASE_REGIONSERVER_MLOCK=true
#HBASE_REGIONSERVER_UID="hbase"

# File naming hosts on which backup HMaster will run.  $HBASE_HOME/conf/backup-masters by default.
# export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters

# Extra ssh options.  Empty by default.
# export HBASE_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HBASE_CONF_DIR"

# Where log files are stored.  $HBASE_HOME/logs by default.
# export HBASE_LOG_DIR=${HBASE_HOME}/logs

# Enable remote JDWP debugging of major HBase processes. Meant for Core Developers 
# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070"
# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071"
# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072"
# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073"

# A string representing this instance of hbase. $USER by default.
# export HBASE_IDENT_STRING=$USER

# The scheduling priority for daemon processes.  See 'man nice'.
# export HBASE_NICENESS=10

# The directory where pid files are stored. /tmp by default.
# export HBASE_PID_DIR=/var/hadoop/pids

# Seconds to sleep between slave commands.  Unset by default.  This
# can be useful in large clusters, where, e.g., slave rsyncs can
# otherwise arrive faster than the master can service them.
# export HBASE_SLAVE_SLEEP=0.1

# Tell HBase whether it should manage it's own instance of Zookeeper or not.
 export HBASE_MANAGES_ZK=true

# The default log rolling policy is RFA, where the log file is rolled as per the size defined for the 
# RFA appender. Please refer to the log4j.properties file to see more details on this appender.
# In case one needs to do log rolling on a date change, one should set the environment property
# HBASE_ROOT_LOGGER to "<DESIRED_LOG LEVEL>,DRFA".
# For example:
# HBASE_ROOT_LOGGER=INFO,DRFA
# The reason for changing default to RFA is to avoid the boundary case of filling out disk space as 
# DRFA doesn't put any cap on the log size. Please refer to HBase-5655 for more context.

hbase-site.xml里面的配置如下:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
/**
 *
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
-->
<configuration>

  <property>
      <name>hbase.rootdir</name>
      <value>hdfs://192.168.46.32:9000/hbase</value><!--这里必须跟core-site.xml中的配置一样-->
  </property>
  <!-- 开启分布式模式 -->
  <property>
  <name>hbase.cluster.distributed</name>
   <value>true</value>
  </property>  
  <!--    这里是对的,只配置端口,为了配置多个HMaster -->
   <property>
   <name>hbase.master</name>
   <value>192.168.46.32:60000</value> 
   </property>

     <property>
     <name>hbase.tmp.dir</name>
     <value>/home/search/hbase/hbasetmp</value>
         </property>
<!-- Hbase的外置zk集群时,使用下面的zk端口 -->
     <property>
     <name>hbase.zookeeper.quorum</name>
     <value>192.168.46.32,192.168.46.11,192.168.46.10</value>
     </property>

</configuration>



regionservers里面的配置如下:
h1
h2
h3

启动后的在Master上进程如下所示:
1580 SecondaryNameNode
1289 NameNode
2662 HMaster
2798 HRegionServer
1850 NodeManager
3414 Jps
2569 HQuorumPeer
1743 ResourceManager
1394 DataNode

关闭防火墙后,在win上访问Hbase的60010端口,如下所示:



在linu的shell客户端里访问hbase的shell如下所示:





至此,我们的Hbase集群就搭建完毕,下一步我们就可以使用Hbase的shell命令,来测试Hbase的增删改查了,当然我们也可以使用Java API来和Hbase交互,下一篇散仙会给出Java API操作Hbase的一些通用代码。







  • 大小: 379 KB
  • 大小: 502.7 KB
分享到:
评论

相关推荐

    Hadoop2.2+Zookeeper3.4.5+HBase0.96集群环境搭建

    Hadoop2.2+Zookeeper3.4.5+HBase0.96集群环境搭建 Hadoop2.2+Zookeeper3.4.5+HBase0.96集群环境搭建是大数据处理和存储的重要组件,本文档将指导用户从零开始搭建一个完整的Hadoop2.2+Zookeeper3.4.5+HBase0.96集群...

    Hadoop-2.2.0+Hbase-0.96.2+Hive-0.13.1分布式整合,Hadoop-2.X使用HA方式

    Hadoop-2.2.0+Hbase-0.96.2+Hive-0.13.1分布式整合,Hadoop-2.X使用HA方式

    CentOS-6.4 64位系统下hadoop-2.2.0+hbase-0.96+zookeeper-3.4.5 分布式安装配置

    在本文中,我们将深入探讨如何在CentOS-6.4 64位操作系统上配置一个基于Hadoop 2.2.0、HBase 0.96和Zookeeper 3.4.5的分布式环境。这个过程涉及到多个步骤,包括系统设置、软件安装、配置以及服务启动。 首先,为了...

    hadoop2.2.0+Hbase0.96+hive0.12详细配置

    本文将详细介绍如何在Linux环境下搭建Hadoop2.2.0、HBase0.96和Hive0.12的集群环境。 首先,我们从Hadoop的安装开始。Hadoop2.2.0是Apache官方稳定版,可以从官方网站或镜像站点下载。下载完成后,将其上传到Linux...

    CentOS下Hadoop+Hbase+ZooKeeper分布式存储部署详解

    通过以上步骤,我们已经在CentOS 6.5 x86_64环境下成功搭建了Hadoop 2.2.0集群,并且集成了HBase和ZooKeeper,形成了一套完整的分布式存储和处理系统。这样的系统不仅能够处理海量数据,还具备高可用性和扩展性,...

    妳那伊抹微笑_云计算之Hadoop-2.2.0+Hbaase-0.96.2 +Hive-0.13.1完全分布式环境整合安装文档V1.0.0.docx

    文档作者王扬庭分享的这份资料详细介绍了如何在云计算环境中集成和配置Hadoop-2.2.0、HBase-0.96.2以及Hive-0.13.1,形成一个完全分布式的计算环境。这个文档是《云计算之Flume+Kafka+Storm+Redis/Hbase+Hadoop+Hive...

    hadoop2.2 hbase0.96.2 hive 0.13.1整合部署

    【Hadoop2.2.0】 Hadoop2.2.0是Apache Hadoop项目的一个稳定版本,提供了改进的性能和稳定性。它引入了YARN(Yet Another Resource Negotiator),这是一个资源管理和调度器,用于更好地管理和优化分布式计算任务。...

    Hadoop2.2.0Hbase0.98.1Hive0.13完全安装手册

    ### Hadoop2.2.0 + HBase0.98.1 + Sqoop1.4.4 + Hive0.13 完全安装手册 #### 前言 随着大数据技术的发展,Hadoop已经成为处理海量数据的核心框架之一。本文旨在为读者提供一套最新的Hadoop2.2.0、HBase0.98.1、...

    hadoop2.2.0

    9. 数据处理生态:在Hadoop 2.2.0中,除了核心的HDFS和MapReduce,还有许多配套项目,如HBase(分布式数据库)、Hive(数据仓库工具)、Pig(高级数据分析语言)、Oozie(工作流管理系统)等,构建了一个完整的大...

    hadoop集群搭建详解

    一、Hadoop2.2.0、ZooKeeper3.4.5、HBase0.96.2、Hive0.13.1是什么? Hadoop2.2.0是一个大数据处理框架,具有许多新特性,如支持Windows平台、改进了安全性、提高了性能等。 ZooKeeper3.4.5是一个分布式应用程序...

    hbase客户端连接工具winutils-2.2.0.zip

    HBase是一款分布式的、面向列的开源数据库,它是Apache Hadoop生态系统的一部分,专门设计用于处理大规模数据。在Java客户端上连接HBase集群时,需要配置一系列的环境和依赖,其中包括了`winutils`工具。`winutils-...

    apache-atlas-2.2.0-hbase-hook.tar.gz

    6. **安装与配置**:要使用HBase Hook,用户需要在HBase集群中正确配置和部署这个组件,确保它能与Atlas服务器通信,将HBase操作的事件转化为Atlas的元数据更新。 7. **性能优化**:尽管增加了额外的治理层,但...

    hadoop2.2.0API

    Hadoop 2.2.0 不只是MapReduce和HDFS,还包括一系列生态系统项目,如HBase(分布式NoSQL数据库)、Hive(数据仓库工具)、Pig(数据流处理语言)、Oozie(工作流调度系统)和Zookeeper(分布式协调服务)。...

    hadoop-2.2.0.tar.gz + zookeeper3.4.5

    用户在下载hadoop-2.2.0.tar.gz后,可以通过`tar -zxfv hadoop-2.2.0.tar.gz`命令进行解压缩,然后配置环境变量,启动Hadoop集群,进行数据的存储和计算。 接下来是Zookeeper 3.4.5。这个版本是Zookeeper的一个经典...

    Hadoop安装

    【Hadoop安装】和【Hadoop集群安装】的知识点主要涉及分布式存储系统Hadoop的部署,特别是针对Hadoop 2.2.0版本的集群配置。Hadoop是Apache基金会的一个开源项目,它提供了分布式文件系统(HDFS)和MapReduce计算...

    hadoop-2.2.0-x64.tar.gz

    Hadoop是Apache软件基金会开发的一个开源分布式...总之,Hadoop 2.2.0是一个强大的分布式计算框架,适用于处理大规模数据。通过在64位Linux环境下正确安装和配置,可以充分利用硬件资源,实现高效的数据存储和处理。

    hbase-2.2.0-bin.tar.gz

    HBase,全称为Apache HBase,是一款开源的分布式NoSQL数据库,主要运行在Hadoop之上。这个名为“hbase-2.2.0-bin.tar.gz”的压缩包包含了HBase的二进制发行版,适用于2.2.0版本。这个版本的发布提供了最新的功能和...

    solr-8.6.3.tgz+hbase-2.3.3-bin.tar.gz

    而HBase则是一个分布式的、面向列的开源数据库,它是构建在Hadoop文件系统(HDFS)之上,提供实时读写服务,尤其适用于大数据存储和处理。 描述中提到的"atlas2.2.0内嵌式编译会用到"可能是指Apache Atlas 2.2.0,...

    hadoop-common-2.2.0-bin-master

    Hadoop Common 2.2.0是Apache Hadoop项目的核心组件之一,它提供了Hadoop生态系统中的通用工具和服务,支持分布式存储和计算。这个版本尤其适用于在Windows环境中进行Hadoop Java API的开发工作。下面将对Hadoop ...

Global site tag (gtag.js) - Google Analytics