
hadoop 2.x-HDFS HA --Part II: installation

 

 this article is the second part, following "hadoop 2.x-HDFS HA --Part I: abstraction", and covers these topics:

 

2. installation of HA
 2.1 manual failover
 2.2 auto failover
3. conclusion
 

 2. installation of HA

  2.1 manual failover

  in this mode, the cluster can be failed over manually with a few commands, but it is hard to know when and why a failure occurs. of course, this is better than nothing :)

here are the role assignments in my cluster:

host  NN  DN  JN
hd1   y   y   y
hd2   y   y   y

 

 yes, in general the number of journalnodes is recommended to be odd to get the most fault tolerance per node, but this cluster is only for validating the HA function, so two will do.
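that recommendation is just quorum arithmetic: QJM needs acknowledgements from a majority of journalnodes for every edit, so an even count wastes a node -- four JNs tolerate no more failures than three. a quick sketch in plain python (not hadoop code):

```python
def majority(n):
    """Smallest quorum for n journalnodes: more than half of them."""
    return n // 2 + 1

def tolerated_failures(n):
    """How many JNs can die while a majority is still reachable."""
    return n - majority(n)

for n in (2, 3, 4, 5):
    print(n, majority(n), tolerated_failures(n))
# n=2 needs 2/2 and tolerates 0 failures; n=3 needs 2 and tolerates 1;
# n=4 still tolerates only 1; n=5 tolerates 2.
```

note that with the two-JN setup used here the required quorum is 2/2, which matters for the journalnode-kill test later in this page.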

 

  building on the configs in "install hadoop-2.5 without HDFS HA/Federation", the following properties need to be added or changed:

mode: HA manual failover (all properties go in hdfs-site.xml unless noted)

dfs.nameservices = mycluster
    (logical name of this name service)
dfs.ha.namenodes.mycluster = nn1,nn2
    (the property name is formed as dfs.ha.namenodes.<serviceName>; this name service contains two namenodes)
dfs.namenode.rpc-address.mycluster.nn1 = hd1:8020
    (the internal RPC communication address)
dfs.namenode.rpc-address.mycluster.nn2 = hd2:8020
dfs.namenode.http-address.mycluster.nn1 = hd1:50070
    (the web ui address)
dfs.namenode.http-address.mycluster.nn2 = hd2:50070
dfs.namenode.shared.edits.dir = qjournal://hd1:8485;hd2:8485/mycluster
dfs.client.failover.proxy.provider.mycluster = org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
dfs.ha.fencing.methods = sshfence
dfs.ha.fencing.ssh.private-key-files = /home/hadoop/.ssh/id_rsa
dfs.ha.fencing.ssh.connect-timeout = 10000
fs.defaultFS = hdfs://mycluster
    (in core-site.xml; the suffix of this value must match the property 'dfs.nameservices' set in hdfs-site.xml)
dfs.journalnode.edits.dir = /usr/local/hadoop/data-2.5.1/journal

 
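for convenience, here are the same settings as xml fragments, split by file (the hostnames, paths and the 'mycluster' service name are the values used in this cluster; adapt them to yours):

```xml
<!-- hdfs-site.xml -->
<property><name>dfs.nameservices</name><value>mycluster</value></property>
<property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
<property><name>dfs.namenode.rpc-address.mycluster.nn1</name><value>hd1:8020</value></property>
<property><name>dfs.namenode.rpc-address.mycluster.nn2</name><value>hd2:8020</value></property>
<property><name>dfs.namenode.http-address.mycluster.nn1</name><value>hd1:50070</value></property>
<property><name>dfs.namenode.http-address.mycluster.nn2</name><value>hd2:50070</value></property>
<property><name>dfs.namenode.shared.edits.dir</name><value>qjournal://hd1:8485;hd2:8485/mycluster</value></property>
<property><name>dfs.client.failover.proxy.provider.mycluster</name><value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value></property>
<property><name>dfs.ha.fencing.methods</name><value>sshfence</value></property>
<property><name>dfs.ha.fencing.ssh.private-key-files</name><value>/home/hadoop/.ssh/id_rsa</value></property>
<property><name>dfs.ha.fencing.ssh.connect-timeout</name><value>10000</value></property>
<property><name>dfs.journalnode.edits.dir</name><value>/usr/local/hadoop/data-2.5.1/journal</value></property>

<!-- core-site.xml -->
<property><name>fs.defaultFS</name><value>hdfs://mycluster</value></property>
```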

2.1.2 steps to startup

 these steps are order-sensitive.

 2.1.2.1 startup journalnode

   go to all journalnode hosts, and run

sbin/hadoop-daemon.sh start journalnode
 2.1.2.2 go to the first NN, format it, then start it
hdfs namenode -format
hadoop-daemon.sh start namenode
   then go to the other (standby) namenode to pull over the fs image and start it, by running
bin/hdfs namenode -bootstrapStandby
sbin/hadoop-daemon.sh start namenode
  2.1.2.3 start the datanodes
sbin/hadoop-daemons.sh start datanode

  

  now both namenodes are in the 'standby' state (yes, this is the default behaviour in manual mode; if you want a default active namenode, use the auto-failover mode described later in this page)

  here you can use some commands to transition a namenode from standby to active and vice versa:

 

hadoop@ubuntu:/usr/local/hadoop/hadoop-2.5.1/etc/hadoop-ha-manual$ hdfs haadmin
Usage: DFSHAAdmin [-ns <nameserviceId>]
    [-transitionToActive <serviceId> [--forceactive]]
    [-transitionToStandby <serviceId>]
    [-failover [--forcefence] [--forceactive] <serviceId> <serviceId>]
    [-getServiceState <serviceId>]
    [-checkHealth <serviceId>]
    [-help <command>]
  a. transition the standby namenode nn1 to active; it returns nothing if nn1 is already active

 

 

hdfs haadmin -transitionToActive nn1
  b. check whether nn1 is in the active state

 

 

hdfs haadmin -getServiceState nn1
    it will return 'active'

 

  c. then check its health

 

hdfs haadmin -checkHealth nn1
   returns nothing if it is healthy, otherwise some connection exception will show here

 

  d. yes, you can also fail over from a dead namenode to another one to make the latter active

 

hdfs haadmin -failover nn1 nn2
   here the active state switches from nn1 (active) to nn2 (standby). if you specify the option '--forcefence', the namenode nn1 will also be killed for fencing, so use it prudently.
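conceptually a fenced failover does two things in strict order: first fence the old active (make sure it can no longer write to the shared edits), then promote the standby. a toy sketch in plain python (not hadoop code; the class and function names are made up for illustration):

```python
class NameNode:
    def __init__(self, name, state="standby"):
        self.name, self.state, self.fenced = name, state, False

def failover(old_active, new_active, force_fence=False):
    """Fence/demote first, promote second: never two writers at once."""
    if force_fence:
        old_active.fenced = True   # e.g. sshfence kills the old NN process
    old_active.state = "standby"
    new_active.state = "active"    # promote only after the old one is demoted
    return new_active

nn1 = NameNode("nn1", state="active")
nn2 = NameNode("nn2")
failover(nn1, nn2, force_fence=True)
print(nn1.state, nn2.state, nn1.fenced)  # standby active True
```

the ordering is the whole point: promoting before fencing could leave two active namenodes writing at once (split brain).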

 

  e. stop-dfs.sh will shut down all HDFS processes in the cluster, and start-dfs.sh will spawn them all again

 

stop-dfs.sh
start-dfs.sh
hdfs haadmin -transitionToActive nn1
 

 

by now we can see that nn1 is 'active' and nn2 is 'standby'

 

 



 

   below is what the start and stop runs look like:

 starting:

hadoop@ubuntu:/usr/local/hadoop/hadoop-2.5.1$ start-all.sh 
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [hd1 hd2]
hd1: starting namenode, logging to /usr/local/hadoop/hadoop-2.5.1/logs/hadoop-hadoop-namenode-ubuntu.out
hd2: starting namenode, logging to /usr/local/hadoop/hadoop-2.5.1/logs/hadoop-hadoop-namenode-bfadmin.out
hd1: datanode running as process 1539. Stop it first.
hd2: datanode running as process 3081. Stop it first.
Starting journal nodes [hd1 hd2]
hd1: journalnode running as process 1862. Stop it first.
hd2: starting journalnode, logging to /usr/local/hadoop/hadoop-2.5.1/logs/hadoop-hadoop-journalnode-bfadmin.out
Starting ZK Failover Controllers on NN hosts [hd1 hd2]
hd1: zkfc running as process 2090. Stop it first.
hd2: zkfc running as process 3388. Stop it first.
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/hadoop-2.5.1/logs/yarn-hadoop-resourcemanager-ubuntu.out
hd2: starting nodemanager, logging to /usr/local/hadoop/hadoop-2.5.1/logs/yarn-hadoop-nodemanager-bfadmin.out
hd1: starting nodemanager, logging to /usr/local/hadoop/hadoop-2.5.1/logs/yarn-hadoop-nodemanager-ubuntu.out

 

 shutting down (the sequence mirrors starting):

hadoop@ubuntu:/usr/local/hadoop/hadoop-2.5.1$ stop-all.sh 
This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
Stopping namenodes on [hd1 hd2]
hd1: stopping namenode
hd2: stopping namenode
hd1: stopping datanode
hd2: stopping datanode
Stopping journal nodes [hd1 hd2]
hd1: stopping journalnode
hd2: stopping journalnode
Stopping ZK Failover Controllers on NN hosts [hd1 hd2]
hd1: stopping zkfc
hd2: stopping zkfc
stopping yarn daemons
no resourcemanager to stop
hd1: no nodemanager to stop
hd2: no nodemanager to stop
no proxyserver to stop

 

 --------------

 now you can test some failover cases:

 

hdfs dfs -put test.txt /
kill #process-of-nn1#
hdfs haadmin -transitionToActive nn2
# test whether the edits written via nn1 were synchronised to nn2 -- yes, of course: you will see the file listed there correctly
hdfs dfs -ls /

 ------------------

  below is a test of killing a journalnode to check the cluster's robustness

after stopping hd1's journalnode, the namenode on the same host (hd1) gets killed as well:

2014-11-12 16:45:15,102 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Took 10008ms to send a batch of 1 edits (17 bytes) to remote journal 192.168.1.25:8485
2014-11-12 16:45:15,102 FATAL org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: flush failed for required journal (JournalAndStream(mgr=QJM to [192.168.1.25:8485, 192.168.1.30:8485], stream=QuorumOutputStream starting at txid 261))
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 2/2. 1 successful responses:
192.168.1.30:8485: null [success]
1 exceptions thrown:
192.168.1.25:8485: Call From ubuntu/192.168.1.25 to hd1:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

   note the message: "Got too many exceptions to achieve quorum size 2/2. 1 successful responses ... 1 exceptions thrown" -- with only two journalnodes the required quorum is 2, so losing a single one is fatal.

 

=============

  2.2 auto failover

  the so-called 'auto failover' is the opposite of 'manual failover': it uses a coordination system (i.e. zookeeper) to recover the namenode roles automatically when failures occur, e.g. hardware faults, software bugs, etc. when such a problem happens, HA detects which namenode has failed out of the active state (not standby), and the standby NN then takes over the role that was previously 'active'.
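the mechanism behind this is a zookeeper-style ephemeral lock: each ZKFC races to create the same lock node, the winner's namenode becomes active, and when the holder's session dies the lock vanishes and the surviving ZKFC wins the re-election. a toy simulation in plain python (not the real ZKFC/zookeeper API):

```python
class LockService:
    """Stands in for ZooKeeper's ephemeral znode: one holder at a time."""
    def __init__(self):
        self.holder = None
    def try_acquire(self, who):
        if self.holder is None:
            self.holder = who
            return True
        return False
    def session_died(self, who):
        if self.holder == who:
            self.holder = None   # ephemeral node disappears with the session

def elect(lock, zkfcs):
    """Every ZKFC races for the lock; the winner's NN becomes active."""
    return {z: ("active" if lock.try_acquire(z) else "standby") for z in zkfcs}

lock = LockService()
states = elect(lock, ["zkfc-nn1", "zkfc-nn2"])   # nn1 wins the first race
lock.session_died("zkfc-nn1")                     # nn1's host dies
states = elect(lock, ["zkfc-nn2"])                # automatic failover: nn2 takes over
```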

  here are some configs in addition to the manual failover ones:

dfs.ha.automatic-failover.enabled = true
    (fail over automatically when possible; in hdfs-site.xml)
ha.zookeeper.quorum = hd1:2181,hd2:2181
    (in core-site.xml; yes, as you can see, both the journalnode and zookeeper ensembles here have an even number of members, but that is ok for this test)

 and the zkfc (the zookeeper client running beside each namenode to detect failures) roles look like this:

 

host nn jn dn zkfc(new)
hd1 y y y y
hd2 y y y y

 

  2.2.1 steps to startup 

   a.format zk

hdfs zkfc -formatZK

  b.start all

start-dfs.sh

   this starts the nn, jn, dn and zkfc processes.

  now just do whatever you want to do; auto failover will kick in when needed. have a nice experience with it!

 

 

ref:

HDFS High Availability Using the Quorum Journal Manager

Hadoop 2.0 NameNode HA和Federation实践 (Hadoop 2.0 NameNode HA and Federation in practice)

