Hadoop NameNode failover: handling the NameNode single point of failure

josephgao

Category: hadoop

Reposted from http://wiki.apache.org/hadoop/NameNodeFailover
I. Add an NFS directory to dfs.name.dir:
<property>
<name>dfs.name.dir</name> <value>/export/hadoop/namedir,/remote/export/hadoop/namedir</value>
</property>
For how to mount NFS, see http://server.zdnet.com.cn/server/2007/0831/482007.shtml
and http://www.cnblogs.com/mchina/archive/2013/01/03/2840040.html

** On the NFS server: chmod 777 -R <exported directory> **

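II. When the NameNode fails
1. Back up the data on the NFS mount.
2. Pick a machine on the same network and change its IP address to the NameNode's IP.
3. Install Hadoop on this machine and copy over the previous configuration.
4. Do not format this node. Mount the NFS directory at the same location on this machine.
5. Start the NameNode.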

For recovery using the SecondaryNameNode, see http://blog.csdn.net/jokes000/article/details/7703512

Original text:

Introduction

As of 0.20, Hadoop does not support automatic recovery from a NameNode failure. This is a well-known and recognized single point of failure in Hadoop.

Experience at Yahoo! shows that NameNodes are more likely to fail due to misconfiguration, network issues, and bad behavior amongst clients than due to actual hardware problems. Across fifteen grids over a three-year period, only three NameNode failures were related to hardware problems.

Configuring Hadoop for Failover

Some preliminary steps must be in place before performing a NameNode recovery. The most important is the dfs.name.dir property. This setting lets the NameNode write to more than one directory. A typical configuration might look something like this:

<property>
<name>dfs.name.dir</name> <value>/export/hadoop/namedir,/remote/export/hadoop/namedir</value>
</property>
The first directory is a local directory and the second is an NFS-mounted directory. The NameNode writes to both locations, keeping the HDFS metadata in sync. This stores the metadata off-machine so that there is something to recover from. During startup, the NameNode picks the most recent version from these two directories and then syncs both of them to the same data.
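
Before relying on this setup, it can help to confirm that every directory listed in dfs.name.dir actually exists and is writable by the user that runs the NameNode, since an unmounted NFS path defeats the purpose of the second copy. The following is a minimal sketch for a POSIX shell; the function name check_name_dirs is hypothetical and not part of Hadoop.

```shell
#!/bin/sh
# check_name_dirs: hypothetical helper, not part of Hadoop.
# Takes the comma-separated value of dfs.name.dir and reports, for each
# entry, whether it is an existing directory writable by the current user.
# Returns 0 only if every entry passes.
check_name_dirs() {
    status=0
    old_ifs=$IFS
    IFS=','                      # split the argument on commas
    for dir in $1; do
        if [ -d "$dir" ] && [ -w "$dir" ]; then
            echo "OK      $dir"
        else
            echo "MISSING $dir"
            status=1
        fi
    done
    IFS=$old_ifs                 # restore normal word splitting
    return $status
}
```

For the configuration above, it would be called as check_name_dirs "/export/hadoop/namedir,/remote/export/hadoop/namedir", run as the same user that starts the NameNode. Note that the check only looks at the directory itself; it does not verify that the second path is really an NFS mount.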

After we have configured the NameNode to write to two or more directories, we have a working backup of the metadata. In the more common failure scenarios, we can use this data to bring the dead NameNode back from the grave.

When a Failure Occurs

Now the recovery steps:

1. Just to be safe, make a copy of the data on the remote NFS mount for safekeeping.
2. Pick a target machine on the same network.
3. Change the IP address of that machine to match the NameNode's IP address. Using an interface alias to move the address works as well. If this is not an option, be prepared to restart the entire grid to avoid hitting https://issues.apache.org/jira/browse/HADOOP-3988 .
4. Install Hadoop the same way you did on the NameNode.
5. Do not format this node!
6. Mount the remote NFS directory in the same location.
7. Start up the NameNode.
The NameNode should start replaying the edits file, updating the image, block reports should come in, etc.
At this point, your NameNode should be up.
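
The recovery steps above can be sketched as a small shell script to be run on the replacement machine. This is only an illustration under stated assumptions: the IP address, interface name, NFS server name, backup path, and Hadoop install path are all placeholders, and the script assumes a Linux host with the iproute2 `ip` command and a Hadoop 0.20-style layout with bin/hadoop-daemon.sh. It prints the commands by default and executes them only when DRY_RUN=0. Because it runs on the replacement machine, the safety copy is taken right after the NFS directory is mounted and before the NameNode is started.

```shell
#!/bin/sh
# Dry-run sketch of the recovery steps. Every value below is a placeholder
# (an assumption), not something taken from the original article.
NAMENODE_IP="192.0.2.10/24"                     # hypothetical NameNode address
IFACE="eth0"                                    # hypothetical network interface
NFS_EXPORT="nfs-server:/export/hadoop/namedir"  # hypothetical NFS export
MOUNT_POINT="/remote/export/hadoop/namedir"     # must match dfs.name.dir
BACKUP_DIR="/var/backups/namedir-copy"          # hypothetical backup location
HADOOP_HOME="${HADOOP_HOME:-/opt/hadoop}"       # hypothetical install path
DRY_RUN="${DRY_RUN:-1}"                         # 1 = only print the commands

# run: print the command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

recover_namenode() {
    # Mount the NFS directory at the same location the old NameNode used.
    run mkdir -p "$MOUNT_POINT" || return 1
    run mount -t nfs "$NFS_EXPORT" "$MOUNT_POINT" || return 1
    # Keep a safety copy of the metadata before starting anything.
    run cp -a "$MOUNT_POINT" "$BACKUP_DIR" || return 1
    # Take over the NameNode's address with an interface alias.
    run ip addr add "$NAMENODE_IP" dev "$IFACE" || return 1
    # Start the NameNode. Never run "hadoop namenode -format" on this node.
    run "$HADOOP_HOME/bin/hadoop-daemon.sh" start namenode || return 1
}
```

Calling recover_namenode with the default DRY_RUN=1 only lists the five commands, which makes it easy to review them against your own environment before running anything as root.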
