wbj0110
Deploying HDFS High Availability


After you have set all of the necessary configuration options, you are ready to start the JournalNodes and the two HA NameNodes.

  Important: Before you start, make sure you have completed all of the HA configuration described in the preceding sections.

Install and Start the JournalNodes

  1. Install the JournalNode daemons on each of the machines where they will run.

    To install JournalNode on Red Hat-compatible systems:

    $ sudo yum install hadoop-hdfs-journalnode

    To install JournalNode on Ubuntu and Debian systems:

    $ sudo apt-get install hadoop-hdfs-journalnode 

    To install JournalNode on SLES systems:

    $ sudo zypper install hadoop-hdfs-journalnode
  2. Start the JournalNode daemons on each of the machines where they will run:
    $ sudo service hadoop-hdfs-journalnode start

Wait for the daemons to start before starting the NameNodes.
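One way to wait is sketched below with a generic retry helper; the JournalNode hostnames and the default RPC port 8485 are assumptions you should adapt to your cluster.

```shell
# Sketch: wait until a probe command succeeds before starting the
# NameNodes (e.g. a TCP probe of each JournalNode's RPC port).

retry() {
  # retry <timeout-seconds> <command...>
  local timeout=$1 elapsed=0
  shift
  until "$@"; do
    sleep 1
    elapsed=$((elapsed + 1))
    [ "$elapsed" -ge "$timeout" ] && return 1
  done
}

# Example: probe each JournalNode's RPC port (8485 by default) using
# bash's /dev/tcp special file. Hostnames are placeholders.
# for jn in jn1.example.com jn2.example.com jn3.example.com; do
#   retry 60 bash -c "exec 3<>/dev/tcp/$jn/8485" || exit 1
# done
```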

Initialize the Shared Edits directory

If you are converting a non-HA NameNode to HA, initialize the shared edits directory with the edits data from the local NameNode edits directories:
$ hdfs namenode -initializeSharedEdits
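The shared edits directory is the one configured as dfs.namenode.shared.edits.dir in hdfs-site.xml; with quorum-based journaling it typically looks like the following (the hostnames and the nameservice ID mycluster are placeholders):

```xml
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://jn1.example.com:8485;jn2.example.com:8485;jn3.example.com:8485/mycluster</value>
</property>
```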

Start the NameNodes

  1. Start the primary (formatted) NameNode:
    $ sudo service hadoop-hdfs-namenode start
  2. Start the standby NameNode:
    $ sudo -u hdfs hdfs namenode -bootstrapStandby
    $ sudo service hadoop-hdfs-namenode start 
      Note:

    If Kerberos is enabled, do not use commands of the form sudo -u <user> <command>; they will fail with a security error. Instead, authenticate first:

    $ kinit <user> (if you are using a password)
    $ kinit -kt <keytab> <principal> (if you are using a keytab)

    and then run each command for that user directly:

    $ <command>

Starting the standby NameNode with the -bootstrapStandby option copies over the contents of the primary NameNode's metadata directories (including the namespace information and most recent checkpoint) to the standby NameNode. (The location of the directories containing the NameNode metadata is configured via the configuration options dfs.namenode.name.dir and/or dfs.namenode.edits.dir.)

You can visit each NameNode's web page by browsing to its configured HTTP address. Next to the configured address, the page shows the HA state of the NameNode (either "Standby" or "Active"). Whenever an HA NameNode starts and automatic failover is not enabled, it is initially in the Standby state. If automatic failover is enabled, the first NameNode that is started becomes active.
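Besides the web UI, you can query the HA state from the command line with hdfs haadmin. A minimal sketch, assuming the NameNode IDs nn1 and nn2 (placeholders for the values in dfs.ha.namenodes.<nameservice>):

```shell
# Sketch: print the HA state ("active" or "standby") for a NameNode ID.
# The ID must match one configured in dfs.ha.namenodes.<nameservice>.

ha_state() {
  hdfs haadmin -getServiceState "$1"
}

# Example:
# ha_state nn1
# ha_state nn2
```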

Restart Services

If you are converting from a non-HA to an HA configuration, you need to restart the JobTracker and TaskTracker (for MRv1, if used), or ResourceManager, NodeManager, and JobHistory Server (for YARN), and the DataNodes:

On each DataNode:

$ sudo service hadoop-hdfs-datanode start

On each TaskTracker system (MRv1):

$ sudo service hadoop-0.20-mapreduce-tasktracker start

On the JobTracker system (MRv1):

$ sudo service hadoop-0.20-mapreduce-jobtracker start

Verify that the JobTracker and TaskTracker started properly:

$ sudo jps | grep Tracker

On the ResourceManager system (YARN):

$ sudo service hadoop-yarn-resourcemanager start

On each NodeManager system (YARN; typically the same ones where DataNode service runs):

$ sudo service hadoop-yarn-nodemanager start

On the MapReduce JobHistory Server system (YARN):

$ sudo service hadoop-mapreduce-historyserver start
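After restarting, you can check each host's daemons in one pass. The helper below is a sketch (not part of the original guide): it reads jps output on stdin, takes the expected daemon names as arguments, and reports any that are missing.

```shell
# Sketch: verify that the expected daemons appear in `jps` output.
# Exits 1 and prints "missing: <name>" for each daemon not found.

daemons_missing() {
  local out rc=0 d
  out=$(cat)
  for d in "$@"; do
    if ! printf '%s\n' "$out" | grep -qw "$d"; then
      echo "missing: $d"
      rc=1
    fi
  done
  return $rc
}

# Example (a YARN worker host, hypothetical):
# sudo jps | daemons_missing DataNode NodeManager
```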

Deploy Automatic Failover

If you have configured automatic failover using the ZooKeeper FailoverController (ZKFC), you must install and start the zkfc daemon on each of the machines that runs a NameNode. Proceed as follows.

To install ZKFC on Red Hat-compatible systems:

$ sudo yum install hadoop-hdfs-zkfc

To install ZKFC on Ubuntu and Debian systems:

$ sudo apt-get install hadoop-hdfs-zkfc

To install ZKFC on SLES systems:

$ sudo zypper install hadoop-hdfs-zkfc

To start the zkfc daemon:

$ sudo service hadoop-hdfs-zkfc start

You do not need to start the ZKFC and NameNode daemons in any particular order; on any given node, you can start the ZKFC before or after its corresponding NameNode.
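For reference, automatic failover is controlled by two configuration properties; a minimal fragment might look like the following (the ZooKeeper hostnames are placeholders; dfs.ha.automatic-failover.enabled belongs in hdfs-site.xml and ha.zookeeper.quorum in core-site.xml):

```xml
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
<property>
  <name>ha.zookeeper.quorum</name>
  <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
</property>
```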

You should add monitoring on each host that runs a NameNode to ensure that the ZKFC remains running. During some types of ZooKeeper failures, for example, the ZKFC may exit unexpectedly and should be restarted so that the system remains ready for automatic failover.
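A minimal watchdog sketch for that monitoring (how you schedule it, e.g. via cron, is left as an assumption; the process pattern matches the ZKFC main class, DFSZKFailoverController):

```shell
# Sketch: detect whether the zkfc daemon is running on this host by
# looking for its JVM main class in the process table.

zkfc_running() {
  pgrep -f 'DFSZKFailoverController' >/dev/null
}

# Run periodically (e.g. from cron) on each NameNode host:
# if ! zkfc_running; then
#   sudo service hadoop-hdfs-zkfc start
# fi
```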

Additionally, you should monitor each of the servers in the ZooKeeper quorum. If the ZooKeeper cluster crashes, no automatic failovers will be triggered; however, HDFS will continue to run without any impact, and when ZooKeeper is restarted, HDFS will reconnect with no issues.

Verifying Automatic Failover

After the initial deployment of a cluster with automatic failover enabled, you should test its operation. To do so, first locate the active NameNode. As mentioned above, you can tell which node is active by visiting the NameNode web interfaces.

Once you have located your active NameNode, you can cause a failure on that node. For example, you can use kill -9 <pid of NN> to simulate a JVM crash, or power-cycle the machine or its network interface to simulate different kinds of outages. After you trigger the outage you want to test, the other NameNode should automatically become active within several seconds. The amount of time required to detect a failure and trigger a failover depends on ha.zookeeper.session-timeout.ms, which defaults to 5 seconds (5000 ms).
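The test above can be sketched as a small polling helper; the NameNode ID nn2 is an assumed value from your dfs.ha.namenodes.<nameservice> setting:

```shell
# Sketch: after crashing the active NameNode, poll the other NameNode
# until it reports "active" or the deadline passes.

wait_for_active() {
  # wait_for_active <namenode-id> [timeout-seconds]
  local nn=$1 timeout=${2:-30} elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    if [ "$(hdfs haadmin -getServiceState "$nn" 2>/dev/null)" = "active" ]; then
      echo "$nn became active after ${elapsed}s"
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  echo "$nn did not become active within ${timeout}s" >&2
  return 1
}

# Example:
# kill -9 <pid of NN>     # simulate a JVM crash on the active NameNode
# wait_for_active nn2 30
```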

If the test does not succeed, you may have a misconfiguration. Check the logs for the zkfc daemons as well as the NameNode daemons in order to further diagnose the issue.

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/latest/CDH5-High-Availability-Guide/cdh5hag_hdfs_ha_deploy.html
