
Setting up SSH for a Hadoop cluster


When setting up a Hadoop cluster, you'll need to designate one specific node as the master node. This server will typically host the NameNode and JobTracker daemons. It also serves as the base station that contacts and activates the DataNode and TaskTracker daemons on all of the slave nodes.

 

Hadoop uses passphraseless SSH for this purpose. SSH uses standard public-key cryptography to create a pair of keys for user verification: one public, one private. The public key is stored on every node in the cluster, while the private key stays on the master node and is never sent over the network. When the master attempts to access a remote machine, it uses the private key to prove its identity, and the target machine validates the login attempt against its stored copy of the public key.

 

1. Define a common account

This access is from a user account on one node to another user account on the target machine. For Hadoop, the accounts should have the same username on all of the nodes (we use hadoop-user in this book), and for security purposes we recommend that it be a user-level account. This account is only for managing your Hadoop cluster. Once the cluster daemons are up and running, you'll be able to run your actual MapReduce jobs from other accounts.

 

2. Verify SSH installation

$ which ssh

$ which sshd

$ which ssh-keygen

 

If any of these are missing, install OpenSSH.
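The three checks above can also be run in one pass. This is only a minimal sketch: the function name check_ssh_tools is made up for illustration, and it assumes the tools are on the PATH of the account running it (on some systems sshd lives in a directory that ordinary users don't have on their PATH).

```shell
#!/bin/sh
# check_ssh_tools is a hypothetical helper, not part of OpenSSH or Hadoop.
# It reports whether each named program can be found on the PATH.
check_ssh_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Return non-zero if at least one program was not found.
    return "$missing"
}

check_ssh_tools ssh sshd ssh-keygen || echo "Install OpenSSH before continuing."
```

Run it on every node, since each machine needs its own working SSH installation.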

 

3. Generate SSH key pair

Having verified that SSH is correctly installed on all nodes of the cluster, we use ssh-keygen on the master node to generate an RSA key pair. Be certain to avoid entering a passphrase (just press Enter at the passphrase prompts), or you'll have to manually enter that phrase every time the master node attempts to access another node.

 

$ ssh-keygen -t rsa
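If you'd rather script this step than answer the prompts, something like the following may work. It's a sketch under assumptions: generate_key is a made-up helper name, and it relies on ssh-keygen's -N (new passphrase), -f (output file) and -q (quiet) options. It refuses to touch a key file that already exists.

```shell
#!/bin/sh
# generate_key is a hypothetical helper, not an OpenSSH command.
# It creates an RSA key pair with an empty passphrase and no prompts,
# and leaves any existing key at that path untouched.
generate_key() {
    key_file=$1
    if [ -f "$key_file" ]; then
        echo "key already exists: $key_file"
        return 0
    fi
    mkdir -p "$(dirname "$key_file")"
    # -N "" sets an empty passphrase; -f names the private key file.
    # The public key is written next to it with a .pub suffix.
    ssh-keygen -q -t rsa -N "" -f "$key_file"
}
```

On the master node you would call it as generate_key "$HOME/.ssh/id_rsa", which matches the default location used in the rest of this walkthrough.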

 

4. Distribute public key and validate logins

Although it's a bit tedious, you next need to copy the public key to every slave node as well as the master node:

[hadoop-user@master]$ scp ~/.ssh/id_rsa.pub hadoop-user@target:~/master_key

 

Manually log in to the target node and set the master key as an authorized key. Note that the mv command below replaces any existing authorized_keys file, so if you already have other keys defined, append the master key to that file instead.

[hadoop-user@target]$ mkdir -p ~/.ssh
[hadoop-user@target]$ chmod 700 ~/.ssh

[hadoop-user@target]$ mv ~/master_key ~/.ssh/authorized_keys

[hadoop-user@target]$ chmod 600 ~/.ssh/authorized_keys
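For the "append" case, here is one possible way to add the key without overwriting keys that are already there. Treat it as a sketch: install_key is a made-up function name, and the two arguments (the copied key file and the .ssh directory) are assumptions for illustration. Many OpenSSH installations also ship an ssh-copy-id script that does a similar job from the master side.

```shell
#!/bin/sh
# install_key is a hypothetical helper, not part of OpenSSH.
# It appends a public key to authorized_keys unless that exact line is already there.
install_key() {
    pub_key_file=$1
    ssh_dir=$2
    auth_file="$ssh_dir/authorized_keys"

    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    touch "$auth_file"
    chmod 600 "$auth_file"

    key=$(cat "$pub_key_file")
    # -F: fixed string, -x: match the whole line, -q: no output
    if grep -qxF -- "$key" "$auth_file"; then
        echo "key already authorized"
        return 0
    fi
    # If the file's last line lacks a newline, add one so keys don't run together.
    if [ -n "$(tail -c 1 "$auth_file")" ]; then
        echo >> "$auth_file"
    fi
    printf '%s\n' "$key" >> "$auth_file"
    echo "key added"
}
```

On the target node this would be called as install_key ~/master_key ~/.ssh, after which the copied master_key file can be deleted. Running it twice is harmless because the second call finds the key and does nothing.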

 

After installing the key, you can verify it’s correctly defined by attempting to log in to the target node from the master:

[hadoop-user@master]$ ssh target

The authenticity of host 'target (xxx.xxx.xxx.xxx)' can’t be established.
RSA key fingerprint is 72:31:d8:1b:11:36:43:52:56:11:77:a4:ec:82:03:1d.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'target' (RSA) to the list of known hosts.
Last login: Sun Jan 4 15:32:22 2009 from master

 

After confirming the authenticity of a target node to the master node, you won’t be prompted upon subsequent login attempts.

[hadoop-user@master]$ ssh target
Last login: Sun Jan 4 15:32:49 2009 from master
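With more than a few slave nodes, checking each login by hand gets old quickly. The sketch below loops over a list of hosts and reports which ones accept a key-only login. The function name check_logins and the SSH_CMD override are made up for illustration; the ssh options BatchMode and ConnectTimeout are standard OpenSSH client options. Because BatchMode disables all prompts, a host you have never connected to may fail here until you've accepted its host key once by hand, as shown above.

```shell
#!/bin/sh
# check_logins is a hypothetical helper, not part of OpenSSH or Hadoop.
# SSH_CMD is an assumed override for dry runs; it defaults to the real ssh client.
check_logins() {
    failed=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        # "true" is the remote command: it does nothing and exits successfully.
        if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$host" true </dev/null; then
            echo "ok: $host"
        else
            echo "FAILED: $host"
            failed=1
        fi
    done
    # Return non-zero if any host rejected the login.
    return "$failed"
}
```

From the master you might run check_logins target, listing every node name your cluster uses in place of target.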

 

We’ve now set the groundwork for running Hadoop on your own cluster.

 

 
