zhans52

Installing Hadoop on CentOS (Pseudo-Distributed Mode)


      This walkthrough uses a CentOS 5.5 virtual machine installed on the local host.

      Software: JDK 1.6u26

      Hadoop: hadoop-0.20.203.tar.gz

 

Checking and configuring SSH

 

[root@localhost ~]# ssh-keygen -t  rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): 
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
a8:7a:3e:f6:92:85:b8:c7:be:d9:0e:45:9c:d1:36:3b root@localhost.localdomain
[root@localhost ~]# 
[root@localhost ~]# cd ..
[root@localhost /]# cd root
[root@localhost ~]# ls
anaconda-ks.cfg  Desktop  install.log  install.log.syslog
[root@localhost ~]# cd .ssh
[root@localhost .ssh]# cat id_rsa.pub > authorized_keys
[root@localhost .ssh]# 

[root@localhost .ssh]# ssh localhost
The authenticity of host 'localhost (127.0.0.1)' can't be established.
RSA key fingerprint is 41:c8:d4:e4:60:71:6f:6a:33:6a:25:27:62:9b:e3:90.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
Last login: Tue Jun 21 22:40:31 2011
[root@localhost ~]# 
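The transcript above answers the prompts interactively. The same setup can be done in one non-interactive pass; on many systems sshd also insists on tight permissions before it will accept the key, so the `chmod` lines below are worth running even though the transcript omits them. A sketch (assumes OpenSSH; run as the user that will start Hadoop):

```shell
# non-interactive variant of the key setup above (empty passphrase)
mkdir -p ~/.ssh
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa -q
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 700 ~/.ssh                   # sshd rejects keys in loosely-permissioned dirs
chmod 600 ~/.ssh/authorized_keys
```

Afterwards `ssh localhost` should log in without asking for a password.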

 

Installing the JDK

[root@localhost java]# chmod +x jdk-6u26-linux-i586.bin
[root@localhost java]# ./jdk-6u26-linux-i586.bin
......
......
......
For more information on what data Registration collects and 
how it is managed and used, see:
http://java.sun.com/javase/registration/JDKRegistrationPrivacy.html

Press Enter to continue.....

 
Done.

  After installation the directory jdk1.6.0_26 is created.

 

  Configuring environment variables

 

[root@localhost java]# vi /etc/profile
#add the following lines
# set java environment
export JAVA_HOME=/usr/java/jdk1.6.0_26
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH:$HOME/bin
export HADOOP_HOME=/usr/local/hadoop/hadoop-0.20.203
export PATH=$PATH:$HADOOP_HOME/bin

[root@localhost java]# chmod +x  /etc/profile
[root@localhost java]# source  /etc/profile
[root@localhost java]# 
[root@localhost java]# java -version
java version "1.6.0_26"
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Java HotSpot(TM) Client VM (build 20.1-b02, mixed mode, sharing)
[root@localhost java]# 

Editing /etc/hosts

[root@localhost conf]# vi /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1               localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6
127.0.0.1       namenode datanode01
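With that last entry in place, the hostnames used later in the masters and slaves files both resolve to the local machine. A quick sanity check (a sketch; `getent` is standard on CentOS):

```shell
# verify that the names Hadoop will use resolve via /etc/hosts
for h in namenode datanode01; do
  if getent hosts "$h" >/dev/null; then
    echo "$h resolves"
  else
    echo "$h does not resolve -- check /etc/hosts"
  fi
done
```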
 

 

Extracting and installing Hadoop

[root@localhost hadoop]# tar zxvf hadoop-0.20.203.tar.gz
......
......
......
hadoop-0.20.203.0/src/contrib/ec2/bin/image/create-hadoop-image-remote
hadoop-0.20.203.0/src/contrib/ec2/bin/image/ec2-run-user-data
hadoop-0.20.203.0/src/contrib/ec2/bin/launch-hadoop-cluster
hadoop-0.20.203.0/src/contrib/ec2/bin/launch-hadoop-master
hadoop-0.20.203.0/src/contrib/ec2/bin/launch-hadoop-slaves
hadoop-0.20.203.0/src/contrib/ec2/bin/list-hadoop-clusters
hadoop-0.20.203.0/src/contrib/ec2/bin/terminate-hadoop-cluster
[root@localhost hadoop]# 
 

  Configuring Hadoop: edit the files under conf

 
####################################
[root@localhost conf]# vi hadoop-env.sh
# add the following
# set java environment
  export JAVA_HOME=/usr/java/jdk1.6.0_26

#####################################
[root@localhost conf]# vi core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
   <property>
     <name>fs.default.name</name>
     <value>hdfs://namenode:9000/</value>
   </property>
   <property>
     <name>hadoop.tmp.dir</name>
     <value>/usr/local/hadoop/hadooptmp</value>
   </property>
</configuration>

#######################################
[root@localhost conf]# vi hdfs-site.xml 
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
     <name>dfs.name.dir</name>
     <value>/usr/local/hadoop/hdfs/name</value>
  </property>
  <property>
     <name>dfs.data.dir</name>
     <value>/usr/local/hadoop/hdfs/data</value>
  </property>
  <property>
     <name>dfs.replication</name>
     <value>1</value>
  </property>
</configuration>

#########################################
[root@localhost conf]# vi mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
     <name>mapred.job.tracker</name>
     <value>namenode:9001</value>
  </property>
  <property>
     <name>mapred.local.dir</name>
     <value>/usr/local/hadoop/mapred/local</value>
  </property>
  <property>
     <name>mapred.system.dir</name>
     <value>/tmp/hadoop/mapred/system</value>
  </property>
</configuration>
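The three XML files above reference several local directories (hadoop.tmp.dir, dfs.name.dir, dfs.data.dir, mapred.local.dir). Hadoop creates most of them on demand, but pre-creating them makes permission problems easier to spot. A sketch using the paths from this tutorial (adjust BASE if yours differ):

```shell
# pre-create the local directories named in the configs above
BASE=/usr/local/hadoop            # matches the values used in the XML files
mkdir -p "$BASE/hadooptmp" \
         "$BASE/hdfs/name" \
         "$BASE/hdfs/data" \
         "$BASE/mapred/local"
```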

#########################################
[root@localhost conf]# vi masters
#localhost
namenode

#########################################
[root@localhost conf]# vi slaves
#localhost
datanode01

 

Starting Hadoop

#####################format the namenode##############



[root@localhost bin]# hadoop namenode -format
11/06/23 00:43:54 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = localhost.localdomain/127.0.0.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.203.0
STARTUP_MSG:   build = http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-203 -r 1099333; compiled by 'oom' on Wed May  4 07:57:50 PDT 2011
************************************************************/
11/06/23 00:43:55 INFO util.GSet: VM type       = 32-bit
11/06/23 00:43:55 INFO util.GSet: 2% max memory = 19.33375 MB
11/06/23 00:43:55 INFO util.GSet: capacity      = 2^22 = 4194304 entries
11/06/23 00:43:55 INFO util.GSet: recommended=4194304, actual=4194304
11/06/23 00:43:56 INFO namenode.FSNamesystem: fsOwner=root
11/06/23 00:43:56 INFO namenode.FSNamesystem: supergroup=supergroup
11/06/23 00:43:56 INFO namenode.FSNamesystem: isPermissionEnabled=true
11/06/23 00:43:56 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
11/06/23 00:43:56 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
11/06/23 00:43:56 INFO namenode.NameNode: Caching file names occuring more than 10 times 
11/06/23 00:43:57 INFO common.Storage: Image file of size 110 saved in 0 seconds.
11/06/23 00:43:57 INFO common.Storage: Storage directory /usr/local/hadoop/hdfs/name has been successfully formatted.
11/06/23 00:43:57 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at localhost.localdomain/127.0.0.1
************************************************************/
[root@localhost bin]# 

###########################################
[root@localhost bin]# ./start-all.sh
starting namenode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-namenode-localhost.localdomain.out
datanode01: starting datanode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-datanode-localhost.localdomain.out
namenode: starting secondarynamenode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-secondarynamenode-localhost.localdomain.out
starting jobtracker, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-jobtracker-localhost.localdomain.out
datanode01: starting tasktracker, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-tasktracker-localhost.localdomain.out
[root@localhost bin]# jps
11971 TaskTracker
11807 SecondaryNameNode
11599 NameNode
12022 Jps
11710 DataNode
11877 JobTracker 
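In pseudo-distributed mode all five daemons should show up in `jps`. A small helper (the function name is made up for illustration) that flags any missing daemon in a jps listing:

```shell
# check a jps listing for the five expected Hadoop 0.20 daemons
check_daemons() {   # $1 = output of `jps`
  for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
    if echo "$1" | grep -qw "$d"; then
      echo "$d up"
    else
      echo "$d MISSING"
    fi
  done
}
check_daemons "$(jps 2>/dev/null)"
```

`grep -w` matches whole words, so `NameNode` does not falsely match `SecondaryNameNode`.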

 

 

  Checking cluster status

[root@localhost bin]# hadoop dfsadmin  -report
Configured Capacity: 4055396352 (3.78 GB)
Present Capacity: 464142351 (442.64 MB)
DFS Remaining: 464089088 (442.59 MB)
DFS Used: 53263 (52.01 KB)
DFS Used%: 0.01%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)

Name: 127.0.0.1:50010
Decommission Status : Normal
Configured Capacity: 4055396352 (3.78 GB)
DFS Used: 53263 (52.01 KB)
Non DFS Used: 3591254001 (3.34 GB)
DFS Remaining: 464089088(442.59 MB)
DFS Used%: 0%
DFS Remaining%: 11.44%
Last contact: Thu Jun 23 01:11:15 PDT 2011


[root@localhost bin]# 

 

 

 

 

  Other issues: 1. startup error

####################startup error##########
[root@localhost bin]# ./start-all.sh
starting namenode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-namenode-localhost.localdomain.out
The authenticity of host 'datanode01 (127.0.0.1)' can't be established.
RSA key fingerprint is 41:c8:d4:e4:60:71:6f:6a:33:6a:25:27:62:9b:e3:90.
Are you sure you want to continue connecting (yes/no)? y
Please type 'yes' or 'no': yes
datanode01: Warning: Permanently added 'datanode01' (RSA) to the list of known hosts.
datanode01: starting datanode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-datanode-localhost.localdomain.out
datanode01: Unrecognized option: -jvm
datanode01: Could not create the Java virtual machine.
namenode: starting secondarynamenode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-secondarynamenode-localhost.localdomain.out
starting jobtracker, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-jobtracker-localhost.localdomain.out
datanode01: starting tasktracker, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-tasktracker-localhost.localdomain.out
[root@localhost bin]# jps
10442 JobTracker
10533 TaskTracker
10386 SecondaryNameNode
10201 NameNode
10658 Jps

################################################
[root@localhost bin]# vi hadoop
elif [ "$COMMAND" = "datanode" ] ; then
  CLASS='org.apache.hadoop.hdfs.server.datanode.DataNode'
  if [[ $EUID -eq 0 ]]; then
    HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"
  else
    HADOOP_OPTS="$HADOOP_OPTS -server $HADOOP_DATANODE_OPTS"
  fi

#http://javoft.net/2011/06/hadoop-unrecognized-option-jvm-could-not-create-the-java-virtual-machine/
#change to:
elif [ "$COMMAND" = "datanode" ] ; then
  CLASS='org.apache.hadoop.hdfs.server.datanode.DataNode'
#  if [[ $EUID -eq 0 ]]; then
#    HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"
#  else
    HADOOP_OPTS="$HADOOP_OPTS -server $HADOOP_DATANODE_OPTS"
#  fi

#alternatively, start Hadoop as a non-root user
#after this change, startup succeeds

  2. The firewall must be turned off before starting (service iptables stop).

 

Checking the web interfaces:

http://localhost:50070

NameNode 'localhost.localdomain:9000'
Started: 	Thu Jun 23 01:07:18 PDT 2011
Version: 	0.20.203.0, r1099333
Compiled: 	Wed May 4 07:57:50 PDT 2011 by oom
Upgrades: 	There are no upgrades in progress.

Browse the filesystem
Namenode Logs
Cluster Summary
6 files and directories, 1 blocks = 7 total. Heap Size is 31.38 MB / 966.69 MB (3%)
Configured Capacity	:	3.78 GB
DFS Used	:	52.01 KB
Non DFS Used	:	3.34 GB
DFS Remaining	:	442.38 MB
DFS Used%	:	0 %
DFS Remaining%	:	11.44 %
Live Nodes 	:	1
Dead Nodes 	:	0
Decommissioning Nodes 	:	0
Number of Under-Replicated Blocks	:	0

NameNode Storage:
Storage Directory	Type	State
/usr/local/hadoop/hdfs/name	IMAGE_AND_EDITS	Active
 

http://localhost:50030

namenode Hadoop Map/Reduce Administration
Quick Links

    * Scheduling Info
    * Running Jobs
    * Retired Jobs
    * Local Logs

State: RUNNING
Started: Thu Jun 23 01:07:30 PDT 2011
Version: 0.20.203.0, r1099333
Compiled: Wed May 4 07:57:50 PDT 2011 by oom
Identifier: 201106230107
Cluster Summary (Heap Size is 15.31 MB/966.69 MB)
Running Map Tasks: 0
Running Reduce Tasks: 0
Total Submissions: 0
Nodes: 1
Occupied Map Slots: 0
Occupied Reduce Slots: 0
Reserved Map Slots: 0
Reserved Reduce Slots: 0
Map Task Capacity: 2
Reduce Task Capacity: 2
Avg. Tasks/Node: 4.00
Blacklisted Nodes: 0
Graylisted Nodes: 0
Excluded Nodes: 0

Scheduling Information
Queue Name 	State 	Scheduling Information
default 	running 	N/A
Filter (Jobid, Priority, User, Name)
Example: 'user:smith 3200' will filter by 'smith' only in the user field and '3200' in all fields
Running Jobs
none
Retired Jobs
none
Local Logs
Log directory, Job Tracker History

This is Apache Hadoop release 0.20.203.0

 

Testing:

##########create a directory##########
[root@localhost bin]# hadoop fs -mkdir  testFolder

###############copy a local file into the directory
[root@localhost local]# ls
bin  etc  games  hadoop  include  lib  libexec  sbin  share  src  SSH_key_file
[root@localhost local]# hadoop fs -copyFromLocal SSH_key_file testFolder
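The upload can also be confirmed from the command line by grepping an `hadoop fs -ls` listing for the file name. A sketch (the helper name is hypothetical):

```shell
# check that a file name appears in an `hadoop fs -ls` listing
uploaded() {   # $1 = file name, $2 = listing text
  echo "$2" | grep -q "$1"
}
if uploaded SSH_key_file "$(hadoop fs -ls testFolder 2>/dev/null)"; then
  echo "upload confirmed"
else
  echo "SSH_key_file not found in testFolder"
fi
```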

The uploaded file can then be viewed from the web interface.
 

 

 

 Reference: http://bxyzzy.blog.51cto.com/854497/352692

 

   Appendix: set up FTP for convenient file transfer (unrelated to Hadoop): yum install vsftpd

     Turn off the firewall: service iptables stop

     Start FTP: service vsftpd start
