Environment: a CentOS 5.5 virtual machine installed on my local machine.
Software: JDK 1.6 U26
Hadoop: hadoop-0.20.203.tar.gz
Check and configure SSH
[root@localhost ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
a8:7a:3e:f6:92:85:b8:c7:be:d9:0e:45:9c:d1:36:3b root@localhost.localdomain
[root@localhost ~]#
[root@localhost ~]# cd ..
[root@localhost /]# cd root
[root@localhost ~]# ls
anaconda-ks.cfg  Desktop  install.log  install.log.syslog
[root@localhost ~]# cd .ssh
[root@localhost .ssh]# cat id_rsa.pub > authorized_keys
[root@localhost .ssh]#
[root@localhost .ssh]# ssh localhost
The authenticity of host 'localhost (127.0.0.1)' can't be established.
RSA key fingerprint is 41:c8:d4:e4:60:71:6f:6a:33:6a:25:27:62:9b:e3:90.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
Last login: Tue Jun 21 22:40:31 2011
[root@localhost ~]#
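Redirecting with a single > overwrites any keys that were already authorized. On a fresh VM that is harmless, but on a machine that already has an authorized_keys file an appending variant is safer. The following is only a sketch: the function name authorize_key is made up for illustration, and the permission step assumes sshd runs with its usual strict permission checks.

```shell
# Append a public key to an authorized_keys file unless the same line is
# already present, then restrict permissions on the directory and the file.
# (authorize_key is a hypothetical helper, not part of OpenSSH.)
authorize_key() {
    pub_file=$1
    auth_file=$2
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    # -q quiet, -x whole-line match, -F fixed string (no regex)
    if ! grep -qxF -e "$(cat "$pub_file")" "$auth_file"; then
        cat "$pub_file" >> "$auth_file"
    fi
    chmod 700 "$(dirname "$auth_file")"
    chmod 600 "$auth_file"
}

# Example (as root on the VM above):
#   authorize_key /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys
```

Running it twice leaves a single copy of the key in the file.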
Install the JDK
[root@localhost java]# chmod +x jdk-6u26-linux-i586.bin
[root@localhost java]# ./jdk-6u26-linux-i586.bin
......
......
......
For more information on what data Registration collects and
how it is managed and used, see:
http://java.sun.com/javase/registration/JDKRegistrationPrivacy.html
Press Enter to continue.....
Done.
When the installer finishes it has created the folder jdk1.6.0_26.
Configure environment variables
[root@localhost java]# vi /etc/profile
# add the following lines
# set java environment
export JAVA_HOME=/usr/java/jdk1.6.0_26
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH:$HOME/bin
export HADOOP_HOME=/usr/local/hadoop/hadoop-0.20.203
export PATH=$PATH:$HADOOP_HOME/bin
[root@localhost java]# chmod +x /etc/profile
[root@localhost java]# source /etc/profile
[root@localhost java]#
[root@localhost java]# java -version
java version "1.6.0_26"
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Java HotSpot(TM) Client VM (build 20.1-b02, mixed mode, sharing)
[root@localhost java]#
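Later steps call jps and hadoop by name, so both the JDK bin directory and $HADOOP_HOME/bin need to be on PATH after sourcing /etc/profile. A small check along these lines can confirm that; path_contains is a made-up helper name, not a standard command.

```shell
# Return success if a directory appears as a complete entry in a
# colon-separated search path (defaults to the current $PATH).
# (path_contains is a hypothetical helper for illustration.)
path_contains() {
    dir=$1
    search_path=${2:-$PATH}
    # Wrap both sides in colons so only whole entries match
    case ":$search_path:" in
        *":$dir:"*) return 0 ;;
        *)          return 1 ;;
    esac
}

# Example, after `source /etc/profile`:
#   path_contains "$JAVA_HOME/bin"   || echo "JDK bin directory is not on PATH"
#   path_contains "$HADOOP_HOME/bin" || echo "Hadoop bin directory is not on PATH"
```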
Edit /etc/hosts
[root@localhost conf]# vi /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
127.0.0.1 namenode datanode01
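The Hadoop configuration refers to the names namenode and datanode01, so both have to resolve through /etc/hosts. As a rough check, the sketch below looks a name up in a hosts-format file; hosts_lookup is a hypothetical helper, and it does not handle trailing comments on an entry line.

```shell
# Print the address that a hosts-format file maps a name to (first match only).
# (hosts_lookup is a hypothetical helper; the file defaults to /etc/hosts.)
hosts_lookup() {
    name=$1
    hosts_file=${2:-/etc/hosts}
    awk -v n="$name" '
        /^[ \t]*#/ { next }    # skip comment lines
        {
            # field 1 is the address, the remaining fields are host names
            for (i = 2; i <= NF; i++) {
                if ($i == n) { print $1; exit }
            }
        }
    ' "$hosts_file"
}

# Example:
#   hosts_lookup namenode
#   hosts_lookup datanode01
```

With the hosts file shown above, both names should map to 127.0.0.1.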
Unpack and install Hadoop
[root@localhost hadoop]# tar zxvf hadoop-0.20.203.tar.gz
......
......
......
hadoop-0.20.203.0/src/contrib/ec2/bin/image/create-hadoop-image-remote
hadoop-0.20.203.0/src/contrib/ec2/bin/image/ec2-run-user-data
hadoop-0.20.203.0/src/contrib/ec2/bin/launch-hadoop-cluster
hadoop-0.20.203.0/src/contrib/ec2/bin/launch-hadoop-master
hadoop-0.20.203.0/src/contrib/ec2/bin/launch-hadoop-slaves
hadoop-0.20.203.0/src/contrib/ec2/bin/list-hadoop-clusters
hadoop-0.20.203.0/src/contrib/ec2/bin/terminate-hadoop-cluster
[root@localhost hadoop]#
Go to Hadoop's conf directory and edit the configuration files
####################################
[root@localhost conf]# vi hadoop-env.sh
# add the following
# set java environment
export JAVA_HOME=/usr/java/jdk1.6.0_26
#####################################
[root@localhost conf]# vi core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode:9000/</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/hadooptmp</value>
  </property>
</configuration>
#######################################
[root@localhost conf]# vi hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/usr/local/hadoop/hdfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/usr/local/hadoop/hdfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
#########################################
[root@localhost conf]# vi mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>namenode:9001</value>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>/usr/local/hadoop/mapred/local</value>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>/tmp/hadoop/mapred/system</value>
  </property>
</configuration>
#########################################
[root@localhost conf]# vi masters
#localhost
namenode
#########################################
[root@localhost conf]# vi slaves
#localhost
datanode01
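With four files edited by hand, a typo in a property is easy to miss. The sketch below pulls a single value out of one of these site files so it can be compared with what was intended. It is deliberately crude: get_property is a made-up helper, it treats the XML as text rather than parsing it, and it assumes each <name> is immediately followed by its <value> as in the files above.

```shell
# Print the <value> that follows <name>PROPERTY</name> in a Hadoop site file.
# (get_property is a hypothetical helper; text matching, not a real XML parser.
# Dots in the property name are treated as regex wildcards, which is harmless here.)
get_property() {
    prop=$1
    file=$2
    # Join the file into one line so name and value can sit on separate lines
    tr -d '\n' < "$file" |
        sed -n "s|.*<name>$prop</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p"
}

# Example, run from the conf directory:
#   get_property fs.default.name core-site.xml
#   get_property dfs.replication hdfs-site.xml
```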
Start Hadoop
##################### Format the namenode ##############
[root@localhost bin]# hadoop namenode -format
11/06/23 00:43:54 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = localhost.localdomain/127.0.0.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.20.203.0
STARTUP_MSG: build = http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-203 -r 1099333; compiled by 'oom' on Wed May 4 07:57:50 PDT 2011
************************************************************/
11/06/23 00:43:55 INFO util.GSet: VM type = 32-bit
11/06/23 00:43:55 INFO util.GSet: 2% max memory = 19.33375 MB
11/06/23 00:43:55 INFO util.GSet: capacity = 2^22 = 4194304 entries
11/06/23 00:43:55 INFO util.GSet: recommended=4194304, actual=4194304
11/06/23 00:43:56 INFO namenode.FSNamesystem: fsOwner=root
11/06/23 00:43:56 INFO namenode.FSNamesystem: supergroup=supergroup
11/06/23 00:43:56 INFO namenode.FSNamesystem: isPermissionEnabled=true
11/06/23 00:43:56 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
11/06/23 00:43:56 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
11/06/23 00:43:56 INFO namenode.NameNode: Caching file names occuring more than 10 times
11/06/23 00:43:57 INFO common.Storage: Image file of size 110 saved in 0 seconds.
11/06/23 00:43:57 INFO common.Storage: Storage directory /usr/local/hadoop/hdfs/name has been successfully formatted.
11/06/23 00:43:57 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at localhost.localdomain/127.0.0.1
************************************************************/
[root@localhost bin]#
###########################################
[root@localhost bin]# ./start-all.sh
starting namenode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-namenode-localhost.localdomain.out
datanode01: starting datanode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-datanode-localhost.localdomain.out
namenode: starting secondarynamenode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-secondarynamenode-localhost.localdomain.out
starting jobtracker, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-jobtracker-localhost.localdomain.out
datanode01: starting tasktracker, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-tasktracker-localhost.localdomain.out
[root@localhost bin]# jps
11971 TaskTracker
11807 SecondaryNameNode
11599 NameNode
12022 Jps
11710 DataNode
11877 JobTracker
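A healthy pseudo-distributed setup of this Hadoop version shows five daemons in jps: NameNode, SecondaryNameNode, DataNode, JobTracker and TaskTracker. Reading the list by eye works, but a small filter makes a missing daemon obvious. This is only a sketch, and missing_daemons is a made-up name.

```shell
# Read `jps` output on stdin, print each expected Hadoop daemon that is
# absent, and return non-zero if any is missing.
# (missing_daemons is a hypothetical helper for illustration.)
missing_daemons() {
    jps_output=$(cat)
    status=0
    for daemon in NameNode SecondaryNameNode DataNode JobTracker TaskTracker; do
        # jps prints "<pid> <main class>"; compare the second field exactly
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
            status=1
        fi
    done
    return $status
}

# Example:
#   jps | missing_daemons && echo "all daemons are running"
```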
Check the cluster status
[root@localhost bin]# hadoop dfsadmin -report
Configured Capacity: 4055396352 (3.78 GB)
Present Capacity: 464142351 (442.64 MB)
DFS Remaining: 464089088 (442.59 MB)
DFS Used: 53263 (52.01 KB)
DFS Used%: 0.01%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)
Name: 127.0.0.1:50010
Decommission Status : Normal
Configured Capacity: 4055396352 (3.78 GB)
DFS Used: 53263 (52.01 KB)
Non DFS Used: 3591254001 (3.34 GB)
DFS Remaining: 464089088(442.59 MB)
DFS Used%: 0%
DFS Remaining%: 11.44%
Last contact: Thu Jun 23 01:11:15 PDT 2011
[root@localhost bin]#
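The line worth watching in this report is "Datanodes available": with dfs.replication set to 1 and a single slave, it should read 1. A sketch for extracting that number in a script follows; live_datanodes is a made-up name and it assumes the report wording shown above.

```shell
# Read `hadoop dfsadmin -report` output on stdin and print the number of
# available datanodes, taken from the "Datanodes available: N (...)" line.
# (live_datanodes is a hypothetical helper for illustration.)
live_datanodes() {
    sed -n 's/^Datanodes available: \([0-9][0-9]*\) .*/\1/p'
}

# Example:
#   hadoop dfsadmin -report | live_datanodes
```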
Other issues: 1
#################### Error at startup ##########
[root@localhost bin]# ./start-all.sh
starting namenode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-namenode-localhost.localdomain.out
The authenticity of host 'datanode01 (127.0.0.1)' can't be established.
RSA key fingerprint is 41:c8:d4:e4:60:71:6f:6a:33:6a:25:27:62:9b:e3:90.
Are you sure you want to continue connecting (yes/no)? y
Please type 'yes' or 'no': yes
datanode01: Warning: Permanently added 'datanode01' (RSA) to the list of known hosts.
datanode01: starting datanode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-datanode-localhost.localdomain.out
datanode01: Unrecognized option: -jvm
datanode01: Could not create the Java virtual machine.
namenode: starting secondarynamenode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-secondarynamenode-localhost.localdomain.out
starting jobtracker, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-jobtracker-localhost.localdomain.out
datanode01: starting tasktracker, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-tasktracker-localhost.localdomain.out
[root@localhost bin]# jps
10442 JobTracker
10533 TaskTracker
10386 SecondaryNameNode
10201 NameNode
10658 Jps
################################################
[root@localhost bin]# vi hadoop
elif [ "$COMMAND" = "datanode" ] ; then
CLASS='org.apache.hadoop.hdfs.server.datanode.DataNode'
if [[ $EUID -eq 0 ]]; then
HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"
else
HADOOP_OPTS="$HADOOP_OPTS -server $HADOOP_DATANODE_OPTS"
fi
#http://javoft.net/2011/06/hadoop-unrecognized-option-jvm-could-not-create-the-java-virtual-machine/
# change it to
elif [ "$COMMAND" = "datanode" ] ; then
CLASS='org.apache.hadoop.hdfs.server.datanode.DataNode'
# if [[ $EUID -eq 0 ]]; then
# HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"
# else
HADOOP_OPTS="$HADOOP_OPTS -server $HADOOP_DATANODE_OPTS"
# fi
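# Alternatively, start Hadoop as a non-root user
# After this change the datanode starts successfully
2. The firewall must be turned off before starting Hadoop.
Check that everything is running: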
http://localhost:50070
NameNode 'localhost.localdomain:9000'
Started: Thu Jun 23 01:07:18 PDT 2011
Version: 0.20.203.0, r1099333
Compiled: Wed May 4 07:57:50 PDT 2011 by oom
Upgrades: There are no upgrades in progress.
Links: Browse the filesystem, Namenode Logs
Cluster Summary
6 files and directories, 1 blocks = 7 total. Heap Size is 31.38 MB / 966.69 MB (3%)
Configured Capacity : 3.78 GB
DFS Used : 52.01 KB
Non DFS Used : 3.34 GB
DFS Remaining : 442.38 MB
DFS Used% : 0 %
DFS Remaining% : 11.44 %
Live Nodes : 1
Dead Nodes : 0
Decommissioning Nodes : 0
Number of Under-Replicated Blocks : 0
NameNode Storage:
Storage Directory: /usr/local/hadoop/hdfs/name, Type: IMAGE_AND_EDITS, State: Active
http://localhost:50030
namenode Hadoop Map/Reduce Administration
Quick Links: Scheduling Info, Running Jobs, Retired Jobs, Local Logs
State: RUNNING
Started: Thu Jun 23 01:07:30 PDT 2011
Version: 0.20.203.0, r1099333
Compiled: Wed May 4 07:57:50 PDT 2011 by oom
Identifier: 201106230107
Cluster Summary (Heap Size is 15.31 MB/966.69 MB)
Running Map Tasks: 0
Running Reduce Tasks: 0
Total Submissions: 0
Nodes: 1
Occupied Map Slots: 0
Occupied Reduce Slots: 0
Reserved Map Slots: 0
Reserved Reduce Slots: 0
Map Task Capacity: 2
Reduce Task Capacity: 2
Avg. Tasks/Node: 4.00
Blacklisted Nodes: 0
Graylisted Nodes: 0
Excluded Nodes: 0
Scheduling Information
Queue Name: default, State: running, Scheduling Information: N/A
Running Jobs: none
Retired Jobs: none
This is Apache Hadoop release 0.20.203.0
Test:
########## Create a directory ##########
[root@localhost bin]# hadoop fs -mkdir testFolder
############### Copy a file into the directory
[root@localhost local]# ls
bin etc games hadoop include lib libexec sbin share src SSH_key_file
[root@localhost local]# hadoop fs -copyFromLocal SSH_key_file testFolder
The uploaded file can then be seen in the web page.
Reference: http://bxyzzy.blog.51cto.com/854497/352692
Appendix: set up FTP: yum install vsftpd (handy for file transfer, unrelated to Hadoop)
Turn off the firewall: service iptables stop
Start FTP: service vsftpd start