A friend's company set up a Hadoop cluster on commodity PCs; I followed their example and verified that it works.
==========================================
The OS is CentOS 5.4 (passwordless SSH trust is already established among the nodes).
1. Install Java
1) Download Java (the following is done in /work)
wget http://download.oracle.com/otn-pub/java/jdk/7u2-b13/jdk-7u2-linux-i586.tar.gz
2) Unpack the download and rename the directory
tar -zxvf jdk-7u2-linux-i586.tar.gz
mv jdk1.7.0_02 java
rm jdk-7u2-linux-i586.tar.gz
3) Add the following lines to /etc/profile:
export JAVA_HOME=/work/java
export JRE_HOME=$JAVA_HOME/jre
export PATH=$PATH:$JAVA_HOME/bin
2. Install Hadoop
1) Download the Hadoop tarball (into /work)
wget http://mirror.bit.edu.cn/apache//hadoop/common/hadoop-1.0.0/hadoop-1.0.0.tar.gz
2) Unpack it and rename the directory
tar -zxvf hadoop-1.0.0.tar.gz
mv hadoop-1.0.0 hadoop
rm hadoop-1.0.0.tar.gz
3) Update /etc/profile to
export JAVA_HOME=/work/java
export JRE_HOME=$JAVA_HOME/jre
export HADOOP_HOME=/work/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
3. Configure Hadoop
1) conf/hadoop-env.sh (HADOOP_HEAPSIZE is in MB)
export JAVA_HOME=/work/java
export HADOOP_HEAPSIZE=2000
2) conf/core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://da-free-test1:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/work/hadoopneed/tmp</value>
</property>
<property>
<name>dfs.hosts.exclude</name>
<value>/work/hadoop/conf/dfs.hosts.exclude</value>
</property>
</configuration>
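The dfs.hosts.exclude property above names a file that should exist before the NameNode starts (an empty file is fine; hosts listed in it are later decommissioned). A minimal sketch of creating it; CONF defaults to a scratch directory here so the sketch runs anywhere, while on the real cluster it would be /work/hadoop/conf:

```shell
# Create the (initially empty) exclude file named by dfs.hosts.exclude.
# CONF is a scratch directory here; on the cluster it is /work/hadoop/conf.
CONF="${CONF:-$(mktemp -d)}"
touch "$CONF/dfs.hosts.exclude"
ls -l "$CONF/dfs.hosts.exclude"
```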
3) conf/hdfs-site.xml
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/work/hadoopneed/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/work/hadoopneed/data/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.handler.count</name>
<value>30</value>
</property>
<property>
<name>dfs.datanode.handler.count</name>
<value>5</value>
</property>
<property>
<name>dfs.datanode.du.reserved</name>
<value>10737418240</value>
</property>
<property>
<name>dfs.block.size</name>
<value>134217728</value>
</property>
</configuration>
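The two long numeric values above are just powers-of-two sizes, which a quick shell check confirms:

```shell
# dfs.datanode.du.reserved: 10 GiB kept free per datanode for non-HDFS use
echo $((10 * 1024 * 1024 * 1024))   # 10737418240
# dfs.block.size: 128 MB HDFS blocks instead of the 64 MB default
echo $((128 * 1024 * 1024))         # 134217728
```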
4) conf/mapred-site.xml (the property list must be wrapped in a <configuration> element; the tracker address takes no trailing slash, and the -Xmx option needs a unit suffix)
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>da-free-test1:9001</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/work/hadoopneed/mapred/local</value>
</property>
<property>
<name>mapred.system.dir</name>
<value>/tmp/hadoop/mapred/system</value>
</property>
<property>
<name>mapred.child.java.opts</name>
<value>-Xmx512m</value>
<final>true</final>
</property>
<property>
<name>mapred.job.tracker.handler.count</name>
<value>30</value>
</property>
<property>
<name>mapred.map.tasks</name>
<value>100</value>
</property>
<property>
<name>mapred.tasktracker.map.tasks.maximum</name>
<value>12</value>
</property>
<property>
<name>mapred.reduce.tasks</name>
<value>63</value>
</property>
<property>
<name>mapred.tasktracker.reduce.tasks.maximum</name>
<value>6</value>
</property>
</configuration>
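The three config files reference several local directories (hadoop.tmp.dir, dfs.name.dir, dfs.data.dir, mapred.local.dir) that are easiest to create up front on every node. A sketch, with BASE defaulting to a scratch directory so it runs anywhere; on the cluster BASE would be /work/hadoopneed:

```shell
# Pre-create every local directory the config files above point at.
# BASE is a scratch directory here; on the cluster it is /work/hadoopneed.
BASE="${BASE:-$(mktemp -d)}"
mkdir -p "$BASE/tmp" "$BASE/name" "$BASE/data/data" "$BASE/mapred/local"
ls -R "$BASE"
```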
5) conf/masters
da-free-test1
6) conf/slaves
da-free-test2
da-free-test3
da-free-test4
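Both files are plain newline-separated host lists, so they can also be generated in one go. CONF defaults to a scratch directory here; on the cluster it would be /work/hadoop/conf:

```shell
# Write the master and slave host lists as plain newline-separated files.
CONF="${CONF:-$(mktemp -d)}"
echo da-free-test1 > "$CONF/masters"
printf '%s\n' da-free-test2 da-free-test3 da-free-test4 > "$CONF/slaves"
wc -l "$CONF/masters" "$CONF/slaves"
```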
4. Install on the other nodes
1) Copy the hadoop and java directories to the matching path on the other three nodes
scp -r hadoop da-free-test2:/work
scp -r hadoop da-free-test3:/work
scp -r hadoop da-free-test4:/work
scp -r java da-free-test2:/work
scp -r java da-free-test3:/work
scp -r java da-free-test4:/work
2) On the three nodes, add the following lines to /etc/profile and source it once.
export JAVA_HOME=/work/java
export JRE_HOME=$JAVA_HOME/jre
export HADOOP_HOME=/work/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
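The six copy commands above can be collapsed into one loop. This sketch builds the command list as a dry run; dropping the inner echo (so the loop runs scp directly) performs the real copies over the SSH trust set up earlier:

```shell
# Build the list of copy commands for every slave host and directory.
# This is a dry run: the commands are printed, not executed.
plan=$(for host in da-free-test2 da-free-test3 da-free-test4; do
         for dir in hadoop java; do
           echo "scp -r /work/$dir $host:/work"
         done
       done)
echo "$plan"
```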
That completes the installation; everything from here on is troubleshooting.
5. Format the filesystem
1) Problems encountered:
[root@da-free-test1 bin]# ./hadoop namenode -format
Warning: $HADOOP_HOME is deprecated.
Error: dl failure on line 875
Error: failed /work/java/jre/lib/i386/server/libjvm.so, because /work/java/jre/lib/i386/server/libjvm.so: cannot restore segment prot after reloc: Permission denied
Error: dl failure on line 875
Error: failed /work/java/jre/lib/i386/server/libjvm.so, because /work/java/jre/lib/i386/server/libjvm.so: cannot restore segment prot after reloc: Permission denied
Fix: disable SELinux.
Edit /etc/selinux/config:
SELINUX=disabled
Make the same change on the other three nodes, then reboot them.
Fixing the warning
Warning: $HADOOP_HOME is deprecated.
Remove the $HADOOP_HOME lines just added to /etc/profile and log in again.
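An alternative that keeps HADOOP_HOME exported: the 1.0-line start scripts check an environment variable to silence this warning. This is an assumption worth verifying against your release's bin/hadoop-config.sh before relying on it:

```shell
# Silence "Warning: $HADOOP_HOME is deprecated." while keeping the variable
# set (assumes your 1.0.x hadoop-config.sh honors this switch).
export HADOOP_HOME_WARN_SUPPRESS=1
```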
2) Successful format
[root@da-free-test1 ~]# hadoop namenode -format
12/02/08 12:01:21 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = da-free-test1/172.16.18.202
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.0.0
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1214675; compiled by 'hortonfo' on Thu Dec 15 16:36:35 UTC 2011
************************************************************/
12/02/08 12:01:21 INFO util.GSet: VM type = 32-bit
12/02/08 12:01:21 INFO util.GSet: 2% max memory = 35.55625 MB
12/02/08 12:01:21 INFO util.GSet: capacity = 2^23 = 8388608 entries
12/02/08 12:01:21 INFO util.GSet: recommended=8388608, actual=8388608
12/02/08 12:01:21 INFO namenode.FSNamesystem: fsOwner=root
12/02/08 12:01:21 INFO namenode.FSNamesystem: supergroup=supergroup
12/02/08 12:01:21 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/02/08 12:01:21 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
12/02/08 12:01:21 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
12/02/08 12:01:21 INFO namenode.NameNode: Caching file names occuring more than 10 times
12/02/08 12:01:22 INFO common.Storage: Image file of size 110 saved in 0 seconds.
12/02/08 12:01:22 INFO common.Storage: Storage directory /work/hadoopneed/name has been successfully formatted.
12/02/08 12:01:22 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at da-free-test1/172.16.18.202
************************************************************/
6. Start Hadoop
1) The logs report an error after startup
./start-all.sh
WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/hadoop/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
Fix attempt: take HDFS out of safe mode with hadoop dfsadmin -safemode leave (Hadoop itself was left running; the current state can be checked with hadoop dfsadmin -safemode get).
After waiting a while, Hadoop recovered on its own.
The log hadoop-root-jobtracker-da-free-test1.log shows:
2012-02-08 12:14:07,804 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2012-02-08 12:14:07,804 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 9001: starting
2012-02-08 12:14:07,805 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 9001: starting
... (handlers 1 through 29 on 9001 start the same way) ...
2012-02-08 12:14:07,808 INFO org.apache.hadoop.mapred.JobTracker: Starting RUNNING
2012-02-08 12:14:12,623 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/da-free-test4
2012-02-08 12:14:12,625 INFO org.apache.hadoop.mapred.JobTracker: Adding tracker tracker_da-free-test4:da-free-test1/127.0.0.1:42182 to host da-free-test4
2012-02-08 12:14:12,743 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/da-free-test3
2012-02-08 12:14:12,744 INFO org.apache.hadoop.mapred.JobTracker: Adding tracker tracker_da-free-test3:da-free-test1/127.0.0.1:53695 to host da-free-test3
2012-02-08 12:14:12,802 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/da-free-test2
2012-02-08 12:14:12,802 INFO org.apache.hadoop.mapred.JobTracker: Adding tracker tracker_da-free-test2:da-free-test1/127.0.0.1:47259 to host da-free-test2
Open a browser at http://172.16.18.202:50030 and the node count shows 3.
Upload a file: hadoop dfs -put hadoop-root-namenode-da-free-test1.log /usr/testfile
List it: hadoop dfs -ls /usr/
Found 1 items
-rw-r--r--   3 root supergroup      84533 2012-02-08 12:20 /usr/testfile
At http://172.16.18.202:50070 the used space shows 396 KB.
Verified, for now.
===========================
http://i752.photobucket.com/albums/xx166/ntudou/dev/hadoop01.png
http://i752.photobucket.com/albums/xx166/ntudou/dev/02.png
http://i752.photobucket.com/albums/xx166/ntudou/dev/03.png