
Hadoop 0.20.205.0 Installation and Configuration


Environment:

10.0.30.235 NameNode

10.0.30.236 SecondaryNameNode

10.0.30.237 DataNode

10.0.30.238 DataNode

 

Configure hostnames

 

/etc/hosts

 

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1               localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6
10.0.30.235     nn0001  nn0001
10.0.30.236     snn0001 snn0001
10.0.30.237     dn0001  dn0001
10.0.30.238     dn0002  dn0002
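The same /etc/hosts entries are needed on every node. One way is to edit the file on each machine by hand; alternatively it can be copied out from the NameNode, for example (the root password will still be asked for at this stage):

scp /etc/hosts 10.0.30.236:/etc/hosts    # repeat for 10.0.30.237 and 10.0.30.238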

 

Change the HOSTNAME value in the network file on every node (the NameNode is shown as an example; use the corresponding hostname on the other nodes):

/etc/sysconfig/network

HOSTNAME=nn0001
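A change to /etc/sysconfig/network only takes effect after a reboot; to apply the new name to the running system right away, the hostname command can be used as well, for example on the NameNode:

hostname nn0001    # set the running hostname; /etc/sysconfig/network keeps it across reboots
hostname           # verify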

 

 

Install jdk-6u26-linux-x64-rpm.bin
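The Sun JDK .bin installer is a self-extracting script; a minimal sketch of running it as root (it installs the JDK RPM under /usr/java/jdk1.6.0_26, matching the JAVA_HOME used below):

chmod +x jdk-6u26-linux-x64-rpm.bin
./jdk-6u26-linux-x64-rpm.bin    # unpacks and installs the JDK RPM to /usr/java/jdk1.6.0_26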

Configure environment variables

vim /etc/profile

JAVA_HOME=/usr/java/jdk1.6.0_26
PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$CATALINA_HOME/common/lib
export JAVA_HOME
export PATH
export CLASSPATH
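Note that the PATH and CLASSPATH lines above reference $HADOOP_HOME and $CATALINA_HOME, which /etc/profile itself does not set; $CATALINA_HOME can be dropped if Tomcat is not installed, and $HADOOP_HOME should point at the directory where the tarball is extracted in the next step (the path below is an assumption). To pick up the changes and verify:

export HADOOP_HOME=/usr/local/hadoop-0.20.205.0   # assumed extraction path; adjust as needed
source /etc/profile                               # reload the variables in the current shell
java -version                                     # should report java version "1.6.0_26"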

 

Extract hadoop-0.20.205.0.tar.gz
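A minimal extraction sketch; the target directory /usr/local is an assumption, any location works as long as $HADOOP_HOME and the commands below point at it:

tar -zxf hadoop-0.20.205.0.tar.gz -C /usr/local    # assumed target directory
cd /usr/local/hadoop-0.20.205.0/conf               # hadoop-env.sh and the *-site.xml files edited below live here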

 

Configure hadoop-env.sh

export JAVA_HOME=/usr/java/jdk1.6.0_26

 

Configure hdfs-site.xml

 

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <property>
                <name>dfs.http.address</name>
                <value>nn0001:50070</value>
        </property>
        <property>
                <name>dfs.name.dir</name>
                <value>/hadoop/dfs/namenode</value>
        </property>
        <property>
                <name>dfs.data.dir</name>
                <value>/hadoop/dfs/datanode</value>
        </property>
        <property>
                <name>dfs.replication</name>
                <value>1</value>
        </property>
        <property>
                <name>dfs.datanode.max.xcievers</name>
                <value>4096</value>
        </property>
</configuration>
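The dfs.name.dir and dfs.data.dir directories must be writable on the respective machines; creating them ahead of time (as root here, since the cluster is run as root) avoids startup surprises. A sketch:

# On the NameNode
mkdir -p /hadoop/dfs/namenode
# On each DataNode
mkdir -p /hadoop/dfs/datanode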

 

Note: A Hadoop HDFS datanode has an upper bound on the number of files that it will serve at any one time. The upper-bound parameter is called xcievers (yes, this is misspelled). Before doing any loading, make sure you have configured Hadoop's conf/hdfs-site.xml, setting the xcievers value to at least 4096, as in the dfs.datanode.max.xcievers property above.

 

Not having this configuration in place makes for strange-looking failures. Eventually you'll see a complaint in the datanode logs about the xcievers limit being exceeded, but in the run-up to that, one manifestation is complaints about missing blocks. For example: 10/12/08 20:10:31 INFO hdfs.DFSClient: Could not obtain block blk_XXXXXXXXXXXXXXXXXXXXXX_YYYYYYYY from any node: java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry...

 

Configure core-site.xml

 

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <property>
                <name>fs.default.name</name>
                <value>hdfs://nn0001:9000</value>
        </property>
        <property>
                <name>httpserver.enable</name>
                <value>true</value>
        </property>
        <property>
                <name>fs.checkpoint.dir</name>
                <value>/hadoop/dfs/namenodesecondary</value>
        </property>
</configuration>
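Likewise, fs.checkpoint.dir is where the SecondaryNameNode writes its checkpoints, so that directory can be created ahead of time:

# On the SecondaryNameNode
mkdir -p /hadoop/dfs/namenodesecondary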

 

Configure mapred-site.xml

 

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <property>
                <name>mapred.job.tracker</name>
                <value>nn0001:9001</value>
        </property>
</configuration>
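start-dfs.sh (used below) starts the SecondaryNameNode on the hosts listed in conf/masters and the DataNodes on the hosts listed in conf/slaves; the original steps do not show these files, but contents consistent with the hostnames above would presumably look like this:

# conf/masters -- hosts on which start-dfs.sh starts a SecondaryNameNode
snn0001

# conf/slaves  -- hosts on which start-dfs.sh starts DataNodes
dn0001
dn0002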

 

Install and configure SSH (passwordless login)

 

On the NameNode:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

 

On the SecondaryNameNode:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

 

On each DataNode:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

 

Copy the authorized_keys file from the NameNode to the other nodes:

scp /root/.ssh/authorized_keys 10.0.30.23(6-8):/root/.ssh/
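The (6-8) notation is shorthand for the three other machines; expanded, for example:

for node in 10.0.30.236 10.0.30.237 10.0.30.238; do
    scp /root/.ssh/authorized_keys $node:/root/.ssh/
done

These copies still prompt for the root password; once authorized_keys is in place, ssh from the NameNode to each node should no longer ask for one.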

 

Format a new distributed filesystem on the NameNode:

./hadoop namenode -format

 

Start HDFS:

./start-dfs.sh

 

The following warning appeared:

dn0001: Warning: $HADOOP_HOME is deprecated.
dn0001:
dn0001: Unrecognized option: -jvm
dn0001: Could not create the Java virtual machine.

 

This problem prevented both DataNode nodes from starting.

 

It happens because Hadoop was run directly as root; bin/hadoop contains the following shell snippet:

 

if [[ $EUID -eq 0 ]]; then
  HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"
else
  HADOOP_OPTS="$HADOOP_OPTS -server $HADOOP_DATANODE_OPTS"
fi

 

Running echo $EUID as root prints 0, so the -jvm branch is taken.

 

There are two ways to fix this:

1. Run Hadoop as a non-root user.

2. Edit the script and comment out the if/else test (lines 1-3 above, plus the closing fi), leaving only the -server assignment:

#if [[ $EUID -eq 0 ]]; then
#  HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"
#else
  HADOOP_OPTS="$HADOOP_OPTS -server $HADOOP_DATANODE_OPTS"
#fi
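After patching bin/hadoop on the DataNodes (or switching to a non-root user), HDFS can be restarted and the daemons checked with jps, for example:

./stop-dfs.sh      # stop whatever did start
./start-dfs.sh     # the DataNodes should now come up without the -jvm error
# then on each node:
jps                # expect NameNode on nn0001, SecondaryNameNode on snn0001, DataNode on dn0001/dn0002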

 

 

 

 

 
