Topic: [2] Hadoop Configuration
Posted: 2009-09-26
Last modified: 2010-11-10
Add the hadoopuser user

On this box adduser was not on the PATH, so link it first:

[root@noc rou]# adduser
bash: adduser: command not found
[root@noc rou]# cd /usr/bin/
[root@noc bin]# ln -s /usr/sbin/adduser adduser
[root@noc bin]# adduser hadoopuser
passwd wpsop

Raise the open-file limit

Programs that analyze data often need many files open at once. The system default is usually 1024 (check it with ulimit -n), which is enough for everyday use but too low for such programs. Two files have to be changed, and the new limit only takes effect after you log in again (i.e. close PuTTY and reconnect).

1) /etc/security/limits.conf

vi /etc/security/limits.conf and add:

* soft nofile 8192
* hard nofile 20480

2) /etc/pam.d/login

Add:

session required /lib/security/pam_limits.so

Create the MySQL user kwps with password kwps

grant all privileges on *.* to 'kwps'@'%' identified by 'kwps';
flush privileges;

Shortcut for logging in to the nodes

sudo -s to switch to root, then vi /usr/bin/wpsop and create:

#!/bin/bash
ssh s$1-opdev-wps.rdev.kingsoft.net -l hadoopuser

(-l specifies the login user, so for example "wpsop 5" opens an ssh session to s5 as hadoopuser; remember to make the script executable with chmod +x /usr/bin/wpsop.)

Update hosts and hostname

1) sudo vi /etc/hosts
2) sudo vi /etc/sysconfig/network
3) hostname -v newhostname

Passwordless SSH with public-key authentication

1) mkdir .ssh
2) cd .ssh
   sudo chmod 700 .    // this step is important
3) ssh-keygen -t rsa
4) cat id_rsa.pub >> authorized_keys
   (or simply: cp id_rsa.pub authorized_keys)
   Use scp to push the key to the other servers; be careful not to overwrite files that are already there!
5) chmod 644 authorized_keys    // this step is important

Note: every node must be able to ssh to every other node (including itself) without a password.
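Step 4 above leaves distributing the key to the other servers as a manual scp. Purely as a sketch (not part of the original notes), the loop below appends the local public key on every node and then checks that login no longer asks for a password; it assumes the sN-opdev-wps.rdev.kingsoft.net host pattern from the wpsop script and the node numbers 2 and 5-9 from the cluster description further down.

for n in 2 5 6 7 8 9; do
  host="s${n}-opdev-wps.rdev.kingsoft.net"    # assumed host pattern, adjust to your own node list
  # append instead of overwriting, so keys already on the node survive
  cat ~/.ssh/id_rsa.pub | ssh hadoopuser@"$host" \
    "mkdir -p ~/.ssh && chmod 700 ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 644 ~/.ssh/authorized_keys"
  # should print the remote hostname without prompting for a password
  ssh hadoopuser@"$host" hostname
done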
Unpack Hadoop-0.19.1

tar -xvf hadoop-0.19.1.tar.gz

Hadoop configuration

Download:
http://apache.etoak.com/hadoop/core/
http://hadoop.apache.org/common/releases.html

Environment used here:
Version: Hadoop-0.19.1
OS: CentOS
Servers:
S2 (namenode)
S5 (secondarynamenode, datanode)
S6 (datanode)
S7 (datanode)
S8 (datanode)
S9 (datanode)

All of the following files live in /home/wps/hadoop-0.19.1/conf.

masters:
s5

slaves:
s5
s6
s7
s8
s9

log4j.properties:
hadoop.log.dir=/data/hadoop-0.19.1/logs

hadoop-env.sh:
export JAVA_HOME=/opt/JDK-1.6.0.14
export HADOOP_HEAPSIZE=4000

hadoop-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://s2-opdev-wps.rdev.kingsoft.net:9000/</value>
    <description>The name of the default file system. Either the literal string
    "local" or a host:port for DFS.</description>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>s2-opdev-wps.rdev.kingsoft.net:9001</value>
    <description>The host and port that the MapReduce job tracker runs at. If
    "local", then jobs are run in-process as a single map and reduce task.</description>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/data/hadoop-0.19.1/name</value>
    <description>Determines where on the local filesystem the DFS name node
    should store the name table. If this is a comma-delimited list of directories
    then the name table is replicated in all of the directories, for redundancy.</description>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/data/hadoop-0.19.1/dfsdata</value>
    <description>Determines where on the local filesystem an DFS data node should
    store its blocks. If this is a comma-delimited list of directories, then data
    will be stored in all named directories, typically on different devices.
    Directories that do not exist are ignored.</description>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop-0.19.1/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
    <description>Default block replication. The actual number of replications can
    be specified when the file is created. The default is used if replication is
    not specified in create time.</description>
  </property>
  <property>
    <name>fs.checkpoint.dir</name>
    <value>/data/hadoop-0.19.1/namesecondary</value>
    <description>Determines where on the local filesystem the DFS secondary name
    node should store the temporary images to merge. If this is a comma-delimited
    list of directories then the image is replicated in all of the directories for
    redundancy.</description>
  </property>
  <property>
    <name>dfs.http.address</name>
    <value>s2-opdev-wps.rdev.kingsoft.net:50070</value>
    <description>The address and the base port where the dfs namenode web ui will
    listen on. If the port is 0 then the server will start on a free port.</description>
  </property>
  <property>
    <name>mapred.map.tasks</name>
    <value>50</value>
    <description>The default number of map tasks per job. Typically set to a prime
    several times greater than number of available hosts. Ignored when
    mapred.job.tracker is "local".</description>
  </property>
  <property>
    <name>mapred.reduce.tasks</name>
    <value>7</value>
    <description>The default number of reduce tasks per job. Typically set to a
    prime close to the number of available hosts. Ignored when mapred.job.tracker
    is "local".</description>
  </property>
</configuration>

Start Hadoop

bin/hadoop namenode -format

Warning: do not format a running Hadoop namenode; this will erase all data in the HDFS filesystem.

bin/start-all.sh
bin/stop-all.sh

List files and directories:
bin/hadoop fs -ls /
bin/hadoop fs -ls /data/user/hiveware

The data blocks on a datanode are stored under:
/home/wpsop/hadoop-0.19.1/running/dfsdata/current
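The following checks are not in the original notes, but a few standard Hadoop 0.19 commands make it easy to confirm the cluster really came up after bin/start-all.sh (the hostnames and expected daemon layout follow the cluster description above):

jps                              # on s2 expect NameNode and JobTracker; on s5 also SecondaryNameNode; DataNode/TaskTracker on every slave
bin/hadoop dfsadmin -report      # total capacity plus the number of live datanodes (5 in this setup)
bin/hadoop fs -put /etc/hosts /tmp/hosts.test    # round-trip a small file through HDFS
bin/hadoop fs -cat /tmp/hosts.test
bin/hadoop fs -rm /tmp/hosts.test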