After startup, the web UIs are available at:
* NameNode - http://localhost:50070/
* JobTracker - http://localhost:50030/
-------- Full procedure -------------
$ ./bin/hadoop namenode -format
$ ./bin/start-all.sh
$ jps -l
$ ./bin/hadoop dfsadmin -report
$ echo "hello hadoopworld." > /tmp/test_file1.txt
$ echo "hello world hadoop,I'm test." > /tmp/test_file2.txt
$ ./bin/hadoop dfs -mkdir test-in
$ ./bin/hadoop dfs -copyFromLocal /tmp/test*.txt test-in
$ ./bin/hadoop dfs -ls test-in
$ ./bin/hadoop jar hadoop-0.20.2-examples.jar wordcount test-in test-out
$ ./bin/hadoop dfs -ls test-out
$ ./bin/hadoop dfs -cat test-out/part-r-00000
--------- Full procedure --------------
$ ./bin/hadoop namenode -format
10/12/29 14:25:57 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = haoning/10.4.125.111
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
10/12/29 14:25:57 INFO namenode.FSNamesystem: fsOwner=Administrator,None,root,Administrators,Users,Debugger,Users,ora_dba
10/12/29 14:25:57 INFO namenode.FSNamesystem: supergroup=supergroup
10/12/29 14:25:57 INFO namenode.FSNamesystem: isPermissionEnabled=true
10/12/29 14:25:57 INFO common.Storage: Image file of size 103 saved in 0 seconds.
10/12/29 14:25:57 INFO common.Storage: Storage directory \home\Administrator\tmp\dfs\name has been successfully formatted.
10/12/29 14:25:57 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at haoning/10.4.125.111
************************************************************/
$ ./bin/start-all.sh
starting namenode, logging to /usr/local/hadoop/bin/../logs/hadoop-Administrator-namenode-haoning.out
localhost: datanode running as process 352. Stop it first.
localhost: starting secondarynamenode, logging to /usr/local/hadoop/bin/../logs/hadoop-Administrator-secondarynamenode-haoning.out
starting jobtracker, logging to /usr/local/hadoop/bin/../logs/hadoop-Administrator-jobtracker-haoning.out
localhost: starting tasktracker, logging to /usr/local/hadoop/bin/../logs/hadoop-Administrator-tasktracker-haoning.out
The PDF I found online, http://blogimg.chinaunix.net/blog/upfile2/100317223114.pdf, is written simply and is easy to follow.
Hadoop doesn't seem to get along with OpenJDK.
Read it together with http://hadoop.apache.org/common/docs/r0.18.2/cn/quickstart.html.
JDK
Using hadoop-0.20.2.
sudo apt-get install sun-java6-jdk
On Ubuntu this installs under /usr/lib/jvm.
For RHEL 5, download jdk-6u23-linux-x64-rpm.bin; after installation it lives under /usr/java/jdk1.6.0_23.
If ssh on Ubuntu is slow, edit /etc/ssh/ssh_config and adjust these lines:
#GSSAPIAuthentication no
#GSSAPIDelegateCredentials no
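If you want to confirm that GSSAPI is what slows the login down, a rough check (my own sketch, assuming sshd is running locally) is to compare the two settings directly:
# compare login latency with and without GSSAPI authentication
time ssh -o GSSAPIAuthentication=yes localhost true
time ssh -o GSSAPIAuthentication=no localhost true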
User setup
On redhat5:
groupadd hadoop
useradd hadoop -g hadoop
vim /etc/sudoers    # edit so that it contains:
root   ALL=(ALL) ALL
hadoop ALL=(ALL) ALL
The write has to be forced with ! (e.g. :w! in vim).
Then set up passwordless ssh for the hadoop user:
[hadoop@122226 .ssh]$ ssh-keygen -t rsa -P ""
cat id_rsa.pub >authorized_keys
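For the cluster setup later in this post, the same public key also has to land on each slave. A minimal sketch (my own; the slave IP is the one used further down, and it assumes a hadoop user already exists there):
# on the master, as the hadoop user
chmod 600 ~/.ssh/authorized_keys
scp ~/.ssh/id_rsa.pub hadoop@192.168.200.16:/tmp/master.pub
ssh hadoop@192.168.200.16 'mkdir -p ~/.ssh && cat /tmp/master.pub >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys'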
On Ubuntu this just works.
Configuration files
Version 0.18 seems to use a single hadoop-site.xml; 0.20 splits it into several *-site.xml files.
hadoop@ubuntu:/usr/local/hadoop/hadoop-0.20.2$ vim conf/hadoop-env.sh
Set: export JAVA_HOME=/usr/lib/jvm/java-6-sun
hadoop@ubuntu:/usr/local/hadoop/hadoop-0.20.2$ vim conf/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/tmp</value>
  </property>
</configuration>
hadoop@ubuntu:/usr/local/hadoop/hadoop-0.20.2$ vim conf/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
You can copy test-out back out of HDFS:
./bin/hadoop dfs -copyToLocal /user/hadoop/test-out test-out
From a quick look, -copyToLocal and -get seem to mean the same thing.
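So the copy above should be equivalent to the following (test-out2 is just a hypothetical local target directory so it doesn't collide with the one already copied):
# should behave the same as -copyToLocal
./bin/hadoop dfs -get /user/hadoop/test-out test-out2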
Run ./bin/hadoop namenode -format:
hadoop@ubuntu:~/tmp$ tree
.
└── dfs
    └── name
        ├── current
        │   ├── edits
        │   ├── fsimage
        │   ├── fstime
        │   └── VERSION
        └── image
            └── fsimage

4 directories, 5 files
After ./bin/start-all.sh,
run ./bin/hadoop dfs -mkdir test-in:
.
|-- dfs
|   |-- data
|   |   |-- current
|   |   |   |-- blk_-1605603437240955017
|   |   |   |-- blk_-1605603437240955017_1019.meta
|   |   |   |-- dncp_block_verification.log.curr
|   |   |   `-- VERSION
|   |   |-- detach
|   |   |-- in_use.lock
|   |   |-- storage
|   |   `-- tmp
|   |-- name
|   |   |-- current
|   |   |   |-- edits
|   |   |   |-- fsimage
|   |   |   |-- fstime
|   |   |   `-- VERSION
|   |   |-- image
|   |   |   `-- fsimage
|   |   `-- in_use.lock
|   `-- namesecondary
|       |-- current
|       |   |-- edits
|       |   |-- fsimage
|       |   |-- fstime
|       |   `-- VERSION
|       |-- image
|       |   `-- fsimage
|       `-- in_use.lock
`-- mapred
    `-- local

13 directories, 18 files
After startup:
hadoop@ubuntu:/usr/local/hadoop/hadoop-0.20.2$ ./bin/hadoop dfsadmin -report
Configured Capacity: 25538187264 (23.78 GB)
Present Capacity: 8219365391 (7.65 GB)
DFS Remaining: 8219340800 (7.65 GB)
DFS Used: 24591 (24.01 KB)
DFS Used%: 0%
Under replicated blocks: 1
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)

Name: 127.0.0.1:50010
Decommission Status : Normal
Configured Capacity: 25538187264 (23.78 GB)
DFS Used: 24591 (24.01 KB)
Non DFS Used: 17318821873 (16.13 GB)
DFS Remaining: 8219340800(7.65 GB)
DFS Used%: 0%
DFS Remaining%: 32.18%
Last contact: Tue Dec 21 11:18:48 CST 2010
Or run jps.
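If you want a quick pass/fail check instead of eyeballing the jps output, here is a rough sketch of my own (the daemon names are the class names jps prints for 0.20; adjust the list for a real cluster, where not every daemon runs on every host):
#!/bin/sh
# rough check that the five single-node 0.20 daemons are up
for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
    if jps | grep -q "$d"; then
        echo "$d is running"
    else
        echo "$d is NOT running"
    fi
done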
Then run:
hadoop@zhengxq-desktop:/usr/local/hadoop/hadoop-0.20.1$ echo "hello hadoop world." > /tmp/test_file1.txt
hadoop@zhengxq-desktop:/usr/local/hadoop/hadoop-0.20.1$ echo "hello world hadoop,I'm haha." > /tmp/test_file2.txt
hadoop@zhengxq-desktop:/usr/local/hadoop/hadoop-0.20.1$ bin/hadoop dfs -copyFromLocal /tmp/test*.txt test-in
After that:
hadoop@test-linux:~/tmp$ tree
.
|-- dfs
|   |-- data
|   |   |-- current
|   |   |   |-- blk_-1605603437240955017
|   |   |   |-- blk_-1605603437240955017_1019.meta
|   |   |   |-- blk_-2047199693110071270
|   |   |   |-- blk_-2047199693110071270_1020.meta
|   |   |   |-- blk_-7264292243816045059
|   |   |   |-- blk_-7264292243816045059_1021.meta
|   |   |   |-- dncp_block_verification.log.curr
|   |   |   `-- VERSION
|   |   |-- detach
|   |   |-- in_use.lock
|   |   |-- storage
|   |   `-- tmp
|   |-- name
|   |   |-- current
|   |   |   |-- edits
|   |   |   |-- fsimage
|   |   |   |-- fstime
|   |   |   `-- VERSION
|   |   |-- image
|   |   |   `-- fsimage
|   |   `-- in_use.lock
|   `-- namesecondary
|       |-- current
|       |   |-- edits
|       |   |-- fsimage
|       |   |-- fstime
|       |   `-- VERSION
|       |-- image
|       |   `-- fsimage
|       `-- in_use.lock
`-- mapred
    `-- local

13 directories, 22 files
hadoop@test-linux:/usr/local/hadoop/hadoop-0.20.2$ ./bin/hadoop dfs -ls test-in
Found 2 items
-rw-r--r--   3 hadoop supergroup         21 2010-12-21 23:28 /user/hadoop/test-in/test_file1.txt
-rw-r--r--   3 hadoop supergroup         22 2010-12-21 23:28 /user/hadoop/test-in/test_file2.txt
hadoop@test-linux:/usr/local/hadoop/hadoop-0.20.2$ ./bin/hadoop jar hadoop-0.20.2-examples.jar wordcount test-in test-out
10/12/21 23:36:12 INFO input.FileInputFormat: Total input paths to process : 2
10/12/21 23:36:13 INFO mapred.JobClient: Running job: job_201012212251_0001
10/12/21 23:36:14 INFO mapred.JobClient:  map 0% reduce 0%
10/12/21 23:36:55 INFO mapred.JobClient:  map 100% reduce 0%
10/12/21 23:37:14 INFO mapred.JobClient:  map 100% reduce 100%
10/12/21 23:37:16 INFO mapred.JobClient: Job complete: job_201012212251_0001
10/12/21 23:37:16 INFO mapred.JobClient: Counters: 17
10/12/21 23:37:16 INFO mapred.JobClient:   Job Counters
10/12/21 23:37:16 INFO mapred.JobClient:     Launched reduce tasks=1
10/12/21 23:37:16 INFO mapred.JobClient:     Launched map tasks=2
10/12/21 23:37:16 INFO mapred.JobClient:     Data-local map tasks=2
10/12/21 23:37:16 INFO mapred.JobClient:   FileSystemCounters
10/12/21 23:37:16 INFO mapred.JobClient:     FILE_BYTES_READ=85
10/12/21 23:37:16 INFO mapred.JobClient:     HDFS_BYTES_READ=43
10/12/21 23:37:16 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=240
10/12/21 23:37:16 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=38
10/12/21 23:37:16 INFO mapred.JobClient:   Map-Reduce Framework
10/12/21 23:37:16 INFO mapred.JobClient:     Reduce input groups=4
10/12/21 23:37:16 INFO mapred.JobClient:     Combine output records=6
10/12/21 23:37:16 INFO mapred.JobClient:     Map input records=2
10/12/21 23:37:16 INFO mapred.JobClient:     Reduce shuffle bytes=91
10/12/21 23:37:16 INFO mapred.JobClient:     Reduce output records=4
10/12/21 23:37:16 INFO mapred.JobClient:     Spilled Records=12
10/12/21 23:37:16 INFO mapred.JobClient:     Map output bytes=67
10/12/21 23:37:16 INFO mapred.JobClient:     Combine input records=6
10/12/21 23:37:16 INFO mapred.JobClient:     Map output records=6
10/12/21 23:37:16 INFO mapred.JobClient:     Reduce input records=6
And after the job:
hadoop@test-linux:~/tmp$ tree
.
|-- dfs
|   |-- data
|   |   |-- current
|   |   |   |-- blk_-1605603437240955017
|   |   |   |-- blk_-1605603437240955017_1019.meta
|   |   |   |-- blk_-1792462247745372986
|   |   |   |-- blk_-1792462247745372986_1027.meta
|   |   |   |-- blk_-2047199693110071270
|   |   |   |-- blk_-2047199693110071270_1020.meta
|   |   |   |-- blk_-27635221429411767
|   |   |   |-- blk_-27635221429411767_1027.meta
|   |   |   |-- blk_-7264292243816045059
|   |   |   |-- blk_-7264292243816045059_1021.meta
|   |   |   |-- blk_-8634524858846751168
|   |   |   |-- blk_-8634524858846751168_1026.meta
|   |   |   |-- dncp_block_verification.log.curr
|   |   |   `-- VERSION
|   |   |-- detach
|   |   |-- in_use.lock
|   |   |-- storage
|   |   `-- tmp
|   |-- name
|   |   |-- current
|   |   |   |-- edits
|   |   |   |-- fsimage
|   |   |   |-- fstime
|   |   |   `-- VERSION
|   |   |-- image
|   |   |   `-- fsimage
|   |   `-- in_use.lock
|   `-- namesecondary
|       |-- current
|       |   |-- edits
|       |   |-- fsimage
|       |   |-- fstime
|       |   `-- VERSION
|       |-- image
|       |   `-- fsimage
|       `-- in_use.lock
`-- mapred
    `-- local
        |-- jobTracker
        `-- taskTracker
            `-- jobcache

16 directories, 28 files
hadoop@test-linux:/usr/local/hadoop/hadoop-0.20.2$ ./bin/hadoop dfs -lsr
drwxr-xr-x   - hadoop supergroup          0 2010-12-21 23:28 /user/hadoop/test-in
-rw-r--r--   3 hadoop supergroup         21 2010-12-21 23:28 /user/hadoop/test-in/haoning1.txt
-rw-r--r--   3 hadoop supergroup         22 2010-12-21 23:28 /user/hadoop/test-in/haoning2.txt
drwxr-xr-x   - hadoop supergroup          0 2010-12-21 23:37 /user/hadoop/test-out
drwxr-xr-x   - hadoop supergroup          0 2010-12-21 23:36 /user/hadoop/test-out/_logs
drwxr-xr-x   - hadoop supergroup          0 2010-12-21 23:36 /user/hadoop/test-out/_logs/history
-rw-r--r--   3 hadoop supergroup      16751 2010-12-21 23:36 /user/hadoop/test-out/_logs/history/localhost_1292943083664_job_201012212251_0001_conf.xml
-rw-r--r--   3 hadoop supergroup       8774 2010-12-21 23:36 /user/hadoop/test-out/_logs/history/localhost_1292943083664_job_201012212251_0001_hadoop_word+count
-rw-r--r--   3 hadoop supergroup         38 2010-12-21 23:37 /user/hadoop/test-out/part-r-00000
hadoop@test-linux:/usr/local/hadoop/hadoop-0.20.2$ ./bin/hadoop dfs -cat test-out/part-r-00000
This prints the word-count results.
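If you want to sanity-check part-r-00000, a rough local equivalent of my own is below (simple whitespace splitting, which only roughly matches the WordCount example's tokenizer):
# rough local word count over the same inputs, for comparison with part-r-00000
cat /tmp/test_file1.txt /tmp/test_file2.txt | tr -s ' ' '\n' | sort | uniq -c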
Looking at the tmp trees above, each file visible in DFS corresponds to two on-disk files:
blk_-*
blk_-*_*.meta
(the logs don't count).
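To see which blocks back a particular DFS file, fsck can list them. A sketch of my own (flags as I recall them for 0.20; adjust the path to a file that actually exists):
# list the blocks and locations behind one DFS file
./bin/hadoop fsck /user/hadoop/test-in/test_file1.txt -files -blocks -locations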
To install on Windows, first install Cygwin (install everything; you need ssh). Following the Baidu Wenku tutorial "Installing Hadoop on Windows", set up a symlink to work around the space in the JDK path:
ln -s /cygdrive/c/Program\ Files/Java/jdk1.6.0_17 \
/usr/local/jdk1.6.0_17
Paths on Windows are a bit messier, but as long as Cygwin has ssh installed, the ssh-keygen trust is set up, and folder permissions are consistent, it should work. On 2010-12-29 I had single-node setups running on Ubuntu 10.10, RHEL 5, and Windows XP + Cygwin; next I'll try a cluster. On Windows, use tree /F to inspect the results under tmp.
One odd thing: under D:/cygwin/usr/local/hadoop, running the following fails:
echo "aa" >/tmp/test_file1.txt
$ ./bin/hadoop fs -copyFromLocal /tmp/test_file1.txt /user/Administrator/test-in
copyFromLocal: File /tmp/test_file1.txt does not exist.
Copying test_file1.txt to D:/tmp fixes it.
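My guess is that the Java side resolves /tmp against the Windows drive (D:\tmp) rather than Cygwin's /tmp; a sketch of the workaround under that assumption:
# put the file where the Java side will look for it, then retry the copy
mkdir -p /cygdrive/d/tmp
cp /tmp/test_file1.txt /cygdrive/d/tmp/
./bin/hadoop fs -copyFromLocal /tmp/test_file1.txt /user/Administrator/test-in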
Cluster
usermod -G group user (or edit /etc/group and /etc/passwd directly).
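A minimal sketch of making the hadoop account consistent across nodes (names from the single-node setup above; run as root on the master and on every slave so ownership and ssh work the same everywhere):
# same group and user on every node
groupadd hadoop
useradd -m -g hadoop hadoop
# or, for an already existing account (existing_user is hypothetical):
usermod -G hadoop existing_user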
Following http://malixxx.iteye.com/blog/459277, for hadoop-0.20.2:
core-site.xml is:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.200.12:8888</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/Administrator/tmp</value>
  </property>
</configuration>
masters contains the master's own IP: 192.168.200.12
slaves contains the slave node IP: 192.168.200.16
mapred-site.xml is:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>192.168.200.12:9999</value>
  </property>
</configuration>
If changing mapred-site.xml doesn't work and you get "FATAL org.apache.hadoop.mapred.JobTracker: java.net.BindException: Problem binding to /192.168.200.16:9999 : Cannot assign requested address", the JobTracker didn't come up. It should be a configuration error; I haven't resolved it yet.
The approach is basically: take the example that already runs on a single node and change the slaves file (a slave seems to be a datanode).
Note that hadoop-env.sh sets JAVA_HOME; if you copied the namenode's hadoop directory to the datanode, make sure the JAVA_HOME path is still correct there.
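A sketch of my own for pushing the master's tree to a slave and double-checking JAVA_HOME (the path and slave IP are the ones used elsewhere in this post; assumes the same layout on the slave):
# from the master, as the hadoop user
scp -r /usr/local/hadoop/hadoop-0.20.2 hadoop@192.168.200.16:/usr/local/hadoop/
ssh hadoop@192.168.200.16 'grep JAVA_HOME /usr/local/hadoop/hadoop-0.20.2/conf/hadoop-env.sh'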
---------★-----------
The firewall wasted a whole day of mine, damn it.
I found a script online, and after applying it the errors stopped:
accept-all.sh
#!/bin/sh
IPT='/sbin/iptables'
$IPT -t nat -F
$IPT -t nat -X
$IPT -t nat -P PREROUTING ACCEPT
$IPT -t nat -P POSTROUTING ACCEPT
$IPT -t nat -P OUTPUT ACCEPT
$IPT -t mangle -F
$IPT -t mangle -X
$IPT -t mangle -P PREROUTING ACCEPT
$IPT -t mangle -P INPUT ACCEPT
$IPT -t mangle -P FORWARD ACCEPT
$IPT -t mangle -P OUTPUT ACCEPT
$IPT -t mangle -P POSTROUTING ACCEPT
$IPT -F
$IPT -X
$IPT -P FORWARD ACCEPT
$IPT -P INPUT ACCEPT
$IPT -P OUTPUT ACCEPT
$IPT -t raw -F
$IPT -t raw -X
$IPT -t raw -P PREROUTING ACCEPT
$IPT -t raw -P OUTPUT ACCEPT
The culprit behind all sorts of connection and NIO errors was the iptables firewall.
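Flushing everything is fine for a lab box; a less drastic alternative (my own sketch, not from the original script) is to open just the ports this post uses: 8888 for fs.default.name, 9999 for the JobTracker, plus the datanode and web-UI ports. A real cluster uses more ports than these, so treat the list as a starting point.
# assumption: default policies stay in place; only Hadoop's ports are opened
for p in 8888 9999 50010 50030 50070; do
    /sbin/iptables -I INPUT -p tcp --dport $p -j ACCEPT
done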
After a successful run, on the .12 master the NameNode, SecondaryNameNode, and JobTracker are up:
$ jps -l
32416 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode
32483 org.apache.hadoop.mapred.JobTracker
1398 sun.tools.jps.Jps
32252 org.apache.hadoop.hdfs.server.namenode.NameNode
After copying a file from the local filesystem into a newly created directory in Hadoop:
.
`-- tmp
    `-- dfs
        |-- name
        |   |-- current
        |   |   |-- VERSION
        |   |   |-- edits
        |   |   |-- edits.new
        |   |   |-- fsimage
        |   |   `-- fstime
        |   |-- image
        |   |   `-- fsimage
        |   `-- in_use.lock
        `-- namesecondary
            |-- current
            |-- in_use.lock
            `-- lastcheckpoint.tmp
On the .16 slave, the DataNode and TaskTracker are up:
# jps -l
32316 sun.tools.jps.Jps
31068 org.apache.hadoop.mapred.TaskTracker
30949 org.apache.hadoop.hdfs.server.datanode.DataNode
#
.
`-- tmp
    |-- dfs
    |   `-- data
    |       |-- current
    |       |   |-- VERSION
    |       |   |-- blk_-4054376904853997355
    |       |   |-- blk_-4054376904853997355_1002.meta
    |       |   |-- blk_-8185269915321998969
    |       |   |-- blk_-8185269915321998969_1001.meta
    |       |   `-- dncp_block_verification.log.curr
    |       |-- detach
    |       |-- in_use.lock
    |       |-- storage
    |       `-- tmp
    `-- mapred
        `-- local
Five daemons in total.
The "FileSystem is not ready yet", "retry", and "could only be replicated to 0 nodes" errors are all gone.
I had a look at the IPC code; it is very similar to IBM's introductory NIO example MultiPortEcho.java.
http://bbs.hadoopor.com/thread-329-1-2.html
If you hit "cannot connect" errors, look for network problems first.
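A quick connectivity check from the slave back to the master covers most of these cases. A sketch of my own (IPs and ports are the ones used above; assumes nc is installed):
# run on the slave: reachability plus the two RPC ports on the master
ping -c 2 192.168.200.12
nc -z -w 3 192.168.200.12 8888 && echo "namenode port open"
nc -z -w 3 192.168.200.12 9999 && echo "jobtracker port open"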
On my Ubuntu machine at home, the single-node setup works with localhost but stops working when I switch to the IP.
Removing virbr0 and bridge-utils didn't help either.
What finally worked was configuring /etc/network/interfaces:
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
address 192.168.1.118
netmask 255.255.255.0
network 192.168.0.0
broadcast 192.168.1.255
gateway 192.168.1.1
$ /etc/init.d/networking restart
I cleared /tmp and /home/hadoop/temp (with multiple machines, clear tmp on all of them) and checked the logs under logs/. Sometimes after stop-all.sh, jps -l shows nothing but Java is still running: use ps -ef|grep java to look for Java processes and netstat -nltp|grep 9999 to see whether the JobTracker is still listening, and kill it if so.
After rebooting and setting 192.168.1.118 in core-site.xml, mapred-site.xml, slaves, and masters, it worked. DHCP seems to be the problem; I got errors all night, or dfsadmin -report showed all zeros, probably an IP-mapping issue (a 127.0.1.1 address also showed up). In any case, a fixed IP solved it.
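The 127.0.1.1 symptom is usually the Ubuntu /etc/hosts entry that maps the hostname to 127.0.1.1. A hedged fix of my own, assuming that line exists (the hostname is taken from the shell prompts earlier in this post):
# point the hostname at the fixed address instead of 127.0.1.1
sudo sed -i 's/^127\.0\.1\.1.*/192.168.1.118 zhengxq-desktop/' /etc/hosts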
I never quite knew what hdfs-site.xml is for; I found an example online at http://www.javawhat.com/showWebsiteContent.do?id=527440:
<configuration>
  <property>
    <name>dfs.hosts.exclude</name>
    <value>conf/excludes</value>
  </property>
  <property>
    <name>dfs.http.address</name>
    <value>namenodeip:50070</value>
  </property>
  <property>
    <name>dfs.balance.bandwidthPerSec</name>
    <value>12582912</value>
  </property>
  <property>
    <name>dfs.block.size</name>
    <value>134217728</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/hadoop1/data/,/hadoop2/data/</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.datanode.du.reserved</name>
    <value>1073741824</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.datanode.handler.count</name>
    <value>10</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/hadoop/name/</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.namenode.handler.count</name>
    <value>64</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>True</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>