[Hadoop] “Too many fetch-failures” or “reducer stucks” issue

黎明lm


Blog category: hadoop

I am posting the solution here to help other Hadoop users who run into the same problem. It has been asked many times on the Hadoop mailing list, but so far without an answer.
After installing a Hadoop cluster and running some jobs, you may see the reducers get stuck while the TaskTracker log on one of the worker nodes shows messages like these:

INFO org.apache.hadoop.mapred.TaskTracker: task_200801281756_0001_r_000000_0 0.2727273% reduce > copy (9 of 11 at 0.00 MB/s) >
INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000000_0 0.2727273% reduce > copy (9 of 11 at 0.00 MB/s) >
INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000000_0 0.2727273% reduce > copy (9 of 11 at 0.00 MB/s) >
INFO org.apache.hadoop.mapred.JobInProgress: Too many fetch-failures for output of task: task_001_r_000000_0 … killing it
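
To spot this pattern in a long log file without reading it line by line, something like the following Python sketch may help. It assumes the log lines look like the ones quoted above, with each message on a single line. The regular expressions and the function name `scan_log` are my own illustration, not part of Hadoop.

```python
import re

# Patterns modelled on the log lines quoted above (an assumption about the format).
COPY_RE = re.compile(
    r"TaskTracker: (?P<task>\S+) .*reduce > copy "
    r"\((?P<done>\d+) of\s+(?P<total>\d+) at (?P<rate>[\d.]+) MB/s\)"
)
FAIL_RE = re.compile(r"Too many fetch-failures for output of task: (?P<task>\S+)")


def scan_log(lines):
    """Return (stalled, failed).

    stalled maps a task id to the number of progress lines that report a
    0 MB/s copy rate while some map outputs are still missing.
    failed lists the task ids named in "Too many fetch-failures" messages.
    """
    stalled = {}
    failed = []
    for line in lines:
        m = COPY_RE.search(line)
        if m:
            incomplete = int(m.group("done")) < int(m.group("total"))
            if incomplete and float(m.group("rate")) == 0.0:
                task = m.group("task")
                stalled[task] = stalled.get(task, 0) + 1
            continue
        m = FAIL_RE.search(line)
        if m:
            failed.append(m.group("task"))
    return stalled, failed
```

Calling `scan_log(open(path))` on a TaskTracker or JobTracker log (the path is whatever your installation uses) gives a quick list of tasks that sit at 0.00 MB/s. A task that shows up there many times is a good candidate for the network checks below.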

The reducer failed to copy the map output it needs from the other nodes (this copy goes directly between TaskTrackers, not through HDFS). Double-check the Linux network setup and the Hadoop configuration:

1. Make sure all the required parameters are set in hadoop-site.xml, and that every worker node has an identical copy of this file.
2. Use hostnames instead of IP addresses in the URIs for the TaskTracker and HDFS. I have seen clusters configured with IP addresses in these URIs that could start all the services and launch jobs, but whose tasks never finished successfully.
3. Check /etc/hosts on every node and make sure each hostname is bound to the node's network IP, not to the loopback address (127.0.0.1). Also verify that every node can reach all the others by hostname.
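
The check in item 3 can be sketched in Python as below. This is only a rough helper of my own, not a Hadoop tool: `check_host` and `is_loopback` are hypothetical names, and the script only tests what the local resolver returns (including /etc/hosts), so it has to be run on each node in turn.

```python
import ipaddress
import socket


def is_loopback(address):
    """True if the address lies in a loopback range (127.0.0.0/8 or ::1)."""
    return ipaddress.ip_address(address).is_loopback


def check_host(hostname, resolve=socket.gethostbyname):
    """Resolve hostname and report whether it maps to a non-loopback address.

    Returns (address, ok). address is None when the name cannot be resolved.
    The resolve parameter exists only so the lookup can be swapped out in tests.
    """
    try:
        address = resolve(hostname)
    except OSError:  # socket.gaierror and socket.herror are subclasses of OSError
        return None, False
    return address, not is_loopback(address)
```

On each node, run `check_host(socket.gethostname())` and then `check_host(name)` for every other node name in the cluster. A result of `(None, False)` means the name does not resolve at all, and an address such as 127.0.0.1 or 127.0.1.1 with `False` means /etc/hosts binds the hostname to the loopback interface.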
That said, it makes no sense to me that Hadoop always tries to resolve an IP address through the hostname. I consider this a bug in Hadoop and hope it will be fixed in the next stable release.


2012-02-15 09:17