A machine in the cluster got stuck, causing a large number of Map-side tasks to FAIL. When we tracked it down to the specific machine, ssh either failed outright or the terminal hung after login. The session was dropped with:
[hadoop@xxx ~]$ Received disconnect from xxx: Timeout, your session not responding.
AttemptID:attempt_1413206225298_24177_m_000001_0 Timed out after 1200 secs
Container killed by the ApplicationMaster. Container killed on request. Exit code is 143
The 1200 seconds come from mapreduce.task.timeout (evidently set to 1,200,000 ms on this cluster; the default is 600,000 ms, i.e. 10 minutes), whose description in mapred-default.xml reads:

The number of milliseconds before a task will be terminated if it neither reads an input, writes an output, nor updates its status string. A value of 0 disables the timeout.
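If a job legitimately goes quiet for longer, the timeout can be raised (or disabled) per job. A minimal sketch, assuming the Hadoop 2.x Job/Configuration API; the job name is a placeholder:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class TimeoutConfigExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // 20 minutes; a value of 0 disables the timeout entirely (use with care).
    conf.setLong("mapreduce.task.timeout", 1200000L);
    Job job = Job.getInstance(conf, "timeout-demo");
    // ... set mapper/reducer and input/output paths, then job.waitForCompletion(true)
  }
}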
The check that raises this event lives in the MapReduce ApplicationMaster's TaskHeartbeatHandler: any attempt whose last recorded progress is older than the timeout is reported lost:

Map.Entry<TaskAttemptId, ReportTime> entry = iterator.next();
boolean taskTimedOut = (taskTimeOut > 0) &&
    (currentTime > (entry.getValue().getLastProgress() + taskTimeOut));

if (taskTimedOut) {
  // task is lost, remove from the list and raise lost event
  iterator.remove();
  eventHandler.handle(new TaskAttemptDiagnosticsUpdateEvent(entry.getKey(),
      "AttemptID:" + entry.getKey().toString()
      + " Timed out after " + taskTimeOut / 1000 + " secs"));
  eventHandler.handle(new TaskAttemptEvent(entry.getKey(),
      TaskAttemptEventType.TA_TIMED_OUT));
}
Progress reports from the task reset that last-progress timestamp via progressing():

public void progressing(TaskAttemptId attemptID) {
  // only put for the registered attempts
  // TODO throw an exception if the task isn't registered.
  ReportTime time = runningAttempts.get(attemptID);
  if (time != null) {
    time.setLastProgress(clock.getTime());
  }
}
Report progress
If your task reports no progress for 10 minutes (see the mapred.task.timeout property) then it will be killed by Hadoop. Most tasks don't encounter this situation since they report progress implicitly by reading input and writing output. However, some jobs which don't process records in this way may fall foul of this behavior and have their tasks killed. Simulations are a good example, since they do a lot of CPU-intensive processing in each map and typically only write the result at the end of the computation. They should be written in such a way as to report progress on a regular basis (more frequently than every 10 minutes). This may be achieved in a number of ways:
- Call setStatus() on Reporter to set a human-readable description of the task's progress
- Call incrCounter() on Reporter to increment a user counter
- Call progress() on Reporter to tell Hadoop that your task is still there (and making progress)
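The quote references the old mapred Reporter; in the newer org.apache.hadoop.mapreduce API the same three calls are available on the task Context. A minimal sketch (the class and counter names here are illustrative, not from the original post):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SimulationMapper
    extends Mapper<LongWritable, Text, Text, NullWritable> {

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    for (int step = 0; step < 1000000; step++) {
      // ... CPU-intensive work for this step, no reads or writes ...
      if (step % 10000 == 0) {
        context.progress();                                   // reset the timeout clock
        context.setStatus("simulation step " + step);         // human-readable status
        context.getCounter("sim", "steps").increment(10000);  // user counter
      }
    }
  }
}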
But that was not the end of it: from time to time a task in the cluster would hang at some point and block the whole job. A thread dump of one stuck task showed:
"main" prio=10 tid=0x000000000293f000 nid=0x1e06 runnable [0x0000000041b20000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method) at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228) at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87) - locked <0x00000006e243c3f0> (a sun.nio.ch.Util$2) - locked <0x00000006e243c3e0> (a java.util.Collections$UnmodifiableSet) - locked <0x00000006e243c1a0> (a sun.nio.ch.EPollSelectorImpl) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98) at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:335) at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157) at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
The doIO frame waits for the socket to become ready, throwing SocketTimeoutException when the select times out:

// now wait for socket to be ready.
int count = 0;
try {
  count = selector.select(channel, ops, timeout);
} catch (IOException e) {
  // unexpected IOException.
  closed = true;
  throw e;
}

if (count == 0) {
  throw new SocketTimeoutException(timeoutExceptionString(channel, timeout, ops));
}
Error: java.net.SocketTimeoutException: 70000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=xxx remote=/xxx]
        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
        at java.io.FilterInputStream.read(FilterInputStream.java:83)
        at java.io.FilterInputStream.read(FilterInputStream.java:83)
        at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1490)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.transfer(DFSOutputStream.java:962)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:930)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1031)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:823)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:475)
The underlying SelectorPool.select loop also compensates for select() occasionally returning 0 well before the timeout:

while (true) {
  long start = (timeout == 0) ? 0 : Time.now();

  key = channel.register(info.selector, ops);
  ret = info.selector.select(timeout);

  if (ret != 0) {
    return ret;
  }

  /* Sometimes select() returns 0 much before timeout for
   * unknown reasons. So select again if required.
   */
  if (timeout > 0) {
    timeout -= Time.now() - start;
    if (timeout <= 0) {
      return 0;
    }
  }

  if (Thread.currentThread().isInterrupted()) {
    throw new InterruptedIOException("Interruped while waiting for "
        + "IO on channel " + channel + ". " + timeout
        + " millis timeout left.");
  }
}
java.nio.channels.Selector
public abstract int select(long timeout) throws java.io.IOException

Selects a set of keys whose corresponding channels are ready for I/O operations. This method performs a blocking selection operation. It returns only after at least one channel is selected, this selector's wakeup method is invoked, the current thread is interrupted, or the given timeout period expires, whichever comes first.
In other words, select(timeout) returns when whichever of the following happens first:

- at least one registered channel is selected, in which case the return value is the number of selected channels;
- the selector's wakeup() method is invoked, or the current thread is interrupted;
- the given timeout expires (returning 0).
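A minimal standalone sketch of these semantics (host and port are placeholders):

import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;

public class SelectTimeoutDemo {
  public static void main(String[] args) throws Exception {
    Selector selector = Selector.open();
    SocketChannel ch = SocketChannel.open();
    ch.configureBlocking(false);
    ch.connect(new InetSocketAddress("example.com", 80));
    ch.register(selector, SelectionKey.OP_CONNECT);

    int n = selector.select(5000); // block for at most 5 seconds
    if (n == 0) {
      // timed out (or a spurious early return, as the Hadoop loop above guards against)
      System.out.println("nothing ready within the timeout");
    } else {
      System.out.println(n + " channel(s) ready");
    }
  }
}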
But that still was not the whole story. Surely a timeout gets retried? And how many times?
Digging further down the stack, DFSInputStream calls readBuffer. There, retryCurrentNode catches the first IOException and allows one retry against the same node, so that transient errors do not immediately fail the read. If the timeout strikes again, the node is added to the dead-node list as a failed DataNode (so presumably it will not be retried next time?) and the client moves to another DataNode via seekToNewSource. Only after these attempts are exhausted and no source can be found is the IOException actually thrown.
try {
  return reader.doRead(blockReader, off, len, readStatistics);
} catch (ChecksumException ce) {
  DFSClient.LOG.warn("Found Checksum error for "
      + getCurrentBlock() + " from " + currentNode
      + " at " + ce.getPos());
  ioe = ce;
  retryCurrentNode = false;
  // we want to remember which block replicas we have tried
  addIntoCorruptedBlockMap(getCurrentBlock(), currentNode, corruptedBlockMap);
} catch (IOException e) {
  if (!retryCurrentNode) {
    DFSClient.LOG.warn("Exception while reading from "
        + getCurrentBlock() + " of " + src + " from "
        + currentNode, e);
  }
  ioe = e;
}

boolean sourceFound = false;
if (retryCurrentNode) {
  /* possibly retry the same node so that transient errors don't
   * result in application level failures (e.g. Datanode could have
   * closed the connection because the client is idle for too long).
   */
  sourceFound = seekToBlockSource(pos);
} else {
  addToDeadNodes(currentNode);
  sourceFound = seekToNewSource(pos);
}
if (!sourceFound) {
  throw ioe;
}
retryCurrentNode = false;
}
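To make the policy easier to see, here is a simplified, self-contained model of the retry flow described above. This is not the actual DFSInputStream code; BlockSource and its methods are hypothetical stand-ins:

import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

public class ReadRetryModel {
  private final Set<String> deadNodes = new HashSet<>();

  /** Read once, retrying the current node once, then failing over to another replica. */
  int readWithRetry(BlockSource source) throws IOException {
    boolean retryCurrentNode = true;
    while (true) {
      try {
        return source.read();
      } catch (IOException ioe) {
        boolean sourceFound;
        if (retryCurrentNode) {
          // First failure: retry the same node once, so transient errors
          // (e.g. an idle connection closed by the DataNode) don't surface
          // as application-level failures.
          sourceFound = source.seekToSameNode();
        } else {
          // Subsequent failure: blacklist the node and try another replica.
          deadNodes.add(source.currentNode());
          sourceFound = source.seekToNewNode(deadNodes);
        }
        if (!sourceFound) {
          throw ioe; // no replica left to try
        }
        retryCurrentNode = false;
      }
    }
  }

  interface BlockSource {
    int read() throws IOException;
    String currentNode();
    boolean seekToSameNode();
    boolean seekToNewNode(Set<String> deadNodes);
  }
}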