As we know, by default Hadoop shows only summary information about MapReduce jobs (at http://xxx:19888/cluster), so it is inconvenient to find out how many mappers a job used, where the mappers ran, how to get at the exception logs, and so on. Note that this is Hadoop's default behavior (perhaps to reduce the resources occupied).
And yes, there is a daemon that processes these log files; with it you can see what you could see in hadoop-1.x. Here are the steps to enable this feature:
1. Add a property to yarn-site.xml
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
Note: if you specify this, the container logs under 'userlogs' will be removed after the job completes.
You can also specify some related properties:
<property>
  <description>How long to keep aggregation logs before deleting them. -1 disables. Be careful set this too small and you will spam the name node.</description>
  <name>yarn.log-aggregation.retain-seconds</name>
  <value>-1</value>
</property>
<property>
  <description>How long to wait between aggregated log retention checks. If set to 0 or a negative value then the value is computed as one-tenth of the aggregated log retention time. Be careful set this too small and you will spam the name node.</description>
  <name>yarn.log-aggregation.retain-check-interval-seconds</name>
  <value>-1</value>
</property>
<property>
  <description>Time in seconds to retain user logs. Only applicable if log aggregation is disabled</description>
  <name>yarn.nodemanager.log.retain-seconds</name>
  <value>10800</value>
</property>
<property>
  <description>Where to aggregate logs to.</description>
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>/tmp/logs</value>
</property>
<property>
  <description>The remote log dir will be created at {yarn.nodemanager.remote-app-log-dir}/${user}/{thisParam}</description>
  <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
  <value>logs</value>
</property>
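To make the descriptions above concrete, here is a minimal Python sketch that reads such a fragment and works out two derived values: the effective retention-check interval (one-tenth of the retention time when the interval is 0 or negative) and the per-user remote log directory. The helper names, the user name "alice", and the 7-day retention value are made up for illustration; this only mirrors the rules stated in the property descriptions and is not how YARN itself is implemented.

```python
import xml.etree.ElementTree as ET

# Sample fragment wrapped in a <configuration> root, as in a real yarn-site.xml.
# The 604800-second (7-day) retention is an illustrative value, not a recommendation.
YARN_SITE = """
<configuration>
  <property><name>yarn.log-aggregation-enable</name><value>true</value></property>
  <property><name>yarn.log-aggregation.retain-seconds</name><value>604800</value></property>
  <property><name>yarn.log-aggregation.retain-check-interval-seconds</name><value>-1</value></property>
  <property><name>yarn.nodemanager.remote-app-log-dir</name><value>/tmp/logs</value></property>
  <property><name>yarn.nodemanager.remote-app-log-dir-suffix</name><value>logs</value></property>
</configuration>
"""


def load_properties(xml_text):
    """Return a {name: value} dict for every <property> in the XML text."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


def effective_check_interval(props):
    """Apply the documented rule: interval <= 0 means one-tenth of the retention time."""
    retain = int(props["yarn.log-aggregation.retain-seconds"])
    interval = int(props["yarn.log-aggregation.retain-check-interval-seconds"])
    if retain < 0:
        return None  # -1 disables deletion, so there is nothing to check
    if interval <= 0:
        return retain // 10
    return interval


def remote_log_dir(props, user):
    """Build {remote-app-log-dir}/{user}/{suffix} as the description states."""
    base = props["yarn.nodemanager.remote-app-log-dir"].rstrip("/")
    suffix = props["yarn.nodemanager.remote-app-log-dir-suffix"].strip("/")
    return f"{base}/{user}/{suffix}"


props = load_properties(YARN_SITE)
print(effective_check_interval(props))
print(remote_log_dir(props, "alice"))
```

With the sample values, the retention check would run every 60480 seconds, and the logs for the hypothetical user "alice" would land under /tmp/logs/alice/logs.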
2. Specify the host where the JobHistoryServer runs in mapred-site.xml
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>host:10020</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>host:19888</value>
</property>
3. Start the MR history server
mr-jobhistory-daemon.sh start historyserver
Now you can see a daemon named JobHistoryServer on the host configured by mapreduce.jobhistory.address (it appears in the output of jps, for example).
After all the steps above are complete, you can open the history server at
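If you want to check from another machine that the daemon is actually accepting connections, a small Python sketch like the following can help. The function names are made up for illustration, and the check only proves that something is listening on the port; it does not prove that the listener is the JobHistoryServer.

```python
import socket


def split_address(address):
    """Split a 'host:port' string such as 'host:10020' into (host, port)."""
    host, _, port = address.rpartition(":")
    return host, int(port)


def is_listening(address, timeout=2.0):
    """Return True if a TCP connection to 'host:port' succeeds within the timeout."""
    try:
        with socket.create_connection(split_address(address), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable host names.
        return False
```

Calling is_listening("host:10020") with the value of mapreduce.jobhistory.address should return True once the history server is up, assuming the port is reachable from where you run it.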
http://host:19888/jobhistory
Some figures are shown here:
If you click the 'history' link (i.e. http://host:50030/proxy/application_1418972108758_0001/jobhistory/job/job_1418972108758_0001), you will be redirected to the job history server at 'http://host:19888/jobhistory/job/job_1418972108758_0001/jobhistory/job/job_1418972108758_0001'.
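The URLs above show that the job ID reuses the timestamp and sequence number of the application ID. A minimal Python sketch, assuming that naming pattern holds for your MapReduce jobs, can build the history server page for a job directly from its application ID. The function names are hypothetical, and "host:19888" stands for whatever you configured in mapreduce.jobhistory.webapp.address.

```python
def job_id_from_application_id(app_id):
    """Turn 'application_<timestamp>_<seq>' into 'job_<timestamp>_<seq>'."""
    prefix = "application_"
    if not app_id.startswith(prefix):
        raise ValueError(f"not an application ID: {app_id!r}")
    return "job_" + app_id[len(prefix):]


def history_url(webapp_address, app_id):
    """Build the job page URL on the history server web UI."""
    job_id = job_id_from_application_id(app_id)
    return f"http://{webapp_address}/jobhistory/job/{job_id}"


print(history_url("host:19888", "application_1418972108758_0001"))
```

For the application in the example this gives http://host:19888/jobhistory/job/job_1418972108758_0001, which is the job page without going through the proxy link.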