
Hadoop 2.x - enabling the Job HistoryServer


  As we know, Hadoop 2.x by default shows only summary information about a MapReduce job (at http://xxx:19888/cluster by default), so it is inconvenient to track the number of mappers, see where the mappers ran, find the exception logs, and so on. Note that this is Hadoop's default behavior (probably to reduce the resources occupied).

  There is, however, a daemon that processes these log files; with it you can see the same details that were available in Hadoop 1.x. Here are the steps to enable this feature:

 

  1. Add a property to yarn-site.xml:

yarn.log-aggregation-enable=true

    Note: if you enable this, the container logs under 'userlogs' on each NodeManager will be removed after the job completes.

    You can also set some related properties:

  

  <property>
    <description>How long to keep aggregation logs before deleting them.  -1 disables. 
    Be careful set this too small and you will spam the name node.</description>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>-1</value>
  </property>
  <property>
    <description>How long to wait between aggregated log retention checks.
    If set to 0 or a negative value then the value is computed as one-tenth
    of the aggregated log retention time. Be careful set this too small and
    you will spam the name node.</description>
    <name>yarn.log-aggregation.retain-check-interval-seconds</name>
    <value>-1</value>
  </property>

  <property>
    <description>Time in seconds to retain user logs. Only applicable if
    log aggregation is disabled
    </description>
    <name>yarn.nodemanager.log.retain-seconds</name>
    <value>10800</value>
  </property>

  <property>
    <description>Where to aggregate logs to.</description>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/tmp/logs</value>
  </property>
  <property>
    <description>The remote log dir will be created at 
      {yarn.nodemanager.remote-app-log-dir}/${user}/{thisParam}
    </description>
    <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
    <value>logs</value>
  </property>
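With aggregation enabled, the container logs of a finished application end up in HDFS under the configured directory and can be read back with the `yarn logs` command. A minimal sketch, assuming the default paths above and using the application ID from later in this post as a placeholder:

```shell
# The aggregated log layout is:
#   {yarn.nodemanager.remote-app-log-dir}/{user}/{remote-app-log-dir-suffix}/{applicationId}
# e.g. for user 'hadoop' with the defaults shown above:
hdfs dfs -ls /tmp/logs/hadoop/logs

# Dump all container logs of one application to stdout
yarn logs -applicationId application_1418972108758_0001
```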

 

  2. Specify the host where the JobHistoryServer will run, in mapred-site.xml:

 

    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>host:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>host:19888</value>
    </property>

 

 

  3. Start the MR history server:

mr-jobhistory-daemon.sh start historyserver

  Now a daemon named JobHistoryServer should be running on the host configured by mapreduce.jobhistory.address.
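A quick way to verify the daemon, assuming the JDK's jps is on the PATH:

```shell
# The history server should appear in the JVM process list
jps | grep JobHistoryServer

# To stop it later, use the same helper script
mr-jobhistory-daemon.sh stop historyserver
```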

 

  After all the steps above are complete, you can open the history server at:

 

http://host:19888/jobhistory
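Besides the web UI, the history server also exposes a REST API, which is handy for scripting; for example (with `host` as a placeholder for the configured webapp address):

```shell
# List completed MapReduce jobs as JSON
curl http://host:19888/ws/v1/history/mapreduce/jobs

# Details of a single job
curl http://host:19888/ws/v1/history/mapreduce/jobs/job_1418972108758_0001
```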

 

 (Screenshots of the resulting job history pages appeared here.)

 

 If you click the 'history' link (i.e. http://host:50030/proxy/application_1418972108758_0001/jobhistory/job/job_1418972108758_0001), you will be redirected to the job history server at http://host:19888/jobhistory/job/job_1418972108758_0001.


 

 

 

 Ref:

HistoryServer的原理详解 ("A detailed explanation of the HistoryServer internals")

 
