guoyunsky
Hadoop Oozie Study Notes (2): Using Oozie to Run an Example from the Command Line

 

 

This is an original blog post. If you repost it, please cite the source: http://guoyunsky.iteye.com/blog/1245092


 

 

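Oozie ships with a number of examples for testing. You can also load the source into Eclipse and submit jobs from there. Let's try one here. I ran into a few problems along the way, which I'll work through one at a time.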

Run the Oozie map-reduce example with this command:

     $OOZIE_HOME/bin/oozie job -oozie http://localhost:11000/oozie -config /home/guoyun/hadoop/oozie-3.0.2/examples/apps/map-reduce/job.properties -run

I expected this to go smoothly, but the console reported the following error:

     Error: E0902 : E0902: Exception occured: [java.net.ConnectException: Call to localhost/127.0.0.1:8020 failed on connection exception: java.net.ConnectException: Connection refused]

The log at $OOZIE_HOME/logs/oozie.log shows the following error:

  2011-11-09 10:42:30,721 WARN V1JobsServlet:539 - USER[?] GROUP[users] TOKEN[-] APP[-] JOB[-] ACTION[-] URL[POST http://localhost:11000/oozie/v1/jobs?action=start] error[E0902], E0902: Exception occured: [java.net.ConnectException: Call to localhost/127.0.0.1:8020 failed on connection exception: java.net.ConnectException: Connection refused]

org.apache.oozie.servlet.XServletException: E0902: Exception occured: [java.net.ConnectException: Call to localhost/127.0.0.1:8020 failed on connection exception: java.net.ConnectException: Connection refused]
	at org.apache.oozie.servlet.BaseJobServlet.checkAuthorizationForApp(BaseJobServlet.java:196)
	at org.apache.oozie.servlet.BaseJobsServlet.doPost(BaseJobsServlet.java:89)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:637)
	at org.apache.oozie.servlet.JsonRestServlet.service(JsonRestServlet.java:281)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
	at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
	at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
	at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
	at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.oozie.service.AuthorizationException: E0902: Exception occured: [java.net.ConnectException: Call to localhost/127.0.0.1:8020 failed on connection exception: java.net.ConnectException: Connection refused]
	at org.apache.oozie.service.AuthorizationService.authorizeForApp(AuthorizationService.java:320)
	at org.apache.oozie.servlet.BaseJobServlet.checkAuthorizationForApp(BaseJobServlet.java:185)
	... 16 more
Caused by: org.apache.oozie.service.HadoopAccessorException: E0902: Exception occured: [java.net.ConnectException: Call to localhost/127.0.0.1:8020 failed on connection exception: java.net.ConnectException: Connection refused]
	at org.apache.oozie.service.KerberosHadoopAccessorService.createFileSystem(KerberosHadoopAccessorService.java:208)
	at org.apache.oozie.service.AuthorizationService.authorizeForApp(AuthorizationService.java:285)
	... 17 more
Caused by: java.net.ConnectException: Call to localhost/127.0.0.1:8020 failed on connection exception: java.net.ConnectException: Connection refused
	at org.apache.hadoop.ipc.Client.wrapException(Client.java:1131)
	at org.apache.hadoop.ipc.Client.call(Client.java:1107)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
	at $Proxy22.getProtocolVersion(Unknown Source)
	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:398)
	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:384)
	at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:111)
	at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:213)
	at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:180)
	at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1514)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:1548)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1530)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:228)
	at org.apache.oozie.service.KerberosHadoopAccessorService$3.run(KerberosHadoopAccessorService.java:200)
	at org.apache.oozie.service.KerberosHadoopAccessorService$3.run(KerberosHadoopAccessorService.java:192)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
	at org.apache.oozie.service.KerberosHadoopAccessorService.createFileSystem(KerberosHadoopAccessorService.java:192)
	... 18 more
Caused by: java.net.ConnectException: Connection refused
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
	at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:408)
	at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:425)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:532)
	at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:210)
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1244)
	at org.apache.hadoop.ipc.Client.call(Client.java:1075)
	... 37 more
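
The innermost cause in the trace above is a plain java.net.ConnectException: Connection refused on localhost:8020, which usually means nothing is accepting connections on that port. A minimal sketch of a check for that, assuming Python 3 is available on the same machine (the helper name is_port_open is made up for this illustration and is not part of Oozie or Hadoop):

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 8020 is the port named in the error message above.
    state = "open" if is_port_open("localhost", 8020) else "closed"
    print(f"localhost:8020 is {state}")
```

If the port turns out to be closed, the problem is more likely the address Oozie was told to use than anything inside Oozie itself.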

I found a suggested fix in the Oozie FAQ (https://github.com/yahoo/oozie/wiki/FAQ): copy the oozie.services property from oozie-default.xml into oozie-site.xml, that is, the following property:

     <property>
       <name>oozie.services</name>
        <value>
            org.apache.oozie.service.SchedulerService,
            org.apache.oozie.service.InstrumentationService,
            org.apache.oozie.service.CallableQueueService,
            org.apache.oozie.service.UUIDService,
            org.apache.oozie.service.ELService,
            org.apache.oozie.service.AuthorizationService,
            org.apache.oozie.service.KerberosHadoopAccessorService,
            org.apache.oozie.service.MemoryLocksService,
            org.apache.oozie.service.DagXLogInfoService,
            org.apache.oozie.service.SchemaService,
            org.apache.oozie.service.LiteWorkflowAppService,
            org.apache.oozie.service.JPAService,
            org.apache.oozie.service.StoreService,
            org.apache.oozie.service.CoordinatorStoreService,
            org.apache.oozie.service.SLAStoreService,
            org.apache.oozie.service.DBLiteWorkflowStoreService,
            org.apache.oozie.service.CallbackService,
            org.apache.oozie.service.ActionService,
            org.apache.oozie.service.ActionCheckerService,
            org.apache.oozie.service.RecoveryService,
            org.apache.oozie.service.PurgeService,
            org.apache.oozie.service.CoordinatorEngineService,
            org.apache.oozie.service.BundleEngineService,
            org.apache.oozie.service.DagEngineService,
            org.apache.oozie.service.CoordMaterializeTriggerService,
            org.apache.oozie.service.StatusTransitService,
            org.apache.oozie.service.PauseTransitService
        </value>
        <description>
            All services to be created and managed by Oozie Services singleton.
            Class names must be separated by commas.
        </description>
    </property>

After copying it I restarted Oozie and ran the map-reduce example again, but the console reported the same error:

   Error: E0902 : E0902: Exception occured: [java.net.ConnectException: Call to localhost/127.0.0.1:8020 failed on connection exception: java.net.ConnectException: Connection refused]

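The log showed the same error as well. The FAQ (https://github.com/yahoo/oozie/wiki/FAQ) seemed to offer another fix, so following its instructions I added org.apache.oozie.service.HadoopAccessorService to the oozie.services property in oozie-site.xml and restarted again.
Still the same error, which was infuriating. Maybe this was not a Hadoop authorization problem after all? Everything was running locally, so authorization should not have been an issue in the first place. So I checked the port with lsof -i:8020, and sure enough nothing was listening on it. I should have thought of that much earlier. Looking at core-site.xml under $HADOOP_HOME/conf/, I found that the HDFS port is 9000, not the 8020 assumed by default here. mapred-site.xml likewise showed that the JobTracker port is 9001. So I went to the corresponding app directory, in my case $OOZIE_HOME/examples/apps/map-reduce, and changed job.properties as follows: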
nameNode=hdfs://localhost:9000
jobTracker=localhost:9001
queueName=default
examplesRoot=hadoop/oozie-3.0.2/examples

oozie.coord.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/aggregator
outputDir=map-reduce
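
The nameNode and jobTracker values above were read by hand from core-site.xml and mapred-site.xml. The same lookup can be sketched in Python. The property keys fs.default.name and mapred.job.tracker are an assumption here: they are the usual keys in Hadoop releases of that era, but the post does not name them, and the sample XML below is made up to mirror the ports it describes.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def read_hadoop_property(xml_text: str, name: str) -> Optional[str]:
    """Return the <value> of the <property> whose <name> matches, else None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


# Illustrative file contents only; real files will hold more properties.
SAMPLE_CORE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
"""

SAMPLE_MAPRED_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    print("nameNode=" + str(read_hadoop_property(SAMPLE_CORE_SITE, "fs.default.name")))
    print("jobTracker=" + str(read_hadoop_property(SAMPLE_MAPRED_SITE, "mapred.job.tracker")))
```

To use it against a real installation, read the files under $HADOOP_HOME/conf/ and pass their text in place of the sample strings.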

Running the example again produced the following error:

   Error: E0504 : E0504: App directory [hdfs://localhost:9000/user/guoyun/hadoop/oozie-3.0.2/examples/apps/aggregator] does not exist
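
E0504 says the application directory was looked up on HDFS and not found there. One way to address that would be to copy the examples directory into HDFS before submitting. The sketch below only builds the `hadoop fs -put` command and runs it on request. It assumes the hadoop CLI is on the PATH; the local path comes from the command earlier in this post and the HDFS target from the error message. Depending on the Hadoop version, parent directories on HDFS may need to be created first. The helper names are made up for this illustration.

```python
import subprocess
from typing import List


def hdfs_put_command(local_dir: str, hdfs_dir: str) -> List[str]:
    """Build the argv for copying a local directory into HDFS."""
    return ["hadoop", "fs", "-put", local_dir, hdfs_dir]


def upload(local_dir: str, hdfs_dir: str, dry_run: bool = True) -> List[str]:
    """Return the command; execute it only when dry_run is False."""
    cmd = hdfs_put_command(local_dir, hdfs_dir)
    if not dry_run:
        # Raises CalledProcessError if the hadoop CLI exits non-zero.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    cmd = upload(
        "/home/guoyun/hadoop/oozie-3.0.2/examples",
        "/user/guoyun/hadoop/oozie-3.0.2/examples",
    )
    # Dry run by default: print the command rather than running it.
    print(" ".join(cmd))
```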

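On reflection, I am running Hadoop in pseudo-distributed mode, and files have to be stored in HDFS before they can be read from it. Reading local files directly is more convenient for debugging, so I changed job.properties to use the local file system. The new version also uses oozie.wf.application.path instead of oozie.coord.application.path and points at apps/map-reduce rather than apps/aggregator: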

nameNode=file://
jobTracker=localhost:9001
queueName=default
examplesRoot=hadoop/oozie-3.0.2/examples

oozie.wf.application.path=${nameNode}/home/${user.name}/${examplesRoot}/apps/map-reduce
outputDir=map-reduce
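
To see which directory the oozie.wf.application.path line above points at, the ${...} placeholders can be expanded by hand. The sketch below does a plain textual substitution. It is not Oozie's own resolver, and user.name=guoyun is an assumption inferred from the paths shown earlier in this post.

```python
import re
from typing import Dict

PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def expand(value: str, variables: Dict[str, str], max_passes: int = 10) -> str:
    """Replace ${name} with variables[name]; unknown names are left untouched."""
    for _ in range(max_passes):
        expanded = PLACEHOLDER.sub(
            lambda m: variables.get(m.group(1), m.group(0)), value
        )
        if expanded == value:
            break  # Nothing left to substitute.
        value = expanded
    return value


if __name__ == "__main__":
    variables = {
        "nameNode": "file://",
        "user.name": "guoyun",  # assumed from /home/guoyun in the command above
        "examplesRoot": "hadoop/oozie-3.0.2/examples",
    }
    path = "${nameNode}/home/${user.name}/${examplesRoot}/apps/map-reduce"
    print(expand(path, variables))
```

With these values the path expands to file:///home/guoyun/hadoop/oozie-3.0.2/examples/apps/map-reduce, which is the local examples directory used in the submit command.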

Running it once more gave the following result:

   job: 0000001-111109111632287-oozie-guoy-W

The job also shows up at http://localhost:11000/oozie, along with its status.
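
Besides the web console, the job status can in principle be read from Oozie's REST interface; the log earlier in this post shows the server handling requests under /oozie/v1/. The sketch below assumes the v1 endpoint job/<id>?show=info is available on this Oozie version and returns JSON with a status field. Treat both as assumptions to verify against your installation. The helper names are made up for this illustration.

```python
import json
import urllib.request
from urllib.parse import quote


def job_info_url(oozie_url: str, job_id: str) -> str:
    """Build the (assumed) v1 REST URL for a job's info document."""
    return f"{oozie_url.rstrip('/')}/v1/job/{quote(job_id)}?show=info"


def fetch_job_status(oozie_url: str, job_id: str, timeout: float = 10.0) -> str:
    """Fetch the job info JSON and return its 'status' field."""
    with urllib.request.urlopen(job_info_url(oozie_url, job_id), timeout=timeout) as resp:
        info = json.load(resp)
    return info["status"]


if __name__ == "__main__":
    # Job id taken from the console output above; only the URL is printed here.
    print(job_info_url("http://localhost:11000/oozie",
                       "0000001-111109111632287-oozie-guoy-W"))
```

Calling fetch_job_status with the same two arguments would perform the actual request against a running Oozie server.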

OK! The example ran successfully!

 

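Comments
#1 zhxh1987200 2011-12-13
Nicely written! One thing, though: if you do need to store the files in HDFS and then run, nameNode=hdfs://localhost:9000 is not necessarily correct. The localhost part has to match the name that the NameNode is actually mapped to; writing localhost or the NameNode's IP address will both produce errors.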
