This blog post is original; when reposting, please credit the source: http://guoyunsky.iteye.com/blogs/1299770/
Everyone is welcome to join the Hadoop super group: 180941958
Oozie is a workflow system for Hadoop and has some syntax rules of its own. I hit an exception over the past two days, and only after reading the source code did the cause become clear: an Oozie join may only receive transitions that come out of a fork, otherwise it reports the error below. The full exception:
WARN CallableQueueService$CallableWrapper:528 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] exception callable [signal], E0720: Fork/join mismatch, node [join_node_name]
org.apache.oozie.command.CommandException: E0720: Fork/join mismatch, node [tianqi_sawlog_transformation_done]
at org.apache.oozie.command.wf.SignalCommand.call(SignalCommand.java:213)
at org.apache.oozie.command.wf.SignalCommand.execute(SignalCommand.java:305)
at org.apache.oozie.command.wf.SignalCommand.execute(SignalCommand.java:59)
at org.apache.oozie.command.Command.call(Command.java:202)
at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:128)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.oozie.workflow.WorkflowException: E0720: Fork/join mismatch, node [tianqi_sawlog_transformation_done]
at org.apache.oozie.workflow.lite.JoinNodeDef$JoinNodeHandler.loopDetection(JoinNodeDef.java:44)
at org.apache.oozie.workflow.lite.LiteWorkflowInstance.signal(LiteWorkflowInstance.java:203)
at org.apache.oozie.workflow.lite.LiteWorkflowInstance.signal(LiteWorkflowInstance.java:284)
at org.apache.oozie.command.wf.SignalCommand.call(SignalCommand.java:120)
... 7 more
The source code lives in org.apache.oozie.workflow.lite.JoinNodeDef; the methods that enforce this rule are:
public void loopDetection(Context context) throws WorkflowException {
    String flag = getLoopFlag(context.getNodeDef().getName());
    if (context.getVar(flag) != null) {
        throw new WorkflowException(ErrorCode.E0709, context.getNodeDef().getName());
    }
    String parentExecutionPath = context.getParentExecutionPath(context.getExecutionPath());
    String forkCount = context.getVar(ForkNodeDef.FORK_COUNT_PREFIX + parentExecutionPath);
    if (forkCount == null) {
        throw new WorkflowException(ErrorCode.E0720, context.getNodeDef().getName());
    }
    int count = Integer.parseInt(forkCount) - 1;
    if (count == 0) {
        context.setVar(flag, "true");
    }
}

public boolean enter(Context context) throws WorkflowException {
    String parentExecutionPath = context.getParentExecutionPath(context.getExecutionPath());
    String forkCount = context.getVar(ForkNodeDef.FORK_COUNT_PREFIX + parentExecutionPath);
    if (forkCount == null) {
        throw new WorkflowException(ErrorCode.E0720, context.getNodeDef().getName());
    }
    int count = Integer.parseInt(forkCount) - 1;
    if (count > 0) {
        context.setVar(ForkNodeDef.FORK_COUNT_PREFIX + parentExecutionPath, "" + count);
        context.deleteExecutionPath();
    } else {
        context.setVar(ForkNodeDef.FORK_COUNT_PREFIX + parentExecutionPath, null);
    }
    return (count == 0);
}
Notice that both methods call String forkCount = context.getVar(ForkNodeDef.FORK_COUNT_PREFIX + parentExecutionPath); to look up the branch counter that a fork recorded under the current node's parent execution path. If nothing is found there, the line throw new WorkflowException(ErrorCode.E0720, context.getNodeDef().getName()); raises exactly this exception.
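To make the bookkeeping concrete, here is a minimal, self-contained Java sketch of the mechanism these two methods rely on. It is an illustration, not the Oozie API: the class and method names, the prefix value, the path strings, and the Map standing in for the workflow instance's variable store are all hypothetical. The fork records how many branches it spawned under its execution path; each branch reaching the join decrements that counter; a join reached without any recorded counter is exactly the E0720 case.

import java.util.HashMap;
import java.util.Map;

public class ForkJoinCountSketch {
    // Mirrors the role of ForkNodeDef.FORK_COUNT_PREFIX; the actual value is hypothetical here.
    static final String FORK_COUNT_PREFIX = ":fork:";

    public static void main(String[] args) {
        Map<String, String> vars = new HashMap<>();
        String forkPath = "/wf/fork1"; // hypothetical parent execution path of the fork

        // A fork with two paths records how many branches the matching join must wait for.
        vars.put(FORK_COUNT_PREFIX + forkPath, "2");

        // Each branch arriving at the join decrements the counter; the join
        // only proceeds once the last branch has arrived (counter reaches 0).
        System.out.println(arriveAtJoin(vars, forkPath)); // false: one branch still pending
        System.out.println(arriveAtJoin(vars, forkPath)); // true: all branches arrived

        // A join whose incoming node never passed through a fork has no counter
        // recorded for it -- the E0720 "Fork/join mismatch" situation.
        try {
            arriveAtJoin(vars, "/wf/no-fork-here");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }

    // Mirrors the shape of JoinNodeDef.enter(): a missing counter means mismatch.
    static boolean arriveAtJoin(Map<String, String> vars, String parentPath) {
        String forkCount = vars.get(FORK_COUNT_PREFIX + parentPath);
        if (forkCount == null) {
            throw new IllegalStateException("E0720: Fork/join mismatch");
        }
        int count = Integer.parseInt(forkCount) - 1;
        if (count > 0) {
            vars.put(FORK_COUNT_PREFIX + parentPath, String.valueOf(count));
        } else {
            vars.remove(FORK_COUNT_PREFIX + parentPath);
        }
        return count == 0;
    }
}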
So from now on, whenever we use a join node, it must be fed from a fork or from another join. A few examples:
1. A wrong example: the join is not fed from a fork or another join, so it triggers the error above.
<workflow-app xmlns="uri:oozie:workflow:0.1" name="workflow-test">
    <start to="action1" />
    <action name="action1">
        <!-- do some things -->
        <ok to="join1" />
        <error to="fail" />
    </action>
    <join name="join1" to="end" />
    <kill name="fail">
        <message>Workflow failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end" />
</workflow-app>
2. A correct example: the join is fed from a fork (join1 receives action1 and action2, both started by fork1).
<workflow-app xmlns="uri:oozie:workflow:0.1" name="workflow-test">
    <start to="fork1" />
    <fork name="fork1">
        <path start="action1" />
        <path start="action2" />
    </fork>
    <action name="action1">
        <!-- do some things -->
        <ok to="join1" />
        <error to="fail" />
    </action>
    <action name="action2">
        <!-- do some things -->
        <ok to="join1" />
        <error to="fail" />
    </action>
    <join name="join1" to="end" />
    <kill name="fail">
        <message>Workflow failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end" />
</workflow-app>
3. A correct example: the join is fed from another join (join2 follows join1).
<workflow-app xmlns="uri:oozie:workflow:0.1" name="workflow-test">
    <start to="fork1" />
    <fork name="fork1">
        <path start="action1" />
        <path start="action2" />
    </fork>
    <action name="action1">
        <!-- do some things -->
        <ok to="join1" />
        <error to="fail" />
    </action>
    <action name="action2">
        <!-- do some things -->
        <ok to="join1" />
        <error to="fail" />
    </action>
    <join name="join1" to="join2" />
    <join name="join2" to="end" />
    <kill name="fail">
        <message>Workflow failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end" />
</workflow-app>
4. A correct example: the join receives nodes that all descend from the fork (action2 directly on one path, and action3 via action1 on the other).
<workflow-app xmlns="uri:oozie:workflow:0.1" name="workflow-test">
    <start to="fork1" />
    <fork name="fork1">
        <path start="action1" />
        <path start="action2" />
    </fork>
    <action name="action1">
        <!-- do some things -->
        <ok to="action3" />
        <error to="fail" />
    </action>
    <action name="action2">
        <!-- do some things -->
        <ok to="join1" />
        <error to="fail" />
    </action>
    <action name="action3">
        <!-- do some things -->
        <ok to="join1" />
        <error to="fail" />
    </action>
    <join name="join1" to="end" />
    <kill name="fail">
        <message>Workflow failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end" />
</workflow-app>
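As an aside: in the Oozie version discussed here this check only fires at runtime, when a branch signals the join, so a mismatched workflow submits cleanly and fails later. Before submitting, the Oozie CLI can at least validate the definition against the workflow schema (a hedged tip: this catches malformed XML, but will not necessarily catch a fork/join mismatch):

oozie validate workflow.xml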
This is only a quick first look at Oozie; hopefully it serves as a starting point for further exploration...