
Oozie-Related Functions

 

 

EL: Expression Language

http://oozie.apache.org/docs/3.3.2/WorkflowFunctionalSpec.html#a4.2.1_Basic_EL_Constants


1. Decision Node

1.1 switch case

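The default element is required: it names the transition taken when none of the case predicates evaluates to true.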

Example

<workflow-app name="foo-wf" xmlns="uri:oozie:workflow:0.1">
    ...
    <decision name="mydecision">
        <switch>
            <case to="reconsolidatejob">
              ${fs:fileSize(secondjobOutputDir) gt 10 * GB}
            </case>
            <case to="rexpandjob">
              ${fs:fileSize(secondjobOutputDir) lt 100 * MB}
            </case>
            <case to="recomputejob">
              ${hadoop:counters('secondjob')[RECORDS][REDUCE_OUT] lt 1000000}
            </case>
            <default to="end"/>
        </switch>
    </decision>
    ...
</workflow-app>

 

1.2 fork join

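Fork and join nodes always come in pairs. The join node synchronizes the concurrent execution paths started by the fork node. If every path started by the fork completes successfully, the join waits until all of them have arrived and then continues. If at least one path fails, the kill node terminates the paths that are still running.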

Example

<workflow-app name="sample-wf" xmlns="uri:oozie:workflow:0.1">
    ...
    <fork name="forking">
        <path start="firstparalleljob"/>
        <path start="secondparalleljob"/>
    </fork>
    <action name="firstparalleljob">
        <map-reduce>
            <job-tracker>foo:8021</job-tracker>
            <name-node>bar:8020</name-node>
            <job-xml>job1.xml</job-xml>
        </map-reduce>
        <ok to="joining"/>
        <error to="kill"/>
    </action>
    <action name="secondparalleljob">
        <map-reduce>
            <job-tracker>foo:8021</job-tracker>
            <name-node>bar:8020</name-node>
            <job-xml>job2.xml</job-xml>
        </map-reduce>
        <ok to="joining"/>
        <error to="kill"/>
    </action>
    <join name="joining" to="nextaction"/>
    ...
</workflow-app>

 

1. Basic EL Functions

Basic EL Constants

  • KB: 1024, one kilobyte.
  • MB: 1024 * KB, one megabyte.
  • GB: 1024 * MB, one gigabyte.
  • TB: 1024 * GB, one terabyte.
  • PB: 1024 * TB, one petabyte.

All the above constants are of type long.

String firstNotNull(String value1, String value2)

It returns the first non-null value, or null if both are null.

Note that if the output of this function is null and it is used as string, the EL library converts it to an empty string. This is the common behavior when using firstNotNull() in node configuration sections.

String concat(String s1, String s2)

It returns the concatenation of two strings. A string with null value is considered an empty string.

String replaceAll(String src, String regex, String replacement)

It replaces each match of the regular expression in the first string with the replacement string and returns the result. A regex string with null value means no change. A replacement string with null value is considered an empty string.

String appendAll(String src, String append, String delimiter)

It splits the first string (src) on the delimiter and adds the append string to each resulting substring. E.g. appendAll("/a/b/,/c/b/,/c/d/", "ADD", ",") will return /a/b/ADD,/c/b/ADD,/c/d/ADD. An append string with null value is considered an empty string. A delimiter string with null value means nothing is appended.
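As an illustration of the description above, the fragment below is a minimal sketch of appendAll() inside a configuration property. The property name and the inputDirs job parameter are hypothetical; inputDirs is assumed to hold a comma-separated list of directories such as /data/a,/data/b.

```xml
<!-- Sketch only: inputDirs is a hypothetical job parameter, e.g. /data/a,/data/b -->
<property>
    <name>mapred.input.dir</name>
    <!-- Per the description above, this should yield /data/a/part-*,/data/b/part-* -->
    <value>${appendAll(inputDirs, '/part-*', ',')}</value>
</property>
```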

String trim(String s)

It returns the trimmed value of the given string. A string with null value is considered an empty string.

String urlEncode(String s)

It returns the URL UTF-8 encoded value of the given string. A string with null value is considered as an empty string.

String timestamp()

It returns the current UTC date and time in W3C format down to the second (YYYY-MM-DDThh:mm:ss.sZ), e.g. 1997-07-16T19:20:30.45Z

String toJsonStr(Map) (since Oozie 3.3)

It returns an XML encoded JSON representation of a Map. This function is useful to encode as a single property the complete action-data of an action, wf:actionData(String actionName), in order to pass it in full to another action.

String toPropertiesStr(Map) (since Oozie 3.3)

It returns an XML encoded Properties representation of a Map. This function is useful to encode as a single property the complete action-data of an action, wf:actionData(String actionName), in order to pass it in full to another action.

String toConfigurationStr(Map) (since Oozie 3.3)

It returns an XML encoded Configuration representation of a Map. This function is useful to encode as a single property the complete action-data of an action, wf:actionData(String actionName), in order to pass it in full to another action.
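The three encoding functions are used the same way. The fragment below is a rough sketch of handing the full output of one action to another as a single property; the action names (producer, consumer, fail), the property name, the main class, and the jobTracker/nameNode parameters are all hypothetical.

```xml
<!-- Sketch only: all names below are hypothetical -->
<action name="consumer">
    <java>
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
                <name>producer.output</name>
                <!-- Whole action-data of "producer", encoded as one Properties string -->
                <value>${toPropertiesStr(wf:actionData('producer'))}</value>
            </property>
        </configuration>
        <main-class>com.example.Consumer</main-class>
    </java>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

The consuming action would then have to decode the property value itself, which is why the choice between JSON, Properties, and Configuration encoding depends on what the receiver can parse most easily.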

2. Workflow EL Functions

String wf:id()

It returns the workflow job ID for the current workflow job.

String wf:name()

It returns the workflow application name for the current workflow job.

String wf:appPath()

It returns the workflow application path for the current workflow job.

String wf:conf(String name)

It returns the value of the workflow job configuration property for the current workflow job, or an empty string if undefined.

String wf:user()

It returns the user name that started the current workflow job.

String wf:group()

It returns the group/ACL for the current workflow job.

String wf:callback(String stateVar)

It returns the callback URL for the current workflow action node. stateVar can be a valid exit state (OK or ERROR) for the action, or a token to be replaced with the exit state by the remote system executing the task.

String wf:transition(String node)

It returns the transition taken by the specified workflow action node, or an empty string if the action has not been executed or has not completed yet.

String wf:lastErrorNode()

It returns the name of the last workflow action node that exited with an ERROR exit state, or an empty string if no action has exited with ERROR state in the current workflow job.

String wf:errorCode(String node)

It returns the error code for the specified action node, or an empty string if the action node has not exited with ERROR state.

Each type of action node must define its complete error code list.

String wf:errorMessage(String message)

It returns the error message for the specified action node, or an empty string if the action node has not exited with ERROR state.

The error message can be useful for debugging and notification purposes.
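A common way to use these two functions together is in the message of a kill node, so the failing node and its error appear in the job information. A minimal sketch, assuming a kill node named fail (the name is hypothetical):

```xml
<!-- Sketch only: the node name "fail" is hypothetical -->
<kill name="fail">
    <message>Workflow failed at node [${wf:lastErrorNode()}]: ${wf:errorMessage(wf:lastErrorNode())}</message>
</kill>
```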

int wf:run()

It returns the run number for the current workflow job, normally 0 unless the workflow job is re-run, in which case it indicates the current run.

Map wf:actionData(String node)

This function is only applicable to action nodes that produce output data on completion.

The output data is in a Java Properties format and via this EL function it is available as a Map.
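The fragment below sketches how captured output can drive a decision. It assumes a shell action whose script prints a line such as lines=42 to standard output; the node names, the count.sh script, the lines key, and the jobTracker/nameNode parameters are hypothetical.

```xml
<!-- Sketch only: node names, count.sh and the "lines" key are hypothetical -->
<action name="count-lines">
    <shell xmlns="uri:oozie:shell-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <exec>count.sh</exec>
        <file>count.sh</file>
        <!-- Makes the script's key=value output available through wf:actionData() -->
        <capture-output/>
    </shell>
    <ok to="check-count"/>
    <error to="fail"/>
</action>
<decision name="check-count">
    <switch>
        <case to="process">${wf:actionData('count-lines')['lines'] gt 0}</case>
        <default to="end"/>
    </switch>
</decision>
```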

String wf:actionExternalId(String node)

It returns the external ID for an action node, or an empty string if the action has not been executed or has not completed yet.

String wf:actionTrackerUri(String node)

It returns the tracker URI for an action node, or an empty string if the action has not been executed or has not completed yet.

String wf:actionExternalStatus(String node)

It returns the external status for an action node, or an empty string if the action has not been executed or has not completed yet.

3. Hadoop EL Functions

Hadoop EL Constants

  • RECORDS: Hadoop record counters group name.
  • MAP_IN: Hadoop mapper input records counter name.
  • MAP_OUT: Hadoop mapper output records counter name.
  • REDUCE_IN: Hadoop reducer input records counter name.
  • REDUCE_OUT: Hadoop reducer output records counter name.
  • GROUPS: Hadoop mapper/reducer record groups counter name.

4. Hadoop Jobs EL Function

 wf:actionData()

5. HDFS EL Functions

For all the functions in this section the path must include the FS URI. For example: hdfs://foo:8020/user/tucu

boolean fs:exists(String path)

It returns true or false depending on whether the specified path URI exists.

boolean fs:isDir(String path)

It returns true if the specified path URI exists and it is a directory, otherwise it returns false.

long fs:dirSize(String path)

It returns the size in bytes of all the files in the specified path. If the path is not a directory, or if it does not exist, it returns -1. It does not work recursively; it only computes the size of the files directly under the specified path.

long fs:fileSize(String path)

It returns the size in bytes of the specified file. If the path is not a file, or if it does not exist, it returns -1.

long fs:blockSize(String path)

It returns the block size in bytes of the specified file. If the path is not a file, or if it does not exist, it returns -1.
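Because these functions need a full URI, a decision predicate usually builds the path from the name node address. A minimal sketch, assuming a nameNode job parameter such as hdfs://foo:8020; the node names and the input path are hypothetical.

```xml
<!-- Sketch only: nameNode is assumed to be a job parameter like hdfs://foo:8020 -->
<decision name="check-input">
    <switch>
        <!-- concat() builds the full URI required by the fs: functions -->
        <case to="process-input">${fs:exists(concat(nameNode, '/user/tucu/input'))}</case>
        <default to="end"/>
    </switch>
</decision>
```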

 

 
