Kettle scheduling

 

What is Kitchen?


Kitchen is a program that can execute jobs designed by Spoon in XML or in a database repository. Usually jobs are scheduled in batch mode to be run automatically at regular intervals.

Installation


The first step is the installation of Sun Microsystems Java Runtime Environment version 1.5 or higher. You can download a JRE for free at http://www.java.com/.

After this, you can simply unzip the distribution zip-file in a directory of your choice.
In the Kettle directory where you unzipped the file, you will find a number of files.
Under Unix-like environments (Solaris, Linux, OSX, ...) you will need to make the shell scripts executable. Execute these commands to make all shell scripts in the Kettle directory executable:

cd Kettle
chmod +x *.sh

Launching Kitchen


These scripts are provided to launch Kitchen on the different platforms:

  • Kitchen.bat: run Kitchen on the Windows platform.
  • kitchen.sh: run Kitchen on Unix platforms and Mac OSX

Kitchen can be run on any platform that has Java Runtime Environment version 1.5 or higher.

Command line options


These are the command line options that you can use.

IMPORTANT NOTES:
On Windows systems, the minus sign ("-") and the equal sign ("=") in options can cause problems. Because of this, from version 2.2.2 on, you can also use the format below, or any combination of / or - as the prefix and : or = as the separator:

/option:value

The text after the = or : in each option below is a placeholder for the value you supply.
If an option value contains spaces, enclose it in single or double quotes to keep it together. See the examples below for more info.

Below are the valid options.

Display version information

-version

This option displays the version of the Kettle core library (kettle.jar).
The build version number and build date are shown as well.

Launch XML File

-file=filename

This option runs the job defined in the XML file. (.kjb : Kettle Job)

Set the logging file

-log=Logging Filename

Specifies the log file. The default is the standard output.

Set the logging level

-level=Logging Level

The level option sets the log level for the job that's being run.
These are the possible values:

  • Error: Only show errors
  • Nothing: Don't show any output
  • Minimal: Only use minimal logging
  • Basic: This is the default basic logging level
  • Detailed: Give detailed logging output
  • Debug: For debugging purposes, very detailed output.
  • Rowlevel: Logging at a row level, this can generate a lot of data.

Choose a repository

-rep=Repository name

Connect to the repository with name "Repository name".
You also need to specify the options -user, -pass, -dir and -job.
You can also specify this option in the form of environment variable KETTLE_REPOSITORY.

Set the repository user name

-user=Username

This is the username with which you want to connect to the repository.
You can also specify this option in the form of environment variable KETTLE_USER.

Set the repository password

-pass=Password

The password to use to connect to the repository.
You can also specify this option in the form of environment variable KETTLE_PASSWORD.

Select the repository job to run

-job=Job Name

Use this option to select the job to run from the repository. Please also select the directory with the "-dir" option.

List the directories in the repository

-listdir=Y

Print a listing of all the sub-directories in the repository directory specified with the option "-dir".

Set the repository directory

-dir=directory

Specifies the directory in the repository to use. Repository directories are specified like this:

  • The root directory: /
  • A subdirectory: /production/Dimensions

From version 2.2.2 on, a / (slash) is used to separate directories on all platforms.

List the repository jobs

-listjobs=Y

Show a list of all the jobs in the repository directory specified with the option "-dir".

List the available repositories

-listrep=Y

Print a listing of all the defined repositories.

Don't log in to the repository

-norep=Y

If you have set the environment variables KETTLE_REPOSITORY, KETTLE_USER and KETTLE_PASSWORD, this option prevents Kitchen from logging into the repository, for example when you want to launch a job from an XML file.
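As a rough sketch of how the environment variables and -norep fit together, the snippet below sets the three variables and then runs a job from a file without a repository login. The values are placeholders taken from the examples later in this page, and run_file_job and the KITCHEN variable are hypothetical names introduced only for this illustration.

```shell
#!/bin/sh
# Repository defaults that Kitchen reads from the environment
# (placeholder values, replace with your own).
export KETTLE_REPOSITORY="Production Repository"
export KETTLE_USER="matt"
export KETTLE_PASSWORD="somepassword123"

# Hypothetical wrapper: run a .kjb file and skip the repository login.
# KITCHEN is an assumed variable that defaults to ./kitchen.sh.
# $1 = path to the job file, $2 = optional logging level (default Basic).
run_file_job() {
    "${KITCHEN:-./kitchen.sh}" -norep=Y -file="$1" -level="${2:-Basic}"
}

# Usage, from the Kettle directory:
#   run_file_job /PRD/updateWarehouse.kjb Minimal
```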

Path


Make sure that you are in the Kettle directory before running the samples below. If you put these commands into a batch file or shell script, change to the installation directory first:

If Kettle was installed on the D:\ drive on Windows:

D:
cd \Kettle

If Kettle was installed in the /product directory on a Unix system:

cd /product/Kettle/

Run a job from file


This example runs a job from a file on Windows:

kitchen.bat /file:D:\Jobs\updateWarehouse.kjb /level:Basic

This example runs a job from file on a Linux box:

kitchen.sh -file=/PRD/updateWarehouse.kjb -level=Minimal

Run a job from Repository


This example runs a job from the repository on Windows.
(Enter it on a single line, without line breaks.)

kitchen.bat
                    /rep:"Production Repository"
                    /job:"Update dimensions"
                    /dir:/Dimensions
                    /user:matt
                    /pass:somepassword123
                    /level:Basic

Example:

kitchen.bat /rep:"ywykettle" /user:admin /pass:admin /job:"job_ywy" /dir:/执行job
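For comparison, the same repository run on a Unix-like system might look like the sketch below, using the -option=value form and the options described above. run_repo_job and the KITCHEN variable are hypothetical names for this illustration, and the repository, user and job values are the placeholders from the Windows example.

```shell
#!/bin/sh
# Hypothetical wrapper around kitchen.sh for repository jobs.
# KITCHEN is an assumed variable that defaults to ./kitchen.sh.
# $1 = repository, $2 = user, $3 = password,
# $4 = repository directory, $5 = job name, $6 = optional logging level.
run_repo_job() {
    "${KITCHEN:-./kitchen.sh}" \
        -rep="$1" -user="$2" -pass="$3" \
        -dir="$4" -job="$5" -level="${6:-Basic}"
}

# Usage, from the Kettle directory. The quotes keep values with spaces together:
#   run_repo_job "Production Repository" matt somepassword123 /Dimensions "Update dimensions"
```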

Redirecting output


If you don't want the Kitchen output to appear on the screen but rather be written to a log file, you can use redirection.

This example adds the Kitchen output to an ever-growing log file:

kitchen.sh -file="/PRD/updateWarehouse.kjb" -level=Minimal >> /LOG/trans.log

This example writes the Kitchen output to a file that gets overwritten every time:

kitchen.bat /file:C:\PRD\runAll.kjb /level:Basic > C:\LOG\trans.log
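One possible middle ground between an ever-growing file and one that is overwritten each time is a log file per day. The sketch below is only an illustration of that idea: daily_log_path is a hypothetical helper, and the /LOG directory and the kitchen_ file name prefix are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: print a log file path that changes once per day,
# e.g. <directory>/kitchen_YYYYMMDD.log.
# $1 = log directory (assumed default: /LOG).
daily_log_path() {
    printf '%s/kitchen_%s.log\n' "${1:-/LOG}" "$(date +%Y%m%d)"
}

# Usage, from the Kettle directory. 2>&1 also sends error output to the file:
#   ./kitchen.sh -file=/PRD/updateWarehouse.kjb -level=Minimal >> "$(daily_log_path /LOG)" 2>&1
```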

Return codes


Kitchen returns an error code based on how the execution went:

  • 0 : The job ran without a problem.
  • 1 : Errors occurred during processing
  • 2 : An unexpected error occurred during loading / running of the job
  • 7 : The job couldn't be loaded from XML or the Repository
  • 8 : Error loading steps or plugins (error in loading one of the plugins mostly)
  • 9 : Command line usage printing
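A calling script can branch on these return codes. The following is a minimal sketch that maps each code listed above to its description. describe_kitchen_exit is a hypothetical helper name, not something shipped with Kettle.

```shell
#!/bin/sh
# Hypothetical helper: translate a Kitchen return code into the
# description from the list above.
describe_kitchen_exit() {
    case "$1" in
        0) echo "The job ran without a problem" ;;
        1) echo "Errors occurred during processing" ;;
        2) echo "An unexpected error occurred during loading or running of the job" ;;
        7) echo "The job couldn't be loaded from XML or the repository" ;;
        8) echo "Error loading steps or plugins" ;;
        9) echo "Command line usage printing" ;;
        *) echo "Unknown return code: $1" ;;
    esac
}

# Usage, from the Kettle directory:
#   ./kitchen.sh -file=/PRD/updateWarehouse.kjb -level=Minimal
#   describe_kitchen_exit "$?"
```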

Scheduling

Schedule a job on windows

The best approach is to test the command first at the command prompt.
Then you can use the Windows scheduler to launch this command.
Windows versions since Windows 2000 have a GUI for this, accessible through the Control Panel. However, it is also possible to do this from the command line:

at 23:30 /every:Monday,Wednesday,Friday "D:\updateWarehouse.bat"

To see a list of the scheduled commands simply type:

at

Schedule a job on Unix

First create a shell script that runs all the jobs you need. Then you can schedule this script to run.
On Unix-like systems the easiest way to schedule a command is by using the "cron table". You can edit it by entering the following command:

crontab -e

In the text file that is presented, enter the time at which the command needs to run, followed by the command itself, on a single line.
The first five fields are:

  • Minute: The minute of the hour, 0-59
  • Hour: The hour of the day, 0-23
  • Month day: The day of the month, 1-31
  • Month: The month of the year, 1-12
  • Weekday: The day of the week, 0-6, 0=Sunday

You can specify more than one number for each of these values. Two numbers separated by a hyphen (-) denote an inclusive range. Numbers separated by commas (,) denote distinct values. An asterisk (*) instead of a number means every possible minute, hour, day, month or weekday.

So, if you want to update the dimensions every hour, at 15 and 45 minutes past the hour during the weekdays, you might enter these lines in a crontab:

#
# Launches the update of the dimensions in the warehouse
15,45 * * * 1-5 /PROD/update_dimensions.sh
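The script that cron launches is not shown here. As a hedged sketch, /PROD/update_dimensions.sh could define and call something like the function below. The job file /PRD/updateDimensions.kjb, the log path, and the KETTLE_DIR and LOG_FILE variables are all assumptions made for this illustration.

```shell
#!/bin/sh
# Sketch of what /PROD/update_dimensions.sh might contain.
# KETTLE_DIR and LOG_FILE are assumed variable names with assumed defaults.
run_update_dimensions() {
    kettle_dir="${KETTLE_DIR:-/product/Kettle}"
    log_file="${LOG_FILE:-/LOG/update_dimensions.log}"

    # Kitchen is started from the Kettle directory (see "Path" above).
    # The value 2 is an arbitrary choice to signal that the directory is missing.
    cd "$kettle_dir" || return 2

    # Append Kitchen's output to the log. The function returns
    # Kitchen's own return code because this is the last command.
    ./kitchen.sh -file=/PRD/updateDimensions.kjb -level=Minimal >> "$log_file" 2>&1
}

# In the real script, call the function as the last line:
#   run_update_dimensions
```

Cron runs commands with a minimal environment, so absolute paths like the ones assumed here tend to be safer than relying on the current directory or PATH.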


Kettle's built-in scheduling

The Start entry of a job has a scheduling feature that can repeat the job daily, weekly, and so on, which is very helpful for periodic ETL.

a. When you log in to a repository, the default user name and password are admin/admin.

b. When the job is stored in a repository (repositories usually live in a database), run it with Kitchen.bat using a command line like this:
Kitchen.bat /rep kettle /user admin /pass admin /job job_name

c. When the job is stored on the file system rather than in a repository, run it with Kitchen.bat using a command line like this:
Kitchen.bat /norep /file user-transfer-job.kjb

d. Once the job can be run from the command line, you can use the Windows or Linux task scheduler to run it on a schedule.

e. If you see the following error:

Unexpected error during transformation metadata load
No repository defined!

work through the steps above to resolve it.


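Comments
#1 kitleer 2017-08-22
As far as I know, there is a Chinese ETL scheduling and monitoring tool called TaskCTL that supports Kettle quite well. It can schedule Kettle across platforms in a distributed setup.
It can drive Kettle on both Linux and Windows, with scheduling modes that include automatic timed runs and manual intervention. I have not seen other software that beats its graphical monitoring, and it is certainly much stronger than the other monitoring options.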
