Kettle Learning Materials
Kettle 3.2 User Manual
Contents
Overview
1. Kettle repository management
1.1 Creating a repository
1.2 Updating a repository
1.3 Repository login and user management
1.4 Differences between logging in with and without a repository
2. Menu bar
2.1 File
2.2 Edit
2.3 View
2.4 Repository
2.5 Transformation
2.6 Job
2.7 Wizard
2.8 Help
2.9 Variables
2.9.1 Using variables
2.9.2 Variable scope
2.9.2.1 Environment variables
2.9.2.2 Kettle variables
2.9.2.3 Internal variables
3. Toolbars
3.1 Transformation toolbar
3.2 Jobs toolbar
4. Main object tree
4.1 Transformation main object tree
4.1.1 Creating a transformation
4.1.2 Transformation settings
4.1.3 DB connections
4.1.4 Steps
4.1.5 Hops
4.1.5.1 Right-click the Hops node to create and sort hops
4.1.5.2 Right-click an individual hop to edit its properties or delete it
4.1.6 Database partition schemas
4.1.7 Slave servers
4.1.8 Kettle cluster schemas
4.2 Jobs main object tree
4.2.1 Creating a job
4.2.2 Setting job properties
4.2.3 DB connections
4.2.4 Job entries
4.2.5 Slave servers
5. Transformation core objects
5.1 Transform
5.2 Input
5.3 Input
5.3.1 Access Input
5.3.2 CSV file input
5.3.3 Cube input (multidimensional cube)
5.3.4 Excel input
5.3.5 Fixed file input
5.3.6 Generate random value
5.3.7 Get File Names
5.3.8 Get Files Rows Count
5.3.9 Get data from XML
5.3.10 LDAP Input
5.3.11 LDIF Input
5.3.12 Mondrian Input
5.3.13 Property Input
5.3.14 Streaming XML Input
5.3.15 XBase input
5.3.16 XML input
5.3.17 Text file input
5.3.18 Generate rows
5.3.19 Get system info
5.3.20 Table input
5.4 Output
5.4.1 Access Output
5.4.2 Cube output
5.4.3 Excel Output
5.4.4 Properties Output
5.4.5 SQL File Output
5.4.6 XML output
5.4.7 Delete
5.4.8 Insert/Update
5.4.9 Text file output
5.4.10 Update
5.4.11 Table output
5.5 Lookup
5.5.1 Check if a column exists
5.5.2 File Exists
5.5.3 HTTP client
5.5.4 Table exists
5.5.5 Web services lookup
5.5.6 Database lookup
5.5.7 Database join
5.5.8 Stream lookup
5.5.9 Call DB procedure
5.6 Transform
5.6.1 Abort
5.6.2 Add XML
5.6.3 Add a checksum
5.6.4 Analytic Query
5.6.5 Append Streams
5.6.6 Blocking Step
5.6.7 Clone row
5.6.8 Closure Generator
5.6.9 Data Validator
5.6.10 Delay row
5.6.11 Identify last row in a stream
5.6.12 Metadata structure of stream
5.6.13 Null if (set to null)
5.6.14 Row Normaliser
5.6.15 Split field to rows
5.6.16 Switch / case
5.6.17 XSD Validator
5.6.18 XSL Transformation
5.6.19 Value mapper
5.6.20 Group by
5.6.21 Unique rows
5.6.22 Add constants
5.6.23 Add sequence
5.6.24 Select values
5.6.25 Split fields
5.6.26 Sort rows
5.6.27 Dummy (do nothing)
5.6.28 Row flattener
5.6.29 Row denormaliser
5.6.30 Calculator
5.6.31 Filter rows
5.7 Joins
5.7.1 Merge Join
5.7.2 Sorted Merge
5.7.3 XML Join
5.7.4 Merge rows
5.7.5 Join rows (Cartesian product)
5.8 Scripting
5.8.1 Modified Java Script Value
5.8.2 Regex Evaluation
5.8.3 Execute SQL script
5.9 Data warehouse
5.9.1 Dimension lookup/update
5.9.2 Combination lookup/update
5.10 Mapping
5.10.1 Mapping (sub-transformation)
5.10.2 Mapping input specification
5.10.3 Mapping output specification
5.11 Job
5.11.1 Get Variables
5.11.2 Get files from result
5.11.3 Set Variables
5.11.4 Set files in result
5.11.5 Get rows from result
5.11.6 Copy rows to result
5.12 Inline
5.12.1 Injector
5.12.2 Socket reader
5.12.3 Socket writer
5.13 Experimental
5.14 Deprecated
5.14.1 Aggregate rows
5.15 Bulk loading
5.16 History
6. Jobs core objects
6.1 General
6.1.1 Dummy Job
6.2 General
6.2.1 START
6.2.2 Dummy Job
6.2.3 Abort job
6.2.4 Display message dialog
6.2.5 Job
6.2.6 Ping a host
6.2.7 Success
6.2.8 Text output
6.2.9 Write to Log
6.3 Mail
6.3.1 Write to Log
6.3.2 Mail
6.4 File management
6.4.1 Add filenames to result
6.4.2 Compare folders
6.4.3 Copy files
6.4.4 Copy or move result filenames
6.4.5 Create a folder
6.4.6 Create a file
6.4.7 Delete file
6.4.8 Delete filenames from result
6.4.9 Delete file
6.4.10 Delete folders
6.4.11 File compare
6.4.12 HTTP
6.4.13 Move Files
6.4.14 Unzip file
6.4.15 Wait for file
6.4.16 Zip file
6.5 Conditions
6.5.1 Check if a folder is empty
6.5.2 Check if a file exists
6.5.3 Check if columns exist in a database table
6.5.4 Check that files exist
6.5.5 Check if a table exists
6.5.6 Wait for
6.6 Scripting
6.6.1 Mail
6.6.2 SQL
6.6.3 SHELL
6.7 Bulk loading
6.7.1 Bulk load from MySQL into a file
6.7.2 Bulk load from a file into MS SQL Server
6.7.3 Bulk load from a file into MySQL
6.8 XML
6.8.1 Check if XML File is well formed
6.8.2 DTD Validator
6.8.3 XSD Validator
6.8.4 XSL Transformation
6.9 File transfer
6.9.1 FTP
6.9.2 FTP Delete
6.9.3 Put a file with FTP
6.9.4 Put a file with SFTP
6.9.5 SSH2 Get
6.9.6 SSH2 Put
6.9.7 Secure FTP
6.10 Repository
6.10.1 Check if connected to repository
6.10.2 Export repository to XML file
6.11 Experimental
6.11.1 Evaluate rows number in a table
6.11.2 MS Access Bulk Load
6.11.3 Set variables
6.11.4 Simple evaluation
6.11.5 Truncate tables
6.11.6 Wait for SQL
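Attachments:
1. Kettle+3.2使用说明书.pdf (Kettle 3.2 User Manual)
2. kettle初探--内含配置信息.pdf (A First Look at Kettle, including configuration details)
3. 用Kettle的一套流程完成对整个数据库迁移.pdf (Migrating an Entire Database with a Single Kettle Workflow)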