0. Preparation: Hadoop servers
10.156.50.35 yanfabu2-35.base.app.dev.yf zk1 hadoop1 master1 master
10.156.50.36 yanfabu2-36.base.app.dev.yf zk2 hadoop2 master2
10.156.50.37 yanfabu2-37.base.app.dev.yf zk3 hadoop3 slaver1
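These hostname/alias mappings are assumed to live in /etc/hosts on all three nodes; a minimal sketch (run as root on each machine, adjusting to your environment):

cat >> /etc/hosts <<'EOF'
10.156.50.35 yanfabu2-35.base.app.dev.yf zk1 hadoop1 master1 master
10.156.50.36 yanfabu2-36.base.app.dev.yf zk2 hadoop2 master2
10.156.50.37 yanfabu2-37.base.app.dev.yf zk3 hadoop3 slaver1
EOF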
1. Unpack Sqoop
tar xf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
scp sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz zkkafka@10.156.50.37:/home/zkkafka/
scp sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz zkkafka@10.156.50.36:/home/zkkafka/
rm -rf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
mv sqoop-1.4.7.bin__hadoop-2.6.0 sqoop
2. Update environment variables
vim ~/.bash_profile
export SQOOP_HOME=/home/zkkafka/sqoop
export PATH=$PATH:$SQOOP_HOME/bin

source ~/.bash_profile
scp ~/.bash_profile zkkafka@10.156.50.36:/home/zkkafka/
scp ~/.bash_profile zkkafka@10.156.50.37:/home/zkkafka/
3. Edit sqoop-env.sh
cp sqoop-env-template.sh sqoop-env.sh
vim sqoop-env.sh
export HADOOP_COMMON_HOME=/home/zkkafka/hadoop
export HADOOP_MAPRED_HOME=/home/zkkafka/hadoop
export HIVE_HOME=/home/zkkafka/hive
export HBASE_HOME=/home/zkkafka/hbase

scp ../conf/* zkkafka@10.156.50.36:/home/zkkafka/sqoop/conf/
scp ../conf/* zkkafka@10.156.50.37:/home/zkkafka/sqoop/conf/
4. Edit sqoop-site.xml
Allow passwords to be saved in the Sqoop metastore (this is what lets the saved jobs in section 10 store credentials):
<property>
  <name>sqoop.metastore.client.record.password</name>
  <value>true</value>
  <description>If true, allow saved passwords in the metastore.</description>
</property>
5. Put the MySQL JDBC driver into Sqoop's lib directory
pwd
/home/zkkafka/sqoop/lib
cp /home/zkkafka/hive/lib/mysql-connector-java-5.1.47.jar /home/zkkafka/sqoop/lib/

Fix for "Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly":
cp /home/zkkafka/hive/lib/hive-exec-*.jar /home/zkkafka/sqoop/lib/

Fix for "Please set $ACCUMULO_HOME to the root of your Accumulo installation.":
cp /home/zkkafka/hadoop/share/hadoop/tools/lib/aws-java-sdk*.jar /home/zkkafka/sqoop/lib/

Fix for "ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.NullPointerException":
vim sqoop-env.sh
export HCAT_HOME=/home/zkkafka/hive/tmp/HiveJobsLog

Also download java-json.jar (http://www.java2s.com/Code/Jar/j/Downloadjavajsonjar.htm) and java-json-schema.jar (http://www.java2s.com/Code/Jar/j/Downloadjavajsonschemajar.htm), then distribute the JSON jars to the other nodes:
scp java-json.jar zkkafka@10.156.50.37:/home/zkkafka/sqoop/lib/
scp java-json-schema.jar zkkafka@10.156.50.36:/home/zkkafka/sqoop/lib/
scp json-20160212.jar zkkafka@10.156.50.37:/home/zkkafka/sqoop/lib/
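After copying, a quick sanity check (a sketch; the patterns match the jars copied above) that everything landed in Sqoop's lib directory:

ls /home/zkkafka/sqoop/lib | egrep 'mysql-connector|hive-exec|aws-java-sdk|json'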
6. Using Sqoop
6.1 Test the Sqoop installation and configuration
sqoop help
[zkkafka@yanfabu2-35 sqoop]$ sqoop help
Warning: /home/zkkafka/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/zkkafka/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
19/05/29 11:14:16 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
usage: sqoop COMMAND [ARGS]

Available commands:
  codegen            Generate code to interact with database records
  create-hive-table  Import a table definition into Hive
  eval               Evaluate a SQL statement and display the results
  export             Export an HDFS directory to a database table
  help               List available commands
  import             Import a table from a database to HDFS
  import-all-tables  Import tables from a database to HDFS
  import-mainframe   Import datasets from a mainframe server to HDFS
  job                Work with saved jobs
  list-databases     List available databases on a server
  list-tables        List available tables in a database
  merge              Merge results of incremental imports
  metastore          Run a standalone Sqoop metastore
  version            Display version information

See 'sqoop help COMMAND' for information on a specific command.
6.2 Using sqoop list-tables
sqoop list-tables --username hive --password hive --connect jdbc:mysql://10.156.50.36:3306/metastore?characterEncoding=UTF-8
[zkkafka@yanfabu2-35 sqoop]$ sqoop list-tables --username hive --password hive --connect jdbc:mysql://10.156.50.36:3306/metastore?characterEncoding=UTF-8
Warning: /home/zkkafka/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/zkkafka/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
19/05/29 11:17:53 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
19/05/29 11:17:53 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
19/05/29 11:17:53 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
Wed May 29 11:17:54 CST 2019 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
AUX_TABLE
BUCKETING_COLS
CDS
COLUMNS_V2
COMPACTION_QUEUE
COMPLETED_COMPACTIONS
COMPLETED_TXN_COMPONENTS
DATABASE_PARAMS
DBS
DB_PRIVS
DELEGATION_TOKENS
FUNCS
FUNC_RU
GLOBAL_PRIVS
HIVE_LOCKS
IDXS
INDEX_PARAMS
KEY_CONSTRAINTS
MASTER_KEYS
NEXT_COMPACTION_QUEUE_ID
NEXT_LOCK_ID
NEXT_TXN_ID
NOTIFICATION_LOG
NOTIFICATION_SEQUENCE
NUCLEUS_TABLES
PARTITIONS
PARTITION_EVENTS
PARTITION_KEYS
PARTITION_KEY_VALS
PARTITION_PARAMS
PART_COL_PRIVS
PART_COL_STATS
PART_PRIVS
ROLES
ROLE_MAP
SDS
SD_PARAMS
SEQUENCE_TABLE
SERDES
SERDE_PARAMS
SKEWED_COL_NAMES
SKEWED_COL_VALUE_LOC_MAP
SKEWED_STRING_LIST
SKEWED_STRING_LIST_VALUES
SKEWED_VALUES
SORT_COLS
TABLE_PARAMS
TAB_COL_STATS
TBLS
TBL_COL_PRIVS
TBL_PRIVS
TXNS
TXN_COMPONENTS
TYPES
TYPE_FIELDS
VERSION
WRITE_SET
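As the log warns, passing --password on the command line is insecure; -P makes Sqoop prompt for the password interactively instead. An equivalent invocation (quoting the JDBC URL so the shell does not interpret the '?'):

sqoop list-tables -P --username hive --connect "jdbc:mysql://10.156.50.36:3306/metastore?characterEncoding=UTF-8"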
7. Prepare MySQL test data
/*
Navicat MySQL Data Transfer

Source Server         : 10.156.50.36 hive
Source Server Version : 50717
Source Host           : 10.156.50.36:3306
Source Database       : mqh

Target Server Type    : MYSQL
Target Server Version : 50717
File Encoding         : 65001

Date: 2019-05-29 11:48:53
*/

SET FOREIGN_KEY_CHECKS=0;

-- ----------------------------
-- Table structure for `app`
-- ----------------------------
DROP TABLE IF EXISTS `app`;
CREATE TABLE `app` (
  `appid` int(11) DEFAULT NULL,
  `appname` varchar(255) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- ----------------------------
-- Records of app
-- ----------------------------
INSERT INTO `app` VALUES ('1', 'kaxinxiaoxiaole');
INSERT INTO `app` VALUES ('2', 'wangzherognyao');
INSERT INTO `app` VALUES ('3', 'cijizhanchang');

-- ----------------------------
-- Table structure for `device`
-- ----------------------------
DROP TABLE IF EXISTS `device`;
CREATE TABLE `device` (
  `deviceid` int(11) DEFAULT NULL,
  `appid` int(11) DEFAULT NULL,
  `register_time` datetime DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- ----------------------------
-- Records of device
-- ----------------------------
INSERT INTO `device` VALUES ('1', '1', '2019-05-29 11:25:54');
INSERT INTO `device` VALUES ('2', '1', '2019-05-29 11:26:04');
INSERT INTO `device` VALUES ('3', '1', '2019-05-30 11:26:14');
INSERT INTO `device` VALUES ('4', '1', '2019-05-29 11:26:30');
INSERT INTO `device` VALUES ('5', '2', '2019-05-29 11:26:38');
INSERT INTO `device` VALUES ('6', '3', '2019-05-29 11:26:45');
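A minimal sketch for loading this dump, assuming it is saved as mqh.sql and that the root account used in the imports below can connect from the Sqoop host:

mysql -h 10.156.50.36 -u root -p -e "CREATE DATABASE IF NOT EXISTS mqh DEFAULT CHARACTER SET utf8;"
mysql -h 10.156.50.36 -u root -p mqh < mqh.sql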
8. One-shot import from MySQL into Hive with Sqoop
sqoop import \
--connect jdbc:mysql://10.156.50.36:3306/mqh \
--username root \
--password root \
--table device \
--fields-terminated-by '\t' \
--delete-target-dir \
--num-mappers 1 \
--hive-import \
--hive-database test \
--hive-table device;
[zkkafka@yanfabu2-35 ~]$ sqoop import \
> --connect jdbc:mysql://10.156.50.36:3306/mqh \
> --username root \
> --password root \
> --table device \
> --fields-terminated-by '\t' \
> --delete-target-dir \
> --num-mappers 1 \
> --hive-import \
> --hive-database test \
> --hive-table device;
Warning: /home/zkkafka/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/zkkafka/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
19/05/29 14:04:02 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
19/05/29 14:04:02 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
19/05/29 14:04:03 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
19/05/29 14:04:03 INFO tool.CodeGenTool: Beginning code generation
Wed May 29 14:04:03 CST 2019 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
19/05/29 14:04:03 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `device` AS t LIMIT 1
19/05/29 14:04:03 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `device` AS t LIMIT 1
19/05/29 14:04:03 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/zkkafka/hadoop
Note: /tmp/sqoop-zkkafka/compile/35bdce3f9edb048d842d71d7b7f25f85/device.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
19/05/29 14:04:06 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-zkkafka/compile/35bdce3f9edb048d842d71d7b7f25f85/device.jar
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/zkkafka/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/zkkafka/hbase/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
19/05/29 14:04:07 INFO tool.ImportTool: Destination directory device is not present, hence not deleting.
19/05/29 14:04:07 WARN manager.MySQLManager: It looks like you are importing from mysql.
19/05/29 14:04:07 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
19/05/29 14:04:07 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
19/05/29 14:04:07 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
19/05/29 14:04:07 INFO mapreduce.ImportJobBase: Beginning import of device
19/05/29 14:04:07 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
19/05/29 14:04:08 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
Wed May 29 14:04:17 CST 2019 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
19/05/29 14:04:17 INFO db.DBInputFormat: Using read commited transaction isolation
19/05/29 14:04:17 INFO mapreduce.JobSubmitter: number of splits:1
19/05/29 14:04:18 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1558676658010_0012
19/05/29 14:04:18 INFO impl.YarnClientImpl: Submitted application application_1558676658010_0012
19/05/29 14:04:18 INFO mapreduce.Job: The url to track the job: http://master1:8088/proxy/application_1558676658010_0012/
19/05/29 14:04:18 INFO mapreduce.Job: Running job: job_1558676658010_0012
19/05/29 14:04:28 INFO mapreduce.Job: Job job_1558676658010_0012 running in uber mode : false
19/05/29 14:04:28 INFO mapreduce.Job: map 0% reduce 0%
19/05/29 14:04:34 INFO mapreduce.Job: map 100% reduce 0%
19/05/29 14:04:35 INFO mapreduce.Job: Job job_1558676658010_0012 completed successfully
19/05/29 14:04:35 INFO mapreduce.Job: Counters: 30
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=129039
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=87
        HDFS: Number of bytes written=156
        HDFS: Number of read operations=4
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Other local map tasks=1
        Total time spent by all maps in occupied slots (ms)=3989
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=3989
        Total vcore-milliseconds taken by all map tasks=3989
        Total megabyte-milliseconds taken by all map tasks=4084736
    Map-Reduce Framework
        Map input records=6
        Map output records=6
        Input split bytes=87
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=76
        CPU time spent (ms)=1480
        Physical memory (bytes) snapshot=170278912
        Virtual memory (bytes) snapshot=2107375616
        Total committed heap usage (bytes)=92274688
    File Input Format Counters
        Bytes Read=0
    File Output Format Counters
        Bytes Written=156
19/05/29 14:04:36 INFO mapreduce.ImportJobBase: Transferred 156 bytes in 27.9729 seconds (5.5768 bytes/sec)
19/05/29 14:04:36 INFO mapreduce.ImportJobBase: Retrieved 6 records.
19/05/29 14:04:36 INFO mapreduce.ImportJobBase: Publishing Hive/Hcat import job data to Listeners for table device
Wed May 29 14:04:36 CST 2019 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
19/05/29 14:04:36 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `device` AS t LIMIT 1
19/05/29 14:04:36 WARN hive.TableDefWriter: Column register_time had to be cast to a less precise type in Hive
19/05/29 14:04:36 INFO hive.HiveImport: Loading uploaded data into Hive
19/05/29 14:04:36 INFO conf.HiveConf: Found configuration file file:/home/zkkafka/hive/conf/hive-site.xml
19/05/29 14:04:36 WARN conf.HiveConf: HiveConf of name hive.files.umask.value does not exist
19/05/29 14:04:36 WARN conf.HiveConf: HiveConf of name hive.server2.enable.impersonation does not exist
19/05/29 14:04:49 INFO hive.HiveImport: SLF4J: Class path contains multiple SLF4J bindings.
19/05/29 14:04:49 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/home/zkkafka/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
19/05/29 14:04:49 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/home/zkkafka/hbase/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
19/05/29 14:04:49 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/home/zkkafka/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
19/05/29 14:04:49 INFO hive.HiveImport: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
19/05/29 14:04:49 INFO hive.HiveImport: SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
19/05/29 14:04:54 INFO hive.HiveImport:
19/05/29 14:04:54 INFO hive.HiveImport: Logging initialized using configuration in file:/home/zkkafka/hive/conf/hive-log4j2.properties Async: true
19/05/29 14:05:01 INFO hive.HiveImport: OK
19/05/29 14:05:01 INFO hive.HiveImport: Time taken: 4.585 seconds
19/05/29 14:05:02 INFO hive.HiveImport: Loading data to table test.device
19/05/29 14:05:06 INFO hive.HiveImport: OK
19/05/29 14:05:06 INFO hive.HiveImport: Time taken: 4.722 seconds
19/05/29 14:05:07 INFO hive.HiveImport: Hive import complete.
19/05/29 14:05:07 INFO hive.HiveImport: Export directory is contains the _SUCCESS file only, removing the directory.
hive> select * from device;
OK
1	1	2019-05-29 11:25:54
2	1	2019-05-29 11:26:04
3	1	2019-05-30 11:26:14
4	1	2019-05-29 11:26:30
5	2	2019-05-29 11:26:38
6	3	2019-05-29 11:26:45
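The same rows can also be inspected on HDFS. Under a default Hive configuration the warehouse path is usually /user/hive/warehouse/<db>.db/<table> (an assumption; check hive.metastore.warehouse.dir if your layout differs):

hdfs dfs -cat /user/hive/warehouse/test.db/device/*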
9. Sqoop incremental import
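Incremental append only fetches rows whose --check-column value exceeds --last-value. The run below picked up exactly one new row (upper bound 11 in the log), so a row like the following is assumed to have been inserted into MySQL first (hypothetical values, consistent with the log):

INSERT INTO `device` VALUES ('11', '2', NOW());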
sqoop import \
--connect jdbc:mysql://10.156.50.36:3306/mqh \
--username root \
--password root \
--table device \
--target-dir=/hdfs/sqoop/device \
--fields-terminated-by '\t' \
--num-mappers 1 \
--hive-import \
--hive-database test \
--hive-table device \
--incremental append \
--check-column deviceid \
--last-value 10 \
--m 1
[zkkafka@yanfabu2-35 ~]$ sqoop import \
> --connect jdbc:mysql://10.156.50.36:3306/mqh \
> --username root \
> --password root \
> --table device \
> --target-dir=/hdfs/sqoop/device \
> --fields-terminated-by '\t' \
> --num-mappers 1 \
> --hive-import \
> --hive-database test \
> --hive-table device \
> --incremental append \
> --check-column deviceid \
> --last-value 10 \
> --m 1
19/06/04 16:56:40 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
19/06/04 16:56:40 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
19/06/04 16:56:41 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
19/06/04 16:56:41 INFO tool.CodeGenTool: Beginning code generation
Tue Jun 04 16:56:41 CST 2019 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
19/06/04 16:56:42 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `device` AS t LIMIT 1
19/06/04 16:56:42 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `device` AS t LIMIT 1
19/06/04 16:56:42 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/zkkafka/hadoop
Note: /tmp/sqoop-zkkafka/compile/66854d2f9121b88905b0b82b5730b38b/device.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
19/06/04 16:56:44 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-zkkafka/compile/66854d2f9121b88905b0b82b5730b38b/device.jar
19/06/04 16:56:44 INFO tool.ImportTool: Maximal id query for free form incremental import: SELECT MAX(`deviceid`) FROM `device`
19/06/04 16:56:44 INFO tool.ImportTool: Incremental import based on column `deviceid`
19/06/04 16:56:44 INFO tool.ImportTool: Lower bound value: 10
19/06/04 16:56:44 INFO tool.ImportTool: Upper bound value: 11
19/06/04 16:56:44 WARN manager.MySQLManager: It looks like you are importing from mysql.
19/06/04 16:56:44 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
19/06/04 16:56:44 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
19/06/04 16:56:44 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
19/06/04 16:56:44 INFO mapreduce.ImportJobBase: Beginning import of device
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/zkkafka/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/zkkafka/hbase/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
19/06/04 16:56:45 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
19/06/04 16:56:46 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
Tue Jun 04 16:56:56 CST 2019 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
19/06/04 16:56:56 INFO db.DBInputFormat: Using read commited transaction isolation
19/06/04 16:56:56 INFO mapreduce.JobSubmitter: number of splits:1
19/06/04 16:56:57 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1559370613628_0007
19/06/04 16:56:58 INFO impl.YarnClientImpl: Submitted application application_1559370613628_0007
19/06/04 16:56:58 INFO mapreduce.Job: The url to track the job: http://master1:8088/proxy/application_1559370613628_0007/
19/06/04 16:56:58 INFO mapreduce.Job: Running job: job_1559370613628_0007
19/06/04 16:57:07 INFO mapreduce.Job: Job job_1559370613628_0007 running in uber mode : false
19/06/04 16:57:07 INFO mapreduce.Job: map 0% reduce 0%
19/06/04 16:57:15 INFO mapreduce.Job: map 100% reduce 0%
19/06/04 16:57:16 INFO mapreduce.Job: Job job_1559370613628_0007 completed successfully
19/06/04 16:57:16 INFO mapreduce.Job: Counters: 30
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=131266
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=87
        HDFS: Number of bytes written=27
        HDFS: Number of read operations=4
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Other local map tasks=1
        Total time spent by all maps in occupied slots (ms)=4312
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=4312
        Total vcore-milliseconds taken by all map tasks=4312
        Total megabyte-milliseconds taken by all map tasks=4415488
    Map-Reduce Framework
        Map input records=1
        Map output records=1
        Input split bytes=87
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=76
        CPU time spent (ms)=1240
        Physical memory (bytes) snapshot=170573824
        Virtual memory (bytes) snapshot=2106789888
        Total committed heap usage (bytes)=92798976
    File Input Format Counters
        Bytes Read=0
    File Output Format Counters
        Bytes Written=27
19/06/04 16:57:16 INFO mapreduce.ImportJobBase: Transferred 27 bytes in 30.087 seconds (0.8974 bytes/sec)
19/06/04 16:57:16 INFO mapreduce.ImportJobBase: Retrieved 1 records.
19/06/04 16:57:16 INFO mapreduce.ImportJobBase: Publishing Hive/Hcat import job data to Listeners for table device
19/06/04 16:57:16 INFO util.AppendUtils: Creating missing output directory - device
Tue Jun 04 16:57:16 CST 2019 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
19/06/04 16:57:16 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `device` AS t LIMIT 1
19/06/04 16:57:16 WARN hive.TableDefWriter: Column register_time had to be cast to a less precise type in Hive
19/06/04 16:57:16 INFO hive.HiveImport: Loading uploaded data into Hive
19/06/04 16:57:16 INFO conf.HiveConf: Found configuration file file:/home/zkkafka/hive/conf/hive-site.xml
19/06/04 16:57:17 WARN conf.HiveConf: HiveConf of name hive.files.umask.value does not exist
19/06/04 16:57:17 WARN conf.HiveConf: HiveConf of name hive.server2.enable.impersonation does not exist
19/06/04 16:57:34 INFO hive.HiveImport: SLF4J: Class path contains multiple SLF4J bindings.
19/06/04 16:57:34 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/home/zkkafka/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
19/06/04 16:57:34 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/home/zkkafka/hbase/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
19/06/04 16:57:34 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/home/zkkafka/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
19/06/04 16:57:34 INFO hive.HiveImport: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
19/06/04 16:57:34 INFO hive.HiveImport: SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
19/06/04 16:57:37 INFO hive.HiveImport:
19/06/04 16:57:37 INFO hive.HiveImport: Logging initialized using configuration in file:/home/zkkafka/hive/conf/hive-log4j2.properties Async: true
19/06/04 16:57:47 INFO hive.HiveImport: OK
19/06/04 16:57:47 INFO hive.HiveImport: Time taken: 5.746 seconds
19/06/04 16:57:48 INFO hive.HiveImport: Loading data to table test.device
19/06/04 16:57:52 INFO hive.HiveImport: OK
19/06/04 16:57:52 INFO hive.HiveImport: Time taken: 5.208 seconds
19/06/04 16:57:53 INFO hive.HiveImport: Hive import complete.
19/06/04 16:57:53 INFO hive.HiveImport: Export directory is empty, removing it.
19/06/04 16:57:53 INFO tool.ImportTool: Incremental import complete! To run another incremental import of all data following this import, supply the following arguments:
19/06/04 16:57:53 INFO tool.ImportTool: --incremental append
19/06/04 16:57:53 INFO tool.ImportTool: --check-column deviceid
19/06/04 16:57:53 INFO tool.ImportTool: --last-value 11
19/06/04 16:57:53 INFO tool.ImportTool: (Consider saving this with 'sqoop job --create')
10. Other saved-job commands
--create <job-id>  Create a new saved job
--delete <job-id>  Delete a saved job
--exec <job-id>    Execute a saved job
--show <job-id>    Show a saved job's parameters
--list             List all saved jobs

sqoop job --list
sqoop job --delete sqoop_job_device
sqoop job --exec sqoop_job_device
sqoop job --show sqoop_job_device
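The transcript below assumes the saved job sqoop_job_device already exists. One way it might have been created, sketched to match the --show output (a plain import; the standalone -- before the tool name is required, and storing the password relies on sqoop.metastore.client.record.password=true from section 4):

sqoop job --create sqoop_job_device \
-- import \
--connect jdbc:mysql://10.156.50.36:3306/mqh \
--username root \
--password root \
--table device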
[zkkafka@yanfabu2-35 conf]$ sqoop job --list
19/05/29 15:15:58 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
Available jobs:
  sqoop_job_device

[zkkafka@yanfabu2-35 conf]$ sqoop job --show sqoop_job_device
19/05/29 15:16:10 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
Job: sqoop_job_device
Tool: import
Options:
----------------------------
verbose = false
hcatalog.drop.and.create.table = false
codegen.output.delimiters.escape = 0
codegen.output.delimiters.enclose.required = false
codegen.input.delimiters.field = 0
split.limit = null
hbase.create.table = false
mainframe.input.dataset.type = p
db.require.password = false
skip.dist.cache = false
hdfs.append.dir = false
codegen.input.delimiters.escape = 0
import.fetch.size = null
accumulo.create.table = false
codegen.input.delimiters.enclose.required = false
reset.onemapper = false
codegen.output.delimiters.record = 10
import.max.inline.lob.size = 16777216
sqoop.throwOnError = false
hbase.bulk.load.enabled = false
hcatalog.create.table = false
db.clear.staging.table = false
codegen.input.delimiters.record = 0
enable.compression = false
hive.overwrite.table = false
hive.import = false
codegen.input.delimiters.enclose = 0
accumulo.batch.size = 10240000
hive.drop.delims = false
customtool.options.jsonmap = {}
codegen.output.delimiters.enclose = 0
hdfs.delete-target.dir = false
codegen.output.dir = .
codegen.auto.compile.dir = true
relaxed.isolation = false
mapreduce.num.mappers = 4
accumulo.max.latency = 5000
import.direct.split.size = 0
sqlconnection.metadata.transaction.isolation.level = 2
codegen.output.delimiters.field = 44
export.new.update = UpdateOnly
incremental.mode = None
hdfs.file.format = TextFile
sqoop.oracle.escaping.disabled = true
codegen.compile.dir = /tmp/sqoop-zkkafka/compile/424a3807a29b831b82c7cd119bb44e9f
direct.import = false
temporary.dirRoot = _sqoop
hive.fail.table.exists = false
db.batch = false
Support the developer

Writing something free, driven by interest, brings both joy and sweat. I hope you like my work and will consider supporting it. If you can, chip in (Alipay, WeChat, and QQ group are supported); if not, cheering me on helps too. Thank you all.

Homepage: http://knight-black-bob.iteye.com/

Thank you for your support; I will keep doing better!