
Sqoop: Importing MySQL into HBase

 

Sqoop: http://mirror.bit.edu.cn/apache/sqoop/1.4.1-incubating/sqoop-1.4.1-incubating__hadoop-0.20.tar.gz

 

 

MySQL Connector/J: http://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.20.tar.gz/from/http://mysql.ntu.edu.tw/

 

Install Sqoop and HBase.
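Sqoop locates Hadoop and HBase through environment variables; a minimal sketch, using the Hadoop path visible in the logs below (the HBase path is an assumption, since only hbase-0.90.4.jar appears in the classpath):

export HADOOP_HOME=/home/hadoop/hadoop-0.20.203.0
export HBASE_HOME=/home/hadoop/hbase-0.90.4   # assumed install location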

Download the JDBC driver: mysql-connector-java-5.1.20.jar

Copy mysql-connector-java-5.1.20.jar into Sqoop's lib directory, along with the HBase and ZooKeeper jars.
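A minimal sketch of the copy step, using the jar names that appear in the classpath of the logs below ($SQOOP_HOME stands for the Sqoop install directory, here /home/hadoop/sqoop-1.4.1-incubating__hadoop-0.20; the zookeeper jar location under $HBASE_HOME/lib is an assumption):

cp mysql-connector-java-5.1.20-bin.jar $SQOOP_HOME/lib/
cp $HBASE_HOME/hbase-0.90.4.jar $SQOOP_HOME/lib/
cp $HBASE_HOME/lib/zookeeper-3.3.2.jar $SQOOP_HOME/lib/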

 

Run:

./sqoop import --connect jdbc:mysql://10.20.147.3/pm2 --username pm --password ×× --table acookie --hbase-table acookie  --column-family acookie --hbase-create-table

 

12/05/24 14:49:55 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

12/05/24 14:49:55 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.

12/05/24 14:49:55 INFO tool.CodeGenTool: Beginning code generation

12/05/24 14:49:55 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `acookie` AS t LIMIT 1

12/05/24 14:49:55 INFO orm.CompilationManager: HADOOP_HOME is /home/hadoop/hadoop-0.20.203.0/bin/..

Note: /tmp/sqoop-hadoop/compile/d75d3a2bf713dd671b174830abc3da31/acookie.java uses or overrides a deprecated API.

Note: Recompile with -Xlint:deprecation for details.

12/05/24 14:49:56 ERROR orm.CompilationManager: Could not rename /tmp/sqoop-hadoop/compile/d75d3a2bf713dd671b174830abc3da31/acookie.java to /home/hadoop/sqoop-1.4.1-incubating__hadoop-0.20/bin/./acookie.java

12/05/24 14:49:56 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/d75d3a2bf713dd671b174830abc3da31/acookie.jar

12/05/24 14:49:56 WARN manager.MySQLManager: It looks like you are importing from mysql.

12/05/24 14:49:56 WARN manager.MySQLManager: This transfer can be faster! Use the --direct

12/05/24 14:49:56 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.

12/05/24 14:49:56 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)

12/05/24 14:50:04 INFO mapreduce.ImportJobBase: Beginning import of acookie

12/05/24 14:50:10 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.2-1031432, built on 11/05/2010 05:32 GMT

12/05/24 14:50:10 INFO zookeeper.ZooKeeper: Client environment:host.name=node1

12/05/24 14:50:10 INFO zookeeper.ZooKeeper: Client environment:java.version=1.6.0_18

12/05/24 14:50:10 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Sun Microsystems Inc.

12/05/24 14:50:10 INFO zookeeper.ZooKeeper: Client environment:java.home=/usr/alibaba/java/jre

12/05/24 14:50:10 INFO zookeeper.ZooKeeper: Client environment:java.class.path=/home/hadoop/hadoop-0.20.203.0/bin/../conf:/usr/alibaba/java/lib/tools.jar:/home/hadoop/hadoop-0.20.203.0/bin/..:/home/hadoop/hadoop-0.20.203.0/bin/../hadoop-core-0.20.203.0.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/aspectjrt-1.6.5.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/aspectjtools-1.6.5.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/commons-beanutils-1.7.0.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/commons-beanutils-core-1.8.0.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/commons-cli-1.2.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/commons-codec-1.4.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/commons-collections-3.2.1.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/commons-configuration-1.6.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/commons-daemon-1.0.1.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/commons-digester-1.8.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/commons-el-1.0.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/commons-httpclient-3.0.1.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/commons-lang-2.4.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/commons-logging-1.1.1.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/commons-logging-api-1.0.4.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/commons-math-2.1.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/commons-net-1.4.1.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/core-3.1.1.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/hsqldb-1.8.0.10.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/jackson-core-asl-1.0.1.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/jackson-mapper-asl-1.0.1.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/jasper-compiler-5.5.12.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/jasper-runtime-5.5.12.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/jets3t-0.6.1.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/jetty-6.1.26.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/jetty-util-6.1.26.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/jsch-0.1.42.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/junit-4.5.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/kfs-0.2.2.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/log4j-1.2.15.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/mockito-all-1.8.5.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/oro-2.0.8.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/servlet-api-2.5-20081211.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/slf4j-api-1.4.3.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/slf4j-log4j12-1.4.3.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/xmlenc-0.52.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/jsp-2.1/jsp-2.1.jar:/home/hadoop/hadoop-0.20.203.0/bin/../lib/jsp-2.1/jsp-api-2.1.jar:/home/hadoop/sqoop-1.4.1-incubating__hadoop-0.20/bin/../conf::/home/hadoop/sqoop-1.4.1-incubating__hadoop-0.20/bin/../lib/ant-contrib-1.0b3.jar:/home/hadoop/sqoop-1.4.1-incubating__hadoop-0.20/bin/../lib/ant-eclipse-1.0-jvm1.2.jar:/home/hadoop/sqoop-1.4.1-incubating__hadoop-0.20/bin/../lib/avro-1.5.3.jar:/home/hadoop/sqoop-1.4.1-incubating__hadoop-0.20/bin/../lib/avro-ipc-1.5.3.jar:/home/hadoop/sqoop-1.4.1-incubating__hadoop-0.20/bin/../lib/avro-mapred-1.5.3.jar:/home/hadoop/sqoop-1.4.1-incubating__hadoop-0.20/bin/../lib/commons-io-1.4.jar:/home/hadoop/sqoop-1.4.1-incubating__hadoop-0.20/bin/../lib/hbase-0.90.4.jar:/home/hadoop/sqoop-1.4.1-incubating__hadoop-0.20/bin/../lib/jackson-core-asl-1.7.3.jar:/home/hadoop/sqoop-1.4.1-incubating__hadoop-0.20/bin/../lib/jackson-mapper-asl-1.7.3.jar:/home/hadoop/sqoop-1.4.1-incubating__hadoop-0.20/
bin/../lib/jopt-simple-3.2.jar:/home/hadoop/sqoop-1.4.1-incubating__hadoop-0.20/bin/../lib/mysql-connector-java-5.1.20-bin.jar:/home/hadoop/sqoop-1.4.1-incubating__hadoop-0.20/bin/../lib/paranamer-2.3.jar:/home/hadoop/sqoop-1.4.1-incubating__hadoop-0.20/bin/../lib/snappy-java-1.0.3.2.jar:/home/hadoop/sqoop-1.4.1-incubating__hadoop-0.20/bin/../lib/zookeeper-3.3.2.jar:/home/hadoop/sqoop-1.4.1-incubating__hadoop-0.20/bin/../sqoop-1.4.1-incubating.jar:/home/hadoop/sqoop-1.4.1-incubating__hadoop-0.20/bin/../sqoop-test-1.4.1-incubating.jar:

12/05/24 14:50:10 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/home/hadoop/hadoop-0.20.203.0/bin/../lib/native/Linux-amd64-64

12/05/24 14:50:10 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp

12/05/24 14:50:10 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>

12/05/24 14:50:10 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux

12/05/24 14:50:10 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64

12/05/24 14:50:10 INFO zookeeper.ZooKeeper: Client environment:os.version=2.6.18-131.el5.customxen

12/05/24 14:50:10 INFO zookeeper.ZooKeeper: Client environment:user.name=hadoop

12/05/24 14:50:10 INFO zookeeper.ZooKeeper: Client environment:user.home=/home/hadoop

12/05/24 14:50:10 INFO zookeeper.ZooKeeper: Client environment:user.dir=/home/hadoop/sqoop-1.4.1-incubating__hadoop-0.20/bin

12/05/24 14:50:10 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=180000 watcher=hconnection

12/05/24 14:50:10 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181

12/05/24 14:50:10 INFO zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:2181, initiating session

12/05/24 14:50:10 INFO zookeeper.ClientCnxn: Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x3777b6dcd00029, negotiated timeout = 180000

12/05/24 14:50:10 INFO mapreduce.HBaseImportJob: Creating missing column family acookie

12/05/24 14:50:10 INFO client.HBaseAdmin: Started disable of acookie

12/05/24 14:50:12 INFO client.HBaseAdmin: Disabled acookie

12/05/24 14:50:12 INFO client.HBaseAdmin: Started enable of acookie

12/05/24 14:50:14 INFO client.HBaseAdmin: Enabled table acookie

12/05/24 14:50:15 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`ACOOKIE_ID`), MAX(`ACOOKIE_ID`) FROM `acookie`

12/05/24 14:50:15 WARN db.TextSplitter: Generating splits for a textual index column.

12/05/24 14:50:15 WARN db.TextSplitter: If your database sorts in a case-insensitive order, this may result in a partial import or duplicate records.

12/05/24 14:50:15 WARN db.TextSplitter: You are strongly encouraged to choose an integral split column.

12/05/24 14:50:16 INFO mapred.JobClient: Running job: job_201205231120_0008

12/05/24 14:50:17 INFO mapred.JobClient:  map 0% reduce 0%

12/05/24 14:50:34 INFO mapred.JobClient:  map 16% reduce 0%

12/05/24 14:50:37 INFO mapred.JobClient:  map 33% reduce 0%

12/05/24 14:50:38 INFO mapred.JobClient:  map 66% reduce 0%

12/05/24 14:50:40 INFO mapred.JobClient:  map 83% reduce 0%

12/05/24 14:50:43 INFO mapred.JobClient:  map 100% reduce 0%

12/05/24 14:50:48 INFO mapred.JobClient: Job complete: job_201205231120_0008

12/05/24 14:50:48 INFO mapred.JobClient: Counters: 13

12/05/24 14:50:48 INFO mapred.JobClient:   Job Counters 

12/05/24 14:50:48 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=35523

12/05/24 14:50:48 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0

12/05/24 14:50:48 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0

12/05/24 14:50:48 INFO mapred.JobClient:     Launched map tasks=6

12/05/24 14:50:48 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0

12/05/24 14:50:48 INFO mapred.JobClient:   File Output Format Counters 

12/05/24 14:50:48 INFO mapred.JobClient:     Bytes Written=0

12/05/24 14:50:48 INFO mapred.JobClient:   FileSystemCounters

12/05/24 14:50:48 INFO mapred.JobClient:     HDFS_BYTES_READ=877

12/05/24 14:50:48 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=226202

12/05/24 14:50:48 INFO mapred.JobClient:   File Input Format Counters 

12/05/24 14:50:48 INFO mapred.JobClient:     Bytes Read=0

12/05/24 14:50:48 INFO mapred.JobClient:   Map-Reduce Framework

12/05/24 14:50:48 INFO mapred.JobClient:     Map input records=72

12/05/24 14:50:48 INFO mapred.JobClient:     Spilled Records=0

12/05/24 14:50:48 INFO mapred.JobClient:     Map output records=72

12/05/24 14:50:48 INFO mapred.JobClient:     SPLIT_RAW_BYTES=877

12/05/24 14:50:48 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 33.7181 seconds (0 bytes/sec)

12/05/24 14:50:48 INFO mapreduce.ImportJobBase: Retrieved 72 records.
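Two of the warnings above are easy to address: -P prompts for the password instead of passing it on the command line, and -m 1 sidesteps the TextSplitter warning by running a single mapper, so no split column is needed. A hedged variant of the same import:

./sqoop import --connect jdbc:mysql://10.20.147.3/pm2 --username pm -P --table acookie --hbase-table acookie --column-family acookie --hbase-create-table -m 1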

 

 

 

In the HBase shell, run list to confirm the table was created; then scan it, or:

get 'acookie','7JzGBSJKrh0CAccdAHmqwaeu'

COLUMN                           CELL                                                                                        

 acookie:ALIIM_ID                timestamp=1337842232403, value=cnalichnmyetptest19                                          

 acookie:CREATOR                 timestamp=1337842232403, value=sys                                                          

 acookie:GMT_CREATE              timestamp=1337842232403, value=2011-01-26 13:34:15.0                                        

 acookie:GMT_MODIFIED            timestamp=1337842232403, value=2011-01-26 16:11:01.0                                        

 acookie:MODIFIER                timestamp=1337842232403, value=sys                                                          

 acookie:corp_name               timestamp=1337842232403, value=\xE4\xB8\xAD\xE5\x9B\xBDmyetptest19&\xE8\x9C\xBB\xE8\x84\xB2\

                                 xE9\x9C\x81\xE6\x92\xAC\xE6\x9C\x89\xE9\x99\x90\xE5\x85\xAC\xE5\x8F\xB8                     

 acookie:email                   timestamp=1337842232403, value=myetptest19@dfasf.com                                        

 acookie:gender                  timestamp=1337842232403, value=M                                                            

 acookie:phone                   timestamp=1337842232403, value=86_0571_ 45683158                                            

 acookie:user_name               timestamp=1337842232403, value=\xE5\xB7\xA1\xE5\x9C\x8A\xE7\x8A\x81                         

10 row(s) in 0.0410 seconds
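A scan works too; a bounded example in HBase shell syntax (LIMIT just keeps the output short):

scan 'acookie', {LIMIT => 1}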

 

 

 

 

 

 

Appendix: Sqoop import command reference from the web:

 

 

Sqoop Import Examples:

Sqoop import: imports data from a relational database management system (RDBMS) such as MySQL or Oracle into the Hadoop Distributed File System (HDFS) and its subprojects (Hive, HBase).

 

 

Import a MySQL table into HBase:

 

Case 1: The table has a primary key; import all columns of the MySQL table into the HBase table.

 

$ bin/sqoop import --connect jdbc:mysql://localhost/db1 --username root --password root --table tableName --hbase-table hbase_tableName  --column-family hbase_table_col1 --hbase-create-table

 

Case 2: The table has a primary key; import only a few columns of the MySQL table into the HBase table.

 

$ bin/sqoop import --connect jdbc:mysql://localhost/db1 --username root --password root --table tableName --hbase-table hbase_tableName --columns column1,column2 --column-family hbase_table_col1 --hbase-create-table 

 

Note: the column list given to --columns must include the primary key column.
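For instance, if the primary key were a hypothetical column id, it must be listed explicitly:

$ bin/sqoop import --connect jdbc:mysql://localhost/db1 --username root --password root --table tableName --hbase-table hbase_tableName --columns id,column1,column2 --column-family hbase_table_col1 --hbase-create-table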

 

Case 3: The table has no primary key, so choose one column as the --hbase-row-key. Import all columns of the MySQL table into the HBase table.

 

$ bin/sqoop import --connect jdbc:mysql://localhost/db1 --username root --password root --table tableName --hbase-table hbase_tableName --column-family hbase_table_col1 --hbase-row-key column1 --hbase-create-table

 

Case 4: The table has no primary key, so choose one column as the --hbase-row-key. Import only a few columns of the MySQL table into the HBase table.

 

$ bin/sqoop import --connect jdbc:mysql://localhost/db1 --username root --password root --table tableName --hbase-table hbase_tableName --columns column1,column2 --column-family hbase_table_col --hbase-row-key column1 --hbase-create-table  

 

Note: the column named in --hbase-row-key must appear in the --columns list; otherwise the command executes successfully but no records are inserted into HBase.

 

 

Note: the value of the primary key column, or of the column given in --hbase-row-key, becomes the HBase row key. If the MySQL table has no primary key, or the --hbase-row-key column does not hold unique values, some records will be lost.

 

Example: consider a MySQL table test_table with two columns, name and address, and no primary or unique key.

 

Records of test_table:

name    address
----------------
abc     123
sqw     345
abc     125
sdf     1234
aql     23dw
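A minimal sketch to reproduce this table in MySQL (the VARCHAR sizes are assumptions; the source only names the columns):

mysql -u root -p db1 <<'SQL'
CREATE TABLE test_table (name VARCHAR(32), address VARCHAR(32));
INSERT INTO test_table VALUES
  ('abc','123'), ('sqw','345'), ('abc','125'), ('sdf','1234'), ('aql','23dw');
SQL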

 

 

Run the following command to import test_table data into HBase:

 

$ bin/sqoop import --connect jdbc:mysql://localhost/db1 --username root --password root --table test_table --hbase-table hbase_test_table --column-family test_table_col1 --hbase-row-key name --hbase-create-table

 

Only 4 records appear in the HBase table instead of 5. In the example above, two rows share the value 'abc' in the name column, and that value is used as the HBase row key. When the first record with name 'abc' is imported, it is inserted into the HBase table; when the second record with the same 'abc' key arrives, it overwrites the previous one.
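You can confirm the collapse from the HBase shell (count is standard shell syntax; 4 rows are expected here, since the two 'abc' records share a row key):

count 'hbase_test_table'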

 

The same problem occurs when the table has a composite primary key, because only one column from the composite key is used as the HBase row key.
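A common workaround, sketched under the assumption that Sqoop's free-form --query import and MySQL's CONCAT_WS are available (the row_key alias is made up for this example), is to synthesize a unique row key from several columns:

$ bin/sqoop import --connect jdbc:mysql://localhost/db1 --username root --password root --query 'SELECT CONCAT_WS("_", column1, column2) AS row_key, t.* FROM tableName t WHERE $CONDITIONS' --split-by column1 --hbase-table hbase_tableName --column-family hbase_table_col1 --hbase-row-key row_key --hbase-create-table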

 

Import a MySQL table into Hive

 

Case 1: Import a MySQL table into Hive when the table has a primary key.

 

$ bin/sqoop-import --connect jdbc:mysql://localhost:3306/db1 --username root --password password --table tableName --hive-table tableName --create-hive-table --hive-import --hive-home path/to/hive_home

 

Case 2: Import a MySQL table into Hive when the table has no primary key.

 

$ bin/sqoop-import --connect jdbc:mysql://localhost:3306/db1 --username root --password password --table tableName --hive-table tableName --create-hive-table --hive-import --hive-home path/to/hive_home --split-by column_name

 

or 

 

$ bin/sqoop-import --connect jdbc:mysql://localhost:3306/db1 --username root --password password --table tableName --hive-table tableName --create-hive-table --hive-import --hive-home path/to/hive_home -m 1

 

 

 

Import a MySQL table into HDFS

 

 

Case 1: Import a MySQL table into HDFS when the table has a primary key.

 

$ bin/sqoop import --connect jdbc:mysql://localhost:3306/db1 --username root --password password --table tableName --target-dir /user/ankit/tableName

 

Case 2: Import a MySQL table into HDFS when the table has no primary key.

 

$ bin/sqoop import --connect jdbc:mysql://localhost:3306/db1 --username root --password password --table tableName --target-dir /user/ankit/tableName -m 1
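To check the result, list the target directory and peek at a map output file (standard Hadoop CLI; part-m-00000 is the usual map-task output naming convention):

$ bin/hadoop fs -ls /user/ankit/tableName
$ bin/hadoop fs -cat /user/ankit/tableName/part-m-00000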

 

 

 

Sqoop Export Examples:

 

Sqoop export: exports data from HDFS and its subprojects (Hive, HBase) back into an RDBMS.

 

Export a Hive table back to an RDBMS:

 

By default, Hive stores data using ^A as the field delimiter and \n as the row delimiter.

 

$ bin/sqoop export --connect jdbc:mysql://localhost/test_db --table tableName  --export-dir /user/hive/warehouse/tableName --username root --password password -m 1 --input-fields-terminated-by '\001'

 

where '\001' is the octal representation of ^A.
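If the Hive table was created with a different delimiter (say FIELDS TERMINATED BY ','), pass that delimiter instead; a hedged variant:

$ bin/sqoop export --connect jdbc:mysql://localhost/test_db --table tableName --export-dir /user/hive/warehouse/tableName --username root --password password -m 1 --input-fields-terminated-by ','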

