Importing a MySQL Table into HBase with Sqoop

2019-05-15 16:19 · Original · Tag: sqoop
Author: lirui

1. In MySQL, create a test table named card that holds ten million rows
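
The article does not show how the test data was produced. One possible approach, sketched below, is to generate a CSV file and bulk-load it into MySQL (for example with LOAD DATA INFILE). The column layout is an assumption: only card_id appears in the original, while the holder and balance columns and the /tmp/card.csv path are hypothetical.

```shell
# Print N CSV rows to stdout.
# Columns: card_id (from the article), holder and balance (hypothetical, for illustration only).
gen_card_csv() {
  awk -v n="$1" 'BEGIN {
    for (i = 1; i <= n; i++)
      printf "%d,user_%d,%d\n", i, i, (i * 37) % 10000
  }'
}

# Example: write ten million rows to a file for a later bulk load.
# gen_card_csv 10000000 > /tmp/card.csv
```

Generating the rows with a single awk process keeps the run time reasonable even at ten million rows, because no external command is started per row.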

2. In HBase, create the corresponding table test with a single column family, info

hbase shell
create 'test','info'

3. Import the MySQL data into HBase

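# --hbase-table:        name of the target HBase table
# --hbase-row-key:      MySQL column used as the HBase row key (here the primary key card_id)
# --column-family:      HBase column family that receives the imported columns
# --hbase-create-table: creates the "test" table in HBase automatically; omit it if the table already exists
sqoop import \
--connect jdbc:mysql://192.168.20.160/test \
--username root \
--password 111111 \
--table card \
--hbase-table 'test' \
--hbase-row-key card_id \
--column-family 'info' \
--hbase-create-table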
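
Conceptually, the import turns each MySQL row into writes against one HBase row: the row key comes from card_id, and the other columns become cells in the info column family, stored as strings. The sketch below only illustrates that mapping by printing equivalent hbase shell put commands for one CSV row. It is not how Sqoop is implemented, and the holder and balance column names are hypothetical. Whether the key column itself is also stored as a cell depends on Sqoop configuration and is not shown here.

```shell
# Illustration only: print the hbase shell "put" commands that correspond
# to one source row given as CSV: card_id,holder,balance
# (holder and balance are hypothetical column names).
row_to_puts() {
  echo "$1" | awk -F',' -v q="'" '{
    printf "put %stest%s, %s%s%s, %sinfo:holder%s, %s%s%s\n",  q, q, q, $1, q, q, q, q, $2, q
    printf "put %stest%s, %s%s%s, %sinfo:balance%s, %s%s%s\n", q, q, q, $1, q, q, q, q, $3, q
  }'
}

# Example:
# row_to_puts '1,user_1,37'
```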

I ran into one error along the way:

 tool.ImportTool: Import failed: java.io.IOException: java.sql.SQLException: Incorrect key file for table './test/card.MYI'; try to repair it

This error means that the index of the MySQL table is corrupted (the .MYI file is the index file of a MyISAM table). Running a repair statement on the table fixes it:

repair table card;

4. The import completes successfully

19/05/15 15:08:25 INFO mapreduce.Job: Job job_1557888023370_0004 completed successfully
19/05/15 15:08:26 INFO mapreduce.Job: Counters: 30
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=747592
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=476
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=4
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=0
    Job Counters 
        Launched map tasks=4
        Other local map tasks=4
        Total time spent by all maps in occupied slots (ms)=1347125
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=1347125
        Total vcore-milliseconds taken by all map tasks=2694250
        Total megabyte-milliseconds taken by all map tasks=1379456000
    Map-Reduce Framework
        Map input records=10000000
        Map output records=10000000
        Input split bytes=476
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=3329
        CPU time spent (ms)=1170200
        Physical memory (bytes) snapshot=1686634496
        Virtual memory (bytes) snapshot=11169898496
        Total committed heap usage (bytes)=1444937728
    File Input Format Counters 
        Bytes Read=0
    File Output Format Counters 
        Bytes Written=0
19/05/15 15:08:26 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 486.5645 seconds (0 bytes/sec)
19/05/15 15:08:26 INFO mapreduce.ImportJobBase: Retrieved 10000000 records.

The log shows that ten million rows were imported into the HBase test table.
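
When the import runs unattended, the record count can be pulled out of the saved log rather than read by eye. The following is a minimal sketch that assumes the Sqoop output was captured to a file; the file name sqoop_import.log is hypothetical. It extracts the number from the "Retrieved N records." line and computes a rough records-per-second rate from the elapsed time that Sqoop reports.

```shell
# Read Sqoop log text on stdin and print the number from the
# "Retrieved N records." line.
retrieved_records() {
  sed -n 's/.*Retrieved \([0-9][0-9]*\) records\..*/\1/p'
}

# $1 = record count, $2 = elapsed seconds; print the rate truncated to a whole number.
records_per_second() {
  awk -v n="$1" -v s="$2" 'BEGIN { printf "%d\n", n / s }'
}

# Example (sqoop_import.log is a hypothetical file holding the output above):
# retrieved_records < sqoop_import.log
# records_per_second 10000000 486.5645
```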

5. Enter HBase and verify that the data was imported

Use the row-counting utility class that ships in the HBase jar to count the rows in the test table:

hbase org.apache.hadoop.hbase.mapreduce.RowCounter 'test'
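
RowCounter runs as a MapReduce job and reports its result among the job counters. The sketch below assumes the count appears on a line of the form ROWS=N, which is what RowCounter typically prints but is not shown in the original article, so check it against the actual output before relying on it.

```shell
# Read RowCounter output on stdin and print the value of the ROWS counter.
# Assumption: the counter is printed on its own line as "ROWS=<number>".
rowcounter_rows() {
  sed -n 's/^[[:space:]]*ROWS=\([0-9][0-9]*\)[[:space:]]*$/\1/p'
}

# Example:
# hbase org.apache.hadoop.hbase.mapreduce.RowCounter 'test' 2>&1 | rowcounter_rows
```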

The result shows ten million new rows in the test table. You can also run the following command in the hbase shell to view the data in the table:

scan 'test'

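Original article: http://www.cndba.cn/lirui/article/3408

Copyright notice: This is an original article by the author and may not be reposted without the author's permission.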
