Sqoop Installation and Configuration

2019-03-01 17:21  Original  Sqoop
Author: dave

1 Set Up the Hadoop Environment

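Sqoop grew out of the Hadoop project and depends on a Hadoop environment (its import and export jobs run as MapReduce jobs), so Hadoop must be set up before Sqoop can be used. http://www.cndba.cn/dave/article/3290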

For details on setting up the Hadoop environment, see my earlier article:

Hadoop 3.1.1 Cluster Setup Guide on Linux 7.6
https://www.cndba.cn/download/dave/6

The Hadoop environment is as follows:

[root@hadoopmaster ~]# yarn node --list
2019-03-01 22:26:03,155 INFO client.RMProxy: Connecting to ResourceManager at hadoopmaster/192.168.20.80:8032
Total Nodes:4
         Node-Id         Node-State    Node-Http-Address    Number-of-Running-Containers
hadoopslave4:46449            RUNNING    hadoopslave4:8042                               0
hadoopslave2:42839            RUNNING    hadoopslave2:8042                               0
hadoopslave1:34451            RUNNING    hadoopslave1:8042                               0
hadoopslave3:35911            RUNNING    hadoopslave3:8042                               0
[root@hadoopmaster ~]# jps
16819 ResourceManager
16548 SecondaryNameNode
9208 Jps
16235 NameNode
15933 DataNode
[root@hadoopmaster ~]#
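
Before installing Sqoop it can be handy to confirm from a script that the NodeManagers are up. The following is a minimal sketch and not part of the original setup: the helper name count_running_nodes is hypothetical, and it only counts lines whose second column is RUNNING in output shaped like the yarn node --list listing above.

```shell
# Count the nodes reported as RUNNING in `yarn node --list` style output
# read from standard input. Header and log lines are ignored because their
# second whitespace-separated field is not the word RUNNING.
count_running_nodes() {
  awk '$2 == "RUNNING" { n++ } END { print n + 0 }'
}

# Possible use on the master node (not executed here):
#   yarn node --list 2>/dev/null | count_running_nodes
```

With the listing shown above as input, each of the four hadoopslave lines has RUNNING in its second column, so those are the lines that get counted.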

2 Install Sqoop

2.1 Download Sqoop

Download it from the official Sqoop site:
http://sqoop.apache.org/

Here we download it from the following mirror:
http://mirror.bit.edu.cn/apache/sqoop/1.4.7/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz

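Note that Sqoop comes in two lines, Sqoop 1 and Sqoop 2, which differ considerably. The official site explains this; here we use Sqoop 1, specifically version 1.4.7.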

Latest stable release is 1.4.7 (download, documentation). Latest cut of Sqoop2 is 1.99.7 (download, documentation). Note that 1.99.7 is not compatible with 1.4.7 and not feature complete, it is not intended for production deployment.

2.2 Extract the Archive into the Installation Directory

[cndba@hadoopmaster ~]$ pwd
/home/cndba
[cndba@hadoopmaster ~]$ ll
total 344256
drwxr-xr-x. 13 cndba cndba       194 Jan 23 02:24 hadoop
-rw-r--r--.  1 cndba cndba 334559382 Jan 22 14:02 hadoop-3.1.1.tar.gz
drwxr-xr-x.  4 cndba cndba        30 Jan 23 21:57 NCDC
-rw-r--r--   1 cndba cndba  17953604 Mar  1 14:38 sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
[cndba@hadoopmaster ~]$ tar -xzvf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz 
sqoop-1.4.7.bin__hadoop-2.6.0/
sqoop-1.4.7.bin__hadoop-2.6.0/CHANGELOG.txt
sqoop-1.4.7.bin__hadoop-2.6.0/COMPILING.txt
sqoop-1.4.7.bin__hadoop-2.6.0/LICENSE.txt
sqoop-1.4.7.bin__hadoop-2.6.0/NOTICE.txt
sqoop-1.4.7.bin__hadoop-2.6.0/README.txt
sqoop-1.4.7.bin__hadoop-2.6.0/bin/
……
[cndba@hadoopmaster ~]$ ll
total 344256
drwxr-xr-x. 13 cndba cndba       194 Jan 23 02:24 hadoop
-rw-r--r--.  1 cndba cndba 334559382 Jan 22 14:02 hadoop-3.1.1.tar.gz
drwxr-xr-x.  4 cndba cndba        30 Jan 23 21:57 NCDC
drwxr-xr-x   9 cndba cndba       318 Dec 19  2017 sqoop-1.4.7.bin__hadoop-2.6.0
-rw-r--r--   1 cndba cndba  17953604 Mar  1 14:38 sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
[cndba@hadoopmaster ~]$ mv sqoop-1.4.7.bin__hadoop-2.6.0 sqoop
[cndba@hadoopmaster ~]$ ll
total 344256
drwxr-xr-x. 13 cndba cndba       194 Jan 23 02:24 hadoop
-rw-r--r--.  1 cndba cndba 334559382 Jan 22 14:02 hadoop-3.1.1.tar.gz
drwxr-xr-x.  4 cndba cndba        30 Jan 23 21:57 NCDC
drwxr-xr-x   9 cndba cndba       318 Dec 19  2017 sqoop
-rw-r--r--   1 cndba cndba  17953604 Mar  1 14:38 sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
[cndba@hadoopmaster ~]$
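
The extract-then-rename steps above can also be collapsed into a single step. This is a minimal sketch, assuming a tar that supports --strip-components (GNU tar does); the function name extract_into is made up for illustration.

```shell
# Extract a .tar.gz whose files all sit under one top-level directory
# directly into a target directory, dropping that top-level directory name.
extract_into() {
  local tarball=$1 target=$2
  mkdir -p "$target" && tar -xzf "$tarball" -C "$target" --strip-components=1
}

# Possible use with the archive from this article (not executed here):
#   extract_into ~/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz ~/sqoop
```

The result should match the layout obtained with tar followed by mv, without the long versioned directory name ever appearing in the home directory.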

2.3 Add Environment Variables

Edit /etc/profile and add the following:

#Sqoop
export SQOOP_HOME=/home/cndba/sqoop
export PATH=$PATH:$SQOOP_HOME/bin

Run source to make the change take effect:

[root@hadoopmaster ~]# source /etc/profile
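
Because /etc/profile may be sourced more than once in a session, appending to PATH unconditionally can leave duplicate entries. One possible variant of the lines above is sketched here; the helper path_append is hypothetical and adds the Sqoop bin directory only when it is missing.

```shell
# Append a directory to PATH only if it is not already listed.
path_append() {
  case ":$PATH:" in
    *":$1:"*) ;;               # already present, nothing to do
    *) PATH="$PATH:$1" ;;      # not present, add it at the end
  esac
}

# Same values as in the article; only the PATH handling differs.
export SQOOP_HOME=/home/cndba/sqoop
path_append "$SQOOP_HOME/bin"
export PATH
```

Sourcing a file containing this fragment repeatedly leaves PATH unchanged after the first time.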

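2.4 Configure Sqoop

To configure Sqoop to work with Hadoop, edit the sqoop-env.sh file, which lives in the $SQOOP_HOME/conf directory.

First change to the Sqoop conf directory and copy the template file with the following command: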

[cndba@hadoopmaster conf]$ pwd
/home/cndba/sqoop/conf
[cndba@hadoopmaster conf]$ ll
total 28
-rw-rw-r-- 1 cndba cndba 3895 Dec 19  2017 oraoop-site-template.xml
-rw-rw-r-- 1 cndba cndba 1404 Dec 19  2017 sqoop-env-template.cmd
-rwxr-xr-x 1 cndba cndba 1345 Dec 19  2017 sqoop-env-template.sh
-rw-rw-r-- 1 cndba cndba 6044 Dec 19  2017 sqoop-site-template.xml
-rw-rw-r-- 1 cndba cndba 6044 Dec 19  2017 sqoop-site.xml
[cndba@hadoopmaster conf]$ cp sqoop-env-template.sh sqoop-env.sh
[cndba@hadoopmaster conf]$

Edit sqoop-env.sh and add the following:

export HADOOP_MAPRED_HOME=/home/cndba/hadoop
export HADOOP_COMMON_HOME=/home/cndba/hadoop

To support Hive, HBase, and other components, the corresponding variables must be set as well. The template lays them out clearly:

#Set path to where bin/hadoop is available
#export HADOOP_COMMON_HOME=

#Set path to where hadoop-*-core.jar is available
#export HADOOP_MAPRED_HOME=

#set the path to where bin/hbase is available
#export HBASE_HOME=

#Set the path to where bin/hive is available
#export HIVE_HOME=

#Set the path for where zookeper config dir is
#export ZOOCFGDIR=
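
If HADOOP_HOME is already exported on the machine (an assumption; the article does not say so), the two Hadoop settings can be derived from it instead of being hard-coded. The following is a sketch of such a sqoop-env.sh fragment, falling back to the path used in this article; the warning message is illustrative and not something Sqoop itself prints.

```shell
# Use HADOOP_HOME when it is set, otherwise fall back to the article's path.
export HADOOP_COMMON_HOME=${HADOOP_HOME:-/home/cndba/hadoop}
export HADOOP_MAPRED_HOME=${HADOOP_HOME:-/home/cndba/hadoop}

# The template says HADOOP_COMMON_HOME is where bin/hadoop is available,
# so warn early if that file is missing or not executable.
if [ ! -x "$HADOOP_COMMON_HOME/bin/hadoop" ]; then
  echo "Warning: $HADOOP_COMMON_HOME/bin/hadoop not found" >&2
fi
```

Keeping the fallback path means the fragment behaves like the hard-coded version on a host where HADOOP_HOME is not set.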

2.5 Download the Matching JDBC Driver

Sqoop connects to relational databases through JDBC. To connect to MySQL or Oracle, copy the corresponding JDBC driver jar into Sqoop's lib directory.

Here we test with Oracle, so the driver can be copied directly from ORACLE_HOME:

[oracle@18c lib]$ pwd
/u01/app/oracle/product/18.0.0/dbhome_1/jdbc/lib
[oracle@18c lib]$ ll
total 23352
-rw-r--r-- 1 oracle oinstall 6967899 Jul 19  2018 ojdbc8dms_g.jar
-rw-r--r-- 1 oracle oinstall 5801820 Jul 19  2018 ojdbc8dms.jar
-rw-r--r-- 1 oracle oinstall 6938361 Jul 19  2018 ojdbc8_g.jar
-rw-r--r-- 1 oracle oinstall 4161657 Jul 19  2018 ojdbc8.jar
-rw-r--r-- 1 oracle oinstall   29663 Jul 19  2018 simplefan.jar
[oracle@18c lib]$

[oracle@18c lib]$ scp ojdbc8.jar cndba@192.168.20.80:/home/cndba/sqoop/lib
cndba@192.168.20.80's password: 
ojdbc8.jar                                                                                                                                100% 4064KB  19.3MB/s   00:00    
[oracle@18c lib]$
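
A quick way to confirm that the driver actually landed in Sqoop's lib directory is sketched below. The helper name has_jdbc_driver is hypothetical; it only checks that at least one regular file matches the given pattern and says nothing about whether the jar is the right version.

```shell
# Succeed if at least one regular file in the directory matches the pattern.
# $1 is the directory, $2 is a shell glob such as 'ojdbc*.jar'.
has_jdbc_driver() {
  local jar
  for jar in "$1"/$2; do       # $2 is left unquoted on purpose so it globs
    [ -f "$jar" ] && return 0
  done
  return 1
}

# Possible use on the Sqoop host (not executed here):
#   has_jdbc_driver /home/cndba/sqoop/lib 'ojdbc*.jar' && echo "Oracle driver present"
```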

2.6 Verify Sqoop

Check the Sqoop version:

[cndba@hadoopmaster ~]$ sqoop-version 
Warning: /home/cndba/sqoop/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /home/cndba/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/cndba/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /home/cndba/sqoop/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
2019-03-02 01:07:29,410 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
Sqoop 1.4.7
git commit id 2328971411f57f0cb683dfb79d19d4d19d185dd8
Compiled by maugli on Thu Dec 21 15:59:58 STD 2017
[cndba@hadoopmaster ~]$

The warnings appear because HBase and the other optional components are not configured here.

Next we use Sqoop to connect to Oracle. The output is as follows:

[cndba@hadoopmaster ~]$ sqoop list-tables --connect jdbc:oracle:thin:@192.168.20.5:1521:orcl --username system --password oracle
2019-03-02 01:15:59,972 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
2019-03-02 01:16:00,009 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
2019-03-02 01:16:00,111 INFO oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop is disabled.
2019-03-02 01:16:00,127 INFO manager.SqlManager: Using default fetchSize of 1000
2019-03-02 01:16:01,285 INFO manager.OracleManager: Time zone has been set to GMT
LOGMNR_SESSION_EVOLVE$
LOGMNR_GLOBAL$
LOGMNR_PDB_INFO$
LOGMNR_DID$
LOGMNR_UID$
LOGMNRGGC_GTLO
LOGMNRGGC_GTCS
LOGMNRC_DBNAME_UID_MAP
LOGMNR_LOG$
LOGMNR_PROCESSED_LOG$
LOGMNR_SPILL$
LOGMNR_AGE_SPILL$
LOGMNR_RESTART_CKPT_TXINFO$
LOGMNR_ERROR$
LOGMNR_RESTART_CKPT$
LOGMNR_FILTER$
LOGMNR_SESSION_ACTIONS$
LOGMNR_PARAMETER$
LOGMNR_SESSION$
REDO_DB
REDO_LOG
……
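
The WARN line in the output above notes that putting the password on the command line is insecure and suggests -P, which makes Sqoop prompt for the password instead. The following is a minimal sketch of the same call using -P; the wrapper function list_oracle_tables is hypothetical, and the connection string and user in the usage comment are the ones from this article.

```shell
# Run `sqoop list-tables` with an interactive password prompt (-P)
# instead of --password. $1 is the JDBC URL, $2 is the database user.
list_oracle_tables() {
  local jdbc_url=$1 user=$2
  if ! command -v sqoop >/dev/null 2>&1; then
    echo "sqoop not found on PATH" >&2
    return 127
  fi
  sqoop list-tables --connect "$jdbc_url" --username "$user" -P
}

# Possible use on the Sqoop host (not executed here):
#   list_oracle_tables jdbc:oracle:thin:@192.168.20.5:1521:orcl system
```

With -P the password never appears in the shell history or in the process list, which is the concern behind the warning.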

Copyright notice: This is an original article by the author and may not be reproduced without the author's permission.
