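In the previous articles we used the Sqoop tool to move data between an RDBMS and HDFS. Retyping a command that has to run frequently is tedious, and Sqoop jobs solve this problem: save the routine command as a job, and from then on you only need to execute that job.
Sqoop: importing MySQL data into HDFS
https://www.cndba.cn/dave/article/3305
Sqoop: exporting all tables in a database to HDFS
https://www.cndba.cn/dave/article/3306
Sqoop: importing data from HDFS into MySQL
https://www.cndba.cn/dave/article/3307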
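1 Create a job (--create)
We create a job named dave that imports the data of an RDBMS table into HDFS.
[dave@www.cndba.cn lib]$ sqoop job --create dave -- import --connect jdbc:mysql://192.168.56.2:3306/employees --username root --table employees -m 1 -P
Note: when creating the job, there is a space between "--" and "import" in the command. Do not leave it out, or the command fails with an error.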
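The command above can be sketched as a small shell function, which makes the separate `--` and `import` arguments easier to see. The function name `create_import_job` is hypothetical; the connection string, user, and table are the ones used in this article.

```shell
# Hypothetical helper: save the import from this article as a named Sqoop job.
# Usage: create_import_job dave
create_import_job() {
  job_name="$1"
  # The bare "--" separates the options of "sqoop job" from the command
  # being saved ("import" and its options).
  sqoop job --create "$job_name" \
    -- import \
    --connect jdbc:mysql://192.168.56.2:3306/employees \
    --username root \
    --table employees \
    -m 1 \
    -P
}
```

Writing one argument per line keeps `--` and `import` visibly apart, which is the detail the note above warns about.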
2 Verify jobs (--list)
The --list option is used to verify saved jobs. The following command prints the list of saved Sqoop jobs.
[dave@www.cndba.cn data]$ sqoop job --list
Warning: /home/hadoop/sqoop/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /home/hadoop/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/hadoop/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
2019-03-03 02:52:37,399 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
Available jobs:
cndba
dave
[dave@www.cndba.cn data]$
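In a script, this listing can be checked before creating or running a job. A minimal sketch, assuming the job names are printed one per line on standard output after the `Available jobs:` line, as in the output above; `job_exists` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if a saved job with the given name exists.
# Usage: job_exists dave && echo "dave is saved"
job_exists() {
  sqoop job --list 2>/dev/null \
    | awk '
        # Print (trimmed) only the lines that follow "Available jobs:".
        seen { gsub(/^[ \t]+|[ \t]+$/, ""); print }
        /^Available jobs:/ { seen = 1 }
      ' \
    | grep -qxF -e "$1"   # exact, whole-line match on the job name
}
```

Matching the whole line avoids treating a prefix such as `dav` as an existing job.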
3 Inspect a job (--show)
The --show option is used to inspect a specific job and its details.
[dave@www.cndba.cn data]$ sqoop job --show cndba
Warning: /home/hadoop/sqoop/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /home/hadoop/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/hadoop/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
2019-03-03 02:55:56,573 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
Enter password:
Job: cndba
Tool: import
Options:
----------------------------
verbose = false
hcatalog.drop.and.create.table = false
db.connect.string = jdbc:mysql://192.168.56.2:3306/employees
codegen.output.delimiters.escape = 0
codegen.output.delimiters.enclose.required = false
codegen.input.delimiters.field = 0
split.limit = null
hbase.create.table = false
mainframe.input.dataset.type = p
db.require.password = true
skip.dist.cache = false
hdfs.append.dir = false
db.table = employees
codegen.input.delimiters.escape = 0
accumulo.create.table = false
import.fetch.size = null
codegen.input.delimiters.enclose.required = false
db.username = root
reset.onemapper = false
codegen.output.delimiters.record = 10
import.max.inline.lob.size = 16777216
sqoop.throwOnError = false
hbase.bulk.load.enabled = false
hcatalog.create.table = false
db.clear.staging.table = false
codegen.input.delimiters.record = 0
enable.compression = false
hive.overwrite.table = false
hive.import = false
codegen.input.delimiters.enclose = 0
accumulo.batch.size = 10240000
hive.drop.delims = false
customtool.options.jsonmap = {}
codegen.output.delimiters.enclose = 0
hdfs.delete-target.dir = false
codegen.output.dir = .
codegen.auto.compile.dir = true
relaxed.isolation = false
mapreduce.num.mappers = 1
accumulo.max.latency = 5000
import.direct.split.size = 0
sqlconnection.metadata.transaction.isolation.level = 2
codegen.output.delimiters.field = 44
export.new.update = UpdateOnly
incremental.mode = None
hdfs.file.format = TextFile
sqoop.oracle.escaping.disabled = true
codegen.compile.dir = /tmp/sqoop-hadoop/compile/1388d1ba83d095a4e3b3ae4bb45832ff
direct.import = false
temporary.dirRoot = _sqoop
hive.fail.table.exists = false
db.batch = false
[dave@www.cndba.cn data]$
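The `key = value` lines in this listing are easy to pick apart in a script. A sketch, assuming the listing has been saved to a file (the name `cndba-job.txt` is hypothetical); `job_option` is a hypothetical helper that reads the listing on standard input.

```shell
# Hypothetical helper: print the value of one option from "sqoop job --show"
# output supplied on standard input.
# Usage: job_option db.table < cndba-job.txt
job_option() {
  awk -v key="$1" '
    {
      line = $0
      sub(/^[ \t]+/, "", line)            # tolerate indented lines
      prefix = key " = "
      if (index(line, prefix) == 1) {     # line starts with "key = "
        print substr(line, length(prefix) + 1)
        exit
      }
    }
  '
}
```

For the job shown above, `job_option db.table` prints `employees`, and `job_option codegen.output.delimiters.field` prints `44`, which is the decimal character code of a comma.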
4 Execute a job (--exec)
The --exec option is used to run a saved job.
[dave@www.cndba.cn data]$ sqoop job --exec cndba
Warning: /home/hadoop/sqoop/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /home/hadoop/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/hadoop/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
2019-03-03 02:55:14,148 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
Enter password:
…
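The Sqoop user guide also describes overriding arguments of a saved job at execution time by placing them after a `--`. A minimal sketch of a wrapper around that form; `exec_job` is a hypothetical name, and the override in the usage comment is only an illustration.

```shell
# Hypothetical wrapper: run a saved job, optionally overriding saved arguments.
# Usage: exec_job cndba
#        exec_job cndba --username someuser -P
exec_job() {
  job_name="$1"
  shift
  if [ "$#" -gt 0 ]; then
    # Extra arguments are passed after "--" as overrides for this run.
    sqoop job --exec "$job_name" -- "$@"
  else
    sqoop job --exec "$job_name"
  fi
}
```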
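Because we created the job with the -P option, the password has to be entered every time the job runs. Alternatively, the --password option puts the password directly on the command line so that no prompt is needed at execution time, at the cost of some security. Note that, according to the Sqoop user guide, passwords are not stored in the metastore by default, so a saved job may still prompt for one unless sqoop.metastore.client.record.password is set to true in sqoop-site.xml.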
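Another option documented for Sqoop is `--password-file`, which reads the password from a file with restricted permissions instead of the command line. A sketch under stated assumptions: the helper names and the path in the usage comments are hypothetical, the password file lives on the local file system (hence the `file://` prefix), and its path is absolute.

```shell
# Hypothetical helper: store a password read from standard input in a file
# readable only by its owner, without a trailing newline (a trailing newline
# may otherwise be treated as part of the password).
# Usage: printf '%s' 'secret' | write_password_file /home/hadoop/.mysql.pwd
write_password_file() {
  pw_file="$1"
  pw="$(cat)"          # command substitution drops trailing newlines
  ( umask 077 && printf '%s' "$pw" > "$pw_file" )
  chmod 400 "$pw_file"
}

# Hypothetical helper: save the import as a job that uses the password file.
# Usage: create_job_with_password_file dave /home/hadoop/.mysql.pwd
create_job_with_password_file() {
  sqoop job --create "$1" \
    -- import \
    --connect jdbc:mysql://192.168.56.2:3306/employees \
    --username root \
    --table employees \
    -m 1 \
    --password-file "file://$2"
}
```

With this form the password appears neither in the shell history nor in the process list, and the job can run without an interactive prompt.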
Copyright notice: this is an original article by the author and may not be reproduced without the author's permission.