CDH 5.16 Hive on Spark Configuration Guide

2019-05-07 23:23 · Original · Hive
Author: dave

Hive uses the MapReduce engine by default, which is relatively slow; switching to the Spark engine improves performance considerably. The official CDH documentation describes the Hive on Spark configuration here:

https://www.cloudera.com/documentation/enterprise/5-16-x/topics/admin_hos_oview.html

1 Add the Spark Service

Add the Spark service to the CDH cluster. Note that two Spark entries appear in the list; the add-service wizard explains the difference between them clearly. (Screenshots omitted.)
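Adding the service can also be scripted against the Cloudera Manager REST API instead of clicking through the wizard. The endpoint path, API version, and service type below are assumptions based on the CM v19 API; adjust host, port, cluster name, and credentials for your environment:

```python
import json

# Hypothetical settings -- replace with your Cloudera Manager host and cluster name.
CM_HOST = "cm-host.example.com"
CLUSTER = "Cluster 1"

def add_service_payload(name, service_type):
    """Build the request body for POST /api/v19/clusters/<cluster>/services."""
    return json.dumps({"items": [{"name": name, "type": service_type}]})

payload = add_service_payload("spark_on_yarn", "SPARK_ON_YARN")
print(payload)

# To actually submit (requires the `requests` package and CM admin credentials):
# import requests
# requests.post(f"http://{CM_HOST}:7180/api/v19/clusters/{CLUSTER}/services",
#               auth=("admin", "admin"),
#               headers={"Content-Type": "application/json"},
#               data=payload)
```

The POST is left commented out so the snippet is safe to run anywhere; only the payload construction executes.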

2 Modify the Hive Configuration

In the Hive configuration, enable Spark On YARN, as shown below:

Also in the configuration, change Hive's default execution engine from MapReduce to Spark:
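The same switch can be made per session from the Hive shell; `hive.execution.engine` is the standard property name, while the Cloudera Manager setting above makes Spark the cluster-wide default:

```sql
-- Switch the execution engine for the current session only.
set hive.execution.engine=spark;
-- Print the current value to verify.
set hive.execution.engine;
```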

3 Add a Spark Gateway to the HiveServer2 Host

In the Spark configuration, add the Spark Gateway role to the host running HiveServer2. This assignment is already offered when the Spark service is first installed, so skip this step if it was configured then:

4 Restart All Stale Services

Return to the CDH home page and restart all services with stale configurations:

5 Verification

Run queries in Hive. Note that the simple SELECT * statements below are answered by a local fetch task and launch no job, while the COUNT query goes through the Spark engine, as the "Starting Spark Job" line in the output shows:

[dave@www.cndba.cn ~]# hive
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: Using incremental CMS is deprecated and will likely be removed in a future release
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0

Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.16.1-1.cdh5.16.1.p0.3/jars/hive-common-1.1.0-cdh5.16.1.jar!/hive-log4j.properties
WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
hive> select * from default.cndba;
OK
1     http://www.cndba.cn
2     中国DBA社区
3     Oracle
4     Hadoop
Time taken: 3.42 seconds, Fetched: 4 row(s)
hive> select * from default.cndba;
OK
1     http://www.cndba.cn
2     中国DBA社区
3     Oracle
4     Hadoop
Time taken: 0.177 seconds, Fetched: 4 row(s)
hive> select count(1) from default.cndba;
Query ID = root_20281105001414_486f5d2d-cbae-41e9-a643-51a2ddb55d12
Total jobs = 1
Launching Job 1 out of 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Spark Job = 2ba7b7f9-722c-4202-afef-18f654639d61
Running with YARN Application = application_1856966558053_0001
Kill Command = /opt/cloudera/parcels/CDH-5.16.1-1.cdh5.16.1.p0.3/lib/hadoop/bin/yarn application -kill application_1856966558053_0001

Query Hive on Spark job[0] stages:
0
1

Status: Running (Hive on Spark job[0])
Job Progress Format
CurrentTime StageId_StageAttemptId: SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount [StageCost]
2028-11-05 00:15:26,194    Stage-0_0: 0(+1)/1    Stage-1_0: 0/1    
2028-11-05 00:15:29,230    Stage-0_0: 0(+1)/1    Stage-1_0: 0/1    
2028-11-05 00:15:31,267    Stage-0_0: 1/1 Finished    Stage-1_0: 0(+1)/1    
2028-11-05 00:15:32,283    Stage-0_0: 1/1 Finished    Stage-1_0: 1/1 Finished    
Status: Finished successfully in 21.16 seconds
OK
4
Time taken: 44.595 seconds, Fetched: 1 row(s)
hive>
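The progress lines in the log follow the format Hive itself documents just above them (`CurrentTime StageId_StageAttemptId: SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount`). As an illustration of that format (this parser is not part of Hive), the per-stage counters can be extracted like so:

```python
import re

# Matches one stage cell such as "Stage-0_0: 0(+1)/1" or "Stage-1_0: 1/1 Finished".
STAGE_RE = re.compile(
    r"Stage-(?P<stage>\d+)_(?P<attempt>\d+):\s+"
    r"(?P<succeeded>\d+)(?:\(\+(?P<running>\d+)(?:-(?P<failed>\d+))?\))?/(?P<total>\d+)"
)

def parse_progress(line):
    """Return a list of per-stage counter dicts from one Hive on Spark progress line."""
    stages = []
    for m in STAGE_RE.finditer(line):
        stages.append({
            "stage": int(m.group("stage")),
            "succeeded": int(m.group("succeeded")),
            "running": int(m.group("running") or 0),
            "failed": int(m.group("failed") or 0),
            "total": int(m.group("total")),
        })
    return stages

line = "2028-11-05 00:15:31,267    Stage-0_0: 1/1 Finished    Stage-1_0: 0(+1)/1"
print(parse_progress(line))
# → [{'stage': 0, 'succeeded': 1, 'running': 0, 'failed': 0, 'total': 1},
#    {'stage': 1, 'succeeded': 0, 'running': 1, 'failed': 0, 'total': 1}]
```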

The jobs are also visible in the YARN ResourceManager web UI:

Copyright notice: this is an original post by the author; reproduction without the author's permission is prohibited.
