签到成功

知道了

CNDBA社区CNDBA社区

Hive 3.1.1 环境 搭建手册

2019-03-06 23:30 3194 0 原创 Hive
作者: dave

在之前的博客中,我们了解了Hive的概念和架构,如下:
Hive 概念及架构
https://www.cndba.cn/dave/article/3330

本篇我们了解一下Hive环境的搭建。 http://www.cndba.cn/dave/article/3335

1 安装Hadoop 集群

和其他组件一样,Hive 的运行也依赖Hadoop环境,所以在安装Hive之前,需要先搭建好HDFS环境,具体参考如下博客:

Linux 7.6 平台 Hadoop 3.1.1 集群搭建手册
https://www.cndba.cn/download/dave/6

2 安装配置MySQL数据库

在之前Hive架构的博客中,我们了解Hive的元数据存储在RDBMS库中,我们这里选择MySQL数据库。 可以在Hadoop集群中找一台机器来安装,也可以在其他机器。 MySQL的具体安装步骤可以网上搜索。我们这里不在重复。

安装好MySQL之后,创建Hive的用户和数据库,如下:

mysql> create user 'hive' identified by 'hive';
Query OK, 0 rows affected (0.05 sec)

mysql>  grant all on *.* to 'hive'@'%' identified by 'hive';
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql>  flush privileges;
Query OK, 0 rows affected (0.03 sec)

mysql>  create database hive;
Query OK, 1 row affected (0.01 sec)

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| cndba              |
| employees          |
| hive               |
| mysql              |
| performance_schema |
| sys                |
+--------------------+
7 rows in set (0.00 sec)

mysql>

3 安装配置Hive

3.1 下载Hive

官网:http://hive.apache.org/
我们这里下载的Hive 3.1.1:
https://mirrors.tuna.tsinghua.edu.cn/apache/hive/hive-3.1.1/apache-hive-3.1.1-bin.tar.gzhttp://www.cndba.cn/dave/article/3335http://www.cndba.cn/dave/article/3335http://www.cndba.cn/dave/article/3335

3.2 解压缩Hive

[dave@www.cndba.cn ~]$ tar xzvf apache-hive-3.1.1-bin.tar.gz
[dave@www.cndba.cn ~]$ mv apache-hive-3.1.1-bin hive

3.3 添加环境变量

修改/etc/profile 文件,添加如下内容:

#Hive
export HIVE_HOME=/home/hadoop/hive
export PATH=$PATH:$HIVE_HOME/bin

source /etc/profie  #刷新配置文件

3.4 修改Hive 配置文件:hive-site.xml

通过模板复制配置文件:http://www.cndba.cn/dave/article/3335http://www.cndba.cn/dave/article/3335http://www.cndba.cn/dave/article/3335

[dave@www.cndba.cn conf]$ pwd
/home/hadoop/hive/conf
[dave@www.cndba.cn conf]$ cp hive-default.xml.template hive-site.xml
[dave@www.cndba.cn conf]$

在hive-site.xml中修改如下内容:http://www.cndba.cn/dave/article/3335

<property>
    <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://192.168.1.168:3306/hive?characterEncoding=UTF-8&amp;useSSL=false</value>
    <description>JDBC connect string for a JDBC metastore</description>
</property> 
<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>driver class name for a JDBC metastore</decription>
</property>
<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
    <description>password to use against metastore database</description>
</property> 
<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
    <description>Username to use against metastore database</description>
</property>

<property>
    <name>hive.querylog.location</name>
    <value>/home/hadoop/hive/iotmp</value>
    <description>Local of hive run time structured log file</description>
  </property>
  <property>
    <name>hive.exec.local.scratchdir</name>
    <value>/home/hadoop/hive/iotmp</value>
    <description>Local scratch space for Hive jobs</description>
  </property>
  <property>
    <name>hive.downloaded.resources.dir</name>
    <value>/home/hadoop/hive/iotmp</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
  </property>

注意:http://www.cndba.cn/dave/article/3335

1、Xml文件中不能使用&,要使用他的转义&来代替。

2、Jdbc连接字符串中必须加useSSL=false,否则会报如下错误:
Wed Mar 06 03:18:55 CST 2019 WARN: Establishing SSL connection without server’s identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn’t set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to ‘false’. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
schemaTool completed

3.后面部分路径的参数是为了解决以下问题:
Hive 启动报错java.net.URISyntaxException: Relative path in absolute URI 解决方法
https://www.cndba.cn/dave/article/3334

3.5 将mysql驱动包拷贝到hive的lib目录

下载mysql-connector-java-5.1.47.jar包,上传到hive的lib目录。

http://www.cndba.cn/dave/article/3335

[dave@www.cndba.cn lib]$ pwd
/home/hadoop/hive/lib
[dave@www.cndba.cn lib]$ ll mysql-*
-rw-r--r--. 1 hadoop hadoop 1007502 Aug  7  2018 mysql-connector-java-5.1.47.jar
-rw-r--r--. 1 hadoop hadoop   10476 May  4  2018 mysql-metadata-storage-0.12.0.jar
[dave@www.cndba.cn lib]$

3.6 初始化 hive 数据库

进入hive安装目录bin目录下,执行schematool -dbType mysql -initSchema,如下:

[dave@www.cndba.cn conf]$ schematool -dbType mysql -initSchema
Metastore connection URL:     jdbc:mysql://192.168.1.168:3306/hive?characterEncoding=UTF-8&useSSL=false
Metastore Connection Driver :     com.mysql.jdbc.Driver
Metastore connection User:     hive
Starting metastore schema initialization to 3.1.0
Initialization script hive-schema-3.1.0.mysql.sql
...
Initialization script completed
schemaTool completed
[dave@www.cndba.cn conf]$

mysql> use hive
Database changed
mysql> show tables
    -> ;
+-------------------------------+
| Tables_in_hive                |
+-------------------------------+
| aux_table                     |
| bucketing_cols                |
| cds                           |
| columns_v2                    |
| compaction_queue              |
| completed_compactions         |
| completed_txn_components      |
| ctlgs                         |
| database_params               |
| db_privs                      |
| dbs                           |
| delegation_tokens             |
| func_ru                       |
| funcs                         |
| global_privs                  |
| hive_locks                    |
| i_schema                      |
| idxs                          |
| index_params                  |
| key_constraints               |
| master_keys                   |
| materialization_rebuild_locks |
| metastore_db_properties       |
| min_history_level             |
| mv_creation_metadata          |
| mv_tables_used                |
| next_compaction_queue_id      |
| next_lock_id                  |
| next_txn_id                   |
| next_write_id                 |
| notification_log              |
| notification_sequence         |
| nucleus_tables                |
| part_col_privs                |
| part_col_stats                |
| part_privs                    |
| partition_events              |
| partition_key_vals            |
| partition_keys                |
| partition_params              |
| partitions                    |
| repl_txn_map                  |
| role_map                      |
| roles                         |
| runtime_stats                 |
| schema_version                |
| sd_params                     |
| sds                           |
| sequence_table                |
| serde_params                  |
| serdes                        |
| skewed_col_names              |
| skewed_col_value_loc_map      |
| skewed_string_list            |
| skewed_string_list_values     |
| skewed_values                 |
| sort_cols                     |
| tab_col_stats                 |
| table_params                  |
| tbl_col_privs                 |
| tbl_privs                     |
| tbls                          |
| txn_components                |
| txn_to_write_id               |
| txns                          |
| type_fields                   |
| types                         |
| version                       |
| wm_mapping                    |
| wm_pool                       |
| wm_pool_to_trigger            |
| wm_resourceplan               |
| wm_trigger                    |
| write_set                     |
+-------------------------------+
74 rows in set (0.00 sec)

mysql>

3.7 验证Hive

[dave@www.cndba.cn hive]$ hive
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Hive Session ID = d29bf554-dde2-4543-848c-2d80459bc896

Logging initialized using configuration in jar:file:/home/hadoop/hive/lib/hive-common-3.1.1.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Hive Session ID = 7fba3eb0-f8e5-4126-ad13-ce1157b39e45
hive> CREATE TABLE page_view(viewTime INT, userid BIGINT,
    >      page_url STRING, referrer_url STRING,
    >      ip STRING COMMENT 'IP Address of the User')
    >  COMMENT 'This is the page view table'
    >  PARTITIONED BY(dt STRING, country STRING)
    >  CLUSTERED BY(userid) SORTED BY(viewTime) INTO 32 BUCKETS
    >  ROW FORMAT DELIMITED
    >    FIELDS TERMINATED BY '/001'
    >    COLLECTION ITEMS TERMINATED BY '/002'
    >    MAP KEYS TERMINATED BY '/003'
    >  STORED AS SEQUENCEFILE;
OK
Time taken: 1.509 seconds
hive> show tables;
OK
page_view
Time taken: 0.934 seconds, Fetched: 1 row(s)
hive>

[dave@www.cndba.cn conf]$ hdfs dfs -ls /tmp/hive/hadoop
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Found 6 items
drwx------   - hadoop supergroup          0 2019-03-06 04:10 /tmp/hive/hadoop/4a82203f-55a0-4679-914f-e5ab740a2d2f
drwx------   - hadoop supergroup          0 2019-03-06 04:09 /tmp/hive/hadoop/65b62cc8-aad3-4331-9dcd-8fc78ee6bad8
drwx------   - hadoop supergroup          0 2019-03-06 04:20 /tmp/hive/hadoop/9c6c9eb1-ca02-4f9b-aad9-05b68d3a8209
drwx------   - hadoop supergroup          0 2019-03-06 04:20 /tmp/hive/hadoop/b29a6cba-5ea7-4c50-82f8-6fe099dea4b7
drwx------   - hadoop supergroup          0 2019-03-06 04:10 /tmp/hive/hadoop/b40cfb50-a340-4f5f-b32d-b6688000c0b0
drwx------   - hadoop supergroup          0 2019-03-06 04:09 /tmp/hive/hadoop/ce8f4953-a4b4-47f8-b7ef-4c79bdeebfb8
[dave@www.cndba.cn conf]$

版权声明:本文为博主原创文章,未经博主允许不得转载。

用户评论
* 以下用户言论只代表其个人观点,不代表CNDBA社区的观点或立场
dave

dave

关注

人的一生应该是这样度过的:当他回首往事的时候,他不会因为虚度年华而悔恨,也不会因为碌碌无为而羞耻;这样,在临死的时候,他就能够说:“我的整个生命和全部精力,都已经献给世界上最壮丽的事业....."

  • 2261
    原创
  • 3
    翻译
  • 578
    转载
  • 191
    评论
  • 访问:7983953次
  • 积分:4346
  • 等级:核心会员
  • 排名:第1名
精华文章
    最新问题
    查看更多+
    热门文章
      热门用户
      推荐用户
        Copyright © 2016 All Rights Reserved. Powered by CNDBA · 皖ICP备2022006297号-1·

        QQ交流群

        注册联系QQ