In an earlier post we covered Hive's concepts and architecture:
Hive 概念及架构
https://www.cndba.cn/dave/article/3330
In this post we walk through setting up a Hive environment.
1 Install the Hadoop Cluster
Like other components in the ecosystem, Hive depends on Hadoop, so before installing Hive you need a working HDFS environment. For details, see the following post:
Linux 7.6 平台 Hadoop 3.1.1 集群搭建手册
https://www.cndba.cn/download/dave/6
2 Install and Configure MySQL
As we saw in the earlier post on Hive's architecture, Hive stores its metadata in an RDBMS; here we use MySQL. It can be installed on one of the Hadoop cluster nodes or on a separate machine. MySQL installation steps are widely documented online, so we won't repeat them here. Note that the GRANT ... IDENTIFIED BY syntax used below works on MySQL 5.7 and earlier; it was removed in MySQL 8.0, where you must create the user first and then grant.
Once MySQL is installed, create the Hive user and database:
mysql> create user 'hive' identified by 'hive';
Query OK, 0 rows affected (0.05 sec)
mysql> grant all on *.* to 'hive'@'%' identified by 'hive';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.03 sec)
mysql> create database hive;
Query OK, 1 row affected (0.01 sec)
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| cndba |
| employees |
| hive |
| mysql |
| performance_schema |
| sys |
+--------------------+
7 rows in set (0.00 sec)
mysql>
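The grant above gives the hive user privileges on every schema. If you prefer least privilege, a grant restricted to the hive schema is enough for the metastore. The sketch below is an assumption about your setup (root access, local server) and degrades to printing the statement when no client is available:

```shell
# Narrower alternative to GRANT ALL ON *.*: restrict the hive user to its own schema.
GRANT_SQL="GRANT ALL ON hive.* TO 'hive'@'%'"
if command -v mysql >/dev/null 2>&1; then
  # Run against the local server as root; falls through if no server is reachable.
  mysql -uroot -e "$GRANT_SQL; FLUSH PRIVILEGES;" || echo "grant not applied: $GRANT_SQL"
else
  echo "mysql client not on PATH; run from a mysql> prompt: $GRANT_SQL"
fi
```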
3 Install and Configure Hive
3.1 Download Hive
Official site: http://hive.apache.org/
We use Hive 3.1.1, downloaded from:
https://mirrors.tuna.tsinghua.edu.cn/apache/hive/hive-3.1.1/apache-hive-3.1.1-bin.tar.gz
3.2 Unpack Hive
[dave@www.cndba.cn ~]$ tar xzvf apache-hive-3.1.1-bin.tar.gz
[dave@www.cndba.cn ~]$ mv apache-hive-3.1.1-bin hive
3.3 Add Environment Variables
Edit /etc/profile and append the following:
#Hive
export HIVE_HOME=/home/hadoop/hive
export PATH=$PATH:$HIVE_HOME/bin
source /etc/profile  # reload the profile
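A quick way to confirm the variables took effect in the current shell (a sketch, using the same values written to /etc/profile above):

```shell
# Re-export the variables for this session (same values as in /etc/profile).
export HIVE_HOME=/home/hadoop/hive
export PATH=$PATH:$HIVE_HOME/bin

# Confirm HIVE_HOME is set and its bin directory is on PATH.
case ":$PATH:" in
  *":$HIVE_HOME/bin:"*) echo "PATH ok" ;;
  *)                    echo "PATH missing $HIVE_HOME/bin" ;;
esac
```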
3.4 Edit the Hive Configuration File: hive-site.xml
Create the configuration file from the template:
[dave@www.cndba.cn conf]$ pwd
/home/hadoop/hive/conf
[dave@www.cndba.cn conf]$ cp hive-default.xml.template hive-site.xml
[dave@www.cndba.cn conf]$
Modify the following entries in hive-site.xml:
<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://192.168.1.168:3306/hive?characterEncoding=UTF-8&amp;useSSL=false</value>
    <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
</property>
<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
    <description>Password to use against metastore database</description>
</property>
<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
    <description>Username to use against metastore database</description>
</property>
<property>
    <name>hive.querylog.location</name>
    <value>/home/hadoop/hive/iotmp</value>
    <description>Location of Hive run time structured log file</description>
</property>
<property>
    <name>hive.exec.local.scratchdir</name>
    <value>/home/hadoop/hive/iotmp</value>
    <description>Local scratch space for Hive jobs</description>
</property>
<property>
    <name>hive.downloaded.resources.dir</name>
    <value>/home/hadoop/hive/iotmp</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
</property>
Notes:
1. A bare & is not valid inside an XML file; use the escaped form &amp; instead.
2. The JDBC connection string must include useSSL=false, otherwise you will see the following warning:
Wed Mar 06 03:18:55 CST 2019 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
3. The directory settings at the end are there to resolve the following issue:
Hive 启动报错java.net.URISyntaxException: Relative path in absolute URI 解决方法
https://www.cndba.cn/dave/article/3334
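One more point on the three iotmp settings: they all point at the same local directory, which may not exist yet; creating it up front avoids permission surprises on first start. A minimal sketch, using the path from hive-site.xml above:

```shell
# Create the local scratch/log directory referenced by hive.querylog.location,
# hive.exec.local.scratchdir and hive.downloaded.resources.dir.
IOTMP=/home/hadoop/hive/iotmp
mkdir -p "$IOTMP" 2>/dev/null || echo "could not create $IOTMP (check permissions)"
```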
3.5 Copy the MySQL Driver into Hive's lib Directory
Download the mysql-connector-java-5.1.47.jar package and upload it to Hive's lib directory.
[dave@www.cndba.cn lib]$ pwd
/home/hadoop/hive/lib
[dave@www.cndba.cn lib]$ ll mysql-*
-rw-r--r--. 1 hadoop hadoop 1007502 Aug 7 2018 mysql-connector-java-5.1.47.jar
-rw-r--r--. 1 hadoop hadoop 10476 May 4 2018 mysql-metadata-storage-0.12.0.jar
[dave@www.cndba.cn lib]$
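If you don't already have the jar locally, it can be pulled from Maven Central. The URL below is an assumption based on the standard mysql:mysql-connector-java coordinates, and the copy path matches this install; the block skips itself gracefully when the tools or directories are missing:

```shell
# Fetch the MySQL JDBC driver and drop it into Hive's lib directory.
JAR=mysql-connector-java-5.1.47.jar
URL="https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.47/$JAR"
if command -v wget >/dev/null 2>&1 && [ -d /home/hadoop/hive/lib ]; then
  wget -q "$URL" && cp "$JAR" /home/hadoop/hive/lib/
else
  echo "skipped: wget or /home/hadoop/hive/lib unavailable; fetch $JAR manually"
fi
```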
3.6 Initialize the Hive Metastore Database
Run schematool -dbType mysql -initSchema (it lives in Hive's bin directory, which is already on PATH), as follows:
[dave@www.cndba.cn conf]$ schematool -dbType mysql -initSchema
Metastore connection URL: jdbc:mysql://192.168.1.168:3306/hive?characterEncoding=UTF-8&useSSL=false
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: hive
Starting metastore schema initialization to 3.1.0
Initialization script hive-schema-3.1.0.mysql.sql
...
Initialization script completed
schemaTool completed
[dave@www.cndba.cn conf]$
Connecting to MySQL confirms that schematool created the metastore tables:
mysql> use hive
Database changed
mysql> show tables
-> ;
+-------------------------------+
| Tables_in_hive |
+-------------------------------+
| aux_table |
| bucketing_cols |
| cds |
| columns_v2 |
| compaction_queue |
| completed_compactions |
| completed_txn_components |
| ctlgs |
| database_params |
| db_privs |
| dbs |
| delegation_tokens |
| func_ru |
| funcs |
| global_privs |
| hive_locks |
| i_schema |
| idxs |
| index_params |
| key_constraints |
| master_keys |
| materialization_rebuild_locks |
| metastore_db_properties |
| min_history_level |
| mv_creation_metadata |
| mv_tables_used |
| next_compaction_queue_id |
| next_lock_id |
| next_txn_id |
| next_write_id |
| notification_log |
| notification_sequence |
| nucleus_tables |
| part_col_privs |
| part_col_stats |
| part_privs |
| partition_events |
| partition_key_vals |
| partition_keys |
| partition_params |
| partitions |
| repl_txn_map |
| role_map |
| roles |
| runtime_stats |
| schema_version |
| sd_params |
| sds |
| sequence_table |
| serde_params |
| serdes |
| skewed_col_names |
| skewed_col_value_loc_map |
| skewed_string_list |
| skewed_string_list_values |
| skewed_values |
| sort_cols |
| tab_col_stats |
| table_params |
| tbl_col_privs |
| tbl_privs |
| tbls |
| txn_components |
| txn_to_write_id |
| txns |
| type_fields |
| types |
| version |
| wm_mapping |
| wm_pool |
| wm_pool_to_trigger |
| wm_resourceplan |
| wm_trigger |
| write_set |
+-------------------------------+
74 rows in set (0.00 sec)
mysql>
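You can also confirm which schema version schematool recorded: the version table shown in the listing above holds a single row. The column name follows the Hive MySQL schema scripts, and the guarded mysql call is a sketch using the connection details from this walkthrough:

```shell
# Query the metastore's recorded schema version (should report 3.1.0 here).
CHECK_SQL="SELECT SCHEMA_VERSION FROM version"
if command -v mysql >/dev/null 2>&1; then
  mysql -uhive -phive -h192.168.1.168 hive -e "$CHECK_SQL;" || echo "metastore not reachable"
else
  echo "mysql client not on PATH; run from a mysql> prompt: $CHECK_SQL"
fi
```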
3.7 Verify Hive
[dave@www.cndba.cn hive]$ hive
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Hive Session ID = d29bf554-dde2-4543-848c-2d80459bc896
Logging initialized using configuration in jar:file:/home/hadoop/hive/lib/hive-common-3.1.1.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Hive Session ID = 7fba3eb0-f8e5-4126-ad13-ce1157b39e45
hive> CREATE TABLE page_view(viewTime INT, userid BIGINT,
> page_url STRING, referrer_url STRING,
> ip STRING COMMENT 'IP Address of the User')
> COMMENT 'This is the page view table'
> PARTITIONED BY(dt STRING, country STRING)
> CLUSTERED BY(userid) SORTED BY(viewTime) INTO 32 BUCKETS
> ROW FORMAT DELIMITED
    > FIELDS TERMINATED BY '\001'
    > COLLECTION ITEMS TERMINATED BY '\002'
    > MAP KEYS TERMINATED BY '\003'
> STORED AS SEQUENCEFILE;
OK
Time taken: 1.509 seconds
hive> show tables;
OK
page_view
Time taken: 0.934 seconds, Fetched: 1 row(s)
hive>
The session scratch directories Hive created are also visible on HDFS:
[dave@www.cndba.cn conf]$ hdfs dfs -ls /tmp/hive/hadoop
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Found 6 items
drwx------ - hadoop supergroup 0 2019-03-06 04:10 /tmp/hive/hadoop/4a82203f-55a0-4679-914f-e5ab740a2d2f
drwx------ - hadoop supergroup 0 2019-03-06 04:09 /tmp/hive/hadoop/65b62cc8-aad3-4331-9dcd-8fc78ee6bad8
drwx------ - hadoop supergroup 0 2019-03-06 04:20 /tmp/hive/hadoop/9c6c9eb1-ca02-4f9b-aad9-05b68d3a8209
drwx------ - hadoop supergroup 0 2019-03-06 04:20 /tmp/hive/hadoop/b29a6cba-5ea7-4c50-82f8-6fe099dea4b7
drwx------ - hadoop supergroup 0 2019-03-06 04:10 /tmp/hive/hadoop/b40cfb50-a340-4f5f-b32d-b6688000c0b0
drwx------ - hadoop supergroup 0 2019-03-06 04:09 /tmp/hive/hadoop/ce8f4953-a4b4-47f8-b7ef-4c79bdeebfb8
[dave@www.cndba.cn conf]$
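Besides the scratch directories under /tmp/hive, the new table gets a data directory under the warehouse path. The default location below assumes hive.metastore.warehouse.dir was left unchanged, and the block falls back to printing the command when hdfs is unavailable:

```shell
# List the warehouse directory where managed tables such as page_view live.
WAREHOUSE=/user/hive/warehouse
if command -v hdfs >/dev/null 2>&1; then
  hdfs dfs -ls "$WAREHOUSE" || echo "could not list $WAREHOUSE"
else
  echo "hdfs not on PATH; run: hdfs dfs -ls $WAREHOUSE"
fi
```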
Copyright notice: this is the blogger's original article and may not be reproduced without the blogger's permission.