
Greenplum DB: Configuring and Installing a Distributed 5-Node Database Cluster

2018-10-29 18:20 · Original · PostgreSQL
Author: Marvinn


Recently a customer needed a Greenplum cluster data migration, so I set up an environment to learn it. Greenplum (GP) is a distributed framework built on PostgreSQL, and using it feels similar to MyCAT, the MySQL distributed middleware. The configuration and installation process is covered below in full detail, with the official documentation links attached.

Official GP database documentation: https://gp-docs-cn.github.io/docs/admin_guide/managing/monitor.html

Official GP cluster installation reference: https://gpdb.docs.pivotal.io/4370/prep_os-system-req.html#topic2

Table 1. System Prerequisites for Greenplum Database 4.3

  • xfs required for data storage on SUSE Linux and Red Hat (ext3 supported for root file system) // This point is especially important

  • 150MB per host for Greenplum installation

  • Approximately 300MB per segment instance for meta data
  • Appropriate free space for data with disks at no more than 70% capacity
  • High-speed, local storage

1. Environment Preparation

Operating system: CentOS 6.5, 64-bit

1 master (the primary node in the architecture), 1 standby (the secondary node), and 3 segment hosts: 5 servers in total.

     IP             Hostname
172.16.10.21        master
172.16.10.22        standby
172.16.10.23        gpl1
172.16.10.24        gpl2
172.16.10.25        gpl3

2. System Configuration

2.1 Modify the hostnames

The steps for assigning static IPs to the hosts are omitted.

Change the hostnames of the 5 hosts to master, standby, gpl1, gpl2, and gpl3 respectively:

# vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=master                        //    modified line
GATEWAY=172.16.10.254

# hostname master

Disconnect and reconnect to the server for the new hostname to take effect.

2.2 Modify /etc/hosts

Add the following entries (identical on all 5 servers):

172.16.10.21        master
172.16.10.22        standby
172.16.10.23        gpl1
172.16.10.24        gpl2
172.16.10.25        gpl3
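Rather than typing the same five entries on every server, they can be generated from two parallel lists and appended to /etc/hosts; a minimal sketch using the hostnames and IPs from the table above:

```shell
#!/bin/sh
# Print the /etc/hosts entries for the cluster from two parallel lists.
# Run as root on each host:  sh gen_hosts.sh >> /etc/hosts
gen_hosts() {
    ips="172.16.10.21 172.16.10.22 172.16.10.23 172.16.10.24 172.16.10.25"
    names="master standby gpl1 gpl2 gpl3"
    set -- $names
    for ip in $ips; do
        printf '%-15s %s\n' "$ip" "$1"
        shift
    done
}

gen_hosts
```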

2.3 Disable the firewall and SELinux

Disable the firewall on all 5 servers:

#    service iptables stop
#    chkconfig iptables off
#    vi /etc/sysconfig/selinux
SELINUX=disabled    //    modified line

#    setenforce 0

2.4 Operating system parameters

See the official documentation for the system parameter settings: https://gpdb.docs.pivotal.io/4370/prep_os-system-params.html#topic4


2.4.1 Modify sysctl.conf

(Note: all 5 servers)

#    vi /etc/sysctl.conf

kernel.shmmax = 500000000
kernel.shmmni = 4096
kernel.shmall = 4000000000
kernel.sem = 250 512000 100 2048
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.ip_forward = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.ipv4.ip_local_port_range = 1025 65535
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.overcommit_memory = 2

Apply the settings:
#    sysctl -p

2.4.2 Modify limits.conf

(Note: all 5 servers)

#    vi /etc/security/limits.conf

# End of file
* soft nofile 65536
* hard nofile 65536
* soft nproc 131072
* hard nproc 131072

Note:
On RHEL 6.x, you also need to change 1024 to 131072 in /etc/security/limits.d/90-nproc.conf.
[root@master ~]# cat /etc/security/limits.d/90-nproc.conf
# Default limit for number of user's processes to prevent
# accidental fork bombs.
# See rhbz #432903 for reasoning.

*          soft    nproc     131072                //    modified line
root       soft    nproc     unlimited

2.4.3 Storage

(Note: required on all 5 servers)

An excerpt from the official documentation:

XFS is the preferred file system on Linux platforms for data storage. Greenplum recommends the following XFS mount options:

XFS is the recommended file system for data storage on Linux; GP recommends mounting XFS with the following options:

rw,noatime,inode64,allocsize=16m

To use XFS, install the corresponding rpm package and format the disk device:

# rpm -ivh xfsprogs-2.9.4-4.el5.x86_64.rpm
Or
# yum install xfsprogs
# mkfs.xfs -f /dev/vdb

For example, to mount the XFS device /dev/vdb on the /data directory, the /etc/fstab entry is:

# mkdir /data
#    echo "/dev/vdb /data xfs noatime,inode64,allocsize=16m 0 0" >> /etc/fstab

Remount so the fstab entry takes effect:
# mount -a
or
#    mount -o remount /dev/vdb

Verify:
# mount
/dev/mapper/vg_master-lv_root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw,rootcontext="system_u:object_r:tmpfs_t:s0")
/dev/vda1 on /boot type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
/dev/vdb on /data type xfs (rw,noatime,inode64,allocsize=16m)

2.4.4 Disk I/O scheduler

(Note: required on all 5 servers)

Linux supports several disk I/O schedulers, such as CFQ, anticipatory (AS), and deadline. GP recommends the deadline I/O scheduler:

For example:

# echo schedulername > /sys/block/devname/queue/scheduler

For the current device:

# echo deadline > /sys/block/vdb/queue/scheduler

Check with the following command; output like the example below indicates the correct setting:

# cat /sys/block/vdb/queue/scheduler
    noop anticipatory [deadline] cfq

Modify /boot/grub/grub.conf

To make the I/O scheduler setting persistent, edit the boot parameters in /boot/grub/grub.conf and append "elevator=deadline" to the end of the kernel line. A correctly configured example:

# vi /boot/grub/grub.conf

The contents are as follows:
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /boot/, eg.
#          root (hd0,0)
#          kernel /vmlinuz-version ro root=/dev/mapper/vg_master-lv_root
#          initrd /initrd-[generic-]version.img
#boot=/dev/vda
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title CentOS (2.6.32-431.el6.x86_64)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-431.el6.x86_64 ro root=/dev/mapper/vg_master-lv_root rd_NO_LUKS  KEYBOARDTYPE=pc KEYTABLE=us LANG=en_US.UTF-8 rd_NO_MD rd_LVM_LV=vg_master/lv_swap SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_master/lv_root rd_NO_DM rhgb quiet elevator=deadline   // added part
        initrd /initramfs-2.6.32-431.el6.x86_64.img


 Note: edit this file with care; a wrong change can make the operating system fail to boot.

 Reboot the system
 # reboot

2.4.5 Disk read-ahead sectors (blockdev)

(Note: required on all 5 hosts)

The official documentation recommends 16384, but 65536 is arguably more reasonable. The value is a count of 512-byte sectors to read ahead, so the actual read-ahead size in KB is the blockdev value divided by 2; GP's default blocksize is 32KB, and 65536 sectors (32768 KB) is an exact multiple of it. Check a disk's read-ahead setting:

# blockdev --getra devname

For example:

# blockdev --getra /dev/vdb

256
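The arithmetic behind the 65536 recommendation can be checked quickly (blockdev counts 512-byte sectors, so KB = value / 2):

```shell
# blockdev --setra takes a count of 512-byte sectors.
ra_sectors=65536
ra_kb=$((ra_sectors / 2))           # 512 B per sector => KB = sectors / 2
ra_mb=$((ra_kb / 1024))
gp_block_kb=32                      # Greenplum default blocksize is 32 KB
blocks=$((ra_kb / gp_block_kb))     # whole GP blocks covered per read-ahead

echo "read-ahead = ${ra_kb} KB (${ra_mb} MB) = ${blocks} GP blocks"
```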

To make the read-ahead setting persistent, append the following line to /etc/rc.d/rc.local:

blockdev --setra 65536 /dev/vdb

After the change, reboot the system
#    reboot

Verify the change:
# blockdev --getra /dev/vdb
65536

To change the read-ahead setting temporarily, run:

# blockdev --setra sectors devname

For example:
# blockdev --setra 65536 /dev/vdb

2.4.6 Disable THP

(Note: required on all 5 hosts)

Official documentation reference: https://gpdb.docs.pivotal.io/4370/prep_os-system-params.html#topic3

Disable transparent huge pages (THP). RHEL 6 and later enable THP by default, but THP degrades Greenplum database performance. Add the parameter transparent_hugepage=never to /boot/grub/grub.conf to disable it; a reboot is required.

#    vi /boot/grub/grub.conf

[root@master ~]# cat /boot/grub/grub.conf
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /boot/, eg.
#          root (hd0,0)
#          kernel /vmlinuz-version ro root=/dev/mapper/vg_master-lv_root
#          initrd /initrd-[generic-]version.img
#boot=/dev/vda
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title CentOS (2.6.32-431.el6.x86_64)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-431.el6.x86_64 ro root=/dev/mapper/vg_master-lv_root rd_NO_LUKS  KEYBOARDTYPE=pc KEYTABLE=us LANG=en_US.UTF-8 rd_NO_MD rd_LVM_LV=vg_master/lv_swap SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_master/lv_root rd_NO_DM rhgb quiet elevator=deadline transparent_hugepage=never        // added part
        initrd /initramfs-2.6.32-431.el6.x86_64.img

Reboot the system
# reboot

Verify the change; the following output means THP is disabled:
# cat /sys/kernel/mm/*transparent_hugepage/enabled
always madvise [never]
always madvise [never]
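The active mode is the bracketed word in that file, so a check only needs to pull it out. A small helper sketch (the parsing is ours, not part of any GP tool):

```shell
#!/bin/sh
# Extract the active THP mode, i.e. the bracketed word in a status line
# such as "always madvise [never]" from .../transparent_hugepage/enabled.
thp_mode() {
    echo "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

line="always madvise [never]"
mode=$(thp_mode "$line")
echo "THP mode: $mode"
[ "$mode" = "never" ] || echo "WARNING: THP is still enabled" >&2
```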

Note: RHEL 7.2 / CentOS 7.2 additionally require disabling IPC object removal; other versions do not (the current environment is CentOS 6.5, so no change is needed here). Otherwise gpinitsystem fails with a semaphore error. The fix:

1. Create the gpadmin system user (-r creates a system account, -m creates a home directory)
#    useradd -r -m gpadmin

2. In /etc/systemd/logind.conf, disable RemoveIPC by adding:
RemoveIPC=no

3. As root, restart the systemd-logind service
#    service systemd-logind restart

3. Install Greenplum on the Nodes

3.1 Create the gpadmin user

(Note: every one of the 5 hosts)

#  groupadd -g 530 gpadmin
#  useradd -g 530 -u 530 -m -d /home/gpadmin -s /bin/bash gpadmin

Set the gpadmin user's password to gpadmin
#  passwd gpadmin

3.2 Install dependency packages


#    yum -y install ed openssh-clients gcc gcc-c++  make automake autoconf libtool perl rsync coreutils glib2 lrzsz sysstat e4fsprogs xfsprogs ntp readline-devel zlib zlib-devel unzip

Note: Greenplum depends on ed; without it, initialization fails.

3.4 Create the Greenplum installation directory


(Note: every one of the 5 hosts)

Create the installation directory for the Greenplum software and give ownership to the gpadmin user (run on every host):

# mkdir /gp/greenplum
# chown -R gpadmin:gpadmin /gp

3.5 Install and configure the Greenplum cluster

3.5.1 Install the Greenplum software (master node)

Reference: https://gpdb.docs.pivotal.io/4370/prep_os-install-master.html#topic7

Note: Greenplum installation operations are performed on the master node, so unless stated otherwise, all subsequent steps are run on master.

1. Upload the installation package to /gp/greenplum on the master node and unzip it

[root@master greenplum]# unzip greenplum-db-4.3.7.3-build-2-RHEL5-x86_64.zip

[root@master greenplum]# ll
total 277988
-rwxr-xr-x 1 root root 143442255 Feb 25  2016 greenplum-db-4.3.7.3-build-2-RHEL5-x86_64.bin
-rw-r--r-- 1 root root 141201754 Oct 24 11:45 greenplum-db-4.3.7.3-build-2-RHEL5-x86_64.zip
-rw-r--r-- 1 root root      6997 Feb 25  2016 README_INSTALL

2. Run the installer
[root@master greenplum]# ./greenplum-db-4.3.7.3-build-2-RHEL5-x86_64.bin 

(license output omitted)

Accept the license
********************************************************************************
Do you accept the Pivotal Database license agreement? [yes|no]
********************************************************************************
yes

Choose the installation directory
********************************************************************************
Provide the installation path for Greenplum Database or press ENTER to 
accept the default installation path: /usr/local/greenplum-db-4.3.7.3
*******************************************************************************
/gp/greenplum   

********************************************************************************
Install Greenplum Database into </gp/greenplum>? [yes|no]
********************************************************************************

yes

Extracting product to /gp/greenplum

********************************************************************************
Installation complete.
Greenplum Database is installed in /gp/greenplum

Pivotal Greenplum documentation is available
for download at http://docs.gopivotal.com/gpdb
********************************************************************************
This step extracts the package into /gp/greenplum/ and creates the symbolic link /gp/greenplum-db

Software installation complete.

3. Set gpadmin's environment variables
[root@master greenplum]# su - gpadmin
[gpadmin@master ~]$ vi ~/.bash_profile 
Add the following:
source /gp/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/data/master/gpseg-1        // XFS data directory
export PGHOME=/gp/greenplum
export PGPORT=5432
export PGDATABASE=PG        //    database name

[gpadmin@master ~]$ source ~/.bash_profile 

3.5.2 Install and configure Greenplum on all nodes

1. Set up passwordless SSH trust for the gpadmin user

[root@master greenplum]# su - gpadmin
[gpadmin@master gp]$ mkdir conf
[gpadmin@master gp]$ ll
total 8
drwxrwxr-x  2 gpadmin gpadmin 4096 Oct 26 07:40 conf
drwxr-xr-x 11 gpadmin gpadmin 4096 Oct 26 07:04 greenplum
lrwxrwxrwx  1 root    root      11 Oct 26 07:04 greenplum-db -> ./greenplum

2. Create the trust configuration files
All cluster hosts:
[gpadmin@master gp]$ vi /gp/conf/hostlist

Add the following:
master
standby
gpl1
gpl2
gpl3

All segment hosts:
[gpadmin@master gp]$ vi /gp/conf/seg_host

Add the following:
gpl1
gpl2
gpl3

3. Establish trust between the node servers (automatic key exchange for both root and gpadmin)

gpadmin user: mainly for installing the Greenplum cluster software
[gpadmin@master gp]$ gpssh-exkeys -f /gp/conf/hostlist
[STEP 1 of 5] create local ID and authorize on local host

[STEP 2 of 5] keyscan all hosts and update known_hosts file

[STEP 3 of 5] authorize current user on remote hosts
  ... send to standby
  ***
  *** Enter password for standby:         // enter the gpadmin password (gpadmin)
  ... send to gpl1
  ... send to gpl2
  ... send to gpl3

[STEP 4 of 5] determine common authentication file content

[STEP 5 of 5] copy authentication files to all remote hosts
  ... finished key exchange with standby
  ... finished key exchange with gpl1
  ... finished key exchange with gpl2
  ... finished key exchange with gpl3

[INFO] completed successfully

root user: root trust is mainly for the cluster NTP time synchronization later

[root@master conf]# source /gp/greenplum-db/greenplum_path.sh
[root@master conf]# gpssh-exkeys -f /gp/conf/hostlist
[STEP 1 of 5] create local ID and authorize on local host

[STEP 2 of 5] keyscan all hosts and update known_hosts file

[STEP 3 of 5] authorize current user on remote hosts
  ... send to standby
  ***
  *** Enter password for standby: 
[ERROR standby] bad password
  ***
  *** Enter password for standby: 
  ... send to gpl1
  ... send to gpl2
  ... send to gpl3

[STEP 4 of 5] determine common authentication file content

[STEP 5 of 5] copy authentication files to all remote hosts
  ... finished key exchange with standby
  ... finished key exchange with gpl1
  ... finished key exchange with gpl2
  ... finished key exchange with gpl3

[INFO] completed successfully

4. Batch-install the Greenplum software on all hosts (run as gpadmin on the master node)
[gpadmin@master gp]$ gpseginstall -f /gp/conf/hostlist -u gpadmin -p gpadmin

20181026:08:05:41:002218 gpseginstall:master:gpadmin-[INFO]:-Installation Info:
link_name greenplum-db
binary_path /gp/greenplum
binary_dir_location /gp
binary_dir_name greenplum
                        .....        (intermediate output omitted)    ........
// Seeing this line means the batch installation succeeded
20181026:08:08:12:002218 gpseginstall:master:gpadmin-[INFO]:-SUCCESS -- Requested commands completed 



5. Verify the installation

Run the following; a per-node listing like the example indicates success:

[gpadmin@master gp]$ gpssh -f /gp/conf/hostlist -e ls -l $GPHOME
[   gpl1] ls -l /gp/greenplum-db/.
[   gpl1] total 278264
[   gpl1] drwxr-xr-x 3 gpadmin gpadmin      4096 Feb 25  2016 bin
[   gpl1] drwxr-xr-x 2 gpadmin gpadmin      4096 Feb 25  2016 demo
[   gpl1] drwxr-xr-x 5 gpadmin gpadmin      4096 Feb 25  2016 docs
[   gpl1] drwxr-xr-x 2 gpadmin gpadmin      4096 Feb 25  2016 etc
[   gpl1] drwxr-xr-x 3 gpadmin gpadmin      4096 Feb 25  2016 ext

Remaining output omitted......

3.5.3 Create the storage directories


1. On the master and standby nodes:
Note: the /data directory here is on the XFS-formatted disk

#    mkdir -p /data/master
#    chown -R gpadmin.gpadmin /data

2. On the segment nodes (gpl1, gpl2, gpl3), create the directories and set ownership:
mkdir -p /data/primary
mkdir -p /data/mirror
chown -R gpadmin.gpadmin /data

3.5.4 Configure NTP time synchronization

Official reference: https://gpdb.docs.pivotal.io/4370/prep_os-synch-clocks.html#topic14

1. On the master node, edit /etc/ntp.conf and add the NTP server IP:

server 202.120.2.101     //    (NTP server at the Shanghai Jiao Tong University network center)

2. On the standby node, edit /etc/ntp.conf. Point the first server entry at the master host and the second at the external NTP server:

server master prefer
server 202.120.2.101

3. On all segment nodes, edit /etc/ntp.conf. Point the first server entry at master and the second at standby:

server master prefer
server standby
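The three-tier layout above (master → external server, standby → master then external, segments → master then standby) can be generated per role with a small case statement; a sketch using this cluster's role names:

```shell
#!/bin/sh
# Print the ntp.conf "server" lines for a host, given its role in the cluster.
ntp_servers() {
    case "$1" in
        master)  echo "server 202.120.2.101" ;;
        standby) printf 'server master prefer\nserver 202.120.2.101\n' ;;
        *)       printf 'server master prefer\nserver standby\n' ;;  # segment hosts
    esac
}

ntp_servers gpl1
```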

4. Start the ntpd service on all nodes
# service ntpd start
Starting ntpd:                                             [  OK  ]

# service ntpd status
ntpd (pid  18705) is running...

# chkconfig ntpd on

5. On the master host, use the NTP daemon to synchronize the system clocks of all hosts, for example with gpssh:

[root@master conf]#    source /gp/greenplum-db/greenplum_path.sh            // initialize the Greenplum environment
[root@master conf]# gpssh -f /gp/conf/hostlist -v -e 'ntpd'
[WARN] Reference default values as $MASTER_DATA_DIRECTORY/gpssh.conf could not be found
Using delaybeforesend 0.05 and prompt_validation_timeout 1.0

[Reset ...]
[INFO] login gpl2
[INFO] login master
[INFO] login gpl1
[INFO] login gpl3
[INFO] login standby
[   gpl2] ntpd
[ master] ntpd
[   gpl1] ntpd
[   gpl3] ntpd
[standby] ntpd
[INFO] completed successfully

[Cleanup...]

3.5.5 Validate the installation environment

Official reference: https://gpdb.docs.pivotal.io/4370/install_guide/validate.html#topic1

1. Validate the system parameter settings
[root@master conf]#    source /gp/greenplum-db/greenplum_path.sh            // initialize the Greenplum environment
[root@master conf]# gpcheck -f /gp/conf/hostlist -m master -s standby

An error is reported:
20181029:10:00:11:001477 gpcheck:master:root-[INFO]:-dedupe hostnames
20181029:10:00:11:001477 gpcheck:master:root-[INFO]:-Detected platform: Generic Linux Cluster
20181029:10:00:11:001477 gpcheck:master:root-[INFO]:-generate data on servers
20181029:10:00:13:001477 gpcheck:master:root-[INFO]:-copy data files from servers
20181029:10:00:13:001477 gpcheck:master:root-[INFO]:-delete remote tmp files
20181029:10:00:14:001477 gpcheck:master:root-[INFO]:-Using gpcheck config file: /gp/greenplum-db/./etc/gpcheck.cnf
20181029:10:00:14:001477 gpcheck:master:root-[ERROR]:-GPCHECK_ERROR host(standby): XFS filesystem on device /dev/vdb has 5 XFS mount options and 4 are expected
20181029:10:00:14:001477 gpcheck:master:root-[ERROR]:-GPCHECK_ERROR host(gpl1): XFS filesystem on device /dev/vdb has 5 XFS mount options and 4 are expected
20181029:10:00:14:001477 gpcheck:master:root-[ERROR]:-GPCHECK_ERROR host(gpl3): XFS filesystem on device /dev/vdb has 5 XFS mount options and 4 are expected
20181029:10:00:14:001477 gpcheck:master:root-[ERROR]:-GPCHECK_ERROR host(gpl2): XFS filesystem on device /dev/vdb has 5 XFS mount options and 4 are expected
20181029:10:00:14:001477 gpcheck:master:root-[INFO]:-gpcheck completing...

This happens because the gpcheck tool validates against the config file /gp/greenplum-db/./etc/gpcheck.cnf
[root@master conf]# cat /gp/greenplum-db/./etc/gpcheck.cnf
[global]
configfile_version = 3

[linux]
xfs_mount_options = rw,noatime,inode64,allocsize=16m

.....(partial output omitted)......

Cause:
gpcheck expects the XFS mount options rw,noatime,inode64,allocsize=16m, but /etc/fstab mounts /dev/vdb with rw,nodev,noatime,inode64,allocsize=16m; the mismatch triggers the error.

Fix (either one):
1. Change the check config file to xfs_mount_options = rw,nodev,noatime,inode64,allocsize=16m
2. Change /etc/fstab to rw,noatime,inode64,allocsize=16m and remount

Here we take the second option: change the mount options on every host and remount
# vi /etc/fstab
Change the line
/dev/vdb /data xfs nodev,noatime,inode64,allocsize=16m 0 0
to
/dev/vdb /data xfs noatime,inode64,allocsize=16m 0 0

Remount
#    mount -o remount /dev/vdb


Check again; this time it passes
[root@master ~]# gpcheck -f /gp/conf/hostlist -m master -s standby
20181029:10:15:00:001633 gpcheck:master:root-[INFO]:-dedupe hostnames
20181029:10:15:00:001633 gpcheck:master:root-[INFO]:-Detected platform: Generic Linux Cluster
20181029:10:15:00:001633 gpcheck:master:root-[INFO]:-generate data on servers
20181029:10:15:01:001633 gpcheck:master:root-[INFO]:-copy data files from servers
20181029:10:15:01:001633 gpcheck:master:root-[INFO]:-delete remote tmp files
20181029:10:15:02:001633 gpcheck:master:root-[INFO]:-Using gpcheck config file: /gp/greenplum-db/./etc/gpcheck.cnf
20181029:10:15:02:001633 gpcheck:master:root-[INFO]:-GPCHECK_NORMAL
20181029:10:15:02:001633 gpcheck:master:root-[INFO]:-gpcheck completing...
[root@master ~]# 


2. Check hardware performance

    Network test options: parallel (-r N), serial (-r n), and matrix (-r M). The test streams data from the current host to each remote host for 5 seconds. By default, data is sent to every remote host in parallel, and the minimum, maximum, average, and median transfer rates are reported in MB/s. If the overall rate is lower than expected (under 100 MB/s), rerun serially with -r n to get per-host results. The matrix test (-r M) makes every host send and receive data with every other host; it verifies that the network layer can sustain a full-matrix workload.

[root@master ~]# gpcheckperf -f /gp/conf/hostlist -r N -d /tmp > subnet1.out
[root@master ~]# vi subnet1.out            // the report is written to the current directory; open it to review

Output omitted....


3. Check disk I/O and memory bandwidth (the user needs read/write permission on the directories)
Since every host mounts the disk at /data, a single -d /data validates disk I/O on all hosts. If the directories differ per host, pass a separate -d path for each host, in the order listed in /gp/conf/hostlist.

[root@master ~]# su - gpadmin
[gpadmin@master ~]# gpcheckperf -f  /gp/conf/hostlist -d /data -r ds
or
[gpadmin@master ~]# gpcheckperf -f /gp/conf/hostlist -r ds -D \
                              -d /data/master -d /data/master \
                              -d /data/primary -d /data/mirror

3.5.6 Initialize the Greenplum cluster

Official reference: https://gpdb.docs.pivotal.io/4370/install_guide/init_gpdb.html#topic1

Process:
1. Confirm the preceding steps are complete
2. Create a hostlist containing only the segment host addresses; if the hosts have multiple NICs, list all of them
3. Start from the template configuration file, e.g. cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config /dba_files/gp_files
4. Note: you can configure only the primary segment instances at initialization and deploy the mirror segment instances later with gpaddmirrors

1. Create the Greenplum database configuration file (as the gpadmin user on the master node)

[root@master data]# su - gpadmin
[gpadmin@master ~]$ cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config  /home/gpadmin/gpinitsystem_config
[gpadmin@master ~]$ chmod 775 /home/gpadmin/gpinitsystem_config

[gpadmin@master ~]$ vi /home/gpadmin/gpinitsystem_config
The configuration file is as follows (a detailed explanation of the parameters appears later):

ARRAY_NAME="EMC Greenplum DW"
SEG_PREFIX=gpseg
PORT_BASE=40000
declare -a DATA_DIRECTORY=(/data/primary /data/primary /data/primary)
MASTER_HOSTNAME=master
MASTER_DIRECTORY=/data/master
MASTER_PORT=5432
TRUSTED_SHELL=ssh
CHECK_POINT_SEGMENTS=8
ENCODING=UNICODE
MIRROR_PORT_BASE=50000
REPLICATION_PORT_BASE=41000
MIRROR_REPLICATION_PORT_BASE=51000
declare -a MIRROR_DATA_DIRECTORY=(/data/mirror /data/mirror /data/mirror)
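A sanity check worth doing on this file: the number of entries in DATA_DIRECTORY is the number of primary instances per segment host, and multiplied by the number of hosts in seg_host it gives the total primary count, which is why the initialization output lists gpseg0 through gpseg8 here. A sketch (bash):

```shell
#!/bin/bash
# Entries in DATA_DIRECTORY = primary instances per segment host;
# total primaries = per-host count * number of segment hosts.
declare -a DATA_DIRECTORY=(/data/primary /data/primary /data/primary)
seg_hosts=3                          # gpl1 gpl2 gpl3

per_host=${#DATA_DIRECTORY[@]}
total=$((per_host * seg_hosts))
echo "primaries: ${total} (gpseg0 .. gpseg$((total - 1)))"
```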

2. Initialize the cluster
[root@master data]# su - gpadmin

Initialize the Greenplum cluster; the -s option adds the standby node
[gpadmin@master conf]$ gpinitsystem -c /home/gpadmin/gpinitsystem_config -h /gp/conf/seg_host -s standby

20181029:16:33:01:009588 gpinitsystem:master:gpadmin-[INFO]:-Checking configuration parameters, please wait...

..........................................(intermediate output omitted)...........................................................
The output below reflects the configuration file settings...

Primary segment configuration:
20181029:16:33:15:009588 gpinitsystem:master:gpadmin-[INFO]:----------------------------------------
20181029:16:33:15:009588 gpinitsystem:master:gpadmin-[INFO]:-Greenplum Primary Segment Configuration
20181029:16:33:15:009588 gpinitsystem:master:gpadmin-[INFO]:----------------------------------------
20181029:16:33:15:009588 gpinitsystem:master:gpadmin-[INFO]:-gpl1       /data/primary/gpseg0    40000   2       0       41000
20181029:16:33:15:009588 gpinitsystem:master:gpadmin-[INFO]:-gpl1       /data/primary/gpseg1    40001   3       1       41001
20181029:16:33:15:009588 gpinitsystem:master:gpadmin-[INFO]:-gpl1       /data/primary/gpseg2    40002   4       2       41002
20181029:16:33:15:009588 gpinitsystem:master:gpadmin-[INFO]:-gpl2       /data/primary/gpseg3    40000   5       3       41000
20181029:16:33:15:009588 gpinitsystem:master:gpadmin-[INFO]:-gpl2       /data/primary/gpseg4    40001   6       4       41001
20181029:16:33:15:009588 gpinitsystem:master:gpadmin-[INFO]:-gpl2       /data/primary/gpseg5    40002   7       5       41002
20181029:16:33:15:009588 gpinitsystem:master:gpadmin-[INFO]:-gpl3       /data/primary/gpseg6    40000   8       6       41000
20181029:16:33:15:009588 gpinitsystem:master:gpadmin-[INFO]:-gpl3       /data/primary/gpseg7    40001   9       7       41001
20181029:16:33:15:009588 gpinitsystem:master:gpadmin-[INFO]:-gpl3       /data/primary/gpseg8    40002   10      8       41002

Mirror segment configuration:
20181029:16:33:15:009588 gpinitsystem:master:gpadmin-[INFO]:---------------------------------------
20181029:16:33:15:009588 gpinitsystem:master:gpadmin-[INFO]:-Greenplum Mirror Segment Configuration
20181029:16:33:15:009588 gpinitsystem:master:gpadmin-[INFO]:---------------------------------------
20181029:16:33:15:009588 gpinitsystem:master:gpadmin-[INFO]:-gpl2       /data/mirror/gpseg0     50000   11      0       51000
20181029:16:33:15:009588 gpinitsystem:master:gpadmin-[INFO]:-gpl2       /data/mirror/gpseg1     50001   12      1       51001
20181029:16:33:15:009588 gpinitsystem:master:gpadmin-[INFO]:-gpl2       /data/mirror/gpseg2     50002   13      2       51002
20181029:16:33:15:009588 gpinitsystem:master:gpadmin-[INFO]:-gpl3       /data/mirror/gpseg3     50000   14      3       51000
20181029:16:33:15:009588 gpinitsystem:master:gpadmin-[INFO]:-gpl3       /data/mirror/gpseg4     50001   15      4       51001
20181029:16:33:15:009588 gpinitsystem:master:gpadmin-[INFO]:-gpl3       /data/mirror/gpseg5     50002   16      5       51002
20181029:16:33:15:009588 gpinitsystem:master:gpadmin-[INFO]:-gpl1       /data/mirror/gpseg6     50000   17      6       51000
20181029:16:33:15:009588 gpinitsystem:master:gpadmin-[INFO]:-gpl1       /data/mirror/gpseg7     50001   18      7       51001
20181029:16:33:15:009588 gpinitsystem:master:gpadmin-[INFO]:-gpl1       /data/mirror/gpseg8     50002   19      8       51002
Continue with Greenplum creation Yy/Nn>
y
20181029:16:33:29:009588 gpinitsystem:master:gpadmin-[INFO]:-Building the Master instance database, please wait...
20181029:16:34:50:009588 gpinitsystem:master:gpadmin-[INFO]:-Starting the Master in admin mode
20181029:16:35:00:009588 gpinitsystem:master:gpadmin-[INFO]:-Commencing parallel build of primary segment instances
20181029:16:35:00:009588 gpinitsystem:master:gpadmin-[INFO]:-Spawning parallel processes    batch [1], please wait...
.........
20181029:16:35:01:009588 gpinitsystem:master:gpadmin-[INFO]:-Waiting for parallel processes batch [1], please wait...
.......................................................................................................................
Continue with Greenplum creation Yy/Nn>
y

.......................(remaining output omitted)...........................

If no ERROR is reported, the GP database is installed.

Note:
    Because the configuration file enables segment mirroring and the gpinitsystem command includes -s standby, initialization automatically configured the standby node and the segment mirrors. If -s was omitted, or mirrors were not configured in the file, they can be added afterwards; the commands are in the Additional Items section.

3. Verify that postgres is running
[gpadmin@master ~]$ ps -ef|grep postgres
gpadmin  25613     1  0 16:47 ?        00:00:00 /gp/greenplum/bin/postgres -D /data/master/gpseg-1 -p 5432 -b 1 -z 9 --silent-mode=true -i -M master -C -1 -x 20 -E
gpadmin  25614 25613  0 16:47 ?        00:00:00 postgres: port  5432, master logger process                                                                        
gpadmin  25617 25613  0 16:47 ?        00:00:00 postgres: port  5432, stats collector process                                                                      
gpadmin  25618 25613  0 16:47 ?        00:00:00 postgres: port  5432, writer process                                                                               
gpadmin  25619 25613  0 16:47 ?        00:00:00 postgres: port  5432, checkpoint process                                                                           
gpadmin  25620 25613  0 16:47 ?        00:00:00 postgres: port  5432, seqserver process                                                                            
gpadmin  25621 25613  0 16:47 ?        00:00:00 postgres: port  5432, ftsprobe process                                                                             
gpadmin  25622 25613  0 16:47 ?        00:00:00 postgres: port  5432, sweeper process                                                                              
gpadmin  25656 25613  0 16:47 ?        00:00:00 postgres: port  5432, wal sender process gpadmin 172.16.10.22(15110) con6 172.16.10.22(15110) streaming 0/C2350A8  
gpadmin  27657 27428  0 17:13 pts/12   00:00:00 grep postgres

The output above shows that the data directory is /data/master/gpseg-1

------------------------------------------------------------------------------------------------------------------
Use gpstate to check the current cluster status; it is also an essential troubleshooting tool

gpstart        # start the database
gpstop         # stop the database
gpstate -e     # show mirror status
gpstate -f     # show standby master status
gpstate -s     # show the status of the whole GP cluster
gpstate -i     # show the GP version
gpstate --help # help; shows more gpstate usage
--------------------------------------------------------------------------------------------------------------------

3.5.7 Environment variable changes

Log in as gpadmin
# su - gpadmin 
$ vi ~/.bash_profile
Add the following:
source /gp/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/data/master/gpseg-1        //    data directory
export PGHOME=/gp/greenplum
export PGPORT=5432
export PGDATABASE=PG                                # name of the database to create
export PGUSER=gpadmin

Then source it again
$ source ~/.bash_profile

If there is a standby, copy this configuration to the gpadmin user's environment on the standby node and source it there too. In this case we have a master and a standby, so we copy it to standby. Note: it does not need to be copied to the segment nodes, because administrative operations run on the master or standby that manages the cluster.

3.5.8 Create a database

1. Create a database (on the master node)
Note that database names are case-sensitive
[gpadmin@master ~]$ createdb PG -E utf-8
[gpadmin@master ~]$ psql -d PG
psql (8.2.15)
Type "help" for help.

PG=# select version();
                                                                       version                                                                        
------------------------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 8.2.15 (Greenplum Database 4.3.7.3 build 2) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Feb 25 2016 06:38:30
(1 row)

PG=# create table marvin ( no int primary key,student_name varchar(40),age int);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "marvin_pkey" for table "marvin"
CREATE TABLE

PG=# insert into marvin values(1,'yayun',18);
INSERT 0 1
PG=# insert into marvin values(2,'yunqu',25);
INSERT 0 1

PG=# select * from marvin;
 no | student_name | age 
----+--------------+-----
  1 | yayun        |  18
  2 | yunqu        |  25
(2 rows)

Check how Greenplum distributes the data: gp_segment_id is a hidden column on every table. Tables support two distribution policies, hash distribution (DISTRIBUTED BY) and random distribution (DISTRIBUTED RANDOMLY); if neither is specified at CREATE TABLE time, hash distribution is the default.

PG=# select gp_segment_id,count(*) from marvin group by gp_segment_id;
 gp_segment_id | count 
---------------+-------
             3 |     1
             0 |     1
(2 rows)
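Conceptually, under hash distribution each row's distribution key is hashed and mapped to one of the primary segments, which is why the two rows above land on different segments. A toy illustration of the idea (cksum here is only a stand-in; Greenplum's actual hash function is different):

```shell
#!/bin/sh
# Map a distribution-key value to one of 9 primary segments by hashing it.
# cksum is a stand-in hash for illustration only, not Greenplum's hash.
segments=9
seg_for() {
    key_hash=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
    echo $((key_hash % segments))
}

for name in yayun yunqu; do
    echo "$name -> segment $(seg_for "$name")"
done
```

The same key always hashes to the same segment, which is what makes direct lookups by distribution key cheap.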

3.5.9 Additional items

1. The gpinitsystem_config file explained:

[gpadmin@master ~]$ cat /home/gpadmin/gpinitsystem_config
# FILE NAME: gpinitsystem_config

# Configuration file needed by the gpinitsystem

################################################
#### REQUIRED PARAMETERS
################################################

#### Name of this Greenplum system enclosed in quotes.    // code name of the database system
ARRAY_NAME="EMC Greenplum DW"

#### Naming convention for utility-generated data directories.    //    prefix for segment names
SEG_PREFIX=gpseg

#### Base number by which primary segment port numbers     //    starting port number
#### are calculated.
PORT_BASE=40000

#### File system location(s) where primary segment data directories 
#### will be created. The number of locations in the list dictate
#### the number of primary segments that will get created per
#### physical host (if multiple addresses for a host are listed in 
#### the hostfile, the number of segments will be spread evenly across
#### the specified interface addresses).

The DATA_DIRECTORY parameter specifies the primary segment data directories and thereby how many instances each segment host runs. If multiple network interfaces per segment host are listed in the host file, the instances are spread evenly across them.
In this case the hosts file has 3 segment hosts, gpl1, gpl2, and gpl3, each with a single NIC.
--------------------------------------------------------------------------------------------------------------
declare -a DATA_DIRECTORY=(/data/primary /data/primary /data/primary)

#### OS-configured hostname or IP address of the master host.    //    master hostname
MASTER_HOSTNAME=master

#### File system location where the master data directory     //    master data directory
#### will be created.
MASTER_DIRECTORY=/data/master

#### Port number for the master instance. //    master port
MASTER_PORT=5432

#### Shell utility used to connect to remote hosts.    // remote connection utility
TRUSTED_SHELL=ssh

#### Maximum log file segments between automatic WAL checkpoints.
---------------------------------------------------------------------------------------------------------------------
This sets the checkpoint segment count. Larger checkpoint segments can improve bulk-load performance but lengthen crash recovery time; see the documentation for details. The default is 8;
to be conservative, keep the default. For example, at 16 a checkpoint occurs after 16 WAL files are produced (16*16M bytes of logs if each file is 16M). On server-class hosts with enough resources (>16G RAM, >16 cores), CHECK_POINT_SEGMENTS=256 can be considered.
----------------------------------------------------------------------------------------------------------------------
CHECK_POINT_SEGMENTS=8

#### Default server-side character set encoding.    //    database server character set
ENCODING=UNICODE

################################################
#### OPTIONAL MIRROR PARAMETERS
################################################

#### Base number by which mirror segment port numbers     //    starting port for mirror segments
#### are calculated.
#MIRROR_PORT_BASE=50000

#### Base number by which primary file replication port     //    starting port for primary segment replication
#### numbers are calculated.
#REPLICATION_PORT_BASE=41000

#### Base number by which mirror file replication port     //    starting port for mirror segment replication
#### numbers are calculated. 
#MIRROR_REPLICATION_PORT_BASE=51000

#### File system location(s) where mirror segment data directories 
#### will be created. The number of mirror locations must equal the
#### number of primary locations as specified in the 
#### DATA_DIRECTORY parameter.
//    mirror segment data directories

#declare -a MIRROR_DATA_DIRECTORY=(/data1/mirror /data1/mirror /data1/mirror /data2/mirror /data2/mirror /data2/mirror)


################################################
#### OTHER OPTIONAL PARAMETERS
################################################

#### Create a database of this name after initialization.    // create a database after initialization
#DATABASE_NAME=name_of_database

#### Specify the location of the host address file here instead of
#### with the the -h option of gpinitsystem.
#MACHINE_LIST_FILE=/home/gpadmin/gpconfigs/hostfile_gpinitsystem

2. Add a standby

When the master goes down, the standby is promoted to master automatically. Here is how to add a standby (only needed if the initialization command did not include -s and no standby was configured):

On the standby server, run
[root@standby ~]# mkdir /data/master
[root@standby ~]# chown gpadmin:gpadmin /data/master

On the master server, run
[gpadmin@master ~]$ gpinitstandby -s standby
Enter Y once when prompted

3. Add mirrors

A mirror is a replica of the data. Mirroring matters for data safety, since the odds of two storage nodes failing at the same time are small. If you forgot to configure mirrors in the GP initialization file, add them as follows:

[gpadmin@master ~]$ gpaddmirrors -p 1000
You will be prompted twice for the mirror path during the run: /data/mirror

4. Set access permissions

Edit /data/master/gpseg-1/pg_hba.conf and append client IPs or subnets in the format shown below:

#user define
host    all     all     192.168.1.0/24   trust
host    all     all     127.0.0.1/28    trust

5. Access methods

You can connect with a desktop client (e.g. pgAdmin) or from the command line. Command-line access looks like this (marvin is a user added later):

[gpadmin@master ~]$ psql -d dbname -h hostname -p 5432 -U gpadmin
[gpadmin@master ~]$ psql -d dbname -h hostname -p 5432 -U marvin

This completes the installation of the 5-node Greenplum cluster.

Copyright: this is the author's original article; do not reproduce without permission.
