A Deep Pitfall in Greenplum's Official Scripts

2018-11-01 10:58 | Original | PostgreSQL
Author: Marvinn

http://www.cndba.cn/Marvinn/article/3105

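gpssh-exkeys and gpseginstall are the official scripts for setting up passwordless SSH trust between hosts and for batch installation. Both share one deep pitfall:

they assume SSH on the default port 22. If SSH listens on any other port, trust setup fails and the installation cannot proceed.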

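At one customer site, port 22 was blocked when we configured a Greenplum cluster and only other ports were open, so trust could never be established. Setting up trust by hand let us skip gpssh-exkeys, but the gpseginstall batch installation still failed. The other batch tool, gpssh, did work, which showed that the manual trust setup was fine. We asked the customer to open port 22; they replied that intranet restrictions probably made that impossible and asked us to find another way.

With no alternative, I gritted my teeth and read the official gpssh-exkeys and gpseginstall scripts, starting from the error messages. They use Python's paramiko to run commands remotely. With one modification, the Greenplum batch installation works. gpssh-exkeys still fails, but that can be ignored as long as the installation succeeds, because that script only sets up trust and trust had already been set up by hand.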
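Before suspecting passwords, it can help to confirm which port each host actually accepts SSH connections on. The following is a minimal sketch using only the Python 3 standard library; the function name `port_open` is an illustrative assumption and is not part of Greenplum.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable hosts, timeouts and DNS failures.
        return False
```

For example, `port_open("gpdb4", 22)` returning False while the same call with the site's real SSH port returns True would point at the port rather than the password.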

Batch installation error

After trust was set up by hand, the batch installation failed:

[root@master commands]# source /gp/greenplum-db/greenplum_path.sh 
[root@master commands]# gpseginstall -f /gp/conf/hostlist -u gpadmin -p gpadmin
20181031:16:49:14:008413 gpseginstall:master:root-[INFO]:-Installation Info:
link_name greenplum-db
binary_path /gp/greenplum
binary_dir_location /gp
binary_dir_name greenplum
20181031:16:49:14:008413 gpseginstall:master:root-[INFO]:-check cluster password access
  *** Enter password for gpdb4: 
[Errno 111] Connection refused
  *** Enter password for gpdb4: 
[Errno 111] Connection refused
  *** Enter password for gpdb4: 
[Errno 111] Connection refused
  *** Enter password for gpdb4: 
[Errno 111] Connection refused
  *** Enter password for gpdb4: 
[Errno 111] Connection refused
20181031:16:49:42:008413 gpseginstall:master:root-[ERROR]:-could not successfully access all machines
20181031:16:49:42:008413 gpseginstall:master:root-[ERROR]:-trace: Did not get a valid password for host gpdb4
20181031:16:49:42:008413 gpseginstall:master:root-[CRITICAL]:-early exit from gpseginstall

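The password prompt repeats five times and never gets through, ending with the error above, even though the password was typed correctly. More annoying still, the real cause (the connection failing or being refused) is never reported. That is because /gp/greenplum-db/lib/python/gppylib/commands/base.py does not handle the exception and simply executes pass.

The analysis and fix are as follows:

Inspect the scripts (excerpts)
[root@master bin]# vi /gp/greenplum-db/bin/gpseginstall
Failing step: discover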
def discoverPasswordMap():

    logger.info("check cluster password access")

    global passwordMap

    try:
        passwordMap = NakedExecutionPasswordMap(hosts.keys())
        passwordMap.discover()
    except Exception, e:
        logger.error("could not successfully access all machines")
        msg = e.__str__()
        if msg:
            logger.error("trace: %s" % msg)
        return True

    if passwordMap.complete:
        return False
    else:
        return True

[root@master commands]# vi /gp/greenplum-db/lib/python/gppylib/commands/base.py
The discover function
            # ASK USER
            foundit = False
            for attempt in range(5):
                try:
                    passwd = getpass.getpass('  *** Enter password for %s: ' % (host), sys.stderr)
                    client.connect(host, password=passwd)
                    foundit = True
                    self.mapping[host] = passwd
                    if passwd not in self.unique_passwords:
                        self.unique_passwords.add(passwd)
                    break
                except Exception, e:
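                    #pass
                    print e  # This is why the prompt repeats five times: the port refuses the connection, yet nothing is reported. Comment out pass and print the error to see that the connection is refused.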

            try:
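
The loop above treats every exception as a wrong password. A minimal sketch of a variant that separates network failures from authentication failures is shown below, written for Python 3 rather than the Python 2 used by gppylib. `discover_password`, `AuthError`, and the `connect` and `ask_password` callables are hypothetical stand-ins for illustration; they are not gppylib or paramiko APIs.

```python
class AuthError(Exception):
    """Hypothetical stand-in for an authentication failure."""


def discover_password(connect, host, ask_password, attempts=5):
    """Ask for a password up to `attempts` times.

    A network-level error (OSError, e.g. connection refused) aborts at once,
    because retyping the password cannot fix an unreachable port.
    """
    for _ in range(attempts):
        passwd = ask_password(host)
        try:
            connect(host, passwd)
        except AuthError:
            continue  # wrong password: ask again
        except OSError as exc:
            raise RuntimeError("cannot connect to %s: %s" % (host, exc)) from exc
        return passwd
    raise RuntimeError("Did not get a valid password for host %s" % host)
```

With this shape, a refused connection surfaces after the first attempt instead of after five password prompts.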


Look at the paramiko client's connect
[root@master commands]# vi /gp/greenplum-db/lib/python/paramiko/client.py        
    def connect(self, hostname, port=SSH_PORT, username=None, password=None, pkey=None,
                key_filename=None, timeout=None, allow_agent=True, look_for_keys=True,
                compress=False):
        """
        Connect to an SSH server and authenticate to it.  The server's host key
        is checked against the system host keys (see L{load_system_host_keys})
        and any local host keys (L{load_host_keys}).  If the server's hostname
        is not found in either set of host keys, the missing host key policy
        is used (see L{set_missing_host_key_policy}).  The default policy is
        to reject the key and raise an L{SSHException}.

        Authentication is attempted in the following order of priority:

            - The C{pkey} or C{key_filename} passed in (if any)
            - Any key we can find through an SSH agent
            - Any "id_rsa" or "id_dsa" key discoverable in C{~/.ssh/}
            - Plain username/password auth, if a password was given

from paramiko.agent import Agent
from paramiko.common import *
from paramiko.dsskey import DSSKey
from paramiko.hostkeys import HostKeys
from paramiko.resource import ResourceManager
from paramiko.rsakey import RSAKey
from paramiko.ssh_exception import SSHException, BadHostKeyException
from paramiko.transport import Transport


SSH_PORT = 22

connect uses port 22 by default, but port 22 is closed here. So edit /gp/greenplum-db/lib/python/paramiko/client.py and change SSH_PORT to the port actually in use. This lets the Greenplum batch installation succeed; gpssh-exkeys will still report errors, which can be ignored.

Result after the change
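
Editing the file works because Python evaluates the default `port=SSH_PORT` once, when `connect` is defined as the module is loaded. For the same reason, rebinding `SSH_PORT` from another script after the module is loaded would not change the default; the other option is to pass `port` explicitly at the call site. The sketch below illustrates this behaviour with a stand-in `connect` function (not paramiko's own), and the port 2222 is a placeholder assumption.

```python
import functools

SSH_PORT = 22


def connect(hostname, port=SSH_PORT):
    """Stand-in for a connect() whose default port comes from a module constant."""
    return (hostname, port)


# Rebinding the constant after the function is defined has no effect:
SSH_PORT = 2222
default_result = connect("gpdb4")

# Passing the port explicitly, or pinning it with functools.partial, does work:
explicit_result = connect("gpdb4", port=2222)
connect_custom = functools.partial(connect, port=2222)
pinned_result = connect_custom("gpdb4")
```

Here `default_result` still carries port 22, while `explicit_result` and `pinned_result` carry 2222.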

[root@master commands]# gpseginstall -f /gp/conf/hostlist -u gpadmin -p gpadmin

20181031:16:57:33:008720 gpseginstall:master:root-[INFO]:-Installation Info:
link_name greenplum-db
binary_path /gp/greenplum
binary_dir_location /gp
binary_dir_name greenplum
20181031:16:57:33:008720 gpseginstall:master:root-[INFO]:-check cluster password access
20181031:16:57:34:008720 gpseginstall:master:root-[INFO]:-de-duplicate hostnames
20181031:16:57:34:008720 gpseginstall:master:root-[INFO]:-master hostname: master
20181031:16:57:35:008720 gpseginstall:master:root-[INFO]:-check for user gpadmin on cluster
20181031:16:57:35:008720 gpseginstall:master:root-[INFO]:-add user gpadmin on master
20181031:16:57:35:008720 gpseginstall:master:root-[INFO]:-add user gpadmin on cluster
20181031:16:57:35:008720 gpseginstall:master:root-[INFO]:-chown -R gpadmin:gpadmin /gp/greenplum-db
20181031:16:57:35:008720 gpseginstall:master:root-[INFO]:-chown -R gpadmin:gpadmin /gp/greenplum
20181031:16:57:35:008720 gpseginstall:master:root-[INFO]:-rm -f /gp/greenplum.tar; rm -f /gp/greenplum.tar.gz
20181031:16:57:35:008720 gpseginstall:master:root-[INFO]:-cd /gp; tar cf greenplum.tar greenplum
20181031:16:57:38:008720 gpseginstall:master:root-[INFO]:-gzip /gp/greenplum.tar
20181031:16:58:18:008720 gpseginstall:master:root-[INFO]:-remote command: mkdir -p /gp
20181031:16:58:18:008720 gpseginstall:master:root-[INFO]:-remote command: rm -rf /gp/greenplum
20181031:16:58:19:008720 gpseginstall:master:root-[INFO]:-scp software to remote location
20181031:16:58:22:008720 gpseginstall:master:root-[INFO]:-remote command: gzip -f -d /gp/greenplum.tar.gz
20181031:16:58:30:008720 gpseginstall:master:root-[INFO]:-md5 check on remote location
20181031:16:58:32:008720 gpseginstall:master:root-[INFO]:-remote command: cd /gp; tar xf greenplum.tar
20181031:16:58:33:008720 gpseginstall:master:root-[INFO]:-remote command: rm -f /gp/greenplum.tar
20181031:16:58:34:008720 gpseginstall:master:root-[INFO]:-remote command: cd /gp; rm -f greenplum-db; ln -fs greenplum greenplum-db
20181031:16:58:34:008720 gpseginstall:master:root-[INFO]:-remote command: chown -R gpadmin:gpadmin /gp/greenplum-db
20181031:16:58:35:008720 gpseginstall:master:root-[INFO]:-remote command: chown -R gpadmin:gpadmin /gp/greenplum
20181031:16:58:35:008720 gpseginstall:master:root-[INFO]:-rm -f /gp/greenplum.tar.gz
20181031:16:58:35:008720 gpseginstall:master:root-[INFO]:-Changing system passwords ...
20181031:16:58:36:008720 gpseginstall:master:root-[INFO]:-exchange ssh keys for user root
20181031:16:58:37:008720 gpseginstall:master:root-[INFO]:-Error running cmd: gpssh-exkeys -f /gp/conf/hostlist
20181031:16:58:37:008720 gpseginstall:master:root-[INFO]:-list index out of range
20181031:16:58:37:008720 gpseginstall:master:root-[INFO]:-gppsh-exkeys failed running from within pexpect ... now try outside of pexpect
[STEP 1 of 5] create local ID and authorize on local host
  ... /root/.ssh/id_rsa file exists ... key generation skipped
[STEP 2 of 5] keyscan all hosts and update known_hosts file
[STEP 3 of 5] authorize current user on remote hosts
  ... send to gpdb1
    ***
  *** Enter password for gpdb1: 
[ERROR gpdb1] Unknown server gpdb1
---
  *** Enter password for gpdb1: 
[ERROR gpdb1] Unknown server gpdb1
---
  *** Enter password for gpdb1: 
[ERROR gpdb1] Unknown server gpdb1
---
  *** Enter password for gpdb1: 
[ERROR gpdb1] Unknown server gpdb1
---
  *** Enter password for gpdb1: 
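At this point the Greenplum batch installation is in fact complete. The few remaining steps can be run by hand.


Comparing against the output of an earlier successful installation shows that the following steps are missing; run them by hand:

Remaining commands, part 1: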

20180905:13:53:23:002733 gpseginstall:hadoop02:root-[INFO]:-Changing system passwords ...
20180905:13:53:24:002733 gpseginstall:hadoop02:root-[INFO]:-exchange ssh keys for user root
20180905:13:53:25:002733 gpseginstall:hadoop02:root-[INFO]:-exchange ssh keys for user gpadmin
20180905:13:53:27:002733 gpseginstall:hadoop02:root-[INFO]:-/gp/greenplum-db/./sbin/gpfixuserlimts -f /etc/security/limits.conf -u gpadmin

--------------------------------------------------------------------------------------------------------
1. Fix on the master node

[gpadmin@master ~]$ /gp/greenplum-db/./sbin/gpfixuserlimts -f /etc/security/limits.conf -u gpadmin
Error writing file [Errno 13] Permission denied: '/etc/security/limits.conf.tmp'
[gpadmin@master ~]$ exit
logout
[root@master .ssh]# /gp/greenplum-db/./sbin/gpfixuserlimts -f /etc/security/limits.conf -u gpadmin

---------------------------------------------------------------------------------------------------------

Remaining commands, part 2:
20180905:13:53:27:002733 gpseginstall:hadoop02:root-[INFO]:-remote command: . /gp/greenplum-db/./greenplum_path.sh; /gp/greenplum-db/./sbin/gpfixuserlimts -f /etc/security/limits.conf -u gpadmin
20180905:13:53:27:002733 gpseginstall:hadoop02:root-[INFO]:-version string on master: gpssh version 4.3.8.1 build 1
20180905:13:53:27:002733 gpseginstall:hadoop02:root-[INFO]:-remote command: . /gp/greenplum-db/./greenplum_path.sh; /gp/greenplum-db/./bin/gpssh --version
20180905:13:53:28:002733 gpseginstall:hadoop02:root-[INFO]:-remote command: . /gp/greenplum-db-4.3.8.1/greenplum_path.sh; /gp/greenplum-db-4.3.8.1/bin/gpssh --version
20180905:13:53:33:002733 gpseginstall:hadoop02:root-[INFO]:-SUCCESS -- Requested commands completed


2. Fix on all remote nodes (gpdb1, gpdb2, gpdb3, gpdb4, gpdb5):
[root@gpdb4 data]# source /gp/greenplum-db/./greenplum_path.sh
[root@gpdb4 data]# /gp/greenplum-db/./sbin/gpfixuserlimts -f /etc/security/limits.conf -u gpadmin
[root@gpdb4 data]# /gp/greenplum-db/./bin/gpssh --version
[root@gpdb4 data]# /gp/greenplum/./bin/gpssh --version
gpssh version 4.3.7.3 build 2
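
Running these commands host by host is tedious. A rough sketch of how they could be driven from the master through the system `ssh` client on a non-default port follows. `build_ssh_argv`, `run_on_hosts`, and port 2222 are assumptions made for illustration; the host names and commands are the ones shown above.

```python
import subprocess

HOSTS = ["gpdb1", "gpdb2", "gpdb3", "gpdb4", "gpdb5"]
REMOTE_COMMAND = (
    ". /gp/greenplum-db/./greenplum_path.sh; "
    "/gp/greenplum-db/./sbin/gpfixuserlimts -f /etc/security/limits.conf -u gpadmin; "
    "/gp/greenplum-db/./bin/gpssh --version"
)


def build_ssh_argv(host, command, port=2222, user="root"):
    """Build the argument list for one ssh invocation on a custom port."""
    return ["ssh", "-p", str(port), "%s@%s" % (user, host), command]


def run_on_hosts(hosts, command, port=2222, runner=subprocess.run):
    """Run command on every host in turn and return {host: exit code}."""
    results = {}
    for host in hosts:
        completed = runner(build_ssh_argv(host, command, port=port))
        results[host] = completed.returncode
    return results
```

Calling `run_on_hosts(HOSTS, REMOTE_COMMAND, port=...)` with the site's real SSH port would then report a non-zero exit code for any host where a step failed.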

That completes the batch installation. After the follow-up checks, cluster initialization can proceed; the later steps went smoothly, so the write-up stops here.

Summary:
    As the batch installation output shows, it does nothing more than copy the installation files to every node, create the symlink with ln -fs greenplum greenplum-db, and finally run the commands executed by hand above to make sure the installation succeeded.

Resolving errors in the follow-up check
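
The remote part of that sequence can be written down as a small helper that returns the shell commands in the order they appear in the gpseginstall log. This is only a sketch for reproducing the steps by hand; the function name `remote_install_steps` and its parameters are assumptions, and the defaults mirror the paths in this article.

```python
def remote_install_steps(base_dir="/gp", dir_name="greenplum",
                         link_name="greenplum-db", owner="gpadmin"):
    """Return the remote shell commands, in the order seen in the install log."""
    target = "%s/%s" % (base_dir, dir_name)
    tarball = "%s.tar" % target
    return [
        "mkdir -p %s" % base_dir,
        "rm -rf %s" % target,
        # The gzipped archive is copied to the host (scp) before the next step.
        "gzip -f -d %s.gz" % tarball,
        "cd %s; tar xf %s.tar" % (base_dir, dir_name),
        "rm -f %s" % tarball,
        "cd %s; rm -f %s; ln -fs %s %s" % (base_dir, link_name, dir_name, link_name),
        "chown -R %s:%s %s/%s" % (owner, owner, base_dir, link_name),
        "chown -R %s:%s %s" % (owner, owner, target),
    ]
```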

[root@master commands]# gpcheck -f /gp/conf/hostlist -m master

20181031:17:09:57:009369 gpcheck:master:root-[INFO]:-dedupe hostnames
20181031:17:09:57:009369 gpcheck:master:root-[INFO]:-Detected platform: Generic Linux Cluster
20181031:17:09:57:009369 gpcheck:master:root-[INFO]:-generate data on servers
20181031:17:09:58:009369 gpcheck:master:root-[INFO]:-copy data files from servers
20181031:17:09:58:009369 gpcheck:master:root-[INFO]:-delete remote tmp files
20181031:17:09:58:009369 gpcheck:master:root-[INFO]:-Using gpcheck config file: /gp/greenplum-db/./etc/gpcheck.cnf
20181031:17:09:58:009369 gpcheck:master:root-[ERROR]:-GPCHECK_ERROR host(gpdb4): on device (/dev/sda) blockdev readahead value '65536' does not match expected value '16384'
20181031:17:09:58:009369 gpcheck:master:root-[ERROR]:-GPCHECK_ERROR host(gpdb1): on device (/dev/sda) blockdev readahead value '65536' does not match expected value '16384'
20181031:17:09:58:009369 gpcheck:master:root-[ERROR]:-GPCHECK_ERROR host(gpdb2): on device (/dev/sda) blockdev readahead value '65536' does not match expected value '16384'
20181031:17:09:58:009369 gpcheck:master:root-[ERROR]:-GPCHECK_ERROR host(gpdb3): on device (/dev/sda) blockdev readahead value '65536' does not match expected value '16384'
20181031:17:09:58:009369 gpcheck:master:root-[INFO]:-gpcheck completing...

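The errors appear because the disk readahead is set to 65536 here, while the gpcheck configuration file expects 16384. They can be ignored, or the readahead of /dev/sda can be changed temporarily on all nodes to match the expected value so that the check passes:

blockdev --setra 16384 /dev/sda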

[root@master commands]# 
[root@master commands]# gpcheck -f /gp/conf/hostlist -m master
20181031:17:10:27:009561 gpcheck:master:root-[INFO]:-dedupe hostnames
20181031:17:10:27:009561 gpcheck:master:root-[INFO]:-Detected platform: Generic Linux Cluster
20181031:17:10:27:009561 gpcheck:master:root-[INFO]:-generate data on servers
20181031:17:10:27:009561 gpcheck:master:root-[INFO]:-copy data files from servers
20181031:17:10:28:009561 gpcheck:master:root-[INFO]:-delete remote tmp files
20181031:17:10:28:009561 gpcheck:master:root-[INFO]:-Using gpcheck config file: /gp/greenplum-db/./etc/gpcheck.cnf
20181031:17:10:28:009561 gpcheck:master:root-[INFO]:-GPCHECK_NORMAL
20181031:17:10:28:009561 gpcheck:master:root-[INFO]:-gpcheck completing...
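
On a larger cluster, the readahead errors can be pulled out of saved gpcheck output to see which hosts and devices are affected. The sketch below assumes the error lines have exactly the wording shown in the first gpcheck run; `readahead_mismatches` is a hypothetical helper, not a Greenplum utility.

```python
import re

GPCHECK_READAHEAD = re.compile(
    r"GPCHECK_ERROR host\((?P<host>[^)]+)\): on device \((?P<device>[^)]+)\) "
    r"blockdev readahead value '(?P<actual>\d+)' "
    r"does not match expected value '(?P<expected>\d+)'"
)


def readahead_mismatches(log_text):
    """Return (host, device, actual, expected) tuples found in gpcheck output."""
    return [
        (m.group("host"), m.group("device"),
         int(m.group("actual")), int(m.group("expected")))
        for m in GPCHECK_READAHEAD.finditer(log_text)
    ]
```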

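So when the SSH port is not 22, installing Greenplum is a rather frustrating experience.

Copyright notice: this is an original article by the author and may not be reposted without the author's permission.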
