1 Error symptom
The following error was reported while deploying a GBase 8c distributed cluster:
[dave@www.cndba.cn script]$
[dave@www.cndba.cn script]$ ./gha_ctl install -c gbase -p /home/gbase
"msg":"'NoneType' object has no attribute 'group'"
[dave@www.cndba.cn script]$
By default the GBase installation log is written to /tmp/gha_ctl/gha_ctl.log. The log shows:
2023-09-18 13:40:31 etcd.py __init__ 92 INFO 13362 AbstractEtcdClientWithFailover config:{'host_tuple': (('', 2479), ('', 2479), ('', 2479)), 'username': 'gbase', 'password': 'AidmOwGw$L5%FyQb', 'hosts': ['', '', '']}
2023-09-18 13:40:31 client.py __init__ 186 DEBUG 13362 New etcd client created for
2023-09-18 13:40:31 retry.py from_int 332 DEBUG 13362 Converted retries value: 1 -> Retry(total=1, connect=None, read=None, redirect=0, status=None)
2023-09-18 13:40:31 connectionpool.py _new_conn 227 DEBUG 13362 Starting new HTTP connection (1):
2023-09-18 13:40:31 dcs.py create_connection_patched 103 DEBUG 13362 create_connection_patched err:[Errno 111] Connection refused
2023-09-18 13:40:31 retry.py increment 575 DEBUG 13362 Incremented Retry for (url='/v2/machines'): Retry(total=0, connect=None, read=None, redirect=0, status=None)
2023-09-18 13:40:31 connectionpool.py urlopen 780 WARNING 13362 Retrying (Retry(total=0, connect=None, read=None, redirect=0, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x2b66ea8bb430>: Failed to establish a new connection: [Errno 111] Connection refused')': /v2/machines
2023-09-18 13:40:31 connectionpool.py _new_conn 227 DEBUG 13362 Starting new HTTP connection (2):
2023-09-18 13:40:31 dcs.py create_connection_patched 103 DEBUG 13362 create_connection_patched err:[Errno 111] Connection refused
2023-09-18 13:40:31 etcd.py machines 214 ERROR 13362 Failed to get list of machines from MaxRetryError("HTTPConnectionPool(host='', port=2479): Max retries exceeded with url: /v2/machines (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x2b66ea8bb4f0>: Failed to establish a new connection: [Errno 111] Connection refused'))")
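2 Solution
This error has three possible causes:
- The etcd port is already in use. This can be confirmed with netstat -an.
- The clocks on the cluster nodes are not synchronized.
- The yml configuration file is not correctly formatted.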
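The first cause in the list can also be checked from a script. The following is a minimal sketch, not part of gha_ctl: the helper name port_is_listening is made up for illustration, the port 2479 is taken from the log above, and the host list is a placeholder to replace with the real node addresses.

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value on failure
        # (for example 111, ECONNREFUSED) instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # Placeholder host list (assumption): replace with the cluster node IPs.
    for host in ["127.0.0.1"]:
        state = "in use" if port_is_listening(host, 2479) else "free"
        print(f"{host}:2479 is {state}")
```

If the port reports "in use" before the installation has started, another process is probably holding it. A "free" result matches the "Connection refused" lines in the log, which only say that nothing was listening on that address at the time.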
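In our case the cause was the yml format. At first I edited the IP addresses by hand, and that did not solve the problem. After redoing the configuration and replacing the IP addresses with the %s command in vim, the problem was resolved.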
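The post does not record exactly which formatting slip the manual edit introduced, so the sketch below only guesses at common ones: a tab in the indentation, a non-breaking or full-width space, and a full-width colon typed with a Chinese input method. It also shows a scripted replacement in the spirit of vim's %s substitution. Both function names and the sample yml keys are hypothetical and are not taken from the GBase configuration.

```python
import re


def replace_ip(text: str, old_ip: str, new_ip: str) -> str:
    """Replace every whole occurrence of old_ip with new_ip."""
    # The lookarounds stop 10.0.0.1 from matching inside 10.0.0.10.
    pattern = re.compile(r"(?<![\d.])" + re.escape(old_ip) + r"(?!\d)")
    return pattern.sub(new_ip, text)


def find_format_problems(text: str) -> list[tuple[int, str]]:
    """Return (line number, reason) pairs for suspicious yml lines."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            # YAML does not allow tabs for indentation.
            problems.append((lineno, "tab in indentation"))
        if "\u00a0" in line or "\u3000" in line:
            problems.append((lineno, "non-breaking or full-width space"))
        if "\uff1a" in line:
            problems.append((lineno, "full-width colon"))
    return problems


if __name__ == "__main__":
    # Hypothetical sample; the real file layout is not shown in the post.
    sample = "nodes:\n\t- host: 10.0.0.1\n  - host: 10.0.0.10\n"
    print(find_format_problems(sample))
    print(replace_ip(sample, "10.0.0.1", "192.168.1.5"))
```

Doing the replacement in one pass, whether with vim or a script like this, changes only the address text and leaves the surrounding whitespace alone, which is likely why it avoided the error that the manual edit caused.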