Oracle 19c: Fixing the Permission Error "gipcInternalConnectSync: failed sync request" After Applying the 19.6 RU
Author: dave
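For the past two days I have been testing the 19.6 RU upgrade in a RAC environment. Patching GI and DB together with opatchauto left the CRS file permissions in a bad state, and CRS could no longer start. Following the error messages, I had already fixed one batch of errors, described here:
Oracle 19c RAC PRVG-11960 : Set user ID bit is not set for file (solution)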
https://www.cndba.cn/dave/article/4079
After fixing those, CRS still would not start:
[root@rac1 bin]# crsctl start crs
CRS-41053: checking Oracle Grid Infrastructure for file permission issues
CRS-4124: Oracle High Availability Services startup failed.
CRS-4000: Command Start failed, or completed with errors.
[root@rac1 bin]#
Check the trace file:
[root@rac1 trace]# tail -100f crsctl_8891.trc
Trace file /u01/app/grid/diag/crs/rac1/crs/trace/crsctl_8891.trc
Oracle Database 19c Clusterware Release 19.0.0.0.0 - Production
Version 19.6.0.0.0 Copyright 1996, 2019 Oracle. All rights reserved.
default:535224704: u_set_comp_error: comptype '103' : error '29'
default:535224704: u_set_comp_error: comptype '103' : error '29'
2020-03-14 15:18:26.910*:kgfpm.c@1144: kgfpmInitPatchIter: npatches 4
2020-03-14 15:18:26.910*:kgfpm.c@1176: kgfpmGetNextPatch: patchid[7ffdcdc93e98] 30489227
2020-03-14 15:18:26.910*:kgfpm.c@1176: kgfpmGetNextPatch: patchid[7ffdcdc93e98] 30489632
2020-03-14 15:18:26.910*:kgfpm.c@1176: kgfpmGetNextPatch: patchid[7ffdcdc93e98] 30557433
2020-03-14 15:18:26.910*:kgfpm.c@1176: kgfpmGetNextPatch: patchid[7ffdcdc93e98] 30655595
default:535224704: u_set_comp_error: comptype '103' : error '29'
default:535224704: u_set_comp_error: comptype '103' : error '29'
2020-03-14 15:18:27.007 :GIPCXCPT:535224704: gipcInternalConnectSync: failed sync request, addr 0x55d9c4c72b60 [000000000000014d] { gipcAddress : name 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=OHASD_UI_SOCKET)(GIPCID=00000000-00000000-0))', objFlags 0x0, addrFlags 0x4 }, ret gipcretConnectionRefused (29)
2020-03-14 15:18:27.007 :GIPCXCPT:535224704: gipcConnectSyncF [clscrsconGipcConnect : clscrscon.c : 698]: EXCEPTION[ ret gipcretConnectionRefused (29) ] failed sync connect endp 0x55d9c4c717c0 [000000000000014a] { gipcEndpoint : localAddr 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=)(GIPCID=00000000-00000000-0))', remoteAddr 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=OHASD_UI_SOCKET)(GIPCID=00000000-00000000-0))', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 0, wobj 0x55d9c4c6c2d0, sendp 0x55d9c4c739a0 status 13flags 0xa108071a, flags-2 0x0, usrFlags 0x0 }, addr 0x55d9c4c72b60 [000000000000014d] { gipcAddress : name 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=OHASD_UI_SOCKET)(GIPCID=00000000-00000000-0))', objFlags 0x0, addrFlags 0x4 }, flags 0x0
default:535224704: u_set_comp_error: comptype '103' : error '29'
2020-03-14 15:18:32.017 :GIPCXCPT:535224704: gipcInternalConnectSync: failed sync request, addr 0x55d9c4c73040 [0000000000000186] { gipcAddress : name 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=OHASD_UI_SOCKET)(GIPCID=00000000-00000000-0))', objFlags 0x0, addrFlags 0x4 }, ret gipcretConnectionRefused (29)
2020-03-14 15:18:32.017 :GIPCXCPT:535224704: gipcConnectSyncF [clscrsconGipcConnect : clscrscon.c : 698]: EXCEPTION[ ret gipcretConnectionRefused (29) ] failed sync connect endp 0x55d9c4c80200 [0000000000000183] { gipcEndpoint : localAddr 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=)(GIPCID=00000000-00000000-0))', remoteAddr 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=OHASD_UI_SOCKET)(GIPCID=00000000-00000000-0))', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 0, wobj 0x55d9c4c6c2f0, sendp 0x55d9c4c73e80 status 13flags 0xa108071a, flags-2 0x0, usrFlags 0x0 }, addr 0x55d9c4c73040 [0000000000000186] { gipcAddress : name 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=OHASD_UI_SOCKET)(GIPCID=00000000-00000000-0))', objFlags 0x0, addrFlags 0x4 }, flags 0x0
default:535224704: u_set_comp_error: comptype '103' : error '29'
2020-03-14 15:18:37.025 :GIPCXCPT:535224704: gipcInternalConnectSync: failed sync request, addr 0x55d9c4c73520 [00000000000001bf] { gipcAddress : name 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=OHASD_UI_SOCKET)(GIPCID=00000000-00000000-0))', objFlags 0x0, addrFlags 0x4 }, ret gipcretConnectionRefused (29)
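The trace is noisy, so it can help to reduce it to the one fact that matters: which IPC key is refusing connections, and how often. The following is a minimal sketch, not an Oracle tool; the function name refused_keys is hypothetical, and it assumes the trace lines have the same layout as the ones shown above.

```shell
# refused_keys (hypothetical helper): count refused GIPC connections per
# IPC key in a Clusterware trace file.
refused_keys() {
    trace_file="$1"
    # Keep only the gipcInternalConnectSync failures, pull out the KEY=...
    # value of the refused address, then count occurrences per key.
    grep 'gipcInternalConnectSync: failed sync request' "$trace_file" \
        | grep 'gipcretConnectionRefused' \
        | sed -n 's/.*(KEY=\([A-Za-z0-9_]*\)).*/\1/p' \
        | sort | uniq -c | awk '{print $2, $1}'
}
```

Called as `refused_keys crsctl_8891.trc`, it prints one line per key in the form `KEY COUNT`.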
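Judging from the error messages, the problem is related to socket files. When CRS starts, it creates its socket files under /var/tmp/.oracle, and these files must not be deleted while RAC is running, otherwise RAC will malfunction. I tried changing the ownership and permissions of /var/tmp/.oracle, but that made no difference, so permissions elsewhere must also be wrong.
[root@rac1 tmp]# chown root:oinstall /var/tmp/.oracle -R
[root@rac1 tmp]# chmod 775 /var/tmp/.oracle -R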
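One way to look for the other wrong permissions is to dump mode and ownership for a directory tree on the healthy node and on the broken node, then compare the two dumps. This is only a sketch under assumptions: perm_manifest and perm_diff are hypothetical helper names, and the code relies on the -printf option of GNU find, which is available on Linux.

```shell
# perm_manifest (hypothetical helper): print "mode owner group path" for
# every entry under a directory, sorted by path.
perm_manifest() {
    # Run in a subshell so the caller's working directory is untouched.
    ( cd "$1" && find . -printf '%m %u %g %p\n' | LC_ALL=C sort -k4 )
}

# perm_diff (hypothetical helper): show only the manifest lines that differ.
# Lines starting with "<" come from the first file, ">" from the second.
perm_diff() {
    diff "$1" "$2" | grep '^[<>]'
}
```

For example, run `perm_manifest /u01/app/19.3.0/grid > /tmp/grid_perms.txt` on each node, copy one file across, and pass both files to perm_diff. Entries that exist on only one node also show up, so expect some noise from log and trace files.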
In the end I resorted to a drastic measure: archive the entire /u01 directory on node 2, delete the /u01 directory on node 1, and then extract the archive on node 1:
[root@rac2 trace]# tar -cvf /u01.tar /u01
[root@rac2 trace]# scp /u01.tar rac1:/
[root@rac1 crs]# tar -xvf /u01.tar -C /
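Since the earlier PRVG-11960 errors were about missing set-user-ID bits, it may be worth confirming that the archive actually carries those bits before relying on it. The sketch below is an assumption-laden helper, not part of the original procedure: archive_setuid_entries is a hypothetical name, and it parses the verbose listing format of tar, so file names containing spaces are not handled.

```shell
# archive_setuid_entries (hypothetical helper): list archive members whose
# permission string contains a setuid or setgid bit ("s" or "S").
archive_setuid_entries() {
    # In "tar -tv" output the first column is the mode string and the
    # last column is the member name.
    tar -tvf "$1" | awk '$1 ~ /[sS]/ {print $1, $NF}'
}
```

Running `archive_setuid_entries /u01.tar` on node 2 should list the setuid binaries of the Grid home if they were archived with their modes intact. GNU tar restores permissions by default when extraction is run as root, which is the case in the transcript above.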
Once the archive has been extracted, run the root.sh script as root on node 1:
[root@rac1 crs]# /u01/app/19.3.0/grid/root.sh
Performing root user operation.
The following environment variables are set as:
ORACLE_OWNER= grid
ORACLE_HOME= /u01/app/19.3.0/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Relinking oracle with rac_on option
Using configuration parameter file: /u01/app/19.3.0/grid/crs/install/crsconfig_params
The log of current session can be found at:
/u01/app/grid/crsdata/rac1/crsconfig/rootcrs_rac1_2020-03-14_03-22-17PM.log
2020/03/14 15:22:32 CLSRSC-594: Executing installation step 1 of 19: 'SetupTFA'.
2020/03/14 15:22:33 CLSRSC-594: Executing installation step 2 of 19: 'ValidateEnv'.
2020/03/14 15:22:33 CLSRSC-363: User ignored prerequisites during installation
2020/03/14 15:22:33 CLSRSC-594: Executing installation step 3 of 19: 'CheckFirstNode'.
2020/03/14 15:22:39 CLSRSC-594: Executing installation step 4 of 19: 'GenSiteGUIDs'.
2020/03/14 15:22:39 CLSRSC-594: Executing installation step 5 of 19: 'SetupOSD'.
2020/03/14 15:22:39 CLSRSC-594: Executing installation step 6 of 19: 'CheckCRSConfig'.
2020/03/14 15:22:41 CLSRSC-594: Executing installation step 7 of 19: 'SetupLocalGPNP'.
2020/03/14 15:22:47 CLSRSC-594: Executing installation step 8 of 19: 'CreateRootCert'.
2020/03/14 15:22:47 CLSRSC-594: Executing installation step 9 of 19: 'ConfigOLR'.
2020/03/14 15:23:03 CLSRSC-594: Executing installation step 10 of 19: 'ConfigCHMOS'.
2020/03/14 15:23:04 CLSRSC-594: Executing installation step 11 of 19: 'CreateOHASD'.
2020/03/14 15:23:10 CLSRSC-594: Executing installation step 12 of 19: 'ConfigOHASD'.
2020/03/14 15:23:10 CLSRSC-330: Adding Clusterware entries to file 'oracle-ohasd.service'
2020/03/14 15:23:31 CLSRSC-4002: Successfully installed Oracle Trace File Analyzer (TFA) Collector.
2020/03/14 15:24:25 CLSRSC-594: Executing installation step 13 of 19: 'InstallAFD'.
2020/03/14 15:25:28 CLSRSC-594: Executing installation step 14 of 19: 'InstallACFS'.
2020/03/14 15:26:41 CLSRSC-594: Executing installation step 15 of 19: 'InstallKA'.
2020/03/14 15:26:47 CLSRSC-594: Executing installation step 16 of 19: 'InitConfig'.
2020/03/14 15:27:07 CLSRSC-594: Executing installation step 17 of 19: 'StartCluster'.
2020/03/14 15:28:05 CLSRSC-343: Successfully started Oracle Clusterware stack
2020/03/14 15:28:05 CLSRSC-594: Executing installation step 18 of 19: 'ConfigNode'.
2020/03/14 15:28:29 CLSRSC-594: Executing installation step 19 of 19: 'PostConfig'.
2020/03/14 15:29:04 CLSRSC-325: Configure Oracle Grid Infrastructure for a Cluster ... succeeded
Error 4 opening dom ASM/Self in 0x43e1820
Domain name to open is ASM/Self
Error 4 opening dom ASM/Self in 0x43e1820
[root@rac1 crs]#
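The root.sh output is long, and the two facts worth checking are the last installation step reached and whether the final CLSRSC-325 "succeeded" message appeared. A small sketch for a saved copy of the console output follows; rootsh_status is a hypothetical name, and the parsing assumes the message layout shown above.

```shell
# rootsh_status (hypothetical helper): report the last installation step
# seen in saved root.sh output and whether CLSRSC-325 reported success.
rootsh_status() {
    awk '
        /CLSRSC-594: Executing installation step/ {
            # Field layout: ... step N of M: NAME
            for (i = 1; i <= NF; i++)
                if ($i == "step") { last = $(i + 1); total = $(i + 3) }
        }
        /CLSRSC-325:.*succeeded/ { ok = 1 }
        END {
            sub(/:$/, "", total)
            printf "steps=%s/%s result=%s\n", last, total, (ok ? "succeeded" : "incomplete")
        }
    ' "$1"
}
```

The file argument is whatever file the console output was saved to; the rootcrs log referenced in the output may use a different layout, so this sketch is not guaranteed to work on it.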
After this, CRS on node 1 was finally back to normal:
[root@rac1 crs]# crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
ora.MGMT.GHCHKPT.advm
               OFFLINE OFFLINE      rac1                     STABLE
               OFFLINE OFFLINE      rac2                     STABLE
ora.chad
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
ora.helper
               OFFLINE OFFLINE      rac1                     IDLE,STABLE
               OFFLINE OFFLINE      rac2                     IDLE,STABLE
ora.mgmt.ghchkpt.acfs
               OFFLINE OFFLINE      rac1                     STABLE
               OFFLINE OFFLINE      rac2                     STABLE
ora.net1.network
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
ora.ons
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
ora.proxy_advm
               OFFLINE OFFLINE      rac1                     STABLE
               OFFLINE OFFLINE      rac2                     STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr(ora.asmgroup)
      1        ONLINE  ONLINE       rac1                     STABLE
      2        ONLINE  ONLINE       rac2                     STABLE
      3        ONLINE  OFFLINE                               STABLE
ora.DATA.dg(ora.asmgroup)
      1        ONLINE  ONLINE       rac1                     STABLE
      2        ONLINE  ONLINE       rac2                     STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rac2                     STABLE
ora.MGMT.dg(ora.asmgroup)
      1        ONLINE  ONLINE       rac1                     STABLE
      2        ONLINE  ONLINE       rac2                     STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       rac2                     169.254.21.159 192.1
                                                             68.222.181,STABLE
ora.OCR.dg(ora.asmgroup)
      1        ONLINE  ONLINE       rac1                     STABLE
      2        ONLINE  ONLINE       rac2                     STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.asm(ora.asmgroup)
      1        ONLINE  ONLINE       rac1                     Started,STABLE
      2        ONLINE  ONLINE       rac2                     Started,STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.asmnet1.asmnetwork(ora.asmgroup)
      1        ONLINE  ONLINE       rac1                     STABLE
      2        ONLINE  ONLINE       rac2                     STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.cndba.db
      1        ONLINE  ONLINE       rac1                     Open,HOME=/u01/app/o
                                                             racle/product/19.3.0
                                                             /db_1,STABLE
      2        ONLINE  ONLINE       rac2                     Open,HOME=/u01/app/o
                                                             racle/product/19.3.0
                                                             /db_1,STABLE
ora.cvu
      1        ONLINE  ONLINE       rac2                     STABLE
ora.mgmtdb
      1        ONLINE  ONLINE       rac2                     Open,STABLE
ora.qosmserver
      1        ONLINE  ONLINE       rac2                     STABLE
ora.rac1.vip
      1        ONLINE  ONLINE       rac1                     STABLE
ora.rac2.vip
      1        ONLINE  ONLINE       rac2                     STABLE
ora.rhpserver
      1        OFFLINE OFFLINE                               STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       rac2                     STABLE
--------------------------------------------------------------------------------
[root@rac1 crs]#
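With this many resources, the rows that deserve attention are the ones whose Target is ONLINE while the State is OFFLINE. The filter below is a rough sketch for picking those out of `crsctl stat res -t` output; crs_mismatches is a hypothetical name, and the parsing assumes the tabular layout shown above, where each resource name starts with "ora." at the beginning of a line.

```shell
# crs_mismatches (hypothetical helper): read "crsctl stat res -t" output on
# stdin and print "resource instance" for rows with Target ONLINE but
# State OFFLINE. Local resource rows have no instance number and print "-".
crs_mismatches() {
    awk '
        /^ora\./ { res = $1; next }   # a resource name line starts a new group
        {
            target = ""; state = ""
            # The first two ONLINE/OFFLINE words on a row are Target and State.
            for (i = 1; i <= NF; i++) {
                if ($i == "ONLINE" || $i == "OFFLINE") {
                    if (target == "") { target = $i } else { state = $i; break }
                }
            }
            if (target == "ONLINE" && state == "OFFLINE") {
                inst = ($1 ~ /^[0-9]+$/) ? $1 : "-"
                print res, inst
            }
        }
    '
}
```

Used as `crsctl stat res -t | crs_mismatches`. Against the listing above it would report only instance 3 of ora.ASMNET1LSNR_ASM.lsnr(ora.asmgroup), which has no server assigned.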
Copyright notice: this is an original article by the author and may not be reproduced without the author's permission.