Oracle 10g RAC root.sh: resolving "Failure at final check of Oracle CRS stack. 10"

2016-11-25 15:42  Original  Oracle troubleshooting
Author: dave

 

I. Problem description

 

Environment: Oracle Linux 6.1

Database: 10.2.0.1

 

While installing Oracle 10g RAC, running root.sh on the first node failed with the following error:

 

[root@rac1 ~]# /u01/app/10.2.0/grid/root.sh

WARNING: directory '/u01/app/10.2.0' is not owned by root
WARNING: directory '/u01/app' is not owned by root
WARNING: directory '/u01' is not owned by root
Checking to see if Oracle CRS stack is already configured

Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/u01/app/10.2.0' is not owned by root
WARNING: directory '/u01/app' is not owned by root
WARNING: directory '/u01' is not owned by root
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 1: rac1 rac1-priv rac1
node 2: rac2 rac2-priv rac2
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Now formatting voting device: /dev/raw/raw3
Now formatting voting device: /dev/raw/raw4
Now formatting voting device: /dev/raw/raw5
Format of 3 voting devices complete.
Startup will be queued to init within 90 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.

Failure at final check of Oracle CRS stack.
10

 

II. MOS documents describing this problem:

 

2.1 Document 1:

Root.sh failed at Failure at final check of Oracle CRS stack 10 [ID 725878.1]

 

Case

This particular case is caused by the OS init system not working.

"Failure at final check of Oracle CRS stack. 10" means the CRS daemons did not start up within the 600-second period.

The root.sh script adds CRS-related entries to /etc/inittab, runs "init q", and expects three CRS-related daemon processes to start, e.g.:

init.cssd
init.crsd
init.evmd

With an init system problem, none of these daemon processes are spawned. This causes the CRS process startup to fail, because the CRS processes rely on the daemon processes starting first.

--This indicates a problem with the init system that prevents the processes from starting. It can be verified as follows:


This can be verified by adding a simple entry in /etc/inittab:

test:2:once:/usr/bin/echo "HELLOTEST" > /tmp/test.log


Run "init q" as the root user. If init is working, a file /tmp/test.log should be generated.
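The check described above can be sketched as two small shell helpers. This is only an illustration under stated assumptions: the function names count_crs_inittab_entries and init_test_result are hypothetical, and the first helper simply looks for the three daemon script names listed in the MOS note on active lines of an inittab-style file.

```shell
#!/bin/sh
# Count how many of the three CRS daemon scripts named in the MOS note
# appear on active (non-comment) lines of an inittab-style file.
# Usage: count_crs_inittab_entries [FILE]   (FILE defaults to /etc/inittab)
count_crs_inittab_entries() {
    file="${1:-/etc/inittab}"
    count=0
    for daemon in init.cssd init.crsd init.evmd; do
        # Drop comment lines, then search for the literal script name.
        if grep -v '^[[:space:]]*#' "$file" 2>/dev/null | grep -qF "$daemon"; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Report whether the marker file written by the test inittab entry exists
# and is non-empty.
# Usage: init_test_result [FILE]   (FILE defaults to /tmp/test.log)
init_test_result() {
    if [ -s "${1:-/tmp/test.log}" ]; then
        echo "init processed the test entry"
    else
        echo "init did not process the test entry"
    fi
}
```

On the affected node, calling count_crs_inittab_entries after root.sh shows whether the entries were written, and init_test_result after "init q" shows whether init acted on the test entry. Enterprise Linux 6 releases use Upstart as the init daemon, and Upstart takes only the default runlevel from /etc/inittab, so entries added there would not be spawned. That may explain the failure in this environment, although the MOS note does not say so.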

 

Solution

 

--MOS only gives a solution for AIX, as follows:

Please consult your system administrator to fix the init issue.

The solution here is only valid for the AIX platform:

1. Start the script install_assist (AIX GUI utility Installation Assistance)
2. Update, for example, the date, then exit install_assist properly
3. Reboot the system

After that, the daemon processes in /etc/inittab started and the CRS installation completed.

 

 

2.2 Document 2:

Clusterware Fails To Start During Root.sh -- "Failure at final check of Oracle CRS stack 10" [ID 329450.1]

 

Oracle Clusterware runs as root, but for some operations it needs to run as the oracle user, and it uses "su -l", which invokes the oracle user's shell login/profile script. If that profile script contains interactive or CPU-bound operations or prompts, this may affect Clusterware operation.

--This document says the problem comes from interactive settings in .bash_profile; removing them resolves it.
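One way to look for such settings is to scan the profile for commands that usually wait on a terminal. The sketch below is an assumption-laden illustration: the helper name find_interactive_profile_lines is hypothetical, and the keyword list (read, stty, select, clear, tput) is a guess at typical offenders, not something taken from the MOS note.

```shell
#!/bin/sh
# Print the numbered lines of a shell profile that start with a command
# which typically prompts or needs a terminal.
# The keyword list is illustrative only.
# Usage: find_interactive_profile_lines FILE
find_interactive_profile_lines() {
    grep -nE '^[[:space:]]*(read|stty|select|clear|tput)([[:space:]]|$)' "$1"
}
```

For example, find_interactive_profile_lines /home/oracle/.bash_profile lists candidate lines for review (the home directory path is assumed). As a complementary check, running "su -l oracle -c true" as root should return at once without prompting if the profile is non-interactive.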

 

Other documents:

Troubleshooting 10g or 11.1 Oracle Clusterware Root.sh Problems [ID 240001.1]

 

 

III. Problem analysis

Check the relevant logs:

[oracle@rac1 client]$ pwd

/u01/app/10.2.0/grid/log/rac1/client

[oracle@rac1 client]$ ls

clscfg_6337.log  clsc.log css.log  ocrconfig_6285.log

 

[oracle@rac1 client]$ tail -30 css.log

2012-07-12 23:23:15.565: [CSSCLNT][3681171200]clsssInitNative: connect failed, rc 9

2012-07-12 23:23:16.977: [CSSCLNT][3681171200]clsssInitNative: connect failed, rc 9

2012-07-12 23:23:18.390: [CSSCLNT][3681171200]clsssInitNative: connect failed, rc 9

2012-07-12 23:23:19.885: [CSSCLNT][3681171200]clsssInitNative: connect failed, rc 9

 

[oracle@rac1 client]$ tail -10 clsc.log

Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle. All rights reserved.

2012-07-12 23:24:51.389: [default][4093163264]Terminating clsd session

2012-07-12 23:25:00.274: [default][135894784]Terminating clsd session

 

[oracle@rac1 client]$ tail clscfg_6337.log

Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle. All rights reserved.

2012-07-12 23:12:24.477: [  CLSCFG][1566725888]clscfg: Nodelist is [rac1 rac2 ]

 

[oracle@rac1 rac1]$ cat alertrac1.log

2012-07-12 10:06:10.703

[client(6285)]CRS-1006:The OCR location /dev/raw/raw2 is inaccessible. Details in /u01/app/10.2.0/grid/log/rac1/client/ocrconfig_6285.log.

2012-07-12 10:06:11.076

[client(6285)]CRS-1001:The OCR was formatted using version 2.

2012-07-12 10:12:24.479

[client(6337)]CRS-1801:Cluster crs configured with nodes rac1 rac2 .
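The repeated "clsssInitNative: connect failed, rc 9" lines in css.log are consistent with clients being unable to reach a CSS daemon that never started. A small helper can tally these messages per return code. This is a sketch, and the function name summarize_css_failures is hypothetical.

```shell
#!/bin/sh
# Tally "connect failed" messages in a CSS client log by return code.
# Prints one line per code, e.g. "rc=9 count=4".
# Usage: summarize_css_failures FILE
summarize_css_failures() {
    # Keep only the failure lines, reduce each to its "rc N" suffix,
    # then count identical codes.
    grep 'clsssInitNative: connect failed' "$1" |
        sed 's/.*connect failed, rc \([0-9][0-9]*\).*/rc=\1/' |
        sort | uniq -c | awk '{ print $2 " count=" $1 }'
}
```

Running summarize_css_failures /u01/app/10.2.0/grid/log/rac1/client/css.log on the log shown above gives a one-line summary instead of pages of identical messages.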

 

 

--Run the following command as root on one node to clear the information in the OCR:

[root@rac1 ~]# sh /u01/app/10.2.0/grid/install/rootdeinstall.sh

Removing contents from OCR mirror device
2560+0 records in
2560+0 records out
10485760 bytes (10 MB) copied, 2.46509 s, 4.3 MB/s
Removing contents from OCR device
2560+0 records in
2560+0 records out
10485760 bytes (10 MB) copied, 1.18886 s, 8.8 MB/s

 

 

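Then I ran root.sh again, and the problem remained.

I tried the following:

1.     Disable the firewall

I had already disabled the firewall before the installation, so this was only a check.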

 

[root@rac1 tmp]# service iptables status

iptables: Firewall is not running.

[root@rac1 tmp]# chkconfig iptables --list

iptables        0:off  1:off   2:off   3:off  4:off   5:off   6:off

 

2.     Commented out the contents of the following file:

[root@rac1 tmp]# cat /etc/pam.d/other

#%PAM-1.0

auth    required       pam_deny.so

account required       pam_deny.so

password required       pam_deny.so

session required       pam_deny.so

 

3.     Remove the related socket files

# rm -f /usr/tmp/.oracle/*

# rm -f /tmp/.oracle/*

# rm -f /var/tmp/.oracle/*

Unable To Connect To Cluster Manager Ora-29701 as Network Socket Files are Removed [ID 391790.1]
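Because the title of the MOS note above suggests that removing these socket files under a running stack leads to ORA-29701, it can be worth listing what the rm commands would match before deleting anything. The helper below is a cautious sketch; the name list_oracle_sockets is hypothetical and the default directories are the three used above.

```shell
#!/bin/sh
# List the entries that "rm -f DIR/*" would match in the Oracle socket
# directories, without deleting anything.
# Usage: list_oracle_sockets [DIR ...]
#        (defaults to /usr/tmp/.oracle /tmp/.oracle /var/tmp/.oracle)
list_oracle_sockets() {
    if [ "$#" -eq 0 ]; then
        set -- /usr/tmp/.oracle /tmp/.oracle /var/tmp/.oracle
    fi
    for dir in "$@"; do
        # Skip directories that do not exist on this host.
        [ -d "$dir" ] || continue
        for entry in "$dir"/*; do
            # An unmatched glob stays literal; print only real entries.
            if [ -e "$entry" ] || [ -L "$entry" ]; then
                echo "$entry"
            fi
        done
    done
}
```

If the list is non-empty while CRS processes are still running, stopping them first is the safer order.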

 

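After cleaning up with sh /u01/app/10.2.0/grid/install/rootdeinstall.sh and installing again, the problem remained, so it may be a compatibility issue.

Later I changed the OS to Redhat 5.4 and the installation succeeded, so this is probably a compatibility problem between Oracle 10g and Oracle Linux 6. On Oracle Linux 6 I have tested an Oracle 11.2.0.3 RAC, and that installation had no problems.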
