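orasa00 Process Consumes a Large Amount of Memory During Instance Startup and Causes the System to Hang (Doc ID 2610725.1)
Applies to:
Oracle Database - Enterprise Edition - Version 12.1.0.2 and later
Information in this document applies to any platform.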
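1. Symptoms
In Oracle 12c and later releases, some processes enter the D state and exhaust memory during instance startup.
During instance startup, messages like the following show that processes failed to start, that jobq slave processes could not be spawned, and so on.
For example, from alert_racdb1.log under <oracle base>/diag/rdbms/racdb/racdb1/trace: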
Fri Jan 18 14:58:44 2019
Decreasing number of real time LMS from 5 to 0
Process m000 died, see its trace file
Errors in file /u01/app/oracle/diag/rdbms/racdb/racdb1/trace/racdb1_cjq0_8703.trc (incident=58598):
ORA-00445: background process "J000" did not start after 30 seconds
Incident details in: /u01/app/oracle/diag/rdbms/racdb/racdb1/incident/incdir_58598/racdb1_cjq0_8703_i58598.trc
Process m000 died, see its trace file
Errors in file /u01/app/oracle/diag/rdbms/racdb/racdb1/trace/racdb1_mmon_7591.trc (incident=58030):
ORA-00445: background process "m001" did not start after 120 seconds
Fri Jan 18 15:13:05 2019
Dumping diagnostic data in directory=[cdmp_20190118151304], requested by (instance=1, osid=8703 (CJQ0)), summary=[incident=58598].
Incident details in: /u01/app/oracle/diag/rdbms/racdb/racdb1/incident/incdir_58030/racdb1_mmon_7591_i58030.trc
Fri Jan 18 15:15:19 2019
kkjcre1p: unable to spawn jobq slave process
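The messages above can be picked out of a long alert log mechanically. The following is a minimal sketch, assuming the alert log has already been read into a list of lines; the function name scan_alert_log is hypothetical, and the patterns are taken only from the excerpt above.

```python
import re

# Pattern for the ORA-00445 lines shown in the alert log excerpt.
ORA_00445 = re.compile(
    r'ORA-00445: background process "(\w+)" did not start after (\d+) seconds'
)
# Message written when a jobq slave cannot be spawned.
SPAWN_FAIL = "unable to spawn jobq slave process"


def scan_alert_log(lines):
    """Return (failed_processes, spawn_failures) found in alert log lines.

    failed_processes is a list of (process_name, timeout_seconds) tuples.
    """
    failed = []
    spawn_failures = 0
    for line in lines:
        match = ORA_00445.search(line)
        if match:
            failed.append((match.group(1), int(match.group(2))))
        elif SPAWN_FAIL in line:
            spawn_failures += 1
    return failed, spawn_failures


# Sample lines copied from the excerpt above.
sample_alert = [
    'ORA-00445: background process "J000" did not start after 30 seconds',
    'ORA-00445: background process "m001" did not start after 120 seconds',
    "kkjcre1p: unable to spawn jobq slave process",
]
alert_result = scan_alert_log(sample_alert)
print(alert_result)
```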
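During instance startup, top shows the orasa00 process consuming a large amount of memory and entering the D state because of the memory shortage. The system then hangs and other processes are blocked.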
$top
top - 00:32:14 up 4 days, 16:35, 7 users, load average: 0.90, 1.04, 1.88
Tasks: 666 total, 2 running, 663 sleeping, 0 stopped, 1 zombie
Cpu(s): 0.4%us, 3.0%sy, 0.0%ni, 96.5%id, 0.1%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 264691032k total, 130006720k used, 134684312k free, 2111200k buffers
Swap: 12861420k total, 0k used, 12861420k free, 121138680k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
8134 oracle 20 0 60.3g 29g 29g R 98.3 11.7 0:18.41 ora_sa00_racdb <<<<<<<<<<
8202 oracle 20 0 60.3g 178m 171m S 1.4 0.1 0:01.56 oracle_8202_racdb
8142 oracle 20 0 60.4g 110m 84m S 0.8 0.0 0:00.46 ora_dia0_racdb
ps output during instance startup; the sa00 process is in the 'D' state:
oracle 28838 1 19 13.4 24.2 159212608 128095476 lock_p D 18:38:20 00:12:17 ora_sa00_<INSTANCENAME>
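A check for this condition could be sketched as follows. This is only an illustration under stated assumptions: the note does not give the ps options used, so the column positions (PID in the second field, state in the tenth, command last) are inferred from the single sample line above, and the function name find_d_state_sa00 is hypothetical.

```python
def find_d_state_sa00(ps_lines, pid_index=1, state_index=9):
    """Return (pid, command) for ora_sa00_* processes in the D state.

    Assumption: whitespace-separated columns laid out like the sample
    line in this note; adjust the indexes for other ps formats.
    """
    hits = []
    for line in ps_lines:
        fields = line.split()
        if len(fields) <= state_index:
            continue  # too short to be a process line in this layout
        command = fields[-1]
        state = fields[state_index]
        # "D" is uninterruptible sleep; ps may append flags such as "+".
        if command.startswith("ora_sa00_") and state.startswith("D"):
            hits.append((int(fields[pid_index]), command))
    return hits


# Sample line copied from the ps output above.
sample_ps = [
    "oracle 28838 1 19 13.4 24.2 159212608 128095476 lock_p D "
    "18:38:20 00:12:17 ora_sa00_<INSTANCENAME>",
]
ps_result = find_d_state_sa00(sample_ps)
print(ps_result)
```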
The operating system message log (/var/log/messages on Linux) reports out of memory:
Jan 18 15:13:05 <hostname> kernel: [36973] 54321 36973 43872285 882 40 1906 0 ora_cjq0_<instancename>
Jan 18 15:13:05 <hostname> kernel: Out of memory: Kill process 12457 (ora_sa00_<instancename>) score 189 or sacrifice child >>>>>>>>>>>>>>>>>>>> OOMemory
Jan 18 15:13:05 <hostname> kernel: Killed process 12457 (ora_sa00_<instancename>) total-vm:175465540kB, anon-rss:140kB, file-rss:1592kB, shmem-rss:112608924kB
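To see how the killed process's memory was distributed, the "Killed process" line can be parsed. This is a minimal sketch, assuming the kernel message format shown above (it differs between kernel versions); the function name parse_oom_kills is hypothetical. In the sample, almost all resident memory is shmem-rss rather than anon-rss, which would be consistent with the process having touched shared memory such as the SGA.

```python
import re

# Format of the "Killed process" line in the excerpt above.
KILLED = re.compile(
    r"Killed process (\d+) \(([^)]+)\) "
    r"total-vm:(\d+)kB, anon-rss:(\d+)kB, file-rss:(\d+)kB, shmem-rss:(\d+)kB"
)


def parse_oom_kills(lines):
    """Return one dict per OOM kill found, with sizes in kB."""
    kills = []
    for line in lines:
        match = KILLED.search(line)
        if not match:
            continue
        pid, name, total_vm, anon, file_rss, shmem = match.groups()
        kills.append({
            "pid": int(pid),
            "name": name,
            "total_vm_kb": int(total_vm),
            "anon_rss_kb": int(anon),
            "file_rss_kb": int(file_rss),
            "shmem_rss_kb": int(shmem),
        })
    return kills


# Sample line copied from the messages log above.
sample_messages = [
    "Jan 18 15:13:05 <hostname> kernel: Killed process 12457 "
    "(ora_sa00_<instancename>) total-vm:175465540kB, anon-rss:140kB, "
    "file-rss:1592kB, shmem-rss:112608924kB",
]
oom_kills = parse_oom_kills(sample_messages)
for kill in oom_kills:
    print(kill["pid"], kill["name"], kill["shmem_rss_kb"], "kB shmem-rss")
```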
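2. Cause
Starting with release 12.1, the default value of pre_page_sga is TRUE. When pre_page_sga = TRUE, SA00 tries to pre-allocate the SGA memory so that processes do not hit a major page fault the first time they access that memory.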
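3. Solution
Set pre_page_sga = false in the database initialization parameter file, then restart the database instance. Setting pre_page_sga = FALSE forces SA00 to skip pre-allocating the SGA memory.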
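Before restarting, it may help to confirm that the setting is actually present. The following is a minimal sketch for a text parameter file (pfile) only, not a binary spfile; the function name pre_page_sga_setting and the sample file contents are hypothetical and are not taken from the note.

```python
def pre_page_sga_setting(pfile_lines):
    """Return the last pre_page_sga value in a text pfile, or None.

    Assumptions: "name=value" entries, "#" starts a comment, and names
    may carry a SID prefix such as "*." or "racdb1.".
    """
    value = None
    for raw in pfile_lines:
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "=" not in line:
            continue
        name, val = line.split("=", 1)
        # Strip an optional SID prefix and compare case-insensitively.
        name = name.strip().lower().rsplit(".", 1)[-1]
        if name == "pre_page_sga":
            value = val.strip().strip("'\"").lower()
    return value


# Hypothetical pfile contents, for illustration only.
sample_pfile = [
    "*.db_name='racdb'",
    "*.pre_page_sga=FALSE  # skip SGA pre-allocation by SA00",
]
setting = pre_page_sga_setting(sample_pfile)
print(setting)
```

A return value of None means the file does not set the parameter, in which case the instance would use the default described in the Cause section.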