1 MRP 负载过高说明
在19c Data Guard 环境中,如果CPU_COUNT 较大的情况下,可能会出现MRP 进程过多,导致主机load average 平均负载过高,同时同步较慢的情况。
查询等待事件:
select event,count(*) from v$session group by event order by 2 desc;
SQL> col event for a60
SQL> set pages 200
SQL> select event,count(*) from v$session group by event order by 2 desc;
EVENT COUNT(*)
------------------------------------------------------------ ----------
rdbms ipc message 19
Space Manager: slave idle wait 8
parallel recovery slave next change 4
class slave wait 4
pmon timer 3
SQL*Net message from client 3
DIAG idle wait 2
如下等待事件:parallel recovery slave next change,其数量和CPU 数一致。
观察数据库alert日志,在启动日志应用后,伴随着mrp进程,还启动了4个并行的用于镜像恢复的子进程,这些进程可能会是cpu占用高的进程,且4正好对应服务器逻辑cpu个数。
PR00 (PID:56671): Managed Standby Recovery starting Real Time Apply
max_pdb is 4
2025-07-18T22:37:21.468248+08:00
Parallel Media Recovery started with 4 slaves
[dave@www.cndba.cn 37960098]$ ps -ef|grep ora_pr|grep -v "grep "|wc -l
5
[dave@www.cndba.cn 37960098]$ ps -ef|grep ora_pr|grep -v "grep "
oracle 56671 1 2 22:37 ? 00:01:53 ora_pr00_cndba
oracle 56673 1 1 22:37 ? 00:01:27 ora_pr01_cndba
oracle 56675 1 1 22:37 ? 00:01:20 ora_pr02_cndba
oracle 56677 1 1 22:37 ? 00:01:25 ora_pr03_cndba
oracle 56679 1 1 22:37 ? 00:01:15 ora_pr04_cndba
[dave@www.cndba.cn 37960098]$
SQL> select role,thread#,sequence#,action from v$dataguard_process;
ROLE THREAD# SEQUENCE# ACTION
------------------------------------------------ ---------- ---------- ------------------------
redo transport monitor 0 0 IDLE
log writer 0 0 IDLE
gap manager 0 0 IDLE
redo transport timer 0 0 IDLE
archive local 0 0 IDLE
archive redo 0 0 IDLE
archive redo 0 0 IDLE
archive redo 0 0 IDLE
sync 0 0 IDLE
managed recovery 0 0 IDLE
recovery logmerger 1 45 APPLYING_LOG
recovery apply slave 0 0 APPLYING_LOG
recovery apply slave 0 0 APPLYING_LOG
recovery apply slave 0 0 APPLYING_LOG
recovery apply slave 0 0 APPLYING_LOG
RFS ping 1 45 IDLE
RFS archive 0 0 IDLE
RFS sync 1 45 IDLE
2 解决方法
2.1 解决方法1: [不推荐]
因为Managed Standby Recovery默认等于CPU的数量,所以可以降低cpu_count 参数,用于限制数据库使用cpu的个数。
alter system set cpu_count=48 scope=spfile;
重启数据库生效:
shutdown immediate;
startup;
alter database recover managed standby database disconnect;
2.2 解决方法2: [推荐]
启用MRP时指定并发数:
SQL> alter database recover managed standby database cancel;
SQL> alter database recover managed standby database parallel 2 disconnect;
SQL> select role,thread#,sequence#,action from v$dataguard_process;
ROLE THREAD# SEQUENCE# ACTION
------------------------------------------------ ---------- ---------- ------------------------
redo transport monitor 0 0 IDLE
log writer 0 0 IDLE
gap manager 0 0 IDLE
redo transport timer 0 0 IDLE
archive local 0 0 IDLE
archive redo 0 0 IDLE
archive redo 0 0 IDLE
archive redo 0 0 IDLE
sync 0 0 IDLE
RFS ping 1 55 IDLE
RFS archive 0 0 IDLE
RFS sync 1 55 IDLE
managed recovery 0 0 IDLE
recovery logmerger 1 55 APPLYING_LOG
recovery apply slave 0 0 APPLYING_LOG
recovery apply slave 0 0 APPLYING_LOG
post role transition 0 0 IDLE
17 rows selected.
参考:
Tuning the parallelism to improve Log Apply Rate for a Physical Standby Database (Doc ID 2848979.1)
PHYSICAL: Why does Managed Recovery spawn so many workers? (Doc ID 885872.1)
版权声明:本文为博主原创文章,未经博主允许不得转载。