签到成功

知道了

CNDBA社区CNDBA社区

Oracle 18c Data Guard ABMR自动坏块修复说明

2018-09-01 13:20 3188 0 原创 Oracle 18c
作者: dave

1. 背景说明

在前面的博客中,我们看了DG有关的操作:
Oracle 18c Data Guard 切换保护模式 和 主备库角色操作手册
https://www.cndba.cn/dave/article/2996http://www.cndba.cn/cndba/dave/article/2998

本篇我们继续看下DG的自动坏块修复功能ABMR(Automatic Block Media Repair)。

当物理备库使用ADG和real-time apply功能就可以使用ABMR功能。从12c 开始, 默认就已启动了Real-time Apply, 具体可以参考:
Oracle 12c 新特性 — DG 默认使用 Real-Time Apply
https://www.cndba.cn/cndba/dave/article/213

在ADG中,如果Oracle 检测到主库或备库的物理坏块,会自动使用对端数据库上的有效副本来修复坏块。 这个操作对数据库和用户是透明的,应用不需要做任何修改。 如果坏块不能修复(比如文件头损坏,单块的修复时间超过60秒,或者坏块的数量超过100个),那么会报ORA-1578的错误,需要DBA干预,进行手工修复。 http://www.cndba.cn/cndba/dave/article/2998

使用ABMR需要配置三个参数:
LOG_ARCHIVE_CONFIG
LOG_ARCHIVE_DEST_n —- 指向主库
FAL_SERVER —- 指向主库

ABMR只能修复常见的物理坏块,对于逻辑坏块,通常建议在主库和备库上启用DB_BLOCK_CHECKING参数来进行预防,当然,启用该参数对DB的性能会有一定的影响,如果主库负载比较大,那么只在备库上启用该参数。

2. 操作实例

DG 环境的搭建参考我的博客:
Oracle 18c Data Guard 搭建手册
https://www.cndba.cn/dave/article/2995http://www.cndba.cn/cndba/dave/article/2998http://www.cndba.cn/cndba/dave/article/2998

18c CDB架构中连接PDB,参考如下博客:
Oracle 18c 通过 service_name 连接PDB 报ORA-12514 错误解决方法
https://www.cndba.cn/dave/article/2997

1) 主库创建测试表
[dave@www.cndba.cn ~]$ sqlplus zhixin/zhixin@dave

SQL*Plus: Release 18.0.0.0.0 - Production on Mon Aug 27 19:42:07 2018
Version 18.3.0.0.0

Copyright (c) 1982, 2018, Oracle.  All rights reserved.

Last Successful login time: Mon Aug 27 2018 18:42:55 -04:00

Connected to:
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
Version 18.3.0.0.0

SQL> create table dave as select * from dba_objects;

Table created.
2)将rowid 转换成file#和block#:
SQL> set pages 200
SQL>select rowid, dbms_rowid.rowid_block_number(rowid) blockno, dbms_rowid.rowid_relative_fno(rowid) fno from dave where rownum < 100;

ROWID              BLOCKNO         FNO
------------------ ---------- ----------
AAAR/yAAMAAAACDAAA      131          12
AAAR/yAAMAAAACDAAB      131          12
AAAR/yAAMAAAACDAAC      131          12
AAAR/yAAMAAAACDAAD      131          12
AAAR/yAAMAAAACDAAE      131          12
AAAR/yAAMAAAACDAAF      131          12
AAAR/yAAMAAAACDAAG      131          12
AAAR/yAAMAAAACDAAH      131          12
AAAR/yAAMAAAACDAAI            131          12
AAAR/yAAMAAAACDAAJ      131          12
AAAR/yAAMAAAACDAAK      131          12

SQL> col file_name for a50
SQL> select file_id,file_name from dba_data_files;

   FILE_ID FILE_NAME
---------- --------------------------------------------------
     9 /u01/app/oracle/oradata/CNDBA/dave/system01.dbf
    10 /u01/app/oracle/oradata/CNDBA/dave/sysaux01.dbf
    11 /u01/app/oracle/oradata/CNDBA/dave/undotbs01.dbf
    12 /u01/app/oracle/oradata/CNDBA/dave/users01.dbf
3) 在主库使用dd 来破坏数据块
[dave@www.cndba.cn ~]$ dd if=/dev/zero of=/u01/app/oracle/oradata/CNDBA/dave/users01.dbf count=1 seek=131 bs=8192 conv=notrunc
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000259948 s, 31.5 MB/s
4). 检查数据文件损坏
[dave@www.cndba.cn trace]$ dbv file=/u01/app/oracle/oradata/CNDBA/dave/users01.dbf blocksize=8192

DBVERIFY: Release 18.0.0.0.0 - Production on Mon Aug 27 19:45:11 2018

Copyright (c) 1982, 2018, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /u01/app/oracle/oradata/CNDBA/dave/users01.dbf
Page 131 is marked corrupt
Corrupt block relative dba: 0x03000083 (file 12, block 131)
Completely zero block found during dbv: 

DBVERIFY - Verification complete

Total Pages Examined         : 1760
Total Pages Processed (Data) : 1420
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 127
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 212
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 1867032 (0.1867032)
[dave@www.cndba.cn trace]$
5) 在主库刷新buffer cache强制从硬盘读数据
SQL> alter system flush buffer_cache;
System altered.

SQL> select count(1) from dave;
  COUNT(1)
----------
    583264

SQL> select count(1) from dave where dbms_rowid.rowid_block_number(rowid)=131 and dbms_rowid.rowid_relative_fno(rowid)=12;

  COUNT(1)
----------
    68

查询正常,我们查看alert log 日志,里面详细记录这个ABMR的过程:http://www.cndba.cn/cndba/dave/article/2998

DAVE(3):Hex dump of (file 12, block 131) in trace file /u01/app/oracle/diag/rdbms/cndba_p/cndba/trace/cndba_ora_32764.trc
DAVE(3):
DAVE(3):Corrupt block relative dba: 0x03000083 (file 12, block 131)
DAVE(3):Completely zero block found during user buffer read
DAVE(3):
DAVE(3):Reading datafile '/u01/app/oracle/oradata/CNDBA/dave/users01.dbf' for corrupt data at rdba: 0x03000083 (file 12, block 131)
DAVE(3):Reread (file 12, block 131) found same corrupt data (no logical check)
DAVE(3):Automatic block media recovery requested for (file# 12, block# 131)
2018-08-27T19:45:48.567238-04:00
Automatic block media recovery successful for (file# 12, block# 131)
2018-08-27T19:45:48.567943-04:00
DAVE(3):Automatic block media recovery successful for (file# 12, block# 131)

#再次dbv检查,恢复正常:
[dave@www.cndba.cn admin]$ dbv file=/u01/app/oracle/oradata/CNDBA/dave/users01.dbf blocksize=8192

DBVERIFY: Release 18.0.0.0.0 - Production on Mon Aug 27 19:48:19 2018

Copyright (c) 1982, 2018, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /u01/app/oracle/oradata/CNDBA/dave/users01.dbf


DBVERIFY - Verification complete

Total Pages Examined         : 1760
Total Pages Processed (Data) : 1421
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 159
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 180
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 1867035 (0.1867035)
6)直接在备库dd测试,然后查看log
[dave@www.cndba.cn ~]$ dd if=/dev/zero of=/u01/app/oracle/oradata/CNDBA/dave/users01.dbf count=1 seek=132 bs=8192 conv=notrunc
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000359799 s, 22.8 MB/s
[dave@www.cndba.cn ~]$ dbv file=/u01/app/oracle/oradata/CNDBA/dave/users01.dbf blocksize=8192

DBVERIFY: Release 18.0.0.0.0 - Production on Fri Aug 31 09:36:14 2018

Copyright (c) 1982, 2018, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /u01/app/oracle/oradata/CNDBA/dave/users01.dbf
Page 131 is marked corrupt
Corrupt block relative dba: 0x03000083 (file 12, block 131)
Completely zero block found during dbv: 

Page 132 is marked corrupt
Corrupt block relative dba: 0x03000084 (file 12, block 132)
Completely zero block found during dbv: 



DBVERIFY - Verification complete

Total Pages Examined         : 1760
Total Pages Processed (Data) : 1419
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 159
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 180
Total Pages Marked Corrupt   : 2
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 1867035 (0.1867035)

但是查看日志并没有修复坏块的记录。多次flush buffer cache后,查询表也正常:

SQL> select count(1) from dave;

  COUNT(1)
----------
     72913

实际上这个和备库触发ABMR的机制有关系。 我们在主库进行更新操作:

SQL> update dave set owner='dave' where owner='SYS';
51339 rows updated.

SQL> commit;
Commit complete.

SQL> alter system flush buffer_cache;
System altered.

在备库进行验证:

http://www.cndba.cn/cndba/dave/article/2998
http://www.cndba.cn/cndba/dave/article/2998

[dave@www.cndba.cn dbs]$ dbv file=/u01/app/oracle/oradata/CNDBA/dave/users01.dbf blocksize=8192

DBVERIFY: Release 18.0.0.0.0 - Production on Fri Aug 31 09:59:41 2018

Copyright (c) 1982, 2018, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /u01/app/oracle/oradata/CNDBA/dave/users01.dbf


DBVERIFY - Verification complete

Total Pages Examined         : 1760
Total Pages Processed (Data) : 1421
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 159
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 180
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 1873249 (0.1873249)
[dave@www.cndba.cn dbs]$

发现之前的坏块都已经被修复了。 所以当备库出现坏块时,是在主库更新时用主库的block来覆盖备库的corrupt block,此时alert log 中并无记录http://www.cndba.cn/cndba/dave/article/2998

http://www.cndba.cn/cndba/dave/article/2998

另外需要注意的,开始是用SYS用户和SYSTEM表空间进行测试的,ABMR并没有生效,后来用新创建的用户测试,ABMR功能正常,所以在上线业务时,也需要单独创建业务用户才能使用该功能。

http://www.cndba.cn/cndba/dave/article/2998

版权声明:本文为博主原创文章,未经博主允许不得转载。

用户评论
* 以下用户言论只代表其个人观点,不代表CNDBA社区的观点或立场
dave

dave

关注

人的一生应该是这样度过的:当他回首往事的时候,他不会因为虚度年华而悔恨,也不会因为碌碌无为而羞耻;这样,在临死的时候,他就能够说:“我的整个生命和全部精力,都已经献给世界上最壮丽的事业....."

  • 2283
    原创
  • 3
    翻译
  • 579
    转载
  • 196
    评论
  • 访问:8183450次
  • 积分:4428
  • 等级:核心会员
  • 排名:第1名
精华文章
    最新问题
    查看更多+
    热门文章
      热门用户
      推荐用户
        Copyright © 2016 All Rights Reserved. Powered by CNDBA · 皖ICP备2022006297号-1·

        QQ交流群

        注册联系QQ