签到成功

知道了

CNDBA社区CNDBA社区

坏块类型-ALL ZERO模拟与修复

2018-03-15 17:10 3133 0 原创 数据坏块
作者: Marvinn

创建测试数据http://www.cndba.cn/Marvinn/article/2684

http://www.cndba.cn/Marvinn/article/2684

20:27:22 SYS@orcl>alter session set container = pdb001;

Session altered.

Elapsed: 00:00:00.24

20:30:06 SYS@orcl>create  tablespace jwt datafile '/data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/jwt.dbf' size 10m autoextend on;

Tablespace created.

Elapsed: 00:00:11.95

20:31:01 SYS@orcl>create user jwt identified by 123456 default tablespace jwt;
                grant connect,resource to jwt;
                grant dba to jwt;

User created.

Elapsed: 00:00:01.57

20:44:23 SYS@orcl>conn jwt/123456@pdb
Connected.
20:44:30 JWT@pdb>create table jwt(id number,name varchar2(10));

Table created.

Elapsed: 00:00:00.03
20:45:06 JWT@pdb>declare
20:45:15   2   i number;
20:45:21   3  begin
20:45:27   4  for i in 1..10000 loop
20:45:28   5  insert into jwt values(i,'ss');
20:45:34   6   end loop;
20:45:40   7  commit;
20:45:43   8  end;
20:45:45   9  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.55
20:45:46 JWT@pdb>

注意:使用BBED工具对块进行操作时,需关库或者设置对应的表空间为只读再操做数据块,作防止BBED修改的块被覆盖,造成不必要的错误或者失败

http://www.cndba.cn/Marvinn/article/2684

​ 由于当前处于12C测试环境,所以只需要将对应数据文件所在的PDB数据库关闭(mount状态即可)或者设置表空间只读即可,而其他版本的数据库需关库或者设置表空间只读操作

查看表数据
20:45:46 JWT@pdb>select count(*) from jwt;

  COUNT(*)
----------
     10000

Elapsed: 00:00:00.01

查看表所在数据文件以及块号、行号(测试当前只取前5行)
20:47:35 JWT@pdb>select dbms_rowid.rowid_relative_fno(rowid) rfn,dbms_rowid.rowid_block_number(rowid) bln,dbms_rowid.rowid_row_number(rowid) rn from JWT where rownum<=5;

       RFN        BLN         RN
---------- ---------- ----------
        29        131          0
        29        131          1
        29        131          2
        29        131          3
        29        131          4

Elapsed: 00:00:00.01

查看数据文件号为29的数据文件
20:47:59 JWT@pdb>select file#,name from v$datafile where file#=29;

     FILE#
----------
NAME
--------------------------------------------------------------------------------
        29
/data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/jwt.dbf


Elapsed: 00:00:00.08

防止过程有误差,提前备份好http://www.cndba.cn/Marvinn/article/2684http://www.cndba.cn/Marvinn/article/2684

oracle@12cDG:/home/oracle>rman target /

RMAN> backup pluggable database pdb001 datafile 29;
Starting backup at 2018-03-14 21:46:51
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=121 device type=DISK
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of backup command at 03/14/2018 21:46:52
RMAN-06817: Pluggable Database ORCLPDB cannot be backed up in NOARCHIVELOG mode.

--重启库,开启归档,重新备份
startup mount
alter database archivelog;
oracle@12cDG:/home/oracle>rman target /


oracle@12cDG:/u01/oracle/diag/rdbms/orcl/orcl/trace> rman target /

Recovery Manager: Release 12.2.0.1.0 - Production on Wed Mar 14 21:36:37 2018

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.

connected to target database: ORCL (DBID=1481723203)

RMAN> backup pluggable database pdb001 datafile 29;

Starting backup at 2018-03-14 21:36:51
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=89 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00018 name=/data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/o1_mf_undotbs1_dwx0fm75_.dbf
input datafile file number=00017 name=/data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/o1_mf_sysaux_dwx0fm74_.dbf
input datafile file number=00016 name=/data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/o1_mf_system_dwx0fm6b_.dbf
input datafile file number=00020 name=/data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/test1.dbf
input datafile file number=00019 name=/data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/test.dbf
input datafile file number=00029 name=/data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/jwt.dbf
channel ORA_DISK_1: starting piece 1 at 2018-03-14 21:36:53
channel ORA_DISK_1: finished piece 1 at 2018-03-14 21:37:18
piece handle=/u01/oracle/12.2.0/dbs/bdstpt3l_1_1 tag=TAG20180314T213652 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:25
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00029 name=/data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/jwt.dbf
channel ORA_DISK_1: starting piece 1 at 2018-03-14 21:37:18
channel ORA_DISK_1: finished piece 1 at 2018-03-14 21:37:19
piece handle=/u01/oracle/12.2.0/dbs/bestpt4e_1_1 tag=TAG20180314T213652 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 2018-03-14 21:37:19

Starting Control File and SPFILE Autobackup at 2018-03-14 21:37:19
piece handle=/u01/oracle/12.2.0/dbs/c-1481723203-20180314-01 comment=NONE
Finished Control File and SPFILE Autobackup at 2018-03-14 21:37:20

坏块问题模拟与恢复:http://www.cndba.cn/Marvinn/article/2684

验证当前环境是否存在坏块现状

http://www.cndba.cn/Marvinn/article/2684

22:04:31 SYS@orcl>select * from v$database_block_corruption;

no rows selected

Elapsed: 00:00:00.10
22:04:35 SYS@orcl>

--从上可知,当前环境不存在坏块现状

 corruption_type: 坏块类型
     ■ ALL ZERO  - 磁盘上的块头只包含0字节,在ORACLE 7,若它没有被修复,该块可能是有效的,在Oracle 8将会被格式话成标准空块.
    ■ FRACTURED - 数据块头看起来合理,但块前跟块后属于不同版本.
    ■ CHECKSUM  - 数据块的检查值不一致,不确定为什么检查值失败,但可能是数据块中间的扇区版本值不同
    ■ CORRUPT - 数据块被错误标识或者不是一个数据块(比如数据块地址丢失)
    ■ LOGICAL - 逻辑坏块
    ■ NOLOGGING - 数据块中没有重做日志存在(比如,主库上NOLOGGING操作可能在物理备库上产生这种类型的坏块)

模拟 ALL AERO

dd模拟
notrunc12cDG:/home/oracle>dd if=/dev/zero of=/data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/jwt.dbf bs=8192 count=1 seek=131 conv=notrunc

1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000761568 s, 10.8 MB/s

oracle@12cDG:/home/oracle>sqlplus / as sysdba

09:49:59 SYS@orcl>alter session set container=pdb001;

Session altered.

Elapsed: 00:00:00.02

--刷新数据库缓冲区
09:50:06 SYS@orcl>alter system  flush buffer_cache;

System altered.

Elapsed: 00:00:00.10

--发现坏块
09:50:20 SYS@orcl>select * from jwt.jwt where rownum<=6;
select * from jwt.jwt where rownum<=6
       *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 29, block # 131)
ORA-01110: data file 29:
'/data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/jwt.dbf'


Elapsed: 00:00:12.09

--查看视图,发现坏块类型ALL ZERO
09:50:57 SYS@orcl>select * from v$database_block_corruption;

     FILE#     BLOCK#     BLOCKS CORRUPTION_CHANGE# CORRUPTIO     CON_ID
---------- ---------- ---------- ------------------ --------- ----------
        29        131          1                  0 ALL ZERO           6

Elapsed: 00:00:00.07

设置只读表空间
09:51:45 SYS@orcl>alter tablespace jwt read only;

Tablespace altered.

Elapsed: 00:00:00.94

恢复方法:(优先考虑备份恢复)

http://www.cndba.cn/Marvinn/article/2684
http://www.cndba.cn/Marvinn/article/2684

1、BBED

恢复方法:(优先考虑备份恢复)

1、此类型(all zero)的坏块使用无法使用BBED,报错无效块类型,命令无法使用
BBED> set filename '/data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/jwt.dbf'
        FILENAME        /data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/jwt.dbf

BBED> set block 131
        BLOCK#          131

BBED> map /v
 File: /data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/jwt.dbf (0)
 Block: 131                                   Dba:0x00000000
------------------------------------------------------------
BBED-00400: invalid blocktype (00)


BBED> p kcbr
BBED-00400: invalid blocktype (00)


BBED> p kcbh
BBED-00400: invalid blocktype (00)

--dump 全是0,空块
BBED> dump /v
 File: /data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/jwt.dbf (0)
 Block: 131     Offsets:    0 to  511  Dba:0x00000000
-------------------------------------------------------
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................
 00000000 00000000 00000000 00000000 l ................

 <16 bytes per line>

BBED>

2、DBMS_REPAIR包

2、DBMS_REPAIR包(丢失坏块数据) --dbms_repair标记和跳过坏块

 设置可读写
 09:58:03 SYS@orcl>alter tablespace jwt read write;

Tablespace altered.

1、创建管理表(DBMS_REPAI包前三项值不变,最后一项填写所属表空间名)当前表空间名为JWT
09:58:17 SYS@orcl>exec DBMS_REPAIR.ADMIN_TABLES('REPAIR_TABLE',1,1,'JWT');

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.30
09:58:35 SYS@orcl>exec DBMS_REPAIR.ADMIN_TABLES('ORPHAN_TABLE',2,1,'JWT');

PL/SQL procedure successfully completed.

2、检查坏块 dbms_repair.check_object存储包(corrupt_count不变):
'''
Set serveroutput on
DECLARE
   cc  NUMBER;
BEGIN
   DBMS_REPAIR.check_object (schema_name => 'JWT',
                             object_name     => 'JWT',
                             corrupt_count   => cc);
   DBMS_OUTPUT.put_line ( TO_CHAR (cc));
END;
'''
09:58:51 SYS@orcl>Set serveroutput on;
09:59:51 SYS@orcl>DECLARE
10:00:35   2     cc  NUMBER;
10:00:35   3  BEGIN
10:00:35   4     DBMS_REPAIR.check_object (schema_name => 'JWT',    -- 注意此处是坏块对象所属用户名
10:00:35   5                               object_name     => 'JWT',  -- 表名
10:00:35   6                               corrupt_count   => cc);
10:00:35   7     DBMS_OUTPUT.put_line ( TO_CHAR (cc));
10:00:35   8  END;
10:00:35   9  /
1

PL/SQL procedure successfully completed.

正常情况下输入为0.

如果有坏块,可以在创建的REPAIR_TABLE中查看块损坏信息SQL:
SELECT   object_name,
         relative_file_id,
         block_id,
         marked_corrupt,
         corrupt_description,
         repair_description,
         CHECK_TIMESTAMP
  FROM   repair_table;  

10:00:36 SYS@orcl>SELECT   object_name,
10:02:05   2           relative_file_id,
10:02:05   3           block_id,
10:02:05   4           marked_corrupt,
10:02:05   5           corrupt_description,
10:02:05   6           repair_description,
10:02:05   7           CHECK_TIMESTAMP
10:02:05   8    FROM   repair_table;

OBJECT_NAME
--------------------------------------------------------------------------------
RELATIVE_FILE_ID   BLOCK_ID MARKED_COR
---------------- ---------- ----------
CORRUPT_DESCRIPTION
--------------------------------------------------------------------------------
REPAIR_DESCRIPTION
--------------------------------------------------------------------------------
CHECK_TIMESTAMP
-------------------
JWT
              29        131 TRUE


OBJECT_NAME
--------------------------------------------------------------------------------
RELATIVE_FILE_ID   BLOCK_ID MARKED_COR
---------------- ---------- ----------
CORRUPT_DESCRIPTION
--------------------------------------------------------------------------------
REPAIR_DESCRIPTION
--------------------------------------------------------------------------------
CHECK_TIMESTAMP
-------------------
mark block software corrupt
2018-03-15 10:00:36


Elapsed: 00:00:00.01


注意:在8i下,check_object只会检查坏块,MARKED_CORRUPT为false,故需要执行第三步: 定位坏块,fix_corrupt_blocks定位 ,修改MARKED_CORRUPT为true,同时更新CHECK_TIMESTAMP。9i以后经过check_object,MARKED_CORRUPT的值已经标识为TRUE了。所以可以直接进行第四步了。

3、定位坏块:dbms_repair.fix_corrupt_blocks    (fix_count不变)  
只有将坏块信息写入定义的REPAIR_TABLE后,才能定位坏块。  
DECLARE
   cc   NUMBER;
BEGIN
   DBMS_REPAIR.fix_corrupt_blocks (schema_name   => 'JWT',
                                   object_name   => 'JWT',
                                   fix_count     => cc);
   DBMS_OUTPUT.put_line (a => TO_CHAR (cc));
END;

4、跳过坏块
我们前面虽然定位了坏块,但是,如果我们访问table:
10:02:05 SYS@orcl>select * from jwt.jwt where rownum<=5;
select * from jwt.jwt where rownum<=5
       *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 29, block # 131)
ORA-01110: data file 29:
'/data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/jwt.dbf'


Elapsed: 00:00:03.06

还是会得到错误信息, 这里需要用skip_corrupt_blocks来跳过坏块(flags不变)
10:06:02 SYS@orcl>exec dbms_repair.skip_corrupt_blocks(schema_name => 'JWT',
object_name => 'JWT',flags => 1);

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.03
10:07:26 SYS@orcl>select * from jwt.jwt where rownum<=5;

        ID NAME
---------- ----------
      2430 ss
      2431 ss
      2432 ss
      2433 ss
      2434 ss

Elapsed: 00:00:00.00
10:07:38 SYS@orcl>select count(*) from JWT.JWT;

  COUNT(*)
----------
      9395
丢失数据 10000 - 9395 = 605行数据
验证
10:07:43 SYS@orcl>select * from v$database_block_corruption;

     FILE#     BLOCK#     BLOCKS CORRUPTION_CHANGE# CORRUPTIO     CON_ID
---------- ---------- ---------- ------------------ --------- ----------
        29        131          1                  0 ALL ZERO           6

Elapsed: 00:00:00.00
10:09:33 SYS@orcl>alter system flush buffer_cache;

System altered.

Elapsed: 00:00:00.02
10:10:00 SYS@orcl>select * from v$database_block_corruption;

     FILE#     BLOCK#     BLOCKS CORRUPTION_CHANGE# CORRUPTIO     CON_ID
---------- ---------- ---------- ------------------ --------- ----------
        29        131          1                  0 ALL ZERO           6

从上可知DBMS_REPAIR方法只是跳过和绕过了坏块(丢失坏块数据),避免访问报错,但是其坏块问题依然存在

5、处理index上的无效键值;dump_orphan_keys(object_type、repair_table_name、orphan_table_name、 key_count不变)

DECLARE
   cc   NUMBER;
BEGIN
   DBMS_REPAIR.dump_orphan_keys (schema_name         => 'JWT',            --SCHEMA用户
                                 object_name         => 'INDX_MAR',        --索引名
                                 object_type         => 2,
                                 repair_table_name   => 'REPAIR_TABLE',
                                 orphan_table_name   => 'ORPHAN_TABLE',
                                 key_count           => CC);
END;

通过以下命令可以知道丢失行的信息:
SQL> SELECT * FROM ORPHAN_TABLE;

我们根据这个结果来考虑是否需要rebuild index.


6、重建freelist:rebuild_freelists

exec dbms_repair.rebuild_freelists(schema_name => 'JWT'   --所属用户SCHEMA
                        ,object_name => 'JWT'            --表名
                        );

3、CTAS方式复制(EXP/IMP)

3、CTAS方式复制(设置10231事件)
利用原备份进行恢复
oracle@12cDG:/home/oracle>rman target /

Recovery Manager: Release 12.2.0.1.0 - Production on Thu Mar 15 10:14:24 2018

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.

connected to target database: ORCL (DBID=1481723203)

RMAN> restore pluggable database pdb001 datafile 29;

Starting restore at 2018-03-15 10:14:52
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=67 device type=DISK

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00016 to /data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/o1_mf_system_dwx0fm6b_.dbf
channel ORA_DISK_1: restoring datafile 00017 to /data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/o1_mf_sysaux_dwx0fm74_.dbf
channel ORA_DISK_1: restoring datafile 00018 to /data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/o1_mf_undotbs1_dwx0fm75_.dbf
channel ORA_DISK_1: restoring datafile 00019 to /data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/test.dbf
channel ORA_DISK_1: restoring datafile 00020 to /data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/test1.dbf
channel ORA_DISK_1: reading from backup piece /u01/oracle/12.2.0/dbs/bdstpt3l_1_1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 03/15/2018 10:14:55
ORA-19870: error while restoring backup piece /u01/oracle/12.2.0/dbs/bdstpt3l_1_1
ORA-19573: cannot obtain exclusive enqueue for datafile 18
ORA-45909: restore, recover or block media recovery may be in progress

关闭数据库进行恢复
10:10:31 SYS@orcl>shutdown immediate;
Pluggable Database closed.

RMAN> restore pluggable database pdb001 datafile 29;

Starting restore at 2018-03-15 10:16:12
using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00016 to /data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/o1_mf_system_dwx0fm6b_.dbf
channel ORA_DISK_1: restoring datafile 00017 to /data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/o1_mf_sysaux_dwx0fm74_.dbf
channel ORA_DISK_1: restoring datafile 00018 to /data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/o1_mf_undotbs1_dwx0fm75_.dbf
channel ORA_DISK_1: restoring datafile 00019 to /data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/test.dbf
channel ORA_DISK_1: restoring datafile 00020 to /data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/test1.dbf
channel ORA_DISK_1: reading from backup piece /u01/oracle/12.2.0/dbs/bdstpt3l_1_1
channel ORA_DISK_1: piece handle=/u01/oracle/12.2.0/dbs/bdstpt3l_1_1 tag=TAG20180314T213652
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:35
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00029 to /data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/jwt.dbf
channel ORA_DISK_1: reading from backup piece /u01/oracle/12.2.0/dbs/bestpt4e_1_1
channel ORA_DISK_1: ORA-19870: error while restoring backup piece /u01/oracle/12.2.0/dbs/bestpt4e_1_1
ORA-19505: failed to identify file "/u01/oracle/12.2.0/dbs/bestpt4e_1_1"
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 7

failover to previous backup

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00029 to /data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/jwt.dbf
channel ORA_DISK_1: reading from backup piece /u01/oracle/12.2.0/dbs/bdstpt3l_1_1
channel ORA_DISK_1: piece handle=/u01/oracle/12.2.0/dbs/bdstpt3l_1_1 tag=TAG20180314T213652
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
Finished restore at 2018-03-15 10:16:49

RMAN> recover pluggable database pdb001;

Starting recover at 2018-03-15 10:18:15
using channel ORA_DISK_1

starting media recovery

archived log for thread 1 with sequence 656 is already on disk as file /data/arch/1_656_953942467.dbf
archived log for thread 1 with sequence 657 is already on disk as file /data/arch/1_657_953942467.dbf
archived log for thread 1 with sequence 658 is already on disk as file /data/arch/1_658_953942467.dbf
archived log for thread 1 with sequence 659 is already on disk as file /data/arch/1_659_953942467.dbf
archived log file name=/data/arch/1_656_953942467.dbf thread=1 sequence=656
archived log file name=/data/arch/1_657_953942467.dbf thread=1 sequence=657
media recovery complete, elapsed time: 00:00:24
Finished recover at 2018-03-15 10:18:40

--至此恢复完成,验证下
10:15:12 SYS@orcl>alter database open;

Database altered.

Elapsed: 00:00:07.86
10:19:19 SYS@orcl>select * from v$database_block_corruption;

no rows selected

Elapsed: 00:00:00.03
10:19:23 SYS@orcl>select count(*) from jwt.jwt;

  COUNT(*)
----------
     10000

Elapsed: 00:00:00.03

....所以说还是备份恢复来的快....
删除之前备份,压缩命令重备下
backup as compressed backupset pluggable database pdb001 datafile 29;

--重新dd模拟
dd if=/dev/zero of=/data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/jwt.dbf bs=8192 count=1 seek=131 conv=notrunc

由于重新模拟同一坏块,所以之前DBMS_REPAIR包同样又生效了

10:40:59 SYS@orcl>alter session set container =pdb001;

Session altered.

Elapsed: 00:00:00.16
10:41:09 SYS@orcl>select count(*) from jwt.jwt;

  COUNT(*)
----------
     10000

Elapsed: 00:00:00.00
10:41:32 SYS@orcl>alter flush buffer_cache;
alter flush buffer_cache
      *
ERROR at line 1:
ORA-00940: invalid ALTER command


Elapsed: 00:00:00.01
10:41:43 SYS@orcl>alter system flush buffer_cache;

System altered.

Elapsed: 00:00:00.04
10:41:53 SYS@orcl>select count(*) from jwt.jwt;

  COUNT(*)
----------
      9395

--所以,还得关库重新恢复再开库,模拟不同数据块试试
RMAN>restore pluggable database pdb001 datafile 29;
RMAN>recover pluggabe database pdb001;

--查看表数据存在于哪些数据块
select dbms_rowid.rowid_relative_fno(rowid) rfn,dbms_rowid.rowid_block_number(rowid) bln,dbms_rowid.rowid_row_number(rowid) rn from JWT.jwt where rownum<=1000;


--模拟dd 坏块132
dd if=/dev/zero of=/data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/jwt.dbf bs=8192 count=1 seek=132 conv=notrunc

10:50:58 SYS@orcl>alter session set container=pdb001;

Session altered.

Elapsed: 00:00:00.01
10:51:08 SYS@orcl>select count(*) from jwt.jwt;

  COUNT(*)
----------
     10000

Elapsed: 00:00:00.01
10:51:15 SYS@orcl>alter system flush buffer_cache;

System altered.

Elapsed: 00:00:00.07
10:51:23 SYS@orcl>select count(*) from jwt.jwt;

  COUNT(*)
----------
      9395

Elapsed: 00:00:00.01

发现结果还是一样的,可查但丢失了数据,查看告警日志
PDB001(6):
PDB001(6):Reading datafile '/data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/jwt.dbf' for corruption at rdba: 0x07400084 (file 29, block 132)
PDB001(6):Reread (file 29, block 132) found same corrupt data (no logical check)
2018-03-15T10:51:24.986390+08:00
Corrupt Block Found
         TIME STAMP (GMT) = 03/15/2018 10:51:24
         CONT = 6, TSN = 11, TSNAME = JWT
         RFN = 29, BLK = 132, RDBA = 121634948
         OBJN = 94366, OBJD = 94366, OBJECT = JWT, SUBOBJECT = 
         SEGMENT OWNER = JWT, SEGMENT TYPE = Table Segment
--表明发现jwt表上存在同样坏块,直接标记跳过了..

所以这里行不通...那就尝试重新创建表在模拟

create table marvin(id number,name varchar2(10));

Table created.

Elapsed: 00:00:00.30
11:04:56 JWT@pdb>declare
11:06:18   2  i number;
11:06:18   3   begin
for i in 1..10000 loop
11:06:18   4  11:06:18   5  insert into marvin values(i,'ss');
end loop;
commit;
end;
/
11:06:18   6  11:06:18   7  11:06:18   8  11:06:18   9  
PL/SQL procedure successfully completed.

11:07:44 JWT@pdb>select count(*) from marvin;

  COUNT(*)
----------
     10000

Elapsed: 00:00:00.01

11:07:52 JWT@pdb>select dbms_rowid.rowid_relative_fno(rowid) rfn,dbms_rowid.rowid_block_number(rowid) bln,dbms_rowid.rowid_row_number(rowid) rn from JWT.marvin where rownum<=10;

       RFN        BLN         RN
---------- ---------- ----------
        29        187          0
        29        187          1
        29        187          2
        29        187          3
        29        187          4
        29        187          5
        29        187          6
        29        187          7
        29        187          8
        29        187          9

10 rows selected.

--模拟dd 坏块187
dd if=/dev/zero of=/data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/jwt.dbf bs=8192 count=1 seek=187 conv=notrunc

11:15:28 SYS@orcl>select count(*) from jwt.marvin;
select count(*) from jwt.marvin
       *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 29, block # 187)
ORA-01110: data file 29:
'/data/datafile/ORCL/5A5847E139AF054EE05375B029ACCC94/datafile/jwt.dbf'


Elapsed: 00:00:03.72
Elapsed: 00:00:03.72
11:15:37 SYS@orcl>select * from v$database_block_corruption;

     FILE#     BLOCK#     BLOCKS CORRUPTION_CHANGE# CORRUPTIO     CON_ID
---------- ---------- ---------- ------------------ --------- ----------
        29        187          1                  0 ALL ZERO           6

Elapsed: 00:00:00.08

至此,坏块模拟成功....

CTAS修复或者导入导出:
1、Marvin表OWNER连入ORACLE
11:17:27 SYS@orcl>CONN JWT/123456@PDB
Connected.
11:24:38 > 
11:24:40 JWT@PDB>

2、诊断事件10231,跳过坏块检查(注意这里需具备权限修改)
11:24:40 JWT@PDB>ALTER SYSTEM SET EVENTS '10231 trace name context forever,level 10';

System altered.

Elapsed: 00:00:00.06

3、创建一个临时表 tab_tmp 的表中除坏块的数据都检索出来(或者针对于数据量大的表,可单独把表导出)
11:25:26 JWT@PDB>CREATE TABLE tab_marvin as select * from marvin;

Table created.

Elapsed: 00:00:00.82
或者导出
    导出未损坏的数据
    exp system/marvin file=marvin.dmp tables=marvin
    即将导出指定的表通过常规路径 ...
    . . 正在导出表 marvin 9395 行被导出
    在没有警告的情况下成功终止导出

    然后我们可以为确保安全,先重命名表,最后drop table,然后导入数据
    imp system/marvin file=marvin.dmp tables=marvin
    . 正在将system的对象导入到 system
    . . 正在导入表 "marvin" 9395行被导入
    成功终止导入,但出现警告。

    验证:
    select count(*) from marvin;

4、更名原表,并把marvin更名为marvin_bak
11:28:04 JWT@PDB>alter table marvin rename to marvin_bak;

Table altered.

Elapsed: 00:00:00.08
11:31:06 JWT@PDB>alter table tab_marvin rename to marvin;

Table altered.

Elapsed: 00:00:00.02

5、在 tab03 上重新创建索引、约束、授权、 trigger 等对象
此处这些对象无,省略

6、利用表之间的业务关系,把坏块中的数据补足(手工插入补)
此处省略

验证
11:31:19 JWT@PDB>select count(*) from jwt.marvin;

  COUNT(*)
----------
      9395

Elapsed: 00:00:00.01
丢失坏块数据 10000 - 9395 = 605行

7、最后把原表重命令的表marvin_bak删除
11:36:21 JWT@PDB>drop table marvin_bak;

Table dropped.

Elapsed: 00:00:00.95



可以看到
如果要取消events设置,做以下操作:
alter system set events='10231 trace name context off';

13:37:17 JWT@PDB>alter system set events='10231 trace name context off';

System altered.

Elapsed: 00:00:00.25
13:38:05 JWT@PDB>
13:38:06 JWT@PDB>
13:38:07 JWT@PDB>alter system flush buffer_cache;

System altered.

Elapsed: 00:00:00.07
13:38:17 JWT@PDB>select count(*) from jwt.marvin;

  COUNT(*)
----------
      9395

Elapsed: 00:00:00.00
13:38:22 JWT@PDB>select * from v$database_block_corruption;

     FILE#     BLOCK#     BLOCKS CORRUPTION_CHANGE# CORRUPTIO     CON_ID
---------- ---------- ---------- ------------------ --------- ----------
        29        187          1                  0 ALL ZERO           6

Elapsed: 00:00:00.00

可以看到坏块还是存在,只是标记绕过丢失坏块数据而已...

最后我们把该表进行恢复…首先关库
restore pluggable database pdb001 datafile 29;
recover pluggable database pdb001;http://www.cndba.cn/Marvinn/article/2684

版权声明:本文为博主原创文章,未经博主允许不得转载。

用户评论
* 以下用户言论只代表其个人观点,不代表CNDBA社区的观点或立场
Marvinn

Marvinn

关注

路漫漫其修远兮、吾将上下而求索

  • 99
    原创
  • 0
    翻译
  • 2
    转载
  • 36
    评论
  • 访问:458471次
  • 积分:449
  • 等级:中级会员
  • 排名:第12名
精华文章
    最新问题
    查看更多+
    热门文章
      热门用户
      推荐用户
        Copyright © 2016 All Rights Reserved. Powered by CNDBA · 皖ICP备2022006297号-1·

        QQ交流群

        注册联系QQ