1 过程回顾
上周和合肥某大厂一个老朋友聊天,他最近有点点背,然后感慨到没有背过锅的 DBA 职业生涯是不完整的,为什么想到这句话,因为今天的案例,也是个锅,只是还没有找出是谁的。
19:52 :刚收拾好书包,准备回家,刚起身,接到业务电话,一听就是很着急,说系统核心表数据不对,不清楚啥情况。
20:01:检查了一下数据库,摸清了大概的情况,连会沟通,对齐相关的信息和恢复方案。
21:22:业务侧从测试环境导入部分数据,但丢失了 8 小时数据,和业务侧确认,核心表,数据很重要,不能丢。
22:05:完成 RMAN 环境准备,找了台新服务器,部署单节点的 DB 环境,将备份从服务器复制过来。
22:44:第一次恢复报错失败。分析错误,准备二次恢复。
23:33:从公司回家,继续恢复。
02:31:完成基于时间点的不完全恢复,找回丢失的数据。
2 恢复过程
本着技术贴的原则,这部分记录下 Oracle RAC 恢复到单机的过程,具体代码就不解释,相关信息已脱敏。
--创建 pfile:
$ cat initcndba.ora
*.audit_file_dest='/u01/app/oracle/admin/cndba/adump'
*.audit_trail='db'
*.compatible='12.2.0'
*.control_files='/u01/app/oracle/oradata/cndba/control01.ctl','/u01/app/oracle/fast_recovery_area/control02.ctl'
*.db_block_size=8192
*.db_create_file_dest='/u01/app/oracle'
*.db_file_name_convert='/data/oracle/cndba/','/u01/app/oracle/oradata/cndba/','/data/oracle/','/u01/app/oracle/oradata/cndba/'
*.db_name='cndba'
*.db_recovery_file_dest='/u01/app/oracle/fast_recovery_area'
*.db_recovery_file_dest_size=300G
*.db_unique_name='cndba_st'
*.diagnostic_dest='/u01/app/oracle'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=cndbaXDB)'
*.fal_server='cndba'
*.log_archive_dest_1='location=/u01/archive_log'
*.log_archive_dest_state_1='ENABLE'
*.log_archive_dest_state_2='ENABLE'
*.log_archive_format='%t_%s_%r.arch'
*.log_file_name_convert='/data/oracle/cndba/onlinelog/','/u01/app/oracle/oradata/cndba/'
*.nls_language='AMERICAN'
*.nls_territory='AMERICA'
*.open_cursors=300
*.pga_aggregate_target=30G
*.processes=3840
*.remote_login_passwordfile='EXCLUSIVE'
#*.sec_case_sensitive_logon=TRUE
*.sga_target=120G
*.standby_file_management='auto'
*.undo_tablespace='UNDOTBS1'
--库启动到 nomount:
$ sqlplus / as sysdba
SQL*Plus: Release 12.2.0.1.0 Production on Thu Jun 13 21:47:44 2024
Copyright (c) 1982, 2016, Oracle. All rights reserved.
staConnected to an idle instance.
SQ
SQL>
SQL>
SQL> startup nomount;
ORACLE instance started.
Total System Global Area 1.2885E+11 bytes
Fixed Size 29873376 bytes
Variable Size 1.9327E+10 bytes
Database Buffers 1.0925E+11 bytes
Redo Buffers 238559232 bytes
SQL>
恢复控制文件
rman target /
RMAN> restore controlfile from '/u01/racbackup/ctl_file_af2t7d5b_1_1_20240613';
Starting restore at 13-JUN-24
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=5204 device type=DISK
channel ORA_DISK_1: restoring control file
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
output file name=/u01/app/oracle/oradata/cndba/control01.ctl
output file name=/u01/app/oracle/fast_recovery_area/control02.ctl
Finished restore at 13-JUN-24
RMAN> alter database mount;
Statement processed
released channel: ORA_DISK_1
RMAN> catalog start with '/u01/racbackup/';
注意:
这里所有的备份和 ASM 中的归档,需要使用 cp 命令复制出来。
修改redo路径
RMAN> select * from v$logfile;
SQL> set lines 300
SQL> select member from v$logfile;
MEMBER
------------------------------------------------------------
+DATA/cndba/ONLINELOG/group_1.284.970842869
+DATA/cndba/ONLINELOG/group_2.285.970842871
+DATA/cndba/ONLINELOG/group_3.286.970842873
+DATA/cndba/ONLINELOG/group_4.287.970842875
+DATA/cndba/ONLINELOG/group_5.288.970842877
+DATA/cndba/ONLINELOG/group_6.295.970844881
+DATA/cndba/ONLINELOG/group_7.296.970844881
+DATA/cndba/ONLINELOG/group_8.297.970844883
+DATA/cndba/ONLINELOG/group_9.298.970844885
+DATA/cndba/ONLINELOG/group_10.299.970844887
RMAN>alter database rename file '+DATA/cndba/ONLINELOG/group_1.284.970842869' to '/u01/app/oracle/oradata/cndba/redo01.log';
RMAN>alter database rename file '+DATA/cndba/ONLINELOG/group_2.285.970842871' to '/u01/app/oracle/oradata/cndba/redo02.log';
RMAN>alter database rename file '+DATA/cndba/ONLINELOG/group_3.286.970842873' to '/u01/app/oracle/oradata/cndba/redo03.log';
RMAN>alter database rename file '+DATA/cndba/ONLINELOG/group_4.287.970842875' to '/u01/app/oracle/oradata/cndba/redo04.log';
RMAN>alter database rename file '+DATA/cndba/ONLINELOG/group_5.288.970842877' to '/u01/app/oracle/oradata/cndba/redo05.log';
RMAN>alter database rename file '+DATA/cndba/ONLINELOG/group_6.295.970844881' to '/u01/app/oracle/oradata/cndba/redo06.log';
RMAN>alter database rename file '+DATA/cndba/ONLINELOG/group_7.296.970844881' to '/u01/app/oracle/oradata/cndba/redo07.log';
RMAN>alter database rename file '+DATA/cndba/ONLINELOG/group_8.297.970844883' to '/u01/app/oracle/oradata/cndba/redo08.log';
RMAN>alter database rename file '+DATA/cndba/ONLINELOG/group_9.298.970844885' to '/u01/app/oracle/oradata/cndba/redo09.log';
RMAN>alter database rename file '+DATA/cndba/ONLINELOG/group_10.299.970844887' to '/u01/app/oracle/oradata/cndba/redo10.log';
修改数据文件路径 (rac上查询)
SQL> set pages 200
SQL> col name for a150
SQL> set lines 250
SQL> select file# ,name from v$datafile;
#根据实际情况修改sql
SQL> set lines 250
SQL> select 'SET NEWNAME FOR DATAFILE '|| file# ||' to ' ||'''/u01/app/oracle/oradata/cndba'|| substr(name,INSTR(name, '/',-1)) ||''''|| ';' as name from v$datafile;
RMAN执行恢复脚本
run{
configure device type disk parallelism 24;
SET NEWNAME FOR DATAFILE 1 to '/u01/app/oracle/oradata/cndba/system.289.970842879';
SET NEWNAME FOR DATAFILE 2 to '/u01/app/oracle/oradata/cndba/sysaux.290.970842883';
SET NEWNAME FOR DATAFILE 3 to '/u01/app/oracle/oradata/cndba/undotbs1.291.970842883';
SET NEWNAME FOR DATAFILE 4 to '/u01/app/oracle/oradata/cndba/undotbs2.293.970842891';
SET NEWNAME FOR DATAFILE 5 to '/u01/app/oracle/oradata/cndba/users.294.970842891';
SET NEWNAME FOR DATAFILE 6 to '/u01/app/oracle/oradata/cndba/cndba.307.970908623';
SET NEWNAME FOR DATAFILE 7 to '/u01/app/oracle/oradata/cndba/cndba1.301.970909765';
SET NEWNAME FOR DATAFILE 8 to '/u01/app/oracle/oradata/cndba/cndba1.309.970909769';
SET NEWNAME FOR DATAFILE 9 to '/u01/app/oracle/oradata/cndba/cndba1.310.970909773';
SET NEWNAME FOR DATAFILE 10 to '/u01/app/oracle/oradata/cndba/cndba1.311.970909775';
SET NEWNAME FOR DATAFILE 11 to '/u01/app/oracle/oradata/cndba/cndba1.312.970909779';
SET NEWNAME FOR DATAFILE 12 to '/u01/app/oracle/oradata/cndba/cndba2.313.970909867';
SET NEWNAME FOR DATAFILE 13 to '/u01/app/oracle/oradata/cndba/cndba2.314.970909871';
SET NEWNAME FOR DATAFILE 14 to '/u01/app/oracle/oradata/cndba/cndba2.315.970909875';
SET NEWNAME FOR DATAFILE 15 to '/u01/app/oracle/oradata/cndba/cndba2.316.970909877';
SET NEWNAME FOR DATAFILE 16 to '/u01/app/oracle/oradata/cndba/cndba2.317.970909881';
SET NEWNAME FOR DATAFILE 17 to '/u01/app/oracle/oradata/cndba/cndba3.318.970909917';
SET NEWNAME FOR DATAFILE 18 to '/u01/app/oracle/oradata/cndba/cndba3.319.970909919';
SET NEWNAME FOR DATAFILE 19 to '/u01/app/oracle/oradata/cndba/cndba3.320.970909923';
SET NEWNAME FOR DATAFILE 20 to '/u01/app/oracle/oradata/cndba/cndba3.321.970909927';
SET NEWNAME FOR DATAFILE 21 to '/u01/app/oracle/oradata/cndba/cndba3.322.970909929';
SET NEWNAME FOR DATAFILE 22 to '/u01/app/oracle/oradata/cndba/cndba4.323.970909987';
SET NEWNAME FOR DATAFILE 23 to '/u01/app/oracle/oradata/cndba/cndba4.324.970909991';
SET NEWNAME FOR DATAFILE 24 to '/u01/app/oracle/oradata/cndba/cndba4.325.970909995';
SET NEWNAME FOR DATAFILE 25 to '/u01/app/oracle/oradata/cndba/cndba4.326.970909997';
SET NEWNAME FOR DATAFILE 26 to '/u01/app/oracle/oradata/cndba/cndba4.327.970910001';
SET NEWNAME FOR DATAFILE 27 to '/u01/app/oracle/oradata/cndba/cndba_charge.328.970910039';
SET NEWNAME FOR DATAFILE 28 to '/u01/app/oracle/oradata/cndba/cndba_service.329.970910079';
SET NEWNAME FOR DATAFILE 29 to '/u01/app/oracle/oradata/cndba/cndba_service.330.970910081';
SET NEWNAME FOR DATAFILE 30 to '/u01/app/oracle/oradata/cndba/cndba_ind.332.970910209';
SET NEWNAME FOR DATAFILE 31 to '/u01/app/oracle/oradata/cndba/cndba_ind.333.970910211';
SET NEWNAME FOR DATAFILE 32 to '/u01/app/oracle/oradata/cndba/cndba_ind.334.970910215';
SET NEWNAME FOR DATAFILE 33 to '/u01/app/oracle/oradata/cndba/cndba_service_ind.335.970910237';
SET NEWNAME FOR DATAFILE 34 to '/u01/app/oracle/oradata/cndba/cndba_service_ind.336.970910241';
SET NEWNAME FOR DATAFILE 35 to '/u01/app/oracle/oradata/cndba/cndba_cndba.337.970910323';
SET NEWNAME FOR DATAFILE 36 to '/u01/app/oracle/oradata/cndba/cndba_cndba.338.970910325';
SET NEWNAME FOR DATAFILE 37 to '/u01/app/oracle/oradata/cndba/cndba_cndba.339.970910329';
SET NEWNAME FOR DATAFILE 38 to '/u01/app/oracle/oradata/cndba/cndba_cndba.340.970910331';
SET NEWNAME FOR DATAFILE 39 to '/u01/app/oracle/oradata/cndba/cndba_cndba.341.970910335';
SET NEWNAME FOR DATAFILE 40 to '/u01/app/oracle/oradata/cndba/cndba_cndba.342.970910397';
SET NEWNAME FOR DATAFILE 41 to '/u01/app/oracle/oradata/cndba/cndba_cndba.343.970910401';
SET NEWNAME FOR DATAFILE 42 to '/u01/app/oracle/oradata/cndba/cndba_cndba.344.970910405';
SET NEWNAME FOR DATAFILE 43 to '/u01/app/oracle/oradata/cndba/cndba_cndba_ind.345.970910539';
SET NEWNAME FOR DATAFILE 44 to '/u01/app/oracle/oradata/cndba/cndba_cndba_ind.346.970910543';
SET NEWNAME FOR DATAFILE 45 to '/u01/app/oracle/oradata/cndba/cndba_data.347.970910691';
SET NEWNAME FOR DATAFILE 46 to '/u01/app/oracle/oradata/cndba/cndba_idx.348.970910731';
SET NEWNAME FOR DATAFILE 47 to '/u01/app/oracle/oradata/cndba/cndba_log_data.349.970910767';
SET NEWNAME FOR DATAFILE 48 to '/u01/app/oracle/oradata/cndba/cndba_log_idx.350.970910807';
SET NEWNAME FOR DATAFILE 49 to '/u01/app/oracle/oradata/cndba/cndba.355.971992211';
SET NEWNAME FOR DATAFILE 50 to '/u01/app/oracle/oradata/cndba/cndba.356.971992303';
SET NEWNAME FOR DATAFILE 51 to '/u01/app/oracle/oradata/cndba/cndba.357.971992527';
SET NEWNAME FOR DATAFILE 52 to '/u01/app/oracle/oradata/cndba/cndba.358.971992555';
SET NEWNAME FOR DATAFILE 53 to '/u01/app/oracle/oradata/cndba/cndba.359.971992591';
SET NEWNAME FOR DATAFILE 54 to '/u01/app/oracle/oradata/cndba/cndba.360.971992671';
SET NEWNAME FOR DATAFILE 55 to '/u01/app/oracle/oradata/cndba/cndba.361.971992707';
SET NEWNAME FOR DATAFILE 56 to '/u01/app/oracle/oradata/cndba/cndba.362.971992749';
SET NEWNAME FOR DATAFILE 57 to '/u01/app/oracle/oradata/cndba/cndba.363.971992927';
SET NEWNAME FOR DATAFILE 58 to '/u01/app/oracle/oradata/cndba/cndba.304.975853431';
SET NEWNAME FOR DATAFILE 59 to '/u01/app/oracle/oradata/cndba/cndba_hr.364.985017681';
SET NEWNAME FOR DATAFILE 60 to '/u01/app/oracle/oradata/cndba/cndba_ifly.366.985017737';
SET NEWNAME FOR DATAFILE 61 to '/u01/app/oracle/oradata/cndba/cndba.367.992187407';
SET NEWNAME FOR DATAFILE 62 to '/u01/app/oracle/oradata/cndba/cndba_data_center.368.1013953061';
SET NEWNAME FOR DATAFILE 63 to '/u01/app/oracle/oradata/cndba/cndba.353.1022325205';
SET NEWNAME FOR DATAFILE 64 to '/u01/app/oracle/oradata/cndba/cndba.302.1036920147';
SET NEWNAME FOR DATAFILE 65 to '/u01/app/oracle/oradata/cndba/cndba_service_ind.370.1048759079';
SET NEWNAME FOR DATAFILE 66 to '/u01/app/oracle/oradata/cndba/cndba_3.dbf';
SET NEWNAME FOR DATAFILE 67 to '/u01/app/oracle/oradata/cndba/cndba_4.dbf';
SET NEWNAME FOR DATAFILE 68 to '/u01/app/oracle/oradata/cndba/cndba_5.dbf';
SET NEWNAME FOR DATAFILE 69 to '/u01/app/oracle/oradata/cndba/cndba_6.dbf';
SET NEWNAME FOR DATAFILE 70 to '/u01/app/oracle/oradata/cndba/cndba_data01.dbf';
SET NEWNAME FOR DATAFILE 71 to '/u01/app/oracle/oradata/cndba/cndba_1.dbf';
SET NEWNAME FOR DATAFILE 72 to '/u01/app/oracle/oradata/cndba/cndba_3.dbf';
SET NEWNAME FOR DATAFILE 73 to '/u01/app/oracle/oradata/cndba/cndba_4.dbf';
SET NEWNAME FOR DATAFILE 74 to '/u01/app/oracle/oradata/cndba/cndba_service_3.dbf';
SET NEWNAME FOR DATAFILE 75 to '/u01/app/oracle/oradata/cndba/cndba_5.dbf';
SET NEWNAME FOR DATAFILE 76 to '/u01/app/oracle/oradata/cndba/undotbs1.372.1141485245';
SET NEWNAME FOR DATAFILE 77 to '/u01/app/oracle/oradata/cndba/undotbs2.383.1141485333';
SET NEWNAME FOR DATAFILE 78 to '/u01/app/oracle/oradata/cndba/cndba.385.1153153311';
SET NEWNAME FOR DATAFILE 79 to '/u01/app/oracle/oradata/cndba/cndba.386.1153153569';
SET NEWNAME FOR DATAFILE 80 to '/u01/app/oracle/oradata/cndba/cndba.387.1153153773';
SET NEWNAME FOR DATAFILE 81 to '/u01/app/oracle/oradata/cndba/cndba.381.1163062369';
restore database;
switch datafile all;
}
RMAN>recover database until time "to_date('2024-06-13 19:19:57','yyyy-mm-dd hh24:mi:ss')";
RMAN>alter database open resetlogs;
3 案例引发的思考
从过程看,这是一个严重的事故,但是到目前为止,还没有定位出是谁的锅。
从结果看,万幸数据都找回来了,没有丢数据。
事发时业务侧也不清楚发生了什么,只能从表的 DDL 操作时间,推测做了一次 drop table 和 create table 的动作,而且不是普通的 drop。因为正常情况下 oracle drop table后表是会存放到回收站里的,但是接入现场的第一时间查看回收站,并没有数据,所以是 drop purge 了,直接删除。
这个恢复案例最终的体现的还是数据安全,常用的提升数据安全的方法有如下几点:
- 业务用户的权限隔离: DDL 和 DML 要分开,DDL 权限只能 DBA 有,业务系统不允许使用具备 DDL 权限的账号【这也是一个妥协的过程】。
- 第三方的审计平台:本次案例分析了监听日志的 IP,但没有发现有价值的信息,所以暂时无法准确定位是什么触发了这个删除操作。
- 数据库备份:我们的系统每天都有备份,只是业务侧基于成本考虑,使用了 SAS 盘做的备份,并且之前因为数据库达到 T 级后,备份空间不够,不得已还采用了压缩备份,这些最终导致我们恢复的时间比较长。 对于恢复效率,业内有通用的解决方案,就是备份一体机,甚至有些产品可以做到分钟级的恢复。
可以看出,数据安全是和系统的重要性、投入正相关的。 其实我之前问过业务,这个系统是否重要,回答是很重要,但是业务没有预算,没法开展这些工作,所以最终还是运维扛下了所有。
对于重要的核心系统,数据库上备份一体机,上审计平台是常规操作,我们很多的外部项目也是这么操作的,只是真到自家系统用的时候,则是能省就省,想到了一句老话:卖盐的喝淡汤,编凉席的睡光床,古人诚不欺我。
版权声明:本文为博主原创文章,未经博主允许不得转载。