使用bbed恢复corrupt的块和10231的对比
当我们查询一张表的时候,在扫描到某些记录时,该记录的数据块已经corruption,那么我们可以用10231事件跳过该坏块读取其他数据。
此处我使用bbed恢复该corruption的块。并读取其中的数据。
参考文档:disassembling the oracle data block,writter by graham thornton 2005
操作系统信息
[oracle@rhl6 lib]$ uname -a Linux rhl6.0x64 2.6.32-71.el6.x86_64 #1 SMP Wed Sep 1 01:33:01 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
数据库版本
SQL> set line 400 SQL> select * from gv$version; INST_ID BANNER ---------- -------------------------------------------------------------------------------- 1 Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production 1 PL/SQL Release 11.2.0.3.0 - Production 1 CORE 11.2.0.3.0 Production 1 TNS for Linux: Version 11.2.0.3.0 - Production 1 NLSRTL Version 11.2.0.3.0 - Production
测试数据
SQL> sho user USER is "YALLONKING" SQL> alter session set nls_date_format='yyyy/mm/dd hh24:mi:ss'; Session altered. SQL> select rowid,t.* from yallonking t; ROWID ID NAME MY_DATE ------------------ ---------- -------------------- ------------------- AAARciAAFAAAACHAAA 1 yallonking 2013/07/22 09:44:53 AAARciAAFAAAACHAAB 2 oraking 2013/07/22 09:45:07 AAARciAAFAAAACHAAC 3 xiaolong 2013/07/22 09:50:09
下边我们插入大量数据,为了让该表的数据在不同的块中
SQL> insert into yallonking select rownum as id,'travel' as name,sysdate as mydate from dual connect by rownum<10000; 9999 rows created. SQL> commit; Commit complete. SQL> select rowid,t.* from yallonking t where id=9999; ROWID ID NAME MY_DATE ------------------ ---------- -------------------- ------------------- AAARciAAFAAAACkAAf 9999 travel 2013/07/23 10:33:53 SQL> conn /as sysdba Connected. SQL> select get_rowid('AAARciAAFAAAACHAAA') from dual; GET_ROWID('AAARCIAAFAAAACHAAA') -------------------------------------------------------------------------------- Row_id type is :1 Object# is :71458 Relative_fno is :5 Block number is :135 Row number is :0 SQL> select get_rowid('AAARciAAFAAAACkAAf') from dual; GET_ROWID('AAARCIAAFAAAACKAAF') -------------------------------------------------------------------------------- Row_id type is :1 Object# is :71458 Relative_fno is :5 Block number is :164 Row number is :31 SQL> select /*+ full(t) */ count(*) from yallonking.yallonking t; COUNT(*) ---------- 10002
首先我们使用bbed把第三行记录所在的块标记为corruption,也就是将135号块人为损坏
BBED> show all FILE# 1 BLOCK# 1 OFFSET 0 DBA 0x00400001 (4194305 1,1) FILENAME /u01/app/oracle/oradata/ora11gr2/system01.dbf BIFILE bifile.bbd LISTFILE /home/oracle/bbedfile.lst BLOCKSIZE 8192 MODE Edit EDIT Unrecoverable IBASE Dec OBASE Dec WIDTH 80 COUNT 512 LOGFILE log.bbd SPOOL No BBED> info all File# Name Size(blks) ----- ---- ---------- 1 /u01/app/oracle/oradata/ora11gr2/system01.dbf 89600 2 /u01/app/oracle/oradata/ora11gr2/sysaux01.dbf 76800 3 /u01/app/oracle/oradata/ora11gr2/undotbs01.dbf 39680 4 /u01/app/oracle/oradata/ora11gr2/users01.dbf 640 5 /u01/app/oracle/oradata/ora11gr2/yallonking01.dbf 256 BBED> set dba 5,135 DBA 0x01400087 (20971655 5,135) BBED> p kcbh struct kcbh, 20 bytes @0 ub1 type_kcbh @0 0x06 ub1 frmt_kcbh @1 0xa2 ub1 spare1_kcbh @2 0x00 ub1 spare2_kcbh @3 0x00 ub4 rdba_kcbh @4 0x01400087 ub4 bas_kcbh @8 0x000b9935 ub2 wrp_kcbh @12 0x0000 ub1 seq_kcbh @14 0x01 ub1 flg_kcbh @15 0x06 (KCBHFDLC, KCBHFCKV) ub2 chkval_kcbh @16 0xf127 ub2 spare3_kcbh @18 0x0000 BBED> modify /x ff dba 5,135 offset 14 File: /u01/app/oracle/oradata/ora11gr2/yallonking01.dbf (5) Block: 135 Offsets: 14 to 525 Dba:0x01400087 ------------------------------------------------------------------------ ff060659 00000100 00002217 0100e19c 0b000000 000002f8 32008000 40010500 1500a902 00002c34 c0009700 25000080 00003599 0b000200 0d00b602 00003403 c000b300 06002f21 0000e79c 0b000000 00000000 00000001 3201ffff 7602ae05 38033803 00003201 7f1f691f 521fcd09 e209f709 0c0a210a 360a4b0a 600a750a 8a0a9f0a b40ac90a de0af30a 080b1d0b 320b470b 5c0b710b 860b9b0b b00bc50b da0bef0b 040c190c 2e0c430c 580c6d0c 820c970c ac0cc10c d60ceb0c 000d150d 2a0d3f0d 540d690d 7e0d930d a80dbd0d d20de70d fc0d110e 260e3b0e 500e650e 7a0e8f0e a40eb90e ce0ee30e f80e0d0f 220f370f 4c0f610f 760f8b0f a00fb50f ca0fdf0f f40f0910 1e103310 48105d10 72108710 9c10b110 c610db10 f0100511 1a112f11 44115911 6e118311 9811ad11 c211d711 ec110112 17122d12 43125912 6f128512 9b12b112 c712dd12 f3120913 1f133513 4b136113 77138d13 a313b913 cf13e513 fb131114 27143d14 53146914 7f149514 ab14c114 d714ed14 03151915 2f154515 5b157115 87159d15 b315c915 df15f515 0b162116 37164d16 63167916 8f16a516 bb16d116 e716fd16 13172917 3f175517 6b178117 9717ad17 c317d917 ef170518 1b183118 47185d18 73188918 9f18b518 cb18e118 f7180d19 23193919 4f196519 7b199119 a719bd19 d319e919 ff19151a 2b1a411a 571a6d1a 831a981a <32 bytes per line> BBED> p kcbh struct kcbh, 20 bytes @0 ub1 type_kcbh @0 0x06 ub1 frmt_kcbh @1 0xa2 ub1 spare1_kcbh @2 0x00 ub1 spare2_kcbh @3 0x00 ub4 rdba_kcbh @4 0x01400087 ub4 bas_kcbh @8 0x000b9ce7 ub2 wrp_kcbh @12 0x0000 ub1 seq_kcbh @14 0xff ub1 flg_kcbh @15 0x06 (KCBHFDLC, KCBHFCKV) ub2 chkval_kcbh @16 0x5906 ub2 spare3_kcbh @18 0x0000 BBED> sum dba 5,135 apply; Check value for File 5, Block 135: current = 0x59f8, required = 0x59f8
下边我们尝试查看该表的记录
SQL> alter system flush buffer_cache; System altered. SQL> select /*+ full(t) */ count(*) from yallonking t; select /*+ full(t) */ count(*) from yallonking t * ERROR at line 1: ORA-01578: ORACLE data block corrupted (file # 5, block # 135) ORA-01110: data file 5: '/u01/app/oracle/oradata/ora11gr2/yallonking01.dbf'
此时,我们看到由于一个块的问题导致整个表无法访问
下边我们尝试设置10231事件跳过该问题块,尝试追回部分数据
SQL> alter system set events '10231 trace name context forever,level 10'; System altered. SQL> select /*+ full(t) */ count(*) from yallonking t; COUNT(*) ---------- 9696 SQL> select 10002-9696 from dual; 10002-9696 ---------- 306 SQL> alter system set events '10231 trace name context off'; System altered. SQL> select /*+ full(t) */ count(*) from yallonking t; select /*+ full(t) */ count(*) from yallonking t * ERROR at line 1: ORA-01578: ORACLE data block corrupted (file # 5, block # 135) ORA-01110: data file 5: '/u01/app/oracle/oradata/ora11gr2/yallonking01.dbf'
注:以上我们使用10231事件可以跳过坏块读取其他块上的记录,但是我们会丢失部分数据(此处为306条记录)。
下边我们使用bbed修复问题的数据块(当然和之前人为损坏的过程类似)
BBED> p kcbh struct kcbh, 20 bytes @0 ub1 type_kcbh @0 0x06 ub1 frmt_kcbh @1 0xa2 ub1 spare1_kcbh @2 0x00 ub1 spare2_kcbh @3 0x00 ub4 rdba_kcbh @4 0x01400087 ub4 bas_kcbh @8 0x000b9ce7 ub2 wrp_kcbh @12 0x0000 ub1 seq_kcbh @14 0xff ub1 flg_kcbh @15 0x06 (KCBHFDLC, KCBHFCKV) ub2 chkval_kcbh @16 0x59f8 ub2 spare3_kcbh @18 0x0000 BBED> m /x 01 dba 5,135 offset 14 File: /u01/app/oracle/oradata/ora11gr2/yallonking01.dbf (5) Block: 135 Offsets: 14 to 525 Dba:0x01400087 ------------------------------------------------------------------------ 0106f859 00000100 00002217 0100e19c 0b000000 000002f8 32008000 40010500 1500a902 00002c34 c0009700 25000080 00003599 0b000200 0d00b602 00003403 c000b300 06002f21 0000e79c 0b000000 00000000 00000001 3201ffff 7602ae05 38033803 00003201 7f1f691f 521fcd09 e209f709 0c0a210a 360a4b0a 600a750a 8a0a9f0a b40ac90a de0af30a 080b1d0b 320b470b 5c0b710b 860b9b0b b00bc50b da0bef0b 040c190c 2e0c430c 580c6d0c 820c970c ac0cc10c d60ceb0c 000d150d 2a0d3f0d 540d690d 7e0d930d a80dbd0d d20de70d fc0d110e 260e3b0e 500e650e 7a0e8f0e a40eb90e ce0ee30e f80e0d0f 220f370f 4c0f610f 760f8b0f a00fb50f ca0fdf0f f40f0910 1e103310 48105d10 72108710 9c10b110 c610db10 f0100511 1a112f11 44115911 6e118311 9811ad11 c211d711 ec110112 17122d12 43125912 6f128512 9b12b112 c712dd12 f3120913 1f133513 4b136113 77138d13 a313b913 cf13e513 fb131114 27143d14 53146914 7f149514 ab14c114 d714ed14 03151915 2f154515 5b157115 87159d15 b315c915 df15f515 0b162116 37164d16 63167916 8f16a516 bb16d116 e716fd16 13172917 3f175517 6b178117 9717ad17 c317d917 ef170518 1b183118 47185d18 73188918 9f18b518 cb18e118 f7180d19 23193919 4f196519 7b199119 a719bd19 d319e919 ff19151a 2b1a411a 571a6d1a 831a981a <32 bytes per line>
注:这里位于offest为14的参数 seq_kcbh 为 control file sequence number,我们这里是将其改为原来的01
在后边的实验中,我们将测试该值改为其他非0xff值
BBED> p kcbh struct kcbh, 20 bytes @0 ub1 type_kcbh @0 0x06 //块类型 ub1 frmt_kcbh @1 0xa2 //块格式 ub1 spare1_kcbh @2 0x00 ub1 spare2_kcbh @3 0x00 ub4 rdba_kcbh @4 0x01400087 //块rdba ub4 bas_kcbh @8 0x000b9ce7 //SCN base ub2 wrp_kcbh @12 0x0000 //SCN wrap ub1 seq_kcbh @14 0x01 //SCN sequence number ub1 flg_kcbh @15 0x06 (KCBHFDLC, KCBHFCKV) ub2 chkval_kcbh @16 0x59f8 ub2 spare3_kcbh @18 0x0000 BBED> sum dba 5,132 apply Check value for File 5, Block 132: current = 0xcc08, required = 0xcc08 BBED> sum dba 5,135 Check value for File 5, Block 135: current = 0x59f8, required = 0x5906
查看块的tailchk
关于tailchk的说明如下
tailchk=lower order two bytes of the SCN base||block type||SCN sequence number
下边我们查找该块以上这三个值
BBED> d /v dba 5,135 offset 0 count 64 File: /u01/app/oracle/oradata/ora11gr2/yallonking01.dbf (5) Block: 135 Offsets: 0 to 63 Dba:0x01400087 ------------------------------------------------------- 06a20000 87004001 e79c0b00 00000106 l .¢....@.?....... f8590000 01000000 22170100 e19c0b00 l ?Y......"...á... 00000000 02f83200 80004001 05001500 l .....?2...@..... a9020000 2c34c000 97002500 00800000 l ?...,4à...%..... <16 bytes per line> type format unused rdba scn base scn wrap seq flag 06 a2 0000 87004001 e79c0b00 0000 01 06 所以,该块的tailchk应该为:(000b)9ce7||06||01,也就是 9ce70601
下边查看该块当前的tailchk值
BBED> p tailchk ub4 tailchk @8188 0x9ce70601 BBED> d /v dba 5,135 offset 8188 File: /u01/app/oracle/oradata/ora11gr2/yallonking01.dbf (5) Block: 135 Offsets: 8188 to 8191 Dba:0x01400087 ------------------------------------------------------- 0106e79c l ..?. <16 bytes per line>
注:由于之前的人为做的修改,所以该处此时不需要做其他操作。
验证该块
BBED> v dba 5,135 DBVERIFY - Verification starting FILE = /u01/app/oracle/oradata/ora11gr2/yallonking01.dbf BLOCK = 135 Block 135 is corrupt Corrupt block relative dba: 0x01400087 (file 0, block 135) Bad check value found during verification Data in bad block: type: 6 format: 2 rdba: 0x01400087 last change scn: 0x0000.000b9ce7 seq: 0x1 flg: 0x06 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x9ce70601 check value in block header: 0x59f8 computed block checksum: 0xfe DBVERIFY - Verification complete Total Blocks Examined : 1 Total Blocks Processed (Data) : 0 Total Blocks Failing (Data) : 0 Total Blocks Processed (Index): 0 Total Blocks Failing (Index): 0 Total Blocks Empty : 0 Total Blocks Marked Corrupt : 1 Total Blocks Influx : 0 Message 531 not found; product=RDBMS; facility=BBED
注:比较奇怪的是,该处发现该块依然是Corrupt,期间什么修改操作也没有做,在将数据库重启后便好。
查看数据
SQL> alter system flush buffer_cache; System altered. SQL> select /*+ full(t) */ count(*) from yallonking t; COUNT(*) ---------- 10002
注:通过这种方式,我们可以找回所有数据,而不是部分数据
==============================================================================
下边我们继续试验,看能否将一个对象的不同块的seq_kcbh改为不同值
BBED> m /x 05 dba 5,135 offset 14 File: /u01/app/oracle/oradata/ora11gr2/yallonking01.dbf (5) Block: 135 Offsets: 14 to 77 Dba:0x01400087 ------------------------------------------------------------------------ 05060659 00000100 00002217 0100e19c 0b000000 000002f8 32008000 40010500 1500a902 00002c34 c0009700 25000080 00003599 0b000200 0d00b602 00003403 <32 bytes per line> BBED> sum apply; Check value for File 5, Block 135: current = 0x590a, required = 0x590a SQL> alter system flush buffer_cache; System altered. SQL> select /*+ full(t) */ count(*) from yallonking t; select /*+ full(t) */ count(*) from yallonking t * ERROR at line 1: ORA-01578: ORACLE data block corrupted (file # 5, block # 135) ORA-01110: data file 5: '/u01/app/oracle/oradata/ora11gr2/yallonking01.dbf' BBED> v DBVERIFY - Verification starting FILE = /u01/app/oracle/oradata/ora11gr2/yallonking01.dbf BLOCK = 135 Block 135 is corrupt Corrupt block relative dba: 0x01400087 (file 0, block 135) Fractured block found during verification Data in bad block: type: 6 format: 2 rdba: 0x01400087 last change scn: 0x0000.000b9ce7 seq: 0x5 flg: 0x06 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x9ce70609 check value in block header: 0x590a computed block checksum: 0x0 DBVERIFY - Verification complete Total Blocks Examined : 1 Total Blocks Processed (Data) : 0 Total Blocks Failing (Data) : 0 Total Blocks Processed (Index): 0 Total Blocks Failing (Index): 0 Total Blocks Empty : 0 Total Blocks Marked Corrupt : 1 Total Blocks Influx : 2 Message 531 not found; product=RDBMS; facility=BBED BBED> p ckbh BBED-00207: invalid offset specifier (ckbh) BBED> p kcbh struct kcbh, 20 bytes @0 ub1 type_kcbh @0 0x06 ub1 frmt_kcbh @1 0xa2 ub1 spare1_kcbh @2 0x00 ub1 spare2_kcbh @3 0x00 ub4 rdba_kcbh @4 0x01400087 ub4 bas_kcbh @8 0x000b9ce7 ub2 wrp_kcbh @12 0x0000 ub1 seq_kcbh @14 0x05 ub1 flg_kcbh @15 0x06 (KCBHFDLC, KCBHFCKV) ub2 chkval_kcbh @16 0x590a ub2 spare3_kcbh @18 0x0000 BBED> p tailchk ub4 tailchk @8188 0x9ce70609 BBED> m /x 05 dba 5,135 offset 8188 File: /u01/app/oracle/oradata/ora11gr2/yallonking01.dbf (5) Block: 135 Offsets: 8188 to 8191 Dba:0x01400087 ------------------------------------------------------------------------ 0506e79c <32 bytes per line> BBED> sum apply Check value for File 5, Block 135: current = 0x5906, required = 0x5906 BBED> v DBVERIFY - Verification starting FILE = /u01/app/oracle/oradata/ora11gr2/yallonking01.dbf BLOCK = 135 DBVERIFY - Verification complete Total Blocks Examined : 1 Total Blocks Processed (Data) : 1 Total Blocks Failing (Data) : 0 Total Blocks Processed (Index): 0 Total Blocks Failing (Index): 0 Total Blocks Empty : 0 Total Blocks Marked Corrupt : 0 Total Blocks Influx : 0 Message 531 not found; product=RDBMS; facility=BBED SQL> alter system flush buffer_cache; System altered. SQL> select /*+ full(t) */ count(*) from yallonking t; COUNT(*) ---------- 10002
我们继续尝试把 seq_kcbh 值改为 ff
BBED> m /x ff dba 5,135 offset 14 File: /u01/app/oracle/oradata/ora11gr2/yallonking01.dbf (5) Block: 135 Offsets: 14 to 77 Dba:0x01400087 ------------------------------------------------------------------------ ff060659 00000100 00002217 0100e19c 0b000000 000002f8 32008000 40010500 1500a902 00002c34 c0009700 25000080 00003599 0b000200 0d00b602 00003403 <32 bytes per line> BBED> sum Check value for File 5, Block 135: current = 0x5906, required = 0x59fc BBED> v DBVERIFY - Verification starting FILE = /u01/app/oracle/oradata/ora11gr2/yallonking01.dbf BLOCK = 135 Block 135 is corrupt Corrupt block relative dba: 0x01400087 (file 0, block 135) Fractured block found during verification Data in bad block: type: 6 format: 2 rdba: 0x01400087 last change scn: 0x0000.000b9ce7 seq: 0xff flg: 0x06 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x9ce70605 check value in block header: 0x5906 computed block checksum: 0xfa DBVERIFY - Verification complete Total Blocks Examined : 1 Total Blocks Processed (Data) : 0 Total Blocks Failing (Data) : 0 Total Blocks Processed (Index): 0 Total Blocks Failing (Index): 0 Total Blocks Empty : 0 Total Blocks Marked Corrupt : 1 Total Blocks Influx : 2 Message 531 not found; product=RDBMS; facility=BBED BBED> m /x ff dba 5,135 offset 8188 File: /u01/app/oracle/oradata/ora11gr2/yallonking01.dbf (5) Block: 135 Offsets: 8188 to 8191 Dba:0x01400087 ------------------------------------------------------------------------ ff06e79c <32 bytes per line> BBED> sum Check value for File 5, Block 135: current = 0x5906, required = 0x5906 BBED> v DBVERIFY - Verification starting FILE = /u01/app/oracle/oradata/ora11gr2/yallonking01.dbf BLOCK = 135 Block Checking: DBA = 20971655, Block Type = KTB-managed data block Found block already marked corrupted DBVERIFY - Verification complete Total Blocks Examined : 1 Total Blocks Processed (Data) : 1 Total Blocks Failing (Data) : 0 Total Blocks Processed (Index): 0 Total Blocks Failing (Index): 0 Total Blocks Empty : 0 Total Blocks Marked Corrupt : 0 Total Blocks Influx : 0 Message 531 not found; product=RDBMS; facility=BBED
注:此处虽然校验通过,但是在查询中还是有问题
SQL> alter system flush buffer_cache; System altered. SQL> select /*+ full(t) */ count(*) from yallonking t; select /*+ full(t) */ count(*) from yallonking t * ERROR at line 1: ORA-01578: ORACLE data block corrupted (file # 5, block # 135) ORA-01110: data file 5: '/u01/app/oracle/oradata/ora11gr2/yallonking01.dbf'
小结:
也就是说在出现块损坏的情况下(如下类似错误)
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 5, block # 135)
ORA-01110: data file 5: '/u01/app/oracle/oradata/ora11gr2/yallonking01.dbf'
我们只需要把对应块的 tailchk 值修改正确即可,而该值的组成部分中的 seq_kcbh 值可以为非 ff 的任意值.
还要注意的是,我们在数据库运行的同时修改完,可能需要重启下数据库。
注:此处发现该块还是corrupt
下边我们确认,只有坏块132的seq_kcbh值和好的块162的seq_kcbh值一样才可。
这句话完全是错误的
你仔细注意下tailchk
tailchk=Lower order two bytes of SCN Base(bas_kcbh) + Block Type(type_kcbh) + SCN Seq(seq_kcbh)
这里你修改了Seq 这地方也得修修改改。
好的,早上已经重新实验并修改了该文档,欢迎常来,欢迎指正…