Saturday, January 16, 2021

Oracle database RMAN forced incomplete recovery: adjust SCN and open database



Most failures to open a database after restore and recovery come down to the two issues below.

1) Required archive logs are missing, which results in a mismatch in checkpoint# between datafiles.

2) The database SCN is behind the datafiles' checkpoint#.


If the datafile SCNs are not in sync after restore, i.e. the datafiles are fuzzy, and you have the required archive logs, no special fix is needed: just apply the additional required archive logs.
So the first step after restore and recovery is to check that no datafile is fuzzy.



###############################################
###############################################

Checking Fuzzy 


Fri Jun 05 14:15:15 2015
ALTER DATABASE RECOVER CANCEL
Fri Jun 05 14:15:16 2015
Errors in file /ora_data/diag/rdbms/SNORT/SNORT/trace/SNORT_pr00_18570.trc:
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/ora_data/oradata/SNORT/system01.dbf'
ORA-1547 signalled during: ALTER DATABASE RECOVER CANCEL …
ALTER DATABASE RECOVER CANCEL
ORA-1112 signalled during: ALTER DATABASE RECOVER CANCEL …



The first key thing to check is that all datafile headers are in sync. We can check this with the SQL below.


select CONTROLFILE_TYPE from v$database ;
select FUZZY,count(*) from v$datafile_header group by fuzzy ;

set lines 187
set numwidth 20
set pages 999
alter session set nls_date_format='DD-MON-YYYY:HH24:MI:SS';
select * from v$recover_file;
select min(fhrba_Seq)min_seq, max(fhrba_Seq) max_seq_required from x$kcvfh;
select distinct fuzzy from v$datafile_header;

alter session set nls_date_format='YYYY/MON/DD hh24:mi:ss';
 
select checkpoint_time,fuzzy,count(*),status
from ( select checkpoint_time,fuzzy,status
       from v$datafile_header
       union all
       select controlfile_time,'CTL',null from v$database)
group by checkpoint_time,fuzzy,status;




How to fix fuzzy and determine the archive logs needed:


1) Find the archive logs needed to fix fuzzy

SQL> -- Check for MIN, and MAX SCN in Datafiles
SQL> select min(CHECKPOINT_CHANGE#), max(CHECKPOINT_CHANGE#) from v$datafile_header ;

-- Use MIN(CHECKPOINT_CHANGE#) 2446300 as found before, then use it with this query to find the
-- first SEQ# 'number' and archivelog file needed for recover to start with.
-- All SEQ# up to the online Current Redolog SEQ# must be available without any gap for successful recovery

-- MIN(CHECKPOINT_CHANGE#) 2446300


SQL> select thread#, sequence#, substr(name,1,80) from v$Archived_log
      where 2446300 between first_change# and next_change#;
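
The lookup above is a simple range test, followed by a check that no sequence is missing after the starting one. A minimal Python sketch of that logic follows; the rows in SAMPLE_LOGS and both function names are hypothetical, made up only to illustrate the idea (only the SCN 2446300 comes from the text above). Real values come from v$archived_log.

```python
# Hypothetical rows shaped like v$archived_log:
# (thread#, sequence#, first_change#, next_change#)
SAMPLE_LOGS = [
    (1, 101, 2440000, 2445000),
    (1, 102, 2445000, 2448000),
    (1, 103, 2448000, 2452000),
]


def first_needed_sequence(logs, min_checkpoint_scn, thread=1):
    """Lowest sequence# whose SCN range contains min_checkpoint_scn, or None."""
    hits = [seq for (thr, seq, first, nxt) in logs
            if thr == thread and first <= min_checkpoint_scn <= nxt]
    return min(hits) if hits else None


def missing_sequences(logs, start_seq, thread=1):
    """Sequence numbers absent between start_seq and the highest one present."""
    present = {seq for (thr, seq, _, _) in logs if thr == thread}
    if not present:
        return []
    return [s for s in range(start_seq, max(present) + 1) if s not in present]


start = first_needed_sequence(SAMPLE_LOGS, 2446300)
print("start with sequence:", start)
print("gaps:", missing_sequences(SAMPLE_LOGS, start))
```

An empty gap list means every archive log from the starting sequence onward is present in the sample; any number in the list is a log that has to be located before recovery can run through it.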



2) Use the block below to determine the maximum SCN up to which the database must be recovered.

set serveroutput on
declare
  scn    number := 0;   -- value reported for the current datafile
  scnmax number := 0;   -- highest value seen across all datafiles
begin
  for f in (select file# from v$datafile) loop
    scn := dbms_backup_restore.scandatafile(f.file#);
    dbms_output.put_line('File ' || f.file# || ' absolute fuzzy scn = ' || scn);
    if scn > scnmax then scnmax := scn; end if;
  end loop;

  -- recovery must reach at least this SCN
  dbms_output.put_line('Minimum PITR SCN = ' || scnmax);
end;
/

SQL> alter database recover database until change 7203942;


Absolute Fuzzy

In some cases no datafile is reported as fuzzy, yet opening the database with resetlogs still fails with a datafile header mismatch error.

Oracle reserves a section of the file header block of each file for just such an occurrence. This is called the Absolute Fuzzy SCN and represents the SCN required for recovery to make this a consistent file. Our bookends are then defined as the checkpoint SCN and the Absolute Fuzzy SCN. At a minimum, Oracle must recover from the checkpoint SCN through the Absolute Fuzzy SCN for consistency. If Oracle did not detect any SCNs higher than the checkpoint SCN during the backup then the backup would be considered consistent (file header status 0x0) and the Absolute Fuzzy SCN would remain at 0x0 - obviating the need for any backup-necessary redo to be applied. As you can see, this is the reason Oracle waits until all data blocks in the file have been read and written before it writes the header to the backup set. This permits the proper settings for the bookends


Follow the steps below to determine the archive logs needed for recovery.

SQL> select hxfil file#, substr(hxfnm, 1, 50) name, fhscn checkpoint_change#, fhafs Absolute_Fuzzy_SCN, max(fhafs) over () Min_PIT_SCN from x$kcvfh where fhafs!=0 ;

Note: the Min_PIT_SCN column returns the same value on every row because it is computed with the analytic function MAX() OVER ().

SQL> -- Using V$ARCHIVED_LOG
SQL> ALTER SESSION SET NLS_DATE_FORMAT='DD-MON-RR HH24:MI:SS';
SQL> SELECT THREAD#, SEQUENCE#, FIRST_TIME, NEXT_TIME FROM V$ARCHIVED_LOG WHERE '31-AUG-11 23:20:14' BETWEEN FIRST_TIME AND NEXT_TIME;

If the above query does not return any rows, the information may have aged out of the controlfile. In that case, run the following query against V$LOG_HISTORY.

SQL> -- The V$LOG_HISTORY view does not have a NEXT_TIME column
SQL> ALTER SESSION SET NLS_DATE_FORMAT='DD-MON-RR HH24:MI:SS';
SQL> select a.THREAD#, a.SEQUENCE#, a.FIRST_TIME from V$LOG_HISTORY a
where FIRST_TIME =
( SELECT MAX(b.FIRST_TIME) FROM V$LOG_HISTORY b
WHERE b.FIRST_TIME < to_date('31-AUG-11 23:20:14', 'DD-MON-RR HH24:MI:SS')
) ;


RMAN> RUN
{
SET UNTIL SEQUENCE 531 THREAD 1;
RECOVER DATABASE;
}





###############################################
###############################################

Mismatch between datafile header SCNs due to missing archive logs


This is the case where, because of missing archive logs, the checkpoint# values of the datafiles do not match and one of the errors below is reported.


ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u03/oradata/tstc/dbsyst01.dbf'

Or:

ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 48 needs more recovery to be consistent
ORA-01110: data file 48: '/vol06/oradata/testdb/ard01.dbf'

Or 

ORA-01595: error freeing extent (11) of rollback segment (2))
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
Fri Sep 26 10:23:22 2014

The official explanation of ORA-600 [4194] is "Undo Record Number Mismatch While Adding Undo Record". When the database adds an undo record while applying redo, it finds that the undo record number does not match, i.e. there is an inconsistency.





The solution below was adopted for the problems above to force the database open.


To open the database with resetlogs, use the following approach.

1) Start the database in mount state

2) Set _OFFLINE_ROLLBACK_SEGMENTS and _corrupted_rollback_segments. Take the rollback segment list from the database alert log.
 
_OFFLINE_ROLLBACK_SEGMENTS=("_SYSSMU11$", "_SYSSMU12$", "_SYSSMU13$", "_SYSSMU14$", "_SYSSMU15$", "_SYSSMU16$", "_SYSSMU17$", "_SYSSMU18$")

_corrupted_rollback_segments=("_SYSSMU11$", "_SYSSMU12$", "_SYSSMU13$", "_SYSSMU14$", "_SYSSMU15$", "_SYSSMU16$", "_SYSSMU17$", "_SYSSMU18$")


3) Set database parameters UNDO_MANAGEMENT=MANUAL and _allow_resetlogs_corruption = TRUE



4)  Perform dummy  recovery 
> recover database until cancel 
cancel 


5)  Open database with resetlogs 


6) Create new  Undo  tablespace  

CREATE UNDO TABLESPACE undo2 datafile '/u01/app/oracle/oradata/RTS_NEW/undo2_df1.dbf' size 200m autoextend on maxsize 30G;


7) Assign new  undo tablespace to database 

SQL> alter system set undo_tablespace = undo2 scope=spfile;
System altered.
SQL> alter system set undo_management=auto scope=spfile;
System altered.


8) Take Full backup of database 


###############################################
###############################################


Database scn is behind datafile checkpoint#  and ORA-00600 is reported 

To put it simply this ORA-00600 error means that a datafile has a recorded SCN that’s ahead of the database SCN.  The current database SCN is shown as the 3rd argument (in this case 551715) and the datafile SCN is shown as the 5th argument (in this case 562781).  Hence a difference of:
562781 - 551715 = 11066
In this example, that’s not too large of a gap.  But in a real system, the difference may be more significant.  
Also if multiple datafiles are ahead of the current SCN you should expect to see multiple ORA-00600 errors.
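
The gap can be read straight from the error text. The Python sketch below is only an illustration: the function name is hypothetical, and it assumes the usual ORA-00600 [2662] argument layout (error number, current SCN wrap, current SCN base, datafile SCN wrap, datafile SCN base) with a full SCN equal to wrap * 2^32 + base. The sample message is the [2662] error quoted in the _minimum_giga_scn section of this post.

```python
import re

# ORA-00600 [2662] text quoted later in this post.
ORA600_EXAMPLE = (
    "ORA-00600: internal error code, arguments: [2662], [1477], "
    "[4214140426], [1477], [4215310734], [4228956], [], [], [], [], [], []"
)


def scn_gap_from_ora600(message):
    """Return (current_scn, datafile_scn, gap) parsed from an ORA-00600 [2662] line.

    Assumption: arguments 2-5 are current wrap, current base,
    datafile wrap, datafile base, and full SCN = wrap * 2**32 + base.
    """
    args = re.findall(r"\[(\d*)\]", message)
    cur_wrap, cur_base, df_wrap, df_base = (int(a) for a in args[1:5])
    current = cur_wrap * 2**32 + cur_base
    datafile = df_wrap * 2**32 + df_base
    return current, datafile, datafile - current


current, datafile, gap = scn_gap_from_ora600(ORA600_EXAMPLE)
print("database SCN is behind by", gap)
```

When both wrap values are equal, as here, the gap is simply the 5th argument minus the 3rd, which is the subtraction shown above.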


Before 12c, one of the options below was used. From 12c onward, patch/event 21307096 was introduced to advance the database SCN.

1 : use oradebug poke to modify the value directly in memory;
2 : use event 10015 to increase the SCN;
3 : use _minimum_giga_scn to increase the SCN;
4 : use gdb / dbx to modify the value directly in memory;
5 : modify the SCN in the control file;
6 : modify the SCN in the datafile headers;
7 : use adjust_scn to increase the SCN;
8 : roll the SCN forward with multiple restarts.




One solution to this problem is quite simple: roll the current SCN forward until it exceeds the datafile SCN. The database automatically generates a number of internal transactions on each startup, so the database SCN can be rolled forward simply by performing repeated shutdowns and startups. Depending on how big the gap is, it may be necessary to shutdown abort and startup repeatedly; the gap between the 5th and 3rd arguments of the ORA-00600 decreases each time. Eventually the gap reduces to zero and the database opens.


-> Alter database open resetlogs ( will fail with ORA-00600: internal error code, arguments: [2663],    )
-> Shutdown abort
-> Startup
  



Pre 12c  : _minimum_giga_scn


There is also another option, setting _minimum_giga_scn, which is not supported from 11.2.0.2.5 onward.
 
Example of a database that kept failing to open for the same reason:
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [2662], [1477], [4214140426], [1477], [4215310734], [4228956], [], [], [], [], [], []
Error 704 happened during db open, shutting down database
USER (ospid: 3866830): terminating the instance due to error 70
 
SQL> select checkpoint_change# from v$database;
 SQL> select ceil(&decimal_scn_expected/1024/1024/1024) from dual;
 Enter the value from the first select when prompted.
 
Then use the result as follows:
 Set parameter _minimum_giga_scn=<result from the most recent query> in the init.ora file.
 
startup mount
recover database
alter database open;
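
The ceil(.../1024/1024/1024) query above just expresses an SCN in units of 2^30, rounded up. A small Python sketch of the same arithmetic follows; the function name is hypothetical, and the sample input assumes the datafile SCN from the ORA-00600 [2662] message above is wrap 1477 and base 4215310734, combined as wrap * 2^32 + base.

```python
def minimum_giga_scn(scn):
    """Smallest whole number of 'giga' (2**30) units that covers the SCN."""
    giga = 1024 ** 3
    return -(-scn // giga)  # integer ceiling division, no float rounding


# Assumption: full SCN = wrap * 2**32 + base, using the 4th and 5th
# arguments of the ORA-00600 [2662] message quoted above.
datafile_scn = 1477 * 2**32 + 4215310734
print("_minimum_giga_scn =", minimum_giga_scn(datafile_scn))
```

Integer ceiling division is used instead of floating point so that very large SCNs are not rounded incorrectly.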
 




Post 12c : applying event/patch 21307096

From 12c, Oracle Support introduced patch 21307096 to advance the database SCN before opening the database with resetlogs. It is an indirect replacement for _minimum_giga_scn.

The steps below can be followed to apply event/patch 21307096.

 
1. Apply patch 21307096 then: 

2. The fix needs to be enabled with Event 21307096 at level SCN delta. 

The SCN delta is expressed in units of one million, with values ranging from 1 to 4095, and raises the SCN to:
 
lowest_scn + event level * 1000000 
 
Example: if the lowest datafile checkpoint scn in the database is 990396 
and the highest is 992660 then the SCN delta is 1, given by ceil((992660 - 990396) / 1000000) 
 
event="21307096 trace name context forever, level 1" 
 
or use this query: 
 
select decode(ceil((max(CHECKPOINT_CHANGE#) - min(CHECKPOINT_CHANGE#))/1000000),0,'Event 21307096 is not needed' 
, 'event="21307096 trace name context forever, level ' 
||ceil((max(CHECKPOINT_CHANGE#) - min(CHECKPOINT_CHANGE#))/1000000) 
||'"') "EVENT TO SET in init.ora:" 
from v$datafile_header 
where status != 'OFFLINE'; 
 
Note that the event is needed to activate this fix so please add it in the init.ora file. 
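
The level calculation can be checked by hand before editing init.ora. The Python sketch below mirrors the decode/ceil query above under the same rule (ceil of the checkpoint SCN spread divided by one million, valid range 1 to 4095). The function name is hypothetical, and the input list stands in for CHECKPOINT_CHANGE# values of the online datafiles.

```python
def event_21307096_setting(checkpoint_scns):
    """Return the init.ora event line, or None when the event is not needed.

    checkpoint_scns: CHECKPOINT_CHANGE# values of the online datafiles.
    """
    delta = max(checkpoint_scns) - min(checkpoint_scns)
    level = -(-delta // 1_000_000)  # integer ceiling of delta / 1,000,000
    if level == 0:
        return None  # all headers share one SCN, nothing to advance
    if level > 4095:
        raise ValueError("SCN spread is beyond the documented 1-4095 range")
    return f'event="21307096 trace name context forever, level {level}"'


# Values from the example above: lowest 990396, highest 992660.
print(event_21307096_setting([990396, 992660]))
```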

Here are some tests in 12.1.0.2 using each level for alter database open resetlogs:
  level 1 Elapsed: 00:01:02.35
  level 2 Elapsed: 00:02:16.23
  level 6 Elapsed: 00:06:08.05
  
In general: based on an SCN rate of 16K per second, the open resetlogs time
would be at least (event level * 1000000 / 16000) seconds. Level 1 would then take at least 
62 seconds and level 4095 more than 71 hours.
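
As a quick check of that estimate, here is a short Python sketch of the formula level * 1000000 / 16000. The function name is hypothetical, and the 16000 SCN/sec rate is the assumption stated in the text, not a measured value.

```python
def min_open_resetlogs_seconds(level, scn_rate_per_sec=16000):
    """Lower bound on open resetlogs time for a given event level."""
    return level * 1_000_000 / scn_rate_per_sec


for level in (1, 2, 6, 4095):
    secs = min_open_resetlogs_seconds(level)
    print(f"level {level}: at least {secs:.1f} s ({secs / 3600:.2f} h)")
```

The measured timings listed above (level 1, 2 and 6) sit close to this lower bound, so it is worth picking the smallest level that covers the SCN gap.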


Before starting,
- Check that the db is in mounted state and that the ORACLE_HOME is the one that has patch 21307096 applied.
- Back up the current controlfiles, as we may be overwriting them.
- Ensure all datafiles are online.
 
a. Ensure datafiles, redo log files have correct path. If not, rename the files appropriately.
select name from v$datafile;
select member from v$logfile;

 
b. Disable block change tracking.
alter database disable block change tracking;


c. Take a backup of "create controlfile" command -
conn / as sysdba
alter database backup controlfile to trace;
oradebug setmypid
oradebug tracefile_name
!cat <tracefile_name>


d. Start the database in mount state after uncommenting the parameters below -

_corrupted_rollback_segments = 'take from alert log'
_allow_resetlogs_corruption = TRUE
event = "21307096 trace name context forever, level 1"
undo_management=MANUAL  -- Comment out undo_management=AUTO

 
e. Perform fake recovery and open resetlogs -

conn / as sysdba
recover database using backup controlfile until cancel;
CANCEL

alter database open resetlogs;
 

If the above fails to open the database then -

- Modify event 21307096 level to next level.
- startup nomount the db
- create the controlfile using the command saved earlier.
- execute step e) again.


f. Once the database is opened add temp file to TEMP tablespace and create a new undo tablespace UNDOTBS1.

g. Shut down the database.
shut immediate


h. Comment out the below parameters in init file -

_corrupted_rollback_segments =
_allow_resetlogs_corruption = TRUE
event = "21307096 trace name context forever, level 3"
undo_management=MANUAL  -- Uncomment undo_management=AUTO


Set undo_tablespace=UNDOTBS1.

i. Open the database.

 
 

###############################################
###############################################

----------------
ORA-01194 Error:
----------------
-- this error may be raised when starting up a cloned database
-- resolution: provide the online redo log file to recover

-- source: Oracle DBA Code Examples
SQL> startup 
ORACLE instance started.
..
Database mounted.
ORA-01589: must use RESETLOGS or NORESETLOGS option for database open
SQL> alter database open noresetlogs;
alter database open noresetlogs
*
ERROR at line 1:
ORA-01588: must use RESETLOGS option for database open
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: 'C:\ORACLE\ORADATA\MANAGER\SYSTEM01.DBF'
SQL> RECOVER DATABASE UNTIL CANCEL USING BACKUP CONTROLFILE;
ORA-00279: change 405719 generated at 06/30/2008 15:51:04 needed for thread 1
ORA-00289: suggestion : C:\ORACLE\RDBMS\ARC00019.001
ORA-00280: change 405719 for thread 1 is in sequence #19
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
C:\ORACLE\ORADATA\MANAGER\REDO03.LOG
Log applied.
Media recovery complete.
SQL> alter database open resetlogs;
Database altered.


----------------
ORA-01152 Error:
----------------
-- resolution: provide the online redo log file to recover

ORA-00289: suggestion :
/u01/app/oracle/admin/finance/arch/finance/_0000012976.arc
ORA-00280: change 962725326 for thread 1 is in sequence #12976
ORA-00278: log file '/u01/app/oracle/admin/finance/arch/finance/_0000012975.arc'
no longer needed for this recovery
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01152: file 1 was not restored from a sufficiently old backup
ORA-01110: data file 1: '/pase16/oradata/finance/system_01.dbf'
ORA-01112: media recovery not started
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/pase04/oradata/finance/redo01a.rdo
ORA-00279: change 962746677 generated at 07/30/2008 04:33:52 needed for thread 1
ORA-00289: suggestion :
/u01/app/oracle/admin/finance/arch/finance/_0000012978.arc
ORA-00280: change 962746677 for thread 1 is in sequence #12978
ORA-00278: log file '/pase04/oradata/finance/redo01a.rdo'
no longer needed for this recovery
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/pase04/oradata/finance/redo02a.rdo
Log applied.
Media recovery complete. 


----------------
ORA-00376 Error:
----------------
-- reason: might be datafile or tablespace being offline
-- resolution: bringing the tablespace or datafile online

ORA-00376: file 10 cannot be read at this time
ORA-01110: data file 10: '/u01/app/oracle/remorse/data_01.dbf'

