Tuesday, September 26, 2023

Handling errors faced while Generating Oracle Awr report / Awr snapshots



Considering we were facing lot of issues related to awr snapshots and awr report  thought of documenting  known  troubleshooting and  known  fix 



Known Issues : 


Issue 1 )    Generating AWR report Hangs on 'Control File Sequential Read

Solution ::  alter session set "_hash_join_enabled"=true  as per Doc  Id : 2045523.1




Issue 2)  AWR snaoshot hangs with enq: WF - contention 

Solution 1:  As per Doc ID 2695000.1 Kill the MMON child processes and the MMON process for the instance that is not generating AWR reports so that it is restarted by the database.
 ps -ef|grep ora_m0
 ps -ef|grep _mmon_




Solution 2 :  Gather stats on these 3 tables and check the manual report generation again . If possible try gathering   fixed objects stats 

SQL> exec dbms_stats.gather_table_stats('SYS', 'X$KEWRSQLIDTAB');
SQL> exec dbms_stats.gather_table_stats('SYS', 'X$KEWRTSQLPLAN');
SQL> exec dbms_stats.gather_table_stats('SYS', 'X$KEHSQT');

Verify with the table last analyzed details

SQL> select OWNER, TABLE_NAME, LAST_ANALYZED from dba_tab_statistics where table_name ='X$KEWRSQLIDTAB';
SQL> select OWNER, TABLE_NAME, LAST_ANALYZED from dba_tab_statistics where table_name='X$KEWRTSQLPLAN';
SQL> select OWNER, TABLE_NAME, LAST_ANALYZED from dba_tab_statistics where table_name='X$KEHSQT';




Solution 3  : As per   Doc ID 2649588.1  MMON Process Causing Blocking And Contention - enq: WF - contention waits   due to increase in  size of awr data which  needs to be purged 

https://abdul-hafeez-kalsekar-tuning.blogspot.com/2023/09/mmon-process-causing-blocking-and.html  --  Doc ID 2649588.1

https://abdul-hafeez-kalsekar-tuning.blogspot.com/2023/09/troubleshooting-missing-automatic.html

https://abdul-hafeez-kalsekar-tuning.blogspot.com/2023/09/purging-oracle-awr-details.html

https://abdul-hafeez-kalsekar-tuning.blogspot.com/2023/09/manually-purge-optimizer-statistics-awr.html



Obtain flush timing for all the tables. (obtained from note 1301503.1)

set pagesize 999
column name format a28
column time format a29
variable dbid number
exec select dbid into :dbid from v$database;

variable snap_id number
exec select max(snap_id) into :snap_id from wrm$_snapshot where dbid=:dbid;

select table_name_kewrtb name, end_time-begin_time time
from wrm$_snapshot_details, x$kewrtb
where snap_id = :snap_id
and dbid = :dbid
and table_id = table_id_kewrtb
order by table_id;






Data Collection for analysis  : 


1) 10046   tracing 

Try generating Snapshots manually  . Try generating 10046   tracing  in session where you are trying manual snapshots 

alter session set timed_statistics = true;
alter session set statistics_level=all;
alter session set max_dump_file_size = unlimited;
alter session set tracefile_identifier = '10046_awr';
alter session set events '10046 trace name context forever,level 12';

set timing on
EXEC DBMS_WORKLOAD_REPOSITORY.create_snapshot;




2) Try generating Tfa 

$TFA_HOME/bin/tfactl diagcollect -srdc dbawrspace



3) Verify Mmon trace files and  alert log files 



4) . AWR info report

Check to see which objects are the largest in the AWR by running an AWR info report

conn / as sysdba
@?/rdbms/admin/awrinfo.sql




4) Other Views 

================

SQL> select con_id, instance_number, snap_id, begin_interval_time, end_interval_time from cdb_hist_snapshot order by 1,2,3;

SQL> select * from cdb_hist_wr_control;



COLUMN NAME FORMAT a40
COLUMN VALUE FORMAT a30
set lines 200 pages 200

select ksppinm name,
ksppstvl value
from sys.x$ksppi x,
sys.x$ksppcv y
where (x.indx = y.indx)
and ksppinm = '_awr_mmon_cpuusage';


Select count(*) from WRI$_OPTSTAT_HISTHEAD_HISTORY ;



select dbms_stats.get_stats_history_retention from dual;

select dbms_stats.get_stats_history_availability from dual;

select systimestamp - min(savtime) from sys.wri$_optstat_histgrm_history;
select min(savtime) from sys.wri$_optstat_histgrm_history;

SELECT MIN(SNAP_ID), MAX(SNAP_ID), COUNT(*) FROM SYS.WRH$_SYSMETRIC_HISTORY A
WHERE NOT EXISTS
(SELECT * FROM SYS.DBA_HIST_SNAPSHOT B WHERE B.SNAP_ID = A.SNAP_ID AND A.DBID = B.DBID);

SELECT COUNT(1) Orphaned_ASH_Rows
FROM wrh$_active_session_history a
WHERE NOT EXISTS
(SELECT 1
FROM wrm$_snapshot
WHERE snap_id = a.snap_id
AND dbid = a.dbid
AND instance_number = a.instance_number
);


========================





**************************
set markup html on spool on
spool /tmp/AWRCheck25thNov.html
set echo on

select systimestamp from dual;
select instance_name,version,host_name,status from gv$instance;

alter session set container = CDB$ROOT;

show con_name

show parameter awr

select * from cdb_hist_wr_control;

select con_id, instance_number, snap_id, begin_interval_time, end_interval_time from cdb_hist_snapshot order by 1,2,3;

alter session set container=FREXR1P;

exec dbms_workload_repository.create_snapshot();

alter session set container = CDB$ROOT;

select con_id, instance_number, snap_id, begin_interval_time, end_interval_time from cdb_hist_snapshot order by 1,2,3;

spool off
set markup html off
**************************





References : 

Troubleshooting: Missing Automatic Workload Repository (AWR) Snapshots and Other Collection Issues (Doc ID 1301503.1)

AWR Snapshots Not Generating ORA-13516 ( Doc ID 2756259.1 )

ORA-13516 AWR Operation failed: SWRF Schema not initialized ORA-06512 SYS.DBMS_WORKLOAD_REPOSITORY ( Doc ID 459887.1 )

Problem: ORA-13516 When Trying To Make A Manual Snapshot Of Repository Database ( Doc ID 365097.1 )

AWR Snapshots Not Generating ( Doc ID 308003.1 )

ORA-13516: AWR Operation Failed: CATPROC Not Valid (Doc ID 2547174.1)

MMON Trace Shows: "*** KEWRAFC: Flush slave failed, AWR Enqueue Timeout" ( Doc ID 560204.1 )

AWR Snapshots Fail to Generate With Associated "Suspending MMON action '%s' for 82800 seconds" alert log messages and Underlying ORA-12751 Errors ( Doc ID 2043531.1 )

Monday, September 11, 2023

Oracle sysresv utility to release shared memory / semaphores -- No need for ipcs and ipcrm


After shutdown of Oracle database , shared memory segments staying behind is known issues   in Linux .  Ideally Dba use Ipcs and Ipcrm  to release  shared  memory segments .


Also there is less known Linux utility pmap 
Eg :  [oracle]$ pmap `pgrep -f lgwr`



There is  Oracle utility sysresv  that can  be used to release shared memory segments and there is no need  to use Ipcs and Ipcrm .  

Eg :  sysresv -l instance1 

 
Usage:
------

sysresv: 
usage   : sysresv [-if] [-d <on/off>] [-l sid1 <sid2> ...]
          -i : Prompt before removing ipc resources for each sid
          -f : Remove ipc resources silently, oevrrides -i option
          -d <on/off> : List ipc resources for each sid if on
          -l sid1 <sid2> .. : apply sysresv to each sid
Default : sysresv -d on -l $ORACLE_SID
Note    : ipc resources are attempted to be deleted for a
          sid only if there is no currently running instance
          with that sid. 


   
Examples:
---------

o  Instance is not running:

   /u02/app/oracle/product/8.1.7> sysresv

   IPC Resources for ORACLE_SID "R817" :
   Shared Memory
   ID              KEY
   No shared memory segments used
   Semaphores:
   ID              KEY
   No semaphore resources used
   Oracle Instance not alive for sid "R817" 


o  Instance is running:

   /u03/app/oracle/product/8.1.6> sysresv

   IPC Resources for ORACLE_SID "ORCL" :
   Shared Memory:
   ID              KEY
   16437           0xe4efa8dc
   Semaphores:
   ID              KEY
   12320802        0x09d48346
   Oracle Instance alive for sid "ORCL"


o  Attempting to remove memory and semphores using sysresv when Oracle 
   detects an instance is running:

   /u03/app/oracle/product/8.1.6> sysresv -f

   IPC Resources for ORACLE_SID "ORCL" :
   Shared Memory:
   ID              KEY
   16437           0xe4efa8dc
   Semaphores:
   ID              KEY
   12320802        0x09d48346
   Oracle Instance alive for sid "ORCL"
   SYSRESV-005: Warning
           Instance maybe alive - aborting remove for sid "ORCL" 


o  Removing IPC resources:

   [Sysresv shows memory and semaphores exist but Oracle determines the 
    instance is not alive.  Cleanup is needed.]

   /u03/app/oracle/product/8.1.6> sysresv

   IPC Resources for ORACLE_SID "ORCL" :
   Shared Memory:
   ID              KEY
   16837           0xe4efa8dc
   Semaphores:
   ID              KEY
   12714018        0x09d48346
   Oracle Instance not alive for sid "ORCL" 
 

o  Removing IPC resources using sysresv:

   /u03/app/oracle/product/8.1.6> sysresv -i

   IPC Resources for ORACLE_SID "ORCL" :
   Shared Memory
   ID              KEY
   No shared memory segments used
   Semaphores:
   ID              KEY
   No semaphore resources used
   Oracle Instance not alive for sid "ORCL"
   Remove ipc resources for sid "ORCL" (y/n)?y
   Done removing ipc resources for sid "ORCL"
   /u03/app/oracle/product/8.1.6


   Verify the resources were removed:

   /u03/app/oracle/product/8.1.6> sysresv

   IPC Resources for ORACLE_SID "ORCL" :
   Shared Memory
   ID              KEY
   No shared memory segments used
   Semaphores:
   ID              KEY
   No semaphore resources used
   Oracle Instance not alive for sid "ORCL"  

     
o  If you need to remove memory segments, and Oracle detects the
   instance is alive through sysresv:

   % ipcrm -m <memid>

   Where <memid> is the memory id shown in the sysresv output.

   Example:
   % ipcrm -m 16437

   If you need to remove semaphores, and Oracle detects the
   instance is alive through sysresv:

   % ipcrm -s <semid>

   where <semid> is the semaphore id shown in the sysresv output.

   Example:
   % ipcrm -s 12320802



Reference :

SYSRESV Utility (Doc ID 123322.1)

Startup fail with ORA-29701 & ORA-29702 after the DB crash. (Doc ID 2160481.1)