Saturday, July 10, 2021

Oracle HangAnalyze and SystemState Dump

 We usually  come situation where we need to gather system state and Hanganalyze dump for performance analyze and to upload  to Oracle support .


Let see some insight .  Below is  how i personally capture hanganalyze and system state  dump . Will explain  in detail for each component in trail blog . 


Collection commands for Hanganalyze and Systemstate: RAC with fixes for bug 11800959 and bug 11827088

For 11g:
Sqlplus '/as sysdba'
Oradebug setorapname reco
Oradebug unlimit
oradebug setinst all
Oradebug-g all hanganalyze 3
Oradebug-g all dump systemstate 267
oradebug tracefile_name
…….. Wait at least 1 min
Oradebug -g all hanganalyze 3
Oradebug-g all dump systemstate 267
oradebug tracefile_name
Exit


Collection commands for Hanganalyze and Systemstate: RAC without fixes for Bug 11800959 and Bug 11827088

Sqlplus '/as sysdba'
Oradebug setorapname reco
Oradebug unlimit
Oradebug-g all hanganalyze 3
Oradebug-g all dump systemstate 258
…….. Wait at least 1 min
Oradebug-g all hanganalyze 3
Oradebug-g all dump systemstate 258
Exit


For 10g, run oradebug setmypid instead of oradebug setorapname reco:
Sqlplus '/as sysdba'
Oradebug setmypid
Oradebug unlimit
Oradebug-g all hanganalyze 3
Oradebug-g all dump systemstate 258
…….. Wait at least 1 min
Oradebug-g all hanganalyze 3
Oradebug-g all dump systemstate 258
Exit

In RAC environment, a dump will be created for all RAC instances in the DIAG trace file for each instance.



##############################################
##############################################

Oracle Preliminary connection
________________________

When database i hung  at times we cannot  login to database . In  Such situation we need to  use preliminary connection to  connect to datbase .

sqlplus -prelim / as sysdba

sqlplus /nolog
set _prelim on
connect / as sysdba




##############################################
##############################################

System state Dump 

A system state dump contains the process state for every process.

Every state object for every process is dumped.

A state object represents the state of a database resource including:

processes
sessions
enqueues (locks)
buffers
State objects are held in the SGA

A system state dump does not represent a snapshot of the instance because the database is not frozen for the duration of the dump. The start point of the dump will be earlier than the end point.

Oracle recommends the use of system state dumps to diagnose:
hanging databases
slow databases
database errors
waiting processes
blocking processes
resource contention


Logon to sqlplus as sysdba
SQL> oradebug setmypid
SQL> oradebug unlimit
SQL> oradebug dump systemstate 266
…….. Wait at least 1 min
SQL> oradebug dump systemstate 266
…….. Wait at lease 1 min
SQL> oradebug dump systemstate 266
SQL> oradebug tracefile_name




With releases 11g & RAC and above You should attach to the DIAG process and change its file size limit.
sqlplus "/ as sysdba"
select SPID from v$process where program like '%DIAG%';
oradebug setospid <OS_PID> -- pid of the diag process
oradebug unlimit
oradebug dump systemstate 266
oradebug tracefile_name
exit



Collection commands for Hanganalyze and Systemstate: RAC
There are 2 bugs affecting RAC that without the relevant patches being applied on your system, make using level 266 or 267 very costly. Therefore without these fixes in place it highly unadvisable to use these level
For information on these patches see:
Document 11800959.8 Bug 11800959-a systemstate dump with level> = 10 in RAC dumps huge busy global cache elements-can hang/crash instances
Document 11827088.8 Bug 11827088-Latch 'gc element' contention, LMHB terminates the instance



Systemstate dump has multiple levels:
2: dump (excluding lock element)
10: dump
11: dump + global cache of RAC
256: short stack (function stack)
258: 256 + 2 --> short stack + dump (excluding lock element)
266: 256 + 10 --> short stack + dump
267: 256 + 11 --> short stack + dump + global cache of RAC


Level 11 and 267 will dump the global cache and generate a large trace file, which is generally not recommended. In general, if the process is not too many, we recommend that you use 266 because it can dump the function stack of the process and analyze what operations the process is performing. However, it takes more than 30 minutes to generate a short stack. If there are many processes, such as 2000. In this case, level 10 or level 258 can be generated. level 258 will collect more short stacks than level 10, but some lock element data will be collected less than level 10.
Although process-related information is collected through system state dump, how to effectively interpret relevant information and diagnose and analyze problems is a great challenge!

Reading and Understanding Systemstate Dumps (Doc ID 423153.1) 
How to Collect Systemstate Dumps When you Cannot Connect to Oracle (Doc ID 121779.1)



##############################################
##############################################

Hanganalyze 

Oracle notes that HANGANALYZE run at levels higher that 3 may generate a huge number of trace files for large systems. Do not use levels higher than 3 without discussing their effects with Oracle Technical Support.

Database HANG live is a headache. How to find the reason for HANG live is a problem that DBA must face. When the database HANG lives, most DBAs are often analyzed through the V $ SESSION_WAIT view. In fact, Oracle has a very effective tool-hanganalyze. HANGANALYZE can tell the DBA the information about HANG very clearly, which is convenient for further analysis.

From 19c Oracle has come up with  emergency monitoring report to replace Hanganalyze . 


Below are level we can enable hanganalyze 

     10     Dump all processes (IGN state)
     5      Level 4 + Dump all processes involved in wait chains (NLEAF state)
     4      Level 3 + Dump leaf nodes (blockers) in wait chains (LEAF,LEAF_NW,IGN_DMP state)
     3      Level 2 + Dump only processes thought to be in a hang (IN_HANG state)
   1-2    Only HANGANALYZE output, no process dump at all


Single Instace : 
_________________________
Logon to sqlplus as sysdba
oradebug setmypid;
oradebug unlimit;
oradebug hanganalyze 3;
oradebug tracefile_name


Rac : 
_________________________
Logon to sqlplus as sysdba
oradebug setmypid;
oradebug unlimit;
oradebug setinst all
oradebug -g def hanganalyze 3
oradebug tracefile_name



##############################################
##############################################


Tracing for a Specific ORA-nnnn Error


alter system set events '4021 errorstack(3)  systemstate_global(258)  hanganalyze_global(3) ' ; 
a;ter system set  events '4021 trace name all off'; 




##############################################
##############################################

 v$wait_chains.



From 11gR2 onwards, oracle has provided a dynamic performance view called v$wait_chains. This also contain same information which we gather by running the hanganalyze command. So instead of using hanganalyze, you can use the below query to find wait chains
This is done using dia0 background processes starts collecting hanganalyze information and stores this in memory in the “hang analysis cache”. It does this every 3 seconds for local hanganalyze information and every 10 seconds for global (RAC) hanganalyze information


There is no gv$ equivalent as v$wait_chains would report on multiple instances in a multi-instance (RAC) environment
Some queries for this view
SQL> SELECT chain_id, num_waiters, in_wait_secs, osid, blocker_osid, substr(wait_event_text,1,30)
 FROM v$wait_chains; 



SELECT decode( a.blocker_sid , NULL , '<chain id#' ||a.chain_id||'>' ) chain_id,
RPAD( '+' , LEVEL , '-' ) ||a.sid sid,
RPAD( ' ' , LEVEL , ' ' ) ||a.wait_event_text wait_event
FROM V$WAIT_CHAINS a
CONNECT BY PRIOR a.sid=a.blocker_sid
AND PRIOR a.sess_serial#=a.blocker_sess_serial#
AND PRIOR a.instance = a.blocker_instance START WITH a.blocker_is_valid='FALSE'
ORDER BY a.chain_id ,
LEVEL
/



Query for Top 100 wait chain processes
set pages 1000
set lines 120
set heading off
column w_proc format a50 tru
column instance format a20 tru
column inst format a28 tru
column wait_event format a50 tru
column p1 format a16 tru
column p2 format a16 tru
column p3 format a15 tru
column Seconds format a50 tru
column sincelw format a50 tru
column blocker_proc format a50 tru
column fblocker_proc format a50 tru
column waiters format a50 tru
column chain_signature format a100 wra
column blocker_chain format a100 wra
SELECT * 
FROM (SELECT 'Current Process: '||osid W_PROC, 'SID '||i.instance_name INSTANCE, 
 'INST #: '||instance INST,'Blocking Process: '||decode(blocker_osid,null,'',blocker_osid)|| 
 ' from Instance '||blocker_instance BLOCKER_PROC,
 'Number of waiters: '||num_waiters waiters,
 'Final Blocking Process: '||decode(p.spid,null,'',
 p.spid)||' from Instance '||s.final_blocking_instance FBLOCKER_PROC, 
 'Program: '||p.program image,
 'Wait Event: ' ||wait_event_text wait_event, 'P1: '||wc.p1 p1, 'P2: '||wc.p2 p2, 'P3: '||wc.p3 p3,
 'Seconds in Wait: '||in_wait_secs Seconds, 'Seconds Since Last Wait: '||time_since_last_wait_secs sincelw,
 'Wait Chain: '||chain_id ||': '||chain_signature chain_signature,'Blocking Wait Chain: '||decode(blocker_chain_id,null,
 '',blocker_chain_id) blocker_chain
FROM v$wait_chains wc,
 gv$session s,
 gv$session bs,
 gv$instance i,
 gv$process p
WHERE wc.instance = i.instance_number (+)
 AND (wc.instance = s.inst_id (+) and wc.sid = s.sid (+)
 and wc.sess_serial# = s.serial# (+))
 AND (s.final_blocking_instance = bs.inst_id (+) and s.final_blocking_session = bs.sid (+))
 AND (bs.inst_id = p.inst_id (+) and bs.paddr = p.addr (+))
 AND ( num_waiters > 0
 OR ( blocker_osid IS NOT NULL
 AND in_wait_secs > 10 ) )
ORDER BY chain_id,
 num_waiters DESC)
WHERE ROWNUM < 101;



##############################################
##############################################

Reference :

Reading and Understanding Systemstate Dumps (Doc ID 423153.1)
How to Collect Systemstate Dumps When you Cannot Connect to Oracle (Doc ID 121779.1)
How to Collect Diagnostics for Database Hanging Issues (Doc ID 452358.1)

Tuesday, July 6, 2021

Oracle Rac Administrator Managed VS Policy Managed ( Server pool )

There are two methods to manage your cluster database.


1) Administrator Managed database
2) Policy Managed Database ( Introduced in 11g R2 ) 








Administrator Managed database

Administrator Managed Database : In Administrator Managed database , a DBA manages each instance of the database by defining specific instances to run on specific nodes in the cluster.It is traditional way of cluster management.

When you define a service for an administrator managed database, you define which instances support that service. These are known as the PREFERRED instances. 
You can also define other instances to support a service if the service’s preferred instance fails. These are known as AVAILABLE instances.



Policy Managed Database

Policy Managed Database : It’s a new method to manage clustered database.It is introduced to help implement dynamic grid configurations. 
In Policy Managed database , DBA is required only to define the cardinality(number of database required).  
Oracle Clusterware manages the allocation of nodes to run the instances and Oracle RAC allocates the required redo threads and undo tablespaces as needed.

Service for Policy managed database are defined as UNIFORM and SINGLETON.

UNIFORM : Running on all instances in the server pool

SINGLETON : Running on only one instance in the server pool.For singleton services, RAC chooses on which instance in the server pool the service is active. 
If that instance fails, then the service fails over to another instance in the pool. A service can only run in one server pool.
 

Some benefits of the Policy-Managed configuration:
Provides distribution according to demand.
Manages business requirements effectively.
Scales just in time.
Maximizes data center efficiency.


By default, there are two server pools, GENERIC and FREE. We can display existing pools as follows.

-bash-4.3$ srvctl config srvpool
Server pool name: Free
Importance: 0, Min: 0, Max: -1
Candidate server names: 
Server pool name: Generic
Importance: 0, Min: 0, Max: -1
Candidate server names: racdb1,racdb2



Features of server pools :
min : (-l)
The minimum number of servers that must be protected in the server pool.

max : (-u)
Maximum number of servers allowed in server pool.

imp : (-i)



By following these steps, you can convert a database that is created as Admin-Managed to a Policy-Managed database.

First, let’s look at our current configuration.

$ srvctl config database -d RACDB
Database unique name: RACDB
Database name: RACDB
Oracle home: /u01/app/oracle/product/11.2.0/dbhome_1
Oracle user: oracle
Spfile: +DATA/RACDB/spfileRACDB.ora
Domain: 
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: RACDB
Database instances: RACDB1,RACDB2
Disk Groups: DATA,RECO
Mount point paths: 
Services: 
Type: RAC
Database is administrator managed



$ srvctl config database -d RACDB
Database unique name: RACDB
Database name: RACDB
Oracle home: /u01/app/oracle/product/11.2.0/dbhome_1
Oracle user: oracle
Spfile: +DATA/RACDB/spfileRACDB.ora
Domain: 
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: RACDB
Database instances: RACDB1,RACDB2
Disk Groups: DATA,RECO
Mount point paths: 
Services: 
Type: RAC
Database is administrator managed
Let’s create a new pool. The min and max values for the new pool are set to 2.


$ srvctl add serverpool -g testpool -l 2 -u 2

$ srvctl add serverpool -g testpool -l 2 -u 2
When we view existing pools, the pool we just created is visible.


$ srvctl config srvpool
Server pool name: Free
Importance: 0, Min: 0, Max: -1
Candidate server names: 
Server pool name: Generic
Importance: 0, Min: 0, Max: -1
Candidate server names: racdb1,racdb2
Server pool name: testpool
Importance: 0, Min: 2, Max: 2
Candidate server names:


$ srvctl config srvpool
Server pool name: Free
Importance: 0, Min: 0, Max: -1
Candidate server names: 
Server pool name: Generic
Importance: 0, Min: 0, Max: -1
Candidate server names: racdb1,racdb2
Server pool name: testpool
Importance: 0, Min: 2, Max: 2
Candidate server names:
Now, we can convert our database.


$ srvctl modify database -d RACDB -g testpool


$ srvctl modify database -d RACDB -g testpool
The database configuration will now appear as policy managed.


$ srvctl config database -d RACDB
Database unique name: RACDB
Database name: RACDB
Oracle home: /u01/app/oracle/product/11.2.0/dbhome_1
Oracle user: oracle
Spfile: +DATA/RACDB/spfileRACDB.ora
Domain: 
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: testpool
Database instances: 
Disk Groups: DATA,RECO
Mount point paths: 
Services: 
Type: RAC
Database is policy managed


$ srvctl config database -d RACDB
Database unique name: RACDB
Database name: RACDB
Oracle home: /u01/app/oracle/product/11.2.0/dbhome_1
Oracle user: oracle
Spfile: +DATA/RACDB/spfileRACDB.ora
Domain: 
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: testpool
Database instances: 
Disk Groups: DATA,RECO
Mount point paths: 
Services: 
Type: RAC
Database is policy managed


Now our database is policy managed. Our database will run when there is an appropriate server for the testpool we created above. 
We can check the status of the server repositories as follows.

$ srvctl status serverpool
Server pool name: Free
Active servers count: 0
Server pool name: Generic
Active servers count: 0
Server pool name: testpool
Active servers count: 2


$ srvctl status serverpool
Server pool name: Free
Active servers count: 0
Server pool name: Generic
Active servers count: 0
Server pool name: testpool
Active servers count: 2

Saturday, July 3, 2021

Oracle Asynchronous Global Index Maintenance jobs for DROP and TRUNCATE Partition in Oracle Database 12c Release 1

 
This new feature in Oracle 12C is as default always on. Each TRUNCATE or DROP commands performed on a partition automatically triggers asynchronous global index maintenance. It means that you don’t need to wait for global INDEX maintenance 

The column DBA_INDEXES.ORPHANED_ENTRIES shows that Oracle is aware that the index may have keys referring to dropped partitions. There is a pre-seeded daily job that tidies them up; by default, it runs every night from 22:00. 

Global index maintenance is decoupled from the DROP and TRUNCATE partition maintenance operation without rendering a global index unusable. Index maintenance is done asynchronously and can be delayed to a later point-in-time.
 
Delay global index maintenance to off-peak hours without affecting index availability, and reduce and truncate partition and sub-partition maintenance operations faster and with fewer resources at the point in time for partition maintenance operations.
 
When combined with the update index clause, the DROP partition and the TRUNCATE partition command will result in metadata index maintenance. This feature is only used for heap tables and does not support object types, domain indexes, or tables owned by SYS.


For backward compatibility you still need to specify UPDATE INDEXES clause.

Limitations of asynchronous global index maintenance:

Only performed on heap tables
No support for tables with object types
No support for tables with domain indexes
Not performed for the user SYS



Asynchronous cleanup orphans in an index can be done:

Automatically by Oracle job SYS.PMO_DEFERRED_GIDX_MAINT_JOB

Manually
just run above job SYS.PMO_DEFERRED_GIDX_MAINT_JOB
run procedure DBMS_PART.CLEANUP_GIDX
run sql statement ALTER INDEX REBUILD [PARTITION] – the same like in previous releases
run sql statement ALTER INDEX [PARTITION] COALESCE CLEANU


 select job_name , start_date,enabled,state,comments
  2  from dba_scheduler_jobs
  3  where job_name ='PMO_DEFERRED_GIDX_MAINT_JOB';

JOB_NAME             START_DATE           ENABL STATE           COMMENTS
-------------------- -------------------- ----- --------------- --------------------
PMO_DEFERRED_GIDX_MA 29-JUN-13 02.00.00.6 TRUE  SCHEDULED       Oracle defined autom
INT_JOB              00000 AM US/CENTRAL                                                            atic index cleanup f
                                                                                                                                    or partition mainten
                                                                                                                                        ance operations with
                                                                                                                                 deferred global ind
                                                                                                                                    ex maintenance




Tradional Way : 
________________________

ALTER TABLE t1 TRUNCATE PARTITION part_2014 DROP STORAGE UPDATE INDEXES;
ALTER TABLE t1 DROP PARTITION part_2014 UPDATE INDEXES;



To Run Job manually 
________________________

select enabled,run_count from dba_scheduler_jobs where job_name='PMO_DEFERRED_GIDX_MAINT_JOB';

exec dbms_scheduler.run_job('SYS.PMO_DEFERRED_GIDX_MAINT_JOB')



Manually trigger the index maintenance.
________________________

EXEC DBMS_PART.cleanup_gidx(USER, 't1');



To Enable Logging 
________________________

By default run history for this job wont be visible . We need  to force enable its logging as per Doc 2506878.1 

sqlplus / as sysdba

SQL> select job_name from dba_Scheduler_jobs where job_class = 'SCHED$_LOG_ON_ERRORS_CLASS'

exec dbms_scheduler.set_attribute('<JOB_NAME>', 'logging_level',DBMS_SCHEDULER.LOGGING_FULL);

  


Disable Job 
________________________

This was main reason for me  to write Blog on this . We were facing  locking issue in most of database  so we planned to disable  this job 

select job_name, enabled from DBA_SCHEDULER_JOBS WHERE job_name = 'SPMO_DEFERRED_GIDX_MAINT_JOB';

exec dbms_scheduler.run_job('PMO_DEFERRED_GIDX_MAINT_JOB')

EXEC SYS.DBMS_SCHEDULER.DISABLE ('PMO_DEFERRED_GIDX_MAINT_JOB'); 


In case we are planing to disable all scheduled jobs we can use below  : 

  col ATTRIBUTE_NAME for a30
  col VALUE for a60
  set lines 180
  set pages 999
  select * from DBA_SCHEDULER_GLOBAL_ATTRIBUTE;

  exec dbms_scheduler.set_scheduler_attribute('SCHEDULER_DISABLED','FALSE');





Reference: 
1) New Oracle Provided 12C Jobs are Not Showing any Execution Run History Details (Doc ID 2506878.1) 

2) https://docs.oracle.com/database/121/VLDBG/GUID-087B87A6-959A-40C6-82AF-36E401FD089B.htm#VLDBG14107

Saturday, June 26, 2021

Tuning Oracle Rac InterConnect


In Rac   most important part is interconnect performance . We will walk through how to tune oracle cluster resource . 


Critical Oracle Background Processes using private interconnect

• LMSn (Global Cache Service Process)
Handles cache fusion(GCS) (Block transfers, messages)
Maintains records of each data file and block open in a cache in GRD (Global Resource Directory)
Controls flow of messages of remote instance(s)
Tracks global data block access
• LCK0 (Instance Enqueue Background Process)
Handles requests related to non-fusion resources such as library and row cache requests
• LMD0 (Global Enqueue Service Daemon)
Manages global enqueues
Detection of deadlocks
• LMON (Global Enqueue Service Monitor)
Maintenance instance membership
Reconfiguration of GCS/GES during recovery from instance crash or startup
• LMHB (Global Cache/Enqueue Service Heartbeat Monitor)
Monitors LMON, LMD and LMSn Processes

 


Wait events in RAC Database  :

1. gc cr/current grant 2-way (message wait)
   • Block does not exist in cache and LMS grants to FG process to read from disk
2. gc cr/current block 2-way/3-way (block wait)
   • Requested for read and write operations
   • Every execution triggers this since SCN advances
3. gc cr/current block congested (block congestion waits)
   • LMS didn’t process within 1 ms due to CPU shortage or scheduling delays
4. gc cr/current block busy (block concurrency/contention waits)
   • Indicates high concurrency
   • LMS needs to perform additional work to prepare block in requested mode
5. gcs log flush sync
   • Before sending reconstructed CR/CURR block, LMS request LGWR to flush redo vectors
6. gc cr failure/gc cr retry
   • Inefciencies with interconnect or invalid block request or checksum error
7. gc cr/current block corrupt/lost
   • Dropped packets due bufer overflow
   • Misconfigured interconnect

Gc [current/cr] [2/3]-way – If you have 2 node cluster you cannot get 3-way as only two RAC instances are available and therefore only 2-way is possible as you can have at most two hops. If you have three or more RAC instances then 2-way or 3-way is possible. Blocks are received after 2 or 3 network hops immediately. The event is not a subject to any tuning except increasing private interconnects bandwidth and decreasing the private interconnects latency. Monitor if average ms > 1ms or close to Disk I/O latency. Look at reducing latency.

Gc [current/cr] grant 2-way – Event when grant is received immediately. Grant is always local or 2-way. Grant occurs when a request is made for a block image current or cr and no instance have the image in its local buffer cache. The requesting instance is required to do an I/O from data file to get the blocks. The grant simply is a permission from the LMS this to happen that is, the process to read the block from the data file. Grant can be either cr or current. Gc current grant is go read the block from the database files, while gc cr grant is read the block from disk and build a read consistent block once is read.

Gc [current/cr][block/grant] congested – means that it has been received eventually but with a delay because of intensive CPU consumption, memory lack, LMS overload due to much work in the queues, paging, swapping. This is worth investigating as it provides a room for improvement. You should look at it as it indicates that LMS could not dequeue message fast enough.

Gc [current/cr] block busy – Received but not sent immediately due to high concurrency or contention. This means that the block is busy for example somebody issue block recover command from RMAN. Variety of reasons for being busy just means cannot be sent immediately but not because of memory, LMS or system oriented reasons but Oracle oriented reasons. It is also worth investigating.

Gc current grant busy – Grant is received but there is a delay due to many shared block images or load. For example you are extending the high water mark and you are formatting the block images or blocks with block headers.

Gc [current/cr][failure/retry] – Not received because of failure, checksum error usually in the protocol of the private interconnect due to network errors or hardware problems. This is something worth investigating. Failure means that cannot receive the block image while retry means that the problems recovers and ultimately the block image can be received but it needs to retry.

Gc buffer busy – time between block accesses less than buffer pin time. Pin buffers can be in exclusive or shared mode depending if buffers can be modified or read only. Obviously if there is a lot of contention for the same block by different processes this event can manifest itself in grater magnitude. Buffer busy are global cache events as a request is made from one instance and the block is available in another instance and the block is busy due to contention.



 How to check interconnect details : 

$ oifcfg getif
lan902 172.17.1.0 global  
lan901 10.28.188.0 global public  





 Ways  to check interconnect performance : 

1)  ifconfig -a 
Using ifconfig we can find if any packets are dropped 

ifconfig –a: eth0 Link encap:Ethernet HWaddr 00:0B:DB:4B:A2:04 inet addr:130.35.25.110 Bcast:130.35.27.255 Mask:255.255.252.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:21721236 errors:135 dropped:0 overruns:0 frame:95 TX packets:273120 errors:0 dropped:27 overruns:0 carrier:0





2)  global awr report 

In Global Awr report we can find cluster performance .






2) From  V$ views 

We use racdiag.sql  .   Also have posted  various scripts used my me . 

Script to Collect RAC Diagnostic Information (racdiag.sql) (Doc ID 135714.1)



1. Estimate Global Cache traffic flowing in/out a given node from AWR
• Messages are typically 200 bytes in size or less while CR/Curr blocks are in 8k(Same as
DB_BLOCK_SIZE) in size
• Goal is to keep network Packets/sec under 70% of estimate throughput of interconnect device
• DBA_HIST_SYSSTAT provides data related to all GC wait events, block/message transfers
• DBA_HIST_IC_DEVICE_STATS provides stats like packets received/transmitted/dropped for each
interface
• DBA_HIST_IC_CLIENT_STATS provides usage of interconnect by area(IPQ, DLM and Cache) 



select * from select * from v$instance_cache_transfer v$instance_cache_transfer
where class = 'data block' where class = 'data block'
and instance = 1;

 SET lines 200 pages 100
 SELECT inst_id,cr_requests,current_requests
 FROM gv$cr_block_server;
 
  SET lines 200 pages 100
 SELECT * FROM gv$cluster_interconnects ORDER BY inst_id,name;
 
  SET lines 200 pages 100
 SELECT * FROM gv$configured_interconnects ORDER BY inst_id,name;
 
SET lines 200 pages 100
SELECT inst_id,class,cr_block,current_block
FROM gv$instance_cache_transfer WHERE instance IN (1,2)
ORDER BY inst_id,class;

 #####  Interesting one coming from racdiag.sql script:
 SET lines 200
 SELECT b1.inst_id, b2.VALUE "GCS CR BLOCKS RECEIVED",
     b1.VALUE "GCS CR BLOCK RECEIVE TIME",
     ((b1.VALUE / b2.VALUE) * 10) "AVG CR BLOCK RECEIVE TIME (ms)"
     FROM gv$sysstat b1, gv$sysstat b2
     WHERE b1.name = 'global cache cr block receive time'
     AND b2.name = 'global cache cr blocks received'
     AND b1.inst_id = b2.inst_id OR b1.name = 'gc cr block receive time'
     AND b2.name = 'gc cr blocks received'
     AND b1.inst_id = b2.inst_id;
 

####Latency of interconnect messages:
  SET lines 200 pages 100
  SELECT * FROM dba_hist_interconnect_pings WHERE snap_id=2846;
 
 
#### Global Cache Service processes (LMS) statistics:
 SET lines 200 pages 100
 SELECT * FROM gv$current_block_server;
 

  SELECT
  DECODE(name,'gc cr blocks received','global cache blocks received','gc cr blocks served','global cache blocks served','gc current blocks received','global cache blocks     received','gc current blocks served','global cache blocks served',name) AS name,
  SUM(VALUE) AS VALUE,
  TO_CHAR(SYSDATE,'yyyy-mm-dd hh24:mi:ss') AS date_taken
  FROM gv$sysstat
  WHERE inst_id=1
  AND name IN ('gc cr blocks received','gc cr blocks served','gc current blocks received','gc current blocks served','gcs messages sent','ges messages sent')
  GROUP BY DECODE(name,'gc cr blocks received','global cache blocks received','gc cr blocks served','global cache blocks served','gc current blocks received','global cache   blocks received','gc current blocks served','global cache blocks served',name)
  UNION
  SELECT name,VALUE,TO_CHAR(SYSDATE,'yyyy-mm-dd hh24:mi:ss') AS date_taken
  FROM gv$dlm_misc
  WHERE name IN ('gcs msgs received','ges msgs received')
  AND inst_id=1;





Ways to Tune  interconnect :

A) Network Kernel Buffer 


UDP buffers
You want the fastest possible network to be used for the interconnect. To maximize your speed and efficiency on the interconnect, you should ensure that the User Datagram Protocol (UDP) buffers are set to the correct values. On Linux, you can check this via the following command:

   sysctl net.core.rmem_max net.core.wmem_max net.core.rmem_default net
   .core.wmem_default
   net.core.rmem_max = 4194304
   net.core.wmem_max = 1048576
   net.core.rmem_default = 262144
   net.core.wmem_default = 262144

Alternatively, you can read the associated values directly from the respective files in the directory /proc/sys/net/core. These values can be increased via the following SYSCTL commands:

   sysctl -w net.core.rmem_max=4194304
   sysctl -w net.core.wmem_max=1048576
   sysctl -w net.core.rmem_default=262144
   sysctl -w net.core.wmem_default=262144

The numbers in this example are the recommended values for Oracle RAC on Linux and are more than sufficient for the majority of configurations. Nevertheless, let’s talk about some background of the UDP buffers. The values determined by rmem_max and wmem_max are on a “per-socket” basis. So if you set rmem_max to 4MB, and you have 400 processes running, each with a socket open for communications in the interconnect, then each of these 400 processes could potentially use 4MB, meaning that the total memory usage could be 1.6GB just for this UDP buffer space. However, this is only “potential” usage. So if rmem_default is set to 1MB and rmem_max is set to 4MB, you know for sure that at least 400MB will be allocated (1MB per socket). Anything more than that will be allocated only as needed, up to the max value. So the total memory usage depends on the rmem_default, rmem_max, the number of open sockets, and the variable piece of how much buffer space each process is actually using. This is an unknown—but it could depend on the network latency or other characteristics of how well the network is performing and how much network load there is altogether. To get the total number of Oracle-related open UDP sockets, you can execute this command:

   netstat -anp -udp | grep ora | wc -l



B) Remove Reverse Path Filtering interference, by means of configuring rp_filter kernel parameter with value 0 (disabled) or 2 (loose). This is an IP level functionality, so in case we would be using HAIP over bonding, this parameter must be configured at bonding OS device and not the individual interface devices. With this configuration, we are preventing, in 2.6.31 and up kernels, a situation where blocks get locked or discarded. We can find more specific information in MOS rp_filter for multiple private interconnects and Linux Kernel 2.6.32+ (Doc ID 1286796.1).


C) Jumbo Frames introduces the ability for an Ethernet frame to exceed its IEEE 802 specified Maximum Transfer Unit of 1500 bytes up to a maximum of 9000 bytes. Even though Jumbo Frames is widely available in most NICs and data-center class managed switches it is not an IEEE approved the standard. While the benefits are clear, Jumbo Frames interoperability is not guaranteed with some existing networking devices. Though Jumbo Frames can be implemented for private Cluster Interconnects, it requires very careful configuration and testing to realize its benefits. In many cases, failures or inconsistencies can occur due to incorrect setup, bugs in the driver or switch software, which can result in sub-optimal performance and network errors.


D) Parameters Influencing RAC behavior

Disclaimer: Hidden parameters (those that begin with an underscore) such as _LM_DLMD_PRCS should be modified with caution and only after first consulting with Oracle Support. It should be noted that Oracle could change or remove such parameters without notice in subsequent releases.

In Oracle RAC, the setting of DB_BLOCK_SIZE multiplied by the MULTI_BLOCK_READ_COUNT determines the maximum size of a message for the Global Cache and the PARALLEL_EXECUTION_MESSAGE_SIZE determines the maximum size of a message used in Parallel Query. These message sizes can range from 2K to 64K or more and hence will get fragmented more so with a lower/default MTU.

The LMS background process (represents global cache services or GCS) of the requested instance may not be able to keep up with the requests from other instances and may cause the LMT background process (represents global enqueue services or GES) to queue new requests increasing its backlog and thus causing delayed response. The LMS process accepts requests on a First-In-First-Out (FIFO) basis. Oracle by default creates one LMS process for every two CPU’s on the system. While Oracle has not provided any direct method to tune the number of LMS processes, one method available is to increase them by modifying a hidden parameter _LM_DLMD_PROC


1. Always run critical background processes in RT priority • _lm_lms_priority_dynamic = false • _high_priority_processes = ‘LMS*|VKTM|LMD*|LGWR’

2. Disable Undo DRM during instance start after crash • _gc_undo_afnity = false

3. Disable DRM to avoid unacceptable and unpredictable freezes • _gc_policy_time = 0

4. Minimize reconfiguration time for bigger SGA • _lm_tickets • gcs_server_processes

5. Monitoring related critical background parameters • Heartbeat ping to local processes • _lm_rcvr_hang_check_frequency • _lm_rcvr_hang_allow_time • _lm_rcvr_hang_kill • Heartbeat ping to peer process on remote instances • _lm_idle_connection_check • _lm_idle_connection_check_interval • _lm_idle_connection_kill

6. Fairness and light work rule (_fairness_threshold) 



E)  Reducing Full table scans 

Evaluate and reduce the number of full table scans on the database. Queries that retrieve data by performing full table scans could cause large cache movement between instances and thus cause significant load on the cluster interconnects.


F)  Using multiple cluster interconnects 

Multiple cluster interconnects can be configured and allocated to Oracle using the parameter CLUSTER_INTERCONNECTS. The parameter overrides the default interconnect setting at the operating system level with a preferred traffic network and in certain platforms disables the high availability feature of RAC.


G)  Dont Use too  big Sga : 

Though this is not way to Tune Interconnect  but to avoid block transfer from one node to Another .  I have seen experts recommending not to use too large sga/buffer pool  , as they prefer physical reads over   cache fusion . 




Tuning Cluster Timeout parameters :

Most of node eviction  happens  due to inproper configuration of  cluster timeouts . Please work with Oracle support for idle  value however below i am sharing based  on
expirience .  Have seen this that after migration time out values changes.

The CSS misscount parameter represents the maximum time, in seconds, that a heartbeat can be missed before entering into a cluster reconfiguration to evict the node.

Disktimeout: Disk Latencies in seconds from node-to-Votedisk. Default Value is 200. (Disk IO)


crsctl get css diagwait    --- should give 13
crsctl get css misscount   --- should give  30
crsctl get css disktimeout --- should give   200
crsctl get css reboottime  --- should give 3
 
crsctl set css diagwait 13
crsctl set css misscount  30
crsctl set css disktimeout   200
crsctl set css reboottime  3


Reference : 
Steps To Change CSS Misscount, Reboottime and Disktimeout (Doc ID 284752.1)




Other issues with Rac Performance : 

A) Parallel Query Parallel Query
 One major issue in RAC is parallel query One major issue in RAC is parallel query   that goes across many nodes

Define Instance Groups Define Instance Groups

Specify in Specify in init.ora init.ora
prodb1.instance_groups='pqgroup1' prodb1.instance_groups='pqgroup1'
prodb2.instance_groups='pqgroup2' prodb2.instance_groups='pqgroup2'

Specify Instance Groups in Session Specify Instance Groups in Session
SQL> alter session set SQL> alter session set
parallel_instance_group parallel_instance_group ='pqgroup1';





RAC Common Tuning Tips

 Application tuning is often the most beneficial
 Resizing and tuning the buffer cache
 Increasing sequence caches to a high value: to avoid index leaf contention caused by high or batch inserts.
 Reducing long full-table scans in OLTP systems: to reduce GCS requests.
 Using Automatic Segment Space Management
 Using partitioning to reduce interinstance traffic
 Avoiding unnecessary parsing
 Minimizing locking usage
 Removing unselective indexes
 Configuring interconnect properly
  Increse Lms process  -- set GCS_SERVER_PROCESSES to 6/8 (needs instance restarting for parameter to take affect - in rolling)




Script intended to display RAC private interconnect traffic


#!/usr/bin/ksh
#
#  
#
# -----------------------------------------------------------------------------
#
# Script intended to display RAC private interconnect traffic
#
# -----------------------------------------------------------------------------
#
# Usage : $0 <instance name> <interval in seconds (default 10)> <count (default 5)>
#
# According to sprepins.sql file in $ORACLE_HOME/rdbms/admin RAC interconnect
# traffic formula is
# Estd Interconnect traffic:
# ((Global Cache blocks received + Global Cache blocks served)*db_block_size
# +(GCS/GES messages received + GCS/GES messages sent)*200)/elapsed time
#

# -----------------------------------------------------------------------------
# Variables
# -----------------------------------------------------------------------------

OSTYPE=$(uname -s)
case $OSTYPE in
"AIX" ) alias bdf="/usr/bin/df -Ik"
        alias ll="/usr/bin/ls -l" ;;
"SunOS") alias bdf="/usr/bin/df -k"
         alias ll="/usr/bin/ls -l"
         alias awk="/usr/xpg4/bin/awk"
         alias grep="/usr/xpg4/bin/grep" ;;
"Linux") alias bdf="/bin/df -k"
         alias grep=egrep
         alias ll="ls -l" ;;
esac

LOG=/tmp/`basename $0`.$$.log

DB_BLOCK_SIZE=0
START_GCBR=0
START_GCBS=0
START_GCSMS=0
START_GCSMR=0
START_GESMS=0
START_GESMR=0
START_DATE="N/A"
END_DATE="N/A"
TRAFFIC=0
i=1

# -----------------------------------------------------------------------------
# Initialization
# -----------------------------------------------------------------------------

if [ "$LOGNAME" != "root" ]
then
  echo -e "\nYou must launch this script as root\n"
  exit 1
fi

if [ $# -lt 1 ]
then
  echo -e "Usage : $0 <instance name> <interval in seconds (default 10)> <count (default 5)>\n"
  echo -e "List of instances : \n`grep -v -e "^#" -e "^$" /etc/oratab|cut -f1 -d:`"
  exit 1
else
  if grep -q "^$1:" /etc/oratab
  then
    echo -e "\nInstance $1 exists in /etc/oratab\n"
  else
    echo -e "Instance <$1> does not exist in /etc/oratab"
    echo -e "List of existing instances:\n`grep -v -e "^#" -e "^$" /etc/oratab|cut -f1 -d:`"
    exit 1
  fi
fi

ORACLE_SID=$1

if [ -n "$2" ]
then
  INTERVAL=$2
else
  INTERVAL=10
fi

if [ -n "$3" ]
then
  COUNT=$3
else
  COUNT=5
fi

if [ `bdf /tmp|grep tmp|awk '{print $4}'|tr -d %` -ge 98 ]
then
  echo -e "\nFile system /tmp is full\n"
  exit 1
fi

rm -f ${LOG}

IFS=:
OHOME=`grep "^$ORACLE_SID:" /etc/oratab | cut -d: -f2`
FLAG=`grep "^$ORACLE_SID:" /etc/oratab | cut -d: -f3`
USER=`grep "^$ORACLE_SID:" /etc/oratab | cut -d: -f4`
VG=`grep "^$ORACLE_SID:" /etc/oratab | cut -d: -f5`

# -----------------------------------------------------------------------------
# Main
# -----------------------------------------------------------------------------

echo -e "Please wait $INTERVAL seconds...\n"

IFS=*
while [ $i -le $COUNT ]
do
su - ${USER} -c "
sqlplus -S / as sysdba << EOF > ${LOG}
set serveroutput on size 999999 linesize 200 feedback off
declare
  instance_number number;
db_block_size number;
type figures_type is table of number index by varchar2(64);
  start_figures figures_type;
  end_figures figures_type;
cursor cursor1(instance_number number) is select
decode(name,'gc cr blocks received','Global Cache blocks received','gc cr blocks served','Global Cache blocks served','gc current blocks received','Global Cache blocks received','gc current blocks served','Global Cache blocks served',name) as name,
sum(value) as value,to_char(sysdate,'yyyy-mm-dd hh24:mi:ss') as date_taken
from gv\\\$SYSSTAT
where inst_id=instance_number
and name in ('gc cr blocks received','gc cr blocks served','gc current blocks received','gc current blocks served','gcs messages sent','ges messages sent')
group by decode(name,'gc cr blocks received','Global Cache blocks received','gc cr blocks served','Global Cache blocks served','gc current blocks received','Global Cache blocks received','gc current blocks served','Global Cache blocks served',name)
union
select name,value,to_char(sysdate,'yyyy-mm-dd hh24:mi:ss') as date_taken
from gv\\\$DLM_MISC
where name in ('gcs msgs received','ges msgs received')
and inst_id=instance_number;
  item1 cursor1%rowtype;
  plsql_start_date varchar2(19);
  plsql_end_date varchar2(19);
  interval number;
computed_interval number;
  i number;
begin
  select value into instance_number from v\\\$parameter where name='instance_number';
select value into db_block_size from v\\\$parameter where name='db_block_size';
  interval:=${INTERVAL};
  start_figures('Global Cache blocks received'):=${START_GCBR};
start_figures('Global Cache blocks served'):=${START_GCBS};
start_figures('gcs messages sent'):=${START_GCSMS};
start_figures('gcs msgs received'):=${START_GCSMR};
start_figures('ges messages sent'):=${START_GESMS};
start_figures('ges msgs received'):=${START_GESMR};
plsql_start_date:='${START_DATE}';

  --dbms_output.put_line('Instance Number: ' || instance_number);
  --dbms_output.put_line('db_block_size: ' || db_block_size);
--dbms_output.put_line('interval: ' || interval);
if (start_figures('Global Cache blocks received')=0) then
  open cursor1(instance_number);
  loop
    fetch cursor1 into item1;
  exit when cursor1%notfound;
  start_figures(item1.name):=item1.value;
  end loop;
  plsql_start_date:=item1.date_taken;
close cursor1;
  dbms_lock.sleep(interval);
end if;

open cursor1(instance_number);
loop
  fetch cursor1 into item1;
exit when cursor1%notfound;
end_figures(item1.name):=item1.value;
end loop;
plsql_end_date:=item1.date_taken;
close cursor1;

dbms_output.put_line('DB_BLOCK_SIZE=' || db_block_size);
dbms_output.put_line('START_GCBR=' || start_figures('Global Cache blocks received'));
dbms_output.put_line('START_GCBS=' || start_figures('Global Cache blocks served'));
dbms_output.put_line('START_GCSMS=' || start_figures('gcs messages sent'));
dbms_output.put_line('START_GCSMR=' || start_figures('gcs msgs received'));
dbms_output.put_line('START_GESMS=' || start_figures('ges messages sent'));
dbms_output.put_line('START_GESMR=' || start_figures('ges msgs received'));
dbms_output.put_line('END_GCBR=' || end_figures('Global Cache blocks received'));
dbms_output.put_line('END_GCBS=' || end_figures('Global Cache blocks served'));
dbms_output.put_line('END_GCSMS=' || end_figures('gcs messages sent'));
dbms_output.put_line('END_GCSMR=' || end_figures('gcs msgs received'));
dbms_output.put_line('END_GESMS=' || end_figures('ges messages sent'));
dbms_output.put_line('END_GESMR=' || end_figures('ges msgs received'));
dbms_output.put_line('START_DATE=\"' || plsql_start_date || '\"');
dbms_output.put_line('END_DATE=\"' || plsql_end_date || '\"');
computed_interval:=round((to_date(plsql_end_date,'yyyy-mm-dd hh24:mi:ss')-to_date(plsql_start_date,'yyyy-mm-dd hh24:mi:ss'))*24*60*60);
dbms_output.put_line('COMPUTED_INTERVAL=' || computed_interval);
dbms_output.put_line('TRAFFIC=' || ((end_figures('Global Cache blocks received')+end_figures('Global Cache blocks served')-start_figures('Global Cache blocks received')-start_figures('Global Cache blocks served'))*db_block_size+(end_figures('gcs messages sent')+end_figures('gcs msgs received')+end_figures('ges messages sent')+end_figures('ges msgs received')-start_figures('gcs messages sent')-start_figures('gcs msgs received')-start_figures('ges messages sent')-start_figures('ges msgs received'))*200)/1024/computed_interval);

end;
/
EOF
"

while read line
do
  #echo $line
eval $line
done < ${LOG}

#TRAFFIC=`expr (($END_GCBR+$END_GCBS-$START_GCBR-$START_GCBS)*$DB_BLOCK_SIZE+($END_GCSMS+$END_GCSMR+$END_GESMS+$END_GESMR-$START_GCSMS-$START_GCSMR-$START_GESMS-$START_GESMR)*200)/1024/$INTERVAL)`
if [ $TRAFFIC -le 1024 ]
then
  echo -e "Estd Interconnect traffic at $END_DATE (KBytes/s): $TRAFFIC\n"
else
  TRAFFIC=`echo $TRAFFIC/1024 | bc -l`
  echo -e "Estd Interconnect traffic at $END_DATE (MBytes/s): $TRAFFIC\n"
fi

#cat ${LOG}

START_GCBR=$END_GCBR
START_GCBS=$END_GCBS
START_GCSMS=$END_GCSMS
START_GCSMR=$END_GCSMR
START_GESMS=$END_GESMS
START_GESMR=$END_GESMR
START_DATE=$END_DATE

i=`expr $i + 1`

sleep $INTERVAL
done

rm -f ${LOG}




References : 

Troubleshooting gc block lost and Poor Network Performance in a RAC Environment (Doc ID 563566.1)

( Doc ID 1619155.1 ) Best Practices and Recommendations for RAC databases with SGA size over 100GB

Saturday, June 19, 2021

Oracle Fine-Grained Access Control for UTL_* Packages -- ACL



Oracle Database 11g provides a mechanism to refine the level of access to the network
access packages UTL_TCP, UTL_SMTP, UTL_MAIL, UTL_HTTP, and UTL_INADDR.


You can use the DBMS_NETWORK_ACL_ADMIN package to facilitate management of the UTL_*
network access packages as in the following steps:

We we ill get  ORA-24247   if  Acl is missing  




12c method of Creating Acl 

BEGIN
DBMS_NETWORK_ACL_ADMIN.APPEND_HOST_ACE(
host => '*',
ace => xs$ace_type(privilege_list => xs$name_list('connect', 'resolve'),
principal_name => 'USERNAME',
principal_type => xs_acl.ptype_db));

END;
/


Pre 12c Method  Of Creating Acl 

1) Create an Access Control List (ACL): 

All ACL definitions are stored in XML DB in the form of  XML documents. The ACL XML files reside in the /sys/acls directory of the XML DB
repository. Following is an example of using the CREATE_ACL procedure to create an XML
file called dba.xml:

begin
 DBMS_NETWORK_ACL_ADMIN.CREATE_ACL (
 ACL => 'dba.xml', -- case sensitive
 DESCRIPTION=> 'Network Access Control for the DBAs',
 PRINCIPAL => 'SCOTT', -- user or role the privilege is granted or denied(upper case)
 IS_GRANT => TRUE, -- privilege is granted or denied
 PRIVILEGE => 'connect', -- or 'resolve' (case sensitive)
 START_DATE => null, -- when the access control entity ACE will be valid
 END_DATE => null); -- ACE expiration date (TIMESTAMP WITH TIMEZONE format)
end;


Regarding the PRIVILEGE parameter, the database user needs the connect privilege to an
external network host computer if he or she is connecting using the UTL_TCP, UTL_HTTP,
UTL_SMTP, and UTL_MAIL utility packages. To resolve a host name that was given as a host IP
address, or the IP address that was given as a host name, with the UTL_INADDR package,
grant the database user the resolve privilege.

You can then query the RESOURCE_VIEW view to find the dba.xml ACL in the /sys/acls
directory:

select ANY_PATH
from RESOURCE_VIEW
where ANY_PATH LIKE '/sys/acls/dba%'

Too may entries in the ACL may lead to significant XML DB performance drop because ACL
are checked for each access to Oracle XML DB repository. As general rule of thumb, ACL
check operations perform best when the number of ACEs in the ACL is at 16 entries or less.



2) Add Access Control Entries: 

Once you create the initial ACL, you can continue to add more
privileges to the XML file. The following example will add the user RAMI to the dba.xml file
and grant him network access:
begin

 DBMS_NETWORK_ACL_ADMIN.ADD_PRIVILEGE (
 ACL => 'dba.xml',
 PRINCIPAL => 'RAMI',
 IS_GRANT => TRUE, 
 PRIVILEGE => 'connect',
 START_DATE => null, -- if the time interval is defined,
 END_DATE => null); -- the ACE will expire after the specified date range
end;
/
COMMIT;


In ACL, the security entries are evaluating in order precedence. If you have two contradicting
entries in the list, the first one in the order will take effect. You can control the order number
of an added entry as follows:

begin
 DBMS_NETWORK_ACL_ADMIN.ADD_PRIVILEGE (
 POSITION => 1, -- on the top
 ACL => 'dba.xml', 
 PRINCIPAL => 'SAMI',
 IS_GRANT => FALSE, 
 PRIVILEGE => 'connect',
 START_DATE => null, END_DATE => null);
end;



3) Assign Hosts: 

The ASSIGN_ACL procedure is used to authorize access to one or more
network hosts as follows:

begin
DBMS_NETWORK_ACL_ADMIN.ASSIGN_ACL (
 ACL => 'dba.xml', HOST => 'dbaexpert.com',
 LOWER_PORT => 80, UPPER_PORT => 443);
end;
COMMIT;

The lower port and the upper port define the lower and the upper boundaries of the allowable
port range. They should be set for connect privileges not resolve privileges.



4) Validate that the ACL permissions worked accordingly. 

Following is an example to test the code in the previous step.

select UTL_HTTP.REQUEST('http://www.ahmedbaraka.com') from dual;

If the sufficient ACL privileges or ACL assignments are not provided, you will receive the ORA24247 error.




Access Control Lists Maintenance

Use DELETE_PRIVILEGE to remove an access control entry from the XML file.

exec DBMS_NETWORK_ACL_ADMIN.DELETE_PRIVILEGE( ACL=>'dba.xml', PRINCIPAL=> 'RAMI');


 


The UNASSIGN_ACL procedure allows you to manually drop ACL assignments. It uses the same parameter list as the ASSIGN_ACL procedure, with any NULL parameters acting as wildcards.

BEGIN
  DBMS_NETWORK_ACL_ADMIN.unassign_acl (
    acl         => 'test_acl_file.xml',
    host        => '192.168.2.3', 
    lower_port  => 80,
    upper_port  => NULL); 

  COMMIT;
END;
/



Use the DROP_ACL procedure to remove the XML file from the /sys/acls directory as follows:

exec DBMS_NETWORK_ACL_ADMIN.DROP_ACL ( ACL=>'dba.xml' ); 






Views : 


Pre-checks to ensure XML DB installed:

-- user XDB exists
select * from ALL_USERS order by username desc;

-- resource_view exists
select * from resource_view;

-- shows XML DB is installed
select * from dba_registry;




To display list of the ACLs created in the database, use the following query:

select HOST, LOWER_PORT, UPPER_PORT, ACL from DBA_NETWORK_ACLS

You can query the DBA_NETWORK_ACL_PRIVILEGES view to query network privileges granted or
denied for the access control list as follows:

select PRINCIPAL, PRIVILEGE, IS_GRANT
from DBA_NETWORK_ACL_PRIVILEGES
where ACL like '%dba.xml'

Logged on users can use the following query to see their access entries in the dba.xml file:

select HOST, LOWER_PORT, UPPER_PORT, STATUS privilege
from USER_NETWORK_ACL_PRIVILEGES
where HOST in
(select * from
 table(DBMS_NETWORK_ACL_UTILITY.DOMAINS('dbaexpert.com')))
and PRIVILEGE = 'connect'
order by DBMS_NETWORK_ACL_UTILITY.DOMAIN_LEVEL(host) desc, LOWER_PORT;




COLUMN host FORMAT A30
COLUMN acl FORMAT A30
SELECT host, lower_port, upper_port, acl
FROM   dba_network_acls;



COLUMN acl FORMAT A30
COLUMN principal FORMAT A30
SELECT acl,
       principal,
       privilege,
       is_grant,
       TO_CHAR(start_date, 'DD-MON-YYYY') AS start_date,
       TO_CHAR(end_date, 'DD-MON-YYYY') AS end_date
FROM   dba_network_acl_privileges;



CONN test1/test1@db11g
COLUMN host FORMAT A30
SELECT host, lower_port, upper_port, privilege, status
FROM   dba_network_acl_privileges;





-- NB: new dba_host_aces view
SELECT host,
       lower_port,
       upper_port,
       ace_order,
       TO_CHAR(start_date, 'DD-MON-YYYY') AS start_date,
       TO_CHAR(end_date, 'DD-MON-YYYY') AS end_date,
       grant_type,
       inverted_principal,
       principal,
       principal_type,
       privilege
FROM   dba_host_aces
ORDER BY host, ace_order;




-- NB: new dba_host_acls view
SELECT HOST,
       LOWER_PORT,
       UPPER_PORT,
       ACL,
       ACLID,
       ACL_OWNER
FROM   dba_host_acls
ORDER BY host;



-- NB: dba_network_acls deprecated view in 12c
SELECT host, lower_port, upper_port, acl
FROM   dba_network_acls
ORDER BY host;

-- NB: dba_network_acl_privileges deprecated view in 12c
SELECT acl,
       principal,
       privilege,
       is_grant,
       TO_CHAR(start_date, 'DD-MON-YYYY') AS start_date,
       TO_CHAR(end_date, 'DD-MON-YYYY') AS end_date
FROM   dba_network_acl_privileges
ORDER BY acl, principal, privilege;






-- -----------------------------------------------------------------------------------
-- File Name    : https://oracle-base.com/dba/11g/network_acls_ddl.sql
-- Author       : Tim Hall
-- Description  : Displays DDL for all network ACLs.
-- Requirements : Access to the DBA views.
-- Call Syntax  : @network_acls_ddl
-- Last Modified: 28-JUL-2017
-- -----------------------------------------------------------------------------------

SET SERVEROUTPUT ON FORMAT WRAPPED LINESIZE 300
DECLARE
  l_last_acl       dba_network_acls.acl%TYPE                 := '~';
  l_last_principal dba_network_acl_privileges.principal%TYPE := '~';
  l_last_privilege dba_network_acl_privileges.privilege%TYPE := '~';
  l_last_host      dba_network_acls.host%TYPE                := '~';

  FUNCTION get_timestamp (p_timestamp IN TIMESTAMP WITH TIME ZONE)
    RETURN VARCHAR2
  AS
    l_return  VARCHAR2(32767);
  BEGIN
    IF p_timestamp IS NULL THEN
      RETURN 'NULL';
    END IF;
    RETURN 'TO_TIMESTAMP_TZ(''' || TO_CHAR(p_timestamp, 'DD-MON-YYYY HH24:MI:SS.FF TZH:TZM') || ''',''DD-MON-YYYY HH24:MI:SS.FF TZH:TZM'')';
  END;
BEGIN
  FOR i IN (SELECT a.acl,
                   a.host,
                   a.lower_port,
                   a.upper_port,
                   b.principal,
                   b.privilege,
                   b.is_grant,
                   b.start_date,
                   b.end_date
            FROM   dba_network_acls a
                   JOIN dba_network_acl_privileges b ON a.acl = b.acl
            ORDER BY a.acl, a.host, a.lower_port, a.upper_port)
  LOOP
    IF l_last_acl <> i.acl THEN
      -- First time we've seen this ACL, so create a new one.
      l_last_host := '~';

      DBMS_OUTPUT.put_line('-- -------------------------------------------------');
      DBMS_OUTPUT.put_line('-- ' || i.acl);
      DBMS_OUTPUT.put_line('-- -------------------------------------------------');
      DBMS_OUTPUT.put_line('BEGIN');
      DBMS_OUTPUT.put_line('  DBMS_NETWORK_ACL_ADMIN.drop_acl (');
      DBMS_OUTPUT.put_line('    acl          => ''' || i.acl || ''');');
      DBMS_OUTPUT.put_line('  COMMIT;');
      DBMS_OUTPUT.put_line('END;');
      DBMS_OUTPUT.put_line('/');
      DBMS_OUTPUT.put_line(' ');
      DBMS_OUTPUT.put_line('BEGIN');
      DBMS_OUTPUT.put_line('  DBMS_NETWORK_ACL_ADMIN.create_acl (');
      DBMS_OUTPUT.put_line('    acl          => ''' || i.acl || ''',');
      DBMS_OUTPUT.put_line('    description  => ''' || i.acl || ''',');
      DBMS_OUTPUT.put_line('    principal    => ''' || i.principal || ''',');
      DBMS_OUTPUT.put_line('    is_grant     => ' || i.is_grant || ',');
      DBMS_OUTPUT.put_line('    privilege    => ''' || i.privilege || ''',');
      DBMS_OUTPUT.put_line('    start_date   => ' || get_timestamp(i.start_date) || ',');
      DBMS_OUTPUT.put_line('    end_date     => ' || get_timestamp(i.end_date) || ');');
      DBMS_OUTPUT.put_line('  COMMIT;');
      DBMS_OUTPUT.put_line('END;');
      DBMS_OUTPUT.put_line('/');
      DBMS_OUTPUT.put_line(' ');
      l_last_acl := i.acl;
      l_last_principal := i.principal;
      l_last_privilege := i.privilege;
    END IF;

    IF l_last_principal <> i.principal 
    OR (l_last_principal = i.principal AND l_last_privilege <> i.privilege) THEN
      -- Add another principal to an existing ACL.
      DBMS_OUTPUT.put_line('BEGIN');
      DBMS_OUTPUT.put_line('  DBMS_NETWORK_ACL_ADMIN.add_privilege (');
      DBMS_OUTPUT.put_line('    acl       => ''' || i.acl || ''',');
      DBMS_OUTPUT.put_line('    principal => ''' || i.principal || ''',');
      DBMS_OUTPUT.put_line('    is_grant  => ' || i.is_grant || ',');
      DBMS_OUTPUT.put_line('    privilege => ''' || i.privilege || ''',');
      DBMS_OUTPUT.put_line('    start_date   => ' || get_timestamp(i.start_date) || ',');
      DBMS_OUTPUT.put_line('    end_date     => ' || get_timestamp(i.end_date) || ');');
      DBMS_OUTPUT.put_line('  COMMIT;');
      DBMS_OUTPUT.put_line('END;');
      DBMS_OUTPUT.put_line('/');
      DBMS_OUTPUT.put_line(' ');
      l_last_principal := i.principal;
      l_last_privilege := i.privilege;
    END IF;

    IF l_last_host <> i.host||':'||i.lower_port||':'||i.upper_port THEN
      DBMS_OUTPUT.put_line('BEGIN');
      DBMS_OUTPUT.put_line('  DBMS_NETWORK_ACL_ADMIN.assign_acl (');
      DBMS_OUTPUT.put_line('    acl         => ''' || i.acl || ''',');
      DBMS_OUTPUT.put_line('    host        => ''' || i.host || ''',');
      DBMS_OUTPUT.put_line('    lower_port  => ' || NVL(TO_CHAR(i.lower_port),'NULL') || ',');
      DBMS_OUTPUT.put_line('    upper_port  => ' || NVL(TO_CHAR(i.upper_port),'NULL') || ');');
      DBMS_OUTPUT.put_line('  COMMIT;');
      DBMS_OUTPUT.put_line('END;');
      DBMS_OUTPUT.put_line('/');
      DBMS_OUTPUT.put_line(' ');
      l_last_host := i.host||':'||i.lower_port||':'||i.upper_port;
    END IF;
  END LOOP;
END;
/




References :

https://docs.oracle.com/database/121/ARPLS/d_networkacl_adm.htm#ARPLS74575
How to setup ACL on 12c and later (Doc ID 2357072.1)
Using ACEs to Set Up ACLs on 12c and above (To Remedy ORA-24247 Errors) (Doc ID 2267848.1)

Saturday, June 12, 2021

Oracle Datapump : logtime=all metrics=y , KEEP_MASTER , and increase parallelism for active export/import

 
We all been using  datapump however  there below are few less known features that can make  our life easy .


####################### 
METRICS=YES LOGTIME=ALL
####################### 


While migration its critical to find steps taken by each process to estimate migration time . 
The LOGTIME parameter adds a timestamp down to the millisecond for every line in your logfile. This can be helpful in diagnosing issues with I/O. It adds e timestamp in front of each line and a message about each task completed by the worker:

expdp hr DIRECTORY=dpump_dir1 DUMPFILE=expdat.dmp SCHEMAS=hr METRICS=YES LOGTIME=ALL

15-NOV-17 11:48:32.305: W-5 Processing object type DATABASE_EXPORT/SCHEMA/PROCACT_SCHEMA
15-NOV-17 11:48:34.439: W-13      Completed 28 PROCACT_SCHEMA objects in 1 seconds
15-NOV-17 11:48:34.439: W-13      Completed by worker 1 28 PROCACT_SCHEMA objects in 1 seconds
15-NOV-17 11:48:34.440: W-13 Processing object type DATABASE_EXPORT/SCHEMA/TABLE/TABLE
15-NOV-17 11:48:35.472: W-17 Startup took 70 seconds
15-NOV-17 11:48:35.596: W-18 Startup took 70 seconds
15-NOV-17 11:48:35.719: W-20 Startup took 70 seconds
15-NOV-17 11:48:35.841: W-19 Startup took 70 seconds
15-NOV-17 11:48:43.520: W-5      Completed 1714 TABLE objects in 7 seconds
15-NOV-17 11:48:43.520: W-5      Completed by worker 1 39 TABLE objects in 2 seconds
15-NOV-17 11:48:43.520: W-5      Completed by worker 2 113 TABLE objects in 7 seconds
15-NOV-17 11:48:43.520: W-5      Completed by worker 3 85 TABLE objects in 6 seconds
15-NOV-17 11:48:43.520: W-5      Completed by worker 4 111 TABLE objects in 6 seconds
15-NOV-17 11:48:43.520: W-5      Completed by worker 5 25 TABLE objects in 1 seconds
15-NOV-17 11:48:43.520: W-5      Completed by worker 6 113 TABLE objects in 7 seconds
15-NOV-17 11:48:43.520: W-5      Completed by worker 7 113 TABLE objects in 7 seconds
15-NOV-17 11:48:43.520: W-5      Completed by worker 8 111 TABLE objects in 6 seconds
15-NOV-17 11:48:44.266: W-1 Processing object type DATABASE_EXPORT/SCHEMA/TABLE/TABLE_DATA


####################### 
KEEP_MASTER
####################### 

KEEP_MASTER prevents the Data Pump Master table from getting deleted after an Export/Import job completion.
There are cases where we want to keep  master  table  for analysis to be done later . 

SQL> select owner,segment_name,segment_type,tablespace_name,(bytes/1024/1024)MB from dba_segments where segment_name='SYS_EXPORT_TABLE_01';
SQL> select OBJECT_TYPE,OBJECT_NAME,OBJECT_SCHEMA,ORIGINAL_OBJECT_SCHEMA,ORIGINAL_OBJECT_NAME,OBJECT_TABLESPACE,SIZE_ESTIMATE,OBJECT_ROW from SYS_SQL_FILE_FULL_01 where ORIGINAL_OBJECT_SCHEMA is not null;
OBJECT_TYPE –> Show the object type.
OBJECT_SCHEMA –> Contains the schema name to which it has to be imported.
ORIGINAL_OBJECT_SCHEMA –> column has the original object’s schema name.
OBJECT_TABLESPACE –> Shows the tablespace where the object will be imported.
SIZE_ESTIMATE –> Estimated size of the table in bytes


####################### 
Increasing parallelism for ongoing Job 
####################### 

We  have been in situations  where we want to increase/Decrease parallelism of  datapump Jobs . We can  achieve this  using below . 

To increase level of parallelism on running datapump job do the following :
- identify job name (select owner_name, job_name from dba_datapump_jobs);
- connect to it : impdp attach=schema.job_name
- change parallelism : parallel=n (n - number of parallel workers)
- wait a bit for change to apply
- confirm change : status


####################### 
Patch 21539301 : Index creation  serial in impdp 
####################### 

Previously, Data Pump would create indexes one at a time, specifying the PARALLEL keyword for the CREATE INDEX statement to invoke parallel query for index creation. We used to recommend a workaround to create indexes in parallel, which involved a three-step process of importing without indexes, then creating a SQLFILE of the CREATE INDEX statements, and breaking that file into multiple windows.

Through extensive performance testing we found that it is faster to create multiple indexes in parallel (using a parallel degree of 1) instead of creating a single index using parallel query processes. This is enabled by the patch for bug 18793090, superseded by patch 21539301 

Need to check if similar patch is available in 19c 


####################### 
Other Useful Options : 
####################### 

Export Filtering Parameters

/* EXCLUDE & INCLUDE */
EXCLUDE=INDEX
EXCLUDE=PROCEDURE
EXCLUDE=TABLE:"LIKE 'EMP%'"
EXCLUDE=SCHEMA:"='HR'"
INCLUDE=TABLE:"IN ('EMP', 'DEPT')"
# When used in command line, use slashes to escape single and double
quotation:
expdp .. schemas=SCOTT EXCLUDE=TABLE:\"=\'EMP\'\"
/* QUERY */
QUERY=OE.ORDERS: "WHERE order_id > 100000 ORDER BY order_date desc" 



Export Remapping Parameters

/* REMAP_DATA (11g) */
-- the remapping function shouldn’t have commit or rollback
REMAP_DATA=[schema.]tablename.column_name:[schema.]pkg.function
expdp hr/passwd DIRECTORY=dp_dir DUMPFILE=remap.dmp
TABLES=hr.employees REMAP_DATA=hr.employees.last_name:hr.remap_pckg.modifychar



Export Encryption Parameters

(11g): To secure the exported dump file, the following new parameters are presented in Oracle 11g
Data pump: ENCRYPTION, ENCRYPTION_PASSWORD and ENCRYPTION_ALGORITHM. To enable
encryption, you must specify either the ENCRYPTION or ENCRYPTION_PASSWORD parameter, or both.

ENCRYPTION = {all | data_only | encrypted_columns_only | metadata_only | none}
ENCRYPTION_ALGORITHM = { AES128 | AES192 | AES256 }
ENCRYPTION_MODE = { DUAL | PASSWORD | TRANSPARENT }

expdp hr DUMPFILE=dp_dir.hr_enc.dmp JOB_NAME=enc ENCRYPTION=data_only
ENCRYPTION_PASSWORD=mypassword

expdp hr DIRECTORY=dp_dir DUMPFILE=hr_enc.dmp
 ENCRYPTION=all ENCRYPTION_PASSWORD=mypassword
 ENCRYPTION_ALGORITHM=AES256 ENCRYPTION_MODE=dual 




Export Estimating Parameters

ESTIMATE={BLOCKS | STATISTICS}
ESTIMATE_ONLY=y
expdp system/pswd estimate_only=y





Export Network Link Parameter

You can initiate an export job from your server and have Data Pump export data from a remote
database to dump files located on the instance from which you initiate the Data Pump export job.

READ ONLY DB can still be loaded from.

expdp hr/hr DIRECTORY=dpump_dir1 NETWORK_LINK=source_database_link DUMPFILE=network_export.dmp
-- more detailed steps:
-- scenario: I will take a data pump export from database ORCL
-- and dumpfile will be written to database TIGER

sqlplus sa/a@tiger
create database link orcl.net using 'ORCL';
OR
Create database link orcl.net connect to sa identified by a
using '(DESCRIPTION=(ADDRESS = (PROTOCOL = TCP)(HOST =10.4.x.x) (PORT=1521))
(connect_data=(service_name=orcl)))';

select * from dual@orcl.net;

$expdp arju/a@tiger directory=d schemas=arju dumpfile=arju_dump_from_orcl.dmp  network_link=orcl.net




Expdp:
FLASHBACK_TIME=SYSTIMESTAMP , PARALLEL=4 , FILESIZE=5G ,  EXCLUDE=STATISTICS , DUMPFILE=TEST_%U.dmp  , COMPRESSION=ALL , COMPRESSION_ALGORITHM=MEDIUM . EXCLUDE:CLUSTER, DB_LINK

Impdp:
transform=lob_storage:securefile  , REMAP_TABLESPACE=%:DATA




####################### 
Checking Performance issue 
####################### 

1) 
SRDC - Diagnostic Collection for DataPump Export Performance Issues (Doc ID 1935743.1)


2) 
$ TFA_HOME/bin/tfactl diagcollect -srdc dbexpdpperf


3) Run with  metric=y logtime=all trace=TRACE=480300    ( and collect below logs ) 
>> db instance alert log covering the recent run
>> expdp log from export directory
>> dw* and dm* traces




####################### 
References : 
####################### 

https://docs.oracle.com/database/121/SUTIL/GUID-2E7A8123-F8D8-4C62-AE4D-8A3F1753E6D3.htm#SUTIL3851

https://docs.oracle.com/database/121/SUTIL/GUID-56B6B4EA-5A2B-45B8-9B41-8966378C2C3D.htm#SUTIL4281