Abdul hafeez kalsekar -- OCS/OCE/OCP: June 2021

Saturday, June 26, 2021

Tuning Oracle RAC Interconnect

In RAC, interconnect performance is the most important factor. This post walks through how to monitor and tune the cluster interconnect.

Critical Oracle Background Processes using private interconnect

• LMSn (Global Cache Service Process)
Handles cache fusion(GCS) (Block transfers, messages)
Maintains records of each data file and block open in a cache in GRD (Global Resource Directory)
Controls flow of messages of remote instance(s)
Tracks global data block access
• LCK0 (Instance Enqueue Background Process)
Handles requests related to non-fusion resources such as library and row cache requests
• LMD0 (Global Enqueue Service Daemon)
Manages global enqueues
Detection of deadlocks
• LMON (Global Enqueue Service Monitor)
Maintains instance membership
Reconfiguration of GCS/GES during recovery from instance crash or startup
• LMHB (Global Cache/Enqueue Service Heartbeat Monitor)
Monitors LMON, LMD and LMSn Processes

Wait events in RAC Database :

1. gc cr/current grant 2-way (message wait)
• Block does not exist in any cache, so LMS grants the foreground (FG) process permission to read it from disk
2. gc cr/current block 2-way/3-way (block wait)
• Requested for read and write operations
• Every execution triggers this since SCN advances
3. gc cr/current block congested (block congestion waits)
• LMS didn’t process within 1 ms due to CPU shortage or scheduling delays
4. gc cr/current block busy (block concurrency/contention waits)
• Indicates high concurrency
• LMS needs to perform additional work to prepare block in requested mode
5. gcs log flush sync
• Before sending a reconstructed CR/CURR block, LMS requests LGWR to flush redo vectors
6. gc cr failure/gc cr retry
• Inefficiencies with the interconnect, an invalid block request, or a checksum error
7. gc cr/current block corrupt/lost
• Dropped packets due to buffer overflow
• Misconfigured interconnect

Gc [current/cr] [2/3]-way – In a two-node cluster a 3-way wait cannot occur, because only two RAC instances exist and at most two hops are possible. With three or more RAC instances, both 2-way and 3-way waits are possible. The block is received immediately after 2 or 3 network hops. The event itself is not subject to tuning, other than increasing the bandwidth and decreasing the latency of the private interconnect. Investigate if the average wait exceeds 1 ms or approaches disk I/O latency, and look at reducing latency.

Gc [current/cr] grant 2-way – The grant is received immediately. A grant is always local or 2-way. A grant occurs when a request is made for a block image (current or CR) and no instance has the image in its local buffer cache, so the requesting instance must do an I/O from the data file to get the block. The grant is simply permission from LMS for the process to read the block from the data file. A grant can be either CR or current: gc current grant means read the block from the data files, while gc cr grant means read the block from disk and build a read-consistent image once it is read.

Gc [current/cr][block/grant] congested – The block or grant was eventually received, but with a delay caused by heavy CPU consumption, memory shortage, paging, swapping, or LMS overload from too much work in its queues. It indicates that LMS could not dequeue messages fast enough, so it is worth investigating and offers room for improvement.

Gc [current/cr] block busy – The block was received but was not sent immediately because of high concurrency or contention. The block is busy, for example because somebody issued a block recover command from RMAN. There are a variety of reasons for a block being busy; the delay comes from Oracle-level causes rather than memory, LMS, or system causes. It is also worth investigating.

Gc current grant busy – The grant is received, but with a delay due to many shared block images or load. For example, you are extending the high water mark and formatting block images or blocks with block headers.

Gc [current/cr][failure/retry] – The block was not received because of a failure, usually a checksum error in the private interconnect protocol caused by network errors or hardware problems. This is worth investigating. Failure means the block image cannot be received, while retry means the problem recovers and the block image is ultimately received after a retry.

Gc buffer busy – The time between block accesses is less than the buffer pin time. Buffers can be pinned in exclusive or shared mode, depending on whether they are to be modified or only read. If many processes contend for the same block, this event shows up in greater magnitude. Buffer busy waits are global cache events: a request is made from one instance, the block is available in another instance, and the block is busy due to contention.

How to check interconnect details :

$ oifcfg getif

lan902 172.17.1.0 global

lan901 10.28.188.0 global public

Ways to check interconnect performance :

1) ifconfig -a
Using ifconfig we can find if any packets are dropped

$ ifconfig -a
eth0      Link encap:Ethernet HWaddr 00:0B:DB:4B:A2:04
          inet addr:130.35.25.110 Bcast:130.35.27.255 Mask:255.255.252.0
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:21721236 errors:135 dropped:0 overruns:0 frame:95
          TX packets:273120 errors:0 dropped:27 overruns:0 carrier:0
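Reading these counters by eye gets tedious on nodes with many interfaces. The following is a minimal sketch that pulls the errors and dropped counters out of the output. It assumes the classic net-tools layout shown above (indented "RX packets:... errors:... dropped:..." lines); newer ifconfig versions print a different layout and would need a different parser. The function name summarise_drops is my own, not part of any Oracle or OS tool.

```shell
#!/bin/sh
# Sketch: list RX/TX error and drop counters per interface from classic
# net-tools "ifconfig -a" output. summarise_drops is a hypothetical helper.
summarise_drops() {
    awk '
        # A non-empty line that starts in column 1 opens a new interface stanza.
        NF > 0 && substr($0, 1, 1) != " " && substr($0, 1, 1) != "\t" { iface = $1 }
        # Counter lines look like: RX packets:1 errors:2 dropped:3 ...
        ($1 == "RX" || $1 == "TX") && $2 ~ /^packets:/ {
            err = "?"; drp = "?"
            for (i = 3; i <= NF; i++) {
                split($i, kv, ":")
                if (kv[1] == "errors")  err = kv[2]
                if (kv[1] == "dropped") drp = kv[2]
            }
            printf "%s %s errors=%s dropped=%s\n", iface, $1, err, drp
        }
    '
}

# Example with the sample output from this post; on a live node use:
#   ifconfig -a | summarise_drops
summarise_drops <<'EOF'
eth0      Link encap:Ethernet  HWaddr 00:0B:DB:4B:A2:04
          RX packets:21721236 errors:135 dropped:0 overruns:0 frame:95
          TX packets:273120 errors:0 dropped:27 overruns:0 carrier:0
EOF
```

Run it twice a few minutes apart: a counter that keeps growing on the private interface matters more than a non-zero value left over from boot time.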

2) Global AWR report

The global AWR report shows cluster-wide performance statistics.

3) From V$ views

We use racdiag.sql. I have also posted below various scripts that I use.

Script to Collect RAC Diagnostic Information (racdiag.sql) (Doc ID 135714.1)

1. Estimate Global Cache traffic flowing in/out a given node from AWR

• Messages are typically 200 bytes or less in size, while CR/current blocks are 8 KB (the same as DB_BLOCK_SIZE)

• Goal is to keep network packets/sec under 70% of the estimated throughput of the interconnect device

• DBA_HIST_SYSSTAT provides data related to all GC wait events, block/message transfers

• DBA_HIST_IC_DEVICE_STATS provides stats like packets received/transmitted/dropped for each interface

• DBA_HIST_IC_CLIENT_STATS provides usage of interconnect by area(IPQ, DLM and Cache)
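The sizes above can be turned into a rough traffic estimate. This sketch applies the sprepins.sql formula quoted in the script later in this post: (blocks received + served) * db_block_size plus (messages sent + received) * 200 bytes, divided by elapsed time. The function names and the sample numbers are my own assumptions. The second function treats the 70% goal as bandwidth utilisation of the link, which is a simplification of the packets/sec goal.

```shell
#!/bin/sh
# estimate_traffic_kb (hypothetical helper): estimated interconnect KB/s.
#   $1 = GC blocks received + served during the interval
#   $2 = GCS/GES messages sent + received during the interval
#   $3 = db_block_size in bytes
#   $4 = elapsed seconds
estimate_traffic_kb() {
    awk -v b="$1" -v m="$2" -v bs="$3" -v s="$4" 'BEGIN {
        if (s <= 0) { print "n/a"; exit }
        printf "%.1f\n", (b * bs + m * 200) / 1024 / s
    }'
}

# link_utilisation_pct (hypothetical helper): share of link bandwidth used.
#   $1 = traffic in KB/s, $2 = link speed in Mbit/s (1000 for 1 GbE)
link_utilisation_pct() {
    awk -v kb="$1" -v mbit="$2" 'BEGIN {
        printf "%.1f\n", kb * 1024 * 8 / (mbit * 1000000) * 100
    }'
}

# Made-up sample: 1000 blocks and 5000 messages in 10 seconds, 8 KB blocks.
kb=$(estimate_traffic_kb 1000 5000 8192 10)
echo "Estimated traffic: $kb KB/s"
echo "Utilisation of a 1 GbE link: $(link_utilisation_pct "$kb" 1000) %"
```

The deltas for the block and message counters can come from two AWR snapshots or from two samples of gv$sysstat and gv$dlm_misc.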

select * from v$instance_cache_transfer

where class = 'data block'

and instance = 1;

SET lines 200 pages 100

SELECT inst_id,cr_requests,current_requests

FROM gv$cr_block_server;

SET lines 200 pages 100

SELECT * FROM gv$cluster_interconnects ORDER BY inst_id,name;

SET lines 200 pages 100

SELECT * FROM gv$configured_interconnects ORDER BY inst_id,name;

SET lines 200 pages 100

SELECT inst_id,class,cr_block,current_block

FROM gv$instance_cache_transfer WHERE instance IN (1,2)

ORDER BY inst_id,class;

##### Interesting one coming from racdiag.sql script:

SET lines 200

SELECT b1.inst_id, b2.VALUE "GCS CR BLOCKS RECEIVED",

b1.VALUE "GCS CR BLOCK RECEIVE TIME",

((b1.VALUE / b2.VALUE) * 10) "AVG CR BLOCK RECEIVE TIME (ms)"

FROM gv$sysstat b1, gv$sysstat b2

WHERE b1.name = 'global cache cr block receive time'

AND b2.name = 'global cache cr blocks received'

AND b1.inst_id = b2.inst_id OR b1.name = 'gc cr block receive time'

AND b2.name = 'gc cr blocks received'

AND b1.inst_id = b2.inst_id;
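The arithmetic in this query is simple enough to repeat outside the database, for example on values spooled to a file. A minimal sketch follows, assuming that the receive time statistic is reported in centiseconds (which is why the query multiplies by 10) and using the 1 ms guideline mentioned earlier in this post. Both function names are hypothetical.

```shell
#!/bin/sh
# avg_block_receive_ms (hypothetical helper)
#   $1 = "gc cr block receive time" (assumed to be in centiseconds)
#   $2 = "gc cr blocks received"
avg_block_receive_ms() {
    awk -v t="$1" -v n="$2" 'BEGIN {
        if (n <= 0) { print "n/a"; exit }
        printf "%.2f\n", t / n * 10
    }'
}

# needs_attention (hypothetical helper): exit status 0 when the average
# is above the 1 ms guideline, 1 otherwise (including "n/a").
needs_attention() {
    awk -v ms="$1" 'BEGIN { exit !(ms + 0 > 1) }'
}

# Made-up sample: 3000 centiseconds spent receiving 10000 CR blocks.
ms=$(avg_block_receive_ms 3000 10000)
if needs_attention "$ms"; then
    echo "Average CR block receive time $ms ms: investigate the interconnect"
else
    echo "Average CR block receive time $ms ms: within the guideline"
fi
```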

####Latency of interconnect messages:

SET lines 200 pages 100

SELECT * FROM dba_hist_interconnect_pings WHERE snap_id=2846;

#### Global Cache Service processes (LMS) statistics:

SET lines 200 pages 100

SELECT * FROM gv$current_block_server;

SELECT

DECODE(name,'gc cr blocks received','global cache blocks received','gc cr blocks served','global cache blocks served','gc current blocks received','global cache blocks received','gc current blocks served','global cache blocks served',name) AS name,

SUM(VALUE) AS VALUE,

TO_CHAR(SYSDATE,'yyyy-mm-dd hh24:mi:ss') AS date_taken

FROM gv$sysstat

WHERE inst_id=1

AND name IN ('gc cr blocks received','gc cr blocks served','gc current blocks received','gc current blocks served','gcs messages sent','ges messages sent')

GROUP BY DECODE(name,'gc cr blocks received','global cache blocks received','gc cr blocks served','global cache blocks served','gc current blocks received','global cache blocks received','gc current blocks served','global cache blocks served',name)

UNION

SELECT name,VALUE,TO_CHAR(SYSDATE,'yyyy-mm-dd hh24:mi:ss') AS date_taken

FROM gv$dlm_misc

WHERE name IN ('gcs msgs received','ges msgs received')

AND inst_id=1;

Ways to Tune interconnect :

A) Network Kernel Buffer

UDP buffers

You want the fastest possible network to be used for the interconnect. To maximize your speed and efficiency on the interconnect, you should ensure that the User Datagram Protocol (UDP) buffers are set to the correct values. On Linux, you can check this via the following command:

sysctl net.core.rmem_max net.core.wmem_max net.core.rmem_default net.core.wmem_default

net.core.rmem_max = 4194304

net.core.wmem_max = 1048576

net.core.rmem_default = 262144

net.core.wmem_default = 262144

Alternatively, you can read the associated values directly from the respective files in the directory /proc/sys/net/core. These values can be increased via the following SYSCTL commands:

sysctl -w net.core.rmem_max=4194304

sysctl -w net.core.wmem_max=1048576

sysctl -w net.core.rmem_default=262144

sysctl -w net.core.wmem_default=262144

The numbers in this example are the recommended values for Oracle RAC on Linux and are more than sufficient for the majority of configurations. Nevertheless, let’s talk about some background of the UDP buffers. The values determined by rmem_max and wmem_max are on a “per-socket” basis. So if you set rmem_max to 4MB, and you have 400 processes running, each with a socket open for communications in the interconnect, then each of these 400 processes could potentially use 4MB, meaning that the total memory usage could be 1.6GB just for this UDP buffer space. However, this is only “potential” usage. So if rmem_default is set to 1MB and rmem_max is set to 4MB, you know for sure that at least 400MB will be allocated (1MB per socket). Anything more than that will be allocated only as needed, up to the max value. So the total memory usage depends on the rmem_default, rmem_max, the number of open sockets, and the variable piece of how much buffer space each process is actually using. This is an unknown—but it could depend on the network latency or other characteristics of how well the network is performing and how much network load there is altogether. To get the total number of Oracle-related open UDP sockets, you can execute this command:

netstat -anp --udp | grep ora | wc -l
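The memory arithmetic above is easy to reproduce. This is a small sketch with a helper name of my own (udp_buffer_mb); it multiplies a socket count by a buffer size and prints whole megabytes, so it reproduces the 400 MB floor and 1.6 GB ceiling from the example.

```shell
#!/bin/sh
# udp_buffer_mb (hypothetical helper): buffer memory in whole MB.
#   $1 = number of open UDP sockets
#   $2 = buffer size per socket in bytes (rmem_default or rmem_max)
udp_buffer_mb() {
    echo $(( $1 * $2 / 1024 / 1024 ))
}

# The example from the text: 400 sockets, 1 MB default, 4 MB max.
echo "Allocated at least: $(udp_buffer_mb 400 1048576) MB"
echo "Potential maximum:  $(udp_buffer_mb 400 4194304) MB"

# On a live Linux node the inputs could come from the commands shown above:
#   udp_buffer_mb "$(netstat -anp --udp | grep -c ora)" "$(sysctl -n net.core.rmem_max)"
```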

B) Remove reverse path filtering interference by setting the rp_filter kernel parameter to 0 (disabled) or 2 (loose). This is IP-level functionality, so when HAIP is used over bonding, the parameter must be set on the bonding OS device and not on the individual interface devices. On kernels 2.6.31 and later, this prevents a situation where packets get blocked or discarded. More specific information is in the MOS note rp_filter for multiple private interconnects and Linux Kernel 2.6.32+ (Doc ID 1286796.1).
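To see what each interface is currently using, the values can be read from /proc. A minimal sketch, assuming a Linux node; the function names are my own, and the value meanings (0 disabled, 1 strict, 2 loose) follow the kernel's ip-sysctl documentation.

```shell
#!/bin/sh
# rp_filter_mode (hypothetical helper): translate the numeric setting.
rp_filter_mode() {
    case "$1" in
        0) echo "disabled" ;;
        1) echo "strict" ;;
        2) echo "loose" ;;
        *) echo "unknown" ;;
    esac
}

# report_rp_filter (hypothetical helper): print one line per interface.
# Prints nothing on systems without /proc/sys/net/ipv4/conf.
report_rp_filter() {
    for f in /proc/sys/net/ipv4/conf/*/rp_filter; do
        [ -r "$f" ] || continue
        iface=${f%/rp_filter}      # strip the file name
        iface=${iface##*/}         # keep only the interface directory name
        printf '%s %s\n' "$iface" "$(rp_filter_mode "$(cat "$f")")"
    done
}

# Usage on a cluster node:
#   report_rp_filter
# Private interconnect devices reported as "strict" are the ones to review.
```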

C) Jumbo Frames allow an Ethernet frame to exceed the IEEE 802 specified Maximum Transmission Unit (MTU) of 1500 bytes, up to a maximum of 9000 bytes. Even though Jumbo Frames are widely available in most NICs and data-center class managed switches, they are not an IEEE approved standard. While the benefits are clear, Jumbo Frames interoperability is not guaranteed with some existing networking devices. Jumbo Frames can be implemented for private cluster interconnects, but they require very careful configuration and testing to realize the benefits. In many cases, failures or inconsistencies occur due to incorrect setup or bugs in the driver or switch software, which can result in sub-optimal performance and network errors.

D) Parameters Influencing RAC behavior

Disclaimer: Hidden parameters (those that begin with an underscore) such as _LM_DLMD_PRCS should be modified with caution and only after first consulting with Oracle Support. It should be noted that Oracle could change or remove such parameters without notice in subsequent releases.

In Oracle RAC, DB_BLOCK_SIZE multiplied by DB_FILE_MULTIBLOCK_READ_COUNT determines the maximum size of a message for the Global Cache, and PARALLEL_EXECUTION_MESSAGE_SIZE determines the maximum size of a message used in Parallel Query. These message sizes can range from 2K to 64K or more, and hence get fragmented, more so with a lower/default MTU.
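To get a feel for how much fragmentation the MTU causes, here is a rough sketch that counts IP fragments for one UDP message. It assumes IPv4 with a 20-byte header and no options plus an 8-byte UDP header, and it ignores any header Oracle's own IPC layer adds, so treat the result as an approximation. The helper name ip_fragments is my own.

```shell
#!/bin/sh
# ip_fragments (hypothetical helper): number of IPv4 fragments needed.
#   $1 = UDP payload size in bytes (for example one 8192-byte block)
#   $2 = MTU in bytes (1500 default, 9000 with Jumbo Frames)
ip_fragments() {
    frag_total=$(( $1 + 8 ))                   # payload plus UDP header
    frag_room=$(( ($2 - 20) / 8 * 8 ))         # data per fragment, multiple of 8
    echo $(( (frag_total + frag_room - 1) / frag_room ))
}

echo "8 KB block, MTU 1500: $(ip_fragments 8192 1500) fragments"
echo "8 KB block, MTU 9000: $(ip_fragments 8192 9000) fragment"
```

Losing any one fragment means the whole message has to be reassembled again from a resend, which is why a lower MTU makes lost-block symptoms more likely under load.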

The LMS background process (which represents global cache services, or GCS) of the requested instance may not be able to keep up with the requests from other instances. This may cause the LMD background process (which represents global enqueue services, or GES) to queue new requests, increasing its backlog and delaying responses. The LMS process accepts requests on a First-In-First-Out (FIFO) basis. Oracle by default creates one LMS process for every two CPUs on the system. While Oracle has not provided any direct method to tune the number of LMS processes, one method available is to increase them by modifying a hidden parameter _LM_DLMD_PROC.

1. Always run critical background processes in RT priority
• _lm_lms_priority_dynamic = false
• _high_priority_processes = 'LMS*|VKTM|LMD*|LGWR'

2. Disable Undo DRM during instance start after crash
• _gc_undo_affinity = false

3. Disable DRM to avoid unacceptable and unpredictable freezes
• _gc_policy_time = 0

4. Minimize reconfiguration time for bigger SGA
• _lm_tickets
• gcs_server_processes

5. Monitoring related critical background parameters
• Heartbeat ping to local processes: _lm_rcvr_hang_check_frequency, _lm_rcvr_hang_allow_time, _lm_rcvr_hang_kill
• Heartbeat ping to peer process on remote instances: _lm_idle_connection_check, _lm_idle_connection_check_interval, _lm_idle_connection_kill

6. Fairness and light work rule (_fairness_threshold)

E) Reducing Full table scans

Evaluate and reduce the number of full table scans on the database. Queries that retrieve data by performing full table scans could cause large cache movement between instances and thus cause significant load on the cluster interconnects.

F) Using multiple cluster interconnects

Multiple cluster interconnects can be configured and allocated to Oracle using the parameter CLUSTER_INTERCONNECTS. The parameter overrides the default interconnect setting at the operating system level with a preferred traffic network and in certain platforms disables the high availability feature of RAC.

G) Don't use too big an SGA:

This is not a way to tune the interconnect itself, but a way to avoid block transfers from one node to another. I have seen experts recommend against a very large SGA/buffer pool, as they prefer physical reads over cache fusion.

Tuning Cluster Timeout parameters :

Most node evictions happen due to improper configuration of cluster timeouts. Please work with Oracle Support for the ideal values; the values below are shared from my experience. I have also seen timeout values change after a migration.

The CSS misscount parameter represents the maximum time, in seconds, that a heartbeat can be missed before entering into a cluster reconfiguration to evict the node.

Disktimeout: Disk Latencies in seconds from node-to-Votedisk. Default Value is 200. (Disk IO)

crsctl get css diagwait --- should give 13

crsctl get css misscount --- should give 30

crsctl get css disktimeout --- should give 200

crsctl get css reboottime --- should give 3

crsctl set css diagwait 13

crsctl set css misscount 30

crsctl set css disktimeout 200

crsctl set css reboottime 3

Reference :

Steps To Change CSS Misscount, Reboottime and Disktimeout (Doc ID 284752.1)

Other issues with Rac Performance :

A) Parallel Query

One major issue in RAC is parallel query that goes across many nodes

Define Instance Groups

Specify in init.ora

prodb1.instance_groups='pqgroup1'

prodb2.instance_groups='pqgroup2'

Specify Instance Groups in Session

SQL> alter session set parallel_instance_group='pqgroup1';

RAC Common Tuning Tips

• Application tuning is often the most beneficial

• Resizing and tuning the buffer cache

• Increasing sequence caches to a high value: to avoid index leaf contention caused by high or batch inserts.

• Reducing long full-table scans in OLTP systems: to reduce GCS requests.

• Using Automatic Segment Space Management

• Using partitioning to reduce interinstance traffic

• Avoiding unnecessary parsing

• Minimizing locking usage

• Removing unselective indexes

• Configuring interconnect properly

• Increasing LMS processes: set GCS_SERVER_PROCESSES to 6 or 8 (the instance must be restarted for the parameter to take effect; this can be done in a rolling fashion)

Script intended to display RAC private interconnect traffic

#!/usr/bin/ksh

# -----------------------------------------------------------------------------

# Script intended to display RAC private interconnect traffic

# -----------------------------------------------------------------------------

# Usage : $0 <instance name> <interval in seconds (default 10)> <count (default 5)>

# According to sprepins.sql file in $ORACLE_HOME/rdbms/admin RAC interconnect

# traffic formula is

# Estd Interconnect traffic:

# ((Global Cache blocks received + Global Cache blocks served)*db_block_size

# +(GCS/GES messages received + GCS/GES messages sent)*200)/elapsed time

# -----------------------------------------------------------------------------

# Variables

# -----------------------------------------------------------------------------

OSTYPE=$(uname -s)

case $OSTYPE in

"AIX" ) alias bdf="/usr/bin/df -Ik"

alias ll="/usr/bin/ls -l" ;;

"SunOS") alias bdf="/usr/bin/df -k"

alias ll="/usr/bin/ls -l"

alias awk="/usr/xpg4/bin/awk"

alias grep="/usr/xpg4/bin/grep" ;;

"Linux") alias bdf="/bin/df -k"

alias grep=egrep

alias ll="ls -l" ;;

esac

LOG=/tmp/`basename $0`.$$.log

DB_BLOCK_SIZE=0

START_GCBR=0

START_GCBS=0

START_GCSMS=0

START_GCSMR=0

START_GESMS=0

START_GESMR=0

START_DATE="N/A"

END_DATE="N/A"

TRAFFIC=0

i=1

# -----------------------------------------------------------------------------

# Initialization

# -----------------------------------------------------------------------------

if [ "$LOGNAME" != "root" ]

then

echo -e "\nYou must launch this script as root\n"

exit 1

fi

if [ $# -lt 1 ]

then

echo -e "Usage : $0 <instance name> <interval in seconds (default 10)> <count (default 5)>\n"

echo -e "List of instances : \n`grep -v -e "^#" -e "^$" /etc/oratab|cut -f1 -d:`"

exit 1

else

if grep -q "^$1:" /etc/oratab

then

echo -e "\nInstance $1 exists in /etc/oratab\n"

else

echo -e "Instance <$1> does not exist in /etc/oratab"

echo -e "List of existing instances:\n`grep -v -e "^#" -e "^$" /etc/oratab|cut -f1 -d:`"

exit 1

fi

fi

ORACLE_SID=$1

if [ -n "$2" ]

then

INTERVAL=$2

else

INTERVAL=10

fi

if [ -n "$3" ]

then

COUNT=$3

else

COUNT=5

fi

if [ `bdf /tmp|grep tmp|awk '{print $4}'|tr -d %` -ge 98 ]

then

echo -e "\nFile system /tmp is full\n"

exit 1

fi

rm -f ${LOG}

IFS=:

OHOME=`grep "^$ORACLE_SID:" /etc/oratab | cut -d: -f2`

FLAG=`grep "^$ORACLE_SID:" /etc/oratab | cut -d: -f3`

USER=`grep "^$ORACLE_SID:" /etc/oratab | cut -d: -f4`

VG=`grep "^$ORACLE_SID:" /etc/oratab | cut -d: -f5`

# -----------------------------------------------------------------------------

# Main

# -----------------------------------------------------------------------------

echo -e "Please wait $INTERVAL seconds...\n"

IFS=*

while [ $i -le $COUNT ]

do

su - ${USER} -c "

sqlplus -S / as sysdba << EOF > ${LOG}

set serveroutput on size 999999 linesize 200 feedback off

declare

instance_number number;

db_block_size number;

type figures_type is table of number index by varchar2(64);

start_figures figures_type;

end_figures figures_type;

cursor cursor1(instance_number number) is select

decode(name,'gc cr blocks received','Global Cache blocks received','gc cr blocks served','Global Cache blocks served','gc current blocks received','Global Cache blocks received','gc current blocks served','Global Cache blocks served',name) as name,

sum(value) as value,to_char(sysdate,'yyyy-mm-dd hh24:mi:ss') as date_taken

from gv\\\$SYSSTAT

where inst_id=instance_number

and name in ('gc cr blocks received','gc cr blocks served','gc current blocks received','gc current blocks served','gcs messages sent','ges messages sent')

group by decode(name,'gc cr blocks received','Global Cache blocks received','gc cr blocks served','Global Cache blocks served','gc current blocks received','Global Cache blocks received','gc current blocks served','Global Cache blocks served',name)

union

select name,value,to_char(sysdate,'yyyy-mm-dd hh24:mi:ss') as date_taken

from gv\\\$DLM_MISC

where name in ('gcs msgs received','ges msgs received')

and inst_id=instance_number;

item1 cursor1%rowtype;

plsql_start_date varchar2(19);

plsql_end_date varchar2(19);

interval number;

computed_interval number;

i number;

begin

select value into instance_number from v\\\$parameter where name='instance_number';

select value into db_block_size from v\\\$parameter where name='db_block_size';

interval:=${INTERVAL};

start_figures('Global Cache blocks received'):=${START_GCBR};

start_figures('Global Cache blocks served'):=${START_GCBS};

start_figures('gcs messages sent'):=${START_GCSMS};

start_figures('gcs msgs received'):=${START_GCSMR};

start_figures('ges messages sent'):=${START_GESMS};

start_figures('ges msgs received'):=${START_GESMR};

plsql_start_date:='${START_DATE}';

--dbms_output.put_line('Instance Number: ' || instance_number);

--dbms_output.put_line('db_block_size: ' || db_block_size);

--dbms_output.put_line('interval: ' || interval);

if (start_figures('Global Cache blocks received')=0) then

open cursor1(instance_number);

loop

fetch cursor1 into item1;

exit when cursor1%notfound;

start_figures(item1.name):=item1.value;

end loop;

plsql_start_date:=item1.date_taken;

close cursor1;

dbms_lock.sleep(interval);

end if;

open cursor1(instance_number);

loop

fetch cursor1 into item1;

exit when cursor1%notfound;

end_figures(item1.name):=item1.value;

end loop;

plsql_end_date:=item1.date_taken;

close cursor1;

dbms_output.put_line('DB_BLOCK_SIZE=' || db_block_size);

dbms_output.put_line('START_GCBR=' || start_figures('Global Cache blocks received'));

dbms_output.put_line('START_GCBS=' || start_figures('Global Cache blocks served'));

dbms_output.put_line('START_GCSMS=' || start_figures('gcs messages sent'));

dbms_output.put_line('START_GCSMR=' || start_figures('gcs msgs received'));

dbms_output.put_line('START_GESMS=' || start_figures('ges messages sent'));

dbms_output.put_line('START_GESMR=' || start_figures('ges msgs received'));

dbms_output.put_line('END_GCBR=' || end_figures('Global Cache blocks received'));

dbms_output.put_line('END_GCBS=' || end_figures('Global Cache blocks served'));

dbms_output.put_line('END_GCSMS=' || end_figures('gcs messages sent'));

dbms_output.put_line('END_GCSMR=' || end_figures('gcs msgs received'));

dbms_output.put_line('END_GESMS=' || end_figures('ges messages sent'));

dbms_output.put_line('END_GESMR=' || end_figures('ges msgs received'));

dbms_output.put_line('START_DATE=\"' || plsql_start_date || '\"');

dbms_output.put_line('END_DATE=\"' || plsql_end_date || '\"');

computed_interval:=round((to_date(plsql_end_date,'yyyy-mm-dd hh24:mi:ss')-to_date(plsql_start_date,'yyyy-mm-dd hh24:mi:ss'))*24*60*60);

dbms_output.put_line('COMPUTED_INTERVAL=' || computed_interval);

dbms_output.put_line('TRAFFIC=' || ((end_figures('Global Cache blocks received')+end_figures('Global Cache blocks served')-start_figures('Global Cache blocks received')-start_figures('Global Cache blocks served'))*db_block_size+(end_figures('gcs messages sent')+end_figures('gcs msgs received')+end_figures('ges messages sent')+end_figures('ges msgs received')-start_figures('gcs messages sent')-start_figures('gcs msgs received')-start_figures('ges messages sent')-start_figures('ges msgs received'))*200)/1024/computed_interval);

end;

/

EOF

"

while read line

do

#echo $line

eval $line

done < ${LOG}

#TRAFFIC=`expr (($END_GCBR+$END_GCBS-$START_GCBR-$START_GCBS)*$DB_BLOCK_SIZE+($END_GCSMS+$END_GCSMR+$END_GESMS+$END_GESMR-$START_GCSMS-$START_GCSMR-$START_GESMS-$START_GESMR)*200)/1024/$INTERVAL)`

if [ $TRAFFIC -le 1024 ]

then

echo -e "Estd Interconnect traffic at $END_DATE (KBytes/s): $TRAFFIC\n"

else

TRAFFIC=`echo $TRAFFIC/1024 | bc -l`

echo -e "Estd Interconnect traffic at $END_DATE (MBytes/s): $TRAFFIC\n"

fi

#cat ${LOG}

START_GCBR=$END_GCBR

START_GCBS=$END_GCBS

START_GCSMS=$END_GCSMS

START_GCSMR=$END_GCSMR

START_GESMS=$END_GESMS

START_GESMR=$END_GESMR

START_DATE=$END_DATE

i=`expr $i + 1`

sleep $INTERVAL

done

rm -f ${LOG}

References :

Troubleshooting gc block lost and Poor Network Performance in a RAC Environment (Doc ID 563566.1)

Best Practices and Recommendations for RAC databases with SGA size over 100GB (Doc ID 1619155.1)

Saturday, June 19, 2021

Oracle Fine-Grained Access Control for UTL_* Packages -- ACL

Oracle Database 11g provides a mechanism to refine the level of access to the network
access packages UTL_TCP, UTL_SMTP, UTL_MAIL, UTL_HTTP, and UTL_INADDR.

You can use the DBMS_NETWORK_ACL_ADMIN package to manage access to the UTL_* network access packages, as in the following steps.

You will get ORA-24247 if the ACL is missing.

12c Method of Creating an ACL

BEGIN

DBMS_NETWORK_ACL_ADMIN.APPEND_HOST_ACE(

host => '*',

ace => xs$ace_type(privilege_list => xs$name_list('connect', 'resolve'),

principal_name => 'USERNAME',

principal_type => xs_acl.ptype_db));

END;

/

Pre-12c Method of Creating an ACL

1) Create an Access Control List (ACL):

All ACL definitions are stored in XML DB in the form of XML documents. The ACL XML files reside in the /sys/acls directory of the XML DB
repository. Following is an example of using the CREATE_ACL procedure to create an XML
file called dba.xml:

begin
DBMS_NETWORK_ACL_ADMIN.CREATE_ACL (
ACL => 'dba.xml', -- case sensitive
DESCRIPTION=> 'Network Access Control for the DBAs',
PRINCIPAL => 'SCOTT', -- user or role the privilege is granted or denied(upper case)
IS_GRANT => TRUE, -- privilege is granted or denied
PRIVILEGE => 'connect', -- or 'resolve' (case sensitive)
START_DATE => null, -- when the access control entity ACE will be valid
END_DATE => null); -- ACE expiration date (TIMESTAMP WITH TIMEZONE format)
end;
/

Regarding the PRIVILEGE parameter, the database user needs the connect privilege to an
external network host computer if he or she is connecting using the UTL_TCP, UTL_HTTP,
UTL_SMTP, and UTL_MAIL utility packages. To resolve a host name that was given as a host IP
address, or the IP address that was given as a host name, with the UTL_INADDR package,
grant the database user the resolve privilege.

You can then query the RESOURCE_VIEW view to find the dba.xml ACL in the /sys/acls
directory:

select ANY_PATH
from RESOURCE_VIEW
where ANY_PATH LIKE '/sys/acls/dba%';

Too many entries in the ACL may lead to a significant XML DB performance drop, because ACLs
are checked for each access to the Oracle XML DB repository. As a general rule of thumb, ACL
check operations perform best when the number of ACEs in the ACL is 16 entries or less.

2) Add Access Control Entries:

Once you create the initial ACL, you can continue to add more
privileges to the XML file. The following example will add the user RAMI to the dba.xml file
and grant him network access:
begin

DBMS_NETWORK_ACL_ADMIN.ADD_PRIVILEGE (
ACL => 'dba.xml',
PRINCIPAL => 'RAMI',
IS_GRANT => TRUE,
PRIVILEGE => 'connect',
START_DATE => null, -- if the time interval is defined,
END_DATE => null); -- the ACE will expire after the specified date range
end;
/
COMMIT;

In an ACL, the security entries are evaluated in order of precedence. If you have two contradicting
entries in the list, the first one in the order will take effect. You can control the order number
of an added entry as follows:

begin
DBMS_NETWORK_ACL_ADMIN.ADD_PRIVILEGE (
POSITION => 1, -- on the top
ACL => 'dba.xml',

PRINCIPAL => 'SAMI',
IS_GRANT => FALSE,

PRIVILEGE => 'connect',
START_DATE => null, END_DATE => null);
end;
/

3) Assign Hosts:

The ASSIGN_ACL procedure is used to authorize access to one or more
network hosts as follows:

begin
DBMS_NETWORK_ACL_ADMIN.ASSIGN_ACL (
ACL => 'dba.xml', HOST => 'dbaexpert.com',
LOWER_PORT => 80, UPPER_PORT => 443);
end;
/
COMMIT;

The lower port and the upper port define the lower and the upper boundaries of the allowable
port range. They should be set for connect privileges not resolve privileges.

4) Validate that the ACL permissions worked accordingly.

Following is an example to test the code in the previous step.

select UTL_HTTP.REQUEST('http://www.ahmedbaraka.com') from dual;

If sufficient ACL privileges or ACL assignments are not provided, you will receive the ORA-24247 error.

Access Control Lists Maintenance

Use DELETE_PRIVILEGE to remove an access control entry from the XML file.

exec DBMS_NETWORK_ACL_ADMIN.DELETE_PRIVILEGE( ACL=>'dba.xml', PRINCIPAL=> 'RAMI');

The UNASSIGN_ACL procedure allows you to manually drop ACL assignments. It uses the same parameter list as the ASSIGN_ACL procedure, with any NULL parameters acting as wildcards.

BEGIN

DBMS_NETWORK_ACL_ADMIN.unassign_acl (

acl => 'test_acl_file.xml',

host => '192.168.2.3',

lower_port => 80,

upper_port => NULL);

COMMIT;

END;

/

Use the DROP_ACL procedure to remove the XML file from the /sys/acls directory as follows:

exec DBMS_NETWORK_ACL_ADMIN.DROP_ACL ( ACL=>'dba.xml' );

Views :

Pre-checks to ensure XML DB installed:

-- user XDB exists

select * from ALL_USERS order by username desc;

-- resource_view exists

select * from resource_view;

-- shows XML DB is installed

select * from dba_registry;

To display list of the ACLs created in the database, use the following query:

select HOST, LOWER_PORT, UPPER_PORT, ACL from DBA_NETWORK_ACLS;

You can query the DBA_NETWORK_ACL_PRIVILEGES view to query network privileges granted or
denied for the access control list as follows:

select PRINCIPAL, PRIVILEGE, IS_GRANT
from DBA_NETWORK_ACL_PRIVILEGES
where ACL like '%dba.xml';

Logged on users can use the following query to see their access entries in the dba.xml file:

select HOST, LOWER_PORT, UPPER_PORT, STATUS privilege
from USER_NETWORK_ACL_PRIVILEGES
where HOST in
(select * from
table(DBMS_NETWORK_ACL_UTILITY.DOMAINS('dbaexpert.com')))
and PRIVILEGE = 'connect'
order by DBMS_NETWORK_ACL_UTILITY.DOMAIN_LEVEL(host) desc, LOWER_PORT;

COLUMN host FORMAT A30

COLUMN acl FORMAT A30

SELECT host, lower_port, upper_port, acl

FROM dba_network_acls;

COLUMN acl FORMAT A30

COLUMN principal FORMAT A30

SELECT acl,

principal,

privilege,

is_grant,

TO_CHAR(start_date, 'DD-MON-YYYY') AS start_date,

TO_CHAR(end_date, 'DD-MON-YYYY') AS end_date

FROM dba_network_acl_privileges;

CONN test1/test1@db11g

COLUMN host FORMAT A30

SELECT host, lower_port, upper_port, privilege, status

FROM user_network_acl_privileges;

-- NB: new dba_host_aces view

SELECT host,
       lower_port,
       upper_port,
       ace_order,
       TO_CHAR(start_date, 'DD-MON-YYYY') AS start_date,
       TO_CHAR(end_date, 'DD-MON-YYYY') AS end_date,
       grant_type,
       inverted_principal,
       principal,
       principal_type,
       privilege
FROM dba_host_aces
ORDER BY host, ace_order;

-- NB: new dba_host_acls view

SELECT HOST,
       LOWER_PORT,
       UPPER_PORT,
       ACL,
       ACLID,
       ACL_OWNER
FROM dba_host_acls
ORDER BY host;

-- NB: dba_network_acls deprecated view in 12c

SELECT host, lower_port, upper_port, acl
FROM dba_network_acls
ORDER BY host;

-- NB: dba_network_acl_privileges deprecated view in 12c

SELECT acl,
       principal,
       privilege,
       is_grant,
       TO_CHAR(start_date, 'DD-MON-YYYY') AS start_date,
       TO_CHAR(end_date, 'DD-MON-YYYY') AS end_date
FROM dba_network_acl_privileges
ORDER BY acl, principal, privilege;

-- -----------------------------------------------------------------------------------

-- File Name : https://oracle-base.com/dba/11g/network_acls_ddl.sql

-- Author : Tim Hall

-- Description : Displays DDL for all network ACLs.

-- Requirements : Access to the DBA views.

-- Call Syntax : @network_acls_ddl

-- Last Modified: 28-JUL-2017

-- -----------------------------------------------------------------------------------

SET SERVEROUTPUT ON FORMAT WRAPPED LINESIZE 300

DECLARE

l_last_acl dba_network_acls.acl%TYPE := '~';

l_last_principal dba_network_acl_privileges.principal%TYPE := '~';

l_last_privilege dba_network_acl_privileges.privilege%TYPE := '~';

l_last_host dba_network_acls.host%TYPE := '~';

FUNCTION get_timestamp (p_timestamp IN TIMESTAMP WITH TIME ZONE)

RETURN VARCHAR2

AS

l_return VARCHAR2(32767);

BEGIN

IF p_timestamp IS NULL THEN

RETURN 'NULL';

END IF;

RETURN 'TO_TIMESTAMP_TZ(''' || TO_CHAR(p_timestamp, 'DD-MON-YYYY HH24:MI:SS.FF TZH:TZM') || ''',''DD-MON-YYYY HH24:MI:SS.FF TZH:TZM'')';

END;

BEGIN

FOR i IN (SELECT a.acl,

a.host,

a.lower_port,

a.upper_port,

b.principal,

b.privilege,

b.is_grant,

b.start_date,

b.end_date

FROM dba_network_acls a

JOIN dba_network_acl_privileges b ON a.acl = b.acl

ORDER BY a.acl, a.host, a.lower_port, a.upper_port)

LOOP

IF l_last_acl <> i.acl THEN

-- First time we've seen this ACL, so create a new one.

l_last_host := '~';

DBMS_OUTPUT.put_line('-- -------------------------------------------------');

DBMS_OUTPUT.put_line('-- ' || i.acl);

DBMS_OUTPUT.put_line('-- -------------------------------------------------');

DBMS_OUTPUT.put_line('BEGIN');

DBMS_OUTPUT.put_line(' DBMS_NETWORK_ACL_ADMIN.drop_acl (');

DBMS_OUTPUT.put_line(' acl => ''' || i.acl || ''');');

DBMS_OUTPUT.put_line(' COMMIT;');

DBMS_OUTPUT.put_line('END;');

DBMS_OUTPUT.put_line('/');

DBMS_OUTPUT.put_line(' ');

DBMS_OUTPUT.put_line('BEGIN');

DBMS_OUTPUT.put_line(' DBMS_NETWORK_ACL_ADMIN.create_acl (');

DBMS_OUTPUT.put_line(' acl => ''' || i.acl || ''',');

DBMS_OUTPUT.put_line(' description => ''' || i.acl || ''',');

DBMS_OUTPUT.put_line(' principal => ''' || i.principal || ''',');

DBMS_OUTPUT.put_line(' is_grant => ' || i.is_grant || ',');

DBMS_OUTPUT.put_line(' privilege => ''' || i.privilege || ''',');

DBMS_OUTPUT.put_line(' start_date => ' || get_timestamp(i.start_date) || ',');

DBMS_OUTPUT.put_line(' end_date => ' || get_timestamp(i.end_date) || ');');

DBMS_OUTPUT.put_line(' COMMIT;');

DBMS_OUTPUT.put_line('END;');

DBMS_OUTPUT.put_line('/');

DBMS_OUTPUT.put_line(' ');

l_last_acl := i.acl;

l_last_principal := i.principal;

l_last_privilege := i.privilege;

END IF;

IF l_last_principal <> i.principal

OR (l_last_principal = i.principal AND l_last_privilege <> i.privilege) THEN

-- Add another principal to an existing ACL.

DBMS_OUTPUT.put_line('BEGIN');

DBMS_OUTPUT.put_line(' DBMS_NETWORK_ACL_ADMIN.add_privilege (');

DBMS_OUTPUT.put_line(' acl => ''' || i.acl || ''',');

DBMS_OUTPUT.put_line(' principal => ''' || i.principal || ''',');

DBMS_OUTPUT.put_line(' is_grant => ' || i.is_grant || ',');

DBMS_OUTPUT.put_line(' privilege => ''' || i.privilege || ''',');

DBMS_OUTPUT.put_line(' start_date => ' || get_timestamp(i.start_date) || ',');

DBMS_OUTPUT.put_line(' end_date => ' || get_timestamp(i.end_date) || ');');

DBMS_OUTPUT.put_line(' COMMIT;');

DBMS_OUTPUT.put_line('END;');

DBMS_OUTPUT.put_line('/');

DBMS_OUTPUT.put_line(' ');

l_last_principal := i.principal;

l_last_privilege := i.privilege;

END IF;

IF l_last_host <> i.host||':'||i.lower_port||':'||i.upper_port THEN

DBMS_OUTPUT.put_line('BEGIN');

DBMS_OUTPUT.put_line(' DBMS_NETWORK_ACL_ADMIN.assign_acl (');

DBMS_OUTPUT.put_line(' acl => ''' || i.acl || ''',');

DBMS_OUTPUT.put_line(' host => ''' || i.host || ''',');

DBMS_OUTPUT.put_line(' lower_port => ' || NVL(TO_CHAR(i.lower_port),'NULL') || ',');

DBMS_OUTPUT.put_line(' upper_port => ' || NVL(TO_CHAR(i.upper_port),'NULL') || ');');

DBMS_OUTPUT.put_line(' COMMIT;');

DBMS_OUTPUT.put_line('END;');

DBMS_OUTPUT.put_line('/');

DBMS_OUTPUT.put_line(' ');

l_last_host := i.host||':'||i.lower_port||':'||i.upper_port;

END IF;

END LOOP;

END;

/

References :

https://docs.oracle.com/database/121/ARPLS/d_networkacl_adm.htm#ARPLS74575

How to setup ACL on 12c and later (Doc ID 2357072.1)

Using ACEs to Set Up ACLs on 12c and above (To Remedy ORA-24247 Errors) (Doc ID 2267848.1)

Saturday, June 12, 2021

Oracle Datapump : logtime=all metrics=y , KEEP_MASTER , and increase parallelism for active export/import

We have all been using Data Pump; however, below are a few lesser-known features that can make our life easier.

#######################
METRICS=YES LOGTIME=ALL
#######################

During a migration it is critical to know how long each step takes in order to estimate the migration time.
The LOGTIME=ALL parameter adds a timestamp, down to the millisecond, in front of every line in the log file, which can be helpful in diagnosing I/O issues. METRICS=YES adds a message about each task completed by the workers:

expdp hr DIRECTORY=dpump_dir1 DUMPFILE=expdat.dmp SCHEMAS=hr METRICS=YES LOGTIME=ALL

15-NOV-17 11:48:32.305: W-5 Processing object type DATABASE_EXPORT/SCHEMA/PROCACT_SCHEMA
15-NOV-17 11:48:34.439: W-13 Completed 28 PROCACT_SCHEMA objects in 1 seconds
15-NOV-17 11:48:34.439: W-13 Completed by worker 1 28 PROCACT_SCHEMA objects in 1 seconds
15-NOV-17 11:48:34.440: W-13 Processing object type DATABASE_EXPORT/SCHEMA/TABLE/TABLE
15-NOV-17 11:48:35.472: W-17 Startup took 70 seconds
15-NOV-17 11:48:35.596: W-18 Startup took 70 seconds
15-NOV-17 11:48:35.719: W-20 Startup took 70 seconds
15-NOV-17 11:48:35.841: W-19 Startup took 70 seconds
15-NOV-17 11:48:43.520: W-5 Completed 1714 TABLE objects in 7 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 1 39 TABLE objects in 2 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 2 113 TABLE objects in 7 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 3 85 TABLE objects in 6 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 4 111 TABLE objects in 6 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 5 25 TABLE objects in 1 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 6 113 TABLE objects in 7 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 7 113 TABLE objects in 7 seconds
15-NOV-17 11:48:43.520: W-5 Completed by worker 8 111 TABLE objects in 6 seconds
15-NOV-17 11:48:44.266: W-1 Processing object type DATABASE_EXPORT/SCHEMA/TABLE/TABLE_DATA
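
To compare how much work each worker did, the "Completed by worker" lines can be totalled with a small shell function. This is only a sketch: the function name dp_worker_summary is my own, and it assumes the line layout shown in the sample log above (worker number and object count right after the word "worker", seconds as the second-to-last field).

```shell
# Summarise "Completed by worker" lines from a Data Pump log
# written with METRICS=YES LOGTIME=ALL.
# $1 = path to the log file
dp_worker_summary() {
  awk '
    / Completed by worker / {
      # Find the word "worker"; the next two fields are assumed to be
      # the worker number and the object count.
      for (i = 1; i <= NF; i++) if ($i == "worker") break
      w = $(i + 1)
      objects[w] += $(i + 2)
      seconds[w] += $(NF - 1)   # "... in <N> seconds"
    }
    END {
      for (w in objects)
        printf "worker %s: %d objects, %d seconds\n", w, objects[w], seconds[w]
    }
  ' "$1" | sort -k2,2n
}
```

Usage: dp_worker_summary expdat.log (the log file name here is hypothetical). A worker whose totals are far higher than the others is a hint that the load is not evenly spread.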

#######################
KEEP_MASTER
#######################

KEEP_MASTER=YES prevents the Data Pump master table from being deleted when an export/import job completes.
This is useful when we want to keep the master table for later analysis.

SQL> select owner,segment_name,segment_type,tablespace_name,(bytes/1024/1024)MB from dba_segments where segment_name='SYS_EXPORT_TABLE_01';
SQL> select OBJECT_TYPE,OBJECT_NAME,OBJECT_SCHEMA,ORIGINAL_OBJECT_SCHEMA,ORIGINAL_OBJECT_NAME,OBJECT_TABLESPACE,SIZE_ESTIMATE,OBJECT_ROW from SYS_SQL_FILE_FULL_01 where ORIGINAL_OBJECT_SCHEMA is not null;
OBJECT_TYPE –> Show the object type.
OBJECT_SCHEMA –> Contains the schema name to which it has to be imported.
ORIGINAL_OBJECT_SCHEMA –> column has the original object’s schema name.
OBJECT_TABLESPACE –> Shows the tablespace where the object will be imported.
SIZE_ESTIMATE –> Estimated size of the table in bytes

#######################
Increasing parallelism for ongoing Job
#######################

There are situations where we want to increase or decrease the parallelism of a running Data Pump job. This can be done as follows.

To change the level of parallelism on a running Data Pump job:
- identify the job name (select owner_name, job_name from dba_datapump_jobs);
- attach to it : impdp attach=schema.job_name
- change parallelism at the interactive prompt : parallel=n (n - number of parallel workers)
- wait a bit for the change to apply
- confirm the change : status

#######################
Patch 21539301 : Index creation serial in impdp
#######################

Previously, Data Pump would create indexes one at a time, specifying the PARALLEL keyword for the CREATE INDEX statement to invoke parallel query for index creation. We used to recommend a workaround to create indexes in parallel, which involved a three-step process of importing without indexes, then creating a SQLFILE of the CREATE INDEX statements, and breaking that file into multiple windows.

Through extensive performance testing we found that it is faster to create multiple indexes in parallel (using a parallel degree of 1) instead of creating a single index using parallel query processes. This is enabled by the patch for bug 18793090, superseded by patch 21539301

Need to check if similar patch is available in 19c

#######################
Other Useful Options :
#######################

Export Filtering Parameters

/* EXCLUDE & INCLUDE */

EXCLUDE=INDEX

EXCLUDE=PROCEDURE

EXCLUDE=TABLE:"LIKE 'EMP%'"

EXCLUDE=SCHEMA:"='HR'"

INCLUDE=TABLE:"IN ('EMP', 'DEPT')"

# When used on the command line, use backslashes to escape single and double quotation marks:

expdp .. schemas=SCOTT EXCLUDE=TABLE:\"=\'EMP\'\"

/* QUERY */

QUERY=OE.ORDERS:"WHERE order_id > 100000 ORDER BY order_date desc"
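
An alternative to escaping quotes on the command line is to put the filters in a parameter file and pass it with PARFILE. A minimal sketch, assuming a hypothetical schema SCOTT, a directory object dp_dir and file names of my own choosing:

```shell
# Write a Data Pump parameter file so that filter clauses need no shell escaping.
# $1 = path of the parameter file to create.
# Schema, directory object, and dump file name are illustrative assumptions.
write_exp_parfile() {
  cat > "$1" <<'EOF'
SCHEMAS=SCOTT
DIRECTORY=dp_dir
DUMPFILE=scott.dmp
EXCLUDE=TABLE:"LIKE 'EMP%'"
EXCLUDE=INDEX
EOF
}
```

Usage: write_exp_parfile exp_scott.par, then start the export with expdp scott PARFILE=exp_scott.par. The quoted heredoc keeps the single and double quotes exactly as typed.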

Export Remapping Parameters

/* REMAP_DATA (11g) */

-- the remapping function shouldn’t have commit or rollback

REMAP_DATA=[schema.]tablename.column_name:[schema.]pkg.function

expdp hr/passwd DIRECTORY=dp_dir DUMPFILE=remap.dmp TABLES=hr.employees REMAP_DATA=hr.employees.last_name:hr.remap_pckg.modifychar

Export Encryption Parameters

(11g): To secure the exported dump file, the following parameters are available in Oracle 11g Data Pump: ENCRYPTION, ENCRYPTION_PASSWORD and ENCRYPTION_ALGORITHM. To enable encryption, you must specify either the ENCRYPTION or ENCRYPTION_PASSWORD parameter, or both.

ENCRYPTION = {all | data_only | encrypted_columns_only | metadata_only | none}

ENCRYPTION_ALGORITHM = { AES128 | AES192 | AES256 }

ENCRYPTION_MODE = { DUAL | PASSWORD | TRANSPARENT }

expdp hr DUMPFILE=dp_dir:hr_enc.dmp JOB_NAME=enc ENCRYPTION=data_only ENCRYPTION_PASSWORD=mypassword

expdp hr DIRECTORY=dp_dir DUMPFILE=hr_enc.dmp ENCRYPTION=all ENCRYPTION_PASSWORD=mypassword ENCRYPTION_ALGORITHM=AES256 ENCRYPTION_MODE=dual

Export Estimating Parameters

ESTIMATE={BLOCKS | STATISTICS}

ESTIMATE_ONLY=y

expdp system/pswd estimate_only=y

Export Network Link Parameter

You can initiate an export job from your server and have Data Pump export data from a remote database to dump files located on the instance from which you initiate the Data Pump export job.

A read-only database can still be used as the source.

expdp hr/hr DIRECTORY=dpump_dir1 NETWORK_LINK=source_database_link DUMPFILE=network_export.dmp

-- more detailed steps:

-- scenario: I will take a data pump export from database ORCL

-- and dumpfile will be written to database TIGER

sqlplus sa/a@tiger

create database link orcl.net using 'ORCL';

Create database link orcl.net connect to sa identified by a

using '(DESCRIPTION=(ADDRESS = (PROTOCOL = TCP)(HOST =10.4.x.x) (PORT=1521))

(connect_data=(service_name=orcl)))';

select * from dual@orcl.net;

$ expdp arju/a@tiger directory=d schemas=arju dumpfile=arju_dump_from_orcl.dmp network_link=orcl.net

Expdp:

FLASHBACK_TIME=SYSTIMESTAMP , PARALLEL=4 , FILESIZE=5G , EXCLUDE=STATISTICS , DUMPFILE=TEST_%U.dmp , COMPRESSION=ALL , COMPRESSION_ALGORITHM=MEDIUM , EXCLUDE=CLUSTER,DB_LINK

Impdp:

transform=lob_storage:securefile , REMAP_TABLESPACE=%:DATA

#######################
Checking Performance issue
#######################

SRDC - Diagnostic Collection for DataPump Export Performance Issues (Doc ID 1935743.1)

https://abdul-hafeez-kalsekar-tuning.blogspot.com/2025/02/srdc-diagnostic-collection-for-datapump.html

SQL> connect / as sysdba

SQL> @srdc_expdp_performance.sql

--wait for 10 minutes

SQL> @srdc_expdp_performance.sql

--wait for 10 minutes

SQL> @srdc_expdp_performance.sql

$ $TFA_HOME/bin/tfactl diagcollect -srdc dbexpdpperf

2) Run with metrics=y logtime=all trace=480300 (and collect the logs below)

trace=1ff0B00 metrics=y logtime=all

>> db instance alert log covering the recent run

>> expdp log from export directory

>> dw* and dm* traces

SQL> oradebug setmypid

SQL> oradebug unlimit

SQL> oradebug -g all hanganalyze 3

SQL> oradebug dump systemstate 258

SQL> oradebug -g all hanganalyze 3

SQL> oradebug dump systemstate 258

SQL> oradebug -g all hanganalyze 3

SQL> oradebug dump systemstate 258

> Upload the AWR report for the period when the Data Pump operation was in progress

5) Tracing :

-- Version = 12c

SQL> alter system set events 'sql_trace {process: pname = dw | process: pname = dm} level=12';

#######################
References :
#######################

https://docs.oracle.com/database/121/SUTIL/GUID-2E7A8123-F8D8-4C62-AE4D-8A3F1753E6D3.htm#SUTIL3851

https://docs.oracle.com/database/121/SUTIL/GUID-56B6B4EA-5A2B-45B8-9B41-8966378C2C3D.htm#SUTIL4281

Sunday, June 6, 2021

Oracle Convert Rac Active-Active - To One Node Rac Active/Passive

We recently came across a situation where the application did not support RAC active/active, so we had to convert the database to RAC active/passive.

Below are the steps for converting RAC Active/Active to One Node RAC Active/Passive.

a) Check Status of RAC database
[RACDB1@Abdul-rac1 ] $srvctl status database -d RACDB
Instance RACDB1 is running on node Abdul-rac1
Instance RACDB2 is running on node Abdul-rac2

b) Stop one of the instances before converting the database to a one-node RAC configuration.
[RACDB1@Abdul-rac1 ] $srvctl stop instance -d RACDB -i RACDB2

c) Keep only one instance configured in the cluster. Stopping the second instance is not enough; the conversion fails while it is still registered:

[RACDB1@Abdul-rac1 ] $srvctl convert database -d RACDB -c RACONENODE -i RACDB1
PRCD-1214 : Administrator-managed RAC database RACDB has more than one instance

The above error means the database must have only one instance configured before it can be converted to One Node RAC, so remove the stopped instance:

[RACDB1@Abdul-rac1 ] $srvctl remove instance -d RACDB -i RACDB2
Remove instance from the database RACDB? (y/[n]) y

d) It is mandatory to have at least one service configured for the instance before converting it to RACONENODE mode; otherwise you will get the error below.

[RACDB1@Abdul-rac1 ] $srvctl convert database -d RACDB -c RACONENODE -i RACDB1

PRCD-1242 : Unable to convert RAC database RACDB to RAC One Node database because the database had no service added

[RACDB1@Abdul-rac1 ] $srvctl add service -d RACDB -s TEST -preferred RACDB1

e) Use the srvctl convert command to change the database from RAC mode to RACONENODE.

[RACDB1@Abdul-rac1 ] $srvctl convert database -d RACDB -c RACONENODE -i RACDB1

$ srvctl -h | grep convert
Usage: srvctl convert database -d <db_unique_name> -c RAC [-n <node>]
Usage: srvctl convert database -d <db_unique_name> -c RACONENODE [-i <instname>] [-w <timeout>]

f) Verify One Node RAC configuration.

[RACDB1@Abdul-rac1 ] $srvctl config database -d RACDB
Database unique name: RACDB
Database name: RACDB
Oracle home: /u01/app/oracle/product/11.2.0/db_1
Oracle user: oracle
Spfile: +ASMDATA/RACDB/spfileRACDB.ora
Domain:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: RACDB
Database instances:
Disk Groups: DATA, FRA
Mount point paths:
Services: TEST
Type: RACOneNode <<<<<<< The database has successfully turned into 1-Node Configuration.
Online relocation timeout: 30
Instance name prefix: RACDB1
Candidate servers: Abdul-rac1
Database is administrator managed
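
The check in step f) can be scripted by reading the Type line of the srvctl output. A small sketch: the helper name db_type_from_config is hypothetical, and it assumes the output layout shown above.

```shell
# Read "srvctl config database" output on stdin and print the database type.
# Only the first word after "Type:" is printed, so trailing notes are ignored.
db_type_from_config() {
  awk -F': *' '/^Type:/ { split($2, a, " "); print a[1]; exit }'
}
```

Usage: srvctl config database -d RACDB | db_type_from_config, which should print RACOneNode once the conversion has succeeded.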

g) To relocate a RAC One Node database from one node to another, use the following syntax. The default shutdown mode for manual relocation is shutdown transactional.

[RACDB1@Abdul-rac1 ] $srvctl relocate database -d RACDB -n Abdul-rac2

srvctl relocate database -h

srvctl relocate database -db <db_unique_name> {[-node <target>] [-timeout <timeout>] [-stopoption <stop_option>] | -abort [-revert]} [-drain_timeout <timeout>] [-verbose]

-db <db_unique_name>            Unique name of database to relocate
-node <target>                  Target node to which to relocate database
-timeout <timeout>              Online relocation timeout in minutes (only for RAC One Node database)
-abort                          Abort failed online relocation (only for RAC One Node database)
-revert                         Remove target node of failed online relocation request from the candidate server list of administrator-managed RAC One Node database
-stopoption <stop_option>       Override default shutdown option for running instance (only NORMAL allowed)
-drain_timeout <drain_timeout>  Service drain timeout specified in seconds
-verbose                        Verbose output
-help                           Print usage

Relocation can fail with the undo error below when both instances point to the same undo tablespace. The fix is to assign a separate undo tablespace to each instance (SID).

$ srvctl relocate database -d rontest -n node3
PRCD-1222 : Online relocation of database rontest failed but database was restored to its original state
PRCD-1129 : Failed to start instance rontest_2 for database rontest
PRCR-1064 : Failed to start resource ora.rontest.db on node node3
CRS-5017: The resource action "ora.rontest.db start" encountered the following error:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-30013: undo tablespace 'UNDOTBS1' is currently in use
Process ID: 1587
Session ID: 35 Serial number: 1
CRS-2674: Start of 'ora.rontest.db' on 'node3' failed

SQL> select tablespace_name from dba_data_files where tablespace_name like '%UNDO%';
TABLESPACE_NAME
------------------------------
UNDOTBS1
UNDOTBS2
So the tablespace was there, but the initialisation parameter was wrong! Let’s correct this:
SQL> alter system set undo_tablespace='UNDOTBS1' sid='rontest_1';
System altered.
SQL> alter system set undo_tablespace='UNDOTBS2' sid='rontest_2';
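
The same statements can be generated rather than typed. This sketch assumes undo tablespaces named UNDOTBS1, UNDOTBS2, ... and instances named <prefix>_1, <prefix>_2, ..., as in the example above; the function name undo_sid_statements is hypothetical.

```shell
# Print one "alter system set undo_tablespace" statement per instance.
# $1 = instance name prefix (e.g. rontest), $2 = number of instances.
# Assumes tablespace UNDOTBS<n> belongs to instance <prefix>_<n>.
undo_sid_statements() {
  prefix=$1
  count=$2
  i=1
  while [ "$i" -le "$count" ]; do
    printf "alter system set undo_tablespace='UNDOTBS%s' sid='%s_%s';\n" "$i" "$prefix" "$i"
    i=$((i + 1))
  done
}
```

Usage: undo_sid_statements rontest 2 prints the two statements shown above, which can be reviewed and then run in SQL*Plus.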

If we want to remove and re-add database in cluster :

1) Stop database --> srvctl stop database -d dbname

2) remove database from cluster --> srvctl remove database -d dbname

3) re-add database to cluster using below

srvctl add database -dbname database_name -dbtype RACONENODE -oraclehome /u01/19c -instance database_name -spfile asm_spfile_path -pwfile password-file_path -server server1,server2 -db database_name

srvctl add database -db db_unique_name [-eval]
-oraclehome oracle_home [-node node_name] [-domain domain_name]
[-spfile spfile] [-pwfile password_file_path]
[-dbtype {RACONENODE | RAC | SINGLE} [-server "server_list"]
[-instance instance_name] [-timeout timeout]]
[-role {PRIMARY | PHYSICAL_STANDBY | LOGICAL_STANDBY | SNAPSHOT_STANDBY}]
[-startoption start_options] [-stopoption stop_options] [-dbname db_name]
[-acfspath "acfs_path_list"] [-policy {AUTOMATIC | MANUAL | NORESTART}]
[-serverpool "server_pool_list" [-pqpool "pq_pool_list"]]
[-diskgroup "disk_group_list"] [-verbose]

If we need to modify the instance name :

If we want to change the instance name, the best way is to remove and re-add the database. We can also use the command below, but then the instance name is hard-coded to each node. By default, if the instance name prefix given at the time of conversion is OBCDATA, the instance on the first node is named OBCDATA_1. If there is a server crash, the same instance name moves to the other node, i.e. the instance comes up as OBCDATA_1 on node 2. After a manual relocation the instance always comes up as OBCDATA_2, because during a manual relocation two instances cannot have the same name.

Using the option below forces each node to always use its own instance name, even in case of a server failover.

srvctl modify instance -d dbname -i instance_name -n node_name

References :

https://docs.oracle.com/en/database/oracle/oracle-database/19/rilin/installing-oracle-rac-rac-one-node-software-only.html#GUID-E865878B-C328-4368-A8F5-C7B2CD81172B

Instance Name changes during RAC One Node Failover/Switchover ( Doc ID 1540629.1 )

Naming Of RAC ONE NODE Instances Across RAC Nodes ( Doc ID 2063599.1 )

Oracle 19c AutoUpgrade utility

Oracle Database AutoUpgrade is a utility introduced with Oracle 19c to automate the upgrade process: it identifies issues before the upgrade, performs pre-upgrade actions, deploys the upgrade and performs post-upgrade actions. You can upgrade multiple databases at the same time using a single configuration file.

I did the below for testing purposes only. I personally use DBUA.

AutoUpgrade has 4 processing modes, selected with -mode:

analyze , fixups , upgrade , deploy

Below are the High level steps:

Install Oracle 19.3.0.0 binaries
Prerequisite for Autoupgrade
Create the config file
Analyze the database
Deploy the upgrade
Post upgrade task

Environment Details:
Source Hostname: new19c
Database version: 12.2.0.1
Database Name: Abdultest
ORACLE_HOME: /u01/app/oracle/product/12.2.0/db_1/
Target Hostname: new19c
Database Version: 19.3.0.0
Database name: Abdultest
ORACLE_HOME: /u01/app/oracle/product/19.3.0/dbhome_1

Source DB Details
SQL> select name, open_mode, version, status from v$database, v$instance;
NAME OPEN_MODE VERSION STATUS
--------- -------------------- ----------------- ------------
ABDULTEST READ WRITE 12.2.0.1.0 OPEN

1. Install Oracle 19.3.0.0 binaries

I have already completed the installation of Oracle 19.3.0.0 on Linux 7 64-bit, so the step-by-step installation is not repeated here.

2. Prerequisite for Autoupgrade : Download the latest autoupgrade.jar file

The autoupgrade.jar file exists by default under the $ORACLE_HOME/rdbms/admin directory from Oracle 19.3 onwards; however, Oracle strongly recommends downloading the latest AutoUpgrade version before doing the upgrade. Refer to Doc ID 2485457.1.

Replace autoupgrade.jar with the latest version downloaded:

[oracle@new19c ~]$ mv $ORACLE_HOME/rdbms/admin/autoupgrade.jar $ORACLE_HOME/rdbms/admin/autoupgrade.jar-bkp
[oracle@new19c ~]$ cp /tmp/autoupgrade.jar $ORACLE_HOME/rdbms/admin/
[oracle@new19c ~]$

[oracle@new19c ~]$ $ORACLE_HOME/jdk/bin/java -jar $ORACLE_HOME/rdbms/admin/autoupgrade.jar -version

build.hash e84c9c2
build.version 19.10.0
build.date 2020/10/23 10:36:46
build.max_target_version 19
build.supported_target_versions 12.2,18,19
build.type production

Java version
Java version should be 8 or later, which is available by default in Oracle Database homes from release 12.1.0.2 onwards.

[oracle@new19c temp]$ $ORACLE_HOME/jdk/bin/java -version
java version "1.8.0_201"
Java(TM) SE Runtime Environment (build 1.8.0_201-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.201-b09, mixed mode)

3. Create the config file

Create a directory to hold all upgrade config and log files.

[oracle@new19c ~]$ mkdir /u01/19c-autoupg
[oracle@new19c ~]$ cd /u01/19c-autoupg

Create the sample config file
cd /u01/19c-autoupg
export ORACLE_HOME=/u01/app/oracle/product/19.3.0/dbhome_1
export PATH=$PATH:$ORACLE_HOME/jdk/bin

$ORACLE_HOME/jdk/bin/java -jar $ORACLE_HOME/rdbms/admin/autoupgrade.jar -create_sample_file config

--output
Created sample configuration file /u01/19c-autoupg/sample_config.cfg

Modify the config file

Copy the sample config file and make the necessary changes as per the database environment.
cd /u01/19c-autoupg
cp sample_config.cfg abdultest_db_config.cfg
vi abdultest_db_config.cfg

This is the config file I used for upgrade:
[oracle@new19c 19c-autoupg]$ cat abdultest_db_config.cfg
global.autoupg_log_dir=/u01/19c-autoupg/upg_logs
#
# Database abdultest
#
upg1.dbname=abdultest
upg1.start_time=NOW
upg1.source_home=/u01/app/oracle/product/12.2.0/db_1/
upg1.target_home=/u01/app/oracle/product/19.3.0/dbhome_1
upg1.sid=abdultest
upg1.log_dir=/u01/19c-autoupg/upg_logs/abdultest
upg1.upgrade_node=new19c
upg1.target_version=19.3
upg1.run_utlrp=yes
upg1.timezone_upg=yes
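
When several databases need the same kind of config file, it can be generated from a few variables instead of edited by hand. This is a sketch only: the function name write_autoupgrade_cfg and its argument order are my own, and it writes the same keys as the config file above.

```shell
# Generate an AutoUpgrade config file with the keys used in this post.
# Arguments: 1=db name (also used as SID), 2=source home, 3=target home,
#            4=base log directory, 5=output file, 6=upgrade node (optional,
#            defaults to the local node name, which is an assumption).
write_autoupgrade_cfg() {
  cfg_db=$1
  cfg_src=$2
  cfg_tgt=$3
  cfg_log=$4
  cfg_out=$5
  cfg_node=${6:-$(uname -n)}
  cat > "$cfg_out" <<EOF
global.autoupg_log_dir=$cfg_log
upg1.dbname=$cfg_db
upg1.start_time=NOW
upg1.source_home=$cfg_src
upg1.target_home=$cfg_tgt
upg1.sid=$cfg_db
upg1.log_dir=$cfg_log/$cfg_db
upg1.upgrade_node=$cfg_node
upg1.target_version=19.3
upg1.run_utlrp=yes
upg1.timezone_upg=yes
EOF
}
```

Usage: write_autoupgrade_cfg abdultest /u01/app/oracle/product/12.2.0/db_1 /u01/app/oracle/product/19.3.0/dbhome_1 /u01/19c-autoupg/upg_logs abdultest_db_config.cfg new19c. Review the generated file before running AutoUpgrade with it.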

4. Analyze the database
AutoUpgrade Analyze mode checks whether the database is ready for the upgrade. It only reads data from the database and does not perform any updates.

Execute AutoUpgrade in analyze mode with the below syntax:
export ORACLE_HOME=/u01/app/oracle/product/19.3.0/dbhome_1
export PATH=$PATH:$ORACLE_HOME/jdk/bin

cd /u01/19c-autoupg
$ORACLE_HOME/jdk/bin/java -jar $ORACLE_HOME/rdbms/admin/autoupgrade.jar -config abdultest_db_config.cfg -mode ANALYZE

Output
[oracle@new19c 19c-autoupg]$ $ORACLE_HOME/jdk/bin/java -jar $ORACLE_HOME/rdbms/admin/autoupgrade.jar -config cdbdev_db_config.cfg -mode ANALYZE

AutoUpgrade tool launched with default options
Processing config file ...
+--------------------------------+
| Starting AutoUpgrade execution |
+--------------------------------+
1 databases will be analyzed
Type 'help' to list console commands

We can monitor, manage and control the jobs from the AutoUpgrade console. Example:
lsj – to list the jobs
status – to show the job status
tasks – shows the tasks executing
logs – to check log folder

All Analyze logs are created under autoupg_log_dir

[oracle@new19c 100]$ cd /u01/19c-autoupg/upg_logs/abdultest/abdultest/100/prechecks/
[oracle@new19c prechecks]$ ls -l
total 536
-rwx------. 1 oracle oinstall 5051 Nov 19 03:28 abdultest_checklist.cfg
-rwx------. 1 oracle oinstall 18050 Nov 19 03:28 abdultest_checklist.json
-rwx------. 1 oracle oinstall 17101 Nov 19 03:28 abdultest_checklist.xml

-rwx------. 1 oracle oinstall 36704 Nov 19 03:28 abdultest_preupgrade.html
-rwx------. 1 oracle oinstall 17649 Nov 19 03:28 abdultest_vpreupgrade.log

-rwx------. 1 oracle oinstall 158030 Nov 19 03:28 prechecks_cdb_root.log
-rwx------. 1 oracle oinstall 140241 Nov 19 03:28 prechecks_pdbdev.log
-rwx------. 1 oracle oinstall 139243 Nov 19 03:28 prechecks_pdb_seed.log

We can review the html file (abdultest_preupgrade.html) which will list all precheck Errors, warnings and recommendations.

5. Deploy the upgrade

AutoUpgrade Deploy mode performs the actual upgrade of the database, from pre-upgrade analysis of the source database through to post-upgrade checks.

Note: Before deploying the upgrade, you must have a backup plan in place.

Execute AutoUpgrade in DEPLOY mode using the below syntax:

export ORACLE_HOME=/u01/app/oracle/product/19.3.0/dbhome_1
export PATH=$PATH:$ORACLE_HOME/jdk/bin

cd /u01/19c-autoupg
$ORACLE_HOME/jdk/bin/java -jar $ORACLE_HOME/rdbms/admin/autoupgrade.jar -config abdultest_db_config.cfg -mode DEPLOY

Once the upgrade process is started consider monitoring the logs to see the progress of the upgrade. Autoupgrade logs are available under,

/u01/19c-autoupg/upg_logs/abdultest/abdultest/101/dbupgrade

Output

[oracle@new19c 19c-autoupg]$ $ORACLE_HOME/jdk/bin/java -jar $ORACLE_HOME/rdbms/admin/autoupgrade.jar -config abdultest_db_config.cfg -mode DEPLOY

AutoUpgrade tool launched with default options

---- Drop GRP at your convenience once you consider it is no longer needed ----
Drop GRP from cdbdev: drop restore point AUTOUPGRADE_9212_CDBDEV122010

[oracle@new19c 19c-autoupg]$

Check the upg_summary.log
Oracle Database Release 19 Post-Upgrade Status Tool 11-19-2020 06:30:0
Container Database: CDBDEV
[CON_ID: 2 => PDB$SEED]
Component Current Full Elapsed Time
Name Status Version HH:MM:SS
Oracle Server VALID 19.3.0.0.0 00:34:10
JServer JAVA Virtual Machine VALID 19.3.0.0.0 00:02:25
Oracle XDK VALID 19.3.0.0.0 00:01:19
Oracle Database Java Packages VALID 19.3.0.0.0 00:00:10
OLAP Analytic Workspace VALID 19.3.0.0.0 00:00:55
Oracle Label Security VALID 19.3.0.0.0 00:00:11
Oracle Database Vault VALID 19.3.0.0.0 00:03:00
Oracle Text VALID 19.3.0.0.0 00:00:42
Oracle Workspace Manager VALID 19.3.0.0.0 00:00:52
Oracle Real Application Clusters OPTION OFF 19.3.0.0.0 00:00:00
Oracle XML Database VALID 19.3.0.0.0 00:03:32
Oracle Multimedia VALID 19.3.0.0.0 00:00:46
Spatial VALID 19.3.0.0.0 00:09:15
Oracle OLAP API VALID 19.3.0.0.0 00:00:26
Datapatch 00:04:50
Final Actions 00:05:11
Post Upgrade 00:02:06
Post Compile 00:11:29
Total Upgrade Time: 01:17:50 [CON_ID: 2 => PDB$SEED * ]
Asterisks denotes compilation time has been included during the upgrade process.
Database time zone version is 26. It is older than current release time
zone version 32. Time zone upgrade is needed using the DBMS_DST package.
Upgrade Times Sorted In Descending Order
Total Upgrade Time: 01:17:50 [CON_ID: 2 => PDB$SEED * ]
Total Upgrade Time: 01:06:31 [CON_ID: 1 => CDB$ROOT]
Total Upgrade Time: 01:06:12 [CON_ID: 3 => PDBDEV]
Grand Total Upgrade Time: [0d:2h:42m:43s]

[oracle@new19c dbupgrade]$
The timezone file upgrade and database recompilation have already been completed by the AutoUpgrade utility because the below values are set to "yes" in the config file:
upg1.run_utlrp=yes # yes(default) to run utlrp as part of upgrade
upg1.timezone_upg=yes # yes(default) to upgrade timezone if needed
Check the Timezone version
SQL> SELECT VERSION FROM V$TIMEZONE_FILE;
VERSION
----------
32
SQL>
Check the db details
SQL> select name, open_mode, version, status from v$database, v$instance;
NAME OPEN_MODE VERSION STATUS
--------- -------------------- ----------------- ------------
ABDULTEST READ WRITE 19.0.0.0.0 OPEN
SQL>
6. Post-upgrade task
Once the upgrade is successful and all testing is done, drop the restore point.
Drop the Guaranteed restore point
SQL> select name from v$restore_point;
NAME
------------------------------
AUTOUPGRADE_9212_ABDULTEST122010
SQL>
SQL> drop restore point AUTOUPGRADE_9212_ABDULTEST122010;
Restore point dropped.
SQL>

Change the compatible parameter

Note: After the upgrade, the database has to be tested properly before updating the compatible parameter. Once the parameter is updated, the database cannot be downgraded.
show parameter compatible
alter system set compatible='19.0.0' scope=spfile;
shutdown immediate;
startup;
show parameter compatible

--output
SQL> show parameter compatible
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
compatible string 12.2.0
noncdb_compatible boolean FALSE
SQL> alter system set compatible='19.0.0' scope=spfile;
System altered.
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area 1560278096 bytes
Fixed Size 9135184 bytes
Variable Size 973078528 bytes
Database Buffers 570425344 bytes
Redo Buffers 7639040 bytes
Database mounted.
Database opened.
SQL>
SQL> show parameter compatible
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
compatible string 19.0.0
noncdb_compatible boolean FALSE
SQL>
It’s Done. Database is successfully upgraded from 12c to 19c.

Reference :

https://docs.oracle.com/en/database/oracle/oracle-database/19/upgrd/using-autoupgrade-oracle-database-upgrades.html#GUID-71883C8C-7A34-4E93-8955-040CB04F2109