Saturday, November 27, 2021

Oracle Data Scrubbing Exadata and Asm

 
Oracle ASM disk scrubbing improves availability and reliability by searching for data that may be less likely to be read. Disk scrubbing checks logical data corruptions and repairs them automatically in normal and high redundancy disks groups. The scrubbing process repairs logical corruptions using the mirror disks. Disk scrubbing can be combined with disk group rebalancing to reduce I/O resources. The disk scrubbing process has minimal impact to the regular I/O in production systems.

You can perform scrubbing on a disk group, a specified disk, or a specified file of a disk group with the ALTER DISKGROUP SQL statement. For example, the following SQL statements show various options used when running the ALTER DISKGROUP disk_group SCRUB SQL statement.



Exadata Disk Scrubbing

A subtler Exadata Maintenance job is the bi-weekly Disk Scrub. This job does not appear in the CellCLI alert history. It only appears in the $CELLTRACE/alert.log.

Disk Scrubbing is designed to periodically validate the integrity of the mirrored ASM extents and thus eliminate latent corruption. The scrubbing is supposed to only run when average I/O utilization is under 25 percent. However, this can still cause spikes in utilization and latency and adversely affect database I/O, Oracle documentation says that a 4TB high capacity hard disk can take 8-12 hours to scrub, but I have seen it run more than 24 hours. Normally, this isn’t noticeable as it runs quietly in the background. 

However, if you have a high I/O workload, the additional 10-15 percent latency is noticeable.
Logs and Schedule

The $CELLTRACE/alert.log on the cell nodes reports the timing and results.
Wed Jan 11 16:00:07 2017
Begin scrubbing CellDisk:CD_11_xxxxceladm01.
Begin scrubbing CellDisk:CD_10_xxxxceladm01.

Thu Jan 12 15:12:37 2017
Finished scrubbing CellDisk:CD_10_xxxxceladm01, scrubbed blocks (1MB):3780032, found bad blocks:0
Thu Jan 12 15:42:02 2017
Finished scrubbing CellDisk:CD_11_xxxxceladm01, scrubbed blocks (1MB):3780032, found bad blocks:0
You can connect to the cell nodes and alter the Start Time and Interval at CellCLI prompt:
CellCLI> alter cell hardDiskScrubStartTime=’2017-01-21T08:00:00-08:00′;
CellCLI> list cell attributes name,hardDiskScrubInterval
biweekly


ASM Disk Scrubbing

ASM Disk Scrubbing performs a similar task to Exadata Disk Scrubbing. It searches the ASM blocks and repairs logical corruption using the mirror disks. The big difference is that ASM Disk scrubbing is run manually at disk group or file level and can be seen in V$ASM_OPERATION view and the alert_+ASM.log
Logs and Views
The alert_+ASM.log on the database node reports the command and duration.
Mon Feb 06 09:03:58 2017
SQL> alter diskgroup DBFS_DG scrub power low
Mon Feb 06 09:03:58 2017
NOTE: Start scrubbing diskgroup DBFS_DG
Mon Jan 06 09:03:58 2017
SUCCESS: alter diskgroup DBFS_DG scrub power low



Performing  Scrubbing :

Scrubbing A data file:  
Within a disk, ASM disk scrubbing can be invoked to scrub individual disk data.
alter diskgroup
   data
scrub file
   '+data/orcl/datafile/example.266.806582193'
repair power high force;



Scrubbing A specific disk:  
You can direct ASM to use the mirrored data to scrub and synchronize a mirrored disk spindle.
alter diskgroup
   data
scrub disk
   data_0005
repair power high force;



Scrubbing An ASM disk group:  
Entire ASM disk groups can be done with disk scrubbing.
alter diskgroup
   data
scrub power low;

 
scrub  . . . repair:   
This option automatically repairs disk corruptions. If the "repair" keywords is not used, Oracle will only identify corruptions and not fix them:
alter diskgroup
   data
scrub disk
   data_0005
repair power low; --> reports and repairs corruptions
alter diskgroup
   data
scrub disk
   data_0005
power high force;  --> reports corruptions



scrub . . . power:  
If the "power" argument is specified with data scrubbing, you can have several levels of power:
alter diskgroup data scrub power low;
alter diskgroup data scrub power auto; --> default
alter diskgroup data scrub power high
alter diskgroup data scrub power max;
SQL> ALTER DISKGROUP data SCRUB POWER LOW;
SQL> ALTER DISKGROUP data SCRUB FILE '+DATA/ORCL/DATAFILE/example.266.806582193' 
       REPAIR POWER HIGH FORCE;
SQL> ALTER DISKGROUP data SCRUB DISK DATA_0005 REPAIR POWER HIGH FORCE;



scrub  . . . wait:  
If the optional "wait" option is specified, the command returns after the scrubbing operation has completed. If the WAIT option is not specified, the scrubbing operation is added into the scrubbing queue and the command returns immediately.

scrub . . . force:  
If the optional "force" option is specified, the command is processed even if the system I/O load is high or if scrubbing has been disabled internally at the system level.




Seeing Scrubbing Process : 

You can monitor the scrubbing progress from V$ASM_OPERATION view.

[grid@dbserver]$ ps -ef | grep asm_sc
grid     17902     1  0 11:27 ?        00:00:00 asm_scrb_+ASM
grid     24365     1  0 11:49 ?        00:00:01 asm_scc0_+ASM
...
SCRB - ASM disk scrubbing master process that coordinates disk scrubbing operations.
SCCn - ASM disk scrubbing slave check process that performs the checking operations. The possible processes are SCC0-SCC9.
We would see additional processes during the repair operation:
SCRn - ASM disk scrubbing slave repair process that performs repair operation. The possible processes are SCR0-SCR9.
  
SCVn - ASM disk scrubbing slave verify process that performs the verifying operations. The possible processes are SCV0-SCV9.



ZFS Data Scrubbing : 
We can also perform  ZFS scrubbing . We can find more information  on this  in below link 

https://docs.oracle.com/cd/E19120-01/open.solaris/817-2271/gbbxi/index.html




Reference :

https://docs.oracle.com/database/121/OSTMG/GUID-ECCAE4F3-CFC5-4B55-81D2-FFB6953C035C.htm#OSTMG95351

http://ora-srv.wlv.ac.uk/oracle19c_doc/ostmg/alter-diskgroups.html#GUID-6BB31112-8687-4C1E-AF14-D94FFCDA736F

Doc 
2094581.1
2049911.1