Tuesday, September 7, 2021

Exadata Disk replacement

 
In this  blog  We will be concentrating on disk  replacement strategy for Exadata . However before proceeding on disk replacement we will also try to get some insight on   Asm redundancy levels in Exadata and failover groups . 



When we created these ASM disk groups, we specified normal redundancy. With Exadata, external redundancy is not an option - you either need to use normal or high redundancy. With normal redundancy, each extent is mirrored to a different cell, and with high redundancy, they are mirrored via ASM to two additional cells. Specifically, extents are mirrored to partner disks in different failure groups.

On an Exadata Quarter Rack, Oracle only supports normal redundancy because there are only three failure groups, one for each cell. While this certainly seems like it should be enough failure groups to support three extent mirrors, the challenges comes with Oracle's requirement on OCR and voting disks. 

With high redundancy ASM disk groups, Oracle requires five failure groups. Since Oracle cannot guarantee that you won't place your OCR/voting disks on this ASM disk group, it simply prevents you from creating it. This being said, the syntax above for creating a high redundancy ASM disk group is valid for the Half Rack and Full Rack configurations.

In the Exadata environment, the failure group for disks on each cell node is cellnode itself. In this way, the blocks on the disk in the cell node is also present on another cellnode, so even if the cellnode is turned off, there will be no problem.



Before we proceed we need to understand what is Cell Disk  and Grid Disk 

The Exadata Storage Server contains 12 physical disks.
There is one to one relationship between a physical disk and the cell disk. One physical disk corresponds to a single cell disk.
Then a Grid disk comprises of many cell disks.
And a Grid disk corresponds to one ASM disk.
An ASM diskgroup comprises of many ASM disks.


For disk maintenance activity asmdeactivationoutcome  for  grid disk  and DISK_REPAIR_TIME for diskgroup is important .


 Below are 2 files that make cell nodes visible to Database node ( Compute Node ) 
 cellinit.ora -- decides which network takes storage traffic.
 cellip.ora - list of cells, new cells can be added dynamically without shutdown


 Below is hierarchy in which disk are created 
 Physical disk / Cell disk is allocated to Cell 
 Grid disk is created on top of Cell Disk 
 Asm  disk is allocated using Grid Disk 


Each Cell comes with 12 SAS Harddisks (600 GB each with High Performance resp. 2 TB each with High Capacity). The picture below shows a Cell with the 12 Harddisks on the front:
Also each Cell has 4 Flashcards built in that are divided into 4 Flashdisks each, summarizing to 16 Flashdisks in each Cell that deliver by default 384 GB Flash Cache.  


1) Physical Disks
Physical Disks can be of the type Harddisk or of the type Flashdisk. You cannot create or drop them. The only administrative task on that layer can be to turn the LED at the front of the Cell on before you replace a damaged Harddisk to be sure you pull out the right one, with a command like
CellCLI> alter physicaldisk  serviceled on



2) Luns
Luns are the second layer of abstraction. They have been introduced, because the first two Harddisks in every Cell are different than the other 10 in so far as they contain the Operating System (Oracle Enterprise Linux). About 30 GB have been carved out of the first 2 Harddisks for that purpose. We have 2 of them for redundancy  – the Cell can still operate if only one of the first 2 Harddisks fails. If we investigate the first 2 LUNs, we see the mirrored OS Partitions. Jo 
As an Administrator, you do not need to do anything on the Lun Layer except looking at it with commands like
CellCLI> list lun



3) Celldisks
Celldisks are the third layer of abstraction. It was introduced to enable interleaving in the first place.
CellCLI> create celldisk all harddisk interleaving='normal_redundancy'
CellCLI> create celldisk all harddisk


4) Griddisks
Griddisks are the fourth layer of abstraction, and they will be the Candidate Disks to build your ASM diskgroups from.
With interleaving specified at the Celldisk layer, this is different: The Griddisks are then being created from outer and inner parts of the Harddisk, leading to equal performance of the Griddisks and also then of the later created Diskgroups.


5) Asm  diskgroups 

Asm diskgroups are created using Grid disk from step 4 



++++++++++++++++++++++++++++++++++++++++++++++++++++
Strategy for Disk  replacement depends on  below scenario
++++++++++++++++++++++++++++++++++++++++++++++++++++
1) If disk is   failed /total failure 
2) Disk is slow performance or  Predictive Failures
3)  if   poor performance disk to be replace is flash disk 



 
 
++++++++++++++++++++++++++++++++++++++++++++++++++++
Flash Disk Replacement due to poor performance in Exadata 
++++++++++++++++++++++++++++++++++++++++++++++++++++
 
To identify a poor performance flash disk, use the following command:

CellCLI> LIST PHYSICALDISK WHERE DISKTYPE=flashdisk AND STATUS= 'warning - poor performance' DETAIL
 
Recommended Action  
The flash disk has entered poor performance status. A white cell locator LED has been lit to help locate the affected cell. 
Please replace the flash disk.
If the flash disk is used for flash cache, then flash cache will be disabled on this disk thus reducing the effective flash cache size. 
If the flash disk is used for flash log, then flash log will be disabled on this disk thus reducing the effective flash log size.
If the flash disk is used for grid disks, then Oracle ASM rebalance will automatically restore the data redundancy.
Sun Oracle Exadata Storage Server is equipped with four PCIe cards. Each card has four flash disks (FDOMs) for a total of 16 flash disks. 
The 4 PCIe cards are present on PCI slot numbers 1, 2, 4, and 5. 
The PCIe cards are not hot-pluggable such that Exadata Cell must be powered down before replacing the flash disks or cards.
Hence DataCenter Team replaced a flash disk in co-ordination with us (DBA) because the flash disk was in poor performance status.


[1] Verify the existing disk_repair_time attribute for all diskgroups

SQL> select dg.name,a.value from 
v$asm_diskgroup dg, v$asm_attribute a 
where dg.group_number=a.group_number and
a.name='disk_repair_time';


[2] The default disk_repair_time is 3.6 hours only so better to adjust.

 SQL> ALTER DISKGROUP DATA SET ATTRIBUTE 'DISK_REPAIR_TIME'='8.5H';





1. Shut down the cell.
The following procedure describes how to power down Exadata Cell.Run the following command to check if there are offline disks 
on other cells that are mirrored with disks on this cell:

CellCLI > LIST GRIDDISK ATTRIBUTES name WHERE asmdeactivationoutcome != 'Yes'

If any grid disks are returned, then it is not safe to take the storage server offline because proper Oracle ASM disk group redundancy 
will not be intact. Taking the storage server offline when one or more grid disks are in this state will cause Oracle ASM to dismount 
the affected disk group, causing the databases to shut down abruptly.
Inactivate all the grid disks when Oracle Exadata Storage Server is safe to take offline using the following command:

CellCLI> ALTER GRIDDISK ALL INACTIVE

The preceding command will complete once all disks are inactive and offline. Depending on the storage server activity,
 it may take several minutes for this command to complete.
Verify all grid disks are INACTIVE to allow safe storage server shut down by running the following command.

CellCLI> LIST GRIDDISK

If all grid disks are INACTIVE, then the storage server can be shutdown without affecting database availability.


Stop the cell services using the following command:

CellCLI> ALTER CELL SHUTDOWN SERVICES ALL
Shut down the cell.

2. Replace the failed flash disk based on the PCI number and FDOM number.

3. Power up the cell. The cell services will be started automatically.

4.Bring all grid disks are online using the following command:
CellCLI> ALTER GRIDDISK ALL ACTIVE



5. Verify that all grid disks have been successfully put online using the following command:
CellCLI> LIST GRIDDISK ATTRIBUTES name, asmmodestatus
        Wait until asmmodestatus from SYNCING to ONLINE for all grid disks. 
 
CellCLI> LIST GRIDDISK ATTRIBUTES name, asmmodestatus


      
6. Verify FlashCache 

CellCLI> list flashcache
orclcel05_FLASHCACHE warning - degraded

If the flashcache state is not normal after the flash disk change, drop flashcache, flashlog, and flashdisk drop and recreate in the same order.

Exadata X6 systems do not require this step.

CellCLI> drop flashcache
Flash cache orclcel05_FLASHCACHE successfully dropped
 
CellCLI> drop flashlog
Flash log orclcel05_FLASHLOG successfully dropped
 
CellCLI> drop celldisk all flashdisk force


CellCLI> create celldisk all flashdisk
 
CellCLI> create flashlog all
Flash log orclcel05_FLASHLOG successfully created
 
CellCLI> create flashcache all
Flash cache orclcel05_FLASHCACHE successfully created





Oracle ASM synchronization is only complete when all grid disks show attribute asmmodestatus=ONLINE. Before taking another storage server offline, Oracle ASM synchronization must complete on the restarted Oracle Exadata Storage Server. If synchronization is not complete, then the check performed on another storage server will fail.

The new flash disk will be automatically used by the system. If the flash disk is used for flash cache, then the effective cache size will increase. If the flash disk is used for grid disks, then the grid disks will be recreated on the new flash disk. If those gird disks were part of an Oracle ASM disk group, then they will be added back to the disk group and the data will be rebalanced on them based on the disk group redundancy and asm_power_limit parameter.

Oracle ASM rebalance occurs when dropping or adding a disk. To check the status of the rebalance, do the following:

The rebalance operation may have been successfully run. Check the Oracle ASM alert logs to confirm
The rebalance operation may be currently running. Check the GV$ASM_OPERATION view to determine if the rebalance operation is still running.

The rebalance operation may have failed. Check the GV$ASM_OPERATION.ERROR view to determine if the rebalance operation failed.

Rebalance operations from multiple disk groups can be done on different Oracle ASM instances in the same cluster if the physical disk being replaced contains ASM disks from multiple disk groups. One Oracle ASM instance can run one rebalance operation at a time. If all Oracle ASM instances are busy, then rebalance operations will be queued.






++++++++++++++++++++++++++++++++++++++++++++++++++++
Replacing Failed disk in Exadata 
++++++++++++++++++++++++++++++++++++++++++++++++++++

Hard Disks on the cell node are automatically dropped at the time of damage, and the disks of the asm disk groups will also be deleted. 
After the disk drop operation and after the disk changes, rebalance process will automatically start in ASM disk groups.

 
1) 
CellCLI> List alerthistory 
CellCLI> LIST PHYSICALDISK WHERE diskType=HardDisk AND status like failed DETAIL



2) This query should return no rows indicating the disk is no longer in the ASM diskgroup configuration.    If this returns any other value, then contact the SR owner for further guidance.

SQL> select group_number,path,header_status,mount_status,mode_status,name from V$ASM_DISK where path like 'Í_05_abcdecel02';



3) The ALTER PHYSICALDISK command sets the Service Action Required LED on or off.

CellCLI> ALTER PHYSICALDISK n:n DROP FOR REPLACEMENT
CellCLI> alter physicaldisk 20:11 drop for replacement




4)   If Drop is Hung  in Step 3  proceed with Step 4  

CellCLI> alter griddisk <griddisk_name_from_alertlog> flush nowait;
NOTE: Wait for sometimes and monitor the cell's alert.log file.

Check following too to see if flushing dirty blocks from flash cache to specific grid disk(s) cached is decreasing :
CellCLI> list metriccurrent attributes name,metricvalue where name like 'FC_BY_DIRTY';
NOTE: Need to wait and monitor the cell's alert.log file for flush completed.
 
CellCLI> alter griddisk <griddisk_name_from_alertlog> cancel flush;




5) Replace damage disk 

CellCLI> list griddisk attributes name,status,asmmodestatus,asmdeactivationoutcome

NOTE: If it is showing this ->> "normal - dropped for replacement", then proceed to replace the disk.

alter physicaldisk 28:5 serviceled off




 
6)  Disk addition   back  to Asm 

If the new griddisks were not automatically added back into the ASM diskgroup configuration, then locate the disks with group_number=0, and add them back in manually using "alter diskgroup <name> add disk <path> re-balance power 10;" command:

If replacement time crosses ASM FAST MIRROR RESYNC period 

SQL> select path,header_status from v$asm_disk where group_number=0;
PATH                                               HEADER_STATU
-------------------------------------------------- ------------
o/192.168.9.10/DBFS_DG_CD_05_abcdcel02        FORMER
o/192.168.9.10/DATA_Q1_CD_05_abcdcel02        FORMER
o/192.168.9.10/RECO_Q1_CD_05_abcdcel02        FORMER

SQL> alter diskgroup dbfs_dg add disk 'o/192.168.9.10/DBFS_DG_CD_05_abcdcel02' rebalance power 10;
SQL> alter diskgroup data_q1 add disk 'o/192.168.9.10/DATA_Q1_CD_05_abcdcel02' rebalance power 10;
SQL> alter diskgroup reco_q1 add disk 'o/192.168.9.10/RECO_Q1_CD_05_abcdcel02' rebalance power 10;







++++++++++++++++++++++++++++++++++++++++++++++++++++
Below 2 sql was used for  checking disk statistics 
++++++++++++++++++++++++++++++++++++++++++++++++++++

COL cv_cellname       HEAD CELLNAME         FOR A20
COL cv_cellversion    HEAD CELLSRV_VERSION  FOR A20
COL cv_flashcachemode HEAD FLASH_CACHE_MODE FOR A20
PROMPT Show Exadata cell versions from V$CELL_CONFIG....
SELECT 
    disktype
  , cv_cellname
  , status
  , ROUND(SUM(physicalsize/1024/1024/1024)) total_gb
  , ROUND(AVG(physicalsize/1024/1024/1024)) avg_gb
  , COUNT(*) num_disks
  , SUM(CASE WHEN predfailStatus  = 'TRUE' THEN 1 END) predfail
  , SUM(CASE WHEN poorPerfStatus  = 'TRUE' THEN 1 END) poorperf
  , SUM(CASE WHEN wtCachingStatus = 'TRUE' THEN 1 END) wtcacheprob
  , SUM(CASE WHEN peerFailStatus  = 'TRUE' THEN 1 END) peerfail
  , SUM(CASE WHEN criticalStatus  = 'TRUE' THEN 1 END) critical
FROM (
    SELECT /*+ NO_MERGE */
        c.cellname cv_cellname
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/name/text()')                          AS VARCHAR2(20)) diskname
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/diskType/text()')                      AS VARCHAR2(20)) diskType          
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/luns/text()')                          AS VARCHAR2(20)) luns              
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/makeModel/text()')                     AS VARCHAR2(50)) makeModel         
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/physicalFirmware/text()')              AS VARCHAR2(20)) physicalFirmware  
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/physicalInsertTime/text()')            AS VARCHAR2(30)) physicalInsertTime
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/physicalSerial/text()')                AS VARCHAR2(20)) physicalSerial    
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/physicalSize/text()')                  AS VARCHAR2(20)) physicalSize      
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/slotNumber/text()')                    AS VARCHAR2(30)) slotNumber        
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/status/text()')                        AS VARCHAR2(20)) status            
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/id/text()')                            AS VARCHAR2(20)) id                
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/key_500/text()')                       AS VARCHAR2(20)) key_500           
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/predfailStatus/text()')                AS VARCHAR2(20)) predfailStatus    
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/poorPerfStatus/text()')                AS VARCHAR2(20)) poorPerfStatus    
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/wtCachingStatus/text()')               AS VARCHAR2(20)) wtCachingStatus   
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/peerFailStatus/text()')                AS VARCHAR2(20)) peerFailStatus    
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/criticalStatus/text()')                AS VARCHAR2(20)) criticalStatus    
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/errCmdTimeoutCount/text()')            AS VARCHAR2(20)) errCmdTimeoutCount
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/errHardReadCount/text()')              AS VARCHAR2(20)) errHardReadCount  
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/errHardWriteCount/text()')             AS VARCHAR2(20)) errHardWriteCount 
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/errMediaCount/text()')                 AS VARCHAR2(20)) errMediaCount     
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/errOtherCount/text()')                 AS VARCHAR2(20)) errOtherCount     
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/errSeekCount/text()')                  AS VARCHAR2(20)) errSeekCount      
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/sectorRemapCount/text()')              AS VARCHAR2(20)) sectorRemapCount  
    FROM
        v$cell_config c
      , TABLE(XMLSEQUENCE(EXTRACT(XMLTYPE(c.confval), '/cli-output/physicaldisk'))) v   
    WHERE 
        c.conftype = 'PHYSICALDISKS'
)
GROUP BY
    cv_cellname
  , disktype
  , status
ORDER BY
    disktype
  , cv_cellname
/




COL cellname            HEAD CELLNAME       FOR A20
COL celldisk_name       HEAD CELLDISK       FOR A30
COL physdisk_name       HEAD PHYSDISK       FOR A30
COL griddisk_name       HEAD GRIDDISK       FOR A30
COL asmdisk_name        HEAD ASMDISK        FOR A30
BREAK ON asm_diskgroup SKIP 1 ON asm_disk
PROMPT Showing Exadata disk topology from V$ASM_DISK and V$CELL_CONFIG....
WITH
  pd AS (
    SELECT /*+ MATERIALIZE */
        c.cellname
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/name/text()')                          AS VARCHAR2(100)) name
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/diskType/text()')                      AS VARCHAR2(100)) diskType          
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/luns/text()')                          AS VARCHAR2(100)) luns              
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/makeModel/text()')                     AS VARCHAR2(100)) makeModel         
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/physicalFirmware/text()')              AS VARCHAR2(100)) physicalFirmware  
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/physicalInsertTime/text()')            AS VARCHAR2(100)) physicalInsertTime
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/physicalSerial/text()')                AS VARCHAR2(100)) physicalSerial    
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/physicalSize/text()')                  AS VARCHAR2(100)) physicalSize      
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/slotNumber/text()')                    AS VARCHAR2(100)) slotNumber        
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/status/text()')                        AS VARCHAR2(100)) status            
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/id/text()')                            AS VARCHAR2(100)) id                
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/key_500/text()')                       AS VARCHAR2(100)) key_500           
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/predfailStatus/text()')                AS VARCHAR2(100)) predfailStatus    
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/poorPerfStatus/text()')                AS VARCHAR2(100)) poorPerfStatus    
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/wtCachingStatus/text()')               AS VARCHAR2(100)) wtCachingStatus   
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/peerFailStatus/text()')                AS VARCHAR2(100)) peerFailStatus    
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/criticalStatus/text()')                AS VARCHAR2(100)) criticalStatus    
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/errCmdTimeoutCount/text()')            AS VARCHAR2(100)) errCmdTimeoutCount
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/errHardReadCount/text()')              AS VARCHAR2(100)) errHardReadCount  
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/errHardWriteCount/text()')             AS VARCHAR2(100)) errHardWriteCount 
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/errMediaCount/text()')                 AS VARCHAR2(100)) errMediaCount     
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/errOtherCount/text()')                 AS VARCHAR2(100)) errOtherCount     
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/errSeekCount/text()')                  AS VARCHAR2(100)) errSeekCount      
      , CAST(EXTRACTVALUE(VALUE(v), '/physicaldisk/sectorRemapCount/text()')              AS VARCHAR2(100)) sectorRemapCount  
    FROM
        v$cell_config c
      , TABLE(XMLSEQUENCE(EXTRACT(XMLTYPE(c.confval), '/cli-output/physicaldisk'))) v  -- gv$ isn't needed, all cells should be visible in all instances
    WHERE 
        c.conftype = 'PHYSICALDISKS'
),
 cd AS (
    SELECT /*+ MATERIALIZE */
        c.cellname 
      , CAST(EXTRACTVALUE(VALUE(v), '/celldisk/name/text()')                              AS VARCHAR2(100)) name
      , CAST(EXTRACTVALUE(VALUE(v), '/celldisk/comment        /text()')                   AS VARCHAR2(100)) disk_comment
      , CAST(EXTRACTVALUE(VALUE(v), '/celldisk/creationTime   /text()')                   AS VARCHAR2(100)) creationTime
      , CAST(EXTRACTVALUE(VALUE(v), '/celldisk/deviceName     /text()')                   AS VARCHAR2(100)) deviceName
      , CAST(EXTRACTVALUE(VALUE(v), '/celldisk/devicePartition/text()')                   AS VARCHAR2(100)) devicePartition
      , CAST(EXTRACTVALUE(VALUE(v), '/celldisk/diskType       /text()')                   AS VARCHAR2(100)) diskType
      , CAST(EXTRACTVALUE(VALUE(v), '/celldisk/errorCount     /text()')                   AS VARCHAR2(100)) errorCount
      , CAST(EXTRACTVALUE(VALUE(v), '/celldisk/freeSpace      /text()')                   AS VARCHAR2(100)) freeSpace
      , CAST(EXTRACTVALUE(VALUE(v), '/celldisk/id             /text()')                   AS VARCHAR2(100)) id
      , CAST(EXTRACTVALUE(VALUE(v), '/celldisk/interleaving   /text()')                   AS VARCHAR2(100)) interleaving
      , CAST(EXTRACTVALUE(VALUE(v), '/celldisk/lun            /text()')                   AS VARCHAR2(100)) lun
      , CAST(EXTRACTVALUE(VALUE(v), '/celldisk/physicalDisk   /text()')                   AS VARCHAR2(100)) physicalDisk
      , CAST(EXTRACTVALUE(VALUE(v), '/celldisk/size           /text()')                   AS VARCHAR2(100)) disk_size
      , CAST(EXTRACTVALUE(VALUE(v), '/celldisk/status         /text()')                   AS VARCHAR2(100)) status
    FROM
        v$cell_config c
      , TABLE(XMLSEQUENCE(EXTRACT(XMLTYPE(c.confval), '/cli-output/celldisk'))) v  -- gv$ isn't needed, all cells should be visible in all instances
    WHERE 
        c.conftype = 'CELLDISKS'
),
 gd AS (
    SELECT /*+ MATERIALIZE */
        c.cellname 
      , CAST(EXTRACTVALUE(VALUE(v), '/griddisk/name/text()')                               AS VARCHAR2(100)) name
      , CAST(EXTRACTVALUE(VALUE(v), '/griddisk/asmDiskgroupName/text()')                   AS VARCHAR2(100)) asmDiskgroupName 
      , CAST(EXTRACTVALUE(VALUE(v), '/griddisk/asmDiskName     /text()')                   AS VARCHAR2(100)) asmDiskName
      , CAST(EXTRACTVALUE(VALUE(v), '/griddisk/asmFailGroupName/text()')                   AS VARCHAR2(100)) asmFailGroupName
      , CAST(EXTRACTVALUE(VALUE(v), '/griddisk/availableTo     /text()')                   AS VARCHAR2(100)) availableTo
      , CAST(EXTRACTVALUE(VALUE(v), '/griddisk/cachingPolicy   /text()')                   AS VARCHAR2(100)) cachingPolicy
      , CAST(EXTRACTVALUE(VALUE(v), '/griddisk/cellDisk        /text()')                   AS VARCHAR2(100)) cellDisk
      , CAST(EXTRACTVALUE(VALUE(v), '/griddisk/comment         /text()')                   AS VARCHAR2(100)) disk_comment
      , CAST(EXTRACTVALUE(VALUE(v), '/griddisk/creationTime    /text()')                   AS VARCHAR2(100)) creationTime
      , CAST(EXTRACTVALUE(VALUE(v), '/griddisk/diskType        /text()')                   AS VARCHAR2(100)) diskType
      , CAST(EXTRACTVALUE(VALUE(v), '/griddisk/errorCount      /text()')                   AS VARCHAR2(100)) errorCount
      , CAST(EXTRACTVALUE(VALUE(v), '/griddisk/id              /text()')                   AS VARCHAR2(100)) id
      , CAST(EXTRACTVALUE(VALUE(v), '/griddisk/offset          /text()')                   AS VARCHAR2(100)) offset
      , CAST(EXTRACTVALUE(VALUE(v), '/griddisk/size            /text()')                   AS VARCHAR2(100)) disk_size
      , CAST(EXTRACTVALUE(VALUE(v), '/griddisk/status          /text()')                   AS VARCHAR2(100)) status
    FROM
        v$cell_config c
      , TABLE(XMLSEQUENCE(EXTRACT(XMLTYPE(c.confval), '/cli-output/griddisk'))) v  -- gv$ isn't needed, all cells should be visible in all instances
    WHERE 
        c.conftype = 'GRIDDISKS'
),
 lun AS (
    SELECT /*+ MATERIALIZE */
        c.cellname 
      , CAST(EXTRACTVALUE(VALUE(v), '/lun/cellDisk         /text()')              AS VARCHAR2(100)) cellDisk      
      , CAST(EXTRACTVALUE(VALUE(v), '/lun/deviceName       /text()')              AS VARCHAR2(100)) deviceName    
      , CAST(EXTRACTVALUE(VALUE(v), '/lun/diskType         /text()')              AS VARCHAR2(100)) diskType      
      , CAST(EXTRACTVALUE(VALUE(v), '/lun/id               /text()')              AS VARCHAR2(100)) id            
      , CAST(EXTRACTVALUE(VALUE(v), '/lun/isSystemLun      /text()')              AS VARCHAR2(100)) isSystemLun   
      , CAST(EXTRACTVALUE(VALUE(v), '/lun/lunAutoCreate    /text()')              AS VARCHAR2(100)) lunAutoCreate 
      , CAST(EXTRACTVALUE(VALUE(v), '/lun/lunSize          /text()')              AS VARCHAR2(100)) lunSize       
      , CAST(EXTRACTVALUE(VALUE(v), '/lun/physicalDrives   /text()')              AS VARCHAR2(100)) physicalDrives
      , CAST(EXTRACTVALUE(VALUE(v), '/lun/raidLevel        /text()')              AS VARCHAR2(100)) raidLevel
      , CAST(EXTRACTVALUE(VALUE(v), '/lun/lunWriteCacheMode/text()')              AS VARCHAR2(100)) lunWriteCacheMode
      , CAST(EXTRACTVALUE(VALUE(v), '/lun/status           /text()')              AS VARCHAR2(100)) status        
    FROM
        v$cell_config c
      , TABLE(XMLSEQUENCE(EXTRACT(XMLTYPE(c.confval), '/cli-output/lun'))) v  -- gv$ isn't needed, all cells should be visible in all instances
    WHERE 
        c.conftype = 'LUNS'
)
 , ad  AS (SELECT /*+ MATERIALIZE */ * FROM v$asm_disk)
 , adg AS (SELECT /*+ MATERIALIZE */ * FROM v$asm_diskgroup)
SELECT 
    adg.name                        asm_diskgroup
  , ad.name                         asm_disk
  , gd.name                         griddisk_name
  , cd.name                         celldisk_name
  , pd.cellname
  , SUBSTR(cd.devicepartition,1,20) cd_devicepart
  , pd.name                         physdisk_name
  , SUBSTR(pd.status,1,20)          physdisk_status
  , lun.lunWriteCacheMode
--  , SUBSTR(cd.devicename,1,20)      cd_devicename
--  , SUBSTR(lun.devicename,1,20)     lun_devicename
--    disktype
FROM
    gd
  , cd
  , pd
  , lun
  , ad
  , adg
WHERE
    ad.group_number = adg.group_number (+)
AND gd.asmdiskname = ad.name (+)
AND cd.name = gd.cellDisk (+)
AND pd.id = cd.physicalDisk (+)
AND cd.name = lun.celldisk (+)
--GROUP BY
--    cellname
--  , disktype
--  , status
ORDER BY
--    disktype
    asm_diskgroup
  , asm_disk
  , griddisk_name
  , celldisk_name
  , physdisk_name
  , cellname
/






++++++++++++++++++++++++++++++++++++++++++++++++++++
References : 
++++++++++++++++++++++++++++++++++++++++++++++++++++
Oracle Exadata Diagnostic Information required for Disk Failures and some other Hardware issues (Doc ID 761868.1)

Things to Check in ASM When Replacing an ONLINE disk from Exadata Storage Cell (Doc ID 1326611.1)

Exadata ALTER PHYSICALDISK N:N DROP FOR REPLACEMENT is hung (Doc ID 2574663.1)

How to Replace a Hard Drive in an Exadata Storage Server (Predictive Failure) (Doc ID 1390836.1)

How to Replace a Hard Drive in an Exadata Storage Server (Hard Failure) (Doc ID 1386147.1)

Drop Online Disk: Things to Check in ASM When Replacing an ONLINE disk from Exadata Storage Cell (Doc ID 1326611.1)

Understanding ASM Capacity and Reservation of Free Space in Exadata (Doc ID 1551288.1)

Flash disk poor performance status (Doc ID 1206015.1)

Steps to shut down or reboot an Exadata storage cell without affecting ASM (Doc ID 1188080.1)

Understanding ASM Capacity and Reservation of Free Space in Exadata (Doc ID 1551288.1)

1 comment: