Wednesday, January 19, 2022

Oracle Exadata / Exacc Cell Server Replacement without downtime



We most of time come accross requirement to replace Cell servers / shutdown Cell Server .  While replacing Cell or powering off cell we need to 
concentrate  on  3 things to be specific 

1) Checking REQUIRED_MIRROR_FREE_MB and   USABLE_FILE_MB .
2) Setting Right disk_repair_time for using ASM FAST MIRROR RESYNC  
3) Checking ASMDeactivationOutcome  .


ASMDeactivationOutcome  and   disk_repair_time  is  well documented in Metalink ID 1188080.1 .   In This Blog we will try to cover Checking REQUIRED_MIRROR_FREE_MB and   USABLE_FILE_MB .


If a physicaldisk fails, if the disk itself has hardware failure, all griddisks on that physicaldisk will be immediately DROPPED with FORCE option from ASM. Which is called Pro-Active Disk Quarantine ASM will not wait DISK_REPAIR_TIME to drop disks in this case. 




REQUIRED_MIRROR_FREE_MB and   USABLE_FILE_MB 

What most of us have doubt is how much free space we need to ensure  there is  no outage  when cell  is powered off .
Each cell server is a failure group

The ASM calculates the USABLE_FILE_MB using the following formula:
USABLE_FILE_MB = (FREE_MB - REQUIRED_MIRROR_FREE_MB) / 2

In Exadata with ASM version 12cR1, the REQUIRED_MIRROR_FREE_MB is reported as the size of the largest disk [2] in the disk group.



TOTAL_MB:- Refers to total capacity of the diskgroup
FREE_MB :- Refers to raw free space available in diskgroup in MB.

FREE_MB = (TOTAL_MB – (HOT_USED_MB + COLD_USED_MB))

REQUIRED_MIRROR_FREE_MB :- Indicates how much free space is required in an ASM disk group to restore redundancy after the failure of an ASM disk or ASM failure group.In exadata it is the disk capacity of one failure group.

USABLE_FILE_MB :- Indicates how much space is available in an ASM disk group considering the redundancy level of the disk group.

Its calculated as :-

USABLE_FILE_MB=(FREE_MB – REQUIRED_MIRROR_FREE_MB ) / 2 –> For Normal Redundancy
USABLE_FILE_MB=(FREE_MB – REQUIRED_MIRROR_FREE_MB ) / 3 –> For High Redundancy


 
1) Run this query for all the DG for Checking REQUIRED_MIRROR_FREE_MB and   USABLE_FILE_MB .   Formula used is  Total Allocatable Size/Redundancy

select sum(total_mb)/3 from v$asm_disk where lower(failgroup)='cel02.cn.db.com  and lower(name) like '%data%'; 


select FAILGROUP, count(NAME) "Disks", sum(TOTAL_MB) "MB"
from v$asm_disk_stat
where GROUP_NUMBER=2
group by FAILGROUP
order by 3;


select NAME, TOTAL_MB, FREE_MB, REQUIRED_MIRROR_FREE_MB, USABLE_FILE_MB
from v$asm_diskgroup_stat
where GROUP_NUMBER=2;




This is Old Copy of Oracle provided  Script i Had : 

SET SERVEROUTPUT ON

DECLARE
   v_num_disks    NUMBER;
   v_group_number   NUMBER;
   v_max_total_mb   NUMBER;

   v_required_free_mb   NUMBER;
   v_usable_mb      NUMBER;
   v_cell_usable_mb   NUMBER;
   v_one_cell_usable_mb   NUMBER;   
   v_enuf_free      BOOLEAN := FALSE;
   v_enuf_free_cell   BOOLEAN := FALSE;
   
   v_req_mirror_free_adj_factor   NUMBER := 1.10;
   v_req_mirror_free_adj         NUMBER := 0;
   v_one_cell_req_mir_free_mb     NUMBER  := 0;
   
   v_disk_desc      VARCHAR(10) := 'SINGLE';
   v_offset      NUMBER := 50;
   
   v_db_version   VARCHAR2(8);
   v_inst_name    VARCHAR2(1);
   
BEGIN

   SELECT substr(version,1,8), substr(instance_name,1,1)    INTO v_db_version, v_inst_name    FROM v$instance;
   
   IF v_inst_name <> '+' THEN
      DBMS_OUTPUT.PUT_LINE('ERROR: THIS IS NOT AN ASM INSTANCE!  PLEASE LOG ON TO AN ASM INSTANCE AND RE-RUN THIS SCRIPT.');
      GOTO the_end;
   END IF;

    DBMS_OUTPUT.PUT_LINE('------ DISK and CELL Failure Diskgroup Space Reserve Requirements  ------');
    DBMS_OUTPUT.PUT_LINE(' This procedure determines how much space you need to survive a DISK or CELL failure. It also shows the usable space ');
    DBMS_OUTPUT.PUT_LINE(' available when reserving space for disk or cell failure.  ');
   DBMS_OUTPUT.PUT_LINE(' Please see MOS note 1551288.1 for more information.  ');
   DBMS_OUTPUT.PUT_LINE('.  .  .');      
    DBMS_OUTPUT.PUT_LINE(' Description of Derived Values:');
    DBMS_OUTPUT.PUT_LINE(' One Cell Required Mirror Free MB : Required Mirror Free MB to permit successful rebalance after losing largest CELL regardless of redundancy type');
    DBMS_OUTPUT.PUT_LINE(' Disk Required Mirror Free MB     : Space needed to rebalance after loss of single or double disk failure (for normal or high redundancy)');
    DBMS_OUTPUT.PUT_LINE(' Disk Usable File MB              : Usable space available after reserving space for disk failure and accounting for mirroring');
    DBMS_OUTPUT.PUT_LINE(' Cell Usable File MB              : Usable space available after reserving space for SINGLE cell failure and accounting for mirroring');
   DBMS_OUTPUT.PUT_LINE('.  .  .');   

   IF v_db_version = '11.2.0.3' THEN
      v_req_mirror_free_adj_factor := 1.10;
      DBMS_OUTPUT.PUT_LINE('ASM Version: 11.2.0.3');
   ELSE
      v_req_mirror_free_adj_factor := 1.5;
      DBMS_OUTPUT.PUT_LINE('ASM Version: '||v_db_version||'  - WARNING DISK FAILURE COVERAGE ESTIMATES HAVE NOT BEEN VERIFIED ON THIS VERSION!');   
   END IF;
   
   DBMS_OUTPUT.PUT_LINE('.  .  .');      
      
   FOR dg IN (SELECT name, type, group_number, total_mb, free_mb, required_mirror_free_mb FROM v$asm_diskgroup ORDER BY name) LOOP

      v_enuf_free := FALSE;
     
     v_req_mirror_free_adj := dg.required_mirror_free_mb * v_req_mirror_free_adj_factor;
      
      -- Find largest amount of space allocated to a cell   
      SELECT sum(disk_cnt), max(max_total_mb), max(sum_total_mb)*v_req_mirror_free_adj_factor
     INTO v_num_disks, v_max_total_mb, v_one_cell_req_mir_free_mb
      FROM (SELECT count(1) disk_cnt, max(total_mb) max_total_mb, sum(total_mb) sum_total_mb 
      FROM v$asm_disk 
     WHERE group_number = dg.group_number 
     GROUP BY failgroup);     
     
      -- Eighth Rack
      IF dg.type = 'NORMAL' THEN
      
         -- Eighth Rack
         IF (v_num_disks < 36) THEN
            -- Use eqn: y = 1.21344 x+ 17429.8
            v_required_free_mb :=  1.21344 * v_max_total_mb + 17429.8;
            IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;
         
         -- Quarter Rack
         ELSIF (v_num_disks >= 36 AND v_num_disks < 84) THEN 
            -- Use eqn: y = 1.07687 x+ 19699.3
            v_required_free_mb := 1.07687 * v_max_total_mb + 19699.3;
            IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;
         
         -- Half Rack
         ELSIF (v_num_disks >= 84 AND v_num_disks < 168) THEN 
            -- Use eqn: y = 1.02475 x+53731.3
            v_required_free_mb := 1.02475 * v_max_total_mb + 53731.3;
            IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;

         -- Full rack is most conservative, it will be default
         ELSE
            -- Use eqn: y = 1.33333 x+83220.
            v_required_free_mb := 1.33333 * v_max_total_mb + 83220;
            IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;      
         
         END IF;
         
         -- DISK usable file MB
         v_usable_mb := ROUND((dg.free_mb - v_required_free_mb)/2);
         v_disk_desc := 'ONE disk';
         
         -- CELL usable file MB
         v_cell_usable_mb := ROUND( (dg.free_mb - v_one_cell_req_mir_free_mb)/2 );
         v_one_cell_usable_mb := v_cell_usable_mb;
       
      ELSE
         -- HIGH redundancy
         
         -- Eighth Rack
         IF (v_num_disks <= 18) THEN
            -- Use eqn: y = 4x + 0
            v_required_free_mb :=  4.0 * v_max_total_mb;
            IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;
         
         -- Quarter Rack
         ELSIF (v_num_disks > 18 AND v_num_disks <= 36) THEN 
            -- Use eqn: y = 3.87356 x+417692.
            v_required_free_mb := 3.87356 * v_max_total_mb + 417692;
            IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;
         
         -- Half Rack
         ELSIF (v_num_disks > 36 AND v_num_disks <= 84) THEN 
            -- Use eqn: y = 2.02222 x+56441.6
            v_required_free_mb := 2.02222 * v_max_total_mb + 56441.6;
            IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;

         -- Full rack is most conservative, it will be default
         ELSE
            -- Use eqn: y = 2.14077 x+54276.4
            v_required_free_mb := 2.14077 * v_max_total_mb + 54276.4;
            IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;      
         
         END IF;
      
         -- DISK usable file MB
         v_usable_mb := ROUND((dg.free_mb - v_required_free_mb)/3);      
         v_disk_desc := 'TWO disks';   
         
         -- CELL usable file MB
         v_one_cell_usable_mb := ROUND( (dg.free_mb - v_one_cell_req_mir_free_mb)/3 );
       
      END IF;
      
      DBMS_OUTPUT.PUT_LINE('-------------------------------------------------------------------------');
      DBMS_OUTPUT.PUT_LINE('DG Name: '||LPAD(dg.name,v_offset-9));
      DBMS_OUTPUT.PUT_LINE('DG Type: '||LPAD(dg.type,v_offset-9));
      DBMS_OUTPUT.PUT_LINE('Num Disks: '||LPAD(TO_CHAR(v_num_disks),v_offset-11));
      DBMS_OUTPUT.PUT_LINE('Disk Size MB: '||LPAD(TO_CHAR(v_max_total_mb,'999,999,999,999'),v_offset-14));  
      DBMS_OUTPUT.PUT_LINE('.  .  .');
      DBMS_OUTPUT.PUT_LINE('DG Total MB: '||LPAD(TO_CHAR(dg.total_mb,'999,999,999,999'),v_offset-13));
      DBMS_OUTPUT.PUT_LINE('DG Used MB: '||LPAD(TO_CHAR(dg.total_mb - dg.free_mb,'999,999,999,999'),v_offset-12));
      DBMS_OUTPUT.PUT_LINE('DG Free MB: '||LPAD(TO_CHAR(dg.free_mb,'999,999,999,999'),v_offset-12));
      DBMS_OUTPUT.PUT_LINE('.  .  .');     
      DBMS_OUTPUT.PUT_LINE('One Cell Required Mirror Free MB: '||LPAD(TO_CHAR(ROUND(v_one_cell_req_mir_free_mb),'999,999,999,999'),v_offset-34));
      DBMS_OUTPUT.PUT_LINE('.  .  .');          
      DBMS_OUTPUT.PUT_LINE('Disk Required Mirror Free MB: '||LPAD(TO_CHAR(ROUND(v_required_free_mb),'999,999,999,999'),v_offset-30));
      DBMS_OUTPUT.PUT_LINE('.  .  .');          
      DBMS_OUTPUT.PUT_LINE('Disk Usable File MB: '||LPAD(TO_CHAR(ROUND(v_usable_mb),'999,999,999,999'),v_offset-21));   
      DBMS_OUTPUT.PUT_LINE('Cell Usable File MB: '||LPAD(TO_CHAR(ROUND(v_one_cell_usable_mb),'999,999,999,999'),v_offset-21));   
      DBMS_OUTPUT.PUT_LINE('.  .  .');
      
      IF v_enuf_free THEN 
         DBMS_OUTPUT.PUT_LINE('Enough Free Space to Rebalance after loss of '||v_disk_desc||': PASS');
      ELSE
         DBMS_OUTPUT.PUT_LINE('Enough Free Space to Rebalance after loss of '||v_disk_desc||': FAIL');
      END IF;   

     IF dg.type = 'NORMAL' THEN
        -- Calc Free Space for Rebalance Due to Cell Failure
        IF v_req_mirror_free_adj < dg.free_mb THEN 
          DBMS_OUTPUT.PUT_LINE('Enough Free Space to Rebalance after loss of ONE cell: PASS');
        ELSE
          DBMS_OUTPUT.PUT_LINE('Enough Free Space to Rebalance after loss of ONE cell: WARNING (cell failure is very rare)');
        END IF;         
   ELSE
        -- Calc Free Space for Rebalance Due to Single Cell Failure
        IF v_one_cell_req_mir_free_mb < dg.free_mb THEN 
          DBMS_OUTPUT.PUT_LINE('Enough Free Space to Rebalance after loss of ONE cell: PASS');
        ELSE
          DBMS_OUTPUT.PUT_LINE('Enough Free Space to Rebalance after loss of ONE cell: WARNING (cell failure is very rare and high redundancy offers ample protection already)');
        END IF;
        
   END IF;
      
   END LOOP;
   
   <<the_end>>
   DBMS_OUTPUT.PUT_LINE('.  .  .');  
   DBMS_OUTPUT.PUT_LINE('Script completed.');
   
END;
/





This is Latest Copy of  Oracle Provided Scipt in Doc 1551288.1 as of today  . Have posted below same

SET SERVEROUTPUT ON
SET LINES 155
SET PAGES 0
SET TRIMSPOOL ON

DECLARE
   v_space_reserve_factor NUMBER := 0.15;
   v_num_disks    NUMBER;
   v_group_number   NUMBER;
   v_max_total_mb   NUMBER;
   v_max_used_mb NUMBER;
   v_fg_count   NUMBER;

   v_required_free_mb   NUMBER;
   v_usable_mb      NUMBER;
   v_cell_usable_mb   NUMBER;
   v_one_cell_usable_mb   NUMBER;
   v_enuf_free      BOOLEAN := FALSE;
   v_enuf_free_cell   BOOLEAN := FALSE;

   v_req_mirror_free_adj_factor   NUMBER := 1.10;
   v_req_mirror_free_adj         NUMBER := 0;
   v_one_cell_req_mir_free_mb     NUMBER  := 0;

   v_disk_desc      VARCHAR(10) := 'SINGLE';
   v_offset      NUMBER := 50;

   v_db_version   VARCHAR2(8);
   v_inst_name    VARCHAR2(1);

   v_dg_pct_msg   VARCHAR2(500);
   v_cfc_fail_msg VARCHAR2(500);

BEGIN

   SELECT substr(version,1,8), substr(instance_name,1,1)    INTO v_db_version, v_inst_name    FROM v$instance;

   IF v_inst_name <> '+' THEN
      DBMS_OUTPUT.PUT_LINE('ERROR: THIS IS NOT AN ASM INSTANCE!  PLEASE LOG ON TO AN ASM INSTANCE AND RE-RUN THIS SCRIPT.');
      GOTO the_end;
   END IF;

    DBMS_OUTPUT.PUT_LINE('------ DISK and CELL Failure Diskgroup Space Reserve Requirements  ------');
    DBMS_OUTPUT.PUT_LINE(' This procedure determines how much space you need to survive a DISK or CELL failure. It also shows the usable space ');
    DBMS_OUTPUT.PUT_LINE(' available when reserving space for disk or cell failure (loss of cell is rare and not usually a concern).  ');
    DBMS_OUTPUT.PUT_LINE(' These required mirror and usable space assume space utilized to full capacity - a worst case condition.');
    DBMS_OUTPUT.PUT_LINE(' Please see MOS note 1551288.1 for more information.  ');
    DBMS_OUTPUT.PUT_LINE('.  .  .');
    DBMS_OUTPUT.PUT_LINE(' Description of Derived Values:');
    DBMS_OUTPUT.PUT_LINE(' Recommended Reserve MB           : Space needed to rebalance after loss of single or double disk failure (for normal or high redundancy)');
    DBMS_OUTPUT.PUT_LINE(' Disk Usable File MB              : Usable space available after reserving space for disk failure and accounting for mirroring');
    DBMS_OUTPUT.PUT_LINE(' PCT Util                         : Percent of Total Diskgroup Space Utilized');
    DBMS_OUTPUT.PUT_LINE(' DFC                              : Disk Failure Coverage Check (PASS = able to rebalance after loss of single disk)');
   DBMS_OUTPUT.PUT_LINE('.  .  .');

   DBMS_OUTPUT.PUT_LINE('ASM Version is '||v_db_version);


-- Set up headings
      DBMS_OUTPUT.PUT_LINE('-------------------------------------------------------------------------------------------------------------------------------------------------');
      DBMS_OUTPUT.PUT('|          ');
      DBMS_OUTPUT.PUT('|         ');
      DBMS_OUTPUT.PUT('|     ');
      DBMS_OUTPUT.PUT('|     ');
      DBMS_OUTPUT.PUT('|            ');
      DBMS_OUTPUT.PUT('|                ');
      DBMS_OUTPUT.PUT('|                ');
      DBMS_OUTPUT.PUT('|                ');
      DBMS_OUTPUT.PUT('|Recommended     ');
      DBMS_OUTPUT.PUT('|                ');
      DBMS_OUTPUT.PUT('|       |');    
      DBMS_OUTPUT.PUT_LINE('    |');
      -- next row
      DBMS_OUTPUT.PUT('|          ');
      DBMS_OUTPUT.PUT('|DG       ');
      DBMS_OUTPUT.PUT('|Num  ');
      DBMS_OUTPUT.PUT('|Num  ');     
      DBMS_OUTPUT.PUT('|Disk Size   ');
      DBMS_OUTPUT.PUT('|DG Total        ');
      DBMS_OUTPUT.PUT('|DG Used         ');
      DBMS_OUTPUT.PUT('|DG Free         ');
      DBMS_OUTPUT.PUT('|Reserve         ');
      DBMS_OUTPUT.PUT('|Disk Usable     ');
      DBMS_OUTPUT.PUT('|PCT    |');
      DBMS_OUTPUT.PUT_LINE('    |');
      -- next row
      DBMS_OUTPUT.PUT('|DG Name   ');
      DBMS_OUTPUT.PUT('|Type     ');
      DBMS_OUTPUT.PUT('|FGs  ');
      DBMS_OUTPUT.PUT('|Disks');
      DBMS_OUTPUT.PUT('|MB          ');
      DBMS_OUTPUT.PUT('|MB              ');
      DBMS_OUTPUT.PUT('|MB              ');
      DBMS_OUTPUT.PUT('|MB              ');
      DBMS_OUTPUT.PUT('|MB              ');
      DBMS_OUTPUT.PUT('|File MB         ');
      DBMS_OUTPUT.PUT('|Util   ');
      DBMS_OUTPUT.PUT_LINE('|DFC |');
      DBMS_OUTPUT.PUT_LINE('-------------------------------------------------------------------------------------------------------------------------------------------------');

   FOR dg IN (SELECT name, type, group_number, total_mb, free_mb, required_mirror_free_mb FROM v$asm_diskgroup ORDER BY name) LOOP

      v_enuf_free := FALSE;

      -- Find largest amount of space allocated to a cell
      SELECT sum(disk_cnt), max(max_total_mb), max(sum_used_mb), count(distinct failgroup)
     INTO v_num_disks,v_max_total_mb, v_max_used_mb, v_fg_count
      FROM (SELECT failgroup, count(1) disk_cnt, max(total_mb) max_total_mb, sum(total_mb - free_mb) sum_used_mb
      FROM v$asm_disk
     WHERE group_number = dg.group_number and failgroup_type = 'REGULAR'
     GROUP BY failgroup);

   -- Amount to reserve depends on version and number of FGs
   IF  ((v_db_version like '12.2%') or (v_db_version like '18%') or  (v_db_version like '19%')) THEN
     IF v_fg_count < 5 THEN
     v_space_reserve_factor := 0.15 ;
     v_dg_pct_msg := v_dg_pct_msg||'Diskgroup '||dg.name||' using reserve factor of 15% '||chr(10);
     ELSE 
       v_space_reserve_factor := 0.09 ;
       v_dg_pct_msg := v_dg_pct_msg||'Diskgroup '||dg.name||' using reserve factor of 9% '||chr(10);
     END IF;
   ELSIF ( (v_db_version like '12.1%' ) or (v_db_version like '11.2.0.4%') ) THEN
       v_space_reserve_factor := 0.15 ;     
       v_dg_pct_msg := v_dg_pct_msg||'Diskgroup '||dg.name||' using reserve factor of 15% '||chr(10);
   ELSE 
       v_space_reserve_factor := 0.15 ;
       v_dg_pct_msg := v_dg_pct_msg||'Diskgroup '||dg.name||' using reserve factor of 15% '||chr(10);
   END IF;

   v_required_free_mb := v_space_reserve_factor * dg.total_mb;
   IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;

IF dg.type = 'NORMAL' THEN

-- DISK usable file MB
v_usable_mb := ROUND((dg.free_mb - v_required_free_mb)/2);

ELSIF dg.type = 'HIGH' THEN
-- HIGH redundancy
-- DISK usable file MB
v_usable_mb := ROUND((dg.free_mb - v_required_free_mb)/3);
 
ELSIF dg.type = 'EXTEND' THEN
-- EXTENDED redundancy for stretch clusters

-- DISK usable file MB
v_usable_mb := ROUND((dg.free_mb - v_required_free_mb)/4);

ELSE
-- We don't know this type...maybe FLEX DG - not enough info to say 
v_usable_mb := NULL;

END IF;
  
      DBMS_OUTPUT.PUT('|'||RPAD(dg.name,v_offset-40));
      DBMS_OUTPUT.PUT('|'||RPAD(nvl(dg.type,'  '),v_offset-41));
      DBMS_OUTPUT.PUT('|'||LPAD(TO_CHAR(v_fg_count),v_offset-45));
      DBMS_OUTPUT.PUT('|'||LPAD(TO_CHAR(v_num_disks),v_offset-45));
      DBMS_OUTPUT.PUT('|'||TO_CHAR(v_max_total_mb,'999,999,999'));
      DBMS_OUTPUT.PUT('|'||TO_CHAR(dg.total_mb,'999,999,999,999'));
      DBMS_OUTPUT.PUT('|'||TO_CHAR(dg.total_mb - dg.free_mb,'999,999,999,999'));
      DBMS_OUTPUT.PUT('|'||TO_CHAR(dg.free_mb,'999,999,999,999'));
      DBMS_OUTPUT.PUT('|'||TO_CHAR(ROUND(v_required_free_mb),'999,999,999,999'));
      DBMS_OUTPUT.PUT('|'||TO_CHAR(ROUND(v_usable_mb),'999,999,999,999'));

     -- Calc Disk Utilization Percentage
      IF dg.total_mb > 0 THEN
         DBMS_OUTPUT.PUT('|'||TO_CHAR((((dg.total_mb - dg.free_mb)/dg.total_mb)*100),'999.9')||CHR(37));
      ELSE
         DBMS_OUTPUT.PUT('|       ');
      END IF;

      IF v_enuf_free THEN
         DBMS_OUTPUT.PUT_LINE('|'||'PASS|');
      ELSE
         DBMS_OUTPUT.PUT_LINE('|'||'FAIL|');
      END IF;


   END LOOP;

     DBMS_OUTPUT.PUT_LINE('-------------------------------------------------------------------------------------------------------------------------------------------------');
   <<the_end>>

   DBMS_OUTPUT.PUT_LINE(v_dg_pct_msg);

   IF v_cfc_fail_msg is not null THEN
      DBMS_OUTPUT.PUT_LINE('Cell Failure Coverage Freespace Failures Detected. Warning Message Follows.');
      DBMS_OUTPUT.PUT_LINE(v_cfc_fail_msg);
   END IF;

   DBMS_OUTPUT.PUT_LINE('.  .  .');
   DBMS_OUTPUT.PUT_LINE('Script completed.');

END;
/

WHENEVER SQLERROR EXIT FAILURE;




Dropping Disk :

Once  we  have done our checks we can proceed to drop  disk from all diskgroups  assigned to  that  cell which is going for maintenance 


SQL> ALTER DISKGROUP   DATA01 drop DISKS IN FAILGROUP cell01 rebalance power 32 nowait;
SQL> ALTER DISKGROUP  RECO  drop DISKS IN FAILGROUP cell01 rebalance power 32 nowait;



Once disk is repaired we may re-add in all diskgroups 

How To Add Back An ASM Disk or Failgroup (Normal or High Redundancy) After A Transient Failure Occurred Or When The DISK_REPAIR_TIME Attribute Expired (10.1 to 12.1)? (Doc ID 946213.1)


SQL> ALTER DISKGROUP <DG_NAME> ADD FAILGROUP <FG_NAME> DISK '/dev/rdsk/c3t13xxxx' REBALANCE POWER <power number 1-11>;


 



Procedure to reboot or shut down a storage cell without affecting ASM


Once  you are confirmed from above script from dba side that  cell can be taken offline .  Sa will perform  below steps 


A) 
an OEM blackout covering ALL ASM instances and the affected storage cell should be raised prior to shutting down or rebooting a storage cell.



B) Check current ASM mode status of each grid disk

# cellcli -e list griddisk attributes name, asmmodestatus, asmdeactivationoutcome, errorCount, status

If one or more disks return asmdeactivationoutcome='No', you should wait for a minute and repeat above step. If all disks return asmdeactivationoutcome='Yes' proceed with the remaining steps. 



C) Make all grid disks inactive in ASM

# cellcli -e alter griddisk all inactive

Note : This action could take 10 minutes or longer depending on activity. It is very important to make sure you were able to offline all the disks successfully before shutting down the cell services. Inactivating the grid disks will automatically OFFLINE the disks in the ASM instance.
Sometimes “alter griddisk all” inactivates all grid disks, but some of them are still online, even after waiting up to 30 to 60 minutes. To prevent it from happening, specify all the griddisks instead of “all”, and inactivate them individually.

For example:
# cellcli -e alter griddisk SYSTEMDG_CD_02_dmorlcel11, SYSTEMDG_CD_03_dmorlcel11, SYSTEMDG_CD_04_dmorlcel11 inactive


If it does not work, you’ll have to set these disks offline from ASM, as the “oragrid” user. This needs to be done by a DBA. For example:
$ sqlplus / as sysasm
> alter diskgroup SYSTEMDG offline disk SYSTEMDG_CD_02_dmorlcel11

You probably only need to do it on one disk only, and the rest will be set offline at the same time.



D)  Confirm that the griddisks are now offline

# cellcli -e list griddisk attributes name, asmmodestatus, asmdeactivationoutcome, errorCount, status
The output should show asmmodestatus=UNUSED and asmdeactivationoutcome=Yes for all griddisks once the disks are offline in ASM


Check griddisks
# cellcli -e list griddisk


E) If all are offline, proceed with storage cell reboot or power down.
·         IMPORTANT: After cell reboot is complete, re-enable the grid disks in ASM
# cellcli -e alter griddisk all active




To replace a flash disk due to disk failure, perform the following procedure:


Doc ID 1113023.1


Step 1.  Run the following command to stop the cell services:

CellCLI> ALTER CELL SHUTDOWN SERVICES ALL

The preceding command will check if there are any disks that are offline, in predictive failure status or need to be resilvered. If Oracle ASM redundancy is intact, the command will take the griddisks offline in ASM and then proceed to stop the cell services. If the following error message is seen, it is not safe to stop the cell services because some disk group may be forced to dismount due to reduced redundancy.

If such an error is encountered, please restore the Oracle ASM disk group redundancy and retry the command when all disk status is back to normal.


Step 2. Shut down the cell.
Step 3. Replace the failed flash disk based on the PCI number and FDOM number.

Step 4. Power up the cell. The cell services will be started automatically. As part of the cell service startup, all grid disks will also be auto-onlined in Oracle ASM.
Step 5. Verify that all grid disks have been successfully brought online using the following command:

CellCLI> LIST GRIDDISK ATTRIBUTES NAME, ASMMODESTATUS




________________________
Reference : 
________________________

Steps to shut down or reboot an Exadata storage cell without affecting ASM (Doc ID 1188080.1)
Script to Calculate New Grid Disk and Disk Group Sizes in Exadata (Doc ID 1464809.1)
Understanding ASM Capacity and Reservation of Free Space in Exadata (Doc ID 1551288.1)

https://docs.oracle.com/cd/E80920_01/DBMMN/maintaining-exadata-storage-servers.htm#DBMMN-GUID-96881470-4D44-4AF6-8A29-33E3164CD244

No comments:

Post a Comment