We often come across a requirement to replace a cell server or shut one down. While replacing or powering off a cell, we need to
concentrate on three things specifically:
1) Checking REQUIRED_MIRROR_FREE_MB and USABLE_FILE_MB.
2) Setting the right disk_repair_time for using ASM Fast Mirror Resync.
3) Checking ASMDeactivationOutcome.
ASMDeactivationOutcome and disk_repair_time are well documented in Metalink Doc ID 1188080.1. In this blog we will cover checking REQUIRED_MIRROR_FREE_MB and USABLE_FILE_MB.
If a physical disk fails because of a hardware failure on the disk itself, all grid disks on that physical disk are immediately DROPPED with the FORCE option from ASM. This is called Pro-Active Disk Quarantine; ASM will not wait for DISK_REPAIR_TIME to expire before dropping the disks in this case.
REQUIRED_MIRROR_FREE_MB and USABLE_FILE_MB
What most of us are unsure about is how much free space we need to ensure there is no outage when a cell is powered off.
Each cell server is a failure group.
For a normal redundancy disk group, ASM calculates USABLE_FILE_MB using the following formula:
USABLE_FILE_MB = (FREE_MB - REQUIRED_MIRROR_FREE_MB) / 2
In Exadata with ASM version 12cR1, REQUIRED_MIRROR_FREE_MB is reported as the size of the largest disk in the disk group.
TOTAL_MB: Total capacity of the diskgroup in MB.
FREE_MB: Raw free space available in the diskgroup in MB.
FREE_MB = TOTAL_MB - (HOT_USED_MB + COLD_USED_MB)
REQUIRED_MIRROR_FREE_MB: Indicates how much free space is required in an ASM disk group to restore redundancy after the failure of an ASM disk or ASM failure group. In Exadata with ASM 11gR2 it is the disk capacity of one failure group; in 12cR1 it is the size of the largest disk, as noted above.
USABLE_FILE_MB: Indicates how much space is available in an ASM disk group considering the redundancy level of the disk group.
It is calculated as:
USABLE_FILE_MB = (FREE_MB - REQUIRED_MIRROR_FREE_MB) / 2 -> for normal redundancy
USABLE_FILE_MB = (FREE_MB - REQUIRED_MIRROR_FREE_MB) / 3 -> for high redundancy
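To make the arithmetic concrete, here is a minimal sketch in Python that applies the two formulas above. The function name and all figures are hypothetical and chosen only for illustration; real values come from v$asm_diskgroup.

```python
# Illustrative helper only (not part of any Oracle tooling): applies the
# USABLE_FILE_MB formulas quoted in the text above.
def usable_file_mb(free_mb, required_mirror_free_mb, redundancy):
    """Return usable MB for a NORMAL (divide by 2) or HIGH (divide by 3) disk group."""
    divisors = {"NORMAL": 2, "HIGH": 3}
    return (free_mb - required_mirror_free_mb) / divisors[redundancy.upper()]


# Hypothetical numbers, used only to show the calculation.
print(usable_file_mb(1_200_000, 300_000, "NORMAL"))  # 450000.0
print(usable_file_mb(1_200_000, 300_000, "HIGH"))    # 300000.0
# FREE_MB below REQUIRED_MIRROR_FREE_MB gives a negative result.
print(usable_file_mb(200_000, 300_000, "NORMAL"))    # -50000.0
```

A negative result means the disk group does not hold enough free space to restore redundancy after the failure it is reserving for.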
1) Run these queries for each disk group to check REQUIRED_MIRROR_FREE_MB and USABLE_FILE_MB. The formula used is total allocatable size / redundancy (3 in the first query, for high redundancy).
select sum(total_mb)/3 from v$asm_disk where lower(failgroup)='cel02.cn.db.com' and lower(name) like '%data%';
select FAILGROUP, count(NAME) "Disks", sum(TOTAL_MB) "MB"
from v$asm_disk_stat
where GROUP_NUMBER=2
group by FAILGROUP
order by 3;
select NAME, TOTAL_MB, FREE_MB, REQUIRED_MIRROR_FREE_MB, USABLE_FILE_MB
from v$asm_diskgroup_stat
where GROUP_NUMBER=2;
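As a rough way to read the output of these queries together, the sketch below compares the disk group's FREE_MB with the capacity of its largest failure group (cell). This is a deliberately conservative check based on the failure-group definition of REQUIRED_MIRROR_FREE_MB given earlier. The function names and figures are hypothetical, and it ignores any adjustment factor.

```python
# Sketch only: names are hypothetical and the figures are made up.
def largest_failgroup_mb(failgroup_total_mb):
    """Return the capacity in MB of the largest failure group (cell)."""
    return max(failgroup_total_mb.values())


def free_space_covers_largest_cell(free_mb, failgroup_total_mb):
    """True when FREE_MB exceeds the capacity of the largest failure group."""
    return free_mb > largest_failgroup_mb(failgroup_total_mb)


# Rows as the per-failgroup query would return them: FAILGROUP -> sum(TOTAL_MB).
failgroups = {"CELL01": 5_000_000, "CELL02": 5_000_000, "CELL03": 5_000_000}

print(largest_failgroup_mb(failgroups))                       # 5000000
print(free_space_covers_largest_cell(6_200_000, failgroups))  # True
print(free_space_covers_largest_cell(4_100_000, failgroups))  # False
```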
This is an old copy of the Oracle-provided script that I had:
SET SERVEROUTPUT ON
DECLARE
v_num_disks NUMBER;
v_group_number NUMBER;
v_max_total_mb NUMBER;
v_required_free_mb NUMBER;
v_usable_mb NUMBER;
v_cell_usable_mb NUMBER;
v_one_cell_usable_mb NUMBER;
v_enuf_free BOOLEAN := FALSE;
v_enuf_free_cell BOOLEAN := FALSE;
v_req_mirror_free_adj_factor NUMBER := 1.10;
v_req_mirror_free_adj NUMBER := 0;
v_one_cell_req_mir_free_mb NUMBER := 0;
v_disk_desc VARCHAR(10) := 'SINGLE';
v_offset NUMBER := 50;
v_db_version VARCHAR2(8);
v_inst_name VARCHAR2(1);
BEGIN
SELECT substr(version,1,8), substr(instance_name,1,1) INTO v_db_version, v_inst_name FROM v$instance;
IF v_inst_name <> '+' THEN
DBMS_OUTPUT.PUT_LINE('ERROR: THIS IS NOT AN ASM INSTANCE! PLEASE LOG ON TO AN ASM INSTANCE AND RE-RUN THIS SCRIPT.');
GOTO the_end;
END IF;
DBMS_OUTPUT.PUT_LINE('------ DISK and CELL Failure Diskgroup Space Reserve Requirements ------');
DBMS_OUTPUT.PUT_LINE(' This procedure determines how much space you need to survive a DISK or CELL failure. It also shows the usable space ');
DBMS_OUTPUT.PUT_LINE(' available when reserving space for disk or cell failure. ');
DBMS_OUTPUT.PUT_LINE(' Please see MOS note 1551288.1 for more information. ');
DBMS_OUTPUT.PUT_LINE('. . .');
DBMS_OUTPUT.PUT_LINE(' Description of Derived Values:');
DBMS_OUTPUT.PUT_LINE(' One Cell Required Mirror Free MB : Required Mirror Free MB to permit successful rebalance after losing largest CELL regardless of redundancy type');
DBMS_OUTPUT.PUT_LINE(' Disk Required Mirror Free MB : Space needed to rebalance after loss of single or double disk failure (for normal or high redundancy)');
DBMS_OUTPUT.PUT_LINE(' Disk Usable File MB : Usable space available after reserving space for disk failure and accounting for mirroring');
DBMS_OUTPUT.PUT_LINE(' Cell Usable File MB : Usable space available after reserving space for SINGLE cell failure and accounting for mirroring');
DBMS_OUTPUT.PUT_LINE('. . .');
IF v_db_version = '11.2.0.3' THEN
v_req_mirror_free_adj_factor := 1.10;
DBMS_OUTPUT.PUT_LINE('ASM Version: 11.2.0.3');
ELSE
v_req_mirror_free_adj_factor := 1.5;
DBMS_OUTPUT.PUT_LINE('ASM Version: '||v_db_version||' - WARNING DISK FAILURE COVERAGE ESTIMATES HAVE NOT BEEN VERIFIED ON THIS VERSION!');
END IF;
DBMS_OUTPUT.PUT_LINE('. . .');
FOR dg IN (SELECT name, type, group_number, total_mb, free_mb, required_mirror_free_mb FROM v$asm_diskgroup ORDER BY name) LOOP
v_enuf_free := FALSE;
v_req_mirror_free_adj := dg.required_mirror_free_mb * v_req_mirror_free_adj_factor;
-- Find largest amount of space allocated to a cell
SELECT sum(disk_cnt), max(max_total_mb), max(sum_total_mb)*v_req_mirror_free_adj_factor
INTO v_num_disks, v_max_total_mb, v_one_cell_req_mir_free_mb
FROM (SELECT count(1) disk_cnt, max(total_mb) max_total_mb, sum(total_mb) sum_total_mb
FROM v$asm_disk
WHERE group_number = dg.group_number
GROUP BY failgroup);
-- NORMAL redundancy
IF dg.type = 'NORMAL' THEN
-- Eighth Rack
IF (v_num_disks < 36) THEN
-- Use eqn: y = 1.21344 x+ 17429.8
v_required_free_mb := 1.21344 * v_max_total_mb + 17429.8;
IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;
-- Quarter Rack
ELSIF (v_num_disks >= 36 AND v_num_disks < 84) THEN
-- Use eqn: y = 1.07687 x+ 19699.3
v_required_free_mb := 1.07687 * v_max_total_mb + 19699.3;
IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;
-- Half Rack
ELSIF (v_num_disks >= 84 AND v_num_disks < 168) THEN
-- Use eqn: y = 1.02475 x+53731.3
v_required_free_mb := 1.02475 * v_max_total_mb + 53731.3;
IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;
-- Full rack is most conservative, it will be default
ELSE
-- Use eqn: y = 1.33333 x+83220.
v_required_free_mb := 1.33333 * v_max_total_mb + 83220;
IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;
END IF;
-- DISK usable file MB
v_usable_mb := ROUND((dg.free_mb - v_required_free_mb)/2);
v_disk_desc := 'ONE disk';
-- CELL usable file MB
v_cell_usable_mb := ROUND( (dg.free_mb - v_one_cell_req_mir_free_mb)/2 );
v_one_cell_usable_mb := v_cell_usable_mb;
ELSE
-- HIGH redundancy
-- Eighth Rack
IF (v_num_disks <= 18) THEN
-- Use eqn: y = 4x + 0
v_required_free_mb := 4.0 * v_max_total_mb;
IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;
-- Quarter Rack
ELSIF (v_num_disks > 18 AND v_num_disks <= 36) THEN
-- Use eqn: y = 3.87356 x+417692.
v_required_free_mb := 3.87356 * v_max_total_mb + 417692;
IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;
-- Half Rack
ELSIF (v_num_disks > 36 AND v_num_disks <= 84) THEN
-- Use eqn: y = 2.02222 x+56441.6
v_required_free_mb := 2.02222 * v_max_total_mb + 56441.6;
IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;
-- Full rack is most conservative, it will be default
ELSE
-- Use eqn: y = 2.14077 x+54276.4
v_required_free_mb := 2.14077 * v_max_total_mb + 54276.4;
IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;
END IF;
-- DISK usable file MB
v_usable_mb := ROUND((dg.free_mb - v_required_free_mb)/3);
v_disk_desc := 'TWO disks';
-- CELL usable file MB
v_one_cell_usable_mb := ROUND( (dg.free_mb - v_one_cell_req_mir_free_mb)/3 );
END IF;
DBMS_OUTPUT.PUT_LINE('-------------------------------------------------------------------------');
DBMS_OUTPUT.PUT_LINE('DG Name: '||LPAD(dg.name,v_offset-9));
DBMS_OUTPUT.PUT_LINE('DG Type: '||LPAD(dg.type,v_offset-9));
DBMS_OUTPUT.PUT_LINE('Num Disks: '||LPAD(TO_CHAR(v_num_disks),v_offset-11));
DBMS_OUTPUT.PUT_LINE('Disk Size MB: '||LPAD(TO_CHAR(v_max_total_mb,'999,999,999,999'),v_offset-14));
DBMS_OUTPUT.PUT_LINE('. . .');
DBMS_OUTPUT.PUT_LINE('DG Total MB: '||LPAD(TO_CHAR(dg.total_mb,'999,999,999,999'),v_offset-13));
DBMS_OUTPUT.PUT_LINE('DG Used MB: '||LPAD(TO_CHAR(dg.total_mb - dg.free_mb,'999,999,999,999'),v_offset-12));
DBMS_OUTPUT.PUT_LINE('DG Free MB: '||LPAD(TO_CHAR(dg.free_mb,'999,999,999,999'),v_offset-12));
DBMS_OUTPUT.PUT_LINE('. . .');
DBMS_OUTPUT.PUT_LINE('One Cell Required Mirror Free MB: '||LPAD(TO_CHAR(ROUND(v_one_cell_req_mir_free_mb),'999,999,999,999'),v_offset-34));
DBMS_OUTPUT.PUT_LINE('. . .');
DBMS_OUTPUT.PUT_LINE('Disk Required Mirror Free MB: '||LPAD(TO_CHAR(ROUND(v_required_free_mb),'999,999,999,999'),v_offset-30));
DBMS_OUTPUT.PUT_LINE('. . .');
DBMS_OUTPUT.PUT_LINE('Disk Usable File MB: '||LPAD(TO_CHAR(ROUND(v_usable_mb),'999,999,999,999'),v_offset-21));
DBMS_OUTPUT.PUT_LINE('Cell Usable File MB: '||LPAD(TO_CHAR(ROUND(v_one_cell_usable_mb),'999,999,999,999'),v_offset-21));
DBMS_OUTPUT.PUT_LINE('. . .');
IF v_enuf_free THEN
DBMS_OUTPUT.PUT_LINE('Enough Free Space to Rebalance after loss of '||v_disk_desc||': PASS');
ELSE
DBMS_OUTPUT.PUT_LINE('Enough Free Space to Rebalance after loss of '||v_disk_desc||': FAIL');
END IF;
IF dg.type = 'NORMAL' THEN
-- Calc Free Space for Rebalance Due to Cell Failure
IF v_req_mirror_free_adj < dg.free_mb THEN
DBMS_OUTPUT.PUT_LINE('Enough Free Space to Rebalance after loss of ONE cell: PASS');
ELSE
DBMS_OUTPUT.PUT_LINE('Enough Free Space to Rebalance after loss of ONE cell: WARNING (cell failure is very rare)');
END IF;
ELSE
-- Calc Free Space for Rebalance Due to Single Cell Failure
IF v_one_cell_req_mir_free_mb < dg.free_mb THEN
DBMS_OUTPUT.PUT_LINE('Enough Free Space to Rebalance after loss of ONE cell: PASS');
ELSE
DBMS_OUTPUT.PUT_LINE('Enough Free Space to Rebalance after loss of ONE cell: WARNING (cell failure is very rare and high redundancy offers ample protection already)');
END IF;
END IF;
END LOOP;
<<the_end>>
DBMS_OUTPUT.PUT_LINE('. . .');
DBMS_OUTPUT.PUT_LINE('Script completed.');
END;
/
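The NORMAL redundancy branch of the old script above picks a linear equation based on the total number of disks and applies it to the largest disk size. The sketch below restates only that branch in Python, with the coefficients copied from the script. The function names and sample figures are hypothetical, and Python's round() can differ from Oracle's ROUND on exact .5 values.

```python
# (exclusive upper bound on disk count, slope, intercept), copied from the
# NORMAL redundancy branch of the old script.
NORMAL_EQUATIONS = [
    (36, 1.21344, 17429.8),    # eighth rack
    (84, 1.07687, 19699.3),    # quarter rack
    (168, 1.02475, 53731.3),   # half rack
]
FULL_RACK = (1.33333, 83220.0)  # default, most conservative


def required_free_mb_normal(num_disks, max_disk_mb):
    """Free MB needed to rebalance after losing one disk (NORMAL redundancy)."""
    for upper, slope, intercept in NORMAL_EQUATIONS:
        if num_disks < upper:
            return slope * max_disk_mb + intercept
    slope, intercept = FULL_RACK
    return slope * max_disk_mb + intercept


def disk_usable_mb_normal(free_mb, num_disks, max_disk_mb):
    """Usable MB after reserving space for one disk failure and mirroring."""
    return round((free_mb - required_free_mb_normal(num_disks, max_disk_mb)) / 2)


# Hypothetical quarter rack: 36 disks, largest disk 2,000,000 MB.
print(required_free_mb_normal(36, 2_000_000))
print(disk_usable_mb_normal(10_000_000, 36, 2_000_000))
```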
This is the latest copy of the Oracle-provided script from Doc ID 1551288.1 as of today; it is posted below.
SET SERVEROUTPUT ON
SET LINES 155
SET PAGES 0
SET TRIMSPOOL ON
DECLARE
v_space_reserve_factor NUMBER := 0.15;
v_num_disks NUMBER;
v_group_number NUMBER;
v_max_total_mb NUMBER;
v_max_used_mb NUMBER;
v_fg_count NUMBER;
v_required_free_mb NUMBER;
v_usable_mb NUMBER;
v_cell_usable_mb NUMBER;
v_one_cell_usable_mb NUMBER;
v_enuf_free BOOLEAN := FALSE;
v_enuf_free_cell BOOLEAN := FALSE;
v_req_mirror_free_adj_factor NUMBER := 1.10;
v_req_mirror_free_adj NUMBER := 0;
v_one_cell_req_mir_free_mb NUMBER := 0;
v_disk_desc VARCHAR(10) := 'SINGLE';
v_offset NUMBER := 50;
v_db_version VARCHAR2(8);
v_inst_name VARCHAR2(1);
v_dg_pct_msg VARCHAR2(500);
v_cfc_fail_msg VARCHAR2(500);
BEGIN
SELECT substr(version,1,8), substr(instance_name,1,1) INTO v_db_version, v_inst_name FROM v$instance;
IF v_inst_name <> '+' THEN
DBMS_OUTPUT.PUT_LINE('ERROR: THIS IS NOT AN ASM INSTANCE! PLEASE LOG ON TO AN ASM INSTANCE AND RE-RUN THIS SCRIPT.');
GOTO the_end;
END IF;
DBMS_OUTPUT.PUT_LINE('------ DISK and CELL Failure Diskgroup Space Reserve Requirements ------');
DBMS_OUTPUT.PUT_LINE(' This procedure determines how much space you need to survive a DISK or CELL failure. It also shows the usable space ');
DBMS_OUTPUT.PUT_LINE(' available when reserving space for disk or cell failure (loss of cell is rare and not usually a concern). ');
DBMS_OUTPUT.PUT_LINE(' These required mirror and usable space assume space utilized to full capacity - a worst case condition.');
DBMS_OUTPUT.PUT_LINE(' Please see MOS note 1551288.1 for more information. ');
DBMS_OUTPUT.PUT_LINE('. . .');
DBMS_OUTPUT.PUT_LINE(' Description of Derived Values:');
DBMS_OUTPUT.PUT_LINE(' Recommended Reserve MB : Space needed to rebalance after loss of single or double disk failure (for normal or high redundancy)');
DBMS_OUTPUT.PUT_LINE(' Disk Usable File MB : Usable space available after reserving space for disk failure and accounting for mirroring');
DBMS_OUTPUT.PUT_LINE(' PCT Util : Percent of Total Diskgroup Space Utilized');
DBMS_OUTPUT.PUT_LINE(' DFC : Disk Failure Coverage Check (PASS = able to rebalance after loss of single disk)');
DBMS_OUTPUT.PUT_LINE('. . .');
DBMS_OUTPUT.PUT_LINE('ASM Version is '||v_db_version);
-- Set up headings
DBMS_OUTPUT.PUT_LINE('-------------------------------------------------------------------------------------------------------------------------------------------------');
-- Heading cells are padded to the data column widths: 10, 9, 5, 5, 12, 16 (x5), 7, 4
DBMS_OUTPUT.PUT('|          ');
DBMS_OUTPUT.PUT('|         ');
DBMS_OUTPUT.PUT('|     ');
DBMS_OUTPUT.PUT('|     ');
DBMS_OUTPUT.PUT('|            ');
DBMS_OUTPUT.PUT('|                ');
DBMS_OUTPUT.PUT('|                ');
DBMS_OUTPUT.PUT('|                ');
DBMS_OUTPUT.PUT('|Recommended     ');
DBMS_OUTPUT.PUT('|                ');
DBMS_OUTPUT.PUT('|       |');
DBMS_OUTPUT.PUT_LINE('    |');
-- next row
DBMS_OUTPUT.PUT('|          ');
DBMS_OUTPUT.PUT('|DG       ');
DBMS_OUTPUT.PUT('|Num  ');
DBMS_OUTPUT.PUT('|Num  ');
DBMS_OUTPUT.PUT('|Disk Size   ');
DBMS_OUTPUT.PUT('|DG Total        ');
DBMS_OUTPUT.PUT('|DG Used         ');
DBMS_OUTPUT.PUT('|DG Free         ');
DBMS_OUTPUT.PUT('|Reserve         ');
DBMS_OUTPUT.PUT('|Disk Usable     ');
DBMS_OUTPUT.PUT('|PCT    |');
DBMS_OUTPUT.PUT_LINE('    |');
-- next row
DBMS_OUTPUT.PUT('|DG Name   ');
DBMS_OUTPUT.PUT('|Type     ');
DBMS_OUTPUT.PUT('|FGs  ');
DBMS_OUTPUT.PUT('|Disks');
DBMS_OUTPUT.PUT('|MB          ');
DBMS_OUTPUT.PUT('|MB              ');
DBMS_OUTPUT.PUT('|MB              ');
DBMS_OUTPUT.PUT('|MB              ');
DBMS_OUTPUT.PUT('|MB              ');
DBMS_OUTPUT.PUT('|File MB         ');
DBMS_OUTPUT.PUT('|Util   ');
DBMS_OUTPUT.PUT_LINE('|DFC |');
DBMS_OUTPUT.PUT_LINE('-------------------------------------------------------------------------------------------------------------------------------------------------');
FOR dg IN (SELECT name, type, group_number, total_mb, free_mb, required_mirror_free_mb FROM v$asm_diskgroup ORDER BY name) LOOP
v_enuf_free := FALSE;
-- Find largest amount of space allocated to a cell
SELECT sum(disk_cnt), max(max_total_mb), max(sum_used_mb), count(distinct failgroup)
INTO v_num_disks,v_max_total_mb, v_max_used_mb, v_fg_count
FROM (SELECT failgroup, count(1) disk_cnt, max(total_mb) max_total_mb, sum(total_mb - free_mb) sum_used_mb
FROM v$asm_disk
WHERE group_number = dg.group_number and failgroup_type = 'REGULAR'
GROUP BY failgroup);
-- Amount to reserve depends on version and number of FGs
IF ((v_db_version like '12.2%') or (v_db_version like '18%') or (v_db_version like '19%')) THEN
IF v_fg_count < 5 THEN
v_space_reserve_factor := 0.15 ;
v_dg_pct_msg := v_dg_pct_msg||'Diskgroup '||dg.name||' using reserve factor of 15% '||chr(10);
ELSE
v_space_reserve_factor := 0.09 ;
v_dg_pct_msg := v_dg_pct_msg||'Diskgroup '||dg.name||' using reserve factor of 9% '||chr(10);
END IF;
ELSIF ( (v_db_version like '12.1%' ) or (v_db_version like '11.2.0.4%') ) THEN
v_space_reserve_factor := 0.15 ;
v_dg_pct_msg := v_dg_pct_msg||'Diskgroup '||dg.name||' using reserve factor of 15% '||chr(10);
ELSE
v_space_reserve_factor := 0.15 ;
v_dg_pct_msg := v_dg_pct_msg||'Diskgroup '||dg.name||' using reserve factor of 15% '||chr(10);
END IF;
v_required_free_mb := v_space_reserve_factor * dg.total_mb;
IF dg.free_mb > v_required_free_mb THEN v_enuf_free := TRUE; END IF;
IF dg.type = 'NORMAL' THEN
-- DISK usable file MB
v_usable_mb := ROUND((dg.free_mb - v_required_free_mb)/2);
ELSIF dg.type = 'HIGH' THEN
-- HIGH redundancy
-- DISK usable file MB
v_usable_mb := ROUND((dg.free_mb - v_required_free_mb)/3);
ELSIF dg.type = 'EXTEND' THEN
-- EXTENDED redundancy for stretch clusters
-- DISK usable file MB
v_usable_mb := ROUND((dg.free_mb - v_required_free_mb)/4);
ELSE
-- We don't know this type...maybe FLEX DG - not enough info to say
v_usable_mb := NULL;
END IF;
DBMS_OUTPUT.PUT('|'||RPAD(dg.name,v_offset-40));
DBMS_OUTPUT.PUT('|'||RPAD(nvl(dg.type,' '),v_offset-41));
DBMS_OUTPUT.PUT('|'||LPAD(TO_CHAR(v_fg_count),v_offset-45));
DBMS_OUTPUT.PUT('|'||LPAD(TO_CHAR(v_num_disks),v_offset-45));
DBMS_OUTPUT.PUT('|'||TO_CHAR(v_max_total_mb,'999,999,999'));
DBMS_OUTPUT.PUT('|'||TO_CHAR(dg.total_mb,'999,999,999,999'));
DBMS_OUTPUT.PUT('|'||TO_CHAR(dg.total_mb - dg.free_mb,'999,999,999,999'));
DBMS_OUTPUT.PUT('|'||TO_CHAR(dg.free_mb,'999,999,999,999'));
DBMS_OUTPUT.PUT('|'||TO_CHAR(ROUND(v_required_free_mb),'999,999,999,999'));
DBMS_OUTPUT.PUT('|'||TO_CHAR(ROUND(v_usable_mb),'999,999,999,999'));
-- Calc Disk Utilization Percentage
IF dg.total_mb > 0 THEN
DBMS_OUTPUT.PUT('|'||TO_CHAR((((dg.total_mb - dg.free_mb)/dg.total_mb)*100),'999.9')||CHR(37));
ELSE
DBMS_OUTPUT.PUT('|       ');
END IF;
IF v_enuf_free THEN
DBMS_OUTPUT.PUT_LINE('|'||'PASS|');
ELSE
DBMS_OUTPUT.PUT_LINE('|'||'FAIL|');
END IF;
END LOOP;
DBMS_OUTPUT.PUT_LINE('-------------------------------------------------------------------------------------------------------------------------------------------------');
<<the_end>>
DBMS_OUTPUT.PUT_LINE(v_dg_pct_msg);
IF v_cfc_fail_msg is not null THEN
DBMS_OUTPUT.PUT_LINE('Cell Failure Coverage Freespace Failures Detected. Warning Message Follows.');
DBMS_OUTPUT.PUT_LINE(v_cfc_fail_msg);
END IF;
DBMS_OUTPUT.PUT_LINE('. . .');
DBMS_OUTPUT.PUT_LINE('Script completed.');
END;
/
WHENEVER SQLERROR EXIT FAILURE;
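The core of the newer script above is much smaller than its output formatting suggests: it reserves a fixed percentage of TOTAL_MB and derives the usable space and the DFC result from that. The following Python sketch restates that logic so it can be checked by hand. The function names and sample figures are hypothetical, and Python's round() can differ from Oracle's ROUND on exact .5 values.

```python
def reserve_factor(asm_version, failgroup_count):
    """9% for 12.2/18/19 with five or more failgroups, otherwise 15%."""
    newer = asm_version.startswith(("12.2", "18", "19"))
    if newer and failgroup_count >= 5:
        return 0.09
    return 0.15


# Divisor per disk group type, as in the script; other types give no usable figure.
DIVISOR = {"NORMAL": 2, "HIGH": 3, "EXTEND": 4}


def disk_failure_coverage(total_mb, free_mb, dg_type, asm_version, failgroup_count):
    """Return (disk usable file MB, DFC result) for one disk group."""
    reserve_mb = reserve_factor(asm_version, failgroup_count) * total_mb
    divisor = DIVISOR.get(dg_type)
    usable_mb = None if divisor is None else round((free_mb - reserve_mb) / divisor)
    dfc = "PASS" if free_mb > reserve_mb else "FAIL"
    return usable_mb, dfc


# Hypothetical disk groups of 1,000,000 MB each.
print(disk_failure_coverage(1_000_000, 400_000, "NORMAL", "19.0.0.0", 3))  # (125000, 'PASS')
print(disk_failure_coverage(1_000_000, 400_000, "NORMAL", "19.0.0.0", 7))  # (155000, 'PASS')
print(disk_failure_coverage(1_000_000, 100_000, "HIGH", "12.1.0.2", 7))    # (-16667, 'FAIL')
```

Note that this version of the script reserves space for disk failure only. v_cfc_fail_msg is never assigned, so the cell failure warning at the end is never printed.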
Dropping Disks:
Once we have done our checks, we can proceed to drop the disks from all diskgroups assigned to the cell that is going for maintenance.
SQL> ALTER DISKGROUP DATA01 drop DISKS IN FAILGROUP cell01 rebalance power 32 nowait;
SQL> ALTER DISKGROUP RECO drop DISKS IN FAILGROUP cell01 rebalance power 32 nowait;
Once the disk is repaired, we can re-add it to all diskgroups. See:
How To Add Back An ASM Disk or Failgroup (Normal or High Redundancy) After A Transient Failure Occurred Or When The DISK_REPAIR_TIME Attribute Expired (10.1 to 12.1)? (Doc ID 946213.1)
SQL> ALTER DISKGROUP <DG_NAME> ADD FAILGROUP <FG_NAME> DISK '/dev/rdsk/c3t13xxxx' REBALANCE POWER <power number 1-11>;
Procedure to reboot or shut down a storage cell without affecting ASM
Once the DBA has confirmed from the above script that the cell can be taken offline, the system administrator (SA) performs the steps below.
A) Raise an OEM blackout covering ALL ASM instances and the affected storage cell before shutting down or rebooting the storage cell.
B) Check current ASM mode status of each grid disk
# cellcli -e list griddisk attributes name, asmmodestatus, asmdeactivationoutcome, errorCount, status
If one or more disks return asmdeactivationoutcome='No', you should wait for a minute and repeat above step. If all disks return asmdeactivationoutcome='Yes' proceed with the remaining steps.
C) Make all grid disks inactive in ASM
# cellcli -e alter griddisk all inactive
Note : This action could take 10 minutes or longer depending on activity. It is very important to make sure you were able to offline all the disks successfully before shutting down the cell services. Inactivating the grid disks will automatically OFFLINE the disks in the ASM instance.
Sometimes, after “alter griddisk all inactive”, some grid disks are still online even after waiting 30 to 60 minutes. To prevent this, name the griddisks explicitly instead of using “all”, and inactivate them individually.
For example:
# cellcli -e alter griddisk SYSTEMDG_CD_02_dmorlcel11, SYSTEMDG_CD_03_dmorlcel11, SYSTEMDG_CD_04_dmorlcel11 inactive
If that does not work, you will have to set these disks offline from ASM as the “oragrid” user. This needs to be done by a DBA. For example:
$ sqlplus / as sysasm
SQL> alter diskgroup SYSTEMDG offline disk SYSTEMDG_CD_02_dmorlcel11;
You probably need to do this on one disk only; the rest will be set offline at the same time.
D) Confirm that the griddisks are now offline
# cellcli -e list griddisk attributes name, asmmodestatus, asmdeactivationoutcome, errorCount, status
The output should show asmmodestatus=OFFLINE or asmmodestatus=UNUSED, and asmdeactivationoutcome=Yes, for all griddisks once the disks are offline in ASM.
List the griddisks to confirm that they all show inactive:
# cellcli -e list griddisk
E) If all are offline, proceed with storage cell reboot or power down.
IMPORTANT: After the cell reboot is complete, re-enable the grid disks in ASM:
# cellcli -e alter griddisk all active
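The checks in steps B and D can be scripted against the text that the list griddisk command prints. The sketch below assumes whitespace-separated columns in the order requested (name, asmmodestatus, asmdeactivationoutcome, errorCount, status). The function names and sample output are hypothetical, and accepting OFFLINE as well as UNUSED in step D is an assumption.

```python
def _rows(cellcli_output):
    """Split captured output into field lists, skipping blank or short lines."""
    rows = []
    for line in cellcli_output.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            rows.append(fields)
    return rows


def blocking_disks(cellcli_output):
    """Step B: grid disks whose asmdeactivationoutcome is not 'Yes'."""
    return [f[0] for f in _rows(cellcli_output) if f[2] != "Yes"]


def not_yet_offline(cellcli_output):
    """Step D: grid disks not yet OFFLINE/UNUSED with outcome 'Yes'."""
    return [f[0] for f in _rows(cellcli_output)
            if f[1] not in ("OFFLINE", "UNUSED") or f[2] != "Yes"]


# Made-up output for illustration; real output comes from cellcli on the cell.
before = """
    DATA_CD_00_cel01   ONLINE   Yes   0   active
    DATA_CD_01_cel01   ONLINE   No    0   active
"""
after = """
    DATA_CD_00_cel01   UNUSED   Yes   0   inactive
    DATA_CD_01_cel01   ONLINE   Yes   0   inactive
"""

print(blocking_disks(before))   # ['DATA_CD_01_cel01']
print(not_yet_offline(after))   # ['DATA_CD_01_cel01']
```

An empty list from blocking_disks means it is safe to continue with step C. An empty list from not_yet_offline means the cell can be rebooted or powered down.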
To replace a flash disk due to disk failure, perform the following procedure:
Doc ID 1113023.1
Step 1. Run the following command to stop the cell services:
CellCLI> ALTER CELL SHUTDOWN SERVICES ALL
The preceding command will check if there are any disks that are offline, in predictive failure status or need to be resilvered. If Oracle ASM redundancy is intact, the command will take the griddisks offline in ASM and then proceed to stop the cell services. If the following error message is seen, it is not safe to stop the cell services because some disk group may be forced to dismount due to reduced redundancy.
If such an error is encountered, please restore the Oracle ASM disk group redundancy and retry the command when all disk status is back to normal.
Step 2. Shut down the cell.
Step 3. Replace the failed flash disk based on the PCI number and FDOM number.
Step 4. Power up the cell. The cell services will be started automatically. As part of the cell service startup, all grid disks will also be auto-onlined in Oracle ASM.
Step 5. Verify that all grid disks have been successfully brought online using the following command:
CellCLI> LIST GRIDDISK ATTRIBUTES NAME, ASMMODESTATUS
________________________
Reference :
________________________
Steps to shut down or reboot an Exadata storage cell without affecting ASM (Doc ID 1188080.1)
Script to Calculate New Grid Disk and Disk Group Sizes in Exadata (Doc ID 1464809.1)
Understanding ASM Capacity and Reservation of Free Space in Exadata (Doc ID 1551288.1)
https://docs.oracle.com/cd/E80920_01/DBMMN/maintaining-exadata-storage-servers.htm#DBMMN-GUID-96881470-4D44-4AF6-8A29-33E3164CD244