Saturday, May 22, 2021

Oracle RAC -- Adding HAIP Manually: Redundant Interconnect



Starting with 11g R2, Oracle Grid Infrastructure provides HAIP (Highly Available IP), which moves link aggregation to the clusterware level. Instead of bonding the network adapters on the OS side, Grid Infrastructure is instructed to use multiple network adapters. Grid Infrastructure still starts HAIP even if the system is configured with only one private network adapter; in that case the resource ora.cluster_interconnect.haip is still shown as ONLINE.

Starting with 11gR2, Oracle supports up to four redundant interconnects, which the cluster manages automatically for failover and load balancing.
The following procedure shows how to add a new private interconnect to an existing RAC cluster.


Before making this change:
> Check on all nodes whether any resource is OFFLINE. All the CRS resources should be online.
> In one case we had ADVM (12c) showing STABLE and we still got errors.
> Check which private interfaces are already registered.
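The OFFLINE check above can be scripted. The following is only a sketch: the function name offline_resources is made up for this post, and it assumes the usual "crsctl stat res -t" layout, where a resource name starting with "ora." appears either on its own line or at the start of the status line.

```shell
#!/bin/sh
# Hypothetical helper (not an Oracle tool): reads "crsctl stat res -t"
# style output on stdin and prints each resource that has an OFFLINE
# Target or State entry, once per resource.
offline_resources() {
    awk '
        /^ora\./  { res = $1 }          # remember the most recent resource name
        /OFFLINE/ {
            if (res != "" && !(res in seen)) {
                seen[res] = 1
                print res
            }
        }
    '
}

# Typical use on a cluster node (run on every node):
#   crsctl stat res -t -init | offline_resources
#   crsctl stat res -t       | offline_resources
```

If the function prints nothing, no resource was reported as OFFLINE in the output it was given.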


** IMPORTANT: Make sure the second (redundant) private interface is on a DIFFERENT SUBNET from the first.

List all the interfaces on the server.
Make sure that the new private interface you are going to add is correctly plumbed.
It must have the same interface name on all other nodes.


Each NIC used in the HAIP configuration must be in its own subnet. If the same subnet is used and the NIC whose subnet comes first in the routing table fails, you can experience a node eviction.

Each NIC defined as a cluster interconnect on a given node has a static IP address (private IP) assigned to it, and each cluster interconnect NIC on a given node must be on a unique subnet. If any one of the cluster interconnect NICs is down on a node, the subnet associated with that NIC is considered unusable by every node of the cluster.
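To check the "unique subnet" rule before touching the cluster, the subnet ID can be computed from an address and netmask by AND-ing them octet by octet. The helpers below are a minimal sketch; subnet_of and same_subnet are names invented here, and the comparison is deliberately simple (it does not detect overlapping subnets that use different netmasks).

```shell
#!/bin/sh
# Hypothetical helpers, not part of Grid Infrastructure.

# subnet_of <ipv4-address> <netmask>
# Prints the subnet ID, e.g. 192.168.10.1 / 255.255.255.0 gives 192.168.10.0
subnet_of() {
    # Split both dotted-quad values into their four octets.
    IFS=. read -r i1 i2 i3 i4 <<EOF
$1
EOF
    IFS=. read -r m1 m2 m3 m4 <<EOF
$2
EOF
    # Bitwise AND of each address octet with the matching mask octet.
    echo "$((i1 & m1)).$((i2 & m2)).$((i3 & m3)).$((i4 & m4))"
}

# same_subnet <ip1> <mask1> <ip2> <mask2>
# Succeeds (exit status 0) when both addresses yield the same subnet ID.
same_subnet() {
    [ "$(subnet_of "$1" "$2")" = "$(subnet_of "$3" "$4")" ]
}
```

For example, "same_subnet 192.168.10.1 255.255.255.0 192.168.10.3 255.255.255.0" succeeds, which means those two addresses would violate the rule above if used for two interconnect NICs.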







#####################################
Check HAIP Information  Before Adding
#####################################

[oracle@host01 bin]$ ./crsctl stat res -t -init

--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       host01                   Started,STABLE
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       host01                   STABLE



oifcfg shows that only one adapter is defined for the cluster interconnect.

[oracle@host01 bin]$ ./oifcfg getif
eth0  192.168.56.0  global  public
eth1  192.168.10.0  global  cluster_interconnect



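The ifconfig command shows that network device eth2 is part of two subnets: its static private address on eth2, and the HAIP address that Grid Infrastructure plumbs on the alias eth2:1.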

[oracle@host01 bin]$ ifconfig -a

eth0      Link encap:Ethernet  HWaddr 08:00:27:98:EA:FE 
         inet addr:192.168.56.71  Bcast:192.168.56.255  Mask:255.255.255.0
         inet6 addr: fe80::a00:27ff:fe98:eafe/64 Scope:Link
         UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
         RX packets:947 errors:0 dropped:0 overruns:0 frame:0
         TX packets:818 errors:0 dropped:0 overruns:0 carrier:0
         collisions:0 txqueuelen:1000
         RX bytes:100821 (98.4 KiB)  TX bytes:92406 (90.2 KiB)

eth2      Link encap:Ethernet  HWaddr 08:00:27:54:73:8F 
          inet addr:192.168.10.1  Bcast:192.168.10.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe54:738f/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9000  Metric:1
          RX packets:406939 errors:0 dropped:0 overruns:0 frame:0
          TX packets:382298 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:445270636 (424.6 MiB)  TX bytes:202801222 (193.4 MiB)
 

eth2:1    Link encap:Ethernet  HWaddr 08:00:27:54:73:8F 
          inet addr:192.168.225.190  Bcast:192.168.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:9000  Metric:1





When Grid Infrastructure is stopped, the ifconfig command no longer shows the eth2:1 device. The gv$cluster_interconnects view shows the HAIP address used by each instance.

select     
     inst_id,
     name,
     ip_address
  from
     gv$cluster_interconnects;


   INST_ID NAME            IP_ADDRESS
---------- --------------- ----------------
         1 eth2:1          192.168.225.190
         2 eth2:1          192.168.230.98





#####################################
Adding Secondary Private Network / HAIP 
#####################################

While HAIP is running, there is no redundancy or additional network bandwidth as long as only one network interface is configured. If a second network interface is available for the private network, it needs to be added to Grid Infrastructure. The device must already be a correctly configured network adapter in the operating system, with the same settings as the current interface (MTU size, etc.). Note that the example below registers eth3 on the same subnet as eth2 (192.168.10.0); per the IMPORTANT note earlier in this post, each interconnect NIC should instead be on its own subnet. The oifcfg command is used to set the new interface as a cluster_interconnect device.

$ oifcfg iflist
eth0  10.0.2.0        <--local router
eth1  192.168.56.0    <-- public Interface
eth2  192.168.10.0     <-- RAC cluster_interconnect
eth2  192.168.0.0     <-- RAC used  
eth3  192.168.0.0     <-- Our new device we want to add to the cluster_interconnect




oifcfg iflist -p -n   (example output only, from a different system)

eth0  192.168.4.0  PRIVATE  255.255.255.0
eth1  192.168.0.128  PRIVATE  255.255.255.128
eth1  192.168.0.0  UNKNOWN  255.255.0.0
Note:
– The first column is the network adapter name.
– The second column is the subnet ID.
– The third column indicates whether the subnet is private, public or unknown according to the RFC standard. It has NOTHING to do with whether it’s used as a private or public network in Oracle Clusterware.
– The last column is the netmask.


oifcfg setif -global <interface-name>/<subnet>:<cluster_interconnect|public>

[oracle@host01 bin]$ ./oifcfg setif -global  eth3/192.168.10.0:cluster_interconnect,asm



The device eth3 is now part of the cluster interconnect. Because the interface was registered with -global, the command does not need to be repeated on the other nodes; Grid Infrastructure takes care of that for us. On host02, the device is already configured.
 
[oracle@host02 bin]$ ./oifcfg getif

eth1  192.168.56.0  global  public
eth2  192.168.10.0  global  cluster_interconnect
eth3  192.168.10.0  global  cluster_interconnect
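Since the warning earlier in this post says every interconnect NIC needs its own subnet, it can be worth scanning the getif output for subnets that are registered more than once. This is a rough sketch; shared_interconnect_subnets is a made-up name, and it assumes the four-column getif layout shown above (interface, subnet, scope, type).

```shell
#!/bin/sh
# Hypothetical helper: reads "oifcfg getif" output on stdin and prints
# every subnet that is registered as cluster_interconnect more than once.
shared_interconnect_subnets() {
    awk '
        # The type column may also read "cluster_interconnect,asm".
        $NF ~ /cluster_interconnect/ { count[$2]++ }
        END {
            for (s in count)
                if (count[s] > 1)
                    print s
        }
    '
}

# Typical use:
#   oifcfg getif | shared_interconnect_subnets
```

Fed with the host02 output above, it reports 192.168.10.0, because eth2 and eth3 are both registered on that subnet. No output means each interconnect has its own subnet.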

 

Grid Infrastructure needs to be restarted on all nodes. Run the following as root on each node.

[root@host01 bin]# ./crsctl stop crs
[root@host01 bin]# ./crsctl start crs




Once the cluster nodes are back up and running, the new interface will be part of the RAC HAIP configuration.

[root@host01 ~]# ifconfig -a

eth1      Link encap:Ethernet  HWaddr 08:00:27:98:EA:FE 
          inet addr:192.168.56.71  Bcast:192.168.56.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe98:eafe/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:5215 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6593 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:2469064 (2.3 MiB)  TX bytes:7087438 (6.7 MiB)

 
eth2      Link encap:Ethernet  HWaddr 08:00:27:54:73:8F 
          inet addr:192.168.10.1  Bcast:192.168.10.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe54:738f/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9000  Metric:1
          RX packets:3517 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2771 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:789056 (770.5 KiB)  TX bytes:694387 (678.1 KiB)

 

eth2:1    Link encap:Ethernet  HWaddr 08:00:27:54:73:8F 
          inet addr:192.168.21.30  Bcast:192.168.127.255  Mask:255.255.128.0
          UP BROADCAST RUNNING MULTICAST  MTU:9000  Metric:1

 
eth3      Link encap:Ethernet  HWaddr 08:00:27:6A:8B:8A 
          inet addr:192.168.10.3  Bcast:192.168.10.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe6a:8b8a/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9000  Metric:1
          RX packets:857 errors:0 dropped:0 overruns:0 frame:0
          TX packets:511 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:158563 (154.8 KiB)  TX bytes:64923 (63.4 KiB)

 
eth3:1    Link encap:Ethernet  HWaddr 08:00:27:6A:8B:8A 
          inet addr:192.168.170.240 Bcast:192.168.255.255 Mask:255.255.128.0
          UP BROADCAST RUNNING MULTICAST  MTU:9000  Metric:1

 

The new interface is also found in the gv$cluster_interconnects view.

 select
     inst_id,
     name,
     ip_address
  from
     gv$cluster_interconnects;

 
   INST_ID NAME            IP_ADDRESS
---------- --------------- ----------------
         1 eth2:1          192.168.21.30
         1 eth3:1          192.168.170.240
         2 eth2:1          192.168.75.234
         2 eth3:1          192.168.188.35



#####################################
Testing  HAIP 
#####################################

Stop eth2 at OS level and verify eth3 
# ifconfig eth2 down
# ifconfig eth3
 
--> Cluster_interconnect failed over from eth2:1 to eth3:2 - eth2 is not used anymore

Re-enable eth2 at OS level
# ifconfig eth2 up
--> Wait a few seconds, then check that the failed-over cluster_interconnect address is back on eth2
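To see where the HAIP addresses sit before and after the failover test, the alias entries can be pulled out of the ifconfig output. This is only a sketch: list_aliases is a made-up name, and it assumes the older net-tools layout shown in this post ("inet addr:" lines under a non-indented interface name). Newer ifconfig versions print a different format.

```shell
#!/bin/sh
# Hypothetical helper: reads "ifconfig -a" output (old net-tools format)
# on stdin and prints "<alias> <address>" for alias devices such as eth2:1.
list_aliases() {
    awk '
        /^[^ \t]/ { name = $1 }          # a non-indented line starts a new device block
        /inet addr:/ && name ~ /:/ {     # only alias devices carry a colon in the name
            split($2, a, ":")            # $2 looks like addr:192.168.21.30
            print name, a[2]
        }
    '
}

# Typical use, before and after "ifconfig eth2 down":
#   ifconfig -a | list_aliases
```

Comparing the two listings shows which alias each HAIP address moved to during the test.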



#####################################
Disable haip service and haip dependencies
#####################################


Disable : 
[ora12c1:root]:/>crsctl modify res ora.cluster_interconnect.haip -attr "ENABLED=0" -init


Enable : 
[ora12c1:root]:/>crsctl modify res ora.cluster_interconnect.haip -attr "ENABLED=1" -init
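After changing the attribute, the current setting can be read back from the resource profile. A minimal sketch, assuming the NAME=value lines that "crsctl stat res ... -p" prints; the function name haip_enabled is invented for this post.

```shell
#!/bin/sh
# Hypothetical helper: reads a resource profile (NAME=value lines) on
# stdin and reports the state of the ENABLED attribute.
haip_enabled() {
    awk -F= '
        $1 == "ENABLED" {
            print ($2 == 1 ? "enabled" : "disabled")
            found = 1
        }
        END { if (!found) print "unknown" }
    '
}

# Typical use (as root or the Grid Infrastructure owner):
#   crsctl stat res ora.cluster_interconnect.haip -init -p | haip_enabled
```

A result of "unknown" means no ENABLED line was found in the input, for example because the crsctl command itself failed.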





#####################################
Reference
#####################################
How to Modify Private Network Information in Oracle Clusterware (Doc ID 283684.1)

https://www.oracle.com/technetwork/products/clusterware/overview/interconnect-vlan-06072012-1657506.pdf

https://docs.oracle.com/database/121/CWLIN/networks.htm#CIHIJAJB

My Oracle Support note (Doc ID 1481481.1)
