Tuesday, May 26, 2009

Perform the following network configuration on both Oracle RAC nodes in the cluster

Perform the following network configuration on both Oracle RAC nodes in the cluster!

Note: Although we configured several of the network settings during the Linux installation, it is important to not skip this section as it contains critical steps that are required for the RAC environment.

Introduction to Network Settings

During the Linux O/S install we already configured the IP address and host name for both of the Oracle RAC nodes. We now need to configure the /etc/hosts file as well as adjusting several of the network settings for the interconnect.

Both of the Oracle RAC nodes should have one static IP address for the public network and one static IP address for the private cluster interconnect. Do not use DHCP naming for the public IP address or the interconnects; you need static IP addresses! The private interconnect should only be used by Oracle to transfer Cluster Manager and Cache Fusion related data along with data for the network storage server (Openfiler). Note that Oracle does not support using the public network interface for the interconnect. You must have one network interface for the public network and another network interface for the private interconnect. For a production RAC implementation, the interconnect should be at least gigabit (or more) and only be used by Oracle as well as having the network storage server (Openfiler) on a separate gigabit network.

Configuring Public and Private Network

In our two node example, we need to configure the network on both Oracle RAC nodes for access to the public network as well as their private interconnect.

The easiest way to configure network settings in Oracle Enterprise Linux is with the program "Network Configuration". This application can be started from the command-line as the "root" user account as follows:

# su -
# /usr/bin/system-config-network &

Using the Network Configuration application, you need to configure both NIC devices as well as the /etc/hosts file. Both of these tasks can be completed using the Network Configuration GUI. Notice that the /etc/hosts settings are the same for both nodes.

Our example configuration will use the following settings:

Oracle RAC Node 1 - (linux1)
Device IP Address Subnet Gateway Purpose
eth0 192.168.1.100 255.255.255.0 192.168.1.1 Connects linux1 to the public network
eth1 192.168.2.100 255.255.255.0 Connects linux1 (interconnect) to linux2 (linux2-priv)
/etc/hosts
127.0.0.1        localhost.localdomain localhost

# Public Network - (eth0)
192.168.1.100 linux1
192.168.1.101 linux2

# Private Interconnect - (eth1)
192.168.2.100 linux1-priv

192.168.2.101 linux2-priv

# Public Virtual IP (VIP) addresses - (eth0)
192.168.1.200 linux1-vip
192.168.1.201 linux2-vip

# Private Storage Network for Openfiler - (eth1)
192.168.1.195 openfiler1

192.168.2.195 openfiler1-priv
Oracle RAC Node 2 - (linux2)
Device IP Address Subnet Gateway Purpose
eth0 192.168.1.101 255.255.255.0 192.168.1.1 Connects linux2 to the public network
eth1 192.168.2.101 255.255.255.0 Connects linux2 (interconnect) to linux1 (linux1-priv)
/etc/hosts
127.0.0.1        localhost.localdomain localhost

# Public Network - (eth0)
192.168.1.100 linux1
192.168.1.101 linux2

# Private Interconnect - (eth1)
192.168.2.100 linux1-priv

192.168.2.101 linux2-priv

# Public Virtual IP (VIP) addresses - (eth0)
192.168.1.200 linux1-vip
192.168.1.201 linux2-vip

# Private Storage Network for Openfiler - (eth1)
192.168.1.195 openfiler1

192.168.2.195 openfiler1-priv

Note that the virtual IP addresses only need to be defined in the /etc/hosts file (or your DNS) for both Oracle RAC nodes. The public virtual IP addresses will be configured automatically by Oracle when you run the Oracle Universal Installer, which starts Oracle's Virtual Internet Protocol Configuration Assistant (VIPCA). All virtual IP addresses will be activated when the srvctl start nodeapps -n command is run. This is the Host Name/IP Address that will be configured in the client(s) tnsnames.ora file (more details later).

In the screenshots below, only Oracle RAC Node 1 (linux1) is shown. Be sure to make all the proper network settings to both Oracle RAC nodes.



Figure 2 Network Configuration Screen, Node 1 (linux1)



Figure 3 Ethernet Device Screen, eth0 (linux1)



Figure 4 Ethernet Device Screen, eth1 (linux1)



Figure 5: Network Configuration Screen, /etc/hosts (linux1)

Once the network is configured, you can use the ifconfig command to verify everything is working. The following example is from linux1:

# /sbin/ifconfig -a
eth0 Link encap:Ethernet HWaddr 00:14:6C:76:5C:71
inet addr:192.168.1.100 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::214:6cff:fe76:5c71/64 Scope:Link

UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1546 errors:0 dropped:0 overruns:0 frame:0
TX packets:1273 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000

RX bytes:1179157 (1.1 MiB) TX bytes:183011 (178.7 KiB)
Interrupt:169 Base address:0x2f00

eth1 Link encap:Ethernet HWaddr 00:0E:0C:64:D1:E5
inet addr:192.168.2.100 Bcast:192.168.2.255 Mask:255.255.255.0

inet6 addr: fe80::20e:cff:fe64:d1e5/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:11 errors:0 dropped:0 overruns:0 carrier:0

collisions:0 txqueuelen:1000
RX bytes:0 (0.0 b) TX bytes:782 (782.0 b)
Base address:0xddc0 Memory:fe9c0000-fe9e0000

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0

inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:4893 errors:0 dropped:0 overruns:0 frame:0
TX packets:4893 errors:0 dropped:0 overruns:0 carrier:0

collisions:0 txqueuelen:0
RX bytes:6521518 (6.2 MiB) TX bytes:6521518 (6.2 MiB)

sit0 Link encap:IPv6-in-IPv4
NOARP MTU:1480 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0

TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)

About Virtual IP

Why is there a Virtual IP (VIP) in 10g? Why does it just return a dead connection when its primary node fails?

It's all about availability of the application. When a node fails, the VIP associated with it is supposed to be automatically failed over to some other node. When this occurs, two things happen.

  1. The new node re-arps the world indicating a new MAC address for the address. For directly connected clients, this usually causes them to see errors on their connections to the old address.
  2. Subsequent packets sent to the VIP go to the new node, which will send error RST packets back to the clients. This results in the clients getting errors immediately.

This means that when the client issues SQL to the node that is now down, or traverses the address list while connecting, rather than waiting on a very long TCP/IP time-out (~10 minutes), the client receives a TCP reset. In the case of SQL, this is ORA-3113. In the case of connect, the next address in tnsnames is used.

Going one step further is making use of Transparent Application Failover (TAF). With TAF successfully configured, it is possible to completely avoid ORA-3113 errors alltogether! TAF will be discussed in more detail in Section 30 ("Transparent Application Failover - (TAF)").

Without using VIPs, clients connected to a node that died will often wait a 10-minute TCP timeout period before getting an error. As a result, you don't really have a good HA solution without using VIPs (Source - Metalink Note 220970.1).

Confirm the RAC Node Name is Not Listed in Loopback Address

Ensure that the node names (linux1 or linux2) are not included for the loopback address in the /etc/hosts file. If the machine name is listed in the in the loopback address entry as below:

127.0.0.1 linux1 localhost.localdomain localhost
it will need to be removed as shown below:
127.0.0.1 localhost.localdomain localhost

If the RAC node name is listed for the loopback address, you will receive the following error during the RAC installation:

ORA-00603: ORACLE server session terminated by fatal error
or
ORA-29702: error occurred in Cluster Group Service operation

Confirm localhost is defined in the /etc/hosts file for the loopback address

Ensure that the entry for localhost.localdomain and localhost are included for the loopback address in the /etc/hosts file for each of the Oracle RAC nodes:

    127.0.0.1        localhost.localdomain localhost
If an entry does not exist for localhost in the /etc/hosts file, Oracle Clusterware will be unable to start the application resources — notably the ONS process. The error would indicate "Failed to get IP for localhost" and will be written to the log file for ONS. For example:
CRS-0215 could not start resource 'ora.linux1.ons'. Check log file
"/u01/app/crs/log/linux1/racg/
ora.linux1.ons.log"
for more details.
The ONS log file will contain lines similar to the following:

Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle. All rights reserved.
2007-04-14 13:10:02.729: [ RACG][3086871296][13316][3086871296][ora.linux1.ons]: Failed to get IP for localhost (1)
Failed to get IP for localhost (1)
Failed to get IP for localhost (1)
onsctl: ons failed to start
...

Adjusting Network Settings

With Oracle 9.2.0.1 and later, Oracle makes use of UDP as the default protocol on Linux for inter-process communication (IPC), such as Cache Fusion and Cluster Manager buffer transfers between instances within the RAC cluster.

Oracle strongly suggests to adjust the default and maximum send buffer size (SO_SNDBUF socket option) to 256KB, and the default and maximum receive buffer size (SO_RCVBUF socket option) to 256KB.

The receive buffers are used by TCP and UDP to hold received data until it is read by the application. The receive buffer cannot overflow because the peer is not allowed to send data beyond the buffer size window. This means that datagrams will be discarded if they don't fit in the socket receive buffer, potentially causing the sender to overwhelm the receiver.

The default and maximum window size can be changed in the /proc file system without reboot:

# su - root

# sysctl -w net.core.rmem_default=262144
net.core.rmem_default = 262144

# sysctl -w net.core.wmem_default=262144
net.core.wmem_default = 262144


# sysctl -w net.core.rmem_max=262144
net.core.rmem_max = 262144

# sysctl -w net.core.wmem_max=262144
net.core.wmem_max = 262144

The above commands made the changes to the already running OS. You should now make the above changes permanent (for each reboot) by adding the following lines to the /etc/sysctl.conf file for both nodes in your RAC cluster:

# Default setting in bytes of the socket receive buffer
net.core.rmem_default=262144

# Default setting in bytes of the socket send buffer
net.core.wmem_default=262144

# Maximum socket receive buffer size which may be set by using

# the SO_RCVBUF socket option
net.core.rmem_max=262144

# Maximum socket send buffer size which may be set by using
# the SO_SNDBUF socket option
net.core.wmem_max=262144

Check and turn off UDP ICMP rejections

During the Linux installation process, I indicated to not configure the firewall option. By default the option to configure a firewall is selected by the installer. This has burned me several times so I like to do a double-check that the firewall option is not configured and to ensure udp ICMP filtering is turned off.

If UDP ICMP is blocked or rejected by the firewall, the Oracle Clusterware software will crash after several minutes of running. When the Oracle Clusterware process fails, you will have something similar to the following in the _evmocr.log file:

08/29/2005 22:17:19
oac_init:2: Could not connect to server, clsc retcode = 9
08/29/2005 22:17:19
a_init:12!: Client init unsuccessful : [32]

ibctx:1:ERROR: INVALID FORMAT
proprinit:problem reading the bootblock or superbloc 22

When experiencing this type of error, the solution was to remove the udp ICMP (iptables) rejection rule - or to simply have the firewall option turned off. The Oracle Clusterware software will then start to operate normally and not crash. The following commands should be executed as the root user account:

  1. Check to ensure that the firewall option is turned off. If the firewall option is stopped (like it is in my example below) you do not have to proceed with the following steps.
    # /etc/rc.d/init.d/iptables status
    Firewall is stopped.
  2. If the firewall option is operating you will need to first manually disable UDP ICMP rejections:
    # /etc/rc.d/init.d/iptables stop
    Flushing firewall rules: [
    OK ]
    Setting chains to policy ACCEPT: filter [
    OK ]
    Unloading iptables modules: [
    OK ]
  3. Then, to turn UDP ICMP rejections off for next server reboot (which should always be turned off):
    # chkconfig iptables off

No comments:

Post a Comment