Tuesday, July 7, 2009

Why crossover cables are not supported in RAC

Many Oracle shops use a crossover cable, literally a single network cable run directly between two machines, as the interconnect between two RAC nodes. Does this work? Yep, you bet. Is it supported? No. Why? Well, it all has to do with how a node reacts when its sister fails in a two-node cluster.

Each node in the cluster constantly checks on the other nodes through both the network (interconnect) and storage (voting disks). If a node loses one or both, it is instructed to commit suicide and reboot itself in hopes of rejoining the cluster healthy and happy.
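
To make that rule concrete, here is a minimal sketch of the decision as described above. It is only an illustration, not how Oracle Clusterware is actually implemented; the names NodeHealth and should_self_evict are made up for this post.

```python
from dataclasses import dataclass


@dataclass
class NodeHealth:
    """Hypothetical snapshot of what one node can currently see."""
    interconnect_ok: bool  # network heartbeat over the interconnect is answered
    voting_disk_ok: bool   # the node can still reach the voting disk


def should_self_evict(health: NodeHealth) -> bool:
    """Return True if the node should reboot itself.

    Follows the rule in the text: losing the interconnect, the voting
    disk, or both means the node takes itself out of the cluster.
    """
    return not (health.interconnect_ok and health.voting_disk_ok)


if __name__ == "__main__":
    for health in (
        NodeHealth(interconnect_ok=True, voting_disk_ok=True),
        NodeHealth(interconnect_ok=False, voting_disk_ok=True),
        NodeHealth(interconnect_ok=True, voting_disk_ok=False),
    ):
        print(health, "-> evict" if should_self_evict(health) else "-> stay")
```

The point of the sketch is that the node needs both checks to pass; a healthy disk path does not save a node that has lost its interconnect.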

If a crossover cable is used and one of the nodes drops, the remaining node has to wait for the TCP timeout, generally 60-300 seconds, before it realizes that the lost node is gone. Only at that point will the cluster remove the lost node. Two things can go wrong during that wait: the surviving node can lock up, literally freezing until the timeout expires, and/or the cluster can become very confused if the dead node restarts and attempts to rejoin while the cluster still thinks it is there. Strange things have been known to happen: many errors are thrown, and at times both nodes are evicted and restart.

Having a switch between the nodes allows a signal to be sent immediately if a node quits responding. The surviving node will then check for 60 seconds and evict the failing node, allowing it to rejoin (upon reboot) a clean cluster without any problems.
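
The difference between the two setups comes down to simple arithmetic on the figures quoted in this post. The sketch below is a rough illustration only: the 120-second reboot time is an assumption picked for the example, and stale_membership_window is a hypothetical helper, not anything shipped with Oracle.

```python
def stale_membership_window(detect_delay_s: int, reboot_time_s: int) -> int:
    """Seconds during which a rebooted node could try to rejoin while the
    surviving node still lists it as a cluster member.

    detect_delay_s: time until the survivor evicts the failed node
    reboot_time_s:  time until the failed node is back up and tries to rejoin
    """
    return max(0, detect_delay_s - reboot_time_s)


if __name__ == "__main__":
    reboot_time_s = 120  # assumption: the failed node is back in two minutes

    # Crossover cable, worst case from the post: 300 second TCP timeout.
    print("crossover:", stale_membership_window(300, reboot_time_s), "s")

    # Switch: eviction after the 60 second check described in the post.
    print("switch:   ", stale_membership_window(60, reboot_time_s), "s")
```

Under those assumptions the crossover case leaves a three-minute window in which the rebooted node can collide with its own stale membership, while the switched case leaves none.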

In short, crossover cables are fine in an emergency, in development, or in any situation where failover is not critical. For production, spend the money on a good switch, two in fact if you can bond your NICs (that’s for another post), for the best scenario to survive a failover with as few issues as possible.
