During an outage, will your systems continue to run and provide services to your customers? Unless you test it, you won’t know.
Testing the redundancy in your network is very important. Unfortunately, it usually doesn’t get done. For years I have suggested that we perform annual redundancy testing. Every year the testing gets denied.
Here is a list of things I would like to test.
– Make sure each power supply (in dual-powered equipment) can handle the load by itself.
– Dual-connected servers stay online when one of the two switches is powered off.
– Move the entire data center to one UPS, then to the other, to make sure each can handle the load.
– Fail each Active/Standby pair over to the Standby to make sure the standby configuration and hardware work.
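The power supply and UPS items can be sanity-checked on paper before anything is unplugged. Here is a minimal sketch in Python, assuming you have a measured total load and the rated capacity of each unit. The function name, the 80% headroom margin, and every number below are made-up placeholders, not values from my network.

```python
def units_that_cannot_carry_load(total_load_watts, unit_capacities_watts, headroom=0.8):
    """Return the names of units that could not carry the whole load alone.

    headroom is an assumed safety margin: a unit is only trusted up to
    this fraction of its rated capacity.
    """
    return [
        name
        for name, capacity in unit_capacities_watts.items()
        if total_load_watts > capacity * headroom
    ]


# Hypothetical figures for illustration only.
ups_capacity = {"UPS-A": 10000, "UPS-B": 8000}
print(units_that_cannot_carry_load(7000, ups_capacity))  # prints ['UPS-B']
```

A calculation like this only tells you which tests are likely to fail. It does not replace actually moving the load.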
During a switch upgrade I took down multiple services. All of the servers were supposed to be dual-homed to the other switch, but they still went down. Later we found out that the redundant switch port was not configured. Another server never had the network cards set up for active/standby, even though the standby network card was connected to the standby switch. Another server didn’t even have the cable run to the second switch. Everybody thought these services were redundant, but the redundancy didn’t work.
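The three gaps I found (no cable, unconfigured switch port, NICs not set up for active/standby) make a simple checklist that could be run against an inventory before any failover test. This is a rough sketch in Python. The `ServerUplinks` record, its field names, and the example servers are all hypothetical, and the data would have to be filled in by hand or from your own tooling.

```python
from dataclasses import dataclass


@dataclass
class ServerUplinks:
    """Hypothetical inventory record for one dual-homed server."""
    name: str
    cable_to_second_switch: bool
    second_switch_port_configured: bool
    nics_active_standby: bool


def redundancy_gaps(server):
    """List the reasons this server would not survive losing its primary switch."""
    gaps = []
    if not server.cable_to_second_switch:
        gaps.append("no cable to second switch")
    if not server.second_switch_port_configured:
        gaps.append("second switch port not configured")
    if not server.nics_active_standby:
        gaps.append("NICs not set up for active/standby")
    return gaps


# Made-up servers for illustration only.
inventory = [
    ServerUplinks("app01", True, True, True),
    ServerUplinks("app02", True, False, True),
    ServerUplinks("db01", False, False, False),
]
for server in inventory:
    print(server.name, redundancy_gaps(server) or "looks redundant on paper")
```

Even a server that passes all three checks only looks redundant on paper. The real proof is still powering off a switch.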
Many times I don’t have time to perform failover testing when I deploy new equipment. Without testing in the production environment, we don’t know how the other equipment is going to react to a failure. In my network we have many TAPs and bypass switches. If the switch on one side of the TAP fails, does the port go down on the other side of the TAP? I hope it goes down. The device on the other side needs to know that the neighbor just went down.
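If the device on the far side of the TAP happens to be a Linux host (an assumption, since many of these devices are not), one way to see whether the link loss was passed through is to read the interface's carrier state from sysfs while the switch on the other side is powered off. This is a small sketch in Python. The interface name `eth0` is a placeholder, and the `sysfs_root` parameter exists only so the function can be pointed at a test directory.

```python
from pathlib import Path


def link_is_up(interface, sysfs_root="/sys/class/net"):
    """Return True if the kernel reports carrier (link) on the interface."""
    carrier_file = Path(sysfs_root) / interface / "carrier"
    try:
        return carrier_file.read_text().strip() == "1"
    except OSError:
        # The read fails if the interface does not exist or is
        # administratively down; treat both cases as "no link".
        return False


# Placeholder interface name; run this before and after failing the switch.
print("eth0 link up:", link_is_up("eth0"))
```

If this still reports the link as up after the switch is powered off, the TAP is not passing the failure through, and the device behind it will keep sending traffic to a dead neighbor.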
Do you perform annual redundancy testing? If so, what problems have you found and what outages have you avoided?
Please share. Maybe your experiences can help others justify performing these tests!