Always avoid Single Points of Failure (SPOFs)

I was recently told that organization XYZ suffered an outage because one of its core devices had no redundancy.
In other words, there was a single point of failure somewhere in the network.
The technical team then kept firefighting until the issue was resolved.

This post is about avoiding this nuisance called the SPOF (Single Point of Failure).
So how are single points of failure avoided?
Let’s quickly go through the technologies that take care of high availability.

A. Introduce device clustering
==============================

Examples are:
1. Stacking switches
These could be stacked Catalyst 3750 switches at the access layer,
or a Catalyst 6500 VSS pair, which works well at the distribution layer.

2. Active / Passive or Active / Active firewalls

Active / Passive is a straightforward concept: one active firewall in the cluster handles all the data traffic while the other unit sits idle, ready to take control once the active unit fails.
Active / Active is where one device is primary for half of the security contexts and the other device is primary for the rest, with each unit standing by for the contexts owned by its peer.
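
To make the active/passive idea concrete, here is a minimal Python sketch of the selection logic. It is only an illustration: the `Firewall` class, the `select_active` helper and the unit names are hypothetical, and a real firewall pair decides this through its own failover protocol and health checks.

```python
from dataclasses import dataclass


@dataclass
class Firewall:
    name: str             # hypothetical unit name
    healthy: bool = True  # stands in for the result of real health checks


def select_active(primary: Firewall, secondary: Firewall) -> Firewall:
    """Return the unit that should handle traffic in an active/passive pair."""
    if primary.healthy:
        return primary
    if secondary.healthy:
        return secondary
    raise RuntimeError("both units are down: nothing can forward traffic")


fw_a = Firewall("fw-a")
fw_b = Firewall("fw-b")

print(select_active(fw_a, fw_b).name)  # the primary handles traffic
fw_a.healthy = False                   # simulate a failure of the active unit
print(select_active(fw_a, fw_b).name)  # the passive unit takes over
```

The point of the sketch is that the passive unit carries no traffic until the active one fails, which is exactly what removes the firewall as a single point of failure.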

3. Virtual Port-Channel (vPC)
The Cisco Nexus family supports the vPC feature.
Two Nexus 5000 switches or two Nexus 7000 switches can be configured as vPC peers.
This feature simplifies the layer 2 topology by removing blocked ports at layer 2.

Blocked ports are removed because, logically, the downstream switch sees the two vPC peers as one device, so there are only two devices with a single port-channel between them.
This single port-channel is in the forwarding state.
Bandwidth-wise, this is better because you get throughput from both links.
This is a considerable improvement over traditional layer 2 networks, where one link would be forwarding and the redundant link would be in the blocking state.

One more benefit of vPC is the way it behaves with FHRPs such as HSRP and VRRP.
vPC interaction with FHRPs ensures that both vPC peers can forward traffic northbound; traffic hitting the HSRP standby node need not cross the vPC peer link.

Last but not least, vPCs can also be double-sided: the northbound devices form a vPC towards the southbound devices, and vice versa.

B. Interface redundancy
========================
Protocols like LACP and PAgP allow interfaces to be combined into a port-channel.
This again avoids layer 2 blocked ports, because the aggregated logical interface is what forwards traffic.
Even in the SAN world, you can combine two interfaces into a single logical interface and trunk VSANs over the logical link.

What this means is higher aggregate throughput, although a single flow is still limited to the speed of one member link.

You can also use port-channel hashing methods such as src-dst-ip or src-dst-mac to influence load sharing across the bundled interfaces.
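
Here is a rough Python sketch of how a src-dst-ip style hash can map flows onto member links. The XOR-and-modulo hash, the `pick_member_link` function and the interface names are assumptions made for illustration; real switches use their own platform-specific hash.

```python
import ipaddress
from typing import Sequence


def pick_member_link(src_ip: str, dst_ip: str, links: Sequence[str]) -> str:
    """Pick one member link for a flow from its source and destination IP."""
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    # Illustrative hash only: XOR both addresses, then reduce modulo the
    # number of member links. Actual hardware hashes differ per platform.
    index = (src ^ dst) % len(links)
    return links[index]


links = ["Eth1/1", "Eth1/2"]  # hypothetical member interfaces

flows = [
    ("10.0.0.1", "10.0.1.10"),
    ("10.0.0.2", "10.0.1.10"),
    ("10.0.0.3", "10.0.1.10"),
]
for src, dst in flows:
    print(src, "->", dst, "uses", pick_member_link(src, dst, links))
```

Because the hash depends only on the address pair, every packet of a given flow stays on the same link, while different flows spread over the bundle. That is why the choice of hashing method matters when most traffic goes to or comes from a handful of addresses.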

C. Redundancy at application level
==================================
Here the application is hosted on multiple servers, and those servers are published behind a virtual IP on a load balancer.
A client request hits the virtual IP of the load balancer.
The load balancer then forwards the traffic to the pool members based on the configured load-balancing algorithm.
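
As a small sketch of that flow, the Python below hands requests arriving at a virtual IP to pool members in round-robin order. Round robin is just one possible algorithm, picked here as an assumption; the `RoundRobinPool` class and all addresses are hypothetical.

```python
from itertools import cycle


class RoundRobinPool:
    """Hand out pool members in turn, one per incoming request."""

    def __init__(self, members):
        if not members:
            raise ValueError("pool needs at least one member")
        self._members = cycle(members)

    def next_member(self):
        return next(self._members)


# Hypothetical virtual IP and pool member addresses
VIRTUAL_IP = "192.0.2.10"
pool = RoundRobinPool(["10.1.1.11", "10.1.1.12", "10.1.1.13"])

for request_id in range(4):
    print(f"request {request_id} to {VIRTUAL_IP} -> {pool.next_member()}")
```

A real load balancer would also health-check the pool members and skip the ones that are down, which is what removes a single server as a point of failure.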

D. Redundancy for server network interfaces
===========================================
Server interfaces can be bundled in active/active or active/passive style.
The IP address is assigned to the logical bundled interface rather than to the physical member interfaces.

E. Redundancy at router level
==============================
SVIs created at the distribution layer can be combined with FHRPs like HSRP, VRRP, and GLBP.
These FHRPs present a gateway that is backed by multiple physically redundant devices.
Alternatively, you could have odd-numbered VLANs active on one gateway and even-numbered VLANs active on the other.
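
The odd/even split can be sketched in a few lines of Python. This only models the outcome, not the protocol: the `active_gateway` function and the switch names are hypothetical, and on real gear the same effect comes from per-VLAN FHRP priorities.

```python
def active_gateway(vlan_id, gateways, up):
    """Return the gateway that is active for a VLAN.

    Odd VLANs prefer gateways[0] and even VLANs prefer gateways[1].
    If the preferred gateway is down, the other one takes over.
    """
    preferred = gateways[0] if vlan_id % 2 else gateways[1]
    backup = gateways[1] if vlan_id % 2 else gateways[0]
    if preferred in up:
        return preferred
    if backup in up:
        return backup
    raise RuntimeError(f"no gateway available for VLAN {vlan_id}")


gateways = ("dist-1", "dist-2")  # hypothetical distribution switches

# Both gateways up: the VLANs are shared between them
for vlan in (10, 11, 12, 13):
    print(vlan, active_gateway(vlan, gateways, up={"dist-1", "dist-2"}))

# dist-1 down: every VLAN falls back to dist-2
for vlan in (10, 11, 12, 13):
    print(vlan, active_gateway(vlan, gateways, up={"dist-2"}))
```

In normal operation both gateways carry traffic, and when one fails the survivor simply becomes active for all VLANs.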
