Tuesday, December 4, 2018

Odd Number of Spines in Leaf-and-Spine Fabrics

In the market overview section of the introductory part of data center fabric architectures webinar I made a recommendation to use larger number of fixed-configuration spine switches instead of two chassis-based spines when building a medium-sized leaf-and-spine fabric, and explained the reasoning behind it (increased availability, reduced impact of spine failure).

One of the attendees wondered about the “right” number of spine switches – does it has to be four, or could you have three or five spines. In his words:

Assuming that one can sufficiently cover the throughput/oversubscription plus resiliency/blast-radius requirements, is it fine to use an odd number of spines or would it be better to stick to an even number?

Equal-cost multipathing (ECMP) is usually implemented with a fixed number of output buckets with multiple buckets mapped to the same next hop (I wrote about Cisco CEF implementation in 2006). Some fields that are assumed to be a good representation of flow entropy are then extracted from each incoming packet, a hash value is computed from those fields, and that value selects the output bucket (and the next hop).

Some ECMP implementations used just the destination IP addresses. A bit later source IP addresses were added to the mix to spread traffic toward the same host across multiple links. Today most implementations use the full 5-tuple (source/destination addresses/ports + protocol ID), and some vendors allow you to add additional fields like IPv6 flow label to support ideas like FlowBender.

If the software uses (approximately) the same number of buckets for each next hop you get ECMP, if it allocates more buckets to one next hop you get unequal-cost multipathing (like what we had with EIGRP variance or MPLS-TE over multiple tunnels).

Some ECMP implementations can use any number of buckets… but that complicates the hash function, so it’s common to see hardware implementations use powers of 2.

In the good old days when you could do ECMP across 8 or 16 output buckets it made sense to make the number of spines a power of two.

Today’s silicon has between 256 and 1024 ECMP buckets, so it shouldn’t matter anymore. Do all vendors implement that capability correctly? I have no idea… If you know more, please write a comment – I always love to hear about juicy real-life details like IPv6 support in some EVPN implementations.

Let's block ads! (Why?)

Thanks to Ivan Pepelnjak (see source)

No comments:

Post a Comment