Crash Test – What happens when you lose L1 & L2 on a UCSM cluster?

L1 & L2, what are they?

The L1 & L2 links are dedicated physical ports (GbE) on the UCS Fabric Interconnect platform, responsible for carrying the heartbeat traffic of the UCSM cluster. To be clear, both links are crucial to the survival of the cluster!

For the record, the Fabric Interconnects (or FIs) are based on Nexus hardware platforms, tweaked for the job:

  • FI Gen 1 (6100XP) are based on Nexus 5000 physical platforms,
  • FI Gen 2 (6200UP) are based on Nexus 5500 physical platforms.

If you are wondering “What are these L1 & L2 things for, on my brand new and shiny Nexus?”, here is the answer: nothing (on a Nexus). They have been there from day one on the Nexus, because the plan from the start was to reuse the same hardware base for the Fabric Interconnects.

It’s just about rationalizing the production line: it’s cheaper to leave the 4 GbE ports on every chassis than to handle several distinct products.

Yes, the company that produced the Nexus product line (Nuova Systems, acquired by Cisco in 2008) had UCS in mind from the beginning. If you look closely at the résumés of Nuova’s co-founders, all the pieces come together. I tend to call them the three musketeers (even if the “band” is mainly from Italy): Mario Mazzola, Prem Jain & Luca Cafiero. It was not their first shot at creating a startup. But I will keep this story for another day, for another blog post, why not.

Coming back to the L1 & L2 ports, and keeping up the analogy with the Nexus platform, we can say that L1 & L2 play much the same role as the vPC Peer-Link in a vPC domain, minus some details. Explained that way, I am sure it is clearer for everybody 🙂

Note: the L1 & L2 ports are connected this way:

  • FI-A L1 <-> FI-B L1,
  • FI-A L2 <-> FI-B L2.

Key point: no switch between the L1 & L2 ports!!! They are channel bonded, and expect another Fabric Interconnect at the other end. Don’t try to work around this by connecting both Fabric Interconnects through a switch: this is explicitly not supported.
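To make the cabling rule concrete, here is a minimal Python sketch that checks a list of cables against the L1 <-> L1 / L2 <-> L2 rule. Everything in it (the `check_cabling` helper, the tuple layout used to describe a port) is a hypothetical illustration of mine, not a UCSM tool or API.

```python
# Hypothetical helper (not part of UCSM): a port is a (fabric, port) tuple,
# a cable is a pair of ports.
EXPECTED = {
    frozenset({("FI-A", "L1"), ("FI-B", "L1")}),
    frozenset({("FI-A", "L2"), ("FI-B", "L2")}),
}


def _fmt(port):
    return f"{port[0]} {port[1]}"


def check_cabling(cables):
    """Return a list of problems; an empty list means the cabling is fine."""
    problems = []
    seen = set()
    for end_a, end_b in cables:
        link = frozenset({end_a, end_b})
        if link in EXPECTED:
            seen.add(link)
        else:
            problems.append(f"unexpected cable: {_fmt(end_a)} <-> {_fmt(end_b)}")
    # Sort so that the report order is stable from one run to the next.
    for link in sorted(EXPECTED - seen, key=sorted):
        a, b = sorted(link)
        problems.append(f"missing cable: {_fmt(a)} <-> {_fmt(b)}")
    return problems
```

Fed with a cable that loops FI-B L1 to FI-B L2, the helper reports one unexpected cable and one missing cable (FI-A L2 <-> FI-B L2).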

The heartbeat traffic

UCSM is the management software of the UCS platform. It’s embedded in each Fabric Interconnect and runs in Active/Standby mode. The motivation and the relevance behind this choice, rather than opting for an installed or VM-hosted application, could feed an entire tribe of Trolls.

– We do love Trolls around here, but they will be fed another time on another blog post (again!) –

Like any clustered application, a signaling link is required to maintain accurate knowledge of each member’s health: that is usually the role of a redundant heartbeat link, in other words a messaging bus.

So we are dealing here with a two-member cluster, and that is not a minor detail.

Losing a member and Split-Brain

The cluster uses the heartbeat link to monitor the health status of each member, and fails the services over to the healthy member when disaster occurs. A member is deemed healthy if a quorum can be reached among the cluster members. When the cluster consists of only two members, it’s the word of one against the other … So we need a referee (called “Tie-Breaker”, “Failover Monitor” or “Witness” depending on the technology) in order to avoid split-brain. On a UCSM cluster, it’s the default gateway that plays this referee role when one member can’t reach the other.
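To illustrate why a two-member cluster needs a referee, here is a small generic sketch in Python. It is a toy model and not UCSM’s actual algorithm: the `Referee` class and the `has_quorum` function are names I made up, and the referee here hands out a single vote on a first-come basis, which is more than a default gateway (a simple reachability target) does.

```python
class Referee:
    """Toy witness: gives its single vote to the first member that asks."""

    def __init__(self):
        self.holder = None

    def request_vote(self, member):
        if self.holder is None:
            self.holder = member
        return self.holder == member


def has_quorum(member, peer_reachable, referee=None):
    """Return True if `member` holds a majority of the 3 possible votes.

    Voters are: the member itself, its peer, and the referee.
    Pass referee=None to model a referee that cannot be reached.
    """
    votes = 1  # a member always votes for itself
    if peer_reachable:
        votes += 1
    elif referee is not None and referee.request_vote(member):
        # The referee is only consulted when the peer cannot be reached.
        votes += 1
    return votes >= 2
```

With the heartbeat down on both sides, only the first member to reach the referee keeps quorum; the other one has a single vote (its own) and must stand down. Without a referee, both members would have to stand down, or both would consider themselves in charge: that is the split-brain.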

Note: the UCSM cluster runs some other tests between FI and chassis. The FI checks whether the chassis’s SEEPROM is reachable. The following article on UCSGuru digs a bit deeper into these tests and the particular case of C-Series servers, which don’t have a SEEPROM: HA with UCSM Integrated Rack Mounts.

What happens when the L1 & L2 links go down on a UCSM cluster?

OK, that was a pretty long introduction to finally come to the first question. What’s the observed behaviour from an operator/administrator point of view?

When one of the links goes down, a “major” alert is raised on UCSM for each FI: “HA Cluster Interconnect Link Failure”.

  • Error “immediately” visible,
  • No impact on data path.

When both links are down (L1 & L2), after a timeout period (a matter of minutes):

  • A new error is raised, this one is “Critical” : “HA Cluster Interconnect Total link failure”,
    • Subordinate FI is deemed unreachable,
    • The B side is shut down (assuming that B was the subordinate), DATA PATH included:
      • FEX,
      • LAN & SAN uplinks.

So keep an eye on your L1 & L2 links, and be sure that at least one of them is always up.

Unfortunately I don’t know the exact duration of the timeout. At least, I can say we are talking about minutes.
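The behaviour described above can be summed up in a few lines of Python. This is only a sketch of what I observed, not UCSM code: the `evaluate` function and the `ClusterLinkState` type are hypothetical, and the timeout value is a placeholder, since the real duration is unknown. I also assume that the “major” alert stays raised while both links are down and the timeout has not yet expired.

```python
from typing import NamedTuple, Optional

# Assumption: placeholder value only. The real timeout is not known,
# beyond the fact that it is a matter of minutes.
ASSUMED_TIMEOUT_S = 120.0


class ClusterLinkState(NamedTuple):
    severity: Optional[str]          # None, "major" or "critical"
    fault: Optional[str]             # fault text shown in UCSM
    subordinate_data_path_up: bool   # FEX, LAN & SAN uplinks of the subordinate


def evaluate(l1_up, l2_up, both_down_for_s=0.0, timeout_s=ASSUMED_TIMEOUT_S):
    """Map the state of L1/L2 to the alert and data path impact."""
    if l1_up and l2_up:
        return ClusterLinkState(None, None, True)
    if l1_up or l2_up or both_down_for_s < timeout_s:
        # One link lost, or both lost but the timeout has not expired yet.
        return ClusterLinkState(
            "major", "HA Cluster Interconnect Link Failure", True
        )
    # Both links down for longer than the timeout: the subordinate is shut.
    return ClusterLinkState(
        "critical", "HA Cluster Interconnect Total link failure", False
    )
```

For example, `evaluate(True, False)` gives a “major” alert with the data path untouched, while `evaluate(False, False, both_down_for_s=600)` gives the “critical” alert with the subordinate’s data path down.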

To keep comparing with Nexus and VPC, we can see a similar behaviour when VPC Peer-Link is down : the secondary is sacrificed on the split-brain altar.

Back to a nominal situation

As soon as at least one link is back up:

  • Resync of the subordinate FI with the primary,
  • Data links reactivation,
  • UCSM cluster back alive.

Some best practices about L1&L2

  • L1 & L2 links should be direct-connected, no switch between them,
  • During a migration or a physical move, you may end up with only L1 active for a period of time; keep that period as short as possible and bring its L2 buddy back as soon as you can,
  • It is Ethernet: keep the distance limitations in mind, don’t stretch beyond 100 meters!
2 comments on “Crash Test – What happens when you lose L1 & L2 on a UCSM cluster?”
  1. Chris says:

    ◾FI-A L1 <-> FI-B L1,
    ◾FI-B L1 <-> FI-B L2.

    I don’t think this is right. The first line is right, but the second line should be FI-A L2 <-> FI-B L2

    • kral2 says:

      You are right Chris, that’s a terrible typo error : FI-B should not loop to himself 🙂
      I corrected the article.

      Thanks for the feedback.
