L3 Point of View: SAN Topologies
Park Place Hardware Maintenance
What really goes on when you plug a port into a healthy SAN switch, and what does this tell us about the ports themselves? Why isn’t it more of a “plug and play” operation? Ethernet switches set the mark for autonegotiation, despite some pitfalls. However, Ethernet has been more widespread over the years, and has therefore had more opportunities to mature. Fibre Channel switch ports have since overcome many limitations and become more plug and play as well, in two ways: speed and topology.
Real-world implementations of the Fibre Channel (FC) Link Speed Negotiation state machine have become much more refined, so “fixing” ports at a given speed to improve their reliability is far less common with recent devices. Similarly, the topology autonegotiation mechanism (traditionally “loop first, then link”) has improved to the point where you usually don’t have to think about its past pitfalls when troubleshooting more recent devices. The same level of progress has been made in ISL port (E_Port) autonegotiation.
As a result, once they are plugged in, the ports can gain synchronization and initialize largely on their own before performing login and registering with the switch. However, the FC switch admin is still expected to administer zones. Why is that? And what do zones do?
One of the primary functions of a zone is to allow device ports that have already registered with the switch database to perform login and discovery with the correct device ports across the fabric. Zones are created by the switch administrator. Because the switch notifies a port when another port in its zone loses link, administrators are generally instructed to create small 1:1 (target:initiator) zones (aside from special cases), which keeps those notifications limited to the ports that actually need them. This type of controlled environment can be a blessing or a curse depending on how well it is managed.
How is the creation of a zone actually accomplished by the switch itself? Backing up for a moment, when the device ports register with the switch database, they provide a few key pieces of information:
- WWPN – the unique 8-byte port identifier, similar to an Ethernet port’s MAC address, as defined by the IEEE OUI guidelines
- WWNN – the unique 8-byte node identifier for the device associated with the WWPN, defined by the same guidelines
- Symbolic Port Name – ASCII string associated with the port, provided by the RSPN_ID request, as defined by the FC-GS specs
- Symbolic Node Name – ASCII string associated with the WWNN, provided by the RSNN_NN request, as defined by the FC-GS specs
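To make the "IEEE OUI guidelines" point concrete, here is a minimal Python sketch that pulls the format nibble (NAA) and the embedded OUI out of an 8-byte WWN. It assumes the common NAA 1, 2, and 5 layouts; the function name `parse_wwn` and the sample WWNs are made up for illustration.

```python
def parse_wwn(wwn: str) -> dict:
    """Split a colon-separated 8-byte WWN into its NAA type and IEEE OUI."""
    raw = bytes.fromhex(wwn.replace(":", ""))
    if len(raw) != 8:
        raise ValueError("a WWN must be exactly 8 bytes")
    value = int.from_bytes(raw, "big")
    naa = value >> 60  # the top 4 bits select the name format
    if naa in (1, 2):
        # NAA 1/2: 4-bit NAA, 12-bit field, then a 48-bit MAC-style address
        # whose upper 24 bits are the OUI.
        oui = (value >> 24) & 0xFFFFFF
    elif naa == 5:
        # NAA 5: 4-bit NAA, 24-bit OUI, 36-bit vendor-assigned value.
        oui = (value >> 36) & 0xFFFFFF
    else:
        oui = None  # other formats are not handled in this sketch
    return {"naa": naa, "oui": None if oui is None else f"{oui:06x}"}


# Illustrative (made-up) WWPNs in two different formats.
print(parse_wwn("10:00:00:00:c9:12:34:56"))
print(parse_wwn("50:06:01:60:12:34:56:78"))
```

The OUI is what ties a WWN back to the manufacturer's IEEE registration, much as it does for an Ethernet MAC address.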
The switch will also assign each device port a corresponding 3-byte NPort_ID that is unique across the fabric. Its first byte carries the domain identifier of the switch, and the full value is used as the source and destination identifier in all of the framing exchanges. Here, you can immediately see why it can be disruptive to reassign domain identifiers (not only can this cause disruptive ISL frame exchanges to take place, but it will generally also force reassignment of all of the associated NPort_IDs).
Why do we need to know about port identifiers to understand zoning? The frames that are used to convey the zoning definitions utilize the identifiers that are associated with the ports. Traditionally there are two types of zones:
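The relationship between the domain identifier and the port address can be sketched in a few lines of Python. This assumes the usual domain/area/port byte layout of the 24-bit address; the helper names are hypothetical, and `renumber_domain` only illustrates the arithmetic, since a real switch hands out new addresses through fresh logins.

```python
def split_nport_id(nport_id: int) -> tuple:
    """Break a 24-bit port address into (domain, area, port) bytes."""
    if not 0 <= nport_id <= 0xFFFFFF:
        raise ValueError("an NPort_ID is a 24-bit value")
    domain = (nport_id >> 16) & 0xFF  # identifies the switch in the fabric
    area = (nport_id >> 8) & 0xFF
    port = nport_id & 0xFF
    return domain, area, port


def renumber_domain(nport_id: int, new_domain: int) -> int:
    """Show the address a port would carry if only the domain byte changed."""
    return ((new_domain & 0xFF) << 16) | (nport_id & 0x00FFFF)


old_id = 0x010200
print(split_nport_id(old_id))             # domain 1, area 2, port 0
print(hex(renumber_domain(old_id, 5)))    # every address behind domain 1 changes
```

Because the domain byte is part of every address behind that switch, changing it invalidates the S_ID/D_ID values that every logged-in port has been using.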
- Hard zones – switch zones that are defined via physical switch ports
- Soft zones – switch zones that are defined via the WWPNs of the devices that are connected to the switch ports
Nowadays, we see mostly soft zones. This means that if you move a cable from one switch port to another, you no longer have to alter the corresponding zone(s). That is because the switch keeps track of the relationship between the WWPNs and its own physical ports internally. On a secondary note, it can also mean that the zones are easier for us to read and decipher across multiple ISLs, especially when they are coupled with RSPN strings and/or WWPN aliases.
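The cable-move behavior can be sketched as two small tables: one that the switch maintains (WWPN to physical port) and one that the administrator maintains (zone to WWPNs). The names, WWPNs, and port numbers below are made up for illustration and do not reflect any vendor's actual data structures.

```python
# Switch-maintained table: WWPN -> physical switch port (hypothetical values).
name_server = {
    "10:00:00:00:c9:aa:aa:01": 4,   # host HBA
    "50:06:01:60:bb:bb:00:01": 12,  # array port
}

# Administrator-maintained soft zone: members are WWPNs, not switch ports.
zones = {
    "host1_array1": {"10:00:00:00:c9:aa:aa:01", "50:06:01:60:bb:bb:00:01"},
}


def zoned_switch_ports(zone_name: str) -> list:
    """Resolve a zone's WWPN members to the physical ports they sit on now."""
    return sorted(name_server[w] for w in zones[zone_name] if w in name_server)


def move_cable(wwpn: str, new_port: int) -> None:
    """Model re-plugging a device: only the switch's own table changes."""
    name_server[wwpn] = new_port


print(zoned_switch_ports("host1_array1"))  # ports before the move
move_cable("10:00:00:00:c9:aa:aa:01", 7)
print(zoned_switch_ports("host1_array1"))  # zone definition untouched
```

With a zone defined by physical switch ports, the same cable move would have required editing the zone itself.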
Once the devices have registered with the switch database (so the switch can track which physical port each WWPN is plugged into), the switch is ready to provide devices with the zoning information that they need about each other. In most cases (ignoring the hierarchical and database-rebuilding frames that are exchanged across the ISLs), this means conveying zoned-in target port information to initiator ports.
To accomplish this, the initiator port will transmit a query frame, such as a GID_FT, to the switch. Assuming everything is fine, the switch will respond with an appropriate response frame, such as a GID_FT ACC, which contains the NPort_ID(s) of the other ports in its zone(s). If the target port goes offline and comes back up, the switch will generally let the initiator port know via an RSCN frame with the NPort_ID of the target in the payload, so that the initiator can log back in with the target in a timely manner.
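A rough Python model of this exchange follows. It is a sketch under simplifying assumptions: the tables, WWPNs, and function names are hypothetical, the query excludes the requester itself, and RSCN delivery ignores whether a port has asked to receive state change notifications. `0x08` is the FC-4 type code for SCSI-FCP.

```python
from dataclasses import dataclass

FCP_TYPE = 0x08  # FC-4 type code for SCSI-FCP


@dataclass(frozen=True)
class Registration:
    nport_id: int
    fc4_types: frozenset


HOST = "10:00:00:00:c9:aa:aa:01"
ARRAY_1 = "50:06:01:60:bb:bb:00:01"
ARRAY_2 = "50:06:01:60:bb:bb:00:02"  # registered, but not zoned with HOST

registrations = {
    HOST: Registration(0x010400, frozenset({FCP_TYPE})),
    ARRAY_1: Registration(0x010C00, frozenset({FCP_TYPE})),
    ARRAY_2: Registration(0x020100, frozenset({FCP_TYPE})),
}
zones = {"host1_array1": {HOST, ARRAY_1}}


def zoned_peers(wwpn: str) -> set:
    """Collect every other WWPN that shares at least one zone with wwpn."""
    peers = set()
    for members in zones.values():
        if wwpn in members:
            peers |= members
    peers.discard(wwpn)
    return peers


def gid_ft(requester: str, fc4_type: int) -> list:
    """NPort_IDs a GID_FT-style accept would list for this requester."""
    return sorted(
        registrations[p].nport_id
        for p in zoned_peers(requester)
        if p in registrations and fc4_type in registrations[p].fc4_types
    )


def rscn_recipients(changed: str) -> list:
    """NPort_IDs that would be told when the changed port bounces."""
    return sorted(
        registrations[p].nport_id for p in zoned_peers(changed) if p in registrations
    )


print([hex(i) for i in gid_ft(HOST, FCP_TYPE)])
print([hex(i) for i in rscn_recipients(ARRAY_1)])
```

Note that the second array port never appears in the host's answer, even though it is registered on the fabric, because no zone connects the two.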
The identifiers used in frames such as the GID_FT ACC and the RSCN are not the 8-byte WWPNs, but rather the 3-byte NPort_IDs. These are the same identifiers that are used in the S_ID and D_ID fields of the frames themselves (once they have been assigned to the physical device ports by the switch via the FLOGI ACC, or to virtual ports via the FDISC ACC). The FDISC allows one physical port to log in with the switch under multiple port identifiers, using the mechanism defined by NPIV. This, when combined with soft zoning, allows individual virtual ports, such as the ones associated with the “guest” operating systems that reside on VMware ESX hosts, to be zoned in with other physical and/or virtual ports. In this manner, virtual platforms can access their own chunk of bandwidth on a physical port without having to own the entire piece of real estate.
In a world of ASIC, SFP, and cable issues, one last factor that isn’t always automated, and is often overlooked, is buffer credits. (These are not to be confused with SCSI queue depth, which operates at a much higher level in the SAN stack.) In certain situations, such as those involving long distances or other sources of latency on one side or another, it’s important to have some idea of what they are doing. Unlike ACKs, which provide end-to-end flow control, buffer credits provide flow control for a given hop (therefore, they can be tallied and communicated in the fields of various frames, but primitive signals are the only structures needed for the actual record keeping). So at a basic level, if the transmitter has used up all of the credits that the receiver granted it, it can’t send its next frame until a credit is returned. There are adjustments that can be made to even things out. For instance, buffer credit allocations can often be increased for a given switch port, and some devices that are built to help others traverse long distances will find ways to inject their own buffer credits, even if they are relatively transparent otherwise. Ideally, however, it’s best to try to isolate any sources of latency when troubleshooting credit issues, and take it from there.
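As a rough illustration of why distance matters, the sketch below models per-hop credit bookkeeping and a back-of-the-envelope estimate of how many credits keep a long link busy. The class and function names are hypothetical, and the defaults (a full-size frame of about 2148 bytes and roughly 5 µs of fiber delay per kilometer) are assumptions for the estimate, not vendor sizing guidance.

```python
import math


class CreditedLink:
    """Transmitter-side view of buffer-to-buffer credits on one hop."""

    def __init__(self, bb_credit: int):
        self.bb_credit = bb_credit   # credits granted by the receiver at login
        self.available = bb_credit

    def send_frame(self) -> bool:
        """Consume one credit; refuse to send when none remain."""
        if self.available == 0:
            return False
        self.available -= 1
        return True

    def receive_r_rdy(self) -> None:
        """An R_RDY primitive from the receiver returns one credit."""
        self.available = min(self.bb_credit, self.available + 1)


def credits_needed(distance_km: float, rate_mbytes_per_s: float,
                   frame_bytes: int = 2148,
                   fiber_delay_us_per_km: float = 5.0) -> int:
    """Estimate credits required to keep a link of this length streaming."""
    round_trip_us = 2 * distance_km * fiber_delay_us_per_km
    frame_time_us = frame_bytes / rate_mbytes_per_s  # bytes / (MB/s) = microseconds
    return math.ceil(round_trip_us / frame_time_us)


link = CreditedLink(bb_credit=2)
print(link.send_frame(), link.send_frame(), link.send_frame())  # third send stalls
link.receive_r_rdy()
print(link.send_frame())                                        # credit returned

print(credits_needed(10, 100))  # 10 km at roughly 100 MB/s
print(credits_needed(10, 800))  # same distance at roughly 800 MB/s
```

The estimate grows with both distance and link speed, which is why a credit allocation that was adequate on a short or slow link can throttle a longer or faster one.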
Claire Kraft, Level III Engineer