Monday, September 06, 2004

Cheap Switches and Broadcast Storms

One of the continually amazing and wonderful things about computer technology is the steady trend toward smaller, faster, cheaper.

For example, when Kalpana (which was purchased by Cisco in 1994) introduced the first Ethernet switches in 1990, they were huge - about the size of a PC - which, to be fair, was also true of the Cisco routers they would challenge at the core of the LAN.

They were slow - half-duplex 10 Megabit Ethernet - although again, this was the best available at the time.

And the first seven-port model retailed for $10,500, or $1,500 per Ethernet port - still cheaper than a router, which might be three times that cost.

[These figures come from Network Computing magazine, which ranked the Kalpana EtherSwitch as the 5th "Most Important Product of the Decade" in October of 2000.]

Today, I can purchase a D-Link, Linksys, or Netgear eight-port full-duplex 100 Megabit switch for about $40, and can just about put it in my pocket. I don't even have to go to a computer store - Office Depot has Ethernet switches. So does Wal-Mart. They'll probably be in the check-out lane at Kroger soon, next to the batteries and gum.

(I can also purchase an original Kalpana EtherSwitch on eBay right now for $15, plus shipping, if I really want one.)

The cheap switches are meant for home networks and small offices, but I see lots of them on, under, and behind the desktops at larger businesses. The reasons are fairly simple. Offices (and cubicles) which may have started with one occupant now hold two; the desktop PCs are now often joined by a laptop, and occasionally a networked printer (and a VoIP telephone is next!). When a new cable run to the wiring closet or server room might cost $50-$150 and require an expensive core switch upgrade (because all the ports are full!), it seems like a no-brainer to throw a $40 switch into each office.

And it generally is a good solution, but there is a potential problem, which one of my clients recently found out the hard way.

The small, cheap switches are unmanaged, which appeals to their target market. There's nothing to figure out - you plug them in and they work. But since they lack the capability for management, they also lack features which might require configuration, such as Spanning Tree Protocol - which we'll return to in a moment.

I GOT A CALL after lunchtime on a Friday, from a colleague's cell phone - asking if I knew a store in town that was likely to have a couple of 24-port switches in stock. The client's switches - a pair of Cisco Catalyst 2900s - were 'going crazy.' No one could connect to anything; the port status lights were all blinking rapidly, and even after turning the switches off, the problem came right back after just a couple of minutes of operation.

I thought it was highly unlikely that both switches had malfunctioned at the same time, and in any case I wanted to see what was happening (I had originally installed the switches myself) so I told him to skip the store and pick me up, and I grabbed a couple of Catalyst switches from my lab.

Sure enough, the switches were blinking like crazy, but a quick look at the settings and status didn't indicate anything obviously wrong. In the interest of getting the client's network back online while we did forensics on the switches, we plugged in the spares from my lab and moved about forty patch cables from the old switches to mine.

The problem cleared up, everyone started getting back to work - and then my switches went crazy. Since the status lights indicated constant traffic on every port, I figured that any port might give me some clue as to what was going on. Using Microsoft Network Monitor on one of the servers, I captured network traffic for several seconds.

I found hundreds of frames, all from the same MAC address, each a NetBIOS broadcast request for a Master Browser. A Windows client was attempting to build a list of network resources. But Windows clients don't normally broadcast hundreds of NetBIOS requests per second. And I had trouble believing that any of the client machines could even continuously transmit at the rate I was observing.

I queried the switches to determine the source port for that MAC address (the Catalyst switches are managed) and we tracked the source to a computer in shipping, which was connected to a small switch. We disconnected the computer, and the network settled down. I thought perhaps it had a malfunctioning network card; less likely but still possible was a virus or other malware (especially since this machine had recently been configured by FedEx.)

And then the network went down again.

A quick check back at the Catalysts showed that yes, it was the same problem, and the traffic was still originating from the same port. We had another look at the switch, and found that there were five patch cables, even though we had only found three computers in shipping. We followed one cable behind a bench, and shelves, and a stack of boxes, and then back to a different port on the same small switch! Aha! We unplugged the cable, and the problems went away - this time for good.

What my client had been experiencing was a classic broadcast storm, which occurs when an Ethernet network has been configured with a loop.

Data is sent through an Ethernet network in small packages called frames, which, much like a letter, have a destination address, a source address, and a payload, or data. The addresses are media access control, or MAC addresses, and are unique to each network interface card, or NIC.

Switches use the source addresses to determine the location of each computer, printer, and other network device on the network. Frames with a specific source address will only enter one port (plug) on the switch. Once a switch has mapped a MAC address to a specific port, then frames with that address as the destination address will only be sent to that port.

(This is what distinguishes a switch from a hub. Hubs do not build a map of MAC addresses, and simply send an incoming signal out of every other port.)
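For readers who think better in code, here is a rough Python sketch of that address-learning behavior. It is a toy model, not how any particular switch is implemented: the class name, port numbers, and MAC addresses are all made up for illustration. It also assumes one detail not spelled out above, which is standard switch behavior: a frame for a destination the switch hasn't learned yet is flooded like a broadcast.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Toy model of a switch's MAC address table (illustrative names only)."""

    def __init__(self, num_ports):
        self.ports = list(range(1, num_ports + 1))
        self.mac_table = {}  # MAC address -> port where it was last seen

    def receive(self, in_port, src, dst):
        """Handle one frame; return the list of ports it is sent out of."""
        # Learn: the source address lives behind the port the frame entered.
        self.mac_table[src] = in_port
        # Forward: a known destination goes out of exactly one port.
        if dst != BROADCAST and dst in self.mac_table:
            out_port = self.mac_table[dst]
            return [] if out_port == in_port else [out_port]
        # Broadcast or not-yet-learned destination: every port but the inbound one.
        return [p for p in self.ports if p != in_port]


switch = LearningSwitch(num_ports=4)
first = switch.receive(1, "aa:aa:aa:aa:aa:01", "aa:aa:aa:aa:aa:02")  # destination unknown
reply = switch.receive(2, "aa:aa:aa:aa:aa:02", "aa:aa:aa:aa:aa:01")  # destination learned
print(first)  # [2, 3, 4]
print(reply)  # [1]
```

A hub, by contrast, would be the last line of `receive` and nothing else.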

But frames sometimes have a broadcast address as the destination - which is a special address intended for every NIC. Computers use this address when a message must be sent to all other computers - or when the destination MAC address is unknown.

(Consider a license plate number, which uniquely identifies a car. If you see a parked car with its lights left on, you might have the license number announced over the intercom. Everyone will hear the announcement (broadcast) but only the owner of the car needs to take action. Everyone else will ignore the announcement once they realize that the number isn't theirs.)

Because a broadcast frame is meant for every computer, a switch will send the frame out of every port (except for the port where it entered the switch.)

But if there is a loop in the network, a broadcast frame leaving one switch port will enter through another, and the switch will once again send that same frame out of every port except the one it entered. In the case of my client, the loop was on a single switch. When any of the computers on that switch sent a broadcast, the switch would send the frame out of both looped ports - and almost instantaneously, it would appear inbound on the same two ports after going around the loop. The switch would once again send the frames out of every other port, including the other port in the loop, and the switch would become a perpetual-motion frame generator, constructing new broadcast frames as fast as it could process them.

And since one of the ports on the switch led to the rest of the network, the whole network was flooded with an endless stream of broadcast frames. There was essentially no room left on the wire for any other machine to talk, so the network came to a standstill.*
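To make the runaway concrete, here is a small Python simulation of a single broadcast on a five-port switch with one patch cable joining two of its ports. It is only a sketch under simplifying assumptions: the function name and port numbers are invented, a "round" means one trip around the cable, and the model ignores timing and buffer limits entirely.

```python
def simulate_storm(num_ports, loop, ingress, rounds):
    """Count copies of ONE broadcast sent out of each port of a switch.

    loop is a pair of ports joined by a single patch cable (or None), so a
    frame sent out of one of them immediately arrives on the other.
    """
    far_end = {}
    if loop is not None:
        a, b = loop
        far_end = {a: b, b: a}
    sent = {port: 0 for port in range(1, num_ports + 1)}
    arriving = [ingress]  # ports on which a copy arrives this round
    for _ in range(rounds):
        next_arriving = []
        for in_port in arriving:
            # Flood: every port except the one the frame came in on.
            for out_port in sent:
                if out_port == in_port:
                    continue
                sent[out_port] += 1
                if out_port in far_end:
                    # The copy goes around the cable and comes back in.
                    next_arriving.append(far_end[out_port])
        arriving = next_arriving
    return sent


clean = simulate_storm(num_ports=5, loop=None, ingress=1, rounds=4)
looped = simulate_storm(num_ports=5, loop=(4, 5), ingress=1, rounds=4)
print(clean)   # {1: 0, 2: 1, 3: 1, 4: 1, 5: 1}
print(looped)  # {1: 6, 2: 7, 3: 7, 4: 4, 5: 4}
```

Without the loop, each port sees the broadcast once and the switch goes quiet. With it, two copies chase each other around the cable in opposite directions, and every other port gets two fresh copies per trip for as long as the cable stays plugged in.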

SPANNING TREE PROTOCOL is designed to prevent loops in a switched network. All the interconnected switches in a network select one switch as a root bridge - based on the MAC address of the switch, and the switch's bridge priority number.

(The bridge priority needs to be adjustable, because the bridge with the lowest priority number becomes the root. If it weren't adjustable, the root bridge would be determined just by the MAC addresses, which would be something like choosing a leader based on who had the lowest Social Security number. This is why only managed switches typically implement spanning tree.)
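The election itself boils down to a comparison: lowest priority number wins, and the MAC address breaks ties. A minimal Python sketch, with hypothetical switch names and made-up MAC addresses, and assuming the usual default priority of 32768:

```python
def mac_to_int(mac):
    """Turn 'aa:bb:cc:dd:ee:ff' into an integer so MACs compare numerically."""
    return int(mac.replace(":", ""), 16)


def elect_root(bridges):
    """bridges maps a name to (priority, mac); return the name of the root.

    Lowest priority number wins; the lowest MAC address breaks ties.
    """
    return min(
        bridges,
        key=lambda name: (bridges[name][0], mac_to_int(bridges[name][1])),
    )


# Hypothetical switches, all left at the default priority.
bridges = {
    "core": (32768, "02:00:00:00:00:30"),
    "closet": (32768, "02:00:00:00:00:20"),
    "lab": (32768, "02:00:00:00:00:10"),
}
by_default = elect_root(bridges)
print(by_default)  # lab

# Lower the priority number on the switch that SHOULD be the root.
bridges["core"] = (4096, "02:00:00:00:00:30")
after_tuning = elect_root(bridges)
print(after_tuning)  # core
```

Left at the defaults, the switch in the lab wins purely because of its MAC address. One configuration change puts the root where it belongs.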

All the switches then determine which of their ports (based on link speed and 'hops') has the lowest cost path to the root bridge. The switches do this by sending bridge protocol data units, or BPDUs, from each port - containing their own priority number, their current root bridge, and their lowest cost path to the root bridge. While each switch begins with itself as root, it will learn from the incoming BPDUs, until all the switches converge on a single choice for the root bridge, and have determined their own lowest cost path to that root.

If a switch determines that there is more than one path to the root, it will disable (block) the higher cost ports. If two or more ports have the same cost, then the port with the lowest number will be the active port, and the others will be blocked. Blocking ports will never transmit any frames, but will still listen to incoming BPDUs, in order to respond to changes in the network.
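Here is a rough Python sketch of that choice from a single switch's point of view. It follows the simplified description above (lowest cost wins, lowest port number breaks ties) rather than the full tie-breaking rules of the standard. The function name and port numbers are hypothetical, and the cost table assumes the classic 802.1D values of 100 for 10 Megabit, 19 for 100 Megabit, and 4 for Gigabit links.

```python
# Assumed classic 802.1D path costs, keyed by link speed in Mbps.
COST_BY_SPEED_MBPS = {10: 100, 100: 19, 1000: 4}


def choose_ports(paths):
    """Pick the active port toward the root and block the rest.

    paths maps a port number to (link_speed_mbps, cost_advertised_by_neighbor)
    and should list only the ports that lead toward the root bridge.
    Returns (active_port, blocked_ports).
    """
    def total_cost(port):
        speed, advertised = paths[port]
        return advertised + COST_BY_SPEED_MBPS[speed]

    # Lowest total cost wins; the lowest port number breaks ties.
    active = min(paths, key=lambda port: (total_cost(port), port))
    blocked = sorted(port for port in paths if port != active)
    return active, blocked


# Three links straight to the root (which advertises a cost of 0).
tie = choose_ports({1: (100, 0), 2: (10, 0), 3: (100, 0)})
print(tie)  # (1, [2, 3])

# A 100 Megabit link through a neighbor beats a direct 10 Megabit link.
detour = choose_ports({7: (100, 4), 2: (10, 0)})
print(detour)  # (7, [2])
```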

In this way, spanning tree is able to detect loops in the network, and shut down one of the looped ports. For a simple loop, such as the one at my client, the switch will notice that the incoming BPDUs have its own MAC address, and will disable the higher-numbered port.

OF COURSE, THERE'S ALWAYS A CATCH. Spanning tree can take 30 seconds to converge after any change, which means that it might be 30 seconds before any new computer plugged into the network begins working. If the computer is configured to get an IP address automatically using DHCP, it may give up in those 30 seconds - and not try again for five minutes. In the meantime, of course, the computer won't be able to use the network, and a non-technical user might reasonably conclude that something's broken. Cheap, consumer-oriented switches are designed to start working as soon as a new device is plugged in, which means no spanning tree.

And managed switches naturally are more expensive - sometimes by several hundred dollars over unmanaged versions. Many manufacturers don't offer managed switches with fewer than twelve or even 24 ports.

IF YOU'RE GOING TO HAVE THESE THINGS on your network, you need to make sure that all your users understand that loops are big trouble. In the case of my client, an employee had made an effort to clean up the shipping area, and when they discovered an unplugged network cable, intended for temporary use by laptops, they chose to 'clean it up' by plugging it into the switch - right next to its other end! Ouch. The LAN was unusable for most of that Friday, idling more than a dozen salespeople in that office, and even more in Austin and Dallas who connect via VPNs. The direct cost in consulting fees to diagnose the problem was substantial, but the cost in lost sales and lost employee productivity was much greater. You can be sure that the management had a long talk with the person responsible.

Once a network reaches a certain size (and degree of complexity) you should implement a policy that no employee is allowed to plug anything into the network without the approval of the network support personnel. This can, of course, occasionally inconvenience an employee - or a visitor with a laptop - but when an innocent mistake can potentially bring down the entire network, the consequences are too great to ignore.

Other common problems are wireless access points - which can provide a path into the core of your network, bypassing your firewall - and laptops running Windows XP with both a wireless and Ethernet adapter. It's not uncommon for XP with multiple adapters to be configured for bridging, which means that, yup - it's implementing spanning tree, and can cause interesting behavior by forcing the network to re-converge when it's plugged in. It's a bad thing when a visitor's laptop becomes the root bridge.

Know what's plugged in to your network, and only allow authorized people to make changes.

* My more technically aware readers will have observed that normally, plugging a cable back into the same switch (or another switch) will not cause a problem unless the cable is wired as a crossover cable. But because the cheap switches are designed to be consumer-friendly, they will 'helpfully' automatically change to crossover mode!

"Each port on the DSS-5+ supports automatic MDI/MDIX detection providing true 'plug and play' capability without the need for confusing crossover cables or crossover ports."

- from the product description on a 5-port D-Link switch


corq said...

I realize this is an older posting, but thank you for illuminating this - something only experience can testify to!

I have been confronted with a broadcast storm on a small network only once, but after this event, we discovered a wealth of inefficient processes that we were able to address afterward, even though they were 'red herrings' during the original incident. Thank you for sharing the experience and the metaphors, this is very helpful to those of us just learning the subtleties of the network world. (Best advice I was ever given: "Define what the problem ISN'T!")

Anonymous said...

Thank you for posting this! I am not a network specialist. I am learning and several times I have had network storms. I even replaced some switches only to have the problem resurface. This was a very enlightening read. Now I know why all my lights are flashing together like that, lol. I only wish I had a way to locate the source of the problem.

The Tru-Geek said...

You can trace broadcast storms through your managed switches.
switch# debug arp (logs and displays arp requests)
switch# term mon (allows the display to show from your telnet or ssh session)
watch the display for the same IP address over and over; once you see that, enter:
switch# term no mon (turns off term mon)
switch# un all (turns off all debugging)
switch# sh arp
find the IP address that you discovered earlier and write down the MAC that is associated with it.
switch# sh mac address-table
find the MAC you just wrote down and see what port it is associated with. If it is associated with a trunk port do the last step again until you find a single port connection.

That's how you trace a broadcast storm through your network without having to purchase 3rd party tools.