03-26-2025 02:24 PM - edited 03-26-2025 02:26 PM
I recently joined a new company where they use ICX7450 as core and distribution switches.
The distribution switch is connected by trunk links to six third-party field switches.
Additionally, these field switches have the following connections:
1 and 2 connected together
3 and 4 connected together
5 and 6 connected together
All field switches are L2 switches, so routing is unfortunately not an option.
The distribution switch has per-VLAN 802-1w enabled; all field switches have standard RSTP enabled.
The core switch has STP completely off (no idea why).
The distribution switch has priority set to 0 on all VLANs.
The problem we have:
We always need to keep one port shut down on the distribution switch, for example the port to switch 1 or switch 2 (and likewise either 3 or 4, and either 5 or 6). When we enable both links and create a loop, we get random behavior. Sometimes a broadcast storm starts right away and we see the distribution 7450's CPU spike to 20%, but surprisingly we have also had many situations where everything worked fine for hours.
During one of those periods I logged in to the field switches and saw that they had correctly elected an alternate path and put one port in discarding state. There was no CPU spike on the distribution switch at all. I went home, only to wake up to hundreds of e-mails from SolarWinds and the network completely down. It turned out that, for whatever reason, one field switch had moved its discarding port to forwarding.
Thinking that the problem was classic RSTP on the field switches being unable to talk to per-VLAN RSTP on the 7450, I enabled a single instance of 802-1w on the distribution switch. In theory, everything except the core would then be running classic RSTP. This made the problem even worse: a broadcast storm starts right away when we create a loop.
I have already spent two weeks fighting this and I am pulling my hair out. Why is this happening?
Oh, I should add that our distribution switch is running SPR (router) code because we have VE interfaces on it. Does SPR not work with STP?
Anyway please help me identify the problem.
03-28-2025 09:54 AM
UPDATE:
I might have found the previous network engineer's mistake, but I need confirmation.
Our default VLAN is left unchanged as VLAN 1.
The trunk ports are ethe 1/2/1 to 1/2/4, ethe 1/3/1 to 1/3/3, and ethe 1/4/1.
I believe his mistake was that he specifically removed untagged traffic on all trunk ports.
This is a copy of our config:
vlan 1 name DEFAULT-VLAN by port
no untagged ethe 1/2/1 to 1/2/4 ethe 1/3/1 to 1/3/3 ethe 1/4/1
spanning-tree 802-1w
spanning-tree 802-1w priority 0
Am I right to assume that this essentially blocks BPDUs between our ICX7450 and the third-party switches running plain RSTP, because they expect untagged BPDU frames on trunk ports?
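One hedged way to test that assumption on the distribution switch is to check how the trunk ports sit in VLAN 1 and what 802-1w reports for them. This is only a sketch: the commands are believed to be standard FastIron show commands, VLAN 1 is taken from the config above, and the exact output layout varies by release.

```
! List the members of VLAN 1; check whether the trunk ports
! (1/2/1 to 1/2/4, 1/3/1 to 1/3/3, 1/4/1) appear as untagged members
show vlan 1
! Per-port summary; the tag and PVID columns should show how each trunk port is set up
show interfaces brief
! 802-1w state for VLAN 1; check which ports take part and what role and state they have
show 802-1w vlan 1
```

If the trunk ports are missing from the VLAN 1 output entirely, that would be consistent with the theory, though it does not prove it on its own. A packet capture on one of the field-switch uplinks would show directly whether the BPDUs arrive tagged or untagged.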
Thanks
03-31-2025 09:45 AM
You could be on the right path. What does 'show 8' (short for 'show 802-1w') say on the Dist? Is it the root for all VLANs? What does 'show 8' or the equivalent say on the access switches? Who shows as root down there? If the access switches are not running per-VLAN RSTP, there certainly needs to be a config change. Make sure the VLANs match on both sides. You could possibly run a single 802.1w instance on the Dist switch.
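As a rough sketch of that check on the Dist, assuming 'show 8' is the abbreviated form of the FastIron 'show 802-1w' command and using VLAN 1 only as an example ID:

```
! All 802-1w instances; compare the root bridge ID with this switch's own bridge ID
show 802-1w
! A single VLAN's instance, with the role and state of each port
show 802-1w vlan 1
```

The third-party access switches will have their own equivalent command. The root bridge ID they report should match the Dist switch; if an access switch lists itself or another device as root, the BPDUs are probably not being exchanged as intended.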
Config to convert would be something like this:
enable
conf t
spanning-tree single 802-1w (puts everything in one rstp instance under reserved vlan 4094)
spanning-tree single 802-1w priority 0 (or pick your root bridge value)
I see no problem with RSTP being off on the core in that topology (no looping, so no need to tax the CPU on the core). Let me know what you find or if that helps.