7150-24s won't talk - what's the missing link?

OKGolombRuler
New Contributor II

This is baking my noodle so I'm going to lay out a bunch of context in hopes someone can help me identify the (probably-trivial) problem before I lose the remainder of my hairline. 🙂

BLUF: Amateur-hour homelabber can't get new 7150-24P to establish link to old 7150-24 over a length of good Cat5, and has run out of ideas.

Context:

* Homelab network consisting of 2x 7150-C12P, 1x 7150-24, 1x 7150-C08P, and a new-to-me 7150-24P. APs are Ruckus 510s with VLANs.  Mesh is enabled, as a couple of APs are a major trenching project away from a wire.

* All devices running 8.0.95d.  All but the switching-only C08P are running the routing image.

* Physical layout: The main -24 (henceforth, the 'core') is stacked via fiber with one C12P in the basement.  It provides a LAG uplink to the firewall and hosts some direct-connect devices and runs to the other switches.  The runs are Cat5 or paired MOCA bridges.  Those 'spoke' switches are ofc not in the stack, but bring trunked VLANs, access ports, and POE for APs where needed.

The Change:

* Moved the C08P from Cat5 to a far-flung MOCA run with a bare AP.  Connected C08P's same ports for uplink and AP, and as expected, immediately and happily began passing traffic.  So far so good.

* Deployed the 24P (henceforth, 'new' switch) to the Cat5 previously hosting the C08P.  Factory reset, OS flash, created VLANs, IP addresses.  Got the AP online.  Configured uplink port, plugged in the cable from the wall and... nothing.  No link.  Port blocking.

Diagnostics:

* Noted that the AP mesh was cheerfully backfeeding VLANs into the switch via the AP's port.  Thought that might cause spanning tree problems, so made sure the 24P had a lower 802-1w priority (higher numerical value) than the core on all VLANs.  No change.

* Spent a few hours wild goose chasing 802.1w before ultimately removing all VLANs from the AP's port (so the AP is powered and backhauls client traffic by mesh, but it won't affect the switch's STP, etc.).

* Tried other ports and modules on the new 24P: no link anywhere I connected the cable.

* Disabled port on both switches and cleared arp, mac, VLANs, then enabled.  Still no link.

* Thought maybe I'd buggered the cable between wall jack and new switch.  Put core port untagged in a client vlan and plugged in client device. Link came right up and started passing traffic.

* Thought maybe new switch had a bad port.  Cleared new switch port config, plugged in client device to same - link came right up.  So I think the switch port HARDWARE is good.

* Thought maybe port security was turned on.  Nope.

* Thought maybe I had different spanning tree versions running on each switch (spanning-tree vs 802-1w).  Nope.

* Thought maybe I'd somehow bobbled the configs, so I wiped both ports' configs and cleared out all the VLAN config, so they were both untagged in VLAN1 (which is my default VLAN, and which I use as a sandtrap; nothing real ever goes in there).  Still no link.

* Thought maybe speed auto-negotiate wasn't working, tried hardcoding both ports to 1000-full.  No link.

All of which suggests to me that it's a PEBCAK / config problem, as both switches will individually link to the same client device, and the cables all seem to work fine.  But having factory-reset the new switch a couple times, I'm running out of ideas for Stupid Operator Tricks I could have inadvertently performed, or things I could check.

The Request:

Can anyone please:

* Point me at a detailed breakdown of the 'show int ether x/y/z' output, as far as what every statement on there means?  Maybe the switch is telling me the problem but I'm not hearing it.  What command/s might you run to find out why the port doesn't have link?

* Suggest to me useful debugging / logging statements I could activate?  I thought of (and tried) debugging port and 802-1w but the former didn't tell me anything interesting and the latter told me a lot about what was happening on every other port but the one in question.

* Suggest other things I might have done to prevent link from occurring, specifically between these two switches?

* Suggest other diagnostically-useful tests to run?

 

1 ACCEPTED SOLUTION

OKGolombRuler
New Contributor II

Thought I'd posted a long reply with nicely-formatted config snippets and show outputs, but it seems to have gone missing.  So let me provide an update with what I found a day later, and how I found out how to find it.

BLUF: Layer 1 problem.  One of the pairs was reading open, i.e. it wasn't connecting right.  Unplugging/replugging at the wall jack/keystone 'fixed' it.

Details: @BenBeck suggested a layer 1 problem.  OK, how do you diagnose L1 problems on FastIron? What tools do we have?

After a bunch of googling and thesaurus time ('troubleshooting'? No. 'diagnostic'?  Ok....) I found this lovely pair of commands:

phy cable-diagnostics tdr x/y/z
sh cable-diagnostics tdr x/y/z

 

and wouldn't you know it, what do we have here:

#sh cable-diagnostics tdr 1/2/1

Port   Speed Local pair Pair Length Remote pair Pair status
----   ----- ---------- ----------- ----------- -----------
1/2/1  1G    Pair A     10-50M      Pair B      terminated
             Pair B     10-50M      Pair A      terminated
             Pair C     10-50M      Pair D      terminated
             Pair D     <=2M                    open

and on the other switch:

#sh cable-diagnostics tdr 1/1/24

Port   Speed Local pair Pair Length Remote pair Pair status
----   ----- ---------- ----------- ----------- -----------
1/1/24 UKNWN Pair A     ???         Pair B      terminated
             Pair B     10-50M      Pair A      terminated
             Pair C     10-50M      Pair D      terminated
             Pair D     <=2M                    open

To have a 'control' output, I ran it on a port with a client and link:

sh cable-diagnostics tdr 1/1/1

Port   Speed Local pair Pair Length Remote pair Pair status
----   ----- ---------- ----------- ----------- -----------
1/1/1  1G    Pair A     <10M        Pair B      terminated
             Pair B     <10M        Pair A      terminated
             Pair C     <10M        Pair D      terminated
             Pair D     <10M        Pair C      terminated

 and at that point I was pretty sure it was a cable problem.  I unplugged the cable from the wall, looked at it, plugged it back in... and BAM, link light.  Still need a little quiet time to determine whether the jumper, the keystone, or what else is going in the garbage can, but here's hoping this little writeup helps someone else with THEIR L1 / link light / diagnostic / troubleshooting / other-search-terms-here problem.  Learn from my fail. 🙂
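
For anyone who wants to script this check across several ports, here is a minimal Python sketch that reads pasted `sh cable-diagnostics tdr` output and flags any pair that is not reported as `terminated`. It is written against the three outputs above only. The function names are made up for illustration, and treating every status other than `terminated` as a fault is my assumption, since `open` is the only other status that shows up here. The last helper relies on the fact that 1000BASE-T needs all four pairs.

```python
import re
from typing import List, NamedTuple, Optional


class PairResult(NamedTuple):
    local: str             # local pair letter, e.g. "A"
    length: str            # length bucket exactly as printed, e.g. "10-50M"
    remote: Optional[str]  # remote pair letter, or None when the column is blank
    status: str            # e.g. "terminated" or "open"


# Matches "Pair A 10-50M Pair B terminated" and "Pair D <=2M open",
# with or without the leading "port speed" columns on the first row.
ROW = re.compile(r"Pair\s+([A-D])\s+(\S+)\s+(?:Pair\s+([A-D])\s+)?(\w+)\s*$")


def parse_tdr(text: str) -> List[PairResult]:
    """Extract one PairResult per pair row; header and dash lines are skipped."""
    results = []
    for line in text.splitlines():
        match = ROW.search(line)
        if match:
            results.append(PairResult(*match.groups()))
    return results


def faulty_pairs(results: List[PairResult]) -> List[str]:
    """Assumption: anything other than 'terminated' is worth a closer look."""
    return [r.local for r in results if r.status != "terminated"]


def all_four_pairs_ok(results: List[PairResult]) -> bool:
    """1000BASE-T uses all four pairs, so one bad pair rules out a gigabit link."""
    return len(results) == 4 and not faulty_pairs(results)


SAMPLE = """\
Port Speed Local pair Pair Length Remote pair Pair status
---- ----- ---------- ----------- ----------- -----------
1/2/1 1G Pair A 10-50M Pair B terminated
Pair B 10-50M Pair A terminated
Pair C 10-50M Pair D terminated
Pair D <=2M open
"""

if __name__ == "__main__":
    pairs = parse_tdr(SAMPLE)
    print("faulty pairs:", faulty_pairs(pairs))
    print("all four pairs OK:", all_four_pairs_ok(pairs))
```

The parser keys off the `Pair X` tokens instead of column positions, so it should cope with output that has lost its alignment in a copy/paste.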

--OKGR

3 REPLIES

BenBeck
Moderator

I'm just on my phone here, so this won't be a complete response. When you say it won't link, do you mean at layer 1? If you look at "show int br" or "show int eth x/y/z" in the problem state, is the port up anywhere? If we're down at layer 1, layers 2 and 3 are irrelevant. "show log" can help. If the port itself is up, then you can issue a "show mac" to see what VLANs are being learned where (assuming there is traffic). "show lldp neighbor" and "show lldp neighbor detail" can be useful if LLDP is on everywhere. You can use "show stat" or "show stat eth x/y/z" to watch port counters. Realistically, the best bet would be to open a support case so support could jump on the CLI and "follow the breadcrumbs". Without output, it is very hard to say what is happening. I'd check layer 1 first, then VLANs at layer 2 ("show vlan" or "show vlan br eth x/y/z"), and go from there.

Ben Beck, RCNA, RCNI, Principal Technical Support Engineer
support.ruckuswireless.com/contact-us
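
The order of checks in the reply above (layer 1 first, then layer 2) can be written down as a small lookup. This is only a sketch of that advice in Python: the function name is hypothetical, the command strings are copied from the reply, and it does not talk to a switch. It just returns which commands to try next for a given port.

```python
from typing import List


def next_commands(port: str, link_up: bool) -> List[str]:
    """Return the show commands to try next, layer 1 before layer 2."""
    if not link_up:
        # Layer 1 is down, so VLAN and spanning-tree checks can wait.
        return ["show int br", f"show int eth {port}", "show log"]
    # Link is up: look at what is being learned and counted on the port.
    return [
        "show mac",
        "show lldp neighbor",
        "show lldp neighbor detail",
        f"show stat eth {port}",
        f"show vlan br eth {port}",
    ]


if __name__ == "__main__":
    for command in next_commands("1/2/1", link_up=False):
        print(command)
```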

OKGolombRuler
New Contributor II

Accepting my own answer to make the hopefully-useful commands and notes more visible, but again kudos to @BenBeck for setting me on the right path with his Layer 1 comment!