Radius server unreachable events

david_henderson
Contributor II
vSZ version 5.2.0.0.699 with 411 R710 APs on campus. Every few days we get a burst of RADIUS server unreachable events. What is odd is that the event details do not even point to our RADIUS server. All events are reported like this:

AP [Int301@F8:E7:1E:2A:A4:40] is unable to reach radius server [127.0.0.1].
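For anyone who wants to tally these events from an exported log or a mailbox dump, here is a minimal sketch that parses the event text quoted above and flags a loopback server address. The pattern is based only on that one sample line, and the function name `parse_event` is hypothetical, not part of any vSZ tooling.

```python
import ipaddress
import re

# Pattern derived from the single sample event in this thread (an assumption);
# other event wordings are not handled.
EVENT_RE = re.compile(
    r"AP \[(?P<name>[^@\]]+)@(?P<mac>[0-9A-Fa-f:]{17})\] "
    r"is unable to reach radius server \[(?P<server>[^\]]+)\]"
)


def parse_event(line):
    """Return AP name, MAC, server and a loopback flag, or None if no match."""
    match = EVENT_RE.search(line)
    if match is None:
        return None
    server = match.group("server")
    try:
        loopback = ipaddress.ip_address(server).is_loopback
    except ValueError:
        # The server field was a hostname or something else that is not an IP.
        loopback = False
    return {
        "ap": match.group("name"),
        "mac": match.group("mac"),
        "server": server,
        "loopback": loopback,
    }


sample = "AP [Int301@F8:E7:1E:2A:A4:40] is unable to reach radius server [127.0.0.1]."
event = parse_event(sample)
```

Counting the parsed events per AP or per hour would show whether a burst comes from a few APs or from the whole campus.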

Of course, 127.0.0.1 is not our RADIUS server; we use Cloudpath as our RADIUS server. Our Wi-Fi is rock solid, so there are no symptoms other than a rash of these events every few days.

Any ideas what causes this?
63 REPLIES

Would be curious to know the outcome of said ticket.

david_black_594
Contributor III
The problems with version 5 seem to surface on larger networks. We have several clients with very large production networks, each with several hundred sites, thousands of APs, and multiple multi-node clusters worldwide.  We are responsible for managing and maintaining these production networks so we are very cautious with upgrades.  We do our own testing and we've not found a single version 5 release that we like.  We've kept all client production networks on 3.6.2 with one exception - a single 4-node cluster managing APs and switches that's running 5.1.0.0.496 (the version we dislike the least).

We very much look forward to having a stable and tolerable v5 release one of these days, but because our neck is on the line, we intend to keep our clients on 3.6.2 until there is a release that passes our testing. If you check the Ruckus support site, you'll also find that TAC's recommended version for SZ or vSZ is 3.6.2.0.222.

dave_watkins_74
Contributor
Sadly we started our vSZ journey on 5, and we need support for new APs, so we need to keep up to date. We're at a couple of hundred APs and I've found issues on every release, but we need the new AP support so we have to upgrade. I should go and look whether they have fixed the email address fields to accept gTLDs longer than 6 chars, or whether they have fixed the multiple realm support for RADIUS-based admin logins that they broke with the previous release.

david_henderson
Contributor II
I was the one who opened the ticket with support about this issue and they still have not resolved it. We have 411 R710 APs and are seeing two things. Occasionally, maybe once or twice each week, we get a dozen or so "radius server cannot be reached" events. Twice now, though, this has turned into a cascade of these events. Just yesterday I received literally thousands of emails with this same event. These continued overnight with thousands more. The last time this happened, the only way I could stop them was to reboot both of my vSZ controllers, which I am in the process of doing right now.

I think this is a bug in the code, but I have not heard this from support.

dave_watkins_74
Contributor
I've done a lot of digging on this one and I _think_ our primary issue is that NPS silently discards packets with attributes it doesn't support or understand. There isn't any way to tell it to send reject messages for these requests, so if you deal with a lot of BYOD devices that aren't configured correctly, you're at their mercy. Enough of these unanswered requests cause failovers.

Longer description here
https://community.jisc.ac.uk/groups/eduroam/article/improving-reliability-microsoft-nps-authenticati...
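To illustrate why a silent discard is worse than a reject here, below is a rough sketch of a client that marks a server unreachable after a run of unanswered requests. The function `simulate_failover` and the threshold of three consecutive timeouts are assumptions made for illustration only; this is not the actual vSZ, AP, or NPS logic.

```python
def simulate_failover(responses, max_timeouts=3):
    """Return the index of the request that triggers failover, or None.

    responses is a sequence of "accept", "reject" or "drop".
    max_timeouts is a hypothetical threshold of consecutive unanswered
    requests after which the client treats the server as unreachable.
    """
    consecutive = 0
    for index, response in enumerate(responses):
        if response == "drop":
            # No reply at all: from the client's side this looks like a dead server.
            consecutive += 1
            if consecutive >= max_timeouts:
                return index
        else:
            # Any reply, even a reject, proves the server is alive.
            consecutive = 0
    return None


rejected = simulate_failover(["accept", "reject", "reject", "reject"])
dropped = simulate_failover(["accept", "drop", "drop", "drop"])
```

In this model three rejects in a row never trigger a failover, while three drops in a row do, which matches the behaviour described above for misconfigured BYOD clients.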

Sadly we're also seeing GPO-configured Windows clients not being assigned the right VLAN when they roam between APs, which might be related to RADIUS failovers. They drop down to the default VLAN assigned on the SSID, not the RADIUS-assigned one, which of course changes their IP and causes havoc.