Alright, let me explain a problem we are having with our Ruckus system. I'm going to start at the very beginning and share all our experiences so I can get the best help on here.
So I started a job about 8 months ago at a school, and one of the problems they have apparently always had was the wifi. It was slow, it would disconnect, etc. Well, the school finally gave up some money for us to buy a decent system. We purchased a bunch of R700s to go with our Ruckus system.
At the very start of the year we were having really bad connection issues with everyone. We updated our firmware and then everything was solved. It went 2 months with hardly a single problem! I was very impressed. Then after those 2 months problems started popping up again. For the past two weeks we have been having an issue where the computer shows that it is "connected" but has a yellow triangle with an exclamation mark over the wifi symbol. When that triangle shows up, the access point acts like it's flooded or can't communicate. I try pinging any other device on the network and many of the packets are lost, sometimes they take 500ms to 4000ms, and other times they go fast and I'll get a 20ms response. All of that can happen in a single ping run. I have seen this happen when there were fewer than 5 devices on the access point I was connected to.
At first only one person reported having issues. I swapped out the access point for a different one and everything seemed solved. Well, it wasn't. The problems were happening all over the place the next day.
So our first thought was that the network is being flooded somewhere or maybe there is a loop. So while the connection problem was happening on my laptop, I turned the wifi off and plugged it in via ethernet to the same switch that the access point was on. Everything worked great when I was connected on the wire. Sometimes I was getting pings of <1ms! We went to every switch we had and looked at the logs. Nothing in them pointed to any heavy amount of traffic. Every person we asked who was using an ethernet connection had no problem.
Another thing to note is the problem comes and goes seemingly randomly. We are struggling to understand why this problem happened when we didn't change anything with our wireless or the rest of the network.
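Since the problem comes and goes, one way to pin down when it happens is to log timestamped ping results and group the bad ones into windows that can be lined up against the ZoneDirector and switch logs. Here is a minimal Python sketch of that idea. The function name, the 500 ms threshold (taken from the slow pings described above), and the 5 second grouping gap are all assumptions for illustration, not anything from Ruckus.

```python
from datetime import timedelta

def find_bad_windows(samples, threshold_ms=500.0, max_gap=timedelta(seconds=5)):
    """Group bad ping samples into (start, end, count) windows.

    `samples` is an iterable of (timestamp, rtt_ms) pairs in time order,
    where timestamp is a datetime and rtt_ms is None for a lost packet.
    A sample is "bad" if it was lost or slower than `threshold_ms`.
    Bad samples no more than `max_gap` apart land in the same window.
    """
    windows = []
    current = None  # [start, end, count] of the window being built
    for ts, rtt in samples:
        bad = rtt is None or rtt > threshold_ms
        if not bad:
            continue
        if current is not None and ts - current[1] <= max_gap:
            # Close enough to the previous bad sample: extend the window.
            current[1] = ts
            current[2] += 1
        else:
            # Too far from the previous bad sample: start a new window.
            if current is not None:
                windows.append(tuple(current))
            current = [ts, ts, 1]
    if current is not None:
        windows.append(tuple(current))
    return windows
```

Each returned window gives a start time, an end time, and how many bad pings fell inside it, which makes it easier to check whether the bad periods line up with anything in the controller or switch logs.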
ChannelFly is currently set to turn off after the access points have an uptime of 5 mins, so it's not like the access points are switching channels. Also, I understand that interference can cause problems like this, but it's happening not only to access points that have neighbors but also to access points that have no neighbors and are completely isolated. And considering that everything was working so well for 2 months, I'm wondering: if it was interference, why would it only show up now? One more thing. The connection problem we are having now is not the same as the one we had at the start of the year. At the start of the year you just couldn't even connect, and some access points would "freeze up" until you rebooted them. The current problem is very consistent in the sense that the same symptoms occur on every device when it happens.
We have also rebooted several of the access points and even the ZoneDirector after hours in case that would fix anything. We tried updating to the latest recommended firmware as well. No change.
If there are any suggestions let me know and if you have any questions or want me to troubleshoot something let me know. I didn't include every troubleshooting step we have done because there is too much to include. Please help.... 😞
EDIT: Forgot to include that when the problem happens we aren't relying solely on ping results to show the problem. We are also trying to load web pages etc. I understand pings are not a high priority and thus can be slow to respond if other traffic is being handled. In this case, though, they are a good example of what we are seeing.
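For anyone who wants to compare numbers, the ping results can be boiled down to packet loss plus min/median/max response time. The following is a minimal Python sketch, assuming the usual "time=20ms" / "time<1ms" style reply lines printed by Windows, Linux, and macOS ping. The function name and the 10.0.0.1 address in the sample are made up for illustration.

```python
import re
import statistics

# Matches the round-trip time in typical ping reply lines, e.g.
# "time=20ms", "time<1ms" (Windows) or "time=20.4 ms" (Linux/macOS).
RTT_PATTERN = re.compile(r"time[=<]\s*([\d.]+)\s*ms", re.IGNORECASE)

def summarize_ping(lines, sent):
    """Summarize ping output lines; `sent` is the number of requests sent."""
    rtts = []
    for line in lines:
        match = RTT_PATTERN.search(line)
        if match:
            rtts.append(float(match.group(1)))
    lost = sent - len(rtts)
    summary = {
        "sent": sent,
        "received": len(rtts),
        "loss_pct": 100.0 * lost / sent if sent else 0.0,
    }
    if rtts:
        summary["min_ms"] = min(rtts)
        summary["median_ms"] = statistics.median(rtts)
        summary["max_ms"] = max(rtts)
    return summary

# Hypothetical sample resembling the behavior described above.
sample = [
    "Reply from 10.0.0.1: bytes=32 time=20ms TTL=64",
    "Request timed out.",
    "Reply from 10.0.0.1: bytes=32 time=512ms TTL=64",
    "Reply from 10.0.0.1: bytes=32 time=3980ms TTL=64",
]
print(summarize_ping(sample, sent=4))
```

Running the same summary once on wifi and once on the wire from the same spot gives a side-by-side comparison that is easier to share than raw ping output.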
Can you include some more info? Things like what model ZoneDirector you're using and the firmware version, how many APs in total, and how they are powered (from PoE switches or via injectors). Is this restricted to one SSID or multiple SSIDs? And finally, how many clients are on the AP when this happens? Any idea if this happens on both the 2.4 GHz and 5 GHz bands or just one of them?
Have you looked at the bandwidth graphs of the AP when this occurs?
It is a ZoneDirector 3000 running 18.104.22.168.14. There are about 36 access points in total across two different buildings. The buildings are about 1 mile apart but have a fiber connection, so they are on the same LAN. In the one building we have R500 APs and nobody has mentioned the wifi being an issue there, but there aren't many people consistently using it. All the APs are powered by PoE. The issue occurs on any of our SSIDs (there are 3 in total). The number of clients can range from a full classroom (30 students) down to just me connected and testing, and it still happens. Both 2.4 GHz and 5 GHz are affected. 2.4 GHz is usually worse, but both have had the issue occur. I will monitor the bandwidth graphs more extensively tomorrow and look for anything suspicious.
Alright. We have some extra R500s. We are going to test by swapping out some of the R700s in strategic spots to see whether they have the same issue. If not, we know something is going on with the R700s. I'll make sure to post our findings here.