We are running an event on an SZ300 with firmware 5.1 and 106 APs. We have seen some odd behavior, and it is not yet clear whether I should open a TAC case or whether someone here in the forum has run into the same thing:
1 - Latency failures (even with few users) and constant connection failures on different APs.
2 - Constant channel changes, with Background Scanning set to 600 s (10 minutes).
3 - Failures with no logging on the AP: where are the error logs and failure reasons (failure to join, DHCP handshake failure, open / WPA2 network failure, reassociation failure ...)? I have not found them.
4 - The AP packet capture file came back empty (tried 3 times, at different times), with each capture running for approximately 15 minutes.
5 - A large number of log entries for the same user in the SZ that do not match reality. The user stays on the same AP, even a MAP, but is being disconnected and reconnected. Why? How do I find out whether a channel change is responsible, or some application the client is using, and what causes the WLC to disconnect and reconnect the client, given that it keeps the same IP address? I rule out roaming because only 1 AP is involved.
6 - The same user from item 5 has 5 sessions in SCI. I need to know what the metric is, because from the controller's point of view this user is a single client, so why does the history show 109 entries on the SZ300 and "only" 5 sessions in SCI? What is being evaluated? I see this with all my clients, and the numbers do not agree.
NOTE: The session idle timeout was set to 5 minutes at this event, but WPA2 clients reported difficulty reconnecting after the idle period, which is very unusual.
Regarding the client discrepancy, I had already noticed it on the SCG200, but for the other items I suspect that version 5.1 might be the culprit.
The MAC address filter is for a particular client MAC you want to look for; it is not the AP MAC address. You also need to set up Wireshark to receive the pcap from your particular AP: create a new remote interface where the host is your AP's IP address.
Also, Wireshark needs to be version 2.6.7 or earlier.
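If you want to confirm whether a saved capture is truly empty rather than just hard to read, a minimal sketch like the one below may help. It assumes the file is in the classic pcap format (not pcapng); the function name `count_pcap_packets` is made up for this example and is not part of any Ruckus or Wireshark tool.

```python
import struct

# Classic pcap magic numbers (microsecond and nanosecond timestamp variants).
_MAGICS = {0xA1B2C3D4, 0xA1B23C4D}


def count_pcap_packets(data: bytes) -> int:
    """Count packet records in a classic .pcap byte string (not pcapng)."""
    if len(data) < 24:
        raise ValueError("too short to contain a pcap global header")
    # Work out the byte order from the magic number in the first 4 bytes.
    for endian in ("<", ">"):
        (magic,) = struct.unpack(endian + "I", data[:4])
        if magic in _MAGICS:
            break
    else:
        raise ValueError("not a classic pcap file (pcapng or other format?)")

    offset, count = 24, 0  # the global header is 24 bytes long
    while offset + 16 <= len(data):
        # Record header: ts_sec, ts_frac, captured length, original length.
        _, _, incl_len, _ = struct.unpack(endian + "IIII", data[offset:offset + 16])
        offset += 16 + incl_len
        if offset > len(data):
            break  # truncated final record, do not count it
        count += 1
    return count
```

For example, `count_pcap_packets(open("capture.pcap", "rb").read())` (the file name is a placeholder). A result of 0 means the file holds only the global header, which would be consistent with a filter that matched no frames, such as an AP MAC entered in the client MAC filter field.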
For #1, have you modified the default behavior for latency metrics? On your dashboard, in the Health section, click the "gear" icon and click on AP status. You will see the different measurements and criteria; I believe latency is set to 150 ms by default. Modifying these settings will have an effect on your AP status page.
For #2 - many factors can contribute to this, so it is best to open a TAC case to see what's going on under the hood. Background scanning will change channels depending on a multitude of factors and the scan interval.
For #3 I believe "historical connection failures" needs to be turned on in the zone for this to work.
For #4 - see above
For #5 - check the configured metrics from #1, but for client disconnections. You can also select the AP and click on More, where you will have a troubleshooting section and/or the ability to download support logs for that AP. You can do the same on the client side to see what is going on. A TAC case to look under the hood would help as well.
For #6 - This could be related to the disconnections/reconnections, which would produce different session start/stop times.
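To illustrate how two tools could report such different numbers for the same client, here is a rough sketch. It assumes, purely for illustration, that one side logs every connect event while the other merges events that fall within a gap threshold into one session. How SCI actually counts sessions is not stated in this thread, and the 5-minute gap is simply borrowed from the idle timeout mentioned above.

```python
from datetime import datetime, timedelta


def count_sessions(connect_times, gap=timedelta(minutes=5)):
    """Merge connect events closer together than `gap` into one session.

    The gap-based rule is an assumption for this example, not SCI's
    documented behavior.
    """
    sessions = 0
    previous = None
    for ts in sorted(connect_times):
        # A new session starts only after a quiet period longer than `gap`.
        if previous is None or ts - previous > gap:
            sessions += 1
        previous = ts
    return sessions


# Hypothetical data: 10 reconnects one minute apart, then 3 more an hour later.
start = datetime(2019, 1, 1, 10, 0)
events = [start + timedelta(minutes=i) for i in range(10)]
events += [start + timedelta(hours=1, minutes=i) for i in range(3)]

print(len(events), "log entries ->", count_sessions(events), "sessions")
```

Under that assumption, a client that flaps repeatedly on one AP would produce many controller log entries but only a handful of sessions, which is the kind of gap described in item 6. TAC would be able to confirm the real definition.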
Best to open a TAC case because if you think there is a bug, they would be able to find it.
Thanks for the explanation. I'm setting up a lab to refine the metrics / flags following your tip. In parallel, as we have other events coming up (Game XP - Rock In Rio), I have engaged Ruckus support to help us.
Sorry for the delay; I was away, responsible for the integration of Aruba and Meraki here in our park.