"heartbeats lost" after upgrade to Unleashed 200.7.10.2.339

erin_mcclellen
Contributor
I have a tiny network with 2 R600s in Unleashed mode.  Last night, I upgraded to the latest (200.7.10.2.339) that apparently came out about a week ago. Multiple times after that, I'm seeing "heartbeats lost" messages.

Looking at the switch ports, I'm not losing link - the last up/down of the links was the reboot after the upgrade.

Also weird is that this is a 2 AP setup, and there are reports for both APs. How does an AP lose a heartbeat from itself?

14 REPLIES

erin_mcclellen
Contributor
Dang, the R-600 is now outdated? I mean is there a huge difference between this and an R-610?

Anyhow, as I suspected, this is absolutely looking like a bug in the firmware.  I downgraded a site from the latest 200.7 release to the latest 200.6 release and the following things have happened since then:

  • The slave unit has stopped rebooting itself nightly (we tested the cable multiple times, swapped POE injectors prior to this).
  • The "heartbeats lost" messages, both from the slave and from the master to itself, have not happened yet (17 hours so far; we would normally see this at least 4-5 times a day)
  • The client had issues with occasional drops, which I think happened concurrent with the "heartbeat lost" messages
I think we bought a contract for this location today, any tips for getting an actual resolution on this when we open a ticket? I can't keep these on 200.6 forever...

michael_h014h2t
New Contributor II
I’ve got the same issue with a pair of R500s.  The heartbeat issue just started about a month or so ago.  It sounds like the solution is to back rev them to 200.6.  I did confirm that I’m on 200.7.  I’m not sure how to back-rev them.  Any hints?

Thanks.

hpatel99
New Contributor III
The only problem with downgrading the firmware to 200.6 is that the nicer built-in captive portal and customization options are gone too.

In my case at least, the heartbeat errors have become much less frequent over time, as I stated above. So I’m staying with 200.7.

michael_h014h2t
New Contributor II
Problem solved.  For me, it was a router problem.  Normally, my infrastructure has reserved IP addresses and specific host name configurations.  As the result of a few router issues, I’ve been swapping some routers around, got lazy and my infrastructure’s been pulling IP addresses from the DHCP pool with no special configuration.  I reserved IP addresses for my two problematic APs and configured their host names as I usually do.  I haven’t had a heartbeat problem since.

I don’t set a preference on which AP serves as the master, primarily for failover purposes.  What I notice in this configuration is that they mask themselves as each other.  For example, if I try to connect to the IP address of the non-master, it will connect me to the master.  Looking at my router, I noticed two things.
  1. About the time the APs would lose their heartbeat, they would also disappear out of the DHCP table.
  2. When the APs were present in the DHCP table (in the pool), they had the same host name.
My theory is that as the two APs try to mask themselves as each other, at least from the router’s perspective, the router got confused.  When the APs would attempt a heartbeat, the packets would occasionally go to the wrong AP and the heartbeat would fail.
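
For anyone who wants to check for the same condition, the duplicate host name symptom can be spotted by scanning an export of the router's DHCP lease table. This is only a rough sketch: the `(ip, mac, hostname)` tuple layout, the sample addresses, and the `find_duplicate_hostnames` name are all made up for illustration and do not come from any Ruckus or router API, so adapt it to whatever your router actually exports.

```python
from collections import defaultdict


def find_duplicate_hostnames(leases):
    """Return {hostname: [(ip, mac), ...]} for names claimed by more than one MAC.

    `leases` is an iterable of (ip, mac, hostname) tuples. This layout is an
    assumption for illustration, not a real router export format.
    """
    by_host = defaultdict(set)
    for ip, mac, hostname in leases:
        name = hostname.strip().lower()
        if name:  # ignore leases that did not send a host name
            by_host[name].add((ip, mac.lower()))
    # Keep only host names that map to two or more distinct MAC addresses.
    return {
        name: sorted(entries)
        for name, entries in by_host.items()
        if len({mac for _, mac in entries}) > 1
    }


# Hypothetical lease table: two APs announcing the same host name.
sample_leases = [
    ("192.168.1.20", "AA:BB:CC:00:00:01", "ruckus-ap"),
    ("192.168.1.21", "AA:BB:CC:00:00:02", "ruckus-ap"),
    ("192.168.1.50", "AA:BB:CC:00:00:03", "laptop"),
]

duplicates = find_duplicate_hostnames(sample_leases)
for name, entries in duplicates.items():
    print(name, entries)
```

If this reports both APs under one name, reserving an address and setting a distinct host name for each AP, as described above, removes the ambiguity.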

So far so good.  I’m still running 200.7.

erin_mcclellen
Contributor
It would still be really nice if Ruckus would fix this. The current workaround of running outdated firmware is far from ideal.

It's really easy to reproduce - every site we have with Unleashed was doing this until we rolled back.