As a follow-up to this statement, we eventually discovered that our specific issue was related to NAT behaviour. We run a Cisco ASR with a NAT pool for SNATing the clients out to the internet. We ran into some other issues on the ASR relating to address+port exhaustion, which led to a code upgrade. What we noticed after the upgrade was that the NAT behaviour had changed, specifically in how it assigned outside source ports. Before the upgrade, all source ports were randomized on the outside interface. After the upgrade, the original source port was preserved unless there was a conflict, in which case a randomized source port was used for the outgoing session. I believe this change in behaviour made it easier for the iPhones to maintain data paths to messaging services. Suddenly they all stayed connected while the screens were off. The difference was night and day after the NAT code upgrade.
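
For anyone trying to picture the two port-assignment behaviours, here is a rough Python sketch. It is only a toy model of what I described above, not how the ASR implements it. The function names and the port range are made up for illustration, and it ignores the outside address, protocol and destination that a real NAT table would also track.

```python
import random

# Assumed usable port range for the toy pool; not taken from any ASR config.
PORT_RANGE = range(1024, 65536)


def allocate_random(in_use, rng):
    """Pre-upgrade behaviour as described: always pick a random free outside port."""
    free = [p for p in PORT_RANGE if p not in in_use]
    if not free:
        raise RuntimeError("port pool exhausted")
    port = rng.choice(free)
    in_use.add(port)
    return port


def allocate_preserving(inside_port, in_use, rng):
    """Post-upgrade behaviour as described: keep the client's source port
    unless it is already taken, then fall back to a random free port."""
    if inside_port not in in_use:
        in_use.add(inside_port)
        return inside_port
    return allocate_random(in_use, rng)


if __name__ == "__main__":
    rng = random.Random()
    in_use = set()
    # First client using source port 50000 keeps it on the outside.
    print(allocate_preserving(50000, in_use, rng))
    # A second client with the same source port conflicts and gets a random one.
    print(allocate_preserving(50000, in_use, rng))
```

With the preserving policy, a client that reconnects from the same source port will usually show up with the same outside port again, which is my guess as to why the phones had an easier time keeping their sessions alive.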