I'm using the net-snmp-lvs module to expose LVS statistics over SNMP so I can graph them (I'm using OpenNMS).
I have a virtual HTTP service that is balanced across eight real servers. In testing, everything seemed to work just fine and I got some nice graphs that show the Connection Rate, Packet Rate, and Byte Rate for the virtual service and each of the real servers.
This morning we attempted a cutover, i.e., we redirected real traffic to the new service. Sadly, our perimeter firewall hit more than 90% CPU, so we had to revert. While we were live, however, I noticed that the Connection Rate statistics were missing for both the virtual service and the real servers for the period in which the service was under high load.
Notice the gap in the Connection Rate graph where the Packet Rate and Byte Rate graphs show high values.
I am currently investigating the cause of this issue.
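As a cross-check while investigating, a rate can be computed by hand from two raw counter readings and compared with what the graph shows. The following is only a minimal sketch: it assumes the connection counter is a 32-bit wrapping counter (I have not confirmed which type net-snmp-lvs actually exposes), `counter_rate` is a hypothetical helper, and the sample values are made up for illustration.

```python
def counter_rate(prev, curr, interval_s, bits=32):
    """Per-second rate between two samples of an increasing counter.

    Assumes the counter wrapped at most once between the two samples.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    modulus = 1 << bits
    # The modulo makes the delta correct even if the counter wrapped once.
    delta = (curr - prev) % modulus
    return delta / interval_s


# Hypothetical readings taken 300 seconds apart, no wrap.
print(counter_rate(1000, 4000, 300))

# Hypothetical readings taken 100 seconds apart, counter wrapped in between.
print(counter_rate(4294967000, 704, 100))
```

If the hand-computed rate looks sane for the high-load period while the graph shows a gap, that would suggest the data is being dropped somewhere between the SNMP agent and the graphing layer rather than in LVS itself.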



Comments