Network CSI Part Three: Choppy Waters
June 24, 2017
We’ve seen this a thousand times on screen: A camera shows a grizzled detective hovering over a crime scene in New York City, and slowly pans out until you have a bird’s eye view of the entire city — and the understanding that the truth is out there, somewhere, hiding among 8 million people.
Chances are you’ve experienced this overwhelming feeling more than once while troubleshooting call quality issues on your network. It can be immensely difficult to link clues together and arrive at a resolution, especially when you have a deadline, a massive enterprise network to search through, and not enough evidence to go on.
In this series, we have been focusing on a Fortune 500 cosmetics firm that was suffering from this very problem. Every once in a while, a major call quality issue would arise on the network and impact communications, and the IT department would be tasked with discovering the root cause and eliminating it. In part one, the problem centered around dropped Skype for Business calls. In part two, the issue was latency and jitter.
PathSolutions helped solve both cases using our fully automated network troubleshooting solution, TotalView. But we weren’t the least bit surprised when the phone rang a third time and it was the same customer with another puzzle for us.
The saga continues
The client was once again experiencing VoIP issues on its network. This time, however, phone calls sounded choppy and garbled. There was also an issue with the company’s voicemail system: messages suffered significant quality problems on playback.
The client was stumped. Its Internet connection was fine, the network was running at a proper speed, and it had adequate bandwidth, so none of those was the issue. And when the customer re-checked its QoS queue configurations (as we discussed in part two), these were also found to be working properly.
So once again we invite you to try and figure out what the root cause could be, using just these few clues.
We’ll go grab lunch, because you’re going to be here for a while!
As always, we used TotalView to drill down into the network and investigate. This time, the network looked perfectly healthy up to the server that was hosting the phone system. No network problems could be found on any links, switches, or routers in the environment.
We used the call simulator to test to the server, and everything was healthy (low latency, jitter, and loss, with no out-of-order packets) up until the server was hit with the simulated VoIP traffic.
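For readers who want to see what those four health metrics look like in practice, here is a minimal Python sketch. It is not how TotalView's call simulator works internally; the `Probe` record and the `summarize` function are hypothetical names invented for this illustration. It assumes each simulated packet carries a sequence number plus send and receive timestamps in milliseconds, and it smooths jitter the way RFC 3550 does for RTP streams.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Probe:
    """One received test packet (hypothetical record for this sketch)."""
    seq: int        # sequence number assigned by the sender
    sent_ms: float  # send timestamp in milliseconds
    recv_ms: float  # receive timestamp in milliseconds


def summarize(probes: List[Probe], sent_count: int) -> Dict[str, Optional[float]]:
    """Summarize latency, jitter, loss, and reordering.

    `probes` must be listed in arrival order; `sent_count` is how many
    packets the sender transmitted in total.
    """
    if sent_count <= 0:
        raise ValueError("sent_count must be positive")

    received = len(probes)
    loss_pct = 100.0 * (sent_count - received) / sent_count

    out_of_order = 0
    highest_seq = -1
    jitter_ms = 0.0
    prev_transit: Optional[float] = None
    total_latency = 0.0

    for p in probes:
        # A packet that arrives after a higher-numbered one is out of order.
        if p.seq < highest_seq:
            out_of_order += 1
        else:
            highest_seq = p.seq

        transit = p.recv_ms - p.sent_ms
        total_latency += transit

        # RFC 3550 style smoothing: J += (|D| - J) / 16, where D is the
        # change in transit time between consecutive arrivals.
        if prev_transit is not None:
            jitter_ms += (abs(transit - prev_transit) - jitter_ms) / 16.0
        prev_transit = transit

    avg_latency = total_latency / received if received else None
    return {
        "avg_latency_ms": avg_latency,
        "jitter_ms": jitter_ms,
        "loss_pct": loss_pct,
        "out_of_order": out_of_order,
    }
```

Run against a path that stops short of the server and then against the server itself, a summary like this makes the pattern in this case visible: the numbers stay flat across the network and degrade only at the final hop.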
Since the network was healthy but the server was not, it was easy to determine that the business’s phone system was running on a virtualized server that was not optimized or tuned for VoIP traffic. The hypervisor was also hosting several other servers and applications, including email and a database server, which can eat up a large share of the system resources that VoIP needs.
All the customer had to do was change its settings to support real-time applications, and once again, call quality returned to normal.
So there you have it: Three pesky, time-consuming VoIP issues that would typically take hours or days to solve, fixed in a matter of minutes using TotalView. All three examples we outlined in this series showcase how TotalView can be used not just for daily monitoring, but also to drill down deep into the network and focus on granular issues that would otherwise go undiscovered.