Notice: This blog entry has nothing to do with elephants sipping water through trunks.
Now that that's out of the way, many organizations have migrated to SIP trunks due to the savings that can be realized with these services, yet when problems arise they don't know where or how to troubleshoot them.
SIP Trunk Components
In order for a SIP trunk to function, three different components are typically involved:
- Local Session Border Controller (SBC): This is the local gateway that accepts and sends calls across a SIP trunk. In certain circumstances, an SBC might not be needed: for example, if a remote site has only a few users, or if the SIP trunk provider uses the same protocols as your phone system (no protocol translation needed).
- WAN Connectivity: This is usually provided as an Internet-connected link. This is where most of the savings occur, as you are not paying for a proprietary WAN connection designed to handle only voice calls.
- Remote SIP SBC/Gateway: This is the SIP trunk provider's system for receiving and handling the SIP calls that you send and receive.
Troubleshooting each of these components requires different approaches.
Define the Problem
Depending on what the problem is, you will need to troubleshoot it with a different methodology. Generally, two different types of problems can occur with SIP Trunks:
- Call Establishment problems: Calls are not being established or received properly.
- Mid-call problems: Calls have poor call quality, one-way-audio, or are dropped mid-call.
Each type of problem usually relates to a different protocol having problems.
SIP Trunk Call Establishment Problems
If there are problems with calls being established, then the problem is usually a configuration or authentication issue with the local or remote SBC.
The problem may lie in the SIP handshake that occurs between these two devices.
The best way to analyze this is to set up a packet capture between the two SBCs and watch the communications between the gateways to see whether any errors exchanged between the SBCs point toward a resolution.
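When reading a capture, the first line of each SIP message tells you whether it is a request or a response, and for a response, which status code came back. The sketch below assumes you have already exported the SIP messages as text from your capture tool; the function name `summarize_sip_message` is made up for illustration.

```python
import re

# A SIP response starts with "SIP/2.0 <3-digit code> <reason phrase>".
_STATUS_LINE = re.compile(r"^SIP/2\.0 (\d{3})(?: (.*))?$")
# A SIP request starts with "<METHOD> <request URI> SIP/2.0".
_REQUEST_LINE = re.compile(r"^([A-Z]+) (\S+) SIP/2\.0$")


def summarize_sip_message(raw: str) -> str:
    """Return a one-line summary of a SIP message based on its first line."""
    lines = raw.strip().splitlines()
    if not lines:
        return "empty"
    first = lines[0].strip()

    match = _STATUS_LINE.match(first)
    if match:
        reason = match.group(2) or ""
        return f"response {match.group(1)} {reason}".strip()

    match = _REQUEST_LINE.match(first)
    if match:
        return f"request {match.group(1)}"

    return "not SIP"
```

Running this over each message in a failed call setup gives a compact ladder of requests and responses, which makes it easier to spot where the handshake stops.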
The good news is that SIP response codes are well-defined:
1xx — Provisional Responses
2xx — Successful Responses
3xx — Redirection Responses
4xx — Client Failure Responses
5xx — Server Failure Responses
6xx — Global Failure Responses
For detailed response codes, refer to Wikipedia:SIP Response Codes.
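Because the first digit determines the class, a status code seen in a capture can be sorted into one of the six classes with a few lines of code. This is a minimal sketch; the function and dictionary names are illustrative.

```python
# The six SIP response classes, keyed by the first digit of the status code.
SIP_RESPONSE_CLASSES = {
    1: "Provisional",
    2: "Successful",
    3: "Redirection",
    4: "Client Failure",
    5: "Server Failure",
    6: "Global Failure",
}


def classify_sip_response(code: int) -> str:
    """Map a SIP status code (100-699) to its response class."""
    if not 100 <= code <= 699:
        raise ValueError(f"not a valid SIP status code: {code}")
    return SIP_RESPONSE_CLASSES[code // 100]


# Example: a 401 (Unauthorized) is a client failure, which often points to
# an authentication mismatch between the two SBCs.
print(classify_sip_response(401))
```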
If there is packet loss that is preventing calls from being established, SIP will perform a limited number of retries before giving up, but the lost packets will be seen in the packet capture trace. Troubleshooting the packet loss for the SIP setup part should be done the same way you would troubleshoot the SIP Trunk call quality problems in the next section.
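To recognize those retries in a trace, it helps to know the spacing to expect. The sketch below assumes an INVITE sent over UDP with the RFC 3261 default timers (T1 of 500 ms, retransmit interval doubling each time, transaction timeout at 64 × T1). Your SBC may be configured with different values, so treat the numbers as a starting point rather than what your equipment will do.

```python
from typing import List


def invite_retransmit_times(t1: float = 0.5) -> List[float]:
    """Times (seconds after the first send) at which an unanswered INVITE
    is transmitted over UDP, assuming RFC 3261 default timer behavior."""
    timeout = 64 * t1      # Timer B: the transaction gives up at this point
    times = [0.0]          # the initial transmission
    interval = t1          # Timer A starts at T1 and doubles after each retry
    next_send = interval
    while next_send < timeout:
        times.append(next_send)
        interval *= 2
        next_send += interval
    return times


print(invite_retransmit_times())
# [0.0, 0.5, 1.5, 3.5, 7.5, 15.5, 31.5]
```

Seeing the same INVITE repeated at roughly these offsets with no response in between is a strong hint that packets are being lost or filtered somewhere between the two SBCs.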
SIP Trunk Call Quality Problems
Quality problems usually show up on calls that are established and may be fine for a short period of time before issues start to occur. Typically, the quality problems involve clipped words or phrases, or words or phrases that are missing entirely.
The transient nature of these sorts of problems can be related to:
- Local SBC resource limitations
- Remote SBC/Gateway resource limitations
- Network impairments
Monitoring the Local SBC Resources
Monitoring the local SBC for resource limitations is an important step to make sure that it is not causing problems. Things that should be monitored are:
- CPU utilization
- Free memory
- Interface utilization and packet loss
Typically these statistics can be monitored via the SBC's own management software, or via SNMP variables.
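However the statistics are collected, the check itself is simple: compare each sample against a limit and flag anything out of range. The sketch below is purely illustrative. The `SbcSample` fields and the threshold values are assumptions, not vendor recommendations, and the code does not perform the SNMP polling itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SbcSample:
    """One polled sample of SBC resource statistics (hypothetical fields)."""
    cpu_percent: float
    free_memory_percent: float
    interface_utilization_percent: float
    interface_loss_percent: float


def find_resource_warnings(
    sample: SbcSample,
    cpu_max: float = 80.0,       # assumed threshold
    free_mem_min: float = 15.0,  # assumed threshold
    util_max: float = 70.0,      # assumed threshold
    loss_max: float = 0.5,       # assumed threshold
) -> List[str]:
    """Return a human-readable warning for each statistic outside its limit."""
    warnings = []
    if sample.cpu_percent > cpu_max:
        warnings.append(f"CPU utilization {sample.cpu_percent}% exceeds {cpu_max}%")
    if sample.free_memory_percent < free_mem_min:
        warnings.append(f"Free memory {sample.free_memory_percent}% is below {free_mem_min}%")
    if sample.interface_utilization_percent > util_max:
        warnings.append(f"Interface utilization {sample.interface_utilization_percent}% exceeds {util_max}%")
    if sample.interface_loss_percent > loss_max:
        warnings.append(f"Interface packet loss {sample.interface_loss_percent}% exceeds {loss_max}%")
    return warnings
```

Tune the limits to your own SBC; what matters is that warnings are timestamped so they can be lined up against user complaints.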
Remote SBC/Gateway Resource Limitations
The SIP trunk provider may provide some visibility into their endpoints and their ability to handle traffic. Usually they will prevent a customer from seeing too much, as they never want to expose that they have problems when dealing with heavy loads. The best information you can gather on their performance is a record of when you have problems with them, and how many calls your SBC has up with them at those times.
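One way to build that record is to line up the times of reported problems against your own call records. The sketch below assumes you can export each call's start and end time from your SBC or phone system; the function names are made up for illustration.

```python
from datetime import datetime
from typing import List, Tuple

Call = Tuple[datetime, datetime]  # (start, end) of one call on the trunk


def concurrent_calls_at(calls: List[Call], moment: datetime) -> int:
    """Count the calls that were active at the given moment."""
    return sum(1 for start, end in calls if start <= moment < end)


def load_at_problem_times(
    calls: List[Call], problem_times: List[datetime]
) -> List[Tuple[datetime, int]]:
    """Pair each reported problem time with the number of active calls."""
    return [(t, concurrent_calls_at(calls, t)) for t in problem_times]
```

If the problems cluster around the same call count, that is useful evidence to bring to the provider, even when they share little about their own systems.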
Network Impairments
Network impairments are typically the source of call quality problems, as many issues can exist in the network elements between the SBCs.
- Links can get overloaded and drop packets
- Devices can get overloaded and drop packets
- Load balancing and multiple route-paths can cause out-of-order packets
- Route patterns can change, causing latency and jitter swings (see the related articles on managing latency and jitter).
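Loss, reordering, and jitter can all be estimated from the voice packets themselves. The sketch below assumes you have pulled each RTP packet's sequence number, send time, and arrival time out of a capture. The jitter estimate uses the running interarrival formula from RFC 3550. Sequence number wraparound is ignored to keep the example short, so it only suits short captures.

```python
from typing import Dict, List, Tuple

# (sequence number, send time in seconds, arrival time in seconds)
Packet = Tuple[int, float, float]


def rtp_stream_stats(packets: List[Packet]) -> Dict[str, float]:
    """Estimate loss, reordering, and jitter for packets listed in arrival order."""
    if not packets:
        raise ValueError("no packets supplied")

    jitter = 0.0
    out_of_order = 0
    highest_seq = packets[0][0]
    seen = {packets[0][0]}
    prev = packets[0]

    for pkt in packets[1:]:
        seq, sent, arrived = pkt
        if seq < highest_seq:
            out_of_order += 1   # arrived after a later-numbered packet
        else:
            highest_seq = seq
        seen.add(seq)

        # RFC 3550: D is the change in transit time between consecutive
        # packets; jitter is a running average of |D| with gain 1/16.
        d = (arrived - prev[2]) - (sent - prev[1])
        jitter += (abs(d) - jitter) / 16
        prev = pkt

    expected = highest_seq - min(seen) + 1
    return {
        "lost": expected - len(seen),
        "out_of_order": out_of_order,
        "jitter_s": jitter,
    }
```

A stream with no impairments reports zero loss, zero reordering, and jitter near zero; rising values on any of the three point to the network between the SBCs rather than to the SBCs themselves.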
Troubleshooting these elements typically involves using specialized single-ended call simulation tools, and route-path evaluation tools like those included in PathSolutions TotalView.
SIP Trunk Best Practices
The best practices are covered in a recorded video made for IAUG.
VoIP troubleshooting problems can be prevented if the right information is brought to bear about your network's performance and configuration.