[OpenSIPS-Users] HEP Tracing exhausting TCP connections
Ben Newlin
Ben.Newlin at genesys.com
Fri Jul 26 12:45:48 EDT 2019
It turns out the netstat example I provided was a smaller case. Here is an example where a single OpenSIPS instance had attempted over 400 TCP connections to the capture server: https://pastebin.com/4jKQtZZY
Ben Newlin
From: Users <users-bounces at lists.opensips.org> on behalf of Ben Newlin <Ben.Newlin at genesys.com>
Reply-To: OpenSIPS users mailling list <users at lists.opensips.org>
Date: Friday, July 26, 2019 at 12:17 PM
To: "users at lists.opensips.org" <users at lists.opensips.org>
Subject: [OpenSIPS-Users] HEP Tracing exhausting TCP connections
Hello,
We are experiencing an issue with the siptrace module using HEPv3 over TCP. What we are seeing is that when traffic increases, OpenSIPS is opening a much larger number of sockets to our capture server than we expect. We have our capture server configured to accept a maximum of 2048 connections.
We believe that the module is intended to share TCP connections to the capture server, with something like one connection per process. We have found that if we increase traffic at a certain pace, OpenSIPS will attempt to open multiple connections per process and will exhaust the 2048-connection limit on the server. When this happens, connections begin to fail and the maximum number of postponed async chunks (hep_async_max_postponed_chunks) is exceeded. We believe this causes the proto_hep module to switch to blocking TCP, as we then see all OpenSIPS processes block in siptrace for up to 5 minutes. The proto_hep TCP connect and send timeouts are set to their defaults, but they do not appear to be honored for non-async connections.
I was able to capture the output of netstat and opensipsctl ps during the issue and they are available here: https://pastebin.com/jAMnSE8z. You can see that even though we only have 16 UDP/TCP listener threads, there are almost 100 TCP connections open to the configured proto_hep server. This is for just one instance.
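For anyone who wants to check their own instances for the same pattern, here is a rough sketch of how the per-process connection count could be pulled out of `netstat -tnp` output. It is not what we ran; the function names, the sample addresses, and the PIDs are all made up for illustration, and it assumes the usual Linux netstat column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State, PID/Program name).

```python
from collections import Counter

def count_connections(netstat_output, remote):
    """Count TCP sockets whose foreign address equals `remote` ("ip:port"),
    grouped by (pid, state). Assumes `netstat -tnp` style columns."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a full TCP row.
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue
        if fields[4] != remote:
            continue
        # The last column looks like "1201/opensips", or "-" if unknown.
        pid = fields[6].split("/", 1)[0]
        counts[(pid, fields[5])] += 1
    return counts

def per_pid_totals(counts):
    """Collapse the (pid, state) counts into a total per pid."""
    totals = Counter()
    for (pid, _state), n in counts.items():
        totals[pid] += n
    return totals

# Hypothetical sample; 10.0.0.9:9060 stands in for the capture server.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address    Foreign Address  State       PID/Program name
tcp        0      0 10.0.0.5:41234   10.0.0.9:9060    ESTABLISHED 1201/opensips
tcp        0      0 10.0.0.5:41236   10.0.0.9:9060    ESTABLISHED 1201/opensips
tcp        0      1 10.0.0.5:41240   10.0.0.9:9060    SYN_SENT    1202/opensips
tcp        0      0 10.0.0.5:5060    10.0.0.7:50112   ESTABLISHED 1201/opensips
"""

if __name__ == "__main__":
    counts = count_connections(SAMPLE, "10.0.0.9:9060")
    for pid, total in sorted(per_pid_totals(counts).items()):
        print(pid, total)
```

Any PID with a total above one is holding more than the single shared connection we expected, and the state breakdown separates established sockets from connection attempts that are still pending.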
I was also able to capture a trap while this was occurring, but not while OpenSIPS was actually blocked, which is what I had hoped for. It is difficult to time that occurrence exactly, but we will keep trying. Trap output is here: https://pastebin.com/cqGhrZrf
Ben Newlin