[OpenSIPS-Users] Autoscaler in 3.2.x

Bogdan-Andrei Iancu bogdan at opensips.org
Wed Sep 14 12:58:45 UTC 2022


Hi Yury,

You need to check your TCP settings and make sure your OpenSIPS will (1) 
not attempt a TCP connect to destinations known to be unable to accept 
one (such as TCP/WS endpoints behind NAT) - see 
tcp_no_new_conn_bflag [1] - or (2) not block for a long time while 
attempting a connect - see tcp_connect_timeout [2], or consider 
enabling async [3].

[1] https://www.opensips.org/Documentation/Script-CoreParameters-3-2#tcp_no_new_conn_bflag
[2] https://www.opensips.org/Documentation/Script-CoreParameters-3-2#tcp_connect_timeout
[3] https://opensips.org/html/docs/modules/3.2.x/proto_tcp.html#idp168992
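
As a rough illustration of the three settings above, a minimal opensips.cfg 
sketch could look like the fragment below. The flag name TCP_NO_CONNECT and 
the timeout value are only illustrative assumptions, not recommendations, 
and the routing step is described in a comment rather than spelled out 
because it depends on how your script marks NATed destinations. Please 
check the linked documentation for the exact semantics in your version.

```
# --- core parameters ---
# Branch flag that tells OpenSIPS to only reuse an existing TCP
# connection for this branch and never open a new one.
# (TCP_NO_CONNECT is an illustrative flag name.)
tcp_no_new_conn_bflag = TCP_NO_CONNECT

# Upper bound for a blocking TCP connect attempt, in milliseconds
# according to the core parameter documentation [2].
# (100 is an illustrative value.)
tcp_connect_timeout = 100

# --- proto_tcp module ---
loadmodule "proto_tcp.so"
# Non-blocking (async) TCP connect and write, see [3].
modparam("proto_tcp", "tcp_async", 1)

# --- routing logic (described only, adapt to your script) ---
# Before relaying, call setbflag() with the flag configured above on
# every branch that targets a TCP/WS endpoint known to be behind NAT,
# so that no new connect is attempted towards it.
```

With the branch flag set, a request towards such an endpoint should fail 
quickly when no connection exists, instead of tying up a worker in a 
connect attempt.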

Regards,

Bogdan-Andrei Iancu

OpenSIPS Founder and Developer
   https://www.opensips-solutions.com
OpenSIPS Summit 27-30 Sept 2022, Athens
   https://www.opensips.org/events/Summit-2022Athens/

On 9/13/22 12:01 PM, Yury Kirsanov wrote:
> Hi Bogdan,
> Thanks for this update, but it looks like I can't check autoscaler 
> because of this first issue with blocking TCP connect. Is there a way 
> to resolve it? Am I doing something wrong? Or is that something to do 
> with OpenSIPS code? And yes, you're right: as soon as I restart 
> OpenSIPS with a lot of SIP devices trying to connect to it, it goes 
> crazy, starts consuming memory and stops forwarding packets, sitting 
> there at 100% load until it runs out of memory and segfaults. 
> Sometimes I can't even restart it to get it back to a normal working 
> state; it just loops into the same crash whatever I try.
>
> I've compiled OpenSIPS 3.3.1 with your patch and was able to start it 
> but not sure, maybe I was just lucky this time.
>
> What should I do? Thanks!
>
> Best regards,
> Yury.
>
> On Tue, 13 Sept 2022, 18:56 Bogdan-Andrei Iancu
> <bogdan at opensips.org> wrote:
>
>     Hi Yury,
>
>     It looks like you have multiple issues overlapping here. The
>     traps you sent have nothing to do with auto-scaling, but with a
>     blocking TCP connect for SIP - most of the procs get blocked in
>     a sync TCP connect.
>
>     Regards,
>
>     Bogdan-Andrei Iancu
>
>     On 9/12/22 4:39 PM, Yury Kirsanov wrote:
>>     Hi Bogdan,
>>     I've applied the patch (I had to find where to apply it manually
>>     for 3.2.8 downloaded from the web page: line 1568 instead of
>>     1652) and restarted the server with only about 300-350 SIP
>>     devices, and immediately ran into the same issue. I'm attaching
>>     two GDB dumps taken several minutes apart. Autoscaling was OFF
>>     this time; please see my previous message, as for some reason
>>     I'm currently experiencing lockups even when it's off :(
>>
>>     Best regards,
>>     Yury.
>>
>>     On Mon, Sep 12, 2022 at 7:48 PM Bogdan-Andrei Iancu
>>     <bogdan at opensips.org> wrote:
>>
>>         Hi Yury,
>>
>>         Could you give this patch a try? It should fix the blocking
>>         you are experiencing (it should apply on 3.2 too).
>>
>>         Best regards,
>>
>>         Bogdan-Andrei Iancu
>>
>>         On 9/7/22 2:54 PM, Bogdan-Andrei Iancu wrote:
>>>         Hi Yury,
>>>
>>>         Thanks for the detailed info here - let me review some
>>>         code and run some tests, as at this point I have a good
>>>         idea of the direction to dig into.
>>>
>>>         I will update here.
>>>
>>>         Best regards,
>>>         Bogdan-Andrei Iancu
>>>
>>>         On 9/6/22 11:24 AM, Yury Kirsanov wrote:
>>>>         Hi Bogdan,
>>>>         Yes, I'm listening on all types of sockets including UDP,
>>>>         TCP and TLS on the outside public interface and then
>>>>         forward traffic into internal LAN via UDP only.
>>>>
>>>>         Previously it was getting stuck quite easily, now I had to
>>>>         wait for a while before this actually happened. I've routed
>>>>         part of my customers to this server to obtain this result
>>>>         so I will have to do that again.
>>>>
>>>>         As soon as I see one of the processes stuck I'll run the
>>>>         trap command and send you all the details, including
>>>>         process load, ps output and so on.
>>>>
>>>>         For now I had to switch autoscaling off and just create
>>>>         many listeners. Do I understand correctly that I need to
>>>>         restart OpenSIPS in order to apply autoscaling profiles and
>>>>         reload-routes is not sufficient?
>>>>
>>>>         Also, do I need separate UDP profiles for public and
>>>>         private interfaces? And do I need to apply autoscaling
>>>>         profile just to a socket or I need to specify udp or
>>>>         tcp_workers with autoscaler too?
>>>>
>>>>         Thanks and best regards,
>>>>         Yury.
>>>>
>>>>         On Tue, 6 Sept 2022, 18:18 Bogdan-Andrei Iancu
>>>>         <bogdan at opensips.org> wrote:
>>>>
>>>>             Hi Yury,
>>>>
>>>>             Thanks for the info. I see that the stuck process (24)
>>>>             is an auto-scaled one (based on its id). Do you have
>>>>             SIP traffic going from UDP to TCP, or are you doing
>>>>             some HEP capturing for SIP? I saw a recent similar
>>>>             report where a UDP auto-scaled worker got stuck when
>>>>             trying to communicate with the TCP main/manager
>>>>             process (in order to handle a TCP operation).
>>>>
>>>>             BTW, any chance you could run "opensips-cli -x trap"
>>>>             when you have that stuck process, just to see where it
>>>>             is stuck? And is it hard to reproduce? I may ask you
>>>>             to extract some information from the running process.
>>>>
>>>>             Regards,
>>>>
>>>>             Bogdan-Andrei Iancu
>>>>
>>>>             On 9/3/22 6:54 PM, Yury Kirsanov wrote:
