[OpenSIPS-Users] mi_fifo lock on reply after a period of time in 3.1.3

Bogdan-Andrei Iancu bogdan at opensips.org
Fri Oct 8 13:19:00 EST 2021


Hi Andrew,

The second blocked process (doing the TLS/TCP stuff) surprisingly got 
stuck while waiting for a TCP fd from the TCP Main process.

You mentioned that the logs of the UDP worker (doing the TCP send)
suddenly stopped. Around that time, do you see any errors from that
process or from the TCP Main process?

Regards,

Bogdan-Andrei Iancu

OpenSIPS Founder and Developer
   https://www.opensips-solutions.com
OpenSIPS eBootcamp 2021
   https://opensips.org/training/OpenSIPS_eBootcamp_2021/

On 10/8/21 2:43 PM, Andrew Yager wrote:
> Hi Bogdan-Andrei,
>
> Have restarted since the last bt, but have recreated again and
> attached. Earlier today we did also get another bt full on the second
> blocked pid, but I didn't save it. In that case it was a UDP reply
> from one of our upstream servers that had gone through mid_registrar
> and was being relayed to a TCP endpoint. The TCP endpoint did have an
> open file descriptor we could see, and it had sent and was blocked on
> receive at the same point (I'm getting better at reading backtraces!
> :D).
>
> The thing I do note is that every example I have is a UDP message
> received from an upstream server and relayed to a client on a
> TCP/TLS connection via a UDP worker.
>
> While we are using WolfSSL in this box, the other box where we have
> the same behaviour (but I haven't taken backtraces yet) is running
> OpenSSL and on 3.1.3; so it's not SSL library specific.
>
> I'm going to see if I can get a backtrace from the 3.1.3 box shortly.
>
> Andrew
>
> On Fri, 8 Oct 2021 at 17:13, Bogdan-Andrei Iancu <bogdan at opensips.org> wrote:
>> Hi Andrew,
>>
>> OK, interesting progress here. So, the FIFO process blocks as it is
>> trying to send an IPC JOB to a UDP process which also appears to be
>> blocked.
>>
>> Could you attach with GDB to that blocked UDP process too? (You have
>> its PID in the pt[x] printout from the first gdb session.)
>>
>> Regards,
>>
>> Bogdan-Andrei Iancu
>>
>> OpenSIPS Founder and Developer
>>     https://www.opensips-solutions.com
>> OpenSIPS eBootcamp 2021
>>     https://opensips.org/training/OpenSIPS_eBootcamp_2021/
>>
>> On 10/8/21 1:43 AM, Andrew Yager wrote:
>>> Interestingly, where I usually see a range of continued messages from
>>> a process continually in the debug log, they appear to stop for this
>>> PID at 3:47am, and that process seems blocked on a tcp/tls send:
>>>
>>>



