[OpenSIPS-Users] mi_fifo lock on reply after a period of time in 3.1.3

Bogdan-Andrei Iancu bogdan at opensips.org
Thu Oct 7 06:25:30 EST 2021


Hi Andrew,

OK, let's try these steps:

1) right after starting opensips, do an "opensips-cli -x mi ps" and 
identify the PID of the FIFO process

2) wait for the blocking to happen

3) from a console with root privileges, do "gdb opensips 
_PID_OF_FIFO_" and run "bt full"
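
The steps above can be sketched as a small shell helper. The `fifo` grep
pattern is an assumption about how the FIFO worker is labelled in the
"mi ps" listing, so adjust it to whatever your build prints:

```shell
# Sketch of the steps above. fifo_pid filters a process listing (read
# from stdin) for the FIFO worker line and prints the first number on
# it; the 'fifo' label is an assumption about the "mi ps" output.
fifo_pid() {
  grep -i 'fifo' | grep -o '[0-9][0-9]*' | head -n 1
}

# Usage, once the blocking has happened (needs root and gdb installed):
#   PID=$(opensips-cli -x mi ps | fifo_pid)
#   gdb -p "$PID" -batch -ex 'bt full'
```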

Best regards,

Bogdan-Andrei Iancu

OpenSIPS Founder and Developer
   https://www.opensips-solutions.com
OpenSIPS eBootcamp 2021
   https://opensips.org/training/OpenSIPS_eBootcamp_2021/

On 10/7/21 2:19 AM, Andrew Yager wrote:
> Hi Bogdan-Andrei,
>
> This does look to be frozen in this state too. I guess GDB is the next 
> step.
>
> Andrew
>
>
> On Wed, 6 Oct 2021 at 17:12, Bogdan-Andrei Iancu <bogdan at opensips.org> wrote:
>
>     Hi Andrew,
>
>     What you can do, after the FIFO gets blocked, is to run
>     "opensips-cli trap" to try to see what the FIFO process is doing
>     (or trying to do). If the trap does not work, let me know and I
>     will give you instructions on how to attach GDB directly to the
>     process.
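
[Since the hang recurs, it may help to keep each trap dump for later
comparison. A minimal sketch; `trap_file` is an illustrative helper, not
part of opensips-cli:]

```shell
# Sketch: build a timestamped destination path for a trap dump so that
# successive hangs can be diffed. trap_file is a made-up helper name.
trap_file() {
  printf '/tmp/opensips_trap_%s.txt' "$(date +%Y%m%d_%H%M%S)"
}

# Usage (assumes opensips-cli is on PATH):
#   opensips-cli -x trap > "$(trap_file)"
```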
>
>     Best regards,
>
>     Bogdan-Andrei Iancu
>
>     OpenSIPS Founder and Developer
>        https://www.opensips-solutions.com
>     OpenSIPS eBootcamp 2021
>        https://opensips.org/training/OpenSIPS_eBootcamp_2021/
>
>     On 10/6/21 12:24 AM, Andrew Yager wrote:
>>     Also restarting opensips produces this:
>>
>>     Restarting opensips (via systemctl): opensips.service
>>     ERROR: communication exception for 'which' returned: cannot
>>     access fifo file /tmp/opensips_fifo: [Errno 32] Broken pipe!
>>
>>     Andrew
>>
>>
>>     On Wed, 6 Oct 2021 at 08:23, Andrew Yager <andrew at rwts.com.au> wrote:
>>
>>         Hi,
>>
>>         Just replicated this on the 3.2.2 nightly build. Debug logs
>>         are enabled, and this is all I can see on the last request
>>         that stalled:
>>
>>         Oct  6 04:49:32 hvprxy osips[1186130]:
>>         DBG:mi_fifo:mi_fifo_callback: running command [{"jsonrpc":
>>         "2.0", "id": "5319", "method": "which", "params": []}]
>>         Oct  6 04:49:32 hvprxy osips[1186130]:
>>         DBG:mi_fifo:mi_fifo_callback: got mi response = [0x558865cc9560]
>>         Oct  6 04:49:32 hvprxy osips[1186130]:
>>         DBG:mi_fifo:mi_fifo_callback: running command [{"jsonrpc":
>>         "2.0", "id": "8377", "method": "get_statistics", "params":
>>         [["all"]]}]
>>         Oct  6 04:49:32 hvprxy osips[1186133]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186134]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186135]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186136]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186137]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186138]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186139]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186140]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186141]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186142]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186143]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186144]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186145]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186146]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186147]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186148]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186149]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186150]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186151]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186152]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186153]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186154]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186155]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186156]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186157]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186158]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186159]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186160]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186161]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186162]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186163]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186164]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186165]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186166]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186167]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186168]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186169]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186170]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186171]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186172]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186173]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>         Oct  6 04:49:32 hvprxy osips[1186174]:
>>         DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>
>>         Andrew
>>
>>         On Mon, 4 Oct 2021 at 22:37, Andrew Yager <andrew at rwts.com.au> wrote:
>>
>>             Just further to this, I think it's the ul_dump command
>>             that seems to cause the issue first.
>>
>>             Andrew
>>
>>
>>             On Sat, 2 Oct 2021 at 13:05, Andrew Yager <andrew at rwts.com.au> wrote:
>>
>>                 Hi,
>>
>>                 Not entirely sure where to start digging on this one.
>>                 On 3.1.3 we've had an issue appear "suddenly" whereby
>>                 our mi command output seems to "block" and not return
>>                 any useful data. Restarting opensips processes
>>                 restores comms.
>>
>>                 We end up with a huge number of
>>                 opensips_fifo_reply_* files in the /tmp directory,
>>                 but effectively no data is ever written and
>>                 opensips-cli freezes.
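
[A quick way to quantify that pile-up while it is happening; the
directory and the `opensips_fifo_reply_` prefix are taken from the
defaults reported in this thread:]

```shell
# Sketch: count leftover FIFO reply files. The default directory and
# filename prefix match what was reported above; pass a different
# directory as the first argument if your setup differs.
count_reply_files() {
  dir="${1:-/tmp}"
  find "$dir" -maxdepth 1 -name 'opensips_fifo_reply_*' | wc -l
}

# Usage:
#   count_reply_files          # checks /tmp
#   count_reply_files /var/run # or another directory
```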
>>
>>                 We've restarted twice to "resolve" now, but the issue
>>                 appears to reoccur. We're using the mi get_statistics
>>                 command to monitor stats/health and uldump to pull
>>                 some detail about usrloc data every 5 minutes.
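
[As a defensive measure while this gets chased down, those periodic
checks can be wrapped in a hard timeout so a blocked FIFO does not also
hang the monitoring job. `mi_check` and `MI_TIMEOUT` are illustrative
names for this sketch, not part of opensips-cli:]

```shell
# Sketch: run any command with a hard timeout (default 10s, overridable
# via MI_TIMEOUT) and print a FAIL line instead of hanging forever.
# mi_check/MI_TIMEOUT are made-up names; timeout is from GNU coreutils.
mi_check() {
  timeout "${MI_TIMEOUT:-10}" "$@" || echo "FAIL: $* did not complete"
}

# Usage:
#   mi_check opensips-cli -x mi get_statistics all
#   mi_check opensips-cli -x mi ul_dump
```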
>>
>>                 I don't think I can raise the debug level, because
>>                 the MI process doesn't seem to be communicating (at
>>                 least attempting it doesn't seem to work).
>>
>>                 opensips 3.1.3-1
>>                 opensips-cli 0.1~20210707~572d2db-
>>                 5.4.0-88-generic #99-Ubuntu SMP Thu Sep 23 17:29:00
>>                 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
>>
>>                 I can restart again, but if this is going to keep
>>                 happening, I want to make sure I have something
>>                 useful in the logs to be able to trace down.
>>
>>                 Andrew
>>
>>
>>     _______________________________________________
>>     Users mailing list
>>     Users at lists.opensips.org
>>     http://lists.opensips.org/cgi-bin/mailman/listinfo/users
>
