[OpenSIPS-Users] sched_yield()

Andrei Dragus adragus at opensips.org
Thu Jan 21 14:09:11 CET 2010


Hi,

Since all the backtraces are in allocation routines, my guess is that the 
shared memory lock might be causing the problem.

Are you compiling with -DF_MALLOC?
What version of OpenSIPS are you using?
How large is the shared memory pool you are allocating?
How much of that memory are you using? (Use: opensipsctl fifo 
get_statistics all)
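
For illustration only, here is a minimal sketch of a generic spin-then-yield
lock, the kind of construct that would produce a stream of sched_yield() calls
when a lock is contended. This is not the OpenSIPS locking code: the names
spin_yield_lock, SPIN_TRIES and the max_yields parameter are hypothetical, and
the spin budget of 1024 is an arbitrary assumption.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical illustration, not OpenSIPS source. */
#define SPIN_TRIES 1024 /* assumed busy-wait budget before yielding */

typedef struct {
    atomic_flag held; /* set while some process/thread owns the lock */
} spin_yield_lock;

static void spin_yield_lock_init(spin_yield_lock *l) {
    assert(l != NULL);
    atomic_flag_clear(&l->held);
}

/* Try to take the lock, giving up after max_yields calls to sched_yield().
 * Returns the number of yields on success, or -1 if the lock stayed held. */
static long spin_yield_lock_acquire(spin_yield_lock *l, long max_yields) {
    long yields = 0;
    int tries = SPIN_TRIES;

    assert(l != NULL);
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) {
        if (--tries > 0)
            continue;           /* still busy-waiting in user space */
        if (yields == max_yields)
            return -1;          /* the holder never released the lock */
        sched_yield();          /* the syscall a tracer would report */
        yields++;
        tries = SPIN_TRIES;
    }
    return yields;
}

static void spin_yield_lock_release(spin_yield_lock *l) {
    assert(l != NULL);
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

With a lock of this shape, a holder that keeps the lock for a long time leaves
every waiter looping through sched_yield() and burning CPU without making
progress, which would be consistent with the strace output quoted below.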

Alex Massover wrote:
> Some more,
>
> (gdb) bt
> #0  0xb78dc424 in __kernel_vsyscall ()
> #1  0xb781741c in sched_yield () from /lib/i686/cmov/libc.so.6
> #2  0xb73d77fd in build_new_dlg () from /usr/lib/opensips/modules/dialog.so
> #3  0xb73d4b81 in dlg_create_dialog () from /usr/lib/opensips/modules/dialog.so
> #4  0xb73c9c9e in ?? () from /usr/lib/opensips/modules/dialog.so
> #5  0x08055030 in do_action ()
> #6  0x08053ebf in run_action_list ()
> #7  0x08056e7a in do_action ()
> #8  0x08053ebf in run_action_list ()
> #9  0x08057d99 in run_top_route ()
> #10 0x0808ad6c in receive_msg ()
> #11 0x080bd2f2 in udp_rcv_loop ()
> #12 0x08069339 in main ()
>                                    
>
>    (gdb) bt
> #0  0xb78dc424 in __kernel_vsyscall ()
> #1  0xb781741c in sched_yield () from /lib/i686/cmov/libc.so.6
> #2  0xb77242cd in build_cell () from /usr/lib/opensips/modules/tm.so
> #3  0xb7739c4a in t_newtran () from /usr/lib/opensips/modules/tm.so
> #4  0xb772e7b8 in t_relay_to () from /usr/lib/opensips/modules/tm.so
> #5  0xb773b501 in ?? () from /usr/lib/opensips/modules/tm.so
> #6  0x08055030 in do_action ()
> #7  0x08053ebf in run_action_list ()
> #8  0x08095cf2 in eval_expr ()
> #9  0x080958d9 in eval_expr ()
> #10 0x08095919 in eval_expr ()
> #11 0x080554e2 in do_action ()
> #12 0x08053ebf in run_action_list ()
> #13 0x080569d8 in do_action ()
> #14 0x08053ebf in run_action_list ()
> #15 0x08056e7a in do_action ()
> #16 0x08053ebf in run_action_list ()
> #17 0x08057d99 in run_top_route ()
> #18 0x0808ad6c in receive_msg ()
> #19 0x080bd2f2 in udp_rcv_loop ()
> #20 0x08069339 in main ()                        
>
> --
> Best Regards,
> Alex Massover
> VoIP R&D TL
> Jajah Inc.
>
>   
>> -----Original Message-----
>> From: users-bounces at lists.opensips.org [mailto:users-
>> bounces at lists.opensips.org] On Behalf Of Alex Massover
>> Sent: Thursday, January 21, 2010 2:24 PM
>> To: OpenSIPS users mailling list
>> Subject: Re: [OpenSIPS-Users] sched_yield()
>>
>> Hi,
>>
>> Another one. It hangs for a number of seconds (but that's enough to
>> cause SIP timeouts - the MSG queue jumps to 260K), so it's hard to take a
>> bt at the right moment.
>> This one looks better because there's sched_yield() there :)
>>
>> (gdb) bt
>> #0  0xb77d5424 in __kernel_vsyscall ()
>> #1  0xb771041c in sched_yield () from /lib/i686/cmov/libc.so.6
>> #2  0x080bf23d in new_avp ()
>> #3  0x080bf53f in add_avp ()
>> #4  0xb72c1c9c in ?? () from /usr/lib/opensips/modules/dialog.so
>> #5  0x08055030 in do_action ()
>> #6  0x08053ebf in run_action_list ()
>> #7  0x08056e7a in do_action ()
>> #8  0x08053ebf in run_action_list ()
>> #9  0x08056e7a in do_action ()
>> #10 0x08053ebf in run_action_list ()
>> #11 0x08056e7a in do_action ()
>> #12 0x08053ebf in run_action_list ()
>> #13 0x08057d99 in run_top_route ()
>> #14 0x0808ad6c in receive_msg ()
>> #15 0x080bd2f2 in udp_rcv_loop ()
>> #16 0x08069339 in main ()
>>
>> --
>> Best Regards,
>> Alex Massover
>> VoIP R&D TL
>> Jajah Inc.
>>
>>     
>>> -----Original Message-----
>>> From: users-bounces at lists.opensips.org [mailto:users-
>>> bounces at lists.opensips.org] On Behalf Of Alex Massover
>>> Sent: Thursday, January 21, 2010 2:05 PM
>>> To: OpenSIPS users mailling list
>>> Subject: Re: [OpenSIPS-Users] sched_yield()
>>>
>>> Hi Andrei,
>>> Hopefully this is it (with FASTLOCK)
>>>
>>> #0  0xb77d5424 in __kernel_vsyscall ()
>>> #1  0xb772babb in poll () from /lib/i686/cmov/libc.so.6
>>> #2  0xb77ba83a in ?? () from /lib/i686/cmov/libresolv.so.2
>>> #3  0xb77b8946 in __libc_res_nquery () from /lib/i686/cmov/libresolv.so.2
>>> #4  0xb77b8fdb in ?? () from /lib/i686/cmov/libresolv.so.2
>>> #5  0xb77b92ae in __libc_res_nsearch () from /lib/i686/cmov/libresolv.so.2
>>> #6  0xb77b96d4 in __res_nsearch () from /lib/i686/cmov/libresolv.so.2
>>> #7  0xb77b808a in res_search () from /lib/i686/cmov/libresolv.so.2
>>> #8  0x0808c613 in get_record ()
>>> #9  0x0808cf05 in ?? ()
>>> #10 0x0808e385 in sip_resolvehost ()
>>> #11 0x0807a26c in mk_proxy ()
>>> #12 0xb7627d39 in t_relay_to () from /usr/lib/opensips/modules/tm.so
>>> #13 0xb7634501 in ?? () from /usr/lib/opensips/modules/tm.so
>>> #14 0x08055030 in do_action ()
>>> #15 0x08053ebf in run_action_list ()
>>> #16 0x08095cf2 in eval_expr ()
>>> #17 0x080958d9 in eval_expr ()
>>> #18 0x08095919 in eval_expr ()
>>> #19 0x080554e2 in do_action ()
>>> #20 0x08053ebf in run_action_list ()
>>> #21 0x08056e7a in do_action ()
>>> #22 0x08053ebf in run_action_list ()
>>> #23 0x080569d8 in do_action ()
>>> #24 0x08053ebf in run_action_list ()
>>> #25 0x08056e7a in do_action ()
>>> #26 0x08053ebf in run_action_list ()
>>> #27 0x08057d99 in run_top_route ()
>>> #28 0x0808ad6c in receive_msg ()
>>> #29 0x080bd2f2 in udp_rcv_loop ()
>>> #30 0x08069339 in main ()
>>> (gdb)
>>>
>>> --
>>> Best Regards,
>>> Alex Massover
>>> VoIP R&D TL
>>> Jajah Inc.
>>>       
>>>> -----Original Message-----
>>>> From: users-bounces at lists.opensips.org [mailto:users-
>>>> bounces at lists.opensips.org] On Behalf Of Andrei Dragus
>>>> Sent: Wednesday, January 20, 2010 2:58 PM
>>>> To: OpenSIPS users mailling list
>>>> Subject: Re: [OpenSIPS-Users] sched_yield()
>>>>
>>>> Hi,
>>>>
>>>> I think that there is a lock that is being held longer than it should be
>>>> and that's what causes starvation. It would help us if you could attach
>>>> to a process using gdb and give us a full backtrace.
>>>>
>>>> Temporary solutions which should work would be to reduce the number of
>>>> processes to 4-6 or to recompile, replacing -DFAST_LOCK with one of the
>>>> other options (-DUSE_POSIX_SEM or -DUSE_PTHREAD_MUTEX), but we should see
>>>> where this comes from in order to fix it.
>>>>
>>>> Alex Massover wrote:
>>>>         
>>>>> Hi!
>>>>>
>>>>> Yes, I build a deb package from source on Debian. (I made some minor
>>>>> changes to the source, but the problem also happens without my
>>>>> changes.)
>>>>> 16 children on 4 cores.
>>>>>
>>>>> What do you suggest, reducing it to 4? It runs on 2.6.32 on VMware
>>>>> ESX.
>>>>> I'm also now trying sleep(0) instead of sched_yield().
>>>>>
>>>>> --
>>>>> Best Regards,
>>>>> Alex Massover
>>>>> VoIP R&D TL
>>>>> Jajah Inc.
>>>>>
>>>>>           
>>>>>> -----Original Message-----
>>>>>> From: users-bounces at lists.opensips.org [mailto:users-
>>>>>> bounces at lists.opensips.org] On Behalf Of Andrei Dragus
>>>>>> Sent: Wednesday, January 20, 2010 1:05 PM
>>>>>> To: OpenSIPS users mailling list
>>>>>> Subject: Re: [OpenSIPS-Users] sched_yield()
>>>>>>
>>>>>> Hi Alex,
>>>>>>
>>>>>> Are you building OpenSIPS from source?
>>>>>> How many processes do you have and on how many cores?
>>>>>>
>>>>>>
>>>>>> Alex Massover wrote:
>>>>>>
>>>>>>             
>>>>>>> Hello!
>>>>>>>
>>>>>>> I'm facing a strange problem: sometimes under stress OpenSIPS
>>>>>>> "locks" - load average jumps, SIP processing is delayed, the
>>>>>>> opensips msg queue fills with a lot of SIP messages, and the
>>>>>>> opensips processes start to consume a lot of CPU.
>>>>>>>
>>>>>>> And strace shows:
>>>>>>>
>>>>>>> sched_yield()
>>>>>>> sched_yield()
>>>>>>> sched_yield()
>>>>>>> sched_yield()
>>>>>>> ....
>>>>>>>
>>>>>>> for all processes.
>>>>>>>
>>>>>>> If I stop the stress, after a while (not immediately) it unlocks,
>>>>>>> also suddenly: I can see in top that all opensips processes stop
>>>>>>> consuming CPU.
>>>>>>>
>>>>>>> What can it be? Some kind of starvation?
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards,
>>>>>>> Alex Massover
>>>>>>> VoIP R&D TL
>>>>>>> Jajah Inc.
>>>>>>>
>>>>>>> This mail was sent via Mail-SeCure System.
>>>>>>> _______________________________________________
>>>>>>> Users mailing list
>>>>>>> Users at lists.opensips.org
>>>>>>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
>>>>>>>
>>>>>>>
>>>>>>>               
>>>>>> --
>>>>>> Andrei Dragus
>>>>>> www.voice-system.ro
>>>>>> This mail was received via Mail-SeCure System.
>>>> --
>>>> Andrei Dragus
>>>> www.voice-system.ro

-- 
Andrei Dragus
www.voice-system.ro 



