[OpenSIPS-Users] opensips tm timer core dump

Bogdan-Andrei Iancu bogdan at voice-system.ro
Fri Oct 22 16:02:10 CEST 2010


Hi Kennard,

I suppose the backtrace is the same? Do you still have the core file?

Regards,
Bogdan

Kennard_White at logitech.com wrote:
>
> Hi Bogdan,
>
> I replicated the error. Unfortunately the entire insert_timer_unsafe 
> has been inlined and little is available:
>
> Program terminated with signal 11, Segmentation fault.
> #0 0x00007f8b8356c2c2 in insert_timer_unsafe (new_tl=0x7f8b7a54e310,
> list_id=WT_TIMER_LIST, ext_timeout=<value optimized out>) at timer.c:731
> 731 timer.c: No such file or directory.
> in timer.c
> (gdb) print tl
> $1 = <value optimized out>
> (gdb) print *tl
> Cannot access memory at address 0x0
> (gdb) print ptr
> $2 = <value optimized out>
> (gdb) print *ptr
> Cannot access memory at address 0x0
> (gdb) print *new_tl
> No symbol "new_tl" in current context.
> (gdb) up
> #1 set_1timer (new_tl=0x7f8b7a54e310, list_id=WT_TIMER_LIST,
> ext_timeout=<value optimized out>) at timer.c:904
> 904 in timer.c
> (gdb) print *new_tl
> $3 = {next_tl = 0x0, prev_tl = 0x0, ld_tl = 0x0, time_out = 0,
> timer_list = 0x0, deleted = 0}
> (gdb) print list
> $4 = <value optimized out>
> (gdb) print timeout
> $5 = 32
> (gdb) print new_tl
> $6 = (struct timer_link *) 0x7f8b7a54e310
>
> I'll keep the core for a while -- please let me know if there is 
> anything else I can try.
>
> Thanks,
> Kennard
>
> From: Bogdan-Andrei Iancu <bogdan at voice-system.ro>
> To: OpenSIPS users mailling list <users at lists.opensips.org>
> Date: 10/08/2010 04:40 AM
> Subject: Re: [OpenSIPS-Users] opensips tm timer core dump
> Sent by: users-bounces at lists.opensips.org
>
> ------------------------------------------------------------------------
>
>
>
> Hi Kennard,
>
> Ok, keep the core next time :)
>
> Regards,
> Bogdan
>
> Kennard_White at logitech.com wrote:
> >
> > Hi Bogdan,
> >
> > Thanks for explaining the child processes involved -- I misunderstood
> > what was happening.
> >
> > Unfortunately, I don't have the core anymore. My recollection is that
> > I couldn't print anything useful due to compiler optimization. That
> > said, this should re-create pretty easily, and I'll get more dumps
> > next time it happens.
> >
> > Regards,
> > Kennard
> >
> > From: Bogdan-Andrei Iancu <bogdan at voice-system.ro>
> > To: OpenSIPS users mailling list <users at lists.opensips.org>
> > Date: 10/05/2010 01:41 AM
> > Subject: Re: [OpenSIPS-Users] opensips tm timer core dump
> > Sent by: users-bounces at lists.opensips.org
> >
> > ------------------------------------------------------------------------
> >
> >
> >
> > Hi Kennard,
> >
> > The core was generated by process 22255:
> >    [22238]: INFO:core:handle_sigs: child process 22255 exited by a
> > signal 11
> >
> > and this process also reported mem problems:
> >    [22255]: ERROR:tm:new_t: out of mem
> >
> > Can you print the "tl" or "ptr" variables in frame 0?
> >
> > Regards,
> > Bogdan
> >
> > Kennard_White at logitech.com wrote:
> > >
> > > Running against opensips HEAD, I got a segfault in the tm timer code.
> > > I believe this is triggered by running out of shared memory.
> > >
> > >
> > > The stack trace:
> > >
> > > (gdb) where
> > > #0 0x00007fe8f8d96212 in insert_timer_unsafe (new_tl=0x7fe8f66337b0,
> > > > list_id=WT_TIMER_LIST, ext_timeout=<value optimized out>) at timer.c:731
> > > #1 set_1timer (new_tl=0x7fe8f66337b0, list_id=WT_TIMER_LIST,
> > > ext_timeout=<value optimized out>) at timer.c:904
> > > #2 0x00007fe8f8d78ac8 in t_release_transaction (trans=0x7fe8f6633730)
> > > at t_funcs.c:122
> > > #3 0x00007fe8f8d808e5 in t_unref (p_msg=<value optimized out>)
> > > at t_lookup.c:1152
> > > #4 0x0000000000483ae5 in exec_post_req_cb ()
> > > #5 0x000000000046c1e4 in receive_msg ()
> > > #6 0x00000000004bc77c in udp_rcv_loop ()
> > > #7 0x000000000042de9c in main ()
> > >
> > > The offending code (I believe):
> > > > if (tl->time_out==ptr->time_out) {
> > > >     tl->ld_tl = ptr->ld_tl;
> > > >     ptr->ld_tl = 0;
> > > >     tl->ld_tl->ld_tl = tl; <-- SEG FAULT HERE (according to trace)
> > > > } else {
> > > >     tl->ld_tl = tl;
> > > > }
> > >
> > > > Unfortunately, due to optimization I cannot dump anything useful, and
> > > > I'm not convinced the actual fault is on the line indicated. Note that
> > > > the core dump is not one of the processes that reported out of memory.
> > > > Maybe one of the other processes left the timer list in a corrupt
> > > > state?
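For illustration only: if the fault really is on the marked line, one possibility is that ptr->ld_tl was NULL when it was copied into tl->ld_tl, so the next dereference hits address 0x0. The sketch below is not the OpenSIPS code. The struct is a cut-down stand-in that borrows field names from the gdb dump in this thread, the type of time_out is an assumption, and link_leader_checked is a hypothetical helper that refuses the handoff instead of crashing.

```c
#include <stddef.h>

/* Cut-down stand-in for the tm timer link. Field names are taken from the
 * gdb output in this thread; the time_out type is an assumption. */
struct timer_link {
    struct timer_link *next_tl;
    struct timer_link *prev_tl;
    struct timer_link *ld_tl;
    unsigned long long time_out;
};

/* Hypothetical helper (not part of OpenSIPS): same logic as the quoted
 * snippet, but it checks the pointers before dereferencing them.
 * Returns 0 on success, -1 if a required pointer is NULL. */
static int link_leader_checked(struct timer_link *tl, struct timer_link *ptr)
{
    if (tl == NULL || ptr == NULL)
        return -1;

    if (tl->time_out == ptr->time_out) {
        if (ptr->ld_tl == NULL)
            return -1;          /* the unchecked code would fault below */
        tl->ld_tl = ptr->ld_tl;
        ptr->ld_tl = NULL;
        tl->ld_tl->ld_tl = tl;  /* safe: tl->ld_tl was verified non-NULL */
    } else {
        tl->ld_tl = tl;
    }
    return 0;
}
```

A caller could log and bail out on -1, which would turn the crash into a diagnosable error. It would not explain how the list got into that state in the first place.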
> > >
> > > The log file:
> > > Sep 29 11:43:36 org-sip01 /var/run/openser/opensips-pres[22255]:
> > > ERROR:tm:sip_msg_cloner: no more share memory
> > > Sep 29 11:43:36 org-sip01 /var/run/openser/opensips-pres[22255]:
> > > ERROR:tm:new_t: out of mem
> > > Sep 29 11:43:36 org-sip01 /var/run/openser/opensips-pres[22255]:
> > > ERROR:tm:t_newtran: new_t failed
> > > Sep 29 11:43:36 org-sip01 /var/run/openser/opensips-pres[22254]:
> > > > WARNING:core:fm_malloc: Not enough free memory, will atempt defragmenation
> > > Sep 29 11:43:36 org-sip01 /var/run/openser/opensips-pres[22254]:
> > > ERROR:tm:sip_msg_cloner: no more share memory
> > > Sep 29 11:43:36 org-sip01 /var/run/openser/opensips-pres[22254]:
> > > ERROR:tm:new_t: out of mem
> > > Sep 29 11:43:36 org-sip01 /var/run/openser/opensips-pres[22254]:
> > > ERROR:tm:t_newtran: new_t failed
> > > Sep 29 11:43:36 org-sip01 /var/run/openser/opensips-pres[22238]:
> > > INFO:core:handle_sigs: child process 22255 exited by a signal 11
> > > Sep 29 11:43:36 org-sip01 /var/run/openser/opensips-pres[22238]:
> > > INFO:core:handle_sigs: core was generated
> > > Sep 29 11:43:36 org-sip01 /var/run/openser/opensips-pres[22238]:
> > > INFO:core:handle_sigs: terminating due to SIGCHLD
> > > Sep 29 11:43:36 org-sip01 /var/run/openser/opensips-pres[22256]:
> > > INFO:core:sig_usr: signal 15 received
> > >
> > > 
> ------------------------------------------------------------------------
> > >
> > > _______________________________________________
> > > Users mailing list
> > > Users at lists.opensips.org
> > > http://lists.opensips.org/cgi-bin/mailman/listinfo/users
> > >  
> >
> >
> > --
> > Bogdan-Andrei Iancu
> > OpenSIPS Bootcamp
> > 15 - 19 November 2010, Edison, New Jersey, USA
> > www.voice-system.ro
>
>
> -- 
> Bogdan-Andrei Iancu
> OpenSIPS Bootcamp
> 15 - 19 November 2010, Edison, New Jersey, USA
> www.voice-system.ro


-- 
Bogdan-Andrei Iancu
OpenSIPS Bootcamp
15 - 19 November 2010, Edison, New Jersey, USA
www.voice-system.ro



