[OpenSIPS-Users] OpenSIPS fix_route_dialog crashes

Bogdan-Andrei Iancu bogdan at opensips.org
Thu Sep 15 11:00:19 CEST 2016


Hi Ben,

Thank you for the update. Have you tried compiling with memory debugging
support? It might speed up detection of the error.

Regards,

Bogdan-Andrei Iancu
OpenSIPS Founder and Developer
http://www.opensips-solutions.com

On 05.08.2016 23:00, Newlin, Ben wrote:
>
> Bogdan,
>
> I tried to reproduce this with some simple instructions, but it didn’t 
> reproduce. There must be some dependency on other functions or 
> configurations we are using. It would be too hard to try to figure out 
> exactly what that is, so I will have to capture the data for you. It 
> will just take a while for me to revive an older configuration that 
> allows me to enable the memory debugger.
>
> Ben Newlin
>
> *From: *Bogdan-Andrei Iancu <bogdan at opensips.org>
> *Date: *Tuesday, August 2, 2016 at 3:47 AM
> *To: *"Newlin, Ben" <Ben.Newlin at inin.com>, OpenSIPS users mailling 
> list <users at lists.opensips.org>
> *Subject: *Re: [OpenSIPS-Users] OpenSIPS fix_route_dialog crashes
>
> Ben,
>
> To make it easier, please send me the instructions on how to reproduce 
> the crash.
>
> Thanks and Regards,
>
>
> Bogdan-Andrei Iancu
> OpenSIPS Founder and Developer
> http://www.opensips-solutions.com
>
> On 01.08.2016 20:17, Newlin, Ben wrote:
>
>     Bogdan,
>
>     I am not familiar with gdb, so I cannot double check what you’ve
>     assessed. If there are other steps with gdb you would like me
>     to perform, just let me know what to do.
>
>     Is there a way to compile the memory debugger without using the
>     interactive `make menuconfig` command? Our build system is
>     completely automated, so it is impossible for me to do it that way.
>     Can I pass the options as build parameters or alter the makefile
>     in some way?
>
>     I can provide a SIPp scenario which should reproduce the issue on
>     any basic script that uses Dialog with topology hiding, if that
>     would be easier.
>
>     Ben Newlin
>
>     *From: *Bogdan-Andrei Iancu <bogdan at opensips.org>
>     *Date: *Monday, August 1, 2016 at 10:57 AM
>     *To: *OpenSIPS users mailling list <users at lists.opensips.org>,
>     "Newlin, Ben" <Ben.Newlin at inin.com>
>     *Subject: *Re: [OpenSIPS-Users] OpenSIPS fix_route_dialog crashes
>
>     Hi Ben,
>
>     According to the BT, the crash is in a pkg_malloc() call:
>                         route = pkg_malloc(size);
>     Please double check this with the gdb info.
>
>     If so, this indicates memory corruption, and we have two options here:
>         - you compile with the memory debugger (see my previous emails)
>         - you provide step-by-step instructions on how to reproduce this crash.
>
>     Thanks and Regards,
>
>     Bogdan-Andrei Iancu
>     OpenSIPS Founder and Developer
>     http://www.opensips-solutions.com
>
>     On 29.07.2016 15:54, Newlin, Ben wrote:
>
>         This is 1.11.6, running on CentOS 7.
>
>         Ben Newlin
>
>         *From: *<users-bounces at lists.opensips.org> on behalf of
>         Bogdan-Andrei Iancu <bogdan at opensips.org>
>         *Reply-To: *OpenSIPS users mailling list
>         <users at lists.opensips.org>
>         *Date: *Friday, July 29, 2016 at 8:50 AM
>         *To: *"Newlin, Ben" <Ben.Newlin at inin.com>, OpenSIPS users
>         mailling list <users at lists.opensips.org>
>         *Subject: *Re: [OpenSIPS-Users] OpenSIPS fix_route_dialog crashes
>
>         Ben,
>
>         Which OpenSIPS version is this (the crashing one)? 1.11 or 2.1?
>
>         Regards,
>
>         Bogdan-Andrei Iancu
>         OpenSIPS Founder and Developer
>         http://www.opensips-solutions.com
>
>         On 27.07.2016 19:02, Newlin, Ben wrote:
>
>             I have identified that these crashes are occurring when
>             the far end system is not returning the Record-Route
>             headers in the 200 OK response. The headers are present in
>             the 180 response, but not the 200 OK. I have reproduced
>             the scenario using SIPp and captured a SIP trace:
>             http://pastebin.com/ckKk3EhY
>
>             The crash occurs on receipt of the ACK request and attempt
>             to match the dialog.
>
>             I also captured a BT for this scenario, in case
>             anything specific in the trace makes the issue easier to
>             find: http://pastebin.com/cM3FhPiw
>
>             I am working with the other system to try to fix their
>             behavior.
>
>             Ideally the Record-Route headers from previous replies
>             could be used in this case to allow the call to succeed,
>             but I don’t know if that is possible.
>
>             Thanks,
>
>             Ben Newlin
>
>             *From: *"Newlin, Ben" <Ben.Newlin at inin.com>
>             *Date: *Wednesday, July 27, 2016 at 9:44 AM
>             *To: *Bogdan-Andrei Iancu <bogdan at opensips.org>,
>             OpenSIPS users mailling list <users at lists.opensips.org>
>             *Subject: *Re: [OpenSIPS-Users] OpenSIPS fix_route_dialog
>             crashes
>
>             Bogdan,
>
>             This is a different scenario than the other you responded
>             to. As I said, we have two types of servers that work
>             together. One is a load-balancer and runs as a proxy. It
>             uses double Record-Route because it sends messages between
>             public and private networks. Then we have our other
>             servers using TH which receive those requests. We are not
>             using TH and RR on the same server (although I would like to).
>
>             If validate_dialog() and fix_route_dialog() (and possibly
>             loose_route()) should not be called when using TH, I
>             believe the documentation should reference that. It states
>             that match_dialog() must be used with TH, but does not
>             indicate that the other functions should not be used or
>             that the functionality won’t work. There is also no
>             documentation of the incompatibility between RR and TH.
>
>             Either way, I ran a test where I removed all calls to
>             loose_route(), validate_dialog(), and fix_route_dialog()
>             from my script. The crash still occurred and the BT still
>             pointed to the fix_route_dialog() function, so it must be
>             getting called from somewhere within the Dialog module. That
>             BT is here: http://pastebin.com/wu2X2Hxh
>
>             I collected this BT with loose_route() being called from
>             my script, but not validate_dialog() or
>             fix_route_dialog(): http://pastebin.com/6V7yPaHF
>
>             This BT was collected with all three functions being
>             called from my script: http://pastebin.com/fZYYdndn
>
>             Ben Newlin
>
>             *From: *Bogdan-Andrei Iancu <bogdan at opensips.org>
>             *Date: *Wednesday, July 27, 2016 at 3:57 AM
>             *To: *OpenSIPS users mailling list
>             <users at lists.opensips.org>, "Newlin, Ben"
>             <Ben.Newlin at inin.com>
>             *Subject: *Re: [OpenSIPS-Users] OpenSIPS fix_route_dialog
>             crashes
>
>             Hi Ben,
>
>             First, if you use TH, it makes no sense to do Record-Routing:
>             these are two overlapping SIP concepts. You act either
>             as an end-point (by doing TH) or as a proxy (by doing RR).
>
>             If you are doing TH, it makes no sense to use validate + fix,
>             as these functions check and repair the routing information in
>             the request (like the Route and Contact headers). With TH,
>             this routing info is actually hidden and added by
>             OpenSIPS, so there is nothing to fix or repair.
>
>             Nevertheless, this should not crash or corrupt OpenSIPS.
>             Have you managed to get a corefile?
>
>             Also, if you suspect memory corruption, you can compile in
>             the memory debugger; see
>             http://www.opensips.org/Documentation/TroubleShooting-OutOfMem
>
>             Regards,
>
>             Bogdan-Andrei Iancu
>             OpenSIPS Founder and Developer
>             http://www.opensips-solutions.com
>
>             On 26.07.2016 23:20, Newlin, Ben wrote:
>
>                 I have had 3 OpenSIPS server crashes in the last week.
>                 All were due to segmentation faults. I was not able to
>                 capture core dumps; I am configuring that now to catch
>                 the next crash.
>
>                 My logs leading up to the crash are full of errors
>                 from fix_route_dialog() complaining about invalid URIs
>                 for sequential requests:
>
>                 Jul 26 19:34:02 [220] ERROR:dialog:fix_route_dialog:
>                 Failed to parse SIP uri
>
>                 Jul 26 19:34:02 [220] ERROR:core:parse_uri: bad uri,
>                 state 0 parsed: <ip:1> (4) /
>                 <ip:10.18.8.18:5060;ftag=gK0448f137;lr;r2=on>> (44)
>
>                 Jul 26 19:11:19 [218] ERROR:dialog:fix_route_dialog:
>                 Failed to parse SIP uri
>
>                 Jul 26 19:11:19 [218] ERROR:core:parse_uri: bad uri,
>                 state 0 parsed: <b0i2> (4) /
>                 <b0i2yjor;transport=udp<sip:10.18.8.17:5060;ftag=7207ce89;lr;r2=on>
>                 (65)
>
>                 Jul 26 17:43:19 [220] ERROR:dialog:fix_route_dialog:
>                 Failed to parse SIP uri
>
>                 Jul 26 17:43:19 [220] ERROR:core:parse_uri: bad uri,
>                 state 0 parsed: <ervi> (4) /
>                 <ervice_id6fdbc70f-2438-4726-807c-0d081df4d87> (44)
>
>                 Many times the “URI” displayed in the error message is
>                 actually internal OpenSIPS variables, as in the last
>                 error above. When they are from the SIP message, I
>                 have verified that the messages themselves are
>                 correctly formatted. This leads me to believe there is
>                 memory corruption occurring.
>
>                 This all started when I updated my load-balancer
>                 servers to use Record-Routing, specifically the
>                 “double_rr” mechanism for when multiple interfaces
>                 exist. The Record-Routing is occurring on different
>                 servers which have not crashed. Only the servers
>                 receiving the Record-Routed messages are experiencing
>                 the errors.
>
>                 Here is a piece of the code processing sequential
>                 requests. I am using the topology_hiding()
>                 functionality of the Dialog module. Are
>                 validate_dialog() and fix_route_dialog() still valid
>                 in a topology_hiding scenario?
>
>                 if (t_check_trans())
>                     setflag(SEQ_REQUEST);
>
>                 if (has_totag())
>                 {
>                     loose_route();
>
>                     if (match_dialog())
>                     {
>                         if (!validate_dialog())
>                             fix_route_dialog();
>
>                         if (is_method("BYE"))
>                             setflag(ACC_FLAG);
>
>                         setflag(SEQ_REQUEST);
>                     }
>                     else if (!isflagset(SEQ_REQUEST))
>                     {
>                         if (!is_method("ACK")) {
>                             route(rlog, LV_ERROR, "check_sequential",
>                                   "Sequential request not matched");
>                             route(reply_error, "481", "Call Does Not Exist");
>                         }
>                         return(EXIT);
>                     }
>                 }
>
>                 I will attempt to get core dumps of future crashes.
>
>                 Thanks,
>
>                 Ben Newlin
>
>                 _______________________________________________
>                 Users mailing list
>                 Users at lists.opensips.org
>                 http://lists.opensips.org/cgi-bin/mailman/listinfo/users
>
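
The trigger identified earlier in the thread (Record-Route headers present in the 180 but absent from the 200 OK) can be checked offline against a captured trace. The following is a minimal Python sketch of such a check, not anything taken from OpenSIPS itself. The function names and both sample messages are invented for illustration, the addresses are documentation-range placeholders, and the parser deliberately ignores folded header lines and commas inside quoted strings.

```python
def record_routes(sip_message: str) -> list:
    """Return the Record-Route values of a raw SIP message, in order.

    Simplification (assumption): one header per line, no line folding,
    and commas only appear as separators between route entries.
    """
    head = sip_message.replace("\r\n", "\n").split("\n\n", 1)[0]
    routes = []
    for line in head.split("\n")[1:]:  # skip the request/status line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "record-route":
            routes.extend(v.strip() for v in value.split(",") if v.strip())
    return routes


def missing_in_final(provisional: str, final: str) -> list:
    """List Record-Route values seen in the provisional reply but not in the final one."""
    final_routes = set(record_routes(final))
    return [r for r in record_routes(provisional) if r not in final_routes]


# Hypothetical replies, shaped like the scenario described in the thread.
RINGING_180 = (
    "SIP/2.0 180 Ringing\r\n"
    "Via: SIP/2.0/UDP 192.0.2.1:5060\r\n"
    "Record-Route: <sip:192.0.2.10;lr;r2=on>\r\n"
    "Record-Route: <sip:198.51.100.10;lr;r2=on>\r\n"
    "Call-ID: example-call\r\n"
    "\r\n"
)
OK_200 = (
    "SIP/2.0 200 OK\r\n"
    "Via: SIP/2.0/UDP 192.0.2.1:5060\r\n"
    "Contact: <sip:uas@203.0.113.5>\r\n"
    "Call-ID: example-call\r\n"
    "\r\n"
)

if __name__ == "__main__":
    for route in missing_in_final(RINGING_180, OK_200):
        print("dropped from final reply:", route)
```

A non-empty result from missing_in_final() flags a call that matches the failing pattern, which may help when collecting reproduction data from a larger capture.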

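The parse_uri errors quoted in the original report can also be screened mechanically to see how often the "URI" handed to the parser was not a SIP URI at all. This is a rough Python sketch under stated assumptions: the regular expression is inferred only from the three log lines quoted in this thread (with their wrapped lines rejoined), not from the OpenSIPS source, and the names BAD_URI and screen are made up for the example.

```python
import re

# Pattern inferred from the quoted log lines (an assumption, not an official format):
#   ... parsed: <HEAD> (HEAD_LEN) / <FULL> (FULL_LEN)
BAD_URI = re.compile(
    r"parsed: <(?P<head>.*?)> \((?P<head_len>\d+)\) / "
    r"<(?P<full>.*)> \((?P<full_len>\d+)\)\s*$"
)


def screen(line: str):
    """Summarise one 'bad uri' log line, or return None if it does not match."""
    m = BAD_URI.search(line)
    if m is None:
        return None
    full = m.group("full")
    return {
        "text": full,
        # Does the logged length agree with the logged text?
        "length_matches": len(full) == int(m.group("full_len")),
        # Does the buffer even begin with a SIP scheme?
        "has_sip_scheme": full.lower().startswith(("sip:", "sips:")),
    }


# The three error lines from the report, with line wraps removed.
LOG_LINES = [
    "Jul 26 19:34:02 [220] ERROR:core:parse_uri: bad uri, state 0 parsed: "
    "<ip:1> (4) / <ip:10.18.8.18:5060;ftag=gK0448f137;lr;r2=on>> (44)",
    "Jul 26 19:11:19 [218] ERROR:core:parse_uri: bad uri, state 0 parsed: "
    "<b0i2> (4) / <b0i2yjor;transport=udp<sip:10.18.8.17:5060;ftag=7207ce89;lr;r2=on> (65)",
    "Jul 26 17:43:19 [220] ERROR:core:parse_uri: bad uri, state 0 parsed: "
    "<ervi> (4) / <ervice_id6fdbc70f-2438-4726-807c-0d081df4d87> (44)",
]

if __name__ == "__main__":
    for entry in LOG_LINES:
        print(screen(entry))
```

None of the three quoted buffers begins with a sip: or sips: scheme, which fits the report's observation that the parser was being handed misaligned or unrelated memory rather than malformed messages.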