[OpenSIPS-Users] OpenSIPS not processing responses?

Muhammad Shahzad Shafi shahzad at voip-demos.com
Mon Nov 18 12:40:21 CET 2013


 

Ah so, this makes sense. I guess you just need to adjust and fine
tune some TCP parameters, take a look at
this,

www.opensips.org/Documentation/Script-CoreParameters-1-8#toc75
[2] Important flags to optimize are,


tcp_connect_timeout
tcp_send_timeout tcp_no_new_conn_flag
tcp_keepxxxx

Also as you mentioned, the IP in error message is
different and unreachable, which leads me to question, which node is
handing NAT? i guess it would be opensips (since it seems to be edge
proxy in your setup, comparable to P-CSCF in IMS), so you need to make
sure you handle the NAT correctly on it (with NAT keepalive etc.
setup).

Thank you. 

On 2013-11-15 14:59, Gavin Murphy wrote: 

> Yes,
we are bridging between TCP and UDP. There are no custom scripts being
used, just the core OpenSIPS scripting, with OpenSIPS acting just as a
proxy. We had another event occur last evening and managed to capture
some interesting information that suggests the problem is related to
blocking tcp connections:
> 
> aaa at ip-10-72-7-129:/dir$ grep
"ERROR:core:tcp_blocking_connect: timeout 10 s elapsed from 10 s"
opensips.log | sed 's/:.. ip.*//'| uniq -c
> 2 Nov 14 06:50
> 2 Nov 14
06:53
> 2 Nov 14 06:55
> 1 Nov 14 06:56
> 2 Nov 14 06:57
> 2 Nov 14
07:00
> ....
> 2 Nov 14 13:53
> 3 Nov 14 13:54
> 2 Nov 14 13:55
> 23 Nov
14 13:56
> 29 Nov 14 13:57
> 30 Nov 14 13:58
> 30 Nov 14 13:59
> 
>
While I'm not familiar with the inner workings of OpenSIPS, it seems
like there is something happening that may be causing all of the child
processes to block while trying to establish outbound TCP connections.
Unfortunately the "tcp_blocking_connect: timeout" error (and the
subsequent "tcp_blocking_connect failed" error) doesn't indicate the
address that caused the failure, and there's no way that I can see to
correlate the error back to a request being processed. But I suspect
it's a case where a client connected via TCP, registered with our
registrar (where the path to the client, including the address of the
TCP connection endpoint on OpenSIPS that has an ephemeral port, is
recorded), but when we go to send a bunch of requests to that client it
is no longer there and it causes all of the opensips child processes to
effectively block trying to establish a connection to an address it will
never be able to connect to. In the end 477 Send Failed is returned in
those cases, but I think the child trying to establish the connection is
blocked until the connect times out, and things start piling up behind
it, leading to a vicious circle.
> 
> Assuming all of the analysis is
correct, can we prevent OpenSIPS from spending time trying to establish
a connection that can never be established?
> 
> Thanks,
> 
> Gavin
> 
>
On 15/11/2013 7:00 AM, users-request at lists.opensips.org [1] wrote: 
>

>> Seems opensips is acting as transport bridge between udp and tcp,
so, 
>> do the SIP replies actually arrive at same interface (transport
+ ip + 
>> port) from where the request was sent, and opensips is
listening to that 
>> interface? Can you share SIP trace with us?
>> 
>>
Also, are you using any custom application, e.g. perl / lua script etc.

>> in opensips dial plan, that might be blocking the opensips thread
that 
>> is managing this transaction. As far as i remember opensips
1.8.x does 
>> not have async processing, therefore, if any script or
module command 
>> that consumes time e.g. custom sql query, external
script or shell 
>> command etc. would block opensips and any responses
received during that 
>> time are likely to be ignored.
>> 
>> Thank
you.
>> 
>> On 2013-11-14 16:05, Gavin Murphy wrote:
>> 
>>> Hi all,
>>>

>>> We're seeing a possible issue in OpenSIPS related to the timely
>>>
processing of replies. For example, OpenSIPS receives a REGISTER and
>>>
passes it on to our application server (which is the registrar). The
>>>
registrar receives it within the same second and generates the
>>>
response (a 401 initially) within 0.003 seconds. However OpenSIPS
>>>
doesn't appear to receive that message, and half a second after the
>>>
first REGISTER is sent, there is a retransmission after 0.5 second,
>>>
which is again received by our registrar and a response is generated
>>>
very quickly. But still the 401 doesn't get received and/or
processed
>>> by OpenSIPS. The retransmission happens again a few times
as per RFC
>>> 3261. Eventually, almost 30 seconds later OpenSIPS logs
that the 
>>> reply
>>> to the REGISTER has been received.
>>> 
>>>
Based on the evidence it seems that there is no problem with
OpenSIPS
>>> sending the request and it being received by our registrar.
There 
>>> also
>>> doesn't appear to be any issues with the
retransmission by OpenSIPS 
>>> at
>>> the right intervals, nor does the
registrar appear to be introducing
>>> any delays. There is not much
other traffic going on at the same 
>>> time,
>>> but there are other
REGISTERs that are getting the same result.
>>> 
>>> Here are some logs
from OpenSIPS:
>>> Initial relay to our registrar:
>>> Nov 1 19:14:40
ip-10-72-7-129 rcs-opensips[13839]: RCS INFO:
>>> ROUTE[2] (Inbound and
Outbound) - REGISTER REQUEST relayed. Exiting.
>>> Nov 1 19:14:40
ip-10-72-7-129 rcs-opensips[13839]: DBG:tm:t_unref:
>>> UNREF_UNSAFE:
[0x7f8f8b8d6518] after is 0
>>> Nov 1 19:14:40 ip-10-72-7-129
rcs-opensips[13839]:
>>> DBG:core:destroy_avp_list: destroying list
(nil)
>>> 
>>> Retransmissions:
>>> Nov 1 19:14:40 ip-10-72-7-129
rcs-opensips[13836]:
>>> DBG:tm:retransmission_handler:
retransmission_handler : request
>>> resending (t=0x7f8f8b8d6518,
REGISTER ... )
>>> Nov 1 19:14:41 ip-10-72-7-129
rcs-opensips[13836]:
>>> DBG:tm:retransmission_handler:
retransmission_handler : request
>>> resending (t=0x7f8f8b8d6518,
REGISTER ... )
>>> Nov 1 19:14:43 ip-10-72-7-129
rcs-opensips[13836]:
>>> DBG:tm:retransmission_handler:
retransmission_handler : request
>>> resending (t=0x7f8f8b8d6518,
REGISTER ... )
>>> Nov 1 19:14:53 ip-10-72-7-129
rcs-opensips[13836]:
>>> DBG:tm:retransmission_handler:
retransmission_handler : request
>>> resending (t=0x7f8f8b8d6518,
REGISTER ... )
>>> Nov 1 19:15:04 ip-10-72-7-129
rcs-opensips[13836]:
>>> DBG:tm:retransmission_handler:
retransmission_handler : request
>>> resending (t=0x7f8f8b8d6518,
REGISTER ... )
>>> 
>>> Reply "received" (IP,s TNs, and domains
anonymized):
>>> Nov 1 19:15:08 ip-10-72-7-129 rcs-opensips[13831]:
>>>
DBG:tm:t_reply_matching: REF_UNSAFE:[0x7f8f8b8d6518] after is 1
>>> Nov
1 19:15:08 ip-10-72-7-129 rcs-opensips[13831]:
>>>
DBG:tm:t_reply_matching: reply matched (T=0x7f8f8b8d6518)!
>>> Nov 1
19:15:08 ip-10-72-7-129 rcs-opensips[13831]: DBG:tm:t_check:
>>>
end=0x7f8f8b8d6518
>>> Nov 1 19:15:08 ip-10-72-7-129
rcs-opensips[13831]:
>>> DBG:tm:reply_received: org. status uas=0,
uac[0]=0 local=0
>>> is_invite=0)
>>> Nov 1 19:15:08 ip-10-72-7-129
rcs-opensips[13831]: RCS INFO:
>>> PROCESSING ONREPLY ROUTE[2] (Reply
for Inbound Request)
>>> Nov 1 19:15:08 ip-10-72-7-129
rcs-opensips[13831]: RCS DEBUG:
>>> ONREPLY_ROUTE[1] (Reply for Inbound
Request) - Dump Request Info.
>>> Nov 1 19:15:08 ip-10-72-7-129
rcs-opensips[13831]: RCS DEBUG:
>>> ============= DUMP REPLY
=============
>>> Nov 1 19:15:08 ip-10-72-7-129 rcs-opensips[13831]: RCS
DEBUG: SIP
>>> message buffer:#012SIP/2.0 401 Unauthorized#015#012Via:
SIP/2.0/UDP
>>>
x.x.x.x:6000;branch=z9hG4bKf439.62091482.0;i=c04c3,SIP/2.0/TCP
>>> 
>>>
x.x.x.x:40042;branch=z9hG4bK1490773660;received=x.x.x.x;rport=40042#015#012Record-Route:
>>>
#015#012Call-ID:
>>>
6d76ce9e-ed19-9701-1e75-f070eacd3400#015#012From:
>>>
;tag=330928025#015#012To:
>>> ;tag=Jmuh3YOvEmCA#015#012CSeq:
221716357
>>> REGISTER#015#012Contact: #015#012Require:
>>>
gruu#015#012WWW-Authenticate: Digest
>>> 
>>>
realm="domain.net",nonce="5273fda02d788dccba9301c8e1f68f078f8e95e5",qop="auth",opaque="004533235332435434ffac663e",algorithm=MD5#015#012Content-Length:
>>>
0#015#012P-Associated-URI:
>>> #015#012P-Preferred-Identity:
>>>
#015#012P-Access-Network-Info:
>>>
ADSL;utran-cell-id-3gpp=00000000#015#012Privacy:
none#015#012#015#012
>>> 
>>> On the registrar the logs appear as
follows:
>>> [2013-Nov-01 19:14:40.185708] [14002] [6-inf] ===RECV==>
REGISTER
>>> (6d76ce9e-ed19-9701-1e75-f070eacd3400:+xxxxxxxxxx =>
+xxxxxxxxxx)
>>> [2013-Nov-01 19:14:40.188331] [14002] [6-inf] 
>>>
+xxxxxxxxxx)
>>> [2013-Nov-01 19:14:40.721011] [14002] [6-inf]
===RECV==> REGISTER
>>>
(6d76ce9e-ed19-9701-1e75-f070eacd3400:+xxxxxxxxxx => +xxxxxxxxxx)
>>>
[2013-Nov-01 19:14:40.721162] [14002] [6-inf] 
>>> +xxxxxxxxxx)
>>>
[2013-Nov-01 19:14:41.723160] [14002] [6-inf] ===RECV==> REGISTER
>>>
(6d76ce9e-ed19-9701-1e75-f070eacd3400:+xxxxxxxxxx => +xxxxxxxxxx)
>>>
[2013-Nov-01 19:14:41.723262] [14002] [6-inf] 
>>> +xxxxxxxxxx)
>>> 
>>>
etc....
>>> 
>>> This "bogging down" seems to happen on a regular basis,
and the only
>>> way we are able to resolve it at the moment is to
restart OpenSIPS.
>>> Until it is restarted all of the registrations are
essentially
>>> failing. I believe it is v1.8.2 that we are running.
>>>

>>> Anyone have any thoughts as to what might be happening?
>>> 
>>>
Thanks,
>>> 
>>> Gavin
> 
> -- 
> Gavin Murphy
> Vice President & CTO |
www.newpace.ca - Real Technology Solutions 
> (e)
gavin.murphy at newpace.ca
> (w) +1.902.406.8375 x 1002
> (m)
+1.902.401.9445
> (f) +1.902.406.8377

-- 
Mit freundlichen
Grüßen
Muhammad Shahzad
-----------------------------------
CISCO Rich
Media Communication Specialist (CRMCS)
CISCO Certified Network Associate
(CCNA)
Cell: +49 176 99 83 10 85
MSN: shari_786pk at hotmail.com
Email:
shaheryarkh at googlemail.com
 

Links:
------
[1]
mailto:users-request at lists.opensips.org
[2]
http://www.opensips.org/Documentation/Script-CoreParameters-1-8#toc75
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.opensips.org/pipermail/users/attachments/20131118/dbc6acd9/attachment-0001.htm>


More information about the Users mailing list