[OpenSER-Users] 200 OK retransmissions on missing ACK can cause subsequent calls to fail

Bogdan-Andrei Iancu bogdan at voice-system.ro
Wed May 7 21:12:50 CEST 2008


Hi Sean,

Yes, t_check() sets T to NULL if no transaction is matched, but the 
reply_received() function (which calls t_check()) jumps to the "not_found" 
label when T is NULL and resets T to T_UNDEFINED there.

Do you agree with this? If so, we can start adding some more 
debug logs to see where the problem is.
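
For reference, the T handling described above can be modelled roughly as
follows. This is a simplified sketch, not the tm source: model_t_check(),
model_reply_received(), lookup(), the reset_on_miss flag and the label
value 42 are all hypothetical names and values made up for illustration.
It only shows why a T left at NULL would make later replies look
"previously sought and not found", and why resetting T to T_UNDEFINED
on the not-found path avoids that.

```c
#include <stddef.h>

/* Simplified model of the per-process transaction pointer.
 * NOT the tm module code; every name here is a hypothetical stand-in. */

struct cell { int label; };              /* stand-in for a transaction */

/* Sentinel: no lookup has been done yet for the current message. */
#define T_UNDEFINED ((struct cell *) -1)

static struct cell *T = T_UNDEFINED;     /* cached lookup result */
static struct cell known = { 42 };       /* the only live transaction */

/* Hypothetical lookup: only label 42 matches. */
struct cell *lookup(int label)
{
    return label == known.label ? &known : NULL;
}

/* Model of t_check(): reuse the cached result unless T is T_UNDEFINED. */
int model_t_check(int label)
{
    if (T != T_UNDEFINED)
        return T != NULL;                /* "previously sought" path */
    T = lookup(label);
    return T != NULL;
}

/* Model of reply handling. reset_on_miss selects whether the not-found
 * path puts T back to T_UNDEFINED. Returns 1 if the reply was matched. */
int model_reply_received(int label, int reset_on_miss)
{
    if (!model_t_check(label)) {
        if (reset_on_miss)
            T = T_UNDEFINED;             /* not_found: forget the miss */
        return 0;
    }
    /* ... reply processing would happen here ... */
    T = T_UNDEFINED;                     /* finished with this message */
    return 1;
}
```

In this model, calling model_reply_received(7, 0) and then
model_reply_received(42, 0) returns 0 both times, because the stale NULL
is reused for the second reply. With reset_on_miss set to 1, the second
call returns 1.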

Regards,
Bogdan

Sean O'Donnell wrote:
> Hi all,
>
> I’m using openser as a call distributor/proxy between a soft-switch/SBC and
> a voicemail platform.  I’m seeing a problem where openser sometimes
> cancels an in-progress call (fr_inv_timer firing) because it didn’t match the
> 200/OK with the call.
>
> After some investigation, I noticed that this was happening after a missing ACK
> on a previous call caused the voicemail platform to retransmit 200/OK responses
> beyond the TM wt_timer expiration, which in turn left several openser child
> processes (those that received a 200 after wt_timer expiration) in a state such
> that they might not properly match transactions on subsequent calls.  
>
> My setup:
> I have openser 1.2.0 operating on a linux box with two network interfaces, with
> one interface (call it the outside interface) taking incoming calls from the
> soft-switch, and the other (inside) connected to the VM platform.  I have
> openser configured to use both interfaces (see config below) and the TM wt_timer
> set to 5 seconds (default).  As this is a voicemail system, all of the call
> traffic is inbound from the soft-switch.   Given the traffic flow, for the most
> part the openser child processes servicing the inside interface are handling
> responses (180,183,200) from the VM platform.
>
> Call scenario:
> When an INVITE arrives from the soft-switch, openser forwards it to the VM
> platform.  The VM platform responds with a 180 and then a 200.  I've noticed
> several instances where the soft-switch did not respond with an ACK.  This
> caused the VM platform to retransmit the 200 several times over a 10 second
> period.  These were absorbed correctly by openser for the duration of wt_timer.
> After the timer expired, however, each openser child process that received a
> retransmitted 200 logged something like this:
>  4(2715) DEBUG: t_reply_matching: hash 45870 label 727647196 branch 0
>  4(2715) DEBUG: t_reply_matching: no matching transaction exists
>  4(2715) DEBUG: t_reply_matching: failure to match a transaction
>  4(2715) DEBUG: t_check: end=(nil)
>
> When I look at the TM code, the static variable T in t_lookup.c is now NULL for
> this child process.  
>
> On a subsequent inbound call, the INVITE is passed to the VM correctly, and the
> 180 transaction matches (causing the fr_inv_timer to be armed).  If the 200 is
> read by child proc 2715, I see: 
>  4(2715) DEBUG: t_check: start=(nil)
>  4(2715) DEBUG: t_check: T previously sought and not found
>
> The 200 is forwarded back to the soft-switch, which responds with an ACK.  Both
> end-points think the call is up, but since openser never matched the 200 with
> the call, the fr_inv_timer fires and cancels the call.   Basically, child proc
> 2715 won’t match any transaction after this unless it happens to process a
> request.
>
> I think this problem is made worse by the fact that I’m using two network
> interfaces, and that the openser children on the inside interface handle (for
> the most part) only responses.  This problem was touched on here:
> http://lists.openser.org/pipermail/users/2007-November/014188.html   but I
> didn’t see any follow up.    Also, I’ve checked openser 1.2.3 and 1.3.1 for
> fixes, but I don’t think this has been addressed.
>
> I have a workaround, I think, by upping the wt_timer to something like 15
> seconds, but I was wondering if there is any scenario in which leaving T=NULL is
> desirable.
>
> Thanks in advance
> Sean
>
>   




