[OpenSER-Users] 200 OK retransmissions on missing ACK can cause subsequent calls to fail

Bogdan-Andrei Iancu bogdan at voice-system.ro
Fri May 9 21:46:47 CEST 2008


Hi Sean,

So, is this scenario working OK for you now?

Regards,
Bogdan

Sean O'Donnell wrote:
> Hi Bogdan:
> Ahh, my mistake, I was looking at the TM code from the 1.2.0 release.
> I missed the goto in the more recent releases that fixed the problem.
> Thanks!
> Sean
>   
> ---- On Wed, 7 May 2008, Bogdan-Andrei Iancu (bogdan at voice-system.ro) wrote:
>
>     Hi Sean,
>
>     Yes, t_check() sets T to NULL if no transaction is matched, but the
>     reply_received() function (which calls t_check()) will, if T was set
>     to NULL, go to the "not_found" label and set T to T_UNDEFINED.
>
>     Do you agree on this? If so, we can start adding some more debug
>     logs to see where the problem is.
>
>     Regards,
>     Bogdan
>
>     Sean O'Donnell wrote:
>     > Hi all,
>     >
>     > I’m using openser as a call distributor/proxy between a soft-switch/SBC
>     > and a voicemail platform.  I’m seeing a problem in that openser
>     > sometimes cancels an in-progress call (fr_inv_timer firing) because it
>     > didn’t match the 200/OK with the call.
>     >
>     > After some investigation, I noticed that this was happening after a
>     > missing ACK on a previous call caused the voicemail platform to
>     > retransmit 200/OK responses beyond the TM wt_timer expiration, which in
>     > turn left several openser child processes (those that received a 200
>     > after wt_timer expiration) in a state such that they might not properly
>     > match transactions on subsequent calls.
>     >
>     > My setup:
>     > I have openser 1.2.0 operating on a linux box with two network
>     > interfaces, with one interface (call it the outside interface) taking
>     > incoming calls from the soft-switch, and the other (inside) connected
>     > to the VM platform.  I have openser configured to use both interfaces
>     > (see config below) and the TM wt_timer set to 5 seconds (default).  As
>     > this is a voicemail system, all of the call traffic is inbound from
>     > the soft-switch.  Given the traffic flow, for the most part the openser
>     > child processes servicing the inside interface are handling responses
>     > (180,183,200) from the VM platform.
>     >
>     > Call scenario:
>     > When an INVITE arrives from the soft-switch, openser forwards it to the
>     > VM platform.  The VM platform responds with a 180 and then a 200.  I've
>     > noticed several instances where the soft-switch did not respond with an
>     > ACK.  This caused the VM platform to retransmit the 200 several times
>     > over a 10 second period.  These were absorbed correctly by openser for
>     > the duration of wt_timer.  After the timer expired, however, each
>     > openser child process that received a retransmitted 200 logged
>     > something like this:
>     >  4(2715) DEBUG: t_reply_matching: hash 45870 label 727647196 branch 0
>     >  4(2715) DEBUG: t_reply_matching: no matching transaction exists
>     >  4(2715) DEBUG: t_reply_matching: failure to match a transaction
>     >  4(2715) DEBUG: t_check: end=(nil)
>     >
>     > When I look at the TM code, the static variable T in t_lookup.c is now
>     > NULL for this child process.
>     >
>     > On a subsequent inbound call, the INVITE is passed to the VM correctly,
>     > and the 180 transaction matches (causing the fr_inv_timer to be armed).
>     > If the 200 is read by child proc 2715, I see:
>     >  4(2715) DEBUG: t_check: start=(nil)
>     >  4(2715) DEBUG: t_check: T previously sought and not found
>     >
>     > The 200 is forwarded back to the soft-switch, which responds with an
>     > ACK.  Both end-points think the call is up, but since openser never
>     > matched the 200 with the call, the fr_inv_timer fires and cancels the
>     > call.  Basically, child proc 2715 won’t match any transaction after
>     > this unless it happens to process a request.
>     >
>     > I think this problem is made worse by the fact that I’m using two
>     > network interfaces, and that the openser children on the inside
>     > interface handle (for the most part) only responses.  This problem was
>     > touched on here:
>     > http://lists.openser.org/pipermail/users/2007-November/014188.html
>     > but I didn’t see any follow up.  Also, I’ve checked openser 1.2.3 and
>     > 1.3.1 for fixes, but I don’t think this has been addressed.
>     >
>     > I have a workaround, I think, by raising the wt_timer to something
>     > like 15 seconds, but I was wondering if there is any scenario in which
>     > leaving T=NULL is desirable.
>     >
>     > Thanks in advance
>     > Sean
>     >
>     >   
>
>
>
>         
>
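
The state handling discussed in this thread can be sketched with a small toy model. This is not the OpenSER TM source: only the names T, T_UNDEFINED, t_check and reply_received come from the messages above. The struct layout, the sentinel value, the one-entry lookup table, the reset_on_miss switch and run_scenario() are hypothetical and exist only for illustration. The model contrasts a per-process cache that stays NULL after a failed match with one that is reset to T_UNDEFINED on the "not_found" path.

```c
#include <stddef.h>

/* Toy stand-in for a TM transaction cell; the real structure is far larger. */
struct cell { unsigned int label; };

/* Sentinel meaning "no lookup attempted yet for the current message".
 * The name comes from the thread; the value used here is an assumption. */
#define T_UNDEFINED ((struct cell *)-1)

/* Per-process lookup cache, playing the role of the static T in the thread. */
static struct cell *T = T_UNDEFINED;

/* Hypothetical one-entry transaction table for the demonstration. */
static struct cell live_call = { 1001u };

static struct cell *lookup(unsigned int label)
{
    return (label == live_call.label) ? &live_call : NULL;
}

/* Modeled t_check(): trust the cached result unless T is T_UNDEFINED. */
static int t_check_model(unsigned int label)
{
    if (T == T_UNDEFINED)
        T = lookup(label);          /* becomes NULL when nothing matches */
    return T != NULL;
}

/* Modeled reply_received(). reset_on_miss = 1 mimics a "not_found" path
 * that resets T; reset_on_miss = 0 mimics leaving T as NULL. */
static int reply_received_model(unsigned int label, int reset_on_miss)
{
    if (!t_check_model(label)) {
        if (reset_on_miss)
            T = T_UNDEFINED;
        return 0;                   /* reply not matched */
    }
    T = T_UNDEFINED;                /* assumption: cache released after a match */
    return 1;                       /* reply matched */
}

/* Late 200 for a vanished transaction, then a 200 for a live call,
 * both handled by the same process. Returns 1 if the second 200 matched. */
int run_scenario(int reset_on_miss)
{
    T = T_UNDEFINED;
    reply_received_model(9999u, reset_on_miss);
    return reply_received_model(1001u, reset_on_miss);
}
```

Under these assumptions, run_scenario(0) returns 0 (the live call's 200 is skipped because the stale NULL is trusted), while run_scenario(1) returns 1 (the cache was reset, so the lookup runs again).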




