[OpenSIPS-Users] Mediaproxy hanging sessions on high load

Adrian Georgescu ag at ag-projects.com
Thu Mar 16 21:40:31 EDT 2017


Perhaps your virtual machine simply cannot handle the load. In such an environment, the commands to close sessions may also be dropped or lost.

Adrian


> On 16 Mar 2017, at 11:22, Daniel Zanutti <daniel.zanutti at gmail.com> wrote:
> 
> Hi Dan
> 
> Looks like this problem only happens on virtual machines, not on physical machines, and only while they are under high load.
> 
> But I'm not sure about the kernel rule. Is there any way to check it?
> 
> Please take a look at this case. This relay will never halt because it has more than 3k sessions that will never finish internally (the calls were hung up hours ago):
> 
> 8	2.2.2.2	2.6.1	44h01'05"
> 112.03kbps	3045
> audio 3045	Halting
> 
> Some of these calls:
> 
> 728	From: 222222 at 4.4.4.4
> To: 33333333 at sip.aaa.com.br
> 		6.6.6.6:55632	2.2.2.2:46640	2.2.2.2:46866	7.7.7.7:4170	active	G729	audio	21h35'34"	0	0
> 729	From: 2222222 at 4.4.4.4:5064
> To: 33333333 at sip.aaa.com.br
> 		6.6.6.6:34908	2.2.2.2:58158	2.2.2.2:54372	7.7.7.7:16846	active	G729	audio	16h11'51"	0	0
> 730	From: 22222222 at 4.4.4.4
> To: 33333333 at sip.aaa.com.br
> 		6.6.6.6:46324	2.2.2.2:50156	2.2.2.2:48182	7.7.7.7:18516	active	G729	audio	19h45'38"	0	0
> 731	From: 222222 at 4.4.4.4:5061
> To: 33333333333 at sip.aaa.com.br
> 		6.6.6.6:54800	2.2.2.2:43998	2.2.2.2:46144	7.7.7.7:12360	active	G729	audio	19h09'41"	0	0
> 732	From: 2222222 at 4.4.4.4
> To: 333333333333 at sip.aaa.com.br
> 		6.6.6.6:18854	2.2.2.2:51924	2.2.2.2:40512	7.7.7.7:4200	active	G729	audio	19h37'59"	0	0
> 
> Is there any way to drop these sessions? Maybe using the internal timeout system of mediaproxy?
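
As an illustration of the age-based cleanup asked about here, the following is a minimal sketch in Python that only selects sessions older than a cutoff. It is not part of Mediaproxy and does not drop anything: the function names, the (id, duration) tuple layout, and the 4-hour default are assumptions made for this example, and the duration pattern is inferred from the entries in the listing (for instance 21h35'34"); shorter durations may be printed differently.

```python
import re
from datetime import timedelta

# Duration pattern inferred from the session listing, e.g. 21h35'34".
# Treating the hours part as optional is an assumption.
_DURATION = re.compile(r"""^(?:(\d+)h)?(\d+)'(\d+)"$""")


def parse_duration(text: str) -> timedelta:
    """Convert a duration string such as 21h35'34" into a timedelta."""
    match = _DURATION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def stale_sessions(sessions, max_age=timedelta(hours=4)):
    """Return the ids of sessions whose duration exceeds max_age.

    `sessions` is an iterable of (session_id, duration_text) pairs;
    this layout is hypothetical and chosen only for the example.
    """
    return [sid for sid, duration in sessions
            if parse_duration(duration) > max_age]
```

Fed with the ids and durations from the listing, every session shown would be selected, since all of them are more than 16 hours old. How the selected sessions would then be closed on the relay is outside the scope of this sketch.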
> 
> If you could take a look personally, we could negotiate an hourly rate.
> 
> Thanks again
> 
> 
> 
> On Thu, Mar 16, 2017 at 10:54 AM, Dan Pascu <dan at ag-projects.com> wrote:
> 
> One thing came to mind. A case when the relay could get overloaded is if a lot of clients start sessions and only one endpoint sends media. That is the only case where the relay would have to deal with the media traffic itself and having hundreds of such sessions at the same time could overload the relay.
> 
> The way the relay works is that for each call it starts listening on 4 ports (2 for RTP and 2 for RTCP). Each endpoint will send 2 streams (one RTP, one RTCP). Initially the relay just listens on these ports, and when it receives data it learns the endpoint's address. After it learns both endpoints' addresses, it adds a conntrack rule in the kernel so that the kernel relays the media streams directly between the endpoints, and the relay will not see a media packet from the endpoints again until the call ends. This allows for very efficient data forwarding because it's done entirely in the kernel, with no data being transferred from kernel to user space and back as in traditional solutions. We have seen media relays handling hundreds of calls at a time with 0% CPU load on the relay.
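
The learning step described above can be sketched roughly as follows. This is a simplified Python model for illustration only, not Mediaproxy's actual code: the class name, method names, and the `in_kernel` flag (which merely stands in for "conntrack rule installed") are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Address = Tuple[str, int]  # (ip, port)


@dataclass
class StreamSketch:
    """Hypothetical model of one relayed stream (RTP or RTCP)."""
    caller: Optional[Address] = None
    callee: Optional[Address] = None
    in_kernel: bool = False  # stands in for "conntrack rule installed"

    def packet_received(self, side: str, source: Address) -> None:
        """Record the source address of the first packet seen from a side."""
        if self.in_kernel:
            return  # forwarding already handed over to the kernel
        if side == "caller" and self.caller is None:
            self.caller = source
        elif side == "callee" and self.callee is None:
            self.callee = source
        # Forwarding can move to the kernel only once both addresses are known.
        if self.caller is not None and self.callee is not None:
            self.in_kernel = True

    @property
    def relayed_in_user_space(self) -> bool:
        """True while the relay process itself still has to handle packets."""
        return not self.in_kernel
```

In this model a stream on which only the caller has ever sent a packet keeps `relayed_in_user_space` true indefinitely, so every packet for it would have to be handled by the relay process rather than by the kernel.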
> 
> So the only thing I can think of that could cause something like what you describe (even though I'm still not sure what you meant by hanging up sessions) is that somehow this process didn't finish setting up completely. The relay then directly receives media streams from hundreds of devices because only one endpoint sends data (or the other endpoint's data gets filtered at some firewall), and because it cannot learn both endpoints' addresses, it cannot set up the kernel conntrack rule to move data forwarding to the kernel.
> 
> On 14 Mar 2017, at 13:38, Dan Pascu wrote:
> 
> >
> > On 13 Mar 2017, at 18:58, Daniel Zanutti wrote:
> >
> >> Hi guys
> >>
> >> I sent this email a few days ago. Could anyone from the Mediaproxy team take a look? I could debug it, I just need some directions on where to look.
> >
> > We have never encountered this problem, so I'm not sure what to suggest, even more so considering that the description is not very clear. What do you mean when you say the relay starts to hang some sessions? That it times out on them for not having traffic and initiates a BYE for those sessions? Because in the next paragraph you imply that they never time out.
> >
> >>
> >> Thanks
> >>
> >> On Tue, Mar 7, 2017 at 11:10 AM, Daniel Zanutti <daniel.zanutti at gmail.com> wrote:
> >> I'm using mediaproxy on several installations and have noticed that when the machine is under high load (> 700 sessions), the media-relay process starts to hang some sessions.
> >>
> >> These sessions don't have any RTP being sent or received anymore and they never hang up. After some hours of frozen sessions, the media-relay process doesn't connect to the dispatcher anymore, but keeps using high CPU on the machine. Maybe it's stuck in an internal loop, not sure.
> >>
> >> Is there any solution for this? Maybe a timer to clean up old sessions (2 or 4+ hours old)?
> >>
> >> Thanks
> >>
> >>
> >>
> >> _______________________________________________
> >> Users mailing list
> >> Users at lists.opensips.org
> >> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
> >
> >
> > --
> > Dan
> >
> 
> 
> --
> Dan
> 
