[OpenSIPS-Users] Mediaproxy hanging sessions on high load

Thu Mar 16 10:22:34 EDT 2017

Hi Dan

It looks like this problem only happens on virtual machines, not on
physical machines, and only while they are under high load.

But I'm not sure about the kernel rule. Is there any way to check it?
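
In case it helps frame the question, this is roughly how I would try to look
for it myself. It is only a sketch, and it assumes the relay's rule shows up as
an ordinary UDP entry in the kernel connection tracking table
(/proc/net/nf_conntrack, or the output of `conntrack -L`). The function name
and the sample line are made up for illustration; the addresses are the ones
from the first call listed further down.

```python
import re

# key=value tokens as they appear in conntrack table lines, e.g. "dport=46640"
_FIELD = re.compile(r"(\w+)=(\S+)")


def udp_entries_for_port(lines, port):
    """Return the UDP conntrack lines that mention `port` as sport or dport.

    `lines` is any iterable of text lines in the /proc/net/nf_conntrack or
    `conntrack -L` layout. Both the original and the reply tuple are checked.
    """
    wanted = str(port)
    hits = []
    for line in lines:
        tokens = line.split()
        # The protocol name is the 3rd token in /proc/net/nf_conntrack
        # and the 1st token in `conntrack -L` output.
        if "udp" not in tokens[:3]:
            continue
        ports = {value for key, value in _FIELD.findall(line)
                 if key in ("sport", "dport")}
        if wanted in ports:
            hits.append(line.strip())
    return hits


# Hypothetical sample entry (not taken from a real machine).
SAMPLE = ("ipv4     2 udp      17 170 "
          "src=6.6.6.6 dst=2.2.2.2 sport=55632 dport=46640 "
          "src=7.7.7.7 dst=2.2.2.2 sport=4170 dport=46866 "
          "[ASSURED] mark=0 use=2")

matches = udp_entries_for_port([SAMPLE], 46640)
```

On the relay itself I would feed it the real table, for example
`udp_entries_for_port(open("/proc/net/nf_conntrack"), 46640)` as root. If a
stuck session's relay port has no entry, I guess that would mean the rule was
never installed, but please correct me if I'm looking in the wrong place.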

Please take a look at this case. This relay will never halt because it has
more than 3k sessions that never finish internally (the calls were hung up
hours ago):

8 2.2.2.2 2.6.1 44h01'05" 112.03kbps 3045 audio 3045 Halting

Some of these calls:

728 *From:* 222222 at 4.4.4.4
*To:* 33333333 at sip.aaa.com.br
[image: unknown agent] [image: HG4000/1.0]
6.6.6.6:55632 2.2.2.2:46640 2.2.2.2:46866 7.7.7.7:4170 active G729 audio 21h35'34" 0 0

729 *From:* 2222222 at 4.4.4.4:5064
*To:* 33333333 at sip.aaa.com.br
[image: TS-v4.6.0-11eW] [image: Agitel GSM Bridge v2.0]
6.6.6.6:34908 2.2.2.2:58158 2.2.2.2:54372 7.7.7.7:16846 active G729 audio 16h11'51" 0 0

730 *From:* 22222222 at 4.4.4.4
*To:* 33333333 at sip.aaa.com.br
[image: Mediant 2000/v.6.60A.328.003] [image: unknown agent]
6.6.6.6:46324 2.2.2.2:50156 2.2.2.2:48182 7.7.7.7:18516 active G729 audio 19h45'38" 0 0

731 *From:* 222222 at 4.4.4.4:5061
*To:* 33333333333 at sip.aaa.com.br
[image: TS-v4.6.0-14b] [image: gsm-gw-3.4.1]
6.6.6.6:54800 2.2.2.2:43998 2.2.2.2:46144 7.7.7.7:12360 active G729 audio 19h09'41" 0 0

732 *From:* 2222222 at 4.4.4.4
*To:* 333333333333 at sip.aaa.com.br
[image: Trinit IVR] [image: HG4000/1.0]
6.6.6.6:18854 2.2.2.2:51924 2.2.2.2:40512 7.7.7.7:4200 active G729 audio 19h37'59" 0 0

Is there any way to drop these sessions? Maybe using the internal timeout
system of mediaproxy?
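
To make the idea concrete, here is a rough sketch of the age check I have in
mind. It only picks out candidate sessions from duration strings in the format
shown above (for example 21h35'34"). The function names are mine, the optional
hours part is an assumption, and the actual teardown inside media-relay is not
shown because I don't know which internal call would be the right one.

```python
import re

# Duration as printed in the session list: optional hours, then minutes'seconds"
_DURATION = re.compile(r'^(?:(\d+)h)?(\d+)\'(\d+)"$')


def duration_seconds(text):
    """Convert a string such as 21h35'34" into a number of seconds."""
    match = _DURATION.match(text.strip())
    if match is None:
        raise ValueError("unrecognized duration: %r" % (text,))
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds)


def stale_sessions(sessions, max_age_hours):
    """Return the ids of sessions older than `max_age_hours`.

    `sessions` is an iterable of (session_id, duration_text) pairs.
    """
    limit = max_age_hours * 3600
    return [session_id for session_id, duration in sessions
            if duration_seconds(duration) > limit]


# Two of the sessions listed above, checked against a 20 hour limit.
listed = [("728", '21h35\'34"'), ("729", '16h11\'51"')]
too_old = stale_sessions(listed, 20)
```

With a limit of 2 or 4 hours every session in the list above would be flagged,
which is what I would expect since those calls ended long ago.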

If you could take a look personally, we could negotiate an hourly rate.

Thanks again

On Thu, Mar 16, 2017 at 10:54 AM, Dan Pascu <dan at ag-projects.com> wrote:

>
> One thing came to mind. A case when the relay could get overloaded is if a
> lot of clients start sessions and only one endpoint sends media. That is
> the only case where the relay would have to deal with the media traffic
> itself and having hundreds of such sessions at the same time could overload
> the relay.
>
> The way the relay works is that for each call it starts listening on 4 ports
> (2 for RTP and 2 for RTCP). Each endpoint sends 2 streams (one RTP, one RTCP).
> Initially the relay just listens on these ports, and when it
> receives data it learns the endpoint's address. After it learns both
> endpoints' addresses, it adds a conntrack rule in the kernel so that the
> kernel relays the media streams directly between the endpoints, and the relay
> will never see a media packet from the endpoints again until the call ends.
> This allows for very efficient data forwarding because it's done entirely
> in the kernel with no data being transferred from kernel to user-space and
> back like traditional solutions. We have seen media relays handling
> hundreds of calls at a time with 0% CPU load on the relay.
>
> So the only thing I can think of causing something like what you describe
> (even though I'm still not sure what you meant by hanging up sessions) is
> that somehow this process didn't finish setting up completely and the relay
> directly receives media streams from hundreds of devices because only one
> endpoint sends data (or the other endpoint's data gets filtered at some
> firewall), and because it cannot learn both endpoints' addresses it cannot
> set up the kernel conntrack rule to move data forwarding to the kernel.
>
> On 14 Mar 2017, at 13:38, Dan Pascu wrote:
>
> >
> > On 13 Mar 2017, at 18:58, Daniel Zanutti wrote:
> >
> >> Hi guys
> >>
> >> I sent this email a few days ago. Could anyone from the Mediaproxy team
> >> take a look? I could debug it, I just need some directions on where to look.
> >
> > We have never encountered this problem, so I'm not sure what to suggest,
> > even more so considering that the description is not very clear. What do you
> > mean when you say the relay starts to hang some sessions? That it times out
> > on them for not having traffic and initiates a BYE for those sessions? Because
> > in the next paragraph you imply that they never time out.
> >
> >>
> >> Thanks
> >>
> >> On Tue, Mar 7, 2017 at 11:10 AM, Daniel Zanutti <daniel.zanutti at gmail.com> wrote:
> >> I'm using mediaproxy on several installations and have noticed that
> >> when the machine is under high load (> 700 sessions), the media-relay
> >> process starts to hang some sessions.
> >>
> >> These sessions don't have any RTP being sent or received anymore and
> >> they never hang up. After some hours of frozen sessions, the media-relay
> >> process doesn't connect to the dispatcher anymore, but keeps using high CPU
> >> on the machine. Maybe it's stuck in an internal loop, not sure.
> >>
> >> Is there any solution for this? Maybe a timer to clean up old sessions
> >> (2 or 4+ hours old).
> >>
> >> Thanks
> >>
> >>
> >>
> >> _______________________________________________
> >> Users mailing list
> >> Users at lists.opensips.org
> >> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
> >
> >
> > --
> > Dan
>
>
> --
> Dan