[OpenSIPS-Users] Opensips, avpops and avp_db_query( ), and pgpool2 -- questions, suggestions, issues

Bobby Smith bobby.smith at gmail.com
Mon Apr 20 16:31:24 CEST 2009


Removing pgpool from the equation and simply doing a direct postgres
connection, it seems like the issue still occurs.  If opensips is idle on
the avpops db for some time (sounds like about 30 minutes), and you try to
execute an arbitrary query to that connection from the routing script, it
just keeps passing to the next child without returning data from the query
until it runs out of children to execute the process, and so it just kind of
"hangs".

This is with maximum logging enabled.  You can see where the first process
attempts to parse the message.  When we get to the db lookup, the task
immediately goes to the next process without actually having the first
process stop, and it hangs on the "SELECT" statement.  I can at this point
however directly connect to my postgres backend and do the query manually.

This repeats for all the SIP receiving processes, until none are left.  At
this point, the proxy stops responding to any messages.

Is there some sort of connection timeout like parameter involving avp_ops
that I'm missing?  It seems like I can troubleshoot so far that a) if the
proxy is never idle, this doesn't seem to happen, and b) if the proxy is
idle (processing 0 messages), it happens after about 30 minutes and requires
a restart of opensips to function correctly.



Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:parse_msg: SIP
Request:
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:parse_msg:
method:  <INVITE>
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:parse_msg:
uri:     <sip:+13129576830 at 1.2.3.4:5060;transport=udp>
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:parse_msg:
version: <SIP/2.0>
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:parse_headers:
flags=2
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:parse_via_param:
found param type 232, <branch> = <z9hG4bK4dbf.e8b09c62.0>; state=16
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:parse_via: end
of header reached, state=5
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:parse_headers:
via found, flags=2
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:parse_headers:
this is the first via
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:receive_msg:
After parse_msg...
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:receive_msg:
preparing to run routing scripts...
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: Starting processing for
method INVITE
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:parse_headers:
flags=100
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:parse_via_param:
found param type 232, <branch> = <z9hG4bK4dbf.2d03d153.0>; state=16
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:parse_via: end
of header reached, state=5
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:parse_headers:
via found, flags=100
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:parse_headers:
parse_headers: this is the second via
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:parse_via_param:
found param type 232, <branch> = <z9hG4bK506071629460-1207609494153>;
state=16
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:parse_via: end
of header reached, state=5
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:parse_headers:
via found, flags=100
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:parse_to: end of
header reached, state=10
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:parse_to:
display={}, ruri={sip:+13129576830 at 4.79.212.229:5060}
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:get_hdr_field:
<To> [38]; uri=[sip:+13129576830 at 4.79.212.229:5060]
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:get_hdr_field:
to body [<sip:+13129576830 at 4.79.212.229:5060>#015#012]
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:get_hdr_field:
cseq <CSeq>: <1> <INVITE>
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
DBG:maxfwd:is_maxfwd_present: value = 67
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: Entering rewrite-ruri
block: ruri-user is +13129576830
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: regex search turned up
true!!
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: request user: 13129576830
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: this is the phone number:
13129576830
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: Original User is
+13129576830
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: Is the avp phone_number
still set? 13129576830
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
DBG:avpops:ops_dbquery_avps: query [SELECT account_id FROM
customer.phone_numbers WHERE phone_number = '13129576830']
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
DBG:db_postgres:db_postgres_submit_query: 0x76e170 PQsendQuery(SELECT
account_id FROM customer.phone_numbers WHERE phone_number = '13129576830')
Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:db_new_result:
allocate 48 bytes for result set at 0x776640
Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:parse_msg: SIP
Request:
Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:parse_msg:
method:  <INVITE>
Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:parse_msg:
uri:     <sip:+13129576830 at 1.2.3.4:5060;transport=udp>
Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:parse_msg:
version: <SIP/2.0>
Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:parse_headers:
flags=2
Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:parse_via_param:
found param type 232, <branch> = <z9hG4bK4dbf.e8b09c62.0>; state=16
Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:parse_via: end
of header reached, state=5
Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:parse_headers:
via found, flags=2
Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:parse_headers:
this is the first via
Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:receive_msg:
After parse_msg...
Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:receive_msg:
preparing to run routing scripts...
Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: Starting processing for
method INVITE
Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:parse_headers:
flags=100
Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:parse_via_param:
found param type 232, <branch> = <z9hG4bK4dbf.2d03d153.0>; state=16
Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:parse_via: end
of header reached, state=5
Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:parse_headers:
via found, flags=100
Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:parse_headers:
parse_headers: this is the second via
Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:parse_via_param:
found param type 232, <branch> = <z9hG4bK506071629460-1207609494153>;
state=16
Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:parse_via: end
of header reached, state=5
Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:parse_headers:
via found, flags=100
Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:parse_to: end of
header reached, state=10
Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:parse_to:
display={}, ruri={sip:+13129576830 at 4.79.212.229:5060}




On Fri, Apr 17, 2009 at 5:32 PM, Bobby Smith <bobby.smith at gmail.com> wrote:

> An example of what happens -- on processing a call right before a database
> lookup (the select query below in the previous message):
>
> 00:00:00 pgpool: opensips my_database localhost(46362) SELECT
> postgres   445   438  0 09:00 ?        00:00:00 pgpool: opensips
> my_database localhost(46383) SELECT
> postgres   446   438  0 09:00 ?        00:00:00 pgpool: wait for connection
> request
> postgres   447   438  0 09:00 ?        00:00:00 pgpool: wait for connection
> request
> postgres   448   438  0 09:00 ?        00:00:00 pgpool: wait for connection
> request
> postgres   449   438  0 09:00 ?        00:00:00 pgpool: wait for connection
> request
> postgres   450   438  0 09:00 ?        00:00:00 pgpool: wait for connection
> request
> postgres   451   438  0 09:00 ?        00:00:00 pgpool: opensips
> my_database localhost(46366) SELECT
> postgres   452   438  0 09:00 ?        00:00:00 pgpool: opensips
> my_database localhost(46371) SELECT
> postgres   453   438  0 09:00 ?        00:00:00 pgpool: wait for connection
> request
> postgres   454   438  0 09:00 ?        00:00:00 pgpool: opensips
> my_database localhost(46367) SELECT
> postgres   455   438  0 09:00 ?        00:00:00 pgpool: opensips
> my_database localhost(46372) SELECT
> postgres   456   438  0 09:00 ?        00:00:00 pgpool: opensips
> my_database localhost(46393) idle
> postgres   457   438  0 09:00 ?        00:00:00 pgpool: opensips
> my_database localhost(46391) SELECT
> postgres   458   438  0 09:00 ?        00:00:00 pgpool: opensips
> my_database localhost(46377) SELECT
> postgres   459   438  0 09:00 ?        00:00:00 pgpool: opensips
> my_database localhost(46378) SELECT
> postgres   460   438  0 09:00 ?        00:00:00 pgpool: opensips
> my_database localhost(46374) SELECT
> postgres   461   438  0 09:00 ?        00:00:00 pgpool: wait for connection
> request
> postgres   462   438  0 09:00 ?        00:00:00 pgpool: opensips
> my_database localhost(46387) SELECT
> postgres   463   438  0 09:00 ?        00:00:00 pgpool: wait for connection
> request
> postgres   464   438  0 09:00 ?        00:00:00 pgpool: opensips
> my_database localhost(46380) SELECT
> postgres   465   438  0 09:00 ?        00:00:00 pgpool: wait for connection
> request
> postgres   466   438  0 09:00 ?        00:00:00 pgpool: opensips
> my_database localhost(46385) SELECT
> postgres   468   438  0 09:00 ?        00:00:00 pgpool: opensips
> my_database localhost(46389) SELECT
> postgres   469   438  0 09:00 ?        00:00:00 pgpool: opensips
> my_database localhost(46361) SELECT
> postgres   471   438  0 09:00 ?        00:00:00 pgpool: wait for connection
> request
> postgres   473   438  0 09:00 ?        00:00:00 pgpool: wait for connection
> request
> postgres   474   438  0 09:00 ?        00:00:00 pgpool: wait for connection
> request
> postgres   475   438  0 09:00 ?        00:00:00 pgpool: wait for connection
> request
> postgres   476   438  0 09:00 ?        00:00:00 pgpool: wait for connection
> request
> postgres   477   438  0 09:00 ?        00:00:00 pgpool: wait for connection
> request
> postgres   478   438  0 09:00 ?        00:00:00 pgpool: opensips
> my_database localhost(46365) SELECT
>
> I have 32 "available" connections, 16 of which are occupied by opensips
> children processes.  On this one query, it looks like "every" opensips
> process does the same processing to *all* the pgpool connections.  This is
> just for a single INVITE.  I don't know if the database lookup is failing or
> not, but it looks like the next child processes the message until none are
> left?
>
> It will kind of just float here in this state for some time now -- I've got
> a heartbeat script that can come back and clean this up (transparent to the
> endusers as we're in a failover setup), but I would like to know how to go
> about troubleshooting this particular type of issue.
>
> I know it's not the database dying or pgpool losing it's connections, as I
> have logging enabled on both.
>
> Any suggestions on troubleshooting, or has anyone tried this sort of
> configuration before?
>
> Thanks much in advance.
>
>
> On Thu, Apr 16, 2009 at 4:39 PM, Bobby Smith <bobby.smith at gmail.com>wrote:
>
>> All,
>>
>> We're currently (successfully) testing a configuration of Opensips as a
>> stateless proxy.  I wanted to be able to execute an arbitrary database query
>> to a database that's NOT the opensips database, when I realized the power of
>> the avp_db_query( ) function in avpops.  The database is a backend postgres
>> database in an basic failover configuration (16 max connections cached,
>> primary db, failover to secondary db for those connections).
>>
>> Currently, opensips has 16 children processes.  In the configuration, we
>> have the following:
>>
>> modparam("avpops", "db_url", "postgres://opensips:opensips@localhost
>> :9999/my_database")
>> modparam("avpops", "avp_table", "")
>>
>> A few of the issues I'm concerned about or experiencing with the operation
>> of it.
>>
>> a) Why is that second parameter necessary (avp_table)?  It feels like if I
>> have to set this that it's always going to keep open connections with the
>> database, and indeed, it is.  When I ps -ef, it shows I have 16 pgpool
>> connections that are idle to my_database.  The problem that this is causing
>> is that, if for some reason one of the connections has an issue and dies or
>> times out, this happens in the logs:
>>
>> Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]:
>> ERROR:db_postgres:db_postgres_store_result: 0x76e008 - invalid query,
>> execution aborted
>> Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]:
>> ERROR:db_postgres:db_postgres_store_result: 0x76e008: PGRES_FATAL_ERROR
>> Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]:
>> ERROR:db_postgres:db_postgres_store_result: 0x76e008: server closed the
>> connection unexpectedly#012#011This probably means the server terminated
>> abnormally#012#011before or while processing the request.#012
>> Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]:
>> ERROR:core:db_do_raw_query: error while storing result
>> Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]:
>> ERROR:avpops:db_query_avp: cannot do the query
>> Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]: Database Error!  No
>> Lookup!
>> Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]: Query Executed:
>> Account Number is <null>
>>
>> The query I'm executing (from the config):
>> avp_db_query("SELECT value FROM schema.table WHERE value =
>> '$avp(s:string_a)'", "$avp(s:string_b)");
>>
>> When these error messages pop up, pgpool still has active database
>> connections to postgres, but not held by opensips.  I can use one of the
>> pooled connections to connect to the backend database and execute a query,
>> but for some reason opensips cannot.
>>
>> And, having a restart work (scripted), I occasionally see the following:
>>
>> Apr 16 03:59:05 serinbound2 /sbin/opensips[23105]:
>> ERROR:db_postgres:db_postgres_new_connection: server closed the connection
>> unexpectedly#012#011This probably means the server terminated
>> abnormally#012#011before or while processing the request.#012
>> Apr 16 03:59:05 serinbound2 /sbin/opensips[23105]:
>> ERROR:db_postgres:db_postgres_new_connection: cleaning up
>> 0x76e090=pkg_free()
>> Apr 16 03:59:05 serinbound2 /sbin/opensips[23105]: ERROR:core:db_do_init:
>> could not add connection to the pool
>> Apr 16 03:59:05 serinbound2 /sbin/opensips[23105]:
>> ERROR:avpops:avpops_db_init: cannot initialize database connection
>> Apr 16 03:59:05 serinbound2 /sbin/opensips[23105]:
>> ERROR:core:init_mod_child: failed to initializing module avpops, rank -1
>> Apr 16 03:59:05 serinbound2 /sbin/opensips[23105]:
>> ERROR:core:start_timer_processes: init_child failed for timer proc
>> Apr 16 03:59:05 serinbound2 /sbin/opensips[23083]: INFO:core:handle_sigs:
>> child process 23105 exited normally, status=255
>> Apr 16 03:59:05 serinbound2 /sbin/opensips[23083]: INFO:core:handle_sigs:
>> terminating due to SIGCHLD
>> Apr 16 03:59:05 serinbound2 /sbin/opensips[23107]: INFO:core:sig_usr:
>> signal 15 received
>>
>> At this point, the application crashes.
>>
>> Any suggestions or workarounds for this?  More specifically, I'd like
>> opensips to not have to grab the DB connection if it doesn't need it at that
>> time performing a lookup (as i'm not really using avp's in the lookup, just
>> to save the results), also, I'd like to see it not crash completely if it
>> loses that connection.  Instead, just send a error message back and allow me
>> to insert a statement like:
>>
>>                if(! what i expect the value of the saved avp from teh
>> database query to be)
>>                 {
>>                         sl_send_reply("500", "Internal Server Error");
>>                         exit;
>>                 }
>>
>> Thanks for your help, I know this was fairly detailed but without much in
>> terms of logging.
>>
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.opensips.org/pipermail/users/attachments/20090420/8c7b4b57/attachment-0001.htm 


More information about the Users mailing list