[OpenSIPS-Users] Opensips, avpops and avp_db_query( ), and pgpool2 -- questions, suggestions, issues
Bogdan-Andrei Iancu
bogdan at voice-system.ro
Tue Apr 21 10:16:58 CEST 2009
Hi Bobby,
you mean the current process is stucked in something.. I suspect in
PQgetResult() trying to read the result. Can you attach with gdb to one
of this blocked processes and get the backtrace (to see where they are
blocked) ?
Regards,
Bogdan
Bobby Smith wrote:
> Removing pgpool from the equation and simply doing a direct postgres
> connection, it seems like the issue still occurs. If opensips is idle
> on the avpops db for some time (sounds like about 30 minutes), and you
> try to execute an arbitrary query to that connection from the routing
> script, it just keeps passing to the next child without returning data
> from the query until it runs out of children to execute the process,
> and so it just kind of "hangs".
>
> This is with maximum logging enabled. You can see where the first
> process attempts to parse the message. When we get to the db lookup,
> the task immediately goes to the next process without actually having
> the first process stop, and it hangs on the "SELECT" statement. I can
> at this point however directly connect to my postgres backend and do
> the query manually.
>
> This repeats for all the SIP receiving processes, until none are
> left. At this point, the proxy stops responding to any messages.
>
> Is there some sort of connection timeout like parameter involving
> avp_ops that I'm missing? It seems like I can troubleshoot so far
> that a) if the proxy is never idle, this doesn't seem to happen, and
> b) if the proxy is idle (processing 0 messages), it happens after
> about 30 minutes and requires a restart of opensips to function correctly.
>
>
>
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:parse_msg:
> SIP Request:
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_msg: method: <INVITE>
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_msg: uri:
> <sip:+13129576830 at 1.2.3.4:5060;transport=udp>
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_msg: version: <SIP/2.0>
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_headers: flags=2
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_via_param: found param type 232, <branch> =
> <z9hG4bK4dbf.e8b09c62.0>; state=16
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:parse_via:
> end of header reached, state=5
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_headers: via found, flags=2
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_headers: this is the first via
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:receive_msg: After parse_msg...
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:receive_msg: preparing to run routing scripts...
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: Starting processing
> for method INVITE
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_headers: flags=100
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_via_param: found param type 232, <branch> =
> <z9hG4bK4dbf.2d03d153.0>; state=16
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:parse_via:
> end of header reached, state=5
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_headers: via found, flags=100
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_headers: parse_headers: this is the second via
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_via_param: found param type 232, <branch> =
> <z9hG4bK506071629460-1207609494153>; state=16
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:parse_via:
> end of header reached, state=5
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_headers: via found, flags=100
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:parse_to:
> end of header reached, state=10
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: DBG:core:parse_to:
> display={}, ruri={sip:+13129576830 at 4.79.212.229:5060
> <http://sip:+13129576830@4.79.212.229:5060>}
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:get_hdr_field: <To> [38];
> uri=[sip:+13129576830 at 4.79.212.229:5060
> <http://sip:+13129576830@4.79.212.229:5060>]
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:get_hdr_field: to body [<sip:+13129576830 at 4.79.212.229:5060
> <http://sip:+13129576830@4.79.212.229:5060>>#015#012]
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:get_hdr_field: cseq <CSeq>: <1> <INVITE>
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:maxfwd:is_maxfwd_present: value = 67
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: Entering
> rewrite-ruri block: ruri-user is +13129576830
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: regex search turned
> up true!!
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: request user:
> 13129576830
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: this is the phone
> number: 13129576830
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: Original User is
> +13129576830
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: Is the avp
> phone_number still set? 13129576830
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:avpops:ops_dbquery_avps: query [SELECT account_id FROM
> customer.phone_numbers WHERE phone_number = '13129576830']
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:db_postgres:db_postgres_submit_query: 0x76e170 PQsendQuery(SELECT
> account_id FROM customer.phone_numbers WHERE phone_number = '13129576830')
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:db_new_result: allocate 48 bytes for result set at 0x776640
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:parse_msg:
> SIP Request:
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_msg: method: <INVITE>
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_msg: uri:
> <sip:+13129576830 at 1.2.3.4:5060;transport=udp>
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_msg: version: <SIP/2.0>
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_headers: flags=2
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_via_param: found param type 232, <branch> =
> <z9hG4bK4dbf.e8b09c62.0>; state=16
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:parse_via:
> end of header reached, state=5
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_headers: via found, flags=2
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_headers: this is the first via
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:receive_msg: After parse_msg...
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:receive_msg: preparing to run routing scripts...
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: Starting processing
> for method INVITE
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_headers: flags=100
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_via_param: found param type 232, <branch> =
> <z9hG4bK4dbf.2d03d153.0>; state=16
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:parse_via:
> end of header reached, state=5
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_headers: via found, flags=100
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_headers: parse_headers: this is the second via
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_via_param: found param type 232, <branch> =
> <z9hG4bK506071629460-1207609494153>; state=16
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:parse_via:
> end of header reached, state=5
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_headers: via found, flags=100
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:parse_to:
> end of header reached, state=10
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: DBG:core:parse_to:
> display={}, ruri={sip:+13129576830 at 4.79.212.229:5060
> <http://sip:+13129576830@4.79.212.229:5060>}
>
>
>
>
> On Fri, Apr 17, 2009 at 5:32 PM, Bobby Smith <bobby.smith at gmail.com
> <mailto:bobby.smith at gmail.com>> wrote:
>
> An example of what happens -- on processing a call right before a
> database lookup (the select query below in the previous message):
>
> 00:00:00 pgpool: opensips my_database localhost(46362) SELECT
> postgres 445 438 0 09:00 ? 00:00:00 pgpool: opensips
> my_database localhost(46383) SELECT
> postgres 446 438 0 09:00 ? 00:00:00 pgpool: wait for
> connection request
> postgres 447 438 0 09:00 ? 00:00:00 pgpool: wait for
> connection request
> postgres 448 438 0 09:00 ? 00:00:00 pgpool: wait for
> connection request
> postgres 449 438 0 09:00 ? 00:00:00 pgpool: wait for
> connection request
> postgres 450 438 0 09:00 ? 00:00:00 pgpool: wait for
> connection request
> postgres 451 438 0 09:00 ? 00:00:00 pgpool: opensips
> my_database localhost(46366) SELECT
> postgres 452 438 0 09:00 ? 00:00:00 pgpool: opensips
> my_database localhost(46371) SELECT
> postgres 453 438 0 09:00 ? 00:00:00 pgpool: wait for
> connection request
> postgres 454 438 0 09:00 ? 00:00:00 pgpool: opensips
> my_database localhost(46367) SELECT
> postgres 455 438 0 09:00 ? 00:00:00 pgpool: opensips
> my_database localhost(46372) SELECT
> postgres 456 438 0 09:00 ? 00:00:00 pgpool: opensips
> my_database localhost(46393) idle
> postgres 457 438 0 09:00 ? 00:00:00 pgpool: opensips
> my_database localhost(46391) SELECT
> postgres 458 438 0 09:00 ? 00:00:00 pgpool: opensips
> my_database localhost(46377) SELECT
> postgres 459 438 0 09:00 ? 00:00:00 pgpool: opensips
> my_database localhost(46378) SELECT
> postgres 460 438 0 09:00 ? 00:00:00 pgpool: opensips
> my_database localhost(46374) SELECT
> postgres 461 438 0 09:00 ? 00:00:00 pgpool: wait for
> connection request
> postgres 462 438 0 09:00 ? 00:00:00 pgpool: opensips
> my_database localhost(46387) SELECT
> postgres 463 438 0 09:00 ? 00:00:00 pgpool: wait for
> connection request
> postgres 464 438 0 09:00 ? 00:00:00 pgpool: opensips
> my_database localhost(46380) SELECT
> postgres 465 438 0 09:00 ? 00:00:00 pgpool: wait for
> connection request
> postgres 466 438 0 09:00 ? 00:00:00 pgpool: opensips
> my_database localhost(46385) SELECT
> postgres 468 438 0 09:00 ? 00:00:00 pgpool: opensips
> my_database localhost(46389) SELECT
> postgres 469 438 0 09:00 ? 00:00:00 pgpool: opensips
> my_database localhost(46361) SELECT
> postgres 471 438 0 09:00 ? 00:00:00 pgpool: wait for
> connection request
> postgres 473 438 0 09:00 ? 00:00:00 pgpool: wait for
> connection request
> postgres 474 438 0 09:00 ? 00:00:00 pgpool: wait for
> connection request
> postgres 475 438 0 09:00 ? 00:00:00 pgpool: wait for
> connection request
> postgres 476 438 0 09:00 ? 00:00:00 pgpool: wait for
> connection request
> postgres 477 438 0 09:00 ? 00:00:00 pgpool: wait for
> connection request
> postgres 478 438 0 09:00 ? 00:00:00 pgpool: opensips
> my_database localhost(46365) SELECT
>
> I have 32 "available" connections, 16 of which are occupied by
> opensips children processes. On this one query, it looks like
> "every" opensips process does the same processing to *all* the
> pgpool connections. This is just for a single INVITE. I don't
> know if the database lookup is failing or not, but it looks like
> the next child processes the message until none are left?
>
> It will kind of just float here in this state for some time now --
> I've got a heartbeat script that can come back and clean this up
> (transparent to the endusers as we're in a failover setup), but I
> would like to know how to go about troubleshooting this particular
> type of issue.
>
> I know it's not the database dying or pgpool losing it's
> connections, as I have logging enabled on both.
>
> Any suggestions on troubleshooting, or has anyone tried this sort
> of configuration before?
>
> Thanks much in advance.
>
>
> On Thu, Apr 16, 2009 at 4:39 PM, Bobby Smith
> <bobby.smith at gmail.com <mailto:bobby.smith at gmail.com>> wrote:
>
> All,
>
> We're currently (successfully) testing a configuration of
> Opensips as a stateless proxy. I wanted to be able to execute
> an arbitrary database query to a database that's NOT the
> opensips database, when I realized the power of the
> avp_db_query( ) function in avpops. The database is a backend
> postgres database in an basic failover configuration (16 max
> connections cached, primary db, failover to secondary db for
> those connections).
>
> Currently, opensips has 16 children processes. In the
> configuration, we have the following:
>
> modparam("avpops", "db_url",
> "postgres://opensips:opensips@localhost:9999/my_database")
> modparam("avpops", "avp_table", "")
>
> A few of the issues I'm concerned about or experiencing with
> the operation of it.
>
> a) Why is that second parameter necessary (avp_table)? It
> feels like if I have to set this that it's always going to
> keep open connections with the database, and indeed, it is.
> When I ps -ef, it shows I have 16 pgpool connections that are
> idle to my_database. The problem that this is causing is
> that, if for some reason one of the connections has an issue
> and dies or times out, this happens in the logs:
>
> Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]:
> ERROR:db_postgres:db_postgres_store_result: 0x76e008 - invalid
> query, execution aborted
> Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]:
> ERROR:db_postgres:db_postgres_store_result: 0x76e008:
> PGRES_FATAL_ERROR
> Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]:
> ERROR:db_postgres:db_postgres_store_result: 0x76e008: server
> closed the connection unexpectedly#012#011This probably means
> the server terminated abnormally#012#011before or while
> processing the request.#012
> Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]:
> ERROR:core:db_do_raw_query: error while storing result
> Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]:
> ERROR:avpops:db_query_avp: cannot do the query
> Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]: Database
> Error! No Lookup!
> Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]: Query
> Executed: Account Number is <null>
>
> The query I'm executing (from the config):
> avp_db_query("SELECT value FROM schema.table WHERE value =
> '$avp(s:string_a)'", "$avp(s:string_b)");
>
> When these error messages pop up, pgpool still has active
> database connections to postgres, but not held by opensips. I
> can use one of the pooled connections to connect to the
> backend database and execute a query, but for some reason
> opensips cannot.
>
> And, having a restart work (scripted), I occasionally see the
> following:
>
> Apr 16 03:59:05 serinbound2 /sbin/opensips[23105]:
> ERROR:db_postgres:db_postgres_new_connection: server closed
> the connection unexpectedly#012#011This probably means the
> server terminated abnormally#012#011before or while processing
> the request.#012
> Apr 16 03:59:05 serinbound2 /sbin/opensips[23105]:
> ERROR:db_postgres:db_postgres_new_connection: cleaning up
> 0x76e090=pkg_free()
> Apr 16 03:59:05 serinbound2 /sbin/opensips[23105]:
> ERROR:core:db_do_init: could not add connection to the pool
> Apr 16 03:59:05 serinbound2 /sbin/opensips[23105]:
> ERROR:avpops:avpops_db_init: cannot initialize database connection
> Apr 16 03:59:05 serinbound2 /sbin/opensips[23105]:
> ERROR:core:init_mod_child: failed to initializing module
> avpops, rank -1
> Apr 16 03:59:05 serinbound2 /sbin/opensips[23105]:
> ERROR:core:start_timer_processes: init_child failed for timer proc
> Apr 16 03:59:05 serinbound2 /sbin/opensips[23083]:
> INFO:core:handle_sigs: child process 23105 exited normally,
> status=255
> Apr 16 03:59:05 serinbound2 /sbin/opensips[23083]:
> INFO:core:handle_sigs: terminating due to SIGCHLD
> Apr 16 03:59:05 serinbound2 /sbin/opensips[23107]:
> INFO:core:sig_usr: signal 15 received
>
> At this point, the application crashes.
>
> Any suggestions or workarounds for this? More specifically,
> I'd like opensips to not have to grab the DB connection if it
> doesn't need it at that time performing a lookup (as i'm not
> really using avp's in the lookup, just to save the results),
> also, I'd like to see it not crash completely if it loses that
> connection. Instead, just send a error message back and allow
> me to insert a statement like:
>
> if(! what i expect the value of the saved avp
> from teh database query to be)
> {
> sl_send_reply("500", "Internal Server
> Error");
> exit;
> }
>
> Thanks for your help, I know this was fairly detailed but
> without much in terms of logging.
>
>
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Users mailing list
> Users at lists.opensips.org
> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
>
More information about the Users
mailing list