[OpenSIPS-Users] Clusterer and Full Sharing USRLOC sync state/persistence

Jock McKechnie jock.mckechnie at gmail.com
Wed Oct 17 16:15:52 EDT 2018


Good afternoon;

I've been working on getting a four-node cluster with fully shared
USRLOC data up and running, and I'm having trouble getting restart
persistence to work reliably. The cluster comes up without issues and
all nodes appear to find each other. I can stop various nodes and
the USRLOC database will back-fill when the node starts again
('opensipsctl ul show' populates as expected).

I have discovered, however, that after a certain period of run time
interspersed with node restarts - I haven't worked out what the magic
switch is yet - I get a cluster State of "not synced". From that point
on, nodes that restart will no longer sync the USRLOC data back and
sort of become orphans. So far the only way I've found to get things
working again is to restart _all_ nodes simultaneously, which brings
them back into an "Ok" cluster State (opensipsctl fifo
clusterer_list_cap).
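
For reference, here is a minimal sketch of how that state check could be
scripted. It assumes only that the output of the command quoted above
contains the literal text "not synced" when a capability is out of sync;
the exact output layout is not relied upon, and the function names are
made up for illustration.

```python
import subprocess


def cluster_out_of_sync(cap_output: str) -> bool:
    """Return True if any line of the output mentions 'not synced'.

    Assumption: the state appears as plain text somewhere in the output;
    no particular column layout is relied upon.
    """
    return any("not synced" in line.lower() for line in cap_output.splitlines())


def read_cluster_caps() -> str:
    """Run the command from the post and return its standard output.

    Requires opensipsctl on PATH; raises CalledProcessError if it fails.
    """
    result = subprocess.run(
        ["opensipsctl", "fifo", "clusterer_list_cap"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling cluster_out_of_sync(read_cluster_caps()) on each node from cron
would at least flag when the cluster drops out of the "Ok" state, though
it says nothing about which node is at fault.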

When the cluster is in a 'not synced' state I haven't managed to work
out how to restart nodes to bring the cluster back into a working
state if they're not all cycled together.

This is my first experience with the clusterer system and I'm not
completely familiar with its ins, outs and quirks as yet.

I'm wondering if there's a correct procedure to bring a cluster back
into sync after it's fallen out of sync - if there's a way to
determine who is the node at fault and bring things back in such a
manner that won't require a complete restart and losing the data set
in memory.

We are a MySQL and Cassandra house and, as discussed in a previous
thread, the only working cluster source for this data is MongoDB.
Unfortunately, from a corporate standpoint I'm unlikely to get anyone
willing to support a MongoDB set just to support offline restart
persistence of the OpenSIPS cluster, so I'm hoping that I can get the
in-memory setup reliable enough that I can use it.

My thanks for your time, and any hints you might be able to provide;

 - Jock


