Configuring corosync and pacemaker for Kamailio

A corosync and pacemaker cluster allows two machines to share a high-availability IP address. Each machine has its own normal IP address, which is used to administer that machine. A third "service" IP address is the one that Kamailio clients connect to. It normally runs on the primary machine as an IP alias such as eth0:0. The backup machine monitors the health of the primary machine and, if the primary crashes, takes over the service IP address, again as eth0:0. The machines monitor each other over the network.

Install corosync and pacemaker

Do the steps below on both machines. First, install the packages:

On machines running Debian or Ubuntu:

apt-get install ntp # If this fails, use apt-get install chrony instead
apt-get install pacemaker pcs
systemctl enable corosync.service
systemctl enable pacemaker.service

On machines running CentOS or Rocky Linux:

dnf config-manager --set-enabled HighAvailability
dnf install -y chrony
dnf install -y pacemaker pcs
systemctl enable corosync.service
systemctl enable pacemaker.service

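On machines running Red Hat Enterprise Linux or Fedora:

yum -y install ntp # If this fails, use yum -y install chrony instead
yum -y install pacemaker pcs
systemctl enable corosync.service # On older releases without systemd: chkconfig corosync on
systemctl enable pacemaker.service # On older releases without systemd: chkconfig pacemaker on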

Configure corosync and pacemaker

Verify that the directory /var/log/corosync exists on both machines. If it does not, create it:

mkdir /var/log/corosync

On the primary node, if running Debian or Ubuntu:

apt-get install haveged
corosync-keygen
apt-get remove --purge haveged
scp /etc/corosync/authkey username@<ip address of the secondary node>:/etc/corosync/.

On the primary node, if running CentOS, Rocky Linux, Red Hat Enterprise Linux, or Fedora:

yum install haveged # or dnf install haveged
corosync-keygen
yum remove haveged # or dnf remove haveged
scp /etc/corosync/authkey username@<ip address of the secondary node>:/etc/corosync/.

On the secondary node:

chown root: /etc/corosync/authkey
chmod 400 /etc/corosync/authkey
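
The permissions set above can be double-checked with a small helper. This is only a sketch: the function name key_mode_ok is made up for illustration, it assumes GNU stat (for the -c option), and it checks the mode only, not the owner.

```shell
# key_mode_ok FILE: succeed only if FILE has permission mode 400.
# Hypothetical helper, not part of corosync; assumes GNU stat.
key_mode_ok() {
    mode=$(stat -c '%a' "$1" 2>/dev/null) || return 1
    [ "$mode" = "400" ]
}

# Report on the key only if it is present on this machine.
if [ -e /etc/corosync/authkey ]; then
    if key_mode_ok /etc/corosync/authkey; then
        echo "authkey mode is 400"
    else
        echo "authkey mode is not 400"
    fi
fi
```

The same check can be run on the primary node, where corosync-keygen created the key.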

On both nodes:

cp /opt/enswitch/current/install/etc/corosync/corosync.conf /etc/corosync/corosync.conf
mkdir /etc/corosync/service.d
ln -s /opt/enswitch/current/etc/corosync/service.d/pcmk /etc/corosync/service.d/pcmk
ln -s /opt/enswitch/current/etc/corosync/corosync /etc/default/corosync
vi /etc/corosync/corosync.conf

Edit /etc/corosync/corosync.conf as follows:

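Set bindnetaddr (under interface, under totem) to the network address of the network the nodes use to talk to each other. For example, for a node at 192.168.1.10 with a 24-bit netmask, this is 192.168.1.0.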
Set the IP addresses of the primary and secondary node under nodelist.
Set the hostnames of the primary and secondary node under nodelist.
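
If the network address for bindnetaddr is not obvious, for instance with a netmask that does not fall on an octet boundary, it can be computed from any node address and the prefix length. The following is a minimal sketch: the function name network_address is made up for illustration, the addresses are examples only, and it assumes a shell with 64-bit arithmetic.

```shell
# network_address IP PREFIX: print the IPv4 network address of IP/PREFIX.
# Hypothetical helper for working out the bindnetaddr value.
network_address() {
    ip_addr=$1
    prefix=$2
    # Split the dotted quad into four octets ($1..$4).
    old_ifs=$IFS
    IFS=.
    set -- $ip_addr
    IFS=$old_ifs
    # Pack the octets into a single 32-bit number.
    addr=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    # Build the netmask from the prefix length and apply it.
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    net=$(( addr & mask ))
    printf '%d.%d.%d.%d\n' \
        $(( (net >> 24) & 255 )) $(( (net >> 16) & 255 )) \
        $(( (net >> 8) & 255 )) $(( net & 255 ))
}

# Example: a node at 192.168.1.57 with a 24-bit netmask.
network_address 192.168.1.57 24   # prints 192.168.1.0
```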

On both nodes start the corosync and pacemaker services (or restart them if they are already started):

service corosync start
service pacemaker start

On only one of the nodes, check whether any resources are already configured by default:

pcs status

If there are, they must be removed:

pcs resource disable <resource name>
pcs resource delete <resource name>

On only one of the nodes:

pcs property set stonith-enabled=false
pcs property set no-quorum-policy=ignore

To add the floating IP resource that Kamailio will listen on, run the following command on either one of the nodes, after replacing the IP address and netmask with the appropriate values:

pcs resource create ip ocf:heartbeat:IPaddr2 ip=192.168.1.1 cidr_netmask=24 op monitor interval=30s --group enswitch

Note that the previous command also creates the cluster resource group named "enswitch".

Add the Enswitch resources that will run on the nodes, omitting those that are not needed. Running the following commands on only one of the nodes adds them to the existing "enswitch" resource group:

pcs resource create kamailio systemd:kamailio op monitor interval=30s --group enswitch
pcs resource create enswitch_messaged systemd:enswitch_messaged op monitor interval=30s --group enswitch
pcs resource create enswitch_sipd systemd:enswitch_sipd op monitor interval=30s --group enswitch

On both nodes, disable the services that are managed by Corosync/Pacemaker so that they aren't started on system boot by systemd or init:

Systems using systemd:

systemctl disable kamailio
systemctl disable enswitch_messaged
systemctl disable enswitch_sipd

Systems using Devuan (and older Debian-like systems):

update-rc.d kamailio remove
update-rc.d enswitch_messaged remove
update-rc.d enswitch_sipd remove

Older CentOS, Fedora, or Red Hat systems without systemd:

chkconfig kamailio off
chkconfig enswitch_messaged off
chkconfig enswitch_sipd off
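
The three variants above differ only in the command used, so they can be summarised in one helper. This is a sketch, not part of Enswitch: the function name disable_command and the labels systemd, sysv-debian and sysv-redhat are made up for illustration. It only prints the commands, so the output can be reviewed before anything is run.

```shell
# disable_command INIT SERVICE: print the command that stops SERVICE
# from being started at boot on a machine using the given init type.
# Hypothetical helper; the INIT labels are illustrative only.
disable_command() {
    case $1 in
        systemd)     echo "systemctl disable $2" ;;
        sysv-debian) echo "update-rc.d $2 remove" ;;
        sysv-redhat) echo "chkconfig $2 off" ;;
        *)           echo "unknown init type: $1" >&2; return 1 ;;
    esac
}

# Print (without running) the commands for a systemd machine.
for service in kamailio enswitch_messaged enswitch_sipd; do
    disable_command systemd "$service"
done
```

Leave out of the loop any service that was not added as a cluster resource.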

On only one of the nodes, check whether the resources have started and are active on the correct node:

pcs status

If any of the resources have not started, then run on any one of the nodes:

pcs resource enable <resource name>
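
As a complement to pcs status, each node can be asked directly whether it currently holds the service IP address. The sketch below assumes the iproute2 ip command is installed and uses the example address 192.168.1.1 from above. The function name holds_address is made up for illustration. It reads the output of "ip -o -4 addr show" on standard input.

```shell
# holds_address ADDRESS: read "ip -o -4 addr show" output on stdin and
# succeed only if ADDRESS is configured on some interface.
# Hypothetical helper; the match is exact, so 192.168.1.1 does not
# match 192.168.1.10.
holds_address() {
    awk -v addr="$1" '
        $3 == "inet" { split($4, parts, "/"); if (parts[1] == addr) found = 1 }
        END { exit !found }
    '
}

# Example: check for the service address on this node.
if command -v ip >/dev/null 2>&1; then
    if ip -o -4 addr show | holds_address 192.168.1.1; then
        echo "this node holds the service address"
    else
        echo "this node does not hold the service address"
    fi
fi
```

Under normal operation only one node should report that it holds the address.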