Corosync and Pacemaker clusters allow two machines to share a high-availability address. Each machine has its own normal IP address, used to administer that machine. A third "service" IP address is the one Kamailio clients connect to; it normally runs on the primary machine as the IP alias eth0:0. The backup machine monitors the health of the primary, and if the primary fails, the backup takes over the service IP address, again as eth0:0. Each machine monitors the other over the network.
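
As a quick way to see which node currently holds the service address (the interface name here is only an example), run the following on each node; the service IP shows up as an additional address on the active node only:

  • ip addr show eth0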

Install corosync and pacemaker

Do the steps below on both machines. First, install the packages:

Debian / Ubuntu:

  • apt-get install ntp
  • [If the above fails, try apt-get install chrony instead]
  • apt-get install pacemaker pcs
  • systemctl enable corosync.service
  • systemctl enable pacemaker.service

Devuan ASCII and later:

  • apt-get install ntp
  • [If the above fails, try apt-get install chrony instead]
  • apt-get install pacemaker pcs
  • update-rc.d corosync defaults
  • update-rc.d pacemaker defaults

Devuan Jessie:

Edit /etc/apt/sources.list and add the following lines:

  • deb http://auto.mirror.devuan.org/merged jessie-backports main
  • deb-src http://auto.mirror.devuan.org/merged jessie-backports main

Then do:

  • apt-get install ntp
  • [If the above fails, try apt-get install chrony instead]
  • apt-get install pacemaker pcs -t jessie-backports
  • update-rc.d corosync defaults
  • update-rc.d pacemaker defaults

CentOS 6:

  • yum -y install ntp
  • yum -y install pacemaker pcs
  • chkconfig corosync on
  • chkconfig pacemaker on

CentOS 7:

  • yum -y install ntp
  • yum -y install pacemaker pcs
  • systemctl enable corosync.service
  • systemctl enable pacemaker.service

CentOS 8 and later:

  • dnf config-manager --set-enabled HighAvailability
  • dnf install -y chrony
  • dnf install -y pacemaker pcs
  • systemctl enable corosync.service
  • systemctl enable pacemaker.service

Rocky Linux:

  • dnf config-manager --set-enabled HighAvailability
  • dnf install -y chrony
  • dnf install -y pacemaker pcs
  • systemctl enable corosync.service
  • systemctl enable pacemaker.service

Others:

  • yum -y install ntp
  • yum -y install pacemaker pcs
  • update-rc.d corosync defaults
  • update-rc.d pacemaker defaults

Configure corosync and pacemaker

Verify that the directory /var/log/corosync has been created on both machines. If not, then do:

  • mkdir /var/log/corosync

Do the steps below on the primary node only (haveged is installed temporarily so that corosync-keygen has enough entropy to generate the key, and is removed again afterwards):

Debian / Ubuntu / Devuan:

  • apt-get install haveged
  • corosync-keygen
  • apt-get remove --purge haveged
  • scp /etc/corosync/authkey username@<ip address of the secondary node>:/etc/corosync/.

Others:

  • yum install haveged # or dnf install haveged
  • corosync-keygen
  • yum remove haveged # or dnf remove haveged
  • scp /etc/corosync/authkey username@<ip address of the secondary node>:/etc/corosync/.

Do the steps below on the secondary node only:

  • chown root: /etc/corosync/authkey
  • chmod 400 /etc/corosync/authkey

On both nodes do:

  • cp /opt/enswitch/current/install/etc/corosync/corosync.conf /etc/corosync/corosync.conf
  • mkdir /etc/corosync/service.d
  • ln -s /opt/enswitch/current/etc/corosync/service.d/pcmk /etc/corosync/service.d/pcmk
  • ln -s /opt/enswitch/current/etc/corosync/corosync /etc/default/corosync
  • vi /etc/corosync/corosync.conf

Edit /etc/corosync/corosync.conf as follows:

  • Under totem, under interface, set bindnetaddr to the network address of the subnet the nodes are on (for example, 192.168.1.0 for a 192.168.1.0/24 network).
  • Under nodelist, set the IP addresses of the primary and secondary nodes.
  • Under nodelist, set the hostnames of the primary and secondary nodes.
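
As a rough sketch of how the edited sections of /etc/corosync/corosync.conf might look (the addresses, hostnames, and node IDs below are placeholders; leave the rest of the supplied file unchanged):

  totem {
          interface {
                  # network address of the subnet both nodes are on
                  bindnetaddr: 192.168.1.0
          }
  }

  nodelist {
          node {
                  ring0_addr: 192.168.1.10
                  name: primary.example.com
                  nodeid: 1
          }
          node {
                  ring0_addr: 192.168.1.11
                  name: secondary.example.com
                  nodeid: 2
          }
  }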

On both nodes start the corosync and pacemaker services (or restart them if they are already started):

  • service corosync start
  • service pacemaker start
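
Before configuring any resources, it is worth confirming that the two nodes can see each other. On either node, check the ring status with the command below; a healthy cluster should report the ring as active with no faults:

  • corosync-cfgtool -s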

On only one of the nodes check whether there are any resources already configured by default:

  • pcs status

If there are, remove them:

  • pcs resource disable <resource name>
  • pcs resource delete <resource name>

On only one of the nodes, run the following to disable fencing (STONITH) and quorum enforcement, neither of which is used in this two-node cluster:

  • pcs property set stonith-enabled=false
  • pcs property set no-quorum-policy=ignore

To add the floating IP resource that Kamailio will listen on, run the following command on either one of the nodes, replacing the IP address and netmask with the appropriate values:

  • pcs resource create ip ocf:heartbeat:IPaddr2 ip=192.168.1.1 cidr_netmask=24 op monitor interval=30s --group enswitch

Note that the previous command also creates the cluster resource group named "enswitch".
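
To confirm that the IP resource and the "enswitch" group exist, list the configured resources on either node:

  • pcs resource config # or "pcs resource show" on older pcs versions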

Add the Enswitch resources that will run on the nodes, omitting any that are not needed. They are added to the existing "enswitch" resource group by running the following commands on only one of the nodes:

Systems using systemd:

  • pcs resource create kamailio systemd:kamailio op monitor interval=30s --group enswitch
  • pcs resource create enswitch_messaged systemd:enswitch_messaged op monitor interval=30s --group enswitch
  • pcs resource create enswitch_sipd systemd:enswitch_sipd op monitor interval=30s --group enswitch

Systems using System V init scripts:

  • pcs resource create kamailio lsb:kamailio op monitor interval=30s --group enswitch
  • pcs resource create enswitch_messaged lsb:enswitch_messaged op monitor interval=30s --group enswitch
  • pcs resource create enswitch_sipd lsb:enswitch_sipd op monitor interval=30s --group enswitch

Disable the services that are now managed by Corosync/Pacemaker so that they are not started at system boot by systemd or init:

Systems using systemd:

  • systemctl disable kamailio
  • systemctl disable enswitch_messaged
  • systemctl disable enswitch_sipd

Systems using Devuan (and older Debian-like systems):

  • update-rc.d kamailio remove
  • update-rc.d enswitch_messaged remove
  • update-rc.d enswitch_sipd remove

Systems using CentOS, Fedora, or Redhat:

  • chkconfig kamailio off
  • chkconfig enswitch_messaged off
  • chkconfig enswitch_sipd off
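
On systemd systems the result can be checked per service; each should now report "disabled":

  • systemctl is-enabled kamailio
  • systemctl is-enabled enswitch_messaged
  • systemctl is-enabled enswitch_sipd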

On only one of the nodes check whether the resources have started and are active on the correct node:

  • pcs status

If any of the resources have not started, then run on any one of the nodes:

  • pcs resource enable <resource name>
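
Once all resources are active, failover can be exercised by putting the node currently running the resources into standby, checking with pcs status that the resources move to the other node, and then taking it out of standby again (the exact subcommand depends on the pcs version):

  • pcs node standby <hostname> # or "pcs cluster standby <hostname>" on older pcs versions
  • pcs status
  • pcs node unstandby <hostname> # or "pcs cluster unstandby <hostname>" on older pcs versions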