Corosync and Pacemaker clusters allow two machines to share a high-availability service address. Each machine has its own normal IP address, used to administer that machine. A third "service" IP address is the one Kamailio clients connect to. This normally runs on the primary machine as the IP alias eth0:0. The backup machine monitors the health of the primary, and if the primary fails, the backup takes over the service IP address, again as eth0:0. Each machine monitors the other through the network.
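A concrete picture of the addressing, with illustrative example values (these are not values prescribed by this guide):

```
node1 (primary)     eth0     192.168.1.10    administration address
node2 (secondary)   eth0     192.168.1.11    administration address
service address     eth0:0   192.168.1.1     Kamailio clients connect here; floats between nodes
```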

Install corosync and pacemaker

Do the steps below on both machines. First, install the packages:

Debian / Ubuntu:

  • apt-get install ntp
  • apt-get install pacemaker crmsh
  • systemctl enable corosync.service
  • systemctl enable pacemaker.service

Devuan ASCII and later:

  • apt-get install ntp
  • apt-get install pacemaker crmsh
  • update-rc.d corosync defaults
  • update-rc.d pacemaker defaults

Devuan Jessie:

Edit /etc/apt/sources.list and add the following lines:

  • deb http://auto.mirror.devuan.org/merged jessie-backports main
  • deb-src http://auto.mirror.devuan.org/merged jessie-backports main

Then do:

  • apt-get install ntp
  • apt-get install pacemaker crmsh -t jessie-backports
  • update-rc.d corosync defaults
  • update-rc.d pacemaker defaults

CentOS 6:

  • cd /etc/yum.repos.d
  • wget https://download.opensuse.org/repositories/network:ha-clustering:Stable/CentOS_CentOS-6/network:ha-clustering:Stable.repo
  • yum -y install ntp
  • yum -y install pacemaker crmsh
  • chkconfig corosync on
  • chkconfig pacemaker on

CentOS 7:

  • cd /etc/yum.repos.d
  • wget https://download.opensuse.org/repositories/network:ha-clustering:Stable/CentOS_CentOS-7/network:ha-clustering:Stable.repo
  • yum -y install ntp
  • yum -y install pacemaker crmsh
  • systemctl enable corosync.service
  • systemctl enable pacemaker.service

Others:

  • yum -y install ntp
  • yum -y install pacemaker crmsh
  • update-rc.d corosync defaults
  • update-rc.d pacemaker defaults

Configure corosync and pacemaker

Verify that the directory /var/log/corosync has been created on both machines. If not, then do:

  • mkdir /var/log/corosync

Do the steps below on the primary node only:

Debian / Ubuntu / Devuan:

  • apt-get install haveged
  • corosync-keygen
  • apt-get remove --purge haveged
  • scp /etc/corosync/authkey username@<ip address of the secondary node>:/etc/corosync/.

Others:

  • yum install haveged
  • corosync-keygen
  • yum remove haveged
  • scp /etc/corosync/authkey username@<ip address of the secondary node>:/etc/corosync/.

Do the steps below on the secondary node only:

  • chown root: /etc/corosync/authkey
  • chmod 400 /etc/corosync/authkey

On both nodes do:

  • cp /opt/enswitch/current/install/etc/corosync/corosync.conf /etc/corosync/corosync.conf
  • mkdir /etc/corosync/service.d
  • ln -s /opt/enswitch/current/etc/corosync/service.d/pcmk /etc/corosync/service.d/pcmk
  • ln -s /opt/enswitch/current/etc/corosync/corosync /etc/default/corosync
  • vi /etc/corosync/corosync.conf

Edit /etc/corosync/corosync.conf as follows:

  • Set bindnetaddr (under interface, under totem) to the network address of your subnet, i.e. its first address (for example 192.168.1.0 for a /24 network), not the address of either host.
  • Set the IP addresses of the primary and secondary node under nodelist.
  • Set the hostnames of the primary and secondary node under nodelist.
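A minimal corosync.conf along these lines illustrates the three settings above (the addresses, hostnames, and cluster name are placeholders; the file copied from /opt/enswitch/current/install may differ in detail):

```
totem {
    version: 2
    cluster_name: enswitch
    transport: udpu
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.1.0    # network address of the subnet, not a host address
        mcastport: 5405
    }
}

nodelist {
    node {
        ring0_addr: 192.168.1.10
        name: node1
        nodeid: 1
    }
    node {
        ring0_addr: 192.168.1.11
        name: node2
        nodeid: 2
    }
}
```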

On both nodes start the corosync and pacemaker services (or restart them if they are already started):

  • service corosync start
  • service pacemaker start

On only one of the nodes, check whether any resources are already configured by default:

  • crm status

If there are, these must be removed:

  • crm resource stop <resource name>
  • crm configure delete <resource name>

On only one of the nodes run:

  • crm configure property stonith-enabled=false
  • crm configure property no-quorum-policy=ignore

To add the floating IP resource that Kamailio or OpenSIPS will listen on, run the following command on either one of the nodes, replacing the IP address and netmask with the appropriate values:

  • crm configure primitive ip ocf:heartbeat:IPaddr2 params ip=192.168.1.1 cidr_netmask="24" op monitor interval="30s"
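Note that cidr_netmask takes a prefix length rather than a dotted netmask. If you only have the dotted form, a small helper such as this hypothetical mask2cidr (not part of the guide; assumes a contiguous IPv4 netmask) converts it:

```shell
#!/bin/sh
# mask2cidr: convert a dotted-quad netmask into the CIDR prefix length
# expected by IPaddr2's cidr_netmask parameter.
mask2cidr() {
  # Split the dotted quad into four octets.
  IFS=. read -r a b c d <<EOF
$1
EOF
  # Pack the octets into one 32-bit value, then count leading one-bits.
  n=$(( a<<24 | b<<16 | c<<8 | d ))
  bits=0
  while [ $(( n & 0x80000000 )) -ne 0 ]; do
    bits=$(( bits + 1 ))
    n=$(( (n << 1) & 0xFFFFFFFF ))
  done
  echo "$bits"
}

mask2cidr 255.255.255.0   # prints 24
```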

Add the Enswitch resources that will be running on the nodes, omitting those that are not needed. Then group them together by running the following commands on only one of the nodes:

Systems using systemd:

  • crm configure primitive kamailio systemd:kamailio op monitor interval="30s"
  • crm configure primitive enswitch_messaged systemd:enswitch_messaged op monitor interval="30s"
  • crm configure primitive enswitch_sipd systemd:enswitch_sipd op monitor interval="30s"
  • crm configure primitive enswitch_blfd systemd:enswitch_blfd op monitor interval="30s"
  • crm configure group enswitch ip kamailio enswitch_messaged enswitch_sipd enswitch_blfd

Systems using System V init scripts:

  • crm configure primitive kamailio lsb:kamailio op monitor interval="30s"
  • crm configure primitive enswitch_messaged lsb:enswitch_messaged op monitor interval="30s"
  • crm configure primitive enswitch_sipd lsb:enswitch_sipd op monitor interval="30s"
  • crm configure primitive enswitch_blfd lsb:enswitch_blfd op monitor interval="30s"
  • crm configure group enswitch ip kamailio enswitch_messaged enswitch_sipd enswitch_blfd
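Resources in a Pacemaker group start in the listed order and always run on the same node, so with the group above the floating IP is brought up before Kamailio and the Enswitch daemons. After these commands the configuration should look roughly like the following (systemd variant shown; exact layout varies by crmsh version):

```
# crm configure show (approximate)
primitive ip IPaddr2 \
    params ip=192.168.1.1 cidr_netmask=24 \
    op monitor interval=30s
primitive kamailio systemd:kamailio \
    op monitor interval=30s
primitive enswitch_messaged systemd:enswitch_messaged \
    op monitor interval=30s
primitive enswitch_sipd systemd:enswitch_sipd \
    op monitor interval=30s
primitive enswitch_blfd systemd:enswitch_blfd \
    op monitor interval=30s
group enswitch ip kamailio enswitch_messaged enswitch_sipd enswitch_blfd
```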

Disable the resources that are managed by Corosync/Pacemaker so that they aren't started at system boot by systemd or init:

Systems using systemd:

  • systemctl disable kamailio
  • systemctl disable enswitch_messaged
  • systemctl disable enswitch_sipd
  • systemctl disable enswitch_blfd

Systems using Devuan (and older Debian-like systems):

  • update-rc.d kamailio remove
  • update-rc.d enswitch_messaged remove
  • update-rc.d enswitch_sipd remove
  • update-rc.d enswitch_blfd remove

Systems using CentOS, Fedora, or Redhat:

  • chkconfig kamailio off
  • chkconfig enswitch_messaged off
  • chkconfig enswitch_sipd off
  • chkconfig enswitch_blfd off

On only one of the nodes check whether the resources have started and are active on the correct node:

  • crm status

If any of the resources have not started, then run on any one of the nodes:

  • crm resource start <resource name>