Ubuntu HA - Pacemaker Fence Agents

From the ClusterLabs definition:

A fence agent (or fencing agent) is a stonith-class resource agent.

The fence agent standard provides commands (such as off and reboot) that the cluster can use to fence nodes. As with other resource agent classes, this allows a layer of abstraction so that Pacemaker doesn’t need any knowledge about specific fencing technologies — that knowledge is isolated in the agent.
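For example, with crmsh you can list the installed stonith-class agents and inspect the parameters a given agent accepts (the agent name below is just an example):

$ crm ra list stonith
$ crm ra info stonith:fence_mpath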

The Ubuntu Server team has been working to curate a set of fence agents, which in practice means having tests for each of them running in a continuous integration system. Those agents are available in the fence-agents-base binary package. For those fence agents, we briefly describe below how to use them.
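If the package is not already installed, it can be installed with apt; the agents it ships are typically placed under /usr/sbin:

$ sudo apt install fence-agents-base
$ dpkg -L fence-agents-base | grep /usr/sbin/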

fence_mpath

From its manpage:

fence_mpath is an I/O fencing agent that uses SCSI-3 persistent reservations to control access to multipath devices. Underlying devices must support SCSI-3 persistent reservations (SPC-3 or greater) as well as the “preempt-and-abort” subcommand. The fence_mpath agent works by having a unique key for each node that has to be set in /etc/multipath.conf. Once registered, a single node will become the reservation holder by creating a “write exclusive, registrants only” reservation on the device(s). The result is that only registered nodes may write to the device(s). When a node failure occurs, the fence_mpath agent will remove the key belonging to the failed node from the device(s). The failed node will no longer be able to write to the device(s). A manual reboot is required.

One could configure a fence_mpath resource with the following command:

$ crm configure primitive $RESOURCE_NAME stonith:fence_mpath \
            params \
            pcmk_host_map="$NODE1:$NODE1_RES_KEY;$NODE2:$NODE2_RES_KEY;$NODE3:$NODE3_RES_KEY" \
            pcmk_host_argument=plug \
            pcmk_monitor_action=metadata \
            pcmk_reboot_action=off \
            devices=$MPATH_DEVICE \
            meta provides=unfencing

$NODE1_RES_KEY is the reservation key used by node 1 (and likewise for the other nodes with access to the multipath device). Make sure you have reservation_key <key> in the defaults section of /etc/multipath.conf on each node, and that the multipathd service was reloaded afterwards.
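As a minimal sketch, the relevant excerpt of /etc/multipath.conf could look like this (the key value is only an example; each node gets its own key, matching the value used for it in pcmk_host_map):

# /etc/multipath.conf (excerpt)
defaults {
        reservation_key 0x1
}

Followed by a reload of multipathd:

$ sudo systemctl reload multipathd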

This is one way to set up fence_mpath; for more information, please check its manpage.
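To confirm that the keys were registered on the device, you can read them back with mpathpersist (assuming $MPATH_DEVICE is the same multipath device used above):

$ sudo mpathpersist --in --read-keys $MPATH_DEVICE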

fence_scsi

From its manpage:

fence_scsi is an I/O fencing agent that uses SCSI-3 persistent reservations to control access to shared storage devices. These devices must support SCSI-3 persistent reservations (SPC-3 or greater) as well as the “preempt-and-abort” subcommand. The fence_scsi agent works by having each node in the cluster register a unique key with the SCSI device(s). The reservation key is generated from the “node id” (default) or from the “node name hash” (recommended) by adjusting the “key_value” option. Using the hash is recommended to prevent issues when removing nodes from the cluster without a full cluster restart. Once registered, a single node will become the reservation holder by creating a “write exclusive, registrants only” reservation on the device(s). The result is that only registered nodes may write to the device(s). When a node failure occurs, the fence_scsi agent will remove the key belonging to the failed node from the device(s). The failed node will no longer be able to write to the device(s). A manual reboot is required.

One could configure a fence_scsi resource with the following command:

$ crm configure primitive $RESOURCE_NAME stonith:fence_scsi \
            params \
            pcmk_host_list="$NODE1 $NODE2 $NODE3" \
            devices=$SCSI_DEVICE \
            meta provides=unfencing

The pcmk_host_list parameter contains a list of cluster nodes that can access the managed SCSI device.

This is one way to set up fence_scsi; for more information, please check its manpage.
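To verify that each node’s key was registered on the shared device, the registered keys can be read back with sg_persist from the sg3-utils package (assuming $SCSI_DEVICE is the same device used above):

$ sudo sg_persist --in --read-keys $SCSI_DEVICE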

fence_virsh

From its manpage:

fence_virsh is an I/O fencing agent which can be used with virtual machines managed by libvirt. It logs into a dom0 via SSH and runs the virsh command there, which does all the work. By default, virsh needs a root account to work properly, so you must allow root SSH login in your sshd_config.
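As a quick sanity check (using the same placeholders as the configuration below), you can confirm that both the SSH login and virsh work before configuring the resource:

$ ssh -i $SSH_KEY $HOST_USER@$HOST_IP_ADDRESS 'virsh list --all'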

One could configure a fence_virsh resource with the following command:

$ crm configure primitive $RESOURCE_NAME stonith:fence_virsh \
            params \
            ip=$HOST_IP_ADDRESS \
            login=$HOST_USER \
            identity_file=$SSH_KEY \
            plug=$NODE \
            ssh=true \
            use_sudo=true

This is one way to set up fence_virsh; for more information, please check its manpage.
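Fence agents can also be invoked directly from the command line, which is handy for testing credentials before wiring the resource into the cluster. A sketch, assuming the long options mirror the resource parameters used above (check fence_virsh --help for the exact option names):

$ sudo fence_virsh --ip=$HOST_IP_ADDRESS --username=$HOST_USER \
            --identity-file=$SSH_KEY --plug=$NODE --action=status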

To avoid running the resource on the same node that it should fence, we need to add a location constraint:

$ crm configure location fence-$NODE-location $RESOURCE_NAME -inf: $NODE
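Afterwards, you can confirm that the constraint is in place and check where the fencing resources are running:

$ crm configure show
$ crm status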
