This article uses a series of commands to show how to manage cluster resources. Before getting started, here is the current status of the cluster:

# pcs status
Cluster name: pacemaker_cluster
Last updated: Sat Aug 29 08:21:24 2015
Last change: Sat Aug 29 08:21:04 2015
Stack: cman
Current DC: nodeA - partition with quorum
Version: 1.1.11-97629de
4 Nodes configured
18 Resources configured

Online: [ nodeA nodeB nodeC nodeD ]

Full list of resources:

 fs11    (ocf::heartbeat:Filesystem):    Started nodeA
 fs12    (ocf::heartbeat:Filesystem):    Started nodeA
 fs13    (ocf::heartbeat:Filesystem):    Started nodeB
 fs14    (ocf::heartbeat:Filesystem):    Started nodeB
 fs15    (ocf::heartbeat:Filesystem):    Started nodeC
 fs16    (ocf::heartbeat:Filesystem):    Started nodeC
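
When a cluster holds many resources, it can help to filter the status output by node. The following is a minimal sketch, not part of pcs itself: the function name resources_on_node is made up for illustration, and it assumes resource lines keep the "ID (type): Started NODE" layout shown above.

```shell
#!/bin/sh
# resources_on_node NODE
# Reads `pcs status` text on stdin and prints the IDs of resources that are
# reported as "Started NODE", one per line.
# Hypothetical helper; assumes the resource line layout shown in this article.
resources_on_node() {
    awk -v node="$1" '
        # Resource lines look like:
        #   fs11  (ocf::heartbeat:Filesystem):  Started nodeA
        $2 ~ /^\(.*\):$/ && $3 == "Started" && $4 == node { print $1 }
    '
}

# Example (on a cluster node):
#   pcs status | resources_on_node nodeA
```

With the status shown above, filtering for nodeA would list fs11 and fs12.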

Create a resource

To create a new resource

# pcs resource create fs21 ocf:heartbeat:Filesystem device=/dev/mapper/LUN21 directory=/lun21 fstype="xfs" fast_stop="no" force_unmount="safe" op stop on-fail=stop timeout=200 op monitor on-fail=stop timeout=200 OCF_CHECK_LEVEL=10
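
If several filesystems share the same options, the create command can be wrapped in a small function. This is only a sketch: create_fs_resource and the PCS override variable are hypothetical names introduced here, and the resource options are passed to pcs as plain name=value pairs.

```shell
#!/bin/sh
# create_fs_resource ID DEVICE DIRECTORY
# Creates an xfs Filesystem resource with the same stop/monitor settings
# used for fs21 in this article.
# Hypothetical helper: set PCS="echo pcs" to print the command instead of
# running it (a simple dry run).
create_fs_resource() {
    ${PCS:-pcs} resource create "$1" ocf:heartbeat:Filesystem \
        device="$2" directory="$3" fstype=xfs \
        fast_stop=no force_unmount=safe \
        op stop on-fail=stop timeout=200 \
        op monitor on-fail=stop timeout=200 OCF_CHECK_LEVEL=10
}

# Example (dry run; fs22 and LUN22 are made-up names):
#   PCS="echo pcs" create_fs_resource fs22 /dev/mapper/LUN22 /lun22
```

Once a resource has been created, 'pcs resource show <resource id>' displays its configuration.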

Delete a resource

To delete a resource

# pcs resource delete fs11

Manually Moving Resources Around the Cluster

To move a resource off its current node to any other node

nodeA# pcs resource move fs21

To move a resource to a particular node

nodeA# pcs resource move fs21 nodeC

To let a resource return to its original node by clearing the constraint that the move created

nodeA# pcs resource clear fs21

Note: You can also use the move command to move a resource back to its original node; however, that does not clear the constraint that the earlier move command generated. It is therefore better to use 'resource clear' to return the resource to its normal state.

Note2: When a resource is moved, any other resources that have constraints tying them to it are moved as well.
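
To see whether a move has left a constraint behind, the constraint list can be filtered. The sketch below is hypothetical: list_cli_constraints is a name made up here, and it assumes that constraints generated by the move and ban commands carry IDs starting with "cli-" (for example cli-prefer-fs21) and that 'pcs constraint --full' prints IDs as "(id:...)". Verify both on your own cluster.

```shell
#!/bin/sh
# list_cli_constraints
# Prints the IDs of constraints that look like they were generated by
# `pcs resource move` or `pcs resource ban`.
# Assumption: such IDs start with "cli-" and appear as "(id:cli-...)" in
# the output of `pcs constraint --full`.
list_cli_constraints() {
    pcs constraint --full | grep -o 'id:cli-[^)]*' | sed 's/^id://'
}

# Example (on a cluster node):
#   list_cli_constraints
```

If the function prints nothing, no leftover move or ban constraints were found.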

Moving Resources Due to Failure

By default no threshold is defined, so Pacemaker does not force a resource off a node no matter how many times it fails there; it normally just restarts the resource in place. To set a threshold of 3, run

# pcs resource meta fs21 migration-threshold=3

The command above defines the resource fs21 to move after 3 failures.

Note: After a resource has moved because of failures, it will not run on the original node again until its failcount is reset or its failure timeout expires.
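
The failure timeout mentioned in the note is controlled by the failure-timeout meta attribute, which is not set by default. As a sketch, the threshold and the timeout can be set together; set_failure_policy and the PCS override variable are hypothetical names, and the 60s value is only an example.

```shell
#!/bin/sh
# set_failure_policy ID THRESHOLD TIMEOUT
# Sets migration-threshold and failure-timeout on one resource.
# Hypothetical helper: set PCS="echo pcs" to print the command instead of
# running it.
set_failure_policy() {
    ${PCS:-pcs} resource meta "$1" \
        migration-threshold="$2" failure-timeout="$3"
}

# Example: move fs21 after 3 failures, and let failures expire after 60s
#   set_failure_policy fs21 3 60s
```

Expired failures are only noticed when the cluster re-evaluates its state, so the resource may become eligible for its original node somewhat later than the timeout value suggests.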

To set the default threshold for all resources to 3, so that every resource in the cluster moves after 3 failures

# pcs resource defaults migration-threshold=3

To show current failcount

# pcs resource failcount show fs21
No failcounts for fs21

To clear the failcount, run

# pcs resource failcount reset fs21

Note: The threshold applies only to failures during normal operation (such as monitor failures), not to failed start or stop operations.

Start failures cause the failcount to be set to INFINITY and thus always cause the resource to move immediately.
Stop failures are slightly different and crucial. If a resource fails to stop and STONITH is enabled, then the cluster will fence the node in order to be able to start the resource elsewhere. If STONITH is not enabled, then the cluster has no way to continue and will not try to start the resource elsewhere, but will try to stop it again after the failure timeout.
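
Because the outcome of a stop failure depends on STONITH, it is worth checking the stonith-enabled property before relying on this behaviour. The following is a rough sketch: stonith_enabled is a made-up function name, and it assumes that 'pcs property show --all' prints one "name: value" pair per line, which may differ between pcs versions.

```shell
#!/bin/sh
# stonith_enabled
# Prints the current value of the stonith-enabled cluster property.
# Assumption: `pcs property show --all` prints lines such as
#   " stonith-enabled: true"
# Adjust the pattern if your pcs version formats properties differently.
stonith_enabled() {
    pcs property show --all | awk '$1 == "stonith-enabled:" { print $2 }'
}

# Example (on a cluster node):
#   stonith_enabled
```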

Moving Resources Due to Connectivity Changes

Within the cluster, once a node has connection issues with the other nodes, it will be fenced off (depending on the fencing properties). But what about external connectivity? What happens if a node loses its external connectivity?

The solution is to create a ping resource and configure a location constraint that moves a resource to a different node when connectivity is lost. The ping resource should be cloned so that it runs on every node and each node records its own connectivity score.

# pcs resource create ping ocf:pacemaker:ping dampen=5s multiplier=1000 host_list=<external ip or host> --clone

Then set a location constraint rule on each resource that should move when connectivity is lost

# pcs constraint location <resource id> rule score=-INFINITY pingd lt 1 or not_defined pingd
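
Since the rule has to be added per resource, a loop saves typing when several resources depend on external connectivity. This is a minimal sketch: add_ping_rule and the PCS override variable are hypothetical names, and the rule is the one shown above.

```shell
#!/bin/sh
# add_ping_rule ID [ID...]
# Adds the connectivity rule from this article to each listed resource.
# Stops at the first failure.
# Hypothetical helper: set PCS="echo pcs" to print the commands instead of
# running them.
add_ping_rule() {
    for rsc in "$@"; do
        ${PCS:-pcs} constraint location "$rsc" rule score=-INFINITY \
            pingd lt 1 or not_defined pingd || return 1
    done
}

# Example (dry run):
#   PCS="echo pcs" add_ping_rule fs11 fs12
```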

Enabling, Disabling, and Banning Cluster Resources

To stop a resource and prevent it from being started on any other node

# pcs resource disable <resource id>

To start the resource again and return it to its normal state

# pcs resource enable <resource id>

To ban a resource from a node

# pcs resource ban <resource id> [node]

If no node is specified, the resource is banned from the node it is currently running on.

To remove the ban constraint, run

# pcs resource clear <resource id>

To debug a resource start

# pcs resource debug-start <resource id>

Disabling a Monitor Operation

To disable the monitor operation for a resource

# pcs resource update fs21 op monitor enabled="false"

To enable the monitor operation for a resource

# pcs resource update fs21 op monitor enabled="true"

To stop monitoring a resource permanently, delete its monitor operation.
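
The three monitor actions above can be gathered in one helper. This is a sketch under assumptions: monitor_ctl and the PCS override variable are hypothetical names, and the removal uses 'pcs resource op remove', which deletes the monitor operation from the resource definition.

```shell
#!/bin/sh
# monitor_ctl ID disable|enable|remove
# Disables, re-enables, or permanently removes the monitor operation of a
# resource.
# Hypothetical helper: set PCS="echo pcs" to print the command instead of
# running it.
monitor_ctl() {
    case "$2" in
        disable) ${PCS:-pcs} resource update "$1" op monitor enabled=false ;;
        enable)  ${PCS:-pcs} resource update "$1" op monitor enabled=true ;;
        remove)  ${PCS:-pcs} resource op remove "$1" monitor ;;
        *) echo "usage: monitor_ctl ID disable|enable|remove" >&2
           return 2 ;;
    esac
}

# Example (dry run):
#   PCS="echo pcs" monitor_ctl fs21 remove
```

Keep in mind that without a monitor operation the cluster no longer detects failures of a running resource.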

Managed Resources

To set a resource to the unmanaged state, run the command below. Unlike a deleted resource, an unmanaged resource remains in the cluster configuration, but Pacemaker no longer manages it.

# pcs resource unmanage <resource id>

To set a resource to managed state

# pcs resource manage <resource id>
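
To find resources that were left unmanaged, the status output can be filtered. This is a rough sketch: unmanaged_resources is a made-up name, and it assumes that 'pcs status' appends "(unmanaged)" to the line of each unmanaged resource, which may vary between Pacemaker versions.

```shell
#!/bin/sh
# unmanaged_resources
# Reads `pcs status` text on stdin and prints the IDs of resources whose
# status line is marked "(unmanaged)".
# Assumption: resource lines look like
#   fs21  (ocf::heartbeat:Filesystem):  Started nodeA (unmanaged)
unmanaged_resources() {
    awk '/\(unmanaged\)/ && $2 ~ /^\(.*\):$/ { print $1 }'
}

# Example (on a cluster node):
#   pcs status | unmanaged_resources
```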