Cluster Server 7.4.1 Administrator's Guide - Linux
How membership arbitration works
Upon startup of the cluster, all systems register a unique key on the coordinator disks. The key is unique to the cluster and the node, and is based on the LLT cluster ID and the LLT system ID.
See About the I/O fencing registration key format.
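As a rough illustration of how a key can be unique to both the cluster and the node, the following sketch combines the LLT cluster ID and the LLT system ID into one string. The layout used here (a two-character prefix, then the cluster ID as four hexadecimal digits, then the node ID as two hexadecimal digits) and the function name `registration_key` are assumptions made for this example; the referenced key format topic is the authoritative description.

```python
def registration_key(cluster_id: int, node_id: int, prefix: str = "VF") -> str:
    """Build an illustrative fencing key from the LLT cluster ID and LLT system ID.

    The prefix and field widths are assumptions for this sketch, not a
    statement of the product's on-disk format.
    """
    if not 0 <= cluster_id <= 0xFFFF:
        raise ValueError("cluster_id must fit in four hexadecimal digits")
    if not 0 <= node_id <= 0xFF:
        raise ValueError("node_id must fit in two hexadecimal digits")
    # Same cluster ID, different node ID -> keys differ only in the last field.
    return f"{prefix}{cluster_id:04X}{node_id:02X}"
```

Because both IDs are part of the key, two nodes in the same cluster never share a key, and nodes with the same system ID in different clusters do not collide either.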
When there is a perceived change in membership, membership arbitration works as follows:
GAB marks the system as DOWN, excludes the system from the cluster membership, and delivers the membership change - the list of departed systems - to the fencing module.
The system with the lowest LLT system ID in the cluster races for control of the coordinator disks.
In the most common case, where departed systems are truly down or faulted, this race has only one contestant.
In a split brain scenario, where two or more subclusters have formed, the system with the lowest LLT system ID in each subcluster races for the coordinator disks. The system that races on behalf of all the other systems in its subcluster is called the RACER node, and the other systems in the subcluster are called the SPECTATOR nodes.
During the I/O fencing race, if the RACER node panics or if it cannot reach the coordination points, then the VxFEN RACER node re-election feature allows an alternate node in the subcluster that has the next lowest node ID to take over as the RACER node.
The racer re-election works as follows:
In the event of an unexpected panic of the RACER node, the VxFEN driver initiates a racer re-election.
If the RACER node is unable to reach a majority of coordination points, then the VxFEN module sends a RELAY_RACE message to the other nodes in the subcluster. The VxFEN module then re-elects the node with the next lowest node ID as the new RACER.
With successive re-elections, if no more nodes are available to be re-elected as the RACER node, then all the nodes in the subcluster will panic.
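The election and re-election order described above can be sketched as a walk through the subcluster's node IDs in ascending order. This is a minimal model, not the VxFEN implementation: `elect_racer` and the `can_race` callback are hypothetical names, and `can_race` stands in for "the node has not panicked and can reach a majority of coordination points".

```python
from typing import Callable, Iterable, Optional


def elect_racer(subcluster: Iterable[int],
                can_race: Callable[[int], bool]) -> Optional[int]:
    """Return the node ID that ends up as RACER, or None if no node can race.

    can_race is a hypothetical predicate supplied by the caller.
    """
    # The lowest node ID races first; each failure hands the race
    # to the node with the next lowest ID.
    for node_id in sorted(subcluster):
        if can_race(node_id):
            return node_id
    # No candidate is left; in the text above this is the case
    # where every node in the subcluster panics.
    return None
```

For example, `elect_racer([3, 1, 2], lambda n: n != 1)` models node 1 failing and node 2 taking over as RACER.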
The race consists of executing a preempt and abort command for each key of each system that appears to no longer be in the GAB membership.
The preempt and abort command allows only a registered system with a valid key to eject the key of another system. This ensures that even when multiple systems attempt to eject each other, each race will have only one winner. The first system to issue a preempt and abort command will win and eject the key of the other system. When the second system issues a preempt and abort command, it cannot perform the key eject because it is no longer a registered system with a valid key.
If the value of the cluster-level attribute PreferredFencingPolicy is System, Group, or Site, then at the time of a race, the VxFEN RACER node adds up the weights for all nodes in the local subcluster and in the leaving subcluster. If the leaving subcluster has a higher sum of node weights, then the racer for the local subcluster delays its race for the coordination points. This effectively gives the more critical subcluster a preference to win the race. If the value of the cluster-level attribute PreferredFencingPolicy is Disabled, then the delay is calculated based on the sums of node counts.
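The delay decision can be sketched as a comparison of two sums. The function name `local_racer_delays` and the mapping of node ID to weight are hypothetical. Two points are assumptions that the text above does not spell out: a tie produces no delay, and with the Disabled policy the subcluster with fewer nodes is the one that delays.

```python
from typing import Mapping

WEIGHTED_POLICIES = {"System", "Group", "Site"}


def local_racer_delays(local: Mapping[int, int],
                       leaving: Mapping[int, int],
                       policy: str) -> bool:
    """Return True if the local RACER should delay its race.

    local and leaving map node ID -> node weight for each subcluster.
    """
    if policy in WEIGHTED_POLICIES:
        local_sum = sum(local.values())
        leaving_sum = sum(leaving.values())
    elif policy == "Disabled":
        # Only node counts matter when preferred fencing is disabled.
        local_sum, leaving_sum = len(local), len(leaving)
    else:
        raise ValueError(f"unknown PreferredFencingPolicy value: {policy!r}")
    # Assumption: a tie does not cause a delay.
    return leaving_sum > local_sum
```

With `local = {0: 10}` and `leaving = {1: 1, 2: 1}`, the single heavy node does not delay under the System policy, but it does delay under Disabled because it is outnumbered.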
If the preempt and abort command returns success, that system has won the race for that coordinator disk.
Each racing system repeats this race on all the coordinator disks. The race is won by, and control is attained by, the system that ejects the other system's registration keys from a majority of the coordinator disks.
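The "only one winner" property and the majority rule can be illustrated with a small in-memory model in which each coordinator disk is a set of registered keys. This is a sketch under that assumption only; `preempt_and_abort` and `race` are hypothetical helper names and do not issue any real disk commands.

```python
from typing import Iterable, List, Set


def preempt_and_abort(disk: Set[str], own_key: str,
                      victim_keys: Iterable[str]) -> bool:
    """Eject victim keys from one disk; fail if own_key is not registered."""
    if own_key not in disk:
        # A system whose key was already ejected cannot eject anyone.
        return False
    disk.difference_update(victim_keys)
    return True


def race(disks: List[Set[str]], own_key: str,
         victim_keys: Iterable[str]) -> bool:
    """Race on every disk; win only with a majority of the disks."""
    victims = list(victim_keys)
    won = sum(preempt_and_abort(disk, own_key, victims) for disk in disks)
    return won > len(disks) // 2
```

If two racers with keys `"A"` and `"B"` contend for three disks that each hold both keys, whichever call to `race` runs first succeeds on all three disks, and the second call fails on every disk because its own key is gone.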
On the system that wins the race, the vxfen module informs all the systems on whose behalf it was racing that it won the race and that the subcluster is still valid.
On the system(s) that do not win the race, the vxfen module will trigger a system panic. The other systems in this subcluster will note the panic, determine they lost control of the coordinator disks, and also panic and restart.
Upon restart, the systems will attempt to seed into the cluster.
If the systems that restart can exchange heartbeat with the number of cluster systems declared in /etc/gabtab, they will automatically seed and continue to join the cluster. Their keys will be replaced on the coordinator disks. This case will only happen if the original reason for the membership change has cleared during the restart.
If the systems that restart cannot exchange heartbeat with the number of cluster systems declared in /etc/gabtab, they will not automatically seed, and HAD will not start. This is a possible split brain condition, and requires administrative intervention.
If you have I/O fencing enabled in your cluster and if you have set the GAB auto-seeding feature through I/O fencing, GAB automatically seeds the cluster even when some cluster nodes are unavailable.
See Seeding a cluster using the GAB auto-seed parameter through I/O fencing.
Note:
Forcing a manual seed at this point will allow the cluster to seed. However, when the fencing module checks the GAB membership against the systems that have keys on the coordinator disks, a mismatch will occur. vxfen will detect a possible split brain condition, print a warning, and will not start. In turn, HAD will not start. Administrative intervention is required.
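The automatic seeding check described in the restart step can be sketched as a comparison between the number of systems currently exchanging heartbeats and the count declared in /etc/gabtab. The sketch assumes that the file contains a gabconfig line with an `-n<count>` option, such as `/sbin/gabconfig -c -n4`, and that reaching the declared count is sufficient to seed. The function names are hypothetical, and the GAB auto-seed feature through I/O fencing is not modeled.

```python
import re


def seed_count_from_gabtab(text: str) -> int:
    """Extract the system count from an -n<count> option in gabtab text."""
    match = re.search(r"-n\s*(\d+)", text)
    if match is None:
        raise ValueError("no -n<count> option found in gabtab text")
    return int(match.group(1))


def can_auto_seed(heartbeating_systems: int, gabtab_text: str) -> bool:
    """Return True if enough systems exchange heartbeats to seed automatically."""
    # Assumption: seeding needs at least the declared number of systems.
    return heartbeating_systems >= seed_count_from_gabtab(gabtab_text)
```

A result of False corresponds to the case above in which the systems do not seed automatically and administrative intervention is required.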