Cluster Server 7.4.1 Administrator's Guide - Linux
Resource type attributes
You can override some static attributes for resource types.
For more information on any attribute listed below, see the chapter on setting agent parameters in the Cluster Server Agent Developer's Guide.
Table: Resource type attributes lists the resource type attributes.
Table: Resource type attributes
Resource type attributes | Description |
---|---|
ActionTimeout (user-defined) | Timeout value for the Action function.
|
AdvDbg (user-defined) | Enables activation of advanced debugging:
For information about the AdvDbg attribute, see the Cluster Server Agent Developer's Guide. |
AgentClass (user-defined) | Indicates the scheduling class for the VCS agent process. Use only one of the following sets of attributes to configure scheduling class and priority for VCS:
|
AgentDirectory (user-defined) | Complete path of the directory in which the agent binary and scripts are located. Agents look for binaries and scripts in the following directories:
If none of the above directories exist, the agent does not start. Use this attribute in conjunction with the AgentFile attribute to specify a different location or different binary for the agent.
|
AgentFailedOn (system use only) | A list of systems on which the agent for the resource type has failed.
|
AgentFile (user-defined) | Complete name and path of the binary for an agent. If you do not specify a value for this attribute, VCS uses the agent binary at the path defined by the AgentDirectory attribute.
|
AgentPriority (user-defined) | Indicates the priority in which the agent process runs. Use only one of the following sets of attributes to configure scheduling class and priority for VCS:
Type and dimension: string-scalar Default: 0 |
AgentReplyTimeout (user-defined) | The number of seconds the engine waits to receive a heartbeat from the agent before restarting the agent.
|
AgentStartTimeout (user-defined) | The number of seconds after starting the agent that the engine waits for the initial agent "handshake" before restarting the agent.
|
AlertOnMonitorTimeouts (user-defined) Note: This attribute can be overridden. | When a monitor times out as many times as the value specified by this attribute, or a multiple of that value, VCS sends an SNMP notification to the user. If this attribute is set to a value N, VCS sends a notification at the first monitor timeout and then at every Nth consecutive monitor timeout; the first timeout counts toward the N timeouts that precede the second notification. When AlertOnMonitorTimeouts is set to 0, VCS sends an SNMP notification only for the first monitor timeout and sends no further notifications for subsequent monitor timeouts until the monitor returns a success. Use AlertOnMonitorTimeouts together with the FaultOnMonitorTimeouts attribute to control how the resources of a service group behave on monitor timeouts. When FaultOnMonitorTimeouts is set to 0 and AlertOnMonitorTimeouts is set to some value for all resources of a service group, VCS takes no action on monitor timeouts for the resources in that service group and only sends notifications at the frequency set in the AlertOnMonitorTimeouts attribute.
|
ArgList (user-defined) | An ordered list of attributes whose values are passed to the open, close, online, offline, monitor, clean, info, and action functions.
|
AttrChangedTimeout (user-defined) Note: This attribute can be overridden. | Maximum time (in seconds) within which the attr_changed function must complete or be terminated.
|
CleanRetryLimit (user-defined) | Number of times to retry the clean function before moving a resource to ADMIN_WAIT state. If set to 0, clean is retried indefinitely. The valid values of this attribute are in the range 0-1024.
|
CleanTimeout (user-defined) Note: This attribute can be overridden. | Maximum time (in seconds) within which the clean function must complete or else be terminated.
|
CloseTimeout (user-defined) Note: This attribute can be overridden. | Maximum time (in seconds) within which the close function must complete or else be terminated.
|
ConfInterval (user-defined) Note: This attribute can be overridden. | When a resource has remained online for the specified time (in seconds), previous faults and restart attempts are ignored by the agent. (See ToleranceLimit and RestartLimit attributes for details.)
|
ContainerOpts (system use only) | Specifies information that passes to the agent that controls the resources. These values are only effective when you set the ContainerInfo service group attribute.
|
EPClass (user-defined) | Enables you to control the scheduling class for the agent functions (entry points) other than the online entry point whether the entry point is in C or scripts. The following values are valid for this attribute:
Use only one of the following sets of attributes to configure scheduling class and priority for VCS:
|
EPPriority (user-defined) | Enables you to control the scheduling priority for the agent functions (entry points) other than the online entry point. The attribute controls the agent function priority whether the entry point is in C or scripts. The following values are valid for this attribute:
Use only one of the following sets of attributes to configure scheduling class and priority for VCS:
|
ExternalStateChange (user-defined) Note: This attribute can be overridden. | Defines how VCS handles service group state when resources are intentionally brought online or taken offline outside of VCS control. The attribute can take the following values: OnlineGroup: If the configured application is started outside of VCS control, VCS brings the corresponding service group online. OfflineGroup: If the configured application is stopped outside of VCS control, VCS takes the corresponding service group offline. OfflineHold: If a configured application is stopped outside of VCS control, VCS sets the state of the corresponding VCS resource as offline. VCS does not take any parent resources or the service group offline. OfflineHold and OfflineGroup are mutually exclusive. |
FaultOnMonitorTimeouts (user-defined) Note: This attribute can be overridden. | When a monitor times out as many times as the value specified, the corresponding resource is brought down by calling the clean function. The resource is then marked FAULTED, or it is restarted, depending on the value set in the RestartLimit attribute. When FaultOnMonitorTimeouts is set to 0, monitor failures are not considered indicative of a resource fault. A low value may lead to spurious resource faults, especially on heavily loaded systems.
|
FaultPropagation (user-defined) Note: This attribute can be overridden. | Specifies if VCS should propagate the fault up to parent resources and take the entire service group offline when a resource faults. The value 1 indicates that when a resource faults, VCS fails over the service group, if the group's AutoFailOver attribute is set to 1. The value 0 indicates that when a resource faults, VCS does not take other resources offline, regardless of the value of the Critical attribute. The service group does not fail over on resource fault.
|
FireDrill (user-defined) | Specifies whether or not fire drill is enabled for the resource type. If the value is:
You can override this attribute.
|
IMF (user-defined) Note: This attribute can be overridden. | Determines whether the IMF-aware agent must perform intelligent resource monitoring. You can also override the value of this attribute at resource-level. Type and dimension: integer-association This attribute includes the following keys:
|
IMFRegList (user-defined) | An ordered list of attributes whose values are registered with the IMF notification module.
|
InfoInterval (user-defined) | Duration (in seconds) after which the info function is invoked by the agent framework for ONLINE resources of the particular resource type. If set to 0, the agent framework does not periodically invoke the info function. To manually invoke the info function, use the command hares -refreshinfo. If the value you designate is 30, for example, the function is invoked every 30 seconds for all ONLINE resources of the particular resource type.
|
IntentionalOffline (user-defined) | Defines how VCS reacts when a configured application is intentionally stopped outside of VCS control. Add this attribute for agents that support detection of an intentional offline outside of VCS control. Note that the intentional offline feature is available for agents registered as V51 or later. The value 0 instructs the agent to register a fault and initiate the failover of a service group when the supported resource is taken offline outside of VCS control. The value 1 instructs VCS to take the resource offline when the corresponding application is stopped outside of VCS control.
|
InfoTimeout (user-defined) | Timeout value for the info function. If the function does not complete by the designated time, the agent framework cancels the function's thread.
|
LevelTwoMonitorFreq (user-defined) | Specifies the frequency at which the agent for this resource type must perform second-level or detailed monitoring. Type and dimension: integer-scalar Default: 0 |
LogDbg (user-defined) | Indicates the debug severities enabled for the resource type or agent framework. Debug severities used by the agent functions are in the range of DBG_1 - DBG_21. The debug messages from the agent framework are logged with the severities DBG_AGINFO, DBG_AGDEBUG and DBG_AGTRACE, representing the least to most verbose.
The LogDbg attribute can be overridden. You can use the LogDbg attribute to set the DBG_AGINFO, DBG_AGTRACE, and DBG_AGDEBUG severities at the resource level, but doing so has no effect because these levels are specific to the agent type. Veritas recommends setting values from DBG_1 to DBG_21 at the resource level using the LogDbg attribute. |
LogFileSize (user-defined) | Specifies the size (in bytes) of the agent log file. Minimum value is 65536 bytes (64 KB). Maximum value is 134217728 bytes (128 MB).
|
LogViaHalog (user-defined) | Controls whether the log messages of all entry points are written to the respective agent log file or to the engine log file, depending on the configured value.
Type: boolean-scalar Default: 0 |
MigrateWaitLimit (user-defined) | Number of monitor intervals to wait for a resource to migrate after the migrate procedure is complete. MigrateWaitLimit applies to both the source and the target node because the migrate operation takes the resource offline on the source node and brings it online on the target node. In other words, MigrateWaitLimit is the number of monitor intervals to wait for the resource to go offline on the source node after the migrate procedure completes, and the number of monitor intervals to wait for the resource to come online on the target node after it is offline on the source node.
Note: This attribute can be overridden. Probes fired manually are counted when MigrateWaitLimit is set and the resource is waiting to migrate. For example, if the MigrateWaitLimit of a resource is set to 5 and the MonitorInterval is set to 60 (seconds), the resource waits for a maximum of five monitor intervals (that is, 5 x 60 seconds). If all five monitors within MigrateWaitLimit report the resource as online on the source node, the ADMIN_WAIT flag is set. If you run another probe, the resource waits for four monitor intervals (that is, 4 x 60 seconds), and if the fourth monitor does not report the state as offline on the source node, the ADMIN_WAIT flag is set. This procedure is repeated for 5 complete cycles. Similarly, if the resource does not move to the online state within the MigrateWaitLimit, the ADMIN_WAIT flag is set. |
MigrateTimeout (user-defined) | Maximum time (in seconds) within which the migrate procedure must complete or else be terminated.
Note: This attribute can be overridden. |
MonitorInterval (user-defined) Note: This attribute can be overridden. | Duration (in seconds) between two consecutive monitor calls for an ONLINE or transitioning resource. Note: The value of this attribute for the MultiNICB type must be less than its value for the IPMultiNICB type. See the Cluster Server Bundled Agents Reference Guide for more information. A low value may impact performance if many resources of the same type exist. A high value may delay detection of a faulted resource.
|
MonitorStatsParam (user-defined) | Stores the required parameter values for calculating monitor time statistics. static str MonitorStatsParam = {Frequency = 10, ExpectedValue = 3000, ValueThreshold = 100, AvgThreshold = 40} Frequency: The number of monitor cycles after which the average monitor cycle time should be computed and sent to the engine. If configured, the value for this attribute must be between 1 and 30. The value 0 indicates that the monitor cycle time should not be computed. Default=0. ExpectedValue: The expected monitor time in milliseconds for all resources of this type. Default=100. ValueThreshold: The acceptable percentage difference between the expected monitor cycle time (ExpectedValue) and the actual monitor cycle time. Default=100. AvgThreshold: The acceptable percentage difference between the benchmark average and the moving average of monitor cycle times. Default=40.
|
MonitorTimeout (user-defined) Note: This attribute can be overridden. | Maximum time (in seconds) within which the monitor function must complete or else be terminated.
|
NumThreads (user-defined) | Number of threads used within the agent process for managing resources. This number does not include threads used for other internal purposes. If the number of resources being managed by the agent is less than or equal to the NumThreads value, only that many threads are created in the agent. Adding more resources does not create more service threads. Similarly, deleting resources causes service threads to exit. Thus, setting NumThreads to 1 forces the agent to use just one service thread regardless of the resource count. The agent framework limits the value of this attribute to 30.
|
OfflineMonitorInterval (user-defined) Note: This attribute can be overridden. | Duration (in seconds) between two consecutive monitor calls for an OFFLINE resource. If set to 0, OFFLINE resources are not monitored.
|
OfflineTimeout (user-defined) Note: This attribute can be overridden. | Maximum time (in seconds) within which the offline function must complete or else be terminated.
|
OfflineWaitLimit (user-defined) Note: This attribute can be overridden. | Number of monitor intervals to wait for the resource to go offline after completing the offline procedure. Increase the value of this attribute if the resource is likely to take a longer time to go offline. Probes fired manually are counted when OfflineWaitLimit is set and the resource is waiting to go offline. For example, say the OfflineWaitLimit of a resource is set to 5 and the MonitorInterval is set to 60. The resource waits for a maximum of five monitor intervals (five times 60), and if all five monitors within OfflineWaitLimit report the resource as online, it calls the clean agent function. If the user fires a probe, the resource waits for four monitor intervals (four times 60), and if the fourth monitor does not report the state as offline, it calls the clean agent function. If the user fires another probe, one more monitor cycle is consumed and the resource waits for three monitor intervals (three times 60), and if the third monitor does not report the state as offline, it calls the clean agent function.
|
OnlineClass (user-defined) | Enables you to control the scheduling class for the online agent function (entry point). This attribute controls the class whether the entry point is in C or scripts. The following values are valid for this attribute:
Use only one of the following sets of attributes to configure scheduling class and priority for VCS:
|
OnlinePriority (user-defined) | Enables you to control the scheduling priority for the online agent function (entry point). This attribute controls the priority whether the entry point is in C or scripts. The following values are valid for this attribute:
Use only one of the following sets of attributes to configure scheduling class and priority for VCS:
|
OnlineRetryLimit (user-defined) Note: This attribute can be overridden. | Number of times to retry the online operation if the attempt to online a resource is unsuccessful. This parameter is meaningful only if the clean operation is implemented.
|
OnlineTimeout (user-defined) Note: This attribute can be overridden. | Maximum time (in seconds) within which the online function must complete or else be terminated.
|
OnlineWaitLimit (user-defined) Note: This attribute can be overridden. | Number of monitor intervals to wait for the resource to come online after completing the online procedure. Increase the value of this attribute if the resource is likely to take a longer time to come online. Each probe command fired from the user is considered as one monitor interval. For example, say the OnlineWaitLimit of a resource is set to 5. This means that the resource will be moved to a faulted state after five monitor intervals. If the user fires a probe, then the resource will be faulted after four monitor cycles, if the fourth monitor does not report the state as ONLINE. If the user again fires a probe, then one more monitor cycle is consumed and the resource will be faulted if the third monitor does not report the state as ONLINE.
|
OpenTimeout (user-defined) Note: This attribute can be overridden. | Maximum time (in seconds) within which the open function must complete or else be terminated.
|
Operations (user-defined) | Indicates valid operations for resources of the resource type. Values are OnOnly (can online only), OnOff (can online and offline), None (cannot online or offline).
|
RestartLimit (user-defined) Note: This attribute can be overridden. | Number of times to retry bringing a resource online when it is taken offline unexpectedly and before VCS declares it FAULTED.
|
ScriptClass (user-defined) | Indicates the scheduling class of the script processes (for example, online) created by the agent. Use only one of the following sets of attributes to configure scheduling class and priority for VCS:
|
ScriptPriority (user-defined) | Indicates the priority of the script processes created by the agent. Use only one of the following sets of attributes to configure scheduling class and priority for VCS:
|
SourceFile (user-defined) | File from which the configuration is read. Do not configure this attribute in main.cf. Make sure the path exists on all nodes before running a command that configures this attribute.
|
SupportedActions (user-defined) | Valid action tokens for the resource type.
|
SupportedOperations (user-defined) | Indicates the additional operations for a resource type or an agent. Only the migrate keyword is supported.
An example of a resource type that supports migration is a Kernel-based Virtual Machine (KVM). |
ToleranceLimit (user-defined) Note: This attribute can be overridden. | After a resource goes online, the number of times the monitor function should return OFFLINE before declaring the resource FAULTED. A large value could delay detection of a genuinely faulted resource.
|
TypeOwner (user-defined) | This attribute is used for VCS notification. VCS sends notifications to persons designated in this attribute when an event occurs related to the agent's resource type. If the agent of that type faults or restarts, VCS sends a notification to the TypeOwner. Note that while VCS logs most events, not all events trigger notifications. Make sure to set the severity level at which you want notifications to be sent to TypeOwner or to at least one recipient defined in the SmtpRecipients attribute of the NotifierMngr agent.
|
TypeRecipients (user-defined) | The email IDs set in the TypeRecipients attribute receive email notifications for events related to a specific agent. There are only two types of events related to an agent for which notifications are sent:
|
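The OfflineWaitLimit, OnlineWaitLimit, and MigrateWaitLimit rows in the table above describe a budget of monitor intervals that manually fired probes also consume. The arithmetic in those examples can be sketched in Python as follows. The function name `remaining_wait_seconds` and its parameters are hypothetical and are not part of VCS; the sketch assumes that each manual probe consumes exactly one monitor interval, as the examples in the table state.

```python
def remaining_wait_seconds(wait_limit, monitor_interval, manual_probes=0):
    """Return the longest remaining wait, in seconds, before the wait limit expires.

    Hypothetical helper for illustration only; it is not a VCS API.
    wait_limit       -- value of OfflineWaitLimit, OnlineWaitLimit, or MigrateWaitLimit
    monitor_interval -- value of MonitorInterval, in seconds
    manual_probes    -- number of probes fired manually while the resource is waiting
    """
    if wait_limit < 0 or monitor_interval < 0 or manual_probes < 0:
        raise ValueError("arguments must be non-negative")
    # Assumption taken from the table: each manual probe uses up one interval.
    remaining_intervals = max(wait_limit - manual_probes, 0)
    return remaining_intervals * monitor_interval


# Values from the OfflineWaitLimit example: limit 5, MonitorInterval 60.
print(remaining_wait_seconds(5, 60))                   # no manual probes: 300
print(remaining_wait_seconds(5, 60, manual_probes=1))  # one manual probe: 240
print(remaining_wait_seconds(5, 60, manual_probes=2))  # two manual probes: 180
```

With a limit of 5 and an interval of 60 seconds, the budget is five intervals (300 seconds); one manual probe leaves four intervals and a second leaves three, which matches the sequence described for OfflineWaitLimit.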
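The notification schedule described in the AlertOnMonitorTimeouts row of the table above can also be illustrated with a short Python sketch. This is one reading of that description, not the VCS implementation: it assumes a notification at the first consecutive monitor timeout, then at every multiple of the attribute value N, and only at the first timeout when the value is 0. The function `should_notify` is a hypothetical name introduced for illustration.

```python
def should_notify(timeout_count, alert_on_monitor_timeouts):
    """Decide whether a notification is due at this consecutive monitor timeout.

    Hypothetical model for illustration only; it is not a VCS API.
    timeout_count             -- consecutive monitor timeouts so far (1 for the first);
                                 the caller resets it when the monitor succeeds
    alert_on_monitor_timeouts -- value of the AlertOnMonitorTimeouts attribute
    """
    if timeout_count < 1:
        return False
    if timeout_count == 1:
        # The first monitor timeout always produces a notification.
        return True
    if alert_on_monitor_timeouts <= 0:
        # With a value of 0, only the first timeout is reported.
        return False
    # Assumption: later notifications fall on multiples of the attribute value.
    return timeout_count % alert_on_monitor_timeouts == 0


# Timeouts 1 through 9 with AlertOnMonitorTimeouts set to 3 and to 0.
print([c for c in range(1, 10) if should_notify(c, 3)])
print([c for c in range(1, 10) if should_notify(c, 0)])
```

Under these assumptions, a value of 3 yields notifications at timeouts 1, 3, 6, and 9, and a value of 0 yields a single notification at timeout 1.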