InfoScale™ 9.0 Cluster Server Administrator's Guide - Windows
Troubleshooting service groups
This topic lists the most common problems associated with bringing service groups online and taking them offline. Recommended actions are included where applicable.
System is not in RUNNING state.
Recommended action: Type hasys -display system to verify the system is running.
For more information on system states, see Appendix B, Cluster and system states.
Service group not configured to run on the system.
The SystemList attribute of the group may not contain the name of the system.
Recommended action: Use the output of the command hagrp -display service_group to verify the system name.
Service group not configured to autostart.
If the service group is not starting automatically on the system, the group may not be configured to AutoStart, or may not be configured to AutoStart on that particular system.
Recommended action: Use the output of the command hagrp -display service_group to verify the values of the AutoStart and AutoStartList attributes.
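The way the two attributes combine can be sketched in Python. This is a minimal illustration, assuming the AutoStart and AutoStartList values have already been read from the hagrp -display output; the function name, its arguments, and the system names are hypothetical and not part of VCS.

```python
def autostart_problem(auto_start, auto_start_list, system):
    """Return why a group will not autostart on `system`, or None if it should.

    auto_start      -- value of the AutoStart attribute (1 = enabled, 0 = disabled)
    auto_start_list -- system names taken from the AutoStartList attribute
    system          -- name of the system being checked
    """
    if int(auto_start) != 1:
        return "AutoStart is disabled for the group"
    if system not in auto_start_list:
        return f"{system} is not in AutoStartList"
    return None


# Hypothetical values, for illustration only.
print(autostart_problem(1, ["SYSA", "SYSB"], "SYSC"))
```

The sketch covers only these two attributes. Other conditions described in this topic, such as a frozen or autodisabled group, can also prevent the group from starting.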
Service group is frozen.
Recommended action: Use the output of the command hagrp -display service_group to verify the value of the Frozen and TFrozen attributes. Use the command hagrp -unfreeze to unfreeze the group. Note that VCS will not take a frozen service group offline.
Service group autodisabled.
When VCS does not know the status of a service group on a particular system, it autodisables the service group on that system. Autodisabling occurs under the following conditions:
When the VCS engine, HAD, is not running on the system.
When all resources within the service group are not probed on the system.
When a particular system is visible through disk heartbeat only.
Under these conditions, all service groups that include the system in their SystemList attribute are autodisabled. This does not apply to systems that are powered off.
Recommended action: Use the output of the command hagrp -display service_group to verify the value of the AutoDisabled attribute.
Warning:
To bring a group online manually after VCS has autodisabled the group, make sure that the group is not fully or partially active on any system that has the AutoDisabled attribute set to 1 by VCS. Specifically, verify that all resources that may be corrupted by being active on multiple systems are brought down on the designated systems. Then, clear the AutoDisabled attribute for each system:
C:\> hagrp -autoenable service_group -sys system
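When several systems are affected, the per-system commands can be generated rather than typed by hand. The following Python sketch only builds the command lines and does not run them. It assumes the AutoDisabled value for each system has already been collected from the hagrp -display output; the function name, group name, and system names are hypothetical.

```python
def autoenable_commands(service_group, auto_disabled):
    """Build one `hagrp -autoenable` command line per autodisabled system.

    auto_disabled maps system name -> value of the AutoDisabled attribute (0 or 1).
    """
    return [
        f"hagrp -autoenable {service_group} -sys {system}"
        for system, value in sorted(auto_disabled.items())
        if int(value) == 1
    ]


# Hypothetical group and system names, for illustration only.
for line in autoenable_commands("SG1", {"SYSA": 1, "SYSB": 0, "SYSC": 1}):
    print(line)
```

Run the generated commands only after confirming, as the warning above describes, that the group's resources are down on each listed system.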
Failover service group is online on another system.
The group is a failover group and is online or partially online on another system.
Recommended action: Use the output of the command hagrp -display service_group to verify the value of the State attribute. Use the command hagrp -offline to take the group offline on the other system.
Service group is waiting for the resource to be brought online/taken offline.
Recommended action: Review the IState attribute of all resources in the service group to locate which resource is waiting to go online (or which is waiting to be taken offline). Use the hastatus command to help identify the resource. See the engine and agent logs for information on why the resource is unable to be brought online or be taken offline.
To clear this state, make sure all resources waiting to go online/offline do not bring themselves online/offline. Use the command hagrp -flush to clear the internal state of VCS. You can now bring the service group online or take it offline on another system.
A critical resource faulted.
Output of the command hagrp -display service_group indicates that the service group has faulted.
Recommended action: Use the command hares -clear to clear the fault.
Service group is waiting for a dependency to be met.
Recommended action: To see which dependencies have not been met, type hagrp -dep service_group to view service group dependencies, or hares -dep resource to view resource dependencies.
Service group not fully probed.
This occurs if the agent processes have not monitored each resource in the service group. When the VCS engine, HAD, starts, it immediately "probes" to find the initial state of all resources. (It cannot probe if the agent is not returning a value.) A service group must be probed on all systems included in the SystemList attribute before VCS attempts to bring the group online as part of AutoStart. This ensures that even if the service group was online before VCS was brought up, VCS will not inadvertently bring the service group online on another system.
Recommended action: Use the output of hagrp -display service_group to see the value of the ProbesPending attribute for the service group on the system. (It should be zero.) To determine which resources are not probed, verify the local Probed attribute for each resource on the specified system. A value of 0 means waiting for probe result, 1 means probed, and 2 means VCS not booted. See the engine and agent logs for information.
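The Probed values above can be turned into a quick filter for resources that still need attention. This is a minimal Python sketch, assuming the Probed value of each resource on the system has already been collected; the function name and resource names are hypothetical.

```python
# Meanings of the Probed attribute values, as described in this topic.
PROBED_MEANING = {
    0: "waiting for probe result",
    1: "probed",
    2: "VCS not booted",
}


def unprobed_resources(probed_by_resource):
    """Return (resource, meaning) pairs for resources whose Probed value is not 1."""
    return [
        (name, PROBED_MEANING.get(int(value), "unknown value"))
        for name, value in sorted(probed_by_resource.items())
        if int(value) != 1
    ]


# Hypothetical resource names and values, for illustration only.
print(unprobed_resources({"App-IP": 1, "App-Mount": 0, "App-Svc": 2}))
```

An empty result corresponds to a ProbesPending value of zero for the group on that system.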