InfoScale™ 9.0 Cluster Server Agent Developer's Guide - AIX, Linux, Solaris, Windows

Last Published:
Product(s): InfoScale & Storage Foundation (9.0)
Platform: AIX,Linux,Solaris,Windows

State transitions

This section describes state transitions for:

  • Opening a resource

  • Resource in a steady state

  • Bringing a resource online

  • Taking a resource offline

  • Resource fault (without automatic restart)

  • Resource fault (with automatic restart)

  • Monitoring of persistent resources

  • Closing a resource

  • Migrating a resource

In addition, state transitions are shown for the handling of resources with respect to the ManageFaults service group attribute.

See State transitions with respect to ManageFaults attribute.

The states shown in these diagrams are associated with each resource by the agent framework. These states are used only within the agent framework and are independent of the IState resource attribute values indicated by the engine.

The agent writes resource state transition information into the agent log file when the LogDbg parameter, a static resource type attribute, is set to the value DBG_AGINFO. Agent developers can make use of this information when debugging agents.

Figure: Opening a resource


When the agent starts up, each resource starts with the initial state of Detached. In the Detached state (Enabled=0), the agent rejects all commands to bring a resource online or take it offline.

Figure: Resource in a steady state


When resources are in a steady state of Online or Offline, they are monitored at regular intervals. The intervals are specified by the MonitorInterval attribute in the Online state and by the OfflineMonitorInterval attribute in the Offline state. An Online resource that is unexpectedly detected as Offline is considered to be faulted. Refer to the diagrams that describe faulted resources.
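
As a rough illustration of the steady-state rule, the following Python sketch picks the monitoring interval from the resource state. The function name and the numeric values are hypothetical and chosen only for this example; they are not part of the agent framework.

```python
def steady_state_interval(state, monitor_interval, offline_monitor_interval):
    """Return the interval between monitor cycles for a steady-state resource.

    Hypothetical helper for illustration; not an agent framework API.
    """
    if state == "Online":
        return monitor_interval
    if state == "Offline":
        return offline_monitor_interval
    raise ValueError(f"{state!r} is not a steady state")


# Illustrative values only.
print(steady_state_interval("Online", 60, 300))   # 60
print(steady_state_interval("Offline", 60, 300))  # 300
```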

Figure: Bringing a resource online: ManageFaults=ALL


When the agent receives a request from the engine to bring the resource online, the resource enters the Going Online state, where the online entry point is invoked.

If the online entry point completes, the resource enters the Going Online Waiting state, where it waits for the next monitor cycle.

If the online entry point times out, the agent calls clean.

If the monitor in the GoingOnlineWaiting state returns a status of online, the resource moves to the Online state.

If the monitor in the GoingOnlineWaiting state returns a status of intentional offline, the resource moves to the Offline state.

If, however, the monitor times out, or returns a status of "not Online" (that is, unknown or offline), the agent acts as follows:

  • If OnlineWaitLimit is not reached, the resource returns to GoingOnlineWaiting and waits for the next monitor.

  • If OnlineWaitLimit and OnlineRetryLimit are reached and the status remains unknown, the resource returns to GoingOnlineWaiting and waits for the next monitor.

  • If OnlineWaitLimit and OnlineRetryLimit are reached and the status remains offline, the resource returns to the Offline state and is marked as faulted.

  • If OnlineWaitLimit is reached and OnlineRetryLimit is not reached, the agent runs clean, provided CleanRetryLimit is not reached.

  • If OnlineWaitLimit and CleanRetryLimit are reached and OnlineRetryLimit is not reached, the resource moves to GoingOnlineWaiting and is marked as ADMIN_WAIT.

  • If CleanRetryLimit is not reached and the agent calls clean, one of the following can happen:

    • If clean times out or fails, the resource again returns to the Going Online Waiting state and waits for the next monitor cycle.

    • If clean succeeds with the OnlineRetryLimit reached, and the subsequent monitor reports the status as offline, the resource transitions to the Offline state and is marked as FAULTED.
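
The decision made after each monitor in the GoingOnlineWaiting state can be sketched in Python as follows. This is a minimal sketch, not framework code: the function name, the string labels, and the boolean flags are hypothetical. The text above does not say how a monitor timeout is handled once both limits are reached, so the sketch assumes it is treated like an unknown status.

```python
def after_monitor_in_going_online_waiting(
    status, wait_limit_reached, retry_limit_reached, clean_limit_reached
):
    """Map a monitor result in GoingOnlineWaiting to the described outcome.

    status is "online", "intentional offline", "offline", "unknown",
    or "timeout". The flags say whether OnlineWaitLimit, OnlineRetryLimit,
    and CleanRetryLimit have been reached.
    Hypothetical helper for illustration; not an agent framework API.
    """
    if status == "online":
        return "Online"
    if status == "intentional offline":
        return "Offline"
    # From here on the monitor timed out or reported "not Online".
    if not wait_limit_reached:
        return "GoingOnlineWaiting"
    if retry_limit_reached:
        if status == "offline":
            return "Offline (FAULTED)"
        # Assumption: a timeout is handled like "unknown" in this branch.
        return "GoingOnlineWaiting"
    if not clean_limit_reached:
        return "run clean"
    return "GoingOnlineWaiting (ADMIN_WAIT)"
```

For example, an offline status with OnlineWaitLimit reached, OnlineRetryLimit not reached, and CleanRetryLimit not reached yields "run clean".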

Figure: Taking a resource offline and ManageFaults = ALL

Upon receiving a request from the engine to take a resource offline, the agent places the resource in the GoingOffline state, invokes the offline entry point, and stops periodic monitoring.

If offline completes, the resource enters the GoingOffline Waiting state; the agent starts periodic monitoring of the resource and also inserts a monitor command for the resource. If offline times out, the clean entry point is called for the resource. Whether clean times out or completes, the agent starts periodic monitoring and moves the resource to the Going Offline Waiting state; if clean succeeded, the agent also resets the Offline Wait Count.

If the monitor in the Going Offline Waiting state returns offline or intentional offline, the resource moves to the offline state.

If the monitor in the GoingOffline Waiting state returns unknown or online, or if the monitor times out, then:

  • If OfflineWaitLimit is not reached, the resource is moved to the GoingOffline Waiting state.

  • If OfflineWaitLimit is reached and clean was called earlier for the resource, the resource is marked as UNABLE_TO_OFFLINE.

  • If CleanRetryLimit is not reached, the agent calls clean.

  • If CleanRetryLimit is reached, the agent marks the resource as ADMIN_WAIT and moves it to the GoingOffline Waiting state.

  • If the user initiates the operation "-clearadminwait", the agent resets the ADMIN_WAIT flag. If the user initiates the operation "-clearadminwait -fault", the agent also resets the ADMIN_WAIT flag.

Figure: Resource fault when RestartLimit reached and ManageFaults = ALL

This diagram describes the activity that occurs when a resource faults and the RestartLimit is reached. A resource faults when the monitor entry point times out successively and FaultOnMonitorTimeouts is reached, or when monitor returns offline and the ToleranceLimit is reached.

If CleanRetryLimit is reached, the agent sets the ADMIN_WAIT flag for the resource and moves the resource to the online state. If it is not reached, the agent invokes the clean entry point.

If clean fails, or if it times out, the agent places the resource in the online state as if no fault has occurred and starts periodic monitoring. If clean succeeds, the agent places the resource in the Going Offline Waiting state and starts periodic monitoring; the agent then waits for the next monitor.

  • If monitor reports online, the resource is placed back online as if no fault occurred. If monitor reports offline, the resource is placed in an offline state and marked as FAULTED. If monitor reports intentional offline (IO), the resource is placed in an offline state.

  • If monitor reports unknown or times out, the agent places the resource back into the Going Offline Waiting state, and sets the UNABLE_TO_OFFLINE flag.

Note:

If clean succeeds, the agent moves the resource to GoingOfflineWaiting and the resource is marked as faulted. If the monitor in GoingOfflineWaiting returns online, the resource is moved to the online state, because the engine does not expect the resource to go offline: the GoingOfflineWaiting state was set by the agent as a result of the successful clean.

Figure: Resource fault when RestartLimit not reached and ManageFaults = ALL

This diagram describes the activity that occurs when a resource faults and the RestartLimit is not reached. When the monitor entry point times out successively and FaultOnMonitorTimeouts is reached, or monitor returns offline and the ToleranceLimit is reached, the agent checks the clean counter to determine whether the clean entry point can be invoked.

If CleanRetryLimit is reached, the agent sets the ADMIN_WAIT flag for the resource and moves the resource to the online state. If CleanRetryLimit is not reached, the agent invokes the clean entry point.

  • If clean succeeds, the resource is placed in the Going Online state and the online entry point is invoked to restart the resource; refer to the diagram, "Bringing a resource online."

  • If clean fails or times out, the agent places the resource in the Online state as if no fault occurred.

Refer to the diagram "Resource fault without automatic restart," for a discussion of activity when a resource faults and the RestartLimit is reached.
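
The two fault cases (RestartLimit reached and RestartLimit not reached) share the same first steps and differ only after a successful clean. The following Python sketch is one way to summarize that. The function name, the `run_clean` callable, and the returned labels are hypothetical and exist only for illustration.

```python
def handle_fault(clean_limit_reached, run_clean, restart_limit_reached):
    """Return the state a faulted resource moves to, per the text above.

    run_clean is a hypothetical callable that returns True when clean
    succeeds and False when it fails or times out.
    Not an agent framework API.
    """
    if clean_limit_reached:
        # CleanRetryLimit reached: clean is not invoked.
        return "Online (ADMIN_WAIT)"
    if not run_clean():
        # Clean failed or timed out: treated as if no fault occurred.
        return "Online"
    if restart_limit_reached:
        # No restart: wait for the next monitor to confirm the state.
        return "GoingOfflineWaiting"
    # Restart allowed: the online entry point is invoked again.
    return "GoingOnline"
```

Passing clean as a callable keeps the sketch faithful to the ordering in the text: clean runs only when CleanRetryLimit is not reached.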

Figure: Monitoring of persistent resources


If monitor returns offline and the ToleranceLimit is reached, the resource is placed in an Offline state and noted as FAULTED. If monitor times out and FaultOnMonitorTimeouts is reached, the resource is placed in an Offline state and noted as FAULTED.

Figure: Closing a resource


The state diagram shows all the states from which a resource can move to the Closing state. The following table describes the actions that move a resource from each of these states to the Closing state.

State                  Action
Online to Closing      hastop -local -force or hares -delete or Enabled = 0 only if resource is persistent resource
Offline to Closing     Enabled = 0 or hastop -local or hastop -local -force or hares -delete
GoingOnlineWaiting     hastop -local -force or hares -delete
GoingOfflineWaiting    hastop -local -force or hares -delete
GoingMigrateWaiting    hastop -local -force or hares -delete
GoingOnline            hastop -local -force
GoingOffline           hastop -local -force
GoingMigrate           hastop -local -force
Probing                Enabled = 0 or hastop -local or hastop -local -force or hares -delete
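
The table can also be read as a lookup from state to the actions that lead to Closing. The Python sketch below encodes it that way. The dictionary, the function name, and the `persistent` parameter are hypothetical. The sketch assumes that the "only if resource is persistent resource" condition applies to the Enabled = 0 action in the Online state.

```python
FORCE = "hastop -local -force"
STOP = "hastop -local"
DELETE = "hares -delete"
DISABLE = "Enabled = 0"

# State -> actions that move a resource in that state to Closing.
CLOSING_ACTIONS = {
    "Online": {FORCE, DELETE, DISABLE},
    "Offline": {DISABLE, STOP, FORCE, DELETE},
    "GoingOnlineWaiting": {FORCE, DELETE},
    "GoingOfflineWaiting": {FORCE, DELETE},
    "GoingMigrateWaiting": {FORCE, DELETE},
    "GoingOnline": {FORCE},
    "GoingOffline": {FORCE},
    "GoingMigrate": {FORCE},
    "Probing": {DISABLE, STOP, FORCE, DELETE},
}


def moves_to_closing(state, action, persistent=False):
    """Return True if the action moves a resource in this state to Closing.

    Hypothetical helper for illustration; not an agent framework API.
    """
    if action not in CLOSING_ACTIONS.get(state, set()):
        return False
    if state == "Online" and action == DISABLE:
        # Assumption: this action applies only to persistent resources.
        return persistent
    return True
```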

Figure: Migrating a resource


The migration process is initiated from the source system, where the virtual machine (VM) is online, and the VM is migrated to the target system, where it was offline. When the agent on the source system receives a request from the engine to migrate the resource, the resource goes to the Going Migrate state, where the migrate entry point is invoked. If the migrate entry point fails with return code 255, the resource transitions back to the online state and the failure of the migrate operation is communicated to the engine. This indicates that the migration operation cannot be performed.

The agent framework ignores any return value in the range 101 to 254 and returns the resource to the online state. If the migrate entry point completes successfully or times out, the resource enters the Going Migrate Waiting state, where it waits for the next monitor cycle; monitor is called at the frequency configured in MonitorInterval. If monitor returns an offline status, the resource moves to the offline state and the migration on the source system is considered complete.

Even after the resource moves to the offline state, the agent keeps monitoring it at the frequency configured in MonitorInterval. This is to detect early if the VM fails back to the source node. However, if the monitor entry point times out or reports the state as online or unknown, the resource waits for the MigrateWaitLimit monitor cycles to complete.

If any monitor within MigrateWaitLimit reports the state as offline, the resource transitions to the offline state and the same is reported to the engine. If the monitor entry point times out or reports the state as online or unknown even after MigrateWaitLimit is reached, the ADMIN_WAIT flag is set.

If the resource migration operation is successful on the source node, then on the target node the agent changes the monitoring frequency from OfflineMonitorInterval to MonitorInterval to detect a successful migration early. If the resource is not detected as online on the target node even after MigrateWaitLimit is reached, the resource is moved to the ADMIN_WAIT state and the agent falls back to the monitor frequency configured in OfflineMonitorInterval.

Note:

The agent does not call clean if the migrate entry point times out, or if the monitor after the migrate entry point times out or reports the state as online or unknown even after MigrateWaitLimit is reached. You need to manually clear the ADMIN_WAIT flag after resolving the issue.
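
The source-side handling of the migrate entry point and the monitor cycles that follow can be sketched in Python as below. Both function names and the returned labels are hypothetical. This section only names return code 255 and the range 101 to 254, so the sketch assumes that any other return code counts as successful completion.

```python
def after_migrate_entry_point(exit_code, timed_out=False):
    """Return the next state after the migrate entry point on the source.

    Hypothetical helper for illustration; not an agent framework API.
    """
    if timed_out:
        return "GoingMigrateWaiting"
    if exit_code == 255:
        return "Online (migrate failure reported to engine)"
    if 101 <= exit_code <= 254:
        # Ignored by the agent framework; the resource stays online.
        return "Online"
    # Assumption: every other return code means successful completion.
    return "GoingMigrateWaiting"


def after_monitor_in_going_migrate_waiting(status, migrate_wait_limit_reached):
    """Return the outcome of a monitor cycle in GoingMigrateWaiting.

    status is "offline", "online", "unknown", or "timeout".
    """
    if status == "offline":
        return "Offline"
    if not migrate_wait_limit_reached:
        return "GoingMigrateWaiting"
    # Clean is not called; the flag must be cleared manually.
    return "ADMIN_WAIT flag set"
```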

Figure: Resource fault: ManageFaults attribute = ALL


Figure: Resource fault (monitor hung): ManageFaults attribute = ALL
