Veritas InfoScale™ 7.3.1 Troubleshooting Guide - Linux
If a unit service takes longer than the default timeout to stop or start the corresponding service, it goes into the Failed state
Sometimes, a unit service may take longer than the default timeout value to stop or to start the corresponding service. In this scenario, the service goes into the 'failed' state.
For example:
( root@localhost-vm1 )[ ~ ] # systemctl status vcs
vcs.service - VERITAS Cluster Server (VCS)
   Loaded: loaded (/opt/VRTSvcs/bin/vcs; enabled)
   Active: failed (Result: timeout) since Tue 2017-04-25 21:01:39 IST; 6s ago
  Process: 26546 ExecStart=/opt/VRTSvcs/bin/vcs start 2>&1 (code=exited, status=0/SUCCESS)

Apr 25 21:00:09 localhost systemd[1]: Stopping VERITAS Cluster Server (VCS)...
Apr 25 21:01:07 localhost AgentFramework[26625]: VCS ERROR V-16-20006-1005 CVMCluster:cvm_clus:monitor:node - st...ster reason: user initiated stop
Apr 25 21:01:07 localhost AgentFramework[26625]: VCS ERROR V-16-2-13066 Thread(140447847638784) Agent is calling...ted.
Apr 25 21:01:07 localhost Had[26588]: VCS ERROR V-16-2-13066 (localhost) Agent is calling clean for reso...leted.
Apr 25 21:01:39 localhost systemd[1]: vcs.service stopping timed out. Terminating.
Apr 25 21:01:39 localhost systemd[1]: Stopped VERITAS Cluster Server (VCS).
Apr 25 21:01:39 localhost systemd[1]: Unit vcs.service entered failed state.
Hint: Some lines were ellipsized, use -l to show in full.
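To detect this condition from a script rather than by reading the status output, one option is to query the unit's ActiveState and Result properties. The following is a minimal sketch, not a Veritas-provided tool: the function name is_timeout_failure is hypothetical, and it reads the output of systemctl show on standard input so that the logic can be exercised without a running service.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) only if the properties on
# stdin describe a unit that failed because of a timeout, that is,
# ActiveState=failed and Result=timeout.
is_timeout_failure() {
    awk -F= '
        $1 == "ActiveState" && $2 == "failed"  { failed = 1 }
        $1 == "Result"      && $2 == "timeout" { timed_out = 1 }
        END { exit !(failed && timed_out) }
    '
}

# Example usage on a live system (not run here):
#   systemctl show vcs -p ActiveState -p Result | is_timeout_failure \
#       && echo "vcs failed because of a timeout"
```

A unit that failed for another reason (for example, a non-zero exit code) reports a different Result value, so the helper returns a non-zero status for it.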
Recommended action
To work around this issue, add a custom timeout value (in seconds) to the unit service file. The TimeoutSec parameter lets you configure the amount of time that the system must wait before it reports that the start or stop operation of a service has failed.
The following example displays the parameter that is used to set a custom timeout in the unit service file:
( root@localhost-vm1 )[ ~ ] # vim /usr/lib/systemd/system/vcs.service
[Unit]
Description=VERITAS Cluster Server (VCS)
SourcePath=/opt/VRTSvcs/bin/vcs
...
...
[Service]
...
...
...
TimeoutSec=300
[Install]
...
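Before you continue, it can be useful to confirm that the value landed in the [Service] section and not elsewhere in the file. The following is a minimal sketch under that assumption; the function name unit_timeout_sec is hypothetical, and it only reads the file, so it does not reflect what systemd has currently loaded.

```shell
#!/bin/sh
# Hypothetical helper: print the last TimeoutSec value that is set in the
# [Service] section of the unit file given as $1. Prints nothing if the
# parameter is not set in that section.
unit_timeout_sec() {
    awk '
        # Track which section the current line belongs to.
        /^\[/ { in_service = ($0 == "[Service]") }
        in_service && /^TimeoutSec[ \t]*=/ {
            sub(/^TimeoutSec[ \t]*=[ \t]*/, "")
            value = $0
        }
        END { if (value != "") print value }
    ' "$1"
}

# Example usage (path taken from the example above):
#   unit_timeout_sec /usr/lib/systemd/system/vcs.service
```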
After you specify the custom timeout value in a unit service file, you must reload the systemd daemon so that the configuration is updated:
( root@localhost-vm1 )[ ~ ] # systemctl --system daemon-reload
Then, start or stop the unit service:
( root@localhost-vm1 )[ ~ ] # systemctl start vcs
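As a general systemd mechanism, the same parameter can also be set in a drop-in file under /etc/systemd/system/vcs.service.d/ instead of in the unit file under /usr/lib/systemd/system, which a package update may replace. This guide does not state whether Veritas supports that approach for vcs.service, so treat the following as a sketch under that assumption. The function name write_timeout_dropin and the file name timeout.conf are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write a drop-in file that sets TimeoutSec for a unit.
#   $1 = unit name (for example, vcs.service)
#   $2 = timeout in seconds
#   $3 = systemd configuration root (defaults to /etc/systemd/system)
write_timeout_dropin() {
    unit=$1
    seconds=$2
    root=${3:-/etc/systemd/system}
    dir="$root/$unit.d"
    mkdir -p "$dir" || return 1
    printf '[Service]\nTimeoutSec=%s\n' "$seconds" > "$dir/timeout.conf"
}

# Example usage on a live system (requires root; not run here):
#   write_timeout_dropin vcs.service 300
#   systemctl --system daemon-reload
```

As with a direct edit of the unit file, the systemd daemon must be reloaded before the new value takes effect.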