Veritas InfoScale™ 7.3.1 Troubleshooting Guide - Solaris
Replacing defective disks when the cluster is offline
If a disk in the coordinator disk group becomes defective or inoperable and you want to switch to a new disk group while the cluster is offline, perform the following procedure.
In a cluster that is online, you can replace the disks using the vxfenswap utility.
Review the following information to replace a coordinator disk in the coordinator disk group, or to destroy a coordinator disk group.
Note the following about the procedure:
When you add a disk, add the disk to the disk group vxfencoorddg and retest the group for support of SCSI-3 persistent reservations.
You can destroy the coordinator disk group such that no registration keys remain on the disks. The disks can then be used elsewhere.
To replace a disk in the coordinator disk group when the cluster is offline
- Log in as superuser on one of the cluster nodes.
- If VCS is running, shut it down:
# hastop -all
Make sure that port h is closed on all the nodes. Run the following command on each node and verify that port h does not appear in the output:
# gabconfig -a
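The port check can be scripted rather than read by eye. The following is a minimal sketch, assuming that `gabconfig -a` prints one line per open port that begins with `Port <letter>`; the function name `gab_port_open` is hypothetical and is not part of the product.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given GAB port letter appears in
# `gabconfig -a` style output supplied on standard input.
# Assumption: each open port is listed on a line starting with "Port <letter> ".
gab_port_open() {
    grep -q "^Port $1 "
}

# Usage on a cluster node (run as superuser):
#   if gabconfig -a | gab_port_open h; then
#       echo "port h is still open; VCS has not stopped" >&2
#   fi
```

The helper only reads text, so it can be tried against saved command output before it is used on a live node.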
- Stop the VCSMM driver on each node:
# svcadm disable -t vcsmm
- Stop I/O fencing on each node:
# svcadm disable -t vxfen
This removes any registration keys on the disks.
- Import the coordinator disk group. The file /etc/vxfendg includes the name of the disk group (typically, vxfencoorddg) that contains the coordinator disks, so use the command:
# vxdg -tfC import `cat /etc/vxfendg`
where:
-t specifies that the disk group is imported only until the node restarts.
-f specifies that the import is to be done forcibly, which is necessary if one or more disks are not accessible.
-C specifies that any import locks are removed.
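Because the import relies on command substitution, an empty or missing /etc/vxfendg would pass an empty disk group name to vxdg. The following is a minimal sketch of a guard, assuming the file holds the disk group name as the first word of its first non-empty line; the function name `coordinator_dg_name` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the coordinator disk group name stored in a
# file (default: /etc/vxfendg). Fails if the file is unreadable or empty.
# Assumption: the name is the first word of the first non-empty line.
coordinator_dg_name() {
    file=${1:-/etc/vxfendg}
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
    name=$(awk 'NF { print $1; exit }' "$file")
    [ -n "$name" ] || { echo "$file is empty" >&2; return 1; }
    printf '%s\n' "$name"
}

# Usage (run as superuser on one node):
#   dg=$(coordinator_dg_name) && vxdg -tfC import "$dg"
```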
- To remove disks from the disk group, use the VxVM disk administrator utility, vxdiskadm.
You may also destroy the existing coordinator disk group. For example:
Verify whether the coordinator attribute is set to on.
# vxdg list vxfencoorddg | grep flags: | grep coordinator
Destroy the coordinator disk group.
# vxdg -o coordinator destroy vxfencoorddg
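The flag check and the destroy command can be tied together so that the destroy runs only when the check succeeds. The following is a minimal sketch, assuming that `vxdg list <diskgroup>` prints a `flags:` line that contains the word `coordinator` when the attribute is on (the same assumption the grep pipeline above makes); the function name `has_coordinator_flag` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if `vxdg list <dg>` style output on standard
# input has a "flags:" line that mentions "coordinator".
has_coordinator_flag() {
    awk '/flags:/ && /coordinator/ { found = 1 } END { exit !found }'
}

# Usage (run as superuser; destroys the disk group, so use with care):
#   dg=vxfencoorddg
#   if vxdg list "$dg" | has_coordinator_flag; then
#       vxdg -o coordinator destroy "$dg"
#   else
#       echo "coordinator attribute is not set on $dg" >&2
#   fi
```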
- Add the new disk to the node and initialize it as a VxVM disk.
Then, add the new disk to the vxfencoorddg disk group:
If you destroyed the disk group in the previous step, create the disk group again and add the new disk to it.
If the disk group already exists, add the new disk to it:
# vxdg -g vxfencoorddg -o coordinator adddisk disk_name
- Test the recreated disk group for SCSI-3 persistent reservations compliance.
- After replacing disks in a coordinator disk group, deport the disk group:
# vxdg deport `cat /etc/vxfendg`
- On each node, start the I/O fencing driver:
# svcadm enable vxfen
- On each node, start the VCSMM driver:
# svcadm enable vcsmm
- Verify that the I/O fencing module has started and is enabled.
# gabconfig -a
Make sure that port b membership exists in the output for all nodes in the cluster. If you started the VCSMM driver, make sure that port o membership also exists.
# vxfenadm -d
Make sure that I/O fencing mode is not disabled in the output.
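The fencing mode check can also be scripted. The following is a minimal sketch, assuming that `vxfenadm -d` prints a line of the form `Fencing Mode: <value>` and reports `Disabled` when fencing is off; the function name `fencing_enabled` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if `vxfenadm -d` style output on standard
# input reports a fencing mode other than "Disabled".
# Assumption: the mode appears on a line such as " Fencing Mode: SCSI3".
fencing_enabled() {
    awk -F': *' '
        /Fencing Mode:/ {
            seen = 1
            if (tolower($2) ~ /disabled/) bad = 1
        }
        END {
            if (seen && !bad) exit 0
            exit 1
        }
    '
}

# Usage on each node (run as superuser):
#   vxfenadm -d | fencing_enabled || echo "I/O fencing is not enabled" >&2
```

The helper fails when no `Fencing Mode:` line is present at all, so output in an unexpected format is treated as a failed check rather than a pass.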
- If necessary, restart VCS on each node:
# hastart