Veritas NetBackup™ Flex Scale Administrator's Guide
Replacement procedure for a single OS disk
This topic describes the process of replacing a single OS disk that failed or is unreachable. Each node has two OS disks.
The following section describes how to identify a single OS disk failure from NetBackup Flex Scale:
An alert is generated for an OS disk failure or for an unreachable disk. To view the alert, do one of the following from the NetBackup Flex Scale infrastructure management UI:
- From the left pane, open the page that contains the Alerts area, and then click the link to see a complete list of alerts.
- At the top of any screen, click the alerts icon, and then click the link to view all alerts. On the Alerts management page, use the filters to locate specific types of alerts.
If SMTP is configured for AutoSupport, you receive email alerts. If Call Home is configured for your setup, diagnostic information is sent to the AutoSupport server.
In the NetBackup Flex Scale UI, select the node on which the OS disk failed, and then view its disks. The UI shows the failure for the corresponding OS disk.
The following section describes how to identify an OS disk failure from third-party tools:
The HPE Integrated Lights-Out (iLO) remote console shows a failure. In iLO, the Health of the OS disk is shown as Critical and the Health of the RAID 1 volume is shown as Warning.
The node is shown as unhealthy in the NetBackup Flex Scale UI. To view the node health, navigate to Monitor > Infrastructure > Nodes.
An HPE representative identifies the faulty disk and its physical location in the appliance, and then replaces the faulty OS disk. You can use the AHS logs to find the required details, and then replace the disk.
Note:
With NetBackup Flex Scale version 3.1, you can beacon the disk from the UI.
After you get the physical location of the disk on the appliance, replace the failed OS disk with a new OS disk. Note the model number of the new disk and ensure that it matches the model number of the old disk.
To replace the disk, the HPE representative completes the following steps:
- Check the disk model number from the iLO remote console.
- Identify the corresponding location of the OS disks in the appliance. In this example, the OS disks are in Box 6, Bay 1 and Bay 2.
- Refer to the HPE procedure to replace the disk.
- In iLO, after the OS disk is replaced, the Health of the OS disk is set to OK, but the Health of the RAID 1 volume remains at Warning until the rebuild completes.
After the hardware vendor notifies you that the hardware component is replaced, verify that the issue is resolved.
To verify that the issue is resolved, Veritas TSE completes the following steps:
- Wait until the RAID controller rebuilds the new OS disk. This operation takes approximately two hours. To check the rebuild progress, run the following command after elevating to root access:

  nbfs3.1> support elevate
  # ssacli ctrl all show config

  HPE Smart Array P816i-a SR Gen10 in Slot 0 (Embedded) (sn: PWXLA0BRHDW07G)

     Internal Drive Cage at Port 1I, Box 2, OK
     Internal Drive Cage at Port 2I, Box 3, OK
     Internal Drive Cage at Port 3I, Box 6, OK
     Internal Drive Cage at Port 4I, Box 7, OK

     Port Name: 1I (Mixed)
     Port Name: 2I (Mixed)
     Port Name: 3I (Mixed)
     Port Name: 4I (Mixed)

     Array A (Solid State SATA, Unused Space: 0 MB)

        logicaldrive 1 (1.75 TB, RAID 1, Recovering, 4.13% complete)

        physicaldrive 3I:6:1 (port 3I:box 6:bay 1, SATA SSD, 1.9 TB, Rebuilding)
        physicaldrive 3I:6:2 (port 3I:box 6:bay 2, SATA SSD, 1.9 TB, OK)
- After the rebuild completes successfully, verify that all the AutoSupport alerts are resolved and the node state shows healthy in the NetBackup Flex Scale UI. To verify, navigate to Monitor > Infrastructure > Nodes.
- In iLO, verify that the Health of the RAID 1 volume is set to OK.
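If you save the output of `ssacli ctrl all show config` to a file while you wait for the rebuild, a small script can extract the rebuild status for you. The following is a minimal sketch and not part of the product. The function names `parse_config` and `rebuild_finished` are hypothetical, the sample text is copied from the output shown in the verification step, and the sketch assumes that a fully rebuilt logical drive reports `OK` as its last field, in the same way that the healthy physical drive does. It only parses text that you supply; it does not run `ssacli` itself.

```python
import re

# Sample lines copied from the `ssacli ctrl all show config` output in this topic.
SAMPLE = """\
Array A (Solid State SATA, Unused Space: 0 MB)
   logicaldrive 1 (1.75 TB, RAID 1, Recovering, 4.13% complete)
   physicaldrive 3I:6:1 (port 3I:box 6:bay 1, SATA SSD, 1.9 TB, Rebuilding)
   physicaldrive 3I:6:2 (port 3I:box 6:bay 2, SATA SSD, 1.9 TB, OK)
"""

LOGICAL = re.compile(r"logicaldrive\s+(\S+)\s+\((.*)\)")
PHYSICAL = re.compile(r"physicaldrive\s+(\S+)\s+\((.*)\)")
PERCENT = re.compile(r"([\d.]+)% complete")


def parse_config(text):
    """Return (logical, physical) lists of dicts parsed from the output text."""
    logical, physical = [], []
    for raw in text.splitlines():
        line = raw.strip()
        match = LOGICAL.match(line)
        if match:
            fields = [f.strip() for f in match.group(2).split(",")]
            status, percent = fields[-1], None
            progress = PERCENT.fullmatch(status)
            if progress:
                # While recovering, the last field is the progress and the
                # status is the field before it.
                percent = float(progress.group(1))
                status = fields[-2]
            logical.append({"id": match.group(1), "status": status, "percent": percent})
            continue
        match = PHYSICAL.match(line)
        if match:
            fields = [f.strip() for f in match.group(2).split(",")]
            physical.append({"id": match.group(1), "status": fields[-1]})
    return logical, physical


def rebuild_finished(text):
    """True only if at least one logical drive was found and every drive is OK.

    Assumption: a rebuilt RAID 1 volume reports OK as its final field.
    """
    logical, physical = parse_config(text)
    return bool(logical) and all(d["status"] == "OK" for d in logical + physical)


if __name__ == "__main__":
    drives, disks = parse_config(SAMPLE)
    for drive in drives:
        print("logicaldrive", drive["id"], drive["status"], drive["percent"])
    for disk in disks:
        print("physicaldrive", disk["id"], disk["status"])
    print("rebuild finished:", rebuild_finished(SAMPLE))
```

Treat the result as a convenience check only. The node health in the NetBackup Flex Scale UI and the volume Health in iLO remain the authoritative confirmation that the replacement succeeded.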