
NetBackup™ Deployment Guide for Kubernetes Clusters

Last Published: 2024-06-17

Product(s): NetBackup & Alta Data Protection (10.4.0.1)

MSDP-X and Primary server corrupted

Note the storage server, cloud LSU and cloud bucket name.
Note the DR Passphrase also.
If the DR package files (DRPackages) were not received by email, copy them from the pod to the local VM using the following command:
kubectl cp <primary-pod-namespace>/<primary-pod-name>:/mnt/nbdb/usr/openv/drpackage_<storageservername> <Path_where_to_copy_on_host_machine>
Delete the corrupted MSDP and Primary server by running the following command:
kubectl delete -f environment.yaml -n <namespace>
Note:
Perform this step carefully, because it deletes the NetBackup deployment.
Clean the PV and PVCs of primary and MSDP server as follows:
- Get the names of the PVs attached to the primary and MSDP server PVCs (catalog, log and data) using the kubectl get pvc -n <namespace> -o wide command.
- Delete the primary and MSDP server PVCs (catalog, log and data) using the kubectl delete pvc <pvc-name> -n <namespace> command.
- Delete the PVs linked to the primary server PVCs using the kubectl delete pv <pv-name> command.
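The PV names are needed after the PVCs are gone, so it can help to record the PVC-to-PV mapping before deleting anything. The following is a minimal sketch, assuming the default kubectl get pvc table layout, in which VOLUME is the third column. The function name list_pvc_pv_pairs and the NAMESPACE variable are hypothetical, and only Bound claims are listed.

```shell
# Read `kubectl get pvc` table output on stdin and print "<pvc-name> <pv-name>"
# for every Bound claim. Assumes VOLUME is the third column (default layout).
list_pvc_pv_pairs() {
  awk 'NR > 1 && $2 == "Bound" { print $1, $3 }'
}

# Example (not run here): save the mapping before deleting anything.
# kubectl get pvc -n "$NAMESPACE" -o wide | list_pvc_pv_pairs > pvc-pv-map.txt
```

The saved file can then be used as a checklist for the kubectl delete pvc and kubectl delete pv commands.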
(EKS-specific) Navigate to mounted EFS directory and delete the content from primary_catalog folder by running the rm -rf /efs/* command.
Modify the environment.yaml file: in the MSDP Scaleout and media server sections, change the CR spec from paused: false to paused: true.
Save the file.
Note:
Ensure that only the primary server is deployed when the modified environment.yaml file is applied.
Apply the environment.yaml file using the following command:
kubectl apply -f environment.yaml -n <namespace>
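Before running the apply command, it can be worth confirming which paused values the file actually contains. The following is a minimal sketch that assumes nothing about the layout of environment.yaml beyond the presence of paused: lines. The function name show_paused_fields is hypothetical.

```shell
# Print every line of the given file that sets a paused field, with line numbers.
show_paused_fields() {
  grep -n 'paused:' "$1"
}

# Example (not run here):
# show_paused_fields environment.yaml
```

The MSDP Scaleout and media server entries are expected to show paused: true at this stage.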
After the primary server is up and running, perform the following:
- Open a shell in the primary server pod using the kubectl exec -it -n <namespace> <primary-pod-name> -- /bin/bash command.
- Increase the debug logs level on primary server.
- Create a DRPackages directory at the persisted location using mkdir /mnt/nblogs/DRPackages.
Copy the DR files that were copied earlier to the primary pod at /mnt/nblogs/DRPackages using the kubectl cp <Path_of_DRPackages_on_host_machine> <primary-pod-namespace>/<primary-pod-name>:/mnt/nblogs/DRPackages command.
Execute the following steps inside the primary server pod (after exec):
- Change ownership of files in /mnt/nblogs/DRPackages using the chown nbsvcusr:nbsvcusr <file-name> command.
- Deactivate NetBackup health probes using the /opt/veritas/vxapp-manage/nb-health deactivate command.
- Stop the NetBackup services using the /usr/openv/netbackup/bin/bp.kill_all command.
- Execute the /usr/openv/netbackup/bin/admincmd/nbhostidentity -import -infile /mnt/nblogs/DRPackages/<filename>.drpkg command.
- Clear the NetBackup host cache by running the bpclntcmd -clear_host_cache command.
- Start NetBackup services using the /usr/openv/netbackup/bin/bp.start_all command.
- Refresh the certificate revocation list using the /usr/openv/netbackup/bin/nbcertcmd -getcrl command.
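The in-pod sequence above can be sketched as one shell function. This is an illustration under stated assumptions, not a supported script: the names run_step, import_identity and DRY_RUN are hypothetical, and the full path used for bpclntcmd is an assumption, because the steps give only the command name. Some of these commands may prompt for input (for example, the DR passphrase noted earlier), so the sketch is intended for an interactive shell inside the primary pod.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Run the identity import steps in order for one DR package file.
# $1 is the path to the .drpkg file inside the pod.
# The && chain stops the sequence at the first failing command.
import_identity() {
  drpkg="$1"
  run_step chown nbsvcusr:nbsvcusr "$drpkg" &&
  run_step /opt/veritas/vxapp-manage/nb-health deactivate &&
  run_step /usr/openv/netbackup/bin/bp.kill_all &&
  run_step /usr/openv/netbackup/bin/admincmd/nbhostidentity -import -infile "$drpkg" &&
  # Assumed location of bpclntcmd; the steps above name only the command.
  run_step /usr/openv/netbackup/bin/bpclntcmd -clear_host_cache &&
  run_step /usr/openv/netbackup/bin/bp.start_all &&
  run_step /usr/openv/netbackup/bin/nbcertcmd -getcrl
}
```

Calling the function with DRY_RUN=1 set prints the seven commands without running them, which gives a chance to review the order and the package path first.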
Run the primary server reconciler as follows:
- Edit the environment (using the kubectl edit environment -n <namespace> command), set the paused field in the primary spec to true, and save it.
- Edit the environment again and set the primary's paused field back to false so that the reconciler runs.
The SHA fingerprint is updated in the primary CR's status.
From the Web UI, allow the primary server to reissue tokens for the MSDP, media and Snapshot Manager servers as follows:
Navigate to Security > Host Mappings for the MSDP storage server and select Allow Auto reissue Certificate.
Repeat this for media and Snapshot Manager server entries.
Edit the environment using the kubectl edit environment -n <namespace> command and change the paused field to false for MSDP.
Perform the procedure starting from step 2 in the following section:
“Scenario 2: MSDP Scaleout and its data is lost and the NetBackup primary server was destroyed and is re-installed”
Edit the environment CR and set paused: false for the media server.
Once media server pods are ready, perform full catalog recovery using one of the following options:
Trigger a catalog recovery from the Web UI.
Or
Exec into the primary pod and run the bprecover -wizard command.
Once recovery is completed, restart the NetBackup services:
Stop NetBackup services using the /usr/openv/netbackup/bin/bp.kill_all command.
Start NetBackup services using the /usr/openv/netbackup/bin/bp.start_all command.
Activate NetBackup health probes using the /opt/veritas/vxapp-manage/nb-health activate command.
Verify the backup images in the NetBackup server, and run a backup and a restore, to check whether the MSDP-X cluster has recovered.
Verify that the Primary, Media, MSDP and Snapshot Manager servers are up and running.
Verify that the Snapshot Manager is running.
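The final checks can be partly scripted by filtering the pod list for anything that is not running. The following is a minimal sketch, assuming the default kubectl get pods table layout, in which STATUS is the third column. The function name list_not_running_pods and the NAMESPACE variable are hypothetical.

```shell
# Read `kubectl get pods` table output on stdin and print the name of every
# pod whose STATUS column (third column) is not "Running".
list_not_running_pods() {
  awk 'NR > 1 && $3 != "Running" { print $1 }'
}

# Example (not run here): empty output means every listed pod is Running.
# kubectl get pods -n "$NAMESPACE" | list_not_running_pods
```

A Running status does not by itself confirm readiness, so the READY column still needs to be checked.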