NetBackup™ Deployment Guide for Kubernetes Clusters
Procedure to roll back when upgrade fails
Note:
The rollback procedure in this section can be performed only if a catalog backup was taken before the upgrade.
Perform the following steps to roll back from an upgrade failure and install the NetBackup version that was in use before the upgrade:
- Delete the environment CR object using the following command and wait until all the underlying resources are cleaned up:
kubectl delete environment.netbackup.veritas.com <environment name> -n <namespace>
The underlying resources include, for example, the primary server CR, media server CR, MSDP CR, and the resources that they own.
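The wait in this step can be scripted. The following is a minimal sketch, not part of the product tooling: the function name wait_env_deleted and the 600s default timeout are illustrative assumptions. It relies on kubectl wait --for=delete and then lists what is left in the namespace for manual inspection. Depending on the kubectl version, waiting on an object that is already gone may report an error, so the sketch prints the listing in either case.

```shell
# Sketch: wait for the environment CR to disappear, then show what remains.
# wait_env_deleted and the default timeout are hypothetical, not NetBackup tooling.
wait_env_deleted() {
  local env_name="$1" namespace="$2" timeout="${3:-600s}"
  kubectl wait --for=delete "environment.netbackup.veritas.com/${env_name}" \
    -n "${namespace}" --timeout="${timeout}" \
    || echo "wait did not confirm deletion; inspect the listing below" >&2
  # List the remaining workload resources so leftovers are visible.
  kubectl get all -n "${namespace}"
}

# Usage: wait_env_deleted <environment name> <namespace>
```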
- Delete the new operator that was deployed during the upgrade using the following command:
kubectl delete -k <new-operator-directory>
This deletes the new operator and the new CRDs.
- Apply the preserved NetBackup operator directory (the directory that was used to install the operator before the upgrade) using the following command:
kubectl apply -k <operator_directory>
- Get the names of the PVs attached to the primary server PVCs (data, catalog, and log) using the following command:
kubectl get pvc -n <namespace> -o wide
- Delete the primary server PVCs (data, catalog, and log) using the following command:
kubectl delete pvc <pvc-name> -n <namespace>
- Delete the PVs linked to the primary server PVCs using the following command:
kubectl delete pv <pv-name>
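Because the PV names are no longer shown against the PVCs once the PVCs are deleted, it can help to record the mapping first. The following sketch assumes nothing beyond standard kubectl jsonpath output. The function names list_pvc_volumes and pv_for_pvc and the file name pvc-to-pv.txt are hypothetical.

```shell
# Sketch: print "PVC<TAB>PV" for every PVC in a namespace.
list_pvc_volumes() {
  local namespace="$1"
  kubectl get pvc -n "${namespace}" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.volumeName}{"\n"}{end}'
}

# Sketch: read "PVC<TAB>PV" lines on stdin and print the PV bound to the PVC named in $1.
pv_for_pvc() {
  awk -F'\t' -v pvc="$1" '$1 == pvc { print $2 }'
}

# Usage:
#   list_pvc_volumes <namespace> > pvc-to-pv.txt
#   pv_for_pvc <pvc-name> < pvc-to-pv.txt
```

Saving the mapping to a file before deleting anything keeps the PV names available for the later delete step.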
- Edit the preserved environment.yaml file (from the package directory of the older NetBackup version) and remove the keySecret section from the MSDP Scaleout section. Also change the CR spec paused: false to paused: true for every object in the MSDP Scaleout and media servers sections.
- Apply the edited environment.yaml file using the following command:
kubectl apply -f <environment.yaml>
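Before applying the file, the two manual edits can be sanity-checked with plain text searches. This is only a sketch: check_env_yaml is a hypothetical helper, and it does not parse YAML, so the printed paused lines still need to be reviewed by eye to confirm that they belong to the MSDP Scaleout and media server objects.

```shell
# Sketch: fail if keySecret is still present, then show every paused setting.
check_env_yaml() {
  local file="$1"
  if grep -q 'keySecret' "${file}"; then
    echo "keySecret is still present in ${file}" >&2
    return 1
  fi
  # Print each paused line with its line number for manual review.
  grep -n 'paused:' "${file}" || true
}

# Usage: check_env_yaml environment.yaml
```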
- After the primary server pod is in the ready state (1/1), change the CR spec from paused: false to paused: true in the environment object using the following command:
kubectl edit environment.netbackup.veritas.com <environment_CR_name> -n <namespace>
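The readiness check that precedes this edit can be sketched with kubectl wait rather than repeated kubectl get pods calls. The function name wait_pod_ready and the 900s default timeout are assumptions chosen for illustration.

```shell
# Sketch: block until the named pod reports the Ready condition.
wait_pod_ready() {
  local pod="$1" namespace="$2" timeout="${3:-900s}"
  kubectl wait --for=condition=Ready "pod/${pod}" \
    -n "${namespace}" --timeout="${timeout}"
}

# Usage: wait_pod_ready <primary-pod-name> <namespace>
```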
- Exec into the primary server pod using the following command:
kubectl exec -it -n <PrimaryServer/MediaServer-CR-namespace> <primary-pod-name> -- /bin/bash
Increase the debug log level on the primary server.
Create a DRPackages directory at the persisted location using the following command:
mkdir /mnt/nblogs/DRPackages
Change ownership of the DRPackages folder to the service user using the following command:
chown nbsvcusr:nbsvcusr /mnt/nblogs/DRPackages
- Copy the DR files that were saved earlier to the primary pod at /mnt/nblogs/DRPackages using the following command:
kubectl cp <Path_of_DRPackages_on_host_machine> <primary-pod-namespace>/<primary-pod-name>:/mnt/nblogs/DRPackages
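The copy can be followed by a listing to confirm that the files arrived. A minimal sketch, assuming only the kubectl cp and kubectl exec commands already used in this procedure. The function name copy_dr_packages is hypothetical.

```shell
# Sketch: copy the DR files into the pod, then list the target directory.
copy_dr_packages() {
  local src="$1" namespace="$2" pod="$3"
  kubectl cp "${src}" "${namespace}/${pod}:/mnt/nblogs/DRPackages" || return 1
  # Show the copied files together with their current owner.
  kubectl exec -n "${namespace}" "${pod}" -- ls -l /mnt/nblogs/DRPackages
}

# Usage: copy_dr_packages <Path_of_DRPackages_on_host_machine> <primary-pod-namespace> <primary-pod-name>
```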
- Execute the following steps in the primary server pod:
Change ownership of the files in /mnt/nblogs/DRPackages using the following command:
chown nbsvcusr:nbsvcusr <filename>
Deactivate NetBackup health probes using the following command:
/opt/veritas/vxapp-manage/nbu-health deactivate
Stop the NetBackup services using the following command:
/usr/openv/netbackup/bin/bp.kill_all
Import the host identity from the DR package using the following command:
nbhostidentity -import -infile /mnt/nblogs/DRPackages/<filename>.drpkg
Restart all the NetBackup services using the following command:
/usr/openv/netbackup/bin/bp.start_all
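The in-pod sequence above can be collected into one function so that the order is preserved and a failure stops the run. This is a sketch under stated assumptions: the helper names run and import_host_identity and the DRY_RUN switch are hypothetical, chown -R on the directory stands in for the per-file chown, and some of the NetBackup scripts may prompt for confirmation when run interactively.

```shell
# Sketch: run inside the primary server pod. Set DRY_RUN=1 to print the
# commands without executing them.
DR_DIR=/mnt/nblogs/DRPackages

run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

import_host_identity() {
  local drpkg="$1"   # file name of the .drpkg inside DR_DIR
  run chown -R nbsvcusr:nbsvcusr "${DR_DIR}" || return 1
  run /opt/veritas/vxapp-manage/nbu-health deactivate || return 1
  run /usr/openv/netbackup/bin/bp.kill_all || return 1
  run nbhostidentity -import -infile "${DR_DIR}/${drpkg}" || return 1
  run /usr/openv/netbackup/bin/bp.start_all
}

# Usage: DRY_RUN=1 import_host_identity <filename>.drpkg
```

Running once with DRY_RUN=1 shows the exact commands before anything is stopped.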
- Verify that the security settings are enabled.
- Add the respective media server entry in the host properties using the NetBackup Administration Console as follows:
Navigate to NetBackup Management > Host properties > Master Server > Add Additional server and add the media server.
- Restart the NetBackup services in the primary server pod and on the external media server as follows:
Exec into the primary server pod using the following command:
kubectl exec -it -n <PrimaryServer/MediaServer-CR-namespace> <primary-pod-name> -- /bin/bash
Run the following command to stop all the services:
/usr/openv/netbackup/bin/bp.kill_all
After stopping all the services, restart the services using the following command:
/usr/openv/netbackup/bin/bp.start_all
On the external media server, run the following command to stop all the NetBackup services:
/usr/openv/netbackup/bin/bp.kill_all
After stopping all the services, restart the NetBackup services using the following command:
/usr/openv/netbackup/bin/bp.start_all
- Configure a storage unit on the external media server that is used during catalog backup.
- Perform catalog recovery from the NetBackup Administration Console.
For more information, refer to the Veritas™ NetBackup Troubleshooting Guide.
- Exec into the primary server pod using the following command:
kubectl exec -it -n <PrimaryServer/MediaServer-CR-namespace> <primary-pod-name> -- /bin/bash
Stop the NetBackup services using the following command:
/usr/openv/netbackup/bin/bp.kill_all
Start the NetBackup services using the following command:
/usr/openv/netbackup/bin/bp.start_all
Activate NetBackup health probes using the following command:
/opt/veritas/vxapp-manage/nbu-health activate
- Restart the NetBackup operator pod by deleting it using the following command:
kubectl delete pod <operator-pod-name> -n <namespace>
Kubernetes starts a new pod after the deletion.
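To confirm that a replacement pod was created, the deletion can be followed by a pod listing. A minimal sketch with the hypothetical function name restart_operator_pod. The namespace is whichever one the operator runs in.

```shell
# Sketch: delete the operator pod, then list pods to observe the replacement.
restart_operator_pod() {
  local pod="$1" namespace="$2"
  kubectl delete pod "${pod}" -n "${namespace}" || return 1
  kubectl get pods -n "${namespace}"
}

# Usage: restart_operator_pod <operator-pod-name> <namespace>
```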
- Resume the reconciler for the primary server, MSDP Scaleouts, and media servers in the following sequence:
Change the CR spec paused: true to paused: false in the primary section of the environment object using the following command:
kubectl edit environment.netbackup.veritas.com <environment_CR_name> -n <namespace>
Wait until the primary server is in the ready state.
Change the CR spec paused: true to paused: false in the msdpscaleouts section of the environment object using the following command:
kubectl edit environment.netbackup.veritas.com <environment_CR_name> -n <namespace>
Wait until the primary server is in the ready state.
Change the CR spec paused: true to paused: false in the media servers section of the environment object using the following command:
kubectl edit environment.netbackup.veritas.com <environment_CR_name> -n <namespace>
Wait until the primary server is in the ready state.
- Verify that the rollback is successful by running backup and recovery jobs.