NetBackup™ Deployment Guide for Kubernetes Clusters
- Introduction
- Section I. Configurations
- Prerequisites
- Recommendations and Limitations
- Configurations
- Configuration of key parameters in Cloud Scale deployments
- Section II. Deployment
- Section III. Monitoring and Management
- Monitoring NetBackup
- Monitoring Snapshot Manager
- Monitoring MSDP Scaleout
- Managing NetBackup
- Managing the Load Balancer service
- Managing PostgreSQL DBaaS
- Performing catalog backup and recovery
- Managing MSDP Scaleout
- Section IV. Maintenance
- MSDP Scaleout Maintenance
- PostgreSQL DBaaS Maintenance
- Patching mechanism for Primary and Media servers
- Upgrading
- Cloud Scale Disaster Recovery
- Uninstalling
- Troubleshooting
- Troubleshooting AKS and EKS issues
- Troubleshooting AKS-specific issues
- Troubleshooting EKS-specific issues
- Appendix A. CR template
Cluster specific settings
It is recommended to create a private Kubernetes cluster for the Cloud Scale deployment.
Ensure that the control plane or API server of the private Kubernetes cluster has an internal IP address.
Note:
Using a private cluster ensures that network traffic between your API server and node pools remains on the private network only.
Select a Linux-based operating system for the control and data pool nodes.
A Linux-based operating system is supported only with its default settings.
Ensure that the cluster runs the latest Kubernetes version supported by Cloud Scale version 10.3 and above.
Autoscaling parameters
The autoscaling value for the node pool must always be set to True. The minimum number of nodes in this node pool must be 1, and the maximum can be obtained using the following formula:
Number of max nodes = Number of parallel backup and snapshot jobs to run / Minimum of (RAM per node in GB, max_jobs setting)
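The node-count formula above can be sketched in shell. The input values here are illustrative (not from this guide), and rounding the division up is an assumption so that a fractional result still gets a whole node:

```shell
# Illustrative sizing inputs (replace with your own values)
PARALLEL_JOBS=64        # parallel backup and snapshot jobs to run
RAM_PER_NODE_GB=16      # RAM per node in GB
MAX_JOBS=4              # max_jobs tunable for this node size

# Divisor is the minimum of RAM-per-node and the max_jobs setting
DIVISOR=$(( RAM_PER_NODE_GB < MAX_JOBS ? RAM_PER_NODE_GB : MAX_JOBS ))

# Round up (assumption) so a partial node is still provisioned
MAX_NODES=$(( (PARALLEL_JOBS + DIVISOR - 1) / DIVISOR ))
echo "max nodes: $MAX_NODES"
```

With these sample inputs the autoscaler maximum works out to 16 nodes.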
Maximum pods per node setting
For Azure:
Max pods per node = (RAM size in GB * 2) + number of Kubernetes and CSP pods (10) + 3 (listener + fluent collector + fluentbit)
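As a quick check, the Azure formula can be evaluated in shell for a hypothetical 16 GB node (the RAM value is illustrative; the pod counts come from the formula above):

```shell
RAM_GB=16          # node RAM in GB (illustrative)
K8S_CSP_PODS=10    # Kubernetes and CSP pods, per the formula above
EXTRAS=3           # listener + fluent collector + fluentbit

MAX_PODS=$(( RAM_GB * 2 + K8S_CSP_PODS + EXTRAS ))
echo "max pods per node: $MAX_PODS"
```

A 16 GB node therefore needs a max-pods setting of 45.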
For AWS:
In AWS, the node size must be selected based on the number of ENIs (Elastic Network Interfaces) available for the node type. For more information on changing the max pods per node value in AWS, refer to the AWS documentation.
Note:
If the max pods per node value is not sufficient, the max jobs per node can be reduced as described under 'max_jobs tunable' in the following section.
Pool settings
NetBackup pool: Used for deployment of NetBackup primary services along with Snapshot Manager control plane services.
Minimum CPU requirement and Node size RAM: 4 CPU and 16 GB RAM
cpdata pool: Used for deployment of Snapshot Manager data plane (dynamically created) services.
Average size of VM to be backed up **   RAM requirement in GB for cpdata node   Number of CPUs   Tunable
<= 2 TB                                 8                                       2                -
> 2 TB and < 4 TB                       8                                       2                Max_jobs = 4
> 4 TB and < 8 TB                       16                                      4                Max_jobs = 5
> 8 TB and < 16 TB                      16                                      4                Max_jobs = 4
> 16 TB and < 24 TB                     24                                      4                Max_jobs = 3
> 24 TB and < 32 TB                     32                                      4                Max_jobs = 3
Note:
** If the customer has hosts of distinct sizes to be protected, use the size of the larger VMs as the average VM size.
Media pool: CPU requirement and Node size RAM: 4 CPU and 16 GB RAM
MSDP pool: CPU requirement and Node size RAM: 4 CPU and 16 GB RAM
max_jobs tunable: The max_jobs parameter restricts the number of jobs that can run on a single Cloud Scale cpdata node.
Update max_jobs as follows:
$ kubectl edit configmap flexsnap-conf -n <nbux ns>
Add the following entry in the flexsnap.conf section:
[capability_limit]
max_jobs=16
For example:
~$ k describe cm flexsnap-conf
Name:         flexsnap-conf
Namespace:    nbux-002522
Labels:       <none>
Annotations:  <none>

Data
====
flexsnap.conf:
----
[agent]
id = agent.8308b7c831af4b0388fdd7f1d91541e0
[capability_limit]
max_jobs=16
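As an alternative to editing the ConfigMap interactively, the max_jobs change can be scripted. This is a sketch: the sample file below mirrors the example data above, and it assumes a max_jobs line already exists in flexsnap.conf:

```shell
# Recreate the flexsnap.conf data locally to demonstrate the edit
cat > /tmp/flexsnap.conf <<'EOF'
[agent]
id = agent.8308b7c831af4b0388fdd7f1d91541e0
[capability_limit]
max_jobs=16
EOF

# Lower max_jobs to 8 in place
sed -i 's/^max_jobs=.*/max_jobs=8/' /tmp/flexsnap.conf

# Against a live cluster you would then re-apply the edited file, e.g.:
#   kubectl create configmap flexsnap-conf -n <nbux ns> \
#     --from-file=flexsnap.conf=/tmp/flexsnap.conf \
#     --dry-run=client -o yaml | kubectl apply -f -
grep '^max_jobs=' /tmp/flexsnap.conf
```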
Tuning the account rate limit: For BFS performance improvement, the API limit per AWS account can be updated using the following formula:
X  = account rate limit (requests/sec)
V1 = number of VMs
S1 = schedules per day
D1 = data change rate in TB per incremental backup
Q  = (S1 * D1 * V1) / 40
If Q < 1, keep the default of 1000 requests/sec; otherwise set X = (Q + 1) * 1000 requests/sec.
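The rate-limit formula can be sketched in shell as follows. The input values are illustrative, and rounding the quotient down before adding 1 is an assumption:

```shell
V1=500     # number of VMs (illustrative)
S1=1       # schedules per day
D1=0.1     # data change rate in TB per incremental backup

# Q = (S1 * D1 * V1) / 40; if below 1 keep the 1000 requests/sec
# default, otherwise request (floor(Q) + 1) * 1000 requests/sec
LIMIT=$(awk -v s="$S1" -v d="$D1" -v v="$V1" 'BEGIN {
  q = (s * d * v) / 40
  if (q < 1) print 1000
  else print (int(q) + 1) * 1000
}')
echo "account rate limit: $LIMIT requests/sec"
```

For these sample values Q = 1.25, so the account limit would be raised to 2000 requests/sec.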
For example,
The default theoretical speed for the account is 43 TB/day (1000 request per sec x 86400 sec in a day x 512 KB block size).
Assume a protection plan schedule frequency of one backup per day and VMs of around 1 TB each.
If the backup window spans the full day, the theoretical maximum is 43 full VM backups per day.
With 10% incremental change every day, the theoretical maximum is 380 incremental VMs per day, assuming all incrementals have a similar change rate. This does not account for obtaining the changed-block list and other pre- and post-backup work; if that overhead takes 20% of the time, the practical figure is around 250 incremental VMs per day.
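The daily throughput figure above can be reproduced roughly in shell. With integer division and binary units (1 TB = 1024^3 KB) the result is about 41 TB, so the guide's 43 TB/day reflects a different rounding/unit convention:

```shell
REQ_PER_SEC=1000   # default account rate limit
SEC_PER_DAY=86400
BLOCK_KB=512       # block size in KB

KB_PER_DAY=$(( REQ_PER_SEC * SEC_PER_DAY * BLOCK_KB ))
# Convert KB/day to TB/day (binary: 1 TB = 1024^3 KB)
TB_PER_DAY=$(( KB_PER_DAY / (1024 * 1024 * 1024) ))
echo "theoretical throughput: ~$TB_PER_DAY TB/day"
```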