NetBackup™ Deployment Guide for Kubernetes Clusters
- Introduction
- Section I. Configurations
- Prerequisites
- Recommendations and Limitations
- Configurations
- Configuration of key parameters in Cloud Scale deployments
- Section II. Deployment
- Section III. Monitoring and Management
- Monitoring NetBackup
- Monitoring Snapshot Manager
- Monitoring MSDP Scaleout
- Managing NetBackup
- Managing the Load Balancer service
- Managing PostgreSQL DBaaS
- Performing catalog backup and recovery
- Managing MSDP Scaleout
- Section IV. Maintenance
- MSDP Scaleout Maintenance
- PostgreSQL DBaaS Maintenance
- Patching mechanism for Primary and Media servers
- Upgrading
- Cloud Scale Disaster Recovery
- Uninstalling
- Troubleshooting
- Troubleshooting AKS and EKS issues
- Troubleshooting AKS-specific issues
- Troubleshooting EKS-specific issues
- Appendix A. CR template
Cluster specific settings
It is recommended to create a private Kubernetes cluster for the Cloud Scale deployment.
Ensure that the control plane or API server of the private Kubernetes cluster has an internal IP address.
Note:
Using a private cluster ensures that network traffic between your API server and node pools remains on the private network only.
Select a Linux-based operating system for the control and data pool nodes.
A Linux-based operating system is supported only with its default settings.
Ensure that the cluster runs the latest Kubernetes version supported by Cloud Scale version 10.3 and above.
Autoscaling parameters
The autoscaling value for the node pool must always be set to True. The minimum number of nodes in this node pool must be 1, and the maximum can be obtained using the following formula:
Number of max nodes = Number of parallel backup and snapshot jobs to run / Minimum of (RAM per node in GB, max_jobs setting)
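The node-count formula above can be sketched in shell. The input values here are illustrative (not from this guide), and rounding the division up is an assumption so that a fractional result still gets a whole node:

```shell
# Illustrative sizing inputs (replace with your own values)
PARALLEL_JOBS=64        # parallel backup and snapshot jobs to run
RAM_PER_NODE_GB=16      # RAM per node in GB
MAX_JOBS=4              # max_jobs tunable for this node size

# Divisor is the minimum of RAM-per-node and the max_jobs setting
DIVISOR=$(( RAM_PER_NODE_GB < MAX_JOBS ? RAM_PER_NODE_GB : MAX_JOBS ))

# Round up (assumption) so a partial node is still provisioned
MAX_NODES=$(( (PARALLEL_JOBS + DIVISOR - 1) / DIVISOR ))
echo "max nodes: $MAX_NODES"
```

With these sample inputs the autoscaler maximum works out to 16 nodes.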
Maximum pods per node setting
For Azure:
Max pods per node = (RAM size in GB * 2) + number of Kubernetes and CSP pods (10) + 3 (listener + fluent collector + fluentbit)
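As a quick check, the Azure formula can be evaluated in shell for a hypothetical 16 GB node (the RAM value is illustrative; the pod counts come from the formula above):

```shell
RAM_GB=16          # node RAM in GB (illustrative)
K8S_CSP_PODS=10    # Kubernetes and CSP pods, per the formula above
EXTRAS=3           # listener + fluent collector + fluentbit

MAX_PODS=$(( RAM_GB * 2 + K8S_CSP_PODS + EXTRAS ))
echo "max pods per node: $MAX_PODS"
```

A 16 GB node therefore needs a max-pods setting of 45.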
For AWS:
In AWS, the node size must be selected based on the number of ENIs (Elastic Network Interfaces) available for the node type. For more information on changing the max pods per node value in AWS, refer to the AWS documentation.
Note:
If the max pods per node value is not sufficient, the max jobs per node can be reduced as described under 'max_jobs tunable' in the following section.
Pool settings
NetBackup pool: Used for deployment of NetBackup primary services along with Snapshot Manager control plane services.
Minimum CPU requirement and Node size RAM: 4 CPU and 16 GB RAM
cpdata pool: Used for deployment of Snapshot Manager data plane (dynamically created) services.
Average size of VM to be backed up **   RAM requirement in GB for cpdata node   Number of CPUs   Tunable
<= 2 TB                                 8                                       2                -
> 2 TB and < 4 TB                       8                                       2                Max_jobs = 4
> 4 TB and < 8 TB                       16                                      4                Max_jobs = 5
> 8 TB and < 16 TB                      16                                      4                Max_jobs = 4
> 16 TB and < 24 TB                     24                                      4                Max_jobs = 3
> 24 TB and < 32 TB                     32                                      4                Max_jobs = 3
Note:
** If the customer has hosts of distinct sizes to be protected, use the size of the larger VMs as the average VM size.
Media pool: CPU requirement and Node size RAM: 4 CPU and 16 GB RAM
MSDP pool: CPU requirement and Node size RAM: 4 CPU and 16 GB RAM
max_jobs tunable: The max_jobs parameter restricts the number of jobs that can run on a single Cloud Scale cpdata node.
Update max_jobs as follows:
$ kubectl edit configmap flexsnap-conf -n <nbux ns>
Add the following entry in the flexsnap.conf section:
[capability_limit]
max_jobs=16
For example:
~$ k describe cm flexsnap-conf
Name:         flexsnap-conf
Namespace:    nbux-002522
Labels:       <none>
Annotations:  <none>

Data
====
flexsnap.conf:
----
[agent]
id = agent.8308b7c831af4b0388fdd7f1d91541e0
[capability_limit]
max_jobs=16
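As an alternative to editing the ConfigMap interactively, the max_jobs change can be scripted. This is a sketch: the sample file below mirrors the example data above, and it assumes a max_jobs line already exists in flexsnap.conf:

```shell
# Recreate the flexsnap.conf data locally to demonstrate the edit
cat > /tmp/flexsnap.conf <<'EOF'
[agent]
id = agent.8308b7c831af4b0388fdd7f1d91541e0
[capability_limit]
max_jobs=16
EOF

# Lower max_jobs to 8 in place
sed -i 's/^max_jobs=.*/max_jobs=8/' /tmp/flexsnap.conf

# Against a live cluster you would then re-apply the edited file, e.g.:
#   kubectl create configmap flexsnap-conf -n <nbux ns> \
#     --from-file=flexsnap.conf=/tmp/flexsnap.conf \
#     --dry-run=client -o yaml | kubectl apply -f -
grep '^max_jobs=' /tmp/flexsnap.conf
```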
Tuning the account rate limit: For BFS performance improvement, the API limit per AWS account can be updated using the following formula:
X  = account rate limit (requests/sec)
V1 = number of VMs
S1 = schedules per day
D1 = data change rate in TB per incremental backup
Q  = (S1 * D1 * V1) / 40
If Q < 1, keep the default of 1000 requests/sec; otherwise set X = (Q + 1) * 1000 requests/sec.
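The rate-limit formula can be sketched in shell as follows. The input values are illustrative, and rounding the quotient down before adding 1 is an assumption:

```shell
V1=500     # number of VMs (illustrative)
S1=1       # schedules per day
D1=0.1     # data change rate in TB per incremental backup

# Q = (S1 * D1 * V1) / 40; if below 1 keep the 1000 requests/sec
# default, otherwise request (floor(Q) + 1) * 1000 requests/sec
LIMIT=$(awk -v s="$S1" -v d="$D1" -v v="$V1" 'BEGIN {
  q = (s * d * v) / 40
  if (q < 1) print 1000
  else print (int(q) + 1) * 1000
}')
echo "account rate limit: $LIMIT requests/sec"
```

For these sample values Q = 1.25, so the account limit would be raised to 2000 requests/sec.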
For example,
The default theoretical speed for the account is 43 TB/day (1000 request per sec x 86400 sec in a day x 512 KB block size).
Assume a protection plan schedule frequency of one backup per day and VMs of around 1 TB each.
If the backup window spans the full day, the theoretical maximum is 43 full VM backups per day.
With 10% incremental change every day, the theoretical maximum is 380 incremental VMs per day, assuming all incrementals have a similar change rate. This does not account for obtaining the changed-block list and other pre- and post-backup work; if that overhead takes 20% of the time, the practical figure is around 250 incremental VMs per day.
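The daily throughput figure above can be reproduced roughly in shell. With integer division and binary units (1 TB = 1024^3 KB) the result is about 41 TB, so the guide's 43 TB/day reflects a different rounding/unit convention:

```shell
REQ_PER_SEC=1000   # default account rate limit
SEC_PER_DAY=86400
BLOCK_KB=512       # block size in KB

KB_PER_DAY=$(( REQ_PER_SEC * SEC_PER_DAY * BLOCK_KB ))
# Convert KB/day to TB/day (binary: 1 TB = 1024^3 KB)
TB_PER_DAY=$(( KB_PER_DAY / (1024 * 1024 * 1024) ))
echo "theoretical throughput: ~$TB_PER_DAY TB/day"
```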