NetBackup™ Deployment Guide for Kubernetes Clusters
Configuring the environment.yaml file
The environment.yaml file lets you configure the primary server, media servers, MSDP Scaleout storage, and Snapshot Manager servers. The file contains five sections: the first contains parameters that apply to all the servers, and the remaining four configure the primary server, media servers, MSDP Scaleout, and Snapshot Manager servers, respectively.
The following configurations apply to all the components:
Table: Common environment parameters
Parameter | Description |
---|---|
name: environment-sample | Specify the name of the environment in your cluster. |
namespace: example-ns | Specify the namespace where all the NetBackup resources are managed. If not specified here, then it will be the current namespace when you run the command kubectl apply -f on this file. |
(AKS-specific) containerRegistry: example.azurecr.io (EKS-specific) containerRegistry: example.dkr.ecr.us-east-2.amazonaws.com/exampleReg | Specify a container registry that the cluster has access to. NetBackup images are pushed to this registry. |
tag: "10.2" | This tag is used for all images in the environment. Specifying a `tag` value on a sub-resource affects the images for that sub-resource only. For example, if you apply an EEB that affects only primary servers, you might set `primary.tag` to the custom tag of that EEB. The primary server runs with that image, but the media servers and MSDP Scaleouts continue to run images tagged `10.2`. This field must be a string, but YAML treats unquoted values that look like numbers as numbers; quote the value to avoid misinterpretation. |
licenseKeys: | List the license keys that are shared among all the sub-resources. Licenses specified in a sub-resource are appended to this list and applied only to the sub-resource. |
paused: false | Specify whether the NetBackup operator attempts to reconcile the differences between this YAML specification and the current Kubernetes cluster state. Only set it to true during maintenance. |
configCheckMode: default | This controls whether certain configuration restrictions are checked or enforced during setup. Other allowed values are skip and dryrun. |
corePattern: /corefiles/core.%e.%p.%t | Specify the path to use for storing core files in case of a crash. |
(AKS-specific) loadBalancerAnnotations: service.beta.kubernetes.io/azure-load-balancer-internal-subnet: example-subnet (EKS-specific) loadBalancerAnnotations: service.beta.kubernetes.io/aws-load-balancer-subnets: example-subnet1 name | Specify the annotations to be added for the network load balancer. |
Note:
(EKS-specific) If NetBackup is upgraded from 10.0.0.1, delete the following configuration from the environment.yaml file: service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: preserve_client_ip.enabled=true
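As a rough illustration, the common parameters from the table above might be laid out as follows. This is a sketch, not the shipped template. Placing name and namespace under metadata follows the usual Kubernetes convention, and nesting the remaining fields under spec is an assumption based on the spec.paused and spec.loadBalancerAnnotations paths mentioned in this guide. The apiVersion and kind lines are omitted because they are not stated here, and "<license-key>" is a placeholder. The AKS example values are shown.

```yaml
# Sketch only: field nesting is assumed, not copied from the shipped template.
metadata:
  name: environment-sample
  namespace: example-ns
spec:
  containerRegistry: example.azurecr.io   # AKS example registry from the table
  tag: "10.2"                             # quoted so YAML keeps it a string
  licenseKeys:
    - "<license-key>"                     # placeholder, not a real key
  paused: false                           # set to true only during maintenance
  configCheckMode: default                # other allowed values: skip, dryrun
  corePattern: /corefiles/core.%e.%p.%t
  loadBalancerAnnotations:
    service.beta.kubernetes.io/azure-load-balancer-internal-subnet: example-subnet
```

The quotes around the tag matter: an unquoted 10.2 is parsed by YAML as a floating-point number, while "10.2" stays a string.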
The following section describes Snapshot Manager related parameters. You may also deploy without any Snapshot Manager. In that case, remove the cpServer section entirely from the configuration file.
Table: Snapshot Manager parameters
Parameter | Description |
---|---|
cpServer: - name | This specifies Snapshot Manager configurations. Currently, only a single instance of Snapshot Manager deployment is supported. It is also possible to have no Snapshot Manager configured; in that case, delete the cpServer section itself. |
containerRegistry | (Optional) Specify a container registry that the cluster has access to. Snapshot Manager images are pushed to this registry, which overrides the one defined in the table above. |
tag: | This tag overrides the one defined in the table above. The Snapshot Manager images are shipped with tags different from the NetBackup primary, media, and MSDP images. |
credential: secretName | This defines the credentials for Snapshot Manager. It refers to a secret in the same namespace as this environment resource with values for username and password. |
networkLoadBalancer: annotations | Annotations to be provided to the network load balancer. All networkLoadBalancer annotations are supported. These values are merged with the values provided in the . The duplicate values provided here override the corresponding values in the . |
networkLoadBalancer: ipaddr | IP address to be assigned to the network load balancer. |
networkLoadBalancer: fqdn | FQDN to be assigned to the network load balancer. |
log.capacity | Size for log volume. |
log.storageClassName | Storage class for log volume. It must be an EFS-based storage class. |
data.capacity | Size for data volume. |
data.storageClassName | EBS based storage class for data volume. |
controlPlane.nodePool | Name of the control plane node pool. |
controlPlane.labelKey | Label and taint key of the control plane. |
controlPlane.labelValue | Label and taint value of the control plane. |
dataPlane.nodePool | Name of the data plane node pool. |
dataPlane.labelKey | Label and taint key of the data plane. |
dataPlane.labelValue | Label and taint value of the data plane. |
proxySettings.vx_http_proxy: | Address to be used as the proxy for all HTTP connections. For example, "http://proxy.example.com:8080/" |
proxySettings.vx_https_proxy: | Address to be used as the proxy for all HTTPS connections. For example, "http://proxy.example.com:8080/" |
proxySettings.vx_no_proxy: | Addresses that are allowed to bypass the proxy server. You can specify host names, IP addresses, and domain names in this parameter. For example, "localhost,mycompany.com,169.254.169.254" |
The following configurations apply to the primary server. The values specified in the following table can override the values specified in the table above.
Table: Environment parameters for the primary server
Parameter | Description |
---|---|
paused: false | Specifies whether the NetBackup operator attempts to reconcile the differences between this YAML specification and the current Kubernetes cluster state. Set it to true only during maintenance. This applies only to the environment object. To pause reconciliation of the managed primary server, for example, you must set spec.primary.paused. Setting spec.paused:true ceases updates to the managed resources, including updates to their `paused` status. Entries in the media servers and MSDP scaleouts lists also support the `paused` field. The default value is false. |
primary | Specifies attributes specific to the primary server resources. Every environment has exactly one primary server, so this section cannot be left blank. |
name: primary-name | Set resourceNamePrefix to control the name of the primary server. The default value is the same as the environment's name. |
tag: 10.2-special | To use a different image tag specifically for the primary server, uncomment this value and provide the desired tag. This overrides the tag specified in the common section. |
nodeSelector: labelKey: kubernetes.io/os labelValue: linux | Specify a key and value that identifies nodes where the primary server pod runs. Note: This labelKey and labelValue must be the same label key:value pair used during cloud node creation which would be used as a toleration for primary server. |
networkLoadBalancer: (AKS-specific) annotations: service.beta.kubernetes.io/azure-load-balancer-internal-subnet: example-subnet (EKS-specific) annotations: service.beta.kubernetes.io/aws-load-balancer-subnets: example-subnet1 name ipList: - ipAddr: 4.3.2.1 fqdn: primary.example.com | Uncomment the annotations to specify additional primary server-specific annotations. These values are merged with the values given in the loadBalancerAnnotations above. Any duplicate values given here override the corresponding values above. Next, specify the hostname and IP address of the primary server. |
credSecretName: primary-credential-secret | This determines the credentials for the primary server. Media servers use these credentials to register themselves with the primary server. |
itAnalyticsPublicKey: ssh-rsaxxx | If using NetBackup IT Analytics, uncomment this and provide the SSH public key. IT Analytics uses this to access the primary server. |
kmsDBSecret: kms-secret | Secret name which contains the Host Master Key ID (HMKID), Host Master Key passphrase (HMKpassphrase), Key Protection Key ID (KPKID) and Key Protection Key passphrase (KPKpassphrase) for NetBackup Key Management Service. The secret should be 'Opaque', and can be created either using a YAML or the following example command: kubectl create secret generic kms-secret --namespace nb-namespace --from-literal=HMKID="HMK@ID" --from-literal=HMKpassphrase="HMK@passphrase" --from-literal=KPKID="KPK@ID" --from-literal=KPKpassphrase="KPK@passphrase" |
licenseKeys: | To specify additional license keys that are applied only to the primary server, uncomment this and provide the license key(s). In this example, the primary server would have the "X" license key defined in the previous section, followed by this "Y" key. |
catalog: capacity: 100Gi (AKS-specific) storageClassName: standard (EKS-specific) storageClassName: <EFS_ID> | This storage applies to the primary server for the NetBackup catalog, log and data volumes. The primary server catalog volume must be at least 100 Gi. |
log: capacity: 30Gi (AKS-specific) storageClassName: standard (EKS-specific) storageClassName: <EBS based storage class> | Log volume must be at least 30Gi. |
data: capacity: 30Gi (AKS-specific) storageClassName: standard (EKS-specific) storageClassName: <EBS based storage class> | The primary server data volume must be at least 30Gi. Note: (AKS-specific) This storage applies to primary server data volume. |
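The primary server parameters in the table above could be combined roughly as shown below. This is a sketch under assumptions: the nesting under spec.primary is inferred from the spec.primary.paused path mentioned in the table, the AKS example storage class is used for all three volumes, and the optional annotations, itAnalyticsPublicKey, and licenseKeys entries are left out. All names and addresses are the sample values from the table.

```yaml
# Sketch only: nesting under spec.primary is assumed from the parameter names above.
spec:
  primary:
    name: primary-name
    # tag: "10.2-special"            # uncomment to override the common tag
    nodeSelector:
      labelKey: kubernetes.io/os
      labelValue: linux
    networkLoadBalancer:
      ipList:
        - ipAddr: 4.3.2.1
          fqdn: primary.example.com
    credSecretName: primary-credential-secret
    kmsDBSecret: kms-secret
    catalog:
      capacity: 100Gi                # must be at least 100Gi
      storageClassName: standard     # AKS example; EKS uses an EFS ID here
    log:
      capacity: 30Gi                 # must be at least 30Gi
      storageClassName: standard
    data:
      capacity: 30Gi                 # must be at least 30Gi
      storageClassName: standard
```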
The following section describes the media server configurations. If you do not have a media server, either remove this section from the configuration file entirely or define it as an empty list.
Note:
(EKS-specific) The environment name or media server name in the environment.yaml file must always be fewer than 22 characters.
Table: Media server related parameters
Parameter | Description |
---|---|
mediaServers: - name: media1 | This specifies media server configurations. This is given as a list of media servers, but most environments will have just one, with multiple replicas. It's also possible to have zero media servers; in that case, either remove the media servers section entirely, or define it as an empty list: mediaServers: [] |
minimumReplicas | Describes the minimum number of replicas of the running media server. This is an optional field. If not specified, the value will be set to the value specified for field. |
replicas: 1 | Specifies the maximum number of replicas that the media server can scale up to. The value of replicas must be greater than the value of for media autoscaler to work. |
tag: 10.2-special | To use a different image tag specifically for the media servers, uncomment this value and provide the desired tag. This overrides the tag specified above in the common table. |
nodeSelector: labelKey: kubernetes.io/os labelValue: linux | Specify a key and value that identifies nodes where media-server pods will run. Note: This labelKey and labelValue must be the same label key:value pair used during cloud node creation which would be used as a toleration for media server. |
data: capacity: 50Gi (AKS-specific) storageClassName: managed-premium-nbux (EKS-specific) storageClassName: <EBS based storage class> | This storage applies to the media server data volumes. The minimum data size for a media server is 50 Gi. |
log: capacity: 30Gi (AKS-specific) storageClassName: managed-premium-nbux (EKS-specific) storageClassName: <EBS based storage class> | This storage applies to the media server log volumes. Log volumes must be at least 30Gi. |
networkLoadBalancer: (AKS-specific) annotations: - service.beta.kubernetes.io/azure-load-balancer-internal-subnet: example-subnet (EKS-specific) annotations: - service.beta.kubernetes.io/aws-load-balancer-subnets: example-subnet1 name ipList: - ipAddr: 4.3.2.2 fqdn: media1-1.example.com - ipAddr: 4.3.2.3 fqdn: media1-2.example.com | Uncomment annotations to specify additional media server-specific annotations. These values are merged with the values given in the loadBalancerAnnotations. Duplicate values given here override the corresponding values in the loadBalancerAnnotations. The number of entries in the IP list should match the replica count specified above. |
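For illustration, a media server entry built from the table above might look like the following. This is a sketch, not the shipped template: the nesting under spec.mediaServers is an assumption, the AKS example storage class is used, and replicas is set to 2 here only so that it matches the two ipList entries, as the table requires. The optional tag, minimumReplicas, and annotations entries are omitted.

```yaml
# Sketch only: nesting under spec.mediaServers is assumed.
spec:
  mediaServers:
    - name: media1
      replicas: 2                    # chosen to match the two ipList entries below
      nodeSelector:
        labelKey: kubernetes.io/os
        labelValue: linux
      data:
        capacity: 50Gi               # minimum data size for a media server
        storageClassName: managed-premium-nbux
      log:
        capacity: 30Gi               # must be at least 30Gi
        storageClassName: managed-premium-nbux
      networkLoadBalancer:
        ipList:
          - ipAddr: 4.3.2.2
            fqdn: media1-1.example.com
          - ipAddr: 4.3.2.3
            fqdn: media1-2.example.com
```

For an environment without media servers, the whole list collapses to the single line mediaServers: [] or is removed entirely.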
(EKS-specific) Note the following:
To use gp3 (EBS based storage class), the user must specify the provisioner for the storage class as ebs.csi.aws.com and must install the EBS CSI driver. For more information on installing the EBS CSI driver, see Amazon EBS CSI driver. For example, for the gp3 storage class:

    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: gp3
      annotations:
        storageclass.kubernetes.io/is-default-class: "true"
    allowVolumeExpansion: true
    provisioner: ebs.csi.aws.com
    volumeBindingMode: WaitForFirstConsumer
    parameters:
      type: gp3
The following section describes MSDP-related parameters. You may also deploy without any MSDP scaleouts. In that case, remove the msdpScaleouts section entirely from the configuration file.
Table: MSDP Scaleout related parameters
Parameter | Description |
---|---|
msdpScaleouts: - name: dedupe1 | This specifies MSDP Scaleout configurations. This is given as a list, but it would be rare to need more than one scaleout deployment in a single environment. Use the `replicas` property below to scale out. It's also possible to have zero MSDP Scaleouts; in that case, either remove the msdpScaleouts section entirely, or define it as an empty list: msdpScaleouts: [] |
tag: '18.0' | This tag overrides the one defined in the table 1-3. It is necessary because the MSDP Scaleout images are shipped with tags different from the NetBackup primary and media images. |
replicas: 4 | This is the scaleout size of this MSDP Scaleout component. It is a required value, and it must be between 4 and 16 inclusive. Note: Scale-down of the MSDP Scaleout replicas after deployment is not supported. |
serviceIPFQDNs: ipAddr: 1.2.3.4 fqdn: dedupe1-1.example.com ipAddr: 1.2.3.5 fqdn: dedupe1-2.example.com ipAddr: 1.2.3.6 fqdn: dedupe1-3.example.com ipAddr: 1.2.3.7 fqdn: dedupe1-4.example.com | These are the IP addresses and host names of the MSDP Scaleout servers. The number of the entries should match the number of the replicas specified above. |
kms: keyGroup: example-key-group | Specifies the initial key group and key secret to be used for KMS encryption. When reusing storage from a previous deployment, the key group and key secret may already exist. In this case, provide the keyGroup only. |
keySecret: example-key-secret | Specify keySecret only if the key group does not already exist and needs to be created. The secret type should be Opaque, and you can create the secret either using a YAML or the following command: kubectl create secret generic example-key-secret --namespace nb-namespace --from-literal=username="devuser" --from-literal=passphrase="test passphrase" |
(AKS-specific) loadBalancerAnnotations: service.beta.kubernetes.io/azure-load-balancer-internal: true (EKS-specific) loadBalancerAnnotations: service.beta.kubernetes.io/aws-load-balancer-internal: true | For MSDP Scaleouts, the default value for the following annotation is `false`, which may cause the MSDP Scaleout services in this environment to be publicly accessible: (AKS-specific) azure-load-balancer-internal (EKS-specific) aws-load-balancer-internal. To ensure that they use private IP addresses, specify `true` here or in the loadBalancerAnnotations above in Table 1-3. |
credential: secretName: msdp-secret1 | This defines the credentials for the MSDP Scaleout server. It refers to a secret in the same namespace as this environment resource. The secret can be of type 'Basic-auth' or 'Opaque'. You can create secrets using a YAML file or by using the following command: kubectl create secret generic <msdp-secret1> --namespace <nb-namespace> --from-literal=username=<"devuser"> --from-literal=password=<"Y@123abCdEf"> |
autoDelete: false | Optional parameter. Default value is true. When set to true, the MSDP Scaleout operator deletes the MSDP secret after using it. In such case, the MSDP and primary secrets must be distinct. To use the same secret for both MSDP scaleouts and the primary server, set autoDelete to false. |
catalog: capacity: 1Gi (AKS-specific) storageClassName: standard (EKS-specific) storageClassName: gp2 | This storage applies to MSDP Scaleout to store the catalog and metadata. The catalog size can only be increased, for capacity expansion. Expanding the existing catalog volumes causes a short downtime of the engines. The recommended size is 1/100 of the backend data capacity. |
dataVolumes: capacity: 5Gi (AKS-specific) storageClassName: standard (EKS-specific) storageClassName: gp2 | This specifies the data storage for this MSDP Scaleout resource. You may increase the size of a volume or add more volumes to the end of the list, but do not remove or re-order volumes. A maximum of 16 volumes is allowed. Appending new data volumes or expanding existing ones causes a short downtime of the engines. The recommended volume size is 5Gi-32Ti. |
log: capacity: 20Gi (AKS-specific) storageClassName: standard (EKS-specific) storageClassName: gp2 | Specifies log volume size used to provision Persistent Volume Claim for Controller and MDS Pods. In most cases, 5-10 Gi capacity should be big enough for one MDS or Controller Pod to use. |
nodeSelector: labelKey: kubernetes.io/os labelValue: linux | Specify a key and value that identifies nodes where MSDP Scaleout pods will run. |
(AKS-specific) s3Credential: secretName: s3-secret1 | This is an optional parameter. Defines the MSDP S3 root credentials for the MSDP Scaleout server. It refers to a secret in the same namespace as this environment resource. If the parameter is not specified, MSDP S3 feature is unavailable. Run the following command to create the secret: kubectl msdp generate-s3-secret --namespace <nb-namespace> --s3secret <s3-secret1> Save the S3 credential at a secured place after it is generated for later use. |
(AKS-specific) autoDelete: false | This is an optional parameter. Default value is true. When set to true, the MSDP Scaleout operator deletes the MSDP S3 credential secret after using it. |
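Putting the MSDP Scaleout rows above together, one list entry might look like the following. This is a sketch under assumptions: the nesting under spec.msdpScaleouts, the placement of autoDelete under credential, and the list form of dataVolumes (suggested by "add more volumes to the end of the list") are inferred from the table rather than copied from the shipped template. The AKS example values are used, and the optional s3Credential entry is omitted.

```yaml
# Sketch only: nesting is inferred from the table, not from the shipped template.
spec:
  msdpScaleouts:
    - name: dedupe1
      tag: "18.0"                    # MSDP images use their own tags; keep it quoted
      replicas: 4                    # required, between 4 and 16 inclusive
      serviceIPFQDNs:                # one entry per replica
        - ipAddr: 1.2.3.4
          fqdn: dedupe1-1.example.com
        - ipAddr: 1.2.3.5
          fqdn: dedupe1-2.example.com
        - ipAddr: 1.2.3.6
          fqdn: dedupe1-3.example.com
        - ipAddr: 1.2.3.7
          fqdn: dedupe1-4.example.com
      kms:
        keyGroup: example-key-group
        keySecret: example-key-secret   # only if the key group must be created
      loadBalancerAnnotations:
        service.beta.kubernetes.io/azure-load-balancer-internal: "true"
      credential:
        secretName: msdp-secret1
        autoDelete: false            # keep the secret so it can be shared with the primary
      catalog:
        capacity: 1Gi
        storageClassName: standard
      dataVolumes:                   # append only; do not remove or re-order
        - capacity: 5Gi
          storageClassName: standard
      log:
        capacity: 20Gi
        storageClassName: standard
      nodeSelector:
        labelKey: kubernetes.io/os
        labelValue: linux
```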
For more information on Snapshot Manager related parameters, refer to the following:
Do not change these parameters post initial deployment. Changing these parameters may result in an inconsistent deployment.
Table: Edit restricted parameters post deployment
Parameter | Description |
---|---|
name | Specifies the prefix name for the primary, media, and MSDP Scaleout server resources. |
(AKS-specific) ipAddr, fqdn and loadBalancerAnnotations | Do not change the values of ipAddr, fqdn, and loadBalancerAnnotations in the following fields after the initial deployment. This applies to the primary, media, and MSDP Scaleout servers. For example: - The loadBalancerAnnotations values: loadBalancerAnnotations: service.beta.kubernetes.io/azure-load-balancer-internal-subnet: example-subnet service.beta.kubernetes.io/azure-load-balancer-internal: "true" - The IP address and FQDN values defined for the primary, media, and MSDP Scaleout servers: ipList: - ipAddr: 4.3.2.1 fqdn: primary.example.com ipList: - ipAddr: 4.3.2.2 fqdn: media1-1.example.com - ipAddr: 4.3.2.3 fqdn: media1-2.example.com serviceIPFQDNs: - ipAddr: 1.2.3.4 fqdn: dedupe1-1.example.com - ipAddr: 1.2.3.5 fqdn: dedupe1-2.example.com - ipAddr: 1.2.3.6 fqdn: dedupe1-3.example.com - ipAddr: 1.2.3.7 fqdn: dedupe1-4.example.com |
(EKS-specific) ipAddr, fqdn and loadBalancerAnnotations | Do not change the values of ipAddr, fqdn, and loadBalancerAnnotations in the following fields after the initial deployment. This applies to the primary, media, and MSDP Scaleout servers. For example: - The loadBalancerAnnotations values: loadBalancerAnnotations: service.beta.kubernetes.io/aws-load-balancer-internal-subnet: example-subnet service.beta.kubernetes.io/aws-load-balancer-internal: "true" - The IP address and FQDN values defined for the primary, media, and MSDP Scaleout servers: ipList: - ipAddr: 4.3.2.1 fqdn: primary.example.com ipList: - ipAddr: 4.3.2.2 fqdn: media1-1.example.com - ipAddr: 4.3.2.3 fqdn: media1-2.example.com serviceIPFQDNs: - ipAddr: 1.2.3.4 fqdn: dedupe1-1.example.com - ipAddr: 1.2.3.5 fqdn: dedupe1-2.example.com - ipAddr: 1.2.3.6 fqdn: dedupe1-3.example.com - ipAddr: 1.2.3.7 fqdn: dedupe1-4.example.com |
Table: Snapshot Manager server related parameters
Parameter | Description |
---|---|
cpServer: - name: cpServer-name | This specifies Snapshot Manager server configurations. This is given as a list of Snapshot Manager servers, but most environments will have just one, with multiple replicas. |
tag: 10.2-special | This tag overrides the one defined in Common environment parameters table above. The Snapshot Manager images are shipped with tags different from the NetBackup primary, media, and MSDP images. |
nodeSelector: controlPlane: cpcontrol dataPlane: cpdata | Details of the labels used to identify the Kubernetes nodes reserved for the Snapshot Manager servers. If controlPlane is not specified here, it is read from primary.nodeSelector. In that case, the node pool must have the appropriate taint and label added to it. Note: The node pool name mentioned will have the value of primary.nodeSelector.labelValue. |
data: capacity: 100Gi (AKS-specific) storageClassName: managed-premium (EKS-specific) storageClassName: <EBS based storage class> | This storage applies to the Snapshot Manager server data volumes. The minimum data size for a Snapshot Manager server is 100 Gi. |
log: capacity: 5Gi (AKS-specific) storageClassName: managed-premium (EKS-specific) storageClassName: <EBS based storage class> | This storage applies to the Snapshot Manager server log volumes. Log volumes must be at least 5 Gi. |
networkLoadBalancer: (AKS-specific) annotations: - service.beta.kubernetes.io/azure-load-balancer-internal-subnet: example-subnet (EKS-specific) annotations: - service.beta.kubernetes.io/aws-load-balancer-subnets: example-subnet1 name ipList: - ipAddr: 4.3.2.2 fqdn: media1-1.example.com - ipAddr: 4.3.2.3 fqdn: media1-2.example.com | Uncomment annotations to specify additional Snapshot Manager server-specific annotations. These values are merged with the values given in the spec.loadBalancerAnnotations. Duplicate values given here override the corresponding values in the spec.loadBalancerAnnotations. |
credential: secretName: cp-creds | This defines the credentials for the Snapshot Manager server. It refers to a secret in the same namespace as this environment resource. The secret can be created using a YAML file or the following example command: kubectl create secret generic cp-creds --namespace nb-namespace --from-literal=username="admin" --from-literal=password="CloudPoint@123" |
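Several rows in this topic note that a secret can be created either with a YAML file or with kubectl create secret. As a sketch of the YAML route for the cp-creds example, a standard Kubernetes Secret manifest could look like the following. The name, namespace, and values are the sample values from the command in the table; replace them with your own. The stringData field accepts plain-text values, which Kubernetes stores base64-encoded.

```yaml
# Sketch of the cp-creds secret as a manifest, using the sample values from the table.
apiVersion: v1
kind: Secret
metadata:
  name: cp-creds
  namespace: nb-namespace            # must match the namespace of the environment resource
type: Opaque
stringData:
  username: "admin"
  password: "CloudPoint@123"         # sample value only; do not reuse
```

Saving this to a file and running kubectl apply -f on it should give the same kind of Opaque secret as the kubectl create secret generic command.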
Note the following:
To use efs-sc (EFS based storage class), the user must specify the provisioner for the storage class as efs.csi.aws.com and must install the EFS CSI driver. For example, for the efs-sc storage class:

    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: efs-sc
    provisioner: efs.csi.aws.com
    parameters:
      provisioningMode: efs-ap
      fileSystemId: <EFS ID>
      directoryPerms: "700"
    reclaimPolicy: Retain
    volumeBindingMode: Immediate