NetBackup™ Deployment Guide for Kubernetes Clusters
- Introduction
- Section I. Configurations
- Prerequisites
- Recommendations and Limitations
- Configurations
- Configuration of key parameters in Cloud Scale deployments
- Section II. Deployment
- Section III. Monitoring and Management
- Monitoring NetBackup
- Monitoring Snapshot Manager
- Monitoring fluentbit
- Monitoring MSDP Scaleout
- Managing NetBackup
- Managing the Load Balancer service
- Managing PostgreSQL DBaaS
- Managing fluentbit
- Performing catalog backup and recovery
- Section IV. Maintenance
- PostgreSQL DBaaS Maintenance
- Patching mechanism for primary, media servers, fluentbit pods, and postgres pods
- Upgrading
- Cloud Scale Disaster Recovery
- Uninstalling
- Troubleshooting
- Troubleshooting AKS and EKS issues
- Troubleshooting AKS-specific issues
- Troubleshooting EKS-specific issues
- Appendix A. CR template
- Appendix B. MSDP Scaleout
- MSDP Scaleout configuration
- Managing MSDP Scaleout
- MSDP Scaleout maintenance
Issues with logging feature for Cloud Scale
Useful commands during troubleshooting
To get the list of nodes:
$ kubectl get nodes
To view the information (such as Taints applied), describe the node:
$ kubectl describe node <node name>
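To list the taints on all nodes at once, a custom-columns query such as the following sketch can be used (the column expression shown is one possible form):
$ kubectl get nodes -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'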
To view which node each pod is assigned to:
$ kubectl get pods -A -o wide
This command displays additional details, including the node that each pod is running on.
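To narrow the list to the pods assigned to a single node, a field selector can be added, for example:
$ kubectl get pods -A -o wide --field-selector spec.nodeName=<node name>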
To obtain information about the fluentbit DaemonSet, run the describe command on it:
$ kubectl describe ds nb-fluentbit-daemonset -n netbackup
This command displays how many DaemonSet pods are desired and scheduled.
If the taints and tolerations are not configured properly, DaemonSet pods are not assigned to the affected nodes, and the container and pod logs on those nodes are not collected. This issue occurs when the tolerations are not set up in the values.yaml file or were not added at all.
The values can be viewed using the following commands:
To list the DaemonSets in the NetBackup namespace:
$ kubectl get ds -n <netbackup namespace>
In the output, compare the DESIRED and READY columns; a mismatch indicates DaemonSet pods that could not be scheduled, often because of taints without matching tolerations.
To view tolerations:
$ kubectl edit ds -n <netbackup namespace> nb-fluentbit-daemonset
The tolerations can be found in the editor (vi) that opens. If no change is required, exit without saving any changes.
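For reference, a tolerations entry in the DaemonSet spec typically has the following shape; the key, value, and effect below are placeholders and must match the taints actually applied to your nodes:
tolerations:
- key: "<taint key>"
  operator: "Equal"
  value: "<taint value>"
  effect: "NoSchedule"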
The following error messages appear when fluentbit scans a log location that has permission issues:
[error] [input:tail:tail.0] read error, check permissions: /mnt/nblogs/*/*/*.log
[ warn] [input:tail:tail.0] error scanning path: /mnt/nblogs/*/*/*.log
[error] [input:tail:tail.0] read error, check permissions: /mnt/nblogs/*/*/*/*.log
[ warn] [input:tail:tail.0] error scanning path: /mnt/nblogs/*/*/*/*.log
[error] [input:tail:tail.0] read error, check permissions: /mnt/nblogs/*/*/*.log
[ warn] [input:tail:tail.0] error scanning path: /mnt/nblogs/*/*/*.log
These error messages are displayed in the sidecar logs, which can be found in the collector pod: the DaemonSet pods pick up the sidecar logs and store them under the pod in which the sidecar resides. If this error occurs, some application logs associated with that sidecar may be missing from the collector.
Workaround:
Exec into the sidecar and determine which folder has permission issues.
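For example, a session such as the following sketch can be used to inspect the permissions of the mounted log directories; the pod and container names are placeholders that depend on your deployment:
$ kubectl exec -it <pod name> -c <sidecar container name> -n <netbackup namespace> -- /bin/sh
$ ls -lR /mnt/nblogs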
If you add incorrect labels to the .yaml file, no DaemonSet pod runs on that node, and logs are not collected for the pods present on that node.
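To verify the labels, the node selector of the DaemonSet can be compared against the labels present on the node, for example:
$ kubectl get ds nb-fluentbit-daemonset -n <netbackup namespace> -o jsonpath='{.spec.template.spec.nodeSelector}'
$ kubectl get node <node name> --show-labels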
In the case of Azure, if any of the Cloud Scale nodes is configured on the agent pool (system pool), the DaemonSet is not able to collect the logs from that node.
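On AKS, pool membership is typically exposed through node labels (for example, the agentpool label, and kubernetes.azure.com/mode on recent AKS versions), so a command such as the following can help confirm whether a Cloud Scale node landed on the system pool:
$ kubectl get nodes -L agentpool,kubernetes.azure.com/mode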