NetBackup™ for Cassandra Administrator's Guide
Setting up cassandra.conf file on the primary server
To protect a Cassandra cluster, you need to create a configuration file on the primary server that contains the configuration details of the Cassandra cluster. Create the file in the following path:
UNIX:
/usr/openv/var/global/
WINDOWS:
install path\NetBackup\var\global\
Note:
The file name cassandra.conf must have all characters in lower case. This file is a JSON file; you can edit it manually at any time and save it at the same location. Verify the JSON format with an online formatter to avoid JSON formatting errors when NetBackup reads this file.
This file can have entries for multiple Cassandra clusters. All the Cassandra clusters must be listed in this file, whether they are backed up or used for an alternate restore.
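As a local alternative to the format check that the note recommends, the following minimal Python sketch parses the file content and reports the position of the first syntax error. The function name `load_conf` is hypothetical and is not part of NetBackup; the sketch checks JSON syntax only, not the meaning of any key.

```python
import json


def load_conf(text):
    """Parse cassandra.conf content and return it as a dict.

    Raises ValueError with the line and column of the first JSON syntax error.
    """
    try:
        conf = json.loads(text)
    except json.JSONDecodeError as err:
        raise ValueError(
            f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"
        ) from err
    # Assumption: the file is expected to hold a single JSON object at the top level,
    # as in the sample file.
    if not isinstance(conf, dict):
        raise ValueError("top-level JSON value must be an object")
    return conf
```

To use it, read the file from the path given above and pass its content to `load_conf`; a `ValueError` points to the location to fix before NetBackup reads the file.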
Sample cassandra.conf file:
    {
      "productionCluster": {
        "multi_72": {
          "nodes": [
            "10.221.104.71",
            "10.221.104.72",
            "10.221.104.73",
            "10.221.104.74",
            "10.221.104.77"
          ],
          "prodClusternodekeyHashes": {
            "10.221.104.71": "7b69ed1bbe095b2c5fcd34c26806793f8740ebcb24e0c7bbd9a9bbae9e848923",
            "10.221.104.72": "a41dfc6a7b33f5fa02d7226e871a900666cd65beeca148a77d0aabe9ed33e7ff",
            "10.221.104.73": "1a41c78e68effd51e6eaf8cde265421cb81475bf8365938be146a271f444ce35",
            "10.221.104.74": "ebec0750d15ea1f0dfca993e8425d0106ef5aa0bf6e30d5bfa6a3aad84313bbd",
            "10.221.104.77": "ba8f8b33a46bc88780288d87b5cb32116773a3929c2f4cf33bd324e9516c5fdb"
          },
          "dataCenterName": "datacenter1",
          "nodeDownThresholdPercentage": 25,
          "dssClusterName": "dss_multi_72"
        },
        "multi_82": {
          "nodes": [
            "10.221.104.171",
            "10.221.104.172"
          ],
          "prodClusternodekeyHashes": {
            "10.221.104.171": "8a69ed1bbe095b2c5fcd34c26806793f8740ebcb24e0c7bbd9a9bbae9e848964",
            "10.221.104.172": "b21dfc6a7b33f5fa02d7226e871a900666cd65beeca148a77d0aabe9ed33e7ab"
          },
          "dataCenterName": "datacenterwest",
          "nodeDownThresholdPercentage": 20,
          "dssClusterName": "dss_multi_82"
        }
      },
      "dssCluster": {
        "dss_multi_72": {
          "dssClusterInfo": {
            "cbrNode": "10.221.104.75",
            "nodes": [
              "10.221.104.75",
              "10.221.104.76"
            ],
            "dssClusternodekeyHashes": {
              "10.221.104.75": "14d0288c869d7021a2c855124c4ee5367e3cb6ede8ffc4d74a883ff655ba0c57",
              "10.221.104.76": "ebd134c712ba8c2f8a75ba3c2ce1baf80bbbe199ed50476e2c36f8e84adce294"
            }
          },
          "settings": {
            "jobCleanupTimeoutSec": 3600,
            "dssMinRam": "90909",
            "dssMinStoragePerBkupNode": "10485",
            "concurrentCompactions": "8",
            "sstableloaderMemsize": "4096M",
            "concurrentTransfers": "2",
            "scriptHome": "/tmp/.backups",
            "workingDir": "/home",
            "dssDist": "/tmp/cbrpack",
            "cph": "1",
            "optThreshold": "32",
            "securityMode": "userProvided",
            "verbose": "5",
            "maxLogSize": "1",
            "maxStreamsPerBackupHost": "10"
          }
        },
        "dss_multi_82": {
          "dssClusterInfo": {
            "cbrNode": "10.221.104.175",
            "nodes": [
              "10.221.104.175",
              "10.221.104.176"
            ],
            "dssClusternodekeyHashes": {
              "10.221.104.175": "28d0288c869d7021a2c855124c4ee5367e3cb6ede8ffc4d74a883ff655ba0c21",
              "10.221.104.176": "a8d134c712ba8c2f8a75ba3c2ce1baf80bbbe199ed50476e2c36f8e84adce214"
            }
          },
          "settings": {
            "jobCleanupTimeoutSec": 28800,
            "dssMinRam": "90909",
            "dssMinStoragePerBkupNode": "10485",
            "concurrentCompactions": "8",
            "sstableloaderMemsize": "4096M",
            "concurrentTransfers": "2",
            "scriptHome": "/tmp/.backups",
            "workingDir": "/home",
            "dssDist": "/tmp/cbrpack",
            "cph": "1",
            "optThreshold": "32",
            "securityMode": "userProvided",
            "verbose": "5",
            "maxLogSize": "1",
            "maxStreamsPerBackupHost": "10"
          }
        }
      }
    }
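The sample file links its sections by name: each production cluster names a DSS cluster in dssClusterName, and each node has an entry in the matching key-hash object. The following Python sketch checks those links on an already parsed file. It is an illustration only; the function names are hypothetical, and the rules it enforces (every dssClusterName must appear under dssCluster, every listed node needs a 64-character hexadecimal hash) are assumptions inferred from the sample, not documented NetBackup validation behavior.

```python
import re

# sha256sum prints the digest as 64 lowercase hexadecimal characters.
SHA256_HEX = re.compile(r"[0-9a-f]{64}")


def _check_hashes(where, nodes, hashes):
    """Report nodes that have no hash entry or a malformed hash value."""
    problems = []
    for node in nodes:
        value = hashes.get(node)
        if value is None:
            problems.append(f"{where}: no key hash for node {node}")
        elif not isinstance(value, str) or not SHA256_HEX.fullmatch(value):
            problems.append(f"{where}: hash for node {node} is not 64 hex characters")
    return problems


def check_cross_references(conf):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    dss_clusters = conf.get("dssCluster", {})
    for name, cluster in conf.get("productionCluster", {}).items():
        where = f"productionCluster.{name}"
        problems += _check_hashes(
            where,
            cluster.get("nodes", []),
            cluster.get("prodClusternodekeyHashes", {}),
        )
        dss_name = cluster.get("dssClusterName")
        # Assumption: the name must match a subkey of dssCluster, as in the sample.
        if dss_name not in dss_clusters:
            problems.append(f"{where}: dssClusterName {dss_name!r} not found under dssCluster")
    for name, cluster in dss_clusters.items():
        info = cluster.get("dssClusterInfo", {})
        problems += _check_hashes(
            f"dssCluster.{name}",
            info.get("nodes", []),
            info.get("dssClusternodekeyHashes", {}),
        )
    return problems
```

A hash that was pasted with a stray space or line break fails the 64-character check, which is a common copy-and-paste error for values of this length.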
Enter the RSA key of the CBR node. To obtain the RSA key, log in to the CBR node with the host credentials that you plan to use with the data staging servers and run the following command:
cat /etc/ssh/ssh_host_rsa_key.pub | awk '{print $2}' | base64 -d | sha256sum | awk '{print $1}'
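The command takes the second field of the public key file, decodes it from Base64, and prints the SHA-256 digest of the decoded bytes. The following Python sketch performs the same steps on one line of a public key file, which can help when checking a value outside the node. The function name `host_key_hash` is hypothetical and is not part of NetBackup.

```python
import base64
import hashlib


def host_key_hash(pub_key_line):
    """Return the SHA-256 hex digest of the Base64-decoded key field.

    Mirrors: awk '{print $2}' | base64 -d | sha256sum | awk '{print $1}'
    """
    fields = pub_key_line.split()
    if len(fields) < 2:
        raise ValueError("expected '<key-type> <base64-key> [comment]'")
    # The second whitespace-separated field is the Base64-encoded key data.
    key_blob = base64.b64decode(fields[1], validate=True)
    return hashlib.sha256(key_blob).hexdigest()
```

The result is a 64-character hexadecimal string, which is the form used for the key-hash values in the sample file.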
Table:
Key | Description |
---|---|
productionCluster | Lists one or more Cassandra clusters as subkeys under this key. |
nodes | Lists all the nodes in the Cassandra cluster. Each value must be an IPv4 address; specify multiple nodes as a comma-separated list. |
prodClusternodekeyHashes | Lists all the nodes in the nodes key, each with the SHA-256 hash of its public RSA key. To obtain the hash, log on to the node using the host credentials that you plan to use with the data staging servers or the production nodes, and run the following command: cat /etc/ssh/ssh_host_rsa_key.pub \| awk '{print $2}' \| base64 -d \| sha256sum \| awk '{print $1}' |
dssClusterName | Specifies the name of the DSS cluster to use for the production cluster backup. |
| Specifies the details of the DSS cluster. |
cbrNode | Specifies the IPv4 address of the CBR node, which is used as the coordinator node on the DSS cluster. |
dssClusternodekeyHashes | Lists all the nodes in the nodes key under dssClusterInfo, each with the SHA-256 hash of its public RSA key. |
| Lists the name of the DSS cluster. |
settings | Contains all the settings that apply to that DSS cluster. |
dssMinRam | Sets the minimum RAM requirement for data optimization on the data staging server. |
dssMinStoragePerBkupNode | Sets the minimum storage requirement for data optimization on the data staging server. |
concurrentCompactions | Sets the maximum number of compactions that can run concurrently. |
sstableloaderMemsize | Sets the heap memory size for the Cassandra sstableloader. |
concurrentTransfers | Sets the number of concurrent transfers used for parallel data transfer from the production cluster to the data staging server. The default value is 8. |
scriptHome | Specifies the path of the directory used for CBR package installation on the Cassandra nodes. Note: The path must exist on both the production cluster nodes and the DSS nodes, and the host user account configured with NetBackup for the nodes must have full access to it. |
workingDir | Specifies the path of the directory used for Cassandra data processing. This path contains the schema files, binary files, and DB files. |
dssDist | Specifies the path used as the thin-client distribution directory on the data staging servers. Note: The path must exist on all DSS nodes, and the host user account configured with NetBackup for the DSS nodes must have full access to it. |
cph | Sets the number of connections per backup host from the data staging server cluster. The default value is 8. |
optThreshold | Sets the optimization threshold, which is the maximum number of column families optimized at the same time. Valid values range from 4 to 32. |
securityMode | Set this key to userProvided. This ensures that your RSA keys are validated when connecting to the concerned nodes. |
verbose | Sets the logging level for CBR logging. Valid values range from 1 to 5. |
maxLogSize | Sets the maximum size of the log files. |
maxStreamsPerBackupHost | Sets the maximum number of streams per backup host. It is recommended that the total number of streams for the job match the number of DSS nodes or the number of backup hosts. |
dataCenterName | Specifies the name of the data center to use for the backup. NetBackup backs up only from the nodes in this data center. For better backup and restore performance, specify the data center that is geographically co-located with the NetBackup media or backup host. If this field is left empty, NetBackup backs up data from all data centers in the cluster, irrespective of their geographic location. |
nodeDownThresholdPercentage | Specifies the percentage of nodes in the Cassandra cluster that can be down. If more nodes are down than this percentage, NetBackup fails the backup. This lets you define how many nodes can be down while NetBackup still continues to back up the cluster. |
jobCleanupTimeoutSec | Specifies the timeout, in seconds, after which the next backup is allowed to continue on this cluster. Set this value to the typical time it takes to back up this cluster. If a job fails and NetBackup cannot clean up, this timeout is used to force a cleanup when the next job runs for the same cluster. The next job cleans up the remnant metadata only after this timeout has passed; otherwise, it does not perform the cleanup. |
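Several rows in the table above state value constraints: node addresses must be IPv4, the logging level ranges from 1 to 5, the optimization threshold ranges from 4 to 32, and the security mode must be userProvided. The following Python sketch checks those constraints on a parsed file and illustrates one reading of the node-down threshold. All function names are hypothetical. Treating a missing key as acceptable and comparing the down-node percentage without rounding are assumptions of this sketch, not documented NetBackup behavior.

```python
import ipaddress


def _check_range(where, settings, key, low, high):
    """Check that an optional numeric setting lies between low and high."""
    value = settings.get(key)
    if value is None:
        return []  # Assumption: an absent key is not reported as an error here.
    try:
        number = int(value)  # The sample file stores these numbers as strings.
    except (TypeError, ValueError):
        return [f"{where}: {key} is not an integer: {value!r}"]
    if not low <= number <= high:
        return [f"{where}: {key}={number} is outside {low}..{high}"]
    return []


def check_values(conf):
    """Return a list of constraint violations; an empty list means none were found."""
    problems = []
    for name, cluster in conf.get("productionCluster", {}).items():
        for node in cluster.get("nodes", []):
            try:
                ipaddress.IPv4Address(node)
            except ValueError:
                problems.append(f"productionCluster.{name}: {node!r} is not an IPv4 address")
    for name, cluster in conf.get("dssCluster", {}).items():
        where = f"dssCluster.{name}.settings"
        settings = cluster.get("settings", {})
        problems += _check_range(where, settings, "verbose", 1, 5)
        problems += _check_range(where, settings, "optThreshold", 4, 32)
        if settings.get("securityMode") != "userProvided":
            problems.append(f"{where}: securityMode must be userProvided")
    return problems


def too_many_nodes_down(total_nodes, down_nodes, threshold_pct):
    """Return True when the share of down nodes exceeds the threshold percentage.

    Assumption: a plain percentage comparison with no rounding.
    """
    return down_nodes * 100 > threshold_pct * total_nodes
```

Under this reading, the five-node sample cluster with nodeDownThresholdPercentage set to 25 tolerates one node down (20 percent) but not two (40 percent).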