NetBackup™ for Apache Cassandra Administrator's Guide
Pre-requisites and best practices
Ensure that NetBackup supports the installed Cassandra version. For more information, refer to the Software Compatibility List.
Starting with NetBackup version 10.5, the minimum operating system version required for Cassandra nodes is RHEL 8.6.
Backup host, Data staging server, and Cassandra are supported on the RHEL platform only.
NetBackup requires the same distribution (Apache or DataStax) and the same version on the Data Staging Server (DSS) cluster as on the production cluster that is being protected.
NetBackup supports yum-based and tar-based deployments of DataStax Cassandra clusters, both for the DSS and for production. The DSS and production clusters must use the same deployment type.
NetBackup requires a DSS cluster with around 20% of the number of nodes in the datacenter that is being protected.
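As a rough illustration of the 20% sizing guideline, the following sketch estimates a DSS node count from the number of nodes in the protected datacenter. The function name, the rounding up, and the minimum of one node are assumptions made for this example; the guide only states "around 20%".

```python
def suggested_dss_nodes(production_nodes, percent=20):
    """Estimate the DSS node count as a percentage of production nodes.

    Rounding up and the floor of one node are assumptions of this sketch,
    not rules stated in the guide.
    """
    if production_nodes < 1:
        raise ValueError("production_nodes must be at least 1")
    # Integer ceiling division avoids floating-point rounding surprises.
    return max(1, (production_nodes * percent + 99) // 100)


if __name__ == "__main__":
    for nodes in (5, 10, 12, 15):
        print(nodes, "production nodes ->", suggested_dss_nodes(nodes), "DSS nodes")
```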
SSL authentication is not supported for Apache Cassandra.
Tarball deployment is not supported with Apache Cassandra.
The DSS should be added to the backup environment so that NetBackup can perform the following:
Stage the data to the DSS.
Deduplicate the data that is saved to the backup storage.
Copy the data to NetBackup media.
The DSS should have the same version of Cassandra as the Cassandra production cluster.
For NetBackup version 10.3 and later, the DSS and Cassandra production credentials must be added in the NetBackup credential management console before you add the DSS and Cassandra production clusters. Then, in the workflow for adding the DSS and Cassandra production clusters, select the desired credentials from the existing credential list.
NetBackup supports DataStax Cassandra with SSL, LDAP, and simple authentication. Use the database username and password to connect to Cassandra and to run utilities such as cqlsh and nodetool. Configure the Cassandra credentials in NetBackup during DSS cluster configuration and Cassandra cluster configuration.
Enable SSH on all the Cassandra nodes and DSS nodes.
Ensure that the local time on the Cassandra nodes, the DSS, and the backup hosts is synchronized with the NTP server.
Configure a non-root host user account for the data staging server cluster in NetBackup credentials management.
Note:
The non-root host user account can be separate or the same. It must be a valid account with a home folder and the right to connect to the respective nodes using ssh. Add the host user to the sudoers file on the respective nodes. The database username and password must be the same on the DSS and the application cluster.
Before you run a Cassandra backup or restore, ensure that you receive a successful ping response from all the data staging servers to the Cassandra nodes and the backup host.
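The ping check described above covers every pairing of a data staging server with a Cassandra node or the backup host. A minimal sketch of how such a check list could be generated is shown here; the host names are hypothetical, and how each command is run on its source DSS host (for example, over ssh with the host user account) is left to the operator.

```python
from itertools import product


def ping_plan(dss_hosts, cassandra_nodes, backup_host, count=1):
    """List (source, target, command) entries for the connectivity check.

    Each command is meant to be run on the source DSS host; this sketch
    only builds the list and does not execute anything.
    """
    targets = list(cassandra_nodes) + [backup_host]
    return [
        (source, target, ["ping", "-c", str(count), target])
        for source, target in product(dss_hosts, targets)
    ]


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    plan = ping_plan(["dss1", "dss2"], ["cass1", "cass2", "cass3"], "backuphost1")
    for source, target, command in plan:
        print(f"on {source}: {' '.join(command)}")
```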
Update the firewall settings so that the backup hosts, data staging servers, and Cassandra nodes can communicate.
Ensure that the specified paths in the DSS cluster configuration exist on all the DSS and Cassandra nodes.
Whenever you upgrade Cassandra or make a schema change, such as deleting a keyspace or column family, run a full backup before any incremental backup job.
Ensure that the specified host user account for the cluster has read and write access to the specified folders in the DSS cluster configuration.
Host mapping must be done according to the IP preference.
Ensure that the sstableloader utility works between the production nodes and the data staging server.
Ensure that the free space and the memory on the DSS are three times larger than the column family in the Cassandra cluster. Maintain a similar memory size on all the DSS nodes.
Note:
The compaction operation on the DSS needs more memory. Deploying more RAM on the DSS nodes results in better backup and restore performance.
Maintain a minimum of 20% free space on the Cassandra nodes during backup operations.
Ensure that the target cluster nodes have enough free space during the restore for the size of the data being restored.
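The capacity guidelines above (three times the column family size on the DSS, 20% free space on Cassandra nodes during backup, and enough room for the restored data) can be expressed as simple checks. This is a sketch only: the function names are hypothetical, "three times larger" is read here as "at least three times the size", and all sizes are assumed to be in bytes.

```python
import shutil


def dss_capacity_ok(free_bytes, memory_bytes, column_family_bytes, factor=3):
    """True if DSS free space and memory are each at least factor x the column family size."""
    required = factor * column_family_bytes
    return free_bytes >= required and memory_bytes >= required


def node_free_space_ok(total_bytes, free_bytes, min_percent=20):
    """True if at least min_percent of the node's disk is free (integer arithmetic)."""
    return free_bytes * 100 >= min_percent * total_bytes


def restore_space_ok(free_bytes, restore_bytes):
    """True if the target node has room for the data being restored."""
    return free_bytes >= restore_bytes


def path_free_space_ok(path, min_percent=20):
    """Apply the free-space check to the file system that holds path."""
    usage = shutil.disk_usage(path)
    return node_free_space_ok(usage.total, usage.free, min_percent)


if __name__ == "__main__":
    gib = 1024 ** 3
    print(dss_capacity_ok(400 * gib, 384 * gib, 100 * gib))
    print(path_free_space_ok("."))
```

The first three functions take plain numbers so that the values can come from any monitoring source; only `path_free_space_ok` reads the local file system.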
Before the restore, ensure that the target Cassandra version is the same as the version you backed up from.
Before the restore, ensure that the target cluster and the target Data Staging Server cluster are fully configured in NetBackup.
Canceling a parent job in a compound restore job does not cancel the child restore job. You must manually cancel the child restore jobs.
Ensure that the Connections per host (cph) value is set to 1 in the DSS settings for DataStax Cassandra backup.
RBAC permissions for a Cassandra role:
Ensure that you assign both create and update permissions to perform the following:
Add DSS cluster.
Add the Apache Cassandra cluster.
Add DSS nodes.
Edit Apache Cassandra cluster.
The database credentials of the DSS cluster should be the same as the Cassandra production cluster.
You must disable the requiretty option globally in the sudoers file by replacing Defaults requiretty with Defaults !requiretty.
Note:
This action changes the global sudo configuration.
For a tarball-based installation, you must always start the Cassandra services from the bin path of the tarball installation.
For the database user account, if default_scheme under authentication_options in the dse.yaml file selects internal authentication, specify the internal authentication user. If default_scheme selects LDAP, specify the LDAP user account.
For NetBackup versions upgraded from versions before 10.2.1, you need to trigger the discovery manually for both the DSS and the production cluster.
The database user account configured in NetBackup for the following must have all the required permissions in the cluster:
DSS cluster
Backup and restore of Cassandra production cluster.
The user must be able to Create, View, Update, and Drop any resources in the cluster. On the DSS cluster you can provide specific permissions or assign the superuser role to the configured database user account.
Ensure that the DSS distribution, working directory, and script home directory paths under Cassandra configuration are not the same.
Note:
The working directory path cannot be set as /root.
Ensure that you update the secure_path list with the Java executable path in the /etc/sudoers file.
Modify the cassandra.yaml file to set the following parameters on all DSS nodes. Each entry lists the parameter, its description, and the value to set:
cluster_name
Name of the cluster.
cluster_name: <Provide name of DSS cluster>
num_tokens
Set num_tokens as 1.
num_tokens: 1
initial_token
Calculate and set initial_token using the following command. Replace number_of_nodes_in_cluster with the number of nodes in the cluster:
python -c "print([str(((2**64 // number_of_nodes_in_cluster) * i) - 2**63) for i in range(number_of_nodes_in_cluster)])"
initial_token: <To be calculated>
incremental_backups
Disable incremental_backups.
incremental_backups: false
snapshot_before_compaction
Disable taking a snapshot before each compaction.
snapshot_before_compaction: false
auto_snapshot
Disable auto snapshot.
auto_snapshot: false
compaction_throughput_mb_per_sec
Disable compaction throttling.
compaction_throughput_mb_per_sec: 0
Note:
For Cassandra 4.1 and later, update the parameter and its value to compaction_throughput: 0MiB/s
hinted_handoff_enabled
Disable hinted handoff.
hinted_handoff_enabled: false
cdc_enabled
Disable CDC functionality.
cdc_enabled: false
enable_user_defined_functions
Enable user-defined functions.
enable_user_defined_functions: true
Note:
For Cassandra 4.1 and later, update the parameter and its value to user_defined_functions_enabled: true
enable_scripted_user_defined_functions
Enable scripted user-defined functions.
enable_scripted_user_defined_functions: true
Note:
For Cassandra 4.1 and later, update the parameter and its value to scripted_user_defined_functions_enabled: true
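The parameter list above can be summarized as a small helper that computes the evenly spaced initial_token values and picks the parameter names that match the Cassandra version. This is a sketch under stated assumptions: the function names are hypothetical, the version is passed as a (major, minor) tuple, and assigning each node the token at its own index is an assumption of this example rather than a rule stated in the guide.

```python
def initial_tokens(node_count):
    """Evenly spaced tokens, one per node, using the same formula as the guide's command."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    step = 2**64 // node_count
    return [step * i - 2**63 for i in range(node_count)]


def dss_yaml_settings(cluster_name, node_count, node_index, version):
    """Return the cassandra.yaml values from the list above for one DSS node.

    version is a (major, minor) tuple such as (4, 0) or (4, 1).
    """
    settings = {
        "cluster_name": cluster_name,
        "num_tokens": 1,
        # Assumption: node i takes the i-th calculated token.
        "initial_token": initial_tokens(node_count)[node_index],
        "incremental_backups": False,
        "snapshot_before_compaction": False,
        "auto_snapshot": False,
        "hinted_handoff_enabled": False,
        "cdc_enabled": False,
    }
    if version >= (4, 1):
        # Parameter names and value format for Cassandra 4.1 and later.
        settings["compaction_throughput"] = "0MiB/s"
        settings["user_defined_functions_enabled"] = True
        settings["scripted_user_defined_functions_enabled"] = True
    else:
        settings["compaction_throughput_mb_per_sec"] = 0
        settings["enable_user_defined_functions"] = True
        settings["enable_scripted_user_defined_functions"] = True
    return settings


if __name__ == "__main__":
    # "dss_cluster" is a placeholder name for illustration.
    for key, value in dss_yaml_settings("dss_cluster", 4, 1, (4, 1)).items():
        print(f"{key}: {value}")
```

The printed values use Python's representation (for example, True rather than true), so treat the output as a checklist for editing cassandra.yaml rather than as a file to paste in directly.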