NetBackup™ for Apache Cassandra Administrator's Guide
NetBackup Apache Cassandra Protection Architecture
In this architecture:
The NetBackup primary server holds the backup policies and schedules. It is responsible for managing backup jobs.
The NetBackup media server stores the backup data. All NetBackup backup targets are supported for Cassandra protection.
Data staging servers perform off-host processing of Cassandra data to:
Determine a cluster-consistent point-in-time.
Remove replica records.
Remove stale data caused by record overwrites.
To perform off-host processing, Cassandra must be installed on the data staging servers. NetBackup expects a Cassandra cluster of the same distribution and version as the application cluster to be configured on the data staging servers. If the DataStax application cluster uses SSL-based authentication, an LDAP configuration, or both, configure the same authentication on the data staging servers, using the same root CA certificate as the application cluster. Keep the Cassandra version on the data staging servers in step with the version on the application clusters.
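The off-host processing goals listed above (a consistent point in time, no replica copies, no stale overwritten data) can be illustrated with a minimal Python sketch. It is a conceptual model only, not NetBackup's implementation: the Record type, the consolidate function, and the use of a single cutoff timestamp to stand in for the cluster-consistent point in time are all assumptions made for illustration. It relies on the fact that Cassandra resolves conflicting writes by keeping the value with the newest write timestamp.

```python
from typing import Dict, Iterable, NamedTuple


class Record(NamedTuple):
    """Hypothetical, simplified view of one copy of a row read from a node."""
    key: str         # row key
    value: str       # payload
    write_time: int  # write timestamp of this copy


def consolidate(records: Iterable[Record], cutoff: int) -> Dict[str, Record]:
    """Keep one record per key: the newest write at or before the cutoff.

    - Copies written after the cutoff are ignored (point-in-time view).
    - Identical copies from different replicas collapse into one entry.
    - Older values that were later overwritten are dropped.
    """
    latest: Dict[str, Record] = {}
    for rec in records:
        if rec.write_time > cutoff:
            continue  # newer than the chosen point in time
        current = latest.get(rec.key)
        if current is None or rec.write_time > current.write_time:
            latest[rec.key] = rec
    return latest


# Example input: key "a" has two replica copies of an old value plus an
# overwrite; key "b" has one write before and one after the cutoff.
sample = [
    Record("a", "v1", 10),
    Record("a", "v1", 10),
    Record("a", "v2", 20),
    Record("b", "x", 15),
    Record("b", "y", 40),
]
result = consolidate(sample, cutoff=30)
```

With a cutoff of 30, the result holds one record for "a" (value "v2") and one for "b" (value "x"), which is the reduced data set that would then be sent on for backup in this simplified model.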
One of the data staging server nodes is set up as the Cassandra Backup and Restore (CBR) node. The CBR node performs the entire orchestration required for backup and restore.
During backup, the production data is copied to the data staging servers. The data is then deduplicated and transferred to the backup hosts or NetBackup media servers. One data stream is written per data staging server (DSS). If you have multiple DSS nodes, data is streamed from these nodes in parallel. For maximum performance, NetBackup recommends that the backup hosts collectively provide at least as many streams as there are DSS nodes. That is: number of streams per backup host × number of backup hosts >= number of data staging servers.
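The stream sizing rule above is simple arithmetic, and a short Python sketch can make it concrete. The function names and parameters here are hypothetical helpers for planning purposes, not NetBackup settings or commands.

```python
def min_streams_per_backup_host(dss_nodes: int, backup_hosts: int) -> int:
    """Smallest per-host stream count that satisfies
    streams_per_host * backup_hosts >= dss_nodes."""
    if dss_nodes < 1 or backup_hosts < 1:
        raise ValueError("dss_nodes and backup_hosts must be at least 1")
    # Integer ceiling division: round up so every DSS node gets a stream.
    return -(-dss_nodes // backup_hosts)


def streams_are_sufficient(streams_per_host: int, backup_hosts: int,
                           dss_nodes: int) -> bool:
    """Check the recommendation directly for a given configuration."""
    return streams_per_host * backup_hosts >= dss_nodes
```

For example, with 5 data staging servers and 2 backup hosts, each backup host needs at least 3 streams (3 × 2 = 6 >= 5); 2 streams per host would give only 4 and leave one DSS node waiting.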
During restore, the data is staged from the NetBackup media servers onto the data staging servers. The staged data is then restored into the Cassandra production cluster according to the number of replicas and data centers configured for the keyspace being restored.
During restore, you can choose to:
Restore the entire Cassandra cluster.
Restore some keyspaces and/or column families.
Rename some keyspaces and/or column families.
Reconfigure the replication settings for the data being restored.