NetBackup™ for Apache Cassandra Administrator's Guide

Product(s): NetBackup & Alta Data Protection (10.5)

Pre-requisites and best practices

  • Ensure that NetBackup supports the installed Cassandra version. For more information, refer to the Software Compatibility List.

  • Starting with NetBackup version 10.5, the minimum operating system version required for Cassandra nodes is RHEL 8.6.

  • The backup host, data staging server, and Cassandra are supported only on the RHEL platform.

  • NetBackup requires the same distribution (Apache or DataStax) and the same version on the Data Staging Server (DSS) cluster as on the production cluster that is being protected.

    NetBackup supports yum and tar-based deployments for DataStax Cassandra clusters, both in DSS and in production. The DSS and production clusters must use the same deployment type.

  • NetBackup requires the DSS to have around 20% of the number of nodes in the datacenter being protected.

  • SSL authentication is not supported for Apache Cassandra.

  • Tarball deployment is not supported with Apache Cassandra.

  • The DSS should be added to the backup environment so that NetBackup can perform the following:

    • Stage the data to the DSS.

    • Deduplicate the data that is saved to the backup storage.

    • Copy the data to NetBackup media.

  • The DSS should have the same version of Cassandra as the Cassandra production cluster.

  • For NetBackup version 10.3 and later, add the DSS and Cassandra production credentials in the NetBackup credential management console before you add the DSS and Cassandra production clusters. Then, in the add workflow for the DSS and Cassandra production clusters, select the desired credentials from the existing credential list.

  • NetBackup supports SSL, LDAP, and DataStax Cassandra with simple authentication. Use the database username and password to connect to Cassandra and to run utilities such as cqlsh and nodetool. Configure the Cassandra credentials in NetBackup during DSS cluster configuration and Cassandra cluster configuration.

  • Enable SSH on all the Cassandra nodes and DSS nodes.

  • Ensure that the local time of the Cassandra nodes, the DSS, and the backup hosts is synchronized with the NTP server.

  • Configure a non-root host user account for the data staging server cluster in NetBackup credentials management.

    Note:

    The non-root host user account can be separate or the same. It must be a valid account with a home folder and the right to connect to the respective nodes using SSH. Add the host user to the sudoers file on the respective nodes. The database username and password must be the same on the DSS and the application cluster.

  • Before you run a Cassandra backup or restore, ensure that every data staging server receives a successful ping response from the Cassandra nodes and the backup host.

  • Select and update the firewall settings so that the backup hosts, data staging servers, and Cassandra nodes can communicate.

  • Ensure that the specified paths in the DSS cluster configuration exist on all the DSS and Cassandra nodes.

  • Whenever you upgrade Cassandra or make any schema change, such as deleting a keyspace or column family, initiate a full backup before any incremental backup job.

  • Ensure that the specified host user account for the cluster has read and write access to the specified folders in the DSS cluster configuration.

  • Host mapping must be done according to the IP preference.

  • Ensure that the sstableloader utility works between the production nodes and the data staging server.

  • Ensure that the free space and the memory on the DSS are each three times larger than the column family in the Cassandra cluster. Maintain a similar memory size on all the DSS nodes.

    Note:

    The compaction operation on the DSS needs more memory. Deploying more RAM on the DSS nodes results in better backup and restore performance.

  • Maintain a minimum of 20% free space on Cassandra nodes during backup operations.

  • Ensure that the target cluster nodes have enough free space during the restore for the size of the data being restored.

  • Before the restore, ensure that the target Cassandra version is the same as the version you backed up from.

  • Before the restore, ensure that the target cluster and the target Data Staging Server cluster are fully configured in NetBackup.

  • Canceling the parent job in a compound restore job does not cancel the child restore jobs. You must manually cancel the child restore jobs.

  • Ensure that the Connections per host (cph) value is set to 1 in the DSS settings for DataStax Cassandra backups.
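
The sizing guidance in the list above (roughly 20% of the datacenter's nodes as DSS, DSS free space and memory at three times the column family size, and at least 20% free space on Cassandra nodes during backup) can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of NetBackup: the function names are hypothetical, rounding the node count up with a floor of one node is an assumption, and "three times larger" is read here as "at least three times".

```python
def dss_node_count(production_nodes, percent=20):
    """Approximate number of DSS nodes for a datacenter.

    Assumption: round up, and never return fewer than one node.
    Integer arithmetic avoids floating-point rounding surprises.
    """
    if production_nodes < 1:
        raise ValueError("production_nodes must be at least 1")
    return max(1, (production_nodes * percent + 99) // 100)


def dss_has_capacity(column_family_bytes, dss_free_bytes, dss_memory_bytes, factor=3):
    """True if DSS free space and memory are each at least factor x the column family size."""
    required = factor * column_family_bytes
    return dss_free_bytes >= required and dss_memory_bytes >= required


def node_has_backup_headroom(total_bytes, free_bytes, min_free_percent=20):
    """True if a Cassandra node keeps at least min_free_percent of its disk free."""
    return free_bytes * 100 >= total_bytes * min_free_percent


if __name__ == "__main__":
    # Example figures only; substitute values measured on your own clusters.
    print("DSS nodes for a 12-node datacenter:", dss_node_count(12))
    print("DSS capacity OK:", dss_has_capacity(100, 300, 300))
    print("Node headroom OK:", node_has_backup_headroom(1000, 150))
```

The helpers only express the thresholds stated in the list; they do not query Cassandra or NetBackup, so the inputs must come from your own monitoring.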

RBAC permissions for a Cassandra role:

  • Ensure that you assign both create and update permissions to perform the following operations:

    • Add DSS cluster.

    • Add the Apache Cassandra cluster.

    • Add DSS nodes.

    • Edit Apache Cassandra cluster.

  • The database credentials of the DSS cluster should be the same as the Cassandra production cluster.

  • You must disable the requiretty option globally in the sudoers file, by replacing Defaults requiretty with Defaults !requiretty.

    Note:

    This action changes the global sudo configuration.

  • For a tarball-based installation, always start the Cassandra services from the bin path of the tarball installation.

  • For the database user account, if default_scheme is set to internal under authentication_options in the dse.yaml file, specify the internal authentication user. If default_scheme is set to LDAP, specify the LDAP user account.

  • For NetBackup versions upgraded from versions before 10.2.1, you must manually trigger the discovery for both the DSS and production clusters.

  • The database user account configured in NetBackup for the following must have all the required permissions in the cluster:

    • DSS cluster

    • Backup and restore of Cassandra production cluster.

    The user must be able to Create, View, Update, and Drop any resources in the cluster. On the DSS cluster you can provide specific permissions or assign the superuser role to the configured database user account.

  • Ensure that the DSS distribution, working directory, and script home directory paths under Cassandra configuration are not the same.

    Note:

    The working directory path cannot be set to /root.

  • Ensure that you update the secure_path list with the Java executable path in the /etc/sudoers file.

  • Modify the cassandra.yaml file to set the following parameters on all DSS nodes:

    Parameters                              Description/Value
    --------------------------------------  ------------------------------------------------------------

    cluster_name                            Name of the cluster.
                                            cluster_name: <Provide name of DSS cluster>

    num_tokens                              Set num_tokens to 1.
                                            num_tokens: 1

    initial_token                           Calculate and set initial_token using the following command:
                                            python -c "print([str(((2**64 // number_of_nodes_in_cluster) * i) - 2**63) for i in range(number_of_nodes_in_cluster)])"
                                            initial_token: <To be calculated>

    incremental_backups                     Disable incremental_backups.
                                            incremental_backups: false

    snapshot_before_compaction              Disable taking a snapshot before each compaction.
                                            snapshot_before_compaction: false

    auto_snapshot                           Disable auto snapshot.
                                            auto_snapshot: false

    compaction_throughput_mb_per_sec        Disable compaction throttling.
                                            compaction_throughput_mb_per_sec: 0
                                            Note: For Cassandra 4.1 and later, update the parameter and its value to compaction_throughput: 0MiB/s

    hinted_handoff_enabled                  Disable hinted handoff.
                                            hinted_handoff_enabled: false

    cdc_enabled                             Disable CDC functionality.
                                            cdc_enabled: false

    enable_user_defined_functions           Enable user-defined functions.
                                            enable_user_defined_functions: true
                                            Note: For Cassandra 4.1 and later, update the parameter and its value to user_defined_functions_enabled: true

    enable_scripted_user_defined_functions  Enable scripted user-defined functions.
                                            enable_scripted_user_defined_functions: true
                                            Note: For Cassandra 4.1 and later, update the parameter and its value to scripted_user_defined_functions_enabled: true
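
The cassandra.yaml parameters above, including the initial_token formula and the parameter names that change for Cassandra 4.1 and later, can be generated per DSS node. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the version is passed as a tuple such as (4, 1), and assigning token i to node i is an assumption, because the guide gives the formula but not the node order. The sketch only prints the settings; it does not edit cassandra.yaml.

```python
def initial_tokens(node_count):
    """Evenly spaced tokens, using the same formula as the guide's command."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [((2**64 // node_count) * i) - 2**63 for i in range(node_count)]


def dss_yaml_settings(cluster_name, node_index, node_count, cassandra_version):
    """Return the DSS cassandra.yaml settings for one node as a dict.

    cassandra_version is a tuple such as (4, 0) or (4, 1).
    Assumption: the node at position node_index takes the token at that index.
    """
    settings = {
        "cluster_name": cluster_name,
        "num_tokens": 1,
        "initial_token": initial_tokens(node_count)[node_index],
        "incremental_backups": False,
        "snapshot_before_compaction": False,
        "auto_snapshot": False,
        "hinted_handoff_enabled": False,
        "cdc_enabled": False,
    }
    if cassandra_version >= (4, 1):
        # Parameter names and value format for Cassandra 4.1 and later.
        settings["compaction_throughput"] = "0MiB/s"
        settings["user_defined_functions_enabled"] = True
        settings["scripted_user_defined_functions_enabled"] = True
    else:
        settings["compaction_throughput_mb_per_sec"] = 0
        settings["enable_user_defined_functions"] = True
        settings["enable_scripted_user_defined_functions"] = True
    return settings


def render_yaml(settings):
    """Render flat key/value settings as YAML lines (booleans in lowercase).

    Values are written unquoted, so this suits simple names and numbers only.
    """
    lines = []
    for key, value in settings.items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append("{}: {}".format(key, value))
    return "\n".join(lines)


if __name__ == "__main__":
    # Example: second node of a hypothetical two-node DSS cluster on Cassandra 4.1.
    print(render_yaml(dss_yaml_settings("dss_cluster", 1, 2, (4, 1))))
```

Review the printed lines against the parameter list before applying them to each DSS node, because the cluster name and node order here are illustrative values.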