NetBackup™ Deduplication Guide
About Cloud Catalyst migration strategies
Multiple strategies are available for migrating from Cloud Catalyst to MSDP direct cloud tiering. The best strategy for an installation depends on factors such as type of cloud storage (public versus private, standard versus cold storage class) and data retention requirements.
Three of these strategies can be adopted with NetBackup 8.3 and later releases; the fourth, direct migration, requires release 10.0 or later. Each strategy has advantages and disadvantages that you should review to choose the best fit for your environment.
The four strategies for migrating from Cloud Catalyst to MSDP direct cloud tiering are as follows:
Natural expiration strategy - Available in NetBackup release 8.3 and later.
Image duplication strategy - Available in NetBackup release 8.3 and later.
Combination strategy - Available in NetBackup release 8.3 and later.
Direct migration strategy - Available in NetBackup release 10.0 and later.
Natural expiration strategy
This strategy works in any environment. To use this strategy, first configure a new NetBackup 8.3 or later MSDP direct cloud tier storage server, or add an MSDP direct cloud tier disk pool and storage unit to an existing NetBackup 8.3 or later MSDP storage server (verify that the server has enough capacity). Next, modify the storage lifecycle policies and backup policies to use the new MSDP direct cloud tier storage. Once all new duplication or backup jobs write to the new MSDP direct cloud tier storage, the images on the old Cloud Catalyst storage gradually expire. After all those images have expired, the Cloud Catalyst server can be retired or repurposed.
The advantages of the natural expiration strategy are as follows:
Available with NetBackup version 8.3 and later. This strategy gives you the improved performance, reliability, usability, and flexibility of MSDP direct cloud tier, and it can be used without upgrading to NetBackup 10.0.
Can be implemented gradually using new MSDP direct cloud tier storage servers while Cloud Catalyst storage servers continue to be used.
Can be used for all environments including public cloud cold storage (for example: AWS Glacier or AWS Glacier Deep Archive).
All new data is uploaded with MSDP direct cloud tiering, which uses cloud storage more efficiently than Cloud Catalyst. The long-term total cloud storage usage and cost may be reduced.
The disadvantages of the natural expiration strategy are as follows:
Until all the old Cloud Catalyst images have been expired and deleted, there is some duplication of data in cloud storage. This duplication can occur between the old Cloud Catalyst images and new MSDP direct cloud tier images. Additional storage costs could be incurred if you use a public cloud environment.
Requires a separate server.
Cloud Catalyst servers must be maintained until all uploaded images from those servers have expired or are otherwise no longer needed.
Image duplication strategy
This strategy works in most environments except those using public cloud cold storage (for example: AWS Glacier or AWS Glacier Deep Archive). To use this strategy, first configure a new NetBackup 8.3 or later MSDP direct cloud tier storage server, or add an MSDP direct cloud tier disk pool and storage unit to an existing NetBackup 8.3 or later MSDP storage server (verify that the server has enough capacity). Next, modify the storage lifecycle policies and backup policies to use the new MSDP direct cloud tier storage. Once all new duplication or backup jobs write to the new MSDP direct cloud tier storage, move the existing images on the old Cloud Catalyst storage to the new MSDP direct cloud tier storage with a manually initiated bpduplicate command. After all existing images have been moved, the Cloud Catalyst server can be retired or repurposed.
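The manual duplication step can be scripted. The following is a minimal sketch, not a supported procedure: it only assembles a bpduplicate command line for one image and prints it for review. The backup ID and storage unit name are hypothetical placeholders, and the sketch assumes that the -backupid and -dstunit options are all your environment needs; check the NetBackup Commands Reference Guide for the options that apply to your release.

```python
import shlex


def build_bpduplicate_command(backup_id, destination_storage_unit):
    """Return the argument list that duplicates one image to a storage unit.

    Only the command line is built here; nothing is executed.
    """
    if not backup_id or not destination_storage_unit:
        raise ValueError("backup_id and destination_storage_unit are required")
    return [
        "bpduplicate",
        "-backupid", backup_id,                  # image to copy
        "-dstunit", destination_storage_unit,    # new MSDP direct cloud tier storage unit
    ]


if __name__ == "__main__":
    # Both values are hypothetical placeholders, not names from a real system.
    command = build_bpduplicate_command("client1_1650000000", "msdp-cloud-stu")
    # Print the command so an administrator can review it before running it.
    print(shlex.join(command))
```

Building the command as a list keeps each argument separate, so the same list can later be passed to a process launcher without shell quoting problems.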
The advantages of the image duplication strategy are as follows:
Available with NetBackup version 8.3 and later. This strategy gives you the improved performance, reliability, usability, and flexibility of MSDP direct cloud tier, and it can be used without upgrading to NetBackup 10.0.
Can be implemented gradually using new MSDP direct cloud tier storage servers while Cloud Catalyst storage servers continue to be used.
All new and all old Cloud Catalyst data is uploaded with MSDP direct cloud tiering, which uses cloud storage more efficiently than Cloud Catalyst. The long-term total cloud storage usage and cost may be reduced.
The disadvantages of the image duplication strategy are as follows:
Public cloud cold storage environments (for example: AWS Glacier or AWS Glacier Deep Archive) support restore from the cloud but do not support duplication from the cloud, so this strategy cannot be used.
If public cloud storage is used, potentially significant data egress charges are incurred when old Cloud Catalyst images are read to duplicate them to the new MSDP direct cloud tier storage.
Additional network traffic to and from the cloud occurs when the old Cloud Catalyst images are duplicated to the new MSDP direct cloud tier storage.
Until all old Cloud Catalyst images have been moved to MSDP direct cloud tier storage, there is some duplication of data in cloud storage. This duplication can occur between the old Cloud Catalyst images and new MSDP direct cloud tier images. Additional costs could be incurred if you use a public cloud environment.
Requires a separate server.
Cloud Catalyst servers must be maintained until all uploaded images from those servers have been moved to the new MSDP direct cloud tier storage or are otherwise no longer needed.
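To gauge the egress charges mentioned above, a rough estimate multiplies the amount of data read from the cloud by the provider's egress price. The sketch below is illustrative only: the function name, the per-GiB price parameter, and the example figures are assumptions, and real provider pricing is often tiered and may include request charges.

```python
def estimate_egress_cost(bytes_read_from_cloud, price_per_gib):
    """Estimate the egress charge for reading data back from cloud storage.

    bytes_read_from_cloud: bytes read during duplication (assumed known).
    price_per_gib: egress price per GiB in your currency (assumed flat rate).
    """
    if bytes_read_from_cloud < 0 or price_per_gib < 0:
        raise ValueError("inputs must be non-negative")
    gib_read = bytes_read_from_cloud / 2**30
    return gib_read * price_per_gib


if __name__ == "__main__":
    # Hypothetical figures: 50 TiB read at a hypothetical flat price per GiB.
    cost = estimate_egress_cost(50 * 2**40, 0.05)
    print(f"Estimated egress cost: {cost:.2f}")
```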
Combination strategy
This strategy works in most environments except those using public cloud cold storage (for example: AWS Glacier or AWS Glacier Deep Archive). This strategy is a combination of the previous two strategies. To use this strategy, first configure a new NetBackup 8.3 or later MSDP direct cloud tier storage server, or add an MSDP direct cloud tier disk pool and storage unit to an existing NetBackup 8.3 or later MSDP storage server (verify that the server has enough capacity). Next, modify the storage lifecycle policies and backup policies to use the new MSDP direct cloud tier storage. Once all new duplication or backup jobs write to the new MSDP direct cloud tier storage, the oldest images on the old Cloud Catalyst storage gradually expire. When the number of remaining unexpired images on the old Cloud Catalyst storage drops below a threshold that you determine, move those remaining images to the new MSDP direct cloud tier storage with a manually initiated bpduplicate command. After all remaining images have been moved, the Cloud Catalyst server can be retired or repurposed.
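The threshold check in the combination strategy can be sketched as follows. This is an illustration only: it assumes you have already exported each image's backup ID and expiration time into a dictionary by some means, and the function names, sample IDs, and threshold value are hypothetical.

```python
from datetime import datetime


def remaining_images(expirations, now):
    """Return the backup IDs whose expiration time is still in the future."""
    return [image_id for image_id, expires in expirations.items() if expires > now]


def should_duplicate_remaining(expirations, now, threshold):
    """Decide whether to move the remaining images now.

    Returns (decision, remaining). The decision is True when at least one
    unexpired image remains and the count has dropped below the threshold.
    """
    remaining = remaining_images(expirations, now)
    return 0 < len(remaining) < threshold, remaining


if __name__ == "__main__":
    # Hypothetical backup IDs and expiration times.
    expirations = {
        "client1_1600000000": datetime(2023, 1, 1),
        "client1_1650000000": datetime(2030, 1, 1),
        "client2_1660000000": datetime(2031, 6, 1),
    }
    decision, remaining = should_duplicate_remaining(
        expirations, now=datetime(2025, 1, 1), threshold=5
    )
    print(decision, remaining)
```

A lower threshold lets more images expire naturally, which reduces egress and network traffic, at the price of keeping the Cloud Catalyst server in service longer.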
The advantages of the combination strategy are as follows:
Available with NetBackup version 8.3 and later. This strategy gives you the improved performance, reliability, usability, and flexibility of MSDP direct cloud tier, and it can be used without upgrading to NetBackup 10.0.
Can be implemented gradually using new MSDP direct cloud tier storage servers while Cloud Catalyst storage servers continue to be used.
All new data and all old Cloud Catalyst data are uploaded with MSDP direct cloud tiering, which uses cloud storage more efficiently than Cloud Catalyst. The long-term total cloud storage usage and cost may be reduced.
Enables retiring of the old Cloud Catalyst servers before all images on those servers have expired.
The disadvantages of the combination strategy are as follows:
Public cloud cold storage environments (for example: AWS Glacier or AWS Glacier Deep Archive) support restore from the cloud but do not support duplication from the cloud, so this strategy cannot be used.
If public cloud storage is used, potentially significant data egress charges are incurred when old Cloud Catalyst images are read to duplicate them to the new MSDP direct cloud tier storage.
Additional network traffic to and from the cloud occurs when the old Cloud Catalyst images are duplicated to the new MSDP direct cloud tier storage.
Until all Cloud Catalyst images have expired or have been moved to MSDP direct cloud tier storage, there is some duplication of data in cloud storage. This duplication can occur between the old Cloud Catalyst images and new MSDP direct cloud tier images, so additional costs could be incurred if you use a public cloud environment.
Requires a separate server.
Cloud Catalyst servers must be maintained until all uploaded images from those servers have expired, have been moved to the new MSDP direct cloud tier, or are no longer needed.
Direct migration strategy
This strategy is available in NetBackup 10.0 and later releases and can work in any environment. To use this strategy, first configure a new MSDP direct cloud tier storage server using the latest release. Alternatively, the existing Cloud Catalyst server can be reimaged and reinstalled as a new MSDP direct cloud tier storage server using the latest release. If you reuse an existing server, that server must meet the minimum requirements for an MSDP direct cloud tier storage server.
See About the media server deduplication (MSDP) node cloud tier.
See Planning your MSDP deployment.
Note that reusing the existing server is not an upgrade; it is a remove and reinstall operation. Once the new MSDP direct cloud tier storage server is available, use the nbdecommission -migrate_cloudcatalyst utility to create a new MSDP direct cloud tier that can reference the data previously uploaded to cloud storage by Cloud Catalyst. After the utility has run and the migration process is complete, the new MSDP direct cloud tier can be used for new backup and duplication operations and for restore operations of older Cloud Catalyst images.
For more information about the nbdecommission command, see the NetBackup Commands Reference Guide.
The advantages of the direct migration strategy are as follows:
Can be used for all environments including public cloud cold storage (for example: AWS Glacier or AWS Glacier Deep Archive).
Does not require a separate server since the Cloud Catalyst server can be reimaged as an MSDP direct cloud tier server and used for migration.
The disadvantages of the direct migration strategy are as follows:
Cannot be implemented gradually using the new MSDP direct cloud tier storage servers while Cloud Catalyst storage servers continue to be used for new backup or duplication jobs. The old Cloud Catalyst storage server cannot be used for new backup or duplication jobs while the migration process is running.
Cloud Catalyst uses cloud storage less efficiently than MSDP direct cloud tier, especially Cloud Catalyst in NetBackup versions older than 8.2. Because this strategy continues to use existing Cloud Catalyst objects for new MSDP direct cloud tier images, some of the cloud storage efficiency that is gained with MSDP direct cloud tier is not realized.
Requires a new MSDP server. An existing MSDP server cannot be used, and consolidation of any Cloud Catalyst servers is not possible.
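The availability constraints described in this topic can be summarized in a small helper. This is a sketch under stated assumptions: the function and parameter names are hypothetical, the version is passed as a (major, minor) tuple, and only the release and cold-storage constraints are modeled. Cost, egress, and server requirements still need to be weighed separately.

```python
def eligible_strategies(netbackup_version, uses_public_cloud_cold_storage):
    """List the migration strategies that the stated constraints allow.

    netbackup_version: (major, minor) tuple, for example (8, 3) or (10, 0).
    uses_public_cloud_cold_storage: True for cold storage classes such as
    AWS Glacier, which support restore but not duplication from the cloud.
    """
    strategies = []
    if netbackup_version >= (8, 3):
        strategies.append("natural expiration")
        # Duplication from the cloud is not supported on cold storage.
        if not uses_public_cloud_cold_storage:
            strategies.append("image duplication")
            strategies.append("combination")
    if netbackup_version >= (10, 0):
        strategies.append("direct migration")
    return strategies


if __name__ == "__main__":
    print(eligible_strategies((8, 3), uses_public_cloud_cold_storage=True))
    print(eligible_strategies((10, 0), uses_public_cloud_cold_storage=False))
```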