NetBackup™ Deduplication Guide
About the configuration items in cloud.json, contentrouter.cfg, and spa.cfg
The cloud.json file is available at: <STORAGE>/etc/puredisk/cloud.json
The file has the following parameters:
Parameter | Details | Default value |
---|---|---|
UseMemForUpload | If set to true, the upload cache directory is mounted in memory as tmpfs. This is especially useful for high-speed cloud storage where disk speed is the bottleneck. It can also reduce disk contention with the local LSU. The default value is true if enough system memory is available. | true |
CachePath | The path of the cache. It is created under an MSDP volume that is selected according to the space usage of the MSDP volumes. It reserves some space that the local LSU cannot write beyond. You usually do not need to change this path. However, when some volumes have much more free space than others, multiple cloud LSUs may be placed on the same disk volume. For better performance, you can change this option to distribute them across different volumes. This path can also be changed to reside on a non-MSDP volume. | NA |
UploadCacheGB | The maximum space usage of the upload cache. The upload cache is a subdirectory named "upload" under CachePath. For performance, set it to a value larger than: (maximum number of concurrent write streams) * MaxFileSizeMB * 2. For example, 100 concurrent streams need about 13 GB. Note: When you add a new cloud LSU, the initial value of UploadCacheGB is equal to CloudUploadCacheSize. You can later change this value. | 12 |
DownloadDataCacheGB | The maximum space usage of cached data files. Note: When you add a new cloud LSU, the initial value of DownloadDataCacheGB is equal to CloudDataCacheSize. You can later change this value. | 500 |
DownloadMetaCacheGB | The maximum space usage of cached metadata files. Note: When you add a new cloud LSU, the initial value of DownloadMetaCacheGB is equal to CloudMetaCacheSize. You can later change this value. | 500 |
MapCacheGB | The maximum space usage of the map cache. Note: When you add a new cloud LSU, the initial value of MapCacheGB is equal to CloudMapCacheSize. You can later change this value. | 5 |
UploadConnNum | Maximum number of concurrent connections to the cloud provider for uploading. Increasing this value helps especially on high-latency networks. | 60 |
DataDownloadConnNum | Maximum number of concurrent connections to the cloud provider for downloading data. Increasing this value helps especially on high-latency networks. | 40 |
MetaDownloadConnNum | Maximum number of concurrent connections to the cloud provider for downloading metadata. Increasing this value helps especially on high-latency networks. | 40 |
MapConnNum | Maximum number of concurrent connections to the cloud provider for downloading the map. | 40 |
DeleteConnNum | Maximum number of concurrent connections to the cloud provider for deleting. Increasing this value helps especially on high-latency networks. | 100 |
KeepData | Keep uploaded data in the data cache. The value is always false if UseMemForUpload is true. | false |
KeepMeta | Keep uploaded metadata in the meta cache. The value is always false if UseMemForUpload is true. | false |
ReadOnly | The LSU is read-only. You cannot write to or delete from this LSU. | false |
MaxFileSizeMB | Maximum size of a bin file, in MB. | 64 |
WriteThreadNum | The number of threads that write data to the data container in parallel, which can improve I/O performance. | 2 |
RebaseThresholdMB | Rebasing threshold, in MB. When a container holds less image data than this threshold, none of the image data in that container is used for deduplication, which achieves good locality. Allowed values: 0 to half of MaxFileSizeMB. A value of 0 disables rebasing. | 4 |
AgingCheckContainerIntervalDay | The interval of checking a container for this cloud LSU (in days). Note: For an upgraded system, you must add this parameter manually if you want to change the value for a cloud LSU. | 180 |
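The UploadCacheGB sizing rule in the table can be checked with a few lines of Python. This is a minimal sketch, not a NetBackup tool: the function name is hypothetical, and it assumes 1 GB = 1024 MB and that the value must be a whole number of GB strictly larger than the computed minimum.

```python
def min_upload_cache_gb(max_concurrent_streams: int, max_file_size_mb: int = 64) -> int:
    """Smallest whole-GB value strictly larger than streams * MaxFileSizeMB * 2.

    Assumption: 1 GB is treated as 1024 MB.
    """
    required_mb = max_concurrent_streams * max_file_size_mb * 2
    # Integer division plus one gives the next whole GB above the requirement.
    return required_mb // 1024 + 1


# 100 streams * 64 MB * 2 = 12800 MB = 12.5 GB, so 13 GB satisfies the rule.
print(min_upload_cache_gb(100))  # prints 13
```

With the default MaxFileSizeMB of 64, the default UploadCacheGB of 12 falls just short of this rule once you reach 100 concurrent write streams.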
The contentrouter.cfg file is available at: <STORAGE>/etc/puredisk/contentrouter.cfg
The file has the following parameters:
Parameter | Details | Default value |
---|---|---|
CloudDataCacheSize | Default data cache size when adding Cloud LSU. Decrease this value if enough free space is not available. | 500 GiB |
CloudMapCacheSize | Default map cache size when adding Cloud LSU. Decrease this value if enough free space is not available. | 5 GiB |
CloudMetaCacheSize | Default meta cache size when adding Cloud LSU. Decrease this value if enough free space is not available. | 500 GiB |
CloudUploadCacheSize | Default upload cache size when adding Cloud LSU. The minimum value is 12 GiB. | 12 GiB |
MaxCloudCacheSize | Specify the maximum cloud cache size as a percentage of total system memory, excluding swap space. | 20% |
CloudBits | The number of top-level entries in the cloud cache. This number is (2^CloudBits). Increasing this value improves cache performance, at the expense of extra memory usage. Minimum value = 16, maximum value = 48. | Auto-sized according to MaxCloudCacheSize |
DCSCANDownloadTmpPath | When you use dcscan to check a cloud LSU, data is downloaded to this folder. For details, see the dcscan tool in the cloud support section. | disabled |
UsableMemoryLimit | Specify the maximum usable memory size as a percentage. MaxCacheSize + MaxCloudCacheSize + cloud in-memory upload cache size must be less than or equal to the value of UsableMemoryLimit. | 80% |
MaxSamplingCacheSize | Specify the maximum sampling cache size in percentage for all cloud LSUs here. UsableMemoryLimit + MaxSamplingCacheSize must be less than or equal to 95%. If you want to limit the maximum sampling cache size for a cloud LSU, you can configure LSUSamplingCachePercent in | 5% |
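The two memory constraints in the table (for UsableMemoryLimit and MaxSamplingCacheSize) can be validated before you edit contentrouter.cfg. The following is a minimal sketch under stated assumptions: the function name is hypothetical, every argument is a percentage of total system memory, and the caller has already converted the in-memory upload cache size to such a percentage. The value of MaxCacheSize is not given in this section, so it must be supplied.

```python
def check_memory_budget(
    max_cache_pct: float,
    max_cloud_cache_pct: float,
    upload_cache_pct: float,
    usable_memory_limit_pct: float = 80.0,
    max_sampling_cache_pct: float = 5.0,
) -> list[str]:
    """Return a list of violated constraints (empty if the settings fit)."""
    problems = []
    # Constraint 1: MaxCacheSize + MaxCloudCacheSize + in-memory upload cache
    # must not exceed UsableMemoryLimit.
    used = max_cache_pct + max_cloud_cache_pct + upload_cache_pct
    if used > usable_memory_limit_pct:
        problems.append(
            f"caches use {used}% but UsableMemoryLimit is {usable_memory_limit_pct}%"
        )
    # Constraint 2: UsableMemoryLimit + MaxSamplingCacheSize must not exceed 95%.
    total = usable_memory_limit_pct + max_sampling_cache_pct
    if total > 95.0:
        problems.append(
            f"UsableMemoryLimit + MaxSamplingCacheSize is {total}%, above 95%"
        )
    return problems


# Hypothetical values: 50% MaxCacheSize, 20% MaxCloudCacheSize, 5% upload cache.
print(check_memory_budget(50, 20, 5))
```

An empty list means both constraints hold for the supplied values.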
Adding a new cloud LSU fails if no partition has more free space than the following:
CloudDataCacheSize + CloudMapCacheSize + CloudMetaCacheSize + CloudUploadCacheSize + WarningSpaceThreshold * partition size
Use the crcontrol --dsstat 2 --verbosecloud command to check the space of each partition.
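The free-space requirement above can be computed ahead of time. This sketch is illustrative only: the function name is hypothetical, the cache sizes default to the values listed for contentrouter.cfg, and WarningSpaceThreshold is assumed to be expressed as a fraction of the partition size. Its actual value is not stated in this section, so the caller must supply it.

```python
GIB = 1024 ** 3


def required_free_bytes(
    partition_size_bytes: int,
    warning_space_threshold: float,  # assumed to be a fraction, supplied by the caller
    data_cache_gib: int = 500,    # CloudDataCacheSize
    map_cache_gib: int = 5,       # CloudMapCacheSize
    meta_cache_gib: int = 500,    # CloudMetaCacheSize
    upload_cache_gib: int = 12,   # CloudUploadCacheSize
) -> int:
    """Free space a partition must exceed before a cloud LSU can be added."""
    cache_bytes = (
        data_cache_gib + map_cache_gib + meta_cache_gib + upload_cache_gib
    ) * GIB
    return cache_bytes + int(warning_space_threshold * partition_size_bytes)


def can_add_cloud_lsu(free_bytes: int, partition_size_bytes: int,
                      warning_space_threshold: float) -> bool:
    """True if this partition has more free space than the requirement."""
    return free_bytes > required_free_bytes(partition_size_bytes,
                                            warning_space_threshold)
```

With the default cache sizes, the cache portion alone is 1017 GiB, which is why decreasing the Cloud*CacheSize values helps when free space is limited.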
Note:
Each cloud LSU has a cache directory. The directory is created under an MSDP volume that is selected according to the disk space usage of all the MSDP volumes. The cloud LSU reserves some disk space on that volume for its cache, and the local LSU cannot use the reserved space.
The initial reserved disk space for each cloud LSU is the sum of the values of UploadCacheGB, DownloadDataCacheGB, DownloadMetaCacheGB, and MapCacheGB in the <STORAGE>/etc/puredisk/cloud.json file. The disk space decreases when the caches are used.
The crcontrol --dsstat 2 --verbosecloud output includes a Cache column:
# crcontrol --dsstat 2 --verbosecloud
=============== Mount point 2 ===============
Path = /msdp/data/dp1/1pdvol
Data storage
Raw Size Used Avail Cache Use%
48.8T 46.8T 861.4G 46.0T 143.5G 2%
Number of containers : 3609
Average container size : 252685915 bytes (240.98MB)
Space allocated for containers : 911943468161 bytes (849.31GB)
Reserved space : 2156777086976 bytes (1.96TB)
Reserved space percentage : 4.0%
The Cache column shows the disk space that the cloud currently reserves on this volume. It is the sum of the reserved space for all cloud LSUs that have cache directories on this volume. The space that is actually available to the local LSU on this volume is Avail - Cache.
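As a worked example of Avail - Cache, the sketch below converts the Avail and Cache values from the sample output and subtracts them. The helper names are hypothetical. The suffixes are treated as binary units (1T = 1024^4 bytes), which matches the byte counts printed in the same output (for example, 911943468161 bytes shown as 849.31GB).

```python
# Binary unit multipliers, matching the byte/GB pairs shown in the sample output.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def parse_size(text: str) -> float:
    """Convert a value such as '143.5G' or '46.0T' to bytes."""
    return float(text[:-1]) * UNITS[text[-1]]


def local_lsu_available(avail: str, cache: str) -> float:
    """Space the local LSU can actually use on the volume: Avail - Cache."""
    return parse_size(avail) - parse_size(cache)


available = local_lsu_available("46.0T", "143.5G")
print(f"{available / 1024 ** 4:.2f}T")  # prints 45.86T
```

For the sample volume, roughly 45.86T of the 46.0T shown as Avail remains usable by the local LSU.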
The contentrouter.cfg file has the following aging check related parameters:
Parameter | Details | Default value |
---|---|---|
EnableAgingCheck | Enable or disable Cloud LSU container aging check. | true |
AgingCheckAllContainers | Determines whether to check all containers. If set to false, only the containers in some of the latest images are checked. | false |
AgingCheckSleepSeconds | Aging check thread wakes up periodically with this time interval (in seconds). | 20 |
AgingCheckBatchNum | The number of containers for aging check each time. | 400 |
AgingCheckContainerInterval | Default interval value of checking a container when adding Cloud LSU (in days). | 180 |
AgingCheckSizeLowBound | This threshold is used to filter the containers whose size is less than this value for aging check. | 8 MiB |
AgingCheckLowThreshold | This threshold is used to filter the containers whose garbage percentage is less than this value (in percentage). | 10% |
After you update the aging check related parameters in the contentrouter.cfg file, you must restart the MSDP service. Alternatively, you can use the crcontrol command line to update those parameters without restarting the MSDP service.
To update the aging parameters using the crcontrol command line
- Enable cloud aging check for all cloud LSUs.
/usr/openv/pdde/pdcr/bin/crcontrol --cloudagingcheckon
- Enable cloud aging check for a specified cloud LSU.
/usr/openv/pdde/pdcr/bin/crcontrol --cloudagingcheckon <dsid>
- Disable cloud aging check for all cloud LSUs.
/usr/openv/pdde/pdcr/bin/crcontrol --cloudagingcheckoff
- Disable cloud aging check for a specified cloud LSU.
/usr/openv/pdde/pdcr/bin/crcontrol --cloudagingcheckoff <dsid>
- Show cloud aging check state for all cloud LSUs.
/usr/openv/pdde/pdcr/bin/crcontrol --cloudagingcheckstate
- Show cloud aging check state for a specified cloud LSU.
/usr/openv/pdde/pdcr/bin/crcontrol --cloudagingcheckstate <dsid>
- Change cloud aging check to fast mode for all cloud LSUs.
/usr/openv/pdde/pdcr/bin/crcontrol --cloudagingfastcheck
- Change cloud aging check to fast mode for a specified cloud LSU.
/usr/openv/pdde/pdcr/bin/crcontrol --cloudagingfastcheck <dsid>
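If you script these operations, the command lines above can be assembled programmatically. The following is a sketch only: the function name and the short action keys ("on", "off", "state", "fast") are hypothetical, while the crcontrol path and options are the ones listed above. The code only builds the argument list and does not run anything.

```python
from typing import Optional

CRCONTROL = "/usr/openv/pdde/pdcr/bin/crcontrol"

# Hypothetical short names mapped to the crcontrol options listed above.
AGING_OPTIONS = {
    "on": "--cloudagingcheckon",
    "off": "--cloudagingcheckoff",
    "state": "--cloudagingcheckstate",
    "fast": "--cloudagingfastcheck",
}


def aging_check_command(action: str, dsid: Optional[int] = None) -> list[str]:
    """Build the crcontrol argument list for one aging check operation.

    Without dsid the command applies to all cloud LSUs; with dsid it applies
    to the specified cloud LSU only.
    """
    argv = [CRCONTROL, AGING_OPTIONS[action]]
    if dsid is not None:
        argv.append(str(dsid))
    return argv


print(" ".join(aging_check_command("state")))
print(" ".join(aging_check_command("off", dsid=3)))
```

On a storage server, the returned list could be passed to subprocess.run(argv, check=True). The dsid value 3 is only a placeholder.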
The spa.cfg file is available at: <STORAGE>/etc/puredisk/spa.cfg
The file has the following parameters:
Parameter | Details | Default value |
---|---|---|
CloudLSUCheckInterval | The interval, in seconds, at which the cloud LSU status is checked. | 1800 |
EnablePOIDListCache | Whether the POID (Path Object ID) list cache is enabled or disabled. A Path Object contains the metadata associated with an image. | true |