Storage Foundation 7.4.1 Administrator's Guide - Linux
- Section I. Introducing Storage Foundation
- Overview of Storage Foundation
- How Dynamic Multi-Pathing works
- How Veritas Volume Manager works
- How Veritas Volume Manager works with the operating system
- How Veritas Volume Manager handles storage management
- Volume layouts in Veritas Volume Manager
- Online relayout
- Volume resynchronization
- Dirty region logging
- Volume snapshots
- FastResync
- How VxVM handles hardware clones or snapshots
- Volume encryption
- How Veritas File System works
- Section II. Provisioning storage
- Provisioning new storage
- Advanced allocation methods for configuring storage
- Customizing allocation behavior
- Using rules to make volume allocation more efficient
- Understanding persistent attributes
- Customizing disk classes for allocation
- Specifying allocation constraints for vxassist operations with the use clause and the require clause
- Creating volumes of a specific layout
- Customizing allocation behavior
- Creating and mounting VxFS file systems
- Creating a VxFS file system
- Mounting a VxFS file system
- tmplog mount option
- ioerror mount option
- largefiles and nolargefiles mount options
- Resizing a file system
- Monitoring free space
- Extent attributes
- Section III. Administering multi-pathing with DMP
- Administering Dynamic Multi-Pathing
- Discovering and configuring newly added disk devices
- About discovering disks and dynamically adding disk arrays
- How to administer the Device Discovery Layer
- Administering DMP using the vxdmpadm utility
- Gathering and displaying I/O statistics
- Specifying the I/O policy
- Discovering and configuring newly added disk devices
- Dynamic Reconfiguration of devices
- Reconfiguring a LUN online that is under DMP control using the Dynamic Reconfiguration tool
- Manually reconfiguring a LUN online that is under DMP control
- Managing devices
- Displaying disk information
- Changing the disk device naming scheme
- Adding and removing disks
- Event monitoring
- Administering Dynamic Multi-Pathing
- Section IV. Administering Storage Foundation
- Administering sites and remote mirrors
- About sites and remote mirrors
- Fire drill - testing the configuration
- Changing the site name
- Administering the Remote Mirror configuration
- Failure and recovery scenarios
- Administering sites and remote mirrors
- Section V. Optimizing I/O performance
- Veritas File System I/O
- Veritas Volume Manager I/O
- Managing application I/O workloads using maximum IOPS settings
- Section VI. Using Point-in-time copies
- Understanding point-in-time copy methods
- When to use point-in-time copies
- About Storage Foundation point-in-time copy technologies
- Volume-level snapshots
- Storage Checkpoints
- About FileSnaps
- About snapshot file systems
- Administering volume snapshots
- Traditional third-mirror break-off snapshots
- Full-sized instant snapshots
- Creating instant snapshots
- Adding an instant snap DCO and DCO volume
- Controlling instant snapshot synchronization
- Creating instant snapshots
- Cascaded snapshots
- Adding a version 0 DCO and DCO volume
- Administering Storage Checkpoints
- Storage Checkpoint administration
- Administering FileSnaps
- Administering snapshot file systems
- Understanding point-in-time copy methods
- Section VII. Optimizing storage with Storage Foundation
- Understanding storage optimization solutions in Storage Foundation
- Migrating data from thick storage to thin storage
- Maintaining Thin Storage with Thin Reclamation
- Reclamation of storage on thin reclamation arrays
- Identifying thin and thin reclamation LUNs
- Veritas InfoScale 4k sector device support solution
- Section VIII. Maximizing storage utilization
- Understanding storage tiering with SmartTier
- Creating and administering volume sets
- Multi-volume file systems
- Features implemented using multi-volume file system (MVFS) support
- Adding a volume to and removing a volume from a multi-volume file system
- Volume encapsulation
- Load balancing
- Administering SmartTier
- About SmartTier
- Placement classes
- Administering placement policies
- File placement policy rules
- Multiple criteria in file placement policy rule statements
- Using SmartTier with solid state disks
- Sub-file relocation
- Administering hot-relocation
- How hot-relocation works
- Moving relocated subdisks
- Deduplicating data
- Compressing files
- About compressing files
- Use cases for compressing files
- Section IX. Administering storage
- Managing volumes and disk groups
- Rules for determining the default disk group
- Moving volumes or disks
- Monitoring and controlling tasks
- Performing online relayout
- Adding a mirror to a volume
- Managing disk groups
- Disk group versions
- Displaying disk group information
- Importing a disk group
- Moving disk groups between systems
- Importing a disk group containing hardware cloned disks
- Handling conflicting configuration copies
- Destroying a disk group
- Backing up and restoring disk group configuration data
- Managing plexes and subdisks
- Decommissioning storage
- Rootability
- Encapsulating a disk
- Rootability
- Sample supported root disk layouts for encapsulation
- Encapsulating and mirroring the root disk
- Administering an encapsulated boot disk
- Quotas
- Using Veritas File System quotas
- File Change Log
- Managing volumes and disk groups
- Section X. Reference
- Appendix A. Reverse path name lookup
- Appendix B. Tunable parameters
- Tuning the VxFS file system
- Methods to change Dynamic Multi-Pathing tunable parameters
- Tunable parameters for VxVM
- Methods to change Veritas Volume Manager tunable parameters
- Appendix C. Command reference
About deduplication chunk size
The deduplication chunk size, which is also referred to as deduplication granularity, is the unit at which fingerprints are computed. A valid chunk size is between 4k and 128k and power of two. Once set, the only way to change the chunk size is to remove and re-enable deduplication on the file system.
You should carefully select the chunk size, as the size has significant impact on deduplication as well as resource requirements. The size directly affects the number of fingerprint records in the deduplication database as well as temporary space required for sorting these records. A smaller chunk size results in a large number of fingerprints and hence requires a significant amount of space for the deduplication database.
While the amount of storage that you save after deduplication depends heavily on the dataset and distribution of duplicates within the dataset, the chunk size can also affect the savings significantly. You must understand your dataset to get the best results after deduplication. A general rule of thumb is that a smaller chunk size saves more storage. A smaller chunk size results in more granular fingerprints and in general results in identifying more duplicates. However, smaller chunks have additional costs in terms of database size, deduplication time, and, more importantly, fragmentation. The deduplication database size can be significantly large for small chunk sizes. Higher fragmentation normally results in more file system metadata and hence can require more storage. The space consumed by the deduplication database and the increased file system metadata can reduce the savings achieved via deduplication. Additionally, fragmentation can also have a negative effect on performance. The Veritas File System (VxFS) deduplication algorithms try to reduce fragmentation by coalescing multiple contiguous duplicate chunks.
Larger chunk sizes normally result in a smaller deduplication database size, faster deduplication, and less fragmentation. These benefits sometimes come at the cost of less storage savings. If you have a large number duplicate files that are small in size, you still can choose a chunk size that is larger than the file size. A larger chunk size does not affect the deduplication of files that are smaller than the chunk size. In such cases, the fingerprint is calculated on the whole file, and the files are still deduplicated.
Veritas recommends a chunk size of 16k or higher.
The space consumed by the deduplication database is a function of the amount of data in the file system and the deduplication chunk size. The space consumed by the deduplication database grows with time as new data is added to file system. Additional storage is required for temporary use, such as sorting fingerprints. The temporary storage may be freed after the work completes. Ensure that sufficient free space is available for deduplication to complete successfully. The deduplication might not start if the file system free space is less than approximately 15%. The deduplication sometimes needs more than 15% free space for smaller chunk sizes. In general, the space consumed reduces significantly with larger chunk sizes. Veritas recommends that you have approximately 20% free space for 4k chunks.