NetBackup™ Backup Planning and Performance Tuning Guide
Storage trends
Solid state disks (SSDs) will soon be the norm rather than the exception as storage on our systems: NetBackup + Customer Platform, Veritas Appliances, and Reference Architecture + NetBackup.
A chart from BlockandFiles.com provides a good indicator of the scenario we are likely to encounter. SSD is in its early phase, while disk is in the late stages of its life. We can look at other technologies that were superseded; a relevant example is disk versus tape. Tape has historically been an excellent long-term retention medium and disk a short-term retention device. With the ongoing increase in data creation, the sheer volume of data necessitates a new type of storage. In the next few years, we will likely see SSDs push disk (including cloud providers) to long-term retention, with tape relegated to retention that is mandated by legislation, such as 30 years for medical information.
Disk drive companies have been trying for more than 20 years to build heat-assisted or microwave-assisted magnetic recording drives, known as HAMR and MAMR. This technology promises to increase the bit density on the drive platters to provide storage in the range of 20 to 50 TB per drive. However, there is not yet a viable drive with those capacities at a cost that drives the $/GB significantly below the $0.02 level. Meanwhile, SSDs continue to increase in performance and decline in cost per GB.
The key advantage of SSDs over disk for the NetBackup software is access time. Current SAS nearline disk drives are specified at as low as 4.16 milliseconds average rotational latency and 8 milliseconds seek time, for a total access time of 12.16 milliseconds. NVMe SSDs are currently in the 10-microsecond range. SSDs also do not suffer from some of the mechanical constraints of disk drives.
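The arithmetic behind these figures can be sketched as follows. This is a minimal illustration that uses only the numbers quoted above; the function names are hypothetical, and the simple "seek plus rotational latency" model is an assumption that ignores controller and queuing overhead.

```python
# Figures quoted in the text for a SAS nearline disk drive.
ROTATIONAL_LATENCY_MS = 4.16   # average rotational latency
SEEK_MS = 8.0                  # seek time
# Approximate NVMe SSD access time quoted in the text ("10-microsecond range").
NVME_ACCESS_US = 10.0


def hdd_access_time_ms(rotational_latency_ms: float, seek_ms: float) -> float:
    """Access time modeled as seek time plus average rotational latency."""
    return rotational_latency_ms + seek_ms


def access_time_ratio(hdd_ms: float, ssd_us: float) -> float:
    """Ratio of HDD to SSD access time, with the HDD value converted to microseconds."""
    return (hdd_ms * 1000.0) / ssd_us


hdd_ms = hdd_access_time_ms(ROTATIONAL_LATENCY_MS, SEEK_MS)
ratio = access_time_ratio(hdd_ms, NVME_ACCESS_US)

print(f"HDD access time:  {hdd_ms:.2f} ms")
print(f"NVMe access time: {NVME_ACCESS_US:.0f} us")
print(f"HDD / NVMe ratio: about {ratio:.0f}x")
```

Under these assumptions, the disk access time is roughly three orders of magnitude longer than the NVMe figure.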
A disk drive must position the read/write head precisely over the sector, which resides in a track, before it can read or write. This positioning is not perfectly repeatable, because the tracks are extremely narrow. For perspective, drives in current use have a track density of 371,000 tracks per inch, with 337,400 tracks per disk surface on each platter. That density works out to a width of about 0.0000027 inch per track. As a result, the head does not always "settle" perfectly over the sector, and the drive must retry. Each retry requires a full revolution of the disk, which adds 8.33 milliseconds of latency.
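As a rough check on these numbers, the sketch below derives the per-track width from the quoted track density and the spindle speed implied by the quoted 8.33 millisecond revolution time. The function names are hypothetical. The conversion to nanometers and the implied RPM are not stated in the text; they are derived here for illustration, and the width is really a track pitch (one over the density), not a measured physical width.

```python
TRACKS_PER_INCH = 371_000      # track density quoted in the text
REVOLUTION_MS = 8.33           # latency of one full revolution, quoted in the text
NM_PER_INCH = 25_400_000       # 1 inch = 25.4 mm = 25,400,000 nm


def track_pitch_inches(tracks_per_inch: int) -> float:
    """Width available per track, taken as the reciprocal of the track density."""
    return 1.0 / tracks_per_inch


def implied_rpm(revolution_ms: float) -> float:
    """Spindle speed implied by the time for one full revolution."""
    return 60_000.0 / revolution_ms


pitch_in = track_pitch_inches(TRACKS_PER_INCH)
pitch_nm = pitch_in * NM_PER_INCH
rpm = implied_rpm(REVOLUTION_MS)

print(f"Track pitch: {pitch_in:.7f} inch (about {pitch_nm:.0f} nm)")
print(f"Implied spindle speed: about {rpm:.0f} RPM")
```

The result agrees with the 0.0000027 inch figure above, and the 8.33 millisecond revolution time is consistent with a drive spinning at roughly 7,200 RPM.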
Because disk drives are mechanical devices, they are subject to external forces, especially vibration. Drives have adopted Rotational Vibration Accelerometers (RVA) to compensate for external vibration, but RVAs are not effective against every disturbance. Vibration from adjacent drives or other external sources can have enough amplitude to overwhelm the RVA and cause a retry. Retries can be numerous, and as many as 20 in a row do occur. If the retries exceed a 60-second window, the RAID controller declares a SCSI Time Out, marks the drive as bad, and removes it from the RAID set.
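To put the retry figures in proportion, the following sketch estimates the latency added by a burst of retries and how many full revolutions fit inside the 60-second window. It assumes, as a simplification, that every retry costs exactly one 8.33 millisecond revolution and nothing else; real drives perform additional recovery steps, so treat the output as an order-of-magnitude illustration only. The function names are hypothetical.

```python
REVOLUTION_MS = 8.33     # cost of one retry (one full revolution), from the text
TIMEOUT_WINDOW_S = 60.0  # SCSI Time Out window, from the text


def retry_penalty_ms(retries: int, revolution_ms: float = REVOLUTION_MS) -> float:
    """Latency added by a number of retries, at one revolution per retry."""
    return retries * revolution_ms


def revolutions_in_window(window_s: float, revolution_ms: float = REVOLUTION_MS) -> int:
    """Whole revolutions that fit inside the timeout window."""
    return int(window_s * 1000.0 / revolution_ms)


penalty_20 = retry_penalty_ms(20)
max_revolutions = revolutions_in_window(TIMEOUT_WINDOW_S)

print(f"20 retries add about {penalty_20:.1f} ms to a single I/O")
print(f"About {max_revolutions} revolutions fit in a {TIMEOUT_WINDOW_S:.0f} s window")
```

Under this simplified model, a burst of 20 retries delays one I/O by well under a second, while the 60-second window spans thousands of revolutions.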
When drives fail, they are typically returned to the manufacturer for failure analysis. Of the failed drives returned, 80% are found to be No Trouble Found (NTF), usually because the failure was a SCSI Time Out as noted previously. Vibration and track settle are the most likely causes of this failure, which creates additional cost, performance degradation during rebuild, and lost time.
As the market has evolved, access time has become more and more important. In the era of tape, sequential performance was vital because it allowed more data per cartridge at higher throughput. Now, with deduplication, the speed of reading from the client is very important, but in the vast majority of backup environments the need for throughput to the target disks is diminished. The key in the present era, in addition to uptime, is access time.