Veritas NetBackup™ Appliance Capacity Planning and Performance Tuning Guide
About CPU monitoring and tuning
Table: Sample vmstat output (collected with vmstat 5) shows sample output of the vmstat command, taken while 120 concurrent backup streams with a 98% deduplication rate were running on a 53xx appliance.
Table: Sample vmstat output (collected with vmstat 5)
r | b | swpd | Free | Buff | Cache | si | so | us | sy | id | wa |
---|---|---|---|---|---|---|---|---|---|---|---|
89 | 0 | 1006344 | 348907856 | 37632 | 11694512 | 0 | 0 | 62 | 30 | 8 | 0 |
84 | 0 | 1006316 | 348450264 | 37640 | 12016276 | 11 | 0 | 62 | 30 | 8 | 0 |
63 | 0 | 1006316 | 348104004 | 37664 | 12260816 | 0 | 0 | 63 | 30 | 7 | 0 |
76 | 0 | 1006288 | 347857280 | 37664 | 12491148 | 5 | 0 | 61 | 29 | 9 | 0 |
46 | 0 | 1006288 | 347538340 | 37684 | 12756108 | 0 | 0 | 61 | 30 | 8 | 0 |
72 | 0 | 1006260 | 347111556 | 37692 | 13083760 | 3 | 0 | 62 | 30 | 8 | 0 |
72 | 0 | 1006252 | 346786820 | 37692 | 13332416 | 6 | 0 | 62 | 30 | 8 | 0 |
61 | 0 | 1006164 | 346485836 | 37712 | 13612680 | 28 | 0 | 59 | 29 | 13 | 0 |
92 | 0 | 1006156 | 346136540 | 37720 | 13902248 | 0 | 0 | 60 | 30 | 10 | 0 |
106 | 0 | 1006132 | 345721588 | 37724 | 14190992 | 6 | 0 | 61 | 31 | 9 | 0 |
82 | 0 | 1006128 | 345355448 | 37732 | 14465996 | 0 | 0 | 61 | 30 | 9 | 0 |
113 | 0 | 1005972 | 345072276 | 37740 | 14760008 | 30 | 0 | 61 | 30 | 10 | 0 |
66 | 0 | 1005964 | 344747824 | 37740 | 15004520 | 1 | 0 | 61 | 30 | 9 | 0 |
98 | 0 | 1005924 | 344446500 | 37748 | 15282376 | 8 | 0 | 60 | 30 | 10 | 0 |
118 | 0 | 1005920 | 344035148 | 37760 | 15582400 | 0 | 0 | 61 | 30 | 9 | 0 |
96 | 0 | 1005900 | 343802084 | 37764 | 15882380 | 4 | 0 | 62 | 30 | 9 | 0 |
60 | 0 | 1005900 | 343406276 | 37784 | 16175128 | 0 | 0 | 58 | 29 | 13 | 0 |
61 | 0 | 1005872 | 343038168 | 37792 | 16470724 | 3 | 0 | 62 | 30 | 7 | 0 |
60 | 0 | 1005868 | 342653976 | 37792 | 16747684 | 1 | 0 | 61 | 30 | 9 | 0 |
116 | 0 | 1005836 | 342343076 | 37800 | 17001952 | 5 | 0 | 62 | 30 | 8 | 0 |
Note:
Some of the columns from the output have been removed to simplify the display.
From the above table, we can conclude that the system is CPU bound, because the id column (which displays the percentage of CPU idle time) is mostly in single digits. This indicates that the 53xx CPU utilization is consistently around 90% or higher. Another indication that the system is CPU bound comes from the first column, r. The value of column r fluctuates between 46 and 118. r stands for the CPU "ready to run" queue: it is a count of processes that are currently running or are ready to run but waiting for a free CPU. Because the 53xx has 40 logical CPU threads, it can handle at most 40 concurrent processes at a time. You can derive the number of processes that are ready to run but waiting for CPU cycles by subtracting 40 from the value in column r. For example, an r value of 118 means that 78 processes are waiting for CPU.
Given these CPU statistics, and the fact that they were collected while the system was running 120 concurrent 98% deduplication backup streams, there are two possible actions that you can take to lower the CPU consumption:
- Lower the job batch size. If the CPU is overly busy, jobs can spend too much time waiting for available CPU cycles. Lowering the number of concurrent jobs per batch can improve overall performance.
- Add another 53xx as a fingerprint server. This doubles the available CPU capacity and is the natural solution when the workload itself cannot be reduced.
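The CPU-bound reading of the table can be reproduced with a few lines of Python. The following is a minimal sketch, not a Veritas tool: it hard-codes the (r, id) pairs from the sample table, takes the 40 logical CPU threads stated for the 53xx, and applies an illustrative rule of thumb (every sample has r above the thread count and average utilization is at least 90%). The function name `summarize` and both thresholds are assumptions made for this example.

```python
# Minimal sketch: summarize the r and id columns of the sample vmstat table.
# LOGICAL_CPUS comes from the guide's description of the 53xx; the
# "cpu_bound" rule below is an illustrative assumption, not a Veritas rule.
LOGICAL_CPUS = 40
BUSY_THRESHOLD_PCT = 90.0  # assumed threshold for "CPU bound"

# (r, id) pairs copied from the sample table, one per 5-second interval.
SAMPLES = [
    (89, 8), (84, 8), (63, 7), (76, 9), (46, 8),
    (72, 8), (72, 8), (61, 13), (92, 10), (106, 9),
    (82, 9), (113, 10), (66, 9), (98, 10), (118, 9),
    (96, 9), (60, 13), (61, 7), (60, 9), (116, 8),
]


def summarize(samples, logical_cpus=LOGICAL_CPUS):
    """Return run-queue and utilization statistics for vmstat samples."""
    run_queue = [r for r, _ in samples]
    # Processes ready to run but waiting for a CPU: r minus the thread count.
    waiting = [max(0, r - logical_cpus) for r in run_queue]
    # CPU utilization is taken as 100 minus the idle percentage.
    busy = [100 - idle for _, idle in samples]
    avg_busy = sum(busy) / len(busy)
    return {
        "min_r": min(run_queue),
        "max_r": max(run_queue),
        "min_waiting": min(waiting),
        "max_waiting": max(waiting),
        "avg_busy_pct": avg_busy,
        "cpu_bound": min(run_queue) > logical_cpus
        and avg_busy >= BUSY_THRESHOLD_PCT,
    }


summary = summarize(SAMPLES)
for key, value in summary.items():
    print(f"{key}: {value}")
```

On a live system the same function could be fed pairs parsed from `vmstat 5` output; the thresholds should be adjusted to the appliance model and workload.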
A quick internal experiment with an additional fingerprint server showed that performance increased by almost 40%, up to 10 GB/sec, while CPU usage on the appliance dropped by almost 50%. At that point, the bottleneck shifted to the network, because the 53xx supports up to 10 x 10 Gbps NICs, which caps network throughput at around 10 GB/sec. We would probably see an even larger performance improvement if more than 10 x 10 Gbps NICs could be installed on the system.
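The network ceiling mentioned above follows from simple unit conversion. The sketch below computes the raw aggregate line rate of 10 x 10 Gbps ports in gigabytes per second and compares it with the roughly 10 GB/sec observed in the experiment. The function name is hypothetical, and the guide does not explain the gap between the raw line rate and the observed figure, so the sketch only reports the ratio.

```python
# Minimal sketch: convert aggregate NIC line rate from Gbps to GB/sec.
# NIC count and speed are the figures quoted for the 53xx in the text above.
NIC_COUNT = 10
NIC_SPEED_GBPS = 10          # gigabits per second, per port
BITS_PER_BYTE = 8
OBSERVED_GB_PER_SEC = 10.0   # approximate throughput reported in the experiment


def aggregate_line_rate_gb_per_sec(nic_count, nic_speed_gbps):
    """Raw aggregate line rate in gigabytes per second (no overhead applied)."""
    return nic_count * nic_speed_gbps / BITS_PER_BYTE


raw_rate = aggregate_line_rate_gb_per_sec(NIC_COUNT, NIC_SPEED_GBPS)
fraction_used = OBSERVED_GB_PER_SEC / raw_rate

print(f"raw line rate: {raw_rate} GB/sec")
print(f"observed / raw: {fraction_used:.0%}")
```

The raw figure is an upper bound; the throughput actually achievable depends on factors that this sketch does not model.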