How ZFS Cache Impacts NBU Performance

Article: 100010231
Last Published: 2013-08-08
Product(s): NetBackup & Alta Data Protection

Problem

On Solaris 10, a ZFS ARC (Adaptive Replacement Cache) left at its default configuration can gradually degrade NetBackup performance at the memory level, forcing NetBackup to use a large amount of swap even when several gigabytes of RAM appear to be "available."

In the following example from a Solaris 10 server, 61% of the memory is initially owned by ZFS File Data (the ARC):

# echo ::memstat | mdb -k
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                    1960930             15319   24%
ZFS File Data             5006389             39112   61%
Anon                       746499              5832    9%
Exec and libs               37006               289    0%
Page cache                  22838               178    0%
Free (cachelist)           342814              2678    4%
Free (freelist)            103593               809    1%

Total                     8220069             64219
Physical                  8214591             64176
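
The current ARC consumption and its ceiling can also be read directly from the kernel's arcstats counters. A minimal check, assuming the standard Solaris zfs:0:arcstats kstat is present (both values are reported in bytes):

# kstat -p zfs:0:arcstats:size zfs:0:arcstats:c_max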

Error Message

The ARChits.sh script available at http://dtrace.org/blogs/brendan/2012/01/09/activity-of-the-zfs-arc/ can be used to find out how often the operating system hits or requests memory from the ARC.  In this example, the hit rate is consistently at or near 100%:

# ./ARChits.sh
        HITS       MISSES   HITRATE
  2147483647       692982    99.99%
         518            4    99.23%
        2139            0   100.00%
        2865            0   100.00%
         727            0   100.00%
         515            0   100.00%
         700            0   100.00%
        2032            0   100.00%
        4529            0   100.00%
        1040            0   100.00%
     ...

In effect, the ARC sits as a "middle man" between NetBackup and the physical memory.
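
If the ARChits.sh script is not available, a similar per-second view can be obtained with a DTrace one-liner built on the same sdt:zfs::arc-hit and sdt:zfs::arc-miss probes used in the next section. This is only a rough sketch; the hit rate must then be computed from the two counts:

# dtrace -q -n 'sdt:zfs::arc-hit { @["hits"] = count(); } sdt:zfs::arc-miss { @["misses"] = count(); } tick-1sec { printa("%-8s %@d\n", @); trunc(@); }'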

Cause

To identify which processes are hitting the ARC or requesting memory, dtrace can be used to count the hits and misses per process.

# dtrace -n 'sdt:zfs::arc-hit,sdt:zfs::arc-miss { @[execname] = count() }'
...
...
  nbproxy                                                        1099
  nbpem                                                          1447
  nscd                                                           1649
  bpstsinfo                                                      1785
  find                                                           1806
  fsflush                                                        2065
  bpclntcmd                                                      2257
  bpcompatd                                                      2394
  perl                                                           2945
  bpimagelist                                                    4019
  bprd                                                           4268
  avrd                                                           8899
  grep                                                           9249
  dbsrv11                                                       20782
  bpdbm                                                         37955

In the example above, dbsrv11 and bpdbm are the main consumers of ARC memory.

The next step is to determine the memory request sizes in order to measure the impact of the ARC on NetBackup requests, given the ARC's habit of slicing memory into small blocks.

# dtrace -n 'sdt:zfs::arc-hit,sdt:zfs::arc-miss { @["bytes"] = quantize(((arc_buf_hdr_t *)arg0)->b_size); }'

  bytes
           value  ------------- Distribution ------------- count
             256 |                                         0
             512 |@@@@@                                    10934
            1024 |                                         1146
            2048 |                                         467
            4096 |                                         518
            8192 |@@@@                                     9485
           16384 |@                                        1506
           32768 |                                         139
           65536 |                                         356
          131072 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@            67561
          262144 |                                         0

The majority of memory requests are for 128KB (131072-byte) blocks. Only a few are very small; this is the pattern when there are no major requests at the NetBackup level.

Things change when a large number of NetBackup requests come in, suddenly raising the number of small-block requests. The following output comes from a master server pulling data with several vmquery commands:

# dtrace -n 'sdt:zfs::arc-hit,sdt:zfs::arc-miss { @["bytes"] = quantize(((arc_buf_hdr_t *)arg0)->b_size); }'
 
  bytes
           value  ------------- Distribution ------------- count
             256 |                                         0
             512 |@@@@@@@@@@@@                             78938
            1024 |@                                        7944
            2048 |                                         1812
            4096 |@                                        3751
            8192 |@@@@@@@@@@@@                             76053
           16384 |@                                        9030
           32768 |                                         322
           65536 |                                         992
          131072 |@@@@@@@@@@@@                             77239
          262144 |                                         0

Not only is vmquery dominating the memory requests, the operating system is also forced to rehydrate the memory into bigger blocks in order to meet NetBackup's block size requirements, impacting application performance mainly at the NBDB or EMM database level.

# dtrace -n 'sdt:zfs::arc-hit,sdt:zfs::arc-miss { @[execname] = count() }'
...
...
  avrd                                                           1210
  bpimagelist                                                    2865
  dbsrv11                                                        2970
  grep                                                           4971
  bpdbm                                                          6662
  vmquery                                                       94161
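
If needed, the two one-liners above can be combined so that the block-size distribution is broken down per process, which makes it easy to confirm which command is generating the small requests (a sketch using the same probes and arguments as above):

# dtrace -n 'sdt:zfs::arc-hit,sdt:zfs::arc-miss { @[execname] = quantize(((arc_buf_hdr_t *)arg0)->b_size); }'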

The memory rehydration forces the operating system to use a lot of swap memory, even when there is a lot available under ZFS File Data:

# vmstat 1
kthr      memory            page            disk          faults      cpu
r b w   swap  free  re  mf pi po fr de sr s1 s2 s3 s4   in   sy   cs us sy id
0 0 0 19244016 11342680 432 1518 566 604 596 0 0 8 -687 8 -18 8484 30088 9210 10 5 84
0 2 0 11441128 3746680 44 51 8 23 23 0  0  0  0  0  0 6822 19737 7929 9  3 88
0 1 0 11436168 3745440 14 440 8 23 23 0 0  0  0  0  0 6460 18428 7038 9  4 87
0 2 0 11440808 3746856 6 0 15 170 155 0 0  0  0  0  0 6463 18163 6996 9  4 87
0 2 0 11440808 3747000 295 822 15 147 147 0 0 0 0 0 0 7604 27577 8989 11 5 84
0 1 0 11440552 3746872 122 683 8 70 70 0 0 0  0  0  0 5926 20430 6444 9  3 88
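
Swap consumption can also be cross-checked with the standard swap command, which summarizes how much swap space is allocated, reserved, and still available:

# swap -s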

In this case, 39GB of RAM is allocated to ZFS File Data (the ARC). This memory is supposed to be freed whenever an application needs it, but because the ARC slices memory into small pieces, the operating system takes a long time to reclaim it and respond to the application.

# echo ::memstat | mdb -k
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                    1960930             15319   24%
ZFS File Data             5006389             39112   61%
Anon                       746499              5832    9%
Exec and libs               37006               289    0%
Page cache                  22838               178    0%
Free (cachelist)           342814              2678    4%
Free (freelist)            103593               809    1%

Total                     8220069             64219
Physical                  8214591             64176

When the master server is rebooted, there is initially no ZFS File Data allocation, so NetBackup will seem to run "perfectly" - but performance of the master will then degrade slowly, at a rate that depends on how fast the ARC consumes the memory:

# echo ::memstat | mdb -k
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     479738              3747    6%
Anon                       422140              3297    5%
Exec and libs               45443               355    1%
Page cache                  83530               652    1%
Free (cachelist)          2200908             17194   27%
Free (freelist)           4988310             38971   61%


Total                     8220069             64219
Physical                  8214603             64176

Solution

To address this issue, limit the size of the ZFS ARC on each problematic system.

To determine the limit value, use the following procedure.

Note: As with any changes of this nature, please bear in mind that the setting may have to be tweaked to accommodate additional load and/or memory changes.  Monitor and adjust as needed.

1. After the system is fully loaded and running backups, sample the total memory use.

Consider the following example:
# prstat -s size -a
NPROC USERNAME  SWAP   RSS MEMORY      TIME  CPU                           
    32 sybase     96G   96G    75%  42:38:04 0.2%
    72 root      367M  341M   0.3%   9:38:11 0.0%
     6 daemon   7144K 9160K   0.0%   0:01:01 0.0%
     1 smmsp    2048K 6144K   0.0%   0:00:22 0.0%

2. Compare the percentage of memory in use to the total physical memory:
# prtdiag | grep -i Memory
Memory size: 131072 Megabytes

3. In the above example, approximately 75% of the physical memory is used under typical load.  Add a few percent for "headroom" - in this example, 80% will be used.

4. 100% - 80% = 20%.  20% of 128GB is 25.6GB; rounding up to 26GB gives 27917287424 bytes.  This is the new limit that will be specified for the cache (a scripted version of this calculation is shown after step 6).

5. Configure this new ZFS ARC limit in /etc/system:
 set zfs:zfs_arc_max=27917287424

6. Reboot the system for the new value to take effect.
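
The arithmetic in steps 3 and 4 can also be scripted. The following is a minimal, hypothetical sketch (not part of NetBackup) that derives the value from the same prtdiag output used in step 2; note that it computes the exact percentage, whereas the example above rounds the result up to a whole 26GB:

#!/bin/sh
# Hypothetical helper: derive a zfs_arc_max value from physical memory and
# the percentage of RAM to leave for the ARC (100% - 80% = 20% in this example).
ARC_PCT=20
MEM_MB=`prtdiag | grep -i 'Memory size' | awk '{print $3}'`    # e.g. 131072
ARC_BYTES=`echo "$MEM_MB * $ARC_PCT / 100 * 1048576" | bc`     # MB -> bytes
echo "set zfs:zfs_arc_max=$ARC_BYTES"

After the reboot in step 6, the new ceiling can be confirmed with kstat -p zfs:0:arcstats:c_max, which should report the configured value.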

References:

High Memory Utilized by ZFS File Data
https://forums.oracle.com/thread/2340011

ZFS Evil Tuning Guide: Limiting the ARC Cache
https://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Limiting_the_ARC_Cache

Activity of the ZFS ARC
https://dtrace.org/blogs/brendan/2012/01/09/activity-of-the-zfs-arc/

