NetBackup™ for Hadoop Administrator's Guide
Configuring distribution algorithm and golden ratio for backup hosts
To enhance backup performance, you can configure two tunable parameters: the distribution algorithm and the golden ratio. Performance is fine-tuned through the combination of distribution algorithm and golden ratio that you choose.
To decide the distribution algorithm and golden ratio, consider the following:
- If you have a small number of large files in your data set: Use distribution algorithm 1. A change in the golden ratio is not honored.
- If you have a large number of small files in your data set: Use distribution algorithm 2. A change in the golden ratio is not honored.
- If you have a small number of very large files and a large number of small files in your data set: Use distribution algorithm 4 or 5 and a golden ratio that fits your deployment. The supported golden ratio range is 1 to 100. If no value is provided, the default of 75 is used.
Note: Adjusting this value can change performance drastically.
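The guidance above can be summarized as a small lookup. The following Python sketch is illustrative only and is not part of NetBackup: the profile labels (`few_large`, `many_small`, `mixed`) and the function name are hypothetical, and only the algorithm numbers, the 1 to 100 range, and the default of 75 come from this guide.

```python
from typing import Optional

DEFAULT_GOLDEN_RATIO = 75            # default stated in this guide
GOLDEN_RATIO_RANGE = range(1, 101)   # supported range: 1 to 100, inclusive

# Hypothetical profile labels mapped to
# (recommended distribution algorithms, is the golden ratio honored?)
RECOMMENDATIONS = {
    "few_large": ((1,), False),    # small number of large files
    "many_small": ((2,), False),   # large number of small files
    "mixed": ((4, 5), True),       # few very large files plus many small files
}


def effective_golden_ratio(profile: str,
                           golden_ratio: Optional[int] = None) -> Optional[int]:
    """Return the golden ratio that would apply, or None if it is not honored."""
    _, honored = RECOMMENDATIONS[profile]
    if not honored:
        return None
    if golden_ratio is None:
        return DEFAULT_GOLDEN_RATIO
    if golden_ratio not in GOLDEN_RATIO_RANGE:
        raise ValueError("golden_ratio must be between 1 and 100")
    return golden_ratio
```

For example, `effective_golden_ratio("mixed")` falls back to the default, while any value passed for the `few_large` or `many_small` profiles is ignored, mirroring the "not honored" behavior described above.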
To update the hadoop.conf file for configuring algorithm and golden ratio
- Update the hadoop.conf file with the following parameters: { "distro_algo": distribution_algorithm, "golden_ratio": golden_ratio }
- Copy this file to the following location on the backup host:
/usr/openv/var/global/
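
The two steps above could be scripted roughly as follows. This is a minimal sketch, not a NetBackup utility: it assumes that hadoop.conf is a JSON document and that both parameters take integer values, and the function names `write_tuning` and `deploy` are hypothetical. Only the key names and the destination directory come from this guide.

```python
import json
import shutil
from pathlib import Path

# Destination on the backup host, as given in this guide.
BACKUP_HOST_DIR = Path("/usr/openv/var/global/")


def write_tuning(conf_path: Path, distro_algo: int, golden_ratio: int) -> dict:
    """Merge the two tuning keys into hadoop.conf, keeping any existing keys.

    Assumes the file, if present, contains a single JSON object.
    """
    conf = json.loads(conf_path.read_text()) if conf_path.exists() else {}
    conf["distro_algo"] = distro_algo
    conf["golden_ratio"] = golden_ratio
    conf_path.write_text(json.dumps(conf, indent=2) + "\n")
    return conf


def deploy(conf_path: Path, dest_dir: Path = BACKUP_HOST_DIR) -> Path:
    """Copy the updated file to the backup host directory and return its path."""
    return Path(shutil.copy(conf_path, dest_dir))
```

A typical sequence would be `write_tuning(Path("hadoop.conf"), 4, 80)` followed by `deploy(Path("hadoop.conf"))` run on the backup host. Merging into the existing object, rather than overwriting the file, preserves any other settings already present in hadoop.conf.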