NetBackup 8.2 / 3.2 Hotfix - CloudCatalyst EEB Bundle (Etrack 3981837)
Description
This Hotfix resolves the following issues:
- 5240 CloudCatalyst Appliance hangs every 2 weeks due to memory exhaustion
- NVE-386 CC needs to check for revoked certificates via OCSP support in go SDK code
- CloudCatalyst starts to experience slower writes and then eventually halts
- Make esfs use multi-object delete request when supported
- ESFS legacy log rotation out of principle
- Restore from CloudCatalyst-AWS-Snowball fails with error "Image warming failed 501"
- Backups and duplications to CloudCatalyst are not progressing; vxesfsd and rocksdb core dumps
- Restore of a VMware image from a CC media server is failing.
- [SWST] NB_D2C_AMZ_AIR_Imageshare -> NB_Cloud_DR_API -> Failed to Initialize DR In Cloud
- Amazon backup failed with media close error 87
- [SWST] NB_CC_AMZ_SS_Encrypt -> Verify_Encrypt_Enabled -> Failed to verify if server side encryption has been enabled
- Datastore will not initialize on cloud catalyst server
- CloudCatalyst vxesfsd process corrupted on nbu 8.2.
- [CC][Glacier] Restore of an image failed with error "Image warming failed 409"
- Upgrade to the latest fuse library (version 3.6.2)
- Fix vxesfsd crashes by removing boost library.
- Fix vxesfsd crashes caused by invalid value of sys_nlink.
- Remove unnecessary/obsolete entries from the fsdb database.
- Fix vxesfsd crashes caused by a null-pointer exception in esfs_opendir.
- Add an OCSP check to ocsd using VerifyPeerCertificate. The OCSP check happens for every TLS connection made. OCSP information is retrieved from the server certificate.
- Currently esfs only uses the multi-object delete feature for Amazon.
- Adding a check for the BulkDelete attribute in CloudProvider.xml so that esfs will use the feature for all providers known to support it.
- Honor max log size in esfs.json for ocsd
- Fix restore failure from AWS Snowball device
- Restore fails with error 'Image warming failed 501'.
- esfs_storage logs an error 'NotImplemented: This operation is not supported yet. status code: 501'
- Improve performance and reduce memory consumption for ocsd process.
- OCSD log tool.
- Fix import failure error in cloud NBU (for DR).
- Add more detailed logging for cache eviction process
- Remove orphaned entries in file list directories
- Switch from char arrays to std::string
- Change log write to avoid race condition
- Refactor requestWorker to not reuse connection after getting region. Add socket log.
- Do not create an empty zero-byte log file on startup.
- Add OCSP caching to remove the overhead of slow response times from the OCSP server. The default cache time is set to 60 minutes.
- ET3982970: Cannot remove certain directories. Change the rmdir logic that checks whether a directory is empty. This change makes sure that a directory with garbage data can be deleted.
- vxesfsd crashing. Set max open files to limit RocksDB memory allocation. The allocated RSS memory will never be larger than 1GB.
- ET3985755: Retry when there is an HTTP conflict with the AWS 'operation aborted' error.
- ET 3990062: Cache sys_ino for /data and /databases for performance. Remove unnecessary lock for esfs_opendir for MSDP performance. Start ocsd even if vxesfsd is already running. Check disk usage no more than once every 10 seconds.
- ET 3990062: Cache eviction improvements. Skip bhd files and recently modified files during cache eviction. If unable to reclaim enough space, consider them for eviction the next time.
- Cache the metadata instead of releasing it when the reference count is 0. Change ocsd to download multiple objects for one file.
- Free cached memory in destruction method and fix an incorrect memory free.
- ET 3989115: Round robin between upload and delete requests to avoid starving delete requests in very busy environments.
- ET 3990062: Fix performance issue of image sharing when data locality is bad.
- Fix image sharing issue over AIR.
- Improve performance of opendir/readdir (remove support for optional d_type on readdir since MSDP does not use it).
- Correct name of temp download file for Azure.
- Prevent inode reuse and change list result for Azure.
- Change log mechanism. There is a dedicated ocsd log routine.
- Problems addressed:
- 1. The small log file
- 2. Log file is unexpectedly closed
- The log configuration will be more consistent.
- Remove nbu_wrapper dependency from ocsd. It can get cloud configuration using web service.
- Get cloud instance configuration file directly, if NB web service does not return the configuration.
- ET 3997365: Allow esfs to keep running on non-fatal errors in fsdb. Avoid a crash when vxesfs cannot continue at startup.
- ET 3994287: Ignore unrecognizable lines in bp.conf.
- ET 3993119: Support ECA and remove '.dl' from azure download method.
- ET 3993574: For delete requests change the ino to ext_rscn if it's not null (case of duplicate ino) for DR from cloud.
- The storage manager uses ext_rscn as the real inode for download because the inode might be reused. The DR from cloud utility stores the inode in the cloud into ext_rscn.
- Comprehensive fsdb check at start. The allocated inode checking time is the same as metadata checking time.
- Implement fsdb check and integrate it into vxesfsd. vxesfsd will stop when fsdb has problems. fsdb check can remove garbage entries.
- Flush FSDB WAL at some important points. Add more info into fill_emptyfile for better analysis in future.
- ET 4006406: When proxy server is not enabled, we shouldn't see proxy related errors in the logs. Also handled NONE auth type.
- ET 3995775: Remove eof error message printed in ocsd logs.
- ET 4002975: Add ReadAt function for OCSReader because AWS SDK has special logic to reduce memory allocation when ReadAt is implemented.
- ET 3998016: Upgrades to go aws-sdk-go that include fixes for memory usage and other improvements
- Remove libnbsqlite.so dependency from fsdb_check.
- Change checking condition for socket ready.
- Search for short name in certmapinfo.json if exact match is not found. Ignore case when comparing server names.
- needWarm interface so that MSDP knows whether the bucket supports warming.
- Handle warm request for Azure blob.
- Update the warm stat file without warming when low latency storage type is selected.
- Match the objects for MSDPCC restores.
- Fix UseCRL log to identify if CRL is enabled or not.
- Fix errors in calling newOCSHTTPClient and change file descriptors 0, 1 and 2. /var/log/ocsd.log will have information when ocsd crashes.
- ET 4007911: Correct OS command paths from /usr/bin to /bin in pre and post-install scripts so they work on older versions of RedHat Linux.
- ET 4005838: Handle partial read case for Azure. When a file buffer is read partially it should update the buffer and size.
- ET 4010076: Set skipVerify to true when UseCRL is empty
- ET 4010614: Change the check on the return value of select() for socket readiness. select() can return a value that is larger than expected.
- ET 4010651: Skip verification of certificate when UseCRL is empty for S3 compatible provider.
- ET 4010965: vxesfsd is unable to convert ocsd's pid to integer due to a larger than expected value. It returns an error: 'Unable to convert pid to a integer'.
- ET 4012960: Avoid ocsd crash in ocspVerify(). Check size of vChain before referencing vChain[0] and vChain[1]. Do not reference nil err variable when issuer cert cannot be found.
- ET 4013386: Enable multipart upload/download for NetApp StorageGRID and VERITAS Access. This avoids the misleading timeout error on large upload requests that take longer than 15 minutes, because the upload requests are now broken up into separate part requests.
- ET 4012807: Rebuild of rocksdb with portable=1 and remove USE_SSE=0
- ET 4012349: On appliances when daemons are started from the CLISH, some process is sending SIGHUP or SIGINT, killing ocsd. Ignore SIGHUP and SIGINT.
- ET 4027975: Use 100MB partSize for vtas-access multipart.
- Effectively disables multipart upload and download for vtas-access except for large DO files (over 100MB).
- Addresses performance problem with vtas-access.
- Add NetBackup Release Static Version String to all binaries so that the version of NetBackup or the EEB a binary comes from can be identified.
- Fix uninstall script issue where environments without CloudCatalyst configured complain about a missing esfstab file.
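One of the fixes listed above alternates between upload and delete requests so that deletes are not starved in busy environments. A minimal Python sketch of that round-robin idea follows. The function name and the use of two in-memory queues are assumptions for illustration only; this is not the ocsd code.

```python
from collections import deque


def round_robin(uploads, deletes):
    """Yield requests, alternating between the upload and delete queues.

    Hypothetical helper: when one queue is empty its turn is skipped,
    so the other queue drains without waiting.
    """
    queues = [deque(uploads), deque(deletes)]
    turn = 0  # 0 = upload queue, 1 = delete queue
    while queues[0] or queues[1]:
        if queues[turn]:
            yield queues[turn].popleft()
        turn = 1 - turn  # hand the next turn to the other queue


if __name__ == "__main__":
    # A delete queued behind many uploads is served on the second turn.
    for request in round_robin(["up1", "up2", "up3"], ["del1"]):
        print(request)
```

With a single shared first-in-first-out queue, the delete request in this example would wait behind all three uploads; alternating turns bounds that wait to one upload.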
Version 26 includes these additional fixes:
- Add User-Agent.
- Add log for protobuf error and enable fsdb log for rebuild_fsdb.
- Store correct storage server name in fsdb in rebuild_fsdb
- Do not store 8.2 entries in pre-8.2 fsdb in rebuild_fsdb
- ET 4029128: Fix throttling issue for a case when THR:WORK_TIME_START is greater than THR:WORK_TIME_END.
- In that case WORK_TIME_BANDWIDTH_PERCENT was not used, OFF_TIME_BANDWIDTH_PERCENT setting was used instead.
- Fix to use no bandwidth when THR:OFF_TIME_BANDWIDTH_PERCENT is set to zero.
- Fix to remove bandwidth limit when throttling is disabled.
- Fix a case where bandwidth is zero and OCSD is consuming CPU while checking if bandwidth is available.
- ET 4040014: Add retries to handleWarmCheck. The fix is to resolve 403 errors seen on restores from Glacier/Glacier Deep Archive.
- ET 4049812: Use the port number specified by the user. For vtas-access the non-SSL port is 8143 not 80.
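The ET 4029128 throttling fix concerns a work window whose start is later than its end, for example a window that runs overnight. A minimal Python sketch of a wrap-around window check follows. The function names, the use of whole hours, and the inclusive-start, exclusive-end boundaries are assumptions for illustration; only the setting names THR:WORK_TIME_START, THR:WORK_TIME_END, WORK_TIME_BANDWIDTH_PERCENT and OFF_TIME_BANDWIDTH_PERCENT come from the notes above.

```python
def in_work_window(hour, start, end):
    """Return True if hour falls inside the work window [start, end).

    When start > end the window wraps past midnight, for example
    22 -> 6 covers 22, 23, 0, 1, ..., 5.
    """
    if start <= end:
        return start <= hour < end
    return hour >= start or hour < end


def bandwidth_percent(hour, start, end, work_pct, off_pct):
    """Pick the work-time or off-time percentage for the given hour."""
    return work_pct if in_work_window(hour, start, end) else off_pct


if __name__ == "__main__":
    # Overnight window 22:00 to 06:00 with 80% work-time bandwidth.
    for h in (23, 3, 12):
        print(h, bandwidth_percent(h, 22, 6, work_pct=80, off_pct=20))
```

A check written only as `start <= hour < end` never matches when start is greater than end, which is consistent with the described symptom of the off-time percentage being applied instead of the work-time percentage.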
Versions Affected
- NetBackup 8.2
This EEB should be installed on: Cloud Catalyst Media Servers
README Notes:
This EEB introduces a comprehensive fsdb check.
If vxesfsd detects records with inode conflicts, ESFS stops and fsdb must be rebuilt.
The vxesfsd process will start if it detects that the following conditions are true:
- Every used inode is smaller than the system inode.
- No entries use the same inode.
- The information for an entry is complete. When an entry under a directory exists, the inode record and metadata records must exist.
If these check conditions are met, fsdb is considered to be in a consistent state and then ESFS will proceed to start.
If an inode record violates any of these check conditions, fsdb is considered to be in an inconsistent state and will require rebuilding.
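The three check conditions can be sketched in Python as follows. The data model here (entries as name and inode pairs, plus sets of inodes that have inode and metadata records) is a hypothetical simplification for illustration and does not reflect the real fsdb schema or the fsdb_check tool.

```python
def check_fsdb(entries, inode_records, metadata_records, system_inode):
    """Return a list of violations; an empty list means a consistent state.

    entries          -- iterable of (name, inode) pairs found under directories
    inode_records    -- set of inodes that have an inode record
    metadata_records -- set of inodes that have a metadata record
    system_inode     -- current system inode value
    All of these shapes are assumptions made for this sketch.
    """
    problems = []
    owner_of = {}
    for name, inode in entries:
        # Condition 1: the used inode is smaller than the system inode.
        if inode >= system_inode:
            problems.append(f"{name}: inode {inode} is not below {system_inode}")
        # Condition 2: no two entries use the same inode.
        if inode in owner_of:
            problems.append(f"{name}: inode {inode} also used by {owner_of[inode]}")
        else:
            owner_of[inode] = name
        # Condition 3: the inode record and metadata record both exist.
        if inode not in inode_records:
            problems.append(f"{name}: missing inode record for {inode}")
        if inode not in metadata_records:
            problems.append(f"{name}: missing metadata record for {inode}")
    return problems


if __name__ == "__main__":
    entries = [("a.bhd", 10), ("b.bhd", 10)]  # two entries share inode 10
    print(check_fsdb(entries, {10}, {10}, system_inode=100))
```

In this sketch an empty result corresponds to the consistent state in which ESFS proceeds to start, and any reported violation corresponds to the inconsistent state that requires a rebuild.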
- Prior to installing this EEB, make sure you have a current drcontrol policy backup of your CloudCatalyst server/appliance.
- CloudCatalyst servers/appliances should always be protected by an active drcontrol policy.
- This allows for recovery of the CloudCatalyst environment in order to access the data uploaded to the cloud.
- Without this, it is possible that data loss could occur.
- More information about the drcontrol policy is available in the NetBackup Deduplication Guide.
Downloads:
NB_8.2_ET3981837_26.zip
Appliance:
NBAPP_EEB_ET3981837-3.2.0.0-26.x86_64.rpm
VRTSflex-nb_EEB_ET3981837-8.2-26.x86_64.rpm
Installation Instructions:
1. Stop NBU services.
2. Uninstall any previous version of this EEB (3981837 versions 1 to 19) before installing version 20.
3. If not an NBU appliance, please run the EEB installer with the -create option.
4. Start NBU services.
Using the NetBackup Emergency Engineering Binary (EEB) installer
https://www.veritas.com/docs/100019405
Installing EEBs on a NetBackup 52x0 / 5330 Appliance
https://www.veritas.com/docs/100023444
How to install add-ons or an EEB on NetBackup instances running on Flex 1.3 version
https://www.veritas.com/content/support/en_US/doc/130821112-136840843-0/v137506948-136840843
Checksums for installed files:
File Checksum Byte count
linuxR_x86/cc_touch 1667738358 189624
linuxR_x86/cred_ioctl 4048399636 57672
linuxR_x86/dbdump 2119704288 7614784
linuxR_x86/esfs_check 3132802536 8466088
linuxR_x86/esfs_cleanup 3988313719 2901371
linuxR_x86/esfs_init.sh 3214286405 8296
linuxR_x86/esfs_reconfig 3467684365 473008
linuxR_x86/esfs_recover.sh 1421309387 2175
linuxR_x86/esfs_upgrade.sh 3842231178 7338
linuxR_x86/esfs_version.txt 2288867304 180
linuxR_x86/fsdb_check 9256110 13781336
linuxR_x86/fsdbbackup 1343945337 17368
linuxR_x86/install-3981837 1296998551 3258
linuxR_x86/libfuse3.so.3.6.2 2532643060 833440
linuxR_x86/librocksdb.so.6.0.2 1177832479 5731584
linuxR_x86/mkesfs 3190631609 8061448
linuxR_x86/nbu_wrapper 2609208640 1582320
linuxR_x86/ocsd 1201644309 18200167
linuxR_x86/ocsd_log_view 2453407573 128992
linuxR_x86/post_uninstall-3981837 2213021276 4694
linuxR_x86/pre_proc_uninstall_3981837 3395168376 2631
linuxR_x86/preprocess_install_3981837 1548897195 665
linuxR_x86/recoverdb 226939226 615208
linuxR_x86/setlsu_ioctl 2931137077 17400
linuxR_x86/vxesfs 4104628702 3271537
linuxR_x86/vxesfsd 2830669010 7568200
Recommended service state:
Stop all NetBackup services before applying this hotfix.