NetBackup 8.2 / 3.2 Hotfix - CloudCatalyst EEB Bundle (Etrack 3981837)

HotFix Critical

Update ID: UPD772994

Version: 8.2 / 3.2

Platform: Linux

Release date: 2022-09-05

Abstract

This NetBackup 8.2 CloudCatalyst Hotfix EEB Bundle resolves issues on CloudCatalyst (CC) media servers.

Description

This Hotfix resolves the following issues:

5240 CloudCatalyst Appliance hangs every 2 weeks due to memory exhaustion
NVE-386 CC needs to check for revoked certificates via OCSP support in go SDK code
CloudCatalyst starts to experience slower writes and then eventually halts
Make esfs use multi-object delete request when supported
ESFS legacy log rotation out of principle
Restore from CloudCatalyst-AWS-Snowball fails with error "Image warming failed 501"
Backups and duplications to CloudCatalyst are not progressing; vxesfsd and rocksdb core dumps
Restore of a VMware image from a CC media server is failing.
[SWST] NB_D2C_AMZ_AIR_Imageshare -> NB_Cloud_DR_API -> Failed to Initialize DR In Cloud
Amazon backup failed with media close error 87
[SWST] NB_CC_AMZ_SS_Encrypt -> Verify_Encrypt_Enabled -> Failed to verify if server side encryption has been enabled
Datastore will not initialize on cloud catalyst server
CloudCatalyst vxesfsd process corrupted on nbu 8.2.
[CC][Glacier] Restore of an image failed with error "Image warming failed 409"
Upgrade to the latest fuse library (version 3.6.2)
Fix vxesfsd crashes by removing boost library.
Fix vxesfsd crashes caused by invalid value of sys_nlink.
Remove unnecessary/obsolete entries from the fsdb database.
Fix vxesfsd crashes caused by a null-pointer exception in esfs_opendir.
Add an OCSP check to ocsd using VerifyPeerCertificate. The OCSP check happens for every TLS connection made. OCSP information is retrieved from the server certificate.
Currently esfs only uses the multi-object delete feature for Amazon.
Adding a check for the BulkDelete attribute in CloudProvider.xml so that esfs will use the feature for all providers known to support it.
Honor max log size in esfs.json for ocsd
Fix restore failure from AWS Snowball device
Restore fails with error 'Image warming failed 501'.
esfs_storage logs an error 'NotImplemented: This operation is not supported yet. status code: 501'
Improve performance and reduce memory consumption for ocsd process.
OCSD log tool.
Fix import failure error in cloud NBU (for DR).
Add more detailed logging for cache eviction process
Remove orphaned entries in file list directories
Switch from char arrays to std::string
Change log write to avoid race condition
Refactor requestWorker to not reuse connection after getting region. Add socket log.
Do not create an empty zero-byte log file on startup.
Add OCSP caching to remove the response-time overhead of querying the OCSP server. The default cache time is set to 60 minutes.
ET3982970: Cannot remove certain directories. Change the rmdir logic that checks whether a directory is empty.
This change ensures that a directory containing garbage data can be deleted.
Fix vxesfsd crashing: set a maximum number of open files to limit RocksDB memory allocation. The allocated RSS memory will never be larger than 1GB.
ET3985755: Retry when there is http conflict with aws 'operation aborted' error.
ET 3990062: Cache sys_ino for /data and /databases for performance
Remove unnecessary lock for esfs_opendir for MSDP performance
Start ocsd even if vxesfsd is already running
Check disk usage no more than once every 10 seconds
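
The OCSP caching item above (responses cached for a default of 60 minutes) can be pictured as a simple time-to-live cache. The following Python sketch is illustrative only: ocsd is not written in Python, and the class name `OcspResponseCache`, the `fetch` callback, and keying by certificate serial number are assumptions made for this example, not details taken from the hotfix.

```python
import time
from typing import Callable, Dict, Tuple

DEFAULT_TTL_SECONDS = 60 * 60  # 60 minutes, the default stated in the fix list


class OcspResponseCache:
    """Hypothetical TTL cache for OCSP status lookups (illustration only)."""

    def __init__(self, fetch: Callable[[str], str],
                 ttl_seconds: float = DEFAULT_TTL_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch    # assumed callback that queries the OCSP server
        self._ttl = ttl_seconds
        self._clock = clock    # injectable so expiry can be exercised in tests
        self._entries: Dict[str, Tuple[float, str]] = {}

    def status(self, serial: str) -> str:
        """Return the cached status if still fresh, otherwise fetch and store it."""
        now = self._clock()
        cached = self._entries.get(serial)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]   # fresh entry: no round trip to the OCSP server
        value = self._fetch(serial)  # missing or expired: ask the server again
        self._entries[serial] = (now, value)
        return value
```

With a cache of this shape, repeated TLS connections to the same endpoint within the cache window reuse the stored status instead of contacting the OCSP server each time.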

ET 3990062:
Cache eviction improvements
Skip bhd files and recently modified files during cache eviction
If unable to reclaim enough space, consider them for eviction the next time
Cache the metadata instead of releasing it when the reference count is 0. Change ocsd to download multiple objects for one file.
Free cached memory in the destruction method and fix an incorrect memory free.
ET 3989115:
Round robin between upload and delete requests to avoid starving delete requests in very busy environments.
ET 3990062:
Fix performance issue of image sharing when data locality is bad.
Fix imagesharing's issue over AIR.
Improve performance of opendir/readdir (remove support for optional d_type on readdir since MSDP does not use it).
Correct name of temp download file for Azure.
Prevent inode reuse and change list result for Azure.
Change log mechanism. There is a dedicated ocsd log routine.
Problems addressed:
1. The small log file
2. Log file is unexpectedly closed
The log configuration will be more consistent.
Remove nbu_wrapper dependency from ocsd. It can get cloud configuration using web service.
Get cloud instance configuration file directly, if NB web service does not return the configuration.
ET 3997365: Allow esfs to keep running on non-fatal errors in fsdb. Avoid a crash when vxesfs cannot continue at startup.
ET 3994287: Ignore unrecognizable lines in bp.conf.
ET 3993119: Support ECA and remove '.dl' from azure download method.
ET 3993574: For delete requests change the ino to ext_rscn if it's not null (case of duplicate ino) for DR from cloud.
The storage manager uses ext_rscn as the real inode for downloads because the inode might be reused. The DR-from-cloud utility stores the inode used in the cloud in ext_rscn.
Comprehensive fsdb check at start. The allocated inode checking time is the same as metadata checking time.
Implement fsdb check and integrate it into vxesfsd. vxesfsd will stop when fsdb has problems. fsdb check can remove garbage entries.
Flush FSDB WAL at some important points. Add more info into fill_emptyfile for better analysis in future.
ET 4006406: When proxy server is not enabled, we shouldn't see proxy related errors in the logs. Also handled NONE auth type.
ET 3995775: Remove eof error message printed in ocsd logs.
ET 4002975: Add ReadAt function for OCSReader because AWS SDK has special logic to reduce memory allocation when ReadAt is implemented.
ET 3998016: Upgrades to go aws-sdk-go that include fixes for memory usage and other improvements
Remove libnbsqlite.so dependency from fsdb_check.
Change checking condition for socket ready.
Search for short name in certmapinfo.json if exact match is not found. Ignore case when comparing server names.
Add a needWarm interface so that MSDP can tell whether the bucket supports warming.
Handle warm request for Azure blob.
Update the warm stat file without warming when low latency storage type is selected.
Match the objects for MSDPCC restores.
Fix UseCRL log to identify if CRL is enabled or not.
Fix errors in calling newOCSHTTPClient and change file descriptors 0, 1, and 2. /var/log/ocsd.log will contain information when ocsd crashes.
ET 4007911: Correct OS command paths from /usr/bin to /bin in pre and post-install scripts so they work on older versions of RedHat Linux.
ET 4005838: Handle partial read case for Azure. When a file buffer is read partially it should update the buffer and size.
ET 4010076: Set skipVerify to true when UseCRL is empty
ET 4010614: Change return value of select() for socket is ready. select() returns a value that is larger than expected.
ET 4010651: Skip verification of certificate when UseCRL is empty for S3 compatible provider.
ET 4010965: vxesfsd is unable to convert ocsd's pid to integer due to a larger than expected value. It returns an error: 'Unable to convert pid to a integer'.
ET 4012960: Avoid ocsd crash in ocspVerify(). Check size of vChain before referencing vChain[0] and vChain[1]. Do not reference nil err variable when issuer cert cannot be found.
ET 4013386: Enable multipart upload/download for NetApp StorageGRID and VERITAS Access. This avoids the misleading timeout error on large upload requests that take longer than 15 minutes, because the upload requests are now broken up into multiple separate part requests.
ET 4012807: Rebuild of rocksdb with portable=1 and remove USE_SSE=0
ET 4012349: On appliances, when daemons are started from the CLISH, some process sends SIGHUP or SIGINT, killing ocsd. Ignore SIGHUP and SIGINT.
ET 4027975: Use 100MB partSize for vtas-access multipart.
Effectively disables multipart upload and download for vtas-access except for large DO files (over 100MB).
Addresses performance problem with vtas-access.
Add NetBackup Release Static Version String to all binaries so that it is possible to identify which NetBackup version or EEB the binaries come from.
Fix uninstall script issue where environments without CloudCatalyst configured complain about a missing esfstab file.
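
To illustrate the 100MB partSize change for vtas-access listed above: with a part size that large, only objects over 100MB are split into more than one request. The Python sketch below is an illustration under stated assumptions, not ocsd code. `plan_parts` is a hypothetical helper, and 100MB is taken to mean 100 * 1024 * 1024 bytes, which the article does not specify.

```python
from typing import List, Tuple

PART_SIZE = 100 * 1024 * 1024  # assumed byte value for "100MB"


def plan_parts(object_size: int, part_size: int = PART_SIZE) -> List[Tuple[int, int]]:
    """Return (offset, length) ranges for transferring object_size bytes."""
    if object_size <= part_size:
        # At or below the part size: a single request, so no multipart transfer.
        return [(0, object_size)]
    parts = []
    for offset in range(0, object_size, part_size):
        # The last part carries whatever remains and may be shorter.
        parts.append((offset, min(part_size, object_size - offset)))
    return parts
```

For example, under these assumptions a 50MB object yields one range, while a 250MB object yields three ranges (100MB, 100MB, and 50MB).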

Version 26 includes these additional fixes:

Add User-Agent.
Add log for protobuf error and enable fsdb log for rebuild_fsdb.
Store correct storage server name in fsdb in rebuild_fsdb
Do not store 8.2 entries in pre-8.2 fsdb in rebuild_fsdb
ET 4029128: Fix throttling issue for a case when THR:WORK_TIME_START is greater than THR:WORK_TIME_END.
In that case WORK_TIME_BANDWIDTH_PERCENT was not used; the OFF_TIME_BANDWIDTH_PERCENT setting was used instead.
Fix to use no bandwidth when THR:OFF_TIME_BANDWIDTH_PERCENT is set to zero.
Fix to remove bandwidth limit when throttling is disabled.
Fix a case where bandwidth is zero and OCSD is consuming CPU while checking if bandwidth is available.
ET 4040014: Add retries to handleWarmCheck. The fix is to resolve 403 errors seen on restores from Glacier/Glacier Deep Archive.
ET 4049812: Use the port number specified by the user. For vtas-access the non-SSL port is 8143 not 80.
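
The ET 4029128 throttling fixes above concern a work-time window whose start is later than its end, that is, a window that wraps past midnight. A minimal Python sketch of that selection logic follows. It assumes the start and end values are hours of the day (0 to 23), with the start inclusive and the end exclusive. The function names are hypothetical, and the actual ocsd implementation may differ.

```python
from typing import Optional


def in_work_window(hour: int, start: int, end: int) -> bool:
    """Return True if hour falls inside the work-time window."""
    if start <= end:
        return start <= hour < end
    # start > end: the window wraps past midnight, e.g. 20 -> 6.
    return hour >= start or hour < end


def bandwidth_percent(hour: int, enabled: bool, start: int, end: int,
                      work_percent: int, off_percent: int) -> Optional[int]:
    """Pick the bandwidth percentage to apply; None means no limit."""
    if not enabled:
        return None  # throttling disabled: remove the bandwidth limit
    if in_work_window(hour, start, end):
        return work_percent
    return off_percent  # may be 0, which means use no bandwidth
```

With a window of 20 to 6, hours 23 and 3 select the work-time percentage and hour 12 selects the off-time percentage.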

Versions Affected

NetBackup 8.2

This EEB should be installed on: Cloud Catalyst Media Servers

README Notes:
This EEB introduces a comprehensive fsdb check.
If vxesfsd detects records with inode conflicts, ESFS stops and fsdb must be rebuilt.

The vxesfsd process will start if it detects that the following conditions are true:

Every used inode is smaller than the system inode.
No two entries use the same inode.
The information for an entry is complete. When an entry under a directory exists, the inode record and metadata records must exist.

If these check conditions are met, fsdb is considered to be in a consistent state and then ESFS will proceed to start.

If an inode record violates any of these check conditions, fsdb is considered to be in an inconsistent state and will require rebuilding.
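
The three check conditions above can be sketched as follows. This is a simplified illustration in Python, not the fsdb_check implementation: the `FsdbEntry` record, its fields, and the `find_inconsistencies` function are hypothetical stand-ins for whatever fsdb actually stores.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class FsdbEntry:
    """Hypothetical, simplified view of one directory entry in fsdb."""
    path: str
    inode: int
    has_inode_record: bool
    has_metadata_record: bool


def find_inconsistencies(entries: Iterable[FsdbEntry], system_inode: int) -> List[str]:
    """Return one message per violated condition; an empty list means consistent."""
    problems: List[str] = []
    owner_by_inode: Dict[int, str] = {}
    for entry in entries:
        # Condition 1: a used inode must be smaller than the system inode.
        if entry.inode >= system_inode:
            problems.append(f"{entry.path}: inode {entry.inode} is not smaller "
                            f"than system inode {system_inode}")
        # Condition 2: no two entries may use the same inode.
        if entry.inode in owner_by_inode:
            problems.append(f"{entry.path}: inode {entry.inode} is already used "
                            f"by {owner_by_inode[entry.inode]}")
        else:
            owner_by_inode[entry.inode] = entry.path
        # Condition 3: the inode record and metadata record must both exist.
        if not (entry.has_inode_record and entry.has_metadata_record):
            problems.append(f"{entry.path}: inode record or metadata record is missing")
    return problems
```

In this model, an empty result corresponds to the consistent state in which ESFS proceeds to start, and any message corresponds to the inconsistent state that requires a rebuild.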

Prior to installing this EEB, make sure you have a current drcontrol policy backup of your CloudCatalyst server/appliance.
CloudCatalyst servers/appliances should always be protected by an active drcontrol policy.
This allows for recovery of the CloudCatalyst environment in order to access the data uploaded to the cloud.
Without this, it is possible that data loss could occur.
More information about the drcontrol policy is available in the NetBackup Deduplication Guide.

Downloads:

NB_8.2_ET3981837_26.zip

Appliance:
NBAPP_EEB_ET3981837-3.2.0.0-26.x86_64.rpm

VRTSflex-nb_EEB_ET3981837-8.2-26.x86_64.rpm

Installation Instructions:

1. Stop NBU services.

2. Uninstall any previous version of this EEB (3981837 versions 1 to 19) before installing version 20.

3. If not an NBU appliance, please run the EEB installer with the -create option.

4. Start NBU services.

Using the NetBackup Emergency Engineering Binary (EEB) installer

https://www.veritas.com/docs/100019405

Installing EEBs on a NetBackup 52x0 / 5330 Appliance

https://www.veritas.com/docs/100023444

How to install add-ons or an EEB on NetBackup instances running on Flex 1.3 version
https://www.veritas.com/content/support/en_US/doc/130821112-136840843-0/v137506948-136840843

Checksums for installed files:

File Checksum Byte count

linuxR_x86/cc_touch 1667738358 189624
linuxR_x86/cred_ioctl 4048399636 57672
linuxR_x86/dbdump 2119704288 7614784
linuxR_x86/esfs_check 3132802536 8466088
linuxR_x86/esfs_cleanup 3988313719 2901371
linuxR_x86/esfs_init.sh 3214286405 8296
linuxR_x86/esfs_reconfig 3467684365 473008
linuxR_x86/esfs_recover.sh 1421309387 2175
linuxR_x86/esfs_upgrade.sh 3842231178 7338
linuxR_x86/esfs_version.txt 2288867304 180
linuxR_x86/fsdb_check 9256110 13781336
linuxR_x86/fsdbbackup 1343945337 17368
linuxR_x86/install-3981837 1296998551 3258
linuxR_x86/libfuse3.so.3.6.2 2532643060 833440
linuxR_x86/librocksdb.so.6.0.2 1177832479 5731584
linuxR_x86/mkesfs 3190631609 8061448
linuxR_x86/nbu_wrapper 2609208640 1582320
linuxR_x86/ocsd 1201644309 18200167
linuxR_x86/ocsd_log_view 2453407573 128992
linuxR_x86/post_uninstall-3981837 2213021276 4694
linuxR_x86/pre_proc_uninstall_3981837 3395168376 2631
linuxR_x86/preprocess_install_3981837 1548897195 665
linuxR_x86/recoverdb 226939226 615208
linuxR_x86/setlsu_ioctl 2931137077 17400
linuxR_x86/vxesfs 4104628702 3271537
linuxR_x86/vxesfsd 2830669010 7568200
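
The checksum and byte-count pairs above have the shape of POSIX `cksum` output, although the article does not name the tool that produced them. Assuming that is the case, the Python sketch below computes the same two values for a file so that they can be compared with the table. The function names are illustrative, and the bit-at-a-time loop favors clarity over speed.

```python
from typing import Tuple

_POLY = 0x04C11DB7  # CRC polynomial used by POSIX cksum


def _feed(crc: int, byte: int) -> int:
    """Fold one byte into the running 32-bit CRC, most significant bit first."""
    crc ^= byte << 24
    for _ in range(8):
        if crc & 0x80000000:
            crc = ((crc << 1) ^ _POLY) & 0xFFFFFFFF
        else:
            crc = (crc << 1) & 0xFFFFFFFF
    return crc


def posix_cksum(data: bytes) -> Tuple[int, int]:
    """Return (checksum, byte count) in the style of POSIX cksum."""
    crc = 0
    for byte in data:
        crc = _feed(crc, byte)
    # cksum also feeds the length, least significant byte first.
    length = len(data)
    while length:
        crc = _feed(crc, length & 0xFF)
        length >>= 8
    return crc ^ 0xFFFFFFFF, len(data)


def cksum_file(path: str) -> Tuple[int, int]:
    """Read a file fully and return its (checksum, byte count)."""
    with open(path, "rb") as handle:
        return posix_cksum(handle.read())
```

If the assumption about the tool holds, calling `cksum_file` on an installed file and comparing the returned pair with the corresponding row is one way to confirm the file matches this EEB version.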

Recommended service state:

Stop all NetBackup services before applying this hotfix.

Applies to the following product releases

NetBackup 8.2

Release date: 2019-06-28

End of standard support: 2023-06-28

Sustaining support starts: 2025-06-28

End of support life: 2026-06-28

NetBackup Appliance OS 3.2

Release date: 2019-11-04

End of standard support: 2023-06-28

Sustaining support starts: 2025-06-28

End of support life: 2026-06-28
