Problem
A NDMP restore job will fail if insufficient ports are open between filer and media server
Error Message
The logs extracts capture where the filer is unable to complete a NDMP data connection back via port 3171 to the designated Media Server.
And in the following example NDMP filer was unable to complete a data connection back to media server which had a port range from 1025 to 5000.
NetBackup Job detailed status shows:
25/08/2019 13:00:52 - begin reading
25/08/2019 13:00:53 - Info ndmpagent (pid=48777) INF - Restoring NDMP files from /vol/user_files/username/ to [See line below]
25/08/2019 13:00:53 - Info ndmpagent (pid=48777) INF - Restoring NDMP files from [See line above] to /vol/user_files/username/Restore
25/08/2019 13:00:53 - Info ndmpagent (pid=48777) NDMP Remote disk
25/08/2019 13:00:53 - Info ndmpagent (pid=48777) DAR enabled
25/08/2019 13:02:15 - Error ndmpagent (pid=48777) ndmp_data_connect_v3 failed, .... (-1)
25/08/2019 13:02:15 - Error ndmpagent (pid=48777) NDMP restore failed from path /vol/user_files
25/08/2019 13:02:15 - Info ndmpagent (pid=48777) done. status: 25
25/08/2019 13:02:15 - Info bptm (pid=48778) EXITING with status 5 <----------
25/08/2019 13:02:15 - Info hostname.com (pid=48778) StorageServer=PureDisk:hostname; Report=PDDO Stats for (hostname): read: 0 KB, CR received: 4456 KB, CR received over FC: 0 KB, dedup: 0.0%
25/08/2019 13:02:15 - Critical bpbrm (pid=48773) unexpected termination of client filerhostname.com
25/08/2019 13:02:15 - Info ndmpagent (pid=48777) done. status: 5: the restore failed to recover the requested files
25/08/2019 13:02:15 - Error bpbrm (pid=48773) client restore EXIT STATUS 5: the restore failed to recover the requested files
Log extracts taken from a ndmpd daemon log (NetApp filer).
Default log location of the NetApp 'ndmpd daemon log': /etc/log/mlog/ndmpd.log
Sun Aug 25 2019 13:00:56 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: DMA>>S V4 sequence=11 (0xb)
Sun Aug 25 2019 13:00:56 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: Time_stamp=0x5751b7c8 (Aug 25 13:00:56 2019)
Sun Aug 25 2019 13:00:56 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: message type=0 (NDMP4_MESSAGE_REQUEST)
Sun Aug 25 2019 13:00:56 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: message_code=0x40a (NDMP4_DATA_CONNECT)
Sun Aug 25 2019 13:00:56 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: reply_sequence=0 (0x0)
Sun Aug 25 2019 13:00:56 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: error_code=0 (NDMP4_NO_ERR)
Sun Aug 25 2019 13:00:56 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: generic_ndmp_dispatch bytes left 20 (pre-body decode)
Sun Aug 25 2019 13:00:56 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: generic_ndmp_dispatch bytes left 0 (post body decode)
Sun Aug 25 2019 13:00:56 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: addr_type=1 (NDMP4_ADDR_TCP)
Sun Aug 25 2019 13:00:56 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: tcp_addr len=1 (0x1)
Sun Aug 25 2019 13:00:56 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: tcp_addr[0] ip_addr=0xa380a15 [10.10.10.1]
Sun Aug 25 2019 13:00:56 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: tcp_addr[0] port=3171
Sun Aug 25 2019 13:00:56 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: tcp_addr[0] addr_env_len=0 (0x0)
The Filer decides which port number is selected (10.10.10.1 at port 3171)
Sun Aug 25 2019 13:00:56 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: C>BKUP (ndmp_client) NDMPD2BR_CONNECT
Sun Aug 25 2019 13:00:56 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: sess_handle 0x19700000000 seq 1 (0x1)
Sun Aug 25 2019 13:00:56 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: context mode(0) vfid(0)
Sun Aug 25 2019 13:00:56 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: server_type 2 (DATA)
Sun Aug 25 2019 13:00:56 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: data_conn_addr_type 1 (ADDRESS TYPE: TCP)
Sun Aug 25 2019 13:00:56 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: control addr ip addr family (1) (ADDRESS TYPE: TCP)
Sun Aug 25 2019 13:00:56 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: control addr ip addr (10.31.13.237)
Sun Aug 25 2019 13:00:56 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: number of addresses 1
Sun Aug 25 2019 13:00:56 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: [0] addr: port(3171) IP addr: addr family (1) (ADDRESS TYPE: TCP)
Sun Aug 25 2019 13:00:56 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: [0] addr: port(3171) IP addr: addr (10.10.10.1)
But is unable to connect, and aborts the data connection.
Sun Aug 25 2019 13:02:11 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: C<BKUP (ndmp_client) NDMPD2BR_CONNECT
Sun Aug 25 2019 13:02:11 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: sess_handle 0x0 seq 0 (0x0)
Sun Aug 25 2019 13:02:11 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: error 33 (ERROR: CONNECT FAILED)
Sun Aug 25 2019 13:02:11 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: Data connection: LOCAL
Sun Aug 25 2019 13:02:11 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: MGMT_RPC::xid_handle decode xid 116457474 okay calling 0x447060
Sun Aug 25 2019 13:02:11 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: connect_work_cb: called
Sun Aug 25 2019 13:02:11 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: connect_work_cb Error from D-Blade = 33
Sun Aug 25 2019 13:02:11 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: DATA cb_data_connect_work errnum 0: Success or No Error
Sun Aug 25 2019 13:02:11 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: DATA rpc error E_CONNECT_ERR (code 33)
Sun Aug 25 2019 13:02:11 -04:00 [kern_ndmpd:info:1777] [89126] ERROR: DATA cb_data_connect_failed (2): 33
Sun Aug 25 2019 13:02:11 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: DATA abort_on_listen_connect_failure: called
Sun Aug 25 2019 13:02:11 -04:00 [kern_ndmpd:info:1777] [89126] INFO: DATA: sending listen/connect failed ABORT
Following the aborted connection, it informs the designated NetBackup Media Server of the failed connection.
Sun Aug 25 2019 13:02:11 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: S<BKUP (ndmp_server) BR2NDMPD_DEBUG_LOG
Sun Aug 25 2019 13:02:11 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: sess_handle 0x19700000000 seq 1549265 (0x17a3d1)
Sun Aug 25 2019 13:02:11 -04:00 [kern_ndmpd:info:1777] [89126] DEBUG: LOG sev (3) msg (NDMP Connect: Unable to Connect to [10.10.10.1]:3171 due to Connection timed out)
vnet_cached_getnameinfo: [vnet_addrinfo.c:2023] found via getnameinfo OUR_HOST=hostname.com IPSTR=10.10.10.1
From job details: MediaServer=hostname.com
Cause
This failures shows that the correct IP address and network interface was correctly selected, indicating no overrides are defined with the ndmp.cfg file. The data connection that assigned to port 3171 is within the default server port window of 1025-5000
The NDMP Filer is unable to connect to the desired port range, this will be due to either a firewall or other port blocking mechanism which is preventing the flow of traffic from the inbound connection from filer to media server from this port range.
An inbound NDMP data connection selects a random port number in response to operational requests, a connection that can either be blocked or is idle an connection that has eventually terminated under some of the following conditions.
Inbound NDMP data connections are initiated back Media Server upon request, unless defined on the NDMP filer then by definition it will select a random TCP/IP port number that will be assigned for NDMP data connections.
Therefore just allowing TCP/IP port 10000 communication through a firewall or anti-virus is insufficient. A prerequisite would be to allow the designated Media Server and NDMP filer to have defined and extensive port range that is allows for the inbound random port selections.
Solution
Investigate firewall rules, ensure data ports 1025-5000 are open from the NAS client to the media server. The port requirements and guidelines for both backup and restore an NDMP server are as follows:
TCP port 10000 must be open from the media server (DMA) to the NDMP filer (tape or disk) for all types of NDMP operations; local, remote, and 3-way. And the NetBackup SERVER_PORT_WINDOW must be open inbound from the filer to the media server for remote NDMP. It must also be open for efficient catalog file (TIR data) movement during local or 3-way NDMP.
In more recent releases some NDMP Vendors have started to offer the capability to assign a designated range of TCP/IP ports which can be used for NDMP data connection in response to NDMP_DATA_LISTEN and NDMP_MOVER_LISTEN operations.
Please the appropriate Consult Vendor Documentation on how to configure and define the required ports ranges to NDMP_DATA_LISTEN and NDMP_MOVER_LISTEN operations.
Also consider that an active Firewall Rules that have an active policy, that restrictions in place to either block or limit which random ports can used during a data connection.
And verify if Third Party Security Software such as Anti-Virus Software, could be intercepting and holding or preventing the random connecting port beyond the expected time to live internal and reaching it final destination.