Required Interface or Preferred Network settings seem to cause connections between non-routed networks on weakhost platforms

Article: 100028792
Last Published: 2013-08-05
Product(s): NetBackup & Alta Data Protection

Problem

The 'weakhost model' allows an operating system to make an outbound connection on a network interface using a source IP address that is assigned to a different network interface on the host.  This behavior is independent of and beyond the control of the application layer.

This can make it appear that a connection used by NetBackup is crossing between two networks that are not routed together.  In addition, the remote host will send replies to the source IP address that was present in the original packet; if a route exists the network will deliver the reply packets to that interface instead of the one that sent the original packet.  This is called "Asymmetrical Routing".

This type of behavior may be observed when using the "NetBackup Required Interface" or "Preferred Networks" setting with source interfaces specified or when using Cluster Name without Any Cluster Interface.
 

Error Message

If the destination NetBackup process is not expecting a connection from the source IP used by the connecting host, and does not have a 'Server entry' for the peername that the source IP resolves to, then it will reject the connection with one of the following status codes:

status 46: server not allowed access
status 59: access to the client was not allowed

If the destination operating system implements the "stronghost model" and recognizes that the source IP is arriving on the wrong network, it will drop the connection, causing the connecting NetBackup process to fail with one of the following status codes:

status 25: cannot connect on socket
status 40: network connection broken
status 58: can't connect to client

Cause

This behavior can be observed in all NetBackup debug logs, including bpbrm, bptm, bpcd, and vnetd, but it is simplest to reproduce using the bptestbpcd command.

First without any REQUIRED_INTERFACE or PREFERRED_NETWORK settings in the bp.conf file on the NetBackup server, bptestbpcd behaves as expected.  Connections to the client NIC on the 10.12 network originate from the 10.12 NIC on the server.

5220-01$ bptestbpcd -host 10.12.251.34
1 1 1
10.12.253.96:60985 -> 10.12.251.34:1556
10.12.253.96:48622 -> 10.12.251.34:1556
 
Sniffing the 10.10.10 interface on the client shows no traffic.

5200-01$ tcpdump -n -i eth0 (10.10.10.2)
<nothing captured>

Next, add this setting to the bp.conf file on the NetBackup server.

PREFERRED_NETWORK = 192.168.1.1 PROHIBITED

The same test now suggests that a route exists from the 10.10.10 network to the 10.12 network.

5220-01$ bptestbpcd -host 10.12.251.34
1 1 1
10.10.10.1:47909 -> 10.12.251.34:1556
10.10.10.1:53139 -> 10.12.251.34:1556

Snooping the 10.10.10 NIC on the client shows packets being returned to the 10.10.10 network with the source IP of the 10.12 NIC, but no inbound packets.  The same packets are observed inbound on the 10.10.10 NIC on the server, along with an absence of outbound packets.  Notice that the source port is a well-known port number and the destination port is a random port, which confirms this is a reply packet.  In addition, the TCP SYN with ACK confirms this is a reply.

5200-01$ tcpdump -q -n -nn -i eth0 (10.10.10.2)
13:48:47.691594 IP 10.12.251.34.1556 > 10.10.10.1.39823: S ack 33 win 5792 ...

5220-01$ tcpdump -q -n -nn -i eth2 (10.10.10.1)
13:48:47.691601 IP 10.12.251.34.1556 > 10.10.10.1.39823: S ack 33 win 5792 ...
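
The reply-packet reasoning above can be applied mechanically to a capture.  The following is a minimal Python sketch, not a NetBackup tool: it parses lines in the tcpdump format shown in this article and flags a line as a reply when the source port is the service port (1556 in these examples) and the segment carries an ACK.  The function name and the regular expression are illustrative assumptions.

```python
import re

# Matches capture lines in the format shown in this article, for example:
# "13:48:47.691594 IP 10.12.251.34.1556 > 10.10.10.1.39823: S ack 33 win 5792 ..."
LINE_RE = re.compile(
    r"IP (\d+\.\d+\.\d+\.\d+)\.(\d+) > (\d+\.\d+\.\d+\.\d+)\.(\d+): (.*)"
)

SERVICE_PORT = 1556  # destination port used by bptestbpcd in the examples


def classify(line):
    """Return (src_ip, dst_ip, is_reply) for one capture line, or None."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    src_ip, src_port, dst_ip, _dst_port, rest = match.groups()
    # A reply comes FROM the service port and acknowledges the request.
    is_reply = int(src_port) == SERVICE_PORT and "ack" in rest.split()
    return src_ip, dst_ip, is_reply
```

Feeding it the two lines captured on the 10.10.10 NICs marks both as replies whose source address belongs to the 10.12 network, which is the asymmetry described above.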

Snooping the 10.12 NIC on the server shows the outbound packet with 10.10.10.1 as the source IP within the packet.  Sniffing the 10.12 NIC on the client confirms arrival of the inbound packet. 

5220-01$ tcpdump -n -i eth1 (10.12.253.96)
13:48:47.691548 IP 10.10.10.1.39823 > 10.12.251.34.1556: S seq 247 win 4219 ...

5200-01$ tcpdump -n -i eth1 (10.12.251.34)
13:48:47.691593 IP 10.10.10.1.39823 > 10.12.251.34.1556: S seq 247 win 4219 ...

Again this seems impossible as there is no route between the 10.10.10 and 10.12 networks.

This behavior can be reproduced outside of NetBackup by using simple operating system (OS) commands.

In this example, the server uses the OS traceroute command to connect to the other (10.10.10) interface on the client host but using the 10.12 interface (10.12.253.96) as the source.  Again, a seemingly impossible connection is being made.

5220-01$ traceroute -S 10.12.253.96 10.10.10.2
traceroute to 10.10.10.2 (10.10.10.2), 30 hops max, 40 byte packets
1  5200-01-bk (10.10.10.2)  0.218 ms   0.112 ms   0.112 ms

The network captures on the 10.10 NICs on both hosts show the same behavior. 

13:52:04.341927 IP 10.12.253.96.64007 > 10.10.10.2.33441: UDP, length 40

This confirms that the behavior causing the unexpected results is at the OS level, not the application level.

To further understand the situation, use the strace command to observe the operating system function calls.

An strace of 'bptestbpcd -client 10.12.251.34' when REQUIRED_INTERFACE and PREFERRED_NETWORK are not configured shows the socket is bound to 0.0.0.0 (INADDR_ANY) which allows the operating system to pick the source interface and thereby the source IP address.

socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 4
bind(4, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(1556), sin_addr=inet_addr("10.12.251.34")}, 16) = -1 EINPROGRESS (Operation now in progress)

An strace of 'bptestbpcd -client 10.12.251.34' with 'PREFERRED_NETWORK = 192.168.1.1 PROHIBITED' configured shows that a specific interface is bound resulting in a source IP address that may not match the outbound NIC.

socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 4
bind(4, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("10.10.10.1")}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(1556), sin_addr=inet_addr("10.12.251.34")}, 16) = -1 EINPROGRESS (Operation now in progress)
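
The two strace excerpts differ only in the address passed to bind.  The same pair of calls can be sketched in Python to experiment outside of NetBackup.  This is an illustration under stated assumptions, not NetBackup code: the function name connect_from is hypothetical, and the sketch uses a blocking connect with a timeout rather than the non-blocking connect seen in the strace output.

```python
import socket


def connect_from(source_ip, dest_ip, dest_port, timeout=5.0):
    """Bind to source_ip, connect to the destination, return the socket.

    Pass "0.0.0.0" (INADDR_ANY) to let the operating system pick the
    source address, or a specific local address to force it.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.settimeout(timeout)
        # Port 0 asks the OS for an ephemeral port, as in the strace output.
        sock.bind((source_ip, 0))
        sock.connect((dest_ip, dest_port))
    except OSError:
        sock.close()
        raise
    return sock
```

After a successful call, sock.getsockname() reports the source address and port actually in use.  With "0.0.0.0" it shows the address the OS selected; with an explicit address it shows that address, even when the packet leaves through a different NIC on a weak host.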

Because the PROHIBITED entry does not allow NetBackup to use one specific interface, the application cannot bind using INADDR_ANY and risk that the wrong interface will be used.  Instead, NetBackup provides the full list of non-prohibited interfaces and lets the operating system choose.

This can be observed by using the bptestnetconn command.  Observe that NetBackup is actually providing both of the non-prohibited interfaces, as shown by 'SRC'.

5220-01$ bptestnetconn -v6 -p -H 10.12.251.34
adding hostname = 10.12.251.34
...snip...
FL:  10.12.251.34 -> 10.12.251.34  : 1 ms FAST (< 5 sec) SRC: 10.10.10.1,10.12.253.96
...snip...
[0] PREFERRED_NETWORK = 192.168.1.1 PROHIBITED

Solution

Ultimately, the bind call doesn’t provide a way to exclude specific interfaces, leaving applications such as NetBackup with two less than ideal options.

#1 Provide the OS a list of the source interfaces that are not prohibited.  The OS layer then determines the routing, but typically just selects the source IP of the first interface in the bind list regardless of whether it is network routable to the destination.  While the traffic appears to come from the “wrong” source address, the weakhost is actually sending from the correct interface.

#2 The application can bind and connect from each of the non-prohibited interfaces in turn until a connection is made.  This introduces delays waiting for failure if the source interface being tried is non-routable to the destination.  On a weakhost, asymmetrical routing could occur.
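
Option #2 can be sketched as a simple loop.  The following Python is a hypothetical illustration of the approach, not how NetBackup implements it; the function name and the timeout default are assumptions.

```python
import socket


def connect_first_available(source_ips, dest_ip, dest_port, timeout=5.0):
    """Try each candidate source address in order.

    Returns (socket, source_ip) for the first address that connects,
    or raises the last error if none of them do.
    """
    last_error = None
    for source_ip in source_ips:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.settimeout(timeout)
            sock.bind((source_ip, 0))
            sock.connect((dest_ip, dest_port))
            return sock, source_ip
        except OSError as exc:
            # Covers bind failures, refusals, and timeouts alike.
            last_error = exc
            sock.close()
    if last_error is None:
        raise ValueError("no source addresses supplied")
    raise last_error
```

The cost noted above is visible in the loop: every source address that cannot reach the destination adds up to one full timeout before the next candidate is tried.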

The better solution to the situation above is to avoid using PROHIBITED for any local interface and instead identify which local interface to use to reach hosts on the 192.168.1.0/24 network.  That way INADDR_ANY can be used for most remote destinations and a specific source binding is only needed to reach that one network.  Since the 10.10.10 network is the one used for backups, it would be the logical choice for NetBackup to use.

PREFERRED_NETWORK = 192.168.1.0/24 MATCH 10.10.10.1  
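
The intent of that MATCH entry, as described above, can be modeled in a few lines of Python.  This is only a sketch of the intent; how NetBackup evaluates PREFERRED_NETWORK internally is not shown in this article, and the names MATCH_RULES and pick_source are hypothetical.

```python
import ipaddress

INADDR_ANY = "0.0.0.0"

# (destination network, source address) pairs mirroring the MATCH entry.
MATCH_RULES = [
    (ipaddress.ip_network("192.168.1.0/24"), "10.10.10.1"),
]


def pick_source(dest_ip, rules=MATCH_RULES):
    """Return the source address to bind for dest_ip.

    Destinations covered by a rule get that rule's source address;
    everything else falls back to INADDR_ANY so the OS chooses.
    """
    dest = ipaddress.ip_address(dest_ip)
    for network, source_ip in rules:
        if dest in network:
            return source_ip
    return INADDR_ANY
```

Under this model a connection to the client at 10.12.251.34 binds to INADDR_ANY again, so the OS selects the 10.12 NIC as it did before any setting was added.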

Depending on the connectivity needs, the best solution might be to use static host routes instead of PREFERRED_NETWORK settings.

See RFC 1122 and the Related Articles for additional details.
 

Applies To

Any operating system platform which implements the weak-host model, such as SuSE 10.

Example NetBackup server which is a weak-host:  in this case a 5220 appliance

eth0 inet addr:192.168.1.1  Bcast:192.168.1.255 Mask:255.255.255.0 (hostname 5220-01-admin)
eth1 inet addr:10.12.253.96 Bcast:10.12.255.255 Mask:255.255.248.0 (hostname 5220-01)
eth2 inet addr:10.10.10.1   Bcast:10.10.10.255  Mask:255.255.255.0 (hostname 5220-01-bk)

Example NetBackup client, also a weak-host: in this case a NetBackup 5200 appliance

eth0 inet addr:10.10.10.2   Bcast:10.10.10.255  Mask:255.255.255.0 (hostname 5200-01-bk)
eth1 inet addr:10.12.251.34 Bcast:10.12.255.255 Mask:255.255.248.0 (hostname 5200-01)

A network route does not exist between the 10.10.10 and 10.12 networks.
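
To check whether a bound source address belongs to the NIC that will actually carry the traffic, the interface listing for the server can be modeled with the Python ipaddress module.  This is a simplified sketch: it considers only directly attached subnets and ignores the routing table, and the function names are hypothetical.

```python
import ipaddress

# Server interfaces from the listing above (address/netmask).
SERVER_INTERFACES = {
    "eth0": ipaddress.ip_interface("192.168.1.1/255.255.255.0"),
    "eth1": ipaddress.ip_interface("10.12.253.96/255.255.248.0"),
    "eth2": ipaddress.ip_interface("10.10.10.1/255.255.255.0"),
}


def on_link_interface(dest_ip, interfaces=SERVER_INTERFACES):
    """Name of the interface whose subnet contains dest_ip, or None."""
    dest = ipaddress.ip_address(dest_ip)
    for name, iface in interfaces.items():
        if dest in iface.network:
            return name
    return None


def owner_of(source_ip, interfaces=SERVER_INTERFACES):
    """Name of the interface that holds source_ip, or None."""
    addr = ipaddress.ip_address(source_ip)
    for name, iface in interfaces.items():
        if iface.ip == addr:
            return name
    return None


def source_matches_egress(source_ip, dest_ip, interfaces=SERVER_INTERFACES):
    """True when the source address sits on the NIC facing dest_ip."""
    egress = on_link_interface(dest_ip, interfaces)
    return egress is not None and owner_of(source_ip, interfaces) == egress
```

For the failing test in this article, the destination 10.12.251.34 is on-link for eth1 while the bound source 10.10.10.1 belongs to eth2, so source_matches_egress("10.10.10.1", "10.12.251.34") is False.  That mismatch is the condition a weak host permits.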

The NetBackup client is configured to accept connections from either interface on the master server.

SERVER = 5220-01-bk
SERVER = 5220-01
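
The effect of these SERVER entries on an incoming connection can be illustrated with a small Python sketch.  It is an assumption-laden model, not NetBackup's actual check: the reverse-lookup table is hypothetical and built from the interface listings above (a real host would resolve the peername through DNS or a hosts file), and the exact name-comparison rules are not described in this article.

```python
# Hypothetical reverse-lookup table taken from the interface listings above.
PEERNAMES = {
    "192.168.1.1": "5220-01-admin",
    "10.12.253.96": "5220-01",
    "10.10.10.1": "5220-01-bk",
}

# SERVER entries configured on the client.
SERVER_ENTRIES = {"5220-01-bk", "5220-01"}


def peer_allowed(source_ip, peernames=PEERNAMES, servers=SERVER_ENTRIES):
    """True when source_ip resolves to a peername listed as a SERVER."""
    peername = peernames.get(source_ip)
    return peername is not None and peername in servers
```

With this configuration a connection sourced from 10.10.10.1 or 10.12.253.96 is accepted, while one sourced from 192.168.1.1 (5220-01-admin) would not match a SERVER entry and would be rejected.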
