Posts Tagged ‘UDP’

NFS problems in failover – MC Service Guard. Applicable to other Linux HA clusters

Monday, August 7th, 2006

Problem: Two Linux servers (RHEL4) run an NFS server in High-Availability (failover) mode. When the resources fail over, an NFS client can continue to work. When they fail back, the NFS client times out for 5+ minutes.

Further problem information: With RHEL3, the exact same configuration worked flawlessly.

Solution: mount NFS over UDP instead of TCP (the ‘udp’ mount option).

Explanation: RHEL3 used NFS3 with UDP by default. RHEL4 uses NFS4 with TCP by default, which is a significant difference between the two.

After searching the web for a while to better understand the cause of the problem, I found an article on linux-ha (which looks like a very good place to visit if you’re into HA in Linux environments) that recommended using UDP instead of TCP. Quote:

"If your kernel defaults to using TCP for NFS (as is the case in 2.6
kernels), switch to UDP instead by using the ‘udp’ mount option. If you
don’t do this, you won’t be able to quickly switch from server "A" to
"B" and back to "A" because "A" will hold the TCP connection in
TIME_WAIT state for 15-20 minutes and refuse to reconnect.
" (quoted from the "Hints" section).

So, although I did not expect this cause (I had a hunch about Portmapper), the suggested solution worked fine (and only later did we come to understand the cause). Good.
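A minimal sketch of what forcing UDP on the client side could look like. The server name, export path and mount point are hypothetical placeholders; only the ‘udp’ mount option comes from the linux-ha hint quoted above. The snippet just prints the mount command so it can be reviewed before being run as root.

```shell
# Hypothetical names: nfs-vip is the cluster's floating address,
# /export/data the exported directory, /mnt/data the client mount point.
NFS_SERVER="nfs-vip"
NFS_EXPORT="/export/data"
MOUNT_POINT="/mnt/data"

# Build a mount command that forces UDP transport via the 'udp' option.
nfs_udp_mount_cmd() {
    printf 'mount -t nfs -o udp %s:%s %s\n' "$1" "$2" "$3"
}

# Print the command for review; run the printed line as root to apply it.
nfs_udp_mount_cmd "$NFS_SERVER" "$NFS_EXPORT" "$MOUNT_POINT"
```

After mounting, the transport actually in use should show up among the mount options listed in /proc/mounts.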

Multihomed routing (split access load balancing) and OpenVPN

Sunday, June 25th, 2006

We have one connection via an ATM-like interface and one PPP connection via xDSL (described here), and we want load balancing across both.

Following this specific part of the guide, we’ve managed to get this to work. The idea goes like this (CentOS 4.3):

1. Do not define a default route for the machine, neither in /etc/sysconfig/network nor in /etc/sysconfig/network-scripts/ifcfg-ethX

2. Using adsl-setup, we’ve defined our ADSL connection. Verify you have an entry DEFROUTE=no in your /etc/sysconfig/network-scripts/ifcfg-ppp0

3. Find a way to start the following script after your network interfaces are up. In this script, I assume that your ATM interface is eth1. multiroute.txt

The reason for specifically stating SERVER is that our DNS server requires recursive DNS for its settings, and I can use my ISP’s DNS server only over the corresponding link. Since the two links belong to different ISPs, I need to “bind” SERVER to a specific route.
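For readers without access to the attached script, here is a rough sketch of a split-access setup of this kind. It is not the author's multiroute.txt: every address below is a placeholder, and T2 is an assumed name for a second routing table (both T1 and T2 would need entries in /etc/iproute2/rt_tables). By default it only prints the commands; setting RUN to an empty value runs them as root.

```shell
# Sketch only; NOT the author's multiroute.txt. All addresses are placeholders.
IF1=eth1; IP1=192.0.2.10;    GW1=192.0.2.1      # ATM-like link (assumed values)
IF2=ppp0; IP2=198.51.100.10; GW2=198.51.100.1   # xDSL PPP link (assumed values)
SERVER=203.0.113.53                             # ISP DNS server tied to link 1 (assumed)

# RUN defaults to "echo", so commands are printed, not executed.
RUN=${RUN-echo}

multiroute() {
    # One routing table per uplink, each with its own default route.
    $RUN ip route add default via "$GW1" dev "$IF1" table T1
    $RUN ip route add default via "$GW2" dev "$IF2" table T2
    # Traffic sourced from a link's address leaves through that link.
    $RUN ip rule add from "$IP1" table T1
    $RUN ip rule add from "$IP2" table T2
    # "Bind" the DNS server to the first link.
    $RUN ip route add "$SERVER" via "$GW1" dev "$IF1"
    # Balanced default route across both links.
    $RUN ip route add default scope global \
        nexthop via "$GW1" dev "$IF1" weight 1 \
        nexthop via "$GW2" dev "$IF2" weight 1
}

multiroute
```

The weights decide how routes are spread over the two links; equal weights are just an assumption here.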

Note that this solution is only temporary. At the moment it is far from complete, and many tests are still needed before I can call it a working solution. I might combine it with the /etc/ppp/ip-up.local script, or I might add it as a separate service in /etc/init.d, which would start after all interfaces are up and running. Not final yet.

With all this working like a charm, we hit a huge issue: our OpenVPN server, which had worked correctly until then, stopped working smoothly. Sometimes clients were able to connect, and sometimes they were not…

I got the following error message in my logs: “x.y.z.m:2839 TLS Error: TLS key negotiation failed to occur within 60 seconds (check your network connectivity)”

The cause, as it seemed to me, was that OpenVPN’s UDP packets were routed via an alternate route for each target client. Being UDP, they were stateless rather than part of an active session, which resulted in a different routing decision each time they were sent to the OpenVPN client. I searched for it, although I was not optimistic, because multihomed routing with multiple ways out isn’t very common. I was surprised to find this post, with its follow-up, which dealt with exactly my case.

Since I cannot bind it to an internal IP address (although I’ve tried – it didn’t work), I will test TCP based configuration tomorrow morning.




I don’t usually update posts but add new posts with links. However, in this case it was important enough for me to update this hot topic so I’ve decided to just add the new stuff.

First – I’ve failed. Since I do not have too much time here, I did not feel confident leaving an untested system in place, especially when such a router is an essential link in this company.

I’ve tried a TCP-based connection but, once again, one client was able to connect, while the second did so only for a short while and failed to maintain a working connection. I went back to UDP…

I came up with the following idea – if I can use some sort of tagging to differentiate the UDP packets sourced at the router, at the OpenVPN application, I could try and set a routing rule which will force them into a specific routing chain, and force them through my interface.

It didn’t quite work. I was able to do the following trick, but to no avail:

iptables -t mangle -A OUTPUT -p udp --sport 5001 -j MARK --set-mark 1

and then, using “ip” command:

ip rule add fwmark 1 table T1

which should have redirected all outbound UDP with source port 5001 (this is the one I use for my OpenVPN, due to legacy considerations), to the T1 routing table – a table directed outside with default route via eth1.

I don’t know why it failed. Almost seemed to work, but no…
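One possible explanation, offered as a guess rather than a diagnosis: for locally generated packets the source address is picked by the first routing lookup, before the mark is applied in the mangle OUTPUT chain. The packet may then be rerouted to eth1 while still carrying the other link's source address. The sketch below adds an SNAT rule for that case and a few inspection commands. OUT_IP is a placeholder, the SNAT rule is an assumption that was not tested on this setup, and T1 must be named in /etc/iproute2/rt_tables. By default the commands are only printed.

```shell
VPN_PORT=5001        # OpenVPN source port, as in the rules above
OUT_IF=eth1          # interface behind routing table T1
OUT_IP=192.0.2.10    # placeholder: the address configured on eth1

# RUN defaults to "echo", so commands are printed, not executed.
RUN=${RUN-echo}

mark_and_check() {
    # The two rules from the post: mark the packets, then route marked packets via T1.
    $RUN iptables -t mangle -A OUTPUT -p udp --sport "$VPN_PORT" -j MARK --set-mark 1
    $RUN ip rule add fwmark 1 table T1
    # Assumption: rewrite the source address to match the forced exit interface.
    $RUN iptables -t nat -A POSTROUTING -o "$OUT_IF" -p udp --sport "$VPN_PORT" -j SNAT --to-source "$OUT_IP"
    $RUN ip route flush cache
    # Inspection: confirm the rule, the table contents and the packet counters.
    $RUN ip rule show
    $RUN ip route show table T1
    $RUN iptables -t mangle -L OUTPUT -v -n
}

mark_and_check
```

If the counters on the mangle rule grow while clients still fail to connect, a capture on eth1 would show which source address the packets actually leave with.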

I returned the system to a single-path setup, with ppp0 acting only as a manual alternate path in case the primary path is down. It will do for now.

Linux as a WAN router

Friday, June 16th, 2006

I will discuss the issue of placing a Linux machine as a router, and some special cases where things might play out a bit differently.

The most common scenario is placing the Linux machine as some sort of PPP or DHCP-via-cable router. It might look like this:

In this picture, the Linux machine actually receives, via PPP, a single Internet IP, and it is required to masquerade all outbound traffic as being sourced by it.

In such a case, a line similar to this would be in place:

iptables -t nat -A POSTROUTING -o ppp+ -j MASQUERADE

It would create MASQUERADE NAT on the external IP address, and will result in a fully working connection from the LAN outside and back (other parameters should be set, like IP forwarding and FORWARD IPTables rules, but this is the general idea, IPTables-wise).
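To make the “other parameters” a bit more concrete, here is a minimal sketch of the usual companions to the MASQUERADE rule. LAN_IF is an assumed name for the LAN-facing interface, and the FORWARD rules are one common pattern, not necessarily what this router used. By default the commands are only printed.

```shell
LAN_IF=eth0    # assumed LAN-facing interface

# RUN defaults to "echo", so commands are printed, not executed.
RUN=${RUN-echo}

masquerade_router() {
    # Let the kernel forward packets between interfaces.
    $RUN sysctl -w net.ipv4.ip_forward=1
    # Allow LAN traffic out, and only replies to existing connections back in.
    $RUN iptables -A FORWARD -i "$LAN_IF" -o ppp+ -j ACCEPT
    $RUN iptables -A FORWARD -i ppp+ -o "$LAN_IF" -m state --state ESTABLISHED,RELATED -j ACCEPT
    # The MASQUERADE rule from the post.
    $RUN iptables -t nat -A POSTROUTING -o ppp+ -j MASQUERADE
}

masquerade_router
```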

Not long ago I discovered the true reason for the IPTables rule:

iptables -t nat -A POSTROUTING -o eth0 -j SNAT --to-source $EXT_IP

For most common cases, it would behave just like the MASQUERADE rule. All outbound traffic would be rebuilt, with $EXT_IP as its source.

However, here is a case where routing poses a problem.

In this case, my Linux router did not actually have the internet IP address. The Linux router has the Transport IP, it has the LAN IP on the other side, but it is required to behave as if it has the Internet IP (or part of the pool, at least) defined. In this drawing, you cannot see where the Internet IP comes in.

After some games, I have found a solution for this specific problem, the IPTables line above:

iptables -t nat -A POSTROUTING -o eth0 -j SNAT --to-source $EXT_IP

The server doesn’t have to “hold” the $EXT_IP. Since its ISP-side router knows where to route traffic for this IP address, and routes it outbound without any NAT, we require that all outbound traffic leaving via the Transport IPs carries $EXT_IP in its headers as the source. That way, servers on the Internet understand the source of the communication and can reply.

Without this line, outbound traffic never gets answered.

So it works, and it works correctly, but when setting up OpenVPN, I’ve had a lot of TLS problems.


Clarification – OpenVPN uses UDP by default, and the initial TLS negotiation kept on failing.


After some thought, it seems like this: incoming TLS communication is directed to the public IP address ($EXT_IP), which is not an IP address the Linux router knows as its own. Therefore the router ignores it. The solution is the following IPTables directive:

iptables -t nat -A PREROUTING -d $EXT_IP -j DNAT --to-destination $TRANSPORT_IP

This line directs all inbound traffic addressed to $EXT_IP to the Transport IP address, which in turn completes the header-rewrite cycle of the router. Not only is the header of all outbound traffic rewritten to “sourced at $EXT_IP”, but all inbound traffic directed at the router is also redirected (more logically than physically, but leave it at that) to the Transport interface. A full cycle.
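Put together, the two halves of the cycle might look like the sketch below. Both addresses are placeholders, and eth0 is taken from the SNAT rule above as the ISP-facing interface. By default the commands are only printed; setting RUN to an empty value runs them as root.

```shell
EXT_IP=203.0.113.20      # placeholder: public address the ISP routes to us
TRANSPORT_IP=192.0.2.2   # placeholder: address actually configured on eth0

# RUN defaults to "echo", so commands are printed, not executed.
RUN=${RUN-echo}

nat_cycle() {
    # Outbound: everything leaving via eth0 appears to come from EXT_IP.
    $RUN iptables -t nat -A POSTROUTING -o eth0 -j SNAT --to-source "$EXT_IP"
    # Inbound: traffic addressed to EXT_IP is handed to the router's own address.
    $RUN iptables -t nat -A PREROUTING -d "$EXT_IP" -j DNAT --to-destination "$TRANSPORT_IP"
}

nat_cycle
```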