Archive for June, 2006

Small but annoying – no XVideo for movies

Wednesday, June 28th, 2006

It means I cannot resize video. Using the x11 generic driver does not allow resize.

I’ve searched for a solution just now, and got to this web page. After some tweaks to my own config file (to remind you, it was built using ATI’s tools), I got it to work correctly.

Here’s the updated config file xorg.conf.ati-dualhead.txt

I never quite remember it – extracting a specific file from tar archive

Monday, June 26th, 2006

I always forget, and this blog is meant to help me remember.

Tar’s manual gives these simple directives.

If you have a file called file.tar, run:

tar -xf file.tar full/path/without/trailing/slashes

It can be a file or a directory. You can also give a list of files, such as:

tar -xf file.tar home/me home/you home/us/important-file

Should do the trick.
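
A quick way to check the path rule is to build a throwaway archive and pull a single member out of it. This is only a sketch: every file and directory name in it is made up for the demonstration, and it assumes a tar that understands -C.

```shell
#!/bin/sh
# Throwaway demo: every path below is invented for illustration.
WORK=$(mktemp -d)
mkdir -p "$WORK/src/home/me" "$WORK/src/home/you" "$WORK/out"
echo "mine"  > "$WORK/src/home/me/notes.txt"
echo "yours" > "$WORK/src/home/you/notes.txt"

# Build the archive with relative member names (home/me/..., home/you/...)
tar -cf "$WORK/file.tar" -C "$WORK/src" home

# List the members first, so the path can be copied exactly as stored
MEMBERS=$(tar -tf "$WORK/file.tar")
echo "$MEMBERS"

# Extract only one directory: no leading slash, no trailing slash
tar -xf "$WORK/file.tar" -C "$WORK/out" home/me

# Show what actually landed in the output directory
EXTRACTED=$(cd "$WORK/out" && find . -type f | sort)
echo "$EXTRACTED"

rm -rf "$WORK"
```

Listing with tar -tf before extracting is the part worth remembering: the path given to -x has to match a member name as it is stored in the archive.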

Multihomed routing (split access load balancing) and OpenVPN

Sunday, June 25th, 2006

We have one connection via an ATM-like interface and one PPP connection via xDSL (described here), and we want load balancing across both.

Following this specific part of the lartc.org guide, we’ve managed to get this to work. The idea goes like this (CentOS 4.3):

1. Do not set a default route for the machine. Not in /etc/sysconfig/network and not in /etc/sysconfig/network-scripts/ifcfg-ethX

2. Using adsl-setup, we’ve defined our ADSL connection. Verify you have an entry DEFROUTE=no in your /etc/sysconfig/network-scripts/ifcfg-ppp0

3. Find a way to start the following script after your network interfaces are up. I assume, in this script, that your ATM interface is eth1. multiroute.txt

The reason for specifically stating SERVER is that our DNS server requires recursive DNS for its settings, and I can use my ISP’s DNS Server only when using the corresponding link. Since both links are for different ISPs, I need to “bind” SERVER to a specific route.

Note that this solution is only temporary. At the moment it is far from complete, and many tests are still needed before I can call it a working solution. I might combine it with the /etc/ppp/ip-up.local script, or I might add it as a separate service in /etc/init.d, which would start after all interfaces are up and running. Not final yet.
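
For readers without access to the attached script, here is a rough sketch of the lartc.org "split access" recipe it follows. It is not a copy of multiroute.txt. It only prints the commands instead of running them, every address is a placeholder from the documentation ranges, the DNS pinning line is my reading of the SERVER idea above, and the table names T1 and T2 are assumed to be already listed in /etc/iproute2/rt_tables.

```shell
#!/bin/sh
# Dry run of the lartc.org split-access pattern: prints commands, applies nothing.
# All addresses are placeholders; T1/T2 must exist in /etc/iproute2/rt_tables.
IF1=eth1; IP1=192.0.2.10;    P1=192.0.2.1;    P1_NET=192.0.2.0/24     # ATM-like link
IF2=ppp0; IP2=198.51.100.10; P2=198.51.100.1; P2_NET=198.51.100.0/24  # xDSL link
DNS1=192.0.2.53   # hypothetical DNS server of the first ISP

multiroute() {
    # One routing table per provider, each with its own default route
    echo "ip route add $P1_NET dev $IF1 src $IP1 table T1"
    echo "ip route add default via $P1 table T1"
    echo "ip route add $P2_NET dev $IF2 src $IP2 table T2"
    echo "ip route add default via $P2 table T2"
    # Traffic sourced from a link's own address leaves through that link
    echo "ip rule add from $IP1 table T1"
    echo "ip rule add from $IP2 table T2"
    # Pin one ISP's DNS server to that ISP's link
    echo "ip route add $DNS1 via $P1 dev $IF1"
    # Balanced default route in the main table
    echo "ip route add default scope global nexthop via $P1 dev $IF1 weight 1 nexthop via $P2 dev $IF2 weight 1"
}

PLAN=$(multiroute)
echo "$PLAN"
```

Printing the plan first makes it easy to review the addresses before piping the output to a root shell.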

With all this working like a charm, we had a huge issue: our OpenVPN server, which had worked correctly until then, failed to work smoothly. Sometimes clients were able to connect, and sometimes they were unable to do so…

I got the following error message in my logs: “x.y.z.m:2839 TLS Error: TLS key negotiation failed to occur within 60 seconds (check your network connectivity)”

The cause, as it seemed to me, was that OpenVPN’s UDP packets were routed via an alternate route for each target client. Being UDP, they were stateless rather than part of an active session, which resulted in a different routing decision each time they were directed at the OpenVPN client. I searched for it, although I was not optimistic, because multihomed routing with multiple ways out wasn’t very common. I was surprised to find this post, with its follow-up, which dealt with exactly my case.

Since I cannot bind it to an internal IP address (although I’ve tried – it didn’t work), I will test TCP based configuration tomorrow morning.

===============================================================================

Update

===============================================================================

I don’t usually update posts but add new posts with links. However, in this case it was important enough for me to update this hot topic so I’ve decided to just add the new stuff.

First – I’ve failed. Since I do not have too much time here, I did not feel confident to leave a system yet untested. Especially when such a router is an essential link in this company.

I’ve tried using a TCP based connection but, once again, one client was able to connect, while the 2nd one did so for only a short while and failed to maintain a working connection. I went back to UDP…

I came up with the following idea – if I can use some sort of tagging to differentiate the UDP packets sourced at the router, at the OpenVPN application, I could try and set a routing rule which will force them into a specific routing chain, and force them through my interface.

It didn’t work quite well. I was able to do the following trick, but to no avail:

iptables -t mangle -A OUTPUT -p udp --sport 5001 -j MARK --set-mark 1

and then, using “ip” command:

ip rule add fwmark 1 table T1

which should have redirected all outbound UDP with source port 5001 (this is the one I use for my OpenVPN, due to legacy considerations), to the T1 routing table – a table directed outside with default route via eth1.

I don’t know why it failed. Almost seemed to work, but no…

I returned the system to a single-path setup, with ppp0 acting only as a manual alternate path in case the primary path is down. Would work for now.
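
For anyone retrying the marking approach, a few read-only commands show whether packets hit the MARK rule and whether a marked lookup resolves through T1. This is a checklist sketch only: it prints the commands instead of running them, the client address is a placeholder, and it does not claim to explain the failure described above.

```shell
#!/bin/sh
# Dry run: prints read-only diagnostic commands, changes nothing.
# CLIENT is a placeholder address; table T1 and mark 1 come from the post.
CLIENT=203.0.113.7

fwmark_checks() {
    # Packet and byte counters reveal whether the MARK rule matches at all
    echo "iptables -t mangle -L OUTPUT -v -n"
    # The fwmark rule should be listed here
    echo "ip rule show"
    # T1 needs its own default route via eth1
    echo "ip route show table T1"
    # Ask the kernel which route a packet carrying mark 1 would take
    echo "ip route get $CLIENT mark 1"
}

CHECKS=$(fwmark_checks)
echo "$CHECKS"
```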

Radeon 9600 Dual Display (dual-head)

Wednesday, June 21st, 2006

After much agony with my faulty NVidia dual-head card, and the frequent hard freezes which were part of that experience, a new and shiny ATI Radeon 9600 has entered my AGP slot, and I was ready to make it rock.

First things first: I utterly failed to install the damn driver. ATI build their drivers for someone else. Not me. I was able (I swear I don’t remember how) to make it work, and moved on to setting up my brand new VGA card for dual-head under Xorg.

I’ve used aticonfig, and ran the following commands to do the trick:

aticonfig --initial=dual-head --screen-layout=right

aticonfig --screen-layout=left --dtop horizontal --resolution=0,1280x1024,1024x768 \
  --resolution=1,1280x1024,1024x768 --screen-overlap=0

It worked like a charm (I’m saving you from the ugly trial and error part of it).

Quick and dirty delete old files, with exclude list and support for filenames with spaces

Sunday, June 18th, 2006

Here’s a little script I’ve written which deletes files older than AGE days, and has an exclude list, just in case. It’s meant to be run by cron on a daily basis:

#!/bin/sh

# Source of all evil
DIR=/ftp
# Age of file in days
AGE=10
# Exclude list - use pipe (|) separated values. Example:
# EXCLUDE="me|tal" excludes both "me" and "tal". Use the longest
# possible expression, for an accurate match. For example:
# EXCLUDE="/ftp/me|/ftp/tal". Below is the default minimal exclude list.
EXCLUDE="lost\+found|incoming"

# Build the delete list, one path per line, each wrapped in double quotes so
# that xargs keeps names with spaces in one piece (names containing double
# quotes or newlines are not handled)
find "$DIR"/*/* -mtime +"$AGE" -print | grep -vE "$EXCLUDE" | sed 's/.*/"&"/' > /tmp/del-list.txt

# Log what is about to be deleted
cat /tmp/del-list.txt >> /var/log/del-ftp.log

xargs \rm -Rf < /tmp/del-list.txt

\rm /tmp/del-list.txt

It seems to work. So far, it deletes 2nd level directories when they are old enough (10 days by default), and it can handle files with spaces in their names (scheduled delete of filenames with spaces, for the sake of those searching for a solution. At least, I’ve used this expression and didn’t find a solution online).
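
Before pointing a script like this at real data, it may be worth rehearsing the selection step in a scratch directory. The sketch below runs the same find and grep pipeline but only lists the candidates. All file names are invented, and the old timestamps are set with touch -t to a fixed date in 2006.

```shell
#!/bin/sh
# Rehearsal in a scratch directory: nothing under /ftp is touched.
# All names below are invented for the demonstration.
DIR=$(mktemp -d)
AGE=10
EXCLUDE="lost\+found|incoming"

mkdir -p "$DIR/user1" "$DIR/incoming" "$DIR/lost+found"
touch -t 200601011200 "$DIR/user1/old report.txt"   # old, name has a space
touch                 "$DIR/user1/fresh.txt"        # too new to be selected
touch -t 200601011200 "$DIR/incoming/old upload"    # old, but excluded
touch -t 200601011200 "$DIR/lost+found/orphan"      # old, but excluded

# Same selection pipeline as the script, but only listing the candidates
CANDIDATES=$(find "$DIR"/*/* -mtime +"$AGE" -print | grep -vE "$EXCLUDE")
echo "$CANDIDATES"

# Strip the scratch prefix so the result is easy to compare
RELATIVE=${CANDIDATES#"$DIR"/}
rm -rf "$DIR"
```

Only the old file outside the excluded directories should show up in the list, with its space intact.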

Windows Genuine Validation for Corporate

Saturday, June 17th, 2006

In a corporate environment, if you install the recent WGA Validation update by mistake, you will be nagged about activating your Windows.

A solution I found in the comments on this web site suggests the following:

%windir%\system32\wgatray.exe /u

Remove the following Registry tree: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon\Notify\WgaLogon

After reboot, you will not get the nagging popup anymore, and somewhat later will be asked to (re)install the WGA Validation update.

Forgot to mention kernel update

Saturday, June 17th, 2006

The kernel is now at version 2.6.16.9, so here is its config file: config-2.6.16.9.txt. Same procedure as before.

Good luck.

Linux as a WAN router

Friday, June 16th, 2006

I will discuss the issue of placing a Linux machine as a router, and some special cases where things might play a bit different.

The most common scenario is placing the Linux machine as some sort of PPP or DHCP-via-cables router. It might look like this:

In this picture, the Linux machine actually receives, via PPP, a single Internet IP, and it is required to masquerade all outbound traffic as being sourced by it.

In such a case, a line similar to this would be in place:

iptables -t nat -A POSTROUTING -o ppp+ -j MASQUERADE

It creates a MASQUERADE NAT on the external IP address, and results in a fully working connection from the LAN outside and back (other parameters should be set as well, such as IP forwarding and the FORWARD IPTables rules, but this is the general idea, IPTables-wise).
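
The "other parameters" mentioned above could look roughly like the following. Treat it as a minimal sketch rather than a complete firewall: it prints the commands instead of applying them, and the LAN interface name eth1 is an assumption, not something taken from the setup described here.

```shell
#!/bin/sh
# Dry run: prints the commands rather than applying them.
# LAN_IF is an assumed name for the inside interface; ppp+ is from the post.
LAN_IF=eth1
WAN_IF=ppp+

router_rules() {
    # Let the kernel forward packets between interfaces
    echo "sysctl -w net.ipv4.ip_forward=1"
    # Allow LAN clients out, and let replies to their connections back in
    echo "iptables -A FORWARD -i $LAN_IF -o $WAN_IF -j ACCEPT"
    echo "iptables -A FORWARD -i $WAN_IF -o $LAN_IF -m state --state ESTABLISHED,RELATED -j ACCEPT"
    # Rewrite the source of everything leaving via PPP to the PPP address
    echo "iptables -t nat -A POSTROUTING -o $WAN_IF -j MASQUERADE"
}

RULES=$(router_rules)
echo "$RULES"
```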

Not long ago I discovered the true reason for the IPTables rule:

iptables -t nat -A POSTROUTING -o eth0 -j SNAT --to-source $EXT_IP

For most common cases, it would behave just like the MASQUERADE rule. All outbound traffic would be rebuilt, with $EXT_IP as its source.

However, here is a case where routing poses some problem:

In this case, my Linux router did not actually have the internet IP address. The Linux router has the Transport IP, it has the LAN IP on the other side, but it is required to behave as if it has the Internet IP (or part of the pool, at least) defined. In this drawing, you cannot see where the Internet IP comes in.

After some games, I have found a solution for this specific problem, the IPTables line above:

iptables -t nat -A POSTROUTING -o eth0 -j SNAT --to-source $EXT_IP

The server doesn’t have to "hold" the $EXT_IP. Its ISP-side router knows where to route traffic for this IP address, and routes it outbound without any NAT, so we require that all outbound traffic, which travels over the Transport IPs, carries $EXT_IP in its headers. That way, servers on the internet understand the source of the communication and can reply.

Without this line, outbound traffic never gets answered.

So it works, and it works correctly, but when setting up OpenVPN, I’ve had a lot of TLS problems.

Clarification – OpenVPN uses UDP by default, and the initial TLS negotiation kept on failing.

After some thought, it seems to go like this: incoming TLS communication is directed to the public IP address ($EXT_IP), which is not an address the Linux router knows as its own, so the router ignores it. The solution is the following IPTables directive:

iptables -t nat -A PREROUTING -d $EXT_IP -j DNAT --to $TRANSPORT_IP

This line directs all inbound traffic addressed to $EXT_IP to the Transport IP address, which in turn completes the header-rewrite cycle of the router. Not only is the header of all outbound traffic rewritten to "sourced at $EXT_IP", but all inbound traffic directed at the router is also redirected (more logically than physically, but leave it at that) to the Transport interface. A full cycle.
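
Put side by side, the two halves of the cycle can be sketched as a pair. The script below only prints the rules. Both addresses are documentation placeholders standing in for the real public and Transport IPs, and it spells the DNAT option out as --to-destination.

```shell
#!/bin/sh
# Dry run: prints the rule pair, applies nothing.
# Both addresses are placeholders for the real public and transport IPs.
EXT_IP=203.0.113.5
TRANSPORT_IP=10.0.0.2

nat_cycle() {
    # Outbound: stamp the public address on everything leaving via eth0
    echo "iptables -t nat -A POSTROUTING -o eth0 -j SNAT --to-source $EXT_IP"
    # Inbound: traffic addressed to the public IP is handed to the transport IP
    echo "iptables -t nat -A PREROUTING -d $EXT_IP -j DNAT --to-destination $TRANSPORT_IP"
}

CYCLE=$(nat_cycle)
echo "$CYCLE"
```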

SBS2000 and AD looks as if it’s down

Sunday, June 4th, 2006

My managed SBS2000 had some warnings regarding NTFrs:

Event Type: Error
Event Source: NtFrs
Event Category: None
Event ID: 13561
Date: 6/2/2006
Time: 2:32:55 PM
User: N/A
Computer: MYDC
Description:
The File Replication Service has detected that the replica set "DOMAIN SYSTEM VOLUME (SYSVOL SHARE)" is in JRNL_WRAP_ERROR.

Replica set name is : "DOMAIN SYSTEM VOLUME (SYSVOL SHARE)"
Replica root path is : "c:\winnt\sysvol\domain"
Replica root volume is : "\\.\C:"
A Replica set hits JRNL_WRAP_ERROR when the record that it is trying to read from the NTFS USN journal is not found. This can occur because of one of the following reasons.

[1] Volume "\\.\C:" has been formatted.
[2] The NTFS USN journal on volume "\\.\C:" has been deleted.
[3] The NTFS USN journal on volume "\\.\C:" has been truncated. Chkdsk can truncate the journal if it finds corrupt entries at the end of the journal.
[4] File Replication Service was not running on this computer for a long time.
[5] File Replication Service could not keep up with the rate of Disk IO activity on "\\.\C:".

Following recovery steps will be taken to automatically recover from this error state.
[1] At the first poll which will occur in 5 minutes this computer will be deleted from the replica set.
[2] At the poll following the deletion this computer will be re-added to the replica set. The re-addition will trigger a full tree sync for the replica set.

Event error code 13561, module NtFrs.

A while back, it had an unexpected power failure, and I’ve had to use chkdsk to recover the server (BSoD, with "Inaccessible boot device").

This error message repeated for quite a while. I also got this error message:

Event Type: Error
Event Source: NtFrs
Event Category: None
Event ID: 13568
Date: 5/19/2006
Time: 7:37:07 PM
User: N/A
Computer: MYDC
Description:
The File Replication Service has detected that the replica set "DOMAIN SYSTEM VOLUME (SYSVOL SHARE)" is in JRNL_WRAP_ERROR.

Replica set name is : "DOMAIN SYSTEM VOLUME (SYSVOL SHARE)"
Replica root path is : "c:\winnt\sysvol\domain"
Replica root volume is : "\\.\C:"
A Replica set hits JRNL_WRAP_ERROR when the record that it is trying to read from the NTFS USN journal is not found. This can occur because of one of the following reasons.

[1] Volume "\\.\C:" has been formatted.
[2] The NTFS USN journal on volume "\\.\C:" has been deleted.
[3] The NTFS USN journal on volume "\\.\C:" has been truncated. Chkdsk can truncate the journal if it finds corrupt entries at the end of the journal.
[4] File Replication Service was not running on this computer for a long time.
[5] File Replication Service could not keep up with the rate of Disk IO activity on "\\.\C:".
Setting the "Enable Journal Wrap Automatic Restore" registry parameter to 1 will cause the following recovery steps to be taken to automatically recover from this error state.
[1] At the first poll, which will occur in 5 minutes, this computer will be deleted from the replica set. If you do not want to wait 5 minutes, then run "net stop ntfrs" followed by "net start ntfrs" to restart the File Replication Service.
[2] At the poll following the deletion this computer will be re-added to the replica set. The re-addition will trigger a full tree sync for the replica set.

WARNING: During the recovery process data in the replica tree may be unavailable. You should reset the registry parameter described above to 0 to prevent automatic recovery from making the data unexpectedly unavailable if this error condition occurs again.

To change this registry parameter, run regedit.

Click on Start, Run and type regedit.

Expand HKEY_LOCAL_MACHINE.
Click down the key path:
"System\CurrentControlSet\Services\NtFrs\Parameters"
Double click on the value name
"Enable Journal Wrap Automatic Restore"
and update the value.

If the value name is not present you may add it with the New->DWORD Value function under the Edit Menu item. Type the value name exactly as shown above.

I have followed these suggestions, and things looked OK. However, after a reboot later that week, I discovered that the DC failed to start correctly. Netlogon, DNS, everything went up fine, but clients were unable to authenticate…

I’ve got this error in my Event Viewer:

Event Type: Error
Event Source: Userenv
Event Category: None
Event ID: 1000
Date: 6/4/2006
Time: 10:19:38 AM
User: NT AUTHORITY\SYSTEM
Computer: MYDC
Description:
Windows cannot determine the user or computer name. Return value (1355).

Which led me to believe it’s some DC related problem.

I’ve searched the web, and got to this link in Experts Exchange. It did point me in the correct direction: following its procedure, I was able to recover my NtFrs, and get the netlogon and sysvol shares up again. So everything was working fine.

Regarding creation of "reparse points", or "Junctions", this link has it all. However, for us Unix originated people, who know soft and hard links, it’s the other way around. MS define source and target in reverse, so when they refer to linkd.exe source target, the idea is somewhat like "linkd.exe target source". Example:

"linkd c:\myfolder c:\me", will create a c:\myfolder which is a junction of c:\me, and not the other way around.
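
For comparison, here is the Unix side of the same idea as a small scratch demo. With ln -s the existing directory comes first and the new name comes second, so the linkd example above would correspond to something like "ln -s /me /myfolder". The directory names are made up to mirror that example.

```shell
#!/bin/sh
# Scratch demo of the Unix argument order: existing target first, new link second.
WORK=$(mktemp -d)
mkdir "$WORK/me"
echo "hello" > "$WORK/me/file.txt"

# Unix: ln -s <existing target> <new link name>
ln -s "$WORK/me" "$WORK/myfolder"

# The new name resolves to the existing directory
POINTS_TO=$(readlink "$WORK/myfolder")
CONTENT=$(cat "$WORK/myfolder/file.txt")
echo "myfolder -> $POINTS_TO ($CONTENT)"

# Remember the comparison result before cleaning up
[ "$POINTS_TO" = "$WORK/me" ] && SAME=yes || SAME=no
rm -rf "$WORK"
```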