Reduce traversal time of large directory trees on Linux

By etzion, 14/12/2021, 27/01/2023

Every Linux admin knows how long it can take to walk a large directory tree (hundreds of thousands of files or more). Most are also aware that if you repeat the same walk right away, it finishes faster.

The speed-up comes from the filesystem metadata cache, but that cache is short-lived: its entries are dropped when the memory is claimed by other tasks, or when the metadata to be cached exceeds the memory available for it.
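The effect is easy to observe by timing two consecutive walks of the same tree. The following is a minimal sketch, not a benchmark: the `walk_ms` helper and the `TREE` variable are names made up for this illustration, and it assumes GNU `date` (for `%N`) and a `find` that supports `-ls`. The first pass is only "cold" if the tree has not been walked recently.

```shell
#!/bin/sh
# Sketch: time two consecutive walks of the same directory tree.
# Assumes GNU date (for %N nanoseconds) and a find that supports -ls.

# Walk a directory, reading the attributes of every entry,
# and print the elapsed time in milliseconds.
walk_ms() {
    start=$(date +%s%N)
    find "$1" -xdev -ls >/dev/null 2>&1
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}

TREE=${1:-.}   # directory to walk; defaults to the current directory
echo "first pass:  $(walk_ms "$TREE") ms"
echo "second pass: $(walk_ms "$TREE") ms"
```

Pass the tree to measure as the first argument. `-ls` is used so that `find` actually queries each file's attributes, which is the part the metadata cache speeds up.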

If the system is file-centric, meaning that its main job is serving files (an NFS server, for example), and it has plenty of free memory, one tunable can speed up recurring directory walks (such as those done by the ‘find’ or ‘rsync’ commands, which issue huge numbers of attribute queries):

sysctl vm.vfs_cache_pressure=10
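The command above changes the running kernel only, so the value is lost on reboot. A minimal sketch of checking the current value and preparing a persistent setting could look like this. The file name under /etc/sysctl.d and the helper names `current_pressure` and `conf_line` are assumptions chosen for the example, and the privileged steps are left as comments because they need root.

```shell
#!/bin/sh
# Sketch: inspect vm.vfs_cache_pressure and prepare a persistent setting.

PROC_FILE=/proc/sys/vm/vfs_cache_pressure
# Hypothetical drop-in file name; pick any *.conf name you like on
# systems that read /etc/sysctl.d.
CONF_FILE=/etc/sysctl.d/90-vfs-cache-pressure.conf

# Print the value currently in effect (no root needed).
current_pressure() {
    cat "$PROC_FILE"
}

# Print the line a sysctl configuration file needs for a given value.
conf_line() {
    printf 'vm.vfs_cache_pressure = %s\n' "$1"
}

echo "current value: $(current_pressure)"
echo "line for $CONF_FILE: $(conf_line 10)"

# As root, the change itself would be:
#   sysctl -w vm.vfs_cache_pressure=10   # takes effect immediately
#   conf_line 10 > "$CONF_FILE"          # keeps the value across reboots
#   sysctl --system                      # reloads the sysctl config files
```

The value 10 is the one used in this post; a suitable number depends on how much memory the machine can spare for metadata.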

The default value is 100. Lower values make the system prefer keeping this cache. A quote from the kernel’s memory tunables page:

vfs_cache_pressure
------------------
This percentage value controls the tendency of the kernel to reclaim the memory which is used for caching of directory and inode objects.

At the default value of vfs_cache_pressure=100 the kernel will attempt to reclaim dentries and inodes at a “fair” rate with respect to pagecache and swapcache reclaim. Decreasing vfs_cache_pressure causes the kernel to prefer to retain dentry and inode caches. When vfs_cache_pressure=0, the kernel will never reclaim dentries and inodes due to memory pressure and this can easily lead to out-of-memory conditions. Increasing vfs_cache_pressure beyond 100 causes the kernel to prefer to reclaim dentries and inodes.

Increasing vfs_cache_pressure significantly beyond 100 may have negative performance impact. Reclaim code needs to take various locks to find freeable directory and inode objects. With vfs_cache_pressure=1000, it will look for ten times more freeable objects than there are.
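To check whether a lower value really keeps more metadata in memory, the kernel's dentry and inode counters can be compared before and after a traversal. The sketch below reads /proc/sys/fs/dentry-state and /proc/sys/fs/inode-nr, using only the first two fields of each. The `field` helper is a name made up for this illustration.

```shell
#!/bin/sh
# Sketch: show how many dentries and inodes the kernel currently holds.

# Print the Nth whitespace-separated field of a string.
field() {
    echo "$2" | awk -v n="$1" '{ print $n }'
}

# dentry-state starts with: nr_dentry nr_unused ...
# inode-nr contains:        nr_inodes nr_free_inodes
dentry_state=$(cat /proc/sys/fs/dentry-state 2>/dev/null)
inode_nr=$(cat /proc/sys/fs/inode-nr 2>/dev/null)

echo "dentries allocated: $(field 1 "$dentry_state")"
echo "dentries unused:    $(field 2 "$dentry_state")"
echo "inodes allocated:   $(field 1 "$inode_nr")"
echo "inodes free:        $(field 2 "$inode_nr")"
```

Running it before a large ‘find’, right after it, and again some time later gives a rough picture of how long the cached entries survive under the current setting.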
