Archive for the ‘bash’ Category

RaspberryPi Zero loses connectivity

Friday, July 9th, 2021

I have had a problem with an RPI Zero. The system was working fine, and then it was not. I am using Raspberry Pi OS (Debian-based) with kernel 5.10.17+. Once in a while (usually under network load) the system loses connectivity. Everything seems fine if you have a serial/USB console attached, but the wireless network fails. This problem was also mentioned here.

My workaround was to create a script and schedule it with cron. I have identified that the fault lies with the wlan driver, which needs to be reloaded. So cron calls this script every minute, like this:

*/1 * * * * /usr/local/sbin/

And the script (/usr/local/sbin/) has this in it:

#!/bin/bash
# DST is the network gateway
DST=192.168.1.1 # placeholder, replace with your gateway's address
if ! ping -c 5 -t 5 $DST > /dev/null
then
  /usr/bin/logger "Restarting wlan0 network driver"
  /usr/sbin/rmmod brcmfmac && /usr/sbin/modprobe brcmfmac roamoff=1
fi

Set this script to be executable, and your RPI Zero should work just fine. This is a workaround rather than a solution, of course, but it works well.
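The script expects DST to hold the address of the gateway. As a rough sketch, assuming the `ip` tool from iproute2 is installed, the gateway could be detected instead of being typed in by hand. The helper name `default_gateway` is made up for this example.

```shell
#!/bin/bash
# Hypothetical helper: print the gateway of the first default route.
# It reads `ip route`-style text on stdin, so it can be tried without a network.
default_gateway() {
  awk '$1 == "default" && $2 == "via" { print $3; exit }'
}

# Possible use in the watchdog script (assumes iproute2's `ip` is available):
# DST=$(ip route show default | default_gateway)
```

If the wireless link is already down and there is no default route, DST ends up empty, the ping fails, and the driver gets reloaded, which is the desired outcome anyway.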

ZFS clone script

Sunday, March 28th, 2021

ZFS has some magical features, comparable to NetApp’s WAFL capabilities. One of the less-used ones is ZFS send/receive, which can be used as the engine underneath something much like NetApp’s SnapMirror or SnapVault.

The idea, if you are not familiar with NetApp’s products, is to take a snapshot of a dataset on the source, and clone it to a remote storage. Then, take another snapshot, and clone only the delta between both snapshots, and so on. This allows for cloning block-level changes only, which reduces clone payload and the time required to clone it.
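To make the idea concrete, here is a small dry-run sketch that only prints the commands for one replication step. It pipes through plain SSH, which is simpler than what the full script does, and the function name and snapshot names are made up for the example.

```shell
#!/bin/bash
# Print (not run) the commands for one replication step.
# $1: source filesystem, $2: target host, $3: target filesystem,
# $4: new snapshot name, $5: previous snapshot name (empty on the first run)
replication_plan() {
  local src=$1 host=$2 tgt=$3 new=$4 prev=$5
  echo "zfs snapshot ${src}@${new}"
  if [ -z "$prev" ]
  then
    # First run: send the whole snapshot
    echo "zfs send ${src}@${new} | ssh ${host} zfs receive -d ${tgt}"
  else
    # Later runs: send only the blocks changed since the previous snapshot
    echo "zfs send -i ${src}@${prev} ${src}@${new} | ssh ${host} zfs receive -d ${tgt}"
  fi
}
```

Calling `replication_plan share/test backupsrv backup snap_2 snap_1` prints the snapshot command followed by the incremental send; leaving out the last argument prints the full send instead.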

Copy the script below, save it to a file, and give it execute permissions.

#!/bin/bash
# This script will clone ZFS snapshots incrementally over SSH to a target server
# Snapshot name structure: ${SRC_FS}@${TGT_HASH}_INT ; where INT is an increment number
# Written by Etzion. Feel free to use. See more stuff in my blog at
# Arguments:
# $1: ZFS filesystem name
# $2: (target ZFS system):(target ZFS filesystem)



# Sanity and usage
function usage() {
	echo "ZFS_TARGET is the parent of filesystems which will be created with the original source names"
	echo "Example: $IAM share/test backupsrv:backup"
	echo "It will create a filesystem 'test' under the pool 'backup' on 'backupsrv' with clone"
	echo "of the current share/test ZFS filesystem"
	echo "This script is (on purpose) not a recursive script"
	echo "For the script to work correctly, it *must* have SSH key exchanged from source to target"
	exit 0
}

function abort() {
	# exit with an error and a message
	echo "$@"
	pkill -P $$
	exit 1
}

function parse_parameters() {
	# Parses command line parameters
	# called with $*
	for i in $*
	do
		case ${i} in
			port=*)	PORT=${i##*=} ;;
			hash=*)	HASH=${i##*=} ;;
		esac
	done
	# Use a short substring of MD5sum of the target name for later unique identification
	if [ -z "$HASH" ]
	then
		TGT_FULLHASH="`echo $TGT_FS/${SRC_DIRNAME_FS} | md5sum -`"
		TGT_HASH=${TGT_FULLHASH:0:7} # the length of 7 is an arbitrary choice
	else
		TGT_HASH=$HASH
	fi
}


function sanity() {
	# Verify we have all details
	[ -z "$SRC_FS" ] && usage
	[ -z "$TGT_FS" ] && usage
	[ -z "$TGT_SYS" ] && usage
	$ZFS list -H -o name $SRC_FS > /dev/null 2>&1 || abort "Source filesystem $SRC_FS does not exist"
	# check_target_fs || abort "Target ZFS filesystem $TGT_FS on $TGT_SYS does not exist, or not imported"
}

function remove_lock() {
	# Removes the lock file
	\rm -f ${LOCKDIR}/$SRC_LOCK
}

function construct_ssh_cmd() {
	# Construct the remote SSH command
	# Here is a good place to put atomic parameters used for the SSH
	[ -z "${PORT}" ] && PORT=22
	SSH="ssh -p $PORT $TGT_SYS -o ConnectTimeout=3"
}

function get_last_remote_snapshots() {
	# Gets the last snapshot name on a remote system, to match it to our snapshots
	remoteSnapTmpObj=`$SSH "$ZFS list -H -t snapshot -r -o name ${TGT_FS}/${SRC_DIRNAME_FS}" | grep ${SRC_DIRNAME_FS}@ | grep ${TGT_HASH}`
	# Create a list of all snapshot indexes. Empty means it's the first one
	for snapIter in ${remoteSnapTmpObj}
	do
	  remoteSnaps="$remoteSnaps ${snapIter##*@${TGT_HASH}_}"
	done
}

function check_if_remote_snapshot_exists() {
	# Checks if the snapshot with the index $newLocalIndex exists on remote node
	$SSH "$ZFS list -H -t snapshot -r -o name ${TGT_FS}/${SRC_DIRNAME_FS}@${TGT_HASH}_${newLocalIndex}"
	return $?
}

function get_last_local_snapshots() {
	# This function will return an array of local existing snapshots using the existing TGT_HASH
    localSnapTmpObj=`$ZFS list -H -t snapshot -r -o name $SRC_FS | grep ${SRC_FS}@ | grep $TGT_HASH`
    # Convert into a list and remove the HASH and everything before it. We should have clear list of indexes
    for snapIter in ${localSnapTmpObj}
    do
    	localSnapList="$localSnapList ${snapIter##*@${TGT_HASH}_}"
    done
    # Convert object to array
    localSnapList=( $localSnapList )
    # Get the last object
    let localSnapArrayObj=${#localSnapList[@]}-1
    localRecentIndex=${localSnapList[$localSnapArrayObj]}
}

function delete_snapshot() {
	# This function will delete a snapshot
	# arguments: $1 -> snapshot name
	[ -z "$1" ] && abort "Cleanup snapshot got no arguments"
	$ZFS destroy $1
	#$ZFS destroy ${SRC_FS}@${TGT_HASH}_${newLocalIndex}
}

function find_matching_snapshot() {
	# This function will attempt to find a matching snapshot as a replication baseline
	# Gets the latest local snapshot index
    # Gets the latest mutual snapshot index
    while [ $localSnapArrayObj -ge 0 ]
    do
    	# Check if the current counter already exists
    	if echo "$remoteSnaps" | grep -w ${localSnapList[$localSnapArrayObj]} > /dev/null 2>&1
    	then
    		# We know the mutual index.
    		commonIndex=${localSnapList[$localSnapArrayObj]}
    		return 0
    	fi
    	let localSnapArrayObj--
    done
    # If we've reached here - there is no mutual index!
    abort "There is no mutual snapshot index, you will have to resync"
}

function cleanup_snapshots() {
	# Creates a list of snapshots to delete and then calls delete_snapshot function
	# We are using the most recent common index, $localSnapArrayObj as the latest reference for deletion
	let deleteArrayObj=$localSnapArrayObj-${LOCAL_SNAPS_TO_LEAVE}
	# Construct a list of snapshots to delete, and delete it in reverse order
	while [ $deleteArrayObj -ge 0 ]
	do
		# Construct snapshot name
		snapsToDelete="$snapsToDelete ${SRC_FS}@${TGT_HASH}_${localSnapList[$deleteArrayObj]}"
		let deleteArrayObj--
	done
	snapsToDelete=( $snapsToDelete )
	snapDelete=0
	while [ $snapDelete -lt ${#snapsToDelete[@]} ]
	do
		# Delete snapshot
		delete_snapshot ${snapsToDelete[$snapDelete]}
		let snapDelete++
	done
}

function initialize() {
	# This is a unique case where we initialize the first sync
	# We will call this procedure when $remoteSnaps is empty (meaning that there was no snapshot whatsoever)
	# We have to verify that the target has no existing old snapshots here
	# is it empty?
	echo "Going to perform an initialization replication. It might wipe the target $TGT_FS completely"
	echo "Press Enter to proceed, or Ctrl+C to abort"
	read "abc"
	### Decided to remove this check
	### [ -n "$LOCSNAP_LIST" ] && abort "No target snapshots while local history snapshots exists. Clean up history and try again"
	newLocalIndex=1 # first index (the starting value is an assumption)
	create_local_snapshot $newLocalIndex
	open_remote_socket
	sleep 1
	$ZFS send -ce ${SRC_FS}@${TGT_HASH}_${newLocalIndex} | nc $TGT_SYS $NC_PORT 2>&1
	if [ "$?" -ne "0" ]
	then
		# Do not clean up the current snapshot
		# delete_snapshot ${SRC_FS}@${TGT_HASH}_${newLocalIndex}
		abort "Failed to send initial snapshot to target system"
	fi
	sleep 1
	# Set target to RO
	$SSH $ZFS set readonly=on $TGT_FS
	[ "$?" -ne "0" ] && abort "Failed to set remote filesystem $TGT_FS to read-only" # No need to remove local snapshot
}

function create_local_snapshot() {
	# Creates snapshot on local storage
	# uses argument $1
	[ -z "$1" ] && abort "Failed to get new snapshot index"
	$ZFS snapshot ${SRC_FS}@${TGT_HASH}_${1}
	[ "$?" -ne "0" ] && abort "Failed to create local snapshot. Check error message"
}

function open_remote_socket() {
	# Starts remote socket via SSH (as the control operation)
	# port is 3000 + three-digit random number
	let NC_PORT=3000+$RANDOM%1000
	$CONTROL_SSH "nc -l -i 90 $NC_PORT | $ZFS receive ${RECEIVE_FLAGS} $TGT_FS > /tmp/output 2>&1 ; sync"
	#$CONTROL_SSH "socat tcp4-listen:${NC_PORT} - | $ZFS receive ${RECEIVE_FLAGS} $TGT_FS > /tmp/output 2>&1 ; sync"
	#zfs send -R [email protected] | zfs receive -Fdvu zpnew
}

function send_zfs() {
	# Do the heavy lifting of opening remote socket and starting ZFS send/receive
	open_remote_socket
	sleep 1
	$ZFS send -ce -I ${SRC_FS}@${TGT_HASH}_${commonIndex} ${SRC_FS}@${TGT_HASH}_${newLocalIndex} | nc -i 90 $TGT_SYS $NC_PORT
	#$ZFS send -ce -I ${SRC_FS}@${TGT_HASH}_${commonIndex} ${SRC_FS}@${TGT_HASH}_${newLocalIndex} | socat tcp4-connect:${TGT_SYS}:${NC_PORT} -
	sleep 20
}


function increment() {
	# Create a new snapshot with the index $localRecentIndex+1, and replicate it to the remote system
	# Baseline is the most recent common snapshot index $commonIndex
	RECEIVE_FLAGS="-Fsdvu" # With an 'F' flag maybe?
	# Handle the case of latest snapshot in DR is newer than current latest snapshot, due to mistaken deletion
	remoteSnaps=( $remoteSnaps )
	let remoteIndex=${#remoteSnaps[@]} # Get last snapshot on DR
	if [ ${localRecentIndex} -lt ${remoteIndex} ]
	then
		let newLocalIndex=${remoteIndex}+1
	else
		let newLocalIndex=localRecentIndex+1
	fi
	create_local_snapshot $newLocalIndex
	send_zfs

	# if [ "$?" -ne "0" ]
	# then

		# Cleanup current snapshot
		#delete_snapshot ${SRC_FS}@${TGT_HASH}_${newLocalIndex}
		#abort "Failed to send incremental snapshot to target system"
	# fi
	if ! verify_correctness
	then
		if ! loop_resume # If we can
		then
			# We either could not resume operation or failed to run with the required amount of iterations
			# For now we abort.
			echo "Deleting local snapshot"
			delete_snapshot ${SRC_FS}@${TGT_HASH}_${newLocalIndex}
			abort "Remote snapshot should have the index of the latest snapshot, but it is not. The current remote snapshot index is ${commonIndex}"
		fi
	fi
}

function loop_resume() {
	# Attempts to loop over resuming until limit attempt has been reached
	REMOTE_TOKEN=$($SSH "$ZFS get -Ho value receive_resume_token ${TGT_FS}/${SRC_DIRNAME_FS}")
	if [ "$REMOTE_TOKEN" == "-" ]
	then
		return 1
	fi
	# We have a valid resume token. We will retry
	COUNT=1 # starting value is an assumption
	while [ "$COUNT" -le "$RESUME_LIMIT" ]
	do
		# For ease of handling - for each iteration, we will request the token again
		echo "Attempting resume operation"
		REMOTE_TOKEN=$($SSH "$ZFS get -Ho value receive_resume_token ${TGT_FS}/${SRC_DIRNAME_FS}")
		let COUNT++
		open_remote_socket
		$ZFS send -e -t $REMOTE_TOKEN | nc -i 90 $TGT_SYS $NC_PORT
		#$ZFS send -e -t $REMOTE_TOKEN | socat tcp4-connect:${TGT_SYS}:${NC_PORT} -
		sleep 20
		if verify_correctness
		then
			echo "Done"
			return 0
		fi
	done
	# If we've reached here, we have failed to run the required iterations. Let's just verify again
	return 1
}

function verify_correctness() {
	# Check remote index, and verify it is correct with the current, latest snapshot

    if check_if_remote_snapshot_exists
    then
    	echo "Replication Successful"
    	return 0
    else
    	echo "Replication failed"
    	return 1
    fi
}

### MAIN ###
[ `whoami` != "root" ] && abort "This script has to be called by the root user"
[ -z "$1" ] && usage
parse_parameters $*
sanity
construct_ssh_cmd
SRC_LOCK=`echo $SRC_FS | tr / _`
if [ -f ${LOCKDIR}/$SRC_LOCK ]
then
	echo "Already locked. If this should not be the case - remove ${LOCKDIR}/$SRC_LOCK"
	exit 1
fi
get_last_remote_snapshots # Have a string list of remoteSnaps
# If we don't have remote snapshot it should be initialization
if [ -z "$remoteSnaps" ]
then
	initialize
	echo "completed initialization. Done"
	exit 0
fi

# We can get here only if it is not initialization
get_last_local_snapshots # Have a list (array) of localSnaps
find_matching_snapshot # Get the latest local index and the latest common index available
increment # Creates a new snapshot and sends/receives it
cleanup_snapshots # Cleans up old local snapshots
pkill -P $$
echo "Done"

The initial run should be started manually. If you expect a very long initial sync, run it inside tmux or screen, so that a dropped terminal session does not interrupt it in the middle.

To run the command, run it like this:

./ share/my-data backuphost:share

This will create under the pool ‘share’ in the host ‘backuphost’ a filesystem matching the source (in this case: share/my-data) and set it to read-only. The script will create a snapshot with a unique name based on a shortened hash of the destination, with a counting number suffix, and start cloning the snapshot to the remote host. When called again, it will create a snapshot with the same name, but different index, and clone the delta to the remote host. In case of a disconnection, the clone will retry a few times before failing.
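The listing refers to several names (ZFS, LOCKDIR, LOCAL_SNAPS_TO_LEAVE, RESUME_LIMIT, SRC_FS, TGT_SYS, TGT_FS, SRC_DIRNAME_FS) whose definitions are not shown in it. The following is only a sketch of how they might be set up. The default values, the helper name `parse_positional` and the hash length are my assumptions for illustration, not necessarily what the script uses.

```shell
#!/bin/bash
# Assumed defaults; override them from the environment if your system differs.
ZFS=${ZFS:-/sbin/zfs}
LOCKDIR=${LOCKDIR:-/var/lock}
LOCAL_SNAPS_TO_LEAVE=${LOCAL_SNAPS_TO_LEAVE:-3}
RESUME_LIMIT=${RESUME_LIMIT:-3}

# Hypothetical helper: split the two positional arguments into the names
# the rest of the script refers to.
parse_positional() {
  SRC_FS=$1                     # e.g. share/my-data
  TGT_SYS=${2%%:*}              # text before the first colon, e.g. backuphost
  TGT_FS=${2#*:}                # text after the first colon, e.g. share
  SRC_DIRNAME_FS=${SRC_FS#*/}   # path below the pool name, e.g. my-data
  # Short identifier derived from the target name (7 characters is arbitrary)
  TGT_HASH=$(echo "$TGT_FS/${SRC_DIRNAME_FS}" | md5sum | cut -c1-7)
}
```

After `parse_positional share/my-data backuphost:share`, the snapshots would be named share/my-data@<hash>_1, share/my-data@<hash>_2 and so on, where <hash> is the value of TGT_HASH.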

Note that the receiving side does not remove snapshots, so handling (too) old snapshots on the backup host remains up to you.
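As a starting point for that housekeeping, here is a sketch that only selects which snapshots could be removed and leaves the actual deletion to you. The function name is made up, and it assumes the snapshot names end with the hash and index suffix described above.

```shell
#!/bin/bash
# Hypothetical helper: read snapshot names (one per line) on stdin and print
# the ones that may be destroyed, keeping the $2 highest indexes for hash $1.
snapshots_to_prune() {
  local hash=$1 keep=$2
  grep "@${hash}_[0-9][0-9]*\$" |
    awk -F_ '{ print $NF, $0 }' |
    sort -n |
    awk -v keep="$keep" '{ name[NR] = $2 }
      END { for (i = 1; i <= NR - keep; i++) print name[i] }'
}

# Possible use on the backup host (review the list before destroying anything):
# zfs list -H -t snapshot -o name -r share/my-data | snapshots_to_prune "$TGT_HASH" 10
```

The index is sorted numerically, so index 10 is treated as newer than index 2.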

Extracting multi-layered initramfs

Thursday, December 5th, 2019

Modern Kernel specification (can be seen here) defined the initial ramdisk (initrd or initramfs, depends on who you ask) to allow stacking of compressed or uncompressed CPIO archives. It means, in fact, that you can extend your current initramfs by appending a cpio.gz (or cpio) file at the end, containing the additions or changes to the filesystem (be it directories, files, links and anything else you can think about).

An example of this action:

mkdir /tmp/test
cd /tmp/test
tar -C /home/ezaton/test123 -cf - . | tar xf - # Clones the contents of /home/ezaton/test123 to this location
find ./ | cpio -o -H newc | gzip > ../test.cpio.gz # Creates a compressed CPIO file
cat ../test.cpio.gz >> /boot/initramfs-`uname -r`.img

This should work (I haven’t tried, and if you do it – make sure you have a copy of the original initramfs file!), and the contents of the directory /tmp/test would be reflected in the initramfs.

This method allows us to quickly modify an existing ramdisk, replacing files (the stacked cpio files are extracted in order), and, in practice, to do a lot of neat tricks.

The trickier question, however, is how to extract the stacked CPIO files.
If you create a file containing multiple cpio.gz files, appended, and just try to extract them, only the contents of the first CPIO file would be extracted.

The kernel can do it, and so can we. The basic concept we need to understand is that GZIP compresses a stream. It means that there is no difference between a file built from stacked CPIO archives and then compressed as a whole, and a file built by appending cpio.gz files. The result is equivalent, and so is the handling of the file. It also means that we do not need to loop over the file chunk by chunk, running zcat/un-cpio again and again; when we decompress the file, we decompress all of it.
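A quick way to convince yourself of this, without touching any initramfs, is to compare a file made of two appended gzip pieces with a file where the same data was compressed in one go. The file names here are arbitrary.

```shell
#!/bin/bash
tmp=$(mktemp -d)
# Two separately compressed pieces, appended into one file...
printf 'first\n'  | gzip >  "$tmp/appended.gz"
printf 'second\n' | gzip >> "$tmp/appended.gz"
# ...and the same two pieces compressed together as one stream.
printf 'first\nsecond\n' | gzip > "$tmp/together.gz"

# Both files decompress to the same bytes.
appended=$(gzip -dc "$tmp/appended.gz")
together=$(gzip -dc "$tmp/together.gz")
[ "$appended" = "$together" ] && echo "same contents"
rm -rf "$tmp"
```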

Let’s create an example file:

cd /tmp
for i in {1..10} ; do
    mkdir test${i}
    touch test${i}/test${i}-file
    find ./test${i} | cpio -o -H newc | gzip > test${i}.cpio.gz
    cat test${i}.cpio.gz >> test-of-all.cpio.gz
done

This script will create ten directories called test1 to test10, each containing a single file called test<number>-file. Each of them will both be archived into a dedicated cpio.gz file (named the same) and appended to a larger file called test-of-all.cpio.gz

If we run the following script to extract the contents, we will get only the first CPIO contents:

mkdir /tmp/extract
cd /tmp/extract
zcat ../test-of-all.cpio.gz | cpio -id # Format is newc, but it is auto detected

The result would be the directory ‘test1’ with a single file in it, and nothing else. The trick to extracting all the files is to run the following commands:

rm -Rf /tmp/extract # Cleanup
mkdir /tmp/extract
cd /tmp/extract
zcat ../test-of-all.cpio.gz | while cpio -id ; do : ; done

This will extract all files, until there is no more cpio format remaining. Then the ‘cpio’ command will fail and the loop would end.

Some additional notes:
The ‘:’ is a place holder (does nothing) because ‘while’ loop requires a command. It is a legitimate command in shell.

So now you can extract even complex CPIO structures, such as can be found in older Foreman “Discovery Image” (very old implementation), Tiny Core Linux (see this forum post, and this wiki note as reference on where this stacking is invoked) and more. That said, for extracting a CentOS/RHEL 7 initramfs, which consists of an uncompressed CPIO followed by a cpio.gz file, a different command is required, and a post about it (works for Ubuntu and RHEL) can be found here.

EDIT: It seems the kernel-integrated CPIO extracting method will not “overwrite” a file with a later layer of cpio.gz contents, so I will have to investigate a different approach to that. FYI.

Old Dell iDrac – work around Java failures

Wednesday, June 5th, 2019

I have an old Dell server (R610, if it’s important) and I seem to fail to connect to its iDrac console via Java. No other options exist, and the browser calling Java flow fails somehow.

I have found an explanation here, and I will copy it for eternity 🙂

First – Download the latest JRE version 1.7 from https::/

Then, extract it to a directory of your choice. We’ll call this directory $RUN_ROOT

Download the viewer.jnlp file to this directory $RUN_ROOT, and open it with a text editor. You will see an XML block pointing at a JAR file called avctKVM.jar. Download it manually using ‘wget’ or ‘curl’ from the URL provided in the viewer.jnlp XML file.

Extract the avctKVM.jar file using ‘unzip’. You will get two libraries – avctKVMIO(.so or .dll for Windows) and avmWinLib(.so or .dll for Windows). Move these two files into a new directory under $RUN_ROOT/lib
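If you prefer not to hunt for the URL by eye, a sketch like the following could pull it out of the file. It assumes the JNLP references the JAR through an attribute of the form href="...avctKVM.jar"; the function name is made up.

```shell
#!/bin/bash
# Hypothetical helper: print the first avctKVM.jar URL found in a JNLP file.
# Assumes the file references the JAR as href="...avctKVM.jar".
jar_url() {
  grep -o 'href="[^"]*avctKVM\.jar"' "$1" | head -n 1 | cut -d'"' -f2
}

# Possible use (the -k flag skips certificate checks, which a self-signed
# iDrac certificate may require):
# curl -k -O "$(jar_url viewer.jnlp)"
```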

Download/copy-paste the below .bat or .sh script files (.bat file for Windows, .sh file for Linux).

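@echo off

set /P drachost="Host: "
set /p dracuser="Username: "
set "psCommand=powershell -Command "$pword = read-host 'Enter Password' -AsSecureString ; ^
    $BSTR=[System.Runtime.InteropServices.Marshal]::SecureStringToBSTR($pword); ^
        [System.Runtime.InteropServices.Marshal]::PtrToStringAuto($BSTR)""
for /f "usebackq delims=" %%p in (`%psCommand%`) do set dracpwd=%%p

.\jre\bin\java -cp avctKVM.jar -Djava.library.path=.\lib com.avocent.idrac.kvm.Main ip=%drachost% kmport=5900 vport=5900 user=%dracuser% passwd=%dracpwd% apcp=1 version=2 vmprivilege=true "helpurl=https://%drachost%:443/help/contents.html"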

#!/bin/bash
echo -n 'Host: '
read drachost

echo -n 'Username: '
read dracuser

echo -n 'Password: '
read -s dracpwd

./jre/bin/java -cp avctKVM.jar -Djava.library.path=./lib com.avocent.idrac.kvm.Main ip=$drachost kmport=5900 vport=5900 user=$dracuser passwd=$dracpwd apcp=1 version=2 vmprivilege=true "helpurl=https://$drachost:443/help/contents.html"

Run the downloaded script file (with Linux – you might want to give it execution permissions first), and you will be asked for your credentials.

Thanks Nicola for this brilliant solution!

SecureBoot and VirtualBox kernel modules

Saturday, June 1st, 2019

Installing VirtualBox on Ubuntu 18 (same as for modern Fedora Core) with SecureBoot will result in the following error when running the command /sbin/vboxsetup

The error message would be something like this:

There were problems setting up VirtualBox. To re-start the set-up process, run
as root. If your system is using EFI Secure Boot you may need to sign the
kernel modules (vboxdrv, vboxnetflt, vboxnetadp, vboxpci) before you can load
them. Please see your Linux system’s documentation for more information.

This is because SecureBoot would not allow for non-signed kernel drivers, and VirtualBox creates its own drivers as part of its configuration.

I have found a great solution for this problem in the answers to this question here, which goes as follows:

Create a file (as root) called /usr/bin/ensure-vbox-signed with the following content:



if ! "${MOKUTIL}" --sb-state | grep -qi '[[:space:]]enabled$' ; then
	echo "WARNING: Secure Boot is not enabled, signing is not necessary"
	exit 0
fi

# If secure boot is enabled, we try to find the signature keys
[ -f "${KEY}" ] || { echo "ERROR: Couldn't find the MOK private key at ${KEY}" ; exit 1 ; }
[ -f "${PUB}" ] || { echo "ERROR: Couldn't find the MOK public key at ${PUB}" ; exit 1 ; }

INFO="$("${MODINFO}" -n vboxdrv)"
if [ -z "${INFO}" ] ; then
	# If there's no such module, compile it
	/usr/lib/virtualbox/ setup
	INFO="$("${MODINFO}" -n vboxdrv)"
	if [ -z "${INFO}" ] ; then
		echo "ERROR: Module compilation failed (${MODPROBE} couldn't find it after was called)"
		exit 1
	fi
fi

[ -z "${KVER}" ] && KVER="$(uname -r)"

DIR="$(dirname "${INFO}")"

for module in "${DIR}"/vbox*.ko ; do
	MOD="$(basename "${module}")"
	MOD="${MOD%.ko}" # modprobe expects the module name without the .ko suffix

	# Quick check - if the module loads, it needs no signing
	echo "Loading ${MOD}..."
	"${MODPROBE}" "${MOD}" && continue

	# The module didn't load, and it must have been built (above), so it needs signing
	echo "Signing ${MOD}..."
	if ! "${KDIR}/scripts/sign-file" sha256 "${KEY}" "${PUB}" "${module}" ; then
		echo -e "\tFailed to sign ${module} with ${KEY} and ${PUB} (rc=${?}, kernel=${KVER})"
		exit 1
	fi

	echo "Reloading the signed ${MOD}..."
	if ! "${MODPROBE}" "${MOD}" ; then
		echo -e "\tSigned ${MOD}, but failed to load it from ${module}"
		exit 1
	fi
	echo "Loaded the signed ${MOD}!"
done
exit 0 
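The script refers to MOKUTIL, MODPROBE, MODINFO, KEY, PUB and KDIR, none of which is defined in the listing. A header along the following lines should do. The paths are my assumptions (typical locations on Ubuntu), so verify them on your system before relying on them.

```shell
#!/bin/bash
# Assumed tool locations; adjust if your distribution differs.
MOKUTIL=${MOKUTIL:-/usr/bin/mokutil}
MODPROBE=${MODPROBE:-/sbin/modprobe}
MODINFO=${MODINFO:-/sbin/modinfo}
# Assumed location of the enrolled MOK key pair.
KEY=${KEY:-/var/lib/shim-signed/mok/MOK.priv}
PUB=${PUB:-/var/lib/shim-signed/mok/MOK.der}
# Kernel headers directory, which is expected to contain scripts/sign-file.
KVER=${KVER:-$(uname -r)}
KDIR=${KDIR:-/usr/src/linux-headers-${KVER}}
```

Each value can be overridden from the environment, for example `KEY=/root/mok/MOK.priv PUB=/root/mok/MOK.der /usr/bin/ensure-vbox-signed`, where the key paths are again only an example.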

Make sure this file is executable by root. Create a systemd service /etc/systemd/system/ensure-vboxdrv-signed.service with the following contents:

Description=Ensure the VirtualBox Linux kernel modules are signed
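Only the Description line of the unit appears above. As a sketch, a minimal one-shot unit could be generated like this; every line other than Description is my assumption, and the function name `print_unit` is made up.

```shell
#!/bin/bash
# Print a minimal one-shot unit. Everything except Description is an assumption.
print_unit() {
  cat <<'EOF'
[Unit]
Description=Ensure the VirtualBox Linux kernel modules are signed

[Service]
Type=oneshot
ExecStart=/usr/bin/ensure-vbox-signed
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
EOF
}

# Possible use:
# print_unit | sudo tee /etc/systemd/system/ensure-vboxdrv-signed.service
```

With an [Install] section in place, `sudo systemctl enable ensure-vboxdrv-signed.service` would also run the check on every boot.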



Run sudo systemctl daemon-reload, and then start the service by running sudo systemctl start ensure-vboxdrv-signed.service

It should sign and enable your vbox drivers, and allow you to run your VirtualBox machines.