System Administration

Native exFAT support on Fedora 32

A lot has changed since 2018, when exFAT was kept out of Fedora due to patent issues and a third-party FUSE driver had to be used. Until recently the GPLv2-based driver from Microsoft wasn't enabled in the kernel, as it was based on an older specification and wasn't fully functional for everyday use.

$ grep EXFAT /boot/config-`uname -r`
# CONFIG_EXFAT_FS is not set

Fedora 32 recently received an upgrade to kernel 5.7, and with that the native exFAT driver was enabled at compile time. The driver received a lot of updates from Samsung to work correctly with the latest specification.

# DOS/FAT/EXFAT/NT Filesystems
CONFIG_EXFAT_FS=m
# end of DOS/FAT/EXFAT/NT Filesystems

When an SD card is now plugged into the machine, the kernel module is loaded and the file system is mounted by the kernel, without the need for a userland driver.

$ lsmod | grep fat
exfat                  81920  1
vfat                   20480  1
fat                    81920  1 vfat
$ mount | grep fat
/dev/nvme0n1p1 on /boot/efi type vfat (rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=ascii,shortname=winnt,errors=remount-ro)
/dev/mmcblk0p1 on /run/media/user/disk type exfat (rw,nosuid,nodev,relatime,uid=1000,gid=1000,fmask=0022,dmask=0022,iocharset=utf8,errors=remount-ro,uhelper=udisks2)

The userland tools may come with Fedora 33, but until then the package exfat-utils from RPMFusion still needs to be installed.
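To script the config check from the beginning of this article, the grep can be wrapped in a small function; a minimal sketch (the function name is made up for illustration) that reads config text on stdin, so it also runs on sample input:

```shell
# Sketch: report whether a kernel config enables native exFAT support.
# Reads kernel-config text from stdin.
exfat_enabled() {
  if grep -Eq '^CONFIG_EXFAT_FS=(y|m)' ; then
    echo "native exFAT: enabled"
  else
    echo "native exFAT: disabled"
  fi
}

# On a running system (config path as used in the article):
#   exfat_enabled < /boot/config-`uname -r`
printf 'CONFIG_EXFAT_FS=m\n' | exfat_enabled
```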

System Administration

Getting Ext3/4 journal size

Ext3 is a successor of Ext2 with support for journaling, which means it keeps a log of all the recent changes it made or is going to make to the file system. This allows fsck to get the file system back into a good state after, for example, a power failure. But what is the size of the journal? The manpage for tune2fs says it needs to be between 1024 and 102400 file system blocks, which means it starts at 1 MB on a file system with a 1 KB block size and at 4 MB on a file system with a 4 KB block size.
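Those block-count limits translate to bytes with a quick shell calculation; a small sketch (the helper name is made up for illustration):

```shell
# tune2fs allows a journal of 1024 to 102400 file system blocks;
# convert those limits to MB for a given block size in bytes.
journal_limits() {
  bs=$1
  echo "block size ${bs}: min $(( 1024 * bs / 1048576 )) MB, max $(( 102400 * bs / 1048576 )) MB"
}

journal_limits 1024   # min 1 MB, max 100 MB
journal_limits 4096   # min 4 MB, max 400 MB
```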

So let's start by finding out which inode contains the journal; normally this should be inode 8, unless you have a file system that was upgraded from Ext2 to Ext3/4.

$ sudo LANG=C tune2fs -l /dev/sda1 | awk '/Journal inode/ {print $3}'

Now that we have the inode responsible for the in-file-system journal, we can retrieve its details by doing a stat() with debugfs for that inode. Debugfs retrieves the details from the inode, and one of them is the allocated size on disk.

$ sudo LANG=C debugfs -R "stat <8>" /dev/sda1 | awk '/Size: /{print $6}'|head -1
debugfs 1.42.4 (12-Jun-2012)

Now let's use the same procedure on another file system:

$ sudo LANG=C tune2fs -l /dev/mapper/data00-srv | awk '/Journal inode/ {print $3}'
$ sudo LANG=C debugfs -R "stat <8>" /dev/mapper/data00-srv | awk '/Size: /{print $6}'|head -1
debugfs 1.42.4 (12-Jun-2012)

There is also an easier way, as dumpe2fs provides direct access to a lot of these values.

$ sudo LANG=C dumpe2fs /dev/mapper/data00-srv | grep ^Journal
dumpe2fs 1.42.4 (12-Jun-2012)
Journal inode:            8
Journal backup:           inode blocks
Journal features:         journal_incompat_revoke
Journal size:             128M
Journal length:           32768
Journal sequence:         0x0111fd4c
Journal start:            3380
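The field of interest can be picked out of that output with awk; a minimal sketch that uses sample text instead of a real block device, so it runs anywhere:

```shell
# Pull the journal size out of dumpe2fs-style output. Sample text is
# used here; on a real system you would pipe dumpe2fs into the awk.
sample='Journal inode:            8
Journal size:             128M'
printf '%s\n' "$sample" | awk -F':[ \t]*' '/^Journal size/ {print $2}'
```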

Keep in mind that changing something on a live journal can destroy your file system, so never move the journal location or change its size unless the file system is cleanly unmounted.

Internet, Unix and security

BtrFS and readonly snapshots

In a previous posting I started with BtrFS, and as mentioned, BtrFS supports snapshotting. With this you can create a point-in-time copy of a subvolume and even create a clone that can be used as a new working subvolume. To start we first need the BtrFS volume itself, which can always be identified as subvolid 0. This matters because the default volume to be mounted can be changed to a subvolume instead of the real root of the BtrFS volume. We start by updating /etc/fstab so we can mount the BtrFS volume.

LABEL=datavol	/home	btrfs	defaults,subvol=home	0	0
LABEL=datavol	/media/btrfs-datavol	btrfs	defaults,noauto,subvolid=0	0	0

As /media is a temporary file system, meaning it is recreated on every reboot, we need to create a mountpoint for the BtrFS volume before mounting. After that we create two read-only snapshots with a small delay in between. As there is currently no naming convention for snapshots, I adopted the ZFS naming schema with the @-sign as separator between the subvolume name and the timestamp.
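The timestamp part is easy to get wrong in date's format string (%m is the month, %M the minutes), so wrapping it in a helper keeps it in one place; a small sketch with a made-up function name:

```shell
# Build a ZFS-style snapshot name "subvolume@timestamp".
# Note: %m = month and %M = minutes, which are easy to mix up.
snapname() {
  printf '%s@%s\n' "$1" "$(date "+%Y%m%d-%H%M%S-%Z")"
}

snapname home   # e.g. home@20120121-084709-CET
```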

$ sudo mkdir -m 0755 /media/btrfs-datavol
$ sudo mount /media/btrfs-datavol
$ cd /media/btrfs-datavol
$ sudo btrfs subvolume snapshot -r home home\@`date "+%Y%m%d-%H%M%S-%Z"`
Create a readonly snapshot of 'home' in './home@20120121-084709-CET'
$ sudo btrfs subvolume snapshot -r home home\@`date "+%Y%m%d-%H%M%S-%Z"`
Create a readonly snapshot of 'home' in './home@20120121-084731-CET'
$ ls -l
total 0
drwxr-xr-x 1 root root 52 nov 21  2010 home
drwxr-xr-x 1 root root 52 nov 21  2010 home@20120121-084709-CET
drwxr-xr-x 1 root root 52 nov 21  2010 home@20120121-084731-CET

We now have two read-only snapshots, so let's test whether they really are read-only subvolumes. Creating a new file shouldn't be possible.

$ sudo touch home@20120121-084709-CET/test.txt
touch: cannot touch `home@20120121-084709-CET/test.txt': Read-only file system

Creating snapshots is fun and handy for migrations or as an on-disk backup solution, but they do consume space, as the deltas between snapshots are kept on disk. This means that changes between snapshots stay on disk even when you remove files. Freeing disk space requires not only removing files from the current subvolume, but also removing the previous snapshots that include the removed data.

$ sudo btrfs subvolume delete home@20120121-084709-CET
Delete subvolume '/media/btrfs-datavol/home@20120121-084709-CET'
$ ls -l
total 0
drwxr-xr-x 1 root root 52 nov 21  2010 home
drwxr-xr-x 1 root root 52 nov 21  2010 home@20120121-084731-CET

As a last step we unmount the BtrFS volume again. This is where ZFS and BtrFS differ too much for my taste. To create and access snapshots on ZFS, the zpool doesn't need to be mounted; then again, with the first few releases of ZFS the zpool needed to be mounted as well. So there is still hope, as BtrFS is still under development.

$ sudo umount /media/btrfs-datavol

Seeing what is possible with BtrFS, something like Sun's TimeSlider becomes an option, as does Live Upgrade with rollback as is possible with Solaris 11, but for that, BtrFS with read-write snapshots needs to be tested in the near future.

Internet, Unix and security

First steps with BtrFS

After using ZFS on Solaris, I missed the ZFS features on Linux, and with no chance of ZFS coming to Linux, I had to make do with MD and LVM. Or at least until BtrFS became mature enough, and since Linux 3.0 that time has slowly come. With Linux 3.0, BtrFS supports automatic defragmentation and scrubbing of volumes. The latter is maybe the most important feature of both ZFS and BtrFS, as it can be used to actively scan the data on disk for errors.

The first tests with BtrFS were done in a virtual machine a long time ago, when the userland tools were still in development. By now the command btrfs follows the path set by Sun Microsystems and basically combines the zfs and zpool commands for ZFS. But nothing compares to a test in the real world, so I broke a mirror and created a BtrFS volume with the name datavol:

$ sudo mkfs.btrfs -L 'datavol' /dev/sdb2

Now we can mount the volume and create a subvolume on it, which we are going to use as our new home volume for users' home directories.

$ sudo mount /dev/sdb2 /mnt
$ sudo btrfs subvolume create /mnt/home
$ sudo umount /dev/sdb2

When updating /etc/fstab we can tell mount to use the volume label instead of a physical path to a device or some obscure UUID. We can also tell it which subvolume to mount.

LABEL=datavol	/home	btrfs	defaults,subvol=home	0	0

After unmounting and disabling the original volume for /home, we can mount everything and copy all the data, with rsync for example, to see how BtrFS works in the real world.

$ sudo mount -a

As hinted before, scrubbing is important, as it lets you verify that all your data and metadata on disk are still correct. By default it does a read-write test, but you can also do a read-only test to see if all data can still be accessed. There is even an option to read parts of the volume that are still unused. In the example below the subvolume for /home is being scrubbed, with success.

$ sudo btrfs scrub status /home
scrub status for afed6685-315d-4c4d-bac2-865388b28fd2
	scrub started at Sat Jan 17 15:11:58 2012, running for 106 seconds
	total bytes scrubbed: 5.77GB with 0 errors
$ sudo btrfs scrub status /mnt
scrub status for afed6685-315d-4c4d-bac2-865388b28fd2
	scrub started at Sat Jan 17 15:11:58 2012 and finished after 11125 seconds
	total bytes scrubbed: 792.82GB with 0 errors
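For monitoring, the error count can be extracted from that status output; a minimal sketch on sample text, so it runs without a BtrFS volume:

```shell
# Extract the error count from btrfs-scrub-status-style output.
# Sample text is used so the sketch does not need a real volume;
# on a real system you would pipe `btrfs scrub status` into the awk.
status='scrub started at Sat Jan 17 15:11:58 2012, running for 106 seconds
	total bytes scrubbed: 5.77GB with 0 errors'
printf '%s\n' "$status" | awk '/total bytes scrubbed/ {print $(NF-1)}'
```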

The first glances of BtrFS in the real world are a lot better with kernel 3.1 than they were around kernel 2.6.30, and I'm slowly starting to say it is becoming ready to be included in, for example, RHEL 7 or Debian 8 as a default storage solution, just as ZFS became in Solaris 11. But it is not all glory, as a lot of work still needs to be done.

The first issue is encryption, as the LUKS era ends with BtrFS: it is not smart to put LUKS between your disks and BtrFS, because you lose the advantage of balancing data between disks when you do mirroring, for example. Then again, LVM has the same issue, where you also first need to set up software RAID with MD, with LUKS on top of it and LVM on top of that. For home directories EncFS may be an option, but it still leaves a lot of areas uncovered that would be covered by LUKS out of the box.

The second issue is the integration of BtrFS in distributions and the handling of snapshots. As it is now, you first need to mount the volume before you can make a snapshot of a subvolume, and the same goes for accessing a snapshot. Here I think ZFS still has the advantage with its .zfs directory, accessible to everyone who has access to the file system. But time will tell, and for now the first tests look great.

Internet, Unix and security

Tux needs to go on the Sonja Bakker diet

Where has the time gone when, say, 64 MB was still enough, or when a Pentium MMX at 233 MHz could still push your entire digital world along? That time seems far behind us, but is it really? The 8 GB in my current workstation sometimes barely seems enough to keep the machine supplied with memory, and it only seems to get worse over the years.

Where KDE 2.2 could easily run for three months in the past, it now sometimes seems impossible to keep KDE or GNOME running for even a week without real consequences. Unfortunately, logging in doesn't seem much better either, as many caches have to be checked, extra daemons have to be started, and many other things besides.

Maybe it is time for Tux to really go to Weight Watchers, or to get a Sonja Bakker book for Christmas, because the "lightweight" character disappeared years ago. The question is perhaps what you can do about it without offending the many volunteers who invest their precious time. Maybe a nice point to start 2011 with?