What do I need to get SEEK_HOLE to work with sparse files?

I am trying to work more efficiently with sparse files. I have read about the SEEK_HOLE functionality in newer Linux kernels. According to other people, this should be available in kernel version 3.1 and later.
However, as seen below, there must be something else to it as well.
I am on kernel 3.2+ and working with sparse files is still slow.
Doing a "cp" or "tar" on this (completely empty) sparse file should take less than 1 sec.
Any idea what I am missing? How can I check whether SEEK_HOLE is supported/activated?
The same thing is discussed in the question below, but for some reason I cannot post a comment there saying that it does not work for me:
Copying a 1TB sparse file
(root@r1)-(/nbd/test)# uname -a
Linux r1.exice.com 3.2.0-33-generic #52-Ubuntu SMP Thu Oct 18 16:29:15 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
(root@r1)-(/nbd/test)# dd if=/dev/zero of=10g.img seek=10G bs=1 count=1
1+0 records in
1+0 records out
1 byte (1 B) copied, 0.000271811 s, 3.7 kB/s
(root@r1)-(/nbd/test)# time cp 10g.img new.img
real 0m15.370s
user 0m1.544s
sys 0m13.405s
(root@r1)-(/nbd/test)# time bsdtar cvfz new.tar.gz 10g.img
a 10g.img
real 1m59.898s
user 1m43.938s
sys 0m15.769s
(root@r1)-(/nbd/test)# time tar cvfz new2.tar.gz 10g.img
10g.img
real 1m58.584s
user 1m51.275s
sys 0m30.382s

Make sure you are using a tool that supports SEEK_HOLE.
bsdtar >=3.0.4
cp >=8.13
Also make sure the filesystem you run supports it. It works fine with ext4. It does not work with reiserfs.
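To check from code whether a given filesystem really reports holes, one option is to ask the kernel directly with lseek. The following is only a minimal sketch in Python (the function name data_segments is made up for illustration, it is not part of cp or tar). It assumes Python 3.3+ on Linux, where os.SEEK_DATA and os.SEEK_HOLE are available. On a filesystem without real support, the kernel's fallback reports the whole file as a single data segment; a very old kernel may raise an error instead.

```python
import errno
import os


def data_segments(path):
    """Return a list of (start, end) byte ranges that the filesystem reports as data."""
    segments = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                # Jump to the next byte that belongs to a data region.
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:
                    # No data after this offset: the rest of the file is a hole.
                    break
                raise
            # Jump to the next hole (end of file counts as a hole).
            end = os.lseek(fd, start, os.SEEK_HOLE)
            segments.append((start, end))
            offset = end
    finally:
        os.close(fd)
    return segments
```

If data_segments("10g.img") comes back with one segment covering the entire file, the filesystem is not reporting holes, and no tool can skip them quickly no matter which version it is.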

Related

What do device (character special) file sizes mean?

Using ls -l normally results in a long listing that includes the file size...
-rw-r--r--@ 1 user1 staff 881344 Sep 1 15:35 someFile.png
On macOS 10.13.5 and Ubuntu 20.04, the size field for character special (device) files looks very different...
crw------- 1 root wheel 31, 0 Aug 30 16:11 autofs
In this case, what does the "31, 0" mean?

what does the "31, 0" mean?
They are the major and minor numbers of the character device.
See these:
https://unix.stackexchange.com/questions/97676/how-to-find-the-driver-module-associated-with-a-device-on-linux
https://www.ibm.com/support/knowledgecenter/linuxonibm/com.ibm.linux.z.lgdd/lgdd_c_udev.html

Read the documentation of ls(1) carefully, then read inode(7).
31 is the major device number and 0 is the minor device number.
Remember that ls(1) uses stat(2) (you can check this with strace(1)), so read Advanced Linux Programming and then syscalls(2).
Sometimes ls is a shell alias or function, so read the GNU bash documentation. Also try /bin/ls --help.
On GNU/Linux, read the coreutils documentation. It is free software, so you can download and study its source code.
On macOS, the operating system kernel might have different system calls.
Be aware of udev as well (on Linux).
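Since ls gets these two numbers from stat(2), you can read them yourself. Here is a small sketch in Python (the helper name device_numbers is made up for illustration): it takes st_rdev from os.stat and splits it with os.major and os.minor. The actual numbers you see depend on the operating system and on the device.

```python
import os
import stat


def device_numbers(path):
    """Return (major, minor) for a character or block device file."""
    st = os.stat(path)
    if not (stat.S_ISCHR(st.st_mode) or stat.S_ISBLK(st.st_mode)):
        raise ValueError(f"{path} is not a device file")
    # st_rdev packs both numbers; os.major/os.minor unpack them.
    return os.major(st.st_rdev), os.minor(st.st_rdev)


if __name__ == "__main__":
    major, minor = device_numbers("/dev/null")
    print(f"/dev/null: major {major}, minor {minor}")
```

For a regular file, ls prints st_size in that column instead, which is why the two listings look so different.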

Bash on Windows 10, no loop devices

I've just tried Bash on my Windows 10 PC, and it works fine. However, ls /dev/ shows no loop devices, and modprobe loop gives an error.
Does it mean this Bash doesn't support loop devices at all or is there a solution for mounting an image as a loop device?

Windows Subsystem for Linux 1 (WSL, formerly known as Bash on Ubuntu on Windows) did not support loop devices. There was a feature request and an issue about it on Microsoft's Git repo.
WSL 2, however, does support loop devices.
$ uname -a
Linux Blade 5.10.102.1-microsoft-standard-WSL2 #1 SMP Wed Mar 2 00:30:59 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
$ fallocate -l 1G test.img
$ mkfs.ext3 test.img
mke2fs 1.45.5 (07-Jan-2020)
Discarding device blocks: done
Creating filesystem with 262144 4k blocks and 65536 inodes
Filesystem UUID: 549cca4d-a65f-4f4f-8428-e324feaed3d0
Superblock backups stored on blocks:
32768, 98304, 163840, 229376
Allocating group tables: done
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done
$ sudo mount -o loop test.img /media/
$ ls /media/
lost+found
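If you want to check for loop support before trying to mount anything, one possibility is to look for the loop driver in /proc/devices. This is only a rough sketch in Python with made-up helper names. It assumes the usual /proc/devices layout (a "Character devices:" section followed by a "Block devices:" section), and note that a loop driver built as a module that is not loaded yet will not be listed.

```python
def has_loop_driver(proc_devices_text):
    """Return True if a /proc/devices listing shows the 'loop' block driver."""
    in_block_section = False
    for line in proc_devices_text.splitlines():
        line = line.strip()
        if line == "Block devices:":
            in_block_section = True
            continue
        if line == "Character devices:":
            in_block_section = False
            continue
        if in_block_section and line:
            parts = line.split()
            # Each entry looks like "<major> <driver name>".
            if len(parts) == 2 and parts[1] == "loop":
                return True
    return False


def kernel_has_loop_driver(path="/proc/devices"):
    """Read /proc/devices and report whether the loop driver is registered."""
    try:
        with open(path) as f:
            return has_loop_driver(f.read())
    except OSError:
        # No /proc/devices at all (not Linux, or /proc is not mounted).
        return False
```

Calling kernel_has_loop_driver() tells you whether the running kernel has the driver registered, which is the part that was missing under WSL 1.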

Do you know that Bash is just a shell (something that reads your commands, executes them, pipes between them and permits you to write scripts) and is not an operating system?
Loop devices are part of the Linux kernel, and they simply don't exist in the Windows kernel.

How to edit FreeBSD .gz bootfile?

I have a virtual image of a FreeBSD system, and when I mount it I don't see the /etc/ directory and other files. Instead there is a big loader.gz on the filesystem, which I believe is extracted during the boot process. I decompressed loader.gz with gzip and got this:
$ file loader
loader: ELF 64-bit LSB executable, x86-64, version 1 (FreeBSD), dynamically linked (uses shared libs), not stripped
Using grep I'm able to confirm that the files that I need to edit are inside, however I don't know how to edit it. I tried to mount it without success. How can I modify the contents of loader.gz and use it again?
Can you please give me an example?
I have a Linux system and a Mac to install tools and this FreeBSD image.
Please, help me.

The loader program is generally the last stage of the kernel bootstrapping process.
A recent image should have a different signature, e.g. for a memory stick image:
> file tmp/FreeBSD-10.0-RELEASE-amd64-memstick.img
tmp/FreeBSD-10.0-RELEASE-amd64-memstick.img: Unix Fast File system
[v1] (little-endian), last mounted on ,
last written at Fri Jan 17 00:24:02 2014,
clean flag 1, number of blocks 681040, number of data blocks 679047,
number of cylinder groups 13, block size 8192, fragment size 1024,
minimum percentage of free blocks 8, rotational delay 0ms,
disk rotational speed 60rps, TIME optimization
Mounting an image on FreeBSD:
# mdconfig -a -t vnode -f tmp/FreeBSD-10.0-RELEASE-amd64-memstick.img -u 1
# mount /dev/md1a /mnt/root/
(Linux has the same capability, I just don't remember what it's called.)
This image contains loader in the boot/ directory:
# ls /mnt/root/
.cshrc ERRATA.TXT README.TXT boot/ lib/ proc/ sys@
.profile HARDWARE.HTM RELNOTES.HTM dev/ libexec/ rescue/ tmp/
COPYRIGHT HARDWARE.TXT RELNOTES.TXT docbook.css media/ root/ usr/
ERRATA.HTM README.HTM bin/ etc/ mnt/ sbin/ var/
# ls /mnt/root/boot/
beastie.4th check-password.4th gptzfsboot menu.4th support.4th
boot color.4th kernel/ menu.rc userboot.so
boot0 defaults/ loader* menusets.4th version.4th
boot0sio delay.4th loader.4th modules/ zfs/
boot1 device.hints loader.help pmbr zfsboot
boot2 firmware/ loader.rc pxeboot zfsloader*
brand.4th frames.4th mbr screen.4th
cdboot gptboot menu-commands.4th shortcuts.4th
On my FreeBSD 10 system, loader has a different signature:
/boot/loader: FreeBSD/i386 demand paged executable

Linux 3.2 /dev/shm performance variable?

I'm using /dev/shm tmpfs for writing lots of temporary files.
A set of 8-10 files/second, with each set containing files which range from 70kB .. 750kB. The file sets are all approximately the same sizes and arrive to be written regularly about once per second.
The code which writes these files is python calling a library which uses fwrite() to do the writing.
When the application starts, the write times take from 30 to over 400ms. Usually it's the largest (700k) files which can take 400ms to write but then it varies.
Here is an example of a set of 8, given in ms: 42,30,320,76,66,72,102,440.
It seems that the standard deviation of writing to /dev/shm is quite large.
After the application runs for a couple of minutes, the time to write plummets and the variance is much smaller (e.g. 7,8,15,23,24,32,51,71) - this behavior is stable and I have run the application for several hours.
There are no other applications of consequence running concurrently and there is plenty of room on /dev/shm.
It seems that the Linux kernel is dynamically adjusting to the application's use of /dev/shm. My question is: is my suspicion about the linux kernel correct? If so, is there any way to configure or notify the kernel ahead of time to use the desired behavior when my application starts? (faster writes to /dev/shm)
I'm using Ubuntu 12.04 LTS
$ uname -a
Linux devsb02 3.2.0-24-generic #39-Ubuntu SMP Mon May 21 16:52:17 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
$ ls -l /dev/shm
lrwxrwxrwx 1 root root 8 Aug 23 12:19 /dev/shm -> /run/shm
$ mount
....
none on /run/shm type tmpfs (rw,nosuid,nodev)

How to create a file with ANY given size in Linux?

I have read this question:
How to create a file with a given size in Linux?
But I haven't got an answer to my question.
I want to create a file of 372.07 MB,
I tried the following commands in Ubuntu 10.08:
dd if=/dev/zero of=output.dat bs=390143672 count=1
dd: memory exhausted
390143672=372.07*1024*1024
Are there any other methods?
Thanks a lot!
Edit:
How can I view a file's size with decimals on the Linux command line? I mean, ls -hl just says '373M' but the file is actually "372.07M".

Sparse file:
dd of=output.dat bs=1 seek=390143672 count=0
This has the added benefit of creating the file sparse if the underlying filesystem supports that. This means no disk space is used until some of the blocks actually get written to, and the file creation is extremely quick.
Non-sparse (opaque) file:
Edit: since people have rightly pointed out that sparse files have characteristics that could be disadvantageous in some scenarios, here is the sweet spot:
You could use fallocate (present in Debian as part of util-linux) instead:
fallocate -l 390143672 output.dat
This still has the benefit of not needing to actually write the blocks, so it is pretty much as quick as creating the sparse file, but it is not sparse. Best Of Both Worlds.
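The same two approaches can be sketched in Python, in case you need to do this from a script rather than from the shell. The function names here are made up for illustration. The sketch assumes a Unix system where os.posix_fallocate is available; on a filesystem without native preallocation, the C library may fall back to writing the blocks, which is slower. The last helper also covers the edit in the question: it prints the size in MiB with two decimals.

```python
import os


def create_sparse(path, size):
    """Create a file of exactly `size` bytes without writing any data."""
    with open(path, "wb") as f:
        f.truncate(size)


def create_preallocated(path, size):
    """Create a file of exactly `size` bytes with its blocks reserved."""
    with open(path, "wb") as f:
        os.posix_fallocate(f.fileno(), 0, size)


def format_mib(size):
    """Format a byte count as MiB with two decimals."""
    return f"{size / 2**20:.2f} MiB"


def describe(path):
    """Return (apparent size, bytes allocated on disk, size in MiB)."""
    st = os.stat(path)
    # st_blocks counts 512-byte units on Linux.
    return st.st_size, st.st_blocks * 512, format_mib(st.st_size)
```

Comparing the first two values from describe() for both files shows the difference: the apparent size is the same, but only the preallocated file has its blocks reserved on disk.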

Change your parameters:
dd if=/dev/zero of=output.dat bs=1 count=390143672
otherwise dd tries to allocate a 372 MB buffer in memory.
If you want to do it more efficiently, write the bulk of the file first with large-ish blocks (say 1M), then write the tail with 1-byte blocks, using the seek option to skip to the end of the file first.
Ex:
dd if=/dev/zero of=./output.dat bs=1M count=1
dd if=/dev/zero of=./output.dat seek=1M bs=1 count=42
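For the size in the question, the split between the large-block part and the 1-byte tail is simple integer division. Here is a small sketch in Python that works it out and prints the matching pair of dd commands (the helper name split_for_dd is made up; the file name and the 1M block size are taken from the example above):

```python
def split_for_dd(total_bytes, block_size=1024 * 1024):
    """Split a size into whole blocks plus a remainder of single bytes."""
    full_blocks, tail_bytes = divmod(total_bytes, block_size)
    return full_blocks, tail_bytes


full, tail = split_for_dd(390143672)
# First command writes the whole 1 MiB blocks, second one appends the tail.
print(f"dd if=/dev/zero of=./output.dat bs=1M count={full}")
print(f"dd if=/dev/zero of=./output.dat seek={full * 1024 * 1024} bs=1 count={tail}")
```

For 390143672 bytes this gives 372 full blocks and a tail of 73400 bytes, so the second dd only has to do 73400 one-byte writes instead of 390 million.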

truncate - shrink or extend the size of a file to the specified size
The following example truncates putty.log from 298 bytes to 235 bytes.
root@ubuntu:~# ls -l putty.log
-rw-r--r-- 1 root root 298 2013-10-11 03:01 putty.log
root@ubuntu:~# truncate putty.log -s 235
root@ubuntu:~# ls -l putty.log
-rw-r--r-- 1 root root 235 2013-10-14 19:07 putty.log

Swap count and bs. bs bytes will be in memory, so it can't be that big.
