size vs ls -la vs du -h: which one is the correct size? - linux

I was compiling a custom kernel, and I wanted to test the size of the image file.
These are the results:
ls -la | grep vmlinux
-rwxr-xr-x 1 root root 8167158 May 21 12:14 vmlinux
du -h vmlinux
3.8M vmlinux
size vmlinux
text data bss dec hex filename
2221248 676148 544768 3442164 3485f4 vmlinux
Since all of them show different sizes, which one is closest to the actual image size?
Why are they different?

They are all correct; they just show different sizes.
ls shows the size of the file (when you open and read it, that is how many bytes you will get).
du shows the actual disk usage, which can be smaller than the file size due to holes.
size shows the size of the runtime image of an object file/executable, which is not directly related to the size of the file (bss uses no bytes in the file no matter how large it is, the file may contain debugging information that is not part of the runtime image, etc.).
If you want to know how much RAM/ROM an executable will take, excluding dynamic memory allocation, size gives you the information you need.
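As a minimal sketch of the difference (file names and exact numbers are illustrative; they depend on your compiler and filesystem), a tiny C program with a large zero-initialized global array makes the distinction visible, because that array lands in .bss and therefore costs runtime memory but no file space:
$ printf 'char buffer[1024 * 1024];\nint main(void) { return 0; }\n' > big_bss.c
$ gcc -o big_bss big_bss.c
$ ls -l big_bss    # on-disk size of the executable: a few KB
$ du -h big_bss    # blocks actually allocated for that file
$ size big_bss     # the bss column reports roughly 1 MiB, which exists only at run time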

Two definitions need to be understood:
1. runtime vs. store time (this is why size differs)
2. file vs. directory depth (this is why du differs)
Look at the below example:
[root@localhost test]# ls -l
total 36
-rw-r--r-- 1 root root 712 May 12 19:50 a.c
-rw-r--r-- 1 root root 3561 May 12 19:42 a.h
-rwxr-xr-x 1 root root 71624 May 12 19:50 a.out
-rw-r--r-- 1 root root 1403 May 8 00:15 b.c
-rw-r--r-- 1 root root 1403 May 8 00:15 c.c
[root@localhost test]# du -abch --max-depth=1
1.4K ./b.c
1.4K ./c.c
3.5K ./a.h
712 ./a.c
70K ./a.out
81K .
81K total
[root@localhost test]# size a.out
text data bss dec hex filename
3655 640 16 4311 10d7 a.out
If you run size on a file that is not an object file or executable, it will report an error.

Empirically, the differences show up most often for sparse files and for compressed files, and they can go in either direction.
du < ls
A sparse file records its nominal length in metadata, which ls reads and reports, while du only counts the blocks actually allocated. For example:
truncate -s 1m test.dat
creates a sparse file consisting entirely of nulls with no disk usage, i.e. du shows 0 and ls shows 1M.
du > ls
On the other hand, du can indicate, as in your case, files that occupy a lot of space on disk (i.e. they are spread over many blocks) but whose blocks are not all completely filled, so their byte size (measured by ls) is smaller than what du reports from the occupied blocks. I observed this rather prominently for some Python pickle files, for example.
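The same block-granularity effect is easy to reproduce at the small end with any file shorter than one filesystem block (a quick sketch; the exact du figure depends on your filesystem's block size):
$ echo hello > tiny.txt
$ ls -l tiny.txt    # 6 bytes of content
$ du -h tiny.txt    # typically 4.0K: a whole block is allocated for it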

Related

S-record files output by objcopy are smaller than the original binaries

After using arm-none-eabi-gcc to build a file in ELF format, I am using arm-none-eabi-objcopy to create an S-record file. The command that my makefile runs is:
$(TOOLCHAIN)-objcopy --srec-len 10 -O srec "$<" "$@"
The makefile can build with various different settings - with debug symbols, with optimization, and with neither.
With some information such as my username removed, the output of ls -la after doing all three builds is:
-rw-r--r-- 1 4096 270330 Oct 12 18:13 outfile_Debug.mot
-rw-r--r-- 1 4096 825888 Oct 12 18:13 outfile_Debug.out
-rw-r--r-- 1 4096 270334 Oct 12 17:06 outfile_Default.mot
-rw-r--r-- 1 4096 465928 Oct 12 17:06 outfile_Default.out
-rw-r--r-- 1 4096 184776 Oct 12 19:02 outfile_Optimized.mot
-rw-r--r-- 1 4096 395672 Oct 12 19:02 outfile_Optimized.out
Now, I have read an unsourced claim that srec files cannot contain debugging information, which would explain why the Default and Debug .mot files are roughly the same size while the corresponding .out file sizes differ enormously. But otherwise, the ELF file is a binary representation of the executable, while the S-record file uses hex strings in ASCII text, so surely it should be larger than the binary ELF file for a non-debug build?
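One way to investigate (a sketch, not part of the original thread, assuming the same GNU ARM toolchain) is to compare per-section sizes: objcopy -O srec copies only the allocated, loadable sections, so purely informational ELF content such as the symbol and string tables never reaches the .mot file, even in a non-debug build:
$ arm-none-eabi-objdump -h outfile_Default.out   # section list with sizes and flags; only allocated/loadable sections end up in the S-record
$ arm-none-eabi-size -A outfile_Default.out      # per-section byte counts in SysV format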

How does `ls -lh` round file size?

I'm comparing the rounded file size value displayed by ls -lh to the raw size in bytes (as displayed by ls -l, say). I'm having a hard time figuring out what algorithm it uses to do the conversion from bytes.
My assumption is that it interprets the units K,M,G as either
(a) 10^3, 10^6, 10^9, or
(b) 1024, 1024^2, 1024^3.
On the one hand, I have one file that ls -l reports as 2052 bytes, and ls -lh rounds to 2.1K:
$ ls -l usercount.c
-rw-r--r-- 1 squirrel lsf 2052 May 13 15:41 usercount.c
$ ls -lh usercount.c
-rw-r--r-- 1 squirrel lsf 2.1K May 13 15:41 usercount.c
This would seem to support hypothesis (a), because 2052/1000=2.052 which rounds up to 2.1K but 2052/1024=2.0039 which clearly would display as 2.0K when rounded to one decimal place.
On the other hand, I have another file that ls -l reports as being 7223 bytes, which ls -lh displays as 7.1K:
$ ls -l traverse.readdir_r.c
-rw-r--r-- 1 squirrel lsf 7223 Jul 21 2014 traverse.readdir_r.c
$ ls -lh traverse.readdir_r.c
-rw-r--r-- 1 squirrel lsf 7.1K Jul 21 2014 traverse.readdir_r.c
This confusingly supports hypothesis (b), because 7223/1000=7.223, which should round down to 7.2K, but 7223/1024=7.0537, which rounds up to the displayed 7.1K.
This leads me to conclude that my assumption is wrong and that it does neither (a) nor (b) exclusively. What algorithm does ls use to do this rounding?
GNU ls will by default round up in 1024-based units.
It does not round to nearest, as you've taken for granted.
Here's the formatting flag from gnulib human.h:
/* Round to plus infinity (default). */
human_ceiling = 0,
This is consistent with everything you're seeing:
2052 is 2.0039 KiB which rounds up to 2.1
7223 is 7.0537 KiB which rounds up to 7.1
By default the block size in ls is 1024; for example, a value of 44.203125K is rounded up to 45K.
You can change it, e.g.:
ls -lh --block-size=1000
See the GNU coreutils ls source code for details.
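You can reproduce the ceiling rounding yourself with synthetic files (a sketch assuming GNU coreutils truncate and ls; the file names are illustrative):
$ truncate -s 2052 a.dat; truncate -s 7223 b.dat
$ ls -lh a.dat b.dat        # GNU ls shows 2.1K and 7.1K: both rounded up in 1024-byte units
$ ls -lh --si a.dat b.dat   # with powers of 1000 instead: 2.1k and 7.3k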

JFFS2 filesystem corrupts immediately (Magic bitmask 0x1985 not found errors)

I have created a root filesystem with buildroot that is using squashfs. It works fine, and now I would like to create an overlayfs, which would hold /home and /etc directories.
For this purpose, I wanted to create a simple jffs2 filesystem with couple of files:
jlumme#simppa:~/projects/jffs2_home$ ls -la
total 20
drwxrwxr-x 4 jlumme jlumme 4096 Apr 21 16:21 .
drwxrwxr-x 6 jlumme jlumme 4096 Apr 21 16:21 ..
drwxrwxr-x 2 jlumme jlumme 4096 Apr 21 13:45 default
drwxrwxr-x 2 jlumme jlumme 4096 Apr 21 13:45 ftp
-rw-rw-r-- 1 jlumme jlumme 24 Apr 21 15:34 test.txt
The flash chip I use is the SST25VF064C, so I believe its erase block size is 64 KB, and thus I create a filesystem image from that folder:
mkfs.jffs2 -r jffs2_home/ -e 64 -o home.jffs2
$ ls -la
-rw-r--r-- 1 jlumme jlumme 496 Apr 21 15:42 home.jffs2
(Surprisingly, if I set -e 32, or even -e 4, the resulting binary image doesn't change at all?)
Nevertheless, moving on, I have aligned my mtdblock that contains home, to 64KB, and my flash layout looks like this:
uboot/<0x00000000 0x40000>
kernel/<0x00040000 0x3D9000>
dtb/<0x00419000 0x10000>
rootfs/<0x00429000 0x1F7000>
home/<0x00620000 0x1E0000>
On my board, I can mount the mtdblock4 fine, and I can read the file contents properly. However, if I modify the file, and try saving it, vi complains:
[ 77.030000] jffs2: Node totlen on flash (0xffffffff) != totlen from node ref (0x00000044)
Now, if I unmount the filesystem, and remount it, I start getting complaints immediately:
# mount -t jffs2 /dev/mtdblock4 /home/
[ 99.740000] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x001d4070: 0xff0a instead
[ 99.760000] jffs2: Empty flash at 0x001d4074 ends at 0x001d412c
[ 99.770000] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x001d412c: 0xffff instead
[ 99.790000] jffs2: Empty flash at 0x001d4130 ends at 0x001d4194
[ 99.790000] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x001d4194: 0xff0a instead
I suppose my filesystem is now already corrupted... and I don't really understand why.
Any ideas where I am going wrong with this? Thanks for all suggestions.
This is what I did to solve the issue:
1. Updated to newer MTD drivers from http://www.linux-mtd.infradead.org/ - there was new code for the SST25VF064C chip.
2. Made sure the area reserved for JFFS2 was initialized to 0xFF.
3. (Possibly optional) Specified the creation of the jffs2 filesystem more precisely:
mkfs.jffs2 -e 64 -l -p -s 4096 -r jffs2_home/ -o home.jffs2
With these changes the file system now reads and writes as expected.
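For step 2 (making sure the JFFS2 area is all 0xFF), the usual mtd-utils sequence is to erase the partition and only then write the image (a sketch; the mtd device number must match your own flash layout):
# flash_erase /dev/mtd4 0 0         # offset 0, block count 0 = erase the whole partition, leaving it filled with 0xFF
# flashcp -v home.jffs2 /dev/mtd4   # copy the image into the freshly erased NOR partition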

Basic Unix refresher inquiry: ls -ld

I know this is really basic, but I cannot find this information
in the ls man page, and need a refresher:
$ ls -ld my.dir
drwxr-xr-x 1 smith users 4096 Oct 29 2011 my.dir
What is the meaning of the number 1 after drwxr-xr-x ?
Does it represent the number of hard links to the directory my.dir?
I cannot remember. Where can I find this information?
Thanks,
John Goche
I found it on Wikipedia:
duuugggooo (hard link count) owner group size modification_date name
The number is the hard link count.
If you want a more UNIXy solution, type info ls. This gives more detailed information including:
`-l'
`--format=long'
`--format=verbose'
In addition to the name of each file, print the file type, file
mode bits, number of hard links, owner name, group name, size, and
timestamp (*note Formatting file timestamps::), normally the
modification time. Print question marks for information that
cannot be determined.
That is the number of names (hard links) of the file. And I suppose there is an error in the question: for a directory it must be at least 2.
$ touch file
$ ls -l
total 0
-rw-r--r-- 1 igor igor 0 Jul 15 10:24 file
$ ln file file-link
$ ls -l
total 0
-rw-r--r-- 2 igor igor 0 Jul 15 10:24 file
-rw-r--r-- 2 igor igor 0 Jul 15 10:24 file-link
$ mkdir a
$ ls -l
total 0
drwxr-xr-x 2 igor igor 40 Jul 15 10:24 a
-rw-r--r-- 2 igor igor 0 Jul 15 10:24 file
-rw-r--r-- 2 igor igor 0 Jul 15 10:24 file-link
As you can see, as soon as you make a directory, you get 2 in that column.
When you make subdirectories in a directory, the number increases:
$ mkdir a/b
$ ls -ld a
drwxr-xr-x 3 igor igor 60 Jul 15 10:41 a
As you can see, the directory now has three names ('a', '.' inside it, and '..' in its subdirectory):
$ ls -id a ; cd a; ls -id .; ls -id b/..
39754633 a
39754633 .
39754633 b/..
All these three names point to the same directory (inode 39754633).
Trying to explain why the initial link count value for a directory is 2. See if this helps.
Any file or directory is identified by an inode.
Number of hard links = number of references to that inode.
When a file or directory is created, one directory entry (of the form {myname, myinodenumber}) is created in the parent directory. This makes the reference count of the inode for that file or directory 1.
When a directory is created, the space for the directory itself is also created, and by default it holds two directory entries: one for the directory just created and one for its parent, i.e. the entries {., myinodenumber} and {.., myparent'sinodenumber}.
The current directory is referred to by "." and the parent by "..".
So when we create a directory, the initial link count is 1 + 1 = 2, since there are two references to myinodenumber, and the parent's link count is increased by 1.
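The resulting rule, link count of a directory = 2 + number of immediate subdirectories, is easy to check with stat (a sketch using GNU stat's %h format, which prints the hard link count):
$ mkdir -p d/sub1 d/sub2
$ stat -c '%h %n' d    # prints "4 d": the entry in the parent, d/., and one ".." in each of the two subdirectories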

Why doesn't "total" from ls -l add up to total file sizes listed? [closed]

Why is the total in the output of ls -l printed as 64 and not 26078, which is the sum of all the file sizes listed?
$ ls -l ~/test/ls
total 64
-rw-r--r-- 1 root root 15276 Oct 5 2004 a2ps.cfg
-rw-r--r-- 1 root root 2562 Oct 5 2004 a2ps-site.cfg
drwxr-xr-x 4 root root 4096 Feb 2 2007 acpi
-rw-r--r-- 1 root root 48 Feb 8 2008 adjtime
drwxr-xr-x 4 root root 4096 Feb 2 2007 alchemist
You can find the definition of that line in the ls documentation for your platform. For coreutils ls (the one found on a lot of Linux systems), the information can be found via info coreutils ls:
For each directory that is listed, preface the files with a line
`total BLOCKS', where BLOCKS is the total disk allocation for all
files in that directory.
The Formula: What is that number?
total = sum over each listed file of (physical_blocks_in_use * physical_block_size / ls_block_size)
Where:
ls_block_size is the display unit used by ls (normally 512 or 1024 bytes), which is freely modifiable with the --block-size=<int> flag on ls, the POSIXLY_CORRECT=1 GNU environment variable (to get 512-byte units), or the -k flag to force 1 KiB units.
physical_block_size is the OS-dependent value of an internal block interface, which may or may not be connected to the underlying hardware. This value is normally 512 B or 1 KiB, but is completely dependent on the OS. It can be revealed through the %B format of stat or fstat. Note that this value is (almost always) unrelated to the number of physical blocks on a modern storage device.
Why so confusing?
This number is fairly detached from any physical or meaningful metric. Many junior programmers haven't had experience with file holes or hard/sym links. In addition, the documentation available on this specific topic is virtually non-existent.
The disjointedness and ambiguity of the term "block size" is a result of numerous different measures being easily confused, and of the relatively deep levels of abstraction involved in disk access.
Examples of conflicting information: du (or ls -s) vs stat
Running du * in a project folder yields the following: (Note: ls -s returns the same results.)
dactyl:~/p% du *
2 check.cc
2 check.h
1 DONE
3 Makefile
3 memory.cc
5 memory.h
26 p2
4 p2.cc
2 stack.cc
14 stack.h
Total: 2+2+1+3+3+5+26+4+2+14 = 62 Blocks
Yet when one runs stat we see a different set of values. Running stat in the same directory yields:
dactyl:~/p% stat * --printf="%b\t(%B)\t%n: %s bytes\n"
3 (512) check.cc: 221 bytes
3 (512) check.h: 221 bytes
1 (512) DONE: 0 bytes
5 (512) Makefile: 980 bytes
6 (512) memory.cc: 2069 bytes
10 (512) memory.h: 4219 bytes
51 (512) p2: 24884 bytes
8 (512) p2.cc: 2586 bytes
3 (512) stack.cc: 334 bytes
28 (512) stack.h: 13028 bytes
Total: 3+3+1+5+6+10+51+8+3+28 = 118 Blocks
Note: You can use the command stat * --printf="%b\t(%B)\t%n: %s bytes\n" to output (in order) the number of blocks, (in parentheses) the size of those blocks, the name of the file, and the size in bytes, as shown above.
There are two important takeaways:
stat reports both the physical_blocks_in_use and physical_block_size as used in the formula above. Note that these are values based on OS interfaces.
du is providing what is generally accepted as a fairly accurate estimate of physical disk utilization.
For reference, here is the ls -l of directory above:
dactyl:~/p% ls -l
total 59
-rw-r--r--. 1 dhs217 grad 221 Oct 16 2013 check.cc
-rw-r--r--. 1 dhs217 grad 221 Oct 16 2013 check.h
-rw-r--r--. 1 dhs217 grad 0 Oct 16 2013 DONE
-rw-r--r--. 1 dhs217 grad 980 Oct 16 2013 Makefile
-rw-r--r--. 1 dhs217 grad 2069 Oct 16 2013 memory.cc
-rw-r--r--. 1 dhs217 grad 4219 Oct 16 2013 memory.h
-rwxr-xr-x. 1 dhs217 grad 24884 Oct 18 2013 p2
-rw-r--r--. 1 dhs217 grad 2586 Oct 16 2013 p2.cc
-rw-r--r--. 1 dhs217 grad 334 Oct 16 2013 stack.cc
-rw-r--r--. 1 dhs217 grad 13028 Oct 16 2013 stack.h
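Applying the formula above to these numbers: stat counted 118 physical blocks of 512 bytes each, i.e. 118 * 512 = 60416 bytes, which is 60416 / 1024 = 59 of ls's default 1024-byte units, matching the "total 59" line in this listing.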
That is the total number of file system blocks, including indirect blocks, used by the listed files. If you run ls -s on the same files and sum the reported numbers you'll get that same number.
Just to mention - you can use -h (ls -lh) to convert this into a human-readable format.
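To check the claim that the ls -s numbers sum to the same total, you can compare the two directly (a quick sketch assuming GNU ls and awk):
$ ls -s | awk 'NR == 1 { print "reported:", $2 } NR > 1 { sum += $1 } END { print "summed:", sum }'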
