How does `ls -lh` round file size? - linux

I'm comparing the rounded file size value displayed by ls -lh to the raw size in bytes (as displayed by ls -l, say). I'm having a hard time figuring out what algorithm it uses to do the conversion from bytes.
My assumption is that it interprets the units K,M,G as either
(a) 10^3, 10^6, 10^9, or
(b) 1024, 1024^2, 1024^3.
On the one hand, I have one file that ls -l reports as 2052 bytes, and ls -lh rounds to 2.1K:
$ ls -l usercount.c
-rw-r--r-- 1 squirrel lsf 2052 May 13 15:41 usercount.c
$ ls -lh usercount.c
-rw-r--r-- 1 squirrel lsf 2.1K May 13 15:41 usercount.c
This would seem to support hypothesis (a), because 2052/1000=2.052 which rounds up to 2.1K but 2052/1024=2.0039 which clearly would display as 2.0K when rounded to one decimal place.
On the other hand, I have another file that ls -l reports as being 7223 bytes, which ls -lh displays as 7.1K:
$ ls -l traverse.readdir_r.c
-rw-r--r-- 1 squirrel lsf 7223 Jul 21 2014 traverse.readdir_r.c
$ ls -lh traverse.readdir_r.c
-rw-r--r-- 1 squirrel lsf 7.1K Jul 21 2014 traverse.readdir_r.c
This confusingly supports hypothesis (b), because 7223/1000=7.223 which should round down to 7.2K, but 7223/1024=7.0537 which rounds up to the displayed 7.1K.
This leads me to conclude that my assumption is wrong and that it does neither (a) nor (b) exclusively. What algorithm does ls use to do this rounding?

GNU ls will by default round up in 1024-based units.
It does not round to nearest, as you've taken for granted.
Here's the formatting flag from gnulib human.h:
/* Round to plus infinity (default). */
human_ceiling = 0,
This is consistent with everything you're seeing:
2052 is 2.0039 KiB which rounds up to 2.1
7223 is 7.0537 KiB which rounds up to 7.1
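The rounding described above can be sketched in Python. This is a minimal sketch, not gnulib's actual code: the name human_ceil is hypothetical, and it assumes the behaviour seen in the outputs here, namely one decimal place for values below 10 and whole numbers otherwise, always rounded up. Carry-over at unit boundaries (for example a value that rounds up to 1024K) is not handled.

```python
UNITS = ["", "K", "M", "G", "T", "P", "E"]

def human_ceil(size_bytes, base=1024):
    """Format a byte count roughly like `ls -lh`, rounding up (hypothetical helper)."""
    scale, unit = 1, 0
    # Pick the largest unit that does not exceed the size.
    while size_bytes >= scale * base and unit < len(UNITS) - 1:
        scale *= base
        unit += 1
    if unit == 0:
        return str(size_bytes)  # plain bytes, no suffix
    # Ceiling division on integers avoids floating-point surprises.
    tenths = -(-size_bytes * 10 // scale)
    if tenths < 100:
        return f"{tenths // 10}.{tenths % 10}{UNITS[unit]}"
    return f"{-(-size_bytes // scale)}{UNITS[unit]}"

print(human_ceil(2052))  # 2.1K
print(human_ceil(7223))  # 7.1K
```

Both results match the ls -lh output in the question, which round-to-nearest in either base cannot reproduce.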

By default the block size in ls is 1024, and sizes are rounded up: for example, if the result is 44.203125K, ls displays it as 45K.
You can change the block size too:
ls -lh --block-size=1000
See also the ls source code.

Related

'size' vs 'ls -l' to get the size of an executable file

For the same file, I think the output of ls -l xxx is always greater than or equal to the output of size xxx.
But when I type ls -l /bin/ls the output is:
-rwxr-xr-x 1 root root 104508 1月 14 2015 /bin/ls
For size /bin/ls, the output is:
text data bss dec hex filename
101298 976 3104 105378 19ba2 /bin/ls
Why is ls showing less than size? 104508 < 105378
ls -l is telling you the size of the file, while the size command tells you the size of the executable image stored in the file -- how much memory it will require when loaded. Some segments (such as .bss) are zero-initialized rather than requiring data in the file to initialize them, so the file may well be smaller than the executable image as a result.
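The arithmetic behind this can be checked with the numbers from the question. This is only a sketch: treating text + data as the part that must be stored in the file is a simplification, since a real executable also carries headers, symbol tables and other sections.

```python
# Values copied from the `size /bin/ls` and `ls -l /bin/ls` output above.
text, data, bss = 101298, 976, 3104
file_size = 104508

image_size = text + data + bss   # the "dec" column reported by size
stored_in_file = text + data     # .bss needs no bytes in the file

print(image_size, hex(image_size))   # 105378 0x19ba2
print(stored_in_file, file_size)     # 102274 104508
```

Once the 3104 bytes of .bss are left out, the remainder fits inside the 104508-byte file.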

Which file size is most accurate between ll, ls, and block size M or G?

So take the following dir:
4096 dir1
7255937636 dir2
This is what I get with just an ll command. If I do ls -l --block-size=M I end up with:
1M dir1
6920M dir2
Finally if I do ls -l --block-size=G I end up with:
1G dir1
7G dir2
I get that 6920 is easily rounded up to 7G but it seems like it's a big stretch to round that 4096 up to 1G. I also don't understand why the second example isn't 7256M or something more similar. Even more if we're always rounding up, why isn't the 7256 rounded up to 8G?
I guess I don't fully understand what I'm looking at here, since nothing gives a value as accurate as I expect.
Apparently you are confusing the blocksize with the unit used for displaying the (correct) size. Try using
ls -lh
to enable auto scaling for human-readable output.
BTW: ll usually is just an alias for ls -l. This is also the most accurate value you will get.
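The numbers in the question are what you get when a byte count is divided by a 1024-based block size and rounded up to a whole block. A minimal sketch, assuming that rounding rule (the helper name in_blocks is hypothetical):

```python
def in_blocks(size_bytes, block_size):
    """Number of whole blocks needed to hold size_bytes, rounded up."""
    return -(-size_bytes // block_size)

MiB, GiB = 1024**2, 1024**3
for size in (4096, 7255937636):
    print(size, f"{in_blocks(size, MiB)}M", f"{in_blocks(size, GiB)}G")
# 4096 1M 1G
# 7255937636 6920M 7G
```

So 7255937636 bytes is about 6919.8 MiB, not 7256, because M here means 1024*1024 bytes, and any non-zero size occupies at least one block, which is why 4096 bytes shows as 1G.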

Size() vs ls -la vs du -h which one is correct size?

I was compiling a custom kernel, and I wanted to test the size of the image file.
These are the results:
ls -la | grep vmlinux
-rwxr-xr-x 1 root root 8167158 May 21 12:14 vmlinux
du -h vmlinux
3.8M vmlinux
size vmlinux
text data bss dec hex filename
2221248 676148 544768 3442164 3485f4 vmlinux
Since all of them show different sizes, which one is closest to the actual image size?
Why are they different?
They are all correct, they just show different sizes.
ls shows size of the file (when you open and read it, that's how many bytes you will get)
du shows actual disk usage which can be smaller than the file size due to holes
size shows the size of the runtime image of an object/executable which is not directly related to the size of the file (bss uses no bytes in the file no matter how large, the file may contain debugging information that is not part of the runtime image, etc.)
If you want to know how much RAM/ROM an executable will take excluding dynamic memory allocation, size gives you the information you need.
Two distinctions need to be understood:
1. Run time vs. storage time (this is why size differs)
2. File depth vs. directory (this is why du differs)
Look at the below example:
[root@localhost test]# ls -l
total 36
-rw-r--r-- 1 root root 712 May 12 19:50 a.c
-rw-r--r-- 1 root root 3561 May 12 19:42 a.h
-rwxr-xr-x 1 root root 71624 May 12 19:50 a.out
-rw-r--r-- 1 root root 1403 May 8 00:15 b.c
-rw-r--r-- 1 root root 1403 May 8 00:15 c.c
[root@localhost test]# du -abch --max-depth=1
1.4K ./b.c
1.4K ./c.c
3.5K ./a.h
712 ./a.c
70K ./a.out
81K .
81K total
[root@localhost test]# ls -l
total 36
-rw-r--r-- 1 root root 712 May 12 19:50 a.c
-rw-r--r-- 1 root root 3561 May 12 19:42 a.h
-rwxr-xr-x 1 root root 71624 May 12 19:50 a.out
-rw-r--r-- 1 root root 1403 May 8 00:15 b.c
-rw-r--r-- 1 root root 1403 May 8 00:15 c.c
[root@localhost test]# size a.out
text data bss dec hex filename
3655 640 16 4311 10d7 a.out
If you run size on a file that is not an object file or executable, it reports an error.
Empirically differences happen most often for sparse files and for compressed files and can go in both directions.
du < ls
A sparse file records its logical size in metadata, which ls reports, while the unwritten regions (holes) occupy no disk blocks, so du does not count them. For example:
truncate -s 1m test.dat
creates a sparse file consisting entirely of nulls without disk usage, i.e. du shows 0 and ls shows 1M.
du > ls
On the other hand, du can indicate, as in your case, files that occupy a lot of space on disk (i.e. they are spread over many blocks) although not all blocks are filled, so their size in bytes (as measured by ls) is smaller than the du figure (which counts occupied blocks). I observed this rather prominently for some Python pickle files, for example.
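The two measurements can be compared directly from a program. This is a minimal sketch, assuming a Unix-like system where st_blocks is available and counted in 512-byte units (as on Linux); whether the file really ends up sparse depends on the filesystem.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.dat")
    with open(path, "wb") as f:
        f.truncate(1024 * 1024)   # extend to 1 MiB without writing any data

    st = os.stat(path)
    apparent = st.st_size            # the size ls -l reports
    allocated = st.st_blocks * 512   # the disk usage du counts (assumes 512-byte units)
    print(apparent, allocated)
```

On a filesystem that supports sparse files, the second number is typically 0 or far smaller than the first.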

How does "less" command get stdin input?

I'm just wondering about this problem:
If I can use something like "ls -al | less", less must be able to wait for input from stdin. What I expected after running a bare "less" command was for the program to hang and wait for input (as a consequence of calling gets() or something like that).
But why did it in fact show the error message "Missing filename ("less --help" for help)" and exit?
Thank you.
The less command can check both whether argc > 1 and whether stdin is associated with a file (not a tty).
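That check can be sketched as follows. This is a hypothetical illustration of the idea, not the actual logic in the less source; the function name choose_input and its return values are made up for the example.

```python
import sys

def choose_input(argv, stdin_is_tty):
    """Decide where a pager should read from (hypothetical logic)."""
    if len(argv) > 1:
        return "file"    # a filename was given on the command line
    if not stdin_is_tty:
        return "stdin"   # stdin is a pipe or a redirected file
    return "error"       # nothing to page: report a missing filename

if __name__ == "__main__":
    print(choose_input(sys.argv, sys.stdin.isatty()))
```

Run with no arguments from an interactive terminal, this takes the error branch, which corresponds to the "Missing filename" message in the question; run at the end of a pipe, it reads stdin.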
The less command is not designed in that manner. When you execute a command in bash it will display all of the information at once. The less command is used to display the STDOUT of a command or the contents of a file one screen at a time.
$ ls -al | less
total 16
drwxrwxr-x 4 hdante hdante 4096 Nov 24 17:11 .
drwxr-xr-x 88 hdante hdante 4096 Mar 24 22:14 ..
drwxrwxr-x 2 hdante hdante 4096 Nov 25 01:55 new
drwxrwxr-x 3 hdante hdante 4096 Nov 24 18:27 old
(END)
It works. Something is wrong with your less. From less manual pages:
http://www.linuxmanpages.com/man1/less.1.php
https://developer.apple.com/library/mac/#documentation/Darwin/Reference/ManPages/man1/less.1.html
The manual describes the filename as optional.
Hints to diagnose your problem:
try alias | grep less, to see if the command is being modified
try set | grep LESS, and check the scripts being run by LESSCLOSE and LESSOPEN

Basic Unix refresher inquiry: ls -ld

I know this is really basic, but I cannot find this information
in the ls man page, and need a refresher:
$ ls -ld my.dir
drwxr-xr-x 1 smith users 4096 Oct 29 2011 my.dir
What is the meaning of the number 1 after drwxr-xr-x ?
Does it represent the number of hard links to the directory my.dir?
I cannot remember. Where can I find this information?
Thanks,
John Goche
I found it on Wikipedia:
duuugggooo (hard link count) owner group size modification_date name
The number is the hard link count.
If you want a more UNIXy solution, type info ls. This gives more detailed information including:
`-l'
`--format=long'
`--format=verbose'
In addition to the name of each file, print the file type, file
mode bits, number of hard links, owner name, group name, size, and
timestamp (*note Formatting file timestamps::), normally the
modification time. Print question marks for information that
cannot be determined.
That is the number of names (hard links) of the file. I suppose there is an error in your example, though: for a directory it must be at least 2.
$ touch file
$ ls -l
total 0
-rw-r--r-- 1 igor igor 0 Jul 15 10:24 file
$ ln file file-link
$ ls -l
total 0
-rw-r--r-- 2 igor igor 0 Jul 15 10:24 file
-rw-r--r-- 2 igor igor 0 Jul 15 10:24 file-link
$ mkdir a
$ ls -l
total 0
drwxr-xr-x 2 igor igor 40 Jul 15 10:24 a
-rw-r--r-- 2 igor igor 0 Jul 15 10:24 file
-rw-r--r-- 2 igor igor 0 Jul 15 10:24 file-link
As you can see, as soon as you make a directory, you get 2 in that column.
When you make subdirectories in a directory, the number increases:
$ mkdir a/b
$ ls -ld a
drwxr-xr-x 3 igor igor 60 Jul 15 10:41 a
As you can see the directory has now three names ('a', '.' in it, and '..' in its subdirectory):
$ ls -id a ; cd a; ls -id .; ls -id b/..
39754633 a
39754633 .
39754633 b/..
All these three names point to the same directory (inode 39754633).
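The same check can be made from a program. This is a minimal sketch using a temporary directory; the link count is printed rather than relied upon, because some filesystems do not maintain the traditional directory link count.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    a = os.path.join(tmp, "a")
    os.makedirs(os.path.join(a, "b"))   # creates a and a/b

    # The three names from the transcript above: a, a/. and a/b/..
    inodes = {
        os.stat(a).st_ino,
        os.stat(os.path.join(a, ".")).st_ino,
        os.stat(os.path.join(a, "b", "..")).st_ino,
    }
    same_inode = len(inodes) == 1
    link_count = os.stat(a).st_nlink    # 3 on many filesystems; others may differ
    print(same_inode, link_count)
```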
Here is an attempt to explain why the initial link count of a directory is 2.
Any file or directory is identified by an inode.
Number of hard links = number of references to the inode.
When a file or directory is created, one directory entry (of the
form {myname, myinodenumber}) is created in the parent directory.
This makes the reference count of the inode for that file or directory 1.
When a directory is created, space for the directory itself is also allocated, and by default it holds two directory entries:
one for the directory that was created and another for the
parent directory, that is, two entries of the form {., myinodenumber}
and {.., myparent'sinodenumber}.
The current directory is referred to by "." and the parent by "..".
So when we create a directory, its initial link count is 1+1=2,
since there are two references to myinodenumber. The parent's link
count is also increased by 1.
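The bookkeeping described above can be modelled with a toy in-memory structure. This is a sketch for illustration only: the class ToyFS and its methods are hypothetical and do not reflect how any real filesystem stores directories.

```python
class ToyFS:
    """Toy model: each directory maps names to inode numbers."""

    def __init__(self):
        self.next_inode = 2
        self.dirs = {1: {".": 1, "..": 1}}   # inode 1 plays the role of the root

    def mkdir(self, parent, name):
        inode = self.next_inode
        self.next_inode += 1
        self.dirs[parent][name] = inode                 # entry {name, inode} in the parent
        self.dirs[inode] = {".": inode, "..": parent}   # the new directory's own two entries
        return inode

    def link_count(self, inode):
        # Count every directory entry that refers to this inode.
        return sum(list(d.values()).count(inode) for d in self.dirs.values())

fs = ToyFS()
a = fs.mkdir(1, "a")
print(fs.link_count(a))   # 2: "a" in the parent and "." inside a
b = fs.mkdir(a, "b")
print(fs.link_count(a))   # 3: the ".." entry inside b is added
```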

Resources