Does writing data to stdout in Linux occupy disk space?

I want to know whether writing data to stdout on a terminal in Linux occupies disk space. I looked up stdout and tty in the man pages, but found no answer to this question.
Thanks for any tip.
I have a router running OpenWrt. Its total storage is 2 MB, and only 12 KB is left. I want to run a bash script that prints some information to the terminal on the OpenWrt system, so I want to know whether printing data to stdout could occupy disk space.

stdout is not a specific file or a location (disk, memory, etc.); it is a concept. Usually, when talking about stdout in Linux applications, we mean a file-like object (i.e. one supporting write, flush, etc.).
Output can be redirected to a real file on disk, in which case writing to stdout can and will use up disk space.

Unless the output is being redirected (through a pipe to e.g. tee, or to a file with the > shell redirection operator), then no, output will not go to a file system. Output to a console (or terminal emulator) is slow as it is; passing it through the file system would make it even slower, and might keep the disks from spinning down if there is a lot of output.
It may, however, end up in swap space if the OS decides to swap the process out before the output is actually written.
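
To check which case applies, a script can ask the kernel what its stdout is actually attached to. The following is a minimal Python sketch, not part of the original answers; the function name describe_fd is made up for illustration, and on a very small router a shell test such as [ -t 1 ] may be more practical than Python.

```python
import os
import stat


def describe_fd(fd):
    """Return a rough label for what a file descriptor is attached to."""
    if os.isatty(fd):
        return "terminal"          # output is displayed, not stored on disk
    mode = os.fstat(fd).st_mode
    if stat.S_ISREG(mode):
        return "regular file"      # writes consume file system space
    if stat.S_ISFIFO(mode):
        return "pipe"              # space use depends on the reading process
    if stat.S_ISCHR(mode):
        return "character device"  # e.g. /dev/null
    return "other"
```

Calling describe_fd(1) inside a script started as "script > out.txt" should return "regular file", while the same script started directly in a terminal should return "terminal".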

Related

If the size of the file exceeds the maximum size of the file system, what happens?

For example, on a FAT32 partition the maximum file size is 4GB, but I was able to create a 5GB file with vim. I saved the file and opened it again, and the console output was broken like a staircase. I have three questions.
If the size of the file exceeds the maximum size of the file system, what happens?
In my case, why did the output break?
In Unix system call, stat() can succeed up to a 2GB(2^31 - 1). Does this have anything to do with the file system? Is there a relationship between the limits of data in stat() and the limits of each feature in the file system?
If the size of the file exceeds the maximum size of the file system, what happens?
By definition, that can never happen. What really happens is that some system call (probably write(2)) fails, and the code making that call should handle the failure.
Notice that FAT32 filesystems restrict the maximum file size to 4 GiB minus one byte (2^32 - 1 bytes). Use a better file system on your USB key if you want more (or use split(1) to cut large files into smaller chunks before copying them to your FAT32-formatted USB key).
If using <stdio.h>, notice that fflush(3), fprintf(3), fclose(3) (and most other standard functions) can fail (e.g. because an underlying write(2) fails).
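The same rule applies in any language: every write has to be checked. Here is a minimal Python sketch of a checked write loop; the helper name checked_write is hypothetical. On a full or size-limited file system the reported error would typically be ENOSPC or EFBIG, but the sketch simply reports whatever errno the kernel returns.

```python
import errno
import os


def checked_write(fd, data):
    """Write all of data to fd.

    Returns (bytes_written, error_name), where error_name is None on
    success or the symbolic errno name (e.g. "ENOSPC") on failure.
    """
    written = 0
    try:
        while written < len(data):
            # os.write may write fewer bytes than requested, so loop.
            written += os.write(fd, data[written:])
    except OSError as exc:
        return written, errno.errorcode.get(exc.errno, str(exc.errno))
    return written, None
```

A caller that ignores the second element of the result can end up with a silently truncated file, which is the situation described above.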
the console output was broken like a staircase
Probably because your pseudoterminal was in some broken state. See stty(1), reset(1), termios(3) and read "The TTY demystified".
In Unix system call, stat() can succeed up to a 2GB(2^31 - 1)
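You are misunderstanding stat(2). Read its documentation again. The 2^31 - 1 limit is not a file system limit: it comes from a 32-bit off_t, and stat(2) fails with EOVERFLOW when the file size cannot be represented in it. Building with large file support (_FILE_OFFSET_BITS=64) lifts that limit.
Read Advanced Linux Programming, then syscalls(2).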
I was able to create a 5GB file with vim
To understand the behavior of vim, first read its documentation, then study its source code (it is free software, and you can and perhaps should study its code).
You could also use strace(1) to understand what system calls are done by some command or process.

How to create a log file that "pop_front"s?

Suppose I have a console program that outputs trace debug lines on stdout, that I want to run on a server.
And then I do:
./serverapp > someoutputfile
If I need to see how the program's doing, I would just log into the server and do:
tail -f someoutputfile
However, understandably over time, someoutputfile would become pretty big.
Is there a way to limit someoutputfile to a certain size, so that it keeps only the most recent output?
I mean, the hard way would be to make a custom script/program that cycles the output between different files, but that seems like overkill.
You can truncate the log file. One way to do this is to type:
>someoutputfile
at the shell command-line. It's a redirect with no output and it will erase all the contents of the file.
The tricky bit here is that any program writing to that file will continue to write at its last output position (unless it opened the file in append mode, as the shell does for >>). So at the next write the file gains a "hole" from 0 to X bytes, where X is that output position.
On most Linux file systems these holes result in sparse files, which don't actually use disk space for the hole. So the file may appear to contain many gigabytes of zeros at the beginning but only use 500 KB on disk.
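The effect can be reproduced on a small scale. The Python sketch below is an illustration only, with a hypothetical function name: it plays both roles, the writer holding the file open and the other shell truncating it, and returns the apparent size together with the space the file system reports as allocated. With append=True it opens the file the way >> does, in which case no hole appears.

```python
import os


def truncate_under_writer(path, append=False):
    """Write, truncate the file behind the writer's back, write again.

    Returns (apparent_size, allocated_bytes) of the resulting file.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    if append:
        flags |= os.O_APPEND  # what the shell's >> redirection uses
    fd = os.open(path, flags, 0o644)
    try:
        os.write(fd, b"A" * 4096)   # writer's offset is now 4096
        os.truncate(path, 0)        # same effect as `>path` in another shell
        os.write(fd, b"tail\n")     # lands at offset 4096 unless O_APPEND
    finally:
        os.close(fd)
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512  # st_blocks is in 512-byte units
```

Without append mode the apparent size is 4101 bytes, the first 4096 of which read back as zeros; how many bytes are actually allocated depends on the file system. With append mode the file is just 5 bytes long.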
Another way to do fast logging is to memory map a file on disk of fixed size: 16 MB for example. Then the logging writes into a memory pointer which wraps around when it reaches the size limit. It then continues to write at the front of the file. It's a good idea to have some kind of write position marker. I use <====>, for example. I find this method to be ridiculously fast and great for debug logging.
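A rough Python sketch of that wrap-around scheme follows. It is an assumption-laden illustration rather than the answerer's actual code: the class name RingLog and its interface are invented here, and only the <====> marker is taken from the text above.

```python
import mmap
import os


class RingLog:
    """Fixed-size log file that wraps around instead of growing."""

    MARK = b"<====>"  # shows where the most recent write ended

    def __init__(self, path, size=16 * 1024 * 1024):
        self.size = size
        self.pos = 0
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
        try:
            os.ftruncate(fd, size)          # file never grows past this
            self.buf = mmap.mmap(fd, size)  # mmap keeps its own descriptor
        finally:
            os.close(fd)

    def _put(self, pos, data):
        """Copy data into the buffer at pos, wrapping at the end."""
        end = pos + len(data)
        if end <= self.size:
            self.buf[pos:end] = data
        else:
            split = self.size - pos
            self.buf[pos:self.size] = data[:split]
            self.buf[0:end - self.size] = data[split:]
        return end % self.size

    def write(self, line):
        if len(line) + len(self.MARK) > self.size:
            raise ValueError("record larger than the log")
        self.pos = self._put(self.pos, line)
        # The marker is overwritten by the next record.
        self._put(self.pos, self.MARK)
```

To read such a log, search for the marker: the oldest surviving data starts right after it and the newest ends right before it.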
I haven't used it, but it gets good reviews here on SO, try logrotate
A more general discussion of managing output files may show you that a custom script/solution is not out of the question ;-) : Problem with Bash output redirection
I hope this helps.

How do I measure net used disk space change due to activity by a given process in Linux?

I'd like to monitor disk space requirements of a running process. Ideally, I want to be able to point to a process and find out the net change in used disk space attributable to it. Is there an easy way of doing this in Linux? (I'm pretty sure it would be feasible, though maybe not very easy, to do this in Solaris with DTrace)
Probably you'll have to ptrace it (or get strace to do it for you and parse the output), and then try to work out what disc is being used.
This is nontrivial, as your tracing process will need to understand which file operations use disc space - and be free of race conditions. However, you might be able to do an approximation.
Quite a lot of things can use up disc space, because most Linux filesystems support "holes". I suppose you could count holes as well for accounting purposes.
Another problem is knowing what filesystem operations free up disc space - for example, opening a file for writing may, in some cases, truncate it. This clearly frees up space. Likewise, renaming a file can free up space if it's renamed over an existing file.
Another issue is processes which invoke helper processes to do stuff - for example if myprog does a system("rm -rf somedir").
Also it's somewhat difficult to know when a file has been completely deleted, as it might be deleted from the filesystem but still open by another process.
Happy hacking :)
If you know the PID of the process to monitor, you'll find plenty of information about it in /proc/<PID>.
The file /proc/<PID>/io contains statistics about bytes read and written by the process; it should be what you are looking for.
Moreover, in /proc/<PID>/fd/ you'll find links to all the files opened by your process, so you could monitor them.
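As a starting point, the counters in /proc/<PID>/io can be parsed with a few lines of Python. This is only a sketch with hypothetical function names. Note that write_bytes counts data the process caused to be sent to the storage layer, so it measures I/O volume rather than the net change in used disk space (overwriting a file in place, for example, adds to it without using more space).

```python
def parse_proc_io(text):
    """Parse the 'key: value' lines of /proc/<PID>/io into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            stats[key.strip()] = int(value)
    return stats


def read_proc_io(pid="self"):
    """Read the I/O counters of a process (needs permission to read them)."""
    with open(f"/proc/{pid}/io") as f:
        return parse_proc_io(f.read())


def storage_write_bytes(stats):
    """Bytes sent to storage, minus writes cancelled before reaching it."""
    return stats["write_bytes"] - stats["cancelled_write_bytes"]
```

Sampling read_proc_io(pid) twice and subtracting the results gives the I/O done in between.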
DTrace for Linux is also available:
http://librenix.com/?inode=13584
Ashitosh

Linux BASH memory leak when redirecting stdio

I've got a memory leak somewhere, but it doesn't appear to be related to my program. I'm making this bold statement based on the fact that once my program terminates, either by normal means, seg-faulting, or aborting, the memory isn't recovered. If my program were the culprit, I would assume the MMU would recover everything, but this doesn't appear to be the case.
The leak only comes into play when I redirect stdout (in BASH version 2.05 or 4) to a file, as in this is okay:
# my-program
but this isn't:
# my-program > /mnt/sda1/log-output.txt
The rate at which I'm printing to the screen is < 2Kb/sec. (The file is on a USB key).
Any ideas?
A related question is here.
MemFree alone says nearly nothing.
Linux caches a lot of file data in memory (the page cache), so writing to a file lowers MemFree without anything being leaked.
You can see how much is being used for filesystem (and other) caches in the same /proc/meminfo you have mentioned.
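A small Python sketch of that check, with hypothetical function names: it adds the Buffers and Cached fields to MemFree. Treat the sum as a rough estimate only, since not all cached memory can be reclaimed.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo into a dict; sizes are in kB as reported."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            info[key.strip()] = int(fields[0])
    return info


def free_plus_cache_kb(info):
    """MemFree plus buffer and page cache, as a rough 'usable' figure."""
    return info["MemFree"] + info.get("Buffers", 0) + info.get("Cached", 0)


def read_meminfo():
    with open("/proc/meminfo") as f:
        return parse_meminfo(f.read())
```

If MemFree falls while this sum stays roughly constant during the run, the memory is going into caches and not into a leak.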

How to find or calculate a Linux process's page table size and other kernel accounting?

How can I find out how big a Linux process's page table is, along with any other variable-size process accounting?
If you are really interested in the page tables, do a
$ grep PageTables /proc/meminfo
PageTables: 24496 kB
Since Linux 2.6.10, the amount of memory used by a single process' page tables has been exposed via the VmPTE field of /proc/<pid>/status.
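Reading that field from a program is straightforward. A minimal Python sketch with hypothetical function names; it returns None when the field is absent, as it is for kernel threads.

```python
def vm_pte_kb(status_text):
    """Return the VmPTE value in kB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmPTE:"):
            return int(line.split()[1])
    return None


def read_vm_pte_kb(pid="self"):
    """Page table size of a process in kB, or None if not reported."""
    with open(f"/proc/{pid}/status") as f:
        return vm_pte_kb(f.read())
```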
Not sure about Linux, but most UNIX variants provide sysctl(3) for this purpose. There is also the sysctl(8) command line utility.
Hmmm, back in Ye Olden Tymes, we used to call nlist(3) to get the system address for the data we were interested in, then open /dev/kmem, seek to the address, then read the data. Not sure if this works in Linux, but it might be worth typing "man 3 nlist" and seeing what comes back.
You should describe your problem, and not ask about details. If you fork too much (especially with a process which has a large address space) there are all kind of things which go wrong (including out of memory), hitting a pagetable maximum size is IMHO not a realistic problem.
That said, I would also be interested in reading a process's page table share in Linux.
As a simple rule of thumb you can however assume that each process occupies a share of the page tables proportional to its virtual size, for example 6 bytes for each page. So, for example, if you have an Oracle database with an 8 GB SGA and 500 processes sharing it, each process will use 14 MB of page tables, which results in 7 GB of page tables plus the 8 GB SGA. (sample numbers from http://kevinclosson.wordpress.com/2009/07/25/little-things-doth-crabby-make-%E2%80%93-part-ix-sometimes-you-have-to-really-really-want-your-hugepages/)
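The arithmetic behind this kind of estimate can be sketched in Python. The defaults here are assumptions and not figures from the post: 4 KiB pages and 8-byte page table entries, as on x86-64, counting only the lowest-level tables. With the 6 bytes per page suggested above, the same calculation gives 12 MiB per process.

```python
def page_table_bytes(mapped_bytes, page_size=4096, pte_size=8):
    """Estimate last-level page table size for a fully mapped region."""
    pages = -(-mapped_bytes // page_size)  # ceiling division
    return pages * pte_size


GIB = 2**30
MIB = 2**20

per_process = page_table_bytes(8 * GIB)  # one process mapping an 8 GiB SGA
total = 500 * per_process                # 500 processes, nothing shared

print(per_process // MIB, "MiB per process")
print(total // MIB, "MiB in total")
```

Either way the total ends up in the same range as the shared memory itself, which is why huge pages matter for this kind of workload.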

Resources