Disk inexplicably filled - Linux

I have two Linux machines which should be near enough identical clones of each other. One of them has 89% usage of /dev/sda1, and the other has 27% usage.
I've tried the rather manual process of running du -h from the root filesystem on each and comparing the two, but there are no substantial discernible differences. Is there any other way to find out where the missing 20GB are?
Thanks!
Problem solved: it was caused by an issue with an unmounted drive :)
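For anyone hitting something similar: files written into a directory while the drive was unmounted live on the parent filesystem and are hidden once the drive is mounted over them. A bind mount can reveal them - a rough sketch, with /mnt as an example target:
sudo mount --bind / /mnt   # re-expose everything on /, including paths hidden under mounts
sudo du -sh /mnt/*         # compare with du -sh /* to spot the hidden space
sudo umount /mnt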

ncdu will display the size of each directory from an ncurses interface. Probably what you're looking for.
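For example, to scan from the root while staying on one filesystem (-x stops ncdu from crossing mount boundaries):
ncdu -x /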

Try looking at the information provided by the command
tune2fs -l /dev/sda1
Do you see any difference in the block size or anything else?

You can also try baobab (Disk Usage Analyzer), a GUI tool which displays disk usage in a clever visualization of nested pie charts.

Linux: where exactly is a file saved when there are multiple logical volumes?

I've mostly worked in Windows environments and am still very noobish in everything Linux, so it's very likely I'm missing basic Linux concepts. That being said, I have questions about logical volumes and their interactions with files:
I have to use an Ubuntu machine (which I did not set up). On this machine, there is a physical volume /dev/sda2 which is in a volume group vg0.
That volume group vg0 has 4 logical volumes: lv1, mounted on /, lv2, mounted on /boot, lv3, mounted on /var, and lv4, mounted on /tmp.
My questions are as follows:
If I save a file (for example foo.txt) in the /var directory, will it be stored on the lv3 (/var) logical volume?
If the lv3 (/var) logical volume is full and I try to save foo.txt in the /var directory, will it be stored on the lv1 (/) logical volume (after all, /var is in /)?
If the lv1 (/) volume is full and I try to save foo.txt somewhere outside of /var (for example in /home), will it be stored on the lv3 (/var) logical volume?
What is the point of having all these logical volumes? Wouldn't one volume on / be much simpler?
It's quite obvious from my questions that I don't really get the relations between logical volumes, mount points and files. Is there a good tutorial somewhere that I could use to educate myself?
Thanks in advance.
Yes; because lv3 is mounted on /var, any files put in /var go there.
No, there are no special cases that happen when the device is full - you just get a device-is-full error. Although /var appears to be a child of /, that has been overridden by mounting lv3 on /var.
No, again because there are no special cases for the device being full. The system doesn't care; it just tries to put the file where it goes.
Yes, it is much simpler to have it all in /. But it can cause problems. For example, /boot is often its own volume so that you can't fill it up and prevent your system from working by downloading a bunch of stuff into your home folder. There are different schools of thought on how much or how little you should separate your filesystem into different volumes. It is somewhat a matter of opinion, but those opinions are based on various use cases and problems.
I don't have a great answer other than: use the search engine of your choice! Honestly, when you are starting out it doesn't matter so much, as long as you have space to put your stuff! If you are a newbie, it might be good to just put everything in one volume - as long as you keep an eye on it and don't let it fill up.
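If you want to see for yourself which volume backs a given path, findmnt and df make the mapping visible - a quick sketch (the paths are examples):
findmnt -T /var/log/syslog   # -T resolves which mount (and thus which logical volume) holds a file
df -h / /var /tmp            # free space on each mounted volume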

IntelliJ IDEA compilation speedup in Linux

I'm working with IntelliJ IDEA on Linux and recently I've got 16 GB of RAM, so are there any ways to speed up my projects' compilation using this memory?
First of all, in order to speed up IntelliJ IDEA itself, you may find this discussion very useful.
The easiest way to speed up compilation is to move the compilation output to a RAM disk.
RAM disk setup
Open fstab
$ sudo gedit /etc/fstab
(instead of gedit you can use vi or whatever you like)
Set up RAM disk mount point
I'm using RAM disks in several places in my system, and one of them is /tmp, so I'll just put my compile output there:
tmpfs /tmp tmpfs defaults 0 0
In this case the filesystem size is not explicitly bounded (tmpfs defaults to half of your RAM), but that's OK - my /tmp size right now is 73MB. If you are afraid that the RAM disk will become too big, you can limit its size, e.g.:
tmpfs /tmp tmpfs defaults,size=512M 0 0
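To apply the new entry without rebooting, mount everything in fstab that isn't mounted yet and verify (this assumes /tmp was not already a separate mount):
sudo mount -a
findmnt /tmp   # should now show a tmpfs mount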
Project setup
In IntelliJ IDEA, open Project Structure (Ctrl+Alt+Shift+S by default), then go to Project - 'Project compiler output' and move it to the RAM disk mount point:
/tmp/projectName/out
(I've added a projectName folder in order to find it easily if I need to get there, or when working with several projects at the same time.)
Then go to Modules, and in all of your modules go to Paths and select 'Inherit project compile output path' or, if you want to use a custom compile output path, modify 'Output path' and 'Test output path' the same way you did for the project compiler output above.
That's all, folks!
P.S. A few numbers: the time of my current project's compilation in different cases (approx.):
HDD: 80s
SSD: 30s
SSD+RAM: 20s
P.P.S. If you use an SSD, besides the compilation speedup you will reduce write operations on your disk, so it will also help your SSD live happily ever after ;)
Yes, you can. There are several ways to do this. First, you can fine-tune the JVM for the amount of memory you have. Take this https://gist.github.com/zafarella/43bc260c3c0cdc34f109 one as an example.
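For instance, you can raise the IDE's heap and code cache in its .vmoptions file (Help | Edit Custom VM Options in recent versions); the values below are illustrative, not recommendations:
# illustrative idea64.vmoptions entries - tune them to your machine
-Xms1g
-Xmx4g
-XX:ReservedCodeCacheSize=512m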
In addition, depending on which Linux distribution you use, there is a way of creating a RAM disk and rsyncing its content to the HDD. Basically you place all logs and temp files (including indexes) in RAM - your IDEA will fly.
Use something like profile-sync-daemon to keep the files synced. It is possible to easily add IDEA as an app. Alternatively you can use anything-sync-daemon.
You need to change "idea.system.path" and "idea.log.path".
More details on IDEA settings can be found in their docs. The idea is to move whatever changes often into RAM.
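A sketch of the relevant idea.properties entries, assuming /tmp is your RAM-backed mount (the location of idea.properties varies by IDEA version, and the paths are examples):
# idea.properties - move frequently-changing directories to a RAM-backed path
idea.system.path=/tmp/idea/system
idea.log.path=/tmp/idea/log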
More RAM disk alternatives: https://wiki.debian.org/SSDOptimization#Persistent_RAMDISK
The downside of this solution is that when you run out of space in RAM, the OS will start paging and everything will slow down.
Hope that helps.
In addition to the RAM disk approach, you might speed up compilation by giving its process more memory (but not too much) and by compiling independent modules in parallel. Both options can be found under Settings | Compiler.

Getting Linux process resource usage (cpu,disk,network)

I want to use /proc to find the resource usage of a particular process every second. The resources include CPU time, disk usage and network usage. I looked at /proc/pid/stat, but I am not sure whether I am getting the required details. I want all 3 resource usages, and I want to monitor them every second.
Some newer kernels have a /proc/<pid_of_process>/io file. This is where the IO stats are.
It is not documented in man proc, but you can try to figure out the numbers yourself.
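A minimal sampling sketch (PID 1234 is an example; reading another user's /proc/<pid>/io requires appropriate permissions):
PID=1234
while sleep 1; do
  grep -E 'read_bytes|write_bytes' /proc/$PID/io   # bytes that actually hit the storage layer
done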
Hope it helps.
Alex.
getrusage() covers CPU, memory, disk (block I/O counts), etc.
man 2 getrusage
I don't know about network.
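If you just want the getrusage() numbers for a whole command, GNU time (the /usr/bin/time binary, not the shell builtin) will print them, for example:
/usr/bin/time -v ls > /dev/null   # look for 'Maximum resident set size' and 'File system inputs/outputs'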
Check out glances.
It's got CPU, disk and network all on one screen. It's not per-process, but it's better than looking at 3 separate tools.
I don't think there is a way to get the disk and network information on a per-process basis.
The best you can get is global disk and network stats, and per-process CPU time.
All documented in man proc
netstat -an
Shows all connections to the server, including the source and destination IPs and ports, if you have the proper permissions.
ac
Prints statistics about users' connect time
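To tie connections to a particular process, netstat can also show the owning PID - a sketch (1234 is an example PID; seeing every process's sockets requires root):
sudo netstat -tunp | grep 1234   # -p adds the PID/program column to each socket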
The best way to approach problems like this is to look up the source code of tools that perform similar monitoring and reporting.
Although there is no guarantee that they are using /proc directly, they will lead you to an efficient way to tackle the problem.
For your case, top(1), iotop(8) and nethogs(8) come to mind.
You can use SAR
-x reports statistics for a given process.
See this for more details:
http://www.linuxcommand.org/man_pages/sar1.html
Example:
sar -u -r -b 1 -X PID | grep -v Average | grep -v Linux
You can use top
SYNOPSIS
top -hv|-bcHiOSs -d delay -n limit -u|U user -p PID -o field -w [columns]
[screen capture of top running in a terminal]

How do I measure net used disk space change due to activity by a given process in Linux?

I'd like to monitor disk space requirements of a running process. Ideally, I want to be able to point to a process and find out the net change in used disk space attributable to it. Is there an easy way of doing this in Linux? (I'm pretty sure it would be feasible, though maybe not very easy, to do this in Solaris with DTrace)
Probably you'll have to ptrace it (or get strace to do it for you and parse the output), and then try to work out what disc space is being used.
This is nontrivial, as your tracing process will need to understand which file operations use disc space - and be free of race conditions. However, you might be able to do an approximation.
Quite a lot of things affect how much disc space is used, because most Linux filesystems support "holes" - a file's apparent size need not match the space it actually occupies. I suppose you could count holes as well for accounting purposes.
Another problem is knowing what filesystem operations free up disc space - for example, opening a file for writing may, in some cases, truncate it. This clearly frees up space. Likewise, renaming a file can free up space if it's renamed over an existing file.
Another issue is processes which invoke helper processes to do stuff - for example if myprog does a system("rm -rf somedir").
Also it's somewhat difficult to know when a file has been completely deleted, as it might be deleted from the filesystem but still open by another process.
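A rough sketch of the approximation (1234 is an example PID; per the caveats above it will miss holes, files still held open elsewhere, and similar cases):
sudo strace -f -e trace=openat,write,truncate,ftruncate,unlink,unlinkat,rename -p 1234
# -f follows forked helper processes; sum the space each call allocates or frees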
Happy hacking :)
If you know the PID of the process to monitor, you'll find plenty of information about it in /proc/<PID>.
The file /proc/<PID>/io contains statistics about bytes read and written by the process; it should be what you are seeking.
Moreover, in /proc/<PID>/fd/ you'll find links to all the files opened by your process, so you could monitor them.
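For example (1234 is an example PID):
ls -l /proc/1234/fd   # each symlink points at a file the process has open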
There is DTrace for Linux available:
http://librenix.com/?inode=13584
Ashitosh

How to find or calculate a Linux process's page table size and other kernel accounting?

How can I find out how big a Linux process's page table is, along with any other variable-size process accounting?
If you are really interested in the page tables, do a
$ cat /proc/meminfo | grep PageTables
PageTables: 24496 kB
Since Linux 2.6.10, the amount of memory used by a single process' page tables has been exposed via the VmPTE field of /proc/<pid>/status.
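For example (1234 is an example PID, and the output value is illustrative):
grep VmPTE /proc/1234/status
# VmPTE:       184 kB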
Not sure about Linux, but most UNIX variants provide sysctl(3) for this purpose. There is also the sysctl(8) command line utility.
Hmmm, back in Ye Olden Tymes, we used to call nlist(3) to get the system address for the data we were interested in, then open /dev/kmem, seek to the address, then read the data. Not sure if this works in Linux, but it might be worth typing "man 3 nlist" and seeing what comes back.
You should describe your problem, not ask about details. If you fork too much (especially from a process which has a large address space) there are all kinds of things which can go wrong (including running out of memory); hitting a page table maximum size is IMHO not a realistic problem.
That said, I would also be interested in reading a process's page table share on Linux.
As a simple rule of thumb you can, however, assume that each process occupies a share of the page tables which is proportional to its virtual size, for example about 6 bytes for each mapped page. So for example if you have an Oracle Database with an 8GB SGA and 500 processes sharing it, each of the processes will use 14MB of page tables, which results in 7GB of page tables + 8GB SGA. (Sample numbers from http://kevinclosson.wordpress.com/2009/07/25/little-things-doth-crabby-make-%E2%80%93-part-ix-sometimes-you-have-to-really-really-want-your-hugepages/)
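As a quick check of that rule of thumb (figures from the answer above: 8 GiB mapped in 4 KiB pages, about 6 bytes of page table per page):
echo $(( 8 * 1024 * 1024 / 4 * 6 / 1024 ))   # 12288 KiB, i.e. about 12MB per process - the same ballpark as the 14MB quoted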
