Compare filesystem space usage over time - linux

Is there a good, graphical way to represent disk usage changes in a linux/unix filesystem over time?
Let me elaborate: there are several good ways to represent disk usage in a filesystem. I'm not interested in summary statistics such as total space used (as given by du(1)), but in more advanced interactive/visualization tools such as ncdu, gdmap, filelight or baobab, which can give me an idea of where the space is being used.
From a technical perspective, I think the best approach is squarified treemaps (as available in gdmap), since they make better use of the available visual space. The circular approach used by filelight, for instance, cannot represent huge hierarchies efficiently, and it's unclear how a human reader should account for the increasing area of the outer rings. Looks nice, but that's about it.
Treemaps are perfect to have the current snapshot of disk usage in the filesystem, but I'd like to have something similar to see how disk usage has been evolving over time.
My current solution is very simple: I'm dumping the filesystem usage state using "ncdu -o" over time, and then I compare them side-by-side using two ncdu instances. It's very inefficient, but does the job. I'd like something more visual though.
All the relevant information can be dumped using:
find [dir] -printf "%P\t%s\n"
I did a crappy hack to load this state information in gdmap, so I can use two gdmap instances instead. Still not optimal though, as a treemap will fit the total allocated space into the same rectangle. As such, you cannot really tell if the same area is equivalent to more or less space. If two big directories grow proportionately, they will not change the visualization.
I need something better than that. Obviously, I cannot plot the cumulative directory sizes in a simple line plot, as I would have too many directories.
I'd like something similar to a treemap, where maybe the color of each square represents the size increase/decrease using some colormap. However, since a treemap shows individual files as opposed to directories, it's not obvious how to color-map a directory whose allocated space has been growing/shrinking due to new/removed files.
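One way to get a per-directory value to feed into such a colormap is to aggregate two of the find dumps shown above and subtract them. The following is only a minimal sketch: it assumes each dump contains one "path<TAB>size" line per file (for instance by adding -type f to the find command) and that no file name contains a newline. The function names dir_totals and dir_deltas are made up for illustration.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def dir_totals(dump_text):
    """Sum each file's size into all of its ancestor directories.

    dump_text is the output of: find DIR -type f -printf "%P\t%s\n"
    The top-level directory is reported as ".".
    """
    totals = defaultdict(int)
    for line in dump_text.splitlines():
        if not line.strip():
            continue
        # Split on the last tab so that tabs inside file names survive.
        path, _, size = line.rpartition("\t")
        if path == "":
            # Entry for the starting directory itself (empty %P).
            totals["."] += int(size)
            continue
        for parent in PurePosixPath(path).parents:
            totals[str(parent)] += int(size)
    return dict(totals)


def dir_deltas(old_dump, new_dump):
    """Return {directory: size change in bytes} between two dumps."""
    old, new = dir_totals(old_dump), dir_totals(new_dump)
    return {d: new.get(d, 0) - old.get(d, 0) for d in old.keys() | new.keys()}
```

A directory that disappeared shows up with a negative delta equal to its old size, and a new directory with a positive delta equal to its new size, so the sign and magnitude can be mapped directly to a diverging colormap.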
What kind of visualization techniques could be used to see the evolution of allocated space over time, which take the whole underlying tree into account?
To elaborate even more, in a squarified treemap the whole allocated space is divided proportionally by file size, and each directory logically clusters the allocated space within it. As such, we don't "see" directories; we see the proportional space taken by their content.
How could we extend and/or improve the visualization in order to see how the allocated space has moved to a different area of the treemap?

You can use Cacti for this.
You need to install an SNMP daemon on your machine and install Cacti (free software), either locally or on another PC, to monitor your Linux machine.
http://blog.securactive.net/wp-content/uploads/2012/12/cacti_performance_vision1.png
You can monitor network interfaces, space usage on any partition, and many other parameters of your Linux OS.
apt-get install cacti
vim /etc/snmp/snmpd.conf
Add this at around line 42:
view systemonly included .1.3.6.1
Save the file and restart the snmpd daemon.
Then go to the Cacti configuration and try to discover your Linux machine.

Related

Is there a way to dynamically determine the vhdSize flag?

I am using the MSIX manager tool to convert a *.msix (an application installer) to a *.vhdx so that it can be mounted in an Azure virtual machine. One of the flags that the tool requires is -vhdSize, which is in megabytes. This has proven to be problematic because I have to guess what the size should be based on the MSIX. I have run into numerous creation errors because the vhdSize was too small.
I could set it to an arbitrarily high value in order to get around these failures, but that is not ideal. Alternatively, guessing the correct size is an imprecise science and a chore to do repeatedly.
Is there a way to have the tool dynamically set the vhdSize, or am I stuck guessing a value that is both large enough to accommodate the file, but not too large as to waste disk space? Or, is there a better way to create a *.vhdx file?
https://techcommunity.microsoft.com/t5/windows-virtual-desktop/simplify-msix-image-creation-with-the-msixmgr-tool/m-p/2118585
There is an app called MSIX Hero that can select a size for you. It automatically checks how big the uncompressed files are, adds an extra buffer for safety (currently double the original size), and rounds the result up to the next 10 MB. Reference: https://msixhero.net/documentation/creating-vhd-for-msix-app-attach/
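If you would rather compute a value for -vhdSize yourself, the same rule of thumb is easy to script. This is only a sketch of the rule as described above (double the uncompressed size, round up to the next 10 MB), not MSIX Hero's actual code. It assumes the .msix file can be read as a ZIP archive by Python's zipfile module, and the function names and the 10 MB minimum are my own choices.

```python
import zipfile

MB = 1024 * 1024


def uncompressed_size(msix_path):
    """Total uncompressed size in bytes of all entries in the package.

    Assumes the .msix file is readable as a ZIP archive.
    """
    with zipfile.ZipFile(msix_path) as package:
        return sum(info.file_size for info in package.infolist())


def suggest_vhd_size_mb(uncompressed_bytes, factor=2, step_mb=10):
    """Multiply the payload by a safety factor, then round up to a multiple of step_mb."""
    padded = uncompressed_bytes * factor
    step = step_mb * MB
    steps = -(-padded // step)  # integer ceiling division
    return max(1, steps) * step_mb
```

For example, suggest_vhd_size_mb(uncompressed_size("app.msix")) gives a number that can be passed to -vhdSize, where app.msix is a placeholder file name.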

about managing file system space

Space Issues in a filesystem on Linux
Let's call it FILESYSTEM1.
Normally, FILESYSTEM1 is only about 40-50% used.
However, when clients run certain reports or queries, these produce massive files of about 4-5 GB each, which instantly fill up FILESYSTEM1.
We have some cleanup scripts in place but they never catch this because it happens in a matter of minutes and the cleanup scripts usually clean data that is more than 5-7 days old.
Another set of scripts is also in place; these report when free space in a filesystem drops below a certain threshold.
We thought of a possible solution to detect and act on this proactively:
Increase the FILESYSTEM1 file system to double its size.
Set the threshold in the alert scripts for this filesystem to alert when it is 50% full.
This will hopefully give us enough time to catch this and act before the client reports issues due to FILESYSTEM1 being full.
Even though this solution works, it does not seem to be the best way to deal with the situation.
Any suggestions / comments / solutions are welcome.
thanks
It sounds like what you've found is that simple threshold-based monitoring doesn't work well for the usage patterns you're dealing with. I'd suggest something that pairs high-frequency sampling (say, once a minute) with a monitoring tool that can do some kind of regression on your data to predict when space will run out.
In addition to knowing when you've already run out of space, you also need to know whether you're about to run out of space. Several tools can do this, or you can write your own. One existing tool is Zabbix, which has predictive trigger functions that can be used to alert when file system usage seems likely to cross a threshold within a certain period of time. This may be useful in reacting to rapid changes that, left unchecked, would fill the file system.
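If you decide to write your own, the core of the idea fits in a few lines. The sketch below fits an ordinary least-squares line through recent (timestamp, bytes used) samples and extrapolates it to the filesystem capacity. It is an illustration of the regression approach, not a reproduction of what Zabbix does internally, and the function name and sample format are assumptions.

```python
def seconds_until_full(samples, capacity_bytes):
    """Estimate seconds from the last sample until usage reaches capacity.

    samples: list of (unix_time_seconds, used_bytes), oldest first.
    Returns None if there is too little data or usage is not growing.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None
    # Least-squares slope (bytes per second) and intercept.
    slope = sum((t - mean_t) * (u - mean_u) for t, u in samples) / var_t
    if slope <= 0:
        return None
    intercept = mean_u - slope * mean_t
    t_full = (capacity_bytes - intercept) / slope
    return max(0.0, t_full - samples[-1][0])
```

Samples could be collected once a minute with shutil.disk_usage(path), which returns total, used and free bytes. Alert when the estimate drops below the time you need to react, and keep the sample window short (a few minutes) so that a sudden 4-5 GB report dominates the fit.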

Check memory usage in haskell

I'm creating a program which implements some kind of cache. I need to use as much memory as possible and to do that I need to do two things:
Check how much memory is still available in system (RAM only, not SWAP)
Check how much memory my app is already using.
I need a platform independent solution (Linux, Windows, etc.).
Using these two pieces of information I will reduce the size of cache or enlarge it.
How can I get this information in Haskell? Are there any packages that can provide that information?
I can't immediately see how to do this portably.
However, GHC does have "weak pointers". (See System.Mem.Weak.) If you create items and hang on to them via weak pointers (only), then the garbage collector will automatically start deleting items if you run low on physical memory.
(Unfortunately, this doesn't give you the ability to decide which items to delete first — e.g., the ones that are cheapest to recreate or the ones that have been least-used or something.)

How does chroot affect dynamic libraries memory use?

Although there is another question with similar topic, it does not cover the memory use by the shared libraries in chrooted jails.
Let's say we have a few similar chroots. To be more specific, they contain exactly the same sets of binary files and shared libraries, which are actually hard links to the master copies to conserve disk space (to prevent the files from being altered, the file system is mounted read-only).
How is the memory use affected in such a setup?
As described in the chroot system call:
This call changes an ingredient in the pathname resolution process and does nothing else.
So, the shared library will be loaded in the same way as if it were outside the chroot jail (sharing read-only pages, duplicating data, etc.).
http://man7.org/linux/man-pages/man2/chroot.2.html
Because hardlinks share the same underlying inode, the kernel treats them as the same item when it comes to caching/mapping.
You'll see filesystem cache savings by using hardlinks, as well as disk-space savings.
The biggest issue I'd have with this is that if someone manages to subvert the read-only nature of one of the chroot environments, then they could subvert all of them by making modifications to any of the hardlinked files.
When I set this up, I copied the shared libraries per chroot instead of linking to a read-only mount. With separate files, the text segments were not shared. It's likely that the same inode will map to the same read-only text segment, but this may vary with available memory management hardware and similar architectural details.
Try this experiment on your system: write a small program that makes some minimal use of a large shared library. Run twenty or thirty chroot jails as you describe, each with a running copy of the program. Check overall memory usage before & during running, and dissect one instance to get a good text/data segment breakdown. If memory use increases by the full size of the map for each instance, the segments are not shared. Conversely, if memory use goes up by a fraction of the map, the segments are shared.
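For the "dissect one instance" step, one option on Linux is to read /proc/<PID>/smaps for one of the running copies and add up the shared and private page counters for the library's mappings. The helper below is a rough sketch that only parses the text. The function name is made up, and it assumes the usual smaps layout of one mapping header line followed by "Key: value kB" lines.

```python
import re

# A mapping header looks like: "7f00-7f10 r-xp 00000000 08:01 123 /lib/libfoo.so"
_HEADER = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s")

FIELDS = ("Rss", "Shared_Clean", "Shared_Dirty", "Private_Clean", "Private_Dirty")


def smaps_summary(smaps_text, name_fragment):
    """Sum selected counters (in kB) over mappings whose path contains name_fragment."""
    totals = dict.fromkeys(FIELDS, 0)
    matching = False
    for line in smaps_text.splitlines():
        if _HEADER.match(line):
            parts = line.split(None, 5)
            path = parts[5] if len(parts) > 5 else ""
            matching = name_fragment in path
        elif matching:
            key, _, rest = line.partition(":")
            if key in totals:
                totals[key] += int(rest.split()[0])
    return totals
```

Call it with the contents of /proc/<PID>/smaps and part of the library's file name. If the library's text shows up mostly under Shared_Clean while several jails are running, the pages are being shared; if it shows up under Private_Clean in every instance, they are not.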

How do I measure net used disk space change due to activity by a given process in Linux?

I'd like to monitor disk space requirements of a running process. Ideally, I want to be able to point to a process and find out the net change in used disk space attributable to it. Is there an easy way of doing this in Linux? (I'm pretty sure it would be feasible, though maybe not very easy, to do this in Solaris with DTrace)
Probably you'll have to ptrace it (or get strace to do it for you and parse the output), and then try to work out what disc is being used.
This is nontrivial, as your tracing process will need to understand which file operations use disc space - and be free of race conditions. However, you might be able to do an approximation.
Quite a lot of things can use up disc space, because most Linux filesystems support "holes". I suppose you could count holes as well for accounting purposes.
Another problem is knowing what filesystem operations free up disc space - for example, opening a file for writing may, in some cases, truncate it. This clearly frees up space. Likewise, renaming a file can free up space if it's renamed over an existing file.
Another issue is processes which invoke helper processes to do stuff - for example if myprog does a system("rm -rf somedir").
Also it's somewhat difficult to know when a file has been completely deleted, as it might be deleted from the filesystem but still open by another process.
Happy hacking :)
If you know the PID of the process to monitor, you'll find plenty of information about it in /proc/<PID>.
The file /proc/<PID>/io contains statistics about the bytes read and written by the process, which should be what you are looking for.
Moreover, in /proc/<PID>/fd/ you'll find links to all the files opened by your process, so you could monitor them.
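As a small illustration, the counters in /proc/<PID>/io can be read into a dictionary and sampled twice to see how much the process wrote in between. This is only a sketch with made-up function names, and it is Linux-only. Note that write_bytes counts bytes the process caused to be sent to storage, so it approximates write activity rather than the net change in used disk space (deletions and truncations are not subtracted).

```python
def parse_proc_io(text):
    """Parse the 'key: value' lines of /proc/<PID>/io into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            stats[key.strip()] = int(value)
    return stats


def read_proc_io(pid):
    """Read and parse /proc/<pid>/io (Linux only; needs permission to read it)."""
    with open(f"/proc/{pid}/io") as f:
        return parse_proc_io(f.read())


def written_between(before, after):
    """Bytes sent to storage between two snapshots, minus cancelled writes."""
    wrote = after["write_bytes"] - before["write_bytes"]
    cancelled = after["cancelled_write_bytes"] - before["cancelled_write_bytes"]
    return wrote - cancelled
```

Take one snapshot with read_proc_io(pid), wait, take another, and pass both to written_between.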
DTrace is also available for Linux:
http://librenix.com/?inode=13584
Ashitosh

Resources