I have a use case where I get bursts of allocations in the range of 5-6 GB, specifically when Visual Studio Code compiles my D project while I'm typing. (The compiler doesn't release memory at all, in order to be as fast as possible.)
DMD does memory allocation in a bit of a sneaky way. Since compilers are short-lived programs, and speed is of the essence, DMD just mallocs away, and never frees. This eliminates the scaffolding and complexity of figuring out who owns the memory and when it should be released. (It has the downside of consuming all the resources of your machine if the module being compiled is big enough.)
(source)
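For a sense of what that never-free strategy looks like, here is a minimal bump-allocator sketch in C. It is only an illustration of the pattern described in the quote, not DMD's actual allocator:

#include <stdlib.h>
#include <stddef.h>

/* Hypothetical "never free" bump allocator: grab large chunks from malloc,
 * hand out slices, and never release anything until the process exits. */
#define CHUNK_SIZE (16u * 1024 * 1024)   /* 16 MB per chunk */

static char  *chunk;   /* current chunk */
static size_t used;    /* bytes already handed out from the current chunk */

void *bump_alloc(size_t n)
{
    n = (n + 15) & ~(size_t)15;     /* keep allocations 16-byte aligned */
    if (n > CHUNK_SIZE)
        return malloc(n);           /* oversized requests go straight to malloc */
    if (!chunk || used + n > CHUNK_SIZE) {
        chunk = malloc(CHUNK_SIZE); /* previous chunks are deliberately leaked */
        if (!chunk)
            return NULL;
        used = 0;
    }
    void *p = chunk + used;
    used += n;
    return p;
}

Each allocation is just a pointer bump, which is why it is so fast, and also why memory use only ever grows until the process exits.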
The machine is a Dell XPS 13 running Manjaro 64-bit with 16 GB of memory, and I'm hitting that ceiling. The system seizes up completely, REISUB may or may not work, etc. I can leave it for an hour and it's still hung, not slowly resolving itself. The times I've been able to get to a tty, dmesg has had all kinds of jovial messages. So I thought I'd enable a big swap partition to relieve the pressure, but it isn't helping.
I realise that swap won't be used until it's needed, but by then it's too late. Even with the swap, when I run out of memory everything segfaults: Qt, zsh, fuse-ntfs, Xorg. At that point it typically reports about 70 MB of swap in use.
vm.swappiness is at 100. swapon reports the swap as being active, automatically enabled by systemd.
NAME TYPE SIZE USED PRIO
/dev/nvme0n1p8 partition 17.6G 0B -2
What can I do to make it swap more?
Try this. Remember that this kind of question belongs on Super User or Server Fault; Stack Overflow is only for programming questions.
https://askubuntu.com/questions/371302/make-my-ubuntu-use-more-swap-than-ram
Related
Background:
I was trying to set up an Ubuntu machine on my desktop computer. The whole process took a full day, including installing the OS and software. I didn't think much of it, though.
Then I tried doing my work on the new machine, and it was significantly slower than my laptop, which was very strange.
I ran iotop and found that disk traffic while decompressing a package was around 1-2 MB/s, which is definitely abnormal.
Then, after hours of research, I found this article describing exactly the same problem and providing an ugly solution:
We recently had a major performance issue on some systems, where disk write speed is extremely slow (~1 MB/s — where normal performance
is 150+MB/s).
...
EDIT: to solve this, either remove enough RAM, or add “mem=8G” as kernel boot parameter (e.g. in /etc/default/grub on Ubuntu — don’t
forget to run update-grub !)
I also looked at this post
https://lonesysadmin.net/2013/12/22/better-linux-disk-caching-performance-vm-dirty_ratio/
and did
cat /proc/vmstat | egrep "dirty|writeback"
output is:
nr_dirty 10
nr_writeback 0
nr_writeback_temp 0
nr_dirty_threshold 0 // and here
nr_dirty_background_threshold 0 // here
Those values were 8223 and 4111 when mem=8G was set.
So it's basically showing that when system memory is greater than 8 GB (32 GB in my case), regardless of the vm.dirty_background_ratio and vm.dirty_ratio settings (5% and 10% in my case), the actual dirty thresholds drop to 0 and the write buffer is effectively disabled?
Why is this happening?
Is this a bug in the kernel or somewhere else?
Is there a solution other than unplugging RAM or using "mem=8g"?
UPDATE: I'm running the 3.13.0-53-generic kernel with Ubuntu 12.04 32-bit, so it's possible that this only happens on 32-bit systems.
If you use a 32 bit kernel with more than 2G of RAM, you are running in a sub-optimal configuration where significant tradeoffs must be made. This is because in these configurations, the kernel can no longer map all of physical memory at once.
As the amount of physical memory increases beyond this point, the tradeoffs become worse and worse, because the struct page array that is used to manage all physical memory must be kept mapped at all times, and that array grows with physical memory.
The physical memory that isn't directly mapped by the kernel is called "highmem", and by default the writeback code treats highmem as undirtyable. This is what results in your zero values for the dirty thresholds.
You can change this by setting /proc/sys/vm/highmem_is_dirtyable to 1, but with that much memory you will be far better off if you install a 64-bit kernel instead.
Is this a bug in the kernel
According to the article you quoted, this is a bug, which did not exist in earlier kernels, and is fixed in more recent kernels.
Note that this issue seems to be fixed in later releases (3.5.0+) and is a regression (doesn’t happen on e.g. 2.6.32)
Here is my system, based on Linux 2.6.32.12:
1. It contains 20 processes that use a lot of user CPU.
2. It needs to write data to disk at a rate of 100 MB/s, and that data will not be reused soon.
What I expect:
The system runs steadily, and disk I/O does not affect my processes.
My problem:
At the beginning, the system ran as I expected. But as time passed, Linux cached more and more data for the disk I/O, which reduced the available physical memory. Eventually there was not enough free memory, and Linux started swapping my processes in and out, causing an I/O problem where a lot of CPU time was spent on I/O.
What I have tried:
I tried to solve the problem by calling fsync every time I write a large block, but physical memory still keeps decreasing while the cache keeps growing.
How can I stop the page cache here? It's useless to me.
More information:
When top shows 46963m free, all is well: CPU %wa is low and vmstat shows no si or so.
When top shows 273m free, %wa is so high that it affects my processes, and vmstat shows a lot of si and so.
I'm not sure that changing something will affect overall performance.
You might use posix_fadvise(2) and sync_file_range(2) in your program (and more rarely fsync(2), fdatasync(2), sync(2), syncfs(2), ...). Also look at madvise(2), mlock(2) and munlock(2), and of course mmap(2) and munmap(2). Perhaps ionice(1) could help.
In the reader process, you might use readahead(2) (perhaps in a separate thread).
Upgrading your kernel (to a 3.6 or better) could certainly help: Linux has improved significantly on these points since 2.6.32 which is really old.
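To make the write-side advice concrete, here is a hedged sketch (the file name and chunk sizes are arbitrary, not taken from your program) of a writer that periodically pushes its dirty pages out with sync_file_range(2) and then drops them from the page cache with posix_fadvise(2):

#define _GNU_SOURCE          /* for sync_file_range() */
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    int fd = open("/tmp/stream.out", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    char buf[1 << 20];                  /* 1 MB chunk of dummy data */
    memset(buf, 'x', sizeof buf);
    off_t done = 0;

    for (int i = 0; i < 256; i++) {     /* write 256 MB in total */
        if (write(fd, buf, sizeof buf) != (ssize_t)sizeof buf) {
            perror("write"); return 1;
        }
        done += sizeof buf;

        if (done % (16 << 20) == 0) {   /* every 16 MB written ... */
            /* start writeback of everything written so far and wait for it */
            sync_file_range(fd, 0, done,
                            SYNC_FILE_RANGE_WRITE | SYNC_FILE_RANGE_WAIT_AFTER);
            /* then tell the kernel we won't read it back, so those pages
             * can be dropped from the page cache immediately */
            posix_fadvise(fd, 0, done, POSIX_FADV_DONTNEED);
        }
    }
    close(fd);
    return 0;
}

This way the data you will never reuse does not accumulate as dirty pages, so there is much less pressure to evict your processes' memory or start swapping.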
To drop pagecache you can do the following:
"echo 1 > /proc/sys/vm/drop_caches"
drop_caches is normally 0 and can be changed as needed. Since you've identified that you need to free the pagecache, this is how to do it. You can also take a look at dirty_writeback_centisecs (and its related tunables) (http://lxr.linux.no/linux+*/Documentation/sysctl/vm.txt#L129) to make writeback happen sooner, but note that it might have consequences, as it wakes the kernel flusher threads to write out dirty pages. Also note dirty_expire_centisecs, which defines how old dirty data must be before it becomes eligible for writeout.
Basic situation:
I am copying some NTFS disks in openSUSE. Each one is 2 TB. When I do this, the system runs slow.
My guesses:
I believe it is likely due to caching. Linux decides to discard useful caches (for example, KDE 4 bloat, virtual machine disks, LibreOffice binaries, Thunderbird binaries, etc.) and instead fill all available memory (24 GB total) with stuff from the copying disks, which will be read only once, then written and never used again. So then any time I use these applications (or KDE 4), the disk needs to be read again, and reading the bloat off the disk again makes things freeze/hiccup.
Due to the cache being gone and the fact that these bloated applications need lots of cache, this makes the system horribly slow.
Since it is USB, the disk and disk controller are not the bottleneck, so using ionice does not make it faster.
I believe it is the cache rather than just the motherboard going too slow, because if I stop everything copying, it still runs choppy for a while until it recaches everything.
And if I restart the copying, it takes a minute before it is choppy again. But also, I can limit it to around 40 MB/s, and it runs faster again (not because it has the right things cached, but because the motherboard busses have lots of extra bandwidth for the system disks). I can fully accept a performance loss from my motherboard's I/O capability being completely consumed (which is 100% used, meaning 0% wasted power which makes me happy), but I can't accept that this caching mechanism performs so terribly in this specific use case.
# free
total used free shared buffers cached
Mem: 24731556 24531876 199680 0 8834056 12998916
-/+ buffers/cache: 2698904 22032652
Swap: 4194300 24764 4169536
I also tried the same thing on Ubuntu, which causes a total system hang instead. ;)
And to clarify, I am not asking how to leave memory free for the "system", but for "cache". I know that cache memory is automatically given back to the system when needed, but my problem is that it is not reserved for caching of specific things.
Is there some way to tell these copy operations to limit memory usage so some important things remain cached, and therefore any slowdowns are a result of normal disk usage and not rereading the same commonly used files? For example, is there a setting of max memory per process/user/file system allowed to be used as cache/buffers?
The nocache command is the general answer to this problem! It is also in Debian and Ubuntu 13.10 (Saucy Salamander).
Thanks, Peter, for alerting us to the "--drop-cache" option in rsync. But that was rejected upstream (Bug 9560 – drop-cache option) in favor of a more general solution: the new "nocache" command, based on the rsync work with fadvise.
You just prepend "nocache" to any command you want. It also has nice utilities for describing and modifying the cache status of files. For example, here are the effects with and without nocache:
$ ./cachestats ~/file.mp3
pages in cache: 154/1945 (7.9%) [filesize=7776.2K, pagesize=4K]
$ ./nocache cp ~/file.mp3 /tmp
$ ./cachestats ~/file.mp3
pages in cache: 154/1945 (7.9%) [filesize=7776.2K, pagesize=4K]
$ cp ~/file.mp3 /tmp
$ ./cachestats ~/file.mp3
pages in cache: 1945/1945 (100.0%) [filesize=7776.2K, pagesize=4K]
So hopefully that will work for other backup programs (rsnapshot, duplicity, rdiff-backup, amanda, s3sync, s3ql, tar, etc.) and other commands that you don't want trashing your cache.
Kristof Provost was very close, but in my situation, I didn't want to use dd or write my own software, so the solution was to use the "--drop-cache" option in rsync.
I have used this many times since creating this question, and it seems to fix the problem completely. One exception was when using rsync to copy from a FreeBSD machine, which doesn't support "--drop-cache". So I wrote a wrapper to replace the /usr/local/bin/rsync command and remove that option, and now it works when copying from there too.
It still uses a huge amount of memory for buffers and seems to keep almost no cache, but it works smoothly anyway.
$ free
total used free shared buffers cached
Mem: 24731544 24531576 199968 0 15349680 850624
-/+ buffers/cache: 8331272 16400272
Swap: 4194300 602648 3591652
You have practically two choices:
Limit the maximum disk buffer size: the problem you're seeing is probably caused by the default kernel configuration, which allows a huge chunk of RAM to be used for disk buffering; when you try to write lots of stuff to a really slow device, you end up with much of your precious RAM tied up as disk cache for that slow device.
The kernel does this because it assumes that your processes can keep doing useful work while they are not being slowed down by the slow device, and that the RAM can be freed automatically when needed simply by writing the pages out to storage (the slow USB stick), but the kernel doesn't consider the actual write performance of that device. The quick fix:
# Wake up background writing process if there's more than 50 MB of dirty memory
echo 50000000 > /proc/sys/vm/dirty_background_bytes
# Limit background dirty bytes to 200 MB (source: http://serverfault.com/questions/126413/limit-linux-background-flush-dirty-pages)
echo 200000000 > /proc/sys/vm/dirty_bytes
Adjust the numbers to match the RAM you're willing to spend on the disk write cache. A sensible value depends on your actual write performance, not on the amount of RAM you have. You should aim to have just barely enough RAM for caching to allow full write performance for your devices. Note that this is a global setting, so you have to set it according to the slowest device you're using.
Reserve a minimum memory size for each task you want to keep going fast. In practice this means creating cgroups for stuff you care about and defining the minimum memory you want to have for any such group. That way, the kernel can use the remaining memory as it sees fit. For details, see this presentation: SREcon19 Asia/Pacific - Linux Memory Management at Scale: Under the Hood
Update year 2022:
You can also try creating new file /etc/udev/rules.d/90-set-default-bdi-max_ratio-and-min_ratio.rules with the following contents:
# For every BDI device, set max cache usage to 30% and min reserved cache to 2% of the whole cache
# https://unix.stackexchange.com/a/481356/20336
ACTION=="add|change", SUBSYSTEM=="bdi", ATTR{max_ratio}="30", ATTR{min_ratio}="2"
The idea is to put a per-device limit on maximum cache utilization. With the above limit (30%) you can have two totally stalled devices and still have 40% of the disk cache available for the rest of the system. If you have 4 or more stalled devices in parallel, even this workaround cannot help on its own. That's why I have also added a minimum cache space of 2% for every device, but I don't know how to check whether that part is actually effective. I've been running with this config for about half a year, and I think it's working nicely.
See https://unix.stackexchange.com/a/481356/20336 for details.
The kernel cannot know that you won't use the cached data from the copy again. That is your information advantage.
But you could set the swappiness to 0: sudo sysctl vm.swappiness=0. This makes Linux drop the cache before libraries, etc. are written to swap.
It works nicely for me too, and is especially effective in combination with a large amount of RAM (16-32 GB).
It's not possible if you're using plain old cp, but if you're willing to reimplement or patch it yourself, setting posix_fadvise(fd, 0, 0, POSIX_FADV_NOREUSE) on both input and output file will probably help.
posix_fadvise() tells the kernel about your intended access pattern. In this case, you'd only use the data once, so there isn't any point in caching it.
The Linux kernel honours these flags, so it shouldn't be caching the data any more.
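A minimal sketch of that idea (placeholder file names; the copy loop itself is omitted, and how much effect POSIX_FADV_NOREUSE actually has depends on the kernel version):

#define _POSIX_C_SOURCE 200112L   /* for posix_fadvise() */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int in  = open("source.file", O_RDONLY);
    int out = open("dest.file", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (in < 0 || out < 0) { perror("open"); return 1; }

    /* Tell the kernel the data will be accessed only once, so it need not
     * keep it cached (length 0 means "to the end of the file"). */
    posix_fadvise(in,  0, 0, POSIX_FADV_NOREUSE);
    posix_fadvise(out, 0, 0, POSIX_FADV_NOREUSE);

    /* ... ordinary read()/write() copy loop would go here ... */

    close(in);
    close(out);
    return 0;
}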
Try using dd instead of cp.
Or mount the filesystem with the sync flag.
I'm not completely sure whether these methods bypass the swap, but they may be worth a try.
I am copying some NTFS disks [...] the system runs slow. [...]
Since it is USB [...]
The slowdown is a known memory management issue.
Use a newer Linux Kernel. The older ones have a problem with USB data and "Transparent Huge Pages". See this LWN article. Very recently this issue was addressed - see "Memory Management" in LinuxChanges.
OK, now that I know you're using rsync, I can dig a bit more:
It seems that rsync is inefficient when used with tons of files at the same time. There's an entry in their FAQ, and it's not a Linux/cache problem; it's an rsync problem, eating too much RAM.
Googling around, someone recommended splitting the sync into multiple rsync invocations.
I'm trying to track down a segfault problem in some old C code (not written by me). The segfaults occur only if the addresses of certain variables in that code exceed the 32-bit integer limit. (So I've got a pretty good idea of what's going wrong, but I don't know where.)
So, my question is: is there any way to force Linux to allocate memory for a process in the high address space? At the moment it's pretty much down to chance whether the segfaults happen, which makes debugging a bit difficult.
I'm running Ubuntu 10.04, Kernel 2.6.31-23-generic on a Dell inspiron 1525 laptop with 2GB ram, if that's any help.
Thanks in advance,
Martin.
You can allocate an anonymous block of memory with the mmap() system call, to which you can pass the address where you want the block to be mapped.
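For illustration, assuming a 64-bit build, something along these lines maps an anonymous block at a hint address above the 4 GiB boundary. The hint address and size are arbitrary, and without MAP_FIXED the kernel may still place the mapping elsewhere, so check the returned address:

#define _GNU_SOURCE        /* for MAP_ANONYMOUS */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
    void *hint = (void *)0x200000000ULL;   /* 8 GiB, well above 4 GiB */
    size_t len = 1 << 20;                  /* 1 MiB */

    void *p = mmap(hint, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    printf("mapped at %p\n", p);           /* verify it really is that high */
    memset(p, 0, len);                     /* use it like malloc'd memory */

    munmap(p, len);
    return 0;
}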
I would turn on the -Wpointer-to-int-cast and -Wint-to-pointer-cast warning options and check out any warnings they turn up (I believe these are included in -Wall on 64-bit targets). The cause is very likely something related to this, and simply auditing the warnings the compiler turns up may be a better approach than using a debugger.
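For illustration, this made-up fragment (not from the code in question) shows the kind of bug those warnings catch; the round-tripped pointer only stays valid while the allocation happens to land low in the address space:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char *p = malloc(64);

    /* On a 64-bit target int is 32 bits, so the upper half of the pointer
     * is silently dropped here; gcc flags it with -Wpointer-to-int-cast. */
    int addr = (int)p;

    /* Turning the int back into a pointer (-Wint-to-pointer-cast) only
     * reproduces the original address while the allocation happens to sit
     * low in the address space: exactly the "sometimes it segfaults"
     * pattern described in the question. */
    char *q = (char *)addr;

    printf("original %p, round-tripped %p\n", (void *)p, (void *)q);
    free(p);
    return 0;
}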
Several people around here recommended switching to the new WD Velociraptor 10,000 rpm hard disk. Magazine articles also praise its performance.
I bought one and mirrored my old system to it. The resulting increase in compilation-speed is somewhat disappointing:
On my old Samsung drive (SATA, 7200), the compilation time was 16:02.
On the Velociraptor the build takes 15:23.
I have an E6600 with 1.5 GB of RAM. It's a C++ project with 1200 files, and the build is done in Visual Studio 2005. Acoustic management is switched off (no big difference anyway).
Did something go wrong, or is this modest acceleration really all I can expect?
Edit:
Some recommended increasing the RAM. I have now done so, and got a minimal gain (3-5%) by doubling my RAM to 3 GB.
Are you using the /MP option (undocumented; you have to add it manually to your compiler's command-line options) to enable source-level parallel builds? That will speed up your compile much more than a faster hard disk; gains from the disk are marginal.
Visual Studio 2005 can build multiple projects in parallel, and will do so by default on a multi-core machine, but depending on how your projects depend on each other it may be unable to parallel build them.
If your 1200 cpp files are in a single project, you're probably not using all of your CPU. If I'm not mistaken a C6600 is a quad-core CPU.
Dave
I imagine that hard disk reading was not your bottleneck in compilation. Realistically, few things need to be read/written from/to the hard disk. You would likely see more performance increase from more ram or a faster processor.
I'd suggest from the results that either your hdd latency speed wasn't the bottleneck you were looking for, or that your project is already close to building as fast as possible. Other items to consider would be:
hdd access time (although you may not be able to do much with this due to bus speed limitations)
RAM access speed and size
Processor speed
Reducing background processes
A ~6% increase in speed just from swapping the hard drive, just like Howler said. Grab some faster RAM and a faster CPU.
As many have already pointed out, you probably didn't attack the real bottleneck. Randomly changing parts (or code, for that matter) is, as one could say, "bass ackwards".
You first identify the performance bottleneck, and then you change something.
Perfmon can help you get a good overview of whether you're CPU- or I/O-bound; look at CPU utilization, disk queue length, and I/O bytes to get a first glimpse of what's going on.
That is actually a pretty big bump in speed for just replacing a hard disk. You are probably memory or CPU bound at this point. 1.5GB is light these days, and RAM is very cheap. You might see some pretty big improvements with more memory.
Just as a recommendation, if you have more than one drive installed, you could try setting your build directory to be somewhere on a different disk than your source files.
As for this comment:
If your 1200 cpp files are in a single project, you're probably not using all of your CPU. If I'm not mistaken a C6600 is a quad-core CPU.
Actually, a C6600 isn't anything. There is a E6600 and a Q6600. The E6600 is a dual core and the Q6600 is a quad core. On my dev machine I use a quad core CPU, and although our project has more than 1200 files, it is still EASILY processor limited during compile time (although a faster hard drive would still help speed things up!).
1200 source files is a lot, but none of them is likely to be more than a couple of hundred KB, so while they all need to be read into memory, it's not going to take long to do so.
Bumping your system memory to 4 GB (yes, yes, I know about the 3.somethingorother GB limit that 32-bit OSes have) and maybe looking at your CPU are going to provide a lot more performance improvement than merely using a faster disk drive could.
VC 2005 does not compile more than one file at a time per project, so either move to VC 2008 to use both of your CPU cores, or break your solution into multiple library sub-projects to get multiple compilations going.
I halved my compilation time by putting all my source onto a ram drive.
I tried these guys http://www.superspeed.com/desktop/ramdisk.php, installed a 1GB ramdrive, then copied all my source onto it. If you build directly from RAM, the IO overhead is vastly reduced.
To give you an idea of what I'm compiling, and on what;
WinXP 64-bit
4GB ram
2.? GHz dual-core processors
62 C# projects
approx 250kloc.
My build went from about 135s to 65s.
Downsides are that your source files are living in RAM, so you need to be more vigilant about source control. If your machine lost power, you'd lose all unversioned changes. Mitigated slightly by the fact that some RAMdrives will save themselves to disk when you shut the machine down, but still, you'll lose everything from either your last checkout, or the last time you shut down.
Also, you have to pay for the software. But since you're shelling out for hard drives, maybe this isn't that big a deal.
Upsides are the reduced compilation time, and the fact that the exes are already living in memory, so startup and debugging times are a bit better. The real benefit is the compilation time, though.