How to get rid of the kswapd0 process running in Linux

I frequently face the issue of kswapd0 running on one of my Linux machines. What could be the reason for that? Looking more into the issue, I understood that it is caused by low memory. I tried the below options to avoid it:
echo 1 > /proc/sys/vm/drop_caches
cat /proc/sys/vm/drop_caches
sudo cat /proc/sys/vm/swappiness
sudo sysctl vm.swappiness=60
But these did not yield fruitful results. What would be the best method to avoid this, or does some action need to be taken on the machine's RAM? Any suggestions?
Every time we observe it, all the running apps are killed automatically and kswapd0 occupies the entire CPU and memory.
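Before tuning anything further, it helps to see what is actually consuming memory when kswapd0 spins up. A minimal diagnostic sketch with standard procps tools (nothing here is specific to any particular machine):
free -h                              # overall RAM and swap usage
ps aux --sort=-%mem | head -n 10     # ten largest memory consumers
vmstat 1 5                           # watch the si/so columns for active swapping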

Related

"No Such Process" consumes GPU memory

When I use nvidia-smi, I find nearly 20GB of GPU memory missing somewhere (the listed processes take 17745MB in total, while Memory-Usage shows 37739MB):
Then I used nvitop, and you can see that "No Such Process" has actually taken my GPU resources. However, I cannot kill this PID:
>>> sudo kill -9 118238
kill: (118238): No such process
How can I get rid of this ghost process without interrupting others?
I have found the solution in this answer: https://stackoverflow.com/a/59431785/6563277.
First, I ran sudo fuser -v /dev/nvidia* to see all the processes using my GPU RAM that nvidia-smi had failed to show.
Then I saw some "ghost" Python processes, and after killing them, the GPU RAM was freed up.
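In short, a minimal sketch of that procedure (the PID below is hypothetical; use whatever fuser actually reports):
sudo fuser -v /dev/nvidia*    # list every process holding an NVIDIA device node
sudo kill -9 123456           # hypothetical PID taken from the fuser output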

Rust compilation on AWS fails while succeeding on other machines [duplicate]

I am using openSUSE, specifically the variant on Mono's website when you click VMware.
I get this error. Does anyone know how I might fix it?
make[4]: Entering directory `/home/rupert/Desktop/llvm/tools/clang/tools/driver'
llvm[4]: Linking Debug+Asserts executable clang
collect2: ld terminated with signal 9 [Killed]
make[4]: *** [/home/rupert/Desktop/llvm/Debug+Asserts/bin/clang] Error 1
The full text can be found here
Your virtual machine does not have enough memory to perform the linking phase. Linking is typically the most memory-intensive part of a build, since it is where all the object code comes together and is operated on as a whole.
If you can allocate more RAM to the VM, then do that. Alternatively, you could increase the amount of swap space. I am not that familiar with VMs, but I imagine the virtual hard drive you set up will have a swap partition. If you can make that bigger, or allocate a second swap partition, that would help.
Increasing the RAM, if only for the duration of your build, is the easiest thing to do though.
Also got the same issue and solved it by doing the following steps (it is a memory issue only):
Check the current swap space by running the free command (it should be around 10GB).
Check the swap partition:
sudo fdisk -l
/dev/hda8 none swap sw 0 0
Make the swap space and enable it:
sudo swapoff -a
sudo /sbin/mkswap /dev/hda8
sudo swapon -a
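To verify that swap came back up, a quick check (swapon --show requires a reasonably recent util-linux):
swapon --show    # list active swap devices
free -h          # confirm the total swap size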
If your swap disk size is not enough, you may want to create a swap file and use it.
Create the swap file:
sudo fallocate -l 10g /mnt/10GB.swap
sudo chmod 600 /mnt/10GB.swap
OR
sudo dd if=/dev/zero of=/mnt/10GB.swap bs=1024 count=10485760
sudo chmod 600 /mnt/10GB.swap
Format the swap file:
sudo mkswap /mnt/10GB.swap
Enable the swap file:
sudo swapon /mnt/10GB.swap
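To make the swap file survive a reboot, an entry like the following is typically added to /etc/fstab (a sketch; adjust the path to match your swap file):
/mnt/10GB.swap none swap sw 0 0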
I tried with make -j1 and it worked! But it takes a long time to build.
I had the same problem building on a VirtualBox system. FWIW I was building on a laptop with XP and 2GB RAM. I had to bump the virtual RAM up to 1462MB to get a successful build. Also note the recommended disk size of 8GB is not sufficient to build and install both LLVM and Clang under Ubuntu. I'd recommend at least 16GB.
I would suggest using the -l (--max-load) option instead of limiting -j in this case. Possibly helpful answer.
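For example, a sketch of how -l throttles by load average instead of a fixed job count:
make -j8 -l 2.0    # allow up to 8 parallel jobs, but start no new job while the load average exceeds 2.0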

How to force NVIDIA OpenCL to release GPU context to avoid memory leak

This is a follow up question to an earlier question.
From the discussion, the mmc code (https://github.com/fangq/mmc) appears to be fine, and the memory was properly released when running on an Intel CPU and an AMD GPU. However, on an NVIDIA GPU, valgrind reported a significant memory leak, and so did the test below. Every time, after a cycle of creating and releasing a GPU context, the memory kept increasing.
You can see this result in the memory profiling report below (blue line).
Here is the test and commands to reproduce the issue (need to run this on NVIDIA GPUs):
git clone https://github.com/fangq/mmc.git
cd mmc/src
sed -i -e 's/mmc_init_from_cmd/for(int i=0;i<5;i++){\nmmc_init_from_cmd/g' mmc.c
sed -i -e 's/return/getchar();}\nreturn/g' mmc.c
make clean
make all
cd ../examples/validation
../../src/bin/mmc -f cube2.inp -G 1 -s cube2 -n 1e4 -b 0 -D TP -M G -F bin
Run ../../src/bin/mmc -L to list GPUs, and use -G # to specify which GPU to use.
As you will see, the simulation will repeat 5 times, separated by enter keys. You can start a memory monitor, like the top command in Linux, and see the increasing memory allocation after each repetition.
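For instance, a minimal sketch of tracking the process's resident memory between repetitions (assumes the binary is named mmc, as above):
watch -n 1 'ps -o pid,rss,vsz,cmd -C mmc'    # RSS should stay flat if the context memory is really released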
I googled and found multiple previous reports of OpenCL memory leaks, but I did not find a solution. I would like to know if there is any trick to force the NVIDIA OpenCL driver to clean up memory after each run. I am asking this because mmc has a MATLAB/Octave mex function which can be called multiple times, and this issue could lead to large memory usage after multiple calls.

Two xwin-xdg-menu processes with high CPU consumption

I have a Windows 7 computer with an Intel i7 with 2 cores and hyperthreading, and a Linux virtual machine in a cloud. I don't like VNC (it's laggy), so I use X windowing.
I start my Cygwin XWin with the following command:
C:\cygwin64\bin\run.exe --quote /usr/bin/bash.exe -l -c "cd; /usr/bin/xinit /etc/X11/xinit/startxwinrc -- /usr/bin/XWin :0 -multiwindow -listen tcp"
It's otherwise working just as intended, but for some reason it spawns two xwin-xdg-menu processes, one of which consumes 25% of my CPU. When I kill that one, the CPU usage returns to normal and everything works fine, including the other xwin-xdg-menu process.
I tried also this:
C:\cygwin64\bin\XWin.exe :0 -multiwindow -listen tcp
but it makes the application run slowly and with a bad resolution.
Is there a way to start X with -listen tcp, with a resolution adapted to the multiple screens I have, without having to manually kill the extra process every time?
It seems I'm not the only one with this problem, but so far I haven't found any solution to it.
https://cygwin.com/ml/cygwin/2017-05/msg00345.html
https://superuser.com/questions/1210325/cygwin-at-spi-bus-launcher-and-xwin-xdg-menu-high-cpu (I don't have problems with at-spi-bus-launcher though)
Solution:
Create a ~/.startxwinrc file, and add one line:
exec sleep infinity
Make ~/.startxwinrc executable by running chmod +x ~/.startxwinrc.
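Concretely, both steps as commands (a sketch; run from a Cygwin shell):
printf 'exec sleep infinity\n' > ~/.startxwinrc
chmod +x ~/.startxwinrc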
Reason I suspect this worked:
startxwin searches for a ~/.startxwinrc file to execute when launching. If startxwin does not find a ~/.startxwinrc file, startxwin will follow the default routine outlined in /etc/X11/xinit/startxwinrc.
The default routine launches /usr/bin/xwin-xdg-menu, somehow causing me to have two xwin-xdg-menu processes, one of them with very high cpu. Creating ~/.startxwinrc bypasses the default routine, disabling /usr/bin/xwin-xdg-menu from launching altogether.
exec sleep infinity keeps the x server alive after launching.

How to measure IOPS for a command in linux?

I'm working on a simulation model where I want to determine when storage IOPS capacity becomes a bottleneck (e.g. an HDD has ~150 IOPS, while an SSD can have 150,000). So I'm trying to come up with a way to benchmark the IOPS of a command (git) for some of its different operations (push, pull, merge, clone).
So far, I have found tools like iostat; however, I am not sure how to limit the report to what a single command does.
The best idea I can come up with is to determine my HDD's IOPS capacity, run time on the actual command to see how long it lasts, and multiply the duration by the IOPS rating, and those are my IOPS:
HDD ->150 IOPS
time df -h
real 0m0.032s
150 * .032 = 4.8 IOPS
But, this is of course very stupid, because the duration of the execution may have been related to CPU usage rather than HDD usage, so unless usage of HDD was 100% for that time, it makes no sense to measure things like that.
So, how can I measure the IOPS for a command?
There are multiple time(1) commands on a typical Linux system; the default is a bash(1) builtin which is somewhat basic. There is also /usr/bin/time which you can run by either calling it exactly like that, or telling bash(1) to not use aliases and builtins by prefixing it with a backslash thus: \time. Debian has it in the "time" package which is installed by default, Ubuntu is likely identical, and other distributions will be quite similar.
Invoking it in a similar fashion to the shell builtin is already more verbose and informative, albeit perhaps more opaque unless you're already familiar with what the numbers really mean:
$ \time df
[output elided]
0.00user 0.00system 0:00.01elapsed 66%CPU (0avgtext+0avgdata 864maxresident)k
0inputs+0outputs (0major+261minor)pagefaults 0swaps
However, I'd like to draw your attention to the man page which lists the -f option to customise the output format, and in particular the %w format which counts the number of times the process gave up its CPU timeslice for I/O:
$ \time -f 'ios=%w' du Maildir >/dev/null
ios=184
$ \time -f 'ios=%w' du Maildir >/dev/null
ios=1
Note that the first run stopped for I/O 184 times, but the second run stopped just once. The first figure is credible, as there are 124 directories in my ~/Maildir: the reading of the directory and the inode gives roughly two IOPS per directory, less a bit because some inodes were likely next to each other and read in one operation, plus some extra again for mapping in the du(1) binary, shared libraries, and so on.
The second figure is of course lower due to Linux's disk cache. So the final piece is to flush the cache. sync(1) is a familiar command which flushes dirty writes to disk, but doesn't flush the read cache. You can flush that one by writing 3 to /proc/sys/vm/drop_caches. (Other values are also occasionally useful, but you want 3 here.) As a non-root user, the simplest way to do this is:
echo 3 | sudo tee /proc/sys/vm/drop_caches
Combining that with /usr/bin/time should allow you to build the scripts you need to benchmark the commands you're interested in.
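For example, a minimal benchmark script along these lines (a sketch: the repository URL is a placeholder, and GNU time's -f option is assumed):
#!/bin/sh
sync                                                    # flush dirty writes to disk
echo 3 | sudo tee /proc/sys/vm/drop_caches >/dev/null   # drop the page cache
/usr/bin/time -f 'ios=%w' git clone https://example.com/repo.git    # count I/O stops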
As a minor aside, tee(1) is used because this won't work:
sudo echo 3 >/proc/sys/vm/drop_caches
The reason? Although the echo(1) runs as root, the redirection is as your normal user account, which doesn't have write permissions to drop_caches. tee(1) effectively does the redirection as root.
The iotop command collects I/O usage information about processes on Linux. By default, it is an interactive command, but you can run it in batch mode with -b / --batch. Also, you can pass a list of processes with -p / --pid. Thus, you can monitor the activity of a git command with:
$ sudo iotop -p $(pidof git) -b
You can change the delay with -d / --delay.
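Putting those options together, a sketch of a non-interactive capture (assumes a single running git process, hence pidof -s):
sudo iotop -b -d 2 -n 10 -p "$(pidof -s git)"    # ten two-second samples of that PID's I/O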
You can use pidstat:
pidstat -d 2
More specifically, pidstat -d 2 | grep COMMAND, or pidstat -C COMMANDNAME -d 2.
The pidstat command is used for monitoring individual tasks currently being managed by the Linux kernel. It writes to standard output activities for every task selected with option -p or for every task managed by the Linux kernel if option -p ALL has been used. Not selecting any tasks is equivalent to specifying -p ALL but only active tasks (tasks with non-zero statistics values) will appear in the report.
The pidstat command can also be used for monitoring the child processes of selected tasks.
-C comm: Display only tasks whose command name includes the string comm. This string can be a regular expression.
