Linux cpuset doesn't work

I'm having trouble with cpusets; it would be great if you could help me.
I've defined two cpuset groups: "cpuset_0", which has only one task, and "cpuset_1", which is for all the other tasks in my system.
"cpuset_0" has cpus="0", cpu_exclusive="1" and only the one task assigned to it.
"cpuset_1" has cpus="1-3", cpu_exclusive="0" and all the tasks I could move as root out of the root cpuset.
Both cpusets have mems="0".
The problem is that for some reason I see tasks assigned to "cpuset_1" running on the exclusive CPU of "cpuset_0".
For example, running ps H -eo tid,psr,cgroup,cmd
gives me:
2199 0 6:cpuset:/cpuset_1?5:freeze /usr/lib/chromium-browser/chromium-browser
among other processes which shouldn't be running on cpu 0.
BTW: I'm running kernel version 3.2.0

Were you able to actually make it work without using cpuset.mems? It is mandatory. What does your config look like? Or did you use the mount command?
https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpuset.html
Try following the guide below:
https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/ch-Using_Control_Groups.html
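For reference, a minimal hand-rolled setup along the lines described in the question might look like the sketch below, assuming the cpuset controller is mounted at /sys/fs/cgroup/cpuset under cgroup v1 (the files may appear without the "cpuset." prefix if the hierarchy was mounted with the noprefix option), and with $PID standing for the task you want isolated:
cd /sys/fs/cgroup/cpuset
mkdir cpuset_0 cpuset_1
# Dedicated, exclusive CPU 0 for the single task
echo 0 > cpuset_0/cpuset.cpus
echo 0 > cpuset_0/cpuset.mems
echo 1 > cpuset_0/cpuset.cpu_exclusive
# Everything else on CPUs 1-3
echo 1-3 > cpuset_1/cpuset.cpus
echo 0 > cpuset_1/cpuset.mems
# Move the dedicated task, then migrate what can be moved out of the root cpuset
echo $PID > cpuset_0/tasks
for t in $(cat tasks); do echo $t > cpuset_1/tasks; done 2>/dev/null
# (some kernel threads are pinned and cannot leave the root cpuset; those writes fail)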

Related

Monitor the CPU usage of an OpenFOAM simulation running on a slurm job

I'm running an OpenFOAM simulation on a cluster. I have used the Scotch decomposition method and my decomposeParDict looks like this:
FoamFile
{
version 2.0;
format ascii;
class dictionary;
object decomposeParDict;
}
numberOfSubdomains 6;
method scotch;
checkMesh and decomposePar finish with no issues. I have assigned 6 nodes to the job in Slurm with
srun -N6 -l sonicFoam
and the solver runs smoothly without any errors.
The issue is that the solution speed is not improved in comparison to the non-parallel simulation I ran before. I want to monitor the CPU usage to see if all of the 6 nodes I have assigned are similarly loaded. The squeue --user=foobar command returns the job number and the list of nodes assigned (NODELIST(REASON)), which looks like this:
foo,bar[061-065]
From the sinfo command, these nodes are in both the debug and main* PARTITIONs (I have absolutely no idea what that means!).
This post says that you can use the sacct or sstat commands to monitor CPU time and memory usage of a slurm job. But when I run
sacct --format="CPUTime,MaxRSS"
it gives me:
CPUTime MaxRSS
---------- ----------
00:00:00
00:00:00
00:07:36
00:00:56
00:00:26
15:26:24
which I cannot understand. And when I specify the job number with
sacct --job=<jobNumber> --format="UserCPU"
the output is empty. So my questions are:
Is my simulation loading all nodes or is it running on one or two and the rest are free?
Am I running the right commands? If yes, what do those numbers mean? How do they represent the CPU usage per node?
If not, what are the right --format="..." fields for sacct and/or sstat (or maybe other Slurm commands) to get the CPU usage/load?
P.S.1. I compiled OpenFOAM following the official instructions. I did not do anything with OpenMPI and its mpicc compiler, for that matter.
P.S.2. For those of you who might end up here: maybe I'm running the wrong command. Apparently one can first allocate some resources with:
srun -N 1 --ntasks-per-node=7 --pty bash
where 7 is the number of cores you want and bash is just the command to run, and then run the solver with:
mpirun -np 7 sonicFoam -parallel -fileHandler uncollated
I'm not sure yet though.
You can use
sacct --format='jobid,AveCPU,MinCPU,MinCPUTask,MinCPUNode'
to check whether all CPUs have been active. Compare AveCPU (average CPU time of all tasks in the job) with MinCPU (minimum CPU time of all tasks in the job). If they are equal, all 6 tasks (you requested 6 nodes, with, implicitly, 1 task per node) worked equally. If they are not equal, or MinCPU is even zero, then some tasks have been doing nothing.
But in your case, I believe you will observe that all tasks have been working hard, but they were all doing the same thing.
Besides the remark concerning the -parallel flag by @timdykes, you also must be aware that launching an MPI job with srun requires that OpenMPI was compiled with Slurm support. During your installation of OpenFOAM, it installed its own version of OpenMPI, and if the file /usr/include/slurm/slurm.h or /usr/include/slurm.h exists, then Slurm support was probably compiled in. But the safest is probably to use mpirun.
But to do that, you will have to first request an allocation from Slurm with either sbatch or salloc.
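For instance, a minimal batch script along those lines might look like the sketch below (the task layout and the lack of any module or environment setup are assumptions; adapt it to your cluster), submitted with sbatch run_sonicFoam.sh:
#!/bin/bash
#SBATCH --job-name=sonicFoam
#SBATCH --nodes=6
#SBATCH --ntasks-per-node=1
# Use the mpirun that came with OpenFOAM's own OpenMPI, one rank per subdomain
mpirun -np 6 sonicFoam -parallel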
Have you tried running with the '-parallel' argument? All of the OpenFOAM examples online use this argument when running a parallel job; one example is the official guide for running in parallel.
srun -N $NTASKS -l sonicFoam -parallel
As an aside - I saw you built OpenFOAM yourself; have you checked whether the cluster admins have provided a module for it? You can usually run module avail to see a list of the available modules, and then module load moduleName if there is an existing OpenFOAM module. This is useful as you can probably trust it's been built with all the right options, and it would automatically set up your $PATH etc.

Run processes use two CPU in different terminals

I have a complex script (the script is just an example; it might be an unzip command, etc., and on the other terminal a different command; they are not connected) and two CPUs. Can I run two different processes (or commands, etc.) in two terminals, each on a different CPU, simultaneously? Is that possible? Is it possible to specify a particular processor for each terminal to use?
You can run 2 or more commands even on the same terminal with "taskset"
From the man pages (http://linuxcommand.org/man_pages/taskset1.html):
taskset is used to set or retrieve the CPU affinity of a running process given its PID or to launch a new COMMAND with a given CPU affinity. CPU affinity is a scheduler property that "bonds" a process to a given set of CPUs on the system. The Linux scheduler will honor the given CPU affinity and the process will not run on any other CPUs. Note that the Linux scheduler also supports natural CPU affinity: the scheduler attempts to keep processes on the same CPU as long as practical for performance reasons. Therefore, forcing a specific CPU affinity is useful only in certain applications.
@eddiem already shared the link (http://xmodulo.com/run-program-process-specific-cpu-cores-linux.html) on how to install taskset, and that link also explains how to run it.
In short:
$ taskset 0x1 tar -xzvf test.tar.gz
That would send the tar command to run on CPU 0
If you want to run several commands/scripts in the same terminal using different CPUs, then I think you could just send them to the background by appending "&" at the end, e.g.:
$ taskset 0x1 tar -xzvf test.tar.gz &
You can use the taskset program to control the CPU affinity of specific processes. If you set the affinity for the shell process controlling terminal A to core 0 and terminal B to core 1, any child processes started from A should run on core 0 and B on core 1.
http://xmodulo.com/run-program-process-specific-cpu-cores-linux.html
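For example, since child processes inherit the affinity of the shell that starts them, you could pin each terminal's shell once and then run your commands normally (a sketch; $$ is the PID of the current shell):
# In terminal A: pin this shell (and everything launched from it) to CPU 0
taskset -cp 0 $$
# In terminal B: pin this shell to CPU 1
taskset -cp 1 $$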

Limit the percentage of CPU a process tree is allowed to use?

Can I limit the percentage of CPU a running process and all its current and future children can use, combined? I've heard about the cpulimit tool, but that seems to ignore child processes.
Edit: So, the answer I found requires cpulimit to keep running for as long as we want the limit to stay in effect, since it does the limiting by actively sending suspend and then continue signals to the process. Are there perhaps other ways to achieve this limiting effect, without the need for such a secondary process running in the background?
Yes!
Just as I was writing this question, I found out that I had been trying an old version of cpulimit.
The new version supports limiting child processes too.
$ cpulimit -h
Usage: cpulimit [OPTIONS...] TARGET
OPTIONS
-l, --limit=N percentage of cpu allowed from 0 to 400 (required)
-v, --verbose show control statistics
-z, --lazy exit if there is no target process, or if it dies
-i, --include-children limit also the children processes
-h, --help display this help and exit
TARGET must be exactly one of these:
-p, --pid=N pid of the process (implies -z)
-e, --exe=FILE name of the executable program file or path name
COMMAND [ARGS] run this command and limit it (implies -z)
Report bugs to <marlonx80@hotmail.com>.
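Based on that help output, usage might look like this (the PID and the command are placeholders):
# Limit an already-running process and its children to ~50% of one core
cpulimit --limit=50 --include-children --pid=1234
# Or start a command under the limit directly
cpulimit --limit=50 --include-children make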
I've been researching this problem for the last few days, and I found at least two more options: cgroups and CPU affinity.
Given that this topic has been viewed more than 2k times, and it's been difficult to find a good source of information, let me post my notes here for future reference.
cgroups
Caveats: you can't use cgroups inside of Docker and you need root access for a one-time setup.
There are cgroups v1 and v2. As of 2020-04-22, only Red Hat has switched to v2 by default, and you can't use both at the same time; i.e., you're probably on v1.
You need root to create a cgroup directory / configure your system to create one at startup and delegate access to your non-root user, like so:
v1: mkdir /sys/fs/cgroup/cpu/<directory>/ && chown -R user /sys/fs/cgroup/cpu/<directory>/ (this is specific to restricting CPU usage - there are other cgroup 'controllers' that use different directories; a process can be in multiple cgroups)
v2: mkdir /sys/fs/cgroup/unified/<directory> && chown -R user /sys/fs/cgroup/unified/<directory> (this is a unified cgroup hierarchy, and you can control all cgroup 'controllers' via a single cgroup; a process can be in only one cgroup, and that cgroup must not contain other cgroups - i.e., a leaf cgroup)
Configure the cgroup by writing to control files in this tree:
Configure CPU quota using cpu.cfs_quota_us and cpu.cfs_period_us, e.g., echo 10000 > cpu.cfs_quota_us
Add a process to the new cgroup by writing its pid to the cgroup.procs control file. (All subprocesses are automatically in the same cgroup.)
This link has more info:
https://drill.apache.org/docs/configuring-cgroups-to-control-cpu-usage/
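Putting the v1 steps above together, a minimal sketch (the group name "slowgroup" and the command being limited are illustrative) might be:
# One-time setup as root
sudo mkdir /sys/fs/cgroup/cpu/slowgroup
sudo chown -R "$USER" /sys/fs/cgroup/cpu/slowgroup
# Allow 10 ms of CPU time per 100 ms period, i.e. roughly 10% of one core
echo 100000 > /sys/fs/cgroup/cpu/slowgroup/cpu.cfs_period_us
echo 10000 > /sys/fs/cgroup/cpu/slowgroup/cpu.cfs_quota_us
# Put the current shell into the group; everything it starts inherits membership
echo $$ > /sys/fs/cgroup/cpu/slowgroup/cgroup.procs
./heavy_job   # placeholder command; it and all its children share the cap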
CPU affinity
You can only use CPU affinity to limit CPU usage to an integer number of logical CPUs (aka cpu cores), not to a specific percentage. On today's multi-core systems, that may be good enough.
The documentation for this feature is at $ man sched_setaffinity. Note that cgroups also support setting CPU affinity through the cpuset controller.
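As a quick illustration (the command name is a placeholder), affinity set with taskset is inherited by all children, so the whole process tree stays on the chosen cores:
taskset -c 0,1 ./heavy_job   # the process tree is confined to logical CPUs 0 and 1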

Command which shows the most cpu intensive processes running on my system?

I also want to kill processes which are not running on my kernel, as my system is getting slow. Kindly assist. I am using Ubuntu.
You can get the total number of running processes with
$ ps aux | wc -l
As for killing processes, you need to be careful with that. However, using top or ps and providing the needed options (see the man pages) you can list processes run by specific users/belonging to specific groups.
If your system is slow, focus on the top memory/CPU consumers, and to do that use
$ top
to see which processes are consuming the most resources.
You can then type k (for kill) and enter the PID of the process you want to kill. Use signal 15 for a soft kill, and 9 for a hard kill if the soft kill doesn't work.
You can type q to quit.
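If you prefer a one-off snapshot instead of the interactive top view, ps can sort by CPU usage, for example:
$ ps aux --sort=-%cpu | head -n 10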

LINUX: How to lock the pages of a process in memory

I have a LINUX server running a process with a large memory footprint (some sort of a database engine). The memory allocated by this process is so large that part of it needs to be swapped (paged) out.
What I would like to do is to lock the memory pages of all the other processes (or a subset of the running processes) in memory, so that only the pages of the database process get swapped out. For example, I would like to make sure that I can continue to connect remotely and monitor the machine without those processes being impacted by swapping; i.e., I want sshd, X, top, vmstat, etc. to have all their pages memory-resident.
On Linux there are the mlock() and mlockall() system calls, which seem to offer the right knob to do the pinning. Unfortunately, it seems to me that I need to make an explicit call inside every process and cannot invoke mlock() from a different process or from the parent (mlock() is not inherited after fork() or execve()).
Any help is greatly appreciated. Virtual pizza & beer offered :-).
It has been a while since I've done this so I may have missed a few steps.
Make a GDB command file that contains something like this:
call mlockall(3)
detach
Then on the command line, find the PID of the process you want to mlock. Type:
gdb --pid [PID] --batch -x [command file]
If you get fancy with pgrep that could be:
gdb --pid $(pgrep sshd) --batch -x [command file]
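Putting those steps together, a sketch might look like this (assuming sshd is one of the processes you want kept resident; run it as root so GDB may attach):
# Write the GDB command file described above
cat > /tmp/mlockall.gdb <<'EOF'
call mlockall(3)
detach
EOF
# Attach to the target and force it to lock its current and future pages (3 = MCL_CURRENT | MCL_FUTURE)
sudo gdb --pid "$(pgrep -o sshd)" --batch -x /tmp/mlockall.gdb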
Actually locking the pages of most of the stuff on your system seems a bit crude/drastic, not to mention being such an abuse of the mechanism it seems bound to cause some other unanticipated problems.
Ideally, what you probably actually want is to control the "swappiness" of groups of processes so the database is first in line to be swapped while essential system admin tools are the last, and there is a way of doing this.
While searching for mlockall information I ran across this tool. You may be able to find it for your distribution. I only found the man page.
http://linux.die.net/man/8/memlockd
Nowadays, the easy and right way to tackle the problem is cgroups.
Just restrict the memory usage of the database process:
1. create a memory cgroup
sudo cgcreate -g memory:$test_db -t $User:$User -a $User:$User
2. limit the group's RAM usage to 1G.
echo 1000M > /sys/fs/cgroup/memory/$test_db/memory.limit_in_bytes
or
echo 1000M > /sys/fs/cgroup/memory/$test_db/memory.soft_limit_in_bytes
3. run the database program in the $test_db cgroup
cgexec -g memory:$test_db $db_program_name
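To verify the limit is in place (cgroup v1 paths, as above), you can read back the control files:
cat /sys/fs/cgroup/memory/$test_db/memory.limit_in_bytes
cat /sys/fs/cgroup/memory/$test_db/memory.usage_in_bytes   # current usage of the group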
