Puzzled by the cpu-shares setting on Docker

I wrote a test program 'cputest.py' in Python like this:

import time

while True:
    for _ in range(10120*40):
        pass
    time.sleep(0.008)

On its own, this costs about 80% CPU when running in a container (with no interference from other running containers).
Then I ran this program in two containers by the following two commands:
docker run -d -c 256 --cpuset=1 IMAGENAME python /cputest.py
docker run -d -c 1024 --cpuset=1 IMAGENAME python /cputest.py
and used 'top' to view their CPU usage. It turned out that they used roughly 30% and 67% of the CPU, respectively. I'm pretty puzzled by this result. Would anyone kindly explain it to me? Many thanks!
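For reference, the -c weights on their own would predict a 20/80 split when both containers are pinned to the same core and always runnable. A quick sketch of that arithmetic (the container names are just labels):

# Expected CPU split from relative cpu-shares alone (256 vs 1024),
# assuming both containers are pinned to the same core and always runnable.
shares = {"container_a": 256, "container_b": 1024}
total = sum(shares.values())
for name, weight in shares.items():
    print(f"{name}: {100 * weight / total:.0f}% of the core")
# container_a: 20% of the core
# container_b: 80% of the core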

I sat down last night and tried to figure this out on my own, but ended up not being able to explain the 70/30 split either. So I sent an email to some other devs and got this response, which I think makes sense:
I think you are slightly misunderstanding how task scheduling works, which is why the maths doesn't work. I'll try and dig out a good article, but at a basic level the kernel assigns slices of time to each task that needs to execute, and it allocates those slices to tasks according to the given priorities.
So with those priorities and the tight-looped code (no sleep), the kernel assigns 4/5 of the slots to a and 1/5 to b. Hence the 80/20 split.
However, when you add in sleep it gets more complex. Sleep basically tells the kernel to yield the current task, and execution returns to that task after the sleep time has elapsed. It could be longer than the time given, especially if there are higher-priority tasks running. When nothing else is running, the kernel just sits idle for the sleep time.
But when you have two tasks, the sleeps allow the two tasks to interleave. So when one sleeps, the other can execute. This likely leads to a complex execution pattern which you can't model with simple maths. Feel free to prove me wrong there!
I think another reason for the 70/30 split is the way you are producing the "80% load". The numbers you have chosen for the loop and the sleep just happen to work on your PC with a single task executing. You could try making the loop based on elapsed time instead: busy-loop for 0.8, then sleep for 0.2. That might give you something closer to 80/20, but I don't know.
So in essence, your time.sleep() call is skewing your expected numbers; removing the time.sleep() brings the CPU load far closer to what you'd expect.
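To illustrate that last suggestion, here is a minimal sketch of cputest.py rewritten to base the busy loop on elapsed time rather than an iteration count (the 0.8 s / 0.2 s values are the assumed target from the email above):

import time

BUSY = 0.8   # seconds of busy-looping per cycle (assumed target: ~80% load)
IDLE = 0.2   # seconds of sleeping per cycle

while True:
    deadline = time.monotonic() + BUSY
    while time.monotonic() < deadline:
        pass                  # burn CPU until the busy window elapses
    time.sleep(IDLE)          # yield the CPU for the rest of the cycle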

Related

Sleep() Methods and OS - Scheduler (Camunda/Groovy)

I've got a question for you guys, and it's not as specific as usual, which could make it a little annoying to answer.
The tool I'm working with is Camunda in combination with Groovy scripts, and the goal is to reduce the maximum CPU load (or peak load). I'm doing this by "stretching" the workload over a certain time frame, since the platform seems to be unhappy with huge workload inputs in a short amount of time. The resulting problem is that Camunda won't react smoothly when someone tries to operate it at the UI level.
So I wrote a small script which basically just lets each individual process determine its own "time to sleep" before running, if a certain threshold is exceeded. This is based on how many processes are trying to run at the same time as the individual process.
It looks like:
Process wants to start -> Process asks how many other processes are running ->
waitingTime = numberOfProcesses * timeToSleep * iterationOfMeasures
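In code, that throttling idea looks roughly like the sketch below (a minimal Python rendering of the same logic, not the actual Groovy script; count_running_processes and the constants are placeholders):

import time

TIME_TO_SLEEP = 2          # base delay per running process, in seconds (assumed)
ITERATION_OF_MEASURES = 1  # scaling factor from the formula above (assumed)

def count_running_processes():
    # Placeholder: ask the engine how many process instances are currently running.
    return 5

def throttle_before_start(threshold=3):
    running = count_running_processes()
    if running > threshold:
        waiting_time = running * TIME_TO_SLEEP * ITERATION_OF_MEASURES
        time.sleep(waiting_time)   # stagger the start to flatten the CPU peak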
[Figure: CPU usage curves 1 and 3 without the script; curves 2 and 4 with the script]
Testing it, I saw that I could stretch the workload and smooth things out at the UI level. But now I need to describe why exactly this works.
The Questions are:
What does a sleep method do, exactly?
What does the sleep method do at the CPU level?
How does an OS scheduler react to a sleep method?
Namely: does the scheduler reschedule, or does it simply "wait" for the time given?
How can I recreate and test the questions given above?
The main goal is not for you to answer these, but could you give me a hint for finding the right literature to answer them? Maybe you remember a book which helped you understand this kind of thing, or something a professor recommended to you. (Mine won't answer, and I can't blame him.)
I'm grateful for hints and/or recommendations!
I'm sure you could use a timer event:
https://docs.camunda.org/manual/7.15/reference/bpmn20/events/timer-events/
It allows you to postpone the next task trigger for some time, defined by an expression.
About sleep in Java/Groovy: https://www.javamex.com/tutorials/threads/sleep.shtml
Using sleep blocks the current thread in Groovy/Java/Camunda, so instead of doing something useful the thread is just blocked.

Multithreading/Parallel Bash Scripts in Unix Environment

I have multiple bash scripts that I have tried to "parallelize" within a master bash script.
Bash Script:
#!/bin/bash
SHELL=/bin/bash
bash /home/.../a.sh &
bash /home/.../b.sh &
wait
bash /home/.../c.sh &
bash /home/.../d.sh &
bash /home/.../e.sh &
wait
echo "Done paralleling!"
exit 0
I have run the script normally (without ampersands) and with ampersands, and I am not seeing any appreciable difference in processing time, which leads me to believe that something may not be coded correctly or in the most efficient way.
In classic computer-science theory, resource-contention is referred to as "thrashing."
(In the good ol' days, when a 5-megabyte disk drive might be the size of a small washing machine, we used to call it "Maytag Mode," since the poor thing looked like a Maytag washing-machine on the "spin" cycle!)
If you graph the performance curve caused by contention, it slopes upward, then abruptly has an "elbow" shape: it goes straight up, exponentially. We call that, "hitting the wall."
An interesting thing to fiddle around with on this script (if you're just curious ...) is to put wait statements at several places. (Be sure you're doing this correctly ...) Allow, say, two instances to run, wait for all of them to complete, then the next two, and so on. See if that's usefully faster, and, if it is, try three at a time. And so on. You may find a "sweet spot."
Or ... not. (Don't spend too much time with this. It doesn't look like it's going to be worth it.)
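If you do want to experiment with that, a minimal sketch (in Python rather than bash, reusing the script paths from the question) that runs the scripts N at a time and times each batch size could look like this:

import subprocess, time
from concurrent.futures import ThreadPoolExecutor

SCRIPTS = ["/home/.../a.sh", "/home/.../b.sh", "/home/.../c.sh",
           "/home/.../d.sh", "/home/.../e.sh"]     # placeholder paths from the question

def run(script):
    subprocess.run(["bash", script], check=False)  # each call blocks until the script exits

for batch_size in (1, 2, 3):                       # candidate "sweet spot" sizes
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=batch_size) as pool:
        list(pool.map(run, SCRIPTS))               # at most batch_size scripts run at once
    print(f"{batch_size} at a time: {time.monotonic() - start:.1f}s")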
You're likely correct. The thing with parallelism is that it allows you to grab multiple resources to use in parallel. That improves your speed if - and only if - that resource is your limiting factor.
So - for example - if you're reading from a disk - odds are good that the action of reading from disk is what's limiting you, and doing more in parallel doesn't help - and indeed, because of contention can slow the process down. (The disk has to seek to service multiple processes, rather than just 'getting on' and serialising a read).
So it really does boil down to what your script actually does and why it's slow. And the best way of checking that is by profiling it.
At a basic level, something like truss or strace might help.
e.g.
strace -fTtc /home/../e.sh
And see what types of system calls are being made, and how much of the total time they're consuming.

Sleep loop in groovy for hour

Hey, I'm getting used to Groovy, and I want to have a loop, such as a do-while loop, in my Groovy script that runs every hour or two until a certain condition inside the loop is met (variable = something). So I found the sleep step, but I was wondering if it would be OK to sleep for such a long time. The sleep function will not mess up, right?
The sleep function will not mess up. But that isn't your biggest problem.
If all your script is doing is sleeping, it would be better to have a scheduler like Cron launch your script. This way is simpler and more resilient, it reduces the opportunities for the script to be accumulating garbage, leaking memory, having its JVM get killed by another process, or otherwise just falling into a bad state from programming errors. Cron is solid and there is less that can go wrong that way. Starting up a JVM is not speedy but if your timeframe is in hours it shouldn't be a problem.
Another possible issue is that the time your script wakes up may drift. The OS scheduler is not obliged to wake your thread up at exactly the elapsed time. Also the time on the server could be changed while the script is running. Using Cron would make the time your script acts more predictable.
On the other hand, with the scheduler, if a process takes longer than the time to the next run, there is the chance that multiple instances of the process can exist concurrently. You might want to have the script create a lock file and remove it once it's done, checking to see if the file exists already to let it know if another instance is still running.
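A minimal sketch of that lock-file guard (in Python rather than Groovy, with an assumed lock path), for a script launched by cron:

import os, sys

LOCK = "/tmp/hourly-job.lock"   # assumed lock-file location

def do_the_work():
    print("doing the periodic work")   # placeholder for the actual job

# O_CREAT | O_EXCL fails if the file already exists, so a second instance
# started by cron exits instead of running concurrently with the first.
try:
    fd = os.open(LOCK, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
except FileExistsError:
    sys.exit("previous run still in progress")

try:
    os.write(fd, str(os.getpid()).encode())
    do_the_work()
finally:
    os.close(fd)
    os.remove(LOCK)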
First of all, there is no do {} while() construct in Groovy (it was only added in Groovy 3.0). Secondly, it's a better idea to use a scheduler, e.g. QuartzScheduler, to run a cron task.

Waiting on many parallel shell commands with Perl

Concise-ish problem explanation:
I'd like to be able to run multiple (we'll say a few hundred) shell commands, each of which starts a long running process and blocks for hours or days with at most a line or two of output (this command is simply a job submission to a cluster). This blocking is helpful so I can know exactly when each finishes, because I'd like to investigate each result and possibly re-run each multiple times in case they fail. My program will act as a sort of controller for these programs.
for all commands in parallel {
submit_job_and_wait()
tries = 1
while ! job_was_successful and tries < 3{
resubmit_with_extra_memory_and_wait()
tries++
}
}
What I've tried/investigated:
I was so far thinking it would be best to create a thread for each submission which just blocks waiting for input. There is enough memory for quite a few waiting threads. But from what I've read, perl threads are closer to duplicate processes than in other languages, so creating hundreds of them is not feasible (nor does it feel right).
There also seem to be a variety of event-loop-ish cooperative systems like AnyEvent and Coro, but these seem to require you to rely on asynchronous libraries, otherwise you can't really do anything concurrently. I can't figure out how to make multiple shell commands with it. I've tried using AnyEvent::Util::run_cmd, but after I submit multiple commands, I have to specify the order in which I want to wait for them. I don't know in advance how long each submission will take, so I can't recv without sometimes getting very unlucky. This isn't really parallel.
my $cv1 = run_cmd("qsub -sync y 'sleep $RANDOM'");
my $cv2 = run_cmd("qsub -sync y 'sleep $RANDOM'");
# Now should I $cv1->recv first or $cv2->recv? Who knows!
# Out of 100 submissions, I may have to wait on the longest one before processing any.
My understanding of AnyEvent and friends may be wrong, so please correct me if so. :)
The other option is to run the job submission in its non-blocking form and have it communicate its completion back to my process, but the inter-process communication required to accomplish and coordinate this across different machines daunts me a little. I'm hoping to find a local solution before resorting to that.
Is there a solution I've overlooked?
You could instead use scientific workflow software such as Fireworks or Pegasus, which is designed to help scientists submit large numbers of computing jobs to shared or dedicated resources. These tools can also do much more, so they might be overkill for your problem, but they are still worth a look.
If your goal is to find the tightest memory requirements for your job, you could also simply submit your job with a large amount of requested memory, and then extract the actual memory usage from accounting (qacct) or, cluster policy permitting, log on to the compute node(s) where your job is running and view the memory usage with top or ps.
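For what it's worth, the controller pattern sketched in the question can also be expressed with an ordinary thread pool; here is a minimal version in Python rather than Perl (the qsub commands and the retry tweak are placeholders):

import subprocess
from concurrent.futures import ThreadPoolExecutor

# Placeholder submissions; with -sync y each command blocks until the cluster job finishes.
JOBS = ["qsub -sync y 'sleep 10'",
        "qsub -sync y 'sleep 20'",
        "qsub -sync y 'sleep 30'"]

def submit_with_retries(cmd, max_tries=3):
    for attempt in range(1, max_tries + 1):
        result = subprocess.run(cmd, shell=True)   # blocks for hours or days if need be
        if result.returncode == 0:
            return attempt
        # Placeholder: bump the requested memory here before resubmitting.
    return None

# One worker thread per in-flight submission; the threads spend nearly all their
# time blocked waiting on the subprocess, so a few hundred of them is cheap.
with ThreadPoolExecutor(max_workers=len(JOBS)) as pool:
    for cmd, tries in zip(JOBS, pool.map(submit_with_retries, JOBS)):
        print(cmd, "->", "gave up" if tries is None else f"succeeded on try {tries}")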

Scheduling in Linux: run a task when computer is idle (= no user input)

I'd like to run the Folding@home client on my Ubuntu 8.10 box only when it's idle, because of the program's heavy RAM consumption.
By "idle" I mean the state when there's no user activity (keyboard, mouse, etc.). It's OK for other (possibly heavy) processes to run at that time, since F@H has the lowest CPU priority. The point is just to improve the user experience and to do heavy work when nobody is using the machine.
How to accomplish this?
When the machine in question is a desktop, you could hook a start/stop script into the screensaver so that the process is stopped when the screensaver is inactive and vice versa.
It's fiddly to arrange for the process to only be present when the system is otherwise idle.
Actually starting the program in those conditions isn't the hard bit. You have to arrange for the program to be cleanly shut down, and figure out how and when to do that.
You have to be able to distinguish between that process's own CPU usage, and that of the other programs that might be running, so that you can tell whether the system is properly "idle".
It's a lot easier to make the process only get scheduled when the system is otherwise idle. Just use the 'nice' command to launch the Folding@home client.
However, that won't solve the problem of insufficient RAM. If you've got swap space enabled, the system should be able to swap out any low-priority processes so that they're not consuming any real resources, but beware of a big hit on disk I/O each time your Folding@home client swaps in and out of RAM.
p.s. RAM is very cheap at the moment...
p.p.s. see this article
Maybe you need to set the idle task to the lowest priority via nice.
You're going to want to look at a few things to determine "idle", and also explore the sysinfo() call (the link points out the difference in the structure that it populates between various kernel versions).
Linux does not manage memory in a typical way. Don't just look at load; look at memory too. In particular, /proc/meminfo has a wonderful line starting with Committed_AS, which shows how much memory the kernel has actually promised to other processes. Compare that with what you learned from sysinfo(), and you might realize that a one-minute load average of 0.00 doesn't mean it's time to run some program that wants to allocate 256 MB of memory, since the kernel may really be over-selling. Note that all values filled in by sysinfo() are available via /proc; sysinfo() is just an easier way to get them.
You would also want to look at how much time each core has spent in iowait since boot, which is an even stronger indicator of whether you should run an I/O resource hog. Grab that info from /proc/stat; the first line contains the aggregate count for all CPUs, and iowait is the 6th field. Of course, if you intend to set affinity to a single CPU, only that CPU would be of interest (it's still the sixth field, in units of USER_HZ, typically 100ths of a second). Average that against btime, also found in /proc/stat.
In short, don't just look at load averages.
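A minimal sketch of reading those two values, Committed_AS from /proc/meminfo and the aggregate iowait from the first line of /proc/stat (field positions follow proc(5); treat them as an assumption to verify on your kernel):

def committed_as_kb():
    # /proc/meminfo lines look like: "Committed_AS:   123456 kB"
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("Committed_AS:"):
                return int(line.split()[1])

def aggregate_iowait_ticks():
    # First line of /proc/stat: "cpu  user nice system idle iowait irq softirq ..."
    with open("/proc/stat") as f:
        fields = f.readline().split()
    return int(fields[5])   # iowait: the sixth whitespace-separated field, in USER_HZ ticks

print("Committed_AS (kB):", committed_as_kb())
print("iowait since boot (USER_HZ ticks):", aggregate_iowait_ticks())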
EDIT
You should not assume that a lack of user input means idle: cron jobs tend to run, public services get taxed from time to time, etc. Idle remains your best guess based on reading the values (or perhaps more) that I listed above.
EDIT 2
Looking at the knob values in /proc/sys/vm also gives you a good indication of what the user thinks is idle, in particular swappiness. I realize you're doing this only on your own box, but this is an authoritative wiki and the question title is generic :)
The file /proc/loadavg has the system's current load. You can just write a bash script to check it, and if it's low, run the command (a sketch of that check follows the example output below). Then you can add it to /etc/cron.d to run periodically.
This file contains information about the system load. The first three numbers represent the number of active tasks on the system (processes that are actually running) averaged over the last 1, 5, and 15 minutes. The next entry shows the instantaneous current number of runnable tasks (processes that are currently scheduled to run rather than being blocked in a system call) and the total number of processes on the system. The final entry is the process ID of the process that most recently ran.
Example output:
0.55 0.47 0.43 1/210 12437
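A minimal sketch of that check (in Python rather than a bash script, with an assumed load threshold), suitable for running from cron:

import subprocess

LOAD_THRESHOLD = 0.5   # assumed definition of "low enough"

with open("/proc/loadavg") as f:
    one_minute_load = float(f.read().split()[0])   # e.g. "0.55 0.47 0.43 1/210 12437"

if one_minute_load < LOAD_THRESHOLD:
    # Placeholder command; nice keeps it at the lowest CPU priority.
    subprocess.run(["nice", "-n", "19", "echo", "running the heavy job"])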
If you're using GNOME then take look at this:
https://wiki.gnome.org/Attic/GnomeScreensaver/FrequentlyAskedQuestions
See this thread for a perl script that checks when the system is idle (through gnome screensaver).
You can run commands when idling starts and stops.
I'm using this with some scripts to change BOINC preferences when idling
(to give BOINC more memory and cpu usage).
perl script on ubuntu forums
You can use the xprintidle command to find out whether the user is idle. It prints the number of milliseconds since the last interaction with the X server.
Here is a sample script which can start/stop tasks when the user goes away:
#!/bin/bash
# Wait until the user has been idle for USER_MIN_IDLE_TIME milliseconds.
# If that doesn't happen within WAIT_FOR_USER_IDLE milliseconds, exit with an error.
WAIT_FOR_USER_IDLE=60000
# Minimal number of milliseconds of no input after which the user is considered "idle"
USER_MIN_IDLE_TIME=3000

END=$(($(date +%s%N)/1000000+WAIT_FOR_USER_IDLE))
while [ $(($(date +%s%N)/1000000)) -lt $END ]
do
    if [ $(xprintidle) -gt $USER_MIN_IDLE_TIME ]; then
        # User is idle: start the requested command in the background.
        eval "$* &"
        PID=$!
        #echo "background process started with pid = $PID"
        while kill -0 $PID >/dev/null 2>&1
        do
            if [ $(xprintidle) -lt $USER_MIN_IDLE_TIME ]; then
                # User came back: stop the background task.
                kill $PID
                echo "interrupt"
                exit 1
            fi
            sleep 1
        done
        echo "success"
        exit 0
    fi
    sleep 1
done
# The user never went idle within the allotted time.
exit 1
The script takes all of its arguments and executes them as a command once the user is idle. If the user then interacts with the X server, the running task is killed with the kill command.
One restriction: the task you run should not itself interact with the X server, otherwise it will be killed immediately after it starts.
I wanted something like xprintidle, but it didn't work in my case (Ubuntu 21.10, Wayland).
I used the following solution to get the current idle value (time with no mouse/keyboard input):
dbus-send --print-reply --dest=org.gnome.Mutter.IdleMonitor /org/gnome/Mutter/IdleMonitor/Core org.gnome.Mutter.IdleMonitor.GetIdletime
It should return uint64 time in milliseconds. Example:
$ sleep 3; dbus-send --print-reply --dest=org.gnome.Mutter.IdleMonitor /org/gnome/Mutter/IdleMonitor/Core org.gnome.Mutter.IdleMonitor.GetIdletime
method return time=1644776247.028363 sender=:1.34 -> destination=:1.890 serial=9792 reply_serial=2
uint64 2942 # i.e. 2.942 seconds without input
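If you want to use that value from a script, here is a minimal sketch that shells out to the same dbus-send call and parses the uint64 reply (the 5-second threshold is an assumption):

import re
import subprocess
import time

CMD = ["dbus-send", "--print-reply",
       "--dest=org.gnome.Mutter.IdleMonitor",
       "/org/gnome/Mutter/IdleMonitor/Core",
       "org.gnome.Mutter.IdleMonitor.GetIdletime"]

def idle_ms():
    out = subprocess.run(CMD, capture_output=True, text=True, check=True).stdout
    return int(re.search(r"uint64 (\d+)", out).group(1))   # milliseconds since last input

while True:
    if idle_ms() > 5000:   # assumed threshold: 5 s without mouse/keyboard input
        print("user looks idle, start the background task here")
        break
    time.sleep(1)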
