Concise-ish problem explanation:
I'd like to be able to run multiple (we'll say a few hundred) shell commands, each of which starts a long running process and blocks for hours or days with at most a line or two of output (this command is simply a job submission to a cluster). This blocking is helpful so I can know exactly when each finishes, because I'd like to investigate each result and possibly re-run each multiple times in case they fail. My program will act as a sort of controller for these programs.
for all commands in parallel {
    submit_job_and_wait()
    tries = 1
    while ! job_was_successful and tries < 3 {
        resubmit_with_extra_memory_and_wait()
        tries++
    }
}
What I've tried/investigated:
I was so far thinking it would be best to create a thread for each submission which just blocks waiting for input. There is enough memory for quite a few waiting threads. But from what I've read, perl threads are closer to duplicate processes than in other languages, so creating hundreds of them is not feasible (nor does it feel right).
There also seem to be a variety of event-loop-ish cooperative systems like AnyEvent and Coro, but these seem to require you to rely on asynchronous libraries, otherwise you can't really do anything concurrently. I can't figure out how to make multiple shell commands with it. I've tried using AnyEvent::Util::run_cmd, but after I submit multiple commands, I have to specify the order in which I want to wait for them. I don't know in advance how long each submission will take, so I can't recv without sometimes getting very unlucky. This isn't really parallel.
my $cv1 = run_cmd("qsub -sync y 'sleep $RANDOM'");
my $cv2 = run_cmd("qsub -sync y 'sleep $RANDOM'");
# Now should I $cv1->recv first or $cv2->recv? Who knows!
# Out of 100 submissions, I may have to wait on the longest one before processing any.
My understanding of AnyEvent and friends may be wrong, so please correct me if so. :)
The other option is to run the job submission in its non-blocking form and have it communicate its completion back to my process, but the inter-process communication required to accomplish and coordinate this across different machines daunts me a little. I'm hoping to find a local solution before resorting to that.
Is there a solution I've overlooked?
You could instead use scientific workflow software such as FireWorks or Pegasus, which is designed to help scientists submit large numbers of computing jobs to shared or dedicated resources. These tools can do much more than that, so they might be overkill for your problem, but they are still worth a look.
If your goal is to find the tightest memory requirements for your job, you could also simply submit it with a large amount of requested memory and then extract the actual memory usage from accounting (qacct) or, cluster policy permitting, log on to the compute node(s) where your job is running and view the memory usage with top or ps.
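For what it's worth, the controller loop from the question can be sketched with one lightweight thread per blocking command. This is a minimal sketch in Python rather than the asker's Perl; `run_with_retries` and `run_all` are hypothetical names, and each attempt is assumed to be an argv list (for example a `qsub -sync y ...` call, with later attempts requesting more memory) whose exit status signals success.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_with_retries(attempts):
    """Run each attempt in turn until one exits with status 0.

    `attempts` is a list of argv lists: the first submission followed by
    the resubmissions. Returns (tries, succeeded).
    """
    tries = 0
    for argv in attempts:
        tries += 1
        # Blocks this worker thread only; the other jobs keep running.
        result = subprocess.run(argv, capture_output=True, text=True)
        if result.returncode == 0:
            return tries, True
    return tries, False


def run_all(jobs, max_workers=200):
    """`jobs` maps a job name to its list of attempts."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(run_with_retries, attempts): name
            for name, attempts in jobs.items()
        }
        # as_completed yields futures in the order they finish,
        # not in the order they were submitted.
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results
```

Because `as_completed` hands back whichever job finishes first, there is no need to decide in advance which one to wait on. The threads spend nearly all their time blocked on the child process, so a few hundred of them should be cheap.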
I have a question for you, and it's not as specific as usual, which could make it a little annoying to answer.
The tool I'm working with is Camunda in combination with Groovy scripts, and the goal is to reduce the maximum CPU load (peak load). I'm doing this by "stretching" the workload over a certain time frame, since the platform seems to be unhappy with huge workload inputs in a short amount of time. The resulting problem is that Camunda won't react smoothly when someone tries to operate it at the UI level.
So I wrote a small script which lets each individual process determine its own "time to sleep" before running if a certain threshold is exceeded. This is based on how many processes are trying to run at the same time as the individual process.
It looks like:
Process wants to start -> Process asks how many other processes are running ->
waitingTime = numberOfProcesses * timeToSleep * iterationOfMeasures
[Figure: CPU usage. Curves 1 and 3 are without the script; curves 2 and 4 are with the script.]
Testing it, I saw that I could stretch the workload and smooth out the UI levels. But now I need to describe exactly why this works.
The questions are:
What does a sleep method do exactly?
What does the sleep method do at the CPU level?
How does an OS scheduler react to a sleep method?
Namely: does the scheduler reschedule, or does it simply "wait" for the time given?
How can I recreate and test the questions above?
The main goal is not for you to answer this, but could you give me a hint for finding the right literature to answer these questions? Maybe you remember a book which helped you understand this kind of thing, or a professor recommended something to you. (Mine won't answer, and I can't blame him.)
I'm grateful for hints and/or recommendations!
I'm sure you could use a timer event:
https://docs.camunda.org/manual/7.15/reference/bpmn20/events/timer-events/
It lets you postpone the trigger of the next task for a period defined by an expression.
About sleep in Java/Groovy: https://www.javamex.com/tutorials/threads/sleep.shtml
Using sleep blocks the current thread in Groovy/Java/Camunda,
so instead of doing something useful, the thread is just blocked.
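One way to recreate and test the sleep behaviour is to compare wall-clock time with CPU time across a sleep call. This is a minimal sketch in Python rather than Groovy, and `measure_sleep` is a hypothetical helper. On mainstream operating systems a sleeping thread is taken off the run queue until its timer expires, so it should consume almost no CPU time while it waits.

```python
import time


def measure_sleep(seconds):
    """Return (wall_seconds, cpu_seconds) spent across one sleep call."""
    wall_start = time.perf_counter()   # wall-clock time
    cpu_start = time.process_time()    # CPU time used by this process
    time.sleep(seconds)                # the thread is blocked, not spinning
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


if __name__ == "__main__":
    wall, cpu = measure_sleep(1.0)
    print(f"wall: {wall:.3f}s  cpu: {cpu:.3f}s")
```

If the CPU figure stays near zero while the wall-clock figure matches the requested delay, the sleep is only delaying the work, which would be consistent with the load being stretched over time rather than reduced.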
I need to run different Python processes, in a certain order of priority.
Specifically, I have 3 processes, and I need them to work this way:
An object detection script, used to locate a person and their position. I need this one to run continuously at a high FPS;
another process that, once some conditions are met (when the person is present in the picture in the required position) starts taking screenshots of the image for a certain amount of time;
another script that analyzes the screenshots taken by the second one.
I wrote the 3 scripts already and they work fine, but the problem is that process 3 is particularly computationally demanding, and I don't want it to prevent processes 1 and 2 from running smoothly.
My idea is that I could give highest priority to process 1, and send screenshots taken by process 2...to a queue, or something like this.
When the person is not detected in the picture, I could run process 3, and empty the queue as the screenshots are analyzed. However, script 3 should still run with limited resources, so that FPS of script 1 isn't affected too much, and it can still detect if the person enters the picture again.
I'm afraid this might all be a little vague, but could you please suggest a way or tool I could use to manage the processes this way?
So far, I tried simply saving the screenshots to a folder, but I don't know how to limit the resource usage of process 3.
I'm familiar with the basic usage of Docker, so I was thinking that maybe I could:
run the processes in different containers, limiting resources allocated to the 3rd one (?);
use a message broker (Kafka, RabbitMQ?) to store screenshots;
but again, I'm a newbie when it comes to this stuff (speaking of which, I hope I tagged this question correctly), so I don't know if it's an efficient way to do this (or if it can be done this way, for that matter).
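Before reaching for containers, one lighter option on a Unix-like system is to start process 3 with a higher nice value, so the scheduler favours processes 1 and 2 whenever they compete for the CPU. This is a minimal sketch under that assumption; `run_low_priority` is a hypothetical helper, and `os.nice` is not available on Windows.

```python
import os
import subprocess


def run_low_priority(argv, niceness=10):
    """Run argv as a child process with its nice value raised by `niceness`.

    A higher nice value means a lower scheduling priority (Unix only).
    """
    return subprocess.run(
        argv,
        # Runs in the child just before exec, so only the child is affected.
        preexec_fn=lambda: os.nice(niceness),
        capture_output=True,
        text=True,
    )
```

The analysis script would then be launched with something like `run_low_priority([sys.executable, "analyze.py"])`, where `analyze.py` is a placeholder name for script 3. Note that nice only affects CPU scheduling; it does not cap memory or GPU usage.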
I have multiple bash scripts that I have tried to "parallelize" within a master bash script.
Bash Script:
#!/bin/bash
SHELL=/bin/bash
bash /home/.../a.sh &
bash /home/.../b.sh &
wait
bash /home/.../c.sh &
bash /home/.../d.sh &
bash /home/.../e.sh &
wait
echo "Done paralleling!"
exit 0
I have run the script normally (without ampersands) and with ampersands and I am not seeing any appreciable difference in processing time, leading me to believe that something may not be coded correctly/the most efficient way.
In classic computer-science theory, resource-contention is referred to as "thrashing."
(In the good ol' days, when a 5-megabyte disk drive might be the size of a small washing machine, we used to call it "Maytag Mode," since the poor thing looked like a Maytag washing-machine on the "spin" cycle!)
If you graph the performance curve caused by contention, it slopes upward, then abruptly has an "elbow" shape: it goes straight up, exponentially. We call that, "hitting the wall."
An interesting thing to fiddle-around-with on this script (if you're just curious ...) is to put wait statements at several places. (Be sure you're doing this correctly ...) Allow, say, two instances to run, wait for all of them to complete, then three more, and so on. See if that's usefully faster, and, if it is, try three. And so on. You may find a "sweet spot."
Or ... not. (Don't spend too much time with this. It doesn't look like it's going to be worth it.)
You're likely correct. The thing with parallelism is that it allows you to grab multiple resources to use in parallel. That improves your speed if - and only if - that resource is your limiting factor.
So - for example - if you're reading from a disk - odds are good that the action of reading from disk is what's limiting you, and doing more in parallel doesn't help - and indeed, because of contention can slow the process down. (The disk has to seek to service multiple processes, rather than just 'getting on' and serialising a read).
So it really does boil down to what your script actually does and why it's slow. And the best way of checking that is by profiling it.
At a basic level, something like truss or strace might help.
e.g.
strace -fTtc /home/../e.sh
And see what types of system calls are being made, and how much of the total time they're consuming.
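A cruder check than strace is to compare how long a script takes on the wall clock with how much CPU time it actually used. This is a minimal sketch in Python; `profile_command` is a hypothetical helper. If the CPU time is far below the wall time, the script is mostly waiting (on disk, network, or sleep), and running more copies in parallel will not necessarily help.

```python
import os
import subprocess
import time


def profile_command(argv):
    """Run argv and return (wall_seconds, child_cpu_seconds)."""
    before = os.times()
    start = time.perf_counter()
    subprocess.run(argv, check=False)
    wall = time.perf_counter() - start
    after = os.times()
    # User plus system CPU time accumulated by waited-for child processes.
    cpu = (after.children_user - before.children_user) + (
        after.children_system - before.children_system
    )
    return wall, cpu
```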
I've been learning some lua for game development. I heard about coroutines in other languages but really came up on them in lua. I just don't really understand how useful they are, I heard a lot of talk how it can be a way to do multi-threaded things but aren't they run in order? So what benefit would there be from normal functions that also run in order? I'm just not getting how different they are from functions except that they can pause and let another run for a second. Seems like the use case scenarios wouldn't be that huge to me.
Anyone care to shed some light as to why someone would benefit from them?
Especially insight from a game programming perspective would be nice^^
OK, think in terms of game development.
Let's say you're doing a cutscene or perhaps a tutorial. Either way, what you have are an ordered sequence of commands sent to some number of entities. An entity moves to a location, talks to a guy, then walks elsewhere. And so forth. Some commands cannot start until others have finished.
Now look back at how your game works. Every frame, it must process AI, collision tests, animation, rendering, and sound, among possibly other things. You can only think every frame. So how do you put this kind of code in, where you have to wait for some action to complete before doing the next one?
If you built a system in C++, what you would have is something that ran before the AI. It would have a sequence of commands to process. Some of those commands would be instantaneous, like "tell entity X to go here" or "spawn entity Y here." Others would have to wait, such as "tell entity Z to go here and don't process anymore commands until it has gone here." The command processor would have to be called every frame, and it would have to understand complex conditions like "entity is at location" and so forth.
In Lua, it would look like this:
-- Commands that take effect immediately.
local entityX = game:GetEntity("entityX");
entityX:GoToLocation(locX);
local entityY = game:SpawnEntity("entityY", locY);
local entityZ = game:GetEntity("entityZ");
entityZ:GoToLocation(locZ);
-- Yield once per frame until entityZ has arrived.
repeat
    coroutine.yield();
until (entityZ:isAtLocation(locZ));
return;
On the C++ side, you would resume this script once per frame until it is done. Once it returns, you know that the cutscene is over, so you can return control to the user.
Look at how simple that Lua logic is. It does exactly what it says it does. It's clear, obvious, and therefore very difficult to get wrong.
The power of coroutines is in being able to partially accomplish some task, wait for a condition to become true, then move on to the next task.
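To see both halves of that arrangement in one place, here is a minimal sketch using a Python generator in place of a Lua coroutine. The `Entity` class, the one-dimensional positions, and `run_until_done` are all hypothetical stand-ins for a real engine; only the yield-once-per-frame pattern is the point.

```python
class Entity:
    """Toy entity that moves one unit per frame along a line."""

    def __init__(self, position):
        self.position = position
        self.target = position

    def go_to(self, target):
        self.target = target

    def update(self):
        # One frame of movement toward the target.
        if self.position < self.target:
            self.position += 1
        elif self.position > self.target:
            self.position -= 1

    def is_at(self, location):
        return self.position == location


def cutscene(entity_z, loc_z):
    """Script side: issue a command, then yield each frame until it is done."""
    entity_z.go_to(loc_z)
    while True:
        yield  # hand control back to the game loop until the next frame
        if entity_z.is_at(loc_z):
            break


def run_until_done(script, entities, max_frames=1000):
    """Engine side: resume the script once per frame; return frames elapsed."""
    for frame in range(max_frames):
        try:
            next(script)
        except StopIteration:
            return frame  # the script returned, so the cutscene is over
        for entity in entities:
            entity.update()
    raise RuntimeError("cutscene did not finish")
```

The script reads top to bottom like ordinary code, while the engine keeps full control of when each step actually runs.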
Coroutines in a game:
Easy to use, easy to screw up when used in many places.
Just be careful not to use them in many places.
Don't make your Entire AI code dependent on Coroutines.
Coroutines are good for making a quick fix when a state is introduced which did not exist before.
This is exactly what Java does with sleep() and wait().
Both functions are the best ways to make it impossible to debug your game.
If I were you I would completely avoid any code which has to use a Wait() function like a Coroutine does.
OpenGL API is something you should take note of. It never uses a wait() function but instead uses a clean state machine which knows exactly what state what object is at.
If you use coroutines, you end up with so many stateless pieces of code that it will most surely be overwhelming to debug.
Coroutines are good when you are making an application like a text editor, a bank application, a server, a database, etc. (not a game).
They are bad when you are making a game, where anything can happen at any point in time and you need to have states.
So, in my view, coroutines are a bad way of programming and an excuse to write small stateless code.
But that's just me.
It's more like a religion. Some people believe in coroutines, some don't. The use case, the implementation, and the environment together determine whether there is a benefit or not.
Don't trust benchmarks which try to prove that coroutines on a multicore CPU are faster than a loop in a single thread: it would be a shame if they were slower!
If this later runs on hardware where all cores are always under load, it will turn out to be slower. Oops...
So there is no benefit per se.
Sometimes it's convenient to use. But if you end up with tons of coroutines yielding and states that went out of scope you'll curse coroutines. But at least it isn't the coroutines framework, it's still you.
We use them on a project I am working on. The main benefit for us is that sometimes with asynchronous code, there are points where it is important that certain parts are run in order because of some dependencies. If you use coroutines, you can force one process to wait for another process to complete. They aren't the only way to do this, but they can be a lot simpler than some other methods.
I'm just not getting how different they are from functions except that
they can pause and let another run for a second.
That's a pretty important property. I worked on a game engine which used them for timing. For example, we had an engine that ran at 10 ticks a second, and you could WaitTicks(x) to wait x number of ticks, and in the user layer, you could run WaitFrames(x) to wait x frames.
Even professional native concurrency libraries use the same kind of yielding behaviour.
Lots of good examples for game developers. I'll give another in the application extension space. Consider the scenario where the application has an engine that can run a user's routines in Lua while doing the core functionality in C. If the user needs to wait for the engine to get to a specific state (e.g. waiting for data to be received), you either have to:
multi-thread the C program to run Lua in a separate thread and add in locking and synchronization methods,
abend the Lua routine and retry from the beginning with a state passed to the function to skip anything already done, lest you rerun some code that should only be run once, or
yield the Lua routine and resume it once the state has been reached in C
The third option is the easiest for me to implement, avoiding the need to handle multi-threading on multiple platforms. It also allows the user's code to run unmodified, appearing as if the function they called took a long time.
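The yield-and-resume option can be sketched with a Python generator standing in for the Lua routine and a plain function standing in for the C engine. The names `user_routine`, `drive`, and the `"need_data"` request string are hypothetical; the point is that the user's code looks like one uninterrupted function.

```python
def user_routine(log):
    """User code: reads as if waiting for data were an ordinary blocking call."""
    log.append("started")
    data = yield "need_data"  # pause here until the engine has the data
    log.append(f"got {data}")


def drive(routine, data):
    """Engine side: run the routine to its first yield, then resume it with data."""
    request = next(routine)  # runs the routine up to the yield
    # ... the engine would do its own work here until the state is reached ...
    try:
        routine.send(data)   # resume; `data` becomes the value of the yield
    except StopIteration:
        pass                 # the routine ran to completion
    return request
```

Nothing before the yield runs twice, which is what the retry-from-the-beginning option has to work around.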
I need to implement a daemon that extracts data from a database, loads the data into memory, and according to this data
performs actions like sending emails or writing/updating files. These actions need to be performed every 30 minutes.
I really don't know what to decide: compile a C++ program that will do the task, or use scripts and miscellaneous Linux tools (sed/awk).
What will be the fastest way to do this, to save CPU and memory?
The dilemma is about maintaining this process: if it's a script, it does not need compilation and I can just drop it onto any Linux/Unix machine,
but if it's native, that is harder.
What do you think?
Use cron(1) to start your program every 30 minutes.
So-called scripting languages will definitely enable you to write your program more quickly than C++. But doing this with shell and sed and/or awk, while definitely possible, is very difficult when you have to cope with all corner cases, particularly regarding string escaping (think quotes, “&”’s “;”’s…).
I suggest you go with a more full featured “scripting” language such as Perl or Python.
Why are you trying to save CPU & Memory? Are you absolutely sure this is a real requirement (or just "premature optimization")?
Unless performance is critical, there's absolutely no reason to code such a thing in C++. It seems to be a sort of maintenance process (right?). I say write it in the highest level script language you know. Python or PHP seem like good candidates. Even if you don't know these languages, it would still take you less time to familiarize yourself with them than it would take you to do it in C++.
I'd go with a Python/Perl/Ruby implementation with a cron entry to schedule the script to run every 30 minutes.
If performance becomes an issue, you can add a column to your DB that tracks the last time you ran calculations for the account and then split the processing of your records into 2, 3, or 6 groups, running one group every 15, 10, or 5 minutes respectively.
If after splitting your calculations into groups, you still have performance demands then consider C++/C/Java.
I'd still run this using cron though. No need to be a daemon unless you are providing on-demand services.
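The last-run column idea could look roughly like this. It is a minimal sketch using Python's built-in sqlite3 module; the `accounts` table, its `last_run` column (a Unix timestamp), and `process_stale` are hypothetical names, and the real work is left as a placeholder comment. Each cron invocation would call it once and exit.

```python
import sqlite3


def process_stale(conn, now, max_age=1800, batch_size=100):
    """Process up to batch_size accounts not handled in the last max_age seconds.

    Returns the ids that were processed in this run.
    """
    rows = conn.execute(
        "SELECT id FROM accounts WHERE last_run <= ? "
        "ORDER BY last_run, id LIMIT ?",
        (now - max_age, batch_size),
    ).fetchall()
    ids = [row[0] for row in rows]
    for account_id in ids:
        # Placeholder: send the email or write/update the file here.
        conn.execute(
            "UPDATE accounts SET last_run = ? WHERE id = ?",
            (now, account_id),
        )
    conn.commit()
    return ids
```

Lowering `batch_size` and running the script more often spreads the same work over the 30-minute window without turning it into a daemon.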