I am coding a 5-state process model (new, ready, running, blocked, exit). For this I created a LinkedList which contains the processes that are ready to run. For example, if I have the processes 1, 2, 3, 4, 5, it runs the 1st, then the 2nd, and while the 3rd is running the user pushes a button and blocks that process for 5 seconds. In the meantime the following process (the 4th) runs; it doesn't wait until the third process is unblocked. My problem is that I don't know whether I should use two threads for this, one for the processes that are running and the other for the blocked process, or whether it is possible to use only one thread.
You could use only a single thread if you use cooperative multitasking, where your process code periodically yields to permit other processes to run, or if you let each process run until it completes or blocks before letting another process in (or back in).
If it's important that the 3rd process restart after exactly 5 seconds, and if it's okay for it to continue running in parallel with an existing process also running, you might want to use two - or more - threads.
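To make the single-thread option concrete, here is a minimal sketch of cooperative multitasking in Python, where each process is a generator that yields after every step. Everything in it is an assumption made for illustration: the names `process` and `schedule`, the use of a simulated tick counter instead of real seconds, and the choice to give every process two steps of work. It is not the asker's LinkedList code, just one way the ready and blocked queues could interact.

```python
from collections import deque

def process(steps, block_at=None, block_ticks=0):
    """A cooperative process: it yields after every step so others can run."""
    for step in range(steps):
        if step == block_at:
            yield ("block", block_ticks)   # e.g. the user pressed the button
        else:
            yield ("ran", 0)               # one unit of work done, give up the CPU

def schedule(procs):
    """Single-threaded scheduler (hypothetical). procs maps pid -> generator."""
    ready = deque(procs.items())
    blocked = []                           # (wake_tick, pid, generator)
    trace, tick = [], 0
    while ready or blocked:
        # blocked -> ready for every process whose wait is over
        for entry in [b for b in blocked if b[0] <= tick]:
            blocked.remove(entry)
            ready.append((entry[1], entry[2]))
        if ready:
            pid, gen = ready.popleft()     # ready -> running
            try:
                event, arg = next(gen)
            except StopIteration:
                trace.append((tick, pid, "exit"))
            else:
                trace.append((tick, pid, event))
                if event == "block":
                    blocked.append((tick + arg, pid, gen))
                else:
                    ready.append((pid, gen))
        tick += 1                          # simulated clock, not wall time
    return trace

# Five processes with two steps each; process 3 blocks for 5 ticks on its first step.
procs = {pid: process(2, block_at=0 if pid == 3 else None, block_ticks=5)
         for pid in range(1, 6)}
trace = schedule(procs)
exit_order = [pid for _, pid, event in trace if event == "exit"]
print(exit_order)
```

In this sketch process 4 is dispatched on the tick right after process 3 blocks, so nothing waits for the blocked process. The wake-up is only as precise as the scheduler loop, though: process 3 becomes ready again after 5 ticks but still has to queue behind whatever is already ready.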
I'm trying to create a timed scheduler that can execute tasks in parallel. For example:
Let's say I'm trying to create a function that will do something after 10 seconds of being called. After calling Process_1(), it will be expected to run its intended functionality after 10 seconds.
But at the 5-second mark, while Process_1() is halfway through its wait, I call Process_2(). So at the 10-second mark Process_1() will execute its function, and at the 15-second mark Process_2() will execute its function.
I've tried using node-cron for this but it doesn't seem like it can schedule things in parallel. Thanks in advance!
Nodejs runs your Javascript in a single thread unless you explicitly create a WorkerThread and run some code in that. True parallel execution, where both jobs are running code that uses the CPU, will only be accomplished if you run each task in a WorkerThread or a child process to get it out of the main thread.
Let me repeat, true parallel execution requires more than one thread or process in nodejs and nodejs does not do that by default so you will have to create a WorkerThread or child_process.
So, if you have code that takes more than a few ms to do its work and you want it to run at a fairly precise time, then you can't count on the main Javascript thread to do that because it might be busy at that precise time. Timers in Javascript will run your code no earlier than the scheduled time, and when that scheduled time comes around, the event loop is ready to run them, but they won't actually run until whatever was running before finishes and returns control back to the event loop so the event loop can run the code attached to your timer.
So, if you're mostly doing I/O kind of work (reading/writing files or network), then your actual Javascript execution time is probably only milliseconds and nodejs can be very, very responsive to run your timers pretty close to "on time". But, if you have computationally expensive things that keep the CPU busy for much longer, then you can't count on your timers to run "on time" if you run that CPU-heavy stuff in the main thread.
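The same effect can be reproduced in any single-threaded event loop. The sketch below uses Python's asyncio rather than nodejs, so it is an analogy and not nodejs code; the function name `measure_timer_delay` and the durations are made up for illustration, and `time.sleep` stands in for CPU-heavy work that never returns control to the loop.

```python
import asyncio
import time

async def measure_timer_delay(timer_s, busy_s):
    """Schedule a timer, hog the loop for busy_s seconds, report when the timer fired."""
    loop = asyncio.get_running_loop()
    start = time.monotonic()
    fired = loop.create_future()
    # The callback cannot run before the loop gets control back.
    loop.call_later(timer_s, lambda: fired.set_result(time.monotonic() - start))
    time.sleep(busy_s)   # stand-in for CPU-bound work on the main thread
    return await fired

on_time = asyncio.run(measure_timer_delay(0.05, 0.0))   # loop is idle
late = asyncio.run(measure_timer_delay(0.05, 0.2))      # loop is busy past the deadline
print(f"idle loop: fired after {on_time:.3f}s, busy loop: fired after {late:.3f}s")
```

With an idle loop the 50 ms timer fires close to 50 ms. With the loop kept busy for 200 ms it cannot fire before the 200 ms mark, however early it was due.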
What you can do, is start up a WorkerThread, set the timer in the WorkerThread and run your code in the worker thread. As long as you don't ask that WorkerThread to run anything else, it should be ready to run that timer pretty much "on time".
Now WorkerThreads do share some resources with the main thread so they aren't 100% independent (though they are close to independent). If you want 100% independence, then you can start a nodejs child process that runs a node script, sets its own timers and runs its own work in that other process.
All that said, the single threaded model works very, very well at reasonably high scale for code that is predominantly I/O code because nodejs uses non-blocking I/O so while it's waiting to read or write from file or network, the main thread is free and available to run other things. So, it will often give the appearance of running things in parallel because progress is being made on multiple fronts. The I/O itself inside the nodejs library is either natively non-blocking (network I/O) or is happening in an OS-native thread (file I/O) and the programming interface to Javascript is callback or promise based so it is also non-blocking.
I mention all this because you don't say what your two operations that you want to run in parallel are (including your actual code allows us to write more complete answers). If they are I/O or even some crypto, then they may already be non-blocking and you may achieve desired parallelism without having to use additional threads or processes.
In Linux & C, will not waiting (waitpid) for a fork-execve launched process create zombies?
What is the correct way to launch a new program (many times) without waiting and without resource leaks?
It would also be launched from a 2nd worker thread.
Can the first program terminate first cleanly if launched programs have not completed?
Additional: In my case I have several threads that can fork-execve processes at ANY TIME and THE SAME TIME -
1) Some I need to wait for completion and want to report any errors codes with waitpid
2) Some I do not want to block the thread for, but would like to report errors
3) Some I don't want to wait and don't care about the outcome and could run after the program terminates
For #2, should I have to create an additional thread to do waitpid ?
For #3, should I do a fork-fork-execve and would ending the 1st fork cause the 2nd process to get cleaned up (no zombie) separately via init ?
Additional: I've read briefly (not sure I understand all) about using nohup, double fork, setpgid(0,0), signal(SIGCHLD, SIG_IGN).
Doesn't global signal(SIGCHLD, SIG_IGN) have too many side effects like getting inherited (or maybe not) and preventing monitoring other processes you do want to wait for ?
Wouldn't relying on init to cleanup resources leak while the program continues to run (weeks in my case)?
In Linux & C, will not waiting (waitpid) for a fork-execve launched process create zombies?
Yes, they become zombies after death.
What is the correct way to launch a new program (many times) without waiting and without resource leaks? It would also be launched from a 2nd worker thread.
Set SIGCHLD to SIG_IGN.
Can the first program terminate first cleanly if launched programs have not completed?
Yes, orphaned processes will be adopted by init.
I ended up keeping an array of just the fork-exec'd pids I did not wait for (other fork-exec'd pids do get waited on) and periodically scanned the list using
waitpid( pids[xx], &status, WNOHANG ) != 0
which gives me a chance to report the outcome and avoid zombies.
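The polling approach described above can be sketched as follows. This is Python rather than C, using `os.fork` and `os.waitpid`, which wrap the same system calls. The helper names `spawn` and `reap_finished` are hypothetical, and the child simply sleeps and exits where a real program would call execve.

```python
import os
import time

def spawn(exit_code, delay):
    """Fork a child that sleeps and exits (a stand-in for fork-execve)."""
    pid = os.fork()
    if pid == 0:
        time.sleep(delay)
        os._exit(exit_code)
    return pid

def reap_finished(pids):
    """Poll every tracked pid without blocking; return {pid: exit_code} for finished ones."""
    finished = {}
    for pid in list(pids):
        done_pid, status = os.waitpid(pid, os.WNOHANG)
        if done_pid != 0:                      # 0 means the child is still running
            if os.WIFEXITED(status):
                finished[pid] = os.WEXITSTATUS(status)
            else:
                finished[pid] = -os.WTERMSIG(status)   # killed by a signal
            pids.remove(pid)                   # reaped: no zombie remains
    return finished

def run_demo():
    pids = [spawn(0, 0.05), spawn(3, 0.10)]
    expected = {pids[0]: 0, pids[1]: 3}
    results = {}
    while pids:                                # the "periodic scan"
        results.update(reap_finished(pids))
        time.sleep(0.02)
    return results == expected, results
```

Because only explicitly tracked pids are polled, this does not interfere with other code in the same program that waits on its own children.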
I avoided using global things like signal handlers that might affect other code elsewhere.
It seemed a bit messy.
I suppose that fork-fork-exec would be an alternative to asynchronously monitor the other program's completion by the first fork, but then the first fork needs cleanup.
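For comparison, a minimal sketch of the fork-fork variant in Python (the same system calls as in C). The name `launch_detached` is made up for illustration, and the grandchild runs a Python callable where a real program would call execve. The first fork exits immediately, so waiting for it is brief, and the orphaned grandchild is reaped by init (or a subreaper) rather than by this program.

```python
import os
import time

def launch_detached(work):
    """Run work() in a grandchild that this process never has to wait for."""
    pid = os.fork()
    if pid == 0:
        # Intermediate child: fork the real worker, then exit right away.
        if os.fork() == 0:
            try:
                work()
            finally:
                os._exit(0)
        os._exit(0)
    # Cleanup of the first fork: it exits almost immediately, so this wait is short.
    os.waitpid(pid, 0)

def run_demo():
    read_fd, write_fd = os.pipe()

    def work():
        os.close(read_fd)
        time.sleep(0.1)
        os.write(write_fd, b"done")

    launch_detached(work)
    os.close(write_fd)
    data = os.read(read_fd, 16)   # blocks until the grandchild writes
    os.close(read_fd)
    return data
```

The trade-off is the one mentioned above: the parent no longer learns the grandchild's exit status through waitpid, so any outcome has to be reported over another channel, such as the pipe in this demo.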
In Windows, you just keep a handle to the process open if you want to check status without worry of pid reuse, or close the handle if you don't care what the other process does.
(In Linux, there seems no way for multiple threads or processes to monitor the status of the same process safely, only the parent process-thread can, but not my issue here.)
The code below has two threads starting at the same time.
How can I start and stop two threads at the same time? When the first thread finishes executing, the second should stop its processing.
For example, I want to process a large file with one thread and show a GIF using another thread in JavaFX.
I can use a latch to start the two threads, but how do I stop them at the same time?
For example, let us assume that in my operating system a context switch to another process occurs after 100μs of execution time. Furthermore, my computer has only one processor with one thread of execution possible.
If I have Process A, which contains only one thread of execution, and Process B, which has four threads of execution, will this mean that the thread in Process A will run for 100μs and Process B will also run for 100μs, but split the execution time between its threads before context switching?
Process A: ran for 100μs
Thread 1 in Process A execution time: 100μs
Process B: ran for 100μs
Thread 1 in Process B execution time: ~25μs
Thread 2 in Process B execution time: ~25μs
Thread 3 in Process B execution time: ~25μs
Thread 4 in Process B execution time: ~25μs
Would the above be correct?
Moreover, would this be different if I had a quad-core processor? If I had a quad-core processor, would this potentially mean each thread could run for 100μs across all the cores?
It all really depends on what you are doing within the process / processing in each thread. If the process you are trying to run can benefit from splitting over threads, like for example, making calls to a web service for processing (since a web service can accept multiple calls at once and execute them separately), then no... the single thread will take longer to process than the 4 threads simply because it is executing the calls linearly instead of simultaneously.
On the other hand, if you are executing a process / code that does not benefit from thread splitting, then the time to finish all 4 processing threads will be the same on a single core.
However, in most cases, splitting the processing into threads should take less time than executing it on a single thread, if you do it right.
The matter of cores doesn't factor in here unless you are attempting to run more threads than one core can handle, in which case the OS will run the extra threads on a separate core.
This link explains a bit more the situation with Cores and Hyper-Threading...
http://www.howtogeek.com/194756/cpu-basics-multiple-cpus-cores-and-hyper-threading-explained/
Thread switches always happen on the same interval regardless of process ownership. So if it's 100μs, then it's always 100μs, unless of course the thread itself surrenders execution. When this thread is going to run again is where things get complicated.
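As a worked example of that claim, here is a small Python simulation. It assumes a plain round-robin scheduler on one core that hands a fixed 100μs quantum to each thread in turn and ignores process ownership. Real schedulers use priorities and more elaborate fairness rules, so treat this only as an illustration of the arithmetic; the function name `round_robin` is made up.

```python
from collections import deque

def round_robin(threads, quantum_us, total_us):
    """Simulate one core giving each runnable thread a fixed quantum in turn."""
    queue = deque(threads)
    cpu_time = {t: 0 for t in threads}
    elapsed = 0
    while elapsed < total_us:
        thread = queue.popleft()
        run = min(quantum_us, total_us - elapsed)
        cpu_time[thread] += run
        elapsed += run
        queue.append(thread)       # back of the queue, whoever owns the thread
    return cpu_time

# Process A has one thread, process B has four.
threads = [("A", 1), ("B", 1), ("B", 2), ("B", 3), ("B", 4)]
usage = round_robin(threads, quantum_us=100, total_us=1000)

per_process = {}
for (proc, _), used in usage.items():
    per_process[proc] = per_process.get(proc, 0) + used
print(usage)
print(per_process)
```

Under this assumption every thread gets the full 100μs per turn. Over 1000μs each of the five threads accumulates 200μs, so process B receives 800μs in total against 200μs for process A, rather than B's threads sharing a single 100μs slice at ~25μs each.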
If a process pool is created with 10 processes but my program only uses 4 of them, it means there are 6 idle processes.
To use a process pool, the pseudo code is generally like:
pool = create_process_pool(M)
for i in 1:N:
    pool.run(task i)
pool.wait()
pool.close()
How does the pool decide when pool.wait() can return?
There are some cases:
1. If M > N, for example M=10, N=6, then there are 4 idle processes. The 6 used processes can inform pool.wait() when they finish running and exit, but since the 4 idle processes never ran, how can they inform pool.wait() that they have finished?
2. If M < N, a process that finishes a task may be reused for another task. So how can this process know that it will get no more tasks, and so inform pool.wait()?
Can anyone explain a bit how a process pool works in this regard?
Thanks!
You could implement a process pool (e.g. in C++) with:
some Process class (in particular, knowing the pid of each fork-ed process), which would have an empty instance (whose pid would be 0);
some global array of Process-es;
a Command class representing a command to be started (when possible) in the process pool;
a std::deque<Command> of commands, where a Command would fire some Process when possible;
an event loop taking account of SIGCHLD: when a SIGCHLD occurs, you would waitpid with WNOHANG and get the pid of the ended Process, so you can find the actual Process instance and do whatever is needed; that event loop would probably also pop Command-s to run (so would start non-idle Process-es), manage pipes, etc.
Then idle processes would just be represented by a Process slot with a zero pid; no need to fork it explicitly. So they won't be unix processes.... just some internal representation in the process pool software.
My point is that a process pool mechanism doesn't (necessarily) have to start idle processes (with the fork system call). It could maintain a pool of process descriptors, and mark the descriptor specially for idle slots. That process descriptor could actually be a pid_t, with empty slots holding (pid_t)0, which is never the pid of any real Unix process. So there is no need to create processes in advance (only lazily, as necessary). Hence, no need for idle processes.
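A minimal sketch of that idea, written in Python rather than C++ to keep it short. The name `run_pool` and the polling loop are assumptions made for illustration: a real implementation would more likely react to SIGCHLD in an event loop, as described above, instead of sleeping between scans. Tasks here are plain callables returning an exit code.

```python
import os
import time
from collections import deque

def run_pool(tasks, max_procs):
    """Run each task in its own forked process, at most max_procs at a time.
    Returns ({task_index: exit_code}, peak number of live processes)."""
    slots = [0] * max_procs            # pid 0 marks an idle slot; nothing is forked for it
    running = {}                       # pid -> task index
    pending = deque(enumerate(tasks))  # commands not started yet
    results, peak = {}, 0
    while pending or running:          # "wait" is over when both are empty
        # Fill idle slots lazily, only while commands are waiting.
        for i, pid in enumerate(slots):
            if pid == 0 and pending:
                index, task = pending.popleft()
                child = os.fork()
                if child == 0:
                    try:
                        code = task()
                    except BaseException:
                        code = 1
                    os._exit(code)
                slots[i] = child
                running[child] = index
        peak = max(peak, len(running))
        # Reap finished children without blocking and free their slots.
        for i, pid in enumerate(slots):
            if pid != 0:
                done, status = os.waitpid(pid, os.WNOHANG)
                if done != 0:
                    results[running.pop(pid)] = os.WEXITSTATUS(status)
                    slots[i] = 0
        time.sleep(0.01)
    return results, peak

def make_task(code):
    def task():
        time.sleep(0.02)
        return code
    return task
```

This covers both cases from the question. With M > N the unused slots stay at pid 0 and never need to report anything. With M < N a freed slot is simply refilled from the queue, and the pool is finished once the queue is empty and no slot holds a live pid.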
I strongly suggest taking a few hours to read Advanced Linux Programming. It will teach you better than I could in a few minutes.
As an example, look at the Unix (or GNU) batch (and at) command. It does not use any idle process. And it does manage a pool of process queues. It is free software, so you can study (and improve) its source code.