I'm trying to learn about synchronization and understand there are 3 conditions that need to be met for things to work properly:
1) mutual exclusion - no data is being corrupted
2) bounded waiting - a thread won't be stuck waiting forever
3) progress being made - the system as a whole is doing work, e.g. not just passing around whose turn it is
I don't fully understand why the code below doesn't work. According to my notes it has mutual exclusion but doesn't satisfy progress or bounded waiting. Why? Each thread can do something, and as long as no thread crashes, every thread will get a turn.
The following are shared variables
int turn; // initially turn = 0
turn == i: Pi can enter its critical section
The code is
do {
    while (turn != i) { } // busy-wait until it's our turn
    // critical section
    turn = j;             // j signifies process Pj, in contrast to Pi
    // remainder section
} while (true);
It's basically slide 10 of these notes.
I think the important bit is that, according to slide 6 of your notes, the 3 rules apply to the critical section of the algorithm and are exactly as follows:
Progress: If no one is in the critical section and someone wants in, then those processes not in their remainder section must be able to decide in a finite time who should go in.
Bounded Wait: All requesters must eventually be let into the critical section.
How to break it:
Pi executes and its remainder section runs indefinitely (no restriction for this)
Pj runs in its entirety, setting turn := i, so it's now Pi's turn to run the critical section.
Pi is still running its remainder which runs indefinitely.
Pj is back at its critical section but never gets to run it, since Pi never gets back to the point where it can give the turn to Pj.
That breaks the progress rule: no one is in the critical section and Pj wants in, but who goes in cannot be decided in finite time.
That breaks the bounded-wait rule: Pj will never be let back into the critical section.
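To make the scenario concrete, here is a minimal runnable sketch in Java (my own illustration, not from the slides; volatile stands in for the shared turn variable). P1's remainder section runs forever, so after the first hand-off P0 starves even though the critical section is free:

public class TurnStarvation {
    static volatile int turn = 0; // shared turn variable, initially P0's turn

    public static void main(String[] args) {
        new Thread(() -> run(0, 1)).start();
        new Thread(() -> run(1, 0)).start();
    }

    static void run(int i, int j) {
        while (true) {
            while (turn != i) { }        // wait for our turn
            System.out.println("P" + i + " in critical section");
            turn = j;                    // hand the turn to the other process
            if (i == 1) while (true) { } // P1's remainder section never ends
        }
    }
}

P0 prints at most twice and then spins forever on turn != 0: nobody is in the critical section, yet P0 can never enter.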
As Malvavisco correctly points out, if a process never releases a resource no other process will have access to it. This is an uninteresting case and typically it's considered trivial. (In practice, it turns out not to be -- which is why there's a lot of emphasis on being able to manage processes from outside, e.g. forcibly terminate a process with minimal ill effects.)
The slides are actually a little imprecise in their definitions. I find that this Wikipedia page on Peterson's algorithm (Algorithm #3 on slide 12) is more exact. Specifically:
Bounded waiting means that "there exists a bound or limit on the number of times that other processes are allowed to enter their critical sections after a process has made a request to enter its critical section and before that request is granted"
Some thought experimentation makes it pretty clear that Algorithm #1 (slide 10) fails this. There is no bound on the number of times the critical section could be entered by either process if the process-switching timing is unfortunate. Suppose process 1 executes and enters the critical section, and from then on process 2 is only scheduled while process 1 is inside its critical section. Process 1 will never account for this. Peterson's algorithm will, as process 1 forfeits its ability to enter the critical section if process 2 is waiting (and vice versa).
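For comparison, here is a compact Java sketch of Peterson's algorithm (again my own illustration; AtomicBoolean and volatile stand in for the sequentially consistent memory the algorithm assumes):

import java.util.concurrent.atomic.AtomicBoolean;

public class Peterson {
    // flag[i] == true means Pi wants to enter; turn breaks ties.
    static final AtomicBoolean[] flag = { new AtomicBoolean(), new AtomicBoolean() };
    static volatile int turn = 0;

    static void enter(int i, int j) {
        flag[i].set(true);                     // announce intent
        turn = j;                              // give the tie-break to the other process
        while (flag[j].get() && turn == j) { } // wait only while Pj wants in AND holds the tie-break
    }

    static void exit(int i) {
        flag[i].set(false);                    // forfeit: Pj may now enter even out of turn
    }
}

The exit(i) call is the forfeit mentioned above: once Pi lowers its flag, a waiting Pj gets in after at most one more pass by Pi - exactly the bound that Algorithm #1 lacks.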
I have an application that has multiple screens and a process that needs to get UI info from some and update others.
Tried many methods, but the result is always "not a JavaFX thread". Without using some kind of thread the UI does not update. Because of the multi-screen nature of the app (not practical to change) I need to fundamentally change the application architecture, which is why I am not posting any code - it's all going to change.
What I can't work out is the best way to do this, and as any changes are likely to require substantial work I am reluctant to try something that has little chance of success.
I know about Platform.runLater and tried adding that to the updates, but that was complex and did not seem to be effective.
I do have the code on GitHub - it's a personal learning project that started in Scala 2, but if you have an interest in learning or pointing out my errors I can provide access.
Hope you have enjoyed a wonderful Christmas.
PS: just made the repo public https://github.com/udsl/Processor6502
The problem is not that Platform.runLater was not working; it's because the method is being called from a loop in a thread, and without a yield the JavaFX thread never gets an opportunity to run. It just appeared to be failing - again I fell foul of an assumption.
The thread calls a method from within a loop which terminates on a condition set by the method.
The process is planned to emulate the execution of 6502 processor instructions in 2 modes, run and run-slow; run-slow is run with a short delay after each instruction execution.
The updates are to the main screen: the PC, status flags and register contents. The run (debug) screen gets the current instruction display updated, and other items will be added in the future.
The BRK instruction with a zero byte following is captured and sets the execution mode to single-step, essentially being a breakpoint, though in the future it will be possible via the debug screen to set a breakpoint and for the execution of the breakpoint to restore the original contents. This is to enable the debugging of a future hardware item - time and finances permitting - it's a hobby after all :)
It turns out that the JavaFX thread issue only happens when an FX control is written to, not when it is read from. Placing all reads and writes in a Platform.runLater was too complex, which is why I was originally searching for an alternative solution, but now that it is only needed to protect the writes it is much less of a hassle.
In the process loop, calling Thread.`yield`() enables the code in the Platform.runLater blocks to be executed on the JavaFX thread, so the UI updates without an exception.
The code in the Run method:
val thread = new Thread {
  override def run(): Unit =
    while runMode == RunMode.Running || runMode == RunMode.RunningSlow do
      executeIns
      Thread.`yield`() // give the JavaFX thread a chance to run the queued runLater blocks
      if runMode == RunMode.RunningSlow then
        Thread.sleep(50) // slow the loop down a bit
}
thread.start()
Note that because yield is a Scala reserved word, it needs to be quoted with backticks!
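For reference, the same pattern in plain Java (a sketch only; executeIns and updateRegisters are hypothetical stand-ins for the emulator step and the UI writes):

import javafx.application.Platform;

public class RunLoop {
    private volatile boolean running = true;

    void start() {
        Thread worker = new Thread(() -> {
            while (running) {
                executeIns();                             // emulate one instruction (worker thread)
                Platform.runLater(this::updateRegisters); // UI *writes* queued onto the FX thread
                Thread.yield();                           // let the FX thread drain the queued updates
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    private void executeIns() { /* hypothetical: execute one 6502 instruction */ }
    private void updateRegisters() { /* hypothetical: write PC, flags and registers to controls */ }
}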
I want to see the intrinsic difference between a thread and a long-running go block in Clojure. In particular, I want to figure out which one I should use in my context.
I understand that if one creates a go block, it is scheduled to run on a so-called thread pool, whose default size is 8. A thread, on the other hand, creates a new thread.
In my case, there is an input stream that takes values from somewhere, and each value is taken as an input. Some calculations are performed and the result is inserted into a result channel. In short, we have an input and an output channel, and the calculation is done in a loop. So as to achieve concurrency, I have two choices: either use a go block or use a thread.
I wonder what is the intrinsic difference between these two. (We may assume there is no I/O during the calculations.) The sample code looks like the following:
(go-loop []
  (when-let [input (<! input-stream)]
    ... ; calculations here
    (>! result-chan result)
    (recur))) ; recur inside when-let, so the loop ends when input-stream closes

(thread
  (loop []
    (when-let [input (<!! input-stream)]
      ... ; calculations here
      (>!! result-chan result) ; blocking put, to match the blocking take
      (recur))))
I realize the number of threads that can run simultaneously is exactly the number of CPU cores. In that case, do go blocks and threads show no difference if I am creating more than 8 threads or go blocks?
I might want to simulate the differences in performance on my own laptop, but the production environment is quite different from the simulated one, so I could draw no conclusions.
By the way, the calculation is not so heavy. If the inputs are not so large, 8,000 loops can be run in 1 second.
Another consideration is whether go-block vs thread will have an impact on GC performance.
There are a few things to note here.
Firstly, the thread pool that threads are created on via clojure.core.async/thread is what is known as a cached thread pool: although it will re-use recently used threads inside that pool, it is essentially unbounded. That of course means it could potentially hog a lot of system resources if left unchecked.
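A rough Java analogy, if it helps (not core.async's actual implementation): thread behaves like a cached pool, while go blocks share a small fixed pool:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolDemo {
    public static void main(String[] args) {
        ExecutorService cached = Executors.newCachedThreadPool(); // grows on demand, reuses idle threads
        ExecutorService fixed  = Executors.newFixedThreadPool(8); // at most 8 tasks run at once

        for (int i = 0; i < 100; i++) {
            cached.submit(PoolDemo::doWork); // may spawn up to 100 threads
            fixed.submit(PoolDemo::doWork);  // never more than 8 threads; the rest queue
        }
        cached.shutdown();
        fixed.shutdown();
    }

    static void doWork() { /* lightweight calculation */ }
}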
But given that what you're doing inside each asynchronous process is very lightweight, threads seem a little overkill to me. Of course, it's also important to take into account the quantity of items you expect to hit the input stream; if this number is large, you could potentially overwhelm core.async's thread pool for go macros, potentially to the point where we're waiting for a thread to become available.
You also didn't mention precisely where you're getting the input values from. Are the inputs some fixed data set that remains constant at the start of the program, or are inputs continuously fed into the input stream from some source over time?
If it's the former, then I would suggest you lean more towards transducers, and I would argue that a CSP model isn't a good fit for your problem, since you aren't modelling communication between separate components in your program - rather, you're just processing data in parallel.
If it's the latter then I presume you have some other process that's listening to the result channel and doing something important with those results, in which case I would say your usage of go-blocks is perfectly acceptable.
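To illustrate the first case with a Java analogy (transform is a hypothetical stand-in for your calculation): if the input is a fixed data set, plain data-parallel processing does the job without any channels:

import java.util.List;

public class ParallelTransform {
    static int transform(int x) { return x * x; } // stand-in for the real calculation

    public static void main(String[] args) {
        List<Integer> inputs = List.of(1, 2, 3, 4, 5);
        List<Integer> results = inputs.parallelStream()   // work is split across cores
                                      .map(ParallelTransform::transform)
                                      .toList();
        System.out.println(results); // [1, 4, 9, 16, 25]
    }
}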
When we run an STM expression which hits retry, the thread is blocked and the transaction is run once again when the entries are modified.
But I was wondering:
If we read an STM variable which, in the specific branch leading to retry, is not actually used, would updating it cause the transaction to be tried again?
While the thread is blocked, is it really blocked? Or is it recycled in a thread pool to be used by other potentially waiting operations?
Yes. Reading an STM variable will invoke stmReadTVar - see here. This will generate a new entry in the transaction record, and it will be checked on commit. If you take a look here, you will find that ReadTVarOp is marked as an operation with side effects (has_side_effects = True), so I don't think the compiler will eliminate it regardless of whether you use its result or not.
As @WillSewell wrote, Haskell uses green threads. You can even use STM in a single-threaded runtime without worrying that the actual OS thread will be blocked.
Re. 1: as I understand your question, yes that is correct; your entire STM transaction will have a consistent view of the world including branches composed with orElse (see: https://ghc.haskell.org/trac/ghc/ticket/8680). But I'm not sure what you mean by "but my transaction actually depends on the value of just 1 variable"; if you do a readTVar then changes to that var will be tracked.
Re. 2: you can think of green threads as lumps of saved computation state that are stored in a stack-like thing and popped off, run for a bit, and put back onto the stack when they can't make further progress for the time being ("blocked") or after they've run for long enough. The degree to which this happens in parallel depends on the number of OS threads you tell the runtime to use (via +RTS -N). You can have a concurrent program that uses thousands of green threads but is only run with a single OS thread, and that's perfectly fine.
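If a Java analogy helps (this is Java 21's virtual threads, not GHC's runtime, but the model is similar): thousands of cheap green threads can all "block" while only a few OS carrier threads do the running:

public class GreenThreads {
    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[10_000];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = Thread.ofVirtual().start(() -> {
                try {
                    Thread.sleep(100); // "blocking" parks the virtual thread;
                                       // the carrier OS thread moves on to another one
                } catch (InterruptedException ignored) { }
            });
        }
        for (Thread t : threads) t.join();
        System.out.println("10,000 green threads blocked, only a handful of OS threads used");
    }
}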
I am surprised that the Linux kernel has an infinite loop in the 'do_select' function implementation. Is this normal practice?
Also, I am interested in how file-change monitoring is implemented in the Linux kernel. Is it an infinite loop again?
select.c source code
This is not an infinite loop; that term is reserved for loops with no exit condition at all. This loop has its exit condition in the middle: http://lxr.linux.no/#linux+v3.9/fs/select.c#L482 This is a very common idiom in C. It's called "loop and a half" and there's a simple pseudocode example here: https://stackoverflow.com/a/10767975/388520 which clearly illustrates why you would want to do this. (That question talks about Java but that's not important; this is a general structured-programming idiom.)
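A minimal Java sketch of the idiom (assuming a "quit" sentinel), with the exit condition in the middle of the body:

import java.util.Scanner;

public class LoopAndAHalf {
    public static void main(String[] args) {
        Scanner in = new Scanner(System.in);
        while (true) {
            String line = in.nextLine();        // first half: get the next input
            if (line.equals("quit")) break;     // exit condition sits in the middle
            System.out.println("got: " + line); // second half: process the input
        }
    }
}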
I'm not a kernel expert, but this particular loop appears to have been written this way because the logic of the inner loop needs to run both before and after the call to poll_schedule_timeout at the very bottom of the outer loop. That code is checking whether there are any events to return; if there are already events to return when select is invoked, it's supposed to return immediately; if there aren't any initially, there will be when poll_schedule_timeout returns. So in normal operation the outer loop should cycle either 0.5 or 1.5 times. (There may be edge-case circumstances where the outer loop cycles more times than that.) I might have chosen to pull the inner loop out to its own function, but that might involve passing pointers to too many local variables around.
This is also not a spin loop, by which I mean, the CPU is not wasting electricity checking for events over and over again until one happens. If there are no events to report when control reaches the call to poll_schedule_timeout, that function (by, ultimately, calling __schedule) will cause the calling thread to block -- the CPU is taken away from that thread and assigned to another process that can do something useful with it. (If there are no processes that need the CPU, it'll be put into a low-power "halt" until the next interrupt fires.) When one of the events happens, or the timeout, the thread that called select will get "woken up" and poll_schedule_timeout will return.
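The difference, sketched in Java (wait/notify standing in for poll_schedule_timeout and the wakeup; a sketch, not kernel code):

public class BlockVsSpin {
    private volatile boolean ready = false;

    synchronized void waitBlocking() throws InterruptedException {
        while (!ready)
            wait();        // thread sleeps; the CPU goes to someone else
    }

    void waitSpinning() {
        while (!ready) { } // thread burns CPU re-checking the flag
    }

    synchronized void signal() {
        ready = true;
        notifyAll();       // wake up any blocked waiters
    }
}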
On a larger note, operating system kernels often do things that would be considered strange, poor style, or even flat-out wrong, in the service of other engineering goals (efficiency, code reuse, avoidance of race conditions that can only occur on some CPUs, ...). They are written by people who know exactly what they are doing and exactly how far they can get away with bending the rules. You can learn a lot from reading through OS code, but you probably shouldn't try to imitate it until you have a bit more experience. You wouldn't try to pastiche the style of James Joyce as your first exercise in creative writing, ne? Same deal.
I am having a problem understanding complete steps and incomplete steps in greedy scheduling in multi-threaded programming in Cilk.
Here is the PowerPoint presentation for reference.
Cilk ++ Multi-threaded Programming
The problem I have is understanding slides #32-37.
Can someone please explain, in particular:
Complete step: >= P threads ready to run
Incomplete step: < P threads ready
Thanks for your time and help
First, note that "threads" mentioned in the slides are not like OS threads as one may think. Their definition of a thread is given at slide 10: "a maximal sequence of instructions not containing parallel control (spawn, sync, return)". To avoid further confusion, let me call it a task instead.
On slides 32-35, a circle represents a task ("thread"), and edges represent dependencies between tasks. And the sentences you ask about are in fact definitions: when P or more tasks are ready to run (and so all P processors can be busy doing some work) the situation is called a complete step, while if less than P tasks are ready, the situation is called an incomplete step. To simplify the analysis, it is (implicitly) assumed that all tasks contain equal work (of size 1).
Then the theorem on slide 35 provides an upper bound on the time required for a greedy scheduler to run a program. Since the whole execution is a sequence of complete and incomplete steps, the execution time is the sum of all steps. Since each complete step performs exactly P units of work, the number of complete steps cannot be larger than T1 (total work) divided by P. Then, each incomplete step must execute a task belonging to the critical path (because at every step at least one critical-path task must be ready, and incomplete steps execute all ready tasks); so the overall number of incomplete steps does not exceed the span T_inf (critical-path length). Thus the sum T1/P + T_inf gives an upper bound on the execution time.
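A quick worked example (numbers invented for illustration): with total work T1 = 100, span T_inf = 10 and P = 4 processors, there are at most T1/P = 25 complete steps and at most T_inf = 10 incomplete steps, so a greedy scheduler finishes in at most 25 + 10 = 35 steps, compared with the perfect-speedup lower bound of T1/P = 25.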
The rest of slides in the "Scheduling Theory" section are rather straightforward.