Can a race condition occur when executing such code on Julia? - multithreading

I have a function below. Can a race condition occur when executing such code?
    function thread_test(v)
        Threads.@threads for i = 1:length(v)
            @inbounds v[i] = rand()
        end
        sum(v)
    end

If v is an Array, there will be no race condition: each thread writes to different elements, and accessing distinct array elements from different threads is safe.
However, if v is e.g. a Dict{Int, Float64}, you can have race conditions, because inserting a key can rehash and mutate shared internal structure. Similarly, you are not guaranteed thread safety for other subtypes of AbstractArray, like BitVector, where multiple entries are packed into a single machine word, so writes to neighboring elements touch the same memory.
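For the Dict case, one workaround is to guard every write with a lock. This is a minimal sketch (not from the original answer); the function name and signature are illustrative:

```julia
# Sketch: guarding a shared Dict with a ReentrantLock so that
# concurrent insertions cannot race on its internal structure.
function thread_test_dict(n)
    d = Dict{Int,Float64}()
    lk = ReentrantLock()
    Threads.@threads for i = 1:n
        val = rand()
        lock(lk) do          # serialize all mutation of the shared Dict
            d[i] = val
        end
    end
    return sum(values(d))
end
```

The lock costs throughput, of course; if the keys are a dense range, a plain Vector (as in the original answer) is both safe and faster.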

Related

Julia multithreading "broadcast return"

I am new to the world of multithreading, I just want to write a function that compare (large) sorted 2D arrays:
    function check_duplicates(sites, generated_structures)
        for i in eachindex(generated_structures)
            sites == generated_structures[i] && return false
        end
        return true
    end
"generated_structures" might be large (5M+ elements) and to speed up things I was thinking about doing something like this:
    function check_duplicates(sites, generated_structures)
        Threads.@threads for i in eachindex(generated_structures)
            sites == generated_structures[i] && return false
        end
        return true
    end
So that multiple threads check subparts of the large array. Is it possible that as soon as the condition is satisfied the function stops and return false? Right now some threads return true because the subpart they checked didn't contain matches.
One way would be to have a common status flag that each thread can access and bail out if some other thread has found a match. To do this in a thread-safe manner, there is Threads.Atomic:
    function hasdup(val, itr)
        status = Threads.Atomic{Bool}(false)
        Threads.@threads for x in itr
            status[] && break
            (x == val) && (status[] = true)
        end
        return status[]
    end
Now, I'm not really sure why the access to status needs to be thread-safe, since it will only be written to in order to set it to true. Removing the Atomic wrapper still works, but is quite a bit slower.
I get a decent threading speedup from the above code, but the threading overhead is quite large, so if your match is early in the record, the threaded version will be much slower.
A multi-threading library that has much lower overhead is Polyester.jl, which will be much faster if the match comes early. It is, however, not compatible with the Threads library, so you cannot nest it within Threads.

Non blocking reads with Julia

I would like to read user input without blocking the main thread, much like the getch() function from conio.h. Is it possible in Julia?
I tried with @async, but it looked like my input wasn't being read, although the main thread wasn't blocked.
The problem, I believe, is either that you are running in global scope, which makes @async create its own local variables (when it reads, it reads into a variable in another scope), or that you are using an old version of Julia.
The following examples read an integer from STDIN in a non-blocking fashion.
    function foo()
        a = 0
        @async a = parse(Int64, readline())
        println("See, it is not blocking!")
        while (a == 0)
            print("")   # the I/O call yields, giving the async task a chance to run
        end
        println(a)
    end
The following example does the job in global scope, using an array. You can do the same trick with other mutable object types.
Array example:
    function nonblocking_readInt()
        arr = [0]
        @async arr[1] = parse(Int64, readline())
        arr
    end

    r = nonblocking_readInt() # is an array
    println("See, it is not blocking!")
    while (r[1] == 0) # sentinel value check
        print("")
    end
    println(r[1])
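A Channel-based variant (a sketch, not from the original answers; the function name is illustrative) avoids the sentinel value and lets you poll with isready:

```julia
# Start a task that pushes one parsed integer into a channel.
# The io argument defaults to stdin but can be any readable stream.
function nonblocking_readint_ch(io = stdin)
    ch = Channel{Int}(1)
    @async put!(ch, parse(Int, readline(io)))
    return ch
end

# Poll without blocking: isready(ch) is false until a full line has
# been entered; take!(ch) then retrieves the parsed value.
```

Note that isready only reports whether a value is already waiting; take! blocks until one arrives, so check isready first if you must not block.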

Evaluation order for always blocks triggered within always blocks in Verilog?

I understand that, for 2 always blocks with the same trigger, their order of evaluation is completely unpredictable.
However, suppose I have:
    always @(a) begin : blockX
        c = 0;
        d = a + 2;
        if (c != 1) e = 2;
    end
    always @(a) begin : blockY
        e = 3;
    end
    always @(d) begin : blockZ
        c = 1;
        e = 1;
    end
Suppose block X evaluates first. Does changing d in blockX immediately jump to blockZ? If not, when is blockZ evaluated with respect to blockY?
My programmer's instinct thinks of the sequence of events as a stack, where evaluating blockX is like a function call to blockZ and I immediately jump there in the code, then finish evaluating blockX.
However, because we call the active events queue, well, a queue, this suggests blockZ is enqueued at the back of the active events queue, and I'm 100% guaranteed it will be evaluated last (unless there are other triggered always blocks).
There's also the intermediate possibility, where it's neither first nor last but is also evaluated in a random and unpredictable order.
So in this example, are 1, 2, or 3 all possible final values for e, depending on how the compiler is feeling at run time?
Additionally, while I understand, of course, this represents awful style, where might I find the specification for this kind of behavior?
Always blocks are not function calls. See a recent answer I just gave for a similar question. These blocks are concurrent processes. The LRM only guarantees the ordering of statements within a begin/end block. There is no defined ordering between concurrently executing begin/end blocks (see Section 4.7 Nondeterminism in the 1800-2012 LRM). So a simulator is free to interleave the statements in any way as long as it honors the order within a single block.
So you are correct that e could have the final values 1, 2 or 3 depending on how a simulator decides to implement and optimize your code.

Scope of variables inside threaded for loops?

In the following example
    shared_arr = zeros(4000)
    Threads.@threads for thread = 1:4
        tmp_arr = rand(1000)
        for i = 1:1000
            shared_arr[(thread - 1)*1000+i] = tmp_arr[i]
        end
    end
I believe shared_arr is shared among all threads. Is tmp_arr allocated 4 times so that each thread has its own tmp_arr?
According to the scoping rules described in the documentation, a new scope is introduced on each iteration of a for loop. Since tmp_arr isn't declared prior to the loop, it will be a distinct variable in each iteration, so each thread works on its own copy. Note that rand might not be thread-safe in older Julia versions, per @Lyndon White's comment.
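A minimal way to convince yourself of this (a sketch, not from the original post): each iteration binds a fresh per-iteration array, and the writes to the shared array target disjoint slices, so they never overlap.

```julia
# `local_buf` is introduced inside the loop body, so every iteration
# (and hence every thread) gets its own array; writes to `shared`
# go to disjoint slices, so there is no race.
shared = zeros(Int, 8)
Threads.@threads for t in 1:4
    local_buf = fill(t, 2)              # fresh per-iteration array
    shared[(t - 1)*2 + 1 : t*2] .= local_buf
end
```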

pthread_cond_wait without a while loop

    global variable temp;

**threadA**

    pthread_mutex_lock
    if (temp == 'x')
        pthread_cond_wait
    do this
    pthread_mutex_unlock

**threadB**

    pthread_mutex_lock
    if (someCondition == true)
        temp = 'x'
        pthread_cond_signal
    pthread_mutex_unlock
In my case I may not have any loops, just an if condition. I want threadA to do this/that once temp == 'x'.
Is the loop compulsory when dealing with the pthread_cond_wait?
What is the other way for writing the code if we don't need loops?
Is this a correct way of writing the code?
A loop is compulsory because according to http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_cond_wait.html:
Spurious wakeups from the pthread_cond_timedwait() or pthread_cond_wait() functions may occur. Since the return from pthread_cond_timedwait() or pthread_cond_wait() does not imply anything about the value of this predicate, the predicate should be re-evaluated upon such return.
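The same wait-in-a-loop rule carries over to Julia's condition variables. This is a sketch in Julia (the document's main language) rather than C; it assumes Julia >= 1.3 for Threads.@spawn:

```julia
# Re-check the predicate in a loop around wait(): a wakeup alone
# proves nothing about the predicate's value.
ready = Ref(false)
cv = Threads.Condition()

waiter = Threads.@spawn begin
    lock(cv)
    try
        while !ready[]        # loop guards against spurious wakeups
            wait(cv)
        end
    finally
        unlock(cv)
    end
    ready[]                   # predicate is guaranteed true here
end

lock(cv)
try
    ready[] = true            # set the predicate...
    notify(cv)                # ...then signal, while holding the lock
finally
    unlock(cv)
end

fetch(waiter)
```

If the waiter were an unguarded `if`, a spurious wakeup (or a wakeup that raced with another consumer) could let it proceed while the predicate is still false; the `while` makes that impossible.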
