What is the difference between Threads.@spawn and Threads.@threads?

I am a novice programmer interested in the Julia language. The documentation (https://docs.julialang.org/en/v1/base/multi-threading/) says Threads.@threads is for "for" loops and Threads.@spawn places a given task on any available thread. My understanding is that Threads.@threads is inherently synchronized, while Threads.@spawn is asynchronous and needs more planning to implement (namely using the fetch() method).
In code I find online that uses both, the two seem (from my perspective) to be used interchangeably. What is the conceptual difference between the two for a novice programmer, and how/when should we use each? Additionally, can they complement each other?

Consider:
function withthreads()
    arr = zeros(Int, 10)
    Threads.@threads for i in 1:10
        sleep(3 * rand())
        arr[i] = i
    end
    println("with @threads: $arr")
end

function withspawn()
    arr = zeros(Int, 10)
    for i in 1:10
        Threads.@spawn begin
            sleep(3 * rand())
            arr[i] = i
        end
    end
    println("with @spawn: $arr")
end

function withsync()
    arr = zeros(Int, 10)
    @sync begin
        for i in 1:10
            Threads.@spawn begin
                sleep(3 * rand())
                arr[i] = i
            end
        end
    end
    println("with @sync: $arr")
end
withthreads()
withspawn()
withsync()
output:
with @threads: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
with @spawn: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
with @sync: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
So @threads manages the pool of threads allotted to Julia and distributes the iterations of the for loop across them (possibly reusing the same thread for more than one iteration, sequentially, if there are more iterations than threads), and it also synchronizes: execution does not leave the for block until all iterations have completed. @spawn creates just one task per call and returns to the calling task immediately, so the loop can be exited as soon as all tasks are spawned, even before they are done working (which is why the zeros remain 0 in arr), unless the spawning loop is wrapped in @sync to wait for them.

Related

Repetitive sequence (optimization)

I am trying to solve this problem:
initial list = [0, 1, 2, 2]
You get this sequence of numbers [0, 1, 2, 2], and each time you need to append the next natural number (so 3, 4, 5, etc.) n times, where n is the value of the list at that number's index. For example, the next number to add is 3, and list[3] is 2, so you append 3 two times. The new list will be: [0, 1, 2, 2, 3, 3]. Then list[4] is 3, so you have to append 4 three times. The list will be [0, 1, 2, 2, 3, 3, 4, 4, 4], and so on. ([0, 1, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10])
In order to solve this, I tried various approaches. I used recursion, but a recursive approach is very slow in this case. I also tried the mathematical formula from OEIS (A055086), a(n) = ceiling(2*sqrt(n+1)) - 2. The problem with the formula is that after 2 ** 20 it becomes too imprecise.
So, my next idea was to use memoization:
from itertools import repeat

lst = [0, 1, 2, 2]

def find(n):
    global lst
    print(lst[-1], n, flush=True)
    if len(lst) > n:
        return lst[n]
    for number in range(lst[-1] + 1, n + 1):
        lst += list(repeat(number, lst[number]))
        if len(lst) > n:
            return lst[n]
Now, this approach works until 2 ** 37, but after that it just times out. The site where I am trying to implement my algorithm is https://www.codewars.com/kata/5f134651bc9687000f8022c4/train/python. I don't ask for a solution, but for any hint on how to optimize my code.
I googled some similar problems and found that in this case I could use the total sum of the list, but it is not yet clear to me how this could help.
Any help is welcomed!
You can answer it iteratively like so:
def find(n):
    lst = [0, 1, 2, 2]
    if n < 4:
        return lst[n]
    to_add = 3
    while n >= len(lst):
        for i in range(lst[to_add]):
            lst.append(to_add)
        to_add += 1
    return lst[n]
You could optimise for large n by breaking early out of the inner for loop, and by keeping track of the list length in a separate variable rather than calling len each time.
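A minimal sketch of those two optimisations (this is an illustration of the suggestion above, not the original code): track the length in a variable and append no more copies than are needed to reach index n.

def find(n):
    # keep the length in a variable instead of calling len() repeatedly
    lst = [0, 1, 2, 2]
    length = 4
    to_add = 3
    while length <= n:
        # only append as many copies as are still needed to cover index n
        count = min(lst[to_add], n + 1 - length)
        lst.extend([to_add] * count)
        length += count
        to_add += 1
    return lst[n]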

Crossover and mutation in Differential Evolution

I'm trying to solve Traveling Salesman problem using Differential Evolution. For example, if I have vectors:
[1, 4, 0, 3, 2, 5], [1, 5, 2, 0, 3, 5], [4, 2, 0, 5, 1, 3]
how can I perform crossover and mutation? I saw something like a + F*(b - c), but I have no idea how to use it.
I ran into this question when looking for papers on solving the TSP problem using evolutionary computing. I have been working on a similar project and can provide a modified version of my written code.
For mutation, we can swap two indices in a vector. I assume that each vector represents an order of nodes you will visit.
import random

def swap(lst):
    n = len(lst)
    # randint is inclusive on both ends, so use n - 1 to stay in range
    x = random.randint(0, n - 1)
    y = random.randint(0, n - 1)
    # store values to be swapped
    old_x = lst[x]
    old_y = lst[y]
    # do swap
    lst[y] = old_x
    lst[x] = old_y
    return lst
For the case of crossover with respect to the TSP, we would like to keep the general ordering of values in our permutations (we want a crossover with a positional bias). By doing so, we preserve good paths from good permutations. For this reason, I believe single-point crossover is the best option.
def singlePoint(parent1, parent2):
    point = random.randint(1, len(parent1) - 2)
    def helper(v1, v2):
        # this is a helper function to save on code duplication
        # take the values to the left of the crossover point from v1
        points = [v1[i] for i in range(0, point)]
        # add values from right of crossover point in v2
        # that are not already in points
        for i in range(point, len(v2)):
            pt = v2[i]
            if pt not in points:
                points.append(pt)
        # add values from head of v2 which are not in points
        # this ensures all values are in the vector.
        for i in range(0, point):
            pt = v2[i]
            if pt not in points:
                points.append(pt)
        return points
    # notice how I swap parent1 and parent2 on the second call
    offspring_1 = helper(parent1, parent2)
    offspring_2 = helper(parent2, parent1)
    return offspring_1, offspring_2
I hope this helps! Even if your project is done, this could come in handy; GAs are a great way to solve optimization problems.
If F = 0.6, a = [1, 4, 0, 3, 2, 5], b = [1, 5, 2, 0, 3, 5], c = [4, 2, 0, 5, 1, 3],
then a + F*(b - c) = [-0.8, 5.8, 1.2, 0, 3.2, 6.2].
Then change the smallest number in the array to 0, the second smallest to 1, and so on,
so it returns [0, 4, 2, 1, 3, 5].
This method is inefficient when used to solve JSP (job-shop scheduling) problems.
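As a rough illustration of the a + F*(b - c) step followed by the rank-based repair described above, here is a sketch using NumPy; the double argsort is just one way of doing the ranking, not something prescribed by the answer.

import numpy as np

F = 0.6
a = np.array([1, 4, 0, 3, 2, 5])
b = np.array([1, 5, 2, 0, 3, 5])
c = np.array([4, 2, 0, 5, 1, 3])

# standard DE mutation: gives the real-valued vector [-0.8, 5.8, 1.2, 0.0, 3.2, 6.2]
mutant = a + F * (b - c)

# rank-based repair: the smallest value becomes 0, the next smallest 1, and so on
permutation = np.argsort(np.argsort(mutant))
print(permutation)  # [0 4 2 1 3 5]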

Is the order of batches guaranteed in Keras' OrderedEnqueuer?

I have a custom keras.utils.Sequence which generates batches in a specific (and critical) order.
However, I need to parallelise batch generation across multiple cores. Does the name 'OrderedEnqueuer' imply that the order of batches in the resulting queue is guaranteed to be the same as the order of the original keras.utils.Sequence?
My reasons for thinking that this order is not guaranteed:
OrderedEnqueuer uses python multiprocessing's apply_async internally.
Keras' docs explicitly say that OrderedEnqueuer is guaranteed not to duplicate batches - but not that the order is guaranteed.
My reasons for thinking that it is:
The name!
I understand that keras.utils.Sequence objects are indexable.
I found test scripts on Keras' GitHub which appear to be designed to verify order - although I could not find any documentation about whether these passed, or whether they are truly conclusive.
If the order here is not guaranteed, I would welcome any suggestions on how to parallelise batch preparation while maintaining a guaranteed order, with the proviso that it must be able to parallelise arbitrary Python code - I believe e.g. the tf.data.Dataset API does not allow this (tf.py_function calls back into the original Python process).
Yes, it's ordered.
Check it yourself with the following test.
First, let's create a dummy Sequence that returns just the batch index after waiting a random time (the random wait is there to ensure that the batches will not finish in order):
import time, random, datetime
import numpy as np
import tensorflow as tf

class DataLoader(tf.keras.utils.Sequence):
    def __len__(self):
        return 10

    def __getitem__(self, i):
        time.sleep(random.randint(1, 2))
        # you could add a print here to see that the batches finish out of order
        return i
Now let's create a test function that creates the enqueuer and uses it.
The function takes the number of workers and prints the time taken as well as the results as returned.
def test(workers):
    enq = tf.keras.utils.OrderedEnqueuer(DataLoader())
    enq.start(workers=workers)
    gen = enq.get()
    results = []
    start = datetime.datetime.now()
    for i in range(30):
        results.append(next(gen))
    enq.stop()
    print('test with', workers, 'workers took', datetime.datetime.now() - start)
    print("results:", results)
Results:
test(1)
test(8)
test with 1 workers took 0:00:45.093122
results: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
test with 8 workers took 0:00:09.127771
results: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Notice that:
8 workers is way faster than 1 worker -> it is parallelizing ok
results are ordered for both cases

Count number of repeated elements in list considering the ones larger than them

I am trying to do some clustering analysis on a dataset. I am using a number of different approaches to estimate the number of clusters, then I put what every approach gives (number of clusters) in a list, like so:
total_pred = [0, 0, 1, 1, 0, 1, 1]
Now I want to estimate the real number of clusters, so I let the methods above vote. For example, in the list above more models found 1 cluster than 0, so I take 1 as the real number of clusters.
I do this by:
counts = np.bincount(np.array(total_pred))
real_nr_of_clusters = np.argmax(counts)
There is a problem with this method, however. If the above list contains something like:
[2, 0, 1, 0, 1, 0, 1, 0, 1]
I will get 0 clusters as the answer, since 0 is the most frequent value. However, if one model found 2 clusters, it's safe to assume it considers that at least 1 cluster is there, hence the real number would be 1.
How can I do this by modifying the above snippet?
To make the problem clear, here are a few more examples:
[1, 1, 1, 0, 0, 0, 3]
should return 1,
[0, 0, 0, 1, 1, 3, 4]
should also return 1 (since most of them agree there is AT LEAST 1 cluster).
There is a problem with your logic
Here is an implementation of the described algorithm.
l = [2, 0, 1, 0, 1, 0, 1, 0, 1]
l = sorted(l, reverse=True)
votes = {x: i for i, x in enumerate(l, start=1)}
Output
{2: 1, 1: 5, 0: 9}
Notice that since you define a vote as agreeing with anything smaller than itself, then min(l) will always win, because everyone will agree that there are at least min(l) clusters. In this case min(l) == 0.
How to fix it
Mean and median
To begin with, notice that taking the mean or the median are both valid, lightweight options that satisfy the desired output on your examples.
Bias
That said, taking the mean might not be what you want if, say, you encounter votes with high variance such as [0, 0, 7, 8, 10], where it is unlikely that the answer is 5.
A more general way to fix that is to include a voter's bias toward votes close to theirs. Surely a voter who cast a 2 will agree more with a 1 than with a 0.
You do that by implementing a metric (note: this is not a metric in the mathematical sense) that determines how much an instance that voted for x is willing to agree to a vote for y on a scale of 0 to 1.
Note that this approach will allow voters to agree on a number that is not on the list.
We need to update our code to account for applying that pseudometric.
def d(x, y):
    return x <= y

l = [2, 0, 1, 0, 1, 0, 1, 0, 1]
votes = {y: sum(d(x, y) for x in l) for y in range(min(l), max(l) + 1)}
Output
{0: 9, 1: 5, 2: 1}
The above metric is a sanity check. It is the one you provided in your question, and it indeed ends up determining that 0 wins.
Metric choices
You will have to toy a bit with your metrics, but here are a few which may make sense.
Inverse of the linear distance
def d(x, y):
    return 1 / (1 + abs(x - y))

l = [2, 0, 1, 0, 1, 0, 1, 0, 1]
votes = {y: sum(d(x, y) for x in l) for y in range(min(l), max(l) + 1)}
# {0: 6.33, 1: 6.5, 2: 4.33}
Inverse of the nth power of the distance
This one is a generalization of the previous. As n grows, voters tend to agree less and less with distant vote casts.
def d(x, y, n=1):
    return 1 / (1 + abs(x - y)) ** n

l = [2, 0, 1, 0, 1, 0, 1, 0, 1]
votes = {y: sum(d(x, y, n=2) for x in l) for y in range(min(l), max(l) + 1)}
# {0: 5.11, 1: 5.25, 2: 2.44}
Upper-bound distance
Similar to the previous metric, this one is close to what you described at first in the sense that a voter will never agree to a vote higher than theirs.
def d(x, y, n=1):
    return 1 / (1 + abs(x - y)) ** n if x >= y else 0

l = [2, 0, 1, 0, 1, 0, 1, 0, 1]
votes = {y: sum(d(x, y, n=2) for x in l) for y in range(min(l), max(l) + 1)}
# {0: 5.11, 1: 4.25, 2: 1.0}
Normal distribution
Another sensible option would be a normal distribution or a skewed normal distribution.
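For completeness, here is a sketch of such a normal-distribution metric; the spread parameter sigma is an assumed tuning knob, not something specified above.

import math

def d(x, y, sigma=1.0):
    # Gaussian weighting: full agreement at distance 0, decaying smoothly with distance
    return math.exp(-((x - y) ** 2) / (2 * sigma ** 2))

l = [2, 0, 1, 0, 1, 0, 1, 0, 1]
votes = {y: sum(d(x, y) for x in l) for y in range(min(l), max(l) + 1)}
# with sigma=1: {0: ~6.56, 1: ~7.03, 2: ~3.97}, so 1 wins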
While the other answer provides a comprehensive review of possible metrics and methods, it seems that what you are looking for is simply the number of clusters closest to the mean!
So something as simple as:
cluster_num = int(np.round(np.mean(total_pred)))
which returns 1 for all your cases, as you expect.
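If you prefer the median mentioned in the other answer, an equally simple variant (which also returns 1 for each of the examples listed above) would be:

cluster_num = int(np.round(np.median(total_pred)))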

How to find intersection of 2 lists in least time complexity in python 3.5 excluding the built in set function?

I know we can solve it with this method, but I am curious to know whether there is any other method with lower time complexity.
a=[1,2,3,4]
b=[2,3,5]
c=set(a).intersection(b)
This gives the answer, but is there any shorter method? It would also be helpful if someone explained the time complexity of this built-in function in Python 3.5.
Shorter solution:
If you want a shorter method, you can use the & operator for sets, which is exactly an intersection:
>>> a = [1, 2, 3, 4]
>>> b = [2, 3, 5]
>>> c = set(a) & set(b)
>>> c
{2, 3}
Time complexity:
Python has a module called timeit, which measures execution time of small code snippets. So in your case:
With the intersection() function:
$ python3 -m timeit -s "a = [1, 2, 3, 4]; b = [2, 3, 5]; c = set(a).intersection(set(b))"
100000000 loops, best of 3: 0.00989 usec per loop
With the & operator:
$ python3 -m timeit -s "a = [1, 2, 3, 4]; b = [2, 3, 5]; c = set(a) & set(b)"
100000000 loops, best of 3: 0.01 usec per loop
You can see that there is a very small difference between them (because underneath they do the same thing). But there is no more efficient way of doing set intersection, because the built-in methods are already well optimized.
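If you prefer to measure it from inside a script rather than from the shell, roughly the same comparison can be done with the timeit module directly (the list sizes here are just an example, not from the question):

import timeit

setup = "a = list(range(1000)); b = list(range(500, 1500))"

t_method = timeit.timeit("set(a).intersection(b)", setup=setup, number=10000)
t_operator = timeit.timeit("set(a) & set(b)", setup=setup, number=10000)

print("intersection():", t_method)
print("& operator:    ", t_operator)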
Here is a table of time complexity of each collection method in Python. Look at the set section:
Operation: Intersection s&t
Average case: O(min(len(s), len(t)))
Worst case: O(len(s) * len(t))
Notes: replace "min" with "max" if t is not a set
