How to use multithreading to solve sudoku-like puzzle in scala? - multithreading

I'm wondering how to best use multithreading to solve a sort-of sudoku puzzle in Scala. What I start with is a 2D array, made into a 1D array for easier traversing. Every element in the array consists of a Square class, which holds all possible options for that square. I have already obtained the options by reducing from what already exists on the board.
Next I use recursion, moving from square to square in the > (left-to-right) direction from x(0, 0) to x(n, n). I try each option in each square, and if a square runs out of options to try, the search returns to a previous square that still has options left.
This DOES work, but it is really slow for puzzles larger than 10 x 10. I was thinking that multithreading might speed up the process. Can I somehow introduce multithreading into my current solution? If not, how could I best use multithreading to speed up solving the sudoku-like puzzle?
def next(squares: Array[Square], step: Int): Unit = {
  val sq: Square = squares(step)
  for (o <- sq.options) { // try each option of this square
    val valid = isValidValue(squares, sq, o) // scan row/column to check whether the option is valid
    if (valid) {
      sq.value = o // set the value of the square to this option
      next(squares, step + 1) // recurse into the next square
    }
  }
}
Here is an example of a 4x4 puzzle:
__ __ __<__
A
__ 1 __ __
A
__ __>__ __
1 __ __ __
and the solution
2 4 1< 3
A
3 1 4 2
A
4 3> 2 1
1 2 3 4

Related

Greedy Algorithms and Time Complexity #2

We have a bomb that is ticking and may explode. This bomb has n switches, that can be moved up or down. Certain combinations of these switches trigger the bomb, but only one combination disables it.
Our task is to move the switches from the current position to a position that disables the bomb, without exploding it in the meantime. The switches are big and awkward, so we can move only one switch at a time.
We have, lets say, n = 4 switches currently in position ^vvv. We need to get them to the position ^v^^. Forbidden positions are vvv^, ^vv^, ^v^v, and ^^^v.
a.) I had to draw this by hand and find the shortest sequence of switch movements that solves the task. The result I got was 4, and I found two such sequences, if I am right.
b.) This is where it gets hard: write code that answers the questions above (the length of the shortest sequence and how many such sequences there are). The code should be general enough to work with a different number of switches and with other starting, target, and forbidden combinations; there may be more or fewer target and forbidden combinations. The only thing we know for sure is that each switch has exactly two positions. The code should also handle the case where the desired position is unreachable; in that case the program should of course say so.
c.) The next question is the time complexity of this code, but for now I think I will just stop here.
I used '0' and '1' instead, because it is easier for me to imagine this.
So my approach was something like a greedy algorithm (I think): from the starting position, consider all the possible (allowed) positions, ignore the forbidden ones, then pick the one that differs least from the target position.
I have yet to write the key part of the code, and that's the part I need help with.
all_combinations = ['0000', '0001', '0010', '0011', '0100', '0101', '0110', '0111', '1000', '1001', '1010', '1011', '1100', '1101', '1110', '1111']

def distance(position1, position2):
    # number of switches that differ between the two positions
    count = 0
    for i in range(len(position1)):
        if position1[i] != position2[i]:
            count += 1
    return count

def allowed_positions(current, all_combinations):
    # positions reachable by moving exactly one switch
    allowed = set()
    for combination in all_combinations:
        if distance(current, combination) == 1:
            allowed.add(combination)
    return allowed

def best_name(current, all_combinations, target):
    candidates = []
    for option in allowed_positions(current, all_combinations):
        candidates.append((distance(option, target), option))
The task at hand is finding a shortest path in a graph. For this there is one typical approach and that is a breadth-first search algorithm (https://en.wikipedia.org/wiki/Breadth-first_search).
There is no real need to go into the details of how this is done because it can be read elsewhere in more detail and far better explained than I can do this in a StackOverflow answer.
But what might need to be explained is how the switch-combinations you have at hand are represented by a graph.
Imagine you have just two switches. Then you have exactly this graph:
^^---^v
|    |
|    |
v^---vv
If your starting position is ^^ and your ending (defusing) position is vv while the position ^v is an exploding position, then your graph is reduced to this:
^^   ^v
|
|
v^---vv
In this small example the shortest path is obvious and simple.
The graph at hand is easily sketched out in 2D, each dimension (x and y) representing one of the switches. If you have more switches, then you just add one dimension for each switch. For three switches this would look like this:
^^^--------^^v
 |\         |\
 | \        | \
 |  \       |  \
 |   \      |   \
 |   ^v^--- | --^vv
 |    |     |    |
 |    |     |    |
v^^--------v^v   |
  \   |      \   |
   \  |       \  |
    \ |        \ |
     \|         \|
     vv^--------vvv
If the positions ^^v, v^^, and vv^ are forbidden, then this graph is reduced to this:
^^^        ^^v
  \
   \
    \
     \
     ^v^--------^vv
                 |
                 |
v^^        v^v   |
             \   |
              \  |
               \ |
                \|
     vv^        vvv
Which already shows the clear way and the breadth-first search will easily find it. It gets interesting only for many dimensions/switches, though.
Drawing this for more dimensions/switches gets confusing of course (look up tesseracts for 4D). But it isn't necessary to have a visual image. Once you have written the algorithm for creating the graph in 2D and 3D in a general way it easily scales to n dimensions/switches without adding any complexity.
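As a minimal sketch of the breadth-first search described above (this is not code from the original answer): switch positions are encoded as integers with one bit per switch, following the '0'/'1' notation from the question, and the function name shortest_switch_paths is hypothetical. Besides the length of a shortest safe sequence, it also counts how many shortest sequences exist, since the question asks for both.

```python
from collections import deque


def shortest_switch_paths(start, target, forbidden, n_switches):
    """Return (length, count) of the shortest safe sequences, or None if unreachable.

    Positions are integers; bit i is the state of switch i (assumed encoding).
    """
    if start in forbidden or target in forbidden:
        return None
    dist = {start: 0}   # fewest moves needed to reach each position
    ways = {start: 1}   # number of shortest sequences reaching each position
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for i in range(n_switches):
            nxt = state ^ (1 << i)          # move exactly one switch
            if nxt in forbidden:
                continue
            if nxt not in dist:             # first visit: shortest distance found
                dist[nxt] = dist[state] + 1
                ways[nxt] = ways[state]
                queue.append(nxt)
            elif dist[nxt] == dist[state] + 1:
                ways[nxt] += ways[state]    # another equally short way in
    if target not in dist:
        return None
    return dist[target], ways[target]


# The example from the question: ^vvv -> ^v^^ with four forbidden positions.
print(shortest_switch_paths(0b1000, 0b1011, {0b0001, 0b1001, 0b1010, 0b1110}, 4))  # (4, 2)
```

Because a queue is used, positions are visited in order of increasing distance, which is what makes the first distance recorded for a position the shortest one.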
start = 8
target = 11
forbidden = {1: -1, 9: -1, 10: -1, 14: -1}
dimensions = 4

def distance(start, target, forbidden, dimensions):
    stack1 = []
    stack1.append(start)
    forbidden[start] = -1
    while len(stack1) > 0:
        top = stack1.pop()
        for i in range(dimensions):
            testVal = top ^ (1 << i)  # flip switch i
            if testVal == target:
                forbidden[testVal] = top
                result = [testVal]
                # trace the path back to the start through the map
                while testVal != start:
                    testVal = forbidden[testVal]
                    result.insert(0, testVal)
                return result
            if testVal not in forbidden:
                forbidden[testVal] = top  # mark as seen and remember the predecessor
                stack1.append(testVal)
    return [-1]

print(distance(start, target, forbidden, dimensions))
Here is my code for the example in your question. Instead of using bit strings, I used base-10 integers to represent the codes. Forbidden codes are stored in a hashmap, which is also used later to trace the path back once the target is found. I use a stack to keep track of which code to try next. On each pass of the while loop, the last code added is popped and its unvisited neighbors are pushed onto the stack. Importantly, to prevent cycles, codes that are on the stack or have been seen before are added to the map of forbidden nodes. When the target code is found for the first time, the function returns early and the path is traced back through the hashmap.
Because it uses a stack, this solution is a depth-first search rather than a breadth-first search, and it returns the first time the target is found. That means it does not guarantee the shortest path from start to target, but it does guarantee a working path if one exists. Since all possible codes may be traversed and there are 2^n nodes for n = dimensions, each with n neighbors to check, the total work is O(n * 2^n).

Some basic python coding bug I don't know how to solve

I ran into a problem while trying to simulate a math question from school. I have tested the inner loop on its own and the result is what I expect. I have no idea where the problem is or how to resolve it. It should be a simple bug.
This is the question:
There is a bag containing three red balls, four white balls and five black balls. Balls are taken out one at a time. What is the probability that red is the first color to be completely collected?
And this is my code (the annotations are not in my actual code):
import random as rd

y = 1000  # total tries
succ = 0  # number of successes
orgbg = ['r','r','r','w','w','w','w','b','b','b','b','b']  # original bag, used to initialize each try
while (y >= 0):
    redball = 0
    blackball = 0
    whiteball = 0
    newbg = orgbg  # the bag for a single try
    while (redball < 3 and whiteball < 4 and blackball < 5):
        tknum = rd.randrange(0,len(newbg),1)
        tkball = newbg[tknum]
        if (tkball == 'r'):
            redball = redball + 1
        elif (tkball =='w'):
            whiteball = whiteball + 1
        else:
            blackball = blackball + 1
        del newbg[tknum]
    if (redball == 3):
        succ = succ + 1
    y = y - 1
print (succ)
This is what the error report says:
ValueError: empty range for randrange() (0,0, 0)
When I change the line
tknum = rd.randrange(0,len(newbg),1)
into
tknum = rd.randrange(5,len(newbg),1)
the error report says:
ValueError: empty range for randrange() (5,5, 0)
I guess the initialization in the outer loop, newbg = orgbg, doesn't work, but how can that happen?
Sorry for such a lengthy question. I'm a beginner and this is the first time I have asked a question on Stack Overflow. You are also welcome to give me suggestions on my code style or method and on how to ask questions, so that I can do better next time. I hope you don't mind.
I think that your problem is indeed linked to the initialization in the outer loop, newbg = orgbg. To correct your code, replace this line with
newbg = deepcopy(orgbg)
and import the corresponding module at the start of your code:
from copy import deepcopy
The explanation of the bug lies in the way Python handles assignment of lists. The statement newbg = orgbg does not copy anything; it only makes newbg a second name for the same list object, so every del newbg[tknum] also removes a ball from orgbg, and after a few tries the bag is empty. Python offers two kinds of real copy, shallow and deep. Because the list here only holds strings, a shallow copy such as list(orgbg) would also be enough; deepcopy matters when the list contains other mutable objects. It is better explained here: https://www.python-course.eu/deep_copy.php or What exactly is the difference between shallow copy, deepcopy and normal assignment operation?
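To make the difference concrete, here is a small sketch (the variable names are made up for illustration) that contrasts plain assignment with a shallow and a deep copy of a list of strings:

```python
import copy

original_bag = ['r', 'w', 'b']

alias = original_bag                # plain assignment: a second name for the same list
shallow = list(original_bag)        # shallow copy: a new list holding the same elements
deep = copy.deepcopy(original_bag)  # deep copy: a new list with copied elements

del alias[0]                        # this also removes the element from original_bag

print(original_bag)           # ['w', 'b']
print(shallow)                # ['r', 'w', 'b']
print(deep)                   # ['r', 'w', 'b']
print(alias is original_bag)  # True
```

Only the alias is affected by the deletion, which is the behaviour that drains the bag in the simulation.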

Is the use of 'givens' really necessary in the deeplearning tutorials?

In the deep learning tutorials, all training data is stored in a shared array and only an index into that array is passed to the training function to slice out a minibatch.
I understand that this allows the data to be left in GPU memory, as opposed to passing small chunks of data as a parameter to the training function for each minibatch.
In some previous questions, this was given as an answer as to why the givens mechanism is used in the tutorials.
I don't yet see the connection between these two concepts, so I'm probably missing out on something essential.
As far as I understand, the givens mechanism swaps out a variable in the graph with a given symbolic expression (i.e., some given subgraph is inserted in place of that variable).
Then why not define the computational graph the way we need it in the first place?
Here is a minimal example. I define a shared variable X and an integer index, and I either create a graph that already contains the slicing operation, or I create one where the slicing operation is inserted post-hoc via givens.
By all appearances, the two resulting functions get_nogivens and get_tutorial are identical (see the debugprints at the end).
But then why do the tutorials use the givens pattern?
import numpy as np
import theano
import theano.tensor as T
X = theano.shared(np.arange(100),borrow=True,name='X')
index = T.scalar(dtype='int32',name='index')
X_slice = X[index:index+5]
get_tutorial = theano.function([index], X, givens={X: X[index:index+5]}, mode='DebugMode')
get_nogivens = theano.function([index], X_slice, mode='DebugMode')
> theano.printing.debugprint(get_tutorial)
DeepCopyOp [#A] '' 4
 |Subtensor{int32:int32:} [#B] '' 3
   |X [#C]
   |ScalarFromTensor [#D] '' 0
   | |index [#E]
   |ScalarFromTensor [#F] '' 2
     |Elemwise{add,no_inplace} [#G] '' 1
       |TensorConstant{5} [#H]
       |index [#E]
> theano.printing.debugprint(get_nogivens)
DeepCopyOp [#A] '' 4
 |Subtensor{int32:int32:} [#B] '' 3
   |X [#C]
   |ScalarFromTensor [#D] '' 0
   | |index [#E]
   |ScalarFromTensor [#F] '' 2
     |Elemwise{add,no_inplace} [#G] '' 1
       |TensorConstant{5} [#H]
       |index [#E]
They use givens here only to decouple the actual data passed to the graph from the input data variable. You could explicitly replace the input variable with X[index * batch_size: (index + 1) * batch_size], but that is just a little messier.

Algorithm to solve Local Alignment

Local alignment between X and Y, with at least one column aligning a C
to a W.
Given two sequences X of length n and Y of length m, we
are looking for a highest-scoring local alignment (i.e., an alignment
between a substring X' of X and a substring Y' of Y) that has at least
one column in which a C from X' is aligned to a W from Y' (if such an
alignment exists). As scoring model, we use a substitution matrix s
and linear gap penalties with parameter d.
Write a code in order to solve the problem efficiently. If you use dynamic
programming, it suffices to give the equations for computing the
entries in the dynamic programming matrices, and to specify where
traceback starts and ends.
My Solution:
I've taken 2 sequences, namely "HCEA" and "HWEA", and tried to solve the question.
Here is my code. Have I fulfilled what is asked in the question? If I am wrong, kindly tell me where I've gone wrong so that I can modify my code.
Also, is there any other way to solve the question? If there is, can anyone post pseudocode or an algorithm, so that I'll be able to code it?
public class Q1 {
    public static void main(String[] args) {
        // Input Protein Sequences
        String seq1 = "HCEA";
        String seq2 = "HWEA";
        // Array to store the score
        int[][] T = new int[seq1.length() + 1][seq2.length() + 1];
        // initialize seq1
        for (int i = 0; i <= seq1.length(); i++) {
            T[i][0] = i;
        }
        // Initialize seq2
        for (int i = 0; i <= seq2.length(); i++) {
            T[0][i] = i;
        }
        // Compute the matrix score
        for (int i = 1; i <= seq1.length(); i++) {
            for (int j = 1; j <= seq2.length(); j++) {
                if ((seq1.charAt(i - 1) == seq2.charAt(j - 1))
                        || (seq1.charAt(i - 1) == 'C') && (seq2.charAt(j - 1) == 'W')) {
                    T[i][j] = T[i - 1][j - 1];
                } else {
                    T[i][j] = Math.min(T[i - 1][j], T[i][j - 1]) + 1;
                }
            }
        }
        // Strings to store the aligned sequences
        StringBuilder alignedSeq1 = new StringBuilder();
        StringBuilder alignedSeq2 = new StringBuilder();
        // Build for sequences 1 & 2 from the matrix score
        for (int i = seq1.length(), j = seq2.length(); i > 0 || j > 0;) {
            if (i > 0 && T[i][j] == T[i - 1][j] + 1) {
                alignedSeq1.append(seq1.charAt(--i));
                alignedSeq2.append("-");
            } else if (j > 0 && T[i][j] == T[i][j - 1] + 1) {
                alignedSeq2.append(seq2.charAt(--j));
                alignedSeq1.append("-");
            } else if (i > 0 && j > 0 && T[i][j] == T[i - 1][j - 1]) {
                alignedSeq1.append(seq1.charAt(--i));
                alignedSeq2.append(seq2.charAt(--j));
            }
        }
        // Display the aligned sequence
        System.out.println(alignedSeq1.reverse().toString());
        System.out.println(alignedSeq2.reverse().toString());
    }
}
@Shole
The following are the two questions and answers provided in my solved worksheet.
Aligning a suffix of X to a prefix of Y
Given two sequences X and Y, we are looking for a highest-scoring alignment between any suffix of X and any prefix of Y. As a scoring model, we use a substitution matrix s and linear gap penalties with parameter d.
Give an efficient algorithm to solve this problem optimally in time O(nm), where n is the length of X and m is the length of Y. If you use a dynamic programming approach, it suffices to give the equations that are needed to compute the dynamic programming matrix, to explain what information is stored for the traceback, and to state where the traceback starts and ends.
Solution:
Let X_i be the prefix of X of length i, and let Y_j denote the prefix of Y of length j. We compute a matrix F such that F[i][j] is the best score of an alignment of any suffix of X_i and the string Y_j. We also compute a traceback matrix P. The computation of F and P can be done in O(nm) time using the following equations:
F[0][0]=0
for i = 1..n: F[i][0]=0
for j = 1..m: F[0][j]=-j*d, P[0][j]=L
for i = 1..n, j = 1..m:
F[i][j] = max{ F[i-1][j-1]+s(X[i-1],Y[j-1]), F[i-1][j]-d, F[i][j-1]-d }
P[i][j] = D, T or L according to which of the three expressions above is the maximum
Once we have computed F and P, we find the largest value in the bottom row of the matrix F. Let F[n][j0] be that largest value. We start traceback at F[n][j0] and continue traceback until we hit the first column of the matrix. The alignment constructed in this way is the solution.
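The equations above translate almost line by line into code. The following is only a sketch: the scoring function simple_score (+1 for a match, -1 for a mismatch) and the gap penalty d = 1 in the usage line are assumptions made for illustration, and the function returns just the best score and the column j0 where the traceback would start.

```python
def simple_score(a, b):
    """Hypothetical substitution scores: +1 for a match, -1 for a mismatch."""
    return 1 if a == b else -1


def overlap_alignment_score(x, y, s, d):
    """Best score of aligning any suffix of x to any prefix of y, and its end column."""
    n, m = len(x), len(y)
    # F[i][0] = 0: a suffix of x may start anywhere at no cost
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for j in range(1, m + 1):
        F[0][j] = -j * d          # a prefix of y aligned against nothing
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(
                F[i - 1][j - 1] + s(x[i - 1], y[j - 1]),  # D: align x[i-1] with y[j-1]
                F[i - 1][j] - d,                          # T: gap in y
                F[i][j - 1] - d,                          # L: gap in x
            )
    # the suffix of x must end at the end of x, so search the bottom row
    j0 = max(range(m + 1), key=lambda j: F[n][j])
    return F[n][j0], j0


print(overlap_alignment_score("ACGT", "GTAA", simple_score, 1))  # (2, 2)
```

In the usage line, the suffix "GT" of the first string is aligned to the prefix "GT" of the second, giving a score of 2 with the traceback starting in column 2 of the bottom row.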
Aligning Y to a substring of X, without gaps in Y
Given a string X of length n and a string Y of length m, we want to compute a highest-scoring alignment of Y to any substring of X, with the extra constraint that we are not allowed to insert any gaps into Y. In other words, the output is an alignment of a substring X' of X with the string Y, such that the score of the alignment is the largest possible (among all choices of X') and such that the alignment does not introduce any gaps into Y (but may introduce gaps into X'). As a scoring model, we use again a substitution matrix s and linear gap penalties with parameter d.
Give an efficient dynamic programming algorithm that solves this problem optimally in polynomial time. It suffices to give the equations that are needed to compute the dynamic programming matrix, to explain what information is stored for the traceback, and to state where the traceback starts and ends. What is the running-time of your algorithm?
Solution:
Let X_i be the prefix of X of length i, and let Y_j denote the prefix of Y of length j. We compute a matrix F such that F[i][j] is the best score of an alignment of any suffix of X_i and the string Y_j, such that the alignment does not insert gaps in Y. We also compute a traceback matrix P. The computation of F and P can be done in O(nm) time using the following equations:
F[0][0]=0
for i = 1..n: F[i][0]=0
for j = 1..m: F[0][j]=-j*d, P[0][j]=L
for i = 1..n, j = 1..m:
F[i][j] = max{ F[i-1][j-1]+s(X[i-1],Y[j-1]), F[i][j-1]-d }
P[i][j] = D or L according to which of the two expressions above is the maximum
Once we have computed F and P, we find the largest value in the rightmost column of the matrix F. Let F[i0][m] be that largest value. We start traceback at F[i0][m] and continue traceback until we hit the first column of the matrix. The alignment constructed in this way is the solution.
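As a sketch of how the traceback for this second problem could look in code: the function name, the +1/-1 scoring function match_score and the gap penalty d = 1 are assumptions for illustration, and ties are broken in favour of the diagonal move, which the worksheet does not specify.

```python
def match_score(a, b):
    """Hypothetical substitution scores: +1 for a match, -1 for a mismatch."""
    return 1 if a == b else -1


def align_without_gaps_in_y(x, y, s, d):
    """Align all of y to a substring of x, never inserting a gap into y."""
    n, m = len(x), len(y)
    F = [[0] * (m + 1) for _ in range(n + 1)]
    P = [[None] * (m + 1) for _ in range(n + 1)]
    for j in range(1, m + 1):
        F[0][j] = -j * d
        P[0][j] = "L"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = F[i - 1][j - 1] + s(x[i - 1], y[j - 1])
            left = F[i][j - 1] - d      # y[j-1] aligned to a gap in x'
            if diag >= left:
                F[i][j], P[i][j] = diag, "D"
            else:
                F[i][j], P[i][j] = left, "L"
    # traceback starts at the best entry of the rightmost column
    i = max(range(n + 1), key=lambda r: F[r][m])
    score, j = F[i][m], m
    top, bottom = [], []
    while j > 0:                        # stop when the first column is reached
        if P[i][j] == "D":
            top.append(x[i - 1])
            i -= 1
        else:
            top.append("-")
        bottom.append(y[j - 1])
        j -= 1
    return score, "".join(reversed(top)), "".join(reversed(bottom))


print(align_without_gaps_in_y("AG", "ACG", match_score, 1))  # (1, 'A-G', 'ACG')
```

The only structural difference from the suffix/prefix version is that the T move (a gap in Y) is dropped and the answer is read from the rightmost column instead of the bottom row.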
I hope this gives you some idea of what I really need.
I think it's quite easy to find resources, or even the answer, with Google; the first search result is already a thorough DP solution.
However, I appreciate that you would like to work out the solution yourself and are asking for hints.
Before I give some hints, I would like to say something about designing a DP solution
(I assume you know this can be solved with DP).
A DP solution basically consists of four parts:
1. DP state: you have to define the meaning of one state yourself, e.g.:
a[i] := the money the i-th person has;
a[i][j] := the number of TV programmes between time i and time j; etc.
2. Transition equations
3. Initial state / base case
4. How to query the answer, e.g.: is the answer a[n]? Or is the answer max(a[i])?
That's just my 2 cents on DP solutions; let's go back to the question :)
Here are some hints I can think of:
What is the DP state? How many dimensions are enough to define such a state?
Given that you are solving a problem much like the common substring problem (on 2 strings),
1 dimension seems too few and 3 dimensions seem too many, right?
As mentioned in the first hint, this problem is very similar to the common substring problem, so maybe you should have a look at problems like these to get some ideas:
LCS, LIS, Edit Distance, etc.
Supplementary part: not directly related to the OP
DP is easy to learn but hard to master. I know very little about it and really cannot share much. I think "Introduction to Algorithms" is a quite standard book to start with, and you can find many resources, especially PPT/PDF tutorials from colleges and universities, to learn some basic examples of DP. (Learning these examples is useful, as I'll explain below.)
A problem can be solved by many different DP solutions, and some of them are much better (lower time/space complexity) thanks to a well-defined DP state.
So how do you design a better DP state, or even get the sense that a problem can be solved by DP? I would say it's a matter of experience and knowledge. There is a set of "well-known" DP problems, and I would say many other DP problems can be solved by modifying them a bit. Here is a post I just got accepted about another DP problem; as stated in that post, that problem is very similar to a "well-known" problem named "matrix chain multiplication". So you cannot do much about the "experience" part, as there is no shortcut, yet you can work on the "knowledge" part by studying these standard DP problems first.
Lastly, let's go back to your original question to illustrate my point of view:
Since I already knew the LCS problem, I have a sense that a similar problem may be solvable with a similar DP state and transition equation. The state s(i,j) := the optimal cost for A(1..i) and B(1..j), given two strings A & B.
What "optimal" means depends on the question, and how to achieve this "optimal" value in each state is determined by the transition equation.
With this state defined, it's easy to see that the final answer I would like to query is simply s(len(A), len(B)).
Base case? s(0,0) = 0! We can't really do much with two empty strings, right?
So with the knowledge I have, I get a rough idea of the 4 main components of a DP solution. I know it's a bit long, but I hope it helps. Cheers.
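For the original C-to-W question, one possible way to fill in the four components is to keep two matrices: L, the usual local-alignment matrix, and G, the best score of a local alignment ending at (i, j) that already contains a C/W column. This is a sketch under my own assumptions, not the worksheet's solution; the function name, the scoring function cw_demo_score (+2 match, -1 mismatch) and d = 2 are hypothetical.

```python
NEG_INF = float("-inf")


def cw_demo_score(a, b):
    """Hypothetical substitution scores: +2 for a match, -1 for a mismatch."""
    return 2 if a == b else -1


def local_alignment_with_cw(x, y, s, d):
    """Best local alignment score with at least one column aligning C (x) to W (y).

    Returns None if no such alignment exists.
    """
    n, m = len(x), len(y)
    L = [[0] * (m + 1) for _ in range(n + 1)]        # unconstrained local scores
    G = [[NEG_INF] * (m + 1) for _ in range(n + 1)]  # scores with a C/W column
    best = NEG_INF
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = s(x[i - 1], y[j - 1])
            L[i][j] = max(0, L[i - 1][j - 1] + sub, L[i - 1][j] - d, L[i][j - 1] - d)
            # extend an alignment that already has its C/W column
            G[i][j] = max(G[i - 1][j - 1] + sub, G[i - 1][j] - d, G[i][j - 1] - d)
            if x[i - 1] == "C" and y[j - 1] == "W":
                # this column is the C/W column; anything local may precede it
                G[i][j] = max(G[i][j], L[i - 1][j - 1] + sub)
            best = max(best, G[i][j])
    return None if best == NEG_INF else best


print(local_alignment_with_cw("HCEA", "HWEA", cw_demo_score, 2))  # 5
```

With this state, the answer is queried as the maximum entry of G. Traceback would start at that entry, follow G until it takes the C/W transition into L, and then follow L until it reaches a cell with value 0.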

Finding number of different paths

I have a game in which player X wants to pass a ball to player Y, but X may be playing with more than one other player, and the other players can also pass the ball on to Y.
I want to know how many different paths the ball can take from X to Y.
For example, if he is playing with 3 players there are 5 different paths; with 4 players, 16 paths; with 20 players, 330665665962404000 paths; and with 40 players, 55447192200369381342665835466328897344361743780 paths that the ball can take.
The maximum number of players he can play with is 500.
I was thinking of using Catalan numbers. Do you think that is a correct approach to solve this?
Can you give me some tips?
At first sight, I would say that the number of possible paths can be calculated the following way (I assume a "path" is a sequence of players with no player occurring more than once).
Suppose you play with n+2 players, i.e. player X, player Y and n other players that could occur in the path.
Then the path can contain 0, 1, 2, 3, ... , n-1 or n "intermediate" players between player X (beginning) and player Y (end).
If you choose k (0 <= k <= n) players from n players in total, you can do this in (n choose k) ways.
For each of these subsets of intermediate players, there are k! possible arrangements of players.
So this yields sum(i=0 to n: (n choose i) * i!).
For "better" reading:
\sum_{i=0}^{n} \binom{n}{i} \, i! = \sum_{i=0}^{n} \frac{n!}{(n-i)!} = n! \sum_{i=0}^{n} \frac{1}{i!}
But I think that these are not the catalan numbers.
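The sum can be checked against the numbers in the question with a few lines of Python. This is just a sketch; the function name count_paths_by_sum is made up, and n is the number of intermediate players as defined above, so the question's "3 players" corresponds to n = 2.

```python
from math import comb, factorial


def count_paths_by_sum(n):
    """Sum over i of (n choose i) * i!, for n possible intermediate players."""
    return sum(comb(n, i) * factorial(i) for i in range(n + 1))


print(count_paths_by_sum(2))   # 5
print(count_paths_by_sum(3))   # 16
print(count_paths_by_sum(19))  # 330665665962404000
```

Python integers have arbitrary precision, so the same function also works for the much larger player counts mentioned in the question.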
This is really a question in combinatorics, not algorithms.
Mark the number of different paths from player X to player Y as F(n), where n is the number of players including Y but not X.
Now, how many different paths are there? Player X can either pass the ball straight to Y (1 option), or pass it to one of the other players (n-1 options). If X passes to another player, we can pretend that player is the new X, where there are n-1 players in the field (since the 'old' X is no longer in the game). That's why
F(n) = 1 + (n-1)F(n-1)
and
F(1) = 1
I'm pretty sure you can reach phimuemue's answer from this one. The question is if you prefer a recursive solution or one with summation.
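If you prefer the recursive formulation, a minimal sketch is to unroll F(n) = 1 + (n-1)F(n-1) into a loop, which avoids deep recursion for large n. The function name count_paths is hypothetical, and n counts the players including Y but not X, as defined above.

```python
def count_paths(n):
    """F(n) = 1 + (n - 1) * F(n - 1) with F(1) = 1, computed iteratively (n >= 1)."""
    f = 1                      # F(1): X can only pass straight to Y
    for k in range(2, n + 1):
        f = 1 + (k - 1) * f    # pass straight to Y, or to one of k - 1 others
    return f


print(count_paths(3))   # 5
print(count_paths(4))   # 16
print(count_paths(20))  # 330665665962404000
```

These values match the ones given in the question for 3, 4 and 20 players.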
I'm somewhat of a noob at this kind of searching, but a quick run through the numbers demonstrates the more you can trim, cut out, filter out, the faster you can do it. The numbers you cite are BIG.
First thing that comes to mind is "Is it practical to limit your search depth?" If you can limit your search depth to say 4 (an arbitrary number), your worst case number of possibilities comes out to ...
499 * 498 * 497 * 496 = 61,258,725,024 (assuming no one gets the ball twice)
This is still large, but an exhaustive search would be far faster (though still too slow for a game) than your original set of numbers.
I'm sure others with more experience in this area would have better suggestions. Still, I hope this helps.
If X needs to pass to Y, there may be players P1, P2, ..., Pn in between, and you care about the order of passing, then indeed:
For 2 extra players you have paths: X-Y, X-P1-Y, X-P2-Y, X-P1-P2-Y, X-P2-P1-Y
which gives a total of 5 different paths; similarly, for 3 extra players you have 16 different paths.
First try to reduce the problem to something known. X and Y are common to every path, so I would eliminate them; the question then becomes: what is the sum of the k-permutations of n for k from 0 to n, where n is the number of intermediate players P?
This can be given as
f(n):=sum(n!/(n-i)!,i,0,n);
and I can confirm your findings for 19 and 39 (20 and 40 in your notation).
For f(499) I get
6633351524650661171514504385285373341733228850724648887634920376333901210587244906195903313708894273811624288449277006968181762616943058027258258920058014768423359811679381900054568501151839849768338994244697593758840394106353734267539926205845992860165295957099385939316593862710470512043836452624452665801937754479602741031832540175306674471495745716725509714798824661807396000105338256698426305553340786519843729411660457896089840381658295930455362209587765698327585913037665131195504013431486823990271059962837959407778393078276213331859189770016153265512805722812864376997337140529242894215031131618375899072989922780132488077015246576266246551484603286735418485007674249207286921801779414240854077425752351919182464902664206622037834736215298295580945851569079682952183639701057397376328170754187008425429164206646365285647875545882646729176997107332605851460212415526607757545366695048460341802079614840254694664267117469603856584752270653889630424848913719533359942725361985274851471687885265903663806182184272555073708882789845441094009797907518245726494471433964169680271980763830020431957658400573531564215436064984091520
Results obtained with wxMaxima
EDIT: After more clarification from the comments of the question, my answer is absolutely useless :) he definitely wants the number of possible routes, not the best one!
My first thought is why do you want to know these numbers? You're certainly never going to iterate through all the paths available to 500 people (would take far too long) and it's too big to display on a ui in any meaningful way.
I'm assuming that you're going to try to find the best route that the ball can take in which case I would consider looking into algorithms that don't care about the number of nodes in a route.
I'd try looking at the A star algorithm and Dijkstra's algorithm.
