From a random sequence, make appear an specific answer 5 times - python-3.x

I am implementing an N-back task with visual stimuli.
The stimuli is going to appear 20 times (randomly), but the correct answer must appear 5 times.
To generate a random sequence I can use "randint". But how can I tell to python to generate 5 times the correct answer.
Example of a 0-back.
The circle appears 20 times, but only 5 times it is the same as what it is needed.

So, I created 2 arrays.
right_answer, contains the position that I want to repeat 4 times.
wrong_answers, contains the other 8 positions. Each one repeats twice.
I concatenated them and then applied a permutation with a seed. With this, the array will be permuted but I will assure that the specific position/answer appear 4 times.
right_answer = np.array([[510,642],[510,642],\
[510,642],[510,642]])
wrong_answers = np.array([[510,382],[510,512],\
[640,382],[640,512],[640,642],\
[770,382],[770,512],[770,642],\
[510,382],[510,512],\
[640,382],[640,512],[640,642],\
[770,382],[770,512],[770,642]])
concat=np.concatenate((right_answer, wrong_answers))
positions = np.random.RandomState(seed=42).permutation(concat)
print (positions)

Related

Optimization for searching for a substring withing list of strings

Parameters
Currently I have a deque like this which is constantly being appended to:
choices = [
["text approximately 500+ chars long", 1] #second part is a 1 or a 2
]
The text the latest email in an email chain and the number determines what to do with it. Any subsequent emails (lower in the chain or above in the chain) should have the same action applied to it.
I am using a deque because I assume that emails that are in the same chain are sent around the same time and thus I can benefit from using the appendleft of a deque which is faster than a list.
Currently to achieve my goal I am doing it like this:
for choice in choices:
if current_email in choice[0]: #check for lower level emails
incident_type = choice[1]
break
elif choice[0] in current_email: #check for upper level emails
incident_type = choice[1]
break
I believe how I am currently searching is O(nkm) for n (the length of the current_email) and k (the length of choices) and m (the length of the email in k) and thus, makes this a very sub optimal way of doing things. Once choices reaches about 50 then it takes around 5-10 seconds for each full search of choices to complete. Is there a better way of doing this with some other module or method?
Thank you.

Bitwise operations Python

This is a first run-in with not only bitwise ops in python, but also strange (to me) syntax.
for i in range(2**len(set_)//2):
parts = [set(), set()]
for item in set_:
parts[i&1].add(item)
i >>= 1
For context, set_ is just a list of 4 letters.
There's a bit to unpack here. First, I've never seen [set(), set()]. I must be using the wrong keywords, as I couldn't find it in the docs. It looks like it creates a matrix in pythontutor, but I cannot say for certain. Second, while parts[i&1] is a slicing operation, I'm not entirely sure why a bitwise operation is required. For example, 0&1 should be 1 and 1&1 should be 0 (carry the one), so binary 10 (or 2 in decimal)? Finally, the last bitwise operation is completely bewildering. I believe a right shift is the same as dividing by two (I hope), but why i>>=1? I don't know how to interpret that. Any guidance would be sincerely appreciated.
[set(), set()] creates a list consisting of two empty sets.
0&1 is 0, 1&1 is 1. There is no carry in bitwise operations. parts[i&1] therefore refers to the first set when i is even, the second when i is odd.
i >>= 1 shifts right by one bit (which is indeed the same as dividing by two), then assigns the result back to i. It's the same basic concept as using i += 1 to increment a variable.
The effect of the inner loop is to partition the elements of _set into two subsets, based on the bits of i. If the limit in the outer loop had been simply 2 ** len(_set), the code would generate every possible such partitioning. But since that limit was divided by two, only half of the possible partitions get generated - I couldn't guess what the point of that might be, without more context.
I've never seen [set(), set()]
This isn't anything interesting, just a list with two new sets in it. So you have seen it, because it's not new syntax. Just a list and constructors.
parts[i&1]
This tests the least significant bit of i and selects either parts[0] (if the lsb was 0) or parts[1] (if the lsb was 1). Nothing fancy like slicing, just plain old indexing into a list. The thing you get out is a set, .add(item) does the obvious thing: adds something to whichever set was selected.
but why i>>=1? I don't know how to interpret that
Take the bits in i and move them one position to the right, dropping the old lsb, and keeping the sign. Sort of like this
Except of course that in Python you have arbitrary-precision integers, so it's however long it needs to be instead of 8 bits.
For positive numbers, the part about copying the sign is irrelevant.
You can think of right shift by 1 as a flooring division by 2 (this is different from truncation, negative numbers are rounded towards negative infinity, eg -1 >> 1 = -1), but that interpretation is usually more complicated to reason about.
Anyway, the way it is used here is just a way to loop through the bits of i, testing them one by one from low to high, but instead of changing which bit it tests it moves the bit it wants to test into the same position every time.

Can dynamic programming problems always be represented as DAG

I am trying to draw a DAG for Longest Increasing Subsequence {3,2,6,4,5,1} but cannot break this into a DAG structure.
Is it possible to represent this in a tree like structure?
As far as I know, the answer to the actual question in the title is, "No, not all DP programs can be reduced to DAGs."
Reducing a DP to a DAG is one of my favorite tricks, and when it works, it often gives me key insights into the problem, so I find it always worth trying. But I have encountered some that seem to require at least hypergraphs, and this paper and related research seems to bear that out.
This might be an appropriate question for the CS Stack Exchange, meaning the abstract question about graph reduction, not the specific question about longest increasing subsequence.
Assuming following Sequence, S = {3,2,6,4,5,1,7,8} and R = root node. Your tree or DAG will look like
R
3 2 4 1
6 5 7
8
And your result is the longest path (from root to the node with the maximum depth) in the tree (result = {r,1,7,8}).
The result above show the longest increasing sequence in S. The Tree for the longest increasing subsequence in S look as follows
R
3 2 6 4 5 1 7 8
6 4 7 5 7 7 8
7 5 8 7 8 8
8 7 8
8
And again the result is the longest path (from root to the node with the maximum depth) in the tree (result = {r,2,4,5,7,8}).
The answer to this question should be YES.
I'd like to cite the following from here: The Soul of Dynamic Programming
Formulations and Implementations.
A DP must have a corresponding DAG (most of the time implicit), otherwise we cannot find a valid order for computation.
For your case, Longest Increasing Subsequence can be represented as some DAG like the following:
The task is amount to finding the longest path in that DAG. For more information please refer to section 6.2 of Algorithms, Dynamic programming.
Yes, It is possible to represent longest Increasing DP Problem as DAG.
The solution is to find the longest path ( a path that contains maximum nodes) from every node to the last node possible for that particular node.
Here, S is the starting node, E is the ending node and C is count of nodes between S and E.
S E C
3 5 3
2 5 3
6 6 1
4 5 2
5 5 1
1 1 1
so the answer is 3 and it is very easy to generate solution as we have to traverse the nodes only.
I think it might help you.
Reference: https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/lecture-videos/lecture-20-dynamic-programming-ii-text-justification-blackjack/

Need to construct DFA (deterministic finite automaton)

I need to construct a DFA which recognises all the strings made solely from 0s and 1s, so that thay have an even number of zeros and number of ones divisible by 3. I found an automaton for the case of even number of 0s and even number of 1s:
I tried going from here by adding some states, changing branches, etc.. However I remained unsuccessful usually losing track of what's the automaton doing beacuse of branches and states I'd add. Any help would be greatly appreciated.
You need states which record the divisibility by 2 and by 3, which means you need 6 states. Just call them 0|0, 1|0, 0|1, 1|1, 0|2, 1|2. The first digit tells you that when you reach the state, you have an even or odd number of zeros, the 2nd digit tells you that when you reach the state, you have a number of 1s that, when divided by 3, give the given modulus.
Your state diagram contains:
0|x --0--> 1|x
1|x --0--> 0|x
y|0 --1--> y|1
y|1 --1--> y|2
y|2 --1--> y|0
The start state is 0|0, which is also the only stop state.
The important bit to understand is that each state records the the modulus of the number of zeros or ones read when divided by 2, respectively 3. The 0|0 is then modulus 0 in both cases, which is the accepting criteria. This all works, because the number of different states to keep track of is finite. The name DFA tells us already that it would not work for an infinite number of states to track.
One way to view it is that the problem asks for the intersection of 2 languages: one containing an even number of zeroes and another having the number of ones divisible by 3. A way to approach this is to make DFAs for both the languages and then make another DFA which keeps track of the pair of states in each DFA when an input symbol is read.
I have used 'e' and 'o' to denote that the number of zeroes is even or odd respectively. The second digit in each of the states defines the remainder obtained by dividing the number of ones by 3.

constraint to avoid generating duplicates in this search task

I have to solve the following optimization problem:
Given a set of elements (E1,E2,E3,E4,E5,E6) create an arbitrary set of sequences e.g.
seq1:E1,E4,E3
seq2:E2
seq3:E6,E5
and given a function f that gives a value for every pair of elements e.g.
f(E1,E4) = 5
f(E4,E3) = 2
f(E6,E5) = 3
...
in addition it also gives a value for the pair of an element combined with some special element T, e.g.
f(T,E2) = 10
f(E2,T) = 3
f(E5,T) = 1
f(T,E6) = 2
f(T,E1) = 4
f(E3,T) = 2
...
The utility function that must be optimized is the following:
The utility of a sequence set is the sum of the utility of all sequences.
The utility of a sequence A1,A2,A3,...,AN is equal to
f(T,A1)+f(A1,A2)+f(A2,A3)+...+f(AN,T)
for our example set of sequences above this leads to
seq1: f(T,E1)+f(E1,E4)+f(E4,E3)+f(E3,T) = 4+5+2+2=13
seq2: f(T,E2)+f(E2,T) =10+3=13
seq3: f(T,E6)+f(E6,E5)+f(E5,T) =2+3+1=6
Utility(set) = 13+13+6=32
I try to solve a larger version (more elements than 6, rather 1000) of this problem using A* and some heuristic. Starting from zero sequences and stepwise adding elements either to existing sequences or as a new sequence, until we obtain a set of sequences containing all elements.
The problem I run into is the fact that while generating possible solutions I end up with duplicates, for example in above example all the following combinations are generated:
seq1:E1,E4,E3
seq2:E2
seq3:E6,E5
+
seq1:E1,E4,E3
seq2:E6,E5
seq3:E2
+
seq1:E2
seq2:E1,E4,E3
seq3:E6,E5
+
seq1:E2
seq2:E6,E5
seq3:E1,E4,E3
+
seq1:E6,E5
seq2:E2
seq3:E1,E4,E3
+
seq1:E6,E5
seq2:E1,E4,E3
seq3:E2
which all have equal utility, since the order of the sequences does not matter.
These are all permutations of the 3 sequences, since the number of sequences is arbitrairy there can be as much sequences as elements and a faculty(!) amount of duplicates generated...
One way to solve such a problem is keeping already visited states and don't revisit them. However since storing all visited states requires a huge amount of memory and the fact that comparing two states can be a quite expensive operation, I was wondering whether there wasn't a way I could avoid generating these in the first place.
THE QUESTION:
Is there a way to stepwise construct all these sequence constraining the adding of elements in a way that only combinations of sequences are generated rather than all variations of sequences.(or limit the number of duplicates)
As an example, I only found a way to limit the amount of 'duplicates' generated by stating that an element Ei should always be in a seqj with j<=i, therefore if you had two elements E1,E2 only
seq1:E1
seq2:E2
would be generated, and not
seq1:E2
seq2:E1
I was wondering whether there was any such constraint that would prevent duplicates from being generated at all, without failing to generate all combinations of sets.
Well, it is simple. Allow generation of only such sequences that are sorted according to first member, that is, from the above example, only
seq1:E1,E4,E3
seq2:E2
seq3:E6,E5
would be correct. And this you can guard very easily: never allow additional sequence that has its first member less than the first member of its predecessor.

Resources