Construct a DFA which accepts the language L = {w | w ∈ {a,b}* and Na(w) mod 3 > Nb(w) mod 3} - state-machine

I cannot solve this problem; can anyone help?
Construct a DFA which accepts the language L = {w | w ∈ {a,b}* and Na(w) mod 3 > Nb(w) mod 3}

Make a DFA with nine states named q00, q01, q02, q10, q11, q12, q20, q21 and q22. Each state qxy corresponds to a pair (x, y) = (Na(w) mod 3, Nb(w) mod 3), so q00 is the start state. Then simply make the accepting states the ones where Na(w) mod 3 > Nb(w) mod 3 is true: q10, q20 and q21. You can lay these states out in a 3-by-3 grid and have the Na(w) component move horizontally along rows and the Nb(w) component move vertically down columns. Both rows and columns need wrap-around, since reading a third a (or b) returns the counter to 0.
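The construction above can be sketched in a few lines of Python. This is only an illustration: the tuple encoding of the states and the names `step` and `accepts` are assumptions made for this example, not part of the answer.

```python
from itertools import product

# Each state is a pair (Na(w) mod 3, Nb(w) mod 3); (x, y) plays the role of qxy.
STATES = list(product(range(3), repeat=2))
START = (0, 0)
# Accepting states are those with x > y: q10, q20 and q21.
ACCEPTING = {(x, y) for (x, y) in STATES if x > y}

def step(state, symbol):
    """Transition function: 'a' advances the first counter, 'b' the second, both mod 3."""
    x, y = state
    if symbol == "a":
        return ((x + 1) % 3, y)
    if symbol == "b":
        return (x, (y + 1) % 3)
    raise ValueError(f"symbol not in alphabet: {symbol!r}")

def accepts(word):
    """Run the DFA on word and report whether it ends in an accepting state."""
    state = START
    for symbol in word:
        state = step(state, symbol)
    return state in ACCEPTING
```

For example, `accepts("aab")` holds because the run ends in (2, 1), while `accepts("ab")` does not because it ends in (1, 1).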

Related

Precompute arbitrary Winning Strategy in Prolog

Let's say we have the Tic-Tac-Toe game at hand, and we want
to precompute a winning strategy as a tree in the following way.
Of the winning moves, only one is selected and stored in the tree.
The many losing moves the opponent has are all stored in the
tree, so that we can blindly use the tree to guide us to our
next winning move, which is then again only one in the tree,
/
o---*- ..
\ /
o---*- ..
\
and so on, one, multiple, one, multiple etc.. How would one
do this in Prolog so that computing one such tree can be done
quite quickly for Tic-Tac-Toe game and a given start configuration?
I would do it like this (adapting from minimax):
My choices are white nodes; the other player's choices are black nodes.
Build the solving tree depth-first.
If the game has ended, mark this leaf as won, tie or lost.
If a black node is not a leaf and all of its children are marked: keep all child nodes and mark the node with the worst scenario among its children (lost before tie before won).
If a white node has only one child left: copy the marking.
If a white node has one marked child which is marked as won, ignore the remaining unmarked child nodes (no traversal necessary). Mark it as won.
If a white node has two marked children: keep only the best child, preferring won over tie over lost. If the mark is won or there are no unmarked nodes left, the two rules above apply. Otherwise traverse an unmarked child and repeat this process.
The marking is only important for the minimax, not for the choice tree.
For every child node, store the move to reach it.
Form the final tree as a nested list (while traversing the tree).
Output would look like this:
[[1,
[2,[4,
[3,[..],[..]],[5,[3,[..],[..],[..]]]
]],
[3,[4,[5,..],[2,..]]],
[4,..],
[5,..]
]]
I start and have only one option: using move 1. Then black has the options 2, 3, 4 and 5, where these are all possible moves for black. If he chooses 2, I choose 4. Black now has option 3 and 5. If he chooses 5, I choose 3 and so on. Be sure to have enough RAM available ;)
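As a rough illustration of the scheme described above (one move kept at my nodes, all moves kept at the opponent's nodes), here is a minimal Python sketch. It is not the Prolog solution asked for. The board encoding (a 9-character string, cells numbered 1 to 9 row by row), the dict-based tree and the names `wins`, `other` and `solve` are all assumptions made for this example. It also evaluates every child instead of pruning early, so it is simpler but slower than the rules above.

```python
# The eight winning lines, as 0-based indices into the 9-character board string.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def wins(board, player):
    """True if player occupies a complete line."""
    return any(all(board[i] == player for i in line) for line in LINES)

def other(player):
    return "o" if player == "x" else "x"

def solve(board, player, me):
    """Return (score, tree) for a non-terminal board with `player` to move.

    The score is from `me`'s point of view (1 won, 0 tie, -1 lost).
    The tree maps a move (1..9) to the subtree that follows it: nodes where
    `me` moves keep one best move, opponent nodes keep every move.
    """
    results = {}
    for i, cell in enumerate(board):
        if cell != "-":
            continue
        nxt = board[:i] + player + board[i + 1:]
        if wins(nxt, player):
            score, sub = (1 if player == me else -1), {}
        elif "-" not in nxt:
            score, sub = 0, {}          # board full, nobody won
        else:
            score, sub = solve(nxt, other(player), me)
        results[i + 1] = (score, sub)
    if player == me:
        # White node: keep only the first best-scoring move.
        move = max(results, key=lambda m: results[m][0])
        return results[move][0], {move: results[move][1]}
    # Black node: keep all replies, score is the worst case for `me`.
    worst = min(score for score, _ in results.values())
    return worst, {m: sub for m, (_, sub) in results.items()}
```

Calling `solve("x-ox--o--", "x", "x")` asks for a strategy for x from the position with x on cells 1 and 4 and o on cells 3 and 7. Looking up the opponent's actual move in the returned tree yields the single stored reply.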
Here is some proof of the pudding. One can use aggregate_all/3 for min/max, here is a little adaptation of the code here. But the code below does not yet return a winning strategy. It only returns a first move and a score:
% best(+Board, +Player, -Move, -Score)
best(X, P, N, W) :-
    move(X, P, Y, N),
    (   win(Y, P) -> W = 1
    ;   other(P, Q),
        (   tie(Y, Q) -> W = 0
        ;   aggregate_all(max(V), best(Y, Q, _, V), H),
            W is -H
        )
    ).
We can check the moves and scores we get for a position:
?- best([[x, -, o], [x, -, -], [o, -, -]], x, N, W).
N = 2,
W = -1 ;
N = 5,
W = 1 ;
N = 6,
W = -1 ;
N = 8,
W = -1 ;
N = 9,
W = -1
Now how would we go about computing and storing a winning strategy? One idea is to replace aggregate_all/3 with findall/3. This should give us the multiple branches of a winning strategy:
% best(+Board, +Player, -Tree, -Score)
best(X, P, N-R, W) :-
    move(X, P, Y, N),
    (   win(Y, P) -> R = [], W = 1
    ;   other(P, Q),
        (   tie(Y, Q) -> R = [], W = 0
        ;   findall(M-V, best(Y, Q, M, V), L),
            max_value(L, -1, H),
            (   Q == x
            ->  filter_value(L, H, J),
                random_member(U, J),
                R = [U]
            ;   strip_value(L, R)
            ),
            W is -H
        )
    ).
We use random_member/2 for the single branches. Rerun to get different correct results:
?- best([[x, -, o], [x, -, -], [o, -, -]], x, N, W), W = 1.
N = 5-[2-[6-[]], 6-[9-[]], 8-[9-[]], 9-[6-[]]],
W = 1 ;
No
?- best([[x, -, o], [x, -, -], [o, -, -]], x, N, W), W = 1.
N = 5-[2-[8-[6-[9-[]], 9-[6-[]]]], 6-[9-[]], 8-[6-[]], 9-[6-[]]],
W = 1 ;
No
Open source:
Prolog code for the tic-tac-toe game
score via aggregate, return first move
https://gist.github.com/jburse/928f060331ed7d5307a0d3fcd6d4aae9#file-tictac2-pl
Prolog code for the tic-tac-toe game
score via findall, return random winning strategy
https://gist.github.com/jburse/928f060331ed7d5307a0d3fcd6d4aae9#file-tictac3-pl

Convert DFA to RE

I constructed a finite automaton for the language L of all strings made of the symbols 0, 1 and 2 (Σ = {0, 1, 2}) where the last symbol is not smaller than the first symbol. E.g., the strings 0, 2012, 01231 and 102 are in the language, but 10, 2021 and 201 are not in the language.
From that I built a GNFA so that I can convert it to an RE.
My RE looks like this:
(0(0+1+2)* )(1(0(1+2)+1+2)* )(2((0+1)2+2)*)
I have no idea if this is correct; I think I understand REs, but I'm not entirely sure.
Could someone please tell me if it’s correct and if not why?
There is a general method to convert any DFA into a regular expression, and it is probably what you should be using to solve this homework problem.
For your attempt specifically, you can tell whether an RE is incorrect by finding a word that should be in the language but that your RE doesn't accept, or a word that shouldn't be in the language that the RE does accept. In this case, the string 1002 should be in the language, but the RE doesn't match it.
There are two primary reasons why this string isn't matched. The first is that there should be a union rather than a concatenation between the three major parts of the language (words starting with 0, 1 and 2, respectively):
(0(0+1+2)*) (1(0(1+2)+1+2)*) (2((0+1)2+2)*) // wrong
(0(0+1+2)*) + (1(0(1+2)+1+2)*) + (2((0+1)2+2)*) // better
The second problem is that in the 1 and 2 cases, the digits smaller than the starting digit need to be repeatable:
(1(0 (1+2)+1+2)*) // wrong
(1(0*(1+2)+1+2)*) // better
If you do both of those things, the RE will be correct. I'll leave it as an exercise for you to follow that step for the 2 case.
The next thing you can try is find a way to make the RE more compact:
(1(0*(1+2)+1+2)*) // verbose
(1(0*(1+2))*) // equivalent, but more compact
This last step is just a matter of preference. You don't need the trailing +1+2 because 0* can be of zero length, so 0*(1+2) covers the +1+2 case.
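The counterexample hunt described above can be automated by brute force. The sketch below translates the textbook notation into Python `re` syntax (`+` becomes `|`) and compares each candidate against the definition of the language for all short strings. The names `in_language`, `counterexamples`, `ATTEMPT` and `FIXED` are made up for this example. `ATTEMPT` assumes the balanced-parenthesis reading of the original attempt, and `FIXED` is one possible reading of the corrected expression, including an assumed version of the 2 case that the answer leaves as an exercise.

```python
import re
from itertools import product

def in_language(word):
    """Definition of L: the last symbol is not smaller than the first (non-empty words)."""
    return word[-1] >= word[0]

def counterexamples(pattern, max_len=6):
    """List all words over {0,1,2} of length 1..max_len on which pattern and L disagree."""
    regex = re.compile(pattern)
    bad = []
    for n in range(1, max_len + 1):
        for symbols in product("012", repeat=n):
            word = "".join(symbols)
            if bool(regex.fullmatch(word)) != in_language(word):
                bad.append(word)
    return bad

# The original attempt, with concatenation between the three parts.
ATTEMPT = r"(0[012]*)(1(0[12]|1|2)*)(2((0|1)2|2)*)"
# Union between the parts, smaller digits repeatable, compact form (assumed 2 case).
FIXED = r"0[012]*|1(0*[12])*|2([01]*2)*"
```

`counterexamples(ATTEMPT)` contains 1002 among many other words, while `counterexamples(FIXED)` comes back empty for every length tried here. A bounded search like this is evidence, not a proof.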
You can use an algorithm but this DFA might be easy enough to convert as a one-off.
First, note that if the first symbol seen in the initial state is 0, you transition to state A and remain there. A is accepting. This means any string beginning with 0 is accepted. Thus, our regular expression might as well have a term like 0(0+1+2)* in it.
Second, note that if the first symbol seen in the initial state is 1, you transition to state B and remain in states B and D from that point on. You only leave B if you see 0, and you stay out of B as long as you keep seeing 0. The only way to end on D is if the last symbol you saw was 0. Therefore, strings beginning with 1 are accepted if and only if they don't end in 0. The single-symbol string 1 is accepted because B is accepting, and longer strings must end in 1 or 2, so we can have a term like 1 + 1(0+1+2)*(1+2) in our regular expression to cover these cases.
Third, note that if the first symbol seen in the initial state is 2, you transition to state C and remain in states C and E from that point on. You leave state C if you see anything but 2 and stay out of C until you see a 2 again. The only way to end up on C is if the last symbol you saw was 2. Therefore, strings beginning with 2 are accepted if and only if they end in 2 (which includes the single-symbol string 2). We can have a term like 2 + 2(0+1+2)*2 in our regular expression to cover these cases.
Finally, we see that there are no other cases to consider; our three terms cover all cases and the union of them fully describes our language:
0(0+1+2)* + 1 + 1(0+1+2)*(1+2) + 2 + 2(0+1+2)*2
It was easy to just write out the answer here because this DFA is sort of like three simple DFAs put together with a start state. More complicated DFAs might be easier to convert to REs using algorithms that don't require you to understand or follow what the DFA is doing.
Note that if the start state is accepting (mentioned in a comment on another answer) the RE changes as follows:
e + 0(0+1+2)* + 1 + 1(0+1+2)*(1+2) + 2 + 2(0+1+2)*2
Basically, we just tack the empty string onto it since it is not already generated by any of the other parts of the aggregate expression.
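The DFA this answer reasons about is not shown in the thread, so the table below is reconstructed from the prose: states A to E as named above, a start state called S here (an assumed name), and A, B and C accepting. Simulating it against the definition of the language is a cheap sanity check on the case analysis. `DELTA` and `accepts` are names chosen for this example.

```python
from itertools import product

# Transition table reconstructed from the description of the DFA (an assumption).
DELTA = {
    "S": {"0": "A", "1": "B", "2": "C"},   # the first symbol picks the branch
    "A": {"0": "A", "1": "A", "2": "A"},   # started with 0: everything is accepted
    "B": {"0": "D", "1": "B", "2": "B"},   # started with 1, last symbol was 1 or 2
    "D": {"0": "D", "1": "B", "2": "B"},   # started with 1, last symbol was 0
    "C": {"0": "E", "1": "E", "2": "C"},   # started with 2, last symbol was 2
    "E": {"0": "E", "1": "E", "2": "C"},   # started with 2, last symbol was 0 or 1
}
ACCEPTING = {"A", "B", "C"}

def accepts(word):
    """Run the DFA from S and report whether it halts in an accepting state."""
    state = "S"
    for symbol in word:
        state = DELTA[state][symbol]
    return state in ACCEPTING
```

Under this reading, the one-symbol words 1 and 2 end in B and C respectively, so both are accepted.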
You have the equivalent of what is known as a right-linear system. It's right-linear because the variables occur on the right hand sides only to the first degree and only on the right-hand sides of each term. The system that you have may be written - with a change in labels from 0,1,2 to u,v,w - as
S ≥ u A + v B + w C
A ≥ 1 + (u + v + w) A
B ≥ 1 + u D + (v + w) B
C ≥ 1 + (u + v) E + w C
D ≥ u D + (v + w) B
E ≥ (u + v) E + w C
The underlying algebra is known as a Kleene algebra. It is defined by the following identities that serve as its fundamental properties
(xy)z = x(yz), x1 = x = 1x,
(x + y) + z = x + (y + z), x + 0 = x = 0 + x,
y0z = 0, w(x + y)z = wxz + wyz,
x + y = y + x, x + x = x,
with a partial ordering relation defined by
x ≤ y ⇔ y ≥ x ⇔ ∃z(x + z = y) ⇔ x + y = y
With respect to this ordering relation, all finite subsets have least upper bounds, including the following
0 = ⋁ ∅, x + y = ⋁ {x, y}
The sum operator "+" is the least upper bound operator.
The system you have is a right-linear fixed point system, since it expresses the variables on the left as a (right-linear) function, as given on the right, of the variables. The object being specified by the system is the least solution with respect to the ordering; i.e. the least fixed point solution; and the regular expression sought out is the value that the main variable has in the least fixed point solution.
The last axiom(s) for Kleene algebras can be stated in any of a number of equivalent ways, including the following:
0* = 1
the least fixed point solution to x ≥ a + bx + xc is x = b* a c*.
There are other ways to express it. A consequence is that one has identities such as the following:
1 + a a* = a* = 1 + a* a
(a + b)* = a* (b a*)*
(a b)* a = a (b a)*
In general, right linear systems, such as the one corresponding to your problem may be written in vector-matrix form as 𝐪 ≥ 𝐚 + A 𝐪, with the least fixed point solution given in matrix form as 𝐪 = A* 𝐚. The central theorem of Kleene algebras is that all finite right-linear systems have least fixed point solutions; so that one can actually define matrix algebras over Kleene algebras with product and sum given respectively as matrix product and matrix sum, and that this algebra can be made into a Kleene algebra with a suitably-defined matrix star operation through which the least fixed point solution is expressed. If the matrix A decomposes into block form as
[ B  C ]
[ D  E ]
then the star A* of the matrix has the block form
[ (B + C E* D)*         (B + C E* D)* C E* ]
[ (E + D B* C)* D B*    (E + D B* C)*      ]
So, what this is actually saying is that for a vector-matrix system of the form
x ≥ a + B x + C y
y ≥ b + D x + E y
the least fixed point solution is given by
x = (B + C E* D)* (a + C E* b)
y = (E + D B* C)* (D B* a + b)
The star of a matrix, if expressed directly in terms of its components, will generally be huge and highly redundant. For an n×n matrix, it has size O(n³) - cubic in n - if you allow for redundant sub-expressions to be defined by macros. Otherwise, if you in-line insert all the redundancy then I think it blows up to a highly-redundant mess that is exponential in n in size.
So, there's intelligence required and involved (literally meaning: AI) in finding or pruning optimal forms that avoid the blow-up as much as possible. That's a non-trivial job for any purported matrix solver and regular expression synthesis compiler.
A heuristic, for your system, is to solve for the variables that don't have a "1" on the right-hand side and in-line substitute the solutions, working bottom-up along the dependency chain of the variables. That would mean starting with D and E first:
D ≥ u* (v + w) B
E ≥ (u + v)* w C
In-line substitute into the other inequations
S ≥ u A + v B + w C
A ≥ 1 + (u + v + w) A
B ≥ 1 + u u* (v + w) B + (v + w) B
C ≥ 1 + (u + v) (u + v)* w C + w C
Apply Kleene algebra identities (e.g. x x* y + y = x* y)
S ≥ u A + v B + w C
A ≥ 1 + (u + v + w) A
B ≥ 1 + u* (v + w) B
C ≥ 1 + (u + v)* w C
Solve for the next layer of dependency up: A, B and C:
A ≥ (u + v + w)*
B ≥ (u* (v + w))*
C ≥ ((u + v)* w)*
Apply some more Kleene algebra (e.g. (x* y)* = 1 + (x + y)* y) to get
B ≥ 1 + N (v + w)
C ≥ 1 + N w
where, for convenience we set N = (u + v + w)*. In-line substitute at the top-level:
S ≥ u N + v (1 + N (v + w)) + w (1 + N w).
The least fixed point solution, in the main variable S, is thus:
S = u N + v + v N (v + w) + w + w N w.
where
N = (u + v + w)*.
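One way to gain confidence in a derivation like this is to interpret the algebra concretely: take elements to be sets of strings cut off at some length K, with + as union, product as concatenation and star as iterated concatenation. The sketch below does that and checks the two identities used above as well as the final expression for S. The cutoff `K` and the helper names `cat` and `star` are assumptions of this example, and a bounded check is not a proof.

```python
from itertools import product

K = 6  # only words of length <= K are tracked (assumed cutoff)

def cat(x, y):
    """Concatenation of two truncated languages."""
    return frozenset(a + b for a in x for b in y if len(a) + len(b) <= K)

def star(x):
    """Kleene star, computed as the least fixed point of r = 1 + r x."""
    result = frozenset({""})
    while True:
        bigger = result | cat(result, x)
        if bigger == result:
            return result
        result = bigger

ONE = frozenset({""})                        # the unit "1": just the empty word
u, v, w = (frozenset({c}) for c in "012")    # relabelled symbols 0, 1, 2

# Identity used to simplify B and C: x x* y + y = x* y
lhs1 = cat(cat(u, star(u)), v | w) | (v | w)
rhs1 = cat(star(u), v | w)

# Identity used in the last step: (x* y)* = 1 + (x + y)* y
N = star(u | v | w)
lhs2 = star(cat(star(u), v | w))
rhs2 = ONE | cat(N, v | w)

# Final solution: S = u N + v + v N (v + w) + w + w N w
S = cat(u, N) | v | cat(cat(v, N), v | w) | w | cat(cat(w, N), w)

# The language itself: non-empty words whose last symbol is >= the first.
EXPECTED = frozenset(
    "".join(t)
    for n in range(1, K + 1)
    for t in product("012", repeat=n)
    if t[-1] >= t[0]
)
```

Comparing `lhs1 == rhs1`, `lhs2 == rhs2` and `S == EXPECTED` confirms each step for all words up to length K.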
As you can already see, even with this simple example, there's a lot of chess-playing to navigate through the system to find an optimally-pruned solution. So, it's certainly not a trivial problem. What you're essentially doing is synthesizing a control-flow structure for a program in a structured programming language from a set of goto's ... essentially the core process of reverse-compiling from assembly language to a high level language.
One measure of optimization is that of minimizing the loop-depth - which here means minimizing the depth of the stars or the star height. For example, the expression x* (y x*)* has star-height 2 but reduces to (x + y)*, which has star height 1. Methods for reducing star-height come out of the research by Hashiguchi and his resolution of the minimal star-height problem. His proof and solution (dating, I believe, from the 1980's or 1990's) is complex and to this day the process still goes on of making something more practical of it and rendering it in more accessible form.
Hashiguchi's formulation was cast in the older 1950's and 1960's formulation, predating the axiomatization of Kleene algebras (which was in the 1990's), so to date, nobody has rewritten his solution in entirely algebraic form within the framework of Kleene algebras anywhere in the literature ... as far as I'm aware. Whoever accomplishes this will have, as a result, a core element of an intelligent regular expression synthesis compiler, but also of a reverse-compiler and programming language synthesis de-compiler. Essentially, with something like that on hand, you'd be able to read code straight from binary and the lid will be blown off the world of proprietary systems. [Bite tongue, bite tongue, mustn't reveal secret yet, must keep the ring hidden.]

Tips to prove a language is not regular using Pumping Lemma

I am trying to prove that the following language is not regular using the pumping lemma
L = {a^i b^j | i = 2j for some j ≥ 0}
I have decided to choose s = a^{2p} b^p; in this way |s| ≥ p and I can split it into three pieces xyz where for every i ≥ 0, xy^i z ∈ L.
Any tips for continuing the proof?
Thanks!
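Choosing s = a^{2p} b^p is right!
As said by Grijesh Chauhan, we must consider every possible way of breaking s into xyz.
The pumping lemma guarantees |xy| ≤ p and |y| > 0, so x and y lie entirely within the leading a's, and every split has the form:
x = a^k
y = a^l
z = a^{2p-k-l} b^p
where k + l ≤ p and l > 0.
Taking i = 2, you have xy^2z:
xy^2z = a^k a^l a^l a^{2p-k-l} b^p
that is:
xy^2z = a^{2p+l} b^p
Since l > 0, the number of a's is 2p + l ≠ 2p, so xy^2z is not in L. This contradicts the pumping lemma, so L is not regular.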
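The argument can be spot-checked by brute force for small pumping lengths: enumerate every split of s allowed by the lemma and confirm that pumping with i = 2 always leaves the language. This is only a sanity check for concrete values of p, not a replacement for the proof, and the function names are invented for this sketch.

```python
def in_language(s):
    """Membership test for L = { a^i b^j : i = 2j }."""
    i = len(s) - len(s.lstrip("a"))   # number of leading a's
    j = len(s) - i                    # everything after them must be b's
    return s == "a" * i + "b" * j and i == 2 * j

def every_split_fails(p, i=2):
    """True if, for s = a^(2p) b^p, every split with |xy| <= p and |y| > 0 pumps out of L."""
    s = "a" * (2 * p) + "b" * p
    for k in range(0, p):                  # k = |x|
        for l in range(1, p - k + 1):      # l = |y| >= 1 and k + l <= p
            x, y, z = s[:k], s[k:k + l], s[k + l:]
            if in_language(x + y * i + z):
                return False               # this split survives pumping
    return True
```

For instance, `every_split_fails(3)` inspects all six admissible splits of aaaaaabbb.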

Lattice Points in a 2D plane

Given two points in a 2D plane, how many lattice points lie between these two points?
For example, for A (3, 3) and B (-1, -1) the output is 5. The points are: (-1, -1), (0, 0), (1, 1), (2, 2) and (3, 3).
Apparently by "lattice points lie within two points" you mean (letting LP stand for lattice point) the LP's on the line between two points (A and B).
The equation of line AB is y = m*x + b for some slope and intercept numbers m and b. For cases of interest, we can assume m, b are rational, because if either is irrational there is at most 1 LP on AB. (Proof: If 2 or more LP's are on line, it has rational slope, say e/d, with d,e integers; then y=b+x*e/d so at LP (X,Y) on line, d*b = d*Y-X*e, which is an integer, hence b is rational.)
In following, we suppose A = (u,v) and B = (w,z), with u,w and v,z having rational differences, and typically write y = mx+b with m=e/d and b=g/f.
Case 1. A, B both are LP's: Let q = gcd(u-w,v-z); take d = (u-w)/q and e = (v-z)/q and it's easily seen that there are q+1 lattice points on AB.
Case 2a. A is an LP, B isn't: If u-w = h/i and v-z = j/k, then m = j*i/(h*k). Let q = gcd(j*i,h*k), d = h*k/q, e=j*i/q, w' = u + d*floor((w-u)/d) and similarly for z', then solve (u,v),(w',z') as in case 1. For case 2b swap A and B.
Case 3. Neither A nor B is an LP: After finding an LP C on the extended line through A,B, use arithmetic like in Case 2 to find LP A' inside line segment AB and apply case 2. To find A', if m = e/d, b = g/f, note that f*d*y = d*g + e*f*x is of the form p*x + q*y = r, a simple Diophantine equation that is solvable for C=(x,y) iff gcd(p,q) divides r.
Complexity: gcd(m,n) is O(ln(min(m,n))), so the algorithm's complexity is typically O(ln(Dx)) or O(ln(Dy)) if A and B are separated by x and y distances Dx and Dy.
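Case 1 above (both endpoints are lattice points) is short enough to write out. The following Python sketch covers only that case, and the function names are assumptions for this example. It relies on `math.gcd`, which returns a non-negative result for negative arguments and 0 for gcd(0, 0).

```python
from math import gcd

def count_lattice_points(a, b):
    """Number of lattice points on segment AB, endpoints included (A, B integer points)."""
    (u, v), (w, z) = a, b
    return gcd(u - w, v - z) + 1

def lattice_points(a, b):
    """The lattice points themselves, walking from A to B in q equal steps."""
    (u, v), (w, z) = a, b
    q = gcd(u - w, v - z)
    if q == 0:                      # A and B coincide
        return [a]
    d, e = (w - u) // q, (z - v) // q   # smallest integer step along the line
    return [(u + t * d, v + t * e) for t in range(q + 1)]
```

With A = (3, 3) and B = (-1, -1) from the question, q = gcd(4, 4) = 4, so there are q + 1 = 5 points.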

Problem detecting cyclic numbers in Haskell

I am doing problem 61 at project Euler and came up with the following code (to test the case they give):
p3 n = n*(n+1) `div` 2
p4 n = n*n
p5 n = n*(3*n -1) `div` 2
p6 n = n*(2*n -1)
p7 n = n*(5*n -3) `div` 2
p8 n = n*(3*n -2)
x n = take 2 $ show n
x2 n = reverse $ take 2 $ reverse $ show n
pX p = dropWhile (< 999) $ takeWhile (< 10000) [p n|n<-[1..]]
isCyclic2 (a,b,c) = x2 b == x c && x2 c == x a && x2 a == x b
ns2 = [(a,b,c)|a <- pX p3 , b <- pX p4 , c <- pX p5 , isCyclic2 (a,b,c)]
All ns2 does is return an empty list. isCyclic2 accepts the arguments given as the example in the problem statement, yet that series doesn't come up in the result. The problem must lie in the list comprehension ns2, but I can't see where. What have I done wrong?
Also, how can I make it so that the pX only gets the pX (n) up to the pX used in the previous pX?
PS: in case you thought I completely missed the problem, I will get my final solution with this:
isCyclic (a,b,c,d,e,f) = x2 a == x b && x2 b == x c && x2 c == x d && x2 d == x e && x2 e == x f && x2 f == x a
ns = [[a,b,c,d,e,f]|a <- pX p3 , b <- pX p4 , c <- pX p5 , d <- pX p6 , e <- pX p7 , f <- pX p8 ,isCyclic (a,b,c,d,e,f)]
answer = sum $ head ns
The order is important. The cyclic numbers in the question are 8128, 2882, 8281, and these are not P3/127, P4/91, P5/44 but P3/127, P5/44, P4/91.
Your code is only checking in the order 8128, 8281, 2882, which is not cyclic.
You would get the result if you check for
isCyclic2 (a,c,b)
in your list comprehension.
EDIT: Wrong Problem!
I assumed you were talking about the circular number problem, Sorry!
There is a more efficient way to do this with something like this:
take (2 * l x -1) . cycle $ show x
where l = length . show
Try that and see where it gets you.
If I understand you right here, you're no longer asking why your code doesn't work but how to make it faster. That's actually the whole fun of Project Euler to find an efficient way to solve the problems, so proceed with care and first try to think of reducing your search space yourself. I suggest you let Haskell print out the three lists pX p3, pX p4, pX p5 and see how you'd go about looking for a cycle.
If you would proceed like your list comprehension, you'd start with the first element of each list, 1035, 1024, 1080. I'm pretty sure you would stop right after picking 1035 and 1024 and not test for cycles with any value from P5, let alone try all the permutations of the combinations involving these two numbers.
(I haven't actually worked on this problem yet, so this is how I would go about speeding it up. There may be some math wizardry out there that's even faster)
First, start looking at the numbers you get from pX. You can drop more than those. For example, P3 contains 6105 - there's no way you're going to find a number in the other sets starting with '05'. So you can also drop those numbers where the number modulo 100 is less than 10.
Then (for the case of 3 sets), we can sometimes see after drawing two numbers that there can't be any number in the last set that will give you a cycle, no matter how you permute them (e.g. 1035 from P3 and 3136 from P4 - there can't be a cycle here).
I'd probably try to build a chain by starting with the elements from one list, one by one, and for each element, find the elements from the remaining lists that are valid successors. For those that you've found, continue trying to find the next chain element from the remaining lists. When you've built a chain with one number from every list, you just have to check if the last two digits of the last number match the first two digits of the first number.
Note when looking for successors, you again don't have to traverse the entire lists. If you're looking for a successor to 3015 from P5, for example, you can stop when you hit a number that's 1600 or larger.
If that's too slow still, you could transform the lists other than the first one to maps where the map key is the first two digits and the associated values are lists of numbers that start with those digits. Saves you from going through the lists from the start again and again.
I hope this helps a bit.
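The chain-building idea above might look roughly like the following. It is written in Python rather than Haskell purely as a language-neutral sketch, and all function names are assumptions for this example. It also folds the separate polygonal formulas into one parameterised function and drops numbers whose last two digits are below 10, as suggested earlier.

```python
from itertools import count

def polygonal(s, n):
    """n-th s-gonal number; s = 3 gives n(n+1)/2, s = 4 gives n^2, and so on."""
    return n * ((s - 2) * n - (s - 4)) // 2

def four_digit(s):
    """All 4-digit s-gonal numbers whose last two digits can start another number."""
    out = []
    for n in count(1):
        value = polygonal(s, n)
        if value >= 10000:
            return out
        if value >= 1000 and value % 100 >= 10:
            out.append(value)

def extend(chain, remaining, lists):
    """Grow the chain with one number from each remaining list, then close the cycle."""
    if not remaining:
        if chain[-1] % 100 == chain[0] // 100:
            yield list(chain)
        return
    for s in remaining:
        for value in lists[s]:
            if value // 100 == chain[-1] % 100:      # valid successor
                yield from extend(chain + [value], remaining - {s}, lists)

def cyclic_chains(sides):
    """Yield every cyclic chain using one number per polygonal type in sides."""
    lists = {s: four_digit(s) for s in sides}
    first, rest = sides[0], frozenset(sides[1:])
    for value in lists[first]:
        yield from extend([value], rest, lists)
```

`list(cyclic_chains((3, 4, 5)))` includes the example chain 8128, 2882, 8281 from the problem statement. Because each chain always starts from the first list, a cycle is not reported again for each of its rotations.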
By the way, I sense some repetition in your code.
You can unite your [p3, p4, p5, p6, p7, p8] functions into one function that takes the 3 from p3 (and so on) as a parameter.
To find the pattern, you can rewrite all the functions in the form
pX n = ... `div` 2
