L={a^nb^m , m%n=0} is a context-free language? [closed] - context-free-language

Closed. This question is not about programming or software development. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 5 days ago.
Improve this question
I tried using context-free pumping lemma but i can't find a string that works, i also tried to build a PDA but it also didn't work.
using the pumping lemma definition s=uvwxy
i tried with the string a^n a b^n^2 bb
in the case vx is composed only by "a" with i=2 (uvvwxxy)
in the case vx is composed only by "b" with i=0 (uwy)
but i can't find a working case for vx composed by both "a" and "b" (example vx="abb")
In conclusion i think that L is recursive and not context-free but i can't prove it

One way to see this is to construct a context-free grammar that generates the language. Here's an example of such a grammar:
S -> aSBB | ε
B -> bB | ε
The nonterminal symbol S generates a sequence of a's, followed by any number of pairs of b's. The nonterminal symbol B generates any number of b's, but only in multiples of n, since S can only generate a sequence of a's of length n.
Proof using the pumping lemma for context-free languages:
Assume L is context-free. Then, there exists a pumping length p such that any string s in L with |s| >= p can be written as s = uvwxy, where |vwx| <= p, |vx| >= 1, and for all i >= 0, uv^iwx^iy is in L.
Consider the string s = a^pb^p. Since p is the pumping length, we can write s as s = uvwxy, where |vwx| <= p and |vx| >= 1. There are three cases to consider:
vx contains only a's: In this case, we can pump up vx by setting i = 2. This gives us a^(p + |vx|)b^p, which is not in L since (p + |vx|) % n != 0.
vx contains only b's: In this case, we can pump down vx by setting i = 0. This gives us a^p(b^p - |vx|), which is not in L since (b^p - |vx|) % n != 0.
vx contains both a's and b's: In this case, we can pump up the a's in vx and pump down the b's in vx simultaneously by setting i = 2. This gives us a^(p + k)b^(p - l), where k > 0 and l > 0. Since p % n = 0, we have (p + k) % n = 0 if and only if k % n = 0. Similarly, (p - l) % n = 0 if and only if l % n = 0. Thus, we need both k and l to be divisible by n, which is impossible since k + l = |vx| <= p < n.
Therefore, we have a contradiction in all cases. Hence, L is not a context-free language.

Related

Convert DFA to RE

I constructed a finite automata for the language L of all strings made of the symbols 0, 1 and 2 (Σ = {0, 1, 2}) where the last symbol is not smaller than the first symbol. E.g., the strings 0, 2012, 01231 and 102 are in the language, but 10, 2021 and 201 are not in the language.
Then from that an GNFA so I can convert to RE.
My RE looks like this:
(0(0+1+2)* )(1(0(1+2)+1+2)* )(2((0+1)2+2))*)
I have no idea if this is correct, as I think I understand RE but not entirely sure.
Could someone please tell me if it’s correct and if not why?
There is a general method to convert any DFA into a regular expression, and is probably what you should be using to solve this homework problem.
For your attempt specifically, you can tell whether an RE is incorrect by finding a word that should be in the language, but that your RE doesn't accept, or a word that shouldn't be in the language that the RE does accept. In this case, the string 1002 should be in the language, but the RE doesn't match it.
There are two primary reasons why this string isn't matched. The first is that there should be a union rather than a concatenation between the three major parts of the language (words starting with 0, 1 and 2, respectively:
(0(0+1+2)*) (1(0(1+2)+1+2)*) (2((0+1)2+2))*) // wrong
(0(0+1+2)*) + (1(0(1+2)+1+2)*) + (2((0+1)2+2))*) // better
The second problem is that in the 1 and 2 cases, the digits smaller than the starting digit need to be repeatable:
(1(0 (1+2)+1+2)*) // wrong
(1(0*(1+2)+1+2)*) // better
If you do both of those things, the RE will be correct. I'll leave it as an exercise for you to follow that step for the 2 case.
The next thing you can try is find a way to make the RE more compact:
(1(0*(1+2)+1+2)*) // verbose
(1(0*(1+2))*) // equivalent, but more compact
This last step is just a matter of preference. You don't need the trailing +1+2 because 0* can be of zero length, so 0*(1+2) covers the +1+2 case.
You can use an algorithm but this DFA might be easy enough to convert as a one-off.
First, note that if the first symbol seen in the initial state is 0, you transition to state A and remain there. A is accepting. This means any string beginning with 0 is accepted. Thus, our regular expression might as well have a term like 0(0+1+2)* in it.
Second, note that if the first symbol seen in the initial state is 1, you transition to state B and remain in states B and D from that point on. You only leave B if you see 0 and you stay out of B as long as you keep seeing 0. The only way to end on D is if the last symbol you saw was 0. Therefore, strings beginning with 1 are accepted if and only if the strings don't end in 0. We can have a term like 1(0+1+2)*(1+2) in our regular expression as well to cover these cases.
Third, note that if the first symbol seen in the initial state is 2, you transition to state C and remain in states C and E from that point on. You leave state C if you see anything but 2 and stay out of B until you see a 2 again. The only way to end up on C is if the last symbol you saw was 2. Therefore, strings beginning with 2 are accepted if and only if the strings end in 2. We can have a term like 2(0+1+2)*(2) in our regular expression as well to cover these cases.
Finally, we see that there are no other cases to consider; our three terms cover all cases and the union of them fully describes our language:
0(0+1+2)* + 1(0+1+2)*(1+2) + 2(0+1+2)*2
It was easy to just write out the answer here because this DFA is sort of like three simple DFAs put together with a start state. More complicated DFAs might be easier to convert to REs using algorithms that don't require you understand or follow what the DFA is doing.
Note that if the start state is accepting (mentioned in a comment on another answer) the RE changes as follows:
e + 0(0+1+2)* + 1(0+1+2)*(1+2) + 2(0+1+2)*2
Basically, we just tack the empty string onto it since it is not already generated by any of the other parts of the aggregate expression.
You have the equivalent of what is known as a right-linear system. It's right-linear because the variables occur on the right hand sides only to the first degree and only on the right-hand sides of each term. The system that you have may be written - with a change in labels from 0,1,2 to u,v,w - as
S ≥ u A + v B + w C
A ≥ 1 + (u + v + w) A
B ≥ 1 + u D + (v + w) B
C ≥ 1 + (u + v) E + w C
D ≥ u D + (v + w) B
E ≥ (u + v) E + w C
The underlying algebra is known as a Kleene algebra. It is defined by the following identities that serve as its fundamental properties
(xy)z = x(yz), x1 = x = 1x,
(x + y) + z = x + (y + z), x + 0 = x = 0 + x,
y0z = 0, w(x + y)z = wxz + wyz,
x + y = y + x, x + x = x,
with a partial ordering relation defined by
x ≤ y ⇔ y ≥ x ⇔ ∃z(x + z = y) ⇔ x + y = y
With respect to this ordering relation, all finite subsets have least upper bounds, including the following
0 = ⋁ ∅, x + y = ⋁ {x, y}
The sum operator "+" is the least upper bound operator.
The system you have is a right-linear fixed point system, since it expresses the variables on the left as a (right-linear) function, as given on the right, of the variables. The object being specified by the system is the least solution with respect to the ordering; i.e. the least fixed point solution; and the regular expression sought out is the value that the main variable has in the least fixed point solution.
The last axiom(s) for Kleene algebras can be stated in any of a number of equivalent ways, including the following:
0* = 1
the least fixed point solution to x ≥ a + bx + xc is x = b* a c*.
There are other ways to express it. A consequence is that one has identities such as the following:
1 + a a* = a* = 1 + a* a
(a + b)* = a* (b a*)*
(a b)* a = a (b a)*
In general, right linear systems, such as the one corresponding to your problem may be written in vector-matrix form as 𝐪 ≥ 𝐚 + A 𝐪, with the least fixed point solution given in matrix form as 𝐪 = A* 𝐚. The central theorem of Kleene algebras is that all finite right-linear systems have least fixed point solutions; so that one can actually define matrix algebras over Kleene algebras with product and sum given respectively as matrix product and matrix sum, and that this algebra can be made into a Kleene algebra with a suitably-defined matrix star operation through which the least fixed point solution is expressed. If the matrix A decomposes into block form as
B C
D E
then the star A* of the matrix has the block form
(B + C E* D)* (B + C E* D)* C E*
(E + D B* C)* D B* (E + D B* C)*
So, what this is actually saying is that for a vector-matrix system of the form
x ≥ a + B x + C y
y ≥ b + D x + E y
the least fixed point solution is given by
x = (B + C E* D)* (a + C E* b)
y = (E + D B* C)* (D B* a + b)
The star of a matrix, if expressed directly in terms of its components, will generally be huge and highly redundant. For an n×n matrix, it has size O(n³) - cubic in n - if you allow for redundant sub-expressions to be defined by macros. Otherwise, if you in-line insert all the redundancy then I think it blows up to a highly-redundant mess that is exponential in n in size.
So, there's intelligence required and involved (literally meaning: AI) in finding or pruning optimal forms that avoid the blow-up as much as possible. That's a non-trivial job for any purported matrix solver and regular expression synthesis compiler.
An heuristic, for your system, is to solve for the variables that don't have a "1" on the right-hand side and in-line substitute the solutions - and to work from bottom-up in terms of the dependency chain of the variables. That would mean starting with D and E first
D ≥ u* (v + w) B
E ≥ (u + v)* w C
In-line substitute into the other inequations
S ≥ u A + v B + w C
A ≥ 1 + (u + v + w) A
B ≥ 1 + u u* (v + w) B + (v + w) B
C ≥ 1 + (u + v) (u + v)* w C + w C
Apply Kleene algebra identities (e.g. x x* y + y = x* y)
S ≥ u A + v B + w C
A ≥ 1 + (u + v + w) A
B ≥ 1 + u* (v + w) B
C ≥ 1 + (u + v)* w C
Solve for the next layer of dependency up: A, B and C:
A ≥ (u + v + w)*
B ≥ (u* (v + w))*
C ≥ ((u + v)* w)*
Apply some more Kleene algebra (e.g. (x* y)* = 1 + (x + y)* y) to get
B ≥ 1 + N (v + w)
C ≥ 1 + N w
where, for convenience we set N = (u + v + w)*. In-line substitute at the top-level:
S ≥ u N + v (1 + N (v + w)) + w (1 + N w).
The least fixed point solution, in the main variable S, is thus:
S = u N + v + v N (v + w) + w + w N w.
where
N = (u + v + w)*.
As you can already see, even with this simple example, there's a lot of chess-playing to navigate through the system to find an optimally-pruned solution. So, it's certainly not a trivial problem. What you're essentially doing is synthesizing a control-flow structure for a program in a structured programming language from a set of goto's ... essentially the core process of reverse-compiling from assembly language to a high level language.
One measure of optimization is that of minimizing the loop-depth - which here means minimizing the depth of the stars or the star height. For example, the expression x* (y x*)* has star-height 2 but reduces to (x + y)*, which has star height 1. Methods for reducing star-height come out of the research by Hashiguchi and his resolution of the minimal star-height problem. His proof and solution (dating, I believe, from the 1980's or 1990's) is complex and to this day the process still goes on of making something more practical of it and rendering it in more accessible form.
Hashiguchi's formulation was cast in the older 1950's and 1960's formulation, predating the axiomatization of Kleene algebras (which was in the 1990's), so to date, nobody has rewritten his solution in entirely algebraic form within the framework of Kleene algebras anywhere in the literature ... as far as I'm aware. Whoever accomplishes this will have, as a result, a core element of an intelligent regular expression synthesis compiler, but also of a reverse-compiler and programming language synthesis de-compiler. Essentially, with something like that on hand, you'd be able to read code straight from binary and the lid will be blown off the world of proprietary systems. [Bite tongue, bite tongue, mustn't reveal secret yet, must keep the ring hidden.]

Inverse of a pow(a,b,n) function in python decryption code [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 4 years ago.
Improve this question
I am working on a python decryption code using an encryption code which is already available.
In the encrytpion code, I have
pow (b, xyz, abc)
A number gets encrypted and is passed onto an array.
Now while decrypting, i need to get the value of "b" (from the pow function above) as i have the value in Array.
Using modulus gives the values in range and not the exact value and that is needed for my decryption logic to work.
How to continue with this?
First you factorise 928108726777524737. It has 2 prime factors, call them P and Q.
Then you need to find a value d such that d * 65539 mod (P-1)(Q-1) == 1 (use the Extended Euclidian Algorithm for this).
Once you have done that then given c = pow (b, 65539, 928108726777524737) you can calculate back to b with pow(c, d, 928108726777524737)
To help you a bit more P=948712711, Q=978282167 giving d=872653594828486879
>>> c = pow(99, 65539, 928108726777524737)
>>> pow(c, 872653594828486879, 928108726777524737)
99
Of course in real life you would start with the prime factors and make them a lot larger than this in which case it would be impractical to reverse the process without already knowing the factors. For small values such as this is it is easy to factorise and calculate the inverse.
Calculation of d:
def egcd(a, b):
x,y, u,v = 0,1, 1,0
while a != 0:
q, r = b//a, b%a
m, n = x-u*q, y-v*q
b,a, x,y, u,v = a,r, u,v, m,n
gcd = b
return gcd, x, y
First find the prime factors:
>>> P, Q = 948712711, 978282167
>>> P*Q
928108726777524737
>>> egcd(65539, (P-1)*(Q-1))
(1, -55455130022042981, 3916)
We want the middle value x:
>>> gcd, x, y = egcd(65539, (P-1)*(Q-1))
But we have to make it positive which we can do by adding on the (P-1)*(Q-1) value:
>>> x + (P-1)*(Q-1)
872653594828486879

Proving nonregularity

Suppose I have a language L = {wxwR} where wR is the reverse of w, w and x has minimum length of 1, w can consist of either 0's or 1's, while x can only consist of 1's.
How do I prove that this language is not regular? Is there any other way than using the pumping lemma? If using the pumping lemma, I'm still figuring out what x,y, and z I should pick for the string s=xyz, I would appreciate if you give me any hint.
Thanks!
You should take another look into how to use the pumping lemma. You have to pick a string s, such that for each partition x,y,z, one of the pumping lemma conditions is violated.
So, let n be the "pumping-lemma-number". Pick s= 0^n 1 0^n.
From 1) you know that |xy| <= n. From 2) you know that |y|>=1. Thus y only contains 0s.
Following 3) uv^2w has also to be in L, but the first block of 0s is longer than the second one. This means 3) is violated and thus L is not regular.

Tips to proof a language is not regular using Pumping Lemma

I am trying to prove that the following language is not regular using the pumping lemma
L = {ai bj | i = 2j for some j ≥ 0}
I have decided to choose s = a2p bp, in this way |s| ≥ p and I can split it in three pieces xyz where for every i ≥ 0, xyiz ∈ L.
Any tips for continuing the proof?
Thanks!
Choose s = a2p bp is right!
As said by Grijesh Chauhan we must break strings in L in all possible ways.
So you can split s in:
x=ak
y=al
z=a2p-k-l bp
where |xy|≥ 0 and |y|>0.
Taking i=2, you have xy2z:
s = ak alal a2p-k-l bp
that is:
s = a2p+l bp
Since l contains at least one 'a' (because |y|>0). You can say L is not regular

Can someone help me with this proof using the pumping lemma?

I just started reading about the pumping lemma and know how to perform a few proofs, mostly by contradiction. It is only this particular question which I don't seem to find an answer for. I have no idea on how to begin. I can assume that there has to be a pumping length P and that for all w element of L that the LENGTH(w) >= P. And of course that we can write w as xyz with the three normal conditions of the pumping lemma.
I have to proof that the following language is non regular:
L = {x + y = z | x,y,z element of {0,1}* and #(x) + #(y) = #(z) }
Can someone help me on this, I really want to master the process in proofing these kind of questions?
Edit:
Sorry, forgot to say that the alphabet is {0,1,+,=} and # means the binary value of the string. Like #(00101) = 5 and #(110) = 6.
Since you want to master the process, I'll point out a few things before showing a proof.
The first thing to notice is that the + and the = may only appear once each. So when you write your string w as w = abc, the pumped portion, b, cannot contain + or = otherwise you'd reach a trivial contradiction (I'm not using the more standard w = xyz notation to avoid confusion with L's definition).
Another thing to notice is that normally, you'd pick a specific string w to pump. In this case, it could be easier to pick a class of strings that share a certain property. The pumping lemma only requires you to reach a contratiction using one string, but there's no reason you can't reach a contradiction with multiple strings.
Proof (in a spoiler):
So let w be any string in L such that |w| ≥ P and x, y, z do not contain leading 0's. By the pumping lemma we can write w as w = abc By pumping lemma, we know b is not empty. Since b cannot contain + or =, it is fully contained in either x, y, or z. Pumping w with any i ≠ 1 results in the binary equation no longer holding since exactly one of x, y, z would be a different number (this is why we needed the no leading 0's bit).
Choose as the string 1(0^n+1) + 1(0^n) = 11(0^n).
In other words, your string will read "the sum of two to the power n+2 plus two to the power n+1 is equal to 11 followed by n zeroes".
Since the string to be pumped will consist entirely of symbols from the first addend, pumping must change the number represented (adding or removing digits to a number will change the number; this is true because our string doesn't contain leading zeroes) and if x + y = z holds, then x' + y = z does not hold if x' != x (over integers, at least).
Since the pumping lemma requires pumped strings to be in the language, and pumping this string fails, we have that the language is not regular.

Resources