Use the pumping lemma to show that the following language is not regular: L = {a^n b^m | n = 2m}

Use the pumping lemma to show that the following language is not a regular language: L = {a^n b^m | n = 2m}

Suppose the language were regular with pumping length p, and choose the string w = a^(2p) b^p. The pumping lemma says we can write w = uvx such that |uv| <= p, |v| > 0, and for all natural numbers i, u(v^i)x is also a string in the language. Because |uv| <= p, the substring uv of w consists entirely of instances of the symbol a. Pumping up or down by choosing a value for i other than one guarantees that the number of a's in the resulting string will change, while the number of b's stays the same. Since the number of a's is twice the number of b's only when i = 1, this is a contradiction. Therefore, the language cannot be regular.
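As a sanity check of this argument, here is a minimal Python sketch. The real pumping length is unknown, so the value p = 5 is only an assumption for illustration, and the helper names `in_language` and `pumped_strings` are made up for this example. It enumerates every split w = uvx with |uv| <= p and |v| >= 1 and confirms that pumping always leaves the language.

```python
def in_language(w):
    """Membership test for L = {a^n b^m | n = 2m}."""
    n = len(w) - len(w.lstrip("a"))   # length of the leading block of a's
    rest = w[n:]
    return set(rest) <= {"b"} and n == 2 * len(rest)


def pumped_strings(w, p, i):
    """Yield u v^i x for every split w = uvx with |uv| <= p and |v| >= 1."""
    for uv_len in range(1, p + 1):
        for v_len in range(1, uv_len + 1):
            u = w[:uv_len - v_len]
            v = w[uv_len - v_len:uv_len]
            x = w[uv_len:]
            yield u + v * i + x


p = 5                                  # assumed sample pumping length
w = "a" * (2 * p) + "b" * p            # the string chosen in the proof

# No split survives pumping down (i = 0) or up (i = 2, 3).
survivors = [s for i in (0, 2, 3) for s in pumped_strings(w, p, i) if in_language(s)]
print(in_language(w), survivors)
```

A finite check like this does not replace the proof, which has to hold for every p, but it makes the role of the condition |uv| <= p visible: v can only ever contain a's.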

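L = {a^n b^m | n = 2m}. Assume that L is a regular language and let the pumping length be p. Every string in L has the form a^(2m) b^m. Take the string s = a^(2p) b^p = aa...a bb...b (2p a's followed by p b's); its total length is |s| = 3p >= p. In any split s = uvw with |uv| <= p and |v| >= 1, both u and v consist only of a's, for example u = a^(p-1), v = a, w = a^p b^p. Now if i = 2, then u v^2 w = a^(p-1) a^2 a^p b^p = a^(2p+1) b^p. Here we get an extra a in the string, so it does not belong to the given language a^(2m) b^m. The same happens for any other v = a^k with k >= 1, which gives a^(2p+k) b^p. So our assumption is wrong; therefore L is not a regular language.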

Related

Operating on a Regular Expression before applying Pumping Lemma

When applying the pumping lemma to prove that a language is not regular, is it allowed to start by operating on the language to make applying the pumping lemma easier?
For example, when attempting to show that the following language is not regular:
L = {0^i 1^j | j <= i and i, j > 0}
Can I first reverse the language like so:
L^R = {1^j 0^i | j <= i and i, j > 0}
and reason that "if the language were regular, reversing it (or performing any other operation under which regular languages are closed) wouldn't make it non-regular," then continue to pump so that j must be greater than i?

What is this shift used in the simplified Galil-Seiferas string matching algorithm?

I'm self-studying problem 32-1 in CLRS; part c) presents the following algorithm for string matching:
REPETITION-MATCHER(P, T)
    m = P.length
    n = T.length
    k = 1 + ρ'(P)
    q = 0
    s = 0
    while s <= n-m
        if T[s+q+1] == P[q+1]
            q = q+1
            if q == m
                print "Pattern occurs with shift" s
        if q == m or T[s+q+1] != P[q+1]
            s = s + max(1, ceil(q/k))
            q = 0
Here, ρ'(P), which is a function of P only, is defined as the largest integer r such that some prefix P[1..i] = y^r, i.e. a substring y repeated r times.
This algorithm appears to be 95 percent similar to the naive brute-force string matcher. However, the one part which greatly confuses me, and which seems to be the centerpiece of the entire algorithm, is the second to last line. Here, q is the number of characters of P matched so far. What is the rationale behind ceil(q/k)? It is completely opaque to me. It would have made more sense if that line were something like s = s + max(1+q, 1+i), where i is the length of the prefix that gives rise to ρ'(P).
CLRS claims that this algorithm is due to Galil and Seiferas, but in the reference they provide, I cannot find anything that resembles the algorithm provided above. It appears that reference contains, if anything, a much more advanced version of what is here. Can someone explain this ceil(q/k) value, and/or point me toward a reference that describes this particular algorithm, instead of the more well-known main Galil Seiferas paper?
Example #1:
Match aaaa in aaaaab; here ρ' = 4. Consider the state:
aaaa ab
     ^
The whole pattern has been matched here (q = m = 4), and we want to move forward by one symbol, no more, because the full pattern will match again at the next shift (the last line sets q to zero). q = 4 and k = 5, so ceil(q/k) = 1, and that's all right.
Example #2: Match abcd.abcd.abcd.X in abcd.abcd.abcd.abcd.X. Consider the state:
abcd.abcd.abcd. abcd.X
                ^
We have a mismatch here, and we would like to move forward by five symbols. q = 15 and k = 4, so ceil(q/k) = 4. That's OK: it is almost 5, and we can still match our pattern. Had we a bigger ρ', say 10, we would have ceil(50/(10+1)) = 5.
Yes, the algorithm skips forward fewer symbols than KMP does; in the case ρ' = 10 its running time is O(10n+m), while KMP has O(n+m).
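To experiment with the two examples, here is a minimal Python sketch of the REPETITION-MATCHER pseudocode from the question. The function names `rho_prime` and `repetition_matcher` are illustrative, the indexing is 0-based instead of the 1-based indexing of CLRS, and ρ'(P) is computed by a naive check of every prefix, which is an assumption made for brevity and not the efficient method the exercise asks for.

```python
def rho_prime(p):
    """Largest r such that some prefix of p equals y repeated r times (naive)."""
    best = 1
    for i in range(1, len(p) + 1):
        prefix = p[:i]
        for d in range(1, i + 1):          # d = candidate length of y
            if i % d == 0 and prefix[:d] * (i // d) == prefix:
                best = max(best, i // d)   # smallest d gives the largest r
                break
    return best


def repetition_matcher(p, t):
    """Return all shifts s with t[s:s+len(p)] == p, following the pseudocode."""
    if not p:
        raise ValueError("pattern must be non-empty")
    m, n = len(p), len(t)
    k = 1 + rho_prime(p)
    q = s = 0
    shifts = []
    while s <= n - m:
        if t[s + q] == p[q]:
            q += 1
            if q == m:
                shifts.append(s)
        if q == m or t[s + q] != p[q]:
            s += max(1, -(-q // k))        # -(-q // k) is ceil(q / k) on integers
            q = 0
    return shifts


print(repetition_matcher("aaaa", "aaaaab"))
print(repetition_matcher("abcd.abcd.abcd.X", "abcd.abcd.abcd.abcd.X"))
```

In the second call the mismatch after 15 matched characters moves s by ceil(15/4) = 4, and one further single-step shift reaches the occurrence at shift 5.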
I figured out the proof of correctness.
let k = ρ'(P) + 1, and ρ'(P) is the largest possible repetition factor out of all the prefixes of P.
Suppose T[s+1..s+q] = P[1..q], and either q=m or T[s+q+1] != P[q+1]
Then, for 1 <= j <= floor(q/k) (except for the case q=m and m mod k = 0, in which the upper limit must be ceil(m/k)), we have
T[s+1..s+j] = P[1..j]
T[s+j+1..s+2j] = P[j+1..2j]
T[s+2j+1..s+3j] = P[2j+1..3j]
...
T[s+(k-1)j+1..s+kj] = P[(k-1)j+1..kj]
where the k blocks P[1..j], P[j+1..2j], ..., P[(k-1)j+1..kj] cannot all be equal to one another, since that would make k a repetition factor, while the largest possible repetition factor of any prefix of P is k-1.
Suppose we now make a comparison at shift s' = s+j, so that we will make the following comparisons
T[s+j+1..s+2j] with P[1..j]
T[s+2j+1..s+3j] with P[j+1..2j]
T[s+3j+1..s+4j] with P[2j+1..3j]
...
T[s+kj+1..s+(k+1)j] with P[(k-1)j+1..kj]
We claim that not every comparison can match, i.e. at least one of the above "with"s must be replaced with !=. We prove this by contradiction. Suppose every "with" above is replaced by =. Then, comparing to the first set of equalities, we would immediately have the following:
P[1..j] = P[j+1..2j]
P[j+1..2j] = P[2j+1..3j]
...
P[(k-2)j+1..(k-1)j] = P[(k-1)j+1..kj]
However, this cannot be true, because k is not a repetition factor, hence a contradiction.
Hence, for any 1 <= j <= floor(q/k), testing a new shift s'=s+j is guaranteed to mismatch.
Hence, the smallest shift that can possibly result in a match is s + floor(q/k) + 1, and floor(q/k) + 1 >= ceil(q/k).
Note the code uses ceil(q/k) for simplicity, solely to deal with the case that q = m and m mod k = 0, in which case k * (floor(q/k)+1) would be greater than m, so only ceil(q/k) would do. However, when q mod k = 0 and q < m, then ceil(q/k) = floor(q/k), so is slightly suboptimal, since that shift is guaranteed to fail, and floor(q/k)+1 is the first shift that has any chance of matching.
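The key step of the proof can be checked by brute force. A match at shift s+j after q matched characters would force P[1..q-j] = P[j+1..q], and the proof says this is impossible whenever 1 <= j <= floor(q/k). The sketch below tests exactly that on all short binary patterns; the function names and the limit of 8 characters are assumptions chosen for illustration.

```python
from itertools import product


def rho_prime(p):
    """Largest r such that some prefix of p equals y repeated r times (naive)."""
    best = 1
    for i in range(1, len(p) + 1):
        prefix = p[:i]
        for d in range(1, i + 1):
            if i % d == 0 and prefix[:d] * (i // d) == prefix:
                best = max(best, i // d)
                break
    return best


def shift_bound_holds(p):
    """True if no shift j <= floor(q/k) is compatible with q matched characters."""
    k = 1 + rho_prime(p)
    for q in range(1, len(p) + 1):
        for j in range(1, q // k + 1):
            # Overlap condition forced by a match at shift s + j.
            if p[:q - j] == p[j:q]:
                return False
    return True


def check_all(alphabet="ab", max_len=8):
    """Test the bound on every pattern over `alphabet` up to `max_len` characters."""
    return all(
        shift_bound_holds("".join(chars))
        for n in range(1, max_len + 1)
        for chars in product(alphabet, repeat=n)
    )


print(check_all())
```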

Why pumping lemma for CFG doesn't work

Language:
{(a^i)(b^j)(c^k)(d^l) : i = 0 or j = k = l}
We take the word
w = a^0 b^n c^n d^n
which obviously belongs to the language because j = k = l.
w = uvxyz
|vxy| <= n
|vy| >= 1
and now v and y can contain:
just a single kind of character, and if we pump a single character the word is no longer in the language
two kinds of characters, in which case the count of the third will be lower, so the word is not in the language
So the proof that this language is not context-free is not supposed to be doable with the standard pumping lemma, only with Ogden's lemma, but I don't understand why the proof above is invalid.
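It doesn't work because the lemma only promises that some split exists, and for a split in which v and y each lie inside a single block of letters (for example v = b and y empty), every pumped string is in fact in the language, because you still have no a's (that is, i = 0).
And if you choose a string where i > 0, then you can't rule out that v is just some number of a's and y is the empty string.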
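Both cases can be tried out with a short Python sketch. The value n = 4 and the two particular splits are assumptions picked for illustration; `in_language` and `pump` are hypothetical helper names. In each case pumping keeps the word inside the language, so no contradiction arises.

```python
import re

# Words of the form a^i b^j c^k d^l.
FORM = re.compile(r"(a*)(b*)(c*)(d*)")


def in_language(w):
    """Membership test for {a^i b^j c^k d^l : i = 0 or j = k = l}."""
    match = FORM.fullmatch(w)
    if match is None:
        return False
    i, j, k, l = (len(group) for group in match.groups())
    return i == 0 or j == k == l


def pump(u, v, x, y, z, i):
    """Return u v^i x y^i z."""
    return u + v * i + x + y * i + z


n = 4                                   # assumed sample pumping length
w = "b" * n + "c" * n + "d" * n         # the word from the question (i = 0)

# Case 1: no a's. Pump a single b; the counts differ but i = 0 still holds.
split_no_a = ("", "b", "", "", w[1:])

# Case 2: one a in front. Pump only the a; j = k = l is untouched.
split_with_a = ("", "a", "", "", w)

print(all(in_language(pump(*split_no_a, i)) for i in range(6)))
print(all(in_language(pump(*split_with_a, i)) for i in range(6)))
```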

Build string of length L with exactly N palindromes in it

Given a string length L and a number of palindromes N, build a string of length L with exactly N palindromic substrings in it. For example,
L = 4
N = 2
S = 'aabb' or 'abba'
L = 4
N = 3,4,5
S = impossible
L = 4
N = 6
S = 'aaaa' (palindromes are substrings S[0:2], S[2:4], S[1:3], S[0:3], S[1:4], S[0:4])
UPDATE: all target palindromes should be of length > 1
You can introduce a variable s_i ∈ {0, 1} for each character in the string S; then a substring S[a..b] is a palindrome exactly when
(s_a = s_b) and (s_{a+1} = s_{b-1}) and ...
So for each substring you have a clause, and exactly N of them must be satisfied. This reduces the problem to satisfiability.
I would also be curious whether you can solve it as an optimization problem:
for each substring, introduce a variable x_i ∈ {0, 1} that stands for the fact that substring number i is a palindrome (let it be S[a..b]). Then introduce a variable y_i for the clause of that substring:
y_i = (s_a-s_b)^2 + (s_{a+1}-s_{b-1})^2 + ...
Then you need to satisfy \sum_i x_i = N and minimize \sum_i x_i * y_i. Obviously the minimum is 0 if a solution exists, and the objective is always non-negative.
Edit
The optimisation idea seems to be false, since you also need to enforce that if y_i = 0, then x_i = 1 (otherwise extra palindromes go uncounted), but the satisfiability formulation should work.
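Whatever formulation is used, candidate strings are easy to verify by brute force. The following Python sketch counts palindromic substrings of length greater than 1, counting each position separately as in the 'aaaa' example from the question. The function name and the `min_len` parameter are illustrative assumptions.

```python
def count_palindromic_substrings(s, min_len=2):
    """Count substrings s[a:b] with b - a >= min_len that read the same reversed."""
    count = 0
    for a in range(len(s)):
        for b in range(a + min_len, len(s) + 1):
            sub = s[a:b]
            if sub == sub[::-1]:
                count += 1
    return count


for candidate in ("aabb", "abba", "aaaa"):
    print(candidate, count_palindromic_substrings(candidate))
```

The comparison `sub == sub[::-1]` is the same condition as the clause (s_a = s_b) and (s_{a+1} = s_{b-1}) and ..., evaluated on a concrete string.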

Show that the following set over {a,b} is regular

Given the alphabet {a, b} we define Na(w) as the number of occurrences of a in the word w and similarly for Nb(w). Show that the following set over {a, b} is regular.
A = {xy | Na(x) = Nb(y)}
I'm having a hard time figuring out where to start solving this problem. Any information would be greatly appreciated.
Yes, it is a regular language!
Any string consisting of a's and b's belongs to the language A = {xy | Na(x) = Nb(y)}.
Example:
Suppose the string is w = aaaab. We can divide this string into a prefix x and a suffix y:
w = a aaab
    - ----
    x  y
The number of a's in x is one, and the number of b's in y is also one.
Similarly, a string like abaabaa can be broken as x = ab (Na(x) = 1) and y = aabaa (Nb(y) = 1).
Or w = bbbabbba as x = bbbabb (Na(x) = 1) and y = ba (Nb(y) = 1).
Or w = baabaab as x = baa and y = baab, with Na(x) = 2 and Nb(y) = 2.
So you can always break a string consisting of a's and b's into a prefix x and a suffix y such that Na(x) = Nb(y).
Formal proof:
Note: strings consisting only of a's or only of b's also belong to the language. For w = a^n take x = ε and y = w, so Na(x) = 0 = Nb(y); for w = b^n take x = w and y = ε.
In general, split w after its first i symbols and let f(i) = Na(x) - Nb(y). Then f(0) = -Nb(w) <= 0 and f(|w|) = Na(w) >= 0, and moving the split one symbol to the right increases f by exactly 1 (an a adds one to Na(x), a b removes one from Nb(y)). So f(i) = 0 for some i, and w is in A.
Let's define a new language CA, the complement of A: the strings that have no split xy with Na(x) = Nb(y). By the argument above, CA is empty.
The empty language is regular, and the complement of any regular language is regular; hence the language A is also a regular language!
To construct a DFA for a complement language, refer to: Finding the complement of a DFA? To write a regular expression for a DFA, refer to the following techniques.
How to write regular expression for a DFA
How to write regular expression for a DFA using Arden theorem
'+' Operator in Regular Expression in formal languages
PS: By the way, a regular expression for A = {xy | Na(x) = Nb(y)} is (a + b)*.
First, find out how to prove that a set is regular.
One way is to define a finite state machine that accepts the language.
Second: maybe think about why the set is not regular.
Hint: A = {a, b}*.
Try proving it by induction on length of word, or by finding the shortest word not in A.
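The hint A = {a, b}* can be checked by brute force before attempting the induction. This Python sketch searches every word up to a small length for a split xy with Na(x) = Nb(y); the function names and the length limit of 10 are assumptions for illustration, and a finite check is of course not a proof.

```python
from itertools import product


def find_split(w):
    """Return the first (x, y) with w = xy and Na(x) == Nb(y), or None."""
    for i in range(len(w) + 1):
        x, y = w[:i], w[i:]
        if x.count("a") == y.count("b"):
            return x, y
    return None


def all_words_split(max_len):
    """True if every word over {a, b} up to max_len has a valid split."""
    return all(
        find_split("".join(chars)) is not None
        for n in range(max_len + 1)
        for chars in product("ab", repeat=n)
    )


print(find_split("aaaab"))
print(find_split("aa"), find_split("bbb"))
print(all_words_split(10))
```

The splits found for aa and bbb show how the words made of a single letter are handled: one of x or y is simply empty.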
