Regular expression in Automata Theory?

I have the following language and its regular expression
{w ∈ {a, b}* : w has bab as a prefix, and babaa as a suffix}
Answer:
Regular expression = bab(a ∪ b)*babaa ∪ babaa ∪ bababaa
Why is the trailing part, ∪ babaa ∪ bababaa, needed?

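The first term, bab(a ∪ b)*babaa, only produces strings in which the prefix bab and the suffix babaa do not overlap, so every string it matches has length at least 8. The two extra terms cover the cases where prefix and suffix do overlap:
bab is a prefix of babaa, and babaa is trivially a suffix of itself. Therefore, babaa is in the language.
bab is a prefix of bababaa and babaa is a suffix of bababaa (they share the middle b). Thus, it must also be included.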
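
If you want to convince yourself, a brute-force check is easy to write. The sketch below is only an illustration: it translates the expression into Python's `re` syntax (an assumption about notation, with ∪ written as `|`) and compares it against the prefix/suffix definition on every string over {a, b} up to a hypothetical length bound of 10.

```python
import re
from itertools import product

# Full expression and the first term alone, in Python regex syntax.
FULL = re.compile(r"bab[ab]*babaa|babaa|bababaa")
FIRST_TERM_ONLY = re.compile(r"bab[ab]*babaa")

def in_language(w):
    """Direct definition: bab is a prefix and babaa is a suffix."""
    return w.startswith("bab") and w.endswith("babaa")

def all_strings(max_len):
    """Every string over {a, b} of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for chars in product("ab", repeat=n):
            yield "".join(chars)

def mismatches(pattern, max_len=10):
    """Strings on which the pattern and the definition disagree."""
    return [w for w in all_strings(max_len)
            if bool(pattern.fullmatch(w)) != in_language(w)]

print(mismatches(FULL))             # []
print(mismatches(FIRST_TERM_ONLY))  # ['babaa', 'bababaa']
```

Dropping the two extra terms loses exactly the two overlapping strings, which is why they have to be listed separately.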

If L and L complement are Recursively enumerable then why can't L be a Regular language?

The following question was asked in the GATE 2008 paper:
If L and L' (L complement) are Recursively enumerable then L is ?
a) Regular
b) CFL
c) CSL
d) Recursive
The correct option was (d), and I accept that it is true. But my question is: why can't it be regular or a CSL?
I think that if L is regular, then L' is also regular (regular languages are closed under complementation), and by the Chomsky hierarchy L' is then also recursively enumerable. Since a regular L satisfies the statement of the question, why is option (a) not correct? The same goes for CSLs, so why is option (c) not correct either?
A quick review of language classes: these five classes form a chain of strict inclusions:
regular ⊂ CFL ⊂ CSL ⊂ recursive ⊂ recursively enumerable
The question is asking: if we know a language L is recursively enumerable AND we know its complement L' is also recursively enumerable, what is the smallest class that L is guaranteed to be in?
The answer is equivalent to saying that if a language L is recursively enumerable and NOT recursive, then L' is NOT recursively enumerable. That statement is true, but the analogous statement for any of the other language classes is not.
If L and L' are both recursively enumerable, then
a) L may be regular (indeed, if L is regular, then L' is regular as well, and all regular languages are recursively enumerable)... but there are non-regular languages whose complements are recursively enumerable
b) L may be a CFL (there are CFLs whose complements are also CFLs, as well as CFLs whose complements are not CFLs)... but there are non-context-free languages whose complements are recursively enumerable
c) L may be a CSL (there are CSLs whose complements are CSLs) ... but there are non-context-sensitive languages whose complements are recursively enumerable
d) L must be recursive because, by virtue of both L and L' being recursively enumerable, we have an effectively computable procedure for deciding whether or not any given string is in L: begin enumerating strings in each language, interleaving the enumerations (so you give the next string in L, then the next string in L', then back to L, etc.). Continuing this process will eventually find the target string either in L or L', at which point you can return true (if it was enumerated in L) or false (if enumerated in L').
Therefore, while it's true that L could be regular, CFL or CSL, it is also true that it might not be any of those; but it must absolutely be recursive. Therefore, that is the "best" answer and the only one that is generally correct in all cases.
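
The interleaving argument in (d) can be sketched in a few lines. This is only an illustration under stated assumptions: Python generators stand in for the two enumerators, and the names `decide`, `even_as` and `odd_as` are made up for the example (here L is the set of even-length strings over {a}, which is of course far simpler than a general recursively enumerable language).

```python
from itertools import count

_END = object()  # sentinel for an enumeration that has run out of strings

def decide(target, enum_L, enum_co_L):
    """Decide membership of target in L by alternating between an
    enumeration of L and an enumeration of its complement L'.
    Terminates because target appears in exactly one of the two."""
    gen_in, gen_out = enum_L(), enum_co_L()
    while True:
        if next(gen_in, _END) == target:
            return True   # enumerated in L
        if next(gen_out, _END) == target:
            return False  # enumerated in L'

def even_as():
    """Hypothetical enumerator for L: a^0, a^2, a^4, ..."""
    for n in count(0, 2):
        yield "a" * n

def odd_as():
    """Hypothetical enumerator for L': a^1, a^3, a^5, ..."""
    for n in count(1, 2):
        yield "a" * n

print(decide("aaaa", even_as, odd_as))  # True
print(decide("aaa", even_as, odd_as))   # False
```

With only one of the two enumerators you would have no way to stop on strings outside L, which is exactly the gap between recursively enumerable and recursive.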

Using the pumping lemma to prove irregularity in a regular language - where is the error

I have a fundamental misunderstanding of the pumping lemma. In the following example I apply it to a regular language and reach an incorrect conclusion. What am I doing wrong?
L = a*b*. Assume the language is regular, so by the pumping lemma there exists some n, and a decomposition σ = αβγ such that σ' = αβ^kγ ∈ L for all non-negative k.
σ = aaabbb
α = aa
β = ab
γ = bb
then σ'= αβ^2γ for k=2, σ' =aaababbb
σ'∉ L, a contradiction, thus L is not regular.
I know L as described to be a regular language, so I would expect to find σ' ∈ L. The problem comes from my choice of β spanning two different characters, but I can find nothing in the pumping lemma that forbids this.
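
One thing that may help here: the usual statement of the lemma only promises that some decomposition with |αβ| ≤ n and |β| ≥ 1 can be pumped. It does not say that every decomposition you pick can be. The sketch below is a rough illustration of that difference for σ = aaabbb. The pumping length n = 3, the bound `max_k`, and the function names are assumptions made up for the example, and checking k only up to a bound is a sanity check, not a proof.

```python
import re

IN_L = re.compile(r"a*b*")  # membership test for L = a*b*

def pumps(alpha, beta, gamma, max_k=4):
    """True if alpha + beta^k + gamma stays in L for k = 0..max_k."""
    return all(IN_L.fullmatch(alpha + beta * k + gamma)
               for k in range(max_k + 1))

def pumpable_splits(sigma, n):
    """All splits sigma = alpha beta gamma with |alpha beta| <= n and
    |beta| >= 1 that survive pumping (up to the bound in pumps)."""
    result = []
    for i in range(n + 1):
        for j in range(i + 1, n + 1):
            alpha, beta, gamma = sigma[:i], sigma[i:j], sigma[j:]
            if pumps(alpha, beta, gamma):
                result.append((alpha, beta, gamma))
    return result

print(pumps("aa", "ab", "bb"))            # False: the split from the question
print(len(pumpable_splits("aaabbb", 3)))  # 6: every split with beta inside "aaa"
```

The split α = aa, β = ab, γ = bb fails, but that contradicts nothing, because splits such as α = aa, β = a, γ = bbb pump without leaving L.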

Is it possible for a subset of a non-context free language to be context-free?

For example, if I have a non-context-free language B, is there a context-free language A such that A is a subset of B? I have been trying to think of examples but have not found any valid ones.
I thought I had it when I took A = {w | w is of even length and w ∈ {a, b, c}*}, which is context-free, and B = {ww | w ∈ {a, b, c}*}, which is not context-free. However, I realized that there are some strings in A that are not in B, and therefore A is not a subset of B.
Does anyone know of any examples that would be valid in my situation?
Any finite set of strings is a context-free language. (Indeed, it is a regular language.) So any finite subset of a language is context-free, regardless of what the language is.
Another trivial case is the language L = L1 ∪ L2 where the alphabet of L1 is Σ1 and the alphabet of L2 is Σ2 and Σ1 ∩ Σ2 = ∅. Now L is context-free only if both L1 and L2 are context-free. (This is not the case if the alphabets are not disjoint.) So if exactly one of L1 and L2 is context-free, then it is a subset of the non-context-free language L.
If neither of those is interesting enough for you, then the language (aa)* (where a is a symbol) is a subset of { ww | w ∈ {a, b}* }: take w = a^n. Another context-free subset of the same classic non-context-free language is { ww | w ∈ a*b } = { a^n b a^n b | n ≥ 0 }, which is generated by the grammar S → Tb, T → aTa | b.
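
As a quick sanity check, here is a small sketch that tests two candidate context-free subsets of { ww | w ∈ {a, b}* }: the even-length strings of a's, and the strings a^n b a^n b. Both languages, the grammar S → Tb, T → aTa | b, and all function names are choices made for this illustration. The check runs only up to a small bound, so it is evidence rather than a proof.

```python
def is_square(s):
    """True if s has the form ww for some string w."""
    half, rem = divmod(len(s), 2)
    return rem == 0 and s[:half] == s[half:]

def derive_T(n):
    """String obtained by applying T -> aTa n times, then T -> b."""
    return "a" * n + "b" + "a" * n

def derive_S(n):
    """String obtained from S -> Tb: a^n b a^n b."""
    return derive_T(n) + "b"

even_as = ["a" * (2 * n) for n in range(6)]  # (aa)*, up to length 10
anbanb = [derive_S(n) for n in range(6)]     # a^n b a^n b for n = 0..5

print(all(is_square(s) for s in even_as))  # True
print(all(is_square(s) for s in anbanb))   # True
print(is_square("a"))                      # False: odd-length strings are never squares
```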

If pref(L) is regular, does that imply L is also regular?

I have this exercise for homework:
Say we have a language L. We know that the language pref(L) (all the prefixes of words in L, including the words of L themselves) is a regular language. Does this imply that the language L is regular as well?
I took the NFA of pref(L) and split it (via two epsilon transitions from q0) into two separate NFAs, one defining L and the other defining pref(L)\L.
What I ended up with is an NFA for L, which would mean L is regular.
I am not sure whether this is the right approach or whether it is even legal. I'd be glad for another lead.
Thanks in advance,
Yaron.
It is not necessarily the case that if pref(L) is regular, then L is regular as well.
As an example, let Σ = {a} be a unary alphabet. I'm going to claim that if L is any infinite language over Σ, then pref(L) = Σ*. To see this, first note that pref(L) ⊆ Σ* because pref(L) is a language over Σ. Now, consider any string in Σ*, which must have the form a^n. If L is an infinite language over Σ, it must contain at least one string of the form a^m where m ≥ n. Then a^n would be a prefix of a^m, so a^n ∈ pref(L). This shows that Σ* ⊆ pref(L) and that pref(L) ⊆ Σ*, so in this case Σ* = pref(L).
Now, all we need to do is find an infinite nonregular language over Σ = {a}. As an example, take the language L = { a^(2^n) | n ∈ N } of all strings whose length is a power of two. It's possible to prove using either the Myhill-Nerode theorem or the pumping lemma that this language is not regular. However, L is infinite, so by the above result pref(L) = Σ*, which is regular.
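
To make the argument concrete, here is a small sketch on a finite truncation of the power-of-two language. The bound `max_exp = 5` and the helper names are assumptions for illustration only. The point is that the prefixes fill in every length up to the longest string, even though the language itself skips most lengths.

```python
def prefixes(language):
    """All prefixes (including the empty string) of all words in language."""
    return {w[:i] for w in language for i in range(len(w) + 1)}

def powers_of_two_language(max_exp):
    """Finite truncation of { a^(2^n) }: lengths 1, 2, 4, ..., 2^max_exp."""
    return {"a" * (2 ** n) for n in range(max_exp + 1)}

L_finite = powers_of_two_language(5)  # longest word has length 32
pref = prefixes(L_finite)

print("aaa" in L_finite)                         # False: 3 is not a power of two
print("aaa" in pref)                             # True: aaa is a prefix of aaaa
print(pref == {"a" * k for k in range(33)})      # True: every length 0..32 appears
```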
Hope this helps!

If L* is regular, then is L regular?

I've tried to look for the answer, but I'm getting conflicting answers, so I'm not sure. I know the converse is true: if L is regular then L* is regular, since regular languages are closed under Kleene star.
I imagine that if L* is regular then L is regular, because every subset of L* should be regular and L is a subset of L*.
If L* is regular, then L is not necessarily regular. For example, consider any nonregular language L over an alphabet Σ such that Σ ⊆ L. (That is, imagine you have a nonregular language where each individual character in the alphabet is a string in L.) In that case, L* = Σ*, since you can form any string as a concatenation of individual characters of Σ, each of which is in L.
Here's one possible example. Let Σ = {a} and consider the language L = { a^(2^n) | n ∈ N }. This language is not regular, and you can prove it using either the pumping lemma for regular languages or the Myhill-Nerode theorem. However, the language L* is the language a*, which is regular. To see this, notice that since L contains the string a (take n = 0), the language L* contains all strings of the form a^n for any natural number n.
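
Over a unary alphabet a string is determined by its length, so L* can be explored by looking at which lengths are sums of lengths in L. The sketch below does that up to a bound. The function name, the bound of 20, and the truncation of L to lengths 1, 2, 4, 8, 16 are assumptions made for illustration.

```python
def star_lengths(lengths, bound):
    """Lengths <= bound of strings in L*, where L is a unary language
    whose word lengths are given by `lengths`."""
    reachable = {0}   # the empty string is always in L*
    frontier = [0]
    while frontier:
        total = frontier.pop()
        for n in lengths:
            s = total + n
            if s <= bound and s not in reachable:
                reachable.add(s)
                frontier.append(s)
    return reachable

powers = [2 ** n for n in range(5)]  # 1, 2, 4, 8, 16

print(star_lengths(powers, 20) == set(range(21)))            # True: every length is reached
print(star_lengths(powers[1:], 20) == set(range(0, 21, 2)))  # True: without a, only even lengths
```

The second call shows why the string a matters in this example: once it is removed, L* no longer covers all of a*.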
Another option: pick L to be any nonregular language over Σ, then consider the language L ∪ Σ. This is also a nonregular language (if L ∪ Σ were regular, then we could subtract out each character added in via the union, leaving a regular language at each step, to show that L is regular), and it satisfies the above requirements.
Hope this helps!
Take M = {a, b}*, which is regular but has the non-regular subset {a^n b^n | n ≥ 0} (its non-regularity can be proved with the pumping lemma). So it is not the case that every subset of a regular language is regular.
