Why is the most constrained language class for this not Regular but Context-Free? - state-machine

This is the question:
Q: Given the following production rules, which is finite or otherwise the most constrained language in the Chomsky language hierarchy corresponding to the language described by the following production rules?
Production Rules provided
From what I've read, a regular language is one that can be recognized by a finite automaton; it can't be a^nb^n, and it cannot contain strings where we have to count part of the string to produce the rest of it. I'm still quite confused about what it means that we cannot have strings where we have to count part of the string to produce the rest of it (taking this particular question as an example). Could anyone help explain this?
Thanks a bunch.

The grammar is reproduced here:
S := aAbA
A := aAb
A := aba
Right off the bat, from the syntax of this grammar, we can guarantee the language is at most context-free. This is because all productions have a single non-terminal on the left-hand side. Given this, the most restrictive language class of the Chomsky hierarchy to which this language belongs must be either the regular languages or the context-free languages.
We can show this language is not regular using the pumping lemma for regular languages. Assume the language of this grammar were regular. Then there is a pumping length p >= 1 such that any string w in the language of length at least p can be written as w = uvx where |uv| <= p, |v| > 0 and for all n >= 0, u(v^n)x is also a string in the language. Consider now the string a(a^p)aba(b^p)baba. This string is in the language of the grammar because we can derive it as follows:
S => aAbA => a(aAb)bA => aa(aAb)bbA => ... => a(a^p)A(b^p)bA
=> a(a^p)aba(b^p)bA => a(a^p)aba(b^p)baba
This string has length at least p (indeed, its length is 2p + 8). As such, I should be able to write it in the manner described above; however, notice that the first p+1 symbols in this string are exclusively a. No matter how I write w = uvx, because |uv| <= p, uv consists only of a, and so does v. Thus, pumping only changes the number of a in the prefix of the string. But this cannot give me a string in the language for any n other than 1, since:
- the grammar only generates intermediate forms with up to two instances of the non-terminal A
- the non-terminal A can only cause the number of a to be exactly one greater than the number of b
- all other productions in the grammar add the same numbers of a and b
- as a result, all strings produced by this grammar have exactly two more a's than b's
- changing the number of a's in the prefix without changing the number of b's cannot maintain this characteristic of strings generated by the grammar
This is a contradiction. The only assumption we made was that the language is regular. Thus, we conclude the assumption was wrong and that the language cannot be regular.
Because the language is not regular, the most restrictive language class in the Chomsky hierarchy for it is the context-free languages.
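The counting argument can also be checked by brute force for a small p. This is only a sketch and not part of the proof: the helper names `in_language` and `pumping_survivors` are made up for illustration, and the membership test assumes the closed form a a^k aba b^k b a^m aba b^m (k, m >= 0), which is what the three productions generate.

```python
def in_language(s: str) -> bool:
    """Membership test, assuming every string of the grammar has the shape
    a a^k aba b^k b a^m aba b^m for some k, m >= 0."""
    n = len(s)
    for k in range(n):
        for m in range(n):
            if s == "a" + "a" * k + "aba" + "b" * k + "b" + "a" * m + "aba" + "b" * m:
                return True
    return False


def pumping_survivors(p: int) -> list:
    """Try every split w = uvx with |uv| <= p and |v| > 0 of the string used
    in the proof, pump v with n = 0, 2, 3, and collect the pumped strings
    that are still in the language."""
    w = "a" + "a" * p + "aba" + "b" * p + "baba"
    assert in_language(w)
    survivors = []
    for uv_len in range(1, p + 1):
        for v_len in range(1, uv_len + 1):
            u = w[:uv_len - v_len]
            v = w[uv_len - v_len:uv_len]
            x = w[uv_len:]
            for n in (0, 2, 3):
                pumped = u + v * n + x
                if in_language(pumped):
                    survivors.append(pumped)
    return survivors


# No pumped string stays in the language, as the argument predicts.
print(pumping_survivors(4))
```

A finite check like this proves nothing about all p, of course; it only illustrates why every split fails.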

Related

Using string of set length with pumping lemma to prove irregularity

There is this proof that I thought of that I am not quite sure if it's valid or not.
Suppose you had to prove the nonregularity of the following language:
A = { 0^n 1^n 2^n | n>= 0 }
The proof I devised picks a string that belongs to the language, such as 012, and shows that no matter how it is divided, the pumping lemma is not wholly satisfied (I could post the entire proof, but the post is verbose as it is). According to my professor, however, this proof cannot be accepted.
He did not explain why, and I don't see how such a proof would be insufficient to demonstrate that a language is not regular. If a string clearly belonging to an assumed regular language does not satisfy the pumping lemma, the language clearly has strings that are not regular as part of its set of strings, therefore the language is not regular.
I believe the reason my professor rejected this proof is that in the majority of problems the pumping length p cannot be correctly guessed. At the same time I do not see how my proof could be proven wrong with a counterexample.
You cannot choose p (the pumping length) to be a specific number. The pumping lemma only says that, if the language is regular, some p exists; it does not say which one, and it only constrains strings of length at least p. A proof that fixes the string 012 therefore rules out only p ≤ 3 and says nothing about larger values of p.
Suppose that p exists. Let's choose a word that is long enough: w=0^{p}1^{p}2^{p}. According to the pumping lemma there must exist a decomposition w=xyz with |xy|≤p and |y|≥1 such that xy^{i}z is in language A for every i≥0. We do not get to choose this decomposition, so we must rule out all of them. Because |xy|≤p, both x and y lie inside the leading block 0^{p}, so y=0^{k} for some 1≤k≤p. Then xy^{i}z=0^{p+(i-1)k}1^{p}2^{p}, which for any i≠1 has a different number of 0s than 1s and is therefore not in A. No decomposition works, so the language is not regular, and p does not exist.
If the language were regular, a finite state automaton could be constructed for it. But no such automaton exists, because it would need memory to remember the number of 0s in order to later match the same number of 1s and 2s. If n were bounded by a fixed constant, you could construct a (probably large) automaton, but for unbounded n no finite automaton can be constructed.
This language is not even context-free, because there is no push-down automaton that can be constructed for it. It is context-sensitive.
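To see the "for every decomposition" part of the lemma in action, here is a small brute-force sketch. The function names `in_A` and `survives_pumping` are hypothetical, and checking i from 0 to 3 is an arbitrary cut-off that happens to be enough here. For each small p it tries every split of 0^p 1^p 2^p with |xy| ≤ p and confirms that none of them can be pumped.

```python
def in_A(s: str) -> bool:
    """Membership in A = { 0^n 1^n 2^n | n >= 0 }."""
    n = len(s) // 3
    return s == "0" * n + "1" * n + "2" * n


def survives_pumping(w: str, p: int) -> bool:
    """True if SOME split w = xyz with |xy| <= p and |y| >= 1 keeps
    x y^i z in A for i = 0, 1, 2, 3 (a finite stand-in for 'all i')."""
    for xy_len in range(1, min(p, len(w)) + 1):
        for y_len in range(1, xy_len + 1):
            x = w[:xy_len - y_len]
            y = w[xy_len - y_len:xy_len]
            z = w[xy_len:]
            if all(in_A(x + y * i + z) for i in range(4)):
                return True
    return False


for p in range(1, 7):
    w = "0" * p + "1" * p + "2" * p
    print(p, survives_pumping(w, p))  # False for every p tried
```

The important point is that the string depends on p. A fixed string such as 012 can only ever be tested against the values of p it is long enough for.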

Prove regular language and automata

This is a grammar and I want to check if its language is regular or not.
L → ε | aLcLc | LL
For example the result of this grammar is:
acc, accacc ..., aacccc, acaccc, accacc, aaacccccc, ...
I know that it is not a regular language, but how do I prove it? Is building an automaton the right way to prove it? What is the resulting automaton? I don't see a pattern I could use to build the automaton.
Thank you for any help!
First, let me quickly demonstrate that you cannot deduce the language of a grammar is irregular based solely on the grammar's being irregular. To see this, consider the unrestricted grammar:
S -> SSaSS | aS | e
SaS -> aSa
aaS -> SSa
This is clearly not a regular grammar but you should be able to verify it generates the infinite regular language of all strings of a.
That said, how should we proceed? We will need to figure out what language your grammar generates, and then argue that particular language cannot be regular. We notice that the only rule that introduces terminal symbols always introduces twice as many c as it does a. Furthermore, it's not hard to see the language must be infinite. We can use the Myhill-Nerode theorem to show that these observations imply the language must be irregular.
Consider the prefix a^n of a hypothetical string in the language of this grammar. The shortest string which can be appended to the end of this prefix to give us a string generated by this grammar is c^(2n). No shorter string will work, and that string always works. Imagine now that we were looking at a correct deterministic finite automaton for the language of the grammar. Then, whatever state processing the prefix a^n left us in, we'd need the shortest path from there to an accepting state in the automaton to have length 2n. But a DFA must have finitely many states, and n is an arbitrary natural number. Our DFA cannot work for all possible n (it would need to have arbitrarily many states). This is a contradiction, so there can be no correct DFA for the language of the grammar. Since all regular languages have DFAs, that means the language of this grammar cannot be regular.
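The claim about shortest completions can be sanity-checked for small n. The following is a rough sketch under my own assumptions: `in_lang` is a hypothetical brute-force membership test written directly from the rules L → ε | aLcLc | LL, and `shortest_completion_length` simply tries all suffixes over {a, c} in order of length.

```python
from functools import lru_cache
from itertools import product


@lru_cache(maxsize=None)
def in_lang(s: str) -> bool:
    """Brute-force membership for the grammar L -> ε | aLcLc | LL."""
    if s == "":
        return True
    n = len(s)
    # L -> LL, with both parts non-empty (an empty part adds nothing new).
    for i in range(1, n):
        if in_lang(s[:i]) and in_lang(s[i:]):
            return True
    # L -> a L c L c, where i is the position of the middle c.
    if n >= 3 and s[0] == "a" and s[-1] == "c":
        for i in range(1, n - 1):
            if s[i] == "c" and in_lang(s[1:i]) and in_lang(s[i + 1:n - 1]):
                return True
    return False


def shortest_completion_length(n: int) -> int:
    """Length of the shortest t over {a, c} with a^n t in the language."""
    length = 0
    while True:
        for t in product("ac", repeat=length):
            if in_lang("a" * n + "".join(t)):
                return length
        length += 1


for n in range(1, 5):
    print(n, shortest_completion_length(n))  # 2, 4, 6, 8
```

Each prefix a^n has a different shortest completion length, which is exactly why a DFA would need a separate state for every n.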

Will L = {a*b*} be classified as a regular language?

Will L = {a*b*} be classified as a regular language?
I am confused because I know that L = {a^n b^n} is not regular. What difference does the Kleene star make?
Well, it does make a difference whether you have L = {a^n b^n} or L = {a*b*}.
With the a^n b^n language you must have the same number of a's and b's, for example: {aaabbb, ab, aabb, etc}. As you said, this is not a regular language.
But L = {a*b*} is a bit different: here you can have any number of a's followed by any number of b's (including 0). Some examples are:
{a, b, aaab, aabbb, aabbbb, etc}
As you can see, it is different from the {a^n b^n} language, where you need to have the same number of a's and b's.
And yes, a*b* is regular by its nature. If you want a good explanation of why it is regular, you can check How to prove a language is regular; they might have a better explanation than me (:
I hope it helped you
The language described by the regular expression a*b* is regular by definition. These expressions cannot describe any non-regular language and are indeed one of the ways of defining the regular languages.
{a^n b^n: n>0} (this would be a formally complete way of describing it), on the other hand, cannot be described by a regular expression. Intuitively, when reaching the border between a and b you need to remember n. Since it is not bounded, no finite-memory device can do that. In a*b* you only need to remember that from now on only b should appear; this is very finite. The two stars in some sense are not related; each expands its block independently of the other.
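The difference is easy to see side by side. In this small sketch (function names are my own), a*b* is checked with Python's `re.fullmatch`, while a^n b^n needs an explicit count, which is precisely the part a regular expression cannot express.

```python
import re


def in_a_star_b_star(s: str) -> bool:
    """a*b*: any number of a's followed by any number of b's."""
    return re.fullmatch(r"a*b*", s) is not None


def in_an_bn(s: str) -> bool:
    """{ a^n b^n : n > 0 }: the two blocks must have the same length,
    so we have to count."""
    n = len(s) // 2
    return n > 0 and s == "a" * n + "b" * n


for s in ["", "a", "b", "aaab", "aabb", "aba"]:
    print(repr(s), in_a_star_b_star(s), in_an_bn(s))
```

Every string accepted by `in_an_bn` is also accepted by `in_a_star_b_star`, but not the other way round.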

Is This Language Regular or not?

I have the language {4^(w-g)34^(g)|w,g∈NAT} over the alphabet {0,1}.
I need to find out if this language is recognizable, decidable, context free, regular or none of these.
How would I go about doing that or knowing?
Thanks
Consider any string of the form 4^a 3 4^b. Can we find w, g for our a, b? Well, we know that g must equal b, and then we can choose w = a + g. Since a, b and g are natural numbers, so too must be w; the answer is that, yes, for any string of the form 4^a 3 4^b, we have a string in your language.
The language of all strings of the form 4^a 3 4^b is described by the regular expression 4* 3 4* and, as such, your language is regular, context free, decidable and recognizable.
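A quick finite check of this equivalence, as a sketch only. It assumes the first exponent is read as w - g with g ≤ w, which is what choosing w = a + g presupposes; that reading, and the helper names `language_up_to` and `regex_matches_up_to`, are assumptions and not something stated in the question.

```python
import re
from itertools import product


def language_up_to(max_w: int) -> set:
    """All strings 4^(w-g) 3 4^g with 0 <= g <= w <= max_w
    (assumed reading of the set-builder definition)."""
    return {"4" * (w - g) + "3" + "4" * g
            for w in range(max_w + 1)
            for g in range(w + 1)}


def regex_matches_up_to(max_len: int) -> set:
    """All strings over {3, 4} of length <= max_len that match 4*34*."""
    pattern = re.compile(r"4*34*")
    matches = set()
    for n in range(max_len + 1):
        for chars in product("34", repeat=n):
            s = "".join(chars)
            if pattern.fullmatch(s):
                matches.add(s)
    return matches


# A string built from w has length w + 1, hence the off-by-one bound.
print(language_up_to(6) == regex_matches_up_to(7))
```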
Suppose your language weren't regular; how could you tell? You could use the Myhill-Nerode theorem or the Pumping Lemma for regular languages to derive a contradiction from assuming the language were regular.
Suppose your language weren't context-free. You could use the Pumping Lemma for context-free languages to derive a contradiction from assuming the language were context-free.
Of course, if your language weren't decidable or recognizable, you could prove that in various ways as well.

Is a*b* regular?

I know a^n b^n for n > 0 is not regular by the pumping lemma but I would imagine a*b* to be regular since both a,b don't have to be the same length. Is there a proof for it being regular or not?
Answer to your question:
imagine a*b* to be regular, Is there a proof for it being regular or not?
No need to imagine: the expression a*b* is called a regular expression (RE), and regular expressions exist only for regular languages. If a language is not regular, then no regular expression exists for it, and if a language is regular, then we can always represent it by some regular expression.
Yes, a*b* represents a regular language.
Language description: any number of a's followed by any number of b's (by any number I mean zero or more times, so the null string ^ is included). Some example strings are:
{^, a, b, aab, abbb, aabbb, ...}
DFA for RE a*b* will be as follows:
a- b-
|| ||
▼| ▼|
---►((Q0))---b---►((Q1))
In figure: `(())` means final state, so both `{Q0, Q1}` are final states.
You need to understand following basic concept:
What basically is a regular language? And why is an infinite language such as `a*b*` regular, whereas languages like `{ a^n b^n | n > 0 }` are not?
A language (a set of strings) is called regular if only a bounded (finite) amount of information needs to be stored at any instant while processing strings of the language.
So, what is 'bounded' information?
For example: consider a fan's on/off switch. By looking at the switch we can say whether the fan is on or off (this is bounded, or limited, information). But we cannot tell how many times the fan has been switched on or off in the past! (To memorize this, we would need a mechanism that stores an unbounded amount of information in order to count, e.g. the meter used in our cars/bikes.)
The language { a^n b^n | n > 0 } is not a regular language because here n is unbounded (it can be arbitrarily large). To verify strings of the language a^n b^n, we need to memorize how many a symbols there are, and counting them requires unbounded memory because the number of a symbols in a string can be arbitrarily large!
That means an automaton can only process strings of the language a^n b^n if it has unbounded memory, e.g. a PDA with its stack.
Whereas a*b* is of course regular by its nature, because the only restriction is a bounded one: b may come after some a (and a can't come after b). That is why every string of this language can easily be processed (or recognized) by an automaton with finite memory, and finite automata are the class of automata where memory is finite. Yes, in finite automata we have a finite amount of memory in the form of states.
(Memory in a finite automaton is present in the form of its states Q, and by definition a finite automaton can have only finitely many states. Hence finite automata have finite memory, which is why the class of automata for regular languages is called finite automata. You can think of a finite automaton as a CPU without memory that has only finite registers to remember its internal state.)
Finite states ⇒ finite memory ⇒ only languages for which a finite amount of information needs to be stored at any instant while processing a string can be processed ⇒ such a language is called a regular language.
The absence of external memory is the limitation of finite automata; or, we can say, this limitation of finite automata defines the class of languages called regular languages.
You should read the other answer "finiteness of regular language" to learn the scope of regular languages.
Side note:
The language { a^n b^n | n > 0 } is a subset of a*b*.
Also, the language { a^n b^n | 10^100 > n > 0 } is regular: a large set, but regular because n is bounded, hence a finite automaton and a regular expression are possible for this language.
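To make the bounded case concrete, here is a minimal sketch that writes the language out as a finite alternation. The helper name `bounded_anbn_regex` is hypothetical, and a bound of 5 stands in for 10^100 so that the pattern stays readable.

```python
import re


def bounded_anbn_regex(bound: int):
    """Regular expression for { a^n b^n | bound > n > 0 }:
    simply list every string of the (finite) language."""
    alternatives = ["a" * n + "b" * n for n in range(1, bound)]
    return re.compile("|".join(alternatives))


pattern = bounded_anbn_regex(5)
print(pattern.pattern)  # ab|aabb|aaabbb|aaaabbbb

for s in ["ab", "aabb", "aab", "aaaaabbbbb"]:
    print(s, pattern.fullmatch(s) is not None)
```

The pattern grows with the bound, but for any fixed bound it is finite, which is all that regularity requires.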
You should also read: How to prove a language is regular?
The proof is: ((a*)(b*)) is a well-formed regular expression, hence it matches a regular language. a*b* is a syntactic shortening of the same expression.
Another proof: regular languages are closed under concatenation. a* is a regular language and b* is a regular language; therefore their concatenation, a*b*, is also a regular language.
You can build an automaton for it:
0 ->(a) 1
0 ->(b) 2
1 ->(a) 1
1 ->(b) 2
2 ->(b) 2
2 ->(a) 3
3 ->(a,b) 3
where only 3 is not an accepting state, and prove that the language is a*b*.
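As a quick check rather than a proof, the transition table above can be simulated and compared against the regular expression on all short strings. This sketch assumes state 0 is the start state, which the table suggests but does not state; the names `DELTA`, `ACCEPTING` and `accepts` are my own.

```python
import re
from itertools import product

# Transition table copied from the automaton above; state 3 is the dead state.
DELTA = {
    (0, "a"): 1, (0, "b"): 2,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 3, (2, "b"): 2,
    (3, "a"): 3, (3, "b"): 3,
}
ACCEPTING = {0, 1, 2}


def accepts(s: str) -> bool:
    """Run the DFA from state 0 (assumed start state) over s."""
    state = 0
    for ch in s:
        state = DELTA[(state, ch)]
    return state in ACCEPTING


# Compare with the regular expression on every string over {a, b} up to length 8.
agree = all(
    accepts("".join(t)) == (re.fullmatch(r"a*b*", "".join(t)) is not None)
    for n in range(9)
    for t in product("ab", repeat=n)
)
print(agree)
```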
To prove that a language is regular, it is sufficient to show either:
1) There exists some DFA that recognizes it. In this case, the DFA is trivial.
2) The language can be expressed as a regular expression, as mentioned in another answer. a*b* is a regular expression to recognize this language.
A regular language is a language that can be expressed with a regular expression or, equivalently, recognized by a deterministic or non-deterministic finite automaton (state machine).
A language is a set of strings made up of characters from a specified alphabet, or set of symbols. Every regular language is therefore a subset of the set of all strings over its alphabet.
A closure property is a statement that a certain operation on languages, when applied to languages in a class (e.g., the regular languages), produces a result that is also in that class.
This RE describes the language of strings in which all a's, if any, come before all b's,
that is, the language of strings that do not contain the substring ba.
A regular language is not necessarily a subset of a given context-free language. For example, a*b* is regular, comprising all the strings made of a block of a's followed by a block of b's. This is not a subset of {a^n b^n} but a superset of it.

Resources