How do you determine if a language is regular, context free but not regular, or not context free? - programming-languages

I have a homework problem that requires you to prove if a language is one of the three:
A Regular Language
Context-Free but Not Regular
Not Context-Free
How would you prove each one? I know Pumping Lemma can verify if a language is Not Regular or Not Context-Free, but that’s it.
The example to help me understand better is the following:
{ a^(2n+1)b^(3n+2) | n ∈ N }, alphabet { a, b } where N is all natural numbers.

The pumping lemma for regular languages can tell you that a language is not regular; however, it cannot tell you that a language is regular. To tell that a language is regular, you must do the equivalent of producing a finite automaton, regular grammar or regular expression and then proving it's correct for your language.
The pumping lemma for context-free languages works the same way: it can tell you that a language is not context-free, but it cannot tell you that a language is context-free, because some languages that are not context-free still satisfy it. To show that a language is context-free, find a pushdown automaton or context-free grammar for it and prove that it is correct for your language.
In your case, we can first choose the string a^(2p+1) b^(3p+2) to show that the language is not regular by the pumping lemma for regular languages: since |xy| ≤ p, the pumped part y consists only of a's, so pumping changes the number of a's without changing the number of b's. To see why the language is context-free, notice that for any string of the form a^(2k+1) b^(3k+2) with k ≥ 1, we can choose v to contain two a's and y to contain three b's, so that pumping both together maintains the required property. That observation is not a proof on its own, but it leads directly to a CFG based on the same insight:
S -> aaSbbb | abb
Then we should show the grammar is correct, which is left as an exercise.
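As a sanity check (not a proof), the grammar can be compared against the set definition by brute force. This is a minimal sketch; the helper names `derive` and `in_language` are made up for illustration, and it assumes N includes 0, so that the base case abb corresponds to n = 0.

```python
def derive(max_len):
    """Return every terminal string of length <= max_len derivable from S -> aaSbbb | abb."""
    results = set()
    frontier = ["S"]
    while frontier:
        form = frontier.pop()
        if "S" not in form:
            results.add(form)
            continue
        for rhs in ("aaSbbb", "abb"):
            new = form.replace("S", rhs, 1)
            # Only keep forms whose terminal symbols still fit in max_len.
            if len(new) - new.count("S") <= max_len:
                frontier.append(new)
    return results


def in_language(s):
    """Membership test for { a^(2n+1) b^(3n+2) | n >= 0 } (assumes 0 is in N)."""
    na = len(s) - len(s.lstrip("a"))   # number of leading a's
    rest = s[na:]
    if na % 2 == 0 or set(rest) - {"b"}:
        return False
    return len(rest) == 3 * ((na - 1) // 2) + 2


generated = derive(20)
expected = {"a" * (2 * n + 1) + "b" * (3 * n + 2) for n in range(4)}
print(generated == expected)
```

Agreement on short strings only suggests the grammar is right; the actual proof is an induction on the number of times S -> aaSbbb is applied.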

Related

Using string of set length with pumping lemma to prove irregularity

There is this proof that I thought of that I am not quite sure if it's valid or not.
Suppose you had to prove the nonregularity of the following language:
A = { 0^n 1^n 2^n | n>= 0 }
The proof I devised picks a string that belongs to the language, such as 012, and shows that no matter how it is divided, the pumping lemma is not wholly satisfied (I could post the entire proof, but the post is verbose as it is). According to my professor, however, this proof cannot be accepted.
He did not explain why, and I don't see how such a proof would be insufficient to demonstrate that a language is not regular. If a string clearly belonging to an assumed regular language does not satisfy the pumping lemma, the language clearly has strings that are not regular as part of its set of strings, and therefore the language is not regular.
I believe the reason my professor rejected this proof is because in the majority of problems the pumping length P cannot be correctly guessed. At the same time I do not see how my proof could be proven wrong with a counterexample.
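The pumping lemma says that if the language is regular, then some pumping length p exists, and every string in the language of length at least p can be pumped. You do not get to choose p: the proof has to work for every possible value of p. The string 012 has length 3, so showing that it cannot be pumped only rules out p ≤ 3. For any larger p, the lemma makes no claim about 012 at all.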
Suppose that A is regular, so that p exists. Let's choose a word that is long enough: w=0^{p}1^{p}2^{p}. According to the pumping lemma, there must exist a decomposition w=xyz with |xy|≤p and |y|≥1 such that xy^{i}z is in A for every i≥0. We do not get to choose the decomposition, but since |xy|≤p, both x and y lie inside the leading block of 0s, so y=0^{k} for some k≥1. Then xy^{i}z=0^{p+(i-1)k}1^{p}2^{p}, which is not in A for any i≠1. No decomposition works, which contradicts the lemma, so A is not regular.
Intuitively, a finite automaton for this language would need memory to remember the number of 0s so that it can later match the same number of 1s and 2s. If n were bounded by a fixed number, you could construct a (probably large) automaton, but for unbounded n no finite automaton can be constructed.
This language is not even context-free: no pushdown automaton recognizes it, which can be shown with the pumping lemma for context-free languages. It is context-sensitive.
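The pumping argument for 0^p 1^p 2^p can be checked mechanically for small p. This is only an illustrative sketch (the names `in_A` and `pumping_fails` are invented here); it tries every split w = xyz allowed by the lemma and tests i = 0 and i = 2.

```python
def in_A(s):
    """Membership test for A = { 0^n 1^n 2^n | n >= 0 }."""
    n = len(s) // 3
    return s == "0" * n + "1" * n + "2" * n


def pumping_fails(w, p):
    """True if every split w = xyz with |xy| <= p and |y| >= 1
    leaves A when pumped with i = 0 or i = 2."""
    for xy_len in range(1, min(p, len(w)) + 1):
        for x_len in range(xy_len):
            x, y, z = w[:x_len], w[x_len:xy_len], w[xy_len:]
            if all(in_A(x + y * i + z) for i in (0, 2)):
                return False  # this split survives both pumps
    return True


for p in range(1, 7):
    w = "0" * p + "1" * p + "2" * p
    print(p, pumping_fails(w, p))
```

Running the same check on the fixed string 012 with p = 3 also reports that no split survives, but that only excludes pumping lengths up to 3. The proof has to use a string whose length grows with p, which is why 0^p 1^p 2^p is chosen.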

Prove regular language and automata

This is a grammar and I want to check whether its language is regular or not.
L → ε | aLcLc | LL
For example the result of this grammar is:
acc, accacc ..., aacccc, acaccc, accacc, aaacccccc, ...
I know that it is not a regular language, but how do I prove it? Is building an automaton the right way to prove it? What is the resulting automaton? I don't see a pattern I could use to build the automaton.
Thank you for any help!
First, let me quickly demonstrate that you cannot deduce the language of a grammar is irregular based solely on the grammar's being irregular. To see this, consider the unrestricted grammar:
S -> SSaSS | aS | e
SaS -> aSa
aaS -> SSa
This is clearly not a regular grammar but you should be able to verify it generates the infinite regular language of all strings of a.
That said, how should we proceed? We will need to figure out what language your grammar generates, and then argue that particular language cannot be regular. We notice that the only rule that introduces terminal symbols always introduces twice as many c as it does a. Furthermore, it's not hard to see the language must be infinite. We can use the Myhill-Nerode theorem to show that these observations imply the language must be irregular.
Consider the prefix a^n of a hypothetical string in the language of this grammar. The shortest string which can be appended to the end of this prefix to give us a string generated by this grammar is c^(2n). No shorter string will work, and that string always works. Imagine now that we were looking at a correct deterministic finite automaton for the language of the grammar. Then, whatever state processing the prefix a^n left us in, we'd need the shortest path from there to an accepting state in the automaton to have length 2n. But a DFA must have finitely many states, and n is an arbitrary natural number. Our DFA cannot work for all possible n (it would need to have arbitrarily many states). This is a contradiction, so there can be no correct DFA for the language of the grammar. Since all regular languages have DFAs, that means the language of this grammar cannot be regular.
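The claim that c^(2n) is the shortest string completing the prefix a^n can be checked by brute force for small n. The sketch below is an illustration under my own naming (`derives`, `shortest_completion`): it tests membership by trying the three rules directly, then searches suffixes over {a, c} in order of length.

```python
from functools import lru_cache
from itertools import product


@lru_cache(maxsize=None)
def derives(s):
    """True if s is generated by L -> ε | aLcLc | LL."""
    if s == "":
        return True
    # Rule L -> a L c L c: strip the outer a ... c, then split the middle at some c.
    if len(s) >= 3 and s[0] == "a" and s[-1] == "c":
        mid = s[1:-1]
        for i, ch in enumerate(mid):
            if ch == "c" and derives(mid[:i]) and derives(mid[i + 1:]):
                return True
    # Rule L -> L L: both parts nonempty (an empty part adds no new strings).
    for i in range(1, len(s)):
        if derives(s[:i]) and derives(s[i:]):
            return True
    return False


def shortest_completion(prefix, max_len):
    """Length of the shortest suffix over {a, c} that puts prefix + suffix in the language."""
    for length in range(max_len + 1):
        for tail in product("ac", repeat=length):
            if derives(prefix + "".join(tail)):
                return length
    return None


for n in range(4):
    print(n, shortest_completion("a" * n, 2 * n))
```

Each prefix a^n needs a different shortest completion length, so no two of these prefixes can lead to the same DFA state, which is the Myhill-Nerode argument in miniature.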

How can we distinguish between regular languages and context free languages?

To express regular languages we use regular expressions, and for context-free languages we can use a stack-like memory. I know context-free languages have some characteristic features like center embedding, but I'm still not sure when we can be confident that a given language is context-free. For example, why is natural language not a regular language? Is there any reason other than center embedding?
Automata theory states that a regular language can be processed by a finite-state machine (FSM). However, a language with unbounded "center-embedding" is not regular; if it is a context-free language (CFL), it can be processed by a pushdown automaton (PDA).
Importantly, a PDA is an FSM with an additional memory-like device, a "stack", that keeps track of the embeddings.
Wikipedia says in Languages that are not context-free :-
To prove that a given language is not context-free, one may employ
the pumping lemma for context-free languages or a number of other
methods, such as Ogden's lemma or Parikh's theorem.
Wikipedia says in Deciding whether a language is regular :-
To prove that a language is not regular, one often uses
the Myhill–Nerode theorem or the pumping lemma among other methods.
Why is natural language not a regular language?
Chomsky said in (1957): “English is not a regular language”. As for context-free languages, “I do not know whether or not English is itself literally outside the range of such analyses”.
I would add that the usual argument is not the size of English but its nested dependencies of unbounded depth, which a finite-state machine cannot keep track of.

Is This Language Regular or not?

I have the language {4^(w⋅g)34^(g)|w,g∈NAT} over the alphabet {0,1}.
I need to find out if this language is recognizable, decidable, context free, regular or none of these.
How would I go about doing that or knowing?
Thanks
Consider any string of the form 4^a 3 4^b. Can we find w, g for our a, b? Reading the first exponent as w - g (which is what the choice below assumes), we know that g must equal b, and then we can choose w = a + g. Since a, b and g are natural numbers, so too must be w; the answer is that, yes, for any string of the form 4^a 3 4^b, we have a string in your language.
The language of all strings of the form 4^a 3 4^b is described by the regular expression 4* 3 4* and, as such, your language is regular, context free, decidable and recognizable.
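The regular expression from the answer can be tried out directly. A minimal sketch using Python's `re` module; the helper name `exponents` is made up for illustration.

```python
import re

# The regular expression 4* 3 4*, with groups around the two runs of 4s.
SHAPE = re.compile(r"(4*)3(4*)")


def exponents(s):
    """Return (a, b) if s has the form 4^a 3 4^b, otherwise None."""
    m = SHAPE.fullmatch(s)
    return (len(m.group(1)), len(m.group(2))) if m else None


for s in ("3", "44344", "4334", ""):
    print(repr(s), exponents(s))
```

`fullmatch` is used rather than `match` or `search` so that the whole string has to fit the pattern, not just a prefix or a substring.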
Suppose your language weren't regular; how could you tell? You could use the Myhill-Nerode theorem or the Pumping Lemma for regular languages to derive a contradiction from assuming the language were regular.
Suppose your language weren't context-free. You could use the Pumping Lemma for context-free languages to derive a contradiction from assuming the language were context-free.
Of course, if your language weren't decidable or recognizable, you could prove that in various ways as well.

Proving a Language to be regular

The pumping lemma is used to prove that a language is not regular. But how can a language be proved to be regular? In particular,
Let L be a language. Define half(L) to be
{ x | for some y such that |x| = |y|, xy is in L}.
Prove for each regular L that half(L) is regular.
Is there any trick or general procedure for tackling this kind of question?
If you can correctly describe your language L by an NFA or DFA, then it will be regular.
There is a well-known equivalence between NFAs, DFAs, regular grammars and regular expressions, so a representation of L in any of these formalisms should do.
Provide a regular grammar or a finite automaton that matches the language. For the full list of properties you can prove to show a language is regular, see the first lines of the Wikipedia Article on regular languages.
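For the half(L) exercise specifically, the answers above do not spell out a construction, so here is a sketch of one standard idea, given as an assumption rather than as the intended solution. While a DFA for L reads x, also track the set of states from which an accepting state is reachable in exactly as many steps as symbols read so far. The pair (current state, that set) takes finitely many values, so the combined machine is again a finite automaton. The function name `in_half` and the example DFA are hypothetical.

```python
def in_half(dfa, x):
    """Decide whether x is in half(L) for the language L of the given DFA.

    dfa = (states, alphabet, delta, start, accepting), where delta maps
    (state, symbol) -> state and is total.
    """
    states, alphabet, delta, start, accepting = dfa
    q = start
    # reach: states from which an accepting state is reachable in exactly k steps,
    # where k is the number of symbols of x read so far (k = 0 initially).
    reach = set(accepting)
    for ch in x:
        q = delta[(q, ch)]
        reach = {s for s in states
                 if any(delta[(s, c)] in reach for c in alphabet)}
    # x is in half(L) iff some y with |y| = |x| leads from q to an accepting state.
    return q in reach


# Example DFA for L = { "ab" }: 0 --a--> 1 --b--> 2 (accepting), 3 is a dead state.
STATES = {0, 1, 2, 3}
ALPHABET = {"a", "b"}
DELTA = {(s, c): 3 for s in STATES for c in ALPHABET}
DELTA[(0, "a")] = 1
DELTA[(1, "b")] = 2
EXAMPLE_DFA = (STATES, ALPHABET, DELTA, 0, {2})

print([w for w in ("", "a", "b", "ab") if in_half(EXAMPLE_DFA, w)])
```

The code only simulates the product machine on one input; turning it into a proof means defining the DFA with state set Q × 2^Q explicitly and arguing that it accepts exactly half(L).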
