Develop a context-sensitive grammar that generates the language - programming-languages

Can someone explain how to create a context-sensitive grammar that generates the language
L={i^n j^n k^m l^m | n,m ≥ 1}?
This is what I have so far (I'm not sure that it's right):
S → IJ
I → iIX | iX;
J → jJl | jYl;
Xj → jX;
XY → Yk;
Y→ε.
I would appreciate a step-by-step explanation of how to do it correctly, or any way to check the answer, because I feel completely lost on how to solve these problems even after reading about CFGs (CSGs) in the book.
Thank you.

The language definition L={i^n j^n k^m l^m | n,m ≥ 1} means a non-zero number of i's followed by the same number of j's, followed by an independently chosen non-zero number of k's followed by the same number of l's.
So, start with a starting rule to generate the two independent parts of the language:
1. S → XY
Add rules for generating 1 ij and 1 kl:
2. iXj → ij
3. kYl → kl
Add rules for generating multiple 'nested' sets:
4. X → iXj
5. Y → kYl
For example, a generation chain for iijjkkklll is:
→1 XY
→4 iXjY
→4 iiXjjY
→2 iijjY
→5 iijjkYl
→5 iijjkkYll
→5 iijjkkkYlll
→3 iijjkkklll
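One way to check a grammar like this is to replay its rules mechanically. The following is a minimal Python sketch, not part of the original answer: it applies rules 1 to 5 above as string rewrites (always at the leftmost match) and tests the result against the language definition. The names derive and in_language are made up for this illustration.

```python
import re


def derive(n, m):
    """Return the chain of sentential forms for i^n j^n k^m l^m (n, m >= 1)."""
    if n < 1 or m < 1:
        raise ValueError("n and m must both be at least 1")
    form = "XY"                                # rule 1: S -> XY
    chain = [form]
    for _ in range(n):                         # rule 4: X -> iXj, applied n times
        form = form.replace("X", "iXj", 1)
        chain.append(form)
    form = form.replace("iXj", "ij", 1)        # rule 2: iXj -> ij
    chain.append(form)
    for _ in range(m):                         # rule 5: Y -> kYl, applied m times
        form = form.replace("Y", "kYl", 1)
        chain.append(form)
    form = form.replace("kYl", "kl", 1)        # rule 3: kYl -> kl
    chain.append(form)
    return chain


def in_language(s):
    """Check membership in L = { i^n j^n k^m l^m | n, m >= 1 } directly."""
    match = re.fullmatch(r"(i+)(j+)(k+)(l+)", s)
    if match is None:
        return False
    i_run, j_run, k_run, l_run = match.groups()
    return len(i_run) == len(j_run) and len(k_run) == len(l_run)


for form in derive(2, 3):
    print(form)
```

Running it for n = 2, m = 3 prints the same chain as the example above, starting from XY. Comparing in_language against the last form for many small n and m is a cheap sanity check, though it is not a proof that the grammar generates exactly L.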

Related

How to generate distinct solutions in Prolog for '8 out of 10 cats does countdown' numbers game solver?

I wrote a Prolog program to find all solutions to any '8 out of 10 cats does countdown' number sequence. I am happy with the result. However, the solutions are not unique. I tried distinct() and reduced() from the "solution sequences" library. They did not produce unique solutions.
The problem is simple. You have a given list of six numbers [n1,n2,n3,n4,n5,n6] and a target number (R). Calculate R from any arbitrary combination of n1 to n6 using only +,-,*,/. You do not have to use all the numbers, but you can only use each number once. If two solutions are identical, only one must be generated and the other discarded.
Sometimes there are equivalent results with a different arrangement, such as:
(100+3)*6*75/50+25
(100+3)*75*6/50+25  
Does anyone have any suggestions to eliminate such redundancy?
Each solution is a nesting of operators and integers, for example +(2,*(4,-(10,5))). This solution is an unbalanced binary tree with arithmetic operators for the root and inner nodes and numbers for the leaf nodes. In order to have unique solutions, no two trees should be equivalent.
The Code:
:- use_module(library(lists)).
:- use_module(library(solution_sequences)).
solve(L,R,OP) :-
    findnsols(10,OP,solve_(L,R,OP),S),
    print_solutions(S).
solve_(L,R,OP) :-
    distinct(find_op(L,OP)),
    R =:= OP.
find_op(L,OP) :-
    select(N1,L,Ln),
    select(N2,Ln,[]),
    N1 > N2,
    member(OP,[+(N1,N2), -(N1,N2), *(N1,N2), /(N1,N2), N1, N2]).
find_op(L,OP) :-
    select(N,L,Ln),
    find_op(Ln,OP_),
    OP_ > N,
    member(OP,[+(OP_,N), -(OP_,N), *(OP_,N), /(OP_,N), OP_]).
print_solutions([]).
print_solutions([A|B]) :-
    format('~w~n',A),
    print_solutions(B).
Test:
solve([25,50,75,100,6,3],952,X)
Result
(100+3)*6*75/50+25 <- s1
((100+6)*3*75-50)/25 <- s2
(100+3)*75*6/50+25 <- s1
((100+6)*75*3-50)/25 <- s2
(100+3)*75/50*6+25 <- s1
true.
This code uses select/3 from the "lists" library.
UPDATE: Generate solutions using DCG
The following is an attempt to generate solutions using DCG. I was able to generate a more exhaustive solution set than with the previous code posted. In a way, using DCG resulted in more correct and elegant code. However, it is much more difficult to 'guess' what the code is doing.
The issue of redundant solutions still persists.
:- use_module(library(lists)).
:- use_module(library(solution_sequences)).
s(L) --> [L].
s(+(L,Ls)) --> [L],s(Ls).
s(*(L,Ls)) --> [L],s(Ls), {L =\= 1, Ls =\= 1, Ls =\= 0}.
s(-(L,Ls)) --> [L],s(Ls), {L =\= Ls, Ls =\= 0}.
s(/(L,Ls)) --> [L],s(Ls), {Ls =\= 1, Ls =\= 0}.
s(-(Ls,L)) --> [L],s(Ls), {L =\= Ls}.
s(/(Ls,L)) --> [L],s(Ls), {L =\= 1, Ls =\= 0}.
solution_list([N,H|[]],S) :-
    phrase(s(S),[N,H]).
solution_list([N,H|T],S) :-
    (   phrase(s(S),[N,H|T])
    ;   solution_list([H|T],S)
    ).
solve(L,R,S) :-
    permutation(L,X),
    solution_list(X,S),
    R =:= S.
Does anyone have any suggestions to eliminate such redundancy?
I suggest to define a sorting weight on each node (inner or leaf). The number resulting from reducing the child node could be used, although ties will appear. These can be broken by additionally looking at topmost operations, sorting * before + for example. Actually one would like to have a sorting operation for which "tie" means "exactly the same subtree of arithmetic operations".
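A minimal Python sketch of this idea follows, as an illustration rather than the answerer's exact scheme. It assumes expressions are nested tuples of the form (op, left, right) with integer leaves, and it uses repr as the sort key instead of the numeric weight suggested above, so that a tie can only occur between identical subtrees. The name canonical is hypothetical. It only merges reorderings of + and *; variants that move a division or subtraction around, such as (100+3)*75/50*6+25, are not recognised as equal.

```python
COMMUTATIVE = {"+", "*"}


def canonical(expr):
    """Return a canonical form in which operands of + and * are flattened and sorted."""
    if isinstance(expr, int):
        return expr
    op, left, right = expr
    if op not in COMMUTATIVE:
        # Order matters for - and /, so only the children are normalised.
        return (op, canonical(left), canonical(right))
    operands = []
    for child in (canonical(left), canonical(right)):
        if isinstance(child, tuple) and child[0] == op:
            operands.extend(child[1:])   # merge nested chains such as (a*b)*c
        else:
            operands.append(child)
    # repr gives a total order over ints and tuples; equal keys mean equal subtrees.
    return (op, *sorted(operands, key=repr))


# (100+3)*6*75/50+25 and (100+3)*75*6/50+25 from the question
first = ("+", ("/", ("*", ("*", ("+", 100, 3), 6), 75), 50), 25)
second = ("+", ("/", ("*", ("*", ("+", 100, 3), 75), 6), 50), 25)
print(canonical(first) == canonical(second))
```

Keeping a set of canonical forms already seen and discarding any solution whose canonical form is in the set removes this kind of duplicate.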
Since the OP is only seeking hints to help solve the problem, here are a few.
Use DCG as a generator. (SWI-Prolog) (Prolog DCG Primer)
a. For a more refined version of using DCGs as a generator, look for examples that use length/2. When you understand why, you might see a beam of light shining down on you for a few moments (the light beam is a video gaming thing).
Use a constraint solver (SWI-Prolog) (CLP(FD) and CLP(ℤ): Prolog Integer Arithmetic) (Understanding CLP(FD) Prolog code of N-queens problem)
Since your solutions are constrained to the 6 numbers and the operators are always binary operators (+,-,*,/) then it is possible to enumerate the unique binary trees. If you know about OEIS then you can find related links that can help you solve this problem, but you need to give OEIS a sequence. To get a sequence for use with OEIS draw the trees for N from 2 to 5 and then enter that sequence into OEIS and see what you get. e.g.
N is the number of leaf (*) nodes.
N=2 ( 1 way to draw the tree )
   -
  / \
 *   *
N=3 ( 2 ways to draw the tree )
     -        -
    / \      / \
   -   *    *   -
  / \          / \
 *   *        *   *
So the sequence starts with 1,2 ...
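If drawing the trees by hand gets tedious, the count can also be computed. This is a small Python sketch, assuming that "ways to draw the tree" means the number of distinct binary tree shapes with N leaves where every inner node has exactly two children; count_trees is a name made up here. A tree with N leaves splits at the root into a left subtree with k leaves and a right subtree with N - k leaves.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_trees(leaves):
    """Count binary tree shapes with the given number of leaves."""
    if leaves == 1:
        return 1  # a single leaf is the only shape
    # Sum over every way to split the leaves between the left and right subtree.
    return sum(count_trees(k) * count_trees(leaves - k) for k in range(1, leaves))


print([count_trees(n) for n in range(2, 6)])
```

The first two values agree with the drawings above, and the longer sequence is what you would enter into OEIS.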
Hint - This page (link died) shows the images of the trees to see if you have done it correctly. In the description I use N to count the number of leaves (*), but on this page they use N to count the number of internal nodes (-). If we call my N N1 and the page N N2, then the relation is N2 = N1 - 1
This might be a Hamiltonian Cycle (Wolfram World) (Hamiltonianicity of the Tower of Hanoi Problem) Remember that there is a relation between Binary Trees and the Tower of Hanoi, but in your case there are added constraints. I don't know if the constraints eliminate a solution as a Hamiltonian Cycle.
Also don't think of building the final answer from a combination of any number and operator, but instead build subsets of operators and numbers, and then use those subsets to build the answer. You constrain at the start, not at the end.
Or put another way, don't think combinations at the start, but permutations of combinations (not sure if that is the correct pattern, but in the ball park) and then using that build the tree.

Is (0*1*)* equal to (0 | 1)*?

Are the regular expressions (0*1*)* and (0 | 1)* the same?
Could someone provide a proof or intuitive disproof for that? I feel like it is true but I’m struggling to write a step by step proof.
Two different regular expressions or two grammars can generate the same languages as these do but the regular expressions or grammars are not the same. There is a standard method of constructing a non-deterministic finite state automaton from a regular expression and from that constructing a deterministic finite state automaton. That method will produce two different automata for the regular expressions in question. Though each one will recognize the same strings, they will go through different states in doing so.
The regular expressions are equivalent.
I don't have a fully rigorous proof, but hand-waving follows.
Let R1 = (0*1*)* and R2 = (0 | 1)*. These are both regular expressions over the alphabet A = {0, 1}.
Part 1:
0 | 1 is the set {0, 1}. 0 is an element of 0*1* (because 0 ∈ 0* and ɛ ∈ 1* and 0 ∘ ɛ = 0) and so is 1.
Thus 0 | 1 is a subset of 0*1*, which means (0 | 1)* is a subset of (0*1*)*, i.e. R2 ⊆ R1.
Part 2:
R2 covers all possible words over the alphabet A. That is, every string containing only the characters 0 and 1 is in R2. (This seems obvious to me; a proof probably involves the definition of * and/or induction.)
Therefore R1 ⊆ R2.
By combining part 1 and part 2 we get R1 = R2.
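The argument can be sanity-checked by brute force. The sketch below, in Python, compares the two expressions on every string over {0, 1} up to a chosen length using the standard re module; agree_up_to is a name invented for this example. Agreement on short strings is only evidence, not a proof.

```python
import re
from itertools import product

R1 = re.compile(r"(0*1*)*")
R2 = re.compile(r"(0|1)*")


def agree_up_to(max_len):
    """Return True if R1 and R2 accept exactly the same binary strings of length <= max_len."""
    for length in range(max_len + 1):
        for chars in product("01", repeat=length):
            s = "".join(chars)
            if bool(R1.fullmatch(s)) != bool(R2.fullmatch(s)):
                return False
    return True


print(agree_up_to(8))
```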

UHC Ruler getting started

I have just read about UHC's Ruler and would like to use it in my own project.
However, trying to compile examples from the paper about it gives me a bunch of syntax error messages.
Are there any examples of rule files (which can be compiled with the version from Hackage, 0.4.0.0) to start with?
There are example ruler files in the demo and test folders of the github repository.
The full text of the tst4 file looks like a good minimal case:
scheme Y =
hole [ a: A, b: B | | ]
judgeshape tex a `=` b
rules y scheme Y "" =
rule y =
judge Y = a `=` b
-
judge Y = b `=` a

Pumping lemma for regular language

I am a little confused about checking whether a given language is regular or not using the pumping lemma.
Suppose we have to check whether:
L, the language of strings with an even number of 0's, is regular or not.
We know that it is regular because we can construct a DFA for L. But I want to prove this with the pumping lemma.
Now suppose I take the string w = "0000":
I divide the string as x = 0, y = 0, and z = 00. Applying the pumping lemma with i = 2, I get the string "00000", which is not in my language, so by the pumping lemma the language is proven not regular. But it is accepted by a DFA?
Any help will be greatly appreciated
Thank you
You are not completely clear about the pumping lemma.
What the pumping lemma says:
Formal definition: Pumping lemma for regular languages
Let L be a regular language. Then there exists an integer p ≥ 1 depending only on L such that every string w in L of length at least p (p is called the "pumping length") can be written as w = xyz (i.e., w can be divided into three substrings), satisfying the following conditions:
|y| ≥ 1
|xy| ≤ p
for all i ≥ 0, x(y^i)z ∈ L
What this statement says is:
If a language really is regular, then there must be some way to generate (pump) new strings from all sufficiently large strings.
A sufficiently large string means a string in the language of length ≥ p.
So it may not be possible to generate new strings from small strings, even if the language is regular.
Some way means: if the language really is regular and our choice of w is correct, then there must be at least one way to break w into three parts xyz such that repeating (pumping) y any number of times generates new strings in the language.
A correct choice of w means: w is in the language and is sufficiently large (length ≥ p).
Note: for the second point, even if you break w into xyz in a way that satisfies conditions 1 and 2 of the formal definition, some of the newly generated strings may still fall outside the language, as happened in your attempt.
In that situation you have to retry with some other possible choice of y.
For your chosen string w = "0000" you can break w such that y = 00. With this choice of y, every newly generated string is in the language "even number of zeros".
One mistake in your proof is that you argue about one specific string, 0000. You should argue about all w with |w| ≥ p, so your proof is still incomplete.
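The difference between the two splits of "0000" can be seen by pumping them directly. This is a small Python sketch, assuming strings made of 0s only as in the question; in_language and pump are illustrative names, not from any library.

```python
def in_language(s):
    """True if s consists only of 0s and contains an even number of them."""
    return set(s) <= {"0"} and len(s) % 2 == 0


def pump(x, y, z, i):
    """Build the pumped string x y^i z."""
    return x + y * i + z


# The split from the question: x = 0, y = 0, z = 00.
print([in_language(pump("0", "0", "00", i)) for i in range(5)])

# The split suggested above: y = 00 (here with x empty and z = 00).
print([in_language(pump("", "00", "00", i)) for i in range(5)])
```

The first split leaves the language for odd i, which only shows that this particular y was a poor choice. The second split stays in the language for every i tried, which is what the lemma promises for at least one split.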
Read my answer IN CONTEXT OF PUMPING LEMMA FOR REGULAR LANGUAGES.
In that answer, I explained that breaking w into xyz and pumping y means finding the looping part and repeating it to generate new strings in the language.
When we argue that some language is regular, we don't actually know where the looping part is, so we try all possible choices that satisfy the pumping lemma's rules 1, 2 and 3.
The pumping lemma says that if a language is regular and infinite, then there must be a loop in the DFA, and every sufficiently large string in the language passes through the looping part of the DFA (by the pigeonhole principle), hence y can't be null. That's rule 1 in the formal definition above.
Note that the loop can be at the initial position or at the end, so x and z can be null strings.
But we don't actually know where the loop falls in the DFA, so we try all possible ways.
To argue that a language is regular (see the caution below): you have to show that for all sufficiently long strings (w) in the language there is at least one way (y) to generate new strings in the language by repeating the looping part any number (i) of times.
To prove a language is not regular: you have to find at least one sufficiently long string (w) in the language such that no choice of y makes all possible repetitions (i) generate strings in the language.
To prove using the pumping lemma:
+-------------------------+----------------------------+------------------+----------------+
|                         | Sufficiently large w in L  | y                | i >= 0         |
+-------------------------+----------------------------+------------------+----------------+
| Language is regular     | For all w (every w can be  | At least one     | For all i >= 0 |
|                         | used to generate new w'    |                  |                |
|                         | in L)                      |                  |                |
+-------------------------+----------------------------+------------------+----------------+
| Language is NOT regular | Find any w (at least one w | For all y (show  | At least one i |
|                         | that cannot generate new   | no suitable y    |                |
|                         | w' in L)                   | exists)          |                |
+-------------------------+----------------------------+------------------+----------------+
CAUTION: This rule does not always work to prove that a language is regular.
The pumping lemma is a necessary but not sufficient condition for a language to be regular. A language that satisfies these conditions may still be non-regular.
Reference
To prove that a language is regular, you need a condition that is both necessary and sufficient for regularity.
Although the accepted answer is complete in its own way, I want to add a few things. I have a very playful way to exploit the pumping lemma to prove that a given language is not a regular language.
Just to have a context to talk about, let me state the lemma:
Pumping Lemma for Regular Languages:
For any regular language L, there exists an integer k.
Such that for all x ∈ L with |x| ≥ k, there exists u, v, w ∈ Σ∗, such
that x = uvw, and
(1) |uv| ≤ k
(2) |v| ≥ 1
(3) for all i ≥ 0: u(v^i)w ∈ L
The k is called the Pumping lemma constant. Let me come straight to the point and show you how to go about proving a language L is not regular.
Now to start the game you need two players here. One is the Reader(R) and the other is the Adversary(A).
Input: A language L
The Goal of R: Somehow prove the language L is not regular by some contradiction.
The Goal of A: Somehow be prepared enough to face the arguments of R and do not let him/her create a contradiction.
Now let us start the argument.
A: The language L is regular, because no one can show a contradiction using the pumping lemma with a certain pumping constant k. (Each language is mapped to a single integer k.)
R: Let me assume what you say. If the language L is regular, then it must satisfy the conditions of the pumping lemma. So let me choose a suitable string x ∈ L (|x| >= k) that helps me create a contradiction later.
A: Challenged by R, A tries its best to find at least one suitable partitioning u, v and w of the string x, such that
x = uvw and |uv| <= k and |v| > 0
R: For every possible partition given by A, R wins the argument if able to show an integer i >= 0 such that
u(v^i)w ∉ L
Because now R has shown that the language L has at least one string x which doesn't have any partition (u, v, w) satisfying the pumping lemma. The contradiction arises because our assumption that L is regular is FALSE. Therefore the language L is proven to be not regular.
If R is not able to show the above, this is not a proof that the language is regular. It just means that L may be a regular language, or an irregular language which just happens to satisfy the pumping lemma conditions.
Always remember, the pumping lemma has the form: if L is regular, then the conditions hold. The converse is not necessarily TRUE, although it might be TRUE in some cases.
Therefore the pumping lemma is useful only if you want to prove that a language is not regular.
(Source: Theory of Computation (NPTEL): Prof. Somenath Biswas (IIT Kanpur))
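To make the game concrete, here is a Python sketch that plays it for one example language, L = { 0^n 1^n | n ≥ 0 }, which is an assumption of this illustration and not taken from the lecture. For a given k, R picks x = 0^k 1^k, A's options are all partitions with |uv| <= k and |v| > 0, and R looks for an i that throws u(v^i)w out of L. The names in_language and reader_wins are made up.

```python
def in_language(s):
    """True if s has the form 0^n 1^n for some n >= 0."""
    half = len(s) // 2
    return len(s) % 2 == 0 and s == "0" * half + "1" * half


def reader_wins(k):
    """True if R can refute every partition A may offer for x = 0^k 1^k."""
    x = "0" * k + "1" * k
    for uv_len in range(1, k + 1):            # condition |uv| <= k
        for v_len in range(1, uv_len + 1):    # condition |v| > 0
            u = x[:uv_len - v_len]
            v = x[uv_len - v_len:uv_len]
            w = x[uv_len:]
            # R tries i = 0, 1, 2; finding one pumped string outside L is enough.
            if all(in_language(u + v * i + w) for i in range(3)):
                return False                  # A found a partition R cannot refute
    return True


print(all(reader_wins(k) for k in range(1, 8)))
```

The code can only try finitely many k, while a real proof has to argue for every k: since |uv| <= k, v consists of 0s only, so pumping changes the number of 0s without changing the number of 1s.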

Could a concatenative language use prefix notation?

Concatenative languages have some very intriguing characteristics, such as being able to compose functions of different arity and being able to factor out any section of a function. However, many people dismiss them because of their use of postfix notation and how it's tough to read. Plus the Polish probably don't appreciate people using their carefully crafted notation backwards.
So, is it possible to have prefix notation? If it is, what would the tradeoffs be?
I have an idea of how it could work, but I'm not experienced with concatenative languages so I'm probably missing something. Basically, a function would be evaluated in reverse order and values would be pulled from the stack in reverse order. To demonstrate this, I'll compare postfix to what prefix would look like. Here are some concatenative expressions with the traditional postfix notation.
5 dup * ! Multiply 5 by itself
3 2 - ! Subtract 2 from 3
(1, 2, 3, 4, 5) [2 >] filter length ! Get the number of integers from 1 to 5
! that are greater than 2
The expressions are evaluated from left to right: in the first example, 5 is pushed on the stack, then dup duplicates the top value on the stack, then * multiplies the top two values on the stack. Functions pull their last argument first from the stack: in the second example, when - is called, 2 is at the top of the stack, but it is the last argument.
Here is what I think prefix notation would look like:
* dup 5
- 3 2
length filter (1, 2, 3, 4, 5) [< 2]
The expressions are evaluated from right to left, and functions pull their first argument first from the stack. Note how the prefix filter example reads much more closely to its description and looks similar to the applicative style. One issue I noticed is factoring things out might not be as useful. For example, in postfix notation you can factor out 2 - from 3 2 - to create a subtractTwo function. In prefix notation you can factor out - 3 from - 3 2 to create a subtractFromThree function, which doesn't seem as useful.
Barring any glaring issues, perhaps a concatenative language that uses prefix notation could win over the people who dislike postfix notation. Any insight is appreciated.
Well certainly, if your words are still fixed-arity then it's just a matter of executing tokens right to left.
It's only because of n-arity functions that prefix notation implies parenthesis, and it's only because of wanting human "reading order" to match execution order that being a stack language implies postfix.
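The right-to-left execution described here can be sketched in a few lines of Python. This is a toy evaluator written for illustration, not the language from either post: it assumes whitespace-separated tokens, integer literals, and only the words dup, +, - and *, with each binary word pulling its first argument from the top of the stack. eval_prefix is a hypothetical name.

```python
def eval_prefix(source):
    """Evaluate a prefix concatenative program by executing tokens right to left."""
    stack = []
    for token in reversed(source.split()):
        if token == "dup":
            stack.append(stack[-1])       # duplicate the top value
        elif token in ("+", "-", "*"):
            first = stack.pop()           # first argument is on top of the stack
            second = stack.pop()
            if token == "+":
                stack.append(first + second)
            elif token == "-":
                stack.append(first - second)
            else:
                stack.append(first * second)
        else:
            stack.append(int(token))      # anything else is treated as an integer literal
    return stack


print(eval_prefix("* dup 5"))   # multiply 5 by itself
print(eval_prefix("- 3 2"))     # subtract 2 from 3
```

Because every word has a fixed arity, no parentheses are needed, and the only change from a postfix evaluator is the reversed token order and the argument order inside each word.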
I'm writing such a language right now as it happens, and so far I like some of the side-effects of using prefix notation. The semantics are based on Joy:
Files are parsed from left to right, but executed from right to left.
By extension, definitions must come after the point at which they are used.
As a nice side-effect, comments are simply lists which are dropped.
Here's the factorial function, for instance:
def 'fact [cond [* fact - 1 dup] [1 drop] dup]
I also find it easier to reason about the code as I'm writing it, but I don't have a strong background in concatenative languages. Here's my (probably-naive) derivation of the map function over lists. The 'nb' function drops something and is used for comments. 'stash [f]' pops into a temp, runs 'f' on the rest of the stack, then pushes the temp back on.
def 'map [q [cons map stash [head swap i] dup stash [tail dup]] [nb] is_cons nip]
nb [map [f] (cons x y) -> cons map [f] x f y
stash [tail dup] [f] (cons x y) = [f] y (cons x y)
dup [f] y (cons x y) = [f] [f] y (cons x y)
stash [head swap i] [f] [f] y (cons x y) = [f] x (f y)
cons map [f] x (f y) = cons map [f] x f y
map [f] [] -> []]
I just came from reading about the Om language.
It seems to be just what you are talking about. From its description (emphasis mine):
The Om language is:
a novel, maximally-simple concatenative, homoiconic programming and algorithm notation language with:
minimal syntax, comprised of only three elements.
prefix notation, in which functions manipulate the remainder of the program itself. [...]
It also states that it's not finished, and will experience much change yet.
Still, it seems to be working, and really interesting as proof of concept.
I imagine a concatenative prefix language without a stack. It could call functions, which would then themselves interpret code until they got all the operands they need. The interpreter would then call the next function. It would only need one memory construct: the result. Everything else could be read from the source code at the time of execution. As you might have noticed, I am talking about an interpreted language, not a compiled one.

Resources