1 or 2 right-hand-side symbols in a context-free grammar

I have two questions to ask, and I have some ideas about them as well.
1) X-context-free grammar (X-CFG): a CFG with exactly 1 terminal or variable on the right-hand side of every rule.
2) Y-CFG: a CFG with exactly 2 terminals or variables on the right-hand side of every rule.
Questions:
a) Do they generate any non-regular languages? Prove it.
b) Do they generate all regular languages? Prove it.
Answers:
a) I think an X-CFG cannot generate any non-regular language, because it can only generate a finite number of strings, so it cannot generate any non-regular language.
b) There are an infinite number of regular languages, like a*. We cannot generate infinite strings with such a CFG, so we can say that it cannot generate all regular languages.
Am I right?
I have no idea about question b.

Consider the Y-CFG:
S → aA
S → ab
A → Sb
Is this not a Y-CFG that satisfies your constraints and generates a non-regular language, namely a^n b^n with n >= 1? For example: S → aA → aSb → aabb.
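To sanity-check that claim, here is a small Haskell sketch (my own addition, not part of the original answer) that expands the grammar breadth-first and prints the first few strings it generates:
import Data.List (partition)
-- Expand the leftmost nonterminal (S or A) in every possible way.
step :: String -> [String]
step s =
  case break (`elem` "SA") s of
    (_,   [])       -> []
    (pre, n : post) -> [ pre ++ rhs ++ post | rhs <- rules n ]
  where
    rules 'S' = ["aA", "ab"]
    rules 'A' = ["Sb"]
    rules _   = []
-- Breadth-first enumeration: emit fully terminal strings, keep expanding the rest.
language :: [String]
language = go ["S"]
  where
    go [] = []
    go xs = let (done, todo) = partition (all (`elem` "ab")) xs
            in done ++ go (concatMap step todo)
main :: IO ()
main = print (take 4 language)   -- prints ["ab","aabb","aaabbb","aaaabbbb"]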
Also see this, as it states: every regular grammar is context-free, but not all context-free grammars are regular. It also explains a little of the why, with an example to help grasp it.
So I believe both of your answers are wrong, if I understood the question correctly.
UPDATE
Every regular grammar is context-free. So then the question is: can we define all CFGs using right-hand sides of at most two symbols (t is a terminal and N is a non-terminal):
S → SS
S → t
S → N
S → tN
S → Nt
Therefore we can define things that terminate, things that grow out from multiple starting strings, things that grow at the front, and things that grow at the back, which covers every case in a CFG. So I would say yes, you can generate all regular languages.

How to make a partial function?

I was thinking about how I could save myself from undefinedness, and one idea I had was to enumerate all possible sources of partiality. At least I would know what to beware of. I have found three so far:
Incomplete pattern matches or guards.
Recursion. (Optionally excluding structural recursion on algebraic types.)
If a function is unsafe, any use of that function infects the user code. (Should I be saying "partiality is transitive"?)
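For concreteness, here are minimal Haskell examples of points 1 and 2 (these snippets are just illustrations I am adding; they use nothing beyond plain Haskell):
-- Point 1: an incomplete pattern match. head' [] matches no clause,
-- so evaluating it raises a runtime error, i.e. the result is bottom.
head' :: [a] -> a
head' (x:_) = x
-- Point 2: unrestricted recursion. This typechecks at every type,
-- yet never produces a value.
loop :: a
loop = loop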
I have heard of other ways to obtain a logical contradiction, for instance by using negative types, but I am not sure whether anything of that sort applies to Haskell. There are many logical paradoxes out there, and some of them can be encoded in Haskell, but might it be true that any logical paradox requires the use of recursion, and is therefore covered by point 2 above?
For instance, if it were proven that a Haskell expression free of recursion can always be evaluated to normal form, then the three points I give would be a complete list. I fuzzily remember seeing something like a proof of this in one of Simon Peyton Jones's books, but that was written some 30 years ago, so even if I remember correctly and it applied to a prototype Haskell back then, it may be false today, seeing how many language extensions we have. Possibly some of them enable other ways to make a program undefined?
And then, if it were so easy to detect expressions that cannot be partial, why do we not do that? How much easier life would be!
This is a partial answer (pun intended), where I'll only list a few arguably non-obvious ways one can achieve non-termination.
First, I'll confirm that negative-recursive types can indeed cause non-termination. Indeed, it is known that allowing a recursive type such as
data R a = R (R a -> a)
allows one to define fix, and to obtain non-termination from there.
{-# LANGUAGE ScopedTypeVariables #-}
{-# OPTIONS -Wall #-}
data R a = R (R a -> a)
selfApply :: R a -> a
selfApply t@(R x) = x t
-- Church's fixed point combinator Y
-- fix f = (\x. f (x x))(\x. f (x x))
fix :: forall a. (a -> a) -> a
fix f = selfApply (R (\x -> f (selfApply x)))
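With this fix in hand, a diverging term at any type follows immediately; for example (these lines are my addition, not part of the quoted code):
-- Evaluating bottom never terminates: it just unfolds fix id forever.
bottom :: a
bottom = fix id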
Total languages like Coq or Agda prohibit this by requiring recursive types to use only strictly-positive recursion.
Another potential source of non-termination is that Haskell allows Type :: Type. As far as I can see, that makes it possible to encode System U in Haskell, where Girard's paradox can be used to cause a logical inconsistency, constructing a term of type Void. That term (as far as I understand) would be non-terminating.
Girard's paradox is unfortunately rather complex to describe fully, and I have not completely studied it yet. I only know it is related to the so-called hypergame, a game in which the first move is to choose a finite game to play. A finite game is one which causes every match to terminate after finitely many moves. The moves after the first then correspond to a match of the finite game chosen at step one. Here's the paradox: since the chosen game must be finite, no matter what it is, the whole hypergame match will always terminate after a finite number of moves. This makes hypergame itself a finite game, which makes the infinite sequence of moves "I choose hypergame, I choose hypergame, ..." a valid play, in turn proving that hypergame is not finite.
Apparently, this argument can be encoded in a rich enough pure type system like System U, and Type :: Type allows one to embed the same argument in Haskell.

Will L = {a*b*} be classified as a regular language?

Will L = {a*b*} be classified as a regular language?
I am confused because I know that L = {a^n b^n} is not regular. What difference does the Kleene star make?
Well, it does make a difference whether you have L = {a^n b^n} or L = {a*b*}.
The a^n b^n language is one where you must have the same number of a's and b's, for example {aaabbb, ab, aabb, etc.}. As you said, this is not a regular language.
But when we talk about L = {a*b*} it is a bit different: here you can have any number of a's followed by any number of b's (including zero of either). Some examples are:
{a, b, aaab, aabbb, aabbbb, etc.}
As you can see, it is different from the {a^n b^n} language, where you needed the same number of a's and b's.
And yes, a*b* is regular by its nature. If you want a good explanation of why it is regular, you can check How to prove a language is regular; they might have a better explanation than me (:
I hope this helps.
The language described by the regular expression a*b* is regular by definition. These expressions cannot describe any non-regular language and are indeed one of the ways of defining the regular languages.
{a^n b^n : n > 0} (this would be a formally complete way of describing it), on the other hand, cannot be described by a regular expression. Intuitively, when reaching the border between the a's and the b's you need to remember n. Since n is not bounded, no finite-memory device can do that. In a*b* you only need to remember that from now on only b's should appear; this is very finite. The two stars are in some sense unrelated: each expands its own block independently of the other.
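To make the "very finite" point concrete, here is a small Haskell sketch of such a finite-memory recognizer (the code and its names are my own illustration); the only thing it remembers is whether a b has been seen yet.
-- Accepts exactly the strings of the form a*b*.
recognize :: String -> Bool
recognize = go False                    -- the Bool means "have we seen a 'b' yet?"
  where
    go _     []       = True
    go False ('a':xs) = go False xs     -- still inside the block of a's
    go _     ('b':xs) = go True xs      -- from now on, only b's are allowed
    go _     _        = False           -- an 'a' after a 'b', or a foreign symbol
-- recognize "aaabb" == True, recognize "aabba" == False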

Pros / Cons of Tacit Programming in J

As a beginner in J, I am often confronted with tacit programs which seem quite byzantine compared to the more familiar explicit form.
Now, just because I find them hard to interpret does not mean that the tacit form is incorrect or wrong. Very often the tacit form is considerably shorter than the explicit form, and thus easier to take in visually all at once.
Question to the experts: do these tacit forms convey a better sense of structure, and maybe distil out the underlying computational mechanisms? Are there other benefits?
I'm hoping the answer is yes, and true for some non-trivial examples...
Tacit programming is usually faster and more efficient, because you can tell J exactly what you want to do, instead of making it work things out as it goes along your sentence. But as someone who loves the hell out of tacit programming, I can also say that tacit programming encourages you to think about things in the J way.
To spoil the ending and answer your question: yes, tacit programming can and does convey information about structure. Technically, it emphasizes meaning above all else, but many of the operators that feature prominently in the less-trivial expressions you'll encounter (#: & &. ^: to name a few) have very structure-related meanings.
The canonical example of why it pays to write tacit code is the special code for modular exponentiation, along with the assurance that there are many more shortcuts like it:
ts =: 6!:2 , 7!:2@] NB. time and space
100 ts '2 (1e6&| # ^) 8888x'
2.3356e_5 16640
100 ts '1e6 | 2 ^ 8888x'
0.00787232 8.496e6
The other major thing you'll hear said is that when J sees an explicit definition, it has to parse and eval it every single time it applies it:
NB. use rank 0 to apply the verb a large number of times
100 ts 'i (4 : ''x + y + x * y'')"0 i=.i.100 100' NB. naive
0.0136254 404096
100 ts 'i (+ + *)"0 i=.i.100 100' NB. tacit
0.00271868 265728
NB. J is spending the time difference reinterpreting the definition each time
100 ts 'i (4 : ''x (+ + *) y'')"0 i=.i.100 100'
0.0136336 273024
But both of these reasons take a backseat to the idea that J has a very distinct style of solving problems. There is no if, there is ^:. There is no looping, there is rank. Likewise, Ken saw beauty in the fact that in calculus, f+g was the pointwise sum of functions—indeed, one defines f+g to be the function where (f+g)(x) = f(x) + g(x)—and since J was already so good at pointwise array addition, why stop there?
Just as a language like Haskell revels in the pleasure of combining higher-order functions together instead of "manually" syncing them up end to end, so does J. Semantically, take a look at the following examples:
h =: 3 : '(f y) + g y' – h is a function that grabs its argument y, plugs it into f and g, and funnels the results into a sum.
h =: f + g – h is the sum of the functions f and g.
(A < B) +. (A = B) – "A is less than B or A is equal to B."
A (< +. =) B – "A is less than or equal to B."
It's a lot more algebraic. And I've only talked about trains thus far; there's a lot to be said about the handiness of tools like ^: or &.. The lesson is fairly clear, though: J wants it to be easy to talk about your functions algebraically. If you had to wrap all your actions in a 3 :'' or 4 :''—or worse, name them on a separate line!—every time you wanted to apply them interestingly (like via / or ^: or ;.) you'd probably be very turned off from J.
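To echo the Haskell comparison above: the tacit sum of verbs h =: f + g has a direct point-free counterpart in Haskell (a sketch with made-up f and g, purely for illustration):
import Control.Applicative (liftA2)
f, g :: Double -> Double
f x = x * x
g x = x + 1
-- The pointwise sum of two functions, written without naming the argument,
-- mirroring the J train h =: f + g.
h :: Double -> Double
h = liftA2 (+) f g
-- h 3 == f 3 + g 3 == 13.0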
Sure, I admit you will be hard-pressed to find examples as elegant as these as your expressions get more complex. The tacit style just takes some getting used to. The vocab has to be familiar (if not second nature) to you, and even then sometimes you have the pleasure of slogging through code that is simply inexcusable. This can happen with any language.
Not an expert, but the biggest positive aspects of coding tacitly for me are 1) that it makes it a little easier to write programs that write programs, and 2) that it is a little easier for me to grasp the J way of approaching problems (which is a big part of why I like to program in J). Explicit code feels more like procedural programming, especially if I am using control words such as if., while. or select. .
The challenges are that 1) explicit code sometimes runs faster than tacit code, though this depends on the task and the algorithm, and 2) tacit code is interpreted as it is parsed, which means there are times when explicit code is cleaner because you can leave the code waiting for variable values that are only defined at run time.

Pattern matching identical values

I just wondered whether it's possible to match against the same value multiple times with the pattern-matching facilities of functional programming languages (Haskell/F#/Caml).
Just think of the following example:
plus a a = 2 * a
plus a b = a + b
The first variant would be called when the function is invoked with two equal values (which would be bound to a).
A more useful application would be this (simplifying an AST):
simplify (Add a a) = Mult 2 a
But Haskell rejects this code and warns me of conflicting definitions for a; I have to do explicit case/if checks instead to find out whether the function got identical values. Is there any trick to indicate that a variable I want to match against will occur multiple times?
This is called a nonlinear pattern. There have been several threads on the haskell-cafe mailing list about this, not long ago. Here are two:
http://www.mail-archive.com/haskell-cafe@haskell.org/msg59617.html
http://www.mail-archive.com/haskell-cafe@haskell.org/msg62491.html
Bottom line: it's not impossible to implement, but was decided against for sake of simplicity.
By the way, you do not need if or case to work around this; the (slightly) cleaner way is to use a guard:
a `plus` b
  | a == b    = 2 * a
  | otherwise = a + b
You can't have two parameters with the same name to indicate that they should be equal, but you can use guards to distinguish cases like this:
plus a b
  | a == b    = 2 * a
  | otherwise = a + b
This is more flexible since it also works for more complicated conditions than simple equality.
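Applying the same guard idea to the AST example from the question might look like this (the Expr type below is a made-up stand-in, since the question only shows the Add and Mult constructors):
data Expr = Const Int | Add Expr Expr | Mult Expr Expr
  deriving (Eq, Show)
-- Match two sub-expressions, then test them for equality in a guard;
-- if the guard fails, fall through to the catch-all clause.
simplify :: Expr -> Expr
simplify (Add a b)
  | a == b = Mult (Const 2) a
simplify e = e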
I have just looked up the mailing-list threads given in Thomas's answer, and the very first reply in one of them makes good sense and explains why such a "pattern" would not make much sense in general: what if a is a function? (It is impossible in general to check whether two functions are equal.)
I have implemented, in Haskell, a new functional programming language that can handle non-linear patterns.
https://github.com/egison/egison
In my language, your plus function is written as follows.
(define $plus
  (match-lambda [integer integer]
    {[[$a ,a] (* a 2)]
     [[$a $b] (+ a b)]}))

Explain concatenative languages to me like I'm an 8-year-old

I've read the Wikipedia article on concatenative languages, and I am now more confused than I was when I started. :-)
What is a concatenative language in stupid people terms?
In normal programming languages, you have variables which can be defined freely, and you call methods using these variables as arguments. This is simple to understand but somewhat limited. Often it is hard to reuse an existing method: you simply can't map your existing variables onto the parameters the method needs, or method A calls another method B and A would be perfect for you if you could only replace the call to B with a call to C.
Concatenative languages use a fixed data structure to hold values (usually a stack or a list). There are no variables. This means that many methods and functions have the same "API": they work on whatever someone else left on the stack. In addition, code itself is treated as data, i.e. it is common to write code which can modify itself or which accepts other code as a "parameter" (i.e. as an element on the stack).
These attributes make such languages perfect for chaining existing code together to create something new. Reuse is built in. You can write a function which accepts a list and a piece of code and calls the code for each item in the list. This will then work on any kind of data as long as it behaves like a list: results from a database, a row of pixels from an image, characters in a string, etc.
The biggest problem is that you have no hint of what's going on. There are only a couple of data types (list, string, number), so everything gets mapped to those. When you get a piece of data, you usually don't care what it is or where it came from. But that makes it hard to follow data through the code to see what is happening to it.
I believe it takes a certain mindset to use these languages successfully. They are not for everyone.
[EDIT] Forth has some penetration, but not that much. You can find PostScript in any modern laser printer. So they are niche languages.
On a functional level, they are on par with LISP, C-like languages and SQL: all of them are Turing complete, so you can compute anything. It's just a matter of how much code you have to write. Some things are simpler in LISP, some are simpler in C, some are simpler in query languages. The question of which is "better" is futile unless you have a context.
First, I'm going to rebut Norman Ramsey's assertion that there is no theory.
Theory of Concatenative Languages
A concatenative language is a functional programming language, where the default operation (what happens when two terms are side by side) is function composition instead of function application. It is as simple as that.
So for example in the SKI Combinator Calculus (one of the simplest functional languages) two terms side by side are equivalent to applying the first term to the second term. For example: S K K is equivalent to S(K)(K).
In a concatenative language S K K would be equivalent to S . K . K in Haskell.
So what's the big deal
A pure concatenative language has the interesting property that the order of evaluation of terms does not matter: in a concatenative language (S K) K is the same as S (K K), because function composition is associative. This does not apply to the SKI calculus or any other functional programming language based on function application.
One reason this observation is interesting is that it reveals opportunities for parallelization when evaluating code expressed in terms of function composition instead of application.
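Here is a tiny Haskell sketch of that associativity point, using arbitrary concrete functions in place of S and K (my own illustration, not from the answer): grouping the compositions either way yields the same function.
f, g, h :: Int -> Int
f = (+ 1)
g = (* 2)
h = subtract 3
grouped1, grouped2 :: Int -> Int
grouped1 = (f . g) . h        -- compose the first two, then the third
grouped2 = f . (g . h)        -- group the other way
main :: IO ()
main = print (map grouped1 [0..4] == map grouped2 [0..4])   -- True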
Now for the real world
The semantics of stack-based languages which support higher-order functions can be explained using a concatenative calculus. You simply map each term (command/expression/sub-program) to be a function that takes a function as input and returns a function as output. The entire program is effectively a single stack transformation function.
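One way to make this concrete is the following Haskell sketch of my own, which models each word as a plain stack-to-stack function rather than following the author's exact higher-order formulation; the whole program is then just the composition of its words.
type Stack = [Int]
type Word' = Stack -> Stack   -- every "word" transforms the stack
push :: Int -> Word'
push n s = n : s
add :: Word'
add (x : y : s) = (x + y) : s
add _           = error "stack underflow"
-- The program "push 4, push 3, add" is the composition of its words
-- (composed right to left, so the first word is applied first).
program :: Word'
program = add . push 3 . push 4
-- program [] == [7]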
The reality is that things are always distorted in the real world (e.g. FORTH has a global dictionary, PostScript does weird things where the evaluation order matters). Most practical programming languages don't adhere perfectly to a theoretical model.
Final Words
I don't think a typical programmer, or an 8-year-old, should ever worry about what a concatenative language is. I also don't find it particularly useful to pigeonhole programming languages as being type X or type Y.
After reading http://concatenative.org/wiki/view/Concatenative%20language and drawing on what little I remember of fiddling around with Forth as a teenager, I believe that the key things about concatenative programming are:
viewing data in terms of values on a specific data stack
and functions manipulating stuff in terms of popping/pushing values on that same data stack
Check out these quotes from the above webpage:
There are two terms that get thrown around, stack language and concatenative language. Both define similar but not equal classes of languages. For the most part though, they are identical.
Most languages in widespread use today are applicative languages: the central construct in the language is some form of function call, where a function is applied to a set of parameters, where each parameter is itself the result of a function call, the name of a variable, or a constant. In stack languages, a function call is made by simply writing the name of the function; the parameters are implicit, and they have to already be on the stack when the call is made. The result of the function call (if any) is then left on the stack after the function returns, for the next function to consume, and so on. Because functions are invoked simply by mentioning their name without any additional syntax, Forth and Factor refer to functions as "words", because in the syntax they really are just words.
This is in contrast to applicative languages that apply their functions directly to specific variables.
Example: adding two numbers.
Applicative language:
int foo(int a, int b)
{
    return a + b;
}
var c = 4;
var d = 3;
var g = foo(c,d);
Concatenative language (I made it up, supposed to be similar to Forth... ;) )
push 4
push 3
+
pop
While I don't think concatenative language = stack language (as the authors point out above, they are not quite the same class), the two do seem similar.
I reckon the main idea is: 1. we can create new programs simply by joining other programs together.
Also: 2. any random chunk of the program is a valid function (or sub-program).
Good old pure RPN Forth has those properties, excluding any random non-RPN syntax.
In the program 1 2 + 3 *, the sub-program + 3 * takes 2 args, and gives 1 result. The sub-program 2 takes 0 args and returns 1 result. Any chunk is a function, and that is nice!
You can create new functions by lumping two or more others together, optionally with a little glue. It will work best if the types match!
These ideas are really good; we value simplicity.
It is not limited to RPN Forth-style serial languages, nor to imperative or functional programming. The two ideas also work for a graphical language, where program units might be, for example, functions, procedures, relations, or processes.
In a network of communicating processes, every sub-network can act like a process.
In a graph of mathematical relations, every sub-graph is a valid relation.
These structures are 'concatenative', we can break them apart in any way (draw circles), and join them together in many ways (draw lines).
Well, that's how I see it. I'm sure I've missed many other good ideas from the concatenative camp. While I'm keen on graphical programming, I'm new to this focus on concatenation.
My pragmatic (and subjective) definition of concatenative programming (now you can skip reading the rest of this):
-> function composition in extreme ways (with Reverse Polish notation (RPN) syntax):
( Forth code )
: fib
  dup 2 <= if
    drop 1
  else
    dup 1 - recurse
    swap 2 - recurse +
  then ;
-> everything is a function, or at least, can be a function:
( Forth code )
: 1 1 ; \ define a function 1 to push the literal number 1 on stack
-> arguments are passed implicitly to functions (OK, this sounds more like a definition of tacit programming), but this in Forth:
a b c
may be in Lisp:
(c a b)
(c (b a))
(c (b (a)))
so, it's easy to generate ambiguous code...
You can write definitions that push the xt (execution token) onto the stack and define a small alias for execute:
( Forth code )
: <- execute ; \ apply function
so, you'll get:
a b c <- \ Lisp: (c a b)
a b <- c <- \ Lisp: (c (b a))
a <- b <- c <- \ Lisp: (c (b (a)))
To your simple question, here's a subjective and argumentative answer.
I looked at the article and several related web pages. The web pages themselves say that there isn't a real theory, so it's no wonder that people are having a hard time coming up with a precise and understandable definition. I would say that, at present, it is not useful to classify languages as "concatenative" or "not concatenative".
To me it looks like a term that gives Manfred von Thun a place to hang his hat but may not be useful for other programmers.
While PostScript and Forth are worth studying, I don't see anything terribly new or interesting in Manfred von Thun's Joy programming language. Indeed, if you read Chris Okasaki's paper "Techniques for Embedding Postfix Languages in Haskell", you can try out all this stuff in a setting that, relative to Joy, is totally mainstream.
So my answer is there's no simple explanation because there's no mature theory underlying the idea of a concatenative language. (As Einstein and Feynman said, if you can't explain your idea to a college freshman, you don't really understand it.) I'll go further and say although studying some of these languages, like Forth and PostScript, is an excellent use of time, trying to figure out exactly what people mean when they say "concatenative" is probably a waste of your time.
You can't explain a language; just get one (Factor, preferably) and try some tutorials on it. Tutorials are better than Stack Overflow answers.
