What is a "free variable"? - haskell

(I'm sure this must have been answered on this site already, but search gets inundated with the concept of calling free() on a variable in C.)
I came across the term "eta reduction," which was defined something like f x = M x ==> M if x is "not free in M". I mean, I think I understand the gist of what it's trying to say, it seems like what you do when you convert a function to point-free style, but I don't know what the qualifier about x not being free means.

Here's an example:
\f -> f x
In this lambda, x is a free variable. Basically a free variable is a variable used in a lambda that is not one of the lambda's arguments (or a let variable). It comes from outside the context of the lambda.
Eta reduction means we can change:
(\x -> g x) to (g)
But only if x is not free in g (i.e. g does not itself refer to x). Otherwise we'd be creating an expression that refers to an unknown variable:
(\x -> (x+) x) to (x+) ???

Well, here's the relevant Wikipedia article, for what that's worth.
The short version is that such definitions elide the body of a lambda expression using a placeholder like "M", and so have to specify additionally that the variable being bound by that lambda isn't used in whatever the placeholder represents.
So, a "free variable" here roughly means a variable defined in some ambiguous or unknown outer scope--e.g., in an expression like \y -> x + y, x is a free variable but y is not.
Eta reduction is about removing a superfluous layer that binds a variable only to immediately apply a function to it, which (as you would probably imagine) is only valid if the variable in question is used in that one place and nowhere else.
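For a concrete illustration (my own example, not from the question), here is a definition before and after eta reduction:
addOne :: Int -> Int
addOne x = succ x    -- the lambda \x -> succ x, written as an equation

addOne' :: Int -> Int
addOne' = succ       -- eta-reduced: valid because x is not free in succ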

Related

Can any recursive definition be rewritten using foldr?

Say I have a general recursive definition in haskell like this:
foo a0 a1 ... = base_case
foo b0 b1 ...
  | cond1 = recursive_case_1
  | cond2 = recursive_case_2
...
Can it always be rewritten using foldr? Can it be proved?
If we interpret your question literally, we can write const value foldr to achieve any value, as @DanielWagner pointed out in a comment.
A more interesting question is whether we can instead forbid general recursion from Haskell, and "recurse" only through the eliminators/catamorphisms associated to each user-defined data type, which are the natural generalization of foldr to inductively defined data types. This is, essentially, (higher-order) primitive recursion.
Under this restriction, we can only compose terminating functions (the eliminators) together. This means that we can no longer define non-terminating functions.
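To make "recursing only through eliminators" concrete, here is a sketch (my own example) of the eliminator for a user-defined natural-number type, the analogue of foldr for lists. (Inside full Haskell, foldNat itself has to be written with recursion; in the restricted language it would be supplied as a primitive.)
data Nat = Zero | Succ Nat

-- the eliminator (catamorphism) for Nat
foldNat :: r -> (r -> r) -> Nat -> r
foldNat z s Zero     = z
foldNat z s (Succ n) = s (foldNat z s n)

-- addition written through the eliminator, with no general recursion
plus :: Nat -> Nat -> Nat
plus m n = foldNat n Succ m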
As a first example, we lose the trivial recursion
f x = f x
-- or even
a = a
since, as said, the language becomes total.
More interestingly, the general fixed point operator is lost.
fix :: (a -> a) -> a
fix f = f (fix f)
A more intriguing question is: what about the total functions we can express in Haskell? We do lose all the non-total functions, but do we lose any of the total ones?
Computability theory tells us that, since the language becomes total (no more nontermination), we lose expressiveness even on the total fragment.
The proof is a standard diagonalization argument. Fix any enumeration of programs in the total fragment so that we can speak of "the i-th program".
Then, let eval i x be the result of running the i-th program on the natural number x as input (for simplicity, assume this is well typed and that the result is a natural number). Note that, since the language is total, a result must always exist. Moreover, eval can be implemented in the unrestricted Haskell language, since we can write an interpreter for Haskell in Haskell (left as an exercise :-P), and that would work just as well for the fragment. Then, we simply take
f n = succ $ eval n n
The above is a total function (a composition of total functions) which can be expressed in Haskell, but not in the fragment. Indeed, otherwise there would be a program to compute it, say the i-th program. In such case we would have
eval i x = f x
for all x. But then,
eval i i = f i = succ $ eval i i
which is impossible -- contradiction. QED.
In type theory, it is indeed the case that you can elaborate all definitions by dependent pattern-matching into ones only using eliminators (a more strongly-typed version of folds, the generalisation of lists' foldr).
See e.g. Eliminating Dependent Pattern Matching (pdf)

function name vs variable in haskell

From haskell documentation:
Identifiers are lexically distinguished into two namespaces (Section 1.4): those that begin with a lower-case letter (variable identifiers) and those that begin with an upper-case letter (constructor identifiers).
So a variable containing a constant value, i.e. a = 4, and the name of the function add in add a b = a + b are both variable identifiers, true? Can we say that a function name is a variable?
From another academic source:
f (pattern1)...(patternN) = expression
..where a pattern can be a constructor or a variable, not a defined
function
This is where I get confused. Since I can write f g x where g is a function, I again conclude that a function name is a variable. True?
What do they mean by "not a defined function" then?
A function name can be a variable identifier, except when it is an operator like +.
This is a statement about lexical matters.
You can't infer from that that a function name is a variable. (Because a variable is not a lexical thing.)
It is the other way around, like in
f . g = \a -> f (g a)
where f and g are variables, i.e. names that are bound to some unknown-in-advance values, but we know that those values must be functions.
A named function is indeed just a global variable whose "value" just happens to be a function. For example,
id x = x
can just as well be written as
id = ( \ x -> x )
Haskell explicitly makes no distinction between these two. Even the type signature says it:
id :: x -> x
So id is just a variable whose value has type x -> x (i.e., a function).
Somebody else said something about operators not being variables; this is untrue.
let (<+>) = \ x y -> (x+y)/(x*y) in 5 <+> 6
You can even do something utterly horrifying like writing a loop where the value of <+> changes each time through the loop. (But why in the hell would anybody ever do that?)
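As a sketch of that horrifying idea (the names loop and <+> here are mine), a local operator binding that differs on each iteration:
-- <+> is rebound on every call, so its meaning changes each time through the loop
loop :: Int -> Double -> Double
loop 0 acc = acc
loop n acc =
  let (<+>) = if even n then (+) else (*)
  in loop (n - 1) (acc <+> fromIntegral n)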

Need help understanding (\x -> ) in Haskell

On ZVON, one of the definitions provided for the takeWhile function is
Input: takeWhile (\x -> 6*x < 100) [1..20]
Output: [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]
Can someone explain what the portion (\x -> 6*x < 100) means?
It's an anonymous function definition, otherwise known as a lambda-expression. (\x -> 6*x < 100) is a function which takes a number, and returns the boolean result of the inequality.
Since functional languages like Haskell frequently take functions as arguments, it is convenient to be able to define simple functions in-line, without needing to assign them a name.
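To see the equivalence (the name small is mine), the same lambda could be written as a named helper:
small :: Int -> Bool
small x = 6 * x < 100

-- takeWhile small [1..20] gives the same result as
-- takeWhile (\x -> 6*x < 100) [1..20]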
Originally, the story goes, Alonzo Church wanted to mark variables in functional expressions with a circumflex, like e.g. (ŷ.x(yz)) but the Princeton printing press just couldn't do that at the time. He then wanted at least to print carets before the vars, like this: (^y.x(yz)), but they couldn't do that either.
The next best option was to use the Greek letter lambda instead, and so they ended up writing (λy.x(yz)) etc., hence the "lambda" in lambda-expression. It was all just a typographical accident.
Today on ASCII terminals we can't even use the letter λ, and so in Haskell we use a backslash instead (and an arrow in place of a dot in original lambda-expressions notation):
(\y -> x (y z))
stands for a function g such that
g y = x (y z)
Source: read it somewhere, don't remember where.
(\x -> 6*x < 100) is a lambda, an anonymous function that takes one argument (here called x) and computes & returns 6*x < 100, i.e., tests whether that number multiplied by 6 is less than 100.
It is a lambda function, that is, a function that you define on the spot, mostly for convenience. You read it as "take x as your input, multiply it by 6 and see if the result is less than 100". There are some related niceties, though. For example, in Haskell both lambda functions and ordinary functions have a lexical environment associated with them and are, properly speaking, closures, so they can perform computations using that environment as input.
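As a sketch of that closure behaviour (makeTest is my own, hypothetical name):
-- limit is captured from the lexical environment, not passed at the call site
makeTest :: Int -> (Int -> Bool)
makeTest limit = \x -> 6 * x < limit

-- takeWhile (makeTest 100) [1..20] behaves like the original example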

what is the meaning of "let x = x in x" and "data Float#" in GHC.Prim in Haskell

I looked at the GHC.Prim module and found that all the data types in GHC.Prim seem to be declared like data Float#, with no constructors (nothing like = A | B), and all the functions in GHC.Prim are defined like gtFloat# = let x = x in x.
My question is whether these definitions make sense and what they mean.
I checked the header of GHC.Prim, which reads:
{-
This is a generated file (generated by genprimopcode).
It is not code to actually be used. Its only purpose is to be
consumed by haddock.
-}
I guess this may be related to my question; could someone please explain it to me?
It's magic :)
These are the "primitive operators and operations". They are hardwired into the compiler, hence there are no data constructors for primitives and all functions are bottom since they are necessarily not expressable in pure haskell.
(Bottom represents a "hole" in a haskell program, an infinite loop or undefined are examples of bottom)
To put it another way
These data declarations/functions exist to provide access to the raw compiler internals. GHC.Prim exists to export these primitives; it doesn't actually implement them or anything (i.e., its code isn't actually used). All of that is done in the compiler.
It's meant for code that needs to be extremely optimized. If you think you might need it, there is some useful reading about the primitives in GHC.
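If you do end up needing them, primops are normally imported via GHC.Exts rather than GHC.Prim directly; a minimal sketch (assuming GHC with the MagicHash extension):
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Float#, gtFloat#, isTrue#)

-- compares two unboxed floats directly with the primop
positive :: Float# -> Bool
positive f = isTrue# (gtFloat# f 0.0#)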
A brief expansion of jozefg's answer ...
Primops are precisely those operations that are supplied by the runtime because they can't be defined within the language (or shouldn't be, for reasons of efficiency, say). The true purpose of GHC.Prim is not to define anything, but merely to export some operations so that Haddock can document their existence.
The construction let x = x in x is used at this point in GHC's codebase because the value undefined has not yet been, um, "defined". (That waits until the Prelude.) But notice that the circular let construction, just like undefined, is both syntactically correct and can have any type. That is, it's an infinite loop with the semantics of ⊥, just as undefined is.
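Both properties are easy to check; a tiny sketch (myBottom is my own name):
-- typechecks at any type, and evaluating it loops forever, just like undefined
myBottom :: a
myBottom = let x = x in x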
... and an aside
Also note that in general the Haskell expression let x = z in y means "change the variable x to the expression z wherever x occurs in the expression y". If you're familiar with the lambda calculus, you should recognize this as the reduction rule for the application of the lambda abstraction \x -> y to the term z. So is the Haskell expression let x = x in x nothing more than some syntax on top of the pure lambda calculus? Let's take a look.
First, we need to account for the recursiveness of Haskell's let expressions. The lambda calculus does not admit recursive definitions, but given a primitive fixed-point operator fix, we can encode recursion explicitly. For example, the Haskell expression let x = x in x has the same meaning as (fix \r x -> r x) z. (I've renamed the x on the right side of the application to z to emphasize that it has no implicit relation to the x inside the lambda.)
Applying the usual definition of a fixed-point operator, fix f = f (fix f), our translation of let x = x in x reduces (or rather doesn't) like this:
(fix \r x -> r x) z ==>
(\s y -> s y) (fix \r x -> r x) z ==>
(\y -> (fix \r x -> r x) y) z ==>
(fix \r x -> r x) z ==> ...
So at this point in the development of the language, we've introduced the semantics of ⊥ from the foundation of the (typed) lambda calculus with a built-in fixed-point operator. Lovely!
We need a primitive fixed-point operation (that is, one that is built into the language) because it's impossible to define a fixed-point combinator in the simply typed lambda calculus and its close cousins. (The definition of fix in Haskell's Data.Function doesn't contradict this; it's defined recursively, but we need a fixed-point operator to implement recursion in the first place.)
If you haven't seen this before, you should read up on fixed-point recursion in the lambda calculus. A text on the lambda calculus is best (there are some free ones online), but some Googling should get you going. The basic idea is that we can convert a recursive definition into a non-recursive one by abstracting over the recursive call, then use a fixed-point combinator to pass our function (lambda abstraction) to itself. The base-case of a well-defined recursive definition corresponds to a fixed point of our function, so the function executes, calling itself over and over again until it hits a fixed point, at which point the function returns its result. Pretty damn neat, huh?
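As a small illustration of that recipe (my own example), here is factorial with the recursive call abstracted out as rec and tied back together with fix:
import Data.Function (fix)

-- the recursive call is abstracted as rec, then closed with fix
factorial :: Integer -> Integer
factorial = fix (\rec n -> if n <= 1 then 1 else n * rec (n - 1))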

Representing undefined result in MIT Scheme

Imagine I have a function with a domain of all integers bigger than 0. I want the result of other inputs to be undefined. For the sake of simplicity, let's say this is the increment function. In Haskell, I could achieve this with something like
f :: Integer -> Integer
f x
  | x > 0     = x + 1
  | otherwise = undefined
Of course, the example is quite gimped but it should be clear what I want to achieve. I'm not sure how to achieve the similar in Scheme.
(define (f x)
  (if (> x 0)
      (+ x 1)
      (?????)))
My idea is to just stick an error in there but is there any way to replicate the Haskell behaviour more closely?
Your question is related to this one, which has answers pointing out that in R5RS (which I guess MIT Scheme partially supports?) an if with one branch returns an "unspecified value". So the equivalent of the Haskell code would be:
(define (f x)
  (if (> x 0)
      (+ x 1)))
You probably already know this: in Haskell, undefined is defined in terms of error, and is primarily used in development as a placeholder to be removed later. The proper way to define your Haskell function would be to give it a type like Integer -> Maybe Integer.
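That suggestion would look something like this sketch:
f :: Integer -> Maybe Integer
f x
  | x > 0     = Just (x + 1)   -- defined for positive inputs
  | otherwise = Nothing        -- explicitly "no result", instead of bottom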
A common undefined value is void defined as (define void (if #f #f)).
Notice that not all Scheme implementations allow an if without the alternative part (as suggested in the other answers) - for instance, Racket will flag this situation as an error.
In Racket you can explicitly write (void) to specify that a procedure returns no useful result (check if this is available in MIT Scheme). From the documentation:
The constant #<void> is returned by most forms and procedures that have a side-effect and no useful result. The constant #<undefined> is used as the initial value for letrec bindings. The #<void> value is always eq? to itself, and the #<undefined> value is also eq? to itself.
(void v ...) → void?
Returns the constant #<void>. Each v argument is ignored.
That is, the example in the question would look like this:
(define (f x)
  (if (> x 0)
      (+ x 1)
      (void)))
Speaking specifically to MIT Scheme, I believe #!unspecific is the constant that is returned from an if without an alternative.
(eq? (if (= 1 2) 3) #!unspecific) => #t
