How to understand functors in the Nix expression language? - haskell

I'm having a bit of trouble parsing this. But as I write it out, I think I may have it.
let add = { __functor = self: x: x + self.x; };
inc = add // { x = 1; };
in inc 1
First, is self a keyword like in many OO languages or is this just a regular name?
Secondly, I'm trying to understand what the multiple : are doing in the definition of __functor. This is probably down to my limited familiarity with Nix expressions, but my guess is that both self and x are arguments to __functor, i.e., it is a curried function.
So really, __functor here is what fmap would be in Haskell, I think, and self (add) is the functor itself, and x: x + self.x is what the function mapped by fmap would be in Haskell.

self is not a keyword, just an ordinary parameter name. You are correct that the right-hand side of __functor is a curried function of two arguments. The Nix interpreter ensures that __functor is passed the appropriate value for self, at the call site inc 1; __functor is handled specially, even though it's not a keyword per se.
Your example is nearly the same as:
let add = a: b: a + b;
inc = add 1;
in inc 1
In a larger program it might be useful to be able to override add.x elsewhere.
As noted in the comments, Nix uses "functor" in the sense of an object (here, set) that can be used syntactically like a function.
Passing self this way is Nix's version of "Objects are closures". The technique is used in many places in Nixpkgs, with and without the __functor feature, to get the usual benefits of objects, including extension (roughly, structural subtyping) and late binding.
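Since the question is tagged haskell, here is a rough Haskell analogue of the Nix example, as a sketch only. The names Adder, amount, functor and call are made up for illustration; amount stands in for the attribute x, and it gets a placeholder value in add because a Haskell record needs all of its fields, whereas the Nix add simply has no x.

```haskell
-- Stand-in for the Nix set; all names here are illustrative, not a Nix API.
data Adder = Adder
  { amount  :: Int                  -- plays the role of the attribute x
  , functor :: Adder -> Int -> Int  -- plays the role of __functor
  }

-- What the evaluator does for `inc 1`: fetch __functor from the set
-- and pass the set itself as the first argument (self).
call :: Adder -> Int -> Int
call self = functor self self

add :: Adder
add = Adder { amount = 0, functor = \self n -> n + amount self }

-- Like `add // { x = 1; }`: same functor, different data.
inc :: Adder
inc = add { amount = 1 }

main :: IO ()
main = print (call inc 1)  -- prints 2
```

Because functor reads amount from whatever self it is handed at the call site, overriding the field later changes the behaviour, which is the late binding mentioned above.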

Related

Understanding recursive let expressions in lambda calculus with Haskell, OCaml and the Nix language

I'm trying to understand how recursive sets operate internally by comparing them with similar features and concepts in other functional programming languages.
I found the topic on the wiki. To follow it, I need to know about the Y combinator and fixed points, which the wiki covers briefly.
Now I am starting to apply this in Haskell.
Haskell
It is easy, but I want to know what happens behind the scenes.
*Main> let x = y; y = 10; in x
10
When you write a = f b in a lazy functional language like Haskell or Nix, the meaning is stronger than just assignment. a and f b will be the same thing. This is usually called a binding.
I'll focus on a Nix example, because you're asking about recursive sets specifically.
A simple attribute set
Let's look at the initialization of an attribute set first. When the Nix interpreter is asked to evaluate this file
{ a = 1 + 1; b = true; }
it parses it and returns a data structure like this
{ a = <thunk 1>; b = <thunk 2>; }
where a thunk is a reference to the relevant syntax tree node and a reference to the "environment", which behaves like a dictionary from identifiers to their values, although implemented more efficiently.
Perhaps the reason we're evaluating this file is because you requested nix-build, which will not just ask for the value of a file, but also traverse the attribute set when it sees that it is one. So nix-build will ask for the value of a, which will be computed from its thunk. When the computation is complete, the memory that held the thunk is assigned the actual value, type = tInt, value.integer = 2.
A recursive attribute set
Nix has a special syntax that combines the functionality of attribute set construction syntax ({ }) and let-binding syntax. This avoids some repetition when you're constructing attribute sets with some shared values.
For example
let b = 1 + 1;
in { b = b; a = b + 5; }
can be expressed as
rec { b = 1 + 1; a = b + 5; }
Evaluation works in a similar manner.
At first the evaluator returns a representation of the attribute set with all thunks, but this time the thunks reference a new environment that includes all the attributes, on top of the existing lexical scope.
Note that all these representations can be constructed while performing a minimal amount of work.
nix-build traverses attrsets in alphabetic order, so it will evaluate a first. It's a thunk that references the b + 5 syntax node and an environment with b in it. Evaluating this requires evaluating the b syntax node (an ExprVar), which references the environment, where we find the 1 + 1 thunk, which is changed to a tInt of 2 as before.
As you can see, this process of creating thunks but only evaluating them when needed is indeed lazy and allows us to have various language constructs with their own scoping rules.
Haskell implementations usually follow a similar pattern, but may compile the code rather than interpret a syntax tree, and resolve all variable references to constant memory offsets completely. Nix tries to do this to some degree, but it must be able to fall back on strings because of the inadvisable with keyword that makes the scope dynamic.
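To connect this with the fixed-point idea from the question, the rec { b = 1 + 1; a = b + 5; } set can be modelled in Haskell as the fixed point of a function from the set to itself. This is only a sketch: the Attrs record is a hypothetical stand-in for the attribute set, and it says nothing about how the Nix evaluator is actually implemented.

```haskell
import Data.Function (fix)

-- Hypothetical stand-in for the attribute set { a; b }.
data Attrs = Attrs { a :: Int, b :: Int }

-- `self` is the finished set. Fields are lazy, so mentioning `b self`
-- only builds a thunk; nothing is computed until a field is demanded.
recSet :: Attrs
recSet = fix (\self -> Attrs { b = 1 + 1, a = b self + 5 })

main :: IO ()
main = print (a recSet, b recSet)  -- prints (7,2)
```

Demanding a forces the b thunk first, much like the traversal described above; a plain recursive let would work equally well, since fix f is itself defined as let x = f x in x.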
I worked out several things by myself.
In an eagerly evaluated language, I must declare a variable before I use it, so the order of declarations is simple.
int x = 10;
int y = x;
Just for Nix language
The wiki has no concept-by-concept comparison with Haskell, although let ... in is compared with Haskell.
lexical scope
all variables are lexically scoped.
mutual recursion
https://en.wikipedia.org/wiki/Let_expression#Mutually_recursive_let_expression

Haskell: let statement, copy data type to itself with/without modification not working

I want to update a record with a change in one field, so I did something like:
let rec = rec{field = 1}
But I've noticed that I can't print rec anymore; the compiler seems to get into an infinite loop when I try. So I have tried doing:
let a = 1 -- prints OK
let a = a -- now i can't print a (also stuck in a loop)
So I can't do let a = a with any type, but I don't understand why, or how I should resolve this issue.
BTW: while doing:
let b = a {...record changes..}
let a = b
works, but seems redundant.
The issue you're running into is that all let and where bindings in Haskell are recursive by default. So when you write
let rec = rec { ... }
it tries to define a circular data type that will loop forever when you try to evaluate it (just like let a = a).
There's no real way around this—it's a tradeoff in the language. It makes recursive functions (and even plain values) easier to write with less noise, but also means you can't easily redefine a a bunch of times in terms of itself.
The only thing you can really do is give your values different names—rec and rec' would be a common way to do this.
To be fair to Haskell, recursive functions and even recursive values come up quite often. Code like
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
can be really nice once you get the hang of it, and not having to explicitly mark this definition as recursive (like you'd have to do in, say, OCaml) is a definite upside.
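A small sketch of both points, assuming a made-up Config record (the type and its fields are not from the question): the update gets a fresh primed name so it can refer to the old value, while fibs relies on the binding being recursive.

```haskell
-- Hypothetical record type, for illustration only.
data Config = Config { field :: Int, label :: String } deriving Show

-- Works precisely because bindings are recursive by default.
fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

main :: IO ()
main = do
  let cfg  = Config { field = 0, label = "demo" }
      cfg' = cfg { field = 1 }  -- new name, so `cfg` means the old value
  print cfg'            -- prints Config {field = 1, label = "demo"}
  print (take 10 fibs)  -- prints [0,1,1,2,3,5,8,13,21,34]
```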
You never need to update a variable: you can always make another variable with the new value. In your case, let rec' = rec{field = 1}.
Maybe you worry about performance and the value being unnecessarily copied. That's the compiler's job, not yours: a record update shares the unchanged fields with the original instead of deep-copying them, and the compiler is free to optimize further, so declaring two variables in your code does not mean the whole value exists twice in memory.
Now there are times when the code is so complex that the compiler fails to optimize. You can tell by inspecting the intermediate Core language, or even the final assembly. Profile first to know which functions are slow: if it's just an extra Int or Double, you don't care.
If you do find a function that the compiler failed to optimize and that takes too much time, then you can rewrite it to handle the memory yourself. You will then use things like unboxed vectors, IO and ST monad, or even language extensions to access the native machine-level types.
First of all, Haskell does not allow "copying" data to itself, which in the usual sense would mean that the data is mutable. In Haskell you don't have mutable "variables", so you cannot modify the value a given variable represents.
All you have done is define a new variable with the same name as its previous version. But to do this properly, you have to refer to the old variable, not the newly defined one. So your original definition
let rec = rec { field=1 }
is a recursive definition: the name rec refers to itself. But what you intended was to refer to the rec defined earlier.
So this is a name conflict.
Haskell has some mechanisms to work around this. One is your "temporary renaming".
For the original example this looks like
let rec' = rec
let rec = rec' { field=1 }
This looks like your given solution. But remember that this only works in an interactive environment such as GHCi. If you try to use this in a function, you may have to write
let rec' = rec in let rec = rec' { field=1 } in ...
Here is another workaround, which might be useful when rec belongs to another module (say "MyModule"):
let rec = MyModule.rec { field=1 }

Pass by Reference in Haskell?

Coming from a C# background, I would say that the ref keyword is very useful in certain situations where changes to a method parameter are desired to directly influence the passed value for value types or for setting a parameter to null.
Also, the out keyword can come in handy when returning a multitude of various logically unconnected values.
My question is: is it possible to pass a parameter to a function by reference in Haskell? If not, what is the direct alternative (if any)?
There is no difference between "pass-by-value" and "pass-by-reference" in languages like Haskell and ML, because it's not possible to assign to a variable in these languages. It's not possible to have "changes to a method parameter" in the first place, let alone have them influence any passed variable.
It depends on context. Without any context, no, you can't (at least not in the way you mean). With context, you may very well be able to do this if you want. In particular, if you're working in IO or ST, you can use IORef or STRef respectively, as well as mutable arrays, vectors, hash tables, weak hash tables (IO only, I believe), etc. A function can take one or more of these and produce an action that (when executed) will modify the contents of those references.
Another sort of context, StateT, gives the illusion of a mutable "state" value implemented purely. You can use a compound state and pass around lenses into it, simulating references for certain purposes.
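As a minimal sketch of the IORef route, using only Data.IORef from base (the function name tripleRef is made up for this example): the function receives the reference and returns an action that modifies its contents when run.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Takes a reference and returns an action that triples its contents.
tripleRef :: IORef Int -> IO ()
tripleRef ref = modifyIORef' ref (* 3)

-- Creates a cell, lets tripleRef mutate it, and reads the result back.
demo :: IO Int
demo = do
  ref <- newIORef 4
  tripleRef ref
  readIORef ref

main :: IO ()
main = demo >>= print  -- prints 12
```

The caller observes the change only because it holds the same reference and reads it again inside IO; the Int values themselves stay immutable.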
My question is: is it possible to pass a parameter to a function by reference in Haskell? If not, what is the direct alternative (if any)?
No, values in Haskell are immutable (well, the do notation can create some illusion of mutability, but it all happens inside a function and is an entirely different topic). If you want to change the value, you will have to return the changed value and let the caller deal with it. For instance, see the random number generating function next that returns the value and the updated RNG.
Also, the out keyword can come in handy when returning a multitude of various logically unconnected values.
Consequently, you can't have out either. If you want to return several entirely disconnected values (at which point you should probably think about why disconnected values are being returned from a single function), return a tuple.
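Both suggestions can be sketched in a few lines. The helpers sumAndCount and freshLabel are hypothetical, written only for this example; freshLabel imitates the shape of next by returning a result together with the updated state instead of changing anything in place.

```haskell
-- Hypothetical helper: two results come back as one tuple (no `out`).
sumAndCount :: [Int] -> (Int, Int)
sumAndCount xs = (sum xs, length xs)

-- Hypothetical counter in the style of `next`: the "changed" state is
-- returned alongside the result and the caller threads it onward.
freshLabel :: Int -> (String, Int)
freshLabel n = ("item" ++ show n, n + 1)

main :: IO ()
main = do
  let (total, count) = sumAndCount [3, 4, 5]
      (l1, s1)       = freshLabel 0
      (l2, _)        = freshLabel s1
  print (total, count)  -- prints (12,3)
  print [l1, l2]        -- prints ["item0","item1"]
```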
No, it's not possible, because Haskell variables are immutable; the creators of Haskell must have reasoned that there's no point in passing a reference that cannot be changed.
Consider a Haskell variable:
let x = 37
In order to change this, we need to make a temporary variable, and then set the first variable to the temporary variable (with modifications).
let tripleX = x * 3
let x = tripleX
If Haskell had pass by reference, could we do this?
The answer is no.
Suppose we tried:
tripleVar :: Int -> IO ()
tripleVar var = do
  let times_3 = var * 3
  let var = times_3
The problem with this code is the last line: although we can imagine the variable being passed by reference, the new variable isn't.
In other words, we're introducing a new local variable with the same name.
Take a look again at the last line:
let var = times_3
Haskell doesn't know that we want to "change" a global variable; since we can't reassign it, we are creating a new variable with the same name on the local scope, thus not changing the reference. :-(
tripleVar :: Int -> IO ()
tripleVar var = do
  let tripleVar = var
  let var = tripleVar * 3
  return ()
main = do
  let x = 4
  tripleVar x
  print x -- 4 :(
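For contrast, a sketch of the usual pure alternative: the function returns the new value and the caller binds it to a new name. The name triple is chosen here for illustration.

```haskell
-- Returns the tripled value instead of trying to change its argument.
triple :: Int -> Int
triple var = var * 3

main :: IO ()
main = do
  let x  = 4
      x' = triple x  -- a new binding; x itself is untouched
  print x   -- prints 4
  print x'  -- prints 12
```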

Why must equations of a function in Haskell have the same number of arguments? [duplicate]

I noticed today that such a definition
safeDivide x 0 = x
safeDivide = (/)
is not possible. I am just curious what the (good) reason behind this is. There must be a very good one (it's Haskell after all :)).
Note: I am not looking suggestions for alternative implementations to the code above, it's a simple example to demonstrate my point.
I think it's mainly for consistency so that all clauses can be read in the same manner, so to speak; i.e. every RHS is at the same position in the type of the function. I think it would mask quite a few silly errors if you allowed this, too.
There's also a slight semantic quirk: say the compiler padded out such clauses to have the same number of patterns as the other clauses; i.e. your example would become
safeDivide x 0 = x
safeDivide x y = (/) x y
Now consider if the second line had instead been safeDivide = undefined; in the absence of the previous clause, safeDivide would be ⊥, but thanks to the eta-expansion performed here, it's \x y -> if y == 0 then x else ⊥ — so safeDivide = undefined does not actually define safeDivide to be ⊥! This seems confusing enough to justify banning such clauses, IMO.
The meaning of a function with multiple clauses is defined by the Haskell standard (section 4.4.3.1) via translation to a lambda and case expression:
fn pat1a pat1b = r1
fn pat2a pat2b = r2
becomes
fn = \a b -> case (a,b) of
  (pat1a, pat1b) -> r1
  (pat2a, pat2b) -> r2
This is so that the function definition/case statement way of doing things is nice and consistent, and the meaning of each isn't specified redundantly and confusingly.
This translation only really makes sense when each clause has the same number of arguments. Of course, there could be extra rules to fix that, but they'd complicate the translation for little gain, since you probably wouldn't want to define things like that anyway, for your readers' sake.
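Applied to the padded-out safeDivide from the earlier answer, the translation looks roughly like this. It is a hand-written sketch of the scheme, not compiler output, and safeDivide' is just a name chosen here for the translated form.

```haskell
-- Clause form, with both equations taking two arguments.
safeDivide :: Double -> Double -> Double
safeDivide x 0 = x
safeDivide x y = x / y

-- The same function written as a lambda plus a case on a tuple.
safeDivide' :: Double -> Double -> Double
safeDivide' = \a b -> case (a, b) of
  (x, 0) -> x
  (x, y) -> x / y

main :: IO ()
main = do
  print (safeDivide 6 0, safeDivide' 6 0)  -- prints (6.0,6.0)
  print (safeDivide 6 3, safeDivide' 6 3)  -- prints (2.0,2.0)
```

With clauses of different lengths there would be no single tuple shape to match against, which is why the translation assumes equal arity.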
Haskell does it this way because its predecessors (like LML and Miranda) did. There is no technical reason it has to be like this; equations with fewer arguments could be eta expanded. But having a different number of arguments for different equations is probably a typo rather than intentional, so in this case we ban something sensible but rare to get better error reporting in the common case.

Are there any programming languages where variables are really functions?

For example, I would write:
x = 2
y = x + 4
print(y)
x = 5
print(y)
And it would output:
6 (=2+4)
9 (=5+4)
Also, are there any cases where this could actually be useful?
Clarification: Yes, lambdas etc. solve this problem (they were how I arrived at this idea); I was wondering if there were specific languages where this was the default: no function or lambda keywords required or needed.
Haskell will meet you halfway, because essentially everything is a function, but variables are only bound once (meaning you cannot reassign x in the same scope).
It's easy to consider y = x + 4 a variable assignment, but when you look at y = map (+4) [1..] (which means add 4 to every number in the infinite list from 1 upwards), what is y now? Is it an infinite list, or is it a function that returns an infinite list? (Hint: it's the second one.) In this case, treating variables as functions can be extremely beneficial, if not an absolute necessity, when taking advantage of laziness.
Really, in Haskell, your definition of y is a function accepting no arguments and returning x+4, where x is also a function that takes no arguments, but returns the value 2.
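A short sketch of the two readings, with names picked for this example: ys is the lazy infinite list from the paragraph above, and y makes the dependence on x explicit as a parameter, since x cannot be reassigned.

```haskell
-- The infinite list is only computed as far as it is demanded.
ys :: [Integer]
ys = map (+ 4) [1 ..]

-- With the dependence on x made explicit, "changing x" is just
-- applying y to a different argument.
y :: Integer -> Integer
y x = x + 4

main :: IO ()
main = do
  print (take 5 ys)  -- prints [5,6,7,8,9]
  print (y 2, y 5)   -- prints (6,9)
```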
In any language with first-class functions, it's trivial to assign anonymous functions to variables, but for most languages you'll have to add the parentheses to indicate a function call.
Example Lua code:
x = function() return 2 end
y = function() return x() + 4 end
print(y())
x = function() return 5 end
print(y())
$ lua x.lua
6
9
Or the same thing in Python (sticking with first-class functions, but we could have just used plain integers for x):
x = lambda: 2
y = lambda: x() + 4
print(y())
x = lambda: 5
print(y())
$ python x.py
6
9
You can use Func expressions in C#:
Func<int, int> y = (x) => x + 5;
Console.WriteLine(y(5)); // 10
Console.WriteLine(y(3)); // 8
... or ...
int x = 0;
Func<int> y = () => x + 5;
x = 5;
Console.WriteLine(y()); // 10
x = 3;
Console.WriteLine(y()); // 8
... If you really want to program in a functional style, the first option would probably be best:
It looks more like the stuff you saw in math class.
You don't have to worry about external state.
Check out various functional languages like F#, Haskell, and Scala. Scala treats functions as objects that have an apply() method, and you can store them in variables and pass them around like you can any other kind of object. I don't know that you can print out the definition of a Scala function as code though.
Update: I seem to recall that at least some Lisps allow you to pretty-print a function as code (eg, Scheme's pretty-print function).
This is the way spreadsheets work.
It is also related to call by name semantics for evaluating function arguments. Algol 60 had that, but it didn't catch on, too complicated to implement.
The programming language Lucid does this, although it calls x and y "streams" rather than functions.
The program would be written:
y
where
y = x + 4
end
And then you'd input:
x(0): 2
y = 6
x(1): 5
y = 9
Of course, Lucid (like most interesting programming languages) is fairly obscure, so I'm not surprised that nobody else found it. (or looked for it)
Try checking out F# here and on Wikipedia about Functional programming languages.
I myself have not yet worked on these types of languages since I've been concentrated on OOP, but will be delving soon once F# is out.
Hope this helps!
The closest I've seen of these have been part of Technical Analysis systems in charting components. (Tradestation, metastock, etc), but mainly they focus on returning multiple sets of metadata (eg buy/sell signals) which can be then fed into other functions that accept either meta data, or financial data, or plotted directly.
My 2c:
I'd say a language as you suggest would be highly confusing, to say the least. Functions are generally r-values for good reason. This code (JavaScript) shows how enforcing functions as r-values increases readability (and therefore maintainability) n-fold:
var x = 2;
var y = function() { return x+2; };
alert(y());
x = 5;
alert(y());
Self makes no distinction between fields and methods, both are slots and can be accessed in exactly the same way. A slot can contain a value or a function (so those two are still separate entities), but the distinction doesn't matter to the user of the slot.
In Scala, you have lazy values and call-by-name arguments in functions.
def foo(x: => Int): Unit = {
  println(x)
  println(x) // x is evaluated again!
}
In some way, this can have the effect you looked for.
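In Haskell, a comparable effect can be imitated by making y an IO action rather than a value, so that it is re-run on every use. This is a sketch of the idea using Data.IORef from base, not a language feature that works without the explicit reference; the name demo is chosen here.

```haskell
import Data.IORef (newIORef, readIORef, writeIORef)

-- Reads y before and after x is overwritten.
demo :: IO (Int, Int)
demo = do
  x <- newIORef 2
  let y = fmap (+ 4) (readIORef x)  -- an action, re-run on each use
  before <- y
  writeIORef x 5
  after <- y
  return (before, after)

main :: IO ()
main = demo >>= print  -- prints (6,9)
```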
I believe the mathematically oriented languages like Octave, R and Maxima do that. I could be wrong, but no one else has mentioned them, so I thought I would.
