OCaml style: parameterizing programs

I have an OCaml program whose modules contain many functions that depend on a parameter, namely "dimension". This parameter is determined once at the beginning of a run and stays constant until termination.
My question is: how can I shorten the code so that my functions do not all require a "dimension" parameter? The modules call each other's functions, so there is no strict hierarchy between them (or at least I can't see one).
What is the idiomatic OCaml way to address this problem? Do I have to use functors, or are there other means?

You probably cannot evaluate the parameter without breaking dependencies between modules, otherwise you would just define it in one of the modules where it is accessible from other modules. A solution that comes to my mind is a bit "daring". Define the parameter as a lazy value, and suspend in it a dereference of a "global cell":
let hidden_box = ref None
let initialize_param p =
  match !hidden_box with None -> hidden_box := Some p | _ -> assert false
let param =
  lazy (match !hidden_box with None -> assert false | Some p -> p)
The downside is that Lazy.force param is a bit verbose.
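One way to reduce that verbosity is a small accessor function. The following is only a sketch under assumptions: the parameter is taken to be an int, the names dimension and zero_vector are hypothetical, and the assert false branches are replaced by named exceptions for readability.

```ocaml
(* Global cell holding the run-time parameter; assumed to be an int here. *)
let hidden_box : int option ref = ref None

(* Must be called exactly once, before param is forced. *)
let initialize_param p =
  match !hidden_box with
  | None -> hidden_box := Some p
  | Some _ -> invalid_arg "initialize_param: already initialized"

(* The dereference is suspended until the first Lazy.force. *)
let param =
  lazy (match !hidden_box with
        | None -> failwith "param: not initialized"
        | Some p -> p)

(* Hypothetical accessor that hides Lazy.force at call sites. *)
let dimension () = Lazy.force param

(* Hypothetical client function that no longer takes a dimension argument. *)
let zero_vector () = Array.make (dimension ()) 0.0

let () =
  initialize_param 3;
  assert (Array.length (zero_vector ()) = 3)
```

Forcing param before initialize_param has run raises an exception, so the initialization has to happen early in the program.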
ETA: Note that "there is no strict hierarchy between the modules" is either:
(1) false, or
(2) you have a recursive module definition, or
(3) you are tying the recursive knot somewhere.
In case (2) you can just put everything into a functor. In case (3) you are passing parameters already.
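As a rough illustration of the functor route, here is a minimal sketch. The signature PARAMS, the module Vector and the function run are hypothetical names, not taken from the question. The functor is instantiated with a local module once the dimension is known at run time.

```ocaml
(* Hypothetical signature carrying the run-time constant. *)
module type PARAMS = sig
  val dimension : int
end

(* Hypothetical module whose functions all depend on the dimension
   without taking it as an argument. *)
module Vector (P : PARAMS) = struct
  let zero () = Array.make P.dimension 0.0

  let dot a b =
    let acc = ref 0.0 in
    for i = 0 to P.dimension - 1 do
      acc := !acc +. a.(i) *. b.(i)
    done;
    !acc
end

(* The dimension is only known at run time, so the functor is applied
   inside a function, using a local module. *)
let run dimension =
  let module P = struct let dimension = dimension end in
  let module V = Vector (P) in
  let v = Array.make dimension 1.0 in
  V.dot v v +. V.dot (V.zero ()) v

let () = assert (run 3 = 3.0)
```

If several modules depend on each other, they could all be placed inside one such functor body, which is what "put everything into a functor" amounts to.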

Related

Understanding recursive let expressions in lambda calculus with Haskell, OCaml and the Nix language

I'm trying to understand how recursive sets operate internally by comparing them with similar features and concepts in other functional programming languages.
I found the topic on Wikipedia. To follow it, I need to know about the Y combinator and fixed points, which Wikipedia covers briefly.
Now I'm starting to apply this in Haskell.
Haskell
It is easy, but I want to know what happens behind the scenes.
*Main> let x = y; y = 10; in x
10
When you write a = f b in a lazy functional language like Haskell or Nix, the meaning is stronger than just assignment. a and f b will be the same thing. This is usually called a binding.
I'll focus on a Nix example, because you're asking about recursive sets specifically.
A simple attribute set
Let's look at the initialization of an attribute set first. When the Nix interpreter is asked to evaluate this file
{ a = 1 + 1; b = true; }
it parses it and returns a data structure like this
{ a = <thunk 1>; b = <thunk 2>; }
where a thunk is a reference to the relevant syntax tree node and a reference to the "environment", which behaves like a dictionary from identifiers to their values, although implemented more efficiently.
Perhaps the reason we're evaluating this file is because you requested nix-build, which will not just ask for the value of a file, but also traverse the attribute set when it sees that it is one. So nix-build will ask for the value of a, which will be computed from its thunk. When the computation is complete, the memory that held the thunk is assigned the actual value, type = tInt, value.integer = 2.
A recursive attribute set
Nix has a special syntax that combines the functionality of attribute set construction syntax ({ }) and let-binding syntax. This avoids some repetition when you're constructing attribute sets with some shared values.
For example
let b = 1 + 1;
in { b = b; a = b + 5; }
can be expressed as
rec { b = 1 + 1; a = b + 5; }
Evaluation works in a similar manner.
At first the evaluator returns a representation of the attribute set with all thunks, but this time the thunks reference a new environment that includes all the attributes, on top of the existing lexical scope.
Note that all these representations can be constructed while performing a minimal amount of work.
nix-build traverses attrsets in alphabetic order, so it will evaluate a first. It's a thunk that references the + syntax node of b + 5 and an environment with b in it. Evaluating this requires evaluating the b syntax node (an ExprVar), which references the environment, where we find the 1 + 1 thunk, which is changed to a tInt of 2 as before.
As you can see, this process of creating thunks but only evaluating them when needed is indeed lazy and allows us to have various language constructs with their own scoping rules.
Haskell implementations usually follow a similar pattern, but may compile the code rather than interpret a syntax tree, and resolve all variable references to constant memory offsets completely. Nix tries to do this to some degree, but it must be able to fall back on strings because of the inadvisable with keyword that makes the scope dynamic.
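For comparison with OCaml, which the question title mentions, the thunk behaviour described above can be imitated with explicit lazy values. This is only an analogy, not how Nix is implemented. The counter evaluations is a hypothetical helper added to make the caching visible.

```ocaml
(* Counts how many times the thunk for b is actually run. *)
let evaluations = ref 0

(* Analogue of rec { b = 1 + 1; a = b + 5; }. The order of the
   bindings does not matter because both are suspended. *)
let rec a = lazy (Lazy.force b + 5)
and b = lazy (incr evaluations; 1 + 1)

let () =
  assert (!evaluations = 0);   (* nothing has been evaluated yet *)
  assert (Lazy.force a = 7);   (* forcing a forces b first *)
  assert (Lazy.force b = 2);   (* b is already a value, no recomputation *)
  assert (!evaluations = 1)
```

Unlike Nix or Haskell, OCaml is eager by default, so the suspension has to be written out with lazy and Lazy.force.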
I worked out several things by myself.
In an eagerly evaluated language, I must declare a name before I use it, so the order of declarations is simple.
int x = 10;
int y = x;
Just for Nix language
In the wiki, there isn't any conceptual comparison with Haskell, although let ... in is compared with Haskell.
lexical scope
all variables are lexically scoped.
mutual recursion
https://en.wikipedia.org/wiki/Let_expression#Mutually_recursive_let_expression

How to understand functors in the Nix expression language?

I'm having a bit of trouble parsing this. But as I write it out, I think I may have it.
let add = { __functor = self: x: x + self.x; };
    inc = add // { x = 1; };
in inc 1
First, is self a keyword like in many OO languages or is this just a regular name?
Secondly, I'm trying to understand what the multiple : are doing in the definition of __functor, but this is probably a failing of my basic familiarity with Nix expressions, but I guess what is happening is that both self and x are arguments to __functor, i.e., it looks like it is probably a curried function.
So really, __functor here is what fmap would be in Haskell, I think, and self (add) is the functor itself, and x: x + self.x is what the function mapped by fmap would be in Haskell.
self is not a keyword, just an ordinary parameter name. You are correct that the right-hand side of __functor is a curried function of two arguments. The Nix interpreter ensures that __functor is passed the appropriate value for self, at the call site inc 1; __functor is handled specially, even though it's not a keyword per se.
Your example is nearly the same as:
let add = a: b: a + b;
    inc = add 1;
in inc 1
In a larger program it might be useful to be able to override add.x elsewhere.
As noted in the comments, Nix uses "functor" in the sense of an object (here, set) that can be used syntactically like a function.
Passing self this way is Nix's version of "Objects are closures". The technique is used in many places in Nixpkgs, with & without the __functor feature, to get the usual benefits of Objects, including extension (~ structural subtyping) & late binding.
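The self-passing idea can be sketched outside Nix as well. The OCaml below is an analogy under assumptions: the record type obj, the field functor_ and the helper call are hypothetical names, and add gets a placeholder x = 0 because an OCaml record needs every field, whereas the Nix add has no x at all.

```ocaml
(* Hypothetical record standing in for an attribute set with __functor. *)
type obj = {
  x : int;                       (* the overridable attribute *)
  functor_ : obj -> int -> int;  (* like __functor: takes self, then the argument *)
}

(* Placeholder x = 0 is an assumption; see the note above. *)
let add = { x = 0; functor_ = (fun self y -> y + self.x) }

(* Analogue of add // { x = 1; }. *)
let inc = { add with x = 1 }

(* What the interpreter does at a call site such as inc 1:
   it passes the set itself as self. *)
let call o arg = o.functor_ o arg

let () =
  assert (call inc 1 = 2);
  (* Late binding: overriding x later changes the result of the same function. *)
  assert (call { inc with x = 10 } 1 = 11)
```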

name capture in Haskell `let` expression

I was writing a function something similar to this:
f x = let
        x = ...
      in
        e
Due to scoping rules in Haskell any use of x in e will resolve to the definition of x in the let construct.
Why is such a thing allowed in Haskell?
Shouldn't the compiler reject such a program, telling us that we cannot bind a value with the same name as an argument of the function?
(This example may be simplistic, but in real world context where variables have semantic meaning associated with them it is easy to make such mistake)
You can enable warnings for this type of name shadowing with the compiler flag
-fwarn-name-shadowing (spelled -Wname-shadowing in GHC 8.0 and later)
This option causes a warning to be emitted whenever an inner-scope value has the same name as an outer-scope value, i.e. the inner value shadows the outer one. This can catch typographical errors that turn into hard-to-find bugs, e.g., in the inadvertent capture of what would be a recursive call in f = ... let f = id in ... f ....
However, it is more common to compile with -Wall, which includes a lot of other warnings that will help you avoid bad practices.

Nim: How to prove not nil?

To me, one of the most interesting features of Nim is the not nil annotation, because it basically allows you to completely rule out all sorts of NPE / access violation bugs statically, with the help of the compiler. However, I have trouble using it in practice. Let's consider one of the most basic use cases:
type
  SafeSeq[T] = seq[T] not nil
An immediate pitfall here is that even instantiating such a SafeSeq is not that easy. The attempt
let s: SafeSeq[int] = newSeq[int](100)
fails with error cannot prove 'newSeq(100)' is not nil, which is surprising because one might expect that a newSeq simply is not nil. A workaround seems to be to use a helper like this:
proc newSafeSeq*[T](size: int): SafeSeq[T] =
  # I guess only @[] expressions are provably not nil
  result = @[]
  result.setlen(size)
The next problem arises when trying to do something with a SafeSeq. For instance, one might expect that when you map over a SafeSeq the result should be not nil again. However, something like this fails as well:
let a: SafeSeq[int] = @[1,2,3]
let b: SafeSeq[string] = a.mapIt(string, $it)
The general problem seems to be that as soon as a return type becomes an ordinary seq the compiler seems to forget about the not nil property and can no longer prove it.
My idea now was to introduce a small (arguably ugly) helper method that allows me to actually prove not nil:
proc proveNotNil*[T](a: seq[T]): SafeSeq[T] =
  if a != nil:
    result = a # surprise, still error "cannot prove 'a' is not nil"
  else:
    raise newException(ValueError, "can't convert")
# which should allow this:
let a: SafeSeq[int] = @[1,2,3]
let b: SafeSeq[string] = a.mapIt(string, $it).proveNotNil
However, the compiler also fails to prove not nil here. My questions are:
How should I help the compiler infer not nil in such cases?
What is the long term goal with this feature, i.e, are there plans to make inferring not nil more powerful? The problem with a manual proveNotNil is that it is potentially unsafe and against the idea that the compiler takes care of proving it. However, if the proveNotNil would only be required in very rare cases, it wouldn't hurt much.
Note: I know that seq attempts to be nil agnostic, i.e., everything works fine even in the nil case. However, this only applies for within Nim. When interfacing C code, the nil-hiding-principle becomes a dangerous source for bugs, because a nil sequence is only harmless on the Nim side...
Use isNil magic to check for nil:
type SafeSeq[T] = seq[T] not nil
proc proveNotNil[T](s: seq[T]): SafeSeq[T] =
  if s.isNil: # Here is the magic!
    assert(false)
  else:
    result = s
let s = proveNotNil newSeq[int]()

Erlang: Matching strings in guard statement

I started working with Erlang quite recently and ran into the problem above: how do you go about comparing two strings in a guard statement? I tried the string:equal(x,y) function but couldn't get it to work inside a guard.
You could use pattern matching like this:
are_the_same(A, A) ->
    true;
are_the_same(_, _) ->
    false.
In the first clause both arguments are named A, which results in them being pattern matched against each other. To be exact, the first argument is bound to the variable A with the = operator, and then the second argument is matched against A with the = operator; since A is already bound, the match is treated as a comparison. You can read more about this in the docs.
And of course you could write the first clause with a guard, like:
are_the_same(A, B) when A =:= B ->
You don't need the function string:equal/2 to compare strings; you can use the operators == or =:=, which are allowed in guard tests. For example:
foo(A, B) when A =:= B ->
    equal;
foo(_, _) ->
    not_equal.
Though in most cases you'd want to use pattern matching instead, as described in the other answer.
NB: As of Erlang/OTP 20.0, string:equal(A, B) is no longer equivalent to A =:= B. string:equal/2 now operates on grapheme clusters, and there are also string:equal/3 and string:equal/4 that can optionally ignore case when comparing and do Unicode normalisation. So you need to understand what you mean by "equal" before settling on a comparison method.
The functions you can use in guards are limited because of the nature of Erlang's scheduling; specifically, Erlang aims to avoid side-effects in guard statements (e.g., calling to another process) because guards are evaluated by the scheduler and do not count against reductions. This is why string:equal does not work.
That being said, you can use Erlang's pattern matching to match strings. Please bear in mind the use of strings as lists, binaries, or iolists (nested lists/binaries) in Erlang, and make sure you're testing/passing strings of the right type (iolists are particularly hard to pattern match and are usually best handled with the re module, or converting them to binaries via iolist_to_binary).
For example, say we want a function that tests to see if a string begins with "foo":
bar("foo" ++ _Rest) -> true;
bar(<<"foo", _Rest/binary>>) -> true;
bar(_Else) -> false.
If you just want to test for a particular string, it's even easier:
bar("foo") -> true;
bar(<<"foo">>) -> true;
bar(_Else) -> false.

Resources