Nim: How to prove not nil? - nim-lang

To me, one of the most interesting features of Nim is the not nil annotation, because it lets you statically rule out all sorts of NPE / access violation bugs with the help of the compiler. However, I have trouble using it in practice. Let's consider one of the most basic use cases:
type
  SafeSeq[T] = seq[T] not nil
An immediate pitfall here is that even instantiating such a SafeSeq is not that easy. The attempt
let s: SafeSeq[int] = newSeq[int](100)
fails with the error cannot prove 'newSeq(100)' is not nil, which is surprising because one might expect that the result of newSeq simply is not nil. A workaround seems to be a helper like this:
proc newSafeSeq*[T](size: int): SafeSeq[T] =
  # I guess only @[] expressions are provably not nil
  result = @[]
  result.setLen(size)
The next problem arises when trying to do something with a SafeSeq. For instance, one might expect that when you map over a SafeSeq the result should be not nil again. However, something like this fails as well:
let a: SafeSeq[int] = @[1,2,3]
let b: SafeSeq[string] = a.mapIt(string, $it)
The general problem seems to be that as soon as a return type becomes an ordinary seq, the compiler forgets the not nil property and can no longer prove it.
My idea now was to introduce a small (arguably ugly) helper method that allows me to actually prove not nil:
proc proveNotNil*[T](a: seq[T]): SafeSeq[T] =
  if a != nil:
    result = a # surprise, still error "cannot prove 'a' is not nil"
  else:
    raise newException(ValueError, "can't convert")
# which should allow this:
let a: SafeSeq[int] = @[1,2,3]
let b: SafeSeq[string] = a.mapIt(string, $it).proveNotNil
However, the compiler also fails to prove not nil here. My questions are:
How should I help the compiler infer not nil in such cases?
What is the long-term goal for this feature, i.e., are there plans to make not nil inference more powerful? The problem with a manual proveNotNil is that it is potentially unsafe and goes against the idea that the compiler takes care of the proof. However, if proveNotNil were only required in very rare cases, it wouldn't hurt much.
Note: I know that seq attempts to be nil agnostic, i.e., everything works fine even in the nil case. However, this only applies within Nim. When interfacing with C code, the nil-hiding principle becomes a dangerous source of bugs, because a nil sequence is only harmless on the Nim side...

Use isNil magic to check for nil:
type SafeSeq[T] = seq[T] not nil
proc proveNotNil[T](s: seq[T]): SafeSeq[T] =
  if s.isNil: # Here is the magic!
    assert(false)
  else:
    result = s
let s = proveNotNil newSeq[int]()

Related

Infinite loop while updating record [duplicate]

I want to update a record, changing one field using record syntax, so I did something like:
let rec = rec{field = 1}
But I've noticed that I can't print rec anymore: the program seems to get into an infinite loop when I try. So I tried:
let a = 1 -- prints OK
let a = a -- now i can't print a (also stuck in a loop)
So I can't do let a = a with any type, but I don't understand why, or how I should resolve this issue.
BTW, doing:
let b = a {...record changes..}
let a = b
works, but seems redundant.
The issue you're running into is that all let and where bindings in Haskell are recursive by default. So when you write
let rec = rec { ... }
it tries to define a circular data type that will loop forever when you try to evaluate it (just like let a = a).
There's no real way around this; it's a tradeoff in the language. It makes recursive functions (and even plain values) easier to write with less noise, but it also means you can't easily redefine a name such as a several times in terms of itself.
The only thing you can really do is give your values different names—rec and rec' would be a common way to do this.
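The renaming advice can be sketched as a small program. The Config type and the names original and updated are hypothetical stand-ins for the asker's record, chosen only for illustration:

```haskell
-- Hypothetical record type for illustration; not from the original question.
data Config = Config { field :: Int, label :: String }
  deriving (Show, Eq)

original :: Config
original = Config { field = 0, label = "demo" }

-- A fresh name on the left-hand side, so the right-hand side refers to
-- `original` rather than to the binding being defined.
updated :: Config
updated = original { field = 1 }

main :: IO ()
main = do
  print original
  print updated
```

Writing updated = updated { field = 1 } instead would compile, but evaluating it would never terminate, for the reason given above.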
To be fair to Haskell, recursive functions and even recursive values come up quite often. Code like
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
can be really nice once you get the hang of it, and not having to explicitly mark this definition as recursive (like you'd have to do in, say, OCaml) is a definite upside.
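As a quick check that this self-referential definition is well behaved, the following sketch takes a finite prefix of the list. Only the main wrapper is added here; laziness ensures that just the demanded elements are computed:

```haskell
-- The list refers to itself; each element is computed only when demanded.
fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

main :: IO ()
main = print (take 10 fibs)
-- prints [0,1,1,2,3,5,8,13,21,34]
```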
You never need to update a variable: you can always make another variable with the new value. In your case, let rec' = rec{field = 1}.
Maybe you worry about performance and the value being unnecessarily copied. That's the compiler's job, not yours: even if you declare 2 variables in your code, the compiler should only make one in memory and update it in place.
Now there are times when the code is so complex that the compiler fails to optimize it. You can tell by inspecting the intermediate Core language, or even the final assembly. Profile first to find out which functions are slow: if it's just an extra Int or Double, you don't care.
If you do find a function that the compiler failed to optimize and that takes too much time, then you can rewrite it to handle the memory yourself. You will then use things like unboxed vectors, the IO and ST monads, or even language extensions to access the native machine-level types.
First of all, Haskell does not allow "copying" data to itself, which in the usual sense would mean the data is mutable. In Haskell you don't have mutable "variables", so you cannot modify the value a given variable represents.
All you have done is define a new variable that has the same name as its previous version. But to do this properly, you have to refer to the old variable, not the newly defined one. So your original definition
let rec = rec { field=1 }
is a recursive definition: the name rec refers to itself. But what you intended was to refer to the rec defined earlier.
So this is a name conflict.
Haskell has some mechanisms to work around this. One is your "temporary renaming".
For the original example this looks like
let rec' = rec
let rec = rec' { field=1 }
This looks like your given solution. But remember that this form is only available in an interactive command-line environment. If you try to use it inside a function, you may have to write
let rec' = rec in let rec = rec' { field=1 } in ...
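Inside a function body, the nested let ... in form might look like the sketch below. The Config type, the function bump, and the name cfg are hypothetical; cfg stands in for rec:

```haskell
-- Hypothetical record type and function, for illustration only.
data Config = Config { field :: Int }
  deriving (Show, Eq)

bump :: Config -> Config
bump cfg =
  let cfg' = cfg                     -- keep a handle on the old value
  in let cfg = cfg' { field = 1 }    -- this inner cfg shadows the argument
     in cfg

main :: IO ()
main = print (bump (Config { field = 0 }))
```

The two bindings must live in separate let blocks. In a single block they would be mutually recursive, and cfg' = cfg would refer to the new cfg rather than to the argument.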
Here is another workaround, which might be useful when rec belongs to another module (say "MyModule"):
let rec = MyModule.rec { field=1 }

name capture in Haskell `let` expression

I was writing a function similar to this:
f x = let
        x = ...
      in
        e
Due to the scoping rules in Haskell, any use of x in e will resolve to the definition of x in the let construct.
Why is such a thing allowed in Haskell?
Shouldn't the compiler reject such a program, telling us that we cannot bind a value with the same name as an argument of the function?
(This example may be simplistic, but in a real-world context, where variables have semantic meaning associated with them, it is easy to make such a mistake.)
You can enable warnings for this type of name shadowing with the compiler flag
-fwarn-name-shadowing
This option causes a warning to be emitted whenever an inner-scope value has the same name as an outer-scope value, i.e. the inner value shadows the outer one. This can catch typographical errors that turn into hard-to-find bugs, e.g., in the inadvertent capture of what would be a recursive call in f = ... let f = id in ... f ....
However, it is more common to compile with -Wall, which includes a lot of other warnings that will help you avoid bad practices.
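A minimal sketch of a module that triggers the warning follows. It uses -Wname-shadowing, the spelling of the same flag in GHC 8.0 and later, set through an OPTIONS_GHC pragma; the function f and its body are made up for illustration:

```haskell
{-# OPTIONS_GHC -Wname-shadowing #-}
module Main where

-- Hypothetical function: the inner x shadows the parameter x.
f :: Int -> Int
f x =
  let x = 10   -- GHC warns here that this binding shadows the argument
  in x + 1

main :: IO ()
main = print (f 3)
-- prints 11, because the inner x is the one in scope
```

The program still compiles, since shadowing is only a warning. Adding -Werror would turn it into a build failure.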

EAFP in Haskell

I have a question about the Maybe and Either types and their hypothetical relation to EAFP (Easier to Ask Forgiveness than Permission). I've worked with Python and got used to working with the EAFP paradigm in the world of exceptions.
The classical example: Division by zero
def func(x, y):
    if not y:
        print("ERROR.")
    else:
        return x / y
and Python's style:
def func(x, y):
    try:
        return x / y
    except ZeroDivisionError:
        return None
In Haskell, the first function would be
func :: (Eq a, Fractional a) => a -> a -> a
func x y = if y==0 then error "ERROR." else x/y
and with Maybe:
func :: (Eq a, Fractional a) => a -> a -> Maybe a
func x y = if y==0 then Nothing else Just (x/y)
In Python's version, you run func without checking y. With Haskell, the story is the opposite: y is checked.
My question:
Formally, does Haskell support the EAFP paradigm, or does it "prefer" LBYL while admitting a semi-bizarre approximation of EAFP?
PS: I call it "semi-bizarre" because, even if it is intuitively readable, it looks (at least to me) like it violates EAFP.
The Haskell style with Maybe and Either forces you to check for the error at some point, but it does not have to be right away. If you don't want to deal with the error now, you can just propagate it on through the rest of your computation.
Taking your hypothetical safe divide-by-0 example, you could use it in a broader computation without an explicit check:
do result <- func a b
   let x = result * 10
   return x
Here, you don't have to match on the Maybe returned by func: you just extract it into the result variable using do-notation, which automatically propagates failure throughout. The consequence is that you don't need to deal with the potential error immediately, but the final result of the computation is wrapped in Maybe itself.
This means that you can easily combine (compose) functions that might result in an error without having to check for the error at each step.
In a sense, this gives you the best of both worlds. You still only have to check for errors in one place, at the very end, but you're explicit about it. You have to use something like do-notation to take care of the actual propagation and you can't ignore the final error by accident: if you don't want to handle it, you have to turn it into a runtime error explicitly.
Isn't explicit better than implicit?
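To make the propagation concrete, here is a small runnable sketch. safeDiv has the same shape as the Maybe version of func from the question; the name compute and the sample inputs are assumptions for illustration:

```haskell
-- Same shape as the Maybe version of func in the question, renamed for clarity.
safeDiv :: (Eq a, Fractional a) => a -> a -> Maybe a
safeDiv x y = if y == 0 then Nothing else Just (x / y)

-- Two divisions chained with do-notation; no explicit check between the steps.
compute :: Double -> Double -> Double -> Maybe Double
compute a b c = do
  q <- safeDiv a b
  r <- safeDiv q c
  return (r * 10)

main :: IO ()
main = do
  print (compute 8 2 2)   -- Just 20.0
  print (compute 8 0 2)   -- Nothing
  -- The single check happens here, at the end of the chain.
  putStrLn (maybe "division by zero" show (compute 8 2 0))
```

If either division fails, the whole do block short-circuits to Nothing, so the caller decides once, at the end, how to handle the failure.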
Now, Haskell also has a system of exceptions for working with runtime errors that you do not have to check at all. This is useful occasionally, but not too often. In Haskell, we only use it for errors that we do not expect to ever catch—truly exceptional situations. The rule of thumb is that a runtime exception represents a bug in your program, while an improper input or merely an uncommon case should be represented with Maybe or Either.
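For comparison, a runtime exception can still be caught when that is really needed. The sketch below uses try and evaluate from Control.Exception; the helper name divOrError is hypothetical:

```haskell
import Control.Exception (ArithException, evaluate, try)

-- Hypothetical helper: integer division that reports the runtime
-- exception as a Left value instead of crashing the program.
divOrError :: Int -> Int -> IO (Either ArithException Int)
divOrError x y = try (evaluate (x `div` y))

main :: IO ()
main = do
  ok  <- divOrError 10 2
  bad <- divOrError 10 0
  print ok    -- a Right value holding the quotient
  print bad   -- a Left value holding the caught ArithException
```

Catching the exception requires IO and an explicit evaluate to force the division, which is part of why Maybe or Either is usually the lighter choice for expected failures.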

ocaml style: parameterize programs

I have an OCaml program whose modules have lots of functions that depend on a parameter, e.g. "dimension". This parameter is determined once at the beginning of a run and stays constant until termination.
My question is: how can I make the code shorter, so that my functions do not all require a "dimension" parameter? The modules call each other's functions, so there is no strict hierarchy (or I can't see one) between them.
What is the OCaml style for addressing this problem? Do I have to use functors, or are there other means?
You probably cannot evaluate the parameter without breaking dependencies between modules, otherwise you would just define it in one of the modules where it is accessible from other modules. A solution that comes to my mind is a bit "daring". Define the parameter as a lazy value, and suspend in it a dereference of a "global cell":
let hidden_box = ref None
let initialize_param p =
  match !hidden_box with None -> hidden_box := Some p | _ -> assert false
let param =
  lazy (match !hidden_box with None -> assert false | Some p -> p)
The downside is that Lazy.force param is a bit verbose.
ETA: Note that "there is no strict hierarchy between the modules" means one of the following:
(1) the claim is false, or
(2) you have a recursive module definition, or
(3) you are tying the recursive knot somewhere.
In case (2) you can just put everything into a functor. In case (3) you are passing parameters already.
