Typing recursive modules

In Leroy's paper on how recursive modules are typed in OCaml, it is written that modules are checked in an environment made of approximations of module types:
module rec A = ... and B = ... and C = ...
An environment {A -> approx(A); B -> approx(B); C -> approx(C)} is first built, and then used to compute the types of A, B and C.
I noticed that, in some cases, approximations are not good enough and typechecking fails. In particular, when the sources of compilation units are put into a recursive module definition, typechecking can fail even though the compiler was able to compile the compilation units separately.
Coming back to my first example, I found that a solution would be to type A in the initial approximated environment, then to type B in that initial environment extended with the newly computed type of A, then to type C in the previous environment extended with the newly computed type of B, and so on.
Before investigating further, I would like to know if there is any prior work on this subject showing that such a compilation scheme for recursive modules is either safe or unsafe. Is there a counter-example showing an unsafe program correctly typed with this scheme?

First, note that Leroy (or OCaml) does not allow module rec without explicit signature annotations. So it's rather
module rec A : SA = ... and B : SB = ... and C : SC = ...
and the approximate environment is {A : approx(SA), B : approx(SB), C : approx(SC)}.
It is not surprising that some modules type-check when defined separately, but not when defined recursively. After all, the same is already true for core-language declarations: in a 'let rec', recursive occurrences of the bound variables are monomorphic, while with separate 'let' declarations you can use previously defined variables polymorphically. Intuitively, the reason is that you cannot have all the knowledge you'd need to infer the more liberal types before you have actually checked the definitions.
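To make that concrete, here is a minimal sketch of the same phenomenon in Haskell, whose recursive binding groups behave like ML's let rec in this respect (f and g are invented names):

-- Rejected without a signature: f and g form one recursive group, so
-- occurrences of f inside the group are monomorphic, and the uses at
-- Bool and Char conflict:
--   f x = const x (g ())
--   g () = (f True, f 'a')

-- Accepted: an explicit signature lets f be used polymorphically within
-- the group, much as module rec requires signature annotations:
f :: a -> a
f x = const x (g ())

g :: () -> (Bool, Char)
g () = (f True, f 'a')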
Regarding your suggestion, the problem with it is that it makes the module rec construct asymmetric, i.e. order would matter in potentially subtle ways. That violates the spirit of recursive definitions (at least in ML), which should always be indifferent to ordering.
In general, the issue with recursive typing is not so much soundness, but rather completeness. You don't want a type system that is either undecidable in general, or whose specification is dependent on algorithmic artefacts (like checking order).
On a more general note, it is well-known that OCaml's treatment of recursive modules is rather restrictive. There has been work, e.g. by Nakata & Garrigue, that pushes its limits further. However, I am convinced that ultimately you won't be able to get as liberal as you'd like (and that applies to other aspects of its module system as well, e.g. functors) without abandoning OCaml's purely syntactic approach to module typing. But then, I'm biased. :)

Related

Why do GHC and GHCi differ on type inference?

I noticed, when doing a codegolf challenge, that by default GHC doesn't infer the most general type for variables, leading to type errors when you try to use one at two different types.
For example:
(!) = elem
x = 'l' ! "hello" -- From its use here, GHC assumes (!) :: Char -> [Char] -> Bool
y = 5 ! [3..8] -- Fails because GHC expects these numbers to be of type Char, too
This can be changed using the pragma NoMonomorphismRestriction.
However, typing this into GHCi produces no type error, and :t (!) reveals that here it assumes (Foldable t, Eq a) => a -> t a -> Bool, even when explicitly run with -XMonomorphismRestriction.
Why do GHC and GHCi differ on assuming the most general type for functions?
(Also, why have it enabled by default anyway? What does it help with?)
The background of why the committee made this decision is given, in the designers’ own words, in the article “A History of Haskell: Being Lazy with Class” by Paul Hudak et al.
A major source of controversy in the early stages was the so-called
“monomorphism restriction.” Suppose that genericLength has this
overloaded type:
genericLength :: Num a => [b] -> a
Now consider this definition:
f xs = (len, len)
  where
    len = genericLength xs
It looks as if len should be computed only once, but it can
actually be computed twice. Why? Because we can infer the type
len :: (Num a) => a; when desugared with the dictionary-passing
translation, len becomes a function that is called once for each
occurrence of len, each of which might be used at a different type.
[John] Hughes argued strongly that it was unacceptable to silently
duplicate computation in this way. His argument was motivated by
a program he had written that ran exponentially slower than he expected.
(This was admittedly with a very simple compiler, but we were reluctant to
make performance differences as big as this dependent on compiler
optimisations.)
Following much debate, the committee adopted the now-notorious
monomorphism restriction. Stated briefly, it says that a definition
that does not look like a function (i.e. has no arguments on
the left-hand side) should be monomorphic in any overloaded
type variables. In this example, the rule forces len to be used
at the same type at both its occurrences, which solves the performance
problem. The programmer can supply an explicit type signature for len if
polymorphic behaviour is required.
The monomorphism restriction is manifestly a wart on the language.
It seems to bite every new Haskell programmer by giving rise to an
unexpected or obscure error message. There has been much
discussion of alternatives.
(18, Emphasis added.) Note that John Hughes is a co-author of the article.
I can't replicate your result that GHCi infers the type (Foldable t, Eq a) => a -> t a -> Bool even with -XMonomorphismRestriction (GHC 8.0.2).
What I see is that when I enter the line (!) = elem it infers the type (!) :: () -> [()] -> Bool, which is actually a perfect illustration of why you would want GHCi to behave "differently" from GHC, given that GHC is using the monomorphism restriction.
The problem described in Davislor's answer, which the monomorphism restriction was intended to address, is that you could write code that syntactically looks like it computes a value once, binds it to a name, and then uses it several times, when actually the thing bound to the name is a reference to a closure awaiting a type class dictionary before it can really compute the value. All the use sites would separately work out which dictionary they need to pass and compute the value again, even if all the use sites actually pick the same dictionary (exactly as if you wrote a function of a number and then invoked it from several different places with the same argument: you'd get the same result computed multiple times). But if the user was thinking of that binding as a simple value then this would be unexpected, and it's extremely likely that all the use sites will want a single dictionary (because the user expected a reference to a single value computed from a single dictionary).
The monomorphism restriction forces GHC not to infer types that still need a dictionary (for bindings that have no syntactic parameters). So now the dictionary is chosen once at the binding site, instead of separately at each use of the binding, and the value really is only computed once. But that only works if the dictionary chosen at the binding site is the correct one that all the use sites would have chosen. If GHC picked the wrong one at the binding site, then all the use-sites would be type errors, even if they all agree on what type (and thus dictionary) they were expecting.
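To make the dictionary-passing point concrete, here is a hand-rolled sketch of the translation (NumDict, genericLength' and lenAsFunction are illustrative names, not GHC's actual Core):

-- A class constraint becomes an extra record-of-functions argument:
data NumDict a = NumDict { zero :: a, inc :: a -> a }

genericLength' :: NumDict a -> [b] -> a
genericLength' d = go
  where
    go []     = zero d
    go (_:xs) = inc d (go xs)

-- Without the restriction, a binding like len = genericLength xs
-- desugars to a function of the dictionary:
lenAsFunction :: [b] -> NumDict a -> a
lenAsFunction xs d = genericLength' d xs

Every use site applies such a binding to a (possibly different) dictionary and traverses the list again; the restriction instead fixes one dictionary at the binding site, so len becomes an ordinary, shared value.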
GHC compiles entire modules at once. So it can see the use sites and the binding site at the same time. Thus if any use of the binding requires a specific concrete type, the binding will use that type's dictionary, and everything will be well so long as all of the other use sites are compatible with that type (even if they were actually polymorphic and would also have worked with other types). This works even if the code that pins down the correct type is widely separated from the binding by many other calls; all the constraints on the types of things are effectively connected by unification during the type checking/inference phase, so when the compiler is choosing a type at the binding site it can "see" the requirements from all of the use-sites (within the same module).
But if the use sites are not all consistent with a single concrete type, then you get a type error, as in your example. One use site of (!) requires the type variable a to be instantiated as Char; the other requires a type that also has a Num instance (which Char doesn't).
That isn't consistent with our hopeful assumption that all the use sites would want a single dictionary, and so the monomorphism restriction has resulted in an error that could have been avoided by inferring a more general type for (!). It's certainly debatable whether the monomorphism restriction prevents more problems than it causes, but given that it is there, surely we'd want GHCi to behave the same way, right?
However, GHCi is an interpreter. You enter code one statement at a time, not one module at a time. So when you type (!) = elem and hit enter, GHCi has to understand that statement and produce a value to bind to (!) with some specific type right now (it can be an unevaluated thunk, but we have to know what its type is). With the monomorphism restriction we can't infer (Foldable t, Eq a) => a -> t a -> Bool; we have to pick a type for those type variables now, with no information from use sites to help us pick something sensible. The extended default rules that are on in GHCi (another difference from GHC) default those to [] and (), so you get (!) :: () -> [()] -> Bool¹. Pretty useless, and you get a type error trying either of the uses from your example.
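A reconstruction of that session (the exact output may vary by GHC version):

ghci> :set -XMonomorphismRestriction
ghci> (!) = elem
ghci> :t (!)
(!) :: () -> [()] -> Bool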
The problem that the monomorphism restriction addresses is particularly egregious in the case of numeric calculations when you're not writing explicit type signatures. Since Haskell's numeric literals are overloaded you could easily write an entire complex calculation, complete with starting data, whose most general type is polymorphic with a Num or Floating or etc constraint. Most of the builtin numeric types are very small, so you're very likely to have values that you'd much rather store than compute multiple times. The scenario is more likely to happen, and more likely to be a problem.
But it's also exactly with numeric types that the whole-module type-inference process is essential for defaulting type variables to a concrete type in a way that is at all usable (and small examples with numbers are exactly what people new to Haskell are likely to be trying out in the interpreter). Before the monomorphism restriction was off by default in GHCi, there was a constant stream of Haskell questions here on Stack Overflow from people confused about why they couldn't divide numbers in GHCi that they could in compiled code, or something similar (basically the reverse of your question here). In compiled code you can mostly just write code the way you want with no explicit types, and the full-module type inference figures out whether it should default your integer literals to Integer, or Int if they need to be added to something returned by length, or Double if they need to be added to something and multiplied by something else which is elsewhere divided by something, etc. In GHCi a simple x = 2 very often does the wrong thing with the monomorphism restriction turned on (because it'll pick Integer regardless of what you wanted to do with x later), with the result that you need to add far more type annotations in a quick-and-easy interactive interpreter than even the most ardent explicit typist would use in production compiled code.
So it's certainly debatable whether GHC should use the monomorphism restriction or not; it's intended to address a real problem, it just also causes some others². But the monomorphism restriction is a terrible idea for the interpreter. The fundamental difference between line-at-a-time and module-at-a-time type inference means that even when they both defaulted to using it they behaved quite differently in practice anyway. GHCi without the monomorphism restriction is at least significantly more usable.
¹ Without the extended default rules you instead get an error about an ambiguous type variable, because there is nothing to pin down a choice, not even the somewhat silly defaulting rules.
² I find it only a mild irritation in actual development because I write type signatures for top-level bindings. I find that's enough to make the monomorphism restriction apply only rarely, so it doesn't help or hinder me much. Thus I'd probably rather it was scrapped, so that everything works consistently, especially as it seems to bite learners far more often than it bites me as a practitioner. On the other hand, debugging a rare performance problem on the occasions when it matters is much harder than occasionally having to add a correct type signature that GHC annoyingly won't infer.
NoMonomorphismRestriction is a useful default in GHCi because you don't have to write out so many pesky type signatures in the REPL; GHCi will try to infer the most general types it can.
MonomorphismRestriction is a useful default otherwise for efficiency / performance reasons. Specifically, the issue boils down to the fact that:
typeclasses essentially introduce additional function parameters -- specifically, the dictionary of code implementing the instances in question. In the case of typeclass polymorphic pattern bindings, you end up turning something that looked like a pattern binding -- a constant that would only ever be evaluated once, into what is really a function binding, something which will not be memoised.
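A small self-contained sketch of that effect (the binding big and the list bounds are invented for illustration):

{-# LANGUAGE NoMonomorphismRestriction #-}

-- 'big' looks like a constant, but without the restriction it is
-- inferred as (Enum a, Num a) => a, i.e. a function of class
-- dictionaries, so each differently-typed use below redoes the sum:
big = sum [1 .. 10000000]

main :: IO ()
main = print (big :: Int, big :: Double)  -- the sum runs twice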

Why does Haskell have trouble resolving "overloaded" operators?

This post poses the question for the case of (!!). The accepted answer tells us that what you are actually doing is defining a new function (!!), and that you should therefore avoid importing the standard one.
But why do so, if the new function is applied to different types than the standard one? Isn't the compiler able to choose the right one according to its parameters?
Is there any compiler flag to allow this?
For instance, * is not defined for [Float] * Float, so why does the compiler complain
> Ambiguous occurrence *
> It could refer to either `Main.*', defined at Vec.hs:4:1
> or `Prelude.*',
for this code:
(*) :: [Float] -> Float -> [Float]
(*) as k = map (\a -> a*k) as -- here: clearly Float*Float
r = [1.0, 2.0, 3.0] :: [Float]
s = r * 2.0 -- here: clearly [Float] * Float
main = do
  print r
  print s
Allowing the compiler to choose the correct implementation of a function based on its type is the purpose of typeclasses. It is not possible without them.
For a justification of this approach, you might read the paper that introduced them: How to make ad-hoc polymorphism less ad hoc [PDF].
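As a sketch of how a class would recover the questioner's overloaded (*) (Scale and (*.) are made-up names, and the instance on [Float] needs FlexibleInstances):

{-# LANGUAGE FlexibleInstances #-}

class Scale a where
  (*.) :: a -> Float -> a

instance Scale Float where
  x *. k = x * k

instance Scale [Float] where
  xs *. k = map (* k) xs

r :: [Float]
r = [1.0, 2.0, 3.0]

s :: [Float]
s = r *. 2.0  -- the compiler picks the [Float] instance from the type

main :: IO ()
main = print r >> print s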
Really, the reason is this: in Haskell, there is not necessarily a clear association “variable x has type T”.
Haskell is almost as flexible as dynamic languages, in the sense that any type can be a type variable, i.e. can have polymorphic type. But whereas in dynamic languages (and also e.g. OO polymorphism or C++ templates) the types of such type variables are basically just extra information attached to the value variables in your code (so an overloaded operator can see: argument is an Int → do this, is a String → do that), in Haskell the type variables live in a completely separate scope, in the type language. This gives you many advantages; for instance, higher-kinded polymorphism is pretty much impossible without such a system. However, it also means it's harder to reason about how overloaded functions should be resolved. If Haskell allowed you to just write overloads and assume the compiler makes its best guess at resolving the ambiguity, you'd often end up with strange error messages in unexpected places. (Actually, this can easily happen with overloads even if you have no Hindley-Milner type system; C++ is notorious for it.)
Instead, Haskell chooses to force overloads to be explicit. You must first define a type class before you can overload methods, and though this can't completely preclude confusing compilation errors it makes them much easier to avoid. Also, it lets you express polymorphic methods with type resolution that couldn't be expressed with traditional overloading, in particular polymorphic results (which is great for writing very easily reusable code).
It is a design decision, not a theoretical problem, not to include this in Haskell. As you say, many other languages use types to disambiguate between terms in an ad-hoc way. But type classes have similar functionality and additionally allow abstraction over things that are overloaded. Type-directed name resolution does not.
Nevertheless, forms of type-directed name resolution have been discussed for Haskell (for example in the context of resolving record field selectors) and are supported by some languages similar to Haskell such as Agda (for data constructors) or Idris (more generally).

Type erasure in Haskell?

I was reading a lecture note on Haskell when I came across this paragraph:
This “not caring” is what the “parametric” in parametric polymorphism means. All Haskell functions must be parametric in their type parameters; the functions must not care or make decisions based on the choices for these parameters. A function can't do one thing when a is Int and a different thing when a is Bool. Haskell simply provides no facility for writing such an operation. This property of a language is called parametricity.
There are many deep and profound consequences of parametricity. One consequence is something called type erasure. Because a running Haskell program can never make decisions based on type information, all the type information can be dropped during compilation. Despite how important types are when writing Haskell code, they are completely irrelevant when running Haskell code. This property gives Haskell a huge speed boost when compared to other languages, such as Python, that need to keep types around at runtime. (Type erasure is not the only thing that makes Haskell faster, but Haskell is sometimes clocked at 20x faster than Python.)
What I don't understand is how are "all Haskell functions" parametric? Aren't types explicit/static in Haskell? Also, I don't really understand how type erasure improves runtime performance?
Sorry if these questions are really basic, I'm new to Haskell.
EDIT:
One more question: why does the author say that "Despite how important types are when writing Haskell code, they are completely irrelevant when running Haskell code"?
What I don't understand is how are "all Haskell functions" parametric?
It doesn't say that all Haskell functions are parametric; it says:
All Haskell functions must be parametric in their type parameters.
A Haskell function need not have any type parameters.
One more question: why does the author say that "Despite how important types are when writing Haskell code, they are completely irrelevant when running Haskell code"?
Unlike a dynamically typed language where you need to check at run time if (for example) two things are numbers before trying to add them together, your running Haskell program knows that if you're trying to add them together, then they must be numbers because the compiler made sure of it beforehand.
Aren't types explicit/static in Haskell?
Types in Haskell can often be inferred, in which case they don't need to be explicit. But you're right that they're static, and that is actually why they don't matter at run time, because static means that the compiler makes sure everything has the type that it should before your program ever executes.
Types can be erased in Haskell because the type of an expression is either known at compile time (like True) or does not matter at runtime (like []).
There's a caveat to this, though: it assumes that all values have some kind of uniform representation. Most Haskell implementations use pointers for everything, so the actual type of what a pointer points to doesn't matter (except to the garbage collector), but you could imagine a Haskell implementation that uses a non-uniform representation; then some type information would have to be kept.
Others have already answered, but perhaps some examples can help.
Python, for instance, retains type information until runtime:
>>> def f(x):
...     if type(x) == type(0):
...         return (x+1, x)
...     else:
...         return (x, x)
...
>>> f("hello")
('hello', 'hello')
>>> f(10)
(11, 10)
The function above, given any argument x, returns the pair (x,x), except when x is of type int. The function tests for that type at runtime, and if x is found to be an int it behaves in a special way, returning (x+1, x) instead.
To realize the above, the Python runtime must keep track of types. That is, when we do
>>> x = 5
Python can not just store the byte representation of 5 in memory. It also needs to mark that representation with a type tag int, so that when we do type(x) the tag can be recovered.
Further, before doing any operation such as x+1 Python needs to check the type tag to ensure we are really working on ints. If x is for instance a string, Python will raise an exception.
Statically checked languages such as Java do not need such checks at runtime. For instance, when we run
SomeClass x = new SomeClass(42);
x.foo();
the compiler has already checked there's indeed a method foo for x at compile time, so there's no need to do that again. This can improve performance, in principle. (Actually, the JVM does some runtime checks at class load time, but let's ignore those for the sake of simplicity)
In spite of the above, Java has to store type tags like Python does, since it has an analogue of type(-):
if (x instanceof SomeClass) { ...
Hence, Java allows one to write functions which can behave "specially" on some types.
// this is a "generic" function, using a type parameter A
<A> A foo(A x) {
    if (x instanceof B) {  // B is some arbitrary class
        B b = (B) x;
        return (A) new B(b.get() + 1);
    } else {
        return x;
    }
}
The above function foo() just returns its argument, except when it's of type B, for which a new object is created instead. This is a consequence of using instanceof, which requires every object to carry a tag at runtime.
To be honest, such a tag is already present to be able to implement virtual methods, so it does not cost anything more. Yet, the presence of instanceof makes it possible to cause the above non-uniform behaviour on types -- some types can be handled differently.
Haskell, instead, has no such type/instanceof operator. A parametric Haskell function having the type
foo :: a -> (a,a)
must behave in the same way at all types. There's no way to trigger some "special" behaviour. Concretely, foo x must return (x,x), and we can see this just by looking at the type annotation above. To stress the point: there's no need to look at the code at all to prove such a property. This is what parametricity ensures from the type above.
Implementations of dynamically typed languages typically need to store type information with each value in memory. This isn't too bad for Lisp-like languages that have just a few types and can reasonably identify them with a few tag bits (although such limited types lead to other efficiency issues). It's much worse for a language with lots of types. Haskell lets you carry type information to runtime, but it forces you to be explicit about it, so you can pay attention to the cost. For example, adding the context Typeable a to a type offers a value with that type access, at runtime, to a representation of the type of a. More subtly, typeclass instance dictionaries are usually specialized away at compile time, but in sufficiently polymorphic or complex cases may survive to runtime. In a compiler like the apparently-abandoned JHC, and one likely possibility for the as-yet-barely-started compiler THC, this could lead to some type information leaking to runtime in the form of pointer tagging. But these situations are fairly easy to identify and only rarely cause serious performance problems.
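As a sketch of that opt-in mechanism (the function special is invented for illustration), a Typeable constraint lets a function branch on the runtime type, much like the Python f above:

import Data.Typeable (Typeable, cast)

-- Without the Typeable constraint this function could not inspect the
-- type of its argument at all; with it, a runtime representation of the
-- type is passed in, and cast performs a checked conversion:
special :: Typeable a => a -> (a, a)
special x =
  case cast x :: Maybe Int of
    Just n  -> case cast (n + 1) of   -- here a is known to be Int
                 Just y  -> (y, x)
                 Nothing -> (x, x)    -- unreachable, kept for totality
    Nothing -> (x, x)

main :: IO ()
main = do
  print (special (10 :: Int))  -- (11,10)
  print (special "hello")      -- ("hello","hello")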

Stripping out let in Haskell

I should probably first mention that I'm pretty new to Haskell. Is there a particular reason to keep the let expression in Haskell?
I know that Haskell got rid of the rec keyword that corresponds to the Y-combinator portion of a let statement that indicates it's recursive. Why didn't they get rid of the let statement altogether?
If they did, statements will seem more iterative to some degree. For example, something like:
let y = 1+2
    z = 4+6
in y+z
would just be:
y = 1+2
z = 4+6
y+z
Which is more readable and easier for someone new to functional programming to follow. The only reason I can think of to keep it around is something like this:
aaa = let y = 1+2
          z = 4+6
      in y+z
Which would look like this without the let, which I think ends up being ambiguous grammar:
aaa =
  y = 1+2
  z = 4+6
  y+z
But if Haskell didn't ignore whitespace, and code blocks/scope worked similar to Python, would it be able to remove the let?
Is there a stronger reason to keep around let?
Sorry if this question seems stupid, I'm just trying to understand more about why it's in there.
Syntactically you can easily imagine a language without let. Indeed, we can approximate one in Haskell immediately by simply relying on where if we want. Beyond that, many other syntaxes are possible.
Semantically, you might think that let could translate away to something like this
let x = e in g ==> (\x -> g) e
and, indeed, at runtime these two expressions are identical (modulo recursive bindings, but those can be achieved with fix). Traditionally, however, let has special typing semantics (along with where and top-level name definitions, all of which are, effectively, syntactic sugar for let).
In particular, in the Hindley-Milner type system which forms the foundation of Haskell there's a notion of let-generalization. Intuitively, it regards situations where we upgrade functions to their most polymorphic form. In particular, if we have a function appearing in an expression somewhere with a type like
a -> b -> c
those variables, a, b, and c, may or may not already have meaning in that expression. In particular, they're assumed to be fixed yet unknown types. Compare that to the type
forall a b c. a -> b -> c
which includes the notion of polymorphism by stating, immediately, that even if there happen to be type variables a, b, and c available in the environment, these references are fresh.
This is an incredibly important step in the HM inference algorithm, as it is how polymorphism is introduced, allowing HM to reach more general types. Unfortunately, it's not possible to do this step whenever we please; it must be done at controlled points.
This is what let-generalization does: it says that types should be generalized to polymorphic types when they are let-bound to a particular name. Such generalization does not occur when they are merely passed into functions as arguments.
So, ultimately, you need a form of "let" in order to run the HM inference algorithm. Further, it cannot just be syntactic sugar for function application, despite the two having equivalent runtime characteristics.
Syntactically, this "let" notion might be called let or where or by a convention of top-level name binding (all three are available in Haskell). So long as it exists and is a primary method for generating bound names where people expect polymorphism then it'll have the right behavior.
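A minimal illustration of the difference (bindings invented for this sketch): the let-bound f below is generalized and can be used at two types, while the same term with a lambda-bound f is rejected:

ok :: (Bool, Char)
ok = let f = \x -> x          -- generalized: f :: forall a. a -> a
     in (f True, f 'a')

-- bad = (\f -> (f True, f 'a')) (\x -> x)
-- Rejected: the lambda-bound f gets one monomorphic type, so its uses
-- at Bool and Char conflict.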
There are important reasons why Haskell and other functional languages use let. I'll try to describe them step by step:
Quantification of type variables
The Damas-Hindley-Milner type system used in Haskell and other functional languages allows polymorphic types, but the type quantifiers are allowed only in front of a given type expression. For example, if we write
const :: a -> b -> a
const x y = x
then the type of const is polymorphic, it is implicitly universally quantified as
∀a.∀b. a -> b -> a
and const can be specialized to any type that we obtain by substituting two type expressions for a and b.
However, the type system doesn't allow quantifiers inside type expressions, such as
(∀a. a -> a) -> (∀b. b -> b)
Such types are allowed in System F, but then type checking and type inference are undecidable, which means that the compiler wouldn't be able to infer types for us and we would have to explicitly annotate expressions with types.
(For a long time the question of decidability of type checking in System F was open, and it was sometimes referred to as "an embarrassing open problem", because undecidability had been proven for many other systems but this one, until Joe Wells proved it in 1994.)
(GHC allows you to enable such explicit inner quantifiers using the RankNTypes extension, but as mentioned, the types can't be inferred automatically.)
Types of lambda abstractions
Consider the expression λx.M, or in Haskell notation \x -> M,
where M is some term containing x. If the type of x is a and the type of M is b, then the type of the whole expression is λx.M : a → b. Because of the above restriction, a must not contain ∀; therefore the type of x can't contain type quantifiers, i.e. it can't be polymorphic (in other words, it must be monomorphic).
Why lambda abstraction isn't enough
Consider this simple Haskell program:
i :: a -> a
i x = x
foo :: a -> a
foo = i i
Let's disregard for now that foo isn't very useful. The main point is that i in the definition of foo is instantiated at two different types. The first one
i :: (a -> a) -> (a -> a)
and the second one
i :: a -> a
Now if we try to convert this program into the pure lambda calculus syntax without let, we'd end up with
(λi.i i)(λx.x)
where the first part is the definition of foo and the second part is the definition of i. But this term will not type check. The problem is that i must have a monomorphic type (as described above), but we need it to be polymorphic so that we can instantiate i at the two different types.
Indeed, if you try to typecheck \i -> i i in Haskell, it will fail. There is no monomorphic type we can assign to i so that i i would typecheck.
let solves the problem
If we write let i x = x in i i, the situation is different. Unlike in the previous paragraph, there is no lambda here, no self-contained expression like λi.i i where we'd need a polymorphic type for the abstracted variable i. Therefore let can allow i to have a polymorphic type, in this case ∀a.a → a, and so i i typechecks.
Without let, if we compiled a Haskell program and converted it to a single lambda term, every function would have to be assigned a single monomorphic type! This would be pretty useless.
So let is an essential construct that allows polymorphism in languages based on Damas-Hindley-Milner type systems.
The History of Haskell speaks a bit to the fact that Haskell has long since embraced a complex surface syntax.
It took some while to identify the stylistic choice as we have done here, but once we had done so, we engaged in furious debate about which style was “better.” An underlying assumption was that if possible there should be “just one way to do something,” so that, for example, having both let and where would be redundant and confusing.
In the end, we abandoned the underlying assumption, and provided full syntactic support for both styles. This may seem like a classic committee decision, but it is one that the present authors believe was a fine choice, and that we now regard as a strength of the language. Different constructs have different nuances, and real programmers do in practice employ both let and where, both guards and conditionals, both pattern-matching definitions and case expressions—not only in the same program but sometimes in the same function definition. It is certainly true that the additional syntactic sugar makes the language seem more elaborate, but it is a superficial sort of complexity, easily explained by purely syntactic transformations.
This is not a stupid question. It is completely reasonable.
First, let/in bindings are syntactically unambiguous and can be rewritten in a simple mechanical way into lambdas.
Second, and because of this, let ... in ... is an expression: that is, it can be written wherever expressions are allowed. In contrast, your suggested syntax is more similar to where, which is bound to a surrounding syntactic construct, like the pattern matching line of a function definition.
One might also make an argument that your suggested syntax is too imperative in style, but this is certainly subjective.
You might prefer using where to let. Many Haskell developers do. It's a reasonable choice.
There is a good reason why let is there:
let can be used within the do notation (see the sketch after this list).
It can be used within list comprehensions.
It can be used within function definitions, as conveniently mentioned here.
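A minimal sketch of the first two uses:

main :: IO ()
main = do
  let y = 1 + 2
      z = 4 + 6
  print (y + z)

squaresOfEvens :: [Int]
squaresOfEvens = [sq | n <- [1 .. 10], even n, let sq = n * n]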
You give the following example as an alternative to let:
y = 1+2
z = 4+6
y+z
The above example is not valid at the top level (the bare expression y+z is not a definition), and y and z would also pollute the global namespace, which can be avoided using let.
Part of the reason Haskell's let looks the way it does is the consistent way the language manages indentation sensitivity. Every indentation-sensitive construct works the same way: first there's an introducing keyword (let, where, do, of); then the next token's position determines the indentation level for this block; and subsequent lines that start at the same level are considered to be a new element in the block. That's why you can have
let a = 1
    b = 2
in a + b
or
let
  a = 1
  b = 2
in a + b
but not
let a = 1
      b = 2
in a + b
I think it might actually be possible to have keywordless indentation-based bindings without making the syntax technically ambiguous. But I think there is value in the current consistency, at least for the principle of least surprise. Once you see how one indentation-sensitive construct works, they all work the same. And as a bonus, they all have the same indentation-insensitive equivalent. This
keyword <element 1>
        <element 2>
        <element 3>
is always equivalent to
keyword { <element 1>; <element 2>; <element 3> }
In fact, as a mainly F# developer, this is something I envy from Haskell: F#'s indentation rules are more complex and not always consistent.

Why does Haskell hide functions with the same name but different type signatures?

Suppose I was to define (+) on Strings but not by giving an instance of Num String.
Why does Haskell now hide Num's (+) function? After all, the function I have provided:
(+) :: String -> String -> String
can be distinguished by the compiler from Prelude's (+). Why can't both functions exist in the same namespace, but with different, non-overlapping type signatures?
As long as there is no call to the function in the code, Haskell need not care that there's an ambiguity. Placing a call to the function with arguments would then determine the types, so that the appropriate implementation can be chosen.
Of course, once there is an instance Num String, there would actually be a conflict, because at that point Haskell couldn't decide based upon the parameter type which implementation to choose, if the function were actually called.
In that case, an error should be raised.
Wouldn't this allow function overloading without pitfalls/ambiguities?
Note: I am not talking about dynamic binding.
Haskell simply does not support function overloading (except via typeclasses). One reason for that is that function overloading doesn't work well with type inference. If you had code like f x y = x + y, how would Haskell know whether x and y are Nums or Strings, i.e. whether the type of f should be f :: Num a => a -> a -> a or f :: String -> String -> String?
PS: This isn't really relevant to your question, but the types aren't strictly non-overlapping if you assume an open world, i.e. in some module somewhere there might be an instance for Num String, which, when imported, would break your code. So Haskell never makes any decisions based on the fact that a given type does not have an instance for a given typeclass. Of course, function definitions hide other function definitions with the same name even if there are no typeclasses involved, so as I said: not really relevant to your question.
Regarding why it's necessary for a function's type to be known at the definition site, as opposed to being inferred at the call site: first of all, the call site of a function may be in a different module than the function definition (or in multiple different modules), so if we had to look at the call sites to infer a function's type, we'd have to perform type checking across module boundaries. That is, when type checking a module, we'd also have to go through all the modules that import it, so in the worst case we'd have to recompile all modules every time we change a single module. This would greatly complicate and slow down the compilation process. More importantly, it would make it impossible to compile libraries, because it's the nature of libraries that their functions will be used by code bases the compiler does not have access to when compiling the library.
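For completeness, the supported way to keep both functions available is qualified naming rather than overloading; a minimal sketch (module layout invented):

import Prelude hiding ((+))
import qualified Prelude

(+) :: String -> String -> String
(+) = (Prelude.++)

three :: Int
three = 1 Prelude.+ 2      -- select Prelude's (+) explicitly

greeting :: String
greeting = "foo" + "bar"   -- the local String-typed (+)

main :: IO ()
main = print (three, greeting)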
As long as the function isn't called
At some point, when using the function
No, no, no. In Haskell you don't think in terms of "before" or "the minute you do...", but define stuff once and for all time. That's most apparent in the runtime behaviour of variables, but it also translates to function signatures and class instances. This way, you don't have to do all the tedious thinking about compilation order, and you are safe from the many ways in which e.g. C++ templates/overloads often break horribly because of one tiny change in the program.
Also, I don't think you quite understand how Hindley-Milner works.
Before you call the function, at which time you know the type of the argument, it doesn't need to know.
Well, you normally don't know the type of the argument! It may sometimes be explicitly given, but usually it's deduced from the other argument or the return type. For instance, in
map (+3) [5,6,7]
the compiler doesn't know what types the numeric literals have; it only knows that they are numbers. This way, you can evaluate the result as whatever you like, and that allows for things you could only dream of in other languages, for instance a symbolic type where
> map (+3) [5,6,7] :: [SymbolicNum]
[SymbolicPlus 5 3, SymbolicPlus 6 3, SymbolicPlus 7 3]
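A rough sketch of such a symbolic instance (all names invented; a real version would want a prettier Show to produce exactly the output shown above):

data SymbolicNum = SymbolicLit Integer
                 | SymbolicPlus SymbolicNum SymbolicNum
  deriving Show

instance Num SymbolicNum where
  fromInteger = SymbolicLit
  (+)         = SymbolicPlus
  (*)         = error "sketch: not needed"
  abs         = error "sketch: not needed"
  signum      = error "sketch: not needed"
  negate      = error "sketch: not needed"

demo :: [SymbolicNum]
demo = map (+3) [5, 6, 7]
-- [SymbolicPlus (SymbolicLit 5) (SymbolicLit 3), ...]: the literals
-- become SymbolicNum via fromInteger, and (+) builds syntax trees.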

Resources