Haskell type inference for lambda functions (in map)

This question already has an answer here:
What is the monomorphism restriction?
Example 1
The following definition will throw an error if the type declaration is omitted:
f :: Eq t => (t,t) -> Bool -- omitting this line will result in an error
f = \(x,y) -> x==y
(I know this function can be written more concisely, but that is not the point here.)
Example 2
On the other hand, using the same lambda function inside a function that uses map works without producing an error:
g l = map (\(x,y) -> x==y) l
(Just as an illustration: g [(3,4),(5,5),(7,6)] will produce [False,True,False].)
Example 3
The following code is also perfectly fine, and it seems to do exactly the same as the original f above. Here type inference works.
f' (x,y) = x==y
Question
So my question is: why do we need a type declaration in the first case, but not in the second or the third?

If you use:
{-# LANGUAGE NoMonomorphismRestriction #-}
f = \(x,y) -> x==y
you won't get the error.
Update
The Haskell Wiki page on the monomorphism restriction offers some details on why these definitions are treated differently:
f1 x = show x
f2 = \x -> show x
The difference between the first and second version is that the first version binds x via a "function binding" (see section 4.4.3 of the Haskell 2010 Report), and is therefore unrestricted, but the second version does not. The reason why one is allowed and the other is not is that it's considered clear that sharing f1 will not share any computation, and less clear that sharing f2 will have the same effect. If this seems arbitrary, that's because it is. It is difficult to design an objective rule which disallows subjective unexpected behaviour. Some people are going to fall foul of the rule even though they're doing quite reasonable things.
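To make the contrast concrete, here is a minimal sketch (mine, not from the wiki page) of how each style of binding fares in a module:
-- Function binding: the restriction does not apply, so GHC infers the
-- polymorphic type f1 :: Show a => a -> String without any help.
f1 x = show x

-- Simple pattern binding: the restriction does apply, so in a module this
-- needs either an explicit signature (as here) or NoMonomorphismRestriction.
f2 :: Show a => a -> String
f2 = \x -> show x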

As #ErikR notes in the comments, this is due to the Monomorphism restriction. We also see this in the error message:
No instance for (Eq a0) arising from a use of '=='
The type variable 'a0' is ambiguous
Possible cause: the monomorphism restriction applied to the following:
f :: (a0, a0) -> Bool
(bound at ...
Note: there are several potential instances:
instance Eq a => Eq (GHC.Real.Ratio a) -- Defined in 'GHC.Real'
instance Eq () -- Defined in 'GHC.Classes'
instance (Eq a, Eq b) => Eq (a, b) -- Defined in 'GHC.Classes'
...plus 22 others
Because of the monomorphism restriction, the compiler tries to instantiate the ambiguous type to a single, non-ambiguous type (source: What is the monomorphism restriction?).
So Haskell wants to pick a single instance, but it can't: it finds several and doesn't know which one to choose.
This explains why adding the type signature solves the problem: now the compiler knows which instance to choose.
The "monomorphism restriction" is a counter-intuitive rule in Haskell type inference. If you forget to provide a type signature, sometimes this rule will fill the free type variables with specific types using "type defaulting" rules.

Related

Rigid / skolem type variable: fine as parameter but escaping scope with local where/let statement

Per What are skolems?, this works:
{-# LANGUAGE ExistentialQuantification #-}
data AnyEq = forall a. Eq a => AE a
reflexive :: AnyEq -> Bool
reflexive (AE x) = x == x
But why doesn't this:
reflexive2 :: AnyEq -> Bool
reflexive2 ae = x == x
where
AE x = ae
(or the similar version with let)? It produces errors including:
Couldn't match expected type ‘p’ with actual type ‘a’
because type variable ‘a’ would escape its scope
This (rigid, skolem) type variable is bound by
a pattern with constructor: AE :: forall a. Eq a => a -> AnyEq,
in a pattern binding
at Skolem.hs:74:4-7
Is it possible to make it work by adding some type declaration (a bit like the s :: forall a. I a -> String solution to the withContext example in that question)? I feel I want to add an Eq x somewhere.
My (possibly naïve) understanding of how reflexive works is as follows:
It takes a value of type AnyEq. This has embedded within it a value of any type, provided it is an instance of Eq. This inner type is not apparent in the type signature of reflexive and is unknown when reflexive is compiled.
The binding (AE x) = ae makes x a value of that unknown type, but one known to be an instance of Eq. (So just like the variables x and y in myEq :: Eq a => a -> a -> Bool; myEq x y = x == y.)
The == operator is happy based on the implied class constraint.
I can't think why reflexive2 doesn't do the same, except for things like the "monomorphism restriction" or "mono local binds", which sometimes make things weird. I've tried compiling with all combinations of NoMonomorphismRestriction and NoMonoLocalBinds, but to no avail.
Thanks.
So I think I found the answer in the documentation (of all places!). This states:
You can’t pattern-match on an existentially quantified constructor in a let or where group of bindings
The reason for this restriction is really an implementation one
We’ll see how annoying it is
There's also a request to lift the restriction (though it seems a bit challenging).
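For reference, a workaround that stays within the restriction (a sketch of mine, not from the original thread) is to match the existential in a case expression rather than in a where binding:
{-# LANGUAGE ExistentialQuantification #-}

data AnyEq = forall a. Eq a => AE a

-- Matching the existential constructor in a case alternative (or directly in
-- a function binding, as in reflexive) is allowed; only let/where pattern
-- bindings are ruled out.
reflexive2 :: AnyEq -> Bool
reflexive2 ae = case ae of
  AE x -> x == x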
Personally, now I think I understand it, it's not that annoying. But I do think the error message generated is misleading (since I don't think the problem is scope leakage), so will request it is changed. (It would also be good if the generated error message had a reference back to the documentation, so will push my luck by asking for that too.)
Thanks all for your comments, and please let me know if you think I've got some of this wrong. David.

Significance of scoped type variables standing for type variables and not types

In the GHC documentation for the ScopedTypeVariables extension, the overview states the following as a design principle:
A scoped type variable stands for a type variable, and not for a type. (This is a change from GHC’s earlier design.)
I know the general purpose of the scoped type variables extension, but I don't know the implications of the distinction made here between standing for type variables and standing for types. What is the significance of the difference, from the perspective of users of the language?
The comment above alludes to two designs which approached this decision differently and made different tradeoffs. What was the alternative design, and how does it compare to the one currently implemented?
tl;dr: The documentation says what it says because the old implementation of scoped type variables in GHC was different, and the new documentation (over)emphasizes the contrast between the old behavior and the new behavior. In fact, the scoped type variables you use with the ScopedTypeVariables extension in place are just plain old (rigid) type variables, and these are the same type variables you've been using in regular Haskell type signatures without scoping (even if you didn't realize they were "rigid"). It's true that scoped type variables don't simply "stand for types", but regular unscoped type variables don't simply "stand for types", either.
Longer answer:
First, setting aside the issue of scoped type variables, consider the following:
pluralize :: [a] -> [a]
pluralize x = x ++ "s"
If a, as a type variable, simply "stood for a type", this would be fine. GHC would determine that a stood for the type Char, and the resulting signature [Char] -> [Char] would be determined to be the correct type of pluralize, so there'd be no problem. In fact, if we were inferring the type of:
pluralize x = x ++ "s"
in a plain old Hindley-Milner (HM) type system, this is probably exactly what would happen. As an intermediate step while typing the (++) application, the type checker would assign x the type [a] for a "fresh" HM type variable a, and it would assign pluralize the type [a] -> [a] before unifying [a] with the type of "s" :: [Char] to unify a and Char.
Instead, this is rejected by the GHC type checker because a in this type signature is not an HM-style type variable and so does not simply stand for a type. Instead, it is a rigid (i.e., user-specified) Haskell type variable, and the type checker doesn't permit such a variable to unify with anything other than itself while defining pluralize.
Similarly, the following is rejected:
pairlist :: a -> b -> [a]
pairlist x y = [x,y]
even though, if a and b just stood for types, this would be fine (as it works for any a and b of kind * provided only that a and b are the same type). Instead, it is rejected by the type checker because two rigid Haskell type variables a and b can't unify.
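For comparison, both definitions are accepted once the signatures no longer force a rigid variable to unify with something else (minimal sketches):
-- Accepted: the signature names the concrete element type, so no rigid
-- variable has to become Char.
pluralize :: [Char] -> [Char]
pluralize x = x ++ "s"

-- Accepted: a single rigid variable, so nothing forces two distinct
-- user-specified variables to unify.
pairlist :: a -> a -> [a]
pairlist x y = [x, y]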
Now, you might try to make the case that the issue is not that type variables are "rigid" and can't unify with concrete types (like Char) or with each other, but rather that there is an implicit quantification in Haskell type signatures, so that the signature for pluralize is actually:
pluralize :: forall a . [a] -> [a]
and so when a is determined to "stand for" Char, it is the contradiction with this forall a quantification that triggers the error. The problem with this argument is that both explanations are actually more or less equivalent. It is because Haskell type variables are rigid (i.e., because type signatures in Haskell are implicitly universally quantified) that the types cannot unify (i.e., that the unification contradicts the quantification). However, it turns out that the "rigid type variables" explanation is closer to what's actually happening in the GHC type checker than the "implicit quantification" explanation. That's why the error messages generated by the above definitions refer to inability to match rigid type variables rather than to contradictions with universal type variable quantifications.
Now, let's turn back to the question of scoped type variables. In the olden days, GHC's -fscoped-type-variables extension was implemented quite differently. In particular, for pattern type signatures, you were permitted to write things like the following (taken from the documentation for GHC 6.0):
f :: [Int] -> Int -> Int
f (xs::[a]) (y::a) = (head xs + y) :: a
and the documentation went on to say:
The pattern type signatures on the left hand side of f express the fact that xs must be a list of things of some type a; and that y must have this same type. The type signature on the expression (head xs) [sic] specifies that this expression must have the same type a. There is no requirement that the type named by "a" is in fact a type variable. Indeed, in this case, the type named by "a" is Int. (This is a slight liberalisation from the original rather complex rules, which specified that a pattern-bound type variable should be universally quantified.)
It went on to give some additional examples of use of scoped type variables, such as:
g (x::a) (y::b) = [x,y] -- a unifies with b
k (x::a) True = ... -- a unifies with Int
k (x::Int) False = ...
In 2006, Simon Peyton Jones made a big commit (ac10f840) to add impredicativity to the type system, which also ended up substantially changing the implementation of lexically scoped type variables. The commit text contains a detailed explanation of the change, including the requirements for the new design.
A key design choice was that lexically scoped type variables now named rigid (i.e., user-specified polymorphic) Haskell type variables, rather than being more like the HM-style variables that simply stood for a type and were subject to unification.
That made the above examples (f, g, and k) illegal, because scoped type variables in pattern matches now behaved more like regular rigid type variables.
Soooo... the old design was probably a weird hack that made scoped type variables more like HM type variables and so quite different from "normal" Haskell type variables, and the new system brought them more into line with how unscoped type variables behaved.
However, complicating things further, #duplode's link in the comments references a proposal to partially "undo" this "restriction" in the context of signatures in pattern matches. I think it would be fair to say that the old design, which treated scoped type variables more like a special case, was inferior to the new design which better unified the treatment of scoped and unscoped type variables, and there is no desire to go back to the old implementation. However, the new, simpler implementation had the side effect of being unnecessarily restrictive for pattern signatures which perhaps ought to be treated as a special case where non-rigid type variables are permitted.
I'm adding this answer (to my own question) in order to expand on duplode's reference in the comments. ScopedTypeVariables is currently being changed to allow scoped type variables to stand for types instead of only type variables. The discussion around this change covers the motivation for the new and old designs. It does not, however, address the even earlier design mentioned in the question and in K. A. Buhr's answer.
In the current state, before the forthcoming change, the definition
prefix :: a -> [[a]] -> [[a]]
prefix (x :: b) yss = map xcons yss
where xcons ys = x : ys
is valid (with ScopedTypeVariables), in which b is a newly introduced type variable that stands for the same thing as a. On the other hand, if prefix is specialized to
prefix :: Int -> [[Int]] -> [[Int]]
prefix (x :: b) yss = map xcons yss
where xcons ys = x : ys
then the program is rejected: b is forbidden from standing for Int since Int is not a type variable. Simon Peyton Jones remarked on why it was designed so that b could not stand for Int:
At the time I was worried that it'd be confusing to have a type
variable that was just an alias for Int; that is not a type variable
at all. But in these days of GADTs and type equalities we are all used
to that. We'd make a different choice today.
In the current consensus of the GHC maintainers, the limitation against b standing for Int is viewed as unnatural, especially in light of the possibility of type equalities (a ~ Int) => .... The presence of such constraints blurs the line of what being "bound to a type variable" really means. Should the examples
f1 :: (a ~ Int) => Maybe a -> Int
f1 (Just (x :: b)) = ...
f2 :: (a ~ Int) => Maybe Int -> Int
f2 (Just (x :: a)) = ...
be permitted? Under the new proposal, all four examples above are allowed.
In my own view, the tension ultimately comes from the coexistence of two very different type annotation systems. One of them has the effect of preventing you from giving different names to the same type (for instance, you cannot write (\x -> x) :: a -> b or (\x -> x) :: Int -> b and expect b to be unified with a or Int). The other enables and encourages you to give new names to things (pattern type signatures like foo (x :: b) = ...), a feature which exists to enable you to name types that would otherwise be un-nameable. The leftover question is whether pattern type signatures should allow you to alias types that are already nameable. The answer at its core depends on which of the two precedents you find more compelling.
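As a small sketch of that second system (an example I constructed, not one from the thread): a pattern type signature can introduce a name for a type that is otherwise un-nameable, such as the payload type of an existential.
{-# LANGUAGE ExistentialQuantification, ScopedTypeVariables #-}

data AnyShow = forall a. Show a => AS a

-- The payload's type has no name in scope until the pattern signature
-- (x :: b) introduces one; b can then be reused in local annotations.
describe :: AnyShow -> String
describe (AS (x :: b)) = show (x :: b)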
References:
Joachim Breitner "nomeata" (April-August 2018). Feature request ticket "ScopedTypeVariables could allow more programs", https://ghc.haskell.org/trac/ghc/ticket/15050
nomeata (April-August 2018). Pull request "Allow ScopedTypeVariables to refer to types", https://github.com/ghc-proposals/ghc-proposals/pull/128
Richard A. Eisenberg, Joachim Breitner, and Simon Peyton Jones (June 2018). "Type variables in patterns", https://www.microsoft.com/en-us/research/uploads/prod/2018/06/tyvars-in-pats-haskell18-preprint.pdf, especially sections 3.5 and 4.3
nomeata (August 2018). GHC proposal "Allow ScopedTypeVariables to refer to types", https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0029-scoped-type-variables-types.rst

How are variable names chosen in type signatures inferred by GHC?

When I play with checking types of functions in Haskell with :t, for example like those in my previous question, I tend to get results such as:
Eq a => a -> [a] -> Bool
(Ord a, Num a, Ord a1, Num a1) => a -> a1 -> a
(Num t2, Num t1, Num t, Enum t2, Enum t1, Enum t) => [(t, t1, t2)]
It seems that this is not such a trivial question: how does the Haskell interpreter pick the letters it uses for type variables? When would it choose a rather than t? When would it choose a1 rather than b? Is it important from the programmer's point of view?
The names of the type variables aren't significant. The type:
Eq element => element -> [element] -> Bool
Is exactly the same as:
Eq a => a -> [a] -> Bool
Some names are simply easier to read/remember.
Now, how can an inferencer choose the best names for types?
Disclaimer: I'm absolutely not a GHC developer. However, I'm working on a type inferencer for Haskell for my bachelor thesis.
During inference, the names chosen for the variables probably aren't that readable. In fact, they are almost surely something along the lines of _N, with N a number, or aN, with N a number.
This is due to the fact that you often have to "refresh" type variables in order to complete inferencing, so you need a fast way to create new names. And using numbered variables is pretty straightforward for this purpose.
The names displayed when inference is completed can be "pretty printed". The inferencer can rename the variables to use a, b, c and so on instead of _1, _2 etc.
The trick is that most operations have explicit type signatures. Some definitions require you to quantify some type variables (class, data, and instance declarations, for example).
All these names that the user explicitly provides can be used to display the type in a better way.
When inferencing you can somehow keep track of where the fresh type variables came from, in order to be able to rename them with something more sensible when displaying them to the user.
Another option is to refresh variables by adding a number to them. For example, a fresh type for return could be Monad m0 => a0 -> m0 a0 (here we know to use m and a simply because the class definition for Monad uses those names). When inferencing is finished you can get rid of the numbers and obtain the pretty names.
In general the inferencer will try to use names that were explicitly provided through signatures. If such a name was already used it might decide to add a number instead of using a different name (e.g. use b1 instead of c if b was already bound).
There are probably some other ad hoc rules. For example, the fact that tuple elements get names like t, t1, t2, t3, etc. is probably handled by a custom rule; after all, t doesn't even appear in the signature of (,,).
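As a toy sketch of the pretty-printing step described above (purely illustrative, not how GHC actually does it): rename machine-generated variables such as _1, _2 to a, b, ... in order of first appearance, leaving user-provided names alone.
import Data.List (nub)

prettyVars :: [String] -> [String]
prettyVars vs = map rename vs
  where
    -- machine-generated names, in order of first appearance
    fresh    = nub [v | v <- vs, take 1 v == "_"]
    -- map them to "a", "b", "c", ...
    table    = zip fresh (map (: []) ['a' ..])
    rename v = maybe v id (lookup v table)
For example, prettyVars ["_2", "->", "_1", "_2"] gives ["a", "->", "b", "a"].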
How does GHCi pick names for type variables? explains how many of these variable names come about. As Ganesh Sittampalam pointed out in a comment, something strange seems to be happening with arithmetic sequences. Both the Haskell 98 report and the Haskell 2010 report indicate that
[e1..] = enumFrom e1
GHCi, however, gives the following:
Prelude> :t [undefined..]
[undefined..] :: Enum t => [t]
Prelude> :t enumFrom undefined
enumFrom undefined :: Enum a => [a]
This makes it clear that the weird behavior has nothing to do with the Enum class itself, but rather comes in from some stage in translating the syntactic sequence to the enumFrom form. I wondered if maybe GHC wasn't really using that translation, but it really is:
{-# LANGUAGE NoMonomorphismRestriction #-}
module X (aoeu,htns) where
aoeu = [undefined..]
htns = enumFrom undefined
compiled using ghc -ddump-simpl enumlit.hs gives
X.htns :: forall a_aiD. GHC.Enum.Enum a_aiD => [a_aiD]
[GblId, Arity=1]
X.htns =
  \ (@ a_aiG) ($dEnum_aiH :: GHC.Enum.Enum a_aiG) ->
    GHC.Enum.enumFrom @ a_aiG $dEnum_aiH (GHC.Err.undefined @ a_aiG)
X.aoeu :: forall t_aiS. GHC.Enum.Enum t_aiS => [t_aiS]
[GblId, Arity=1]
X.aoeu =
  \ (@ t_aiV) ($dEnum_aiW :: GHC.Enum.Enum t_aiV) ->
    GHC.Enum.enumFrom @ t_aiV $dEnum_aiW (GHC.Err.undefined @ t_aiV)
so the only difference between these two representations is the assigned type variable name. I don't know enough about how GHC works to know where that t comes from, but at least I've narrowed it down!
Ørjan Johansen has noted in a comment that something similar seems to happen with function definitions and lambda abstractions.
Prelude> :t \x -> x
\x -> x :: t -> t
but
Prelude> :t map (\x->x) $ undefined
map (\x->x) $ undefined :: [b]
In the latter case, the type b comes from an explicit type signature given to map.
Are you familiar with the concepts of alpha equivalence and alpha substitution? These capture the notion that, for example, both of the following are completely equivalent and interconvertible (in certain circumstances) even though they differ:
\x -> (x, x)
\y -> (y, y)
The same concept can be extended to the level of types and type variables (see "System F" for further reading). Haskell in fact has a notion of "lambdas at the type level" for binding type variables, but it's hard to see because they're implicit by default. However, you can make them explicit by using the ExplicitForAll extension, and play around with explicitly binding your type variables:
ghci> :set -XExplicitForAll
ghci> let f x = x; f :: forall a. a -> a
In the second line, I use the forall keyword to introduce a new type variable, which is then used in a type.
In other words, it doesn't matter whether you choose a or t in your example, as long as the type expressions satisfy alpha-equivalence. Choosing type variable names so as to maximize human convenience is an entirely different topic, and probably far more complicated!
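A tiny sketch of alpha equivalence at the type level: the two signatures below differ only in the name of the bound type variable, so either definition can be used where the other's type is expected.
id1 :: a -> a
id1 x = x

-- b -> b is alpha-equivalent to a -> a, so id1 is accepted at this type.
id2 :: b -> b
id2 = id1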

Haskell function composition confusion

I'm trying to learn Haskell and I've been going over chapters 6 and 7 of Learn You a Haskell. Why don't the following two function definitions give the same result? I thought (f . g) x = f (g x)?
Def 1
let { t :: Eq x => [x] -> Int; t xs = length (nub xs) }
t [1]
1
Def 2
let t = length . nub
t [1]
<interactive>:78:4:
No instance for (Num ()) arising from the literal `1'
Possible fix: add an instance declaration for (Num ())
In the expression: 1
In the first argument of `t', namely `[1]'
In the expression: t [1]
The problem is with your type signatures and the dreaded monomorphism restriction. You have a type signature in your first version but not in your second; ironically, it would have worked the other way around!
Try this:
λ>let t :: Eq x => [x] -> Int; t = length . nub
λ>t [1]
1
The monomorphism restriction forces things that don't look like functions to have a monomorphic type unless they have an explicit type signature. The type you want for t is polymorphic: note the type variable x. However, with the monomorphism restriction, x gets "defaulted" to (). Check this out:
λ>let t = length . nub
λ>:t t
t :: [()] -> Int
This is very different from the version with the type signature above!
The compiler chooses () for the monomorphic type because of defaulting. Defaulting is just the process Haskell uses to choose a type from a typeclass. All this really means is that, in the repl, Haskell will try using the () type if it encounters an ambiguous type variable in the Show, Eq or Ord classes. Yes, this is basically arbitrary, but it's pretty handy for playing around without having to write type signatures everywhere! Also, the defaulting rules are more conservative in files, so this is basically just something that happens in GHCi.
In fact, defaulting to () seems to mostly be a hack to make printf work correctly in GHCi! It's an obscure Haskell curio, but I'd ignore it in practice.
Apart from including a type signature, you could also just turn the monomorphism restriction off in the repl:
λ>:set -XNoMonomorphismRestriction
This is fine in GHCi, but I would not use it in real modules--instead, make sure to always include a type signature for top-level definitions inside files.
EDIT: Ever since GHC 7.8.1, the monomorphism restriction is turned off by default in GHCi. This means that all this code would work fine with a recent version of GHCi and you do not need to set the flag explicitly. It can still be an issue for values defined in a file with no type signature, however.
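In a source file, then, the dependable fix is simply a top-level signature (a minimal sketch; note the import that nub needs):
import Data.List (nub)

t :: Eq a => [a] -> Int
t = length . nub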
This is another instance of the "Dreaded" Monomorphism Restriction which leads GHCi to infer a monomorphic type for the composed function. You can disable it in GHCi with
> :set -XNoMonomorphismRestriction

Trouble explicitly specifying a type using a type variable [duplicate]

This question already has answers here:
How to reuse a type variable in an inner type declaration
Arrays with rigid variable
The following is a simplified example of what I'm trying to do...
test :: Bounded a => Maybe a -> a
test (Just x) = x
test Nothing = (maxBound :: a)
The maxBound function is polymorphic - one of the methods of the Bounded typeclass. Because of that, when I use it I need to specify which version of Bounded I want. In this simplified example the type could be inferred from the context, but in the real problem it can't, so the explicit type is necessary there, though not really here.
My function is polymorphic too. I can't specify a concrete type directly, only a type variable. The appropriate type variable is a, for which I have specified the Bounded a constraint.
Compiling this, I get the following error...
temp.hs:4:18:
Could not deduce (Bounded a1) arising from a use of `maxBound'
from the context (Bounded a)
bound by the type signature for test :: Bounded a => Maybe a -> a
at temp.hs:2:9-33
Possible fix:
add (Bounded a1) to the context of
an expression type signature: a1
or the type signature for test :: Bounded a => Maybe a -> a
In the expression: (maxBound :: a)
In an equation for `test': test Nothing = (maxBound :: a)
As far as I can tell, this means that the a in maxBound :: a was considered separate from the a that I intended (the type variable in the signature for the function). a1 is the new name that GHC invented to disambiguate the two a variables which it considers separate. GHC considers the a in maxBound :: a to indicate that it can use any type here (!) and therefore complains because "any type" isn't restrictive enough.
This is using GHC version 7.6.3 as supplied in the (I think) most recent Haskell Platform.
I've had similar issues before, but always mixed with other issues, so the problem went away once I fixed those other problems. I dismissed it as being caused by the other issues and forgot about it. No such luxury here - that minimal example above isn't the real problem, but it depends on a solution to the exact same problem.
So... why is GHC treating the a in maxBound :: a as independent of the type variable a for the whole function? And how do I fix this to select the correct version of maxBound?
The main problem is, in fact, that GHC reads the function as
test :: forall a. Bounded a => Maybe a -> a
test (Just x) = x
test Nothing = (maxBound :: forall a. a)
You need the ScopedTypeVariables extension and to rewrite the function as:
{-# LANGUAGE ScopedTypeVariables #-}
test :: forall a. Bounded a => Maybe a -> a
test (Just x) = x
test Nothing = (maxBound :: a)
Now we see that the inner a refers to the outer a.
UPDATE
If you have already written the signature, you don't need any extension.
The following function works fine!
test :: Bounded a => Maybe a -> a
test (Just x) = x
test Nothing = maxBound
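A quick check of either version in GHCi (the maxBound value shown assumes a typical 64-bit Int; it is platform-dependent):
Prelude> test (Just 'x')
'x'
Prelude> test (Nothing :: Maybe Int)
9223372036854775807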

Resources