GHCi can't infer Eq class at compile time, but does fine at runtime?

Sorry for the confusing title. I am writing a parser combinator library in Haskell for fun. Here are all (I think!) the relevant type annotations and definitions:
data Parser a = Parser (State -> Reply a)
parse :: Parser a -> [Char] -> Either ParseError a
nil :: Parser [a]
nil = Parser $ \state -> Ok [] state
Basically, the parse function applies the function wrapped by a Parser to the current state, and if the parse is successful, wraps the result in an Either. The nil parser takes a state and returns a successful parse of the empty list. So we should have,
parse nil "dog" == Right []
In fact, if I just load the module where all these live, then it compiles and this evaluates to True.
I'm actually trying to run some QuickCheck tests on the library, though, so I wrote this:
import Parsimony
import Test.QuickCheck
prop_nil :: [Char] -> Bool
prop_nil xs = parse nil xs == Right []
This fails to compile! It throws the following error:
No instance for (Eq a0) arising from a use of `=='
The type variable `a0' is ambiguous
At this point I am mostly just confused why an expression could work fine when evaluated, but fail to compile in a parametrized version.

Since nil is polymorphic and Right [] is also polymorphic, GHC ends up with an expression of type Bool but with an unbound type variable in the middle. GHC keels over and dies since it doesn't know what concrete type to use. GHCi, for better or worse, will infer [()] or something like that because of its defaulting rules. This is one of GHCi's weird quirks: it will automagically default type variables.
To fix this, simply force the binding of the type variable manually:
-- It's important that whatever you force it to is actually comparable,
-- e.g. there should be an instance like
--   instance Eq ParseError
-- Otherwise you're kinda stuck.
prop_nil xs = parse nil xs == (Right [] :: Either ParseError String)
PS I like the name Parsimony for a parser library, good luck!

The problem is that the type of nil is Parser [a]. So parse nil xs is of type Either ParseError [a]. Right [] is most generally of type Either l [a]; comparing it to parse nil xs forces the l to be ParseError, but the type in the list is still completely unconstrained. Without any more context it remains fully polymorphic; that a isn't necessarily a member of the Eq type class, and even if it were, there's no way to know which instance to use for the implementation of ==, so it isn't valid to invoke == on those two terms.
In a realistic program, you'd likely be saved from this by the fact that you'd use the result for something, which would force that particular occurrence to be consistent with whatever you use it for. That would probably be some concrete type which has an implementation of Eq.
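For example, here is a minimal sketch (helper name hypothetical, assuming the Parsimony definitions from the question are in scope): because the caller's result type is [Int], inference fixes the parser's element type and no ambiguity arises.
-- The return type [Int] pins down the parser's element type.
parseOrEmpty :: String -> [Int]
parseOrEmpty input = either (const []) id (parse nil input)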
When you talk about loading the module, I presume you mean into the GHCi interpreter. GHCi adds some additional defaulting rules. In particular it will tend to default unconstrained type variables (which aren't the type of a top-level function) to (), so that it doesn't have to complain about ambiguous type variables quite so often.
An interactive session in GHCi tends to encounter ambiguous type variables far more often than realistic modules compiled in full, because it has to compile small snippets mostly independently. GHCi has extended defaulting rules to make those work a lot more often (though this often only delays the error to the next reference when the user was expecting a different type, and the difference between GHCi and GHC often causes confusion).
Test snippets can suffer from a similar problem. If you're testing polymorphic functions, you often don't constrain some of the types sufficiently for type inference to work, as you would in real purposeful usage of the function. But without the extended defaulting rules of GHCi, this problem manifests as an actual ambiguous-type error at the location of the problem, rather than being masked by an arbitrarily chosen type.
To fix this, you just need to add a type annotation to fix the type of the list. Either declare the full type of parse nil xs or Right [], or just declare the type of the empty list literal on the right-hand side. Something like this should do the trick:
prop_nil :: [Char] -> Bool
prop_nil xs = parse nil xs == Right ([] :: [Int])

Another way would be to avoid the Eq constraint in the first place:
prop_nil xs = either (const False) null (parse nil xs)
or, more explicitly:
prop_nil xs = case parse nil xs of
  Right [] -> True
  _        -> False
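For completeness, a quick sketch of actually running the property (assuming the Parsimony and Test.QuickCheck imports shown in the question):
main :: IO ()
main = quickCheck prop_nil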

Related

Why such different behaviour with `Ambiguous type..` error (in ghci)?

This example works in GHCi. Load this file:
import Safe
t1 = tailMay []
and type in GHCi:
> print t1
Nothing
But if we add an analogous definition to the previous file, it doesn't work:
import Safe
t1 = tailMay []
t2 = print $ tailMay []
with this error:
* Ambiguous type variable `a0' arising from a use of `print'
prevents the constraint `(Show a0)' from being solved.
Probable fix: use a type annotation to specify what `a0' should be.
These potential instances exist:
instance Show Ordering -- Defined in `GHC.Show'
instance Show Integer -- Defined in `GHC.Show'
instance Show a => Show (Maybe a) -- Defined in `GHC.Show'
...plus 22 others
Here is a third sample for GHC with the same error:
import Safe
t1 = tailMay
main = do
  print $ t1 []
  print $ t1 [1,2,3]
Why? And how can I fix the second sample without an explicit type annotation?
The issue here is that tailMay [] can generate an output of type Maybe [a] for any a, while print can take an input of type Maybe [a] for any a (in class Show).
When you compose a "universal producer" and a "universal consumer", the compiler has no idea about which type a to pick -- that could be any type in class Show. The choice of a could matter since, in principle, print (Nothing :: Maybe [Int]) could print something different from print (Nothing :: Maybe [Bool]). In this case, the printed output would be the same, but only because we are lucky.
For instance, print ([] :: [Int]) and print ([] :: [Char]) will print different messages, so print [] is ambiguous. Hence, GHC rejects it and requires an explicit type annotation (or a type application such as @Int, using the TypeApplications extension).
Why, then, is such ambiguity accepted in GHCi? Well, GHCi is meant for quick experiments, and as a convenience feature it will try hard to default these ambiguous type variables. This is done using the extended defaulting rules, which can in principle be turned on in compiled code as well via the ExtendedDefaultRules extension.
This is, however, not recommended since sometimes the defaulting rule can choose some unintended type, making the code compile but with an unwanted runtime behavior.
The common solution to this issue is using an annotation (or @type), because it provides more control to the programmer, makes the code easier to read, and avoids surprises.
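As a sketch of those fixes applied to the second sample (assuming the Safe package's tailMay :: [a] -> Maybe [a]):
{-# LANGUAGE TypeApplications #-}
import Safe (tailMay)

t2a :: IO ()
t2a = print (tailMay ([] :: [Int]))  -- an annotation pins the element type

t2b :: IO ()
t2b = print (tailMay @Int [])        -- a type application does the same
Enabling ExtendedDefaultRules in the module should also make it compile, as discussed above, at the cost of possibly defaulting to a type you did not intend.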

Why does functor composition on an empty list return a Show error?

When calling the following, GHCi returns an error:
(fmap.fmap) (+1) []
Ambiguous type variables ‘f0’, ‘b0’ arising from a use of ‘print’ prevents the constraint ‘(Show (f0 b0))’ from being solved.
From what I understand, this is because the type of my expression is (Num b, Functor f) => [f b] where f is the ambiguous type.
However, the Functor instance for lists defines fmap as map, and the definition of map ignores the function argument when the second argument is [] and simply returns []. This should mean that my expression should simply return [] regardless of how many fmap compositions I apply, and a call to show [] should go through. Why is it that I see the error then?
It is true that your function will always return [], but typeclass dispatch (which happens at compile-time, rather than run-time) must be based on the type of the argument to show. The Show instance for [a] requires that Show a also be resolved (instance Show a => Show [a])---since there are many values of type [a] which do contain elements---and since the type of the list elements (all 0 of them) is ambiguous, the Show constraint can't be resolved.
This might lead you to ask why show [] for example does not have the same issue, since [] :: [a]. The answer here is that GHCi has some special Extended Default Rules heuristics, which apply in certain simple cases, in order to make working at the prompt more pleasant. If you :set -XNoExtendedDefaultRules you can see that show [] will have this same behavior. In your case, since the element type of the list is f0 b0 rather than a single type variable, the linked extended defaulting rules do not apply, and so the list element type still contains ambiguous type variables.
You can see that this is the issue by resolving some of the type constraints yourself, say by using -XTypeApplications. Even resolving the Functor constraint is enough to make the normal Haskell type defaulting rules apply again: (fmap . (fmap @[])) (+1) [] does indeed print [] at the GHCi prompt.
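A small sketch of the annotation route (assuming a plain module rather than extended defaulting): once the result type is fully specified, both ambiguous variables disappear and Show resolves normally.
example :: [[Int]]
example = (fmap . fmap) (+1) []   -- f0 ~ [], b0 ~ Int; show example == "[]"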

How does Haskell type-check infinite recursive values?

Define this data type:
data NaturalNumber = Zero | S NaturalNumber
  deriving (Show)
In Haskell (compiled using GHC), this code will run without warning or error:
infinity = S infinity
inf1 = S inf2
inf2 = S inf1
So both recursive and mutually recursive infinitely deep values pass the type check.
However, the following code gives an error:
j = S 'h'
The error states Couldn't match expected type ‘NaturalNumber’ with actual type ‘Char’. The (same) error persists even if I set
j = S (S (S (S ... (S 'h')...)))
with a hundred or so nested S's.
How can Haskell tell that infinity is a valid member of NaturalNumber but j is not?
Interestingly, it also allows:
bottom = bottom
k = S bottom
Does Haskell merely try to prove the incorrectness of a program and, if it fails to do so, allow it? Or is Haskell's type system not Turing complete, so that if it allows the program then the program is provably (at the type level) correct?
(If the type system (in the formal semantics of Haskell, instead of only the type checker) is Turing complete, then it will either fail to realize some correctly typed programs are correct or some incorrectly typed programs are incorrect, due to the undecidability of the halting problem.)
Well
S :: NaturalNumber -> NaturalNumber
In
infinity = S infinity
We start by assuming nothing: we assign infinity some unsolved type _a and try to figure out what it is. We know that we have applied S to infinity, so _a must be whatever is on the left side of the arrow in the constructor’s type, which is NaturalNumber. We know that infinity is the result of an application of S, so infinity :: NaturalNumber, again (if we got two conflicting types for this definition, we’d have to emit a type error).
Similar reasoning holds for the mutually recursive definitions. inf1 must be a NaturalNumber because it appears as an argument to S in inf2; inf2 must be a NaturalNumber because it is the result of S; etc.
The general algorithm is to assign definitions unknown types (notable exceptions are literals and constructors), and then to create constraints on those types by seeing how every definition is used. E.g. this must be some form of list because it’s being reversed, and this must be an Int because it’s being used to look up a value from an IntMap, etc.
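As an illustrative sketch of the kind of constraints being described (names hypothetical, using Data.IntMap from the containers package):
import qualified Data.IntMap as IM

lookupReversed :: IM.IntMap v -> [Int] -> [Maybe v]
lookupReversed m ks = map (`IM.lookup` m) (reverse ks)
-- ks must be a list because it is reversed, and its elements must be Int
-- because they are used as keys of an IntMap.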
In the case of
oops = S 'a'
'a' :: Char as it’s a literal, but, also, we must have 'a' :: NaturalNumber because it’s used as an argument to S. We get the obviously bogus constraint that the type of the literal must both be Char and NaturalNumber, which causes a type error.
And in
bottom = bottom
We start with bottom :: _a. The only constraint is _a ~ _a, because a value of type _a (bottom) is being used where a value of type _a is expected (on the RHS of the definition of bottom). Since there is nothing to further constrain the type, the unsolved type variable is generalized: it gets bound by a universal quantifier to produce bottom :: forall a. a.
Note how both uses of bottom above have the same type (_a) while inferring the type of bottom. This breaks polymorphic recursion: every occurrence of a value within its definition is taken to be of the same type as the definition itself. E.g.
-- perfectly balanced binary trees
data Binary a = Leaf a | Branch (Binary (a, a))
-- headB :: _a -> _r
headB (Leaf x) = x -- _a ~ Binary _r; headB :: Binary _r -> _r
headB (Branch bin) = fst (headB bin)
-- recursive call has type headB :: Binary _r -> _r
-- but bin :: Binary (_r, _r); mismatch
So you need a type signature:
headB :: {-forall a.-} Binary a -> a
headB (Leaf x) = x
headB (Branch bin) = fst (headB {-#(a, a)-} bin)
-- knowing exactly what headB's signature is allows for polymorphic recursion
So: when something doesn't have a type signature, the type checker tries to assign it a type, and if it comes across a bogus constraint on its way, it rejects the program. When something has a type signature, the type checker descends into it to make sure it's correct (tries to prove it wrong, if you prefer to think of it that way).
Haskell’s type system is not Turing complete because there are heavy syntactic restrictions to prevent e.g. type lambdas (without language extensions), but it doesn’t suffice to make sure all programs run to completion without error because it still allows bottoms (not to mention all the unsafe functions). It provides the weaker guarantee that, if a program runs to completion without using an unsafe function, it will remain type-correct. Under GHC, with sufficient language extensions, the type system does become Turing complete. I don't think it allows ill-typed programs through; I think the most you can do is throw the compiler into an infinite loop.

Haskell function composition confusion

I'm trying to learn Haskell and I've been going over chapters 6 and 7 of Learn You a Haskell. Why don't the following two function definitions give the same result? I thought (f . g) x = f (g (x))?
Def 1
let { t :: Eq x => [x] -> Int; t xs = length (nub xs) }
t [1]
1
Def 2
let t = length . nub
t [1]
<interactive>:78:4:
No instance for (Num ()) arising from the literal `1'
Possible fix: add an instance declaration for (Num ())
In the expression: 1
In the first argument of `t', namely `[1]'
In the expression: t [1]
The problem is with your type signatures and the dreaded monomorphism restriction. You have a type signature in your first version but not in your second; ironically, it would have worked the other way around!
Try this:
λ>let t :: Eq x => [x] -> Int; t = length . nub
λ>t [1]
1
The monomorphism restriction forces things that don't look like functions to have a monomorphic type unless they have an explicit type signature. The type you want for t is polymorphic: note the type variable x. However, with the monomorphism restriction, x gets "defaulted" to (). Check this out:
λ>let t = length . nub
λ>:t t
t :: [()] -> Int
This is very different from the version with the type signature above!
The compiler chooses () for the monomorphic type because of defaulting. Defaulting is just the process Haskell uses to choose a type from a typeclass. All this really means is that, in the repl, Haskell will try using the () type if it encounters an ambiguous type variable in the Show, Eq or Ord classes. Yes, this is basically arbitrary, but it's pretty handy for playing around without having to write type signatures everywhere! Also, the defaulting rules are more conservative in files, so this is basically just something that happens in GHCi.
In fact, defaulting to () seems to mostly be a hack to make printf work correctly in GHCi! It's an obscure Haskell curio, but I'd ignore it in practice.
Apart from including a type signature, you could also just turn the monomorphism restriction off in the repl:
λ>:set -XNoMonomorphismRestriction
This is fine in GHCi, but I would not use it in real modules--instead, make sure to always include a type signature for top-level definitions inside files.
EDIT: Ever since GHC 7.8.1, the monomorphism restriction is turned off by default in GHCi. This means that all this code would work fine with a recent version of GHCi and you do not need to set the flag explicitly. It can still be an issue for values defined in a file with no type signature, however.
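For a compiled module, the usual shape looks something like this sketch (module and function names hypothetical):
module CountDistinct where

import Data.List (nub)

-- With an explicit polymorphic signature, the monomorphism restriction no
-- longer applies, even though the definition is point-free.
countDistinct :: Eq a => [a] -> Int
countDistinct = length . nub
Without the signature (and without NoMonomorphismRestriction), GHC cannot generalise the Eq constraint, so you get either an ambiguity error or an unexpectedly monomorphic type, depending on how the definition is used elsewhere in the module.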
This is another instance of the "Dreaded" Monomorphism Restriction which leads GHCi to infer a monomorphic type for the composed function. You can disable it in GHCi with
> :set -XNoMonomorphismRestriction

Why can't I use record selectors with an existentially quantified type?

When using existential types, we have to use pattern-matching syntax to extract the foralled value. We can't use the ordinary record selectors as functions. GHC reports an error and suggests using pattern matching with this definition of yALL:
{-# LANGUAGE ExistentialQuantification #-}
data ALL = forall a. Show a => ALL { theA :: a }
-- data ok
xALL :: ALL -> String
xALL (ALL a) = show a
-- pattern matching ok
-- ABOVE: heaven
-- BELOW: hell
yALL :: ALL -> String
yALL all = show $ theA all
-- record selector failed
forall.hs:11:19:
Cannot use record selector `theA' as a function due to escaped type variables
Probable fix: use pattern-matching syntax instead
In the second argument of `($)', namely `theA all'
In the expression: show $ theA all
In an equation for `yALL': yALL all = show $ theA all
Some of my data constructors take more than 5 fields. It's hard to maintain the code if I use pattern-matching:
func1 (BigData _ _ _ _ elemx _ _) = func2 elemx
Is there a good method to make code like that maintainable or to wrap it up so that I can use some kind of selectors?
Existential types work in a more elaborate manner than regular types. GHC is (rightly) forbidding you from using theA as a function. But imagine there was no such prohibition. What type would that function have? It would have to be something like this:
-- Not a real type signature!
theA :: ALL -> t -- for a fresh type t on each use of theA; t is an instance of Show
To put it very crudely, forall makes GHC "forget" the type of the constructor's arguments; all that the type system knows is that this type is an instance of Show. So when you try to extract the value of the constructor's argument, there is no way to recover the original type.
What GHC does, behind the scenes, is what the comment to the fake type signature above says—each time you pattern match against the ALL constructor, the variable bound to the constructor's value is assigned a unique type that's guaranteed to be different from every other type. Take for example this code:
case ALL "foo" of
  ALL x -> show x
The variable x gets a unique type that is distinct from every other type in the program and cannot be matched with any type variable. These unique types are not allowed to escape to the top level—which is the reason why theA cannot be used as a function.
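If you want a selector-like interface anyway, one common workaround is a rank-2 eliminator that takes the consumer inside the pattern match. A sketch (the helper name withTheA is hypothetical, and the ALL type is repeated from above for self-containment):
{-# LANGUAGE ExistentialQuantification, RankNTypes #-}

data ALL = forall a. Show a => ALL { theA :: a }

-- Instead of extracting the field, pass in a function that works for any
-- Show instance; the existential type never escapes the pattern match.
withTheA :: (forall a. Show a => a -> r) -> ALL -> r
withTheA f (ALL a) = f a

yALL :: ALL -> String
yALL = withTheA show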
You can use record syntax in pattern matching,
func1 BigData{ someField = elemx } = func2 elemx
works and is much less typing for huge types.
