'Referencing' typeclass functions - haskell

I'm a beginner and I'm trying to use Hoed to trace Haskell evaluations, because maybe it will further help my learning process.
I saw in their examples code like this
isEven :: Int -> Bool
isEven = observe "isEven" isEven'
isEven' n = mod2 n == 0
I was thinking how could I observe in order to trace an instance defined function like >>= for example.
I wrote something like
bind' = observe "bind'" (>>=)
and of course I've got an error
* Ambiguous type variable 'm0' arising from a use of '>>='
prevents the constraint '(Monad m0)' from being solved.
Relevant bindings include
bind' :: m0 a0 -> (a0 -> m0 b0) -> m0 b0 (bound at my.hs:46:1)
Probable fix: use a type annotation to specify what 'm0' should be.
These potential instances exist:
...
Should I / How could I use a type annotation in order to specify which Monad instance's (e.g. Reader, State etc.) >>= function

It looks like you have found the infamous MonomorphismRestriction. More info. The links do a great job of explaining what the MonomorphismRestriction is and how it works.
You're not wrong to expect that writing bind' with no signature should "just work". However, sometimes the compiler needs a bit of help. In short, due to the MonomorphismRestriction, GHC tries to take the nominally polymorphic signature of bind' :: Monad m => m a -> (a -> m b) -> m b, and make it less polymorphic by instantiating some of the type variables.
In your case, it looks like the compiler wants to make bind' only work for one specific Monad m. Without your real code, I can't say for sure, but consider this example:
import Debug.Trace
main :: IO ()
main = (>>=) (return "hello") print
bind' = trace "bind" (>>=)
The compiler produces an error similar to yours: Ambiguous type variable m0
However, if you use bind':
import Debug.Trace
main :: IO ()
main = bind' (return "hello") print
bind' = trace "bind" (>>=)
no error! That's because GHC is inferring that m should be IO since bind' is used in the IO monad.
Alternatively, you can tell GHC to turn off the MonomorphismRestriction:
{-# LANGUAGE NoMonomorphismRestriction #-}
import Debug.Trace
main :: IO ()
main = (>>=) (return "hello") print
bind' = trace "bind" (>>=)
and it compiles just fine!

Related

Unwrapping the STT monad in a transformer stack?

This question is apparently related to the problem discussed here and here. Unfortunately, my requirement is slightly different to those questions, and the answers given don't apply to me. I also don't really understand why runST fails to type check in these cases, which doesn't help.
My problem is this, I have one section of code that uses one monad stack, or rather a single monad:
import Control.Monad.Except
type KErr a = Except KindError a
Another section of code needs to integrate with this that wraps this within the STT monad:
type RunM s a = STT s (Except KindError) a
At the interface between these sections, I clearly need to wrap and unwrap the outer layer. I've got the following function to work in the KErr -> RunM direction:
kerrToRun :: KErr a -> RunM s a
kerrToRun e = either throwError return $ runExcept e
but for some reason I just can't get the converse to type check:
runToKErr :: RunM s a -> KErr a
runToKErr r = runST r
I'm working on the assumption that as the RunM's inner monad has the same structure as KErr, I can just return it once I've unwrapped the STT layer, but I don't seem to be able to get that far, as runST complains about its type arguments:
src/KindLang/Runtime/Eval.hs:18:21:
Couldn't match type ‘s’ with ‘s1’
‘s’ is a rigid type variable bound by
the type signature for runToKErr :: RunM s a -> KErr a
at src/KindLang/Runtime/Eval.hs:17:14
‘s1’ is a rigid type variable bound by
a type expected by the context:
STT s1 (ExceptT KindError Data.Functor.Identity.Identity) a
at src/KindLang/Runtime/Eval.hs:18:15
Expected type: STT
s1 (ExceptT KindError Data.Functor.Identity.Identity) a
Actual type: RunM s a
I've also tried:
runToKErr r = either throwError return $ runExcept $ runST r
in order to more strongly isolate runST from its expected return type, in case that was the cause of the issue, but the results are the same.
Where is this s1 type coming from, and how do I persuade ghc that it's the same type as s?
(The below talks about ST s a but applies just the same as STT s m a; I've just avoided the unnecessary complication of talking about the transformer version below)
The problem you are seeing is that runST has type (forall s. ST s a) -> a to isolate any potential (STRef-changing) effects of the computation from the outside, pure world. The whole point of the s phantom type that all ST computations, STRefs, etc. are tagged with, is to track which "ST-domain" they belong to; and the type of runST ensures that nothing can pass between domains.
You can write runToKErr by enforcing the same invariant:
{-# language Rank2Types #-}
runToKErr :: (forall s. RunM s a) -> KErr a
runToKErr = runST
(of course, you might realize further down the road that this restriction is too strong for the program you hope to write; at that point you'll need to lose hope, sorry, I mean you'll need to redesign your program.)
As for the error message, the reason you can't "convince the type checker that s1 and s are the same type" is because if I pass you an ST s a for a given choice of s and a, that is not the same as giving you something which allows you to choose your own s. GHC chose s1 (a Skolemized variable) as s and so tries to unify ST s a with ST s1 a

When are type signatures necessary in Haskell?

Many introductory texts will tell you that in Haskell type signatures are "almost always" optional. Can anybody quantify the "almost" part?
As far as I can tell, the only time you need an explicit signature is to disambiguate type classes. (The canonical example being read . show.) Are there other cases I haven't thought of, or is this it?
(I'm aware that if you go beyond Haskell 2010 there are plenty for exceptions. For example, GHC will never infer rank-N types. But rank-N types are a language extension, not part of the official standard [yet].)
Polymorphic recursion needs type annotations, in general.
f :: (a -> a) -> (a -> b) -> Int -> a -> b
f f1 g n x =
if n == (0 :: Int)
then g x
else f f1 (\z h -> g (h z)) (n-1) x f1
(Credit: Patrick Cousot)
Note how the recursive call looks badly typed (!): it calls itself with five arguments, despite f having only four! Then remember that b can be instantiated with c -> d, which causes an extra argument to appear.
The above contrived example computes
f f1 g n x = g (f1 (f1 (f1 ... (f1 x))))
where f1 is applied n times. Of course, there is a much simpler way to write an equivalent program.
Monomorphism restriction
If you have MonomorphismRestriction enabled, then sometimes you will need to add a type signature to get the most general type:
{-# LANGUAGE MonomorphismRestriction #-}
-- myPrint :: Show a => a -> IO ()
myPrint = print
main = do
myPrint ()
myPrint "hello"
This will fail because myPrint is monomorphic. You would need to uncomment the type signature to make it work, or disable MonomorphismRestriction.
Phantom constraints
When you put a polymorphic value with a constraint into a tuple, the tuple itself becomes polymorphic and has the same constraint:
myValue :: Read a => a
myValue = read "0"
myTuple :: Read a => (a, String)
myTuple = (myValue, "hello")
We know that the constraint affects the first part of the tuple but does not affect the second part. The type system doesn't know that, unfortunately, and will complain if you try to do this:
myString = snd myTuple
Even though intuitively one would expect myString to be just a String, the type checker needs to specialize the type variable a and figure out whether the constraint is actually satisfied. In order to make this expression work, one would need to annotate the type of either snd or myTuple:
myString = snd (myTuple :: ((), String))
In Haskell, as I'm sure you know, types are inferred. In other words, the compiler works out what type you want.
However, in Haskell, there are also polymorphic typeclasses, with functions that act in different ways depending on the return type. Here's an example of the Monad class, though I haven't defined everything:
class Monad m where
return :: a -> m a
(>>=) :: m a -> (a -> m b) -> m b
fail :: String -> m a
We're given a lot of functions with just type signatures. Our job is to make instance declarations for different types that can be treated as Monads, like Maybe t or [t].
Have a look at this code - it won't work in the way we might expect:
return 7
That's a function from the Monad class, but because there's more than one Monad, we have to specify what return value/type we want, or it automatically becomes an IO Monad. So:
return 7 :: Maybe Int
-- Will return...
Just 7
return 6 :: [Int]
-- Will return...
[6]
This is because [t] and Maybe have both been defined in the Monad type class.
Here's another example, this time from the random typeclass. This code throws an error:
random (mkStdGen 100)
Because random returns something in the Random class, we'll have to define what type we want to return, with a StdGen object tupelo with whatever value we want:
random (mkStdGen 100) :: (Int, StdGen)
-- Returns...
(-3650871090684229393,693699796 2103410263)
random (mkStdGen 100) :: (Bool, StdGen)
-- Returns...
(True,4041414 40692)
This can all be found at learn you a Haskell online, though you'll have to do some long reading. This, I'm pretty much 100% certain, it the only time when types are necessary.

STRef and phantom types

Does s in STRef s a get instantiated with a concrete type? One could easily imagine some code where STRef is used in a context where the a takes on Int. But there doesn't seem to be anything for the type inference to give s a concrete type.
Imagine something in pseudo Java like MyList<S, A>. Even if S never appeared in the implementation of MyList instantiating a concrete type like MyList<S, Integer> where a concrete type is not used in place of S would not make sense. So how can STRef s a work?
tl;dr - in practice it seems it always gets initialised to RealWorld in the end
The source notes that s can be instantiated to RealWorld inside invocations of stToIO, but is otherwise uninstantiated:
-- The s parameter is either
-- an uninstantiated type variable (inside invocations of 'runST'), or
-- 'RealWorld' (inside invocations of 'Control.Monad.ST.stToIO').
Looking at the actual code for ST however it seems runST uses a specific value realWorld#:
newtype ST s a = ST (STRep s a)
type STRep s a = State# s -> (# State# s, a #)
runST :: (forall s. ST s a) -> a
runST st = runSTRep (case st of { ST st_rep -> st_rep })
runSTRep :: (forall s. STRep s a) -> a
runSTRep st_rep = case st_rep realWorld# of
(# _, r #) -> r
realWorld# is defined as a magic primitive inside the GHC source code:
realWorldName = mkWiredInIdName gHC_PRIM (fsLit "realWorld#")
realWorldPrimIdKey realWorldPrimId
realWorldPrimId :: Id -- :: State# RealWorld
realWorldPrimId = pcMiscPrelId realWorldName realWorldStatePrimTy
(noCafIdInfo `setUnfoldingInfo` evaldUnfolding
`setOneShotInfo` stateHackOneShot)
You can also confirm this in ghci:
Prelude> :set -XMagicHash
Prelude> :m +GHC.Prim
Prelude GHC.Prim> :t realWorld#
realWorld# :: State# RealWorld
From your question I can not see if you understand why the phantom s type is there at all. Even if you did not ask for this explicitly, let me elaborate on that.
The role of the phantom type
The main use of the phantom type is to constrain references (aka pointers) to stay "inside" the ST monad. Roughly, the dynamically allocated data must end its life when runST returns.
To see the issue, let's pretend that the type of runST were
runST :: ST s a -> a
Then, consider this:
data Dummy
let var :: STRef Dummy Int
var = runST (newSTRef 0)
change :: () -> ()
change = runST (modifySTRef var succ)
access :: () -> Int
result :: (Int, ())
result = (access() , change())
in result
(Above I added a few useless () arguments to make it similar to imperative code)
Now what should be the result of the code above? It could be either (0,()) or (1,()) depending on the evaluation order. This is a big no-no in the pure Haskell world.
The issue here is that var is a reference which "escaped" from its runST. When you escape the ST monad, you are no longer forced to use the monad operator >>= (or equivalently, the do notation to sequentialize the order of side effects. If references are still around, then we can still have side effects around when there should be none.
To avoid the issue, we restrict runST to work on ST s a where a does not depend on s. Why this? Because newSTRef returns a STRef s a, a reference to a tagged with the phantom type s, hence the return type depends on s and can not be extracted from the ST monad through runST.
Technically, this restriction is done by using a rank-2 type:
runST :: (forall s. ST s a) -> a
the "forall" here is used to implement the restriction. The type is saying: choose any a you wish, then provide a value of type ST s a for any s I wish, then I will return an a. Mind that s is chosen by runST, not by the caller, so it could be absolutely anything. So, the type system will accept an application runST action only if action :: forall s. ST s a where s is unconstrained, and a does not involve s (recall that the caller has to choose a before runST chooses s).
It is indeed a slightly hackish trick to implement the independence constraint, but it does work.
On the actual question
To connect this to your actual question: in the implementation of runST, s will be chosen to be any concrete type. Note that, even if s were simply chosen to be Int inside runST it would not matter much, because the type system has already constrained a to be independent from s, hence to be reference-free. As #Ganesh pointed out, RealWorld is the type used by GHC.
You also mentioned Java. One could attempt to play a similar trick in Java as follows: (warning, overly simplified code follows)
interface ST<S,A> { A call(); }
interface STAction<A> { <S> ST<S,A> call(S dummy); }
...
<A> A runST(STAction<A> action} {
RealWorld dummy = new RealWorld();
return action.call(dummy).call();
}
Above in STAction parameter A can not depend on S.

Signature of IO in Haskell (is this class or data?)

The question is not what IO does, but how is it defined, its signature. Specifically, is this data or class, is "a" its type parameter then? I didn't find it anywhere. Also, I don't understand the syntactic meaning of this:
f :: IO a
You asked whether IO a is a data type: it is. And you asked whether the a is its type parameter: it is. You said you couldn't find its definition. Let me show you how to find it:
localhost:~ gareth.rowlands$ ghci
GHCi, version 7.6.3: http://www.haskell.org/ghc/ :? for help
Prelude> :i IO
newtype IO a
= GHC.Types.IO (GHC.Prim.State# GHC.Prim.RealWorld
-> (# GHC.Prim.State# GHC.Prim.RealWorld, a #))
-- Defined in `GHC.Types'
instance Monad IO -- Defined in `GHC.Base'
instance Functor IO -- Defined in `GHC.Base'
Prelude>
In ghci, :i or :info tells you about a type. It shows the type declaration and where it's defined. You can see that IO is a Monad and a Functor too.
This technique is more useful on normal Haskell types - as others have noted, IO is magic in Haskell. In a typical Haskell type, the type signature is very revealing but the important thing to know about IO is not its type declaration, rather that IO actions actually perform IO. They do this in a pretty conventional way, typically by calling the underlying C or OS routine. For example, Haskell's putChar action might call C's putchar function.
IO is a polymorphic type (which happens to be an instance of Monad, irrelevant here).
Consider the humble list. If we were to write our own list of Ints, we might do this:
data IntList = Nil | Cons { listHead :: Int, listRest :: IntList }
If you then abstract over what element type it is, you get this:
data List a = Nil | Cons { listHead :: a, listRest :: List a }
As you can see, the return value of listRest is List a. List is a polymorphic type of kind * -> *, which is to say that it takes one type argument to create a concrete type.
In a similar way, IO is a polymorphic type with kind * -> *, which again means it takes one type argument. If you were to define it yourself, it might look like this:
data IO a = IO (RealWorld -> (a, RealWorld))
(definition courtesy of this answer)
The amount of magic in IO is grossly overestimated: it has some support from compiler and runtime system, but much less than newbies usually expect.
Here is the source file where it is defined:
http://www.haskell.org/ghc/docs/latest/html/libraries/ghc-prim-0.3.0.0/src/GHC-Types.html
newtype IO a
= IO (State# RealWorld -> (# State# RealWorld, a #))
It is just an optimized version of state monad. If we remove optimization annotations we will see:
data IO a = IO (Realworld -> (Realworld, a))
So basically IO a is a data structure storing a function that takes old real world and returns new real world with io operation performed and a.
Some compiler tricks are necessary mostly to remove Realworld dummy value efficiently.
IO type is an abstract newtype - constructors are not exported, so you cannot bypass library functions, work with it directly and perform nasty things: duplicate RealWorld, create RealWorld out of nothing or escape the monad (write a function of IO a -> a type).
Since IO can be applied to objects of any type a, as it is a polymorphic monad, a is not specified.
If you have some object with type a, then it can be 'wrappered' as an object of type IO a, which you can think of as being an action that gives an object of type a. For example, getChar is of type IO Char, and so when it is called, it has the side effect of (From the program's perspective) generating a character, which comes from stdin.
As another example, putChar has type Char -> IO (), meaning that it takes a char, and then performs some action that gives no output (in the context of the program, though it will print the char given to stdout).
Edit: More explanation of monads:
A monad can be thought of as a 'wrapper type' M, and has two associated functions:
return and >>=.
Given a type a, it is possible to create objects of type M a (IO a in the case of the IO monad), using the return function.
return, therefore, has type a -> M a. Moreover, return attempts not to change the element that it is passed -- if you call return x, you will get a wrappered version of x that contains all of the information of x (Theoretically, at least. This doesn't happen with, for example, the empty monad.)
For example, return "x" will yield an M Char. This is how getChar works -- it yields an IO Char using a return statement, which is then pulled out of its wrapper with <-.
>>=, read as 'bind', is more complicated. It has type M a -> (a -> M b) -> M b, and its role is to take a 'wrappered' object, and a function from the underlying type of that object to another 'wrappered' object, and apply that function to the underlying variable in the first input.
For example, (return 5) >>= (return . (+ 3)) will yield an M Int, which will be the same M Int that would be given by return 8. In this way, any function that can be applied outside of a monad can also be applied inside of it.
To do this, one could take an arbitrary function f :: a -> b, and give the new function g :: M a -> M b as follows:
g x = x >>= (return . f)
Now, for something to be a monad, these operations must also have certain relations -- their definitions as above aren't quite enough.
First: (return x) >>= f must be equivalent to f x. That is, it must be equivalent to perform an operation on x whether it is 'wrapped' in the monad or not.
Second: x >>= return must be equivalent to m. That is, if an object is unwrapped by bind, and then rewrapped by return, it must return to its same state, unchanged.
Third, and finally (x >>= f) >>= g must be equivalent to x >>= (\y -> (f y >>= g) ). That is, function binding is associative (sort of). More accurately, if two functions are bound successively, this must be equivalent to binding the combination thereof.
Now, while this is how monads work, it's not how it's most commonly used, because of the syntactic sugar of do and <-.
Essentially, do begins a long chain of binds, and each <- sort of creates a lambda function that gets bound.
For example,
a = do x <- something
y <- function x
return y
is equivalent to
a = something >>= (\x -> (function x) >>= (\y -> return y))
In both cases, something is bound to x, function x is bound to y, and then y is returned to a in the wrapper of the relevant monad.
Sorry for the wall of text, and I hope it explains something. If there's more you need cleared up about this, or something in this explanation is confusing, just ask.
This is a very good question, if you ask me. I remember being very confused about this too, maybe this will help...
'IO' is a type constructor, 'IO a' is a type, the 'a' (in 'IO a') is an type variable. The letter 'a' carries no significance, the letter 'b' or 't1' could have been used just as well.
If you look at the definition of the IO type constructor you will see that it is a newtype defined as: GHC.Types.IO (GHC.Prim.State# GHC.Prim.RealWorld -> (# GHC.Prim.State# GHC.Prim.RealWorld, a #))
'f :: IO a' is the type of a function called 'f' of apparently no arguments that returns a result of some unconstrained type in the IO monad. 'in the IO monad' means that f can do some IO (i.e. change the 'RealWorld', where 'change' means replace the provided RealWorld with a new one) while computing its result. The result of f is polymorphic (that's a type variable 'a' not a type constant like 'Int'). A polymorphic result means that in your program it's the caller that determines the type of the result, so used in one place f could return an Int, used in another place it could return a String. 'Unconstrained' means that there's no type class restricting what type can be returned and so any type can be returned.
Why is 'f' a function and not a constant since there are no parameters and Haskell is pure? Because the definition of IO means that 'f :: IO a' could have been written 'f :: GHC.Prim.State# GHC.Prim.RealWorld -> (# GHC.Prim.State# GHC.Prim.RealWorld, a #)' and so in fact has a parameter -- the 'state of the real world'.
In the data IO a a have mainly the same meaning as in Maybe a.
But we can't rid of a constructor, like:
fromIO :: IO a -> a
fromIO (IO a) = a
Fortunately we could use this data in Monads, like:
{-# LANGUAGE ScopedTypeVariables #-}
foo = do
(fromIO :: a) <- (dataIO :: IO a)
...

Haskell rank two polymorphism compile error

Given the following definitions:
import Control.Monad.ST
import Data.STRef
fourty_two = do
x <- newSTRef (42::Int)
readSTRef x
The following compiles under GHC:
main = (print . runST) fourty_two -- (1)
But this does not:
main = (print . runST) $ fourty_two -- (2)
But then as bdonlan points out in a comment, this does compile:
main = ((print . runST) $) fourty_two -- (3)
But, this does not compile
main = (($) (print . runST)) fourty_two -- (4)
Which seems to indicate that (3) only compiles due to special treatment of infix $, however, it still doesn't explain why (1) does compile.
Questions:
1) I've read the following two questions (first, second), and I've been led to believe $ can only be instantiated with monomorphic types. But I would similarly assume . can only be instantiated with monomorphic types, and as a result would similarly fail.
Why does the first code succeed but the second code does not? (e.g. is there a special rule GHC has for the first case that it can't apply in the second?)
2) Is there a current GHC extension that compiles the second code? (perhaps ImpredicativePolymorphism did this at some point, but it seems deprecated, has anything replaced it?)
3) Is there any way to define say `my_dollar` using GHC extensions to do what $ does, but is also able to handle polymorphic types, so (print . runST) `my_dollar` fourty_two compiles?
Edit: Proposed Answer:
Also, the following fails to compile:
main = ((.) print runST) fourty_two -- (5)
This is the same as (1), except not using the infix version of ..
As a result, it seems GHC has special rules for both $ and ., but only their infix versions.
I'm not sure I understand why the second doesn't work. We can look at the type of print . runST and observe that it is sufficiently polymorphic, so the blame doesn't lie with (.). I suspect that the special rule that GHC has for infix ($) just isn't quite sufficient. SPJ and friends might be open to re-examining it if you propose this fragment as a bug on their tracker.
As for why the third example works, well, that's just because again the type of ((print . runST) $) is sufficiently polymorphic; in fact, it's equal to the type of print . runST.
Nothing has replaced ImpredicativePolymorphism, because the GHC folks haven't seen any use cases where the extra programmer convenience outweighed the extra potential for compiler bugs. (I don't think they'd see this as compelling, either, though of course I'm not the authority.)
We can define a slightly less polymorphic ($$):
{-# LANGUAGE RankNTypes #-}
infixl 0 $$
($$) :: ((forall s. f s a) -> b) -> ((forall s. f s a) -> b)
f $$ x = f x
Then your example typechecks okay with this new operator:
*Main> (print . runST) $$ fourty_two
42
I can't say with too much authority on this subject, but here's what I think may be happening:
Consider what the typechecker has to do in each of these cases. (print . runST) has type Show b => (forall s. ST s t) -> IO (). fourty_two has type ST x Int.
The forall here is an existential type qualifier - here it means that the argument passed in has to be universal on s. That is, you must pass in a polymorphic type that supports any value for s whatsoever. If you don't explicitly state forall, Haskell puts it at the outermost level of the type definition. This means that fourty_two :: forall x. ST x Int and (print . runST) :: forall t. Show t => (forall s. ST s t) -> IO ()
Now, we can match forall x. ST x Int with forall s. ST s t by letting t = Int, x = s. So the direct call case works. What happens if we use $, though?
$ has type ($) :: forall a b. (a -> b) -> a -> b. When we resolve a and b, since the type for $ doesn't have any explicit type scoping like this, the x argument of fourty_two gets lifted out to the outermost scope in the type for ($) - so ($) :: forall x t. (a = forall s. ST s t -> b = IO ()) -> (a = ST x t) -> IO (). At this point, it tries to match a and b, and fails.
If you instead write ((print . runST) $) fourty_two, then the compiler first resolves the type of ((print . runST $). It resolves the type for ($) to be forall t. (a = forall s. ST s t -> b = IO ()) -> a -> b; note that since the second occurance of a is unconstrained, we don't have that pesky type variable leaking out to the outermost scope! And so the match succeeds, the function is partially applied, and the overall type of the expression is forall t. (forall s. ST s t) -> IO (), which is right back where we started, and so it succeeds.

Resources