How to derive the type for Haskell record fields? - haskell

Coming from OOP, this seems like alien code to me.
I don't understand why the type of runIdentity is a function:
runIdentity :: Identity a -> a? I specified it to be runIdentity :: a.
newtype Identity a = Identity {runIdentity :: a} deriving Show

instance Monad Identity where
    return = Identity
    Identity x >>= k = k x

instance Functor Identity where
    fmap f (Identity x) = Identity (f x)

instance Applicative Identity where
    pure = Identity
    Identity f <*> Identity v = Identity (f v)

wrapNsucc :: Integer -> Identity Integer
wrapNsucc = Identity . succ
Calling runIdentity:
runIdentity $ wrapNsucc 5 -- gives 6 as output

You're right that runIdentity is but a simple field of type a. But the type of runIdentity is Identity a -> a, since runIdentity is a function that extracts that field out of an Identity a. You can't get the runIdentity field out of a value without supplying the value to get it from, after all.
Edit:
To expand a little on that OOP-analogy in the comments, think of a class
class Identity<T> {
    public T runIdentity;
}
This is the Identity monad, loosely translated to OOP code. The template argument T basically is your a; as such, runIdentity is of type T. To get that T from your object, you'd probably do something like
Identity<int> foo = new Identity<int>();
int x = foo.runIdentity;
You see runIdentity as something of type T, but it's not really. You can't just do
int x = runIdentity; // Nope!
because - where to get the runIdentity from? Instead, think of this like doing
Identity<int> foo = new Identity<int>();
int x = runIdentity(foo);
This shows what actually happens when you're calling a member; you have a function (your runIdentity) and supply it an object to use - IIRC this is what Python does with def func(self). So instead of being plainly of type T, runIdentity is actually taking an Identity<T> as argument to return a T.
Thus, it's of type Identity a -> a.
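To connect the analogy back to Haskell, here is a minimal sketch (assuming the Identity definition from the question): because runIdentity is an ordinary function, you apply it to an Identity value just like runIdentity(foo) above, and you can also pass it around like any other function.
-- A minimal sketch, assuming the Identity definition from the question.
unwrap :: Identity Int -> Int
unwrap i = runIdentity i          -- like runIdentity(foo) in the OOP analogy

-- Because runIdentity is an ordinary function, it can be passed to map:
unwrappedAll :: [Int]
unwrappedAll = map runIdentity [Identity 1, Identity 2, Identity 3]  -- [1,2,3]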

Another way to see this is that record syntax in Haskell is basically just syntactic sugar over algebraic datatypes, i.e. records don't truly exist in Haskell, only algebraic datatypes do, with perhaps some additional syntactic niceties. Hence there isn't a notion of members the same way that classes have in a lot of OO languages.
data MyRecord = MyRecord { myInt :: Int, myString :: String }
really is just
data MyRecord = MyRecord Int String
with the additional functions
myInt :: MyRecord -> Int
myInt (MyRecord x _) = x
myString :: MyRecord -> String
myString (MyRecord _ y) = y
automatically defined.
The only things record syntax gives you that you could not do yourself with plain algebraic datatypes are a nice way of making a copy of a MyRecord with only a subset of fields changed, and a nice way of referring to fields by name in patterns.
copyWithNewInt :: Int -> MyRecord -> MyRecord
copyWithNewInt x r = r { myInt = x }
-- Same thing as myInt, just written differently
extractInt :: MyRecord -> Int
extractInt (MyRecord { myInt = x }) = x
Because this is just syntactic sugar over ordinary algebraic datatypes, you could always fall back to doing things the usual way.
-- This is a more verbose but also valid way of doing things
copyWithNewInt :: Int -> MyRecord -> MyRecord
copyWithNewInt x (MyRecord _ oldString) = MyRecord x oldString
Incidentally, this is why some otherwise ridiculous-seeming constraints exist (the most prominent being that you can't define another record type with a field called myInt, because that would create two functions with the same name in the same scope, which Haskell does not allow).
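For example (a sketch; the second declaration is left commented out because, without extensions, GHC rejects it for exactly this reason):
data MyRecord = MyRecord { myInt :: Int, myString :: String }

-- Rejected: this would create a second function called myInt in the same scope.
-- data OtherRecord = OtherRecord { myInt :: Int }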
Therefore
newtype Identity a = Identity {runIdentity :: a} deriving Show
is equivalent (minus convenient update syntax which doesn't really matter when you have only one field) to
newtype Identity a = Identity a deriving Show
runIdentity :: Identity a -> a
runIdentity (Identity x) = x
Using record syntax just compresses all that into a single line (and perhaps gives more insight into why runIdentity is named that, i.e. as a verb, rather than as a noun).

newtype Identity a = Identity {runIdentity :: a} deriving Show
Using the record syntax here, you're really creating two things called runIdentity.
One is the field of the constructor Identity. You can use that with record pattern syntax, as in case i of Identity { runIdentity = x } -> x, which matches a value i :: Identity a and extracts the field's contents into a local variable x. You can also use record construction or update syntax, as in Identity { runIdentity = "foo" } or i { runIdentity = "bar" }.
In all of those cases runIdentity isn't really a standalone thing in its own right. You're using it only as part of a larger syntactic construct, to say which field of Identity you're accessing. The "slot" of Identity a referred to with the help of the field runIdentity does indeed store things of type a. But this runIdentity field is not a value of type a. It's not even a value at all really, since it needs to have these extra properties (that values do not have) about referring to a particular "slot" in a data type. Values are standalone things that exist and make sense on their own. Fields are not; field contents are, which is why we use types to classify fields, but fields themselves are not values.1 Values can be placed in data structures, returned from functions, etc. There's no way to define a value that you can place in a data structure, get back out, and then use with record pattern, construction, or update syntax.
The other thing named runIdentity defined with the record match syntax is an ordinary function. Functions are values; you can pass them to other functions, put them in data structures, etc. The intent is to give you a helper for getting the value of a field in a value of type Identity a. But because you have to specify which Identity a value you want to get the value of the runIdentity field from, you have to pass an Identity a into the function. So the runIdentity function is a value of type Identity a -> a, as distinct from the runIdentity field which is a non-value described by type a.
A simple way to see this distinction is to add a definition like myRunIdentity = runIdentity to your file. That definition declares that myRunIdentity is equal to runIdentity, but you can only define values like that. And sure enough myRunIdentity will be a function of type Identity a -> a, that you can apply to things of type Identity a to get an a value. But it won't be usable with record syntax as the field. The field runIdentity didn't "come along with" the value runIdentity in that definition.
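A sketch of that experiment, assuming the Identity definition from the question:
myRunIdentity :: Identity a -> a
myRunIdentity = runIdentity

five :: Integer
five = myRunIdentity (Identity 5)       -- works: it's an ordinary function value

ok :: Identity Integer
ok = (Identity 5) { runIdentity = 6 }   -- works: runIdentity is a field

-- Rejected by GHC: myRunIdentity is not a field, so record syntax can't use it.
-- bad = (Identity 5) { myRunIdentity = 6 }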
This question might have been prompted by typing :t runIdentity into ghci, asking it to show you the type. It would have answered runIdentity :: Identity a -> a. The reason is that the :t syntax works on values2. You can type any expression at all there, and it will give you the type of the value that would result. So :t runIdentity is seeing the runIdentity value (the function), not the runIdentity field.
As a final note, I've been banging on about how the field runIdentity :: a and the function runIdentity :: Identity a -> a are two separate things. I did so because I thought cleanly separating the two would help people confused by why there can be two different answers to "what is the type of runIdentity". But it's also a perfectly valid interpretation to say that runIdentity is a single thing, and it's simply the case that when you use a field as a first-class value it behaves as a function. And that is how people often talk about fields. So please don't be confused if other sources insist that there is only one thing; these are simply two different ways of looking at the same language concepts.
1 A perspective on lenses, if you've heard of them, is that they are ordinary values that can be used to give us all of the semantics we need from "fields", without any special-purpose syntax. So a hypothetical language could theoretically not provide any syntax for field access at all, just giving us lenses when we declare a new data type, and we'd be able to make do.
But Haskell record syntax fields aren't lenses; used as values they're only "getter" functions, which is why there's dedicated pattern match, construction, and update syntax for using the fields in ways beyond what is possible with ordinary values.
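As a rough illustration of the lens idea, a hand-rolled getter/setter pair already behaves like a first-class "field"; the names SimpleLens, viewL, and setL below are made up for this sketch and are not the lens library's API:
data SimpleLens s a = SimpleLens
  { viewL :: s -> a
  , setL  :: a -> s -> s
  }

-- An ordinary value packaging both "get" and "set" for the runIdentity slot.
runIdentityL :: SimpleLens (Identity a) a
runIdentityL = SimpleLens
  { viewL = runIdentity
  , setL  = \x _ -> Identity x
  }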
2 Well, more properly it works on expressions, since it's type-checking the code, not running the code and then looking at the value to see what type it is (that wouldn't work anyway, since runtime Haskell values don't have any type information in the GHC system). But you can blur the lines and call values and expressions the same kind of thing; fields are quite different.

Related

Understanding double colon "::" and type variables in Haskell?

For a uni assignment, a function "lookup" has been defined as:
lookup :: Env e -> String -> Maybe e
The "Env" keyword has been defined as:
import qualified Data.Map as M
newtype Env e = Env (M.Map String e) deriving (Functor, Foldable, Traversable, Show, Eq, Monoid, Semigroup)
I understand that the double colon "::" means "has type of". So the lookup function takes two arguments and returns something. I know the second argument is a string; however, I'm struggling to understand the first argument and the output. What is the "Env" type, and what does the "e" after it represent? It feels like e is an argument to "Env", but if that is the case, the output wouldn't make sense.
Any help would be appreciated.
It feels like e is an argument to "Env" but if that is the case, the output wouldn't make sense.
That's the direction the syntax is trying to "feel like", because that's what it means. Env is a parameterised type, and e is being passed as an argument to that parameter.
I'm not 100% sure why you think the output wouldn't make sense, under that interpretation? Maybe is also a parameterised type, and the variable e is also being passed to Maybe; this is no different than using a variable twice in any piece of code. You can think of it as something like this pseudo-code:
lookupType(e) {
    argument1_type = Env(e)
    argument2_type = String
    result_type = Maybe(e)
}
Is your confusion that you are thinking of e in the type signature of lookup not as being passed to Env, but as being defined to receive the argument of Env? That happens in the place where Env is defined, not where it is used in the type of lookup. Again, this is just like what happens at the value level with function arguments; e.g. when you're writing code to define a function like plus:
plus x y = x + y
Then the x and y variables are created to stand for whatever plus is applied to elsewhere. But in another piece of code where you're using plus, such as incr x = plus x 1, the variable x is just being passed to plus as an argument; it is not defined as the parameter of plus (it was in fact defined as the parameter of incr).
Perhaps the thing that you need more explicitly called out is this. lookup :: Env e -> String -> Maybe e is saying:
For any type e that you like, lookup takes an Env e and a String and returns a Maybe e
Thus you could pass lookup an Env Integer and a String, and it will give you back a Maybe Integer. Or you could pass it an Env (Maybe [(String, Float)]) and a String, and it will give you back a Maybe (Maybe [(String, Float)]). This should hopefully be intuitive, because it's just looking up keys in an environment; whatever type of data is stored in the environment you pass to lookup is the type that lookup will maybe-return.
The e is there because in fact lookup is parametrically polymorphic; it's almost like lookup takes a type argument called e, which it can then pass to other things in its type signature.1 That's why I wrote my pseudo-code the way I did, above.
But this is just how variables in type signatures work, in base Haskell2. You simply write variables in your type signature without defining them anyway, and they mean your definition can be used with ANY type at the position you write the variable. The only reason the variables have names like e (rather than being some sort of wildcard symbol like ?) is because you often need to say that you can work with any type, but it has to be the same type in several different places. lookup is like that; it can take any type of Env and return any sort of Maybe, but they have to be consistent. Giving the variable the name e merely allows you to link them to say that they're the same variable.
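A small, self-contained illustration of that point (hypothetical functions, not from the assignment):
-- e appears in two places, so the default and the list elements must have the same type.
firstOrDefault :: e -> [e] -> e
firstOrDefault def []      = def
firstOrDefault _   (x : _) = x

-- a and b are different variables, so the two arguments may have different types.
pairUp :: a -> b -> (a, b)
pairUp x y = (x, y)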
1 This is in fact exactly how types like this work at a lower level. Haskell normally keeps this kind of type argument invisible though; you just write a type signature containing variables, without defining them, and every time you use the accompanying binding the compiler figures out how the variables should be instantiated.
2 In more advanced Haskell, when you turn on a bunch of extensions, you can actually control exactly where type variables are introduced, rather than it always happening automatically at the beginning of every type signature you use with variables. You don't need to know that yet, and I'm not going to talk about it further.
I'll try to give concrete examples, providing and motivating intuition. With that intuition, I think the question has a very natural answer:
e is a type variable, and the lookup function wants to work on all possible environments, regardless of which concrete type e is. The unbound e is a natural way to express that syntactically.
Step 1, the Env type
The Env type is a wrapper for the Map type in the Data.Map module in the containers package. A Map is a collection of key-value pairs: you can insert new key-value pairs and look them up. If the key you are looking up is missing, the lookup must return an error, a null value, a default, or something else, just like hashmaps or dictionaries in other programming languages.
The documentation (linked above) writes
data Map k a
A Map from keys k to values a.
and we will try that out and see what k and a can be.
I use ghci to get interactive feedback.
Prelude> import Data.Map as M
Prelude M> map1 = M.fromList [("Sweden","SEK"),("Chile","CLP")]
Prelude M> :type map1
map1 :: Map [Char] [Char]
Prelude M> map2 = M.fromList [(1::Integer,"Un"),(2,"deux")]
Prelude M> :type map2
map2 :: Map Integer [Char]
Prelude M> map3 = fromList [("Ludvig",10000::Double),("Mikael",25000)]
Prelude M> :type map3
map3 :: Map [Char] Double
You can see that we create various mappings based on lists of key-value pairs. The type signature in the documentation, Map k a, corresponds to different k and a in the ghci session above. For map2, k corresponds to Integer and a to [Char].
You can also see how I declared manual types at some places, using the double colon syntax.
To reduce the flexibility, we can create a new "wrapping" type for M.Map. With this wrapper, we make sure that the keys are always Strings.
Prelude M> newtype Env e = Env (M.Map String e) deriving (Show)
This definition says that the type Env e is, for every e, a wrapper around M.Map String e. The newtype declaration further says that we must always be explicit and wrap the mappings in an Env value constructor. Let's see if we can do that.
Prelude M> Env map1
Env (fromList [("Chile","CLP"),("Sweden","SEK")])
Prelude M> Env map2
<interactive>:34:6: error:
• Couldn't match type ‘Integer’ with ‘[Char]’
Expected type: Map String [Char]
Actual type: Map Integer [Char]
• In the first argument of ‘Env’, namely ‘(map2)’
In the expression: Env (map2)
In an equation for ‘it’: it = Env (map2)
Prelude M> Env map3
Env (fromList [("Ludvig",10000.0),("Mikael",25000.0)])
In the session above, we see that both map1 and map3 can be wrapped in an Env, since they have an appropriate type (they have k==String), but map2 cannot (having k==Integer).
The logical step from Map to Env is a little tricky, since it also renames some variables. What was called a when talking about maps is called e in the Env case. But variable names are always arbitrary, so that is ok.
Step 2, the lookup
We have established that Env e is a wrapping type that contains a lookup table from strings to values of some type e. How do you look things up? Let us start with the non-wrapped case, and then the wrapped case. In Data.Map there is a function called lookup. Let's try it!
Prelude M> M.lookup "Ludvig" map3
Just 10000.0
Prelude M> M.lookup "Elias" map3
Nothing
Okay, to look up a value, we supply a key, and get just the corresponding value. If the key is missing, we get nothing. What is the type of the return value?
Prelude M> :type M.lookup "Ludvig" map3
M.lookup "Ludvig" map3 :: Maybe Double
When making a lookup in a Map [Char] Double, we need a key of type [Char] and get back a value of type Maybe Double. Okay. That sounds reasonable. What about some other example?
Prelude M> :type M.lookup 1 map2
M.lookup 1 map2 :: Maybe [Char]
When making a lookup in a Map Integer [Char], we need a key of type Integer and get back a value of type Maybe [Char]. So in general, to look something up we need a key of type k and a map of type M.Map k a, and we get back a Maybe a.
Generally, we would guess M.lookup :: k -> M.Map k a -> Maybe a. Let's see if ghci agrees.
Prelude M> :t M.lookup
M.lookup :: Ord k => k -> Map k a -> Maybe a
It does! It also requires that the k type must be some type with a defined order on it, such as strings or numbers. That is the meaning of the Ord k => part at the beginning. Why? Well, it has to do with how the M.Map type is defined... don't worry too much about it right now.
We can specialize the type signature if we set k to be String. In that special case, it would look like
M.lookup :: String -> Map String a -> Maybe a
Hmm, this inspires us to flip the order of arguments 1 and 2 of lookup, and to rename the variable a to e. It is just a variable, and variable names are arbitrary anyway. Let's call this new function myLookup:
myLookup :: Map String e -> String -> Maybe e
and since Env e is essentially the same as Map String e, if we just unwrap the Env type, we may define
myLookup :: Env e -> String -> Maybe e
How does one implement such a function? One way is to just unwrap the type by pattern matching, and let the library do the heavy lifting.
myLookup (Env map) key = M.lookup key map
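Continuing the ghci session (this is a sketch using map1 from step 1), using it looks roughly like:
Prelude M> myLookup (Env map1) "Sweden"
Just "SEK"
Prelude M> myLookup (Env map1) "Norway"
Nothing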
Conclusion
I have tried to build up concrete examples using functions and types from the standard library of Haskell. I hope I have illustrated the idea of a polymorphic type (that M.Map k a is a type when supplied with a k and an a) and that Env is a wrapper, specializing to the case when k is a String.
I hope that this concrete example motivates why the type signatures looks like they do and why type variables are useful.
I have not tried to give you the formal treatment with terminology. I have not explained the Maybe type, nor typeclasses and derived typeclasses. I hope you can read up on that elsewhere.
I hope it helps.

How does the :: operator syntax work in the context of bounded typeclass?

I'm learning Haskell and trying to understand the reasoning behind its syntax design at the same time. Most of the syntax is beautiful.
But since :: normally is like a type annotation, how is it that this works:
Input: minBound::Int
Output: -2147483648
There is no separate operator: :: is a type annotation in that example. Perhaps the best way to understand this is to consider this code:
main = print (f minBound)
f :: Int -> Int
f = id
This also prints -2147483648. The use of minBound is inferred to be an Int because it is the parameter to f. Once the type has been inferred, the value for that type is known.
Now, back to:
main = print (minBound :: Int)
This works in the same way, except that minBound is known to be an Int because of the type annotation, rather than for some more complex reason. The :: isn't some binary operation; it just directs the compiler that the expression minBound has the type Int. Once again, since the type is known, the value can be determined from the type class.
:: still means "has type" in that example.
There are two ways you can use :: to write down type information. Type declarations, and inline type annotations. Presumably you've been used to seeing type declarations, as in:
plusOne :: Integer -> Integer
plusOne = (+1)
Here the plusOne :: Integer -> Integer line is a separate declaration about the identifier plusOne, informing the compiler what its type should be. It is then actually defined on the following line in another declaration.
The other way you can use :: is that you can embed type information in the middle of any expression. Any expression can be followed by :: and then a type, and it means the same thing as the expression on its own except with the additional constraint that it must have the given type. For example:
foo = ('a', 2) :: (Char, Integer)
bar = ('a', 2 :: Integer)
Note that for foo I attached the entire expression, so it is very little different from having used a separate foo :: (Char, Integer) declaration. bar is more interesting, since I gave a type annotation for just the 2 but used that within a larger expression (for the whole pair). 2 :: Integer is still an expression for the value 2; :: is not an operator that takes 2 as input and computes some result. Indeed if the 2 were already used in a context that requires it to be an Integer then the :: Integer annotation changes nothing at all. But because 2 is normally polymorphic in Haskell (it could fit into a context requiring an Integer, or a Double, or a Complex Float) the type annotation pins down that the type of this particular expression is Integer.
The use is that it avoids you having to restructure your code to have a separate declaration for the expression you want to attach a type to. To do that with my simple example would have required something like this:
two :: Integer
two = 2
baz = ('a', two)
Which adds a relatively large amount of extra code just to have something to attach :: Integer to. It also means when you're reading baz, you have to go read a whole separate definition to know what the second element of the pair is, instead of it being clearly stated right there.
So now we can answer your direct question. :: has no special or particular meaning with the Bounded type class or with minBound in particular. However it's useful with minBound (and other type class methods) because the whole point of type classes is to have overloaded names that do different things depending on the type. So selecting the type you want is useful!
minBound :: Int is just an expression using the value of minBound under the constraint that this particular time minBound is used as an Int, and so the value is -2147483648. As opposed to minBound :: Char which is '\NUL', or minBound :: Bool which is False.
None of those options mean anything different from using minBound where there was already some context requiring it to be an Int, or Char, or Bool; it's just a very quick and simple way of adding that context if there isn't one already.
It's worth being clear that both forms of :: are not operators as such. There's nothing terribly wrong with informally using the word operator for it, but be aware that "operator" has a specific meaning in Haskell; it refers to symbolic function names like +, *, &&, etc. Operators are first-class citizens of Haskell: we can bind them to variables1 and pass them around. For example I can do:
(|+|) = (+)
x = 1 |+| 2
But you cannot do this with ::. It is "hard-wired" into the language, just as the = symbol used for introducing definitions is, or the module Main ( main ) where syntax for module headers. As such there are lots of things that are true about Haskell operators that are not true about ::, so you need to be careful not to confuse yourself or others when you use the word "operator" informally to include ::.
1 Actually an operator is just a particular kind of variable name that is applied by writing it between two arguments instead of before them. The same function can be bound to operator and ordinary variables, even at the same time.
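For instance (a small sketch):
(|+|) :: Int -> Int -> Int
(|+|) = (+)

addInts :: Int -> Int -> Int
addInts = (|+|)

r1 = 1 |+| 2         -- the operator used infix
r2 = (|+|) 3 4       -- the same operator used prefix, in parentheses
r3 = 5 `addInts` 6   -- an ordinary name used infix, in backticks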
Just to add another example, with Monads you can play a little like this:
import Control.Monad
anyMonad :: (Monad m) => Int -> m Int
anyMonad x = (pure x) >>= (\x -> pure (x*x)) >>= (\x -> pure (x+2))
$> anyMonad 4 :: [Int]
=> [18]
$> anyMonad 4 :: Either a Int
=> Right 18
$> anyMonad 4 :: Maybe Int
=> Just 18
It's a generic example showing that the functionality may change with the type.

Parsing to Free Monads

Say I have the following free monad:
data ExampleF a
    = Foo Int a
    | Bar String (Int -> a)
    deriving Functor
type Example = Free ExampleF -- this is the free monad we want to discuss
I know how I can work with this monad, eg. I could write some nice helpers:
foo :: Int -> Example ()
foo i = liftF $ Foo i ()
bar :: String -> Example Int
bar s = liftF $ Bar s id
So I can write programs in haskell like:
fooThenBar :: Example Int
fooThenBar = do
    foo 10
    bar "nice"
I know how to print it, interpret it, etc. But what about parsing it?
Would it be possible to write a parser that could parse arbitrary
programs like:
foo 12
bar nice
foo 11
foo 42
So I can store them, serialize them, use them in cli programs etc.
The problem I keep running into is that the type of the program depends on which program is being parsed. If the program ends with a foo it's of type Example (); if it ends with a bar it's of type Example Int.
I do not feel like writing parsers for every possible permutation (it's simple here because there are only two possibilities, but imagine we add Baz Int (String -> a), Doo (Int -> a), Moz Int a, Foz String a, ... This gets tedious and error-prone).
Perhaps I'm solving the wrong problem?
Boilerplate
To run the above examples, you need to add this to the beginning of the file:
{-# LANGUAGE DeriveFunctor, ExistentialQuantification, GADTs, TypeOperators, PatternSynonyms #-}
import Control.Monad.Free
import Text.ParserCombinators.Parsec
Note: I put up a gist containing this code.
Not every Example value can be represented on the page without reimplementing some portion of Haskell. For example, return putStrLn has a type of Example (String -> IO ()), but I don't think it makes sense to attempt to parse that sort of Example value out of a file.
So let's restrict ourselves to parsing the examples you've given, which consist only of calls to foo and bar sequenced with >> (that is, no variable bindings and no arbitrary computations)*. The Backus-Naur form for our grammar looks approximately like this:
<program> ::= "" | <expr> "\n" <program>
<expr> ::= "foo " <integer> | "bar " <string>
It's straightforward enough to parse our two types of expression...
type Parser = Parsec String ()
int :: Parser Int
int = fmap read (many1 digit)
parseFoo :: Parser (Example ())
parseFoo = string "foo " *> fmap foo int
parseBar :: Parser (Example Int)
parseBar = string "bar " *> fmap bar (many1 alphaNum)
... but how can we give a type to the composition of these two parsers?
parseExpr :: Parser (Example ???)
parseExpr = parseFoo <|> parseBar
parseFoo and parseBar have different types, so we can't compose them with <|> :: Alternative f => f a -> f a -> f a. Moreover, there's no way to know ahead of time which type the program we're given will be: as you point out, the type of the parsed program depends on the value of the input string. Types that depend on values are the domain of dependent types; Haskell doesn't feature a proper dependent type system, but it comes close enough for us to have a stab at making this example work.
Let's start by forcing the expressions on either side of <|> to have the same type. This involves erasing Example's type parameter using existential quantification.†
data Ex a = forall i. Wrap (a i)
parseExpr :: Parser (Ex Example)
parseExpr = fmap Wrap parseFoo <|> fmap Wrap parseBar
This typechecks, but the parser now returns an Example containing a value of an unknown type. A value of unknown type is of course useless - but we do know something about Example's parameter: it must be either () or Int because those are the return types of parseFoo and parseBar. Programming is about getting knowledge out of your brain and onto the page, so we're going to wrap up the Example value with a bit of GADT evidence which, when unwrapped, will tell you whether a was Int or ().
data Ty a where
    IntTy :: Ty Int
    UnitTy :: Ty ()
data (a :*: b) i = a i :&: b i
type Sig a b = Ex (a :*: b)
pattern Sig x y = Wrap (x :&: y)
parseExpr :: Parser (Sig Ty Example)
parseExpr = fmap (\x -> Sig UnitTy x) parseFoo <|>
            fmap (\x -> Sig IntTy x) parseBar
Ty is (something like) a runtime "singleton" representative of Example's type parameter. When you pattern match on IntTy, you learn that a ~ Int; when you pattern match on UnitTy you learn that a ~ (). (Information can be made to flow the other way, from types to values, using classes.) :*:, the functor product, pairs up two type constructors ensuring that their parameters are equal; thus, pattern matching on the Ty tells you about its accompanying Example.
Sig is therefore called a dependent pair or sigma type - the type of the second component of the pair depends on the value of the first. This is a common technique: when you erase a type parameter by existential quantification, it usually pays to make it recoverable by bundling up a runtime representative of that parameter.
Note that this use of Sig is equivalent to Either (Example Int) (Example ()) - a sigma type is a sum, after all - but this version scales better when you're summing over a large (or possibly infinite) set.
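For comparison, a sketch of that plain-sum formulation, using the parseFoo and parseBar defined above:
parseExprEither :: Parser (Either (Example ()) (Example Int))
parseExprEither = fmap Left parseFoo <|> fmap Right parseBar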
Now it's easy to build our expression parser into a program parser. We just have to repeatedly apply the expression parser, and then manipulate the dependent pairs in the list.
parseProgram :: Parser (Sig Ty Example)
parseProgram = fmap (foldr1 combine) $ parseExpr `sepBy1` (char '\n')
  where combine (Sig _ val) (Sig ty acc) = Sig ty (val >> acc)
The code I've shown you is not exemplary. It doesn't separate the concerns of parsing and typechecking. In production code I would modularise this design by first parsing the data into an untyped syntax tree - a separate data type which doesn't enforce the typing invariant - then transform that into a typed version by type-checking it. The dependent pair technique would still be necessary to give a type to the output of the type-checker, but it wouldn't be tangled up in the parser.
*If binding is not a requirement, have you thought about using a free applicative to represent your data?
†Ex and :*: are reusable bits of machinery which I lifted from the Hasochism paper
So, I worry that this is the same sort of premature abstraction that you see in object-oriented languages, getting in the way of things. For example, I am not 100% sure that you are using the structure of the free monad -- your helpers for example simply seem to use id and () in a rather boring way, in fact I'm not sure if your Int -> x is ever anything other than either Pure :: Int -> Free ExampleF Int or const (something :: Free ExampleF Int).
The free monad for a functor F can basically be described as a tree whose data is stored in leaves and whose branching factor is controlled by the recursion in each constructor of the functor F. So for example Free Identity has no branching, hence only one leaf, and thus has the same structure as the monad:
data MonoidalFree m x = MF m x deriving (Functor)
instance Monoid m => Applicative (MonoidalFree m) where
    pure x = MF mempty x
    MF m f <*> MF n x = MF (mappend m n) (f x)
instance Monoid m => Monad (MonoidalFree m) where
    return x = MF mempty x
    MF m x >>= my_x = case my_x x of MF n y -> MF (mappend m n) y
In fact Free Identity is isomorphic to MonoidalFree (Sum Integer), the difference is just that instead of MF (Sum 3) "Hello" you see Free . Identity . Free . Identity . Free . Identity $ Pure "Hello" as the means of tracking this integer. On the other hand if you have data E x = L x | R x deriving (Functor) then you get a sort of "path" of Ls and Rs before you hit this one leaf, Free E is going to be isomorphic to MonoidalFree [Bool].
The reason I'm going through this is that when you combine Free with an Integer -> x functor, you get an infinitely branching tree, and when I'm looking through your code to figure out how you're actually using this tree, all I see is that you use the id function with it. As far as I can tell, that restricts the recursion to either have the form Free (Bar "string" Pure) or else Free (Bar "string" (const subExpression)), in which case the system would seem to reduce completely to the MonoidalFree [Either Int String] monad.
(At this point I should pause to ask: Is that correct as far as you know? Was this what was intended?)
Anyway. Aside from my problems with your premature abstraction, the specific problem that you're citing with your monad (you can't tell the difference between () and Int) has a bunch of really complicated solutions, but one really easy one. The really easy solution is to yield a value of type Example (Either () Int): if you have a () you can fmap Left onto it, and if you have an Int you can fmap Right onto it.
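A sketch of that easy solution, assuming parsers like the parseFoo and parseBar from the answer above:
-- Unify both branches at Example (Either () Int) instead of erasing the type.
parseExprSum :: Parser (Example (Either () Int))
parseExprSum = fmap (fmap Left) parseFoo <|> fmap (fmap Right) parseBar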
Without a much better understanding of how you're using this thing over TCP/IP we can't recommend a better structure for you than the generic free monads that you seem to be finding -- in particular we'd need to know how you're planning on using the infinite-branching of Int -> x options in practice.

Why can't I use the type `Show a => [Something -> a]`?

I have a record type say
data Rec = Rec
  { recNumber :: Int
  , recName :: String
  -- more fields of various types
  }
And I want to write a toString function for Rec :
recToString :: Rec -> String
recToString r = intercalate "\t" $ map ($ r) fields
  where fields = [show . recNumber, show . recName]
This works. fields has type [Rec -> String]. But I'm lazy and I would prefer writing
recToString r = intercalate "\t" $ map (\f -> show $ f r) fields
  where fields = [recNumber, recName]
But this doesn't work. Intuitively I would say fields has type Show a => [Rec -> a] and this should be ok. But Haskell doesn't allow it.
I'd like to understand what is going on here. Would I be right if I said that in the first case I get a list of functions such that the two instances of show are actually not the same function, but Haskell is able to determine which is which at compile time (which is why it's ok)?
[show . recNumber, show . recName]
 ^-- This is show in instance Show Int
                   ^-- This is show in instance Show String
Whereas in the second case, I only have one literal use of show in the code, and that would have to refer to multiple instances, not determined at compile time?
map (\f -> show $ f r) fields
^-- Must be both instances at the same time
Can someone help me understand this? And are there workarounds or type system extensions that allow this?
The type signature doesn't say what you think it says.
This seems to be a common misunderstanding. Consider the function
foo :: Show a => Rec -> a
People frequently seem to think this means that "foo can return any type that it wants to, so long as that type supports Show". It doesn't.
What it actually means is that foo must be able to return any possible type, because the caller gets to choose what the return type should be.
A few moments' thought will reveal that foo actually cannot exist. There is no way to turn a Rec into any possible type that can ever exist. It can't be done.
People often try to do something like Show a => [a] to mean "a list of mixed types but they all have Show". That obviously doesn't work; this type actually means that the list elements can be any type, but they still have to be all the same.
What you're trying to do seems reasonable enough. Unfortunately, I think your first example is about as close as you can get. You could try using tuples and lenses to get around this. You could try using Template Haskell instead. But unless you've got a hell of a lot of fields, it's probably not even worth the effort.
The type you actually want is not:
Show a => [Rec -> a]
Any type declaration with unbound type variables has an implicit forall. The above is equivalent to:
forall a. Show a => [Rec -> a]
This isn't what you want, because the a must be specialized to a single type for the entire list. (By the caller, to any one type they choose, as MathematicalOrchid points out.) Because you want the a of each element in the list to be able to be instantiated differently... what you are actually seeking is an existential type.
[exists a. Show a => Rec -> a]
You are wishing for a form of subtyping that Haskell does not support very well. The above syntax is not supported at all by GHC. You can use newtypes to sort of accomplish this:
{-# LANGUAGE ExistentialQuantification #-}
data Showy = forall a. Show a => Showy a
fields :: [Rec -> Showy]
fields = [Showy . recNumber, Showy . recName]
But unfortunately, that is just as tedious as converting directly to strings, isn't it?
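For completeness, a sketch of how those wrapped values would be consumed (assuming the Rec type and the intercalate import from the question):
recToString :: Rec -> String
recToString r = intercalate "\t" $ map (\f -> case f r of Showy a -> show a) fields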
I don't believe that lens is capable of getting around this particular weakness of the Haskell type system:
recToString :: Rec -> String
recToString r = intercalate "\t" $ toListOf (each . to fieldShown) fields
  where fields = (recNumber, recName)
        fieldShown f = show (f r)
-- error: Couldn't match type Int with [Char]
Suppose the fields do have the same type:
fields = [recNumber, recNumber]
Then it works, and Haskell figures out which show function instance to use at compile time; it doesn't have to look it up dynamically.
If you manually write out show each time, as in your original example, then Haskell can determine the correct instance for each call to show at compile time.
As for existentials... it depends on implementation, but presumably, the compiler cannot determine which instance to use statically, so a dynamic lookup will be used instead.
I'd like to suggest something very simple instead:
recToString r = intercalate "\t" [s recNumber, s recName]
  where s f = show (f r)
All the elements of a list in Haskell must have the same type, so a list containing one Int and one String simply cannot exist. It is possible to get around this in GHC using existential types, but you probably shouldn't (this use of existentials is widely considered an anti-pattern, and it doesn't tend to perform terribly well). Another option would be to switch from a list to a tuple, and use some weird stuff from the lens package to map over both parts. It might even work.

Does Haskell have return type overloading?

Based on what I've read about Haskell, and the experimentation I've done with GHC, it seems like Haskell has return type overloading (aka ad hoc polymorphism). One example of this is the fromInteger function which can give you a Double or an Integer depending on where the result is used. For example:
fd :: Double -> String
fd x = "Double"
fi :: Integer -> String
fi x = "Integer"
fd (fromInteger 5) -- returns "Double"
fi (fromInteger 5) -- returns "Integer"
A Gentle Introduction to Haskell seems to agree with this when it says:
The kind of polymorphism that we have talked about so far is commonly called parametric polymorphism. There is another kind called ad hoc polymorphism, better known as overloading. Here are some examples of ad hoc polymorphism:
The literals 1, 2, etc. are often used to represent both fixed and arbitrary precision integers.
If the numeric literals are considered to be an example of ad hoc polymorphism (aka overloading), then it seems that the same is true for the result of functions like fromInteger.
And in fact, I've found some answers to other questions on Stack Overflow that suggest that Haskell has overloading by return type.
However, at least one Haskell programmer has told me that this isn't return type overloading, and is instead an example of "parametric polymorphism, where the parameter is bound by a universal quantifier".
I think what he's getting at is that fromInteger is returning a value from every instance of Num (sort of a nondeterministic type).
That seems like a reasonable interpretation, but as far as I can tell, Haskell never lets us look at more than one of these instance values (thanks in part to the monomorphism restriction). It also seems like the actual instance whose value we look at can be determined statically. Because of all of this, it seems reasonable to say that in the expression fd (fromInteger 5) the subexpression fromInteger 5 is of type Double, while in the expression fi (fromInteger 5) the subexpression fromInteger 5 is of type Integer.
So, does Haskell have return type overloading?
If not, please provide an example of one of the following:
valid Haskell code that would have different behavior if Haskell had return type overloading
valid Haskell code that would be invalid if Haskell had return type overloading
invalid Haskell code that would be valid if Haskell had return type overloading
Well, one way to look at it is that Haskell translates the return type polymorphism that you're thinking of into parametric polymorphism, using something called the dictionary-passing translation for type classes. (Though this is not the only way to implement type classes or reason about them; it's just the most popular.)
Basically, fromInteger has this type in Haskell:
fromInteger :: Num a => Integer -> a
That might be translated internally into something like this:
fromInteger# :: NumDictionary# a -> Integer -> a
fromInteger# NumDictionary# { fromInteger = method } x = method x
data NumDictionary# a = NumDictionary# { ...
, fromInteger :: Integer -> a
, ... }
So for each concrete type T with a Num instance, there's a NumDictionary# T value that contains a function fromInteger :: Integer -> T, and all code that uses the Num type class is translated into code that takes a dictionary as the argument.
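A simplified, self-contained sketch of that idea, reusing fd and fi from the question (the names NumDict, dictFromInteger, and fromInteger' are made up here; GHC's real machinery differs in detail):
newtype NumDict a = NumDict { dictFromInteger :: Integer -> a }

numDictDouble :: NumDict Double
numDictDouble = NumDict fromIntegral

numDictInteger :: NumDict Integer
numDictInteger = NumDict id

fromInteger' :: NumDict a -> Integer -> a
fromInteger' dict n = dictFromInteger dict n

-- fd (fromInteger 5) and fi (fromInteger 5) then correspond roughly to:
ex1 = fd (fromInteger' numDictDouble 5)    -- "Double"
ex2 = fi (fromInteger' numDictInteger 5)   -- "Integer"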
The seminal paper on Haskell-style typeclasses is called "How to make ad-hoc polymorphism less ad hoc". So, the answer to your question is a qualified "yes" -- depending on just how ad hoc you require your return-type overloading to be...
In other words: there is no question that ad hoc polymorphism is relevant to typeclasses, since that was a motivating example for inventing them. But whether you think the result still qualifies as "return-type overloading" depends on the fiddly details of your favored definition.
I'd like to address one small part of your question:
It also seems like the actual instance whose value we look at can be determined statically.
This isn't really accurate. Consider the following wacky data type:
data PerfectlyBalancedTree a
    = Leaf a
    | Branch (PerfectlyBalancedTree (a,a))
    deriving (Eq, Ord, Show, Read)
Let's gawk at that type for a second first before we move on to the good bits. Here are a few typical values of the type PerfectlyBalancedTree Integer:
Leaf 0
Branch (Leaf (0, 0))
Branch (Branch (Leaf ((0,0),(0,0))))
Branch (Branch (Branch (Leaf (((0,0),(0,0)),((0,0),(0,0))))))
In fact, you can visualize any value of this type as being an initial sequence of n Branch tags followed by a "we're finally done, thank goodness" Leaf tag followed by a 2^n-tuple of the contained type. Cool.
Now, we're going to write a function to parse a slightly more convenient representation for these. Here's a couple example invocations:
*Main> :t fromString
fromString :: String -> PerfectlyBalancedTree Integer
*Main> fromString "0"
Leaf 0
*Main> fromString "b(42,69)"
Branch (Leaf (42,69))
*Main> fromString "bbb(((0,0),(0,0)),((0,0),(0,0)))"
Branch (Branch (Branch (Leaf (((0,0),(0,0)),((0,0),(0,0))))))
Along the way, it will be convenient to write a slightly more polymorphic function. Here it is:
fromString' :: Read a => String -> PerfectlyBalancedTree a
fromString' ('b':rest) = Branch (fromString' rest)
fromString' leaf = Leaf (read leaf)
Now our real function is just the same thing with a different type signature:
fromString :: String -> PerfectlyBalancedTree Integer
fromString = fromString'
But wait a second... what just happened here? I slipped something by you big time! Why didn't we just write this directly?
fromStringNoGood :: String -> PerfectlyBalancedTree Integer
fromStringNoGood ('b':rest) = Branch (fromStringNoGood rest)
fromStringNoGood leaf = Leaf (read leaf)
The reason is that in the recursive call, fromStringNoGood has a different type. It's not being called on to return a PerfectlyBalancedTree Integer, it's being called on to return a PerfectlyBalancedTree (Integer, Integer). We could write ourselves such a function...
fromStringStillNoGood :: String -> PerfectlyBalancedTree Integer
fromStringStillNoGood ('b':rest) = Branch (helper rest)
fromStringStillNoGood leaf = Leaf (read leaf)
helper :: String -> PerfectlyBalancedTree (Integer, Integer)
helper ('b':rest) = Branch ({- ... what goes here, now? -})
helper leaf = Leaf (read leaf)
... but this way lies an infinite regress of writing ever more deeply nested types.
The problem is that, even though we're interested in a monomorphic top-level function, we nevertheless cannot determine statically what type read is being called at in the polymorphic function it uses! The data we're passed determines what type of tuple read should return: more bs in the String means a deeper-nested tuple.
You're right: Haskell does have overloading and it provides it through its type-class mechanism.
Consider the following signatures:
f :: [a] -> a
g :: Num a => [a] -> a
The first signature tells you that given a list of elements of any type a, f will produce a value of type a. This means that the implementation of f cannot make any assumptions about the type a and what operations it admits. This is an example of parametric polymorphism. A moment's reflection reveals that there are actually very few options for implementing f: the only thing you can do is select an element from the provided list. Conceptually, there is a single generic implementation of f that works for all types a.
The second signature tells you that given a list of elements of some type a that belongs to the type class Num, g will produce a value of that type a. This means that the implementation of g can consume, produce, and manipulate values of type a using all the operations that come with the type class Num. For example, g can add or multiply the elements of the list, select the minimum of the list, return a lifted constant, ... This is an example of overloading, which is typically taken to be a form of ad-hoc polymorphism (the other main form being coercion). Conceptually, there is a different implementation of g for each type a in Num.
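To make the contrast concrete, here are possible implementations of each signature (sketches; head is partial, but it illustrates the point):
-- With only parametric polymorphism, about all f can do is pick an element.
f :: [a] -> a
f = head

-- With the Num constraint, g can actually compute with the elements.
g :: Num a => [a] -> a
g = sum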
It has return-type overloading. For a good example, see the read function, which has the type Read a => String -> a. It can read and return anything whose type is an instance of the Read type class.
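For example:
n :: Int
n = read "42"        -- here read returns an Int

d :: Double
d = read "42"        -- the same source text, but now read returns 42.0 :: Double

p :: (Int, Bool)
p = read "(1,True)"  -- works for any type with a Read instance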
