Turtle Graphics as a Haskell Monad - haskell

I'm trying to implement turtle graphics in Haskell. The goal is to be able to write a function like this:
draw_something = do
forward 100
right 90
forward 100
...
and then have it produce a list of points (maybe with additional properties):
> draw_something (0,0) 0 -- start at (0,0) facing east (0 degrees)
[(0,0), (0,100), (-100,100), ...]
I have all this working in a 'normal' way, but I've failed to implement it as a Haskell Monad and use the do-notation. The basic code:
data State a = State (a, a) a -- (x,y), angle
deriving (Show, Eq)
initstate :: State Float
initstate = State (0.0,0.0) 0.0
-- constrain angles to 0 to 2*pi
fmod :: Float -> Float
fmod a
| a >= 2*pi = fmod (a-2*pi)
| a < 0 = fmod (a+2*pi)
| otherwise = a
forward :: Float -> State Float -> [State Float]
forward d (State (x,y) angle) = [State (x + d * (sin angle), y + d * (cos angle)) angle]
right :: Float -> State Float -> [State Float]
right d (State pos angle) = [State pos (fmod (angle+d))]
bind :: [State a] -> (State a -> [State a]) -> [State a]
bind xs f = xs ++ (f (head $ reverse xs))
ret :: State a -> [State a]
ret x = [x]
With this I can now write
> [initstate] `bind` (forward 100) `bind` (right (pi/2)) `bind` (forward 100)
[State (0.0,0.0) 0.0,State (0.0,100.0) 0.0,State (0.0,100.0) 1.5707964,State (100.0,99.99999) 1.5707964]
And get the expected result. However I can't make this an instance of Monad.
instance Monad [State] where
...
results in
`State' is not applied to enough type arguments
Expected kind `*', but `State' has kind `* -> *'
In the instance declaration for `Monad [State]'
And if I wrap the list in a new object
data StateList a = StateList [State a]
instance Monad StateList where
return x = StateList [x]
I get
Couldn't match type `a' with `State a'
`a' is a rigid type variable bound by
the type signature for return :: a -> StateList a
at logo.hs:38:9
In the expression: x
In the first argument of `StateList', namely `[x]'
In the expression: StateList [x]
I tried various other versions but I never got it to run as I'd like to. What am I doing wrong? What do I understand incorrectly?

The monad you're devising needs to have two type parameters. One for the saved trail (which will be fixed for a particular do sequence) and other for the results of computations.
You also need to think about how to compose two turtle-monadic values so that the binding operation is associative. For example,
right 90 >> (right 90 >> forward 100)
must be equal to
(right 90 >> right 90) >> forward 100
(and of course similarly for >>= etc.). This means that if you represent the turtle's history by a list of points, the binding operation most likely just cannot append the lists of points together; forward 100 alone will result in something like [(0,0),(100,0)] but when it's prepended with rotation, the saved points need to be rotated too.
I'd say that the simplest approach would be to use the Writer monad. But I wouldn't save the points, I'd save just the actions the turtle performs (so that we don't need to rotate the points when combining the values). Something like
data Action = Rotate Double | Forward Double
type TurtleMonad a = Writer [Action] a
(This also means that we don't need to track the current direction, it's contained in the actions.) Then each of your functions just writes its argument into the Writer. And at the end, you can extract the final list from it and make a simple function that converts all the actions into a list of points:
track :: [Action] -> [(Double,Double)]
Update: Instead of using [Action] it would be better to use Seq from Data.Sequence. It's also a monoid and concatenating two sequences is very fast, it's amortized complexity is O(log(min(n1,n2))), compared to O(n1) of (++). So the improved type would be
type TurtleMonad a = Writer (Seq Action) a

Related

How and when to use State functor and State applicative?

I've seen the Maybe and Either functor (and applicative) used in code and that made sense, but I have a hard time coming up with an example of the State functor and applicative. Maybe they are not very useful and only exist because the State monad requires a functor and an applicative? There are plenty of explanations of their implementations out there but not any examples when they are used in code, so I'm looking for illustrations of how they might be useful on their own.
I can think of a couple of examples off the top of my head.
First, one common use for State is to manage a counter for the purpose of making some set of "identifiers" unique. So, the state itself is an Int, and the main primitive state operation is to retrieve the current value of the counter and increment it:
-- the state
type S = Int
newInt :: State S Int
newInt = state (\s -> (s, s+1))
The functor instance is then a succinct way of using the same counter for different types of identifiers, such as term- and type-level variables in some language:
type Prefix = String
data Var = Var Prefix Int
data TypeVar = TypeVar Prefix Int
where you generate fresh identifiers like so:
newVar :: Prefix -> State S Var
newVar s = Var s <$> newInt
newTypeVar :: Prefix -> State S TypeVar
newTypeVar s = TypeVar s <$> newInt
The applicative instance is helpful for writing expressions constructed from such unique identifiers. For example, I've used this approach pretty frequently when writing type checkers, which will often construct types with fresh variables, like so:
typeCheckAFunction = ...
let freshFunctionType = ArrowType <$> newTypeVar <*> newTypeVar
...
Here, freshFunctionType is a new a -> b style type with fresh type variables a and b that can be passed along to a unification step.
Second, another use of State is to manage a seed for random number generation. For example, if you want a low-quality but ultra-fast LCG generator for something, you can write:
lcg :: Word32 -> Word32
lcg x = (a * x + c)
where a = 1664525
c = 1013904223
-- monad for random numbers
type L = State Word32
randWord32 :: L Word32
randWord32 = state $ \s -> let s' = lcg s in (s', s')
The functor instance can be used to modify the Word32 output using a pure conversion function:
randUniform :: L Double
randUniform = toUnit <$> randWord32
where toUnit w = fromIntegral w / fromIntegral (maxBound `asTypeOf` w)
while the applicative instance can be used to write primitives that depend on multiple Word32 outputs:
randUniform2 :: L (Double, Double)
randUniform2 = (,) <$> randUniform <*> randUniform
and expressions that use your random numbers in a reasonably natural way:
-- area of a random triangle, say
a = areaOf <$> (Triangle <$> randUniform2 <*> randUniform2 <$> randUniform2)

What constitutes codata in the context of programming?

This is a corecursive algorithm, because with each iteration it calls itself on data that is greater then what it had before:
iterate f x = x : iterate f (f x)
It is similar to tail recursion accumulator style, but its accumulator is implicit instead of being passed as an argument. And it would be infinite if it weren't for lazyness. So is codata just the result of a value constructor in WHNF, kind of like (a, thunk)? Or is codata rather a mathematical term from category theory, which hasn't a useful representation in the programming domain?
Follow-up question: Is value recursion just a synonym for corecursion?
I think answering your questions requires a lot of explanation, so here's a big long answer with specific answers to your questions at the end.
Data and codata have formal mathematical definitions in terms of category theory, so it's not just a matter of how they are used in a program (i.e., not just the "application context" you mentioned in the comments). It may seem this way in Haskell because the language's features (specifically, non-termination and laziness) end up blurring the distinction, so in Haskell, all data is also codata and vice versa, but it doesn't have to be this way, and there are languages that make the distinction clearer.
Both data and codata do have useful representations in the programming domain, and those representations give rise to natural relationships to recursion and corecursion.
It's quite hard to explain these formal definitions and representations without quickly getting technical, but roughly speaking, a data type for, say, a list of integers, is a type L together with a constructor function:
makeL :: Either () (Int, L) -> L
that is somehow "universal" in that it can fully represent any such construction. (Here, you want to interpret the LHS type Either () (Int, L) to mean that a list L is either the empty list Left () or a pair Right (h, t) consisting of the head element h :: Int and a tail list t :: L.)
To start with a counterexample, L = Bool is not the data type we're looking for, because even though you could write:
foo :: Either () (Int, Bool) -> Bool
foo (Left ()) = False
foo (Right (h, t)) = True
to "construct" a Bool, this can't fully represent any such construction. For example, the two constructions:
foo (Right (1, foo (Left ()))) = True
foo (Right (2, foo (Left ()))) = True
give the same Bool value, even though they used different integers, so this Bool value is insufficient to fully represent the construction.
In contrast, the type [Int] is an appropriate data type because the (almost trivial) constructor function:
makeL :: Either () (Int, [Int]) -> [Int]
makeL (Left ()) = []
makeL (Right (h, t)) = h : t
fully represents any possible construction, creating a unique value for each one. So, it's somehow the "natural" construction for the type signature Either () (Int, L) -> L.
Similarly, a codata type for a list of integers would be a type L together with a destructor function:
eatL :: L -> Either () (Int, L)
that is somehow "universal" in the sense that it can represent any possible destruction.
Again, starting with a counterexample, a pair (Int, Int) is not the codata type we're looking for. For example, with the destructor:
eatL :: (Int, Int) -> Either () (Int, (Int, Int))
eatL (a, b) = Right (a, (b, a))
we can represent the destruction:
let p0 = (1, 2)
Right (1, p1) = eatL p0
Right (2, p2) = eatL p1
Right (1, p3) = eatL p2
Right (2, p4) = eatL p3
...continue indefinitely or stop whenever you want...
but we can't represent the destruction:
let p0 = (?, ?)
Right (1, p1) = eatL p0
Right (2, p2) = eatL p1
Right (3, p3) = eatL p2
Left () = eatL p3
On the other hand, in Haskell, the list type [Int] is an appropriate codata type for a list of integers, because the destructor:
eatL :: [Int] -> Either () (Int, [Int])
eatL (x:xs) = Right (x, xs)
eatL [] = Left ()
can represent any possible destruction (including both finite or infinite destructions, thanks to Haskell's lazy lists).
(As evidence that this isn't all hand-waving and in case you want to relate it back to the formal math, in technical category theory terms, the above is equivalent to saying that the list-like endofunctor:
F(A) = 1 + Int*A -- RHS equivalent to "Either () (Int,A)"
gives rise to a category whose objects are constructor functions (AKA F-algebras) 1 + Int*A -> A. A data type associated with F is an initial F-algebra in this category. F also gives rise to another category whose objects are destructor functions (AKA F-coalgebras) A -> 1 + Int*A. A codata type associated with F is a final F-coalgebra in this category.)
In intuitive terms, as suggested by #DanielWagner, a data type is a way of representing any construction of a list-like object, while a codata type is a way of representing any destruction of a list-like object. In languages where data and codata are different, there's a fundamental asymmetry -- a terminating program can only construct a finite list, but it can destruct (the first part of) an infinite list, so data must be finite, but codata can be finite or infinite.
This leads to another complication. In Haskell, we can use makeL to construct an infinite list like so:
myInfiniteList = let t = makeL (Right (1, t)) in t
Note that this would not be possible if Haskell didn't allow lazy evaluation of non-terminating programs. Because we can do this, by the formal definition of "data", a Haskell list-of-integer data type must also include infinite lists! That is, Haskell "data" can be infinite.
This probably conflicts with what you might read elsewhere (and even with the intuition that #DanielWagner provided), where "data" is used to refer to finite data structures only. Well, because Haskell is a little weird and because infinite data isn't allowed in other languages where data and codata are distinct, when people talk about "data" and "codata" (even in Haskell) and are interested in drawing a distinction, they may use "data" to refer to finite structures only.
The way recursion and corecursion fit in to this is that the universality properties naturally give us "recursion" to consume data and "corecursion" to produce codata. If L is a list-of-integer data type with constructor function:
makeL :: Either () (Int, L) -> L
then one way of consuming a list L to produce a Result is to define a (non-recursive) function:
makeResult :: Either () (Int, Result) -> Result
Here, makeResult (Left ()) gives the intended result for an empty list, while makeResult (Right (h, t_result)) gives the intended result for a list whose head element is h :: Int and whose tail would give the result t_result :: Result.
By universality (i.e., the fact that makeL is an initial F-algebra), there exists a unique function process :: L -> Result that "implements" makeResult. In practice, it will be implemented recursively:
process :: [Int] -> Result
process [] = makeResult (Left ())
process (h:t) = makeResult (Right (h, process t))
Conversely, if L is a list-of-integer codata type with destructor function:
eatL :: L -> Either () (Int, L)
then one way of producing a list L from a Seed is to define a (non-recursive) function:
unfoldSeed :: Seed -> Either () (Int, Seed)
Here, unfoldSeed should produce a Right (x, nextSeed) for each desired integer, and produce Left () to terminate the list.
By universality (i.e., the fact that eatL is a final F-coalebra), there exists a unique function generate :: Seed -> L that "implements" unfoldSeed. In practice, it will be implemented corecursively:
generate :: Seed -> [Int]
generate s = case unfoldSeed s of
Left () -> []
Right (x, s') -> x : generate s'
So, with all that said, here are the answers to your original questions:
Technically, iterate f is corecursive because it's the unique codata-producing function Int -> [Int] that implements:
unfoldSeed :: Seed -> Either () (Int, Seed)
unfoldSeed x = Right (x, f x)
by means of generate as defined above.
In Haskell, corecursion that produces codata of type [a] relies on laziness. However, strict codata representations are possible. For example, the following codata representation works fine in Strict Haskell and can be safely fully evaluated.
data CoList = End | CoList Int (() -> CoList)
The following corecursive function produces a CoList value (and I made it finite just for fun -- it's easy to produce infinite codata values, too):
countDown :: Int -> CoList
countDown n | n > 0 = CoList n (\() -> countDown (n-1))
| otherwise = End
So, no, codata isn't just the result of values in WHNF with form (a, thunk) or similar and corecursion is not synonymous with value recursion. However, WHNF and thunks provide one possible implementation and are the implementation-level reason that a "standard" Haskell list data type is also a codata type.

Using show on a custom type

I'm having trouble printing contents of a custom matrix type I made. When I try to do it tells me
Ambiguous occurrence `show'
It could refer to either `MatrixShow.show',
defined at Matrices.hs:6:9
or `Prelude.show',
imported from `Prelude' at Matrices.hs:1:8-17
Here is the module I'm importing:
module Matrix (Matrix(..), fillWith, fromRule, numRows, numColumns, at, mtranspose, mmap) where
newtype Matrix a = Mat ((Int,Int), (Int,Int) -> a)
fillWith :: (Int,Int) -> a -> (Matrix a)
fillWith (n,m) k = Mat ((n,m), (\(_,_) -> k))
fromRule :: (Int,Int) -> ((Int,Int) -> a) -> (Matrix a)
fromRule (n,m) f = Mat ((n,m), f)
numRows :: (Matrix a) -> Int
numRows (Mat ((n,_),_)) = n
numColumns :: (Matrix a) -> Int
numColumns (Mat ((_,m),_)) = m
at :: (Matrix a) -> (Int, Int) -> a
at (Mat ((n,m), f)) (i,j)| (i > 0) && (j > 0) || (i <= n) && (j <= m) = f (i,j)
mtranspose :: (Matrix a) -> (Matrix a)
mtranspose (Mat ((n,m),f)) = (Mat ((m,n),\(j,i) -> f (i,j)))
mmap :: (a -> b) -> (Matrix a) -> (Matrix b)
mmap h (Mat ((n,m),f)) = (Mat ((n,m), h.f))
This is my module:
module MatrixShow where
import Matrix
instance (Show a) => Show (Matrix a) where
show (Mat ((x,y),f)) = show f
Also is there some place where I can figure this out on my own, some link with instructions or some tutorial or something to learn how to do this.
The problem is with your indentation. The definition of show needs to be indented relative to the instance show a => Show (Matrix a). As it is, it appears that you are trying to define a new function called show, unrelated to the Show class, which you can't do.
#dfeuer, whose name I continue to have trouble spelling, has given you the direct answer - Haskell is sensitive to layout - but I'm going to try to help you with the underlying question that you've alluded to in the comments, without giving you the full answer.
You mentioned that you were confused about how matrices are represented. Read the source, Luke:
newtype Matrix a = Mat ((Int,Int), (Int,Int) -> a)
This newtype declaration tells you that a Matrix is formed from a pair ((Int,Int), (Int,Int) -> a). If you split up the tuple, that's an (Int, Int) pair and a function of type (Int, Int) -> a (a function with two integer arguments which returns something of arbitrary type a). This suggests to me that the first part of the tuple represents the size of the matrix, and the second part is a function mapping coordinates onto elements. This hypothesis seems to be confirmed by some of the example code your professor has given you - have a look at at or mtranspose, for example.
So, the question is - given the width and height of the matrix, and a function which will give you the element at a given coordinate, how do we give a string showing the items in the matrix?
The first thing we need to do is enumerate all the possible coordinates for the given width and height of the matrix. Haskell provides some useful syntactic constructs for this sort of operation - we can write [x .. y] to enumerate all the values between x and y, and use a list comprehension to unpack those enumerations in a nested loop.
coords :: (Int, Int) -- (width, height)
-> [(Int, Int)] -- (x, y) pairs
coords (w, h) = [(x, y) | x <- [0 .. w], y <- [0 .. h]]
For example:
ghci> coords (2, 4)
[(0,0),(0,1),(0,2),(0,3),(0,4),(1,0),(1,1),(1,2),(1,3),(1,4),(2,0),(2,1),(2,2),(2,3),(2,4)]
Now that we've worked out how to list all the possible coordinates in a matrix, how do we turn coordinates into elements of type a? Well, the Mat constructor contains a function (Int, Int) -> a which gives you the element associated with a single coordinate. We need to apply that function to each of the coordinates in the list which we just enumerated. This is what map does.
elems :: Matrix a -> [a]
elems (Mat (size, f)) = map f $ coords size
So, there's the code to enumerate the elements of a matrix. Can you figure out how to modify this code so that a) it shows the elements as a string and b) it shows them in a row-by-row fashion? You'll probably need to adjust both of these functions.
I suppose the broader point I'd like to make is that even though it feels like your professor has thrown you into the deep end, it's always possible to do a little detective work and figure out for yourself what something means. Many - most? - of the people answering questions on this site are self-taught programmers, myself included. We persevered!
After all, it's just code. If a computer's going to understand it then it must be written down on the page, and that means that you can understand it, too.

QuickCheck giving up investigating a recursive data structure (rose tree.)

Given an arbitrary tree, I can construct a subtype relation over that tree, using Schubert numbering:
constructH :: Tree a -> Tree (Type a)
where Type nests the original label, and additionally provides the data needed to perform child/parent (or subtype) checks. With Schubert Numbering, the two Int parameters are sufficient for that.
data Type a where !Int -> !Int -> a -> Type a
This leads to the binary predicate
subtypeOf :: Type a -> Type a -> Bool
I now want to test with QuickCheck that this does indeed do what I want it to do. The following property, however, does not work, because QuickCheck just gives up:
subtypeSanity ∷ Tree (Type ()) → Gen Prop
subtypeSanity Node { rootLabel = t, subForest = f } =
let subtypes = concatMap flatten f
in (not $ null subtypes) ==> conjoin
(forAll (elements subtypes) (\x → x `subtypeOf` t):(map subtypeSanity f))
If I leave out the recursive call to subtypeSanity, i.e. the tail of the list I'm passing to conjoin, the property runs fine, but tests just the root node of the tree! How can I descend into my data structure recursively without QuickCheck giving up on generating new test cases?
If needed, I could provide the code to construct the Schubert Hierarchy, and the Arbitrary instance for Tree (Type a), to provide a complete runnable example, but that would be quite a bit of code. I'm convinced that I'm just not "getting" QuickCheck, and using it in the wrong way here.
EDIT: unfortunately, the sized function does not seem to eliminate the problem here. It ends up with the same result (see comment to J. Abrahamson's answer.)
EDIT II: I ended up "fixing" my problem by avoiding the recursive step, and avoiding conjoin. We just make a list of all nodes in the tree, then test the single-node property (which worked fine from the beginning) on those.
allNodes ∷ Tree a → [Tree a]
allNodes n#(Node { subForest = f }) = n:(concatMap allNodes f)
subtypeSanity ∷ Tree (Type ()) → Gen Prop
subtypeSanity tree = forAll (elements $ allNodes tree)
(\(Node { rootLabel = t, subForest = f }) →
let subtypes = concatMap flatten f
in (not $ null subtypes) ==> forAll (elements subtypes) (\x → x `subtypeOf` t))
Tweaking the Arbitrary instance for trees did not work. Here is the arbitrary instance I'm still using:
instance (Arbitrary a, Eq a) ⇒ Arbitrary (Tree (Type a)) where
arbitrary = liftM (constructH) $ sized arbTree
arbTree ∷ Arbitrary a ⇒ Int → Gen (Tree a)
arbTree n = do
m ← choose (0,n)
if m == 0
then Node <$> arbitrary <*> (return [])
else do part ← randomPartition n m
Node <$> arbitrary <*> mapM arbTree part
-- this is a crude way to find a sufficiently random x1,..,xm,
-- such that x1 + .. + xm = n, for any n, m, with 0 < m.
randomPartition ∷ Int → Int → Gen [Int]
randomPartition n m' = do
let m = m' - 1
seed ← liftM ((++[n]) . sort) $ replicateM m (choose (0,n))
return $ zipWith (-) seed (0:seed)
I consider the problem "solved for now," but if someone could explain to me why the recursive step and/or conjoin made QuickCheck give up (after passing "only" 0 tests,) I would be more than grateful.
When generating Arbitrary recursive structures, QuickCheck is often a bit too eager and generates sprawling, enormous random examples. These are undesirable as they usually don't better check the properties of interest and can be very slow. Two solutions are
Use things like the size parameter (sized function) and frequency function to bias the generator toward small trees.
Use a small-type oriented generator like those in smallcheck. These try to exhaustively generate all "small" examples and thus help to keep the size of the tree down.
To clarify the sized and frequency method of controlling generation size, here's an example RoseTree
data Rose a = It a | Rose [Rose a]
instance Arbitrary a => Arbitrary (Rose a) where
arbitrary = frequency
[ (3, It <$> arbitrary) -- The 3-to-1 ratio is chosen, ah,
-- arbitrarily...
-- you'll want to tune it
, (1, Rose <$> children)
]
where children = sized $ \n -> vectorOf n arbitrary
It can be done even more simply with a different Rose formation by very carefully controlling the size of the child list
data Rose a = Rose a [Rose a]
instance Arbitrary a => Arbitrary (Rose a) where
arbitrary = Rose <$> arbitrary <*> sized (\n -> vectorOf (tuneUp n) arbitrary)
where tuneUp n = round $ fromIntegral n / 4.0
You could do this without referencing sized, but that gives the user of your Arbitrary instance a knob to ask for larger trees if needed.
In case it's useful for those stumbling across this issue: when QuickCheck "gives up", it's a sign that your pre-condition (using ==>) is too hard to satisfy.
QuickCheck uses a simple rejection sampling technique: pre-conditions have no effect on the generation of values. QuickCheck generates a bunch of random values like normal. After these are generated, they're sent through the pre-condition: if the result is True, the property is tested with that value; if it's False, that value is discarded. If your pre-condition rejects most of the values QuickCheck has generated, then QuickCheck will "give up" (better to give up completely, than to make statistically dubious pass/fail claims).
In particular, QuickCheck will not attempt to produce values which satisfy a given pre-condition. It's up to you to make sure that the generator you're using (arbitrary or otherwise) produces lots of values which pass your pre-condition.
Let's see how this is manifesting in your example:
subtypeSanity :: Tree (Type ()) -> Gen Prop
subtypeSanity Node { rootLabel = t, subForest = f } =
let subtypes = concatMap flatten f
in (not $ null subtypes) ==> conjoin
(forAll (elements subtypes) (`subtypeOf` t):(map subtypeSanity f))
There is only one occurance of ==>, so its precondition (not $ null subtypes) must be too hard to satisfy. This is due to the recursive call map subtypeSanity f: not only are you rejecting any Tree which has an empty subForest, you're also (due to the recursion) rejecting any Tree where the subForest contains Trees with empty subForests, and rejecting any Tree where the subForest contains Trees with subForests containing Trees with empty subForests, and so on.
According to your arbitrary instance, Trees are only nested to finite depth: eventually we will always reach an empty subForest, hence your recursive precondition will always fail, and QuickCheck will give up.

Different types in case expression result in Haskell

I'm trying to implement some kind of message parser in Haskell, so I decided to use types for message types, not constructors:
data DebugMsg = DebugMsg String
data UpdateMsg = UpdateMsg [String]
.. and so on. I belive it is more useful to me, because I can define typeclass, say, Msg for message with all information/parsers/actions related to this message.
But I have problem here. When I try to write parsing function using case:
parseMsg :: (Msg a) => Int -> Get a
parseMsg code =
case code of
1 -> (parse :: Get DebugMsg)
2 -> (parse :: Get UpdateMsg)
..type of case result should be same in all branches. Is there any solution? And does it even possible specifiy only typeclass for function result and expect it to be fully polymorphic?
Yes, all the right hand sides of all your subcases must have the exact same type; and this type must be the same as the type of the whole case expression. This is a feature; it's required for the language to be able to guarantee at compilation time that there cannot be any type errors at runtime.
Some of the comments on your question mention that the simplest solution is to use a sum (a.k.a. variant) type:
data ParserMsg = DebugMsg String | UpdateMsg [String]
A consequence of this is that the set of alternative results is defined ahead of time. This is sometimes an upside (your code can be certain that there are no unhandled subcases), sometimes a downside (there is a finite number of subcases and they are determined at compilation time).
A more advanced solution in some cases—which you might not need, but I'll just throw it in—is to refactor the code to use functions as data. The idea is that you create a datatype that has functions (or monadic actions) as its fields, and then different behaviors = different functions as record fields.
Compare these two styles with this example. First, specifying different cases as a sum (this uses GADTs, but should be simple enough to understand):
{-# LANGUAGE GADTs #-}
import Data.Vector (Vector, (!))
import qualified Data.Vector as V
type Size = Int
type Index = Int
-- | A 'Frame' translates between a set of values and consecutive array
-- indexes. (Note: this simplified implementation doesn't handle duplicate
-- values.)
data Frame p where
-- | A 'SimpleFrame' is backed by just a 'Vector'
SimpleFrame :: Vector p -> Frame p
-- | A 'ProductFrame' is a pair of 'Frame's.
ProductFrame :: Frame p -> Frame q -> Frame (p, q)
getSize :: Frame p -> Size
getSize (SimpleFrame v) = V.length v
getSize (ProductFrame f g) = getSize f * getSize g
getIndex :: Frame p -> Index -> p
getIndex (SimpleFrame v) i = v!i
getIndex (ProductFrame f g) ij =
let (i, j) = splitIndex (getSize f, getSize g) ij
in (getIndex f i, getIndex g j)
pointIndex :: Eq p => Frame p -> p -> Maybe Index
pointIndex (SimpleFrame v) p = V.elemIndex v p
pointIndex (ProductFrame f g) (p, q) =
joinIndexes (getSize f, getSize g) (pointIndex f p) (pointIndex g q)
joinIndexes :: (Size, Size) -> Index -> Index -> Index
joinIndexes (_, rsize) i j = i * rsize + j
splitIndex :: (Size, Size) -> Index -> (Index, Index)
splitIndex (_, rsize) ij = (ij `div` rsize, ij `mod` rsize)
In this first example, a Frame can only ever be either a SimpleFrame or a ProductFrame, and every Frame function must be defined to handle both cases.
Second, datatype with function members (I elide code common to both examples):
data Frame p = Frame { getSize :: Size
, getIndex :: Index -> p
, pointIndex :: p -> Maybe Index }
simpleFrame :: Eq p => Vector p -> Frame p
simpleFrame v = Frame (V.length v) (v!) (V.elemIndex v)
productFrame :: Frame p -> Frame q -> Frame (p, q)
productFrame f g = Frame newSize getI pointI
where newSize = getSize f * getSize g
getI ij = let (i, j) = splitIndex (getSize f, getSize g) ij
in (getIndex f i, getIndex g j)
pointI (p, q) = joinIndexes (getSize f, getSize g)
(pointIndex f p)
(pointIndex g q)
Here the Frame type takes the getIndex and pointIndex operations as data members of the Frame itself. There isn't a fixed compile-time set of subcases, because the behavior of a Frame is determined by its element functions, which are supplied at runtime. So without having to touch those definitions, we could add:
import Control.Applicative ((<|>))
concatFrame :: Frame p -> Frame p -> Frame p
concatFrame f g = Frame newSize getI pointI
where newSize = getSize f + getSize g
getI ij | ij < getSize f = ij
| otherwise = ij - getSize f
pointI p = getPoint f p <|> fmap (+(getSize f)) (getPoint g p)
I call this second style "behavioral types," but that really is just me.
Note that type classes in GHC are implemented similarly to this—there is a hidden "dictionary" argument passed around, and this dictionary is a record whose members are implementations for the class methods:
data ShowDictionary a { primitiveShow :: a -> String }
stringShowDictionary :: ShowDictionary String
stringShowDictionary = ShowDictionary { primitiveShow = ... }
-- show "whatever"
-- ---> primitiveShow stringShowDictionary "whatever"
You could accomplish something like this with existential types, however it wouldn't work how you want it to, so you really shouldn't.
Doing it with normal polymorphism, as you have in your example, won't work at all. What your type says is that the function is valid for all a--that is, the caller gets to choose what kind of message to receive. However, you have to choose the message based on the numeric code, so this clearly won't do.
To clarify: all standard Haskell type variables are universally quantified by default. You can read your type signature as ∀a. Msg a => Int -> Get a. What this says is that the function is define for every value of a, regardless of what the argument may be. This means that it has to be able to return whatever particular a the caller wants, regardless of what argument it gets.
What you really want is something like ∃a. Msg a => Int -> Get a. This is why I said you could do it with existential types. However, this is relatively complicated in Haskell (you can't quite write a type signature like that) and will not actually solve your problem correctly; it's just something to keep in mind for the future.
Fundamentally, using classes and types like this is not very idiomatic in Haskell, because that's not what classes are meant to do. You would be much better off sticking to a normal algebraic data type for your messages.
I would have a single type like this:
data Message = DebugMsg String
| UpdateMsg [String]
So instead of having a parse function per type, just do the parsing in the parseMsg function as appropriate:
parseMsg :: Int -> String -> Message
parseMsg n msg = case n of
1 -> DebugMsg msg
2 -> UpdateMsg [msg]
(Obviously fill in whatever logic you actually have there.)
Essentially, this is the classical use for normal algebraic data types. There is no reason to have different types for the different kinds of messages, and life is much easier if they have the same type.
It looks like you're trying to emulate sub-typing from other languages. As a rule of thumb, you use algebraic data types in place of most of the uses of sub-types in other languages. This is certainly one of those cases.

Resources