Requiring that a function must be a retraction in Haskell?

I'm envisioning an implementation of a monadic graph. I'll do my best to explain how it is to be constructed here.
The Graph type should be isomorphic to the following:
data Graph e v = Graph{ vertices :: [v], edges :: [(e, (v, v))] }
Where e is the edge type, and v is the vertex type, we include a list of vertices and a list of edges along with the vertices they connect.
What I'm envisioning is a monad instance of this type as follows:
instance Monad (Graph e) where
  return v = Graph [v] [] -- a graph with one vertex and no edges
  m >>= f = {- see below -}
I have an idea of how to implement >>= which basically takes each vertex, maps it to a new graph, and then re-connects the vertex which built each graph correspondingly based on how the original graph was connected.
For example, consider a function f which takes a vertex and produces the complete graph on two vertices (K_2) from it. Then if we bound K_2 itself to f, we'd get something like:
A----B
|    |
C    D
where the graph A----B was the original, and the graphs A----C and B----D were produced from A and B respectively. In the end, A and B need to be connected since they were connected in the original graph. Note that A and B need not be exactly the same as before, but they need to map directly to something in the new graph. I'm leaving out some information for simplicity (what the edges of the graph are, etc.), but the main point I've noticed is that for this to actually work as a Monad instance, A needs to be mapped directly to a vertex in f A, and the same goes for B. In general, each vertex in the original graph needs to be mapped directly to a vertex in the graph that f produces from it.
If I'm understanding correctly, this means that f must be a retraction for some other morphism g. If it is, we can clearly join the graph by connecting each morphed vertex in its resulting graph to the morphed vertices in the others, producing a new graph of the type we want.
Mostly this is just an idea I had, but I really wanted to know if there is any way, in Haskell, to require that f be a retraction. Is there a way to state this within the confines of the language in order to supply an appropriate instance of Monad for a graph, or must I say "this is really only a monad if the function you're binding to is a retraction"? I suspect the latter, but I just wanted to check.
Alternatively, I may be understanding everything wrong! Feel free to correct me or give me some thoughts of your own.

Like the comments say, you could use a pointed graph:
module PointedGraph where
import Control.Arrow (second)
import Control.Monad (ap, liftM)
data PointedGraph e v = PointedGraph { hops :: [(e, PointedGraph e v)], center :: v }
  deriving (Eq, Show)
-- Functor and Applicative instances are required since GHC 7.10.
instance Functor (PointedGraph e) where
  fmap = liftM
instance Applicative (PointedGraph e) where
  pure = PointedGraph []
  (<*>) = ap
instance Monad (PointedGraph e) where
  return = pure
  PointedGraph hs c >>= f = PointedGraph (hs' ++ map (second (>>= f)) hs) c'
    where PointedGraph hs' c' = f c
connect :: PointedGraph e v -> e -> PointedGraph e v -> PointedGraph e v
connect g e g' = g { hops = (e,g') : hops g }
k2, ex :: PointedGraph String Int
k2 = connect (return 0) "original" (return 2)
ex = do
  n <- k2
  connect (return n) "derived" (return $ n + 1)
So this makes:
k2: 0 -original-> 2
ex: 0 -original-> 2
    |             |
 derived       derived
    |             |
    v             v
    1             3
Note that we have no checking for uniqueness of the vertex labels (that'd require an Eq constraint or the like) so we could easily have something like
k2 >>= const k2:
0 -original-> 0
|             |
original      original
|             |
v             v
2             2
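One way to check diagrams like the ones above is to flatten a pointed graph into its list of labelled edges. The sketch below repeats the answer's definitions so that it stands alone; `edges` is a hypothetical helper that is not part of the answer, and the Functor and Applicative instances are only there because recent GHC versions require them.

```haskell
import Control.Arrow (second)
import Control.Monad (ap, liftM)

data PointedGraph e v = PointedGraph
  { hops :: [(e, PointedGraph e v)]
  , center :: v
  } deriving (Eq, Show)

instance Functor (PointedGraph e) where
  fmap = liftM

instance Applicative (PointedGraph e) where
  pure = PointedGraph []
  (<*>) = ap

instance Monad (PointedGraph e) where
  PointedGraph hs c >>= f = PointedGraph (hs' ++ map (second (>>= f)) hs) c'
    where PointedGraph hs' c' = f c

connect :: PointedGraph e v -> e -> PointedGraph e v -> PointedGraph e v
connect g e g' = g { hops = (e, g') : hops g }

-- | Hypothetical helper (not part of the answer): every edge reachable from
-- the center, as (source label, edge label, target label).
edges :: PointedGraph e v -> [(v, e, v)]
edges g = concat [(center g, e, center h) : edges h | (e, h) <- hops g]

k2, ex :: PointedGraph String Int
k2 = connect (return 0) "original" (return 2)
ex = do
  n <- k2
  connect (return n) "derived" (return (n + 1))
```

With these definitions, `edges ex` lists the three edges of the `ex` diagram: 0 to 1 and 2 to 3 labelled "derived", and 0 to 2 labelled "original".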


How can I have a vector that's strict in its values, like a normal type with bangs (!)?

Some common performance advice in Haskell is to make fast data structures "spine strict" so that the structure, but not necessarily its contents, is fully evaluated as it is created. This lets us do more work when we insert a value and the structure is in cache as opposed to putting it off until we look a value up.
With a normal data type, like the binary trie from Data.IntMap, this can be accomplished by making the relevant fields in the data structure strict:
data IntMap a = Bin {- ... -} !(IntMap a) !(IntMap a)
              | {- ... -}
(Excerpt from the Data.IntMap.Base source.)
How can I achieve the same behavior if I want to store the children in a vector rather than directly as fields of Bin?
data IntMap a = Bin {- ... -} (Vector (IntMap a))
              | {- ... -}
First, I'll answer a simple variant of the question:
If your data type is unboxable, e.g. you want a strict vector of Ints, use Data.Vector.Unboxed.
As a free bonus, the implementation stores pairs as a "structure of arrays", (Vector a, Vector b), even though the interface is the less error-prone "array of structures", Vector (a, b). See Wikipedia on AOS and SOA.
Yet, in the OP's question, we want to put IntMap a into a Vector, and IntMap isn't unboxable (or storable or primitive).
The various options boil down to the same idea: you have to seq the values yourself. Even if you implement your own Data.Vector.Strict on top of Data.Vector (note: basicClear can be a no-op, as it is for unboxed vectors, or you can use unsafeCoerce () as a dummy value), you will seq values. This is how Data.Map.Strict is implemented on top of the same lazy structure as Data.Map.Lazy.
For example, Data.Map.Strict.map is implemented as:
map :: (a -> b) -> Map k a -> Map k b
map f = go where
  go Tip = Tip
  go (Bin sx kx x l r) = let !x' = f x in Bin sx kx x' (go l) (go r)
Compare that to the lazy Data.Map.Lazy.map:
map :: (a -> b) -> Map k a -> Map k b
map f = go where
  go Tip = Tip
  go (Bin sx kx x l r) = Bin sx kx (f x) (go l) (go r)
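The difference between the two definitions can be observed from the outside by mapping a function that always fails and then evaluating only the outermost constructor of the result. This is a small sketch using the containers package; `throwsWhenForced`, `lazyMapped` and `strictMapped` are names made up for this illustration.

```haskell
import Control.Exception (SomeException, catch, evaluate)
import qualified Data.Map.Lazy as ML
import qualified Data.Map.Strict as MS

-- | True when evaluating the argument to weak head normal form throws.
throwsWhenForced :: a -> IO Bool
throwsWhenForced x = (False <$ evaluate x) `catch` handler
  where
    handler :: SomeException -> IO Bool
    handler _ = pure True

-- Both maps have the same type; only the mapping function differs.
lazyMapped, strictMapped :: ML.Map Int Int
lazyMapped = ML.map (const undefined) (ML.fromList [(1, 1)])
strictMapped = MS.map (const undefined) (MS.fromList [(1, 1)])
```

Forcing `lazyMapped` succeeds because the failing value stays an unevaluated thunk inside the node, while forcing `strictMapped` throws because the strict map evaluates each new value before building the node. A strict wrapper around a boxed vector would have to apply the same kind of forcing in every operation that stores an element.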

What is the correct definition of `unfold` for an untagged tree?

I've been thinking about how to implement the equivalent of unfold for the following type:
data Tree a = Node (Tree a) (Tree a) | Leaf a | Nil deriving Show
It was not immediately obvious since the standard unfold for lists returns a value and the next seed. For this datatype, it doesn't make sense, since there is no "value" until you reach a leaf node. This way, it only really makes sense to return new seeds or stop with a value. I'm using this definition:
data Drive s a = Stop | Unit a | Branch s s deriving Show
unfold :: (t -> Drive t a) -> t -> Tree a
unfold fn x = case fn x of
  Branch a b -> Node (unfold fn a) (unfold fn b)
  Unit a -> Leaf a
  Stop -> Nil
main = print $ unfold go 5 where
  go 0 = Stop
  go 1 = Unit 1
  go n = Branch (n - 1) (n - 2)
While this seems to work, I'm not sure this is how it is supposed to be. So, that is the question: what is the correct way to do it?
If you think of a datatype as the fixpoint of a functor then you can see that your definition is the sensible generalisation of the list case.
module Unfold where
Here we start by defining the fixpoint of a functor f: it's a layer of f followed by some more fixpoint:
newtype Fix f = InFix { outFix :: f (Fix f) }
To make things slightly clearer, here are the definitions of the functors corresponding to lists and trees. They have basically the same shape as the datatypes except that we have replaced the recursive occurrences with an extra parameter. In other words, they describe what one layer of a list / tree looks like and are generic over the possible substructures r.
data ListF a r = LNil | LCons a r
data TreeF a r = TNil | TLeaf a | TBranch r r
Lists and trees are then respectively the fixpoints of ListF and TreeF:
type List a = Fix (ListF a)
type Tree a = Fix (TreeF a)
Anyway, hoping you now have a better intuition about this fixpoint business, we can see that there is a generic way of defining an unfold function for these.
Given an original seed as well as a function taking a seed and building one layer of f in which the recursive positions hold new seeds, we can build a whole structure:
unfoldFix :: Functor f => (s -> f s) -> s -> Fix f
unfoldFix node = go
  where go = InFix . fmap go . node
This definition specialises to the usual unfold on list or your definition for trees. In other words: your definition was indeed the right one.
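To see the specialisation concretely, here is a self-contained sketch that runs the question's seed function through unfoldFix. The Functor instance is derived with the DeriveFunctor extension; `foldFix`, `step` and `leaves` are hypothetical names added only to make the result observable.

```haskell
{-# LANGUAGE DeriveFunctor #-}

newtype Fix f = InFix { outFix :: f (Fix f) }

-- One layer of a tree; r marks the positions of the subtrees.
data TreeF a r = TNil | TLeaf a | TBranch r r
  deriving Functor

unfoldFix :: Functor f => (s -> f s) -> s -> Fix f
unfoldFix node = go
  where go = InFix . fmap go . node

-- | The dual of unfoldFix: collapse a structure one layer at a time.
foldFix :: Functor f => (f r -> r) -> Fix f -> r
foldFix alg = go
  where go = alg . fmap go . outFix

-- | The question's seed function, producing one layer instead of a Drive.
step :: Int -> TreeF Int Int
step 0 = TNil
step 1 = TLeaf 1
step n = TBranch (n - 1) (n - 2)

-- | Collects the leaf values from left to right.
leaves :: Fix (TreeF a) -> [a]
leaves = foldFix alg
  where
    alg TNil = []
    alg (TLeaf a) = [a]
    alg (TBranch l r) = l ++ r
```

Here `TreeF a s` plays exactly the role of `Drive s a` from the question, so `unfoldFix step` builds the same tree as `unfold go`; for the seed 5 it has five leaves, each holding 1.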

Is there a sense of 'object equality' in Haskell?

If I have a singly linked list in Haskell:
data LL a = Empty | Node a (LL a) deriving (Show, Eq)
I can easily implement methods to insert at the end and at the beginning. But what about inserting after or before a particular element? If I have a LL of Integer, can I make a distinction in Haskell between inserting 4 after a particular node containing a 1, rather than the first 1 that it sees when processing the list?
Node 1 (Node 2 (Node 3 (Node 1 Empty)))
I'm curious how an insertAfter method would look that you would be able to specify "insert 5 after this particular node containing a 1". If I wanted to insert after the first node containing 1, would I have to pass in the entire list to specify this, and for the last node, only Node 1 Empty?
I'm not sure if it's right to address this as 'object equality'- but I'm wondering if there's a way to refer to particular elements of a type with the same payload in a data structure like this.
No, there is no such thing. The only way to tell apart values is by their structure; there is no identity like objects in some languages have. That is, there's no way you could tell apart these two values: (Just 5, Just 5) behaves exactly the same as let x = Just 5 in (x, x). Likewise, there is no difference between "this Node 1" and "some other Node 1": they are indistinguishable.
Usually the "solution" to this problem is to think of your problem in some other way so that there's no longer a need to distinguish based on identity (and usually there in fact is no need). But, as mentioned in the comments, you can emulate the "pointer" mechanic of other languages yourself, by generating distinct tags of some sort, eg increasing integers, and assigning one to each object so that you can tell them apart.
As others have pointed out, in Haskell every value is immutable and there are no objects.
To specify a unique node, you either need to specify it structurally (the first node in the linked list that contains 1, for example) or give every node an extra tag somehow (simulating what happens in an imperative world) so that we can distinguish them.
To structurally distinguish a node from others, we basically need to know the location of that node, e.g. with a zipper that gives you not only the value at that point but also its "neighborhood".
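As a rough illustration of the structural approach, a list zipper keeps the focused element together with what comes before and after it, so "this particular node" is identified by position rather than by payload. The `Zipper` type and the functions `start`, `next`, `insertAfterFocus` and `rebuild` are hypothetical names for this sketch.

```haskell
data LL a = Empty | Node a (LL a) deriving (Show, Eq)

-- | Elements before the focus (nearest first), the focused element,
-- and the rest of the list after it.
data Zipper a = Zipper [a] a (LL a) deriving (Show, Eq)

-- | Focus on the first node, if there is one.
start :: LL a -> Maybe (Zipper a)
start Empty = Nothing
start (Node x rest) = Just (Zipper [] x rest)

-- | Move the focus one node to the right.
next :: Zipper a -> Maybe (Zipper a)
next (Zipper _ _ Empty) = Nothing
next (Zipper before x (Node y rest)) = Just (Zipper (x : before) y rest)

-- | Insert a new element directly after the focused node.
insertAfterFocus :: a -> Zipper a -> Zipper a
insertAfterFocus new (Zipper before x rest) = Zipper before x (Node new rest)

-- | Turn the zipper back into an ordinary list.
rebuild :: Zipper a -> LL a
rebuild (Zipper before x rest) = foldl (flip Node) (Node x rest) before
```

For the list from the question, `start` focuses on the first node containing 1, and applying `next` three times reaches the last one; `insertAfterFocus 5` followed by `rebuild` then inserts after exactly that node.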
In more detail, here is what "giving every node an extra tag" involves:
First of all, you need to make every value an object, which requires generating unique tags at runtime. This is usually done by an allocator; the simplest allocator might just keep an integer and bump it whenever we need to create a new object:
-- | bumps counter
genId :: (Monad m, Functor m, Enum e) => StateT e m e
genId = get <* modify succ
-- | given a value, initializes a new node value
newNode :: (Monad m, Functor m, Enum e) => a -> StateT e m (a,e)
newNode x = genId >>= return . (x,)
And if you want to make an existing linked list work, we need to walk through it and give every node value a tag to make it an object:
-- | tags the linked list with an extra value
tagged :: (Traversable f, Enum e, Monad m, Functor m)
       => f a -> StateT e m (f (a,e))
tagged = traverse newNode
And here is the full demo; it does look maybe "a little" awkward:
{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable, TupleSections #-}
import Control.Applicative
import Control.Monad.State hiding (mapM_)
import Data.Traversable
import Data.Foldable
import Prelude hiding (mapM_)
data LL a = Empty | Node a (LL a)
  deriving (Show, Eq, Functor, Foldable, Traversable)
-- | bumps counter
genId :: (Monad m, Functor m, Enum e) => StateT e m e
genId = get <* modify succ
-- | given a value, initializes a new node value
newNode :: (Monad m, Functor m, Enum e) => a -> StateT e m (a,e)
newNode x = genId >>= return . (x,)
example :: LL Int
example = Node 1 (Node 2 (Node 3 (Node 1 Empty)))
-- | tags the linked list with an extra value
tagged :: (Traversable f, Enum e, Monad m, Functor m)
       => f a -> StateT e m (f (a,e))
tagged = traverse newNode
insertAfter :: (a -> Bool) -> a -> LL a -> LL a
insertAfter cond e ll = case ll of
  Empty -> Empty
  Node v vs -> Node v (if cond v
                         then Node e vs
                         else insertAfter cond e vs)
demo :: StateT Int IO ()
demo = do
  -- ll1 = Node (1,0) (Node (2,1) (Node (3,2) (Node (1,3) Empty)))
  ll1 <- tagged example
  nd <- newNode 10
  let tagIs t = (== t) . snd
      ll2 = insertAfter (tagIs 0) nd ll1
      -- ll2 = Node (1,0) (Node (10,4) (Node (2,1) (Node (3,2) (Node (1,3) Empty))))
      ll3 = insertAfter (tagIs 3) nd ll1
      -- ll3 = Node (1,0) (Node (2,1) (Node (3,2) (Node (1,3) (Node (10,4) Empty))))
  liftIO $ mapM_ print [ll1,ll2,ll3]
main :: IO ()
main = evalStateT demo (0 :: Int)
In this demo, tagIs is essentially doing the "object equality" thing because it is only interested in the extra tag we added before. Notice that I cheated here in order to specify the two nodes whose "values" are 1: one tagged 0 and the other tagged 3. Before running the program, it's impossible to tell what the actual tags will be (just like hard-coding a pointer value and hoping it happens to work). In a more realistic setting, you would need another function that scans through the linked list and collects the tags of the nodes with a certain value to work with (in this example, searching the linked list for all the nodes with "value" 1 would give [0,3]).
"Object equality" seems more like a concept from imperative programming languages, which assume that there are allocators offering "references" or "pointers" so that we can talk about "object equality". Here we have to simulate that allocator ourselves, and I guess that is what makes this a little awkward to deal with in functional programming.
Kristopher Micinski remarked that you actually can do something similar with the ST monad, and you can do it with IO as well. Specifically, you can create an STRef or IORef, which is a sort of mutable box. The box can only be accessed using IO or ST actions as appropriate, which maintains the clean separation between "pure" and "impure" code. These references have identity—asking if two are equal tells you if they are actually the same box, rather than whether they have the same contents. But this is not really so pleasant, and not something you're likely to do without a good reason.
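A minimal sketch of that reference identity, using IORef from base: the Eq instance for IORef compares the boxes themselves, not their contents. The name `refIdentityDemo` is made up for this example.

```haskell
import Data.IORef (newIORef, readIORef)

-- | Returns (is a box equal to itself?, are two separate boxes equal?,
-- are their contents equal?).
refIdentityDemo :: IO (Bool, Bool, Bool)
refIdentityDemo = do
  a <- newIORef (1 :: Int)
  b <- newIORef (1 :: Int)
  x <- readIORef a
  y <- readIORef b
  pure (a == a, a == b, x == y)
```

Running the action yields `(True, False, True)`: the two boxes hold equal contents but are still different boxes.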
No, because it would break referential transparency. The results from calling a method with the same input multiple times should be indistinguishable, and it should be possible to replace it transparently with calling the method with that input once and then re-using the result. However, calling a method that returns some structure multiple times may produce a new copy of the structure every time -- structures with different "identity". If you could somehow tell that they have different identities, then it violates referential transparency.

Different types in case expression result in Haskell

I'm trying to implement some kind of message parser in Haskell, so I decided to use types for message types, not constructors:
data DebugMsg = DebugMsg String
data UpdateMsg = UpdateMsg [String]
.. and so on. I believe it is more useful to me, because I can define a typeclass, say, Msg, for messages, with all the information/parsers/actions related to each message.
But I have a problem here. When I try to write a parsing function using case:
parseMsg :: (Msg a) => Int -> Get a
parseMsg code =
  case code of
    1 -> (parse :: Get DebugMsg)
    2 -> (parse :: Get UpdateMsg)
.. the type of the case result should be the same in all branches. Is there any solution? And is it even possible to specify only a typeclass for the function result and expect it to be fully polymorphic?
Yes, all the right hand sides of all your subcases must have the exact same type; and this type must be the same as the type of the whole case expression. This is a feature; it's required for the language to be able to guarantee at compilation time that there cannot be any type errors at runtime.
Some of the comments on your question mention that the simplest solution is to use a sum (a.k.a. variant) type:
data ParserMsg = DebugMsg String | UpdateMsg [String]
A consequence of this is that the set of alternative results is defined ahead of time. This is sometimes an upside (your code can be certain that there are no unhandled subcases), sometimes a downside (there is a finite number of subcases and they are determined at compilation time).
A more advanced solution in some cases—which you might not need, but I'll just throw it in—is to refactor the code to use functions as data. The idea is that you create a datatype that has functions (or monadic actions) as its fields, and then different behaviors = different functions as record fields.
Compare these two styles with this example. First, specifying different cases as a sum (this uses GADTs, but should be simple enough to understand):
{-# LANGUAGE GADTs #-}
import Data.Vector (Vector, (!))
import qualified Data.Vector as V
type Size = Int
type Index = Int
-- | A 'Frame' translates between a set of values and consecutive array
-- indexes. (Note: this simplified implementation doesn't handle duplicate
-- values.)
data Frame p where
  -- | A 'SimpleFrame' is backed by just a 'Vector'. The 'Eq' constraint is
  -- stored in the constructor so that 'pointIndex' can compare elements.
  SimpleFrame :: Eq p => Vector p -> Frame p
  -- | A 'ProductFrame' is a pair of 'Frame's.
  ProductFrame :: Frame p -> Frame q -> Frame (p, q)
getSize :: Frame p -> Size
getSize (SimpleFrame v) = V.length v
getSize (ProductFrame f g) = getSize f * getSize g
getIndex :: Frame p -> Index -> p
getIndex (SimpleFrame v) i = v!i
getIndex (ProductFrame f g) ij =
  let (i, j) = splitIndex (getSize f, getSize g) ij
  in (getIndex f i, getIndex g j)
pointIndex :: Frame p -> p -> Maybe Index
pointIndex (SimpleFrame v) p = V.elemIndex p v
pointIndex (ProductFrame f g) (p, q) =
  joinIndexes (getSize f, getSize g) <$> pointIndex f p <*> pointIndex g q
joinIndexes :: (Size, Size) -> Index -> Index -> Index
joinIndexes (_, rsize) i j = i * rsize + j
splitIndex :: (Size, Size) -> Index -> (Index, Index)
splitIndex (_, rsize) ij = (ij `div` rsize, ij `mod` rsize)
In this first example, a Frame can only ever be either a SimpleFrame or a ProductFrame, and every Frame function must be defined to handle both cases.
Second, datatype with function members (I elide code common to both examples):
data Frame p = Frame { getSize :: Size
                     , getIndex :: Index -> p
                     , pointIndex :: p -> Maybe Index }
simpleFrame :: Eq p => Vector p -> Frame p
simpleFrame v = Frame (V.length v) (v!) (`V.elemIndex` v)
productFrame :: Frame p -> Frame q -> Frame (p, q)
productFrame f g = Frame newSize getI pointI
  where newSize = getSize f * getSize g
        getI ij = let (i, j) = splitIndex (getSize f, getSize g) ij
                  in (getIndex f i, getIndex g j)
        pointI (p, q) = joinIndexes (getSize f, getSize g)
                          <$> pointIndex f p
                          <*> pointIndex g q
Here the Frame type takes the getIndex and pointIndex operations as data members of the Frame itself. There isn't a fixed compile-time set of subcases, because the behavior of a Frame is determined by its element functions, which are supplied at runtime. So without having to touch those definitions, we could add:
import Control.Applicative ((<|>))
concatFrame :: Frame p -> Frame p -> Frame p
concatFrame f g = Frame newSize getI pointI
  where newSize = getSize f + getSize g
        getI ij | ij < getSize f = getIndex f ij
                | otherwise = getIndex g (ij - getSize f)
        pointI p = pointIndex f p <|> fmap (+ getSize f) (pointIndex g p)
I call this second style "behavioral types," but that really is just me.
Note that type classes in GHC are implemented similarly to this—there is a hidden "dictionary" argument passed around, and this dictionary is a record whose members are implementations for the class methods:
data ShowDictionary a = ShowDictionary { primitiveShow :: a -> String }
stringShowDictionary :: ShowDictionary String
stringShowDictionary = ShowDictionary { primitiveShow = ... }
-- show "whatever"
-- ---> primitiveShow stringShowDictionary "whatever"
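To make the dictionary idea concrete, here is a small runnable sketch in which the dictionary is passed by hand. It only mimics what the compiler does and is not GHC's actual internal representation; `boolShowDictionary`, `listShowDictionary` and `showTwice` are hypothetical names.

```haskell
-- | An explicit stand-in for a 'Show'-like class dictionary.
data ShowDictionary a = ShowDictionary { primitiveShow :: a -> String }

-- | Plays the role of an instance declaration for Bool.
boolShowDictionary :: ShowDictionary Bool
boolShowDictionary =
  ShowDictionary { primitiveShow = \b -> if b then "yes" else "no" }

-- | Builds the dictionary for lists from the dictionary for elements,
-- much as an instance with a context (Show a => Show [a]) would.
listShowDictionary :: ShowDictionary a -> ShowDictionary [a]
listShowDictionary d =
  ShowDictionary { primitiveShow = unwords . map (primitiveShow d) }

-- | What a function with a class constraint looks like once the
-- dictionary argument is written out explicitly.
showTwice :: ShowDictionary a -> a -> String
showTwice d x = primitiveShow d x ++ primitiveShow d x
```

For example, `showTwice boolShowDictionary True` gives "yesyes", and `primitiveShow (listShowDictionary boolShowDictionary) [True, False]` gives "yes no".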
You could accomplish something like this with existential types, however it wouldn't work how you want it to, so you really shouldn't.
Doing it with normal polymorphism, as you have in your example, won't work at all. What your type says is that the function is valid for all a--that is, the caller gets to choose what kind of message to receive. However, you have to choose the message based on the numeric code, so this clearly won't do.
To clarify: all standard Haskell type variables are universally quantified by default. You can read your type signature as ∀a. Msg a => Int -> Get a. What this says is that the function is defined for every value of a, regardless of what the argument may be. This means that it has to be able to return whatever particular a the caller wants, regardless of what argument it gets.
What you really want is something like ∃a. Msg a => Int -> Get a. This is why I said you could do it with existential types. However, this is relatively complicated in Haskell (you can't quite write a type signature like that) and will not actually solve your problem correctly; it's just something to keep in mind for the future.
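For reference, the usual way to approximate that existential type is to wrap the message in a datatype that hides the concrete type. This is only a sketch under several assumptions: it parses from a String rather than the question's Get, the `describe` method is an invented stand-in for whatever Msg really contains, and `SomeMsg` and `describeSome` are hypothetical names.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- Hypothetical stand-in for the question's Msg class.
class Msg a where
  describe :: a -> String

data DebugMsg = DebugMsg String
data UpdateMsg = UpdateMsg [String]

instance Msg DebugMsg where
  describe (DebugMsg s) = "debug: " ++ s

instance Msg UpdateMsg where
  describe (UpdateMsg ss) = "update: " ++ unwords ss

-- | Hides the concrete message type; only the 'Msg' methods remain usable.
data SomeMsg = forall a. Msg a => SomeMsg a

-- | The numeric code, not the caller, decides which message type is built.
parseMsg :: Int -> String -> Maybe SomeMsg
parseMsg 1 body = Just (SomeMsg (DebugMsg body))
parseMsg 2 body = Just (SomeMsg (UpdateMsg (words body)))
parseMsg _ _ = Nothing

describeSome :: SomeMsg -> String
describeSome (SomeMsg m) = describe m
```

Note the limitation: once a message is wrapped in `SomeMsg`, the only things a caller can do with it are the class methods, which is one reason a plain sum type is usually the simpler choice here.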
Fundamentally, using classes and types like this is not very idiomatic in Haskell, because that's not what classes are meant to do. You would be much better off sticking to a normal algebraic data type for your messages.
I would have a single type like this:
data Message = DebugMsg String
| UpdateMsg [String]
So instead of having a parse function per type, just do the parsing in the parseMsg function as appropriate:
parseMsg :: Int -> String -> Message
parseMsg n msg = case n of
  1 -> DebugMsg msg
  2 -> UpdateMsg [msg]
(Obviously fill in whatever logic you actually have there.)
Essentially, this is the classical use for normal algebraic data types. There is no reason to have different types for the different kinds of messages, and life is much easier if they have the same type.
It looks like you're trying to emulate sub-typing from other languages. As a rule of thumb, you use algebraic data types in place of most of the uses of sub-types in other languages. This is certainly one of those cases.

Turtle Graphics as a Haskell Monad

I'm trying to implement turtle graphics in Haskell. The goal is to be able to write a function like this:
draw_something = do
  forward 100
  right 90
  forward 100
and then have it produce a list of points (maybe with additional properties):
> draw_something (0,0) 0 -- start at (0,0) facing east (0 degrees)
[(0,0), (0,100), (-100,100), ...]
I have all this working in a 'normal' way, but I've failed to implement it as a Haskell Monad and use the do-notation. The basic code:
data State a = State (a, a) a -- (x,y), angle
  deriving (Show, Eq)
initstate :: State Float
initstate = State (0.0,0.0) 0.0
-- constrain angles to 0 to 2*pi
fmod :: Float -> Float
fmod a
  | a >= 2*pi = fmod (a-2*pi)
  | a < 0 = fmod (a+2*pi)
  | otherwise = a
forward :: Float -> State Float -> [State Float]
forward d (State (x,y) angle) = [State (x + d * (sin angle), y + d * (cos angle)) angle]
right :: Float -> State Float -> [State Float]
right d (State pos angle) = [State pos (fmod (angle+d))]
bind :: [State a] -> (State a -> [State a]) -> [State a]
bind xs f = xs ++ (f (head $ reverse xs))
ret :: State a -> [State a]
ret x = [x]
With this I can now write
> [initstate] `bind` (forward 100) `bind` (right (pi/2)) `bind` (forward 100)
[State (0.0,0.0) 0.0,State (0.0,100.0) 0.0,State (0.0,100.0) 1.5707964,State (100.0,99.99999) 1.5707964]
And get the expected result. However I can't make this an instance of Monad.
instance Monad [State] where
results in
`State' is not applied to enough type arguments
Expected kind `*', but `State' has kind `* -> *'
In the instance declaration for `Monad [State]'
And if I wrap the list in a new object
data StateList a = StateList [State a]
instance Monad StateList where
  return x = StateList [x]
I get
Couldn't match type `a' with `State a'
`a' is a rigid type variable bound by
the type signature for return :: a -> StateList a
at logo.hs:38:9
In the expression: x
In the first argument of `StateList', namely `[x]'
In the expression: StateList [x]
I tried various other versions but I never got it to run as I'd like to. What am I doing wrong? What do I understand incorrectly?
The monad you're devising needs to have two type parameters: one for the saved trail (which will be fixed for a particular do sequence) and the other for the results of computations.
You also need to think about how to compose two turtle-monadic values so that the binding operation is associative. For example,
right 90 >> (right 90 >> forward 100)
must be equal to
(right 90 >> right 90) >> forward 100
(and of course similarly for >>= etc.). This means that if you represent the turtle's history by a list of points, the binding operation most likely just cannot append the lists of points together; forward 100 alone will result in something like [(0,0),(100,0)] but when it's prepended with rotation, the saved points need to be rotated too.
I'd say that the simplest approach would be to use the Writer monad. But I wouldn't save the points, I'd save just the actions the turtle performs (so that we don't need to rotate the points when combining the values). Something like
data Action = Rotate Double | Forward Double
type TurtleMonad a = Writer [Action] a
(This also means that we don't need to track the current direction, it's contained in the actions.) Then each of your functions just writes its argument into the Writer. And at the end, you can extract the final list from it and make a simple function that converts all the actions into a list of points:
track :: [Action] -> [(Double,Double)]
Update: Instead of using [Action] it would be better to use Seq from Data.Sequence. It's also a monoid, and concatenating two sequences is very fast: its amortized complexity is O(log(min(n1,n2))), compared to O(n1) for (++). So the improved type would be
type TurtleMonad a = Writer (Seq Action) a
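Putting the pieces together, a minimal sketch of the Writer approach might look like the following. It uses the simpler [Action] log and Control.Monad.Writer from the mtl package. The answer only gives the type of track, so the body here is an assumption: the turtle starts at (0,0) facing east (the positive x direction), angles are in degrees, and Rotate turns clockwise.

```haskell
import Control.Monad.Writer (Writer, execWriter, tell)

data Action = Rotate Double | Forward Double
  deriving (Show, Eq)

type TurtleMonad a = Writer [Action] a

-- Each command just records itself in the log.
forward, right :: Double -> TurtleMonad ()
forward d = tell [Forward d]
right a = tell [Rotate a]

-- | Replays the recorded actions as a list of visited points.
-- Assumptions of this sketch: start at (0,0) facing east, angles in
-- degrees, and 'Rotate' turning clockwise.
track :: [Action] -> [(Double, Double)]
track = go (0, 0) 0
  where
    go pos _ [] = [pos]
    go pos heading (Rotate a : rest) = go pos (heading - a * pi / 180) rest
    go pos@(x, y) heading (Forward d : rest) =
      pos : go (x + d * cos heading, y + d * sin heading) heading rest

drawSomething :: TurtleMonad ()
drawSomething = do
  forward 100
  right 90
  forward 100
```

Here `execWriter drawSomething` yields `[Forward 100.0,Rotate 90.0,Forward 100.0]`, and passing that list to `track` gives three points that are, up to floating-point rounding, (0,0), (100,0) and (100,-100). Because the log is an ordinary monoid, the associativity requirement discussed above holds automatically.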
