Generating sequence from Markov chain in Haskell - haskell

I would like to generate random sequences from a Markov chain. To generate the Markov chain I use the following code.
module Main where
import qualified Control.Monad.Random as R
import qualified Data.List as L
import qualified Data.Map as M
type TransitionMap = M.Map (String, String) Int
type MarkovChain = M.Map String [(String, Int)]
addTransition :: (String, String) -> TransitionMap -> TransitionMap
addTransition k = M.insertWith (+) k 1
fromTransitionMap :: TransitionMap -> MarkovChain
fromTransitionMap m =
M.fromList [(k, frequencies k) | k <- ks]
where ks = L.nub $ map fst $ M.keys m
frequencies a = map reduce $ filter (outboundFor a) $ M.toList m
outboundFor a k = fst (fst k) == a
reduce e = (snd (fst e), snd e)
After collecting the statistics and generating a Markov Chain object I would like to generate random sequences. I could imagine this method could look something like that (pseudo-code)
generateSequence mc s
| s == "." = s
| otherwise = s ++ " " ++ generateSequence mc s'
where s' = drawRandomlyFrom $ R.fromList $ mc ! s
I would greatly appreciate if someone could explain to me, how I should implement this function.
Edit
If anyone's interested it wasn't as difficult as I thought.
module Main where
import qualified Control.Monad.Random as R
import qualified Data.List as L
import qualified Data.Map as M
type TransitionMap = M.Map (String, String) Rational
type MarkovChain = M.Map String [(String, Rational)]
addTransition :: TransitionMap -> (String, String) -> TransitionMap
addTransition m k = M.insertWith (+) k 1 m
fromTransitionMap :: TransitionMap -> MarkovChain
fromTransitionMap m =
M.fromList [(k, frequencies k) | k <- ks]
where ks = L.nub $ map fst $ M.keys m
frequencies a = map reduce $ filter (outboundFor a) $ M.toList m
outboundFor a k = fst (fst k) == a
reduce e = (snd (fst e), snd e)
generateSequence :: (R.MonadRandom m) => MarkovChain -> String -> m String
generateSequence m s
| not (null s) && last s == '.' = return s
| otherwise = do
s' <- R.fromList $ m M.! s
ss <- generateSequence m s'
return $ if null s then ss else s ++ " " ++ ss
fromSample :: [String] -> MarkovChain
fromSample ss = fromTransitionMap $ foldl addTransition M.empty $ concatMap pairs ss
where pairs s = let ws = words s in zipWith (,) ("":ws) ws
sample :: [String]
sample = [ "I am a monster."
, "I am a rock star."
, "I want to go to Hawaii."
, "I want to eat a hamburger."
, "I have a really big headache."
, "Haskell is a fun language."
, "Go eat a big hamburger."
, "Markov chains are fun to use."
]
main = do
s <- generateSequence (fromSample sample) ""
print s
The only tiny annoyance is the fake "" starting node.

Not sure if this is what you're looking for. This compiles though:
generateSequence :: (R.MonadRandom m) => MarkovChain -> String -> m String
generateSequence mc s | s == "." = return s
| otherwise = do
s' <- R.fromList $ rationalize (mc M.! s)
s'' <- generateSequence mc s'
return $ s ++ " " ++ s''
rationalize :: [(String,Int)] -> [(String,Rational)]
rationalize = map (\(x,i) -> (x, toRational i))

All random number generation needs to happen in either the Random monad or the IO monad. For your purpose, it's probably easiest to understand how to do that in the IO monad, using evalRandIO. In the example below, getRandom is the function we want to use. Now getRandom operates in the Random monad, but we can use evalRandIO to lift it to the IO monad, like this:
main :: IO ()
main = do
x <- evalRandIO getRandom :: IO Double
putStrLn $ "Your random number is " ++ show x
Note: The reason we have to add the type signature to the line that binds x is because in this particular example there are no other hints to tell the compiler what type we want x to be. However, if we used x in some way that makes it clear that we want it to be a Double (e.g., multiplying by another Double), then the type signature wouldn't be necessary.
Using your MarkovChain type, for a current state you can trivially get the available transitions in the form [(nextState,probability)]. (I'm using the word "probability" loosely, it doesn't need to be a true probability; any numeric weight is fine). This is what fromList in Control.Monad.Random is designed for. Again, it operates in the Random monad, but we can use evalRandIO to lift it to the IO monad. Suppose transitions is your list of transitions, having the type [(nextState,probability)]. Then, in the IO monad you can call:
nextState <- evalRandIO $ fromList transitions
You might instead want to create your own function that operates in the Random monad, like this:
getRandomTransition :: RandomGen g => MarkovChain -> String -> Rand g String
getRandomTransition currState chain = do
let transitions = lookup currState chain
fromList transitions
Then you can call this function in the IO monad using evalRandIO, e.g.
nextState <- evalRandIO $ getRandomTransition chain

Related

How do I parameterize a function by module in Haskell?

This might seem artificial, but I can't seem to find an obvious answer to the following:
Say I have the following imports:
import qualified Data.Map as M
import qualified Data.HashMap.Lazy as HML
Now I have some function (comp) that takes some list, does something, creates a map, returns it.
My question is how do I have two ways of calling comp so that its calls (say) to insert and size map correctly?
As a strawman, I could write two copies of this function, one referencing M.insert and M.size, while the other references HML.insert and HML.size ... but how do I "pass the module as a parameter", or indicate this otherwise?
Thanks!
Edit: to make this less abstract these are the exact definitions of comp:
mapComp :: KVPairs -> IO ()
mapComp kvpairs = do
let init = M.empty
let m = foldr ins init kvpairs where
ins (k, v) t = M.insert k v t
if M.size m /= length kvpairs
then putStrLn $ "FAIL: " ++ show (M.size m) ++ ", " ++ show (length kvpairs)
else pure ()
hashmapComp :: KVPairs -> IO()
hashmapComp kvpairs = do
let init = HML.empty
let m = foldr ins init kvpairs where
ins (k, v) t = HML.insert k v t
if HML.size m /= length kvpairs
then putStrLn $ "Fail: " ++ show (HML.size m) ++ ", " ++ show (length kvpairs)
else pure ()
Edit (2): this turned out to be way more interesting than I anticipated, thanks to everyone who responded!
Here's how to to it with module signatures and mixins (a.k.a. Backpack)
You would have to define a library (it could be an internal library) with a signature like:
-- file Mappy.hsig
signature Mappy where
class C k
data Map k v
empty :: Map k v
insert :: C k => k -> v -> Map k v -> Map k v
size :: Map k v -> Int
in the same library or in another, write code that imports the signature as if it were a normal module:
module Stuff where
import qualified Mappy as M
type KVPairs k v = [(k,v)]
comp :: M.C k => KVPairs k v -> IO ()
comp kvpairs = do
let init = M.empty
let m = foldr ins init kvpairs where
ins (k, v) t = M.insert k v t
if M.size m /= length kvpairs
then putStrLn $ "FAIL: " ++ show (M.size m) ++ ", " ++ show (length kvpairs)
else pure ()
In another library (it must be a different one) write an "implementation" module that matches the signature:
-- file Mappy.hs
{-# language ConstraintKinds #-}
module Mappy (C,insert,empty,size,Map) where
import Data.Map.Lazy
type C = Ord
The "signature match" is performed based on names and types only, the implementation module doesn't need to know about the existence of the signature.
Then, in a library or executable in which you want to use the abstract code, pull both the library with the abstract code and the library with the implementation:
executable somexe
main-is: Main.hs
build-depends: base ^>=4.11.1.0,
indeflib,
lazyimpl
default-language: Haskell2010
library indeflib
exposed-modules: Stuff
signatures: Mappy
build-depends: base ^>=4.11.1.0
hs-source-dirs: src
default-language: Haskell2010
library lazyimpl
exposed-modules: Mappy
build-depends: base ^>=4.11.1.0,
containers >= 0.5
hs-source-dirs: impl1
default-language: Haskell2010
Sometimes the name of the signature and of the implementing module don't match, in that case one has to use the mixins section of the Cabal file.
Edit. Creating the HashMap implementation proved somewhat tricky, because insert required two constraints (Eq and Hashable) instead of one. I had to resort to the "class synonym" trick. Here's the code:
{-# language ConstraintKinds, FlexibleInstances, UndecidableInstances #-}
module Mappy (C,insert,HM.empty,HM.size,Map) where
import Data.Hashable
import qualified Data.HashMap.Strict as HM
type C = EqHash
class (Eq q, Hashable q) => EqHash q -- class synonym trick
instance (Eq q, Hashable q) => EqHash q
insert :: EqHash k => k -> v -> Map k v -> Map k v
insert = HM.insert
type Map = HM.HashMap
The simplest is to parameterize by the operations you actually need, rather than the module. So:
mapComp ::
m ->
(K -> V -> m -> m) ->
(m -> Int) ->
KVPairs -> IO ()
mapComp empty insert size kvpairs = do
let m = foldr ins empty kvpairs where
ins (k, v) t = insert k v t
if size m /= length kvpairs
then putStrLn $ "FAIL: " ++ show (size m) ++ ", " ++ show (length kvpairs)
else pure ()
You can then call it as, e.g. mapComp M.empty M.insert M.size or mapComp HM.empty HM.insert HM.size. As a small side benefit, callers may use this function even if the data structure they prefer doesn't offer a module with exactly the right names and types by writing small adapters and passing them in.
If you like, you can combine these into a single record to ease passing them around:
data MapOps m = MapOps
{ empty :: m
, insert :: K -> V -> m -> m
, size :: m -> Int
}
mops = MapOps M.empty M.insert M.size
hmops = MapOps HM.empty HM.insert HM.size
mapComp :: MapOps m -> KVPairs -> IO ()
mapComp ops kvpairs = do
let m = foldr ins (empty ops) kvpairs where
ins (k, v) t = insert ops k v t
if size ops m /= length kvpairs
then putStrLn "Yikes!"
else pure ()
I am afraid that it is not possible to do in Haskell without workarounds. Main problem is that comp would use different types for same objects for M and for HML variants, which is impossible to do in Haskell directly.
You will need to let comp know which option are you going to take using either data or polymorphism.
As a base idea I would create ADT to cover possible options and use boolean value to determine the module:
data SomeMap k v = M (M.Map k v) | HML (HML.HashMap k v)
f :: Bool -> IO ()
f shouldIUseM = do ...
And then use case expression in foldr to check whether your underlying map is M or HML. However, I don't see any good point of using such a bloatcode, it would be much better to create compM and compHML separately.
Another approach would be to create typeclass that would wrap all your cases
class SomeMap m where
empty :: m k v
insert :: k -> v -> m k v -> m k v
size :: m k v -> Int
And then write instances for each map manually (or using some TemplateHaskell magic, which I believe could help here, however it is out of my skills). It will require some bloat code as well, but then you will be able to parametrize comp over the used map type:
comp :: SomeMap m => m -> IO ()
comp thisCouldBeEmptyInitMap = do ...
But honestly, I would write this function like this:
comp :: Bool -> IO ()
comp m = if m then fooM else fooHML
I'm a little suspicious this is an XY problem, so here's how I would address the code you linked to. You have, the following:
mapComp :: KVPairs -> IO ()
mapComp kvpairs = do
let init = M.empty
let m = foldr ins init kvpairs where
ins (k, v) t = M.insert k v t
if M.size m /= length kvpairs
then putStrLn $ "FAIL: " ++ show (M.size m) ++ ", " ++ show (length kvpairs)
else pure ()
hashmapComp :: KVPairs -> IO()
hashmapComp kvpairs = do
let init = HML.empty
let m = foldr ins init kvpairs where
ins (k, v) t = HML.insert k v t
if HML.size m /= length kvpairs
then putStrLn $ "Fail: " ++ show (HML.size m) ++ ", " ++ show (length kvpairs)
else pure ()
This has a lot of repetition, which is usually not good. So we factor out the bits that are different between the two functions, and parameterize a new function by those changing bits:
-- didn't try to compile this
comp :: mp k v -> (k -> v -> mp k v -> mp k v) -> (mp k v -> Int) -> KVPairs -> IO()
comp h_empty h_insert h_size kvpairs = do
let init = h_empty
let m = foldr ins init kvpairs where
ins (k, v) t = h_insert k v t
if h_size m /= length kvpairs
then putStrLn $ "Fail: " ++ show (h_size m) ++ ", " ++ show (length kvpairs)
else pure ()
As you can see this is a really mechanical process. Then you call e.g. comp M.empty M.insert M.size.
If you want to be able to define comp such that it can work on map types that you haven't thought of yet (or which your users will specify), then you must define comp against an abstract interface. This is done with typeclasses, as in SomeMap radrow's answer.
In fact you can do part of this abstracting already, by noticing that both maps you want to work with implement the standard Foldable and Monoid.
-- didn't try to compile this
comp :: (Foldable (mp k), Monoid (mp k v))=> (k -> v -> mp k v -> mp k v) -> KVPairs -> IO()
comp h_insert kvpairs = do
let init = mempty -- ...also why not just use `mempty` directly below:
let m = foldr ins init kvpairs where
ins (k, v) t = h_insert k v t
if length m /= length kvpairs
then putStrLn $ "Fail: " ++ show (length m) ++ ", " ++ show (length kvpairs)
else pure ()
As mentioned in the comments, I think backpack is (will be?) the way to get what I think you're asking for, i.e. parameterized modules. I don't know much about it, and it's not clear to me what usecases it solves that you wouldn't want to use the more traditional approach I've described above (maybe I'll read the wiki page).

Pattern matching in `Alternative`

I have a function that pattern matches on its arguments to produce a computation in StateT () Maybe (). This computation can fail when run, in which case I want the current pattern match branch to fail, so to speak.
I highly doubt it's possible to have something like
compute :: Int -> StateT () Maybe Int
compute = return
f :: Maybe Int -> Maybe Int -> StateT () Maybe ()
f (Just n1) (Just n2) = do
m <- compute (n1 + n2)
guard (m == 42)
f (Just n) _ = do
m <- compute n
guard (m == 42)
f _ (Just n) = do
m <- compute n
guard (m == 42)
behave in the way I want it to: When the first computation fails due to the guard or somewhere in compute, I want f to try the next pattern.
Obviously the above can't work, because StateT (as any other monad might) involves an additional parameter when expanded, so I probably can't formulate this as simple pattern guards.
The following does what I want, but it's ugly:
f' :: Maybe Int -> Maybe Int -> StateT () Maybe ()
f' a b = asum (map (\f -> f a b) [f1, f2, f3])
where
f1 a b = do
Just n1 <- pure a
Just n2 <- pure b
m <- compute (n1 + n2)
guard (m == 42)
f2 a _ = do
Just n <- pure a
m <- compute n
guard (m == 42)
f3 _ b = do
Just n <- pure b
m <- compute n
guard (m == 42)
A call like execStateT (f (Just 42) (Just 1)) () would fail for f but return Just () for f', because it matches f2.
How do I get the behavior of f' while having elegant pattern matching with as little auxiliary definitions as possible like in f? Are there other, more elegant ways to formulate this?
Complete runnable example:
#! /usr/bin/env stack
-- stack --resolver=lts-11.1 script
import Control.Monad.Trans.State
import Control.Applicative
import Control.Monad
import Data.Foldable
compute :: Int -> StateT () Maybe Int
compute = return
f :: Maybe Int -> Maybe Int -> StateT () Maybe ()
f (Just n1) (Just n2) = do
m <- compute (n1 + n2)
guard (m == 42)
f (Just n) _ = do
m <- compute n
guard (m == 42)
f _ (Just n) = do
m <- compute n
guard (m == 42)
f' :: Maybe Int -> Maybe Int -> StateT () Maybe ()
f' a b = asum (map (\f -> f a b) [f1, f2, f3])
where
f1 a b = do
Just n1 <- pure a
Just n2 <- pure b
m <- compute (n1 + n2)
guard (m == 42)
f2 a _ = do
Just n <- pure a
m <- compute n
guard (m == 42)
f3 _ b = do
Just n <- pure b
m <- compute n
guard (m == 42)
main = do
print $ execStateT (f (Just 42) (Just 1)) () -- Nothing
print $ execStateT (f' (Just 42) (Just 1)) () -- Just (), because `f2` succeeded
Edit: I elicited quite some clever answers with this question so far, thanks! Unfortunately, they mostly suffer from overfitting to the particular code example I've given. In reality, I need something like this for unifying two expressions (let-bindings, to be precise), where I want to try unifying the RHS of two simultaneous lets if possible and fall through to the cases where I handle let bindings one side at a time by floating them. So, actually there's no clever structure on Maybe arguments to exploit and I'm not computeing on Int actually.
The answers so far might benefit others beyond the enlightenment they brought me though, so thanks!
Edit 2: Here's some compiling example code with probably bogus semantics:
module Unify (unify) where
import Control.Applicative
import Control.Monad.Trans.State.Strict
data Expr
= Var String -- meta, free an bound vars
| Let String Expr Expr
-- ... more cases
-- no Eq instance, fwiw
-- | If the two terms unify, return the most general unifier, e.g.
-- a substitution (`Map`) of meta variables for terms as association
-- list.
unify :: [String] -> Expr -> Expr -> Maybe [(String, Expr)]
unify metaVars l r = execStateT (go [] [] l r) [] -- threads the current substitution as state
where
go locals floats (Var x) (Var y)
| x == y = return ()
go locals floats (Var x) (Var y)
| lookup x locals == Just y = return ()
go locals floats (Var x) e
| x `elem` metaVars = tryAddSubstitution locals floats x e
go locals floats e (Var y)
| y `elem` metaVars = tryAddSubstitution locals floats y e
-- case in point:
go locals floats (Let x lrhs lbody) (Let y rrhs rbody) = do
go locals floats lrhs rrhs -- try this one, fail current pattern branch if rhss don't unify
-- if we get past the last statement, commit to this branch, no matter
-- the next statement fails or not
go ((x,y):locals) floats lbody rbody
-- try to float the let binding. terms mentioning a floated var might still
-- unify with a meta var
go locals floats (Let x rhs body) e = do
go locals (Left (x,rhs):floats) body e
go locals floats e (Let y rhs body) = do
go locals (Right (y,rhs):floats) body e
go _ _ _ _ = empty
tryAddSubstitution = undefined -- magic
When I need something like this, I just use asum with the blocks inlined. Here I also condensed the multiple patterns Just n1 <- pure a; Just n2 <- pure b into one, (Just n1, Just n2) <- pure (a, b).
f :: Maybe Int -> Maybe Int -> StateT () Maybe ()
f a b = asum
[ do
(Just n1, Just n2) <- pure (a, b)
m <- compute (n1 + n2)
guard (m == 42)
, do
Just n <- pure a
m <- compute n
guard (m == 42)
, do
Just n <- pure b
m <- compute n
guard (m == 42)
]
You can also use chains of <|>, if you prefer:
f :: Maybe Int -> Maybe Int -> StateT () Maybe ()
f a b
= do
(Just n1, Just n2) <- pure (a, b)
m <- compute (n1 + n2)
guard (m == 42)
<|> do
Just n <- pure a
m <- compute n
guard (m == 42)
<|> do
Just n <- pure b
m <- compute n
guard (m == 42)
This is about as minimal as you can get for this kind of “fallthrough”.
If you were using Maybe alone, you would be able to do this with pattern guards:
import Control.Monad
import Control.Applicative
ensure :: Alternative f => (a -> Bool) -> a -> f a
ensure p a = a <$ guard (p a)
compute :: Int -> Maybe Int
compute = return
f :: Maybe Int -> Maybe Int -> Maybe Int
f (Just m) (Just n)
| Just x <- ensure (== 42) =<< compute (m + n)
= return x
f (Just m) _
| Just x <- ensure (== 42) =<< compute m
= return x
f _ (Just n)
| Just x <- ensure (== 42) =<< compute n
= return x
f _ _ = empty
(ensure is a general purpose combinator. Cf. Lift to Maybe using a predicate)
As you have StateT on the top, though, you would have to supply a state in order to pattern match on Maybe, which would foul up everything. That being so, you are probably better off with something in the vein of your "ugly" solution. Here is a whimsical attempt at improving its looks:
import Control.Monad
import Control.Applicative
import Control.Monad.State
import Control.Monad.Trans
import Data.Foldable
ensure :: Alternative f => (a -> Bool) -> a -> f a
ensure p a = a <$ guard (p a)
compute :: Int -> StateT () Maybe Int
compute = return
f :: Maybe Int -> Maybe Int -> StateT () Maybe Int
f a b = asum (map (\c -> f' (c a b)) [liftA2 (+), const, flip const])
where
f' = ensure (== 42) <=< compute <=< lift
While this is an answer specific to the snippet I've given, the refactorings only apply limited to the code I was facing.
Perhaps it's not that far-fetched of an idea to extract the skeleton of the asum expression above to a more general combinator:
-- A better name would be welcome.
selector :: Alternative f => (a -> a -> a) -> (a -> f b) -> a -> a -> f b
selector g k x y = asum (fmap (\sel -> k (sel x y)) [g, const, flip const])
f :: Maybe Int -> Maybe Int -> StateT () Maybe Int
f = selector (liftA2 (+)) (ensure (== 42) <=< compute <=< lift)
Though it is perhaps a bit awkward of a combinator, selector does show the approach is more general than it might appear at first: the only significant restriction is that k has to produce results in some Alternative context.
P.S.: While writing selector with (<|>) instead of asum is arguably more tasteful...
selector g k x y = k (g x y) <|> k x <|> k y
... the asum version straightforwardly generalises to an arbitrary number of pseudo-patterns:
selector :: Alternative f => [a -> a -> a] -> (a -> f b) -> a -> a -> f b
selector gs k x y = asum (fmap (\g -> k (g x y)) gs)
It looks like you could get rid of the whole pattern match by relying on the fact that Int forms a Monoid with addition and 0 as the identity element, and that Maybe a forms a Monoid if a does. Then your function becomes:
f :: Maybe Int -> Maybe Int -> StateT () Maybe Int
f a b = pure $ a <> b >>= compute >>= pure . mfilter (== 42)
You could generalise by passing the predicate as an argument:
f :: Monoid a => (a -> Bool) -> Maybe a -> Maybe a -> StateT () Maybe a
f p a b = pure $ a <> b >>= compute >>= pure . mfilter p
The only thing is that compute is now taking a Maybe Int as input, but that is just a matter of calling traverse inside that function with whatever computation you need to do.
Edit: Taking into account your last edit, I find that if you spread your pattern matches into separate computations that may fail, then you can just write
f a b = f1 a b <|> f2 a b <|> f3 a b
where f1 (Just a) (Just b) = compute (a + b) >>= check
f1 _ _ = empty
f2 (Just a) _ = compute a >>= check
f2 _ _ = empty
f3 _ (Just b) = compute b >>= check
f3 _ _ = empty
check x = guard (x == 42)

Why must we use state monad instead of passing state directly?

Can someone show a simple example where state monad can be better than passing state directly?
bar1 (Foo x) = Foo (x + 1)
vs
bar2 :: State Foo Foo
bar2 = do
modify (\(Foo x) -> Foo (x + 1))
get
State passing is often tedious, error-prone, and hinders refactoring. For example, try labeling a binary tree or rose tree in postorder:
data RoseTree a = Node a [RoseTree a] deriving (Show)
postLabel :: RoseTree a -> RoseTree Int
postLabel = fst . go 0 where
go i (Node _ ts) = (Node i' ts', i' + 1) where
(ts', i') = gots i ts
gots i [] = ([], i)
gots i (t:ts) = (t':ts', i'') where
(t', i') = go i t
(ts', i'') = gots i' ts
Here I had to manually label states in the right order, pass the correct states along, and had to make sure that both the labels and child nodes are in the right order in the result (note that naive use of foldr or foldl for the child nodes could have easily led to incorrect behavior).
Also, if I try to change the code to preorder, I have to make changes that are easy to get wrong:
preLabel :: RoseTree a -> RoseTree Int
preLabel = fst . go 0 where
go i (Node _ ts) = (Node i ts', i') where -- first change
(ts', i') = gots (i + 1) ts -- second change
gots i [] = ([], i)
gots i (t:ts) = (t':ts', i'') where
(t', i') = go i t
(ts', i'') = gots i' ts
Examples:
branch = Node ()
nil = branch []
tree = branch [branch [nil, nil], nil]
preLabel tree == Node 0 [Node 1 [Node 2 [],Node 3 []],Node 4 []]
postLabel tree == Node 4 [Node 2 [Node 0 [],Node 1 []],Node 3 []]
Contrast the state monad solution:
import Control.Monad.State
import Control.Applicative
postLabel' :: RoseTree a -> RoseTree Int
postLabel' = (`evalState` 0) . go where
go (Node _ ts) = do
ts' <- traverse go ts
i <- get <* modify (+1)
pure (Node i ts')
preLabel' :: RoseTree a -> RoseTree Int
preLabel' = (`evalState` 0) . go where
go (Node _ ts) = do
i <- get <* modify (+1)
ts' <- traverse go ts
pure (Node i ts')
Not only is this code more succinct and easier to write correctly, the logic that results in pre- or postorder labeling is far more transparent.
PS: bonus applicative style:
postLabel' :: RoseTree a -> RoseTree Int
postLabel' = (`evalState` 0) . go where
go (Node _ ts) =
flip Node <$> traverse go ts <*> (get <* modify (+1))
preLabel' :: RoseTree a -> RoseTree Int
preLabel' = (`evalState` 0) . go where
go (Node _ ts) =
Node <$> (get <* modify (+1)) <*> traverse go ts
As an example to my comment above, you can write code using the State monad like
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TemplateHaskell #-}
import Data.Text (Text)
import qualified Data.Text as Text
import Control.Monad.State
data MyState = MyState
{ _count :: Int
, _messages :: [Text]
} deriving (Eq, Show)
makeLenses ''MyState
type App = State MyState
incrCnt :: App ()
incrCnt = modify (\my -> my & count +~ 1)
logMsg :: Text -> App ()
logMsg msg = modify (\my -> my & messages %~ (++ [msg]))
logAndIncr :: Text -> App ()
logAndIncr msg = do
incrCnt
logMsg msg
app :: App ()
app = do
logAndIncr "First step"
logAndIncr "Second step"
logAndIncr "Third step"
logAndIncr "Fourth step"
logAndIncr "Fifth step"
Note that using additional operators from Control.Lens also lets you write incrCnt and logMsg as
incrCnt = count += 1
logMsg msg = messages %= (++ [msg])
which is another benefit of using State in combination with the lens library, but for the sake of comparison I'm not using them in this example. To write the equivalent code above with just argument passing it would look more like
incrCnt :: MyState -> MyState
incrCnt my = my & count +~ 1
logMsg :: MyState -> Text -> MyState
logMsg my msg = my & messages %~ (++ [msg])
logAndIncr :: MyState -> Text -> MyState
logAndIncr my msg =
let incremented = incrCnt my
logged = logMsg incremented msg
in logged
At this point it isn't too bad, but once we get to the next step I think you'll see where the code duplication really comes in:
app :: MyState -> MyState
app initial =
let first_step = logAndIncr initial "First step"
second_step = logAndIncr first_step "Second step"
third_step = logAndIncr second_step "Third step"
fourth_step = logAndIncr third_step "Fourth step"
fifth_step = logAndIncr fourth_step "Fifth step"
in fifth_step
Another benefit of wrapping this up in a Monad instance is that you can use the full power of Control.Monad and Control.Applicative with it:
app = mapM_ logAndIncr [
"First step",
"Second step",
"Third step",
"Fourth step",
"Fifth step"
]
Which allows for much more flexibility when processing values calculated at runtime compared to static values.
The difference between manual state passing and using the State monad is simply that the State monad is an abstraction over the manual process. It also happens to fit several other widely used more general abstractions, like Monad, Applicative, Functor, and a few others. If you also use the StateT transformer then you can compose these operations with other monads, such as IO. Can you do all of this without State and StateT? Of course you can, and there's no one stopping you from doing so, but the point is that State abstracts this pattern out and gives you access to a huge toolbox of more general tools. Also, a small modification to the types above makes the same functions work in multiple contexts:
incrCnt :: MonadState MyState m => m ()
logMsg :: MonadState MyState m => Text -> m ()
logAndIncr :: MonadState MyState m => Text -> m ()
These will now work with App, or with StateT MyState IO, or any other monad stack with a MonadState implementation. It makes it significantly more reusable than simple argument passing, which is only possible through the abstraction that is StateT.
In my experience, the point of many Monads doesn't really click until you get into larger examples, so here is an example use of State (well, StateT ... IO) to parse an incoming request to a web service.
The pattern is that this web service can be called with a bunch of options of different types, though all except for one of the options have decent defaults. If I get a incoming JSON request with an unknown key value, I should abort with an appropriate message. I use the state to keep track of what the current config is, and what the remainder of the JSON request is, along with a bunch of accessor methods.
(Based on code currently in production, with the names of everything changed and the details of what this service actually does obscured)
{-# LANGUAGE OverloadedStrings #-}
module XmpConfig where
import Data.IORef
import Control.Arrow (first)
import Control.Monad
import qualified Data.Text as T
import Data.Aeson hiding ((.=))
import qualified Data.HashMap.Strict as MS
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.State (execStateT, StateT, gets, modify)
import qualified Data.Foldable as DF
import Data.Maybe (fromJust, isJust)
data Taggy = UseTags Bool | NoTags
newtype Locale = Locale String
data MyServiceConfig = MyServiceConfig {
_mscTagStatus :: Taggy
, _mscFlipResult :: Bool
, _mscWasteTime :: Bool
, _mscLocale :: Locale
, _mscFormatVersion :: Int
, _mscJobs :: [String]
}
baseWebConfig :: IO (IORef [String], IORef [String], MyServiceConfig)
baseWebConfig = do
infoRef <- newIORef []
warningRef <- newIORef []
let cfg = MyServiceConfig {
_mscTagStatus = NoTags
, _mscFlipResult = False
, _mscWasteTime = False
, _mscLocale = Locale "en-US"
, _mscFormatVersion = 1
, _mscJobs = []
}
return (infoRef, warningRef, cfg)
parseLocale :: T.Text -> Maybe Locale
parseLocale = Just . Locale . T.unpack -- The real thing does more
parseJSONReq :: MS.HashMap T.Text Value ->
IO (IORef [String], IORef [String], MyServiceConfig)
parseJSONReq m = liftM snd
(baseWebConfig >>= (\c -> execStateT parse' (m, c)))
where
parse' :: StateT (MS.HashMap T.Text Value,
(IORef [String], IORef [String], MyServiceConfig))
IO ()
parse' = do
let addWarning s = do let snd3 (_, b, _) = b
r <- gets (snd3 . snd)
liftIO $ modifyIORef r (++ [s])
-- These two functions suck a key/value off the input map and
-- pass the value on to the handler "h"
onKey k h = onKeyMaybe k $ DF.mapM_ h
onKeyMaybe k h = do myb <- gets fst
modify $ first $ MS.delete k
h (MS.lookup k myb)
-- Access the "lns" field of the configuration
config setter = modify (\(a, (b, c, d)) -> (a, (b, c, setter d)))
onKey "tags" $ \x -> case x of
Bool True -> config $ \c -> c {_mscTagStatus = UseTags False}
String "true" -> config $ \c -> c {_mscTagStatus = UseTags False}
Bool False -> config $ \c -> c {_mscTagStatus = NoTags}
String "false" -> config $ \c -> c {_mscTagStatus = NoTags}
String "inline" -> config $ \c -> c {_mscTagStatus = UseTags True}
q -> addWarning ("Bad value ignored for tags: " ++ show q)
onKey "reverse" $ \x -> case x of
Bool r -> config $ \c -> c {_mscFlipResult = r}
q -> addWarning ("Bad value ignored for reverse: " ++ show q)
onKey "spin" $ \x -> case x of
Bool r -> config $ \c -> c {_mscWasteTime = r}
q -> addWarning ("Bad value ignored for spin: " ++ show q)
onKey "language" $ \x -> case x of
String s | isJust (parseLocale s) ->
config $ \c -> c {_mscLocale = fromJust $ parseLocale s}
q -> addWarning ("Bad value ignored for language: " ++ show q)
onKey "format" $ \x -> case x of
Number 1 -> config $ \c -> c {_mscFormatVersion = 1}
Number 2 -> config $ \c -> c {_mscFormatVersion = 2}
q -> addWarning ("Bad value ignored for format: " ++ show q)
onKeyMaybe "jobs" $ \p -> case p of
Just (Array x) -> do q <- parseJobs x
config $ \c -> c {_mscJobs = q}
Just (String "test") ->
config $ \c -> c {_mscJobs = ["test1", "test2"]}
Just other -> fail $ "Bad value for jobs: " ++ show other
Nothing -> fail "Missing value for jobs"
m' <- gets fst
unless (MS.null m') (fail $ "Unrecognized key(s): " ++ show (MS.keys m'))
parseJobs :: (Monad m, DF.Foldable b) => b Value -> m [String]
parseJobs = DF.foldrM (\a b -> liftM (:b) (parseJob a)) []
parseJob :: (Monad m) => Value -> m String
parseJob (String s) = return (T.unpack s)
parseJob q = fail $ "Bad job value: " ++ show q

Mutually recursive IO definitions

I can write the following:
f :: [Int] -> [Int]
f x = 0:(map (+1) x)
g :: [Int] -> [Int]
g x = map (*2) x
a = f b
b = g a
main = print $ take 5 a
And things work perfectly fine (ideone).
However, lets say I want g to do something more complex than multiply by 2, like ask the user for a number and add that, like so:
g2 :: [Int] -> IO [Int]
g2 = mapM (\x -> getLine >>= (return . (+x) . read))
How do I then, well, tie the knot?
Clarification:
Basically I want the list of Ints from f to be the input of g2 and the list of Ints from g2 to be the input of f.
The effectful generalization of lists is ListT:
import Control.Monad
import Pipes
f :: ListT IO Int -> ListT IO Int
f x = return 0 `mplus` fmap (+ 1) x
g2 :: ListT IO Int -> ListT IO Int
g2 x = do
n <- x
n' <- lift (fmap read getLine)
return (n' + n)
a = f b
b = g2 a
main = runListT $ do
n <- a
lift (print n)
mzero
You can also implement take like functionality with a little extra code:
import qualified Pipes.Prelude as Pipes
take' :: Monad m => Int -> ListT m a -> ListT m a
take' n l = Select (enumerate l >-> Pipes.take n)
main = runListT $ do
n <- take' 5 a
lift (print n)
mzero
Example session:
>>> main
0
1<Enter>
2
2<Enter>
3<Enter>
7
4<Enter>
5<Enter>
6<Enter>
18
7<Enter>
8<Enter>
9<Enter>
10<Enter>
38
You can learn more about ListT by reading the pipes tutorial, specifically the section on ListT.

List Iterator using ContT

I have a simple list that I would like to iterate over "yield"ing between each element and printing that element to the output. I am trying to use the ContT monad to do this but running into issues. Here's what I have so far:
data K a = Nil | K (a,() -> K a)
listIterator :: (Monad m) => [r] -> m (K r)
listIterator [] = return Nil
listIterator (x:xs) = return (ContT (\k -> K (x,k))) >> listIterator xs
runIterator :: IO ()
runIterator = do
a <- listIterator ([1,2,3] :: [Int])
let loop Nil = liftIO $ print "nil"
loop (K (curr,newI)) =
do
liftIO $ print curr
loop (newI ())
loop a
The expected output is:
1
2
3
nil
What I get is:
nil
Any help is appreciated!
listIterator (x:xs) = return (ContT (\k -> K (x,k))) >> listIterator xs
does not do what you expect, equational reasoning
listIterator (x:xs)
= return (ContT (\k -> K (x,k))) >> listIterator xs
= (return (ContT (\k -> K (x,k)))) >>= \_ -> listIterator xs
= (\_ -> listIterator xs) (ContT (\k -> K (x,k)))
= listIterator xs
I'm not sure exactly why you want to use an iterator. Haskell is already lazy, so iteration patterns like this are mostly used only when you have resource management issues that need to interact well with a demand driven usage pattern. And, you don't need the continuation monad at all:
Instead of writing the K constructor to take a tuple it is more idiomatic to
data K a = Nil | K a (() -> K a)
intuitively, the type for the listIterator does not use its monadic structure: it just constructs a value, so
listIterator ::[r] -> K r
listIterator [] = Nil
listIterator (x:xs) = K x (\_ -> listIterator xs)
now life is trivial
runIterator :: IO ()
runIterator = do
let a = listIterator ([1,2,3] :: [Int])
loop Nil = liftIO $ print "nil"
loop (K curr newI) =
do
liftIO $ print curr
loop (newI ())
loop a
which would probably be best to write without the use of do notation.
This may not be the answer you were looking for, but if you are interested in this style of programming, you should look into pipes and similar libraries. (conduit is the rising star in the "real world", but pipes provides a simpler tool for teaching which is why I use it here.)
$ cabal update && cabal install pipes
Pipes are like iterators, except they come in three flavors: those that can acquire input (Consumers), those that produce output (Producers), and those that do both (Pipes). If you connect pipes such that the input and output ends are all satisfied, then it is called a "Pipeline", and it is a self-contained unit that can be run without any additional input.
Pipe provides a monad instance for convenience in creating pipes. The >+> operator connects two pipes together.
import Control.Pipe
import Control.Monad.Trans.Class
import Control.Monad.IO.Class
-- annoyingly, Pipe does not provide a MonadIO instance
instance (MonadIO m) => MonadIO (Pipe a b m) where
liftIO = lift . liftIO
listIterator :: Monad m => [a] -> Producer (Maybe a) m ()
listIterator (x:xs) = yield (Just x) >> listIterator xs
listIterator [] = yield Nothing
printer :: (MonadIO m, Show a) => Consumer (Maybe a) m ()
printer = do
mx <- await
case mx of
Just x -> liftIO (print x) >> printer
Nothing -> liftIO (putStrLn "nil")
main = runPipe $ listIterator [1, 2, 3] >+> printer
The source for Control.Pipe is delightfully simple, especially if you have been reading Gabriel's recent blog posts about Free monads, particularly Why free monads matter and Purify code using free monads.

Resources