Avoiding space leaks with `mapM` and `foldM` over `State` monad - haskell

How do I avoid space leaks while using foldM and mapM over a State monad?
Day 20 of last year's Advent of Code has a puzzle about generating a map of a maze from instructions on how to walk through it. For instance, the instruction string NN gives the maze
|
|
*
(a straight corridor two steps northwards), and the instruction string NNN(EE|WW)S gives the maze
+-+-+
| | |
  |
  *
(go north a bit, then either go east then south or west then south).
The way I'm trying to solve this involves having a State monad, where the state is the Set of all the corridor sections (termed Doors below), and the value is the list of positions you could be working from.
If you're just following a corridor (a Path), I use foldM to walk along it, updating the current position. If you're at a junction, I follow each branch of the junction and collect all the positions you end up at.
This code produces the correct results on small test inputs, but there's a huge space leak when working on the full example.
Profiling indicates it's spending most of its time in includeDoor.
So, questions.
Is there a space leak? If so, where, and how can you tell?
How do I fix it?
(I think what's happening is that Haskell isn't strictly adding fully-evaluated Doors to the Set as soon as it can. In this case, I don't want any laziness anywhere.)
(I parse the input into a bunch of two-element vectors that indicate the step to take for each instruction. That code works fine, and quickly.)
{-# LANGUAGE BangPatterns #-}
import qualified Data.Set as S
import Linear (V2(..))
import Control.Monad.State.Strict
import Control.Monad.Extra (concatMapM)

type Coord = V2 Integer -- x, y, with north and east increasing values (origin at bottom left)
data Door = Door Coord Coord deriving (Show, Eq, Ord)
type Doors = S.Set Door
data MazeSection = Path [Coord] | Junction [Maze] deriving (Show, Eq)
type Maze = [MazeSection]
type Mapper = State Doors [Coord]

makeDoor :: Coord -> Coord -> Door
makeDoor !a !b
  | a < b     = Door a b
  | otherwise = Door b a

emptyMap = S.empty

part1 maze =
  do let start = V2 0 0
     let doors = execState (mapMaze [start] maze) emptyMap
     print $ length doors

mapMaze :: [Coord] -> Maze -> Mapper
mapMaze !starts !sections =
  foldM (\heres section -> mapMazeSection heres section) starts sections

mapMazeSection :: [Coord] -> MazeSection -> Mapper
mapMazeSection !starts (Junction mazes) =
  concatMapM (\maze -> mapMaze starts maze) mazes
mapMazeSection !starts (Path steps) =
  mapM mapPath starts
  where mapPath start = foldM (\here step -> includeDoor here step) start steps

includeDoor :: Coord -> Coord -> State Doors Coord
includeDoor !here !step =
  do let there = (here + step)
     let door = there `seq` makeDoor here there
     modify' (door `seq` S.insert door)
     return there

Space leaks can be very hard to detect in Haskell. I am no expert, but I have heard there are many problems with the State monad and space leaks. I generally avoid State/StateT and use IORef, MVar or TVar instead, but that would mean moving the code into IO. The first thing you can try is adding strictness annotations (!) to various let bindings and data definitions:
data Door = Door !Coord !Coord
data MazeSection = Path ![Coord] | Junction ![Maze]
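As an aside (my own note, not from the answer), on GHC 8.0 or later the StrictData extension gives the same effect for every constructor field declared in the module, so the bangs above become unnecessary:

{-# LANGUAGE StrictData #-}
-- With StrictData, all constructor fields in this module are strict by default.
data Door = Door Coord Coord deriving (Show, Eq, Ord)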
If that doesn't resolve it, this article describes some tools that may help you pinpoint where the leak occurs.
Other resources
Here are some other resources that might help.
ndmitchell on space leak detection
Space leak with nested strict StateT
Pinpointing space leaks in big programs

Turns out, it wasn't a space leak! It was me failing to deal with some pathological input. Once I sorted out how to handle that, it worked, and very quickly.

Related

Random walk on a pointed container

Let us consider a dwarf wandering in a tunnel. I will define a type that represents this
situation thusly:
{-# LANGUAGE RecordWildCards #-}

data X a = X { xs :: [a], i :: Int }

display :: X Bool -> IO ()
display X{..} = putStrLn (concatMap f xs) where { f True = "*" ; f False = "-" }
Here you see a dwarf in a section of a tunnel:
λ display x
-*---
It is discovered that a pointed container is an instance of Comonad. I can use this
instance here to define a function that simulates my dwarf moving right:
shiftRight :: X Bool -> Bool
shiftRight x@X{..} | let i' = i - 1 in i' `isInRange` x && xs !! i' = True
                   | otherwise = False
See:
λ traverse_ display $ scanl (&) x (replicate 4 (extend shiftRight))
-*---
--*--
---*-
----*
-----
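For reference, the session above assumes a Comonad instance for X and an isInRange helper, neither of which is shown in the question. A minimal sketch (my own; extend re-focuses the container at every index) might look like this:

import Control.Comonad (Comonad(..))

instance Functor X where
  fmap f (X xs i) = X (map f xs) i

instance Comonad X where
  extract (X xs i)  = xs !! i
  extend f (X xs i) = X [ f (X xs j) | j <- [0 .. length xs - 1] ] i

isInRange :: Int -> X a -> Bool
isInRange j (X xs _) = j >= 0 && j < length xs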
Spectacularly, this same operation works with any number of dwarves, in any pointed container,
and so can be extended to a whole dwarf fortress if desired. I can similarly define a function
that moves a dwarf leftwards, or in any other deterministic fashion.
But now what if I want my dwarf to wander around aimlessly? Now my "shift randomly" must
only place a dwarf to the right if the same dwarf is not being placed to the left (for that would
make two dwarves out of one), and also it must never place two dwarves in the same place (which
would make one dwarf out of two). In other words, "shift randomly" must be linear (as in
"linear logic") when applied over a comonadic fortress.
One approach I have in mind is to assign some sort of state to dwarves that tracks the available
moves for a dwarf, removing moves from every relevant dwarf when we decide that the location is
taken by one of them. This way, the remaining dwarves will not be able to take that move. Or we
may track availability of locations. I am thinking that some sort of a "monadic" extendM
might be useful. (It would compare to the usual extend as traverse compares to fmap.)
But I am not aware of any prior art.
The easiest way to solve this is by using the MonadRandom library, which introduces a new monad for random computations. So let’s set up a computation using random numbers:
-- normal comonadic computation
type CoKleisli w a b = w a -> b
-- randomised comonadic computation
type RCoKleisli w a b = w a -> Rand b
Now, how to apply this thing? It’s easy enough to extend it:
halfApply :: Comonad w => (w a -> Rand b) -> (w a -> w (Rand b))
halfApply = extend
But this doesn’t quite work: it gives us a container of randomised values, whereas we want a randomised container of values. In other words, we need to find something which can do w (Rand b) -> Rand (w b). And in fact there does exist such a function: sequenceA! As the documentation states, if we apply sequenceA to a w (Rand b), it will run each Rand computation, then accumulate the results to get a Rand (w b) — which is exactly what we want! So:
fullApply :: (Comonad w, Traversable w, Applicative f)
          => (w a -> f b) -> (w a -> f (w b))
fullApply c = sequenceA . extend c
As you can see from the type signature above, this actually works for any Applicative (because all we require is that each applicative computation can be run in turn), but requires w to be Traversable (so we can traverse over each value in w).
(For more on this sort of thing, I recommend this blog post, plus its second part. If you want to see the above technique in action, I recommend my own probabilistic cellular automata library, back when it still used comonads instead of my own typeclass.)
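As an illustration (my own sketch, not from the answer), assuming X also has Foldable and Traversable instances, fullApply can be used with MonadRandom's Rand like this; noisyStep and stepWorld are hypothetical names:

import Control.Comonad (extract)
import Control.Monad.Random (Rand, StdGen, evalRand, getRandomR, mkStdGen)

-- Flip each cell with probability 0.1, looking only at the focused value.
noisyStep :: X Bool -> Rand StdGen Bool
noisyStep w = do
  p <- getRandomR (0, 1 :: Double)
  pure (if p < 0.1 then not (extract w) else extract w)

stepWorld :: X Bool -> Rand StdGen (X Bool)
stepWorld = fullApply noisyStep

-- e.g. evalRand (stepWorld x) (mkStdGen 42)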
So that answers one half of your question; that is, how to get probabilistic behaviour using comonads. The second half is:
… and also it must never place two dwarves in the same place …
This I’m not too sure about, but one solution could be to split your comonadic computation into three stages:
Convert every dwarf probabilistically to a diff stating whether that dwarf will move left, right, or stay. Type for this operation: mkDiffs :: X Dwarf -> Rand (X DwarfDiff)
Execute each diff, but keeping the original dwarf positions. Type for this operation: execDiffs :: X DwarfDiff -> X (DwarfDiff, [DwarfDiffed]).
Resolve situations where dwarfs have collided. Type for this operation: resolve :: X (DwarfDiff, [DwarfDiffed]) -> Rand (X Dwarf).
Types used above:
data Dwarf = Dwarf | NoDwarf
data DwarfDiff = MoveLeft | MoveRight | DontMove | NoDiff
data DwarfDiffed = MovedFromLeft | MovedFromRight | NothingMoved
Example of what I’m talking about:
myDwarfs = X [NoDwarf ,Dwarf ,NoDwarf ,Dwarf ,Dwarf ,Dwarf ] 0
mkDiffs myDwarfs
= X [NoDiff ,MoveRight ,NoDiff ,MoveLeft ,MoveRight ,DontMove ] 0
execDiffs (mkDiffs myDwarfs)
= X [(NoDiff,[NothingMoved]),(MoveRight,[NothingMoved]),(NoDiff,[MovedFromRight,MovedFromLeft]),(MoveLeft,[NothingMoved]),(MoveRight,[NothingMoved]),(DontMove,[MovedFromLeft])] 0
resolve (execDiffs (mkDiffs myDwarfs))
= X [NoDwarf ,NoDwarf ,Dwarf ,Dwarf ,Dwarf , Dwarf ] 0
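To make the first stage concrete, here is a hedged sketch of mkDiffs under the types above (my own code; it assumes X has a Traversable instance and picks uniformly among the three moves using MonadRandom's getRandomR):

import Control.Monad.Random (Rand, StdGen, getRandomR)

-- traverse sequences the per-cell random choices into one Rand computation.
mkDiffs :: X Dwarf -> Rand StdGen (X DwarfDiff)
mkDiffs = traverse pick
  where
    pick NoDwarf = pure NoDiff
    pick Dwarf   = do
      n <- getRandomR (0 :: Int, 2)
      pure ([MoveLeft, MoveRight, DontMove] !! n)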
As you can see, the above solution is pretty complicated. I have an alternate recommendation: don’t use comonads for this problem! Comonads are great for when you need to update one value based on its context, but are awful at updating multiple values simultaneously. The issue is that comonads such as your X are zippers, which store a data structure as a single ‘focused’ value plus a surrounding ‘context’. As I said, this is great for updating a focused value based on its context, but if you need to update multiple values, you have to shoehorn your computation into this value+context mould… which, as we saw above, can be pretty tricky. So possibly comonads aren’t the best choice for this application.

Best way to program a cost function (associate function - value)

I am trying to program a cost function in Haskell, but it seems I overestimated the capabilities of pattern matching. This is the code I have already defined:
-- Directions for the movement
data Direction = North | East | West | South deriving (Show, Eq)
-- An `Action` gets a Coord and returns another Coord if possible
type Action = Coord -> Maybe Coord
-- Move function; `move North` is an Action
move :: Direction -> Action
move d (x, y) = ...
My main problem is that now I have to define a Cost function such that:
type Cost = Coord -> Action -> Double
In case I wanted to have a simple cost function that only checks the direction to return a cost, the first idea that came to my mind was taking advantage of pattern matching, but this is invalid syntax (and to be honest, it seems quite fair):
mazeCost :: Cost
mazeCost (x, y) (move East) = 3
mazeCost (x, y) (move West) = 5
-- ... And on and on
My current solution involves computing the destination state and comparing that to the result of each action to check if that is the function that is being passed as an argument, but that seems hacky, not super brief and I think that maybe there is a much better way to do it in Haskell:
mazeCost :: Cost
mazeCost coord action
  | destination == east  = 1
  | destination == west  = 2
  | destination == north = 3
  | destination == south = 0
  where destination = action coord
        east  = move East coord
        west  = move West coord
        south = move South coord
        north = move North coord
Is there a better way to associate a function (Coord -> Direction -> Maybe Coord) with a cost value (Double)? This is a minimal example I tried to write, if there are any inconsistencies in the example code or something is not clear, please ask.
You're almost there, you just need ViewPatterns
{-# LANGUAGE ViewPatterns ... #-}
...
move ((x,y), d) = ... Just (d,...) ...

moveCost = curry moveCost' where
  moveCost' (move -> Just (East, (x,y))) = ...
  moveCost' (move -> Just (West, (x,y))) = ...
Note that we changed the type of move here. View patterns only work with single-argument functions that return something we can actually match on, so move has to be passed ((x,y),d) as a single tuple. That means moveCost' must accept a tuple as well, but we can recover the original interface by currying it with a wrapper function.
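To make the idea concrete, here is a self-contained sketch of the view-pattern approach (the Coord type, the step offsets and the cost values are my own assumptions, not from the answer):

{-# LANGUAGE ViewPatterns #-}

data Direction = North | East | West | South deriving (Show, Eq)
type Coord = (Int, Int)

-- The "view": given a coordinate paired with a direction, return the
-- direction together with the destination (Nothing would mark an illegal move).
move :: (Coord, Direction) -> Maybe (Direction, Coord)
move ((x, y), d) = Just (d, step d)
  where
    step North = (x, y + 1)
    step South = (x, y - 1)
    step East  = (x + 1, y)
    step West  = (x - 1, y)

moveCost :: Coord -> Direction -> Double
moveCost = curry moveCost'
  where
    moveCost' (move -> Just (East, _)) = 3
    moveCost' (move -> Just (West, _)) = 5
    moveCost' _                        = 1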
Cost takes a Coord and an Action. However, since actions need a coord and a direction and there's no direction in scope, you can't actually use the action. You need to apply the action to a coordinate and a direction in order to get a value. So the only implementations of mazeCost are ones that ignore the action entirely, which is not what you want.
However, if you had a direction in scope somehow then you could apply it (and the coord) to the action. One way to do this is through currying. A function Direction -> Cost is a function that takes a direction and gives a cost function for that direction.
mazeCost :: Direction -> Cost
mazeCost dir coord act = case act coord dir of
  Just (x', y') -> 0 -- or whatever the cost is
  Nothing       -> 0 -- or whatever the cost is
I think this is a nice illustration of the idea that partial application is similar to introducing a closure: mazeCost East is a function that closes over the Direction East and gives a Cost for that direction.
I should also note that I don't see how your proposed solution can typecheck. If type Action = Coord -> Direction -> Maybe Coord, which you did not specify, then the expression action coord is a function of type Direction -> Maybe Coord, and not a 'destination'. If I have misunderstood and it does typecheck then it is still a partial function that will cause a runtime error if the action is not equivalent to a simple move in one of the Directions, as luqui implied.

What structure should I use to express a turn in a board game?

I've got a working implementation of a Kalah solver, an application that calculates the optimal succession of moves on the first turn of the game.
I'm in the process of reimplementing this application, although this time with a test suite and (hopefully) prettier code that makes use of the more interesting structures like monoids or monads.
As you can see in the original code (or not, it's very convoluted and that's why I'm rewriting it) I've defined one "move" as follows:
I'm passing in a list of Pot as my board, along with a starting position on my side of the board.
I pick up and drop marbles until I get to the end of the list of Pot.
At the end of a "lap" I return the altered board ([Pot]), how many marbles I might be holding in my hand and an ADT expressing whether I should go for another lap or not (LapResult).
The thing is that I suspect that I wouldn't need to separate a move into laps if I expressed the board state with some clever data structure that I could both pass in as an input argument to a function and have that same data structure come out as a return value. At least that's my guess, my thought was that board state reminds me of what I've read about monoids.
So if I define one "move" as all the pick-up-and-drop-marbles until you land in an empty pot or in the store, is there some obvious way of rewriting the code for how a "move" works?
Current state of reimplementation can be found here.
Note: I have not tested any of this. It's probably buggy.
I think your problem is that you need to consider the board from two points of view, call them "White" and "Black".
data Player = White | Black
otherPlayer :: Player -> Player
otherPlayer White = Black
otherPlayer Black = White
The Mancala board is a circular structure, which suggests modular arithmetic. I'd suggest something like:
import Data.Vector -- More efficient version of Array
type PotNum = Int -- Use Int for simple index of pot position.
type Pot = Int -- Just record number of marbles in the pot.
You might get a more compact data structure by using Data.Word8 instead of Int, but I'm not sure. Keep it simple for the moment.
type Board = Vector Pot
Then have isStore be a simple function of PotNum and the player
isStore :: Player -> PotNum -> Bool
isStore White 0 = True
isStore Black 7 = True
isStore _ _ = False
You also want to move forwards around the board, skipping the other player's store:
nextPot :: Player -> PotNum -> PotNum
nextPot White 6 = 8 -- Skip Black's store
nextPot White 13 = 0
nextPot Black 12 = 0 -- Skip White's store
nextPot _ n = n + 1
A list of the controlled pots for each player
playerPots :: Player -> [PotNum] -- Implementation omitted.
The number of marbles in a given pot
marblesIn :: PotNum -> Board -> Int -- Implementation omitted.
Now you can write a move function. We'll have it return Nothing for an illegal move.
move :: Player -> PotNum -> Board -> Maybe Board -- Implementation omitted.
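For concreteness, here is one hedged sketch of the omitted pieces (my own code, assuming the numbering implied by isStore and nextPot above: White's store is pot 0, Black's is pot 7, and each player owns the six pots after their store). Kalah's capture and extra-turn rules are deliberately left out:

playerPots :: Player -> [PotNum]
playerPots White = [1 .. 6]
playerPots Black = [8 .. 13]

marblesIn :: PotNum -> Board -> Int
marblesIn n b = b ! n

-- Sow the marbles from pot n around the board; Nothing if the move is illegal.
move :: Player -> PotNum -> Board -> Maybe Board
move p n b
  | not (ownsPot p n) || marblesIn n b == 0 = Nothing
  | otherwise = Just (sow (nextPot p n) (marblesIn n b) (b // [(n, 0)]))
  where
    ownsPot White k = k >= 1 && k <= 6
    ownsPot Black k = k >= 8 && k <= 13
    sow _ 0 b' = b'
    sow k m b' = sow (nextPot p k) (m - 1) (b' // [(k, (b' ! k) + 1)])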
Using the List monad you can make this produce all the potential moves and resulting board states
allMoves :: Player -> Board -> [(PotNum, Board)]
allMoves p b1 = do
  n <- playerPots p
  case move p n b1 of
    Nothing -> fail "" -- List monad has this as []
    Just b2 -> return (n, b2)
So now you can get the complete game tree from any starting position using Data.Tree.unfoldTree, which takes a variant of the move function. This is slightly inelegant; we want to know the move that resulted in the position, but the initial position has no move leading to it. Hence the Maybe.
The unfoldTree function takes a function (f in the code below) which takes the current state and returns the current node and the list of child node values. The current state and the current node are both a triple of the player who just moved, the move they made, and the resulting board. Hence the first bit of "f". The second bit of "f" calls the "opponentMoves" function, which transforms the value returned by "allMoves" to add the right data.
unfoldGame :: Player -> Board -> Tree (Player, Maybe PotNum, Board)
unfoldGame p b = unfoldTree f (p, Nothing, b)
  where
    f (p1, n1, b1) = ((p1, n1, b1), opponentMoves (otherPlayer p1) b1)
    opponentMoves p2 b2 = map (\(n3, b3) -> (p2, Just n3, b3)) $ allMoves p2 b2
Now you just need to walk the tree. Each leaf is an end of the game because there are no legal moves left. The unfoldGame function is lazy so you only need the memory to hold the game states you are currently considering.
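As a hedged illustration of walking the tree lazily (prune and leaves are my own helpers, not part of Data.Tree), you can cut the tree off at a fixed depth and inspect the frontier:

import Data.Tree (Tree(..))

-- Cut the game tree off at a given depth so only a finite part is ever forced.
prune :: Int -> Tree a -> Tree a
prune 0 (Node x _)  = Node x []
prune d (Node x ts) = Node x (map (prune (d - 1)) ts)

-- Labels of the positions at the bottom of the (pruned) tree.
leaves :: Tree a -> [a]
leaves (Node x []) = [x]
leaves (Node _ ts) = concatMap leaves ts

-- e.g. leaves (prune 6 (unfoldGame White initialBoard))
--      where initialBoard is whatever starting Board you construct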

Haskell space leak in implementation of BFS

I have been banging my head against a Haskell space leak (of the stack overflow kind, naturally) for a few straight days. It's frustrating because I'm attempting to mimic the BFS algorithm straight from CLR, which is not naturally recursive. NB: I have enabled BangPatterns and I have put a bang in front of every possible place where one can go, in an attempt to branch-and-bound this problem, with no effect. I have battled through space leaks before, and I am loth to give up and cry for help on this one, but at this point I'm stuck. I love coding in Haskell, and I understand the Zen of functional programming pretty well, but debugging space leaks is about as much fun as rolling around on a floor full of thumbtacks.
That said, my trouble appears to be a space leak of the typical "accumulator" kind. The stack evidently builds up around calls to bfs' in the code below. Any space-leak protips much appreciated.
{-# LANGUAGE BangPatterns #-}
import qualified Data.Map as M
import qualified Data.IntSet as IS
import qualified Data.Sequence as S
import qualified Data.List as DL

data BfsColor = White | Gray | Black deriving Show

data Node = Node
  { neighbors :: !IS.IntSet
  , color     :: !BfsColor
  , depth     :: !Int
  }

type NodeID = Int
type NodeQueue = S.Seq NodeID
type Graph = M.Map NodeID Node

bfs :: Graph -> NodeID -> Graph
bfs graph start_node =
  bfs' (S.singleton start_node) graph

bfs' :: NodeQueue -> Graph -> Graph
bfs' !queue !graph
  | S.null queue = graph
  | otherwise =
      let (u,q1) = pop_left queue
          Node children _ n = graph M.! u
          (g2,q2) = IS.fold (enqueue_child_at_depth $ n+1) (graph,q1) children
          g3 = set_color u Black g2
      in bfs' q2 g3

enqueue_child_at_depth :: Int -> NodeID -> (Graph, NodeQueue)
                       -> (Graph, NodeQueue)
enqueue_child_at_depth depth child (graph,!queue) =
  case get_color child graph of
    White -> (set_color child Gray $ set_depth child depth graph,
              queue S.|> child)
    otherwise -> (graph,queue)

pop_left :: NodeQueue -> (NodeID, NodeQueue)
pop_left queue =
  let (a,b) = S.splitAt 1 queue
  in (a `S.index` 0, b)

set_color :: NodeID -> BfsColor -> Graph -> Graph
set_color node_id c graph =
  M.adjust (\node -> node{color=c}) node_id graph

get_color :: NodeID -> Graph -> BfsColor
get_color node_id graph = color $ graph M.! node_id

set_depth :: NodeID -> Int -> Graph -> Graph
set_depth node_id d graph =
  M.adjust (\node -> node{depth=d}) node_id graph
That looks much easier to understand. (You can still shrink the code by 1/2, though.)
Now, the nature of the space leak becomes apparent. Namely, the one thing that is never evaluated is the depth. It will pile up to a big expression 1+1+.... You can remove all the bang patterns and add a single one at
enqueue_child_at_depth !depth child (graph,queue)
to get rid of the space leak.
(Further code tips: You can replace the IS.IntSet by a simple list. The queue is best deconstructed and reconstructed along the lines of
go depth qs graph = case viewl qs of
  EmptyL  -> graph
  q :< qs ->
    let
      qs' = (qs ><) . Seq.fromList
          . filter (\q -> isWhite q graph)
          . neighbors q $ graph
    in ...
)
First of all, it would be very helpful if you could provide some simple test case (in the form of code) which demonstrates how this thing stack overflows.
Without it I, personally, can only speculate about the reason.
As a speculation: is IS.fold strict enough? Well, for example, the following simple code stack overflows as well (GHC with -O2):
{-# LANGUAGE BangPatterns #-}
import qualified Data.IntSet as IS

test s = IS.fold it 1 s
  where it !e !s = s+e

main = print $ test (IS.fromList [1..1000000])
The overflow problem with this code can be hackafixed (is there a better way?) like that:
test s = foldl' it 1 (IS.toList s)   -- foldl' from Data.List
  where it !e !s = s+e
Maybe you want to look at IS.fold in your code as well.
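As a further hedged note (mine, not the answerer's): with containers 0.5 or later, Data.IntSet also provides a strict left fold directly, which avoids both the hack and the intermediate list:

test :: IS.IntSet -> Int
test s = IS.foldl' (\acc e -> acc + e) 1 s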

How can iterative deepening search be implemented efficiently in Haskell?

I have an optimization problem I want to solve. You have some kind of data-structure:
data Foo = Foo
  { fooA :: Int
  , fooB :: Int
  , fooC :: Int
  , fooD :: Int
  , fooE :: Int
  }
and a rating function:
rateFoo :: Foo -> Int
I have to optimize the result of rateFoo by changing the values in the struct. In this specific case, I decided to use iterative deepening search to solve the problem. The (infinite) search tree for the best optimization is created by another function, which simply applies all possible changes recursively to the tree:
fooTree :: Foo -> Tree
My searching function looks something like this:
optimize :: Int -> Foo -> Foo
optimize threshold foo = undefined
The questions I had before I start are these:
As the tree can be generated from the data at each point, is it possible to have only those parts of the tree generated which are currently needed by the algorithm? Is it possible to have the memory freed and the tree regenerated when needed, in order to save memory (a leaf at level n can be generated in O(n), and n remains small, but not small enough to keep the whole tree in memory over time)?
Is this something I can expect from the runtime? Can the runtime unevaluate expressions (turn an evaluated expression into an unevaluated one)? Or what is the dirty hack I have to do for this?
The runtime does not unevaluate expressions.
There's a straightforward way to get what you want however.
Consider a zipper-like structure for your tree. Each node holds a value and a thunk representing down, up, etc. When you move to the next node, you can either move normally (placing the previous node value in the corresponding slot) or forgetfully (placing an expression which evaluates to the previous node in the right slot). Then you have control over how much "history" you hang on to.
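A hedged sketch of that idea (all names here are mine, not from the answer): store the parent context either as data or as a function that can rebuild it, so the evaluated form may be dropped and regenerated later.

-- Sketch only: a rose-tree zipper whose parent link is either kept or forgettable.
data Parent a
  = Kept   (Zipper a)        -- normal move: the parent is retained as data
  | Forgot (() -> Zipper a)  -- forgetful move: the parent can be recomputed on demand

data Zipper a = Zipper
  { focus    :: a
  , children :: [Zipper a]   -- produced lazily from the focus (e.g. via fooTree)
  , parent   :: Maybe (Parent a)
  }

-- Moving up re-evaluates the parent if it was stored forgetfully.
moveUp :: Zipper a -> Maybe (Zipper a)
moveUp z = case parent z of
  Nothing          -> Nothing
  Just (Kept p)    -> Just p
  Just (Forgot mk) -> Just (mk ())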
Here's my advice:
1. Just implement your algorithm in the most straightforward way possible.
2. Profile.
3. Optimize for speed or memory use if necessary.
I very quickly learned that I'm not smart and/or experienced enough to reason about what GHC will do or how garbage collection will work. Sometimes things that I'm sure will be disastrously memory-inefficient work smoothly the first time around, and, less often, things that seem simple require lots of fussing with strictness annotations, etc.
The Real World Haskell chapter on profiling and optimization is incredibly helpful once you get to steps 2 and 3.
For example, here's a very simple implementation of IDDFS, where f expands children, p is the search predicate, and x is the starting point.
search :: (a -> [a]) -> (a -> Bool) -> a -> Bool
search f p x = any (\d -> searchTo f p d x) [1..]
  where
    searchTo f p d x
      | d == 0    = False
      | p x       = True
      | otherwise = any (searchTo f p $ d - 1) (f x)
I tested by searching for "abbaaaaaacccaaaaabbaaccc" with children x = [x ++ "a", x ++ "bb", x ++ "ccc"] as f. It seems reasonably fast and requires very little memory (linear with the depth, I think). Why not try something like this first and then move to a more complicated data structure if it isn't good enough?
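For reference, a hedged reconstruction of that test (assuming the search starts from the empty string):

main :: IO ()
main = print (search children (== "abbaaaaaacccaaaaabbaaccc") "")
  where children x = [x ++ "a", x ++ "bb", x ++ "ccc"]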
