Haskell memory usage and IO

Haskell memory usage and IO - haskell

I had just written a piece of Haskell code where in order to debug my code I put in a bunch of print statements in my code (so, my most important function returned IO t, when it just needed to return t) and I saw that this function, on a successful run, would take up a lot of memory (roughly 1.2GB). Once I saw that the program was working fine, I removed all the print statements from the function and ran it, only to realize that it was giving me this error:
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.
Even though it was the same exact piece of code, for some reason the print statements made it ignore stack space overflow. Can anyone enlighten me as to why this happens?
I know I haven't provided my code which might make it harder to answer this question, but I've hacked a bunch of things together and it doesn't look very pretty so I doubt it would be useful and I am fairly certain that the only difference is the print statements.
EDIT:
Since people really wanted to see the code here is the relevant part:
linkCallers :: ([Int], Int, Int, I.IntDisjointSet, IntMap Int) -> ([Int], Int, Int, I.IntDisjointSet, IntMap Int)
linkCallers ([], x, y, us, im) = ([], x, y, us, im)
linkCallers ((a:b:r), x, y, us, im) = if a == b
then (r, x, y+1, us, im)
else if sameRep
then (r, x+1, y+1, us, im)
else (r, x+1, y+1, us', im')
where
ar = fst $ I.lookup a us
br = fst $ I.lookup b us
sameRep = case ar of
Nothing -> False
_ -> ar == br
as' = ar >>= flip lookup im
bs' = br >>= flip lookup im
totalSize = do
asize <- as'
bsize <- bs'
return $ asize + bsize
maxSize = (convertMaybe as') + (convertMaybe bs')
us' = I.union a b $ I.insert a $ I.insert b $ us
newRep = fromJust $ fst $ I.lookup a us'
newRep' = fromJust $ fst $ I.lookup b us'
im'' = case ar of
Nothing -> case br of
Nothing -> im
Just bk -> delete bk im
Just ak -> delete ak $ case br of
Nothing -> im
Just bk -> delete bk im
im' = case totalSize of
Nothing -> insert newRep maxSize im''
Just t -> insert newRep t im''
startLinkingAux' (c,x,y,us,im) = let t#(_,x',_,us',im') = linkCallers (c,x,y,us,im) in
case (fst $ I.lookup primeMinister us') >>= flip lookup im' >>= return . (>=990000) of
Just True -> x'
_ -> startLinkingAux' t
startLinkingAux' used to look something like this:
startLinkingAux' (c,x,y,us,im) = do
print (c,x,y,us,im)
let t#(_,x',_,us',im') = linkCallers (c,x,y,us,im) in
case (fst $ I.lookup primeMinister us') >>= flip lookup im' >>= return . (>=990000) of
Just True -> return x'
_ -> startLinkingAux' t

There could be a memory leak in one of the arguments. Probably the first thing I'd try would be to ask the author of disjoint-set to add a RFData instance for IntDisjointSet (or do it yourself, looking at the source code, it'd fairly easy). Then try calling force on all values returned by linkCallers to see if it helps.
Second, you're not using disjoint-set right. The main idea of the algorithm is that lookups compress paths in the set. This is what gives it it's great performance! So every time you make a lookup, you should replace your old set with a new one. But this makes using a disjoint set quite clumsy in a functional language. It'd suggest to use the State monad for this and use it internally in linkCallers, as one big do block instead of where, just passing the starting set and extracting the final one. And define functions like
insertS :: (MonadState IntDisjointSet m) => Int -> m ()
insertS x = modify (insert x)
lookupS :: (MonadState IntDisjointSet m) => Int -> m (Maybe Int)
lookupS x = state (lookup x)
-- etc
to use inside State. (Perhaps they'd be a good contribution to the library as well as this will be probably a common problem.)
Finally, there are lot of small improvements that can make the code more readable:
Many times you're applying a single function to two values. I'd suggest to define something like
onPair :: (a -> b) -> (a, a) -> (b, b)
onPair f (x, y) = (f x, f y)
-- and use it like:
(ar, br) = onPair (fst . flip I.lookup us) (a, b)
Also using Applicative functions can make things shorter:
sameRep = fromMaybe False $ (==) <$> ar <*> br
totalSize = (+) <$> as' <*> bs'
then also
im'' = maybe id delete ar . maybe id delete br $ im
im' = insert newRep (fromJust maxSize totalSize) im''
Hope it helps.

Related

Is this syntax as expressive as the do-notation?

The do notation allows us to express monadic code without overwhelming nestings, so that
main = getLine >>= \ a ->
getLine >>= \ b ->
putStrLn (a ++ b)
can be expressed as
main = do
a <- getLine
b <- getLine
putStrLn (a ++ b)
Suppose, though, the syntax allows ... #expression ... to stand for do { x <- expression; return (... x ...) }. For example, foo = f a #(b 1) c would be desugared as: foo = do { x <- b 1; return (f a x c) }. The code above could, then, be expressed as:
main = let a = #getLine in
let b = #getLine in
putStrLn (a ++ b)
Which would be desugared as:
main = do
x <- getLine
let a = x in
return (do
x' <- getLine
let b = x' in
return (putStrLn (a ++ b)))
That is equivalent. This syntax is appealing to me because it seems to offer the same functionality as the do-notation, while also allowing some shorter expressions such as:
main = putStrLn (#(getLine) ++ #(getLine))
So, I wonder if there is anything defective with this proposed syntax, or if it is indeed complete and equivalent to the do-notation.

putStrLn is already String -> IO (), so your desugaring ... return (... return (putStrLn (a ++ b))) ends up having type IO (IO (IO ())), which is likely not what you wanted: running this program won't print anything!
Speaking more generally, your notation can't express any do-block which doesn't end in return. [See Derek Elkins' comment.]
I don't believe your notation can express join, which can be expressed with do without any additional functions:
join :: Monad m => m (m a) -> m a
join mx = do { x <- mx; x }
However, you can express fmap constrained to Monad:
fmap' :: Monad m => (a -> b) -> m a -> m b
fmap' f mx = f #mx
and >>= (and thus everything else) can be expressed using fmap' and join. So adding join would make your notation complete, but still not convenient in many cases, because you end up needing a lot of joins.
However, if you drop return from the translation, you get something quite similar to Idris' bang notation:
In many cases, using do-notation can make programs unnecessarily verbose, particularly in cases such as m_add above where the value bound is used once, immediately. In these cases, we can use a shorthand version, as follows:
m_add : Maybe Int -> Maybe Int -> Maybe Int
m_add x y = pure (!x + !y)
The notation !expr means that the expression expr should be evaluated and then implicitly bound. Conceptually, we can think of ! as being a prefix function with the following type:
(!) : m a -> a
Note, however, that it is not really a function, merely syntax! In practice, a subexpression !expr will lift expr as high as possible within its current scope, bind it to a fresh name x, and replace !expr with x. Expressions are lifted depth first, left to right. In practice, !-notation allows us to program in a more direct style, while still giving a notational clue as to which expressions are monadic.
For example, the expression:
let y = 42 in f !(g !(print y) !x)
is lifted to:
let y = 42 in do y' <- print y
x' <- x
g' <- g y' x'
f g'
Adding it to GHC was discussed, but rejected (so far). Unfortunately, I can't find the threads discussing it.

How about this:
do a <- something
b <- somethingElse a
somethingFinal a b

Use two monads without a transformer

In order to understand how to use monad transformers, I wrote the following code without one. It reads standard input line by line and displays each line reversed until an empty line is encountered. It also counts the lines using State and in the end displays the total number.
import Control.Monad.State
main = print =<< fmap (`evalState` 0) go where
go :: IO (State Int Int)
go = do
l <- getLine
if null l
then return get
else do
putStrLn (reverse l)
-- another possibility: fmap (modify (+1) >>) go
rest <- go
return $ do
modify (+1)
rest
I wanted to add the current line number before each line. I was able to do it with StateT:
import Control.Monad.State
main = print =<< evalStateT go 0 where
go :: StateT Int IO Int
go = do
l <- lift getLine
if null l
then get
else do
n <- get
lift (putStrLn (show n ++ ' ' : reverse l))
modify (+1)
go
My question is: how to do the same in the version without monad transformers?

The problem you're having is that the hand-unrolling of StateT s IO a is s -> IO (s, a), not IO (s -> (s, a))! Once you have this insight, it's pretty easy to see how to do it:
go :: Int -> IO (Int, Int)
go s = do
l <- getLine
if null l
then return (s, s)
else do
putStrLn (show s ++ ' ' : reverse l)
go (s+1)

You'd just need to run the accumulated state computation on every line. This is O(n²) time, but since your first program is already using O(n) space, that's not too terrible. Of course, the StateT approach is superior in pretty much every way! If you really want to do it "by hand" and not pay an efficiency price, just manage the state by hand instead of building a state transformer at all. You're really not getting any benefit by using State instead of Int in the first program.

Maybe this is what you are looking for?
main = print =<< fmap (`evalState` 0) (go get) where
go :: State Int Int -> IO (State Int Int)
go st = do
l <- getLine
if null l
then return (st >>= \_ -> get)
else do
let ln = evalState st 0
putStrLn(show ln ++ ' ' : reverse l)
go (st >>= \_ -> modify (+1) >>= \_ -> get)
The idea here is to make go tail recursive, building up your state computation, which you can then evaluate at each step.
EDIT
This version will bound the size of the state computation to a constant size, although under lazy evaluation, when the previous state computation is forced, we should be able to reuse it without re-evaluating it, so I'm guessing that these are essentially the same...
main = print =<< fmap (`evalState` 0) (go get) where
go :: State Int Int -> IO (State Int Int)
go st = do
l <- getLine
if null l
then return st
else do
let ln = evalState st 0
putStrLn(show ln ++ ' ' : reverse l)
go (modify (\s -> s+ln+1) >>= \_ -> get)

Haskell: put in State monad seems to be elided

I'm writing a program to allocate pizzas to people; each person will get one pizza, ideally of their favorite type, unless stock has run out, in which case they are given their next favorite type recursively.
My approach is to compute a ((User, Pizza), Int) for the amount a person would like said pizza, sort those, and recurse through using a state monad to keep inventory counts.
The program is written and type checks:
allocatePizzasImpl :: [((User, Pizza), Int)]
-> State [(Pizza, Int)] [(User, Pizza)]
allocatePizzasImpl [] = return []
allocatePizzasImpl ((user, (flavor, _)):ranks) =
do inventory <- get
-- this line is never hit
put $ updateWith inventory (\i -> if i <= 0
then Nothing
else Just $ i - 1) flavor
next <- allocatePizzasImpl $ filter ((/= user) . fst) ranks
return $ (user, flavor) : next
and I have a helper function to extract the result:
allocatePizzas :: [Pizza]
-> [((User, Pizza), Int)]
-> [(User, Pizza)]
allocatePizzas pizzas rank = fst
. runState (allocatePizzasImpl rank)
$ buildQuotas pizzas
but the line indicated by -- this line is never hit is... never hit by any GHCI breakpoints; furthermore, if I break on the return call, GHCI says inventory isn't in scope.
When run, the result is assigning the same pizza (with one inventory count) to all users. Something is going wrong, but I have absolutely no idea how to proceed. I'm new to Haskell, so any comments on style would be appreciated as well =)
Thanks!
PS: For completeness, updateWith is defined as:
updateWith :: (Eq a, Eq b)
=> [(a, b)] -- inventory
-> (b -> Maybe b) -- update function; Nothing removes it
-> a -- key to update
-> [(a, b)]
updateWith set update key =
case lookup key set of
Just b -> replace set
(unwrapPair (key, update b))
(fromMaybe 0 $ elemIndex (key, b) set)
Nothing -> set
where replace :: [a] -> Maybe a -> Int -> [a]
replace [] _ _ = []
replace (_:xs) (Just val) 0 = val:xs
replace (_:xs) Nothing 0 = xs
replace (x:xs) val i = x : (replace xs val $ i - 1)
unwrapPair :: Monad m => (a, m b) -> m (a, b)
unwrapPair (a, mb) = do b <- mb
return (a, b)

I think your function replace is broken:
replace (_:xs) (Just val) 0 = val:xs
This doesn't pay any attention to the value it's replacing. Wasn't your intention to replace just the pair corresponding to key?
I think you want
updateWith [] e k = []
updateWith ((k', v):kvs) e k
| k' == k = case e v of
Just v' -> (k, v'):kvs
Nothing -> kvs
| otherwise = (k', v) : updateWith kvs e k

The issue (ignoring other conceptual things mentioned by the commenters) turned out to be using fst to extract the result from the State would for some reason not cause the State to actually be computed. Running the result through seq fixed it.
I'd be interested in knowing why this is the case, though!
Edit: As Daniel Wagner pointed out in the comments, I wasn't actually using inventory, which turned out to be the real bug. Marking this as accepted.

How to increment a variable in functional programming?

How do you increment a variable in a functional programming language?
For example, I want to do:
main :: IO ()
main = do
let i = 0
i = i + 1
print i
Expected output:
1

Simple way is to introduce shadowing of a variable name:
main :: IO () -- another way, simpler, specific to monads:
main = do main = do
let i = 0 let i = 0
let j = i i <- return (i+1)
let i = j+1 print i
print i -- because monadic bind is non-recursive
Prints 1.
Just writing let i = i+1 doesn't work because let in Haskell makes recursive definitions — it is actually Scheme's letrec. The i in the right-hand side of let i = i+1 refers to the i in its left hand side — not to the upper level i as might be intended. So we break that equation up by introducing another variable, j.
Another, simpler way is to use monadic bind, <- in the do-notation. This is possible because monadic bind is not recursive.
In both cases we introduce new variable under the same name, thus "shadowing" the old entity, i.e. making it no longer accessible.
How to "think functional"
One thing to understand here is that functional programming with pure — immutable — values (like we have in Haskell) forces us to make time explicit in our code.
In imperative setting time is implicit. We "change" our vars — but any change is sequential. We can never change what that var was a moment ago — only what it will be from now on.
In pure functional programming this is just made explicit. One of the simplest forms this can take is with using lists of values as records of sequential change in imperative programming. Even simpler is to use different variables altogether to represent different values of an entity at different points in time (cf. single assignment and static single assignment form, or SSA).
So instead of "changing" something that can't really be changed anyway, we make an augmented copy of it, and pass that around, using it in place of the old thing.

As a general rule, you don't (and you don't need to). However, in the interests of completeness.
import Data.IORef
main = do
i <- newIORef 0 -- new IORef i
modifyIORef i (+1) -- increase it by 1
readIORef i >>= print -- print it
However, any answer that says you need to use something like MVar, IORef, STRef etc. is wrong. There is a purely functional way to do this, which in this small rapidly written example doesn't really look very nice.
import Control.Monad.State
type Lens a b = ((a -> b -> a), (a -> b))
setL = fst
getL = snd
modifyL :: Lens a b -> a -> (b -> b) -> a
modifyL lens x f = setL lens x (f (getL lens x))
lensComp :: Lens b c -> Lens a b -> Lens a c
lensComp (set1, get1) (set2, get2) = -- Compose two lenses
(\s x -> set2 s (set1 (get2 s) x) -- Not needed here
, get1 . get2) -- But added for completeness
(+=) :: (Num b) => Lens a b -> Lens a b -> State a ()
x += y = do
s <- get
put (modifyL x s (+ (getL y s)))
swap :: Lens a b -> Lens a b -> State a ()
swap x y = do
s <- get
let x' = getL x s
let y' = getL y s
put (setL y (setL x s y') x')
nFibs :: Int -> Int
nFibs n = evalState (nFibs_ n) (0,1)
nFibs_ :: Int -> State (Int,Int) Int
nFibs_ 0 = fmap snd get -- The second Int is our result
nFibs_ n = do
x += y -- Add y to x
swap x y -- Swap them
nFibs_ (n-1) -- Repeat
where x = ((\(x,y) x' -> (x', y)), fst)
y = ((\(x,y) y' -> (x, y')), snd)

There are several solutions to translate imperative i=i+1 programming to functional programming. Recursive function solution is the recommended way in functional programming, creating a state is almost never what you want to do.
After a while you will learn that you can use [1..] if you need a index for example, but it takes a lot of time and practice to think functionally instead of imperatively.
Here's a other way to do something similar as i=i+1 not identical because there aren't any destructive updates. Note that the State monad example is just for illustration, you probably want [1..] instead:
module Count where
import Control.Monad.State
count :: Int -> Int
count c = c+1
count' :: State Int Int
count' = do
c <- get
put (c+1)
return (c+1)
main :: IO ()
main = do
-- purely functional, value-modifying (state-passing) way:
print $ count . count . count . count . count . count $ 0
-- purely functional, State Monad way
print $ (`evalState` 0) $ do {
count' ; count' ; count' ; count' ; count' ; count' }

Note: This is not an ideal answer but hey, sometimes it might be a little good to give anything at all.
A simple function to increase the variable would suffice.
For example:
incVal :: Integer -> Integer
incVal x = x + 1
main::IO()
main = do
let i = 1
print (incVal i)
Or even an anonymous function to do it.

awkward monad transformer stack

Solving a problem from Google Code Jam (2009.1A.A: "Multi-base happiness") I came up with an awkward (code-wise) solution, and I'm interested in how it could be improved.
The problem description, shortly, is: Find the smallest number bigger than 1 for which iteratively calculating the sum of squares of digits reaches 1, for all bases from a given list.
Or description in pseudo-Haskell (code that would solve it if elem could always work for infinite lists):
solution =
head . (`filter` [2..]) .
all ((1 `elem`) . (`iterate` i) . sumSquareOfDigitsInBase)
And my awkward solution:
By awkward I mean it has this kind of code: happy <- lift . lift . lift $ isHappy Set.empty base cur
I memoize results of the isHappy function. Using the State monad for the memoized results Map.
Trying to find the first solution, I did not use head and filter (like the pseudo-haskell above does), because the computation isn't pure (changes state). So I iterated by using StateT with a counter, and a MaybeT to terminate the computation when condition holds.
Already inside a MaybeT (StateT a (State b)), if the condition doesn't hold for one base, there is no need to check the other ones, so I have another MaybeT in the stack for that.
Code:
import Control.Monad.Maybe
import Control.Monad.State
import Data.Maybe
import qualified Data.Map as Map
import qualified Data.Set as Set
type IsHappyMemo = State (Map.Map (Integer, Integer) Bool)
isHappy :: Set.Set Integer -> Integer -> Integer -> IsHappyMemo Bool
isHappy _ _ 1 = return True
isHappy path base num = do
memo <- get
case Map.lookup (base, num) memo of
Just r -> return r
Nothing -> do
r <- calc
when (num < 1000) . modify $ Map.insert (base, num) r
return r
where
calc
| num `Set.member` path = return False
| otherwise = isHappy (Set.insert num path) base nxt
nxt =
sum . map ((^ (2::Int)) . (`mod` base)) .
takeWhile (not . (== 0)) . iterate (`div` base) $ num
solve1 :: [Integer] -> IsHappyMemo Integer
solve1 bases =
fmap snd .
(`runStateT` 2) .
runMaybeT .
forever $ do
(`when` mzero) . isJust =<<
runMaybeT (mapM_ f bases)
lift $ modify (+ 1)
where
f base = do
cur <- lift . lift $ get
happy <- lift . lift . lift $ isHappy Set.empty base cur
unless happy mzero
solve :: [String] -> String
solve =
concat .
(`evalState` Map.empty) .
mapM f .
zip [1 :: Integer ..]
where
f (idx, prob) = do
s <- solve1 . map read . words $ prob
return $ "Case #" ++ show idx ++ ": " ++ show s ++ "\n"
main :: IO ()
main =
getContents >>=
putStr . solve . tail . lines
Other contestants using Haskell did have nicer solutions, but solved the problem differently. My question is about small iterative improvements to my code.

Your solution is certainly awkward in its use (and abuse) of monads:
It is usual to build monads piecemeal by stacking several transformers
It is less usual, but still happens sometimes, to stack several states
It is very unusual to stack several Maybe transformers
It is even more unusual to use MaybeT to interrupt a loop
Your code is a bit too pointless :
(`when` mzero) . isJust =<<
runMaybeT (mapM_ f bases)
instead of the easier to read
let isHappy = isJust $ runMaybeT (mapM_ f bases)
when isHappy mzero
Focusing now on function solve1, let us simplify it.
An easy way to do so is to remove the inner MaybeT monad. Instead of a forever loop which breaks when a happy number is found, you can go the other way around and recurse only if the
number is not happy.
Moreover, you don't really need the State monad either, do you ? One can always replace the state with an explicit argument.
Applying these ideas solve1 now looks much better:
solve1 :: [Integer] -> IsHappyMemo Integer
solve1 bases = go 2 where
go i = do happyBases <- mapM (\b -> isHappy Set.empty b i) bases
if and happyBases
then return i
else go (i+1)
I would be more han happy with that code.
The rest of your solution is fine.
One thing that bothers me is that you throw away the memo cache for every subproblem. Is there a reason for that?
solve :: [String] -> String
solve =
concat .
(`evalState` Map.empty) .
mapM f .
zip [1 :: Integer ..]
where
f (idx, prob) = do
s <- solve1 . map read . words $ prob
return $ "Case #" ++ show idx ++ ": " ++ show s ++ "\n"
Wouldn't your solution be more efficient if you reused it instead ?
solve :: [String] -> String
solve cases = (`evalState` Map.empty) $ do
solutions <- mapM f (zip [1 :: Integer ..] cases)
return (unlines solutions)
where
f (idx, prob) = do
s <- solve1 . map read . words $ prob
return $ "Case #" ++ show idx ++ ": " ++ show s

The Monad* classes exist to remove the need for repeated lifting. If you change your signatures like this:
type IsHappyMemo = Map.Map (Integer, Integer) Bool
isHappy :: MonadState IsHappyMemo m => Set.Set Integer -> Integer -> Integer -> m Bool
This way you can remove most of the 'lift's. However, the longest sequence of lifts cannot be removed, since it is a State monad inside a StateT, so using the MonadState type class will give you the outer StateT, where you need tot get to the inner State. You could wrap your State monad in a newtype and make a MonadHappy class, similar to the existing monad classes.

ListT (from the List package) does a much nicer job than MaybeT in stopping the calculation when necessary.
solve1 :: [Integer] -> IsHappyMemo Integer
solve1 bases = do
Cons result _ <- runList . filterL cond $ fromList [2..]
return result
where
cond num = andL . mapL (isHappy Set.empty num) $ fromList bases
Some elaboration on how this works:
Had we used a regular list the code would had looked like this:
solve1 bases = do
result:_ <- filterM cond [2..]
return result
where
cond num = fmap and . mapM (isHappy Set.empty num) bases
This calculation happens in a State monad, but if we'd like to get the resulting state, we'd have a problem, because filterM runs the monadic predicate it gets for every element of [2..], an infinite list.
With the monadic list, filterL cond (fromList [2..]) represents a list that we can access one item at a time as a monadic action, so our monadic predicate cond isn't actually executed (and affecting the state) unless we consume the corresponding list items.
Similarly, implementing cond using andL makes us not calculate and update the state if we already got a False result from one of the isHappy Set.empty num calculations.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Haskell memory usage and IO - haskell

Related

Is this syntax as expressive as the do-notation?

Use two monads without a transformer

Haskell: put in State monad seems to be elided

How to increment a variable in functional programming?

awkward monad transformer stack

Categories

Resources