I recently stumbled upon the loeb and moeb functions here and I'm trying to explore what they can do.
I'm trying to achieve spreadsheet-like behaviour with the possibility of performing IO in some "cells". I thought moeb traverse looked like a good candidate for this, but any non-trivial function in the list (i.e. anything other than const $ return something) caused the entire call to run forever. After that I tried to test it in the State monad:
import Data.Function (fix)
import Control.Monad.State

moeb f x = fix $ \g -> f ($ g) x

foo v = do
  x <- get
  vs <- v
  put (x + 3)
  return (x + (vs !! 0))

test = [ const $ return 7
       , foo
       , fmap length
       ]

main = print $ runState (moeb traverse test) 5
The result was this:
([7,12,3],moeb.hs: out of memory
Why does this happen? foo both gets and sets the state, and it evaluates fine, yet evaluation of the final state hangs.
And how can I achieve spreadsheet-with-IO behaviour that terminates?
moeb traverse test :: State Int [Int] is an action to produce a list of integers.
If you unfold the definition of moeb, you get
moeb traverse test
= traverse ($ moeb traverse test) test
meaning that each element of the spreadsheet is passed the action moeb traverse test to be run, from scratch, instead of using the result of that action recursively.
Generalizing moeb using mfix might help but I doubt the result will be worth the trouble.
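For what it's worth, here is a rough sketch of that direction (the names moebFix and cells are mine, and the cells now receive the final list of results rather than an action):

import Control.Monad.Fix (MonadFix, mfix)
import Control.Monad.State

-- Like moeb, but the knot is tied through mfix: every cell gets the final
-- list of results (lazily), instead of an action that recomputes the whole
-- spreadsheet from scratch.
moebFix :: MonadFix m => [[a] -> m a] -> m [a]
moebFix fs = mfix $ \results -> traverse ($ results) fs

-- The example cells, rewritten to take the results list directly:
cells :: [[Int] -> State Int Int]
cells =
  [ const (return 7)
  , \vs -> do
      x <- get
      put (x + 3)
      return (x + head vs)   -- depends on the value of cell 0
  , return . length          -- depends only on the spine of the results
  ]

-- runState (moebFix cells) 5 should terminate now, giving something like ([7,12,3],8).

The knot works because the lazy State monad's mfix only forces what each cell actually demands; the length cell needs just the spine of the list, so the final state should now be computable.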
I'm relatively new to Polysemy, and I'm trying to wrap my head around how to use NonDet correctly. Specifically, let's say I've got this computation
generate :: Member NonDet r => Sem r Int
generate = msum $ fmap pure [0..]
computation :: (Member NonDet r, Member (Final IO) r) => Sem r ()
computation = do
  n <- generate
  guard (n == 100)
  embedFinal $ print n
It's a horribly inefficient way to print the number 100, but it demonstrates the problem I'm having. Now, I want to run this effect only insofar as to get the first success. That is, I want to run this effect long enough to "find" the number 100 and print it, and then I want to stop.
My first attempt
attempt1 :: IO ()
attempt1 = void . runFinal . runNonDet @[] $ computation
This one fails to short-circuit. It prints 100 but then hangs forever, looking for the number 100 again. That makes sense; after all, I didn't actually tell it I only wanted one solution. So let's try that.
My second attempt
runNonDetOnce :: Sem (NonDet ': r) a -> Sem r (Maybe a)
runNonDetOnce = fmap listToMaybe . runNonDet
attempt2 :: IO ()
attempt2 = void . runFinal . runNonDetOnce $ computation
All we're doing here is discarding all but the head of the list. Understandably, this didn't change anything. Haskell already wasn't evaluating the list, so discarding an unused value changes nothing. Like attempt1, this solution hangs forever after printing 100.
My third attempt
attempt3 :: IO ()
attempt3 = void . runFinal . runNonDetMaybe $ computation
So I tried using runNonDetMaybe. This one, unfortunately, just exits without printing anything. Figuring out why that is took a bit, but I have a theory. The documentation says
Unlike runNonDet, uses of <|> will not execute the second branch at all if the first option succeeds.
So it's greedy and doesn't backtrack after success, basically. Thus, it runs my computation like this.
computation = do
  n <- generate        -- Ah yes, n = 0. Excellent!
  guard (n == 100)     -- Wait, 0 /= 100! Failure! We can't backtrack, so abort.
  embedFinal $ print n
Non-Solutions
In this small example, we could just alter the computation a bit, like so
computation :: (Member NonDet r, Member (Final IO) r) => Sem r ()
computation = msum $ fmap (\n -> guard (n == 100) >> embedFinal (print n)) [0..]
So rather than generate a number and then check it later, we simply move generate inside of computation. With this computation, attempt3 succeeds, since we can get to the "correct" answer without backtracking. This works in this small example, but it's infeasible for a larger codebase. Unless someone has a good systematic way of avoiding backtracking, I don't see a good way to generalize this solution to computations that span over multiple files in a large program.
The other non-solution is to cheat using IO.
computation :: (Member NonDet r, Member (Final IO) r) => Sem r ()
computation = do
  n <- generate
  guard (n == 100)
  embedFinal $ print n
  embedFinal $ exitSuccess
Now attempt1 and attempt2 succeed, since we simply forcibly exit the program after success. But, aside from feeling incredibly sloppy, this doesn't generalize either. I want to stop running the current computation after finding 100, not the whole program.
So, to summarize, I want the computation given in the first code snippet above to be run using Polysemy in some way that causes it to backtrack (in NonDet) until it finds one successful value (in the example above, n = 100) and then stop running side effects and end the computation. I tried delving into the source code of runNonDetMaybe and co in the hopes of being able to reproduce something similar that has the effect I want, but my Polysemy skills are not nearly at the level of understanding all of the Weaving and decomp shenanigans happening there. I hope someone here who has more expertise with this library than I do can point me in the right direction for running NonDet with the desired effects.
Now attempt1 and attempt2 succeed, since we simply forcibly exit the program after success. But, aside from feeling incredibly sloppy, this doesn't generalize either. I want to stop running the current computation after finding 100, not the whole program.
Rather than exitSuccess, a closely related idea is to throw an exception that you can catch in the interpreter.
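For instance, something along these lines (a sketch I haven't run; Error, throw and runError are from Polysemy.Error, but the exact wiring is my own guess):

{-# LANGUAGE DataKinds, FlexibleContexts, TypeApplications #-}

import Control.Monad (guard, msum)
import Polysemy
import Polysemy.Error
import Polysemy.Final
import Polysemy.NonDet

generate :: Member NonDet r => Sem r Int
generate = msum $ fmap pure [0..]

-- Throw the answer as soon as we have it.  Because runError sits outside
-- runNonDet below, the throw should escape the whole search, not just the
-- current branch.
computation :: (Member NonDet r, Member (Error Int) r, Member (Final IO) r) => Sem r ()
computation = do
  n <- generate
  guard (n == 100)
  embedFinal $ print n
  throw n

attempt4 :: IO (Either Int [()])
attempt4 =
  runFinal . runError . runNonDet @[] $
    (computation :: Sem '[NonDet, Error Int, Final IO] ())

A Left result then carries the value that was found, and a Right means the search ran to completion without a success.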
Consider the two following variations:
myReadListTailRecursive :: IO [String]
myReadListTailRecursive = go []
  where
    go :: [String] -> IO [String]
    go l = do {
      inp <- getLine;
      if (inp == "") then
        return l;
      else go (inp:l);
    }
myReadListOrdinary :: IO [String]
myReadListOrdinary = do
  inp <- getLine
  if inp == ""
    then return []
    else do
      moreInps <- myReadListOrdinary
      return (inp:moreInps)
In ordinary programming languages, one would know that the tail recursive variant is a better choice.
However, going through this answer, it is apparent that Haskell's implementation of recursion is not like repeatedly pushing frames onto a call stack.
But because in this case the program in question involves actions, and a strict monad, I am not sure if the same reasoning applies. In fact, I think in the IO case, the tail recursive form is indeed better. I am not sure how to correctly reason about this.
EDIT: David Young pointed out that the outermost call here is to (>>=). Even in that case, does one of these styles have an advantage over the other?
FWIW, I'd go for existing monadic combinators and focus on readability/conciseness. Using unfoldM :: Monad m => m (Maybe a) -> m [a]:
import Control.Monad (liftM, mfilter)
import Control.Monad.Loops (unfoldM)

myReadListTailRecursive :: IO [String]
myReadListTailRecursive = unfoldM go
  where
    go :: IO (Maybe String)
    go = do
      line <- getLine
      return $ case line of
        "" -> Nothing
        s  -> Just s
Or using the MonadPlus instance of Maybe, with mfilter :: MonadPlus m => (a -> Bool) -> m a -> m a:
myReadListTailRecursive :: IO [String]
myReadListTailRecursive = unfoldM (liftM (mfilter (/= "") . Just) getLine)
Another, more versatile option might be to use LoopT.
That’s really not how I would write it, but it’s clear enough what you’re doing. (By the way, if you want to be able to efficiently insert arbitrary output from any function in the chain, without using monads, you might try a Data.ByteString.Builder.)
Your first implementation is very similar to a left fold, and your second very similar to a right fold or map. (You might try actually writing them as such!) The second one has several advantages for I/O. One of the most important, for handling input and output, is that it can be interactive.
You’ll notice that the first builds the entire list from the outside in: in order to determine what the first element of the list is, the program needs to compute the entire structure to get to the innermost thunk, which is return l. The program generates the entire data structure first, then starts to process it. That’s useful when you’re reducing a list, because tail-recursive functions and strict left folds are efficient.
With the second, the outermost thunk contains the head and tail of the list, so you can grab the tail, then call the thunk to generate the second list. This can work with infinite lists, and it can produce and return partial results.
Here’s a contrived example: a program that reads in one integer per line and prints the sums so far.
main :: IO ()
main = interact ( display . compute 0 . parse . lines )
  where
    parse :: [String] -> [Int]
    parse []     = []
    parse (x:xs) = (read x) : (parse xs)

    compute :: Int -> [Int] -> [Int]
    compute _ []          = []
    compute accum (x:xs) =
      let accum' = accum + x
      in accum' : compute accum' xs

    display = unlines . map show
If you run this interactively, you’ll get something like:
$ 1
1
$ 2
3
$ 3
6
$ 4
10
But you could also write compute tail-recursively, with an accumulating parameter:
main :: IO ()
main = interact ( display . compute [] . parse . lines )
  where
    parse :: [String] -> [Int]
    parse = map read

    compute :: [Int] -> [Int] -> [Int]
    compute xs []         = reverse xs
    compute [] (y:ys)     = compute [y] ys
    compute (x:xs) (y:ys) = compute (x+y:x:xs) ys

    display = unlines . map show
This is an artificial example, but strict left folds are a common pattern. If, however, you write either compute or parse with an accumulating parameter, this is what you get when you try to run interactively, and hit EOF (control-D on Unix, control-Z on Windows) after the number 4:
$ 1
$ 2
$ 3
$ 4
1
3
6
10
This left-folded version needs to compute the entire data structure before it can output any of it. That can't ever work on an infinite list (When would you reach the base case? How would you even reverse an infinite list if you did?), and an application that can't respond to user input until it quits is a deal-breaker.
On the other hand, the tail-recursive version can be strict in its accumulating parameter, and will run more efficiently, especially when it's not being consumed immediately. It doesn't need to keep any thunks or context around other than its parameters, and it can even re-use the same stack frame. A strict accumulating function, such as Data.List.foldl', is a great choice whenever you're reducing a list to a value, not building an eagerly-evaluated list of output. Functions such as sum, product or any can't return any useful intermediate value. They inherently have to finish the computation first, then return the final result.
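To make that concrete, a minimal sketch (mine, not part of the examples above) that reduces the whole input to one value with a strict left fold; it cannot print anything until the input ends:

import Data.List (foldl')

main :: IO ()
main = interact (show . total . map read . lines)
  where
    total :: [Integer] -> Integer
    total = foldl' (+) 0  -- strict accumulator: no thunk buildup, but no output until EOF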
I want to basically map over a list and at the same time carry along some state. I figured combining the list and state monads might get me there. I tried a few things and figured out that I likely need to use ListT for that. As a simplified version of my actual problem, imagine that I want to implement the sum function, while also returning a modified version of the original list. This or similar is what I imagined it would have to look like:
sum' :: ListT (State Int) Int
sum' = do
  lift $ put 0
  x <- [1,2,3]
  lift $ modify (+x)
  return $ x + 1
What I don't get yet is how the syntax of the regular list monad translates to the ListT monad. I cannot simply write x <- [1,2,3], since on the right-hand side of the arrow a value of type ListT (State Int) t0 is expected. x <- return [1,2,3] compiles (as in, it keeps the compiler from complaining about this line), but it puts the whole list into x instead of giving me each element.
How do I get this working?
x <- ListT $ return [1,2,3]
or
x <- msum $ return <$> [1,2,3]
will do the trick.
ListT . return just injects a list into a list-transformed monad stack in a way that respects its structure.
msum uses the fact that ListT is the transformer mapping a monad to the free MonadPlus monoid over it.
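For completeness, a full sketch of sum' along those lines; this assumes the classic ListT from transformers' Control.Monad.Trans.List (deprecated, and removed in recent transformers releases) together with mtl's State:

import Control.Monad.State (State, modify, put, runState)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.List (ListT (..))

sum' :: ListT (State Int) Int
sum' = do
  lift $ put 0
  x <- ListT $ return [1, 2, 3]  -- one branch per element, as in the plain list monad
  lift $ modify (+ x)
  return $ x + 1

-- runState (runListT sum') 0 should give ([2,3,4],6):
-- the state accumulates across all branches, while the list collects each branch's result.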
You have a sequence of actions that prefer to be executed in chunks due to some high-fixed overhead like packet headers or making connections. The limit is that sometimes the next action depends on the result of a previous one in which case, all pending actions are executed at once.
Example:
mySession :: Session IO ()
mySession = do
  a <- readit   -- nothing happens yet
  b <- readit   -- nothing happens yet
  c <- readit   -- nothing happens yet
  if a          -- all three readits execute because we need a
    then write "a"
    else write "..."
  if b || c     -- b and c already available
    ...
This reminds me of so many Haskell concepts but I can't put my finger on it.
Of course, you could do something obvious like:
[a,b,c] <- batch([readit, readit, readit])
But I'd like to hide the fact of chunking from the user for slickness purposes.
Not sure if Session is the right word. Maybe you can suggest a better one? (Packet, Batch, Chunk and Deferred come to mind.)
Update
I think there was a really good answer last night that I read on my phone but when I came back to look for it today it was gone. Was I dreaming?
I don't think you can do exactly what you want, since what you describe exploits Haskell's lazy evaluation to have the evaluation of a force the actions that compute b and c, and there's no way to seq on unspecified values.
What I could do was hack together a monad transformer that delayed actions sequenced via >> so that they could be executed all together:
data Session m a = Session { pending :: [m ()], final :: m a }

runSession :: Monad m => Session m a -> m a
runSession (Session ms ma) = foldr (flip (>>)) (return ()) ms >> ma

instance Monad m => Monad (Session m) where
  return = Session [] . return
  s >>= f = Session [] $ runSession s >>= (runSession . f)
  (Session ms ma) >> (Session ms' ma') =
    Session (ms' ++ (ma >> return ()) : ms) ma'
This violates some monad laws, but lets you do something like:
import Debug.Trace (trace)

liftIO :: IO a -> Session IO a
liftIO = Session []

exampleSession :: Session IO Int
exampleSession = do
  liftIO $ putStrLn "one"
  liftIO $ putStrLn "two"
  liftIO $ putStrLn "three"
  liftIO $ putStrLn "four"
  trace "five" $ return 5
and get
ghci> runSession exampleSession
five
one
two
three
four
5
ghci> length (pending exampleSession)
4
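A note for newer GHCs: since 7.10 a Monad instance also needs Functor and Applicative instances, so to compile the snippet you'd add something like this minimal sketch, defined via the Monad operations above:

-- Minimal instances for the Applicative-Monad hierarchy (GHC 7.10+);
-- they inherit the law-breaking batching behaviour of the Monad instance.
instance Monad m => Functor (Session m) where
  fmap f s = s >>= (return . f)

instance Monad m => Applicative (Session m) where
  pure = Session [] . return
  sf <*> sa = sf >>= \f -> fmap f sa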
This Session approach is very similar to what Haxl does.
For more info:
Open sourcing haxl - Facebook Code Blog
ICFP 2014 talk
You could use the unsafeInterleaveIO function. It is a dangerous function that can introduce bugs to your program if not used carefully, but it does what you're asking for.
You can insert it into your example code like this:
import System.IO.Unsafe (unsafeInterleaveIO)

lazyReadits :: IO [a]
lazyReadits = unsafeInterleaveIO $ do
  a <- readit
  r <- lazyReadits
  return (a:r)
unsafeInterleaveIO makes the action as a whole lazy, but once it starts evaluating it will evaluate as if it had been strict. This means in my above example: readit will run as soon as something tests whether the returned list is empty or not. If I'd used mapM unsafeInterleaveIO (replicate 3 readit) instead, then readit would only be run when the actual elements of the list are evaluated, which would make the contents of the list depend on the order in which its elements are inspected, which is one example of how unsafeInterleaveIO can introduce bugs.
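For a quick way to see that second behaviour, here's a tiny sketch where readit is stubbed out as getLine (the stub and the little main are mine):

import System.IO.Unsafe (unsafeInterleaveIO)

-- Stand-in for the real readit, just so this can be run:
readit :: IO String
readit = getLine

main :: IO ()
main = do
  -- Each element is interleaved separately, so its getLine runs only when that
  -- particular element is demanded; here, only the second element's action runs.
  xs <- mapM unsafeInterleaveIO (replicate 3 readit)
  putStrLn (xs !! 1)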
I have the following problem: given a [String] and a String -> IO Int, I can map over the list and get [IO Int]. Now I have to do two things: perform those actions, from the start, until a result is positive, and I need to know whether the whole list was processed.
I am forbidden to perform any further actions after that first positive result.
takeWhileM doesn't answer the second question (comparing lengths is too impractical), and spanM performs the forbidden IO.
Of course, I could write a recursive function myself, but I want to do it the Haskell way, with all the benefits of higher-order functions.
Suggestions? Or perhaps a completely different approach?
The task above is a slightly simplified task from my project.
You can use allM from the monad-loops package:
Prelude Control.Monad.Loops> let xs = ["a", "bb", "ccc", "dddd", "eeeee"]
Prelude Control.Monad.Loops> let f x = putStrLn x >> return (length x)
Prelude Control.Monad.Loops> let p x = x < 2
Prelude Control.Monad.Loops> allM (fmap p . f) xs
a
bb
False
There's also an allM in Control.Monad.ListM, but it's not appropriately lazy—it will continue to perform computations after you hit a positive result.
(I'm with you on this, by the way—I hate writing one-off recursive functions.)
I'm not familiar with the functions takeWhileM and spanM (and neither is Hoogle) (edit: as per a comment, they can be found in Control.Monad.ListM).
Given that, I think the best thing for you to do is to make a one-off function to perform this task. If it later turns out that you need to write code to do something similar, then you can factor out the common parts and re-use them. There's nothing wrong with writing one-off code in general, it's code duplication that's bad.
There are a few ways to write the function you want - one possible way is like this:
process :: [IO Int] -> IO Bool
process []  = return True
process [a] = a >> return True
process (a:as) = do
  n <- a
  if n > 0
    then return False
    else process as
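For example, a quick usage sketch of the process above (the helper f and the sample data here are mine):

demo :: IO Bool
demo = process (map f ["a", "bb", "ccc"])
  where
    -- f returns 0 for "a" (not positive: keep going) and 1 for "bb" (positive: stop).
    f :: String -> IO Int
    f s = putStrLn s >> return (length s - 1)

-- demo prints "a" and "bb", never runs the action for "ccc",
-- and returns False because the whole list was not processed.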
@illusionoflife: I don't see how using takeWhileM would improve on @Chris's solution.
For example:
import Control.Monad.ListM

process :: [IO Int] -> IO Bool
process as = do
  taken <- takeWhileM (>>= return . (<= 0)) as
  return (length taken >= length as - 1)
(Code not verified!)
@Chris's looks more readable, among other things because with his solution we don't need to figure out whether we should use >= or ==. Besides, since I call length, it can't be used on an infinite input list.