Looping over a monadic predicate - haskell

For looping over a function until a predicate holds there is
until :: (a -> Bool) -> (a -> a) -> a -> a
Yet, this falls short once the predicate has the form
Monad m => (a -> m b)
The only way I found out of this is via explicit recursion, e.g. when reading from a handle until EOF is reached:
(_, (Just stdout), _, _) <- createProcess (proc "task" (args fl)){ std_out = CreatePipe }
let readH :: IO [Either String Task] -> IO [Either String Task]
readH l = do eof <- hIsEOF stdout
if eof
then l
else do line <- hGetLine stdout
l' <- l
readH.return $ (eitherDecodeStrict' line) : l'
out <- readH $ return []
Is there a higher order function that simplifies this? Maybe together with sequence?

You can define a "monadic until" function yourself, for example
untilM :: Monad m => (a -> m Bool) -> (a -> m a) -> a -> m a
untilM p f = go
where
go x = do r <- p x
if r
then return x
else do a <- f x
go a
or perhaps, if your predicate doesn't need an argument,
untilM :: Monad m => m Bool -> (a -> m a) -> a -> m a
untilM p f = go
where
go x = do r <- p
if r
then return x
else do a <- f x
go a
or even, you don't want any arguments at all,
untilM :: Monad m => m Bool -> m a -> m ()
untilM p f = do r <- p
if r
then return ()
else do f
untilM p f

Let's refactor your code until we arrive at such a combinator.
let readH :: IO [Either String Task] -> IO [Either String Task]
readH l = do eof <- hIsEOF stdout
if eof
then l
else do line <- hGetLine stdout
l' <- l
readH.return $ (eitherDecodeStrict' line) : l'
out <- readH $ return []
First I want to point out the superfluous returns. In this code you never call readH without an accompanying return. The argument to readH can actually be pure by simply removing the unnecessary returns. Notice that we had to add return l on the then branch, and no longer have to "perform" l' <- l on the else branch.
let readH :: [Either String Task] -> IO [Either String Task]
readH l = do eof <- hIsEOF stdout
if eof
then return l
else do line <- hGetLine stdout
readH $ (eitherDecodeStrict' line) : l
out <- readH []
Okay, now I'm going to rename a few things for clarity and slightly reformat.
let -- how to check the stop condition
condition :: IO Bool
condition = hIsEOF stdout
let -- what IO to do at each iteration
takeOneStep :: IO ByteString
takeOneStep = hGetLine stdout
let -- what pure work to do at each iteration
pureTransform :: ByteString -> Either String Task
pureTransform = eitherDecodeStrict'
let readH :: [Either String Task] -> IO [Either String Task]
readH theRest = do
isDone <- condition
if isDone
then return theRest
else do
raw <- takeOneStep
readH (pureTransform raw : theRest)
out <- readH []
Make sure you understand how this version of the code is the same as the last version; it just has a few expressions renamed and factored out.
pureTransform is a bit of a red herring here. We can bundle it with takeOneStep instead.
let -- how to check the stop condition
condition :: IO Bool
condition = hIsEOF stdout
let -- what IO to do at each iteration
takeOneStep :: IO (Eiter String Task)
takeOneStep = do
line <- hGetLine stdout
return (eitherDecodeStrict' line)
let readH :: [Either String Task] -> IO [Either String Task]
readH theRest = do
isDone <- condition
if isDone
then return theRest
else do
thisStep <- takeOneStep
readH (thisStep : theRest)
out <- readH []
Re-read the body of readH at this point. Notice that none of it is specific to this particular task anymore. It now describes a general sort of looping over takeOneStep until condition holds. In fact, it had that generic structure the whole time! It's just that the generic structure can be seen now that we've renamed the task-specific bits. By making takeOneStep and condition arguments of the function, we arrive at the desired combinator.
untilIO :: IO Bool -> IO (Either String Task) -> [Either String Task] -> IO [Either String Task]
untilIO condition takeOneStep theRest = do
isDone <- condition
if isDone
then return theRest
else do
thisStep <- takeOneStep
untilIO (thisStep : theRest)
Notice that this combinator, as implemented, doesn't have to be constrained to Either String Task; it can work for any type a instead of Either String Task.
untilIO :: IO Bool -> IO a -> [a] -> IO [a]
Notice that this combinator, as implemented, doesn't have to even be constrained to IO. It can work for any Monad m instead of IO.
untilM :: Monad m => m Bool -> m a -> [a] -> m [a]
The moral of the story is this: by figuring how to write "looping over a monadic predicate" via explicit recursion for your particular use case, you have already written the general combinator! It's right there in the structure of your code, waiting to be discovered.
There are a couple ways this could be cleaned up further, such as removing the [] argument and building up the list in order (currently the list comes out reversed, you'll notice), but those are beyond the point I'm trying to make right now, and so are left as exercises to the reader. Assuming you've done both of those things, you end up with
untilM :: m Bool -> m a -> m [a]
Which I would use in your example like so:
(_, (Just stdout), _, _) <- createProcess (proc "task" (args fl)){ std_out = CreatePipe }
out <- untilM (hIsEof stdout) $ do
line <- hGetLine stdout
return (eitherDecodeStrict' line)
Looks a lot like an imperative-style "until" loop!
If you swap the argument order, then you end up with something nearly equivalent to Control.Monad.Loops.untilM. Note that unlike our solution here, Control.Monad.Loops.untilM (annoyingly!) always performs the action before checking the condition, so it's not quite safe for use in this case if you might be dealing with empty files. They apparently expect you to use untilM infix, which makes it look like a do-while, hence the flipped arguments and "body then condition" nonsense.
(do ...
...
) `untilM` someCondition

Related

Haskell get values from IO domain

After reading the Haskell books I am kind of confused (or I simply forgot) how to get a value from the IO domain, into the 'Haskell world' to parse it, like so:
fGetSeq = do
input <- sequence [getLine, getLine, getLine]
fTest input
mapM_ print input
fTest = map (read :: String -> Int)
Obviously compiler complains. Couldn't match [] with IO. Is there a simple rule of thumb for passing values between 'worlds' or is it just my bad by omitting typesigs?
The thing about do notation is, every monadic action value in it (those to the right of <-s, or on their own line) must belong to the same monad. It's
do {
x <- ma ; -- ma :: m a x :: a
y <- mb ; -- mb :: m b y :: b ( with the same m! )
return (foo x y) -- foo x y :: c return (foo x y) :: m c
} -- :: m c
Now, since sequence [getLine, getLine, getLine] :: IO [String], this means your do block belongs in IO.
But you can treat the values in their own right, when you got them:
fGetSeq :: IO ()
fGetSeq = do
inputs <- sequence [getLine, getLine, getLine] -- inputs :: [String]
let vals = fTest inputs
mapM_ print vals
fTest :: [String] -> [Int]
fTest = map (read :: String -> Int)
-- or just
fGetSeq1 = do
inputs <- sequence [getLine, getLine, getLine]
mapM_ print ( fTest inputs )
-- or
fGetSeq2 = do { vals <- fTest <$> sequence [getLine, getLine, getLine] ;
mapM_ print vals } -- vals :: [Int]
-- or even (with redundant parens for clarity)
fGetSeq3 = mapM_ print =<< ( fTest <$> sequence [getLine, getLine, getLine] )
-- = mapM_ print . fTest =<< sequence [getLine, getLine, getLine]
The essence of Monad is the layering of the pure 'Haskell world' calculations in between the potentially impure, 'effectful' computations.
So we already are in the pure Haskell world, on the left hand side of that <-. Again, inputs :: [String]. A pure value.
get a value from the IO domain, into the 'Haskell world'
You use the bind operator: (>>=) :: Monad m => m a -> (a -> m b) -> m b.
If m = IO it looks like: (>>=) :: IO a -> (a -> IO b) -> IO b.
As you can see, the function with type a -> IO b addresses the a without IO.
So given a value in the IO monad, e.g. getLine :: IO String:
getInt :: IO Int
getInt = getLine >>= (\s -> return (read s))
Here, s :: String, read :: String -> Int, and return :: Int -> IO Int.
You can rewrite this using a do-block:
getInt :: IO Int
getInt = do
s <- getLine
return (read s)
Or use the standard library function that does exactly this:
getInt :: IO Int
getInt = readLn
As for your example, you can immediately fix it using a let-binding:
foo :: IO ()
foo = do
input <- sequence [getLine, getLine, getLine]
let ints = bar input
mapM_ print ints
bar :: [String] -> [Int]
bar = map read
Or you can restructure it to use getInt as defined above:
foo :: IO ()
foo = sequence [getInt, getInt, getInt] >>= mapM_ print

putStr inside IO () function

How to call IO () function inside another IO () function? I want to print to standard output then call function to do the same.
For example,
p :: String -> IO ()
p [x] = putStr x
p xs = q xs
q :: String -> IO ()
q (x:xs) = putStr x ++ p xs
Your first problem is with typing
p [x] = putStr x
{- putStr :: String -> IO ()
x :: Char, not a String
-}
and
q (x:xs) = putStr x ++ p xs
{- (++) :: [a] -> [a] -> [a]
putStr x :: IO (), not a list of anything.
-}
Let's look at q first, since it follows from p. You're breaking it down into characters, so you should use putChar rather than putStr
Also we're looking at sequencing actions, so we should either use (>>) or (>>=) depending on whether or not you need the result. In this case the result is a value of the unit type (()) which is a useless result and safe to ignore.
q :: String -> IO ()
q (x:xs) = putChar x >> p xs
{- or using `do` notation:
q (x:xs) = do
putChar x
p xs
-}
p can be changed likewise to use putChar rather than putStr
p :: String -> IO ()
p [x] = putChar x
p xs = q xs
though be aware that you haven't matched an empty list on either p or q.
About this time you should notice that substituting putChar for putStr just so you can break strings down to Chars is kind of backward thinking. p = putStr and you're done. However, if you're committed to this backward thinking:
import Control.Monad (foldM_, mapM_)
p = foldM_ (\_ x -> putChar x) ()
-- or
p = foldM_ ((putChar .) . flip const) ()
-- or
p = mapM_ putChar

Sequencing basic parsers in Haskell and Frege using do notation

I try to run snippets from chapter 8 about functional parsers in Graham Hutton's 'Programming in Haskell' both in ghci and frege-repl.
I'm not able to sequence parsers using do syntax.
I have following definitions in Frege (Haskell version differs only with simpler item definition that doesn't pack and unpack String and Char and is the same as in the book):
module Parser where
type Parser a = String -> [(a, String)]
return :: a -> Parser a
return v = \inp -> [(v, inp)]
-- this is Frege version
item :: Parser Char
item = \inp ->
let inp' = unpacked inp
in
case inp' of
[] -> []
(x:xs) -> [(x,packed xs)]
parse :: Parser a -> String -> [(a, String)]
parse p inp = p inp
-- sequencing
(>>=) :: Parser a -> (a -> Parser b) -> Parser b
p >>= f = \inp -> case (parse p inp) of
[] -> []
[(v,out)] -> parse (f v) out
p :: Parser (Char, Char)
p = do x <- Parser.item
Parser.item
y <- Parser.item
Parser.return (x,y)
-- this works
p' :: Parser (Char, Char)
p' = item Parser.>>= \x ->
item Parser.>>= \_ ->
item Parser.>>= \y ->
Parser.return (x,y)
p' works both in ghci and frege-repl. However, when trying loading module I got those messages. First from ghci:
src/Parser.hs:38:8:
Couldn't match type ‘[(Char, String)]’ with ‘Char’
Expected type: String -> [((Char, Char), String)]
Actual type: Parser ([(Char, String)], [(Char, String)])
In a stmt of a 'do' block: Parser.return (x, y)
In the expression:
do { x <- item;
item;
y <- item;
Parser.return (x, y) }
Failed, modules loaded: none.
frege-repl is even less friendly because it simply kicks me out from repl with an error stack trace:
Exception in thread "main" frege.runtime.Undefined: returnTypeN: too many arguments
at frege.prelude.PreludeBase.error(PreludeBase.java:18011)
at frege.compiler.Utilities.returnTypeN(Utilities.java:1937)
at frege.compiler.Utilities.returnTypeN(Utilities.java:1928)
at frege.compiler.GenJava7$80.eval(GenJava7.java:11387)
at frege.compiler.GenJava7$80.eval(GenJava7.java:11327)
at frege.runtime.Fun1$1.eval(Fun1.java:63)
at frege.runtime.Delayed.call(Delayed.java:198)
at frege.runtime.Delayed.forced(Delayed.java:267)
at frege.compiler.GenJava7$78.eval(GenJava7.java:11275)
at frege.compiler.GenJava7$78.eval(GenJava7.java:11272)
at frege.runtime.Fun1$1.eval(Fun1.java:63)
at frege.runtime.Delayed.call(Delayed.java:200)
at frege.runtime.Delayed.forced(Delayed.java:267)
at frege.control.monad.State$IMonad_State$4.eval(State.java:1900)
at frege.control.monad.State$IMonad_State$4.eval(State.java:1897)
at frege.runtime.Fun1$1.eval(Fun1.java:63)
at frege.runtime.Delayed.call(Delayed.java:198)
at frege.runtime.Delayed.forced(Delayed.java:267)
at frege.control.monad.State$IMonad_State$4.eval
...
My intuition is that I need something apart >>= and return or there is something I should tell compilers. Or maybe I need to put p definition into State monad?
This is because String -> a is the monad that is being used in your do notation, since one of the instances of Monad in the Prelude is the function arrow.
Therefore, for example, the x in x <- Parser.item is an argument of type [(Char, String)].
You can get around this by making Parser a newtype and defining your own custom Monad instance for it.
The following works with Frege (and should work the same way with GHC language extension RebindableSyntax):
module P
where
type Parser a = String -> [(a, String)]
return :: a -> Parser a
return v = \inp -> [(v, inp)]
-- this is Frege version
item :: Parser Char
item = maybeToList . uncons
parse :: Parser a -> String -> [(a, String)]
parse p inp = p inp
-- sequencing
(>>=) :: Parser a -> (a -> Parser b) -> Parser b
p >>= f = \inp -> case (parse p inp) of
[] -> []
[(v,out)] -> parse (f v) out
p :: Parser (Char, Char)
p = do
x <- item
item
y <- item
return (x,y)
main = println (p "Frege is cool")
It prints:
[(('F', 'r'), "ege is cool")]
The main difference to your version is a more efficient item function, but, as I said before, this is not the reason for the stack trace. And there was this small indentation problem with the do in your code.
So yes, you can use the do notation here, though some would call it "abuse".

IO inside the Get Monad

So my problem is as follows. I'm trying to implement a streaming parser for RDB files (the dump files that Redis produces). I want to implement a function similar to mapM_ whereby I can , say print out each object represented in the dump file as it is parsed. However, I can't seem to get it to operate in constant space. I find that what is happening is that I'm building a large IO() thunk inside of the Get monad, returning from the Get monad and then executing the IO. Is there anyway to stream my objects as they are parsed to print and then discard them? I've tried Enumerators and Conduits but I haven't seen any real gain. Here is what I have so far:
loadObjs_ :: (Monad m) => (Maybe Integer -> BL8.ByteString -> RDBObj -> Get (m a)) -> Get (m a)
loadObjs_ f = do
code <- lookAhead getWord8
case code of
0xfd -> do
skip 1
expire <- loadTime
getPairs_ f (Just expire)
0xfc -> do
skip 1
expire <- loadTimeMs
getPairs_ f (Just expire)
0xfe -> f Nothing "Switching Database" RDBNull
0xff -> f Nothing "" RDBNull
_ -> getPairs_ f Nothing
getPairs_ :: (Monad m) => (Maybe Integer -> BL8.ByteString -> RDBObj -> Get (m a)) -> Maybe Integer -> Get (m a)
getPairs_ f ex = do
!t <- getWord8
!key <- loadStringObj False
!obj <- loadObj t
!rest <- loadObjs_ f
!out <- f ex key obj
return (out >> rest)
(loadObj does the actual parsing of a single object but I believe that whatever I need to fix the streaming to operate in constant or near-constant memory is at a higher level in the iteration than loadObj)
getDBs_ :: (Monad m) => (Maybe Integer -> BL8.ByteString -> RDBObj -> Get (m a)) -> Get (m a)
getDBs_ f = do
opc <- lookAhead getWord8
if opc == opcodeSelectdb
then do
skip 1
(isEncType,dbnum) <- loadLen
objs <- loadObjs_ f
rest <- getDBs_ f
return (objs >> rest)
else f Nothing "EOF" RDBNull
processRDB_ :: (Monad m) => (Maybe Integer -> BL8.ByteString -> RDBObj -> Get (m a)) -> Get (m a)
processRDB_ f = do
header <- getBytes 9
dbs <- getDBs_ f
eof <- getWord8
return (dbs)
printRDBObj :: Maybe Integer -> BL8.ByteString -> RDBObj -> Get (IO ())
printRDBObj (Just exp) key obj = return $ (print ("Expires: " ++ show exp) >>
print ("Key: " ++ (BL8.unpack key)) >>
print ("Obj: " ++ show obj))
printRDBObj Nothing key RDBNull = return $ (print $ BL8.unpack key)
printRDBObj Nothing key obj = return $ (print ("Key: " ++ (BL8.unpack key)) >>
print ("Obj: " ++ show obj))
main = do
testf <- BL8.readFile "./dump.rdb"
runGet (processRDB_ printRDBObj) testf
Thanks all in advance.
Best,
Erik
EDIT: Here is my attempt to parse the objects into a lazy list and then IO over the lazy list.
processRDB :: Get [RDBObj]
processRDB = do
header <- getBytes 9
dbs <- getDBs
eof <- getWord8
return (dbs)
main = do
testf <- BL8.readFile "./dump.rdb"
mapM_ (print . show) $ runGet processRDB testf
If I understand your code correctly, you are trying to convert the file contents into IO actions incrementally, in the hope of then executing those actions incrementally.
A better approach would be to have your parser return a lazy list of objects which you then print out.

getLine and is value integer

I get value using x <- getLine, how can I check that x can be interpreted as an integer number?
do x <- getLine
case filter (\(_,s) -> s == "") (reads x :: [(Int, String)]) of
[] -> putStrLn "x cannot be parsed as an Int"
(xAsInt, _) : _
-> putStrLn (concat ["x can be parsed as an Int, *and* its Int value is ",
show xAsInt])
Look into Data.Char.isNumber.
Haskell: Check if integer, or check type of variable
You could create a maybeIO function that performs an IO action in a catch, returning Just the result of the action if successful, or Nothing if an exception occurred. Then, you can use readLn instead of getLine + reads, with maybeIO to convert any exception into a Nothing.
import Control.Monad (liftM)
maybeIO :: IO a -> IO (Maybe a)
maybeIO f = catch (liftM Just f) (const $ return Nothing)
main = do
i <- maybeIO (readLn :: IO Int)
print i

Resources