How can I untangle multiple Maybes nested at different levels?

How can I untangle multiple Maybes nested at different levels? - haskell

Working with Maybe seems really difficult in Haskell. I was able to implement the function I need after many frustrating compile errors, but it's still completely disorganized and I don't know how else can I improve it.
I need to:
extract multiple nested ... Maybes into one, final Maybe ...
Do a -> b -> IO () with Just a and Just b or nothing (*)
Here is example with IO part removed. I need a -> b -> IO (), not (a,b) -> IO () later but I couldn't figure out how to pass both arguments otherwise (I can mapM_ with one argument only).
import Network.URI
type URL = String
type Prefix = String
fubar :: String -> Maybe (Prefix, URL)
fubar url = case parseURI url of
Just u -> (flip (,) $ url)
<$> (fmap ((uriScheme u ++) "//" ++ ) ((uriRegName <$> uriAuthority u)))
_ -> Nothing
Result:
> fubar "https://hackage.haskell.org/package/base-4.9.0.0/docs/src/Data.Foldable.html#mapM"
Just ("https://hackage.haskell.org"
,"https://hackage.haskell.org/package/base-4.9.0.0/docs/src/Data.Foldable.html#mapM"
)
(*) printing what failed to parse wrong would be nice

This is pretty simple written with do notation:
fubar :: String -> Maybe (Prefix, URL)
fubar url = do
u <- parseURI url
scheme <- uriScheme u
authority <- uriAuthority u
return (scheme ++ "//" ++ uriRegName authority, url)
Monads in general (and Maybe in particular) are all about combining m (m a) into m a. Each <- binding is an alternate syntax for a call to >>=, the operator responsible for aborting if it sees a Nothing, and otherwise unwrapping the Just for you.

First note that you're just stacking multiple fmaps there, with α <$> (fmap β (γ <$> uriAuthority u)). This can (functor laws!) be rewritten α . β . γ <$> uriAuthority u, i.e.
{-# LANGUAGE TupleSections #-}
...
Just u -> (,url) . ((uriScheme u++"//") ++ ) . uriRegName <$> uriAuthority u
It might be better for legibility to actually keep the layers separate, but then you should also give them names as amalloy suggests.
Further, more strongly:
Extract multiple nested M into one, final M
Well, sounds like monads, doesn't it?
fubar url = do
u <- parseURI url
(,url) . ((uriScheme u++"//") ++ ) . uriRegName <$> uriAuthority u

I'm not entirely clear on precisely what you're asking, but I'll do my best to answer the questions you have presented.
To extract multiple nested Maybes into a single final Maybe is taken care of by Maybe's monad-nature (also applicative-nature). How specifically to do it depends on how they are nested.
Simplest example:
Control.Monad.join :: (Monad m) => m (m a) -> m a
-- thus
Control.Monad.join :: Maybe (Maybe a) -> Maybe a
A tuple:
squishTuple :: (Maybe a, Maybe b) -> Maybe (a,b)
squishTuple (ma, mb) = do -- do in Maybe monad
a <- ma
b <- mb
return (a,b)
-- or
squishTuple (ma, mb) = liftA2 (,) ma mb
A list:
sequenceA :: (Applicative f, Traversable t) => t (f a) -> f (t a)
-- thus
sequenceA :: [Maybe a] -> Maybe [a]
-- (where t = [], f = Maybe)
Other structures can be flattened by composing these and following the types. For example:
flattenComplexThing :: (Maybe a, [Maybe (Maybe b)]) -> Maybe (a, [b])
flattenComplexThing (ma, mbs) = do
a <- ma
bs <- (join . fmap sequenceA . sequenceA) mbs
return (a, bs)
That join . fmap sequenceA . sequenceA line is a bit complex, and it takes some getting used to to know how to construct things like this. My brain works in a very type-directed way (read the composition right-to-left):
[Maybe (Maybe b)]
|
sequenceA :: [Maybe _] -> Maybe [_]
↓
Maybe [Maybe b]
|
-- sequenceA :: [Maybe b] -> Maybe [b]
-- fmap f makes the function f work "inside" the Maybe, so
fmap sequenceA :: Maybe [Maybe b] -> Maybe (Maybe [b])
↓
Maybe (Maybe [b])
|
join :: Maybe (Maybe _) -> Maybe _
↓
Maybe [b]
As for the second question, how to do an a -> b -> IO () when you have Maybe a and Maybe b, presuming you don't want to do the action at all in the case that either one is Nothing, you just do some gymnastics:
conditional :: (a -> IO ()) -> Maybe a -> IO ()
conditional = maybe (return ())
conditional2 :: (a -> b -> IO ()) -> Maybe a -> Maybe b -> IO ()
conditional2 f ma mb = conditional (uncurry f) (liftA2 (,) ma mb)
Again I found conditional2 in a mainly type-directed way in my mind.
It takes some time to develop your type gymnastics, but then it starts to get really fun. To make code like this readable, it's important to use helper functions, e.g. conditional above, and name them well (which is arguable about conditional :-). You'll gradually get familiar with the standard library's helpers. There's no magic bullet here, you just have to get used to it -- it's a language. Work with it, strive for clarity, if something is too ugly try your best to make it prettier. And ask more specific questions :-)

Related

Can `(>>=)` be redeclared as `(a -> m b) -> m a -> m b`?

In Haskell Monad is declared as
class Applicative m => Monad m where
return :: a -> m a
(>>=) :: m a -> (a -> m b) -> m b
return = pure
I was wondering if it is okay to redeclare the bind operator as
(>>=) :: (a -> m b) -> m a -> m b
?
Is it correct that the second declaration makes it clearer that (>>=) maps a function of type a -> m b to a function of type m a -> m b, while the original declaration makes less clear what it means?
Will that change of declaration make something from possible to impossible, or just require some change of using monad (which seems bearable to Haskell programmers)?
Thanks.

There's one reason why >>= tends to be more useful in practice than it's flipped counterpart =<<: it plays nicely with lambda notation. Namely, \ acts as a syntactic herald, so you can continue the computation without needing any parentheses. For instance,
do x <- [1..5]
y <- [10..20]
return $ x*y
can be rewritten very easily in terms of >>= as
[1..5] >>= \x -> [10..20] >>= \y -> return $ x*y
You still have much the same “imperative flow” feel as with the do version.
Whereas with =<< it would require awkward parentheses and seem to read backwards:
(\x -> (\y -> return $ x*y) =<< [10..20]) =<< [1..5]
Ok, you might say this feels more like function application. But where that is useful, it is often more poignant to use only the applicative functor interface rather than the monadic one:
(\x y -> x*y) <$> [1..5] <*> [10..20]
or short
(*) <$> [1..5] <*> [10..20]
Note that (<*>) :: f (a->b) -> f a -> f b has essentially the order of =<< that you propose, just with the a-> inside the functor rather than outside.

Haskell : concat two IO Strings

Today I have tried to concat two IO Strings and couldn't get it work.
So, the problem is: suppose we have s1 :: IO String and s2 :: IO String. How to implement function (+++) :: IO String -> IO String -> IO String, which works exactly as (++) :: [a] -> [a] -> [a] but for IO String?
And more general question is how to implement more general function (+++) :: IO a -> IO a -> IO a? Or maybe even more general?

You can use liftM2 from Control.Monad:
liftM2 :: Monad m => (a -> b -> c) -> m a -> m b -> m c
> :t liftM2 (++)
liftM2 (++) :: Monad m => m [a] -> m [a] -> m [a]
Alternatively, you could use do notation:
(+++) :: Monad m => m [a] -> m [a] -> m [a]
ms1 +++ ms2 = do
s1 <- ms1
s2 <- ms2
return $ s1 ++ s2
Both of these are equivalent. In fact, the definition for liftM2 is implemented as
liftM2 :: Monad m => (a -> b -> c) -> m a -> m b -> m c
liftM2 f m1 m2 = do
val1 <- m1
val2 <- m2
return $ f val1 val2
Very simple! All it does is extract the values from two monadic actions and apply a function of 2 arguments to them. This goes with the function liftM which performs this operation for a function of only one argument. Alternatively, as pointed out by others, you can use IO's Applicative instance in Control.Applicative and use the similar liftA2 function.
You might notice that generic Applicatives have similar behavior to generic Monads in certain contexts, and the reason for this is because they're mathematically very similar. In fact, for every Monad, you can make an Applicative out of it. Consequently, you can also make a Functor out of every Applicative. There are a lot of people excited about the Functor-Applicative-Monad proposal that's been around for a while, and is finally going to be implemented in an upcoming version of GHC. They make a very natural hierarchy of Functor > Applicative > Monad.

import Control.Applicative (liftA2)
(+++) :: Applicative f => f [a] -> f [a] -> f [a]
(+++) = liftA2 (++)
Now in GHCI
>> getLine +++ getLine
Hello <ENTER>
World!<ENTER>
Hello World!

(++) <$> pure "stringOne" <*> pure "stringTwo"

implement function (+++) ... which works exactly as (++) :: [a] -> [a] -> [a] but for IO String?
Don't do that, it's a bad idea. Concatenating strings is a purely functional operation, there's no reason to have it in the IO monad. Except at the place where you need the result – which would be somewhere in the middle of some other IO I suppose. Well, then just use do-notation to bind the read strings to variable names, and use ordinary (++) on them!
do
print "Now start obtaining strings..."
somePreliminaryActions
someMoreIOStuff
s1 <- getS1
s2 <- getS2
yetMoreIO
useConcat'dStrings (s1 ++ s2)
print "Done."
It's ok to make that more compact by writing s12 <- liftA2 (++) getS1 getS2. But I'd do that right in place, not define it seperately.
For longer operations you may of course want to define a seperate named action, but it should be a somehow meaningful one.
You shouldn't think of IO String objects as "IO-strings". They aren't, just as [Int] aren't "list-integers". An object of type IO String is an action which, when incurred, can supply a String object in the IO monad. It is not a string itself.

How to mix Haskell monadic and pure filters in a piping fashion?

In a previous question, I tried to ask about how to mix pure and monadic functions by piping them together, but because I may have worded my question wrong and my example was too simplistic, I think the discussion went the wrong direction, so I think I'll try again.
Here is an example function that mixes pure and monadic filters. In this example, there are some pure filters sequenced in-between monadic filters to try to reduce the amount of work.
findFiles target =
getDirectoryContents target >>=
return . filter (not . (=~ "[0-9]{8}\\.txt$")) >>=
return . filter (=~ "\\.txt$") >>=
filterM doesFileExist >>=
mapM canonicalizePath
The benefit of writing it this way, where pure functions are mixed in using return, is that there is a visual flow of data from top to bottom. No need for temporary variables, fmap, <$> or the like.
Ideally, I can get rid of the returns to make it cleaner. I had the idea of using some operator:
(|>=) :: Monad m => a -> (a -> m b) -> m b
a |>= b = (return a) >>= b
But I don't know how to write this function to avoid operator precedence problems. Does this already exist? It is similar to <$> but the "other direction". If not, how do I make this operator work?
More generally, is there a good way to write code in this piping fashion, or need I settle for fmaps and temporary variables like as described in my previous question?

Ugh. As simple as this:
infixl 1 |>=
(|>=) = flip fmap
findFiles target =
getDirectoryContents target |>=
filter (not . (=~ "[0-9]{8}\\.txt$")) |>=
filter (=~ "\\.txt$") >>=
filterM doesFileExist >>=
mapM canonicalizePath

Seconding DiegoNolan, there's no prize for the pointest-free code and no shame in using do-notation, binding intermediate values with either a monadic assignment (x <- ...) or a good old-fashioned let. The heirs to your code will thank you.
That said, if you can't bear points, you might be a category theorist. Seriously, you can take a page from John Hughes (see Programming with Arrows) and write your pipeline like this:
import Control.Arrow
findFiles = runKleisli $
Kleisli getDirectoryContents >>>
arr (filter (not . (=~ "[0-9]{8}\\.txt$"))) >>>
arr (filter (=~ "\\.txt$")) >>>
Kleisli (filterM doesFileExist) >>>
Kleisli (mapM canonicalizePath)
This is probably a little more principled than monkeying around with one's own special bind operators, but still uglier than the plain pointed style if you ask me. De gustibus non est disputandum, as the Romans always used to say about garum.

Use (<$>), also known as fmap, for mapping pure functions into a functor. Most monads have instances of functors. If they don't have one then you can use liftM
Looking at the types
liftM :: Monad m => (a -> b) -> m a -> m b
(<$>) :: Functor f => (a -> b) -> f a -> f b
Yours would look like this (haven't checken in ghc).
findFiles target =
((filter (not . (=~ "[0-9]{8}\\.txt$")) .
filter (=~ "\\.txt$") ) <$>
getDirectoryContents target) >>=
filterM doesFileExist >>=
mapM canonicalizePath
But at this point you're probably just better off using do notation and let.

You'll want a few extra operators, one to handle each case
Monad -> Monad
Monad -> Pure
Pure -> Monad
Pure -> Pure
You already have the Monad -> Monad case (>>=), and as I described in my answer to your last question, you could use |>= for the Pure -> Monad case, but you'll still need Monad -> Pure one. That's going to be tricky, since the only type-safe way to do it is by having that operator transform your pure function into a monadic one. I'd recommend the following set of operators
Monad -> Monad >>= m a -> (a -> m b) -> m b
Monad -> Pure >|= m a -> (a -> b) -> m b
Pure -> Monad |>= a -> (a -> m b) -> m b
Pure -> Pure ||= (a -> b) -> (b -> c) -> (a -> c)
Using the convention that > means "monad" and | means "pure", and all end with = meaning "to function". Hopefully the type signatures will make sense with the implementations:
import Data.Char (toUpper)
import Control.Monad (liftM)
infixl 1 |>=
(|>=) :: Monad m => a -> (a -> m b) -> m b
a |>= b = b a
infixl 1 >|=
(>|=) :: Monad m => m a -> (a -> b) -> m b
a >|= b = liftM b a
infixr 9 ||=
(||=) :: (a -> b) -> (b -> c) -> a -> c
a ||= b = b . a
And an example
test :: IO ()
test =
getLine >|=
filter (/= 't') ||=
map toUpper >>=
putStrLn
> test
testing
ESING
>
This is also equivalent to
test :: IO ()
test =
getLine >|=
filter (/= 't') >|=
map toUpper >>=
putStrLn
But the extra ||> combination would let you actually compose those functions, which has a different implementation under the hood than feeding them through monadic actions.
However, I would still urge you to use the idiomatic way of doing this by using fmap, do notation, and temporary variables. It'll be much clearer to anyone else that looks at the code, and that includes you in 2 months' time.

implementing a "findM" in Haskell?

I am looking for a function that basically is like mapM on a list -- it performs a series of monadic actions taking every value in the list as a parameter -- and each monadic function returns m (Maybe b). However, I want it to stop after the first parameter that causes the function to return a Just value, not execute any more after that, and return that value.
Well, it'll probably be easier to just show the type signature:
findM :: (Monad m) => (a -> m (Maybe b)) -> [a] -> m (Maybe b)
where b is the first Just value. The Maybe in the result is from the finding (in case of an empty list, etc.), and has nothing to do with the Maybe returned by the Monadic function.
I can't seem to implement this with a straightforward application of library functions. I could use
findM f xs = fmap (fmap fromJust . find isJust) $ mapM f xs
which will work, but I tested this and it seems that all of the monadic actions are executed before calling find, so I can't rely on laziness here.
ghci> findM (\x -> print x >> return (Just x)) [1,2,3]
1
2
3
-- returning IO (Just 1)
What is the best way to implement this function that won't execute the monadic actions after the first "just" return? Something that would do:
ghci> findM (\x -> print x >> return (Just x)) [1,2,3]
1
-- returning IO (Just 1)
or even, ideally,
ghci> findM (\x -> print x >> return (Just x)) [1..]
1
-- returning IO (Just 1)
Hopefully there is an answer that doesn't use explicit recursion, and are compositions of library functions if possible? Or maybe even a point-free one?

One simple point-free solution is using the MaybeT transformer. Whenever we see m (Maybe a) we can wrap it into MaybeT and we get all MonadPlus functions immediately. Since mplus for MaybeT does exactly we need - it runs the second given action only if the first one resulted in Nothing - msum does exactly what we need:
import Control.Monad
import Control.Monad.Trans.Maybe
findM :: (Monad m) => (a -> m (Maybe b)) -> [a] -> m (Maybe b)
findM f = runMaybeT . msum . map (MaybeT . f)
Update: In this case, we were lucky that there exists a monad transformer (MaybeT) whose mplus has just the semantic we need. But in a general case, it can be that it won't be possible to construct such a transformer. MonadPlus has some laws that must be satisfied with respect to other monadic operations. However, all is not lost, as we actually don't need a MonadPlus, all we need is a proper monoid to fold with.
So let's pretend we don't (can't) have MaybeT. Computing the first value of some sequence of operations is described by the First monoid. We just need to make a monadic variant that won't execute the right part, if the left part has a value:
newtype FirstM m a = FirstM { getFirstM :: m (Maybe a) }
instance (Monad m) => Monoid (FirstM m a) where
mempty = FirstM $ return Nothing
mappend (FirstM x) (FirstM y) = FirstM $ x >>= maybe y (return . Just)
This monoid exactly describes the process without any reference to lists or other structures. Now we just fold over the list using this monoid:
findM' :: (Monad m) => (a -> m (Maybe b)) -> [a] -> m (Maybe b)
findM' f = getFirstM . mconcat . map (FirstM . f)
Moreover, it allows us to create a more generic (and even shorter) function using Data.Foldable:
findM'' :: (Monad m, Foldable f)
=> (a -> m (Maybe b)) -> f a -> m (Maybe b)
findM'' f = getFirstM . foldMap (FirstM . f)

I like Cirdec's answer if you don't mind recursion, but I think the equivalent fold based answer is quite pretty.
findM f = foldr test (return Nothing)
where test x m = do
curr <- f x
case curr of
Just _ -> return curr
Nothing -> m
A nice little test of how well you understand folds.

This should do it:
findM _ [] = return Nothing
findM filter (x:xs) =
do
match <- filter x
case match of
Nothing -> findM filter xs
_ -> return match
If you really want to do it points free (added as an edit)
The following would find something in a list using an Alternative functor, using a fold as in jozefg's answer
findA :: (Alternative f) => (a -> f b) -> [a] -> f b
findA = flip foldr empty . ((<|>) .)
I don't thing we can make (Monad m) => m . Maybe an instance of Alternative, but we could pretend there's an existing function:
-- Left biased choice
(<||>) :: (Monad m) => m (Maybe a) -> m (Maybe a) -> m (Maybe a)
(<||>) left right = left >>= fromMaybe right . fmap (return . Just)
-- Or its hideous points-free version
(<||>) = flip ((.) . (>>=)) (flip ((.) . ($) . fromMaybe) (fmap (return . Just)))
Then we can define findM in the same vein as findA
findM :: (Monad m) => (a -> m (Maybe b)) -> [a] -> m (Maybe b)
findM = flip foldr (return Nothing) . ((<||>) .)

This can be expressed pretty nicely with the MaybeT monad transformer and Data.Foldable.
import Data.Foldable (msum)
import Control.Monad.Trans.Maybe (MaybeT(..))
findM :: Monad m => (a -> m (Maybe b)) -> [a] -> m (Maybe b)
findM f = runMaybeT . msum . map (MaybeT . f)
And if you change your search function to produce a MaybeT stack, it becomes even nicer:
findM' :: Monad m => (a -> MaybeT m b) -> [a] -> MaybeT m b
findM' f = msum . map f
Or in point-free:
findM' = (.) msum . map
The original version can be made fully point-free as well, but it becomes pretty unreadable:
findM = (.) runMaybeT . (.) msum . map . (.) MaybeT

How do I handle an infinite list of IO objects in Haskell?

I'm writing a program that reads from a list of files. The each file either contains a link to the next file or marks that it's the end of the chain.
Being new to Haskell, it seemed like the idiomatic way to handle this is is a lazy list of possible files to this end, I have
getFirstFile :: String -> DataFile
getNextFile :: Maybe DataFile -> Maybe DataFile
loadFiles :: String -> [Maybe DataFile]
loadFiles = iterate getNextFile . Just . getFirstFile
getFiles :: String -> [DataFile]
getFiles = map fromJust . takeWhile isJust . loadFiles
So far, so good. The only problem is that, since getFirstFile and getNextFile both need to open files, I need their results to be in the IO monad. This gives the modified form of
getFirstFile :: String -> IO DataFile
getNextFile :: Maybe DataFile -> IO (Maybe DataFile)
loadFiles :: String -> [IO Maybe DataFile]
loadFiles = iterate (getNextFile =<<) . Just . getFirstFile
getFiles :: String -> IO [DataFile]
getFiles = liftM (map fromJust . takeWhile isJust) . sequence . loadFiles
The problem with this is that, since iterate returns an infinite list, sequence becomes an infinite loop. I'm not sure how to proceed from here. Is there a lazier form of sequence that won't hit all of the list elements? Should I be rejiggering the map and takeWhile to be operating inside the IO monad for each list element? Or do I need to drop the whole infinite list process and write a recursive function to terminate the list manually?

A step in the right direction
What puzzles me is getNextFile. Step into a simplified world with me, where we're not dealing with IO yet. The type is Maybe DataFile -> Maybe DataFile. In my opinion, this should simply be DataFile -> Maybe DataFile, and I will operate under the assumption that this adjustment is possible. And that looks like a good candidate for unfoldr. The first thing I am going to do is make my own simplified version of unfoldr, which is less general but simpler to use.
import Data.List
-- unfoldr :: (b -> Maybe (a,b)) -> b -> [a]
myUnfoldr :: (a -> Maybe a) -> a -> [a]
myUnfoldr f v = v : unfoldr (fmap tuplefy . f) v
where tuplefy x = (x,x)
Now the type f :: a -> Maybe a matches getNextFile :: DataFile -> Maybe DataFile
getFiles :: String -> [DataFile]
getFiles = myUnfoldr getNextFile . getFirstFile
Beautiful, right? unfoldr is a lot like iterate, except once it hits Nothing, it terminates the list.
Now, we have a problem. IO. How can we do the same thing with IO thrown in there? Don't even think about The Function Which Shall Not Be Named. We need a beefed up unfoldr to handle this. Fortunately, the source for unfoldr is available to us.
unfoldr :: (b -> Maybe (a, b)) -> b -> [a]
unfoldr f b =
case f b of
Just (a,new_b) -> a : unfoldr f new_b
Nothing -> []
Now what do we need? A healthy dose of IO. liftM2 unfoldr almost gets us the right type, but won't quite cut it this time.
An actual solution
unfoldrM :: Monad m => (b -> m (Maybe (a, b))) -> b -> m [a]
unfoldrM f b = do
res <- f b
case res of
Just (a, b') -> do
bs <- unfoldrM f b'
return $ a : bs
Nothing -> return []
It is a rather straightforward transformation; I wonder if there is some combinator that could accomplish the same.
Fun fact: we can now define unfoldr f b = runIdentity $ unfoldrM (return . f) b
Let's again define a simplified myUnfoldrM, we just have to sprinkle in a liftM in there:
myUnfoldrM :: Monad m => (a -> m (Maybe a)) -> a -> m [a]
myUnfoldrM f v = (v:) `liftM` unfoldrM (liftM (fmap tuplefy) . f) v
where tuplefy x = (x,x)
And now we're all set, just like before.
getFirstFile :: String -> IO DataFile
getNextFile :: DataFile -> IO (Maybe DataFile)
getFiles :: String -> IO [DataFile]
getFiles str = do
firstFile <- getFirstFile str
myUnfoldrM getNextFile firstFile
-- alternatively, to make it look like before
getFiles' :: String -> IO [DataFile]
getFiles' = myUnfoldrM getNextFile <=< getFirstFile
By the way, I typechecked all of these with data DataFile = NoClueWhatGoesHere, and the type signatures for getFirstFile and getNextFile, with their definitions set to undefined.
[edit] changed myUnfoldr and myUnfoldrM to behave more like iterate, including the initial value in the list of results.
[edit] Additional insight on unfolds:
If you have a hard time wrapping your head around unfolds, the Collatz sequence is possibly one of the simplest examples.
collatz :: Integral a => a -> Maybe a
collatz 1 = Nothing -- the sequence ends when you hit 1
collatz n | even n = Just $ n `div` 2
| otherwise = Just $ 3 * n + 1
collatzSequence :: Integral a => a -> [a]
collatzSequence = myUnfoldr collatz
Remember, myUnfoldr is a simplified unfold for the cases where the "next seed" and the "current output value" are the same, as is the case for collatz. This behavior should be easy to see given myUnfoldr's simple definition in terms of unfoldr and tuplefy x = (x,x).
ghci> collatzSequence 9
[9,28,14,7,22,11,34,17,52,26,13,40,20,10,5,16,8,4,2,1]
More, mostly unrelated thoughts
The rest has absolutely nothing to do with the question, but I just couldn't resist musing. We can define myUnfoldr in terms of myUnfoldrM:
myUnfoldr f v = runIdentity $ myUnfoldrM (return . f) v
Look familiar? We can even abstract this pattern:
sinkM :: ((a -> Identity b) -> a -> Identity c) -> (a -> b) -> a -> c
sinkM hof f = runIdentity . hof (return . f)
unfoldr = sinkM unfoldrM
myUnfoldr = sinkM myUnfoldrM
sinkM should work to "sink" (opposite of "lift") any function of the form
Monad m => (a -> m b) -> a -> m c.
since the Monad m in those functions can be unified with the Identity monad constraint of sinkM. However, I don't see anything that sinkM would actually be useful for.

sequenceWhile :: Monad m => (a -> Bool) -> [m a] -> m [a]
sequenceWhile _ [] = return []
sequenceWhile p (m:ms) = do
x <- m
if p x
then liftM (x:) $ sequenceWhile p ms
else return []
Yields:
getFiles = liftM (map fromJust) . sequenceWhile isJust . loadFiles

As you have noticed, IO results can't be lazy, so you can't (easily) build an infinite list using IO. There is a way out, however, in unsafeInterleaveIO; with this, you can do something like:
ioList startFile = do
v <- processFile startFile
continuation <- unsafeInterleaveIO (nextFile startFile >>= ioList)
return (v:continuation)
It's important to be careful here, though - you've just deferred the results of ioList to some unpredictable time in the future. It may never be run at all, in fact. So be very careful when you're being Clever™ like this.
Personally, I would just build a manual recursive function.

Laziness and I/O are a tricky combination. Using unsafeInterleaveIO is one way to produce lazy lists in the IO monad (and this is the technique used by the standard getContents, readFile and friends). However, as convenient as this is, it exposes pure code to possible I/O errors and makes makes releasing resources (such as file handles) non-deterministic. This is why most "serious" Haskell applications (especially those concerned with efficiency) nowadays use things called Enumerators and Iteratees for streaming I/O. One library in Hackage that implements this concept is enumerator.
You are probably fine with using lazy I/O in your application, but I thought I'd still give this as an example of another way to approach these kind of problems. You can find more in-depth tutorials about iteratees here and here.
For example, your stream of DataFiles could be implemented as an Enumerator like this:
import Data.Enumerator
import Control.Monad.IO.Class (liftIO)
iterFiles :: String -> Enumerator DataFile IO b
iterFiles s = first where
first (Continue k) = do
file <- liftIO $ getFirstFile s
k (Chunks [file]) >>== next file
first step = returnI step
next prev (Continue k) = do
file <- liftIO $ getNextFile (Just prev)
case file of
Nothing -> k EOF
Just df -> k (Chunks [df]) >>== next df
next _ step = returnI step

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

How can I untangle multiple Maybes nested at different levels? - haskell

Related

Can `(>>=)` be redeclared as `(a -> m b) -> m a -> m b`?

Haskell : concat two IO Strings

How to mix Haskell monadic and pure filters in a piping fashion?

implementing a "findM" in Haskell?

How do I handle an infinite list of IO objects in Haskell?

Categories

Resources