mapM on IO produces infinite output - haskell

This is bizarre behavior, even for Haskell. Look at the code segments below:
import System.Directory
import System.FilePath
-- This spins infinitely
loadCtx :: FilePath -> IO ()
loadCtx dir = do
  lsfiles <- listDirectory dir
  let files = mapM (dir </>) lsfiles
  putStrLn $ "Files " ++ show files
-- This does what I'd expect, prepending the dir path to each file
loadCtx dir = do
  lsfiles <- listDirectory dir
  let files = map (dir </>) lsfiles
  putStrLn $ "Files " ++ show files
Both definitions are accepted by the typechecker but give completely different behavior. What is the output of the first mapM? It looks like an infinite loop on reading some files. Also, is it possible to compose the listDirectory do-arrow line with the map (dir </>) that prepends the path, in one line?

What is the output of the first mapM? It looks like an infinite loop on reading some files.
It is not an infinite loop -- merely a very, very long one.
You are not using mapM for IO; you are using mapM in the nondeterminism monad. Here is the type of mapM, specialized to that monad:
Traversable t => (a -> [b]) -> t a -> [t b]
Read this in the following way:
First, give me a way to turn an element of a container (type a) into a nondeterministic choice between many possible replacement elements (type [b]).
Then give me a containerful of elements (type t a).
I will give you a nondeterministic choice between containers with replacement elements in them (type [t b]). (And, this part is not in the type, but: the way I will do this is by taking all possible combinations; for each position in the container, I'll try each possible b, and give you every which way of making one choice for each position in the container.)
For example, if we were to define the function f :: Int -> [Char] for which f n chose nondeterministically between the first n letters of the alphabet, then we could see this kind of interaction:
> f 3
"abc"
> f 5
"abcde"
> f 2
"ab"
> mapM f [3,5,2]
["aaa","aab","aba","abb","aca","acb","ada","adb","aea","aeb","baa","bab","bba","bbb","bca","bcb","bda","bdb","bea","beb","caa","cab","cba","cbb","cca","ccb","cda","cdb","cea","ceb"]
In each result, the first letter is one of the first three in the alphabet (a, b, or c); the second is from the first five, and the third from the first two. What's more, we get every list which has this property.
Now let's think about what that means for your code. You have written
mapM (dir </>) lsfiles
and so what you will get back is a collection of lists. Each list in the collection will be exactly as long as lsfiles is. Let's focus on one of the lists in the collection; call it cs.
The first element of cs will be drawn from dir </> filename, where filename is the first element of lsfiles; that is, it will be one of the characters in dir, or a slash, or one of the characters in filename. The second element of cs will be similar: one of the characters of dir, or a slash, or one of the characters from the second filename in lsfiles. I guess you can see where this is going... there's an awful lot of possibilities here. =)
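To get a feel for the size of that collection, here is a minimal sketch. joinWith is a hypothetical stand-in for (</>) that always uses a forward slash, so the example needs only base, and blowup is likewise a name made up for illustration. The number of results is the product of the lengths of the joined paths:

```haskell
-- Hypothetical stand-in for (</>) so the sketch needs only base.
joinWith :: FilePath -> FilePath -> FilePath
joinWith d f = d ++ "/" ++ f

-- How many lists mapM (joinWith d) fs yields in the list monad:
-- one factor per file, equal to the length of the joined path.
blowup :: FilePath -> [FilePath] -> Integer
blowup d fs = product [fromIntegral (length (joinWith d f)) | f <- fs]

main :: IO ()
main = do
  -- "d/a" has 3 characters and "d/bc" has 4, so there are 3 * 4 results.
  print (length (mapM (joinWith "d") ["a", "bc"]))  -- 12
  print (blowup "d" ["a", "bc"])                    -- 12
  -- Twenty files in a made-up directory: the count is astronomically large.
  print (blowup "/home/user/project" (replicate 20 "file.txt"))
```

With a realistic directory the count grows exponentially in the number of files, which is why the program appears to hang.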
Also is it possible to compose the listDirectory do-arrow line with the map (dir </>) that prepends the path, in one-line?
Yes:
loadCtx dir = do
  files <- map (dir </>) <$> listDirectory dir
  putStrLn $ "Files " ++ show files
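A small sketch of why that works: <$> applies a pure function to the result of an IO action before you bind it. fakeListDirectory is a hypothetical stand-in for listDirectory, and the paths are joined with a plain "/" rather than (</>), so this runs without the directory and filepath packages:

```haskell
-- Hypothetical stand-in for listDirectory; it never touches the file system.
fakeListDirectory :: FilePath -> IO [FilePath]
fakeListDirectory _ = pure ["a.txt", "b.txt"]

-- Two steps: bind the listing, then map over it.
prefixedTwoSteps :: FilePath -> IO [FilePath]
prefixedTwoSteps dir = do
  names <- fakeListDirectory dir
  let files = map ((dir ++ "/") ++) names
  pure files

-- One step: map over the result inside IO with <$>.
prefixedOneStep :: FilePath -> IO [FilePath]
prefixedOneStep dir = map ((dir ++ "/") ++) <$> fakeListDirectory dir

main :: IO ()
main = do
  prefixedTwoSteps "ctx" >>= print  -- ["ctx/a.txt","ctx/b.txt"]
  prefixedOneStep "ctx" >>= print   -- ["ctx/a.txt","ctx/b.txt"]
```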

Well according to the documentation,
type FilePath = String
That is,
type FilePath = [Char]
So in this line,
let files = mapM (dir </>) lsfiles
you have that the argument of mapM, which is (dir </>), is of type FilePath -> FilePath. Now look at the type of mapM,
mapM :: (Traversable t, Monad m) => (a -> m b) -> t a -> m (t b)
                                     ^^^^^^^^
So the type a -> m b is instantiated to FilePath -> FilePath, which is FilePath -> [Char]. So you're performing a monadic mapping using the list monad, which is the "nondeterminism" monad in this case for values of type Char.

To complement Jorge's answer, here's an exponential blowup, demonstrated:
> map ("XY" </>) ["a","b","c"]
["XY\\a","XY\\b","XY\\c"]
> mapM ("XY" </>) ["a","b","c"]
["XXX","XXY","XX\\","XXc","XYX","XYY","XY\\","XYc","X\\X","X\\Y","X\\\\",
"X\\c","XbX","XbY","Xb\\","Xbc","YXX","YXY","YX\\","YXc","YYX","YYY","YY\\","YYc",
"Y\\X","Y\\Y","Y\\\\","Y\\c","YbX","YbY","Yb\\","Ybc","\\XX","\\XY","\\X\\",
"\\Xc","\\YX","\\YY","\\Y\\","\\Yc","\\\\X","\\\\Y","\\\\\\","\\\\c","\\bX",
"\\bY","\\b\\","\\bc","aXX","aXY","aX\\","aXc","aYX","aYY","aY\\","aYc","a\\X",
"a\\Y","a\\\\","a\\c","abX","abY","ab\\","abc"]
Indeed, mapM f = sequence . map f, and sequence in the list monad performs the cartesian product of a list-of-lists, ["XY\\a","XY\\b","XY\\c"] in this case, so we get 4*4*4 = 64 combinations. (Ouch!)
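If you want to check that relationship on something smaller, here is a minimal sketch. The function f and the inputs are made up for illustration:

```haskell
-- Made-up example function: the first n letters of "abc".
f :: Int -> String
f n = take n "abc"

main :: IO ()
main = do
  -- sequence picks one element from each inner list, in every possible way.
  print (sequence ["ab", "cd"])                     -- ["ac","ad","bc","bd"]
  -- mapM f is sequence applied after map f.
  print (mapM f [2, 1])                             -- ["aa","ba"]
  print (mapM f [2, 1] == sequence (map f [2, 1]))  -- True
  -- Three 4-character strings give 4*4*4 results.
  print (length (sequence (replicate 3 "XY\\a")))   -- 64
```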

Related

Is it possible in Haskell to apply the function putStrLn to every element of a list of Strings, have it print to the screen, while being non recursive

I am trying to make a function that takes a list of strings and executes the command putStrLn or print (I think they are basically equivalent, please correct me if I am wrong as I'm still new to Haskell) to every element and have it printed out on my terminal screen. I was experimenting with the map function and also with lambda/anonymous functions as I already know how to do this recursively but wanted to try a more complex non recursive version. map returned a list of the type IO() which was not what I was going for and my attempts at lambda functions did not go according to plan. The basic code was:
test :: [String] -> something
test x = map (\a->putStrLn a) x -- output for this function would have to be [IO()]
Not entirely sure what the output of the function was supposed to be either which also gave me issues.
I was thinking of making a temp :: String variable and have each String appended to temp and then putStrLn temp but was not sure how to do that entirely. I thought using where would be viable but I still ran into issues. I know how to do this in languages like Java and C but I am still quite new to Haskell. Any help would be appreciated.
There is a special version of map that works with monadic functions, it's called mapM:
test :: [String] -> IO [()]
test x = mapM putStrLn x
Note that this way the return type of test is a list of units - that's because each call to putStrLn returns a unit, so result of applying it to each element in a list would be a list of units. If you'd rather not deal with this silliness and have the return type be a plain unit, use the special version mapM_:
test :: [String] -> IO ()
test x = mapM_ putStrLn x
I was thinking of making a temp :: String variable and have each String appended to temp and then putStrLn temp
Good idea. A pattern of "render the message" then a separate "emit the message" is often nice to have long term.
test xs = let temp = unlines (map show xs)
          in putStrLn temp
Or just
test xs = putStrLn (unlines (show <$> xs))
Or
test = putStrLn . unlines . map show
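One thing to watch for with these versions: show on a String adds quotes and escapes, so for a list that is already [String] you may prefer to skip it. A minimal sketch, with renderShown and renderPlain as made-up names:

```haskell
-- Works for any Show instance, but Strings come out quoted.
renderShown :: Show a => [a] -> String
renderShown = unlines . map show

-- For Strings only: one line per element, no quotes.
renderPlain :: [String] -> String
renderPlain = unlines

main :: IO ()
main = do
  putStr (renderShown ["hi", "there"])  -- prints "hi" and "there" with quotes
  putStr (renderPlain ["hi", "there"])  -- prints hi and there without quotes
  putStr (renderShown [1 :: Int, 2])    -- prints 1 and 2
```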
Not entirely sure what the output of the function was supposed to be either which also gave me issues.
Well you made a list of IO actions:
test :: [String] -> [IO ()]
test x = map (\a->putStrLn a) x
So with this list of IO actions when do you want to execute them? Now? Just once? The first one many times the rest never? In what order?
Presumably you want to execute them all now. Let's also eta reduce (\a -> putStrLn a) to just putStrLn since that means the same thing:
test :: [String] -> IO ()
test x = sequence_ (map putStrLn x)

Is it recommended to use recursive IO actions in the tail recursive form?

Consider the two following variations:
myReadListTailRecursive :: IO [String]
myReadListTailRecursive = go []
  where
    go :: [String] -> IO [String]
    go l = do {
      inp <- getLine;
      if (inp == "") then
        return l;
      else go (inp:l);
    }
myReadListOrdinary :: IO [String]
myReadListOrdinary = do
  inp <- getLine
  if inp == "" then
    return []
  else
    do
      moreInps <- myReadListOrdinary
      return (inp:moreInps)
In ordinary programming languages, one would know that the tail recursive variant is a better choice.
However, going through this answer, it appears that Haskell does not implement recursion by repeatedly pushing frames onto a call stack the way those languages do.
But because in this case the program in question involves actions, and a strict monad, I am not sure if the same reasoning applies. In fact, I think in the IO case, the tail recursive form is indeed better. I am not sure how to correctly reason about this.
EDIT: David Young pointed out that the outermost call here is to (>>=). Even in that case, does one of these styles have an advantage over the other?
FWIW, I'd go for existing monadic combinators and focus on readability/conciseness. Using unfoldM :: Monad m => m (Maybe a) -> m [a]:
import Control.Monad (liftM, mfilter)
import Control.Monad.Loops (unfoldM)
myReadListTailRecursive :: IO [String]
myReadListTailRecursive = unfoldM go
  where
    go :: IO (Maybe String)
    go = do
      line <- getLine
      return $ case line of
        "" -> Nothing
        s -> Just s
Or using MonadPlus instance of Maybe, with mfilter :: MonadPlus m => (a -> Bool) -> m a -> m a:
myReadListTailRecursive :: IO [String]
myReadListTailRecursive = unfoldM (liftM (mfilter (/= "") . Just) getLine)
Another, more versatile option, might be to use LoopT.
That’s really not how I would write it, but it’s clear enough what you’re doing. (By the way, if you want to be able to efficiently insert arbitrary output from any function in the chain, without using monads, you might try a Data.ByteString.Builder.)
Your first implementation is very similar to a left fold, and your second very similar to a right fold or map. (You might try actually writing them as such!) The second one has several advantages for I/O. One of the most important, for handling input and output, is that it can be interactive.
You’ll notice that the first builds the entire list from the outside in: in order to determine what the first element of the list is, the program needs to compute the entire structure to get to the innermost thunk, which is return l. The program generates the entire data structure first, then starts to process it. That’s useful when you’re reducing a list, because tail-recursive functions and strict left folds are efficient.
With the second, the outermost thunk contains the head and tail of the list, so you can grab the tail, then call the thunk to generate the second list. This can work with infinite lists, and it can produce and return partial results.
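To make the fold comparison concrete, here is a minimal sketch using pure stand-ins for the two loops: the input arrives as a list of lines instead of from getLine, and an empty line ends it. The names untilBlank, readAccum and readDirect are made up for illustration.

```haskell
-- Keep the lines up to (not including) the first empty one.
untilBlank :: [String] -> [String]
untilBlank = takeWhile (/= "")

-- Accumulator style, like a left fold: each line is consed onto the front,
-- so the result comes out in reverse input order.
readAccum :: [String] -> [String]
readAccum = foldl (flip (:)) [] . untilBlank

-- Direct style, like a right fold: input order is preserved and the
-- result can be consumed lazily.
readDirect :: [String] -> [String]
readDirect = foldr (:) [] . untilBlank

main :: IO ()
main = do
  print (readAccum ["a", "b", "c", "", "d"])   -- ["c","b","a"]
  print (readDirect ["a", "b", "c", "", "d"])  -- ["a","b","c"]
  -- The right-fold version can yield a prefix of an endless input.
  print (take 2 (readDirect (cycle ["x"])))    -- ["x","x"]
```

Note that the accumulator version returns the lines in reverse, so the two original functions do not return the same list unless you reverse the accumulator at the end.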
Here’s a contrived example: a program that reads in one integer per line and prints the sums so far.
main :: IO ()
main = interact( display . compute 0 . parse . lines )
  where parse :: [String] -> [Int]
        parse [] = []
        parse (x:xs) = (read x):(parse xs)
        compute :: Int -> [Int] -> [Int]
        compute _ [] = []
        compute accum (x:xs) = let accum' = accum + x
                               in accum':(compute accum' xs)
        display = unlines . map show
If you run this interactively, you’ll get something like:
$ 1
1
$ 2
3
$ 3
6
$ 4
10
But you could also write compute tail-recursively, with an accumulating parameter:
main :: IO ()
main = interact( display . compute [] . parse . lines )
  where parse :: [String] -> [Int]
        parse = map read
        compute :: [Int] -> [Int] -> [Int]
        compute xs [] = reverse xs
        compute [] (y:ys) = compute [y] ys
        compute (x:xs) (y:ys) = compute (x+y:x:xs) ys
        display = unlines . map show
This is an artificial example, but strict left folds are a common pattern. If, however, you write either compute or parse with an accumulating parameter, this is what you get when you try to run interactively, and hit EOF (control-D on Unix, control-Z on Windows) after the number 4:
$ 1
$ 2
$ 3
$ 4
1
3
6
10
This left-folded version needs to compute the entire data structure before it can read any of it. That can’t ever work on an infinite list (When would you reach the base case? How would you even reverse an infinite list if you did?) and an application that can’t respond to user input until it quits is a deal-breaker.
On the other hand, the tail-recursive version can be strict in its accumulating parameter, and will run more efficiently, especially when it’s not being consumed immediately. It doesn’t need to keep any thunks or context around other than its parameters, and it can even re-use the same stack frame. A strict accumulating function, such as Data.List.foldl', is a great choice whenever you’re reducing a list to a value, not building an eagerly-evaluated list of output. Functions such as sum, product or any can’t return any useful intermediate value. They inherently have to finish the computation first, then return the final result.
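As a rough sketch of that distinction: a strict left fold when you only want the final value, and a lazy scan when you want the intermediate values as output. The names total and runningTotals are made up for illustration.

```haskell
import Data.List (foldl')

-- Reducing to a single value: a strict left fold keeps the accumulator evaluated.
total :: [Int] -> Int
total = foldl' (+) 0

-- Producing output as you go: scanl1 yields each partial sum lazily.
runningTotals :: [Int] -> [Int]
runningTotals = scanl1 (+)

main :: IO ()
main = do
  print (total [1, 2, 3, 4])             -- 10
  print (runningTotals [1, 2, 3, 4])     -- [1,3,6,10]
  -- Works on an infinite list because only a prefix is demanded.
  print (take 3 (runningTotals [1 ..]))  -- [1,3,6]
```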

Print map function's output list using mapM_ / putStrLn

I tried to print map function's list output using putStrLn as,
main = do
  let out = "hello\nworld\nbye\nworld\n"
  putStrLn $ map ("out: " ++) $ lines out
It throws error as,
Couldn't match type ‘[Char]’ with ‘Char’
I referred to some other code and changed the last line to
mapM_ putStrLn $ map ("out: " ++) $ lines out
It solves the problem, but how does mapM with the underscore suffix work in this case?
mapM_ is based on the mapM function, which has the type
mapM :: Monad m => (a -> m b) -> [a] -> m [b]
And mapM_ has the type
mapM_ :: Monad m => (a -> m b) -> [a] -> m ()
With the former, it acts like the normal map over a list, but where each element has an action run with the results aggregated. So for example if you wanted to read multiple files you could use contents <- mapM readFile [filename1, filename2, filename3], and contents would be a list where each element represented the contents of the corresponding file. The mapM_ function does the same thing, but throws away the results. One definition is
mapM_ f list = do
  mapM f list
  return ()
Every action gets executed, but nothing is returned. This is useful in situations like yours where the result value is useless, namely that () is the only value of type () and therefore no actual decisions can be made from it. If you had mapM putStrLn someListOfStrings then the result of this would have type IO [()], but with mapM_ putStrLn someListOfStrings the [()] is thrown away and just replaced with ().
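Another way to write it, as a sketch rather than the library's actual source, is a right fold that chains the actions with >> and never builds the list of results at all. mapM_' is a made-up name so it does not clash with the Prelude:

```haskell
-- Run f on every element in order, discarding each result.
mapM_' :: Monad m => (a -> m b) -> [a] -> m ()
mapM_' f = foldr (\x rest -> f x >> rest) (return ())

main :: IO ()
main = do
  -- In IO: prints each prefixed line, returns ().
  mapM_' putStrLn (map ("out: " ++) (lines "hello\nworld\n"))
  -- In Maybe: succeeds with () only if every step succeeds.
  print (mapM_' (\x -> if x > 0 then Just x else Nothing) [1, 2, 3 :: Int])   -- Just ()
  print (mapM_' (\x -> if x > 0 then Just x else Nothing) [1, -2, 3 :: Int])  -- Nothing
```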

Haskell: save one listitem at a time to file

I'd like to save a huge list A to a text file. writeFile seems to only save the list at the very end of the calculation of A, which crashes because my memory is insufficient to store the whole list.
I have tried this using
writeFile "test.txt" $ show mylistA
Now I have tried saving the elements of the list, as they are calculated using:
[appendFile "test2.txt" (show x)|x<-mylistA]
But it doesn't work because:
No instance for (Show (IO ())) arising from a use of `print' Possible fix: add an instance declaration for (Show (IO ())) In a stmt of an interactive GHCi command: print it
Can you help me fix this, or give me a solution which saves my huge list A to a text file?
Thank you
The problem is that your list has the type [ IO () ] or "A list of IO actions". Since the IO is on the "inside" of our type, we can't execute this in the IO monad. What we want instead is IO (). So a list comprehension isn't going to hack it here.
We could use a function to turn [IO ()] -> IO [()] but this case lends itself to a much more concise combinator.
Instead we can use a simple predefined combinator called mapM_. In the Haskell Prelude, the M means it's monadic and the _ means that it returns m (), in our case IO (). Using it is trivial in this case:
[appendFile "test2.txt" (show x)|x<-mylistA]
becomes
mapM_ (\x -> appendFile "test2.txt" (show x)) mylistA
mapM_ (appendFile "test2.txt" . show) mylistA
This will unfold to something like
appendFile "test2.txt" (show firstItem) >>
appendFile "test2.txt" (show secondItem) >>
...
So we don't ever have the whole list in memory.
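A variation on this, swapping appendFile for a single open handle so the file is not reopened for every element. This is only a sketch: writeItems is a made-up name, and unlike the appendFile version it writes one item per line, because hPrint adds a newline.

```haskell
import System.IO (IOMode (WriteMode), hPrint, withFile)

-- Open the file once, write each element as it is produced, then close it.
writeItems :: Show a => FilePath -> [a] -> IO ()
writeItems path xs =
  withFile path WriteMode $ \h ->
    mapM_ (hPrint h) xs

main :: IO ()
main = writeItems "test2.txt" [1 :: Integer .. 100000]
```

As long as nothing else holds on to the list, each element can be garbage collected after it is written, so memory use should stay roughly constant.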
You can use the function sequence from Control.Monad to take a (lazily generated) list of IO actions and execute them one at a time
>>> import Control.Monad
Now you can do
>>> let myList = [1, 2, 3]
>>> sequence [print x | x <- myList]
1
2
3
[(),(),()]
Note that you get a list of all the return values at the end. If you want to discard the return value, just use sequence_ instead of sequence.
>>> sequence_ [print x | x <- myList]
1
2
3
I just wanted to expand on jozefg's answer by mentioning forM_, the flipped version of mapM_. Using forM_ you get something that looks like a foreach loop:
-- Read this as "for each `x` in `mylistA`, do X"
forM_ mylistA $ \x -> do
  appendFile "test2.txt" (show x)

Haskell monadic IO

compute fp = do
  text <- readFile fp
  let (a,b) = sth text
  let x = data b
  --g <- x
  putStr $ print_matrix $ fst $ head $ x
It works when I take only the first element, but I want to do this for every element of the list of pairs.
When I write g <- x I get: Couldn't match expected type `IO t0'
with actual type [([[Integer]], [[Integer]])]
You're inside the IO Monad here, so any time you write a "bind" arrow <-, the thing on the right side has to be an IO operation. So the short answer is, you don't want to use <- on the value x.
Now, it looks like you want to call print_matrix for each element of a list rather than a single element. In that case I think Macke is on the right track, but I would use mapM_ instead:
mapM_ (putStr . print_matrix . fst) x
should do the trick.
The reason is that you are explicitly saying you want to putStr each element, one at a time, but you don't care about the result of putStr itself.
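Here is a self-contained sketch of that line in action. The question does not show print_matrix, so renderMatrix below is a hypothetical stand-in, and the sample pairs are made up; only the list type comes from the error message.

```haskell
-- Hypothetical stand-in for print_matrix: one row per line, space separated.
renderMatrix :: [[Integer]] -> String
renderMatrix = unlines . map (unwords . map show)

-- Made-up data with the type from the error message.
pairs :: [([[Integer]], [[Integer]])]
pairs =
  [ ([[1, 2], [3, 4]], [[0]])
  , ([[5, 6], [7, 8]], [[0]])
  ]

main :: IO ()
main = mapM_ (putStr . renderMatrix . fst) pairs
```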
It sounds like mapM might fit your bill: Monad a => (b -> a c) -> [b] -> a [c]
It's used to apply a monadic function to each element of a list and get a list of the results back, in the monad.
