Haskell: save one listitem at a time to file - haskell

I'd like to save a huge list A to a textfile. writeFile seems to only save the list at the very end of the calcultaion of A, which crashes because my memory is insufficient to store the whole list.
I have tried this using
writeFile "test.txt" $ show mylistA
Now I have tried saving the elements of the list, as they are calculated using:
[appendFile "test2.txt" (show x)|x<-mylistA]
But it doesn't work because:
No instance for (Show (IO ())) arising from a use of `print' Possible fix: add an instance declaration for (Show (IO ())) In a stmt of an interactive GHCi command: print it
Can you help me fix this, or give me a solution which saves my huge list A to a text file?
Thank you

The problem is that your list has the type [ IO () ] or "A list of IO actions". Since the IO is on the "inside" of out type we can't execute this in the IO monad. What we want instead is IO (). So a list comprehension isn't going to hack it here.
We could use a function to turn [IO ()] -> IO [()] but this case lends itself to a much more concise combinator.
Instead we can use a simple predefined combinator called mapM_. In the Haskell prelude the M means it's monadic and the _ means that it returns m () in our case IO (). Using it is trivial in this case
[appendFile "test2.txt" (show x)|x<-mylistA]
becomes
mapM_ (\x -> appendFile "test2.txt" (show x)) myListA
mapM_ (appendFile "test2.txt" . show) myListA
This will unfold to something like
appendFile "test2.txt" (show firstItem) >>
appendFile "test2.txt" (show secondItem) >>
...
So we don't ever have the whole list in memory.

You can use the function sequence from Control.Monad to take a (lazily generated) list of IO actions and execute them one at a time
>>> import Control.Monad
Now you can do
>>> let myList = [1, 2, 3]
>>> sequence [print x | x <- myList]
1
2
3
[(),(),()]
Note that you get a list of all the return values at the end. If you want to discard the return value, just use sequence_ instead of sequence.
>>> sequence_ [print x | x <- myList]
1
2
3

I just wanted to expand on jozefg's answer by mentioning forM_, the flipped version of mapM_. Using forM_ you get something that looks like a foreach loop:
-- Read this as "for each `x` in `myListA`, do X"
forM_ myListA $ \x -> do
appendFile "test2.txt" (show x)

Related

Lazy evaluation of IO actions

I'm trying to write code in source -> transform -> sink style, for example:
let (|>) = flip ($)
repeat 1 |> take 5 |> sum |> print
But would like to do that using IO. I have this impression that my source can be an infinite list of IO actions, and each one gets evaluated once it is needed downstream. Something like this:
-- prints the number of lines entered before "quit" is entered
[getLine..] >>= takeWhile (/= "quit") >>= length >>= print
I think this is possible with the streaming libraries, but can it be done along the lines of what I'm proposing?
Using the repeatM, takeWhile and length_ functions from the streaming library:
import Streaming
import qualified Streaming.Prelude as S
count :: IO ()
count = do r <- S.length_ . S.takeWhile (/= "quit") . S.repeatM $ getLine
print r
This seems to be in that spirit:
let (|>) = flip ($)
let (.>) = flip (.)
getContents >>= lines .> takeWhile (/= "quit") .> length .> print
The issue here is that Monad is not the right abstraction for this, and attempting to do something like this results in a situation where referential transparency is broken.
Firstly, we can do a lazy IO read like so:
module Main where
import System.IO.Unsafe (unsafePerformIO)
import Control.Monad(forM_)
lazyIOSequence :: [IO a] -> IO [a]
lazyIOSequence = pure . go where
go :: [IO a] -> [a]
go (l:ls) = (unsafePerformIO l):(go ls)
main :: IO ()
main = do
l <- lazyIOSequence (repeat getLine)
forM_ l putStrLn
This when run will perform cat. It will read lines and output them. Everything works fine.
But consider changing the main function to this:
main :: IO ()
main = do
l <- lazyIOSequence (map (putStrLn . show) [1..])
putStrLn "Hello World"
This outputs Hello World only, as we didn't need to evaluate any of l. But now consider replacing the last line like the following:
main :: IO ()
main = do
x <- lazyIOSequence (map (putStrLn . show) [1..])
seq (head x) putStrLn "Hello World"
Same program, but the output is now:
1
Hello World
This is bad, we've changed the results of a program just by evaluating a value. This is not supposed to happen in Haskell, when you evaluate something it should just evaluate it, not change the outside world.
So if you restrict your IO actions to something like reading from a file nothing else is reading from, then you might be able to sensibly lazily evaluate things, because when you read from it in relation to all the other IO actions your program is taking doesn't matter. But you don't want to allow this for IO in general, because skipping actions or performing them in a different order can matter (and above, certainly does). Even in the reading a file lazily case, if something else in your program writes to the file, then whether you evaluate that list before or after the write action will affect the output of your program, which again, breaks referential transparency (because evaluation order shouldn't matter).
So for a restricted subset of IO actions, you can sensibly define Functor, Applicative and Monad on a stream type to work in a lazy way, but doing so in the IO Monad in general is a minefield and often just plain incorrect. Instead you want a specialised streaming type, and indeed Conduit defines Functor, Applicative and Monad on a lot of it's types so you can still use all your favourite functions.

Is it possible in Haskell to apply the function putStrLn to every element of a list of Strings, have it print to the screen, while being non recursive

I am trying to make a function that takes a list of strings and executes the command putStrLn or print (I think they are basically equivalent, please correct me if I am wrong as I'm still new to Haskell) to every element and have it printed out on my terminal screen. I was experimenting with the map function and also with lambda/anonymous functions as I already know how to do this recursively but wanted to try a more complex non recursive version. map returned a list of the type IO() which was not what I was going for and my attempts at lambda functions did not go according to plan. The basic code was:
test :: [String] -> something
test x = map (\a->putStrLn a) x -- output for this function would have to be [IO()]
Not entirely sure what the output of the function was supposed to be either which also gave me issues.
I was thinking of making a temp :: String variable and have each String appended to temp and then putStrLn temp but was not sure how to do that entirely. I though using where would be viable but I still ran into issues. I know how to do this in languages like java and C but I am still quite new to Haskell. Any help would be appreciated.
There is a special version of map that works with monadic functions, it's called mapM:
test :: [String] -> IO [()]
test x = mapM putStrLn x
Note that this way the return type of test is a list of units - that's because each call to putStrLn returns a unit, so result of applying it to each element in a list would be a list of units. If you'd rather not deal with this silliness and have the return type be a plain unit, use the special version mapM_:
test :: [String] -> IO ()
test x = mapM_ putStrLn x
I was thinking of making a temp :: String variable and have each String appended to temp and then putStrLn temp
Good idea. A pattern of "render the message" then a separate "emit the message" is often nice to have long term.
test xs = let temp = unlines (map show xs)
in putStrLn temp
Or just
test xs = putStrLn (unlines (show <$> xs))
Or
test = putStrLn . unlines . map show
Not entirely sure what the output of the function was supposed to be either which also gave me issues.
Well you made a list of IO actions:
test :: [String] -> [IO ()]
test x = map (\a->putStrLn a) x
So with this list of IO actions when do you want to execute them? Now? Just once? The first one many times the rest never? In what order?
Presumably you want to execute them all now. Let's also eta reduce (\a -> putStrLn a) to just putStrLn since that means the same thing:
test :: [String] -> IO ()
test x = sequence_ (map (\a->putStrLn a) x)

After reading a file I have IO [Char], but I need [IO Char]

I have a file number.txt which contains a large number and I read it into an IO String like this:
readNumber = readFile "number.txt" >>= return
In another function I want to create a list of Ints, one Int for each digit…
Lets assume the content of number.txt is:
1234567890
Then I want my function to return [1,2,3,4,5,6,7,8,9,0].
I tried severall versions with map, mapM(_), liftM, and, and, and, but I got several error messages everytime, which I was able to reduce to
Couldn't match expected type `[m0 Char]'
with actual type `IO String'
The last version I have on disk is the following:
module Main where
import Control.Monad
import Data.Char (digitToInt)
main = intify >>= putStrLn . show
readNumber = readFile "number.txt" >>= return
intify = mapM (liftM digitToInt) readNumber
So, as far as I understand the error, I need some function that takes IO [a] and returns [IO a], but I was not able to find such thing with hoogle… Only the other way round seemes to exist
In addition to the other great answers here, it's nice to talk about how to read [IO Char] versus IO [Char]. In particular, you'd call [IO Char] "an (immediate) list of (deferred) IO actions which produce Chars" and IO [Char] "a (deferred) IO action producing a list of Chars".
The important part is the location of "deferred" above---the major difference between a type IO a and a type a is that the former is best thought of as a set of instructions to be executed at runtime which eventually produce an a... while the latter is just that very a.
This phase distinction is key to understanding how IO values work. It's also worth noting that it can be very fluid within a program---functions like fmap or (>>=) allow us to peek behind the phase distinction. As an example, consider the following function
foo :: IO Int -- <-- our final result is an `IO` action
foo = fmap f getChar where -- <-- up here getChar is an `IO Char`, not a real one
f :: Char -> Int
f = Data.Char.ord -- <-- inside here we have a "real" `Char`
Here we build a deferred action (foo) by modifying a deferred action (getChar) by using a function which views a world that only comes into existence after our deferred IO action has run.
So let's tie this knot and get back to the question at hand. Why can't you turn an IO [Char] into an [IO Char] (in any meaningful way)? Well, if you're looking at a piece of code which has access to IO [Char] then the first thing you're going to want to do is sneak inside of that IO action
floob = do chars <- (getChars :: IO [Char])
...
where in the part left as ... we have access to chars :: [Char] because we've "stepped into" the IO action getChars. This means that by this point we've must have already run whatever runtime actions are required to generate that list of characters. We've let the cat out of the monad and we can't get it back in (in any meaningful way) since we can't go back and "unread" each individual character.
(Note: I keep saying "in any meaningful way" because we absolutely can put cats back into monads using return, but this won't let us go back in time and have never let them out in the first place. That ship has sailed.)
So how do we get a type [IO Char]? Well, we have to know (without running any IO) what kind of IO operations we'd like to do. For instance, we could write the following
replicate 10 getChar :: [IO Char]
and immediately do something like
take 5 (replicate 10 getChar)
without ever running an IO action---our list structure is immediately available and not deferred until the runtime has a chance to get to it. But note that we must know exactly the structure of the IO actions we'd like to perform in order to create a type [IO Char]. That said, we could use yet another level of IO to peek at the real world in order to determine the parameters of our action
do len <- (figureOutLengthOfReadWithoutActuallyReading :: IO Int)
return $ replicate len getChar
and this fragment has type IO [IO Char]. To run it we have to step through IO twice, we have to let the runtime perform two IO actions, first to determine the length and then second to actually act on our list of IO Char actions.
sequence :: [IO a] -> IO [a]
The above function, sequence, is a common way to execute some structure containing a, well, sequence of IO actions. We can use that to do our two-phase read
twoPhase :: IO [Char]
twoPhase = do len <- (figureOutLengthOfReadWithoutActuallyReading :: IO Int)
putStrLn ("About to read " ++ show len ++ " characters")
sequence (replicate len getChar)
>>> twoPhase
Determining length of read
About to read 22 characters
let me write 22 charac"let me write 22 charac"
You got some things mixed up:
readNumber = readFile "number.txt" >>= return
the return is unecessary, just leave it out.
Here is a working version:
module Main where
import Data.Char (digitToInt)
main :: IO ()
main = intify >>= print
readNumber :: IO String
readNumber = readFile "number.txt"
intify :: IO [Int]
intify = fmap (map digitToInt) readNumber
Such a function can't exists, because you would be able to evaluate the length of the list without ever invoking any IO.
What is possible is this:
imbue' :: IO [a] -> IO [IO a]
imbue' = fmap $ map return
Which of course generalises to
imbue :: (Functor f, Monad m) => m (f a) -> m (f (m a))
imbue = liftM $ fmap return
You can then do, say,
quun :: IO [Char]
bar :: [IO Char] -> IO Y
main = do
actsList <- imbue quun
y <- bar actsLists
...
Only, the whole thing about using [IO Char] is pointless: it's completely equivalent to the much more straightforward way of working only with lists of "pure values", only using the IO monad "outside"; how to do that is shown in Markus's answer.
Do you really need many different helper functions? Because you may write just
main = do
file <- readFile "number.txt"
let digits = map digitToInt file
print digits
or, if you really need to separate them, try to minimize the amount of IO signatures:
readNumber = readFile "number.txt" --Will be IO String
intify = map digitToInt --Will be String -> [Int], not IO
main = readNumber >>= print . intify

Haskell monadic IO

compute fp = do
text <- readFile fp
let (a,b) = sth text
let x = data b
--g <- x
putStr $ print_matrix $ fst $ head $ x
It works when i get only first element but i want do this for all element on the list of pair.
When i write g<- x i got Couldn't match expected type `IO t0'
with actual type [([[Integer]], [[Integer]])]
You're inside the IO Monad here, so any time you write a "bind" arrow <-, the thing on the right side has to be an IO operation. So the short answer is, you don't want to use <- on the value x.
Now, it looks like you want to call print_matrix for each element of a list rather than a single element. In that case I think Macke is on the right track, but I would use mapM_ instead:
mapM_ (putStr . print_matrix . fst) x
should do the trick.
The reason is that you are explicitly saying you want to putStr each element, one at a time, but you don't care about the result of putStr itself.
It sounds like mapM might fit your bill: Monad a => (b -> a c) -> [b] -> a [c]
It's used to apply a monadic function to a list, and get a list back, in the monad

Haskell: can't use "map putStrLn"?

I have a list of strings, and tried this:
ls = [ "banana", "mango", "orange" ]
main = do
map PutStrLn list_of_strings
That didn't work, and I can't understand why.
ghc print-list.hs
print-list.hs:3:0:
Couldn't match expected type `IO t' against inferred type `[IO ()]'
In the expression: main
When checking the type of the function `main'
Any hints? I suppose it has to do with map returning a list and not a value, but I didn't find an easy way to fix this.
Right now the only way I know to print a list of strings is to write a function that will iterate the list, printing each element (print if the list is [a], but print and recurse if it's (a:b)). But it would be much simpler to just use map...
Thanks!
The type of the main function should be IO t (where t is a type variable). The type of map putStrLn ls is [IO ()]. This why you are getting this error message. You can verify this yourself by running the following in ghci:
Prelude> :type map putStrLn ls
map putStrLn ls :: [IO ()]
One solution to the problem is using mapM, which is the "monadic" version of map. Or you can use mapM_ which is the same as mapM but does not collect the returned values from the function. Since you don't care about the return value of putStrLn, it's more appropriate to use mapM_ here. mapM_ has the following type:
mapM_ :: Monad m => (a -> m b) -> [a] -> m ()
Here is how to use it:
ls = [ "banana", "mango", "orange" ]
main = mapM_ putStrLn ls
Ayman's answer makes most sense for this situation. In general, if you have [m ()] and you want m (), then use sequence_, where m can be any monad including IO.

Resources