Processing monad value before assignment - haskell

Is it possible to make those two lines one line:
main = do line <- getLine
let result = words line
what I mean is something like non monadic code
result = words getLine
which -- in my opinion -- would improve readability.

Try this: result <- fmap words getLine
fmap takes a function with a type like a -> b and turns it into f a -> f b for anything that's an instance of Functor, which should include all Monad instances.
There's an equivalent function called liftM that's specific to Monad, for murky historical reasons. You might need to use that instead in some cases, but for standard monads like IO you can stick with fmap.
You can also import Data.Functor or Control.Applicative to get a nice operator version of fmap, so you could write words <$> getLine instead, which often looks prettier.

Related

Is there a way to "unwrap" IO monad in Haskell inside another io action?

In particular I want to use pure Haskell functions on the results on console input. I'm curious if exists something like ? operator from rust.
this snippet complains that words expect the type String, while its supplied wrapped into an io action. But i can't even use >>= operator since as far as i understand i cannot instantiate IO constructor directly. So, does it mean all the standard library functions can't work with io even in the scope of io action?
thing :: IO ()
thing = do
let mbwords = words $ getLine ;
Since Haskell is not designed to be totally useless, there must be a way. And there is more than one indeed.
What you looking for is something like this
main = do
line <- getLine
let mbwords = words line
or perhaps
main = do
mbwords <- fmap words getLine
Note you use <- to "open" the monadic value inside the do-block, not let.
Also note instantiating IO constructors is not necessary as long as you have functions that return IO actions, such as print. You can express the same idea with
main = fmap words getLine >>= print
(of course you can use an arbitrary action of the right type instead of print).

Lazy evaluation of IO actions

I'm trying to write code in source -> transform -> sink style, for example:
let (|>) = flip ($)
repeat 1 |> take 5 |> sum |> print
But would like to do that using IO. I have this impression that my source can be an infinite list of IO actions, and each one gets evaluated once it is needed downstream. Something like this:
-- prints the number of lines entered before "quit" is entered
[getLine..] >>= takeWhile (/= "quit") >>= length >>= print
I think this is possible with the streaming libraries, but can it be done along the lines of what I'm proposing?
Using the repeatM, takeWhile and length_ functions from the streaming library:
import Streaming
import qualified Streaming.Prelude as S
count :: IO ()
count = do r <- S.length_ . S.takeWhile (/= "quit") . S.repeatM $ getLine
print r
This seems to be in that spirit:
let (|>) = flip ($)
let (.>) = flip (.)
getContents >>= lines .> takeWhile (/= "quit") .> length .> print
The issue here is that Monad is not the right abstraction for this, and attempting to do something like this results in a situation where referential transparency is broken.
Firstly, we can do a lazy IO read like so:
module Main where
import System.IO.Unsafe (unsafePerformIO)
import Control.Monad(forM_)
lazyIOSequence :: [IO a] -> IO [a]
lazyIOSequence = pure . go where
go :: [IO a] -> [a]
go (l:ls) = (unsafePerformIO l):(go ls)
main :: IO ()
main = do
l <- lazyIOSequence (repeat getLine)
forM_ l putStrLn
This when run will perform cat. It will read lines and output them. Everything works fine.
But consider changing the main function to this:
main :: IO ()
main = do
l <- lazyIOSequence (map (putStrLn . show) [1..])
putStrLn "Hello World"
This outputs Hello World only, as we didn't need to evaluate any of l. But now consider replacing the last line like the following:
main :: IO ()
main = do
x <- lazyIOSequence (map (putStrLn . show) [1..])
seq (head x) putStrLn "Hello World"
Same program, but the output is now:
1
Hello World
This is bad, we've changed the results of a program just by evaluating a value. This is not supposed to happen in Haskell, when you evaluate something it should just evaluate it, not change the outside world.
So if you restrict your IO actions to something like reading from a file nothing else is reading from, then you might be able to sensibly lazily evaluate things, because when you read from it in relation to all the other IO actions your program is taking doesn't matter. But you don't want to allow this for IO in general, because skipping actions or performing them in a different order can matter (and above, certainly does). Even in the reading a file lazily case, if something else in your program writes to the file, then whether you evaluate that list before or after the write action will affect the output of your program, which again, breaks referential transparency (because evaluation order shouldn't matter).
So for a restricted subset of IO actions, you can sensibly define Functor, Applicative and Monad on a stream type to work in a lazy way, but doing so in the IO Monad in general is a minefield and often just plain incorrect. Instead you want a specialised streaming type, and indeed Conduit defines Functor, Applicative and Monad on a lot of it's types so you can still use all your favourite functions.

Write a function from IO a -> a?

Take the function getLine - it has the type:
getLine :: IO String
How do I extract the String from this IO action?
More generally, how do I convert this:
IO a
to this:
a
If this is not possible, then why can't I do it?
In Haskell, when you want to work with a value that is "trapped" in IO, you don't take the value out of IO. Instead, you put the operation you want to perform into IO, as well!
For example, suppose you want to check how many characters the getLine :: IO String will produce, using the length function from Prelude.
There exists a helper function called fmap which, when specialized to IO, has the type:
fmap :: (a -> b) -> IO a -> IO b
It takes a function that works on "pure" values not trapped in IO, and gives you a function that works with values that are trapped in IO. This means that the code
fmap length getLine :: IO Int
represents an IO action that reads a line from console and then gives you its length.
<$> is an infix synonym for fmap that can make things simpler. This is equivalent to the above code:
length <$> getLine
Now, sometimes the operation you want to perform with the IO-trapped value itself returns an IO-trapped value. Simple example: you wan to write back the string you have just read using putStrLn :: String -> IO ().
In that case, fmap is not enough. You need to use the (>>=) operator, which, when specialiced to IO, has the type IO a -> (a -> IO b) -> IO b. In out case:
getLine >>= putStrLn :: IO ()
Using (>>=) to chain IO actions has an imperative, sequential flavor. There is a kind of syntactic sugar called "do-notation" which helps to write sequential operation like these in a more natural way:
do line <- getLine
putStrLn line
Notice that the <- here is not an operator, but part of the syntactic sugar provided by the do notation.
Not going into any details, if you're in a do block, you can (informally/inaccurately) consider <- as getting the value out of the IO.
For example, the following function takes a line from getLine, and passes it to a pure function that just takes a String
main = do
line <- getLine
putStrLn (wrap line)
wrap :: String -> String
wrap line = "'" ++ line ++ "'"
If you compile this as wrap, and on the command line run
echo "Hello" | wrap
you should see
'Hello'
If you know C then consider the question "How can I get the string from gets?" An IO String is not some string that's made hard to get to, it's a procedure that can return a string - like reading from a network or stdin. You want to run the procedure to obtain a string.
A common way to run IO actions in a sequence is do notation:
main = do
someString <- getLine
-- someString :: String
print someString
In the above you run the getLine operation to obtain a String value then use the value however you wish.
So "generally", it's unclear why you think you need a function of this type and in this case it makes all the difference.
It should be noted for completeness that it is possible. There indeed exists a function of type IO a -> a in the base library called unsafePerformIO.
But the unsafe part is there for a reason. There are few situations where its usage would be considered justified. It's an escape hatch to be used with great caution - most of the time you will let monsters in instead of letting yourself out.
Why can't you normally go from IO a to a? Well at the very least it allows you to break the rules by having a seemingly pure function that is not pure at all - ouch! If it were a common practice to do this the type signatures and all the work done by the compiler to verify them would make no sense at all. All the correctness guarantees would go out of the window.
Haskell is, partly, interesting precisely because this is (normally) impossible.
For how to approach your getLine problem in particular see the other answers.

Idiomatic Haskell syntax without do-blocks or accumulating parentheses?

I'm fairly new to Haskell and have been trying to find a way to pass multiple IO-tainted values to a function to deal with a C library. Most people seem to use the <- operator inside a do block, like this:
g x y = x ++ y
interactiveConcat1 = do {x <- getLine;
y <- getLine;
putStrLn (g x y);
return ()}
This makes me feel like I'm doing C, except emacs can't auto-indent. I tried to write this in a more Lispy style:
interactiveConcat2 = getLine >>= (\x ->
getLine >>= (\y ->
putStrLn (g x y) >>
return () ))
That looks like a mess, and has a string of closed parentheses you have to count at the end (except again, emacs can reliably assist with this task in Lisp, but not in Haskell). Yet another way is to say
import Control.Applicative
interactiveConcat3 = return g <*> getLine <*> getLine >>= putStrLn
which looks pretty neat but isn't part of the base language.
Is there any less laborious notation for peeling values out of the IO taint boxes? Perhaps there is a cleaner way using a lift* or fmap? I hope it isn't too subjective to ask what is considered "idiomatic"?
Also, any tips for making emacs cooperate better than (Haskell Ind) mode would be greatly appreciated. Thanks!
John
Edit: I stumbled across https://wiki.haskell.org/Do_notation_considered_harmful and realized that the nested parentheses in the lambda chain I wrote is not necessary. However it seems the community (and ghc implementors) have embraced the Applicative-inspired style using , <*>, etc, which seems to make the code easier to read in spite of the headaches with figuring out operator precedence.
Note: This post is written in literate Haskell. You can save it as Main.lhs and try it in your GHCi.
A short remark first: you can get rid of the semicolons and the braces in do. Also, putStrLn has type IO (), so you don't need return ():
interactiveConcat1 = do
x <- getLine
y <- getLine
putStrLn $ g x y
We're going to work with IO, so importing Control.Applicative or Control.Monad will come in handy:
> module Main where
> import Control.Applicative
> -- Repeat your definition for completeness
> g :: [a] -> [a] -> [a]
> g = (++)
You're looking for something like this:
> interactiveConcat :: IO ()
> interactiveConcat = magic g getLine getLine >>= putStrLn
What type does magic need? It returns a IO String, takes a function that returns an String and takes usual Strings, and takes two IO Strings:
magic :: (String -> String -> String) -> IO String -> IO String -> IO String
We can probably generalize this type to
> magic :: (a -> b -> c) -> IO a -> IO b -> IO c
A quick hoogle search reveals that there are already two functions with almost that type: liftA2 from Control.Applicative and liftM2 from Control.Monad. They're defined for every Applicative and – in case of liftM2 – Monad. Since IO is an instance of both, you can choose either one:
> magic = liftA2
If you use GHC 7.10 or higher, you can also use <$> and <*> without import and write interactiveConcat as
interactiveConcat = g <$> getLine <*> getLine >>= putStrLn
For completeness, lets add a main so that we can easily check this functionality via runhaskell Main.lhs:
> main :: IO ()
> main = interactiveConcat
A simple check shows that it works as intended:
$ echo "Hello\nWorld" | runhaskell Main.lhs
HelloWorld
References
Applicative in the Typeclassopedia
The section "Some useful monadic functions" of LYAH's chapter "For a Few Monads More".
You can use liftA2 (or liftM2 from Control.Monad):
import Control.Applicative (liftA2)
liftA2 g getLine getLine >>= putStrLn

How can I parse the IO String in Haskell?

I' ve got a problem with Haskell. I have text file looking like this:
5.
7.
[(1,2,3),(4,5,6),(7,8,9),(10,11,12)].
I haven't any idea how can I get the first 2 numbers (2 and 7 above) and the list from the last line. There are dots on the end of each line.
I tried to build a parser, but function called 'readFile' return the Monad called IO String. I don't know how can I get information from that type of string.
I prefer work on a array of chars. Maybe there is a function which can convert from 'IO String' to [Char]?
I think you have a fundamental misunderstanding about IO in Haskell. Particularly, you say this:
Maybe there is a function which can convert from 'IO String' to [Char]?
No, there isn't1, and the fact that there is no such function is one of the most important things about Haskell.
Haskell is a very principled language. It tries to maintain a distinction between "pure" functions (which don't have any side-effects, and always return the same result when give the same input) and "impure" functions (which have side effects like reading from files, printing to the screen, writing to disk etc). The rules are:
You can use a pure function anywhere (in other pure functions, or in impure functions)
You can only use impure functions inside other impure functions.
The way that code is marked as pure or impure is using the type system. When you see a function signature like
digitToInt :: String -> Int
you know that this function is pure. If you give it a String it will return an Int and moreover it will always return the same Int if you give it the same String. On the other hand, a function signature like
getLine :: IO String
is impure, because the return type of String is marked with IO. Obviously getLine (which reads a line of user input) will not always return the same String, because it depends on what the user types in. You can't use this function in pure code, because adding even the smallest bit of impurity will pollute the pure code. Once you go IO you can never go back.
You can think of IO as a wrapper. When you see a particular type, for example, x :: IO String, you should interpret that to mean "x is an action that, when performed, does some arbitrary I/O and then returns something of type String" (note that in Haskell, String and [Char] are exactly the same thing).
So how do you ever get access to the values from an IO action? Fortunately, the type of the function main is IO () (it's an action that does some I/O and returns (), which is the same as returning nothing). So you can always use your IO functions inside main. When you execute a Haskell program, what you are doing is running the main function, which causes all the I/O in the program definition to actually be executed - for example, you can read and write from files, ask the user for input, write to stdout etc etc.
You can think of structuring a Haskell program like this:
All code that does I/O gets the IO tag (basically, you put it in a do block)
Code that doesn't need to perform I/O doesn't need to be in a do block - these are the "pure" functions.
Your main function sequences together the I/O actions you've defined in an order that makes the program do what you want it to do (interspersed with the pure functions wherever you like).
When you run main, you cause all of those I/O actions to be executed.
So, given all that, how do you write your program? Well, the function
readFile :: FilePath -> IO String
reads a file as a String. So we can use that to get the contents of the file. The function
lines:: String -> [String]
splits a String on newlines, so now you have a list of Strings, each corresponding to one line of the file. The function
init :: [a] -> [a]
Drops the last element from a list (this will get rid of the final . on each line). The function
read :: (Read a) => String -> a
takes a String and turns it into an arbitrary Haskell data type, such as Int or Bool. Combining these functions sensibly will give you your program.
Note that the only time you actually need to do any I/O is when you are reading the file. Therefore that is the only part of the program that needs to use the IO tag. The rest of the program can be written "purely".
It sounds like what you need is the article The IO Monad For People Who Simply Don't Care, which should explain a lot of your questions. Don't be scared by the term "monad" - you don't need to understand what a monad is to write Haskell programs (notice that this paragraph is the only one in my answer that uses the word "monad", although admittedly I have used it four times now...)
Here's the program that (I think) you want to write
run :: IO (Int, Int, [(Int,Int,Int)])
run = do
contents <- readFile "text.txt" -- use '<-' here so that 'contents' is a String
let [a,b,c] = lines contents -- split on newlines
let firstLine = read (init a) -- 'init' drops the trailing period
let secondLine = read (init b)
let thirdLine = read (init c) -- this reads a list of Int-tuples
return (firstLine, secondLine, thirdLine)
To answer npfedwards comment about applying lines to the output of readFile text.txt, you need to realize that readFile text.txt gives you an IO String, and it's only when you bind it to a variable (using contents <-) that you get access to the underlying String, so that you can apply lines to it.
Remember: once you go IO, you never go back.
1 I am deliberately ignoring unsafePerformIO because, as implied by the name, it is very unsafe! Don't ever use it unless you really know what you are doing.
As a programming noob, I too was confused by IOs. Just remember that if you go IO you never come out. Chris wrote a great explanation on why. I just thought it might help to give some examples on how to use IO String in a monad. I'll use getLine which reads user input and returns an IO String.
line <- getLine
All this does is bind the user input from getLine to a value named line. If you type this this in ghci, and type :type line it will return:
:type line
line :: String
But wait! getLine returns an IO String
:type getLine
getLine :: IO String
So what happened to the IOness from getLine? <- is what happened. <- is your IO friend. It allows you to bring out the value that is tainted by the IO within a monad and use it with your normal functions. Monads are easily identified because they begin with do. Like so:
main = do
putStrLn "How much do you love Haskell?"
amount <- getLine
putStrln ("You love Haskell this much: " ++ amount)
If you're like me, you'll soon discover that liftIO is your next best monad friend, and that $ help reduce the number of parenthesis you need to write.
So how do you get the information from readFile? Well if readFile's output is IO String like so:
:type readFile
readFile :: FilePath -> IO String
Then all you need is your friendly <-:
yourdata <- readFile "samplefile.txt"
Now if type that in ghci and check the type of yourdata you'll notice it's a simple String.
:type yourdata
text :: String
As people already say, if you have two functions, one is readStringFromFile :: FilePath -> IO String, and another is doTheRightThingWithString :: String -> Something, then you don't really need to escape a string from IO, since you can combine this two functions in various ways:
With fmap for IO (IO is Functor):
fmap doTheRightThingWithString readStringFromFile
With (<$>) for IO (IO is Applicative and (<$>) == fmap):
import Control.Applicative
...
doTheRightThingWithString <$> readStringFromFile
With liftM for IO (liftM == fmap):
import Control.Monad
...
liftM doTheRightThingWithString readStringFromFile
With (>>=) for IO (IO is Monad, fmap == (<$>) == liftM == \f m -> m >>= return . f):
readStringFromFile >>= \string -> return (doTheRightThingWithString string)
readStringFromFile >>= \string -> return $ doTheRightThingWithString string
readStringFromFile >>= return . doTheRightThingWithString
return . doTheRightThingWithString =<< readStringFromFile
With do notation:
do
...
string <- readStringFromFile
-- ^ you escape String from IO but only inside this do-block
let result = doTheRightThingWithString string
...
return result
Every time you will get IO Something.
Why you would want to do it like that? Well, with this you will have pure and
referentially transparent programs (functions) in your language. This means that every function which type is IO-free is pure and referentially transparent, so that for the same arguments it will returns the same values. For example, doTheRightThingWithString would return the same Something for the same String. However readStringFromFile which is not IO-free can return different strings every time (because file can change), so that you can't escape such unpure value from IO.
If you have a parser of this type:
myParser :: String -> Foo
and you read the file using
readFile "thisfile.txt"
then you can read and parse the file using
fmap myParser (readFile "thisfile.txt")
The result of that will have type IO Foo.
The fmap means myParser runs "inside" the IO.
Another way to think of it is that whereas myParser :: String -> Foo, fmap myParser :: IO String -> IO Foo.

Resources