How to convert data from IO(String) to String in Haskell [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
A Haskell function of type: IO String-> String
I'm reading some data from a file using the readFile function available in Haskell, but this function returns the data as an IO String. Does anybody know how I can convert this data into a String (or know of any function that reads a String from a file without the IO type)?

It is a very general question about extracting data from monadic values.
The general idea is to use the >>= function:
main = readFile foo >>= \s -> print s
>>= takes two arguments. It extracts the value from its first argument and passes it to its second argument. The first argument is a monadic value, in this case of type IO String, and the second argument is a function that accepts a plain, non-monadic value, in this case a String.
There is a special syntax for this pattern:
main = do
  s <- readFile foo
  print s
But the meaning is the same as above. The do notation is more convenient for beginners and for certain complicated cases, but explicit application of >>= can lead to shorter code. For example, this code can be written as just
main = readFile foo >>= print
There is also a large family of library functions to convert between monadic and non-monadic values. The most important of them are return, fmap, liftM2 and >=>.
The concept of monad is very useful beyond representing IO in a referentially transparent way: these helpers are very useful for error handling, dealing with implicit state and other applications of monads.
The second most important monad is Maybe.
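As a small illustration of the helpers named above, here is a sketch that uses Maybe rather than IO so the results can be inspected directly. safeHalf and quarter are hypothetical names invented for this example; only return, fmap, liftM2 and >=> come from the standard library.

```haskell
import Control.Monad (liftM2, (>=>))

-- Hypothetical helper: halve a number, failing on odd input.
safeHalf :: Int -> Maybe Int
safeHalf n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

-- (>=>) composes two functions that each return a monadic value.
quarter :: Int -> Maybe Int
quarter = safeHalf >=> safeHalf

main :: IO ()
main = do
  print (return 3 :: Maybe Int)         -- wrap a plain value
  print (fmap (+ 1) (Just 3))           -- apply a plain function inside
  print (liftM2 (+) (Just 1) (Just 2))  -- combine two monadic values
  print (quarter 12)                    -- both halving steps succeed
  print (quarter 6)                     -- the second halving step fails
```

The same four helpers work unchanged on IO values; only the monad differs.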

I'd treat the IO type as a functor in this case, and instead of getting the value out of it, I'd send my function inside it and let the Functor instance deal with creating a new IO container with the result from my function.
> :m +Data.Functor
> length <$> readFile "file.txt"
525
<$> is an alias for fmap. I like <$> more, but it's just a personal preference.
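To make the "send the function inside" idea concrete, here is a minimal sketch. countChars and countChars' are hypothetical names, and return "hello" stands in for readFile "file.txt" so that the example does not depend on a file being present.

```haskell
-- Count the characters produced by any String-yielding action.
countChars :: IO String -> IO Int
countChars action = length <$> action

-- The same thing spelled out with do notation.
countChars' :: IO String -> IO Int
countChars' action = do
  s <- action
  return (length s)

main :: IO ()
main = do
  n <- countChars (return "hello")   -- stands in for readFile "file.txt"
  m <- countChars' (return "hello")
  print (n, m)
```

Both versions return an IO Int; neither ever produces a bare Int outside of IO.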

Related

Is print in Haskell a pure function?

Is print in Haskell a pure function; why or why not? I'm thinking it's not, because it does not always return the same value as pure functions should.
A value of type IO Int is not really an Int. It's more like a piece of paper which reads "hey Haskell runtime, please produce an Int value in such and such way". The piece of paper is inert and remains the same, even if the Ints eventually produced by the runtime are different.
You send the piece of paper to the runtime by assigning it to main. If the IO action never makes its way into main and instead languishes inside some container, it will never get executed.
Functions that return IO actions are pure like the others. They always return the same piece of paper. What the runtime does with those instructions is another matter.
If they weren't pure, we would have to think twice before changing
foo :: (Int -> IO Int) -> IO Int
foo f = liftA2 (+) (f 0) (f 0)
to:
foo :: (Int -> IO Int) -> IO Int
foo f = let x = f 0 in liftA2 (+) x x
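The equivalence of the two definitions can be checked with an action that has a visible effect. The sketch below is an illustration only: foo1, foo2 and tick are hypothetical names, and the counter in an IORef is an assumption chosen to make the effect observable.

```haskell
import Control.Applicative (liftA2)
import Data.IORef (IORef, newIORef, modifyIORef', readIORef)

foo1 :: (Int -> IO Int) -> IO Int
foo1 f = liftA2 (+) (f 0) (f 0)

foo2 :: (Int -> IO Int) -> IO Int
foo2 f = let x = f 0 in liftA2 (+) x x

-- Hypothetical effectful function: bumps a counter and returns
-- the new counter value plus n.
tick :: IORef Int -> Int -> IO Int
tick ref n = do
  modifyIORef' ref (+ 1)
  c <- readIORef ref
  return (c + n)

main :: IO ()
main = do
  r1 <- newIORef 0
  a  <- foo1 (tick r1)
  r2 <- newIORef 0
  b  <- foo2 (tick r2)
  print (a, b)
```

Naming the action x shares the piece of paper, not the result of running it, so the action is still executed twice in foo2 and both versions yield the same sum.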
Yes, print is a pure function. The value it returns has type IO (), which you can think of as a bunch of code that outputs the string you passed in. For each string you pass in, it always returns the same code.
If you read the description of the pure-function tag (a function that always evaluates to the same result value given the same argument value(s) and that does not cause any semantically observable side effect or output, such as mutation of mutable objects or output to I/O devices) and then think about the type of putStrLn (print is putStrLn composed with show):
putStrLn :: String -> IO ()
You will find a catch there: it always returns an IO (), so... no, it produces effects. So in terms of referential transparency it is not pure.
For example, getLine has type IO String, but it is also pure (@interjay's contribution). What I'm trying to say is that the answer depends closely on how the question is asked:
On matter of value, IO () will always be the same IO () value for the same input.
And
On matter of execution, it is not pure because the execution of that
IO () could have side effects (putting a string on the screen looks
innocent in this case, but some IO could launch nuclear bombs and
then return the Int 42)
You may understand it better through @Ben's nice explanation here:
"There are several ways to explain how you're "purely" manipulating
the real world. One is to say that IO is just like a state monad, only
the state being threaded through is the entire world outside your
program (so your Stuff -> IO DBThing function really has an extra
hidden argument that receives the world, and actually returns a
DBThing along with another world; it's always called with different
worlds, and that's why it can return different DBThing values even
when called with the same Stuff). Another explanation is that an IO
DBThing value is itself an imperative program; your Haskell program is
a totally pure function doing no IO, which returns an impure program
that does IO, and the Haskell runtime system (impurely) executes the
program it returns."
And @Erik Allik:
So Haskell functions that return values of type IO a, are actually not
the functions that are being executed at runtime — what gets executed
is the IO a value itself. So these functions actually are pure but
their return values represent non-pure computations.
You can find them here: Understanding pure functions in Haskell with IO

Why wrap an IO result in IO Monad

In Haskell we have a function readFile :: FilePath -> IO String. My question while understanding monad is why wrap it in IO? Couldn't we just have written function like these:
(lines.readFile) path
Rather than
(readFile >>= lines) path
What benefit does the IO wrapper provide?
Haskell expressions are referentially transparent. This means that if readFile really had the type FilePath -> String, then the expression readFile "a.txt" would always yield the same result. Even if you read the file, then changed it, and then read it again, you would get the contents in their first state.
Thus, we need to distinguish between values and actions, and this is what IO is for. It doesn't let you use the result of readFile "a.txt" in other expressions until you perform the action associated with it. As a consequence, after changing your file you have to perform the reading action again to get the file contents, and because of that you will be able to see the changes.
We write functions that create computer programs
It should be noted that Haskell is a functional programming language. Functions, in the mathematical sense, always produce the same values for the same inputs.
Now this requirement to always produce the same result constrains things quite a bit, since a function to read a file would somehow have to produce the same result every time, even if the file was later changed. That's obviously not what we really want.
There is, however, a way to make a functional programming language that can handle reading a changing file. What you do is to write a function that produces some action the computer should perform. So you might perform an action composed of the following steps:
Read the file
Break it into lines
Change the even-numbered lines to uppercase
Output the lines to the screen
These four actions aren't performed yet. They're just a sequence of actions that we might perform. A function can return that exact same sequence of potential actions every time it's called, which makes it a proper mathematical function.
The main :: IO a function in Haskell returns the action that the program should perform. It always returns the same action, making it a proper mathematical function. When the program is run, the computer evaluates the main function, producing the action the computer should perform, and the computer then executes the action.
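The four steps listed above can be sketched as follows. This is an illustration under stated assumptions: upperEvenLines and process are hypothetical names, "demo.txt" is a sample file created by the example itself, and lines are counted from 1.

```haskell
import Data.Char (toUpper)

-- Pure step: upper-case the even-numbered lines (counting from 1).
upperEvenLines :: [String] -> [String]
upperEvenLines = zipWith step [1 :: Int ..]
  where
    step i l
      | even i    = map toUpper l
      | otherwise = l

-- Builds the action; nothing is read or printed until it is executed.
process :: FilePath -> IO ()
process path = do
  contents <- readFile path
  mapM_ putStrLn (upperEvenLines (lines contents))

main :: IO ()
main = do
  writeFile "demo.txt" "one\ntwo\nthree\nfour\n"  -- sample file for this sketch
  process "demo.txt"
```

process "demo.txt" is the same action every time it is written down; what it prints depends on the file contents at the moment the runtime executes it.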
Do notation
Do notation takes the strangeness out of the process, giving you the feel of a much more standard programming language. You have three options:
Perform an action and do nothing with its results
Perform an action and store its results
Process data using only functions (no actions)
These are done in the following ways, respectively:
action args
result <- action args
let result = f . g . h . whateverCalculation $ value
This is similar to an imperative language like C where you do, respectively:
action(args);
result = action(args);
result = f(g(h(whateverCalculation(value))));
For (lines.readFile) path to work, the type of readFile would need to be FilePath -> String. That, however, doesn't make sense in Haskell. A Haskell function is supposed to always produce the same results when given the same arguments. If the result type of readFile were String, however, that would not happen, as readFile "foo.txt" would have to, for any useful implementation of such a readFile, produce different strings depending on the contents of the foo.txt file.
The Haskell solution to this issue is giving readFile the type FilePath -> IO String. An IO String is not a string, but a program that can be executed by the computer and that, when executed, somehow materialises a String into memory. While the String thus produced might be different each time the program is executed, the program itself remains the same, and therefore readFile always returns the same results when given the same arguments (and so, for instance, readFile "foo.txt" is always the same program).
This trick of manipulating a program that produces an I/O-dependent result instead of the result itself only works if the I/O-dependent result is kept opaque; that is, if there is no way of directly extracting it. In other words, there cannot be, for instance, an IO String -> String function -- for one, it would allow us to implement a readFile with the inappropriate type FilePath -> String that we have discussed above. There are, however, indirect ways of using the I/O-dependent result that do not lead to trouble. One of them is using it to create a second program, whose I/O-dependent result is just as opaque as the first one was. The Monad interface allows us to express this usage pattern:
(>>=) :: Monad m => m a -> (a -> m b) -> m b
Specialising (>>=) to IO, we get:
(>>=) @IO :: IO a -> (a -> IO b) -> IO b
The first program has type IO a, and the function that produces the second program using the I/O-dependent result of the first one has type a -> IO b. The result of (>>=) is a program which executes the first program and the second, newly generated, one in sequence. For instance...
readFile "foo.txt" >>= putStrLn
... is a program which reads the contents of foo.txt and then displays these contents.
P.S.: With respect to your example involving lines, it is worth noting that both (readFile >>= lines) path, as you have written it, and (\p -> readFile p >>= lines) path are rejected by the type checker. Something that does work is:
(fmap lines . readFile) path
In it, we are making indirect use of the file contents in a different way. If we have a program which produces an I/O-dependent result, we can turn it into a program which produces a modified version of this result. That is done through fmap, from the Functor class:
fmap :: Functor f => (a -> b) -> f a -> f b
Or, specialising to IO:
fmap @IO :: (a -> b) -> IO a -> IO b
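A runnable sketch of the working version, alongside an equivalent spelling with (>>=), may help. readLines and readLines' are hypothetical names, and "lines-demo.txt" is a sample file that the example writes itself.

```haskell
-- Works: fmap applies the pure function lines to the eventual file contents.
readLines :: FilePath -> IO [String]
readLines = fmap lines . readFile

-- Equivalent with (>>=): the right-hand function must itself return
-- an IO action, hence the return.
readLines' :: FilePath -> IO [String]
readLines' path = readFile path >>= return . lines

main :: IO ()
main = do
  writeFile "lines-demo.txt" "first\nsecond\n"  -- sample contents for this sketch
  a <- readLines "lines-demo.txt"
  b <- readLines' "lines-demo.txt"
  print (a == b, a)
```

The version from the question fails to type-check because lines returns a plain [String], whereas the second argument of (>>=) must return a value in IO.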

pure code and impure code? I/O data type related [duplicate]

This question already has answers here:
In what sense is the IO Monad pure?
(9 answers)
Closed 7 years ago.
name <- getLine
The following paragraph is about how the line works. Can someone explain to me what this means? Thank you in advance.
getLine is in a sense impure because its result value is not
guaranteed to be the same when performed twice. That's why it's sort
of tainted with the IO type constructor and we can only get that data
out in I/O code. And because I/O code is tainted too, any computation
that depends on tainted I/O data will have a tainted result.
When I say tainted, I don't mean tainted in such a way that we can
never use the result contained in an I/O action ever again in pure
code. No, we temporarily un-taint the data inside an I/O action when
we bind it to a name. When we do name <- getLine, name is just a
normal string, because it represents what's inside the box.
The paragraph hints at the fact that you can't use getLine in a function which has a "pure" type, without the IO monad occurring in it. E.g. if we try to run
lineLength :: Int -> Int
lineLength n = n + length getLine
the compiler will complain because length expects a String (or any other list type) but getLine is an IO String. So there's a type mismatch.
But this does not mean that length and getLine can not be composed: here's how
lineLength :: Int -> IO Int
lineLength n = do
  line <- getLine
  return (n + length line)
Above we temporarily "remove the IO" since line :: String, so that length can be applied to it, and n added to the result. However, then we are forced to use return to transform a pure Int result into an IO Int.
Notice how the IO also ends up in the function signature: this is unavoidable.
In Haskell, if a function does impure things such as using IO, then the type system forces you to use IO in the type. This is the "taint" the quote is referring to: once you use an impure function/value with an IO type, then you have to use IO in the type of your own code, and any caller of that in turn will have to use IO. Impurity must be propagated in the type signatures.
This also means that if you see f :: Int -> Int, you can rely on the fact that the compiler proved the function to be pure: it will return the same result for the same input. (There are a few low-level ways to circumvent this, but they are not meant to be used in regular code.)

How to understand "m ()" is a monadic computation

From the document:
when :: (Monad m) => Bool -> m () -> m ()
when p s = if p then s else return ()
The when function takes a boolean argument and a monadic computation with unit () type and performs the computation only when the boolean argument is True.
===
As a Haskell newbie, my problem is that for me m () is some "void" data, but here the document mentions it as computation. Is it because of the laziness of Haskell?
Laziness has no part in it.
The m/Monad part is often called computation.
The best example might be m = IO:
Look at putStrLn "Hello" :: IO () - This is a computation that, when run, will print "Hello" to your screen.
This computation has no result - so the return type is ()
Now when you write
hello :: Bool -> IO ()
hello sayIt =
  when sayIt (putStrLn "Hello")
then hello True is a computation that, when run, will print "Hello"; while hello False is a computation that when run will do nothing at all.
Now compare it to getLine :: IO String - this is a computation that, when run, will prompt you for an input and will return the input as a String - that's why the return type is String.
Does this help?
for me "m ()" is some "void" data
And that kinda makes sense, in that a computation is a special kind of data. It has nothing to do with laziness - it's associated with context.
Let's take State as an example. A function of type, say, s -> () in Haskell can only produce one value. However, a function of type s -> ((), s) is a regular function doing some transformation on s. The problem you're having is that you're only looking at the () part, while the s -> s part stays hidden. That's the point of State - to hide the state passing.
Hence State s () is trivially convertible to s -> ((), s) and back, and it still is a Monad (a computation) that produces a value of... ().
If we look at practical use, now:
(flip runState 10) $ do
  modify (+1)
This expression produces a tuple of type ((), Int), here ((), 11); inside the do block the Int state is hidden.
It will modify the state, adding 1 to it. It produces the intermediate value of (), though, which fits your when:
when (5 > 3) $ modify (+1)
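To show the hidden s -> ((), s) part without depending on a State library, here is a hand-rolled sketch. StateFn, modifyFn and whenFn are hypothetical stand-ins written for this example; they are not the real State, modify and when, and the result is written as (a, s) to match the tuples above.

```haskell
-- A stand-in for State s a: a function from a state to (result, new state).
type StateFn s a = s -> (a, s)

-- Stand-in for modify: the result is (), the interesting part is the new state.
modifyFn :: (s -> s) -> StateFn s ()
modifyFn f s = ((), f s)

-- Stand-in for when: run the computation only if the condition holds,
-- otherwise pass the state through unchanged.
whenFn :: Bool -> StateFn s () -> StateFn s ()
whenFn p act = if p then act else \s -> ((), s)

main :: IO ()
main = do
  print (modifyFn (+ 1) (10 :: Int))
  print (whenFn (5 > 3) (modifyFn (+ 1)) (10 :: Int))
  print (whenFn False (modifyFn (+ 1)) (10 :: Int))
```

The first two lines print ((),11) and the third prints ((),10): the () result is always the same, and all the work happens in the state component.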
Monads are notably abstract and mathematical, so intuitive statements about them are often made in language that is rather vague. So values of a monadic type are often informally labeled as "computations," "actions" or (less often) "commands" because it's an analogy that sometimes helps us reason about them. But when you dig deeper, these turn out to be empty words when used this way; ultimately what they mean is "some value of a type that provides the Monad interface."
I like the word "action" better for this, so let me go with that. The intuition for the use for that word in Haskell is this: the language makes a distinction between functions and actions:
Functions can't have any side effects, and their types look like a -> b.
Actions may have side effects, and their types look like IO a.
A consequence of this: an action of type IO () produces an uninteresting result value, and therefore it's either a no-op (return ()) or an action that is only interesting because of its side effects.
Monad then is the interface that allows you to glue actions together into complex actions.
This is all very intuitive, but it's an analogy that becomes rather stretched when you try to apply it to many monads other than the IO type. For example, lists are a monad:
instance Monad [] where
  return a = [a]
  as >>= f = concatMap f as
The "actions" or "computations" of the list monad are... lists. How is a list an "action" or a "computation"? The analogy is rather weak in this case, isn't it?
So I'd say that this is the best advice:
Understand that "action" and "computation" are analogies. There's no strict definition.
Understand that these analogies are stronger for some monad instances, and weak for others.
The ultimate barometer of how things work are the Monad laws and the definitions of the various functions that work with Monad.
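A short sketch of the list monad in use, relying only on the Prelude; the name pairs is invented for this example. It also checks one Monad law (left identity) for a single sample value, which is a spot check rather than a proof.

```haskell
-- Each element is replaced by a list of results, and the results are concatenated.
pairs :: [(Int, Char)]
pairs = do
  n <- [1, 2]
  c <- "ab"
  return (n, c)

main :: IO ()
main = do
  print ([1, 2, 3] >>= \x -> [x, x * 10])
  print pairs
  -- Left identity, checked for one sample: return a >>= f == f a
  let f x = [x, x + 1] :: [Int]
  print ((return 5 >>= f) == f 5)
```

The first line prints [1,10,2,20,3,30], which is exactly concatMap applied to the list, as the instance definition says.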

Haskell: I/O and Returning From a Function

Please bear with me as I am very new to functional programming and Haskell. I am attempting to write a function in Haskell that takes a list of Integers, prints the head of said list, and then returns the tail of the list. The function needs to be of type [Integer] -> [Integer]. To give a bit of context, I am writing an interpreter and this function is called when its respective command is looked up in an associative list (key is the command, value is the function).
Here is the code I have written:
dot (x:xs) = do print x
                return xs
The compiler gives the following error message:
forth.hs:12:1:
Couldn't match expected type `[a]' against inferred type `IO [a]'
Expected type: ([Char], [a] -> [a])
Inferred type: ([Char], [a] -> IO [a])
In the expression: (".", dot)
I suspect that the call to print in the dot function is what is causing the inferred type to be IO [a]. Is there any way that I can ignore the return type of print, as all I need to return is the tail of the list being passed into dot?
Thanks in advance.
In most functional languages, this would work. However, Haskell is a pure functional language. You are not allowed to perform IO in a function whose type does not mention IO, so the function can be either
[Int] -> [Int] without performing any IO or
[Int] -> IO [Int] with IO
The type of dot as inferred by the compiler is dot :: (Show t) => [t] -> IO [t] but you can declare it to be [Int] -> IO [Int]:
dot :: [Int] -> IO [Int]
See IO monad: http://book.realworldhaskell.org/read/io.html
I haven't mentioned System.IO.Unsafe.unsafePerformIO, which should be used with great care and with a firm understanding of its consequences.
No, either your function causes side effects (aka IO, in this case printing on the screen), or it doesn't. print does IO and therefore returns something in IO and this can not be undone.
And it would be a bad thing if the compiler could be tricked into forgetting about the IO. For example if your [Integer] -> [Integer] function is called several times in your program with the same parameters (like [] for example), the compiler might perfectly well just execute the function only once and use the result of that in all the places where the function got "called". Your "hidden" print would only be executed once even though you called the function in several places.
But the type system protects you and makes sure that all functions that use IO, even if only indirectly, have an IO type to reflect this. If you want a pure function you cannot use print in it.
As you may already know, Haskell is a "pure" functional programming language. For this reason, side-effects (such as printing a value on the screen) are not incidental as they are in more mainstream languages. This fact gives Haskell many nice properties, but you would be forgiven for not caring about this when all you're doing is trying to print a value to the screen.
Because the language has no direct facility for causing side-effects, the strategy is that functions may produce one or more "IO action" values. An IO action encapsulates some side effect (printing to the console, writing to a file, etc.) along with possibly producing a value. Your dot function is producing just such an action. The problem you now have is that you need something that will be able to cause the IO side-effect, as well as unwrapping the value and possibly passing it back into your program.
Without resorting to hacks, this means that you need to get your IO action(s) back up to the main function. Practically speaking, this means that everything between main and dot has to be in the "IO Monad". What happens in the "IO Monad" stays in the "IO Monad" so to speak.
EDIT
Here's about the simplest example I could imagine for using your dot function in a valid Haskell program:
module Main where
main :: IO ()
main =
  do
    let xs = [2,3,4]
    xr <- dot xs
    xrr <- dot xr
    return ()
dot (x:xs) =
  do
    print x
    return xs
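For the interpreter described in the question, one possible arrangement is to give every entry in the command table the same IO type, so that printing and non-printing commands can sit side by side. This is only a sketch: Command, dup, commands and run are hypothetical names, and leaving an empty stack unchanged is an assumption made here, not something the question specifies.

```haskell
-- Every command has the same type, so commands that print and commands
-- that do not can share one table.
type Command = [Integer] -> IO [Integer]

dot :: Command
dot []       = return []            -- assumption: an empty stack is left unchanged
dot (x : xs) = print x >> return xs

-- A pure stack operation lifted into IO with return.
dup :: Command
dup []       = return []
dup (x : xs) = return (x : x : xs)

commands :: [(String, Command)]
commands = [(".", dot), ("dup", dup)]

-- Run a list of command names in order; unknown names are skipped.
run :: [String] -> [Integer] -> IO [Integer]
run []       stack = return stack
run (c : cs) stack = case lookup c commands of
  Just cmd -> cmd stack >>= run cs
  Nothing  -> run cs stack

main :: IO ()
main = do
  rest <- run ["dup", ".", "."] [2, 3, 4]
  print rest
```

Running main prints 2 twice and then [3,4]. The IO type travels from dot through run up to main, which is the propagation the answers above describe.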

Resources