How does getLine work in haskell? - haskell

Looking at the definition of getLine in the Haskell Prelude,
I get how the recursion works, where you keep asking for a character until you hit a newline and you buildup a list which you then return wrapped in an IO.
However my question is how do the return statements work in this case, specifically how does return (c:....:return "") work when you hit the base case. How do you cons a return "" on to a list?

return isn't a control structure like in most languages. It's a constructor for monadic values. Let's take a look at its type:
return :: Monad m => a -> m a
In this case, given a String value, it produces a IO String value.
The fact that return is the last expression evaluated in each branch of the if doesn't mean return ends execution; other expressions could occur after return. Consider this simple example from the list monad:
foo :: Int -> Int -> [Int]
foo x y = return x ++ return y
In the list monad, return simply creates a new single-item list containing its argument. Those two lists are then concatenated into the final result list returned by the function.
$ return 3 :: [Int]
[3]
$ foo 3 4
[3,4]

do-notation is a syntax sugar.
do x <- e
rest
is equivalent to
e >>= \x -> rest
where >>= is a flatMap or bind operation (it attaches a callback to IO container).
flatMap :: IO a -> (a -> IO b) -> IO b meaning is: given container of type IO a attach a callback of type a -> IO b, fired when container succeeds in its operation, and this produces a new container of type IO b
So
getLine =
getChar >>= \c ->
if c == '\n'
then (return [])
else getLine >>= \rest ->
return (c : rest)
What is means? getLine immediately delegates execution to getChar IO-container, with a callback, which analyses the character passed to it. If its a newline, it does "return """, which is a construction of IO-container, returning empty String immediately.
Otherwise, we call ourselves, grab the rest and return current character attached to rest.
P.S.: return is used to turn a pure value into container, since Monad interface doesn't allow us to bind non-container-producing callbacks (there are very good reasons for this).

Related

Refactoring Haskell when adding IO

I have a concern regarding how far the introduction of IO trickles through a program. Say a function deep within my program is altered to include some IO; how do I isolate this change to not have to also change every function in the path to IO as well?
For instance, in a simplified example:
a :: String -> String
a s = (b s) ++ "!"
b :: String -> String
b s = '!':(fetch s)
fetch :: String -> String
fetch s = reverse s
main = putStrLn $ a "hello"
(fetch here could more realistically be reading a value from a static Map to give as its result)
But say if due to some business logic change, I needed to lookup the value returned by fetch in some database (which I can exemplify here with a call to getLine):
fetch :: String -> IO String
fetch s = do
x <- getLine
return $ s ++ x
So my question is, how to prevent having to rewrite every function call in this chain?
a :: String -> IO String
a s = fmap (\x -> x ++ "!") (b s)
b :: String -> IO String
b s = fmap (\x -> '!':x) (fetch s)
fetch :: String -> IO String
fetch s = do
x <- getLine
return $ s ++ x
main = a "hello" >>= putStrLn
I can see that refactoring this would be much simpler if the functions themselves did not depend on each other. That is fine for a simple example:
a :: String -> String
a s = s ++ "!"
b :: String -> String
b s = '!':s
fetch :: String -> IO String
fetch s = do
x <- getLine
return $ s ++ x
doit :: String -> IO String
doit s = fmap (a . b) (fetch s)
main = doit "hello" >>= putStrLn
but I don't know if that is necessarily practical in more complicated programs.
The only way I've found thus far to really isolate an IO addition like this is to use unsafePerformIO, but, by its very name, I don't want to do that if I can help it. Is there some other way to isolate this change? If the refactoring is substantial, I would start to feel inclined to avoid having to do it (especially under deadlines, etc).
Thanks for any advice!
Here are a few methods I use.
Reduce dependencies on effects by inverting control. (One of the methods you described in your question.) That is, execute the effects outside and pass the results (or functions with those results partially applied) into pure code. Instead of having main → a → b → fetch, have main → fetch and then main → a → b:
a :: String -> String
a f = b f ++ "!"
b :: String -> String
b f = '!' : f
fetch :: String -> IO String
fetch s = do
x <- getLine
return $ s ++ x
main = do
f <- fetch "hello"
putStrLn $ a f
For more complex cases of this, where you need to thread an argument to do this sort of “dependency injection” through many levels, Reader/ReaderT lets you abstract over the boilerplate.
Write pure code that you expect might need effects in monadic style from the start. (Polymorphic over the choice of monad.) Then if you do eventually need effects in that code, you don’t need to change the implementation, only the signature.
a :: (Monad m) => String -> m String
a s = (++ "!") <$> b s
b :: (Monad m) => String -> m String
b s = ('!' :) <$> fetch s
fetch :: (Monad m) => String -> m String
fetch s = pure (reverse s)
Since this code works for any m with a Monad instance (or in fact just Applicative), you can run it directly in IO, or purely with the “dummy” monad Identity:
main = putStrLn =<< a "hello"
main = putStrLn $ runIdentity $ a "hello"
Then as you need more effects, you can use “mtl style” (as #dfeuer’s answer describes) to enable effects on an as-needed basis, or if you’re using the same monad stack everywhere, just replace m with that concrete type, e.g.:
newtype Fetch a = Fetch { unFetch :: IO a }
deriving (Applicative, Functor, Monad, MonadIO)
a :: String -> Fetch String
a s = pure (b s ++ "!")
b :: String -> Fetch String
b s = ('!' :) <$> fetch s
fetch :: String -> Fetch String
fetch s = do
x <- liftIO getLine
return $ s ++ x
main = putStrLn =<< unFetch (a "hello")
The advantage of mtl style is that you can have multiple different implementations of your effects. That makes things like testing & mocking easy, since you can reuse the logic but run it with different “handlers” for production & testing. In fact, you can get even more flexibility (at the cost of some runtime performance) using an algebraic effects library such as freer-effects, which not only lets the caller change how each effect is handled, but also the order in which they’re handled.
Roll up your sleeves and do the refactoring. The compiler will tell you everywhere that needs to be updated anyway. After enough times doing this, you’ll naturally end up recognising when you’re writing code that will require this refactoring later, so you’ll consider effects from the beginning and not run into the problem.
You’re quite right to doubt unsafePerformIO! It’s not just unsafe because it breaks referential transparency, it’s unsafe because it can break type, memory, and concurrency safety as well—you can use it to coerce any type to any other, cause a segfault, or cause deadlocks and concurrency errors that would ordinarily be impossible. You’re telling the compiler that some code is pure, so it’s going to assume it can do all the transformations it does with pure code—such as duplicating, reordering, or even dropping it, which may completely change the correctness and performance of your code.
The main legitimate use cases for unsafePerformIO are things like using the FFI to wrap foreign code (that you know is pure), or doing GHC-specific performance hacks; stay away from it otherwise, since it’s not meant as an “escape hatch” for ordinary code.
First off, the refactoring doesn't tend to be as bad as you might imagine. Once you make the first change, the type checker will point you to the next few, and so on. But suppose you have a reason to suspect from the start that you might need some extra capability to make a function go. A common way to do this (called mtl-style, after the monad transformer library) is to express your needs in a constraint.
class Monad m => MonadFetch m where
fetch :: String -> m String
a :: MonadFetch m => String -> m String
a s = fmap (\x -> x ++ "!") (b s)
b :: MonadFetch m => String -> m String
b s = fmap (\x -> '!':x) (fetch s)
instance MonadFetch IO where
-- fetch :: String -> IO String
fetch s = do
x <- getLine
return $ s ++ x
instance MonadFetch Identity where
-- fetch :: String -> Identity String
fetch = Identity . reverse
You're no longer tied to a particular monad: you just need one that can fetch. Code operating on an arbitrary MonadFetch instance is pure, except that it can fetch.

Use of putStrLn to show result

I am using the Idone.com site and wanted to run this code but do not know the syntax putStrLn to compile from stdin Use this code but strip error.
main = putStrLn (show (sumaCifras x))
sumaCifras:: Int -> Int
sumaCifras x = div x 1000 + mod (div x 100) 10 + mod (div x 10) 10 + mod x 10
Compiler is having a problem, because you use x in main function, which isn't bound in this scope. At first you must read a value from input and then pass it to your function. You can do it in 2 ways.
More natural for people used to imperative languages is "do" syntax, in which it will look like that:
main = do
x <- getLine
putStrLn (show (sumaCifras (read x :: Int)))
When you want to use x as Int, you must use "read" function with type signature, so compiler will know what to expect.
To write it in more functional way, you may use monad transformations, to pass it like that
main = getLine >>= (\x -> putStrLn(show (sumaCifras (read x :: Int)))
The ">>=" operator will get result value from first monadic action (in here it is IO action of getting input) and apply it to function on the right (in here it is lambda function that reads input as Integer, applies your function and returns it to putStrLn, which prints it on the screen). "do" syntax is essentially just a syntactic sugar for this monadic operations, so it will not affect the execution or performance of program.
You can go one step further in writing it functionally by writing it totally point-free
main = getLine >>= (putStrLn . show . sumaCifras . (read :: String -> Int))
Note that here you have type signature for read function, not for application of this function to argument, hence the String -> Int. In here first executed is the getLine function. Input obtained from it is then passed to the read, where it is casted to Int, next is sumaCifras, which then is casted to String by show and printed with putStrLn.

Haskell Input to create a String List

I would like to allow a user to build a list from a series of inputs in Haskell.
The getLine function would be called recursively until the stopping case ("Y") is input, at which point the list is returned.
I know the function needs to be in a similar format to below. I am having trouble assigning the correct type signatures - I think I need to include the IO type somewhere.
getList :: [String] -> [String]
getList list = do line <- getLine
if line == "Y"
then return list
else getList (line : list)
So there's a bunch of things that you need to understand. One of them is the IO x type. A value of this type is a computer program that, when later run, will do something and produce a value of type x. So getLine doesn't do anything by itself; it just is a certain sort of program. Same with let p = putStrLn "hello!". I can sequence p into my program multiple times and it will print hello! multiple times, because the IO () is a program, as a value which Haskell happens to be able to talk about and manipulate. If this were TypeScript I would say type IO<x> = { run: () => Promise<x> } and emphatically that type says that the side-effecting action has not been run yet.
So how do we manipulate these values when the value is a program, for example one that fetches the current system time?
The most fundamental way to chain such programs together is to take a program that produces an x (an IO x) and then a Haskell function which takes an x and constructs a program which produces a y (an x -> IO y and combines them together into a resulting program producing a y (an IO y.) This function is called >>= and pronounced "bind". In fact this way is universal, if we add a program which takes any Haskell value of type x and produces a program which does nothing and produces that value (return :: x -> IO x). This allows you to use, for example, the Prelude function fmap f = (>>= return . f) which takes an a -> b and applies it to an IO a to produce an IO b.
So It is so common to say things like getLine >>= \line -> putStrLn (upcase line ++ "!") that we invented do-notation, writing this as
do
line <- getLine
putStrLn (upcase line ++ "!")
Notice that it's the same basic deal; the last line needs to be an IO y for some y.
The last thing you need to know in Haskell is the convention which actually gets these things run. That is that, in your Haskell source code, you are supposed to create an IO () (a program whose value doesn't matter) called Main.main, and the Haskell compiler is supposed to take this program which you described, and give it to you as an executable which you can run whenever you want. As a very special case, the GHCi interpreter will notice if you produce an IO x expression at the top level and will immediately run it for you, but that is very different from how the rest of the language works. For the most part, Haskell says, describe the program and I will give it to you.
Now that you know that Haskell has no magic and the Haskell IO x type just is a static representation of a computer program as a value, rather than something which does side-effecting stuff when you "reduce" it (like it is in other languages), we can turn to your getList. Clearly getList :: IO [String] makes the most sense based on what you said: a program which allows a user to build a list from a series of inputs.
Now to build the internals, you've got the right guess: we've got to start with a getLine and either finish off the list or continue accepting inputs, prepending the line to the list:
getList = do
line <- getLine
if line == 'exit' then return []
else fmap (line:) getList
You've also identified another way to do it, which depends on taking a list of strings and producing a new list:
getList :: IO [String]
getList = fmap reverse (go []) where
go xs = do
x <- getLine
if x == "exit" then return xs
else go (x : xs)
There are probably several other ways to do it.

What's the meaning of IO actions within pure functions?

I thought that in principle Haskell's type system would forbid calls to impure functions (i.e. f :: a -> IO b) from pure ones, but today I realized that by calling them with return they compile just fine. In this example:
h :: Maybe ()
h = do
return $ putStrLn "???"
return ()
h works in the Maybe monad, but it's a pure function nevertheless. Compiling and running it simply returns Just () as one would expect, without actually doing any I/O. I think Haskell's laziness puts the things together (i.e. putStrLn's return value is not used - and can't since its value constructors are hidden and I can't pattern match against it), but why is this code legal? Are there any other reasons that makes this allowed?
As a bonus, related question: in general, is it possible to forbid at all the execution of actions of a monad from within other ones, and how?
IO actions are first-class values like any other; that's what makes Haskell's IO so expressive, allowing you to build higher-order control structures (like mapM_) from scratch. Laziness isn't relevant here,1 it's just that you're not actually executing the action. You're just constructing the value Just (putStrLn "???"), then throwing it away.
putStrLn "???" existing doesn't cause a line to be printed to the screen. By itself, putStrLn "???" is just a description of some IO that could be done to cause a line to be printed to the screen. The only execution that happens is executing main, which you constructed from other IO actions, or whatever actions you type into GHCi. For more information, see the introduction to IO.
Indeed, it's perfectly conceivable that you might want to juggle about IO actions inside Maybe; imagine a function String -> Maybe (IO ()), which checks the string for validity, and if it's valid, returns an IO action to print some information derived from the string. This is possible precisely because of Haskell's first-class IO actions.
But a monad has no ability to execute the actions of another monad unless you give it that ability.
1 Indeed, h = putStrLn "???" `seq` return () doesn't cause any IO to be performed either, even though it forces the evaluation of putStrLn "???".
Let's desugar!
h = do return (putStrLn "???"); return ()
-- rewrite (do foo; bar) as (foo >> do bar)
h = return (putStrLn "???") >> do return ()
-- redundant do
h = return (putStrLn "???") >> return ()
-- return for Maybe = Just
h = Just (putStrLn "???") >> Just ()
-- replace (foo >> bar) with its definition, (foo >>= (\_ -> bar))
h = Just (putStrLn "???") >>= (\_ -> Just ())
Now, what happens when you evaluate h?* Well, for Maybe,
(Just x) >>= f = f x
Nothing >>= f = Nothing
So we pattern match the first case
f x
-- x = (putStrLn "???"), f = (\_ -> Just ())
(\_ -> Just ()) (putStrLn "???")
-- apply the argument and ignore it
Just ()
Notice how we never had to perform putStrLn "???" in order to evaluate this expression.
*n.b. It is somewhat unclear at which point "desugaring" stops and "evaluation" begins. It depends on your compiler's inlining decisions. Pure computations could be evaluated entirely at compile time.

Weird return in Haskell

checkstring :: [String] -> Int -> [String]
checkstring p n = do z <- doesFileExist (p !! n)
if z
then p
else error $ "'" ++ (p !! n) ++ "' file path does not exist"
It checks for a element in the string by looking at "n"(so if n = 2 it will check if the second string in the list) then see if it exists. If it does exist it will return the original string list, if not it will error.Why does it do this? :
Couldn't match expected type `[t0]' with actual type `IO Bool'
In the return type of a call of `doesFileExist'
In a stmt of a 'do' expression: z <- doesFileExist (p !! n)
The type of doesFileExist is String -> IO Bool. If your program wants to know whether a file exists, it has to interact with the file system, which is an IO action. If you want your checkString function to do that, it will also have to have some kind of IO-based type. For example, I think something like this would work, though I haven't tried it:
checkstring :: [String] -> Int -> IO [String]
checkstring p n = do z <- doesFileExist (p !! n)
if z
then return p
else error $ "'" ++ (p !! n) ++ "' file path does not exist"
To add to what MatrixFrog has mentioned in his answer. If you look at your function signature i.e [String] -> Int -> [String] it indicates that this function is a pure function and doesn't involved any side effects, where as in your function body you are using doesFileExist which has a signature of String -> IO Bool where the presence of IO indicates it is a impure function i.e it involves some IO. In haskell there is a strict separation between impure and pure functions and as a matter of fact if your function calls some other function which is impure than your function is also impure. So in your case your function checkString needs to be impure and that can be done by making it return IO [String], which is what MatrixFrog has mentioned in his answer.
On another note, I would suggest that you can make the function to be something like:
checkString :: String -> IO (Maybe String) ,as your function doesn't need the whole list of string as it just need a specific string from the list to do its work and rather than throwing an error you can use Maybe to detect the error.
This is just a suggestion but it also depends on how your function is being used.
I think the problem is that your type signature forces the do block to assume that it is some other monad. For example, suppose you're working in the list monad. Then, you could write
myFcn :: [String] -> Int -> [String]
myFcn p n = do
return (p !! n)
In the case of the list monad, the return simply returns the singleton list, so you get behavior like,
> myFcn ["a", "bc", "d"] 1
["bc"]
(My personal opinion is that it would be very helpful if the GHC had an option to print out common mistakes that could cause a type error; I sympathize with the asker in that I've gotten a lot of type error messages that take time to figure out).

Resources