Using lines after readFile - haskell

I'm trying to do some programming in Haskell. I'm trying to read a file and then put every line in the file in a list by using the lines function. Here's the partial code:
file = "muh.rtr"
readTrack :: String -> Track
readTrack file =
  do let defFile = readFile file
     let fileLines = lines defFile
However, I keep getting this error:
Parser.hs:22:39:
Couldn't match expected type `String' with actual type `IO String'
In the first argument of `lines', namely `defFile'
In the expression: lines defFile
In an equation for `fileLines': fileLines = lines defFile
I have been searching the Internet for hours now hoping to find some answers somewhere but I've not been so lucky so far.

You probably wanted either something like this:
readTrack :: String -> IO Track
readTrack file = do defFile <- readFile file
                    let fileLines = lines defFile
                    -- etc....
...or something like this:
readTrack :: String -> IO Track
readTrack file = do fileLines <- liftM lines (readFile file) -- needs: import Control.Monad (liftM)
                    -- etc....
But what you really should do is stop, go find an introduction to the language such as Learn You a Haskell, and spend some time reading it.
Feeding code consisting entirely of very simple errors into GHC and then posting the error message on Stack Overflow is not a good way to learn.

The type of readFile is
readFile :: FilePath -> IO String
so you need to use <- to bind the result, and your function has to return IO Track.
readTrack :: String -> IO Track
readTrack file =
  do defFile <- readFile file
     let fileLines = lines defFile
     ...
I suggest reading a good tutorial on IO in Haskell, for example the Input and Output chapter of Learn You a Haskell for Great Good!.

readFile returns an IO String. That is, it is an IO computation that returns a string. This means that you need to use <- instead of let to "get" the string it's returning.
readTrack file =
  do
    defFile <- readFile file
    ...
You can use let to bind things that are not IO computations, such as the return value of lines, which is a regular list of strings.
readTrack file =
  do
    defFile <- readFile file
    let fileLines = lines defFile
    ...
Finally, you need to return the value. You might want to try something like
readTrack file =
  do
    defFile <- readFile file
    let fileLines = lines defFile
    fileLines --Doesn't actually work!
but unfortunately, since we are inside a "do" block and are trying to return a monadic computation, we need to send fileLines back into the IO monad (remember, our function returns IO [String], not [String]!):
readTrack file =
  do
    defFile <- readFile file
    let fileLines = lines defFile
    return fileLines
Note that the "return" here is not a return statement as would normally be found in most languages, and it should not be used in your pure functions.
All this might seem like a lot at first. I would suggest you stick to pure functions (without input and output / monads) until you get a better hang of the language.
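Putting the steps above together, here is a minimal sketch that compiles on its own. The question never shows the real Track type, so treating a track as its list of lines is purely an assumption for illustration.

```haskell
-- Hypothetical stand-in: the question does not show the real Track type.
type Track = [String]

-- Read a file and return its lines, wrapped in IO.
readTrack :: FilePath -> IO Track
readTrack file = do
  defFile <- readFile file        -- defFile :: String
  let fileLines = lines defFile   -- fileLines :: [String], computed purely
  return fileLines                -- wrap the pure value back up in IO
```

Loaded into GHCi, something like readTrack "muh.rtr" >>= print should print the list of lines, provided the file exists.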

You can't do it like that -- you've run into the IO monad. What you need to do is something like:
readTrack :: String -> IO Track
readTrack file = do
  defFile <- readFile file
  let fileLines = lines defFile
  ...
  return whatever
Think of IO T values as statements (as opposed to expressions) with return type T. Because statements have side effects, but expressions don't, you can never turn a statement into an expression; the type system enforces this, which is why your type signature won't work.
Note the different assignment-like syntax in the do block: in this example, the foo <- bar is used for IO operations, while the let baz = quux syntax is used for purely functional evaluation. This is more fallout from using monadic I/O -- it makes more sense in the full generality of Haskell's polymorphic type system, but it's not necessarily bad to have a syntactic indicator of pure vs. side-effecting operations, either.
In general, it is good practice to try keeping most of your implementation in the purely functional realm: implement your pure computation with regular functional methods, then describe your I/O operations in the IO monad. It is a common novice mistake to write loops in the IO monad which would be more appropriate as list comprehensions or recursive functions.

If your function is supposed to have type readTrack :: String -> Track, are you sure the String is a filename? Perhaps it's data - if so, don't use readFile. Write some sample data and test using that, eg
sampleData = "2 3\n1 30 234 45\n1 2 32 4\n5 3 4 23"
(The other question on SO about this homework didn't use file IO. I'll not link to it because you're in a crisis and might be tempted to copy, and in any case if you refuse to learn haskell at least I'll force you to improve your StackOverflow search skills! :) )
In any case I think you'll get more marks by solving the String problem than by solving the IO problem.
Delay the readFile issue until you've got the pure version working, otherwise you might end up writing most of your code in the IO monad which would be much more complex than necessary.
Once you have a pure function readTrack :: String -> Track, you can do
readTrackFrom :: FilePath -> IO Track
readTrackFrom filename = fmap readTrack (readFile filename)
Now, fmap :: Functor f => (a -> b) -> f a -> f b, so takes pure functions and lifts them to work in a different computational context like IO.
Since IO is a Functor (look it up tomorrow, not tonight), we're using it as the type (String -> Track) -> IO String -> IO Track. That's good because readTrack :: String -> Track and (readFile filename) :: IO String.
If you want to, you can then >>= print or >>= writeFile newfilename as you see fit.
Don't forget to add deriving Show after your data Track = ... declaration; you don't need it if you're using type Track = ....
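As a rough sketch of that split between a pure parser and a thin IO wrapper: the Track type below (one list of numbers per input line) is only a guess made for illustration, since the real assignment's type is not shown.

```haskell
-- Hypothetical Track: one list of numbers per input line.
-- The real assignment's Track type is not shown in the question.
type Track = [[Int]]

-- Pure parser: no IO anywhere, so it can be tested on a plain String.
-- 'read' will fail at run time on tokens that are not integers.
readTrack :: String -> Track
readTrack = map (map read . words) . lines

-- Thin IO wrapper: fmap lifts the pure parser over the result of readFile.
readTrackFrom :: FilePath -> IO Track
readTrackFrom filename = fmap readTrack (readFile filename)

sampleData :: String
sampleData = "2 3\n1 30 234 45\n1 2 32 4\n5 3 4 23"
```

In GHCi, readTrack sampleData exercises the parser without touching the file system, and readTrackFrom "muh.rtr" >>= print runs the same parser on a real file.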

Related

Write a function from IO a -> a?

Take the function getLine - it has the type:
getLine :: IO String
How do I extract the String from this IO action?
More generally, how do I convert this:
IO a
to this:
a
If this is not possible, then why can't I do it?
In Haskell, when you want to work with a value that is "trapped" in IO, you don't take the value out of IO. Instead, you put the operation you want to perform into IO, as well!
For example, suppose you want to check how many characters the getLine :: IO String will produce, using the length function from Prelude.
There exists a helper function called fmap which, when specialized to IO, has the type:
fmap :: (a -> b) -> IO a -> IO b
It takes a function that works on "pure" values not trapped in IO, and gives you a function that works with values that are trapped in IO. This means that the code
fmap length getLine :: IO Int
represents an IO action that reads a line from console and then gives you its length.
<$> is an infix synonym for fmap that can make things simpler. This is equivalent to the above code:
length <$> getLine
Now, sometimes the operation you want to perform with the IO-trapped value itself returns an IO-trapped value. Simple example: you want to write back the string you have just read using putStrLn :: String -> IO ().
In that case, fmap is not enough. You need to use the (>>=) operator, which, when specialized to IO, has the type IO a -> (a -> IO b) -> IO b. In our case:
getLine >>= putStrLn :: IO ()
Using (>>=) to chain IO actions has an imperative, sequential flavor. There is a kind of syntactic sugar called "do-notation" which helps to write sequential operations like these in a more natural way:
do line <- getLine
   putStrLn line
Notice that the <- here is not an operator, but part of the syntactic sugar provided by the do notation.
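The three styles above can be put side by side in a small sketch. The function names lineLength, echoBind and echoDo are made up for this example, and each takes the IO String action as a parameter so it can be tried with getLine or with any other action.

```haskell
-- Length of whatever line the action produces; fmap applies the pure
-- function 'length' to the result without leaving IO.
lineLength :: IO String -> IO Int
lineLength action = fmap length action

-- Echo using (>>=): feed the result of the action to putStrLn.
echoBind :: IO String -> IO ()
echoBind action = action >>= putStrLn

-- The same thing written in do-notation.
echoDo :: IO String -> IO ()
echoDo action = do
  line <- action
  putStrLn line
```

For instance, lineLength getLine >>= print reads a line from the console and prints its length.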
Not going into any details, if you're in a do block, you can (informally/inaccurately) consider <- as getting the value out of the IO.
For example, the following function takes a line from getLine, and passes it to a pure function that just takes a String
main = do
  line <- getLine
  putStrLn (wrap line)
wrap :: String -> String
wrap line = "'" ++ line ++ "'"
If you compile this as wrap, and on the command line run
echo "Hello" | wrap
you should see
'Hello'
If you know C then consider the question "How can I get the string from gets?" An IO String is not some string that's made hard to get to, it's a procedure that can return a string - like reading from a network or stdin. You want to run the procedure to obtain a string.
A common way to run IO actions in a sequence is do notation:
main = do
  someString <- getLine
  -- someString :: String
  print someString
In the above you run the getLine operation to obtain a String value then use the value however you wish.
So "generally", it's unclear why you think you need a function of this type and in this case it makes all the difference.
It should be noted for completeness that it is possible. There indeed exists a function of type IO a -> a in the base library called unsafePerformIO.
But the unsafe part is there for a reason. There are few situations where its usage would be considered justified. It's an escape hatch to be used with great caution - most of the time you will let monsters in instead of letting yourself out.
Why can't you normally go from IO a to a? Well at the very least it allows you to break the rules by having a seemingly pure function that is not pure at all - ouch! If it were a common practice to do this the type signatures and all the work done by the compiler to verify them would make no sense at all. All the correctness guarantees would go out of the window.
Haskell is, partly, interesting precisely because this is (normally) impossible.
For how to approach your getLine problem in particular see the other answers.

Defining a function in haskell

Why does my function give me an out of scope error?
tarefa1 :: [String] -> [String]
tarefa1 linhas = if res == ok then ["OK"] else [show res]
  where
    (tab,coords) = parteMapa conteudo
    erro1 = validaTabuleiro 1 tab
    erro2 = validaCoords (length tab + 1) tab coords
    res = juntaErro erro1 erro2
The error:
Not in scope: `conteudo'.
conteudo is supposed to be a .txt document that I have in a different file, but I don't know how to make it to load it in this function.
This is not really a good question, since this should be covered by basic Haskell knowledge, and it's clearly homework for those of us who can speak Portuguese. You shouldn't be afraid to ask your teacher for some help, and I'm sure he would be glad to give you that.
Nevertheless, since it is possible to answer the question, I will:
Input and Output in Haskell is only possible inside functions that evaluate an IO action (that is, a value of the IO type).
Of course, since main has type IO (), you can execute IO actions inside of it.
The simplest way to read a file is with the readFile function. It accepts a FilePath and evaluates to an IO String (which has the full contents of the file). I'll give you an example and I hope you can follow from it.
main :: IO ()
main = do
  contents <- readFile "yourfilename.txt" -- because I used "<-", contents has type String
  let fileLines = lines contents -- here I have a [String] with each line of the file
  someFunction fileLines
  return ()
someFunction should also evaluate an IO action, in this example. You can "put things inside" IO using return, in case you don't know.
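Applied to the question, one possible arrangement is to read the file in IO and pass its lines to tarefa1 as the argument, instead of referring to an out-of-scope conteudo inside the pure function. In this sketch the body of tarefa1 is a placeholder, because parteMapa, validaTabuleiro and the other helpers are not shown, and runTarefa is a name invented here.

```haskell
-- Placeholder for the real validation logic, which the question does not
-- show: it only reports how many lines it was given.
tarefa1 :: [String] -> [String]
tarefa1 linhas = ["OK: " ++ show (length linhas) ++ " lines"]

-- Read the file in IO, then hand its lines to the pure function.
runTarefa :: FilePath -> IO [String]
runTarefa path = do
  conteudo <- readFile path            -- conteudo :: String
  return (tarefa1 (lines conteudo))
```

A main along the lines of runTarefa "mapa.txt" >>= mapM_ putStrLn would then print the result, where "mapa.txt" is a made-up file name.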

Can I create a function in Haskell that will encapsulate reading data from file and returning me a simple list of data?

Consider the code below taken from a working example I've built to help me learn Haskell. This code parses a CSV file containing stock quotes downloaded from Yahoo into a nice simple list of bars with which I can then work.
My question: how can I write a function that will take a file name as its parameter and return an OHLCBarList so that the first four lines inside main can be properly encapsulated?
In other words, how can I implement (without getting all sorts of errors about IO stuff) the function whose type would be
getBarsFromFile :: Filename -> OHLCBarList
so that the grunt work that was being done in the first four lines of main can be properly encapsulated?
I've tried to do this myself but with my limited Haskell knowledge, I'm failing miserably.
import qualified Data.ByteString as BS
type Filename = String
getContentsOfFile :: Filename -> IO BS.ByteString
barParser :: Parser Bar
barParser = do
  time <- timeParser
  char ','
  open <- double
  char ','
  high <- double
  char ','
  low <- double
  char ','
  close <- double
  char ','
  volume <- decimal
  char ','
  return $ Bar Bar1Day time open high low close volume
type OHLCBar = (UTCTime, Double, Double, Double, Double)
type OHLCBarList = [OHLCBar]
barsToBarList :: [Either String Bar] -> OHLCBarList
main :: IO ()
main = do
  contents :: C.ByteString <- getContentsOfFile "PriceData/Daily/yhoo1.csv" --PriceData/Daily/Yhoo.csv"
  let lineList :: [C.ByteString] = C.lines contents -- Break the contents into a list of lines
  let bars :: [Either String Bar] = map (parseOnly barParser) lineList -- Using the attoparsec
  let ohlcBarList :: OHLCBarList = barsToBarList bars -- Now I have a nice simple list of tuples with which to work
  --- Now I can do simple operations like
  print $ ohlcBarList !! 0
If you really want your function to have type Filename -> OHLCBarList, it can't be done.* Reading the contents of a file is an IO operation, and Haskell's IO monad is specifically designed so that values in the IO monad can never leave. If this restriction were broken, it would (in general) mess with a lot of things. Instead of doing this, you have two options: make the type of getBarsFromFile be Filename -> IO OHLCBarList — thus essentially copying the first four lines of main — or write a function with type C.ByteString -> OHLCBarList that the output of getContentsOfFile can be piped through to encapsulate lines 2 through 4 of main.
* Technically, it can be done, but you really, really, really shouldn't even try, especially if you're new to Haskell.
Others have explained that the correct type of your function has to be Filename -> IO OHLCBarList, I'd like to try and give you some insight as to why the compiler imposes this draconian measure on you.
Imperative programming is all about managing state: "do certain operations to certain bits of memory in sequence". When they grow large, procedural programs become brittle; we need a way of limiting the scope of state changes. OO programs encapsulate state in classes but the paradigm is not fundamentally different: you can call the same method twice and get different results. The output of the method depends on the (hidden) state of the object.
Functional programming goes all the way and bans mutable state entirely. A Haskell function, when called with certain inputs, will always produce the same output. Simple examples of pure functions are mathematical operators like + and *, or most of the list-processing functions like map. Pure functions are all about the inputs and outputs, not managing internal state.
This allows the compiler to be very smart in optimising your program (for example, it can safely collapse duplicated code for you), and helps the programmer not to make mistakes: you can't put the system in an invalid state if there is none! We like pure functions.
The exception to the rule is IO. Code that performs IO is impure by definition: you could call getLine a hundred times and never get the same result, because it depends on what the user typed. Haskell handles this using the type system: all impure functions are marked with the IO type. IO can be thought of as a dependency on the state of the real world, sort of like World -> (NewWorld, a).
To summarise: pure functions are good because they are easy to reason about; this is why Haskell makes functions pure by default. Any impure code has to be labelled as such with an IO type signature; this tells the compiler and the reader to be careful with this function. So your function which reads from a file (a fundamentally impure action) but returns a pure value can't exist.
Addendum in response to your comment
You can still write pure functions to operate on data that was obtained impurely. Consider the following straw-man:
main :: IO ()
main = do
  putStrLn "Enter the numbers you want me to process, separated by spaces"
  line <- getLine
  let numberStrings = words line
  let numbers = map read numberStrings
  putStrLn $ "The result of the calculation is " ++ (show $ foldr1 (*) numbers + 10)
Lots of code inside IO here. Let's extract some functions:
import Prelude hiding (product) -- avoid a clash with the product defined below
main :: IO ()
main = do
  putStrLn "Enter the numbers you want me to process, separated by spaces"
  result <- fmap processLine getLine -- fmap :: (a -> b) -> IO a -> IO b
                                     -- runs an impure result through a pure function
                                     -- without leaving IO
  putStrLn $ "The result of the calculation is " ++ result
processLine :: String -> String -- look ma, no IO!
processLine = show . calculate . readNumbers
readNumbers :: String -> [Int]
readNumbers = map read . words
calculate :: [Int] -> Int
calculate numbers = product numbers + 10
product :: [Int] -> Int
product = foldr1 (*)
I've pulled logic out of main into pure functions which are easier to read, easier for the compiler to optimise, and more reusable (and so more testable). The program as a whole still lives inside IO because the data is obtained impurely (see the last part of this answer for a more thorough treatment of this argument). Impure data can be piped through pure functions using fmap and other combinators; you should try to put as little logic in main as possible.
Your code does seem to be most of the way there; as others have suggested you could extract lines 2-4 of your main into another function.
In other words, how can I implement (without getting all sorts of errors about IO stuff) the function whose type would be
getBarsFromFile :: Filename -> OHLCBarList
so that the grunt work that was being done in the first four lines of main can be properly encapsulated?
You cannot do this without getting all sorts of errors about IO stuff because this type for getBarsFromFile misses an IO. Probably that's what the errors about IO stuff are trying to tell you. Did you try understanding and fixing the errors?
In your situation, I would start by abstracting over the second to fourth line of your main in a function:
parseBars :: ByteString -> OHLCBarList
And then I would combine this function with getContentsOfFile to get:
getBarsFromFile :: FilePath -> IO OHLCBarList
This I would call in main.
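A rough sketch of that split, using only base so it can be compiled on its own. The types here are simplified stand-ins, not the question's: the date is kept as a String, parsing uses read instead of the attoparsec parser, and splitOnComma is a helper invented for this example.

```haskell
-- Simplified, hypothetical types: the question's Bar, timeParser and
-- attoparsec parser are not shown, so the date stays a plain String.
type OHLCBar = (String, Double, Double, Double, Double)
type OHLCBarList = [OHLCBar]

-- Split a line into fields at each comma.
splitOnComma :: String -> [String]
splitOnComma s = case break (== ',') s of
  (field, [])       -> [field]
  (field, _ : rest) -> field : splitOnComma rest

-- Pure part, playing the role of lines 2 to 4 of the question's main.
-- Lines with fewer than five fields are skipped; 'read' still fails at
-- run time if a price field is not a valid number.
parseBars :: String -> OHLCBarList
parseBars contents =
  [ (t, read o, read h, read l, read c)
  | line <- lines contents
  , (t : o : h : l : c : _) <- [splitOnComma line]
  ]

-- Impure part: the only place where IO appears.
getBarsFromFile :: FilePath -> IO OHLCBarList
getBarsFromFile path = fmap parseBars (readFile path)
```

With this shape, main shrinks to something like getBarsFromFile "PriceData/Daily/yhoo1.csv" >>= print . head, while parseBars can be tested on string literals.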

How can I parse the IO String in Haskell?

I've got a problem with Haskell. I have a text file looking like this:
5.
7.
[(1,2,3),(4,5,6),(7,8,9),(10,11,12)].
I haven't any idea how I can get the first 2 numbers (5 and 7 above) and the list from the last line. There are dots at the end of each line.
I tried to build a parser, but the function called 'readFile' returns the Monad called IO String. I don't know how I can get information from that type of string.
I prefer to work on an array of chars. Maybe there is a function which can convert from 'IO String' to [Char]?
I think you have a fundamental misunderstanding about IO in Haskell. Particularly, you say this:
Maybe there is a function which can convert from 'IO String' to [Char]?
No, there isn't [1], and the fact that there is no such function is one of the most important things about Haskell.
Haskell is a very principled language. It tries to maintain a distinction between "pure" functions (which don't have any side-effects, and always return the same result when given the same input) and "impure" functions (which have side effects like reading from files, printing to the screen, writing to disk etc). The rules are:
You can use a pure function anywhere (in other pure functions, or in impure functions)
You can only use impure functions inside other impure functions.
The way that code is marked as pure or impure is using the type system. When you see a function signature like
digitToInt :: Char -> Int
you know that this function is pure. If you give it a Char it will return an Int and moreover it will always return the same Int if you give it the same Char. On the other hand, a function signature like
getLine :: IO String
is impure, because the return type of String is marked with IO. Obviously getLine (which reads a line of user input) will not always return the same String, because it depends on what the user types in. You can't use this function in pure code, because adding even the smallest bit of impurity will pollute the pure code. Once you go IO you can never go back.
You can think of IO as a wrapper. When you see a particular type, for example, x :: IO String, you should interpret that to mean "x is an action that, when performed, does some arbitrary I/O and then returns something of type String" (note that in Haskell, String and [Char] are exactly the same thing).
So how do you ever get access to the values from an IO action? Fortunately, the type of the function main is IO () (it's an action that does some I/O and returns (), which is the same as returning nothing). So you can always use your IO functions inside main. When you execute a Haskell program, what you are doing is running the main function, which causes all the I/O in the program definition to actually be executed - for example, you can read and write from files, ask the user for input, write to stdout etc etc.
You can think of structuring a Haskell program like this:
All code that does I/O gets the IO tag (basically, you put it in a do block)
Code that doesn't need to perform I/O doesn't need to be in a do block - these are the "pure" functions.
Your main function sequences together the I/O actions you've defined in an order that makes the program do what you want it to do (interspersed with the pure functions wherever you like).
When you run main, you cause all of those I/O actions to be executed.
So, given all that, how do you write your program? Well, the function
readFile :: FilePath -> IO String
reads a file as a String. So we can use that to get the contents of the file. The function
lines :: String -> [String]
splits a String on newlines, so now you have a list of Strings, each corresponding to one line of the file. The function
init :: [a] -> [a]
drops the last element from a list (this will get rid of the final . on each line). The function
read :: (Read a) => String -> a
takes a String and turns it into an arbitrary Haskell data type, such as Int or Bool. Combining these functions sensibly will give you your program.
Note that the only time you actually need to do any I/O is when you are reading the file. Therefore that is the only part of the program that needs to use the IO tag. The rest of the program can be written "purely".
It sounds like what you need is the article The IO Monad For People Who Simply Don't Care, which should explain a lot of your questions. Don't be scared by the term "monad" - you don't need to understand what a monad is to write Haskell programs (notice that this paragraph is the only one in my answer that uses the word "monad", although admittedly I have used it four times now...)
Here's the program that (I think) you want to write
run :: IO (Int, Int, [(Int,Int,Int)])
run = do
  contents <- readFile "text.txt" -- use '<-' here so that 'contents' is a String
  let [a,b,c] = lines contents -- split on newlines
  let firstLine = read (init a) -- 'init' drops the trailing period
  let secondLine = read (init b)
  let thirdLine = read (init c) -- this reads a list of Int-tuples
  return (firstLine, secondLine, thirdLine)
To answer npfedwards' comment about applying lines to the output of readFile "text.txt", you need to realize that readFile "text.txt" gives you an IO String, and it's only when you bind it to a variable (using contents <-) that you get access to the underlying String, so that you can apply lines to it.
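One variation on the program above, sketched under the assumption that the file always has exactly three lines ending in a period: the parsing moves into a pure function, so the only IO left is the call to readFile. The name parseInput and the choice to return Maybe are assumptions made for this example.

```haskell
-- Pure parser for the three-line format from the question.
-- Returns Nothing unless there are exactly three lines. 'init' and 'read'
-- can still fail at run time on an empty or malformed line.
parseInput :: String -> Maybe (Int, Int, [(Int, Int, Int)])
parseInput contents = case lines contents of
  [a, b, c] -> Just (read (init a), read (init b), read (init c))
  _         -> Nothing

-- The only IO: read the file, then apply the pure parser with fmap.
run :: FilePath -> IO (Maybe (Int, Int, [(Int, Int, Int)]))
run path = fmap parseInput (readFile path)
```

Because parseInput has no IO in its type, it can be checked in GHCi on a string literal before any file is involved.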
Remember: once you go IO, you never go back.
[1] I am deliberately ignoring unsafePerformIO because, as implied by the name, it is very unsafe! Don't ever use it unless you really know what you are doing.
As a programming noob, I too was confused by IOs. Just remember that if you go IO you never come out. Chris wrote a great explanation on why. I just thought it might help to give some examples on how to use IO String in a monad. I'll use getLine which reads user input and returns an IO String.
line <- getLine
All this does is bind the user input from getLine to a value named line. If you type this in ghci, and type :type line it will return:
:type line
line :: String
But wait! getLine returns an IO String
:type getLine
getLine :: IO String
So what happened to the IOness from getLine? <- is what happened. <- is your IO friend. It allows you to bring out the value that is tainted by the IO within a monad and use it with your normal functions. Blocks of monadic code are easily identified because they begin with do. Like so:
main = do
  putStrLn "How much do you love Haskell?"
  amount <- getLine
  putStrLn ("You love Haskell this much: " ++ amount)
If you're like me, you'll soon discover that liftIO is your next best monad friend, and that $ helps reduce the number of parentheses you need to write.
So how do you get the information from readFile? Well if readFile's output is IO String like so:
:type readFile
readFile :: FilePath -> IO String
Then all you need is your friendly <-:
yourdata <- readFile "samplefile.txt"
Now if you type that in ghci and check the type of yourdata you'll notice it's a simple String.
:type yourdata
yourdata :: String
As people have already said, if you have two functions, one is readStringFromFile :: FilePath -> IO String, and another is doTheRightThingWithString :: String -> Something, then you don't really need to escape a string from IO, since you can combine these two functions in various ways (in the snippets below, readStringFromFile stands for the action already applied to a path, so it has type IO String):
With fmap for IO (IO is Functor):
fmap doTheRightThingWithString readStringFromFile
With (<$>) for IO (IO is Applicative and (<$>) == fmap):
import Control.Applicative
...
doTheRightThingWithString <$> readStringFromFile
With liftM for IO (liftM == fmap):
import Control.Monad
...
liftM doTheRightThingWithString readStringFromFile
With (>>=) for IO (IO is Monad, fmap == (<$>) == liftM == \f m -> m >>= return . f):
readStringFromFile >>= \string -> return (doTheRightThingWithString string)
readStringFromFile >>= \string -> return $ doTheRightThingWithString string
readStringFromFile >>= return . doTheRightThingWithString
return . doTheRightThingWithString =<< readStringFromFile
With do notation:
do
  ...
  string <- readStringFromFile
  -- ^ you escape String from IO but only inside this do-block
  let result = doTheRightThingWithString string
  ...
  return result
Every time you will get IO Something.
Why would you want to do it like that? Well, with this you will have pure and referentially transparent programs (functions) in your language. This means that every function whose type is IO-free is pure and referentially transparent, so that for the same arguments it will return the same values. For example, doTheRightThingWithString would return the same Something for the same String. However, readStringFromFile, which is not IO-free, can return different strings every time (because the file can change), so you can't escape such an impure value from IO.
If you have a parser of this type:
myParser :: String -> Foo
and you read the file using
readFile "thisfile.txt"
then you can read and parse the file using
fmap myParser (readFile "thisfile.txt")
The result of that will have type IO Foo.
The fmap means myParser runs "inside" the IO.
Another way to think of it is that whereas myParser :: String -> Foo, fmap myParser :: IO String -> IO Foo.

How to create a Haskell function that turns IO String into IO [String]?

I've started to learn Haskell and feeling overwhelmed with it. I'm now trying to create a function that either returns a string from standard input or from the contents of a list of files.
In other words, I'm trying to replicate the behavior of Unix wc utility which takes input from stdin when no files are given.
I've created something like this:
parseArgs [] = [getContents]
parseArgs fs = mapM readFile fs
But it doesn't compile since in one case I have [IO String] and in the other IO [String]. I can't make this pattern matching return IO [String] in all cases. Please point me in the right direction.
To make the first pattern also IO [String], you have to unpack the value from inside the list first and then repack it. Something like this:
do c <- getContents
   return [c]
In normal monadic notation:
getContents >>= \c -> return [c]
In a case like this, it's usually better to use a functor instead of a monad. Then you can avoid the return:
fmap (:[]) getContents
(:[]) has the same meaning as \x -> [x], it creates a singleton list.
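With that change, the function from the question might look like the sketch below. The type signature is an addition made here; the question leaves it out.

```haskell
-- Return the contents of every named file, or the contents of stdin as a
-- one-element list when no files are given. Both branches now have the
-- same type, IO [String].
parseArgs :: [FilePath] -> IO [String]
parseArgs [] = fmap (:[]) getContents
parseArgs fs = mapM readFile fs
```

A wc-like program could then get its file names from System.Environment.getArgs and run getArgs >>= parseArgs >>= mapM_ (print . length . lines).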

Resources