Parsec: grabbing raw source after parsing

I have a strange whim. Suppose I have something like this:
data Statement = StatementType Stuff Source
Now I want to parse such a statement, parse all the stuff, and afterwards put all the characters that I've processed (for this particular statement) into the resulting data structure. For some reason.
Is it possible, and if yes, how to accomplish that?

In general this is not possible. parsec does not expect much from its stream type; in particular, there is no way to efficiently split a stream.
But for a concrete stream type (e.g. String, or [a], or ByteString) a hack like this would work:
parseWithSource :: Parsec [c] u a -> Parsec [c] u ([c], a)
parseWithSource p = do
  input <- getInput
  a <- p
  input' <- getInput
  return (take (length input - length input') input, a)
This solution relies on the function getInput, which returns the current input. We read the input twice, before and after parsing; the difference in length gives the exact number of consumed elements, and we can then take that many elements from the original input.
Here you can see it in action:
*Main Text.Parsec> parseTest (between (char 'x') (char 'x') (parseWithSource ((read :: String -> Int) `fmap` many1 digit))) "x1234x"
("1234",1234)
But you should also look into attoparsec, as it properly supports this functionality with the match function.
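As a rough sketch of how this helper could fill the Source field from the question: only parseWithSource comes from the answer above. The Assign constructor, the assignment parser and its tiny "name = number" grammar are made up for illustration, and String is assumed as the stream type.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical concrete version of the question's data type:
-- variable name, value, and the raw source text that was consumed.
data Statement = Assign String Int String
  deriving (Eq, Show)

-- Helper from the answer: run p and also return the input it consumed.
parseWithSource :: Parsec [c] u a -> Parsec [c] u ([c], a)
parseWithSource p = do
  input  <- getInput
  a      <- p
  input' <- getInput
  return (take (length input - length input') input, a)

-- Illustrative statement parser that stores its own source text.
assignment :: Parser Statement
assignment = do
  (src, (name, val)) <- parseWithSource $ do
    n <- many1 letter
    spaces *> char '=' *> spaces
    v <- read <$> many1 digit
    return (n, v)
  return (Assign name val src)

demo :: Either ParseError Statement
demo = parse assignment "" "x = 42; rest"
```

Evaluating demo in GHCi should show that only the text of the statement itself ends up in the last field. Note that length walks the whole remaining input on every call, so this stays a hack for small inputs.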

Related

Haskell: processing a sequence of IO actions, filtering their results in real time and performing some IO actions at certain moments

I want to process an infinite sequence of IO actions, filtering their results in real time and performing some IO actions at certain moments.
We have some function for reducing sequences (see my question haskell elegant way to filter (reduce) sequences of duplicates from infinte list of numbers):
f :: Eq a => [a] -> [a]
f = map head . group
and expression
join $ sequence <$> ((\l -> (print <$> l)) <$> (f <$> (sequence $ replicate 6 getLine)))
If we run this, the user can enter any sequence of numbers, for example:
1
2
2
3
3
"1"
"2"
"3"
[(),(),()]
This means that all the getLine actions are performed first (6 times in the example), and only at the end are the IO actions for the filtered list performed. But I want each IO action to run at the moment the reduction of a subsequence of equal numbers is complete.
How can I achieve this output:
1
2
"1"
2
3
"2"
3
3
"3"
[(),(),()]
So I want this expression not to hang:
join $ sequence <$> ((\l -> (print <$> l)) <$> (f <$> (sequence $ repeat getLine)))
How can I achieve real-time output as described above without blocking on infinite lists?

Without a 3rd-party library, you can lazily read the contents of standard input, appending a dummy string to the end of the expected input to force output. (There's probably a better solution that I'm stupidly overlooking.)
import System.IO
print_unique :: (String, String) -> IO ()
print_unique (last, current)
  | last == current = return ()
  | otherwise       = print last
main = do
  contents <- take 6 <$> lines <$> hGetContents stdin
  traverse print_unique (zip <*> tail $ (contents ++ [""]))
zip <*> tail produces tuples consisting of the ith and i+1st lines without blocking. print_unique then immediately outputs a line if the following line is different.
Essentially, you are sequencing the output actions as the input is executed, rather than sequencing the input actions.
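The pairing trick can be tried on plain lists first. The sketch below is a pure illustration of zip <*> tail and of the sentinel idea; the names pairs and runEnds are invented here and do not appear in the answer.

```haskell
-- Pair each element with its successor. With the function Applicative,
-- (zip <*> tail) xs is the same as zip xs (tail xs).
pairs :: [a] -> [(a, a)]
pairs = zip <*> tail

-- Keep the last element of every run of equal values. The "" sentinel
-- appended at the end flushes the final run, as in the answer's main.
runEnds :: [String] -> [String]
runEnds xs = [prev | (prev, next) <- pairs (xs ++ [""]), prev /= next]
```

Because zip and the list comprehension are lazy, each element of the result only depends on one element of lookahead, which is why the IO version can print while input is still arriving. One caveat of the sentinel: a final run of empty strings would not be flushed.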

This seems like a job for a streaming library, like streaming.
{-# LANGUAGE ImportQualifiedPost #-}
module Main where
import Streaming
import Streaming.Prelude qualified as S
main :: IO ()
main =
  S.mapM_ print
    . S.catMaybes
    . S.mapped S.head
    . S.group
    $ S.replicateM 6 getLine
"streaming" has an API reminiscent of that of lists, but works with effectful sequences.
The nice thing about streaming's version of group is that it doesn't force you to keep the whole group in memory if it isn't needed.
The least intuitive function in this answer is mapped, because it's very general. It's not obvious that streaming's version of head fits as its parameter. The key idea is that the Stream type can represent both normal effectful sequences, and sequences of elements on which groups have been demarcated. This is controlled by changing a functor type parameter (Of in the first case, a nested Stream (Of a) m in the case of grouped Streams).
mapped lets you transform that functor parameter while having some effect in the underlying monad (here IO). head processes the inner Stream (Of a) m groups, getting us back to an Of (Maybe a) functor parameter.

I found a nice solution with iterateUntilM
iterateUntilM (\_->False) (\pn -> getLine >>= (\n -> if n==pn then return n else (if pn/="" then print pn else return ()) >> return n) ) ""
I don't like the verbosity of
(if pn/="" then print pn else return ())
If you know how to shorten this, please comment.
ps.
It is noteworthy that I made a video about this function :)
And could not immediately apply it :(
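One way the conditional might be shortened is when from Control.Monad. The sketch below is an assumption-laden rewrite, not the author's code: it uses explicit recursion instead of iterateUntilM so that it needs only base, and the names shouldPrint and loop are invented for illustration.

```haskell
import Control.Monad (when)

-- Pure decision: print the previous line only when a different line
-- arrives and the previous one is not the initial empty placeholder.
shouldPrint :: String -> String -> Bool
shouldPrint prev new = new /= prev && prev /= ""

-- Read lines forever, printing each value once its run has ended.
loop :: String -> IO ()
loop prev = do
  new <- getLine
  when (shouldPrint prev new) (print prev)
  loop new
```

Starting it with loop "" mirrors the "" seed passed to iterateUntilM above. Like the original, it never terminates on its own and will stop with an exception at end of input.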

How to parse a float number input in Haskell?

The problem is that I need to read a decimal number, like a float, in the right format.
However, I don't know how to parse the input to make sure it's really a float. If it isn't, I need to putStrLn "ERR". Assume the input keeps coming, one line after another.
In the example below, what condition can I add after the if to reject badly formatted input such as 1.2.e!##$? For such input I should print "ERR" and loop back to main, rather than get an error that exits the program immediately.
input <- getLine
if (read input :: Float) > 1.0
  then do
    let result1 = upperbound (read input :: Float)
    let result2 = lowerbound (read input :: Float)
    print result1
    print result2
    main
  else do
    putStrLn "ERR"
    main

read is a partial function - it works only on a subset of the input domain. A better example for a partial function is head: it works well on non-empty lists, but will throw an error on an empty list - and you can only handle errors when in the IO monad. Partial functions are useful in some cases, but you should generally avoid using them. So like head, read is an unsafe function - it may fail when the input cannot be parsed.
read has a safe alternative: readMaybe from Text.Read.
readMaybe :: Read a => String -> Maybe a
readMaybe will never fail - if it can't parse a string, it will return Nothing. Handling a Maybe value is a simple task and can be done in several ways (case expressions, Data.Maybe functions, do notation and so on). Here's an example using a case expression:
import Text.Read
...
case (readMaybe input :: Maybe Float) of
  Just f | f > 1.0   -> ...
         | otherwise -> ...
  Nothing -> ...
This article can be helpful in understanding the different ways of error handling in Haskell.
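Putting the pieces together, a minimal sketch of the input loop might look like this. The names validate and loop are invented here, and print f stands in for the question's upperbound and lowerbound calls, whose definitions are not shown.

```haskell
import Text.Read (readMaybe)

-- Parse the line and apply the question's "> 1.0" check in one place.
-- Anything that fails to parse, or is not greater than 1.0, is rejected.
validate :: String -> Maybe Float
validate input = case readMaybe input :: Maybe Float of
  Just f | f > 1.0 -> Just f
  _                -> Nothing

-- Keep reading lines; print ERR for rejected input instead of crashing.
loop :: IO ()
loop = do
  input <- getLine
  case validate input of
    Just f  -> print f          -- placeholder for the real computation
    Nothing -> putStrLn "ERR"
  loop
```

Keeping the validation in a pure function makes it easy to try in GHCi, for example validate "1.2.e!##$" or validate "2.5", without running the loop.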

Prelude> let s1 = "1.223"
Prelude> let s2 = "1"
Prelude> let s3 = "1.2.e!##$"
Prelude> read s1 :: Float
1.223
Prelude> read s2 :: Float
1.0
Prelude> read s3 :: Float
*** Exception: Prelude.read: no parse
read throws an exception when it can't parse the string. You need to handle that exception.
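If you do want to stay with read, one possible way to handle the exception is sketched below. It assumes the failure surfaces as an ErrorCall (which is what read's call to error produces) and uses evaluate to force the value inside IO so that try can catch it. The name readFloatIO is made up for this example.

```haskell
import Control.Exception (ErrorCall, evaluate, try)

-- Run read inside IO and turn a "no parse" exception into Nothing.
readFloatIO :: String -> IO (Maybe Float)
readFloatIO s = do
  r <- try (evaluate (read s :: Float)) :: IO (Either ErrorCall Float)
  return (either (const Nothing) Just r)
```

This works, but it is more roundabout than a parser that reports failure in its return type, since the pure expression has to be forced at exactly the right place.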

Trying to simplify the checking of an IO Bool in an Attoparsec parser

I'm trying to simplify the below code that's part of an attoparsec parser for a network packet, and I'm at a loss for a nice way to do it.
It starts with a call to atEnd :: IO Bool to determine if there's more to parse. I can't find a nicer way to use atEnd than to unwrap it from IO and then use it in an if statement, but it seems like there must be a simpler way to check a bool inside a monad. Here's the code:
maybePayload :: Parser (Maybe A.Value)
maybePayload = do
  e <- atEnd
  if e
    then return Nothing
    else do
      payload' <- char ':' *> takeByteString
      maybe mzero (return . Just) (maybeResult $ parse A.json payload')
The intention is to return Nothing if there is no payload, to return Just A.Value if there is a valid JSON payload, and for the parser to fail if there is a non-valid payload.
Here's the Packet record that eventually gets created:
data Packet = Packet
  { pID :: Integer
  , pEndpoint :: String
  , pPayload :: Maybe A.Value
  }

You're doing a lot of work you don't need to do. First you check if you're at the end of the data and return Nothing if that doesn't work out. That's just not necessary, because if you're at the end, any parser that requires content will fail, and using maybeResult will turn that failure into Nothing.
The only time your parser fails is with the case where the input has data which doesn't start with the character :, the rest of the time it succeeds, even if that's by returning Nothing.
The only actual parsing going on is checking for : then using A.json. I think you're trying to write the whole program inside one parser, whereas you should just do the parsing on its own then call that as necessary. There's no need to check for end of data, or to make sure you get the whole content - that's all built in for free in a parser. Once you get rid of all that unnecessary checking, you get
payload :: Parser A.Value
payload = char ':' *> A.json
If you want to, you can use that as maybeResult $ parse payload input to get a Maybe A.Value that's not additionally wrapped in a Parser. If you don't apply maybeResult, you can pattern match on the Result returned to deal separately with failure (Fail), partial input (Partial) and success (Done).
Edit: OK, clearer now, thanks:
(If there's a colon followed by invalid json, fail)
If there's a colon followed by valid json, succeed, wrapping it in Just
If there's just end of input, succeed, returning Nothing
So we get:
maybePayload :: Parser (Maybe A.Value)
maybePayload = char ':' *> (Just <$> A.json)
           <|> (Nothing <$ endOfInput)
I've used <$> and <$ from Control.Applicative, or if you prefer, from Data.Functor.
<$> is an infix version of fmap, so Just <$> A.json does A.json and wraps any output in Just.
<$ is fmap.const so replaces the () from endOfInput with Nothing.
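The same shape can be tried without attoparsec or aeson. The sketch below is only an analogue: it uses ReadP from base, and a run of digits stands in for the JSON payload. The names maybeNumber and run are invented here. One difference to keep in mind is that ReadP's <|> explores both alternatives, whereas attoparsec's is left-biased; the two branches below never both succeed, so the result is the same.

```haskell
import Control.Applicative ((<|>))
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP (ReadP, char, eof, munch1, readP_to_S)

-- A colon followed by digits gives Just the number;
-- end of input alone gives Nothing; anything else fails.
maybeNumber :: ReadP (Maybe Int)
maybeNumber = (char ':' *> (Just . read <$> munch1 isDigit) <* eof)
          <|> (Nothing <$ eof)

-- Collect the results of parses that consumed the whole input.
run :: String -> [Maybe Int]
run s = [r | (r, "") <- readP_to_S maybeNumber s]
```

An empty list from run means the parser failed, which corresponds to the "colon followed by invalid payload" case in the list above.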

Why do you need to encode failure in Maybe when the parser monad already has a built-in notion of failure? The problem with using Maybe in this way is that the parser cannot backtrack.
You could try something like this (I haven't tried to typecheck it) and then use option in the caller:
payload :: Parser A.Value
payload = do
  payload' <- char ':' *> takeByteString
  -- parseOnly (from attoparsec) returns Either String a
  case parseOnly A.json payload' of
    Left msg -> fail msg
    Right a  -> return a

How to translate this python to Haskell?

I'm learning Haskell and as an exercise I'm trying to convert the read_from function in the following code to Haskell. It is taken from Peter Norvig's Scheme interpreter.
Is there a straightforward way to do this?
def read(s):
    "Read a Scheme expression from a string."
    return read_from(tokenize(s))
parse = read
def tokenize(s):
    "Convert a string into a list of tokens."
    return s.replace('(',' ( ').replace(')',' ) ').split()
def read_from(tokens):
    "Read an expression from a sequence of tokens."
    if len(tokens) == 0:
        raise SyntaxError('unexpected EOF while reading')
    token = tokens.pop(0)
    if '(' == token:
        L = []
        while tokens[0] != ')':
            L.append(read_from(tokens))
        tokens.pop(0) # pop off ')'
        return L
    elif ')' == token:
        raise SyntaxError('unexpected )')
    else:
        return atom(token)
def atom(token):
    "Numbers become numbers; every other token is a symbol."
    try: return int(token)
    except ValueError:
        try: return float(token)
        except ValueError:
            return Symbol(token)

There is a straightforward way to "transliterate" Python into Haskell. This can be done by clever usage of monad transformers, which sounds scary, but it's really not. You see, due to purity, in Haskell when you want to use effects such as mutable state (e.g. the append and pop operations are performing mutation) or exceptions, you have to make it a little more explicit. Let's start at the top.
parse :: String -> SchemeExpr
parse s = readFrom (tokenize s)
The Python docstring said "Read a Scheme expression from a string", so I just took the liberty of encoding this as the type signature (String -> SchemeExpr). That docstring becomes obsolete because the type conveys the same information. Now... what is a SchemeExpr? According to your code, a scheme expression can be an int, float, symbol, or list of scheme expressions. Let's create a data type that represents these options.
data SchemeExpr
  = SInt Int
  | SFloat Float
  | SSymbol String
  | SList [SchemeExpr]
  deriving (Eq, Show)
In order to tell Haskell that the Int we are dealing with should be treated as a SchemeExpr, we need to tag it with SInt. Likewise with the other possibilities. Let's move on to tokenize.
tokenize :: String -> [Token]
Again, the docstring turns into a type signature: turn a String into a list of Tokens. Well, what's a Token? If you look at the code, you'll notice that the left and right paren characters are apparently special tokens, which signal particular behaviors. Anything else is... unspecial. While we could create a data type to more clearly distinguish parens from other tokens, let's just use Strings, to stick a little closer to the original Python code.
type Token = String
Now let's try writing tokenize. First, let's write a quick little operator for making function chaining look a bit more like Python. In Haskell, you can define your own operators.
(|>) :: a -> (a -> b) -> b
x |> f = f x
tokenize s = s |> replace "(" " ( "
               |> replace ")" " ) "
               |> words
words is Haskell's version of split. However, Haskell has no pre-cooked version of replace that I know of. Here's one that should do the trick:
-- add imports to top of file
import Data.List.Split (splitOn)
import Data.List (intercalate)
replace :: String -> String -> String -> String
replace old new s = s |> splitOn old
                      |> intercalate new
If you read the docs for splitOn and intercalate, this simple algorithm should make perfect sense. Haskellers would typically write this as replace old new = intercalate new . splitOn old, but I used |> here for easier Python audience understanding.
Note that replace takes three arguments, but above I only invoked it with two. In Haskell you can partially apply any function, which is pretty neat. |> works sort of like the unix pipe, if you couldn't tell, except with more type safety.
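If you would rather not depend on the split package, the tokenizer can also be sketched with base alone. This is an alternative to the replace-based version above, not the answer's code: it pads each paren with spaces character by character and then splits on whitespace. The helper name pad is made up.

```haskell
-- Surround every paren with spaces, then split on whitespace.
tokenize :: String -> [String]
tokenize = words . concatMap pad
  where
    pad c
      | c `elem` "()" = [' ', c, ' ']
      | otherwise     = [c]
```

For example, tokenize "(+ 1 (* 2 3))" yields each paren, operator and number as its own token, just like the Python one-liner.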
Still with me? Let's skip over to atom. That nested logic is a bit ugly, so let's try a slightly different approach to clean it up. We'll use the Either type for a much nicer presentation.
atom :: Token -> SchemeExpr
atom s = Left s |> tryReadInto SInt
                |> tryReadInto SFloat
                |> orElse (SSymbol s)
Haskell doesn't have the automagical coercion functions int and float, so instead we will build tryReadInto. Here's how it works: we're going to thread Either values around. An Either value is either a Left or a Right. Conventionally, Left is used to signal error or failure, while Right signals success or completion. In Haskell, to simulate the Python-esque function call chaining, you just place the "self" argument as the last one.
-- readMay comes from the safe package: import Safe (readMay)
tryReadInto :: Read a => (a -> b) -> Either String b -> Either String b
tryReadInto f (Right x) = Right x
tryReadInto f (Left s) = case readMay s of
  Just x -> Right (f x)
  Nothing -> Left s
orElse :: a -> Either err a -> a
orElse a (Left _) = a
orElse _ (Right a) = a
tryReadInto relies on type inference in order to determine which type it is trying to parse the string into. If the parse fails, it simply reproduces the same string in the Left position. If it succeeds, then it performs whatever function is desired and places the result in the Right position. orElse allows us to eliminate the Either by supplying a value in case the former computations failed. Can you see how Either acts as a replacement for exceptions here? Since the ValueErrors in the Python code are always caught inside the function itself, we know that atom will never raise an exception. Similarly, in the Haskell code, even though we used Either on the inside of the function, the interface that we expose is pure: Token -> SchemeExpr, no outwardly-visible side effects.
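For comparison, the same "try int, then float, then symbol" logic can be sketched with Maybe and readMaybe from base instead of Either and readMay. This is a swapped-in variant, not the answer's definition; the helper names asInt and asFloat are invented, and SchemeExpr is repeated so the block stands alone.

```haskell
import Control.Applicative ((<|>))
import Data.Maybe (fromMaybe)
import Text.Read (readMaybe)

data SchemeExpr
  = SInt Int
  | SFloat Float
  | SSymbol String
  | SList [SchemeExpr]
  deriving (Eq, Show)

-- Try each reading in order; <|> on Maybe keeps the first success.
atom :: String -> SchemeExpr
atom s = fromMaybe (SSymbol s) (asInt <|> asFloat)
  where
    asInt   = SInt   <$> readMaybe s
    asFloat = SFloat <$> readMaybe s
```

The order matters: a token like 42 also reads as a Float, so the Int attempt has to come first, exactly as in the Python try/except nesting.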
OK, let's move on to read_from. First, ask yourself the question: what side effects does this function have? It mutates its argument tokens via pop, and it has internal mutation on the list named L. It also raises the SyntaxError exception. At this point, most Haskellers will be throwing up their hands saying "oh noes! side effects! gross!" But the truth is that Haskellers use side effects all the time as well. We just call them "monads" in order to scare people away and avoid success at all costs. Mutation can be accomplished with the State monad, and exceptions with the Either monad (surprise!). We will want to use both at the same time, so we'll in fact use "monad transformers", which I'll explain in a bit. It's not that scary, once you learn to see past the cruft.
First, a few utilities. These are just some simple plumbing operations. raise will let us "raise exceptions" as in Python, and whileM will let us write a while loop as in Python. For the latter, we simply have to make it explicit in what order the effects should happen: first perform the effect to compute the condition, then if it's True, perform the effects of the body and loop again.
import Control.Monad.Trans.State
import Control.Monad.Trans.Class (lift)
raise = lift . Left
whileM :: Monad m => m Bool -> m () -> m ()
whileM mb m = do
  b <- mb
  if b
    then m >> whileM mb m
    else return ()
We again want to expose a pure interface. However, there is a chance that there will be a SyntaxError, so we will indicate in the type signature that the result will be either a SchemeExpr or a SyntaxError. This is reminiscent of how in Java you can annotate which exceptions a method will raise. Note that the type signature of parse has to change as well, since it might raise the SyntaxError.
data SyntaxError = SyntaxError String
  deriving (Show)
parse :: String -> Either SyntaxError SchemeExpr
readFrom :: [Token] -> Either SyntaxError SchemeExpr
readFrom = evalStateT readFrom'
We are going to perform a stateful computation on the token list that is passed in. Unlike the Python, however, we are not going to be rude to the caller and mutate the very list passed to us. Instead, we will establish our own state space and initialize it to the token list we are given. We will use do notation, which provides syntactic sugar to make it look like we're programming imperatively. The StateT monad transformer gives us the get, put, and modify state operations.
readFrom' :: StateT [Token] (Either SyntaxError) SchemeExpr
readFrom' = do
  tokens <- get
  case tokens of
    [] -> raise (SyntaxError "unexpected EOF while reading")
    (token:tokens') -> do
      put tokens' -- here we overwrite the state with the "rest" of the tokens
      case token of
        "(" -> (SList . reverse) `fmap` execStateT readWithList []
        ")" -> raise (SyntaxError "unexpected close paren")
        _   -> return (atom token)
I've broken out the readWithList portion into a separate chunk of code,
because I want you to see the type signature. This portion of code introduces
a new scope, so we simply layer another StateT on top of the monad stack
that we had before. Now, the get, put, and modify operations refer
to the thing called L in the Python code. If we want to perform these operations
on the tokens, then we can simply preface the operation with lift in order
to strip away one layer of the monad stack.
readWithList :: StateT [SchemeExpr] (StateT [Token] (Either SyntaxError)) ()
readWithList = do
  whileM ((\toks -> toks !! 0 /= ")") `fmap` lift get) $ do
    innerExpr <- lift readFrom'
    modify (innerExpr:)
  lift $ modify (drop 1) -- pop off ")", like tokens.pop(0) in the Python
In Haskell, appending to the end of a list is inefficient, so I instead prepended, and then reversed the list afterwards. If you are interested in performance, then there are better list-like data structures you can use.
Here is the complete file: http://hpaste.org/77852
So if you're new to Haskell, then this probably looks terrifying. My advice is to just give it some time. The Monad abstraction is not nearly as scary as people make it out to be. You just have to learn that what most languages have baked in (mutation, exceptions, etc), Haskell instead provides via libraries. In Haskell, you must explicitly specify which effects you want, and controlling those effects is a little less convenient. In exchange, however, Haskell provides more safety so you don't accidentally mix up the wrong effects, and more power, because you are in complete control of how to combine and refactor effects.

In Haskell, you wouldn't use an algorithm that mutates the data it operates on. So no, there is no straightforward way to do that. However, the code can be rewritten using recursion to avoid updating variables. Solution below uses the MissingH package because Haskell annoyingly doesn't have a replace function that works on strings.
import Data.String.Utils (replace)
import Data.Tree
import System.Environment (getArgs)
data Atom = Sym String | NInt Int | NDouble Double | Para deriving (Eq, Show)
type ParserStack = (Tree Atom, Tree Atom)
tokenize = words . replace "(" " ( " . replace ")" " ) "
atom :: String -> Atom
atom tok =
  case reads tok :: [(Int, String)] of
    [(int, _)] -> NInt int
    _ -> case reads tok :: [(Double, String)] of
      [(dbl, _)] -> NDouble dbl
      _ -> Sym tok
empty = Node $ Sym "dummy"
para = Node Para
parseToken (Node _ stack, Node _ out) "(" =
  (empty $ stack ++ [empty out], empty [])
parseToken (Node _ stack, Node _ out) ")" =
  (empty $ init stack, empty $ (subForest (last stack)) ++ [para out])
parseToken (stack, Node _ out) tok =
  (stack, empty $ out ++ [Node (atom tok) []])
main = do
  (file:_) <- getArgs
  contents <- readFile file
  let tokens = tokenize contents
      parseStack = foldl parseToken (empty [], empty []) tokens
      schemeTree = head $ subForest $ snd parseStack
  putStrLn $ drawTree $ fmap show schemeTree
foldl is the haskeller's basic structured recursion tool and it serves the same purpose as your while loop and recursive call to read_from. I think the code can be improved a lot, but I'm not so used to Haskell. Below is an almost straight transliteration of the above to Python:
# Python 2: relies on tuple parameters and the builtin reduce
from pprint import pprint
from sys import argv
def atom(tok):
    try:
        return 'int', int(tok)
    except ValueError:
        try:
            return 'float', float(tok)
        except ValueError:
            return 'sym', tok
def tokenize(s):
    return s.replace('(',' ( ').replace(')',' ) ').split()
def handle_tok((stack, out), tok):
    if tok == '(':
        return stack + [out], []
    if tok == ')':
        return stack[:-1], stack[-1] + [out]
    return stack, out + [atom(tok)]
if __name__ == '__main__':
    tokens = tokenize(open(argv[1]).read())
    tree = reduce(handle_tok, tokens, ([], []))[1][0]
    pprint(tree)
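For comparison, here is a minimal sketch of read_from as plain recursion that returns the leftover tokens explicitly instead of mutating a list. It is not the method of either answer: it swaps in explicit token threading through Either. The SchemeExpr type is cut down to symbols and lists to keep the example short, error messages are plain Strings, and readItems is a hypothetical helper name.

```haskell
data SchemeExpr = SSymbol String | SList [SchemeExpr]
  deriving (Eq, Show)

-- Read one expression; return it together with the unconsumed tokens.
readFrom :: [String] -> Either String (SchemeExpr, [String])
readFrom []           = Left "unexpected EOF while reading"
readFrom ("(" : rest) = readItems rest []
readFrom (")" : _)    = Left "unexpected )"
readFrom (tok : rest) = Right (SSymbol tok, rest)

-- Read list items until the matching ")". Items are accumulated in
-- reverse and flipped once at the end, as in the first answer.
readItems :: [String] -> [SchemeExpr] -> Either String (SchemeExpr, [String])
readItems []           _   = Left "unexpected EOF while reading"
readItems (")" : rest) acc = Right (SList (reverse acc), rest)
readItems toks         acc = do
  (expr, rest) <- readFrom toks
  readItems rest (expr : acc)
```

Unlike the Python while loop, which indexes tokens[0] directly, running out of tokens inside a list is reported as a Left value here rather than as a crash.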

How can I parse the IO String in Haskell?

I've got a problem with Haskell. I have a text file that looks like this:
5.
7.
[(1,2,3),(4,5,6),(7,8,9),(10,11,12)].
I have no idea how to get the first two numbers (5 and 7 above) and the list from the last line. There is a dot at the end of each line.
I tried to build a parser, but the function readFile returns an IO String. I don't know how to get information out of that type of string.
I would prefer to work on an array of chars. Maybe there is a function which can convert from 'IO String' to [Char]?

I think you have a fundamental misunderstanding about IO in Haskell. Particularly, you say this:
Maybe there is a function which can convert from 'IO String' to [Char]?
No, there isn't [1], and the fact that there is no such function is one of the most important things about Haskell.
Haskell is a very principled language. It tries to maintain a distinction between "pure" functions (which don't have any side-effects, and always return the same result when given the same input) and "impure" functions (which have side effects like reading from files, printing to the screen, writing to disk etc). The rules are:
You can use a pure function anywhere (in other pure functions, or in impure functions)
You can only use impure functions inside other impure functions.
The way that code is marked as pure or impure is using the type system. When you see a function signature like
length :: String -> Int
you know that this function is pure. If you give it a String it will return an Int and moreover it will always return the same Int if you give it the same String. On the other hand, a function signature like
getLine :: IO String
is impure, because the return type of String is marked with IO. Obviously getLine (which reads a line of user input) will not always return the same String, because it depends on what the user types in. You can't use this function in pure code, because adding even the smallest bit of impurity will pollute the pure code. Once you go IO you can never go back.
You can think of IO as a wrapper. When you see a particular type, for example, x :: IO String, you should interpret that to mean "x is an action that, when performed, does some arbitrary I/O and then returns something of type String" (note that in Haskell, String and [Char] are exactly the same thing).
So how do you ever get access to the values from an IO action? Fortunately, the type of the function main is IO () (it's an action that does some I/O and returns (), which is the same as returning nothing). So you can always use your IO functions inside main. When you execute a Haskell program, what you are doing is running the main function, which causes all the I/O in the program definition to actually be executed - for example, you can read and write from files, ask the user for input, write to stdout etc etc.
You can think of structuring a Haskell program like this:
All code that does I/O gets the IO tag (basically, you put it in a do block)
Code that doesn't need to perform I/O doesn't need to be in a do block - these are the "pure" functions.
Your main function sequences together the I/O actions you've defined in an order that makes the program do what you want it to do (interspersed with the pure functions wherever you like).
When you run main, you cause all of those I/O actions to be executed.
So, given all that, how do you write your program? Well, the function
readFile :: FilePath -> IO String
reads a file as a String. So we can use that to get the contents of the file. The function
lines :: String -> [String]
splits a String on newlines, so now you have a list of Strings, each corresponding to one line of the file. The function
init :: [a] -> [a]
drops the last element from a list (this will get rid of the final . on each line). The function
read :: (Read a) => String -> a
takes a String and turns it into an arbitrary Haskell data type, such as Int or Bool. Combining these functions sensibly will give you your program.
Note that the only time you actually need to do any I/O is when you are reading the file. Therefore that is the only part of the program that needs to use the IO tag. The rest of the program can be written "purely".
It sounds like what you need is the article The IO Monad For People Who Simply Don't Care, which should explain a lot of your questions. Don't be scared by the term "monad" - you don't need to understand what a monad is to write Haskell programs (notice that this paragraph is the only one in my answer that uses the word "monad", although admittedly I have used it four times now...)
Here's the program that (I think) you want to write
run :: IO (Int, Int, [(Int,Int,Int)])
run = do
  contents <- readFile "text.txt"   -- use '<-' here so that 'contents' is a String
  let [a,b,c] = lines contents      -- split on newlines
  let firstLine = read (init a)     -- 'init' drops the trailing period
  let secondLine = read (init b)
  let thirdLine = read (init c)     -- this reads a list of Int-tuples
  return (firstLine, secondLine, thirdLine)
To answer npfedwards' comment about applying lines to the output of readFile "text.txt": you need to realize that readFile "text.txt" gives you an IO String, and it's only when you bind it to a variable (using contents <-) that you get access to the underlying String, so that you can apply lines to it.
Remember: once you go IO, you never go back.
[1] I am deliberately ignoring unsafePerformIO because, as implied by the name, it is very unsafe! Don't ever use it unless you really know what you are doing.
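To make the pure/impure split concrete, the parsing part of the program can be sketched as a function with no IO in its type. This is a variation on the run function from this answer, not a replacement for it: it assumes the three-line file format from the question, uses readMaybe so that malformed input gives Nothing instead of a crash, and the names Input, parseContents, readLine and stripDot are invented here.

```haskell
import Text.Read (readMaybe)

type Input = (Int, Int, [(Int, Int, Int)])

-- Pure parser for the file contents; no IO in the type.
parseContents :: String -> Maybe Input
parseContents contents =
    case lines contents of
      [a, b, c] -> (,,) <$> readLine a <*> readLine b <*> readLine c
      _         -> Nothing
  where
    -- Drop the trailing '.', then read the rest as the requested type.
    readLine :: Read a => String -> Maybe a
    readLine s = stripDot s >>= readMaybe

    stripDot :: String -> Maybe String
    stripDot s = case reverse s of
      '.' : rest -> Just (reverse rest)
      _          -> Nothing
```

The only impure step left is reading the file, for instance fmap parseContents (readFile "text.txt"), which has type IO (Maybe Input).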

As a programming noob, I too was confused by IOs. Just remember that if you go IO you never come out. Chris wrote a great explanation on why. I just thought it might help to give some examples on how to use IO String in a monad. I'll use getLine which reads user input and returns an IO String.
line <- getLine
All this does is bind the user input from getLine to a value named line. If you type this in ghci, and then type :type line, it will return:
:type line
line :: String
But wait! getLine returns an IO String
:type getLine
getLine :: IO String
So what happened to the IOness from getLine? <- is what happened. <- is your IO friend. Inside a do block, it lets you bring out the value that is tainted by IO and use it with your normal functions. Monadic code is easy to spot because it usually begins with do. Like so:
main = do
  putStrLn "How much do you love Haskell?"
  amount <- getLine
  putStrLn ("You love Haskell this much: " ++ amount)
If you're like me, you'll soon discover that liftIO is your next best monad friend, and that $ helps reduce the number of parentheses you need to write.
So how do you get the information from readFile? Well if readFile's output is IO String like so:
:type readFile
readFile :: FilePath -> IO String
Then all you need is your friendly <-:
yourdata <- readFile "samplefile.txt"
Now if you type that in ghci and check the type of yourdata, you'll notice it's a simple String.
:type yourdata
yourdata :: String

As people already say, if you have two functions, one is readStringFromFile :: FilePath -> IO String, and another is doTheRightThingWithString :: String -> Something, then you don't really need to escape a string from IO, since you can combine these two functions in various ways (path below is the FilePath of the file to read):
With fmap for IO (IO is Functor):
fmap doTheRightThingWithString (readStringFromFile path)
With (<$>) for IO (IO is Applicative and (<$>) == fmap):
import Control.Applicative
...
doTheRightThingWithString <$> readStringFromFile path
With liftM for IO (liftM == fmap):
import Control.Monad
...
liftM doTheRightThingWithString (readStringFromFile path)
With (>>=) for IO (IO is Monad, fmap == (<$>) == liftM == \f m -> m >>= return . f):
readStringFromFile path >>= \string -> return (doTheRightThingWithString string)
readStringFromFile path >>= \string -> return $ doTheRightThingWithString string
readStringFromFile path >>= return . doTheRightThingWithString
return . doTheRightThingWithString =<< readStringFromFile path
With do notation:
do
  ...
  string <- readStringFromFile path
  -- ^ you escape String from IO but only inside this do-block
  let result = doTheRightThingWithString string
  ...
  return result
Every time you will get IO Something.
Why would you want to do it like that? Well, this way you have pure and referentially transparent programs (functions) in your language. Every function whose type is IO-free is pure and referentially transparent, so for the same arguments it will return the same values. For example, doTheRightThingWithString would return the same Something for the same String. However, readStringFromFile, which is not IO-free, can return different strings every time (because the file can change), so you can't let such an impure value escape from IO.

If you have a parser of this type:
myParser :: String -> Foo
and you read the file using
readFile "thisfile.txt"
then you can read and parse the file using
fmap myParser (readFile "thisfile.txt")
The result of that will have type IO Foo.
The fmap means myParser runs "inside" the IO.
Another way to think of it is that whereas myParser :: String -> Foo, fmap myParser :: IO String -> IO Foo.
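A small self-contained sketch of that idea follows. Foo and myParser are placeholders here (the "parser" simply counts words), and parseFile is an invented name; the point is only the types.

```haskell
-- Placeholder result type: the number of words in the input.
newtype Foo = Foo Int
  deriving (Eq, Show)

-- A pure "parser": String in, Foo out, no IO involved.
myParser :: String -> Foo
myParser = Foo . length . words

-- fmap lifts the pure parser over the IO action that reads the file.
parseFile :: FilePath -> IO Foo
parseFile path = fmap myParser (readFile path)
```

myParser can be tested on ordinary strings, while parseFile "thisfile.txt" only adds the file access around it.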
