Related
I am trying to understand do-blocks/sequencing actions, parsers and monads.
newtype Parser a = P ( String -> [ ( a, String ) ] )
item :: Parser Char
item = P (\inp -> case inp of
[] -> []
(x:xs) -> [(x,xs)])
parse :: Parser a -> String -> [(a,String)]
parse (P p) inp = p inp
firstThird :: Parser (Char,Char)
firstThird = do x <- item
item
y <- item
return (x,y)
I don't understand why
parse firstThird "ab"
evaluates to
[]
Why is it when one of the actions fail the whole do-block fails?
You haven't included a Monad instance for your Parser, but a standard one might be given by:
instance Functor Parser where
fmap = liftM
instance Applicative Parser where
pure x = P $ \str -> [(x, str)]
(<*>) = ap
instance Monad Parser where
P p >>= f = P $ \str ->
[result | (x, rest) <- p str, let P q = f x, result <- q rest]
With this instance, I can duplicate your result, and the parser is operating as intended. Usually a monadic parser is designed so that when a sequence of parsers is bound together (using the >>= operator, or as multiple steps in do-block notation), the intended meaning is that each parser is tried in sequence and must "succeed" in order for the entire sequence to be successful.
In your case, "success" is defined as a parser returning a non-empty list of possible parses. In order for firstThird to return a non-empty list, each of the three item parsers, in sequence, must produce a non-empty list. For the input string "ab", the first call to item produces the non-empty list [('a',"b")], the second call to item produces the non-empty list [('b',"")], and the third call to item has nothing left to parse and so returns the empty list []. No other parse is possible.
If you want to allow a parser to fail but continue the parse, you can use a combinator. There's a standard one named optional in Control.Applicative.
To use it, you'd need an Alternative instance for your Parser:
instance Alternative Parser where
empty = P $ \_ -> []
P p <|> P q = P $ \str -> p str ++ q str
instance MonadPlus Parser
and then you can write:
firstThird :: Parser (Char,Char)
firstThird = do x <- item
optional item
y <- item
return (x,y)
which allows:
> parse firstThird "ab"
[(('a','b'),"")]
> parse firstThird "abc"
[(('a','c'),""),(('a','b'),"c")]
Note that "abc" can be parsed two ways with firstThird, with and without skipping parsing for the middle item.
This is the usual way of writing monadic parsers: the Monad is used for a sequence of parses, all of which must succeed, while the separate Alternative (AKA MonadPlus) instance and the <|> operator in particular are used to gracefully handle cases where parsers are allowed to fail and allow other parsers to be tried instead.
I'm trying to understand order of execution in purely functional language.
I know that in purely functional languages, there is no necessary execution order.
So my question is:
Suppose there are two functions.
I would like to know all ways in which I can call one function after another (except nested call of one function from another) (and except io-mode).
I would like to see examples in Haskell or pseudo-code.
There is no way to do what you describe, if the functions are totally independent and you don't use the result of one when you call the other.
This is because there is no reason to do this. In a side effect free setting, calling a function and then ignoring its result is exactly the same as doing nothing for the amount of time it takes to call that function (setting aside memory usage).
It is possible that seq x y will evaluate x and then y, and then give you y as its result, but this evaluation order isn't guaranteed.
Now, if we do have side effects, such as if we are working inside a Monad or Applicative, this could be useful, but we aren't truly ignoring the result since there is context being passed implicitly. For instance, you can do
main :: IO ()
main = putStrLn "Hello, " >> putStrLn "world"
in the IO Monad. Another example would be the list Monad (which could be thought of as representing a nondeterministic computation):
biggerThanTen :: Int -> Bool
biggerThanTen n = n > 10
example :: String
example = filter biggerThanTen [1..15] >> return 'a' -- This evaluates to "aaaaa"
Note that even here we aren't really ignoring the result. We ignore the specific values, but we use the structure of the result (in the second example, the structure would be the fact that the resulting list from filter biggerThanTen [1..15] has 5 elements).
I should point out, though, that things that are sequenced in this way aren't necessarily evaluated in the order that they are written. You can sort of see this with the list Monad example. This becomes more apparent with bigger examples though:
example2 :: [Int]
example2 =
[1,2,3] >>=
(\x -> [10,100,1000] >>=
(\y -> return (x * y))) -- ==> [10,100,1000,20,200,2000,30,300,3000]
The main takeaway here is that evaluation order (in the absence of side effects like IO and ignoring bottoms) doesn't affect the ultimate meaning of code in Haskell (other than possible differences in efficiency, but that is another topic). As a result, there is never a reason to call two functions "one after another" in the fashion described in the question (that is, where the calls are totally independent from each other).
Do notation
Do notation is actually exactly equivalent to using >>= and >> (there is actually one other thing involved that takes care of pattern match failures, but that is irrelevant to the discussion at hand). The compiler actually takes things written in do notation and converts them to >>= and >> through a process called "desugaring" (since it removes the syntactic sugar). Here are the three examples from above written with do notation:
IO Example
main :: IO ()
main = do
putStrLn "Hello, "
putStrLn "World"
First list example
biggerThanTen :: Int -> Bool
biggerThanTen n = n > 10
example :: String -- String is a synonym for [Char], by the way
example = do
filter biggerThanTen [1..15]
return 'a'
Second list example
example2 :: [Int]
example2 = do
x <- [1,2,3]
y <- [10,100,1000]
return (x * y)
Here is a side-by-side comparison of the conversions:
do --
m -- m >> n
n --
do --
x <- m -- m >>= (\x ->
... -- ...)
The best way to understand do notation is to first understand >>= and return since, as I said, that's what the compiler transforms do notation into.
As a side-note, >> is just the same as >>=, it just ignores the "result" of it's left argument (although it preserves the "context" or "structure"). So all definitions of >> must be equivalent to m >> n = m >>= (\_ -> n).
Expanding the >>= in the second list example
To help drive home the point that Monads are not usually impure, lets expand the >>= calls in the second list example, using the Monad definition for lists. The definition is:
instance Monad [] where
return x = [x]
xs >>= f = concatMap f xs
and we can convert example2 into:
Step 0 (what we already have)
example2 :: [Int]
example2 =
[1,2,3] >>=
(\x -> [10,100,1000] >>=
(\y -> return (x * y)))
Step 1 (converting the first >>=)
example2 =
concatMap
(\x -> [10,100,1000] >>=
(\y -> return (x * y)))
[1,2,3]
Step 2
example2 =
concatMap
(\x -> concatMap
(\y -> return (x * y))
[10,100,1000])
[1,2,3]
Step 3
example2 =
concatMap
(\x -> concatMap
(\y -> [x * y])
[10,100,1000])
[1,2,3]
So, there is no magic going on here, just normal function calls.
You can write a function whose arguments depend on the evaluation of another function:
-- Ads the first two elements of a list together
myFunc :: [Int] -> Int
myFunc xs = (head xs) + (head $ tail xs)
If that's what you mean. In this case, you can't get the output of myFunc xs without evaluating head xs, head $ tail xs and (+). There is an order here. However, the compiler can choose which order to execute head xs and head $ tail xs in since they are not dependent on each other, but it can't do the addition without having both of the other results. It could even choose to evaluate them in parallel, or on different machines. The point is that pure functions, because they have no side effects, don't have to be evaluated in a given order until their results are interdependent.
Another way to look at the above function is as a graph:
myFunc
|
(+)
/ \
/ \
head head
\ |
\ tail
\ /
xs
In order to evaluate a node, all nodes below it have to be evaluated first, but different branches can be evaluated in parallel. First xs must be evaluated, at least partially, but after that the two branches can be evaluated in parallel. There are some nuances due to lazy evaluation, but this is essentially how the compiler constructs evaluation trees.
If you really want to force one function call before the other, you can use the seq function. It takes two arguments, forces the first to be evaluated, then returns the second, e.g.
myFunc2 :: [Int] -> Int
myFunc2 xs = hxs + (hxs `seq` (head $ tail xs))
where hxs = head xs
This will force head xs to evaluate before head $ tail xs, but this is more dealing with strictness than sequencing functions.
Here is an easy way:
case f x of
result1 -> case g y of
result2 -> ....
Still, unless g y uses something from result1 and the subsequent calculations something from result2, or the pattern is such that the result must be evaluated, there is no guarantee that either of f or g are actually called, nor in what order.
Still, you wanted a way to call one function after another, and this is such a way.
Say I have a List of integers l = [1,2]
Which I want to print to stdout.
Doing print l produces [1,2]
Say I want to print the list without the braces
map print l produces
No instance for (Show (IO ())) arising from a use of `print'
Possible fix: add an instance declaration for (Show (IO ()))
In a stmt of an interactive GHCi command: print it
`:t print
print :: Show a => a -> IO ()
So while I thought this would work I went ahead and tried:
map putStr $ map show l
Since I suspected a type mismatch from Integer to String was to blame. This produced the same error message as above.
I realize that I could do something like concatenating the list into a string, but I would like to avoid that if possible.
What's going on? How can I do this without constructing a string from the elements of the List?
The problem is that
map :: (a -> b) -> [a] -> [b]
So we end up with [IO ()]. This is a pure value, a list of IO actions. It won't actually print anything. Instead we want
mapM_ :: (a -> IO ()) -> [a] -> IO ()
The naming convention *M means that it operates over monads and *_ means we throw away the value. This is like map except it sequences each action with >> to return an IO action.
As an example mapM_ print [1..10] will print each element on a new line.
Suppose you're given a list xs :: [a] and function f :: Monad m => a -> m b. You want to apply the function f to each element of xs, yielding a list of actions, then sequence these actions. Here is how I would go about constructing a function, call it mapM, that does this. In the base case, xs = [] is the empty list, and we simply return []. In the recursive case, xs has the form x : xs. First, we want to apply f to x, giving the action f x :: m b. Next, we want recursively call mapM on xs. The result of performing the first step is a value, say y; the result of performing the second step is a list of values, say ys. So we collect y and ys into a list, then return them in the monad:
mapM :: Monad m => (a -> m b) -> [a] -> m [b]
mapM f [] = return []
mapM f (x : xs) = f x >>= \y -> mapM f ys >>= \ys -> return (y : ys)
Now we can map a function like print, which returns an action in the IO monad, over a list of values to print: mapM print [1..10] does precisely this for the list of integers from one through ten. There is a problem, however: we aren't particularly concerned about collecting the results of printing operations; we're primarily concerned about their side effects. Instead of returning y : ys, we simply return ().
mapM_ :: Monad m => (a -> m b) ->[a] -> m ()
mapM_ f [] = return ()
mapM_ f (x : xs) = f x >> mapM_ f xs
Note that mapM and mapM_ can be defined without explicit recursion using the sequence and sequence_ functions from the standard library, which do precisely what their names imply. If you look at the source code for mapM and mapM_ in Control.Monad, you will see them implemented that way.
Everything in Haskell is very strongly typed, including code to perform IO!
When you write print [1, 2], this is just a convenience wrapper for putStrLn (show [1, 2]), where show is a function that turns a (Show'able) object into a string. print itself doesn't do anything (in the side effect sense of do), but it outputs an IO() action, which is sort of like a mini unrun "program" (if you excuse the sloppy language), which isn't "run" at its creation time, but which can be passed around for later execution. You can verify the type in ghci
> :t print [1, 2]
print [1, 2]::IO()
This is just an object of type IO ().... You could throw this away right now and nothing would ever happen. More likely, if you use this object in main, the IO code will run, side effects and all.
When you map multiple putStrLn (or print) functions onto a list, you still get an object whose type you can view in ghci
> :t map print [1, 2]
map print [1, 2]::[IO()]
Like before, this is just an object that you can pass around, and by itself it will not do anything. But unlike before, the type is incorrect for usage in main, which expects an IO() object. In order to use it, you need to convert it to this type.
There are many ways to do this conversion.... One way that I like is the sequence function.
sequence $ map print [1, 2]
which takes a list of IO actions (ie- mini "programs" with side effects, if you will forgive the sloppy language), and sequences them together as on IO action. This code alone will now do what you want.
As jozefg pointed out, although sequence works, sequence_ is a better choice here....
Sequence not only concatinates the stuff in the IO action, but also puts the return values in a list.... Since print's return value is IO(), the new return value becomes a useless list of ()'s (in IO). :)
Using the lens library:
[1,2,3] ^! each . act print
You might write your own function, too:
Prelude> let l = [1,2]
Prelude> let f [] = return (); f (x:xs) = do print x; f xs
Prelude> f l
1
2
I am getting the contents of a file and transforming it into a list of form:
[("abc", 123), ("def", 456)]
with readFile, lines, and words.
Right now, I can manage to transform the resulting list into type IO [(String, Int)].
My problem is, when I try to make a function like this:
check x = lookup x theMap
I get this error, which I'm not too sure how to resolve:
Couldn't match expected type `[(a0, b0)]'
with actual type `IO [(String, Int)]'
In the second argument of `lookup', namely `theMap'
theMap is essentially this:
getLines :: String -> IO [String]
getLines = liftM lines . readFile
tuplify [x,y] = (x, read y :: Int)
theMap = do
list <- getLines "./test.txt"
let l = map tuplify (map words list)
return l
And the file contents are:
abc 123
def 456
Can anyone explain what I'm doing wrong and or show me a better solution? I just started toying around with monads a few hours ago and am running into a few bumps along the way.
Thanks
You will have to "unwrap" theMap from IO. Notice how you're already doing this to getLines by:
do
list <- getlines
[...]
return (some computation on list)
So you could have:
check x = do
m <- theMap
return . lookup x $ m
This is, in fact, an antipattern (albeit an illustrative one,) and you would be better off using the functor instance, ie. check x = fmap (lookup x) theMap
I'm trying to understand how Haskell list comprehensions work "under the hood" in regards to pattern matching. The following ghci output illustrates my point:
Prelude> let myList = [Just 1, Just 2, Nothing, Just 3]
Prelude> let xs = [x | Just x <- myList]
Prelude> xs
[1,2,3]
Prelude>
As you can see, it is able to skip the "Nothing" and select only the "Just" values. I understand that List is a monad, defined as (source from Real World Haskell, ch. 14):
instance Monad [] where
return x = [x]
xs >>= f = concat (map f xs)
xs >> f = concat (map (\_ -> f) xs)
fail _ = []
Therefore, a list comprehension basically builds a singleton list for every element selected in the list comprehension and concatenates them. If a pattern match fails at some step, the result of the "fail" function is used instead. In other words, the "Just x" pattern doesn't match so [] is used as a placeholder until 'concat' is called. That explains why the "Nothing" appears to be skipped.
What I don't understand is, how does Haskell know to call the "fail" function? Is it "compiler magic", or functionality that you can write yourself in Haskell? Is it possible to write the following "select" function to work the same way as a list comprehension?
select :: (a -> b) -> [a] -> [b]
select (Just x -> x) myList -- how to prevent the lambda from raising an error?
[1,2,3]
While implemenatations of Haskell might not do it directly like this internally, it is helpful to think about it this way :)
[x | Just x <- myList]
... becomes:
do
Just x <- myList
return x
... which is:
myList >>= \(Just x) -> return x
As to your question:
What I don't understand is, how does Haskell know to call the "fail" function?
In do-notation, if a pattern binding fails (i.e. the Just x), then the fail method is called. For the above example, it would look something like this:
myList >>= \temp -> case temp of
(Just x) -> return x
_ -> fail "..."
So, every time you have a pattern-match in a monadic context that may fail, Haskell inserts a call to fail. Try it out with IO:
main = do
(1,x) <- return (0,2)
print x -- x would be 2, but the pattern match fails
The rule for desugaring a list comprehension requires an expression of the form [ e | p <- l ] (where e is an expression, p a pattern, and l a list expression) behave like
let ok p = [e]
ok _ = []
in concatMap ok l
Previous versions of Haskell had monad comprehensions, which were removed from the language because they were hard to read and redundant with the do-notation. (List comprehensions are redundant, too, but they aren't so hard to read.) I think desugaring [ e | p <- l ] as a monad (or, to be precise, as a monad with zero) would yield something like
let ok p = return e
ok _ = mzero
in l >>= ok
where mzero is from the MonadPlus class. This is very close to
do { p <- l; return e }
which desugars to
let ok p = return e
ok _ = fail "..."
in l >>= ok
When we take the List Monad, we have
return e = [e]
mzero = fail _ = []
(>>=) = flip concatMap
I.e., the 3 approaches (list comprehensions, monad comprehensions, do expressions) are equivalent for lists.
I don't think the list comprehension syntax has much to do with the fact that List ([]), or Maybe for that matter, happens to be an instance of the Monad type class.
List comprehensions are indeed compiler magic or syntax sugar, but that's possible because the compiler knows the structure of the [] data type.
Here's what the list comprehension is compiled to: (Well, I think, I didn't actually check it against the GHC)
xs = let f = \xs -> case xs of
Just x -> [x]
_ -> []
in concatMap f myList
As you can see, the compiler doesn't have to call the fail function, it can simply inline a empty list, because it knows what a list is.
Interestingly, this fact that the list comprehensions syntax 'skips' pattern match failures is used in some libraries to do generic programming. See the example in the Uniplate library.
Edit: Oh, and to answer your question, you can't call your select function with the lambda you gave it. It will indeed fail on a pattern match failure if you call it with an Nothing value.
You could pass it the f function from the code above, but than select would have the type:
select :: (a -> [b]) -> [a] -> [b]
which is perfectly fine, you can use the concatMap function internally :-)
Also, that new select now has the type of the monadic bind operator for lists (with its arguments flipped):
(>>=) :: [a] -> (a -> [b]) -> [b]
xs >>= f = concatMap f xs -- 'or as you said: concat (map f xs)