Related
I kind of understand State Monad's usefulness it propagating new values in a sequential execution. But in the following code, I'm having trouble in understanding how and where addResult get it's updated state everytime it gets evaluated.
fizzBuzz :: Integer -> String
fizzBuzz n
| n `mod` 15 == 0 = "FizzBuzz"
| n `mod` 5 == 0 = "Buzz"
| n `mod` 3 == 0 = "Fizz"
| otherwise = show n
fizzbuzzList :: [Integer] -> [String]
fizzbuzzList list = execState (mapM_ addResult list) []
addResult :: Integer -> State [String] ()
addResult n = do
xs <- get
let result = fizzBuzz n
put (result : xs)
main :: IO ()
main = mapM_ putStrLn $ fizzbuzzList [1..100]
This code evaluates to produce
1, 2, Fizz...
I just couldn't figure out how the new value produced by addResult gets appended to the previously produced list. Can you please help me understand how mapM_ addResult list does it's stuff here?
As you have correctly observed, the State monad is used to thread some external 'state' value through a series of computations. However, you ask how this state is 'persisted' across multiple invocations of your function addResult :: Integer -> State [String] () on each value of the list list. The trick is the definition of mapM_. We'll start by considering the simpler function mapM:
mapM f [] = return []
mapM f (x:xs) = do
fx <- f x
fxs <- mapM f xs
return (fx : fxs)
If we mentally 'expand' this recursive definition out with, say, an example list [x1,x2,x3,x4,...,xn], we'll observe that the value of mapM f [x1,...,xn] would be another monadic computation:
do
fx1 <- f x1
fx2 <- f x2
-- etc.
fxn <- f xn
return [fx1,fx2,...,fxn]
So mapM basically 'sticks together' a bunch of monadic computations into one big one, by running them together in order. Which explains how you build up a list instead of producing many smaller ones: the get at the beginning of addResult gets the state from the last run, not from the beginning, because you're running all the computations together, like so:
do
fl0 <- addResult (list !! 0)
fl1 <- addResult (list !! 1)
-- etc. like before
(If you read carefully, you'll notice I've been talking about mapM, but you actually used mapM_. They're exactly the same, except that the latter returns ().)
State can be thought of as being defined like
data State s a = s -> (a, s)
which means that
Integer -> State [String] ()
is equivalent to
Integer -> [String] -> ((), [String])
So addResult takes an integer and returns a function that takes a state and returns a tuple containing the new state.
mapM_, roughly speaking, chains a group of these functions together. Using regular map would produce a list of State functions, each expecting a state and returning a new state. mapM_ takes the further step of binding each State to the previous. The end result is not a list of State values, but a single State value that forms a pipeline.
execState then provides the initial state in one end of the pipeline, and returns the final state from the other end.
I'm trying to understand order of execution in purely functional language.
I know that in purely functional languages, there is no necessary execution order.
So my question is:
Suppose there are two functions.
I would like to know all ways in which I can call one function after another (except nested call of one function from another) (and except io-mode).
I would like to see examples in Haskell or pseudo-code.
There is no way to do what you describe, if the functions are totally independent and you don't use the result of one when you call the other.
This is because there is no reason to do this. In a side effect free setting, calling a function and then ignoring its result is exactly the same as doing nothing for the amount of time it takes to call that function (setting aside memory usage).
It is possible that seq x y will evaluate x and then y, and then give you y as its result, but this evaluation order isn't guaranteed.
Now, if we do have side effects, such as if we are working inside a Monad or Applicative, this could be useful, but we aren't truly ignoring the result since there is context being passed implicitly. For instance, you can do
main :: IO ()
main = putStrLn "Hello, " >> putStrLn "world"
in the IO Monad. Another example would be the list Monad (which could be thought of as representing a nondeterministic computation):
biggerThanTen :: Int -> Bool
biggerThanTen n = n > 10
example :: String
example = filter biggerThanTen [1..15] >> return 'a' -- This evaluates to "aaaaa"
Note that even here we aren't really ignoring the result. We ignore the specific values, but we use the structure of the result (in the second example, the structure would be the fact that the resulting list from filter biggerThanTen [1..15] has 5 elements).
I should point out, though, that things that are sequenced in this way aren't necessarily evaluated in the order that they are written. You can sort of see this with the list Monad example. This becomes more apparent with bigger examples though:
example2 :: [Int]
example2 =
[1,2,3] >>=
(\x -> [10,100,1000] >>=
(\y -> return (x * y))) -- ==> [10,100,1000,20,200,2000,30,300,3000]
The main takeaway here is that evaluation order (in the absence of side effects like IO and ignoring bottoms) doesn't affect the ultimate meaning of code in Haskell (other than possible differences in efficiency, but that is another topic). As a result, there is never a reason to call two functions "one after another" in the fashion described in the question (that is, where the calls are totally independent from each other).
Do notation
Do notation is actually exactly equivalent to using >>= and >> (there is actually one other thing involved that takes care of pattern match failures, but that is irrelevant to the discussion at hand). The compiler actually takes things written in do notation and converts them to >>= and >> through a process called "desugaring" (since it removes the syntactic sugar). Here are the three examples from above written with do notation:
IO Example
main :: IO ()
main = do
putStrLn "Hello, "
putStrLn "World"
First list example
biggerThanTen :: Int -> Bool
biggerThanTen n = n > 10
example :: String -- String is a synonym for [Char], by the way
example = do
filter biggerThanTen [1..15]
return 'a'
Second list example
example2 :: [Int]
example2 = do
x <- [1,2,3]
y <- [10,100,1000]
return (x * y)
Here is a side-by-side comparison of the conversions:
do --
m -- m >> n
n --
do --
x <- m -- m >>= (\x ->
... -- ...)
The best way to understand do notation is to first understand >>= and return since, as I said, that's what the compiler transforms do notation into.
As a side-note, >> is just the same as >>=, it just ignores the "result" of it's left argument (although it preserves the "context" or "structure"). So all definitions of >> must be equivalent to m >> n = m >>= (\_ -> n).
Expanding the >>= in the second list example
To help drive home the point that Monads are not usually impure, lets expand the >>= calls in the second list example, using the Monad definition for lists. The definition is:
instance Monad [] where
return x = [x]
xs >>= f = concatMap f xs
and we can convert example2 into:
Step 0 (what we already have)
example2 :: [Int]
example2 =
[1,2,3] >>=
(\x -> [10,100,1000] >>=
(\y -> return (x * y)))
Step 1 (converting the first >>=)
example2 =
concatMap
(\x -> [10,100,1000] >>=
(\y -> return (x * y)))
[1,2,3]
Step 2
example2 =
concatMap
(\x -> concatMap
(\y -> return (x * y))
[10,100,1000])
[1,2,3]
Step 3
example2 =
concatMap
(\x -> concatMap
(\y -> [x * y])
[10,100,1000])
[1,2,3]
So, there is no magic going on here, just normal function calls.
You can write a function whose arguments depend on the evaluation of another function:
-- Ads the first two elements of a list together
myFunc :: [Int] -> Int
myFunc xs = (head xs) + (head $ tail xs)
If that's what you mean. In this case, you can't get the output of myFunc xs without evaluating head xs, head $ tail xs and (+). There is an order here. However, the compiler can choose which order to execute head xs and head $ tail xs in since they are not dependent on each other, but it can't do the addition without having both of the other results. It could even choose to evaluate them in parallel, or on different machines. The point is that pure functions, because they have no side effects, don't have to be evaluated in a given order until their results are interdependent.
Another way to look at the above function is as a graph:
myFunc
|
(+)
/ \
/ \
head head
\ |
\ tail
\ /
xs
In order to evaluate a node, all nodes below it have to be evaluated first, but different branches can be evaluated in parallel. First xs must be evaluated, at least partially, but after that the two branches can be evaluated in parallel. There are some nuances due to lazy evaluation, but this is essentially how the compiler constructs evaluation trees.
If you really want to force one function call before the other, you can use the seq function. It takes two arguments, forces the first to be evaluated, then returns the second, e.g.
myFunc2 :: [Int] -> Int
myFunc2 xs = hxs + (hxs `seq` (head $ tail xs))
where hxs = head xs
This will force head xs to evaluate before head $ tail xs, but this is more dealing with strictness than sequencing functions.
Here is an easy way:
case f x of
result1 -> case g y of
result2 -> ....
Still, unless g y uses something from result1 and the subsequent calculations something from result2, or the pattern is such that the result must be evaluated, there is no guarantee that either of f or g are actually called, nor in what order.
Still, you wanted a way to call one function after another, and this is such a way.
Say I have a List of integers l = [1,2]
Which I want to print to stdout.
Doing print l produces [1,2]
Say I want to print the list without the braces
map print l produces
No instance for (Show (IO ())) arising from a use of `print'
Possible fix: add an instance declaration for (Show (IO ()))
In a stmt of an interactive GHCi command: print it
`:t print
print :: Show a => a -> IO ()
So while I thought this would work I went ahead and tried:
map putStr $ map show l
Since I suspected a type mismatch from Integer to String was to blame. This produced the same error message as above.
I realize that I could do something like concatenating the list into a string, but I would like to avoid that if possible.
What's going on? How can I do this without constructing a string from the elements of the List?
The problem is that
map :: (a -> b) -> [a] -> [b]
So we end up with [IO ()]. This is a pure value, a list of IO actions. It won't actually print anything. Instead we want
mapM_ :: (a -> IO ()) -> [a] -> IO ()
The naming convention *M means that it operates over monads and *_ means we throw away the value. This is like map except it sequences each action with >> to return an IO action.
As an example mapM_ print [1..10] will print each element on a new line.
Suppose you're given a list xs :: [a] and function f :: Monad m => a -> m b. You want to apply the function f to each element of xs, yielding a list of actions, then sequence these actions. Here is how I would go about constructing a function, call it mapM, that does this. In the base case, xs = [] is the empty list, and we simply return []. In the recursive case, xs has the form x : xs. First, we want to apply f to x, giving the action f x :: m b. Next, we want recursively call mapM on xs. The result of performing the first step is a value, say y; the result of performing the second step is a list of values, say ys. So we collect y and ys into a list, then return them in the monad:
mapM :: Monad m => (a -> m b) -> [a] -> m [b]
mapM f [] = return []
mapM f (x : xs) = f x >>= \y -> mapM f ys >>= \ys -> return (y : ys)
Now we can map a function like print, which returns an action in the IO monad, over a list of values to print: mapM print [1..10] does precisely this for the list of integers from one through ten. There is a problem, however: we aren't particularly concerned about collecting the results of printing operations; we're primarily concerned about their side effects. Instead of returning y : ys, we simply return ().
mapM_ :: Monad m => (a -> m b) ->[a] -> m ()
mapM_ f [] = return ()
mapM_ f (x : xs) = f x >> mapM_ f xs
Note that mapM and mapM_ can be defined without explicit recursion using the sequence and sequence_ functions from the standard library, which do precisely what their names imply. If you look at the source code for mapM and mapM_ in Control.Monad, you will see them implemented that way.
Everything in Haskell is very strongly typed, including code to perform IO!
When you write print [1, 2], this is just a convenience wrapper for putStrLn (show [1, 2]), where show is a function that turns a (Show'able) object into a string. print itself doesn't do anything (in the side effect sense of do), but it outputs an IO() action, which is sort of like a mini unrun "program" (if you excuse the sloppy language), which isn't "run" at its creation time, but which can be passed around for later execution. You can verify the type in ghci
> :t print [1, 2]
print [1, 2]::IO()
This is just an object of type IO ().... You could throw this away right now and nothing would ever happen. More likely, if you use this object in main, the IO code will run, side effects and all.
When you map multiple putStrLn (or print) functions onto a list, you still get an object whose type you can view in ghci
> :t map print [1, 2]
map print [1, 2]::[IO()]
Like before, this is just an object that you can pass around, and by itself it will not do anything. But unlike before, the type is incorrect for usage in main, which expects an IO() object. In order to use it, you need to convert it to this type.
There are many ways to do this conversion.... One way that I like is the sequence function.
sequence $ map print [1, 2]
which takes a list of IO actions (ie- mini "programs" with side effects, if you will forgive the sloppy language), and sequences them together as on IO action. This code alone will now do what you want.
As jozefg pointed out, although sequence works, sequence_ is a better choice here....
Sequence not only concatinates the stuff in the IO action, but also puts the return values in a list.... Since print's return value is IO(), the new return value becomes a useless list of ()'s (in IO). :)
Using the lens library:
[1,2,3] ^! each . act print
You might write your own function, too:
Prelude> let l = [1,2]
Prelude> let f [] = return (); f (x:xs) = do print x; f xs
Prelude> f l
1
2
I'm rewritting my Prolog program in Haskell and i have small problem, how can i do something like that
myFunc(Field, Acc, Acc) :-
% some "ending" condition
!.
myFunc(Field, Acc, Result) :-
nextField(Field, Field2),
test1(Field2,...),
myFunc(Field2, Acc, Result).
myFunc(Field, Acc, Result) :-
nextField(Field, Field2),
test2(Ak, (X1, Y1)),
myFunc(Field2, [Field2|Acc], Result).
in Haskell? Code above is checking some condition and recursivly calls itself so in the end i get list of specific fields. The whole point is that if some condition (test1 or test2) fails, it is returning to the last point it could make other choice and does it. How do i implement something like that in Haskell?
To model Prolog computations as expressively in Haskell, you need a backtracking monad. This is trivially done using the LogicT monad. Your example as it stands translates to the following:
import Control.Monad.Logic
myFunc :: Int -> [Int] -> Logic [Int]
myFunc field acc = ifte (exitCond field acc) (\_-> return acc) $
(do f <- nextField field
guard $ test1 f
myFunc f acc)
`mplus`
(do f <- nextField field
guard $ test2 f
myFunc f (f:acc))
Assuming the following implementations for the functions and predicates:
nextField i = return (i+1)
test1 f = f < 10
test2 f = f < 20
exitCond f a = guard (f > 15)
You use mplus to combine to Logic computations so that if one fails it backtracks and tries the other one. ifte is just a soft cut (there's no hard cut in logict, although I believe it's trivial to implement since logict is based on continuations) to exit when the exiting condition is true. You run your computation as follows:
Main> runLogic (myFunc 1 []) (\a r -> a:r) []
[[16,15,14,13,12,11,10],[16,15,14,13,12,11,10,9],[16,15,14,13,12,11,10,8]...
runLogic takes the Logic computation, a continuation and an initial value for the output of the continuation. Here I just passed a continuation which will accumulate all results in a list. The above will backtrack and get all solutions, unlike the Prolog example, since we used a soft cut instead of a hard cut. To stop backtracking after getting the first solution you can use once:
Main> runLogic (once $ myFunc 1 []) (\a r -> a:r) []
[[16,15,14,13,12,11,10]]
you can also use observe to observe the first solution only, without having to pass a continuation:
Main> observe (myFunc 1 [])
[16,15,14,13,12,11,10]
or even obserMany and observeAll:
observeMany 5 (myFunc 1 []) --returns the first 5 solutions
observerAll (myFunc 1 []) --returns a list of all solutions
Finally, you will need to install the logict package to get the above code to work. Use cabal install logict to install it.
Edit: Answering your question in the comments
Yes, you can do something similar without having to install logict. Although a dedicated backtracking monad makes things less complicated and makes clear what you are trying to do.
To model the logict example above you only need the [] monad
myFunc :: Int -> [Int] -> [[Int]]
myFunc field acc | exitCond field acc = return acc
myFunc field acc = do
let m1 = do
f <- nextField field
guard $ test1 f
myFunc f acc
m2 = do
f <- nextField field
guard $ test2 f
myFunc f (f:acc)
in m1 `mplus` m2
nextField i = return $ i + 1
exitCond i a = i > 15
test1 i = i < 10
test2 i = i < 20
You can run it as follows:
Main> myFunc 1 []
[[16,15,14,13,12,11,10],[16,15,14,13,12,11,10,9],[16,15,14,13,12,11,10,8]...
You can also choose how many solutions you want as before:
Main> head $ myFunc 1 []
[16,15,14,13,12,11,10]
Main> take 3 $ myFunc 1 []
[[16,15,14,13,12,11,10],[16,15,14,13,12,11,10,9],[16,15,14,13,12,11,10,8]]
However, you will need the Cont monad, and thus the ListT monad, to implement a hard cut as in the Prolog example, which was not available in the logict example above:
import Control.Monad.Cont
import Control.Monad.Trans.List
myFunc :: Int -> ListT (Cont [[Int]]) [Int]
myFunc field = callCC $ \exit -> do
let g field acc | exitCond field acc = exit acc
g field acc =
let m1 = do
f <- nextField field
guard $ test1 f
g f acc
m2 = do
f <- nextField field
guard $ test2 f
g f (f:acc)
in m1 `mplus` m2
g field []
Like Prolog, this last example will not backtrack again after exitCond is satisfied:
*Main> runCont (runListT (myFunc 1)) id
[[16,15,14,13,12,11,10]]
You comment helped clarify some, but there is still some question in mind about what you are looking for so here is an example of using the list monad and guard.
import Control.Monad
myFunc lst = do
e <- lst
guard $ even e -- only allow even elements
guard . not $ e `elem` [4,6,8] -- do not allow 4, 6, or 8
return e -- accumulate results
used in ghci:
> myFunc [1..20]
[2,10,12,14,16,18,20]
I've never programmed in Haskell - then I would call for your help - but could hint about
that Prolog fragment - where I think you have a typo - should be myFunc(Field2, [(X1,Y1)|Acc], Result). could be compiled -by hand - in a continuation passing schema.
Let's google about it (haskell continuation passing prolog). I will peek first the Wikipedia page: near Haskell we find the continuation monad.
Now we can try to translate that Prolog in executable Haskell. Do this make sense ?
Textual translation of your code to Haskell is:
myFunc field acc = take 1 $ -- a cut
g field acc
where
g f a | ending_condition_holds = [a]
g f a =
( nextField f >>= (\f2 ->
(if test1 ... -- test1 a predicate function
then [()]
else [] ) >>= (_ ->
g f2 a )))
++
( nextField f >>= (\f2 ->
test2 ... >>= (\_ -> -- test2 producing some results
g f2 (f2:a) )))
I'm trying to understand how Haskell list comprehensions work "under the hood" in regards to pattern matching. The following ghci output illustrates my point:
Prelude> let myList = [Just 1, Just 2, Nothing, Just 3]
Prelude> let xs = [x | Just x <- myList]
Prelude> xs
[1,2,3]
Prelude>
As you can see, it is able to skip the "Nothing" and select only the "Just" values. I understand that List is a monad, defined as (source from Real World Haskell, ch. 14):
instance Monad [] where
return x = [x]
xs >>= f = concat (map f xs)
xs >> f = concat (map (\_ -> f) xs)
fail _ = []
Therefore, a list comprehension basically builds a singleton list for every element selected in the list comprehension and concatenates them. If a pattern match fails at some step, the result of the "fail" function is used instead. In other words, the "Just x" pattern doesn't match so [] is used as a placeholder until 'concat' is called. That explains why the "Nothing" appears to be skipped.
What I don't understand is, how does Haskell know to call the "fail" function? Is it "compiler magic", or functionality that you can write yourself in Haskell? Is it possible to write the following "select" function to work the same way as a list comprehension?
select :: (a -> b) -> [a] -> [b]
select (Just x -> x) myList -- how to prevent the lambda from raising an error?
[1,2,3]
While implemenatations of Haskell might not do it directly like this internally, it is helpful to think about it this way :)
[x | Just x <- myList]
... becomes:
do
Just x <- myList
return x
... which is:
myList >>= \(Just x) -> return x
As to your question:
What I don't understand is, how does Haskell know to call the "fail" function?
In do-notation, if a pattern binding fails (i.e. the Just x), then the fail method is called. For the above example, it would look something like this:
myList >>= \temp -> case temp of
(Just x) -> return x
_ -> fail "..."
So, every time you have a pattern-match in a monadic context that may fail, Haskell inserts a call to fail. Try it out with IO:
main = do
(1,x) <- return (0,2)
print x -- x would be 2, but the pattern match fails
The rule for desugaring a list comprehension requires an expression of the form [ e | p <- l ] (where e is an expression, p a pattern, and l a list expression) behave like
let ok p = [e]
ok _ = []
in concatMap ok l
Previous versions of Haskell had monad comprehensions, which were removed from the language because they were hard to read and redundant with the do-notation. (List comprehensions are redundant, too, but they aren't so hard to read.) I think desugaring [ e | p <- l ] as a monad (or, to be precise, as a monad with zero) would yield something like
let ok p = return e
ok _ = mzero
in l >>= ok
where mzero is from the MonadPlus class. This is very close to
do { p <- l; return e }
which desugars to
let ok p = return e
ok _ = fail "..."
in l >>= ok
When we take the List Monad, we have
return e = [e]
mzero = fail _ = []
(>>=) = flip concatMap
I.e., the 3 approaches (list comprehensions, monad comprehensions, do expressions) are equivalent for lists.
I don't think the list comprehension syntax has much to do with the fact that List ([]), or Maybe for that matter, happens to be an instance of the Monad type class.
List comprehensions are indeed compiler magic or syntax sugar, but that's possible because the compiler knows the structure of the [] data type.
Here's what the list comprehension is compiled to: (Well, I think, I didn't actually check it against the GHC)
xs = let f = \xs -> case xs of
Just x -> [x]
_ -> []
in concatMap f myList
As you can see, the compiler doesn't have to call the fail function, it can simply inline a empty list, because it knows what a list is.
Interestingly, this fact that the list comprehensions syntax 'skips' pattern match failures is used in some libraries to do generic programming. See the example in the Uniplate library.
Edit: Oh, and to answer your question, you can't call your select function with the lambda you gave it. It will indeed fail on a pattern match failure if you call it with an Nothing value.
You could pass it the f function from the code above, but than select would have the type:
select :: (a -> [b]) -> [a] -> [b]
which is perfectly fine, you can use the concatMap function internally :-)
Also, that new select now has the type of the monadic bind operator for lists (with its arguments flipped):
(>>=) :: [a] -> (a -> [b]) -> [b]
xs >>= f = concatMap f xs -- 'or as you said: concat (map f xs)