When to use foldr with a continuation as an accumulation function? - haskell

There is a technique I've seen a few times with foldr. It involves using a function in place of the accumulator in a foldr. I'm wondering when it is necessary to do this, as opposed to using an accumulator that is just a regular value.
Most people have seen this technique before when using foldr to define foldl:
{-# LANGUAGE ScopedTypeVariables #-}
myFoldl :: forall a b. (b -> a -> b) -> b -> [a] -> b
myFoldl accum nil as = foldr f id as nil
  where
    f :: a -> (b -> b) -> b -> b
    f a continuation b = continuation $ accum b a
Here, the type of the combining function f is not just a -> b -> b like normal, but a -> (b -> b) -> b -> b. It takes not only an a and b, but a continuation (b -> b) that we need to pass the b to in order to get the final b.
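To see how the continuation threads the accumulator from left to right, it can help to expand a small call by hand. The following is a minimal sketch of the same definition (with the local type signature dropped so no extension is needed); the name `example` is only for illustration.

```haskell
myFoldl :: (b -> a -> b) -> b -> [a] -> b
myFoldl accum nil xs = foldr step id xs nil
  where
    -- step returns a function that is still waiting for the accumulator so far
    step a continuation b = continuation (accum b a)

-- Expansion for a two-element list:
--   myFoldl (-) 10 [1, 2]
-- = foldr step id [1, 2] 10
-- = step 1 (step 2 id) 10
-- = step 2 id (10 - 1)
-- = id ((10 - 1) - 2)
-- = 7
example :: Int
example = myFoldl (-) 10 [1, 2]
```

The foldr itself only builds the chain of functions; the actual left-to-right work happens when that chain is applied to the initial value.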
I most recently saw an example of using this trick in the book Parallel and Concurrent Programming in Haskell. Here is a link to the source code of the example using this trick. Here is a link to the chapter of the book explaining this example.
I've taken the liberty of simplifying the source code into a similar (but shorter) example. Below is a function that takes a list of Strings, prints out whether each string's length is greater than five, then prints the full list of only the Strings that have a length greater than five:
import Text.Printf
stringsOver5 :: [String] -> IO ()
stringsOver5 strings = foldr f (print . reverse) strings []
  where
    f :: String -> ([String] -> IO ()) -> [String] -> IO ()
    f str continuation strs = do
      let isGreaterThan5 = length str > 5
      printf "Working on \"%s\", greater than 5? %s\n" str (show isGreaterThan5)
      if isGreaterThan5
        then continuation $ str : strs
        else continuation strs
Here's an example of using it from GHCi:
> stringsOver5 ["subdirectory", "bye", "cat", "function"]
Working on "subdirectory", greater than 5? True
Working on "bye", greater than 5? False
Working on "cat", greater than 5? False
Working on "function", greater than 5? True
["subdirectory","function"]
Just like in the myFoldl example, you can see that the combining function f is using the same trick.
However, it occurred to me that this stringsOver5 function could probably be written without this trick:
stringsOver5PlainFoldr :: [String] -> IO ()
stringsOver5PlainFoldr strings = foldr f (pure []) strings >>= print
  where
    f :: String -> IO [String] -> IO [String]
    f str ioStrs = do
      let isGreaterThan5 = length str > 5
      printf "Working on \"%s\", greater than 5? %s\n" str (show isGreaterThan5)
      if isGreaterThan5
        then fmap (str :) ioStrs
        else ioStrs
(Although maybe you could make the argument that IO [String] is a continuation?)
I have two questions regarding this:
Is it ever absolutely necessary to use this trick of passing a continuation to foldr, instead of using foldr with a normal value as an accumulator? Is there an example of a function that absolutely can't be written using foldr with a normal value? (Aside from foldl and functions like that, of course.)
When would I want to use this trick in my own code? Is there any example of a function that can be significantly simplified by using this trick?
Is there any sort of performance considerations to take into account when using this trick? (Or, well, when not using this trick?)

I have two questions regarding this:
For some large value of "two" :-P
Is it ever absolutely necessary to use this trick of passing a continuation to foldr? Is there an example of a function that absolutely can't be written without this trick? (Aside from foldl and functions like that, of course.)
No, never. Each foldr invocation can always be replaced by explicit recursion.
One should use foldr and other well-known library functions when they make the code simpler. When they do not, one should not shoehorn the code so that it fits the foldr pattern.
There is no shame in using plain recursion, when there is no obvious replacement.
Compare your code with this, for instance:
import Text.Printf
stringsOver5 :: [String] -> IO ()
stringsOver5 strings = go strings []
  where
    go :: [String] -> [String] -> IO ()
    go [] acc = print (reverse acc)
    go (s:ss) acc = do
      let isGreaterThan5 = length s > 5
      printf "Working on \"%s\", greater than 5? %s\n" s (show isGreaterThan5)
      if isGreaterThan5
        then go ss (s:acc)
        else go ss acc
When would I want to use this trick in my own code? Is there any example of a function that can be significantly simplified by using this trick?
In my humble opinion, almost never.
Personally, I find "calling foldr with four (or more) arguments" to be an anti-pattern in most cases. It is not much shorter than explicit recursion, and it has the potential to be much less readable.
I would argue that this "idiom" is quite puzzling to any Haskeller who has not seen it before. It is something of an acquired taste, so to speak.
Perhaps it could be a good idea to use this style when the continuation functions are meaningful on their own. E.g., when representing lists as difference lists, the concatenation of a regular list of difference lists can be quite elegant:
foldr (.) id listOfDLists []
is beautiful, even if the last [] might be puzzling at first.
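As a minimal sketch of that difference-list case: the names `DList`, `fromList` and `concatDLists` below are assumptions made for this example (they are not taken from any particular library). Each difference list is a function that prepends its elements to whatever list it is given, so the trailing `[]` is simply the "rest" handed to the composed function.

```haskell
-- A difference list: "prepend my elements to whatever comes next".
type DList a = [a] -> [a]

-- Hypothetical helper: turn an ordinary list into a difference list.
fromList :: [a] -> DList a
fromList xs = (xs ++)

-- Compose all the difference lists, then apply the result to the empty list.
concatDLists :: [DList a] -> [a]
concatDLists listOfDLists = foldr (.) id listOfDLists []

example :: [Int]
example = concatDLists [fromList [1, 2], fromList [], fromList [3], fromList [4, 5]]
```

Here `example` evaluates to `[1,2,3,4,5]`. The function being passed around is a meaningful value in its own right, which is what makes the four-argument foldr read naturally in this case.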
Is there any sort of performance considerations to take into account when using this trick? (Or, well, when not using this trick?)
Performance should be essentially the same as using explicit recursion. GHC could even generate the exact same code.
Perhaps using foldr could help GHC fire some fold/build optimization rules, but I'm unsure about the need to do that when using continuations.

Related

Haskell Pattern Matching (beginner)

I have to implement a small program in Haskell that increments/decrements a result according to the command-line arguments. For example, if we have -a on the command line the result must be 0, if -b the result must be incremented by 6, and so on. I have to do this with pattern matching.
I haven't used Haskell until now and I find it pretty hard to understand. I have this to start with:
import System.Environment
main = getArgs >>= print . (foldr apply 0) . reverse
apply :: String -> Integer -> Integer
I don't understand what is in main. What does it do, and what does the reverse at the end do? As I've read on the internet, the getArgs function gives me the values from the command line. But how can I use them? Are there equivalents of for/while in Haskell?
Also, if you have some examples or maybe could help me, I will be very thankful.
Thanks!
This is not beginner-friendly code. Several shortcuts are taken there to keep the code very compact (and in pointfree form). The code
main = getArgs >>= print . (foldr apply 0) . reverse
can be expanded as follows
main = do
  args <- getArgs
  let reversedArgs = reverse args
      result = foldr apply 0 reversedArgs
  print result
The result of this can be seen as follows. If the command line arguments are, say, args = ["A","B","C"], then we get reversedArgs = ["C","B","A"] and finally
result = apply "C" (apply "B" (apply "A" 0))
since foldr applies the function apply in such way.
Honestly, I'm unsure about why the code uses reverse and foldr for your task. I would have considered foldl (or, to improve performance, foldl') instead.
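For what it's worth, here is a small sketch showing that the reverse-then-foldr pipeline and a plain left fold process the arguments in the same order. The `apply` below is a hypothetical stand-in (the exercise leaves it undefined), and `viaFoldr` / `viaFoldl` are names made up for this example.

```haskell
import Data.List (foldl')

-- Hypothetical stand-in for the exercise's apply.
apply :: String -> Integer -> Integer
apply "succ"   n = n + 1
apply "double" n = n * 2
apply _        n = n

-- The shape used in the given main.
viaFoldr :: [String] -> Integer
viaFoldr = foldr apply 0 . reverse

-- The same computation as a strict left fold; flip adapts the argument order.
viaFoldl :: [String] -> Integer
viaFoldl = foldl' (flip apply) 0
```

With the arguments `["succ", "succ", "double"]` both give 4, i.e. (0 + 1 + 1) * 2, which confirms that the leftmost argument is applied first.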
I expect the exercise is not to touch the given code, but to expand on it to perform your function. It defines a complicated-looking main function and declares the type of a more straightforward apply, which is called but not defined.
import System.Environment -- contains the function getArgs
-- main gets arguments, does something to them using apply, and prints
main = getArgs >>= print . (foldr apply 0) . reverse
-- apply must have this type, but what it does must be elsewhere
apply :: String -> Integer -> Integer
If we concentrate on apply, we see that it receives a string and an integer, and returns an integer. This is the function we have to write, and it can't decide control flow, so we can just get to it while hoping the argument handling works out.
If we do want to figure out what main is up to, we can make a few observations. The only integer in main is 0, so the first call must get that as its second argument; later ones will be chained with whatever is returned, as that's how foldr operates. r stands for from the right, but the arguments are reversed, so this still processes arguments from the left.
So I could go ahead and just write a few apply bindings to make the program compile:
apply "succ" n = succ n
apply "double" n = n + n
apply "div3" n = n `div` 3
This added a few usable operations. It doesn't handle all possible strings.
$ runhaskell pmb.hs succ succ double double succ div3
3
$ runhaskell pmb.hs hello?
pmb.hs: pmb.hs:(5,1)-(7,26): Non-exhaustive patterns in function apply
The exercise should be about how you handle the choice of operation based on the string argument. There are several options, including distinct patterns as above, pattern guards, case and if expressions.
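As one possible sketch of those options, here is the same set of operations written with a case expression and with guards. The catch-all behaviour (leave the value unchanged for an unknown string) is an assumption made for this example; the exercise may want something else, such as an error message. `applyGuards` is a made-up name.

```haskell
-- Variant using a case expression on the operation name.
apply :: String -> Integer -> Integer
apply op n = case op of
  "succ"   -> succ n
  "double" -> n + n
  "div3"   -> n `div` 3
  _        -> n  -- assumption: unknown operations are ignored

-- The same choice written with guards.
applyGuards :: String -> Integer -> Integer
applyGuards op n
  | op == "succ"   = succ n
  | op == "double" = n + n
  | op == "div3"   = n `div` 3
  | otherwise      = n  -- assumption: unknown operations are ignored
```

Either version makes the function total, so an argument like hello? no longer crashes with a non-exhaustive pattern error.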
It can be useful to examine the used functions to see how they might fit together. Here's a look at a few of the used functions in ghci:
Prelude> import System.Environment
Prelude System.Environment> :t getArgs
getArgs :: IO [String]
Prelude System.Environment> :t (>>=)
(>>=) :: Monad m => m a -> (a -> m b) -> m b
Prelude System.Environment> :t print
print :: Show a => a -> IO ()
Prelude System.Environment> :t (.)
(.) :: (b -> c) -> (a -> b) -> a -> c
Prelude System.Environment> :t foldr
foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b
Prelude System.Environment> :t reverse
reverse :: [a] -> [a]
This shows that all the strings come out of getArgs, that it and print both operate in the IO monad (which must be the m in >>=), and that . passes the result of the right function as the argument of the left function. The type signatures alone don't tell us in what order foldr handles things, though, or what reverse does (though it can't create new values, only reorder existing ones, possibly with repetition).
As a last exercise, I'll rewrite the main function in a form that doesn't switch directions as many times:
main = print . foldl (flip apply) 0 =<< getArgs
This reads from right to left in a data flow sense and handles arguments from left to right because foldl performs left-associative folding. flip is just there to match the argument order for apply.
As suggested in the comment, hoogle is a great tool.
To find out what exactly you get from getArgs you can search for it on hoogle:
https://hackage.haskell.org/package/base-4.11.1.0/docs/System-Environment.html#v:getArgs
As you can see, it's of type IO [String].
Since I don't know how familiar you are with the IO abstractions yet, we'll just say that the right part of >>= gets those as argument.
The arguments for a call like ./a.out -a -b --asdf Hi will then be a list of strings:
["-a", "-b", "--asdf", "Hi"].
The fold + reverse in the main will then do some magic, and your apply function will be called with each string in the list and the previous return value (0 for the first invocation).
In Haskell, String is the same as [Char] with a bit of compiler sugar, so you can match on strings like you would on regular lists in your definition of apply.
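A rough sketch of what that could look like for the flags mentioned in the question. The behaviour for -a and -b follows the question's description; how to treat anything else is an assumption made here for illustration.

```haskell
apply :: String -> Integer -> Integer
apply "-a" _ = 0               -- from the question: -a sets the result to 0
apply "-b" n = n + 6           -- from the question: -b adds 6
apply ('-' : '-' : _) n = n    -- list pattern on the characters; assumption: long options are ignored
apply _ n = n                  -- assumption: everything else is ignored too
```

The last two clauses behave the same; the third is there only to show that a String can be taken apart with `:` like any other list. Folding from the left over `["-b", "-a", "-b"]` starting at 0 gives 6, because -a resets the result before the final -b.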

Point Free Style Required for Optimized Curry

Say we have a (contrived) function like so:
import Data.List (sort)
contrived :: Ord a => [a] -> [a] -> [a]
contrived a b = (sort a) ++ b
And we partially apply it to use elsewhere, eg:
map (contrived [3,2,1]) [[4],[5],[6]]
On the surface, this works as one would expect:
[[1,2,3,4],[1,2,3,5],[1,2,3,6]]
However, if we throw some traces in:
import Debug.Trace (trace)
contrived :: Ord a => [a] -> [a] -> [a]
contrived a b = (trace "sorted" $ sort a) ++ b
map (contrived $ trace "a value" [3,2,1]) [[4],[5],[6]]
We see that the first list passed into contrived is evaluated only once, but it is sorted for each item in [[4],[5],[6]]:
[sorted
a value
[1,2,3,4],sorted
[1,2,3,5],sorted
[1,2,3,6]]
Now, contrived can be rather simply translated to point-free style:
contrived :: Ord a => [a] -> [a] -> [a]
contrived a = (++) (sort a)
Which when partially applied:
map (contrived [3,2,1]) [[4],[5],[6]]
Still works as we expect:
[[1,2,3,4],[1,2,3,5],[1,2,3,6]]
But if we again add traces:
contrived :: Ord a => [a] -> [a] -> [a]
contrived a = (++) (trace "sorted" $ sort a)
map (contrived $ trace "a value" [3,2,1]) [[4],[5],[6]]
We see that now the first list passed into contrived is evaluated and sorted only once:
[sorted
a value
[1,2,3,4],[1,2,3,5],[1,2,3,6]]
Why is this so? Since the translation into pointfree style is so trivial, why can't GHC deduce that it only needs to sort a once in the first version of contrived?
Note: I know that for this rather trivial example, it's probably preferable to use pointfree style. This is a contrived example that I've simplified quite a bit. The real function that I'm having the issue with is less clear (in my opinion) when expressed in pointfree style:
realFunction a b = conditionOne && conditionTwo
  where conditionOne = map (something a) b
        conditionTwo = somethingElse a b
In pointfree style, this requires writing an ugly wrapper (both) around (&&):
realFunction a = both conditionOne conditionTwo
  where conditionOne = map (something a)
        conditionTwo = somethingElse a
both f g x = (f x) && (g x)
As an aside, I'm also not sure why the both wrapper works; the pointfree style of realFunction behaves like the pointfree style version of contrived in that the partial application is only evaluated once (ie. if something sorted a it would only do so once). It appears that since both is not pointfree, Haskell should have the same issue that it had with the non-pointfree contrived.
If I understand correctly, you are looking for this:
contrived :: Ord a => [a] -> [a] -> [a]
contrived a = let a' = sort a in \b -> a' ++ b
-- or ... in (a' ++)
If you want the sort to be computed only once, it has to be done before the \b.
You are correct in that a compiler could optimize this. This is known as the "full laziness" optimization.
If I remember correctly, GHC does not always do it because it's not always an actual optimization, in the general case. Consider the contrived example
foo :: Int -> Int -> Int
foo x y = let a = [1..x] in length a + y
When passing both arguments, the above code works in constant space: the list elements are immediately garbage collected as they are produced.
When partially applying x, the closure for foo x only requires O(1) memory, since the list is not yet generated. Code like
let f = foo 1000 in f 10 + f 20 -- (*)
still runs in constant space.
Instead, if we wrote
foo :: Int -> Int -> Int
foo x = let a = [1..x] in (length a +)
then (*) would no longer run in constant space. The first call f 10 would allocate a 1000-long list, and keep it in memory for the second call f 20.
Note that your partial application
... = (++) (sort a)
essentially means
... = let a' = sort a in \b -> a' ++ b
since argument passing involves a binding, as in let. So, the result of your sort a is kept around for all the future calls.
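The same "bind before the lambda" shape can be kept in a non-pointfree function with two conditions. This is only a sketch: `coveredBy`, `conditionOne` and `conditionTwo` are hypothetical stand-ins, since the asker's real helpers are not shown.

```haskell
import Data.List (sort)

-- Hypothetical two-condition function in the spirit of realFunction.
coveredBy :: [Int] -> [Int] -> Bool
coveredBy a =
  let sorted = sort a  -- bound outside the lambda, so shared by every call of (coveredBy a)
  in \b -> conditionOne sorted b && conditionTwo sorted b
  where
    conditionOne s b = all (`elem` s) b      -- every element of b occurs in a
    conditionTwo s b = length b <= length s  -- b is no longer than a
```

With `let f = coveredBy [3, 1, 2]`, every call of `f` reuses the same `sorted` thunk, so the sort is forced at most once. The body still names `b` explicitly, so no `both` wrapper is needed.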

Variable scope in a higher-order lambda function

In working through a solution to the 8 Queens problem, a person used the following line of code:
sameDiag try qs = any (\(colDist,q) -> abs (try - q) == colDist) $ zip [1..] qs
try is an item; qs is a list of the same items.
Can someone explain how colDist and q in the lambda function get bound to anything?
How did try and q used in the body of lambda function find their way into the same scope?
To the degree this is a Haskell idiom, what problem does this design approach help solve?
The function any is a higher-order function that takes 2 arguments:
the 1st argument is of type a -> Bool, i.e. a function from a to Bool
the 2nd argument is of type [a], i.e. a list of items of type a;
i.e. the 1st argument is a function that takes any element from the list passed as the 2nd argument, and returns a Bool based on that element. (well it can take any values of type a, not just the ones in that list, but it's quite obviously certain that any won't be invoking it with some arbitrary values of a but the ones from the list.)
You can then simplify thinking about the original snippet by doing a slight refactoring:
sameDiag :: Int -> [Int] -> Bool
sameDiag try qs = any f xs
  where
    xs = zip [1..] qs
    f = (\(colDist, q) -> abs (try - q) == colDist)
which can be transformed into
sameDiag :: Int -> [Int] -> Bool
sameDiag try qs = any f xs
  where
    xs = zip [1..] qs
    f (colDist, q) = abs (try - q) == colDist
which in turn can be transformed into
sameDiag :: Int -> [Int] -> Bool
sameDiag try qs = any f xs
  where
    xs = zip [1..] qs
    f pair = abs (try - q) == colDist where (colDist, q) = pair
(Note that sameDiag could also have a more general type Integral a => a -> [a] -> Bool rather than the current monomorphic one)
So how does the pair in f pair = ... get bound to a value? Simple: f is just a function, and whoever calls it must pass a value for the pair argument. When any is called with f as its first argument, it is any that calls f, passing the individual elements of the list xs as values of the argument pair.
And since xs is a list of pairs, each element passed to f is exactly what f expects.
EDIT: a further explanation of any to address the asker's comment:
Is this a fair synthesis? This approach to designing a higher-order function allows the invoking code to change how f behaves AND invoke the higher-order function with a list that requires additional processing prior to being used to invoke f for every element in the list. Encapsulating the list processing (in this case with zip) seems the right thing to do, but is the intent of this additional processing really clear in the original one-liner above?
There's really no additional processing done by any prior to invoking f. There is just very minimalistic bookkeeping in addition to simply iterating through the passed in list xs: invoking f on the elements during the iteration, and immediately breaking the iteration and returning True the first time f returns True for any list element.
Most of the behavior of any is "implicit" though in that it's taken care of by Haskell's lazy evaluation, basic language semantics as well as existing functions, which any is composed of (well at least my version of it below, any' — I haven't taken a look at the built-in Prelude version of any yet but I'm sure it's not much different; just probably more heavily optimised).
In fact, any is so simple that it's almost trivial to re-implement it as a one-liner at the GHCi prompt:
Prelude> let any' f xs = or (map f xs)
let's see now what GHC computes as its type:
Prelude> :t any'
any' :: (a -> Bool) -> [a] -> Bool
— same as the built-in any. So let's give it some trial runs:
Prelude> any' odd [1, 2, 3] -- any odd values in the list?
True
Prelude> any' even [1, 3] -- any even ones?
False
Prelude> let adult = (>=18)
Prelude> any' adult [17, 17, 16, 15, 17, 18]
True
— see how you can sometimes write code that almost looks like English with higher-order functions?
zip :: [a] -> [b] -> [(a,b)] takes two lists and joins them into pairs, dropping any remaining at the end.
any :: (a -> Bool) -> [a] -> Bool takes a function and a list of as, and returns True if the function returns True for at least one element, and False otherwise.
So colDist and q are the first and second elements of the pairs in the list made by zip [1..] qs, and they are bound each time any applies the lambda to one of those pairs.
q is only bound within the body of the lambda function - this is the same as with lambda calculus. Since try was bound before in the function definition, it is still available in this inner scope. If you think of lambda calculus, the term \x.\y.x+y makes sense, despite the x and the y being bound at different times.
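One way to make the scoping visible is to pull the lambda out into a named top-level function, so that try has to be passed in explicitly instead of being captured. This is just a sketch; `onDiagOf` and `pairs` are names invented for the example.

```haskell
-- The lambda's body as a top-level function: try is now an ordinary parameter.
onDiagOf :: Int -> (Int, Int) -> Bool
onDiagOf try (colDist, q) = abs (try - q) == colDist

-- Partially applying onDiagOf to try gives the one-argument function that any needs.
sameDiag :: Int -> [Int] -> Bool
sameDiag try qs = any (onDiagOf try) (zip [1 ..] qs)

-- What any iterates over for qs = [2, 5]: each queen paired with its column distance.
pairs :: [(Int, Int)]
pairs = zip [1 ..] [2, 5]
```

Here `pairs` is `[(1,2),(2,5)]`, and `sameDiag 3 [2, 5]` is True because the first pair satisfies abs (3 - 2) == 1. The lambda in the original one-liner does the same thing, with try picked up from the enclosing definition rather than passed as a parameter.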
As for the design approach, this approach is much cleaner than trying to iterate or recurse through the list manually. It seems quite clear in its intentions to me (with respect to the larger codebase it comes from).

Define a haskell function [IO a] -> IO[a]

I am doing a Haskell exercise that asks me to define a function accumulate :: [IO a] -> IO [a]
which performs a sequence of interactions and accumulates their results in a list.
What confuses me is how to express a list of IO a. As (action:actions)?
How do I write recursive code using IO?
This is my code, but there are some problems with it:
accumulate :: [IO a] -> IO [a]
accumulate (action:actions) = do
  value <- action
  list <- accumulate (action:actions)
  return (convert_to_list value list)
convert_to_list:: Num a =>a -> [a]-> [a]
convert_to_list a [] = a:[]
convert_to_list x xs = x:xs
What you are trying to implement is sequence from Control.Monad.
Just to let you find the answer instead of giving it, try searching for [IO a] -> IO [a] on hoogle (there's a Source link on the right hand side of the page when you've chosen a function).
Look at what your code does when the list of actions is empty, and see how sequence takes care of that case.
There is already such a function in Control.Monad, and it is called sequence (no, you shouldn't look at it). Note the important decision made in naming it. Technically, [IO a] says nothing about the order in which those actions should be chained together, but the name sequence implies that they are run one after another.
As for solving your problem, I'd suggest looking more closely at the types and taking the advice of #sacundim. GHCi (the interpreter of the Glasgow Haskell Compiler) has a nice way to check a type and thus understand an expression: :t (:) returns (:) :: a -> [a] -> [a], which should remind you of one of your own functions, but with a less restrictive type.
First of all, I'd look at what you have shown using a simpler example.
data MyWrap a = MyWrap a
accumulate :: [MyWrap a] -> MyWrap [a]
accumulate (action:actions) = MyWrap (convert_to_list value values) where
  MyWrap value = action -- use the pattern matching to unwrap value from action
  -- other variant is:
  -- value = case action of
  --   MyWrap x -> x
  MyWrap values = accumulate (action:actions)
I've made the same mistake as you on purpose, but with a small difference (values is a hint). As you have probably been told already, you can interpret any of your programs by inlining the appropriate function definitions, i.e. matching the left-hand side of an equation (=) and replacing it with its right-hand side. In your case you get an infinite loop. Try to solve it on this sample or on your own code and I think you'll understand (by the way, your problem might be just a typo).
Update: Don't be scared when your program fails at runtime with a message about a pattern match. Just think of the case where you call your function as accumulate [].
Possibly you are looking for the sequence function, which maps [m a] -> m [a]?
So the short version of the answer to your question is, there's (almost) nothing wrong with your code.
First of all, it typechecks:
Prelude> let accumulate (action:actions) = do { value <- action ; list <- accumulate (action:actions) ; return (value:list) }
Prelude> :t accumulate
accumulate :: (Monad m) => [m t] -> m [t]
Why did I use return (value:list) there? Look at your second function, it's just (:). Calling g
g a [] = a:[]
g a xs = a:xs
is the same as calling (:) with the same arguments. This is what's known as "eta reduction": (\x-> g x) === g (read === as "is equivalent").
So now just one problem remains with your code. You've already taken a value value <- action out of the action, so why do you reuse that action in list <- accumulate (action:actions)? Do you really have to? Right now you have, e.g.,
accumulate [a,b,c] ===
do { v1<-a; ls<-accumulate [a,b,c]; return (v1:ls) } ===
do { v1<-a; v2<-a; ls<-accumulate [a,b,c]; return (v1:v2:ls) } ===
do { v1<-a; v2<-a; v3<-a; ls<-accumulate [a,b,c]; return (v1:v2:v3:ls) } ===
.....
One simple fix and you're there.
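For reference, once you have tried it yourself, here is one way the finished function could look. This is a sketch, not the only possible answer: it recurses on the remaining actions only and adds the empty-list case mentioned in the other answers.

```haskell
accumulate :: [IO a] -> IO [a]
accumulate [] = return []              -- nothing left to run
accumulate (action : actions) = do
  value <- action                      -- run the first action
  rest <- accumulate actions           -- recurse on the remaining actions only
  return (value : rest)
```

For example, `accumulate [return 1, return 2, return 3]` yields `[1,2,3]`, the same result `sequence` gives for that list.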

Shorter way to write this code

The following pattern appears very frequently in Haskell code. Is there a shorter way to write it?
if pred x
  then Just x
  else Nothing
You're looking for mfilter in Control.Monad:
mfilter :: MonadPlus m => (a -> Bool) -> m a -> m a
-- mfilter odd (Just 1) == Just 1
-- mfilter odd (Just 2) == Nothing
Note that if the condition doesn't depend on the content of the MonadPlus, you can write instead:
"foo" <$ guard (odd 3) -- Just "foo"
"foo" <$ guard (odd 4) -- Nothing
Hm... You are looking for a combinator that takes an a, a function a -> Bool and returns a Maybe a. Stop! Hoogle time. There is no complete match, but find is quite close:
find :: (a -> Bool) -> [a] -> Maybe a
I doubt that you can actually find your function somewhere. But why not define it by yourself?
ifMaybe :: (a -> Bool) -> a -> Maybe a
ifMaybe f a | f a = Just a
ifMaybe _ _ = Nothing
You can use guard to achieve this behavior:
guard (pred x) >> return x
This is such a generally useful behavior that I've even defined ensure in my own little suite of code for one-offs (everybody has such a thing, right? ;-):
ensure p x = guard (p x) >> return x
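To compare the suggestions so far side by side, here is a small sketch with each one specialised to Maybe. The names `viaIf`, `viaMfilter`, `viaGuard` and `viaGuardFmap` are made up for this example; only `mfilter` and `guard` come from Control.Monad.

```haskell
import Control.Monad (guard, mfilter)

-- The original pattern from the question.
viaIf :: (a -> Bool) -> a -> Maybe a
viaIf p x = if p x then Just x else Nothing

-- Filter a Just by the predicate.
viaMfilter :: (a -> Bool) -> a -> Maybe a
viaMfilter p x = mfilter p (Just x)

-- guard fails with Nothing when the condition is False.
viaGuard :: (a -> Bool) -> a -> Maybe a
viaGuard p x = guard (p x) >> return x

-- The same idea, replacing guard's unit result with x.
viaGuardFmap :: (a -> Bool) -> a -> Maybe a
viaGuardFmap p x = x <$ guard (p x)
```

All four agree on every input; for instance each gives `Just 3` for the predicate `odd` and the value 3, and `Nothing` for 4.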
Use:
(?:) (5>2) (Just 5,Nothing)
from Data.Bool.HT.
f pred x = if pred x then Just x else Nothing
Given the above definition, you can simply write:
f pred x
Of course this is no different than Daniel Wagner's ensure or FUZxxl's ifMaybe. But its name is simply f, making it the shortest, and its definition is precisely the code you gave, making it the most easily proven correct. ;)
Some ghci, just for fun
ghci> let f pred x = if pred x then Just x else Nothing
ghci> f (5>) 2
Just 2
ghci> f (5>) 6
Nothing
If you couldn't tell, this isn't a very serious answer. The others are a bit more insightful, but I couldn't resist the tongue-in-cheek response to "make this code shorter".
Usually I'm a big fan of very generic code, but I actually find this exact function useful often enough, specialized to Maybe, that I keep it around instead of using guard, mfilter, and the like.
The name I use for it is justIf, and I'd typically use it for doing things like this:
∀x. x ⊢ import Data.List
∀x. x ⊢ unfoldr (justIf (not . null . snd) . splitAt 3) [1..11]
[[1,2,3],[4,5,6],[7,8,9]]
Basically, stuff where some sort of element-wise filtering or checking needs to be done in a compound expression, so Maybe is used to indicate the result of the predicate.
For the specialized version like this, there really isn't much you can do to make it shorter. It's already pretty simple. There's a fine line between being concise, and just golfing your code for character count, and for something this simple I wouldn't really worry about trying to "improve" it...
