Why doesn't this function bottom out - haskell

xs = [1, 2] ++ undefined
length $ take 2 $ take 4 xs
My brain is reading this as
length (take 2 (take 4 xs))
If everything in the parentheses is evaluated first, then why does this not error out with Exception: Prelude.undefined?
The answer given by the book is that take 2 only takes the first two elements, but shouldn't the take 4 take precedence here and get evaluated first?

If everything in the parentheses is evaluated first, then why does this not error out with Exception: Prelude.undefined?
Because Haskell is lazy: it does not evaluate anything until the result is needed.
length needs access to the structure of the list, but not to its elements (so elements that are unevaluated computations stay unevaluated). It therefore asks take 2 (take 4 xs) for the first cell of the list, then the second, then the third, and so on.
take is lazy as well: it does nothing until it is asked for a result. When length asks for the first element, take 2 asks take 4 for it, which in turn asks xs; the same happens for the second element. When length asks for a third, take 2 has already handed out two elements, so it stops and returns the empty list without consulting take 4 or xs again. The undefined tail is never inspected.
Haskell thus does not use the evaluation strategy of Python or Java, where arguments are evaluated innermost first and left to right before a function is called.
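As a quick check of the claim that length never looks at the elements or at the undefined tail, here is a minimal sketch. The names lazyLength and spineOnly are made up for illustration.

```haskell
-- xs has two defined cells followed by an undefined tail.
xs :: [Int]
xs = [1, 2] ++ undefined

-- take 2 stops after two cells, so the undefined tail is never demanded.
lazyLength :: Int
lazyLength = length (take 2 (take 4 xs))

-- length only walks the spine; the elements themselves stay unevaluated.
spineOnly :: Int
spineOnly = length [undefined, undefined :: Int]

main :: IO ()
main = do
  print lazyLength  -- 2
  print spineOnly   -- 2
```

Printing take 4 xs itself would still fail, because show has to walk past the second cell.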

Let's calculate! What is [1, 2] ++ undefined? Well, to know that, we first need to know what [1, 2] is. It's really just syntactic sugar for 1:2:[].
Next, we need to know what ++ is. The source code tells us:
(++) :: [a] -> [a] -> [a]
[] ++ ys = ys
(x : xs) ++ ys = x : xs ++ ys
Okay. So let's pattern match.
(1:2:[]) ++ undefined
= -- the second pattern matches
1 : (2:[]) ++ undefined
= -- the second pattern matches
1 : 2 : [] ++ undefined
= -- the first pattern matches
1 : 2 : undefined
Okay, so we've figured out that
xs = 1 : 2 : undefined
What about take 4 xs? Well, we first have to look at what take does. The real source code is somewhat complicated for performance reasons, but we can use this simpler, equivalent, version:
take :: Int -> [a] -> [a]
take n _ | n <= 0 = []
take _ [] = []
take n (a : as) = a : take (n - 1) as
So
take 4 xs
= -- value of xs
take 4 (1:2:undefined)
= -- 4 is positive, so the first guard falls through.
-- the second pattern fails (the list isn't empty)
-- the third pattern succeeds
1 : take 3 (2:undefined)
= -- same analysis
1 : 2 : take 2 undefined
= -- 2 is positive, so the guard fails.
-- the pattern match on undefined produces undefined
1 : 2 : undefined
That's just the same as xs!
Next we calculate
take 2 (take 4 xs)
= -- as shown above
take 2 (1 : 2 : undefined)
= -- following the same sequence above
1 : take 1 (2 : undefined)
= -- same same
1 : 2 : take 0 undefined
= -- this time the guard succeeds!
1 : 2 : []
= -- reinstalling syntactic sugar
[1, 2]
So now you have a fully defined list of two elements, and taking its length poses no special challenge at all.
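The calculation can be replayed with the simplified definitions. This is only a sketch: myAppend and myTake are hypothetical names standing in for (++) and take, chosen so that they do not clash with the Prelude.

```haskell
-- Simplified stand-in for (++), following the two equations in the text.
myAppend :: [a] -> [a] -> [a]
myAppend [] ys = ys
myAppend (x : xs) ys = x : myAppend xs ys

-- Simplified stand-in for take, following the three equations in the text.
myTake :: Int -> [a] -> [a]
myTake n _ | n <= 0 = []
myTake _ [] = []
myTake n (a : as) = a : myTake (n - 1) as

xs :: [Int]
xs = myAppend [1, 2] undefined

-- The outer myTake reaches its n <= 0 guard before the undefined tail is matched.
result :: [Int]
result = myTake 2 (myTake 4 xs)

main :: IO ()
main = do
  print result           -- [1,2]
  print (length result)  -- 2
```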
Note that the above is a calculation. It does not represent the actual sequence of operations that occur when the program runs. But ... that doesn't actually matter. One of the great things about Haskell is that in your analysis, you can "substitute equals for equals" whenever you want. It won't affect the result. You do have to be a bit careful, however, not to let it confuse you. For example, suppose you want to evaluate null (repeat ()). You'd start out something like this:
repeat () = () : repeat ()
What do you do next? Well, one option is to continue to expand repeat ():
repeat () = () : () : repeat ()
It's totally valid. But if you keep going on that way forever, you'll never get to an answer. At some point, you have to look at the rest of the problem.
null (() : repeat ())
is immediately False. So just because you can find an infinite sequence of reductions doesn't mean there's an infinite loop. Indeed, the way Haskell works, there's only an infinite loop if every possible sequence of reductions is infinite.
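A small sketch of the same point in runnable form; the names nullOfInfinite and firstThree are made up for illustration.

```haskell
-- repeat () is infinite, but null only inspects the first constructor.
nullOfInfinite :: Bool
nullOfInfinite = null (repeat ())

-- Taking a finite prefix of the infinite list also terminates.
firstThree :: [()]
firstThree = take 3 (repeat ())

main :: IO ()
main = do
  print nullOfInfinite  -- False
  print firstThree      -- [(),(),()]
```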

Because you're never forcing the thunk that take 4 xs leaves at the end of the list. Consider the following code in the REPL:
Prelude> xs = [1, 2] ++ undefined
Prelude> take 4 xs
[1,2*** Exception: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries\base\GHC\Err.hs:79:14 in base:GHC.Err
undefined, called at <interactive>:1:16 in interactive:Ghci1
This happens because of the implicit show in GHCi. But what if we don't use a bare expression?
Prelude> xs = [1, 2] ++ undefined
Prelude> ys = take 4 xs
Hey look, no error! Let's keep going:
Prelude> xs = [1, 2] ++ undefined
Prelude> ys = take 4 xs
Prelude> zs = take 2 ys
Prelude> length zs
2

Related

How to create an Infinite List in Haskell where the new value consumes all the previous values

If I create an infinite list like this:
let t xs = xs ++ [sum(xs)]
let xs = [1,2] : map (t) xs
take 10 xs
I will get this result:
[
[1,2],
[1,2,3],
[1,2,3,6],
[1,2,3,6,12],
[1,2,3,6,12,24],
[1,2,3,6,12,24,48],
[1,2,3,6,12,24,48,96],
[1,2,3,6,12,24,48,96,192],
[1,2,3,6,12,24,48,96,192,384],
[1,2,3,6,12,24,48,96,192,384,768]
]
This is pretty close to what I am trying to do.
The current code uses the last value to define the next one. But instead of a list of lists, I would like to know a way to build a single infinite list that uses all the previous values to define each new one.
So the output would be only
[1,2,3,6,12,24,48,96,192,384,768,1536,...]
I have the definition of the first element [1].
I have the rule for getting a new element: sum all the previous elements.
But I could not express this in Haskell to create the infinite list.
Using my current code, I can get the list I need with:
xs !! 10
> [1,2,3,6,12,24,48,96,192,384,768,1536]
But it seems to me that this could be done in a more efficient way.
Some Notes
I understand that for this particular example, which was intentionally oversimplified, we could write a function that uses only the last value to define the next.
But I am asking whether it is possible to read all the previous values inside an infinite list definition.
I am sorry if the example I used created some confusion.
Here is another example, which cannot be solved by reading only the last value:
isMultipleByList :: Integer -> [Integer] -> Bool
isMultipleByList _ [] = False
isMultipleByList v (x:xs) = if (mod v x == 0)
                            then True
                            else (isMultipleByList v xs)
nextNotMultipleLoop :: Integer -> Integer -> [Integer] -> Integer
nextNotMultipleLoop step v xs = if not (isMultipleByList v xs)
                                then v
                                else nextNotMultipleLoop step (v + step) xs
nextNotMultiple :: [Integer] -> Integer
nextNotMultiple xs = if xs == [2]
                     then nextNotMultipleLoop 1 (maximum xs) xs
                     else nextNotMultipleLoop 2 (maximum xs) xs
addNextNotMultiple xs = xs ++ [nextNotMultiple xs]
infinitePrimeList = [2] : map (addNextNotMultiple) infinitePrimeList
take 10 infinitePrimeList
[
[2],
[2,3],
[2,3,5],
[2,3,5,7],
[2,3,5,7,11],
[2,3,5,7,11,13],
[2,3,5,7,11,13,17],
[2,3,5,7,11,13,17,19],
[2,3,5,7,11,13,17,19,23],
[2,3,5,7,11,13,17,19,23,29]
]
infinitePrimeList !! 10
[2,3,5,7,11,13,17,19,23,29,31]

You can think of it this way:
You want to create a list (call it a) that starts with [1,2]:
a = [1,2] ++ ???
... and has this property: each subsequent element of a is the sum of all previous elements of a. So you can write
scanl1 (+) a
and get a new list in which the nth element is the sum of the first n elements of a. So it is [1, 3, 6, ...]. All you need to do is take all of its elements except the first:
tail (scanl1 (+) a)
So you can define a as:
a = [1,2] ++ tail (scanl1 (+) a)
You can apply the same way of thinking to other problems where a list is defined through its own elements.
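Written out as a complete program, the self-referential definition looks like the sketch below. The Integer type annotation is an assumption added here so that the values do not overflow.

```haskell
-- a refers to itself: every element after the first two is a running sum of a.
a :: [Integer]
a = [1, 2] ++ tail (scanl1 (+) a)

main :: IO ()
main = print (take 12 a)  -- [1,2,3,6,12,24,48,96,192,384,768,1536]
```

This works because scanl1 can emit each running sum as soon as the corresponding element of a is available, and the first two elements are given up front.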
If we already had the final result, calculating the list of previous elements for a given element would be easy, a simple application of the inits function.
Let's assume we already have the final result xs, and use it to compute xs itself:
import Data.List (inits)
main :: IO ()
main = do
  let is = drop 2 $ inits xs
      xs = 1 : 2 : map sum is
  print $ take 10 xs
This produces the list
[1,2,3,6,12,24,48,96,192,384]
(Note: this is less efficient than SergeyKuz1001's solution, because the sum is re-calculated each time.)

unfoldr adapts nicely to many "create a list from initial conditions" problems, so I think it is worth mentioning.
It is a little less elegant for this specific case, but it shows how unfoldr can be used.
import Data.List
nextVal as = Just (s, as ++ [s])
  where s = sum as
initList = [1,2]
myList = initList ++ unfoldr nextVal initList
main = putStrLn . show . take 12 $ myList
Yielding
[1,2,3,6,12,24,48,96,192,384,768,1536]
in the end.
As pointed out in the comments, one should think a little when using unfoldr. The way I've written it above, the code mimics the code in the original question. However, this means that the accumulator is updated with as ++ [s], constructing a new list at every iteration. A quick run at https://repl.it/languages/haskell suggests that it becomes quite memory intensive and slow (4.5 seconds to access the 2000th element of myList).
Simply changing the accumulator update to s:as produced a 7-fold speed increase, since the existing list can be reused as the accumulator at every step. However, the accumulator list is now in reverse order. For sum this makes no difference, but if the order of the list matters, one must think a little bit extra.
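A minimal sketch of the prepending variant described above, assuming the accumulator simply holds the elements produced so far in reverse order. The explicit type signatures are additions for clarity.

```haskell
import Data.List (unfoldr)

initList :: [Integer]
initList = [1, 2]

-- The accumulator holds every element produced so far, newest first.
-- Prepending with (:) reuses the existing list instead of copying it.
nextVal :: [Integer] -> Maybe (Integer, [Integer])
nextVal acc = Just (s, s : acc)
  where
    s = sum acc

myList :: [Integer]
myList = initList ++ unfoldr nextVal (reverse initList)

main :: IO ()
main = print (take 12 myList)  -- [1,2,3,6,12,24,48,96,192,384,768,1536]
```

Each step still recomputes sum over the whole accumulator, so this only removes the cost of rebuilding the list, not the cost of summing it.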
You could define it like this:
xs = 1:2:iterate (*2) 3
For example:
Prelude> take 12 xs
[1,2,3,6,12,24,48,96,192,384,768,1536]
So here's my take. I tried not to create O(n) extra lists.
import Data.List (genericLength)
explode :: Integral i => (i -> [a] -> a) -> [a] -> [a]
explode fn init = as where
  as = init ++ [fn i as | i <- [l, l+1..]]
  l = genericLength init
This convenience function does create additional lists (by take). Hopefully they can be optimised away by the compiler.
explode' f = explode (\x as -> f $ take x as)
Usage examples:
myList = explode' sum [1,2]
sum' 0 xs = 0
sum' n (x:xs) = x + sum' (n-1) xs
myList2 = explode sum' [1,2]
In my tests there's little performance difference between the two functions. explode' is often slightly better.

The solution from @LudvigH is very nice and clear. But it was not faster.
I am still working on the benchmark to compare the other options.
For now, this is the best solution that I could find:
-------------------------------------------------------------------------------------
-- # infinite sum of the previous using fuse
-------------------------------------------------------------------------------------
recursiveSum xs = [nextValue] ++ (recursiveSum (nextList)) where
  nextValue = sum(xs)
  nextList = xs ++ [nextValue]
initialSumValues = [1]
infiniteSumFuse = initialSumValues ++ recursiveSum initialSumValues
-------------------------------------------------------------------------------------
-- # infinite prime list using fuse
-------------------------------------------------------------------------------------
-- calculate the current value based on the current list
-- call the same function with the new combined value
recursivePrimeList xs = [nextValue] ++ (recursivePrimeList (nextList)) where
  nextValue = nextNotMultiple(xs)
  nextList = xs ++ [nextValue]
initialPrimes = [2]
infiniteFusePrimeList = initialPrimes ++ recursivePrimeList initialPrimes
This approach is fast and makes good use of many cores.
Maybe there is some faster solution, but I decided to post this to share my current progress on this subject so far.
In general, define
xs = x1 : zipWith f xs (inits xs)
Then xs == x1 : f x1 [] : f x2 [x1] : f x3 [x1, x2] : ..., and so on.
Here's one example of using inits in the context of computing the infinite list of primes, which pairs them up as
ps = 2 : f p1 [p1] : f p2 [p1,p2] : f p3 [p1,p2,p3] : ...
(in the definition of primes5 there).
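To make the zipWith/inits scheme concrete, here is a small sketch. The rule step is a hypothetical choice made only for this example (the current element plus the sum of everything before it), and 1 plays the role of x1.

```haskell
import Data.List (inits)

-- Hypothetical rule: combine the current element with all earlier elements.
step :: Integer -> [Integer] -> Integer
step x earlier = x + sum earlier

-- Each element is paired with the list of elements that precede it.
xs :: [Integer]
xs = 1 : zipWith step xs (inits xs)

main :: IO ()
main = print (take 8 xs)  -- [1,1,2,4,8,16,32,64]
```

The definition is productive because inits yields each prefix before it inspects the next cell of its argument.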

Why does foldright work for infinite lists?

I was under the impression that foldright starts from the end of a list and works backwards (this is how I imagined what right-associative means). So I am confused that the following works for infinite lists.
I have a function find:
find :: (a -> Bool) -> List a -> Optional a
find p = foldRight (\c a -> if p c then Full c else a) Empty
Note that the following works:
>> find (const True) infinity
Full 0
I did do some searching and found this post: How do you know when to use fold-left and when to use fold-right?
Unfortunately, the accepted answer is not particularly helpful because the example for right-associative operations is:
A x (B x (C x D))
Which still means it needs to execute the right-most thing first.
I was wondering if anyone can clear this up for me, thanks.
Let's start with a function:
>>> let check x y = if x > 10 then x else y
>>> check 100 5
100
>>> check 0 5
5
check takes two arguments, but might not use its second argument. Since Haskell is lazy, this means that the second argument may never be evaluated:
>>> check 20 (error "fire the missiles!")
20
This laziness lets us skip a possibly infinite amount of work:
>>> check 30 (sum [1..])
30
Now let's step through foldr check 0 [0..] using equational reasoning:
foldr check 0 [0..]
= check 0 (foldr check 0 [1..]) -- by def'n of foldr
= foldr check 0 [1..] -- by def'n of check
= check 1 (foldr check 0 [2..]) -- by def'n of foldr
= foldr check 0 [2..] -- by def'n of check
-- ...
= foldr check 0 [10..]
= check 10 (foldr check 0 [11..]) -- by def'n of foldr
= foldr check 0 [11..] -- by def'n of check
= check 11 (foldr check 0 [12..]) -- by def'n of foldr
= 11 -- by def'n of check
Note how laziness forces us to evaluate from the top-down, seeing how (and if) the outer-most function call uses its arguments, rather than from the bottom-up (evaluating all arguments before passing them to a function), as strict languages do.
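The same reasoning can be checked by running it. In the sketch below, find' is a hypothetical stand-in for the question's find, written with the Prelude's Maybe and foldr instead of the course-specific Optional and foldRight.

```haskell
check :: Int -> Int -> Int
check x y = if x > 10 then x else y

-- Stand-in for the question's find: the accumulator a is only
-- demanded when the predicate fails for the current element.
find' :: (a -> Bool) -> [a] -> Maybe a
find' p = foldr (\c a -> if p c then Just c else a) Nothing

main :: IO ()
main = do
  print (foldr check 0 [0 ..])              -- 11
  print (find' (const True) [0 :: Int ..])  -- Just 0
  print (find' (> 100) [0 :: Int ..])       -- Just 101
```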
It works because of lazy evaluation. Let’s take a really simple example.
import Data.Char (toUpper)
main :: IO ()
main = interact (foldr capitalized []) where
  capitalized :: Char -> String -> String
  capitalized x xs = (toUpper x):xs
Run this program interactively and see what happens. The input is an infinite (or at least indefinite) list of characters read from standard input.
This works because each element of the output list gets produced lazily, when it is needed. So the tail is not produced first: it’s only computed if and when it’s needed. Until then, it’s deferred, and we can use the partial results. The partial result for 'h':xs is 'H':(foldr capitalized [] xs). The partial result for 'h':'e':'l':'l':'o':',':' ':'w':'o':'r':'l':'d':'!':'\n':xs is a string we can output before we proceed to the tail xs.
Now see what happens if you try this with foldl.
This works for any data structure that generates a useful prefix. For a reduction operation that produces a single value, and no useful intermediate results, a strict left fold (Data.List.foldl') is usually the better choice.
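A non-interactive sketch of the two situations: cycle "hello" stands in for the unbounded input (an assumption made so the example needs no stdin), and a plain sum stands in for a reduction with no useful intermediate results.

```haskell
import Data.Char (toUpper)
import Data.List (foldl')

capitalized :: Char -> String -> String
capitalized x rest = toUpper x : rest

-- foldr yields a usable prefix even though the input never ends.
shout :: String
shout = take 5 (foldr capitalized [] (cycle "hello"))

-- A single-value reduction over a finite list: a strict left fold fits better.
total :: Int
total = foldl' (+) 0 [1 .. 100]

main :: IO ()
main = do
  putStrLn shout  -- HELLO
  print total     -- 5050
```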
Your objection proves too much. If it was valid, no infinite lists at all would be possible! An infinite list is constructed using (:). Its second argument, the tail of the list, is also an infinite list, and would have to be evaluated first. This recursively doesn't get us anywhere.

Pairs of elements from list

I want to convert [1,2,3,4] to [[1 2] [2 3] [3 4]] or [(1 2) (2 3) (3 4)]. In Clojure I have (partition 2 1 [1,2,3,4]). How can I do it in Haskell? I suspect there is such a function in the standard API but I can't find it.
The standard trick for this is to zip the list with its own tail:
> let xs = [1,2,3,4] in zip xs (tail xs)
[(1,2),(2,3),(3,4)]
To see why this works, line up the list and its tail visually.
xs      = 1 : 2 : 3 : 4 : []
tail xs = 2 : 3 : 4 : []
and note that zip is making a tuple out of each column.
There are two more subtle reasons why this always does the right thing:
1. zip stops when either list runs out of elements. That makes sense here since we can't have an "incomplete pair" at the end, and it also ensures that we get no pairs from a single-element list.
2. When xs is empty, one might expect tail xs to throw an exception. However, because zip checks its first argument first, when it sees that it's the empty list, the second argument is never evaluated.
Everything above also holds true for zipWith, so you can use the same method whenever you need to apply a function pairwise to adjacent elements.
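Both points can be seen in a short sketch. The names pairs and differences are made up for illustration; differences shows the zipWith variant.

```haskell
-- Adjacent pairs: xs goes first, so tail xs is never forced for an empty list.
pairs :: [a] -> [(a, a)]
pairs xs = zip xs (tail xs)

-- Pairwise differences with zipWith; subtract x y is y - x,
-- which keeps xs as the first (checked) argument.
differences :: [Int] -> [Int]
differences xs = zipWith subtract xs (tail xs)

main :: IO ()
main = do
  print (pairs [1, 2, 3, 4 :: Int])  -- [(1,2),(2,3),(3,4)]
  print (pairs ([] :: [Int]))        -- []
  print (pairs [1 :: Int])           -- []
  print (differences [1, 4, 9, 16])  -- [3,5,7]
```

Swapping the arguments (tail xs first) would make the empty-list case fail, since tail [] would then be forced.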
For a generic solution like Clojure's partition, there is nothing in the standard libraries. However, you can try something like this:
partition' :: Int -> Int -> [a] -> [[a]]
partition' size offset
  | size <= 0 = error "partition': size must be positive"
  | offset <= 0 = error "partition': offset must be positive"
  | otherwise = loop
  where
    loop :: [a] -> [[a]]
    loop xs = case splitAt size xs of
      -- If the second part is empty, we're at the end. But we might
      -- have gotten less than we asked for, hence the check.
      (ys, []) -> if length ys == size then [ys] else []
      (ys, _ ) -> ys : loop (drop offset xs)
Just to throw another answer out there using a different approach:
For n=2 you simply want to zip the list with its tail. For n=3 you want to zip the list with its tail and with the tail of its tail. This pattern continues, so all we have to do is generalise it, zipping n lists at once with ZipList:
import Control.Applicative (ZipList (..))
partition n = getZipList . traverse ZipList . take n . iterate tail
But this only works for an offset of 1. To generalise the offsets we just have to look at the generated list. It will always have the form:
[[1..something],[2..something+1],..]
So all that is left to do is select every offset-th element and we should be fine. I shamelessly stole this version from @ertes from this question:
everyNth :: Int -> [a] -> [a]
everyNth n = map head . takeWhile (not . null) . iterate (drop n)
The entire function now becomes:
partition size offset = everyNth offset . getZipList . traverse ZipList . take size . iterate tail
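A runnable sketch of the "zip the list with its successive tails" idea. It is written with ZipList from Control.Applicative so that the tails are zipped position by position and truncated to the shortest one; the name windows is hypothetical, chosen to avoid clashing with Data.List.partition.

```haskell
import Control.Applicative (ZipList (..))

-- Keep every n-th element, starting with the first.
everyNth :: Int -> [a] -> [a]
everyNth n = map head . takeWhile (not . null) . iterate (drop n)

-- traverse ZipList zips the first `size` tails together, so each result
-- holds `size` consecutive elements; everyNth then applies the offset.
windows :: Int -> Int -> [a] -> [[a]]
windows size offset =
  everyNth offset . getZipList . traverse ZipList . take size . iterate tail

main :: IO ()
main = do
  print (windows 2 1 [1, 2, 3, 4 :: Int])  -- [[1,2],[2,3],[3,4]]
  print (windows 2 2 [1 .. 6 :: Int])      -- [[1,2],[3,4],[5,6]]
  print (windows 3 1 [1 .. 5 :: Int])      -- [[1,2,3],[2,3,4],[3,4,5]]
```

This sketch assumes a positive size and offset; it does no argument checking.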

Sometimes it is best to roll your own. Recursive functions are what give Lisp its power and appeal. Haskell tries to discourage them, but too often a solution is best achieved with a recursive function. They are often quite simple, as is this one to produce pairs.
Haskell pattern matching reduces code. This could easily be changed to produce (a,b), (c,d), (e,f) by changing only the pattern to (x:y:yys).
> prs (x:yys@(y:_)) = (x,y):prs yys; prs _ = []
> prs "abcdefg"
[('a','b'),('b','c'),('c','d'),('d','e'),('e','f'),('f','g')]
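As a source-file sketch of both patterns mentioned above: the catch-all base cases and the name prs2 are additions made here so that the functions are total.

```haskell
-- Overlapping pairs: the as-pattern keeps y at the head of the remaining list.
prs :: [a] -> [(a, a)]
prs (x : rest@(y : _)) = (x, y) : prs rest
prs _ = []

-- Non-overlapping pairs: both x and y are consumed at each step.
prs2 :: [a] -> [(a, a)]
prs2 (x : y : rest) = (x, y) : prs2 rest
prs2 _ = []

main :: IO ()
main = do
  print (prs "abcd")     -- [('a','b'),('b','c'),('c','d')]
  print (prs2 "abcdef")  -- [('a','b'),('c','d'),('e','f')]
```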

Understanding recursion in Haskell

I am having a very difficult time understanding how to think about problems in a recursive way and solve them using Haskell. I have spent hours reading, trying to wrap my head around recursion. The explanation I most often get from people who understand it is never clear, and is something like "you pass a function, the name of the function as the argument, the function will then execute, solving a small piece of the problem and calling the function again and again until you hit the base case".
Can someone please be kind enough to walk me through the thought process of these three simple recursive functions? Not so much their functionality, but how the code ends up executing and solving the problem recursively.
Many thanks in advance!
Function 1
maximum' [] = error "maximum of empty list"
maximum' [x] = x
maximum' (x:rest) = max x (maximum' rest)
Function 2
take' n _
  | n <= 0 = []
take' _ [] = []
take' n (x:xs) = x : take' (n-1) xs
Function 3
reverse' [] = []
reverse' (x:xs) = reverse' xs ++ [x]
Guidelines
When trying to understand recursion, you may find it easier to think about how the algorithm behaves for a given input. It's easy to get hung up on what the execution path looks like, so instead ask yourself questions like:
What happens if I pass an empty list?
What happens if I pass a list with one item?
What happens if I pass a list with many items?
Or, for recursion on numbers:
What happens if I pass a negative number?
What happens if I pass 0?
What happens if I pass a number greater than 0?
The structure of a recursive algorithm is often just a matter of covering the above cases. So let's see how your algorithms behave to get a feel for this approach:
maximum'
maximum' [] = error
maximum' [1] = 1
maximum' [1, 2] = 2
As you can see, the only interesting behaviour is #3. The others just ensure the algorithm terminates. Looking at the definition,
maximum' (x:rest) = max x (maximum' rest)
Calling this with [1, 2] expands to:
maximum' [1, 2] ~ max 1 (maximum' [2])
                ~ max 1 2
maximum' works by returning a number, which this case knows how to process recursively using max. Let's look at one more case:
maximum' [0, 1, 2] ~ max 0 (maximum' [1, 2])
                   ~ max 0 (max 1 2)
                   ~ max 0 2
You can see how, for this input, the recursive call to maximum' in the first line is exactly the same as the previous example.
reverse'
reverse' [] = []
reverse' [1] = [1]
reverse' [1, 2] = [2, 1]
reverse' works by taking the head of the given list and sticking it at the end of the reversed tail. For an empty list, this involves no work, so that's the base case. So given the definition:
reverse' (x:xs) = reverse' xs ++ [x]
Let's do some substitution. Given that [x] is equivalent to x:[], you can see there are actually two values to deal with:
reverse' [1] ~ reverse' [] ++ [1]
             ~ [] ++ [1]
             ~ [1]
Easy enough. And for a two-element list:
reverse' [0, 1] ~ reverse' [1] ++ [0]
                ~ [] ++ [1] ++ [0]
                ~ [1, 0]
take'
This function introduces recursion over an integer argument as well as lists, so there are two base cases.
What happens if we take 0-or-less items? We don't need to take any items, so just return the empty list.
take' n _ | n <= 0 = []
take' (-1) [1] = []
take' 0 [1] = []
What happens if we pass an empty list? There are no more items to take, so stop the recursion.
take' _ [] = []
take' 1 [] = []
take' (-1) [] = []
The meat of the algorithm is really about walking down the list, pulling apart the input list and decrementing the number of items to take until either of the above base cases stop the process.
take' n (x:xs) = x : take' (n-1) xs
So, in the case where the numeric base case is satisfied first, we stop before getting to the end of the list.
take' 1 [9, 8] ~ 9 : take' (1-1) [8]
               ~ 9 : take' 0 [8]
               ~ 9 : []
               ~ [9]
In the case where the list base case is satisfied first, we run out of items before the counter reaches 0, and just return what we can.
take' 3 [9, 8] ~ 9 : take' (3-1) [8]
               ~ 9 : take' 2 [8]
               ~ 9 : 8 : take' 1 []
               ~ 9 : 8 : []
               ~ [9, 8]
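The hand expansions above can be checked against the three definitions from the question. In this sketch the type signatures are additions; the inputs are the ones traced above.

```haskell
maximum' :: Ord a => [a] -> a
maximum' [] = error "maximum of empty list"
maximum' [x] = x
maximum' (x : rest) = max x (maximum' rest)

take' :: Int -> [a] -> [a]
take' n _
  | n <= 0 = []
take' _ [] = []
take' n (x : xs) = x : take' (n - 1) xs

reverse' :: [a] -> [a]
reverse' [] = []
reverse' (x : xs) = reverse' xs ++ [x]

main :: IO ()
main = do
  print (maximum' [0, 1, 2 :: Int])  -- 2
  print (take' 1 [9, 8 :: Int])      -- [9]
  print (take' 3 [9, 8 :: Int])      -- [9,8]
  print (take' (-1) [1 :: Int])      -- []
  print (reverse' [0, 1 :: Int])     -- [1,0]
```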

Recursion is a strategy for applying a certain function to every element of a collection. You apply the function to the first element, then you repeat the process on the remaining elements.
Let's take an example: you want to double all the integers inside a list. First, you think about which function to use. The answer is (2*). Now you have to apply this function recursively. Let's call the recursive function apply_rec, so you have:
apply_rec (x:xs) = (2*x)
But this only changes the first element, and you want to change all the elements of the list. So you have to apply apply_rec to the remaining elements as well. Thus:
apply_rec (x:xs) = (2*x) : (apply_rec xs)
Now you have a different problem. When does apply_rec end? It ends when you reach the end of the list, in other words [], so you need to cover this case as well.
apply_rec [] = []
apply_rec (x:xs) = (2*x) : (apply_rec xs)
When you reach the end you do not want to apply any function, hence apply_rec should "return" [].
Let's see the behavior of this function on the list [1,2,3].
apply_rec [1,2,3] = (2 * 1) : (apply_rec [2,3])
                  = 2 : ((2 * 2) : (apply_rec [3]))
                  = 2 : (4 : ((2 * 3) : (apply_rec [])))
                  = 2 : (4 : (6 : []))
resulting in [2,4,6].
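For reference, a complete version of the doubling function as a sketch. It is named applyRec here and given a type signature, both of which are additions; it behaves the same as map (2 *).

```haskell
-- Doubles every element: the base case ends the recursion,
-- the recursive case handles the head and recurses on the tail.
applyRec :: [Int] -> [Int]
applyRec [] = []
applyRec (x : xs) = (2 * x) : applyRec xs

main :: IO ()
main = do
  print (applyRec [1, 2, 3])                       -- [2,4,6]
  print (applyRec [1, 2, 3] == map (2 *) [1, 2, 3])  -- True
```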
Since you are probably not yet very familiar with recursion, the best thing is to start with simpler examples than the ones you have presented. Also take a look at "learn recursion" and at "Haskell Tutorial 3 - recursion".
You ask about "thought process", presumably of a programmer, not a computer, right? So here's my two cents:
The way to think about writing some function g with recursion is, imagine that you have already written that function. That's all.
That means you get to use it whenever you need it, and it "will do" whatever it is supposed to be doing. So just write down what that is - formulate the laws that it must obey, write down whatever you know about it. Say something about it.
Now, just saying g x = g x is not saying anything. Of course it is true, but it is a meaningless tautology. If we say g x = g (x+2) it is no longer a tautology, but meaningless anyway. We need to say something more sensible. For example,
g :: Integer -> Bool
g x | x<=0 = False
g 1 = True
g 2 = True
here we said something. Also,
g x = x == y+z where
  y = head [y | y<-[x-1,x-2..], g y] -- biggest y<x that g y
  z = head [z | z<-[y-1,y-2..], g z] -- biggest z<y that g z
Have we said everything we had to say about x? Whether we did or didn't, we said it about any x there can be. And that concludes our recursive definition - as soon as all the possibilities are exhausted, we're done.
But what about termination? We want to get some result from our function, we want it to finish its work. That means, when we use it to calculate x, we need to make sure we use it recursively with some y that's defined "before" x, that is "closer" to one of the simplest defined cases we have.
And here, we did. Now we can marvel at our handiwork, with
filter g [0..]
Last thing is, in order to understand a definition, don't try to retrace its steps. Just read the equations themselves. If we were presented with the above definition for g, we'd read it simply as: g is a Boolean function of a number which is True for 1, and 2, and for any x > 2 that is a sum of its two preceding g numbers.
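Putting the equations for g together gives the sketch below. The list-comprehension variable is renamed to c to avoid shadowing, and the search range is capped at 13 because, without memoization, this definition re-evaluates g many times and gets slow quickly; both choices are assumptions of this example.

```haskell
g :: Integer -> Bool
g x | x <= 0 = False
g 1 = True
g 2 = True
g x = x == y + z
  where
    y = head [c | c <- [x - 1, x - 2 ..], g c]  -- biggest y < x with g y
    z = head [c | c <- [y - 1, y - 2 ..], g c]  -- biggest z < y with g z

-- The numbers up to 13 for which g holds.
gNumbers :: [Integer]
gNumbers = filter g [0 .. 13]

main :: IO ()
main = print gNumbers  -- [1,2,3,5,8,13]
```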

Maybe the way you are approaching the problem is not the best one: studying the implementation of existing recursive functions will not by itself teach you how to write your own. I prefer to give you an alternative, a methodical process that helps you write the standard skeleton of a recursive function and then reason about it.
All your examples are about lists, and the first thing to do when you work with a list is to be exhaustive, that is, to use pattern matching.
rec_fun [] = -- something here, surely the base case
rec_fun (x:xs) = -- another thing here, surely the general case
Now, the base case cannot contain a recursive call, otherwise you will surely end up with an infinite loop. So the base case should return a value, and the best way to find this value is to look at the type annotation of your function.
For example:
reverse :: [a] -> [a]
encourages you to consider a value of type [a] as the base case, such as [] for reverse.
maximum :: Ord a => [a] -> a
encourages you to consider a value of type a as the base case for maximum.
Now for the recursive part: as said, the function should include a call to itself.
rec_fun (x:xs) = fun x (rec_fun xs)
Here fun denotes another function, which is responsible for chaining the recursive calls together. To help your intuition, we can write it as an operator.
rec_fun (x:xs) = x `fun` rec_fun xs
Now, considering (again) the type annotation of your function (or, more briefly, the base case), you should be able to deduce the nature of this operator. For reverse, since it should return a list, the operator is surely built on concatenation (++), and so on.
If you put all of this together, it shouldn't be too hard to end up with the desired implementation.
Of course, as with any other algorithm, there is no magical recipe and you will always need to think a little. For example, when you know the maximum of the tail of the list, what is the maximum of the whole list?
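The skeleton can be written once with the base value and the combining operator passed in as parameters. This is a sketch: recFun, reverseList and sumList are hypothetical names, and recFun coincides with the Prelude's foldr on lists.

```haskell
-- The skeleton: a base case, and a general case that combines the head
-- with the result of the recursive call on the tail.
recFun :: (a -> b -> b) -> b -> [a] -> b
recFun _ base [] = base
recFun fun base (x : xs) = x `fun` recFun fun base xs

-- reverse: base case [], operator appends the head after the reversed tail.
reverseList :: [a] -> [a]
reverseList = recFun (\x rest -> rest ++ [x]) []

-- sum: base case 0, operator (+).
sumList :: [Int] -> Int
sumList = recFun (+) 0

main :: IO ()
main = do
  print (reverseList [1, 2, 3 :: Int])  -- [3,2,1]
  print (sumList [1, 2, 3])             -- 6
```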
Looking at Function 3:
reverse' [] = []
reverse' (x:xs) = reverse' xs ++ [x]
Let's say you called reverse' [1,2,3] then...
1. reverse' [1,2,3] = reverse' [2,3] ++ [1]
reverse' [2,3] = reverse' [3] ++ [2] ... so replacing in equation 1, we get:
2. reverse' [1,2,3] = reverse' [3] ++ [2] ++ [1]
reverse' [3] = [3] and there is no xs ...
** UPDATE ** There *is* an xs! The xs of [3] is [], the empty list.
We can confirm that in GHCi like this:
Prelude> let (x:xs) = [3]
Prelude> xs
[]
So, actually, reverse' [3] = reverse' [] ++ [3]
Replacing in equation 2, we get:
3. reverse' [1,2,3] = reverse' [] ++ [3] ++ [2] ++ [1]
Which brings us to the base case: reverse' [] = []
Replacing in equation 3, we get:
4. reverse' [1,2,3] = [] ++ [3] ++ [2] ++ [1], which collapses to:
5. reverse' [1,2,3] = [3,2,1], which, hopefully, is what you intended!
Maybe you can try to do something similar with the other two. Choose small parameters. Good luck!
I too have always found it hard to think recursively. Going through the http://learnyouahaskell.com/ recursion chapter a few times, then trying to re-implement his re-implementations has helped solidify it for me. Also, generally, learning to program functionally by carefully going through the Mostly Adequate Guide and practicing currying and composition has made me focus on solving the core of the problem then applying it in other ways.
Back to recursion...Basically these are the steps I go through when thinking of a recursive solution:
The recursion has to stop, so think of one or more base cases. These are the case(s) where further calls to the function are no longer necessary.
Think of the simplest non-base case (the recursive case), and think of how you can call the function again in a way that will result in the base case...so that the function doesn't keep calling itself. The key is focusing on the simplest non-base case. That will help your mind wrap around the problem.
So, for example, if you have to reverse a list, the base case would be an empty list or a list of one element. When moving to the recursive case, don't think about [1,2,3,4]. Instead think of the simplest case ([1,2]) and how to solve that problem. The answer is easy: take the tail and append the head to get the reverse.
I'm no Haskell expert... I just started learning myself. I started with this, which works.
reverse' l
  | lenL == 1 || lenL == 0 = l
  | otherwise = reverse' xs ++ [x]
  where lenL = length l
        (x:xs) = l
The guard checks whether the list has length 0 or 1 and returns the original list if it does.
The recursive case happens when the list is not of length 0 or 1: it takes the reverse of the tail and appends the head. This repeats until the list has length 0 or 1, and you have your answer.
Then I realized you don't need the check for a singleton list, since the tail of a one-element list is an empty list, and I arrived at this, which is the answer in learnyouahaskell:
reverse' :: [a] -> [a]
reverse' [] = []
reverse' (x:xs) = reverse' xs ++ [x]
I hope that helps. At the end of the day, practice makes perfect, so keep trying to solve some things recursively and you'll get it.

Haskell lazy evaluation

If I call the following Haskell code
find_first_occurrence :: (Eq a) => a -> [a] -> Int
find_first_occurrence elem list = (snd . head) [x | x <- zip list [0..], fst x == elem]
with the arguments
'X' "abcdXkjdkljklfjdlfksjdljjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj"
how much of the zipped list [('a',0), ('b',1), ...] is going to be built?
UPDATE:
I tried to run
find_first_occurrence 10 [1..]
and it returns 9 almost instantly, so I guess it does use lazy evaluation, at least for simple cases? The answer is also computed "instantly" when I run
let f n = 100 - n
find_first_occurrence 10 (map f [1..])

Short answer: it will be built only up to the element you're searching for. This means that you need to build the whole list only in the worst case, that is, when no element satisfies the condition.
Long answer: let me explain why with a pair of examples:
ghci> head [a | (a,b) <- zip [1..] [1..], a > 10]
11
In this case, zip should produce an infinite list, however the laziness enables Haskell to build it only up to (11,11): as you can see, the execution does not diverge but actually gives us the correct answer.
Now, let me consider another issue:
ghci> find_first_occurrence 1 [0, 0, 1 `div` 0, 1]
*** Exception: divide by zero
ghci> find_first_occurrence 1 [0, 1, 1 `div` 0, 0]
1
it :: Int
(0.02 secs, 1577136 bytes)
Since the whole zipped list is not built, Haskell will not evaluate every expression occurring in the list. So when the element being searched for comes before 1 `div` 0, the function is evaluated correctly without raising an exception: the division by zero never happens.
All of it.
Since StackOverflow won't let me post such a short answer: you can't get away with doing less work than looking through the whole list if the thing you're looking for isn't there.
Edit: The question now asks something much more interesting. The short answer is that we will build the list:
('a',0):('b',1):('c',2):('d',3):('X',4):<thunk>
(Actually, this answer is just the slightest bit subtle. Your type signature uses the monomorphic return type Int, which is strict in basically all operations, so all the numbers in the tuples above will be fully evaluated. There are certainly implementations of Num for which you would get something with more thunks, though.)
You can easily answer such a question by introducing undefineds here and there. In our case it is sufficient to change our inputs:
find_first_occurrence 'X' ("abcdX" ++ undefined)
You can see that it produces the result, which means that it does not even look beyond the 'X' it found (otherwise it would have thrown an Exception). Obviously, the zipped list can not be built without looking at the original list.
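As a self-contained sketch of this undefined test, using the function from the question unchanged (the parameter named elem shadows the Prelude function, as in the original):

```haskell
find_first_occurrence :: (Eq a) => a -> [a] -> Int
find_first_occurrence elem list = (snd . head) [x | x <- zip list [0 ..], fst x == elem]

main :: IO ()
main = do
  -- The undefined tail is never reached, so no exception is raised.
  print (find_first_occurrence 'X' ("abcdX" ++ undefined))  -- 4
  -- The same laziness makes the search terminate on an infinite list.
  print (find_first_occurrence (10 :: Int) [1 ..])          -- 9
```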
Another (possibly less reliable) way to analyse laziness is to use the trace function from Debug.Trace:
> import Debug.Trace (trace)
> let find_first_occurrence elem list = (snd . head) [x | x <- map (\i -> trace (show i) i) $ zip list [0..], fst x == elem]
> find_first_occurrence 'X' "abcdXkjdkljklfjdlfksjdljjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj"
Prints
('a',0)
('b',1)
('c',2)
('d',3)
('X',4)
4

Resources