Breaking down a haskell function - haskell

I'm reading Real world haskell book again and it's making more sense. I've come accross this function and wanted to know if my interpretation of what it's doing is correct. The function is
oddList :: [Int] -> [Int]
oddList (x:xs) | odd x = x : oddList xs
| otherwise = oddList xs
oddList _ = []
I've read that as
Define the function oddList which accepts a list of ints and returns a list of ints.
Pattern matching: when the parameter is a list.
Take the first item, binding it to x, leaving the remainder elements in xs.
If x is an odd number prepend x to the result of applying oddList to the remaining elements xs and return that result. Repeat...
When x isn't odd, just return the result of applying oddList to xs
In all other cases return an empty list.
1) Is that a suitable/correct way of reading that?
2) Even though I think I understand it, I'm not convinced I've got the (x:xs) bit down. How should that be read, what's it actually doing?
3) Is the |...| otherwise syntax similar/same as the case expr of syntax

1 I'd make only 2 changes to your description:
when the parameter is a nonempty list.
f x is an odd number prepend x to the result of applying oddList to the remaining elements xs and return that result. [delete "Repeat...""]
Note that for the "_", "In all other cases" actually means "When the argument is an empty list", since that is the only other case.
2 The (x:xs) is a pattern that introduces two variables. The pattern matches non empty lists and binds the x variable to the first item (head) of the list and binds xs to the remainder (tail) of the list.
3 Yes. An equivalent way to write the same function is
oddList :: [Int] -> [Int]
oddList ys = case ys of { (x:xs) | odd x -> x : oddList xs ;
(x:xs) | otherwise -> oddList xs ;
_ -> [] }
Note that otherwise is just the same as True, so | otherwise could be omitted here.

You got it right.
The (x:xs) parts says: If the list contains at least one element, bind the first element to x, and the rest of the list to xs
The code could also be written as
oddList :: [Int] -> [Int]
oddList (x:xs) = case (odd x) of
True -> x : oddList xs
False -> oddList xs
oddList _ = []
In this specific case, the guard (|) is just a prettier way to write that down. Note that otherwise is just a synonym for True , which usually makes the code easier to read.
What #DanielWagner is pointing out, is we in some cases, the use of guards allow for some more complex behavior.
Consider this function (which is only relevant for illustrating the principle)
funnyList :: [Int] -> [Int]
funnyList (x1:x2:xs)
| even x1 && even x2 = x1 : funnyList xs
| odd x1 && odd x2 = x2 : funnyList xs
funnyList (x:xs)
| odd x = x : funnyList xs
funnyList _ = []
This function will go though these clauses until one of them is true:
If there are at least two elements (x1 and x2) and they are both even, then the result is:
adding the first element (x1) to the result of processing the rest of the list (not including x1 or x2)
If there are at least one element in the list (x), and it is odd, then the result is:
adding the first element (x) to the result of processing the rest of the list (not including x)
No matter what the list looks like, the result is:
an empty list []
thus funnyList [1,3,4,5] == [1,3] and funnyList [1,2,4,5,6] == [1,2,5]
You should also checkout the free online book Learn You a Haskell for Great Good

You've correctly understood what it does on the low level.
However, with some experience you should be able to interpret it in the "big picture" right away: when you have two cases (x:xs) and _, and xs only turns up again as an argument to the function again, it means this is a list consumer. In fact, such a function is always equivalent to a foldr. Your function has the form
oddList' (x:xs) = g x $ oddList' xs
oddList' [] = q
with
g :: Int -> [Int] -> [Int]
g x qs | odd x = x : qs
| otherwise = qs
q = [] :: [Int]
The definition can thus be compacted to oddList' = foldr g q.
While you may right now not be more comfortable with a fold than with explicit recursion, it's actually much simpler to read once you've seen it a few times.
Actually of course, the example can be done even simpler: oddList'' = filter odd.

Read (x:xs) as: a list that was constructed with an expression of the form (x:xs)
And then, make sure you understand that every non-empty list must have been constructed with the (:) constructor.
This is apparent when you consider that the list type has just 2 constructors: [] construct the empty list, while (a:xs) constructs the list whose head is a and whose tail is xs.
You need also to mentally de-sugar expressions like
[a,b,c] = a : b : c : []
and
"foo" = 'f' : 'o' : 'o' : []
This syntactic sugar is the only difference between lists and other types like Maybe, Either or your own types. For example, when you write
foo (Just x) = ....
foo Nothing = .....
we are also considering the two base cases for Maybe:
it has been constructed with Just
it has been constructed with Nothing

Related

Recursion with Maybe

I'm having difficulty try to write a function to find the sum of two lists using recursion, that possibly could be Nothing if any list is empty.
The math of the following functions are:
Σw[i]x[i]
where w and x are equal length int arrays
Here is my working code:
example :: [Int] -> [Int] -> Int
example [] [] = 0
example (x:xs) (l:ls) = ((x*l) + (example xs ls))
Here is the idea of what I want to work:
example :: [Int] -> [Int] -> Maybe Int
example [] [] = Nothing
example (x:xs) (l:ls) = Just((x*l) + (example xs ls))
Thanks
I'm guessing at what your intent is here, not sure whether I read it correctly: You want the function to produce Nothing when the two input lists have difference lengths?
The "happy" base case is 0 just like in the first attempt, but lifted into Maybe.
example [] [] = Just 0
To handle situations where the lists have different lengths, include the cases where only one of the lists is empty. You should have gotten a compiler warning about a non-exhaustive pattern match for not including these cases.
example [] _ = Nothing
example _ [] = Nothing
The final case, then, is where you have two nonempty lists. It looks a lot like that line from your first attempt, except rather than applying the addition directly to example xs ys, we fmap the addition over example xs ys, taking advantage of the fact that Maybe is a Functor.
example (x : xs) (y : ys) = fmap (x * y +) (example xs ys)
Example usage:
λ> example [1,2] [3,4]
Just 11
λ> example [1,2] [3,4,5]
Nothing
By the way, if you wanted to use a library this, safe would be a nice choice to turn this into a one-liner.
import Safe.Exact
example xs ys = fmap sum (zipWithExactMay (*) xs ys)
You're close, but your recursive call to example xs ls returns a Maybe Int, and you can't add an Int and a Maybe Int (in x*l + example xs ls), hence your error on the last line.
You can use fromMaybe to deal with this case, using 0 as the default sum:
example :: [Int] -> [Int] -> Maybe Int
example [] [] = Nothing
example (x:xs) (l:ls) = Just $ x * l + fromMaybe 0 (example xs ls)
Alternatively (and more neatly), you can avoid the explicit recursion using something like this:
example [] [] = Nothing
example xl yl = Just $ sum $ zipWith (*) xl yl
Note that you have non-exhaustive patterns in your pattern match. Two lists of different lengths will cause a pattern-match exception.

Can haskell grouping function be made similarly like that?

Is it possible to somehow make group function similarly to that:
group :: [Int] -> [[Int]]
group [] = []
group (x:[]) = [[x]]
group (x:y:ys)
| x == y = [[x,y], ys]
| otherwise = [[x],[y], ys]
Result shoult be something like that:
group[1,2,2,3,3,3,4,1,1] ==> [[1],[2,2],[3,3,3],[4],[1,1]]
PS: I already looked for Data.List implementation, but it doesn't help me much. (https://hackage.haskell.org/package/base-4.3.1.0/docs/src/Data-List.html)
Is it possible to make group funtion more clearer than the Data.List implementation?
Or can somebody easily explain the Data.List implementation atleast?
Your idea is good, but I think you will need to define an ancillary function -- something like group_loop below -- to store the accumulated group. (A similar device is needed to define span, which the Data.List implementation uses; it is no more complicated to define group directly, as you wanted to do.) You are basically planning to move along the original list, adding items to the subgroup as long as they match, but starting a new subgroup when something doesn't match:
group [] = []
group (x:xs) = group_loop [x] x xs
where
group_loop acc c [] = [acc]
group_loop acc c (y:ys)
| y == c = group_loop (acc ++ [y]) c ys
| otherwise = acc : group_loop [y] y ys
It might be better to accumulate the subgroups by prepending the new element, and then reversing all at once:
group [] = []
group (x:xs) = group_loop [x] x xs
where
group_loop acc c [] = [reverse acc]
group_loop acc c (y:ys)
| y == c = group_loop (y:acc) c ys
| otherwise = reverse acc : group_loop [y] y ys
since then you don't have to keep retraversing the accumulated subgroup to tack things on the end. Either way, I get
>>> group[1,2,2,3,3,3,4,1,1]
[[1],[2,2],[3,3,3],[4],[1,1]]
group from Data.List is a specialized version of groupBy which uses the equality operator == as the function by which it groups elements.
The groupBy function is defined like this:
groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy _ [] = []
groupBy eq (x:xs) = (x:ys) : groupBy eq zs
where (ys,zs) = span (eq x) xs
It relies on another function call span which splits a list into a tuple of two lists based on a function applied to each element of the list. The documentation for span includes this note which may help understand its utility.
span p xs is equivalent to (takeWhile p xs, dropWhile p xs)
Make sure you first understand span. Play around with it a little in the REPL.
Ok, so now back to groupBy. It uses span to split up a list, using the comparison function you pass in. That function is in eq, and in the case of the group function, it is ==. In this case, the span function splits the list into two lists: The first of which matches the first element pulled from the list, and the remainder in the second element of the tuple.
And since groupBy recursively calls itself, it appends the rest of the results from span down the line until it reaches the end.
Visually, you can think of the values produced by span looking something like this:
([1], [2,2,3,3,3,4,1,1])
([2,2], [3,3,3,4,1,1])
([3,3,3], [4,1,1])
([4], [1,1])
([1,1], [])
The recursive portion joins all the first elements of those lists together in another list, giving you the result of
[[1],[2,2],[3,3,3],[4],[1,1]]
Another way of looking at this is to take the first element x of the input and recursively group the rest of it. x will then either be prepended to the first element of the grouping, or go in a new first group by itself. Some examples:
With [1,2,3], we'll add 1 to a new group in [[2], [3]], yielding [[1], [2], [3]]
With [1,1,2], we'll add the first 1 to the first group of [[1], [2]], yielding [[1,1], [2]].
The resulting code:
group :: [Int] -> [[Int]]
group [] = []
group [x] = [[x]]
group (x:y:ys) = let (first:rest) = group (y:ys)
in if x /= y
then [x]:first:rest -- Example 1 above
else (x:first):rest -- Example 2 above
IMO, this simplifies the recursive case greatly by treating singleton lists explicitly.
Here, I come up with a solution with foldr:
helper x [] = [[x]]
helper x xall#(xs:xss)
| x == head xs = (x:xs):xss
| otherwise = [x]:xall
group :: Eq a => [a] -> [[a]]
group = foldr helper []

Why does Sieve of Eratosthenes need extra helper function for merging infinite lists?

I'm working with the Sieve of Eratosthenes code from Literate Programming (http://en.literateprograms.org/Sieve_of_Eratosthenes_%28Haskell%29), modified slightly to include edge cases on merge and diff:
primesInit = [2,3,5,7,11,13]
primes = primesInit ++ [i | i <- diff [15,17..] nonprimes]
nonprimes = foldr1 f . map g $ tail primes
where g p = [n * p | n <- [p,p+2..]]
f (x:xt) ys = x : (merge xt ys)
merge :: (Ord a) => [a] -> [a] -> [a]
merge [] ys = ys
merge xs [] = xs
merge xs#(x:xt) ys#(y:yt)
| x < y = x : merge xt ys
| x == y = x : merge xt yt
| x > y = y : merge xs yt
diff :: (Ord a) => [a] -> [a] -> [a]
diff [] ys = []
diff xs [] = xs
diff xs#(x:xt) ys#(y:yt)
| x < y = x : diff xt ys
| x == y = diff xt yt
| x > y = diff xs yt
Both merge and diff on their own are lazy. So is nonprimes and primes. But if we change the definition of primes to remove f, as in:
nonprimes = foldr1 merge . map g $ tail primes
where g p = [n * p | n <- [p,p+2..]]
Now nonprimes isn't lazy. I've also recreated this with take 20 $ foldr1 merge [[i*n | n <- [3,7..]] | i <- [5,9..]] (GHCI runs out of memory and exits).
Based on http://www.haskell.org/haskellwiki/Performance/Laziness , one easy source of non-laziness is recursing before returning a data constructor. But merge doesn't have this problem; it returns a cons-cell that contains the recursive call as the second item. Nor should the use of foldr be a culprit here by itself (It's foldl that can't do infinite lists).
So, why does merge need to be separated from foldr1 by f, which essentially does the first call to merge manually? All f does is return a cons cell that contains the call to merge as the second item, right?
NOTE: Someone else on Stack Overflow was working with similar code and ran into the same problem I did, but they accepted an answer that looked to me like basically different code. I'm asking why, not how, as it seems that laziness is somewhat important in Haskell.
Let's compare those two functions again:
merge [] ys = ys
merge xs [] = xs
merge xs#(x:xt) ys#(y:yt)
| x < y = x : merge xt ys
| x == y = x : merge xt yt
| x > y = y : merge xs yt
and
f (x:xt) ys = x : (merge xt ys)
Let's ignore the semantic differences between the two, though they are significant - f is a lot more restricted as far as when it's valid to call. Instead, lets look at only the strictness properties.
Pattern matches in multiple equations are checked top-down. Multiple pattern matches within a single equation are checked left-to-right. So the first thing merge does is force the constructor of its first argument, in order to determine if the first equation matches. If the first equation doesn't match, it forces the constructor of the second argument, in order to determine if the second equation matches. Only if neither equation matches does it move to the third case. The compiler is smart enough to know it's already forced both arguments at this point, so it doesn't do it again - but those pattern matches would require the arguments to be forced if it hadn't already been.
But the important thing here is that the process of figuring out which equation matches causes both arguments to be forced before any constructor is produced.
Now, contrast that with f. In the definition of f, the only pattern-matching is on the first argument. As such, f is somewhat less strict than merge. It produces a constructor before examining its second argument.
And it turns out that if you closely examine the behavior of foldr, it works on infinite lists precisely when the function passed to it doesn't (always) examine its second argument before producing a constructor.
The parenthetical "always" there is interesting. One of my favorite examples of using foldr and laziness together is:
dropRWhile :: (a -> Bool) -> [a] -> [a]
dropRWhile p = foldr (\x xs -> if p x && null xs then [] else x:xs) []
This is a maximally-lazy function that works like dropWhile, except from the back (right) of the list. If the current element doesn't match the predicate, it's returned immediately. If it does match the predicate, it looks ahead until it finds something that either doesn't match, or the end of the list. This will be productive on infinite lists, so long as it eventually finds an element that doesn't match the predicate. And that is the source of the "always" parenthetical up above - a function that usually doesn't examine its second argument before producing a constructor still allows foldr to usually work on infinite lists.
To determine the first element of its output, merge needs to evaluate both arguments enough to determine if they are empty lists or not. Without that information it can't be determined which case of the function definition applies.
In combination with foldr1 it becomes a problem that merge tries to evaluate its second argument. nonprimes in an expression of this form:
foldr1 merge [a,b,c,...]
To evaluate this, first `foldr1 is expanded:
merge a (foldr1 merge [b,c,...])
To now evaluate merge, the cases of its function definition are checked. First a is evaluated, and it turns out to not be an empty list. So the first case of merge doesn't apply. Next, the second parameter of merge needs to be evaluated to see if it is an empty list and if the second case of the definition of merge applies. This second parameter is foldr1 merge [b,c,...].
But to evaluate this we are in the same situation as before with foldr1 merge [a,b,c,...], and we just the same end up with merge b (foldr1 merge [c,...]), where merge again needs to evaluate it's second parameter to check if it's an empty list.
And so on. Each evaluation of merge requires another evaluation of merge first, which ends up in infinite recursion.
With f that problem is avoided, since it doesn't need to look at its second parameter for the top level evaluation. foldr1 f [a,b,c...] is f a (foldr1 f [b,c,...]) which evaluates to a non-empty list a0 : merge a' (foldr1 f [b,c,...]). So foldr1 f ... never is an empty list. This can be determined without any infinite recursion.
Now also the evaluation of merge a' (foldr1 f [b,c,...]) isn't a problem, since the second parameter evaluates to some b0 : ..., which is all merge needs to know to start producing a result.

Haskell: Double every 2nd element in list

I just started using Haskell and wanted to write a function that, given a list, returns a list in which every 2nd element has been doubled.
So far I've come up with this:
double_2nd :: [Int] -> [Int]
double_2nd [] = []
double_2nd (x:xs) = x : (2 * head xs) : double_2nd (tail xs)
Which works but I was wondering how you guys would write that function. Is there a more common/better way or does this look about right?
That's not bad, modulo the fixes suggested. Once you get more familiar with the base library you'll likely avoid explicit recursion in favor of some higher level functions, for example, you could create a list of functions where every other one is *2 and apply (zip) that list of functions to your list of numbers:
double = zipWith ($) (cycle [id,(*2)])
You can avoid "empty list" exceptions with some smart pattern matching.
double2nd (x:y:xs) = x : 2 * y : double2nd xs
double2nd a = a
this is simply syntax sugar for the following
double2nd xss = case xss of
x:y:xs -> x : 2 * y : double2nd xs
a -> a
the pattern matching is done in order, so xs will be matched against the pattern x:y:xs first. Then if that fails, the catch-all pattern a will succeed.
A little bit of necromancy, but I think that this method worked out very well for me and want to share:
double2nd n = zipWith (*) n (cycle [1,2])
zipWith takes a function and then applies that function across matching items in two lists (first item to first item, second item to second item, etc). The function is multiplication, and the zipped list is an endless cycle of 1s and 2s. zipWith (and all the zip variants) stops at the end of the shorter list.
Try it on an odd-length list:
Prelude> double_2nd [1]
[1,*** Exception: Prelude.head: empty list
And you can see the problem with your code. The 'head' and 'tail' are never a good idea.
For odd-lists or double_2nd [x] you can always add
double_2nd (x:xs) | length xs == 0 = [x]
| otherwise = x : (2 * head xs) : double_2nd (tail xs)
Thanks.
Here's a foldr-based solution.
bar :: Num a => [a] -> [a]
bar xs = foldr (\ x r f g -> f x (r g f))
(\ _ _ -> [])
xs
(:)
((:) . (*2))
Testing:
> bar [1..9]
[1,4,3,8,5,12,7,16,9]

Split list and make sum from sublist?

im searching for a solution for my Haskell class.
I have a list of numbers and i need to return SUM for every part of list. Parts are divided by 0. I need to use FOLDL function.
Example:
initial list: [1,2,3,0,3,4,0,5,2,1]
sublist [[1,2,3],[3,4],[5,2,1]]
result [6,7,7]
I have a function for finding 0 in initial list:
findPos list = [index+1 | (index, e) <- zip [0..] list, e == 0]
(returns [4,6] for initial list from example)
and function for making SUM with FOLDL:
sumList list = foldl (+) 0 list
But I completely failed to put it together :/
---- MY SOLUTION
In the end I found something completely different that you guys suggested.
Took me whole day to make it :/
groups :: [Int] -> [Int]
groups list = [sum x | x <- makelist list]
makelist :: [Int] -> [[Int]]
makelist xs = reverse (foldl (\acc x -> zero x acc) [[]] xs)
zero :: Int -> [[Int]] -> [[Int]]
zero x acc | x == 0 = addnewtolist acc
| otherwise = addtolist x acc
addtolist :: Int -> [[Int]] -> [[Int]]
addtolist i listlist = (i : (head listlist)) : (drop 1 listlist)
addnewtolist :: [[Int]] -> [[Int]]
addnewtolist listlist = [] : listlist
I'm going to give you some hints, rather than a complete solution, since this sounds like it may be a homework assignment.
I like the breakdown of steps you've suggested. For the first step (going from a list of numbers with zero markers to a list of lists), I suggest doing an explicit recursion; try this for a template:
splits [] = {- ... -}
splits (0:xs) = {- ... -}
splits (x:xs) = {- ... -}
You can also abuse groupBy if you're careful.
For the second step, it looks like you're almost there; the last step you need is to take a look at the map :: (a -> b) -> ([a] -> [b]) function, which takes a normal function and runs it on each element of a list.
As a bonus exercise, you might want to think about how you might do the whole thing in one shot as a single fold. It's possible -- and even not too difficult, if you track through what the types of the various arguments to foldr/foldl would have to be!
Additions since the question changed:
Since it looks like you've worked out a solution, I now feel comfortable giving some spoilers. =)
I suggested two possible implementations; one that goes step-by-step, as you suggested, and another that goes all at once. The step-by-step one could look like this:
splits [] = []
splits (0:xs) = [] : splits xs
splits (x:xs) = case splits xs of
[] -> [[x]]
(ys:yss) -> ((x:ys):yss)
groups' = map sum . splits
Or like this:
splits' = groupBy (\x y -> y /= 0)
groups'' = map sum . splits'
The all-at-once version might look like this:
accumulate 0 xs = 0:xs
accumulate n (x:xs) = (n+x):xs
groups''' = foldr accumulate [0]
To check that you understand these, here are a few exercises you might like to try:
What do splits and splits' do with [1,2,3,0,4,5]? [1,2,0,3,4,0]? [0]? []? Check your predictions in ghci.
Predict what each of the four versions of groups (including yours) output for inputs like [] or [1,2,0,3,4,0], and then test your prediction in ghci.
Modify groups''' to exhibit the behavior of one of the other implementations.
Modify groups''' to use foldl instead of foldr.
Now that you've completed the problem on your own, I am showing you a slightly less verbose version. Foldr seems better in my opinion to this problem*, but because you asked for foldl I will show you my solution using both functions.
Also, your example appears to be incorrect, the sum of [5,2,1] is 8, not 7.
The foldr version.
makelist' l = foldr (\x (n:ns) -> if x == 0 then 0:(n:ns) else (x + n):ns) [0] l
In this version, we traverse the list, if the current element (x) is a 0, we add a new element to the accumulator list (n:ns). Otherwise, we add the value of the current element to the value of the front element of the accumulator, and replace the front value of the accumulator with this value.
Step by step:
acc = [0], x = 1. Result is [0+1]
acc = [1], x = 2. Result is [1+2]
acc = [3], x = 5. Result is [3+5]
acc = [8], x = 0. Result is 0:[8]
acc = [0,8], x = 4. Result is [0+4,8]
acc = [4,8], x = 3. Result is [4+3,8]
acc = [7,8], x = 0. Result is 0:[7,8]
acc = [0,7,8], x = 3. Result is [0+3,7,8]
acc = [3,7,8], x = 2. Result is [3+2,7,8]
acc = [5,7,8], x = 1. Result is [5+1,7,8] = [6,7,8]
There you have it!
And the foldl version. Works similarly as above, but produces a reversed list, hence the use of reverse at the beginning of this function to unreverse the list.
makelist l = reverse $ foldl (\(n:ns) x -> if x == 0 then 0:(n:ns) else (x + n):ns) [0] l
*Folding the list from the right allows the cons (:) function to be used naturally, using my method with a left fold produces a reversed list. (There is likely a simpler way to do the left fold version that I did not think of that eliminates this triviality.)
As you already solved it, another version:
subListSums list = reverse $ foldl subSum [0] list where
subSum xs 0 = 0 : xs
subSum (x:xs) n = (x+n) : xs
(Assuming that you have only non-negative numbers in the list)

Resources