Argument of groupBy's Lambda - haskell

Learn You a Haskell shows the groupBy function:
ghci> let values = [-4.3, -2.4, -1.2, 0.4, 2.3, 5.9, 10.5,
29.1, 5.3, -2.4, -14.5, 2.9, 2.3]
ghci> groupBy (\x y -> (x > 0) == (y > 0)) values
[[-4.3,-2.4,-1.2],[0.4,2.3,5.9,10.5,29.1,5.3],[-2.4,-14.5],[2.9,2.3]]
In groupBy's first argument, what is the meaning of the lambda's 2 arguments: x and y?

These are the two values being compared. You know that group puts equal neighboring values together. To decide what counts as equal, it uses a comparison function: group relies on the Eq instance of your type, whereas groupBy lets you choose how the neighboring values are compared.

If we look at the type of groupBy:
groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
The first argument to groupBy is a function that takes two arguments of type a and returns a Bool. You could equivalently write this as
groupBy comparer values where comparer x y = (x > 0) == (y > 0)
The \x y -> part just says that the lambda function takes two arguments named x and y, just like with any other function declaration.
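As a minimal sketch of that equivalence, the same predicate can be written as a lambda, as a named function, or with the on combinator from Data.Function. The names samePositivity, byLambda, byNamed and byOn are made up for this illustration; values is the list from the question.

```haskell
import Data.Function (on)
import Data.List (groupBy)

values :: [Double]
values = [-4.3, -2.4, -1.2, 0.4, 2.3, 5.9, 10.5, 29.1, 5.3, -2.4, -14.5, 2.9, 2.3]

-- Named version of the lambda: x and y are simply the two elements
-- that groupBy hands to the predicate.
samePositivity :: Double -> Double -> Bool
samePositivity x y = (x > 0) == (y > 0)

byLambda, byNamed, byOn :: [[Double]]
byLambda = groupBy (\x y -> (x > 0) == (y > 0)) values
byNamed  = groupBy samePositivity values
-- `on` applies (> 0) to both arguments before comparing them with (==).
byOn     = groupBy ((==) `on` (> 0)) values
```

All three definitions should evaluate to the same list of four groups.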
The easiest way to see what this expression does is to just run it:
ghci> groupBy (\x y -> (x > 0) == (y > 0)) values
[[-4.3,-2.4,-1.2],[0.4,2.3,5.9,10.5,29.1,5.3],[-2.4,-14.5],[2.9,2.3]]
If you look closely, you can see that the elements of each sublist are either all positive or all non-positive. The groupBy function groups elements of a list by the given condition, but only in sequential order. For example:
ghci> groupBy (\x y -> x == y) [1, 1, 2, 2, 2, 3, 3, 4]
[[1,1],[2,2,2],[3,3],[4]]
ghci> groupBy (\x y -> x == y) [1, 1, 2, 2, 2, 3, 3, 1]
[[1,1],[2,2,2],[3,3],[1]]
In the second example, notice that the 1s haven't all been grouped together because they aren't adjacent.
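If you do want all equal elements collected regardless of position, one common approach is to sort first so that equal elements become adjacent. This is only a sketch: groupAll is a hypothetical helper name, it needs an Ord instance, and it does not preserve the original order of the list.

```haskell
import Data.List (group, sort)

-- Hypothetical helper: sorting makes equal elements adjacent,
-- so group can then collect every occurrence into one sublist.
groupAll :: Ord a => [a] -> [[a]]
groupAll = group . sort
```

With the list from the second example, groupAll [1, 1, 2, 2, 2, 3, 3, 1] gives [[1,1,1],[2,2,2],[3,3]].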

In cases like these, it's best to go straight to the source! groupBy is part of Data.List, so you can find it in the base package on Hackage. When you don't know what package a function is in, search for the function in Hoogle and click on the name to be taken to the Haddocks on Hackage. When you're looking at Haddock documentation, there will usually be a "Source" link on the right-hand side of the function's type signature to take you to the definition. Here's the source for groupBy.
I've reproduced the definition here to step through it.
-- | The 'groupBy' function is the non-overloaded version of 'group'.
groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy _ [] = []
groupBy eq (x:xs) = (x:ys) : groupBy eq zs
  where (ys,zs) = span (eq x) xs
First, the documentation line at the top tells us that groupBy is the non-overloaded version of group, which is a very common pattern in base. You can go check out group to figure out the simplest case of grouping functionality, then you can understand the -By version as allowing you to supply your own predicate (in case you wanted to compare equality differently than the Eq instance for a type, or whatever other operation you're trying to do).
The base case is trivial, but the recursive step might be a little confusing if you don't know what span does (time to hit Hackage again!). span takes a predicate and a list and returns a pair (2-tuple) of lists: the longest prefix of elements that satisfy the predicate, and the remainder starting at the first element that doesn't (it's like break, but with the predicate not negated).
So now you should be able to put it all together and see that groupBy groups elements of a list together by segregating runs of elements which are "equal" to the first element in that run. Note that it is NOT comparing elements pairwise (I was burned by that before) so don't assume that the two elements being passed to the predicate function would be adjacent in the list!
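To make the difference concrete, here is a sketch of what a truly pairwise grouping could look like, for contrast with the library function. groupAdjacentBy is a hypothetical helper written for this illustration; it is not part of base.

```haskell
import Data.List (groupBy)

-- Hypothetical helper (not in base): starts a new group whenever the
-- predicate fails for two *neighbouring* elements.
groupAdjacentBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupAdjacentBy p = foldr step []
  where
    -- x joins the group to its right only if it relates to that group's head.
    step x ((y : ys) : rest)
      | p x y = (x : y : ys) : rest
    step x rest = [x] : rest

-- The library function compares against the first element of each group.
libraryResult :: [[Int]]
libraryResult = groupBy (<) [1, 2, 3, 2, 1]

-- The pairwise variant compares each element with its neighbour.
pairwiseResult :: [[Int]]
pairwiseResult = groupAdjacentBy (<) [1, 2, 3, 2, 1]
```

Here libraryResult is [[1,2,3,2],[1]], because 2 is still compared with the leading 1, while pairwiseResult is [[1,2,3],[2],[1]].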

Let's start with group. This function simply groups together all the adjacent elements that are identical. e.g.,
group [0,1,2,3,3,4] = [[0],[1],[2],[3,3],[4]]
GHC defines group as follows:
group :: Eq a => [a] -> [[a]]
group = groupBy (==)
That is, the equality test is implicit. The same thing can be written with an explicit equality test using groupBy as:
groupBy (\x y -> x == y) [0,1,2,3,3,4] = [[0],[1],[2],[3,3],[4]]
Now, let's look at how GHC defines groupBy:
groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy _ [] = []
groupBy eq (x:xs) = (x:ys) : groupBy eq zs
  where (ys,zs) = span (eq x) xs
eq x is used to split the rest of the list: span collects the longest run of elements for which the comparison with the first element x holds. groupBy is then called recursively on the remainder, which begins at the first element that fails the comparison. Note that (x:ys) prepends (conses) the first element onto the list of elements that satisfy the comparison. Also, span starts the second list at the first element where the test condition is not met.
Hence, in the given example, the moment you reach the value 0.4, a new list has to start, since 0.4 will be the first element of zs from the above definition.
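The first step of that evaluation can be reproduced in isolation. This is a small sketch, with samePositivity and firstStep being names invented here; values is the list from the question.

```haskell
values :: [Double]
values = [-4.3, -2.4, -1.2, 0.4, 2.3, 5.9, 10.5, 29.1, 5.3, -2.4, -14.5, 2.9, 2.3]

samePositivity :: Double -> Double -> Bool
samePositivity x y = (x > 0) == (y > 0)

-- First step of groupBy on values: x is the head of the list, and span
-- splits the tail into (ys, zs) exactly as in the definition above.
firstStep :: ([Double], [Double])
firstStep = case values of
  []     -> ([], [])
  x : xs -> span (samePositivity x) xs
```

firstStep evaluates to ([-2.4,-1.2],[0.4,2.3,5.9,10.5,29.1,5.3,-2.4,-14.5,2.9,2.3]): ys holds the rest of the first group, and zs starts at 0.4.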

groupBy divides a list into groups according to some “rule”. In groupBy (\x y -> x `someComparison` y) someList, x is the first element of the “current” group and y is an element of someList. groupBy traverses someList, so at each step y becomes the next element of someList. A new group is started when the predicate returns False, and y becomes the first member of the new group. Iteration continues: the first member of this new group now becomes x, and the next element of someList becomes y.
groupBy does NOT compare elements pairwise (1st with 2nd, 2nd with 3rd, etc.); instead, it compares each element of the list with the first element of the group currently being filled. Example:
groupBy (\x y -> x < y) [1,2,3,2,1] -- returns: [[1,2,3,2],[1]]
Step by step groupBy:
compares 1 with 2. 1 < 2, thus both numbers go into the same group. Groups: [[1,2]]
compares 1 with 3. 1 < 3, thus 3 goes into the old group. Groups: [[1,2,3]]
compares 1 with 2. 1 < 2, thus 2 goes into the old group. Groups: [[1,2,3,2]]
compares 1 with 1. 1 ≮ 1, thus the new group is formed, and 1 goes as its first element. Groups: [[1,2,3,2],[1]]
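The steps above can be checked mechanically with a small tracing function. This is a sketch only: comparisonTrace is a hypothetical helper that mirrors the recursion of groupBy and records each (x, y) pair handed to the predicate together with the result.

```haskell
-- Hypothetical helper: lists every comparison groupBy would make,
-- as (first element of current group, element under test, result).
comparisonTrace :: (a -> a -> Bool) -> [a] -> [(a, a, Bool)]
comparisonTrace _ [] = []
comparisonTrace eq (x : xs) = go xs
  where
    go [] = []
    go (y : ys)
      | eq x y    = (x, y, True) : go ys
      -- The predicate failed: y starts a new group and becomes the new x.
      | otherwise = (x, y, False) : comparisonTrace eq (y : ys)
```

For comparisonTrace (<) [1,2,3,2,1] this yields [(1,2,True),(1,3,True),(1,2,True),(1,1,False)], matching the four steps listed above.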
Exercise: To understand how groupBy works, try to figure out how the following expressions return their results with pen and paper:
groupBy (\x y -> x < y) [1,2,3,4,5,4,3,2,1] -- [[1,2,3,4,5,4,3,2],[1]]
groupBy (\x y -> x < y) [1,3,5,2,1] -- [[1,3,5,2],[1]]
groupBy (\x y -> x <= y) [3,5,3,2,1,0,1,0] -- [[3,5,3],[2],[1],[0,1,0]]
groupBy (\x y -> x <= y) [1,2,3,2,1] -- [[1,2,3,2,1]]
Again, the key to understanding the behavior of groupBy is to remember that at each step it compares the first element of the current group with the next element of the list. The moment the predicate returns False, a new group is formed, and the process continues.
To understand why groupBy behaves that way, inspect its source:
groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy _ [] = []
groupBy eq (x:xs) = (x:ys) : groupBy eq zs
  where (ys,zs) = span (eq x) xs
The key here is the use of the span function: span (1<) [2,3,2,1] returns ([2,3,2],[1]). span (eq x) xs in the above code puts the longest prefix of xs whose elements satisfy (eq x) into the first part of a pair, and the rest of xs into the second. (x:ys) then joins x with the first part of the pair, while groupBy is recursively invoked on the rest of xs (which is zs). That's why groupBy works this strange way.

Related

Pairing up elements from a list given a predicate

I'm relatively new to Haskell and found a challenge to create a set of tuples which greedily takes from a list given a predicate. For example, using (\x -> \y -> odd(x+y)) on [2,3,4,5,6,7] could return [(2,3),(3,2),(4,5),(5,4),(6,7),(7,6)] or [(2,7),(6,5),(3,4),(4,3),(5,6),(7,2)] or any other valid set of pairings, as long as it's one where each pair is symmetrical, and all items from the set are included in one and only one pairing. A key part of my challenge is to learn to work with monads, specifically Maybe/Just/Nothing, so my current function is Eq a => (a -> a -> Bool) -> [a] -> Maybe [(a,a)] where Nothing is returned if a list of tuples including every element cannot be made; for example running (\x -> \y -> even(x+y)) on [2,3,4,5,6,7] would return Nothing, as you can't pair up all the elements to fit that predicate without leaving some out.
To start off, I thought I could generate a full list of possible pairs and filter them with the predicate. My function at present is test p xs = filter (uncurry p) [(x,y) | (x:ys) <- tails xs, y <- ys], with the idea that later on I can remove tuples with duplicate first values (perhaps somehow using nubBy?), run swap from Data.Tuple on what's left in my list to make my pairs symmetrical, and then run a final check to see if all the elements from the list have been included so I know whether to return nothing. I realise, however, that there's probably a better way of going about this that performs fewer redundant actions and does the final check for returning Nothing earlier on. I've tried to play around with list comprehension, but I can't come up with anything serviceable.
A tuple (x, y) is inherently ordered: (x, y) /= (y, x) whenever x /= y. It would be helpful to define an "unordered" pair type for filtering:
newtype Pair x = Pair { unpair :: (x, x) }
instance Eq a => Eq (Pair a) where
  (Pair p1) == (Pair p2) = p1 == p2 || p1 == swap p2
Then you can use a simpler method of generating sample pairs, using the Applicative instance for lists. You can filter out duplicates later, using Pair.
>>> (,) <$> [1, 2, 3] <*> [1, 2, 3]
[(1,1),(1,2),(1,3),(2,1),(2,2),(2,3),(3,1),(3,2),(3,3)]
Once you have filtered the above list, deduplicate it: convert your initial results to Pair values, remove duplicates using nub, then convert back to tuples:
result :: Eq x => [(x,x)] -> [(x,x)]
result = map unpair . nub . map Pair
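Put together with the imports it needs (nub from Data.List, swap from Data.Tuple), the approach might look like the sketch below. candidatePairs is a name invented here. Note that this only produces the deduplicated candidate pairs that satisfy the predicate; it does not by itself choose a pairing that uses every element exactly once, nor does it return a Maybe.

```haskell
import Data.List (nub)
import Data.Tuple (swap)

-- Unordered pair: (a, b) and (b, a) compare equal.
newtype Pair a = Pair { unpair :: (a, a) }

instance Eq a => Eq (Pair a) where
  Pair p1 == Pair p2 = p1 == p2 || p1 == swap p2

-- Hypothetical helper: all pairs satisfying p, with mirrored duplicates removed.
candidatePairs :: Eq a => (a -> a -> Bool) -> [a] -> [(a, a)]
candidatePairs p xs =
  map unpair . nub . map Pair . filter (uncurry p) $ (,) <$> xs <*> xs
```

For example, candidatePairs (\x y -> odd (x + y)) [2, 3, 4] gives [(2,3),(3,4)].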

Double elements in a list if values are over certain threshold

I'm trying to double the elements of a list that are greater than 5, for example.
To double every element in a list I would do this:
doubleAll n = [2*x| x <-n]
Now I want to double all elements in the list that are greater than 5 using list comprehensions.
So if I do this:
doubleAll n = [2*x| x <-n, x>5]
With x >= 5 as the condition, my list [1,2,3,4,5] would result in [10] (with the strict x > 5 shown above it would be empty). But I want my list to show [1,2,3,4,10].
Can anyone explain what I did wrong and how this can be fixed?
An interpretation of [2*x| x <-n, x>5] is:
1. Take the next element from n and call it x
2. Proceed if x>5, otherwise go to step 1
3. Return the value 2*x as the next element of the list.
4. Repeat
From this it is clear that the x>5 filters out elements of n. The expression is equivalent to:
map (\x -> 2*x) ( filter (\x -> x>5) n )
As Arthur mentioned, you want something like:
[ if x > 5 then 2*x else x | x <- n ]
Its interpretation is:
1. Take the next value of n and call it x
2. Return the value if x > 5 then 2*x else x as the next value of the list.
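The two interpretations can be compared side by side in a small sketch. The function names doubleAndDrop and doubleIfOver5 are made up for this illustration, and the strict x > 5 test from the question text is used.

```haskell
-- The guard acts as a filter: elements failing x > 5 are dropped entirely.
doubleAndDrop :: [Int] -> [Int]
doubleAndDrop n = [2 * x | x <- n, x > 5]

-- The conditional sits in the output expression: every element is kept,
-- and only those greater than 5 are doubled.
doubleIfOver5 :: [Int] -> [Int]
doubleIfOver5 n = [if x > 5 then 2 * x else x | x <- n]
```

With the input [4,5,6,7], doubleAndDrop gives [12,14] while doubleIfOver5 gives [4,5,12,14].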
This is clearer to understand if you don't use list comprehensions and use the map and filter operations instead:
-- Apply a function to every element of the list.
map :: (a -> b) -> [a] -> [b]
-- Throw out list elements that don't pass the test.
filter :: (a -> Bool) -> [a] -> [a]
Your original doubleAll is equivalent to this:
-- Multiply every element of the list by two.
doubleAll xs = map (*2) xs
The version where you double only if x >= 5 (the threshold that matches your expected output [1,2,3,4,10]) would be this:
-- Apply the `step` function to every element of the list. This step function
-- multiplies by two if x >= 5, otherwise it just returns its argument.
doubleAll xs = map step xs
  where step x | x >= 5    = 2*x
               | otherwise = x
The problem with the list comprehension version that you wrote is that it's instead equivalent to this:
-- Filter out list elements that are smaller than 5, then double the remaining ones.
doubleAll xs = map (*2) (filter (>=5) xs)
The list comprehension solution that produces the result you want would instead be this:
doubleAll xs = [if x >= 5 then x*2 else x | x <- xs]
As a more general remark, I always recommend to newcomers to stay away from list comprehensions and learn the higher-order list functions, which are more general and less magical.

Does Haskell have a takeUntil function?

Currently I am using
takeWhile (\x -> x /= 1 && x /= 89) l
to get the elements from a list up to either a 1 or 89. However, the result doesn't include these sentinel values. Does Haskell have a standard function that provides this variation on takeWhile that includes the sentinel in the result? My searches with Hoogle have been unfruitful so far.
Since you were asking about standard functions: no. I'm also not aware of a package containing a takeWhileInclusive, but it's really simple to write:
takeWhileInclusive :: (a -> Bool) -> [a] -> [a]
takeWhileInclusive _ [] = []
takeWhileInclusive p (x:xs) = x : if p x then takeWhileInclusive p xs
                                         else []
The only thing you need to do is take the value regardless of whether the predicate returns True, and use the predicate only to decide whether to continue:
*Main> takeWhileInclusive (\x -> x /= 20) [10..]
[10,11,12,13,14,15,16,17,18,19,20]
Is span what you want?
(matching, rest) = span (\x -> x /= 1 && x /= 89) l
then look at the head of rest (if rest is non-empty, its head is the sentinel).
The shortest way I found to achieve that is to use span and apply a function to its result that appends the head of the second component of the tuple to the first component.
The whole expression would look something like this (take 1 s is used rather than [head s] so that the expression does not crash when no sentinel is found and s is empty):
(\(f,s) -> f ++ take 1 s) $ span (\x -> x /= 1 && x /= 89) [82..140]
The result of this expression is
[82,83,84,85,86,87,88,89]
The first element of the tuple returned by span is the list that takeWhile would return for those parameters, and the second element is the list with the remaining values, so we just add the head from the second list to our first list.
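The same idea can be wrapped in a named function that also handles the case where no sentinel occurs. This is a sketch under the assumption that the predicate identifies the sentinel (the opposite polarity to takeWhile); takeUntil is a hypothetical name here, not a function from base.

```haskell
-- Hypothetical helper: take elements up to and including the first one
-- that satisfies isSentinel; if none does, return the whole list.
takeUntil :: (a -> Bool) -> [a] -> [a]
takeUntil isSentinel xs =
  case break isSentinel xs of
    (before, sentinel : _) -> before ++ [sentinel]
    (before, [])           -> before
```

For example, takeUntil (\x -> x == 1 || x == 89) [82..140] gives [82,83,84,85,86,87,88,89], and a list without any sentinel is returned unchanged.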

How can i count elements of a tuple in Haskell?

So I've got a list of tuples like this one:
xs = [("a","b"),("a","c"),("b","d")]
and I want a function that counts the number of times a certain value appears in the first position of the tuple. If I used the list xs and the value "a", it would return 2, because "a" appears two times in the first position of a tuple. This function shouldn't be recursive.
So what I've got is this:
f xs = (fst $ unzip xs) / length(xs)
Now I have all the first elements in a list. This would be easy if it were recursive, but if I don't want it that way, how can I do it?
If we're not using recursion, we need to use some higher-order functions. In particular, filter looks helpful: it removes elements that don't satisfy some condition.
With filter we can get a list of all the tuples whose first element is the value we're looking for.
count :: Eq a => a -> [(a, b)] -> Int
count x = length . filter ((== x) . fst)
I suppose since you're studying, you should work to understand some folds, start with
count x = foldr step 0
  where step (a, b) r | a == x    = 1 + r
                      | otherwise = r
Map the tuples to their first elements, keep the occurrences of your value, and take the length of the resulting list:
countOccurences :: Eq a => a -> [(a, b)] -> Int
countOccurences e = length . filter ((==)e) . map fst
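For completeness, the same count can also be written as a list comprehension, applied here to the list from the question. This is a sketch; countFirst and pairs are names chosen for this illustration.

```haskell
pairs :: [(String, String)]
pairs = [("a", "b"), ("a", "c"), ("b", "d")]

-- Hypothetical helper: produce one unit value per tuple whose first
-- component equals x, then count them. No explicit recursion is used.
countFirst :: Eq a => a -> [(a, b)] -> Int
countFirst x ps = length [() | (a, _) <- ps, a == x]
```

countFirst "a" pairs evaluates to 2.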

Haskell mapping function to list

I am new to Haskell and I have the following problem. I have to create a list of numbers [f1, f2, f3...] where fi x = x ^ i. Then I have to create a function that applies the fi to a list of numbers. For example if I have a list lis = [4,5,6,7..] the output would be [4^1, 5^2,6^3, 7^4...]. This is what I have written so far :
powers x= [x^y |y<-[1,2,3,4]]
list = [1,2,3,4]
match :: (x -> xs) -> [x] -> [xs]
match f [] = []
match f (x:xs) = (f x) : ( match f xs )
So if I put the list = [1,2,3] the output is [[1,1,1,1],[2,4,8,16],[3,9,27,81]] instead of [1,4,27]
Can you please tell me what is wrong and point me to the right direction?
The first issue is that powers is of type Int -> [Int]. What you really want, I think, is something of type [Int -> Int] -- a list of Int -> Int functions instead of a function that takes an Int and returns a list of Int. If you define powers like so:
powers = [(^y) | y <- [1..4]]
you can use zipWith to apply each power to its corresponding element in the list, like so:
zipWith ($) powers [1,2,3] -- returns [1,4,27]
The ($) applies its left (first) argument to its right (second) argument.
Note that using powers as defined here will limit the length of the returned list to 4. If you want to be able to use arbitrary length lists, you want to make powers an infinite list, like so:
powers = [(^y) | y <- [1..]]
Of course, as dave4420 points out, a simpler technique is to simply use
zipWith (^) [1,2,3] [1..] -- returns [1,4,27]
Your match is the standard function map by another name. You need to use zipWith instead (which you can think of as mapping over two lists side-by-side).
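Both zipWith variants can be written as small functions. This is a sketch that fixes the element type to Integer for simplicity; applyPowers, applyPowers' and powerFns are names made up for this illustration.

```haskell
-- Raise the i-th element (counting from 1) to the i-th power.
applyPowers :: [Integer] -> [Integer]
applyPowers xs = zipWith (^) xs [(1 :: Int) ..]

-- The list of functions [f1, f2, f3, ...] where fi x = x ^ i.
powerFns :: [Integer -> Integer]
powerFns = [(^ i) | i <- [(1 :: Int) ..]]

-- Apply each function to the element at the same position.
applyPowers' :: [Integer] -> [Integer]
applyPowers' = zipWith ($) powerFns
```

With the list from the question, applyPowers [4, 5, 6, 7] gives [4,25,216,2401], and applyPowers' gives the same result.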
Is this homework?
You are currently creating a list for every input value.
What you need to do is recursively compute the appropriate
power for each input value, like this:
match _ [] _ = []
match f (x:xs) y = f x y : match f xs (y + 1)
Then, you can call this as match pow [1, 2, 3] 1, where pow is your exponentiation function (for example pow = (^)).
This is equivalent to using zipWith and providing the desired function (pow), your input list ([1, 2, 3]) and the exponent list (a lazy one to infinity list) as arguments.

Resources