Assume the following (non-functioning) code, that takes a predicate such as (==2) and a list of integers, and drops only the last element of the list that satisfies the predicate:
cutLast :: (a -> Bool) -> [Int] -> [Int]
cutLast a [] = []
cutLast pred (as:a)
| (pred) a == False = (cutLast pred as):a
| otherwise = as
This code does not work, so clearly lists cannot be iterated through in reverse like this. How could I implement this idea? I'm not 100% sure if the code is otherwise correct - but hopefully it gets the idea across.
Borrowing heavily from myself: the problem with this sort of question is that you don't know which element to remove until you get to the end of the list. Once we observe this, the most straightforward thing to do is traverse the list one way then back using foldr (the second traversal comes from the fact foldr is not tail-recursive).
The cleanest solution I can think of is to rebuild the list on the way back up, dropping the first element.
cutLast :: Eq a => (a -> Bool) -> [a] -> Either [a] [a]
cutLast f = foldr go (Left [])
where
go x (Right xs) = Right (x:xs)
go x (Left xs) | f x = Right xs
| otherwise = Left (x:xs)
The return type is Either to differentiate between not found anything to drop from the list (Left), and having encountered and dropped the last satisfying element from the list (Right). Of course, if you don't care about whether you dropped or didn't drop an element, you can drop that information:
cutLast' f = either id id . cutLast f
Following the discussion of speed in the comments, I tried swapping out Either [a] [a] for (Bool,[a]). Without any further tweaking, this is (as #dfeuer predicted) consistently a bit slower (on the order of 10%).
Using irrefutable patterns on the tuple, we can indeed avoid forcing the whole output (as per #chi's suggestion), which makes this much faster for lazily querying the output. This is the code for that:
cutLast' :: Eq a => (a -> Bool) -> [a] -> (Bool,[a])
cutLast' f = foldr go (False,[])
where
go x ~(b,xs) | not (f x) = (b,x:xs)
| not b = (False,x:xs)
| otherwise = (True,xs)
However, this is 2 to 3 times slower than either of the other two versions (that don't use irrefutable patterns) when forced to normal form.
One simple (but less efficient) solution is to implement cutFirst in a similar fashion to filter, then reverse the input to and output from that function.
cutLast pred = reverse . cutFirst . reverse
where cutFirst [] = []
cutFirst (x:xs) | pred x = xs
| otherwise = x : cutFirst xs
Related
What is the difference between this two, in terms of evaluation?
Why this "obeys" (how to say?) non-strictness
recFilter :: (a -> Bool) -> [a] -> [a]
recFilter _ [] = []
recFilter p (h:tl) = if (p h)
then h : recFilter p tl
else recFilter p tl
while this doesn't?
recFilter :: (a -> Bool) -> [a] -> Int -> [a]
recFilter _ xs 0 = xs
recFilter p (h:tl) len
| p(h) = recFilter p (tl ++ [h]) (len-1)
| otherwise = recFilter p tl (len-1)
Is it possible to write a tail-recursive function non-strictly?
To be honest I also don't understand the call stack of the first example, because I can't see where that h: goes. Is there a way to see this in ghci?
The non-tail recursive function roughly consumes a portion of the input (the first element) to produce a portion of the output (well, if it's not filtered out at least). Then recursion handles the next portion of the input, and so on.
Your tail recursive function will recurse until len becomes zero, and only at that point it will output the whole result.
Consider this pseudocode:
def rec1(p,xs):
case xs:
[] -> []
(y:ys) -> if p(y): print y
rec1(p,ys)
and compare it with this accumulator-based variant. I'm not using len since I use a separate accumulator argument, which I assume to be initially empty.
def rec2(p,xs,acc):
case xs:
[] -> print acc
(y:ys) -> if p(y):
rec2(p,ys,acc++[y])
else:
rec2(p,ys,acc)
rec1 prints before recursing: it does not need to inspect the whole input list to start printing its output. It works in a "steraming" fashion, in a sense. Instead, rec2 will only start to print at the very end, after the input list was completely scanned.
In your Haskell code there are no prints, of course, but you can thing of returning x : function call as "printing x", since x is made available to the caller of our function before function call is actually made. (Well, to be pedantic this depends on how the caller will consume the output list, but I'll neglect this.)
Hence the non-tail recursive code can also work on infinite lists. Even on finite inputs, performance is improved: if we call head (rec1 p xs), we only evaluate xs until the first non-discarded element. By contrast head (rec2 p xs) would fully filter the whole list xs, even we don't need that.
The second implementation does not make much sense: a variable named len will not contain the length of the list. You thus need to pass this, for infinite lists, this would not work, since there is no length at all.
You likely want to produce something like:
recFilter :: (a -> Bool) -> [a] -> [a]
recFilter p = go []
where go ys [] = ys -- (1)
go ys (x:xs) | p x = go (ys ++ [x]) xs
| otherwise = go ys xs
where we thus have an accumulator to which we append the items in the list, and then eventually return the accumulator.
The problem with the second approach is that as long as the accumulator is not returned, Haskell will need to keep recursing until at least we reach weak head normal form (WHNF). This means that if we pattern match the result with [] or (_:_), we will need at least have to recurse until case one, since the other cases only produce a new expression, and it will thus not yield a data constructor on which we can pattern match.
This in contrast to the first filter where if we pattern match on [] or (_:_) it is sufficient to stop at the first case (1), or the third case 93) where the expression produces an object with a list data constructor. Only if we require extra elements to pattern match, for example (_:_:_), it will require to evaluate the recFilter p tl in case (2) of the first implementation:
recFilter :: (a -> Bool) -> [a] -> [a]
recFilter _ [] = [] -- (1)
recFilter p (h:tl) = if (p h)
then h : recFilter p tl -- (2)
else recFilter p tl
For more information, see the Laziness section of the Wikibook on Haskell that describes how laziness works with thunks.
This question already has answers here:
Eq => function in Haskell
(2 answers)
Closed 3 years ago.
I am new to Haskell and am trying to write quite an easy function which gathers each repeated consecutive elements under separate sub-lists, For example:
f :: Eq a => [a] -> [[a]]
So:
f [] = []
f [3] = [[3]]
f [1,1,1,3,2,2,1,1,1,1] = [[1,1,1],[3],[2,2],[1,1,1,1]]
I thought about this function:
f :: Eq a => [a] -> [[a]]
f [] = []
f (x:[]) = [[x]]
f (x:x':xs) = if x == x' then [[x, x']] ++ (f (xs))
else [[x]] ++ (f (xs))
It seems to not work well since when it arrives to the last element, it wants to compare it to its consecutive, which clearly does not exist.
I would like to receive a simple answer (beginner level) that will not be too different than mine, correcting my code will be the best.
Thanks in advance.
The problem isn't really what you said, it's just that you only hard-coded the cases that either one or two consecutive elements are equal. Actually you want to ground together an arbitrary number of equal consecutives. IOW, for every element, you pop off as many following ones as are equal.
Generally, splitting of the head-part of a list which fulfills some condition is what the span function does. In this case, the condition it's supposed to check is being equal to the element you already removed. That's written thus:
f [] = []
f (x:xs) = (x:xCopies) : f others
where (xCopies,others) = span (==x) xs
Here, x:xCopies puts together the chunk of elements equal to x (with x itself on front), use that as the heading chunk-list of the result, and then you recurse over all the elements that remain.
Your problem is that both halves of your if have the same structure: they cons exactly one element onto the front of the recursive call. This can't be right: sometimes you want to add an element to the front of the list, and other times you want to combine your element with what's already in the recursive call.
Instead, you need to pattern-match on the recursive call to get the first item in the recursive result, and then prepend to that when the first two items match.
f :: Eq a => [a] -> [[a]]
f [] = []
f [x] = [[x]]
f (x:xs#(y:_)) | x == y = (x:head):more
| otherwise = [x]:result
where result#(head:more) = f xs
I need a function to double every other number in a list. This does the trick:
doubleEveryOther :: [Integer] -> [Integer]
doubleEveryOther [] = []
doubleEveryOther (x:[]) = [x]
doubleEveryOther (x:(y:zs)) = x : 2 * y : doubleEveryOther zs
However, the catch is that I need to double every other number starting from the right - so if the length of the list is even, the first one will be doubled, etc.
I understand that in Haskell it's tricky to operate on lists backwards, so my plan was to reverse the list, apply my function, then output the reverse again. I have a reverseList function:
reverseList :: [Integer] -> [Integer]
reverseList [] = []
reverseList xs = last xs : reverseList (init xs)
But I'm not quite sure how to implant it inside my original function. I got to something like this:
doubleEveryOther :: [Integer] -> [Integer]
doubleEveryOther [] = []
doubleEveryOther (x:[]) = [x]
doubleEveryOther (x:(y:zs)) =
| rev_list = reverseList (x:(y:zs))
| rev_list = [2 * x, y] ++ doubleEveryOther zs
I'm not exactly sure of the syntax of a function that includes intermediate values like this.
In case it's relevant, this is for Exercise 2 in CIS 194 HW 1.
This is a very simple combination of the two functions you've already created:
doubleEveryOtherFromRight = reverseList . doubleEveryOther . reverseList
Note that your reverseList is actually already defined in the standard Prelude as reverse. so you didn't need to define it yourself.
I'm aware that the above solution isn't very efficient, because both uses of reverse need to pass through the entire list. I'll leave it to others to suggest more efficient versions, but hopefully this illustrates the power of function composition to build more complex computations out of simpler ones.
As Lorenzo points out, you can make one pass to determine if the list has an odd or even length, then a second pass to actually construct the new list. It might be simpler, though, to separate the two tasks.
doubleFromRight ls = zipWith ($) (cycle fs) ls -- [f0 ls0, f1 ls1, f2 ls2, ...]
where fs = if odd (length ls)
then [(*2), id]
else [id, (*2)]
So how does this work? First, we observe that to create the final result, we need to apply one of two function (id or (*2)) to each element of ls. zipWith can do that if we have a list of appropriate functions. The interesting part of its definition is basically
zipWith f (x:xs) (y:ys) = f x y : zipWith f xs ys
When f is ($), we're just applying a function from one list to the corresponding element in the other list.
We want to zip ls with an infinite alternating list of id and (*2). The question is, which function should that list start with? It should always end with (*2), so the starting item is determined by the length of ls. An odd-length requires us to start with (*2); an even one, id.
Most of the other solutions show you how to either use the building blocks you already have or building blocks available in the standard library to build your function. I think it's also instructive to see how you might build it from scratch, so in this answer I discuss one idea for that.
Here's the plan: we're going to walk all the way to the end of the list, then walk back to the front. We'll build our new list during our walk back from the end. The way we'll build it as we walk back is by alternating between (multiplicative) factors of 1 and 2, multiplying our current element by our current factor and then swapping factors for the next step. At the end we'll return both the final factor and the new list. So:
doubleFromRight_ :: Num a => [a] -> (a, [a])
doubleFromRight_ [] = (1, [])
doubleFromRight_ (x:xs) =
-- not at the end yet, keep walking
let (factor, xs') = doubleFromRight_ xs
-- on our way back to the front now
in (3-factor, factor*x:xs')
If you like, you can write a small wrapper that throws away the factor at the end.
doubleFromRight :: Num a => [a] -> [a]
doubleFromRight = snd . doubleFromRight_
In ghci:
> doubleFromRight [1..5]
[1,4,3,8,5]
> doubleFromRight [1..6]
[2,2,6,4,10,6]
Modern practice would be to hide the helper function doubleFromRight_ inside a where block in doubleFromRight; and since the slightly modified name doesn't actually tell you anything new, we'll use the community standard name internally. Those two changes might land you here:
doubleFromRight :: Num a => [a] -> [a]
doubleFromRight = snd . go where
go [] = (1, [])
go (x:xs) = let (factor, xs') = go xs in (3-factor, factor*x:xs')
An advanced Haskeller might then notice that go fits into the shape of a fold and write this:
doubleFromRight :: Num a => [a] -> [a]
doubleFromRight = snd . foldr (\x (factor, xs) -> (3-factor, factor*x:xs)) (1,[])
But I think it's perfectly fine in this case to stop one step earlier with the explicit recursion; it may even be more readable in this case!
If we really want to avoid calculating the length, we can define
doubleFromRight :: Num a => [a] -> [a]
doubleFromRight xs = zipWith ($)
(foldl' (\a _ -> drop 1 a) (cycle [(2*), id]) xs)
xs
This pairs up the input list with the cycled infinite list of functions, [(*2), id, (*2), id, .... ]. then it skips along them both. when the first list is finished, the second is in the appropriate state to be - again - applied, pairwise, - on the second! This time, for real.
So in effect it does measure the length (of course), it just doesn't count in integers but in the list elements so to speak.
If the length of the list is even, the first element will be doubled, otherwise the second, as you've specified in the question:
> doubleFromRight [1..4]
[2,2,6,4]
> doubleFromRight [1..5]
[1,4,3,8,5]
The foldl' function processes the list left-to-right. Its type is
foldl' :: (b -> a -> b) -> b -> [a] -> b
-- reducer_func acc xs result
Whenever you have to work on consecutive terms in a list, zip with a list comprehension is an easy way to go. It takes two lists and returns a list of tuples, so you can either zip the list with its tail or make it indexed. What i mean is
doubleFromRight :: [Int] -> [Int]
doubleFromRight ls = [if (odd i == oddness) then 2*x else x | (i,x) <- zip [1..] ls]
where
oddness = odd . length $ ls
This way you count every element, starting from 1 and if the index has the same parity as the last element in the list (both odd or both even), then you double the element, else you leave it as is.
I am not 100% sure this is more efficient, though, if anyone could point it out in the comments that would be great
The language I'm using is a subset of Haskell called Core Haskell which does not allow the use of the built-in functions of Haskell. For example, if I were to create a function which counts the number of times that the item x appears in the list xs, then I would write:
count = \x ->
\xs -> if null xs
then 0
else if x == head xs
then 1 + count x(tail xs)
else count x(tail xs)
I'm trying to create a function which outputs a list xs with its duplicate values removed. E.g. remdups (7:7:7:4:5:7:4:4:[]) => (7:4:5:[])
can anyone offer any advice?
Thanks!
I'm guessing that you're a student, and this is a homework problem, so I'll give you part of the answer and let you finish it. In order to write remdups, it would be useful to have a function that tells us if a list contains an element. We can do that using recursion. When using recursion, start by asking yourself what the "base case", or simplest possible case is. Well, when the list is empty, then obviously the answer is False (no matter what the character is). So now, what if the list isn't empty? We can check if the first character in the list is a match. If it is, then we know that the answer is True. Otherwise, we need to check the rest of the list -- which we do by calling the function again.
elem _ [] = False
elem x (y:ys) = if x==y
then True
else elem x ys
The underscore (_) simply means "I'm not going to use this variable, so I won't even bother to give it a name." That can be written more succinctly as:
elem _ [] = False
elem x (y:ys) = x==y || elem x ys
Writing remdups is a little tricky, but I suspect your teacher gave you some hints. One way to approach it is to imagine we're partway through processing the list. We have part of the list that hasn't been processed yet, and part of the list that has been processed (and doesn't contain any duplicates). Suppose we had a function called remdupHelper, which takes those two arguments, called remaining and finished. It would look at the first character in remaining, and return a different result depending on whether or not that character is in finished. (That result could call remdupHelper recursively). Can you write remdupHelper?
remdupHelper = ???
Once you have remdupHelper, you're ready to write remdups. It just invokes remdupHelper in the initial condition, where none of the list has been processed yet:
remdups l = remdupHelper l [] -- '
This works with Ints:
removeDuplicates :: [Int] -> [Int]
removeDuplicates = foldr insertIfNotMember []
where
insertIfNotMember item list = if (notMember item list)
then item : list
else list
notMember :: Int -> [Int] -> Bool
notMember item [] = True
notMember item (x:xs)
| item == x = False
| otherwise = notMember item xs
How it works should be obvious. The only "tricky" part is that the type of foldr is:
(a -> b -> b) -> b -> [a] -> b
but in this case b unifies with [a], so it becomes:
(a -> [a] -> [a]) -> [a] -> [a] -> [a]
and therefore, you can pass the function insertIfNotMember, which is of type:
Int -> [Int] -> [Int] -- a unifies with Int
The question is to compute the mode (the value that occurs most frequently) of a sorted list of integers.
[1,1,1,1,2,2,3,3] -> 1
[2,2,3,3,3,3,4,4,8,8,8,8] -> 3 or 8
[3,3,3,3,4,4,5,5,6,6] -> 3
Just use the Prelude library.
Are the functions filter, map, foldr in Prelude library?
Starting from the beginning.
You want to make a pass through a sequence and get the maximum frequency of an integer.
This sounds like a job for fold, as fold goes through a sequence aggregating a value along the way before giving you a final result.
foldl :: (a -> b -> a) -> a -> [b] -> a
The type of foldl is shown above. We can fill in some of that already (I find that helps me work out what types I need)
foldl :: (a -> Int -> a) -> a -> [Int] -> a
We need to fold something through that to get the value. We have to keep track of the current run and the current count
data BestRun = BestRun {
currentNum :: Int,
occurrences :: Int,
bestNum :: Int,
bestOccurrences :: Int
}
So now we can fill in a bit more:
foldl :: (BestRun -> Int -> BestRun) -> BestRun -> [Int] -> BestRun
So we want a function that does the aggregation
f :: BestRun -> Int -> BestRun
f (BestRun current occ best bestOcc) x
| x == current = (BestRun current (occ + 1) best bestOcc) -- continuing current sequence
| occ > bestOcc = (BestRun x 1 current occ) -- a new best sequence
| otherwise = (BestRun x 1 best bestOcc) -- new sequence
So now we can write the function using foldl as
bestRun :: [Int] -> Int
bestRun xs = bestNum (foldl f (BestRun 0 0 0 0) xs)
Are the functions filter, map, foldr in Prelude library?
Stop...Hoogle time!
Did you know Hoogle tells you which module a function is from? Hoolging map results in this information on the search page:
map :: (a -> b) -> [a] -> [b]
base Prelude, base Data.List
This means map is defined both in Prelude and in Data.List. You can hoogle the other functions and likewise see that they are indeed in Prelude.
You can also look at Haskell 2010 > Standard Prelude or the Prelude hackage docs.
So we are allowed to map, filter, and foldr, as well as anything else in Prelude. That's good. Let's start with Landei's idea, to turn the list into a list of lists.
groupSorted :: [a] -> [[a]]
groupSorted = undefined
-- groupSorted [1,1,2,2,3,3] ==> [[1,1],[2,2],[3,3]]
How are we supposed to implement groupSorted? Well, I dunno. Let's think about that later. Pretend that we've implemented it. How would we use it to get the correct solution? I'm assuming it is OK to choose just one correct solution, in the event that there is more than one (as in your second example).
mode :: [a] -> a
mode xs = doSomething (groupSorted xs)
where doSomething :: [[a]] -> a
doSomething = undefined
-- doSomething [[1],[2],[3,3]] ==> 3
-- mode [1,2,3,3] ==> 3
We need to do something after we use groupSorted on the list. But what? Well...we should find the longest list in the list of lists. Right? That would tell us which element appears the most in the original list. Then, once we find the longest sublist, we want to return the element inside it.
chooseLongest :: [[a]] -> a
chooseLongest xs = head $ chooseBy (\ys -> length ys) xs
where chooseBy :: ([a] -> b) -> [[a]] -> a
chooseBy f zs = undefined
-- chooseBy length [[1],[2],[3,3]] ==> [3,3]
-- chooseLongest [[1],[2],[3,3]] ==> 3
chooseLongest is the doSomething from before. The idea is that we want to choose the best list in the list of lists xs, and then take one of its elements (its head does just fine). I defined this by creating a more general function, chooseBy, which uses a function (in this case, we use the length function) to determine which choice is best.
Now we're at the "hard" part. Folds. chooseBy and groupSorted are both folds. I'll step you through groupSorted, and leave chooseBy up to you.
How to write your own folds
We know groupSorted is a fold, because it consumes the entire list, and produces something entirely new.
groupSorted :: [Int] -> [[Int]]
groupSorted xs = foldr step start xs
where step :: Int -> [[Int]] -> [[Int]]
step = undefined
start :: [[Int]]
start = undefined
We need to choose an initial value, start, and a stepping function step. We know their types because the type of foldr is (a -> b -> b) -> b -> [a] -> b, and in this case, a is Int (because xs is [Int], which lines up with [a]), and the b we want to end up with is [[Int]].
Now remember, the stepping function will inspect the elements of the list, one by one, and use step to fuse them into an accumulator. I will call the currently inspected element v, and the accumulator acc.
step v acc = undefined
Remember, in theory, foldr works its way from right to left. So suppose we have the list [1,2,3,3]. Let's step through the algorithm, starting with the rightmost 3 and working our way left.
step 3 start = [[3]]
Whatever start is, when we combine it with 3 it should end up as [[3]]. We know this because if the original input list to groupSorted were simply [3], then we would want [[3]] as a result. However, it isn't just [3]. Let's pretend now that it's just [3,3]. [[3]] is the new accumulator, and the result we would want is [[3,3]].
step 3 [[3]] = [[3,3]]
What should we do with these inputs? Well, we should tack the 3 onto that inner list. But what about the next step?
step 2 [[3,3]] = [[2],[3,3]]
In this case, we should create a new list with 2 in it.
step 1 [[2],[3,3]] = [[1],[2],[3,3]]
Just like last time, in this case we should create a new list with 1 inside of it.
At this point we have traversed the entire input list, and have our final result. So how do we define step? There appear to be two cases, depending on a comparison between v and acc.
step v acc#((x:xs):xss) | v == x = (v:x:xs) : xss
| otherwise = [v] : acc
In one case, v is the same as the head of the first sublist in acc. In that case we prepend v to that same sublist. But if such is not the case, then we put v in its own list and prepend that to acc. So what should start be? Well, it needs special treatment; let's just use [] and add a special pattern match for it.
step elem [] = [[elem]]
start = []
And there you have it. All you have to do to write your on fold is determine what start and step are, and you're done. With some cleanup and eta reduction:
groupSorted = foldr step []
where step v [] = [[v]]
step v acc#((x:xs):xss)
| v == x = (v:x:xs) : xss
| otherwise = [v] : acc
This may not be the most efficient solution, but it works, and if you later need to optimize, you at least have an idea of how this function works.
I don't want to spoil all the fun, but a group function would be helpful. Unfortunately it is defined in Data.List, so you need to write your own. One possible way would be:
-- corrected version, see comments
grp [] = []
grp (x:xs) = let a = takeWhile (==x) xs
b = dropWhile (==x) xs
in (x : a) : grp b
E.g. grp [1,1,2,2,3,3,3] gives [[1,1],[2,2],[3,3,3]]. I think from there you can find the solution yourself.
I'd try the following:
mostFrequent = snd . foldl1 max . map mark . group
where
mark (a:as) = (1 + length as, a)
mark [] = error "cannot happen" -- because made by group
Note that it works for any finite list that contains orderable elements, not just integers.