I need a function to double every other number in a list. This does the trick:
doubleEveryOther :: [Integer] -> [Integer]
doubleEveryOther [] = []
doubleEveryOther (x:[]) = [x]
doubleEveryOther (x:(y:zs)) = x : 2 * y : doubleEveryOther zs
However, the catch is that I need to double every other number starting from the right - so if the length of the list is even, the first one will be doubled, etc.
I understand that in Haskell it's tricky to operate on lists backwards, so my plan was to reverse the list, apply my function, then output the reverse again. I have a reverseList function:
reverseList :: [Integer] -> [Integer]
reverseList [] = []
reverseList xs = last xs : reverseList (init xs)
But I'm not quite sure how to implant it inside my original function. I got to something like this:
doubleEveryOther :: [Integer] -> [Integer]
doubleEveryOther [] = []
doubleEveryOther (x:[]) = [x]
doubleEveryOther (x:(y:zs)) =
| rev_list = reverseList (x:(y:zs))
| rev_list = [2 * x, y] ++ doubleEveryOther zs
I'm not exactly sure of the syntax of a function that includes intermediate values like this.
In case it's relevant, this is for Exercise 2 in CIS 194 HW 1.
This is a very simple combination of the two functions you've already created:
doubleEveryOtherFromRight = reverseList . doubleEveryOther . reverseList
Note that your reverseList is actually already defined in the standard Prelude as reverse. so you didn't need to define it yourself.
I'm aware that the above solution isn't very efficient, because both uses of reverse need to pass through the entire list. I'll leave it to others to suggest more efficient versions, but hopefully this illustrates the power of function composition to build more complex computations out of simpler ones.
As Lorenzo points out, you can make one pass to determine if the list has an odd or even length, then a second pass to actually construct the new list. It might be simpler, though, to separate the two tasks.
doubleFromRight ls = zipWith ($) (cycle fs) ls -- [f0 ls0, f1 ls1, f2 ls2, ...]
where fs = if odd (length ls)
then [(*2), id]
else [id, (*2)]
So how does this work? First, we observe that to create the final result, we need to apply one of two function (id or (*2)) to each element of ls. zipWith can do that if we have a list of appropriate functions. The interesting part of its definition is basically
zipWith f (x:xs) (y:ys) = f x y : zipWith f xs ys
When f is ($), we're just applying a function from one list to the corresponding element in the other list.
We want to zip ls with an infinite alternating list of id and (*2). The question is, which function should that list start with? It should always end with (*2), so the starting item is determined by the length of ls. An odd-length requires us to start with (*2); an even one, id.
Most of the other solutions show you how to either use the building blocks you already have or building blocks available in the standard library to build your function. I think it's also instructive to see how you might build it from scratch, so in this answer I discuss one idea for that.
Here's the plan: we're going to walk all the way to the end of the list, then walk back to the front. We'll build our new list during our walk back from the end. The way we'll build it as we walk back is by alternating between (multiplicative) factors of 1 and 2, multiplying our current element by our current factor and then swapping factors for the next step. At the end we'll return both the final factor and the new list. So:
doubleFromRight_ :: Num a => [a] -> (a, [a])
doubleFromRight_ [] = (1, [])
doubleFromRight_ (x:xs) =
-- not at the end yet, keep walking
let (factor, xs') = doubleFromRight_ xs
-- on our way back to the front now
in (3-factor, factor*x:xs')
If you like, you can write a small wrapper that throws away the factor at the end.
doubleFromRight :: Num a => [a] -> [a]
doubleFromRight = snd . doubleFromRight_
In ghci:
> doubleFromRight [1..5]
[1,4,3,8,5]
> doubleFromRight [1..6]
[2,2,6,4,10,6]
Modern practice would be to hide the helper function doubleFromRight_ inside a where block in doubleFromRight; and since the slightly modified name doesn't actually tell you anything new, we'll use the community standard name internally. Those two changes might land you here:
doubleFromRight :: Num a => [a] -> [a]
doubleFromRight = snd . go where
go [] = (1, [])
go (x:xs) = let (factor, xs') = go xs in (3-factor, factor*x:xs')
An advanced Haskeller might then notice that go fits into the shape of a fold and write this:
doubleFromRight :: Num a => [a] -> [a]
doubleFromRight = snd . foldr (\x (factor, xs) -> (3-factor, factor*x:xs)) (1,[])
But I think it's perfectly fine in this case to stop one step earlier with the explicit recursion; it may even be more readable in this case!
If we really want to avoid calculating the length, we can define
doubleFromRight :: Num a => [a] -> [a]
doubleFromRight xs = zipWith ($)
(foldl' (\a _ -> drop 1 a) (cycle [(2*), id]) xs)
xs
This pairs up the input list with the cycled infinite list of functions, [(*2), id, (*2), id, .... ]. then it skips along them both. when the first list is finished, the second is in the appropriate state to be - again - applied, pairwise, - on the second! This time, for real.
So in effect it does measure the length (of course), it just doesn't count in integers but in the list elements so to speak.
If the length of the list is even, the first element will be doubled, otherwise the second, as you've specified in the question:
> doubleFromRight [1..4]
[2,2,6,4]
> doubleFromRight [1..5]
[1,4,3,8,5]
The foldl' function processes the list left-to-right. Its type is
foldl' :: (b -> a -> b) -> b -> [a] -> b
-- reducer_func acc xs result
Whenever you have to work on consecutive terms in a list, zip with a list comprehension is an easy way to go. It takes two lists and returns a list of tuples, so you can either zip the list with its tail or make it indexed. What i mean is
doubleFromRight :: [Int] -> [Int]
doubleFromRight ls = [if (odd i == oddness) then 2*x else x | (i,x) <- zip [1..] ls]
where
oddness = odd . length $ ls
This way you count every element, starting from 1 and if the index has the same parity as the last element in the list (both odd or both even), then you double the element, else you leave it as is.
I am not 100% sure this is more efficient, though, if anyone could point it out in the comments that would be great
I have written a function generating subsets of subset. It caused stack overflow when I use in the following way subsets [1..]. And it is "normal" behaviour when it comes to "normal" (no-lazy) languages. And now, I would like to improve my function to be lazy.
P.S. I don't understand laziness ( And I try to understand it) so perhaps my problem is strange for you- please explain. :)
P.S. 2 Feel free to say me something about my disability in Haskell ;)
subsets :: [a] -> [[a]]
subsets (x:xs) = (map (\ e -> x:e) (subsets xs)) ++ (subsets xs)
subsets [] = [[]]
There's two problems with that function. First, it recurses twice, which makes it exponentially more ineffiecient than necessary (if we disregard the exponential number of results...), because each subtree is recalculated every time for all overlapping subsets; this can be fixed by leting the recursive call be the same value:
subsets' :: [a] -> [[a]]
subsets' [] = [[]]
subsets' (x:xs) = let s = subsets' xs
in map (x:) s ++ s
This will already allow you to calculate length $ subsets' [1..25] in a few seconds, while length $ subsets [1..25] takes... well, I didn't wait ;)
The other issue is that with your version, when you give it an infinite list, it will recurse on the infinite tail of that list first. To generate all finite subsets in a meaningful way, we need to ensure two things: first, we must build up each set from smaller sets (to ensure termination), and second, we should ensure a fair order (ie., not generate the list [[1], [2], ...] first and never get to the rest). For this, we start from [[]] and recursively add the current element to everything we have already generated, and then remember the new list for the next step:
subsets'' :: [a] -> [[a]]
subsets'' l = [[]] ++ subs [[]] l
where subs previous (x:xs) = let next = map (x:) previous
in next ++ subs (previous ++ next) xs
subs _ [] = []
Which results in this order:
*Main> take 100 $ subsets'' [1..]
[[],[1],[2],[2,1],[3],[3,1],[3,2],[3,2,1],[4],[4,1],[4,2],[4,2,1],[4,3],[4,3,1],[4,3,2],[4,3,2,1],[5],[5,1],[5,2],[5,2,1],[5,3],[5,3,1],[5,3,2],[5,3,2,1],[5,4],[5,4,1],[5,4,2],[5,4,2,1],[5,4,3],[5,4,3,1],[5,4,3,2],[5,4,3,2,1],[6],[6,1],[6,2],[6,2,1],[6,3],[6,3,1],[6,3,2],[6,3,2,1],[6,4],[6,4,1],[6,4,2],[6,4,2,1],[6,4,3],[6,4,3,1],[6,4,3,2],[6,4,3,2,1],[6,5],[6,5,1],[6,5,2],[6,5,2,1],[6,5,3],[6,5,3,1],[6,5,3,2],[6,5,3,2,1],[6,5,4],[6,5,4,1],[6,5,4,2],[6,5,4,2,1],[6,5,4,3],[6,5,4,3,1],[6,5,4,3,2],[6,5,4,3,2,1],[7],[7,1],[7,2],[7,2,1],[7,3],[7,3,1],[7,3,2],[7,3,2,1],[7,4],[7,4,1],[7,4,2],[7,4,2,1],[7,4,3],[7,4,3,1],[7,4,3,2],[7,4,3,2,1],[7,5],[7,5,1],[7,5,2],[7,5,2,1],[7,5,3],[7,5,3,1],[7,5,3,2],[7,5,3,2,1],[7,5,4],[7,5,4,1],[7,5,4,2],[7,5,4,2,1],[7,5,4,3],[7,5,4,3,1],[7,5,4,3,2],[7,5,4,3,2,1],[7,6],[7,6,1],[7,6,2],[7,6,2,1]]
You can't generate all the subsets of an infinite set: they form an uncountable set. Cardinality makes it impossible.
At most, you can try to generate all the finite subsets. For that, you can't proceed by induction, from [] onwards, since you'll never reach []. You need to proceed inductively from the beginning of the list, instead of the end.
A right fold solution would be:
powerset :: Foldable t => t a -> [[a]]
powerset xs = []: foldr go (const []) xs [[]]
where go x f a = let b = (x:) <$> a in b ++ f (a ++ b)
then:
\> take 8 $ powerset [1..]
[[],[1],[2],[2,1],[3],[3,1],[3,2],[3,2,1]]
Since a code example is worth a thousand words I'll start with that:
testList = [1,2,2,3,4,5]
testSet = map sumMapper $ tails testList
where sumMapper [] = []
sumMapper (a:b) = sumMap a b
sumMap a b = map (+ a) b
This code takes a list and adds up all the elements to get the sum of all of them (I'd also be interested in efficiency of this). The output of testSet is:
[[3,3,4,5,6],[4,5,6,7],[5,6,7],[7,8],[9],[],[]]
I would like to find the union of these lists (to make it into a set) but I feel that:
whatIWant = foldl1 union testSet
will have bad performance (the real lists will be thousands of elements long).
Is this the correct solution or am I missing something obvious?
You might want to try
nub $ concat theListOfLists
In the version using union, the code to cut out duplicates will get run many times. Here it only is run once.
It will only execute the code to pull out the unique values once.
There is also a Data.Set library, you could alternatively use
import Data.Set
S.fromList $ concat theListOfLists
The important point is that the code (here and above) that pulls out duplicates only gets run on the full list once, rather than over and over again.
edit- Rein mentions below that nub is O(n^2), so you should avoid the first solution above in favor of something O(n log n), as Data.Set.fromList should be. As others have mentioned in the comments, you need something that enforces Ord a to get the proper complexity O(n log n), and Data.Set does, nub does not.
I will leave the two solutions (poor performance and good performance) because I think the resulting discussion was useful.
If you're using elements that are members of the Ord typeclass, as in your example, you can use Data.Set:
import qualified Data.Set as Set
whatYouWant = foldl' (Set.union . Set.fromList) Set.empty testSet
This has the advantage of taking space proportional to the size of the largest sublist rather than to the size of the entire concatenated list as does the Set.fromList . concat solution. The strict foldl' also prevents buildup of unevaluated thunks, preventing O(n) stack and heap space usage.
Generally speaking, an Ord constraint allows more efficient algorithms than an Eq constraint because it allows you to build a tree. This is also the reason that nub is O(n^2): the more efficient algorithm requires Ord rather than just Eq.
Since union is an associative operation (a+(b+c)==(a+b)+c), you can use tree-shaped folding for a logarithmic advantage in time complexity:
_U [] = []
_U (xs:t) = union xs (_U (pairs t))
pairs (xs:ys:t) = union xs ys : pairs t
pairs t = t
Of course Data.List.union itself is O(n2) in general, but if your testList is ordered non-decreasing, all the lists will be too, and you can use a linear ordUnion instead of the union, for a solution which is linearithmic overall and shouldn't leak space:
ordUnion :: (Ord a) => [a] -> [a] -> [a]
ordUnion a [] = a
ordUnion [] b = b
ordUnion (x:xs) (y:ys) = case compare x y of
LT -> x : ordUnion xs (y:ys)
EQ -> x : ordUnion xs ys
GT -> y : ordUnion (x:xs) ys
To prevent duplicates which might slip through, one more function is needed to process _U's output—a linear ordNub :: (Ord a) => [a] -> [a], with an obvious implementation.
Using the left-preferential (\(x:xs) ys -> x:ordUnion xs ys) could be even more productive overall (force smaller portions of the input at each given moment):
g testList = ordNub . _U $ [map (+ a) b | (a:b) <- tails testList]
where
_U [] = []
_U ((x:xs):t) = x : ordUnion xs (_U (pairs t))
pairs ((x:xs):ys:t) = (x : ordUnion xs ys) : pairs t
pairs t = t
see also:
data-ordlist package
even less forcing "implicit heap" by apfelmus
Tree-like folds