Improve efficiency by removing concat - haskell

I'm attempting to improve the efficiency of a particular function in my code which is taking up a large amount of the runtime. After profiling, I believe this is because of the concat within the code. How could I go about improving this code to be quicker?
chunk :: C -> [A] -> [[A]]
chunk c = go []
  where s = Set.fromList (map snd (Map.toList c))
        go :: [A] -> [A] -> [[A]]
        go l [] = [l | member l s]
        go l (x:xs) = if member l s then l : go [x] xs
                      else go (l ++ [x]) xs
Thanks for your help!

A simple solution would be to use Seq, where the snoc operation is O(1). This involves converting the input first and then converting the result back, which should be worth it unless the set is large compared to the average length of the lists.
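A minimal sketch of that idea (untested, in the same spirit as the answer's own code below; it assumes A has an Ord instance, and spells out C as a Map with list values, as the question's Map.toList suggests):

import qualified Data.Map as Map
import qualified Data.Sequence as Seq
import qualified Data.Set as Set
import Data.Sequence ((|>))
import Data.Foldable (toList)

-- Hypothetical stand-ins: 'k' and 'a' play the roles of the question's C and A.
chunkSeq :: Ord a => Map.Map k [a] -> [a] -> [[a]]
chunkSeq c = go Seq.empty
  where
    s = Set.fromList (map (Seq.fromList . snd) (Map.toList c))
    go l [] = [toList l | Set.member l s]
    go l (x:xs)
      | Set.member l s = toList l : go (Seq.singleton x) xs
      | otherwise      = go (l |> x) xs  -- O(1) snoc instead of O(length) l ++ [x]

The snoc (|>) removes the quadratic appending, but as noted next, the membership test itself remains linear in the chunk length.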
There is, however, another problem, and that is testing membership of lists (or similar structures). Comparing or testing equality on lists is O(n), and in your case, where you test the membership of a list that is likely a sub-list of a list contained in the set, the test will indeed be Ω(n). So even then, the complexity of chunk could be on the order of O(n^2), where n is the length of the list argument.
It seems that using a trie would be a better solution. A trie is much more efficient than a set of lists in both memory and time complexity. Especially useful for this case is the operation that takes the sub-trie of all elements with a given prefix in O(1).
Example code (untested):
chunk :: C -> [A] -> [[A]]
chunk c = go trie []
  where trie = Trie.fromList (map snd (Map.toList c))
        go :: Trie -> [A] -> [A] -> [[A]]
        go s l [] = [reverse l | Trie.member [] s]
        go s l (x:xs)
          | Trie.member [] s = reverse l : go (Trie.lookupPrefix [x] trie) [x] xs
          | otherwise        = go (Trie.lookupPrefix [x] s) (x : l) xs
Now each step of go should take only O(1) amortized cost (the only non-constant operation is reverse, but it is O(1) amortized, as reversing a k-element list occurs only once per k steps).
And now we can also make a further improvement: when the sub-trie is empty, we know we'll never emit another chunk, as we'll never reach the matching case again. So we can add a pattern at the top:
go s _ _ | Trie.null s = []
The list-trie package seems to be just perfect for this.

Related

Fast powerset implementation with complement set

I would like to have a function
powersetWithComplements :: [a] -> [([a], [a])]
such that, for example:
powersetWithComplements [1,2,3] = [([],[1,2,3]),([3],[1,2]),([2],[1,3]),([2,3],[1]),([1],[2,3]),([1,3],[2]),([1,2],[3]),([1,2,3],[])]
It is easy to obtain some implementation, for example
import Control.Monad (filterM)

powerset :: [a] -> [[a]]
powerset = filterM (const [False, True])

powersetWithComplements s = let p = powerset s in zip p (reverse p)
or, using (\\) from Data.List:
powersetWithComplements s = [ (x, s \\ x) | x <- powerset s ]
But I estimate that the performance of both of these would be really poor. What would be an optimal approach? It is possible to use a different data structure than the [] list.
Well, you should see a powerset like this: you enumerate over the items of the set, and for each you decide whether to put it in the "selection" (first item of the tuple) or not (second item of the tuple). By enumerating over these selections exhaustively, we get the powerset.
So we can do the same, for instance using recursion:
import Control.Arrow (first, second)

powersetWithComplements [] = [([],[])]
powersetWithComplements (x:xs) = map (second (x:)) rec ++ map (first (x:)) rec
  where rec = powersetWithComplements xs
So here the map (second (x:)) prepends x to the second item of every tuple in rec, and the map (first (x:)) does the same for the first item of the tuples of rec, where rec is the recursion on the tail of the list.
Prelude Control.Arrow> powersetWithComplements [1,2,3]
[([],[1,2,3]),([3],[1,2]),([2],[1,3]),([2,3],[1]),([1],[2,3]),([1,3],[2]),([1,2],[3]),([1,2,3],[])]
The advantage of this approach is that we do not generate a complement list for every list we generate: we build the selection and the complement concurrently. Furthermore, we can reuse the lists we construct in the recursion, which reduces the memory footprint.
In both time complexity and memory complexity, the powersetWithComplements function is equal to the powerset function (note that this is asymptotic complexity; in absolute terms it will of course require more time, since we do an extra amount of work), since prepending to a list is O(1), and we now build two lists (and a tuple) for every original list.
Since you are looking for a "fast" implementation, I thought I would share some benchmark experiments I did with Willem's solution.
I thought using a DList instead of a plain list would be a big improvement, since DLists have constant-time append, whereas appending lists is linear in the size of the left argument.
import Control.Arrow (first, second)
import Data.DList (toList)
import qualified Data.DList as DList

psetDL :: [a] -> [([a],[a])]
psetDL = toList . go
  where
    go [] = DList.singleton ([],[])
    go (x:xs) = (second (x:) <$> rec) <> (first (x:) <$> rec)
      where
        rec = go xs
But that did not have a significant effect.
I suspected this is because we are traversing both sublists anyway because of the fmap (<$>). We can avoid the traversal by doing something similar to CPS-converting the function, passing down the accumulated sets as parameters rather than returning them.
psetTail :: [a] -> [([a],[a])]
psetTail = go [] []
  where
    go a b [] = [(a,b)]
    go a b (x:xs) = go a (x:b) xs <> go (x:a) b xs
This yielded a 220% improvement on a list of size 20. Now since we aren't traversing the lists from fmapping, we can get rid of the append traversal by using a DList:
psetTailDL :: [a] -> [([a],[a])]
psetTailDL = toList . go [] []
  where
    go a b [] = DList.singleton (a,b)
    go a b (x:xs) = go a (x:b) xs <> go (x:a) b xs
Which yields an additional 20% improvement.
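For reference, a minimal criterion harness along these lines can reproduce the comparison (a sketch: it assumes the three definitions above are in scope, and exact ratios will vary by machine):

import Criterion.Main (bench, bgroup, defaultMain, nf)

main :: IO ()
main = defaultMain
  [ bgroup "powersetWithComplements (n = 20)"
      [ bench "psetDL"     $ nf psetDL     [1 .. 20 :: Int]  -- DList variant
      , bench "psetTail"   $ nf psetTail   [1 .. 20 :: Int]  -- accumulator variant
      , bench "psetTailDL" $ nf psetTailDL [1 .. 20 :: Int]  -- accumulator + DList
      ]
  ]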
I guess the best is inspired by your reverse discovery:
import Control.Monad (filterM)

partitions s = filterM (const [False,True]) s
         `zip` filterM (const [True,False]) s
rather than a likely stackoverflower
partitions [] = [([],[])]
partitions (x:xs) = [p | (f,t) <- partitions xs, p <- [(f, x:t), (x:f, t)]]
or a space-and-time-efficient finite list indexer
import Data.Array
import Data.Bits
import Data.List

partitions s = [ (map (a!) f, map (a!) t)
               | n <- [length s], a <- [listArray (0, n-1) s],
                 m <- [0 .. 2^n - 1], (f,t) <- [partition (testBit m) [0 .. n-1]] ]
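With import Control.Monad (filterM) in scope, the first version reproduces exactly the expected output from the question:
> partitions [1,2,3]
[([],[1,2,3]),([3],[1,2]),([2],[1,3]),([2,3],[1]),([1],[2,3]),([1,3],[2]),([1,2],[3]),([1,2,3],[])]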

How to efficiently generate all lists of length `n^2` containing `n` copies of every `x < n`?

Given an integer n, how can I build the list containing all lists of length n^2 containing exactly n copies of each integer x < n? For example, for n = 2, we have:
[0,0,1,1], [0,1,0,1], [1,0,0,1], [0,1,1,0], [1,0,1,0], [1,1,0,0]
This can be easily done combining permutations and nub:
import Data.List (nub, permutations)

f :: Int -> [[Int]]
f n = nub . permutations $ concatMap (replicate n) [0..n-1]
But that is way too inefficient. Is there any simple way to encode the efficient/direct algorithm?
Sure, it's not too hard. We'll start with a list of n copies of each number less than n, and repeatedly choose one to start our result with. First, a function for choosing an element from a list:
zippers :: [a] -> [([a], a, [a])]
zippers = go [] where
    go l (h:r) = (l,h,r) : go (h:l) r
    go _ [] = []
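For example:
> zippers [1,2,3]
[([],1,[2,3]),([1],2,[3]),([2,1],3,[])]
Note that the left context comes back reversed; that is harmless here, since we only need the collection of remaining lists, not their order.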
Now we'll write a function that produces all possible interleavings of some input lists. Internally we'll maintain the invariant that each [a] is non-empty; hence we'll have to establish that invariant before we start recursing. In fact, this will be wasted work in the way we intend to call this function, but for good abstraction we might as well handle all inputs correctly, right?
interleavings :: [[a]] -> [[a]]
interleavings = go . filter (not . null) where
    go [] = [[]]
    go xss = do
        (xssl, x:xs, xssr) <- zippers xss
        (x:) <$> go ([xs | not (null xs)] ++ xssl ++ xssr)
And now we're basically done. All we have to do is feed in an appropriate starting list.
f :: Int -> [[Int]]
f n = interleavings (replicate n <$> [1..n])
Try it in ghci:
> f 2
[[1,1,2,2],[1,2,2,1],[1,2,1,2],[2,2,1,1],[2,1,1,2],[2,1,2,1]]
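As a sanity check, the number of results should match the multinomial coefficient (n^2)! / (n!)^n; for n = 3 that is 9! / (3! * 3! * 3!) = 1680:
> length (f 3)
1680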

How do you efficiently find a union of a list of lists of values in haskell?

Since a code example is worth a thousand words I'll start with that:
import Data.List (tails)

testList = [1,2,2,3,4,5]

testSet = map sumMapper $ tails testList
  where sumMapper [] = []
        sumMapper (a:b) = sumMap a b

sumMap a b = map (+ a) b
This code takes a list and adds each element to every element after it, producing the sums of all pairs (I'd also be interested in the efficiency of this). The output of testSet is:
[[3,3,4,5,6],[4,5,6,7],[5,6,7],[7,8],[9],[],[]]
I would like to find the union of these lists (to make it into a set) but I feel that:
whatIWant = foldl1 union testSet
will have bad performance (the real lists will be thousands of elements long).
Is this the correct solution or am I missing something obvious?
You might want to try
nub $ concat theListOfLists
In the version using union, the code that cuts out duplicates gets run many times; here it is run only once over the full list.
There is also the Data.Set library; you could alternatively use
import qualified Data.Set as S
S.fromList $ concat theListOfLists
The important point is that the code (here and above) that pulls out duplicates only gets run on the full list once, rather than over and over again.
edit- Rein mentions below that nub is O(n^2), so you should avoid the first solution above in favor of something O(n log n), as Data.Set.fromList should be. As others have mentioned in the comments, you need something that enforces Ord a to get the proper complexity O(n log n), and Data.Set does, nub does not.
I will leave the two solutions (poor performance and good performance) because I think the resulting discussion was useful.
If you're using elements that are members of the Ord typeclass, as in your example, you can use Data.Set:
import Data.List (foldl')
import qualified Data.Set as Set

whatYouWant = foldl' (\acc xs -> acc `Set.union` Set.fromList xs) Set.empty testSet
This has the advantage of taking space proportional to the size of the largest sublist, rather than to the size of the entire concatenated list as the Set.fromList . concat solution does. The strict foldl' also prevents the buildup of unevaluated thunks, avoiding O(n) stack and heap usage.
Generally speaking, an Ord constraint allows more efficient algorithms than an Eq constraint because it allows you to build a tree. This is also the reason that nub is O(n^2): the more efficient algorithm requires Ord rather than just Eq.
Since union is an associative operation (a+(b+c)==(a+b)+c), you can use tree-shaped folding for a logarithmic advantage in time complexity:
_U [] = []
_U (xs:t) = union xs (_U (pairs t))
pairs (xs:ys:t) = union xs ys : pairs t
pairs t = t
Of course Data.List.union itself is O(n^2) in general, but if your testList is ordered non-decreasing, all the lists will be too, and you can use a linear ordUnion instead of the union, for a solution which is linearithmic overall and shouldn't leak space:
ordUnion :: (Ord a) => [a] -> [a] -> [a]
ordUnion a [] = a
ordUnion [] b = b
ordUnion (x:xs) (y:ys) = case compare x y of
  LT -> x : ordUnion xs (y:ys)
  EQ -> x : ordUnion xs ys
  GT -> y : ordUnion (x:xs) ys
To prevent duplicates which might slip through, one more function is needed to process _U's output: a linear ordNub :: (Ord a) => [a] -> [a], with an obvious implementation.
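That obvious implementation might look like this (a sketch: since _U's output is non-decreasing, duplicates can only ever be adjacent, so only (==) is really needed):

ordNub :: (Ord a) => [a] -> [a]
ordNub (x:y:ys)
  | x == y    = ordNub (y:ys)       -- drop the adjacent duplicate
  | otherwise = x : ordNub (y:ys)   -- keep x and continue
ordNub xs = xs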
Using the left-preferential (\(x:xs) ys -> x : ordUnion xs ys) could be even more productive overall (forcing smaller portions of the input at any given moment):
g testList = ordNub . _U $ [map (+ a) b | (a:b) <- tails testList]
  where
    _U [] = []
    _U ((x:xs):t) = x : ordUnion xs (_U (pairs t))
    pairs ((x:xs):ys:t) = (x : ordUnion xs ys) : pairs t
    pairs t = t
see also:
data-ordlist package
even less forcing "implicit heap" by apfelmus
Tree-like folds

How to improve performance of this Haskell code?

I'm facing the following problem:
From the initial set [1,2,3,4], compute all possible subsets, i.e. [[1],[2],[3],[4],[1,2],[1,3],[1,4],[2,3],[2,4],[3,4],[1,2,3],[1,2,4],[1,3,4],[2,3,4],[1,2,3,4]]
I've written the following Haskell program, generate.hs, which is correct.
generateSets :: Eq a => [a] -> [[a]] -> [[a]] -> [[a]]
generateSets [] _ _ = []
generateSets src [] _ = let isets = growthup [] src in generateSets src isets isets
generateSets src sets rsets = if null sets' then rsets else generateSets src sets' (rsets ++ sets')
  where sets' = concatMap (flip growthup src) sets

growthup :: (Eq a) => [a] -> [a] -> [[a]]
growthup ps ss = map (\suf -> ps ++ [suf]) ss'
  where ss' = nextoccurence ps ss

nextoccurence :: (Eq a) => [a] -> [a] -> [a]
nextoccurence [] ys = ys
nextoccurence xs ys = tail ys'
  where ys' = dropWhile (/= last xs) ys
While executing it in the GHC interpreter ghci ...
ghci> generateSets [1,2,3,4] [] []
[[1],[2],[3],[4],[1,2],[1,3],[1,4],[2,3],[2,4],[3,4],[1,2,3],[1,2,4],[1,3,4],[2,3,4],[1,2,3,4]]
everything goes fine, but the program takes too long even for small sets, of size 30 for example.
My question is: is it possible to improve my code in order to gain more from Haskell's laziness, the garbage collector, or something else?
Is my code a good candidate for parallelism?
Thanks for any reply!
Sets have a lot of subsets. In fact, a set of n elements has 2^n subsets, so a set of 30 elements has over one billion subsets. Whichever method you use to generate them, even iterating over the results is going to take a long time. For larger sets you can pretty much forget about going through them all before the heat death of the universe.
So there's only so much you can do performance-wise, as even doubling the speed of your algorithm will only let you work with lists of one more element in the same time. For most applications, the real solution is to avoid having to enumerate all the subsets in the first place.
That said, there is a simple inductive way of thinking about subsets which makes defining a proper subset function easy without having to do any equality comparisons, which solves some of the problems with your implementation.
For the base case, the empty set has one subset: the empty set.
subsets [] = [[]]
For a set with at least one element (x:xs), we have the subsets which contain that element, and the ones that don't. We can get the subsets that don't contain x by recursively calling subsets xs, and we can get the rest by prepending x to those.
subsets (x:xs) = subsets xs ++ map (x:) (subsets xs)
The definition of subsequences in Data.List works on the same principle, but in a slightly more optimized way, which also returns the subsets in a different order and makes better use of sharing. However, as I said, enumerating the subsets of a list of length 30 is going to be slow no matter what, and your best bet is to try to avoid having to do it in the first place.
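One small step in that direction is to share the recursive call, so each level of the recursion is computed only once rather than twice:

subsets :: [a] -> [[a]]
subsets []     = [[]]
subsets (x:xs) = rest ++ map (x:) rest
  where rest = subsets xs  -- computed once, shared by both halves of the result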

Asymptotic runtime of list-to-tree function

I have a merge function which takes time O(log n) to combine two trees into one, and a listToTree function which converts an initial list of elements to singleton trees and repeatedly calls merge on each successive pair of trees until only one tree remains.
Function signatures and relevant implementations are as follows:
merge :: Tree a -> Tree a -> Tree a --// O(log n) where n is size of input trees
singleton :: a -> Tree a --// O(1)
empty :: Tree a --// O(1)
listToTree :: [a] -> Tree a --// Supposedly O(n)
listToTree = listToTreeR . (map singleton)
listToTreeR :: [Tree a] -> Tree a
listToTreeR [] = empty
listToTreeR (x:[]) = x
listToTreeR xs = listToTreeR (mergePairs xs)
mergePairs :: [Tree a] -> [Tree a]
mergePairs [] = []
mergePairs (x:[]) = [x]
mergePairs (x:y:xs) = merge x y : mergePairs xs
This is a slightly simplified version of exercise 3.3 in Purely Functional Data Structures by Chris Okasaki.
According to the exercise, I shall now show that listToTree takes O(n) time. Which I can't. :-(
There are trivially ceil(log n) recursive calls to listToTreeR, meaning ceil(log n) calls to mergePairs.
The running time of mergePairs is dependent on the length of the list, and the sizes of the trees. The length of the list is 2^h-1, and the sizes of the trees are log(n/(2^h)), where h=log n is the first recursive step, and h=1 is the last recursive step. Each call to mergePairs thus takes time (2^h-1) * log(n/(2^h))
I'm having trouble taking this analysis any further. Can anyone give me a hint in the right direction?
It's almost there. You already know the expression is
X = sum_{h=1}^{m} (2^h - 1) * log(n / 2^h),   where m = log n,
so the only problem is to evaluate this sum. Using log(A*B) = log A + log B and log(2^N) = N we have
log(n / 2^h) = log n - log(2^h) = m - h
and therefore
X = sum_{h=1}^{m} (2^h - 1)(m - h) <= sum_{h=1}^{m} 2^h (m - h) = 2^m * sum_{k=0}^{m-1} k / 2^k <= 2 * 2^m
(substituting k = m - h, and using the identity sum_{k>=0} k/2^k = 2). With help of a calculator, we can find that X = O(2^m) = O(n), which is expected.
(If you want to compute this yourself, search for "geometric series", or approximate the sum using an integral.)
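As a quick sanity check: for m = 4 (so n = 16), the sum is 1*3 + 3*2 + 7*1 + 15*0 = 16, comfortably within a constant factor of n.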
