Haskell split a list into two by a pivot value - haskell

I want to split [a] into ([a], [a]) by a pivot value and I have my code
splitList :: (Ord a) => a -> [a] -> ([a],[a])
splitList pivot list =
  ([x | x <- list, x <= pivot], [x | x <- list, x > pivot])
However, it iterates over the list twice to generate the two lists. Is there a way to iterate only once?

There are two possibilities, depending on whether you want a tail-recursive solution (and don't care about reversing the order of elements) or a solution that consumes its argument lazily.
The lazy solution decides if the first element of the list goes into the first or into the second part and uses a simple recursion to process the rest of the list. This would be the preferred solution in most cases as laziness is usually more important than tail recursion:
splitList :: (Ord a) => a -> [a] -> ([a],[a])
splitList _ [] = ([], [])
splitList p (x : xs)
  | x <= p = (x : l, r)
  | otherwise = (l, x : r)
  where
    ~(l, r) = splitList p xs
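One way to see what the laziness buys is to run the function on an infinite list. The sketch below repeats the definition under the hypothetical name splitLazy (renamed only so the example is self-contained) and takes a finite prefix of the first half; lazyPrefix is likewise just an illustrative name.

```haskell
-- Same definition as above, renamed to splitLazy (a hypothetical name).
splitLazy :: Ord a => a -> [a] -> ([a], [a])
splitLazy _ [] = ([], [])
splitLazy p (x : xs)
  | x <= p    = (x : l, r)
  | otherwise = (l, x : r)
  where
    ~(l, r) = splitLazy p xs

-- Only as much of the infinite input is inspected as the prefix needs.
lazyPrefix :: [Int]
lazyPrefix = take 3 (fst (splitLazy 5 (cycle [1, 9])))  -- [1,1,1]
```

An accumulator-based version cannot do this, because it returns nothing until it has reached the end of the input.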
However in some cases you care neither for the ordering of elements nor for laziness, but instead for speed. (For example when implementing a sorting algorithm.) Then a variant that uses an accumulator to build the result (see Accumulating Parameters: Getting rid of the 'almost' in "almost tail recursive" ) to achieve tail recursion would be more appropriate:
splitListR :: (Ord a) => a -> [a] -> ([a],[a])
splitListR pivot = sl ([], [])
  where
    sl acc [] = acc
    sl (l, g) (x : xs)
      | x <= pivot = sl (x : l, g) xs
      | otherwise = sl (l, x : g) xs
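To make the reversal concrete, here is a small sketch that repeats splitListR and adds a hypothetical wrapper, splitListInOrder, which restores the input order by reversing both halves at the end. The wrapper is not part of the answer above; it only illustrates the trade-off.

```haskell
splitListR :: Ord a => a -> [a] -> ([a], [a])
splitListR pivot = sl ([], [])
  where
    sl acc [] = acc
    sl (l, g) (x : xs)
      | x <= pivot = sl (x : l, g) xs
      | otherwise  = sl (l, x : g) xs

-- Hypothetical wrapper: one extra pass over each half restores input order.
splitListInOrder :: Ord a => a -> [a] -> ([a], [a])
splitListInOrder pivot xs = (reverse l, reverse g)
  where
    (l, g) = splitListR pivot xs

-- splitListR 3 [1,5,2,4,3]       gives ([3,2,1],[4,5])
-- splitListInOrder 3 [1,5,2,4,3] gives ([1,2,3],[5,4])
```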

It's generally considered good style to avoid hand-rolling your recursion; instead you can use a folding function like so:
splitList pivot = foldr triage ([],[])
  where
    triage x ~(lows, highs)
      | x <= pivot = (x:lows, highs)
      | otherwise = (lows, x:highs)
Of course it's even better style to make use of a preexisting function that does exactly what you need, i.e. partition. :)

If you want to write this from scratch, you can maintain two lists, one for small items, one for large. First I'll write the wrapper:
splitList :: (Ord a) => a -> [a] -> ([a],[a])
splitList pivot input = spL input [] [] where
OK, so I'm just calling spL and giving it two empty lists to start off with. Because I'm using a where block, I don't need to pass the pivot around, so only the three lists that change get passed. If there is nothing left in the input, we're done and should return the answer:
spL [] smalls larges = (smalls,larges)
Now as you'll see, we'll actually make smalls and larges backwards, so if you don't like that, replace the final answer pair there with (reverse smalls,reverse larges). Let's deal with some input now:
spL (i:input) smalls larges | i <= pivot = spL input (i:smalls) larges
                            | otherwise = spL input smalls (i:larges)
So we pop it on the front of the smalls if it's small enough.
The reason for pushing on the front of the list is it saves us iterating through to the end of the list every time. You can always reverse to obtain the original ordering if that matters to you, like I said.
All together we get:
splitList :: (Ord a) => a -> [a] -> ([a],[a])
splitList pivot input = spL input [] [] where
  spL [] smalls larges = (smalls,larges)
  spL (i:input) smalls larges | i <= pivot = spL input (i:smalls) larges
                              | otherwise = spL input smalls (i:larges)

import Data.List (partition)
splitList pivot = partition (<= pivot)
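As a sanity check, partition keeps the elements of each half in their original order, so it should agree with the two-pass version from the question. The sketch below places the two side by side; splitListTwoPass is a hypothetical name for the question's original code.

```haskell
import Data.List (partition)

splitList :: Ord a => a -> [a] -> ([a], [a])
splitList pivot = partition (<= pivot)

-- The two-pass version from the question, kept for comparison.
splitListTwoPass :: Ord a => a -> [a] -> ([a], [a])
splitListTwoPass pivot list =
  ([x | x <- list, x <= pivot], [x | x <- list, x > pivot])

-- Both give ([1,2,3],[5,4]) for pivot 3 and input [1,5,2,4,3].
```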

http://www.cs.indiana.edu/pub/techreports/TR27.pdf of 1976 [1] suggests the following:
import Control.Applicative
partition3 [] p = ZipList [[], [], []]
partition3 (x:xs) p
  | x < p = ZipList [(x:),id,id] <*> partition3 xs p
  | x > p = ZipList [id,id,(x:)] <*> partition3 xs p
  | otherwise = ZipList [id,(x:),id] <*> partition3 xs p
Using it, we write
splitList pivot list = (a++b, c)
  where
    [a,b,c] = getZipList $ partition3 list pivot
[1] as seen here.

Related

How can we limit recursive calls in Haskell?

I've got a problem with my code, which is the following:
import Data.List
splitat _ [] = ([],[])
splitat element (head:tail)
  | element == head = ([],(head:tail))
  | otherwise = ([head]++fst(splitat element tail), snd(splitat element tail))
It splits a list at 'element', then combines the left and right sublists into a tuple.
However, in the third line, the 'splitat element tail' command is called twice, once through 'fst' and once through 'snd'. Is there a way to evaluate this term only 1 time to keep the recursion tree narrow?
Thanks in advance.
Yes. You can make use of a let expression, or a where clause. For example:
splitat :: Eq a => a -> [a] -> ([a], [a])
splitat _ [] = ([],[])
splitat x' xa@(x:xs) | x == x' = ([], xa)
                     | otherwise = (x:ys1, ys2)
  where (ys1, ys2) = splitat x' xs
Note: please do not use head or tail (or the names of other existing functions; here head :: [a] -> a and tail :: [a] -> [a]) as variable names, since these will shadow the existing bindings. This makes the code harder to reason about, since a reader might think that head and tail refer to the Prelude functions rather than to the local variables.
Use Control.Arrow.first (or Data.Bifunctor.first; both modules ship with GHC's base library, Data.Bifunctor since base 4.8):
import Control.Arrow (first)
splitat _ [] = ([],[])
splitat e lst@(h:t) | e == h = ([], lst)
                    | otherwise = first (h:) (splitat e t)

Can haskell grouping function be made similarly like that?

Is it possible to write the group function along these lines?
group :: [Int] -> [[Int]]
group [] = []
group (x:[]) = [[x]]
group (x:y:ys)
  | x == y = [[x,y], ys]
  | otherwise = [[x],[y], ys]
The result should be something like this:
group[1,2,2,3,3,3,4,1,1] ==> [[1],[2,2],[3,3,3],[4],[1,1]]
PS: I already looked at the Data.List implementation, but it doesn't help me much. (https://hackage.haskell.org/package/base-4.3.1.0/docs/src/Data-List.html)
Is it possible to make the group function clearer than the Data.List implementation?
Or can somebody explain the Data.List implementation, at least?
Your idea is good, but I think you will need to define an ancillary function -- something like group_loop below -- to store the accumulated group. (A similar device is needed to define span, which the Data.List implementation uses; it is no more complicated to define group directly, as you wanted to do.) You are basically planning to move along the original list, adding items to the subgroup as long as they match, but starting a new subgroup when something doesn't match:
group [] = []
group (x:xs) = group_loop [x] x xs
  where
    group_loop acc c [] = [acc]
    group_loop acc c (y:ys)
      | y == c = group_loop (acc ++ [y]) c ys
      | otherwise = acc : group_loop [y] y ys
It might be better to accumulate the subgroups by prepending the new element, and then reversing all at once:
group [] = []
group (x:xs) = group_loop [x] x xs
  where
    group_loop acc c [] = [reverse acc]
    group_loop acc c (y:ys)
      | y == c = group_loop (y:acc) c ys
      | otherwise = reverse acc : group_loop [y] y ys
since then you don't have to keep retraversing the accumulated subgroup to tack things on the end. Either way, I get
>>> group[1,2,2,3,3,3,4,1,1]
[[1],[2,2],[3,3,3],[4],[1,1]]
group from Data.List is a specialized version of groupBy which uses the equality operator == as the function by which it groups elements.
The groupBy function is defined like this:
groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy _ [] = []
groupBy eq (x:xs) = (x:ys) : groupBy eq zs
  where (ys,zs) = span (eq x) xs
It relies on another function called span, which splits a list into a tuple of two lists based on a predicate applied to each element of the list. The documentation for span includes this note, which may help you understand its utility:
span p xs is equivalent to (takeWhile p xs, dropWhile p xs)
Make sure you first understand span. Play around with it a little in the REPL.
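If you would rather start from a file than type into the REPL, here is a minimal sketch. The names spanExample and spanLaw are made up for illustration; the law is only checked on one sample input, not proven.

```haskell
-- span splits at the first element that fails the predicate;
-- later matching elements stay in the second list.
spanExample :: ([Int], [Int])
spanExample = span (== 1) [1, 1, 2, 1, 3]   -- ([1,1],[2,1,3])

-- The documented equivalence, checked on one sample input.
spanLaw :: Bool
spanLaw = span even xs == (takeWhile even xs, dropWhile even xs)
  where
    xs = [2, 4, 5, 6] :: [Int]
```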
OK, so now back to groupBy. It uses span to split up a list, using the comparison function you pass in. That function is eq, and in the case of the group function, it is ==. In this case, span splits the list into two lists: the first holds the leading run of elements equal to the first element pulled from the list, and the second holds the remainder.
And since groupBy recursively calls itself, it appends the rest of the results from span down the line until it reaches the end.
Visually, you can think of the values produced by span looking something like this:
([1], [2,2,3,3,3,4,1,1])
([2,2], [3,3,3,4,1,1])
([3,3,3], [4,1,1])
([4], [1,1])
([1,1], [])
The recursive portion collects the first components of those tuples into another list, giving you the result
[[1],[2,2],[3,3,3],[4],[1,1]]
Another way of looking at this is to take the first element x of the input and recursively group the rest of it. x will then either be prepended to the first element of the grouping, or go in a new first group by itself. Some examples:
With [1,2,3], we'll add 1 to a new group in [[2], [3]], yielding [[1], [2], [3]]
With [1,1,2], we'll add the first 1 to the first group of [[1], [2]], yielding [[1,1], [2]].
The resulting code:
group :: [Int] -> [[Int]]
group [] = []
group [x] = [[x]]
group (x:y:ys) = let (first:rest) = group (y:ys)
                 in if x /= y
                    then [x]:first:rest -- Example 1 above
                    else (x:first):rest -- Example 2 above
IMO, this simplifies the recursive case greatly by treating singleton lists explicitly.
Here, I come up with a solution with foldr:
helper x [] = [[x]]
helper x xall@(xs:xss)
  | x == head xs = (x:xs):xss
  | otherwise = [x]:xall
group :: Eq a => [a] -> [[a]]
group = foldr helper []

How to process / summarise a list into a "different" list

I think I need something like a fold or maybe a foldt but the examples I've seen, seem to only compress the list into a simple scalar value.
What I need would need to remember and re-use values from previous lines in the list (essentially a "group by" operation)
If my input data looks like:
[["order1", "item1"],["", "item2"],["","item3"],["order2","item4"]]
What is the correct approach to end up with something like:
[["order1",["item1","item2","item3"]],["order2",["item4"]]]
ie data Order = Order { id :: Text, items :: [OrderItem]}
What if I wanted a slightly different structure?
[("order1",["item1","item2","item3"]),("order2",["item4"])]
ie data OrderTuple = OrderTuple { order :: Order, items :: [OrderItem]}
What if I also wanted to keep a running total of some numeric value from the OrderItem?
edit: Here's the code I'm trying to get working based on Frerich's answer
--testGroupBy :: [[String]] -> [[String]]
testGroupBy :: [[String]] -> [(String, [String])]
testGroupBy z =
  --groupBy (\(x:xs) (y:ys) -> x == y || null y) z
  groupBy testFunc z
testFunc :: [String] -> [String] -> Bool
testFunc (x:xs) (y:ys) = x == y || null y
Pattern matching is useful here
groupData = foldl acc []
  where acc ((r, rs):rss) ("":xs) = (r, rs ++ xs): rss
        acc rss (x:xs) = (x, xs): rss
        acc _ _ = error "Bad input data"
The resulting groups are in reverse order; apply reverse if you need the original order.
What if I wanted a slightly different structure?
Simply transform one into the other; you can do this inside groupData or in a separate function.
If you allow leading rows without an order id (an empty first element):
groupData = foldr acc []
  where acc (x:xs) [] = [(x, xs)]
        acc ("":xs) (("", rs):rss) = ("", rs ++ xs): rss
        acc (x:xs) (("", rs):rss) = (x, rs ++ xs): rss
        acc (x:xs) rss = (x, xs): rss
then
let xs = [["", "item8"],["", "item9"],["order1", "item1"],["", "item2"],["","item3"],["order2","item4"]]
print $ groupData xs
is
[("",["item9","item8"])
,("order1",["item3","item2","item1"])
,("order2",["item4"])]
Instead of looking for a fold-based solution, I'd first try to see whether you can define the function as a composition of higher-level functions (such as map). Let me fire up a ghci session and play a bit:
λ: let x = [["order1", "item1"],["", "item2"],["","item3"],["order2","item4"]]
Your "group by" operation actually has an existing name: Data.List.groupBy -- this almost gets us what we need:
λ: import Data.List
λ: let x' = groupBy (\(x:xs) (y:ys) -> x == y || null y) x
λ: x'
[[["order1","item1"],["","item2"],["","item3"]],[["order2","item4"]]]
This groupBy application starts a group at a row and keeps adding the following rows to it as long as their first element is equal to that of the group's first row or is empty. The result can then be massaged into your desired format with a map (in this case, the second format you proposed):
λ: let x'' = map (\x -> (head (head x), map (!! 1) x)) x'
λ: x''
[("order1",["item1","item2","item3"]),("order2",["item4"])]
Putting it all together:
groupData :: [[String]] -> [(String, [String])]
groupData = map (\x -> (head (head x), map (!! 1) x))
          . groupBy (\(x:xs) (y:ys) -> x == y || y == "")
I suppose that with this, building a proper data structure (i.e. something more typesafe than nested lists) should be straightforward.
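As a rough sketch of what such a data structure could look like: the Order record below, its field names, and the toOrders function are all assumptions made for illustration (the question only hints at the real types, which use Text and OrderItem). It reuses the same groupBy idea but works on pairs instead of nested lists.

```haskell
import Data.List (groupBy)

-- Hypothetical record; the field names are assumptions for this sketch.
data Order = Order
  { orderId    :: String
  , orderItems :: [String]
  } deriving (Eq, Show)

-- Rows that are not exactly [orderId, item] are silently dropped by the
-- pattern match in the comprehension; real code may want to report them.
toOrders :: [[String]] -> [Order]
toOrders rows =
  [ Order oid (map snd grp)
  | grp@((oid, _) : _) <- groupBy sameOrder pairs
  ]
  where
    pairs = [(k, v) | [k, v] <- rows]
    -- A row continues the current order if it repeats the id or leaves it empty.
    sameOrder (k1, _) (k2, _) = k1 == k2 || null k2
```

With the sample input from the question, toOrders should yield one Order for "order1" with three items and one for "order2" with a single item.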

Combs on a large set doesn't compute Haskell

I'm writing a combs function in Haskell.
When I provide it with a deck of cards, it needs to give me every possible hand of size x from that deck.
This is the relevant code
combs :: Int -> [a] -> [[a]]
combs 0 _ = [[ ]]
combs i (x:xs) = (filter (isLength i) y)
  where y = subs (x:xs)
combs _ _ = [ ]
isLength :: Int -> [a] -> Bool
isLength i x
  | length x == i = True
  | otherwise = False
subs :: [a] -> [[a]]
subs [ ] = [[ ]]
subs (x : xs) = map (x:) ys ++ ys
  where ys = subs xs
However, when I ask it to compute combs 5 [1..52], i.e. a hand of 5 out of a full deck, it does not produce a result and keeps running for a really long time.
Does anyone know what the problem is and how to speed up this algorithm?
To extract i items from x:xs you can proceed in two ways:
you keep the x, and extract only i-1 elements from xs
you discard x, and extract all the i elements from xs
Hence, a solution is:
comb :: Int -> [a] -> [[a]]
comb 0 _ = [[]] -- only the empty list has 0 elements
comb _ [] = [] -- can not extract > 0 elements from []
comb i (x:xs) = [ x:ys | ys <- comb (i-1) xs ] -- keep x case
             ++ comb i xs -- discard x case
By the way, the above code also "proves" a well-known recursive formula for the binomial coefficients. You might already have met this formula if you attended a calculus class.
Letting B(k,n) = length (comb k [1..n]), we have
B(k+1,n+1) == B(k,n) + B(k+1,n)
which is just a direct consequence of the last line of the code above.
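The recurrence can be spot-checked for small values. The sketch below repeats comb and defines b as B(k,n) from the text; pascalHolds is a hypothetical name, and the check covers only a small range rather than proving the formula.

```haskell
comb :: Int -> [a] -> [[a]]
comb 0 _      = [[]]
comb _ []     = []
comb i (x:xs) = [x : ys | ys <- comb (i - 1) xs] ++ comb i xs

-- B(k,n) as defined above: the number of k-element selections from n items.
b :: Int -> Int -> Int
b k n = length (comb k [1 .. n])

-- Spot-check B(k+1,n+1) == B(k,n) + B(k+1,n) for small k and n.
pascalHolds :: Bool
pascalHolds =
  and [b (k + 1) (n + 1) == b k n + b (k + 1) n | k <- [0 .. 4], n <- [0 .. 8]]
```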
Right now it's a bit hard to see what you are trying to do, but I guess the problem is that you are filtering and mapping a lot.
I think a simple way to get what you need is this:
module Combinations where
import Data.List (delete)
combs :: Eq a => Int -> [a] -> [[a]]
combs 0 _ = [[]]
combs i xs = [ y:ys | y <- xs, ys <- combs (i-1) (delete y xs) ]
which uses delete from Data.List
It should be lazy enough to find the first combinations quickly; of course, computing all of them will take a while ;)
λ> take 5 $ combs 5 [1..52]
[[1,2,3,4,5],[1,2,3,4,6],[1,2,3,4,7],[1,2,3,4,8],[1,2,3,4,9]]
how does it work
It's one of those recursive combinatorial algorithms: it selects a first card y from all the cards xs, recursively gets the rest of the hand ys from the deck without the selected card (delete y xs), and then puts it back together as y:ys inside the list monad (here using list comprehensions).
BTW: there are 311,875,200 such hands (counting different orderings separately) ;)
version without list-comprehensions
here is a version without comprehensions in case your system has issues here:
combs :: Eq a => Int -> [a] -> [[a]]
combs 0 _ = [[]]
combs i xs = do
  y <- xs
  ys <- combs (i-1) (delete y xs)
  return $ y:ys
version that will remove permutations
This one uses Ord to keep only hands whose items are in ascending order, and in doing so removes duplicates that differ only by permutation. For this to work, xs is expected to be pre-sorted!
Note that chi's version works with fewer constraints and might be more performant too, but I thought this one is nice and readable and goes well with the version before, so maybe it's of interest to you.
I know this is not often done in Haskell/FP, where you strive for the most general and abstract cases, but I come from an environment where most strive for readability and understanding (coding for the programmer, not only for the compiler), so be gentle ;)
combs' :: Ord a => Int -> [a] -> [[a]]
combs' 0 _ = [[]]
combs' i xs = [ y:ys | y <- xs, ys <- combs' (i-1) (filter (> y) xs) ]
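To compare the two variants on a small input, the sketch below repeats both definitions and counts their results; counts is a hypothetical name used only for this illustration.

```haskell
import Data.List (delete)

-- Ordered selections, as in the first version above.
combs :: Eq a => Int -> [a] -> [[a]]
combs 0 _  = [[]]
combs i xs = [y : ys | y <- xs, ys <- combs (i - 1) (delete y xs)]

-- Ascending selections only; the answer expects xs to be pre-sorted.
combs' :: Ord a => Int -> [a] -> [[a]]
combs' 0 _  = [[]]
combs' i xs = [y : ys | y <- xs, ys <- combs' (i - 1) (filter (> y) xs)]

-- 5 * 4 ordered pairs versus "5 choose 2" unordered ones.
counts :: (Int, Int)
counts = (length (combs 2 [1 .. 5 :: Int]), length (combs' 2 [1 .. 5 :: Int]))  -- (20,10)
```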

Turning List Comprehension into Functional Application

I have a function that was written as a list comprehension. As a learning exercise, I decided to try to convert it into function application using map, zip, fold, etc. I am having a really hard time converting this particular one.
It might seem unreasonable for what it is doing, but it is part of a bigger function and I want to get this piece working first.
combination :: Int -> [a] -> [([a],[a])]
combination 0 xs = [([],xs)]
combination n (x:xs) = [ (x:ys,zs) | (ys,zs) <- combination (n-1) xs ]
It's just a map:
combination :: Int -> [a] -> [([a],[a])]
combination 0 xs = [([],xs)]
combination n (x:xs) = map (\(ys, zs) -> (x:ys,zs)) (combination (n-1) xs)
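If you prefer to avoid the lambda, one possible variation uses first from Data.Bifunctor to apply (x:) to the first component of each pair. The extra clause for the empty list is an assumption added here so the sketch is total; the original leaves that case undefined.

```haskell
import Data.Bifunctor (first)

combination :: Int -> [a] -> [([a], [a])]
combination 0 xs       = [([], xs)]
combination _ []       = []  -- assumption: not covered by the original definition
combination n (x : xs) = map (first (x :)) (combination (n - 1) xs)

-- combination 2 "abc" gives [("ab","c")]
```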
