Pairing up elements from a list given a predicate

I'm relatively new to Haskell and found a challenge to create a set of tuples which greedily takes from a list given a predicate. For example, using (\x -> \y -> odd(x+y)) on [2,3,4,5,6,7] could return [(2,3),(3,2),(4,5),(5,4),(6,7),(7,6)] or [(2,7),(6,5),(3,4),(4,3),(5,6),(7,2)] or any other valid set of pairings, as long as it's one where each pair is symmetrical, and all items from the set are included in one and only one pairing. A key part of my challenge is to learn to work with monads, specifically Maybe/Just/Nothing, so my current function is Eq a => (a -> a -> Bool) -> [a] -> Maybe [(a,a)] where Nothing is returned if a list of tuples including every element cannot be made; for example running (\x -> \y -> even(x+y)) on [2,3,4,5,6,7] would return Nothing, as you can't pair up all the elements to fit that predicate without leaving some out.
To start off, I thought I could generate a full list of possible pairs and filter them with the predicate. My function at present is test p xs = filter (uncurry p) [(x,y) | (x:ys) <- tails xs, y <- ys], with the idea that later on I can remove tuples with duplicate first values (perhaps somehow using nubBy?), run swap from Data.Tuple on what's left in my list to make my pairs symmetrical, and then run a final check to see whether all the elements from the list have been included, so I know whether to return Nothing. I realise, however, that there's probably a better way of going about this that performs fewer redundant actions and does the final check for returning Nothing earlier on. I've tried to play around with list comprehensions, but I can't come up with anything serviceable.

A tuple (x, y) is inherently ordered: (x, y) != (y, x). It would be helpful to define an "unordered" pair type for filtering:
import Data.Tuple (swap)
newtype Pair x = Pair { unpair :: (x, x) }
instance Eq a => Eq (Pair a) where
  (Pair p1) == (Pair p2) = p1 == p2 || p1 == swap p2
Then you can use a simpler method of generating sample pairs, using the Applicative instance for lists. You can filter out duplicates later, using Pair.
>>> (,) <$> [1, 2, 3] <*> [1, 2, 3]
[(1,1),(1,2),(1,3),(2,1),(2,2),(2,3),(3,1),(3,2),(3,3)]
Once you have filtered the above list, deduplicate it: convert the results to Pair values, apply nub (from Data.List), then convert back to tuples:
import Data.List (nub)
result :: Eq x => [(x,x)] -> [(x,x)]
result = map unpair . nub . map Pair
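The question asks for a function returning Maybe [(a,a)], which the Pair approach above does not produce by itself. What follows is a minimal backtracking sketch of one way to get there, not the answerer's method. The names picks, pairings and pairUp are hypothetical, and the predicate is assumed to be symmetric (it is only checked as p x y).

```haskell
import Data.Maybe (listToMaybe)

-- Every way of removing one element from a list: (element, remaining elements).
picks :: [a] -> [(a, [a])]
picks []     = []
picks (x:xs) = (x, xs) : [ (y, x : ys) | (y, ys) <- picks xs ]

-- All complete pairings of the list under the predicate p.
-- Each chosen pair is emitted in both orders, so the result is symmetric.
-- Assumption: p is symmetric, so checking p x y is enough.
pairings :: (a -> a -> Bool) -> [a] -> [[(a, a)]]
pairings _ []     = [[]]
pairings p (x:xs) =
  [ (x, y) : (y, x) : rest
  | (y, ys) <- picks xs     -- choose a partner for the first element
  , p x y                   -- keep it only if the predicate holds
  , rest <- pairings p ys   -- pair up whatever is left
  ]

-- The first complete pairing, or Nothing if no complete pairing exists.
pairUp :: (a -> a -> Bool) -> [a] -> Maybe [(a, a)]
pairUp p = listToMaybe . pairings p
```

Because lists are lazy, listToMaybe stops at the first complete pairing, and a list with an odd number of elements yields Nothing because the last element has no partner to pick.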

Related

How to use groupBy on a list of tuples?

How can I group this list by second element of tuples:
[(3,2),(17,2),(50,3),(64,3)]
to get something like:
[[(3,2),(17,2)],[(50,3),(64,3)]]
I'm actually a newcomer to Haskell, and I seem to be falling in love with it. I hope you can help me find an efficient way.

It sounds like you've already identified that you want Data.List.groupBy. The type of this function is
groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
So it takes a binary predicate, i.e. an equivalence relation determining how to group elements. You want to group elements by equality on the second term of a pair, so you want
groupBy (\x y -> snd x == snd y) myList
Where snd is a built-in function that gets the second element of a pair.
Incidentally, this pattern of "apply a function to two arguments and then apply a binary function to the results" is very common, especially when calling Data.List functions, so Data.Function provides on.
on :: (b -> b -> c) -> (a -> b) -> a -> a -> c
Weird signature, but the use case is just what we want.
((+) `on` f) x y = f x + f y
So your desired groupBy can be written as
groupBy ((==) `on` snd)
Note that groupBy only finds consecutive equal elements. You didn't indicate whether you wanted consecutive equal elements or all equal elements, but if you want the latter, then I don't believe Haskell base provides that function, though you could certainly write it recursively yourself.
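A minimal sketch of both variants follows. The function names are hypothetical, and the "all equal elements" version assumes the second components are orderable and that sorting the input first is acceptable.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortOn)

-- Groups only runs of consecutive tuples that share a second component.
groupConsecutiveBySnd :: Eq b => [(a, b)] -> [[(a, b)]]
groupConsecutiveBySnd = groupBy ((==) `on` snd)

-- Groups all tuples that share a second component, wherever they occur.
-- Assumption: an Ord instance is available, so we can sort before grouping.
-- sortOn is stable, so tuples keep their relative order inside each group.
groupAllBySnd :: Ord b => [(a, b)] -> [[(a, b)]]
groupAllBySnd = groupBy ((==) `on` snd) . sortOn snd
```

On the list from the question both functions give the same result, since equal second components are already adjacent there.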

Take symmetrical pairs, of different numbers from list

I have a list like this:
[(2,3),(2,5),(2,7),(3,2),(3,4),(3,6),(4,3),(4,5),(4,7),(5,2),(5,4),(5,6),(6,3),(6,5),(6,7),(7,2),(7,4),(7,6)]
The digits are from [2..7]. I want to take a set where there are any symmetrical pairs. e.g. [(1,2),(2,1)], but those two numbers aren't used again in the set. An example would be:
[(3,6),(6,3),(2,5),(5,2),(4,7),(7,4)]
I wanted to first put symmetric pairs together, as I thought it might be easier to work with, so I created this function, which creates the pairs and puts them in another list:
g xs = [ (y,x):(x,y):[] | (x,y) <- xs ]
with which the list turns to this:
[[(3,2),(2,3)],[(5,2),(2,5)],[(7,2),(2,7)],[(2,3),(3,2)],[(4,3),(3,4)],[(6,3),(3,6)],[(3,4),(4,3)],[(5,4),(4,5)],[(7,4),(4,7)],[(2,5),(5,2)],[(4,5),(5,4)],[(6,5),(5,6)],[(3,6),(6,3)],[(5,6),(6,5)],[(7,6),(6,7)],[(2,7),(7,2)],[(4,7),(7,4)],[(6,7),(7,6)]]
Then from here I was hoping to somehow remove duplicates.
I made a function that will look at all of the fst elements of all of the pairs:
flatList xss = [ x | xs <- xss, (x,y) <- xs ]
to use with another function to remove the duplicates.
h (x:xs) | (fst (head x)) `elem` (flatList xs) = h xs
         | otherwise = (head x):(last x):(h xs)
which gives me the list
[(3,6),(6,3),(5,6),(6,5),(2,7),(7,2),(4,7),(7,4),(6,7),(7,6)]
which has duplicate numbers. That function only takes into account the first element of the first pair in the list of lists. The problem comes when I also take into account the first element of the second pair (or the second element of the first pair):
h (x:xs) | (fst (head x)) `elem` (flatList xs) || (fst (last x)) `elem` (flatList xs) = h xs
         | otherwise = (head x):(last x):(h xs)
I only get these two pairs:
[(6,7),(7,6)]
I see that the problem is that this method of deleting duplicates grabs the last repeated element, and would work with a list of digits, but not a list of pairs, as it misses pairs it needs to take.
Is there another way to solve this, or an alteration I could make?

It probably makes more sense to use a 2-tuple of 2-tuples in your list comprehension, since that makes pattern matching easier and enforces "by contract" that there are exactly two items. We can thus construct 2-tuples that contain the 2-tuples with (tails comes from Data.List):
g :: Eq a => [(a, a)] -> [((a, a), (a, a))]
g xs = [ (t, s) | (t@(x,y):ts) <- tails xs, let s = (y, x), elem s ts ]
Here the elem s ts checks if the "swapped" 2-tuple occurs in the rest of the list.
Then we still need to filter the elements. We can make use of a function that uses an accumulator for the thus far obtained items:
h :: Eq a => [((a, a), (a, a))] -> [(a, a)]
h = go []
  where go _ [] = []
        go seen ((t@(x, y), s):xs)
          | notElem x seen && notElem y seen = t : s : go (x:y:seen) xs
          | otherwise = go seen xs
For the given sample input, we thus get:
Prelude Data.List> (h . g) [(2,3),(2,5),(2,7),(3,2),(3,4),(3,6),(4,3),(4,5),(4,7),(5,2),(5,4),(5,6),(6,3),(6,5),(6,7),(7,2),(7,4),(7,6)]
[(2,3),(3,2),(4,5),(5,4),(6,7),(7,6)]

After reading your question a few times, I found an elegant solution. If you have a list of pairs in which no number is repeated, you can easily get the list of swapped pairs, which solves your problem. So the problem can be reduced to: given a list, get a list of pairs that uses each number just once.
For a given list there are many solutions, e.g. for [1,2,3,4] valid solutions are [(2,4),(4,2),(1,3),(3,1)] and [(2,3),(3,2),(1,4),(4,1)], etc. The approach here is:
take a permutation of the original list (say [1,4,3,2])
pick one element from each half and pair them together (for simplicity, you can pick consecutive elements too)
for each pair, create the swapped pair and put them all together
By doing so you end up with a list of pairs with non-repeating numbers, together with their symmetric counterparts. Moreover, by looping over all permutations, you can get all the solutions to your problem.
import Data.List (permutations, splitAt)
import Data.Tuple (swap)
-- This function splits a list by the half of the length
splitHalf :: [a] -> ([a], [a])
splitHalf xs = splitAt (length xs `quot` 2) xs
-- This zips a pair of lists into a list of pairs
zipHalfs :: ([a], [a]) -> [(a,a)]
zipHalfs (xs, ys) = zip xs ys
-- Given a list of tuples, creates a larger list with all tuples and all swapped tuples
makeSymetrics :: [(a,a)] -> [(a,a)]
makeSymetrics xs = foldr (\t l -> t:(swap t):l) [] xs
-- This chains all of the above.
-- Take all permutations of xs >>> for each permutations >>> split it in two >>> zip the result >>> make swapped pairs
getPairs :: [a] -> [[(a,a)]]
getPairs xs = map (makeSymetrics . zipHalfs . splitHalf) $ permutations xs
>>> getPairs [1,2,3,4]
[[(1,3),(3,1),(2,4),(4,2)],[(2,3),(3,2),(1,4),(4,1)] ....

Haskell delete largest number from a list

I am trying to figure out how to create a recursive function that finds the largest element in a list, deletes it, and returns the resulting list. This is what I have so far, but the problem is that every time I run it, it returns the list without any of the values that are assigned to x.
deleteMax :: (Ord a) => [a] -> [a]
deleteMax [] = []
deleteMax [x] = []
deleteMax (x:y:xs)
  | x == y = y : deleteMax xs
  | x >= y = y : deleteMax xs
  | x < y = x : deleteMax xs

This is not your answer
So you are a beginner and as such would like the simple solution of "how do I find the largest element in a list" followed by "how do I remove (one of the) largest element(s) in the list". This isn't that answer but it is me avoiding a long comment while also giving you something to come back to in 3 months.
The Lazy Way
One solution, which @n.m. and I were sparring about in comments, is to tie the knot (Googleable term). In this method you only need one logical pass over the list. In this case it is basically a trick to hide the pass that constructs the result list.
The idea is that during your pass over the list you do both tasks of 1. Compute the maximum element and 2. Compare with the maximum element and construct the list. There is nothing here that requires a monad but it can be easiest to see as part of a state monad:
deleteMaxState :: (Ord a) => [a] -> [a]
deleteMaxState [] = []
First we handle the base cases so we have a candidate 'maximum' (x) for our recursive operation.
deleteMaxState xs@(fstElem:_) =
  let (r,(m,_)) = runState (go xs) (fstElem, notMax m)
      notMax mx v = if (mx > v) then (v:) else id
      go [] = return []
      go (x:xs) =
        do (curr,f) <- get
           when (x > curr) (put (x,f))
           f x <$> go xs
  in r
In the loop we track two values. The first, curr, is the largest value observed so far in our traversal of the list. The second value, f, is the trick: it is (a function including) the maximum value, provided to the computation after the traversal has completed.
The magic is all here:
(r,(m,_)) = runState (go xs) (fstElem, m)
The left element of the result state (m,_) was our running maximum. Once the traversal ends we use that value - it becomes the right element (fstElem, m) and thus represents the maximum of the whole list.
We can use f to create thunks that populate portions of the list or just in-line construct our list as a bunch of unevaluated cons computations.
Making this one iota simpler, we can remove the higher-order function f and just have a number (untested):
deleteMaxState xs@(fstElem:_) =
  let (r,(m,_)) = runState (go xs) (fstElem, m)
      go [] = return []
      go (x:xs) =
        do (curr,theMax) <- get
           when (x > curr) (put (x,theMax))
           ((if x >= theMax then Nothing else Just x) :) <$> go xs
  in catMaybes r
Now we can see the second pass pretty explicitly not just as an unevaluated set of "some computation involving max, consed on the result" but as an actual pass via catMaybes.
The tying of the knot allows the programmer to write one logical traversal. This can be nice since it requires only one pattern match and recursive call per constructor of the list elements but at the cost of reasoning about evaluation order.
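For contrast, here is a minimal sketch of the plain two-pass version that the answer deliberately skips: find the maximum, then delete one occurrence of it. The name deleteMaxSimple is hypothetical, and the sketch assumes that removing only the first occurrence of the largest value is what is wanted.

```haskell
import Data.List (delete)

-- Two passes: maximum walks the list once, delete walks it again.
-- Assumption: only the first occurrence of the largest value is removed.
deleteMaxSimple :: Ord a => [a] -> [a]
deleteMaxSimple [] = []                        -- maximum is undefined on []
deleteMaxSimple xs = delete (maximum xs) xs
```

Unlike the knot-tying versions above, which drop every element equal to the maximum, this one keeps any further copies of it.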

Argument of groupBy's Lambda

Learn You a Haskell shows the groupBy function:
ghci> let values = [-4.3, -2.4, -1.2, 0.4, 2.3, 5.9, 10.5,
29.1, 5.3, -2.4, -14.5, 2.9, 2.3]
ghci> groupBy (\x y -> (x > 0) == (y > 0)) values
[[-4.3,-2.4,-1.2],[0.4,2.3,5.9,10.5,29.1,5.3],[-2.4,-14.5],[2.9,2.3]]
In groupBy's first argument, what is the meaning of the lambda's 2 arguments: x and y?

These are the two values being compared. group puts equal neighbouring values together, and to decide what counts as equal it relies on the Eq instance of your type. groupBy instead lets you choose how values are compared by passing your own comparison function.

If we look at the type of groupBy:
groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
The first argument to groupBy is a function that takes two arguments of type a and return a Bool. You could equivalently write this as
groupBy comparer values where comparer x y = (x > 0) == (y > 0)
The \x y -> part just says that the lambda function takes two arguments named x and y, just like with any other function declaration.
The easiest way to see what this expression does is to just run it:
ghci> groupBy (\x y -> (x > 0) == (y > 0)) values
[[-4.3,-2.4,-1.2],[0.4,2.3,5.9,10.5,29.1,5.3],[-2.4,-14.5],[2.9,2.3]]
If you look closely, you can see that each sublist is grouped by if it's positive or negative. The groupBy function groups elements of a list by the given condition, but only in sequential order. For example:
ghci> groupBy (\x y -> x == y) [1, 1, 2, 2, 2, 3, 3, 4]
[[1,1],[2,2,2],[3,3],[4]]
ghci> groupBy (\x y -> x == y) [1, 1, 2, 2, 2, 3, 3, 1]
[[1,1],[2,2,2],[3,3],[1]]
In the second example, notice that the 1s haven't all been grouped together because they aren't adjacent.

In cases like these, it's best to go straight to the source! groupBy is part of Data.List, so you can find it in the base package on Hackage. When you don't know what package a function is in, search for the function in Hoogle and click on the name to be taken to the Haddocks on Hackage. When you're looking at Haddock documentation, there will usually be a "Source" link on the right-hand side of the function type definition to take you to the definition. Here's the source for groupBy.
I've reproduced the definition here to step through it.
-- | The 'groupBy' function is the non-overloaded version of 'group'.
groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy _ [] = []
groupBy eq (x:xs) = (x:ys) : groupBy eq zs
where (ys,zs) = span (eq x) xs
First, the documentation line at the top tells us that groupBy is the non-overloaded version of group, which is a very common pattern in base. You can go check out group to figure out the simplest case of grouping functionality, then you can understand the -By version as allowing you to supply your own predicate (in case you wanted to compare equality differently than the Eq instance for a type, or whatever other operation you're trying to do).
The base case is trivial, but the recursive step might be a little confusing if you don't know what span does (time to hit Hackage again!). span takes a predicate and a list and returns a pair (2-tuple) of lists broken before the first element that doesn't match the predicate (it's like break but (not) negated).
So now you should be able to put it all together and see that groupBy groups elements of a list together by segregating runs of elements which are "equal" to the first element in that run. Note that it is NOT comparing elements pairwise (I was burned by that before) so don't assume that the two elements being passed to the predicate function would be adjacent in the list!
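As a small illustration of how span and break relate, here is a sketch with made-up inputs (spanDemo and breakDemo are hypothetical names):

```haskell
-- span keeps the longest prefix whose elements all satisfy the predicate
-- and returns it together with the remainder of the list.
spanDemo :: ([Int], [Int])
spanDemo = span even [2, 4, 5, 6]

-- break stops at the first element that satisfies the predicate, so
-- break odd splits at the same point as span even.
breakDemo :: ([Int], [Int])
breakDemo = break odd [2, 4, 5, 6]
```

Both values are ([2,4],[5,6]): the 6 stays in the second list even though it is even, because the split happens once, at the first failure.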

Let's start with group. This function simply groups together all the adjacent elements that are identical. e.g.,
group [0,1,2,3,3,4] = [[0],[1],[2],[3,3],[4]]
GHC defines group as follows:
group :: Eq a => [a] -> [[a]]
group = groupBy (==)
That is, the equality test is implicit. The same thing can be written with an explicit equality test using groupBy:
groupBy (\x y -> x == y) [0,1,2,3,3,4] = [[0],[1],[2],[3,3],[4]]
Now, let's look at how GHC defines groupBy:
groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy _ [] = []
groupBy eq (x:xs) = (x:ys) : groupBy eq zs
where (ys,zs) = span (eq x) xs
eq is used to split the rest of the list based on comparison with the first element. groupBy is then recursively called on the list of elements that fail the comparison. Note that (x:ys) is a concatenation of the first element, with the list of elements that satisfy the comparison condition. Also, the span function will start the second list at the first element where test condition is not met.
Hence, in the given example, the moment you reach the value 0.4, a new list has to start, since 0.4 will be the first element of zs from the above definition.

groupBy divides a list into groups according to some “rule”. In groupBy (\x y -> x `someComparison` y) someList, x is the first element of the “current” group and y is an element of someList. groupBy traverses someList, so at each step y becomes the next element of someList. A new group is started when the predicate returns False, and y becomes the first member of that new group. Iteration then continues: the first member of the new group becomes x, and the next element of someList becomes y.
groupBy does NOT compare elements pairwise (1st with 2nd, 2nd with 3rd, etc), instead it compares each element of the list with the first element of the group currently being filled. Example:
groupBy (\x y -> x < y) [1,2,3,2,1] -- returns: [[1,2,3,2],[1]]
Step by step groupBy:
compares 1 with 2. 1 < 2, thus both numbers go into the same group. Groups: [[1,2]]
compares 1 with 3. 1 < 3, thus 3 goes into the old group. Groups: [[1,2,3]]
compares 1 with 2. 1 < 2, thus 2 goes into the old group. Groups: [[1,2,3,2]]
compares 1 with 1. 1 ≮ 1, thus the new group is formed, and 1 goes as its first element. Groups: [[1,2,3,2],[1]]
Exercise: To understand how groupBy works, try to figure out how the following expressions return their results with pen and paper:
groupBy (\x y -> x < y) [1,2,3,4,5,4,3,2,1] -- [[1,2,3,4,5,4,3,2],[1]]
groupBy (\x y -> x < y) [1,3,5,2,1] -- [[1,3,5,2],[1]]
groupBy (\x y -> x <= y) [3,5,3,2,1,0,1,0] -- [[3,5,3],[2],[1],[0,1,0]]
groupBy (\x y -> x <= y) [1,2,3,2,1] -- [[1,2,3,2,1]]
Again, the key to understand behavior of groupBy is to remember, that at each step it compares the first element of the current group with a consecutive element of the list. The moment predicate returns False, the new group is formed, and the process continues.
To understand why groupBy behaves that way, inspect its source:
groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy _ [] = []
groupBy eq (x:xs) = (x:ys) : groupBy eq zs
where (ys,zs) = span (eq x) xs
The key here is the use of the span function: span (1<) [2,3,2,1] returns ([2,3,2],[1]). span (eq x) xs in the above code puts the leading elements of xs that match (eq x) into the first part of a pair, and the rest of xs into the second. (x:ys) then joins x with the first part of the pair, while groupBy is recursively invoked on the rest of xs (which is zs). That's why groupBy works this strange way.
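If pairwise comparison of neighbours is what you actually want, it has to be written by hand. The following is a minimal sketch of such a helper; groupAdjacentBy is a hypothetical name, not a function from base.

```haskell
import Data.List (groupBy)

-- Hypothetical helper: compares each element with its immediate predecessor,
-- unlike groupBy, which compares with the first element of the current group.
groupAdjacentBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupAdjacentBy _ [] = []
groupAdjacentBy rel (x:xs) = go [x] x xs
  where
    -- acc holds the current group in reverse; prev is its latest element.
    go acc _ [] = [reverse acc]
    go acc prev (y:ys)
      | rel prev y = go (y : acc) y ys
      | otherwise  = reverse acc : go [y] y ys

-- The same predicate and input give different groupings:
viaGroupBy, viaAdjacent :: [[Int]]
viaGroupBy  = groupBy (<) [1,2,3,2,1]          -- [[1,2,3,2],[1]]
viaAdjacent = groupAdjacentBy (<) [1,2,3,2,1]  -- [[1,2,3],[2],[1]]
```

With an actual equivalence relation such as (==) the two functions agree; they only diverge for predicates like (<) that are not equivalences.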

List processing in Haskell

I am teaching myself Haskell and have run into a problem and need help.
Background:
type AInfo = (Char, Int)
type AList = [AInfo] (let's say [('a', 2), ('b', 5), ('a', 1), ('w', 21)])
type BInfo = Char
type BList = [BInfo] (let's say ['a', 'a', 'c', 'g', 'a', 'w', 'b'])
One quick edit: The above information is for illustrative purposes only. The actual elements of the lists are a bit more complex. Also, the lists are not static; they are dynamic (hence the uses of the IO monad) and I need to keep/pass/"return"/have access to and change the lists during the running of the program.
I am looking to do the following:
For all elements of AList, check against all elements of BList. Where the character of the AList element (pair) is equal to a character in BList, add one to the Int value of the AList element (pair) and remove that character from BList.
So what this means is after the first element of AList is checked against all elements of BList the values of the lists should be:
AList [('a', 5), ('b', 5), ('a', 1), ('w', 21)]
BList ['c', 'g', 'w', 'b']
And in the end, the lists' values should be:
AList [('a', 5), ('b', 6), ('a', 1), ('w', 22)]
BList ['c', 'g']
Of course, all of this is happening in an IO monad.
Things I have tried:
Using mapM and a recursive helper function. I have looked at both:
Every element of AList checked against every element of BList -- mapM (myHelpF1 alist) blist and
Every element of BList checked against every element of AList -- mapM (myHelpF2 alist) blist
Passing both lists to a function and using a complicated if/then/else & helper function calls (feels like I am forcing Haskell to be iterative; messy, convoluted code that does not feel right.)
I have thought about using filter with the character value of the AList element and BList to create a third list of Bool, then counting the number of True values and updating the Int value. Then use filter on BList to remove the BList elements that …… (again, does not feel right, not very Haskell-like.)
Things I think I know about the problem:
The solution may be exceedingly trivial. So much so, the more experienced Haskellers will be muttering under their breath “what a noob” as they type their response.
Any pointers would be greatly appreciated. (mutter away….)

A few pointers:
Don't use [(Char, Int)] for "AList". The data structure you are looking for is a finite map: Map Char Int. Particularly look at member and insertWith. toList and fromList convert from the representation you currently have for AList, so even if you are stuck with that representation, you can convert to a Map for this algorithm and convert back at the end. (This will be more efficient than staying in a list because you are doing so many lookups, and the finite map API is easier to work with than lists)
I'd approach the problem as two phases: (1) partition out the elements of blist by whether they are in the map, (2) insertWith the elements which are already in the map. Then you can return the resulting map and the other partition.
I would also get rid of the meaningless assumptions such as that keys are Char -- you can just say they are any type k (for "key") that satisfies the necessary constraints (that you can put it in a Map, which requires that it is Orderable). You do this with lowercase type variables:
import qualified Data.Map as Map
sieveList :: (Ord k) => Map.Map k Int -> [k] -> (Map.Map k Int, [k])
Writing algorithms in greater generality helps catch bugs, because it makes sure that you don't use any assumptions you don't need.
Oh, also this program has no business being in the IO monad. This is pure code.
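The two phases described above could be sketched as follows. The body of sieveList is an assumption about how to fill in the signature given earlier, not the answerer's own code. Converting an AList with Map.fromList merges duplicate keys and keeps the last value for each, so the duplicate 'a' entry from the question is dropped in the usage example.

```haskell
import Data.List (foldl', partition)
import qualified Data.Map as Map

-- Phase 1: split the incoming keys by whether they are already in the map.
-- Phase 2: add one to the count for every key that was found.
sieveList :: Ord k => Map.Map k Int -> [k] -> (Map.Map k Int, [k])
sieveList counts keys = (counts', misses)
  where
    (hits, misses) = partition (`Map.member` counts) keys
    counts'        = foldl' (\m k -> Map.insertWith (+) k 1 m) counts hits

-- Usage with the data from the question (without the duplicate 'a' entry).
example :: (Map.Map Char Int, String)
example = sieveList (Map.fromList [('a', 2), ('b', 5), ('w', 21)]) "aacgawb"
```

Map.toList (fst example) converts the result back to the list-of-pairs representation if the rest of the program needs it.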

import Data.List
type AInfo = (Char, Int)
type AList = [AInfo]
type BInfo = Char
type BList = [BInfo]
process :: AList -> BList -> AList
process [] _ = []
process (a:as) b =
  if is_in a b
    then (fst a, snd a + 1) : process as (delete (fst a) b)
    else a : process as b
  where
    is_in f [] = False
    is_in f (s:ss) = if fst f == s then True else is_in f ss
*Main> process [('a',5),('b',5),('a',1),('b',21)] ['c','b','g','w','b']
[('a',5),('b',6),('a',1),('b',22)]
*Main> process [('a',5),('b',5),('a',1),('w',21)] ['c','g','w','b']
[('a',5),('b',6),('a',1),('w',22)]
Probably an important disclaimer: I'm rusty at Haskell to the point of ineptness, but as a relaxing midnight exercise I wrote this thing. It should do what you want, although it doesn't return a BList. With a bit of modification, you can get it to return an (AList,BList) tuple, but methinks you'd be better off using an imperative language if that kind of manipulation is required.
Alternately, there's an elegant solution and I'm too ignorant of Haskell to know it.

While I am by no means a Haskell expert, I have a partial attempt that returns the result of the operation for one step. Maybe you can work out how to map it over the rest to get your solution. The idea behind addWhile is that you only want to update the first occurrence of an element in lista; if it exists twice, it will just add 0 to it. Code critiques are more than welcome.
import Data.List
type AInfo = (Char, Int)
type AList = [AInfo]
type BInfo = Char
type BList = [BInfo]
lista = ([('a', 2), ('b',5), ('a', 1), ('w', 21)] :: AList)
listb = ['a','a','c','g','a','w','b']
--step one, get the head, and its occurrences
items list = (eleA, eleB) where
  eleA = length $ filter (\x -> x == (head list)) list
  eleB = head list
getRidOfIt list ele = (dropWhile (\x -> x == ele) list) --drop like its hot
--add to lista
addWhile :: [(Char, Int)] -> Char -> Int -> [(Char,Int)]
addWhile [] _ _ = []
addWhile ((x,y):xs) letter times = if x == letter then (x,y+times) : addWhile xs letter times
                                   else (x,y) : addWhile xs letter 0
--first answer
firstAnswer = addWhile lista (snd $ items listb) (fst $ items listb)
--[('a',5),('b',5),('a',1),('w',21)]

The operation you describe is pure, as @luqui points out, so we just define it as a pure Haskell function. It can be used inside a monad (including IO) by means of fmap (or do).
import Data.List
combine alist blist = (reverse a, b4) where
First we sort and count the B list:
    b = map (\g->(head g,length g)) . group . sort $ blist
We need the import for group and sort to be available. Next, we roll along the alist and do our thing:
    (a,b2) = foldl g ([],b) alist
    g (acc,b) e@(x,c) = case pick x b of
        Nothing -> (e:acc,b)
        Just (n,b2) -> ((x,c+n):acc,b2)
    b3 = map fst b2
    b4 = [ c | c <- blist, elem c b3 ]
Now pick, as used, must be
pick x [] = Nothing
pick x ((y,n):t)
  | x==y = Just (n,t)
  | otherwise = case pick x t of
      Nothing -> Nothing
      Just (k,r) -> Just (k, (y,n):r)
Of course pick performs a linear search, so if performance (speed) becomes a problem, b should be changed to allow for binary search (tree etc, like Map). The calculation of b4 which is filter (`elem` b3) blist is another potential performance problem with its repeated checks for presence in b3. Again, checking for presence in trees is faster than in lists, in general.
Test run:
> combine [('a', 2), ('b',5), ('a', 1), ('w', 21)] "aacgawb"
([('a',5),('b',6),('a',1),('w',22)],"cg")
edit: you probably want it the other way around, rolling along the blist while updating the alist and producing (or not) the elements of blist in the result (b4 in my code). That way the algorithm will operate in a more local manner on long input streams (assuming your blist is long, though you didn't say that). As written above, it will have a space problem, consuming the input stream blist several times over. I'll keep it as is, as an illustration and food for thought.
So if you decide to go the 2nd route, first convert your alist into a Map (beware the duplicates!). Then, scanning (with scanl) over blist, make use of updateLookupWithKey to update the counts map and at the same time decide for each member of blist, one by one, whether to output it or not. The type of the accumulator will thus have to be (Maybe a, Map a Int), with a your element type (blist :: [a]):
scanl :: (acc -> a -> acc) -> acc -> [a] -> [acc]
scanning = tail $ scanl g (Nothing, fromList $ reverse alist) blist
g (_,cmap) a = case updateLookupWithKey (\_ c->Just(c+1)) a cmap of
    (Just _, m2) -> (Nothing, m2) -- seen before
    _ -> (Just a, cmap) -- not present in counts
new_b_list = [ a | (Just a,_) <- scanning ]
last_counts = snd $ last scanning
You will have to combine the toList last_counts with the original alist if you have to preserve the old duplicates there (why would you?).
