Working with list of tuples - haskell

I've been trying to solve this, but I just can't figure it out. So, I've a list with tuples, for example:
[("Mary", 10), ("John", 45), ("Bradley", 30), ("Mary", 15), ("John", 10)]
and what I want to get is a list with also tuples where, if the name is the same, the numbers of those tuples should be added and, if not, that tuple must be part of the final list too, exemplifying:
[("Mary",25), ("John", 55), ("Bradley", 30)]
I don't know if I explained myself really well, but I think you'll probably understand with the examples.
I've tried this, but it doesn't work:
test ((a,b):[]) = [(a,b)]
test ((a,b):(c,d):xs) | a == c = (a,b+d):test((a,b):xs)
                      | otherwise = (c,d):test((a,b):xs)

Doing this sort of thing is always awkward with lists, because of their sequential nature--they don't really lend themselves to operations like "find matching items" or "compute a new list by combining specific combinations of list elements" or other things that are by nature non-sequential.
If you step back for a moment, what you really want to do here is, for each distinct String in the list, find all the numbers associated to it and add them up. This sounds more suited to a key-value style data structure, for which the most standard in Haskell is found in Data.Map, which gives you a key-value map for any value type and any ordered key type (that is, an instance of Ord).
So, to build a Map from your list, you can use the fromList function in Data.Map... which, conveniently, expects input in the form of a list of key-value tuples. So you could do this...
import qualified Data.Map as M
nameMap = M.fromList [("Mary", 10), ("John", 45), ("Bradley", 30), ("Mary", 15), ("John", 10)]
...but that's no good, because inserting them directly will overwrite the numbers instead of adding them. You can use M.fromListWith to specify how to combine values when inserting a duplicate key--in the general case, it's common to use this to build a list of values for each key, or similar things.
But in your case we can skip straight to the desired result:
nameMap = M.fromListWith (+) [("Mary", 10), ("John", 45), ("Bradley", 30), ("Mary", 15), ("John", 10)]
This will insert directly if it finds a new name, otherwise it will add the values (the numbers) on a duplicate. You can turn it back into a list of tuples if you like, using M.toList:
namesList = M.toList $ M.fromListWith (+) [("Mary", 10), ("John", 45), ("Bradley", 30), ("Mary", 15), ("John", 10)]
Which gives us a final result of [("Bradley",30),("John",55),("Mary",25)].
But if you want to do more stuff with the collection of names/numbers, it might make more sense to keep it as a Map until you're done.
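If the order of first appearance matters (the expected output in the question lists Mary first), one possible sketch is to compute the totals with M.fromListWith and then read them back in the order in which the names first occur. The name sumInOrder is made up for this illustration, and nub is quadratic, so this is only meant for small lists.

```haskell
import Data.List (nub)
import qualified Data.Map as M

-- Sum the numbers per name, keeping names in order of first appearance.
-- (sumInOrder is a hypothetical helper name, not part of the answer above.)
sumInOrder :: [(String, Int)] -> [(String, Int)]
sumInOrder pairs = [(name, totals M.! name) | name <- nub (map fst pairs)]
  where
    -- Every name we look up came from pairs, so (M.!) cannot fail here.
    totals = M.fromListWith (+) pairs
```

For the list in the question this should give [("Mary",25),("John",55),("Bradley",30)].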

Here's another way using lists:
import Data.List
answer :: [(String, Int)] -> [(String, Int)]
answer = map (foo . unzip) . groupBy (\x y -> fst x == fst y) . sort
  where foo (names, vals) = (head names, sum vals)
It's a fairly straightforward approach.
First, the dot (.) is function composition, which chains functions so that the output of one becomes the input of the next; the pipeline reads from right to left. We start by applying sort, which moves equal names next to one another in the list (pairs are compared by name first, then by number). Next we use groupBy to put all the pairs with the same name into a single list. We end up with a list of lists, each containing the pairs for one name:
[[("Bradley",30)], [("John",10),("John",45)], [("Mary",10),("Mary", 15)]]
Given such a list, how would you handle each sublist?
That is, how would you handle a list containing all the same names?
Obviously we wish to shrink them down into a single pair, which contains the name and the sum of the values. To accomplish this, I chose the function (foo . unzip), but there are many other ways to go about it. unzip takes a list of pairs and creates a single pair. The pair contains 2 lists, the first with all the names, the second with all the values. This pair is then passed to foo by way of function composition, as discussed earlier. foo picks it apart using a pattern, and then applies head to the names, returning only a single name (they're all the same), and applying sum to the list of values. sum is another standard list function that sums the values in a list, naturally.
However, this (foo . unzip) only applies to a single list of pairs, yet we have a list of lists. This is where map comes in. map will apply our (foo . unzip) function to each list in the list, or more generally, each element in the list. We end up with a list containing the results of applying (foo . unzip) to each sublist.
I would recommend looking at all the list functions used in Data.List.
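As a rough sketch of the same pipeline with the intermediate step given its own name, the version below uses sortOn and on instead of the lambda. The names groupsByName and sumPerName are invented for this example, and the types are fixed to (String, Int) purely for readability.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortOn)

-- Steps 1 and 2: bring equal names together, then split into runs.
groupsByName :: [(String, Int)] -> [[(String, Int)]]
groupsByName = groupBy ((==) `on` fst) . sortOn fst

-- Step 3: collapse each run into (name, total).
-- groupBy never produces an empty group, so the pattern always matches.
sumPerName :: [(String, Int)] -> [(String, Int)]
sumPerName pairs =
  [(name, sum (map snd grp)) | grp@((name, _) : _) <- groupsByName pairs]
```

Matching on the first pair of each group avoids calling head, at the cost of a slightly busier list comprehension.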

I think your attempt did not work because it only combines elements when entries with the same key occur next to each other in the list. So instead, I'm going to use a map (often called a dictionary if you've used other languages) to remember which keys we've seen and keep the totals. First we need to import the functions we need.
import Data.Map hiding (foldl, foldl', foldr)
import Data.List (foldl')
Now we can just fold along the list, and for each key value pair update our map accordingly.
sumGroups :: (Ord k, Num n) => [(k, n)] -> Map k n
sumGroups list = foldl' (\m (k, n) -> alter (Just . maybe n (+ n)) k m) empty list
So, foldl' walks along the list with a function. It calls the function with each element (here the pair (k, n)), and another argument, the accumulator. This is our map, which starts out as empty. For each element, we alter the map, using a function from Maybe n -> Maybe n. This reflects the fact the map may not already have anything in it under the key k - so we deal with both cases. If there's no previous value, we just return n, otherwise we add n to the previous value. This gives us a map at the end which should contain the sums of the groups. Calling the toList function on the result should give you the list you want.
Testing this in ghci gives:
$ ghci
GHCi, version 7.6.1: http://www.haskell.org/ghc/ :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Prelude> import Data.Map hiding (foldl, foldl', foldr)
Prelude Data.Map> import Data.List (foldl')
Prelude Data.Map Data.List> let sumGroups list = foldl' (\m (k, n) -> alter (Just . maybe n (+ n)) k m) empty list
Loading package array-0.4.0.1 ... linking ... done.
Loading package deepseq-1.3.0.1 ... linking ... done.
Loading package containers-0.5.0.0 ... linking ... done.
Prelude Data.Map Data.List> toList $ sumGroups $ [("Mary", 10), ("John", 45), ("Bradley", 30), ("Mary", 15), ("John", 10)]
[("Bradley",30),("John",55),("Mary",25)]
Prelude Data.Map Data.List>
The groups come out in sorted order as a bonus, because internally Map uses a form of balanced binary tree, so it is cheap to traverse in order and output a list sorted by key.
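The alter call above handles the "key missing" and "key present" cases by hand. A minimal sketch of the same fold using insertWith, which covers both cases in one call, might look like this (sumGroupsWith is a name chosen here for illustration, and the qualified import is a stylistic assumption):

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as M

-- Fold over the pairs, adding each number to the running total for its key.
-- insertWith inserts n when the key is new and combines with (+) otherwise.
sumGroupsWith :: (Ord k, Num n) => [(k, n)] -> M.Map k n
sumGroupsWith = foldl' step M.empty
  where
    step m (k, n) = M.insertWith (+) k n m
```

Calling M.toList on the result should give the same sorted list as the ghci session above.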

Here are my two cents. Using just the Haskell Prelude.
test tup = sumAll
  where
    collect ys [] = ys
    collect ys (x:xs) =
      if fst x `notElem` ys
        then collect (fst x : ys) xs
        else collect ys xs
    collectAllNames = collect [] tup
    sumOne [] n x = (x, n)
    sumOne (y:ys) n x =
      if fst y == x
        then sumOne ys (n + snd y) x
        else sumOne ys n x
    sumAll = map (sumOne tup 0) collectAllNames
This method traverses the original list several times.
collect builds a temporary list holding just the names, skipping repeated names.
sumOne takes a name, finds the entries in the list with that name, and adds up their numbers. It returns the name together with the sum.

Related

How to create a ranking based on a list of scores in Haskell?

So I got a score list and want to create a ranking list from it. If scores are the same they share a rank.
For example if I have a score list like
[100, 100, 50, 50, 20]
the generated list would be
[(100, 1), (100, 1), (50, 2), (50, 2), (20, 3)]
I guess this is a fairly simple task, but I haven't gotten to solve it yet. I tried to do it with pattern matching or folding but without any luck.
My last failed approach looks like this:
scores = [100, 100, 50, 50, 20, 10]
ranks = foldr (\x acc -> if x == (fst $ last acc)
                           then last acc:acc
                           else (x, (+1) $ snd $ last acc):acc) [(head scores, 1)] scores
Any help is appreciated.
This solution is fairly similar to Willem's, except that it doesn't explicitly use recursion. Many style guides, including the Haskell wiki, suggest avoiding explicit recursion when there's a simple implementation using higher-order functions. Your function is a pretty straightforward use of scanl, which folds a list with an accumulating value (here, the accumulator is the current score and rank) and keeps the intermediate results.
ranks :: Eq a => [a] -> [(a, Int)]
-- Handle the empty case trivially.
ranks [] = []
-- Scan left-to-right. The first element of the result should always
-- have rank 1, hence the `(x, 1)' for the starting conditions.
ranks (x:xs) = scanl go (x, 1) xs
  -- The actual recursion is handled by `scanl'. `go' just
  -- handles each specific iteration.
  where go (curr, rank) y
          -- If the "current" score equals the next element,
          -- don't change the rank.
          | curr == y = (curr, rank)
          -- If they're not equal, increment the rank and
          -- move on.
          | otherwise = (y, rank + 1)
By avoiding explicit recursion, it's arguably easier to see at a glance what the function does. I can look at this, immediately see the scanl, and know that the function will be iterating over the list left-to-right with some state (the rank) and producing intermediate results.
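If scanl is new to you, a tiny sketch that contrasts it with foldl may help. The function names below are made up for this illustration; only scanl and foldl themselves come from the Prelude.

```haskell
-- scanl keeps every intermediate accumulator, starting with the seed.
runningTotals :: [Int] -> [Int]
runningTotals = scanl (+) 0

-- foldl with the same arguments keeps only the final accumulator.
total :: [Int] -> Int
total = foldl (+) 0
```

For example, runningTotals [1, 2, 3] evaluates to [0, 1, 3, 6], and its last element is exactly total [1, 2, 3].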
We can write a recursive algorithm that maintains a state: the rank it is currently assigning. At each step the algorithm looks at the first two elements of the remaining list. If the next element equals the current one, the rank is not incremented; otherwise it is.
We thus can implement it like:
rank :: Eq a => [a] -> [(a, Int)]
rank = go 1
  where go i (x:xr@(x2:_)) = (x, i) : go'
          where go' | x == x2 = go i xr
                    | otherwise = go (i+1) xr
        go i [x] = [(x, i)]
        go _ [] = []
So rank = go 1 initializes the state to 1. In each call, go checks whether the list contains at least two elements. If so, we first emit the first element paired with the state, (x, i), and then recurse on the rest, xr. Depending on whether the first element x equals the second element x2, we either keep the state or increment it. If the list contains only one element x, we return [(x, i)], and if the list is empty, we return the empty list.
Note that this assumes that the scores are already in descending order (or in an order from "best" to "worst", since in some games the "score" is sometimes a negative thing). We can however use a sort step as pre-processing if that would not be the case.
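A minimal sketch of that pre-processing step, assuming higher scores are better: sort in descending order first, then run the same kind of recursion. The name rankSorted is hypothetical, and the Ord constraint is needed only because of the sort.

```haskell
import Data.List (sortBy)
import Data.Ord (comparing, Down (..))

-- Sort from best to worst, then assign ranks; equal scores share a rank.
rankSorted :: Ord a => [a] -> [(a, Int)]
rankSorted = go 1 . sortBy (comparing Down)
  where
    -- Look at the current score and the one after it.
    go i (x : rest@(y : _)) = (x, i) : go (if x == y then i else i + 1) rest
    go i [x] = [(x, i)]
    go _ [] = []
```

If lower scores are better in your game, plain sort in place of sortBy (comparing Down) should do.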
Here's a simple one-liner, putting some off-the-shelf pieces together with a list comprehension.
import Data.List
import Data.Ord (comparing, Down (..))
rank :: Ord a => [a] -> [(a, Int)]
rank xs = [(a, i) | (i, as) <- zip [1..] . group . sortBy (comparing Down) $ xs
                  , a <- as]
If the list is already sorted in reverse order, you can leave out the sortBy (comparing Down).

How to create Haskell function that returns every third element from a list of ints

I want to create a function that returns every third int from a list of ints without using any predefined functions. For example, everyThird [1,2,3,4,5] --> [1,4]
everyThird:: [a] -> [a]
Could I just continue to iterate over the list using tail and appending to a new list every third call? I am new to Haskell and very confused with all of this
One other way of doing this is to handle three different base cases, in all of which we're at the end of the list and the list is less than three elements long, and one recursive case, where the list is at least three elements long:
everyThird :: [a] -> [a]
everyThird [] = []
everyThird [x] = [x]
everyThird [x, _] = [x]
everyThird (x:_:_:xs) = x:everyThird xs
You want to do exactly what you said: iterate over the list and include the element only on each third call. However, there's a problem. Haskell is a funny language where the idea of "changing" a variable doesn't make sense, so the usual approach of "have a counter variable i which tells us whether we're on the third element or not" won't work in the usual way. Instead, we'll create a recursive helper function to maintain the count for us.
everyThird :: [Int] -> [Int]
everyThird xs = helper 0 xs
  where helper _ [] = []
        helper 0 (x : xs) = x : helper 2 xs
        helper n (_ : xs) = helper (n - 1) xs
We have three cases in the helper.
If the list is empty, stop and return the empty list.
If the counter is at 0 (that is, if we're on the third element), make a list starting with the current element and ending with the rest of the computation.
If the counter is not at zero, count down and continue iteration.
Because of the way pattern matching works, it will try these three statements in order.
Notice how we use an additional argument to be the counter variable since we can't mutate the variable like we would in an imperative language. Also, notice how we construct the list recursively; we never "append" to an existing list because that would imply that we're mutating the list. We simply build the list up from scratch and end up with the correct result on the first go round.
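The same counter idea generalizes from "every third" to "every n-th" by resetting the counter to n - 1 instead of 2. This is only a sketch: everyNth is a name invented here, and it assumes n is at least 1.

```haskell
-- Keep the first element, then every n-th element after it.
-- Assumes n >= 1; the counter says how many elements to skip next.
everyNth :: Int -> [a] -> [a]
everyNth n = go 0
  where
    go _ [] = []
    go 0 (x : xs) = x : go (n - 1) xs   -- counter hit zero: keep x, reset
    go k (_ : xs) = go (k - 1) xs       -- otherwise skip and count down
```

With this, everyNth 3 [1,2,3,4,5] gives [1,4], matching the example in the question.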
Haskell doesn't have classical iteration (i.e. no loops), at least not without monads, but you can use similar logic as you would in a for loop by zipping your list with indexes [0..] and applying appropriate functions from Data.List.
E.g. What you need to do is filter every third element:
everyThirdWithIndexes list = filter (\x -> snd x `mod` 3 == 0) $ zip list [0..]
Of course you have to get rid of the indexes, there are two elegant ways you can do this:
everyThird list = map fst $ everyThirdWithIndexes list
-- or:
everyThird list = fst . unzip $ everyThirdWithIndexes list
If you're not familiar with filter and map, you can define a simple recursion that builds a list from every first element of a list, drops the next two and then adds another from a new function call:
everyThird [] = [] -- both in case if the list is empty and the end case
everyThird (x:xs) = x : everyThird (drop 2 xs)
EDIT: If you have any questions about these solutions (e.g. some syntax that you are not familiar with), feel free to ask in the comments. :)
One classic approach:
everyThird xs = [x | (1,x) <- zip (cycle [1..3]) xs]
You can also use chunksOf from Data.List.Split to separate the list into chunks of 3, then just take the first element of each chunk:
import Data.List.Split
everyThird :: [a] -> [a]
everyThird xs = map head $ chunksOf 3 xs
Which works as follows:
*Main> everyThird [1,2,3,4,5]
[1,4]
Note: You may need to run cabal install split to use chunksOf.

List to tuple in Haskell

let's say i have a list like this:
["Questions", "that", "may", "already", "have", "your", "correct", "answer"]
and want to have this:
[("Questions","that"),("may","already"),("have","your"),("correct","answer")]
can this be done ? or is it a bad Haskell practice ?
For a simple method (that fails for an odd number of elements) you can use
combine :: [a] -> [(a, a)]
combine (x1:x2:xs) = (x1,x2):combine xs
combine (_:_) = error "Odd number of elements"
combine [] = []
Or you could use a more complex method like the one in another answer, which I don't really want to understand.
More generic:
map2 :: (a -> a -> b) -> [a] -> [b]
map2 f (x1:x2:xs) = (f x1 x2) : map2 f xs
map2 _ (_:_) = error "Odd number of elements"
map2 _ [] = []
Here is one way to do it, with the help of a helper function that lets you drop every second element from your target list, and then just use zip. This may not have your desired behavior when the list is of odd length since that's not yet defined in the question.
-- This is just from ghci
let my_list = ["Questions", "that", "may", "already", "have", "your", "correct", "answer"]
-- Both clauses go in one let; a second let would shadow the base case.
let dropEvery [] _ = []; dropEvery list count = take (count-1) list ++ dropEvery (drop count list) count
zip (dropEvery my_list 2) $ dropEvery (tail my_list) 2
[("Questions","that"),("may","already"),("have","your"),("correct","answer")]
The helper function is taken from question #6 of 99 Questions, where there are many other implementations of the same idea, probably many with better recursion optimization properties.
To understand dropEvery, it's good to remember what take and drop each do. take k some_list takes the first k entries of some_list. Meanwhile drop k some_list drops the first k entries.
If we want to drop every Nth element, it means we want to keep each run of (N-1) elements, then drop one, then do the same thing again until we are done.
The first part of dropEvery does this: it takes the first count-1 entries, which it will then concatenate to whatever it gets from the rest of the list.
After that, it says drop count (forget about the N-1 you kept, and also the 1 (in the Nth spot) that you had wanted to drop all along) -- and after these are dropped, you can just recursively apply the same logic to whatever is leftover.
Using ++ in this manner can be quite expensive in Haskell, so from a performance point of view this is not so great, but it was one of the shorter implementations available at that 99 questions page.
Here's a function to do it all in one shot, which is maybe a bit more readable:
byTwos :: [a] -> [(a,a)]
byTwos [] = []
byTwos xs = zip firsts seconds
  where enumerated = zip xs [1..]
        firsts = [fst x | x <- enumerated, odd $ snd x]
        seconds = [fst x | x <- enumerated, even $ snd x]
In this case, I started out by saying this problem will be easy to solve with zip if I just already had the list of odd-indexed elements and the list of even-indexed elements. So let me just write that down, and then worry about getting them in some where clause.
In the where clause, I say first zip xs [1..] which will make [("Questions", 1), ("that", 2), ...] and so on.
Side note: recall that fst takes the first element of a tuple, and snd takes the second element.
Then firsts says take the first element of all these values if the second element is odd -- these will serve as "firsts" in the final output tuples from zip.
seconds says do the same thing, but only if the second element is even -- these will serve as "seconds" in the final output tuples from zip.
In case the list has odd length, firsts will be one element longer than seconds and so the final zip means that the final element of the list will simply be dropped, and the result will be the same as though you called the function on the front of the list (all but final element).
A simple pattern matching could do the trick :
f [] = []
f (x:y:xs) = (x,y):f(xs)
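It means that an empty list gives an empty list, and that a list of at least two elements gives a list whose head is the pair of those two elements, followed by the result of applying the same reasoning to the rest. Note that a list with an odd number of elements eventually reaches a one-element list, which neither pattern matches.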
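Since the question does not say what should happen for an odd number of elements, one possible sketch is to make that case explicit in the type by returning a Maybe. The name pairUp is made up for this example; dropping the last element or raising an error, as other answers do, are equally valid choices.

```haskell
-- Pair up consecutive elements; Nothing signals a leftover element.
pairUp :: [a] -> Maybe [(a, a)]
pairUp [] = Just []
pairUp (x : y : rest) = ((x, y) :) <$> pairUp rest
pairUp [_] = Nothing
```

The caller then has to decide what to do with Nothing, instead of hitting a runtime pattern-match failure.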
Using chunksOf from Data.List.Split (called chunk in older versions of the split package) you can pair every two consecutive items in a list. For a given list xs:
import Data.List.Split
map (\ys -> (ys!!0, ys!!1)) $ chunksOf 2 xs
This solution assumes the given list has an even number of items.

Print elements of list that are repeated in Haskell

I want to print the elements that appear more than once in a list. Can you please tell me how I can do that? I am new to Haskell.
For example, if I have [1,2,3,3,2,4,5,6,5], then I want to get only [2,3,5], because these are the repeated elements in the list.
Another solution: First sort the list, then group equal elements and take only the ones that appear multiple times:
>>> :m + Data.Maybe Data.List
>>> let xs = [1..100000] ++ [8,18..100] ++ [10,132,235]
>>> let safeSnd = listToMaybe . drop 1
>>> mapMaybe safeSnd $ group $ sort xs
[8,10,18,28,38,48,58,68,78,88,98,132,235]
group $ sort xs is a list of lists where each list contains all equal elements.
mapMaybe safeSnd returns the second element of each list that has one (that is, the original element occurred more than once in the original list).
This method should be faster than the one using nub, especially for large lists.
Data.Map.Lazy and Data.Map.Strict are host to a bunch of interesting functions for constructing maps (association maps, dictionaries, whatever you want to call them). One of them is fromListWith
fromListWith :: Ord k => (a -> a -> a) -> [(k, a)] -> Map k a
What you want to build is a map that tells you, for each value in your input list, how often it occurs. The values would be the keys of the map (type k), their counts would be the values associated with the keys (type a). You could use the following expression for that:
fromListWith (+) . map (\x -> (x, 1))
First, all values in the list are put into a tuple, together with a count of one. Then, fromListWith builds a map from the list; if a key already exists, it computes a new count using (+).
Once you've done this, you're only interested in the elements that occur more than once. For this, you can use filter (> 1) from Data.Map.
Finally, you just want to know all keys that remain in the map. Use the function keys for this.
In the end, you get the following module:
import qualified Data.Map.Strict as M
findDuplicates :: (Ord a) => [a] -> [a]
findDuplicates
  = M.keys
  . M.filter (> 1)
  . M.fromListWith (+)
  . map (\x -> (x, 1 :: Integer))
It's common practice to import certain modules like Data.Map qualified, to avoid name conflicts between modules (e.g. filter from Data.Map and the one from Prelude are very different). In this situation, it's best to choose Data.Map.Strict; see the explanation at the top of Data.Map.
The complexity of this method should be O(n log n).
I thought it could be optimized by using a boolean flag to indicate that the value is a duplicate. However, this turned out to be about 20% slower.
You're basically looking for the list of elements that are not unique, or in other words, the difference between the original list and the list of unique elements. In code:
xs \\ (nub xs)
If you don't want to have duplicates in the result list, you'll want to call nub again:
nub $ xs \\ (nub xs)
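To see what the two expressions above produce on the list from the question, here is a small sketch. (\\) removes one occurrence per element of its second argument, so the surplus copies survive in their original order; the final sort is an addition of this example, needed only to match the [2,3,5] ordering asked for. The name repeatedSorted is hypothetical.

```haskell
import Data.List (nub, sort, (\\))

-- Elements that occur more than once, each reported once, in sorted order.
repeatedSorted :: Ord a => [a] -> [a]
repeatedSorted xs = sort (nub (xs \\ nub xs))
```

For [1,2,3,3,2,4,5,6,5], the inner xs \\ nub xs is [3,2,5], and sorting gives [2,3,5]. Both nub and (\\) are quadratic, so this is best kept to short lists.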

List processing in Haskell

I am teaching myself Haskell and have run into a problem and need help.
Background:
type AInfo = (Char, Int)
type AList = [AInfo] (let's say [('a', 2), ('b', 5), ('a', 1), ('w', 21)])
type BInfo = Char
type BList = [BInfo] (let's say ['a', 'a', 'c', 'g', 'a', 'w', 'b'])
One quick edit: The above information is for illustrative purposes only. The actual elements of the lists are a bit more complex. Also, the lists are not static; they are dynamic (hence the uses of the IO monad) and I need to keep/pass/"return"/have access to and change the lists during the running of the program.
I am looking to do the following:
For all elements of AList check against all elements of BList and where the character of the AList element (pair) is equal to the character in the Blist add one to the Int value of the AList element (pair) and remove the character from BList.
So what this means is after the first element of AList is checked against all elements of BList the values of the lists should be:
AList [('a', 5), ('b', 5), ('a', 1), ('w', 21)]
BList ['c', 'g', 'w', 'b']
And in the end, the lists values should be:
AList [('a', 5), ('b', 6), ('a', 1), ('w', 22)]
BList ['c', 'g']
Of course, all of this is happening in an IO monad.
Things I have tried:
Using mapM and a recursive helper function. I have looked at both:
Every element of AList checked against every element of bList -- mapM (myHelpF1 alist) blist and
Every element of BList checked against every element of AList – mapM (myHelpF2 alist) blist
Passing both lists to a function and using a complicated
if/then/else & helper function calls (feels like I am forcing
Haskell to be iterative; Messy convoluted code, Does not feel
right.)
I have thought about using filter, the character value of AList
element and Blist to create a third list of Bool and the count the
number of True values. Update the Int value. Then use filter on
BList to remove the BList elements that …… (again Does not feel
right, not very Haskell-like.)
Things I think I know about the problem:
The solution may be exceedingly trivial. So much so, the more experienced Haskellers will be muttering under their breath “what a noob” as they type their response.
Any pointers would be greatly appreciated. (mutter away….)
A few pointers:
Don't use [(Char, Int)] for "AList". The data structure you are looking for is a finite map: Map Char Int. Particularly look at member and insertWith. toList and fromList convert from the representation you currently have for AList, so even if you are stuck with that representation, you can convert to a Map for this algorithm and convert back at the end. (This will be more efficient than staying in a list because you are doing so many lookups, and the finite map API is easier to work with than lists)
I'd approach the problem as two phases: (1) partition out the elements of blist by whether they are in the map, (2) insertWith the elements which are already in the map. Then you can return the resulting map and the other partition.
I would also get rid of the meaningless assumptions such as that keys are Char -- you can just say they are any type k (for "key") that satisfies the necessary constraints (that you can put it in a Map, which requires that it is Orderable). You do this with lowercase type variables:
import qualified Data.Map as Map
sieveList :: (Ord k) => Map.Map k Int -> [k] -> (Map.Map k Int, [k])
Writing algorithms in greater generality helps catch bugs, because it makes sure that you don't use any assumptions you don't need.
Oh, also this program has no business being in the IO monad. This is pure code.
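One way the two phases might be written against that signature is sketched below. It is an interpretation of the pointers above, not the answerer's own code. Because a Map holds each key once, the duplicate ('a', 1) entry from the question cannot be represented in this form.

```haskell
import Data.List (foldl', partition)
import qualified Data.Map as Map

-- Phase 1: split the list by whether each element is a key of the map.
-- Phase 2: add one to the count for every element that was present.
sieveList :: (Ord k) => Map.Map k Int -> [k] -> (Map.Map k Int, [k])
sieveList counts bs = (foldl' bump counts present, absent)
  where
    (present, absent) = partition (`Map.member` counts) bs
    bump m k = Map.insertWith (+) k 1 m
```

With counts built from [('a', 2), ('b', 5), ('w', 21)] and the list "aacgawb", the returned map should hold a = 5, b = 6, w = 22, and the leftover list should be "cg".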
import Data.List
type AInfo = (Char, Int)
type AList = [AInfo]
type BInfo = Char
type BList = [BInfo]
process :: AList -> BList -> AList
process [] _ = []
process (a:as) b =
  if is_in a b
    then (fst a, snd a + 1) : process as (delete (fst a) b)
    else a : process as b
  where
    is_in f [] = False
    is_in f (s:ss) = if fst f == s then True else is_in f ss
*Main> process [('a',5),('b',5),('a',1),('b',21)] ['c','b','g','w','b']
[('a',5),('b',6),('a',1),('b',22)]
*Main> process [('a',5),('b',5),('a',1),('w',21)] ['c','g','w','b']
[('a',5),('b',6),('a',1),('w',22)]
Probably an important disclaimer: I'm rusty at Haskell to the point of ineptness, but as a relaxing midnight exercise I wrote this thing. It should do what you want, although it doesn't return a BList. With a bit of modification, you can get it to return an (AList,BList) tuple, but methinks you'd be better off using an imperative language if that kind of manipulation is required.
Alternately, there's an elegant solution and I'm too ignorant of Haskell to know it.
While I am by no means a Haskell expert, I have a partial attempt that returns the result of performing the operation once. Maybe you can find out how to map it over the rest to get your solution. addWhile is the clever part: since you only want to update the first occurrence of an element in lista, a second occurrence just gets 0 added to it. Code critiques are more than welcome.
import Data.List
type AInfo = (Char, Int)
type AList = [AInfo]
type BInfo = Char
type BList = [BInfo]
lista = ([('a', 2), ('b',5), ('a', 1), ('w', 21)] :: AList)
listb = ['a','a','c','g','a','w','b']
--step one, get the head, and its occurrences
items list = (eleA, eleB) where
  eleA = length $ filter (\x -> x == head list) list
  eleB = head list
getRidOfIt list ele = (dropWhile (\x -> x == ele) list) --drop like its hot
--add to lista
addWhile :: [(Char, Int)] -> Char -> Int -> [(Char,Int)]
addWhile [] _ _ = []
addWhile ((x,y):xs) letter times = if x == letter then (x,y+times) : addWhile xs letter times
                                   else (x,y) : addWhile xs letter 0
--first answer
firstAnswer = addWhile lista (snd $ items listb) (fst $ items listb)
--[('a',5),('b',5),('a',1),('w',21)]
The operation you describe is pure, as @luqui points out, so we just define it as a pure Haskell function. It can be used inside a monad (including IO) by means of fmap (or do).
import Data.List
combine alist blist = (reverse a, b4) where
First we sort and count the B list:
    b = map (\g->(head g,length g)) . group . sort $ blist
We need the import for group and sort to be available. Next, we roll along the alist and do our thing:
    (a,b2) = foldl g ([],b) alist
    g (acc,b) e@(x,c) = case pick x b of
        Nothing -> (e:acc,b)
        Just (n,b2) -> ((x,c+n):acc,b2)
    b3 = map fst b2
    b4 = [ c | c <- blist, elem c b3 ]
Now pick, as used, must be
pick x [] = Nothing
pick x ((y,n):t)
  | x==y = Just (n,t)
  | otherwise = case pick x t of
      Nothing -> Nothing
      Just (k,r) -> Just (k, (y,n):r)
Of course pick performs a linear search, so if performance (speed) becomes a problem, b should be changed to allow for binary search (tree etc, like Map). The calculation of b4 which is filter (`elem` b3) blist is another potential performance problem with its repeated checks for presence in b3. Again, checking for presence in trees is faster than in lists, in general.
Test run:
> combine [('a', 2), ('b',5), ('a', 1), ('w', 21)] "aacgawb"
([('a',5),('b',6),('a',1),('w',22)],"cg")
edit: you probably want it the other way around: rolling along the blist while updating the alist and producing (or not) the elements of blist in the result (b4 in my code). That way the algorithm operates in a more local manner on long input streams (assuming your blist is long, though you didn't say that). As written above, it has a space problem, because it consumes the input stream blist several times over. I'll keep it as is, as an illustration and food for thought.
So if you decide to go the 2nd route, first convert your alist into a Map (beware the duplicates!). Then, scanning (with scanl) over blist, make use of updateLookupWithKey to update the counts map and at the same time decide for each member of blist, one by one, whether to output it or not. The type of the accumulator will thus have to be (Maybe a, Map a Int), with a your element type (blist :: [a]):
scanl :: (acc -> a -> acc) -> acc -> [a] -> [acc]
scanning = tail $ scanl g (Nothing, fromList $ reverse alist) blist
g (_,cmap) a = case updateLookupWithKey (\_ c->Just(c+1)) a cmap of
    (Just _, m2) -> (Nothing, m2) -- seen before
    _ -> (Just a, cmap) -- not present in counts
new_b_list = [ a | (Just a,_) <- scanning ]
last_counts = snd $ last scanning
You will have to combine the toList last_counts with the original alist if you have to preserve the old duplicates there (why would you?).
