The following program has as its purpose the transitive closure of relation (as a set of ordered pairs - a graph) and a test about membership of an ordered pair to that relation.
I tried to make the program efficient through the use of Data.Set instead of lists and eliminating redundancies in the generation of the missing pair.
I would like to know:
how to use QuickCheck to verify its correctness;
how calculate the efficiency of the program, if it is possible, or
how does it compare with similar solutions of the problem (ex.
Transitive closure from a list ).
Any criticism and suggestion will be appreciated.
import Data.Set as S
import Data.Foldable as F (foldMap)
data TruthValue = F | U | T deriving (Show,Eq)
isMemberOfTransitiveGraph :: Ord t => (t, t) -> Set (t, t) -> TruthValue
(x,y) `isMemberOfTransitiveGraph` gr
| S.member (x,y) closure = T -- as suggested by user5402
| S.member (y,x) closure = F -- as suggested by user5402
| otherwise = U
where
closure = transitiveClusureOfGraph gr -- as suggested by user5402
transitiveClusureOfGraph :: Ord a => Set (a, a) -> Set (a, a)
transitiveClusureOfGraph gr = F.foldMap (transitiveClosureOfArgument gr) domain
where
domain = S.map fst gr
transitiveClosureOfArgument :: Ord a => Set (a, a) -> a -> Set (a, a)
transitiveClosureOfArgument gr x = S.map ((,) x) $ recursiveImages gr (S.singleton x)
recursiveImages :: Ord a => Set (a, a) -> Set a -> Set a
recursiveImages gr imgs = f gr imgs S.empty
where
f :: Ord a => Set (a, a) -> Set a -> Set a -> Set a
f gr imgs acc
| S.null imgs = acc
| otherwise = f gr (newImgs S.\\ acc) (S.union newImgs acc)
where
newImgs = F.foldMap (imaginsOf gr) imgs
imaginsOf :: (Ord b, Eq a) => Set (a, b) -> a -> Set b
imaginsOf gr arg = S.foldr (\(a,b) acc -> if a == arg then S.insert b acc else acc) S.empty gr
**
EXAMPLE 1
**
someLessThan = S.fromList [("1","2"),("1","4"),("3","4"),("2","8"),("3","5"),("4","7"),("4","8"),("3","9")]
> transitiveClusureOfGraph someLessThan
> fromList [("1","2"),("1","4"),("1","7"),("1","8"),("2","8"),("3","4"),("3","5"),("3","7"),("3","8"),("3","9"),("4","7"),("4","8")]
a `isLessThan` b = (a,b) `isMemberOfTransitiveGraph` someLessThan
> "1" `isLessThan` "8"
> T
> "8" `isLessThan` "1"
> F
> "1" `isLessThan` "9"
> U
> "9" `isLessThan` "1"
> U
**
EXAMPLE 2
**
someTallerThan = S.fromList [("Alexandre","Andrea"),("Andrea","John"),("George","Frank"),("George","Lucy"),("John","Liza"),("Julia","Lucy"),("Liza","Bob"),("Liza","Frank")]
> transitiveClusureOfGraph someTallerThan
> fromList [("Alexandre","Andrea"),("Alexandre","Bob"),("Alexandre","Frank"),("Alexandre","John"),("Alexandre","Liza"),("Andrea","Bob"),("Andrea","Frank"),("Andrea","John"),("Andrea","Liza"),("George","Frank"),("George","Lucy"),("John","Bob"),("John","Frank"),("John","Liza"),("Julia","Lucy"),("Liza","Bob"),("Liza","Frank")]
a `isTallerThan` b = (a,b) `isMemberOfTransitiveGraph` someTallerThan
> "Alexandre" `isTallerThan` "Frank"
> T
> "Frank" `isTallerThan` "Alexandre"
> F
> "Alexandre" `isTallerThan` "George"
> U
> "George" `isTallerThan` "Alexandre"
> U
**
EXAMPLE 3
**
incomeIsLessOrEqualThan = S.fromList [("Bob","Liza"),("Liza","Tom"),("Tom","Bob"),("Tom","Mary"), ("Tom","Tom")]
> S.filter (\(a,b) -> a /= b) $ transitiveClusureOfGraph incomeIsLessOrEqualThan
> fromList [("Bob","Liza"),("Bob","Mary"),("Bob","Tom"),("Liza","Bob"),("Liza","Mary"),("Liza","Tom"),("Tom","Bob"),("Tom","Liza"),("Tom","Mary")]
Some comments:
Some ideas for Quickcheck tests:
Create a random connected graph and verify that every pair of points is in the transitive closure.
Verify that for any random graph the transitive closure of the transitive closure is just the same as doing the transitive closure just once.
Verify that your code returns the same answer as another implementation (such as from the fgl library.)
However, when I look at the fgl library I see that they just use a fixed graph to test their path query functions. Then they know exactly what the answers should be for all the tests.
Another idea is to solve an ACM (programming competition) problem which involves finding the transitive closure of a graph, and use your code in that solution. Both Timus and codeforces accept Haskell programs.
In isMemberOfTransitiveGraph you have the common sub-expression transitiveClusureOfGraph gr. Now GHC could (and should) detect this and factor it out so that it doesn't get evaluated twice, but it doesn't always do that. Moreover, being an interpreter, ghci won't perform common sub-expression elimination. So, given that transitiveClusureOfGraph is an expensive operation, you should write this function like
this:
isMemberOfTransitiveGraph (x,y) gr
| S.member (x,y) closure = T
| S.member (y,x) closure = F
| otherwise = U
where
closure = transitiveClusureOfGraph gr in
Also, computing the transitive closure for the entire graph is
an expensive way to determine if a specific pair is in the closure.
A better way to implement isMemberOfTransitiveClosure is to simply
perform a depth-first search start at one member of the pair until you a) either find the other element or b) fill out the connected component without finding the other element. Otherwise you are performing a lot of work on the other connected components which is irrelevant to the question you are trying to answer.
If you are really concerned about efficiency, restrict your node type to Int and use Data.IntSet or even Data.BitSet for sets of nodes.
Related
I have random number generator
rand :: Int -> Int -> IO Int
rand low high = getStdRandom (randomR (low,high))
and a helper function to remove an element from a list
removeItem _ [] = []
removeItem x (y:ys) | x == y = removeItem x ys
| otherwise = y : removeItem x ys
I want to shuffle a given list by randomly picking an item from the list, removing it and adding it to the front of the list. I tried
shuffleList :: [a] -> IO [a]
shuffleList [] = []
shuffleList l = do
y <- rand 0 (length l)
return( y:(shuffleList (removeItem y l) ) )
But can't get it to work. I get
hw05.hs:25:33: error:
* Couldn't match expected type `[Int]' with actual type `IO [Int]'
* In the second argument of `(:)', namely
....
Any idea ?
Thanks!
Since shuffleList :: [a] -> IO [a], we have shuffleList (xs :: [a]) :: IO [a].
Obviously, we can't cons (:) :: a -> [a] -> [a] an a element onto an IO [a] value, but instead we want to cons it onto the list [a], the computation of which that IO [a] value describes:
do
y <- rand 0 (length l)
-- return ( y : (shuffleList (removeItem y l) ) )
shuffled <- shuffleList (removeItem y l)
return y : shuffled
In do notation, values to the right of <- have types M a, M b, etc., for some monad M (here, IO), and values to the left of <- have the corresponding types a, b, etc..
The x :: a in x <- mx gets bound to the pure value of type a produced / computed by the M-type computation which the value mx :: M a denotes, when that computation is actually performed, as a part of the combined computation represented by the whole do block, when that combined computation is performed as a whole.
And if e.g. the next line in that do block is y <- foo x, it means that a pure function foo :: a -> M b is applied to x and the result is calculated which is a value of type M b, denoting an M-type computation which then runs and produces / computes a pure value of type b to which the name y is then bound.
The essence of Monad is thus this slicing of the pure inside / between the (potentially) impure, it is these two timelines going on of the pure calculations and the potentially impure computations, with the pure world safely separated and isolated from the impurities of the real world. Or seen from the other side, the pure code being run by the real impure code interacting with the real world (in case M is IO). Which is what computer programs must do, after all.
Your removeItem is wrong. You should pick and remove items positionally, i.e. by index, not by value; and in any case not remove more than one item after having picked one item from the list.
The y in y <- rand 0 (length l) is indeed an index. Treat it as such. Rename it to i, too, as a simple mnemonic.
Generally, with Haskell it works better to maximize the amount of functional code at the expense of non-functional (IO or randomness-related) code.
In your situation, your “maximum” functional component is not removeItem but rather a version of shuffleList that takes the input list and (as mentioned by Will Ness) a deterministic integer position. List function splitAt :: Int -> [a] -> ([a], [a]) can come handy here. Like this:
funcShuffleList :: Int -> [a] -> [a]
funcShuffleList _ [] = []
funcShuffleList pos ls =
if (pos <=0) || (length(take (pos+1) ls) < (pos+1))
then ls -- pos is zero or out of bounds, so leave list unchanged
else let (left,right) = splitAt pos ls
in (head right) : (left ++ (tail right))
Testing:
λ>
λ> funcShuffleList 4 [0,1,2,3,4,5,6,7,8,9]
[4,0,1,2,3,5,6,7,8,9]
λ>
λ> funcShuffleList 5 "#ABCDEFGH"
"E#ABCDFGH"
λ>
Once you've got this, you can introduce randomness concerns in simpler fashion. And you do not need to involve IO explicitely, as any randomness-friendly monad will do:
shuffleList :: MonadRandom mr => [a] -> mr [a]
shuffleList [] = return []
shuffleList ls =
do
let maxPos = (length ls) - 1
pos <- getRandomR (0, maxPos)
return (funcShuffleList pos ls)
... IO being just one instance of MonadRandom.
You can run the code using the default IO-hosted random number generator:
main = do
let inpList = [0,1,2,3,4,5,6,7,8]::[Integer]
putStrLn $ "inpList = " ++ (show inpList)
-- mr automatically instantiated to IO:
outList1 <- shuffleList inpList
putStrLn $ "outList1 = " ++ (show outList1)
outList2 <- shuffleList outList1
putStrLn $ "outList2 = " ++ (show outList2)
Program output:
$ pickShuffle
inpList = [0,1,2,3,4,5,6,7,8]
outList1 = [6,0,1,2,3,4,5,7,8]
outList2 = [8,6,0,1,2,3,4,5,7]
$
$ pickShuffle
inpList = [0,1,2,3,4,5,6,7,8]
outList1 = [4,0,1,2,3,5,6,7,8]
outList2 = [2,4,0,1,3,5,6,7,8]
$
The output is not reproducible here, because the default generator is seeded by its launch time in nanoseconds.
If what you need is a full random permutation, you could have a look here and there - Knuth a.k.a. Fisher-Yates algorithm.
Apologies for my poor wording of the question. I've tried searching for an answer but not knowing what to search is making it very difficult to find one.
Here is a simple function which calculates the area of a triangle.
triangleArea :: Float -> Float -> Float -> Float
triangleArea a b c
| (a + b) <= c = error "Not a triangle!"
| (a + c) <= b = error "Not a triangle!"
| (b + c) <= a = error "Not a triangle!"
| otherwise = sqrt (s * (s - a) * (s - b) * (s - c))
where s = (a + b + c) / 2
Three lines of the function have been taken up for the purposes of error checking. I was wondering if these three lines could be condensed into one generic line.
I was wondering if something similar to the following would be possible
(arg1 + arg2) == arg3
where Haskell knows to check each possible combination of the three arguments.
I think #behzad.nouri's comment is the best. Sometimes doing a little math is the best way to program. Here's a somewhat overdone expansion on #melpomene's solution, which I thought would be fun to share. Let's write a function similar to permutations but that computes combinations:
import Control.Arrow (first, second)
-- choose n xs returns a list of tuples, the first component of each having
-- n elements and the second component having the rest, in all combinations
-- (ignoring order within the lists). N.B. this would be faster if implemented
-- using a DList.
choose :: Int -> [a] -> [([a],[a])]
choose 0 xs = [([], xs)]
choose _ [] = []
choose n (x:xs) =
map (first (x:)) (choose (n-1) xs) ++
map (second (x:)) (choose n xs)
So..
ghci> choose 2 [1,2,3]
[([1,2],[3]),([1,3],[2]),([2,3],[1])]
Now you can write
triangleArea a b c
| or [ x + y <= z | ([x,y], [z]) <- choose 2 [a,b,c] ] = error ...
This doesn't address the question of how to shorten your error checking code, but you may be able to limit how often you repeat it by defining some new types with invariants. This function needs error checking because you can't trust the user to supply Float triples that make a reasonable triangle, and if you continue to define functions this way then every triangle-related function you write would need similar error checks.
However, if you define a Triangle type, you can check your invariants only once, when a triangle is created, and then all other functions will be guaranteed to receive valid triangles:
module Triangle (Triangle(), mkTriangle, area) where
data Triangle a = Triangle a a a deriving Show
mkTriangle :: (Num a, Ord a) => a -> a -> a -> Either String (Triangle a)
mkTriangle a b c
| a + b <= c = wrong
| a + c <= b = wrong
| b + c <= a = wrong
| otherwise = Right $ Triangle a b c
where wrong = Left "Not a triangle!"
area :: Floating a => Triangle a -> a
area (Triangle a b c) = sqrt (s * (s - a) * (s - b) * (s - c))
where s = (a + b + c) / 2
Here we export the Triangle type, but not its constructor, so that the client must use mkTriangle instead, which can do the required error checking. Then area, and any other triangle functions you write, can omit the checks that they are receiving a valid triangle. This general pattern is called "smart constructors".
Here are two ideas.
Using existing tools, you can generate all the permutations of the arguments and check that they all satisfy a condition. Thus:
import Data.List
triangleArea a b c
| any (\[x, y, z] -> x + y <= z) (permutations [a,b,c])
= error "Not a triangle!"
| otherwise = {- ... -}
This doesn't require writing very much additional code; however, it will search some permutations you don't care about.
Use the usual trick for choosing an element from a list and the left-overs. The zippers function is one I use frequently:
zippers :: [a] -> [([a], a, [a])]
zippers = go [] where
go b [] = []
go b (v:e) = (b, v, e) : go (v:b) e
We can use it to build a function which chooses only appropriate triples of elements:
triples :: [a] -> [(a, a, a)]
triples xs = do
(b1, v1, e1) <- zippers xs
(b2, v2, e2) <- zippers e1
v3 <- b1 ++ b2 ++ e2
return (v1, v2, v3)
Now we can write our guard like in part (1), but it will only consider unique pairings for the addition.
triangleArea a b c
| any (\(x, y, z) -> x + y <= z) (triples [a,b,c])
= error "Not a triangle!"
| otherwise = {- ... -}
Is there an efficient way to insert a value into a Data.Set while at the same time checking if that value was already a member of the set?
If there isn't, is there any particular reason such a function would be impossible to write with the current implementation of sets in the containers library?
You can do this with O(log n) complexity by taking advantage of the fact that size is O(1), and just compare before and after:
-- | Inserts a value into the Set. If the value was not already in the set,
-- | then True is returned, otherwise False
insertIfMissing :: Ord a => a -> Set a -> (Set a, Bool)
insertIfMissing a s = (newSet, missing)
where
newSet = Set.insert a s
oldSize = Set.size s
newSize = Set.size newSet
missing = oldSize < newSize
And if you aren't interested in whether it was already present, then this shouldn't compute the missing part thanks to laziness.
It is actually possible to write such a function by slightly changing the Set.insert function. I decided to return a Maybe (Set a), so the function only creates a new Set if the element did not alredy exist. It would be equally well possible to write a function with (Bool, Set a) as return type.
insertMember :: Ord a => a -> Set a -> Maybe (Set a)
insertMember = go
where
go :: Ord a => a -> Set a -> Maybe (Set a)
STRICT_1_OF_2(go)
go x Tip = Just $ singleton x
go x (Bin sz y l r) = case compare x y of
LT -> do
l' <- go x l
return $ balanceL y l' r
GT -> do
r' <- go x r
return $ balanceR y l
EQ -> Nothing
I'm trying to implement with Haskell an algorithm to manipulate mathematical expressions.
I have this data type :
data Exp = Var String | IVal Int | Add Exp Exp
This will be enough for my question.
Given a set of expression transformations, for example :
(Add a b) => (Add b a)
(Add (Add a b) c) => (Add a (Add b c))
And an expression, for example : x = (Add (Add x y) (Add z t)), I want to find all expressions in the neighborhood of x. Given that neighborhood of x is defined as: y in Neighborhood(x) if y can be reached from x within a single transformation.
I am new to Haskell. I am not even sure Haskell is the right tool for this job.
The final goal is to get a function : equivalent x which returns a set of all expressions that are equivalent to x. In other words, the set of all expressions that are in the closure of the neighborhood of x (given a set of transformations).
Right now, I have the following :
import Data.List(nub)
import Data.Set
data Exp = IVal Int
| Scalar String
| Add Exp Exp
deriving (Show, Eq, Ord)
commu (Add a b) = (Add b a)
commu x = x
assoc (Add (Add a b) c) = (Add a (Add b c))
assoc (Add a (Add b c)) = (Add (Add a b) c)
assoc x = x
neighbors x = [commu x, assoc x]
equiv :: [Exp] -> [Exp]
equiv closure
| closure == closureUntilNow = closure
| otherwise = equiv closureUntilNow
where closureUntilNow = nub $ closure ++ concat [neighbors x|x<-closure]
But It's probably slower than needed (nub is O(n^2)) and some terms are missing.
For example, if you have f = (x+y)+z, then, you will not get (x+z)+y, and some others.
Imports, etc. below. I'll be using the multiset package.
import Control.Monad
import Data.MultiSet as M
data Exp = Var String | IVal Int | Add Exp Exp deriving (Eq, Ord, Show, Read)
A bit of paper-and-pencil work shows the following fact: expressions e1 and e2 are in the congruence closure of your relation iff the multiset of leaves are equal. By leaves, I mean the Var and IVal values, e.g. the output of the following function:
leaves :: Exp -> MultiSet Exp
leaves (Add a b) = leaves a `union` leaves b
leaves e = singleton e
So this suggests a nice clean way to generate all the elements in a particular value's neighborhood (without attempting to generate any duplicates in the first place). First, generate the multiset of leaves; then nondeterministically choose a partition of the multiset and recurse. The code to generate partitions might look like this:
partitions :: Ord k => MultiSet k -> [(MultiSet k, MultiSet k)]
partitions = go . toOccurList where
go [] = [(empty, empty)]
go ((k, n):bag) = do
n' <- [0..n]
(left, right) <- go bag
return (insertMany k n' left, insertMany k (n-n') right)
Actually, we only want partitions where both the left and right part are non-empty. But we'll check that after we've generated them all; it's cheap, as there's only two that aren't like that per invocation of partitions. So now we can generate the whole neighborhood in one fell swoop:
neighborhood :: Exp -> [Exp]
neighborhood = go . leaves where
full = guard . not . M.null
go m
| size m == 1 = toList m
| otherwise = do
(leftBag, rightBag) <- partitions m
full leftBag
full rightBag
left <- go leftBag
right <- go rightBag
return (Add left right)
By the way, the reason you're not getting all the terms is because you're generating the reflexive, transitive closure but not the congruence closure: you need to apply your rewrite rules deep in the term, not just at the top level.
I have found myself in a dire need of your insights.
Here's my object of interest:
class Mergable m where
merge :: m -> m -> Maybe m
mergeList :: [m] -> [m]
mergeList [] = []
mergeList [x] = [x]
mergeList (x:y:t) = r1 ++ mergeList (r2 ++ t)
where
(r1,r2) = case (x `merge` y) of
Just m -> ([ ], [m])
Nothing -> ([x], [y])
But I'll come back to it later. For now I prepared some examples:
data AffineTransform = Identity
| Translation Float Float
| Rotation Float
| Scaling Float Float
| Affine Matrix3x3
instance Monoid AffineTransform where
mempty = Identity
Identity `mappend` x = x
x `mappend` Identity = x
(Translation dx1 dy1) `mappend` (Translation dx2 dy2) = Translation (dx1+dx2) (dy1+dy2)
(Rotation theta1) `mappend` (Rotation theta2) = Rotation (theta1+theta2)
(Scaling sx1 sy1) `mappend` (Scaling sx2 sy2) = Scaling (sx1*sx2) (sy1*sy2)
-- last resort: compose transforms from different subgroups
-- using an "expensive" matrix multiplication
x `mappend` y = Affine (toMatrix x `mult3x3` toMatrix y)
So now I can do:
toMatrix $ Rotation theta1 `mappend` Translation dx1 dy1 `mappend` Translation dx2 dy2 `mappend` Rotation theta2
or more briefly:
(toMatrix . mconcat) [Rotation theta1, Translation dx1 dy1, Translation dx2 dy2, Rotation theta2]
or more generally:
(toMatrix . (fold[r|r'|l|l'] mappend)) [Rotatio...], etc
In the above examples the first rotation and translation will be combined (expensively) to a matrix; then, that matrix combined with translation (also using multiplication) and then once again a multiplication will be used to produce the final result, even though (due to associativity) two translations in the middle could be combined cheaply for a total of two multiplications instead of three.
Anyhow, along comes my Mergable class to the rescue:
instance Mergable AffineTransform where
x `merge` Identity = Just x
Identity `merge` x = Just x
x#(Translation _ _) `merge` y#(Translation _ _) = Just $ x `mappend` y
x#(Rotation _) `merge` y#(Rotation _) = Just $ x `mappend` y
x#(Scaling _ _) `merge` y#(Scaling _ _) = Just $ x `mappend` y
_ `merge` _ = Nothing
so now (toMatrix . mconcat . mergeList) ~ (toMatrix . mconcat), as it should:
mergeList [Rotation theta1, Translation dx1 dy1, Translation dx2 dy2, Rotation theta2] == [Rotation theta1, Translation (dx1+dx2) (dy1+dy2), Rotation theta2]
Other examples I have in mind are more involved (code-wise) so I will just state the ideas.
Let's say I have some
data Message = ...
and a
dispatch :: [Message] -> IO a
where dispatch takes a message from the list, depending on it's type opens an appropriate channel (file, stream, etc), writes that message, closes the channel and continues with next message. So if opening and closing channels is an "expensive" operation, simply composing (dispatch . mergeList) can help improve performance with minimal effort.
Other times i have used it to handle events in gui applications like merging mousemoves, key presses, commands in an undo-redo system, etc.
The general pattern is that i take two items from the list, check if they are "mergeable" in some way and if so try to merge the result with the next item in the list or otherwise I leave the first item as it were and continue with the next pair (now that i think of it's a bit like generalized run length encoding)
My problem is that I can't shake the feeling that I'm reinventing the wheel and there has to be a similar structure in haskell that i could use. If that's not the case then:
1) How do I generalize it to other containers other than lists?
2) Can you spot any other structures Mergable is an instance of? (particularly Arrows if applicable, i have trouble wrapping my head around them)
3) Any insights on how strict/lazy should mergeList be and how to present it to user?
4) Optimization tips? Stackoverflow? Anything else?
Thanks!
I don't think there is anything like this already in a library. Hoogle and Hayoo don't turn up anything suitable.
Mergeable (I think it's spelt that way) looks like a generalisation of Monoid. Not an Arrow, sorry.
Sometimes you need to merge preserving order. Sometimes you don't need to preserve order when you merge.
I might do something like
newtype MergedInOrder a = MergedInOrder [a] -- without exporting the constructor
mergeInOrder :: Mergeable a => [a] -> MergedInOrder a
mergeInOrder = MergedInOrder . foldr f []
where f x [] = [x]
f x xs # (y : ys) = case merge x y of
Just z -> z : ys
Nothing -> x : xs
and similar newtypes for unordered lists, that take advantage of and do not require an Ord instance, respectively.
These newtypes have obvious Monoid instances.
I don't think we can write code to merge arbitrary containers of Mergeables, I think it would have to be done explicitly for each container.
Here was my first thought. Notice "deriving Ord". Otherwise this first section is almost exactly the same as some of the code you presented:
import Data.Monoid
import Data.List
data AffineTransform = Identity
| Translation Float Float
| Rotation Float
| Scaling Float Float
| Affine Matrix3x3
deriving (Eq, Show, Ord)
-- some dummy definitions to satisfy the typechecker
data Matrix3x3 = Matrix3x3
deriving (Eq, Show, Ord)
toMatrix :: AffineTransform -> Matrix3x3
toMatrix _ = Matrix3x3
mult3x3 :: Matrix3x3 -> Matrix3x3 -> Matrix3x3
mult3x3 _ _ = Matrix3x3
instance Monoid AffineTransform where
mempty = Identity
Identity `mappend` x = x
x `mappend` Identity = x
(Translation dx1 dy1) `mappend` (Translation dx2 dy2) =
Translation (dx1+dx2) (dy1+dy2)
(Rotation theta1) `mappend` (Rotation theta2) = Rotation (theta1+theta2)
(Scaling sx1 sy1) `mappend` (Scaling sx2 sy2) = Scaling (sx1*sx2) (sy1*sy2)
-- last resort: compose transforms from different subgroups
-- using an "expensive" matrix multiplication
x `mappend` y = Affine (toMatrix x `mult3x3` toMatrix y)
And now, the kicker:
mergeList :: [AffineTransform] -> [AffineTransform]
mergeList = map mconcat . groupBy sameConstructor . sort
where sameConstructor Identity Identity = True
sameConstructor (Translation _ _) (Translation _ _) = True
sameConstructor (Rotation _) (Rotation _) = True
sameConstructor (Scaling _ _) (Scaling _ _) = True
sameConstructor (Affine _) (Affine _) = True
sameConstructor _ _ = False
Assuming that translations, rotations, and scalings are orthagonal, why not reorder the list and group up all of those same operations together? (Is that a bad assumption?) That is the Haskell pattern that I saw: the good ol' group . sort trick. If you really want, you could pull sameConstructor out of mergeList:
mergeList :: (Monoid a, Ord a) => (a -> a -> Bool) -> [a] -> [a]
mergeList f = map mconcat . groupBy f . sort
P.S. if that was a bad assumption, then you could still do something like
mergeList = map mconcat . groupBy canMerge
But it seems to me that there is unusual overlap between merge and mappend the way you defined them.