Finding all palindromic word pairs - haskell

I came up with a toy problem: finding all palindromic word pairs in a vocabulary. I wrote the solution below:
import Data.List
findPairs :: Ord a => [[a]] -> [[[a]]]
findPairs ss =
  filter ((== 2) . length)
    . groupBy ((==) . reverse)
    . sortBy (compare . reverse)
    $ ss
main = do
  print . findPairs . permutations $ ['a'..'c']
  -- malfunctioning: only got partial results [["abc","cba"]]
  -- expected: [["abc","cba"],["bac","cab"],["bca","acb"]]
Could you help me correct it, if it is worth trying?
#Solution
Having benefited from the comments of @David Young and @chi, the tuned working code is below:
import Data.List (delete)
import Data.Set hiding (delete, map)
findPairs :: Ord a => [[a]] -> [([a], [a])]
findPairs ss =
  let
    f [] = []
    f (x : xs) =
      let y = reverse x
      in
        if x /= y
          then
            let ss' = delete y xs
            in (x, y) : f ss'
          else f xs
  in
    f . toList
      . intersection (fromList ss)
      $ fromList (map reverse ss)

import Data.List
import Data.Ord
-- find classes of equivalence by comparing canonical forms (CF)
findEquivalentSets :: Ord b => (a->b) -> [a] -> [[a]]
findEquivalentSets toCanonical =
  filter ((>=2) . length)                               -- has more than one
                                                        -- with the same CF?
  . groupBy ((((== EQ) .) .) (comparing toCanonical))   -- group by CF
  . sortBy (comparing toCanonical)                      -- compare CFs
findPalindromes :: Ord a => [[a]] -> [[[a]]]
findPalindromes = findEquivalentSets (\x -> min x (reverse x))
This function lets us find many kinds of equivalence as long as we can assign some effectively computable canonical form (CF) to our elements.
When looking for palindromic pairs, two strings are equivalent if one is a reverse of the other. The CF is the lexicographically smaller string.
findAnagrams :: Ord a => [[a]] -> [[[a]]]
findAnagrams = findEquivalentSets sort
In this example, two strings are equivalent if one is an anagram of the other. The CF is the sorted string (banana → aaabnn).
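As a further illustration of the same idea, here is a minimal sketch with a third canonical form: lower-casing, so that words differing only in letter case fall into one class. The names `equivalentSets` and `sameIgnoringCase` are made up for this example, and the helper simply restates findEquivalentSets so the block stands alone.

```haskell
import Data.Char (toLower)
import Data.Function (on)
import Data.List (groupBy, sortBy)
import Data.Ord (comparing)

-- Restatement of findEquivalentSets, so that this block is self-contained.
equivalentSets :: Ord b => (a -> b) -> [a] -> [[a]]
equivalentSets toCanonical =
  filter ((>= 2) . length)                 -- keep classes with 2+ members
    . groupBy ((==) `on` toCanonical)      -- group by canonical form
    . sortBy (comparing toCanonical)       -- bring equal forms together

-- Hypothetical example: the canonical form is the lower-cased word.
sameIgnoringCase :: [String] -> [[String]]
sameIgnoringCase = equivalentSets (map toLower)
```

Because sortBy is stable, members of each class keep their original relative order.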
Likewise we can find SOUNDEX equivalents and whatnot.
This is not terribly efficient as one needs to compute the CF on each comparison. We can cache it, at the expense of readability.
findEquivalentSets :: Ord b => (a->b) -> [a] -> [[a]]
findEquivalentSets toCanonical =
  map (map fst)                                 -- strip CF
  . filter ((>=2) . length)                     -- has more than one
                                                -- with the same CF?
  . groupBy ((((== EQ) .) .) (comparing snd))   -- group by CF
  . sortBy (comparing snd)                      -- compare CFs
  . map (\x -> (x, toCanonical x))              -- pair the element with its CF

Here's an approach you might want to consider.
Using sort implies that there's some keying function word2key that yields the same value for both words of a palindromic pair. The first one that comes to mind for me is
word2key w = min w (reverse w)
So, map the keying function over the list of words, sort, group by equality, take groups of length 2, and then recover the two words from the key (using the fact that the key is either equal to the word or its reverse).
Writing that, with a couple of local definitions for clarity, gives:
findPals :: (Ord a, Eq a) => [[a]] -> [[[a]]]
findPals = map (key2words . head) .
           filter ((== 2) . length) .
           groupBy (==) .
           sort .
           (map word2key)
  where word2key w = min w (reverse w)
        key2words k = [k, reverse k]
Edit:
I posted my answer in a stale window without refreshing, so missed the very nice response from n.m. above.
Mea culpa.
So I'll atone by mentioning that both answers are variations on the well-known (in Perl circles) "Schwartzian transform", which itself applies a common mathematical pattern -- h = f' . g . f -- translate a task to an alternate representation in which the task is easier, do the work, then translate back to the original representation.
The Schwartzian transform tuples up a value with its corresponding key, sorts by the key, then pulls the original value back out of the key/value tuple.
The little hack I included above was based on the fact that key2words is the non-deterministic inverse relation of word2key. It is only valid when two words have the same key, but that's exactly the case in the question, and is ensured by the filter.
overAndBack :: (Ord b, Eq c) => (a -> b) -> ([b] -> [c]) -> (c -> d) -> [a] -> [d]
overAndBack f g f' = map f' . g . sort . map f
findPalPairs :: (Ord a, Eq a) => [[a]] -> [[[a]]]
findPalPairs = overAndBack over just2 back
  where over w = min w (reverse w)
        just2 = filter ((== 2) . length) . groupBy (==)
        back = (\k -> [k, reverse k]) . head
Which demos as
*Main> findPalPairs $ words "I saw no cat was on a chair"
[["no","on"],["saw","was"]]
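For reference, the decorate-sort-undecorate pattern described above can be sketched on its own, separately from the palindrome problem. The name `sortByKey` is only illustrative; Data.List.sortOn in base provides the same behaviour.

```haskell
import Data.List (sortBy)
import Data.Ord (comparing)

-- Decorate each value with its key, sort on the key alone,
-- then undecorate by dropping the key again.
-- The key function runs once per element rather than once per comparison.
sortByKey :: Ord k => (a -> k) -> [a] -> [a]
sortByKey key =
  map snd                        -- undecorate
    . sortBy (comparing fst)     -- sort by the cached key
    . map (\x -> (key x, x))     -- decorate
```

For example, `sortByKey length ["chair", "no", "saw", "a"]` orders the words by length.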
Thanks for the nice question.

Related

Given a list, how can I perform some transformation only on sub-lists whose each two elements satisfy a binary predicate?

(In my actual use case I have a list of type [SomeType], SomeType having a finite number of constructors, all nullary; in the following I'll use String instead of [SomeType] and use only 4 Chars, to simplify a bit.)
I have a list like this "aaassddddfaaaffddsssadddssdffsdf" where each element can be one of 'a', 's', 'd', 'f', and I want to do some further processing on each contiguous sequence of non-as, let's say turning them upper case and reversing the sequence, thus obtaining "aaaFDDDDSSaaaSSSDDFFaFDSFFDSSDDD". (I've added the reversing requirement to make it clear that the processing involves all the contiguous non 'a'-s at the same time.)
To turn a sub-String upper case and reverse it, I can use this:
func :: String -> String
func = reverse . map Data.Char.toUpper
But how do I run that func only on the sub-Strings of non-'a's?
My first thought is that Data.List.groupBy can be useful, and the overall solution could be:
concat $ map (\x -> if head x == 'a' then x else func x)
       $ Data.List.groupBy ((==) `on` (== 'a')) "aaassddddfaaaffddsssadddssdffsdf"
This solution, however, does not convince me, as I'm using == 'a' both when grouping (which to me seems good and unavoidable) and when deciding whether I should turn a group upper case.
I'm looking for advice on how I can accomplish this small task in the best way.
You could classify the list elements by the predicate before grouping. Note that I’ve reversed the sense of the predicate to indicate which elements are subject to the transformation, rather than which elements are preserved.
{-# LANGUAGE ScopedTypeVariables #-}
import Control.Arrow ((&&&))
import Data.Function (on)
import Data.List (groupBy)
import Data.Monoid (First(..))
mapSegmentsWhere
  :: forall a. (a -> Bool) -> ([a] -> [a]) -> [a] -> [a]
mapSegmentsWhere p f
  = concatMap (applyMatching . sequenceA)  -- [a]
  . groupBy ((==) `on` fst)                -- [[(First Bool, a)]]
  . map (First . Just . p &&& id)          -- [(First Bool, a)]
  where
    applyMatching :: (First Bool, [a]) -> [a]
    applyMatching (First (Just matching), xs)
      = applyIf matching f xs
applyIf :: forall a. Bool -> (a -> a) -> a -> a
applyIf condition f
  | condition = f
  | otherwise = id
Example use:
> mapSegmentsWhere (/= 'a') (reverse . map toUpper) "aaassddddfaaaffddsssadddssdffsdf"
"aaaFDDDDSSaaaSSSDDFFaFDSFFDSSDDD"
Here I use the First monoid with sequenceA to merge the lists of adjacent matching elements from [(Bool, a)] to (Bool, [a]), but you could just as well use something like map (fst . head &&& map snd). You can also skip the ScopedTypeVariables if you don’t want to write the type signatures; I just included them for clarity.
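The simpler variant mentioned above, using map (fst . head &&& map snd) instead of the First monoid, might look like the following sketch. The name `mapSegmentsWhere'` is chosen here only to distinguish it from the version above.

```haskell
import Control.Arrow ((&&&))
import Data.Char (toUpper)
import Data.Function (on)
import Data.List (groupBy)

-- Tag each element with the predicate result, group adjacent equal tags,
-- collapse each group to (tag, elements), then transform matching groups.
mapSegmentsWhere' :: (a -> Bool) -> ([a] -> [a]) -> [a] -> [a]
mapSegmentsWhere' p f =
  concatMap apply
    . map (fst . head &&& map snd)   -- head is safe: groupBy yields no empty groups
    . groupBy ((==) `on` fst)
    . map (p &&& id)
  where
    apply (matching, xs) = if matching then f xs else xs

-- The example from the question, using the sketch above.
example :: String
example = mapSegmentsWhere' (/= 'a') (reverse . map toUpper)
            "aaassddddfaaaffddsssadddssdffsdf"
```

This drops the need for First and sequenceA at the cost of a partial head, which is safe only because groupBy never produces an empty group.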
If we need to remember the difference between the 'a's and the rest, let's put them in different branches of an Either. In fact, let's define a newtype now that we are at it:
{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE ViewPatterns #-}
import Data.Bifoldable
import Data.Bifunctor
import Data.Char
import Data.List
newtype Bunched a b = Bunched [Either a b] deriving (Functor, Foldable)
instance Bifunctor Bunched where
  bimap f g (Bunched b) = Bunched (fmap (bimap f g) b)
instance Bifoldable Bunched where
  bifoldMap f g (Bunched b) = mconcat (fmap (bifoldMap f g) b)
fmap will let us work over the non-separators. fold will return the concatenation of the non-separators, bifold will return the concatenation of everything. Of course, we could have defined separate functions unrelated to Foldable and Bifoldable, but why avoid already existing abstractions?
To split the list, we can use an unfoldr that alternately searches for 'a's and non-'a's with the span function:
splitty :: Char -> String -> Bunched String String
splitty c str = Bunched $ unfoldr step (True, str)
  where
    step (_, []) = Nothing
    step (True, span (== c) -> (as, ys)) = Just (Left as, (False, ys))
    step (False, span (/= c) -> (xs, ys)) = Just (Right xs, (True, ys))
Putting it to work:
ghci> bifold . fmap func . splitty 'a' $ "aaassddddfaaaffddsssadddssdffsdf"
"aaaFDDDDSSaaaSSSDDFFaFDSFFDSSDDD"
Note: Bunched is actually the same as Tannen [] Either from the bifunctors package, if you don't mind the extra dependency.
There are other answers here, but I think they get too excited about iteration abstractions. A manual recursion, alternately taking things that match the predicate and things that don't, makes this problem exquisitely simple:
onRuns :: Monoid m => (a -> Bool) -> ([a] -> m) -> ([a] -> m) -> [a] -> m
onRuns p = go p (not . p) where
  go _ _ _ _ [] = mempty
  go p p' f f' xs = case span p xs of
    (ts, rest) -> f ts `mappend` go p' p f' f rest
Try it out in ghci:
Data.Char> onRuns ('a'==) id (reverse . map toUpper) "aaassddddfaaaffddsssadddssdffsdf"
"aaaFDDDDSSaaaSSSDDFFaFDSFFDSSDDD"
Here is a simple solution - function process below - that only requires that you define two functions isSpecial and func. Given a constructor from your type SomeType, isSpecial determines whether it is one of those constructors that form a special sublist or not. The function func is the one you included in your question; it defines what should happen with the special sublists.
The code below is for character lists. Just change isSpecial and func to make it work for your lists of constructors.
isSpecial c = c /= 'a'
func = reverse . map toUpper
turn = map (\x -> ([x], isSpecial x))
amalgamate [] = []
amalgamate [x] = [x]
amalgamate ((xs, xflag) : (ys, yflag) : rest)
  | xflag /= yflag = (xs, xflag) : amalgamate ((ys, yflag) : rest)
  | otherwise = amalgamate ((xs++ys, xflag) : rest)
work = map (\(xs, flag) -> if flag then func xs else xs)
process = concat . work . amalgamate . turn
Let's try it on your example:
*Main> process "aaassddddfaaaffddsssadddssdffsdf"
"aaaFDDDDSSaaaSSSDDFFaFDSFFDSSDDD"
*Main>
Applying one function at a time shows the intermediate steps taken:
*Main> turn "aaassddddfaaaffddsssadddssdffsdf"
[("a",False),("a",False),("a",False),("s",True),("s",True),("d",True),
("d",True),("d",True),("d",True),("f",True),("a",False),("a",False),
("a",False),("f",True),("f",True),("d",True),("d",True),("s",True),
("s",True),("s",True),("a",False),("d",True),("d",True),("d",True),
("s",True),("s",True),("d",True),("f",True),("f",True),("s",True),
("d",True),("f",True)]
*Main> amalgamate it
[("aaa",False),("ssddddf",True),("aaa",False),("ffddsss",True),
("a",False),("dddssdffsdf",True)]
*Main> work it
["aaa","FDDDDSS","aaa","SSSDDFF","a","FDSFFDSSDDD"]
*Main> concat it
"aaaFDDDDSSaaaSSSDDFFaFDSFFDSSDDD"
*Main>
We can just do what you describe, step by step, getting a clear simple minimal code which we can easily read and understand later on:
foo :: (a -> Bool) -> ([a] -> [a]) -> [a] -> [a]
foo p f xs = [ a
             | g <- groupBy ((==) `on` fst)
                      [(p x, x) | x <- xs]      -- [ (True, 'a'), ... ]
             , let (t:_, as) = unzip g          -- ( [True, ...], "aaa" )
             , a <- if t then as else (f as) ]  -- final concat
-- unzip :: [(b, a)] -> ([b], [a])
We break the list into same-p spans and unpack each group with the help of unzip. Trying it out:
> foo (=='a') reverse "aaabcdeaa"
"aaaedcbaa"
So no, using == 'a' is avoidable and hence not especially good, introducing an unnecessary constraint on your data type when all we need is equality on Booleans.

Removing duplicate elements in a Seq

I am wondering how to implement nub over a Seq a.
I get that one could do:
nubSeq :: Eq a => Seq a -> Seq a
nubSeq = fromList . nub . toList
Is there something standard that does not convert to lists in order to call nub :: Eq a => [a] -> [a]?
An implementation that occurred to me, based obviously on nub, is:
nubSeq :: (Eq a) => Seq a -> Seq a
-- a left fold keeps first occurrences in their original order, as nub does
nubSeq = Data.Sequence.foldlWithIndex
  (\a _ x -> case x `Data.Sequence.elemIndexL` a of
               Just _ -> a
               Nothing -> a |> x) Data.Sequence.empty
But there must be something more elegant?
thanks.
Not sure whether this qualifies as more elegant but it splits the concerns in independent functions (caveat: you need an Ord constraint on a):
seqToNubMap takes a Seq and outputs a Map associating to each a the smallest index at which it appeared in the sequence
mapToList takes a Map of values and positions and produces a list of values in increasing order according to the specified positions
nubSeq combines these to generate a sequence without duplicates
The whole thing should be O(n*log(n)), I believe:
module NubSeq where
import Data.Map as Map
import Data.List as List
import Data.Sequence as Seq
import Data.Function
seqToNubMap :: Ord a => Seq a -> Map a Int
seqToNubMap = foldlWithIndex (\ m k v -> insertWith min v k m) Map.empty
mapToList :: Ord a => Map a Int -> [a]
mapToList = fmap fst . List.sortBy (compare `on` snd) . Map.toList
nubSeq :: Ord a => Seq a -> Seq a
nubSeq = Seq.fromList . mapToList . seqToNubMap
Or a simpler alternative following @DavidFletcher's comment:
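-- needs {-# LANGUAGE ScopedTypeVariables #-}, plus:
-- import qualified Data.Foldable as Fold
-- import Data.Set (Set)
-- import qualified Data.Set as Set
nubSeq' :: forall a. Ord a => Seq a -> Seq a
nubSeq' xs = Fold.foldr cons nil xs Set.empty where
  cons :: a -> (Set a -> Seq a) -> (Set a -> Seq a)
  cons x xs seen
    | x `Set.member` seen = xs seen  -- Set.member is O(log n); Foldable's elem is O(n)
    | otherwise = x <| xs (Set.insert x seen)
  nil :: Set a -> Seq a
  nil _ = Seq.empty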
Another way with an Ord constraint - use a scan to make the sets of
elements that appear in each prefix of the list. Then we can filter out
any element that's already been seen.
import Data.Sequence as Seq
import Data.Set as Set
nubSeq :: Ord a => Seq a -> Seq a
-- Set.notMember is O(log n); Foldable's notElem would scan the whole set
nubSeq xs = (fmap fst . Seq.filter (uncurry Set.notMember)) (Seq.zip xs seens)
  where
    seens = Seq.scanl (flip Set.insert) Set.empty xs
Or roughly the same thing as a mapAccumL:
-- also needs: import Data.Traversable (mapAccumL)
nubSeq' :: Ord a => Seq a -> Seq a
nubSeq' = fmap fst . Seq.filter snd . snd . mapAccumL f Set.empty
  where
    f s x = (Set.insert x s, (x, x `Set.notMember` s))
(If I was using lists I would use Maybes instead of the pairs with
Bool, then use catMaybes instead of filtering. There doesn't seem to be catMaybes
for Sequence though.)
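The list version alluded to here, with Maybe and catMaybes in place of the Bool pairs, could be sketched as follows. The name `nubOrdList` is made up for this example.

```haskell
import Data.List (mapAccumL)
import Data.Maybe (catMaybes)
import qualified Data.Set as Set

-- Thread the set of already-seen elements through the list,
-- emitting Just x for a first occurrence and Nothing for a repeat.
nubOrdList :: Ord a => [a] -> [a]
nubOrdList = catMaybes . snd . mapAccumL step Set.empty
  where
    step seen x
      | x `Set.member` seen = (seen, Nothing)
      | otherwise           = (Set.insert x seen, Just x)
```

Like nub, this keeps the first occurrence of each element in its original position.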
I think your code should be pretty efficient. Since Sequences are tree data structures, using another tree-like data structure such as Map or HashMap to store and look up the previous items doesn't make much sense to me.
Instead, I take the first item and check its existence in the rest. If it exists, I drop that item and proceed the same way with the rest, recursively. If not, I construct a new sequence whose first element is the unique element and whose rest is the result of nubSeq applied to the rest. This should be typical. I use ViewPatterns.
{-# LANGUAGE ViewPatterns #-}
import Data.Sequence as Seq
nubSeq :: Eq a => Seq a -> Seq a
nubSeq (viewl -> EmptyL) = empty
nubSeq (viewl -> (x :< xs)) | elemIndexL x xs == Nothing = x <| nubSeq xs
                            | otherwise = nubSeq xs
*Main> nubSeq . fromList $ [1,2,3,4,4,2,3,6,7,1,2,3,4]
fromList [6,7,1,2,3,4]

Filter Duplicate elements from a [[String]] Haskell

I have a list in the form [["A1","A1","A1"] .. ["G3","G3","G3"]] which contains many duplicate elements like ["A1","A2","A3"] and ["A3","A2","A1"].
How do I filter out such duplicate elements?
If I check the above two elements for equality, the result is False:
*Main> ["A1","A2","A3"] == ["A3","A2","A1"]
False
nubBy :: (a -> a -> Bool) -> [a] -> [a] is a relevant function that removes duplicates from a list via an arbitrary equality test.
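As a small illustration of how nubBy uses the supplied test, here is a sketch that is unrelated to the question's data: it treats two numbers as duplicates when they leave the same remainder modulo 3. The name `firstPerResidue` is hypothetical.

```haskell
import Data.List (nubBy)

-- Keep the first representative of each residue class modulo 3.
firstPerResidue :: [Int] -> [Int]
firstPerResidue = nubBy (\x y -> x `mod` 3 == y `mod` 3)
```

For example, `firstPerResidue [1,2,4,5,6]` keeps 1, 2 and 6, since 4 and 5 repeat the remainders of 1 and 2.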
A version of the function you're looking for is:
import Data.List (sort, nubBy)
removeDuplicates' :: Ord a => [[a]] -> [[a]]
removeDuplicates' = nubBy (\l1 l2 -> sort l1 == sort l2)
Of course, this does require that a is an Ord, not just an Eq, as well as using sort, which is (as stated below) an expensive function. So it is certainly not ideal. However, I don't know specifically how you want to do the equality tests on those lists, so I'll leave the details to you.
@AJFarmar's answer solves the issue, but it can be done a bit more efficiently: since sort is an expensive function, we want to save on such function calls.
We can use:
import Data.List(nubBy, sort)
import Data.Function(on)
removeDuplicates' :: Ord a => [[a]] -> [[a]]
removeDuplicates' = map snd . nubBy ((==) `on` fst) . map ((,) =<< sort)
What we do here is first apply map ((,) =<< sort), which constructs a tuple (sort x, x) for every element x in the original list. Then nubBy compares only the first components of those tuples, so each list is sorted just once. Finally, map snd returns the second item of every remaining tuple (sort x, x), that is, the original list.
We can generalize this by constructing a nubOn function:
import Data.List(nubBy)
import Data.Function(on)
nubOn :: Eq b => (a -> b) -> [a] -> [a]
nubOn f = map snd . nubBy ((==) `on` fst) . map ((,) =<< f)
In that case removeDuplicates' is nubOn sort.
You may not even need to sort. You just need to see if all items are the same, like this:
\xs ys -> length xs == (length . filter (== True) $ (==) <$> xs <*> ys)
You just need to know that (==) <$> ["A1","A2","A3"] <*> ["A3","A2","A1"] would in fact return [False,False,True,False,True,False,True,False,False]
As per @rampion's rightful comment, let's take it further and import Data.Set; then it gets pretty dandy.
import Data.Set as S
equity :: Ord a => [a] -> [a] -> Bool
equity = (. S.fromList) . (==) . S.fromList
*Main> equity ["A1","A2","A3"] ["A3","A2","A1"]
True

Filter a list of tuples by fst

What I'm trying to do is not really solve a problem, but more to learn how to write Haskell code that composes/utilizes basic functions to do it.
I have a function that takes a list of tuples (String, Int) and a String, and returns a tuple whose fst matches the given String.
This was fairly easy to do with filter and lambda, but what I want to do now, is remove the rightmost argument, ie. I want to refactor the function to be a composition of partially applied functions that'll do the same functionality.
Original code was:
getstat :: Player -> String -> Stat
getstat p n = head $ filter (\(n', v) -> n' == n) $ stats p
New code is:
getstat :: Player -> String -> Stat
getstat p = head . (flip filter $ stats p) . cmpfst
where cmpfst = (==) . fst . (flip (,)) 0 -- Wrong :-\
The idea is to flip the filter and partially apply by giving in the list of tuples (stats p) and then compose cmpfst.
cmpfst should be String -> (String, Int) -> Bool so that when String argument is applied, it becomes a -> Bool which is good for the filter to pass in tuples, but as you can see - I have problems composing (==) so that only fst's of given tuples are compared.
P.S. I know that the first code is likely cleaner; the point of this task was not to write clean code but to learn how to solve the problem through composition.
Edit:
I understand well that calling head on a possibly empty list is bad programming that will result in a crash. As one earlier poster mentioned, this is very simply and elegantly resolved with the Maybe monad, a task I've done before and am familiar with.
What I'd like the focus to be on, is how to make cmpfst composed primarily of basic functions.
So far, the furthest I got is this:
getstat :: Player -> String -> Stat
getstat p = head . (flip filter $ stats p) . (\n' -> (==(fst n')) . fst) . (flip (,)) 0
I can't get rid of the (a -> Bool) lambda by composing and partially applying around (==). This signals, to me, that I either don't understand what I'm doing, or it's impossible using (==) operator in the way I imagined.
Furthermore, if there's no exact solution, I'll accept a signature-changing solution as the correct one. I'd like not to change the signature of the function simply because it's a mental exercise for me, not production code.
If I were writing this function, I'd probably have given it this type signature:
getstat :: String -> Player -> Stat
This makes it easy to eta-reduce the definition to
getstat n = head . filter ((== n) . fst) . stats
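If the name may be absent, a total variant in the same point-free style could look like the sketch below. The question does not show Player or Stat, so the definitions here are assumptions: Stat is taken to be (String, Int) and Player a record with a stats field. The name `getstatMaybe` is also made up.

```haskell
import Data.List (find)

-- Assumed shapes; the question does not give these definitions.
type Stat = (String, Int)
newtype Player = Player { stats :: [Stat] }

-- Returns Nothing instead of crashing when no stat has the given name.
getstatMaybe :: String -> Player -> Maybe Stat
getstatMaybe n = find ((== n) . fst) . stats
```

Here find replaces head . filter, so an empty match yields Nothing rather than an exception.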
In a comment, you reached
getstat p = head . (flip filter $ stats p) . (\n (n', v) -> n' == n)
I wonder if there's a nicer composition that can eliminate the anon f.
Well, here it is
\n (n', v) -> n' == n
-- for convenience, we flip the ==
\n (n', v) -> n == n'
-- prefix notation
\n (n', v) -> (==) n n'
-- let's remove pattern matching over (n', v)
\n (n', v) -> (==) n $ fst (n', v)
\n x -> (==) n $ fst x
-- composition, eta
\n -> (==) n . fst
-- prefix
\n -> (.) ((==) n) fst
-- composition
\n -> ((.) . (==) $ n) fst
-- let's force the application to be of the form (f n (g n))
\n -> ((.) . (==) $ n) (const fst $ n)
-- exploit f <*> g = \n -> f n (g n) -- AKA the S combinator
((.) . (==)) <*> (const fst)
-- remove unneeded parentheses
(.) . (==) <*> const fst
Removing p is left as an exercise.
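As a sanity check on the derivation above, the first and last steps can be given the same type and compared on sample inputs. The names `cmpfstLambda` and `cmpfstPointFree` are introduced only for this sketch.

```haskell
-- Starting point of the derivation: the explicit lambda.
cmpfstLambda :: Eq a => a -> (a, b) -> Bool
cmpfstLambda = \n (n', _) -> n' == n

-- End point of the derivation, using <*> on functions:
-- (f <*> g) n = f n (g n)
cmpfstPointFree :: Eq a => a -> (a, b) -> Bool
cmpfstPointFree = (.) . (==) <*> const fst
```

Both accept a name and a tuple and compare the name with the tuple's first component.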

Nondeterminism for infinite inputs

Using lists to model nondeterminism is problematic if the inputs can take infinitely many values. For example
pairs = [ (a,b) | a <- [0..], b <- [0..] ]
This will return [(0,0),(0,1),(0,2),...] and never get around to showing you any pair whose first element is not 0.
Using the Cantor pairing function to collapse a list of lists into a single list can get around this problem. For example, we can define a bind-like operator that orders its outputs more intelligently by
(>>>=) :: [a] -> (a -> [b]) -> [b]
as >>>= f = cantor (map f as)
cantor :: [[a]] -> [a]
cantor xs = go 1 xs
  where
    go _ [] = []
    go n xs = hs ++ go (n+1) ts
      where
        ys = filter (not.null) xs
        hs = take n $ map head ys
        ts = mapN n tail ys
mapN :: Int -> (a -> a) -> [a] -> [a]
mapN _ _ [] = []
mapN n f xs@(h:t)
  | n <= 0 = xs
  | otherwise = f h : mapN (n-1) f t
If we now wrap this up as a monad, we can enumerate all possible pairs
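newtype Select a = Select { runSelect :: [a] }
-- Functor and Applicative instances are required by GHC 7.10 and later
instance Functor Select where
  fmap f (Select as) = Select (map f as)
instance Applicative Select where
  pure a = Select [a]
  sf <*> sa = sf >>= \f -> fmap f sa
instance Monad Select where
  return = pure
  Select as >>= f = Select $ as >>>= (runSelect . f)
pairs = runSelect $ do
  a <- Select [0..]
  b <- Select [0..]
  return (a,b)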
This results in
>> take 15 pairs
[(0,0),(0,1),(1,0),(0,2),(1,1),(2,0),(0,3),(1,2),(2,1),(3,0),(0,4),(1,3),(2,2),(3,1),(4,0)]
which is a much more desirable result. However, if we were to ask for triples instead, the ordering on the outputs isn't as "nice" and it's not even clear to me that all outputs are eventually included --
>> take 15 triples
[(0,0,0),(0,0,1),(1,0,0),(0,1,0),(1,0,1),(2,0,0),(0,0,2),(1,1,0),(2,0,1),(3,0,0),(0,1,1),(1,0,2),(2,1,0),(3,0,1),(4,0,0)]
Note that (2,0,1) appears before (0,1,1) in the ordering -- my intuition says that a good solution to this problem will order the outputs according to some notion of "size", which could be an explicit input to the algorithm, or could be given implicitly (as in this example, where the "size" of an input is its position in the input lists). When combining inputs, the "size" of a combination should be some function (probably the sum) of the size of the inputs.
Is there an elegant solution to this problem that I am missing?
TL;DR: It flattens two dimensions at a time, rather than flattening three at once. You can't tidy this up in the monad because >>= is binary, not ternary etc.
I'll assume you defined
(>>>=) :: [a] -> (a -> [b]) -> [b]
as >>>= f = cantor $ map f as
to interleave the list of lists.
You like that because it goes diagonally:
sums = runSelect $ do
  a <- Select [0..]
  b <- Select [0..]
  return (a+b)
gives
ghci> take 36 sums
[0,1,1,2,2,2,3,3,3,3,4,4,4,4,4,5,5,5,5,5,5,6,6,6,6,6,6,6,7,7,7,7,7,7,7,7]
so it's pleasingly keeping the "sizes" in order, but the pattern appears to be broken for triples, and you doubt completeness, but you needn't. It's doing the same trick, but twice, rather than for all three at once:
triplePairs = runSelect $ do
  a <- Select [0..]
  b <- Select [0..]
  c <- Select [0..]
  return $ (a,(b,c))
The second pair is treated as a single source of data, so notice that:
ghci> map fst $ take 36 pairs
[0,0,1,0,1,2,0,1,2,3,0,1,2,3,4,0,1,2,3,4,5,0,1,2,3,4,5,6,0,1,2,3,4,5,6,7]
ghci> map fst $ take 36 triplePairs
[0,0,1,0,1,2,0,1,2,3,0,1,2,3,4,0,1,2,3,4,5,0,1,2,3,4,5,6,0,1,2,3,4,5,6,7]
and (adding some spaces/newlines for clarity of pattern):
ghci> map snd $ take 36 pairs
[0, 1,0, 2,1,0, 3,2,1,0, 4,3,2,1,0, 5,4,3,2,1,0, 6,5,4,3,2,1,0, 7,6,5,4,3,2,1,0]
ghci> map snd $ take 36 triplePairs
[(0,0), (0,1),(0,0), (1,0),(0,1),(0,0), (0,2),(1,0),(0,1),(0,0),
(1,1),(0,2),(1,0),(0,1),(0,0),
(2,0),(1,1),(0,2),(1,0),(0,1),(0,0),
(0,3),(2,0),(1,1),(0,2),(1,0),(0,1),(0,0),
(1,2),(0,3),(2,0),(1,1),(0,2),(1,0),(0,1),(0,0)]
so you can see it's using exactly the same pattern. This doesn't preserve total sums and it oughtn't because we're getting to three dimensions by flattening two dimensions first before flattening the third in. The pattern is obscured, but it's just as guaranteed to make it to the end of the list.
Sadly if you want to do three dimensions in a sum-preserving way, you'll have to write cantor2, cantor3 and cantor4 functions, possibly a cantorN function, but you'll have to ditch the monadic interface, which is inherently based on the bracketing of >>=, hence two-at-a-time flattening of dimensions.
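For the specific case in the question, where all three sources are [0..] and the "size" of an element is simply its value, a sum-ordered enumeration outside the monadic interface can be sketched directly. This is only an illustration under that assumption, not a general cantor3; the name `triplesBySum` is made up.

```haskell
-- Enumerate all triples of naturals in order of increasing total a + b + c.
-- For each total s there are finitely many triples, so every triple
-- is reached after finitely many steps.
triplesBySum :: [(Int, Int, Int)]
triplesBySum =
  [ (a, b, s - a - b)
  | s <- [0 ..]          -- the "size" of the triple
  , a <- [0 .. s]
  , b <- [0 .. s - a]
  ]
```

With this ordering (0,1,1), of size 2, is produced before (2,0,1), of size 3.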
import Control.Applicative
import Control.Arrow
data Select a = Select [a]
              | Selects [Select a]
instance Functor Select where
  fmap f (Select x) = Select $ map f x
  fmap f (Selects xss) = Selects $ map (fmap f) xss
instance Applicative Select where
  pure = Select . (:[])
  Select fs <*> xs = Selects $ map (`fmap`xs) fs
  Selects fs <*> xs = Selects $ map (<*>xs) fs
instance Monad Select where
  return = pure
  Select xs >>= f = Selects $ map f xs
  Selects xs >>= f = Selects $ map (>>=f) xs
runSelect :: Select a -> [a]
runSelect = go 1
 where go n xs = uncurry (++) . second (go $ n+1) $ splitOff n xs
       splitOff n (Select xs) = second Select $ splitAt n xs
       splitOff n (Selects sls) = (concat hs, Selects $ tsl ++ rl)
        where ((hs, tsl), rl) = first (unzip . map (splitOff n)) $ splitAt n sls
*Select> take 15 . runSelect $ do { a <- Select [0..]; b <- Select [0..]; return (a,b) }
[(0,0),(0,1),(1,0),(1,1),(0,2),(1,2),(2,0),(2,1),(2,2),(0,3),(1,3),(2,3),(3,0),(3,1),(3,2)]
*Select> take 15 . runSelect $ do { a <- Select [0..]; b <- Select [0..]; c <- Select [0..]; return (a,b,c) }
[(0,0,0),(0,0,1),(0,1,0),(0,1,1),(1,0,0),(1,0,1),(1,1,0),(1,1,1),(0,0,2),(0,1,2),(0,2,0),(0,2,1),(0,2,2),(1,0,2),(1,1,2)]
Note that this is still not quite Cantor-tuples ((0,1,1) shouldn't come before (1,0,0)), but getting it correct would be possible as well in a similar manner.
A correct multidimensional enumerator could be represented with a temporary state object
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE OverlappingInstances #-}
class Space a b where
  slice :: a -> ([b], a)
instance Space [a] a where
  slice (l:ls) = ([l], ls)
  slice [] = ([], [])
instance (Space sp x) => Space ([sp], [sp]) x where
  slice (fs, b:bs) = let
      ss = map slice (b : fs)
      yield = concat $ map fst ss
    in (yield, (map snd ss, bs))
Here an N dimensional space is represented by a tuple of lists of N-1 dimensional subspaces that have and haven't been touched by the enumeration.
You can then use the following to produce a well ordered list
enumerate :: (Space sp x) => sp -> [x]
enumerate sp = let (sl, sp') = slice sp
               in sl ++ enumerate sp'
Example in Ideone.
The omega package does exactly what you want and guarantees that every element will be eventually visited:
import Control.Applicative
import Control.Monad.Omega
main = print . take 200 . runOmega $
  (,,) <$> each [0..] <*> each [0..] <*> each [0..]
Another option would be to use LogicT. It gives more flexibility (if you need) and has operations such as (>>-) that ensure that every combination is eventually encountered.
import Control.Applicative
import Control.Monad
import Control.Monad.Logic
-- | Convert a list into any MonadPlus.
each :: (MonadPlus m) => [a] -> m a
each = msum . map return
-- | A fair variant of '(<*>)' that ensures that both branches are explored.
(<#>) :: (MonadLogic m) => m (a -> b) -> m a -> m b
(<#>) f k = f >>- (\f' -> k >>- (\k' -> return $ f' k'))
infixl 4 <#>
main = print . observeMany 200 $
  (,,) <$> each [0..] <#> each [0..] <#> each [0..]

Resources