Enumerating all pairs of possibly infinite lists [duplicate] - haskell

I have a function for finite lists
> kart :: [a] -> [b] -> [(a,b)]
> kart xs ys = [(x,y) | x <- xs, y <- ys]
but how to implement it for infinite lists? I have heard something about Cantor and set theory.
I also found a function like
> genFromPair (e1, e2) = [x*e1 + y*e2 | x <- [0..], y <- [0..]]
But I'm not sure if it helps, because Hugs only gives out pairs without ever stopping.
Thanks for help.

Your first definition, kart xs ys = [(x,y) | x <- xs, y <- ys], is equivalent to
kart xs ys = xs >>= (\x ->
ys >>= (\y -> [(x,y)]))
where
(x:xs) >>= g = g x ++ (xs >>= g)
(x:xs) ++ ys = x : (xs ++ ys)
are sequential operations. Redefine them as alternating operations,
(x:xs) >>/ g = g x +/ (xs >>/ g)
(x:xs) +/ ys = x : (ys +/ xs)
[] +/ ys = ys
and your definition should be good to go for infinite lists as well:
kart_i xs ys = xs >>/ (\x ->
ys >>/ (\y -> [(x,y)]))
testing,
Prelude> take 20 $ kart_i [1..] [101..]
[(1,101),(2,101),(1,102),(3,101),(1,103),(2,102),(1,104),(4,101),(1,105),(2,103)
,(1,106),(3,102),(1,107),(2,104),(1,108),(5,101),(1,109),(2,105),(1,110),(3,103)]
courtesy of "The Reasoned Schemer". (see also conda, condi, conde, condu).
another way, more explicit, is to create separate sub-streams and combine them:
kart_i2 xs ys = foldr g [] [map (x,) ys | x <- xs]
where
g a b = head a : head b : g (tail a) (tail b)
this actually produces exactly the same results. But now we have more control over how we combine the sub-streams. We can be more diagonal:
kart_i3 xs ys = g [] [map (x,) ys | x <- xs]
where -- works both for finite
g [] [] = [] -- and infinite lists
g a b = concatMap (take 1) a
++ g (filter (not . null) (take 1 b ++ map (drop 1) a))
(drop 1 b)
so that now we get
Prelude> take 20 $ kart_i3 [1..] [101..]
[(1,101),(2,101),(1,102),(3,101),(2,102),(1,103),(4,101),(3,102),(2,103),(1,104)
,(5,101),(4,102),(3,103),(2,104),(1,105),(6,101),(5,102),(4,103),(3,104),(2,105)]
With some searching on SO I've also found an answer by Norman Ramsey with seemingly yet another way to generate the sequence, splitting these sub-streams into four areas - top-left tip, top row, left column, and recursively the rest. His merge there is the same as our +/ here.
Your second definition,
genFromPair (e1, e2) = [x*e1 + y*e2 | x <- [0..], y <- [0..]]
is equivalent to just
genFromPair (e1, e2) = [0*e1 + y*e2 | y <- [0..]]
Because the list [0..] is infinite there's no chance for any other value of x to come into play. This is the problem that the above definitions all try to avoid.

Prelude> let kart = (\xs ys -> [(x,y) | ls <- map (\x -> map (\y -> (x,y)) ys) xs, (x,y) <- ls])
Prelude> :t kart
kart :: [t] -> [t1] -> [(t, t1)]
Prelude> take 10 $ kart [0..] [1..]
[(0,1),(0,2),(0,3),(0,4),(0,5),(0,6),(0,7),(0,8),(0,9),(0,10)]
Prelude> take 10 $ kart [0..] [5..10]
[(0,5),(0,6),(0,7),(0,8),(0,9),(0,10),(1,5),(1,6),(1,7),(1,8)]

you can think of the sequel as
0: (0, 0)
/ \
1: (1,0) (0,1)
/ \ / \
2: (2,0) (1, 1) (0,2)
...
Each level can be expressed by level n: [(n,0), (n-1, 1), (n-2, 2), ..., (0, n)]
Doing this to n <- [0..]
We have
cartesianProducts = [(n-m, m) | n<-[0..], m<-[0..n]]

Related

Two implementations for List Monad bind in the literature: why are they equivalent?

Reading the Monad chapter in "Programming in Haskell" 2nd ed. from Graham Hutton, I found this example on page 167 to illustrate the behaviour of the List Monad:
> pairs [1,2] [3,4]
[(1,3),(1,4),(2,3),(2,4)]
With pairs defined like this:
pairs :: [a] -> [b] -> [(a,b)]
pairs xs ys = do x <- xs
y <- ys
return (x,y)
And this implementation of bind:
instance Monad [] where
-- (>>=) :: [a] -> (a -> [b]) -> [b]
xs >>= f = [y | x <- xs, y <- f x]
I tried to understand with pencil and paper how the example worked out, but didn't get it.
Then I found, that in other books the bind operation is defined differently:
...
xs >>= f = concat (fmap f xs)
With this definition I understand why the example works.
But the first definition is the one I found on hackage in the prelude, so I trust its correct.
My question:
Can anybody explain why the first definition is equivalent to the second? (Where does the concat-stuff happen in the first one?)
List comprehensions are just syntactic sugar. Basically, [f x y | x<-l, y<-m] is sugar for
concatMap (\x -> concatMap (\y -> return $ f x y) m) l
or equivalently
concat $ fmap (\x -> concat $ fmap (\y -> return $ f x y) m) l
thus the two implementations are indeed equivalent by definition.
Anyway you can of course manually evaluate the example from the comprehension-based definition, using “intuitive set comprehension” evaluation:
pairs [1,2] [3,4]
≡ do { x <- [1,2]; y <- [3,4]; return (x,y) }
≡ [1,2] >>= \x -> [3,4] >>= \y -> return (x,y)
≡ [p | x<-[1,2], p <- (\ξ -> [3,4] >>= \y -> return (ξ,y)) x]
≡ [p | x<-[1,2], p <- ([3,4] >>= \y -> return (x,y))]
≡ [p | x<-[1,2], p <- [q | y<-[3,4], q <- (\υ -> return (x,υ)) y]]
≡ [p | x<-[1,2], p <- [q | y<-[3,4], q <- return (x,y)]]
≡ [p | x<-[1,2], p <- [q | y<-[3,4], q <- [(x,y)]]]
≡ [p | x<-[1,2], p <- [(x,3), (x,4)]]
≡ [(1,3), (1,4)] ++ [(2,3), (2,4)]
≡ [(1,3), (1,4), (2,3), (2,4)]
[y | x <- xs, y <- f x] takes all the xs in xs one-by-one and
apply f to them, which is a monadic action a -> [a], thus the result is a list of values ([a])
the comprehension proceeds to address each y in f x one-by-one
each y is sent to the output list
this is equivalent to first mapping f over each of the elements of the input list, resulting in a list of nested lists, that are then concatenated. Notice that fmap is map for lists, and you could use concatMap f xs as the definition of xs >>= f.

Define functions with foldl foldr

I understand the definitions of foldl, foldr, but I have problems with functions defined by them.
For example map with foldr:
map f [] = []
map f l = foldr (\x xs -> f x : xs) [] l
I don't understand the (\x xs -> f x : xs). It is the map function, which foldr takes? But shouldn't it be (\x xs -> f x : f xs), because map f (x:xs) = f x : map f xs?
Example with foldl:
concat (x:xs) = x ++ concat xs
concat' xs = foldl (++) [] xs
concat'' xs = foldl (\ys y -> ys ++ y) [] xs
Of course I understand (++), but what's the logic behind (\ys y -> ys ++ y)? Is it ys = [] and y = xs?
So the function takes [] as ys and y is the first element of xs and concates the [] with the y?
Concrete example:
concat'' [1,2,3] = foldl (\ys y -> ys ++ y) [] [1,2,3]
=> foldl (\ys y -> ys ++ y) ((\ys y -> ys ++ y) [] [1]) [2,3]
=> foldl (\ys y -> ys ++ y) [1] [2,3]
=> foldl (\ys y -> ys ++ y) ((\ys y -> ys ++ y) [1] [2]) [3]
=> foldl (\ys y -> ys ++ y) [1,2] [3]
=> foldl (\ys y -> ys ++ y) ((\ys y -> ys ++ y) [1,2] [3]) []
=> foldl (\ys y -> ys ++ y) [1,2,3] []
=> [1,2,3]
Another thing: concat only takes 1 list xs, so if I want to concat 2 lists?
concat (x:xs) ys = x ++ concat xs ys
concat [1,2,3] [4,5,6] with foldl?
Reverse:
reverse (x:xs) = reverse xs ++ [x]
reverse' l = foldl (\xs x -> [x] : xs) [] l
reverse'' l = foldr (\x xs -> xs ++ [x]) [] l
The foldr is intuitive clear (with the questions from above), but what's behind the reverse order in foldl (\xs x -> [x] : xs)? This foldl (\x xs -> xs ++ [x]) [] l would be wrong, wouldn't it?
Thanks a lot!
The code
foldr (\x xs -> ...) end list
could be read, roughly, as follows
scan the whole list
if it's empty, just return end end
otherwise:
let x be the element at hand
let xs be the rest of the list, after having been processed
apply the ... operation
The emphasized part is crucial. xs is not the rest of the list, but the result of the "recursive call" on it.
Indeed, xs is a bad name for that. In thee general case, it's not even a list! E.g. one would never write (silly example)
foldr (\x xs -> x + xs) 0 [1..100] -- sum 1..100
but rather prefer something like
foldr (\x partialSum -> x + partialSum) 0 [1..100] -- sum 1..100
(Actually, one would not sum using foldr, but let's leave that aside.)
So, just read it like this:
map f l = foldr (\x mappedTail -> f x : mappedTail) [] l

Ways to pack (adjacent) elements of a list into 2-tuples

I was wondering if there would be a concise/one-liner way to do the following:
pack :: [a] -> [(a, a)]
pack [] = []
pack [_] = []
pack (x:y:xs) = (x, y) : pack xs
Which is the same as:
pack' xs = [(x, y) | (x, y, i) <- zip3 xs (tail xs) [0..], even i]
I don’t have much against either of these two options, but I was wondering: is there more concise way by combining (,) with some other function?
I had assumed there’d be such a way, but it eludes me. So this is just out of curiosity.
Thanks!
We can easily split the list into two lists with alternating elements with this tidbit (due to HaskellWiki)
foldr (\a ~(x,y) -> (a:y,x)) ([],[])
All that remains is to combine the lists with zip
pack :: [a] -> [(a, a)]
pack = uncurry zip . foldr (\a ~(x,y) -> (a:y,x)) ([],[])
Another one-liner using LambdaCase and Data.List.unfoldr:
pack = unfoldr $ \case (x:y:zs) -> Just ((x,y),zs); _ -> Nothing
Something I want occasionally is splits -
splits :: Int -> [a] -> [[a]]
splits n = unfoldr $ \case [] -> Nothing ; xs -> Just $ splitAt n xs
And given that, pack becomes:
pack xs = [ (a,b) | [a,b] <- splits 2 xs ]
Note that for xs = [x1, x2, ..., xn-1, xn], we have
init xs = [x1, x2, ... , xn-1]
tail xs = [x2, x3, ... , xn ]
leading to
zip (init xs) (tail xs) = [(x1, x2), (x2, x3), (x3, x4), ...]
and what we want is
pack xs = [(x1, x2), (x3, x4), ...]
which is easy to get once we have a list of masks
cycle [True, False] = [ True, False, True, ... ]
leading to the one-liner
pack :: [a] -> [(a, a)]
pack xs = map snd . filter fst . zip (cycle [True, False]) $ zip (init xs) (tail xs)
I don't know if this is one line, but:
snd $ foldr (\ x (z, ps) -> maybe (Just x, ps) (\y -> (Nothing, (x, y) : ps) z) (Nothing, []) $ xs
should be the same as your function.

Join or merge function in haskell

Is there a function in Haskell to do the equivalent of an SQL join or an R merge ?
Basically I have two list of tuples and would like to 'zip' them according to their key.
I know there is only one or zero value for each key
a = [(1, "hello"), (2, "world")]
b = [(3, "foo"), (1, "bar")]
And get something like
[ (Just (1, "hello), Just (1, "bar))
, (Just (2, "world), Nothing)
, (Nothing , Just (3, "foo"))
]
With ordered sets (lists) and Ord key
key = fst
fullOuterJoin xs [] = map (\x -> (Just x, Nothing)) xs
fullOuterJoin [] ys = map (\y -> (Nothing, Just y)) ys
fullOuterJoin xss#(x:xs) yss#(y:ys) =
if key x == key y
then (Just x, Just y): fullOuterJoin xs ys
else
if key x < key y
then (Just x, Nothing): fullOuterJoin xs yss
else (Nothing, Just y): fullOuterJoin xss ys
(complexity is O(n+m) but if you must to sort then is O(n log n + m log m))
example
setA = [(1, "hello"), (2, "world")]
setB = [(1, "bar"), (3, "foo")]
*Main> fullOuterJoin setA setB
[(Just (1,"hello"),Just (1,"bar")),(Just (2,"world"),Nothing),(Nothing,Just (3, "foo"))]
(obviously with sort support
fullOuterJoin' xs ys = fullOuterJoin (sort xs) (sort ys)
As #Franky say, you can avoid if, for example
fullOuterJoin xs [] = [(Just x, Nothing) | x <- xs]
fullOuterJoin [] ys = [(Nothing, Just y) | y <- ys]
fullOuterJoin xss#(x:xs) yss#(y:ys) =
case (compare `on` key) x y of
EQ -> (Just x, Just y): fullOuterJoin xs ys
LT -> (Just x, Nothing): fullOuterJoin xs yss
GT -> (Nothing, Just y): fullOuterJoin xss ys
I can not think of any standard function doing this operation. I would convert the two lists into Data.Map.Maps and code the SQL-join myself. In this way, it looks doable with O(n log n) complexity, which is not too bad.
If you care about performance, this is not the answer you are looking for.
No answer for a built-in function since you didn't give a type.
It can be done by a simple list comprehension
joinOnFst as bs = [(a,b) | a<-as, b<-bs, fst a == fst b]
or with pattern matching and a different return type
joinOnFst as bs = [(a1,a2,b2) | (a1,a2)<-as, (b1,b2)<-bs, a1==b1]
More abstract, you can define
listJoinBy :: (a -> b -> Bool) -> [a] -> [b] -> [(a,b)]
listJoinBy comp as bs = [(a,b) | a<-as, b<-bs, comp a b]
listJoin :: (Eq c) => (a -> c) -> (b -> c) -> [a] -> [b] -> [(a,b)]
listJoin fa fb = listJoinBy (\a b -> fa a == fb b)
I bet the last line can be made point-free or at least the lambda can be eliminated.
You're asking if there's a way to something like a SQL join in Haskell.
Rosetta code has an example in Haskell of how to do a hash join, which is the algorithm lots of RDBMS's use for JOINs, one of which is presented as cleaner (though it is slower):
a cleaner and more functional solution would be to use Data.Map (based on binary trees):
import qualified Data.Map as M
import Data.List
import Data.Maybe
import Control.Applicative
mapJoin xs fx ys fy = joined
where yMap = foldl' f M.empty ys
f m y = M.insertWith (++) (fy y) [y] m
joined = concat .
mapMaybe (\x -> map (x,) <$> M.lookup (fx x) yMap) $ xs
main = mapM_ print $ mapJoin
[(1, "Jonah"), (2, "Alan"), (3, "Glory"), (4, "Popeye")]
snd
[("Jonah", "Whales"), ("Jonah", "Spiders"),
("Alan", "Ghosts"), ("Alan", "Zombies"), ("Glory", "Buffy")]
fst
What you asked is basically
ordzip a#(x:t) b#(y:r) = case compare x y of
LT -> (Just x, Nothing) : ordzip t b
EQ -> (Just x, Just y) : ordzip t r
GT -> (Nothing, Just y) : ordzip a r
ordzip a [] = [(Just x, Nothing) | x <- a]
ordzip [] b = [(Nothing, Just y) | y <- b]
With it, we can further define e.g.
import Control.Applicative (<|>)
diff xs ys = [x | (Just x, Nothing) <- ordzip xs ys] -- set difference
meet xs ys = [y | (Just _, Just y) <- ordzip xs ys] -- intersection
union xs ys = [z | (u,v) <- ordzip xs ys, let Just z = u <|> v]
or have few variations concerning the access to keys or handling of duplicates etc. (like defining ordzipBy k a#(x:t) b#(y:r) = case compare (k x) (k y) of ...).
But this is better represented with Data.These.These type to explicate the fact that (Nothing, Nothing) can never happen:
import Data.These
ordzip a#(x:t) b#(y:r) = case compare x y of
LT -> This x : ordzip t b
GT -> That y : ordzip a r
_ -> These x y : ordzip t r
diff xs ys = catThis $ ordzip xs ys
meet xs ys = map snd . catThese $ ordzip xs ys -- or map fst, or id
union xs ys = map (mergeThese const) $ ordzip xs ys
Of course the input lists are to be sorted by the key beforehand.

get all possible combinations of k elements from a list

I need a function that does the same thing as itertools.combinations(iterable, r) in python
So far I came up with this:
{-| forward application -}
x -: f = f x
infixl 0 -:
{-| combinations 2 "ABCD" = ["AB","AC","AD","BC","BD","CD"] -}
combinations :: Ord a => Int -> [a] -> [[a]]
combinations k l = (sequence . replicate k) l -: map sort -: sort -: nub
-: filter (\l -> (length . nub) l == length l)
Is there a more elegant and efficient solution?
xs elements taken n by n is
mapM (const xs) [1..n]
all combinations (n = 1, 2, ...) is
allCombs xs = [1..] >>= \n -> mapM (const xs) [1..n]
if you need without repetition
filter ((n==).length.nub)
then
combinationsWRep xs n = filter ((n==).length.nub) $ mapM (const xs) [1..n]
(Based on #JoseJuan's answer)
You can also use a list comprehension to filter out those where the second character is not strictly smaller than the first:
[x| x <- mapM (const "ABCD") [1..2], head x < head (tail x) ]
(Based on #FrankSchmitt’s answer)
We have map (const x) [1..n] == replicate n x so we could change his answer to
[x| x <- sequence (replicate 2 "ABCD"), head x < head (tail x) ]
And while in original question, 2 was a parameter k, for this particular example would probably not want to replicate with 2 and write
[ [x1,x2] | x1 <- "ABCD", x2 <- "ABCD", x1 < x2 ]
instead.
With a parameter k things are a bit more tricky if you want to generate them without duplicates. I’d do it recursively:
f 0 _ = [[]]
f _ [] = []
f k as = [ x : xs | (x:as') <- tails as, xs <- f (k-1) as' ]
(This variant does not remove duplicates if there are already in the list as; if you worry about them, pass nub as to it)
This SO answer:
subsequences of length n from list performance
is the fastest solution to the problem that I've seen.
compositions :: Int -> [a] -> [[a]]
compositions k xs
| k > length xs = []
| k <= 0 = [[]]
| otherwise = csWithoutHead ++ csWithHead
where csWithoutHead = compositions k $ tail xs
csWithHead = [ head xs : ys | ys <- compositions (k - 1) $ tail xs ]

Resources