Zipping with padding in Haskell

Zipping with padding in Haskell - haskell

A couple of times I've found myself wanting a zip in Haskell that adds padding to the shorter list instead of truncating the longer one. This is easy enough to write. (Monoid works for me here, but you could also just pass in the elements that you want to use for padding.)
zipPad :: (Monoid a, Monoid b) => [a] -> [b] -> [(a, b)]
zipPad xs [] = zip xs (repeat mempty)
zipPad [] ys = zip (repeat mempty) ys
zipPad (x:xs) (y:ys) = (x, y) : zipPad xs ys
This approach gets ugly when trying to define zipPad3. I typed up the following and then realized that of course it doesn't work:
zipPad3 :: (Monoid a, Monoid b, Monoid c) => [a] -> [b] -> [c] -> [(a, b, c)]
zipPad3 xs [] [] = zip3 xs (repeat mempty) (repeat mempty)
zipPad3 [] ys [] = zip3 (repeat mempty) ys (repeat mempty)
zipPad3 [] [] zs = zip3 (repeat mempty) (repeat mempty) zs
zipPad3 xs ys [] = zip3 xs ys (repeat mempty)
zipPad3 xs [] zs = zip3 xs (repeat mempty) zs
zipPad3 [] ys zs = zip3 (repeat mempty) ys zs
zipPad3 (x:xs) (y:ys) (z:zs) = (x, y, z) : zipPad3 xs ys zs
At this point I cheated and just used length to pick the longest list and pad the others.
Am I overlooking a more elegant way to do this, or is something like zipPad3 already defined somewhere?

How about custom head and tail functions (named next and rest in my example below)?
import Data.Monoid
zipPad :: (Monoid a, Monoid b) => [a] -> [b] -> [(a,b)]
zipPad [] [] = []
zipPad xs ys = (next xs, next ys) : zipPad (rest xs) (rest ys)
zipPad3 :: (Monoid a, Monoid b, Monoid c) => [a] -> [b] -> [c] -> [(a,b,c)]
zipPad3 [] [] [] = []
zipPad3 xs ys zs = (next xs, next ys, next zs) : zipPad3 (rest xs) (rest ys) (rest zs)
next :: (Monoid a) => [a] -> a
next [] = mempty
next xs = head xs
rest :: (Monoid a) => [a] -> [a]
rest [] = []
rest xs = tail xs
Test snippet:
instance Monoid Int where
mempty = 0
mappend = (+)
main = do
print $ zipPad [1,2,3,4 :: Int] [1,2 :: Int]
print $ zipPad3 [1,2,3,4 :: Int] [9 :: Int] [1,2 :: Int]
Its output:
[(1,1),(2,2),(3,0),(4,0)]
[(1,9,1),(2,0,2),(3,0,0),(4,0,0)]

This pattern comes up quite a lot. A solution I learned from Paul Chiusano is as follows:
data These a b = This a | That b | These a b
class Align f where
align :: (These a b -> c) -> f a -> f b -> f c
instance Align [] where
align f [] [] = []
align f (x:xs) [] = f (This x) : align f xs []
align f [] (y:ys) = f (That y) : align f [] ys
align f (x:xs) (y:ys) = f (These x y) : align f xs ys
liftAlign2 f a b = align t
where t (This l) = f l b
t (That r) = f a r
t (These l r) = f l r
zipPad a b = liftAlign2 (,) a b
liftAlign3 f a b c xs ys = align t (zipPad a b xs ys)
where t (This (x,y)) = f x y c
t (That r) = f a b r
t (These (x,y) r) = f x y r
zipPad3 a b c = liftAlign3 (,,) a b c
A little test in ghci:
*Main> zipPad3 ["foo", "bar", "baz"] [2, 4, 6, 8] [True, False] "" 0 False
[("foo",2,True),("bar",4,False),("baz",6,False),("",8,False)]

A simpler way to do this is with Maybe. I will illustrate with Edward's
more general formulation:
import Data.Maybe
import Control.Applicative
zipWithTails l r f as bs = catMaybes . takeWhile isJust $
zipWith fMaybe (extend as) (extend bs)
where
extend xs = map Just xs ++ repeat Nothing
fMaybe a b = liftA2 f a b <|> fmap l a <|> fmap r b

There are times when you want to be able to apply a different function to either tail rather than just supply mempty or manual zeroes as well:
zipWithTail :: (a -> a -> a) -> [a] -> [a] -> [a]
zipWithTail f (a:as) (b:bs) = f a b : zipWithTails f as bs
zipWithTail f [] bs = bs
zipWithTail f as _ = as
zipWithTails :: (a -> c) -> (b -> c) -> (a -> b -> c) -> [a] -> [b] -> [c]
zipWithTails l r f (a:as) (b:bs) = f a b : zipWithTails l r f as bs
zipWithTails _ r _ [] bs = fmap r bs
zipWithTails l _ _ as _ = fmap l as
I use the former when I'm doing something like zipWithTail (+)
and the former when I need to do something like zipWithTail (*b) (a*) (\da db -> a*db+b*da) since the former can be much more efficient than feeding a default into a function, and the latter a little bit so.
However, if you just wanted to make a more succinct version of what you have, you could probably turn to mapAccumL ,but its not any clearer, and the ++ can be expensive.
zipPad as bs = done $ mapAccumL go as bs
where go (a:as) b = (as,(a,b))
go [] b = ([],(mempty,b))
done (cs, both) = both ++ fmap (\x -> (x, mempty)) cs

Related

In filterM, why is `return (if b then x:ys else ys)` evaluated once after all the lists are created?

filterM :: Monad m => (a -> m Bool) -> [a] -> m [a]
filterM p [] = return []
filterM p (x:xs) = do b <- p x
ys <- filterM p xs
return (if b then x:ys else ys)
and
> filterM (\x -> [True,False]) [1,2,3]
[[1,2,3],[1,2],[1,3],[1],[2,3],[2],[3],[]]
Is return (if b then x:ys else ys) evaluated each time a list is created? Is yes, why isn't the result [[1,2,3]],[[1,2]],[[1,3]],[[1]],[[2,3]],[[2]],[[3]],[[]]?
Does the result [[1,2,3],[1,2],[1,3],[1],[2,3],[2],[3],[]] imply that return (if b then x:ys else ys) is evaluated once after all the lists are created?

In short: because the bind function (>>=) for the instance Monad [] is implement with concatMap, not map.
We can desugar the do block as:
filterM :: Monad m => (a -> m Bool) -> [a] -> m [a]
filterM p [] = return []
filterM p (x:xs) = p x >>= \b -> (filterM p xs >>= \ys -> return (if b then x:ys else ys))
For m ~ [], the >>= function is equivalent to flip concatMap, and return x is equivalent to [x], so that means that we can transform this, for a list, into:
filterM :: (a -> [Bool]) -> [a] -> [[a]]
filterM p [] = [[]]
filterM p (x:xs) = concatMap (\b -> concatMap (\ys -> [if b then (x:ys) else ys]) (filterM p xs)) (p x)
A concatMap (\x -> [f x]) is equivalent to map f, since the concatenation of all these singleton lists will result in a list that contains the outcomes of f for all elements in the given list.
It thus means that the above function is equivalent to:
filterM :: (a -> [Bool]) -> [a] -> [[a]]
filterM p [] = [[]]
filterM p (x:xs) = concatMap (\b -> map (\ys -> if b then (x:ys) else ys) (filterM p xs)) (p x)
If p is \_ -> [True, False], it thus means we can replace (p x) with [True, False], and thus obtain:
concatMap (\b -> map (\ys -> if b then (x:ys) else ys) (filterM p xs)) [True, False]
This thus means that concatMap is the concatenation of two lists: one where b is True, and one where b is False, like:
map (\ys -> (x:ys)) (filterM p xs) ++ map (\ys -> ys) (filterM p xs)
The first map will thus prepend all the lists from filterM p xs with x whereas the second one will not. The above expression is thus equivalent to:
map (x:) (filterM p xs) ++ filterM p xs
if filterM p xs contains the powerset of xs, then the above expression will thus contain the powerset of (x:xs).

sum3 with zipWith3 in Haskell

I'm trying to write a Haskell function that would take three lists and return a list of sums of their elements.
Currently I'm trying to do it using zipWith3:
sum3 :: Num a => [a] -> [a] -> [a] -> [a]
sum3 xs ys zs = zipWith3 (\x y z -> x+y+z) xs ys zs
The problem is it only works for lists of equal lengths. But I wish sum3 to work with lists of unequal lengths, so that
sum3 [1,2,3] [4,5] [6]
would return
[11,7,3]
I think that I should redefine zipWith3 to work with lists of unequal lengths, but can't figure out how to do it (I suspect that I have to exhaust all possibilities of empty lists).
Is there a solution?

a nice trick is to use transpose:
import Data.List (transpose)
sum3 :: Num a => [a] -> [a] -> [a] -> [a]
sum3 as bs cs = map sum $ transpose [as,bs,cs]
because obviously you want to sum up the columns ;)
> sum3 [1,2,3] [4,5] [6]
[11,7,3]

I've seen this sort of question before, here: Zip with default value instead of dropping values? My answer to that question also pertains here.
The ZipList applicative
Lists with a designated padding element are applicative (the applicative grown from the 1 and max monoid structure on positive numbers).
data Padme m = (:-) {padded :: [m], padder :: m} deriving (Show, Eq)
instance Applicative Padme where
pure = ([] :-)
(fs :- f) <*> (ss :- s) = zapp fs ss :- f s where
zapp [] ss = map f ss
zapp fs [] = map ($ s) fs
zapp (f : fs) (s : ss) = f s : zapp fs ss
-- and for those of you who don't have DefaultSuperclassInstances
instance Functor Padme where fmap = (<*>) . pure
Now we can pack up lists of numbers with their appropriate padding
pad0 :: [Int] -> Padme Int
pad0 = (:- 0)
And that gives
padded ((\x y z -> x+y+z) <$> pad0 [1,2,3] <*> pad0 [4,5] <*> pad0 [6])
= [11,7,3]
Or, with the Idiom Brackets that aren't available, you vould write
padded (|pad0 [1,2,3] + (|pad0 [4,5] + pad0 6|)|)
meaning the same.
Applicative gives you a good way to bottle the essential idea of "padding" that this problem demands.

Well if you must use zipWith3:
sum3 :: Num a => [a] -> [a] -> [a] -> [a]
sum3 xs ys zs = zipWith3 (\x y z -> x + y + z) xs' ys' zs'
where
xs' = pad nx xs; nx = length xs
ys' = pad ny ys; ny = length ys
zs' = pad nz zs; nz = length zs
n = nx `max` ny `max` nz
pad n' = (++ replicate (n-n') 0)
Some samples:
*> sum3 [] [] []
[]
*> sum3 [0] [] []
[0]
*> sum3 [1] [1] [2, 2]
[4,2]
*> sum3 [1,2,3] [4,5] [6]
[11,7,3]
but I'd recommend going with Carsten's transpose based implementation.

Perhaps you could get away with something that is almost zipWith3 but which relies on Default to generate empty values on the fly if one of the lists runs out of elements:
import Data.Default
zipWith3' :: (Default a, Default b, Default c)
=> ( a -> b -> c -> r )
-> ([a] -> [b] -> [c] -> [r])
zipWith3' f = go where
go [] [] [] = []
go (x:xs) (y:ys) (z:zs) = f x y z : go xs ys zs
go [] ys zs = go [def] ys zs
go xs [] zs = go xs [def] zs
go xs ys [] = go xs ys [def]
and 'sum3'`:
sum3' :: (Default a, Num a) => [a] -> [a] -> [a] -> [a]
sum3' = zipWith3' (\x y z -> x + y + z)

One could generalize zipWith so to handle the excess tails, instead of discarding them silently.
zipWithK :: (a->b->c) -> ([a]->[c]) -> ([b]->[c]) -> [a] -> [b] -> [c]
zipWithK fab fa fb = go
where go [] [] = []
go as [] = fa as
go [] bs = fb bs
go (a:as) (b:bs) = fab a b : go as bs
The original zipWith is then
zipWith' :: (a->b->c) -> [a] -> [b] -> [c]
zipWith' f = zipWithK f (const []) (const [])
Back to the original problem,
sum2 :: Num a => [a] -> [a] -> [a]
sum2 = zipWithK (+) id id
sum3 :: Num a => [a] -> [a] -> [a] -> [a]
sum3 xs ys zs = xs `sum2` ys `sum2` zs

This is my solution:
sumLists :: Num a => [a] -> [a] -> [a]
sumLists (x : xs) (y : ys) = (x + y) : sumLists xs ys
sumLists _ _ = []
sum3 :: (Num a, Enum a) => [a] -> [a] -> [a] -> [a]
sum3 xs ys zs = foldr sumLists defaultList (map addElems list)
where list = [xs, ys, zs]
defaultList = [] ++ [0, 0 ..]
maxLength = maximum $ map length list
addElems = \x -> if length x < maxLength then x ++ [0, 0 ..] else x

How to zip lists with different length?

How can I zip two lists like
["Line1","Line2","Line3"]
["Line4","Line5"]
without discarding rest elements in first list?
I'd like to zip extra elements with empty list, if it can be done.

zipWithPadding :: a -> b -> [a] -> [b] -> [(a,b)]
zipWithPadding a b (x:xs) (y:ys) = (x,y) : zipWithPadding a b xs ys
zipWithPadding a _ [] ys = zip (repeat a) ys
zipWithPadding _ b xs [] = zip xs (repeat b)
As long as there are elements, we can simply zip them. As soon as we run out of elements, we simply zip the remaining list with an infinite list of the padding element.
In your case, you would use this as
zipWithPadding "" "" ["Line1","Line2","Line3"] ["Line4","Line5"]
-- result: [("Line1","Line4"),("Line2","Line5"),("Line3","")]

Another solution is to make a zip function that works on monoids and fills in the missing values with mempty:
import Data.Monoid
mzip :: (Monoid a, Monoid b) => [a] -> [b] -> [(a, b)]
mzip (a:as) (b:bs) = (a, b) : mzip as bs
mzip [] (b:bs) = (mempty, b) : mzip [] bs
mzip (a:as) [] = (a, mempty) : mzip as []
mzip _ _ = []
> mzip ["Line1","Line2","Line3"] ["Line4","Line5"]
[("Line1","Line4"),("Line2","Line5"),("Line3","")]

An alternative implementation of Reite's solution, using higher order functions, just for fun. :) Possibly slower, though, since I guess the length functions will require additional traversals of the lists.
import Data.Monoid (mempty)
zipPad :: (Monoid a, Monoid b) => [a] -> [b] -> [(a,b)]
zipPad xs ys = take maxLength $ zip (pad xs) (pad ys)
where
maxLength = max (length xs) (length ys)
pad v = v ++ repeat mempty

I think it will be much simple for you if you are new one in programming in Haskell
zip' :: [String] -> [String] ->[(String,String)]
zip' [][] = []
zip' (x:xs)[] = bmi x : zip' xs []
where bmi x = (x,"")
zip' [](x:xs) = bmi x : zip' [] xs
where bmi x = ("",x)
zip' (x:xs) (y:ys) = bmi x y : zip' xs ys
where bmi x y = (x,y)

Sometimes I don't want to pad my list. For instance, when I want to zip equal length lists only. Here is a general purpose solution, which maybe returns any extra values if one list is longer.
zipWithSave :: (a -> b -> c) -> [a] -> [b] -> ([c],Maybe (Either [a] [b]))
zipWithSave f [] [] = ([],Nothing)
zipWithSave f [] bs = ([],Just (Right bs))
zipWithSave f as [] = ([],Just (Left as))
zipWithSave f (a:as) (b:bs) = (f a b : cs , sv)
where (cs, sv) = zipWithSave f as bs
Using (zps,svs) = zipWithSave f as bs, svs can be one of three cases: Just (Left x) wherein leftovers from as are returned as x, Just (Right x) wherein leftovers from bs are returned, or Nothing in the case of equal length lists.
Another general purpose one is to just supply extra functions for each case.
zipWithOr :: (a -> b -> c) -> (a -> c) -> (b -> c) -> [a] -> [b] -> [c]
zipWithOr _ _ _ [] [] = []
zipWithOr _ _ fb [] bs = map fb bs
zipWithOr _ fa _ as [] = map fa as
zipWithOr f fa fb (a:as) (b:bs) = (f a b) : zipWithOr f fa fb as bs
This is just an elaboration of Zeta's approach. That function is then implemented as (using {-# LANGUAGE TupleSections #-}):
zipWithPadding a b as bs = zipWithOr (,) (,b) (a,) as bs

Haskell : Comparing if two binary trees have the same elements

So I am working on a function that detects if two binary trees have the same numbers in them.
So what I've come up with is the following which works just fine but the problem is that I am using total of 5 functions. Is there another way of detecting if two BT's have the same elements with just one function ? Here is my solution so far which seems to work just fine.
flatten :: BinTree a -> [a]
flatten Empty = []
flatten (Node l x r) = flatten l ++ [x] ++ flatten r
splt :: Int -> [a] -> ([a], [a])
splt 0 xs = ([], xs)
splt _ [] = ([],[])
splt n (x:xs) = (\ys-> (x:fst ys, snd ys)) (splt (n-1) xs)
merge :: Ord a => [a] -> [a] -> [a]
merge xs [] = xs
merge [] ys = ys
merge (x:xs) (y:ys) = if (x > y) then y:merge (x:xs) ys else x:merge xs(y:ys)
msort :: Ord a => [a] -> [a]
msort [] =[]
msort (x:[]) = (x:[])
msort xs = (\y -> merge (msort (fst y)) (msort (snd y))) (splt (length xs `div` 2) xs)
areTreesEqual :: (Ord a) => BinTree a -> BinTree a-> Bool
areTreesEqual Empty Empty = True
areTreesEqual Empty a = False
areTreesEqual a Empty = False
areTreesEqual a b = msort (flatten (a) ) == msort (flatten (b))

How about
import Data.MultiSet as Set
equal a b = accum a == accum b
where accum Empty = Set.empty
accum (Node l x r) = Set.insert x (accum l `Set.union` accum r)
Sets are lovely for unordered comparison and multisets will make sure that
1 /= 1
1 1
Eg, that duplicate numbers are counted properly. If this isn't a big concern, than swap MultiSet for Set. I think 3 lines is pretty decent for this sort of thing.

How to partition a list in Haskell?

I want to take a list (or a string) and split it into sub-lists of N elements. How do I do it in Haskell?
Example:
mysteryFunction 2 "abcdefgh"
["ab", "cd", "ef", "gh"]

cabal update
cabal install split
And then use chunksOf from Data.List.Split

Here's one option:
partition :: Int -> [a] -> [[a]]
partition _ [] = []
partition n xs = (take n xs) : (partition n (drop n xs))
And here's a tail recursive version of that function:
partition :: Int -> [a] -> [[a]]
partition n xs = partition' n xs []
where
partition' _ [] acc = reverse acc
partition' n xs acc = partition' n (drop n xs) ((take n xs) : acc)

You could use:
mysteryFunction :: Int -> [a] -> [[a]]
mysteryFunction n list = unfoldr takeList list
where takeList [] = Nothing
takeList l = Just $ splitAt n l
or alternatively:
mysteryFunction :: Int -> [a] -> [[a]]
mysteryFunction n list = unfoldr (\l -> if null l then Nothing else Just $ splitAt n l) list
Note this puts any remaining elements in the last list, for example
mysteryFunction 2 "abcdefg" = ["ab", "cd", "ef", "g"]

import Data.List
import Data.Function
mysteryFunction n = map (map snd) . groupBy ((==) `on` fst) . zip ([0..] >>= replicate n)
... just kidding...

mysteryFunction x "" = []
mysteryFunction x s = take x s : mysteryFunction x (drop x s)
Probably not the elegant solution you had in mind.

There's already
Prelude Data.List> :t either
either :: (a -> c) -> (b -> c) -> Either a b -> c
and
Prelude Data.List> :t maybe
maybe :: b -> (a -> b) -> Maybe a -> b
so there really should be
list :: t -> ([a] -> t) -> [a] -> t
list n _ [] = n
list _ c xs = c xs
as well. With it,
import Data.List (unfoldr)
g n = unfoldr $ list Nothing (Just . splitAt n)
without it,
g n = takeWhile (not.null) . unfoldr (Just . splitAt n)

A fancy answer.
In the answers above you have to use splitAt, which is recursive, too. Let's see how we can build a recursive solution from scratch.
Functor L(X)=1+A*X can map X into a 1 or split it into a pair of A and X, and has List(A) as its minimal fixed point: List(A) can be mapped into 1+A*List(A) and back using a isomorphism; in other words, we have one way to decompose a non-empty list, and only one way to represent a empty list.
Functor F(X)=List(A)+A*X is similar, but the tail of the list is no longer a empty list - "1" - so the functor is able to extract a value A or turn X into a list of As. Then List(A) is its fixed point (but no longer the minimal fixed point), the functor can represent any given list as a List, or as a pair of a element and a list. In effect, any coalgebra can "stop" decomposing the list "at will".
{-# LANGUAGE DeriveFunctor #-}
import Data.Functor.Foldable
data N a x = Z [a] | S a x deriving (Functor)
(which is the same as adding the following trivial instance):
instance Functor (N a) where
fmap f (Z xs) = Z xs
fmap f (S x y) = S x $ f y
Consider the definition of hylomorphism:
hylo :: (f b -> b) -> (c -> f c) -> c -> b
hylo psi phi = psi . fmap (hylo psi phi) . phi
Given a seed value, it uses phi to produce f c, to which fmap applies hylo psi phi recursively, and psi then extracts b from the fmapped structure f b.
A hylomorphism for the pair of (co)algebras for this functor is a splitAt:
splitAt :: Int -> [a] -> ([a],[a])
splitAt n xs = hylo psi phi (n, xs) where
phi (n, []) = Z []
phi (0, xs) = Z xs
phi (n, (x:xs)) = S x (n-1, xs)
This coalgebra extracts a head, as long as there is a head to extract and the counter of extracted elements is not zero. This is because of how the functor was defined: as long as phi produces S x y, hylo will feed y into phi as the next seed; once Z xs is produced, functor no longer applies hylo psi phi to it, and the recursion stops.
At the same time hylo will re-map the structure into a pair of lists:
psi (Z ys) = ([], ys)
psi (S h (t, b)) = (h:t, b)
So now we know how splitAt works. We can extend that to splitList using apomorphism:
splitList :: Int -> [a] -> [[a]]
splitList n xs = apo (hylo psi phi) (n, xs) where
phi (n, []) = Z []
phi (0, xs) = Z xs
phi (n, (x:xs)) = S x (n-1, xs)
psi (Z []) = Cons [] $ Left []
psi (Z ys) = Cons [] $ Right (n, ys)
psi (S h (Cons t b)) = Cons (h:t) b
This time the re-mapping is fitted for use with apomorphism: as long as it is Right, apomorphism will keep using hylo psi phi to produce the next element of the list; if it is Left, it produces the rest of the list in one step (in this case, just finishes off the list with []).

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Zipping with padding in Haskell - haskell

Related

In filterM, why is `return (if b then x:ys else ys)` evaluated once after all the lists are created?

sum3 with zipWith3 in Haskell

How to zip lists with different length?

Haskell : Comparing if two binary trees have the same elements

How to partition a list in Haskell?

Categories

Resources