Merging lists with unique elements - haskell

I am trying to combine two lists and remove duplicates. Below is my code; I think it has a syntax error. Can anyone also suggest a better way to implement this? I am sure there is one.
My code:
combine :: [Int] -> [Int] -> [Int]
combine (x:xs++y:ys)
|elem x xs || elem x ys = combine xs ++ ys
|elem y xs || elem y ys = combine xs ++ ys
|otherwise x:y:combine xs ys
I know what the problem wants me to do and I know how to solve it, but I cannot get past the syntax. Any help would be appreciated.

The problem is that the type signature says combine takes two lists of Ints and returns a list of Ints, but the definition takes a single argument built by appending two lists, and (x:xs++y:ys) is not a valid pattern. Also, I believe otherwise needs an = after it, like any other guard.
combine :: [Int] -> [Int] -> [Int]
combine x y
  | null x && not (null y) = y
  | null y && not (null x) = x
  | null x && null y = []
  | elem (head x) (tail x) || elem (head x) (tail y) = combine (tail x) y
  | elem (head y) (tail x) || elem (head y) (tail y) = combine x (tail y)
  | (head x) == (head y) = (head x) : combine (tail x) (tail y)
  | otherwise = (head x) : (head y) : combine (tail x) (tail y)
This just smells like there are better ways. There is probably a performance gain to be had somewhere (scanning lists multiple times with elem, I'm looking at you), and the code repeats itself a lot.
A much easier way is by using the nub function in Data.List.
import Data.List (nub)
combine :: [Int] -> [Int] -> [Int]
combine x y = nub (x ++ y)
After a short cabal install data-ordlist:
import Data.List.Ordered
combine x y = nubSort $ x ++ y
which runs a lot faster. Note that the result comes back sorted rather than in the original order. Wee!
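For comparison, here is a small sketch of the two approaches using only base, in case installing a package is not an option. The names combineNub and combineSorted are made up for this example; the second one is a stand-in for the sort-then-deduplicate idea, not the data-ordlist implementation.

```haskell
import Data.List (group, nub, sort)

-- Keeps first occurrences in their original order; quadratic in the worst case.
combineNub :: [Int] -> [Int] -> [Int]
combineNub xs ys = nub (xs ++ ys)

-- Sorts first, then keeps one element from each run of equal values.
combineSorted :: [Int] -> [Int] -> [Int]
combineSorted xs ys = [z | (z : _) <- group (sort (xs ++ ys))]
```

The main difference to watch for is ordering: combineNub [3,1,3] [1,2] keeps the order of first appearance, while combineSorted returns the elements in ascending order.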

Related

Difference list head function time complexity

I'm playing with the difference list data type in Haskell: http://hackage.haskell.org/package/dlist-0.8.0.2/docs/Data-DList.html
I see from the package documentation that the head function runs in O(n) time.
I wonder why that is. At first glance, head should run in O(1) time in most reasonable cases. But to argue about time complexity we should first define what n stands for: is it the number of elements or the number of lists?
Let's expand some definitions to figure out how head works.
First I have x = [1,2] and y = [3,4]. Then I convert them to DList and obtain x' = DList (x++) and y' = DList (y++). After that I append them:
z = x' <> y' = DList $ \zs -> x ++ (y ++ zs)
Now to the head function. It is defined as
-- | /O(n)/. Return the head of the dlist
head :: DList a -> a
head = list (error "Data.DList.head: empty dlist") const
Where list is:
-- | /O(n)/. List elimination for dlists
list :: b -> (a -> DList a -> b) -> DList a -> b
list nill consit dl =
  case toList dl of
    [] -> nill
    (x : xs) -> consit x (fromList xs)
So you could say it is obvious why head runs in O(n) time: it uses the list function, which runs in O(n) time. But let's do some equational reasoning:
head z
= list (error "Data.DList.head: empty dlist") const z
= case toList z of
    [] -> error "Data.DList.head: empty dlist"
    (x : xs) -> const x (fromList xs)
= case (toList $ DList $ \zs -> x ++ (y ++ zs)) of
    [] -> error "Data.DList.head: empty dlist"
    (x : xs) -> const x (fromList xs)
= case (x ++ (y ++ [])) of
    [] -> error "Data.DList.head: empty dlist"
    (x : xs) -> const x (fromList xs)
= case ((1:2:[]) ++ (y ++ [])) of
    [] -> error "Data.DList.head: empty dlist"
    (x : xs) -> const x (fromList xs)
= case (1:((2:[]) ++ (y ++ []))) of
    [] -> error "Data.DList.head: empty dlist"
    (x : xs) -> const x (fromList xs)
= const 1 (fromList ((2:[]) ++ (y ++ [])))
= 1
So it looks like head does not evaluate the whole list just to take the first element, and runs in O(1) as long as no empty lists are involved. Is this really true, and does the documentation just describe the worst possible case?
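One way to probe this is with a hand-rolled stand-in for a difference list. The sketch below is an assumption-laden toy: DL, fromL, toL, app and headDL are illustrative names, not the dlist package's API, though they follow the same function-wrapping idea as the definitions quoted above.

```haskell
-- Minimal stand-in for a difference list (illustrative, not the dlist package).
newtype DL a = DL ([a] -> [a])

fromL :: [a] -> DL a
fromL xs = DL (xs ++)

toL :: DL a -> [a]
toL (DL f) = f []

app :: DL a -> DL a -> DL a
app (DL f) (DL g) = DL (f . g)

-- Same idea as the quoted head: convert to a list, look at the first cell.
headDL :: DL a -> a
headDL dl = case toL dl of
  []      -> error "headDL: empty"
  (x : _) -> x

-- Only the first cons cell is forced, so the undefined tail is never touched.
lazyHead :: Int
lazyHead = headDL (fromL [1, 2] `app` fromL (3 : undefined))

-- n empty segments sit in front of the first element; each one has to be
-- stepped over before the head can be returned.
afterEmpties :: Int -> Int
afterEmpties n = headDL (foldr app (fromL [42]) (replicate n (fromL [])))
```

In this toy model lazyHead returns without evaluating the rest of the list, which supports the equational reasoning above, while afterEmpties shows one situation where the work before the first element grows with the number of appended segments. That would be consistent with O(n) being a worst-case bound, at least under these assumptions.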

Enumerating all pairs of possibly infinite lists [duplicate]

I have a function for finite lists
> kart :: [a] -> [b] -> [(a,b)]
> kart xs ys = [(x,y) | x <- xs, y <- ys]
but how to implement it for infinite lists? I have heard something about Cantor and set theory.
I also found a function like
> genFromPair (e1, e2) = [x*e1 + y*e2 | x <- [0..], y <- [0..]]
But I'm not sure if it helps, because Hugs only gives out pairs without ever stopping.
Thanks for help.
Your first definition, kart xs ys = [(x,y) | x <- xs, y <- ys], is equivalent to
kart xs ys = xs >>= (\x ->
             ys >>= (\y -> [(x,y)]))
where
(x:xs) >>= g = g x ++ (xs >>= g)
(x:xs) ++ ys = x : (xs ++ ys)
are sequential operations. Redefine them as alternating operations,
(x:xs) >>/ g = g x +/ (xs >>/ g)
(x:xs) +/ ys = x : (ys +/ xs)
[] +/ ys = ys
and your definition should be good to go for infinite lists as well:
kart_i xs ys = xs >>/ (\x ->
               ys >>/ (\y -> [(x,y)]))
testing,
Prelude> take 20 $ kart_i [1..] [101..]
[(1,101),(2,101),(1,102),(3,101),(1,103),(2,102),(1,104),(4,101),(1,105),(2,103)
,(1,106),(3,102),(1,107),(2,104),(1,108),(5,101),(1,109),(2,105),(1,110),(3,103)]
courtesy of "The Reasoned Schemer". (see also conda, condi, conde, condu).
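For anyone who wants to load this into GHCi as a file, here is a self-contained sketch of the alternating operators. The type signatures and the clause for [] >>/ _ are my additions (without that clause finite inputs hit a missing pattern), and kartI is just kart_i renamed.

```haskell
-- Interleaving append: alternate between the two lists.
(+/) :: [a] -> [a] -> [a]
[]       +/ ys = ys
(x : xs) +/ ys = x : (ys +/ xs)

-- Interleaving bind; the [] clause is an addition so finite inputs terminate.
(>>/) :: [a] -> (a -> [b]) -> [b]
[]       >>/ _ = []
(x : xs) >>/ g = g x +/ (xs >>/ g)

kartI :: [a] -> [b] -> [(a, b)]
kartI xs ys = xs >>/ (\x -> ys >>/ (\y -> [(x, y)]))
```

With two infinite inputs, take 5 (kartI [1..] [101..]) starts with the same five pairs as the GHCi output above.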
another way, more explicit, is to create separate sub-streams and combine them:
-- the tuple section (x,) needs {-# LANGUAGE TupleSections #-}
kart_i2 xs ys = foldr g [] [map (x,) ys | x <- xs]
  where
    g a b = head a : head b : g (tail a) (tail b)
this actually produces exactly the same results. But now we have more control over how we combine the sub-streams. We can be more diagonal:
kart_i3 xs ys = g [] [map (x,) ys | x <- xs]
  where                                    -- works both for finite
    g [] [] = []                           -- and infinite lists
    g a  b  = concatMap (take 1) a
              ++ g (filter (not . null) (take 1 b ++ map (drop 1) a))
                   (drop 1 b)
so that now we get
Prelude> take 20 $ kart_i3 [1..] [101..]
[(1,101),(2,101),(1,102),(3,101),(2,102),(1,103),(4,101),(3,102),(2,103),(1,104)
,(5,101),(4,102),(3,103),(2,104),(1,105),(6,101),(5,102),(4,103),(3,104),(2,105)]
With some searching on SO I've also found an answer by Norman Ramsey with seemingly yet another way to generate the sequence, splitting these sub-streams into four areas - top-left tip, top row, left column, and recursively the rest. His merge there is the same as our +/ here.
Your second definition,
genFromPair (e1, e2) = [x*e1 + y*e2 | x <- [0..], y <- [0..]]
is equivalent to just
genFromPair (e1, e2) = [0*e1 + y*e2 | y <- [0..]]
Because the list [0..] is infinite there's no chance for any other value of x to come into play. This is the problem that the above definitions all try to avoid.
Prelude> let kart = (\xs ys -> [(x,y) | ls <- map (\x -> map (\y -> (x,y)) ys) xs, (x,y) <- ls])
Prelude> :t kart
kart :: [t] -> [t1] -> [(t, t1)]
Prelude> take 10 $ kart [0..] [1..]
[(0,1),(0,2),(0,3),(0,4),(0,5),(0,6),(0,7),(0,8),(0,9),(0,10)]
Prelude> take 10 $ kart [0..] [5..10]
[(0,5),(0,6),(0,7),(0,8),(0,9),(0,10),(1,5),(1,6),(1,7),(1,8)]
You can think of the sequence as
0:        (0,0)
          /   \
1:     (1,0) (0,1)
       /   \ /   \
2:  (2,0) (1,1) (0,2)
...
Level n can be expressed as [(n,0), (n-1,1), (n-2,2), ..., (0,n)].
Doing this for n <- [0..],
we have
cartesianProducts = [(n-m, m) | n<-[0..], m<-[0..n]]
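To see why this ordering reaches every pair after finitely many steps, here is a sketch that computes where a given pair lands. The helper name diagonalIndex is made up for this example; the formula just counts the pairs on the earlier levels.

```haskell
cartesianProducts :: [(Integer, Integer)]
cartesianProducts = [(n - m, m) | n <- [0 ..], m <- [0 .. n]]

-- Position of (a, b) in cartesianProducts: levels 0 .. a+b-1 hold
-- (a+b)(a+b+1)/2 pairs in total, and (a, b) sits at offset b in its own level.
diagonalIndex :: (Integer, Integer) -> Int
diagonalIndex (a, b) = fromInteger (n * (n + 1) `div` 2 + b)
  where
    n = a + b
```

For example, cartesianProducts !! diagonalIndex (3, 4) gives back (3, 4). To pair up two arbitrary lists xs and ys, the same walk could be used to index into them, although repeated (!!) would be slow on long lists.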

Haskell - format issue

I am a beginner in Haskell programming and very often I get the error
xxx.hs:30:1: parse error on input `xxx'
Often the solution is to play around with the formatting a bit. It is the same code and it looks the same, but after playing around, the error is gone.
At the moment I've got the error
LookupAll.hs:30:1: parse error on input `lookupAll'
After that code:
lookupOne :: Int -> [(Int,a)] -> [a]
lookupOne _ [] = []
lookupOne x list =
    if fst(head list) == x then snd(head list) : []
    lookupOne x (tail list)
-- | Given a list of keys and a list of pairs of key and value
-- 'lookupAll' looks up the list of associated values for each key
-- and concatenates the results.
lookupAll :: [Int] -> [(Int,a)] -> [a]
lookupAll [] _ = []
lookupAll _ [] = []
lookupAll xs list = lookupOne h list ++ lookupAll t list
    where
        h = head xs
        t = tail xs
But I have done everything right, in my opinion. There are no tabs or anything like that, always 4 spaces. Is there a general solution for these problems? I am using Notepad++ at the moment.
Thanks!
The problem is not with lookupAll, it's actually with the previous two lines of code
if fst (head list) == x then snd (head list) : []
lookupOne x (tail list)
You haven't included an else on this if expression, and in Haskell the else branch is mandatory. My guess is that you meant
if fst (head list) == x then snd (head list) : []
else lookupOne x (tail list)
Which I personally would prefer to format as
if fst (head list) == x
    then snd (head list) : []
    else lookupOne x (tail list)
but that's a matter of taste.
If you want to accumulate a list of values that match a condition, there are a few ways. By far the easiest is to use filter, but you can also use explicit recursion. To use filter, you could write your function as
lookupOne x list
    = map snd                           -- Return only the values from the assoc list
    $ filter (\y -> fst y == x) list    -- Find each pair whose first element equals x
If you wanted to use recursion, you could instead write it as
lookupOne _ [] = [] -- The base case pattern
lookupOne x (y:ys) =                    -- Pattern match with (:), don't have to use head and tail
    if fst y == x                       -- Check if the key and lookup value match
        then snd y : lookupOne x ys     -- If so, prepend it onto the result of looking up the rest of the list
        else lookupOne x ys             -- Otherwise, just return the result of looking up the rest of the list
Both of these are equivalent. In fact, you can implement filter as
filter cond [] = []
filter cond (x:xs) =
    if cond x
        then x : filter cond xs
        else filter cond xs
And map as
map f [] = []
map f (x:xs) = f x : map f xs
Hopefully you can spot the similarities between filter and lookupOne. For map, take f = snd, and you can see that the explicitly recursive version of lookupOne merges the two patterns of map and filter. You could generalize this combined pattern into a higher-order function
mapFilter :: (a -> b) -> (a -> Bool) -> [a] -> [b]
mapFilter f cond [] = []
mapFilter f cond (x:xs) =
    if cond x
        then f x : mapFilter f cond xs
        else mapFilter f cond xs
Which you can use to implement lookupOne as
lookupOne x list = mapFilter snd (\y -> fst y == x) list
Or more simply
lookupOne x = mapFilter snd ((== x) . fst)
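As a side note, base already ships a close relative of this combined pattern: mapMaybe from Data.Maybe, which takes a single function returning Maybe instead of a separate mapper and predicate. A sketch of the lookup written that way, next to a list comprehension with a guard; the names lookupOneMaybe and lookupOneComp are made up for this example.

```haskell
import Data.Maybe (mapMaybe)

-- Keep the value of every pair whose key matches; drop the rest.
lookupOneMaybe :: Eq k => k -> [(k, v)] -> [v]
lookupOneMaybe x = mapMaybe (\(k, v) -> if k == x then Just v else Nothing)

-- The same thing as a list comprehension with a guard.
lookupOneComp :: Eq k => k -> [(k, v)] -> [v]
lookupOneComp x list = [v | (k, v) <- list, k == x]
```

Both should return the values in the order they appear in the association list.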
I think @bheklilr is right: you're missing an else.
You could avoid this particular formatting problem, however, by writing lookupOne as a function composition rather than writing your own new recursive function.
For example, you can get the right kind of behaviour by defining lookupOne like this:
lookupOne a = map snd . filter ((==) a . fst)
This way it's clearer that you first keep only the elements of the input list whose first tuple component matches the key, and then extract just the second component of each remaining tuple.
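The same style carries over to lookupAll from the question. This is only a sketch of one possible rewrite, using concatMap in place of the head/tail recursion; the type signatures follow the ones in the question.

```haskell
lookupOne :: Int -> [(Int, a)] -> [a]
lookupOne a = map snd . filter ((== a) . fst)

-- Look up every key in turn and concatenate the results.
lookupAll :: [Int] -> [(Int, a)] -> [a]
lookupAll keys list = concatMap (\k -> lookupOne k list) keys
```

Since there is no if and no where block left, there is also much less layout to get wrong.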

Haskell: Avoid using if for permutations?

I was trying to come up with a "one-liner" for generating permutations of a list in Haskell. This is what I have so far:
perms xs = if length xs == 0 then [[]] else [x:ys | x <- xs, ys <- perms $ delete x xs]
The problem is that I am having to use if which I don't like very much in Haskell. Is it possible to avoid using if, or a multi-part definition, or a case statement etc. and only "higher-order" functions (e.g. foldr etc.) to achieve this? (and also ideally keep it relatively small one-liner)
If you find it satisfactory to just get rid of the if without any interesting changes to the function, base-4.7.0.0 (bundled with GHC 7.8) includes bool in Data.Bool.
perms xs = bool [[]] [x:ys | x <- xs, ys <- perms $ delete x xs] $ not (null xs)
I managed to implement perms using foldr, but it becomes more complex. It is hard to believe this is what you are looking for:
perms xs = filter (\l -> length l == length xs) $ foldr step [[]] xs
  where step x acc = (map (x:) . perms $ delete x xs) ++ acc
[ghci] let perms1 xs = [x:ys | x <- xs, (not . null) xs ,ys <- perms $ delete x xs]
[ghci] perms1 []
[]
[ghci] perms1 [1,2,3]
[[1,2,3],[1,3,2],[2,1,3],[2,3,1],[3,1,2],[3,2,1]]
The higher-order function you need is one that does the same thing as your if expression: check a given predicate and, if it is true, evaluate to expression1, else evaluate to expression2.
I didn't find this in the standard library, but this is the function you are looking for:
myBranchFunction :: t -> t -> [a] -> t
myBranchFunction e1 e2 xs = if length xs == 0 then e1 else e2
(you could generalize this further by having length be an argument to myBranchFunction if you would like.)
Then you could define perms' like so:
perms' :: Eq a => [a] -> [[a]]
perms' xs = myBranchFunction [[]] [x:ys | x <- xs, ys <- perms' $ delete x xs] xs
Note that all we are doing is moving the if expression out to another function. For this function, however, I think I would prefer not to try to fit it on one line and just use a case expression for readability.
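For reference, the case-expression version mentioned above could look like the following sketch. It keeps the comprehension from the question and only replaces the length test with a pattern match.

```haskell
import Data.List (delete)

perms :: Eq a => [a] -> [[a]]
perms xs = case xs of
  [] -> [[]]                                          -- one permutation of nothing
  _  -> [x : ys | x <- xs, ys <- perms (delete x xs)] -- pick a head, permute the rest
```

It is no longer a one-liner, but the empty case is explicit, and perms [] gives [[]] as expected. Like the original, it relies on delete and therefore on Eq.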

How to have multiple infinite ranges in list comprehensions?

In haskell I have a list comprehension like this:
sq = [(x,y,z) | x <- v, y <- v, z <- v, x*x + y*y == z*z, x < y, y < z]
  where v = [1..]
However when I try take 10 sq, it just freezes...
Is there a way to handle multiple infinite ranges?
Thanks
In addition to the other answers explaining the problem, here is an alternative solution, generalized to work with level-monad and stream-monad, which lend themselves to searches over infinite search spaces. (It is also compatible with the list monad and logict, but those won't play nicely with infinite search spaces, as you already found out.)
{-# LANGUAGE MonadComprehensions #-}
module Triples where
import Control.Monad
sq :: MonadPlus m => m (Int, Int, Int)
sq = [(x, y, z) | x <- v, y <- v, z <- v, x*x + y*y == z*z, x < y, y < z]
  where v = return 0 `mplus` v >>= (return . (1+))
Now, for a fast breadth first search:
*Triples> :m +Control.Monad.Stream
*Triples Control.Monad.Stream> take 10 $ runStream sq
[(3,4,5),(6,8,10),(5,12,13),(9,12,15),(8,15,17),(12,16,20),(7,24,25),
(15,20,25),(10,24,26),(20,21,29)]
Alternatively:
*Triples> :m +Control.Monad.Levels
*Triples Control.Monad.Levels> take 5 $ bfs sq -- larger memory requirements
[(3,4,5),(6,8,10),(5,12,13),(9,12,15),(8,15,17)]
*Triples Control.Monad.Levels> take 5 $ idfs sq -- constant space, slower, lazy
[(3,4,5),(5,12,13),(6,8,10),(7,24,25),(8,15,17)]
List comprehensions are translated into nested applications of the concatMap function:
concatMap :: (a -> [b]) -> [a] -> [b]
concatMap f xs = concat (map f xs)
concat :: [[a]] -> [a]
concat [] = []
concat (xs:xss) = xs ++ concat xss
-- Shorter definition:
--
-- > concat = foldr (++) []
Your example is equivalent to this:
sq = concatMap (\x -> concatMap (\y -> concatMap (\z -> test x y z) v) v) v
  where v = [1..]
        test x y z =
          if x*x + y*y == z*z
          then if x < y
               then if y < z
                    then [(x, y, z)]
                    else []
               else []
          else []
This is basically a "nested loops" approach; it'll first try x = 1, y = 1, z = 1, then move on to x = 1, y = 1, z = 2 and so on, until it tries all of the list's elements as values for z; only then can it move on to try combinations with y = 2.
But of course you can see the problem—since the list is infinite, we never run out of values to try for z. So the combination (3, 4, 5) can only occur after infinitely many other combinations, which is why your code loops forever.
To solve this, we need to generate the triples in a smarter way, such that for any possible combination, the generator reaches it after some finite number of steps. Study this code (which handles only pairs, not triples):
-- | Take the Cartesian product of two lists, but in an order that guarantees
-- that all combinations will be tried even if one or both of the lists is
-- infinite:
cartesian :: [a] -> [b] -> [(a, b)]
cartesian [] _ = []
cartesian _ [] = []
cartesian (x:xs) (y:ys) =
    [(x, y)] ++ interleave3 vertical horizontal diagonal
  where
    -- The trick is to split the problem into these four pieces:
    --
    -- |(x0,y0)| (x0,y1) ... horiz
    -- +-------+------------
    -- |(x1,y0)| .
    -- |   .   |   .
    -- |   .   |     .
    -- |   .   |       .
    --   vert      diag
    vertical = map (\x -> (x,y)) xs
    horizontal = map (\y -> (x,y)) ys
    diagonal = cartesian xs ys
interleave3 :: [a] -> [a] -> [a] -> [a]
interleave3 xs ys zs = interleave xs (interleave ys zs)
interleave :: [a] -> [a] -> [a]
interleave xs [] = xs
interleave [] ys = ys
interleave (x:xs) (y:ys) = x : y : interleave xs ys
To understand this code (and fix it if I messed up!) look at this blog entry on how to count infinite sets, and at the fourth diagram in particular—the function is an algorithm based on that "zigzag"!
I just tried a simple version of your sq using this; it finds (3,4,5) almost instantly, but then takes a very long time to get to any other combination (in GHCi at least). But I think the key lessons to take away from this are:
List comprehensions just don't work well for nested infinite lists.
Don't spend too much time playing around with list comprehensions. Everything that they can do, functions like map, filter and concatMap can do—plus there are many other useful functions in the list library, so concentrate your effort on that.
Your code freezes because your predicate will never be satisfied.
Why?
Let's take an example without any predicate to understand.
>>> let v = [1..] in take 10 $ [ (x, y, z) | x <- v, y <- v, z <- v ]
[(1,1,1),(1,1,2),(1,1,3),(1,1,4),(1,1,5),(1,1,6),(1,1,7),(1,1,8),(1,1,9),(1,1,10)]
As you can see, x and y always stay at 1, because z never stops rising.
So your predicate can never be satisfied.
Any workaround?
Try a nested list comprehension:
>>> [[ fun x y | x <- rangeX, predXY] | y <- rangeY, predY ]
Or a parallel list comprehension, which can be enabled with
>>> :set -XParallelListComp
Look it up in the documentation.
This is possible, but you'll have to come up with an order in which to generate the numbers. The following generates the numbers you want; note that the x < y test can be replaced by generating only y that are >x and similarly for z (which is determined once x and y are bound):
[(x, y, z) | total <- [1..]
           , x <- [1..total-2]
           , y <- [x+1..total-1]
           , z <- [total - x - y]
           , x*x + y*y == z*z]
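Written out as a named definition that can be loaded from a file, the idea looks like the sketch below. The name triples and the extra y < z guard are my additions; everything else follows the comprehension above.

```haskell
-- Enumerate candidates by perimeter (total = x + y + z), so every triple
-- is reached after finitely many steps.
triples :: [(Int, Int, Int)]
triples =
  [ (x, y, z)
  | total <- [1 ..]
  , x <- [1 .. total - 2]
  , y <- [x + 1 .. total - 1]
  , let z = total - x - y   -- z is determined once x and y are bound
  , y < z
  , x*x + y*y == z*z
  ]
```

Because only one generator is infinite, take 5 triples terminates, and the results come out ordered by perimeter, starting with (3,4,5).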

Resources