Finding the first duplicate element in a list - haskell

I am very new to Haskell. I am trying to write code that finds the first duplicate element in a list and, if the list has no duplicates, gives the message "no duplicates". I know I can do it with the nub function, but I am trying to do it without it.

This is one way to do it:
import qualified Data.Set as Set

dup :: Ord a => [a] -> Maybe a
dup xs = dup' xs Set.empty
  where dup' [] _ = Nothing
        dup' (x:xs) s = if Set.member x s
                          then Just x
                          else dup' xs (Set.insert x s)

dupString :: (Ord a, Show a) => [a] -> [Char]
dupString x = case dup x of
                Just x -> "First duplicate: " ++ (show x)
                Nothing -> "No duplicates"

main :: IO ()
main = do
  putStrLn $ dupString [1,2,3,4,5]
  putStrLn $ dupString [1,2,1,2,3]
  putStrLn $ dupString "HELLO WORLD"
Here is how it works:
*Main> main
No duplicates
First duplicate: 1
First duplicate: 'L'

This is not your final answer, because it does unnecessary work when an element is duplicated multiple times instead of returning right away, but it illustrates how you might go about systematically running through all the possibilities (i.e. "does this element of the list have duplicates further down the list?"):
dupwonub :: Eq a => [a] -> [a]
dupwonub [] = []
dupwonub (x:xs) = case [ y | y <- xs, y == x ] of
                    (y:_) -> [y]
                    [] -> dupwonub xs
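The "returning right away" part can be sketched as follows. This is only a minimal sketch, and the name firstWithLaterDup is made up for illustration. It relies on elem stopping at the first match, and it returns a Maybe instead of a list. Note that it answers "which is the first element that shows up again later?", which is not always the same as the element whose second occurrence comes first.

```haskell
-- Hypothetical name, not from the answer above.
-- Returns the first element that occurs again further down the list.
firstWithLaterDup :: Eq a => [a] -> Maybe a
firstWithLaterDup [] = Nothing
firstWithLaterDup (x:xs)
  | x `elem` xs = Just x                -- elem stops at the first match
  | otherwise   = firstWithLaterDup xs
```

For [1,2,2,1] this gives Just 1, because 1 is the first element with a later copy, even though the second 2 is reached before the second 1.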

In case you are still looking into Haskell, I thought you might like a faster, but more complicated, solution. This runs in O(n) (I think), plus the cost of building an array that spans the range of values, but it puts a slightly harsher restriction on the element type of your list: it has to be an instance of Ix.
accumArray is an incredibly useful function; I really recommend looking into it if you haven't already.
import Data.Array

data Occurances = None | First | Duplicated
  deriving Eq

update :: Occurances -> a -> Occurances
update None _ = First
update First _ = Duplicated
update Duplicated _ = Duplicated

firstDup :: (Ix a) => [a] -> a
firstDup xs = fst . first ((== Duplicated) . snd) $ map g xs
  where dupChecker = accumArray update None (minimum xs, maximum xs) (zip xs (repeat ()))
        g x = (x, dupChecker ! x)

first :: (a -> Bool) -> [a] -> a
first _ [] = error "No duplicates master"
first f (x:xs) = if f x
                   then x
                   else first f xs
Watch out though: an array with bounds (minimum xs, maximum xs) could really blow up your space requirements when the values are spread over a wide range.
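If the range of values is too wide for an array, the same counting idea can be sketched with a Map instead. This is an assumption-laden variant, not the answer's method: firstDupMap is a made-up name, it needs only Ord rather than Ix, it costs O(n log n) instead of O(n), and it returns a Maybe rather than calling error.

```haskell
import qualified Data.Map.Strict as Map
import Data.List (find)

-- Hypothetical name; same idea as firstDup, with a Map in place of the array.
firstDupMap :: Ord a => [a] -> Maybe a
firstDupMap xs = find isDuplicated xs
  where
    -- How many times each element occurs in the whole list.
    counts = Map.fromListWith (+) [(x, 1 :: Int) | x <- xs]
    isDuplicated x = Map.findWithDefault 0 x counts > 1
```

Like firstDup, this reports the first element of the list that has a copy anywhere else in the list.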

Related

how to show only the repeated element?

I'm new to Haskell.
How can I show only the repeated elements?
Given the input bbbool, the expected output is bo.
I found a way to do this on the internet:
import qualified Data.Set as Set

dup :: Ord a => [a] -> Maybe a
dup xs = dup' xs Set.empty
  where dup' [] _ = Nothing
        dup' (x:xs) s = if Set.member x s
                          then Just x
                          else dup' xs (Set.insert x s)

dupString :: (Ord a, Show a) => [a] -> [Char]
dupString x = case dup x of
                Just x -> "First duplicate: " ++ (show x)
                Nothing -> "No duplicates"
But the problem is that it only shows the first repeated element.
For example: bbbool gives b.
I hope that my question is clear.
A simple way to do this would be, assuming you only look for adjacent repeated elements:
import Data.List (group)
findRepeating :: Eq a => [a] -> [a]
findRepeating = map head . filter ((> 1) . length) . group
You may want to compose it with nub to deal with elements that repeat in separate runs, since for "bbbXaaaXbbb" it returns "bab".
Edit: if you meant all repeating elements, this should do:
import qualified Data.Map as M
findRepeating :: (Foldable t, Ord a) => t a -> [a]
findRepeating = M.keys . M.filter (> 1) . foldr (\x acc -> M.insertWith (+) x 1 acc) M.empty
This approach counts how many times each element shows up and then returns those that appear more than once. Because the result comes from M.keys, the elements are returned in ascending order rather than in the order in which they appear in the input.
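If the order of first appearance matters (as in bbbool giving bo), one possible sketch is to keep the counting step and then walk the original list once. The name repeatedInOrder is made up for illustration, and the list is assumed to be finite.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical name. Returns every element that occurs more than once,
-- each listed once, in order of first appearance.
repeatedInOrder :: Ord a => [a] -> [a]
repeatedInOrder xs = go Set.empty xs
  where
    counts = Map.fromListWith (+) [(x, 1 :: Int) | x <- xs]
    go _ [] = []
    go seen (y:ys)
      | y `Set.member` seen                = go seen ys   -- already reported
      | Map.findWithDefault 0 y counts > 1 = y : go (Set.insert y seen) ys
      | otherwise                          = go seen ys   -- occurs only once
```

With this, "bbbool" gives "bo" and "bbbXaaaXbbb" gives "bXa".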

Is there a straight-forward solution to receiving the element *prior* to hitting the dropWhile predicate?

Given a condition, I want to search through a list of elements and return the first element that reaches the condition, and the previous one.
In C/C++ this is easy :
int i = 0;
for(;;i++) if (arr[i] == 0) break;
After we get the index where the condition is met, getting the previous element is easy, through "arr[i-1]"
In Haskell:
dropWhile (/=0) list gives us the last element I want
takeWhile (/=0) list gives us the first element I want
But I don't see a way of getting both in a simple manner. I could enumerate the list and use indexing, but that seems messy. Is there a proper way of doing this, or a way of working around this?
I would zip the list with its tail so that you have pairs of elements available. Then you can just use find (from Data.List) on the list of pairs:
import Data.List (find)
f :: [Int] -> Maybe (Int, Int)
f xs = find ((>3) . snd) (zip xs (tail xs))
> f [1..10]
Just (3,4)
If the first element matches the predicate this will return Nothing (or the second match if there is one), so you might need to special-case that if you want something different.
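One way to special-case a match on the first element, sketched here under the assumption that "no previous element" should be reported explicitly, is to pair every element with Maybe its predecessor. The name findWithPrev is made up for illustration.

```haskell
import Data.List (find)

-- Hypothetical helper: pairs each element with its predecessor, if any,
-- so a match on the very first element comes back as (Nothing, x).
findWithPrev :: (a -> Bool) -> [a] -> Maybe (Maybe a, a)
findWithPrev p xs = find (p . snd) (zip (Nothing : map Just xs) xs)
```

For example, findWithPrev (>3) [1..10] gives Just (Just 3,4), while findWithPrev (>0) [1,2,3] gives Just (Nothing,1).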
As Robin Zigmond says break can also work:
g :: [Int] -> (Int, Int)
g xs = case break (>3) xs of
  (_, []) -> error "not found"
  ([], _) -> error "first element"
  (ys, z:_) -> (last ys, z)
(Or have this return a Maybe as well, depending on what you need.)
But this will, I think, keep the whole prefix ys in memory until it
finds the match, whereas f can start garbage-collecting the elements
it has moved past. For small lists it doesn't matter.
I would use a zipper-like search:
type ZipperList a = ([a], [a])
toZipperList :: [a] -> ZipperList a
toZipperList = (,) []
moveUntil' :: (a -> Bool) -> ZipperList a -> ZipperList a
moveUntil' _ (xs, []) = (xs, [])
moveUntil' f (xs, (y:ys))
  | f y = (xs, (y:ys))
  | otherwise = moveUntil' f (y:xs, ys)
moveUntil :: (a -> Bool) -> [a] -> ZipperList a
moveUntil f = moveUntil' f . toZipperList
example :: [Int]
example = [2,3,5,7,11,13,17,19]
result :: ZipperList Int
result = moveUntil (>10) example -- ([7,5,3,2], [11,13,17,19])
The good thing about zippers is that they are efficient, you can access as many elements near the index you want, and you can move the focus of the zipper forwards and backwards. Learn more about zippers here:
http://learnyouahaskell.com/zippers
Note that my moveUntil function is like break from the Prelude but the initial part of the list is reversed. Hence you can simply get the head of both lists.
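Getting "the head of both lists" could look roughly like this. It is a self-contained sketch: moveUntil is restated in a condensed form so the block compiles on its own, and prevAndMatch is a made-up name.

```haskell
type ZipperList a = ([a], [a])

-- Condensed restatement of moveUntil from above: the first component
-- holds the elements already passed, in reverse order.
moveUntil :: (a -> Bool) -> [a] -> ZipperList a
moveUntil f = go []
  where
    go before [] = (before, [])
    go before (y:ys)
      | f y       = (before, y:ys)
      | otherwise = go (y:before) ys

-- Hypothetical helper: the element just before the first match, and the match.
prevAndMatch :: (a -> Bool) -> [a] -> Maybe (a, a)
prevAndMatch f xs = case moveUntil f xs of
  (prev:_, match:_) -> Just (prev, match)
  _                 -> Nothing   -- no match, or the match is the first element
```

With the example list above, prevAndMatch (>10) [2,3,5,7,11,13,17,19] gives Just (7,11).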
A non-awkward way of implementing this as a fold is making it a paramorphism. For general explanatory notes, see this answer by dfeuer (I took foldrWithTails from it):
-- The extra [a] argument f takes with respect to foldr
-- is the tail of the list at each step of the fold.
foldrWithTails :: (a -> [a] -> b -> b) -> b -> [a] -> b
foldrWithTails f n = go
  where
    go (a : as) = f a as (go as)
    go [] = n

boundary :: (a -> Bool) -> [a] -> Maybe (a, a)
boundary p = foldrWithTails findBoundary Nothing
  where
    findBoundary x (y : _) bnd
      | p y = Just (x, y)
      | otherwise = bnd
    findBoundary _ [] _ = Nothing
Notes:
If p y is true we don't have to look at bnd to get the result. That makes the solution adequately lazy. You can check that by trying out boundary (> 1000000) [0..] in GHCi.
This solution gives no special treatment to the edge case of the first element of the list matching the condition. For instance:
GHCi> boundary (<1) [0..9]
Nothing
GHCi> boundary even [0..9]
Just (1,2)
There are several alternatives; either way, you'll have to implement this yourself. You could use explicit recursion:
getLastAndFirst :: (a -> Bool) -> [a] -> Maybe (a, a)
getLastAndFirst p (x : xs@(y:_))
  | p y = Just (x, y)
  | otherwise = getLastAndFirst p xs
getLastAndFirst _ _ = Nothing  -- fewer than two elements left
Alternately, you could use a fold, but that would look fairly similar to the above, except less readable.
A third option is to use break, as suggested in the comments:
getLastAndFirst' :: (a -> Bool) -> [a] -> Maybe (a,a)
getLastAndFirst' p l =
  case break p l of
    (xs@(_:_), (y:_)) -> Just (last xs, y)
    _ -> Nothing
(\(xs, ys) -> [last xs, head ys]) $ break (==0) list
Using break as Robin Zigmond suggested ended up short and simple, not using Maybe to catch edge-cases, but I could replace the lambda with a simple function that used Maybe.
I toyed a bit more with the solution and came up with
breakAround :: Int -> Int -> (a -> Bool) -> [a] -> [a]
breakAround m n cond list = (\(xs, ys) -> reverse (take m (reverse xs)) ++ take n ys) $ break cond list
which takes two integers, a predicate, and a list, and returns a single list made of the last m elements before the first match followed by n elements starting at the match itself.
Example: breakAround 3 2 (==0) [3,2,1,0,10,20,30] would return [3,2,1,0,10]

How to remove second largest element in a list in haskell?

I have created a program to remove the first smallest element, but I don't know how to do it for the second largest:
withoutBiggest (x:xs) =
  withoutBiggestImpl (biggest x xs) [] (x:xs)
  where
    biggest :: (Ord a) => a -> [a] -> a
    biggest big [] = big
    biggest big (x:xs) =
      if x < big then
        biggest x xs
      else
        biggest big xs
    withoutBiggestImpl :: (Eq a) => a -> [a] -> [a] -> [a]
    withoutBiggestImpl big before (x:xs) =
      if big == x then
        before ++ xs
      else
        withoutBiggestImpl big (before ++ [x]) xs
Here is a simple solution.
Prelude> let list = [10,20,100,50,40,80]
Prelude> let secondLargest = maximum $ filter (/= (maximum list)) list
Prelude> let result = filter (/= secondLargest) list
Prelude> result
[10,20,100,50,40]
Prelude>
A possibility, surely not the best one.
import Data.Permute (rank)
x = [4,2,3]
ranks = rank (length x) x -- this gives [2,0,1]; that means 3 (index 1) is the second smallest
Then:
[x !! i | i <- [0 .. length x -1], i /= 1]
Hmm, not very cool. Please give me some time to think of something better and I'll edit my post.
EDIT
Moreover my previous solution was wrong. This one should be correct, but again not the best one:
import Data.Permute (rank, elems, inverse)
ranks = elems $ rank (length x) x
iranks = elems $ inverse $ rank (length x) x
>>> [x !! (iranks !! i) | i <- filter (/=1) ranks]
[4,2]
An advantage is that this preserves the order of the list, I think.
Here is a solution that removes the n smallest elements from your list:
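import Data.List (sort, sortOn)
import Data.Ord (Down(..))

deleteN :: Int -> [a] -> [a]
deleteN _ [] = []
deleteN i (a:as)
  | i == 0 = as
  | otherwise = a : deleteN (i-1) as

ntails :: Int -> [a] -> [(a, Int)] -> [a]
ntails 0 l _ = l
ntails _ l [] = l
ntails n l ((_, i):s) = ntails (n-1) (deleteN i l) s

-- The n smallest elements are deleted from the highest index down, so that
-- one deletion does not shift the positions of elements still to be removed.
removeNSmallest :: Ord a => Int -> [a] -> [a]
removeNSmallest n l = ntails n l $ sortOn (Down . snd) $ take n $ sort $ zip l [0..]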
EDIT:
If you just want to remove the 2nd smallest element:
deleteN :: Int -> [a] -> [a]
deleteN _ [] = []
deleteN i (a:as)
  | i == 0 = as
  | otherwise = a : deleteN (i-1) as

remove2 :: [a] -> [(a, Int)] -> [a]
remove2 [] _ = []
remove2 [a] _ = []
remove2 l s = deleteN (snd $ head $ tail s) l

remove2Smallest :: Ord a => [a] -> [a]
remove2Smallest l = remove2 l $ sort $ zip l [0..]
It was not clear if the OP is looking for the biggest (as the name withoutBiggest implies) or what. In this case, one solution is to combine the filter :: (a->Bool) -> [a] -> [a] and maximum :: Ord a => [a] -> a functions from the Prelude.
withoutBiggest l = filter (/= maximum l) l
You can remove the biggest element by first finding it and then filtering it out:
withoutBiggest :: Ord a => [a] -> [a]
withoutBiggest [] = []
withoutBiggest xs = filter (/= maximum xs) xs
You can then remove the second-biggest element in much the same way:
withoutSecondBiggest :: Ord a => [a] -> [a]
withoutSecondBiggest xs =
  case withoutBiggest xs of
    [] -> xs
    rest -> filter (/= maximum rest) xs
Assumptions made:
You want each occurrence of the second-biggest element removed.
When there is zero/one element in the list, there isn't a second element, so there isn't a second-biggest element. Having the list without an element that isn't there is equivalent to having the list.
When the list contains only values equivalent to maximum xs, there also isn't a second-biggest element even though there may be two or more elements in total.
The Ord type-class instance implies a total ordering. Otherwise you may have multiple maxima that are not equivalent, in which case which one is picked as the biggest and second-biggest is not well-defined.
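If the first assumption does not hold and only one occurrence of the second-biggest element should go, a possible variation is to swap the final filter for delete from Data.List, which removes just the first occurrence. This is only a sketch; the name withoutOneSecondBiggest is made up for illustration.

```haskell
import Data.List (delete)

-- Hypothetical variant: removes only the first occurrence of the
-- second-biggest value and leaves the list alone if there is none.
withoutOneSecondBiggest :: Ord a => [a] -> [a]
withoutOneSecondBiggest xs =
  case filter (/= maximum xs) xs of
    []   -> xs                         -- empty list, or all elements equal
    rest -> delete (maximum rest) xs   -- delete drops one occurrence only
```

For [5,3,3,1] this gives [5,3,1], whereas withoutSecondBiggest would give [5,1].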

How to go backwards in a list in Haskell?

I need to write a simple function for one of my assignments that should remove all the duplicates from a given list except for the first occurrence of the element in the list.
Here is what I wrote:
remDup :: [Int]->[Int]
remDup []=[]
remDup (x:xs)
  | present x xs==True = remDup xs
  | otherwise = x:remDup xs
  where
    present :: Int->[Int]->Bool
    present x [] = False
    present x (y:ys)
      | x==y =True
      | otherwise = present x ys
But this code removes the duplicates except for the last occurrence of the element.
That is, if the given list is [1,2,3,3,2], it produces [1,3,2] instead of [1,2,3].
How to do it the other way around?
How about this idea:
remDup [] = []
remDup (x:xs) = x : remDup ( remove x xs )
where remove x xs removes all occurrences of x from the list xs (implementation left as an exercise.)
For every element you encounter, you simply want to check if you have encountered it before; build up a Set of encountered elements and use that to check if an element should be deleted.
import qualified Data.Set as S

remDup :: [Int] -> [Int]
remDup xs = helper S.empty xs
  where
    helper _ [] = []
    helper s (x:xs) | S.member x s = helper s xs
                    | otherwise = x : helper (S.insert x s) xs
You could reverse it, run your current duplicate remover, and then reverse the result.
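A sketch of the reverse trick, assuming a finite list: remDupLast below is a compact stand-in for the duplicate remover from the question (it keeps the last occurrence), and remDupFirst is a made-up name for the wrapped version.

```haskell
-- Stand-in for the question's remDup: keeps the LAST occurrence of each element.
remDupLast :: Eq a => [a] -> [a]
remDupLast [] = []
remDupLast (x:xs)
  | x `elem` xs = remDupLast xs
  | otherwise   = x : remDupLast xs

-- Reversing before and after turns "keep the last" into "keep the first".
remDupFirst :: Eq a => [a] -> [a]
remDupFirst = reverse . remDupLast . reverse
```

For [1,2,3,3,2], remDupLast gives [1,3,2] and remDupFirst gives [1,2,3].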
So this is what I finally came up with after following user5402's advice.
remDup1 [] = []
remDup1 (x:xs) = x:remDup1(remove x xs)
remove x []=[]
remove x (y:ys)
  | x==y = remove x ys
  | x/=y = y:(remove x ys)
If you care about efficiency, you should think about using HashSet as an auxiliary data structure. Doing that, we can get an average-case complexity of O(n log n), and effectively O(n) in practice.
import Data.Hashable
import Data.HashSet (HashSet)
import qualified Data.HashSet as HashSet

remDupSet :: (Hashable a, Eq a) => [a] -> [a]
remDupSet l = remDupSetAux HashSet.empty l
  where remDupSetAux :: (Hashable a, Eq a) => HashSet a -> [a] -> [a]
        remDupSetAux _ [] = []
        remDupSetAux s (x:xs) = if x `HashSet.member` s
                                  then remDupSetAux s xs
                                  else x : remDupSetAux (HashSet.insert x s) xs
I just quickly wrote a program to compare the performance of this solution with the top-voted one:
import Data.List
import Data.Hashable
import Data.HashSet (HashSet)
import qualified Data.HashSet as HashSet
import Data.Time.Clock
import Control.DeepSeq

main :: IO ()
main = do
  let a = [1..20000] :: [Int]
  putStrLn "Test1: 20000 different values"
  test "remDup" $ remDup a
  test "remDupSet" $ remDupSet a
  putStrLn ""
  let b = replicate 20000 1 :: [Int]
  putStrLn "Test2: one value repeted 20000 times"
  test "remDup" $ remDup b
  test "remDupSet" $ remDupSet b

test :: (NFData a) => String -> a -> IO ()
test s a = do time1 <- getCurrentTime
              time2 <- a `deepseq` getCurrentTime
              putStrLn $ s ++ ": " ++ show (diffUTCTime time2 time1)

remDup :: (Eq a) => [a] -> [a]
remDup [] = []
remDup (x:xs) = x : remDup (delete x xs)

remDupSet :: (Hashable a, Eq a) => [a] -> [a]
remDupSet l = remDupSetAux HashSet.empty l
  where remDupSetAux :: (Hashable a, Eq a) => HashSet a -> [a] -> [a]
        remDupSetAux _ [] = []
        remDupSetAux s (x:xs) = if x `HashSet.member` s
                                  then remDupSetAux s xs
                                  else x : remDupSetAux (HashSet.insert x s) xs
As expected, there is a huge difference mainly when there are many distinct values:
Test1: 20000 different values
remDup: 15.79859s
remDupSet: 0.007725s
Test2: one value repeted 20000 times
remDup: 0.001084s
remDupSet: 0.00064s
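If the element type has an Ord instance, a similar effect may be available without writing the helper by hand. The sketch below assumes a containers version of 0.6.0.1 or later, which provides nubOrd in Data.Containers.ListUtils; remDupOrd is a made-up name, and this variant was not part of the timing comparison above.

```haskell
import Data.Containers.ListUtils (nubOrd)

-- Hypothetical wrapper: nubOrd keeps the first occurrence of each element
-- and tracks the elements seen so far in a Set, so it runs in O(n log n).
remDupOrd :: Ord a => [a] -> [a]
remDupOrd = nubOrd
```

For [1,2,3,3,2] this gives [1,2,3], the same result as remDupSet, with an Ord constraint instead of Hashable.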

Replace Strings in Haskell

I am looking for a way to replace occurrences of strings in Haskell. I have this working for single pairs. My current function is implemented like so:
replaceList :: (Eq a) => [a] -> ([a],[a]) -> [a]
replaceList z@(x:xs) (a,b)
  | take (length a) z == a = b ++ strt z (length a)
  | otherwise = x : replaceList xs (a,b)
In this context, strt is just a function that returns a list starting at a certain index. This function works: replaceList "I am a dragon" ("ragon","odo") will return "I am a dodo". However, I am looking for a way to make this function accept a list of these tuples. For instance:
replaceList "I am a dragon" [("I","You"),("am","are"),("a dragon","awesome")] returning "You are awesome".
The methods I have tried so far have been to map a partially applied replaceList over a list of tuples, but that returns a list of each individual element changed. I also tried using this code:
replaceInfinity :: (Eq a) => [a] -> [([a],[a])] -> [a]
replaceInfinity x [] = x
replaceInfinity x ((a,b):ys) = (flip replaceList (a,b)) . (replaceInfinity x ys)
but that fails to compile. I'm relatively new to Haskell and am using this to rewrite an old language preprocessor. I can't understand the logic of how to implement this type of function.
Could someone tell me the tactic to get to the answer, or even implement the function for me so I could learn how it would be done? If it helps, I have the entire source file I've been using to play around with this here: http://hpaste.org/85363
The smallest change to your code that I could think of is:
replaceInfinity :: Eq a => [a] -> [([a], [a])] -> [a]
replaceInfinity x [] = x
replaceInfinity x (y:ys) = replaceInfinity (replaceList x y) ys
OUTPUT:
*Main> replaceInfinity "I am a dragon" [("I","You"),("am","are"),("a dragon","awesome")]
"You are awesome"
On second thought:
replaceInfinity x ((a,b):ys) = (flip replaceInfinity ys) . (replaceList x) $ (a,b)
Or more succinctly:
replaceInfinity x (y:ys) = flip replaceInfinity ys . replaceList x $ y
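The recursion above threads the string through each pair from left to right, which is the shape of a left fold. Here is a self-contained sketch of that reading. replaceFirst is a stand-in for the question's replaceList, written with isPrefixOf and drop instead of strt and given a case for the empty list; replaceAllPairs is a made-up name.

```haskell
import Data.List (isPrefixOf)

-- Stand-in for replaceList: replaces the first occurrence of `from` with `to`.
replaceFirst :: Eq a => [a] -> ([a], [a]) -> [a]
replaceFirst [] _ = []               -- no occurrence: return the input as is
replaceFirst z@(x:xs) (from, to)
  | from `isPrefixOf` z = to ++ drop (length from) z
  | otherwise           = x : replaceFirst xs (from, to)

-- Apply every pair in turn, feeding each result into the next replacement.
replaceAllPairs :: Eq a => [a] -> [([a], [a])] -> [a]
replaceAllPairs = foldl replaceFirst
```

With the pairs from the question, replaceAllPairs "I am a dragon" [("I","You"),("am","are"),("a dragon","awesome")] gives "You are awesome".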

Resources