Haskell Remove duplicates from list

I am new to Haskell and I am trying the below code to remove duplicates from a list. However it does not seem to work.
compress [] = []
compress (x:xs) = x : (compress $ dropWhile (== x) xs)
I have searched, but all the suggestions use foldr or map head. Is there an implementation that uses only basic constructs?

I think the issue with your code is that your current implementation only gets rid of adjacent duplicates. As was noted in a comment, the library function nub (from Data.List) eliminates every duplicate, even non-adjacent ones, and keeps only the first occurrence of each element. But since you asked how to implement such a function with basic constructs, how about this?
myNub :: (Eq a) => [a] -> [a]
myNub (x:xs) = x : myNub (filter (/= x) xs)
myNub [] = []
The only new function that I introduced to you there is filter, which filters a list based on a predicate (in this case, to get rid of every element present in the rest of that list matching the current element).
I hope this helps.

First of all, never simply state "does not work" in a question. This leaves it to the reader to check whether it's a compile-time error, a run-time error, or a wrong result.
In this case, I am guessing it's a wrong result, like this:
> compress [1,1,2,2,3,3,1]
[1,2,3,1]
The problem with your code is that it only removes successive duplicates. The first pair of 1s gets compressed, but the last lone 1 is not removed, because it is not adjacent to the others.
If you can, sort the list in advance. That will make equal elements close, and then compress does the right job. The output will be in a different order, of course. There are ways to keep the order too if needed (start with zip [0..] xs, then sort, then ...).
If you cannot sort because there is really no practical way to define a comparison, only an equality, then use nub. Be aware that this is much less efficient than sorting and compressing: nub is quadratic, while sorting takes O(n log n). This loss of performance is intrinsic: without a comparator, you can only use an inefficient quadratic algorithm.
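A minimal sketch of the sort-then-compress idea, assuming the element type has an Ord instance. The name dedupSorted is made up for illustration; compress is the function from the question.

```haskell
import Data.List (sort)

-- The function from the question: drops only adjacent duplicates.
compress :: Eq a => [a] -> [a]
compress [] = []
compress (x : xs) = x : compress (dropWhile (== x) xs)

-- Hypothetical helper: sorting first makes equal elements adjacent,
-- so compress then removes every duplicate. The result comes back in
-- sorted order, not in the original order.
dedupSorted :: Ord a => [a] -> [a]
dedupSorted = compress . sort
```

With the list from above, dedupSorted [1,1,2,2,3,3,1] gives [1,2,3], while compress alone gives [1,2,3,1].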

foldr and map are very basic FP constructs. However, they are very general and I found them a bit mind-bending to understand for a long time. Tony Morris' talk Explain List Folds to Yourself helped me a lot.
In your case, an existing list function like filter :: (a -> Bool) -> [a] -> [a], with a predicate that excludes your duplicate, could be used in lieu of dropWhile.

Related

Error: Non-exhaustive patterns in function Haskell

I posted an earlier thread about an error like this one, in which I explain my program. Here's the link
I'm moving forward in my project and I have another problem like that one. I started a new thread, but if I should just edit the first one instead, tell me.
I want to transpose my matrix. For example, [[B,B,N],[N,B,B]] should become [[B,N],[B,B],[N,B]]. Here's my code:
transpose :: Grille -> Grille
transpose [] = []
transpose g
    | head(g) == [] = []
    | otherwise = [premierElem(g)] ++ transpose(supp g)
supp :: Grille -> Grille
supp [] = []
supp (g:xg) = [tail(g)] ++ supp(xg)
premierElem :: Grille -> [Case]
premierElem [] = []
premierElem (x:xg) = [head(x)] ++ premierElem(xg)
I get the exact same error. I tried the fix that worked for the first one, but it doesn't help here.
EDIT: The exact error
*Main> transpose g0
[[B,B,N,B,N,B],[B,B,B,B,N,B],[B,N,N,B,N,B],[B,B,N,N,B,N],[B,N,N,N,B,N],[B,B,N,B,B,B],[N,B,N,N,B,N],[*** Exception: Prelude.head: empty list

The problem is that your transpose function has a broken termination condition. How does it know when to stop? Try walking through the final step by hand...
In general, your case transpose [] = [] will never occur, because your supp function never changes the number of lists in its argument. A well-formed matrix will end up as [[],[],[],...], which will not match []. The only thing that will stop it is an error like you received.
So, you need to check the remaining length of your nested (row?) vectors, and stop if it is zero. There are many ways to approach this; if it's not cheating, you could look at the implementation of transpose in the Data.List documentation.
Also, re your comment above: if you expect your input to be well-formed in some way, you should cover any excluded cases by complaining about the ill-formed input, such as reporting an error.
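One possible way to repair the termination condition is sketched below. The definitions of Case and Grille are not shown in the question, so the ones here are stand-ins, and transposeGrille is a made-up name. The sketch stops as soon as any row is empty, and it uses pattern matching in list comprehensions instead of head and tail.

```haskell
-- Stand-in types: assumptions, since the question does not show them.
data Case = B | N deriving (Eq, Show)
type Grille = [[Case]]

transposeGrille :: Grille -> Grille
transposeGrille g
    -- Stop when there are no rows left or some row has run out of cells.
    | null g || any null g = []
    -- Otherwise take the first cell of every row as a new row,
    -- then recurse on what remains of each row.
    | otherwise = [x | (x : _) <- g] : transposeGrille [xs | (_ : xs) <- g]
```

For the example in the question, transposeGrille [[B,B,N],[N,B,B]] gives [[B,N],[B,B],[N,B]]. Ragged input is silently truncated to the shortest row here; reporting an error instead would be an equally reasonable choice.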

Fixing Your Code
You should avoid using partial functions, such as tail and head, and instead make your own functions do (more) pattern matching. For example:
premierElem g = [head(head(g))] ++ premierElem(tail(g))
Yuck! If you want the first element of the first list in g then match on the pattern:
premierElem ((a:_):rest) = [a] ++ premierElem rest
This in and of itself is insufficient. You'll want to handle the case where the first list of the Grille is an empty list, and at least give a useful error message if you can't use a reasonable default value. Here the empty list is simply skipped:
premierElem ([]:rest) = premierElem rest
Making Better Code
Eventually you will become more comfortable in the language and learn to express what you want using higher level operations, which often means you'll be able to reuse functions already provided in base or other libraries. In this case:
premierElem :: [[a]] -> [a]
premierElem = concatMap (take 1)
This assumes you are OK with silently ignoring []. If that isn't your intent then other similarly concise solutions can work well, but we'd need clarity on the goal.

All possible orders of a list

I'm trying to teach myself Haskell and the book I'm using has said to create a list of all possible formations of said list, the example is as follows (roughly translated):
Given the list ls = [1,2,3], there are 5 possible forms in which this could occur:
[[1],[2],[3]]
[[1,2],[3]]
[[1,3],[2]]
[[2,3],[1]]
[[1,2,3]]
How would I even start coding this?
Thank you, and sorry for my English; it is not my first language.

Expanding on Daniel Wagner's comment:
First, explain precisely what you want. I would put it like this:
Given a list, xs :: [a], whose elements are all distinct, produce a list yss :: [[[a]]] representing all the ways to partition the elements of xs into non-empty lists.
Now, consider cases:
ways :: [a] -> [[[a]]]
ways [] = ?
ways (x : xs) = ?
You can expect the second case to be recursive. You can also expect to need to write at least one helper function.
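If you get stuck, here is one possible way to fill in the holes. It is only a sketch, not necessarily what the book has in mind. The helper insertEverywhere is a made-up name: it takes one partition and returns every partition you get by adding x to it, either inside an existing block or as a new block of its own.

```haskell
-- All ways to partition a list of distinct elements into non-empty blocks.
ways :: [a] -> [[[a]]]
ways [] = [[]]  -- the empty list has exactly one partition: no blocks
ways (x : xs) = concatMap (insertEverywhere x) (ways xs)

-- Hypothetical helper: add x to one partition in every possible way.
insertEverywhere :: a -> [[a]] -> [[[a]]]
insertEverywhere x [] = [[[x]]]  -- no blocks left: x forms its own block
insertEverywhere x (b : bs) =
    ((x : b) : bs)                          -- put x into the first block
        : map (b :) (insertEverywhere x bs) -- or keep b and put x further on
```

For [1,2,3] this produces the 5 partitions from the question, though not necessarily in the same order.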

Get elements with odd length in a Haskell list of strings

I have a list of strings in Haskell and I need to get those elements with odd length in another list. How can this be done using higher order functions like foldr, foldl, foldr1, foldl1, filter, map, and so on? I will very much appreciate your help. Can list comprehension be used in this case?

It seems that you are aware that filter exists (since you mentioned it), but perhaps are uncertain how it works. If you're trying to extract a specific subset of a list, this seems to be the right path. If you look at its type signature, you'll find it's pretty straightforward:
(a -> Bool) -> [a] -> [a]
That is, it takes a function that returns True or False (True to keep the element in the new list, False to drop it) and a list, and produces a new list. Similarly, Haskell provides a function called odd in Prelude. Its signature looks as follows:
Integral a => a -> Bool
That is, it takes a value of any Integral type and returns True if it is odd, False otherwise.
Now, let's consider a solution:
filter odd [1..10]
This will extract all the odd numbers from 1 to 10.
I noticed you mentioned list comprehensions. You probably do not want to use this if you are already given a list and you are simply filtering it. However, a list comprehension would be a perfectly acceptable solution:
[x | x <- [1..10], odd x]
In general, list comprehensions are used to express the generation of lists with more complicated constraints.
Now, to actually answer your question. We know we can filter numbers, so we need a way to get a number from a string. Search Hoogle for the following type (notice that String is simply [Char]):
[a] -> Int
You will see a length function. With some function composition, we can quickly see how to create a function which filters by odd length. In summary, we have odd, which has type Int -> Bool (in this case), and we have length, which has type [a] -> Int or, specifically, String -> Int. Our solution now looks like this:
filter (odd . length) ["abc","def","eh","123","hm","even"]
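Since the question also asks about foldr and list comprehensions, here is a sketch of the same filter written three ways. The function names are made up for illustration; all three should behave identically.

```haskell
-- Using filter with a composed predicate, as above.
oddLengthFilter :: [String] -> [String]
oddLengthFilter = filter (odd . length)

-- The same thing as a list comprehension.
oddLengthComp :: [String] -> [String]
oddLengthComp xs = [s | s <- xs, odd (length s)]

-- The same thing as a right fold: keep s only when its length is odd.
oddLengthFoldr :: [String] -> [String]
oddLengthFoldr = foldr (\s acc -> if odd (length s) then s : acc else acc) []
```

Applied to ["abc","def","eh","123","hm","even"], each version returns ["abc","def","123"].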

Here ya go.
getOddOnes = filter . flip (foldr (const (. not)) id) $ False
Note: if you turn this in for your homework, you'd best be prepared to explain it!
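For anyone who does need to explain it, here is one reading of the one-liner, restricted to lists. The helper hasOddLength is a made-up name. The fold ignores the elements themselves and composes one not per element onto id, so applying the result to False flips it once for every element.

```haskell
-- The one-liner from above, with an explicit list signature added.
getOddOnes :: [[a]] -> [[a]]
getOddOnes = filter . flip (foldr (const (. not)) id) $ False

-- Hypothetical unpacked predicate: build "not . not . ... . not" with one
-- not per element, then apply it to False. An odd number of flips yields True.
hasOddLength :: [a] -> Bool
hasOddLength s = foldr (\_ k -> k . not) id s False

-- Equivalent, more readable spelling of getOddOnes.
getOddOnes' :: [[a]] -> [[a]]
getOddOnes' = filter hasOddLength
```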

Counting number of elements in a list that satisfy the given predicate

Does the Haskell standard library have a function that, given a list and a predicate, returns the number of elements satisfying that predicate? Something with a type like (a -> Bool) -> [a] -> Int. My Hoogle search didn't return anything interesting. Currently I am using length . filter pred, which I don't find to be a particularly elegant solution. My use case seems common enough to have a better library solution than that. Is that the case, or is my intuition wrong?

The length . filter p implementation isn't nearly as bad as you suggest. In particular, it has only constant overhead in memory and speed.
For things that use stream fusion, like the vector package, length . filter p will actually be optimized so as to avoid creating an intermediate vector. Lists, however, use what's called foldr/build fusion at the moment, which is not quite smart enough to optimize length . filter p without creating linearly large thunks that risk stack overflows.
For details on stream fusion, see this paper. As I understand it, the reason that stream fusion is not currently used in the main Haskell libraries is that (as described in the paper) about 5% of programs perform dramatically worse when implemented on top of stream-based libraries, while foldr/build optimizations can never (AFAIK) make performance actively worse.

No, there is no predefined function that does this, but I would say that length . filter pred is, in fact, an elegant implementation; it's as close as you can get to expressing what you mean without just invoking the concept directly, which you can't do if you're defining it.
The only alternatives would be a recursive function or a fold, which IMO would be less elegant, but if you really want to:
foo :: (a -> Bool) -> [a] -> Int
foo p = foldl' (\n x -> if p x then n+1 else n) 0
This is basically just inlining length into the definition. As for naming, I would suggest count (or perhaps countBy, since count is a reasonable variable name).
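A self-contained version of the above, using the suggested name countBy and including the import that foldl' needs on older versions of base. The second definition, countBy', is a made-up name for the length . filter spelling, shown for comparison.

```haskell
import Data.List (foldl')

-- Strict left fold: bump the counter whenever the predicate holds.
countBy :: (a -> Bool) -> [a] -> Int
countBy p = foldl' (\n x -> if p x then n + 1 else n) 0

-- The composition from the question; it should give the same answers.
countBy' :: (a -> Bool) -> [a] -> Int
countBy' p = length . filter p
```

For example, countBy even [1..10] is 5.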

Haskell is a high-level language. Rather than provide one function for every possible combination of circumstances you might ever encounter, it provides you with a smallish set of functions that cover all of the basics, and you then glue these together as required to solve whatever problem is currently at hand.
In terms of simplicity and conciseness, this is as elegant as it gets. So yes, length . filter pred is absolutely the standard solution. As another example, consider elem, which (as you may know) tells you whether a given item is present in a list. The standard reference implementation for this is actually
elem :: Eq x => x -> [x] -> Bool
elem x = foldr (||) False . map (x ==)
In other words, compare every element in the list to the target element, creating a new list of Bools. Then fold the logical-OR function over this new list.
If this seems inefficient, try not to worry about it. In particular,
The compiler can often optimise away temporary data structures created by code like this. (Remember, this is the standard way to write code in Haskell, so the compiler is tuned to deal with it.)
Even if it can't be optimised away, laziness often makes such code fairly efficient anyway.
(In this specific example, the OR function will terminate the loop as soon as a match is seen - just like what would happen if you hand-coded it yourself.)
As a general rule, write code by gluing together pre-existing functions. Change this only if performance isn't good enough.
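A small sketch of the laziness point, using the elem definition from above under the made-up name myElem so that it does not clash with the Prelude. Because (||) does not look at its second argument once the first is True, the fold stops at the first match, even on an infinite list.

```haskell
-- Same definition as the reference elem above, renamed to avoid a clash.
myElem :: Eq x => x -> [x] -> Bool
myElem x = foldr (||) False . map (x ==)

-- Terminates even though the list is infinite: the fold stops at 3.
foundInInfinite :: Bool
foundInInfinite = myElem (3 :: Int) [1 ..]
```

Searching an infinite list for a value that never appears would, of course, still run forever.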

This is my amateurish solution to a similar problem. Count the number of negative integers in a list l
nOfNeg l = length(filter (<0) l)
main = print(nOfNeg [0,-1,-2,1,2,3,4] ) --2

No, there isn't!
As of 2020, there is indeed no such idiom in the Haskell standard library yet! One could (and should) however insert an idiom howMany (resembling good old any)
howMany p xs = sum [ 1 | x <- xs, p x ]
-- howMany=(length.).filter
main = print $ howMany (/=0) [0..9]
Try howMany=(length.).filter

I'd do it manually:
howmany :: (a -> Bool) -> [a] -> Int
howmany _ [] = 0
howmany pred (x:xs) = if pred x then 1 + howmany pred xs
                      else howmany pred xs

Is it recommended to always have exhaustive pattern matches in Haskell, even for "impossible" cases?

For example, in the following code, I am pattern matching on the "accumulator" of a foldr. I am in complete control of the contents of the accumulator, because I create it (it is not passed to me as input, but rather built within my function). Therefore, I know certain patterns should never match it. If I strive to never get the "Pattern match(es) are non-exhaustive" warning, then I would add a pattern for the impossible case that simply calls error with the message "This pattern should never happen", much like an assert in C#. I can't think of anything else to do there.
What practice would you recommend in this situation and why?
Here's the code:
gb_groupBy p input = foldr step [] input
   where
      step item acc = case acc of
           []           -> [[item]]
           ((x:xs):ys)  -> if p x item
                           then (item:x:xs):ys
                           else [item]:acc
The pattern not matched (as reported by the interpreter) is:
Warning: Pattern match(es) are non-exhaustive
In a case alternative: Patterns not matched: [] : _

This is probably more a matter of style than anything else. Personally, I would put in a
_ -> error "Impossible! Empty list in step"
if only to silence the warning :)

You can resolve the warning in this special case by doing this:
gb_groupBy p input = foldr step [] input
   where
      step item acc = case acc of
           []        -> [[item]]
           (xs:xss)  -> if p (head xs) item
                        then (item:xs):xss
                        else [item]:acc
The pattern matching is then complete, and the "impossible" condition of an empty list at the head of the accumulator would cause a runtime error but no warning.
Another way of looking at the more general problem of incomplete pattern matchings is to see them as a "code smell", i.e. an indication that we're trying to solve a problem in a suboptimal, or non-Haskellish, way, and try to rewrite our functions.
Implementing groupBy with a foldr makes it impossible to apply it to an infinite list, which is a design goal that the Haskell List functions try to achieve wherever semantically reasonable. Consider
take 5 $ groupBy (==) someFunctionDerivingAnInfiniteList
If the first 5 groups w.r.t. equality are finite, lazy evaluation will terminate. This is something you can't do in a strictly evaluated language. Even if you don't work with infinite lists, writing functions like this will yield better performance on long lists, or avoid the stack overflow that occurs when evaluating expressions like
take 5 $ gb_groupBy (==) [1..1000000]
In List.hs, groupBy is implemented like this:
groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy _ [] = []
groupBy eq (x:xs) = (x:ys) : groupBy eq zs
    where (ys,zs) = span (eq x) xs
This enables the interpreter/compiler to evaluate only the parts of the computation necessary for the result.
span yields a pair of lists, where the first consists of (consecutive) elements from the head of the list all satisfying a predicate, and the second is the rest of the list. It's also implemented to work on infinite lists.
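To see the laziness in action, here is a small sketch that applies groupBy from Data.List to an infinite list. The name firstGroups is made up for illustration.

```haskell
import Data.List (groupBy)

-- cycle [1, 1, 2] is the infinite list 1, 1, 2, 1, 1, 2, ...
-- Only as much of it is inspected as the first three groups require.
firstGroups :: [[Int]]
firstGroups = take 3 (groupBy (==) (cycle [1, 1, 2]))
```

Here firstGroups evaluates to [[1,1],[2],[1,1]]. The foldr-based gb_groupBy would not terminate on the same input, because step inspects the accumulator before producing anything.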

I find exhaustiveness checking on case patterns indispensable. I try never to use _ in a case at top level, because _ matches everything, and by using it you vitiate the value of exhaustiveness checking. This is less important with lists but critically important with user-defined algebraic data types, because I want to be able to add a new constructor and have the compiler barf on all the missing cases. For this reason I always compile with -Werror turned on, so there is no way I can leave out a case.
As observed, your code can be extended with this case
[] : _ -> error "this can't happen"
Internally, GHC has a panic function, which unlike error will give source coordinates, but I looked at the implementation and couldn't make head or tail of it.

To follow up on my earlier comment, I realised that there is a way to acknowledge the missing case but still get a useful error with file/line number. It's not ideal as it'll only appear in unoptimized builds, though (see here).
...
[]:xs -> assert False (error "unreachable because I know everything")

The type system is your friend, and the warning is letting you know your function has cracks. The very best approach is to go for a cleaner, more elegant fit between types.
Consider ghc's definition of groupBy:
groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy _ [] = []
groupBy eq (x:xs) = (x:ys) : groupBy eq zs
    where (ys,zs) = span (eq x) xs
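Another way to get a closer fit between types, sketched here as an assumption rather than as anything proposed in this thread, is to keep the fold from the question but make each group a NonEmpty list from Data.List.NonEmpty. An empty group then cannot be constructed, so the case expression is exhaustive without an error branch. The names groupByNE and groupByList are made up.

```haskell
import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.List.NonEmpty as NE

-- Same fold as gb_groupBy, but every group is non-empty by construction.
groupByNE :: (a -> a -> Bool) -> [a] -> [NonEmpty a]
groupByNE p = foldr step []
  where
    step item acc = case acc of
        [] -> [item :| []]
        ((x :| xs) : ys)
            | p x item -> (item :| (x : xs)) : ys
            | otherwise -> (item :| []) : acc

-- Convert back to plain lists when the caller wants [[a]].
groupByList :: (a -> a -> Bool) -> [a] -> [[a]]
groupByList p = map NE.toList . groupByNE p
```

This removes the warning, but it is still a foldr that inspects the accumulator first, so unlike the span-based groupBy above it does not work on infinite lists.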

My point of view is that an impossible case is undefined.
If it's undefined we have a function for it: the cunningly named undefined.
Complete your matching with the likes of:
_ -> undefined
And there you have it!
