Sublist by boolean pattern - haskell

I need to extract the elements at odd positions from a list. I found nothing for this in the Data.List library, so I wrote the following functions. I would like to know whether there is a library that contains these and similar functions, and whether my functions can be refactored significantly. Thanks.
extractByPattern p l = extractByPatternRaw bp l
  where
    bp = map (== 't') p
extractByPatternRaw p l = foldr select [] coupledList
  where
    coupledList = zip (concat . repeat $ p) l
    select (b,x) acc
      | b = x : acc
      | otherwise = acc
oddPos = extractByPattern "tf"
-- ex. oddPos [1..20] == [1,3,5,7,9,11,13,15,17,19]
everyTwoAndFivePos = extractByPattern "ftfft"
-- ex. everyTwoAndFivePos [1..20] == [2,5,7,10,12,15,17,20]

As an alternative:
λ map fst $ filter snd $ zip [1..20] $ cycle . map (== 't') $ "ftfft"
[2,5,7,10,12,15,17,20]
So you could do something like the following:
extractByPattern pattern list = map fst $ filter snd $ zip list $ cycle . map (== 't') $ pattern
Nothing jumps out in Hoogle for [Bool] -> [a] -> [a] or [a] -> [Bool] -> [a], which would save the zip-filter-snd-map-fst hoop-jumping.
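Since no ready-made [Bool] -> [a] -> [a] function turned up, here is a minimal sketch of one. The name selectByMask is hypothetical (it is not from any library), and the guard for an empty mask is an added assumption, because cycle fails on an empty list.

```haskell
-- Hypothetical helper: keep the elements whose mask entry is True.
-- The mask is repeated with cycle; an empty mask selects nothing.
selectByMask :: [Bool] -> [a] -> [a]
selectByMask []   _  = []
selectByMask mask xs = [x | (x, True) <- zip xs (cycle mask)]

-- Same interface as in the question: 't' marks a position to keep.
extractByPattern :: String -> [a] -> [a]
extractByPattern = selectByMask . map (== 't')
```

The pattern (x, True) in the comprehension does the filtering, so the separate filter snd and map fst steps are not needed.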

Related

replace character to number in haskell

I have a function change which replaces some characters with numbers. Here it is:
change [] = []
change (x:xs) | x == 'A' = '9':'9':change xs
              | x == 'B' = '9':'8':change xs
              | otherwise = change xs
and the output is:
Main> change "aAB11s"
"9998"
but I need this:
Main> change "aAB11s"
"a999811s"
How can I do this?
Try this:
change [] = []
change (x:xs) | x == 'A' = '9':'9':change xs
              | x == 'B' = '9':'8':change xs
              | otherwise = x:change xs
The only change is in otherwise.
In addition to @kostya's answer, you don't need to write the recursive part yourself. Try this out:
change :: String -> String
change xs = concatMap chToStr xs
  where chToStr 'A' = "99"
        chToStr 'B' = "98"
        chToStr x = [x]
or, more point-freely (actually this is preferred if the point-free refactoring doesn't hurt the readability):
change :: String -> String
change = concatMap chToStr
  where chToStr 'A' = "99"
        chToStr 'B' = "98"
        chToStr x = [x]
And you can test the result:
λ> change "aAB11s"
"a999811s"
Some explanation:
It's tempting to do an elementwise replacement by passing map a function f :: Char -> Char. But here you can't do that, because for A you want two characters, i.e. 99, so the function you want is of type Char -> String (String and [Char] are equivalent in Haskell), which does not fit the type signature.
So the solution is to also wrap the other characters we don't care about into lists; afterwards we can concatenate the pieces (this function is called concat in Haskell) to get a string back.
Further, concatMap f xs is just a shorthand for concat (map f xs)
λ> map (\x -> [x,x]) [1..10]
[[1,1],[2,2],[3,3],[4,4],[5,5],[6,6],[7,7],[8,8],[9,9],[10,10]]
λ> concat (map (\x -> [x,x]) [1..10])
[1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10]
λ> concatMap (\x -> [x,x]) [1..10]
[1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10]
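If more replacements get added later, the same concatMap idea can be driven by a lookup table. This is only a sketch: the table name replacements is hypothetical, and it assumes every character missing from the table should be kept unchanged.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical replacement table; extend it with more pairs as needed.
replacements :: [(Char, String)]
replacements = [('A', "99"), ('B', "98")]

-- Characters missing from the table are kept unchanged.
change :: String -> String
change = concatMap (\c -> fromMaybe [c] (lookup c replacements))
```

With this table, change "aAB11s" gives "a999811s", the same result as the pattern-matching version.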

How can I efficiently filter a pair of lists in Haskell?

I'm working on a simple problem on Programming Praxis: remove all duplicates from a list without changing the order. Assuming the elements are in class Ord, I came up with the following:
import Data.Set (Set)
import qualified Data.Set as Set
buildsets::Ord a => [a] -> [Set a]
buildsets = scanl (flip Set.insert) Set.empty
nub2::Ord a => [a] -> [a]
nub2 thelist = map fst $ filter (not . uncurry Set.member) (zip thelist (buildsets thelist))
As you can see, the buildsets function gets me most of the way there, but that last step (nub2) of putting everything together looks absolutely horrible. Is there a cleaner way to accomplish this?
Since we have to filter the list and we should probably use some set to keep records, we might as well use filterM with the state monad:
import qualified Data.Set as S
import Control.Monad.State.Strict
nub2 :: Ord a => [a] -> [a]
nub2 = (`evalState` S.empty) . filterM go where
  go x = state $ \s -> if S.member x s
    then (False, s)
    else (True, S.insert x s)
If I wanted to golf the function somewhat, I'd do the following:
import Control.Arrow ((&&&))
nub2 = (`evalState` S.empty) . filterM (\x -> state (S.notMember x &&& S.insert x))
Simple recursion looks ok to me.
> g xs = go xs S.empty where
>   go [] _ = []
>   go (x:xs) a | S.member x a = go xs a
>               | otherwise = x:go xs (S.insert x a)
Based directly on Sassa NF's suggestion, but with a slight type change for cleanliness:
g x = catMaybes $ unfoldr go (Set.empty, x)
  where
    go (_,[]) = Nothing
    go (s,(x:xs)) = Just (if Set.member x s then Nothing else Just x,
                          (Set.insert x s, xs))
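The same "carry a set of seen elements, emit Maybe values, then drop the Nothings" idea can also be sketched with mapAccumL from Data.List. The name ordNub is an assumption made for this example, not something taken from the answers above.

```haskell
import Data.List (mapAccumL)
import Data.Maybe (catMaybes)
import qualified Data.Set as Set

-- Thread the set of seen elements through the list; emit Just x only
-- the first time x appears, then drop the Nothings.
ordNub :: Ord a => [a] -> [a]
ordNub = catMaybes . snd . mapAccumL step Set.empty
  where
    step seen x
      | x `Set.member` seen = (seen, Nothing)
      | otherwise           = (Set.insert x seen, Just x)
```

For example, ordNub [3,1,3,2,1] keeps the first occurrence of each element and yields [3,1,2].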
Sometimes it really cleans up code to pull out and name subpieces. (In some ways this really is the Haskell way to comment code)
This is wordier than what you did above, but I think it is much easier to understand....
First I start with some definitions:
type Info=([Int], S.Set Int) --This is the remaining and seen items at a point in the list
item=head . fst --The current item
rest=fst --Future items
seen=snd --The items already seen
Then I add two self descriptive helper functions:
itemHasBeenSeen::Info->Bool
itemHasBeenSeen info = item info `S.member` seen info
moveItemToSet::Info->Info
moveItemToSet info = (tail $ rest info, item info `S.insert` seen info)
With this the program becomes:
nub2::[Int]->[Int]
nub2 theList =
  map item
  $ filter (not . itemHasBeenSeen)
  $ takeWhile (not . null . rest)
  $ iterate moveItemToSet start
  where start = (theList, S.empty)
Reading from bottom to top (just as the data flows), you can easily see what is happening:
start=(theList, S.empty), start with the full list, and an empty set.
iterate moveItemToSet start, repeatedly move the first item of the list into the set, saving each iteration of Info in a list.
takeWhile (not . null . rest)- Stop the iteration when you run out of elements.
filter (not . itemHasBeenSeen)- Remove items that have already been seen.
map item- Throw away the helper values....

Haskell - Most frequent value

How can I get the most frequent value in a list? Example:
[1,3,4,5,6,6] -> output 6
[1,3,1,5] -> output 1
I'm trying to do it with my own functions but I can't get it to work. Can you help me?
My code:
del x [] = []
del x (y:ys) = if x /= y
  then y:del x y
  else del x ys
obj x []= []
obj x (y:ys) = if x== y then y:obj x y else(obj x ys)
tam [] = 0
tam (x:y) = 1+tam y
fun (n1:[]) (n:[]) [] =n1
fun (n1:[]) (n:[]) (x:s) =if (tam(obj x (x:s)))>n then fun (x:[]) ((tam(obj x (x:s))):[]) (del x (x:s)) else(fun (n1:[]) (n:[]) (del x (x:s)))
rep (x:s) = fun (x:[]) ((tam(obj x (x:s))):[]) (del x (x:s))
Expanding on Satvik's last suggestion, you can use (&&&) :: (b -> c) -> (b -> c') -> (b -> (c, c')) from Control.Arrow (Note that I substituted a = (->) in that type signature for simplicity) to cleanly perform a decorate-sort-undecorate transform.
mostCommon list = fst . maximumBy (compare `on` snd) $ elemCount
  where elemCount = map (head &&& length) . group . sort $ list
The head &&& length function has type [b] -> (b, Int). It converts a list into a tuple of its first element and its length, so when it is combined with group . sort you get a list of each distinct value in the list along with the number of times it occurred.
Also, you should think about what happens when you call mostCommon []. Clearly there is no sensible value, since there is no element at all. As it stands, all the solutions proposed (including mine) just fail on an empty list, which is not good Haskell. The normal thing to do would be to return a Maybe a, where Nothing indicates an error (in this case, an empty list) and Just a represents a "real" return value. e.g.
mostCommon :: Ord a => [a] -> Maybe a
mostCommon [] = Nothing
mostCommon list = Just ... -- your implementation here
This is much nicer, as partial functions (functions that are undefined for some input values) are horrible from a code-safety point of view. You can manipulate Maybe values using pattern matching (matching on Nothing and Just x) and the functions in Data.Maybe (preferably fromMaybe and maybe rather than fromJust).
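One way the Maybe-returning version could be completed is sketched below. It simply reuses the group . sort counting from above; how ties are broken is left to maximumBy and is not specified by the question.

```haskell
import Control.Arrow ((&&&))
import Data.List (group, maximumBy, sort)
import Data.Ord (comparing)

-- Total variant: Nothing for an empty list, Just the most frequent
-- element otherwise. Ties go to whichever element maximumBy picks.
mostCommon :: Ord a => [a] -> Maybe a
mostCommon [] = Nothing
mostCommon xs = Just . fst . maximumBy (comparing snd) $ counts
  where
    -- group never yields empty sublists, so head is safe here
    counts = map (head &&& length) . group . sort $ xs
```

A caller can then handle the empty case explicitly, for instance with maybe "no elements" show (mostCommon xs).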
In case you would like to get some ideas from code that does what you wish to achieve, here is an example:
import Data.List (nub, maximumBy)
import Data.Function (on)
mostCommonElem list = fst $ maximumBy (compare `on` snd) elemCounts where
  elemCounts = nub [(element, count) | element <- list, let count = length (filter (==element) list)]
Here are a few suggestions:
del can be implemented using filter rather than writing your own recursion. In your definition there was a mistake, you needed to give ys and not y while deleting.
del x = filter (/=x)
obj is similar to del with different filter function. Similarly here in your definition you need to give ys and not y in obj.
obj x = filter (==x)
tam is just length function
-- tam = length
You don't need to keep a list for n1 and n. I have also made your code more readable, although I have not made any changes to your algorithm.
fun n1 n [] = n1
fun n1 n xs@(x:s) | length (obj x xs) > n = fun x (length $ obj x xs) (del x xs)
                  | otherwise = fun n1 n $ del x xs
rep xs@(x:s) = fun x (length $ obj x xs) (del x xs)
Another way, not very optimal but much more readable is
import Data.List
import Data.Ord
rep :: Ord a => [a] -> a
rep = head . head . sortBy (flip $ comparing length) . group . sort
I will try to explain in short what this code is doing. You need to find the most frequent element of the list so the first idea that should come to mind is to find frequency of all the elements. Now group is a function which combines adjacent similar elements.
> group [1,2,2,3,3,3,1,2,4]
[[1],[2,2],[3,3,3],[1],[2],[4]]
So I have used sort to bring equal elements next to each other
> sort [1,2,2,3,3,3,1,2,4]
[1,1,2,2,2,3,3,3,4]
> group . sort $ [1,2,2,3,3,3,1,2,4]
[[1,1],[2,2,2],[3,3,3],[4]]
Finding element with the maximum frequency just reduces to finding the sublist with largest number of elements. Here comes the function sortBy with which you can sort based on given comparing function. So basically I have sorted on length of the sublists (The flip is just to make the sorting descending rather than ascending).
> sortBy (flip $ comparing length) . group . sort $ [1,2,2,3,3,3,1,2,4]
[[2,2,2],[3,3,3],[1,1],[4]]
Now you can just take head two times to get the element with the largest frequency.
Let's assume you already have an argmax function. You can write your own or, even better, reuse the list-extras package. I strongly suggest you take a look at the package anyway.
Then, it's quite easy:
import Data.List.Extras.Argmax ( argmax )
-- >> mostFrequent [3,1,2,3,2,3]
-- 3
mostFrequent xs = argmax f xs
  where f x = length $ filter (==x) xs

Implement a function to count frequency of each element in a list

I am trying to write a program which counts the frequency of each element in a list.
In: "aabbcabb"
Out: [("a",3),("b",4),("c",1)]
You can view my code in the following link: http://codepad.org/nyIECIT2
In this code the output of unique function would be like this
In: "aabbcabb"
Out: "abc"
Using the output of unique we will count the frequencies in the target list.
You can see the code here also:
frequencyOfElt xs=ans
where ans=countElt(unique xs) xs
unique []=[]
unique xs=(head xs):(unique (filter((/=)(head xs))xs))
countElt ref target=ans'
where ans'=zip ref lengths
lengths=map length $ zipWith($)(map[(=='a'),(==',b'),(==',c')](filter.(==))ref)(repeat target)
Error:Syntax error in input (unexpected symbol "unique")
But in GHCi 6.13 other kinds of errors show up as well.
A few people asked me what the purpose of [(=='a'),(==',b'),(==',c')] is.
What I expect: if ref="abc" and target="aabbaacc"
then
zipWith($) (map filter ref)(repeat target)
will show ["aaaa","bb","cc"], and then I can map length over this to get the frequencies.
Here I use [(=='a'),(==',b'),(==',c')] to filter the list according to ref.
I assume some logical error lies in [(=='a'),(==',b'),(==',c')].
You didn't say whether you want to write it whole on your own, or whether it's OK to compose it from some standard functions.
import Data.List
g s = map (\x -> ([head x], length x)) . group . sort $ s
-- g = map (head &&& length) . group . sort -- without the [...]
is the standard quick-n-dirty way to code it.
OK, so your original idea was to Code it Point-Free Style (certain tune playing in my head...):
frequencyOfElt :: (Eq a) => [a] -> [(a,Int)]
frequencyOfElt xs = countElt (unique xs) xs -- change the result type
  where
    unique [] = []
    unique (x:xs) = x : unique (filter (/= x) xs)
    countElt ref target = -- Code it Point-Free Style (your original idea)
      zip
        ref $ -- your original type would need (map (:[]) ref) here
          map length $
            zipWith ($) -- ((filter . (==)) c) === (filter (== c))
              (zipWith ($) (repeat (filter . (==))) ref)
              (repeat target)
I've changed the type here to the more reasonable [a] -> [(a,Int)] btw. Note, that
zipWith ($) fs (repeat z) === map ($ z) fs
zipWith ($) (repeat f) zs === map (f $) zs === map f zs
hence the code simplifies to
countElt ref target =
  zip
    ref $
      map length $
        map ($ target)
          (zipWith ($) (repeat (filter . (==))) ref)
and then
countElt ref target =
  zip
    ref $
      map length $
        map ($ target) $
          map (filter . (==)) ref
but map f $ map g xs === map (f.g) xs, so
countElt ref target =
  zip
    ref $
      map (length . ($ target) . filter . (==)) ref -- (1)
which is a bit clearer (for my taste) written with a list comprehension,
countElt ref target =
    [ (c, (length . ($ target) . filter . (==)) c) | c <- ref]
 == [ (c, length ( ($ target) ( filter (== c)))) | c <- ref]
 == [ (c, length $ filter (== c) target) | c <- ref]
Which gives us an idea to re-write (1) further as
countElt ref target =
  zip <*> map (length . (`filter` target) . (==)) $ ref
but this obsession with point-free code becomes pointless here.
So going back to the readable list comprehensions, using a standard nub function which is equivalent to your unique, your idea becomes
import Data.List
frequencyOfElt xs = [ (c, length $ filter (== c) xs) | c <- nub xs]
This algorithm is actually quadratic (~ n^2), so it is worse than the first version above which is dominated by sort i.e. is linearithmic (~ n log(n)).
This code though can be manipulated further by a principle of equivalent transformations:
= [ (c, length . filter (== c) $ sort xs) | c <- nub xs]
... because searching in a list is the same as searching in a list, sorted. Doing more work here -- will it pay off?..
= [ (c, length . filter (== c) $ sort xs) | (c:_) <- group $ sort xs]
... right? But now, group had already grouped them by (==), so there's no need for the filter call to repeat the work already done by group:
= [ (c, length . get c . group $ sort xs) | (c:_) <- group $ sort xs]
    where get c gs = fromJust . find ((== c).head) $ gs
= [ (c, length g) | g@(c:_) <- group $ sort xs]
= [ (head g, length g) | g <- group (sort xs)]
= (map (head &&& length) . group . sort) xs
isn't it? And here it is, the same linearithmic algorithm from the beginning of this post, actually derived from your code by factoring out its hidden common computations, making them available for reuse and code simplification.
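As an alternative to sorting and grouping, the counts can also be accumulated in a map. The sketch below assumes Data.Map.Strict from the containers package; the name frequencies is hypothetical. It is also linearithmic, and like the group . sort version it returns the pairs ordered by element rather than by first appearance.

```haskell
import qualified Data.Map.Strict as Map

-- Count occurrences with one insertion per element, O(n log n) overall.
-- The result is ordered by key, not by first appearance.
frequencies :: Ord a => [a] -> [(a, Int)]
frequencies xs = Map.toList (Map.fromListWith (+) [(x, 1) | x <- xs])
```

For the input from the question, frequencies "aabbcabb" gives [('a',3),('b',4),('c',1)].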
Using multiset-0.1:
import Data.MultiSet
freq = toOccurList . fromList

Algorithm - How to delete duplicate elements in a Haskell list

I'm having a problem creating a function similar to the nub function.
I need this function to remove duplicated elements from a list.
Two elements are duplicates when they have the same email, and the function should keep the newer one (the one closer to the end of the list).
type Regist = [name,email,,...,date]
type ListRe = [Regist]
rmDup :: ListRe -> ListRe
rmDup [] = []
rmDup [a] = [a]
rmDup (h:t) | isDup h (head t) = rmDup t
            | otherwise = h : rmDup t
isDup :: Regist -> Regist -> Bool
isDup (a:b:c:xs) (d:e:f:ts) = b==e
The problem is that the function doesn't delete duplicated elements unless they are together in the list.
Just use nubBy, and specify an equality function that compares things the way you want.
And I guess reverse the list a couple of times if you want to keep the last element instead of the first.
Slightly doctored version of your original code to make it run:
type Regist = [String]
type ListRe = [Regist]
rmDup :: ListRe -> ListRe
rmDup [] = []
rmDup (x:xs) = x : rmDup (filter (\y -> not(x == y)) xs)
Result:
*Main> rmDup [["a", "b"], ["a", "d"], ["a", "b"]]
[["a","b"],["a","d"]]
Anon is correct: nubBy is the function you are looking for, and can be found in Data.List.
That said, you want a function rem which accepts a function f :: a -> a -> Bool (by which elements are compared for removal) and a list xs. The name clashes with Prelude's rem, so hide that one when defining it. Since the definition is recursive, you need a base case and a recursive case.
In the base case xs = [] and rem f xs = [], since the result of removing all duplicate elements from [] is []:
import Prelude hiding (rem)
rem :: Eq a => (a -> a -> Bool) -> [a] -> [a]
rem f [] = []
In the recursive case, xs = (a:as). Let as' be the list obtained by removing from as all elements a' such that f a a' = True. This is simply the function filter (\a' -> not $ f a a') applied to the list as. Then rem f (a:as) is a followed by the result of recursively calling rem f on as', that is, a : rem f as':
rem f (a:as) = a : rem f (filter (\a' -> not $ f a a') as)
Replace f with a function comparing your list elements for the appropriate equality (e-mail addresses).
While nubBy with two reverse's is probably the best among simple solutions (and probably exactly what Justin needs for his task), one should not forget that it isn't the ideal solution in terms of efficiency - after all nubBy is O(n^2) (in the "worst case" - when there are no duplicates). Two reverse's will also take their toll (in the form of memory allocation).
For a more efficient implementation, Data.Set (O(log N) per insert) can be used as an intermediate holder of the latest non-duplicate elements (S.insert replaces the older element with the newer one if there is a collision):
import Data.List
import Data.Function
import qualified Data.Set as S
newtype Regis i e = Regis { toTuple :: (i,[e]) }
selector (Regis (_,(_:a:_))) = a
instance Eq e => Eq (Regis i e) where
(==) = (==) `on` selector
instance Ord e => Ord (Regis i e) where
compare = compare `on` selector
rmSet xs = map snd . sortBy (compare `on` fst) . map toTuple . S.toList $ set
  where
    set = foldl' (flip (S.insert . Regis)) S.empty (zip [1..] xs)
While nubBy implementation is definitely much simpler:
rmNub xs = reverse . nubBy ((==) `on` (!!1)) . reverse $ xs
On a list of 10M elements (with lots of duplication, so nub should do well here) there is a 3x difference in running time and a 700x difference in memory usage, compiled with GHC at -O2:
input = take 10000000 $ map (take 10) $ permutations [1..]
test1 = rmNub input
test2 = rmSet input
Not sure about the nature of the author's data though (the real data might change the picture).
(Assuming you want to figure out an answer, not just call a library function that does this job for you.)
You get what you ask for. What if h is not equal to head t but is instead equal to the 3rd element of t? You need to write an algorithm that compares h with every element of t, not just the first element.
Why not put everything in a Map from email to Regist (respecting your "keep the newest" rule, of course), and then transform the values of the map back into a list? That's the most efficient way I can think of.
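A minimal sketch of that Map idea follows. It assumes that a Regist is a list of strings with the email as its second field, and that the original relative order of the surviving records should be kept; the names email and rmDupByEmail are hypothetical.

```haskell
import Data.List (sortOn)
import qualified Data.Map.Strict as Map

type Regist = [String]  -- assumed layout: [name, email, ...]

-- Assumption: the email is the second field of every record
-- (this fails on records with fewer than two fields).
email :: Regist -> String
email r = r !! 1

-- Key each record by email; for fromList, a later entry with the same
-- key replaces an earlier one, so the newest record wins. The index is
-- kept to restore the original relative order afterwards.
rmDupByEmail :: [Regist] -> [Regist]
rmDupByEmail rs =
  map snd . sortOn fst . Map.elems $
    Map.fromList [(email r, (i, r)) | (i, r) <- zip [0 :: Int ..] rs]
```

If the output order does not matter, the index and the sortOn step can be dropped and Map.elems used directly.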
I used Alexei Polkhanov's answer and arrived at the following, which removes duplicates from a list of lists whose element type is an instance of the Eq class.
removeDuplicates :: Eq a => [[a]] -> [[a]]
removeDuplicates [] = []
removeDuplicates (x:xs) = x : removeDuplicates (filter (\y -> not (x == y)) xs)
Examples:
*Verdieping> removeDuplicates [[1],[2],[1],[1,2],[1,2]]
[[1],[2],[1,2]]
*Verdieping> removeDuplicates [["a","b"],["a"],["a","b"],["c"],["c"]]
[["a","b"],["a"],["c"]]
