Function to show the lowest represented element in a list - haskell

If you have a list such as this in Haskell:
data TestType = A | B | C deriving (Ord, Eq, Show)
List1 :: [TestType]
List1 = [A,B,C,B,C,A,B,C,C,C]
Is it possible to write a function to determin which element is represented the least in a list (so in this case 'A')
My initial thought was to write a helper function such as this but now I am not sure if this is the right approach:
appears :: TestType -> [TestType] -> Int
appears _ [] = 0
appears x (y:ys) | x==y = 1 + (appears x ys)
| otherwise = appears x ys
I am still fairly new to Haskell, so apologies for the potentially silly question.
Many thanks

Slightly alternative version to Matt's approach
import Data.List
import Data.Ord
leastFrequent :: Ord a => [a] -> a
leastFrequent = head . minimumBy (comparing length) . group . sort

You can build a map counting how often each item occurs in the list
import qualified Data.Map as Map
frequencies list = Map.fromListWith (+) $ zip list (repeat 1)
Then you can find the least/most represented using minimumBy or maximumBy from Data.List on the list of Map.assocs of the frequency map, or even sort it by frequency using sortBy.
module Frequencies where
import Data.Ord
import Data.List
import qualified Data.Map as Map
frequencyMap :: Ord a => [a] -> Map.Map a Int
frequencyMap list = Map.fromListWith (+) $ zip list (repeat 1)
-- Caution: leastFrequent will cause an error if called on an empty list!
leastFrequent :: Ord a => [a] -> a
leastFrequent = fst . minimumBy (comparing snd) . Map.assocs . frequencyMap
ascendingFrequencies :: Ord a => [a] -> [(a,Int)]
ascendingFrequencies = sortBy (comparing snd) . Map.assocs . frequencyMap

Here's another way to do it:
sort the list
group the list
find the length of each group
return the group with the shortest length
Example:
import GHC.Exts
import Data.List
fewest :: (Eq a) => [a] -> a
fewest xs = fst $ head sortedGroups
where
sortedGroups = sortWith snd $ zip (map head groups) (map length groups)
groups = group $ sort xs

A less elegant idea would be:
At first sort and group the list
then pairing the cases with their number of representations
at last sort them relative to their num of representations
In code this looks like
import Data.List
sortByRepr :: (Ord a) => [a] ->[(a,Int)]
sortByRepr xx = sortBy compareSnd $ map numOfRepres $ group $ sort xx
where compareSnd x y = compare (snd x) (snd y)
numOfRepres x = (head x, length x)
the least you get by applying head to the resulting list.

Related

Removing duplicate elements in a Seq

wondering how to implement nub over a Seq a
I get that one could do:
nubSeq :: Seq a -> Seq a
nubSeq = fromList . nub . toList
Just wondering is there something standard that does not convert to Lists in order to call nub :: [a]->[a]?
An implementation that occurred to me, based obviously on nub, is:
nubSeq :: (Eq a) => Seq a -> Seq a
nubSeq = Data.Sequence.foldrWithIndex
(\_ x a -> case x `Data.Sequence.elemIndexR` a of
Just _ -> a
Nothing -> a |> x) Data.Sequence.empty
But there must be something more elegant?
thanks.
Not sure whether this qualifies as more elegant but it splits the concerns in independent functions (caveat: you need an Ord constraint on a):
seqToNubMap takes a Seq and outputs a Map associating to each a the smallest index at which it appeared in the sequence
mapToList takes a Map of values and positions and produces a list of values in increasing order according to the specified positions
nubSeq combines these to generate a sequence without duplicates
The whole thing should be O(n*log(n)), I believe:
module NubSeq where
import Data.Map as Map
import Data.List as List
import Data.Sequence as Seq
import Data.Function
seqToNubMap :: Ord a => Seq a -> Map a Int
seqToNubMap = foldlWithIndex (\ m k v -> insertWith min v k m) Map.empty
mapToList :: Ord a => Map a Int -> [a]
mapToList = fmap fst . List.sortBy (compare `on` snd) . Map.toList
nubSeq :: Ord a => Seq a -> Seq a
nubSeq = Seq.fromList . mapToList . seqToNubMap
Or a simpler alternative following #DavidFletcher's comment:
nubSeq' :: forall a. Ord a => Seq a -> Seq a
nubSeq' xs = Fold.foldr cons nil xs Set.empty where
cons :: a -> (Set a -> Seq a) -> (Set a -> Seq a)
cons x xs seen
| x `elem` seen = xs seen
| otherwise = x <| xs (Set.insert x seen)
nil :: Set a -> Seq a
nil _ = Seq.empty
Another way with an Ord constraint - use a scan to make the sets of
elements that appear in each prefix of the list. Then we can filter out
any element that's already been seen.
import Data.Sequence as Seq
import Data.Set as Set
nubSeq :: Ord a => Seq a -> Seq a
nubSeq xs = (fmap fst . Seq.filter (uncurry notElem)) (Seq.zip xs seens)
where
seens = Seq.scanl (flip Set.insert) Set.empty xs
Or roughly the same thing as a mapAccumL:
nubSeq' :: Ord a => Seq a -> Seq a
nubSeq' = fmap fst . Seq.filter snd . snd . mapAccumL f Set.empty
where
f s x = (Set.insert x s, (x, x `notElem` s))
(If I was using lists I would use Maybes instead of the pairs with
Bool, then use catMaybes instead of filtering. There doesn't seem to be catMaybes
for Sequence though.)
I think your code should be pretty efficient. Since Sequences are tree data structures using another tree type data structure like Map or HashMap to store and lookup the previous items doesn't make too much sense to me.
Instead i take the first item and check it's existence in the rest. If exists i drop that item and proceed the same with the rest recursively. If not then construct a new sequence with first element is the unique element and the rest is the result of nubSeq fed by the rest. Should be typical. I use ViewPatterns.
{-# LANGUAGE ViewPatterns #-}
import Data.Sequence as Seq
nubSeq :: Eq a => Seq a -> Seq a
nubSeq (viewl -> EmptyL) = empty
nubSeq (viewl -> (x :< xs)) | elemIndexL x xs == Nothing = x <| nubSeq xs
| otherwise = nubSeq xs
*Main> nubSeq . fromList $ [1,2,3,4,4,2,3,6,7,1,2,3,4]
fromList [6,7,1,2,3,4]

Filter Duplicate elements from a [[String]] Haskell

I have a list in the form [["A1","A1","A1"] .. ["G3","G3","G3"]] which contains many duplicate elements like ["A1","A2","A3"] and ["A3","A2","A1"].
How do I filter out such duplicate elements?
if check the above two elements for equality, it shows false
*Main> ["A1","A2","A3"] == ["A3","A2","A1"]
False
nubBy :: (a -> a -> Bool) -> [a] -> [a] is a relevant function that removes duplicates from a list via an arbitrary equality test.
A version of the function you're looking for is:
import Data.List (sort, nubBy)
removeDuplicates' :: Ord a => [[a]] -> [[a]]
removeDuplicates' = nubBy (\l1 l2 = sort l1 == sort l2)
Of course, this does require that a is an Ord, not just an Eq, as well as using sort, which is (as stated below) an expensive function. So it is certainly not ideal. However, I don't know specifically how you want to do the equality tests on those lists, so I'll leave the details to you.
#AJFarmar's answer solves the issue. But it can be done a bit more efficient: since sort is an expensive function. We want to save on such function calls.
We can use:
import Data.List(nubBy, sort)
import Data.Function(on)
removeDuplicates' :: Ord a => [[a]] -> [[a]]
removeDuplicates' = map snd . nubBy ((==) `on` fst) . map ((,) =<< sort)
what we here do is first construct a map ((,) =<< sort). This means that for every element x in the original list, we construct a tuple (sort x,x). Now we will perform a nubBy on the first elements of the two tuples we want to sort. After we have sorted, we will perform a map snd where we - for every tuple (sort x,x) return the second item.
We can generalize this by constructing a nubOn function:
import Data.List(nubBy)
import Data.Function(on)
nubOn :: Eq b => (a -> b) -> [a] -> [a]
nubOn f = map snd . nubBy ((==) `on` fst) . map ((,) =<< f)
In that case removeDuplicates' is nubOn sort.
You may not even need to sort. You just need to see if all items are the same like;
\xs ys -> length xs == (length . filter (== True) $ (==) <$> xs <*> ys)
you just need to know that (==) <$> ["A1","A2","A3"] <*> ["A3","A2","A1"] would in fact return [False,False,True,False,True,False,True,False,False]
As per #rampion 's rightful comment let's take it further and import Data.Set then it gets pretty dandy.
import Data.Set as S
equity :: Ord a => [a] -> [a] -> Bool
equity = (. S.fromList) . (==) . S.fromList
*Main> equity ["A1","A2","A3"] ["A3","A2","A1"]
True

Haskell: change all indices from a list to some value

If I am given a list of objects and another list for some indices from this list, is there an easy way to change every object in this list with an index from the list of indices to a different value?
E.g. I am hoping there exists some function f such that
f 0 [4,2,5] [6,5,8,4,3,6,2,7]
would output
[6,5,0,4,0,0,2,7]
Here is a beautiful version that uses lens:
import Control.Lens
f :: a -> [Int] -> [a] -> [a]
f x is = elements (`elem` is) .~ x
Here is an efficient version that doesn't have any dependencies other than base. Basically, we start by sorting (and removing duplicates from the) indices list. That way, we don't need to scan the whole list for every replacement.
import Data.List
f :: a -> [Int] -> [a] -> [a]
f x is xs = snd $ mapAccumR go is' (zip xs [1..])
where
is' = map head . group . sort $ is
go [] (y,_) = ([],y)
go (i:is) (y,j) = if i == j then (is,x) else (i:is,y)
You can define a helper function to replace a single value and then use it to fold over your list.
replaceAll :: a -> [Int] -> [a] -> [a]
replaceAll repVal indices values = foldl (replaceValue repVal) values indices
where replaceValue val vals index = (take index vals) ++ [val] ++ (drop (index + 1) vals)
Sort the indices first. Then you can traverse the two lists in tandem.
{-# LANGUAGE ScopedTypeVariables #-}
import Prelude (Eq, Enum, Num, Ord, snd, (==), (<$>))
import Data.List (head, group, sort, zip)
f :: forall a. (Eq a, Enum a, Num a, Ord a) => a -> [a] -> [a] -> [a]
f replacement indices values =
go (head <$> group (sort indices)) (zip [0..] values)
where
go :: [a] -> [(a, a)] -> [a]
go [] vs = snd <$> vs
go _ [] = []
go (i:is) ((i', v):vs) | i == i' = replacement : go is vs
go is (v:vs) = snd v : go is vs
The sorting incurs an extra log factor on the length of the index list, but the rest is linear.

Haskell add 2 maps

How can I retain the value of list 2 in the next example to compute the sum of 2 maps in haskell?
listsSumm :: Eq a => Bag a -> Bag a -> Bag a
listsSumm [] [] = []
listsSumm bag1 bag2
| q1==q2 = (q1,v1+v2):(listsSumm rBag1 rBag2)
| otherwise = bagSum [(q1,v1)] rBag2
where ((q1,v1):rBag1) = bag1
((q2,v2):rBag2) = bag2
and my imput is:
listSumm [("a",1),("c",1),("d",1),("b",1)] [("a",2),("c",1),("b",1),("d",1)]->[("a",3),("c",2),("d",2)]
How can I retain the content of the second list to keep reevaluating it after a test has finished?
Since you seem to be working with Maps, what you're trying to do can be done quite straightforwardly:
import qualified Data.Map as M
sumOfMaps :: M.Map String Int -> M.Map String Int -> M.Map String Int
sumOfMaps = M.unionWith (+)
If you don't want to rely on Data.Map you can use the following solution to merge your lists:
import Data.Function ( on )
import Data.List ( groupBy, sortBy )
import Data.Ord ( compare )
sumOfLists :: [(String, Int)] -> [(String, Int)] -> [(String, Int)]
sumOfLists l1 l2 = map merge . collect $ l1 ++ l2
where collect = groupBy ((==) `on` fst) . sortBy (compare `on` fst)
merge xs#(x:_) = (fst x, sum $ map snd xs)

Nondeterminism for infinite inputs

Using lists to model nondeterminism is problematic if the inputs can take infinitely many values. For example
pairs = [ (a,b) | a <- [0..], b <- [0..] ]
This will return [(0,1),(0,2),(0,3),...] and never get around to showing you any pair whose first element is not 0.
Using the Cantor pairing function to collapse a list of lists into a single list can get around this problem. For example, we can define a bind-like operator that orders its outputs more intelligently by
(>>>=) :: [a] -> (a -> [b]) -> [b]
as >>>= f = cantor (map f as)
cantor :: [[a]] -> [a]
cantor xs = go 1 xs
where
go _ [] = []
go n xs = hs ++ go (n+1) ts
where
ys = filter (not.null) xs
hs = take n $ map head ys
ts = mapN n tail ys
mapN :: Int -> (a -> a) -> [a] -> [a]
mapN _ _ [] = []
mapN n f xs#(h:t)
| n <= 0 = xs
| otherwise = f h : mapN (n-1) f t
If we now wrap this up as a monad, we can enumerate all possible pairs
newtype Select a = Select { runSelect :: [a] }
instance Monad Select where
return a = Select [a]
Select as >>= f = Select $ as >>>= (runSelect . f)
pairs = runSelect $ do
a <- Select [0..]
b <- Select [0..]
return (a,b)
This results in
>> take 15 pairs
[(0,0),(0,1),(1,0),(0,2),(1,1),(2,0),(0,3),(1,2),(2,1),(3,0),(0,4),(1,3),(2,2),(3,1),(4,0)]
which is a much more desirable result. However, if we were to ask for triples instead, the ordering on the outputs isn't as "nice" and it's not even clear to me that all outputs are eventually included --
>> take 15 triples
[(0,0,0),(0,0,1),(1,0,0),(0,1,0),(1,0,1),(2,0,0),(0,0,2),(1,1,0),(2,0,1),(3,0,0),(0,1,1),(1,0,2),(2,1,0),(3,0,1),(4,0,0)]
Note that (2,0,1) appears before (0,1,1) in the ordering -- my intuition says that a good solution to this problem will order the outputs according to some notion of "size", which could be an explicit input to the algorithm, or could be given implicitly (as in this example, where the "size" of an input is its position in the input lists). When combining inputs, the "size" of a combination should be some function (probably the sum) of the size of the inputs.
Is there an elegant solution to this problem that I am missing?
TL;DR: It flattens two dimensions at a time, rather than flattening three at once. You can't tidy this up in the monad because >>= is binary, not ternary etc.
I'll assume you defined
(>>>=) :: [a] -> (a -> [b]) -> [b]
as >>>= f = cantor $ map f as
to interleave the list of lists.
You like that because it goes diagonally:
sums = runSelect $ do
a <- Select [0..]
b <- Select [0..]
return (a+b)
gives
ghci> take 36 sums
[0,1,1,2,2,2,3,3,3,3,4,4,4,4,4,5,5,5,5,5,5,6,6,6,6,6,6,6,7,7,7,7,7,7,7,7]
so it's pleasingly keeping the "sizes" in order, but the pattern appears to be broken for triples, and you doubt completeness, but you needn't. It's doing the same trick, but twice, rather than for all three at once:
triplePairs = runSelect $ do
a <- Select [0..]
b <- Select [0..]
c <- Select [0..]
return $ (a,(b,c))
The second pair is treated as a single source of data, so notice that:
ghci> map fst $ take 36 pairs
[0,0,1,0,1,2,0,1,2,3,0,1,2,3,4,0,1,2,3,4,5,0,1,2,3,4,5,6,0,1,2,3,4,5,6,7]
ghci> map fst $ take 36 triplePairs
[0,0,1,0,1,2,0,1,2,3,0,1,2,3,4,0,1,2,3,4,5,0,1,2,3,4,5,6,0,1,2,3,4,5,6,7]
and (adding some spaces/newlines for clarity of pattern):
ghci> map snd $ take 36 pairs
[0, 1,0, 2,1,0, 3,2,1,0, 4,3,2,1,0, 5,4,3,2,1,0, 6,5,4,3,2,1,0, 7,6,5,4,3,2,1,0]
ghci> map snd $ take 36 triplePairs
[(0,0), (0,1),(0,0), (1,0),(0,1),(0,0), (0,2),(1,0),(0,1),(0,0),
(1,1),(0,2),(1,0),(0,1),(0,0),
(2,0),(1,1),(0,2),(1,0),(0,1),(0,0),
(0,3),(2,0),(1,1),(0,2),(1,0),(0,1),(0,0),
(1,2),(0,3),(2,0),(1,1),(0,2),(1,0),(0,1),(0,0)]
so you can see it's using exactly the same pattern. This doesn't preserve total sums and it oughtn't because we're getting to three dimensions by flattening two dimensions first before flattening the third in. The pattern is obscured, but it's just as guaranteed to make it to the end of the list.
Sadly if you want to do three dimensions in a sum-preserving way, you'll have to write cantor2, cantor3 and cantor4 functions, possibly a cantorN function, but you'll have to ditch the monadic interface, which is inherently based on the bracketing of >>=, hence two-at-a-time flattening of dimensions.
import Control.Applicative
import Control.Arrow
data Select a = Select [a]
| Selects [Select a]
instance Functor Select where
fmap f (Select x) = Select $ map f x
fmap f (Selects xss) = Selects $ map (fmap f) xss
instance Applicative Select where
pure = Select . (:[])
Select fs <*> xs = Selects $ map (`fmap`xs) fs
Selects fs <*> xs = Selects $ map (<*>xs) fs
instance Monad Select where
return = pure
Select xs >>= f = Selects $ map f xs
Selects xs >>= f = Selects $ map (>>=f) xs
runSelect :: Select a -> [a]
runSelect = go 1
where go n xs = uncurry (++) . second (go $ n+1) $ splitOff n xs
splitOff n (Select xs) = second Select $ splitAt n xs
splitOff n (Selects sls) = (concat hs, Selects $ tsl ++ rl)
where ((hs, tsl), rl) = first (unzip . map (splitOff n)) $ splitAt n sls
*Select> take 15 . runSelect $ do { a<‌-Select [0..]; b<‌-Select [0..]; return (a,b) }
[(0,0),(0,1),(1,0),(1,1),(0,2),(1,2),(2,0),(2,1),(2,2),(0,3),(1,3),(2,3),(3,0),(3,1),(3,2)]
*Select> take 15 . runSelect $ do { a<‌-Select [0..]; b<‌-Select [0..]; c<‌-Select [0..]; return (a,b,c) }
[(0,0,0),(0,0,1),(0,1,0),(0,1,1),(1,0,0),(1,0,1),(1,1,0),(1,1,1),(0,0,2),(0,1,2),(0,2,0),(0,2,1),(0,2,2),(1,0,2),(1,1,2)]
Note that this is still not quite Cantor-tuples ((0,1,1) shouldn't come before (1,0,0)), but getting it correct would be possible as well in a similar manner.
A correct multidimentional enumerator could be represented with a temporary state object
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE OverlappingInstances #-}
class Space a b where
slice :: a -> ([b], a)
instance Space [a] a where
slice (l:ls) = ([l], ls)
slice [] = ([], [])
instance (Space sp x) => Space ([sp], [sp]) x where
slice (fs, b:bs) = let
ss = map slice (b : fs)
yield = concat $ map fst ss
in (yield, (map snd ss, bs))
Here an N dimensional space is represented by a tuple of lists of N-1 dimensional subspaces that have and haven't been touched by the enumeration.
You can then use the following to produce a well ordered list
enumerate :: (Space sp x) => sp -> [x]
enumerate sp = let (sl, sp') = slice sp
in sl ++ enumerate sp'
Example in Ideone.
The omega package does exactly what you want and guarantees that every element will be eventually visited:
import Control.Applicative
import Control.Monad.Omega
main = print . take 200 . runOmega $
(,,) <$> each [0..] <*> each [0..] <*> each [0..]
Another option would be to use LogicT. It gives more flexibility (if you need) and has operations such as (>>-) that ensure that every combination is eventually encountered.
import Control.Applicative
import Control.Monad
import Control.Monad.Logic
-- | Convert a list into any MonadPlus.
each :: (MonadPlus m) => [a] -> m a
each = msum . map return
-- | A fair variant of '(<*>)` that ensures that both branches are explored.
(<#>) :: (MonadLogic m) => m (a -> b) -> m a -> m b
(<#>) f k = f >>- (\f' -> k >>- (\k' -> return $ f' k'))
infixl 4 <#>
main = print . observeMany 200 $
(,,) <$> each [0..] <#> each [0..] <#> each [0..]

Resources