Haskell add 2 maps - haskell

How can I retain the value of the second list in the following example to compute the sum of two maps in Haskell?
listsSumm :: Eq a => Bag a -> Bag a -> Bag a
listsSumm [] [] = []
listsSumm bag1 bag2
  | q1==q2 = (q1,v1+v2):(listsSumm rBag1 rBag2)
  | otherwise = bagSum [(q1,v1)] rBag2
  where ((q1,v1):rBag1) = bag1
        ((q2,v2):rBag2) = bag2
and my input is:
listSumm [("a",1),("c",1),("d",1),("b",1)] [("a",2),("c",1),("b",1),("d",1)]->[("a",3),("c",2),("d",2)]
How can I retain the content of the second list to keep reevaluating it after a test has finished?

Since you seem to be working with Maps, what you're trying to do can be done quite straightforwardly:
import qualified Data.Map as M
sumOfMaps :: M.Map String Int -> M.Map String Int -> M.Map String Int
sumOfMaps = M.unionWith (+)
If you don't want to rely on Data.Map you can use the following solution to merge your lists:
import Data.Function ( on )
import Data.List ( groupBy, sortBy )
import Data.Ord ( compare )
sumOfLists :: [(String, Int)] -> [(String, Int)] -> [(String, Int)]
sumOfLists l1 l2 = map merge . collect $ l1 ++ l2
  where collect = groupBy ((==) `on` fst) . sortBy (compare `on` fst)
        merge xs@(x:_) = (fst x, sum $ map snd xs)
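As a small usage sketch of the Data.Map approach applied to lists like the ones in the question: the helper name sumAssocLists is hypothetical (it does not appear in the answer above), and because the pairs pass through a Map, the result comes back in ascending key order rather than in input order.

```haskell
import qualified Data.Map.Strict as M

-- Hypothetical helper (not from the original answer): sum two association
-- lists by key. Duplicate keys inside a single list are summed as well.
sumAssocLists :: (Ord k, Num v) => [(k, v)] -> [(k, v)] -> [(k, v)]
sumAssocLists xs ys =
  M.toList (M.unionWith (+) (M.fromListWith (+) xs) (M.fromListWith (+) ys))
```

With the two lists from the question this should give [("a",3),("b",2),("c",2),("d",2)], so every key of the second list is retained.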

Related

Removing duplicate elements in a Seq

wondering how to implement nub over a Seq a
I get that one could do:
nubSeq :: Seq a -> Seq a
nubSeq = fromList . nub . toList
Is there something standard that does not convert to lists in order to call nub :: [a] -> [a]?
An implementation that occurred to me, based obviously on nub, is:
nubSeq :: (Eq a) => Seq a -> Seq a
nubSeq = Data.Sequence.foldrWithIndex
  (\_ x a -> case x `Data.Sequence.elemIndexR` a of
      Just _ -> a
      Nothing -> a |> x) Data.Sequence.empty
But there must be something more elegant?
thanks.
Not sure whether this qualifies as more elegant but it splits the concerns in independent functions (caveat: you need an Ord constraint on a):
seqToNubMap takes a Seq and outputs a Map associating to each a the smallest index at which it appeared in the sequence
mapToList takes a Map of values and positions and produces a list of values in increasing order according to the specified positions
nubSeq combines these to generate a sequence without duplicates
The whole thing should be O(n*log(n)), I believe:
module NubSeq where
import Data.Map as Map
import Data.List as List
import Data.Sequence as Seq
import Data.Function
seqToNubMap :: Ord a => Seq a -> Map a Int
seqToNubMap = foldlWithIndex (\ m k v -> insertWith min v k m) Map.empty
mapToList :: Ord a => Map a Int -> [a]
mapToList = fmap fst . List.sortBy (compare `on` snd) . Map.toList
nubSeq :: Ord a => Seq a -> Seq a
nubSeq = Seq.fromList . mapToList . seqToNubMap
Or a simpler alternative following @DavidFletcher's comment:
-- Needs ScopedTypeVariables, plus Data.Foldable imported as Fold and Data.Set as Set.
nubSeq' :: forall a. Ord a => Seq a -> Seq a
nubSeq' xs = Fold.foldr cons nil xs Set.empty where
  cons :: a -> (Set a -> Seq a) -> (Set a -> Seq a)
  cons x xs seen
    | x `Set.member` seen = xs seen
    | otherwise = x <| xs (Set.insert x seen)
  nil :: Set a -> Seq a
  nil _ = Seq.empty
Another way with an Ord constraint - use a scan to make the sets of
elements that appear in each prefix of the list. Then we can filter out
any element that's already been seen.
import Data.Sequence as Seq
import Data.Set as Set
nubSeq :: Ord a => Seq a -> Seq a
nubSeq xs = (fmap fst . Seq.filter (uncurry Set.notMember)) (Seq.zip xs seens)
  where
    seens = Seq.scanl (flip Set.insert) Set.empty xs
Or roughly the same thing as a mapAccumL (from Data.Traversable):
nubSeq' :: Ord a => Seq a -> Seq a
nubSeq' = fmap fst . Seq.filter snd . snd . mapAccumL f Set.empty
  where
    f s x = (Set.insert x s, (x, x `Set.notMember` s))
(If I was using lists I would use Maybes instead of the pairs with
Bool, then use catMaybes instead of filtering. There doesn't seem to be catMaybes
for Sequence though.)
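The same set-based idea can also be sketched as a single strict left fold that carries the result and the set of seen elements together. This is only an illustration under the same Ord assumption; the name nubSeqOrd is hypothetical, and it keeps the first occurrence of each element.

```haskell
import Data.Foldable (foldl')
import Data.Sequence (Seq, (|>))
import qualified Data.Sequence as Seq
import qualified Data.Set as Set

-- Hypothetical name: order-preserving nub over a Seq, keeping the first
-- occurrence of each element. The accumulator pairs the output built so
-- far with the set of elements already emitted.
nubSeqOrd :: Ord a => Seq a -> Seq a
nubSeqOrd = fst . foldl' step (Seq.empty, Set.empty)
  where
    step (acc, seen) x
      | x `Set.member` seen = (acc, seen)
      | otherwise           = (acc |> x, Set.insert x seen)
```

Each step does one set lookup and at most one set insert, and appending to a Seq with |> is cheap, so the whole fold stays within the O(n*log(n)) bound mentioned earlier.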
I think your code should be pretty efficient. Since Sequences are tree data structures, using another tree-like data structure such as Map or HashMap to store and look up the previous items doesn't make much sense to me.
Instead, I take the first item and check whether it exists in the rest. If it does, I drop that item and proceed the same way with the rest, recursively. If not, I construct a new sequence whose first element is that unique element and whose remainder is the result of nubSeq applied to the rest. This should be fairly typical. I use ViewPatterns.
{-# LANGUAGE ViewPatterns #-}
import Data.Sequence as Seq
nubSeq :: Eq a => Seq a -> Seq a
nubSeq (viewl -> EmptyL) = empty
nubSeq (viewl -> (x :< xs)) | elemIndexL x xs == Nothing = x <| nubSeq xs
                            | otherwise = nubSeq xs
*Main> nubSeq . fromList $ [1,2,3,4,4,2,3,6,7,1,2,3,4]
fromList [6,7,1,2,3,4]

Define concat function for Digits

I have a list of chars and I would like to concatenate all adjacent characters that are digits.
For example: ['1','5','+','2','4'] => ["15","+","24"]
concat1 :: [Char] -> [Char] -> [String]
concat1 [] [] = []
concat1 [a] [b]
  | (isDigit a) && (isDigit b) = [a] ++ [b]
I tried to write this code, but it doesn't seem to be the right approach, and the compiler tells me this:
Couldn't match type `Char' with `[Char]'
Expected type: [String]
Actual type: [Char]
* In the expression: [a] ++ [b]
In an equation for `concat1':
concat1 [a] [b] | (isDigit a) && (isDigit b) = [a] ++ [b]
The type of concat1 is wrong: your example indicates you want one input list and an output list. The input list (ex: ['1','5','+','2','4']) is of type [Char] and the output (ex: ["15","+","24"]) of type [String]. That gives the signature
concat1 :: [Char] -> [String]
For the implementation, you probably want to use span, which finds the prefix satisfying a certain predicate function and also returns the remaining elements.
concat1 [] = []
concat1 (e : es)
  | isDigit e = let (d, es') = span isDigit (e : es) in d : concat1 es'
  | otherwise = [ e ] : concat1 es
Then, trying it out at GHCi:
ghci> concat1 ['1','5','+','2','4']
["15","+","24"]
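To see what span contributes on its own, here is a minimal sketch; the helper name leadingDigits is hypothetical and exists only for illustration. It splits a string into its leading run of digits and everything after it, which is exactly the step concat1 repeats on the remainder.

```haskell
import Data.Char (isDigit)

-- Hypothetical helper: the longest prefix of digits, paired with the rest
-- of the input. span stops at the first character failing the predicate.
leadingDigits :: String -> (String, String)
leadingDigits = span isDigit
```

For "15+24" this gives ("15","+24"); when the input starts with a non-digit the first component is empty, which is why concat1 handles that case in its otherwise branch.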
An idea might be to construct a more generic function, groupWith :: Eq b => (a -> b) -> [a] -> [(b,[a])], similar to groupby in Python's itertools:
groupWith :: Eq b => (a -> b) -> [a] -> [(b,[a])]
groupWith f = steps . map ((,) =<< f)
  where steps [] = []
        steps ((b,a):xs) = (b,(a : map snd ys)) : steps zs
          where (ys,zs) = span ((b == ) . fst) xs
Given a function f :: a -> b and a list xs, it will construct groups [(b,[a])] such that every group represents the longest possible sequence of xs where f x is the same. So for groupWith isDigit "15+24", we get:
*Main Data.Char> groupWith isDigit ['1','5','+','2','4']
[(True,"15"),(False,"+"),(True,"24")]
Now we can simply obtain the second element snd of every tuple, so:
*Main Data.Char> map snd $ groupWith isDigit ['1','5','+','2','4']
["15","+","24"]
We can easily reuse this piece of code if we for instance wish to discriminate based on more conditions.
You can use groupBy to group together adjacent elements in a list according to some equivalence function. You can use it like this:
import Data.List (groupBy)
import Data.Function (on)
import Data.Char (isDigit)
groupDigits :: String -> [String]
groupDigits = groupBy ((&&) `on` isDigit)
Prelude Data.List Data.Function Data.Char> groupDigits ['1','5','+','2','4']
["15","+","24"]
I guess, as @4castle mentioned, groupBy is the ideal tool for this job; however, my approach would be slightly different.
import Data.List (groupBy)
import Data.Function (on)
groupDigits :: String -> [String]
groupDigits = groupBy ((==) `on` ((&&) <$> (>'/') <*> (<':')))
*Main> groupDigits ['1','5','+','2','4']
["15","+","24"]
*Main> groupDigits ['1','5','+','+','2','4']
["15","++","24"]
The difference is that instead of using the (&&) operator I use an XNOR (which is simply (==) in Haskell), so runs of adjacent non-digits are grouped together as well.
Also, my implementation of isDigit :: Char -> Bool is a little different from the one in Data.Char.
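The practical difference between the two groupBy predicates shows up only when several non-digits are adjacent. The sketch below puts them side by side; both function names are hypothetical, and Data.Char's isDigit is used for both so that only the combining operator differs.

```haskell
import Data.Char (isDigit)
import Data.Function (on)
import Data.List (groupBy)

-- (&&) version: two characters join a group only if both are digits,
-- so every non-digit ends up in a group of its own.
splitEachSymbol :: String -> [String]
splitEachSymbol = groupBy ((&&) `on` isDigit)

-- (==) version: two characters join a group if they agree on being a
-- digit or not, so runs of non-digits are kept together too.
mergeSymbols :: String -> [String]
mergeSymbols = groupBy ((==) `on` isDigit)
```

On "15++24" the first should return ["15","+","+","24"] and the second ["15","++","24"]; which one is preferable depends on whether multi-character operators are meant to stay together.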

Haskell: change all indices from a list to some value

If I am given a list of objects and another list for some indices from this list, is there an easy way to change every object in this list with an index from the list of indices to a different value?
E.g. I am hoping there exists some function f such that
f 0 [4,2,5] [6,5,8,4,3,6,2,7]
would output
[6,5,0,4,0,0,2,7]
Here is a beautiful version that uses lens:
import Control.Lens
f :: a -> [Int] -> [a] -> [a]
f x is = elements (`elem` is) .~ x
Here is an efficient version that doesn't have any dependencies other than base. Basically, we start by sorting (and removing duplicates from the) indices list. That way, we don't need to scan the whole list for every replacement.
import Data.List
f :: a -> [Int] -> [a] -> [a]
f x is xs = snd $ mapAccumL go is' (zip xs [0..])
  where
    is' = map head . group . sort $ is
    go [] (y,_) = ([],y)
    go (i:is) (y,j) = if i == j then (is,x) else (i:is,y)
You can define a helper function to replace a single value and then use it to fold over your list.
replaceAll :: a -> [Int] -> [a] -> [a]
replaceAll repVal indices values = foldl (replaceValue repVal) values indices
  where replaceValue val vals index = (take index vals) ++ [val] ++ (drop (index + 1) vals)
Sort the indices first. Then you can traverse the two lists in tandem.
{-# LANGUAGE ScopedTypeVariables #-}
import Prelude (Eq, Enum, Num, Ord, snd, (==), (<$>))
import Data.List (head, group, sort, zip)
f :: forall a. (Eq a, Enum a, Num a, Ord a) => a -> [a] -> [a] -> [a]
f replacement indices values =
  go (head <$> group (sort indices)) (zip [0..] values)
  where
    go :: [a] -> [(a, a)] -> [a]
    go [] vs = snd <$> vs
    go _ [] = []
    go (i:is) ((i', v):vs) | i == i' = replacement : go is vs
    go is (v:vs) = snd v : go is vs
The sorting incurs an extra log factor on the length of the index list, but the rest is linear.
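As an alternative to sorting the indices, a set lookup per position also avoids rescanning the index list. The following is a minimal sketch assuming zero-based Int indices, as in the question's example; the name replaceAt is hypothetical. Indices outside the list are simply ignored.

```haskell
import qualified Data.IntSet as IntSet

-- Hypothetical name: replace every element whose zero-based position
-- occurs in the index list. The indices are collected into an IntSet once,
-- then each position is checked against it while walking the values.
replaceAt :: a -> [Int] -> [a] -> [a]
replaceAt x is = zipWith pick [0 ..]
  where
    targets = IntSet.fromList is
    pick i y
      | i `IntSet.member` targets = x
      | otherwise                 = y
```

For the example in the question, replaceAt 0 [4,2,5] [6,5,8,4,3,6,2,7] should produce [6,5,0,4,0,0,2,7].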

Need help rewriting a Haskell program that groups words by anagrams in OCaml (or any ML)?

I wrote a Haskell function that groups words by anagrams. I'm trying to learn OCaml, but I'm a little confused about how to use pattern matching in OCaml. Could someone help translate this to OCaml for me? Thank you!
This function takes a list of strings, and partitions it into a list of string lists, grouped by anagrams.
import Data.List
groupByAnagrams :: [String] -> [[String]]
groupByAnagrams [] = []
groupByAnagrams (x:xs) = let (listOfAnagrams, listOfNonAnagrams) = (partitionByAnagrams (sort x) xs)
                         in
                         (x:listOfAnagrams):(groupByAnagrams listOfNonAnagrams)
This helper function takes a sorted string sortedStr, and a list of strings (the reason the string is sorted is so that I don't have to call sort on it every iteration). The string list is partitioned into two lists; one consisting of the strings that are anagrams to sortedStr, the other consisting of the strings that are not. The function returns the tuple that consists of these two lists.
partitionByAnagrams :: String -> [String] -> ([String], [String])
partitionByAnagrams sortedStr [] = ([], [])
partitionByAnagrams sortedStr (x:xs)
  | (sortedStr == (sort x)) = let (listOfAnagrams, listOfNonAnagrams) = (partitionByAnagrams sortedStr xs)
                              in
                              (x:listOfAnagrams, listOfNonAnagrams)
  | otherwise = let (listOfAnagrams, listOfNonAnagrams) = (partitionByAnagrams sortedStr xs)
                in
                (listOfAnagrams, x:listOfNonAnagrams)
This is just a test case:
test1 = mapM_ print (groupByAnagrams ["opts", "alerting", "arrest", "bares", "drapes", "drawer", "emits", "least", "mate", "mates", "merit", "notes", "palest", "parses", "pores", "pots", "altering", "rarest", "baser", "parsed", "redraw", "items", "slate", "meat", "meats", "miter", "onset", "pastel", "passer", "poser", "spot", "integral", "raster", "bears", "rasped", "reward", "mites", "stale", "meta", "steam", "mitre", "steno", "petals", "spares", "prose", "stop", "relating", "raters", "braes", "spared", "warder", "smite", "steal", "tame", "tames", "remit", "stone", "plates", "sparse", "ropes", "tops", "triangle", "starer", "saber", "spread", "warred", "times", "tales", "team", "teams", "timer", "tones", "staple", "spears", "spore"])
**EDIT!!! This is a rewritten version of my function. Thanks to jrouquie for pointing out the inefficiency!
**EDITED AGAIN ON 10/7 - used pattern matching on tuples for clarity, no need for all those fsts and snds.
groupByAnagrams2 :: [String] -> [[String]]
groupByAnagrams2 str = groupBySnd $ map (\s -> (s, (sort s))) str
groupBySnd :: [(String, String)] -> [[String]]
groupBySnd [] = []
groupBySnd ((s1,s2):xs) = let (listOfAnagrams, listOfNonAnagramPairs) = (partitionBySnd s2 xs)
                          in
                          (s1:listOfAnagrams):(groupBySnd listOfNonAnagramPairs)
partitionBySnd :: String -> [(String, String)] -> ([String], [(String, String)])
partitionBySnd sortedStr [] = ([], [])
partitionBySnd sortedStr ((s, sSorted):ss)
  | (sortedStr == sSorted) = let (listOfAnagrams, listOfNonAnagramPairs) = (partitionBySnd sortedStr ss)
                             in
                             (s:listOfAnagrams, listOfNonAnagramPairs)
  | otherwise = let (listOfAnagrams, listOfNonAnagramPairs) = (partitionBySnd sortedStr ss)
                in
                (listOfAnagrams, (s, sSorted):listOfNonAnagramPairs)
I have to say that I find your Haskell code a bit clumsy. That is, your original function could have been written much more concisely; for example:
import Control.Arrow ((&&&))
import Data.Function (on)
import Data.List (groupBy, sortBy)
anagrams :: Ord a => [[a]] -> [[[a]]]
anagrams =
  map (map fst) .
  groupBy ((==) `on` snd) .
  sortBy (compare `on` snd) .
  map (id &&& sortBy compare)
That is:
map (id &&& sortBy compare) pairs each string in the list with a sorted list of its characters;
sortBy (on compare snd) sorts the list of pairs that you now have on their second components, i.e., the sorted list of characters;
groupBy (on (==) snd) groups all consecutive items in the sorted list that have identical lists of sorted characters;
finally, map (map fst) drops the lists of sorted characters and leaves you with just the original strings.
For example:
Prelude> :m + Control.Arrow Data.Function Data.List
Prelude Control.Arrow Data.Function Data.List> ["foo", "bar", "rab", "ofo"]
["foo","bar","rab","ofo"]
Prelude Control.Arrow Data.Function Data.List> map (id &&& sortBy compare) it
[("foo","foo"),("bar","abr"),("rab","abr"),("ofo","foo")]
Prelude Control.Arrow Data.Function Data.List> sortBy (compare `on` snd) it
[("bar","abr"),("rab","abr"),("foo","foo"),("ofo","foo")]
Prelude Control.Arrow Data.Function Data.List> groupBy ((==) `on` snd) it
[[("bar","abr"),("rab","abr")],[("foo","foo"),("ofo","foo")]]
Prelude Control.Arrow Data.Function Data.List> map (map fst) it
[["bar","rab"],["foo","ofo"]]
"Translating" to Caml will then leave you with something along the lines of
let chars xs =
  let n = String.length xs in
  let rec chars_aux i =
    if i = n then [] else String.get xs i :: chars_aux (i + 1)
  in
  List.sort compare (chars_aux 0)
let group eq xs =
  let rec group_aux = function
    | [] -> []
    | [x] -> [[x]]
    | x :: xs ->
      let ((y :: _) as ys) :: yss = group_aux xs in
      if eq x y then (x :: ys) :: yss else [x] :: ys :: yss
  in
  group_aux xs
let anagrams xs =
  let ys = List.map chars xs in
  let zs = List.sort (fun (_,y1) (_,y2) -> compare y1 y2) (List.combine xs ys) in
  let zs = group (fun (_,y1) (_,y2) -> y1 = y2) zs in
  List.map (List.map fst) zs
Here, the helper function chars takes a string to a sorted list of characters, while group should give you some insight in how to do pattern matching on lists in Caml.
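On the Haskell side, the sort-then-group pipeline can also be sketched with a Map keyed by each word's sorted letters. This is an alternative technique, not the answer's method; the name anagramGroups is hypothetical, and the groups come out ordered by their sorted-letter key.

```haskell
import Data.List (sort)
import qualified Data.Map.Strict as Map

-- Hypothetical name: group words by anagram class. Each word is filed
-- under its sorted letters; flip (++) appends later words after earlier
-- ones, so words keep their input order within a group.
anagramGroups :: [String] -> [[String]]
anagramGroups ws =
  Map.elems (Map.fromListWith (flip (++)) [(sort w, [w]) | w <- ws])
```

For ["foo", "bar", "rab", "ofo"] this should return [["bar","rab"],["foo","ofo"]], matching the result of the pipeline above.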
The most general form of pattern matching is the match expression, which is the same as the case expression in Haskell.
let rec groupByAnagrams lst =
match lst with [] -> ...
| x::xs -> ...
However, when only the last argument of a function needs to be pattern-matched (as is the case here), there is a shortcut using the function syntax:
let rec groupByAnagrams = function
[] -> ...
| x::xs -> ...
As for the guards, there is no exact equivalent; you can use when inside a pattern match, but that only applies to a particular pattern, and you have to repeat that pattern for all the cases you want. You could also use if ... then ... else if ... then ... else ... but that is not as pretty.
let rec partitionByAnagrams sortedStr = function
    [] -> ...
  | x::xs when ...(some condition here)... -> ...
  | x::xs -> ...

Function to show the lowest represented element in a list

If you have a list such as this in Haskell:
data TestType = A | B | C deriving (Ord, Eq, Show)
list1 :: [TestType]
list1 = [A,B,C,B,C,A,B,C,C,C]
Is it possible to write a function to determine which element is represented the least in a list (so in this case A)?
My initial thought was to write a helper function such as this but now I am not sure if this is the right approach:
appears :: TestType -> [TestType] -> Int
appears _ [] = 0
appears x (y:ys) | x==y = 1 + (appears x ys)
                 | otherwise = appears x ys
I am still fairly new to Haskell, so apologies for the potentially silly question.
Many thanks
Slightly alternative version to Matt's approach
import Data.List
import Data.Ord
leastFrequent :: Ord a => [a] -> a
leastFrequent = head . minimumBy (comparing length) . group . sort
You can build a map counting how often each item occurs in the list
import qualified Data.Map as Map
frequencies list = Map.fromListWith (+) $ zip list (repeat 1)
Then you can find the least/most represented using minimumBy or maximumBy from Data.List on the list of Map.assocs of the frequency map, or even sort it by frequency using sortBy.
module Frequencies where
import Data.Ord
import Data.List
import qualified Data.Map as Map
frequencyMap :: Ord a => [a] -> Map.Map a Int
frequencyMap list = Map.fromListWith (+) $ zip list (repeat 1)
-- Caution: leastFrequent will cause an error if called on an empty list!
leastFrequent :: Ord a => [a] -> a
leastFrequent = fst . minimumBy (comparing snd) . Map.assocs . frequencyMap
ascendingFrequencies :: Ord a => [a] -> [(a,Int)]
ascendingFrequencies = sortBy (comparing snd) . Map.assocs . frequencyMap
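Since leastFrequent fails on an empty list, one option is to return a Maybe instead. The following is a minimal sketch under that assumption, using sort and group rather than a Map; the name leastFrequentSafe is hypothetical, and ties are broken in favour of the smaller element.

```haskell
import Data.List (group, sort)

-- Hypothetical name: the least frequent element, or Nothing for an empty
-- list. Each group of equal elements is turned into a (count, element)
-- pair, and the minimum pair is the one with the lowest count.
leastFrequentSafe :: Ord a => [a] -> Maybe a
leastFrequentSafe [] = Nothing
leastFrequentSafe xs =
  Just (snd (minimum [(length g, x) | g@(x : _) <- group (sort xs)]))
```

For the list in the question, [A,B,C,B,C,A,B,C,C,C], this should return Just A, since A occurs twice, B three times and C five times.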
Here's another way to do it:
sort the list
group the list
find the length of each group
return the group with the shortest length
Example:
import GHC.Exts
import Data.List
fewest :: (Ord a) => [a] -> a
fewest xs = fst $ head sortedGroups
  where
    sortedGroups = sortWith snd $ zip (map head groups) (map length groups)
    groups = group $ sort xs
A less elegant idea would be:
first, sort and group the list;
then, pair each element with its number of occurrences;
finally, sort the pairs by that number.
In code this looks like
import Data.List
sortByRepr :: (Ord a) => [a] ->[(a,Int)]
sortByRepr xx = sortBy compareSnd $ map numOfRepres $ group $ sort xx
  where compareSnd x y = compare (snd x) (snd y)
        numOfRepres x = (head x, length x)
The head of the resulting list holds the least represented element together with its count.
