Populating a list of tuples in a semantic way - haskell

I'm working on a piece of code where I have to process lists of tuples where both the order and names of the "keys" (fsts of the tuples) match a certain template. I'm implementing fault tolerance by validating and (if needed) generating a valid list based on the input.
Here's an example of what I mean:
Given the template of keys, ["hello", "world", "this", "is", "a", "test"], and a list [("hello", Just 1), ("world", Just 2), ("test", Just 3)], passing it to my function validate would cause it to fail validation - as the order and values of the keys do not match up with the template.
Upon failing validation, I want to generate a new list, which would look like [("hello", Just 1), ("world", Just 2), ("this", Nothing), ("is", Nothing), ("a", Nothing), ("test", Just 3)].
I tried performing this last step using an (incomplete) list comprehension:
[(x, y) | x <- template, y <- l]
(Obviously, this is missing the step where empty entries would be replaced with Nothings, and works under the assumption that the input is of type [(String, Maybe Int)]).
What would be the easiest semantic way of doing this?

You essentially want to map a function to your list of strings (which you call "template"), i.e. the function that
takes a string xs,
returns
(xs, Just n) if an integer n is associated to xs in your "list to validate",
(xs, Nothing) otherwise.
Here is one possible approach:
import Data.List ( lookup )
import Control.Monad ( join )
consolidate :: [String] -> [(String, Maybe Int)] -> [(String, Maybe Int)]
consolidate temp l = map (\xs -> (xs, join $ lookup xs l)) temp
However, you will get faster lookup if you build a Map holding the key-value pairs of your association list (the "list to validate"):
import qualified Data.Map as M
import Data.Maybe (maybe)
consolidate :: [String] -> [(String, Maybe Int)] -> [(String, Maybe Int)]
consolidate temp l = map (\cs -> (cs, M.lookup cs m)) temp
  where m = fromList' l -- bind the Map once so it is not rebuilt for every key
fromList' :: Ord a => [(a, Maybe b)] -> M.Map a b
fromList' xs = foldr insertJust M.empty xs
insertJust :: Ord a => (a, Maybe b) -> M.Map a b -> M.Map a b
insertJust (xs, maybeVal) mp = maybe mp (\n -> M.insert xs n mp) maybeVal
In GHCi:
λ> let myTemplate = ["hello", "world", "this", "is", "a", "test"]
λ> let myList = [("hello", Just 1), ("world", Just 2), ("test", Just 3)]
λ> consolidate myTemplate myList
[("hello",Just 1),("world",Just 2),("this",Nothing),("is",Nothing),("a",Nothing),("test",Just 3)]
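The question also mentions a validate step that decides whether a new list has to be generated. Here is a minimal sketch of that step, assuming "valid" means that the keys of the input equal the template exactly and in the same order. The names validate and repair are illustrative and not part of the answer above:

```haskell
import Control.Monad (join)

-- Assumption: a list is valid when its keys equal the template, in order.
validate :: [String] -> [(String, Maybe Int)] -> Bool
validate template l = map fst l == template

-- Hypothetical wrapper: keep a valid list as-is, otherwise rebuild it
-- from the template, filling missing keys with Nothing.
repair :: [String] -> [(String, Maybe Int)] -> [(String, Maybe Int)]
repair template l
  | validate template l = l
  | otherwise           = [(k, join (lookup k l)) | k <- template]
```

With the template and list from the question, validate returns False and repair produces the six-element list shown in the GHCi session.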

Related

List of pairs into pair of Lists Haskell

Basically I have this exercise:
Using list comprehensions, write a polymorphic function:
split :: [(a, b)] -> ([a], [b])
which transforms a list of pairs (of any types) into a pair of lists. For example,
split [(1, 'a'), (2, 'b'), (3, 'c')] = ([1, 2, 3], "abc")
This was the way I wrote the function but it is not working:
split :: [(a, b)] -> ([a], [b])
split listOfPairs = (([a | a <- listOfPairs]), ([b | b <- listOfPairs]))
Can someone please explain why my solution doesn't work? Thank you!
A list comprehension like:
[a | a <- listOfPairs]
is actually nothing more than an identity operation for lists. It will yield the same list as the one you provide, since you basically iterate over listOfPairs, and for each iteration, you yield the element a.
Haskell does not perform implicit conversions, so it does not infer from the types that the a in your a <- listOfPairs should be the first element of each tuple. Even if that were possible, it would probably not be a good idea, since it would make the language more "unstable", in the sense that a small change in the types could have a significant impact on the semantics.
In order to obtain the first element of a tuple, you need to use pattern matching, like:
[a | (a, _) <- listOfPairs]
Here we pattern match the first element of the tuple with a. For the second element, we use:
[b | (_, b) <- listOfPairs]
We can thus implement this as:
split :: [(a, b)] -> ([a], [b])
split listOfPairs = ([a | (a, _) <- listOfPairs], [b | (_, b) <- listOfPairs])
Or we can use map :: (a -> b) -> [a] -> [b], fst :: (a, b) -> a and snd :: (a, b) -> b:
split :: [(a, b)] -> ([a], [b])
split listOfPairs = (map fst listOfPairs, map snd listOfPairs)
But the above still has a problem: here we iterate twice independently over the same list. We can avoid that by using recursion, like:
split :: [(a, b)] -> ([a], [b])
split [] = ([], [])
split ((a, b):xs) = (a:as, b:bs)
  where (as, bs) = split xs
or we can use a foldr function:
split :: Foldable f => f (a,b) -> ([a],[b])
split = foldr (\(a,b) (as,bs) -> (a:as,b:bs)) ([],[])
There is already a Haskell function that does exactly what you want: unzip :: [(a, b)] -> ([a], [b]), which is available from the Prelude.
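As a quick illustration of the last point, split can be defined directly in terms of unzip. This is only a sketch; the name example is made up for the demonstration:

```haskell
-- unzip is exported by the Prelude, so no import is needed.
split :: [(a, b)] -> ([a], [b])
split = unzip

-- The input from the exercise statement.
example :: ([Int], String)
example = split [(1, 'a'), (2, 'b'), (3, 'c')]
```

Evaluating example gives ([1,2,3],"abc"), which matches the result required by the exercise, although the exercise itself asks for list comprehensions.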

Haskell: change all indices from a list to some value

If I am given a list of objects and another list for some indices from this list, is there an easy way to change every object in this list with an index from the list of indices to a different value?
E.g. I am hoping there exists some function f such that
f 0 [4,2,5] [6,5,8,4,3,6,2,7]
would output
[6,5,0,4,0,0,2,7]
Here is a beautiful version that uses lens:
import Control.Lens
f :: a -> [Int] -> [a] -> [a]
f x is = elements (`elem` is) .~ x
Here is an efficient version that doesn't have any dependencies other than base. Basically, we start by sorting the list of indices and removing duplicates. That way, a single left-to-right pass suffices and we don't need to scan the whole list for every replacement.
import Data.List
f :: a -> [Int] -> [a] -> [a]
f x is xs = snd $ mapAccumL go is' (zip xs [0..]) -- 0-based, non-negative indices assumed
  where
    is' = map head . group . sort $ is
    go [] (y,_) = ([],y)
    go (i:is) (y,j) = if i == j then (is,x) else (i:is,y)
You can define a helper function to replace a single value and then use it to fold over your list.
replaceAll :: a -> [Int] -> [a] -> [a]
replaceAll repVal indices values = foldl (replaceValue repVal) values indices
  where replaceValue val vals index = (take index vals) ++ [val] ++ (drop (index + 1) vals)
Sort the indices first. Then you can traverse the two lists in tandem.
{-# LANGUAGE ScopedTypeVariables #-}
import Prelude (Eq, Enum, Num, Ord, snd, (==), (<$>))
import Data.List (head, group, sort, zip)
f :: forall a. (Eq a, Enum a, Num a, Ord a) => a -> [a] -> [a] -> [a]
f replacement indices values =
  go (head <$> group (sort indices)) (zip [0..] values)
  where
    go :: [a] -> [(a, a)] -> [a]
    go [] vs = snd <$> vs
    go _ [] = []
    go (i:is) ((i', v):vs) | i == i' = replacement : go is vs
    go is (v:vs) = snd v : go is vs
The sorting incurs an extra log factor on the length of the index list, but the rest is linear.
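If depending on the containers package is acceptable, another option is to put the indices in a set and test membership while walking the list once. This is a sketch under the assumption that indices are 0-based, as in the question's example. The name replaceAt is made up here:

```haskell
import qualified Data.IntSet as IS

-- Replace every element whose 0-based position is in the index list.
replaceAt :: a -> [Int] -> [a] -> [a]
replaceAt x is xs = [if j `IS.member` targets then x else y | (j, y) <- zip [0 ..] xs]
  where
    -- Duplicates and ordering of the indices do not matter for a set.
    targets = IS.fromList is
```

Out-of-range or negative indices are simply never matched, so they are ignored rather than causing an error.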

Haskell: Create a list of tuples from a tuple with a static element and a list

Need to create a list of tuples from a tuple with a static element and a list. Such as:
(Int, [String]) -> [(Int, String)]
Feel like this should be a simple map call but am having trouble actually getting it to output a tuple as zip would need a list input, not a constant.
I think this is the most direct and easy to understand solution (you already seem to be acquainted with map anyway):
f :: (Int, [String]) -> [(Int, String)]
f (i, xs) = map (\x -> (i, x)) xs
(which also happens to be the desugared version of [(i, x) | x <- xs], which Landei proposed)
then
Prelude> f (3, ["a", "b", "c"])
[(3,"a"),(3,"b"),(3,"c")]
This solution uses pattern matching to "unpack" the tuple argument, so that the first tuple element is i and the second element is xs. It then does a simple map over the elements of xs to convert each element x to the tuple (i, x), which I think is what you're after. Without pattern matching it would be slightly more verbose:
f pair = let i = fst pair -- get the FIRST element
             xs = snd pair -- get the SECOND element
         in map (\x -> (i, x)) xs
Furthermore:
The algorithm is in no way specific to (Int, [String]), so you can safely generalize the function by replacing Int and String with type parameters a and b:
f :: (a, [b]) -> [(a, b)]
f (i, xs) = map (\x -> (i, x)) xs
this way you can do
Prelude> f (True, [1.2, 2.3, 3.4])
[(True,1.2),(True,2.3),(True,3.4)]
and of course if you simply get rid of the type annotation altogether, the type (a, [b]) -> [(a, b)] is exactly the type that Haskell infers (only with different names):
Prelude> let f (i, xs) = map (\x -> (i, x)) xs
Prelude> :t f
f :: (t, [t1]) -> [(t, t1)]
Bonus: you can also shorten \x -> (i, x) to just (i,) using the TupleSections language extension:
{-# LANGUAGE TupleSections #-}
f :: (a, [b]) -> [(a, b)]
f (i, xs) = map (i,) xs
Also, as Ørjan Johansen has pointed out, the function sequence does indeed generalize this even further, but the mechanisms thereof are a bit beyond the scope.
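For the curious, that generalization can be tried without understanding all of the machinery: pairs have a Traversable instance in base, so sequenceA (or sequence) applied to a value of type (a, [b]) distributes the first component over the list. A small sketch, where pairUp is just an illustrative name:

```haskell
-- sequenceA comes from the Prelude; the Traversable instance for pairs
-- keeps the first component fixed and traverses the second.
pairUp :: (a, [b]) -> [(a, b)]
pairUp = sequenceA
```

Applied to (3, ["a", "b", "c"]) this behaves like the map-based f above.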
For completeness, consider also cycle,
f i = zip (cycle [i])
Using foldr (a foldl that conses onto the accumulator would build the list in reverse order),
f i = foldr (\v a -> (i,v) : a) []
Using a recursive function that illustrates how to divide the problem,
f :: Int -> [a] -> [(Int,a)]
f _ [] = []
f i (x:xs) = (i,x) : f i xs
A list comprehension would be quite intuitive and readable:
f (i,xs) = [(i,x) | x <- xs]
If you want the Int to always be the same, just feed zip an infinite list. You can use repeat for that.
f i xs = zip (repeat i) xs

Haskell - Reduce list - MapReduce

I'm trying to reduce a list of tuples, where the values of a duplicate key are added together like this:
[(the, 1), (the, 1)] => [(the, 2)]
I tried this:
reduce :: [(String, Integer)] -> [(String, Integer)]
reduce [] = []
reduce [(k, v) : xs] = (+) [(k, v)] : reduce xs
I'm getting this error:
Couldn't match expected type `(String, Integer)'
with actual type `[(String, Integer)] -> [(String, Integer)]'
What am I doing wrong?
Edit
This is the full program
toTuple :: [String] -> [(String, Integer)]
toTuple [] = []
toTuple (k:xs) = (k, 1) : toTuple xs
reduce :: [(String, Integer)] -> [(String, Integer)]
reduce [] = []
reduce [(k, v) : xs] = (+) [(k, v)] : reduce xs
main_ = do list <- getWords "test.txt"
           print $ reduce $ toTuple list
-- Loads words from a text file into a list.
getWords :: FilePath -> IO [String]
getWords path = do contents <- readFile path
                   return ([Prelude.map toLower x | x <- words contents])
You are doing the pattern matching wrong. The pattern match should be like this:
((k,v):xs)
(k,v) represents the head of the list and xs represents the tail of the list. Similarly this is problematic:
(+) [(k, v)] : reduce xs
The type of + is this:
λ> :t (+)
(+) :: Num a => a -> a -> a
You cannot simply write (+) [(k, v)] : reduce xs, because (+) expects two numbers, not a list of tuples. You have to compare the keys (the Strings) and then add the second components of the tuples whose keys match.
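Putting both fixes together, one possible corrected version looks like the sketch below. It is not the only way to write it, and the helper names same and other are made up here. It is quadratic in the number of entries, which is fine for small inputs:

```haskell
reduce :: [(String, Integer)] -> [(String, Integer)]
reduce [] = []
reduce ((k, v) : xs) = (k, v + sum (map snd same)) : reduce other
  where
    -- Entries in the tail that share the key of the head ...
    same  = filter ((== k) . fst) xs
    -- ... and the remaining entries, which are reduced recursively.
    other = filter ((/= k) . fst) xs
```

Each key appears once in the result, in the order of its first occurrence in the input.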
Let me point out that your function reduce is extremely similar to function fromListWith from Data.Map:
> :m Data.Map
> let reduce = toList . fromListWith (+)
> :t reduce
reduce :: (Ord k, Num a) => [(k, a)] -> [(k, a)]
> reduce [('a', 3), ('a', 1), ('b', 2), ('a', 10), ('b', 2), ('c', 1)]
[('a',14),('b',4),('c',1)]
> reduce [(c,1) | c <- "the quick brown fox jumps over the lazy dog"]
[(' ',8),('a',1),('b',1),('c',1),('d',1),('e',3),('f',1),('g',1),('h',2),('i',1),('j',1),('k',1),('l',1),('m',1),('n',1),('o',4),('p',1),('q',1),('r',2),('s',1),('t',2),('u',2),('v',1),('w',1),('x',1),('y',1),('z',1)]

haskell grouping problem

group :: Ord a => [(a, [b])] -> [(a, [b])]
I want to look up all pairs that have the same fst and merge them, by appending all the lists of bs together where they have the same a and discarding the unnecessary pairs, and so on...
I got as far as:
group ((s, ls):(s', ls'):ps) =
  if s == s'
    then group ((s, ls++ls'):ps)
    else (s, ls) : group ((s', ls'):ps)
group p = p
but obviously this ain't going to cut it, because it doesn't group everything.
Edit:
example
[("a", as),("c", cs), ("c", cs3), ("b", bs),("c", cs2), ("b", bs2)]
would output
[("a", as),("c", cs++cs2++cs3),("b", bs++bs2)]
Two alternative solutions to barkmadley's answer:
As Tirpen notes in a comment, the best way to attack this problem depends on the number m of distinct first elements in the tuples of the input list. For small values of m barkmadley's use of Data.List.partition is the way to go. For large values however, the algorithm's complexity of O(n * m) is not so nice. In that case an O(n log n) sort of the input may turn out to be faster. Thus,
import Data.List (groupBy, sortBy)
combine :: (Ord a) => [(a, [b])] -> [(a, [b])]
combine = map mergeGroup . myGroup . mySort
  where
    mySort = sortBy (\a b -> compare (fst a) (fst b))
    myGroup = groupBy (\a b -> fst a == fst b)
    mergeGroup ((a, b):xs) = (a, b ++ concatMap snd xs)
This yields [("Dup",["2","3","1","5"]),("Non",["4"])] on barkmadley's input.
Alternatively, we can call in the help of Data.Map:
import Data.Map (assocs, fromListWith)
combine :: (Ord a) => [(a, [b])] -> [(a, [b])]
combine = assocs . fromListWith (++)
This will yield [("Dup",["5","1","2","3"]),("Non",["4"])], where the merged elements appear in a different order than before, which may or may not be an issue. If it is, then there are again two solutions:
Reverse the input first using Data.List.reverse:
import Data.List (reverse)
import Data.Map (assocs, fromListWith)
combine :: (Ord a) => [(a, [b])] -> [(a, [b])]
combine = assocs . fromListWith (++) . reverse
Prepend (flip (++)) instead of append ((++)) (Thanks to barkmadley; I like this solution better):
import Data.Map (assocs, fromListWith)
combine :: (Ord a) => [(a, [b])] -> [(a, [b])]
combine = assocs . fromListWith (flip (++))
Both of these definitions will cause combine to output [("Dup",["2","3","1","5"]),("Non",["4"])].
As a last remark, note that all these definitions of combine require the first element of the tuples in the input list to be instances of class Ord. barkmadley's implementation only requires these elements to be instances of Eq. Thus there exist inputs which can be handled by his code, but not by mine.
import Data.List hiding (group)
group :: (Eq a) => [(a, [b])] -> [(a, [b])]
group ((s,l):rest) = (s, l ++ concatMap snd matches) : group nonmatches
  where
    (matches, nonmatches) = partition (\x-> fst x == s) rest
group x = x
this function produces the result:
group [("Dup", ["2", "3"]), ("Dup", ["1"]), ("Non", ["4"]), ("Dup", ["5"])]
= [("Dup", ["2", "3", "1", "5"]), ("Non", ["4"])]
It works by partitioning the remaining elements into two camps: those that match and those that don't. It then combines the ones that match and recurses on the ones that don't. This effectively means you will have one tuple in the output list per 'key' in the input list.
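To make the "two camps" step concrete, here is what partition does on the tail of the sample input when the current key is "Dup". This is only an illustration; the names sample and camps are made up:

```haskell
import Data.List (partition)

-- The tail of the example input, after the first ("Dup", ...) pair.
sample :: [(String, [String])]
sample = [("Dup", ["1"]), ("Non", ["4"]), ("Dup", ["5"])]

-- First component: pairs whose key is "Dup"; second: everything else.
camps :: ([(String, [String])], [(String, [String])])
camps = partition (\x -> fst x == "Dup") sample
```

The first component holds the two remaining "Dup" pairs, whose lists get appended to the head's list, and the second holds the "Non" pair, which is handled by the recursive call.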
Another solution, using a fold to accumulate the groups in a Map. Because of the Map this does require that a is an instance of Ord (BTW your original definition requires that a is an instance of Eq, which barkmadley has incorporated in his solution).
import qualified Data.Map as M
group :: Ord a => [(a, [b])] -> [(a, [b])]
group = M.toList . foldr insert M.empty
  where
    insert (s, l) m = M.insertWith (++) s l m
If you're a big fan of obscurity, replace the last line with:
insert = uncurry $ M.insertWith (++)
This omits the unnecessary m and uncurry breaks the (s, l) pair out into two arguments s and l.
