haskell grouping problem


group :: Ord a => [(a, [b])] -> [(a, [b])]
I want to look up all pairs that have the same fst and merge them, appending their lists of bs together wherever they share the same a and discarding the now-unnecessary pairs.
I got as far as:
group ((s, ls):(s', ls'):ps) =
  if s == s'
    then group ((s, ls++ls'):ps)
    else (s, ls) : group ((s', ls'):ps)
group p = p
but obviously this ain't going to cut it, because it doesn't group everything.
Edit:
example
[("a", as),("c", cs), ("c", cs3), ("b", bs),("c", cs2), ("b", bs2)]
would output
[("a", as),("c", cs++cs2++cs3),("b", bs++bs2)]

Two alternative solutions to barkmadley's answer:
As Tirpen notes in a comment, the best way to attack this problem depends on the number m of distinct first elements in the tuples of the input list. For small values of m barkmadley's use of Data.List.partition is the way to go. For large values however, the algorithm's complexity of O(n * m) is not so nice. In that case an O(n log n) sort of the input may turn out to be faster. Thus,
import Data.List (groupBy, sortBy)
combine :: (Ord a) => [(a, [b])] -> [(a, [b])]
combine = map mergeGroup . myGroup . mySort
  where
    mySort = sortBy (\a b -> compare (fst a) (fst b))
    myGroup = groupBy (\a b -> fst a == fst b)
    mergeGroup ((a, b):xs) = (a, b ++ concatMap snd xs)
This yields [("Dup",["2","3","1","5"]),("Non",["4"])] on barkmadley's input.
Alternatively, we can call in the help of Data.Map:
import Data.Map (assocs, fromListWith)
combine :: (Ord a) => [(a, [b])] -> [(a, [b])]
combine = assocs . fromListWith (++)
This will yield [("Dup",["5","1","2","3"]),("Non",["4"])], which may or may not be an issue. If it is, then there are again two solutions:
Reverse the input first using Data.List.reverse:
import Data.List (reverse)
import Data.Map (assocs, fromListWith)
combine :: (Ord a) => [(a, [b])] -> [(a, [b])]
combine = assocs . fromListWith (++) . reverse
Use flip (++) instead of (++) as the combining function, so that lists seen earlier stay in front (thanks to barkmadley; I like this solution better):
import Data.Map (assocs, fromListWith)
combine :: (Ord a) => [(a, [b])] -> [(a, [b])]
combine = assocs . fromListWith (flip (++))
Both of these definitions will cause combine to output [("Dup",["2","3","1","5"]),("Non",["4"])].
As a last remark, note that all these definitions of combine require the first element of the tuples in the input list to be instances of class Ord. barkmadley's implementation only requires these elements to be instances of Eq. Thus there exist inputs which can be handled by his code, but not by mine.

import Data.List hiding (group)
group :: (Eq a) => [(a, [b])] -> [(a, [b])]
group ((s,l):rest) = (s, l ++ concatMap snd matches) : group nonmatches
  where
    (matches, nonmatches) = partition (\x-> fst x == s) rest
group x = x
This function produces the result:
group [("Dup", ["2", "3"]), ("Dup", ["1"]), ("Non", ["4"]), ("Dup", ["5"])]
  = [("Dup", ["2", "3", "1", "5"]), ("Non", ["4"])]
It works by splitting the remaining pairs into two camps: those that match the current key and those that don't. It then combines the ones that match and recurses on the ones that don't. This effectively means you will have one tuple in the output list per 'key' in the input list.
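To make the partition step concrete, here is a minimal sketch of a single step on its own. The name groupStep is hypothetical and only used for illustration; it takes the head pair and the rest of the list, and returns the merged pair together with the leftovers that the recursion would continue on.

```haskell
import Data.List (partition)

-- One step of the partition-based grouping, written out on its own.
-- groupStep is a hypothetical helper name, not part of the answer above.
groupStep :: Eq a => (a, [b]) -> [(a, [b])] -> ((a, [b]), [(a, [b])])
groupStep (s, l) rest = ((s, l ++ concatMap snd matches), nonmatches)
  where
    -- matches: pairs sharing the key s; nonmatches: everything else, order kept
    (matches, nonmatches) = partition (\x -> fst x == s) rest
```

Applied to ("Dup", ["2", "3"]) and the remaining three pairs of the example, this yields the merged "Dup" pair and leaves only ("Non", ["4"]) for the recursive call.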

Another solution, using a fold to accumulate the groups in a Map. Because of the Map this does require that a is an instance of Ord (BTW your original definition requires that a is an instance of Eq, which barkmadley has incorporated in his solution).
import qualified Data.Map as M
group :: Ord a => [(a, [b])] -> [(a, [b])]
group = M.toList . foldr insert M.empty
  where
    insert (s, l) m = M.insertWith (++) s l m
If you're a big fan of obscurity, replace the last line with:
insert = uncurry $ M.insertWith (++)
This omits the unnecessary m and uncurry breaks the (s, l) pair out into two arguments s and l.
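As a rough illustration of why the two spellings agree, the sketch below puts them side by side as top-level functions. The names insertExplicit and insertPointFree are made up for this example.

```haskell
import qualified Data.Map as M

-- Explicit version: pattern-match the pair and name the map argument.
insertExplicit :: Ord a => (a, [b]) -> M.Map a [b] -> M.Map a [b]
insertExplicit (s, l) m = M.insertWith (++) s l m

-- Point-free version: uncurry turns the curried M.insertWith (++),
-- which takes s and l separately, into a function of the pair (s, l).
-- The map argument is simply left off on both sides.
insertPointFree :: Ord a => (a, [b]) -> M.Map a [b] -> M.Map a [b]
insertPointFree = uncurry (M.insertWith (++))
```

Folding either function over the same input with foldr should build the same Map.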

Related

How to construct a function from its graph?

I wonder if it is possible to make the inverse of the following function:
graphOf :: (Num a, Enum a) => (a -> b) -> [(a, b)]
graphOf f = [(e,v) | e <- [0..], v <- [f e]]
That is, I can't figure out how to write a Haskell function
fromGraph :: (Enum a) => [(a, b)] -> (a -> b)
such that
fromGraph [(1,3),(2,4),(3,5)] :: (Num a) => a -> a
(fromGraph [(1,3),(2,4),(3,5)]) 1 == 3
(fromGraph [(1,3),(2,4),(3,5)]) 2 == 4
(fromGraph [(1,3),(2,4),(3,5)]) 3 == 5
Is it possible?
At least for a finite input list?

The simplest way is to use the lookup function:
Prelude> :m +Data.List
Prelude Data.List> lookup 1 [(1,3),(2,4),(3,5)]
Just 3
Prelude Data.List> lookup 2 [(1,3),(2,4),(3,5)]
Just 4
Prelude Data.List> lookup 3 [(1,3),(2,4),(3,5)]
Just 5
This is pretty inefficient though (for every query it just goes through the list linearly). You may want to back it with a faster lookup mechanism, using structures from the containers or unordered-containers packages, for example
import qualified Data.HashMap.Strict as HMS
import Data.Hashable (Hashable)
fastLookup :: (Eq k, Hashable k) => [(k,b)] -> k -> Maybe b
fastLookup l = \k -> HMS.lookup k table
  where table = HMS.fromList l
Note that I wrote fastLookup l = \k -> .... Do not simplify this to fastLookup l k = ..., because that would re-build the hash map for every query.
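If you would rather stay with the containers package, the same shape should work with Data.Map. This is only a sketch: mapLookup is a hypothetical name, and it asks for Ord on the key instead of Hashable.

```haskell
import qualified Data.Map.Strict as Map

-- Same idea as fastLookup, but backed by Data.Map from containers.
-- mapLookup is a hypothetical name; keys need Ord rather than Hashable.
mapLookup :: Ord k => [(k, b)] -> k -> Maybe b
mapLookup l = \k -> Map.lookup k table
  where
    -- Built once per partial application `mapLookup l`, then reused by queries.
    table = Map.fromList l
```

One difference from Prelude's lookup to keep in mind: when a key occurs more than once, Map.fromList keeps the last value, whereas lookup returns the first.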

You could write something like this
fromGraph :: [(Int, b)] -> Int -> b
fromGraph g i = snd (g !! i)
This would only work for Int indices, and would also assume that for every i, the element in the graph g at g !! i would have index i as well. If you want to do it a little more generically, you could write this:
fromGraph :: Eq a => [(a, b)] -> a -> b
fromGraph g i = snd $ head $ filter ((==i) . fst) g
Note that this would still throw an error if you try to use this function with an index that is not part of the graph.
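If the runtime error is a concern, one option is to pass a fallback value. The following is a minimal sketch under that assumption; fromGraphOr is a hypothetical name.

```haskell
import Data.Maybe (fromMaybe)

-- Total variant: fall back to a caller-supplied default instead of crashing.
-- fromGraphOr is a hypothetical name introduced for this sketch.
fromGraphOr :: Eq a => b -> [(a, b)] -> a -> b
fromGraphOr def g i = fromMaybe def (lookup i g)
```

Like the filter-based version, this returns the value of the first matching pair; for an index that is absent from the graph it returns def.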

Given a list, how can I perform some transformation only on sub-lists whose each two elements satisfy a binary predicate?

(In my actual use case I have a list of type [SomeType], SomeType having a finite number of constructors, all nullary; in the following I'll use String instead of [SomeType] and use only 4 Chars, to simplify a bit.)
I have a list like this "aaassddddfaaaffddsssadddssdffsdf" where each element can be one of 'a', 's', 'd', 'f', and I want to do some further processing on each contiguous sequence of non-as, let's say turning them upper case and reversing the sequence, thus obtaining "aaaFDDDDSSaaaSSSDDFFaFDSFFDSSDDD". (I've added the reversing requirement to make it clear that the processing involves all the contiguous non 'a'-s at the same time.)
To turn a sub-String upper case and reverse it, I can use this:
func :: String -> String
func = reverse . map Data.Char.toUpper
But how do I run that func only on the sub-Strings of non-'a's?
My first thought is that Data.List.groupBy can be useful, and the overall solution could be:
concat $ map (\x -> if head x == 'a' then x else func x)
       $ Data.List.groupBy ((==) `on` (== 'a')) "aaassddddfaaaffddsssadddssdffsdf"
This solution, however, does not convince me, as I'm using == 'a' both when grouping (which to me seems good and unavoidable) and when deciding whether I should turn a group upper case.
I'm looking for advice on how I can accomplish this small task in the best way.

You could classify the list elements by the predicate before grouping. Note that I’ve reversed the sense of the predicate to indicate which elements are subject to the transformation, rather than which elements are preserved.
{-# LANGUAGE ScopedTypeVariables #-}
import Control.Arrow ((&&&))
import Data.Function (on)
import Data.List (groupBy)
import Data.Monoid (First(..))
mapSegmentsWhere
  :: forall a. (a -> Bool) -> ([a] -> [a]) -> [a] -> [a]
mapSegmentsWhere p f
  = concatMap (applyMatching . sequenceA) -- [a]
  . groupBy ((==) `on` fst)               -- [[(First Bool, a)]]
  . map (First . Just . p &&& id)         -- [(First Bool, a)]
  where
    applyMatching :: (First Bool, [a]) -> [a]
    applyMatching (First (Just matching), xs)
      = applyIf matching f xs
applyIf :: forall a. Bool -> (a -> a) -> a -> a
applyIf condition f
  | condition = f
  | otherwise = id
Example use:
> mapSegmentsWhere (/= 'a') (reverse . map toUpper) "aaassddddfaaaffddsssadddssdffsdf"
"aaaFDDDDSSaaaSSSDDFFaFDSFFDSSDDD"
Here I use the First monoid with sequenceA to merge the lists of adjacent matching elements from [(Bool, a)] to (Bool, [a]), but you could just as well use something like map (fst . head &&& map snd). You can also skip the ScopedTypeVariables if you don’t want to write the type signatures; I just included them for clarity.
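For comparison, the variant without First might look roughly like this. It is a sketch, not the answer's code: mapSegmentsWhere' and example are names chosen here, and head is safe only because groupBy never produces an empty group.

```haskell
import Control.Arrow ((&&&))
import Data.Char (toUpper)
import Data.Function (on)
import Data.List (groupBy)

-- Variant that tags each element with a plain Bool and collapses each
-- group with (fst . head &&& map snd) instead of First and sequenceA.
mapSegmentsWhere' :: (a -> Bool) -> ([a] -> [a]) -> [a] -> [a]
mapSegmentsWhere' p f
  = concatMap applyMatching
  . map (fst . head &&& map snd)   -- [(Bool, [a])]
  . groupBy ((==) `on` fst)        -- [[(Bool, a)]]
  . map (p &&& id)                 -- [(Bool, a)]
  where
    applyMatching (matching, xs)
      | matching  = f xs
      | otherwise = xs

-- The example string from the question.
example :: String
example =
  mapSegmentsWhere' (/= 'a') (reverse . map toUpper)
    "aaassddddfaaaffddsssadddssdffsdf"
```

Here example should evaluate to the same string as the First-based version, "aaaFDDDDSSaaaSSSDDFFaFDSFFDSSDDD".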

If we need to remember the difference between the 'a's and the rest, let's put them in different branches of an Either. In fact, let's define a newtype now that we are at it:
{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE ViewPatterns #-}
import Data.Bifoldable
import Data.Bifunctor
import Data.Char
import Data.List
newtype Bunched a b = Bunched [Either a b] deriving (Functor, Foldable)
instance Bifunctor Bunched where
  bimap f g (Bunched b) = Bunched (fmap (bimap f g) b)
instance Bifoldable Bunched where
  bifoldMap f g (Bunched b) = mconcat (fmap (bifoldMap f g) b)
fmap will let us work over the non-separators. fold will return the concatenation of the non-separators, bifold will return the concatenation of everything. Of course, we could have defined separate functions unrelated to Foldable and Bifoldable, but why avoid already existing abstractions?
To split the list, we can use an unfoldr that alternately searches for as and non-as with the span function:
splitty :: Char -> String -> Bunched String String
splitty c str = Bunched $ unfoldr step (True, str)
  where
    step (_, []) = Nothing
    step (True, span (== c) -> (as, ys)) = Just (Left as, (False, ys))
    step (False, span (/= c) -> (xs, ys)) = Just (Right xs, (True, ys))
Putting it to work:
ghci> bifold . fmap func . splitty 'a' $ "aaassddddfaaaffddsssadddssdffsdf"
"aaaFDDDDSSaaaSSSDDFFaFDSFFDSSDDD"
Note: Bunched is actually the same as Tannen [] Either from the bifunctors package, if you don't mind the extra dependency.

There are other answers here, but I think they get too excited about iteration abstractions. A manual recursion, alternately taking things that match the predicate and things that don't, makes this problem exquisitely simple:
onRuns :: Monoid m => (a -> Bool) -> ([a] -> m) -> ([a] -> m) -> [a] -> m
onRuns p = go p (not . p) where
  go _ _ _ _ [] = mempty
  go p p' f f' xs = case span p xs of
    (ts, rest) -> f ts `mappend` go p' p f' f rest
Try it out in ghci:
Data.Char> onRuns ('a'==) id (reverse . map toUpper) "aaassddddfaaaffddsssadddssdffsdf"
"aaaFDDDDSSaaaSSSDDFFaFDSFFDSSDDD"
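Because onRuns is written against Monoid rather than lists, the same traversal can also produce a summary instead of a string. As a sketch (countNonAs is a hypothetical helper, and onRuns is repeated with renamed local variables so the block stands alone), here it counts the elements outside the runs of 'a':

```haskell
import Data.Monoid (Sum (..))

-- onRuns reproduced from the answer above so the sketch is self-contained.
onRuns :: Monoid m => (a -> Bool) -> ([a] -> m) -> ([a] -> m) -> [a] -> m
onRuns p = go p (not . p)
  where
    go _ _ _ _ [] = mempty
    go q q' f f' xs = case span q xs of
      -- handle the run matching q, then swap roles for the rest
      (ts, rest) -> f ts <> go q' q f' f rest

-- Hypothetical helper: count the elements that lie outside runs of 'a'.
-- Runs of 'a' contribute mempty (zero); other runs contribute their length.
countNonAs :: String -> Int
countNonAs = getSum . onRuns (== 'a') (const mempty) (Sum . length)
```

On the example string from the question, which has 32 characters of which 7 are 'a', countNonAs should give 25.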

Here is a simple solution - function process below - that only requires that you define two functions isSpecial and func. Given a constructor from your type SomeType, isSpecial determines whether it is one of those constructors that form a special sublist or not. The function func is the one you included in your question; it defines what should happen with the special sublists.
The code below is for character lists. Just change isSpecial and func to make it work for your lists of constructors.
import Data.Char (toUpper)
isSpecial c = c /= 'a'
func = reverse . map toUpper
turn = map (\x -> ([x], isSpecial x))
amalgamate [] = []
amalgamate [x] = [x]
amalgamate ((xs, xflag) : (ys, yflag) : rest)
  | xflag /= yflag = (xs, xflag) : amalgamate ((ys, yflag) : rest)
  | otherwise = amalgamate ((xs++ys, xflag) : rest)
work = map (\(xs, flag) -> if flag then func xs else xs)
process = concat . work . amalgamate . turn
Let's try it on your example:
*Main> process "aaassddddfaaaffddsssadddssdffsdf"
"aaaFDDDDSSaaaSSSDDFFaFDSFFDSSDDD"
*Main>
Applying one function at a time shows the intermediate steps taken:
*Main> turn "aaassddddfaaaffddsssadddssdffsdf"
[("a",False),("a",False),("a",False),("s",True),("s",True),("d",True),
("d",True),("d",True),("d",True),("f",True),("a",False),("a",False),
("a",False),("f",True),("f",True),("d",True),("d",True),("s",True),
("s",True),("s",True),("a",False),("d",True),("d",True),("d",True),
("s",True),("s",True),("d",True),("f",True),("f",True),("s",True),
("d",True),("f",True)]
*Main> amalgamate it
[("aaa",False),("ssddddf",True),("aaa",False),("ffddsss",True),
("a",False),("dddssdffsdf",True)]
*Main> work it
["aaa","FDDDDSS","aaa","SSSDDFF","a","FDSFFDSSDDD"]
*Main> concat it
"aaaFDDDDSSaaaSSSDDFFaFDSFFDSSDDD"
*Main>

We can just do what you describe, step by step, getting a clear simple minimal code which we can easily read and understand later on:
import Data.Function (on)
import Data.List (groupBy)
foo :: (a -> Bool) -> ([a] -> [a]) -> [a] -> [a]
foo p f xs = [ a
             | g <- groupBy ((==) `on` fst)
                      [(p x, x) | x <- xs]      -- [ (True, 'a'), ... ]
             , let (t:_, as) = unzip g          -- ( [True, ...], "aaa" )
             , a <- if t then as else (f as) ]  -- final concat
-- unzip :: [(b, a)] -> ([b], [a])
We break the list into same-p spans and unpack each group with the help of unzip. Trying it out:
> foo (=='a') reverse "aaabcdeaa"
"aaaedcbaa"
So no, using == 'a' is avoidable and hence not especially good, introducing an unnecessary constraint on your data type when all we need is equality on Booleans.

Program takes a list of pair and return it ordered from small to big

So I wrote this program, which takes a key-value list and returns the pair with the biggest value. Now I want the program to return the original list, but ordered from the smallest value to the biggest; for example, [(4,3), (2,10), (2,1)] should give [(2,1), (4,3), (2,10)].
vector :: Ord a => [(t, a)] -> (t, a)
vector (x:xs) = maxTail x xs
  where maxTail currentMax [] = currentMax
        maxTail (m, n) (p:ps)
          | n < (snd p) = maxTail p ps
          | otherwise = maxTail (m, n) ps
I tried to do this
vec xs = [ vector tt | tt <- xs]
but it does not work, because vec then expects a list of lists of pairs, whereas vector takes a list of pairs.
How can I modify it to do what I want?
@leftaroundabout helped with this function:
import Data.List (sortOn)
sortOnSnd :: Ord b => [(a, b)] -> [(a, b)]
sortOnSnd = sortOn snd
Now, if two values are equal, I want the program to order those pairs by key; for example, [(42,3), (3,3), (4,1)] -> [(4,1), (3,3), (42,3)].
Here is what I did:
import Data.List (sortOn)
sortOnSnd :: Ord b => [(a, b)] -> [(a, b)]
sortOnSnd = sortOn snd
if sortOn snd == sortOn snd then sortOn fst
It gives an error in the main IO function.

First, whatever gave you the idea that it would be a good idea to call this function “vector”?? Let's make it instead
import Data.List (maximumBy)
import Data.Ord (comparing)
maximumOnSnd :: Ord a => [(t, a)] -> (t, a)
maximumOnSnd = maximumBy $ comparing snd
Now, you could use that for implementing a sorting function, but it will be absurdly inefficient – not just because it's an O(n^2) algorithm, also because it requires some unnecessary extra list traversals. In particular, it's not enough to just find the highest element, you actually need to extract it. One possibility (both inefficient and unreliable) is to filter out the one you found:
extrMaximumOnSnd :: (Ord a, Eq t) => [(t, a)] -> ((t, a), [(t, a)])
extrMaximumOnSnd l = (maxMember, filter (/=maxMember) l)
  where maxMember = maximumOnSnd l
Given that, you can then recursively implement the selection sort:
sortOnSnd :: (Ord a, Eq t) => [(t, a)] -> [(t, a)]
sortOnSnd [] = []
sortOnSnd l = case extrMaximumOnSnd l of
  (maxMember, others) -> sortOnSnd others ++ [maxMember]
But again, keep in mind that this is very inefficient, less general than it should be (Eq t constraint) and unreliable, in particular when the list contains duplicate elements.
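To see where the duplicate problem comes from: filter (/= maxMember) drops every copy of the maximum, so a list containing the same pair twice comes back shorter. A possible workaround, sketched here with hypothetical names (extractMaxOnSnd, selectionSortOnSnd), is to remove only the first occurrence with Data.List.delete. It is still quadratic and still needs Eq t.

```haskell
import Data.List (delete, maximumBy)
import Data.Ord (comparing)

-- Sketch: take out exactly one copy of the maximum, so duplicates survive.
-- Both names below are hypothetical and only used for this illustration.
extractMaxOnSnd :: (Ord a, Eq t) => [(t, a)] -> ((t, a), [(t, a)])
extractMaxOnSnd l = (maxMember, delete maxMember l)
  where
    maxMember = maximumBy (comparing snd) l

-- Selection sort built on top of it; the empty list is handled first,
-- so maximumBy is never called on an empty list.
selectionSortOnSnd :: (Ord a, Eq t) => [(t, a)] -> [(t, a)]
selectionSortOnSnd [] = []
selectionSortOnSnd l = case extractMaxOnSnd l of
  (maxMember, others) -> selectionSortOnSnd others ++ [maxMember]
```

With this version, an input such as [(1,3), (1,3), (0,1)] keeps both copies of (1,3).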
A much better solution is
import Data.List (sortOn)
sortOnSnd :: Ord a => [(t, a)] -> [(t, a)]
sortOnSnd = sortOn snd
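For the follow-up in the question (ordering by key when two values are equal), one way is to chain two comparisons, so that the second is consulted only on a tie. This is a sketch; sortOnSndThenFst is a hypothetical name, and it additionally assumes the keys have an Ord instance.

```haskell
import Data.List (sortBy)
import Data.Ord (comparing)

-- Sort by value first; when two values are equal, fall back to the key.
-- (<>) on comparison functions tries the left comparison and uses the
-- right one only if the left returns EQ.
sortOnSndThenFst :: (Ord t, Ord a) => [(t, a)] -> [(t, a)]
sortOnSndThenFst = sortBy (comparing snd <> comparing fst)
```

On the question's example, [(42,3), (3,3), (4,1)], this gives [(4,1), (3,3), (42,3)].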

List of pairs into pair of Lists Haskell

Basically I have this exercise:
Using list comprehensions, write a polymorphic function:
split :: [(a, b)] -> ([a], [b])
which transforms a list of pairs (of any types) into a pair of lists. For example,
split [(1, 'a'), (2, 'b'), (3, 'c')] = ([1, 2, 3], "abc")
This was the way I wrote the function but it is not working:
split :: [(a, b)] -> ([a], [b])
split listOfPairs = (([a | a <- listOfPairs]), ([b | b <- listOfPairs]))
Can someone please explain why my solution doesn't work? Thank you!

A list comprehension like:
[a | a <- listOfPairs]
is actually nothing more than an identity operation for lists. It will yield the same list as the one you provide, since you basically iterate over listOfPairs, and for each iteration, you yield the element a.
Haskell does not perform implicit conversions, so it will not infer from the types that the a in your a <- listOfPairs can only be the first element. Even if that were possible, it would probably not be a good idea anyway, since it would make the language more "unstable", in the sense that a small change in the types could have a significant impact on the semantics.
In order to obtain the first element of a tuple, you need to use pattern matching, like:
[a | (a, _) <- listOfPairs]
Here we pattern match the first element of the tuple with a; for the second one, we use:
[b | (_, b) <- listOfPairs]
We can thus implement this as:
split:: [(a,b)] -> ([a],[b])
split listOfPairs = ([a | (a, _) <- listOfPairs], [b | (_, b) <- listOfPairs])
Or we can use map :: (a -> b) -> [a] -> [b], fst :: (a, b) -> a and snd :: (a, b) -> b:
split:: [(a,b)] -> ([a],[b])
split listOfPairs = (map fst listOfPairs, map snd listOfPairs)
But the above still has a problem: here we iterate twice independently over the same list. We can avoid that by using recursion, like:
split:: [(a,b)] -> ([a],[b])
split [] = ([], [])
split ((a, b):xs) = (a:as, b:bs)
  where (as, bs) = split xs
or we can use a foldr function:
split :: Foldable f => f (a,b) -> ([a],[b])
split = foldr (\(a,b) (as,bs) -> (a:as,b:bs)) ([],[])
There is already a Haskell function that does exactly what you want: unzip :: [(a, b)] -> ([a], [b]).
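A quick sketch of unzip on the example from the exercise (splitExample is just a name chosen here):

```haskell
-- unzip is exported by the Prelude, so no import is needed.
splitExample :: ([Int], String)
splitExample = unzip [(1, 'a'), (2, 'b'), (3, 'c')]
```

This evaluates to ([1, 2, 3], "abc"), matching the result the exercise asks for.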

Haskell: change all indices from a list to some value

If I am given a list of objects and another list for some indices from this list, is there an easy way to change every object in this list with an index from the list of indices to a different value?
E.g. I am hoping there exists some function f such that
f 0 [4,2,5] [6,5,8,4,3,6,2,7]
would output
[6,5,0,4,0,0,2,7]

Here is a beautiful version that uses lens:
import Control.Lens
f :: a -> [Int] -> [a] -> [a]
f x is = elements (`elem` is) .~ x
Here is an efficient version that doesn't have any dependencies other than base. Basically, we start by sorting (and removing duplicates from the) indices list. That way, we don't need to scan the whole list for every replacement.
import Data.List
f :: a -> [Int] -> [a] -> [a]
f x is xs = snd $ mapAccumL go is' (zip xs [0..])
  where
    is' = map head . group . sort $ is
    go [] (y,_) = ([],y)
    go (i:is) (y,j) = if i == j then (is,x) else (i:is,y)

You can define a helper function to replace a single value and then use it to fold over your list.
replaceAll :: a -> [Int] -> [a] -> [a]
replaceAll repVal indices values = foldl (replaceValue repVal) values indices
  where replaceValue val vals index = (take index vals) ++ [val] ++ (drop (index + 1) vals)

Sort the indices first. Then you can traverse the two lists in tandem.
{-# LANGUAGE ScopedTypeVariables #-}
import Prelude (Eq, Enum, Num, Ord, snd, (==), (<$>))
import Data.List (head, group, sort, zip)
f :: forall a. (Eq a, Enum a, Num a, Ord a) => a -> [a] -> [a] -> [a]
f replacement indices values =
  go (head <$> group (sort indices)) (zip [0..] values)
  where
    go :: [a] -> [(a, a)] -> [a]
    go [] vs = snd <$> vs
    go _ [] = []
    go (i:is) ((i', v):vs) | i == i' = replacement : go is vs
    go is (v:vs) = snd v : go is vs
The sorting incurs an extra log factor on the length of the index list, but the rest is linear.
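A different technique, not taken from the answers here, is to skip the sorting and put the indices into an IntSet, then test each position for membership. The sketch below assumes zero-based Int indices; replaceAt is a hypothetical name.

```haskell
import qualified Data.IntSet as IntSet

-- Alternative sketch: store the indices in an IntSet and walk the list once,
-- replacing the value at every position whose index is in the set.
replaceAt :: a -> [Int] -> [a] -> [a]
replaceAt x is = zipWith pick [0 ..]
  where
    indexSet = IntSet.fromList is
    pick i v
      | i `IntSet.member` indexSet = x
      | otherwise                  = v
```

For the question's example, replaceAt 0 [4,2,5] [6,5,8,4,3,6,2,7] gives [6,5,0,4,0,0,2,7]. Negative or out-of-range indices are simply ignored.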
