How can I use filter to find neighboring words in a text? - haskell

As context: prior to this, I wrote a function that counts how many times each pair of words occurs in a text. It tallies every single pair of words throughout the text, like so:
pairCounter = map (\x -> (head x,length x)). groupTuples . sort
This function returns [((String, String), Int)]: the first and second strings are word 1 and word 2 of the pair, and the Int is how many times the pair can be found, or the "tally" of the pair if you will.
What I now would like to do is create a function that only returns the "neighbors" to any given word. For instance:
neighbours [(("red","car"),2),(("house","red"),1)] "red"
should return [("car",2),("house",1)] or some reordering of this list.
So basically, we have established all pairs for any given word, but now I want to single out only the neighbors of this word, along with a tally of how often each one occurs.
So far, I have thought about using filters in this way:
filter (\(x, y) -> x /= c || y /= c)
--^ (I have no idea if this is syntactically correct, but it is just to give myself an idea of where to start)
However I find it hard to come up with a way to use filters and also include the tally of my neighbors, my Int argument that is.

One very idiomatic way would be via a list comprehension:
neighbours :: Eq a => [((a, a), b)] -> a -> [(a, b)]
neighbours items query =
  [ (neighbor, count)
  | ((s1, s2), count) <- items
  , (self, neighbor) <- [(s1, s2), (s2, s1)]
  , self == query
  ]
Actually, I'd probably put the arguments in the other order to match conventions used in existing libraries and shorten the names so that it comfortably fits on one line:
neighbours :: Eq a => a -> [((a, a), b)] -> [(a, b)]
neighbours x xs = [(x4, n) | ((x1, x2), n) <- xs, (x3, x4) <- [(x1, x2), (x2, x1)], x3 == x]
I suspect that the part where you don't care about order will come up in other parts of your code, and so additionally I would consider splitting out a part that symmetrizes. This will also be helpful if, later, you decide that pairs that occur in both orders should be normalized and their counts summed or some such thing, because you will only have to change one location to propagate that update to all consumers.
-- (s)ource, (t)arget, (u)ndirected edge, (w)eight, (w)eighted edge(s)
undirected :: [((a, a), b)] -> [((a, a), b)]
undirected ws = [(u, w) | ((s, t), w) <- ws, u <- [(s, t), (t, s)]]
neighbours :: Eq a => a -> [((a, a), b)] -> [(a, b)]
neighbours x xs = [(t, w) | ((s, t), w) <- undirected xs, s == x]
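As a quick check, the symmetrized version can be run against the example from the question. This is only a minimal sketch; the name example is made up for illustration and is not part of the original code.

```haskell
-- Symmetrize a list of weighted pairs: every edge appears in both directions.
undirected :: [((a, a), b)] -> [((a, a), b)]
undirected ws = [(u, w) | ((s, t), w) <- ws, u <- [(s, t), (t, s)]]

-- All neighbours of x, each with the tally of the pair it came from.
neighbours :: Eq a => a -> [((a, a), b)] -> [(a, b)]
neighbours x xs = [(t, w) | ((s, t), w) <- undirected xs, s == x]

-- Hypothetical sample data, copied from the question's example.
example :: [((String, String), Int)]
example = [(("red", "car"), 2), (("house", "red"), 1)]

-- neighbours "red" example evaluates to [("car",2),("house",1)]
```

Note that a pair such as ("red","red") would be reported twice by this version, since both orientations match.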
Alternately, you might decide to make the graph undirected from the very beginning.
{-# LANGUAGE DeriveTraversable #-}
import qualified Data.Map as M
-- export UPair the type but not UPair the constructor
data UPair a = UPair a a
  deriving (Eq, Ord, Read, Show, Functor, Foldable, Traversable)
-- this is strict. can also make a lazy variant
upair :: Ord a => a -> a -> UPair a
upair a a' = if a < a' then UPair a a' else UPair a' a
-- Ord a is needed both by upair and by the Map keys
pairCounter :: Ord a => [a] -> M.Map (UPair a) Int
pairCounter as = M.fromListWith (+) $ zipWith (\a a' -> (upair a a', 1)) as (tail as)
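To see how a lookup might work on top of this representation, here is a hedged sketch. The helper neighboursU is hypothetical (it does not appear in the answer), the deriving list is trimmed to Eq, Ord, Show so that no extension is needed, and drop 1 stands in for tail.

```haskell
import qualified Data.Map as M

-- An unordered pair; the smart constructor keeps the smaller element first.
data UPair a = UPair a a
  deriving (Eq, Ord, Show)

upair :: Ord a => a -> a -> UPair a
upair a a' = if a < a' then UPair a a' else UPair a' a

-- Tally every pair of adjacent items, ignoring the order within a pair.
pairCounter :: Ord a => [a] -> M.Map (UPair a) Int
pairCounter as = M.fromListWith (+) (zipWith (\a a' -> (upair a a', 1)) as (drop 1 as))

-- Hypothetical helper: the neighbours of x, read back out of the map.
-- A pair of two identical words is listed only once.
neighboursU :: Eq a => a -> M.Map (UPair a) Int -> [(a, Int)]
neighboursU x m =
  [ (other, n)
  | (UPair a b, n) <- M.toList m
  , (self, other) <- if a == b then [(a, b)] else [(a, b), (b, a)]
  , self == x
  ]
```

With this, neighboursU "red" (pairCounter (words "the red car the red house")) evaluates to [("car",1),("house",1),("the",2)]: "the red" occurs twice, and the order inside each pair no longer matters.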

For a given word c you should thus retain the items for which the first String or the second String is equal to c. We should use ((s1, s2), v) as the pattern, since the outer 2-tuple has a 2-tuple of Strings as its first item and an Int as its second item.
We can work with concatMap :: Foldable t => (a -> [b]) -> t a -> [b] and a function that returns [(s2, v)] if s1 matches, [(s1, v)] if s2 matches, and the empty list if neither of the two elements matches. We thus filter with:
neighbors :: (Foldable f, Eq a) => f ((a, a), b) -> a -> [(a, b)]
neighbors items query = concatMap f items
  where f ((s1, s2), v)
          | query == s1 = [(s2, v)]
          | query == s2 = [(s1, v)]
          | otherwise = []

Related

Is there a way of folding from right to left, removing tuples that aren't within one move away?

I've got a depth-first search that returns all the paths it goes down rather than just a path from start to finish. For example
from (1,1) to (3,5) it returns this list:
[(1,1),(1,2),(1,3),(2,3),(3,3),(4,3),(5,3),(5,2),(5,1),(4,1),(3,1),(5,4),(5,5),(4,5),(3,5)]
What I really want is:
[(1,1),(1,2),(1,3),(2,3),(3,3),(4,3),(5,3),(5,4),(5,5),(4,5),(3,5)]
(5,3) and (5,4) are neighbours whereas (3,1) and (5,4) are not.
[(5,3),***(5,2),(5,1),(4,1),(3,1),***(5,4)]
Could I fold through accumulating all the tuples that aren't neighbours and remove them from the list?
Start by simply making a check to see if two tuples are within 1 of each other:
neighbors :: (Num a, Ord a) => (a, a) -> (a, a) -> Bool
neighbors (a, b) (c, d) = abs (a - c) <= 1 && abs (b - d) <= 1
Note that if diagonal moves are not allowed (that is, you can't be 1 away in both directions at the same time), consider using the following implementation instead:
neighbors :: (Num a, Ord a) => (a, a) -> (a, a) -> Bool
neighbors (a, b) (c, d) = diff1 + diff2 <= 1
  where diff1 = abs (a - c)
        diff2 = abs (b - d)
Now you can fold where the accumulator is the final list, so the head of the accumulator would be the element you want to be neighbors with:
neighborsOnly :: (Num a, Ord a) => [(a, a)] -> [(a, a)]
neighborsOnly list = foldr (\val acc ->
    if null acc || neighbors (head acc) val
      then val:acc
      else acc) [] list
The null acc check is for the first iteration, where the accumulator is still an empty list, in which case it should always be considered a neighbor.
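Putting the pieces together on the path from the question gives a small runnable sketch. It assumes the stricter check without diagonal moves, and it replaces the null/head test with a pattern match on the accumulator; dfsPath is an illustrative name for the question's list.

```haskell
-- Two cells are neighbours when they are at most one step apart,
-- with diagonal moves not allowed (an assumption of this sketch).
neighbors :: (Num a, Ord a) => (a, a) -> (a, a) -> Bool
neighbors (a, b) (c, d) = abs (a - c) + abs (b - d) <= 1

-- Fold from the right, keeping a cell only if it is adjacent to the
-- most recently kept cell (the head of the accumulator).
neighborsOnly :: (Num a, Ord a) => [(a, a)] -> [(a, a)]
neighborsOnly = foldr step []
  where
    step val [] = [val]
    step val acc@(next : _)
      | neighbors next val = val : acc
      | otherwise          = acc

-- The depth-first search output quoted in the question.
dfsPath :: [(Int, Int)]
dfsPath =
  [ (1,1),(1,2),(1,3),(2,3),(3,3),(4,3),(5,3),(5,2),(5,1),(4,1),(3,1)
  , (5,4),(5,5),(4,5),(3,5) ]
```

Here neighborsOnly dfsPath drops the dead-end run (5,2),(5,1),(4,1),(3,1) and returns the path the question asks for. It is a heuristic, though: a dead-end cell that happens to sit next to the kept path would survive the fold.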

Program takes a list of pair and return it ordered from small to big

So I wrote this program that takes a key-value list and returns the pair with the biggest value. Now I want the program to return the original list, but ordered with the smallest value first and the biggest last. For example, [(4,3), (2,10), (2,1)] should return [(2,1), (4,3), (2,10)].
vector :: Ord a => [(t, a)] -> (t, a)
vector (x:xs) = maxTail x xs
  where maxTail currentMax [] = currentMax
        maxTail (m, n) (p:ps)
          | n < (snd p) = maxTail p ps
          | otherwise = maxTail (m, n) ps
I tried to do this
vec xs = [ vector tt | tt <- xs]
but does not work because vector function brings back list of lists of pair and vector function takes lists of pair.
how can I modify it to do what I want it to do
@leftaroundabout helped with this function:
import Data.List (sortOn)
sortOnSnd :: Ord a => [(a, b)] -> [(a, b)]
sortOnSnd = sortOn snd
Now, if two values are equal, I want the program to order those pairs by key. For example, [(42,3), (3,3), (4,1)] -> [(4,1), (3,3), (42,3)]
Here is what I did:
import Data.List (sortOn)
sortOnSnd :: Ord a => [(a, b)] -> [(a, b)]
sortOnSnd = sortOn snd
if sortOn snd == sortOn snd then sortOn fst
It gives an error in the main IO function.
First, whatever gave you the idea that it would be a good idea to call this function “vector”?? Let's make it instead
import Data.List (maximumBy)
import Data.Ord (comparing)
maximumOnSnd :: Ord a => [(t, a)] -> (t, a)
maximumOnSnd = maximumBy $ comparing snd
Now, you could use that for implementing a sorting function, but it will be absurdly inefficient: not just because it's an O(n²) algorithm, but also because it requires some unnecessary extra list traversals. In particular, it's not enough to just find the highest element, you actually need to extract it. One possibility (both inefficient and unreliable) is to filter out the one you found:
extrMaximumOnSnd :: (Ord a, Eq t) => [(t, a)] -> ((t, a), [(t, a)])
extrMaximumOnSnd l = (maxMember, filter (/=maxMember) l)
  where maxMember = maximumOnSnd l
Given that, you can then recursively implement the selection sort:
sortOnSnd :: (Ord a, Eq t) => [(t, a)] -> [(t, a)]
sortOnSnd [] = []
sortOnSnd l = case extrMaximumOnSnd l of
  (maxMember, others) -> sortOnSnd others ++ [maxMember]
But again, keep in mind that this is very inefficient, less general than it should be (Eq t constraint) and unreliable, in particular when the list contains duplicate elements.
A much better solution is
import Data.List (sortOn)
sortOnSnd :: Ord a => [(t, a)] -> [(t, a)]
sortOnSnd = sortOn snd
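For the follow-up about ties, one possible sketch (not part of the answer above) compares on the value first and falls back to the key only when the values are equal. The name sortOnSndThenFst is made up for illustration, and the extra Ord constraint on the key is an assumption the tie-break requires.

```haskell
import Data.List (sortBy)
import Data.Ord (comparing)

-- Sort by the second component; break ties using the first component.
-- (<>) on comparison functions tries the left comparison first and only
-- consults the right one when the left yields EQ.
sortOnSndThenFst :: (Ord k, Ord v) => [(k, v)] -> [(k, v)]
sortOnSndThenFst = sortBy (comparing snd <> comparing fst)
```

With this, sortOnSndThenFst [(42,3), (3,3), (4,1)] evaluates to [(4,1),(3,3),(42,3)], matching the example in the question.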

Haskell and comprehension lists

I'm writing a function that compares two vectors in Haskell using list comprehensions. The thing is that I want to add booleans to my final list, but Haskell interprets this code as: if x == y, add the element to the list (that's how list comprehensions work, I know). What I want is a list of booleans saying whether the coordinates I'm comparing are equal or not.
Is it possible to do this with comprehension lists?
igualdad :: Vector -> Vector -> [Bool]
igualdad v1 v2 = [ x == y | x <- xs, y <- ys]
  where xs = vectorToFloatList v1
        ys = vectorToFloatList v2
PS: I'm going to use foldr (&&) True on the list that igualdad returns, in order to get the final result that I want.
Thanks.
What I want is a list with booleans if the coordinates I'm comparing are True or False. Is it possible to do this with comprehension lists?
You get such a list. For two Vectors v and w with lengths m and n respectively, you will get a list with m×n elements, such that the comparison of v_i and w_j ends up in the result list at index i×n + j (counting from zero).
If, however, you want a list of length min(m, n), such that the item at index i checks whether v_i and w_i are the same, then we can make use of zip :: [a] -> [b] -> [(a, b)]:
igualdad :: Vector -> Vector -> [Bool]
igualdad v1 v2 = [ x == y | (x, y) <- zip (vectorToFloatList v1) (vectorToFloatList v2)]
or with zipWith :: (a -> b -> c) -> [a] -> [b] -> [c] and on :: (b -> b -> c) -> (a -> b) -> a -> a -> c:
import Data.Function(on)
igualdad :: Vector -> Vector -> [Bool]
igualdad = on (zipWith (==)) vectorToFloatList
or we can make use of the ParallelListComp extension and write this as:
{-# LANGUAGE ParallelListComp #-}
igualdad :: Vector -> Vector -> [Bool]
igualdad v1 v2 = [ x == y | x <- vectorToFloatList v1 | y <- vectorToFloatList v2]
PS: I'm going to use foldr (&&) True with the list that igualdad returns.
There exists a function for that already: and :: Foldable f => f Bool -> Bool. If, however, you want to check whether all the items are the same, you can just use all :: Foldable f => (a -> Bool) -> f a -> Bool here:
import Data.Function(on)
sameVec :: Vector -> Vector -> Bool
sameVec = on ((all (uncurry (==)) .) . zip) vectorToFloatList
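The difference between the nested comprehension and the zipped one is easiest to see side by side. The sketch below is self-contained, so it invents a three-component Vector and a matching vectorToFloatList; both are stand-ins for the question's definitions, which are not shown. allPairs is likewise a made-up name for the original nested version.

```haskell
-- Hypothetical stand-ins for the question's Vector type and helper.
data Vector = Vector Float Float Float

vectorToFloatList :: Vector -> [Float]
vectorToFloatList (Vector x y z) = [x, y, z]

-- Nested generators: every x is compared with every y (3 * 3 = 9 results).
allPairs :: Vector -> Vector -> [Bool]
allPairs v1 v2 = [x == y | x <- vectorToFloatList v1, y <- vectorToFloatList v2]

-- zipWith: coordinates are compared position by position (3 results).
igualdad :: Vector -> Vector -> [Bool]
igualdad v1 v2 = zipWith (==) (vectorToFloatList v1) (vectorToFloatList v2)

-- Collapse the element-wise comparison into a single Bool.
sameVec :: Vector -> Vector -> Bool
sameVec v1 v2 = and (igualdad v1 v2)
```

For Vector 1 2 3 and Vector 1 2 4, allPairs yields nine booleans while igualdad yields [True,True,False], so sameVec is False.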

List of pairs into pair of Lists Haskell

Basically I have this exercise:
Using list comprehensions, write a polymorphic function:
split :: [(a, b)] -> ([a], [b])
which transforms a list of pairs (of any types) into a pair of lists. For example,
split [(1, 'a'), (2, 'b'), (3, 'c')] = ([1, 2, 3], "abc")
This was the way I wrote the function but it is not working:
split :: [(a, b)] -> ([a], [b])
split listOfPairs = (([a | a <- listOfPairs]), ([b | b <- listOfPairs]))
Can someone please explain why my solution doesn't work? Thank you!
A list comprehension like:
[a | a <- listOfPairs]
is actually nothing more than an identity operation for lists. It will yield the same list as the one you provide, since you basically iterate over listOfPairs, and for each iteration, you yield the element a.
Haskell does not perform implicit conversions, so it does not infer from the types that the a in your a <- listOfPairs can only be the first element. Even if that were possible, it would probably not be a good idea anyway, since it would make the language more "unstable", in the sense that a small change in the types could have a significant impact on the semantics.
In order to obtain the first element of a tuple, you need to use pattern matching, like:
[a | (a, _) <- listOfPairs]
Here we pattern match the first element of the tuple with a; for the second one, we use:
[b | (_, b) <- listOfPairs]
We can thus implement this as:
split:: [(a,b)] -> ([a],[b])
split listOfPairs = ([a | (a, _) <- listOfPairs], [b | (_, b) <- listOfPairs])
Or we can use map :: (a -> b) -> [a] -> [b], fst :: (a, b) -> a and snd :: (a, b) -> b:
split:: [(a,b)] -> ([a],[b])
split listOfPairs = (map fst listOfPairs, map snd listOfPairs)
But the above still has a problem: here we iterate twice independently over the same list. We can avoid that by using recursion, like:
split:: [(a,b)] -> ([a],[b])
split [] = ([], [])
split ((a, b):xs) = (a:as, b:bs)
  where (as, bs) = split xs
or we can use a foldr function:
split :: Foldable f => f (a,b) -> ([a],[b])
split = foldr (\(a,b) (as,bs) -> (a:as,b:bs)) ([],[])
There is already a Haskell function that does exactly what you want: unzip :: [(a, b)] -> ([a], [b]).
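A short sketch to confirm that the hand-written fold and the Prelude's unzip agree on the exercise's example. The signature is specialised to lists here, and pairs is just an illustrative name.

```haskell
-- Single-pass version using foldr, specialised to lists.
split :: [(a, b)] -> ([a], [b])
split = foldr (\(a, b) (as, bs) -> (a : as, b : bs)) ([], [])

-- Sample input from the exercise statement.
pairs :: [(Int, Char)]
pairs = [(1, 'a'), (2, 'b'), (3, 'c')]

-- Both split pairs and unzip pairs evaluate to ([1,2,3],"abc").
```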

Haskell - Counting how many times each distinct element in a list occurs

I'm new to Haskell and am just trying to write a list comprehension to calculate the frequency of each distinct value in a list, but I'm having trouble with the last part..
So far i have this:
frequency :: Eq a => [a] -> [(Int,a)]
frequency list = [(count y list,y) | y <- rmdups ]
Something is wrong with the last part involving rmdups.
The count function takes a character and then a list of characters and tells you how often that character occurs, the code is as follows..
count :: Eq a => a -> [a] -> Int
count x [] = 0
count x (y:ys) | x==y = 1+(count x ys)
               | otherwise = count x ys
Thank-you in advance.
You could also use an associative array / finite map to store the associations from list elements to their count while you compute the frequencies:
import Data.Map (fromListWith, toList)
frequency :: (Ord a) => [a] -> [(a, Int)]
frequency xs = toList (fromListWith (+) [(x, 1) | x <- xs])
Example usage:
> frequency "hello world"
[(' ',1),('d',1),('e',1),('h',1),('l',3),('o',2),('r',1),('w',1)]
See documentation of fromListWith and toList.
I had to use Ord instead of Eq because of the use of sort:
import Data.List (group, sort)
frequency :: Ord a => [a] -> [(Int,a)]
frequency list = map (\l -> (length l, head l)) (group (sort list))
As requested, here's a solution using Control.Arrow:
import Control.Arrow ((&&&))
import Data.List (group, sort)
frequency :: Ord a => [a] -> [(Int,a)]
frequency = map (length &&& head) . group . sort
This is the same function as ThePestest's answer, except
\f g l -> (f l, g l)
is replaced with
-- simplified type signature
(&&&) :: (a -> b) -> (a -> c) -> a -> (b, c)
from Control.Arrow. If you want to avoid the import,
liftA2 (,) :: Applicative f => f a -> f b -> f (a, b)
works as well (using the Applicative instance of (->) r)
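To make the equivalence concrete, here are the three spellings next to each other as a small sketch. The distinct function names are only there so that all three can live in one module.

```haskell
import Control.Applicative (liftA2)
import Control.Arrow ((&&&))
import Data.List (group, sort)

-- Explicit lambda: pair the run length with a representative element.
-- head is safe here because group never produces an empty run.
frequencyLambda :: Ord a => [a] -> [(Int, a)]
frequencyLambda = map (\l -> (length l, head l)) . group . sort

-- The same thing with (&&&) from Control.Arrow.
frequencyArrow :: Ord a => [a] -> [(Int, a)]
frequencyArrow = map (length &&& head) . group . sort

-- The same thing with liftA2 (,) in the function Applicative.
frequencyLiftA2 :: Ord a => [a] -> [(Int, a)]
frequencyLiftA2 = map (liftA2 (,) length head) . group . sort
```

All three return [(1,'e'),(1,'h'),(2,'l'),(1,'o')] for "hello".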
Assuming rmdups has the type
rmdups :: Eq a => [a] -> [a]
Then you're missing a parameter for it.
frequency :: Eq a => [a] -> [(Int,a)]
frequency list = [(count y list,y) | y <- rmdups list]
But the error you're getting would be helpful with diagnosis.
Your rmdups function is just nub from Data.List.
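For completeness, a self-contained sketch of the comprehension with nub in place of rmdups. Here count is rewritten with filter, which should behave the same as the recursive version in the question.

```haskell
import Data.List (nub)

-- How many times x occurs in the list.
count :: Eq a => a -> [a] -> Int
count x = length . filter (== x)

-- One (tally, value) entry per distinct value, in order of first appearance.
frequency :: Eq a => [a] -> [(Int, a)]
frequency list = [(count y list, y) | y <- nub list]
```

For example, frequency "hello" gives [(1,'h'),(1,'e'),(2,'l'),(1,'o')]. This only needs Eq, at the cost of quadratic running time, whereas the sort-based versions need Ord.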
Replacing rmdups with nub list worked for me like a charm.
Hahahaha there is an rmdups on pg. 86 of Programming in Haskell by Graham Hutton. It is perfect and recursive. It is also handy in a great many situations.
Here is my one-line rmdups, and it produces the same results as nub or Hutton's.
rmdups ls = [d|(z,d)<- zip [0..] ls,notElem d $ take z ls]
It can well be used to count distinct elements of a list.
dl = "minimum-maximum"
[ (d,sum [1|x<-dl,d == x]) | d<-rmdups dl]
[('m',6),('i',3),('n',1),('u',2),('-',1),('a',1),('x',1)]

Resources