Count occurrences of a specific word in a sentence in Haskell

I'm new to Haskell and I'm trying to write a simple function that counts the number of occurrences of a substring in a string.
For example: given "There is an apple", I want to count how many times "is" appears in the sentence; in this case the result should be 1.
This is what I've tried:
countOf :: String -> Int
countOf x = length [n | n <- words x, filter "is" x]
According to what I've studied it should work, but it doesn't. I really don't know how to solve the problem, and I also don't know what the error message I get means:
input:1:41:
Couldn't match expected type `Bool' with actual type `[a0]'
In the return type of a call of `filter'
In the expression: filter "a" x
In a stmt of a list comprehension: filter "a" x

The function filter has the type
filter :: (a -> Bool) -> [a] -> [a]
This means that its first argument is another function, which takes an element and returns a Bool, and it applies this function to each element of the second argument. You're giving a String as the first argument instead of a function. Maybe you want something more like
countOf x = length [n | n <- words x, filter (\w -> w == "is") x]
But this won't work either! Any extra expression in a list comprehension has to be a Bool, not a list. filter returns a list of elements, not a Bool, and this is the actual source of your compiler error: it expects a Bool but sees a list of type [a0] (it hasn't even gotten far enough to realize it should be [String]).
Instead, you could do
countOf x = length [n | n <- words x, n == "is"]
And this would be equivalent to
countOf x = length (filter (\w -> w == "is") (words x))
Or with $:
countOf x = length $ filter (\w -> w == "is") $ words x
Haskell will actually let us simplify this even further to
countOf x = length $ filter (== "is") $ words x
Which uses what is known as an operator section. You can then make it completely point free as
countOf = length . filter (== "is") . words
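The versions above all hard-code the word "is". A minimal sketch of a generalized variant, assuming a hypothetical name countWord (not from the original answer) that takes the target word as its first argument:

```haskell
-- countWord is a hypothetical name introduced for illustration.
-- It counts whole-word matches; `words` splits on whitespace only,
-- so punctuation stays attached to the neighbouring word.
countWord :: String -> String -> Int
countWord target = length . filter (== target) . words
```

With this definition, countWord "is" "There is an apple" evaluates to 1. A token such as "is," would not be counted, because words does not strip punctuation.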

I would do it like this:
countOf :: String -> Int
countOf x = length [n | n <- words x, compare "is" n == EQ]
Demo in ghci:
ghci> countOf "There is an apple"
1

You can put the comparison straight in the comprehension:
countOf x = length [n | n <- words x, n == "is"]

Actually, what you are trying to count is the number of occurrences of a word in a string. In case you are looking for a substring:
import Data.List (inits, tails)
countOf = length . filter (=="is") . conSubsequences
  where
    conSubsequences = concatMap inits . tails
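Enumerating every contiguous substring works, but it builds a quadratic number of candidates. A possible alternative, sketched here under the assumption that overlapping matches should each be counted, checks whether the pattern is a prefix of each suffix. The name countSubstring is hypothetical:

```haskell
import Data.List (isPrefixOf, tails)

-- countSubstring is a hypothetical name, not from the original answer.
-- It counts the positions at which `pat` starts inside the input string.
-- Overlapping matches are counted separately.
-- Note: an empty pattern matches at every position, including the end.
countSubstring :: String -> String -> Int
countSubstring pat = length . filter (pat `isPrefixOf`) . tails
```

For example, countSubstring "is" "This is an island" counts the matches inside "This", "is" and "island".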

One could also try a foldr:
countOf :: String -> Int
countOf x = foldr count 0 (words x)
  where
    count w acc = if w == "is" then acc + 1 else acc

Related

Haskell - Exclude lists based on a test in a nested list comprehension

I want to create a series of possible equations based on a general specification:
test = ["12", "34=", "56=", "78"]
Each string (e.g. "12") represents a possible character at that location, in this case '1' or '2'.)
So possible equations from test would be "13=7" or "1=68".
I know the examples I give are not balanced but that's because I'm deliberately giving a simplified short string.
(I also know that I could use 'sequence' to search all possibilities but I want to be more intelligent so I need a different approach explained below.)
What I want is to try fixing each of the equals in turn and then removing all other equals in the equation. So I want:
[["12","=","56","78"],["12","34","=","78"]]
I've written this nested list comprehension:
(it needs: {-# LANGUAGE ParallelListComp #-} )
fixEquals :: [String] -> [[String]]
fixEquals re =
  [ [ if index == outerIndex then equals else remain
    | equals <- map (filter (== '=')) re
    | remain <- map (filter (/= '=')) re
    | index <- [1..]
    ]
  | outerIndex <- [1..length re]
  ]
This produces:
[["","34","56","78"],["12","=","56","78"],["12","34","=","78"],["12","34","56",""]]
but I want to filter out any with empty lists within them. i.e. in this case, the first and last.
I can do:
countOfEmpty :: (Eq a) => [[a]] -> Int
countOfEmpty = length . filter (== [])
fixEqualsFiltered :: [String] -> [[String]]
fixEqualsFiltered re = filter (\x -> countOfEmpty x == 0) (fixEquals re)
so that "fixEqualsFiltered test" gives:
[["12","=","56","78"],["12","34","=","78"]]
which is what I want but it doesn’t seem elegant.
I can’t help thinking there’s another way to filter these out.
After all, it’s whenever "equals" is used in the if statement and is empty that we want to drop the equals, so it seems a waste to build the list (e.g. ["","34","56","78"]) and then ditch it.
Any thoughts appreciated.
I don't know if this is any cleaner than your code, but it might be a bit more clear and maybe more efficient using a recursion:
fixEquals = init . f
f :: [String] -> [[String]]
f [] = [[]]
f (x:xs)
  | '=' `elem` x = ("=" : removeEq xs) : map (removeEq [x] ++) (f xs)
  | otherwise    = map (x:) (f xs)
removeEq :: [String] -> [String]
removeEq = map (filter (/= '='))
The way it works is that, if there's an '=' in the current string, then it splits the return into two, if not just calls recursively. The init is needed as in the last element returned there's no equal in any string.
Finally, I believe you can probably find a better data structure to do what you need to achieve instead of using list of strings
Let
xs = [["","34","56","78"],["12","=","56","78"],["12","34","=","78"],["12","34","56",""]]
in
filter (not . any null) xs
will give
[["12","=","56","78"],["12","34","=","78"]]
If you want list comprehension then do
[x | x <- xs, and [not $ null y | y <- x]]
I think I'd probably do it this way. First, a preliminary that I've written so many times it's practically burned into my fingers by now:
zippers :: [a] -> [([a], a, [a])]
zippers = go [] where
    go _ [] = []
    go b (h:e) = (b,h,e) : go (h:b) e
Probably running it once or twice in ghci will be a more clear explanation of what this does than any English writing I could do:
> zippers "abcd"
[("",'a',"bcd"),("a",'b',"cd"),("ba",'c',"d"),("cba",'d',"")]
In other words, it gives a way of selecting each element of a list in turn, giving the "leftovers" of what was before and after the selection point. Given that tool, here's our plan: we'll nondeterministically choose a String to serve as our equals sign, double-check that we've got an equals sign in the first place, and then clear out the equals from the others. So:
import Control.Monad (guard)
fixEquals ss = do
    (prefix, s, suffix) <- zippers ss
    guard ('=' `elem` s)
    return (reverse (deleteEquals prefix) ++ ["="] ++ deleteEquals suffix)
deleteEquals = map (filter ('='/=))
Let's try it:
> fixEquals ["12", "34=", "56=", "78"]
[["12","=","56","78"],["12","34","=","78"]]
Perfect! But this is just a stepping-stone to actually generating the equations, right? It turns out to be not that hard to go all the way in one step, skipping this intermediate. Let's do that:
equations ss = do
    (prefixes, s, suffixes) <- zippers ss
    guard ('=' `elem` s)
    prefix <- mapM (filter ('='/=)) (reverse prefixes)
    suffix <- mapM (filter ('='/=)) suffixes
    return (prefix ++ "=" ++ suffix)
And we can try it in ghci:
> equations ["12", "34=", "56=", "78"]
["1=57","1=58","1=67","1=68","2=57","2=58","2=67","2=68","13=7","13=8","14=7","14=8","23=7","23=8","24=7","24=8"]
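The zippers helper in this answer keeps the prefix in reversed order, which is why fixEquals and equations call reverse. As a rough alternative sketch, the same selection can be assembled from inits and tails, with the prefix left in its original order. The name zippers' is hypothetical and this is not the answer author's definition:

```haskell
import Data.List (inits, tails)

-- zippers' is a hypothetical variant of the zippers helper.
-- Each result is (elements before, selected element, elements after),
-- with the "before" part in original order rather than reversed.
zippers' :: [a] -> [([a], a, [a])]
zippers' xs = zip3 (inits xs) xs (drop 1 (tails xs))
```

With this variant, zippers' "abcd" yields [("",'a',"bcd"),("a",'b',"cd"),("ab",'c',"d"),("abc",'d',"")], so code built on it would not need the reverse calls. The hand-written go version is cheaper per step, since inits rebuilds each prefix.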
The easiest way to achieve what you want is to create all the combinations and then keep only the ones that are meaningful:
Prelude> test = ["12", "34=", "56=", "78"]
Prelude> sequence test
["1357","1358","1367","1368","13=7","13=8","1457","1458","1467","1468","14=7","14=8","1=57","1=58","1=67","1=68","1==7","1==8","2357","2358","2367","2368","23=7","23=8","2457","2458","2467","2468","24=7","24=8", ...]
Prelude> filter ((1==).length.filter('='==)) $ sequence test
["13=7","13=8","14=7","14=8","1=57","1=58","1=67","1=68","23=7","23=8","24=7","24=8","2=57","2=58","2=67","2=68"]
You pointed out the drawback: imagine we have the following list of strings: ["=", "=", "0123456789", "0123456789"]. We will generate 100 combinations and drop them all.
You can look at the combinations as a tree. For ["12", "34"], you have:
   / \
  1   2
 / \ / \
3  4 3  4
You can prune the tree: just ignore the subtrees when you have two = on the path.
Let's try to do it. First, a simple combinations function:
Prelude> :set +m
Prelude> let combinations :: [String] -> [String]
Prelude|     combinations [] = [""]
Prelude|     combinations (cs:ts) = [c:t | c<-cs, t<-combinations ts]
Prelude|
Prelude> combinations test
["1357","1358","1367","1368","13=7","13=8","1457","1458","1467","1468","14=7","14=8","1=57","1=58","1=67","1=68","1==7","1==8","2357","2358","2367","2368","23=7","23=8","2457","2458","2467","2468","24=7","24=8", ...]
Second, we need a variable to store the current number of = signs met:
if we find a second = sign, just drop the subtree
if we reach the end of a combination with no =, drop the combination
That is:
Prelude> let combinations' :: [String] -> Int -> [String]
Prelude|     combinations' [] n = if n==1 then [""] else []
Prelude|     combinations' (cs:ts) n = [c:t | c<-cs, let p = n+(fromEnum $ c=='='), p <= 1, t<-combinations' ts p]
Prelude|
Prelude> combinations' test 0
["13=7","13=8","14=7","14=8","1=57","1=58","1=67","1=68","23=7","23=8","24=7","24=8","2=57","2=58","2=67","2=68"]
We use p as the new number of = signs on the path: if p > 1, we drop the subtree.
If n is still zero when we reach the end of the list, there is no = sign on the path, so we drop the combination.
You may use the variable n to store more information, eg type of the last char (to avoid +* sequences).
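Outside of ghci, the same pruning idea can be written as an ordinary top-level definition. The following is only a sketch: the name equationsPruned is hypothetical, and it tracks a Bool ("have we already seen an = sign?") instead of the Int counter used above.

```haskell
-- equationsPruned is a hypothetical name; it follows the pruning idea
-- described above but carries a Bool flag instead of a counter.
equationsPruned :: [String] -> [String]
equationsPruned = go False
  where
    -- At the end of the input, keep the path only if an '=' was seen.
    go seen [] = [ "" | seen ]
    go seen (cs:ts) =
      [ c : t
      | c <- cs
      , let isEq = c == '='
      , not (seen && isEq)          -- prune: a second '=' on this path
      , t <- go (seen || isEq) ts
      ]
```

Applied to ["12", "34=", "56=", "78"] this produces the same 16 strings as combinations' test 0, and applied to ["=", "=", "0123456789", "0123456789"] it returns an empty list without ever visiting the digit columns.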

How to destructure a string into first, middle, and last?

I'm writing a function to determine whether a number is a palindrome.
What I would like to do in the first case is to destructure the string into the first character, all the characters in the middle, and the last character. What I do is check if the first character is equal to the last, and then if so, proceed to check the middle characters.
What I have is below, but it generates type errors upon compilation.
numberIsPalindrome :: Int -> Bool
numberIsPalindrome n =
    case nString of
        (x:xs:y) -> (x == y) && numberIsPalindrome xs
        (x:y) -> x == y
        x -> True
    where nString = show n
Using the String representation is cheating...
Not really, but this is more fun:
import Data.List
palindrome n = list == reverse list where
    list = unfoldr f n
    f 0 = Nothing
    f k = Just (k `mod` 10, k `div` 10)
It creates a list of the digits of the number (unfoldr is really useful for such tasks), and then checks whether the list stays the same when reversed.
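To see the digit extraction on its own, here is a small sketch of just the unfoldr step. The name digitsRev is hypothetical, and the sketch assumes a non-negative argument: for a negative number the seed never reaches 0, because div rounds toward negative infinity, so the unfold would not terminate.

```haskell
import Data.List (unfoldr)

-- digitsRev is a hypothetical helper name.
-- It returns the decimal digits of a non-negative number,
-- least significant digit first. Assumes n >= 0.
digitsRev :: Int -> [Int]
digitsRev = unfoldr step
  where
    step 0 = Nothing                        -- nothing left to peel off
    step k = Just (k `mod` 10, k `div` 10)  -- emit last digit, keep the rest
```

For instance, digitsRev 1234 gives [4,3,2,1], and digitsRev 0 gives the empty list, which is trivially equal to its own reverse.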
Your attempt has several problems. For example, a conversion from the number to a String (which is just a list of Char in Haskell) is missing, and lists work completely differently from what you assume: think of them more as stacks, where you usually operate only on one end.
That said, there are init and last functions for lists, which let you work your way from the "outer" elements of the list to the inner ones. A naive (and inefficient) implementation could look like this:
palindrome n = f (show n) where
    f [] = True
    f [_] = True
    f (x : xs) = (x == last xs) && (f (init xs))
But this is only for demonstration purposes; don't use such code in real life...
The definition you probably want is
numberIsPalindrome :: Int -> Bool
numberIsPalindrome num = let str = show num
                         in (str == reverse str)
The (:) operator is known as cons, it prepends items to lists:
1:2:[] results in [1,2]
You are getting a type error because you are trying to compare the first element, a Char, with y, which in the pattern (x:y) is the remaining list (a [Char]) rather than the last element.
If you really want to compare the first element with the last, you would use head and last.
But you are better off using the solution that taktoa proposed:
numberIsPalindrome :: Int -> Bool
numberIsPalindrome num =
    numberString == reverse numberString
    where numberString = show num
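For completeness, the destructuring asked about in the title cannot be done with a single cons pattern, but it can be approximated with a helper. The following sketch uses hypothetical names (firstMiddleLast, isPalindromeList) and relies on init and last, so each step traverses the list and the whole check is quadratic:

```haskell
-- firstMiddleLast and isPalindromeList are hypothetical names.
-- firstMiddleLast splits a list with at least two elements into
-- (first element, middle elements, last element).
firstMiddleLast :: [a] -> Maybe (a, [a], a)
firstMiddleLast (x : rest@(_ : _)) = Just (x, init rest, last rest)
firstMiddleLast _                  = Nothing

-- Lists of length 0 or 1 are palindromes; otherwise compare the ends
-- and recurse on the middle.
isPalindromeList :: Eq a => [a] -> Bool
isPalindromeList xs = case firstMiddleLast xs of
  Nothing        -> True
  Just (a, m, z) -> a == z && isPalindromeList m
```

Here firstMiddleLast "abcd" is Just ('a',"bc",'d'), and isPalindromeList (show 12321) is True. The reverse-based definition remains the simpler choice in practice.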

How to find occurrences of char in an input string in Haskell

I am trying to write a function that will take a String and a Char and output the indexes where the char occurs in the string.
stringCount str ch =
Input : "haskell is hard" `h`
Output:[0,11]
Input : "haskell is hard" `a`
Output:[1,12]
Please help me I'm struggling to understand Haskell.
There are many ways to do this, but since you mention you're a Haskell beginner, a list comprehension may be easiest to understand (I'm assuming this is homework, so you have to implement it yourself, not use elemIndices):
stringCount str ch = [ y | (x, y) <- zip str [0..], x == ch ]
stringCount "haskell is hard" 'a'
-- [1,12]
stringCount "haskell is hard" 'h'
-- [0,11]
Here we zip the string str with the infinite list starting from 0, producing the tuples ('h', 0), ('a', 1), ('s', 2), etc. We then select only the tuples where the character (bound to x) equals the argument ch and return the index (bound to y) for each of them.
If you wanted to keep your current argument order but use elemIndices (from Data.List), you can use the following:
import Data.List (elemIndices)
stringCount' :: String -> Char -> [Int]
stringCount' = flip elemIndices
stringCount' "haskell is hard" 'h'
-- [0,11]
Here is a simpler but less sophisticated solution than the one posted by karakfa:
stringCount :: String -> Char -> Integer -> [Integer]
stringCount [] c _ = []
stringCount (x:xs) c pos
    | x == c    = pos : stringCount xs c (pos+1)
    | otherwise = stringCount xs c (pos+1)
The idea is that you go through the string character by character using recursion and compare the current character (the head at that moment) with the char passed as argument. To keep track of the position I use a counter called pos, and increment it on each recursive call.
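One drawback of the recursive version is that the caller has to pass the starting position. A possible way around that, sketched here with a hypothetical name stringIndices, is to hide the counter in a local helper so the function keeps the two-argument shape from the question:

```haskell
-- stringIndices is a hypothetical name; the local helper `go`
-- carries the position counter so callers do not have to supply it.
stringIndices :: String -> Char -> [Int]
stringIndices str ch = go 0 str
  where
    go _ [] = []
    go pos (x:xs)
      | x == ch   = pos : go (pos + 1) xs  -- match: record this index
      | otherwise = go (pos + 1) xs        -- no match: keep walking
```

With this, stringIndices "haskell is hard" 'h' gives [0,11], matching the expected output in the question.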
You can use elemIndex to walk through the list, or simply write your own:
indexOf x = map fst . filter (\(_,s) -> s==x) . zip [0..]
indexOf 'a' "haskell is hard"
[1,12]
or with findIndices
import Data.List(findIndices)
findIndices (\x -> x=='a') "haskell is hard"
[1,12]

Does Haskell have a takeUntil function?

Currently I am using
takeWhile (\x -> x /= 1 && x /= 89) l
to get the elements from a list up to either a 1 or 89. However, the result doesn't include these sentinel values. Does Haskell have a standard function that provides this variation on takeWhile that includes the sentinel in the result? My searches with Hoogle have been unfruitful so far.
Since you were asking about standard functions: no. There also isn't a package containing a takeWhileInclusive, but it is really simple to write:
takeWhileInclusive :: (a -> Bool) -> [a] -> [a]
takeWhileInclusive _ [] = []
takeWhileInclusive p (x:xs) = x : if p x then takeWhileInclusive p xs
                                         else []
The only thing you need to do is take the value regardless of whether the predicate returns True, and use the predicate only to decide whether to continue:
*Main> takeWhileInclusive (\x -> x /= 20) [10..]
[10,11,12,13,14,15,16,17,18,19,20]
Is span what you want?
(matching, rest) = span (\x -> x /= 1 && x /= 89) l
Then look at the head of rest.
The shortest way I found is to apply span and then pass its result to a function that appends the head of the second component of the resulting tuple to the first component.
The whole expression would look something like this:
(\(f,s) -> f ++ [head s]) $ span (\x -> x /= 1 && x /= 89) [82..140]
The result of this expression is
[82,83,84,85,86,87,88,89]
The first element of the tuple returned by span is the list that takeWhile would return for those parameters, and the second element is the list with the remaining values, so we just add the head from the second list to our first list.
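One caveat with the head-based expression is that it fails when no sentinel is present, because the second component of the tuple is then empty. A minimal sketch that handles that case by pattern matching on the result of span follows; the name takeUntil and the choice to pass a "stop" predicate (true for the sentinel) are assumptions made for illustration:

```haskell
-- takeUntil is a hypothetical name. The predicate says when to stop;
-- the first element satisfying it is included in the result.
takeUntil :: (a -> Bool) -> [a] -> [a]
takeUntil stop xs = case span (not . stop) xs of
  (before, [])           -> before                -- no sentinel found
  (before, sentinel : _) -> before ++ [sentinel]  -- keep the sentinel
```

For example, takeUntil (\x -> x == 1 || x == 89) [82..140] gives [82,83,84,85,86,87,88,89], and a list without any sentinel is returned unchanged.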

Haskell function that returns number of elements that overlap

How would you go about defining a function that takes two strings, say string x and string y, and returns the number of elements at the end of the first string (string x) which overlap with the beginning of the second string (string y)?
I'm thinking that it would be a good idea to use isPrefixOf from Data.List to do this, since it checks whether one string is a prefix of another and returns True if it is or False otherwise. But I'm a little confused about how to return the count of how many elements overlap. I wrote some pseudocode based on how I think you would approach the problem.
countElements :: Eq a => (Str a, Str a) -> Int
countElements (x,y) =
if x `isPrefixOf` y == True
then return the count of how many elements overlap
otherwise 0
A sample output would be:
countElements (board, directors) = 1
countElements (bend, ending) = 3
Any help here? I'm not very good at writing Haskell code.
You have exactly the right idea, but your pseudocode misses the fact that you'll have to iterate on all of the possible tails of the first string passed to the function.
import Data.List (isPrefixOf, tails)
countElements :: String -> String -> Int
countElements s t = length $ head $ filter (`isPrefixOf` t) (tails s)
> countElements "board" "directors"
1
> countElements "bend" "endings"
3
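The use of head above is safe because tails always ends with the empty list, which is a prefix of everything. A variant that makes this explicit, sketched with the hypothetical name overlap and generalized to any list of comparable elements, uses find and falls back to 0:

```haskell
import Data.List (find, isPrefixOf, tails)

-- overlap is a hypothetical name. It returns the length of the longest
-- suffix of xs that is also a prefix of ys. tails lists suffixes from
-- longest to shortest, so the first match is the longest one.
overlap :: Eq a => [a] -> [a] -> Int
overlap xs ys = maybe 0 length (find (`isPrefixOf` ys) (tails xs))
```

As with countElements, overlap "board" "directors" is 1 and overlap "bend" "endings" is 3; strings with nothing in common give 0.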
Version without isPrefixOf:
import Data.List
countElements xs ys = length . fst . last . filter (uncurry (==)) $ zip (reverse $ tails xs) (inits ys)
