Defining my own isPrefixOf without recursion using foldr - haskell

I am working on a programming assignment where I must define my own version of isPrefixOf from Data.List using only foldr, map, and cons (and thus no recursion). The hint I've been given is that the return value of foldr should itself be a function. Can someone help me understand how I could apply that fact? My guess for the structure is included below.
startsWith :: String -> String -> Bool
startsWith s1 s2 = (foldr (???) ??? s1) s2
I am allowed to define my own helper functions. For the curious this is from an assignment for CIS 552 at Penn.

EDIT: It turns out that by folding over the pattern instead of the string, the code gets a lot simpler and shorter and something like
"" `isPrefixOf` undefined
works. Thank you @dfeuer and @WillNess. This is the updated program:
isPrefixOf pattern s = foldr g (const True) pattern s
  where
    g _ _ []    = False -- the string ran out before the pattern did
    g x r (h:t)
      | x == h    = r t
      | otherwise = False
It works almost the same way as the below program, so refer to that for the explanation.
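To check the laziness claim above, here is a minimal self-contained sketch of the fold-over-the-pattern idea. The name startsWith is taken from the question; the helper name step and the generalisation from String to any Eq a are assumptions made for illustration.

```haskell
-- Fold over the pattern: the result of foldr is a function that
-- still expects the string to be checked.
startsWith :: Eq a => [a] -> [a] -> Bool
startsWith pat = foldr step (const True) pat
  where
    -- step x rest: compare x with the head of the string, then hand
    -- the tail to the matcher built for the remaining pattern.
    step _ _    []    = False             -- string ended before the pattern
    step x rest (h:t) = x == h && rest t

-- startsWith "" undefined               evaluates to True
-- startsWith "ab" ('a':'b':undefined)   evaluates to True
-- startsWith "abc" "ab"                 evaluates to False
```

Because the base case const True never looks at its argument, the string is inspected only as far as the pattern reaches, which is why an undefined or infinite string is fine once the pattern has been matched.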
I managed to solve it using nothing but foldr:
isPrefixOf :: String -> String -> Bool
p `isPrefixOf` s = isPrefixOfS p
  where
    isPrefixOfS
      = foldr
          (\c acc ->
            \str -> case str of
              x:xs -> if c == x
                        then acc xs
                        else False
              []   -> True
          )
          null
          s
Here's the explanation.
In order to create the function isPrefixOf, we want this:
isPrefixOf pattern s
  = case pattern of
      []   -> True
      x:xs -> if (null s) then False
              else if (head s) /= x
                then False
                else xs `isPrefixOf` (tail s)
Well, let's simplify this - let's create a function called isPrefixOfS that only takes a pattern, which it compares to s automatically. We need to build this chain of nested functions:
-- Pseudocode, not actual Haskell
\p -> case p of
  []   -> True
  x:xs -> if x /= (s !! 0)
            then False
            else <apply xs to> \q -> case q of
              []   -> True
              x:xs -> if x /= (s !! 1) -- Note that we've incremented the index
                        then False
                        else <apply xs to> \r -> ....
This seems pretty self-explanatory - let me know in a comment if it requires further explanation.
Well, we can see that this chain has a recursive property to it, where the last character of s will be compared in the most deeply nested lambda. So we need to nest lambdas from right to left. What can we use for that? foldr.
isPrefixOfS
  = foldr -- Folding from right to left across `s`
      (\c acc -> -- `c` is the current character of `s`, acc is the chain of nested functions so far
        \str -> -- We return a function in the fold that takes a String
          case str of
            -- If the string is not null:
            x:xs -> if c == x -- If the head of the string is equal to the current character of s,
                      then acc xs -- Then pass the tail of the string to the nested function which will compare it with subsequent characters of s
                      else False -- Otherwise we return false
            -- If the string is null, we have completely matched the prefix and so return True
            [] -> True
      )
      null -- Innermost nested function - we need to make sure that if the prefix reaches here, it is null, i.e. we have entirely matched it
      s
And now we use this function isPrefixOfS on p:
p `isPrefixOf` s = isPrefixOfS p
EDIT: I just found this post which uses a similar logic to implement zip in terms of foldr, you may want to look at that as well.
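For reference, the same trick applied to zip can be sketched as follows. This is an illustrative reconstruction, not the code from the linked post; the names zipViaFoldr and step are hypothetical.

```haskell
-- zip written with foldr: folding over the first list produces a
-- function that consumes the second list one element at a time.
zipViaFoldr :: [a] -> [b] -> [(a, b)]
zipViaFoldr xs ys = foldr step (const []) xs ys
  where
    -- step x k: pair x with the head of the second list, then let k
    -- (the function built from the rest of xs) handle the tail.
    step _ _ []       = []
    step x k (y:rest) = (x, y) : k rest

-- zipViaFoldr [1,2,3] "ab"   evaluates to [(1,'a'),(2,'b')]
-- zipViaFoldr [1..] "xy"     evaluates to [(1,'x'),(2,'y')]
```

As with isPrefixOf, the accumulator of the fold is a function, so the second list is threaded through the chain of nested calls rather than being folded over itself.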

Related

Understanding non-strictness in Haskell with a recursive example

What is the difference between these two, in terms of evaluation?
Why does this "obey" non-strictness
recFilter :: (a -> Bool) -> [a] -> [a]
recFilter _ [] = []
recFilter p (h:tl) = if (p h)
  then h : recFilter p tl
  else recFilter p tl
while this doesn't?
recFilter :: (a -> Bool) -> [a] -> Int -> [a]
recFilter _ xs 0 = xs
recFilter p (h:tl) len
  | p h       = recFilter p (tl ++ [h]) (len-1)
  | otherwise = recFilter p tl (len-1)
Is it possible to write a tail-recursive function non-strictly?
To be honest I also don't understand the call stack of the first example, because I can't see where that h: goes. Is there a way to see this in ghci?
The non-tail recursive function roughly consumes a portion of the input (the first element) to produce a portion of the output (well, if it's not filtered out at least). Then recursion handles the next portion of the input, and so on.
Your tail recursive function will recurse until len becomes zero, and only at that point it will output the whole result.
Consider this pseudocode:
def rec1(p,xs):
    case xs:
        []     -> []
        (y:ys) -> if p(y): print y
                  rec1(p,ys)
and compare it with this accumulator-based variant. I'm not using len since I use a separate accumulator argument, which I assume to be initially empty.
def rec2(p,xs,acc):
    case xs:
        []     -> print acc
        (y:ys) -> if p(y):
                      rec2(p,ys,acc++[y])
                  else:
                      rec2(p,ys,acc)
rec1 prints before recursing: it does not need to inspect the whole input list to start printing its output. It works in a "streaming" fashion, in a sense. Instead, rec2 will only start to print at the very end, after the input list has been completely scanned.
In your Haskell code there are no prints, of course, but you can think of returning x : function call as "printing x", since x is made available to the caller of our function before function call is actually made. (Well, to be pedantic this depends on how the caller will consume the output list, but I'll neglect this.)
Hence the non-tail recursive code can also work on infinite lists. Even on finite inputs, performance is improved: if we call head (rec1 p xs), we only evaluate xs until the first non-discarded element. By contrast, head (rec2 p xs) would filter the whole list xs, even though we don't need that.
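The difference can be observed directly in Haskell. The sketch below restates both shapes as pure functions; the names filterStream and filterAcc are hypothetical and only serve this illustration.

```haskell
-- Streaming shape: each kept element is returned before the
-- recursive call is evaluated.
filterStream :: (a -> Bool) -> [a] -> [a]
filterStream _ [] = []
filterStream p (x:xs)
  | p x       = x : filterStream p xs
  | otherwise = filterStream p xs

-- Accumulator shape: nothing is returned until the input is exhausted.
filterAcc :: (a -> Bool) -> [a] -> [a]
filterAcc p = go []
  where
    go acc []     = acc
    go acc (x:xs)
      | p x       = go (acc ++ [x]) xs
      | otherwise = go acc xs

-- take 3 (filterStream even [1..])               evaluates to [2,4,6]
-- head (filterStream (> 2) (1:2:3:undefined))    evaluates to 3
-- take 3 (filterAcc even [1..])                  never terminates
```

The second example shows that the streaming version touches the input only up to the first kept element: the undefined tail is never forced.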
The second implementation does not make much sense: a variable named len does not contain the length of the list by itself. You have to pass the length in yourself, and for infinite lists this cannot work, since they have no length at all.
You likely want to produce something like:
recFilter :: (a -> Bool) -> [a] -> [a]
recFilter p = go []
  where go ys [] = ys  -- (1)
        go ys (x:xs) | p x = go (ys ++ [x]) xs
                     | otherwise = go ys xs
where we thus have an accumulator to which we append the items in the list, and then eventually return the accumulator.
The problem with the second approach is that as long as the accumulator is not returned, Haskell will need to keep recursing until at least we reach weak head normal form (WHNF). This means that if we pattern match the result with [] or (_:_), we will have to recurse at least until case (1), since the other cases only produce a new expression and thus do not yield a data constructor on which we can pattern match.
This is in contrast to the first filter, where, if we pattern match on [] or (_:_), it is sufficient to reach case (1) or case (2), where the expression produces a list data constructor. Only if we need more elements to pattern match, for example (_:_:_), do we have to evaluate recFilter p tl in case (2) of the first implementation:
recFilter :: (a -> Bool) -> [a] -> [a]
recFilter _ [] = []  -- (1)
recFilter p (h:tl) = if (p h)
  then h : recFilter p tl  -- (2)
  else recFilter p tl
For more information, see the Laziness section of the Wikibook on Haskell that describes how laziness works with thunks.

Find index of substring in another string Haskell

I am to make a function which takes two parameters (Strings). The function shall check whether the first parameter is a substring of the second parameter. If so, it shall return a list of tuples, one per occurrence, each consisting of the start index and the end index of the substring.
For example:
f :: String -> String -> [(Int,Int)]
f "oo" "foobar" = [(1,2)]
f "oo" "fooboor" = [(1,2),(4,5)]
f "ooo" "fooobar" = [(1,3)]
We are not allowed to import anything, but I have an isPrefix function. It checks if the first parameter is a prefix of the second parameter.
isPrefix :: Eq a => [a] -> [a] -> Bool
isPrefix [] _ = True
isPrefix _ [] = False
isPrefix (x:xs) (y:ys) | x == y    = isPrefix xs ys
                       | otherwise = False
I was thinking a solution may be to run the function "isPrefix" first on x, and if it returns False, I run it on the tail (xs) and so on. However, I am struggling to implement it and struggling to understand how to return the index of the string if it exists. Perhaps using "!!"? Do you think I am onto something? As I am new to Haskell the syntax is a bit of a struggle :)
We can make a function that will check if the first list is a prefix of the second list. If that is the case, we prepend (0, length firstlist - 1) to the recursive call where we increment both indexes by one.
This means that such a function looks like:
f :: Eq a => [a] -> [a] -> [(Int, Int)]
f needle = go
  where go [] = []
        go haystack@(_: xs)
          | isPrefix needle haystack = (…, …) : tl  -- (1)
          | otherwise = tl
          where tl = … (go xs)  -- (2)
        n = length needle
Here (1) thus prepends (…, …) to the list; and for (2) tl makes a recursive call and needs to post-process the list by incrementing both items of the 2-tuple by one.
There is a more efficient algorithm to do this where you pass the current index in the recursive call, or you can implement the Knuth-Morris-Pratt algorithm [wiki], I leave these as an exercise.
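One possible shape for the index-passing variant mentioned above is sketched here. It is only one way to do it, not necessarily what the answer had in mind; the name occurrences and the counter i are hypothetical, and isPrefix is the function from the question.

```haskell
-- isPrefix as given in the question.
isPrefix :: Eq a => [a] -> [a] -> Bool
isPrefix [] _ = True
isPrefix _ [] = False
isPrefix (x:xs) (y:ys)
  | x == y    = isPrefix xs ys
  | otherwise = False

-- go carries the index of the current suffix, so the results of the
-- recursive call do not need to be post-processed.
occurrences :: Eq a => [a] -> [a] -> [(Int, Int)]
occurrences needle = go 0
  where
    n = length needle
    go _ [] = []
    go i haystack@(_:rest)
      | isPrefix needle haystack = (i, i + n - 1) : go (i + 1) rest
      | otherwise                = go (i + 1) rest

-- occurrences "oo" "foobar"    evaluates to [(1,2)]
-- occurrences "oo" "fooboor"   evaluates to [(1,2),(4,5)]
-- occurrences "ooo" "fooobar"  evaluates to [(1,3)]
```

This still calls isPrefix at every position, so the worst case remains proportional to the product of the two lengths; only the repeated incrementing of the tuples is avoided.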

Remove a Character Sequence From a String

Consider a function, which takes a string and returns a list of all possible cases in which three subsequent 'X's can be removed from the list.
Example:
"ABXXXDGTJXXXDGXF" should become
["ABDGTJXXXDGXF", "ABXXXDGTJDGXF"]
(The order does not matter)
here is a naive implementation:
f :: String -> [String]
f xs = go [] xs [] where
  go left (a:b:c:right) acc =
    go (left ++ [a]) (b:c:right) y where -- (1)
      y = if a == 'X' && b == 'X' && c == 'X'
            then (left ++ right) : acc
            else acc
  go _ _ acc = acc
I think the main problem here is the line marked with (1). I'm constructing the left side of the list by appending to it, which is generally expensive.
Usually something like this can be solved by this pattern:
f [] = []
f (x:xs) = x : f xs
Or more explicitly:
f [] = []
f (x:right) = x : left where
  left = f right
Now I'd have the lists right and left in each recursion. However, I need to accumulate them and I could not figure out how to do so here. Or am I on the wrong path?
A solution
Inspired by Gurkenglas' proposal, here is a slightly more general version of it:
import Data.Bool

removeOn :: (String -> Bool) -> Int -> String -> [String]
removeOn onF n xs = go xs where
  go xs | length xs >= n =
          bool id (right:) (onF mid) $
          map (head mid:) $
          go (tail xs)
    where
      (mid, right) = splitAt n xs
  go _ = []

removeOn (and . map (=='X')) 3 "ABXXXDGTJXXXDGXF"
--> ["ABDGTJXXXDGXF","ABXXXDGTJDGXF"]
The main idea seems to be the following:
Traverse the list starting from its end. Use a 'look-ahead' mechanism that examines the next n elements of the list (so it must first be checked that the current list contains that many elements). During this recursive traversal, the accumulated list of results is extended whenever the next n elements pass the truth test. In either case, the current first element must be prepended to those results, because they stem from shorter lists. This can be done blindly, since adding characters to a result string doesn't change the fact that it was a match.
f :: String -> [String]
f (a:b:c:right)
  = (if a == 'X' && b == 'X' && c == 'X' then (right:) else id)
  $ map (a:) $ f (b:c:right)
f _ = []

Remove the First Value in a List that Meets a Criterion

I'm trying to solve this problem. This function takes two parameters. The first is a function that returns a boolean value, and the second is a list of numbers. The function is supposed to remove the first value in the second parameter that returns true when run with the first parameter.
There's a second function, which does the same thing, except it removes the last value that satisfies it, instead of the first.
I'm fairly certain I have the logic down, as I tested it in another language and it worked, my only problem is translating it into Haskell syntax. Here's what I have:
removeFirst :: (t -> Bool) -> [t] -> [t]
removeFirst p xs = []
removeFirst p xs
    | p y = ys
    | otherwise = y:removeFirst p ys
    where
        y:ys = xs

removeLast :: (t -> Bool) -> [t] -> [t]
removeLast p xs = []
removeLast p xs = reverse ( removeFirst p ( reverse xs ) )
I ran:
removeFirst even [1..10]
But instead of getting [1,3,4,5,6,7,8,9,10] as expected, I get [].
What am I doing wrong?
removeFirst p xs = []
This always returns the empty list and it matches all arguments. I think you mean this.
removeFirst _ [] = []
Your first equation,
removeFirst p xs = []
says "Whatever my arguments are, just return []", and the rest of the code is ignored.
You probably mean
removeFirst p [] = []
saying "When the list is already empty, return the empty list."
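Putting the fix together, a complete corrected version might look like the sketch below. It keeps the question's names and types; matching on (y:ys) directly in the argument instead of in a where clause, and dropping the catch-all equation from removeLast as well, are choices made here rather than something the answers spell out.

```haskell
-- Remove the first element that satisfies the predicate.
removeFirst :: (t -> Bool) -> [t] -> [t]
removeFirst _ [] = []                     -- only the empty list ends here
removeFirst p (y:ys)
  | p y       = ys                        -- drop y and stop
  | otherwise = y : removeFirst p ys      -- keep y and keep looking

-- Remove the last element that satisfies the predicate by working
-- on the reversed list.
removeLast :: (t -> Bool) -> [t] -> [t]
removeLast p = reverse . removeFirst p . reverse

-- removeFirst even [1..10]   evaluates to [1,3,4,5,6,7,8,9,10]
-- removeLast even [1..10]    evaluates to [1,2,3,4,5,6,7,8,9]
```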

How to implement this using map/filter

I'm supposed to come up with a function and::[Bool]->Bool, where (and xs) is True only if xs contains no False elements.
I am perfectly able to write this using recursion. I even tried it using map, BUT map ALWAYS returns a list, which conflicts with my return type.
Here's my code using recursion: (it works just fine)
isTrue::[Bool]->Bool
isTrue [] =True
isTrue (x:xs)
  | (x == True) =isTrue(xs)
  | otherwise =False
I tried doing this with map: (this will never work)
and::[Bool]->Bool
and xs = map (True ==) xs
So how in the world can I use map, filter or foldr to implement this crazy function? Thanks
Also consider takeWhile, which works like filter but stops at the first element that fails the predicate, so the list is all True exactly when takeWhile keeps everything:
and' :: [Bool] -> Bool
and' xs = xs == takeWhile (== True) xs
Filter for the False values and check that the result is an empty list:
and' :: [Bool] -> Bool
and' = null . filter (== False)
More classic approach using foldr:
and'' :: [Bool] -> Bool
and'' = foldr (&&) True
foldr is the way to go:
There are various ways to get to that. The first is to think about folding as inserting a binary operation between pairs of elements:
foldr c n [x1,x2,x3,...,xn] =
  x1 `c` (x2 `c` (x3 `c` (... `c` (xn `c` n))))
So in this case,
foldr (&&) True [x1,x2,x3,...,xn] =
  x1 && x2 && x3 && ... && xn && True
Note that && is right associative, so we don't need the parentheses there.
Another approach is to figure out how to convert the recursive form you gave into a fold. The thing to read to see how this works in general is Graham Hutton's "A Tutorial on the Universality and Expressiveness of Fold".
Start with your recursive form:
and::[Bool]->Bool
and [] =True
and (x:xs)
  | (x == True) = and (xs)
  | otherwise = False
Now there's no reason to ask if x==True because that's actually just the same as testing x directly. And there's no need for extra parentheses around a function argument. So we can rewrite that like this:
and [] = True
and (x:xs)
  | x = and xs
  | otherwise = False
Now let's see if we can write this as a fold:
and xs = foldr c n xs
Because and [] = True we know that n = True:
and xs = foldr c True xs
Now look at the recursive case:
and (x:xs)
  | x = and xs
  | otherwise = False
You can see that this depends only on x and on and xs. This means that we will be able to come up with a c so the fold will work out right:
c x r
  | x = r
  | otherwise = False
r is the result of applying and to the whole rest of the list. But what is this c function? It's just (&&)!
So we get
and xs = foldr (&&) True xs
At each step, foldr passes (&&) the current element and the result of folding over the rest of the list.
We don't actually need that xs argument, so we can write
and = foldr (&&) True
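A small sketch to try this out. The function is named andViaFoldr here (a hypothetical name) so that it does not clash with the and already exported by the Prelude.

```haskell
-- and as a right fold: (&&) combines each element with the result
-- of folding the rest of the list.
andViaFoldr :: [Bool] -> Bool
andViaFoldr = foldr (&&) True

-- andViaFoldr []                      evaluates to True
-- andViaFoldr [True, False, True]     evaluates to False
-- andViaFoldr (False : undefined)     evaluates to False
-- andViaFoldr (cycle [True, False])   evaluates to False
```

The last two examples work because (&&) does not evaluate its second argument when the first is False, so the fold stops at the first False element, just like the guarded recursive version.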

Resources