What is the difference between these two, in terms of evaluation?
Why does this one "obey" (how should I put it?) non-strictness:
recFilter :: (a -> Bool) -> [a] -> [a]
recFilter _ [] = []
recFilter p (h:tl) = if (p h)
    then h : recFilter p tl
    else recFilter p tl
while this doesn't?
recFilter :: (a -> Bool) -> [a] -> Int -> [a]
recFilter _ xs 0 = xs
recFilter p (h:tl) len
    | p(h) = recFilter p (tl ++ [h]) (len-1)
    | otherwise = recFilter p tl (len-1)
Is it possible to write a tail-recursive function non-strictly?
To be honest I also don't understand the call stack of the first example, because I can't see where that h: goes. Is there a way to see this in ghci?
The non-tail recursive function roughly consumes a portion of the input (the first element) to produce a portion of the output (well, if it's not filtered out at least). Then recursion handles the next portion of the input, and so on.
Your tail recursive function will recurse until len becomes zero, and only at that point it will output the whole result.
Consider this pseudocode:
def rec1(p,xs):
    case xs:
        [] -> []
        (y:ys) -> if p(y): print y
                  rec1(p,ys)
and compare it with this accumulator-based variant. I'm not using len since I use a separate accumulator argument, which I assume to be initially empty.
def rec2(p,xs,acc):
    case xs:
        [] -> print acc
        (y:ys) -> if p(y):
                      rec2(p,ys,acc++[y])
                  else:
                      rec2(p,ys,acc)
rec1 prints before recursing: it does not need to inspect the whole input list to start printing its output. It works in a "streaming" fashion, in a sense. Instead, rec2 will only start to print at the very end, after the input list has been completely scanned.
In your Haskell code there are no prints, of course, but you can think of returning x : function call as "printing x", since x is made available to the caller of our function before function call is actually made. (Well, to be pedantic this depends on how the caller will consume the output list, but I'll neglect this.)
Hence the non-tail recursive code can also work on infinite lists. Even on finite inputs, performance is improved: if we call head (rec1 p xs), we only evaluate xs until the first non-discarded element. By contrast, head (rec2 p xs) would filter the whole list xs, even if we don't need that.
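To make the streaming behaviour concrete in Haskell itself, here is a minimal sketch. The names lazyFilter, accFilter and firstEvens are made up for this illustration; accFilter uses an accumulator in the same way rec2 does.

```haskell
-- Guarded recursion: each kept element is handed to the caller
-- before the recursive call is evaluated.
lazyFilter :: (a -> Bool) -> [a] -> [a]
lazyFilter _ [] = []
lazyFilter p (x:xs)
  | p x       = x : lazyFilter p xs
  | otherwise = lazyFilter p xs

-- Accumulator style: nothing is returned until the input is exhausted.
accFilter :: (a -> Bool) -> [a] -> [a]
accFilter p = go []
  where
    go acc []     = acc
    go acc (x:xs)
      | p x       = go (acc ++ [x]) xs
      | otherwise = go acc xs

-- Works on an infinite list, because only a prefix is ever demanded.
firstEvens :: [Int]
firstEvens = take 3 (lazyFilter even [1 ..])
```

By contrast, take 3 (accFilter even [1 ..]) would never return, since go only produces a result once it reaches the end of the list.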
The second implementation does not make much sense: the parameter named len does not necessarily contain the length of the list, so the caller has to pass it in. For infinite lists this cannot work, since there is no length at all.
You likely want to produce something like:
recFilter :: (a -> Bool) -> [a] -> [a]
recFilter p = go []
    where go ys [] = ys  -- (1)
          go ys (x:xs) | p x = go (ys ++ [x]) xs
                       | otherwise = go ys xs
where we thus have an accumulator to which we append the items in the list, and then eventually return the accumulator.
The problem with the second approach is that as long as the accumulator is not returned, Haskell has to keep recursing until the result reaches at least weak head normal form (WHNF). This means that if we pattern match the result against [] or (_:_), we have to recurse at least until case (1), since the other cases only produce a new expression and thus do not yield a data constructor on which we can pattern match.
This is in contrast to the first filter, where if we pattern match on [] or (_:_) it is sufficient to stop at case (1) or at case (2), since in both the expression produces a value with a list data constructor. Only if we need more elements to pattern match, for example (_:_:_), does it have to evaluate the recFilter p tl in case (2) of the first implementation:
recFilter :: (a -> Bool) -> [a] -> [a]
recFilter _ [] = [] -- (1)
recFilter p (h:tl) = if (p h)
    then h : recFilter p tl -- (2)
    else recFilter p tl
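A small sketch of the WHNF point, assuming the non-tail-recursive definition. streamFilter and startsNonEmpty are hypothetical names, picked so they do not clash with recFilter. The tail of the input is undefined, so any attempt to evaluate past the first constructor would crash.

```haskell
streamFilter :: (a -> Bool) -> [a] -> [a]
streamFilter _ [] = []                      -- (1)
streamFilter p (h:tl)
  | p h       = h : streamFilter p tl       -- (2)
  | otherwise = streamFilter p tl

-- Matching against [] or (_:_) only needs the outermost constructor,
-- so the undefined tail is never touched.
startsNonEmpty :: Bool
startsNonEmpty = case streamFilter even (2 : undefined :: [Int]) of
  []    -> False
  (_:_) -> True
```

Matching the same result against (_:_:_) would force the recursive call in case (2) and therefore hit the undefined tail.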
For more information, see the Laziness section of the Wikibook on Haskell that describes how laziness works with thunks.
Related
I'm doing some exercises to practice my Haskell skills. My task is to implement the Haskell function find myself, using the filter function.
I have already implemented find without filter (see the code block below), but now my problem is implementing it with filter.
-- This is the `find` function without `filter` and it's working for me.
find1 e (x:xs)= if e x then x
                else find1 e xs
-- This is the find function with the filter function
find2 e xs = filter e xs
The result of find1 is right
*Main> find1(>4)[1..10]
Output : [5].
But my actual task, writing the function with filter, gives me:
*Main> find2(>4)[1..10]
Output : [5,6,7,8,9,10].
The result I want for find2 is the result of find1.
To "cut a list" down to just its head element (if there is one), use take 1:
> take 1 [1..]
[1]
> take 1 []
[]
> take 1 $ find2 (> 4) [1..10]
[5]
> take 1 $ find2 (> 14) [1..10]
[]
If you need to implement your own take 1 function, just write down its equations according to every possible input case:
take1 [] = []
take1 (x:xs) = [x]
Or with filter,
findWithFilter p xs = take1 $ filter p xs
Your find1 definition doesn't correspond to the output you show. Rather, the following definition would:
find1 e (x:xs) = if e x then [x] -- you had `x`
                 else find1 e xs
find1 _ [] = [] -- the missing clause
It is customary to call your predicate p, not e, as a mnemonic device. It is highly advisable to add type signatures to all your top-level definitions.
If you have difficulty writing it yourself, you can start without the signature, then ask GHCi which type it inferred, then use that signature if it indeed expresses your intent -- otherwise it means you've coded something different:
> :t find1
find1 :: (t -> Bool) -> [t] -> [t]
This seems alright as a first attempt.
Except, you actually intended that there would never be more than 1 element in the output list: it's either [] or [x] for some x, never more than one.
The list [] type is too permissive here, so it is not a perfect fit.
Such a type does exist though. It is called Maybe: values of type Maybe t can be either Nothing or Just x for some x :: t (read: x has type t):
import Data.Maybe (listToMaybe)
find22 p xs = listToMaybe $ filter p xs
We didn't even have to take 1 here: the function listToMaybe :: [a] -> Maybe a (read: has a type of function with input in [a] and output in Maybe a) already takes at most one element from its input list, as the result type doesn't allow for more than one element -- it simply has no more room in it. Thus it expresses our intent correctly: at most one element is produced, if any:
> find22 (> 4) [1..10]
Just 5
> find22 (> 14) [1..10]
Nothing
Do add the full signature above the definition once you're sure it is what you need:
find22 :: (a -> Bool) -> [a] -> Maybe a
Next, implement listToMaybe yourself. To do this, just follow the types, and write equations enumerating the cases of possible input, producing an appropriate value of the output type in each case, just as we did with take1 above.
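If you want something to compare your attempt against, here is one possible sketch. The names myListToMaybe and findViaFilter are hypothetical, chosen to avoid clashing with Data.Maybe.listToMaybe; this is not necessarily how the library defines it.

```haskell
-- One equation per possible shape of the input list.
myListToMaybe :: [a] -> Maybe a
myListToMaybe []    = Nothing   -- no element to return
myListToMaybe (x:_) = Just x    -- keep the head, ignore the rest

-- find, expressed with filter and the helper above.
findViaFilter :: (a -> Bool) -> [a] -> Maybe a
findViaFilter p xs = myListToMaybe (filter p xs)
```

Because filter is lazy, findViaFilter stops scanning as soon as the first matching element has been found.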
Assume the following (non-functioning) code, that takes a predicate such as (==2) and a list of integers, and drops only the last element of the list that satisfies the predicate:
cutLast :: (a -> Bool) -> [Int] -> [Int]
cutLast a [] = []
cutLast pred (as:a)
    | (pred) a == False = (cutLast pred as):a
    | otherwise = as
This code does not work, so clearly lists cannot be iterated through in reverse like this. How could I implement this idea? I'm not 100% sure if the code is otherwise correct - but hopefully it gets the idea across.
Borrowing heavily from myself: the problem with this sort of question is that you don't know which element to remove until you get to the end of the list. Once we observe this, the most straightforward thing to do is traverse the list one way then back using foldr (the second traversal comes from the fact foldr is not tail-recursive).
The cleanest solution I can think of is to rebuild the list on the way back up, dropping the first satisfying element we meet on that return trip (which is the last one in the list).
cutLast :: Eq a => (a -> Bool) -> [a] -> Either [a] [a]
cutLast f = foldr go (Left [])
  where
    go x (Right xs) = Right (x:xs)
    go x (Left xs) | f x = Right xs
                   | otherwise = Left (x:xs)
The return type is Either to differentiate between not having found anything to drop from the list (Left), and having encountered and dropped the last satisfying element from the list (Right). Of course, if you don't care about whether you dropped or didn't drop an element, you can drop that information:
cutLast' f = either id id . cutLast f
Following the discussion of speed in the comments, I tried swapping out Either [a] [a] for (Bool,[a]). Without any further tweaking, this is (as @dfeuer predicted) consistently a bit slower (on the order of 10%).
Using irrefutable patterns on the tuple, we can indeed avoid forcing the whole output (as per @chi's suggestion), which makes this much faster for lazily querying the output. This is the code for that:
cutLast' :: Eq a => (a -> Bool) -> [a] -> (Bool,[a])
cutLast' f = foldr go (False,[])
  where
    go x ~(b,xs) | not (f x) = (b,x:xs)
                 | b = (True,x:xs)       -- a later match was already dropped: keep x
                 | otherwise = (True,xs) -- x is the last match: drop it
However, this is 2 to 3 times slower than either of the other two versions (that don't use irrefutable patterns) when forced to normal form.
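As a rough illustration of what "lazily querying the output" buys, here is a self-contained sketch. The names cutLastLazy and lazyPrefix are hypothetical, and the Bool is assumed to mean "an element has already been dropped further to the right".

```haskell
cutLastLazy :: (a -> Bool) -> [a] -> (Bool, [a])
cutLastLazy f = foldr go (False, [])
  where
    -- The irrefutable pattern delays inspecting the result for the rest
    -- of the list until one of its components is actually needed.
    go x ~(dropped, xs)
      | not (f x) = (dropped, x : xs)   -- does not force the rest
      | dropped   = (True, x : xs)      -- a later match was already removed
      | otherwise = (True, xs)          -- this is the last match: drop it

-- Leading elements that fail the predicate are available without
-- scanning the rest of the (here infinite) list.
lazyPrefix :: [Int]
lazyPrefix = take 3 (snd (cutLastLazy (< 0) [1 ..]))
```

Without the ~, the same call would try to traverse the whole infinite list before producing the first element.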
One simple (but less efficient) solution is to implement cutFirst in a similar fashion to filter, then reverse the input to and output from that function.
cutLast pred = reverse . cutFirst . reverse
  where cutFirst [] = []
        cutFirst (x:xs) | pred x = xs
                        | otherwise = x : cutFirst xs
I'm reading the Real World Haskell book again and it's making more sense. I've come across this function and wanted to know if my interpretation of what it's doing is correct. The function is
oddList :: [Int] -> [Int]
oddList (x:xs) | odd x = x : oddList xs
               | otherwise = oddList xs
oddList _ = []
I've read that as
Define the function oddList which accepts a list of ints and returns a list of ints.
Pattern matching: when the parameter is a list.
Take the first item, binding it to x, leaving the remainder elements in xs.
If x is an odd number prepend x to the result of applying oddList to the remaining elements xs and return that result. Repeat...
When x isn't odd, just return the result of applying oddList to xs
In all other cases return an empty list.
1) Is that a suitable/correct way of reading that?
2) Even though I think I understand it, I'm not convinced I've got the (x:xs) bit down. How should that be read, what's it actually doing?
3) Is the |...| otherwise syntax similar/same as the case expr of syntax
1 I'd make only 2 changes to your description:
when the parameter is a nonempty list.
If x is an odd number prepend x to the result of applying oddList to the remaining elements xs and return that result. [delete "Repeat..."]
Note that for the "_", "In all other cases" actually means "When the argument is an empty list", since that is the only other case.
2 The (x:xs) is a pattern that introduces two variables. The pattern matches non empty lists and binds the x variable to the first item (head) of the list and binds xs to the remainder (tail) of the list.
3 Yes. An equivalent way to write the same function is
oddList :: [Int] -> [Int]
oddList ys = case ys of { (x:xs) | odd x -> x : oddList xs ;
                          (x:xs) | otherwise -> oddList xs ;
                          _ -> [] }
Note that otherwise is just the same as True, so | otherwise could be omitted here.
You got it right.
The (x:xs) part says: if the list contains at least one element, bind the first element to x, and the rest of the list to xs.
The code could also be written as
oddList :: [Int] -> [Int]
oddList (x:xs) = case (odd x) of
    True -> x : oddList xs
    False -> oddList xs
oddList _ = []
In this specific case, the guard (|) is just a prettier way to write that down. Note that otherwise is just a synonym for True , which usually makes the code easier to read.
What @DanielWagner is pointing out is that, in some cases, the use of guards allows for more complex behavior.
Consider this function (which is only relevant for illustrating the principle)
funnyList :: [Int] -> [Int]
funnyList (x1:x2:xs)
    | even x1 && even x2 = x1 : funnyList xs
    | odd x1 && odd x2 = x2 : funnyList xs
funnyList (x:xs)
    | odd x = x : funnyList xs
funnyList _ = []
This function will go through these clauses until one of them is true:
If there are at least two elements (x1 and x2) and they are both even, then the result is:
adding the first element (x1) to the result of processing the rest of the list (not including x1 or x2)
If there are at least two elements (x1 and x2) and they are both odd, then the result is:
adding the second element (x2) to the result of processing the rest of the list (not including x1 or x2)
If there is at least one element in the list (x), and it is odd, then the result is:
adding the first element (x) to the result of processing the rest of the list (not including x)
No matter what the list looks like, the result is:
an empty list []
thus funnyList [1,3,4,5] == [3] and funnyList [1,2,4,5,6] == [1,2,5]
You should also check out the free online book Learn You a Haskell for Great Good.
You've correctly understood what it does on the low level.
However, with some experience you should be able to interpret it in the "big picture" right away: when you have two cases (x:xs) and _, and xs only turns up again as an argument to the function itself, it means this is a list consumer. In fact, such a function is always equivalent to a foldr. Your function has the form
oddList' (x:xs) = g x $ oddList' xs
oddList' [] = q
with
g :: Int -> [Int] -> [Int]
g x qs | odd x = x : qs
       | otherwise = qs
q = [] :: [Int]
The definition can thus be compacted to oddList' = foldr g q.
While you may right now not be more comfortable with a fold than with explicit recursion, it's actually much simpler to read once you've seen it a few times.
Actually of course, the example can be done even simpler: oddList'' = filter odd.
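To see the three formulations side by side, here is a minimal self-contained sketch. The names oddListRec, oddListFold and oddListFilter are made up for this comparison; all three should produce the same list for any input.

```haskell
-- Explicit recursion, as in the question.
oddListRec :: [Int] -> [Int]
oddListRec (x:xs)
  | odd x     = x : oddListRec xs
  | otherwise = oddListRec xs
oddListRec [] = []

-- The same function as a right fold: keepOdd plays the role of g,
-- and [] plays the role of q.
oddListFold :: [Int] -> [Int]
oddListFold = foldr keepOdd []
  where
    keepOdd x acc
      | odd x     = x : acc
      | otherwise = acc

-- And as a plain filter.
oddListFilter :: [Int] -> [Int]
oddListFilter = filter odd
```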
Read (x:xs) as: a list that was constructed with an expression of the form (x:xs)
And then, make sure you understand that every non-empty list must have been constructed with the (:) constructor.
This is apparent when you consider that the list type has just 2 constructors: [] constructs the empty list, while (a:xs) constructs the list whose head is a and whose tail is xs.
You need also to mentally de-sugar expressions like
[a,b,c] = a : b : c : []
and
"foo" = 'f' : 'o' : 'o' : []
This syntactic sugar is the only difference between lists and other types like Maybe, Either or your own types. For example, when you write
foo (Just x) = ....
foo Nothing = .....
we are also considering the two base cases for Maybe:
it has been constructed with Just
it has been constructed with Nothing
The language I'm using is a subset of Haskell called Core Haskell which does not allow the use of the built-in functions of Haskell. For example, if I were to create a function which counts the number of times that the item x appears in the list xs, then I would write:
count = \x ->
        \xs -> if null xs
               then 0
               else if x == head xs
                    then 1 + count x (tail xs)
                    else count x (tail xs)
I'm trying to create a function which outputs a list xs with its duplicate values removed. E.g. remdups (7:7:7:4:5:7:4:4:[]) => (7:4:5:[])
can anyone offer any advice?
Thanks!
I'm guessing that you're a student, and this is a homework problem, so I'll give you part of the answer and let you finish it. In order to write remdups, it would be useful to have a function that tells us if a list contains an element. We can do that using recursion. When using recursion, start by asking yourself what the "base case", or simplest possible case is. Well, when the list is empty, then obviously the answer is False (no matter what the character is). So now, what if the list isn't empty? We can check if the first character in the list is a match. If it is, then we know that the answer is True. Otherwise, we need to check the rest of the list -- which we do by calling the function again.
elem _ [] = False
elem x (y:ys) = if x==y
                then True
                else elem x ys
The underscore (_) simply means "I'm not going to use this variable, so I won't even bother to give it a name." The function can be written more succinctly as:
elem _ [] = False
elem x (y:ys) = x==y || elem x ys
Writing remdups is a little tricky, but I suspect your teacher gave you some hints. One way to approach it is to imagine we're partway through processing the list. We have part of the list that hasn't been processed yet, and part of the list that has been processed (and doesn't contain any duplicates). Suppose we had a function called remdupHelper, which takes those two arguments, called remaining and finished. It would look at the first character in remaining, and return a different result depending on whether or not that character is in finished. (That result could call remdupHelper recursively). Can you write remdupHelper?
remdupHelper = ???
Once you have remdupHelper, you're ready to write remdups. It just invokes remdupHelper in the initial condition, where none of the list has been processed yet:
remdups l = remdupHelper l []
This works with Ints:
removeDuplicates :: [Int] -> [Int]
removeDuplicates = foldr insertIfNotMember []
    where
        insertIfNotMember item list = if (notMember item list)
                                      then item : list
                                      else list
notMember :: Int -> [Int] -> Bool
notMember item [] = True
notMember item (x:xs)
    | item == x = False
    | otherwise = notMember item xs
How it works should be obvious. The only "tricky" part is that the type of foldr is:
(a -> b -> b) -> b -> [a] -> b
but in this case b unifies with [a], so it becomes:
(a -> [a] -> [a]) -> [a] -> [a] -> [a]
and therefore, you can pass the function insertIfNotMember, which is of type:
Int -> [Int] -> [Int] -- a unifies with Int
isTogether' :: String -> Bool
isTogether' (x:xs) = isTogether (head xs) (head (tail xs))
For the above code, I want to go through every character in the string. I am not allowed to use recursion.
If I've got it right, you are interested in getting consecutive char pairs from some string. So, for example, for abcd you need to test (a,b), (b,c), (c,d) with some (Char,Char) -> Bool or Char -> Char -> Bool function.
Zip could be helpful here:
> let x = "abcd"
> let pairs = zip x (tail x)
it :: [(Char, Char)]
And for some f :: Char -> Char -> Bool function we can get uncurry f :: (Char, Char) -> Bool.
And then it's easy to get [Bool] value of results with map (uncurry f) pairs :: [Bool].
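Putting those pieces together, a minimal sketch could look like this. adjacentPairs and testPairs are hypothetical names, and f stands in for whatever isTogether is supposed to check, which the question does not show. drop 1 is used instead of tail so that the empty string needs no special care.

```haskell
-- Pair each character with its successor:
-- "abcd" becomes [('a','b'),('b','c'),('c','d')].
adjacentPairs :: String -> [(Char, Char)]
adjacentPairs s = zip s (drop 1 s)

-- Apply a two-argument test to every adjacent pair, without
-- writing any explicit recursion.
testPairs :: (Char -> Char -> Bool) -> String -> [Bool]
testPairs f s = map (uncurry f) (adjacentPairs s)
```

From here, and (testPairs f s) or or (testPairs f s) collapses the list into a single Bool, depending on whether every pair or just some pair has to pass.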
In Haskell, a String is just a list of characters ([Char]). Thus, all of the normal higher-order list functions like map work on strings. So you can use whichever higher-order function is most applicable to your problem.
Note that these functions themselves are defined recursively; in fact, there is no way to go through the entire list in Haskell without either recursing explicitly or using a function that directly or indirectly recurses.
To do this without recursion, you will need to use a higher order function or a list comprehension. I don't understand what you're trying to accomplish so I can only give generic advice. You probably will want one of these:
map :: (a -> b) -> [a] -> [b]
Map converts a list of one type into another. Using map lets you perform the same action on every element of the list, given a function that operates on the kinds of things you have in the list.
filter :: (a -> Bool) -> [a] -> [a]
Filter takes a list and a predicate, and gives you a new list with only the elements that satisfy the predicate. Just with these two tools, you can do some pretty interesting things:
import Data.Char
map toUpper (filter isLower "A quick test") -- => "QUICKTEST"
Then you have folds of various sorts. A fold is really a generic higher order function for doing recursion on some type, so using it takes a bit of getting used to, but you can accomplish pretty much any recursive function on a list with a fold instead. The basic type of foldr looks like this:
foldr :: (a -> b -> b) -> b -> [a] -> b
It takes three arguments: an inductive step, a base case and a value you want to fold. Or, in less mathematical terms, you could think of it as taking an initial state, a function to take the next item and the previous state to produce the next state, and the list of values. It then returns the final state it arrived at. You can do some pretty surprising things with fold, but let's say you want to detect if a list has a run of two or more of the same item. This would be hard to express with map and filter (impossible?), but it's easy with recursion:
hasTwins :: (Eq a) => [a] -> Bool
hasTwins (x:y:xs) | x == y = True
hasTwins (x:y:xs) | otherwise = hasTwins (y:xs)
hasTwins _ = False
Well, you can express this with a fold like so:
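hasTwins :: (Eq a) => [a] -> Bool
hasTwins = snd . foldr step (Nothing, False)
  where
    -- prev is the element handled just before x (its right-hand neighbour,
    -- since foldr works from the end of the list); Nothing means x is the first one handled
    step x (prev, seenTwins) = (Just x, prev == Just x || seenTwins)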
So my "state" in this fold is the previous value and whether we've already seen a pair of identical values. The function has no explicit recursion, but my step function passes the current x value along to the next invocation through the state as the previous value. But you don't have to be happy with the last state you have; this function takes the second value out of the state and returns that as the overall return value—which is the boolean whether or not we've seen two identical values next to each other.
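For comparison, here is a different technique (not the fold described above): the same check written by zipping the list against its own tail. hasTwinsZip is a hypothetical name for this sketch.

```haskell
-- Compare each element with its successor and ask whether
-- any of the comparisons succeeded.
hasTwinsZip :: Eq a => [a] -> Bool
hasTwinsZip xs = or (zipWith (==) xs (drop 1 xs))
```

This avoids threading state by hand, at the cost of being specific to "adjacent element" problems, whereas the fold generalises to other kinds of running state.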