I'm working through Learn You A Haskell in order to come up to speed with the basics of Haskell. I'm very comfortable with both functional programming and pattern matching, but the latter more so with how Mathematica does it.
In the same spirit as the naïve implementation of head in Chapter 4.1, I proceeded with a naïve implementation of last as:
last1 :: [a] -> a
last1 (_:x:[]) = x
However, calling last1 [1,2,3,4] gave an error Exception: ... Non-exhaustive patterns in function last1. I understand that this error implies that the pattern specified does not cover all possible inputs and usually, a catch-all pattern is necessary (which I've not provided). However, I'm not exactly sure why I get this error for my input.
Question 1: My understanding (of my incorrect approach) is that the first element is captured by _ and the rest get assigned to x, which isn't exactly what I had intended. However, shouldn't this give a type error, because I specified [a] -> a, but x is now a list?
Note that this is not about how to write a working last function — I know I can write it as (among other possibilities)
last2 :: [a] -> a
last2 [x] = x
last2 (_:x) = last2 x
Question 2: Along the same theme of better understanding pattern matching in Haskell, how can I use pattern matching to pick out the last element or more generally, the nth element from a given list, say, [1..10]?
This answer suggests that you can bind the last element using pattern matching with the ViewPatterns extension, but it seems strange that there isn't an analogous "simple" pattern like the one for head.
In Mathematica, I would probably write it as:
Range[10] /. {Repeated[_, {5}], x_, ___} :> x
(* 6 *)
to pick out the 6th element and
Range[10] /. {___, x_} :> x
(* 10 *)
to pick out the last element of a non-empty list.
I apologize if this is covered later in the text, but I'm trying to relate each topic and concept as I come across them, to how it is handled in other languages that I know so that I can appreciate the differences and similarities.
To make sense of the result of your first attempt, you need to see how the
list data is defined. Lists enjoy a somewhat special syntax, but you would
write it something like this.
data List a = (:) a (List a)
| []
So, your list [1,2,3,4] is actually structured as
(1 : (2 : (3 : (4 : []))))
In addition, due to the right associativity of the (:) operator, your pattern
for last1 actually looks like
last1 :: [a] -> a
last1 (_:(x:[])) = x
That is why 'x' has the same type as an element of your list; it is the first
argument to the (:) constructor.
Pattern matching allows you to deconstruct data structures like lists, but you
need to know what "shape" they have to do so. That is why you cannot directly
specify a pattern that will extract the last element of a list, because there
are an infinite number of lengths a list can have. That is why the working
solution (last2) uses recursion to solve the problem. You know what pattern
a list of length one has and where to find the final element; for everything
else, you can just throw away the first element and extract the last element
of the resulting, shorter, list.
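The recursive idea above can be made total by also covering the empty list. This is only a sketch: the name lastMay is hypothetical (it is not a Prelude function), and returning Maybe is a design choice the original last2 does not make.

```haskell
-- Hypothetical helper: a total variant of last2 that reports failure
-- with Nothing instead of a pattern match error.
lastMay :: [a] -> Maybe a
lastMay []     = Nothing      -- no last element exists
lastMay [x]    = Just x       -- the one shape where the answer is known
lastMay (_:xs) = lastMay xs   -- otherwise drop the head and recurse
```

The order of the second and third equations matters here, because a one-element list also matches (_:xs).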
If you wanted, you could add more patterns, but it would not prove that
helpful. You could write it as
last2 :: [a] -> a
last2 (x:[]) = x
last2 (_:x:[]) = x
last2 (_:_:x:[]) = x
...
last2 (x:xs) = last2 xs
But without an infinite number of cases, you could never complete the function
for all lengths of input lists. It's even more dubious when you consider the fact that
lists can actually be infinitely long; what pattern would you use to match that?
There is no way to have a pattern match get the "last" element without using view patterns. That is because there is no way to get the last element of a list without using recursion (at least implicitly), and what is more, the search is not guaranteed to terminate, since a list may be infinite.
Your code
last1 (_:x:[]) = x
should be parsed like
last1 (_:(x:[])) = x
which can be de-sugared into
last1 a = case a of
  (_:b) -> case b of
    (x:c) -> case c of
      [] -> x
Having completed this exercise, we see what your code does: you have written a pattern that will match a list IF the outermost constructor of the list is a cons cell AND the next constructor is a cons AND the third constructor is a nil.
So in the case of
last1 [1,2,3,4]
we have
last1 [1,2,3,4]
  = last1 (1:(2:(3:(4:[]))))
  = case (1:(2:(3:(4:[])))) of
      (_:b) -> case b of
        (x:c) -> case c of
          [] -> x
  = case (2:(3:(4:[]))) of
      (x:c) -> case c of
        [] -> x
  = let x = 2 in case (3:(4:[])) of
      [] -> x
  = pattern match failure
Your example
last1 (_:x:[]) = x
only matches lists containing two elements i.e. lists of the form a:b:[]. _ matches the head of the list without binding, x matches the following element, and the empty list matches itself.
When pattern matching lists, only the right-most item represents a list - the tail of the matched list.
You can get the nth element from a list with a function like:
getNth :: [a] -> Int -> a
getNth [] _ = error "Out of range"
getNth (h:t) 0 = h
getNth (h:t) n = getNth t (n-1)
This is built in as the !! operator, e.g. [1..10] !! 5
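Both getNth and !! raise an error for an index that is out of range. A minimal sketch of a variant that returns Nothing instead is given here; the name nthMay is made up for illustration, and the zero-based indexing simply mirrors getNth.

```haskell
-- Hypothetical helper: zero-based indexing that never throws.
nthMay :: [a] -> Int -> Maybe a
nthMay _      n | n < 0 = Nothing            -- negative indices are rejected
nthMay []     _         = Nothing            -- ran off the end of the list
nthMay (x:_)  0         = Just x             -- found the requested position
nthMay (_:xs) n         = nthMay xs (n - 1)  -- skip one element and count down
```

For example, nthMay [1..10] 5 gives Just 6, matching [1..10] !! 5.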
You can indeed use ViewPatterns to do pattern matching at the end of a list, so let's do:
{-# LANGUAGE ViewPatterns #-}
and redefine your last1 and last2 by reversing the list before we pattern match.
This makes it O(n), but that's unavoidable with a list.
last1 (reverse -> (x:_)) = x
The syntax
mainFunction (viewFunction -> pattern) = resultExpression
is syntactic sugar for
mainFunction x = case viewFunction x of pattern -> resultExpression
so you can see it actually just reverses the list then pattern matches that, but it feels nicer.
viewFunction is just any function you like.
(One of the aims of the extension was to allow people to cleanly and easily use accessor functions
for pattern matching so they didn't have to use the underlying structure of their data type when
defining functions on it.)
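To make the desugaring concrete, here is a small sketch that writes the same function twice, once with a view pattern and once with the equivalent case expression. The names lastView and lastCase are invented for this illustration, and both return Maybe so that the empty list is covered.

```haskell
{-# LANGUAGE ViewPatterns #-}

-- View-pattern version: reverse is applied to the argument, and the
-- result is matched against (x:_).
lastView :: [a] -> Maybe a
lastView (reverse -> (x:_)) = Just x
lastView _                  = Nothing

-- The same function with the view pattern written out by hand.
lastCase :: [a] -> Maybe a
lastCase xs = case reverse xs of
  (x:_) -> Just x
  _     -> Nothing
```

Both versions should agree on every finite list.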
This last1 gives an error if the list is empty, just like the original last does.
*Main> last []
*** Exception: Prelude.last: empty list
*Main> last1 []
*** Exception: Patterns.so.lhs:7:6-33: Non-exhaustive patterns in function last1
Well, OK, not exactly, but we can change that by adding
last1 _ = error "last1: empty list"
which gives you
*Main> last1 []
*** Exception: last1: empty list
We can of course use the same trick for last2:
last2 (reverse -> (_:x:_)) = x
last2 _ = error "last2: list must have at least two elements"
But it would be nicer to define
maybeLast2 (reverse -> (_:x:_)) = Just x
maybeLast2 _ = Nothing
You can carry on this way with for example last4:
last4 (reverse -> (_:_:_:x:_)) = x
And you can see that using the reverse viewpattern,
we've changed the semantics of (_:_:_:x:_) from
(ignore1st,ignore2nd,ignore3rd,get4th,ignoreTheRestOfTheList) to
(ignoreLast,ignore2ndLast,ignore3rdLast,get4thLast,ignoreTheRestOfTheList).
You note that in Mathematica, the number of underscores is used to indicate the number of elements being ignored.
In Haskell, we just use the one _, but it can be used for any ignored value, and in the presence of the
asymmetric list constructor :, the semantics depend on which side you're on, so in a:b, the a must mean an
element and the b must be a list (which could itself be c:d because : is right associative - a:b:c means
a:(b:c)). This is why a final underscore in any list pattern represents ignoreTheRestOfTheList, and in the
presence of the reverse viewfunction, that means ignoring the front elements of the list.
The recursion/backtracking that's hidden under the hood in Mathematica is explicit here with the viewFunction reverse (which is a recursive function).
I'm just wondering, for recursion example:
squaresRec :: [Double] -> [Double]
squaresRec [] = []
squaresRec (x:xs) = x*x : squaresRec xs
Why is there no bracket in the recursive case? Shouldn't it be like this:
squaresRec :: [Double] -> [Double]
squaresRec [] = []
squaresRec [x:xs] = x*x : squaresRec xs
I know this will not work. But just wondering the explanation behind it.
[] matches the empty list.
[1] matches a list containing exactly one element, and that must be a number equal to one. Note that [1] is actually syntactic sugar for (1:[]), i.e. what this really matches is: a list beginning with the number 1, followed by a list that is empty... which is just a complicated way of saying “a list containing the single element 1”.
(x:xs) matches a list that begins with x, followed by xs (and that may contain any number of elements, possibly zero). I.e. this pattern matches any list with at least one element.
[x:xs] matches again a list which contains exactly one element, and that element should match the pattern (x:xs). (Which doesn't make sense even type-wise, because your lists contain Double-numbers, not lists.)
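The [x:xs] pattern does type-check when the argument is a list of lists. As a sketch (the function name onlyRowHead is made up for this example):

```haskell
-- Hypothetical example: [x:_] matches a list with exactly one element,
-- where that element is itself a non-empty list.
onlyRowHead :: [[a]] -> Maybe a
onlyRowHead [x:_] = Just x    -- exactly one inner list, and it is non-empty
onlyRowHead _     = Nothing   -- zero or several inner lists, or an empty one
```

So onlyRowHead [[1,2,3]] is Just 1, while onlyRowHead [[1],[2]] and onlyRowHead [[]] are both Nothing.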
I had the same problem because I'm coming from Erlang.
The thing to understand is that the [head|tail] pattern we have in Erlang is written in Haskell with the cons constructor, which is the : operator. The parentheses are just there to group the pattern into a single function argument, like (3+4) would in an expression.
I know it's tempting to ask "why though???" and that it visually makes more sense, but : is how we build (and separate when pattern-matching) the head and the tail of a linked list.
Doing the third of the 99-Haskell problems (I am currently trying to learn the language) I tried to incorporate pattern matching as well as recursion into my function which now looks like this:
myElementAt :: [a] -> Int -> a
myElementAt (x ++ xs) i =
if length (x ++ xs) == i && length xs == 1 then xs!!0
else myElementAt x i
Which gives me Parse error in pattern: x ++ xs. The questions:
Why does this give me a parse error? Is it because Haskell has no idea where to cut my list (which is my best guess)?
How could I reframe my function so that it works? The algorithmic idea is to check whether the list has the same length as the specified index; if yes, return the last element; if not, cut away one element at the end of the list and then recurse.
Note: I know that this is a really bad algorithm, but I've set myself the challenge to write that function using recursion and pattern matching. I also tried not to use the !! operator, but that is fine for me since the only thing it really does (or should do if it compiled) is to convert a one-element list into that element.
Haskell has two different kinds of value-level entities: variables (this also includes functions, infix operators like ++ etc.) and constructors. Both can be used in expressions, but only constructors can also be used in patterns.
In either case, it's easy to tell whether you're dealing with a variable or constructor: a constructor always starts with an uppercase letter (e.g. Nothing, True or StateT) or, if it's an infix, with a colon (:, :+). Everything else is a variable. Fundamentally, the difference is that a constructor is always a unique, immediately matchable value from a predefined collection (namely, the alternatives of a data definition), whereas a variable can just have any value, and often it's in principle not possible to uniquely distinguish different variables, in particular if they have a function type.
Yours is actually a good example for this: for the pattern match x ++ xs to make sense, there would have to be one unique way in which the input list could be written in the form x ++ xs. Well, but for, say [0,1,2,3], there are multiple different ways in which this can be done:
[]        ++ [0,1,2,3]
[0]       ++ [1,2,3]
[0,1]     ++ [2,3]
[0,1,2]   ++ [3]
[0,1,2,3] ++ []
Which one should the runtime choose?
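The ambiguity can be made visible by enumerating every way to write a list as x ++ xs. The sketch below uses inits and tails from Data.List; the name splits is an assumption of this example, not a library function.

```haskell
import Data.List (inits, tails)

-- Hypothetical helper: every pair (x, xs) such that x ++ xs == input.
splits :: [a] -> [([a], [a])]
splits xs = zip (inits xs) (tails xs)
```

A list of length n has n + 1 such decompositions, so splits [0,1,2,3] yields the five pairs listed above, and a pattern x ++ xs would have no single answer to bind.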
Presumably, you're trying to match the head and tail part of a list. Let's step through it:
myElementAt (x:_) 0 = x
This means that if the head is x, the tail is something, and the index is 0, return the head. Note that your x ++ xs is a concatenation of two lists, not the head and tail parts.
Then you can have
myElementAt (_:tl) i = myElementAt tl (i - 1)
which means that if the previous pattern was not matched, ignore the head, and take the i - 1 element of the tail.
In patterns, you can only use constructors like : and []. The append operator (++) is a non-constructor function.
So, try something like:
myElementAt :: [a] -> Int -> a
myElementAt (x:xs) i = ...
There are more issues in your code, but at least this fixes your first problem.
In standard Haskell, pattern matches like this:
f :: Int -> Int
f (g n 1) = n
g :: Int -> Int -> Int
g a b = a+b
are illegal because function calls aren't allowed in patterns; your case is just a special case, as the operator ++ is just a function.
To pattern match on lists you can do it like this:
myElementAt :: [a] -> Int -> a
myElementAt (x:xs) i = undefined -- result goes here
But in this case x is of type a, not [a]; it is the head of the list and xs is its tail. You'll need to change your function implementation to accommodate this fact. Also, this function will fail on the empty list []. However, that's the idiomatic Haskell way to pattern match against lists.
I should mention that when I said "illegal" I meant in standard Haskell; there is a GHC extension that gives something similar, called ViewPatterns. But I don't think you need it, especially since you're still learning.
My current understanding of pattern overlapping in Haskell is that 2 patterns are considered to be overlapping if some argument values passed to a function can be matched by multiple patterns.
Given:
last :: [a] -> a
last [x] = x
last (_ : xs) = last xs
passing the argument value [1] would match both the first pattern [x] and the 2nd pattern (_ : xs) - so that would mean the function has overlapping patterns even though both patterns can be matched.
What makes this confusing is that although the patterns are (by the definition above) overlapping, GHC does not show any warning about them being overlapping.
Reversing the 2 pattern matches in the last function does show the overlapping warning:
last :: [a] -> a
last (_ : xs) = last xs
last [x] = x
Warning:
src\OverlappingPatterns.hs:6:1: Warning:
Pattern match(es) are overlapped
In an equation for `last': last [x] = ...
It is almost as though GHC considers the patterns overlapping if a previous pattern makes it impossible to match a pattern which occurs later.
What is the correct way to determine if a function has overlapping patterns or not?
Update
I am looking for the overlapping pattern definition used in fp101x course.
According to the definition used in fp101x the following function has overlapping patterns:
last :: [a] -> a
last [x] = x
last (_ : xs) = last xs
This is in contradiction with GHC definition of overlapping pattern which does not consider it to have any overlapping patterns.
Without a proper definition of what overlapping pattern means in the fp101x course context, it is impossible to solve that exercise. And the definition used there is not the GHC one.
The updated question clarifies the OP wants a formal definition of overlapping patterns. Here "overlapping" is meant in the sense used by GHC when it emits its warnings: that is, when it detects that a case branch is unreachable because its pattern does not match anything which is not already handled by an earlier branch.
A possible formal definition can indeed follow that intuition. That is, for any pattern p one can first define the set of values (denotations) [[p]] matching with p. (For this, it is important to know the type of the variables involved in p -- [[p]] depends on a type environment Gamma.) Then, one can say that in the sequence of patterns
q0 q1 ... qn p
the pattern p is overlapping iff [[p]], as a set, is included in [[q0]] union ... union [[qn]].
The above definition is hardly operational, though: it does not immediately lead to an algorithm for checking overlaps. Indeed, computing [[p]] is infeasible since it is, in general, an infinite set.
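The set-based definition can still be tried out on a bounded universe. The following is only a sketch under strong assumptions: a pattern is modelled as a predicate on lists of Booleans, and the value set is cut off at a fixed maximum length, so the check is an approximation and not what GHC does. All names (Pat, universe, unreachableAfter and the three sample patterns) are invented for this illustration.

```haskell
import Control.Monad (replicateM)

-- Assumption of this sketch: a pattern is represented by the predicate
-- "does this value match?".
type Pat = [Bool] -> Bool

-- All lists of Booleans of length 0..n, a finite stand-in for the value set.
universe :: Int -> [[Bool]]
universe n = concatMap (\k -> replicateM k [False, True]) [0 .. n]

-- p is unreachable (within the bound) if every value it matches is
-- already matched by at least one earlier pattern.
unreachableAfter :: Int -> [Pat] -> Pat -> Bool
unreachableAfter n earlier p =
  all (\v -> any ($ v) earlier) (filter p (universe n))

-- Predicate versions of the patterns [], True:_ and True:False:_.
isNil, startsTrue, startsTrueFalse :: Pat
isNil []                       = True
isNil _                        = False
startsTrue (True:_)            = True
startsTrue _                   = False
startsTrueFalse (True:False:_) = True
startsTrueFalse _              = False
```

With these definitions, unreachableAfter 4 [isNil, startsTrue] startsTrueFalse holds, because every list beginning True:False is already caught by True:_, whereas unreachableAfter 4 [isNil] startsTrue does not.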
To define an algorithm, I'd try to define a representation for the set of terms "not yet matched" by any pattern q0 .. qn. As an example, suppose we work with lists of booleans:
Remaining: _ (that is, any list)
q0 = []
Remaining: _:_ (any non empty list)
q1 = (True:xs)
Remaining: False:_
p = (True:False:ys)
Remaining: False:_
Here, the "remaining" set did not change, so the last pattern is overlapping.
As another example:
Remaining: _
q0 = True:[]
Remaining: [] , False:_ , True:_:_
q1 = False:xs
Remaining: [], True:_:_
q2 = True:False:xs
Remaining: [], True:True:_
q3 = []
Remaining: True:True:_
p = True:xs
Remaining: nothing -- not overlapping (and exhaustive as well!)
As you can see, in each step we match each of the "remaining" samples with the pattern at hand. This generates a new set of remaining samples (possibly none). The collection of all these samples forms the new remaining set.
For this, note that it is important to know the list of constructors for each type. This is because when matching with True, you must know there's another False case remaining. Similarly, if you match against [], there's another _:_ case remaining. Roughly, when matching against constructor K, all other constructors of the same type remain.
The above examples are not yet an algorithm, but they can get you started, hopefully.
All of this of course ignores case guards (which make the overlap undecidable), pattern guards, GADTs (which can further refine the remaining set in quite subtle ways).
I am looking for the overlapping pattern definition used in fp101x course.
"Patterns that do not rely on the order in which they are matched are
called disjoint or non-overlapping." (from "Programming in Haskell"
Graham Hutton)
So this example would be non-overlapping
foldr :: (a → b → b) → b → [a] → b
foldr f v [] = v
foldr f v (x : xs) = f x (foldr f v xs)
Because you can change the order of pattern-matching like this:
foldr :: (a → b → b) → b → [a] → b
foldr f v (x : xs) = f x (foldr f v xs)
foldr f v [] = v
And here you can't:
last :: [a] -> a
last [x] = x
last (_ : xs) = last xs
So the last one is overlapping.
I think the thing is that in the first case, not every value matching (_:xs) also matches [x], so the second equation can still be reached. In the second case, every value matching [x] has already been caught by (_:xs), so nothing ever falls through to [x]. So, overlapping really means that there is an unreachable pattern.
This is what GHC documentation has to say about it:
By default, the compiler will warn you if a set of patterns are either
incomplete (i.e., you're only matching on a subset of an algebraic
data type's constructors), or overlapping, i.e.,
f :: String -> Int
f [] = 0
f (_:xs) = 1
f "2" = 2
where the last pattern match in `f' won't ever be reached, as
the second pattern overlaps it. More often than not, redundant
patterns is a programmer mistake/error, so this option is enabled by
default.
Maybe "unreachable pattern" would be a better choice of words.
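A small sketch of what triggers the warning is shown below. The function classify is made up for this example; the two warning flags are real GHC flags, but the exact wording of the message depends on the GHC version (recent versions describe the equation as redundant).

```haskell
{-# OPTIONS_GHC -Woverlapping-patterns -Wincomplete-patterns #-}

-- Hypothetical example: the third equation can never be selected,
-- because every non-empty list is already handled by the second one.
classify :: [a] -> String
classify []    = "empty"
classify (_:_) = "non-empty"
classify [_]   = "singleton"   -- unreachable; GHC warns about this equation
```

Moving the [_] equation above the (_:_) equation makes all three reachable and removes the warning.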
I would suggest that reasoning, combined with compiler messages and test results, is a better way to understand whether a function has overlapping patterns. Here are two examples; the first, which has already been listed, does indeed result in a compiler warning.
-- The first definition should work as expected.
last1 :: [a] -> a
last1 [x] = x
last1 (_:xs) = last1 xs
In the second case, if we swap the last two lines around, then a runtime error results, which states: Program error: pattern match failure: init1 []
last :: [a] -> a
last (_:xs) = last xs
last [x] = x
This matches the logic of passing a singleton list which could match in both patterns, and in this case the now second line.
last (_:xs) = last xs
will match in both cases. If we then move onto the second example
-- The first definition should work as expected
drop :: Int -> [a] -> [a]
drop 0 xs = xs
drop n [] = []
drop n (_:xs) = drop (n - 1) xs
In the second case if we again swap the last line with the first line then we don't get a compiler error but we also don't get the results we expect. Main> drop 1 [1,2,3] returns an empty list []
drop :: Int -> [a] -> [a]
drop n (_:xs) = drop (n - 1) xs
drop 0 xs = xs
drop n [] = []
In summary, I think this is why reasoning (as opposed to a formal definition) works OK for determining overlapping patterns.
Since there is a way to bind the head and tail of a list via pattern matching, I'm wondering if you can use pattern matching to bind the last element of a list?
Yes, you can, using the ViewPatterns extension.
Prelude> :set -XViewPatterns
Prelude> let f (last -> x) = x*2
Prelude> f [1, 2, 3]
6
Note that this pattern will always succeed, though, so you'll probably want to add a pattern for the case where the list is empty, else last will throw an exception.
Prelude> f []
*** Exception: Prelude.last: empty list
Also note that this is just syntactic sugar. Unlike normal pattern matching, this is O(n), since you're still accessing the last element of a singly-linked list. If you need more efficient access, consider using a different data structure such as Data.Sequence, which offers O(1) access to both ends.
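As a sketch of the Data.Sequence alternative: the containers package exports the pattern synonyms Empty and :|> together with the Seq type, so the last element can be matched directly. The function name lastSeq is hypothetical.

```haskell
import Data.Sequence (Seq (..), fromList)

-- Hypothetical helper: match on the right end of a sequence.
-- (_ :|> x) splits off the last element in constant time.
lastSeq :: Seq a -> Maybe a
lastSeq Empty     = Nothing
lastSeq (_ :|> x) = Just x
```

For example, lastSeq (fromList [1, 2, 3]) gives Just 3. Converting a list with fromList is itself O(n), so this pays off mainly when the data lives in a Seq from the start.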
You can use ViewPatterns to do pattern matching at the end of a list, so let's do
{-# LANGUAGE ViewPatterns #-}
and use reverse as the viewFunction, because it always succeeds, so for example
printLast :: Show a => [a] -> IO ()
printLast (reverse -> (x:_)) = print x
printLast _ = putStrLn "Sorry, there wasn't a last element to print."
This is safe in the sense that it doesn't throw any exceptions as long as you covered all the possibilities.
(You could rewrite it to return a Maybe, for example.)
The syntax
mainFunction (viewFunction -> pattern) = resultExpression
is syntactic sugar for
mainFunction x = case viewFunction x of pattern -> resultExpression
so you can see it actually just reverses the list then pattern matches that, but it feels nicer.
viewFunction is just any function you like.
(One of the aims of the extension was to allow people to cleanly and easily use accessor functions
for pattern matching so they didn't have to use the underlying structure of their data type when
defining functions on it.)
The other answers explain the ViewPatterns-based solutions. If you want to make it more pattern matching-like, you can package that into a PatternSynonym:
{-# LANGUAGE PatternSynonyms, ViewPatterns #-}
tailLast :: [a] -> Maybe ([a], a)
tailLast xs@(_:_) = Just (init xs, last xs)
tailLast _ = Nothing
pattern Split x1 xs xn <- x1 : (tailLast -> Just (xs, xn))
and then write your function as e.g.
foo :: [a] -> (a, [a], a)
foo (Split head mid last) = (head, mid, last)
foo _ = error "foo: empty list"
This is my first day of Haskell programming and I also encountered the same issue, but I did not want to resort to some kind of external artifact as suggested in previous solutions.
My feeling about Haskell is that if the core language has no solution for your problem, then the solution is to transform your problem until it works for the language.
In this case transforming the problem means transforming a tail problem into a head problem, which seems the only supported operation in pattern matching. It turns out that you can easily do that using a list inversion, then work on the reversed list using head elements as you would use tail elements in the original list, and finally, if necessary, revert the result back to initial order (e.g. if it was a list).
For example, given a list of integers (eg. [1,2,3,4,5,6]), assume we want to build this list in which every second element of the original list starting from the end is replaced by its double (exercise taken from Homework1 of this excellent introduction to Haskell) : [2,2,6,4,10,6].
Then we can use the following:
revert :: [Integer] -> [Integer]
revert [] = []
revert (x:[]) = [x]
revert (x:xs) = (revert xs) ++ [x]
doubleSecond :: [Integer] -> [Integer]
doubleSecond [] = []
doubleSecond (x:[]) = [x]
doubleSecond (x:y:xs) = (x:2*y : (doubleSecond xs))
doubleBeforeLast :: [Integer] -> [Integer]
doubleBeforeLast l = ( revert (doubleSecond (revert l)) )
main = putStrLn (show (doubleBeforeLast [1,2,3,4,5,6,7,8,9]))
It's obviously much longer than previous solutions, but it feels more Haskell-ish to me.
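The same idea can be sketched more compactly with the Prelude's reverse, which runs in linear time, whereas a hand-written reversal built on ++ is quadratic. The name doubleFromEnd is invented for this illustration; doubleSecond is restated so the block stands on its own.

```haskell
-- Double every second element, counting from the front.
doubleSecond :: [Integer] -> [Integer]
doubleSecond (x:y:rest) = x : 2 * y : doubleSecond rest
doubleSecond xs         = xs   -- empty list or a single leftover element

-- Hypothetical name: the same transformation, counting from the end.
doubleFromEnd :: [Integer] -> [Integer]
doubleFromEnd = reverse . doubleSecond . reverse
```

With this, doubleFromEnd [1,2,3,4,5,6] gives [2,2,6,4,10,6], the result stated in the exercise.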
Struggling to learn Haskell: how does one take the head of a string and compare it with the next character until it finds a character for which that's not true?
In pseudo code I'm trying to:
while x == 'next char in string' put in new list to be returned
The general approach would be to create a function that recursively evaluates the head of the string until it finds the false value or reaches the end.
To do that, you would need to
understand recursion (prerequisite: understand recursion) and how to write recursive functions in Haskell
know how to use the head function
quite possibly know how to use list comprehension in Haskell
I have notes on Haskell that you may find useful, but you may well find Yet Another Haskell Tutorial more comprehensive (Sections 3.3 Lists; 3.5 Functions; and 7.8 More Lists would probably be good places to start in order to address the bullet points I mention)
EDIT0:
An example using guards to test the head element and continue only if it the same as the second element:
someFun :: String -> String
someFun [] = []
someFun [_] = []   -- a single character has no neighbour to compare with
someFun (x:y:xs)
  | x == y = someFun (y:xs)
  | otherwise = []
EDIT1:
I sort of want to say x = (newlist) and then rather than otherwise = [] have otherwise = [newlist] if that makes any sense?
It makes sense in an imperative programming paradigm (e.g. C or Java), less so for functional approaches
Here is a concrete example to, hopefully, highlight the difference between the if/then/else concept the quote suggests and what is happening in the someFun function:
When we call someFun [a,a,b,b], it matches the second equation of someFun, and since x is 'a', y is 'a', and x == y, we get someFun [a,a,b,b] = someFun [a,b,b]. This again matches the second equation, but now the condition x == y is false, so we use the otherwise guard and get someFun [a,a,b,b] = someFun [a,b,b] = []. Hence, the result of someFun [a,a,b,b] is [].
So where did the data go? Well, I'll hold my hands up and admit a bug in the code, which is now a feature I'm using to explain how Haskell functions work.
I find it helpful to think more in terms of constructing mathematical expressions rather than programming operations. So, the expression on the right of the = is your result, and not an assignment in the imperative (e.g. Java or C) sense.
I hope the concrete example has shown that Haskell evaluates expressions using substitution, so if you don't want something in your result, then don't include it in that expression. Conversely, if you do want something in the result, then put it in the expression.
Since your pseudo code is
while x == 'next char in string' put in new list to be returned
I'll modify the someFun function to do the opposite and let you figure out how it needs to be modified to work as you desire.
someFun2 :: String -> String
someFun2 [] = []
someFun2 [x] = [x]   -- keep a final lone character
someFun2 (x:y:xs)
  | x == y = []
  | otherwise = x : someFun2 (y:xs)
Example Output:
someFun2 [a,a,b,b] = []
someFun2 [a,b,b,a,b] = [a]
someFun2 [a,b,a,b,b,a,b] = [a,b,a]
someFun2 [a,b,a,b] = [a,b,a,b]
(I'd like to add at this point, that these various code snippets aren't tested as I don't have a compiler to hand, so please point out any errors so I can fix them, thanks)
There are two typical ways to get the head of a string. head, and pattern matching (x:xs).
In fact, the source for the head function shows it is simply defined with pattern matching:
head (x:_) = x
head _ = badHead
I highly recommend you check out Learn You a Haskell # Pattern Matching. It gives this example, which might help:
tell (x:y:[]) = "The list has two elements: " ++ show x ++ " and " ++ show y
Notice how it pattern matched against (x:y:[]), meaning the list must have two elements, and no more. To match the first two elements in a longer list, just swap [] for a variable (x:y:xs)
If you choose the pattern matching approach, you will need to use recursion.
Another approach is zip xs (drop 1 xs). This little idiom creates tuples from adjacent pairs in your list.
ghci> let xs = [1,2,3,4,5]
ghci> zip xs (drop 1 xs)
[(1,2),(2,3),(3,4),(4,5)]
You could then write a function that looks at these tuples one by one. It would also be recursive, but it could be written as a foldl or foldr.
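Two small sketches along these lines follow. Both names are hypothetical, and reading the question as "collect the leading run of characters equal to the first one" is an assumption about what was being asked.

```haskell
-- Hypothetical helper: the first character together with every
-- immediately following character equal to it.
leadingRun :: Eq a => [a] -> [a]
leadingRun []     = []
leadingRun (x:xs) = x : takeWhile (== x) xs

-- Hypothetical helper: fold over the adjacent pairs produced by the
-- zip idiom, counting the pairs whose two elements are equal.
countEqualNeighbours :: Eq a => [a] -> Int
countEqualNeighbours xs =
  foldr (\(a, b) n -> if a == b then n + 1 else n) 0 (zip xs (drop 1 xs))
```

For example, leadingRun "aaabb" is "aaa", and countEqualNeighbours "aabbb" is 3.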
For understanding recursion in Haskell, LYAH is again highly recommended:
Learn You a Haskell # Recursion