Span and pattern matching - haskell

The span function is defined below. I am curious as to how (ys, zs) is pattern matched with (x:ys, zs) where there is already an 'x' and a cons. I somehow believed pattern matching would be an in-place replacement, but this blows my mind and had my jaw dropped. This is really beautiful.
I am curious as to if this construct and more is explained in any book (I am currently reading Real World Haskell Chapter 4 and wonder if this book or any other explains this in detail). Sorry if I come off as naive, but to me, this is a fine pattern matching construct and I would love to know more.
span p [] = ([],[])
span p xs@(x:xs')
| p x = (x:ys,zs)
| otherwise = ([],xs)
where (ys,zs) = span p xs'

You're right, this is beautiful. It is the closest thing in Haskell to Prolog's TRMC (tail recursion modulo cons).
Let me explain. That definition is equivalent to
span p xs = case xs of
(x:t) | p x -> let (ys,zs) = span p t in
(x:ys,zs) -- value1
_ -> ([],xs) -- value2 constructed from known parts
Because Haskell is lazy, value1 is constructed and returned immediately, without any intermediate recursive calls, just like the simple value2. At this point x is already known (it was bound as part of pattern matching), but ys and zs are not calculated yet: only their definition is retained alongside value1, which has two "holes" in it, (x:_,_). Only if the value of either "hole" is demanded later will it be calculated, by making the further call to span and filling the holes with the destructured result (let bindings are pattern matches too).
This is known as guarded recursion in Haskell: the recursive call is guarded by the constructor(s), here (,) and (:), which create a value with hole(s) to be filled later as needed.
Incidentally, in Prolog this is written as
span(P,[], [],[]). % -- two "inputs", two "outputs"
span(P,XS, A,B):-
XS = [X|T],
( call(P,X) -> % -- predicate P holds for X:
A=[X|YS], B=ZS, % -- first, the value with holes is created
span(P,T, YS,ZS) % -- then, the holes are filled
; % -- else:
A=[], B=XS ). % -- output values are set
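Back on the Haskell side, the guarded recursion described above can be observed directly. The following is a minimal sketch, assuming a local copy of the definition renamed spanLazy (a name made up here to avoid clashing with the Prelude's span). Because the tuple is returned before the recursive call is evaluated, the first component can be consumed even when the input list is infinite:

```haskell
-- A local copy of span; spanLazy is a hypothetical name chosen for this sketch.
spanLazy :: (a -> Bool) -> [a] -> ([a], [a])
spanLazy _ [] = ([], [])
spanLazy p xs@(x:xs')
  | p x       = (x : ys, zs)   -- returned at once; ys and zs are still unevaluated
  | otherwise = ([], xs)
  where (ys, zs) = spanLazy p xs'

-- Every element of [1 ..] satisfies (> 0), so the otherwise branch is never
-- reached, yet a prefix of the first component can still be taken.
firstFive :: [Integer]
firstFive = take 5 (fst (spanLazy (> 0) [1 ..]))
```

Evaluating firstFive gives [1,2,3,4,5]; asking for the second component of the same call would never terminate, since no element fails the predicate.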

Much of pattern syntax can also be used for expressions, so that you can use the same syntax for taking apart data with a pattern as you use for building it with an expression.
Note that since Haskell values are immutable, there are no in place replacements.
The part (x:ys,zs) is not itself a pattern, but is an expression that builds a new value from the values x, ys and zs, which themselves come from patterns.
x comes from the pattern xs@(x:xs') and is bound to the first element of the list passed as the second argument of span. This also binds xs' to the remainder of the list, and xs to the original whole. (The @ means "match the pattern to the right but also give a name bound to the whole", and is an exception to the rule that patterns can also be used as expressions.)
ys and zs come from the pattern (ys,zs) in where (ys,zs) = span p xs'. They are bound to the first and second element of the tuple returned from a recursive call of span p xs' with the remainder of the list after x has been removed.
Putting this together, the expression (x:ys,zs) makes a tuple that is the same as the one returned from the recursive span p xs', except that x has been consed to the first tuple element.
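As a small illustration of the points above, here is a sketch with two made-up helper functions (headAndWhole and consOntoFirst are names invented for this example). The first uses an as-pattern; the second uses the same tuple syntax once as a pattern and once as an expression:

```haskell
-- whole@(x:_) binds x to the head and whole to the entire list.
headAndWhole :: [a] -> Maybe (a, [a])
headAndWhole whole@(x:_) = Just (x, whole)
headAndWhole []          = Nothing

-- On the left, (ys, zs) is a pattern that takes a tuple apart;
-- on the right, (x : ys, zs) is an expression that builds a new tuple.
consOntoFirst :: a -> ([a], [a]) -> ([a], [a])
consOntoFirst x (ys, zs) = (x : ys, zs)
```

consOntoFirst does by hand what the p x branch of span does with the result of its recursive call.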
Someone else will have to answer about books, I learned Haskell too long ago to have read them. But if everything else fails, you can read the precise definitions in the Haskell report.

Related

Haskell: Parse error in pattern x ++ xs

Doing the third of the 99-Haskell problems (I am currently trying to learn the language) I tried to incorporate pattern matching as well as recursion into my function which now looks like this:
myElementAt :: [a] -> Int -> a
myElementAt (x ++ xs) i =
if length (x ++ xs) == i && length xs == 1 then xs!!0
else myElementAt x i
Which gives me Parse error in pattern: x ++ xs. The questions:
Why does this give me a parse error? Is it because Haskell has no idea where to cut my list (which is my best guess)?
How could I reframe my function so that it works? The algorithmic idea is to check whether the list has the same length as the specified index; if yes, return the last element; if not, cut away one element at the end of the list and then do the recursion.
Note: I know that this is a really bad algorithm, but I've set myself the challenge to write that function including recursion and pattern matching. I also tried not to use the !! operator, but that is fine for me since the only thing it really does (or should do if it compiled) is to convert a one-element list into that element.
Haskell has two different kinds of value-level entities: variables (this also includes functions, infix operators like ++ etc.) and constructors. Both can be used in expressions, but only constructors can also be used in patterns.
In either case, it's easy to tell whether you're dealing with a variable or constructor: a constructor always starts with an uppercase letter (e.g. Nothing, True or StateT) or, if it's an infix, with a colon (:, :+). Everything else is a variable. Fundamentally, the difference is that a constructor is always a unique, immediately matcheable value from a predefined collection (namely, the alternatives of a data definition), whereas a variable can just have any value, and often it's in principle not possible to uniquely distinguish different variables, in particular if they have a function type.
Yours is actually a good example for this: for the pattern match x ++ xs to make sense, there would have to be one unique way in which the input list could be written in the form x ++ xs. Well, but for, say [0,1,2,3], there are multiple different ways in which this can be done:
[] ++[0,1,2,3]
[0] ++ [1,2,3]
[0,1] ++ [2,3]
[0,1,2] ++ [3]
[0,1,2,3]++ []
Which one should the runtime choose?
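The ambiguity can be made concrete by enumerating every way a list decomposes as a concatenation. This is only a sketch; allSplits is a name made up here, while inits and tails are the standard Data.List functions:

```haskell
import Data.List (inits, tails)

-- Every (prefix, suffix) pair whose concatenation gives back the input.
allSplits :: [a] -> [([a], [a])]
allSplits xs = zip (inits xs) (tails xs)
```

For [0,1,2,3] this produces the five pairs listed above, which is why a pattern of the form x ++ xs could not pick a single match.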
Presumably, you're trying to match the head and tail part of a list. Let's step through it:
myElementAt (x:_) 0 = x
This means that if the head is x, the tail is something, and the index is 0, return the head. Note that your x ++ xs is a concatenation of two lists, not the head and tail parts.
Then you can have
myElementAt (_:tl) i = myElementAt tl (i - 1)
which means that if the previous pattern was not matched, ignore the head, and take the i - 1 element of the tail.
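Putting the two clauses together gives a complete function. The sketch below is one possible variant, not the original poster's code: it uses a hypothetical name elementAtMaybe, keeps the zero-based index of the clauses above, and returns Maybe so that an empty list or a bad index does not crash:

```haskell
-- elementAtMaybe is a hypothetical name; the index is zero-based.
elementAtMaybe :: [a] -> Int -> Maybe a
elementAtMaybe [] _     = Nothing          -- ran out of list
elementAtMaybe (x:_) 0  = Just x           -- index 0 is the head
elementAtMaybe (_:tl) i
  | i < 0     = Nothing                    -- negative indices never match
  | otherwise = elementAtMaybe tl (i - 1)  -- skip the head, look in the tail
```

For example, elementAtMaybe "abc" 2 is Just 'c' and elementAtMaybe "abc" 3 is Nothing.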
In patterns, you can only use constructors like : and []. The append operator (++) is a non-constructor function.
So, try something like:
myElementAt :: [a] -> Int -> a
myElementAt (x:xs) i = ...
There are more issues in your code, but at least this fixes your first problem.
In standard Haskell, pattern matches like this:
f :: Int -> Int
f (g n 1) = n
g :: Int -> Int -> Int
g a b = a+b
are illegal because function calls aren't allowed in patterns. Your case is just a special case of this, as the operator ++ is just a function.
To pattern match on lists you can do it like this:
myElementAt :: [a] -> Int -> a
myElementAt (x:xs) i = -- result
But in this case x is of type a, not [a]: it is the head of the list and xs is its tail. You'll need to change your function implementation to accommodate this fact. Also, this function will fail on the empty list []. However, that's the idiomatic Haskell way to pattern match against lists.
I should mention that when I said "illegal" I meant in standard Haskell. There is a GHC extension that gives something similar, called ViewPatterns, but I don't think you need it, especially as you're still learning.

Using patterns to find the nth element

I'm working through Learn You A Haskell in order to come up to speed with the basics of Haskell. I'm very comfortable with both functional programming and pattern matching, but the latter more so with how Mathematica does it.
In the same spirit as the naïve implementation of head in Chapter 4.1, I proceeded with a naïve implementation of last as:
last1 :: [a] -> a
last1 (_:x:[]) = x
However, calling last1 [1,2,3,4] gave an error Exception: ... Non-exhaustive patterns in function last1. I understand that this error implies that the pattern specified does not cover all possible inputs and usually, a catch-all pattern is necessary (which I've not provided). However, I'm not exactly sure why I get this error for my input.
Question 1: My understanding (of my incorrect approach) is that the first element is captured by _ and the rest get assigned to x, which isn't exactly what I had intended. However, shouldn't this give a type error, because I specified [a] -> a, but x is now a list?
Note that this is not about how to write a working last function — I know I can write it as (among other possibilities)
last2 :: [a] -> a
last2 [x] = x
last2 (_:x) = last2 x
Question 2: Along the same theme of better understanding pattern matching in Haskell, how can I use pattern matching to pick out the last element or more generally, the nth element from a given list, say, [1..10]?
This answer suggests that you can bind the last element using pattern matching with the ViewPatterns extension, but it seems strange that there isn't an analogous "simple" pattern like for head
In Mathematica, I would probably write it as:
Range[10] /. {Repeated[_, {5}], x_, ___} :> x
(* 6 *)
to pick out the 6th element and
Range[10] /. {___, x_} :> x
(* 10 *)
to pick out the last element of a non-empty list.
I apologize if this is covered later in the text, but I'm trying to relate each topic and concept as I come across them, to how it is handled in other languages that I know so that I can appreciate the differences and similarities.
To make sense of the result of your first attempt, you need to see how the
list data is defined. Lists enjoy a somewhat special syntax, but you would
write it something like this.
data List a = (:) a (List a)
| []
So a list such as [1,2,3,4] is actually structured as
(1 : (2 : (3 : (4 : []))))
In addition, due to the right associativity of the (:) operator, your pattern
for last1 actually looks like
last1 :: [a] -> a
last1 (_:(x:[])) = x
That is why 'x' has same type as an element of your list; it is the first
argument to the (:) constructor.
Pattern matching allows you to deconstruct data structures like lists, but you
need to know what "shape" they have to do so. That is why you cannot directly
specify a pattern that will extract the last element of a list, because there
are an infinite number of lengths a list can have. That is why the working
solution (last2) uses recursion to solve the problem. You know what pattern
a list of length one has and where to find the final element; for everything
else, you can just throw away the first element and extract the last element
of the resulting, shorter, list.
If you wanted, you could add more patterns, but it would not prove that
helpful. You could write it as
last2 :: [a] -> a
last2 (x:[]) = x
last2 (_:x:[]) = x
last2 (_:_:x:[]) = x
...
last2 (x:xs) = last2 xs
But without an infinite number of cases, you could never complete the function
for all lengths of input lists. It's even more dubious when you consider the fact that
lists can actually be infinitely long; what pattern would you use to match that?
There is no way to have pattern match get the "last" element without using view patterns. That is because there is no way to get the last element of a list without using recursion (at least implicitly), and what is more, there is no decidable way to get the last element.
Your code
last1 (_:x:[]) = x
should be parsed like
last1 (_:(x:[])) = x
which can be de-sugared into
last1 a = case a of
(_:b) -> case b of
(x:c) -> case c of
[] -> x
Having completed this exercise, we see what your code does: you have written a pattern that will match a list IF the outermost constructor of the list is a cons cell AND the next constructor is a cons AND the third constructor is a nil.
So in the case of
last1 [1,2,3,4]
we have
last1 [1,2,3,4]
= last1 (1:(2:(3:(4:[]))))
= case (1:(2:(3:(4:[])))) of
(_:b) -> case b of
(x:c) -> case c of
[] -> x
= case (2:(3:(4:[]))) of
(x:c) -> case c of
[] -> x
= let x = 2 in case (3:(4:[])) of
[] -> x
= pattern match failure
Your example
last1 (_:x:[]) = x
only matches lists containing two elements i.e. lists of the form a:b:[]. _ matches the head of the list without binding, x matches the following element, and the empty list matches itself.
When pattern matching lists, only the right-most item represents a list - the tail of the matched list.
You can get the nth element from a list with a function like:
getNth :: [a] -> Int -> a
getNth [] _ = error "Out of range"
getNth (h:t) 0 = h
getNth (h:t) n = getNth t (n-1)
This is built in as the !! operator, e.g. [1..10] !! 5
You can indeed use ViewPatterns to do pattern matching at the end of a list, so let's do:
{-# LANGUAGE ViewPatterns #-}
and redefine your last1 and last2 by reversing the list before we pattern match.
This makes it O(n), but that's unavoidable with a list.
last1 (reverse -> (x:_)) = x
The syntax
mainFunction (viewFunction -> pattern) = resultExpression
is syntactic sugar for
mainFunction x = case viewFunction x of pattern -> resultExpression
so you can see it actually just reverses the list then pattern matches that, but it feels nicer.
viewFunction is just any function you like.
(One of the aims of the extension was to allow people to cleanly and easily use accessor functions
for pattern matching so they didn't have to use the underlying structure of their data type when
defining functions on it.)
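The desugaring described above can be checked side by side. This is a minimal sketch with two made-up names, lastView and lastCase; both return Maybe so that the empty list is handled without an exception:

```haskell
{-# LANGUAGE ViewPatterns #-}

-- lastView uses a view pattern: reverse is applied to the argument,
-- and the result is matched against (x:_).
lastView :: [a] -> Maybe a
lastView (reverse -> (x:_)) = Just x
lastView _                  = Nothing

-- lastCase spells out the same thing with an explicit case expression.
lastCase :: [a] -> Maybe a
lastCase xs = case reverse xs of
  (x:_) -> Just x
  _     -> Nothing
```

The two functions agree on every finite list; neither terminates on an infinite one, because reverse has to walk the whole list first.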
This last1 gives an error if the list is empty, just like the original last does.
*Main> last []
*** Exception: Prelude.last: empty list
*Main> last1 []
*** Exception: Patterns.so.lhs:7:6-33: Non-exhaustive patterns in function last1
Well, OK, not exactly, but we can change that by adding
last1 _ = error "last1: empty list"
which gives you
*Main> last1 []
*** Exception: last1: empty list
We can of course use the same trick for last2:
last2 (reverse -> (_:x:_)) = x
last2 _ = error "last2: list must have at least two elements"
But it would be nicer to define
maybeLast2 (reverse -> (_:x:_)) = Just x
maybeLast2 _ = Nothing
You can carry on this way with for example last4:
last4 (reverse -> (_:_:_:x:_)) = x
And you can see that using the reverse viewpattern,
we've changed the semantics of (_:_:_:x:_) from
(ignore1st,ignore2nd,ignore3rd,get4th,ignoreTheRestOfTheList) to
(ignoreLast,ignore2ndLast,ignore3rdLast,get4thLast,ignoreTheRestOfTheList).
You note that in Mathematica, the number of underscores is used to indicate the number of elements being ignored.
In Haskell, we just use the one _, but it can be used for any ignored value, and in the presence of the
asymmetric list constructor :, the semantics depend on which side you're on, so in a:b, the a must mean an
element and the b must be a list (which could itself be c:d because : is right associative - a:b:c means
a:(b:c)). This is why a final underscore in any list pattern represents ignoreTheRestOfTheList, and in the
presence of the reverse viewfunction, that means ignoring the front elements of the list.
The recursion/backtracking that's hidden under the hood in Mathematica is explicit here with the viewFunction reverse (which is a recursive function).

Haskell pattern matching the first, middle section, and last

So I wanted to do a simple string reverse function in Haskell
swapReverse :: String -> String
swapReverse [x] = [x]
swapReverse [x,y] = [y,x]
swapReverse (x:xs:l) = -- pattern match fails here
let last = [l]
middle = xs
first = [x]
in last ++ swapReverse middle ++ first
So is there a way to define a pattern structure in haskell that has first and last element, and all the elements in the middle ?
No, you cannot. Why? Because pattern matches match values and their subparts, but the "middle" of a list isn't a subpart of the list. The list [1, 2, 3, 4] is 1:(2:(3:(4:[]))), in terms of its structure. So you want to match first to 1 and last to 4, which are both subparts of the list, and thus not disqualified. But the middle that you want would be 2:(3:[]), which is not a subpart of the list, and thus, cannot be a match.
Note that we can't write a pattern to match the first and the last elements of a list simultaneously, either. A pattern has a depth that's fixed at compilation time.
Pattern matching works on constructors, : is the only list constructor so you can not match on the middle of the list. You need to construct the new list backwards (obviously :) ) which can be done by taking the head and appending that to the reverse of the rest of the list.
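The approach just described, taking the head and appending it to the reverse of the rest, can be sketched as follows (naiveReverse is a name made up for this example):

```haskell
-- naiveReverse is a hypothetical name; it matches only on [] and (:).
naiveReverse :: [a] -> [a]
naiveReverse []     = []
naiveReverse (x:xs) = naiveReverse xs ++ [x]  -- reverse the tail, then put the head last
```

Each ++ walks the already reversed part again, so this version takes quadratic time; it is meant to show the pattern, not to be fast.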
Try this code:
last1 (x:xs:l) = (x,xs,l)
l doesn't get you the last element in a list; it gets you the rest of the list after the first two variables, which are assigned the first two elements of the list.
When you write a pattern match for a list, the first variable is assigned the first element, and so on, until the program gets to the last variable, where everything that is left is assigned to it. There is nothing special about adding an s after an x; a variable named y would do the same thing.
If you want to get the last element of a list, you need to create a pattern similar to (x:xs), and use recursion on xs and apply that pattern until you get down to one list element, which is the last element. However, I would recommend reading Adam Bergmark's answer for a better way to reverse a list that does not involve finding the first and last elements of a list.
A working version:
swapReverse :: String -> String
swapReverse [] = []
swapReverse [x] = [x] -- a single character has no last/init to swap with
swapReverse (x:xs) = [last xs] ++ swapReverse (init xs) ++ [x]
Note that this implementation is performance-wise a disaster. Implementations using a fold and/or accumulators are much more efficient.
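For comparison, here is a sketch of the two more efficient styles mentioned above, one with an explicit accumulator and one with a left fold. The names reverseAcc and reverseFold are made up for this example:

```haskell
-- Accumulator version: each element is consed onto the front of acc,
-- so the list is traversed exactly once.
reverseAcc :: [a] -> [a]
reverseAcc = go []
  where
    go acc []     = acc
    go acc (x:xs) = go (x : acc) xs

-- The same idea expressed as a left fold.
reverseFold :: [a] -> [a]
reverseFold = foldl (flip (:)) []
```

Both run in linear time and give the same result as the Prelude's reverse on finite lists.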

Haskell: Minimum sum of list

So, I'm new here, and I would like to ask 2 questions about some code:
Duplicate each element in list by n times. For example, duplicate [1,2,3] should give [1,2,2,3,3,3]
duplicate1 xs = x*x ++ duplicate1 xs
What is wrong in here?
Take positive numbers from list and find the minimum positive subtraction. For example, [-2,-1,0,1,3] should give 1 because (1-0) is the lowest difference above 0.
For your first part, there are a few issues: you forgot the pattern in the first argument, you are trying to square the first element rather than replicate it, and there is no second case to end your recursion (it will crash). To help, here is a type signature:
replicate :: Int -> a -> [a]
For your second part, if it has been covered in your course, you could try a list comprehension to get all differences of the numbers, and then you can apply the minimum function. If you don't know list comprehensions, you can do something similar with concatMap.
Don't forget that you can check functions on http://www.haskell.org/hoogle/ (Hoogle) or similar search engines.
Tell me if you need a more thorough answer.
To your first question:
Use pattern matching. You can write something like duplicate (x:xs). This will deconstruct the first cell of the parameter list. If the list is empty, the next pattern is tried:
duplicate (x:xs) = ... -- list is not empty
duplicate [] = ... -- list is empty
The function replicate n x creates a list that contains n items x. For instance, replicate 3 'a' yields ['a','a','a'].
Use recursion. To understand, how recursion works, it is important to understand the concept of recursion first ;)
1)
dupe :: [Int] -> [Int]
dupe l = concat [replicate i i | i<-l]
There are a few problems with yours, one being that you are squaring each term, not creating a new list. In addition, your pattern matching is off and you would create an infinite recursion. Note how you recurse on the exact same list as was input. I think you mean something along the lines of duplicate1 (x:xs) = (replicate x x) ++ duplicate1 xs, and that would be fine, so long as you write a proper base case as well.
2)
This is pretty straightforward from your problem description, but probably not too efficient. First filter out the negatives, then check all subtractions with non-negative results. The answer is the minimum of these.
p2 l = let l2 = filter (\x -> x >= 0) l
in minimum [i-j | i<-l2, j<-l2, i >= j]
The problem here is that it allows a number to be checked against itself, which will always lead to an answer of zero. Any ideas? I'd like to leave it to you; the commenter has a point about spoon-feeding.
1) You can use the fact that list is a monad:
dup = (=<<) (\x -> replicate x x)
Or in do-notation:
dup xs = do x <- xs; replicate x x; return x
2) For getting only the positive numbers from a list, you can use filter:
filter (>= 0) [1,-1,0,-5,3]
-- [1,0,3]
To get all possible "pairings" you can use either monads or applicative functors:
import Control.Applicative
(,) <$> [1,2,3] <*> [1,2,3]
[(1,1),(1,2),(1,3),(2,1),(2,2),(2,3),(3,1),(3,2),(3,3)]
Of course instead of creating pairs you can generate directly differences when replacing (,) by (-). Now you need to filter again, discarding all zero or negative differences. Then you only need to find the minimum of the list, but I think you can guess the name of that function.
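The steps above can be put together roughly as follows. This is a sketch under the assumption that "positive numbers" includes 0, as the (1-0) example in the question suggests; minPositiveDiff is a made-up name, and it returns Maybe because minimum fails on an empty list:

```haskell
-- minPositiveDiff is a hypothetical name for this sketch.
minPositiveDiff :: [Int] -> Maybe Int
minPositiveDiff xs =
  case filter (> 0) ((-) <$> nonNeg <*> nonNeg) of  -- all pairwise differences, positives only
    []    -> Nothing                                -- no positive difference exists
    diffs -> Just (minimum diffs)
  where
    nonNeg = filter (>= 0) xs                       -- drop the negative inputs first
```

With the list from the question, minPositiveDiff [-2,-1,0,1,3] is Just 1.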
Here, this should do the trick:
dup [] = []
dup (x:xs) = (replicate x x) ++ (dup xs)
We define dup recursively: for empty list it is just an empty list, for a non empty list, it is a list in which the first x elements are equal to x (the head of the initial list), and the rest is the list generated by recursively applying the dup function. It is easy to prove the correctness of this solution by induction (do it as an exercise).
Now, let's analyze your initial solution:
duplicate1 xs = x*x ++ duplicate1 xs
The first mistake: you did not define the list pattern properly. According to your definition, the function has just one argument - xs. To achieve the desired effect, you should use the correct pattern for matching the list's head and tail (x:xs, see my previous example). Read up on pattern matching.
But that's not all. Second mistake: x*x is actually x squared, not a list of two values. Which brings us to the third mistake: ++ expects both of its operands to be lists of values of the same type. While in your code, you're trying to apply ++ to two values of types Int and [Int].
As for the second task, the solution has already been given.
HTH

Compare the head of a haskell string?

Struggling to learn Haskell: how does one take the head of a string and compare it with the next character until it finds a character for which that's not true?
In pseudo code I'm trying to:
while x == 'next char in string' put in new list to be returned
The general approach would be to create a function that recursively evaluates the head of the string until it finds the false value or reaches the end.
To do that, you would need to
understand recursion (prerequisite: understand recursion) and how to write recursive functions in Haskell
know how to use the head function
quite possibly know how to use list comprehension in Haskell
I have notes on Haskell that you may find useful, but you may well find Yet Another Haskell Tutorial more comprehensive (Sections 3.3 Lists; 3.5 Functions; and 7.8 More Lists would probably be good places to start in order to address the bullet points I mention)
EDIT0:
An example using guards to test the head element and continue only if it is the same as the second element:
someFun :: String -> String
someFun [] = []
someFun [_] = [] -- a single character has nothing to compare against
someFun (x:y:xs)
| x == y = someFun (y:xs)
| otherwise = []
EDIT1:
I sort of want to say x = (newlist) and then rather than otherwise = [] have otherwise = [newlist] if that makes any sense?
It makes sense in an imperative programming paradigm (e.g. C or Java), less so for functional approaches
Here is a concrete example to, hopefully, highlight the difference between the if, then, else concept the quote suggests and what is happening in the someFun function:
When we call someFun [a,a,b,b] we match this to someFun (x:y:xs), and since x is 'a', y is 'a', and x==y, someFun [a,a,b,b] = someFun [a,b,b]. This again matches someFun (x:y:xs), but the condition x==y is false, so we use the otherwise guard, and so we get someFun [a,a,b,b] = someFun [a,b,b] = []. Hence, the result of someFun [a,a,b,b] is [].
So where did the data go? Well, I'll hold my hands up and admit a bug in the code, which is now a feature I'm using to explain how Haskell functions work.
I find it helpful to think more in terms of constructing mathematical expressions rather than programming operations. So, the expression on the right of the = is your result, and not an assignment in the imperative sense (e.g. Java or C).
I hope the concrete example has shown that Haskell evaluates expressions using substitution, so if you don't want something in your result, then don't include it in that expression. Conversely, if you do want something in the result, then put it in the expression.
Since your pseudo code is
while x == 'next char in string' put in new list to be returned
I'll modify the someFun function to do the opposite and let you figure out how it needs to be modified to work as you desire.
someFun2 :: String -> String
someFun2 [] = []
someFun2 [x] = [x] -- keep the final character when nothing follows it
someFun2 (x:y:xs)
| x == y = []
| otherwise = x : someFun2 (y:xs)
Example Output:
someFun2 [a,a,b,b] = []
someFun2 [a,b,b,a,b] = [a]
someFun2 [a,b,a,b,b,a,b] = [a,b,a]
someFun2 [a,b,a,b] = [a,b,a,b]
(I'd like to add at this point, that these various code snippets aren't tested as I don't have a compiler to hand, so please point out any errors so I can fix them, thanks)
There are two typical ways to get the head of a string. head, and pattern matching (x:xs).
In fact, the source for the head function shows it is simply defined with pattern matching:
head (x:_) = x
head _ = badHead
I highly recommend you check out Learn You a Haskell # Pattern Matching. It gives this example, which might help:
tell (x:y:[]) = "The list has two elements: " ++ show x ++ " and " ++ show y
Notice how it pattern matched against (x:y:[]), meaning the list must have two elements, and no more. To match the first two elements in a longer list, just swap [] for a variable (x:y:xs)
If you choose the pattern matching approach, you will need to use recursion.
Another approach is the zip xs (drop 1 xs). This little idiom creates tuples from adjacent pairs in your list.
ghci> let xs = [1,2,3,4,5]
ghci> zip xs (drop 1 xs)
[(1,2),(2,3),(3,4),(4,5)]
You could then write a function that looks at these tuples one by one. It would also be recursive, but it could be written as a foldl or foldr.
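As one possible reading of the original question (collect characters for as long as each one equals the next), the adjacent-pairs idiom can be used like this. It is a sketch only; leadingRun is a made-up name, and it uses takeWhile rather than an explicit fold:

```haskell
-- leadingRun is a hypothetical name: it returns the run of equal
-- characters at the start of the string.
leadingRun :: String -> String
leadingRun []       = []
leadingRun xs@(x:_) =
  -- keep adjacent pairs while both halves are equal, then take the
  -- second half of each pair and put the first character back in front
  x : map snd (takeWhile (uncurry (==)) (zip xs (drop 1 xs)))
```

For example, leadingRun "aabb" is "aa" and leadingRun "abc" is "a".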
For understanding recursion in Haskell, LYAH is again highly recommended:
Learn You a Haskell # Recursion

Resources