Is the function `a -> <pattern> -> Bool` possible? - haskell

Would it be possible to have a function that takes some value and a pattern in order to check if both match?
Let's call this hypothetical function matches and with it we could rewrite the following function...
isSingleton :: [a] -> Bool
isSingleton [_] = True
isSingleton _ = False
...like so...
isSingleton xs = xs `matches` [_]
Would this be theoretically possible? If yes, how? And if not, why?

Well, you can't really use [_] as an expression – the compiler doesn't allow it. There are various bodges that could be applied to make it kind-of-possible:
-fdefer-typed-holes makes GHC ignore the _ during compilation. That doesn't really make it a legal expression, it just means the error will be raised at runtime instead. (Generally a very bad idea, should only be used for trying out something while another piece of your code isn't complete yet.)
You could define _' = undefined to have similar syntax that is accepted as an expression. Of course, that still means there will be a runtime error.
A runtime error could be caught, but only in the IO monad – which basically means: don't do anything that requires it. But of course, technically speaking you could, and then wrap it in unsafePerformIO. Disgusting, but I won't say there aren't situations where this sort of thing is a necessary evil.
With that, you could already implement a function that determines either that a value definitely doesn't match [_] (if == returns False without an error being raised), or that it possibly matches (in case the error is raised before a constructor discrepancy is found). That would be enough to determine that [] does not match [_], but not to determine that [1,_,3] does not match [1,_,0].
Again this could be bodged around with a type class that first converts the values to a proper ⊥-less tree structure (using the unsafePerformIO catch to determine how deep to descend), but... please, let's stop the thought process here. With all of that, we're clearly working against the language instead of with it.
What could of course be done is to do it in Template Haskell, but that would fix you to compile-time pattern expressions, and I doubt that's in the spirit of the question. Or you could generate a type that explicitly expresses “original type with holes in it”, so the holes could properly be distinguished without any unsafe IO shenanigans. That would become quite verbose, though.
In practice, I would just not try passing around first-class patterns, but instead pass around boolean-valued functions. Granted, functions are more general than patterns, but not in a way that would pose any practical problems.
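For example, a sketch of the predicate style (the names here are illustrative):
{-# LANGUAGE LambdaCase #-}

-- The "pattern" is just an ordinary predicate...
matchesSingleton :: [a] -> Bool
matchesSingleton = \case
  [_] -> True
  _   -> False

-- ...so it can be passed around like any other value:
countSingletons :: [[a]] -> Int
countSingletons = length . filter matchesSingleton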
Something that's more closely analogous to patterns are Prisms from the lens ecosystem. Indeed, there is the is combinator that does what you're asking for. However, lenses/prisms can again be fiddly to construct; I'm not even sure a prism corresponding to [_] can be built using the _Cons and _Empty primitives.
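For instance, a minimal sketch with the stock _Just prism (is lives in Control.Lens.Extras in current lens versions):
import Control.Lens (_Just)
import Control.Lens.Extras (is)

-- 'is' tests whether a value matches a prism, much like a pattern check:
hasValue :: Maybe Int -> Bool
hasValue = is _Just   -- True exactly for Just _ values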
tl;dr it's not worth it.

Patterns aren't first-class in Haskell; you cannot write a function that receives a pattern as an argument.
However, anywhere you would call your hypothetical matches function, you can instead make a bespoke function that tests the particular pattern you're interested in, and use that function instead. So you don't really need first class patterns, they just might save a bit of boilerplate.
The extension LambdaCase more or less allows you to write a function that just does a pattern match with minimal syntactic overhead. There is no special-case syntax for the specific purpose of returning a Bool saying whether a pattern matches (such syntax would let you avoid explicitly mapping the pattern to True and adding a catch-all alternative that maps to False).
For example:
isSingleton = \case
  [_] -> True
  _   -> False
For something like isSingleton (which already is a single pattern match encapsulated into a function) there's not much benefit in doing this over just implementing it directly. But in a more complex function where you want to call x `matches` <pattern> (or pass (`matches` <pattern>) to another function), it might be an alternative that keeps the pattern inline.
Honestly I'd probably just define functions like isSingleton for each pattern I wanted this for though (possibly just in a where clause).
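For instance, a sketch of the where-clause style:
describeList :: [a] -> String
describeList xs
  | isSingleton xs = "exactly one element"
  | otherwise      = "not a singleton"
  where
    -- the local helper plays the role of the first-class pattern
    isSingleton [_] = True
    isSingleton _   = False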

Related

Number of Parameters of a Haskell Function

When I try to compile this with GHC, it complains that the numbers of parameters in the left-hand sides of the function definition are different.
module Example where
import Data.Maybe
from_maybe :: a -> Maybe a -> a
from_maybe a Nothing = a
from_maybe _ = Data.Maybe.fromJust
I'm wondering whether this is a GHC restriction. I tried to see if I could find anything about the number of parameters in the Haskell 2010 Report, but I wasn't successful.
Is this legal Haskell or isn't it? If not, where is this parameter-count restriction listed?
It's not legal. The restriction is described in the Haskell 2010 Report:
4.4.3.1 Function bindings
[...]
Note that all clauses defining a function must be contiguous, and the number of patterns in each clause must be the same.
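One straightforward fix, then, is to give both clauses the same number of patterns (a minimal sketch):
from_maybe :: a -> Maybe a -> a
from_maybe a Nothing  = a
from_maybe _ (Just x) = x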
The answer from melpomene summarizes the syntax rule. In this answer I want to try to describe why this has to be the case.
From the point of view of the programmer, the syntax in the question seems quite reasonable. There is a first specific case which is marked by a value in the second parameter. Then there is a general solution which reduces to an existing function. Each clause on its own is valid, so why can they not co-exist?
The syntax rule that all clauses must take exactly the same number of patterns is a necessary result of Haskell being a lazy language. Being lazy means that the values of parameters are not evaluated until they are needed.
Let's look at what happens when we curry a function, that is, provide it with fewer than the number of parameters in the declaration. When that happens, Haskell produces an anonymous function* that will complete the function, and stores the first parameters in the scope of that function without evaluating them.
That last part is important. If the clauses took different numbers of parameters, deciding which clause applies at a given partial application would require inspecting – and hence evaluating – arguments that are supposed to remain unevaluated.
In other words, the compiler needs to know exactly how many parameters a function takes, without evaluating anything, before it can choose among the patterns. So while it seems like we should not need to put an extra _ in, the compiler needs it to be there.
As an aside, the order in which parameters occur can make a big difference in how easy a function is to implement and use. See the answers to my question Ordering of parameters to make use of currying for suggestions. If the order is reversed on the example function, it can be implemented with only the one named point.
from_maybe :: Maybe a -> a -> a
from_maybe Nothing = id
from_maybe (Just x) = const x
*: What an implementation like GHC actually does in this situation is subject to optimization and may not work exactly this way, but the end result must be the same as if it did.

How does the presence of the "error" function bear on the purity of Haskell?

I've always wondered how the Haskell exception system fits in with the whole "Pure functional language" thing. For example see the below GHCi session.
GHCi, version 8.0.1: http://www.haskell.org/ghc/ :? for help
Prelude> head []
*** Exception: Prelude.head: empty list
Prelude> :t head
head :: [a] -> a
Prelude> :t error
error :: [Char] -> a
Prelude> error "ranch"
*** Exception: ranch
CallStack (from HasCallStack):
error, called at <interactive>:4:1 in interactive:Ghci1
Prelude>
The type of head is [a] -> a. But when you call it on the special case of an empty list, you get an exception instead. But this exception is not accounted for in the type signature.
If I remember correctly it's a similar story when there is a failure during pattern matching. It doesn't matter what the type signature says, if you haven't accounted for every possible pattern, you run the risk of throwing an exception.
I don't have a single, concise question to ask, but my head is swimming. What was the motivation for adding this strange exception system to an otherwise pure and elegant language? Is it still pure but I'm just missing something? If I want to take advantage of this exception feature, how would I go about doing it (ie how do I catch and handle exceptions? is there anything else I can do with them?) For example, if ever I write code that uses the "head" function, surely I should take precautions for the case where an empty list somehow smuggles itself in.
You are confusing two concepts: purity and totality.
Purity says that functions have no side effects.
Totality says that every function terminates and produces a value.
Haskell is pure, but is not total.
Outside of IO, nontermination (e.g., let loop = loop in loop) and exceptions (e.g., error "urk!") are the same – nonterminating and exceptional terms, when forced, do not evaluate to a value. The designers of Haskell wanted a Turing-complete language, which – as per the halting problem – means that they forwent totality. And once you have nontermination, I suppose you might as well have exceptions, too – defining error msg = error msg and having calls to error do nothing forever is much less satisfying in practice than actually seeing the error message you want in finite time!
In general, though, you're right – partial functions (those which are not defined for every input value, like head) are ugly. Modern Haskell generally prefers writing total functions instead by returning Maybe or Either values, e.g.
safeHead :: [a] -> Maybe a
safeHead [] = Nothing
safeHead (x:_) = Just x
errHead :: [a] -> Either String a
errHead [] = Left "Prelude.head: empty list"
errHead (x:_) = Right x
In this case, the Functor, Applicative, Monad, MonadError, Foldable, Traversable, etc., machinery makes combining these total functions and working with their results easy.
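As a tiny illustration of that machinery (a sketch; safeTail is a hypothetical helper in the same style as safeHead above):
safeTail :: [a] -> Maybe [a]
safeTail []     = Nothing
safeTail (_:xs) = Just xs

-- The Monad instance chains the total functions; any Nothing short-circuits:
secondElem :: [a] -> Maybe a
secondElem xs = safeTail xs >>= safeHead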
Should you actually come across an exception in your code – for instance, you might use error to check a complicated invariant in your code that you think you've enforced, but you have a bug – you can catch it in IO. This returns us to the question of why it's OK to interact with exceptions in IO – doesn't that make the language impure? The answer is the same as that to the question of why we can do I/O in IO, or work with mutable variables: evaluating a value of type IO A doesn't produce the side effects that it describes; it's just an action that describes what a program could do. (There are better descriptions of this elsewhere on the internet; exceptions aren't any different than other effects.)
(Also, note that there is a separate-but-related exception system in IO, which is used when e.g. trying to read a file that isn't there. People are often OK with this exception system, in moderation, because since you're in IO you're already working with impure code.)
For example, if ever I write code that uses the "head" function, surely I should take precautions for the case where an empty list somehow smuggles itself in.
A simpler solution: don't use head. There are plenty of replacements: listToMaybe from Data.Maybe, the various alternative implementations in the safe package, etc. The partial functions [1] in the base libraries -- especially ones as easy to replace as head -- are little more than historical cruft, and should be either ignored or replaced by safe variants, such as those in the aforementioned safe package. For further arguments, here is an entirely reasonable rant about partial functions.
If I want to take advantage of this exception feature, how would I go about doing it (ie how do I catch and handle exceptions? is there anything else I can do with them?)
Exceptions of the sort thrown by error can only be caught in the IO monad. If you are writing pure functions you won't want to force your users to run them in the IO monad merely for catching exceptions. Therefore, if you ever use error in a pure function, assume the error will not be caught [2]. Ideally you shouldn't use error in pure code at all, but if you are somehow compelled to do so, at least make sure to write an informative error message (that is, not "Prelude.head: empty list") so that your users know what is going on when the program crashes.
If I remember correctly it's a similar story when there is a failure during pattern matching. It doesn't matter what the type signature says, if you haven't accounted for every possible pattern, you run the risk of throwing an exception.
Indeed. The only difference from using head to writing the incomplete pattern match (\(x:_) -> x) by yourself explicitly is that in the latter case the compiler will at least warn you if you use -Wall, while with head even that is swept under the rug.
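For instance, a definition like this triggers that warning (a minimal sketch; with -Wall, GHC reports the match as non-exhaustive):
firstOne :: [a] -> a
firstOne (x:_) = x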
I've always wondered how the Haskell exception system fits in with the whole "Pure functional language" thing.
Technically speaking, partial functions don't affect purity (which doesn't make them any less nasty, of course). From a theoretical point of view, head [] is just as undefined as things like foo = let x = x in x. (The keyword for further reading into such subtleties is "bottom".)
[1]: Partial functions are functions that, just like head, are not defined for some values of the argument types they are supposed to take.
[2]: It is worth mentioning that exceptions in IO are a whole different issue, as you can't trivially avoid e.g. a file read failure just by using better functions. There are quite a few approaches towards handling such scenarios in a sensible way. If you are curious about the issue, here is one "highly opinionated" article about it that is illustrative of the relevant tools and trade-offs.
Haskell does not require that your functions be total, and doesn't track when they're not. (Total functions are those that have a well-defined output for every possible value of their input type.)
Even without exceptions or pattern match failures, you can have a function that doesn't define output for some inputs by just going on forever. An example is length (repeat 1). This continues to compute forever, but never actually throws an error.
The way Haskell semantics "copes" with this is to declare that there is an "extra" value in every single type; the so called "bottom value", and declare that any computation that doesn't properly complete and produce a normal value of its type actually produces the bottom value. It's represented by the mathematical symbol ⊥ (only when talking about Haskell; there isn't really any way in Haskell to directly refer to this value, but undefined is often also used since that is a Haskell name that is bound to an error-raising computation, and so semantically produces the bottom value).
This is a theoretical wart in the system, since it gives you the ability to create a 'value' of any type (albeit not a very useful one), and a lot of the reasoning about bits of code being correct based on types actually relies on the assumption that you can't do exactly that (if you're into the Curry-Howard isomorphism between pure functional programs and formal logic, the existence of ⊥ gives you the ability to "prove" logical contradictions, and thus to prove absolutely anything at all).
But in practice it seems to work out that all the reasoning done by pretending that ⊥ doesn't exist in Haskell still generally works well enough to be useful when you're writing "well-behaved" code that doesn't use ⊥ very much.
The main reason for tolerating this situation in Haskell is ease of use as a programming language, rather than as a system of formal logic or mathematics. It's impossible to make a compiler that could actually tell, for arbitrary Haskell-like code, whether each function is total or partial (see the Halting Problem). So a language that wanted to enforce totality would have to either remove a lot of the things you can do, or require you to jump through lots of hoops to demonstrate that your code always terminates, or both. The Haskell designers didn't want to do that.
So given that Haskell as a language is resigned to partiality and ⊥, it may as well give you things like error as a convenience. After all, you could always write an error :: String -> a function by just not terminating; getting an immediate printout of the error message rather than having the program just spin forever is a lot more useful to practicing programmers, even if those are both equivalent in the theory of Haskell semantics!
Similarly, the original designers of Haskell decided that implicitly adding a catch-all case to every pattern match that just errors out would be more convenient than forcing programmers to add the error case explicitly every time they expect a part of their code to only ever see certain cases. (Although a lot of Haskell programmers, including me, work with the incomplete-pattern-match warning, almost always treat it as an error and fix their code, and so would probably prefer that the original Haskell designers had gone the other way on this one.)
TLDR; exceptions from error and pattern match failure are there for convenience, because they don't make the system any more broken than it already has to be, without being quite a different system than Haskell.
You can program by throwing and catching exceptions if you really want, including catching the exceptions from error or pattern match failure, by using the facilities from Control.Exception.
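A minimal sketch of doing that with try and evaluate from Control.Exception:
import Control.Exception (SomeException, evaluate, try)

main :: IO ()
main = do
  -- evaluate forces the pure computation inside IO, so try can catch
  -- the exception that head raises on the empty list:
  r <- try (evaluate (head ([] :: [Int]))) :: IO (Either SomeException Int)
  case r of
    Left e  -> putStrLn ("caught: " ++ show e)
    Right x -> print x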
In order to not break the purity of the system, you can raise exceptions from anywhere (the system always has to deal with the possibility of a function not properly terminating and producing a value; "raising an exception" is just another way in which that can happen), but exceptions can only be caught by constructs in IO. Because the formal semantics of IO permit basically anything to happen (it has to interface with the real world, and there aren't really any hard restrictions we can impose on that from the definition of Haskell), we can relax most of the rules we usually need for pure functions and still have something that technically fits the model of Haskell code being pure.
I haven't used this very much at all (usually I prefer to keep my error handling using things that are more well-defined in terms of Haskell's semantic model than the operational model of what IO does, which can be as simple as Maybe or Either), but you can read about it if you want to.

Haskell usage of null

I'm reading through Real World Haskell and in my copy, on page 59, it states:
In Haskell we don't have the equivalent of null. We could use Maybe... Instead we've decided to use a no-argument Empty constructor
Then, in the next section, on Errors, there is the Haskell function:
mySecond xs = if null (tail xs)
              then error "list too short"
              else head (tail xs)
Now, I don't understand what the "null" in this function definition is referring to, since it was stated clearly that there is no equivalent to (Java's) null in Haskell.
Any help appreciated.
null is a function which simply tests if a list is empty. It doesn't have anything to do with nullable values in other languages.
null :: [a] -> Bool
null [] = True
null (_:_) = False
Not really an answer to your question, but I feel it fits here to give some technical background of what the book is talking about:
Most language implementations nowadays (C++ is the most important counterexample¹) store objects not right "in place" where they are used (i.e. in the function-call stack), but in some random place on the heap. All that's stored on the stack is a pointer/reference² to the actual object.
One consequence of this is that you can, with mutability, easily switch a reference to point to another object, without needing to meddle with that object itself. Also, you don't even need to have an object; instead you can consider the reference a "promise" that there will be an object by the time somebody dereferences it. Now if you've learned some Haskell, that will heavily remind you of lazy evaluation, and indeed it's at the ground of how laziness is implemented. This allows not only Haskell's infinite lists etc.; it also lets you structure both code and data more nicely – in Haskell we call it "tying the knot", and particularly in OO languages it's quite detrimental to the way you construct objects.
Imperative languages can't do this "promise" thing as nicely and automatically as Haskell can: lazy evaluation without referential transparency would lead to awfully unpredictable side effects. So when you promise a value, you need to keep track of where it'll soonest be used, and make sure you manually construct the promised object before that point. Obviously, it's a big disaster to use a reference that just points to some random place in memory (as can easily happen in C), so it's now standard to point to a conventional spot that can't be a valid memory location: this is the null pointer. That way, you get at least a predictable error message, but an error it is nevertheless; Tony Hoare therefore now calls this idea of his a "billion-dollar mistake".
So null references are evil, Haskell cleverly avoids them. Or does it?
Actually, it can be perfectly reasonable not to have an object at all, namely when you explicitly allow empty data structures. The classic example is linked lists, and thus Haskell has a specialised null function that checks just this: is the list empty, or is there a head I can safely refer to?
Still – this isn't really idiomatic, for Haskell has a far more general and convenient way of safely resolving such optional-value scenarios: pattern matching. The idiomatic version of your example would be
mySecond' (_:x2:_) = x2
mySecond' _ = error "list too short"
Note that this is not just more concise, but also safer than your code: you check null (tail xs), but if xs itself is empty then tail will already throw an error.
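A total variant in the same spirit returns Maybe instead of calling error (a sketch):
mySecondSafe :: [a] -> Maybe a
mySecondSafe (_:x2:_) = Just x2
mySecondSafe _        = Nothing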
¹C++ certainly allows storing stuff on the heap too, but experienced programmers like to avoid it for performance reasons. On the other hand, Java stores "built-in" types (you can recognise them by lowercase names, e.g. int) on the stack, to avoid redirection cost.
²Consider the words pointer and reference as synonyms, though some languages prefer either or give them slightly different meanings.

Can I pass a pattern into a function?

I want a function like following, is this possible?
in fact, I don't know if the type Pattern exists.
fun1 :: Pattern a -> a -> Bool
fun1 pattern a = case a of
  pattern -> True
  _       -> False
I don't think this is possible in Haskell.
However, in your case, the pattern is effectively just a function of type a -> Bool. So instead of accepting a pattern, accept any function from a to Bool. Your example is equivalent to applying a function a -> Bool to an a.
Now, if you wanted to do something more general, like being able to use the matched symbols from the pattern in the body of fun1, you would not be able to do it with a function. However, I doubt this is possible with Haskell at all--it would require weird extensions to the type system to make any sense. Pattern matching in Haskell is not a first-class citizen at all, so you can't really pass patterns around like that.
If you want this kind of behavior, check out the book Pattern Calculus where the author develops and formalizes a language with more general pattern-matching features than Haskell. It makes patterns a first-class citizen, unlike Haskell. I haven't actually finished this book yet, but I'm pretty sure that code like that is exactly what you would be able to write, among other things.
The author built a language around his ideas about pattern matching called bondi; it's probably also worth checking out, especially if you don't want to bother with the book. I don't know if it's ready for practical use, but it's certainly interesting.
Check out the Functional Pearl, Type Safe Pattern Combinators. A bit of Googling shows that there is a Hackage package based on it as well.
I'm pretty sure that you are looking for View Patterns.
(see trac/ghc/wiki or ghc/user-manual/syntax-extensions)
Every function is a "Pattern":
case "string that ends with x" of
(last->'x') -> True
_ -> False
case "foo" of
(elemIndex 'g'->Just i) -> i+5
(elemIndex 'f'->Nothing) -> 23
_ -> 42
do
  x <- fmap foo bar
=
do
  (foo -> x) <- bar
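Putting the pieces together, a self-contained sketch of the view-pattern style (the describe function is illustrative):
{-# LANGUAGE ViewPatterns #-}

import Data.List (elemIndex)

-- The expression left of -> is an ordinary function; the pattern
-- matches on its result:
describe :: String -> String
describe (elemIndex 'x' -> Just i) = "first 'x' at index " ++ show i
describe _                         = "no 'x' here"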

What is the difference between Pattern Matching and Guards?

I am very new to Haskell and to functional programming in general. My question is pretty basic. What is the difference between Pattern Matching and Guards?
Function using pattern matching
check :: [a] -> String
check [] = "Empty"
check (x:xs) = "Contains Elements"
Function using guards
check_ :: [a] -> String
check_ lst
  | length lst < 1 = "Empty"
  | otherwise      = "Contains elements"
To me it looks like Pattern Matching and Guards are fundamentally the same. Both evaluate a condition, and if true will execute the expression hooked to it. Am I correct in my understanding?
In this example I can either use pattern matching or guards to arrive at the same result. But something tells me I am missing out on something important here. Can we always replace one with the other?
Could someone give examples where pattern matching is preferred over guards and vice versa?
Actually, they're fundamentally quite different! At least in Haskell, at any rate.
Guards are both simpler and more flexible: They're essentially just special syntax that translates to a series of if/then expressions. You can put arbitrary boolean expressions in the guards, but they don't do anything you couldn't do with a regular if.
Pattern matches do several additional things: They're the only way to deconstruct data, and they bind identifiers within their scope. In the same sense that guards are equivalent to if expressions, pattern matching is equivalent to case expressions. Declarations (either at the top level, or in something like a let expression) are also a form of pattern match, with "normal" definitions being matches with the trivial pattern, a single identifier.
Pattern matches also tend to be the main way stuff actually happens in Haskell--attempting to deconstruct data in a pattern is one of the few things that forces evaluation.
By the way, you can actually do pattern matching in top-level declarations:
square = (^2)
(one:four:nine:_) = map square [1..]
This is occasionally useful for a group of related definitions.
GHC also provides the ViewPatterns extension which sort of combines both; you can use arbitrary functions in a binding context and then pattern match on the result. This is still just syntactic sugar for the usual stuff, of course.
As for the day-to-day issue of which to use where, here's some rough guides:
Definitely use pattern matching for anything that can be matched directly one or two constructors deep, where you don't really care about the compound data as a whole, but do care about most of the structure. The @ syntax lets you bind the overall structure to a variable while also pattern matching on it, but doing too much of that in one pattern can get ugly and unreadable quickly.
Definitely use guards when you need to make a choice based on some property that doesn't correspond neatly to a pattern, e.g. comparing two Int values to see which is larger.
If you need only a couple pieces of data from deep inside a large structure, particularly if you also need to use the structure as a whole, guards and accessor functions are usually more readable than some monstrous pattern full of @ and _.
If you need to do the same thing for values represented by different patterns, but with a convenient predicate to classify them, using a single generic pattern with a guard is usually more readable. Note that if a set of guards is non-exhaustive, anything that fails all the guards will drop down to the next pattern (if any). So you can combine a general pattern with some filter to catch exceptional cases, then do pattern matching on everything else to get details you care about (see the sketch after this list).
Definitely don't use guards for things that could be trivially checked with a pattern. Checking for empty lists is the classic example, use a pattern match for that.
In general, when in doubt, just stick with pattern matching by default, it's usually nicer. If a pattern starts getting really ugly or convoluted, then stop to consider how else you could write it. Besides using guards, other options include extracting subexpressions as separate functions or putting case expressions inside the function body in order to push some of the pattern matching down onto them and out of the main definition.
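As a small illustration of combining a general pattern with a guard to catch an exceptional case (a sketch):
-- The guard on the general pattern catches the exceptional case; input
-- that fails the guard falls through to the detailed equations below:
step :: [Int] -> [Int]
step xs | any (< 0) xs = []
step (x:xs) = x + 1 : step xs
step []     = []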
For one, you can put boolean expressions within a guard.
For example:
Just as with list comprehensions, boolean expressions can be freely mixed in among the pattern guards. For example:
f x | [y] <- x
    , y > 3
    , Just z <- h y
    = ...
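A concrete, runnable instance of the same shape (lookup is from the Prelude; the other names are illustrative):
sizeOf :: String -> [(String, Int)] -> String
sizeOf key table
  | Just v <- lookup key table   -- pattern guard: binds v if lookup succeeds
  , v > 100                      -- boolean guard mixed in
  = key ++ " is big"
  | otherwise = key ++ " is small or missing"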
Update
There is a nice quote from Learn You a Haskell about the difference:
Whereas patterns are a way of making sure a value conforms to some form and deconstructing it, guards are a way of testing whether some property of a value (or several of them) are true or false. That sounds a lot like an if statement and it's very similar. The thing is that guards are a lot more readable when you have several conditions and they play really nicely with patterns.
To me it looks like Pattern Matching and Guards are fundamentally the same. Both evaluate a condition, and if true will execute the expression hooked to it. Am I correct in my understanding?
Not quite. First, pattern matching cannot evaluate arbitrary conditions. It can only check whether a value was created using a given constructor.
Second, pattern matching can bind variables. So while the pattern [] might be equivalent to the guard null lst (not using length because that'd not be equivalent - more on that later), the pattern x:xs most certainly is not equivalent to the guard not (null lst), because the pattern binds the variables x and xs, which the guard does not.
A note on using length: using length to check whether a list is empty is very bad practice, because to calculate the length it needs to traverse the whole list, which takes O(n) time, while just checking whether the list is empty takes O(1) time with null or pattern matching. Furthermore, length just plain does not work on infinite lists.
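The infinite-list point is easy to see in a sketch:
-- null inspects only the outermost constructor, so this returns at once:
isEmptyInf :: Bool
isEmptyInf = null [1 ..]          -- False, immediately

-- length [1 ..] < 1 would loop forever: length must walk the entire list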
In addition to the other good answers, I'll try to be specific about guards: Guards are just syntactic sugar. If you think about it, you will often have the following structure in your programs:
f y = ...
f x = if p x then A else B
That is, if a pattern matches, it is followed right after by an if-then-else discrimination. A guard folds this discrimination into the pattern match directly:
f y = ...
f x | p x       = A
    | otherwise = B
(otherwise is defined to be True in the standard library.) It is more convenient than an if-then-else chain, and it often makes the individual cases simpler to write than the if-then-else construction.
In other words, it is sugar on top of another construction in a way which greatly simplifies your code in many cases. You will find that it eliminates a lot of if-then-else chains and makes your code more readable.
