Why do non-exhaustive guards cause irrefutable pattern match to fail? - haskell

I have this function in Haskell:
test :: (Eq a) => a -> a -> Maybe a
test a b
  | a == b = Just a
test _ _ = Nothing
This is what I got when I tried the function with different inputs:
ghci>test 3 4
Nothing
ghci>test 3 3
Just 3
According to Real World Haskell, the first pattern is irrefutable. But it seems like test 3 4 doesn't fail at the first pattern, and matches the second. I expected some kind of error -- maybe 'non-exhaustive guards'. So what is really going on here, and is there a way to enable compiler warnings in case this accidentally happens?

The first pattern is indeed an "irrefutable pattern", however this does not mean that it will always choose the corresponding right hand side of your function. It is still subject to the guard which may fail as it does in your example.
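As a minimal sketch of that fall-through behaviour (classify is a made-up function, not part of the question): when every guard of a clause fails, matching simply continues with the next clause.

```haskell
-- Hypothetical example; `classify` does not appear in the question.
classify :: Int -> String
classify n
  | n < 0 = "negative"   -- the pattern n always matches, but this guard can fail
classify 0 = "zero"      -- tried only when the guard above fails
classify _ = "positive"  -- catch-all clause
```

Here classify 5 matches the pattern n, fails the guard, skips the 0 clause, and ends up in the last clause.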
To ensure all cases are covered, it is common to use otherwise to have a final guard which will always succeed.
test :: (Eq a) => a -> a -> Maybe a
test a b
  | a == b = Just a
  | otherwise = Nothing
Note that there is nothing magic about otherwise. It is defined in the Prelude as otherwise = True. However, it is idiomatic to use otherwise for the final case.
Having the compiler warn about non-exhaustive guards is impossible in the general case, as it would involve solving the halting problem. However, there are tools like Catch which attempt to do a better job than the compiler at determining whether all cases are covered in the common case.

The compiler should be warning you if you leave out the second clause, i.e. if your last match has a set of guards where the last one is not trivially true.
Generally, testing guards for completeness is obviously not possible, as it would be as hard as solving the halting problem.
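To see such a warning in practice, one option is GHC's -Wincomplete-patterns flag (also enabled by -Wall). The sketch below uses a made-up name, testPartial, for the question's function with the second clause removed. GHC should report the match as non-exhaustive at compile time, though the exact message depends on the GHC version.

```haskell
{-# OPTIONS_GHC -Wincomplete-patterns #-}

-- Hypothetical variant of the question's `test` with no fallback clause.
-- Compiling this is expected to produce a non-exhaustive pattern warning.
testPartial :: Eq a => a -> a -> Maybe a
testPartial a b
  | a == b = Just a
```

Calling testPartial 3 4 would fail at run time, because no clause applies once the only guard fails.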
Answer to Matt's comment:
Look at the example:
foo a b
  | a <= b = True
  | a > b = False
A human can see that one of the two guards must be true. But the compiler does not know that either a <= b or a > b holds.
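One way to sidestep the problem for this particular example is to branch on the result of compare instead of on two separate Bool guards. Ordering has exactly three constructors, so exhaustiveness becomes a matter of patterns, which the compiler can check. This is a sketch of an alternative, not what the answer itself proposes.

```haskell
-- Same intent as the guarded foo, but the three cases are constructors
-- of Ordering, so a missing case is visible to the pattern-match checker.
foo :: Ord a => a -> a -> Bool
foo a b = case compare a b of
  LT -> True
  EQ -> True
  GT -> False
```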
Now look for another example:
fermat a b c n
  | a^n + b^n /= c^n = ....
  | n < 0 = undefined
  | n < 3 = ....
To prove that the set of guards is complete, the compiler would have to prove Fermat's Last Theorem. It's impossible to do that in a compiler. Remember that the number and complexity of the guards is not limited. The compiler would have to be a general solver for mathematical problems, problems that are stated in Haskell itself.
More formally, in the easiest case:
f x | p x = y
the compiler must prove that if p x is not bottom, then p x is True for all possible x. In other words, it must prove that either p x is bottom (does not halt) no matter what x is or evaluates to True.

Guards aren't irrefutable. But it is very common (and good) practice to add one last guard that catches the remaining cases, so your function becomes:
test :: (Eq a) => a -> a -> Maybe a
test a b
  | a == b = Just a
  | True = Nothing

Related

Why would you write Haskell like this?

I've been reading some Haskell code and keep seeing functions that look something like this:
ok :: a -> Result i w e a
ok a =
  Result $ \i w _ good ->
    good i w a
Why is a lambda used? Why wouldn't you just write the following?:
ok :: a -> Result i w e a
ok a =
  Result $ good i w a
This is continuation passing style or "CPS".
So first off, your alternative example doesn't make sense. good, i, and w are not known at the point they are used, and you will get an error.
The basic idea of continuation passing style is that instead of returning the relevant information, you instead call a function that you are given (in this case good), passing it your intended result as an argument. Presumably (based on the naming) the ignored argument _ would have been called bad, and it is a function that you would call in the case of failure.
If you are the ok function, it's like the difference between asking you to
Bake me a batch of cookies.
(where I have the intention of giving the cookies to Dave), and
Bake a batch of cookies and then give it to Dave.
which accomplishes the same thing but now I don't have to be a middleman anymore. There are often performance advantages to cutting me out as a middleman, and also it means you can do more things, for example if the batch of cookies is really good you might decide to give it to your mom instead of Dave (thus aborting whatever Dave would have done with them), or bake two batches and give them both to Dave (duplicating what Dave would have done). Sometimes you want this ability and other times you don't, it depends on context. (N.B. in the below examples the types are sufficiently general to disallow these possibilities)
Here is a very simple example of continuation passing style. Say you have a program
pred :: Integer -> Maybe Integer
pred n = if n > 0 then Just (n-1) else Nothing
which subtracts 1 from a number and returns it (in a Just constructor), unless the result would be negative, in which case it returns Nothing. You might use it like this:
main = do
  x <- readLn
  case pred x of
    Just predx -> putStrLn $ "The predecessor is " ++ show predx
    Nothing -> putStrLn "Can't take the predecessor"
We can encode this in continuation passing style by, instead of returning Maybe, have pred take an argument for what to do in each case:
pred :: Integer -> (Integer -> r) -> r -> r
--                 ^^^^^^^^^^^^^^    ^
--                 Just case         |
--                                   Nothing case
pred n ifPositive ifNegative =
  if n > 0
    then ifPositive (n-1)
    else ifNegative
And the usage becomes:
main = do
  x <- readLn
  pred x (\predx -> putStrLn $ "The predecessor is " ++ show predx)
         (putStrLn "Can't take the predecessor")
See how that works? -- doing it the first way we got the result and then did case analysis; in the second way each case became an argument to the function. And in the process the call to pred became a tail call, eliminating the need for a stack frame and the intermediate Maybe data structure.
The only remaining problem is that pred's signature is kind of confusing. We can make it a bit clearer by wrapping the CPS stuff in its own type constructor (the forall needs GHC's RankNTypes extension):
{-# LANGUAGE RankNTypes #-}
newtype CPSMaybe a = CPSMaybe (forall r. (a -> r) -> r -> r)
pred :: Integer -> CPSMaybe Integer
pred n = CPSMaybe $ \ifPositive ifNegative ->
  if n > 0
    then ifPositive (n-1)
    else ifNegative
which has a signature that looks more like the first one but with code that looks like the second (except for the CPSMaybe newtype wrapper, which has no effect at runtime). And now maybe you can see the connection to the code in your question.
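To make the wrapped version usable end to end, here is a minimal sketch. The names predCPS, runCPSMaybe and toMaybe are introduced only for illustration (predCPS avoids the clash with the Prelude's pred); they are not part of the answer above.

```haskell
{-# LANGUAGE RankNTypes #-}

newtype CPSMaybe a = CPSMaybe (forall r. (a -> r) -> r -> r)

-- Same logic as the CPS pred, under a name that does not shadow Prelude.pred.
predCPS :: Integer -> CPSMaybe Integer
predCPS n = CPSMaybe (\ifPositive ifNegative ->
  if n > 0 then ifPositive (n - 1) else ifNegative)

-- Hypothetical helper: unwrap the newtype and supply both continuations.
runCPSMaybe :: CPSMaybe a -> (a -> r) -> r -> r
runCPSMaybe (CPSMaybe k) onJust onNothing = k onJust onNothing

-- Hypothetical helper: recover the ordinary Maybe encoding.
toMaybe :: CPSMaybe a -> Maybe a
toMaybe m = runCPSMaybe m Just Nothing
```

For instance, runCPSMaybe (predCPS 5) show "none" hands 4 to show, while toMaybe recovers a plain Maybe by passing Just and Nothing as the two continuations.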
Well, the Result type apparently wraps a function, so a lambda is the natural thing to use here. If you wanted to avoid a lambda, you could use a local definition instead using let or where, e.g.:
ok a = let
    proceed i w _ good = good i w a
  in Result proceed
-- or --
ok a = Result proceed
  where
    proceed i w _ good = good i w a
Writing this won’t work because the variables i, w, and good are not in scope:
ok :: a -> Result i w e a
ok a =
  Result $ good i w a
I wonder if the source of your confusion is the fact that i and w are also used as type variables in the signature of ok, but they’re different variables that happen to have the same names. It’s just as if you’d written something like this:
ok :: a -> Result i w e a
ok value =
  Result $ continue index writer value
Here it should be obvious that the continue, index, and writer variables aren’t defined.
In the first sample, good, w and i are locally defined parameters of the lambda expression. In the second sample, they are free variables. I would expect the second sample to fail with an error saying that those identifiers are not in scope. Result apparently is a type that contains information about how to use given data and handlers. ok says to take the data and apply the handler indicating a good outcome to it. In the second sample, it is not clear that one is even referring to the arguments available to what Result wraps, or which names refer to which arguments.
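The question does not show how Result is defined, so the following is only a guess at a compatible shape: a newtype around a function that receives some input i, some state w, a failure continuation and a success continuation. Under that assumption, ok compiles precisely because the lambda brings i, w and good into scope. runResult is likewise a hypothetical helper.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Assumed definition; the real library's Result may differ.
newtype Result i w e a = Result
  (forall r. i -> w -> (i -> w -> e -> r) -> (i -> w -> a -> r) -> r)

-- The lambda's parameters are what make i, w and good available.
ok :: a -> Result i w e a
ok a = Result (\i w _ good -> good i w a)

-- Hypothetical runner: supply continuations that build an Either.
runResult :: Result i w e a -> i -> w -> Either e a
runResult (Result k) i w = k i w (\_ _ e -> Left e) (\_ _ a -> Right a)
```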

Haskell's `otherwise` is a synonym for `_`?

I ran across a piece of code recently that used Haskell's otherwise to pattern match on a list. This struck me as odd, since:
ghci> :t otherwise
otherwise :: Bool
So, I tried the following:
ghci> case [] of otherwise -> "!?"
"!?"
I also tried it with various other patterns of different types and with -XNoImplicitPrelude turned on (to remove otherwise from scope), and it still works. Is this supposed to happen? Where is this documented?
It's not equivalent to _, it's equivalent to any other identifier. That is if an identifier is used as a pattern in Haskell, the pattern always matches and the matched value is bound to that identifier (unlike _ where it also always matches, but the matched value is discarded).
Just to be clear: the identifier otherwise is not special here. The code could just as well have been x -> "!?". Also, since the binding is never actually used, it would make more sense to use _ to avoid an "unused identifier" warning and to make it obvious to the reader that the value does not matter.
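A small sketch makes the binding visible (lengthVia is a made-up name). Inside the branch, otherwise no longer refers to the Prelude's Bool; it is a fresh variable holding the scrutinee.

```haskell
-- `otherwise` here is an ordinary pattern variable bound to xs;
-- it shadows Prelude's `otherwise :: Bool` within the branch.
lengthVia :: [a] -> Int
lengthVia xs = case xs of
  otherwise -> length otherwise
```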
Just since nobody has said it yet, otherwise is supposed to be used as a guard expression, not a pattern. case ... of pat | ... -> ... | otherwise -> ... Now its definition as True is important. – Reid Barton
An example:
fact n acc
  | n == 0 = acc
  | otherwise = fact (n-1) $! (acc * n)
Since otherwise is True, that second guard will always succeed.
Note that using otherwise in a pattern (as opposed to a guard) is likely to confuse people. It will also trip a name shadowing warning if GHC is run with the appropriate warnings enabled.

What is the Maybe type and how does it work?

I am just starting to program in Haskell, and I came across the following definition:
calculate :: Float -> Float -> Maybe Float
Maybe a is an ordinary data type defined as:
data Maybe a = Just a | Nothing
There are thus two possibilities: a value of type Maybe a is either Just x for some x of type a (like Just 3), or Nothing in case the query has no answer.
It is meant as a way to give a result type to functions that are not defined for every input.
For instance: say you want to define sqrt. The square root is only defined for non-negative real numbers, so you can define sqrt as:
sqrt x | x >= 0 = Just $ ...
       | otherwise = Nothing
with ... a way to calculate the square root for x.
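Filling in the ... with the Prelude's own sqrt gives a runnable sketch. The name safeSqrt is chosen here to avoid clashing with the Prelude function; it is not from the answer.

```haskell
-- Total wrapper around Prelude's sqrt: negative inputs yield Nothing.
safeSqrt :: Double -> Maybe Double
safeSqrt x
  | x >= 0    = Just (sqrt x)
  | otherwise = Nothing
```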
Some people compare Nothing with the "null pointer" you find in most programming languages. By default, you don't implement a null pointer for data types you define (and if you do, all these "nulls" look different), by adding Nothing you have a generic null pointer.
It can thus be useful to use Maybe to denote that it is possible no output can be calculated. You could of course also error on values less than 0:
sqrt x | x >= 0 = Just $ ...
       | otherwise = error "The value must be greater than or equal to 0"
But errors usually are not mentioned in the type signature, nor does a compiler have any problem if you don't take them into account. Haskell is also shifting to total functions: it's better to always try at least to return a value (e.g. Nothing) for all possible inputs.
If you later want to use the result of a Maybe a, you for instance need to write:
succMaybe :: Maybe Int -> Maybe Int
succMaybe (Just x) = Just (x+1)
succMaybe _ = Nothing
But by writing Just for the first case, you somehow warn yourself that it is possible that Nothing can occur. You can also get rid of the Maybe by introducing a "default" value:
justOrDefault :: a -> Maybe a -> a
justOrDefault _ (Just x) = x
justOrDefault d _ = d
The built-in maybe function (note the lowercase) combines the two previous functions:
maybe :: b -> (a -> b) -> Maybe a -> b
maybe _ f (Just x) = f x
maybe z _ Nothing = z
So you specify a default value of type b together with a function of type a -> b. If the Maybe a is Just x, the function is applied to x and the result is returned; if it is Nothing, the default value is used.
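As a small illustration, both earlier helpers can be expressed through maybe. The primed names are introduced here only to distinguish them from the definitions above.

```haskell
-- succMaybe rewritten with maybe: Nothing stays Nothing, Just x becomes Just (x+1).
succMaybe' :: Maybe Int -> Maybe Int
succMaybe' = maybe Nothing (Just . (+ 1))

-- justOrDefault rewritten with maybe: the function applied to a Just is id.
justOrDefault' :: a -> Maybe a -> a
justOrDefault' d = maybe d id
```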
Working with Maybe values can be tedious, because you always need to take the Nothing case into account. To simplify this you can use the Maybe monad.
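A minimal sketch of what that looks like, using two made-up functions safeDiv and divTwice: in do-notation, the first Nothing short-circuits the rest of the block, so the Nothing case never has to be written out by hand.

```haskell
-- Hypothetical helper: integer division that refuses to divide by zero.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- Computes (a / b) / c; any failing step makes the whole result Nothing.
divTwice :: Int -> Int -> Int -> Maybe Int
divTwice a b c = do
  q <- safeDiv a b
  safeDiv q c
```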
Tom Schrijvers also shows that Maybe is the successor function in type algebra: you add one extra value to your type (Either is addition and (,) is the type-algebraic equivalent of multiplication).

Would you ever write seq x x?

I'm not entirely clear on how seq works in Haskell.
It seems like there are lots of cases where it would be useful to write
seq x x
and maybe even define a function:
strict x = seq x x
but such a function doesn't already exist so I'm guessing this approach is somehow wrongheaded. Could someone tell me if this is meaningful or useful?
seq a b returns the value of b, but makes that value depend on the evaluation of a. Thus, seq a a is exactly the same thing as a.
I think the misunderstanding here is that seq doesn't take any action, because pure functions don't take actions, it just introduces a dependency.
There is a function evaluate :: a -> IO a in Control.Exception that does what you want (note that it's in IO). It lives in Control.Exception because it's useful for checking whether the evaluation of an expression would throw, and if so, handling the exception.
The expression x = seq a b means that if x is evaluated, then a will also be evaluated (but x will be equal to b).
It does not mean "evaluate a now".
Notice that if x is being evaluated, then since x equals b, then b will also be evaluated.
And hence, if I write x = seq a a, I am saying "if x is evaluated then evaluate a". But if I just do x = a, that would achieve exactly the same thing.
When you say seq a b what you are telling the computer is,
Whenever you need to evaluate b, evaluate a for me too, please.
If we replace both a and b with x you can see why it's useless to write seq x x:
Whenever you need to evaluate x, evaluate x for me too, please.
Asking the computer to evaluate x when it needs to evaluate x is just a useless thing to do – it was going to evaluate x anyway!
seq does not evaluate anything – it simply tells the computer that when you need the second argument, also evaluate the first argument. Understanding this is actually really important, because it allows you to understand the behaviour of your programs much better.
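seq becomes useful when its two arguments differ. A common pattern, sketched here with a made-up sumStrict, is to tie the recursive call to the evaluation of the accumulator so that a chain of unevaluated additions does not build up.

```haskell
-- Hypothetical example of a meaningful seq: the two arguments are different.
sumStrict :: [Integer] -> Integer
sumStrict = go 0
  where
    go acc []     = acc
    go acc (x:xs) =
      let acc' = acc + x
      in acc' `seq` go acc' xs  -- the result of this step depends on acc' being evaluated
```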
seq x x would be entirely, trivially redundant.
Remember, seq is not a command. The presence of a seq a b in your program does not force evaluation of a or b. What it does do is make the evaluation of the result artificially dependent on the evaluation of a, even though the result itself is b. If you print out seq a b, a will be evaluated and its result discarded. Since x already depends on itself, seq x x is silly.
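The dependency can be observed with undefined as the first argument. In this sketch, ignoreFirst, forceFirst and throwsWhenForced are made-up helpers; evaluate and try are the real functions from Control.Exception.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Never looks at its first argument.
ignoreFirst :: a -> b -> b
ignoreFirst _ b = b

-- Makes the result depend on the first argument via seq.
forceFirst :: a -> b -> b
forceFirst a b = a `seq` b

-- Hypothetical helper: True if evaluating the value to weak head
-- normal form throws an exception.
throwsWhenForced :: a -> IO Bool
throwsWhenForced v = either failed (const False) <$> try (evaluate v)
  where
    failed :: SomeException -> Bool
    failed _ = True
```

Under these definitions, forcing ignoreFirst undefined 1 succeeds, while forcing forceFirst undefined 1 throws. Merely mentioning either expression without forcing it evaluates nothing.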
Close! deepseq (which is the "more thorough" seq -- see the docs for a full description) has the type NFData a => a -> b -> b, and force (with type NFData a => a -> a) is defined simply as
force :: (NFData a) => a -> a
force x = x `deepseq` x

implementing indexOf in Haskell

I'm going through the Learn You a Haskell tutorial and attempted to modify the elem' function from the section on Recursion.
The original elem' function is:
elem' :: (Eq a) => a -> [a] -> Bool
elem' a [] = False
elem' a (x:xs)
  | a == x = True
  | otherwise = a `elem'` xs
My indexOf function is:
indexOf :: (Eq a, Integral s) => a -> [a] -> s -> s
indexOf _ [] _ = -1
indexOf a (x:xs) s
  | a == x = s
  | otherwise = indexOf a xs s+1
The function should return either the index of the element in the list, or -1 if the element is not found.
At the end of my .hs file I test the function with:
main = putStrLn(show(indexOf 7 [1,2,3] 0))
The function works correctly for finding values that appear in the list. However, for the test written above, instead of returning -1, it prints 2. The return value seems to always be the list length minus one.
After encountering the edge condition (empty list), I would expect the -1 return value to propagate all the way back upwards through the call stack. Where's my mistake?
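You've got a precedence problem. Function application binds more tightly than (+), so the otherwise case is being parsed as (indexOf a xs s) + 1. Every recursive call therefore adds 1 to whatever comes back, including the -1 from the empty-list case.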
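With parentheses around the new index, the increment is applied to the argument rather than to the result. The rest of the function is unchanged from the question.

```haskell
indexOf :: (Eq a, Integral s) => a -> [a] -> s -> s
indexOf _ [] _ = -1
indexOf a (x:xs) s
  | a == x    = s
  | otherwise = indexOf a xs (s + 1)  -- parenthesised: pass s + 1 down
```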
There is one simple change you could make to the example above which would be almost guaranteed to expose your bug. Change the return type to Maybe s, and return Nothing upon failure. While this alone won't fix your bug, it gives the compiler enough information to point you down the correct path, so that you will be able to fix it yourself.
Returning -1 on failure is a very "C" thing to do, and it is error prone. In your case, the compiler cannot tell -1 apart from an actual list index, so it happily lets you add to it as you go back up through the recursive calls. (I can see that this was not your intention from the signature, but this is how the compiler interpreted what you have done.)
(One added advantage: once you master the art of using correct types, you will soon realize how hard it is to work with functions of the type a -> Maybe b, or more generally a -> m b, where m is some wrapper type. This will bring you to the doorstep of understanding what monads are and why we use them.)
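A sketch of the Maybe-based version follows (indexOfMaybe and its local helper go are illustrative names). With this return type, the original s+1 mistake would amount to adding 1 to a Maybe value, which the type checker rejects because Maybe has no Num instance in the standard libraries. Data.List already offers elemIndex with the same shape of result.

```haskell
-- Returns Just the zero-based index of the first match, or Nothing.
indexOfMaybe :: Eq a => a -> [a] -> Maybe Int
indexOfMaybe target = go 0
  where
    go _ [] = Nothing
    go i (x:xs)
      | target == x = Just i
      | otherwise   = go (i + 1) xs
```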
