Haskell irrefutable pattern matching - haskell

In order to get a head start into the path of Haskell, I chose the book by one of its creators Hudak. So, I am going through the gentle introduction to Haskell.
I got stuck at trying to understand the following statement:
Technically speaking, formal parameters are also patterns but they never fail to match a value.
From my little but relatively greater habituation with languages like C (or let's broadly say as non-functional languages), I could form that formal parameters are the arguments in the definition of a function. So, suppose there were a function like the following in C:
int func_add(int a, int d)
then passing a value of some other type like string will be a failure in pattern matching if I am correct. So calling func_add as func_add("trs", 5) is a case of pattern mismatch.
With a heavy possibility of an incorrect understanding or interpretation this similar situation can occur very well in Haskell as well when a piece of code calls a function by passing arguments of different types.
So, why is it said that in Haskell formal parameters are irrefutable pattern matching?

What you describe is not a pattern, it is a type. Haskell has types as well, and these are resolved at compile time. Each type can have several patterns. For example a list is defined as:
data Color = Red | Green | Blue | Other String
Now we can define a function foo:
foo :: Color -> String
foo Red = "Red"
foo Green = "Green"
foo Blue = "Blue"
foo (Other s) = s
The elements in boldface are all patterns. But these are not irrefutable: the first one will check if we have given the function a Red, the second whether we have given it Green, the third if the value is Blue, and finally we have a pattern (Other s) that will match with all Other patterns (regardless what the value of s is), since s is a variable, and a variable is an irrefutable pattern: we do not perform any checks on the value of the string.
Mind that these checks are done at runtime: if we would however call foo "Red", we will get a type error at compile time since the Haskell compiler knows that foo has type Color -> String.
If we would have written:
foo :: Color -> String
foo c = "Some color"
foo Red = "Red"
c is an irrefutable pattern: it will match any Color object, so the second line foo Red will never match, so c is an irrefutable pattern.

No, passing a value of some other type is not a failure in the pattern matching. It's a type error, and the code won't even compile. Formal parameters are irrefutable patterns for a well-typed program, which is the only kind of program that the compiler allows you to run.

In Haskell, you can define types in various ways. One of these is to introduce a sum type, like this:
data FooBar = Foo Int | Bar Bool
You could attempt to write a function like this, using pattern matching:
myFunction (Foo x) = x
That would, however, be a partially matched function, and if you try to call it with myFunction (Bar False), you'd get an exception.
You can, on the other hand, also define single-case sum types, like this:
data MyInt = MyInt Int
Here, you can write a function like this:
myFunction' (MyInt x) = x
Here, you're still using pattern matching, but since there's only a single case, this is a complete match. If the calling code compiles, the match can't fail.
The above MyInt is really only a wrapper around Int, so you could say that if you write a function that takes an Int, like this, it's the same sort of pattern matching:
myFunction'' :: Int -> Int
myFunction'' x = x + 42
While Int doesn't have a value constructor that you can use to pattern-match on, x is a pattern that always matches an Int value. Therefore you can say that a function argument is a match that always succeeds.

Related

What does a stand for in a data type declaration?

Normally when using type declarations we do:
function_name :: Type -> Type
However in an exercise I am trying to solve there is the following structure:
function_name :: Type a -> Type a
or explicitly as in the exercise
alphabet :: DFA a -> Alphabet a
alphabet = undefined
What does a stand for?
Short answer: it's a type variable.
At the computation level, the way we define functions is to use variables to refer to their arguments. Like this:
f x = x + 3
Here x is a variable, and its value will be chosen when the function is called. Haskell has a similar (but not identical...) mechanism in its type sublanguage. For example, you can write things like:
type F x = (x, Int, x)
type Endo a = a -> a -> a
Here again x is a variable in the first one (and a in the second), and its value will be chosen at use sites. One can also use this mechanism when defining new types. (The previous two examples just give new names to existing types, but the following does more.) One of the most basic nontrivial examples of this is the Maybe family of types:
data Maybe a = Nothing | Just a
The things on the right of the = are computation-level, so you can mostly ignore them for now, but on the left we are declaring a new family of types Maybe which accepts other types as an argument. For example, Maybe Int, Maybe (Bool, String), Maybe (Endo Char), and even passing in expressions that have variables like Maybe (x, Int, x) are all possible.
Syntactically, type constructors (things which are defined as part of the program text and that we expect the compiler to look up the definition for) start with an upper case letter and type variables (things which will be instantiated later and so don't currently have a concrete definition) start with lower case letters.
So, in the type signature you showed:
alphabet :: DFA a -> Alphabet a
I suspect there are actually two constructs new to you, not just one: first, the type variable a that you asked about, and second, the concept of type application, where we apply at the type level one "function-like" type to another. (Outside of this answer, people say "parameterized" instead of "function-like".)
...and, believe it or not, there is even a type system for types that makes sure you don't write things like these:
Int a -- Int is not parameterized, so shouldn't be applied to arguments
Int Char -- ditto
Maybe -> String -- Maybe is parameterized, so should be applied to
-- arguments, but isn't

How does return statement work in Haskell? [duplicate]

This question already has answers here:
What's so special about 'return' keyword
(3 answers)
Closed 5 years ago.
Consider these functions
f1 :: Maybe Int
f1 = return 1
f2 :: [Int]
f2 = return 1
Both have the same statement return 1. But the results are different. f1 gives value Just 1 and f2 gives value [1]
Looks like Haskell invokes two different versions of return based on return type. I like to know more about this kind of function invocation. Is there a name for this feature in programming languages?
This is a long meandering answer!
As you've probably seen from the comments and Thomas's excellent (but very technical) answer You've asked a very hard question. Well done!
Rather than try to explain the technical answer I've tried to give you a broad overview of what Haskell does behind the scenes without diving into technical detail. Hopefully it will help you to get a big picture view of what's going on.
return is an example of type inference.
Most modern languages have some notion of polymorphism. For example var x = 1 + 1 will set x equal to 2. In a statically typed language 2 will usually be an int. If you say var y = 1.0 + 1.0 then y will be a float. The operator + (which is just a function with a special syntax)
Most imperative languages, especially object oriented languages, can only do type inference one way. Every variable has a fixed type. When you call a function it looks at the types of the argument and chooses a version of that function that fits the types (or complains if it can't find one).
When you assign the result of a function to a variable the variable already has a type and if it doesn't agree with the type of the return value you get an error.
So in an imperative language the "flow" of type deduction follows time in your program Deduce the type of a variable, do something with it and deduce the type of the result. In a dynamically typed language (such as Python or javascript) the type of a variable is not assigned until the value of the variable is computed (which is why there don't seem to be types). In a statically typed language the types are worked out ahead of time (by the compiler) but the logic is the same. The compiler works out what the types of variables are going to be, but it does so by following the logic of the program in the same way as the program runs.
In Haskell the type inference also follows the logic of the program. Being Haskell it does so in a very mathematically pure way (called System F). The language of types (that is the rules by which types are deduced) are similar to Haskell itself.
Now remember Haskell is a lazy language. It doesn't work out the value of anything until it needs it. That's why it makes sense in Haskell to have infinite data structures. It never occurs to Haskell that a data structure is infinite because it doesn't bother to work it out until it needs to.
Now all that lazy magic happens at the type level too. In the same way that Haskell doesn't work out what the value of an expression is until it really needs to, Haskell doesn't work out what the type of an expression is until it really needs to.
Consider this function
func (x : y : rest) = (x,y) : func rest
func _ = []
If you ask Haskell for the type of this function it has a look at the definition, sees [] and : and deduces that it's working with lists. But it never needs to look at the types of x and y, it just knows that they have to be the same because they end up in the same list. So it deduces the type of the function as [a] -> [a] where a is a type that it hasn't bothered to work out yet.
So far no magic. But it's useful to notice the difference between this idea and how it would be done in an OO language. Haskell doesn't convert the arguments to Object, do it's thing and then convert back. Haskell just hasn't been asked explicitly what the type of the list is. So it doesn't care.
Now try typing the following into ghci
maxBound - length ""
maxBound : "Hello"
Now what just happened !? minBound bust be a Char because I put it on the front of a string and it must be an integer because I added it to 0 and got a number. Plus the two values are clearly very different.
So what is the type of minBound? Let's ask ghci!
:type minBound
minBound :: Bounded a => a
AAargh! what does that mean? Basically it means that it hasn't bothered to work out exactly what a is, but is has to be Bounded if you type :info Bounded you get three useful lines
class Bounded a where
minBound :: a
maxBound :: a
and a lot of less useful lines
So if a is Bounded there are values minBound and maxBound of type a.
In fact under the hood Bounded is just a value, it's "type" is a record with fields minBound and maxBound. Because it's a value Haskell doesn't look at it until it really needs to.
So I appear to have meandered somewhere in the region of the answer to your question. Before we move onto return (which you may have noticed from the comments is a wonderfully complex beast.) let's look at read.
ghci again
read "42" + 7
read "'H'" : "ello"
length (read "[1,2,3]")
and hopefully you won't be too surprised to find that there are definitions
read :: Read a => String -> a
class Read where
read :: String -> a
so Read a is just a record containing a single value which is a function String -> a. Its very tempting to assume that there is one read function which looks at a string, works out what type is contained in the string and returns that type. But it does the opposite. It completely ignores the string until it's needed. When the value is needed, Haskell first works out what type it's expecting, once it's done that it goes and gets the appropriate version of the read function and combines it with the string.
now consider something slightly more complex
readList :: Read a => [String] -> a
readList strs = map read strs
under the hood readList actually takes two arguments
readList' (Read a) -> [String] -> [a]
readList' {read = f} strs = map f strs
Again as Haskell is lazy it only bothers looking at the arguments when it's needs to find out the return value, at that point it knows what a is, so the compiler can go and fine the right version of Read. Until then it doesn't care.
Hopefully that's given you a bit of an idea of what's happening and why Haskell can "overload" on the return type. But it's important to remember it's not overloading in the conventional sense. Every function has only one definition. It's just that one of the arguments is a bag of functions. read_str doesn't ever know what types it's dealing with. It just knows it gets a function String -> a and some Strings, to do the application it just passes the arguments to map. map in turn doesn't even know it gets strings. When you get deeper into Haskell it becomes very important that functions don't know very much about the types they're dealing with.
Now let's look at return.
Remember how I said that the type system in Haskell was very similar to Haskell itself. Remember that in Haskell functions are just ordinary values.
Does this mean I can have a type that takes a type as an argument and returns another type? Of course it does!
You've seen some type functions Maybe takes a type a and returns another type which can either be Just a or Nothing. [] takes a type a and returns a list of as. Type functions in Haskell are usually containers. For example I could define a type function BinaryTree which stores a load of a's in a tree like structure. There are of course lots of much stranger ones.
So, if these type functions are similar to ordinary types I can have a typeclass that contains type functions. One such typeclass is Monad
class Monad m where
return a -> m a
(>>=) m a (a -> m b) -> m b
so here m is some type function. If I want to define Monad for m I need to define return and the scary looking operator below it (which is called bind)
As others have pointed out the return is a really misleading name for a fairly boring function. The team that designed Haskell have since realised their mistake and they're genuinely sorry about it. return is just an ordinary function that takes an argument and returns a Monad with that type in it. (You never asked what a Monad actually is so I'm not going to tell you)
Let's define Monad for m = Maybe!
First I need to define return. What should return x be? Remember I'm only allowed to define the function once, so I can't look at x because I don't know what type it is. I could always return Nothing, but that seems a waste of a perfectly good function. Let's define return x = Just x because that's literally the only other thing I can do.
What about the scary bind thing? what can we say about x >>= f? well x is a Maybe a of some unknown type a and f is a function that takes an a and returns a Maybe b. Somehow I need to combine these to get a Maybe b`
So I need to define Nothing >== f. I can't call f because it needs an argument of type a and I don't have a value of type a I don't even know what 'a' is. I've only got one choice which is to define
Nothing >== f = Nothing
What about Just x >>= f? Well I know x is of type a and f takes a as an argument, so I can set y = f a and deduce that y is of type b. Now I need to make a Maybe b and I've got a b so ...
Just x >>= f = Just (f x)
So I've got a Monad! what if m is List? well I can follow a similar sort of logic and define
return x = [x]
[] >>= f = []
(x : xs) >>= a = f x ++ (xs >>= f)
Hooray another Monad! It's a nice exercise to go through the steps and convince yourself that there's no other sensible way of defining this.
So what happens when I call return 1?
Nothing!
Haskell's Lazy remember. The thunk return 1 (technical term) just sits there until someone needs the value. As soon as Haskell needs the value it know what type the value should be. In particular it can deduce that m is List. Now that it knows that Haskell can find the instance of Monad for List. As soon as it does that it has access to the correct version of return.
So finally Haskell is ready To call return, which in this case returns [1]!
The return function is from the Monad class:
class Applicative m => Monad (m :: * -> *) where
...
return :: a -> m a
So return takes any value of type a and results in a value of type m a. The monad, m, as you've observed is polymorphic using the Haskell type class Monad for ad hoc polymorphism.
At this point you probably realize return is not an good, intuitive, name. It's not even a built in function or a statement like in many other languages. In fact a better-named and identically-operating function exists - pure. In almost all cases return = pure.
That is, the function return is the same as the function pure (from the Applicative class) - I often think to myself "this monadic value is purely the underlying a" and I try to use pure instead of return if there isn't already a convention in the codebase.
You can use return (or pure) for any type that is a class of Monad. This includes the Maybe monad to get a value of type Maybe a:
instance Monad Maybe where
...
return = pure -- which is from Applicative
...
instance Applicative Maybe where
pure = Just
Or for the list monad to get a value of [a]:
instance Applicative [] where
{-# INLINE pure #-}
pure x = [x]
Or, as a more complex example, Aeson's parse monad to get a value of type Parser a:
instance Applicative Parser where
pure a = Parser $ \_path _kf ks -> ks a

How to know the data type of a value of a sum type [duplicate]

What is pattern matching in Haskell and how is it related to guarded equations?
I've tried looking for a simple explanation, but I haven't found one.
EDIT:
Someone tagged as homework. I don't go to school anymore, I'm just learning Haskell and I'm trying to understand this concept. Pure out of interest.
In a nutshell, patterns are like defining piecewise functions in math. You can specify different function bodies for different arguments using patterns. When you call a function, the appropriate body is chosen by comparing the actual arguments with the various argument patterns. Read A Gentle Introduction to Haskell for more information.
Compare:
with the equivalent Haskell:
fib 0 = 1
fib 1 = 1
fib n | n >= 2
= fib (n-1) + fib (n-2)
Note the "n ≥ 2" in the piecewise function becomes a guard in the Haskell version, but the other two conditions are simply patterns. Patterns are conditions that test values and structure, such as x:xs, (x, y, z), or Just x. In a piecewise definition, conditions based on = or ∈ relations (basically, the conditions that say something "is" something else) become patterns. Guards allow for more general conditions. We could rewrite fib to use guards:
fib n | n == 0 = 1
| n == 1 = 1
| n >= 2 = fib (n-1) + fib (n-2)
There are other good answers, so I'm going to give you a very technical answer. Pattern matching is the elimination construct for algebraic data types:
"Elimination construct" means "how to consume or use a value"
"Algebraic data type", in addition to first-class functions, is the big idea in a statically typed functional language like Clean, F#, Haskell, or ML
The idea of algebraic data types is that you define a type of thing, and you say all the ways you can make that thing. As an example, let's define "Sequence of String" as an algebraic data type, with three ways to make it:
data StringSeq = Empty -- the empty sequence
| Cat StringSeq StringSeq -- two sequences in succession
| Single String -- a sequence holding a single element
Now, there are all sorts of things wrong with this definition, but as an example it's interesting because it provides constant-time concatenation of sequences of arbitrary length. (There are other ways to achieve this.) The declaration introduces Empty, Cat, and Single, which are all the ways there are of making sequences. (That makes each one an introduction construct—a way to make things.)
You can make an empty sequence without any other values.
To make a sequence with Cat, you need two other sequences.
To make a sequence with Single, you need an element (in this case a string)
Here comes the punch line: the elimination construct, pattern matching, gives you a way to scrutinize a sequence and ask it the question what constructor were you made with?. Because you have to be prepared for any answer, you provide at least one alternative for each constructor. Here's a length function:
slen :: StringSeq -> Int
slen s = case s of Empty -> 0
Cat s s' -> slen s + slen s'
Single _ -> 1
At the core of the language, all pattern matching is built on this case construct. However, because algebraic data types and pattern matching are so important to the idioms of the language, there's special "syntactic sugar" for doing pattern matching in the declaration form of a function definition:
slen Empty = 0
slen (Cat s s') = slen s + slen s'
slen (Single _) = 1
With this syntactic sugar, computation by pattern matching looks a lot like definition by equations. (The Haskell committee did this on purpose.) And as you can see in the other answers, it is possible to specialize either an equation or an alternative in a case expression by slapping a guard on it. I can't think of a plausible guard for the sequence example, and there are plenty of examples in the other answers, so I'll leave it there.
Pattern matching is, at least in Haskell, deeply tied to the concept of algebraic data types. When you declare a data type like this:
data SomeData = Foo Int Int
| Bar String
| Baz
...it defines Foo, Bar, and Baz as constructors--not to be confused with "constructors" in OOP--that construct a SomeData value out of other values.
Pattern matching is nothing more than doing this in reverse--a pattern would "deconstruct" a SomeData value into its constituent pieces (in fact, I believe that pattern matching is the only way to extract values in Haskell).
When there are multiple constructors for a type, you write multiple versions of a function for each pattern, with the correct one being selected depending on which constructor was used (assuming you've written patterns to match all possible constructions--which it's generally good practice to do).
In a functional language, pattern matching involves checking an argument against different forms. A simple example involves recursively defined operations on lists. I will use OCaml to explain pattern matching since it's my functional language of choice, but the concepts are the same in F# and Haskell, AFAIK.
Here is the definition of a function to compute the length of a list lst. In OCaml, an ``a listis defined recursively as the empty list[], or the structureh::t, wherehis an element of typea(abeing any type we want, such as an integer or even another list),tis a list (hence the recursive definition), and::` is the cons operator, which creates a new list out of an element and a list.
So the function would look like this:
let rec len lst =
match lst with
[] -> 0
| h :: t -> 1 + len t
rec is a modifier that tells OCaml that a function will call itself recursively. Don't worry about that part. The match statement is what we're focusing on. OCaml will check lst against the two patterns - empty list, or h :: t - and return a different value based on that. Since we know every list will match one of these patterns, we can rest assured that our function will return safely.
Note that even though these two patterns will take care of all lists, you aren't limited to them. A pattern like h1 :: h2 :: t (matching all lists of length 2 or more) is also valid.
Of course, the use of patterns isn't restricted to recursively defined data structures, or recursive functions. Here is a (contrived) function to tell you whether a number is 1 or 2:
let is_one_or_two num =
match num with
1 -> true
| 2 -> true
| _ -> false
In this case, the forms of our pattern are the numbers themselves. _ is a special catch-all used as a default case, in case none of the above patterns match.
Pattern matching is one of those painful operations that is hard to get one's head around if you come from procedural programming background. I find it hard to get into because the same syntax used to create a data structure can be used for matching.
In F# you can use the cons operator :: to add an element to the beginning of a list like so:
let a = 1 :: [2;3]
//val a : int list = [1; 2; 3]
Similarly you can use the same operator to split the list up like so:
let a = [1;2;3];;
match a with
| [a;b] -> printfn "List contains 2 elements" //will match a list with 2 elements
| a::tail -> printfn "Head is %d" a //will match a list with 2 or more elements
| [] -> printfn "List is empty" //will match an empty list

How to define multiple patterns in Frege?

I'm having some trouble defining a function in Frege that uses multiple patterns. Basically, I'm defining a mapping by iterating through a list of tuples. I've simplified it down to the following:
foo :: a -> [(a, b)] -> b
foo _ [] = [] --nothing found
foo bar (baz, zab):foobar
| bar == baz = zab
| otherwise = foo bar foobar
I get the following error:
E morse.fr:3: redefinition of `foo` introduced line 2
I've seen other examples like this that do use multiple patterns in a function definition, so I don't know what I'm doing wrong. Why am I getting an error here? I'm new to Frege (and new to Haskell), so there may be something simple I'm missing, but I really don't think this should be a problem.
I'm compiling with version 3.24-7.100.
This is a pure syntactical problem that affects newcomers to languages of the Haskell family. It won't take too long until you internalize the rule that function application has higher precedence than infix expression.
This has consequences:
Complex arguments of function application need parentheses.
In infix expressions, function applications on either side of the operator do not need parentheses (however, individual components of function application may still need them).
In Frege, in addition, the following rule holds:
The syntax of function application and infix expressions on the left hand side of a definition is identical to the one on the right hand side as far as lexemes allowed on both sides are concerned. (This holds in Haskell only when # and ~ are not used.)
This is so you can define an addition function like this:
data Number = Z | Succ Number
a + Z = a
a + Succ b = Succ a + b
Hence, when you apply this to your example, you see that syntactically, you're going to redefine the : operator. To achieve what you want, you need to write it thus:
foo bar ((baz, zab):foobar) = ....
-- ^ ^
This corresponds to the situation where you apply foo to a list you are constructing:
foo 42 (x:xs)
When you write
foo 42 x:xs
this means
(foo 42 x):xs

SML conversions to Haskell

A few basic questions, for converting SML code to Haskell.
1) I am used to having local embedded expressions in SML code, for example test expressions, prints, etc. which functions local tests and output when the code is loaded (evaluated).
In Haskell it seems that the only way to get results (evaluation) is to add code in a module, and then go to main in another module and add something to invoke and print results.
Is this right? in GHCi I can type expressions and see the results, but can this be automated?
Having to go to the top level main for each test evaluation seems inconvenient to me - maybe just need to shift my paradigm for laziness.
2) in SML I can do pattern matching and unification on a returned result, e.g.
val myTag(x) = somefunct(a,b,c);
and get the value of x after a match.
Can I do something similar in Haskell easily, without writing separate extraction functions?
3) How do I do a constructor with a tuple argument, i.e. uncurried.
in SML:
datatype Thing = Info of Int * Int;
but in Haskell, I tried;
data Thing = Info ( Int Int)
which fails. ("Int is applied to too many arguments in the type:A few Int Int")
The curried version works fine,
data Thing = Info Int Int
but I wanted un-curried.
Thanks.
This question is a bit unclear -- you're asking how to evaluate functions in Haskell?
If it is about inserting debug and tracing into pure code, this is typically only needed for debugging. To do this in Haskell, you can use Debug.Trace.trace, in the base package.
If you're concerned about calling functions, Haskell programs evaluate from main downwards, in dependency order. In GHCi you can, however, import modules and call any top-level function you wish.
You can return the original argument to a function, if you wish, by making it part of the function's result, e.g. with a tuple:
f x = (x, y)
where y = g a b c
Or do you mean to return either one value or another? Then using a tagged union (sum-type), such as Either:
f x = if x > 0 then Left x
else Right (g a b c)
How do I do a constructor with a tuple argument, i.e. uncurried in SML
Using the (,) constructor. E.g.
data T = T (Int, Int)
though more Haskell-like would be:
data T = T Int Bool
and those should probably be strict fields in practice:
data T = T !Int !Bool
Debug.Trace allows you to print debug messages inline. However, since these functions use unsafePerformIO, they might behave in unexpected ways compared to a call-by-value language like SML.
I think the # syntax is what you're looking for here:
data MyTag = MyTag Int Bool String
someFunct :: MyTag -> (MyTag, Int, Bool, String)
someFunct x#(MyTag a b c) = (x, a, b, c) -- x is bound to the entire argument
In Haskell, tuple types are separated by commas, e.g., (t1, t2), so what you want is:
data Thing = Info (Int, Int)
Reading the other answers, I think I can provide a few more example and one recommendation.
data ThreeConstructors = MyTag Int | YourTag (String,Double) | HerTag [Bool]
someFunct :: Char -> Char -> Char -> ThreeConstructors
MyTag x = someFunct 'a' 'b' 'c'
This is like the "let MyTag x = someFunct a b c" examples, but it is a the top level of the module.
As you have noticed, Haskell's top level can defined commands but there is no way to automatically run any code merely because your module has been imported by another module. This is entirely different from Scheme or SML. In Scheme the file is interpreted as being executed form-by-form, but Haskell's top level is only declarations. Thus Libraries cannot do normal things like run initialization code when loaded, they have to provide a "pleaseRunMe :: IO ()" kind of command to do any initialization.
As you point out this means running all the tests requires some boilerplate code to list them all. You can look under hackage's Testing group for libraries to help, such as test-framework-th.
For #2, yes, Haskell's pattern matching does the same thing. Both let and where do pattern matching. You can do
let MyTag x = someFunct a b c
in ...
or
...
where MyTag x = someFunct a b c

Resources