I've come to heavily rely on haskell's type system to avoid bugs and was looking if it could also be used to ensure a (weak?) form of code correctness.
The form of code correctness I have in mind goes as follows:
A function f :: a -> b is correct if f is guaranteed to produce an output of type b for any input of type a. This clearly fails for the well-known function head (head :: [a] -> a).
One way I am aware of that the type system fails to guarantee this form of correctness is when code uses error (error :: String -> a). The error function pretty much overrides the entire type system, so I would like to avoid its use unless explicitly intended.
My question is two-fold:
What other functions (or haskell snippets) besides error are exceptions to Haskell's type system?
More importantly, is there a method to forbid the use of such functions within a Haskell module?
Many thanks!
You can write your own functions that never produce a value. Consider the following function:
never :: a -> b
never a = never a
It fails to terminate, so it is considered to be the value _|_ (pronounced "bottom"). Every type in Haskell implicitly contains this special value in addition to its ordinary values.
You can also write a function that is only partially defined, like
undefinedForFalse :: Bool -> Bool
undefinedForFalse True = True
undefinedForFalse False is undefined, a special value that is semantically equivalent to _|_, except that the runtime system can stop execution immediately, because it knows the computation will never finish.
error is also a special function whose result is always semantically equivalent to _|_; it tells the runtime system that it can stop execution, knowing the computation will never finish, and print an informative error message.
Haskell is incapable of proving that a value can never be _|_, because in Haskell every type contains _|_. Values that can never be _|_ are called "total". Functions which always produce a value other than _|_ in a finite period of time are called "total functions". Languages that can prove this are called "total languages" or "total functional programming languages". If they can prove this for all types in the language, they are necessarily Turing incomplete.
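A common way to recover totality is to widen the result type so that the missing cases become honest values. A minimal sketch for the head example from the question (safeHead is a conventional name, not a Prelude function):

safeHead :: [a] -> Maybe a
safeHead []      = Nothing   -- the case head omits, now an ordinary value
safeHead (x : _) = Just x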
In a Turing-complete language, it isn't possible to prove that an arbitrary function will produce a value. You can for some specific functions, of course. This was covered well by Cirdec, but I would like to point out a couple of things with regard to using the type system to guarantee (some level of) correctness.
The relevant topic is parametricity: the fact that given a polymorphic type signature, there is often only a limited set of possible implementations (if we exclude error, undefined and other things like that).
One example of this would be (a -> b) -> a -> b. There is in fact only one possible total implementation of this function.
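Up to bottom, that unique implementation is ordinary function application, i.e. the Prelude's ($):

apply :: (a -> b) -> a -> b
apply f x = f x   -- parametricity leaves no other choice: all we can do with x is feed it to f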
Another example is that given an f from some type A to some type B and a function r :: [a] -> [a], it can be shown that r . map f = map f . r. This is called the "free theorem" for map. Many other polymorphic types also have free theorems.
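As a concrete illustration, here is the theorem instantiated with r as reverse and f as (+1); the name checkFreeTheorem is just for this example:

-- free theorem for r :: [a] -> [a]:  r . map f == map f . r
checkFreeTheorem :: Bool
checkFreeTheorem =
  (reverse . map (+1)) [1, 2, 3] == (map (+1) . reverse) [1, 2, 3 :: Int]
-- evaluates to True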
A (sometimes more useful) result is that if you can prove that a function mapper :: (a -> b) -> F a -> F b obeys the law mapper id = id, it can be shown that mapper f . mapper g = mapper (f . g). Also, this would be the fmap implementation for the type F since those two laws are the Functor laws. This particular "fusion" law can be useful for optimization purposes.
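For illustration, a hypothetical two-element container whose mapper satisfies mapper id = id, and which is therefore its lawful fmap:

data Pair a = Pair a a
  deriving Show

mapper :: (a -> b) -> Pair a -> Pair b
mapper f (Pair x y) = Pair (f x) (f y)

-- mapper id (Pair x y) = Pair x y, so mapper id = id;
-- the second law, mapper f . mapper g = mapper (f . g), then comes for free.
instance Functor Pair where
  fmap = mapper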
Source: Theorems for free! http://ttic.uchicago.edu/~dreyer/course/papers/wadler.pdf
Related
Is there a common name/type for a lens-like object that does not satisfy the property of getting out what you put in? For example something like a listLength :: Lens [a] Int where if you put in a length shorter than that of the source list you get a shortened list, but if you put in a longer length the original length is preserved.
A lens is not just a function with the type forall f. Functor f => (a -> f b) -> s -> f t — it's a function with that type that obeys certain laws. In particular, (as documented in the lens docs):
You get back what you put in,
Putting back what you got doesn't change anything, and
Setting twice is the same as setting once.
If your function doesn't obey those laws, then it's just a function with a type that's similar to a lens.
In your particular example, listLength breaks the first and third laws, so it's not a lens. That said, it would work just fine as a Getter, which I think is the only principled thing we can say about it.
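A sketch of that Getter, reusing the name listLength from the question and lens's to combinator:

import Control.Lens (Getter, to, (^.))

listLength :: Getter [a] Int
listLength = to length

-- usage: "abcde" ^. listLength  ==  5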
More generally, it doesn't really make sense to ask about things that lack laws, as things tend to be defined by what laws they obey rather than what they don't obey. For example, I pointed out that listLength makes a perfectly good Getter because it consistently extracts a value out of the [a] input.
So, I'll ask you: What distinguishes it from a function listZero :: Lens [a] Int that always emits 0? Can you come up with a general law that listLength obeys that listZero doesn't? If so, then you have something to actually look for in the current literature: that is, listLength is one of a set of functions that obeys some (possibly interesting) law. If not, then you just have a function with a type that makes it look like a lens.
I'm trying to understand State newtype and I'm struggling with this explanation of the isomorphism in a book:
Newtypes must have the same underlying representation as the type they wrap, as the newtype wrapper disappears at compile time. So the function contained in the newtype must be isomorphic to the type it wraps. That is, there must be a way to go from the newtype to the thing it wraps and back again without losing information.
What does it mean applied to State newtype?
newtype State s a = State { runState :: s -> (a, s) }
That explanation "there must be a way to go from the newtype to the thing it wraps and back again" isn't clear.
Also, can you please say where there is an isomorphism in these examples, where there is not, and why.
type Iso a b = (a -> b, b -> a)
newtype Sum a = Sum { getSum :: a }
sumIsIsomorphicWithItsContents :: Iso a (Sum a)
sumIsIsomorphicWithItsContents = (Sum, getSum)
(a -> Maybe b, b -> Maybe a)
([a] -> a, a -> [a])
The statement you quote makes no mention of State specifically. It is purely a statement about newtypes. It is a little misleading in referring to "the function contained in the newtype" because there is no requirement for the type wrapped by a newtype to be a function type - although this is the case for State and many other commonly used types defined by newtype.
The key thing for a newtype in general is exactly as it says: it has to simply wrap another type in a way that makes it trivial to go from the wrapped type to the wrapping one, and vice versa, with no loss of information - this is what it means for two types to be isomorphic, and also what makes it completely safe for the two types to have identical runtime representations.
It's easy to demonstrate typical data declarations that could not possibly fulfil this. For example take any type with 2 constructors, such as Either:
data Either a b = Left a | Right b
It's obvious that this is not isomorphic to either of its constituent types. For example, the Left constructor embeds a inside Either a b, but you can't get any of the Right values this way.
And even with a single constructor, if it takes more than one argument - such as the tuple constructor (,) - then again, you can embed either of the constituent types (given an arbitrary value of the other type) but you can't possibly get every value.
This is why the newtype keyword is only allowed for types with a single constructor which takes a single argument. This always provides an isomorphism, because given newtype Foo a = Foo a, the Foo constructor and the function \(Foo a) -> a are trivially inverses of each other. And this works the same for more complicated examples where the type constructor takes more type arguments, and/or where the wrapped type is more complex.
Such is exactly the case with State:
newtype State s a = State {runState :: s -> (a, s)}
The functions State and runState respectively wrap and unwrap the underlying type (which in this case is a function), and clearly are inverse to each other - therefore they provide an isomorphism.
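In the notation of the Iso type from the question, the witness has exactly the same shape as sumIsIsomorphicWithItsContents:

stateIsIsomorphicWithItsContents :: Iso (s -> (a, s)) (State s a)
stateIsIsomorphicWithItsContents = (State, runState)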
Note finally that there is nothing special here about the use of record syntax in the definition - although it's very common in such cases in order to have an already-named "unwrapping" function. Other than this small convenience there is no difference from a newtype defined without record syntax.
To step back a little: newtype declarations are very similar to data declarations with a single constructor and a single argument - the difference is mainly in performance, as the keyword tells the compiler that the two types are equivalent, so that there is no runtime overhead of conversion between the two types, which there otherwise would be. (There is also a difference with regard to laziness, but I won't go into that here, except to mention it for completeness.) As for why one would do this rather than just use the underlying type: it provides extra type safety (the compiler sees two different types even though they're the same at runtime), and it also allows typeclass instances to be specified without attaching them to the underlying type. Sum and Product are great examples here, as they provide Monoid instances for numeric types, based on addition and multiplication respectively, without giving either the undeserved distinction of being "the" Monoid instance for the underlying type.
And something similar is at work with State - when we use this type we signal explicitly that we're using it to represent state manipulation, which wouldn't be the case if we were just working with ordinary functions that happen to return a pair.
Why do we need Control.Lens.Reified? Is there some reason I can't place a Lens directly into a container? What does reify mean anyway?
We need reified lenses because Haskell's type system is predicative. I don't know the technical details of exactly what that means, but it prohibits types like
[Lens s t a b]
For some purposes, it's acceptable to use
Functor f => [(a -> f b) -> s -> f t]
instead, but when you reach into that, you don't get a Lens; you get a LensLike specialized to some functor or another. The ReifiedBlah newtypes let you hang on to the full polymorphism.
Operationally, [ReifiedLens s t a b] is a list of functions each of which takes a Functor f dictionary, while forall f . Functor f => [LensLike f s t a b] is a function that takes a Functor f dictionary and returns a list.
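A small sketch of the distinction, using the answer's simplified newtype (the real Control.Lens.Reified module spells the constructor and field differently):

{-# LANGUAGE RankNTypes #-}
import Control.Lens (Lens', _1, _2)

newtype ReifiedLens' s a = ReifiedLens { unReify :: Lens' s a }

-- Fine: each element carries its own fully polymorphic lens.
lenses :: [ReifiedLens' (Int, Int) Int]
lenses = [ReifiedLens _1, ReifiedLens _2]

-- By contrast, [Lens' (Int, Int) Int] is rejected under the rules described
-- in the next answer, because it puts a forall under the list constructor.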
As for what "reify" means, well, the dictionary will say something, and that seems to translate into a rather stunning variety of specific meanings in Haskell. So no comment on that.
The problem is that, in Haskell, type abstraction and application are completely implicit; the compiler is supposed to insert them where needed. Various attempts at designing 'impredicative' extensions, where the compiler would make clever guesses where to put them, have failed; so the safest thing ends up being relying on the Haskell 98 rules:
Type abstractions occur only at the top level of a function definition.
Type applications occur immediately whenever a variable with a polymorphic type is used in an expression.
So if I define a simple lens:[1]
lensHead f [] = pure []
lensHead f (x:xn) = (:xn) <$> f x
and use it in an expression:
[lensHead]
lensHead gets automatically applied to some set of type parameters; at which point it's no longer a lens, because it's not polymorphic in the functor anymore. The take-away is: an expression always has some monomorphic type; so it's not a lens. (You'll note that the lens functions take arguments of type Getter and Setter, which are monomorphic types, for similar reasons to this. But a [Getter s a] isn't a list of lenses, because they've been specialized to only getters.)
What does reify mean? The dictionary definition is 'make real'. 'Reifying' is used in philosophy to refer to the act of regarding or treating something as real (rather than ideal or abstract). In programming, it tends to refer to taking something that normally can't be treated as a data structure and representing it as one. For example, in really old Lisps, there didn't use to be first-class functions; instead, you had to use S-Expressions to pass 'functions' around, and eval them when you needed to call the function. The S-Expressions represented the functions in a way you could manipulate in the program, which is referred to as reification.
In Haskell, we don't typically need such elaborate reification strategies as Lisp S-Expressions, partly because the language is designed to avoid needing them; but since
newtype ReifiedLens s t a b = ReifiedLens (Lens s t a b)
has the same effect of taking a polymorphic value and turning it into a true first-class value, it's referred to as reification.
Why does this work, if expressions always have monomorphic types? Well, because the Rank2Types extension adds a third rule:
Type abstractions occur at the top-level of the arguments to certain functions, with so-called rank 2 types.
ReifiedLens is such a rank-2 function; so when you say
ReifiedLens l
you get a type lambda around the argument to ReifiedLens, and then l is applied immediately to the lambda-bound type argument. So l is effectively just eta-expanded. (Compilers are free to eta-reduce this and just use l directly.)
Then, when you say
f (ReifiedLens l) = ...
on the right-hand side, l is a variable with a polymorphic type, so every use of l is immediately and implicitly applied to whatever type arguments are needed for the expression to type-check. So everything works the way you expect.
The other way to think about is that, if you say
newtype ReifiedLens s t a b = ReifiedLens { unReify :: Lens s t a b }
the two functions ReifiedLens and unReify act like explicit type abstraction and application operators; this allows the compiler to identify where you want the abstractions and applications to take place well enough that the issues with impredicative type systems don't come up.
[1] In lens terminology, this is apparently called something other than a 'lens'; my entire knowledge of lenses comes from SPJ's presentation on them so I have no way to verify that. The point remains, since the polymorphism is still necessary to make it work as both a getter and a setter.
Does it help the compiler to optimise, or is it just surplus work to add additional type signatures? For example, one often sees:
foo :: a -> b
foo x = bar x
where bar x = undefined
Rather than:
foo :: a -> b
foo x = bar x
where bar :: a -> b
bar x = undefined
If I omit the top level type signature, GHC gives me a warning, so if I don't get warnings I am quite confident my program is correct. But no warnings are issued if I omit the signature in a where clause.
There exists a class of local functions whose types cannot be written in Haskell (without using fancy GHC extensions, that is). For example:
f :: a -> (a, Int)
f h = g 1
where g n = (h, n)
This is because while the a in the f type signature is polymorphic viewed from outside f, this is not so from within f. In g, it is just some unknown type, but not any type, and (standard) Haskell cannot express "the same type as the first argument of the function this one is defined in" in its type language.
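For completeness, one of those fancy GHC extensions, ScopedTypeVariables, does make the type expressible; a sketch:

{-# LANGUAGE ScopedTypeVariables #-}

f :: forall a. a -> (a, Int)
f h = g 1
  where
    -- thanks to the explicit forall, this 'a' is f's type variable
    g :: Int -> (a, Int)
    g n = (h, n)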
Often definitions in where clauses are to avoid repeating yourself if a sub-expression occurs more than once in a definition. In such a case, the programmer thinks of the local definition as a simple stand-in for writing out the inline sub-expressions. You usually wouldn't explicitly type the inline sub-expressions, so you don't type the where definition either. If you're doing it to save on typing, then the type declaration would kill all your savings.
It seems quite common to introduce where to learners of Haskell with examples of that form, so they go on thinking that "normal style" is to not give type declarations for local definitions. At least, that was my experience learning Haskell. I've since found that many of my functions that are complicated enough to need a where block become rather inscrutable if I don't know the type of the local definitions, so I try to err towards always typing them now; even if I think the type is obvious while I'm writing the code, it may not be so obvious when I'm reading it after not having looked at it for a while. A little effort for my fingers is almost always outweighed by even one or two instances of having to run type inference in my head!
Ingo's answer gives a good reason for deliberately not giving a type to a local definition, but I suspect the main reason is that many programmers have assimilated the rule of thumb that type declarations be provided for top level definitions but not for local definitions from the way they learned Haskell.
Often where declarations are used for short, local things, which have simple types, or types that are easily inferred. As a result there's no benefit to the human or compiler to add the type.
If the type is complex, or cannot be inferred, then you might want to add the type.
While giving monomorphic type signatures can make top level functions faster, it isn't so much of a win for local definitions in where clauses, since GHC will inline and optimize away the definitions in most cases anyway.
Adding a type signature can make your code faster. Take for example the following program (Fibonacci):
result = fib 25
-- fib :: Int -> Int
fib x = if x<2 then 1 else (fib (x-1)) + (fib (x-2))
Without the annotation in the 2nd line, it takes 0.010 sec. to run.
With the Int -> Int annotation, it takes 0.002 sec.
This happens because if you don't say anything about fib, it is going to be typed as fib :: (Num a, Num a1, Ord a) => a -> a1, which means that during runtime, extra data structures ("dictionaries") will have to be passed between functions to represent the Num/Ord typeclasses.
What it says in the title. If I write a type signature, is it possible to algorithmically generate an expression which has that type signature?
It seems plausible that it might be possible to do this. We already know that if the type is a special-case of a library function's type signature, Hoogle can find that function algorithmically. On the other hand, many simple problems relating to general expressions are actually unsolvable (e.g., it is impossible to know if two functions do the same thing), so it's hardly implausible that this is one of them.
It's probably bad form to ask several questions all at once, but I'd like to know:
Can it be done?
If so, how?
If not, are there any restricted situations where it becomes possible?
It's quite possible for two distinct expressions to have the same type signature. Can you compute all of them? Or even some of them?
Does anybody have working code which does this stuff for real?
Djinn does this for a restricted subset of Haskell types, corresponding to a first-order logic. It can't manage recursive types or types that require recursion to implement, though; so, for instance, it can't write a term of type (a -> a) -> a (the type of fix), which corresponds to the proposition "if a implies a, then a". That proposition is clearly not a theorem: if you admitted it, you could use it to prove anything. Indeed, this is why fix gives rise to ⊥.
If you do allow fix, then writing a program to give a term of any type is trivial; the program would simply print fix id for every type.
Djinn is mostly a toy, but it can do some fun things, like deriving the correct Monad instances for Reader and Cont given the types of return and (>>=). You can try it out by installing the djinn package, or using lambdabot, which integrates it as the #djinn command.
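For a feel of what such a derivation must produce, here is a hand-written sketch of return and (>>=) for Reader; up to the monad laws, the types force these definitions:

newtype Reader r a = Reader { runReader :: r -> a }

returnReader :: a -> Reader r a
returnReader a = Reader (\_ -> a)

bindReader :: Reader r a -> (a -> Reader r b) -> Reader r b
bindReader m k = Reader (\r -> runReader (k (runReader m r)) r)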
Oleg at okmij.org has an implementation of this. There is a short introduction here but the literate Haskell source contains the details and the description of the process. (I'm not sure how this corresponds to Djinn in power, but it is another example.)
There are cases where there is no unique function:
fst', snd' :: (a, a) -> a
fst' (a,_) = a
snd' (_,b) = b
Not only this; there are cases where there are an infinite number of functions:
list0, list1, list2 :: [a] -> a
list0 l = l !! 0
list1 l = l !! 1
list2 l = l !! 2
-- etc.
-- Or
mkList0, mkList1, mkList2 :: a -> [a]
mkList0 _ = []
mkList1 a = [a]
mkList2 a = [a,a]
-- etc.
(If you only want total functions, then consider [a] as restricted to infinite lists for list0, list1 etc, i.e. data List a = Cons a (List a))
In fact, if you have recursive types, any types involving these correspond to an infinite number of functions. However, at least in the case above, there is a countable number of functions, so it is possible to create an (infinite) list containing all of them. But, I think the type [a] -> [a] corresponds to an uncountably infinite number of functions (again restrict [a] to infinite lists) so you can't even enumerate them all!
(Summary: there are types that correspond to a finite, countably infinite and uncountably infinite number of functions.)
This is impossible in general (especially for languages like Haskell, which does not even have the strong normalization property), and only possible in some (very) special cases (and for more restricted languages), such as when the codomain type has only one constructor (for example, a function f :: forall a. a -> () can be determined uniquely). In order to reduce the set of possible definitions for a given signature to a singleton set with just one definition, one needs to impose more restrictions in the form of additional properties (although even then it is difficult to imagine how this could help without a concrete example of use).
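For the record, that unique definition (unique up to strictness/bottom):

f :: a -> ()
f _ = ()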
From the (n-)categorical point of view, types correspond to objects, terms correspond to arrows (constructors also correspond to arrows), and function definitions correspond to 2-arrows. The question is analogous to asking whether one can construct a 2-category with the required properties by specifying only a set of objects. That is impossible, since you need either an explicit construction for the arrows and 2-arrows (i.e., actually writing the terms and definitions), or a deductive system which allows one to deduce the necessary structure from a certain set of properties (which still need to be stated explicitly).
There is also an interesting question: given an ADT (i.e., a subcategory of Hask), is it possible to automatically derive instances for Typeable, Data (yes, using SYB), Traversable, Foldable, Functor, Pointed, Applicative, Monad, etc.? In this case, we have the necessary signatures as well as additional properties (for example, the monad laws; these properties cannot be expressed in Haskell itself, but they can be expressed in a language with dependent types). There are some interesting constructions along these lines:
http://ulissesaraujo.wordpress.com/2007/12/19/catamorphisms-in-haskell
which shows what can be done for the list ADT.
The question is actually rather deep and I'm not sure of the answer, if you're asking about the full glory of Haskell types including type families, GADT's, etc.
What you're asking is whether a program can automatically prove that an arbitrary type is inhabited (contains a value) by exhibiting such a value. A principle called the Curry-Howard Correspondence says that types can be interpreted as mathematical propositions, and a type is inhabited if the proposition is constructively provable. So you're asking if there is a program that can prove a certain class of propositions to be theorems. In a language like Agda, the type system is powerful enough to express arbitrary mathematical propositions, and proving arbitrary ones is undecidable by Gödel's incompleteness theorem. On the other hand, if you drop down to (say) pure Hindley-Milner, you get a much weaker and (I think) decidable system. With Haskell 98, I'm not sure, because type classes are supposedly able to encode things equivalent to GADTs.
With GADT's, I don't know if it's decidable or not, though maybe some more knowledgeable folks here would know right away. For example it might be possible to encode the halting problem for a given Turing machine as a GADT, so there is a value of that type iff the machine halts. In that case, inhabitability is clearly undecidable. But, maybe such an encoding isn't quite possible, even with type families. I'm not currently fluent enough in this subject for it to be obvious to me either way, though as I said, maybe someone else here knows the answer.
(Update:) Oh a much simpler interpretation of your question occurs to me: you may be asking if every Haskell type is inhabited. The answer is obviously not. Consider the polymorphic type
a -> b
There is no function with that signature (not counting something like unsafeCoerce, which makes the type system inconsistent).