There are ways (see, for example, https://jeltsch.wordpress.com/2012/04/30/dependently-typed-programming-and-theorem-proving-in-haskell/, or the DataKinds extension) to fake dependent types in Haskell, and this makes it possible to prove some code properties in Haskell itself.
Examples of properties I may want proved include totality of some function, or the monad laws for some Monad typeclass instance.
Such a proof would be a value (a :: T), where T represents the statement we've proved. Unfortunately, in Haskell we can write terms of any type, even a false one, e.g. (fix id :: forall a. a), so we can actually "prove" everything. Is there a tool to check proof correctness, or satisfaction of the monad laws, in Haskell?
Does anybody include proofs in source code?
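To make the unsoundness concrete, here is a minimal sketch of such a proof value, using propositional equality (the same type ships in base as Data.Type.Equality); the last definition is a "proof" of a false statement:

{-# LANGUAGE GADTs, TypeOperators #-}

-- A value of type a :~: b is a proof that the types a and b are equal.
data a :~: b where
  Refl :: a :~: a

-- A legitimate proof:
intIsInt :: Int :~: Int
intIsInt = Refl

-- A bogus "proof" that typechecks but never produces a value,
-- which is why Haskell is unsound when read as a logic:
bogus :: Int :~: Bool
bogus = bogus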
To clarify my question, let me rephrase it in a more or less equivalent way:
Why is there a concept of superclass/class inheritance in Haskell?
What are the historical reasons that led to that design choice?
Why would it be so bad, for example, to have a base library with no class hierarchy, just typeclasses independent from each other?
Here I'll share some loose thoughts that made me want to ask this question. My current intuitions might be inaccurate, as they are based on my current understanding of Haskell, which is not perfect, but here they are...
It is not obvious to me why type class inheritance exists in Haskell. I find it a bit weird, as it creates an asymmetry between concepts.
Often in mathematics, concepts can be defined from different viewpoints, and I don't necessarily want to favor one order in which they ought to be defined. OK, there is some order in which one has to prove things, but once the theorems and structures are there, I'd rather see them as independently available tools.
Moreover, one perhaps-not-so-good thing I see with class inheritance is this: I think a class instance will silently pick the corresponding superclass instance, which was probably implemented to be the most natural one for that type. Let's consider a Monad viewed as a subclass of Functor. Maybe there could be more than one way to define a Functor on some type that also happens to be a Monad. But saying that a Monad is a Functor implicitly makes the choice of one particular Functor for that Monad. Someday, you might forget that you actually wanted some other Functor.
Perhaps this example is not the best fit, but I have the feeling this sort of situation might generalize, and possibly be dangerous if your class is a child of many. Current Haskell inheritance sounds like it makes default choices about parents implicitly.
If instead you had a design without hierarchy, I feel you would always have to be explicit about all the required properties, which would perhaps mean a bit less risk, more clarity, and more symmetry. So far, what I'm seeing is that the cost of such a design would be more constraints to write in instance definitions, and newtype wrappers for each meaningful conversion from one set of concepts to another. I am not sure, but perhaps that could have been acceptable. Unfortunately, I think Haskell's deriving mechanism for newtypes doesn't work very well; I would appreciate it if the language were somehow smarter about newtype wrapping/unwrapping and required less verbosity.
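To make the imagined cost concrete, here is a hypothetical sketch of my own (in today's Haskell the extra constraints are merely redundant, but in a hierarchy-free base library all of them would be required):

-- Without superclasses, Monad would not imply Functor or Applicative,
-- so a signature would have to spell out every class it uses:
sequenceish :: (Functor m, Applicative m, Monad m) => [m a] -> m [a]
sequenceish []       = pure []
sequenceish (m : ms) = m >>= \x -> fmap (x :) (sequenceish ms)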
I'm not sure, but now that I think about it, perhaps an alternative to newtype wrappers could be specific imports of modules containing specific variations of instances.
Another alternative I thought about while writing this is that maybe one could weaken the meaning of class (P x) => C x: instead of requiring that an instance of C selects an instance of P, we could take it to loosely mean that the class C also contains P's methods, but that no instance of P is automatically selected and no other relationship with P exists. So we could keep some sort of weaker hierarchy that might be more flexible.
Thanks for any clarifications on this topic, and/or corrections of my possible misunderstandings.
Maybe you're tired of hearing from me, but here goes...
I think superclasses were introduced as a relatively minor and unimportant feature of type classes. In Wadler and Blott, 1988, they are briefly discussed in Section 6 where the example class Eq a => Num a is given. There, the only rationale offered is that it's annoying to have to write (Eq a, Num a) => ... in a function type when it should be "obvious" that data types that can be added, multiplied, and negated ought to be testable for equality as well. The superclass relationship allows "a convenient abbreviation".
(The unimportance of this feature is underscored by the fact that this example is so terrible. Modern Haskell doesn't have class Eq a => Num a, because the logical justification for all Nums also being Eqs is so weak. The example class Eq a => Ord a would have been a lot more convincing.)
So, the base library implemented without any superclasses would look more or less the same. There would just be more logically superfluous constraints on function type signatures in both library and user code, and instead of fielding this question, I'd be fielding a beginner question about why:
leq :: (Ord a) => a -> a -> Bool
leq x y = x < y || x == y
doesn't type check.
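In that superclass-free world, the fix would be to spell out both constraints:

leq :: (Eq a, Ord a) => a -> a -> Bool
leq x y = x < y || x == y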
To your point about superclasses forcing a particular hierarchy, you're missing your target.
This kind of "forcing" is actually a fundamental feature of type classes. Type classes are "opinionated by design", and in a given Haskell program (where "program" includes all the libraries, including base, used by the program), there can be only one instance of a particular type class for a particular type. This property is referred to as coherence. (Even though there is a language extension IncoherentInstances, it is considered very dangerous and should only be used when all possible instances of a particular type class for a particular type are functionally equivalent.)
This design decision comes with certain costs, but it also comes with a number of benefits. Edward Kmett talks about this in detail in this video, starting at around 14:25. In particular, he compares Haskell's coherent-by-design type classes with Scala's incoherent-by-design implicits and contrasts the increased power that comes with the Scala approach with the reusability (and refactoring benefits) of "dumb data types" that comes with the Haskell approach.
So, there's enough room in the design space for both coherent type classes and incoherent implicits, and Haskell's approach isn't necessarily the right one.
BUT, since Haskell has chosen coherent type classes, there's no "cost" to having a specific hierarchy:
class Functor a => Monad a
because, for a particular type, like [] or MyNewMonadDataType, there can only be one Monad and one Functor instance anyway. The superclass relationship introduces a requirement that any type with a Monad instance must have a Functor instance, but it doesn't restrict the choice of Functor instance, because you never had a choice in the first place. Or rather, your choice was between having zero Functor [] instances and exactly one.
Note that this is separate from the question of whether or not there's only one reasonable Functor instance for a Monad type. In principle, we could define a law-violating data type with incompatible Functor and Monad instances. We'd still be restricted to using that one Functor MyType instance and that one Monad MyType instance throughout our program, whether or not Functor was a superclass of Monad.
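Here is a minimal sketch of such a law-violating type (Oops is a made-up name for illustration). Its fmap counts applications, so it disagrees with the mapping you'd derive from the Monad instance, yet coherence still leaves us exactly one instance of each class:

data Oops a = Oops Int a deriving Show

instance Functor Oops where
  fmap f (Oops n x) = Oops (n + 1) (f x)   -- unlawful: breaks fmap id = id

instance Applicative Oops where
  pure = Oops 0
  Oops m f <*> Oops n x = Oops (m + n) (f x)

instance Monad Oops where
  Oops n x >>= f = let Oops m y = f x in Oops (n + m) y

-- fmap (+ 1) (Oops 0 1)          ==> Oops 1 2
-- Oops 0 1 >>= return . (+ 1)    ==> Oops 0 2, incompatible with fmap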
I've been toying around with Haskell for the past year or so, and I'm actually starting to 'get' it, up to and including Monads, Lenses, Type Families, ... the lot.
I'm about to leave this comfort zone a little as I move to an OCaml project as my day job. Going through the syntax a little, I was looking for similar higher-level concepts, such as functors.
I've read OCaml code and the structure of a functor, but I cannot seem to work out whether they are similar concepts in Haskell and OCaml or not.
In a nutshell, a functor in Haskell is, for me, mainly a way to lift functions, and I use it (and like it) that way.
In OCaml, it gives me the feeling that it's closer to programming to an interface (for example, when making a set or a list with that compare function), and I wouldn't really know how to, say, lift functions over the functor.
Can somebody explain to me whether the two concepts are similar, and if so, what I am missing or not seeing? I googled around a bit, and there doesn't seem to be a clear answer to be found.
Kasper
From a practical standpoint, you can think of "functors" in OCaml and Haskell as unrelated. As you said, in Haskell a functor is any type that allows you to map a function over it. In OCaml, a functor is a module parametrized by another module.
In Functional Programming, what is a functor? has a good description of what functors in the two languages are and how they differ.
However, as the name implies, there is actually a connection between the two seemingly disparate concepts! Both languages' functors are just realizations of a concept from category theory.
Category theory is the study of categories, which are just arbitrary collections of objects with "morphisms" between them. The idea of a category is very abstract, so "objects" and "morphisms" can really be anything with a few restrictions—there has to be an identity morphism for every object and morphisms have to compose.
The most obvious example of a category is the category of sets and functions: the sets are the objects and the functions between sets the morphisms. Clearly, every set has an identity function, and functions can be composed. A very similar category can be formed by looking at a functional programming language like Haskell or OCaml: concrete types (i.e. types with kind *) are the objects, and Haskell/OCaml functions are the morphisms between them.
In category theory, a functor is a transformation between categories. It is like a function between categories. When we're looking at the category of Haskell types, a functor is essentially a type-level function: it maps types to something else. The particular kind of functor we care about maps types to other types. A perfect example of this is Maybe: Maybe maps Int to Maybe Int, String to Maybe String and so on. It provides a mapping for every possible Haskell type.
Functors have one additional requirement—they have to map the category's morphisms as well as the objects. In particular, if we have a morphism A → B and our functor maps A to A' and B to B', it has to map the morphism A → B to some morphism A' → B'. As a concrete example, let's say we have the types Int and String. There are a whole bunch of Haskell functions Int → String. For Maybe to be a proper functor, it has to have a function Maybe Int → Maybe String for each of these.
Happily, this is exactly what the fmap function does—it maps functions. For Maybe, it has the type (a → b) → Maybe a → Maybe b; we can add some parentheses to get: (a → b) → (Maybe a → Maybe b). What this type signature tells us is that for any normal function we have, we also have a corresponding function over Maybes.
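For instance (a tiny example of my own):

-- fmap lifts the ordinary function show :: Int -> String over Maybe:
lifted :: Maybe Int -> Maybe String
lifted = fmap show

-- lifted (Just 3) ==> Just "3"
-- lifted Nothing  ==> Nothing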
So a functor is a mapping between types that also preserves the functions between them. The fmap function is essentially just a proof of this second restriction on functors. This makes it easy to see how the Haskell Functor class is just a particular version of the mathematical concept.
So what about OCaml? In OCaml, a functor is not a type—it's a module. In particular, it's a parametrized module: a module that takes another module as an argument. Already, we can see some parallels: in Haskell, a Functor is like a type-level function; in OCaml, a functor is like a module-level function. So really, it's the same mathematical idea; however, instead of being used on types—like in Haskell—it's used on modules.
There's far more detail about how OCaml functors relate to category theory functors on the CS site: What is the relation between functors in SML and Category theory?. The question talks about SML rather than OCaml per se, but my understanding is that the module system of OCaml is very closely related to that of SML.
In summary: functors in Haskell and in OCaml are two fundamentally different structures which both happen to be reifications of the same very abstract mathematical idea. I think it's pretty neat :).
All the typeclasses in Typeclassopedia have associated laws, such as associativity or commutativity for certain operators. The definition of a "law" seems to be a constraint that cannot be expressed in the type system. I certainly understand why you want to have, say, the monad laws, but is there a fundamental reason why a typeclass that can be expressed fully within the type system would be pointless?
You will notice that the laws are almost always algebraic laws. They could be expressed in the type system using some extensions, but the proofs would be cumbersome to write. So you have unchecked laws, and implementations might potentially break them. Why is this good?
The reason is that the design patterns used in Haskell are motivated (and in most cases mirrored) by mathematical structures, usually from abstract algebra. While most other languages have an intuitive notion of certain features like safety, performance and semantics, we Haskell programmers prefer to establish a formal notion. The advantage of doing this is: Once your types and functions obey the safety laws, they are safe in the sense of the underlying algebraic structure. They are provably safe.
Take functors as an example. A Haskell functor has the following two laws:
fmap f . fmap g = fmap (f . g)
fmap id = id
Firstly, this is very important: functions in Haskell are opaque. You cannot examine, compare, or otherwise inspect them. While this sounds like a bad thing, in Haskell it is actually a very good thing. The fmap function cannot examine the function you've passed it. In particular, it can't check whether you've passed the identity function or a composition. In short: it can't cheat! The only way for it to obey these two laws is actually not to introduce any effects of its own. That means that in a proper functor, fmap will never do anything unexpected; in fact, it cannot do anything other than map the given function. This is a very simple example, and I haven't explained all the subtleties of why fmap can't cheat, but it demonstrates the point.
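To see what "cheating" would look like, here is a hypothetical instance (Trunc is my own made-up type) that introduces an effect of its own and thereby breaks the identity law:

newtype Trunc a = Trunc [a] deriving Show

instance Functor Trunc where
  fmap f (Trunc xs) = Trunc (take 1 (map f xs))   -- silently drops elements

-- fmap id (Trunc [1, 2, 3]) ==> Trunc [1], not Trunc [1, 2, 3],
-- so fmap id /= id and Trunc is not a lawful Functor.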
Now extend this all over the language, the base libraries and most sensible third party libraries. This gives you a language that is as predictable as a language can get. When you write code, you know what it's going to do. That's one of the main reasons why Haskell code often works out of the box. I often write pages of Haskell code before compiling. Once my type errors are fixed, my program usually works.
The other reason why this is desirable is that it allows a more compositional style of programming. This is particularly useful when working as a team. First you map your application to algebraic structures and establish the necessary laws. For example: You express what it means for something to be a Valid Web Server. In particular you establish a formal notion of web server composition. If you compose two Valid Web Servers, the result is a Valid Web Server. Do you see where this is going? After establishing these laws the teammates go to work, and they work in isolation. Little to no communication is necessary to get their job done. When they meet again, everybody presents their Valid Web Servers and they just compose them to make the final product, a web site. Since the individual components were all Valid Web Servers, the final result must be a Valid Web Server. Provably.
Yes and no. For instance the Show class does not have any laws associated with it, and it is certainly useful.
However, typeclasses express interfaces, and an interface needs to be more than just a bunch of functions - you want these functions to fulfill a specification. The specification is normally more complicated than what can be expressed in Haskell's type system. For example, take the Eq class. It only needs to provide us with a function whose type is a -> a -> Bool. That's the most that Haskell's type system will allow us to require from an instance of an Eq type. However, we would normally expect more from this function - you would probably want it to be an equivalence relation (reflexive, symmetric and transitive). So then you state these requirements as separate "laws".
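For instance, nothing in the type system rules out a sketch like this (Weird is a hypothetical type of my own):

newtype Weird = Weird Int

instance Eq Weird where
  Weird a == Weird b = a < b   -- typechecks, but this relation is neither
                               -- reflexive nor symmetric: the laws are broken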
A typeclass doesn't need to have laws, but it often will be more useful if it has them. Many typeclasses are expected to function in a certain way, the laws codify user expectations. The laws let users make assumptions about the way that an instance of a typeclass will work. If you break the typeclass laws, you don't get arrested by the Haskell police, you just end up with confused users.
Have all the instances of the Applicative typeclass that we get with the Haskell Platform been proved to satisfy all the Applicative laws?
If yes, where do we find those proofs?
The source code of Control.Applicative does not seem to contain any proof that the applicative laws for various instances do hold.
It just mentions that
-- | A functor with application.
--
-- Instances should satisfy the following laws:
It then just states the laws in the comments.
I found a similar case for the instances of other typeclasses (Alternative and Monad) too.
Are the users of these libraries supposed to verify these laws by themselves?
Or have rigorous proofs of these laws been given elsewhere by the developers?
Again, I am aware that a rigorous proof of the Applicative (or Monad) laws for the IO monad, which involves talking to the outside world, may in general be very complex.
Thanks.
Yes, the burden of proof is entirely on the library writers. There are some examples of implementations that violate these laws. The canonical example of a law violation is ListT, which doesn't obey the monad laws for most base monads (see examples). This gives very buggy behavior and nobody really uses ListT as a result.
I'm pretty sure most of these kinds of proofs haven't been etched in stone in a standard place. Most of the proofs have simply been repeated and checked to death by various curious members of the community so after a while we know which implementations do and do not satisfy their laws.
To give a concrete example, when I write my pipes library, I have to prove that my pipes satisfy the Category laws, but I just keep these proofs in a text file or an hpaste for future record if somebody requests them. Including them in the source is not really feasible because they can get very long, especially for associativity laws.
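For a flavor of what such pen-and-paper proofs look like, here is the left identity law of the Category instance for ordinary functions (a standard equational-reasoning example, not one of the pipes proofs themselves):

-- Goal: id . f = f
--   id . f
-- = \x -> id (f x)   -- definition of (.)
-- = \x -> f x        -- definition of id
-- = f                -- eta reduction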
However, I think a good practice might be to include, when possible, machine-checked proofs in the original repositories so that users can refer to them as necessary.
There is an excellent library, checkers, which provides QuickCheck properties to check laws.
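For example, a minimal sketch of how checkers is typically used to test the Functor and Monad laws for Maybe (the argument to each TestBatch is only a type proxy and is never evaluated):

import Test.QuickCheck.Checkers (quickBatch)
import Test.QuickCheck.Classes (functor, monad)

main :: IO ()
main = do
  quickBatch (functor (undefined :: Maybe (Int, Int, Int)))
  quickBatch (monad   (undefined :: Maybe (Int, Int, Int)))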
The experimental library ghc-proofs allows you to use the compiler to verify such laws for you:
app_law_2 a b (c :: Succs a) = pure (.) <*> a <*> b <*> c
                           === a <*> (b <*> c)
It only works in a few cases, such as the one described in my blog post, and is best seen as an experiment, not a ready tool.
In the question Seeking constructive criticism on monad implementation, abesto asked people to criticize his "Monad", which kept count of the number of bind operations. It turned out that this was not actually a monad, because it did not satisfy the first two monad laws, but I found the example interesting. Is there any typeclass that would be suitable for such kinds of structures?
That's an interesting question, and has to do with the mathematical lineage of monads.
We could certainly create a typeclass called something like Monadish, which would look exactly like the Monad typeclass:
class Monadish m where
  returnish :: a -> m a
  bindish   :: m a -> (a -> m b) -> m b
So the monad laws have nothing to do with the actual signature of the typeclass; they're extra information that an implementor has to enforce by themselves. So, in one sense, the answer is "of course"; just make another typeclass and say it doesn't have to satisfy any laws.
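As a hypothetical sketch, the bind-counting structure from the linked question fits Monadish even though it breaks the monad laws (Counted and its instance are my own reconstruction of the idea):

data Counted a = Counted Int a deriving Show

instance Monadish Counted where
  returnish = Counted 0
  bindish (Counted n x) f =
    let Counted m y = f x
    in  Counted (n + m + 1) y   -- counts every bindish

-- Left identity fails: bindish (returnish x) f adds one to the count,
-- so it is not equal to f x.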
But is such a typeclass interesting? For a mathematician, the answer would be no: the lack of any laws means that there is no interesting structure by which to reason with. When we define a mathematical structure, we usually define some objects (check), some operations (check) and then some properties of the operations (...nope). We need all three of these to prove theorems about this class of objects, and, to take one example, abstract algebra is all about taking the same operations and adding more or fewer laws.
For a software engineer, the answer is a little more complex. Reasoning is not required: you can always just use a typeclass to overload syntax for your own nefarious purposes. We can use a typeclass to group things together that "feel" the same, even though we don't have any formal reasons for believing so. There are some benefits to doing this, but I personally feel it throws out a lot of the benefits of having laws, and leads to architecture astronauts who invent abstract structures without much thought about their applicability. Maths is a safer bet: the monad laws correspond to left identity, right identity, and associativity - reasonably fundamental assumptions that even a non-mathematical person would be familiar with.