Why does the category of language types have morphisms, not functors? - programming-languages

I am probably not phrasing this question well, so please bear with me as I try to explain what I mean.
I am working on learning category theory, as applied to programming. So far, I understand that:
Objects in a category are "unbroken"; you're not supposed to peer inside them to see their internal structure.
Any Set is a category
Programming language types can be thought of as sets. (Bool is the set {True, False}; Int is the set of all integers; etc.)
Thus, all of the types in a language form a category of sets.
Morphisms are arrows between objects.
If those objects are themselves categories, then the morphisms get called functors.
Taken together, that would imply that a function from Int to Bool is a functor, because it's a map from the set category Int to the set category Bool.
However, I have also read elsewhere (in particular https://www.johndcook.com/blog/2014/05/10/haskell-category-theory/), that thinking of it that way is wrong and we really shouldn't be talking about language types being a base category with "just" morphisms. But I don't see how that fits with my previous logic.
I therefore must conclude that my previous logic is faulty, but I'm unclear how or why. What is the right way to conceptualize this? Are Sets just extra special exceptions? Or is it really just an arbitrary matter of preference for how to view the problem space? Or am I just flat out wrong somewhere?

Related

What are algebraic structures in functional programming?

I've been doing some light reading on functional programming concepts and ideas. So far, so good, I've read about three main concepts: algebraic structures, type classes, and algebraic data types. I have a fairly good understanding of what algebraic data types are. I think sum types and product types are fairly straightforward. For example, I can imagine creating an algebraic data type like a Card type which is a product type consisting of two enum types, Suit (with four values and symbols) and Rank (with 13 values and symbols).
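For concreteness, here is a rough sketch of that Card type (the constructor names are my own invention):

data Suit = Clubs | Diamonds | Hearts | Spades deriving Show
data Rank = Two | Three | Four | Five | Six | Seven | Eight
          | Nine | Ten | Jack | Queen | King | Ace deriving Show

-- A product type: one Suit *and* one Rank, so 4 * 13 = 52 possible values.
data Card = Card Suit Rank deriving Show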
However, I'm still hung up on trying to understand precisely what algebraic structures and type classes are. I just have a surface-level picture in my head but can't quite completely wrap my head around, for instance, the different types of algebraic structures like functors, monoids, monads, etc. How exactly are these different? How can they be used in a programming setting? How are type classes different from regular classes? Can anyone at least point me in the direction of a good book on abstract algebra and functional programming? Someone recommended I learn Haskell but do I really need to learn Haskell in order to understand functional programming?
"algebraic structure" is a concept that goes well beyond programming, it belongs to mathematics.
Imagine the unfathomably deep sea of all possible mathematical objects. Numbers of every stripe (the naturals, the reals, p-adic numbers...) are there, but also things like sequences of letters, graphs, trees, symmetries of geometrical figures, and all well-defined transformations and mappings between them. And much else.
We can try to "throw a net" into this sea and retain only some of those entities, by specifying conditions. Like "collections of things, for which there is an operation that combines two of those things into a third thing of the same type, and for which the operation is associative". We can give those conditions their own name, like, say, "semigroup". (Because we are talking about highly abstract stuff, choosing a descriptive name is difficult.)
That leaves out many inhabitants of the mathematical "sea", but the description still fits a lot of them! Many collections of things are semigroups. The natural numbers with the multiplication operation for example, but also non-empty lists of letters with concatenation, or the symmetries of a square with composition.
You can expand your description with extra conditions. Like "a semigroup, and there's also an element such that combining it with any other element gives the other element, unchanged". That restricts the number of mathematical entities that fit the description, because you are demanding more of them. Some valid semigroups will lack that "neutral element". But a lot of mathematical entities will still satisfy the expanded description. If you aren't careful, you can declare conditions so restrictive that no possible mathematical entity can actually fit them! At other times, you can be so precise that only one entity fits them.
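For instance, here is a minimal Haskell sketch of the "natural numbers with multiplication" example above, written against the standard Semigroup and Monoid classes (Mult is a made-up wrapper; base ships an equivalent one called Product):

newtype Mult = Mult Integer deriving Show

instance Semigroup Mult where
  Mult x <> Mult y = Mult (x * y)   -- combining is associative, so this is a semigroup

instance Monoid Mult where
  mempty = Mult 1                   -- 1 leaves any number unchanged, so it is also a monoid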
Working purely with these descriptions of mathematical entities, using only the general properties we require of them, we can obtain unexpected results about them, non-obvious at first sight, results that will apply to all entities which fit the description. Think of these discoveries as the mathematical equivalent of "code reuse". For example, if we know that some collection of things is a semigroup, then we can calculate exponentials using binary exponentiation instead of tediously combining a thing with itself n times. But that only works because of the associative property of the semigroup operation.
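A sketch of that kind of "code reuse", assuming the Mult wrapper above and a positive exponent (base offers the same idea as stimes):

-- Combine a value with itself n times in O(log n) steps; only the
-- associativity of (<>) is needed, so this works for any Semigroup.
power :: Semigroup a => a -> Integer -> a
power x n
  | n <= 1    = x
  | even n    = let y = power x (n `div` 2) in y <> y
  | otherwise = x <> power x (n - 1)

-- ghci> power (Mult 2) 10
-- Mult 1024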
You’ve asked quite a few questions here, but I can try to answer them as best I can:
… different types of algebraic structures like functors, monoids, monads, etc. How exactly are these different? How can they be used in a programming setting?
This is a very common question when learning Haskell. I won’t write yet another answer here — and a complete answer is fairly long anyway — but a simple Google search gives some very good answers: e.g. I can recommend 1 2 3
How are type classes different from regular classes?
(By ‘regular classes’ I assume you mean classes as found in OOP.)
This is another common question. Basically, the two have almost nothing in common except the name. A class in OOP is a combination of fields and methods. Classes are used by creating instances of that class; each instance can store data in its fields, and manipulate that data using its methods.
By contrast, a type class is simply a collection of functions (often also called methods, though there’s pretty much no connection). You can declare an instance of a type class for a data type (again, no connection) by redefining each method of the class for that type, after which you may use the methods with that type. For instance, the Eq class looks like this:
class Eq a where
  (==) :: a -> a -> Bool
  (/=) :: a -> a -> Bool
And you can define an instance of that class for, say, Bool, by implementing each function:
instance Eq Bool where
  True  == True  = True
  False == False = True
  _     == _     = False
  p /= q = not (p == q)
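With an instance like this in scope (GHC's Prelude already provides an equivalent one for Bool), the class methods simply work at that type:

> True == False
False
> True /= False
True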
Can anyone at least point me in the direction of a good book on abstract algebra and functional programming?
I must admit that I can’t help with this (and it’s off-topic for Stack Overflow anyway).
Someone recommended I learn Haskell but do I really need to learn Haskell in order to understand functional programming?
No, you don’t — you can learn functional programming from any functional language, including Lisp (particularly Scheme dialects), OCaml, F#, Elm, Scala etc. Haskell happens to be a particularly ‘pure’ functional programming language, and I would recommend it as well, but if you just want to learn and understand functional programming then any one of those will do.

What are abstract patterns?

I am learning Haskell and trying to understand the Monoid typeclass.
At the moment, I am reading the haskellbook and it says the following about the pattern (monoid):
One of the finer points of the Haskell community has been its propensity for recognizing abstract patterns in code which have well-defined, lawful representations in mathematics.
What does the author mean by abstract patterns?
Abstract in this sense is the opposite of concrete. This is probably one of the key things to understand about Haskell.
What is a concrete thing? Well, most values in Haskell are concrete. For example 'a' :: Char: the letter 'a' is a Char value, and it's a concrete value; Char is a concrete type. But in 1 :: Num a => a, the number 1 is actually a value of any type, so long as that type has the set of functions that the Num typeclass sets out as mandatory. This is an abstract value! We can have abstract values, abstract types, and therefore abstract functions. When the program is compiled, the Haskell compiler will pick a particular concrete type that satisfies all of our requirements.
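A small sketch of that difference (the names c, n, asInt and asDouble are made up for illustration):

c :: Char            -- concrete: exactly one type
c = 'a'

n :: Num a => a      -- abstract: any type with a Num instance
n = 1

asInt :: Int         -- here the compiler fixes the concrete type Int ...
asInt = n

asDouble :: Double   -- ... and here Double, from the very same definition.
asDouble = n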
Haskell, at its core, has a very simple, small but incredibly flexible language. It's very similar to an expression of maths, actually. This makes it very powerful. Why? because most things that would be built in language constructs in other languages are not directly built into Haskell, but defined in terms of this simple core.
One of the core pieces is the function, which, it turns out, most of computation is expressible in terms of. Because so much of Haskell is just defined in terms of this small simple core, it means we can extend it to almost anywhere we can imagine.
Typeclasses are probably the best example of this. Monoid and Num are examples of typeclasses: constructs that allow programmers to use an abstraction, such as a function, across a great many types while only having to define it once. Typeclasses let us use the same function names across a whole range of types, provided we can define those functions for those types. Why is that important or useful? Well, if we can recognise a pattern across, for example, all numbers, and we have a mechanism for talking about all numbers in the language itself, then we can write functions that work with all numbers at once. This is an abstract pattern. You'll notice some Haskellers are quite interested in a branch of mathematics called category theory, which is pretty much the mathematical study of abstract patterns. Contrast this with languages that cannot encode such things: there, the patterns the community notices are often far less rigorous, have to be written out by hand each time, and lose their mathematical character. The beauty of following the mathematics is the extremely large body of results we get for free by aligning our language more closely with it.
There is a good explanation of these basics, including typeclasses, in a book that I helped author: http://www.happylearnhaskelltutorial.com/1/output_other_things.html
Because Haskell puts hardly any limits on our ability to express things generally, we can write functions whose types express such things as "any type, so long as it's a Monoid". These are called type constraints, as above.
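For example, a function written against that constraint rather than against any concrete type (combineAll is a made-up name; base provides the same thing as mconcat):

-- Works for every Monoid: lists, strings, numeric wrappers, and so on.
combineAll :: Monoid m => [m] -> m
combineAll = foldr (<>) mempty

-- ghci> combineAll ["ab", "cd", "ef"]
-- "abcdef"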
Generally, abstractions are very useful because we can, for example, write one single function that operates on an entire range of types, which means we can often find functions that do exactly what we want on our own types simply by making them instances of specific typeclasses. The Ord typeclass is a great example of this: making a type we define ourselves an instance of Ord gives us a whole bunch of sorting and comparing functions for free, as the sketch below shows.
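A minimal sketch (Size is an invented example type):

import Data.List (sort)

-- Deriving Ord gives us (<), (>=), compare, max, min and friends, and with
-- them library functions such as sort, without writing any comparison code.
data Size = Small | Medium | Large deriving (Show, Eq, Ord)

-- ghci> sort [Large, Small, Medium]
-- [Small,Medium,Large]
-- ghci> Small < Large
-- True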
This is, in my opinion, one of the most exciting parts of Haskell: while most other languages also allow you to be very general, their expressiveness usually takes a steep dip once you do, and so they end up less powerful. (This is because they are less precise about what they are talking about; their types are less well defined.)
This is how we're able to reason about the "possible values" of a function, and it's not limited to Haskell. The more information we encode at the type level, the further toward the specificity end of the spectrum of expressivity we move. For example, take the classic case of const :: a -> b -> a. This function requires that a and b can be of absolutely any type at all, including the same type if we like. Because the second parameter can be a different type from the first, we can work out that the function has only one possible behaviour: it can't return, say, an Int unless we are given an Int as the first value, because Int is not "any type". Therefore the only value it can return is the first one. The functionality is defined right there in the type! If that's not mindblowing, then I don't know what is. :)
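The Prelude really does define const this way; the type leaves no other (total) choice:

const :: a -> b -> a
const x _ = x    -- the only thing a function of this type can do is return its first argument

-- ghci> const 3 "ignored"
-- 3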
As we move to dependent types (that is, a type system where types are first class, which also means ordinary values can be encoded in the type system), we can get closer and closer to having the type system specify exactly what the constraints on the possible functionality are. The kicker is that it doesn't have to say anything about the implementation unless we want it to: we stay in control of how abstract the specification is, while keeping expressivity and a great deal of precision. That's pretty fascinating, and amazingly powerful.
Much math can be expressed in the language that underpins Haskell, the lambda calculus.

Type classes with laws that contain not equalities/symmetries but inequalities/asymmetries

All of the type classes that I've come across have, I think, had laws that establish symmetries by specifying equations. I was wondering, though, whether there are any prominent theoretical or even practical examples of type classes that establish asymmetries, i.e. ones that demand the lack of symmetry, by e.g. specifying <expr1> /= <expr2>, or <, or not somePredicate(a, b).
I understand that an inequality can be expressed as an equality with a free variable, e.g. a > b as a = b + k for some k, etc., but I'm thinking the introduction of free variables itself might align with my idea of enforced asymmetry.
What would be the (theoretical) applications of such law? And are there any (runnable) examples of this?
Alternatively: if this can't be considered Haskell enough to be on SO, should this go on CS or CSTheory?
Algebraic laws in general are typically specified only in terms of equational identities, and not disequalities. The standpoint from which to think about this is model theory. A theory can be thought of as 1) a collection of symbols of different arities, so that sentences can be constructed from them (e.g. at arity 0 we might have sequences of numerals, at arity 1 negation, and at arity 2 addition), and 2) a set of equations that provide relations between sentences constructed from such a signature.
This lets us describe things like various arithmetic theories, groups, rings, modules, etc.
Now a model of a theory is a set of concrete assignments of mathematical objects (numbers, functions, etc) to the elements of the signature, such that the translation of sentences into the elements of the model respects these equations.
Categorically, we often think of a theory as a special sort of category of all possible sentences generated by the signature. The arrows in this category are implication -- sentences which may be generated from others by application of the equational identities. This in turn induces equivalences between all sentences which are the same under the application of the equational identities (this yields the "generators and relations" approach). And in turn, a model is simply a functor from this theory to any other category, though typically Set.
This yields a very nice adjunction between syntax and semantics. The greater the collection of sentences you want to model, the fewer the models you can get, and the more models you have, the smaller the set of sentences that will be satisfied by all of them. (Here I am only sketching the idea rather than filling in lots of important details).
In any case, one consequence of this that people tend to ignore, but that really pays off, is that in such a setting you want a "terminal model" that is the least among all models, just as you want an "initial theory" that admits all models. The terminal (aka trivial) model is the functor that sends everything in the theory to the one-element set, and every map to the unique map on it. Lots of very nice properties emerge when you have such things. But note -- to have such things, you must only have equational identities and not "disidentities." Such theories are called Algebraic Theories.
What does this all have to do with Haskell? Well, we can think of the signatures of typeclasses exactly as signatures of algebraic theories, and the laws of them as the equations of such theories. And that's generally how typeclasses are used in Haskell and why they were introduced -- to suit these sorts of situations.
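A sketch of that correspondence, with the laws written as comments since Haskell cannot check them (MyMonoid is a stand-in name; base's Monoid plays exactly this role):

-- The class signature is the signature of an algebraic theory:
class MyMonoid m where
  unit    :: m              -- an arity-0 symbol
  combine :: m -> m -> m    -- an arity-2 symbol

-- The laws are its equational identities (relations between sentences):
--   combine unit x          == x
--   combine x unit          == x
--   combine x (combine y z) == combine (combine x y) z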
But of course we don't have to do this -- we can have whatever laws we want. But we lose all sorts of nice properties along the way -- and often in fact find that disequalities mean our theory will have very few models, with weird structure relating them. Since typeclasses are intended to capture common structure between various things, and since non-algebraic theories tend to fix unique(ish) models, it turns out that we rarely want to use disequality relations in typeclass laws. And indeed I can't think of any examples where I've seen it come up.
Here's another way to think of it -- consider a theory with equalities and disequalities both, and then eliminate the disequalities. What remains still admits all the prior models, but also may have a bunch of "unintended" ones. So we don't gain additional reasoning in terms of rewrites -- we just have certain models that are a priori excluded. Furthermore, when one wishes to rule out "unintended" models this is usually because we want to fix a particular "intended" one. And if we want to fix a particular intended model, the question immediately arises -- why not just use that concrete structure, instead of the more general typeclass?

Is there a term for a monad that is also a comonad?

I'm just wondering whether there's a concise term for something that's both a monad and a comonad. I've done some searching, and I know these structures exist, but I haven't found a name for them.
Such a creature, subject to certain conditions, is sometimes called a "Hopf monad" or a "Bimonad" (http://ncatlab.org/nlab/show/Hopf+monad).
However, this also requires fulfilling a number of axioms regarding distributive properties, and I haven't seen it come up in a programming context in any particular way.
As far as I know, there is no term for it, because a combined monad-comonad constraint would enforce nothing: you can always use return to get in, or extract to get out.
As types are there to enforce some constraints, a constraint that is too permissive wouldn't be of any use. Since no one would use it (except for the identity), probably no one bothered to name it.
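For what it's worth, here is a sketch of one such structure, the Identity functor, which is both a Monad and a Comonad (the Comonad class below mirrors the one in the comonad package):

newtype Identity a = Identity { runIdentity :: a }

class Functor w => Comonad w where
  extract   :: w a -> a
  duplicate :: w a -> w (w a)

instance Functor Identity where
  fmap f (Identity a) = Identity (f a)

instance Applicative Identity where
  pure = Identity
  Identity f <*> Identity a = Identity (f a)

instance Monad Identity where
  Identity a >>= f = f a          -- return/pure gets you in ...

instance Comonad Identity where
  extract (Identity a) = a        -- ... and extract gets you out
  duplicate i = Identity i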

What happens to you if you break the monad laws?

Do the compiler or the more "native" parts of the libraries (IO or functions that have access to black magic and the implementation) make assumptions about these laws? Will breaking them cause the impossible to happen?
Or do they just express a programming pattern -- i.e., the only people you'll annoy by breaking them are those who use your code and didn't expect you to be so careless?
The monad laws are simply additional rules that instances are expected to follow, beyond what can be expressed in the type system. Insofar as Monad expresses a programming pattern, the laws are part of that pattern. Such laws apply to other type classes as well: Monoid has very similar rules to Monad, and it's generally expected that instances of Eq will follow the rules expected for an equality relation, among other examples.
Because these laws are in some sense "part of" the type class, it should be reasonable for other code to expect they will hold, and act accordingly. Misbehaving instances may thus violate assumptions made by client code's logic, resulting in bugs, the blame for which is properly placed at the instance, not the code using it.
In short, "breaking the monad laws" should generally be read as "writing buggy code".
I'll illustrate this point with an example involving another type class, modified from one given by Daniel Fischer on the haskell-cafe mailing list. It is (hopefully) well known that the standard libraries include some misbehaving instances, namely Eq and Ord for floating point types. The misbehavior occurs, as you might guess, when NaN is involved. Consider the following data structure:
> import Data.Set (Set, fromList, member, insert)
> let x = fromList [0, -1, 0/0, -5, -6, -3] :: Set Float
Here 0/0 produces a NaN, which violates the assumptions that Data.Set.Set makes about Ord instances. Does this Set contain 0?
> member 0 x
True
Yes, of course it does, it's right there in plain sight! Now, we insert a value into the Set:
> let x' = insert (0/0) x
This Set still contains 0, right? We didn't remove anything, after all.
> member 0 x'
False
...oh. Oh, dear.
The compiler doesn't make any assumptions about the laws. However, if your instance does not obey them, it will not behave like a monad: it will do strange things and otherwise appear to your users not to work correctly (e.g. dropping values, or evaluating things in the wrong order).
Also, refactorings your users might make assuming the monad laws hold will obviously not be sound.
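As a deliberately broken (and entirely made-up) illustration, here is an instance that counts how many times (>>=) has been applied. It violates the left identity law, so the usual rewrite of return x >>= f to f x silently changes results:

data Counted a = Counted Int a deriving Show

instance Functor Counted where
  fmap f (Counted n a) = Counted n (f a)

instance Applicative Counted where
  pure = Counted 0
  Counted n f <*> Counted m a = Counted (n + m) (f a)

instance Monad Counted where
  Counted n a >>= f = let Counted m b = f a in Counted (n + m + 1) b

-- Left identity requires: return x >>= f  ==  f x, but here:
-- ghci> pure 3 >>= pure :: Counted Int
-- Counted 1 3
-- ghci> pure 3 :: Counted Int
-- Counted 0 3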
For people working in more "mainstream" languages, this would be like implementing an interface, but doing so incorrectly. For example, imagine you're using a framework that offers an IShape interface, and you implement it. However, your implementation of the draw() method doesn't draw at all, but instead merely instantiates 1000 more instances of your class.
The framework would try to use your IShape and do reasonable things with it, and God knows what would happen. It'd be kind of an interesting train wreck to watch.
If you say you're a Monad, you're "declaring" that you adhere to its contract and laws. Other code will believe your declaration and act accordingly. Since you lied, things will go wrong in unforeseen ways.

Resources