Why aren't there many discussions about co- and contra-variance in Haskell (as opposed to Scala or C#)?

I know what covariance and contravariance of types are. My question is why haven't I encountered discussion of these concepts yet in my study of Haskell (as opposed to, say, Scala)?
It seems there is a fundamental difference in the way Haskell views types as opposed to Scala or C#, and I'd like to articulate what that difference is.
Or maybe I'm wrong and I just haven't learned enough Haskell yet :-)

There are two main reasons:
Haskell lacks an inherent notion of subtyping, so in general variance is less relevant.
Contravariance mostly appears where mutability is involved, so most data types in Haskell would simply be covariant and there'd be little value to distinguishing that explicitly.
However, the concepts do apply--for instance, the lifting operation performed by fmap for Functor instances is actually covariant; the terms co-/contravariance are used in Category Theory to talk about functors. The contravariant package defines a type class for contravariant functors, and if you look at the instance list you'll see why I said it's much less common.
There are also places where the idea shows up implicitly, in how manual conversions work--the various numeric type classes define conversions to and from basic types like Integer and Rational, and the module Data.List contains generic versions of some standard functions. If you look at the types of these generic versions you'll see that Integral constraints (giving toInteger) are used on types in contravariant position, while Num constraints (giving fromInteger) are used for covariant position.
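The two directions of lifting, and the pattern in the Data.List signatures, can be sketched as follows. Pred and contramapPred are hypothetical names defined here only for illustration (the contravariant package provides its own class); genericTake and genericLength are the real Data.List functions.

```haskell
import Data.List (genericLength, genericTake)

-- Hypothetical wrapper for predicates. The type variable 'a' appears only
-- as a function argument, i.e. in contravariant position.
newtype Pred a = Pred { runPred :: a -> Bool }

-- Covariant lifting: fmap turns (a -> b) into (Maybe a -> Maybe b).
nameLength :: Maybe String -> Maybe Int
nameLength = fmap length

-- Contravariant lifting: a function (b -> a) turns a Pred a into a Pred b,
-- so the direction of the function is reversed.
contramapPred :: (b -> a) -> Pred a -> Pred b
contramapPred f (Pred p) = Pred (p . f)

isEven :: Pred Int
isEven = Pred even

hasEvenLength :: Pred String
hasEvenLength = contramapPred length isEven

-- genericTake   :: Integral i => i -> [a] -> [a]   (the count is consumed)
-- genericLength :: Num i => [a] -> i               (the count is produced)
firstTwo :: String
firstTwo = genericTake (2 :: Integer) "abcdef"

size :: Double
size = genericLength "abcdef"
```

Note how the Integral constraint sits on an argument while the Num constraint sits on a result, matching the contravariant and covariant positions described above.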

There are no "sub-types" in Haskell, so covariance and contravariance don't make any sense.
In Scala, you have e.g. Option[+A] with the subclasses Some[+A] and None. You have to provide the covariance annotations + to say that an Option[Foo] is an Option[Bar] if Foo extends Bar. Because of the presence of sub-types, this is necessary.
In Haskell, there are no sub-types. The equivalent of Option in Haskell, called Maybe, has this definition:
data Maybe a = Nothing | Just a
The type variable a can only ever be one type, so no further information about it is necessary.

As mentioned, Haskell does not have subtypes. However, if you're looking at typeclasses it may not be clear how that works without subtyping.
Typeclasses specify predicates on types, not types themselves. So when a Typeclass has a superclass (e.g. Eq a => Ord a), that doesn't mean instances are subtypes, because only the predicates are inherited, not the types themselves.
Also, co-, contra-, and in- variance mean different things in different fields of math (see Wikipedia). For example the terms covariant and contravariant are used in functors (which in turn are used in Haskell), but the terms mean something completely different. The term invariant can be used in a lot of places.


What are algebraic structures in functional programming?

I've been doing some light reading on functional programming concepts and ideas. So far, so good, I've read about three main concepts: algebraic structures, type classes, and algebraic data types. I have a fairly good understanding of what algebraic data types are. I think sum types and product types are fairly straightforward. For example, I can imagine creating an algebraic data type like a Card type which is a product type consisting of two enum types, Suit (with four values and symbols) and Rank (with 13 values and symbols).
However, I'm still hung up on trying to understand precisely what algebraic structures and type classes are. I just have a surface-level picture in my head but can't quite completely wrap my head around, for instance, the different types of algebraic structures like functors, monoids, monads, etc. How exactly are these different? How can they be used in a programming setting? How are type classes different from regular classes? Can anyone at least point me in the direction of a good book on abstract algebra and functional programming? Someone recommended I learn Haskell but do I really need to learn Haskell in order to understand functional programming?
An "algebraic structure" is a concept that goes well beyond programming; it belongs to mathematics.
Imagine the unfathomably deep sea of all possible mathematical objects. Numbers of every stripe (the naturals, the reals, p-adic numbers...) are there, but also things like sequences of letters, graphs, trees, symmetries of geometrical figures, and all well-defined transformations and mappings between them. And much else.
We can try to "throw a net" into this sea and retain only some of those entities, by specifying conditions. Like "collections of things, for which there is an operation that combines two of those things into a third thing of the same type, and for which the operation is associative". We can give those conditions their own name, like, say, "semigroup". (Because we are talking about highly abstract stuff, choosing a descriptive name is difficult.)
That leaves out many inhabitants of the mathematical "sea", but the description still fits a lot of them! Many collections of things are semigroups. The natural numbers with the multiplication operation for example, but also non-empty lists of letters with concatenation, or the symmetries of a square with composition.
You can expand your description with extra conditions. Like "a semigroup, and there's also an element such that combining it with any other element gives the other element, unchanged". That restricts the number of mathematical entities that fit the description, because you are demanding more of them. Some valid semigroups will lack that "neutral element". But a lot of mathematical entities will still satisfy the expanded description. If you aren't careful, you can declare conditions so restrictive that no possible mathematical entity can actually fit them! At other times, you can be so precise that only one entity fits them.
Working purely with these descriptions of mathematical entities, using only the general properties we require of them, we can obtain unexpected results about them, non-obvious at first sight, results that will apply to all entities which fit the description. Think of these discoveries as the mathematical equivalent of "code reuse". For example, if we know that some collection of things is a semigroup, then we can calculate exponentials using binary exponentiation instead of tediously combining a thing with itself n times. But that only works because of the associative property of the semigroup operation.
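As a rough sketch of that kind of reuse, binary exponentiation can be written once against Haskell's Semigroup class and then used with any instance. The name powSemigroup is made up for this example; base already offers a similar method called stimes.

```haskell
import Data.Monoid (Product(..))

-- Combine x with itself n times (n >= 1) using O(log n) applications of (<>).
-- Regrouping the operands like this is only valid because (<>) is associative.
powSemigroup :: Semigroup a => Int -> a -> a
powSemigroup n x
  | n <= 0    = error "powSemigroup: exponent must be positive"
  | n == 1    = x
  | even n    = let half = powSemigroup (n `div` 2) x in half <> half
  | otherwise = x <> powSemigroup (n - 1) x

-- Strings form a semigroup under concatenation.
repeated :: String
repeated = powSemigroup 3 "ab"

-- Integers form a semigroup under multiplication (via the Product wrapper).
power :: Integer
power = getProduct (powSemigroup 10 (Product 2))
```

The function never inspects what kind of thing it is combining; it relies only on the semigroup operation and its associativity.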
You’ve asked quite a few questions here, but I can try to answer them as best I can:
… different types of algebraic structures like functors, monoids, monads, etc. How exactly are these different? How can they be used in a programming setting?
This is a very common question when learning Haskell. I won’t write yet another answer here — and a complete answer is fairly long anyway — but a simple Google search gives some very good answers: e.g. I can recommend 1 2 3
How are type classes different from regular classes?
(By ‘regular classes’ I assume you mean classes as found in OOP.)
This is another common question. Basically, the two have almost nothing in common except the name. A class in OOP is a combination of fields and methods. Classes are used by creating instances of that class; each instance can store data in its fields, and manipulate that data using its methods.
By contrast, a type class is simply a collection of functions (often also called methods, though there’s pretty much no connection). You can declare an instance of a type class for a data type (again, no connection) by redefining each method of the class for that type, after which you may use the methods with that type. For instance, the Eq class looks like this:
class Eq a where
    (==) :: a -> a -> Bool
    (/=) :: a -> a -> Bool
And you can define an instance of that class for, say, Bool, by implementing each function:
instance Eq Bool where
    True == True = True
    False == False = True
    _ == _ = False
    p /= q = not (p == q)
Can anyone at least point me in the direction of a good book on abstract algebra and functional programming?
I must admit that I can’t help with this (and it’s off-topic for Stack Overflow anyway).
Someone recommended I learn Haskell but do I really need to learn Haskell in order to understand functional programming?
No, you don’t — you can learn functional programming from any functional language, including Lisp (particularly Scheme dialects), OCaml, F#, Elm, Scala etc. Haskell happens to be a particularly ‘pure’ functional programming language, and I would recommend it as well, but if you just want to learn and understand functional programming then any one of those will do.

In Haskell, why is there a typeclass hierarchy/inheritance?

To clarify my question, let me rephrase it in a more or less equivalent way:
Why is there a concept of superclass/class inheritance in Haskell?
What are the historical reasons that led to that design choice?
Why would it be so bad, for example, to have a base library with no class hierarchy, just typeclasses independent from each other?
Here I'll expose some random thoughts that made me want to ask this question. My current intuitions might be inaccurate as they are based on my current understanding of Haskell which is not perfect, but here they are...
It is not obvious to me why type class inheritance exists in Haskell. I find it a bit weird, as it creates asymmetry in concepts.
Often in mathematics, concepts can be defined from different viewpoints, I don't necessarily want to favor an order of how they ought to be defined. OK there is some order in which one should prove things, but once theorems and structures are there, I'd rather see them as available independent tools.
Moreover one perhaps not so good thing I see with class inheritance is this: I think a class instance will silently pick a corresponding superclass instance, which was probably implemented to be the most natural one for that type. Let's consider a Monad viewed as a subclass of Functor. Maybe there could be more than one way to define a Functor on some type that also happens to be a Monad. But saying that a Monad is a Functor implicitly makes the choice of one particular Functor for that Monad. Someday, you might forget that actually you wanted some other Functor.
Perhaps this example is not the best fit, but I have the feeling this sort of situation might generalize and possibly be dangerous if your class is a child of many. Current Haskell inheritance sounds like it makes default choices about parents implicitly.
If instead you have a design without hierarchy, I feel you would always have to be explicit about all the properties required, which would perhaps mean a bit less risk, more clarity, and more symmetry. So far, what I'm seeing is that the cost of such a design would be more constraints to write in instance definitions, and newtype wrappers for each meaningful conversion from one set of concepts to another. I am not sure, but perhaps that could have been acceptable. Unfortunately, I think Haskell's auto-deriving mechanism for newtypes doesn't work very well; I would appreciate it if the language were somehow smarter about newtype wrapping/unwrapping and required less verbosity.
I'm not sure, but now that I think about it, perhaps an alternative to newtype wrappers could be specific imports of modules containing specific variations of instances.
Another alternative I thought about while writing this, is that maybe one could weaken the meaning of class (P x) => C x, where instead of it being a requirement that an instance of C selects an instance of P, we could just take it to loosely mean that for example, C class also contains P's methods but no instance of P is automatically selected, no other relationship with P exists. So we could keep some sort of weaker hierarchy that might be more flexible.
Thanks if you have some clarifications over that topic, and/or correct my possible misunderstandings.
Maybe you're tired of hearing from me, but here goes...
I think superclasses were introduced as a relatively minor and unimportant feature of type classes. In Wadler and Blott, 1988, they are briefly discussed in Section 6 where the example class Eq a => Num a is given. There, the only rationale offered is that it's annoying to have to write (Eq a, Num a) => ... in a function type when it should be "obvious" that data types that can be added, multiplied, and negated ought to be testable for equality as well. The superclass relationship allows "a convenient abbreviation".
(The unimportance of this feature is underscored by the fact that this example is so terrible. Modern Haskell doesn't have class Eq a => Num a because the logical justification for all Nums also being Eqs is so weak. The example class Eq a => Ord a would have been a lot more convincing.)
So, the base library implemented without any superclasses would look more or less the same. There would just be more logically superfluous constraints on function type signatures in both library and user code, and instead of fielding this question, I'd be fielding a beginner question about why:
leq :: (Ord a) => a -> a -> Bool
leq x y = x < y || x == y
doesn't type check.
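To make that hypothetical concrete, here is a small sketch using made-up classes MyEq and MyOrd (stand-ins for Eq and Ord, not anything from base) that are declared without a superclass relationship. The function then has to name both constraints itself.

```haskell
-- Hypothetical stand-ins for Eq and Ord, declared WITHOUT a superclass link.
class MyEq a where
  eq :: a -> a -> Bool

class MyOrd a where
  lt :: a -> a -> Bool

instance MyEq Int where
  eq = (==)

instance MyOrd Int where
  lt = (<)

-- Both constraints must be listed. With only (MyOrd a) in the signature,
-- the use of eq would be rejected by the type checker.
leq' :: (MyEq a, MyOrd a) => a -> a -> Bool
leq' x y = lt x y || eq x y
```

With a superclass declaration such as class MyEq a => MyOrd a, the single constraint MyOrd a would be enough, which is the "convenient abbreviation" described above.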
To your point about superclasses forcing a particular hierarchy, you're missing your target.
This kind of "forcing" is actually a fundamental feature of type classes. Type classes are "opinionated by design", and in a given Haskell program (where "program" includes all the libraries, including base, used by the program), there can be only one instance of a particular type class for a particular type. This property is referred to as coherence. (Even though there is a language extension IncoherentInstances, it is considered very dangerous and should only be used when all possible instances of a particular type class for a particular type are functionally equivalent.)
This design decision comes with certain costs, but it also comes with a number of benefits. Edward Kmett talks about this in detail in this video, starting at around 14:25. In particular, he compares Haskell's coherent-by-design type classes with Scala's incoherent-by-design implicits and contrasts the increased power that comes with the Scala approach with the reusability (and refactoring benefits) of "dumb data types" that comes with the Haskell approach.
So, there's enough room in the design space for both coherent type classes and incoherent implicits, and Haskell's approach isn't necessarily the right one.
BUT, since Haskell has chosen coherent type classes, there's no "cost" to having a specific hierarchy:
class Functor a => Monad a
because, for a particular type, like [] or MyNewMonadDataType, there can only be one Monad and one Functor instance anyway. The superclass relationship introduces a requirement that any type with Monad instance must have Functor instance, but it doesn't restrict the choice of Functor instance because you never had a choice in the first place. Or rather, your choice was between having zero Functor [] instances and exactly one.
Note that this is separate from the question of whether or not there's only one reasonable Functor instance for a Monad type. In principle, we could define a law-violating data type with incompatible Functor and Monad instances. We'd still be restricted to using that one Functor MyType instance and that one Monad MyType instance throughout our program, whether or not Functor was a superclass of Monad.
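Coherence does not prevent a type from having several sensible behaviours; the usual workaround, as the question anticipates, is one newtype per behaviour. As a minimal sketch, the Sum and Product wrappers from Data.Monoid each carry exactly one Monoid instance for numbers (the names total and productOf are just for illustration):

```haskell
import Data.Monoid (Product(..), Sum(..))

-- Integer itself has no Monoid instance in base. Each newtype wrapper
-- carries exactly one instance, so instance selection is never ambiguous.
total :: Integer
total = getSum (foldMap Sum [1, 2, 3, 4])

productOf :: Integer
productOf = getProduct (foldMap Product [1, 2, 3, 4])
```

The choice of instance is made explicitly, at the point where the wrapper is applied, rather than by which instances happen to be in scope.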

What are abstract patterns?

I am learning Haskell and trying to understand the Monoid typeclass.
At the moment, I am reading the haskellbook and it says the following about the pattern (monoid):
One of the finer points of the Haskell community has been its
propensity for recognizing abstract patterns in code which have
well-defined, lawful representations in mathematics.
What does the author mean by abstract patterns?
Abstract in this sense is the opposite of concrete. This is probably one of the key things to understand about Haskell.
What is a concrete thing? Well, most values in Haskell are concrete. For example 'a' :: Char. The letter 'a' is a Char value, and it's a concrete value. Char is a concrete type. But in 1 :: Num a => a, the number 1 is actually a value of any type, so long as that type has the set of functions that the Num typeclass sets out as mandatory. This is an abstract value! We can have abstract values, abstract types, and therefore abstract functions. When the program is compiled, the Haskell compiler will pick a particular concrete value that supports all of our requirements.
Haskell, at its core, has a very simple, small but incredibly flexible language. It's very similar to an expression of maths, actually. This makes it very powerful. Why? because most things that would be built in language constructs in other languages are not directly built into Haskell, but defined in terms of this simple core.
One of the core pieces is the function, which, it turns out, most of computation is expressible in terms of. Because so much of Haskell is just defined in terms of this small simple core, it means we can extend it to almost anywhere we can imagine.
Typeclasses are probably the best example of this. Monoid and Num are examples of typeclasses. These are constructs that allow programmers to use an abstraction like a function across a great many types but only having to define it once. Typeclasses let us use the same function names across a whole range of types if we can define those functions for those types. Why is that important or useful? Well, if we can recognise a pattern across, for example, all numbers, and we have a mechanism for talking about all numbers in the language itself, then we can write functions that work with all numbers at once. This is an abstract pattern. You'll notice some Haskellers are quite interested in a branch of mathematics called Category Theory. This branch is pretty much the mathematical definition of abstract patterns. Contrast this ability to encode such things with other languages, where the patterns the community notices are often far less rigorous, have to be written out manually, and are expressed without regard for their mathematical nature. The beauty of following the mathematics is the extremely large body of results we get for free by aligning our language more closely with mathematics.
This is a good explanation of these basics including typeclasses in a book that I helped author: http://www.happylearnhaskelltutorial.com/1/output_other_things.html
Because functions are written in a very general way (because Haskell puts hardly any limits on our ability to express things generally), we can write functions that use types which express such things as "any type, so long as it's a Monoid". These are called type constraints, as above.
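For instance, a function constrained to "any type, so long as it's a Monoid" might look like the sketch below. The name combineAll is made up for this example; the Prelude's mconcat does the same job.

```haskell
-- Works for any type m, as long as m is a Monoid.
combineAll :: Monoid m => [m] -> m
combineAll = foldr (<>) mempty

-- Used at the String type: (<>) is concatenation, mempty is "".
joined :: String
joined = combineAll ["ab", "cd", "ef"]

-- Used at the list-of-Int type with the very same definition.
merged :: [Int]
merged = combineAll [[1], [2, 3], []]
```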
Generally, abstractions are very useful because we can, for example, write a single function that operates on an entire range of types, which means we can often find functions that do exactly what we want on our types if we just make them instances of specific typeclasses. The Ord typeclass is a great example of this. Making a type we define ourselves an instance of Ord gives us a whole bunch of sorting and comparing functions for free.
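A small sketch of the Ord case, using a hypothetical Priority type: once the instance exists (here simply derived), library functions such as sort and maximum work on it unchanged.

```haskell
import Data.List (sort)

-- Hypothetical type. A derived Ord instance orders the constructors
-- in the order they are written, so Low < Medium < High.
data Priority = Low | Medium | High
  deriving (Eq, Ord, Show)

sorted :: [Priority]
sorted = sort [High, Low, Medium]

highest :: Priority
highest = maximum [Medium, High, Low]
```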
This is, in my opinion, one of the most exciting parts about Haskell, because while most other languages also allow you to be very general, they mostly take an extreme dip in how expressive you can be with that generality, so therefore also are less powerful. (This is because they are less precise in what they talk about, because their types are less well "defined").
This is how we're able to reason about the "possible values" of a function, and it's not limited to Haskell. The more information we encode at the type level, the more toward the specificity end of the spectrum of expressivity we veer. For example, to take a classic case, the function const :: a -> b -> a. This function requires that a and b can be of absolutely any type at all, including the same type if we like. From that, because the second parameter can be a different type than the first, we can work out that it really only has one possible functionality. It can't return an Int, unless we give it an Int as its first value, because that's not any type, right? So therefore we know the only value it can return is the first value! The functionality is defined right there in the type! If that's not mindblowing, then I don't know what is. :)
As we move to dependent types (that is, a type system where types are first class, which means also that ordinary values can be encoded in the type system), we can get closer and closer to having the type system specify specifically what the constraints of possible functionality are. However, the kicker is, it doesn't necessarily speak about the implementation of the functionality unless we want it to, because we're in control of how abstract it is, but while maintaining expressivity and much precision. That's pretty fascinating, and amazingly powerful.
Much math can be expressed in the language that underpins Haskell, the lambda calculus.

What are types with type constraints called?

For example, Num a => a.
I assumed they're just called "constrained types", but Googling didn't turn up many uses of that term so I'm curious to know if they go by some other name.
Types with this particular kind of constraints are called "qualified types", and the feature itself sometimes "qualified polymorphism". I believe the terminology was originally introduced by Mark Jones' ESOP '92 paper.
Qualified types should not be confused with the more mainstream notion of "bounded polymorphism", on which generics in languages like Java are based. Bounded polymorphism essentially is the (rather complicated) combination of parametric polymorphism with subtyping, whereas qualified types get along without subtyping.
"Qualified types". See Mark P. Jones. Qualified Types: Theory and Practice. Cambridge University Press, Cambridge, 1994.
Plenty of relevant matches on Google.
I'm no type theory expert, but with a little research, this is what I've found (which may or may not be helpful, but I can't fit this in a comment).
A Gentle Introduction to Haskell: Classes calls the Num a portion the type's context:
The constraint that a type a must be an instance of the class Eq is
written Eq a. Thus Eq a is not a type expression, but rather it
expresses a constraint on a type, and is called a context.
So I suppose you could say "a type with a context", or as you mentioned "constrained type".
Another place to look is where type-classes are first described (I believe) for Haskell: How to make ad-hoc polymorphism less ad-hoc [postscript].
Type classes appear to be closely related to issues that arise in
object-oriented programming, bounded quantification of types, and
abstract data types[CW85, MP85, Rey85]. Some of the connections are
outlined below, but more work is required to understand these
relations fully.
This paper was written in 1988, so I'm not sure if these relations are now fully understood, but the wikipedia page for Bounded quantification doesn't mention Haskell, so I'm not sure it's exactly the same thing. (once again, not a type theorist -- just a guy who likes Haskell)
Also, about the type square :: Num a => a -> a it says:
This is read, "square has type a -> a, for every a such that a belongs
to class Num (i.e., such that (+),(*), and negate are defined on a)."
You could say the type "belongs to a class".
That's about all I've got. Personally, I think "constrained types" or "types constrained to a class" work fine.
The Num a => part is indeed called a constraint; you can read it as "if Num a is true, then ..."
Normally, constraints and quantifiers are discussed together. Any constrained type can be converted to an equivalent type where constraints only appear just inside forall or exists quantifiers. So, you won't hear of "constrained types" as often as you will hear of "constrained parametric polymorphism" (forall a. C => T), "constrained existential types" (exists a. C => T), or "constrained polymorphism" (both kinds of quantifiers).
A related term is "bounded polymorphism." Bounded polymorphism usually means constrained polymorphism where the constraint is a subtype or supertype constraint. However, this distinction isn't strictly followed. In languages with subtyping like Java or Scala, you will often hear any kind of constraint called a "bound."
You could call it a bounded polymorphic type (see wikipedia).

Haskell's TypeClasses and Go's Interfaces

What are the similarities and the differences between Haskell's TypeClasses and Go's Interfaces? What are the relative merits / demerits of the two approaches?
Looks like only in superficial ways are Go interfaces like single parameter type classes (constructor classes) in Haskell.
Methods are associated with an interface type
Objects (particular types) may have implementations of that interface
It is unclear to me whether Go in any way supports bounded polymorphism via interfaces, which is the primary purpose of type classes. That is, in Haskell, the interface methods may be used at different types,
class I a where
    put :: a -> IO ()
    get :: IO a
instance I Int where
instance I Double where
So my question is whether Go supports type polymorphism. If not, they're not really like type classes at all. And they're not really comparable.
Haskell's type classes allow powerful reuse of code via "generics" -- higher kinded polymorphism -- a good reference for cross-language support for such forms of generic programming is this paper.
Ad hoc, or bounded polymorphism, via type classes, is well described here. This is the primary purpose of type classes in Haskell, and one not addressed via Go interfaces, meaning they're not really very similar at all. Interfaces are strictly less powerful - a kind of zeroth-order type class.
I will add to Don Stewart's excellent answer that one of the surprising consequences of Haskell's type classes is that you can use logic programming at compile time to generate arbitrarily many instances of a class. (Haskell's type-class system includes what is effectively a cut-free subset of Prolog, very similar to Datalog.) This system is exploited to great effect in the QuickCheck library. Or for a very simple example, you can see how to define a version of Boolean complement (not) that works on predicates of arbitrary arity. I suspect this ability was an unintended consequence of the type-class system, but it has proven incredibly powerful.
Go has nothing like it.
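One way the arbitrary-arity complement mentioned above could be written is sketched below. The class name Complement is made up for this example and is not the code the answer links to. A base instance plus one inductive instance let the compiler derive an instance for functions of any number of arguments.

```haskell
-- Hypothetical class: things that can be logically negated.
class Complement p where
  complement :: p -> p

-- Base case: a plain Bool.
instance Complement Bool where
  complement = not

-- Inductive case: if the result can be complemented, so can the function.
-- The compiler applies this rule once per argument.
instance Complement b => Complement (a -> b) where
  complement f = complement . f

-- One-argument predicate.
isOdd :: Int -> Bool
isOdd = complement even

-- Two-argument predicate, using the same method.
notEqual :: Int -> Int -> Bool
notEqual = complement (==)
```

Instance resolution here behaves like a tiny logic program: to solve Complement (Int -> Int -> Bool), the compiler reduces the goal to Complement (Int -> Bool) and then to Complement Bool.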
In Haskell, typeclass instantiation is explicit (i.e. you have to say instance Foo Bar for Bar to be an instance of Foo), while in Go implementing an interface is implicit (i.e. when you define a type that has the right methods, it automatically implements the corresponding interface without having to say something like implement InterfaceName).
An interface can only describe methods where the instance of the interface is the receiver. In a typeclass the instantiating type can appear at any argument position or as the return type of a function (i.e. you can say that if Foo is an instance of typeclass Bar, there must be a function named baz which takes an Int and returns a Foo; you can't say that with interfaces).
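A minimal sketch of a method whose instance type appears only in the result; FromInt and fromInt are hypothetical names chosen for this example. Which implementation runs is decided by the type expected at the call site, not by a receiver value.

```haskell
-- Hypothetical class whose method mentions the instance type only in its result.
class FromInt a where
  fromInt :: Int -> a

instance FromInt Bool where
  fromInt n = n /= 0

instance FromInt Integer where
  fromInt = toInteger

-- The type annotation selects the Bool instance.
asBool :: Bool
asBool = fromInt 3

-- The same call at a different type selects the Integer instance.
asInteger :: Integer
asInteger = fromInt 3
```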
The similarities are very superficial; Go's interfaces are more like structural sub-typing in OCaml.
C++ Concepts (that didn't make it into C++0x) are like Haskell type classes. There were also "axioms" which aren't present in Haskell at all. They let you formalize things like the monad laws.
