Why does Haskell have trouble resolving "overloaded" operators?

This post poses the question for the case of !!. The accepted answer tells us that what you are actually doing is creating a new function !!, and that you should then avoid importing the standard one.
But why do so if the new function is to be applied to different types than the standard one? Isn't the compiler able to choose the right one according to its parameters?
Is there any compiler flag to allow this?
For instance, given that * is not defined for [Float] * Float, why does the compiler complain
> Ambiguous occurrence *
> It could refer to either `Main.*', defined at Vec.hs:4:1
> or `Prelude.*',
for this code:
(*) :: [Float] -> Float -> [Float]
(*) as k = map (\a -> a*k) as -- here: clearly Float*Float
r = [1.0, 2.0, 3.0] :: [Float]
s = r * 2.0 -- here: clearly [Float] * Float
main = do
    print r
    print s

Allowing the compiler to choose the correct implementation of a function based on its type is the purpose of typeclasses. It is not possible without them.
For a justification of this approach, you might read the paper that introduced them: How to make ad-hoc polymorphism less ad hoc [PDF].
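As a rough sketch of what the type-class route could look like for the code in the question: the class name Scale and the operator (.*) below are made up for illustration and are not part of any library. The point is only that the instance, and therefore the implementation, is chosen from the argument type, while Prelude's * stays untouched.

```haskell
-- Hypothetical class (not from the original post): things that can be
-- scaled by a Float. The operator name (.*) is likewise an assumption.
class Scale a where
  (.*) :: a -> Float -> a

infixl 7 .*

-- Scaling a single Float is ordinary multiplication from the Prelude.
instance Scale Float where
  x .* k = x * k

-- Scaling a list scales every element, so nested lists work as well.
instance Scale a => Scale [a] where
  xs .* k = map (.* k) xs

r :: [Float]
r = [1.0, 2.0, 3.0]

s :: [Float]
s = r .* 2.0 -- resolved to the list instance

main :: IO ()
main = do
  print r -- [1.0,2.0,3.0]
  print s -- [2.0,4.0,6.0]
```

Using a fresh operator name avoids the clash with Prelude.* entirely, so no import hiding is needed.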

Really, the reason is this: in Haskell, there is not necessarily a clear association “variable x has type T”.
Haskell is almost as flexible as dynamic languages, in the sense that any type can be a type variable, i.e. can have polymorphic type. But whereas in dynamic languages (and also e.g. OO polymorphism or C++ templates), the types of such type-variables are basically just extra information attached to the value-variables in your code (so an overloaded operator can see: argument is an Int->do this, is a String->do that), in Haskell the type variables live in a completely separate scope in the type language. This gives you many advantages; for instance, higher-kinded polymorphism is pretty much impossible without such a system. However, it also means it's harder to reason about how overloaded functions should be resolved. If Haskell allowed you to just write overloads and assume the compiler makes its best guess at resolving the ambiguity, you'd often end up with strange error messages in unexpected places. (Actually, this can easily happen with overloads even if you have no Hindley-Milner type system. C++ is notorious for it.)
Instead, Haskell chooses to force overloads to be explicit. You must first define a type class before you can overload methods, and though this can't completely preclude confusing compilation errors, it makes them much easier to avoid. Also, it lets you express polymorphic methods with type resolution that couldn't be expressed with traditional overloading, in particular polymorphic results (which is great for writing very easily reusable code).
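To make "polymorphic results" concrete, here is a minimal sketch. The class Default and its method def are hypothetical names chosen for this example. The method takes no arguments at all, so the instance can only be selected from the type the caller expects back, which argument-based overloading cannot express.

```haskell
-- Hypothetical class: the result type alone selects the instance.
class Default a where
  def :: a

instance Default Int where
  def = 0

instance Default Bool where
  def = False

-- Instances compose: a pair has a default when both components do.
instance (Default a, Default b) => Default (a, b) where
  def = (def, def)

main :: IO ()
main = do
  print (def :: Int)         -- 0
  print (def :: (Int, Bool)) -- (0,False)
```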

Leaving this out of Haskell is a design decision, not a theoretical problem. As you say, many other languages use types to disambiguate between terms in an ad hoc way. But type classes have similar functionality and additionally allow abstraction over things that are overloaded. Type-directed name resolution does not.
Nevertheless, forms of type-directed name resolution have been discussed for Haskell (for example in the context of resolving record field selectors) and are supported by some languages similar to Haskell such as Agda (for data constructors) or Idris (more generally).

Related

"Generalized arrows" and proc notation?

When learning about Control.Arrow and Haskell's built-in proc notation, I had the idea that this language might prove very useful as an eDSL for general monoidal categories (using *** for tensor and >>> for composition), if only the Arrow typeclass were generalized to allow a general tens :: * -> * -> * operation rather than Arrow's (,) :: * -> * -> *.
After doing some research, I found GArrows, which seem to fit my needs. However, the linked GArrow typeclass comes bundled with the so-called "HetMet" GHC extensions, and with support for other features that (for the time being, anyway) I don't have much use for, such as "modal types".
Given that I would like to be able to use such a GArrow typeclass without having to install non-standard GHC extensions:
Is there an actual (somewhat standardized) library on Hackage that meets my needs for such a generalized arrow typeclass?
Given such a library, is there any way to use such a GArrow type class with a "generalized proc" notation without having to cook up my own GHC extension? (With RebindableSyntax perhaps?)
Note: Also, I'm fine with using quasiquotation for a generalized proc notation. So perhaps it wouldn't be too difficult to modify something like this to suit my needs.
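For readers trying to picture the generalization being asked for, here is a small sketch. It is not the GArrow class from the linked work; the class name Monoidal and its single method are assumptions made for this example. It abstracts the tensor as a second class parameter, so that (,) and Either can both play that role for ordinary functions.

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE MultiParamTypeClasses #-}

import Control.Category (Category)
import Data.Kind (Type)

-- Hypothetical class: a category k together with a tensor on its objects.
class Category k => Monoidal (k :: Type -> Type -> Type)
                             (tens :: Type -> Type -> Type) where
  (***) :: k a b -> k c d -> k (tens a c) (tens b d)

-- Functions with pairs as the tensor: this is Arrow's usual (***).
instance Monoidal (->) (,) where
  (f *** g) (a, c) = (f a, g c)

-- Functions with Either as the tensor.
instance Monoidal (->) Either where
  f *** g = either (Left . f) (Right . g)

pairExample :: (Int, Bool)
pairExample = ((+ 1) *** not) (1, True)

sumExample :: Either Int Bool
sumExample = ((+ 1) *** not) (Left 1)

main :: IO ()
main = do
  print pairExample -- (2,False)
  print sumExample  -- Left 2
```

This only covers the typeclass half of the question; it says nothing about getting proc-style syntax for such a class.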

I've wondered about that before, too. But – proc notation is so widely considered a silly oddball that there's probably not much interest in generalisation either (though I daresay this is what would make it actually useful!)
However, it's actually not necessary to have special syntax. The primary reference that must be named here is Conal Elliott's work on compiling lambda notation to bicartesian closed categories, which I thought would have caught on in the Haskell community by now, but somehow hasn't. It is available as a GHC plugin, at any rate.
Even that isn't always needed. For some category combinators, you can just wrap a value that's universally quantified in the argument, and treat that as a pseudo-return-value. I call those Agent in constrained-categories; not sure if that's usable for your application, at any rate several things you'd do with arrow-like categories can be done. (In constrained-categories, the tensor product is fixed to (,), however, so probably not what you want. Although, could you explain what tensor product you need?)

Why do GHC and GHCi differ on type inference?

I noticed, when doing a codegolf challenge, that by default, GHC doesn't infer the most general type for variables, leading to type errors when you try to use it with two different types.
For example:
(!) = elem
x = 'l' ! "hello" -- From its use here, GHC assumes (!) :: Char -> [Char] -> Bool
y = 5 ! [3..8] -- Fails because GHC expects these numbers to be of type Char, too
This can be changed using the pragma NoMonomorphismRestriction.
However, typing this into GHCi produces no type error, and :t (!) reveals that here, it assumes (Foldable t, Eq a) => a -> t a -> Bool, even when explicitly run with -XMonomorphismRestriction.
Why do GHC and GHCi differ on assuming the most general type for functions?
(Also, why have it enabled by default anyway? What does it help?)

The background of why the committee made this decision is given, in the designers’ own words, in the article “A History of Haskell: Being Lazy with Class” by Paul Hudak et al.
A major source of controversy in the early stages was the so-called
“monomorphism restriction.” Suppose that genericLength has this
overloaded type:
genericLength :: Num a => [b] -> a
Now consider this definition:
f xs = (len, len)
  where
    len = genericLength xs
It looks as if len should be computed only once, but it can
actually be computed twice. Why? Because we can infer the type
len :: (Num a) => a; when desugared with the dictionary-passing
translation, len becomes a function that is called once for each
occurrence of len, each of which might used at a different type.
[John] Hughes argued strongly that it was unacceptable to silently
duplicate computation in this way. His argument was motivated by
a program he had written that ran exponentially slower than he expected.
(This was admittedly with a very simple compiler, but we were reluctant to
make performance differences as big as this dependent on compiler
optimisations.)
Following much debate, the committee adopted the now-notorious
monomorphism restriction. Stated briefly, it says that a definition
that does not look like a function (i.e. has no arguments on
the left-hand side) should be monomorphic in any overloaded
type variables. In this example, the rule forces len to be used
at the same type at both its occurrences, which solves the performance
problem. The programmer can supply an explicit type signature for len if
polymorphic behaviour is required.
The monomorphism restriction is manifestly a wart on the language.
It seems to bite every new Haskell programmer by giving rise to an
unexpected or obscure error message. There has been much
discussion of alternatives.
(18, Emphasis added.) Note that John Hughes is a co-author of the article.
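The last point of the quoted passage, that an explicit signature brings back the polymorphic behaviour, can be sketched as follows. The result type (Int, Double) is chosen here purely to force the two occurrences of len to different types; without the local signature the monomorphism restriction should make this definition a type error.

```haskell
import Data.List (genericLength)

f :: [b] -> (Int, Double)
f xs = (len, len)
  where
    -- The explicit signature keeps len overloaded, so each use site may
    -- pick its own type (and the length may be computed once per use).
    len :: Num a => a
    len = genericLength xs

main :: IO ()
main = print (f "abc") -- (3,3.0)
```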

I can't replicate your result that GHCi infers the type (Foldable t, Eq a) => a -> t a -> Bool even with -XMonomorphismRestriction (GHC 8.0.2).
What I see is that when I enter the line (!) = elem it infers the type (!) :: () -> [()] -> Bool, which is actually a perfect illustration of why you would want GHCi to behave "differently" from GHC, given that GHC is using the monomorphism restriction.
The problem described in @Davislor's answer that the monomorphism restriction was intended to address is that you could write code that syntactically looks like it's computing a value once, binding it to a name, and then using it several times, where actually the thing bound to a name is a reference to a closure awaiting a type class dictionary before it can really compute the value. All the use sites would separately work out what dictionary they need to pass and compute the value again, even if all the use sites actually pick the same dictionary (exactly as if you write a function of a number and then invoke it from several different places with the same parameter, you'd get the same result computed multiple times). But if the user was thinking of that binding as a simple value then this would be unexpected, and it's extremely likely that all the use-sites will want a single dictionary (because the user expected a reference to a single value computed from a single dictionary).
The monomorphism restriction forces GHC not to infer types that still need a dictionary (for bindings that have no syntactic parameters). So now the dictionary is chosen once at the binding site, instead of separately at each use of the binding, and the value really is only computed once. But that only works if the dictionary chosen at the binding site is the correct one that all the use sites would have chosen. If GHC picked the wrong one at the binding site, then all the use-sites would be type errors, even if they all agree on what type (and thus dictionary) they were expecting.
GHC compiles entire modules at once. So it can see the use sites and the binding site at the same time. Thus if any use of the binding requires a specific concrete type, the binding will use that type's dictionary, and everything will be well so long as all of the other use sites are compatible with that type (even if they were actually polymorphic and would also have worked with other types). This works even if the code that pins down the correct type is widely separated from the binding by many other calls; all the constraints on the types of things are effectively connected by unification during the type checking/inference phase, so when the compiler is choosing a type at the binding site it can "see" the requirements from all of the use-sites (within the same module).
But if the use sites are not all consistent with a single concrete type, then you get a type error, as in your example. One use-site of (!) requires the a type variable to be instantiated as Char, the other requires a type that also has a Num instance (which Char doesn't).
This wasn't consistent with our hopeful assumption that all the use-sites would want a single dictionary, and so the monomorphism restriction has resulted in an error that could have been avoided by inferring a more general type for (!). It's certainly debatable whether the monomorphism restriction prevents more problems than it solves, but given that it is there, surely we'd want GHCi to behave the same way, right?
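For reference, the error in the question's example goes away in a compiled module once (!) is given an explicit signature, because the restriction only applies to bindings without one. A minimal sketch, in which the annotation on 5 is added only to keep the example free of defaulting:

```haskell
-- With a signature, (!) stays fully polymorphic even under the
-- monomorphism restriction.
(!) :: (Foldable t, Eq a) => a -> t a -> Bool
(!) = elem

x :: Bool
x = 'l' ! "hello"

y :: Bool
y = (5 :: Int) ! [3 .. 8]

main :: IO ()
main = print (x, y) -- (True,True)
```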
However, GHCi is an interpreter. You enter code one statement at a time, not one module at a time. So when you type (!) = elem and hit enter, GHCi has to understand that statement and produce a value to bind to (!) with some specific type right now (it can be an unevaluated thunk, but we have to know what its type is). With the monomorphism restriction we can't infer (Foldable t, Eq a) => a -> t a -> Bool; we have to pick a type for those type variables now, with no information from use-sites to help us pick something sensible. The extended default rules that are on in GHCi (another difference from GHC) default those to [] and (), so you get (!) :: () -> [()] -> Bool [1]. Pretty useless, and you get a type error trying either of the uses from your example.
The problem that the monomorphism restriction addresses is particularly egregious in the case of numeric calculations when you're not writing explicit type signatures. Since Haskell's numeric literals are overloaded, you could easily write an entire complex calculation, complete with starting data, whose most general type is polymorphic with a Num or Floating or similar constraint. Most of the builtin numeric types are very small, so you're very likely to have values that you'd much rather store than compute multiple times. The scenario is more likely to happen, and more likely to be a problem.
But it's also exactly with numeric types that the whole-module type-inference process is essential to defaulting type variables to a concrete type in a way that is at all usable (and small examples with numbers are exactly what people new to Haskell are likely to be trying out in the interpreter). Before the monomorphism restriction was off by default in GHCi, there was a constant stream of Haskell questions here on Stack Overflow from people confused about why they couldn't divide numbers in GHCi that they could in compiled code, or something similar (basically the reverse of your question here). In compiled code you can mostly just write code the way you want with no explicit types and the full-module type inference figures out whether it should default your integer literals to Integer, or Int if they need to be added to something returned by length, or Double if they need to be added to something and multiplied by something else which is elsewhere divided by something, etc. In GHCi a simple x = 2 very often does the wrong thing with the monomorphism restriction turned on (because it'll pick Integer regardless of what you wanted to do with x later), with the result that you need to add far more type annotations in a quick-and-easy interactive interpreter than even the most ardent explicit-typist would use in production compiled code.
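The compiled-module half of that contrast can be sketched as below; the names n and total are made up for the example. Because the whole module is checked at once, the later use with length pins n to Int. Entering n = 2 on its own line in GHCi with the restriction switched on would, as described above, settle on Integer before the second line is ever seen.

```haskell
-- No signature and no arguments: the monomorphism restriction applies,
-- so n gets exactly one type for the whole module.
n = 2

-- This use forces that one type to be Int (the result type of length).
total :: Int
total = n + length "hello"

main :: IO ()
main = print total -- 7
```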
So it's certainly debatable whether GHC should use the monomorphism restriction or not; it's intended to address a real problem, it just also causes some other ones [2]. But the monomorphism restriction is a terrible idea for the interpreter. The fundamental difference between line-at-a-time and module-at-a-time type inference means that even when they both did default to using it they behaved quite differently in practice anyway. GHCi without the monomorphism restriction is at least significantly more usable.
[1] Without the extended default rules you instead get an error about an ambiguous type variable, because it doesn't have anything to pin down a choice, not even the somewhat silly defaulting rules.
[2] I find it only a mild irritation in actual development because I write type signatures for top-level bindings. I find that's enough to make the monomorphism restriction apply only rarely, so it doesn't help or hinder me much. Thus I'd probably rather it was scrapped so that everything works consistently, especially as it seems to bite learners far more often than it bites me as a practitioner. On the other hand, debugging a rare performance problem on the occasion that it matters is much harder than rarely having to add a correct type signature that GHC annoyingly won't infer.

NoMonomorphismRestriction is a useful default in GHCi because you don't have to write out so many pesky type signatures in the REPL. GHCi will try to infer the most general types it can.
MonomorphismRestriction is a useful default otherwise for efficiency / performance reasons. Specifically, the issue boils down to the fact that:
typeclasses essentially introduce additional function parameters -- specifically, the dictionary of code implementing the instances in question. In the case of typeclass polymorphic pattern bindings, you end up turning something that looked like a pattern binding -- a constant that would only ever be evaluated once, into what is really a function binding, something which will not be memoised.

Type erasure in Haskell?

I was reading a lecture note on Haskell when I came across this paragraph:
This “not caring” is what the “parametric” in parametric polymorphism means. All Haskell functions must be parametric in their type parameters; the functions must not care or make decisions based on the choices for these parameters. A function can't do one thing when a is Int and a different thing when a is Bool. Haskell simply provides no facility for writing such an operation. This property of a language is called parametricity.
There are many deep and profound consequences of parametricity. One consequence is something called type erasure. Because a running Haskell program can never make decisions based on type information, all the type information can be dropped during compilation. Despite how important types are when writing Haskell code, they are completely irrelevant when running Haskell code. This property gives Haskell a huge speed boost when compared to other languages, such as Python, that need to keep types around at runtime. (Type erasure is not the only thing that makes Haskell faster, but Haskell is sometimes clocked at 20x faster than Python.)
What I don't understand is how are "all Haskell functions" parametric? Aren't types explicit/static in Haskell? Also I don't really understand how type erasure improves runtime?
Sorry if these questions are really basic, I'm new to Haskell.
EDIT:
One more question: why does the author say that "Despite how important types are when writing Haskell code, they are completely irrelevant when running Haskell code"?

What I don't understand is how are "all Haskell functions" parametric?
It doesn't say all Haskell functions are parametric, it says:
All Haskell functions must be parametric in their type parameters.
A Haskell function need not have any type parameters.
One more question: why does the author say that "Despite how important types are when writing Haskell code, they are completely irrelevant when running Haskell code"?
Unlike a dynamically typed language where you need to check at run time if (for example) two things are numbers before trying to add them together, your running Haskell program knows that if you're trying to add them together, then they must be numbers because the compiler made sure of it beforehand.
Aren't types explicit/static in Haskell?
Types in Haskell can often be inferred, in which case they don't need to be explicit. But you're right that they're static, and that is actually why they don't matter at run time, because static means that the compiler makes sure everything has the type that it should before your program ever executes.

Types can be erased in Haskell because the type of an expression is either known at compile time (like True) or does not matter at runtime (like []).
There's a caveat to this, though: it assumes that all values have some kind of uniform representation. Most Haskell implementations use pointers for everything, so the actual type of what a pointer points to doesn't matter (except for the garbage collector), but you could imagine a Haskell implementation that uses a non-uniform representation, and then some type information would have to be kept.

Others have already answered, but perhaps some examples can help.
Python, for instance, retains type information until runtime:
>>> def f(x):
...     if type(x)==type(0):
...         return (x+1,x)
...     else:
...         return (x,x)
...
>>> f("hello")
('hello', 'hello')
>>> f(10)
(11, 10)
The function above, given any argument x returns the pair (x,x), except when x is of type int. The function tests for that type at runtime, and if x is found to be an int it behaves in a special way, returning (x+1, x) instead.
To realize the above, the Python runtime must keep track of types. That is, when we do
>>> x = 5
Python can not just store the byte representation of 5 in memory. It also needs to mark that representation with a type tag int, so that when we do type(x) the tag can be recovered.
Further, before doing any operation such as x+1 Python needs to check the type tag to ensure we are really working on ints. If x is for instance a string, Python will raise an exception.
Statically checked languages such as Java do not need such checks at runtime. For instance, when we run
SomeClass x = new SomeClass(42);
x.foo();
the compiler has already checked there's indeed a method foo for x at compile time, so there's no need to do that again. This can improve performance, in principle. (Actually, the JVM does some runtime checks at class load time, but let's ignore those for the sake of simplicity)
In spite of the above, Java has to store type tags like Python does, since it has an analogue of type(-):
if (x instanceof SomeClass) { ...
Hence, Java allows one to write functions which can behave "specially" on some types.
// this is a "generic" function, using a type parameter A
<A> A foo(A x) {
    if (x instanceof B) { // B is some arbitrary class
        B b = (B) x;
        return (A) new B(b.get()+1);
    } else {
        return x;
    }
}
The above function foo() just returns its argument, except when it's of type B, for which a new object is created instead. This is a consequence of using instanceof, which requires every object to carry a tag at runtime.
To be honest, such a tag is already present to be able to implement virtual methods, so it does not cost anything more. Yet, the presence of instanceof makes it possible to cause the above non-uniform behaviour on types -- some types can be handled differently.
Haskell, instead, has no such type/instanceof operator. A parametric Haskell function having type
foo :: a -> (a,a)
must behave in the same way at all types. There's no way to cause some "special" behaviour. Concretely, foo x must return (x,x), and we can see this just by looking at the type annotation above. To stress the point, there's no need to look at the code (!!) to prove such property. This is what parametricity ensures from the type above.

Implementations of dynamically typed languages typically need to store type information with each value in memory. This isn't too bad for Lisp-like languages that have just a few types and can reasonably identify them with a few tag bits (although such limited types lead to other efficiency issues). It's much worse for a language with lots of types. Haskell lets you carry type information to runtime, but it forces you to be explicit about it, so you can pay attention to the cost. For example, adding the context Typeable a to a function's type gives that function access, at runtime, to a representation of the type a. More subtly, typeclass instance dictionaries are usually specialized away at compile time, but in sufficiently polymorphic or complex cases may survive to runtime. In a compiler like the apparently-abandoned JHC, and one likely possibility for the as-yet-barely-started compiler THC, this could lead to some type information leaking to runtime in the form of pointer tagging. But these situations are fairly easy to identify and only rarely cause serious performance problems.
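A small sketch of that difference, using cast from Data.Typeable. The function names dup and bump are invented for this example; bump is meant to mirror the Python f from the earlier answer. Without a constraint the function cannot inspect its argument; with Typeable a in the signature the type-based special case becomes possible, and the signature advertises it.

```haskell
import Data.Maybe (fromMaybe)
import Data.Typeable (Typeable, cast)

-- Fully parametric: nothing is known about a, so the argument can only
-- be passed along.
dup :: a -> (a, a)
dup x = (x, x)

-- The Typeable constraint asks for a runtime representation of a, which
-- makes an Int-only special case possible.
bump :: Typeable a => a -> (a, a)
bump x = fromMaybe (x, x) $ do
  n <- cast x :: Maybe Int -- succeeds only when a is Int
  y <- cast (n + 1)        -- convert the Int result back to a
  pure (y, x)

main :: IO ()
main = do
  print (dup "hello")      -- ("hello","hello")
  print (bump "hello")     -- ("hello","hello")
  print (bump (10 :: Int)) -- (11,10)
```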

Are typeclasses essential?

I once asked a question on haskell beginners about whether to use data/newtype or a typeclass. In my particular case it turned out that no typeclass was required. Additionally, Tom Ellis gave me brilliant advice on what to do when in doubt:
The simplest way of answering this which is mostly correct is:
use data
I know that typeclasses can make a few things a bit prettier, but not much AFAIK. It also strikes me that typeclasses are mostly used for brain stem stuff, whereas in newer stuff, new typeclasses hardly ever get introduced and everything is done with data/newtype.
Now I wonder if there are cases where typeclasses are absolutely required and things could not be expressed with data/newtype?
Answering a similar question on StackOverflow, Gabriel Gonzalez said
Use type classes if:
There is only one correct behavior per given type
The type class has associated equations (i.e. "laws") that all instances must satisfy
Hmm ..
Or are typeclasses and data/newtype somewhat competing concepts which coexist for historical reasons?

I would argue that typeclasses are an essential part of Haskell.
They are the part of Haskell that makes it the easiest language I know of to refactor, and they are a great asset to your being able to reason about the correctness of code.
So, let's talk about dictionary passing.
Now, any sort of dictionary passing is a big improvement over the state of affairs in traditional object oriented languages. We know how to do OOP with vtables in C++. However, the vtable is 'part of the object' in OOP languages. Fusing the vtable with the object forces your code into a form where you have a rigid discipline about who can extend the core types with new features; it's really only the original author of the class who has to incorporate all the things others want to bake into their type. This leads to "lava flow code" and all sorts of other design antipatterns, etc.
Languages like C# give you the ability to hack in extension methods to fake new stuff, and "traits" in languages like scala and multiple inheritance in other languages let you delegate some of the work as well, but they are partial solutions.
When you split the vtable from the objects they manipulate you get a heady rush of power. You can now pass them around wherever you want, but then of course you need to name them and talk about them. The ML discipline around modules / functors and the explicit dictionary passing style take this approach.
Typeclasses take a slightly different tack. We rely on uniqueness of a typeclass instance for a given type, and it is in large part this choice that permits us to get away with such simple core data types.
Why?
Because we can move the use of the dictionaries to the use sites, and don't have to carry them around with the data types and we can rely upon the fact that when we do so nothing has changed about the behavior of the code.
Mechanical translation of the code to more complex manually passed dictionaries loses the uniqueness of such a dictionary at a given type. Passing the dictionaries in at different points in your program now leads to programs with greatly differing behavior. You may or may not have to remember the dictionaries your data type was constructed with, and woe betide you if you want to have conditional behavior based on what your arguments are.
For simple examples like Set you can get away with a manual dictionary translation. The price doesn't seem so high. You have to bake in the dictionary for, say, how you want to sort the Set when you make the object, and then insert/lookup would just preserve your choice. This might be a cost you can bear. When you union two Sets now, of course, it's up in the air which ordering you get. Maybe you take the smaller and insert it into the larger, but then the ordering would change willy nilly, so instead you have to take, say, the left and always insert it into the right, or document this haphazard behavior. You're now being forced into suboptimal performing solutions in the interest of 'flexibility'.
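The Set scenario above can be sketched with a toy set that carries its own comparison function. This is not Data.Set; the types OrdDict and DSet and all the functions below are hypothetical, and the sorted-list representation is picked only to keep the example short. The union here takes the "always insert the left into the right" option mentioned above, so the resulting order depends on argument order.

```haskell
-- Hypothetical explicit "Ord dictionary"; all names here are illustrative.
newtype OrdDict a = OrdDict { cmp :: a -> a -> Ordering }

-- A set as a sorted list that carries the dictionary it was built with.
data DSet a = DSet (OrdDict a) [a]

insertD :: a -> DSet a -> DSet a
insertD x (DSet d xs) = DSet d (go xs)
  where
    go [] = [x]
    go (y : ys) = case cmp d x y of
      LT -> x : y : ys
      EQ -> y : ys
      GT -> y : go ys

fromListD :: OrdDict a -> [a] -> DSet a
fromListD d = foldr insertD (DSet d [])

toListD :: DSet a -> [a]
toListD (DSet _ xs) = xs

-- Union must pick one of the two dictionaries; this sketch keeps the
-- right-hand one and re-inserts every element of the left set.
unionD :: DSet a -> DSet a -> DSet a
unionD (DSet _ xs) right = foldr insertD right xs

ascending, descending :: DSet Int
ascending = fromListD (OrdDict compare) [3, 1, 2]
descending = fromListD (OrdDict (flip compare)) [5, 4, 6]

main :: IO ()
main = do
  print (toListD (unionD ascending descending)) -- [6,5,4,3,2,1]
  print (toListD (unionD descending ascending)) -- [1,2,3,4,5,6]
```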
But Set is a trivial example. There you might bake an index into the type about which instance you are using, since there is only one class involved. What happens when you want more complex behavior? One of the things we do with Haskell is work with monad transformers. Now you have lots of instances floating around, and you don't have a good place to store them all: MonadReader, MonadWriter, MonadState, etc. may all apply, conditionally, based on the underlying monad. What happens when you hoist and swap it out and now different things may or may not apply?
Carrying around explicit dictionaries for this is a lot of work: there isn't a good place to store them, and you are asking users to adopt a global program transformation to adopt this practice.
These are the things that typeclasses make effortless.
Do I believe you should use them for everything?
Not by a long shot.
But I can't agree with the other replies here that they are inessential to Haskell.
Haskell is the only language that supplies them and they are critical to at least my ability to think in this language, and are a huge part of why I consider Haskell home.
I do agree with a few things here, use typeclasses when there are laws and when the choice is unambiguous.
I'd challenge however, that if you don't have laws or if the choice isn't unambiguous, you may not know enough about how to model the problem domain, and should be seeking something for which you can fit it into the typeclass mold, possibly even into existing abstractions -- and when you finally find that solution, you'll find you can easily reuse it.

Typeclasses are, in most cases, inessential. Any typeclass code can be mechanically converted into dictionary-passing style. They mainly provide convenience, sometimes an essential amount of convenience (cf. kmett's answer).
Sometimes the single-instance property of typeclasses is used to enforce invariants. For example, you could not convert Data.Set into dictionary-passing style safely, because if you inserted twice with two different Ord dictionaries, you could break the data structure invariant. Of course you could still convert any working code to working code in dictionary-passing style, but you would not be able to outlaw as much broken code.
Laws are another important cultural aspect to typeclasses. The compiler does not enforce laws, but Haskell programmers expect typeclasses to come with laws that all the instances satisfy. This can be leveraged to provide stronger guarantees about some functions. This advantage comes only from the conventions of the community, and is not a formal property of the language.
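As an illustration of the mechanical conversion mentioned at the start of this answer, here is a sketch with a made-up class Combine and its dictionary counterpart CombineDict (neither is a library type). The class version allows one instance per type, chosen by the compiler; the record version lets the caller pass any dictionary, which is the extra freedom, and the lost guarantee, described above.

```haskell
-- Type-class version: at most one instance per type.
class Combine a where
  neutral :: a
  combine :: a -> a -> a

instance Combine Int where
  neutral = 0
  combine = (+)

combineAll :: Combine a => [a] -> a
combineAll = foldr combine neutral

-- Dictionary-passing version: the class becomes a record and the
-- constraint becomes an ordinary argument.
data CombineDict a = CombineDict
  { neutralD :: a
  , combineD :: a -> a -> a
  }

combineAllD :: CombineDict a -> [a] -> a
combineAllD d = foldr (combineD d) (neutralD d)

-- Nothing prevents two different dictionaries for the same type.
sumDict, productDict :: CombineDict Int
sumDict = CombineDict 0 (+)
productDict = CombineDict 1 (*)

main :: IO ()
main = do
  print (combineAll [1, 2, 3, 4 :: Int])      -- 10
  print (combineAllD sumDict [1, 2, 3, 4])     -- 10
  print (combineAllD productDict [1, 2, 3, 4]) -- 24
```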

To answer that part of the question:
"typeclasses and data/newtype somewhat competing concepts"
No. Typeclasses are an extension to the type system, that allows you to make constraints on polymorphic arguments. Like most things in programming, they are, of course, syntactic sugar [so they aren't essential in the sense that their use can't be replaced by anything else]. That doesn't mean they're superfluous. It just means you could express similar things using other language facilities, but you'd lose some clarity while you're at it. Dictionary passing can be used for mostly the same things, but it's ultimately less strict in the type system because it allows changing behavior at runtime (which is also an excellent example of where you'd use dictionary passing instead of type classes).
Data and newtype still mean exactly the same thing whether you have typeclasses or not: they introduce a new type, in the case of data as a new kind of data structure, and in the case of newtype as a typesafe variant of type.

To expand slightly on my comment I would suggest always starting by using data and dictionary passing. If the boilerplate and manual instance plumbing becomes too much to bear then consider introducing a typeclass. I suspect this approach generally leads to a cleaner design.

I just want to make a really mundane point about syntax.
People tend to underestimate the convenience afforded by type classes, probably because they have never tried Haskell without using any. This is a "the grass is greener on the other side of the fence" sort of phenomenon.
while :: Monad m -> m Bool -> m a -> m ()
while m p body = (>>=) m p $ \x ->
                 if x
                 then (>>) m body (while m p body)
                 else return m ()
average :: Floating a -> a -> a -> a -> a
average f a b c = (/) f ((+) (floatingToNum f) a ((+) (floatingToNum f) b c))
                        (fromInteger (floatingToNum f) 3)
This is the historical motivation for type classes and it remains valid today. If we didn't have type classes, we'd certainly need some kind of replacement for it to avoid writing monstrosities like these. (Maybe something like record puns or Agda's "open".)
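For anyone who wants to try this style in today's Haskell, here is a rough, compilable sketch. The record types MonadDict and FracDict and all the D-suffixed names are invented for this example, and they cover only the operations the two functions need.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.IORef (modifyIORef, newIORef, readIORef)

-- Hypothetical stand-in for the Monad class: a record of its operations.
data MonadDict m = MonadDict
  { bindD   :: forall a b. m a -> (a -> m b) -> m b
  , returnD :: forall a. a -> m a
  }

-- Hypothetical stand-in for the arithmetic that average needs.
data FracDict a = FracDict
  { addD         :: a -> a -> a
  , divD         :: a -> a -> a
  , fromIntegerD :: Integer -> a
  }

-- The "instances" are ordinary values that must be passed by hand.
ioDict :: MonadDict IO
ioDict = MonadDict (>>=) return

doubleDict :: FracDict Double
doubleDict = FracDict (+) (/) fromInteger

whileD :: MonadDict m -> m Bool -> m a -> m ()
whileD d p body =
  bindD d p $ \x ->
    if x
      then bindD d body (\_ -> whileD d p body)
      else returnD d ()

averageD :: FracDict a -> a -> a -> a -> a
averageD f a b c = divD f (addD f a (addD f b c)) (fromIntegerD f 3)

main :: IO ()
main = do
  counter <- newIORef (3 :: Int)
  -- Count down to zero using the explicit IO dictionary.
  whileD ioDict (fmap (> 0) (readIORef counter)) (modifyIORef counter (subtract 1))
  readIORef counter >>= print        -- prints 0
  print (averageD doubleDict 1 2 6)  -- prints 3.0
```

Every call site has to name its dictionary. That threading is exactly the noise type classes remove.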

I know that typeclasses can make a few things a bit prettier, but not much AFAIK.
Bit prettier?? No! Way prettier! (as others have already noted)
However the answer to this really depends very much where this question comes from.
If Haskell is your tool of choice for serious software engineering, typeclasses are
powerful and essential.
If you are a beginner using Haskell to learn (functional) programming, the complexity and difficulty of typeclasses can outweigh the advantages – certainly at the beginning of your studies.
Here are a couple of examples comparing GHC with Gofer (predecessor of Hugs,
predecessor of modern Haskell):
gofer
? 1 ++ [2,3,4]
ERROR: Type error in application
*** expression :: 1 ++ [2,3,4]
*** term :: 1
*** type :: Int
*** does not match :: [Int]
Now compare with ghc:
Prelude> 1 ++ [2,3,4]
<interactive>:2:1:
    No instance for (Num [a0]) arising from the literal `1'
    Possible fix: add an instance declaration for (Num [a0])
    In the first argument of `(++)', namely `1'
    In the expression: 1 ++ [2, 3, 4]
    In an equation for `it': it = 1 ++ [2, 3, 4]
<interactive>:2:7:
    No instance for (Num a0) arising from the literal `2'
    The type variable `a0' is ambiguous
    Possible fix: add a type signature that fixes these type variable(s)
    Note: there are several potential instances:
      instance Num Double -- Defined in `GHC.Float'
      instance Num Float -- Defined in `GHC.Float'
      instance Integral a => Num (GHC.Real.Ratio a)
        -- Defined in `GHC.Real'
      ...plus three others
    In the expression: 2
    In the second argument of `(++)', namely `[2, 3, 4]'
    In the expression: 1 ++ [2, 3, 4]
This should suggest that error-message-wise, not only are typeclasses not prettier, they can be uglier!
One can go all the way (in gofer) and use the 'simple prelude' that uses
no typeclasses at all. This makes it quite unrealistic for serious programming
but real neat for wrapping your head round Hindley-Milner:
Standard Prelude
? :t (==)
(==) :: Eq a => a -> a -> Bool
? :t (+)
(+) :: Num a => a -> a -> a
Simple Prelude
? :t (==)
(==) :: a -> a -> Bool
? :t (+)
(+) :: Int -> Int -> Int

What makes Haskell's type system more "powerful" than other languages' type systems?

Reading Disadvantages of Scala type system versus Haskell?, I have to ask: what is it, specifically, that makes Haskell's type system more powerful than other languages' type systems (C, C++, Java)? Apparently, even Scala can't match some of the capabilities of Haskell's type system. What is it, specifically, that makes Haskell's type system (Hindley–Milner type inference) so powerful? Can you give an example?

What is it, specifically, that makes Haskell's type system
It has been engineered for the past decade to be both flexible -- as a logic for property verification -- and powerful.
Haskell's type system has been developed over the years to encourage a relatively flexible, expressive static checking discipline, with several groups of researchers identifying type system techniques that enable powerful new classes of compile-time verification. Scala's is relatively undeveloped in that area.
That is, Haskell/GHC provides a logic that is both powerful and designed to encourage type level programming. Something fairly unique in the world of functional programming.
Some papers that give a flavor of the direction the engineering effort on Haskell's type system has taken:
Fun with type functions
Associated types with class
Fun with functional dependencies

Hindley-Milner is not a type system, but a type inference algorithm. Haskell's type system, back in the day, used to be able to be fully inferred using HM, but that ship has long sailed for modern Haskell with extensions. (ML remains capable of being fully inferred).
Arguably, the ability to mainly or entirely infer all types yields power in terms of expressiveness.
But that's largely not what I think the question is really about.
The papers that dons linked point to the other aspect -- that the extensions to Haskell's type system make it Turing complete (and that modern type families make that Turing-complete language much more closely resemble value-level programming). Another nice paper on this topic is McBride's Faking It: Simulating Dependent Types in Haskell.
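To give a flavor of what "resembling value-level programming" means, here is a minimal sketch of type-level addition with a closed type family. The names Nat, Add, ReifyNat and reify are made up for this example and do not come from any of the papers mentioned.

```haskell
{-# LANGUAGE DataKinds, KindSignatures, ScopedTypeVariables, TypeFamilies #-}

import Data.Proxy (Proxy (..))

-- Peano naturals; DataKinds promotes Z and S to the type level.
data Nat = Z | S Nat

-- Type-level addition, written much like the value-level function would be.
type family Add (m :: Nat) (n :: Nat) :: Nat where
  Add 'Z     n = n
  Add ('S m) n = 'S (Add m n)

-- Hypothetical helper class that brings a type-level Nat back to a value.
class ReifyNat (n :: Nat) where
  reify :: Proxy n -> Integer

instance ReifyNat 'Z where
  reify _ = 0

instance ReifyNat n => ReifyNat ('S n) where
  reify _ = 1 + reify (Proxy :: Proxy n)

type One = 'S 'Z
type Two = 'S ('S 'Z)

main :: IO ()
main = print (reify (Proxy :: Proxy (Add Two One))) -- prints 3
```

The addition itself happens during type checking. The class is only there so the result can be printed.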
The paper in the other thread on Scala: "Type Classes as Objects and Implicits" goes into why you can in fact do most of this in Scala as well, although with a bit more explicitness. I tend to feel, but this is more a gut sense than from real Scala experience, that its more ad-hoc and explicit approach (what the C++ discussion called "nominal") is ultimately a bit messier.

Let's go with a very simple example: Haskell's Maybe.
data Maybe a = Nothing | Just a
In C++:
template <typename T>
struct Maybe {
    bool isJust;
    T value; // IMPORTANT: must ignore when !isJust
};
Let's consider these two function signatures, in Haskell:
sumJusts :: Num a => [Maybe a] -> a
and C++:
template <typename T> T sumJusts(std::vector<Maybe<T> >);
Differences:
In C++ there are more possible mistakes to make. The compiler doesn't check the usage rule of Maybe.
The C++ type of sumJusts does not specify that it requires + and a conversion from 0. The error messages that show up when things do not work are cryptic and odd. In Haskell the compiler will just complain that the type is not an instance of Num, which is very straightforward.
In short, Haskell has:
ADTs
Type-classes
A very friendly syntax and good support for generics (which in C++ people try to avoid because of all their cryptickynessishisms)
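For completeness, here is one way the Haskell sumJusts could be written. The body is an assumption, since only the signature is given above. catMaybes is the standard function from Data.Maybe.

```haskell
import Data.Maybe (catMaybes)

-- Drop every Nothing, then add up what is left.
-- The Num constraint is all the compiler needs to know about a.
sumJusts :: Num a => [Maybe a] -> a
sumJusts = sum . catMaybes

main :: IO ()
main = do
  print (sumJusts ([Just 1, Nothing, Just 2] :: [Maybe Int]))        -- prints 3
  print (sumJusts ([Just 1.5, Nothing, Just 2.5] :: [Maybe Double])) -- prints 4.0
```

Calling it on a list of Maybe String is rejected with a missing Num instance error, which is the straightforward complaint described above.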

The Haskell language allows you to write safer code without giving up functionality. Most languages nowadays trade features for safety: Haskell is there to show that it's possible to have both.
We can live without null pointers, explicit casts and loose typing and still have a perfectly expressive language, able to produce efficient final code.
Moreover, the Haskell type system, along with its lazy-by-default and purity approach to coding, gives you a boost in complicated but important matters like parallelism and concurrency.
Just my two cents.

One thing I really like and miss in other languages is the support of typeclasses, which are an elegant solution for many problems (including for instance polyvariadic functions).
Using typeclasses, it's extremely easy to define very abstract functions, which are still completely type-safe - like for instance this Fibonacci-function:
fibs :: Num a => [a]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
For instance:
map (`div` 2) fibs -- Integral context
(fibs !! 10) + 1.234 -- Fractional context
map (:+ 1.0) fibs -- Complex context (needs import Data.Complex)
You may even define your own numeric type for this.
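As a sketch of that last remark, here is a made-up numeric type, Mod7, for integers modulo 7. The type and its instance are assumptions for illustration only. The same fibs then works for it unchanged.

```haskell
-- Hypothetical numeric type: integers modulo 7.
newtype Mod7 = Mod7 Integer deriving (Eq, Show)

instance Num Mod7 where
  Mod7 a + Mod7 b = Mod7 ((a + b) `mod` 7)
  Mod7 a * Mod7 b = Mod7 ((a * b) `mod` 7)
  negate (Mod7 a) = Mod7 (negate a `mod` 7)
  abs             = id
  signum (Mod7 0) = Mod7 0
  signum _        = Mod7 1
  fromInteger n   = Mod7 (n `mod` 7)

-- The same polymorphic definition; nothing here knows about Mod7.
fibs :: Num a => [a]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

main :: IO ()
main = print (take 10 fibs :: [Mod7])
-- The wrapped values are 0,1,1,2,3,5,1,6,0,6 (Fibonacci numbers mod 7).
```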

What is expressiveness? To my understanding, it is what constraints the type system allows us to put on our code, or in other words, which properties of the code we can prove. The more expressive a type system is, the more information we can embed at the type level (which can be used at compile time by the type-checker to check our code).
Here are some properties of Haskell's type system that other languages don't have.
Purity.
Purity allows Haskell to distinguish pure code from IO-capable code.
Parametricity.
Haskell enforces parametricity for parametrically polymorphic functions, so they must obey some laws. (Some languages do let you express polymorphic function types but don't enforce parametricity; for example, Scala lets you pattern match on a specific type even if the argument is polymorphic.)
ADT
Extensions
Haskell's base type system is a weaker version of λ2, which by itself isn't really impressive. But with these extensions it becomes really powerful (even able to express dependent types with singletons):
existential types
rank-n types (full λ2)
type families
data kinds (allows "typed" programming at type level)
GADT
...
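To make a couple of these extensions concrete, here is a minimal sketch combining data kinds and GADTs into a length-indexed vector. The names Vec, safeHead and toList are chosen for this example and do not come from a particular library.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

import Data.Kind (Type)

-- Peano naturals, promoted to the type level by DataKinds.
data Nat = Z | S Nat

-- A list whose length is tracked in its type.
data Vec (n :: Nat) (a :: Type) where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- Only accepts non-empty vectors, so no runtime check is needed.
safeHead :: Vec ('S n) a -> a
safeHead (VCons x _) = x

toList :: Vec n a -> [a]
toList VNil         = []
toList (VCons x xs) = x : toList xs

main :: IO ()
main = do
  let v = VCons 1 (VCons 2 VNil) :: Vec ('S ('S 'Z)) Int
  print (safeHead v) -- prints 1
  print (toList v)   -- prints [1,2]
  -- safeHead VNil would be rejected by the type checker.
```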
