I'm reading Learn You a Haskell and I've already covered Applicative, and now I'm on monoids. I have no problem understanding both of them, although I find Applicative useful in practice while Monoid doesn't seem nearly as useful. So I think I'm missing something about Haskell.
First, speaking of Applicative, it provides something like a uniform syntax for performing various actions on 'containers'. So we can use normal functions to perform actions on Maybe, lists, IO (should I have said monads? I don't know monads yet), and functions:
λ> :m + Control.Applicative
λ> (+) <$> (Just 10) <*> (Just 13)
Just 23
λ> (+) <$> [1..5] <*> [1..5]
[2,3,4,5,6,3,4,5,6,7,4,5,6,7,8,5,6,7,8,9,6,7,8,9,10]
λ> (++) <$> getLine <*> getLine
one line
and another one
"one line and another one"
λ> (+) <$> (* 7) <*> (+ 7) $ 10
87
So Applicative is an abstraction. I think we could live without it, but it helps express some ideas more clearly, and that's fine.
Now, let's take a look at Monoid. It is also an abstraction, and a pretty simple one. But does it help us? For every example from the book it seems obvious that there is a clearer way to do things:
λ> :m + Data.Monoid
λ> mempty :: [a]
[]
λ> [1..3] `mappend` [4..6]
[1,2,3,4,5,6]
λ> [1..3] ++ [4..6]
[1,2,3,4,5,6]
λ> mconcat [[1,2],[3,6],[9]]
[1,2,3,6,9]
λ> concat [[1,2],[3,6],[9]]
[1,2,3,6,9]
λ> getProduct $ Product 3 `mappend` Product 9
27
λ> 3 * 9
27
λ> getProduct $ Product 3 `mappend` Product 4 `mappend` Product 2
24
λ> product [3,4,2]
24
λ> getSum . mconcat . map Sum $ [1,2,3]
6
λ> sum [1..3]
6
λ> getAny . mconcat . map Any $ [False, False, False, True]
True
λ> or [False, False, False, True]
True
λ> getAll . mconcat . map All $ [True, True, True]
True
λ> and [True, True, True]
True
So we have noticed some patterns and created a new type class... Fine, I like math. But from a practical point of view, what is the point of Monoid? How does it help us express ideas better?
Gabriel Gonzalez wrote a great blog post about why you should care, and you truly should care. You can read it here (and also see this).
It's about scalability, architecture & API design. The idea is that there's the "Conventional architecture" that says:
Combine several components together of type A to generate a "network" or "topology" of type B
The issue with this kind of design is that as your program scales, so does your hell when you refactor.
So you want to change module A to improve your design or domain, so you do. Oh, but now modules B & C that depend on A broke. You fix B, great. Now you fix C. Now B broke again, as B also used some of C's functionality. And I can go on with this forever, and if you've ever used OOP - so can you.
Then there's what Gabriel calls the "Haskell architecture":
Combine several components together of type A to generate a new component of the same type A, indistinguishable in character from its substituent parts
This solves the issue, elegantly too. Basically: do not layer your modules or extend to make specialized ones.
Instead, combine.
So now, what's encouraged is that instead of saying things like "I have multiple X, so let's make a type to represent their union", you say "I have multiple X, so let's combine them into an X". Or in simple English: "Let's make composable types in the first place." (Do you sense the monoids lurking yet?)
Imagine you want to make a form for your webpage or application, and you have the module "Personal Information Form" that you created because you needed personal information. Later you found that you also need a "Change Picture Form", so you quickly wrote that. And now you say, "I want to combine them", so let's make a "Personal Information & Picture Form" module. And in real-life scalable applications this can and does get out of hand. Probably not with forms, but to demonstrate: you need to compose and compose, so you will end up with a "Personal Information & Change Picture & Change Password & Change Status & Manage Friends & Manage Wishlist & Change View Settings & Please don't extend me anymore & please & please stop! & STOP!!!!" module. This is not pretty, and you will have to manage this complexity in the API. Oh, and if you want to change anything - it probably has dependencies. So.. yeah.. Welcome to hell.
Now let's look at the other option, but first let's look at the benefit because it will guide us to it:
These abstractions scale limitlessly because they always preserve combinability, therefore we never need to layer further abstractions on top. This is one reason why you should learn Haskell: you learn how to build flat architectures.
Sounds good. So, instead of making a "Personal Information Form" / "Change Picture Form" module, stop and think whether we can make anything here composable. Well, we can just make a "Form", right? That would be more abstract, too.
Then it can make sense to construct one for everything you want, combine them together and get one form just like any other.
And so, you don't get a messy, complex tree anymore, because the key is that you take two forms and get one form: Form -> Form -> Form. And as you can already see, this is exactly the signature of mappend.
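To make that concrete, here is a minimal, hedged sketch - the Form type, its field list, and the example forms are all hypothetical, just to show the shape of the combinator:

-- A toy Form: just a list of field names it renders.
newtype Form = Form { fields :: [String] }
  deriving Show

-- Combining two forms gives back a Form, indistinguishable from its parts.
instance Semigroup Form where
  Form xs <> Form ys = Form (xs ++ ys)

instance Monoid Form where
  mempty = Form []          -- the empty form

personalInfoForm, changePictureForm :: Form
personalInfoForm  = Form ["name", "email"]
changePictureForm = Form ["picture"]

profileForm :: Form
profileForm = personalInfoForm <> changePictureForm
-- fields profileForm == ["name","email","picture"]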
The alternative, conventional architecture would probably look like a -> b -> c, and then c -> d -> e, and then...
Now, with forms it's not so challenging; the challenge is to work with this in real world applications. And to do that simply ask yourself as much as you can (because it pays off, as you can see): How can I make this concept composable? and since monoids are such a simple way to achieve that (we want simple) ask yourself first: How is this concept a monoid?
Sidenote: Thankfully, Haskell will very much discourage you from extending types, as it is a functional language (no inheritance). But it's still possible to make a type for something, another type for something else, and in a third type to have both types as fields. If this is for composition - see if you can avoid it.
Fine, I like math. But from a practical point of view, what is the point of Monoid? How does it help us express ideas better?
It's an API. A simple one. For types that support:
having a zero element
having an append operation
Lots of types support these operations. So having a name for the operations and an API helps us capture the fact more clearly.
APIs are good because they let us reuse code, and reuse concepts. Meaning better, more maintainable code.
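Roughly, and slightly simplified from what's in current base (where Monoid now has Semigroup as a superclass and mappend defaults to <>), that API is just:

class Monoid a where
  mempty  :: a            -- the "zero" element
  mappend :: a -> a -> a  -- the append operation
  mconcat :: [a] -> a     -- fold a list with mappend; has a default definition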
A very simple example is foldMap. Just by plugging different monoids into this single function, you can compute:
the first and the last element,
the sum or the product of elements (from this also their average etc.),
check if all elements or any has a given property,
compute the maximal or minimal element,
map the elements to a collection (like lists, sets, strings, Text, ByteString or ByteString Builder) and concatenate them together - they're all monoids.
Moreover, monoids are composable: if a and b are monoids, so is (a, b). So you can easily compute several different monoidal values in one pass (like the sum and the count of elements when computing their average, etc.).
And although you can do all this without monoids, using foldr or foldl, it's much more cumbersome and also often less efficient: for example, if you have a balanced binary tree and you want to find its minimum and maximum element, you can't do both efficiently with foldr (or both with foldl); one of them will always be O(n), while using foldMap with appropriate monoids is O(log n) in both cases.
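As a hedged sketch of both points (foldMap with different monoids, and the pair monoid for a one-pass average - the name average is made up for this example):

-- A couple of the bullet points above, as one-liners:
--   getSum (foldMap Sum [1,2,3])            == 6
--   getAny (foldMap (Any . even) [1,3,6])   == True

import Data.Monoid (Sum(..), Any(..))

-- One traversal computes both the sum and the count,
-- because (Sum Double, Sum Int) is itself a monoid.
average :: [Double] -> Double
average xs = total / fromIntegral count
  where
    (Sum total, Sum count) = foldMap (\x -> (Sum x, Sum (1 :: Int))) xs

-- average [1, 2, 3, 4] == 2.5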
And this was all just a single function, foldMap! There are many other interesting applications. To give one: exponentiation by squaring is an efficient way of computing powers, but it's not actually tied to computing powers. You can implement it for any monoid, and if its <> is O(1), you have an efficient way of computing the n-fold combination x <> ... <> x. And suddenly you can do efficient matrix exponentiation and compute the n-th Fibonacci number with just O(log n) multiplications. See times1p in the semigroups package (nowadays stimes in Data.Semigroup).
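A minimal sketch of that idea (base already provides this as stimes / mtimesDefault; the mtimes below is just for illustration and assumes n >= 0):

-- Exponentiation by squaring for any Monoid: O(log n) uses of <>.
mtimes :: Monoid m => Integer -> m -> m
mtimes 0 _ = mempty
mtimes n x
  | even n    = half <> half
  | otherwise = x <> half <> half
  where half = mtimes (n `div` 2) x

-- mtimes 5 (Product 2) == Product 32
-- mtimes 4 "ab"        == "abababab"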
See also Monoids and Finger Trees.
The point of it is that when you tag an Int as Product, you express your intent for the integers to be multiplied, and by tagging them as Sum, to be added together.
Then you can use the same mconcat on both. This is used e.g. in Foldable, where a single foldMap expresses the idea of folding over a containing structure while combining the elements in the way of a specific monoid.
Related
This question is related to this post: Understanding do notation for simple Reader monad: a <- (*2), b <- (+10), return (a+b)
I don't care if a language is hard to understand if it promises to solve some problems that easy-to-understand languages give us. I've been promised that the impossibility of changing state in Haskell (and other functional languages) is a game changer, and I do believe that. I've had too many bugs in my code related to state, and I totally agree with this post that reasoning about the interaction of objects in OOP languages is near impossible, because they can change state, and thus in order to reason about code we have to consider all the possible permutations of these states.
However, I've been finding that reasoning about Haskell monads is also very hard. As you can see in the answers to the question I linked, we need a big diagram to understand 3 lines of the do notation. I always end up opening stackedit.io to desugar the do notation by hand and write step by step the >>= applications of the do notation in order to understand the code.
The problem is more or less like this: in the majority of cases, when we have S a >>= f we have to unwrap a from S and apply f to it. However, f is actually another thing more or less of the form S a >>= g, which we also have to unwrap, and so on. The human brain doesn't work like that; we can't easily apply these things in our head, stop, keep them on the brain's stack, and continue applying the rest of the >>= until we reach the end. When the end is reached, we get all those things stored on the brain's stack and glue them together.
Therefore, I must be doing something wrong. There must be an easy way to understand '>>= composition' in the brain. I know that do notation is very simple, but I can only think of that as a way to easily write >>= compositions. When I see the do notation I simply translate it to a bunch of >>=. I don't see it as a separate way of understanding code. If there is a way, I'd like someone to tell me.
So the question is: how to read the do notation?
Given a simple code like
foo :: Monad m => m Int -> m Int -> m Int
foo x y = do
    a <- y -- I'm intentionally doing y first; see the Either example
    b <- x
    return (a + b)
you can't say much about <- except that it "gets" an Int value from x or y. What "get" means depends very much on what m is.
Some examples:
m ~ Maybe
foo (Just 3) (Just 5) evaluates to Just 8; replace either argument with Nothing, and you get Nothing. <- tries to get a value out of the Maybe Int value, but aborts the rest of the block if it fails.
m ~ Either a
Pretty much the same as Maybe, but replacing Nothing with the first Left value that it encounters. foo (Right 3) (Right 5) returns Right 8. foo x (Left "foo") returns Left "foo", whether x is a Right or Left value.
m ~ []
Now, instead of getting an Int, <- gets every Int from among the given choices. It does so nondeterministically; you can imagine that the function "forks" into multiple parallel copies, each one having chosen a different value from its list. In the end, the final result is a list of all the results that were computed.
foo [1,2] [3,4] returns [4, 5, 5, 6] ([3 + 1, 3 + 2, 4 + 1, 4 + 2]).
m ~ IO
This one is tricky, because unlike the previous monads we've looked at, there isn't necessarily a value yet to get. foo readLn readLn will return whatever the sum of the two numbers read from standard input is, with the possibility of a run-time error should the strings so read not be parseable as Int values.
You might think of it as working like the Maybe monad, but with run-time exceptions replacing Nothing.
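A quick GHCi transcript for the first three monads, re-entering foo from above at the prompt (nothing here beyond the examples already described):

λ> foo x y = do { a <- y; b <- x; return (a + b) }
λ> foo (Just 3) (Just 5)
Just 8
λ> foo (Right 3) (Left "foo") :: Either String Int
Left "foo"
λ> foo [1,2] [3,4]
[4,5,5,6]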
Part 1: no need to go into the weeds
There is actually a very simple, easy-to-grasp intuition behind monads: they encode the order of stuff happening. Like, first do this thing, then do the other thing, then do the third thing. For example:
executeMadDoctrine = do
    wait oneYear
    s <- evaluatePoliticalSituation
    case s of
        Stable -> do
            printInNewspapers "We're going to live another day"
            executeMadDoctrine -- recursive call
        Unstable -> do
            printInNewspapers "Run for your lives"
            launchMissiles
            return ()
Or a slightly more realistic (and also compilable and executable) example:
main = do
    putStrLn "What's your name?"
    name <- getLine
    if name == "EXIT" then
        return ()
    else do
        putStrLn $ "Hi, " <> name
        main
Simple. Just like Python. The human brain does, indeed, work exactly like this.
You see, you don't need to know how it all works inside, unless you start doing more advanced things. After all, you're probably not thinking about the order of cylinders firing every time you start your car, are you? You just hit the gas and it goes. It's the same with do.
Part 2: you picked a bad example
The example you picked in your previous question is not the best candidate for this stuff. The Monad instance for functions is indeed a bit brain-wrecking. Even I have to make a little effort to understand what's going on - and I've been doing Haskell professionally for quite some time.
The trouble here is mathematics. The bloody thing turns out unreasonably effective time after time, especially when nobody is asking it to.
Think about this: first we had perfectly good natural numbers that we could very well understand. I have two eyes, and you have one sword, I better run. But then it turned out that we need zero. Why the bloody hell do we need it? It's sacrilege! You can't write down something that isn't! But it turns out you have to have it. It unambiguously follows from the other stuff we know is true. And then we got irrational numbers. WTF is that? How do I even understand it? I can't have π oranges after all, can I? But they too must exist. It just follows. No way around it. And then complex numbers, transcendental, hypercomplex, unconstructible... My brain is boiling at this point.
It's sort of the same with monads: there is this peculiar mathematical object, and at some point somebody noticed that it's very good for expressing the order of computation, so we appropriated monads for that. But then it turns out that all kinds of things can be made to look like monads, mathematically speaking. No way around it, it just is.
And so we have all these funny instances. And the do notation still works for them, because they're monads (mathematically speaking), but it's no longer about order. Like, did you know that lists were monads too? But just like with functions, the interpretation for lists is not "order", it's nested loops. And if you combine lists with something else, you get non-determinism. Fun stuff.
But just like with different kinds of numbers, you can learn. You can build up intuition over time. Do you absolutely have to? See part 1.
Any long do chain can be re-arranged into an equivalent binary do, by the associativity law of monads, grouping everything on the right, as
do { A ; B ; C ; ... }
===
do { A ; r <- do { B ; C ; ... } ; return r }.
So we only need to understand this binary do form to understand everything else. And that is expressed as a single >>= combination.
Then, treat do code's interpretation (for a particular monad) axiomatically instead, as a bunch of re-write rules. Convince yourself about the validity of those rules for a particular monad just once (yes, using possibly extensive >>=-based re-writes, once).
So, for the Reader monad from the question's linked entry,
(do { S f }) x                      ===  f x

(do { a <- S f ;
      return (h a) }) x             ===  let { a = f x } in h a
                                    ===  h (f x)

(do { a <- S f ;
      b <- S g ;
      return (h a b) }) x           ===  let { a = f x ; b = g x } in h a b
                                    ===  h (f x) (g x)
and any longer chain of lets is expressible as nested binary lets, equivalently.
The last one is liftM2 actually, so an argument could be made that understanding a particular monad means understanding its particular liftM2 (*), really.
And those Ss, we end up just ignoring them as noise, forced on us by Haskell syntax (well, that question didn't use them at all, but it could).
(*) more precisely, liftBind, (do { a <- S f ; b <- k a ; return (h a b) }) x === let {a = f x ; b = g x } in h a b where (S g) x === k a x. (specifically, this, after the words "the long version")
And so, your attitude of "When I see the do notation I simply translate it to a bunch of >>=. I don't see it as a separate way of understanding code" could actually be the problem.
do notation is your friend. Personally, I first hated it, then learned to love it, and now I see the >>=-based re-writes as its (low-level) implementation, more and more.
And, even more abstractly, do can equivalently be written as Monad Comprehensions, looking just like list comprehensions!
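For example, with the MonadComprehensions language extension, the Maybe computation from earlier can be written as a comprehension (sumMaybes is just a made-up name for this sketch):

{-# LANGUAGE MonadComprehensions #-}

sumMaybes :: Maybe Int -> Maybe Int -> Maybe Int
sumMaybes x y = [ a + b | a <- y, b <- x ]

-- sumMaybes (Just 3) (Just 5)  ==  Just 8
-- sumMaybes Nothing  (Just 5)  ==  Nothing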
@chepner has already included in his answer quite a lot of what I would have said, but I wish to emphasise another aspect which I feel is quite pertinent to this question: the fact that do notation is, for most developers, a much easier and more readily understandable way to work with any monadic expression that is at least moderately complex.
The reason for this is that, in an almost miraculous way, do blocks end up very much resembling code written in an imperative language. Imperative style code is much easier to understand for the majority of developers, and not only because it's by far the most common paradigm: it gives an explicit "recipe" for what a piece of code is doing, whereas more typical Haskell expressions, particularly monadic ones involving nested lambdas and >>= everywhere, very easily become difficult to comprehend.
In saying this I certainly do not mean that one should code in an imperative language as opposed to Haskell. The advantages of the pure functional style are well documented and seemingly well understood by the OP, so I will not go into them here. But Haskell's do notation allows one to write code in an imperative-looking "style", which therefore is explicit and easier to comprehend - at least on a small scale - while sacrificing none of the advantages of using a pure functional language.
This "imperative style" of do notation is, I feel, more visible with some monads than others, and I wish to illustrate my point with examples from a couple of monads which I find suit the "imperative style" well. First, IO, where I can give this simple example:
greet :: IO ()
greet = do
    putStrLn "Hello, what is your name?"
    name <- getLine
    putStrLn $ "Pleased to meet you, " ++ name ++ "!"
I hope it's immediately obvious what this code does, when executed by the Haskell runtime. What I wish to emphasise is how similar it is to imperative code, for example, this Python translation (which is not the most idiomatic, but has been chosen to exactly match the Haskell code line for line).
def greet():
    print("Hello, what is your name?")
    name = input()
    print("Pleased to meet you, " + name + "!")
Now ask yourself, how easy would the code be to understand in its desugared form, without do?
greet = putStrLn "Hello, what is your name?" >> getLine >>= \name -> putStrLn $ "Pleased to meet you, " ++ name ++ "!"
It's not particularly difficult, granted - but I hope you agree that it's much more "noisy" than the do block above. I can't speak for others, but I very much doubt I'm alone in saying that the latter version might take me 10-20 seconds or so to fully comprehend, whereas the do block is instantly comprehensible. And this of course is an extremely simple action - anything more complicated, as found in many real-world applications, makes the difference in comprehensibility much greater.
I have chosen IO for a reason, of course - I think it's in dealing with IO in particular that it's most natural to think in terms of "do this action, then if the result is that then do the next action, otherwise...". While the semantics of the IO monad fits this perfectly, it's much easier to translate the code into something like that when written in quasi-imperative notation than it is to use >>= directly. And the do notation is easier to write, too.
But although IO is the clearest example of this, it's certainly not the only one. Another great example is the State monad. Here's a simple example of using it to find the sum of a list of integers (and I know you wouldn't actually do it this way, but it's just a very simple example of some not-totally-trivial code that uses this monad):
sumList :: State [Int] Int
sumList = go 0
  where
    go subtotal = do
        remaining <- get
        case remaining of
            [] -> return subtotal
            (x:xs) -> do
                put xs
                go $ subtotal + x
Here, in my opinion, the steps are very clear - the auxiliary function go successively adds the first element of the list to the running total, while updating the internal state with the tail of the list. When there is no more list, it returns the running total. (Given the above, the function evalState sumList will take an actual list and sum it.)
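For instance, assuming Control.Monad.State from the mtl package is in scope, a quick check at the prompt:

λ> evalState sumList [1,2,3]
6
λ> evalState sumList []
0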
One can probably come up with better examples (particularly ones where the calculation involved isn't trivial to do in other ways), but my point is hopefully still clear: rewriting the above with >>= and lambdas would make it much less comprehensible.
do notation is, in my opinion, why the often-quoted quip about Haskell being "the world's finest imperative language" has more than a grain of truth. By using and defining different monads one can write easily understandable "imperative" code in a wide variety of situations - while still having guarantees that various functions can't, for example, change global state. It's in many ways the best of both worlds.
I would like to define a type for infinite number sequences in Haskell. My idea is:
type MySeq = Natural -> Ratio Integer
However, I would also like to be able to define some properties of the sequence at the type level. A simple example would be a non-decreasing sequence like this. Is it possible to do this with the current dependent-type capabilities of GHC?
EDIT: I came up with the following idea:
type PositiveSeq = Natural -> Ratio Natural
data IncreasingSeq = IncreasingSeq {
    start :: Ratio Natural,
    diff  :: PositiveSeq }
type IKnowItsIncreasing = [Ratio Natural]
getSeq :: IncreasingSeq -> IKnowItsIncreasing
getSeq s = scanl (+) (start s) [diff s i | i <- [1..]]
Of course, it's basically a hack and not actually type safe at all.
This isn't doing anything very fancy with types, but you could change how you interpret a sequence of naturals to get essentially the same guarantee.
I think you are thinking along the right lines in your edit to the question. Consider
data IncreasingSeq = IncreasingSeq (Integer -> Ratio Natural)
where each ratio represents how much it has increased from the previous number (starting with 0).
Then you can provide a single function
applyToIncreasing :: ([Ratio Natural] -> r) -> IncreasingSeq -> r
applyToIncreasing f (IncreasingSeq s) = f . drop 1 $ scanl (+) 0 (map (s $) [0..])
This should let you deconstruct it in any way, without allowing the function to inspect the real structure.
You just need a way to construct it: probably a fromList that just sorts it and an insert that performs a standard ordered insertion.
It pains part of me to say this, but I don't think you'd gain anything over this using fancy type tricks: there are only three functions that could ever possibly go wrong, and they are fairly simple to correctly implement. The implementation is hidden so anything that uses those is correct as a result of those functions being correct. Just don't export the data constructor for IncreasingSeq.
I would also suggest considering making [Ratio Natural] be the underlying representation. It simplifies things and guarantees that there are no "gaps" in the sequence (so it is guaranteed to be a sequence).
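As a hedged sketch of that suggestion (module and helper names are illustrative; the point is that the data constructor stays hidden, so every value built through the exported functions is sorted by construction):

module IncreasingSeq
    ( IncreasingSeq      -- abstract: the constructor is not exported
    , fromList
    , insert
    , applyToIncreasing
    ) where

import qualified Data.List as L
import Data.Ratio (Ratio)
import Numeric.Natural (Natural)

newtype IncreasingSeq = IncreasingSeq [Ratio Natural]

-- Sort once on the way in.
fromList :: [Ratio Natural] -> IncreasingSeq
fromList = IncreasingSeq . L.sort

-- Standard ordered insertion keeps the invariant.
insert :: Ratio Natural -> IncreasingSeq -> IncreasingSeq
insert x (IncreasingSeq xs) = IncreasingSeq (L.insert x xs)

-- Deconstruct without exposing the representation.
applyToIncreasing :: ([Ratio Natural] -> r) -> IncreasingSeq -> r
applyToIncreasing f (IncreasingSeq xs) = f xs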
If you want more safety and can take the performance hit, you can use data Nat = Z | S Nat instead of Natural.
I will say that if this was Coq, or a similar language, instead of Haskell I would be more likely to suggest doing some fancier type-level stuff (depending on what you are trying to accomplish) for a couple reasons:
In systems like Coq, you are usually proving theorems about the code. Because of this, it can be useful to have a type-level proof that a certain property holds. Since Haskell doesn't really have a builtin way to prove those sorts of theorems, the utility diminishes.
On the other hand, we can (sometimes) construct data types that essentially must have the properties we want, using a small number of trusted functions and a hidden implementation. In the context of a system with more theorem-proving capability, like Coq, it might be harder to convince the theorem prover of the property this way than if we used a dependent type (possibly, at least). In Haskell, however, we don't have that issue in the first place.
I am working through the section on Functors in the Typeclassopedia.
A simple intuition is that a Functor represents a “container” of some sort, along with the ability to apply a function uniformly to every element in the container.
OK. So, functors appear pretty natural for inductive types like lists or trees.
Functors also appear pretty simple if the number of elements is fixed to a low number. For example, with Maybe you just have to be concerned about "Nothing" or "Just a" -- two things.
So, how would you make something like a graph, that could potentially have loops, an instance of Functor? I think a more generalized way to put it is, how do non-inductive types "fit into" Functors?
The more I think about it, the more I realize that inductive / non-inductive doesn't really matter. Inductive types are just easier to define fmap for...
If I wanted to make a graph an instance of Functor, I would have to implement a graph traversal algorithm inside fmap; for example it would probably have to use a helper function that would keep track of the visited nodes. At this point, I am now wondering why bother defining it as a Functor instead of just writing this as a function itself? E.g. map vs fmap for lists...?
I hope someone with experience, war stories, and scars can shed some light. Thanks!
Well let's assume you define a graph like this
data Graph a = Node a [Graph a]
Then fmap is just defined precisely as you would expect
instance Functor Graph where
    fmap f (Node a ns) = Node (f a) (map (fmap f) ns)
Now, if there's a loop then we'd have had to do something like
foo = Node 1 [bar]
bar = Node 2 [foo]
Now fmap is sufficiently lazy that you can evaluate part of its result without forcing the rest of the computation, so it works just as well as any knot-tied graph representation would!
In general this is the trick: fmap is lazy, so you can treat its results just as you would treat any non-inductive values in Haskell (: carefully).
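As a small sketch of that laziness (takeNodes is a made-up helper, just to observe a finite prefix of the infinite, knot-tied graph above):

-- Walk at most n levels deep, collecting labels; never forces the whole graph.
takeNodes :: Int -> Graph a -> [a]
takeNodes 0 _           = []
takeNodes n (Node a ns) = a : concatMap (takeNodes (n - 1)) ns

-- takeNodes 4 (fmap (* 10) foo) == [10,20,10,20]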
Also, you should prefer defining fmap over some random other function, since
fmap is a good, well known API with rules
Your container now plays well with things expecting Functors
You can abstract away other bits of your program so they depend on Functor, not your Graph
In general when I see something is a functor I think "Ah wonderful, I know just how to use that" and when I see
superAwesomeTraversal :: (a -> b) -> Foo a -> Foo b
I get a little worried that this will do unexpected things..
I'm playing around with Haskell at the moment and thus stumbled upon the list comprehension feature.
Naturally, I would have used a closure to do this kind of thing:
Prelude> [x|x<-[1..7],x>4] -- list comprehension
[5,6,7]
Prelude> filter (\x->x>4) [1..7] -- closure
[5,6,7]
I still don't have a feel for this language, so which way would a Haskell programmer go?
What are the differences between these two solutions?
Idiomatic Haskell would be filter (> 4) [1..7]
Note that you are not capturing any of the lexical scope in your closure, and are instead making use of a sectioned operator. That is to say, you want a partial application of >, which operator sections give you immediately. List comprehensions are sometimes attractive, but the usual perception is that they do not scale as nicely as the usual suite of higher order functions ("scale" with respect to more complex compositions). That kind of stylistic decision is, of course, largely subjective, so YMMV.
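To spell out the operator-section point (these are just the standard right and left sections of >, nothing beyond what's shown above):

λ> filter (> 4) [1..7]    -- right section: \x -> x > 4
[5,6,7]
λ> filter (4 >) [1..7]    -- left section:  \x -> 4 > x
[1,2,3]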
List comprehensions come in handy if the elements are somewhat complex and one needs to filter them by pattern matching, or the mapping part feels too complex for a lambda abstraction, which should be short (or so I feel), or if one has to deal with nested lists. In the latter case, a list comprehension is often more readable than the alternatives (to me, anyway).
For example something like:
[ (f b, (g . fst) a) | (Just a, Right bs) <- somelist, a `notElem` bs, (_, b) <- bs ]
But for your example, the section (> 4) is a really nice way to write (\a -> a > 4), and because you use it only for filtering, most people would prefer Anthony's solution.
Note to other potential contributors: Please don't hesitate to use abstract or mathematical notations to make your point. If I find your answer unclear, I will ask for elucidation, but otherwise feel free to express yourself in a comfortable fashion.
To be clear: I am not looking for a "safe" head, nor is the choice of head in particular exceptionally meaningful. The meat of the question follows the discussion of head and head', which serve to provide context.
I've been hacking away with Haskell for a few months now (to the point that it has become my main language), but I am admittedly not well-informed about some of the more advanced concepts nor the details of the language's philosophy (though I am more than willing to learn). My question then is not so much a technical one (unless it is and I just don't realize it) as it is one of philosophy.
For this example, I am speaking of head.
As I imagine you'll know,
Prelude> head []
*** Exception: Prelude.head: empty list
This follows from head :: [a] -> a. Fair enough. Obviously one cannot return an element of (hand-wavingly) no type. But at the same time, it is simple (if not trivial) to define
head' :: [a] -> Maybe a
head' [] = Nothing
head' (x:xs) = Just x
I've seen a little discussion of this here in the comment sections of certain statements. Notably, one Alex Stangl says,
'There are good reasons not to make everything "safe" and to throw exceptions when preconditions are violated.'
I do not necessarily question this assertion, but I am curious as to what these "good reasons" are.
Additionally, a Paul Johnson says,
'For instance you could define "safeHead :: [a] -> Maybe a", but now instead of either handling an empty list or proving it can't happen, you have to handle "Nothing" or prove it can't happen.'
The tone that I read from that comment suggests that this is a notable increase in difficulty/complexity/something, but I am not sure that I grasp what he's putting out there.
One Steven Pruzina says (in 2011, no less),
"There's a deeper reason why e.g 'head' can't be crash-proof. To be polymorphic yet handle an empty list, 'head' must always return a variable of the type which is absent from any particular empty list. It would be Delphic if Haskell could do that...".
Is polymorphism lost by allowing empty list handling? If so, how so, and why? Are there particular cases which would make this obvious? This section was amply answered by @Russell O'Connor. Any further thoughts are, of course, appreciated.
I'll edit this as clarity and suggestion dictates. Any thoughts, papers, etc., you can provide will be most appreciated.
Is polymorphism lost by allowing empty list handling? If so, how so, and why? Are there particular cases which would make this obvious?
The free theorem for head states that
f . head = head . map f
Applying this theorem to [] implies that
f (head []) = head (map f []) = head []
This theorem must hold for every f, so in particular it must hold for const True and const False. This implies
True = const True (head []) = head [] = const False (head []) = False
Thus if head is properly polymorphic and head [] were a total value, then True would equal False.
PS. I have some other comments about the background to your question, to the effect that if you have a precondition that your list is non-empty, then you should enforce it by using a non-empty list type in your function signature instead of a list.
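For instance, with Data.List.NonEmpty from base (just a hedged usage sketch), head becomes total because the type itself rules out emptiness:

λ> import qualified Data.List.NonEmpty as NE
λ> NE.head (1 NE.:| [2,3])    -- NE.head :: NonEmpty a -> a
1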
Why does anyone use head :: [a] -> a instead of pattern matching? One of the reasons is because you know that the argument cannot be empty and do not want to write the code to handle the case where the argument is empty.
Of course, your head' of type [a] -> Maybe a is defined in the standard library as Data.Maybe.listToMaybe. But if you replace a use of head with listToMaybe, you have to write the code to handle the empty case, which defeats this purpose of using head.
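For example, with listToMaybe you are forced to say what the empty case means at the use site (here via fromMaybe, also from Data.Maybe):

λ> import Data.Maybe (listToMaybe, fromMaybe)
λ> listToMaybe ([] :: [Int])
Nothing
λ> fromMaybe 0 (listToMaybe [5,6,7])
5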
I am not saying that using head is a good style. It hides the fact that it can result in an exception, and in this sense it is not good. But it is sometimes convenient. The point is that head serves some purposes which cannot be served by listToMaybe.
The last quotation in the question (about polymorphism) simply means that it is impossible to define a function of type [a] -> a which returns a value on the empty list (as Russell O'Connor explained in his answer).
It's only natural to expect the following to hold: xs === head xs : tail xs - a list is identical to its first element, followed by the rest. Seems logical, right?
Now, let's count the number of conses (applications of :), disregarding the actual elements, when applying the purported 'law' to []: [] should be identical to foo : bar, but the former has 0 conses, while the latter has (at least) one. Uh oh, something's not right here!
Haskell's type system, for all its strengths, is not up to expressing the fact that you should only call head on a non-empty list (and that the 'law' is only valid for non-empty lists). Using head shifts the burden of proof to the programmer, who should make sure it's not used on empty lists. I believe dependently typed languages like Agda can help here.
Finally, a slightly more operational-philosophical description: how should head ([] :: [a]) :: a be implemented? Conjuring a value of type a out of thin air is impossible (think of uninhabited types such as data Falsum), and would amount to proving anything (via the Curry-Howard isomorphism).
There are a number of different ways to think about this. So I am going to argue both for and against head':
Against head':
There is no need to have head': Since lists are a concrete data type, everything that you can do with head' you can do by pattern matching.
Furthermore, with head' you're just trading off one functor for another. At some point you want to get down to brass tacks and get some work done on the underlying list element.
In defense of head':
But pattern matching obscures what's going on. In Haskell we are interested in calculating functions, which is better accomplished by writing them in point-free style using compositions and combinators.
Furthermore, thinking about the [] and Maybe functors, head' allows you to move back and forth between them (In particular the Applicative instance of [] with pure = replicate.)
If in your use case an empty list makes no sense at all, you can always opt to use NonEmpty instead, where neHead is safe to use. If you see it from that angle, it's not the head function that is unsafe, it's the whole list data-structure (again, for that use case).
I think this is a matter of simplicity and beauty. Which is, of course, in the eye of the beholder.
If coming from a Lisp background, you may be aware that lists are built of cons cells, each cell having a data element and a pointer to next cell. The empty list is not a list per se, but a special symbol. And Haskell goes with this reasoning.
In my view, it is cleaner, simpler to reason about, and more traditional if the empty list and a (non-empty) list are two different things.
...I may add - if you are worried about head being unsafe - don't use it, use pattern matching instead:
sum [] = 0
sum (x:xs) = x + sum xs