In Haskell, why is the infix alias of mappend (from class Monoid) <> instead of +? In algebra courses + is usually used for the binary operator of a monoid.
The function + is specific to numbers, and moreover, it's only one way to implement Monoid for numbers (* is equally valid). Similarly, with booleans, it would be equally valid to use && and ||. Using the symbol + suggests that Monoids are about addition specifically, when really they're just about any associative operation.
It is true that, at least in my experience, one is likely to use mappend in a fashion resembling addition: concatenating lists or vectors, taking unions of sets or maps, and so on. However, the Haskell mindset favors generality and adherence to mathematical principles over what is (arguably) more intuitive. It's certainly reasonable, in my opinion, to think of mappend as a sort of general addition, and to make adjustments in the cases where it isn't.
Partly due to the principle of least astonishment and partly because there are at least two sensible monoid instances for numbers (namely, Sum and Product from Data.Monoid).
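For reference, this is roughly what those two instances look like (a simplified sketch of the Sum and Product newtype wrappers; the real definitions live in Data.Monoid):

newtype Sum a     = Sum     { getSum     :: a }
newtype Product a = Product { getProduct :: a }

-- Addition is one lawful monoid on numbers...
instance Num a => Semigroup (Sum a) where
  Sum x <> Sum y = Sum (x + y)
instance Num a => Monoid (Sum a) where
  mempty = Sum 0

-- ...and multiplication is another, equally lawful one.
instance Num a => Semigroup (Product a) where
  Product x <> Product y = Product (x * y)
instance Num a => Monoid (Product a) where
  mempty = Product 1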
Haskell typeclasses often come with laws; for instance, instances of Monoid are expected to observe that x <> mempty = mempty <> x = x.
Typeclass laws are often written with single-equals (=) rather than double-equals (==). This suggests that the notion of equality used in typeclass laws is something other than that of Eq (which makes sense, since Eq is not a superclass of Monoid).
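(For concreteness, here is a sketch of how such a law is typically checked in practice with the QuickCheck library: the test has to fall back on ==, and hence on an Eq instance that the law as stated does not require.)

import Test.QuickCheck

-- The laws are written with "=", but a property test can only use (==):
prop_leftIdentity, prop_rightIdentity :: (Eq a, Monoid a) => a -> Bool
prop_leftIdentity  x = mempty <> x == x
prop_rightIdentity x = x <> mempty == x

-- e.g. quickCheck (prop_leftIdentity :: [Int] -> Bool)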
Searching around, I was unable to find any authoritative statement on the meaning of = in typeclass laws. For instance:
The Haskell 2010 report does not even contain the word "law" in it
Speaking with other Haskell users, most people seem to believe that = usually means extensional equality or substitution but is fundamentally context-dependent. Nobody provided any authoritative source for this claim.
The Haskell wiki article on monad laws states that = is extensional, but, again, fails to provide a source, and I wasn't able to track down any way to contact the author of the relevant edit.
The question, then: Is there any authoritative source on or standard for the semantics for = in typeclass laws? If so, what is it? Additionally, are there examples where the intended meaning of = is particularly exotic?
(As a side note, treating = extensionally can get tricky. For instance, there is a Monoid (IO a) instance, but it's not really clear what extensional equality of IO values looks like.)
I suspect most folks use = to mean "moral equality" as from Fast and Loose Reasoning is Morally Correct, which you can think of as extensional equality up to defined-ness.
But there's no hard-and-fast rule here. There are a lot of libraries and a lot of authors, and any two authors probably disagree on some minor detail of what = means.
Typeclass laws are not part of the Haskell language, so they are not subject to the same kind of language-theoretic semantic analysis as the language itself.
Instead, these laws are typically presented as an informal mathematical notation. Most presentations do not need a more detailed mathematical exposition, so they do not provide one.
I agree with comingstorm that the equality in those laws is that of a mathematical language. But I would also say that it is the equality the operator == is meant to capture.
Why? Because == is supposed to implement mathematical equality.
For example, look at fractions (rational numbers). They can be implemented as pairs of integers with some rules. The pair (a, b) represents the fraction a/b. The pairs (a, b) and (c, d) represent the same rational number if a*d == b*c. The two pairs are then said to be equivalent, and we talk about an equivalence relation. In mathematics we let a rational number be an equivalence class of pairs under this equivalence. In programming we instead define the operator == to tell if two pairs are equivalent, i.e. if they represent the same fraction.
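A sketch of that fraction example in Haskell (the type and names are mine, purely illustrative):

-- A fraction as a pair of integers: Frac a b represents a/b (with b /= 0).
data Frac = Frac Integer Integer

-- (==) implements mathematical equality of the represented rationals,
-- i.e. the equivalence of pairs, not structural equality of the pairs.
instance Eq Frac where
  Frac a b == Frac c d = a * d == b * c

-- e.g. Frac 1 2 == Frac 2 4 evaluates to True.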
why the question One may say that zip is a method of Applicative, the usual instance being ZipList. I am unhappy with it because it is unsafe. I am unhappy with Align too, because it is, by virtue of being all-encompassing, overly complicated and not specific enough for usual cases.
lawful classes Some type classes in Haskell may be dubbed lawful. This means that they come with equalities that must hold — the laws for the class. Ordinarily, these laws come from a category theoretic conceptualization of some aspect of programming. For example, Monad is a conceptualization of computations (whatever is meant by that) via the eponymous category theory device.
overlaying things The usual operation to want to do with boxes of things is to lay them on top of each other, and if they are monoids, they will meld.
Examples:
Arithmetic with Maybe (see the sketch just after this list).
Addition of matrices.
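As a concrete sketch of what "overlaying" means, this is (up to naming) the standard Semigroup instance for Maybe: when both boxes are full, their contents meld.

overlay :: Semigroup a => Maybe a -> Maybe a -> Maybe a
overlay (Just x) (Just y) = Just (x <> y)   -- both present: meld the contents
overlay (Just x) Nothing  = Just x
overlay Nothing  my       = my

-- e.g. overlay (Just [1,2]) (Just [3]) = Just [1,2,3]
--      overlay Nothing     (Just [3]) = Just [3]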
not enough laws The conceptualization of this concept is via monoidal functors, and the corresponding Applicative type class. There is, however, an annoying complication in that there are very often two ways to define the Applicative that both appear suitable. Why so? I propose that the answer is "not enough laws".
Examples:
For arithmetic:
The Sum monoid is the actual "endo-monoid": it is only legal for quantities of the same kind. You cannot sum mass and force, for instance.
The Product monoid takes numbers of dimension a & b to a number of dimension c. Multiplying mass and force is legal and gets us to warmth.
So, the right choice of monoid may be inferred from types.
For lists:
The usual direct sum of lists is the safer one. It works trivially with any finite number of elements, and with infinitely many of them via a "diagonal process" definition such as LogicT's.
The ZipList definition is clearly unsafe. It is defined to, given two lists of distinct length, crop the longer one to the length of the shorter.
Length indexed vectors are the device that allows for a safe definition of zip, by demanding a proof that the given lists are of the same length (see the sketch at the end of this answer).
For matrices:
The usual addition of matrices has the (very reasonable) requirement of dimension homogeneity, the same as with the length indexed vectors mentioned above. Since matrices are habitually used in various real world simulations, such as 3D graphics, people would complain quite immediately if matrices began to get cropped or zero-padded, so a ZipMatrix definition along the lines of ZipList above does not appear attractive.
The stranger Kronecker multiplication is reminiscent of the direct product of lists. And it admits the definition of Monad, too.
two cases From these examples, it becomes apparent that there are two distinct ideas mixed up in the thing we call a "monoid" or a "monoidal functor", and the distinction is very important for programming (unlike, perhaps, for pure theory): making it would clean up confusion, remove unsafety and, primarily, acknowledge that there are, in each case, two completely unrelated algorithms to run.
I am thinking that maybe invertibility (also called "strength") of the monoidal functor is what matters. But the results of the Sum and the Product monoidal operations on Peano naturals are indistinguishable. (I am unsure whether they can be considered monoidal endofunctors.) So I am led to guess that the changing of types is the hallmark. Multiplication of physical quantities does not even type check as a Monoid!
P.S. There presents itself an instance of Monad for length indexed vectors over cartesian products and for matrices over Kronecker multiplication, with some sort of fold zip as join.
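As a sketch of the length indexed vector approach mentioned above (a minimal version of my own, using GADTs and DataKinds, not taken from any particular library):

{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Z | S Nat

-- A vector whose length is tracked in its type.
data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- Exact zip: both arguments are forced to have the same length,
-- so no cropping or padding can ever happen.
vzip :: Vec n a -> Vec n b -> Vec n (a, b)
vzip VNil         VNil         = VNil
vzip (VCons a as) (VCons b bs) = VCons (a, b) (vzip as bs)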
Exact zipping (as the safe package calls it) can be expressed through the Representable class. There is a fair amount of theory associated with Representable. For our current purposes, we can focus on...
A Functor f is Representable if tabulate and index witness an isomorphism to (->) x.
... and:
Representable endofunctors over the category of Haskell types are isomorphic to the reader monad and so inherit a very large number of properties for free.
Since Representable functors are isomorphic to functions from some type (e.g. an homogeneous pair is isomorphic to Bool -> a, and an infinite stream is isomorphic to Nat -> a), exact zipping can be achieved by zipping the functions pointwise. That is what mzipRep, the default implementation for MonadZip's mzip, does:
mzipRep :: Representable f => f a -> f b -> f (a, b)
mzipRep as bs = tabulate (index as &&& index bs)
While MonadZip is a rather awkward class (it is primarily part of the implementation of the MonadComprehensions extension), it has a relevant law, which I will restate in non-monadic terms:
Information preservation: if () <$ u = () <$ v then munzip (mzip u v) = (u, v)
In other words, if u and v have the same shape, then mzip does not drop information (and so it can be undone by munzip). As Representable implies there being just one possible shape, it allows us to drop the condition, thus getting exact zipping.
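For a concrete, self-contained sketch of the same idea (my own toy example, not part of the library): a homogeneous pair is isomorphic to Bool -> a, so exact zipping amounts to pointwise application, exactly in the shape of mzipRep.

{-# LANGUAGE DeriveFunctor #-}

-- A homogeneous pair, represented by the index type Bool.
data Pair a = Pair a a deriving (Show, Functor)

indexP :: Pair a -> Bool -> a
indexP (Pair x y) b = if b then y else x

tabulateP :: (Bool -> a) -> Pair a
tabulateP f = Pair (f False) (f True)

-- Exact zip: both arguments necessarily have the same (unique) shape,
-- so no information is ever dropped.
zipP :: Pair a -> Pair b -> Pair (a, b)
zipP as bs = tabulateP (\i -> (indexP as i, indexP bs i))

-- e.g. zipP (Pair 1 2) (Pair 'a' 'b') = Pair (1,'a') (2,'b')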
Tangential notes:
The ZipList definition is clearly unsafe. It is defined to, given two lists of distinct length, crop the longer one to the length of the shorter.
I'd say that depends on what you want to use zipping for. Sometimes you will want or need exact zipping, and sometimes you won't (for instance, consider the commonplace trick of attaching indices to a list with zip [0..]); and sometimes padding rather than trimming will be what makes sense (cf. leftaroundabout's comment). That is why I prefer calling exact zipping "exact", rather than "safe".
There is, however, an annoying complication in that there are very often two ways to define the Applicative that both appear suitable. Why so? I propose that the answer is "not enough laws".
I very much disagree with the view that a class is underspecified if it allows more than one instance for some data type. I'd rather say that e.g. lists with the cartesian product applicative and lists with the zipping applicative are different structures, characterised by the relevant morphisms -- it just happens that they can be represented in Haskell through the same data type.
pigworker once asked how to express that a type is infinitely differentiable. This question brought to mind the fact that in complex analysis, a function that is differentiable (on an open set) must be infinitely differentiable (on that set). Is there a way to talk about complex differentiation of datatypes? If so, does a similar theorem hold?
Not really an answer... but this rant is way too long for a comment.
I find it a bit misleading to think complex differentiability just implies infinite differentiability. It's in fact much stronger than that: if a function is complex differentiable, then its derivatives at any point determine the entire function. And because infinite differentiability gives you a full Taylor series, you have an analytic function which is equal to your function, i.e. is your function itself. So, in a sense complex differentiable functions are analytic... because they are.
From a (standard) calculus perspective, the key contrast between real diff'ability and complex diff'ability is that in the reals, there is only one direction in which you can take the limit of difference-quotients (f(x+δ) - f x)/δ. You merely require that the left limit equals the right limit. But because that's an equality after the limit, this has only an effect locally. (Topologically speaking, the constraint just compares two discrete values, so it doesn't really deal with continuity properties at all.)
OTOH, for complex differentiability we require that the limit of the difference quotient is the same if we approach x from any direction in the entire complex plane. That's an entire continuous degree of freedom constrained. You can then go on to perform topological tricks (Cauchy integrals are essentially that) to “spread” the constraint through the entire domain.
I consider this a bit problematic philosophically. Holomorphic functions aren't really functions at all, as in: they're not so much defined by the entirety of their result values across the domain, as by some way to write them with analytic formulas (i.e. possibly-infinite algebraic expressions / polynomials).
Most mathematicians and physicists apparently like this a lot – such expressions are just the way in which they generally write functions.
I don't, really, like it at all: to me, a function should be a function, something defined by individual values, like field strengths you can measure in space or results you can define in Haskell.
Anyway, I digress...
If we translate this issue from functions on numbers to functors on Haskell types, I suppose the upshot is that complex diff'ability means nothing else but: a type can be written as a (possibly infinite?) ADT polynomial. And how to get infinite differentiability for such ADTs was shown in the post you linked to.
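To make the algebraic reading concrete (my own sketch, not from the linked post): the "derivative" of an ADT is its type of one-hole contexts, and the familiar differentiation rules for sums and products carry over.

-- d/da (a, a) ~ 2 * a : a pair with a hole is the remaining component,
-- plus a flag recording which side the hole is on.
data DPair a = HoleLeft a | HoleRight a

-- d/da [a] ~ [a] * [a] : a list with a hole is the prefix before it
-- and the suffix after it (a list zipper).
data DList a = DList [a] [a]

-- Plugging a value back into the hole recovers the original structure.
plugPair :: a -> DPair a -> (a, a)
plugPair x (HoleLeft  y) = (x, y)
plugPair x (HoleRight y) = (y, x)

plugList :: a -> DList a -> [a]
plugList x (DList before after) = before ++ [x] ++ after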
Another spin... perhaps closer to an answer.
These “derivatives” of Haskell types aren't really derivatives in the calculus sense. As in, they aren't motivated by a concept of small-perturbation response analysis†. It so happens that you can mathematically prove, for a very specific class of functions – those defined by an algebraic expression – that the calculus-derivative can again be written in a simple algebraic way (given by the well-known differentiation rules). That means trivially that you can differentiate infinitely often.
The usefulness of this symbolic differentiation also motivates thinking about it as a more abstract operation. And when you're differentiating Haskell types, it is mainly just this algebraic definition you're going after, not the original calculus one.
Which is fine... but once you're doing algebra rather than calculus, it's not very meaningful to distinguish “real” from “complex” – it's actually neither, because you're not handling values but symbolic representations of values. An untyped language, if you will (and indeed, Haskell's type language is still untyped, with everything having kind *).
†Be it with traditional convergent limits or NSA-infinitesimals.
(Sorry, I'm stupid and uneducated, so this is probably a ridiculous question.)
I just started looking at J, and they use the terms "monadic" and "dyadic" for what seems (to me) to be unary and binary operators. Why is this done, and how does it relate to the other place I've heard the term (Haskell)? My guess is they are unrelated homonyms but I'm not sure.
They're unrelated except by both deriving from the Greek root for "one". Monadic and dyadic are indeed terms for unary and binary functions. Specifically, they're the Greek-derived equivalents--using -adic instead of -ary. Consider the word "triad", which is also Greek-derived.
Monad in the sense Haskell uses it has an unclear etymology but probably derives from "monoid".
I would encourage sticking with the Latin-derived "n-ary" terms in Haskell, though. All functions in Haskell technically have one argument because of currying, so using the Greek-derived form could produce arbitrary amounts of confusion.
They're unrelated; C. A. McCann points out the etymologies of both.
In any case, the Haskell use, of course, comes from category theory, and is thought to be an independent coining unrelated to the other senses of monad.
Indeed, the J sense of "monadic" dates back to APL, which predates Haskell by a quarter of a century! I think it might predate the category theory usage of the term, too.
Adicity (or adinity) is an alternative to arity, using Greek numeral roots instead of Latin:
niladic/medadic = nullary
monadic = unary
dyadic = binary
triadic = ternary
tetradic = quaternary
…
The various meanings of monad in philosophy, religion, biology, category theory, and functional programming are all derived separately, from its literal denotation of a “unit”. The Haskell term is probably derived from monoid, an algebraic structure equivalent to an additive monad.
No, the J use has nothing to do with the Haskell term. Monadic and dyadic functions are functions of one and two arguments, respectively.
The J terms originate from APL, which is a bit older than Haskell, but I have rarely seen them used like this outside of the APL family.
One example of the use of these terms in a non-APL context is from the book Clean Code, which in the chapter about functions talks about niladic, monadic and dyadic functions.
I'm wondering why Haskell doesn't have a single element tuple. Is it just because nobody has needed one so far, or are there rational reasons? I found an interesting thread in a comment on the Real World Haskell website, http://book.realworldhaskell.org/read/types-and-functions.html#funcstypes.composite, where people guessed various reasons, such as:
No good syntax sugar.
It is useless.
You can think that a normal value like (1) is actually a single element tuple.
But does anyone know the reason except a guess?
There's a lib for that!
http://hackage.haskell.org/packages/archive/OneTuple/0.2.1/doc/html/Data-Tuple-OneTuple.html
Actually, we have a OneTuple we use all the time. It's called Identity, and is now used as the base of standard pure monads in the new mtl:
http://hackage.haskell.org/packages/archive/transformers/0.2.2.0/doc/html/Data-Functor-Identity.html
And it has an important use! By virtue of providing a type constructor of kind * -> *, it can be made an instance (a trivial one, granted, though not the most trivial) of Monad, Functor, etc., which lets us use it as a base for transformer stacks.
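Here is a sketch of the (standard) definition, to show how close it is to a one-tuple and how little its instances have to do:

newtype Identity a = Identity { runIdentity :: a }

instance Functor Identity where
  fmap f (Identity x) = Identity (f x)

instance Applicative Identity where
  pure = Identity
  Identity f <*> Identity x = Identity (f x)

instance Monad Identity where
  Identity x >>= f = f x

-- e.g. the transformers package defines:  type State s = StateT s Identity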
The exact reason is that it's totally unnecessary. Why would you need a one-tuple if you can just have its value?
The syntax also tends to be a bit clunky. In Python, you can have one-tuples, but you need a trailing comma to distinguish it from a parenthesized expression:
onetuple = (3,)
All in all, there's no reason for it. I'm sure there's no "official" reason: the designers of Haskell probably never even considered a single element tuple, since it has no use.
I don't know if you were looking for some reasons beyond the obvious, but in this case the obvious answer is the right one.
My answer is not exactly about Haskell semantics, but about the theoretical mathematical elegance of making a value the same as its one-tuple. (So this answer should not be taken as an explanation of the standard behavior expected of a Haskell implementation, because it isn't intended as such.)
In programming languages and computation models where all functions are curried, such as lambda-calculus and combinatory logic, every function has exactly one input argument and one output/return value. No more, no less.
When we want a particular function f to have more than one input argument – say 3 –, we simulate it under this curried regime by creating a 1-argument function that returns a 2-argument function. Thus, f x y z = ((f x) y) z, and f x would return a 2-argument function.
Likewise, sometimes we might want to return more than one value from a function. It is not literally possible under this semantics, but we can simulate it by returning a tuple. We can generalize this.
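In Haskell notation (with made-up example names), the two simulations look like this:

-- "Three input arguments" is really one argument returning a two-argument function:
addThree :: Int -> Int -> Int -> Int      -- i.e. Int -> (Int -> (Int -> Int))
addThree x y z = x + y + z

twoMore :: Int -> Int -> Int
twoMore = addThree 1                       -- addThree applied to its first argument

-- "Two return values" is really a single return value that happens to be a pair:
divAndMod :: Int -> Int -> (Int, Int)
divAndMod n d = (n `div` d, n `mod` d)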
If, for uniformity, we constrain the only return value of any function to be an (n-)tuple, we are able to harmonize some interesting features of the unit value and of supposedly non-tuple return values with the features of tuples in general, as follows.
Let's adopt as the general syntax of n-tuples the following schema, where ci is the component with index i:
(c1, c2, ..., cn)
Notice that n-tuples have delimiting parentheses in this syntax.
Under this schema, how would we represent a 0-tuple? Since it has no components, this degenerate case would be represented like this: ( ). This syntax precisely coincides with the syntax we adopt to represent the unit value. So, we are tempted to make the unit value the same as a 0-tuple.
What about a 1-tuple? It would have this representation: (v). Here a syntactical ambiguity would immediately arise: parentheses would be used in the language both as 1-tuple delimiters and as mere grouping of values or expressions. So, in a context where (v) appears, a compiler or interpreter would be unsure whether this is a 1-tuple with a component whose value is v, or just an isolated value v inside superfluous parentheses.
A way to solve this ambiguity is to force a value to be the same as the 1-tuple that would have it as its only component. Not much would be sacrificed, since the only non-empty projection we can perform on a 1-tuple is to obtain its only value.
For this to be consistently enforced, the syntactical consequence is that we would have to relax a bit our former requirement that delimiting parentheses are mandatory for all n-tuples: now they would be optional for 1-tuples, and mandatory for all other values of n. (Or we could require all values to be delimited by parentheses, but this would be inconvenient for practical use.)
In summary, under the interpretation that a 1-tuple is the same as its only component value, we could, by making syntactic puns with parentheses, consider all return values of functions in our programming language or computing model as n-tuples: the 0-tuple in the case of the unit type, 1-tuples in the case of ordinary/"atomic" values which we usually don't think of as tuples, and pairs/triples/quadruples/... for other kinds of tuples. This heterodox interpretation is mathematically parsimonious and uniform, is expressive enough to simulate functions with multiple input arguments and multiple return values, and is not incompatible with Haskell (in the sense that no harm is done if the programmer assumes this unofficial interpretation).
This was an argument by syntactic puns. Whether you are satisfied or not with it, we can do even better. A more principled argument can be taken from the mathematical theory of relations, by exploring the Cartesian product operation.
An (n-adic) relation is extensionally defined as a uniform set of (n-)tuples. (This characterization is fundamental to relational database theory and is therefore important knowledge for professional computer programmers.)
A dyadic relation – a set of pairs (2-tuples) – is a subset of the Cartesian product of 2 sets: R ⊆ S1 × S2. For a homogeneous relation: R ⊆ S × S.
A triadic relation – a set of triples (3-tuples) – is a subset of the Cartesian product of 3 sets: R ⊆ S1 × S2 × S3. For a homogeneous relation: R ⊆ S × S × S.
A monadic relation – a set of monuples (1-tuples) – is a subset of the Cartesian product of 1 set: R ⊆ S1 (since, by the usual mathematical convention, the Cartesian product of a single set is just that set).
As we can see, a monadic relation is just a set of atomic values! This means that a set of 1-tuples is a set of atomic values. Therefore, it is convenient to consider that 1-tuples and atomic values are the same thing.