Prevent users from using binary operators defined in a new type - haskell

I'm currently trying to define multivariate polynomials over a field in Haskell (work in progress). I have as a starting point:
data Polynomial a = Zero
                  | M (Monomial a)
                  | Polynomial a :+: Polynomial a
                  | Polynomial a :*: Polynomial a
                  deriving (Show)
Is it possible to prevent the user from using the binary operators :+: and :*:? I'd like to, because I define addition and multiplication later, which not only perform the operation but also put the result in canonical form (a sum of monomials with distinct powers), and I would like the user to be able to use only these operations.
I would bet that's not possible if one exports the Polynomial type, but maybe the brilliant minds here have a trick?

You can export the Polynomial type without exporting its constructors.
module Foo(Polynomial()) where
would do this.
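For example, here is a minimal sketch of what that looks like in practice. The Monomial type, the canonicalise placeholder, and the exported names below are hypothetical stand-ins (the question doesn't show them); the point is that only the type and the normalising operations are exported, so users can never build an un-normalised :+: or :*: themselves:

module Polynomial
  ( Polynomial        -- exported abstractly: constructors stay hidden
  , zero
  , monomial
  , add
  , mul
  ) where

-- Hypothetical monomial representation: a coefficient and a list of exponents.
data Monomial a = Monomial a [Int] deriving Show

data Polynomial a = Zero
                  | M (Monomial a)
                  | Polynomial a :+: Polynomial a
                  | Polynomial a :*: Polynomial a
                  deriving Show

zero :: Polynomial a
zero = Zero

monomial :: Monomial a -> Polynomial a
monomial = M

-- The only ways users can combine polynomials; both normalise the result.
add, mul :: Num a => Polynomial a -> Polynomial a -> Polynomial a
add p q = canonicalise (p :+: q)
mul p q = canonicalise (p :*: q)

-- Placeholder for the real normalisation (collecting monomials with equal
-- powers); the actual algorithm isn't part of the question.
canonicalise :: Num a => Polynomial a -> Polynomial a
canonicalise = id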

Related

Factoring out recursion in a complex AST

For a side project I am working on I currently have to deal with an abstract syntax tree and transform it according to rules (the specifics are unimportant).
The AST itself is nontrivial, meaning it has subexpressions which are restricted to certain types only (e.g. the operator A must take an argument of type B only, not any Expr). A drastically simplified version of my datatype looks like this:
data Expr = List [Expr]
          | Strange Str
          | Literal Lit

data Str = A Expr
         | B Expr
         | C Lit
         | D String
         | E [Expr]

data Lit = Int Int
         | String String
My goal is to factor out the explicit recursion and rely on recursion schemes instead, as demonstrated in these two excellent blog posts, which provide very powerful general-purpose tools to operate on my AST. Applying the necessary factoring, we end up with:
data ExprF a = List [a]
             | Strange (StrF a)
             | Literal (LitF a)

data StrF a = A a
            | B a
            | C (LitF a)
            | D String
            | E [a]

data LitF a = Int Int
            | String String
If I didn't mess up, type Expr = Fix ExprF should now be isomorphic to the previously defined Expr.
However, writing cata for these cases becomes rather tedious, as I have to pattern match on a B a :: StrF a inside of a Strange :: ExprF a for cata to be well-typed. For the entire original AST this is infeasible.
I stumbled upon fixing GADTs, which seems to me like a solution to my problem; however, the user-unfriendly interface of the duplicated higher-order type classes etc. is quite a lot of unnecessary boilerplate.
So, to sum up my questions:
Is rewriting the AST as a GADT the correct way to go about this?
If yes, how could I transform the example into a well-working version? On a second note, is there better support for higher kinded Functors in GHC now?
If you've gone through the effort of separating out the recursion in your data type, then you can just derive Functor and you're done. You don't need any fancy features to get the recursion scheme. (As a side note, there's no reason to parameterize the Lit data type.)
The fold is:
newtype Fix f = In { out :: f (Fix f) }
gfold :: (Functor f) => (f a -> a) -> Fix f -> a
gfold alg = alg . fmap (gfold alg) . out
To specify the algebra (the alg parameter), you need to do a case analysis against ExprF, but the alternative would be to have the fold have a dozen or more parameters: one for each data constructor. That wouldn't really save you much typing and would be much harder to read. If you want (and this may require rank-2 types in general), you can package all those parameters up into a record and then you could use record update to update "pre-made" records that provide "default" behavior in various circumstances. There's an old paper Dealing with Large Bananas that takes an approach like this. What I'm suggesting, to be clear, is just wrapping the gfold function above with a function that takes a record, and passes in an algebra that will do the case analysis and call the appropriate field of the record for each case.
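For instance, here is a minimal sketch of such a case-analysing algebra, reusing the ExprF/StrF/LitF definitions from the question with DeriveFunctor added (the countLits example is mine), which counts the literals in an expression:

{-# LANGUAGE DeriveFunctor #-}

data ExprF a = List [a]
             | Strange (StrF a)
             | Literal (LitF a)
             deriving Functor

data StrF a = A a
            | B a
            | C (LitF a)
            | D String
            | E [a]
            deriving Functor

data LitF a = Int Int
            | String String
            deriving Functor

-- Count how many literals occur anywhere in an expression.
countLits :: Fix ExprF -> Int
countLits = gfold alg
  where
    alg :: ExprF Int -> Int
    alg (List ns)   = sum ns
    alg (Literal _) = 1
    alg (Strange s) = case s of
      A n  -> n
      B n  -> n
      C _  -> 1
      D _  -> 0
      E ns -> sum ns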
Of course, you could use GHC Generics or the various "generic/polytypic" programming libraries like Scrap Your Boilerplate instead of this. You are basically recreating what they do.

Haskell data type for hmatrix Vector and Matrix

I am just starting out with Haskell, I have read up to the defining data types section of LYAH and am attempting to implement the Sum-Product algorithm for Belief Propagation. One of the rudimentary tasks is to define the Probabilistic Graphical Model.
As shown below, I have attempted to create a graph by tying the knot to represent the graph, where each node represents a Gaussian distribution and has constant-weight links (for now) to its neighbours. However, when trying to define the Mean and Covariance types I am having some difficulty specifying the element types of the Matrix and Vector types, i.e. Float or Double.
module Graph(Graph) where
import Numeric.LinearAlgebra
data Mean = Mean Vector
data Covariance = Covariance Matrix
data Gaussian = Gaussian Mean Covariance
data Node = Node [Node] Gaussian
data Graph = Graph [Node]
In this simple example, what is the syntax to define Mean as a Vector of type Double and Covariance as a Matrix of type Double. Additionally, how would one generalise so that Mean and Covariance can be of type Float or Double?
I currently get the following from GHCi
Graph.hs:5:18: error:
• Expecting one more argument to ‘Vector’
Expected a type, but ‘Vector’ has kind ‘* -> *’
• In the type ‘Vector’
In the definition of data constructor ‘Mean’
In the data declaration for ‘Mean’
Failed, modules loaded: none.
I am using the hmatrix package as described here
Vector and Matrix are parameterised on the scalar type (so you can not only have matrices of floating-point “real numbers”, but also matrices of integers, complex numbers etc.). This is what GHC tells you by ‘Vector’ has kind ‘* -> *’: by itself, Vector is not a type (types have kind *, aka Type). Rather it is a type function mapping types of kind * to types of kind *. Scalars like Double are already plain types, so you can just apply Vector to them.
GHCi> :kind Vector
Vector :: * -> *
GHCi> :k Double
Double :: *
GHCi> :k Vector Double
Vector Double :: *
Thus you need
newtype Mean = Mean (Vector Double)
newtype Covariance = Covariance (Matrix Double)
(newtype does the same thing as data here, but it's a bit more efficient because no extra box/pointer is needed).
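As for the generalisation part of the question: one option (a sketch of my own, not from the answer above) is to parameterise the wrappers over the element type and pick Float or Double at the use site; note that whether a particular hmatrix operation then works at Float depends on the class instances it requires (e.g. Numeric, Field):

newtype Mean a = Mean (Vector a)
newtype Covariance a = Covariance (Matrix a)

data Gaussian a = Gaussian (Mean a) (Covariance a)

-- Choose the precision where you use it:
type MeanD = Mean Double
type MeanF = Mean Float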
Alternatively, you may use more meaningfully-typed vector spaces, e.g.
import Math.LinearMap.Category
newtype Mean v = Mean v
newtype Covariance v = Covariance (v +> DualVector v)
The advantage of this is that dimensions are checked at compile time, which prevents nasty runtime errors (and can in principle also improve performance, though frankly the linearmap-category library is not optimised at all yet).
You'd then also parameterise the other types over the vector space:
data Gaußian v = Gaußian (Mean v) (Covariance v)
data Node v = Node [Node v] (Gaußian v)
data Graph v = Graph [Node v]
Somewhat unrelated to your question: this knot-tying sure feels elegant, but it's not really a suitable way to represent a graph, because nodes can't be identity-checked. Any cycles in the graph lead, for all practical purposes, to an infinite structure. In practice, you won't get around giving your nodes e.g. Int labels and keeping a separate structure for the edges.

How are free objects constructed?

So I understand that a free object is defined as being the left-hand side of an adjunction. But how does that lead you to the Haskell definition of such objects?
More concretely: given a "forgetful functor" from the category of monads to the category of endofunctors,
newtype Forget m a = Forget (m a)
instance Monad m => Functor (Forget m) where
  fmap f (Forget x) = Forget (liftM f x)
then the free monad Free :: (* -> *) -> (* -> *) is a type admitting (a Monad instance and) the following isomorphism:
type f ~> g = forall x. f x -> g x
fwd :: (Functor f, Monad m) => (f ~> Forget m) -> (Free f ~> m)
bwd :: (Functor f, Monad m) => (Free f ~> m) -> (f ~> Forget m)
fwd . bwd = id = bwd . fwd
If we drop the Forgets, for the free monad in Control.Monad.Free we have fwd = foldFree and bwd = (. liftF) (I think?)
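(Spelled out, and assuming the free package's Control.Monad.Free, those two directions look roughly like this sketch; the primed names are just to avoid clashing with the Forget-wrapped versions above:)

{-# LANGUAGE RankNTypes, TypeOperators #-}

import Control.Monad.Free (Free, foldFree, liftF)

type f ~> g = forall x. f x -> g x

fwd' :: Monad m => (f ~> m) -> (Free f ~> m)
fwd' nt = foldFree nt

bwd' :: Functor f => (Free f ~> m) -> (f ~> m)
bwd' k = k . liftF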
But how do these laws lead to the construction found in Control.Monad.Free? How do you come up with data Free f a = Return a | Free (f (Free f a))? Surely you don't just guess until you come up with something that satisfies the laws? Same question goes for the free category of a graph, the free monoid of a set, and any other free object you care to name.
I don't think the notion of "free" is as well-defined as you seem to believe. While I do think the general consensus is that it is indeed a left adjoint of a forgetful functor, the issue lies in what "forgetful" means. There are clear definitions in some broad-ranging cases, particularly for concrete categories.
Universal algebra provides a broad-ranging approach which covers almost all "algebraic" structures (over sets). The upshot is that, given a "signature" which consists of sorts, operations, and equations, you build a term algebra (i.e. an AST) of the operations and then quotient it by the equivalence relation generated by the equations. This is the free algebra generated from that signature. For example, we usually talk about a monoid as being a set equipped with an associative multiplication and unit. In code, the free algebra before quotienting would be:
data PreFreeMonoid a
  = Unit
  | Var a
  | Mul (PreFreeMonoid a) (PreFreeMonoid a)
We would then quotient by the equivalence relation generated from the equations:
Mul Unit x = x
Mul x Unit = x
Mul (Mul x y) z = Mul x (Mul y z)
But you can show that the resulting quotient type is isomorphic to lists. In the multi-sorted case, we'd have a family of term algebras, one for each sort.
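A sketch of the witnessing map in one direction: interpreting a pre-free-monoid term in the list monoid picks out the canonical (list) representative of its equivalence class, because ++ and [] satisfy exactly the three equations above:

normalise :: PreFreeMonoid a -> [a]
normalise Unit      = []
normalise (Var a)   = [a]
normalise (Mul x y) = normalise x ++ normalise y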
One way to recast this categorically is to use the notion of a (slightly generalized) Lawvere theory. Given a signature with a set of sorts, S, we can build a small category, call it T, whose objects are lists of elements of S. These small categories will be called theories in general. Operations get mapped to arrows whose source and target correspond to the appropriate arities. We freely add "tupling" and "projection" arrows so that e.g. [A,B,A] becomes the product [A]×[B]×[A]. Finally, we add commutative diagrams (i.e. equations between arrows) corresponding to each equation in the signature. At this point, T essentially represents the term algebra(s). In fact, an actual interpretation or model of this term algebra is just a finite product preserving functor T → Set; write Mod(T) for the category of finite product preserving functors T → Set. In the single sorted case, we'd have an underlying set functor, but in general we get an S-indexed family of sets, i.e. we have a functor U : Mod(T) → Set^S, where we're viewing S as a discrete category here. U is simply U(m)(s) = m([s]). We can actually calculate the left adjoint. First, we have a family of sets indexed by elements of S, call it G. Then we need to build a finite product preserving functor T → Set, but any functor into Set (i.e. copresheaf) is a colimit of representables which, in this case, means it's a quotient of the following (dependent) sum type:
Free(G)(s) = Σt:T.T(t,s)×Free(G)(t)
If Free(G) is finite product preserving then in the t = [A,B] case, for example, we'd have:
T([A,B],s)×Free(G)([A,B]) = T([A,B],s)×Free(G)([A])×Free(G)([B])
and we simply define Free(G)([A]) = G(A) for each A in S producing:
T([A,B],s)×Free(G)([A])×Free(G)([B]) = T([A,B],s)×G(A)×G(B)
Altogether this says that an element of Free(G)([A]) consists of an arrow of T into [A] and a list of elements of the appropriate sets corresponding to the source of that arrow, i.e. the arity of the term, modulo equations that make it behave sensibly and obey the equations from the signature, but which I'm not going to elaborate on. For the multiplication of a monoid, we'd have an arrow m : [A,A] → [A] and this would lead to tuples (m, x, y) where x and y are elements of G(A), corresponding to a term like m(x, y). Recasting this definition as a recursive one takes looking at the equations we're quotienting by.
There are other things to verify to show that Free ⊣ U but it isn't too hard. Once that's done, U∘Free is a monad on Set^S.
The nice thing about the Lawvere theory approach is that it is easy to generalize in multiple ways. One straightforward way is to replace Set by some other topos E. It's actually the case that the category of directed multigraphs form a topos, but I don't believe you can (easily) view categories as theories over Graph. A different direction to extend Lawvere theories is to consider doctrines other than finite product preserving functors, in particular finite limit preserving functors aka left exact or lex functors is an interesting point. Both small categories and directed multigraphs (which categorists sometimes call quivers) can be viewed as models of a category with finite limits. There's a straightforward inclusion of the theory of directed multigraphs into the theory of small categories. This, contravariantly, induces a functor cat → Graph simply by precomposition. The left adjoint of this is then (almost) the left Kan extensions along that inclusion. These left Kan extensions will occur in Set so ultimately they are just colimits which are just quotients of (dependent) sum types. (Technically, you need to verify that the resulting Kan extensions are finite limit preserving. We're also helped by the fact that the models of the theory of graphs are essentially arbitrary functors from the theory of graphs. This happens because the theory of graphs consists only of unary operations.)
None of this helps for free monads though. However, it turns out that one construction subsumes all of these, including free monads. Returning to universal algebra, it's the case that every signature with no equations gives rise to a (polynomial) functor whose initial algebra is the free term algebra. Lambek's lemma suggests, and it's easy to prove, that the initial algebra is just the colimit of repeated applications of the functor. The above general result is based on a similar approach and the relevant case for free monads is the unpointed endofunctor case, in which you start to see the definition of Free that you gave, but actually working it out fully requires unfolding many constructions.
Frankly, though, what I'm pretty sure actually happened in the FP world is the following. If you look at PreFreeMonoid, it's actually a free monad. PreFreeMonoid Void is the initial algebra for the functor the monoid signature (minus the equations) would give rise to. If you are familiar with using functors for initial algebras and you even start thinking about universal algebra, you are almost certainly going to end up defining a type like data Term f a = Var a | Op (f (Term f a)). It's easy to verify this is a monad once you think to ask the question. If you're even vaguely familiar with the relationship monads have to algebraic structures or to term substitution, then you may ask the question quite quickly. The same construction can be stumbled upon from a programming language implementation perspective. If you just directly set your goal to be deriving the free monad construction in Haskell, there are several intuitive ways to arrive at the right definition especially combined with some equational/parametricity-driven reasoning. In fact, the "monoid object in the category of endofunctors" one is quite suggestive.
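To make that last point concrete, here is a minimal sketch of that Term type together with its instances (the standard free-monad construction; note that bind is precisely substitution for the variables):

data Term f a = Var a | Op (f (Term f a))

instance Functor f => Functor (Term f) where
  fmap g (Var a) = Var (g a)
  fmap g (Op fa) = Op (fmap (fmap g) fa)

instance Functor f => Applicative (Term f) where
  pure = Var
  Var g <*> t = fmap g t
  Op ff <*> t = Op (fmap (<*> t) ff)

instance Functor f => Monad (Term f) where
  Var a >>= k = k a                  -- substitute the variable
  Op fa >>= k = Op (fmap (>>= k) fa) -- push the substitution under the operations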
(I really wish this StackExchange had MathJax support.)

Is my understanding of monoid valid?

So, I'm learning Haskell at the moment, and I would like to confirm or debunk my understanding of monoid.
What I figured out from reading the CIS194 course is that a monoid is basically an "API" for defining a custom binary operation on a custom set.
Then I went to inform myself some more and I stumbled upon a massive amount of very confusing tutorials trying to clarify the thing, so I'm not so sure anymore.
I have a decent mathematical background, but I just got confused by all the metaphors and am looking for a clear yes/no answer to my understanding of monoid.
From Wikipedia:
In abstract algebra, a branch of mathematics, a monoid is an algebraic structure with a single associative binary operation and an identity element.
I think your understanding is correct. From a programming perspective, Monoid is an interface with two "methods" that must be implemented.
The only piece that seems to be missing from your description is the "identity", without which you are describing a Semigroup.
Anything that has a "zero" or an "empty" and a way of combining two values can be a Monoid. One thing to note is that it may be possible for a set/type to be made a Monoid in more than one way, for example numbers via addition with identity 0, or multiplication with identity 1.
from Wolfram:
A monoid is a set that is closed under an associative binary operation and has an identity element I in S such that for all a in S, Ia=aI=a.
from Wiki:
In abstract algebra, a branch of mathematics, a monoid is an algebraic structure with a single associative binary operation and an identity element.
so your intuition is more or less right.
You should only keep in mind that it's not defined for a "custom set" in Haskell but a type. The distinction is small (because types in type theory are very similar to sets in set theory) but the types for which you can define a Monoid instance need not be types that represent mathematical sets.
In other words: a type describes the set of all values that are of that type. Monoid is an "interface" that states that any type that claims to adhere to that interface must provide an identity value, a binary operation combining two values of that type, and there are some equations these should satisfy in order for all generic Monoid operations to work as intended (such as the generic summation of a list of monoid values) and not produce illogical/inconsistent results.
Also, note that the existence of an identity element in that set (type) is required for a type to be an instance of the Monoid class.
For example, natural numbers form a Monoid under both addition (identity = 0):
0 + n = n
n + 0 = n
as well as multiplication (identity = 1):
1 * n = n
n * 1 = n
also lists form a monoid under ++ (identity = []):
[] ++ xs = xs
xs ++ [] = xs
also, functions of type a -> a form a monoid under composition (identity = id)
id . f = f
f . id = f
so it's important to keep in mind that Monoid isn't about types that represent sets but about types when viewed as sets, so to speak.
as an example of a malconstructed Monoid instance, consider:
import Data.Monoid
newtype MyInt = MyInt Int deriving Show
instance Semigroup MyInt where
  MyInt a <> MyInt b = MyInt (a * b)

instance Monoid MyInt where
  mempty = MyInt 0
if you now try to mconcat a list of MyInt values, you'll always get MyInt 0 as the result because the identity value 0 and binary operation * don't play well together:
λ> mconcat [MyInt 1, MyInt 2]
MyInt 0
At a basic level you're right - it's just an API for a binary operator we denote by <>.
However, the value of the monoid concept is in its relationship to other types and classes. Culturally we've decided that <> is the natural way of joining/appending two things of the same type together.
Consider this example:
{-# LANGUAGE OverloadedStrings #-}
import Data.Monoid
greet x = "Hello, " <> x
The function greet is extremely polymorphic - x can be a String, ByteString or Text, just to name a few possibilities. Moreover, in each of these cases it does basically what you expect it to - it appends x to the string "Hello, ".
Additionally, there are lots of algorithms which will work on anything that can be accumulated, and those are good candidates for generalization to a Monoid. For example consider the foldMap function from the Foldable class:
foldMap :: Monoid m => (a -> m) -> t a -> m
Not only does foldMap generalize the idea of folding over a structure, but I can generalize how the accumulation is performed by substituting the right Monoid instance.
If I have a foldable structure t containing Ints, I can use foldMap with the Sum monoid to get the sum of the Ints, or with Product to get the product, etc.
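For example, in GHCi (Sum and Product are the newtype wrappers from Data.Monoid):

λ> getSum (foldMap Sum [1, 2, 3, 4])
10
λ> getProduct (foldMap Product [1, 2, 3, 4])
24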
Finally, using <> affords convenience. For instance, there is an abundance of different Set implementations, but for all of them s <> t is always the union of two sets s and t (of the same type). This enables me to write code which is agnostic of the underlying implementation of the set thereby simplifying my code. The same can be said for a lot of other data structures, e.g. sequences, trees, maps, priority queues, etc.

Abusing the algebra of algebraic data types - why does this work?

The 'algebraic' expression for algebraic data types looks very suggestive to someone with a background in mathematics. Let me try to explain what I mean.
Having defined the basic types
Product •
Union +
Singleton X
Unit 1
and using the shorthand X² for X•X and 2X for X+X et cetera, we can then define algebraic expressions for e.g. linked lists
data List a = Nil | Cons a (List a) ↔ L = 1 + X • L
and binary trees:
data Tree a = Nil | Branch a (Tree a) (Tree a) ↔ T = 1 + X • T²
Now, my first instinct as a mathematician is to go nuts with these expressions, and try to solve for L and T. I could do this through repeated substitution, but it seems much easier to abuse the notation horrifically and pretend I can rearrange it at will. For example, for a linked list:
L = 1 + X • L
(1 - X) • L = 1
L = 1 / (1 - X) = 1 + X + X² + X³ + ...
where I've used the power series expansion of 1 / (1 - X) in a totally unjustified way to derive an interesting result, namely that an L type is either Nil, or it contains 1 element, or it contains 2 elements, or 3, etc.
It gets more interesting if we do it for binary trees:
T = 1 + X • T²
X • T² - T + 1 = 0
T = (1 - √(1 - 4 • X)) / (2 • X)
T = 1 + X + 2 • X² + 5 • X³ + 14 • X⁴ + ...
again, using the power series expansion (done with Wolfram Alpha). This expresses the non-obvious (to me) fact that there is only one binary tree with 1 element, 2 binary trees with two elements (the second element can be on the left or the right branch), 5 binary trees with three elements etc.
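(Those counts are easy to sanity-check by brute force; here is a small enumeration sketch, using the Tree definition above rather than anything from the algebraic manipulation:)

-- All binary trees with exactly n elements, using () as the element type.
treesOfSize :: Int -> [Tree ()]
treesOfSize 0 = [Nil]
treesOfSize n = [ Branch () l r
                | k <- [0 .. n - 1]
                , l <- treesOfSize k
                , r <- treesOfSize (n - 1 - k) ]

-- map (length . treesOfSize) [0 .. 4]  ==  [1, 1, 2, 5, 14]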
So my question is - what am I doing here? These operations seem unjustified (what exactly is the square root of an algebraic data type anyway?) but they lead to sensible results. Does the quotient of two algebraic data types have any meaning in computer science, or is it just notational trickery?
And, perhaps more interestingly, is it possible to extend these ideas? Is there a theory of the algebra of types that allows, for example, arbitrary functions on types, or do types require a power series representation? If you can define a class of functions, then does composition of functions have any meaning?
Disclaimer: A lot of this doesn't really work quite right when you account for ⊥, so I'm going to blatantly disregard that for the sake of simplicity.
A few initial points:
Note that "union" is probably not the best term for A+B here--that's specifically a disjoint union of the two types, because the two sides are distinguished even if their types are the same. For what it's worth, the more common term is simply "sum type".
Singleton types are, effectively, all unit types. They behave identically under algebraic manipulations and, more importantly, the amount of information present is still preserved.
You probably want a zero type as well. Haskell provides that as Void. There are no values whose type is zero, just as there is one value whose type is one.
There's still one major operation missing here but I'll get back to that in a moment.
As you've probably noticed, Haskell tends to borrow concepts from Category Theory, and all of the above has a very straightforward interpretation as such:
Given objects A and B in Hask, their product A×B is the unique (up to isomorphism) type that allows two projections fst : A×B → A and snd : A×B → B, where given any type C and functions f : C → A, g : C → B you can define the pairing f &&& g : C → A×B such that fst ∘ (f &&& g) = f and likewise for g. Parametricity guarantees the universal properties automatically and my less-than-subtle choice of names should give you the idea. The (&&&) operator is defined in Control.Arrow, by the way.
The dual of the above is the coproduct A+B with injections inl : A → A+B and inr : B → A+B, where given any type C and functions f : A → C, g : B → C, you can define the copairing f ||| g : A+B → C such that the obvious equivalences hold. Again, parametricity guarantees most of the tricky parts automatically. In this case, the standard injections are simply Left and Right and the copairing is the function either.
Many of the properties of product and sum types can be derived from the above. Note that any singleton type is a terminal object of Hask and any empty type is an initial object.
Returning to the aforementioned missing operation, in a cartesian closed category you have exponential objects that correspond to arrows of the category. Our arrows are functions, our objects are types with kind *, and the type A -> B indeed behaves as B^A in the context of algebraic manipulation of types. If it's not obvious why this should hold, consider the type Bool -> A. With only two possible inputs, a function of that type is isomorphic to two values of type A, i.e. (A, A). For Maybe Bool -> A we have three possible inputs, and so on. Also, observe that if we rephrase the copairing definition above to use algebraic notation, we get the identity C^A × C^B = C^(A+B).
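For concreteness, a quick sketch of that Bool -> A ≅ (A, A) isomorphism:

toPair :: (Bool -> a) -> (a, a)
toPair f = (f False, f True)

fromPair :: (a, a) -> Bool -> a
fromPair (x, _) False = x
fromPair (_, y) True  = y

-- toPair . fromPair = id, and fromPair . toPair agrees with id pointwise.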
As for why this all makes sense--and in particular why your use of the power series expansion is justified--note that much of the above refers to the "inhabitants" of a type (i.e., distinct values having that type) in order to demonstrate the algebraic behavior. To make that perspective explicit:
The product type (A, B) represents a value each from A and B, taken independently. So for any fixed value a :: A, there is one value of type (A, B) for each inhabitant of B. This is of course the cartesian product, and the number of inhabitants of the product type is the product of the number of inhabitants of the factors.
The sum type Either A B represents a value from either A or B, with the left and right branches distinguished. As mentioned earlier, this is a disjoint union, and the number of inhabitants of the sum type is the sum of the number of inhabitants of the summands.
The exponential type B -> A represents a mapping from values of type B to values of type A. For any fixed argument b :: B, any value of A can be assigned to it; a value of type B -> A picks one such mapping for each input, which is equivalent to a product of as many copies of A as B has inhabitants, hence the exponentiation.
While it's tempting at first to treat types as sets, that doesn't actually work very well in this context--we have disjoint union rather than the standard union of sets, there's no obvious interpretation of intersection or many other set operations, and we don't usually care about set membership (leaving that to the type checker).
On the other hand, the constructions above spend a lot of time talking about counting inhabitants, and enumerating the possible values of a type is a useful concept here. That quickly leads us to enumerative combinatorics, and if you consult the linked Wikipedia article you'll find that one of the first things it does is define "pairs" and "unions" in exactly the same sense as product and sum types by way of generating functions, then does the same for "sequences" that are identical to Haskell's lists using exactly the same technique you did.
Edit: Oh, and here's a quick bonus that I think demonstrates the point strikingly. You mentioned in a comment that for a tree type T = 1 + T^2 you can derive the identity T^6 = 1, which is clearly wrong. However, T^7 = T does hold, and a bijection between trees and seven-tuples of trees can be constructed directly, cf. Andreas Blass's "Seven Trees in One".
Edit×2: On the subject of the "derivative of a type" construction mentioned in other answers, you might also enjoy this paper from the same author which builds on the idea further, including notions of division and other interesting whatnot.
Binary trees are defined by the equation T=1+XT^2 in the semiring of types. By construction, T=(1-sqrt(1-4X))/(2X) is defined by the same equation in the semiring of complex numbers. So given that we're solving the same equation in the same class of algebraic structure it actually shouldn't be surprising that we see some similarities.
The catch is that when we reason about polynomials in the semiring of complex numbers we typically use the fact that the complex numbers form a ring or even a field so we find ourselves using operations such as subtraction that don't apply to semirings. But we can often eliminate subtractions from our arguments if we have a rule that allows us to cancel from both sides of an equation. This is the kind of thing proved by Fiore and Leinster showing that many arguments about rings can be transferred to semirings.
This means that lots of your mathematical knowledge about rings can be reliably transferred to types. As a result, some arguments involving complex numbers or power series (in the ring of formal power series) can carry over to types in a completely rigorous way.
However there's more to the story than this. It's one thing to prove two types are equal (say) by showing two power series are equal. But you can also deduce information about types by inspecting the terms in the power series. I'm not sure of what the formal statement here should be. (I recommend Brent Yorgey's paper on combinatorial species for some work that's closely related but species are not the same as types.)
What I find utterly mind blowing is that what you've discovered can be extended to calculus. Theorems about calculus can be transferred over to the semiring of types. In fact, even arguments about finite differences can be transferred over and you find that classical theorems from numerical analysis have interpretations in type theory.
Have fun!
Calculus and Maclaurin series with types
Here is another minor addition - a combinatorial insight into why the coefficients in a series expansion should 'work', in particular focusing on series which can be derived using the Taylor-Maclaurin approach from calculus. NB: the example series expansion you give of the manipulated list type is a Maclaurin series.
Since other answers and comments deal with the behaviour of algebraic type expressions (sums, products and exponents), this answer will elide that detail and focus on type 'calculus'.
You may notice inverted commas doing some heavy lifting in this answer. There are two reasons:
we are in the business of giving interpretations from one domain to entities from another and it seems appropriate to delimit such foreign notions in this way.
some notions will be able to be formalised more rigorously, but the shape and ideas seem more important (and take less space to write) than the details.
Definition of Maclaurin series
The Maclaurin series of a function f : ℝ → ℝ is defined as
f(0) + f'(0)X + (1/2)f''(0)X² + ... + (1/n!)f⁽ⁿ⁾(0)Xⁿ + ...
where f⁽ⁿ⁾ means the nth derivative of f.
To be able to make sense of the Maclaurin series as interpreted with types, we need to understand how we can interpret three things in a type context:
a (possibly multiple) derivative
applying a function to 0
terms like (1/n!)
and it turns out that these concepts from analysis have suitable counterparts in the type world.
What do I mean by a 'suitable counterpart'? It should have the flavour of an isomorphism - if we can preserve truth in both directions, facts derivable in one context can be transferred to the other.
Calculus with types
So what does the derivative of a type expression mean? It turns out that for a large and well-behaved ('differentiable') class of type expressions and functors, there is a natural operation which behaves similarly enough to be a suitable interpretation!
To spoil the punchline, the operation analogous to differentiation is that of making 'one-hole contexts'. This is an excellent place to expand on this particular point further but the basic concept of a one-hole context (da/dx) is that it represents the result of extracting a single subitem of a particular type (x) from a term (of type a), preserving all other information, including that necessary to determine the original location of the subitem. For example, one way to represent a one-hole context for a list is with two lists: one for items which came before the extracted one, and one for items which came after.
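In code, that list example might look like the following sketch (the names are mine): the context keeps the items before and after the hole, and plugging a value back in recovers a list:

-- One-hole context for [a]: the items before the hole and the items after it,
-- both kept in left-to-right order.
data ListCtx a = ListCtx [a] [a]

plug :: a -> ListCtx a -> [a]
plug x (ListCtx before after) = before ++ [x] ++ after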
The motivation for identifying this operation with differentiation comes from the following observations. We write da/dx to mean the type of one-hole contexts for type a with hole of type x.
d1/dx = 0
dx/dx = 1
d(a + b)/dx = da/dx + db/dx
d(a × b)/dx = a × db/dx + b × da/dx
d(g(f(x)))/dx = (d(g(y))/dy)[f(x)/y] × df(x)/dx
Here, 1 and 0 represent types with exactly one and exactly zero inhabitants, respectively, and + and × represent sum and product types as usual. f and g are used to represent type functions, or type expression formers, and [f(x)/y] means the operation of substituting f(x) for every y in the preceding expression.
This may be written in a point-free style, writing f' to mean the derivative function of type function f, thus:
(x ↦ 1)' = x ↦ 0
(x ↦ x)' = x ↦ 1
(f + g)' = f' + g'
(f × g)' = f × g' + g × f'
(g ∘ f)' = (g' ∘ f) × f'
which may be preferable.
NB the equalities can be made rigorous and exact if we define derivatives using isomorphism classes of types and functors.
Now, we notice in particular that the rules in calculus pertaining to the algebraic operations of addition, multiplication and composition (often called the Sum, Product and Chain rules) are reflected exactly by the operation of 'making a hole'. Further, the base cases of 'making a hole' in a constant expression or in the term x itself also behave as differentiation, so by induction we get differentiation-like behaviour for all algebraic type expressions.
Now that we can interpret differentiation, what does the nth 'derivative' of a type expression, dⁿe/dxⁿ, mean? It is a type representing n-place contexts: terms which, when 'filled' with n terms of type x, yield an e. There is another key observation related to '(1/n!)' coming later.
The invariant part of a type functor: applying a function to 0
We already have an interpretation for 0 in the type world: an empty type with no members. What does it mean, from a combinatorial point of view, to apply a type function to it? In more concrete terms, supposing f is a type function, what does f(0) look like? Well, we certainly don't have access to anything of type 0, so any constructions of f(x) which require an x are unavailable. What is left is those terms which are accessible in their absence, which we can call the 'invariant' or 'constant' part of the type.
For an explicit example, take the Maybe functor, which can be represented algebraically as x ↦ 1 + x. When we apply this to 0, we get 1 + 0 - it's just like 1: the only possible value is the Nothing value. For a list, similarly, we get just the term corresponding to the empty list.
When we bring it back and interpret the type f(0) as a number it can be thought of as the count of how many terms of type f(x) (for any x) can be obtained without access to an x: that is, the number of 'empty-like' terms.
Putting it together: complete interpretation of a Maclaurin series
I'm afraid I can't think of an appropriate direct interpretation of (1/n!) as a type.
If we consider, though, the type f⁽ⁿ⁾(0) in light of the above, we see that it can be interpreted as the type of n-place contexts for a term of type f(x) which do not already contain an x - that is, when we 'integrate' them n times, the resulting term has exactly n xs, no more, no less. Then the interpretation of the type f⁽ⁿ⁾(0) as a number (as in the coefficients of the Maclaurin series of f) is simply a count of how many such empty n-place contexts there are. We are nearly there!
But where does (1/n!) end up? Examining the process of type 'differentiation' shows us that, when applied multiple times, it preserves the 'order' in which subterms are extracted. For example, consider the term (x₀, x₁) of type x × x and the operation of 'making a hole' in it twice. We get both sequences
(x₀, x₁) ↝ (_₀, x₁) ↝ (_₀, _₁)
(x₀, x₁) ↝ (x₀, _₀) ↝ (_₁, _₀)
(where _ represents a 'hole')
even though both come from the same term, because there are 2! = 2 ways to take two elements from two, preserving order. In general, there are n! ways to take n elements from n. So in order to get a count of the number of configurations of a functor type which have n elements, we have to count the type f⁽ⁿ⁾(0) and divide by n!, exactly as in the coefficients of the Maclaurin series.
So dividing by n! turns out to be interpretable simply as itself.
Final thoughts: 'recursive' definitions and analyticity
First, some observations:
if a function f : ℝ → ℝ has a derivative, this derivative is unique
similarly, if a function f : ℝ → ℝ is analytic, it has exactly one corresponding polynomial series
Since we have the chain rule, we can use implicit differentiation, if we formalise type derivatives as isomorphism classes. But implicit differentiation doesn't require any alien manoeuvres like subtraction or division! So we can use it to analyse recursive type definitions. To take your list example, we have
L(X) ≅ 1 + X × L(X)
L'(X) = X × L'(X) + L(X)
and then we can evaluate
L'(0) = L(0) = 1
to obtain the coefficient of X¹ in the Maclaurin series.
But since we are confident that these expressions are indeed strictly 'differentiable', if only implicitly, and since we have the correspondence with functions ℝ → ℝ, where derivatives are certainly unique, we can rest assured that even if we obtain the values using 'illegal' operations, the result is valid.
Now, similarly, to use the second observation, due to the correspondence (is it a homomorphism?) with functions ℝ → ℝ, we know that, provided we are satisfied that a function has a Maclaurin series, if we can find any series at all, the principles outlined above can be applied to make it rigorous.
As for your question about composition of functions, I suppose the chain rule provides a partial answer.
I'm not certain how many Haskell-style ADTs this applies to, but I suspect it is many if not all. I have discovered a truly marvellous proof of this fact, but this margin is too small to contain it...
Now, certainly this is only one way to work out what is going on here and there are probably many other ways.
Summary: TL;DR
type 'differentiation' corresponds to 'making a hole'.
applying a functor to 0 gets us the 'empty-like' terms for that functor.
Maclaurin power series therefore (somewhat) rigorously correspond to enumerating the number of members of a functor type with a certain number of elements.
implicit differentiation makes this more watertight.
uniqueness of derivatives and uniqueness of power series mean we can fudge the details and it works.
It seems that all you're doing is expanding the recurrence relation.
L = 1 + X • L
L = 1 + X • (1 + X • (1 + X • (1 + X • ...)))
= 1 + X + X^2 + X^3 + X^4 ...
T = 1 + X • T^2
T = 1 + X • (1 + X • (1 + X • (1 + X • ...^2)^2)^2)^2
= 1 + X + 2 • X^2 + 5 • X^3 + 14 • X^4 + ...
And since the rules for the operations on the types work like the rules for arithmetic operations, you can use algebraic means to help you figure out how to expand the recurrence relation (since it is not obvious).
I don't have a complete answer, but these manipulations tend to 'just work'. A relevant paper might be Objects of Categories as Complex Numbers by Fiore and Leinster - I came across that one while reading sigfpe's blog on a related subject ; the rest of that blog is a goldmine for similar ideas and is worth checking out!
You can also differentiate datatypes, by the way - that will get you the appropriate Zipper for the datatype!
The Algebra of Communicating Processes (ACP) deals with similar kinds of expressions for processes.
It offers addition and multiplication as operators for choice and sequence, with associated neutral elements.
Based on these there are operators for other constructs, such as parallelism and disruption.
See http://en.wikipedia.org/wiki/Algebra_of_Communicating_Processes. There is also a paper online named "A Brief History of Process Algebra".
I am working on extending programming languages with ACP. Last April I presented a research paper at Scala Days 2012, available at http://code.google.com/p/subscript/
At the conference I demonstrated a debugger running a parallel recursive specification of a bag:
Bag = A; (Bag&a)
where A and a stand for input and output actions; the semicolon and ampersand stand for sequence and parallelism. See the video at SkillsMatter, reachable from the previous link.
A bag specification more comparable to
L = 1 + X•L
would be
B = 1 + X&B
ACP defines parallelism in terms of choice and sequence using axioms; see the Wikipedia article. I wonder what the bag analogy would be for
L = 1 / (1-X)
ACP style programming is handy for text parsers and GUI controllers. Specifications such as
searchCommand = clicked(searchButton) + key(Enter)
cancelCommand = clicked(cancelButton) + key(Escape)
may be written down more concisely by making the two refinements "clicked" and "key" implicit (like what Scala allows with functions). Hence we can write:
searchCommand = searchButton + Enter
cancelCommand = cancelButton + Escape
The right hand sides now contain operands that are data, rather than processes. At this level it is not necessary to know what implicit refinements will turn these operands into processes; they would not necessarily refine into input actions; output actions would also apply, e.g. in the specification of a test robot.
In this way, processes get data as companions; thus I coin the term "item algebra".
Dependent type theory and 'arbitrary' type functions
My first answer to this question was high on concepts and low on details and reflected on the subquestion, 'what is going on?'; this answer will be the same but focused on the subquestion, 'can we get arbitrary type functions?'.
One extension to the algebraic operations of sum and product are the so called 'large operators', which represent the sum and product of a sequence (or more generally, the sum and product of a function over a domain) usually written Σ and Π respectively. See Sigma Notation.
So the sum
a₀ + a₁X + a₂X² + ...
might be written
Σ[i ∈ ℕ]aᵢXⁱ
where a is some sequence of real numbers, for example. The product would be represented similarly with Π instead of Σ.
When you look from a distance, this kind of expression looks a lot like an 'arbitrary' function in X; we are limited of course to expressible series, and their associated analytic functions. Is this a candidate for a representation in a type theory? Definitely!
The class of type theories which have immediate representations of these expressions is the class of 'dependent' type theories: theories with dependent types. Naturally we have terms dependent on terms, and in languages like Haskell with type functions and type quantification, terms and types depending on types. In a dependent setting, we additionally have types depending on terms. Haskell is not a dependently typed language, although many features of dependent types can be simulated by torturing the language a bit.
Curry-Howard and dependent types
The 'Curry-Howard isomorphism' started life as an observation that the terms and type-judging rules of simply-typed lambda calculus correspond exactly to natural deduction (as formulated by Gentzen) applied to intuitionistic propositional logic, with types taking the place of propositions, and terms taking the place of proofs, despite the two being independently invented/discovered. Since then, it has been a huge source of inspiration for type theorists. One of the most obvious things to consider is whether, and how, this correspondence for propositional logic can be extended to predicate or higher order logics. Dependent type theories initially arose from this avenue of exploration.
For an introduction to the Curry-Howard isomorphism for simply-typed lambda calculus, see here. As an example, if we want to prove A ∧ B we must prove A and prove B; a combined proof is simply a pair of proofs: one for each conjunct.
In natural deduction:
Γ ⊢ A      Γ ⊢ B
----------------
   Γ ⊢ A ∧ B
and in simply-typed lambda calculus:
Γ ⊢ a : A      Γ ⊢ b : B
------------------------
  Γ ⊢ (a, b) : A × B
Similar correspondences exist for ∨ and sum types, → and function types, and the various elimination rules.
An unprovable (intuitionistically false) proposition corresponds to an uninhabited type.
With the analogy of types as logical propositions in mind, we can start to consider how to model predicates in the type-world. There are many ways in which this has been formalised (see this introduction to Martin-Löf's Intuitionistic Type Theory for a widely-used standard) but the abstract approach usually observes that a predicate is like a proposition with free term variables, or, alternatively, a function taking terms to propositions. If we allow type expressions to contain terms, then a treatment in lambda calculus style immediately presents itself as a possibility!
Considering only constructive proofs, what constitutes a proof of ∀x ∈ X.P(x)? We can think of it as a proof function, taking terms (x) to proofs of their corresponding propositions (P(x)). So members (proofs) of the type (proposition) ∀x : X.P(x) are 'dependent functions', which for each x in X give a term of type P(x).
What about ∃x ∈ X.P(x)? We need any member of X, x, together with a proof of P(x). So members (proofs) of the type (proposition) ∃x : X.P(x) are 'dependent pairs': a distinguished term x in X, together with a term of type P(x).
Notation:
I will use
∀x ∈ X...
for actual statements about members of the class X, and
∀x : X...
for type expressions corresponding to universal quantification over type X. Likewise for ∃.
Combinatorial considerations: products and sums
As well as the Curry-Howard correspondence of types with propositions, we have the combinatorial correspondence of algebraic types with numbers and functions, which is the main point of this question. Happily, this can be extended to the dependent types outlined above!
I will use the modulus notation
|A|
to represent the 'size' of a type A, to make explicit the correspondence outlined in the question, between types and numbers. Note that this is a concept outside of the theory; I do not claim that there need be any such operator within the language.
Let us count the possible (fully reduced, canonical) members of type
∀x : X.P(x)
which is the type of dependent functions taking terms x of type X to terms of type P(x). Each such function must have an output for every term of X, and this output must be of a particular type. For each x in X, then, this gives |P(x)| 'choices' of output.
The punchline is
|∀x : X.P(x)| = Π[x : X]|P(x)|
which of course doesn't make huge deal of sense if X is IO (), but is applicable to algebraic types.
Similarly, the type
∃x : X.P(x)
is the type of pairs (x, p) with p : P(x), so given any x in X we can construct an appropriate pair with any member of P(x), giving |P(x)| 'choices'.
Hence,
|∃x : X.P(x)| = Σ[x : X]|P(x)|
with the same caveats.
This justifies the common notation for dependent types in theories using the symbols Π and Σ, and indeed many theories blur the distinction between 'for all' and 'product' and between 'there is' and 'sum', due to the above-mentioned correspondences.
We are getting close!
Vectors: representing dependent tuples
Can we now encode numerical expressions like
Σ[n ∈ ℕ]Xⁿ
as type expressions?
Not quite. While we can informally consider the meaning of expressions like Xⁿ in Haskell, where X is a type and n a natural number, it's an abuse of notation; this is a type expression containing a number: distinctly not a valid expression.
On the other hand, with dependent types in the picture, types containing numbers is precisely the point; in fact, dependent tuples or 'vectors' are a very commonly-cited example of how dependent types can provide pragmatic type-level safety for operations like list access. A vector is just a list along with type-level information regarding its length: precisely what we are after for type expressions like Xⁿ.
For the duration of this answer, let
Vec X n
be the type of length-n vectors of X-type values.
Technically n here is, rather than an actual natural number, a representation in the system of a natural number. We can represent natural numbers (Nat) in Peano style as either zero (0) or the successor (S) of another natural number, and for n ∈ ℕ I write ˻n˼ to mean the term in Nat which represents n. For example, ˻3˼ is S (S (S 0)).
Then we have
|Vec X ˻n˼| = |X|ⁿ
for any n ∈ ℕ.
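Haskell can approximate such a type with GADTs and DataKinds (a sketch; in a genuinely dependently-typed language the length index is simply a natural-number term):

{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Z | S Nat

-- Vec x n: vectors of exactly n values of type x.
data Vec x (n :: Nat) where
  VNil  :: Vec x 'Z
  VCons :: x -> Vec x n -> Vec x ('S n)

-- e.g. Vec Bool ('S ('S 'Z)) has |Bool|² = 4 inhabitants.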
Nat types: promoting ℕ terms to types
Now we can encode expressions like
Σ[n ∈ ℕ]Xⁿ
as types. This particular expression would give rise to a type which is of course isomorphic to the type of lists of X, as identified in the question. (Not only that, but from a category-theoretic point of view, the type function - which is a functor - taking X to the above type is naturally isomorphic to the List functor.)
One final piece of the puzzle for 'arbitrary' functions is how to encode, for
f : ℕ → ℕ
expressions like
Σ[n ∈ ℕ]f(n)Xⁿ
so that we can apply arbitrary coefficients to a power series.
We already understand the correspondence of algebraic types with numbers, allowing us to map from types to numbers and type functions to numerical functions. We can also go the other way! - taking a natural number, there is obviously a definable algebraic type with that many term members, whether or not we have dependent types. We can easily prove this outside of the type theory by induction. What we need is a way to map from natural numbers to types, inside the system.
A pleasing realisation is that, once we have dependent types, proof by induction and construction by recursion become intimately similar - indeed they are the very same thing in many theories. Since we can prove by induction that types exist which fulfil our needs, should we not be able to construct them?
There are several ways to represent types at the term level. I will use here an imaginary Haskellish notation with * for the universe of types, itself usually considered a type in a dependent setting.¹
Likewise, there are also at least as many ways to notate 'ℕ-elimination' as there are dependent type theories. I will use a Haskellish pattern-matching notation.
We need a mapping, α from Nat to *, with the property
∀n ∈ ℕ.|α ˻n˼| = n.
The following pseudodefinition suffices.
data Zero -- empty type
data Successor a = Z | Suc a -- Successor ≅ Maybe
α : Nat -> *
α 0 = Zero
α (S n) = Successor (α n)
So we see that the action of α mirrors the behaviour of the successor S, making it a kind of homomorphism. Successor is a type function which 'adds one' to the number of members of a type; that is, |Successor a| = 1 + |a| for any a with a defined size.
For example α ˻4˼ (which is α (S (S (S (S 0))))), is
Successor (Successor (Successor (Successor Zero)))
and the terms of this type are
Z
Suc Z
Suc (Suc Z)
Suc (Suc (Suc Z))
giving us exactly four elements: |α ˻4˼| = 4.
Likewise, for any n ∈ ℕ, we have
|α ˻n˼| = n
as required.
¹ Many theories require that the members of * are mere representatives of types, and an operation is provided as an explicit mapping from terms of type * to their associated types. Other theories permit the literal types themselves to be term-level entities.
'Arbitrary' functions?
Now we have the apparatus to express a fully general power series as a type!
The series
Σ[n ∈ ℕ]f(n)Xⁿ
becomes the type
∃n : Nat.α (˻f˼ n) × (Vec X n)
where ˻f˼ : Nat → Nat is some suitable representation within the language of the function f. We can see this as follows.
|∃n : Nat.α (˻f˼ n) × (Vec X n)|
= Σ[n : Nat]|α (˻f˼ n) × (Vec X n)| (property of ∃ types)
= Σ[n ∈ ℕ]|α (˻f˼ ˻n˼) × (Vec X ˻n˼)| (switching Nat for ℕ)
= Σ[n ∈ ℕ]|α ˻f(n)˼ × (Vec X ˻n˼)| (applying ˻f˼ to ˻n˼)
= Σ[n ∈ ℕ]|α ˻f(n)˼||Vec X ˻n˼| (splitting product)
= Σ[n ∈ ℕ]f(n)|X|ⁿ (properties of α and Vec)
Just how 'arbitrary' is this? We are limited not only to integer coefficients by this method, but to natural numbers. Apart from that, f can be anything at all: given a Turing-complete language with dependent types, we can represent any analytic function with natural number coefficients.
I haven't investigated the interaction of this with, for example, the case provided in the question of List X ≅ 1/(1 - X) or what possible sense such negative and non-integer 'types' might have in this context.
Hopefully this answer goes some way to exploring how far we can go with arbitrary type functions.
