Combining two simple assumptions in Lean

I'm trying to construct this proof in Lean:
(P → Q) ∧ (R → ¬Q) → ¬(P ∧ R)
It feels like a simple proof by contradiction:
Assume P and R, the opposite of the conclusion.
Assume P → Q. Since P, Q.
Assume R → ¬Q. Since R, ¬Q.
Q and ¬Q. Contradiction.
Here's what I've got so far in Lean:
example (P Q R : Prop) : (P → Q) ∧ (R → ¬Q) → ¬(P ∧ R) :=
begin
  assume a : (P → Q) ∧ (R → ¬Q),
  assume b : P ∧ R,
  cases a with pq rnq,
  cases b with p r,
  sorry
end
That leaves me with this goal:
P Q R : Prop,
pq : P → Q,
rnq : R → ¬Q,
p : P,
r : R
⊢ false
I feel like I should just be able to somehow combine p and pq to get Q, and combine r and rnq to get ¬Q. But I can't figure out how to do it. If I didn't have the false in the final goal, I could just apply pq p and it would be done.
Ignoring this particular proof, is there a way to combine two simple hypotheses into another simple hypothesis?
Is there a different way to approach this proof? Is my theorem just wrong in some way?

I think the tactic you're missing here is have. The have tactic tells Lean how to construct a new thing from what it already has, and adds it to the stock of resources in the current context. This is what you need to "combine p and pq to get Q".
Since you have pq : P → Q and you have p : P, you can apply pq to p to get a term of Q. This works just like applying a function f : ℕ → ℤ to a term n : ℕ to get a term of ℤ.
So you can continue your proof like this:
example (P Q R : Prop) : (P → Q) ∧ (R → ¬Q) → ¬(P ∧ R) :=
begin
  assume a : (P → Q) ∧ (R → ¬Q),
  assume b : P ∧ R,
  cases a with pq rnq,
  cases b with p r,
  have q : Q := pq p,
  have nq : ¬Q := rnq r,
  exact nq q,
end
On the last line, since we now have q : Q and nq : ¬Q (or equivalently, nq : Q → false), we can apply nq to q to get a term of false. But since that's the goal, we write exact here instead of have.
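As an aside (not part of the original answer), the same combination can be written as a single term-mode proof, using the anonymous projections .1/.2 on the conjunctions:

example (P Q R : Prop) : (P → Q) ∧ (R → ¬Q) → ¬(P ∧ R) :=
λ a b, a.2 b.2 (a.1 b.1)  -- a.2 b.2 : ¬Q applied to a.1 b.1 : Q gives false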


Is there a way to call a function recursively from tactic mode or from match expressions in Lean?

Attempt #1:
def sget' {α : Type} {n : ℕ} (i : ℕ) {h1 : n > 0} {h2 : i < n} (s : sstack α n) : α :=
begin
  cases n with n0 nn,
  begin
    have f : false, from nat.lt_asymm h1 h1,
    tauto,
  end,
  induction s,
  cases s_val,
  begin
    have : stack.empty.size = 0, from @stack_size_0 α,
    tauto,
  end,
  cases i with i0 ri,
  exact s_val_x,
  exact sget' (pred i) s_val_s,
end
Attempt #2:
def sget' {α : Type} {n : ℕ} (i : ℕ) {h1 : n > 0} {h2 : i < n} (s : sstack α n) : α :=
match i, s with
| 0, ⟨stack.push x s, _⟩ := x
| i, ⟨stack.push _ s, _⟩ := sget' (pred i) ⟨s, _⟩
| _, ⟨stack.empty, _⟩ := sorry -- just ignore this
end
Lean in both cases throws an unknown identifier sget' error. I know that I can call sget' recursively from, er, pattern guards (not sure what they are properly called), but is there any way to do something like that with tactics and/or match expressions?
You can do recursive calls if you use the equation compiler
def map (f : α → β) : list α → list β
| [] := []
| (a :: l) := f a :: map l
Otherwise you should use the induction tactic or one of the explicit recursor functions (like nat.rec).
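For instance (an illustrative sketch with a made-up double function, not the sstack code from the question), the same recursion can be written with the equation compiler or with the explicit recursor:

def double : ℕ → ℕ
| 0 := 0
| (n + 1) := double n + 2

-- the same function via the explicit recursor
def double' (n : ℕ) : ℕ :=
nat.rec_on n 0 (λ _ ih, ih + 2)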

Coq extraction to Haskell

I have the following Coq implementation of integer division with remainder.
When I extract it to Haskell everything works fine. I compared the Coq version to the generated Haskell version and tried to understand what's going on. It seems that rewrite is simply removed here,
and what actually steers the extraction are induction, destruct, exists and specialize. Is there any scenario where rewrite is used during extraction? Also, some variable names are kept (like q0 and m0'') but others change (r0 to h); is there any reason to change names? Here is the Coq code followed by the extracted code:
(***********)
(* IMPORTS *)
(***********)
Require Import Coq.Arith.PeanoNat.
Require Import Coq.Structures.OrdersFacts.
Lemma Sn_eq_Sm: forall n m,
(n = m) -> ((S n) = (S m)).
Proof.
intros n m H.
rewrite H.
reflexivity.
Qed.
Lemma Sn_lt_Sm: forall n m,
(n < m) -> ((S n) < (S m)).
Proof.
intros n0 m0 H.
unfold lt in H.
apply Nat.lt_succ_r.
apply H.
Qed.
Lemma add_nSm : forall (n m : nat),
(n + (S m)) = S (n + m).
Proof.
intros n m.
induction n.
- reflexivity.
- simpl.
apply Sn_eq_Sm.
apply IHn.
Qed.
Lemma n_lt_m: forall n m,
((n <? m) = false) -> (m <= n).
Proof.
Admitted.
Lemma n_le_m_le_n: forall n m,
(n <= m) -> ((m <= n) -> (m = n)).
Proof.
Admitted.
Lemma Sn_ge_0: forall n,
0 <= (S n).
Proof.
induction n as [|n' IHn'].
- apply le_S. apply le_n.
- apply le_S. apply IHn'.
Qed.
Lemma n_ge_0: forall n,
0 <= n.
Proof.
induction n as [|n' IHn'].
- apply le_n.
- apply le_S. apply IHn'.
Qed.
Lemma Sn_gt_0: forall n,
0 < (S n).
Proof.
induction n as [|n' IHn'].
- apply le_n.
- apply le_S. apply IHn'.
Qed.
Lemma n_le_m_implies_Sn_le_Sm: forall n m,
(n <= m) -> ((S n) <= (S m)).
Proof.
induction n as [|n' IHn'].
- induction m as [|m' IHm'].
+ intros H1. apply le_n.
+ intros H1. apply le_S.
apply IHm'. apply n_ge_0.
- induction m as [|m' IHm'].
+ intros H1. inversion H1.
+ intros H1. inversion H1.
apply le_n. apply IHm' in H0 as H2.
apply le_S in H2. apply H2.
Qed.
(****************************************)
(* division with quotient and remainder *)
(****************************************)
Definition div_q_r: forall n m : nat,
{ q:nat & { r:nat | (n = q * (S m) + r) /\ (r < (S m))}}.
Proof.
induction n as [|n' IHn'].
- exists 0. exists 0. split. reflexivity. apply Sn_gt_0.
- intros m0.
destruct m0 as [|m0''] eqn:E1.
+ exists (S n'). exists 0. split.
* rewrite Nat.add_0_r with (n:=(S n') * 1).
rewrite Nat.mul_1_r with (n:=(S n')). reflexivity.
* specialize Sn_gt_0 with (n:=0). intros H. apply H.
+ specialize IHn' with (m:=(S m0'')).
destruct IHn' as [q0 H]. destruct H as [r0 H].
destruct (r0 <? (S m0'')) eqn:E2.
* exists q0. exists (S r0). split.
-- rewrite add_nSm with (n:=q0 * S (S m0'')).
apply Sn_eq_Sm. apply proj1 in H as H1. apply H1.
-- apply Nat.ltb_lt in E2. apply Sn_lt_Sm. apply E2.
* exists (S q0). exists 0. split.
-- apply proj2 in H as H2. rewrite Nat.lt_succ_r in H2.
apply n_lt_m in E2. apply n_le_m_le_n in H2.
apply proj1 in H as H1. rewrite H2 in H1. rewrite H1.
rewrite <- add_nSm with (n:=q0 * S (S m0'')) (m:=S m0'').
rewrite Nat.add_0_r.
rewrite Nat.mul_succ_l with (n:=q0) (m:=S (S m0'')).
reflexivity. apply E2.
-- unfold "<". apply n_le_m_implies_Sn_le_Sm. apply Sn_ge_0.
Qed.
(********************************)
(* Extraction Language: Haskell *)
(********************************)
Extraction Language Haskell.
(***************************)
(* Use Haskell basic types *)
(***************************)
Require Import ExtrHaskellBasic.
(****************************************)
(* Use Haskell support for Nat handling *)
(****************************************)
Require Import ExtrHaskellNatNum.
Extract Inductive Datatypes.nat => "Prelude.Integer" ["0" "Prelude.succ"]
"(\fO fS n -> if n Prelude.== 0 then fO () else fS (n Prelude.- 1))".
(***************************)
(* Extract to Haskell file *)
(***************************)
Extraction "/home/oren/GIT/some_file_Haskell.hs" div_q_r.
And here is the extracted Haskell code:
div_q_r :: Prelude.Integer -> Prelude.Integer -> SigT Prelude.Integer
Prelude.Integer
div_q_r n =
nat_rec (\_ -> ExistT 0 0) (\n' iHn' m0 ->
(\fO fS n -> if n Prelude.== 0 then fO () else fS (n Prelude.- 1))
(\_ -> ExistT (Prelude.succ n')
0)
(\m0'' ->
let {iHn'0 = iHn' (Prelude.succ m0'')} in
case iHn'0 of {
ExistT q0 h ->
let {b = ltb h (Prelude.succ m0'')} in
case b of {
Prelude.True -> ExistT q0 (Prelude.succ h);
Prelude.False -> ExistT (Prelude.succ q0) 0}})
m0) n
When you use rewrite, the goal is actually a type (a formula) and the type of this type is often Prop. When this happens, as in your example, the effect of the rewrite tactic is discarded because the part of the term where it took place was discarded.
The extraction tool does not look at tactics: it removes expressions whose type has type Prop from the term that will be executed. The whole system is designed in such a way that these expressions should not have an effect on computation.
In a sense, it is a distinction between compile-time verification and run-time verification. All the proofs that you do in Coq are compile-time verifications, at run-time they don't need to be redone, so they are removed from the code. The Prop sort is used to mark computations that happen only at compile-time and won't have an effect on the execution at run-time.
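As a tiny illustration of that erasure (my own example, not from the question), a hypothesis living in Prop is consumed at compile time and leaves no trace in the extracted program:

(* a minimal sketch: H is a compile-time proof obligation only *)
Definition pred_pos (n : nat) (H : 0 < n) : nat :=
  match n with
  | O => O      (* unreachable given H, but keeps the match total *)
  | S p => p
  end.
(* Running Extraction pred_pos. should yield a Haskell function that takes
   only the number; the Prop argument H is erased. *)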
You can somehow predict the content of the Haskell extracted program by looking at the result of Print div_q_r.
The result contains instances of existT and instances of exist. The type of existT is:
forall (A : Type) (P : A -> Type) (x : A), P x -> {x : A & P x}
The notation {x : A & P x} is for @sigT A P. In turn the type of sigT is
forall A : Type, (A -> Type) -> Type
The type of existT P xx pp is @sigT A P and the type of the latter is Type. In consequence, the extraction tool decides that this term contains data that is important at run time. Moreover, the second component of sigT A P has type P xx, which itself has type Type, so this also is important at run time: it won't be discarded.
Now let's turn our attention to expressions of the form exist _ _. Such an expression has type @sig A P and sig has type:
forall A: Type, (A -> Prop) -> Type
So an expression exist Q y qq contains y whose type has type Type and qq whose type is Q y and has type Prop. Information on how to compute y will be kept at run time, but information on how to compute qq is discarded.
If you want to know where rewrite had an effect in the proof, you only need to look for instances of eq_ind and eq_ind_r in the result of Print div_q_r. You will see that these instances are subterms of the third argument of exist statements. This is the reason why they don't appear in the final result. It is not because the extraction has special treatment of rewrites; it is because it has a special behavior on the type of types Prop (we also call it the sort Prop).
It is possible to construct functions where rewrite leaves a trace in the extraction result, but I am not sure that these functions behave correctly in Haskell. This happens when the formula where the rewrite occurs is not in sort Prop.
Definition nat_type n :=
match n with O => nat | S p => bool end.
Definition strange n : nat_type (n * 0).
rewrite Nat.mul_0_r.
exact n.
Defined.

Representing a theorem with multiple hypotheses in Lean (propositional logic)

Real beginners question here. How do I represent a problem with multiple hypotheses in Lean? For example:
Given
A
A→B
A→C
B→D
C→D
Prove the proposition D.
(Problem taken from The Incredible Proof Machine, Session 2, problem 3. I was actually reading Logic and Proof, Chapter 4, Propositional Logic in Lean, but there are fewer exercises available there.)
Obviously this is completely trivial to prove by applying modus ponens twice, my question is how do I represent the problem in the first place?! Here's my proof:
variables A B C D : Prop
example : (( A )
/\ ( A->B )
/\ ( A->C )
/\ ( B->D )
/\ ( C->D ))
-> D :=
assume h,
have given1: A, from and.left h,
have given2: A -> B, from and.left (and.right h),
have given3: A -> C, from and.left (and.right (and.right h)),
have given4: B -> D, from and.left (and.right (and.right (and.right h))),
have given5: C -> D, from and.right (and.right (and.right (and.right h))),
show D, from given4 (given2 given1)
I think I've made far too much a meal of packaging up the problem then unpacking it, could someone show me a better way of representing this problem please?
I think it is a lot clearer not to use and in the hypotheses, and instead to use ->. Here are two equivalent proofs; I prefer the first:
def s2p3 {A B C D : Prop} (ha : A)
(hab : A -> B) (hac : A -> C)
(hbd : B -> D) (hcd : C -> D) : D
:= show D, from (hbd (hab ha))
The second is the same as the first except that it uses example; I believe you then have to name the parameters with assume rather than in the declaration itself.
example : A -> (A -> B) -> (A -> C) -> (B -> D) -> (C -> D) -> D :=
assume ha : A,
assume hab : A -> B,
assume hac, -- You can actually just leave the types off the above 2
assume hbd,
assume hcd,
show D, from (hbd (hab ha))
If you want to use the def syntax but the problem is specified using example syntax, you can do:
example : A -> (A -> B) -> (A -> C)
-> (B -> D) -> (C -> D) -> D := s2p3
Also, when using and in your proof, in the unpacking stage you unpack given3 and given5 but never use them in your show step, so you don't need to unpack them, e.g.
example : (( A )
/\ ( A->B )
/\ ( A->C )
/\ ( B->D )
/\ ( C->D ))
-> D :=
assume h,
have given1: A, from and.left h,
have given2: A -> B, from and.left (and.right h),
have given4: B -> D, from and.left (and.right (and.right (and.right h))),
show D, from given4 (given2 given1)
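A still shorter variant (my own, not from the answer above) keeps the conjunction but uses the anonymous projections .1/.2 instead of named have steps:

example : (( A )
/\ ( A->B )
/\ ( A->C )
/\ ( B->D )
/\ ( C->D ))
-> D :=
assume h,
show D, from h.2.2.2.1 (h.2.1 h.1)  -- h.2.1 : A → B and h.2.2.2.1 : B → D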

Red Black Trees: Kahrs version

I'm currently trying to understand the Red-Black tree implementation as given by Okasaki and delete methods by Kahrs (the untyped version).
In the delete implementation a function app is used, which seems to merge the children of the node being deleted. And again, the algorithm seems to prefer breaking the red-red property rather than the black height (please correct me if I'm wrong). We are always creating red nodes (even if we break the red-red property): we walk down the subtree rooted at the node being deleted, and once we reach the leaves we start going back up the path (starting at the "new tree" the merge created), fixing the red-red violations along the way.
app :: RB a -> RB a -> RB a
app E x = x
app x E = x
app (T R a x b) (T R c y d) =
  case app b c of
    T R b' z c' -> T R (T R a x b') z (T R c' y d)
    bc -> T R a x (T R bc y d)
app (T B a x b) (T B c y d) =
  case app b c of
    T R b' z c' -> T R (T B a x b') z (T B c' y d)
    bc -> balleft a x (T B bc y d)
app a (T R b x c) = T R (app a b) x c
app (T R a x b) c = T R a x (app b c)
I'm not able to see how we are not creating (or how we are fixing) a black-height violation: deleting a black node would create subtrees of black height bh-1 and bh at some node up the path.
Judging from the results in this paper, it looks like this implementation is really fast, and the "merge" method might be the key to the increase in speed.
Any pointers to an explanation of this "merge" operation would be great.
Please note this is not a homework problem or anything else; I'm independently studying the implementations given in Okasaki and filling in the "messy" deletes too.
Thanks.
Given that there is a lot that can be said on this topic, I'll try to stick as closely as possible to your questions, but let me know if I missed something important.
What the hell is app doing?
You are correct in that app breaks the red invariant rather than the black one on the way down and fixes this on the way back up.
It appends or merges two subtrees that obey the order property, black invariant, red invariant, and have the same black-depth into a new tree that also obeys the order property, black invariant, and red invariant. The one notable exception is that the root of app rb1 rb2 sometimes is red and has two red subtrees. Such trees are said to be "infrared". This is dealt with in delete by just setting the root to be black in this line.
case del t of {T _ a y b -> T B a y b; _ -> E}
Claim 1 (Order property) if the inputs rb1 and rb2 obey the order property individually (left subtree < node value < right subtree) and the max value in rb1 is less than the min value in rb2, then app rb1 rb2 also obeys the order property.
This one is easy to prove. In fact, you can even sort of see it when looking at the code - the as always stay to the left of the bs or b's, which always stay to the left of the cs or c's, which always stay to the left of the ds. And the b's and c's also obey this property since they are the result of recursive calls to app with subtrees b and c satisfying the claim.
Claim 2 (Red invariant) if the inputs rb1 and rb2 obey the red invariant (if a node is red, then both its children are black), then so do all the nodes in app rb1 rb2, except for possibly the root. However, the root can be "infrared" only when one of its arguments has a red root.
Proof The proof is by branching on the patterns.
For cases app E x = x and app x E = x the claim is trivial
For app (T R a x b) (T R c y d), the claim hypothesis tells us all of a, b, c, and d are black. It follows that app b c obeys the red invariant fully (it is not infrared).
If app b c matches T R b' z c' then its subtrees b' and c' must be black (and obey the red invariant), so T R (T R a x b') z (T R c' y d) obeys the red-invariant with an infrared root.
Otherwise, app b c produced a black node bc, so T R a x (T R bc y d) obeys the red invariant.
For app (T B a x b) (T B c y d) we only care that app b c will itself obey the red-invariant
If app b c is a red node, it can be infrared but its subtrees b' and c', on the other hand, must obey the red invariant completely. That means T R (T B a x b') z (T B c' y d) also obeys the red invariant.
Now, if bc turns out to be black, we call balleft a x (T B bc y d). The neat thing here is that we already know which two cases of balleft can be triggered: depending on whether a is red or black
balleft (T R a x b) y c = T R (T B a x b) y c
balleft bl x (T B a y b) = balance bl x (T R a y b)
In the first case, what ends up happening is that we swap the color of the left subtree from red to black (and doing so never breaks the red-invariant) and put everything in a red subtree. Then balleft a x (T B bc y d) actually looks like T R (T B .. ..) (T B bc y d), and that obeys the red invariant.
Otherwise, the call to balleft turns into balance a x (T R bc y d), and the whole point of balance is that it fixes root-level red violations (see the sketch after this proof).
For app a (T R b x c) we know that a and b must be black, so app a b is not infrared, so T R (app a b) x c obeys the red invariant. The same holds for app (T R a x b) c but with the letters permuted.
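For reference, here is a sketch of what such a balance can look like, written in the Okasaki style; this is a reconstruction for illustration, not a quotation of Kahrs' actual definition, which may differ in details:

-- assumed representation, matching the patterns used above
data Color = R | B
data RB a = E | T Color (RB a) a (RB a)

-- any red-red violation among the root's children or grandchildren is rotated
-- into a red node with two black children; otherwise the node is rebuilt black
balance :: RB a -> a -> RB a -> RB a
balance (T R a x b) y (T R c z d) = T R (T B a x b) y (T B c z d)
balance (T R (T R a x b) y c) z d = T R (T B a x b) y (T B c z d)
balance (T R a x (T R b y c)) z d = T R (T B a x b) y (T B c z d)
balance a x (T R b y (T R c z d)) = T R (T B a x b) y (T B c z d)
balance a x (T R (T R b y c) z d) = T R (T B a x b) y (T B c z d)
balance a x b = T B a x b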
Claim 3 (Black invariant) if the inputs rb1 and rb2 obey the black invariant (any path from the root to the leaves has the same number of black nodes on the way) and have the same black-depth, app rb1 rb2 also obeys the black invariant and has the same black-depth.
Proof The proof is by branching on the patterns.
For cases app E x = x and app x E = x the claim is trivial
For app (T R a x b) (T R c y d) we know that since T R a x b and T R c y d have the same black depth, so do their subtrees a, b, c, and d. By the claim (remember, this is induction!) app b c will also obey the black invariant and have the same black depth. Now, we branch our proof on case app b c of ...
If app b c matches T R b' z c' it is red and its subtrees b' and c' will have the same black depth as app b c (itself), which in turn has the same black depth as a and d, so T R (T R a x b') z (T R c' y d) obeys the black invariant and has the same black depth as its inputs.
Otherwise, app b c produced a black node bc, but again that node has the same black depth as a and d, so T R a x (T R bc y d) as a whole still obeys the black invariant and has the same black depth as its inputs.
For app (T B a x b) (T B c y d) we again know immediately that subtrees a, b, c, and d all have the same black depth as app b c. As before, we branch our proof on case app b c of ...
If app b c is a red node of the form T R b' z c', we again get that b', c', a, and d have the same black-depth, so T R (T B a x b') z (T B c' y d) also has this same black depth
Now, if bc turns out to be black, we apply the same reasoning as in our previous claim to figure out that the output balleft a x (T B bc y d) actually has the form either
T R (T B .. ..) (T B bc y d), where (T B .. ..) is just a recolored black, so the overall tree will satisfy the black invariant and will have black-depth one more than any of a, b, c, or d, which is to say the same black-depth as the inputs T B a x b and T B c y d.
balance a x (T R bc y d) and balance maintains the black invariant.
For app a (T R b x c) or app (T R a x b) c, we know that a, b, and c all have the same black-depth, which means app a b and app b c have this same black-depth, which means T R (app a b) x c and T R a x (app b c) also have this same black-depth.
Why is it fast?
My Racket is a bit rusty so I don't have a great answer to this. In general, app makes delete fast by allowing you to do everything in two steps: you go down to the target site, then you continue downwards to merge the subtrees, then you come back up fixing the invariants as you go, all the way to the root.
In the paper you reference, once you get down to the target site, you call min/delete (I think this is the key difference - the rotations otherwise look pretty similar) which will recursively call itself to find the element in the subtree that you can plop into the target site and the state of the subtree after that element has been taken out. I believe that last part is what hurts the performance of that implementation.

Why do we need containers?

(As an excuse: the title mimics the title of Why do we need monads?)
There are containers [1] (and indexed ones [2]) (and hasochistic ones [3]) and descriptions [4]. But containers are problematic [5], and in my very limited experience it's harder to think in terms of containers than in terms of descriptions. The type of non-indexed containers is isomorphic to Σ — that's rather too unspecific. The shapes-and-positions description helps, but in
⟦_⟧ᶜ : ∀ {α β γ} -> Container α β -> Set γ -> Set (α ⊔ β ⊔ γ)
⟦ Sh ◃ Pos ⟧ᶜ A = ∃ λ sh -> Pos sh -> A
Kᶜ : ∀ {α β} -> Set α -> Container α β
Kᶜ A = A ◃ const (Lift ⊥)
we are essentially using Σ rather than shapes and positions.
The type of strictly-positive free monads over containers has a rather straightforward definition, but the type of Freer monads looks simpler to me (and in a sense Freer monads are even better than usual Free monads as described in the paper [6]).
So what can we do with containers in a nicer way than with descriptions or something else?
References
Abbott, Michael, Thorsten Altenkirch, and Neil Ghani. "Containers: Constructing strictly positive types." Theoretical Computer Science 342, no. 1 (2005): 3-27.
Altenkirch, Thorsten, Neil Ghani, Peter Hancock, Conor McBride, and Peter Morris. "Indexed containers." Journal of Functional Programming 25 (2015): e5. doi:10.1017/S095679681500009X.
McBride, Conor. "hasochistic containers (a first attempt)." June 2015.
Chapman, James, Pierre-Evariste Dagand, Conor McBride, and Peter Morris. "The gentle art of levitation." In ICFP 2010, pp. 3-14. 2010.
Francesco. "W-types: good news and bad news." March 2010.
Kiselyov, Oleg, and Hiromi Ishii. "Freer monads, more extensible effects." In 8th ACM SIGPLAN Symposium on Haskell (Haskell 2015), pp. 94-105. ACM, 2015.
To my mind, the value of containers (as in container theory) is their uniformity. That uniformity gives considerable scope to use container representations as the basis for executable specifications, and perhaps even machine-assisted program derivation.
Containers: a theoretical tool, not a good run-time data representation strategy
I would not recommend fixpoints of (normalized) containers as a good general purpose way to implement recursive data structures. That is, it is helpful to know that a given functor has (up to iso) a presentation as a container, because it tells you that generic container functionality (which is easily implemented, once for all, thanks to the uniformity) can be instantiated to your particular functor, and what behaviour you should expect. But that's not to say that a container implementation will be efficient in any practical way. Indeed, I generally prefer first-order encodings (tags and tuples, rather than functions) of first-order data.
To fix terminology, let us say that the type Cont of containers (on Set, but other categories are available) is given by a constructor <| packing two fields, shapes and positions
S : Set
P : S -> Set
(This is the same signature of data which is used to determine a Sigma type, or a Pi type, or a W type, but that does not mean that containers are the same as any of these things, or that these things are the same as each other.)
The interpretation of such a thing as a functor is given by
[_]C : Cont -> Set -> Set
[ S <| P ]C X = Sg S \ s -> P s -> X -- I'd prefer (s : S) * (P s -> X)
mapC : (C : Cont){X Y : Set} -> (X -> Y) -> [ C ]C X -> [ C ]C Y
mapC (S <| P) f (s , k) = (s , f o k) -- o is composition
And we're already winning. That's map implemented once for all. What's more, the functor laws hold by computation alone. There is no need for recursion on the structure of types to construct the operation, or to prove the laws.
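As a tiny illustration (mine, not part of the original answer; One, Two and a negation not : Two -> Two are assumed, in the same informal notation): the functor sending X to pairs of X is the container with one shape and two positions, so mapC already hands us its map, and element-free data determine polymorphic operations such as swapping.

PairC : Cont
PairC = One <| \ _ -> Two

swapC : {X : Set} -> [ PairC ]C X -> [ PairC ]C X
swapC (s , k) = (s , k o not)  -- exchange the two positions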
Descriptions are denormalized containers
Nobody is attempting to claim that, operationally, Nat <| Fin gives an efficient implementation of lists, just that by making that identification we learn something useful about the structure of lists.
Let me say something about descriptions. For the benefit of lazy readers, let me reconstruct them.
data Desc : Set1 where
var : Desc
sg pi : (A : Set)(D : A -> Desc) -> Desc
one : Desc -- could be Pi with A = Zero
_*_ : Desc -> Desc -> Desc -- could be Pi with A = Bool
con : Set -> Desc -- constant descriptions as singleton tuples
con A = sg A \ _ -> one
_+_ : Desc -> Desc -> Desc -- disjoint sums by pairing with a tag
S + T = sg Two \ { true -> S ; false -> T }
Values in Desc describe functors whose fixpoints give datatypes. Which functors do they describe?
[_]D : Desc -> Set -> Set
[ var ]D X = X
[ sg A D ]D X = Sg A \ a -> [ D a ]D X
[ pi A D ]D X = (a : A) -> [ D a ]D X
[ one ]D X = One
[ D * D' ]D X = Sg ([ D ]D X) \ _ -> [ D' ]D X
mapD : (D : Desc){X Y : Set} -> (X -> Y) -> [ D ]D X -> [ D ]D Y
mapD var f x = f x
mapD (sg A D) f (a , d) = (a , mapD (D a) f d)
mapD (pi A D) f g = \ a -> mapD (D a) f (g a)
mapD one f <> = <>
mapD (D * D') f (d , d') = (mapD D f d , mapD D' f d')
We inevitably have to work by recursion over descriptions, so it's harder work. The functor laws, too, do not come for free. We get a better representation of the data, operationally, because we don't need to resort to functional encodings when concrete tuples will do. But we have to work harder to write programs.
Note that every container has a description:
sg S \ s -> pi (P s) \ _ -> var
But it's also true that every description has a presentation as an isomorphic container.
ShD : Desc -> Set
ShD D = [ D ]D One
PosD : (D : Desc) -> ShD D -> Set
PosD var <> = One
PosD (sg A D) (a , d) = PosD (D a) d
PosD (pi A D) f = Sg A \ a -> PosD (D a) (f a)
PosD one <> = Zero
PosD (D * D') (d , d') = PosD D d + PosD D' d'
ContD : Desc -> Cont
ContD D = ShD D <| PosD D
That's to say, containers are a normal form for descriptions. It's an exercise to show that [ D ]D X is naturally isomorphic to [ ContD D ]C X. That makes life easier, because to say what to do for descriptions, it's enough in principle to say what to do for their normal forms, containers. The above mapD operation could, in principle, be obtained by fusing the isomorphisms to the uniform definition of mapC.
Differential structure: containers show the way
Similarly, if we have a notion of equality, we can say what one-hole contexts are for containers uniformly
_-[_] : (X : Set) -> X -> Set
X -[ x ] = Sg X \ x' -> (x == x') -> Zero
dC : Cont -> Cont
dC (S <| P) = (Sg S P) <| (\ { (s , p) -> P s -[ p ] })
That is, the shape of a one-hole context in a container is the pair of the shape of the original container and the position of the hole; the positions are the original positions apart from that of the hole. That's the proof-relevant version of "multiply by the index, decrement the index" when differentiating power series.
This uniform treatment gives us the specification from which we can derive the centuries-old program to compute the derivative of a polynomial.
dD : Desc -> Desc
dD var = one
dD (sg A D) = sg A \ a -> dD (D a)
dD (pi A D) = sg A \ a -> (pi (A -[ a ]) \ { (a' , _) -> D a' }) * dD (D a)
dD one = con Zero
dD (D * D') = (dD D * D') + (D * dD D')
How can I check that my derivative operator for descriptions is correct? By checking it against the derivative of containers!
Don't fall into the trap of thinking that just because a presentation of some idea is not operationally helpful that it cannot be conceptually helpful.
On "Freer" monads
One last thing. The Freer trick amounts to rearranging an arbitrary functor in a particular way (switching to Haskell)
data Obfuncscate f x where
(:<) :: forall p. f p -> (p -> x) -> Obfuncscate f x
but this is not an alternative to containers. This is a slight currying of a container presentation. If we had strong existentials and dependent types, we could write
data Obfuncscate f x where
(:<) :: pi (s :: exists p. f p) -> (fst s -> x) -> Obfuncscate f x
so that (exists p. f p) represents shapes (where you can choose your representation of positions, then mark each place with its position), and fst picks out the existential witness from a shape (the position representation you chose). It has the merit of being obviously strictly positive exactly because it's a container presentation.
In Haskell, of course, you have to curry out the existential, which fortunately leaves a dependency only on the type projection. It's the weakness of the existential which justifies the equivalence of Obfuncscate f and f. If you try the same trick in a dependent type theory with strong existentials, the encoding loses its uniqueness because you can project and tell apart different choices of representation for positions. That is, I can represent Just 3 by
Just () :< const 3
or by
Just True :< \ b -> if b then 3 else 5
and in Coq, say, these are provably distinct.
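For what it's worth, here is the Haskell side of that equivalence spelled out (my own sketch; the names toObf and fromObf are not from the answer): the weak existential means you can never observe which representation of positions was chosen.

{-# LANGUAGE GADTs #-}
-- repeating the declaration from above so the sketch stands alone
data Obfuncscate f x where
  (:<) :: forall p. f p -> (p -> x) -> Obfuncscate f x

toObf :: f x -> Obfuncscate f x
toObf fx = fx :< id

fromObf :: Functor f => Obfuncscate f x -> f x
fromObf (fp :< k) = fmap k fp
-- fromObf (toObf fx) = fmap id fx = fx; the other round trip holds only up to
-- the unobservable choice of p, which is exactly the point made above.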
Challenge: characterizing polymorphic functions
Every polymorphic function between container types is given in a particular way. There's that uniformity working to clarify our understanding again.
If you have some
f : {X : Set} -> [ S <| P ]C X -> [ S' <| P' ]C X
it is (extensionally) given by the following data, which make no mention of elements whatsoever:
toS : S -> S'
fromP : (s : S) -> P' (toS s) -> P s
f (s , k) = (toS s , k o fromP s)
That is, the only way to define a polymorphic function between containers is to say how to translate input shapes to output shapes, then say how to fill output positions from input positions.
For your preferred representation of strictly positive functors, give a similarly tight characterisation of the polymorphic functions which eliminates abstraction over the element type. (For descriptions, I would use exactly their reducibility to containers.)
Challenge: capturing "transposability"
Given two functors, f and g, it is easy to say what their composition f o g is: (f o g) x wraps up things in f (g x), giving us "f-structures of g-structures". But can you readily impose the extra condition that all of the g structures stored in the f structure have the same shape?
Let's say that f >< g captures the transposable fragment of f o g, where all the g shapes are the same, so that we can just as well turn the thing into a g-structure of f-structures. E.g., while [] o [] gives ragged lists of lists, [] >< [] gives rectangular matrices; [] >< Maybe gives lists which are either all Nothing or all Just.
Give >< for your preferred representation of strictly positive functors. For containers, it's this easy.
(S <| P) >< (S' <| P') = (S * S') <| \ { (s , s') -> P s * P' s' }
Conclusion
Containers, in their normalized Sigma-then-Pi form, are not intended to be an efficient machine representation of data. But the knowledge that a given functor, implemented however, has a presentation as a container helps us understand its structure and give it useful equipment. Many useful constructions can be given abstractly for containers, once for all, when they must be given case-by-case for other presentations. So, yes, it is a good idea to learn about containers, if only to grasp the rationale behind the more specific constructions you actually implement.
