Pattern matching in Observational Type Theory - haskell

At the end of the "5. Full OTT" section of Towards Observational Type Theory, the authors show how to define coercible-under-constructors indexed data types in OTT. The idea is basically to turn indexed data types into parameterized ones, like this:
data IFin : ℕ -> Set where
  zero : ∀ {n} -> IFin (suc n)
  suc  : ∀ {n} -> IFin n -> IFin (suc n)

data PFin (m : ℕ) : Set where
  zero : ∀ {n} -> suc n ≡ m -> PFin m
  suc  : ∀ {n} -> suc n ≡ m -> PFin n -> PFin m
Conor also mentions this technique at the bottom of observational type theory (delivery):
The fix, of course, is to do what the GADT people did, and define
inductive families explicitly up to propositional equality. And then of
course you can transport them, by transitivity.
However, Haskell's type checker is aware of equality constraints in scope and actually uses them during type checking. E.g. we can write
f :: a ~ b => a -> b
f x = x
It doesn't work like that in type theory: it's not enough to have a proof of a ~ b in scope to be able to rewrite by this equation; that proof must also be refl, because in the presence of a false hypothesis type checking becomes undecidable due to termination issues (something like this). So when you pattern match on Fin m in Haskell, m gets rewritten to suc n in each branch, but that can't happen in type theory; instead you're left with an explicit proof of suc n ~ m. In OTT it's not possible to pattern match on proofs at all, hence you can neither pretend the proof is refl nor actually require that. It's only possible to supply the proof to coerce or just ignore it.
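To make the contrast concrete, here is a rough Haskell sketch of the two shapes (hypothetical names, assuming GHC's DataKinds and GADTs): in the indexed form, matching on a constructor rewrites the index for you; in the parameterized form, the index equation is ordinary data that you must hold and eliminate yourself, which is the situation OTT forces on you.
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeOperators #-}
import Data.Type.Equality ((:~:) (..))

data Nat = Z | S Nat

-- indexed: matching on ISuc rewrites m to 'S n in that branch
data IFin (m :: Nat) where
  IZero :: IFin ('S n)
  ISuc  :: IFin n -> IFin ('S n)

-- parameterized: the equation 'S n :~: m is an explicit field;
-- nothing gets rewritten unless you use the proof yourself
data PFin (m :: Nat) where
  PZero :: ('S n :~: m) -> PFin m
  PSuc  :: ('S n :~: m) -> PFin n -> PFin m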
This makes it very hard to write anything that involves indexed data types. E.g. the usual three-line lookup for vectors (including the type signature) becomes this beast:
vlookupₑ : ∀ {n m a} {α : Level a} {A : Univ α} -> ⟦ n ≅ m ⇒ fin n ⇒ vec A m ⇒ A ⟧
vlookupₑ p (fzeroₑ q) (vconsₑ r x xs) = x
vlookupₑ {n} {m} p (fsucₑ {n′} q i) (vconsₑ {m′} r x xs) =
  vlookupₑ (left (suc n′) {m} {suc m′} (trans (suc n′) {n} {m} q p) r) i xs
vlookupₑ {n} {m} p (fzeroₑ {n′} q) (vnilₑ r) =
  ⊥-elim $ left (suc n′) {m} {0} (trans (suc n′) {n} {m} q p) r
vlookupₑ {n} {m} p (fsucₑ {n′} q i) (vnilₑ r) =
  ⊥-elim $ left (suc n′) {m} {0} (trans (suc n′) {n} {m} q p) r

vlookup : ∀ {n a} {α : Level a} {A : Univ α} -> Fin n -> Vec A n -> ⟦ A ⟧
vlookup {n} = vlookupₑ (refl n)
It could be simplified a bit: if two elements of a data type with decidable equality are observably equal, then they are also equal in the usual intensional sense, and natural numbers do have decidable equality, so we could coerce all the equations to their intensional counterparts and pattern match on them. But that would break some computational properties of vlookup and is verbose anyway. And in more complicated cases, with indices whose equality cannot be decided, it's nearly impossible to cope.
Is my reasoning correct? How is pattern matching in OTT meant to work? If this is a problem indeed, are there any ways to mitigate it?

I guess I'll field this one. I find it a strange question, but that's because of my own particular journey. The short answer is: don't do pattern matching in OTT, or in any kernel type theory. Which is not the same thing as never doing pattern matching.
The long answer is basically my PhD thesis.
In my PhD thesis, I show how to elaborate high-level programs written in a pattern matching style into a kernel type theory which has only the induction principles for inductive datatypes and a suitable treatment of propositional equality. The elaboration of pattern matching introduces propositional equations on datatype indices, then solves them by unification. Back then, I was using an intensional equality, but observational equality gives you at least the same power. That is: my technology for elaborating pattern matching (and thus keeping it out of the kernel theory), hiding all the equational jiggery-pokery, predates the upgrade to observational equality. The ghastly vlookup you've used to illustrate your point might correspond to the output of the elaboration process, but the input need not be that bad. The nice definition
vlookup : Fin n -> Vec X n -> X
vlookup fz (vcons x xs) = x
vlookup (fs i) (vcons x xs) = vlookup i xs
elaborates just fine. The equation-solving that happens along the way is just the same equation-solving that Agda does at the meta-level when checking a definition by pattern matching, or that Haskell does. Don't be fooled by programs like
f :: a ~ b => a -> b
f x = x
In kernel Haskell, that elaborates to some sort of
f {q} x = coerce q x
but it's not in your face. And it's not in compiled code, either. OTT equality proofs, like Haskell equality proofs, can be erased before computing with closed terms.
Digression. To be clear about the status of equality data in Haskell, the GADT
data Eq :: k -> k -> * where
  Refl :: Eq x x
really gives you
Refl :: x ~ y -> Eq x y
but because the type system is not logically sound, type safety relies on strict pattern matching on that type: you can't erase Refl and you really must compute it and match it at run time, but you can erase the data corresponding to the proof of x~y. In OTT, the entire propositional fragment is proof-irrelevant for open terms and erasable for closed computation. End of digression.
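Spelled out in code, the digression amounts to this small sketch (the GADT is renamed EqT here to dodge the clash with Prelude's Eq class):
{-# LANGUAGE GADTs #-}

data EqT x y where
  ReflT :: EqT x x

-- the match on ReflT must really happen at run time (it cannot be erased),
-- but the x ~ y evidence it brings into scope leaves no run-time trace
castT :: EqT x y -> x -> y
castT ReflT v = v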
The decidability of equality on this or that datatype is not especially relevant (at least, not if you have uniqueness of identity proofs; if you don't always have UIP, decidability is one way to get it sometimes). The equational problems which show up in pattern matching are on arbitrary open expressions. That's a lot of rope. But a machine can certainly decide the fragment which consists of first-order expressions built from variables and fully applied constructors (and that's what Agda does when you split cases: if the constraints are too weird, the thing just barfs). OTT should allow us to push a bit further into the decidable fragments of higher-order unification. If you know (forall x. f x = t[x]) for unknown f, that's equivalent to f = \ x -> t[x].
So, "no pattern matching in OTT" has always been a deliberate design choice, as we always intended it to be an elaboration target for a translation we already knew how to do. Rather, it's a strict upgrade in kernel theory power.

Related

Why does Haskell 9.0 not have Zero in its linear types, but Idris 2 does?

From the Idris 2 publication about linear types "Idris 2: Quantitative Type Theory in Practice":
For Idris 2, we make a concrete choice of semiring, where a multiplicity can be one of:
0: the variable is not used at run time
1: the variable is used exactly once at run time
ω: no restrictions on the variable’s usage at run time
But for Haskell:
In the fashion of levity polymorphism, the proposal introduces a data type Multiplicity which is treated specially by the type checker, to represent the multiplicities:
data Multiplicity
  = One  -- represents 1
  | Many -- represents ω
Why didn't they add a Zero?
In Idris, the value of a function argument can appear in the return type. You might write a function with type
vecLen : (n : Nat) -> Vect n a -> (m : Nat ** n = m)
This says "vecLen is a function that takes a Nat (call it n) and a Vect of length n and element type a and returns a Nat (call it m) such that n = m". The intent is to walk the linked list Vect and come up with its length as a runtime Nat value. But the issue is that vecLen requires you to already have the length of the vector as a runtime value, because that's the very n argument we're talking about! I.e. it's possible to implement just
vecLen n _ = (n ** Refl) -- does no real work
You can't not take n as an argument, because the Vect type needs its length as an argument to make sense. This is a somewhat serious problem, because having both n and a Vect n a duplicates information—without intervention the "classic" definition of Vect n a itself is in fact space-quadratic in n! So we need a way to take an argument that we know "in principle exists" but may not "actually exist" at runtime. That's what the zero multiplicity is for.
vecLen : (0 n : Nat) -> Vect n a -> (m : Nat ** n = m)
-- old definition invalid; you MUST iterate over the Vect to collect information on n (which is what we want!)
vecLen _ [] = (0 ** Refl)
vecLen _ (_ :: xs) with (vecLen _ xs)
  vecLen _ (_ :: xs) | (n ** Refl) = (S n ** Refl)
This is the use of the zero multiplicity in Idris: it lets you talk about values that may not exist yet/anymore/ever inside of types so you can reason about them etc. (I believe in one talk Edwin Brady did, he used or at least noted you could use zero multiplicities to reason about the previous state of a mutable resource.)
But Haskell does not quite have this kind of dependent typing... so there just isn't a use case for zero multiplicities. From another point of view, the zero multiplicity form of f :: a -> b is just f :: forall (x :: a). b, because Haskell is wholesale type-erased in a way that Idris and other dependently typed languages cannot easily be. (In this sense, IMO, I think Haskell has the whole dependent type erasure problem that other languages have such trouble with neatly solved—at the cost of having to use an awkward encoding for everything else about dependent types!)
In the latter sense, the "dependent" Haskell (i.e. souped-up with singletons) way to take a dependent, unrestricted argument is this:
-- vvvvvv + vvvvvvvv duplicate information!
vecLen :: SNat n -> Vect n a -> SNat n
vecLen n _ = n -- allowed, ergh!
and the "zero multiplicity" version is just
vecLen :: Vect n a -> SNat n
-- must walk the Vect; cannot match on erased types like n
vecLen Nil           = SZ
vecLen (_ `Cons` xs) = SS (vecLen xs)
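A minimal, self-contained sketch of the definitions these snippets assume (Nat, SNat, and Vect as the usual singleton-style GADT encodings; the names are assumptions, not the singletons library's exact API):
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- promoted naturals and their run-time singleton
data Nat = Z | S Nat

data SNat (n :: Nat) where
  SZ :: SNat 'Z
  SS :: SNat n -> SNat ('S n)

-- length-indexed linked lists
data Vect (n :: Nat) a where
  Nil  :: Vect 'Z a
  Cons :: a -> Vect n a -> Vect ('S n) a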

Design options for constructor constraints: GADT compare PatternSynonym Required

(This is a follow-up to this answer, trying to make the question more precise.)
Use Case: constructors to build/access a Set datatype. Being a set, the invariant is 'no duplicates'. To implement that I need an Eq constraint on the element type. (More realistically, the set might be implemented as a BST or hash-index, which will need a more restrictive constraint; I'm using Eq here to keep it simple.)
I want to disallow building even an empty set with an unacceptable type.
"...it's now considered bad practice to require a constraint for an operation (data type construction or destruction) that does not need the constraint. Instead, the constraints should be moved closer to the 'usage site'", to quote that answer.
OK so there's no need to get the constraint 'built into' the data structure. And operations that access/deconstruct (like showing or counting elements) won't necessarily need Eq.
Then consider two (or rather five) possible designs:
(I'm aware some of the constraints could be achieved via deriving, esp Foldable to get elem. But I'll hand-code here, so I can see what minimal constraints GHC wants.)
Option 1: No constraints on datatype
data NoCSet a where -- no constraints on the datatype
  NilSet_  :: NoCSet a
  ConsSet_ :: a -> NoCSet a -> NoCSet a
Option 1a. use PatternSynonym as 'smart constructor'
pattern NilSet :: (Eq a) => () => NoCSet a
pattern NilSet = NilSet_

pattern ConsSet x xs <- ConsSet_ x xs where
  ConsSet x xs | not (elemS x xs) = ConsSet_ x xs

elemS x NilSet_ = False
elemS x (ConsSet_ y ys) | x == y    = True -- infers (Eq a) for elemS
                        | otherwise = elemS x ys
GHC infers the constraint elemS :: Eq t => t -> NoCSet t -> Bool. But it doesn't infer a constraint for ConsSet that uses it. Rather, it rejects that definition:
* No instance for (Eq a) arising from a use of `elemS'
  Possible fix:
    add (Eq a) to the "required" context of
      the signature for pattern synonym `ConsSet'
Ok I'll do that, with an explicitly empty 'Provided' constraint:
pattern ConsSet :: (Eq a) => () => a -> NoCSet a -> NoCSet a -- Required => Provided => type; Provided is empty, so omittable
Consequently inferred type (\(ConsSet x xs) -> x) :: Eq a => NoCSet a -> a, so the constraint 'escapes' from the destructor (also from elemS), whether or not I need it at the "usage site".
Option 1b. Pattern synonym wrapping a GADT constructor as 'smart constructor'
data CSet a where CSet :: Eq a => NoCSet a -> CSet a -- from comments to the earlier q

pattern NilSetC = CSet NilSet_ -- inferred Eq a Provided

pattern ConsSetC x xs <- CSet (ConsSet_ x xs) where -- also inferred Eq a Provided
  ConsSetC x xs | not (elemS x xs) = CSet (ConsSet_ x xs)
GHC doesn't complain about the lack of signature; it infers pattern ConsSetC :: () => Eq a => a -> NoCSet a -> CSet a, a Provided constraint but an empty Required one.
Inferred (\(ConsSetC x xs) -> x) :: CSet p -> p, so the constraint doesn't escape from a "usage site".
But there's a bug: to Cons an element, I need to unwrap the NoCSet inside the CSet in the tail then re-wrap a CSet. And trying to do that with the ConsSetC pattern alone is ill typed. Instead:
insertCSet x (CSet xs) = ConsSetC x xs -- ConsSetC on rhs, to check for duplicates
As 'smart constructors' go, that's dumb. What am I doing wrong?
Inferred insertCSet :: a -> CSet a -> CSet a, so again the constraint doesn't escape.
Option 1c. Pattern synonym wrapping a GADT constructor as 'smarter constructor'
Same setup as option 1b, except with this monster of a ViewPattern for the Cons pattern:
pattern ConsSetC2 x xs <- ((\(CSet (ConsSet_ x' xs')) -> (x', CSet xs')) -> (x, xs)) where
  ConsSetC2 x (CSet xs) | not (elemS x xs) = CSet (ConsSet_ x xs)
GHC doesn't complain about the lack of signature; it infers pattern ConsSetC2 :: a -> CSet a -> CSet a, with no constraint at all. I'm nervous. But it does correctly reject attempts to build a set with duplicates.
Inferred (\(ConsSetC2 x xs) -> x) :: CSet a -> a, so the constraint that isn't there doesn't escape from a "usage site".
Edit: ah, I can get a somewhat less monstrous ViewPattern expression to work
pattern ConsSetC3 x xs <- (CSet (ConsSet_ x (CSet -> xs))) where
  ConsSetC3 x (CSet xs) | not (elemS x xs) = CSet (ConsSet_ x xs)
Curiously, the inferred signature is pattern ConsSetC3 :: () => Eq a => a -> CSet a -> CSet a, so the Provided constraint is visible, unlike with ConsSetC2, even though they're morally equivalent. It does reject attempts to build a set with duplicates.
Inferred (\(ConsSetC3 x xs) -> x) :: CSet p -> p, so the constraint that is there doesn't escape from "usage sites".
Option 2: GADT constraints on datatype
data GADTSet a where
  GADTNilSet  :: Eq a => GADTSet a
  GADTConsSet :: Eq a => a -> GADTSet a -> GADTSet a

elemG x GADTNilSet = False
elemG x (GADTConsSet y ys) | x == y    = True -- no (Eq a) 'escapes'
                           | otherwise = elemG x ys
GHC infers no visible constraint elemG :: a -> GADTSet a -> Bool; (\(GADTConsSet x xs) -> x) :: GADTSet p -> p.
Option 2a. use PatternSynonym as 'smart constructor' for the GADT
pattern ConsSetG x xs <- GADTConsSet x xs where
  ConsSetG x xs | not (elemG x xs) = GADTConsSet x xs -- does infer Provided (Eq a) for ConsSetG
GHC doesn't complain about the lack of signature; it infers pattern ConsSetG :: () => Eq a => a -> GADTSet a -> GADTSet a, a Provided constraint but an empty Required one.
Inferred (\(ConsSetG x xs) -> x) :: GADTSet p -> p, so the constraint doesn't escape from a "usage site".
Option 2b. define an insert function
insertGADTSet x xs | not (elemG x xs) = GADTConsSet x xs -- (Eq a) inferred
GHC infers insertGADTSet :: Eq a => a -> GADTSet a -> GADTSet a; so the Eq has escaped, even though it doesn't escape from elemG.
Questions
With insertGADTSet, why does the constraint escape? It's only needed for the elemG check, but elemG's type doesn't expose the constraint.
With constructors GADTConsSet, GADTNilSet, there's a constraint wrapped 'all the way down' the data structure. Does that mean the data structure has a bigger memory footprint than with ConsSet_, NilSet_?
With constructors GADTConsSet, GADTNilSet, it's the same type a 'all the way down'. Is the same Eq a dictionary repeated at each node? Or shared?
By comparison, pattern ConsSetC/constructor CSet/Option 1b wraps only a single dictionary(?), so it'll have a smaller memory footprint than a GADTSet structure(?)
insertCSet has a performance hit of unwrapping and wrapping CSets?
ConsSetC2 in the build direction seems to work; there's a performance hit in unwrapping and wrapping CSets? But worse, there's an unwrapping/wrapping performance hit in accessing/walking the nodes?
(I'm feeling there's no slam-dunk winner amongst these options for my use case.)
I don't think there is any realistic scenario where it is important that you disallow the creation of an empty set with an "unacceptable" type. It is at least partly because of lack of such realistic scenarios that DatatypeContexts is considered bad practice. Seriously, try to imagine how such a restriction could possibly help avoid real-world programming errors. That is, try to imagine how someone using your types and functions might (1) write an erroneous program that (2) "accidentally" uses a set of, say, functions and yet (3) somehow gets it to type check in a manner that (4) could have been caught if only there'd been an extra Eq constraint on NilSet. As soon as a programmer tries to do anything with that set that makes it non-empty (i.e., anything that needs Eq functionality), it won't type check, so what exactly are you trying to prevent? You want to stop someone who only needs empty sets of functions from using your types? Why? Is it spite? ... It is spite, isn't it?
Getting down to your various options, putting the constraint in a GADT is inappropriate and unnecessary to my mind. The point of constraints in GADTs is to allow destruction to dynamically and/or conditionally bring an instance dictionary into scope, based on a runtime case match. You do not need this functionality. In particular, you do not need the overhead of a dictionary in every one of your Cons nodes as per option 2. However, you also don't need a dictionary in the data type as per option 1(b). It's better to use the normal non-GADT mechanisms of passing dictionaries to functions instead of carrying them around in the data types. I expect you'll be missing many opportunities for specialization and optimization if you try option 1(b). Partly this may be because GADTs are intrinsically harder to optimize, but there's also much less work that's been put into optimizing code using GADTs than code using non-GADTs. Some of your questions suggest you're very concerned about small performance gains. If so, it's generally a good idea to stay well away from GADTs!
Option 1(a) is a reasonable solution. Without the unnecessary Eq constraint on Nil, and folding the insert function into the pattern definition, you get something like:
{-# LANGUAGE PatternSynonyms #-}
data Set a = Nil | Cons_ a (Set a) deriving (Show)

pattern Cons :: (Eq a) => a -> Set a -> Set a
pattern Cons x xs <- Cons_ x xs
  where Cons x xs = go xs
          where go Nil = Cons_ x xs
                go (Cons_ y ys) | x == y    = xs
                                | otherwise = go ys
which seems like a perfectly idiomatic, straightforward, smart constructor implementation using patterns, as designed.
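For instance, a hedged usage sketch with the names above:
-- the duplicate 1 is dropped: this evaluates to Cons_ 2 (Cons_ 1 Nil)
example :: Set Int
example = Cons 1 (Cons 2 (Cons 1 Nil))

-- matching with the synonym also works, though (as discussed next)
-- it demands Eq a even where destruction doesn't really need it
firstElem :: Eq a => Set a -> Maybe a
firstElem (Cons x _) = Just x
firstElem Nil        = Nothing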
Indeed, it's unfortunate that the constraint applies to destruction, when it isn't really needed there. Ideally, GHC would allow the constraints for the use of Cons as a constructor to be specified separately, instead of assuming they're the combination of the required and provided constraints for destruction.
This would allow us to write something like:
pattern Cons :: a -> Set a -> Set a
pattern Cons x xs <- Cons_ x xs
  where Cons :: Eq a => a -> Set a -> Set a -- <== only need Eq here
        Cons x Nil = Cons_ x Nil
        Cons x rest@(Cons_ y xs) | x == y    = rest
                                 | otherwise = Cons_ y (Cons x xs)
and then usages of Cons as a destructor could be constraint-free while uses as a constructor could take advantage of the Eq a constraint. I see this as a limitation of PatternSynonyms rather than an indication that adding unnecessary constraints a la DatatypeContexts is actually good programming practice. It looks like at least a few other people agree that this is a bug not a feature.
To your first four questions:
In option 2(b), insertGADTSet needs an Eq a dictionary to insert into the dictionary slot in the GADTConsSet constructor on the RHS. So, the Eq a constraint comes from the use of the GADTConsSet constructor.
Yes, GADT constraints become additional fields in the data type. In option 2, a pointer to the Eq a dictionary is added to every node in your set.
The dictionary itself is shared, but each node includes its own pointer to the dictionary.
Yes, for the CSet type, there's only one pointer to the dictionary per CSet value.
I believe that option 1b is your best bet. Of course, I may be biased, as I'm the one who suggested it on your other question.
To address the issue you've pointed out with your pattern synonym, let us imagine we don't have pattern synonyms. How might we deconstruct a Cons set?
One way is to write a method with this signature:
openSetC :: CSet a -> (Eq a => a -> CSet a -> r) -> (Eq a => r) -> r
Which says that given:
a CSet a,
a function which takes a proof of Eq a, an a, and another CSet a, and returns some arbitrary type,
and a function which takes a proof of Eq a and returns the same arbitrary type,
we can produce a value of that arbitrary type. Since the type is arbitrary, we know that the value must come from calling one of the two functions. The contract of this function is that it invokes the first function and returns its result if and only if the set is a ConsSet_; otherwise, if it is a NilSet_, it invokes the second function.
If you squint a little, you can see this function is in a sense "equivalent" to pattern matching on CSet. You don't need pattern matching at all anymore; you can do everything with this function that you can do with pattern matching.
Its implementation is quite trivial:
openSetC (CSet (ConsSet_ x xs)) k _ = k x (CSet xs)
openSetC (CSet NilSet_) _ z = z
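For example, a membership test (hypothetical name elemC, assuming RankNTypes for openSetC's signature) can be written against the eliminator alone; the Eq a evidence is delivered by openSetC to its continuations, so it never surfaces in elemC's own signature:
-- the (x == y) call is justified by the Eq a that openSetC provides,
-- not by a constraint on elemC itself
elemC :: a -> CSet a -> Bool
elemC x s = openSetC s (\y ys -> x == y || elemC x ys) False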
Consider now a different form of this function, which accomplishes all the same things, but is maybe a bit easier to use.
Note that the type forall r . (a -> r) -> (b -> r) -> r is isomorphic to Either a b. Also note that x0 -> y0 -> r is isomorphic (or close enough) to (x0, y0) -> r. And finally note that C a => r is isomorphic to Dict (C a) -> r where:
data Dict c where Dict :: c => Dict c
If we exploit these isomorphisms, we can write openSetC differently as:
openSetC' :: CSet a -> Either (a, CSet a, Dict (Eq a)) (Dict (Eq a))
openSetC' (CSet (ConsSet_ x xs)) = Left (x, CSet xs, Dict)
openSetC' (CSet NilSet_) = Right Dict
Now the fun part: using ViewPatterns, we can use this function directly to easily write the pattern with the signature you want. It's easy only because we've set up the type of openSetC' to match with the type of the pattern you want:
pattern ConsSetC :: () => Eq a => a -> CSet a -> CSet a
pattern ConsSetC x xs <- (openSetC' -> Left (x, xs, Dict))
  -- included for completeness, but the expression form of the pattern synonym is not at issue here
  where
    ConsSetC x (CSet xs) | not (elemS x xs) = CSet (ConsSet_ x xs)
                         | otherwise        = CSet xs
As for the rest of your questions, I would strongly suggest splitting them up into different posts so they could all have focused answers.

How to use a refutation to direct the type checker in Haskell?

Does filling the hole in the following program necessarily require non-constructive means? If yes, is it still the case if x :~: y is decidable?
More generally, how do I use a refutation to guide the type checker?
(I am aware that I can work around the problem by defining Choose as a GADT, I'm asking specifically for type families)
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
module PropositionalDisequality where
import Data.Type.Equality
import Data.Void
type family Choose x y where
  Choose x x = 1
  Choose x _ = 2
lem :: (x :~: y -> Void) -> Choose x y :~: 2
lem refutation = _
If you try hard enough to implement this function, you can convince yourself
that it is not possible. If you're not convinced, the argument can be made
more formal: we enumerate programs exhaustively to find that none is possible. It turns out there are only half a dozen meaningful cases to consider.
I wonder why this argument is not made more often.
Totally not accurate summary:
Act I: proof search is easy.
Act II: dependent types too.
Act III: Haskell is still fine for writing dependently typed programs.
I. The proof search game
First we define the search space.
We can reduce any Haskell definition to one of the form
lem = (exp)
for some expression (exp). Now we only need to find a single expression.
Look at all possible ways of making an expression in Haskell:
https://www.haskell.org/onlinereport/haskell2010/haskellch3.html#x8-220003
(this doesn't account for extensions, exercise for the reader).
It fits a page in a single column, so it's not that big to start with.
Moreover, most of them are sugar for some form of function application or
pattern-matching; we can also desugar away type classes with dictionary
passing, so we're left with a ridiculously small lambda calculus:
lambdas, \x -> ...
pattern-matching, case ... of ...
function application, f x
constructors, C (including integer literals)
constants, c (for primitives that cannot be written in terms of the constructs above, so various built-ins (seq) and maybe FFI if that counts)
variables (bound by lambdas and cases)
We can exclude every constant on the grounds that I think the question is
really about pure lambda calculus (or the reader can enumerate the constants,
to exclude black-magic constants like undefined, unsafeCoerce,
unsafePerformIO that make everything collapse (any type is inhabited and, for
some of those, the type system is unsound), and to be left with white-magic
constants to which the present theoretical argument can be generalized via a
well-funded thesis).
We can also reasonably assume that we want a solution with no recursion involved
(to get rid of noise like lem = lem, and fix if you felt like you couldn't
part with it before), and which actually has a normal form, or preferably, a
canonical form with respect to βη-equivalence. In other words, we refine and
examine the set of possible solutions as follows.
lem :: _ -> _ has a function type, so we can assume WLOG that its definition starts with a lambda:
-- Any solution
lem = (exp)
-- is η-equivalent to a lambda
lem = \refutation -> (exp) refutation
-- so let's assume we do have a lambda
lem = \refutation -> _hole
Now enumerate what could be under the lambda.
It could be a constructor,
which then has to be Refl, but there is no proof that Choose x y ~ 2 in
the context (here we could formalize and enumerate the type equalities the
typechecker knows about and can derive, or make the syntax of coercions
(proofs of equalities) explicit and keep playing this proof search game
with them), so this doesn't type check:
lem = \refutation -> Refl
Maybe there is some way of constructing that equality proof, but then the
expression would start with something else, which is going to be another
case of the proof.
It could be some application of a constructor C x1 x2 ..., or the
variable refutation (applied or not); but there's no possible way that's
well-typed, it has to somehow produce a (:~:), and Refl is really the
only way.
Or it could be a case. WLOG, there is no nested case on the left, nor any
constructor, because the expression could be simplified in both cases:
-- Any left-nested case expression
case (case (e) of { C x1 x2 -> (f) }) { D y1 y2 -> (g) }
-- is equivalent to a right-nested case
case (e) of { C x1 x2 -> case (f) of { D y1 y2 -> (g) } }
-- Any case expression with a nested constructor
case (C u v) of { C x1 x2 -> f x1 x2 }
-- reduces to
f u v
So the last subcase is the variable case:
lem = \refutation -> case refutation (_hole :: x :~: y) of {}
and we have to construct a x :~: y. We enumerate ways of filling the
_hole again. It's either Refl, but no proof is available, or
(skipping some steps) case refutation (_anotherHole :: x :~: y) of {},
and we have an infinite descent on our hands, which is also absurd.
A different possible argument here is that we can pull out the case
from the application, to remove this case from consideration WLOG.
-- Any application to a case
f (case e of C x1 x2 -> g x1 x2)
-- is equivalent to a case with the application inside
case e of C x1 x2 -> f (g x1 x2)
There are no more cases. The search is complete, and we didn't find an
implementation of (x :~: y -> Void) -> Choose x y :~: 2. QED.
To read more on this topic, I guess a course/book about lambda calculus up
until the normalization proof of the simply-typed lambda calculus should give
you the basic tools to start with. The following thesis contains an
introduction on the subject in its first part, but admittedly I'm a poor judge
of the difficulty of such material: Which types have a unique inhabitant?
Focusing on pure program equivalence,
by Gabriel Scherer.
Feel free to suggest more adequate resources and literature.
II. Fixing the proposition and proving it with dependent types
Your initial intuition that this should encode a valid proposition
is definitely valid. How might we fix it to make it provable?
Technically, the type we are looking at is quantified with forall:
forall x y. (x :~: y -> Void) -> Choose x y :~: 2
An important feature of forall is that it is an irrelevant quantifier.
The variables it introduces cannot be used "directly" in a term of this type. Although that aspect becomes more prominent in the presence of dependent types,
it still pervades Haskell today, providing another intuition for why this (and many other examples) is not "provable" in Haskell: if you think about why you
think that proposition is valid, you will naturally start with a case split about whether x is equal to y, but to even do such a case split you need a way to
decide which side you're on, which will of course have to look at x and y,
so they cannot be irrelevant. forall in Haskell is not at all like what most people mean by "for all".
Some discussion on the matter of relevance can be found in the thesis Dependent Types in Haskell, by Richard Eisenberg (in particular, Section 3.1.1.5 for an initial example, Section 4.3 for relevance in Dependent Haskell and Section 8.7 for comparison with other languages with dependent types).
Dependent Haskell will need a relevant quantifier to complement forall,
which would get us closer to proving this:
foreach x y. (x :~: y -> Void) -> Choose x y :~: 2
Then we could probably write this:
lem :: foreach x y. (x :~: y -> Void) -> Choose x y :~: 2
lem x y p = case x ==? y of
  Left r       -> absurd (p r) -- x ~ y, that's absurd!
  Right Irrefl -> Refl         -- x /~ y, so Choose x y = 2
That also assumes a first-class notion of disequality /~, complementing ~,
to help Choose reduce when it is in the context, and a decision function
(==?) :: foreach x y. Either (x :~: y) (x :/~: y).
Actually, that machinery isn't necessary; it just makes for a shorter answer.
At this point I'm making stuff up because Dependent Haskell does not exist yet,
but that is easily doable in related dependently typed languages (Coq, Agda,
Idris, Lean), modulo an adequate replacement of the type family Choose
(type families are in some sense too powerful to be translated as mere
functions, so may be cheating, but I digress).
Here is a comparable program in Coq, showing also that lem applied to 1 and 2
and a suitable proof does reduce to a proof by reflexivity of choose 1 2 = 2.
https://gist.github.com/Lysxia/5a9b6996a3aae741688e7bf83903887b
III. Without dependent types
A critical source of difficulty here is that Choose is a closed type
family with overlapping instances. It is problematic because there is
no proper way to express the fact that x and y are not equal in Haskell,
to know that the first clause Choose x x does not apply.
A more fruitful avenue if you're into Pseudo-Dependent Haskell is to use a
boolean type equality:
-- In the base library
import Data.Type.Bool (If)
import Data.Type.Equality (type (==))
type Choose x y = If (x == y) 1 2
An alternative encoding of equality constraints becomes useful for this style:
type x ~~ y = ((x == y) ~ 'True)
type x /~ y = ((x == y) ~ 'False)
with that, we can get another version of the type-proposition above,
expressible in current Haskell (where SBool is the singleton type of Bool),
which essentially can be read as adding the assumption that the equality of x
and y is decidable. This does not contradict the earlier claim about "irrelevance" of forall, the function is inspecting a boolean (or rather an SBool), which postpones the inspection of x and y to whoever calls lem.
lem :: forall x y. SBool (x == y) -> ((x ~~ y) => Void) -> Choose x y :~: 2
lem decideEq p = case decideEq of
  STrue  -> absurd p
  SFalse -> Refl
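For reference, a minimal sketch of the SBool singleton the snippet assumes (with DataKinds and GADTs):
-- the singleton ties the run-time branch to the type-level boolean
data SBool b where
  STrue  :: SBool 'True
  SFalse :: SBool 'False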

Curry Howard correspondence and equality

A while ago I read that the function type a -> b corresponds to the relation a ≤ b, or is it a ≥ b? This makes sense to me because two types are isomorphic if we have a bijection between them (i.e. (a ≈ b) ≡ (a -> b, b -> a)). Similarly, (a = b) ≡ (a ≤ b) ∧ (a ≥ b).
I know that this is not the Curry-Howard-Lambek correspondence (i.e. the correspondence between type theory, logic and category theory). It's the correspondence between type theory and something else. I want to learn more about this correspondence. Could somebody point me in the right direction?
I know that this doesn't seem like a programming question but it is related to programming and I'm hoping that some functional programmer knows more about it and can point me in the right direction.
Every pre-ordered set forms a category. Let (S, «) be a pre-ordered set. Define a category C whose objects are the elements of S and with Hom(a, b) inhabited by (a, b) if a « b and uninhabited otherwise. Define composition the only way you possibly can. The category laws follow immediately from the transitivity and reflexivity of the pre-order.
A lattice, in particular, will form a category admitting finite products and coproducts. A bounded lattice will form one with initial and final objects.
Types and functions in a sufficiently well-behaved functional language also form a category with finite products and coproducts, and initial and final objects. So if you squint out to a categorical blur, these things will start to look vaguely similar.
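To make the squint concrete, here is a hedged Haskell sketch of the order-theoretic reading: -> as ≤, pairs as meets, Either as joins.
projL :: (a, b) -> a           -- a ∧ b ≤ a
projL = fst

injL :: a -> Either a b        -- a ≤ a ∨ b
injL = Left

-- c ≤ a and c ≤ b together give c ≤ a ∧ b: the defining property of a meet
pairing :: (c -> a) -> (c -> b) -> (c -> (a, b))
pairing f g x = (f x, g x)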
(This is more a comment than an answer, but I need more space.)
The type a -> b corresponds to a <= b. This is useful, e.g., to speak about fixed points at the type level, which are needed to properly define recursive types (lists, trees, ...).
Recall how recursion is solved, without categories. In domain theory, given a function f :: a -> a we look for a least x satisfying f x = x (least fixed point). This turns out to also be the least x satisfying f x <= x (least prefixed point). We then get the induction principle
f y <= y ==> fix f <= y
which basically states that, if we have any prefixed point y, then the least (pre)fixed point fix f must be less than y -- indeed, it is the least!
Now, let's sprinkle some category powder over that. Implication becomes the arrow ->, and <= also becomes ->. We get
(f y -> y) -> fix f -> y
Looks familiar, where did I see that...? Ah!
newtype Fix f = Fix { unFix :: f (Fix f) }
cata :: Functor f => (f y -> y) -> Fix f -> y
cata g = g . fmap (cata g) . unFix
Hence, cata, the general eliminator/catamorphism, is just a category-empowered version of the good old induction principle.
Note how the domain points y are now objects in our category (i.e. types). Also, the functions f must be applicable to y, so these are not morphisms in our category (which would be function values :: A -> B, from some type to some type), but correspond to functors in the category of types (mapping types to types, :: * -> *).
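As a hedged illustration of cata at work, assuming the Fix and cata above (ListF is a hypothetical base functor for lists):
data ListF a r = NilF | ConsF a r

instance Functor (ListF a) where
  fmap _ NilF        = NilF
  fmap f (ConsF a r) = ConsF a (f r)

type List a = Fix (ListF a)

-- length as a fold: the algebra handles one layer, cata supplies the recursion
len :: List a -> Int
len = cata alg
  where
    alg NilF        = 0
    alg (ConsF _ n) = n + 1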

Do Hask or Agda have equalisers?

I was somewhat undecided as to whether this was a math.SE question or an SO one, but I suspect that mathematicians in general are fairly unlikely to know or care much about this category in particular, whereas Haskell programmers might well do.
So, we know that Hask has products, more or less (I'm working with idealised-Hask, here, of course). I'm interested in whether or not it has equalisers (in which case it would have all finite limits).
Intuitively it seems not, since you can't do separation like you can on sets, and so subobjects seem hard to construct in general. But for any specific case you'd like to come up with, it seems like you'd be able to hack it by working out the equaliser in Set and counting it (since after all, every Haskell type is countable and every countable set is isomorphic either to a finite type or the naturals, both of which Haskell has). So I can't see how I'd go about finding a counterexample.
Now, Agda seems a bit more promising: there it is relatively easy to form subobjects. Is the obvious sigma type Σ A (λ x → f x == g x) an equaliser? If the details don't work, is it morally an equaliser?
tl;dr the proposed candidate is not quite an equaliser, but its irrelevant counterpart is
The candidate for an equaliser in Agda looks good. So let's just try it. We'll need some basic kit. Here are my refusenik ASCII dependent pair type and homogeneous intensional equality.
record Sg (S : Set)(T : S -> Set) : Set where
  constructor _,_
  field
    fst : S
    snd : T fst
open Sg

data _==_ {X : Set}(x : X) : X -> Set where
  refl : x == x
Here's your candidate for an equaliser for two functions
Q : {S T : Set}(f g : S -> T) -> Set
Q {S}{T} f g = Sg S \ s -> f s == g s
with the fst projection sending Q f g into S.
What it says: an element of Q f g is an element s of the source type, together with a proof that f s == g s. But is this an equaliser? Let's try to make it so.
To say what an equaliser is, I should define function composition.
_o_ : {R S T : Set} -> (S -> T) -> (R -> S) -> R -> T
(f o g) x = f (g x)
So now I need to show that any h : R -> S which identifies f o h and g o h must factor through the candidate fst : Q f g -> S. I need to deliver both the other component, u : R -> Q f g, and the proof that indeed h factors as fst o u. Here's the picture: (Q f g , fst) is an equaliser if, whenever the diagram commutes without u, there is a unique way to add u with the diagram still commuting.
Here goes existence of the mediating u.
mediator : {R S T : Set}(f g : S -> T)(h : R -> S) ->
           (q : (f o h) == (g o h)) ->
           Sg (R -> Q f g) \ u -> h == (fst o u)
Clearly, I should pick the same element of S that h picks.
mediator f g h q = (\ r -> (h r , ?0)) , ?1
leaving me with two proof obligations
?0 : f (h r) == g (h r)
?1 : h == (\ r -> h r)
Now, ?1 can just be refl as Agda's definitional equality has the eta-law for functions. For ?0, we are blessed by q. Equal functions respect application
funq : {S T : Set}{f g : S -> T} -> f == g -> (s : S) -> f s == g s
funq refl s = refl
so we may take ?0 = funq q r.
But let us not celebrate prematurely, for the existence of a mediating morphism is not sufficient. We require also its uniqueness. And here the wheel is likely to go wonky, because == is intensional, so uniqueness means there's only ever one way to implement the mediating map. But then, our assumptions are also intensional...
Here's our proof obligation. We must show that any other mediating morphism is equal to the one chosen by mediator.
mediatorUnique : {R S T : Set}(f g : S -> T)(h : R -> S) ->
                 (qh : (f o h) == (g o h)) ->
                 (m : R -> Q f g) ->
                 (qm : h == (fst o m)) ->
                 m == fst (mediator f g h qh)
We can immediately substitute via qm and get
mediatorUnique f g .(fst o m) qh m refl = ?
? : m == (\ r -> (fst (m r) , funq qh r))
which looks good, because Agda has eta laws for records, so we know that
m == (\ r -> (fst (m r) , snd (m r)))
but when we try to make ? = refl, we get the complaint
snd (m _) != funq qh _ of type f (fst (m _)) == g (fst (m _))
which is annoying, because identity proofs are unique (in the standard configuration). Now, you can get out of this by postulating extensionality and using a few other facts about equality
postulate ext : {S T : Set}{f g : S -> T} -> ((s : S) -> f s == g s) -> f == g
sndq : {S : Set}{T : S -> Set}{s : S}{t t' : T s} ->
       t == t' -> _==_ {Sg S T} (s , t) (s , t')
sndq refl = refl
uip : {X : Set}{x y : X}{q q' : x == y} -> q == q'
uip {q = refl}{q' = refl} = refl
? = ext (\ s -> sndq uip)
but that's overkill, because the only problem is the annoying equality proof mismatch: the computable parts of the implementations match on the nose. So the fix is to work with irrelevance. I replace Sg by the Existential quantifier, whose second component is marked as irrelevant with a dot. Now it matters not which proof we use that the witness is good.
record Ex (S : Set)(T : S -> Set) : Set where
  constructor _,_
  field
    fst  : S
    .snd : T fst
open Ex
and the new candidate equaliser is
Q : {S T : Set}(f g : S -> T) -> Set
Q {S}{T} f g = Ex S \ s -> f s == g s
The entire construction goes through as before, except that in the last obligation
? = refl
is accepted!
So yes, even in the intensional setting, eta laws and the ability to mark fields as irrelevant give us equalisers.
No undecidable typechecking was involved in this construction.
Hask
Hask doesn't have equalizers. An important thing to remember is that thinking about a type (or the objects in any category) and their isomorphism classes really requires thinking about the arrows. What you say about the underlying sets is true, but types with isomorphic underlying sets certainly aren't necessarily isomorphic. One difference between Hask and Set is that Hask's arrows must be computable, and in fact, for idealised Hask, they must be total.
I spent a while trying to come up with a real defensible counterexample, and found some references suggesting it cannot be done, but without proofs. However, I do have some "moral" counterexamples if you will; I cannot prove that no equalizer exists in Haskell, but it certainly seems impossible!
Example 1
f, g :: ([Int], Int) -> Int
f (p, v) = sum (zipWith (*) p (iterate (* v) 1)) -- treat p as a polynomial's coefficients and evaluate it at v
g _ = 0
The equalizer "should" be the type of all pairs (p, v) where p(v) = 0, along with a function injecting these pairs into ([Int], Int). By Hilbert's 10th problem, this set is undecidable. It seems to me that this should exclude the possibility of it being a Haskell type, but I can't prove that (is it possible that there's some bizarre way to construct this type that nobody has discovered?). Maybe I haven't connected a dot or two; perhaps proving this is impossible isn't hard?
Example 2
Say you have a programming language. You have a compiler that takes the source code and an input, and produces a function whose fixed point is the output. (While we don't have compilers like this, specifying semantics sort of like this isn't unheard of.) So, you have
compiler :: String -> Int -> (Int -> Int)
Uncurry that into a function
compiler' :: (String, Int, Int) -> Int
and add a function
id' :: (String, Int, Int) -> Int
id' (_, _, x) = x
Then the equalizer of compiler' and id' would be the collection of triples of source program, input, and output, and this is uncomputable because the programming language is fully general.
More Examples
Pick your favorite undecidable problem: it generally involves deciding if an object is the member of some set. You often have a total function that can be used to check this property for a particular object. You can use this function to create an equalizer where the type should be all the items in your undecidable set. That's where the first two examples came from, and there are tons more.
Agda
I'm not as familiar with Agda. My intuition is that your sigma-type should be an equalizer: you can write the type down, along with the necessary injection function, and it looks like it satisfies the definition entirely. However, as someone who doesn't use Agda, I don't think I'm really qualified to check the details.
The real practical issue though is that typechecking that sigma type won't always be computable, so it's not always useful to do this. In all the examples above, you can write down the Sigma type you provided, but you won't be able to readily check if something is a member of that type without a proof.
Incidentally, this is why Haskell shouldn't be able to have equalizers: if it did, typechecking would be undecidable! Dependent types are what make everything tick. Dependently typed languages can express interesting mathematical structures in their types, while Haskell can't, since its type system is decidable. So, I would naturally expect idealized Agda to have all finite limits (I would be disappointed otherwise). The same goes for other dependently typed languages; Coq, for example, should definitely have all limits.
