Access definitions from sublocale - locale

In Isabelle, I have the following sketch of a hierarchy of locales:
locale Base =
fixes foo :: "nat set"
begin
definition small :: "nat ⇒ nat set" where
"small n = {m ∈ foo. n * m < 100}"
end
locale Other =
fixes n :: nat
sublocale Other ⊆ Base "{..<n}" .
I now want to access the definition of small as interpreted in Other:
term Base.small (* works but parameters have to be provided *)
context Other begin term small end (* works, but needs context *)
term Other.small (* does not work *)
I would like the last line to work, or something equivalent, so that I can refer to the correct definition of Other.small from the global context. In reality, Other includes parameters, too, and I have a theorem that relates different interpretations of Other. For example, I want to write something like this:
lemma card_small_lt: "n < m ==> card (Other.small n) < card (Other.small m)"
sorry
I have figured out a workaround, but it leads to a fair amount of manual typing in the long run. First, I name the sublocale; then I introduce abbreviations in a context:
sublocale Other ⊆ B: Base "{..<n}" .
context Other
begin
abbreviation "small ≡ B.small"
end
term Other.small (* does now work! *)
The question remains whether the above can be automated; I'd rather not write an abbreviation for each definition I want to use from the outside.

How are tail-position contexts in the GHC join points paper formed?

Compiling without Continuations describes a way to extend ANF System F with join points. GHC itself has join points in Core (an intermediate representation) rather than exposing join points directly in the surface language (Haskell). Out of curiosity, I started trying to write a language that simply extends System F with join points. That is, the join points are user facing. However, there's something about the typing rules in the paper that I don't understand. Here are the parts that I do understand:
There are two environments, one for ordinary values/functions and one that only has join points.
The rationale for ∆ being ε in several of the rules. In the expression let x:σ = u in ..., u cannot reference any join points (VBIND) because join points cannot return to arbitrary locations.
The strange typing rule for JBIND. The paper does a good job explaining this.
Here's what I don't get. The paper introduces a notation that I will call the "overhead arrow", but the paper itself does not explicitly give it a name or mention it. Visually, it looks like an arrow pointing to the right, and it goes above an expression. Roughly, this seems to indicate a "tail context" (the paper does use this term). In the paper, these overhead arrows can be applied to terms, types, data constructors, and even environments. They can be nested as well. Here's the main difficulty I'm having. There are several rules with premises that include type environments under an overhead arrow. JUMP, CASE, RVBIND, and RJBIND all include premises with such a type environment (Figure 2 in the paper). However, none of the typing rules have a conclusion where the type environment is under an overhead arrow. So, I cannot see how JUMP, CASE, etc. can ever be used since the premises cannot be derived by any of the other rules.
That's the question, but if anyone has any supplementary material that provides more context on the overhead arrow convention, or if anyone is aware of an implementation of the System-F-with-join-points type system (other than in GHC's IR), that would be helpful too.
In this paper, x⃗ means “A sequence of x, separated by appropriate delimiters”.
A few examples:
If x is a variable, λx⃗. e is an abbreviation for λx1. λx2. … λxn. e. In other words, many nested 1-argument lambdas, or a many-argument lambda.
If σ and τ are types, σ⃗ → τ is an abbreviation for σ1 → σ2 → … → σn → τ. In other words, a function type with many parameter types.
If a is a type variable and σ is a type, ∀a⃗. σ is an abbreviation for ∀a1. ∀a2. … ∀an. σ. In other words, many nested polymorphic functions, or a polymorphic function with many type parameters.
In Figure 1 of the paper, the syntax of a jump expression is defined as:
e, u, v ⩴ … | jump j ϕ⃗ e⃗ τ
If this declaration were translated into a Haskell data type, it might look like this:
data Term
  -- | A jump expression has a label that it jumps to, a list of type argument
  -- applications, a list of term argument applications, and the return type
  -- of the overall `jump`-expression.
  = Jump LabelVar [Type] [Term] Type
  | ... -- Other syntactic forms.
That is, a data constructor that takes a label variable j, a sequence of type arguments ϕ⃗, a sequence of term arguments e⃗, and a return type τ.
“Zipping” things together:
Sometimes, multiple uses of the overhead arrow place an implicit constraint that their sequences have the same length. One place that this occurs is with substitutions.
{ϕ/⃗a} means “replace a1 with ϕ1, replace a2 with ϕ2, …, replace an with ϕn”, implicitly asserting that both a⃗ and ϕ⃗ have the same length, n.
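The zipped substitution can be sketched in Haskell. This is only a minimal illustration under stated assumptions: the toy `Type` below (variables, arrows, and foralls) and the helper names `exactZip` and `applySubst` are hypothetical and not taken from the paper, and the substitution is not capture-avoiding.

```haskell
-- A toy type language; an assumption made purely for illustration.
data Type
  = TVar String
  | TFun Type Type
  | TForall String Type
  deriving (Eq, Show)

-- Zip two sequences, failing when their lengths differ. This is the
-- implicit side condition carried by putting both under one arrow.
exactZip :: [a] -> [b] -> Maybe [(a, b)]
exactZip [] [] = Just []
exactZip (x : xs) (y : ys) = ((x, y) :) <$> exactZip xs ys
exactZip _ _ = Nothing

-- Simultaneous substitution: replace each a_i by phi_i.
-- Not capture-avoiding: it assumes bound names do not clash with the
-- free variables of the substituted types.
applySubst :: [(String, Type)] -> Type -> Type
applySubst s (TVar a) = maybe (TVar a) id (lookup a s)
applySubst s (TFun t u) = TFun (applySubst s t) (applySubst s u)
applySubst s (TForall a t) =
  -- A binder shadows any substitution for the same name.
  TForall a (applySubst (filter ((/= a) . fst) s) t)
```

With these definitions, `exactZip as phis` yields the pairs that `applySubst` consumes, and it returns `Nothing` when the two sequences differ in length.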
Worked example: the JUMP rule:
The JUMP rule is interesting because it provides several uses of sequencing, and even a sequence of premises. Here’s the rule again:
(j : ∀a⃗. σ⃗ → ∀r. r) ∈ Δ
(Γ; ε ⊢⃗ u : σ {ϕ/⃗a})
---------------------------- (JUMP)
Γ; Δ ⊢ jump j ϕ⃗ u⃗ τ : τ
The first premise should be fairly straightforward, now: lookup j in the label context Δ, and check that the type of j starts with a bunch of ∀s, followed by a bunch of function types, ending with a ∀r. r.
The second “premise” is actually a sequence of premises. What is it looping over? So far, the sequences we have in scope are ϕ⃗, σ⃗, a⃗, and u⃗.
ϕ⃗ and a⃗ are used in a nested sequence, so probably not those two.
On the other hand, u⃗ and σ⃗ seem quite plausible if you consider what they mean.
σ⃗ is the list of argument types expected by the label j, and u⃗ is the list of argument terms provided to the label j, and it makes sense that you might want to iterate over argument types and argument terms together.
So this “premise” actually means something like this:
for each pair of σ and u:
Γ; ε ⊢ u : σ {ϕ/⃗a}
Pseudo-Haskell implementation
Finally, here’s a somewhat-complete code sample illustrating what this typing rule might look like in an actual implementation. x⃗ is implemented as a list of x values, and some monad M is used to signal failure when a premise is not satisfied.
import Control.Monad (forM_)

data LabelVar
data Type
  = ...
data Term
  = Jump LabelVar [Type] [Term] Type
  | ...

typecheck :: TermContext -> LabelContext -> Term -> M Type
typecheck gamma delta (Jump j phis us tau) = do
  -- Look up `j` in the label context. If it's not there, throw an error.
  typeOfJ <- lookupLabel j delta
  -- Check that the type of `j` has the right shape: a bunch of `forall`s,
  -- followed by a bunch of function types, ending with `forall r. r`. If it
  -- has the correct shape, split it into a list of `a`s, a list of `sigma`s,
  -- and the return type, `forall r. r`.
  (as, sigmas, ret) <- splitLabelType typeOfJ
  -- exactZip is a helper function that "zips" two sequences together.
  -- If the sequences have the same length, it produces a list of pairs of
  -- corresponding elements. If not, it raises an error.
  -- subst = { phi_i / a_i }; it does not depend on `u`, so build it once.
  subst <- exactZip as phis
  argPairs <- exactZip us sigmas
  forM_ argPairs $ \(u, sigma) -> do
    -- Type-check the argument `u` in an empty label context (no join points
    -- in scope), and assert that its type has the correct form.
    sigma' <- typecheck gamma emptyLabelContext u
    assert (applySubst subst sigma == sigma')
  -- After all the premises have been satisfied, the type of the `jump`
  -- expression is just its return type.
  return tau
-- Other syntactic forms
typecheck gamma delta u = ...

-- Auxiliary definitions
type M = ...
instance Monad M
emptyLabelContext :: LabelContext
lookupLabel :: LabelVar -> LabelContext -> M Type
splitLabelType :: Type -> M ([TypeVar], [Type], Type)
exactZip :: [a] -> [b] -> M [(a, b)]
applySubst :: [(TypeVar, Type)] -> Type -> Type
assert :: Bool -> M ()
As far as I know SPJ’s notational style (and this does align with what I see in the paper), the overhead arrow simply means “0 or more”. E.g. you can replace \overarrow{a} with a_1, …, a_n, n >= 0.
It may be “1 or more” in some cases, but it shouldn’t be hard to figure out which of the two applies.

Why does OCaml sometimes require eta expansion?

If I have the following OCaml function:
let myFun = CCVector.map ((+) 1);;
It works fine in Utop, and Merlin doesn't mark it as a compilation error. When I try to compile it, however, I get the following error:
Error: The type of this expression,
(int, '_a) CCVector.t -> (int, '_b) CCVector.t,
contains type variables that cannot be generalized
If I eta-expand it however then it compiles fine:
let myFun foo = CCVector.map ((+) 1) foo;;
So I was wondering why it doesn't compile in eta-reduced form, and also why the eta-reduced form seems to work in the toplevel (Utop) but not when compiling?
Oh, and the documentation for CCVector is here. The '_a part can be either `RO or `RW, depending whether it is read-only or mutable.
What you have run into is the value polymorphism restriction of the ML language family.
The aim of the restriction is to reconcile let-polymorphism with side effects. For example, in the following definition:
let r = ref None
r cannot have a polymorphic type 'a option ref. Otherwise:
let () =
  r := Some 1; (* use r as int option ref *)
  match !r with
  | Some s -> print_string s (* this time, use r as a different type, string option ref *)
  | None -> ()
is wrongly type-checked as valid, but it crashes, since the reference cell r is used for these two incompatible types.
To fix this issue, a lot of research was done from the 1980s onward, and the value polymorphism restriction is one of the results. It restricts polymorphism to let bindings whose defining expression is "non-expansive". The eta-expanded form is non-expansive, so your eta-expanded version of myFun gets a polymorphic type, but the eta-reduced one does not. (More precisely, OCaml uses a relaxed version of this value polymorphism restriction, but the story is basically the same.)
When the definition of a let binding is expansive, no polymorphism is introduced, so the type variables are left non-generalized. These types are printed as '_a in the toplevel, and their intuitive meaning is that they must be instantiated to some concrete type later:
# let r = ref None (* expansive *)
val r : '_a option ref = {contents = None} (* no polymorphism is allowed *)
(* type checker does not reject this,
hoping '_a is instantiated later. *)
We can fix the type '_a after the definition:
# r := Some 1;; (* fixing '_a to int *)
- : unit = ()
# r;;
- : int option ref = {contents = Some 1} (* Now '_a is unified with int *)
Once fixed, you cannot change the type, which prevents the crash above.
This typing delay is permitted until the end of the typing of the compilation unit. The toplevel is a unit that never ends, so you can have values with '_a type variables anywhere in the session. In separate compilation, however, '_a variables must be instantiated to some type without type variables by the end of the .ml file:
(* test.ml *)
let r = ref None (* r : '_a option ref *)
(* end of test.ml. Typing fails because a non-generalizable type variable remains. *)
This is what is happening with your myFun function with the compiler.
AFAIK, there is no perfect solution to the problem of polymorphism and side effects. Like other solutions, the value polymorphism restriction has its own drawback: if you want a polymorphic value, you must make the definition non-expansive, which here means you must eta-expand myFun. This is a bit annoying but is considered acceptable.
You can read some other answers:
http://caml.inria.fr/pub/old_caml_site/FAQ/FAQ_EXPERT-eng.html#variables_de_types_faibles
What is the difference between 'a and '_l?
or search for something like "value restriction ml"

Correct terminology for continuations

I've been poking around continuations recently, and I got confused about the correct terminology. Here Gabriel Gonzalez says:
A Haskell continuation has the following type:
newtype Cont r a = Cont { runCont :: (a -> r) -> r }
i.e. the whole (a -> r) -> r thing is the continuation (sans the wrapping)
The wikipedia article seems to support this idea by saying
a continuation is an abstract representation of the control state
of a computer program.
However, here the authors say that
Continuations are functions that represent "the remaining computation to do."
but that would only be the (a->r) part of the Cont type. And this is in line with what Eugene Ching says here:
a computation (a function) that requires a continuation function in order
to fully evaluate.
We’re going to be seeing this kind of function a lot, hence, we’ll give it
a more intuitive name. Let’s call them waiting functions.
I've seen another tutorial (Brian Beckman and Erik Meijer) where they call the whole thing (the waiting function) the observable and the function which is required for it to complete the observer.
What is the continuation, the (a->r)->r thingy or just the (a->r) thing (sans the wrapping)?
Is the wording observable/observer about correct?
Are the citations above really contradictory, is there a common truth?
What is the continuation, the (a->r)->r thingy or just the (a->r) thing (sans the wrapping)?
I would say that the a -> r bit is the continuation and the (a -> r) -> r is "in continuation passing style" or "is the type of the continuation monad".
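To make the distinction concrete, here is a small sketch. The function `addCPS` and the two example bindings are hypothetical names chosen for illustration; the point is only to show which piece has type a -> r and which has type (a -> r) -> r.

```haskell
-- A function in continuation-passing style: instead of returning the sum,
-- it hands the sum to the continuation `k`.
addCPS :: Int -> Int -> (Int -> r) -> r
addCPS x y k = k (x + y)

-- Partially applied, `addCPS 1 2` has the shape (a -> r) -> r with a ~ Int.
-- The argument passed to it, of type Int -> r, is the continuation.
asString :: String
asString = addCPS 1 2 show -- continuation: show :: Int -> String

doubled :: Int
doubled = addCPS 1 2 (* 2) -- continuation: (* 2) :: Int -> Int
```

The caller picks `r` by choosing the continuation, while `addCPS 1 2` itself is the computation that is still waiting for one.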
I am going to go off on a long digression on the history of continuations which is not really relevant to the question... so be warned.
It is my belief that the first published paper on continuations was "Continuations: A Mathematical Semantics for Handling Full Jumps" by Strachey and Wadsworth (although the concept was already folklore). The idea of that paper is I think a pretty important one. Early semantics for imperative programs attempted to model commands as state transformer functions. For example, consider the simple imperative language given by the following BNF
Command := set <Expression> to <Expression>
         | skip
         | <Command> ; <Command>
Expression := !<Expression>
            | <Number>
            | <Expression> + <Expression>
Here we use expressions as pointers. The simplest denotational semantics interprets the state as a function from natural numbers to natural numbers:
S = N -> N
We can interpret expressions as functions from state to the natural numbers
E[[e : Expression]] : S -> N
and commands as state transducers.
C[[c : Command]] : S -> S
This denotational semantics can be spelled out rather simply:
E[[n : Number]](s) = n
E[[a + b]](s) = E[[a]](s) + E[[b]](s)
E[[!e]](s) = s(E[[e]](s))
C[[skip]](s) = s
C[[set a to b]](s) = \n -> if n = E[[a]](s) then E[[b]](s) else s(n)
C[[c_1;c_2]](s) = (C[[c_2]] . C[[c_1]])(s)
A simple program in this language might look like
set 0 to 1;
set 1 to (!0) + 1
which would be interpreted as a function that turns a state function s into a new function that is just like s except that it maps 0 to 1 and 1 to 2.
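The direct semantics above translates almost line by line into Haskell. The following is a sketch under assumptions: the constructor names (`Num`, `Deref`, `Add`, `Skip`, `Set`, `Seq`) and function names are hypothetical, and `Integer` stands in for the natural numbers.

```haskell
-- Syntax of the toy language (constructor names are hypothetical).
data Expr = Num Integer | Deref Expr | Add Expr Expr
data Cmd = Skip | Set Expr Expr | Seq Cmd Cmd

-- S = N -> N; Integer stands in for the naturals here.
type State = Integer -> Integer

-- E[[e]] : S -> N
evalE :: Expr -> State -> Integer
evalE (Num n) _ = n
evalE (Add a b) s = evalE a s + evalE b s
evalE (Deref e) s = s (evalE e s)

-- C[[c]] : S -> S
evalC :: Cmd -> State -> State
evalC Skip s = s
evalC (Set a b) s = \n -> if n == evalE a s then evalE b s else s n
evalC (Seq c1 c2) s = (evalC c2 . evalC c1) s

-- The example program: set 0 to 1; set 1 to (!0) + 1
example :: Cmd
example =
  Seq (Set (Num 0) (Num 1))
      (Set (Num 1) (Add (Deref (Num 0)) (Num 1)))
```

Running `evalC example` on a state that is 0 everywhere gives a state that maps 0 to 1 and 1 to 2 and agrees with the original state elsewhere.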
This was all well and good, but how do you handle branching? Well, if you think about it a lot you can probably come up with a way to handle if and loops that go an exact number of times... but what about general while loops?
Strachey and Wadsworth showed us how to do it. First of all, they pointed out that these "state transducer functions" were pretty important and so decided to call them "command continuations" or just "continuations."
C = S -> S
From this they defined a new semantics, which we will provisionally define this way
C'[[c : Command]] : C -> C
C'[[c]](cont) = cont . C[[c]]
What is going on here? Well, observe that
C'[[c_1]](C[[c_2]]) = C[[c_1 ; c_2]]
and further
C'[[c_1]](C'[[c_2]](cont)) = C'[[c_1 ; c_2]](cont)
Instead of doing it this way, we can inline the definition
C'[[skip]](cont) = cont
C'[[set a to b]](cont) = cont . (\s -> \n -> if n = E[[a]](s) then E[[b]](s) else s(n))
C'[[c_1 ; c_2]](cont) = C'[[c_1]](C'[[c_2]](cont))
What has this bought us? Well, a way to interpret while, that's what!
Command := ... | while <Expression> do <Command> end
C'[[while e do c end]](cont) =
  let loop = \s -> if E[[e]](s) = 0 then C'[[c]](loop)(s) else cont(s)
  in loop
or, using a fixpoint combinator
C'[[while e do c end]](cont)
  = Y (\f -> \s -> if E[[e]](s) = 0 then C'[[c]](f)(s) else cont(s))
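The continuation semantics, including while, can be sketched in Haskell as well. As before, the constructor and function names are hypothetical and `Integer` stands in for the naturals. The sketch follows the convention used in the equations above, where the loop body runs as long as the condition evaluates to 0.

```haskell
data Expr = Num Integer | Deref Expr | Add Expr Expr
data Cmd
  = Skip
  | Set Expr Expr
  | Seq Cmd Cmd
  | While Expr Cmd

type State = Integer -> Integer
type Cont = State -> State -- C = S -> S

evalE :: Expr -> State -> Integer
evalE (Num n) _ = n
evalE (Add a b) s = evalE a s + evalE b s
evalE (Deref e) s = s (evalE e s)

-- C'[[c]] : C -> C
evalK :: Cmd -> Cont -> Cont
evalK Skip cont = cont
evalK (Set a b) cont =
  cont . (\s n -> if n == evalE a s then evalE b s else s n)
evalK (Seq c1 c2) cont = evalK c1 (evalK c2 cont)
evalK (While e c) cont = loop
  where
    -- The body's continuation is the loop itself; leaving the loop means
    -- calling the continuation of the whole while command.
    loop s = if evalE e s == 0 then evalK c loop s else cont s

-- set 12 to 1; set 0 to 10; while !(!0) do set 0 to (!0) + 1 end
-- Cell 0 is used as a pointer that advances until it finds a nonzero cell.
program :: Cmd
program =
  Seq (Set (Num 12) (Num 1))
      (Seq (Set (Num 0) (Num 10))
           (While (Deref (Deref (Num 0)))
                  (Set (Num 0) (Add (Deref (Num 0)) (Num 1)))))
```

Evaluating `evalK program id (const 0)` starts from a state that is 0 everywhere and uses `id` as the final continuation; the pointer in cell 0 then stops at address 12.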
Anyways...that is history and not particularly important...except in so far as it showed how to interpret programs mathematically, and set the language of "continuation."
Also, the approach to denotational semantics of "1. define a new semantic function in terms of the old 2. inline 3. profit" works surprisingly often. For example, it is often useful to have your semantic domain form a lattice (think, abstract interpretation). How do you get that? Well, one option is to take the powerset of the domain, and inject into this by interpreting your functions as singletons. If you inline this powerset construction you get something that can either model non-determinism or, in the case of abstract interpretation, various amounts of information about a program other than exact certainty as to what it does.
Various other work followed. Here I skip over many greats such as the lambda papers... But, perhaps the most notable was Griffin's landmark paper "A Formulae-as-Types Notion of Control" which showed a connection between continuation passing style and classical logic. Here the connection between "continuation" and "Evaluation context" is emphasized
That is, E represents the rest of the computation that remains to be done after N is evaluated. The context E is called the continuation (or control context) of N at this point in the evaluation sequence. The notation of evaluation contexts allows, as we shall see below, a concise specification of the operational semantics of operators that manipulate continuations (indeed, this was its intended use [3, 2, 4, 1]).
making clear that the "continuation" is "just the a -> r bit"
This all looks at things from the point of view of semantics and sees continuations as functions. The thing is, continuations as functions give you more power than you get with something like Scheme's callCC. So, another perspective on continuations is that they are variables in the program which internalize the call stack. Parigot had the idea to make continuation variables a separate syntactic category, leading to the elegant lambda-mu calculus in "λμ-Calculus: An algorithmic interpretation of classical natural deduction."
Is the wording observable/observer about correct?
I think it is, insofar as it is what Erik Meijer uses. It is nonstandard terminology in academic PL.
Are the citations above really contradictory, is there a common truth?
Let us look at the citations again
a continuation is an abstract representation of the control state of a computer program.
In my interpretation (which I think is pretty standard) a continuation models what a program should do next. I think wikipedia is consistent with this.
A Haskell continuation has the following type:
This is a bit odd. But, note that later in the post Gabriel uses language which is more standard and supports my use of language.
That means that if we have a function with two continuations:
(a1 -> r) -> ((a2 -> r) -> r)
Fueled by reading about continuations via Andrzej Filinski's Declarative Continuations and Categorical Duality I adopt the following terminology and understanding.
A continuation on values of a is a "black hole which accepts values of a". You can see it as a black box with one operation—you feed it a value a and then the world ends. Locally at least.
Now let's assume we're in Haskell and I demand that you construct for me a function forall r . (a -> r) -> r. Let's say, for now, that a ~ Int and it'll look like
f :: forall r . (Int -> r) -> r
f cont = _
where the type hole has a context like
r :: Type
cont :: Int -> r
-----------------
_ :: r
Clearly, the only way we can comply with these demands is to pass an Int into the cont function and return its result, after which no further computation can happen. This models the idea of "feed an Int to the continuation and then the world ends".
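Filling the hole gives something like the following sketch. The value 42 and the names `recovered` and `rendered` are arbitrary choices for illustration; any Int would do.

```haskell
{-# LANGUAGE RankNTypes #-}

-- With `r` unknown, the only way to produce an `r` is to feed some Int to
-- `cont` and return whatever comes back. 42 is an arbitrary choice.
f :: forall r. (Int -> r) -> r
f cont = cont 42

-- Because `f` works for every `r`, the caller decides what the
-- continuation does with the Int it receives.
recovered :: Int
recovered = f id

rendered :: String
rendered = f show
```

Passing `id` as the continuation recovers the Int that `f` was holding, which is one way to see that a value of this type carries exactly one Int.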
So, I would call the function (a -> r) the continuation so long as it's in a context with a fixed-but-unknown r and a demand to return that r. For instance, the following is not so much of a continuation
forall r . (a -> r) -> (r, a)
as we're clearly allowed to pass back out more information from our failing universe than the continuation alone allows.
On "Observable"
I'm personally not a fan of the "observer"/"observable" terminology. In that terminology we might write
newtype Observable a = O { observe :: forall r . (a -> r) -> r }
so that we have observe :: Observable a -> (a -> r) -> r which ensures that exactly one a will be passed to an "observer" a -> r "observing" it. This gives a very operational view to the type above while Cont or even the scarily named Yoneda Identity explains much more declaratively what the type actually is.
I think the point is to somehow hide the complexity of Cont behind metaphor to make it less scary for "the average programmer", but that just adds an extra layer of metaphor for behavior to leak out of. Cont and Yoneda Identity explain exactly what the type is without dressing it up.
I suggest recalling the calling convention for C on x86 platforms, because of its use of the stack and registers to pass arguments around. This will turn out to be very useful for understanding the abstraction.
Suppose function f calls function g and passes 0 to it. This looks like so:
mov eax, 0
call g -- now eax is the first argument,
-- and the stack has the address of return point, f'
g: -- here goes g that uses eax to compute the return value
mov eax,1 -- which by calling convention is placed in eax
ret -- get the return point, f', off the stack, and jump there
f': ...
You see, placing the return point f' on the stack is the same as passing a function pointer as one of the arguments, and then the return is the same as calling the given function and passing it a value. So from g's point of view the return point to f looks like a function of one argument, f' :: a -> r. As you can see, the state of the stack completely captures the state of the computation f was performing, which needed an a from g in order to proceed.
At the same time, at the point g is called it looks like a function that accepts a function of one argument (we place the pointer of that function on the stack), which will eventually compute the value of type r that the code from f': onwards was meant to compute, so the type becomes g :: (a->r)->r.
Since f' is given a value of type a from "somewhere", f' can be seen as the observer of g - which is, conversely, the observable.
This is only intended to give a basic idea and tie somehow to the world you probably already know. The magic of continuations permits more tricks than just converting "plain" computation into computation with continuations.
When we refer to a continuation, we mean the part that lets us continue calculating a result.
An operation in the Continuation Monad is analogous to a function that is incomplete and so it is waiting on another function to complete it. Although, the Continuation Monad is itself a valid construct that can be used to complete another Continuation Monad, that is what the binding operator (>>=) for the Cont Monad does.
When writing code that involves callCC or Call with Current Continuation, you are passing the current Cont Monad into another Cont Monad so that the second one can make use of it. For example, it might prematurely end execution by calling the first Cont Monad, and from there the cycle can either repeat or diverge into a different Continuation Monad.
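As a rough illustration of ending execution early with callCC, here is a self-contained sketch. It hand-rolls the `Cont` newtype quoted in the question rather than importing a library, so the instances and the `callCC` definition below are written out for illustration and are not a particular package's source; `safeDiv` is a hypothetical example function.

```haskell
-- A hand-rolled continuation monad, so the sketch needs no packages.
newtype Cont r a = Cont { runCont :: (a -> r) -> r }

instance Functor (Cont r) where
  fmap f (Cont c) = Cont (\k -> c (k . f))

instance Applicative (Cont r) where
  pure a = Cont (\k -> k a)
  Cont cf <*> Cont ca = Cont (\k -> cf (\f -> ca (k . f)))

instance Monad (Cont r) where
  Cont c >>= f = Cont (\k -> c (\a -> runCont (f a) k))

-- Hand the current continuation `k` to `f` wrapped as an escape function.
-- Calling the escape function ignores its own continuation and jumps to `k`.
callCC :: ((a -> Cont r b) -> Cont r a) -> Cont r a
callCC f = Cont (\k -> runCont (f (\a -> Cont (\_ -> k a))) k)

-- Divide, but escape early through `exit` when the divisor is zero.
safeDiv :: Int -> Int -> Cont r (Either String Int)
safeDiv x y = callCC $ \exit -> do
  _ <- if y == 0 then exit (Left "divide by zero") else pure ()
  pure (Right (x `div` y))
```

Here `runCont (safeDiv 6 3) id` runs to the end of the do block, while `runCont (safeDiv 1 0) id` leaves through `exit` and never reaches the division.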
The part that is the continuation is different from which perspective you use. In my personal opinion, the best way to describe a continuation is in relation to another construct.
So if we return to our example of two Cont Monads interacting, from the perspective of the first Monad the continuation is the (a -> r) -> r (because that is the unwrapped type of the first Monad) and from the perspective of the second Monad the continuation is the (a -> r) (because that is the unwrapped type of the first monad when a is substituted for (a -> r)).

Proving "no corruption" in Haskell

I work in a safety-critical industry, and our software projects generally have safety requirements imposed; things that we have to demonstrate that the software does to a high degree of certainty. Often these are negatives, such as " shall not corrupt more frequently than 1 in ". (I should add that these requirements come from statistical system safety requirements).
One source of corruption is clearly coding errors, and I would like to use the Haskell type system to exclude at least some classes of these errors. Something like this:
First, here is our critical data item that must not be corrupted.
newtype Critical = Critical String
Now I want to store this item in some other structures.
data Foo = Foo Integer Critical
data Bar = Bar String Critical
Now I want to write a conversion function from Foo to Bar which is guaranteed not to mess with the Critical data.
goodConvert, badConvert :: Foo -> Bar
goodConvert (Foo n c) = Bar (show n) c
badConvert (Foo n (Critical s)) = Bar (show n) (Critical $ "Bzzt - " ++ s)
I want "goodConvert" to type check, but "badConvert" to fail type checking.
Obviously I can carefully not import the Critical constructor into the module that does conversion. But it would be much better if I could express this property in the type, because then I can compose up functions that are guaranteed to preserve this property.
I've tried adding phantom types and "forall" in various places, but that doesn't help.
One thing that would work would be to not export the Critical constructor, and then have
mkCritical :: String -> IO Critical
Since the only place that these Critical data items get created is in the input functions, this makes some sense. But I'd prefer a more elegant and general solution.
Edit
In the comments FUZxxl suggested a look at Safe Haskell. This looks like the best solution. Rather than adding a "no corruption" modifier at the type level as I originally wanted, it looks like you can do it at the module level, like this:
1: Create a module "Critical" that exports all the features of the Critical data type, including its constructor. Mark this module as "unsafe" by putting "{-# LANGUAGE Unsafe #-}" in the header.
2: Create a module "SafeCritical" that re-exports everything except the constructor and any other functions that might be used to corrupt a critical value. Mark this module as "trustworthy".
3: Mark any modules that are required to handle Critical values without corruption as "safe". Then use this to demonstrate that any function imported as "safe" cannot cause corruption to a Critical value.
This will leave a smaller minority of code, such as input code that parses Critical values, requiring further verification. We can't eliminate this code, but reducing the amount that needs detailed verification is still a significant win.
The method is based on the fact that a function cannot invent a new value unless a function returns it. If a function only gets one Critical value (as in the "convert" function above) then that is the only one it can return.
A harder variation of the problem comes when a function has two or more Critical values of the same type; it has to guarantee not to mix them up. For instance,
swapFooBar :: (Foo, Bar) -> (Bar, Foo)
swapFooBar (Foo n c1, Bar s c2) = (Bar s c1, Foo n c2)
However this can be handled by giving the same treatment to the containing data structures.
You can use parametricity to get partway there
data Foo c = Foo Integer c
data Bar c = Bar String c
goodConvert :: Foo c -> Bar c
goodConvert (Foo n c) = Bar (show n) c
Since c is an unconstrained type variable, you know that the function goodConvert cannot know anything about c, and therefore cannot construct a different value of that type. It has to use the one provided in the input.
Well, almost. Bottom values allow you to break this guarantee. However, you at least know that if you try to use a "corrupted" value, it will result in an exception (or non-termination).
badConvert :: Foo c -> Bar c
badConvert (Foo n c) = Bar (show n) undefined
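Putting the parametric version together with the original Critical type might look like the sketch below. The `deriving` clauses and the `converted` binding are additions for illustration only.

```haskell
newtype Critical = Critical String deriving (Eq, Show)

data Foo c = Foo Integer c
data Bar c = Bar String c deriving (Eq, Show)

goodConvert :: Foo c -> Bar c
goodConvert (Foo n c) = Bar (show n) c

-- Rejected by the type checker: `c` is abstract here, so there is no
-- `Critical` constructor to pattern-match on or to rebuild with.
-- badConvert :: Foo c -> Bar c
-- badConvert (Foo n (Critical s)) = Bar (show n) (Critical ("Bzzt - " ++ s))

-- Callers instantiate `c` at the concrete type they care about.
converted :: Bar Critical
converted = goodConvert (Foo 7 (Critical "payload"))
```

The conversion code never mentions Critical, so the guarantee comes from the signature `Foo c -> Bar c` alone, subject to the caveat about bottom values noted above.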
While hammar's solution is excellent and I would normally suggest smart constructors / not exporting the constructor, today I decided to try solving this in the Coq proof assistant and extracting to Haskell.
Take note! I am not very well versed in Coq / extraction. Some people have done good work with proving and extracting Haskell code, so look to them for quality examples - I'm just toying!
First we want to define your data types. In Coq this looks much like Haskell GADTs:
Require Import String.
Require Import ZArith.
Inductive Critical :=
Crit : string -> Critical.
Inductive FooT :=
Foo : Z -> Critical -> FooT.
Inductive BarT :=
Bar : string -> Critical -> BarT.
Think of those Inductive lines, such as Inductive FooT := Foo : ... ., as data type declarations: data FooT = Foo Integer Critical
For ease of use, let's get some field accessors:
Definition critF f := match f with Foo _ c => c end.
Definition critB b := match b with Bar _ c => c end.
Since Coq doesn't define many "show" style functions, I'll use a placeholder for showing integers.
Definition ascii_of_Z (z : Z) : string := EmptyString. (* FIXME *)
Now we've got the basics, let's define the goodConvert function!
Definition goodConvert (foo : FooT) : BarT :=
match foo with
Foo n c => Bar (ascii_of_Z n) c
end.
That's all fairly obvious - it's your convert function but in Coq and using a case like statement instead of top-level pattern matching. But how do we know this function is actually going to maintain the invariant? We prove it!
Lemma convertIsGood : forall (f : FooT) (b : BarT),
goodConvert f = b -> critF f = critB b.
Proof.
intros.
destruct f. destruct b.
unfold goodConvert in H. simpl.
inversion H. reflexivity.
Qed.
That says that if converting f results in b then the critical field of f must be the same as the critical field of b (assuming some minor things, such as you not messing up the field accessor implementations).
Now let's extract this to Haskell!
Extraction Language Haskell.
Extract Constant ascii_of_Z => "Prelude.show". (* obviously, all sorts of unsafe and incorrect behavior can be introduced by your extraction *)
Extract Inductive string => "Prelude.String" ["[]" ":"]. Print positive.
Extract Inductive positive => "Prelude.Integer" ["`Data.Bits.shiftL` 1 + 1" "`Data.Bits.shiftL` 1" "1"].
Extract Inductive Z => "Prelude.Integer" ["0" "" ""].
Extraction "so.hs" goodConvert critF critB.
Producing:
module So where

import qualified Prelude

data Bool =
   True
 | False

data Ascii0 =
   Ascii Bool Bool Bool Bool Bool Bool Bool Bool

type Critical =
  Prelude.String
  -- singleton inductive, whose constructor was crit

data FooT =
   Foo Prelude.Integer Critical

data BarT =
   Bar Prelude.String Critical

critF :: FooT -> Critical
critF f =
  case f of {
   Foo z c -> c}

critB :: BarT -> Critical
critB b =
  case b of {
   Bar s c -> c}

ascii_of_Z :: Prelude.Integer -> Prelude.String
ascii_of_Z z =
  []

goodConvert :: FooT -> BarT
goodConvert foo =
  case foo of {
   Foo n c -> Bar (ascii_of_Z n) c}
Can we run it?? Does it work?
> critB $ goodConvert (Foo 32 "hi")
"hi"
Great! If anyone has suggestions for me, even though this is an "answer", I'm all ears. I'm not sure how to drop the dead code of things like Ascii0 or Bool, not to mention make good show instances. If anyone's curious, I think the field names can be done automatically if I used a Record instead of an Inductive, but that might make this post syntactically uglier.
I think the solution of hiding constructors is idiomatic. You can export two functions:
mkCritical :: String -> D Critical
extract :: Critical -> String
where D is the trivial monad, or any other. Any function that creates objects of type Critical at some point is marked with D. A function without that D can extract data from Critical objects, but not create new ones.
Alternatively:
data C a = C a Critical
modify :: (a -> String -> b) -> C a -> C b
modify f (C x (Critical y)) = C (f x y) (Critical y)
If you don't export constructor C, only modify, you can write:
goodConvert :: C Int -> C String
goodConvert = modify (\a _ -> show a)
but badConvert is impossible to write.
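Put together, the second approach might look like the sketch below. It assumes Critical is a newtype around String, and the helpers mkCritical, mkC, payload and critical are hypothetical names added only so the example can be used from outside the module.

```haskell
-- Hypothetical sketch. In a library the header would be roughly
--   module Crit (Critical, C, mkCritical, mkC, payload, critical, modify) where
-- with the data constructors Critical and C kept private.

newtype Critical = Critical String

data C a = C a Critical

-- Hypothetical smart constructors and accessors.
mkCritical :: String -> Critical
mkCritical = Critical

mkC :: a -> Critical -> C a
mkC = C

payload :: C a -> a
payload (C x _) = x

critical :: C a -> String
critical (C _ (Critical s)) = s

-- The callback may read the critical string, but it can only replace the
-- payload; the Critical part is copied over unchanged.
modify :: (a -> String -> b) -> C a -> C b
modify f (C x (Critical y)) = C (f x y) (Critical y)

goodConvert :: C Int -> C String
goodConvert = modify (\a _ -> show a)
```

Because client code can only go through modify, every conversion carries the critical field across untouched, which is the property the question asks for.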

SML conversions to Haskell

A few basic questions, for converting SML code to Haskell.
1) I am used to having local embedded expressions in SML code, for example test expressions, prints, etc., which function as local tests and produce output when the code is loaded (evaluated).
In Haskell it seems that the only way to get results (evaluation) is to add code in a module, and then go to main in another module and add something to invoke and print results.
Is this right? In GHCi I can type expressions and see the results, but can this be automated?
Having to go to the top-level main for each test evaluation seems inconvenient to me. Maybe I just need to shift my paradigm for laziness.
2) in SML I can do pattern matching and unification on a returned result, e.g.
val myTag(x) = somefunct(a,b,c);
and get the value of x after a match.
Can I do something similar in Haskell easily, without writing separate extraction functions?
3) How do I do a constructor with a tuple argument, i.e. uncurried.
in SML:
datatype Thing = Info of int * int;
but in Haskell I tried:
data Thing = Info ( Int Int)
which fails. ("Int is applied to too many arguments in the type:A few Int Int")
The curried version works fine,
data Thing = Info Int Int
but I wanted un-curried.
Thanks.
This question is a bit unclear -- you're asking how to evaluate functions in Haskell?
If it is about inserting debug and tracing into pure code, this is typically only needed for debugging. To do this in Haskell, you can use Debug.Trace.trace, in the base package.
If you're concerned about calling functions, Haskell programs evaluate from main downwards, in dependency order. In GHCi you can, however, import modules and call any top-level function you wish.
You can return the original argument to a function, if you wish, by making it part of the function's result, e.g. with a tuple:
f x = (x, y)
where y = g a b c
Or do you mean to return either one value or another? Then use a tagged union (sum type), such as Either:
f x = if x > 0 then Left x
else Right (g a b c)
How do I do a constructor with a tuple argument, i.e. uncurried in SML
Using the (,) constructor. E.g.
data T = T (Int, Int)
though more Haskell-like would be:
data T = T Int Bool
and those should probably be strict fields in practice:
data T = T !Int !Bool
Debug.Trace allows you to print debug messages inline. However, since these functions use unsafePerformIO, they might behave in unexpected ways compared to a call-by-value language like SML.
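A small sketch of that difference, using only Debug.Trace.trace from base; the function names are made up for illustration. A trace message appears when the traced value is demanded, not when the binding is written down.

```haskell
import Debug.Trace (trace)

-- Each trace message fires when its result is demanded, not when the
-- binding is introduced.
pairOf :: Int -> (Int, Int)
pairOf n = (a, b)
  where
    a = trace "evaluating a" (n + 1)
    b = trace "evaluating b" (n * 2)

-- Only the second component is demanded here, so "evaluating a" is
-- never printed.
secondOnly :: Int -> Int
secondOnly = snd . pairOf
```

In SML the corresponding prints would both run as soon as the bindings are evaluated; here the message for `a` never shows up when only `secondOnly` is used.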
I think the @ syntax (an as-pattern) is what you're looking for here:
data MyTag = MyTag Int Bool String
someFunct :: MyTag -> (MyTag, Int, Bool, String)
someFunct x@(MyTag a b c) = (x, a, b, c) -- x is bound to the entire argument
In Haskell, tuple types are separated by commas, e.g., (t1, t2), so what you want is:
data Thing = Info (Int, Int)
Reading the other answers, I think I can provide a few more examples and one recommendation.
data ThreeConstructors = MyTag Int | YourTag (String,Double) | HerTag [Bool]
someFunct :: Char -> Char -> Char -> ThreeConstructors
MyTag x = someFunct 'a' 'b' 'c'
This is like the "let MyTag x = someFunct a b c" examples, but it is at the top level of the module.
As you have noticed, Haskell's top level can define commands, but there is no way to automatically run any code merely because your module has been imported by another module. This is entirely different from Scheme or SML. In Scheme the file is interpreted as being executed form-by-form, but Haskell's top level is only declarations. Thus libraries cannot do normal things like run initialization code when loaded; they have to provide a "pleaseRunMe :: IO ()" kind of command to do any initialization.
As you point out this means running all the tests requires some boilerplate code to list them all. You can look under hackage's Testing group for libraries to help, such as test-framework-th.
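Without any library, the boilerplate can be as small as the sketch below. All names are hypothetical, and this is not the API of test-framework-th or any other package; it is just a hand-rolled list of checks.

```haskell
-- Hypothetical function under test.
double :: Int -> Int
double x = x * 2

-- Each check is a label paired with a Bool; new checks are added to this list.
checks :: [(String, Bool)]
checks =
  [ ("double 0", double 0 == 0)
  , ("double 21", double 21 == 42)
  ]

-- Labels of the checks that did not hold.
failures :: [String]
failures = [label | (label, ok) <- checks, not ok]

-- Call this from main, or evaluate it directly in GHCi after loading the file.
runChecks :: IO ()
runChecks
  | null failures = putStrLn "all checks passed"
  | otherwise     = mapM_ (putStrLn . ("FAILED: " ++)) failures
```

Evaluating `runChecks` in GHCi after each reload gets close to the SML habit of seeing test results when the code is loaded.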
For #2, yes, Haskell's pattern matching does the same thing. Both let and where do pattern matching. You can do
let MyTag x = someFunct a b c
in ...
or
...
where MyTag x = someFunct a b c
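A complete version of the where form, as a minimal sketch: the Tagged type and the body of someFunct are invented for illustration. One caveat worth knowing is that such a pattern binding is lazy, so if the function returns a different constructor the failure only shows up at run time when x is used; a case expression lets every constructor be handled.

```haskell
-- Hypothetical type with two constructors.
data Tagged = MyTag Int | OtherTag String

-- Hypothetical function that happens to always return MyTag.
someFunct :: Int -> Int -> Int -> Tagged
someFunct a b c = MyTag (a + b + c)

-- Pattern binding in a where clause: x is bound to the MyTag payload.
viaWhere :: Int
viaWhere = x * 2
  where MyTag x = someFunct 1 2 3

-- The binding above is lazy and refutable: had someFunct returned OtherTag,
-- using x would fail at run time. A case expression covers every constructor.
viaCase :: Tagged -> Maybe Int
viaCase t = case t of
  MyTag x    -> Just x
  OtherTag _ -> Nothing
```

GHC may warn about the incomplete pattern in viaWhere when warnings are enabled, which is a useful hint that the binding can fail.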

Resources