Alloy: changing comparison operator precedence for signatures

I have noticed that Alloy's precedence for comparison operators follows this order:
comparison negation operators: ! and not;
comparison operators: in, =, <, >, =<, >=.
A project I am working on defines two predicates, mySyn and Semantics, to evaluate Boolean expressions through the following signatures (myNotEquals, myEquals and myGreaterThan), which are built on Alloy's comparison operators (!=, = and >, respectively). They all extend the abstract signature BExp.
I would like to ask two questions (to simplify, I have omitted some pieces of code, using the ... symbol):
Does the evaluation of those signatures follow the original Alloy order? That is, does myNotEquals come first, then myEquals and finally myGreaterThan?
Is it possible to change the order in which those signatures are evaluated, so that, for instance, myEquals comes first, then myGreaterThan and finally myNotEquals?
mySyn predicate:
pred mySyn[...] {
...
, Semantics[evalC, evalB, evalA]
...
}
Semantics predicate:
pred Semantics[ ...] {
...
, evalB: BExp -> (State -> Bit)
...
}
Evaluating Boolean expressions:
all b: myEquals, s: State | aritmethicExpr1[s] = aritmethicExpr2[s] implies evalB[b][s] = BitTrue else evalB[b][s] = BitFalse
all b: myNotEquals, s: State | aritmethicExpr1[s] != aritmethicExpr2[s] implies evalB[b][s] = BitTrue else evalB[b][s] = BitFalse
all b: myGreaterThan, s: State | aritmethicExpr1[s] > aritmethicExpr2[s] implies evalB[b][s] = BitTrue else evalB[b][s] = BitFalse
Thank you for your help :)

Related

Can I overload the '=' operator for struct-typed arguments?

I want to get a warning everywhere the equals operator '=' is used on struct-typed arguments. It seems that if I define this operator it does not overload, it just redefines '=' to only work for struct-typed arguments, which is not what I want.
[<Obsolete("Possible performance issue with '=' operator on structs. See https://github.com/dotnet/fsharp/issues/526.")>]
let inline (=) (x: ^T) (y: ^T) : bool when ^T: struct and ^T: (static member (=) : ^T * ^T -> bool) =
x = y
let a = obj()
let x = Guid.NewGuid()
let y = Guid.NewGuid()
a = a |> ignore // oh no - "A generic construct requires that the type 'obj' is a CLI or F# struct type."
x = y |> ignore // ok - gets desired warning
Can this be done in F# as is?
Update: found a possible workaround: simply use the [<NoEquality>] attribute on the affected structs; it does mean that all structs need to be annotated but at least it does help.
No, you cannot redefine a globally-scoped (i.e. non-member) function only for some cases. Once a function is in scope, it will always be used; there is no fallback to the previously defined one.
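For what it's worth, here is a minimal sketch of the [<NoEquality>] workaround mentioned in the update above (the Point type is purely illustrative): annotating a struct with the attribute turns any use of '=' on it into a compile-time error rather than a warning.
[<Struct; NoEquality; NoComparison>]
type Point = { X: int; Y: int }
let p1 = { X = 1; Y = 2 }
let p2 = { X = 3; Y = 4 }
// let same = (p1 = p2)   // compile-time error: Point does not support the 'equality' constraint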

Alloy pred/fun parameter constraint

The following Alloy predicate p has a parameter t declared as a singleton of type S. Invoking run p gives the expected result (no instance found), because the predicate body requires t to contain two different elements s and s', which contradicts the singleton declaration. However, in the second run command, a set of two disjoint elements of type S is passed to predicate p, and that command finds an instance. Why is that the case?
sig S {}
pred p(t: one S) {
some s, s': t | s != s'
}
r1: run p -- no instance found
r2: run { -- instance found
some disj s0, s1: S {
S = s0 + s1
p[S]
}
}
See https://stackoverflow.com/a/43002442/1547046. Same issue, I think.
BTW, there's a nice research problem here. Can you define a coherent semantics for argument declarations that would be better (that is, simpler, unsurprising, and well defined in all contexts)?
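As I understand the behaviour described in that answer, the multiplicity in a parameter declaration (here one S) is not enforced when the predicate is invoked with an explicit argument: the actual expression is simply substituted for the formal parameter. A minimal, untested sketch of a workaround is to restate the multiplicity inside the body (p2 and r3 are illustrative names):
pred p2(t: set S) {
one t  -- the multiplicity is now an explicit constraint
some s, s': t | s != s'
}
r3: run { -- no instance found
some disj s0, s1: S {
S = s0 + s1
p2[S]
}
}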

Replacing recursion with transitive closure (reachability and productivity of non-terminals)

Sometimes, when I would like to use recursion in Alloy, I find I can get by with transitive closure, or sequences.
For example, in a model of context-free grammars:
abstract sig Symbol {}
sig NT, T extends Symbol {}
// A grammar is a tuple(V, Sigma, Rules, Start)
one sig Grammar {
V : set NT,
Sigma : set T,
Rules : set Prod,
Start : one V
}
// A production rule has a left-hand side
// and a right-hand side
sig Prod {
lhs : NT,
rhs : seq Symbol
}
fact tidy_world {
// Don't fill the model with irrelevancies
Prod in Grammar.Rules
Symbol in (Grammar.V + Grammar.Sigma)
}
One possible definition of reachable non-terminals would be "the start symbol, and any non-terminal appearing on the right-hand side of a rule for a reachable symbol." A straightforward translation would be
// A non-terminal 'refers' to non-terminals
// in the right-hand sides of its rules
pred refers[n1, n2 : NT] {
let r = (Grammar.Rules & lhs.n1) |
n2 in r.rhs.elems
}
pred reachable[n : NT] {
n in Grammar.Start
or some n2 : NT
| (reachable[n2] and refers[n2,n])
}
Predictably, this blows the stack. But if we simply take the transitive closure of Grammar.Start under the refers relation (or, strictly speaking, a relation formed from the refers predicate), we can define reachability:
// A non-terminal is 'reachable' if it's the
// start symbol or if it is referred to by
// (rules for) a reachable symbol.
pred Reachable[n : NT] {
n in Grammar.Start.*(
{n1, n2 : NT | refers[n1,n2]}
)
}
pred some_unreachable {
some n : (NT - Grammar.Start)
| not Reachable[n]
}
run some_unreachable for 4
In principle, the definition of productive symbols is similarly recursive: a symbol is productive iff it is a terminal symbol, or it has at least one rule in which every symbol in the right-hand side is productive. The simple-minded way to write this is
pred productive[s : Symbol] {
s in T
or some p : (lhs.s) |
all r : (p.rhs.elems) | productive[r]
}
Like the straightforward definition of reachability, this blows the stack. But I have not yet found a relation I can define on symbols which will give me, via transitive closure, the result I want. Have I found a case where transitive closure cannot substitute for recursion? Or have I just not thought hard enough to find the right relation?
There is an obvious, if laborious, hack:
pred p0[s : Symbol] { s in T }
pred p1[s : Symbol] { p0[s]
or some p : (lhs.s)
| all e : p.rhs.elems
| p0[e]}
pred p2[s : Symbol] { p1[s]
or some p : (lhs.s)
| all e : p.rhs.elems
| p1[e]}
pred p3[s : Symbol] { p2[s]
or some p : (lhs.s)
| all e : p.rhs.elems
| p2[e]}
... etc ...
pred productive[n : NT] {
p5[n]
}
This works OK as long as one doesn't forget to add enough predicates to handle the longest possible chain of non-terminal references, if one raises the scope.
Concretely, I seem to have several questions; answers to any of them would be welcome:
1. Is there a way to define the set of productive non-terminals in Alloy without resorting to the p0, p1, p2, ... hack?
2. If one does have to resort to the hack, is there a better way to define it?
3. As a theoretical question: is it possible to characterize the set of recursive predicates that can be defined using transitive closure, or sequences of atoms, instead of recursion?
Have I found a case where transitive closure cannot substitute for recursion?
Yes, that is the case. More precisely, you have found a recursive relation that cannot be expressed with first-order transitive closure (which is what Alloy supports).
Is there a way to define the set of productive non-terminals in Alloy without resorting to the p0, p1, p2, ... hack? If one does have to resort to the hack, is there a better way to define it?
There is no way to do this in Alloy. However, there might be a way to do it in Alloy*, which supports higher-order quantification. (The idea would be to describe the set of productive elements with a closure over the "productiveness" relation, using second-order quantification over the set of productive symbols and constraining this set to be minimal. A similar idea is described in "A.1.9 Axiomatizing Transitive Closure" in the Alloy book.)
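One way that idea might be written down (an untested sketch; the names Closed and productiveHO are illustrative, and the quantification over a set-valued P is higher-order, so it needs Alloy* rather than plain Alloy):
// Closed[P] holds when P contains every terminal and the left-hand side of
// every rule whose right-hand side lies entirely inside P.
pred Closed[P : set Symbol] {
T in P
all p : Prod | p.rhs.elems in P implies p.lhs in P
}
// A symbol is productive iff it belongs to every closed set, i.e. to the
// least fixed point of the rule above.
pred productiveHO[s : Symbol] {
all P : set Symbol | Closed[P] implies s in P
}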
As a theoretical question: is it possible to characterize the set of recursive predicates that can be defined using transitive closure, or sequences of atoms, instead of recursion?
This is an interesting question. The wiki article mentions the relative expressiveness of second-order logic when transitive closure and a fixed-point operator are added (the latter being able to express forms of recursion).

Bison/Flex, reduce/reduce, identifier in different production

I am doing a parser in bison/flex.
This is part of my code:
I want to implement the assignment production so that the identifier can be either a boolean_expr or an expr; its type will be checked against a symbol table.
So it allows something like:
int a = 1;
boolean b = true;
if(b) ...
However, I get a reduce/reduce conflict if I include identifier in both term and boolean_expr. Is there any way to solve this problem?
Essentially, what you are trying to do is to inject semantic rules (type information) into your syntax. That's possible, but it is not easy. More importantly, it's rarely a good idea. It's almost always best if syntax and semantics are well delineated.
All the same, as presented your grammar is unambiguous and LALR(1). However, the latter feature is fragile, and you will have difficulty maintaining it as you complete the grammar.
For example, you don't include your assignment syntax in your question, but it would presumably be something like:
assignment: identifier '=' expr
| identifier '=' boolean_expr
;
Unlike the rest of the grammar shown, that production is ambiguous, because in
x = y
without knowing anything about y, y could be reduced to either a term or a boolean_expr.
A possibly more interesting example is the addition of parentheses to the grammar. The obvious way of doing that would be to add two productions:
term: '(' expr ')'
boolean_expr: '(' boolean_expr ')'
The resulting grammar is not ambiguous, but it is no longer LALR(1). Consider the two following declarations:
boolean x = (y) < 7
boolean x = (y)
In the first one, y must be an int so that (y) can be reduced to a term; in the second one y must be boolean so that (y) can be reduced to a boolean_expr. There is no ambiguity; once the < is seen (or not), it is entirely clear which reduction to choose. But < is not the lookahead token, and in fact it could be arbitrarily distant from y:
boolean x = ((((((((((((((((((((((y...
So the resulting unambiguous grammar is not LALR(k) for any k.
One way you could solve the problem would be to inject the type information at the lexical level, by giving the scanner access to the symbol table. Then the scanner could look up a scanned identifier token in the symbol table and use that information to decide among three token types (or more, if you have more datatypes): undefined_variable, integer_variable, and boolean_variable. Then you would have, for example:
declaration: "int" undefined_variable '=' expr
| "boolean" undefined_variable '=' boolean_expr
;
term: integer_variable
| ...
;
boolean_expr: boolean_variable
| ...
;
That will work, but it should be obvious that this is not scalable: every time you add a type, you'll have to extend both the grammar and the lexical description, because now the semantics is not only mixed up with the syntax, it has even gotten intermingled with the lexical analysis. Once you let semantics out of its box, it tends to contaminate everything.
There are languages for which this really is the most convenient solution: C parsing, for example, is much easier if typedef names and identifier names are distinguished so that you can tell whether (t)*x is a cast or a multiplication. (But it doesn't work so easily for C++, which has much more complicated name lookup rules, and also much more need for semantic analysis in order to find the correct parse.)
But, honestly, I'd suggest that you do not use C -- and much less C++ -- as a model of how to design a language. Languages which are hard for compilers to parse are also hard for human beings to parse. The "most vexing parse" continues to be a regular source of pain for C++ newcomers, and even sometimes trips up relatively experienced programmers:
class X {
public:
X(int n = 0) : data_is_available_(n) {}
operator bool() const { return data_is_available_; }
// ...
private:
bool data_is_available_;
// ...
};
X my_x_object();
// ...
if (!my_x_object) {
// This code is unreachable. Can you see why?
}
In short, you're best off with a language which can be parsed into an AST without any semantic information at all. Once the parser has produced the AST, you can do semantic analyses in separate passes, one of which will check type constraints. That's far and away the cleanest solution. Without explicit typing, the grammar is slightly simplified, because an operand can now be any expr:
expr: conjunction | expr "or" conjunction ;
conjunction: comparison | conjunction "and" comparison ;
comparison: sum | sum '<' sum ;
sum: product | sum '+' product ;
product: term | product '*' term ;
term: identifier
| constant
| '(' expr ')'
;
Each action in the above would simply create a new AST node and set $$ to the new node. At the end of the parse, the AST is walked to verify that all exprs have the correct type.
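For instance, the "or" production might carry an action along these lines (mk_node and OP_OR are hypothetical helpers, shown only to sketch the shape of such actions):
expr: conjunction           { $$ = $1; }
| expr "or" conjunction     { $$ = mk_node(OP_OR, $1, $3); }
;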
If that seems like overkill for your project, you can do the semantic checks in the reduction actions, effectively intermingling the AST walk with the parse. That might seem convenient for immediate evaluation, but it also requires including explicit type information in the parser's semantic type, which adds unnecessary overhead (and, as mentioned, the inelegance of letting semantics interfere with the parser.) In that case, every action would look something like this:
expr : expr '+' expr { CheckArithmeticCompatibility($1, $3);
$$ = NewArithmeticNode('+', $1, $3);
}

Union types and Intersection types

What are the various use cases for union types and intersection types? There has lately been a lot of buzz about these type-system features, yet somehow I have never felt the need for either of them!
Union Types
To quote Robert Harper, "Practical Foundations for Programming Languages", ch 15:
Most data structures involve alternatives such as the distinction between a leaf and an interior node in a tree, or a choice in the outermost form of a piece of abstract syntax. Importantly, the choice determines the structure of the value. For example, nodes have children, but leaves do not, and so forth. These concepts are expressed by sum types, specifically the binary sum, which offers a choice of two things, and the nullary sum, which offers a choice of no things.
Booleans
The simplest sum type is the Boolean:
data Bool = True
| False
Booleans have only two valid values, True or False. So instead of representing them as numbers, we can use a sum type to more accurately encode the fact that there are only two possible values.
Enumerations
Enumerations are examples of more general sum types: ones with many, but finite, alternative values.
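For example, a weekday type is a sum type with seven nullary alternatives (a standard Haskell declaration, not taken from the text above):
-- An enumeration: seven alternative constructors, none carrying any data.
data Day = Mon | Tue | Wed | Thu | Fri | Sat | Sun deriving (Show, Eq, Enum, Bounded)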
Sum types and null pointers
The best practically motivating example for sum types is discriminating between valid results and error values returned by functions, by distinguishing the failure case.
For example, null pointers and end-of-file characters are hackish encodings of the sum type:
data Maybe a = Nothing
| Just a
where we can distinguish between valid and invalid values by using the Nothing or Just tag to annotate each value with its status.
By using sum types in this way we can rule out null pointer errors entirely, which is a pretty decent motivating example. Null pointers are entirely due to the inability of older languages to express sum types easily.
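As a small illustration of that style (standard Haskell, not from the quoted text), a division function can report failure through its type instead of a sentinel value, and callers are forced to handle the Nothing case:
-- Total division: the Nothing case replaces a null or error sentinel.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)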
Intersection Types
Intersection types are much newer, and their applications are not as widely understood. However, Benjamin Pierce's thesis ("Programming with Intersection Types and Bounded Polymorphism") gives a good overview:
The most intriguing and potentially useful property of intersection types is their ability to express an essentially unbounded (though of course finite) amount of information about the components of a program. For example, the addition function (+) can be given the type Int -> Int -> Int ^ Real -> Real -> Real, capturing both the general fact that the sum of two real numbers is always a real and the more specialized fact that the sum of two integers is always an integer. A compiler for a language with intersection types might even provide two different object-code sequences for the two versions of (+), one using a floating point addition instruction and one using integer addition. For each instance of + in a program, the compiler can decide whether both arguments are integers and generate the more efficient object code sequence in this case.
This kind of finitary polymorphism or coherent overloading is so expressive, that ... the set of all valid typings for a program amounts to a complete characterization of the program’s behavior
They let us encode a lot of information in the type, explaining via type theory what multiple inheritance means, giving types to type classes, and so on.
Union types are useful for typing dynamic languages or otherwise allowing more flexibility in the types passed around than most static languages allow. For example, consider this:
var a;
if (condition) {
a = "string";
} else {
a = 123;
}
If you have union types, it's easy to type a as int | string.
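TypeScript, for instance, supports this directly; the variable from the snippet above can be given the union type string | number (condition here is just a stand-in for whatever the real test is):
// With a union type, both assignments type-check, and later uses of 'a'
// must account for both possibilities.
declare const condition: boolean;
let a: string | number;
if (condition) {
a = "string";
} else {
a = 123;
}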
One use for intersection types is to describe an object that implements multiple interfaces. For example, C# allows multiple interface constraints on generics:
interface IFoo {
void Foo();
}
interface IBar {
void Bar();
}
void Method<T>(T arg) where T : IFoo, IBar {
arg.Foo();
arg.Bar();
}
Here, arg's type is the intersection of IFoo and IBar. Using that, the type-checker knows both Foo() and Bar() are valid methods on it.
If you want a more practice-oriented answer:
With union and recursive types you can encode regular tree types and therefore XML types.
With intersection types you can type BOTH overloaded functions and refinement types (what in a previous post is called coherent overloading)
So for instance you can write the function add (that overloads integer sum and string concatenation) as follows
let add ( (Int,Int)->Int ; (String,String)->String )
| (x & Int, y & Int) -> x+y
| (x & String, y & String) -> x#y ;;
Which has the intersection type
(Int,Int)->Int & (String,String)->String
But you can also refine the type above and type the function above as
(Pos,Pos) -> Pos &
(Neg,Neg) -> Neg &
(Int,Int)->Int &
(String,String)->String.
where Pos and Neg are positive and negative integer types.
The code above is executable in the language CDuce ( http://www.cduce.org ), whose type system includes union, intersection, and negation types (it is mainly targeted at XML transformations).
If you want to try it and you are on Linux, then it is probably included in your distribution (apt-get install cduce or yum install cduce should do the work) and you can use its toplevel (a la OCaml) to play with union and intersection types. On the CDuce site you will find a lot of practical examples of use of union and intersection types. And since there is a complete integration with OCaml libraries (you can import OCaml libraries in CDuce and export CDuce modules to OCaml) you can also check the correspondence with ML sum types (see here).
Here is a more complex example that mixes union and intersection types (explained on the page "http://www.cduce.org/tutorial_overloading.html#val"), but to understand it you need to understand regular-expression pattern matching, which requires some effort.
type Person = FPerson | MPerson
type FPerson = <person gender = "F">[ Name Children ]
type MPerson = <person gender = "M">[ Name Children ]
type Children = <children>[ Person* ]
type Name = <name>[ PCDATA ]
type Man = <man name=String>[ Sons Daughters ]
type Woman = <woman name=String>[ Sons Daughters ]
type Sons = <sons>[ Man* ]
type Daughters = <daughters>[ Woman* ]
let fun split (MPerson -> Man ; FPerson -> Woman)
<person gender=g>[ <name>n <children>[(mc::MPerson | fc::FPerson)*] ] ->
(* the above pattern collects all the MPerson in mc, and all the FPerson in fc *)
let tag = match g with "F" -> `woman | "M" -> `man in
let s = map mc with x -> split x in
let d = map fc with x -> split x in
<(tag) name=n>[ <sons>s <daughters>d ] ;;
In a nutshell, it transforms values of type Person into values of type (Man | Woman) (where the vertical bar denotes a union type) while keeping the correspondence between genders: split is a function with the intersection type
MPerson -> Man & FPerson -> Woman
For instance, with union types one could describe a JSON domain model without introducing new classes, using only type aliases.
type JObject = Map[String, JValue]
type JArray = List[JValue]
type JValue = String | Number | Bool | Null | JObject | JArray
type Json = JObject | JArray
def stringify(json: JValue): String = json match {
case String | Number | Bool | Null => json.toString()
case JObject => "{" + json.map((x, y) => x + ": " + stringify(y)).mkString(", ") + "}"
case JArray => "[" + json.map(stringify).mkString(", ") + "]"
}
