Link between Alloy vars. and primary vars. and Alloy relational variables

I can't seem to understand the meaning of the vars. and primary vars. numbers that are displayed when solving is done. The Alloy book, in section 5.2.1, explains that Alloy relational variables are mapped to boolean variables associated with the tuples of each relation, but I don't understand the correspondence between this definition of variables and the variable counts displayed in the GUI. For example, when this code is run (I am using Alloy Analyzer 4.2, build date 2012-09-25 15:54 EDT):
sig A {}
pred show {}
run show for 2
it displays
0 vars. 0 primary vars. 0 clauses.
although one relation exists. And when this code is run:
sig A {}
fact {no A }
pred show {}
run show for 2
the variable counts are:
6 vars. 2 primary vars. 5 clauses.
I can understand that the 2 primary vars may correspond to the maximum of 2 elements of set A, but I don't understand what the 4 additional variables are.

Essentially, primary variables are the variables that correspond to instances of your declared signatures. Their number represents all instances created in the whole Alloy universe. On the other hand, the total number of variables is usually greater, since it also reflects the variables needed to represent the given facts when they are encoded into the SAT formula. (Some details about the statistics of Kodkod, the underlying solver, can be found here.)
Therefore, in your second example, the number of primary variables is 2, due to the limit of 2 on instances of the signature. (E.g., if you add another signature, the number of primary variables will be 4.) The total number of variables reflects the number of variables in the (CNF) encoding of the formula, which in turn depends on the specific facts you've declared. Note that, in the first example, no variables are needed, since there is nothing to check (the solver does not need to emit anything).
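As a toy illustration of the idea (my own sketch, not Kodkod's actual encoding), the second model can be thought of as two primary boolean variables, one per potential tuple of A, with the fact `no A` forcing both to false:

```python
from itertools import product

# Hypothetical sketch: with `run show for 2`, relation A gets one primary
# boolean variable per potential tuple: a0 = "atom A0 is in A",
# a1 = "atom A1 is in A".
primary_vars = ["a0", "a1"]

def satisfies_no_A(assignment):
    # The fact `no A` holds iff no atom is in A, i.e. every variable is false.
    return not any(assignment.values())

# Enumerate all assignments of the primary variables and keep the models.
models = [
    dict(zip(primary_vars, bits))
    for bits in product([False, True], repeat=len(primary_vars))
    if satisfies_no_A(dict(zip(primary_vars, bits)))
]
print(models)  # only the all-false assignment survives
```

The extra (non-primary) variables reported by the Analyzer would correspond to the auxiliary variables a CNF translation introduces for the fact, which this sketch does not model.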

Related

How can measurements of health properties be modeled in a class diagram?

I have a User class that can "measure" some parameters associated with a date and input them in an application. So 1 User -> many parameters of many types associated with many dates (many measurements). The parameter types are fixed and can be either numeric or strings, e.g. weight, height, calorie intake, some strings..., which are represented as an enumeration.
Now my main problem is: does the fact that the parameters can be of different datatypes (numbers or strings) mean that the general parameter type has to have specialisations for the two subgroups of parameters? Or is the datatype for each type of parameter implied in the type itself? (e.g. a "weight" implies it should be a number)
How can the "Parameter" class be represented in a correct way considering that:
it can be either numerical or a string
there is also a superuser class that can add parameters for a specific user
the parameters the superuser can input are some of the ones the normal user can PLUS some other parameters exclusive to the superuser (say: fat body mass) so there is not a 1-1 correspondence
the numerical parameters have other additional attributes that can be modified by the superuser (for example: limit weight)
the superuser supposedly should be able to add "notes" for some parameters
My confusion stems from the fact that I have no background in OOP and I can't find any similar examples online. I just need a nudge in the right direction. Is the pictured diagram correct? And if it isn't, why? The problem as of now would be how to implement the fact that the superuser can also add notes to some parameters.
Do I:
create a single parameter class with the enumeration type as an attribute, which automatically implies the datatype of the input, e.g. weight = number?
create two subclasses for each User, e.g. UserParameters and SuperUserParameters, although some parameters overlap?
leave it as is with some adjustments?
other better approach?
I'd like to propose improved terminology. Since your app is about (health) property measurements, I'll replace your class name "Parameter" with Measurement.
The following model should satisfy all of your requirements (except the one discussed below):
Notice that the two subclasses UserProperty and SpecialProperty simply define a partitioning of Property. They can be eliminated by adding an enumeration attribute propertyCategory to the Property class, having USER_PROPERTY and SPECIAL_PROPERTY as its enum literals.
The only requirement, which is not yet covered, is
the numerical parameters have other additional attributes that can be
modified by the superuser (for example: limit weight)
This needs further clarification. If these "other additional attributes" form a fixed set, then they can be modeled as further attributes of the Property class.
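A rough Python sketch of this partitioning (class and attribute names such as limit and note are my own guesses for illustration, not taken from the diagram):

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional, Union

class PropertyCategory(Enum):
    USER_PROPERTY = "user"
    SPECIAL_PROPERTY = "special"

@dataclass
class Property:
    name: str                      # e.g. "weight", "height"
    category: PropertyCategory
    # A fixed extra attribute for numeric properties, e.g. a limit weight
    # the superuser can set (hypothetical attribute).
    limit: Optional[float] = None

@dataclass
class Measurement:
    prop: Property
    measured_on: date
    # The value can be numeric or a string, depending on the property.
    value: Union[float, str]
    # A note a superuser may attach (hypothetical attribute).
    note: Optional[str] = None

weight = Property("weight", PropertyCategory.USER_PROPERTY, limit=90.0)
m = Measurement(weight, date(2024, 1, 1), 82.5, note="after holidays")
print(m.prop.name, m.value)
```

The point of the sketch is that the value's datatype is tied to the Property, not to the User, so no parallel UserParameters/SuperUserParameters hierarchy is needed.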
I don't think you should do that at the UML level at all. You are going into memory management/overlays, and those are implementation details you should not take care of. Rather, you are dealing with HeartRate and Weight as distinct objects. They will not have a common "value", which is just some memory allocation. They are what they are, and whether you need a string or a number is a property of the distinct business objects.

Why are fields different when used in an arrow expression?

The following signature describes the state of a photo-management application:
sig ApplicationState {
catalogs: set Catalog,
catalogState: catalogs -> one CatalogState
}
A signature, of course, creates a set. In this case, it creates a set of ApplicationStates:
ApplicationState0
ApplicationState1
...
catalogs is a field. It maps each ApplicationState to a set of Catalog values:
ApplicationState0, Catalog0
ApplicationState0, Catalog1
ApplicationState1, Catalog0
...
catalogState is also a field. It maps each ApplicationState to a relation. The relation is:
catalogs -> one CatalogState
That relation says: Map each value of catalogs to one CatalogState value. We already saw catalogs, which I'll repeat here:
ApplicationState0, Catalog0
ApplicationState0, Catalog1
ApplicationState1, Catalog0
...
So, the relation says to map each of those tuples to one CatalogState, like so:
ApplicationState0, Catalog0, CatalogState0
ApplicationState0, Catalog1, CatalogState0
ApplicationState1, Catalog0, CatalogState0
...
Okay, back to catalogState. Earlier we said that it maps each ApplicationState to a relation, and we just saw what that relation is. So, I believe that catalogState denotes a relation with arity=4, like so:
ApplicationState0, ApplicationState0, Catalog0, CatalogState0
ApplicationState0, ApplicationState0, Catalog1, CatalogState0
ApplicationState0, ApplicationState1, Catalog0, CatalogState0
...
But, when I run the Alloy Evaluator, it says that catalogState is a ternary relation. My takeaway from this example is:
Usually a field name denotes a relation.
A field name used in an arrow expression does not denote a relation. Rather, it denotes column 2 of the relation (the range of the relation).
Is that right? Where is this explained in the Software Abstractions book?
Section 4.2.2 of Software Abstractions (p. 97 in the second edition) begins
Relations are declared as fields of signatures.
That addresses at least part of your question, I think. (I think it may be helpful to work through the index entries for 'field' and 'relation' and read every section they point to.)
You say
A field name used in an arrow expression does not denote a relation. Rather, it denotes column 2 of the relation (the range of the relation).
It may sound pedantic, but no: field names always denote relations. Within the context of a signature declaration, however, they are implicitly prefixed with this., which removes the first column of the relation. In your declaration catalogState: catalogs -> one CatalogState, the reference to catalogs is indeed a reference to a binary relation over ApplicationState and Catalog. In this context, however, it's silently expanded to this.catalogs, which evaluates to a set of Catalog individuals. The keyword this is introduced in section 4.2.2 of Software Abstractions.
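One way to convince yourself is to model Alloy relations as Python sets of tuples (a rough sketch of the dot-join semantics, not full Alloy):

```python
# Model the catalogs field as a set of tuples (a binary relation).
catalogs = {
    ("ApplicationState0", "Catalog0"),
    ("ApplicationState0", "Catalog1"),
    ("ApplicationState1", "Catalog0"),
}

def dot_join(left, right):
    """Relational join: match the last column of `left` against the
    first column of `right`, dropping the matched column."""
    return {
        l[:-1] + r[1:]
        for l in left
        for r in right
        if l[-1] == r[0]
    }

# `this.catalogs` for this = ApplicationState0: joining a singleton unary
# relation with the binary relation leaves a unary relation (a set of
# catalogs), not a binary one.
this = {("ApplicationState0",)}
this_catalogs = dot_join(this, catalogs)
print(this_catalogs)  # {("Catalog0",), ("Catalog1",)}
```

So in `catalogState: catalogs -> one CatalogState`, the right-hand side is built from this.catalogs (one column), and prefixing the result with this again yields arity 3, not 4.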
The cardinality constraints on declarations may also be a complicating factor in your example; I won't try to explain their effect here. I'll only say that when I have run into problems with cardinality constraints, I have often found that a very careful reading of the relevant parts of the language reference in Appendix B has generally sufficed to let me understand what was going on. (I admit that sometimes it has taken more than one reading.)

cql binary protocol and named bound variables in prepared queries

Imagine I have a simple CQL table
CREATE TABLE test (
k int PRIMARY KEY,
v1 text,
v2 int,
v3 float
)
There are many cases where one would want to make use of the schema-less essence of Cassandra and only set some of the values and do, for example, a
INSERT into test (k, v1) VALUES (1, 'something');
When writing an application to write to such a CQL table in a Cassandra cluster, the need to do this using prepared statements immediately arises, for performance reasons.
This is handled in different ways by different drivers. The Java driver, for example, has introduced (with the help of a modification to the CQL binary protocol) the option of using named bound variables. Very practical: CASSANDRA-6033
What I am wondering is what is the correct way, from a binary protocol point of view, to provide values only for a subset of bound variables in a prepared query?
Values are in fact provided to a prepared query by building a values list, as described in
4.1.4. QUERY
[...]
Values. In that case, a [short] <n> followed by <n> [bytes]
values are provided. Those values are used for bound variables in
the query.
Please note the definition of [bytes]
[bytes] A [int] n, followed by n bytes if n >= 0. If n < 0,
no byte should follow and the value represented is `null`.
From this description I get the following:
"Values" in QUERY offers no way to provide a value for a specific column. It is just an ordered list of values. I guess the [short] must correspond to the exact number of bound variables in a prepared query?
All values, no matter what types they are, are represented as [bytes]. If that is true, any interpretation of the [bytes] value is left to the server (conversion to int, short, text,...)?
Assuming I got this all right, I wonder if a 'null' [bytes] value can be used to just 'skip' a bound variable and not assign a value for it.
I tried this and patched the cpp driver (which is what I am interested in). Queries get executed, but when I perform a SELECT from cqlsh, I don't see the 'null' string representation for empty fields, so I wonder whether that is a hack that for some reason just happens not to crash, or the intended way to do this.
I am sorry, but I really don't think I can just download the Java driver and see how named bound variables are implemented! :(
---------- EDIT - SOLVED ----------
My assumptions were right, and support for skipping a field in a prepared query has now been added to the cpp driver (see here) by using a null [bytes] value.
What I am wondering is what is the correct way, from a binary protocol point of view, to provide values only for a subset of bound variables in a prepared query?
You need to prepare a query that only inserts/updates the subset of columns that you're interested in.
"Values" in QUERY offers no ways to provide a value for a specific column. It is just an ordered list of values. I guess the [short] must correspond to the exact number of bound variables in a prepared query?
That's correct. The ordering is determined by the column metadata that Cassandra returns when you prepare a query.
All values, no matter what types they are, are represented as [bytes]. If that is true, any interpretation of the [bytes] value is left to the server (conversion to int, short, text,...)?
That's also correct. The driver will use the returned column metadata to determine how to convert native values (strings, UUIDS, ints, etc) to a binary (bytes) format. Cassandra does the inverse of this operation server-side.
Assuming I got this all right, I wonder if a 'null' [bytes] value can be used to just 'skip' a bound variable and not assign a value for it.
A null column insertion is interpreted as a deletion.
Implementation of what I was trying to achieve has been done (see here) based on the principle I described.
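The [short]/[bytes] framing discussed above can be sketched as follows (my own illustration of the layout quoted from the spec, not code from any driver):

```python
import struct

def encode_values(values):
    """Encode a list of pre-serialized values as a [short] count followed
    by [bytes] items: an [int] length plus the raw bytes, or length -1
    (negative) to represent null."""
    out = struct.pack(">h", len(values))      # [short] n
    for v in values:
        if v is None:
            out += struct.pack(">i", -1)      # negative length = null [bytes]
        else:
            out += struct.pack(">i", len(v)) + v
    return out

# Three bound variables; the second is 'skipped' with a null [bytes] value.
payload = encode_values([b"\x00\x00\x00\x01", None, b"something"])
print(payload.hex())
```

Note that, as the answer says, the count and ordering must match the column metadata Cassandra returned at prepare time; the server interprets each [bytes] item according to that metadata.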

definition of variable

I ended up in a discussion with some friends about the definition of variable with respect to programming.
My understanding is that a variable in programming can be constant or changing.
Their opinion is that the real definition of the word variable is that it can change; thus an identifier referring to some value which can change is a variable, whereas a set of characters referencing a value which is defined as constant is literally called a constant. I.e.,
Int constant blah
Int argh
Thus by their definition they would refer to blah as a constant and argh as a variable.
My definition would be that the variable blah is constant and argh is also a variable (which is not constant).
Have I been referring to these identifiers incorrectly?
Your friends are correct. Constants and variables are essentially opposites by their definition.
A variable can represent many different values, and the value is unknown when referred to by name.
A constant on the other hand only represents one value at all times, and if you know its value you can count on it never changing.
Of course in programming languages they are very similar things. They usually follow the same naming rules and can be stored the same way, but, just like variables aren't constants, constants aren't variables.
From my experience, it depends who you're talking to. That being said, my definition is
* A value is... a value (1, "a", etc)
* A variable is a name used to reference a value. It's possible to use multiple names to reference the same value, and for the value referenced by a variable to change over time, but neither is mandatory.
int a = 1;
^ variable
^ value
The Wikipedia link mentioned by Cody Gray reinforces this view, or seems to in my opinion.
If it helps, consider that purely functional languages have variables but, by definition of being a functional language, the values that those variables point at cannot change over time.
It's also worth noting that your definition depends on the context of your discussion. If you're talking about "variables vs constants", it's reasonable to say they're polar opposites. If you're talking about "variables vs values vs keywords", you're talking about a different usage of the word variable (kind of).
As an example, consider fruit vs vegetable. In science terminology, an eggplant is a fruit. In culinary terminology, it's a vegetable. The culinary term vegetable can refer to things that, in science terms, are fruits, roots, nuts, and a variety of other things. You need to know the context of your discussion to be able to say whether "x is a fruit" is accurate.

Misuse of a variable's value?

I came across an instance where the solution to a particular problem was to use a variable whose value, when zero or above, meant the system would use that value in a calculation, but when less than zero indicated that the value should not be used at all.
My initial thought was that I didn't like the multipurpose use of the variable's value: a.) as a value to be used in a formula; b.) as a form of control logic.
What is this kind of misuse of a variable called? Meta-'something' or is there a classic antipattern that this fits?
Sort of feels like when a database field is set to null to represent not using a value and if it's not null then use the value in that field.
Update:
An example would be that if a variable's value is > 0, I would use the value; if it's <= 0, I would not use the value and would instead perform some other logic.
Values such as these are often called "distinguished values". By far the most common distinguished value is null for reference types. A close second is the use of distinguished values to indicate unusual conditions (e.g. error return codes or search failures).
The problem with distinguished values is that all client code must be aware of the existence of such values and their associated semantics. In practical terms, this usually means that some kind of conditional logic must be wrapped around each call site that obtains such a value. It is far too easy to forget to add that logic, obtaining incorrect results. It also promotes copy-and-paste code as the boilerplate code required to deal with the distinguished values is often very similar throughout the application but difficult to encapsulate.
Common alternatives to the use of distinguished values are exceptions, or distinctly typed values that cannot be accidentally confused with one another (e.g. Maybe or Option types).
Having said all that, distinguished values may still play a valuable role in environments with extremely tight memory availability or other stringent performance constraints.
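To make the "distinctly typed values" alternative concrete, here is a sketch in Python (function and names are made up for illustration):

```python
from typing import Optional

# Sentinel style: -1.0 means "no discount", forcing every caller to
# remember the special case or silently compute nonsense.
def discount_sentinel(code: str) -> float:
    return 0.1 if code == "SAVE10" else -1.0

# Distinctly typed style: None cannot be accidentally used in arithmetic,
# so a forgotten check fails loudly instead of producing a wrong result.
def discount_optional(code: str) -> Optional[float]:
    return 0.1 if code == "SAVE10" else None

def final_price(price: float, code: str) -> float:
    d = discount_optional(code)
    if d is None:
        return price
    return price * (1 - d)

print(final_price(100.0, "SAVE10"))  # 90.0
print(final_price(100.0, "OTHER"))   # 100.0
```

With the sentinel version, a caller writing `price * (1 - discount_sentinel(code))` would quietly charge double for an unknown code; the Optional version makes that mistake a type error or an immediate exception.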
I don't think what you're describing is a pure magic number, but it's kind of close. It's similar to the situation in pre-.NET 2.0 code where you'd use Int32.MinValue to indicate a null value. .NET 2.0 introduced Nullable and kind of alleviated this issue.
So you're describing the use of a variable whose value really means something other than its value -- -1 means essentially the same as the use of Int32.MinValue as I described above.
I'd call it a magic number.
Hope this helps.
Using different ranges of the possible values of a variable to invoke different functionality was very common when RAM and disk space for data and program code were scarce. Nowadays, you would use a function or an additional, accompanying value (boolean, or enumeration) to determine the action to take.
Current OSes suggest 1GiB of RAM to operate correctly, when 256KiB was considered a lot only a few years ago. Cheap disk space has gone from hundreds of MiB to multiples of TiB in a matter of months. Not too long ago I wrote programs for 640KiB of RAM and 10MiB of disk, and you would probably hate them.
I think it would be good to cope with code like that if it's just a few years old (refactor it!), and denounce it as bad practice if it's recent.