How to interpret the Result of msat LTL commands of NuXMV - model-checking

I am using NuXMV to check LTL properties with the msat_check_ltlspec_bmc command on a fairly large model. The result says that no counterexample was found within the given bounds. Should I interpret this as the property being true, or could it instead mean that the analysis is incomplete?
I ask because, whether I change the proposition in the property to true or to false, the result is always "no counterexample". Most of the results are counterintuitive.
I started with properties over real variables, but since I could not understand the results, I switched to Boolean properties on the same model, using the same command.

Bounded Model Checking is a bug-oriented technique which checks the validity of a property on execution traces up to a given length k.
When an execution trace violates a property, great: a bug was found.
Otherwise, in the general case, the result only says that no violation exists within the bound. It says nothing about longer traces, so it provides no useful information about whether the property holds and should be treated as such.
In some cases, knowing additional information about the model can help. In particular, if one knows that every execution trace of length k must loop back to one of the k-1 states, then it is possible to draw stronger conclusions from the lack of counterexamples of length less than or equal to k.
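To make the point about bounds concrete, here is a minimal explicit-state sketch in Python. It is not how NuXMV or msat_check_ltlspec_bmc works internally (those encode the problem for an SMT solver), and it only handles a simple invariant rather than full LTL. The toy counter model and the names `bounded_check`, `succ` and `is_bad` are assumptions made up for illustration.

```python
from typing import Callable, List, Optional


def bounded_check(
    init: List[int],
    successors: Callable[[int], List[int]],
    bad: Callable[[int], bool],
    k: int,
) -> Optional[List[int]]:
    """Look for a trace with at most k transitions that reaches a bad state.

    Returns the offending trace, or None if no violation exists within the bound.
    None does NOT mean the property holds on longer traces.
    """
    frontier = [[s] for s in init]
    for depth in range(k + 1):
        for path in frontier:
            if bad(path[-1]):
                return path
        if depth == k:
            break
        # Extend every trace by one transition.
        frontier = [path + [t] for path in frontier for t in successors(path[-1])]
    return None


# Hypothetical toy model: a counter that starts at 0 and counts modulo 8.
def succ(s: int) -> List[int]:
    return [(s + 1) % 8]


# Property to check: "the counter is never 5".
def is_bad(s: int) -> bool:
    return s == 5


short_result = bounded_check([0], succ, is_bad, k=3)
long_result = bounded_check([0], succ, is_bad, k=5)
print(short_result)  # None: no counterexample within 3 steps
print(long_result)   # [0, 1, 2, 3, 4, 5]: the same property is violated at step 5
```

The property is false in both runs; the bound of 3 is simply too small to see it. Only when the bound is known to cover every reachable behaviour (here, 7 steps visit all 8 states) does the absence of a counterexample say something about the property itself.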

Related

Recursive methods on CUDD

This is a follow-up to a suggestion by #DCTLib in the post below.
Cudd_PrintMinterm, accessing the individual minterms in the sum of products
I've been pursuing part (b) of the suggestion and will share some pseudo-code in a separate post.
Meanwhile, in his part (b) suggestion, #DCTLib posted a link to https://github.com/VerifiableRobotics/slugs/blob/master/src/BFAbstractionLibrary/BFCudd.cpp. I've been trying to read this program. There is a recursive function in the classic Somenzi paper, Binary Decision Diagrams, which describes an algorithm to compute the number of satisfying assignments (below, Fig. 7). I've been trying to compare the two, slugs and Fig. 7, but I'm having a hard time seeing any similarities. Then again, C is mostly inscrutable to me. Do you know if slugs BFCudd is based on Somenzi's Fig. 7, #DCTLib?
Thanks,
Gui
It's not exactly the same algorithm.
There are two main differences:
First, the "SatHowMany" function does not take a cube of variables to consider for counting. Rather, that function considers all variables. The fact that "recurse_getNofSatisfyingAssignments" supports cubes manifests in the function potentially returning NaN (not a number) if a variable is found in the BDD that does not appear in the cube. The rest of the differences seem to stem from this support.
Second, SatHowMany returns the number of satisfying assignments to all n variables for a node. This leads, for instance, to the division by 2 in line -4. "recurse_getNofSatisfyingAssignments" only returns the number of assignments for the remaining variables to be considered.
Both algorithms cache information - in "SatHowMany", it's called a table, in "recurse_getNofSatisfyingAssignments" it's called a buffer. Note that in line 24 of "recurse_getNofSatisfyingAssignments", there is a constant string thrown. This means that either the function does not work, or the code is never reached. Most likely it's the latter.
Function "SatHowMany" seems to assume that it gets a BDD node - it cannot be a pointer to a complemented BDD node. Function "recurse_getNofSatisfyingAssignments" works correctly with complemented nodes, as a DdNode* may store a pointer to a complemented node.
Due to the support for cubes, "recurse_getNofSatisfyingAssignments" supports flexible variable ordering (hence the lookup of "cuddI", which gives the position of a variable in the current BDD variable ordering). For function "SatHowMany", the variable ordering does not make a difference.
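For readers who find the C hard to follow, the general idea of counting with a cache can be sketched in a few lines of Python. This is neither the slugs code nor Somenzi's Fig. 7: it uses a made-up tuple representation `(var, low, high)` with `True`/`False` terminals, has no complemented edges and no cube argument, and assumes variables are numbered in ordering position. Like the slugs function as described above, each call returns the count over the remaining variables only, so no division by 2 is needed.

```python
from typing import Dict, Tuple, Union

# Hypothetical BDD representation: a terminal is a bool, an inner node is
# (variable_index, low_child, high_child), with indices increasing toward the leaves.
Node = Union[bool, Tuple[int, "Node", "Node"]]


def sat_count(root: Node, n_vars: int) -> int:
    """Count satisfying assignments of the function rooted at `root` over n_vars variables."""
    cache: Dict[Node, int] = {}

    def level(node: Node) -> int:
        # Terminals sit below the last variable.
        return n_vars if isinstance(node, bool) else node[0]

    def count(node: Node) -> int:
        # Number of satisfying assignments to the variables from level(node) onward.
        if isinstance(node, bool):
            return 1 if node else 0
        if node in cache:
            return cache[node]
        var, low, high = node
        # Variables skipped between this node and a child are free: factor 2 each.
        result = (count(low) * 2 ** (level(low) - var - 1)
                  + count(high) * 2 ** (level(high) - var - 1))
        cache[node] = result
        return result

    # Variables above the root are free as well.
    return count(root) * 2 ** level(root)


# x0 AND x1 over three variables: x2 is free, so there are 2 satisfying assignments.
and_bdd: Node = (0, False, (1, False, True))
# x0 OR x2 over three variables: 8 assignments minus the 2 where x0 = x2 = 0.
or_bdd: Node = (0, (2, False, True), True)

print(sat_count(and_bdd, 3))  # 2
print(sat_count(or_bdd, 3))   # 6
```

The `cache` dictionary plays the role of the "table" or "buffer" mentioned above: a shared subgraph is counted once, no matter how many parents point to it.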

Why would more array accesses perform better?

I'm taking a course on Coursera that uses MiniZinc. In one of the assignments, I was spinning my wheels forever because my model was not performing well enough on a hidden test case. I finally solved it by changing the following types of accesses in my model
from
constraint sum(neg1,neg2 in party where neg1 < neg2)(joint[neg1,neg2]) >= m;
to
constraint sum(i,j in 1..u where i < j)(joint[party[i],party[j]]) >= m;
I don't know what I'm missing, but why would these two perform any differently from each other? It seems like they should perform similarly, with the former being maybe slightly faster, but the performance difference was dramatic. I'm guessing there is some sort of optimization that the former misses out on? Or am I really missing something, and do those lines actually result in different behavior? My intention is to sum the strength of every element in raid.
Misc. Details:
party is an array of enum vars
party's index set is 1..real_u
every element in party should be unique except for a dummy variable.
solver was Gecode
verification of my model was done on a coursera server so I don't know what optimization level their compiler used.
edit: Since MiniZinc (mz) is a declarative language, I'm realizing that "array accesses" in mz don't necessarily have a direct counterpart in an imperative language. However, to me, these two lines mean the same thing semantically. So I guess my question is more "Why are the above lines different semantically in mz?"
edit2: I had to change the example in question; I was toeing the line of violating Coursera's honor code.
The difference stems from the way in which the where-clause "a < b" is evaluated. When "a" and "b" are parameters, then the compiler can already exclude the irrelevant parts of the sum during compilation. If "a" or "b" is a variable, then this can usually not be decided during compile time and the solver will receive a more complex constraint.
In this case the solver would have gotten a sum over "array[int] of var opt int", meaning that some variables in an array might not actually be present. For most solvers this is rewritten to a sum where every variable is multiplied by a Boolean variable, which is true iff the variable is present. You can understand how this is less efficient than a normal sum without multiplications.
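The effect on the size of the constraint can be illustrated with a small Python sketch. This is only a mental model, not what the MiniZinc compiler actually emits: the functions `flatten_param_where` and `flatten_var_where` and the string form of the guards are invented for illustration, and `u` stands for the length of party as in the question.

```python
from itertools import product
from typing import List, Tuple


def flatten_param_where(u: int) -> List[Tuple[int, int]]:
    """where-clause over parameters i, j: decided at compile time.

    Pairs that fail i < j are dropped before the solver ever sees them.
    """
    return [(i, j) for i, j in product(range(1, u + 1), repeat=2) if i < j]


def flatten_var_where(u: int) -> List[Tuple[str, int, int]]:
    """where-clause over decision variables: cannot be decided at compile time.

    Every pair stays in the sum, each with a Boolean guard that only the
    solver can evaluate (modelled here as a descriptive string).
    """
    return [
        (f"party[{i}] < party[{j}]", i, j)
        for i, j in product(range(1, u + 1), repeat=2)
    ]


param_terms = flatten_param_where(4)
var_terms = flatten_var_where(4)
print(len(param_terms))  # 6 plain terms
print(len(var_terms))    # 16 guarded terms
```

With parameters, the sum has u*(u-1)/2 ordinary terms. With variables in the where-clause, all u*u terms remain and each carries an extra Boolean that the solver has to propagate, which is consistent with the dramatic slowdown described in the question.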

assume() does not work for initial statement

For https://i.imgur.com/NCUjYmr.png , why isn't the signal "reset" assumed to be '1' initially? Does anyone have any idea why the assume does not work?
I found the solution. I am using temporal induction, which starts at a state that is not the initial state of the system. Therefore, the signal "reset" is not assumed to be '1' initially.
Assumes work as assumptions only in a formal verification environment. In simulation-based verification, they work as assert statements only.
As per the LRM :
The immediate assume statement specifies that its expression is assumed to hold. For example, immediate assume statements can be used with formal verification tools to specify assumptions on design inputs that constrain the verification computation. When used in this way, they specify the expected behavior of the environment of the design as opposed to that of the design itself. In simulation, an immediate assume may behave as an immediate assert to verify that the environment behaves as assumed. A simulation tool shall provide the capability to check the immediate assume statement in this way.
Due to this, in your design, it won't actually assume the value, but it will check whether the proper value is given or not.

UML activity diagram for showing a two-pass algorithm

I want to show the use of the same algorithm as a black box in a two-pass iteration. In the first pass, I would pass a flag f as false and an array of one element, A[1..1]; the output of the first pass would be B[1..N]. In the second pass, the same algorithm would be used with f as true (to indicate the second pass) and an input of A[1..N] (fed from the output B[1..N] of the first pass), whereas the output of the second pass would be B[1..M].
Please help me draw the UML Activity diagram for this.
It's not a good idea to try "programming graphically". The algorithm you describe is better shown in meta code than in an activity diagram, as you already have seen. So what I'd do in your case is to have a single Action (most likely representing some CallOperation of some class). The corresponding behavior of the operation then contains the description in either meta code or plain text (as you already stated above).
If for whatever reason you really want to "program graphically", you would need to use single actions for the assignments of the flag like this:
The A and B arrays would be just mentioned in the description of the single actions.
To actually show passing the A and B arrays you would need to add ActionPins or Objects with ObjectFlows between the single Actions. Honestly, that would make the whole thing even more unreadable and hinder the reader more than help:

ID3 Implementation Clarification

I am trying to implement the ID3 algorithm, and am looking at the pseudo-code:
(Source)
I am confused by the bit where it says:
If examples_vi is empty, create a leaf node with label = most common value of Target_Attribute in Examples.
Unless I am missing out on something, shouldn't this be the most common class?
That is, if we cannot split the data on an attribute value because no sample takes that value for the particular attribute, then we take the most common class among all samples and use that?
Also, isn't this just as good as picking a random class?
The training set tells us nothing about the relation between the attribute value and the class labels...
1) Unless I am missing out on something, shouldn't this be the most common class?
You're correct, and the text also says the same. Look at the function description at the top :
Target_Attribute is the attribute whose value is to be predicted by the tree
so the value of Target_Attribute is the class/label.
2) That is, if we cannot split the data on an attribute value because no sample takes that value for the particular attribute, then we take the most common class among all samples and use that?
Yes, but not among all samples in your whole dataset; rather, among those samples that reached this point in the tree/recursion. (The ID3 function is recursive, so the current Examples is actually the Examples_vi of the caller.)
3) Also, isn't this just as good as picking a random class?
The training set tells us nothing about the relation between the attribute value and the class labels...
No, picking a random class (with equal chances for each class) is not the same. Inputs often have an unbalanced class distribution (many texts call this the prior distribution), so you may have 99% positive examples and only 1% negative. So whenever you really have no information whatsoever to decide on the outcome of some input, it makes sense to predict the most probable class, so that you have the highest probability of being correct. This maximizes your classifier's accuracy on unseen data only under the assumption that the class distribution in your training data is the same as in the unseen data.
The same reasoning holds for the base case when Attributes is empty (see line 4 of your pseudocode); whenever we have no information, we just report the most common class of the data at hand.
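Both base cases discussed above can be seen in a compact Python sketch of ID3. It is a minimal illustration rather than the pseudocode from the question: the dictionary-based examples, the `domains` argument listing every possible value of each attribute, and the toy "outlook"/"play" data are assumptions made for this example.

```python
import math
from collections import Counter
from typing import Dict, List


def majority(examples: List[Dict[str, str]], target: str) -> str:
    """Most common class among the examples that reached this node."""
    return Counter(ex[target] for ex in examples).most_common(1)[0][0]


def entropy(examples: List[Dict[str, str]], target: str) -> float:
    counts = Counter(ex[target] for ex in examples)
    total = len(examples)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def id3(examples, target, attributes, domains):
    labels = {ex[target] for ex in examples}
    if len(labels) == 1:          # all examples agree: pure leaf
        return labels.pop()
    if not attributes:            # nothing left to split on: majority class
        return majority(examples, target)

    def remainder(attr: str) -> float:
        # Expected entropy after splitting on attr; minimizing it
        # is the same as maximizing information gain.
        rem = 0.0
        for value in domains[attr]:
            subset = [ex for ex in examples if ex[attr] == value]
            if subset:
                rem += len(subset) / len(examples) * entropy(subset, target)
        return rem

    best = min(attributes, key=remainder)
    rest = [a for a in attributes if a != best]
    branches = {}
    for value in domains[best]:
        subset = [ex for ex in examples if ex[best] == value]
        if not subset:
            # Examples_vi is empty: label with the majority class of the
            # CURRENT examples, not of the whole dataset.
            branches[value] = majority(examples, target)
        else:
            branches[value] = id3(subset, target, rest, domains)
    return (best, branches)


# Toy data: no example has outlook == "overcast".
data = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "sunny", "play": "no"},
    {"outlook": "rain", "play": "yes"},
]
tree = id3(data, "play", ["outlook"], {"outlook": ["sunny", "rain", "overcast"]})
print(tree)
```

The "overcast" branch has no examples, so it becomes a leaf labelled "no", the majority class of the three examples at that node.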
If you have never implemented ID3 but still want to know more about the processing details, I suggest you read this paper:
Building Decision Trees in Python
and here is the source code from the paper:
decision tree source code
This paper has an example, or you can use the example from your book (replace the "data" file with one in the same format). You can then debug it (with some breakpoints) in Eclipse to check the attribute values while the algorithm is running.
Go over it and you will understand ID3 better.
