at least one of these elements, but some at most one - xsd

I would like my XSD to be able to enforce that at least one of a set of elements must be present. Some of those elements can appear unlimited times, some are limited.
None of the elements are compulsory by themselves but at least one must be present.
Finally the elements can appear in any order. i.e. the order should neither be enforced nor have any impact on the processing of the document
My attempt at this:
<xs:element name="Root">
<xs:complexType>
<xs:choice>
<xs:element minOccurs="0" maxOccurs="unbounded" name="A" type="a:a"/>
<xs:element minOccurs="0" maxOccurs="unbounded" name="B" type="a:a"/>
<xs:element minOccurs="0" maxOccurs="1" name="C" type="a:a"/>
<xs:element minOccurs="0" maxOccurs="1" name="D" type="a:a"/>
</xs:choice>
</xs:complexType>
</xs:element>
This doesn't quite work as it doesn't enforce there being at least one of these elements.
So in the example above there must be at least one of A, B, C or D.
There can be multiple A or B elements but only 1 each of C or D
So the following should all be valid:
<root><A/></root>
<root><A/><A/></root>
<root><A/><B/><A/></root>
<root><C/></root>
<root><D/><C/></root>
But the following should not be valid:
<root/> - at least one element must be present
<root></root> - at least one element must be present
<root><C/><C/></root> - duplicate <C/> elements
There is the added complication that this is an extension on an existing complexType but I don't think that should affect anything?
Is this possible?
I tried adding minOccurs="1" to the xs:choice element. This had no effect, presumably because each instance of the choice can be empty!
I also tried having both an xs:choice (with no minOccurs in the children) and an xs:sequence (without C & D nodes) But I couldn't find any parent element which would allow both to co-exist.

In addition to allowing the first two of your should-be-invalid examples, the content model you show also fails by not allowing the third or fifth of your should-be-ok examples.
By far the simplest way to capture the constraints you describe in XSD, Relax NG, or DTDs is to add one more constraint and fix the order of the children. The content model is then (in DTD notation) ((A+, B*, C?, D?) | (B+, C?, D?) | (C, D?) | (D)). If the sequence of A, B, C, D children conveys no information, this is the approach to take.
If the sequence of children does convey information, however, and needs to be free, you have a more complicated task. Not particularly difficult, but complicated and sometimes (especially for larger examples) rather tedious.
The set of legal sequences of A, B, C, and D children you describe is a regular language; it may help to think about the finite state automaton you would use to recognize it. Essentially, you need to keep track of three bits of information: (a) have we seen any child element at all? (b) have we seen a C element? (c) have we seen a D element? If these were completely independent, we'd need eight states in the FSA; since they aren't, we need only five:
0 start state (no, no, no)
1 have seen one or more A or B elements but not C or D (yes, no, no)
2 have seen a C element, no D (X, yes, no)
3 have seen a D element, no C (X, no, yes)
4 have seen both C and D (X, yes, yes)
The transition table is
0 A, B -> 1; C -> 2; D -> 3
1 A, B -> 1; C -> 2; D -> 3
2 A, B -> 2; C -> error; D -> 4
3 A, B -> 3; C -> 4; D -> error
4 A, B -> 4; C, D -> error
Accept states are 1-4.
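The automaton above can be tried out directly. The following is a minimal sketch in Python, assuming the children are given as a list of element names; the names TRANSITIONS, ACCEPT_STATES and accepts are made up for this illustration, and a missing table entry stands for "error".

```python
# Transition table from the answer: state -> {child name -> next state}.
# A missing entry means "error" (the sequence is rejected).
TRANSITIONS = {
    0: {"A": 1, "B": 1, "C": 2, "D": 3},
    1: {"A": 1, "B": 1, "C": 2, "D": 3},
    2: {"A": 2, "B": 2, "D": 4},
    3: {"A": 3, "B": 3, "C": 4},
    4: {"A": 4, "B": 4},
}
ACCEPT_STATES = {1, 2, 3, 4}


def accepts(children):
    """Return True if the list of child element names is a legal sequence."""
    state = 0
    for name in children:
        next_state = TRANSITIONS[state].get(name)
        if next_state is None:  # no transition: duplicate C/D or unknown name
            return False
        state = next_state
    return state in ACCEPT_STATES


# The examples from the question:
print(accepts(["A", "B", "A"]))  # True
print(accepts(["D", "C"]))       # True
print(accepts([]))               # False
print(accepts(["C", "C"]))       # False
```

The empty sequence is rejected only because state 0 is not an accept state; everything else follows from the table.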
Note that once we have seen at least one element, neither A nor B ever has any effect on the state of the automaton. This suggests a good way to start writing the content model: ignore A and B for a moment and write a content model with only C and D. Since one of these must occur, but neither can repeat, we can describe the legal sequences of C and D elements this way (again, in DTD notation for compactness):
((C, D?) | (D, C?))
Allowing A and B to occur arbitrarily many times (but not at the beginning of the sequence) gives us
( (C, (A|B)*, (D, (A|B)*)?)
| (D, (A|B)*, (C, (A|B)*)?) )
If we begin with either A or B, the initial sequence of As and Bs can be followed by any sequence matching the sequence above, or by nothing. For that case, the content model would be
((A|B)+, ( (C, (A|B)*, (D, (A|B)*)?)
| (D, (A|B)*, (C, (A|B)*)?) )? )
Putting these together, the content model as a whole becomes
( ( (A|B)+, ( (C, (A|B)*, (D, (A|B)*)?)
| (D, (A|B)*, (C, (A|B)*)?) )? )
| (C, (A|B)*, (D, (A|B)*)?)
| (D, (A|B)*, (C, (A|B)*)?) )
It's trivial to translate this into XSD notation.
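As a sanity check on the combined content model, it can be transcribed into a regular expression over one-letter child names and compared against the plain-English requirements for every short sequence. This is only a sketch: the names MODEL, matches_model and meets_requirements are invented here, and the comparison is exhaustive only up to the chosen length.

```python
import re
from itertools import product

# The content model, with each child written as a single letter.
AB = "[AB]"
AFTER_C = f"C{AB}*(?:D{AB}*)?"  # (C, (A|B)*, (D, (A|B)*)?)
AFTER_D = f"D{AB}*(?:C{AB}*)?"  # (D, (A|B)*, (C, (A|B)*)?)
MODEL = re.compile(f"(?:{AB}+(?:{AFTER_C}|{AFTER_D})?|{AFTER_C}|{AFTER_D})")


def matches_model(children):
    """True if the string of child names matches the content model."""
    return MODEL.fullmatch(children) is not None


def meets_requirements(children):
    """The stated rules: at least one child, at most one C, at most one D."""
    return (
        len(children) >= 1
        and children.count("C") <= 1
        and children.count("D") <= 1
    )


# Compare both definitions on every sequence of up to 6 children.
MAX_LENGTH = 6
mismatches = [
    "".join(seq)
    for n in range(MAX_LENGTH + 1)
    for seq in product("ABCD", repeat=n)
    if matches_model("".join(seq)) != meets_requirements("".join(seq))
]
print(mismatches)  # []
```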
In some schema languages, there are more convenient (by which people mostly mean: more compact) ways to express this or related content models.
In Schematron or XSD 1.1, you can write assertions (using XPath) to say
Every child is named A, B, C, or D.
There is at least one child.
There is at most one child named C.
There is at most one child named D.
In XSD 1.1, you can use an all-group whose children are A*, B*, C?, and D?, together with an assertion that says there is at least one child.
In Relax NG, you can write a content model which allows the empty sequence but is otherwise as described using the interleave operator, thus:
(A* & B* & C? & D?)
You can enforce non-emptiness with a slightly more complex model:
( ((A|B), (A* & B* & C? & D?))
| (C, (A* & B* & D?))
| (D, (A* & B* & C?)) )

Related

Python code yielding different result for same numerical value, depending on inclusion of precision point

I defined a function which returns a third order polynomial function for either a value, a list or a np.array:
def two_d_third_order(x, a, b, c, d):
return a + np.multiply(b, x) + np.multiply(c, np.multiply(x, x)) + np.multiply(d, np.multiply(x, np.multiply(x, x)))
The issue I noticed is, however, when I use "two_d_third_order" on the following two inputs:
1500
1500.0
With (a, b, c, d) = (1.20740028e+00, -2.93682465e-03, 2.29938078e-06, -5.09134552e-10), I get two different results:
2.4441
0.2574
, respectively. I don't know how this is possible, and any help would be appreciated.
I tried several inputs, and somehow the inclusion of a floating point on certain values (despite representing the same numerical value) changes the end result.
NumPy picks the data type of a result from the types of its inputs. When you pass only integers (like 1500), the subsequent operations are carried out in a fixed-width integer type, which can overflow. When you pass a float (like 1500.0), they are carried out in floating point, which has a much larger range.
This is not a "bug" as such, but a consequence of data types being chosen implicitly rather than declared. Languages like C and C++ require explicit type declarations and explicit casts, so you always know which format an operation is performed in. That can be a boon or a bane depending on usage.
1500 * 2_250_000 is 3_375_000_000, which overflows the range of int32:
print(type(np.multiply(1500, 2250000)))
print(np.multiply(1500, 2250000))
giving:
<class 'numpy.int32'>
-919967296
Whereas floats have a much larger range:
print(type(np.multiply(1500.0, 2250000.0)))
print(np.multiply(1500.0, 2250000.0))
giving:
<class 'numpy.float64'>
3375000000.0
Try casting your input to a larger int.
(a, b, c, d) = (1.20740028e+00, -2.93682465e-03, 2.29938078e-06, -5.09134552e-10)
x1 = np.int64(1500)
x2 = 1500.0
print(two_d_third_order(x1, a, b, c, d) == two_d_third_order(x2, a, b, c, d))
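To see that the overflow alone accounts for both numbers in the question, the wraparound can be reproduced without NumPy. This is a sketch under the assumption that the integer products were computed as int32, as in the output shown above; wrap_int32 is a hypothetical helper that emulates two's-complement wraparound, not a NumPy function.

```python
def wrap_int32(n):
    """Emulate two's-complement wraparound of a 32-bit signed integer."""
    return (n + 2**31) % 2**32 - 2**31


a, b, c, d = 1.20740028e+00, -2.93682465e-03, 2.29938078e-06, -5.09134552e-10
x = 1500

cube_exact = x * x * x                 # Python ints never overflow
cube_wrapped = wrap_int32(cube_exact)  # what a 32-bit integer would hold

# Same polynomial, once with the true cube and once with the wrapped one.
exact = a + b * x + c * x * x + d * cube_exact
wrapped = a + b * x + c * x * x + d * cube_wrapped

print(cube_exact, cube_wrapped)  # 3375000000 -919967296
print(exact, wrapped)
```

The first value printed on the last line is close to 0.2574 and the second is close to 2.444, which are the two results reported in the question. Only x cubed exceeds the int32 range here; x squared (2_250_000) still fits.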

TLA+: specify that the range of each element of a sequence of functions is {0}

I am trying to specify a collection of memory cells in TLA+, each holding 256 32-bit integers. I would like to specify that at initialization time all the memory is zeroed out. I intuit that the correct approach is something like nested forall statements, but I don't know how to express that in TLA+.
---------------------------- MODULE combinators ----------------------------
EXTENDS Integers, FiniteSets, Sequences
CONSTANTS Keys, Values
VARIABLES Cells
TypeOK ==
/\ Channels = 0 .. 255
/\ Values = -2147483648 .. 2147483647
/\ Cells \in Seq([Keys -> Values])
Init == ???
A few things.
If Values are constants, specify their domain in an ASSUME, not in an invariant. CONSTANT means some arbitrary input; if you meant actual constants, then just define Values == -2147483648 .. 2147483647.
Keys could even be infinite; you must always specify an ASSUME for each constant (even IsFiniteSet).
You didn't declare Channels, but, like Values it seems like it should be a simple definition, not an invariant.
You didn't say how many Cells you're starting out with. The way TypeOK is defined, the number of Cells can change at each step, and Cells can even be empty.
But suppose you want N cells for some N, so:
Cells = [c ∈ 1..N ↦ [k ∈ Keys ↦ 0]]
But you wrote "domain" and here 0 is in the range, so I'm not sure I understand your question. You also mention channels so perhaps you meant:
Cells = [c ∈ 1..N ↦ [k ∈ Channels ↦ 0]]
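For readers more used to programming languages, the nested function constructor can be read as a nested dictionary comprehension. The sketch below is only an analogy in Python, not TLA+; N = 3 is an arbitrary assumption, and KEYS stands in for Keys (or Channels) as 0 .. 255.

```python
# Hypothetical stand-ins for the spec's constants.
N = 3              # number of memory cells (assumed for the example)
KEYS = range(256)  # 0 .. 255

# Analogue of [c \in 1..N |-> [k \in Keys |-> 0]]
cells = {c: {k: 0 for k in KEYS} for c in range(1, N + 1)}

# The "nested forall" reading: every key of every cell maps to 0.
all_zero = all(value == 0 for cell in cells.values() for value in cell.values())

# Equivalently: the range of every cell is exactly {0}.
ranges = {frozenset(cell.values()) for cell in cells.values()}

print(all_zero)                       # True
print(ranges == {frozenset({0})})     # True
```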

Prolog ways to compare variables

I was trying to implement some graph algorithms in Prolog. I came up with an idea to use unification to build a tree from the graph structure:
The graph would be defined as follows:
A list of Vertex-Variable pairs where Vertex is a constant representing the vertex and Variable is a corresponding variable, that would be used as a "reference" to the vertex. e.g.:
[a-A, b-B, c-C, d-D]
A list of VertexVar-NeighboursList pairs, where the VertexVar and the individual neighbours in the NeighboursList are the "reference variables". e.g.:
[A-[B, C, D], B-[A, C], C-[A, B], D-[A]] meaning b, c, d are neighbours of a etc.
Then before some graph algorithm (like searching for components, or simple DFS/BFS etc.) that could use some kind of tree built from the original graph, one could use some predicate like unify_neighbours that unifies the VertexVar-NeighbourList pairs as VertexVar = NeighboursList. After that, the vertex variables may be interpreted as lists of its neighbours, where each neighbour is again a list of its neighbours.
So this would result in a good performance when traversing the graph, as there is no need in linear search for some vertex and its neighbours for every vertex in the graph.
But my problem is: How to compare those vertex variables? (To check if they're the same.) I tried to use A == B, but there are some conflicts. For the example above, (with the unify_neighbours predicate) Prolog interprets the graph internally as:
[a-[S_1, S_2, S_3], b-S_1, c-S_2, d-S_3]
where:
S_1 = [[S_1, S_2, S_3], S_2]
S_2 = [[S_1, S_2, S_3], S_1]
S_3 = [[S_1, S_2, S_3]]
The problem is with S_1 and S_2 (aka b and c) as X = [something, Y], Y = [something, X], X == Y is true. The same problem would be with vertices, that share the same neighbours. e.g. U-[A, B] and V-[A, B].
So my question is: Is there any other way to compare variables, that could help me with this? Something that compares "the variables themselves", not the content, like comparing addresses in procedural programming languages? Or would that be too procedural and break the declarative idea of Prolog?
Example
graph_component(Vertices, Neighbours, C) :-
% Vertices and Neighbours as explained above.
% C is some component found in the graph.
vertices_refs(Vertices, Refs),
% Refs are only the variables from the pairs.
unify_neighbours(Neighbours), % As explained above.
rec_(Vertices, Refs, [], C).
rec_(Vertices, Refs, Found, RFound) :-
% Vertices as before.
% Refs is a stack of the vertex variables to search.
% Found are the vertices found so far.
% RFound is the resulting component found.
[Ref|RRest] = Refs,
vertices_pair(Vertices, Vertex-Ref),
% Vertex is the corresponding Vertex for the Ref variable
not(member(Vertex, Found)),
% Go deep:
rec_(Vertices, Ref, [Vertex|Found], DFound),
list_revpush_result([Vertex|Found], DFound, Found1),
% Go wide:
rec_(Vertices, RRest, Found1, RFound).
rec_(Vertices, Refs, Found, []) :-
% End of recursion.
[Ref|_] = Refs,
vertices_pair(Vertices, Vertex-Ref),
member(Vertex, Found).
This example doesn't really work, but it's the idea. (Also, checking whether the vertices were found is done linearly, so the performance is still not good, but it's just for demonstration.) Now the predicate, that finds the corresponding vertex for the variable is implemented as:
vertices_pair([Vertex-Ref|_], Vertex-Ref).
vertices_pair([_-OtherRef|Rest], Vertex-Ref) :-
Ref \== OtherRef,
vertices_pair(Rest, Vertex-Ref).
where the \== operator is not really what I want and it creates those conflicts.
It is an intrinsic feature of Prolog that, once you have bound a variable to a term, it becomes indistinguishable from the term itself. In other words, if you bind two variables to the same term, you have two identical things, and there is no way to tell them apart.
Applied to your example: once you have unified every vertex-variable with the corresponding neighbours-list, all the variables are gone: you are left simply with a nested (and most likely circular) data structure, consisting of a list of lists of lists...
But as you suggest, the nested structure is an attractive idea because it gives you direct access to adjacent nodes. And although Prolog systems vary somewhat in how well they support circular data structures, this need not stop you from exploiting this idea.
The only problem with your design is that a node is identified purely by the (potentially deeply nested and circular) data structure that describes the sub-graph that is reachable from it. This has the consequence that
two nodes that have the same descendants are indistinguishable
it can be very expensive to check whether two "similar looking" sub-graphs are identical or not
A simple way around that is to include a unique node identifier (such as a name or number) in your data structure. To use your example (slightly modified to make it more interesting):
make_graph(Graph) :-
Graph = [A,B,C,D],
A = node(a, [C,D]),
B = node(b, [A,C]),
C = node(c, [A,B]),
D = node(d, [A]).
You can then use that identifier to check for matching nodes, e.g. in a depth-first traversal:
dfs_visit_nodes([], Seen, Seen).
dfs_visit_nodes([node(Id,Children)|Nodes], Seen1, Seen) :-
( member(Id, Seen1) ->
Seen2 = Seen1
;
writeln(visiting(Id)),
dfs_visit_nodes(Children, [Id|Seen1], Seen2)
),
dfs_visit_nodes(Nodes, Seen2, Seen).
Sample run:
?- make_graph(G), dfs_visit_nodes(G, [], Seen).
visiting(a)
visiting(c)
visiting(b)
visiting(d)
G = [...]
Seen = [d, b, c, a]
Yes (0.00s cpu)
Thanks, @jschimpf, for the answer. It clarified a lot of things for me. I just got back to some graph problems with Prolog and thought I'd give this recursive data structure another try and came up with the following predicates to construct this data structure from a list of edges:
The "manual" creation of the data structure, as proposed by @jschimpf:
my_graph(Nodes) :-
    % One variable per vertex, in the same order as the nodes below.
    Vars = [A, B, C, D, E, F],
    Nodes = [
        node(a, [edgeTo(1, B), edgeTo(5, D)]),
        node(b, [edgeTo(1, A), edgeTo(4, E), edgeTo(2, C)]),
        node(c, [edgeTo(2, B), edgeTo(6, F)]),
        node(d, [edgeTo(5, A), edgeTo(3, E)]),
        node(e, [edgeTo(3, D), edgeTo(4, B), edgeTo(1, F)]),
        node(f, [edgeTo(1, E), edgeTo(6, C)])
    ],
    Vars = Nodes.
Where edgeTo(Weight, VertexVar) represents an edge to some vertex with a weight associated with it. The weight is just to show that this can be customized for any additional information. node(Vertex, [edgeTo(Weight, VertexVar), ...]) represents a vertex with its neighbours.
A more "user-friendly" input format:
[edge(Weight, FromVertex, ToVertex), ...]
With optional list of vertices:
[Vertex, ...]
For the example above:
[edge(1, a, b), edge(5, a, d), edge(2, b, c), edge(4, b, e), edge(6, c, f), edge(3, d, e), edge(1, e, f)]
This list can be converted to the recursive data structure with the following predicates:
% make_directed_graph(+Edges, -Nodes)
make_directed_graph(Edges, Nodes) :-
vertices(Edges, Vertices),
vars(Vertices, Vars),
pairs(Vertices, Vars, Pairs),
nodes(Pairs, Edges, Nodes),
Vars = Nodes.
% make_graph(+Edges, -Nodes)
make_graph(Edges, Nodes) :-
    vertices(Edges, Vertices),
    vars(Vertices, Vars),
    pairs(Vertices, Vars, Pairs),
    directed(Edges, DirectedEdges),
    nodes(Pairs, DirectedEdges, Nodes),
    Vars = Nodes.
% make_graph(+Vertices, +Edges, -Nodes)
make_graph(Vertices, Edges, Nodes) :-
    vars(Vertices, Vars),
    pairs(Vertices, Vars, Pairs),
    directed(Edges, DirectedEdges),
    nodes(Pairs, DirectedEdges, Nodes),
    Vars = Nodes.
% make_directed_graph(+Vertices, +Edges, -Nodes)
make_directed_graph(Vertices, Edges, Nodes) :-
vars(Vertices, Vars),
pairs(Vertices, Vars, Pairs),
nodes(Pairs, Edges, Nodes),
Vars = Nodes.
The binary versions of these predicates assume, that every vertex can be obtained from the list of edges only - There are no "edge-less" vertices in the graph. The ternary versions take an additional list of vertices for exactly these cases.
make_directed_graph assumes the input edges to be directed, make_graph assumes them to be undirected, so it creates additional directed edges in the opposite direction:
% directed(+UndirectedEdges, -DirectedEdges)
directed([], []).
directed([edge(W, A, B)|UndirectedRest], [edge(W, A, B), edge(W, B, A)|DirectedRest]) :-
directed(UndirectedRest, DirectedRest).
To get all the vertices from the list of edges:
% vertices(+Edges, -Vertices)
vertices([], []).
vertices([edge(_, A, B)|EdgesRest], [A, B|VerticesRest]) :-
vertices(EdgesRest, VerticesRest),
\+ member(A, VerticesRest),
\+ member(B, VerticesRest).
vertices([edge(_, A, B)|EdgesRest], [A|VerticesRest]) :-
vertices(EdgesRest, VerticesRest),
\+ member(A, VerticesRest),
member(B, VerticesRest).
vertices([edge(_, A, B)|EdgesRest], [B|VerticesRest]) :-
vertices(EdgesRest, VerticesRest),
member(A, VerticesRest),
\+ member(B, VerticesRest).
vertices([edge(_, A, B)|EdgesRest], VerticesRest) :-
vertices(EdgesRest, VerticesRest),
member(A, VerticesRest),
member(B, VerticesRest).
To construct uninitialized variables for every vertex:
% vars(+List, -Vars)
vars([], []).
vars([_|ListRest], [_|VarsRest]) :-
vars(ListRest, VarsRest).
To pair up vertices and vertex variables:
% pairs(+ListA, +ListB, -Pairs)
pairs([], [], []).
pairs([AFirst|ARest], [BFirst|BRest], [AFirst-BFirst|PairsRest]) :-
pairs(ARest, BRest, PairsRest).
To construct the recursive nodes:
% nodes(+Pairs, +Edges, -Nodes)
nodes(Pairs, [], Nodes) :-
init_nodes(Pairs, Nodes).
nodes(Pairs, [EdgesFirst|EdgesRest], Nodes) :-
nodes(Pairs, EdgesRest, Nodes0),
insert_edge(Pairs, EdgesFirst, Nodes0, Nodes).
First, a list of empty nodes for every vertex is initialized:
% init_nodes(+Pairs, -EmptyNodes)
init_nodes([], []).
init_nodes([Vertex-_|PairsRest], [node(Vertex, [])|NodesRest]) :-
init_nodes(PairsRest, NodesRest).
Then the edges are inserted one by one:
% insert_edge(+Pairs, +Edge, +Nodes, -ResultingNodes)
insert_edge(Pairs, edge(W, A, B), [], [node(A, [edgeTo(W, BVar)])]) :-
vertex_var(Pairs, B, BVar).
insert_edge(Pairs, edge(W, A, B), [node(A, EdgesTo)|NodesRest], [node(A, [edgeTo(W, BVar)|EdgesTo])|NodesRest]) :-
vertex_var(Pairs, B, BVar).
insert_edge(Pairs, edge(W, A, B), [node(X, EdgesTo)|NodesRest], [node(X, EdgesTo)|ResultingNodes]) :-
A \= X,
insert_edge(Pairs, edge(W, A, B), NodesRest, ResultingNodes).
To get a vertex variable for a given vertex: (This actually works in both directions.)
% vertex_var(+Pairs, +Vertex, -Var)
vertex_var(Pairs, Vertex, Var) :-
member(Vertex-Var, Pairs).
This, of course, brings additional time overhead, but you can do this once and then just copy this data structure every time you need to perform some graph algorithm on it and access neighbours in constant time.
You can also add additional information to the node predicate. For example:
node(Vertex, Neighbours, OrderingVar)
Where the uninitialized variable OrderingVar can be "assigned" (initialized) in constant time with information about the vertex' position in a partial ordering of the graph, for example. So this may be used as output. (As sometimes denoted by +- in Prolog comments - an uninitialized variable as a part of an input term, that is yet to be initialized by the used predicate and provides output.)

Proof there is no instance for a given UML-Diagram

Given the diagram in the top-right corner, I'm supposed to decide whether there is any valid instance of it. Now the given image is a counterproof by example ('wegen' means 'because of'). The counterproof uses the cardinality ('Mächtigkeit') of the objects.
I don't understand, why for example 2*|A| equals |C|, as in UML, A would be in relation with 2 objects of C (rel1). So for every A there have to be 2 C to make a valid instance. 2*|A| = |C| should therefore be |A| = 2*|C|.
Why is it the other way around?
2*|A| = |C| since there is double the amount of C objects compared to A because each A has two C associated.
|A| = |B| because they have a 1-1 relation
3*|C| = 2*|B| because each C has 3 B and each B has 2 C
(4) and (5) are just substitutions where the last gives a contradiction
q.e.d
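The contradiction can also be checked mechanically. The sketch below assumes the three equations listed above are exactly the ones the diagram yields (the diagram itself is not reproduced here) and simply searches for non-empty instance sizes that satisfy all of them; satisfies and LIMIT are names chosen for this illustration.

```python
from itertools import product


def satisfies(a, b, c):
    """Check the three cardinality equations for |A| = a, |B| = b, |C| = c."""
    return 2 * a == c and a == b and 3 * c == 2 * b


# Search all combinations with at least one object of each class.
LIMIT = 50
solutions = [
    (a, b, c)
    for a, b, c in product(range(1, LIMIT + 1), repeat=3)
    if satisfies(a, b, c)
]
print(solutions)  # []

# Substituting by hand gives the same result: 3*|C| = 3*(2*|A|) = 6*|A|
# and 2*|B| = 2*|A|, so 6*|A| = 2*|A|, which forces |A| = 0.
print(satisfies(0, 0, 0))  # True
```

The search bound is arbitrary; the hand substitution in the comment is what rules out every non-empty instance, not just the small ones.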
P.S. As @ShiDoiSi pointed out there is no {unique} constraint in the multiplicities. This will make it possible to have multiple associations to the same instance. Ergo, you have 1-1 relations. So with that being the case you actually CAN have a valid instantiation of the model.
Now go and tell that to your teacher xD

Interpretation of Multiplicity of a relation

I'm struggling to understand the multiplicity of a relation.
In general how should one interpret this
Does it mean that every entity of type P has between a and b entities of type C, or between x and y, or something else? All explanations I've found so far only address the cases a,x = 0,1 and b,y = *
It's vice versa. P has x..y entities of type C in access and C has a..b of P.
As a side note: the multiplicity labels should not be placed to hide parts of the association.
Every Association contains two independent statements:
Every instance of P is linked to x..y instances of C
Every instance of C is linked to a..b instances of P
Being linked could mean, that P or C have an attribute of type C or P. This is the most common incarnation of a link, but UML does not prescribe this.

Resources