Is the Transactional Locking 2 (TL2) algorithm serializable? - multithreading

Consider two threads A and B
A.readset intersects with B.writeset
B.readset does NOT intersect with A.writeset
A.writeset does NOT intersect with B.writeset
They commit at the same time: A.lock --> A.validation --> B.lock --> B.validation --> (A and B install updates)
Is this not serializable because B may overwrite A's reads before A commits?

It is serializable because the values written to Transaction A's writeset depend on the cached values of A's readset, which were confirmed during validation. B's overwriting of A's readset does not affect the cached values that A's writes are based on. The values written to Transaction A's writeset are exactly the same values that would have been written if transaction A had run to completion before transaction B started, so it is serializable.
EXAMPLE
We have a transactional memory consisting of 3 variables X, Y, Z with initial values X1, Y1, Z1.
Transaction A reads X and writes Y with a value that depends on X (X + P)
Transaction B reads Z and writes X with a value that depends on Z (Z + Q)
SERIALIZED EXECUTION
Transaction A: locks Y, validates X = X1.
Transaction A: sets Y = X1 + P and commits.
Transaction B: locks X, validates Z = Z1.
Transaction B: sets X = Z1 + Q and commits.
Final result: (X, Y, Z) = (Z1 + Q, X1 + P, Z1)
INTERLEAVED EXECUTION
Transaction A: locks Y, validates X = X1.
Transaction B: locks X, validates Z = Z1.
Transaction B: sets X = Z1 + Q and commits (writes A's readset X before A commits)
Transaction A: sets Y = X1 + P and commits (uses cached value of X not the latest value)
Final result: (X, Y, Z) = (Z1 + Q, X1 + P, Z1) (same result as serial execution)
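To make the argument concrete, here is a small Python sketch. It is purely illustrative, not a real TL2 implementation: it replays the two schedules above with cached reads and checks that they end in the same state. The names X1, Y1, Z1, P, Q follow the example; the numeric values and helper names are arbitrary.

X1, Y1, Z1, P, Q = 10, 20, 30, 1, 2          # arbitrary sample values

def run(schedule):
    mem = {"X": X1, "Y": Y1, "Z": Z1}        # shared transactional memory
    cache = {}                               # per-transaction cached reads
    for step in schedule:
        if step == "A reads X":
            cache["A:X"] = mem["X"]          # A's read set, validated at commit time
        elif step == "B reads Z":
            cache["B:Z"] = mem["Z"]
        elif step == "A commits":
            mem["Y"] = cache["A:X"] + P      # write based on the *cached* value of X
        elif step == "B commits":
            mem["X"] = cache["B:Z"] + Q
    return mem

serial      = run(["A reads X", "A commits", "B reads Z", "B commits"])
interleaved = run(["A reads X", "B reads Z", "B commits", "A commits"])
print(serial, interleaved, serial == interleaved)    # identical final states

Both schedules end with (X, Y, Z) = (Z1 + Q, X1 + P, Z1), matching the hand calculation above.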

Related

Convert DFA to RE

I constructed a finite automaton for the language L of all strings made of the symbols 0, 1 and 2 (Σ = {0, 1, 2}) where the last symbol is not smaller than the first symbol. E.g., the strings 0, 2012, 0121 and 102 are in the language, but 10, 2021 and 201 are not in the language.
Then from that I built a GNFA so I can convert it to an RE.
My RE looks like this:
(0(0+1+2)* )(1(0(1+2)+1+2)* )(2((0+1)2+2))*)
I have no idea if this is correct; I think I understand REs, but I'm not entirely sure.
Could someone please tell me if it’s correct and if not why?
There is a general method to convert any DFA into a regular expression, and it is probably what you should be using to solve this homework problem.
For your attempt specifically, you can tell whether an RE is incorrect by finding a word that should be in the language, but that your RE doesn't accept, or a word that shouldn't be in the language that the RE does accept. In this case, the string 1002 should be in the language, but the RE doesn't match it.
There are two primary reasons why this string isn't matched. The first is that there should be a union rather than a concatenation between the three major parts of the language (words starting with 0, 1 and 2, respectively):
(0(0+1+2)*) (1(0(1+2)+1+2)*) (2((0+1)2+2))*) // wrong
(0(0+1+2)*) + (1(0(1+2)+1+2)*) + (2((0+1)2+2))*) // better
The second problem is that in the 1 and 2 cases, the digits smaller than the starting digit need to be repeatable:
(1(0 (1+2)+1+2)*) // wrong
(1(0*(1+2)+1+2)*) // better
If you do both of those things, the RE will be correct. I'll leave it as an exercise for you to follow that step for the 2 case.
The next thing you can try is find a way to make the RE more compact:
(1(0*(1+2)+1+2)*) // verbose
(1(0*(1+2))*) // equivalent, but more compact
This last step is just a matter of preference. You don't need the trailing +1+2 because 0* can be of zero length, so 0*(1+2) covers the +1+2 case.
You can use an algorithm but this DFA might be easy enough to convert as a one-off.
First, note that if the first symbol seen in the initial state is 0, you transition to state A and remain there. A is accepting. This means any string beginning with 0 is accepted. Thus, our regular expression might as well have a term like 0(0+1+2)* in it.
Second, note that if the first symbol seen in the initial state is 1, you transition to state B and remain in states B and D from that point on. You only leave B if you see 0 and you stay out of B as long as you keep seeing 0. The only way to end on D is if the last symbol you saw was 0. Therefore, strings beginning with 1 are accepted if and only if they don't end in 0. We can have a term like 1 + 1(0+1+2)*(1+2) in our regular expression to cover these cases (the lone 1 covers the one-symbol string, which is also accepted).
Third, note that if the first symbol seen in the initial state is 2, you transition to state C and remain in states C and E from that point on. You leave state C if you see anything but 2 and stay out of C until you see a 2 again. The only way to end up on C is if the last symbol you saw was 2. Therefore, strings beginning with 2 are accepted if and only if they end in 2. We can have a term like 2 + 2(0+1+2)*2 in our regular expression as well to cover these cases.
Finally, we see that there are no other cases to consider; our three terms cover all cases and the union of them fully describes our language:
0(0+1+2)* + 1 + 1(0+1+2)*(1+2) + 2 + 2(0+1+2)*2
It was easy to just write out the answer here because this DFA is sort of like three simple DFAs put together with a start state. More complicated DFAs might be easier to convert to REs using algorithms that don't require you to understand or follow what the DFA is doing.
Note that if the start state is accepting (mentioned in a comment on another answer) the RE changes as follows:
e + 0(0+1+2)* + 1 + 1(0+1+2)*(1+2) + 2 + 2(0+1+2)*2
Basically, we just tack the empty string onto it since it is not already generated by any of the other parts of the aggregate expression.
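If you want to double-check an expression like this, a brute-force comparison against the language definition is quick to write. The following Python sketch (just an illustration; it rewrites + as | and (0+1+2) as [012] for the re module) checks the version without the empty string over all strings up to length 7:

import re
from itertools import product

# Compare the regular expression with the language definition
# (last symbol not smaller than the first) for all short strings.
pattern = re.compile(r"^(0[012]*|1|1[012]*[12]|2|2[012]*2)$")

def in_language(s):
    return len(s) > 0 and s[-1] >= s[0]

for n in range(8):
    for letters in product("012", repeat=n):
        s = "".join(letters)
        assert bool(pattern.match(s)) == in_language(s), s
print("regular expression agrees with the language up to length 7")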
You have the equivalent of what is known as a right-linear system. It's right-linear because the variables occur only to the first degree and only at the right-hand end of each term. The system that you have may be written - with a change in labels from 0, 1, 2 to u, v, w - as
S ≥ u A + v B + w C
A ≥ 1 + (u + v + w) A
B ≥ 1 + u D + (v + w) B
C ≥ 1 + (u + v) E + w C
D ≥ u D + (v + w) B
E ≥ (u + v) E + w C
The underlying algebra is known as a Kleene algebra. It is defined by the following identities that serve as its fundamental properties
(xy)z = x(yz), x1 = x = 1x,
(x + y) + z = x + (y + z), x + 0 = x = 0 + x,
y0z = 0, w(x + y)z = wxz + wyz,
x + y = y + x, x + x = x,
with a partial ordering relation defined by
x ≤ y ⇔ y ≥ x ⇔ ∃z(x + z = y) ⇔ x + y = y
With respect to this ordering relation, all finite subsets have least upper bounds, including the following
0 = ⋁ ∅, x + y = ⋁ {x, y}
The sum operator "+" is the least upper bound operator.
The system you have is a right-linear fixed point system, since it expresses the variables on the left as a (right-linear) function, as given on the right, of the variables. The object being specified by the system is the least solution with respect to the ordering; i.e. the least fixed point solution; and the regular expression sought out is the value that the main variable has in the least fixed point solution.
The last axiom(s) for Kleene algebras can be stated in any of a number of equivalent ways, including the following:
0* = 1
the least fixed point solution to x ≥ a + bx + xc is x = b* a c*.
There are other ways to express it. A consequence is that one has identities such as the following:
1 + a a* = a* = 1 + a* a
(a + b)* = a* (b a*)*
(a b)* a = a (b a)*
In general, right linear systems, such as the one corresponding to your problem may be written in vector-matrix form as 𝐪 ≥ 𝐚 + A 𝐪, with the least fixed point solution given in matrix form as 𝐪 = A* 𝐚. The central theorem of Kleene algebras is that all finite right-linear systems have least fixed point solutions; so that one can actually define matrix algebras over Kleene algebras with product and sum given respectively as matrix product and matrix sum, and that this algebra can be made into a Kleene algebra with a suitably-defined matrix star operation through which the least fixed point solution is expressed. If the matrix A decomposes into block form as
[ B  C ]
[ D  E ]
then the star A* of the matrix has the block form
[ (B + C E* D)*         (B + C E* D)* C E* ]
[ (E + D B* C)* D B*    (E + D B* C)*      ]
So, what this is actually saying is that for a vector-matrix system of the form
x ≥ a + B x + C y
y ≥ b + D x + E y
the least fixed point solution is given by
x = (B + C E* D)* (a + C E* b)
y = (E + D B* C)* (D B* a + b)
The star of a matrix, if expressed directly in terms of its components, will generally be huge and highly redundant. For an n×n matrix, it has size O(n³) - cubic in n - if you allow for redundant sub-expressions to be defined by macros. Otherwise, if you in-line insert all the redundancy then I think it blows up to a highly-redundant mess that is exponential in n in size.
So, there's intelligence required and involved (literally meaning: AI) in finding or pruning optimal forms that avoid the blow-up as much as possible. That's a non-trivial job for any purported matrix solver and regular expression synthesis compiler.
A heuristic, for your system, is to solve for the variables that don't have a "1" on the right-hand side and in-line substitute the solutions, working bottom-up through the dependency chain of the variables. That would mean starting with D and E first
D ≥ u* (v + w) B
E ≥ (u + v)* w C
In-line substitute into the other inequations
S ≥ u A + v B + w C
A ≥ 1 + (u + v + w) A
B ≥ 1 + u u* (v + w) B + (v + w) B
C ≥ 1 + (u + v) (u + v)* w C + w C
Apply Kleene algebra identities (e.g. x x* y + y = x* y)
S ≥ u A + v B + w C
A ≥ 1 + (u + v + w) A
B ≥ 1 + u* (v + w) B
C ≥ 1 + (u + v)* w C
Solve for the next layer of dependency up: A, B and C:
A ≥ (u + v + w)*
B ≥ (u* (v + w))*
C ≥ ((u + v)* w)*
Apply some more Kleene algebra (e.g. (x* y)* = 1 + (x + y)* y) to get
B ≥ 1 + N (v + w)
C ≥ 1 + N w
where, for convenience we set N = (u + v + w)*. In-line substitute at the top-level:
S ≥ u N + v (1 + N (v + w)) + w (1 + N w).
The least fixed point solution, in the main variable S, is thus:
S = u N + v + v N (v + w) + w + w N w.
where
N = (u + v + w)*.
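The "least fixed point" can also be checked concretely by iterating the system on finite languages, truncated at some maximum word length. Below is a rough Python sketch; the helper names MAX_LEN, cat and step are mine, and u, v, w stand for the symbols 0, 1, 2 as in the relabelling above.

from itertools import product

MAX_LEN = 6
u, v, w = {"0"}, {"1"}, {"2"}
ONE = {""}                                    # the unit: the language {empty word}

def cat(X, Y):                                # concatenation, truncated at MAX_LEN
    return {a + b for a in X for b in Y if len(a + b) <= MAX_LEN}

def step(S, A, B, C, D, E):
    return (
        cat(u, A) | cat(v, B) | cat(w, C),    # S >= uA + vB + wC
        ONE | cat(u | v | w, A),              # A >= 1 + (u+v+w)A
        ONE | cat(u, D) | cat(v | w, B),      # B >= 1 + uD + (v+w)B
        ONE | cat(u | v, E) | cat(w, C),      # C >= 1 + (u+v)E + wC
        cat(u, D) | cat(v | w, B),            # D >= uD + (v+w)B
        cat(u | v, E) | cat(w, C),            # E >= (u+v)E + wC
    )

langs = (set(), set(), set(), set(), set(), set())
while True:                                   # iterate the monotone map from the bottom
    nxt = step(*langs)
    if nxt == langs:
        break
    langs = nxt

S = langs[0]
expected = {"".join(word)
            for n in range(1, MAX_LEN + 1)
            for word in product("012", repeat=n)
            if word[-1] >= word[0]}
print(S == expected)    # True: the least fixed point matches the language up to MAX_LEN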
As you can already see, even with this simple example, there's a lot of chess-playing to navigate through the system to find an optimally-pruned solution. So, it's certainly not a trivial problem. What you're essentially doing is synthesizing a control-flow structure for a program in a structured programming language from a set of goto's ... essentially the core process of reverse-compiling from assembly language to a high level language.
One measure of optimization is that of minimizing the loop-depth - which here means minimizing the depth of the stars or the star height. For example, the expression x* (y x*)* has star-height 2 but reduces to (x + y)*, which has star height 1. Methods for reducing star-height come out of the research by Hashiguchi and his resolution of the minimal star-height problem. His proof and solution (dating, I believe, from the 1980's or 1990's) is complex and to this day the process still goes on of making something more practical of it and rendering it in more accessible form.
Hashiguchi's formulation was cast in the older 1950's and 1960's formulation, predating the axiomatization of Kleene algebras (which was in the 1990's), so to date, nobody has rewritten his solution in entirely algebraic form within the framework of Kleene algebras anywhere in the literature ... as far as I'm aware. Whoever accomplishes this will have, as a result, a core element of an intelligent regular expression synthesis compiler, but also of a reverse-compiler and programming language synthesis de-compiler. Essentially, with something like that on hand, you'd be able to read code straight from binary and the lid will be blown off the world of proprietary systems. [Bite tongue, bite tongue, mustn't reveal secret yet, must keep the ring hidden.]

Determine if a sequence is an interleaving of a repetition of two strings

I have this task:
Let x be a string over some finite and fixed alphabet (think English alphabet). Given an integer k we use x^k to denote the string obtained by concatenating k copies of x. If x is the string HELLO then x^3 is the string HELLOHELLOHELLO. A repetition of x is a prefix of x^k for some integer k. Thus HELL and HELLOHELL are both repetitions of HELLO.
An interleaving of two strings x and y is any string that is obtained by shuffling a repetition of x with a repetition of y. For example HELwoLOHELLrldwOH is an interleaving of HELLO and world.
Describe an algorithm that takes three strings x, y, z as input and decides whether z is an interleaving of x and y.
I've only come up with a solution that has exponential complexity. (We have a pointer into the word z, and a kind of binary tree. In every node I keep the current states of the possible words x and y, both blank at the start. I process z, and a node has one/two/no children depending on whether the next character from z could be added to the x word, the y word, or neither.) How could I get better than exponential complexity?
Suppose the two words x and y have length N1 and N2.
Construct a non-deterministic finite state machine with states (n1, n2) where 0 <= n1 < N1 and 0 <= n2 < N2. All states are accepting.
Transitions are:
c: (n1, n2) --> ((n1 + 1) % N1, n2) if x[n1] == c
c: (n1, n2) --> (n1, (n2 + 1) % N2) if y[n2] == c
This NDFSM recognises strings that are formed from interleaving repetitions of x and y.
Here are some ways to implement the NDFSM: https://en.wikipedia.org/wiki/Nondeterministic_finite_automaton#Implementation
Here's a simple Python implementation.
def is_interleaved(x, y, z):
    # States are pairs (i1, i2): how far into x and into y we have matched so far.
    # Assumes x and y are non-empty.  All states are accepting, so any surviving
    # state after consuming z means z is an interleaving.
    states = set([(0, 0)])
    for c in z:
        ns = set()
        for i1, i2 in states:
            if c == x[i1]:
                ns.add(((i1 + 1) % len(x), i2))
            if c == y[i2]:
                ns.add((i1, (i2 + 1) % len(y)))
        states = ns
    return bool(states)

print(is_interleaved('HELLO', 'world', 'HELwoLOHELLrldwOH'))
print(is_interleaved('HELLO', 'world', 'HELwoLOHELLrldwOHr'))
print(is_interleaved('aaab', 'aac', 'aaaabaacaab'))
In the worst case, it'll run in O(N1 * N2 * len(z)) time and use O(N1 * N2) space, but in many cases the time complexity will be better than this unless the strings x and y are repetitious.

How can I implement the sum of squares of two largest numbers out of three for floating-point numbers?

Exercise 1.3 of the book Structure and Interpretation of Computer Programs asks the following:
Define a procedure that takes three numbers as arguments and returns the sum of the squares of the two larger numbers.
I've managed to answer this question, but only for integers:
use std::cmp;

fn sum_square_largest(x: isize, y: isize, z: isize) -> isize {
    x * x + y * y + z * z - min_three(x, y, z) * min_three(x, y, z)
}

fn min_three<T>(v1: T, v2: T, v3: T) -> T where T: Ord {
    cmp::min(v1, cmp::min(v2, v3))
}
But when I change the sum_square_largest function to:
fn sum_square_largest(x: f64, y: f64, z: f64) -> f64 {
    x * x + y * y + z * z - min_three(x, y, z) * min_three(x, y, z)
}
It gives the following error: the trait 'core::cmp::Ord' is not implemented for the type 'f64' [E0277].
What is this? And how can I define this function to work with floating-point numbers?
Floats do not implement Ord because they do not have a total ordering: any comparison involving NaN is false, including comparing NaN with another NaN.
If you're on nightly Rust, you can use partial_min, which makes these kinds of cases explicit.
You can also decide what to do in the case of things like NaN, and then implement a wrapper type over f64, and implement Ord for it, such that it handles that case.
If you want to avoid fatal rounding errors, you need to write your code as the exercise asks: by adding the squares of two of the three numbers. Otherwise, if you have for example x = y = 1 and z = -1e100, you get catastrophic rounding errors: the sum of the three squares gets rounded to 1e200, the same amount is subtracted, and the result is 0 instead of 2.
It can be worse: if z = -1e200, then z*z overflows, the three squares add up to +inf, you subtract +inf and get NaN.
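This is easy to reproduce: Python floats are the same IEEE 754 doubles as Rust's f64, so a short check (the function name bad_sum_square_largest is made up) shows both failure modes:

def bad_sum_square_largest(x, y, z):
    # the "sum all three squares, then subtract the smallest square" trick
    m = min(x, y, z)
    return x * x + y * y + z * z - m * m

print(bad_sum_square_largest(1.0, 1.0, -1e100))   # 0.0 instead of 2.0 (cancellation)
print(bad_sum_square_largest(1.0, 1.0, -1e200))   # nan, because inf - inf

The snippet below from the answer avoids this by squaring only the two numbers that are actually kept: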
let x1 = if x > y { x } else { y };
let mut x2 = if x > y { y } else { x };
if z > x2 { x2 = z; }
x1 * x1 + x2 * x2
If you consider NaNs the situation gets slightly more complicated. Obviously if you have two or three NaNs the result will be NaN. If you have one NaN, you need to decide whether (a) you don't care, (b) the result should be NaN or (c) the result should be the sum of the squares of the two numbers that are not NaN. In case (b) or (c), the result should only depend on the three values, and not on the order in which they are used.
The code above covers (a). If you want (b), then you need to make sure that z will be stored into x2 if it is NaN and x2 stays unchanged if it is NaN. You achieve this by changing the last line to
if z > x2 || z != z { x2 = z };
If you want (c) it is a bit more complicated.
let x1 = if x > y || y != y { x } else { y };
let mut x2 = if x > y || y != y { y } else { x };
if z > x2 || x2 != x2 { x2 = z; }
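Variant (c) is the subtle one. Python's float comparisons treat NaN the same way (every comparison involving NaN is false), so a direct transliteration of the snippet above (the name variant_c is mine) can be checked over every argument order:

from itertools import permutations

nan = float("nan")

def variant_c(x, y, z):
    # same comparisons as the Rust snippet above
    x1 = x if (x > y or y != y) else y
    x2 = y if (x > y or y != y) else x
    if z > x2 or x2 != x2:
        x2 = z
    return x1 * x1 + x2 * x2

# With exactly one NaN among the inputs, the result is the sum of the squares of
# the other two numbers, regardless of where the NaN appears.
for args in permutations([3.0, 4.0, nan]):
    assert variant_c(*args) == 25.0
print("variant (c) ignores a single NaN in any position")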

HB RFID attack. What am I missing?

Regarding the RFID protocol HB (not HB+) I am having a hard time understanding why my approach will not work.
So in HB we have the Tag and the Reader, who both share a secret X.
We are trying to figure out X.
The protocol goes as follows:
Let's suppose k = 3 bits.
From the papers I have read it seems the attack goes as follows.
set a = 001 and send say 1000 times
set a = 010 and send 1000 times
set a = 100 and send 1000 times
take the parity that comes out the majority of the time for each a, revealing x.
This makes sense to me and works fine.
My question is, why can I not simply set a to 001? Since a = 001, when it is ANDed with x it will always produce x, which is then XORed with v. The resulting z will always be either x or x XOR 1. We then just take the output that occurs the majority of the time, which would be x, since the probability of v = 1 is < 0.5.
I feel like I would only have to run this, say, 10 times rather than running every a multiple times.
Am I missing an important aspect of this?
Thanks
Why can I not simply set a to '001'?
x and a are of length k, so
x = { x_{k-1}, ..., x_0 }
a = { a_{k-1}, ..., a_0 }
If k = 3, this would be
x = { x_2, x_1, x_0 }
a = { a_2, a_1, a_0 }
I.e. x and a would be one of '000', '001', '010', '011', '100', '101', '110', or '111'.
So the scalar product x · a results in
x · a = (x_2 AND a_2) XOR (x_1 AND a_1) XOR (x_0 AND a_0)
Consequently, using a = '001' results in
z = x · '001' = (x_2 AND '0') XOR (x_1 AND '0') XOR (x_0 AND '1') = x_0
So you would not get the remaining digits of x (i.e. x_2 and x_1) in that case. Similarly, if you use an a with more than one bit set, e.g. a = '111', you would get
z = x · '111' = (x_2 AND '1') XOR (x_1 AND '1') XOR (x_0 AND '1') = x_2 XOR x_1 XOR x_0
and therefore could not derive the individual digits of x from it. Thus, you need to perform the protocol with a = '001', a = '010', and a = '100' in order to get each digit of x.
I feel like I would only have to run this say 10 times rather then running every a multiple times.
Well, in every round you will get a correct result with some probability v (v here denotes the probability that the noise bit is 0, so v > 0.5). So the expected value of the observed bit would be
E[X] = v, if the correct digit is a '1', and
E[X] = 1 - v, if the correct digit is a '0'.
Hence, the mean value over all rounds (i.e. over every sample you take) will approach v for a '1' and 1 - v for a '0' as the number of rounds grows. But this does not mean that you already reach this expected value after 1 round or 10 rounds; rather, with every additional round you increase the confidence that the majority result matches the expected value.
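A small simulation of the majority-vote attack described above (Python; the parameters k = 8, noise probability 0.2 and 1000 rounds per basis vector are made up for illustration):

import random

k = 8                      # key length in bits
ETA = 0.2                  # probability that the noise bit v is 1 (must be < 0.5)
ROUNDS = 1000              # challenges sent per basis vector

random.seed(1)
x = [random.randint(0, 1) for _ in range(k)]         # the tag's secret

def tag_response(a):
    # HB tag: z = (a . x) XOR v, where v is the noisy bit.
    dot = sum(ai & xi for ai, xi in zip(a, x)) % 2
    v = 1 if random.random() < ETA else 0
    return dot ^ v

recovered = []
for i in range(k):
    a = [1 if j == i else 0 for j in range(k)]        # basis vector, exposes x[i]
    ones = sum(tag_response(a) for _ in range(ROUNDS))
    recovered.append(1 if ones > ROUNDS // 2 else 0)  # majority vote beats the noise
print(x == recovered)      # True with overwhelming probability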

Stuck on a Concurrent programming example, in pseudocode(atomic actions/fine-grained atomicity)

My book presents a simple example which I'm a bit confused about:
It says, "consider the following program, and assume that the fine-grained atomic actions are reading and writing the variables:"
int y = 0, z = 0;
co x = y+z; // y=1; z=2; oc;
"If x = y + z is implemented by loading a register with y and then adding z to it, the final value of x can be 0,1,2, or 3. "
2? How does 2 work?
Note: co starts a concurrent process and // denotes parallel-running statements
In your program there are two parallel sequences:
Sequence 1: x = y+z;
Sequence 2: y=1; z=2;
The operations of sequence 1 are:
y Copy the value of y into a register.
+ z Add the value of z to the value in the register.
x = Copy the value of the register into x.
The operations of sequence 2 are:
y=1; Set the value of y to 1.
z=2; Set the value of z to 2.
These two sequences are running at the same time, though the steps within a sequence must occur in order. Therefore, you can get an x value of '2' in the following sequence:
y=0
z=0
y Copy the value of y into a register. (register value is now '0')
y=1; Set the value of y to 1. (has no effect on the result, we've already copied y to the register)
z=2; Set the value of z to 2.
+ z Add the value of z to the value in the register. (register value is now '2')
x = Copy the value of the register into x. (the value of x is now '2')
Since the two reads in x = y + z happen at different times, an even simpler way to see the value 2 is that y can be read as 0 (before y=1 runs) and z as 2 (after z=2 runs), even though memory never holds y=0 and z=2 at the same instant.
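To confirm the full set of reachable values, one can simply enumerate every interleaving of the five atomic actions listed in the answer. A small Python sketch (the action names are mine):

from itertools import permutations

SEQ1 = ["load y", "add z", "store x"]        # the three atomic steps of x = y + z
SEQ2 = ["y = 1", "z = 2"]

def run(order):
    y = z = x = reg = 0
    for op in order:
        if op == "load y":
            reg = y
        elif op == "add z":
            reg = reg + z
        elif op == "store x":
            x = reg
        elif op == "y = 1":
            y = 1
        elif op == "z = 2":
            z = 2
    return x

def keeps_program_order(order):
    # within each sequence the steps must stay in their original order
    return ([op for op in order if op in SEQ1] == SEQ1 and
            [op for op in order if op in SEQ2] == SEQ2)

results = {run(order)
           for order in permutations(SEQ1 + SEQ2)
           if keeps_program_order(order)}
print(sorted(results))                       # [0, 1, 2, 3]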

Resources