Is it possible to make conditional inserts with Azure Table Storage

Is it possible to make a conditional insert with the Windows Azure Table Storage Service?
Basically, what I'd like to do is to insert a new row/entity into a partition of the Table Storage Service if and only if nothing changed in that partition since I last looked.
In case you are wondering, I have Event Sourcing in mind, but I think that the question is more general than that.
Basically I'd like to read part of, or an entire, partition and make a decision based on the content of the data. In order to ensure that nothing changed in the partition since the data was loaded, an insert should behave like normal optimistic concurrency: the insert should only succeed if nothing changed in the partition - no rows were added, updated or deleted.
Normally in a REST service, I'd expect to use ETags to control concurrency, but as far as I can tell, there's no ETag for a partition.
The best solution I can come up with is to maintain a single row/entity for each partition in the table which contains a timestamp/ETag and then make all inserts part of a batch consisting of the insert as well as a conditional update of this 'timestamp entity'. However, this sounds a little cumbersome and brittle.
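For illustration, a minimal sketch of that batch-based workaround with the Azure.Data.Tables SDK might look like the following (the table name, key scheme and "HEAD" entity are my own assumptions, not an established convention):

// Insert a new event entity and, in the same batch, replace a per-partition
// "head" entity guarded by its ETag. Every writer includes the replace, so the
// head's ETag changes on each successful batch; if anything touched the
// partition since we read the head, the whole batch fails and nothing is written.
using Azure;
using Azure.Data.Tables;

var table = new TableClient("<connection-string>", "Events");

// The head entity carries no data of its own; its ETag acts as the
// partition's concurrency token.
TableEntity head = await table.GetEntityAsync<TableEntity>("stream-123", "HEAD");

var newEvent = new TableEntity("stream-123", "00000042") { ["Payload"] = "..." };

var batch = new List<TableTransactionAction>
{
    new TableTransactionAction(TableTransactionActionType.Add, newEvent),
    // Succeeds only if the head's ETag is unchanged since the read above.
    new TableTransactionAction(TableTransactionActionType.UpdateReplace, head, head.ETag)
};

try
{
    await table.SubmitTransactionAsync(batch);
}
catch (TableTransactionFailedException)
{
    // Someone else wrote to the partition first: reload and retry, or give up.
}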
Is this possible with the Azure Table Storage Service?

The view from a thousand feet
Might I share a small tale with you...
Once upon a time someone wanted to persist events for an aggregate (of Domain Driven Design fame) in response to a given command. This person wanted to ensure that an aggregate would only be created once and that any optimistic concurrency conflict could be detected.
To tackle the first problem - that an aggregate should only be created once - he did an insert into a transactional medium that threw when a duplicate aggregate (or more accurately the primary key thereof) was detected. The thing he inserted was the aggregate identifier as primary key and a unique identifier for a changeset. A changeset here means the collection of events produced by the aggregate while processing the command. If someone or something else beat him to it, he would consider the aggregate already created and leave it at that. The changeset would be stored beforehand in a medium of his choice. The only promise this medium must make is to return what has been stored as-is when asked. Any failure to store the changeset would be considered a failure of the whole operation.
To tackle the second problem - detecting optimistic concurrency conflicts during the rest of the aggregate's life-cycle - he would, after having written yet another changeset, update the aggregate record in the transactional medium if and only if nobody had updated it behind his back (i.e. compared to what he last read just before executing the command). The transactional medium would notify him if such a thing happened. This would cause him to restart the whole operation, rereading the aggregate (or changesets thereof) to make the command succeed this time.
Of course, now that he had solved the writing problems, along came the reading problems. How would one be able to read all the changesets of an aggregate that made up its history? After all, he only had the last committed changeset associated with the aggregate identifier in that transactional medium. And so he decided to embed some metadata as part of each changeset. Among the metadata - which is not so uncommon to have as part of a changeset - would be the identifier of the previously committed changeset. This way he could "walk the line" of changesets of his aggregate, like a linked list so to speak.
As an additional perk, he would also store the command message identifier as part of the metadata of a changeset. This way, when reading changesets, he could know in advance if the command he was about to execute on the aggregate was already part of its history.
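By way of illustration only, the metadata he describes might be captured in a record shaped roughly like this (all names are hypothetical):

// Each changeset points back to the previously committed changeset, forming
// the "linked list" described above, and records the command that produced it
// so duplicate commands can be detected when the history is read back.
public sealed record Changeset(
    Guid ChangesetId,          // unique identifier of this changeset
    Guid AggregateId,          // the aggregate these events belong to
    Guid? PreviousChangesetId, // last committed changeset, null for the first one
    Guid CommandMessageId,     // the command that produced these events
    IReadOnlyList<object> Events);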
All's well that ends well ...
P.S.
1. The transactional medium and changeset storage medium can be the same.
2. The changeset identifier MUST NOT be the command identifier.
3. Feel free to punch holes in the tale :-).
4. Although not directly related to Azure Table Storage, I've implemented the above tale successfully using AWS DynamoDB and AWS S3.

How about storing each event at a PartitionKey/RowKey created from AggregateId/AggregateVersion, where AggregateVersion is a sequential number based on how many events the aggregate already has?
This is very deterministic, so when adding a new event to the aggregate you can be sure you were working from its latest version, because otherwise you'll get an error saying that the row for that partition already exists. At that point you can drop the current operation and retry, or try to figure out whether you can merge anyway because the new updates to the aggregate do not conflict with the operation you just did.
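A hedged sketch of that idea with the Azure.Data.Tables SDK (the entity shape and zero-padded RowKey are my own assumptions):

// RowKey is the next aggregate version, zero-padded so rows sort in order.
// A plain Add fails with 409 Conflict if that version already exists, i.e.
// someone else appended to the aggregate after we loaded it.
using Azure;
using Azure.Data.Tables;

var table = new TableClient("<connection-string>", "Events");

Guid aggregateId = Guid.NewGuid();   // hypothetical aggregate
int currentVersion = 41;             // highest version seen when the aggregate was loaded
string serializedEvent = "{ ... }";

var entity = new TableEntity(aggregateId.ToString(), (currentVersion + 1).ToString("D10"))
{
    ["Payload"] = serializedEvent
};

try
{
    await table.AddEntityAsync(entity);
}
catch (RequestFailedException ex) when (ex.Status == 409)
{
    // The row for that partition/version already exists: reload the aggregate,
    // re-run the command (or merge, if the changes don't conflict) and retry.
}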

Related

Prevent DELETES from bypassing versioning in Amazon QLDB

Amazon QLDB allows querying the version history of a specific object by its ID. However, it also allows deleting objects. It seems like this can be used to bypass versioning by deleting and creating a new object instead of updating the object.
For example, let's say we need to track vehicle registrations by VIN.
INSERT INTO VehicleRegistration
<< {
'VIN' : '1N4AL11D75C109151',
'LicensePlateNumber' : 'LEWISR261LL'
} >>
Then our application can get a history of all LicensePlateNumber assignments for a VIN by querying:
SELECT * FROM _ql_committed_VehicleRegistration AS r
WHERE r.data.VIN = '1N4AL11D75C109151';
This will return all non-deleted document revisions, giving us an unforgeable history. The history function can be used similarly if you remember the document ID from the insert. However, if I wanted to maliciously bypass the history, I would simply delete the object and reinsert it:
DELETE FROM VehicleRegistration AS r WHERE r.VIN = '1N4AL11D75C109151';
INSERT INTO VehicleRegistration
<< {
'VIN' : '1N4AL11D75C109151',
'LicensePlateNumber' : 'ABC123'
} >>
Now there is no record that I have modified this vehicle registration, defeating the whole purpose of QLDB. The document ID of the new record will be different from the old, but QLDB won't be able to tell us that it has changed. We could use a separate system to track document IDs, but now that other system would be the authoritative one instead of QLDB. We're supposed to use QLDB to build these types of authoritative records, but the other system would have the exact same problem!
How can QLDB be used to reliably detect modifications to data?
There would be a record of the original document and of its deletion in the ledger, both available through the history() function, as you pointed out. So there's no way to hide the bad behavior; it's just a matter of hoping nobody knows to look for it. Again, as you pointed out.
You have a couple of options here. First, QLDB rolled out fine-grained access control last week (announcement here). This would let you, say, prohibit deletes on a given table. See the documentation.
Another thing you can do is look for deletions or other suspicious activity in real-time using streaming. You can associate your ledger with a Kinesis Data Stream. QLDB will push every committed transaction into the stream where you can react to it using a Lambda function.
If you don't need real-time detection, you can do something with QLDB's export feature. This feature dumps ledger blocks into S3 where you can extract and process data. The blocks contain not just your revision data but also the PartiQL statements used to create the transaction. You can set up an EventBridge scheduler to kick off a periodic export (say, of the day's transactions) and then churn through it to look for suspicious deletes, etc. This lab might be helpful for that.
I think the best approach is to manage it with permissions. Keep developers out of production or make them assume a temporary role to get limited access.

How to handle (partially) dependant aggregate roots?

I have domain concept of Product.
Product has some GeneralDetails, let's say: sku, name, description.
At the same time, Product has a ProductCalculations part where accountants can put in different values like purchasePrice, stockLevelExpenses, wholeSalesPrice, retailPrice.
So, so far, Product would look something like:
class Product {
    GeneralDetails Details;
    ProductCalculations Calculations;

    void ChangeDetails(GeneralDetails details) { }
    void Recalculate(ProductCalculations calculations) { }
}
This setup would make Product an aggregate root. But now, I want to split it in a way that the product manager can input/update product details but then that accountant can step in and independently change calculations for a given product without concurrency issues.
That would suggest splitting it into 2 separate aggregate roots.
But then, deleting the ProductDetails aggregate must mean deleting ProductCalculations too, and it should happen in a transactional way.
Assuming they are 2 aggregate roots, meaning they have 2 separate repositories with corresponding Delete methods, how to implement this as an atomic transaction?
The only thing I can think of is to raise an event when ProductDetails gets deleted and have a handler (DomainService) that uses some special repository that handles transactions over multiple aggregate roots.
Is there some problem with that approach and/or is there some better way to handle it?
PS.
I cannot allow eventual consistency when ProductDetails is deleted.
PS2.
Based on comments from @Jon, Details and Calculations creation and deletion should be synced in such a way that when Details are created/deleted, Calculations should also be created/deleted.
On the other hand, their updates should be completely independent.
I think the answer to your question depends somewhat on what data storage technology you're using and your data storage model, because if you can push operation transactionality to the data layer, things get much easier.
If you're using a document-oriented database (Cosmos DB, MongoDB, etc...), I would model and store your Product aggregate (including Details and Calculations) as a single document and you get the atomic transaction and concurrency checking for free from the database.
If you must store these as separate documents/records in your data store, then providing atomic transactions and concurrency checking becomes your concern. For years folks (especially those using Entity Framework) have been using the Unit of Work pattern to batch up multiple repository operations and submit them to the database as a single operation (EF-specific UoW implementation). Rob Conery suggests here that a better option is to use Command objects to encapsulate a multi-part operation that needs to be executed as a single transaction.
In any event, I would encourage you to keep the management of this operation within Product, so that consumers of Product are unaware of what's going on during the save - they just blissfully call product.SaveAsync() and they don't need to know whether that's causing one record update or ten. As long as Product is injected with the repositories it needs to get the job done, there's no need to have a separate domain service to coordinate this operation. There's nothing wrong with Product listening for events that its children raise and responding appropriately.
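A rough sketch of what that might look like (the repository interfaces and naming here are purely illustrative):

// Product coordinates its own persistence, so callers only ever see SaveAsync();
// whether that touches one record or ten is hidden from them. The repository
// types are hypothetical placeholders.
public class Product
{
    private readonly IDetailsRepository _detailsRepository;
    private readonly ICalculationsRepository _calculationsRepository;

    public GeneralDetails Details { get; private set; }
    public ProductCalculations Calculations { get; private set; }

    public Product(IDetailsRepository detailsRepository,
                   ICalculationsRepository calculationsRepository)
    {
        _detailsRepository = detailsRepository;
        _calculationsRepository = calculationsRepository;
    }

    public async Task SaveAsync()
    {
        await _detailsRepository.SaveAsync(Details);
        await _calculationsRepository.SaveAsync(Calculations);
    }
}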
Hope this helps!
" I cannot allow eventual consistency when ProductDetails is deleted"
Why not? What would be the business cost of having Inventory.Product exist while Finance.Product doesn't or vice-versa?
"but then that accountant can step in and intependently change calculations for given product"
That's pretty much what eventual consistency is, no?
If you really can't have eventual consistency then use a domain service to create/delete two distinct aggregate roots in a single transaction, but ask yourself how you are going to do that if the information is not entirely provided by the same end user?
I agree with @plalx on almost every point. However, I want to add my bit to the discussion.
I've found that there is usually very little cost in creating two or more related aggregates inside a single transaction (inside a single bounded context). After all, if those aggregates don't exist yet there cannot be a concurrency conflict, so there is no contention and not much difference. Further, you don't need to deal with partially created state (thinking of state that is split between aggregates). It is possible to do this with eventual consistency, and there are situations where that is a better approach, but most of the time there is no great benefit. Even Vernon, in his book Implementing Domain-Driven Design, mentions this use case as a "valid reason to break the rules".
Deleting more than one aggregate is a different story. What should happen if you delete an aggregate that another user is updating at the same time? The probability of such a conflict increases with the number of aggregates you try to modify or delete in the same transaction. Is there always an upstream/downstream relationship between those aggregates? I mean, if a user deletes A and B must also be deleted, does the user who is updating B have no "power" or "voice" to cancel that deletion, even though she is providing more information to the state of the aggregate?
Those are very tricky questions, and most of the time they are something you need to discuss with a domain expert; there are very few real scenarios where the answer is something you can't afford to handle with eventual consistency. I've found that in many cases it is preferable to put a "flag" marking the aggregate as "inactive", notifying users that it will be deleted after some period of time. If no user with enough permission requests that the aggregate become active again, then it gets deleted. That helped users avoid shooting themselves in the foot when they deleted some aggregate by mistake.
You've mentioned that you don't want a user to spend hours modifying one aggregate if there is a deletion, but that is something a transaction doesn't help with much. This depends a lot on the whole architecture, though. That user could have loaded the aggregate into her own memory space before the deletion occurs; whether or not you delete inside a transaction, the user is still wasting time. A better solution could be to publish a domain event that triggers some sort of push notification to the user, so she knows that a deletion happened and can stop working (or request a cancellation of that deletion, if you follow such an approach).
For reports and calculations, there are many cases where those "scripts" can simply skip records whose sibling aggregate is gone, so users don't notice that a part is missing or that consistency is not yet complete.
If for some reason you still need to delete several aggregates in the same transaction, you can just start a transaction in an application service and use repositories to perform the deletion, analogous to the creation case.
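For that last case, a minimal sketch of an application service owning the transaction (the repository types are made up, and this assumes both repositories write to the same transactional store):

// Both deletes commit together or not at all, because they run inside a single
// ambient transaction started by the application service.
using System.Transactions;

public class DeleteProductService
{
    private readonly IProductDetailsRepository _details;
    private readonly IProductCalculationsRepository _calculations;

    public DeleteProductService(IProductDetailsRepository details,
                                IProductCalculationsRepository calculations)
    {
        _details = details;
        _calculations = calculations;
    }

    public void Delete(Guid productId)
    {
        using var scope = new TransactionScope();
        _details.Delete(productId);
        _calculations.Delete(productId);
        scope.Complete(); // without this, the transaction rolls back on dispose
    }
}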
So, to summarize:
The rule of "modify one aggregate per transaction" is not that important when there is a creation of aggregates.
Deletion of many aggregates works quite well (most of the time) with eventual consistency, and very often just disabling those aggregates, one at a time, is better than performing the deletion immediately.
Preventing a user from wasting time is better achieved with proper notifications than with transactions.
If there is a real need to perform those actions inside a single transaction, then manage that transaction in the application and be explicit. Using a domain service to perform all the required operations (except for the transaction, which is mostly an application concern) brings that logic back to the domain layer.

What strategies exist to find unreachable keys in a key/value database?

TL;DR
How can you find "unreachable keys" in a key/value store with a large amount of
data?
Background
In comparison to relational databases that provide ACID guarantees, NoSQL
key/value databases provide fewer guarantees in order to handle "big data".
For example, they only provide atomicity in the context of a single key/value
pair, but they use techniques like distributed hash tables to "shard" the data
across an arbitrarily large cluster of machines.
Keys are often unfriendly for humans. For example, a key for a blob of data
representing an employee might be
Employee:39045e87-6c00-47a4-a683-7aba4354c44a. The employee might also have a
more human-friendly identifier, such as the username jdoe with which the
employee signs in to the system. This username would be stored as a separate
key/value pair, where the key might be EmployeeUsername:jdoe. The value for
key EmployeeUsername:jdoe is typically either an array of strings containing
the main key (think of it like a secondary index, which does not necessarily
contain unique values) or a denormalised version of the employee blob (perhaps
aggregating data from other objects in order to improve query performance).
Problem
Now, given that key/value databases do not usually provide transactional
guarantees, what happens when a process inserts the key
Employee:39045e87-6c00-47a4-a683-7aba4354c44a (along with the serialized
representation of the employee) but crashes before inserting the
EmployeeUsername:jdoe key? The client does not know the key for the employee
data - he or she only knows the username jdoe - so how do you find the
Employee:39045e87-6c00-47a4-a683-7aba4354c44a key?
The only thing I can think of is to enumerate the keys in the key/value store
and once you find the appropriate key, "resume" the indexing/denormalisation.
I'm well aware of techniques like event sourcing, where an idempotent event
handler could respond to the event (e.g., EmployeeRegistered) in order to
recreate the username-to-employee-uuid secondary index, but using event
sourcing over key/value store still requires enumeration of keys, which could
degrade performance.
Analogy
The more experience I have in IT, the more I see the same problems being
tackled in different scenarios. For example, Linux filesystems store both file
and directory contents in "inodes". You can think of these as key/value pairs,
where the key is an integer and the value is the file/directory contents. When
writing a new file, the system creates an inode and fills it with data, THEN
modifies the parent directory to add the "filename-to-inode" mapping. If the
system crashes after creating the file but before referencing it in the parent
directory, your file "exists on disk" but is essentially unreadable. When the
system comes back online, hopefully it will place this file into the
"lost+found" directory (I imagine it does this by scanning the entire disk).
There are plenty of other examples (such as domain name to IP address mappings
in the DNS system), but I specifically want to know how the above problem is
tackled in NoSQL key/value databases.
EDIT
I found this interesting article on manual secondary indexes, but it doesn't cover "broken" or "dated" secondary indexes.
The solution I've come up with is to use a process manager (or "saga") whose key contains the username. This also guarantees uniqueness across employees during registration. (Note that I'm using a key/value store with compare-and-swap (CAS) semantics for concurrency control.)
1. Create an EmployeeRegistrationProcess with a key of EmployeeRegistrationProcess:jdoe. If a concurrency error occurs (i.e., the registration process already exists), then this is a duplicate username.
2. When started, the EmployeeRegistrationProcess allocates an employee UUID and attempts to create an Employee object using this UUID (e.g., Employee:39045e87-6c00-47a4-a683-7aba4354c44a). If the system crashes after starting the EmployeeRegistrationProcess but before creating the Employee, we can still locate the "employee" (or more accurately, the employee registration process) by the username "jdoe" and resume the "transaction". If there is a concurrency error (i.e., the Employee with the generated UUID already exists), the EmployeeRegistrationProcess can flag itself as being "in error" or "for review" or whatever process we decide is best.
3. After the Employee has successfully been created, the EmployeeRegistrationProcess creates the secondary index EmployeeUsernameToUuid:jdoe -> 39045e87-6c00-47a4-a683-7aba4354c44a. Again, if this fails, we can still locate the "employee" by the username "jdoe" and resume the transaction. And again, if there is a concurrency error (i.e., the EmployeeUsernameToUuid:jdoe key already exists), the EmployeeRegistrationProcess can take appropriate action.
4. When both commands have succeeded (the creation of the Employee and the creation of the secondary index), the EmployeeRegistrationProcess can be deleted.
At all stages of the process, the Employee (or EmployeeRegistrationProcess) is reachable via its human-friendly identifier "jdoe". Event sourcing the EmployeeRegistrationProcess is optional.
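A condensed sketch of those steps against a hypothetical key/value client with compare-and-swap ("add if absent") semantics - none of these types come from a real library:

// TryAddAsync returns false if the key already exists, which is the CAS-style
// "insert if absent" every step above relies on.
public interface IKeyValueStore
{
    Task<bool> TryAddAsync(string key, string value);
    Task DeleteAsync(string key);
}

public class EmployeeRegistrationProcess
{
    private readonly IKeyValueStore _store;
    public EmployeeRegistrationProcess(IKeyValueStore store) => _store = store;

    public async Task RegisterAsync(string username, string employeeJson)
    {
        // 1. The process record is keyed by username, so it is always reachable
        //    and doubles as the duplicate-username check.
        if (!await _store.TryAddAsync($"EmployeeRegistrationProcess:{username}", employeeJson))
            throw new InvalidOperationException("Duplicate username, or a crashed registration to resume.");

        var employeeId = Guid.NewGuid();

        // 2. Create the Employee; a 'false' here means an earlier run got this far,
        //    so a fuller implementation would flag the process "for review".
        await _store.TryAddAsync($"Employee:{employeeId}", employeeJson);

        // 3. Create the secondary index from username to UUID.
        await _store.TryAddAsync($"EmployeeUsernameToUuid:{username}", employeeId.ToString());

        // 4. Both writes succeeded, so the process record can be removed.
        await _store.DeleteAsync($"EmployeeRegistrationProcess:{username}");
    }
}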
Note that using a process manager can also help in enforcing uniqueness
across usernames after registration. That is, we can create an
EmployeeUsernameChangeProcess object with a key containing the new
username. "Enforcing" uniqueness at either registration or username
change hurts scalability, so the value identified by
"EmployeeUsernameToUuid:jdoe" could be an array of employee UUIDs.
Looking at the question from the point of view of event-sourced entities, the responsibility of the event store is to guarantee that an event is saved to storage and published to the bus. From that point of view it is guaranteed that the event will be written completely, and since the database works in append-only mode, there will never be a problem with an invalid event.
At the same time, of course, it isn't guaranteed that all commands which generate events will be executed successfully - you can only guarantee ordering and protection against repeated execution of the same command, not the whole transaction.
From there it works as follows: the saga intercepts the original command and tries to execute the whole transaction. If any part of the transaction ends with an error, or for example doesn't finish within a preset time, the process is rolled back by generating so-called compensating events. Such events cannot delete an entity, but they bring the system to a consistent state similar to the one it would have been in had the command never been executed.
Note: if your specific implementation of the event database is arranged so that a key can hold only a single value, just serialize the event and use the combination of the aggregate root's identifier and version as the key. The aggregate version in this case is somewhat analogous to a CAS operation.
For more on concurrency you can read this article: http://danielwhittaker.me/2014/09/29/handling-concurrency-issues-cqrs-event-sourced-system/

Selecting and updating against tables in separate data sources within the same transaction

The attributes for the <jdbc:inbound-channel-adapter> component in Spring Integration include data-source, sql and update. These allow for separate SELECT and UPDATE statements to be run against tables in the specified database. Both sql statements will be part of the same transaction.
The limitation here is that both the SELECT and UPDATE will be performed against the same data source. Is there a workaround for the case when the UPDATE will be on a table in a different data source (not just a separate database on the same server)?
Our specific requirement is to select rows in a table which have a timestamp prior to a specific time. That time is stored in a table in a separate data source. (It could also be stored in a file). If both sql statements used the same database, the <jdbc:inbound-channel-adapter> would work well for us out of the box. In that case, the SELECT could use the time stored, say, in table A as part of the WHERE clause in the query run against table B. The time in table A would then be updated to the current time, and all this would be part of one transaction.
One idea I had was, within the sql and update attributes of the adapter, to use SpEL to call methods in a bean. The method defined for sql would look up a time stored in a file, and then return the full SELECT statement. The method defined for update would update the time in the same file and return an empty string. However, I don't think such an approach is failsafe, because the reading and writing of the file would not be part of the same transaction that the data source is using.
If, however, the update was guaranteed to only fire upon commit of the data source transaction, that would work for us. In the event of a failure, the database transaction would commit, but the file would not be updated. We would then get duplicate rows, but should be able to handle that. The issue would be if the file was updated and the database transaction failed. That would mean lost messages, which we could not handle.
If anyone has any insights as to how to approach this scenario it is greatly appreciated.
Use two different channel adapters with a pub-sub channel, or an outbound gateway followed by an outbound channel adapter.
If necessary, start the transaction(s) upstream of both; if you want true atomicity you would need to use an XA transaction manager and XA datasources. Or, you can get close by synchronizing the two transactions so they get committed very close together.
See Dave Syer's article "Distributed transactions in Spring, with and without XA" and specifically the section on Best Efforts 1PC.

Cassandra TimedOutException and data modification for batch updates

I execute a batch update which modifies a few rows within a few column families. In case of a TimedOutException some data could be modified, but possibly not the whole set....
In order to implement a compensating transaction, I would need to know what data (rows) was modified - is there a way to find this out? Does the exception contain this information?
Thanks,
Maciej
Creating a system that can scale out means taking some trade-offs - one of these is facilitating "idempotent" operations in your application.
This means that you would either:
assume that the data was written somewhere and that the node will
eventually become consistent
fire the entire contents of the write again, perhaps sleeping a given amount of time or
at a less restrictive consistency level
A good description of this approach can be found in section 6 of Pat Helland's "Building on Quicksand" paper: http://arxiv.org/pdf/0909.1788
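As a driver-agnostic illustration of the second option (every name below is hypothetical), a retry wrapper around an idempotent batch write might look like:

// Re-fires the whole batch after a timeout, sleeping between attempts.
// 'writeBatch' stands in for whatever driver call performs the mutation and is
// assumed to throw TimeoutException when the coordinator times out.
public static class RetryingWriter
{
    public static async Task WriteWithRetryAsync(Func<Task> writeBatch, int maxAttempts = 3)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                await writeBatch(); // safe to repeat only because the batch is idempotent
                return;
            }
            catch (TimeoutException) when (attempt < maxAttempts)
            {
                await Task.Delay(TimeSpan.FromMilliseconds(200 * attempt)); // back off, then re-send everything
            }
        }
    }
}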
