On the idempotence of GET - get

I have been doing some reading on the GET HTTP method and in particular on its idempotent quality.
This is my understanding: if I call a GET operation 1 time or a million times (or any number of times) the result should be the same.
My problem with this definition is this.
Imagine if I have a database of films and I perform a GET operation in which I return all the James Bond films in the database.
Imagine I run this query a million times and after the 500,000th time someone else runs a POST query on the database adding a new Bond film.
Well, now half the GET operations return N results and the other half return N+1 results.
Does this not then break idempotence as it is usually described?
Would not a better definition be that the idempotence of a function is that it returns the same results no matter how many times it is executed as long as the underlying data does not change?

GET is idempotent because it does not (or should not) change the resource. This does not require that the resource be static or that nothing else (such as a POST) ever changes it.

The idempotence is about the fact that the GET calls do not change the resource being called.
What other methods do is a different matter.
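
To make the distinction concrete, here is a minimal sketch using a hypothetical in-memory film list rather than a real HTTP server or database. The function names `get_bond_films` and `post_bond_film` are assumptions for illustration only. Repeating the GET never changes the resource; a POST from someone else does, and later GETs simply reflect that.

```python
# Hypothetical in-memory "films" resource; all names here are illustrative.
films = ["Dr. No", "Goldfinger"]


def get_bond_films():
    """GET: return a copy of the current state without modifying it."""
    return list(films)


def post_bond_film(title):
    """POST: modify the resource (not idempotent)."""
    films.append(title)


first = get_bond_films()
for _ in range(1000):
    get_bond_films()  # repeating the GET leaves the resource untouched
count_after_gets = len(films)

post_bond_film("Skyfall")  # someone else changes the resource
second = get_bond_films()  # GET now sees N+1 films, yet GET changed nothing
```

The result of the GET differs before and after the POST, but no number of GET calls alters the list, which is the property idempotence refers to.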

Related

Recalculation of exchanges done twice

When using the function ParameterManager.recalculate() to get the updated values of all the parameterized exchanges in my database, the functions ActivityParameter.recalculate() and ActivityParameter.recalculate_exchanges() are applied to all the parameter groups. But it seems that ActivityParameter.recalculate_exchanges() is run twice, because it is also called inside ActivityParameter.recalculate(). When I delete one of the calls, I get the same results twice as fast (which is what I was looking for, because otherwise my calculation takes a long time). Is there a reason for running the function twice? Is it safe to delete one call to get faster results? Is there any other way to reduce the duration of this calculation?
You are totally right - ParameterManager.recalculate calls both ActivityParameter.recalculate and ActivityParameter.recalculate_exchanges, but ActivityParameter.recalculate already calls ActivityParameter.recalculate_exchanges. This makes things slower, but doesn't break anything.
This duplication has been removed; you can safely do the same.

Actor model: how to reach data integrity in this specific situation?

I'm new to the actor model and Orleans, so any suggestions on good practices for solving the following task are very much appreciated:
We have [service1], which runs some logic and stores some results in a relational database (a legacy thing). Now, somewhere in the middle, we want to call an Orleans actor, [Actor1], which holds a list of numbers, to get the next available number. The goal of [Actor1] is to hand out the numbers sequentially and consistently: no skipping and no duplication is allowed, so it's a sort of single-threaded stack. It is single-threaded not only per process but across the whole cluster of services, which is exactly what we need.
[service1] -> [Actor1]
Now the only problem I see here is that [service1] can fail with an exception after it takes the next number but before it stores the results in the database. The number has been taken from the single-threaded stack, but it is lost because the calling application did not manage to store the results based on it in the database. In other words, I do not want the actor to hand out the next number until it is sure that the last one has been put to good use, and only the calling application knows whether it has.
How would you suggest handling these situations? Can I somehow keep the Orleans actor's job open until the calling service (or another actor) commits it to the database?
This is a Byzantine problem, so there is no easy solution: there will be "holes" in the number sequence or you will use the same number twice.
I would prefer to get holes and fill them with dummy data later if necessary (e.g. if this is a billing system, at the end of the month enter a cancelled empty bill for each bill number that is a "hole").
Even in SQL, an INSERT followed by a ROLLBACK will leave the sequence incremented in an auto-increment primary-key ID column, so there can be holes after a failure.

How should I guarantee consistency in a database involving financial transaction operations

I am trying to figure out how to handle consistency in the database.
The scenario:
User A has an accounting document in the database that includes a balance field representing the amount of money he currently has (suppose he initially has 100$).
My system has many methods that charge his account.
Suppose 2 methods run at the same time, each charging him 10$, and the steps occur concurrently in the following order:
Method 1 READs his balance and stores it in memory (100$)
Method 2 READs his balance and stores it in memory (100$)
... some business logic
Method 1 UPDATEs his balance by subtracting 10 from the value in memory (100$ - 10$) and then saves it
Method 2 UPDATEs his balance by subtracting 10 from the value in memory (100$ - 10$) and then saves it
This means he has been charged only 10$ instead of 20$.
I have searched for this situation for a while and cannot get it clear (sorry for my stupidity).
I would really appreciate your help in enlightening me. :)
You just discovered why financial transactions are complicated :-)
Have you ever wondered why it takes time for you to have an updated balance in your bank account? Or why you actually have two balances, instead of one?
That's because your account can actually go negative and (up to a certain point) that will be fine.
So in a real life scenario what happens is that you have a balance of 100$, you pay 10$ and until that transaction is processed and confirmed by the receiver, you still have your 100$. If you do 20 transactions of 10$ each, you'll be able to complete them because the system will most likely not be able to notice.
And honestly, it shouldn't. Think of credit cards, you might not have enough money now, but maybe you know you'll have enough when the credit is due.
So, the race condition you describe only works if you actually read the value and then update it.
There are a few approaches:
Read the current balance, and update the row using the old balance as a condition in the WHERE clause. This way, if the update affects no rows, you know that you need to re-read and try again.
Don't update the balance and only do it time-based, say once per hour. Yes, you might still have to do some checks, but the system will overall be more responsive.
Lock the database row as your first step. This would work but there's a chance that it will make the app slower.
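
The first approach (using the old balance as a condition of the update) can be sketched as follows. This is a minimal illustration with Python's built-in sqlite3 module and an in-memory database. The table name `accounts` and the helpers `try_charge` and `charge` are assumptions for the example, not part of the original question, and balances are stored as whole dollars.

```python
import sqlite3


def try_charge(conn, account_id, old_balance, amount):
    """Write the new balance only if the row still holds old_balance."""
    cur = conn.execute(
        "UPDATE accounts SET balance = ? WHERE id = ? AND balance = ?",
        (old_balance - amount, account_id, old_balance),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows means someone else updated first


def charge(conn, account_id, amount, max_retries=5):
    """Read, then update conditionally; re-read and retry on conflict."""
    for _ in range(max_retries):
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (account_id,)
        ).fetchone()
        # ... business logic would run here ...
        if try_charge(conn, account_id, balance, amount):
            return True
    return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100)")
conn.commit()

# Simulate the race: both methods read 100 before either one writes.
first_ok = try_charge(conn, 1, 100, 10)   # wins, balance becomes 90
second_ok = try_charge(conn, 1, 100, 10)  # stale read, matches no rows
third_ok = charge(conn, 1, 10)            # re-reads 90 and succeeds
(final_balance,) = conn.execute(
    "SELECT balance FROM accounts WHERE id = 1"
).fetchone()
```

The stale write is rejected instead of silently overwriting the first one, so after the retry the account has been charged twice.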
The race condition you describe is a low-level design concern. With a backend engine like Node, which handles incoming requests in a first-come, first-served fashion, you don't need to think about this case as long as you respect the order in which database update callbacks are fired: they are fired in the same order in which they were issued. So you should issue the next update only when the previous one has finished. Promises are a great way to do this. Note that this only serializes updates issued from a single process.

EventSourcing race condition

Here is a nice article that describes what ES is and how to deal with it.
Everything is fine there, but one image is bothering me. Here it is
I understand that in distributed event-based systems we are able to achieve eventual consistency only. Anyway ... How do we ensure that we don't book more seats than available? This is especially a problem if there are many concurrent requests.
It may happen that n aggregates are populated with the same amount of reserved seats, and all of these aggregate instances allow reservations.
"I understand that in distributed event-based systems we are able to achieve eventual consistency only. Anyway ... How do we ensure that we don't book more seats than available? This is especially a problem if there are many concurrent requests."
All events are private to the command running them until the book of record acknowledges a successful write. So we don't share the events at all, and we don't report back to the caller, without knowing that our version of "what happened next" was accepted by the book of record.
The write of events is analogous to a compare-and-swap of the tail pointer in the aggregate history. If another command has changed the tail pointer while we were running, our swap fails, and we have to mitigate/retry/fail.
In practice, this is usually implemented by having the write command to the book of record include an expected position for the write. (Example: ES-ExpectedVersion in GES).
The book of record is expected to reject the write if the expected position is in the wrong place. Think of the position as a unique key in a table in a RDBMS, and you have the right idea.
This means, effectively, that the writes to the event stream are actually consistent -- the book of record only permits the write if the position you write to is correct, which means that the position hasn't changed since the copy of the history you loaded was written.
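
The expected-position check described above can be sketched with a toy in-memory store. This is an illustration under assumptions, not the API of any real event store: `EventStore`, `ConcurrencyError` and `reserve_seat` are hypothetical names, and a real book of record would perform the check and the append atomically.

```python
class ConcurrencyError(Exception):
    """Raised when the stream moved on since the caller loaded it."""


class EventStore:
    """Toy in-memory book of record (single-threaded sketch)."""

    def __init__(self):
        self._streams = {}

    def load(self, stream_id):
        events = self._streams.get(stream_id, [])
        return list(events), len(events)  # history and current version

    def append(self, stream_id, new_events, expected_version):
        stream = self._streams.setdefault(stream_id, [])
        if len(stream) != expected_version:
            raise ConcurrencyError(
                f"expected version {expected_version}, found {len(stream)}"
            )
        stream.extend(new_events)


def reserve_seat(store, stream_id, capacity):
    """Command handler: validate against the history, then append."""
    history, version = store.load(stream_id)
    reserved = sum(1 for event in history if event == "SeatReserved")
    if reserved >= capacity:
        return False  # sold out
    store.append(stream_id, ["SeatReserved"], expected_version=version)
    return True


store = EventStore()
_, version_a = store.load("show-1")  # command A loads an empty history
_, version_b = store.load("show-1")  # command B loads the same history

store.append("show-1", ["SeatReserved"], expected_version=version_a)
try:
    store.append("show-1", ["SeatReserved"], expected_version=version_b)
    conflict = False
except ConcurrencyError:
    conflict = True  # B lost the compare-and-swap

retry_ok = reserve_seat(store, "show-1", capacity=1)  # reload: sold out
history, _ = store.load("show-1")
```

Both commands saw zero reserved seats, but only one write is accepted; the loser has to reload the history, at which point its validation correctly refuses the booking.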
It's typical for commands to read event streams directly from the book of record, rather than the eventually consistent read models.
"It may happen that n aggregate roots will be populated with the same number of reserved seats, which means that having validation in the reserve method won't help. Then all n aggregate roots will emit the event of a successful reservation."
Every bit of state needs to be supervised by a single aggregate root. You can have n different copies of that root running, all competing to write to the same history, but the compare and swap operation will only permit one winner, which ensures that "the" aggregate has a single internally consistent history.
There are going to be a couple of ways to deal with such a scenario.
First off, an event stream would have, as its current version, the version of the last event added. This means that you would not, or should not, be able to persist the event stream if it is no longer at the version it had when it was loaded. Since the very first write would cause the version of the event stream to be increased, the second write would not be permitted. Since events are not emitted, per se, but are rather a result of the event sourcing, we would not have the type of race condition in your example.
Well, if your commands are processed behind a queue any failures should be retried. Should it not be possible to process the request you would enter the normal "I'm sorry, Dave. I'm afraid I can't do that" scenario by letting the user know that they should try something else.
Another option is to start the processing by issuing an update against some table row to serialize any calls to the aggregate. Probably not the most elegant but it does cause a system-wide block on the processing.
I guess, to a large extent, one cannot really trust the read store when it comes to transactional processing.
Hope that helps :)

Why limit commands and events to one aggregate? CQRS + ES + DDD

Please explain why modifying many aggregates at the same time is a bad idea when doing CQRS, ES and DDD. Are there any situations where it could still be OK?
Take for example a command such as PurgeAllCompletedTodos. I want this command to lead to one event that updates the state of each completed Todo aggregate by setting IsActive to false.
Why is this not good?
One reason I could think of:
When updating the domain state it's probably good to limit the transaction to a well-defined part of the entire state, so that only this part needs to be write-locked during the update. Doing so would allow many writes on different aggregates in parallel, which could boost performance in some extremely heavy scenarios.
The answer to the question lies in the meaning of "aggregate".
First of all, I would say that you are not modifying 'n' aggregates; you are modifying 'n' entities.
An aggregate can contain more than one entity and is essentially a transactional concept: the aggregate pattern is used when you need to modify the state of more than one entity in your application transactionally (either all are modified or none).
Now, why would you modify more than one aggregate with one command?
If you feel this need, before doing anything else check your aggregate boundaries to see whether you can change them to remove the need for 1 command -> 'n' aggregates.
An aggregate can contain many entities of the same type, so for your command PurgeAllCompletedTodos you could also think about expanding the transaction boundary from a single Todo to an aggregate, UserTodosAggregate, that contains all of a user's todos and handles all the commands for the todos of a single user.
In this way you can modify all the todos of a user in a single transaction.
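
As a rough sketch of that widened boundary, the aggregate below owns all of one user's todos and handles the purge as a single state change that records a single event. The field names, the event name `CompletedTodosPurged` and the method `purge_all_completed` are assumptions made for illustration; persistence and transactions are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Todo:
    title: str
    completed: bool = False
    is_active: bool = True


@dataclass
class UserTodosAggregate:
    """Hypothetical aggregate owning every todo of one user."""

    user_id: str
    todos: list = field(default_factory=list)
    events: list = field(default_factory=list)

    def purge_all_completed(self):
        """Handle PurgeAllCompletedTodos inside one aggregate boundary."""
        purged = [t.title for t in self.todos if t.completed and t.is_active]
        for todo in self.todos:
            if todo.completed:
                todo.is_active = False
        # One event describes the whole change for this aggregate.
        self.events.append(("CompletedTodosPurged", purged))
        return purged


aggregate = UserTodosAggregate(
    user_id="user-1",
    todos=[
        Todo("write report", completed=True),
        Todo("buy milk"),
        Todo("call bank", completed=True),
    ],
)
purged_titles = aggregate.purge_all_completed()
still_active = [t.title for t in aggregate.todos if t.is_active]
```

Because every affected todo lives inside the same aggregate, the command touches exactly one consistency boundary.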
If this still doesn't solve your problem because, let's say, you need to purge all the completed todos of every user in the application, you would still need to send a command to 'n' aggregates. The aggregate boundary doesn't help, so we could think of having an AllApplicationTodosAggregate that handles the command.
This probably isn't the best solution because, as you said, that command would block ALL the todos of the application, but always check whether it can be a good trade-off (this blocking aspect is explained very well in both the Blue Book and the Red Book of DDD).
What if I need to modify some entities and can't have them in a single aggregate?
Given the above, a command that modifies more than one aggregate is bad because of transactions. What if you modify 3 aggregates, the first one succeeds, and then the server shuts down?
In this case what you have is a series of individual modifications that need to be managed to prevent inconsistency in the system.
This can be done using a process manager, whose responsibilities are to modify all the aggregates by sending them the right commands and to handle failures if they happen.
Each aggregate still receives its own command, but the process manager is in charge of sending them in whatever way it chooses (one at a time, all in parallel, 5 at a time, whatever you want).
So you can have a strategy to manage failures between two transactions and make decisions like: "if something fails, roll back all the modifications done until now" (sending a rollback command to each aggregate), or "if an operation fails, repeat it 3 times, once every 30 minutes, and if it still doesn't work then roll back", or "if something fails, create a notification for the system admin".
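
One of these strategies (retry a few times, then roll back what was already done) might look like the sketch below. The function `run_process` and the `do`/`undo` callbacks are hypothetical names, the retries happen immediately rather than after a delay, and a real process manager would also persist its own progress so it can resume after a crash.

```python
def run_process(aggregate_ids, do, undo, max_attempts=3):
    """Send one command per aggregate; compensate if one keeps failing."""
    done = []
    for aggregate_id in aggregate_ids:
        for attempt in range(max_attempts):
            try:
                do(aggregate_id)  # one command, one aggregate, one transaction
                done.append(aggregate_id)
                break
            except Exception:
                if attempt == max_attempts - 1:
                    # Give up: send a rollback command to each aggregate
                    # that was already modified, newest first.
                    for previous in reversed(done):
                        undo(previous)
                    return False
    return True


# Toy state standing in for three aggregates: id -> is_active
todos = {"a": True, "b": True, "c": True}
attempts = []


def purge(todo_id):
    attempts.append(todo_id)
    if todo_id == "c":
        raise RuntimeError("simulated failure")
    todos[todo_id] = False


def restore(todo_id):
    todos[todo_id] = True


succeeded = run_process(["a", "b", "c"], purge, restore)
```

Here the third aggregate fails on every attempt, so the two earlier modifications are compensated and the system ends up back in its original state.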
(sorry for the long post, at least hope it helps)

Resources