QLDB Transaction Isolation - amazon-qldb

I was looking at the sample code here: https://docs.aws.amazon.com/qldb/latest/developerguide/getting-started.python.step-5.html, and noticed that the three functions get_document_id_by_gov_id, is_secondary_owner_for_vehicle and add_secondary_owner_for_vin are executed separately in three driver.execute_lambda calls. So if there are two concurrent requests that are trying to add a secondary owner, would this trigger a serialization conflict for one of the requests?
The reason I'm asking is that I initially thought we would have to run all three functions within the same execute_lambda call in order for the serialization conflict to happen since each execute_lambda call uses one session, which in turn uses one transaction. If they are run in three execute_lambda calls, then they would be spread out into three transactions, and QLDB wouldn't be able to detect a conflict. But it seems like my assumption is not true, and the only benefit of batching up the function calls would just be better performance?

Got an answer from a QLDB specialist so going to answer my own question: the operations should have been wrapped in a single transaction so my original assumption was actually true. They are going to update the code sample to reflect this.
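For anyone landing here, a minimal sketch of what "wrapped in a single transaction" could look like. It assumes `driver` is a pyqldb `QldbDriver` (or anything exposing `execute_lambda`), and `helpers` is a hypothetical object bundling the three tutorial functions; the argument shapes shown are simplified assumptions, not the exact tutorial signatures.

```python
def register_secondary_owner(driver, helpers, vin, gov_id):
    """Look up, check, and insert inside ONE execute_lambda call.

    Because all reads and the write share one transaction, a concurrent
    transaction touching the same data can be detected as a conflict at
    commit time instead of slipping through between separate calls.
    """

    def work(txn):
        # 1. Resolve the person's document id (read).
        person_id = helpers.get_document_id_by_gov_id(txn, gov_id)
        # 2. Check whether the person is already a secondary owner (read).
        if helpers.is_secondary_owner_for_vehicle(txn, vin, person_id):
            return False
        # 3. Add the person as a secondary owner (write).
        helpers.add_secondary_owner_for_vin(txn, vin, person_id)
        return True

    # One call, one transaction. The driver may re-run `work` on a
    # conflict, so it should have no side effects outside the transaction.
    return driver.execute_lambda(work)
```

The function returns True when the owner was added and False when the person was already listed.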

Related

Rust - Is `Lazy::new` guaranteed to be called only once even when running tests in parallel?

I am testing some code that expects a postgres database with a particular set of data in the tables. I have a function that will connect to a database, ensure the data is correct, then return a connection. Once this data is set up, all the queries are read-only, so I'm not worrying about synchronizing the rest of the tests.
I'm worried about potential race conditions if this "ensure the database is correct" step is called twice at the same time.
I've wrapped this setup function in a once_cell::sync::Lazy, to try to guarantee that it is called only once per cargo test invocation, but I'm not sure this is actually guaranteeing what I think it is.
Note, this also includes doctests, which I understand are run with an entirely different model to regular unit tests. Does my use of Lazy still guarantee that the function is only run once, even in these circumstances?
This feels like a relatively common problem, but I've had trouble finding better solutions on the internet so far.

Does node-cache use locks?

I'm trying to understand if the node-cache package uses locks for the cache object and can't find anything.
I tried to look at the source code and it doesn't look like it, but this answer suggests otherwise with the quote:
So there is Redis and node-cache for memory locks.
This cache is used in a CRUD server and I want to make sure that GET/UPDATE requests will not create a race condition on the data.
I don't see any evidence of locking in the code.
If two requests for the same key which is not in the cache are made one after the other, then it will launch two separate fetch() operations and whichever request comes back last is the one that will remain in the cache. This is probably not normally a problem, but an improved implementation could make only one request for that same key and have the second request just wait for the first request to provide the value that was already in flight.
Since the cache itself is all in-memory, all access to the cache is synchronous and thus regulated by JavaScript's single-threaded nature. So the only place where concurrency issues could affect the cache code itself is where it launches an asynchronous fetch() operation.
There are, of course, race conditions waiting to happen in how one uses the code that accesses the data just like there are with a database interface so the calling code has to be smart about how it uses the interface to avoid creating race conditions because of how it calls things.
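The "only one request per key" improvement mentioned above can be sketched as follows. This is a Python asyncio illustration of the idea, not node-cache's API; `SingleFlightCache` and the `fetch` callable are hypothetical names.

```python
import asyncio


class SingleFlightCache:
    """In-memory cache that shares one in-flight fetch per missing key."""

    def __init__(self, fetch):
        self._fetch = fetch      # async callable: key -> value (assumed)
        self._values = {}        # key -> cached value
        self._in_flight = {}     # key -> asyncio.Task currently loading it

    async def get(self, key):
        if key in self._values:
            return self._values[key]
        task = self._in_flight.get(key)
        if task is None:
            # First caller for this key starts the fetch.
            task = asyncio.create_task(self._load(key))
            self._in_flight[key] = task
        # Later callers just wait for the fetch already in flight.
        return await task

    async def _load(self, key):
        try:
            value = await self._fetch(key)
            self._values[key] = value
            return value
        finally:
            # Success or failure, the key is no longer in flight.
            self._in_flight.pop(key, None)
```

Two concurrent `get("a")` calls on an empty cache then trigger a single `fetch("a")`, and both callers receive the same value.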
Unfortunately, no. You can write a unit test to confirm it.
I have written a library to fix that, and also added a read-through method to make the code easier to use:
https://github.com/KhanhPham2411/node-cache-async-lock

DDD - How to modify several AR (from different bounded contexts) throughout single request?

I would like to present a small scenario that is still at the paper stage and that, with respect to DDD principles, seems a bit tedious to accomplish.
Let's say I have an application for hosting account management. Basically, the application comprises several bounded contexts such as Web account management, FTP account management, Mail account management... each of them represented by its own AR (they can live standalone).
Now, let's imagine I want to provide a UI with an HTML form that contains one fieldset for each bounded context, for instance to update limits and/or features. How exactly should I proceed to update all ARs without breaking the single-transaction-per-request principle? Can I create a kind of "outer" AR, let's say a ClientHostingProperties AR, which would hold references to the other ARs and update them as part of a single transaction, using its own repository? Or would it be better to create an AR that emits messages for listeners provided by the bounded contexts to react to, in which case I should probably think about ES?
Thanks.
How exactly should I proceed to update all ARs without breaking the single-transaction-per-request principle?
You are probably looking for a process manager.
Basic sketch: persisting the details from the submitted form is a transaction unto itself (you are offered an opportunity to accrue business value; step 1 is to capture that opportunity).
That gives you a way to keep track of whether or not this task is "done": you compare the changes in the task to the state of the system, and fire off commands (to run in isolated transactions) to make changes.
Processes, in my mind, end up looking a lot like state machines: these commands are done, these commands are not done, these commands have failed, now what? Eventually the process reaches a state where there are no additional changes to be made, and this instance of the process is "done".
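A rough sketch of such a process as a state machine, in Python. All names (`HostingUpdateProcess`, the per-context handlers) are hypothetical; each handler is assumed to run its change in its own isolated transaction inside its bounded context.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    DONE = "done"
    FAILED = "failed"


@dataclass
class HostingUpdateProcess:
    """Tracks one submitted form: one requested change per bounded context."""

    changes: dict                                # context name -> change
    status: dict = field(default_factory=dict)   # context name -> Status

    def __post_init__(self):
        for context in self.changes:
            self.status.setdefault(context, Status.PENDING)

    def step(self, handlers):
        """Fire a command for every context whose change is still pending."""
        for context, change in self.changes.items():
            if self.status[context] is not Status.PENDING:
                continue
            try:
                handlers[context](change)        # isolated transaction
                self.status[context] = Status.DONE
            except Exception:
                self.status[context] = Status.FAILED

    def retry_failed(self):
        """Put failed commands back to pending so the next step retries them."""
        for context, state in self.status.items():
            if state is Status.FAILED:
                self.status[context] = Status.PENDING

    @property
    def done(self):
        return all(state is Status.DONE for state in self.status.values())
```

Persisting the process instance itself (the captured form plus the status map) would be the first transaction; what to do with FAILED commands (retry, compensate, alert) is a business decision that this sketch leaves open.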
Short answer: You don't.
An aggregate is a transactional boundary, which means that if you would update multiple aggregates in one "action", you'd have to use multiple transactions. The reason for an aggregate to be equivalent to one transaction is that this allows you to guarantee consistency.
This means that you have two options:
1. You can make your aggregate larger. Then you can actually guarantee consistency, but your ability to handle concurrent requests gets worse. So this is usually what you want to avoid.
2. You can live with the fact that it's two transactions, which means you are eventually consistent. If so, you usually use something such as a process manager or a flow to handle updating multiple aggregates. In its simplest form, a flow is nothing but a simple "if this event happens, run that command" rule. In its more complex form, it has its own state.
Hope this helps 😊

Replacing bad performing workers in pool

I have a set of actors that are somewhat stateless and perform similar tasks.
Each of these workers is unreliable and potentially low performing. In my design, I can easily spawn more actors to replace lazy ones.
The performance of an actor is assessed by itself. Is there a way to make the supervisor/actor pool do this assessment, to help decide which workers are slow enough for me to replace? Or is my current strategy "the" right strategy?
I'm new to Akka myself, so I'm only trying to help, but my approach would be something along the following lines:
1. Write your own routing logic, something along the lines of https://github.com/akka/akka/blob/v2.3.5/akka-actor/src/main/scala/akka/routing/SmallestMailbox.scala Keep in mind that a new instance is created for every pool, so each instance can store information about how many messages have been processed by each actor so far. In this instance, once you find an actor underperforming, mark it as 'removable' (once it is no longer processing any new messages) in a separate data structure and stop sending it further messages.
2. Write your own router pool: override createRouterActor https://github.com/akka/akka/blob/v2.3.5/akka-actor/src/main/scala/akka/routing/RouterConfig.scala:236 to provide your own CustomRouterPoolActor.
3. Write your CustomRouterPoolActor along the following lines: https://github.com/akka/akka/blob/8485cd2ebb46d2fba851c41c03e34436e498c005/akka-actor/src/main/scala/akka/routing/Resizer.scala (see ResizablePoolActor). This actor will have access to your strategy instance. From this strategy instance, remove the routees already marked for removal. Look at ResizablePoolCell to see how to remove actors.
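The bookkeeping part of the steps above can be sketched independently of Akka. The following Python illustration is not Akka's API: the class name, the round-robin selection, and the "less than half the average throughput" criterion are all assumptions chosen for the example.

```python
class TrackingRoutingLogic:
    """Counts processed messages per worker and sidelines slow workers."""

    def __init__(self, workers, min_share=0.5):
        self.processed = {worker: 0 for worker in workers}
        self.removable = set()      # workers that get no further messages
        self.min_share = min_share  # assumed threshold: fraction of average
        self._next = 0

    def _active(self):
        return [w for w in self.processed if w not in self.removable]

    def select(self):
        """Pick the next worker round-robin, skipping removable ones."""
        active = self._active()
        if not active:
            raise RuntimeError("no active workers left")
        worker = active[self._next % len(active)]
        self._next += 1
        return worker

    def record_processed(self, worker):
        """Call when a worker reports that it finished a message."""
        self.processed[worker] += 1

    def mark_underperformers(self):
        """Flag workers whose count is below min_share of the average."""
        active = self._active()
        if not active:
            return set(self.removable)
        average = sum(self.processed[w] for w in active) / len(active)
        for worker in active:
            if self.processed[worker] < self.min_share * average:
                self.removable.add(worker)
        return set(self.removable)
```

A pool actor could call `mark_underperformers()` periodically, stop the flagged routees once they are idle, and spawn replacements.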
The question is: why do some of your workers perform badly? Is there any difference between them (I assume not)? If not, then maybe some payloads simply require more work than others, so what's the point of terminating the workers?
We once had a similar problem and used SmallestMailboxRoutingLogic. It basically tries to distribute the workload based on mailbox sizes.
Anyway, I would rather try to answer the question of why some of the workers are unstable and perform poorly, because this looks like the bigger problem, which you are just trying to cover up elsewhere.

How does nodejs-redis(&connect-redis) deal with sync and async?

I use connect-redis as my session store, and when I use req.session, all the operations on it seem to be synchronous: it's like operating on ordinary JavaScript variables, and the code runs in order. But when I checked the source code, I saw that it works asynchronously, so I wonder why req.session behaves like that.
Another question is that if I have multiple redis queries,
client.sadd('test', 1);
client.del('test');
client.sadd('test', 2);
client.sadd('test', 3);
no matter where I put the del operation, the results are always the same. I thought these queries might be run in any order, right? Since they are all called asynchronously, I expected the results to be different every time.
Thanks for your help
The fact that roundtrips to the Redis server are managed asynchronously does not mean the queries will be sent in random order.
Redis (and therefore most Redis client libraries) supports pipelining, generally used to optimize the number of roundtrips. The idea is to send multiple queries, and then wait for the replies. The order is critical, because it is used by the client to match queries and replies.
Node.js is very well suited to support this kind of mechanism. Matt Ranney's node_redis client supports pipelining in a transparent way. Provided the same client object is used, all the queries will be serialized and executed in order.
In your example, it is normal that the queries are always executed in the same order. You can check this point by using the monitor command to display the flow of queries sent to Redis.
Now, it is important that the last query of the pipeline is associated with a callback, otherwise your program will never know when the last query is complete.
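To make the ordering argument concrete, here is a toy model in Python. It is not node_redis internals: `PipelinedClient` and `ToySetServer` are hypothetical stand-ins that only show why queueing commands in call order and matching replies first-in, first-out gives a deterministic result.

```python
from collections import deque


class ToySetServer:
    """Stand-in server that handles commands one at a time, in order."""

    def __init__(self):
        self.sets = {}

    def handle(self, command):
        op, key, *args = command
        if op == "sadd":
            members = self.sets.setdefault(key, set())
            added = len(set(args) - members)
            members.update(args)
            return added
        if op == "del":
            return 1 if self.sets.pop(key, None) is not None else 0
        raise ValueError("unsupported command: " + op)


class PipelinedClient:
    """Queues commands in call order and matches replies to callbacks FIFO."""

    def __init__(self, server):
        self._server = server
        self._outbox = deque()      # commands sent, not yet answered
        self._callbacks = deque()   # one entry per command, same order

    def send(self, command, callback=None):
        # Returns immediately, like an asynchronous call.
        self._outbox.append(command)
        self._callbacks.append(callback)

    def flush(self):
        # Replies arrive in the same order the commands were sent.
        while self._outbox:
            reply = self._server.handle(self._outbox.popleft())
            callback = self._callbacks.popleft()
            if callback is not None:
                callback(reply)
```

Sending the four commands from the question through one client always leaves the set containing 2 and 3, because the del is processed after the first sadd and before the last two. Only the callback on the final command tells the caller that everything has completed.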
