DynamoDB Application Architecture - node.js

We are using DynamoDB with node.js and Express to create REST APIs. We have started to go with Dynamo on the backend, for simplicity of operations.
We have started to use the DynamoDB Document SDK from AWS Labs to simplify usage, and make it easy to work with JSON documents. To instantiate a client to use, we need to do the following:
var AWS = require('aws-sdk');               // core AWS SDK
var Doc = require('dynamodb-doc');          // AWS Labs document client wrapper
var Dynamodb = new AWS.DynamoDB();          // low-level DynamoDB service client
var DocClient = new Doc.DynamoDB(Dynamodb); // document client that works with plain JSON items
My question is, where do those last two steps need to take place in order to ensure data integrity? I'm concerned about an object that is waiting for something to happen in Dynamo being taken over by another process and getting its data swapped, resulting in incorrect data being sent back to a client or written to the database.
We have three parts to our REST API. The main server.js file starts Express and the HTTP server, assigns resources to it, sets up logging, etc. We do the first two steps of creating the connection to Dynamo, the AWS and Doc requires, at that point; those vars are global in the app. Then, depending on the route being followed through the API, we call a controller that parses the input from the REST call. The controller calls a model file, which does the interacting with Dynamo and provides the response back to the controller, which formats the return package along with any errors and sends it to the client. The model is simply a group of methods that cover the same area of the app; we would have a user model, for instance, that covers things like login and account creation.
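To make the layering concrete, here is a stripped-down, single-file sketch of what we have (the real code is split across files, and the handler bodies are simplified):

// server.js - Express app, logging, and the AWS/Doc requires (steps one and two) live here
var express = require('express');
var AWS = require('aws-sdk');
var Doc = require('dynamodb-doc');

var app = express();
app.use(express.json());

// "controller": parses the input from the REST call and formats the response
app.post('/login', function (req, res) {
  userModel.login(req.body, function (err, result) {
    if (err) return res.status(500).json({ error: err.message });
    res.json(result);
  });
});

// "model": the only layer that talks to Dynamo; steps three and four go somewhere in here
var userModel = {
  login: function (input, callback) {
    // var Dynamodb = new AWS.DynamoDB();
    // var DocClient = new Doc.DynamoDB(Dynamodb);
    // ...Dynamo calls with DocClient...
    callback(null, { ok: true });
  }
};

app.listen(3000);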
I have done the last two steps above, creating the Dynamo objects, in two places. One, I have simply placed them in one spot, at the top of each model file; I do not reinstantiate them in the methods below, I simply use them. I have also instantiated them within the methods, when we are preparing to make the call to Dynamo, making them entirely local to the method and passing them to a secondary function if needed. This second approach has always struck me as the safest way to do it. However, under load testing I have run into situations where we seem to have overwhelmed the outgoing network connections, and I start getting errors telling me that the DynamoDB endpoint is unavailable in the region I'm running in. I believe this comes from the additional calls required to make the connections.
So, the question is: is creating those objects once at the top of the model file safe, or do they need to be created locally in the method that uses them? Any thoughts would be much appreciated.

You should be safe creating just one instance of those clients and sharing them in your code, but that isn't related to your underlying concern.
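For the sharing part, one simple pattern (module and variable names here are only a suggestion) is to create the clients once in a small module and require it from every model; Node's module cache guarantees every require() returns the same instances:

// lib/dynamo.js - evaluated once, then cached by require()
var AWS = require('aws-sdk');
var Doc = require('dynamodb-doc');

var dynamodb = new AWS.DynamoDB();
var docClient = new Doc.DynamoDB(dynamodb);

module.exports = {
  dynamodb: dynamodb,
  docClient: docClient
};

// models/userModel.js - and every other model
var docClient = require('../lib/dynamo').docClient;
// use docClient inside each method; no per-call instantiation needed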
Concurrent access to individual records in DynamoDB is still something you have to deal with: it is possible for different requests to attempt writes to the same item at the same time. This can happen with concurrent requests on a single server, and is especially likely when you have multiple servers.
Writes to DynamoDB are atomic only at the level of the individual item. This means that if your logic requires multiple updates to separate items, potentially in separate tables, there is no way to guarantee that all or none of those changes are made; it is possible that only some of them succeed.
DynamoDB natively supports conditional writes, so it is possible to ensure specific conditions are met (for example, that specific attributes still have certain values) before the write is applied; otherwise the write fails.
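As a rough sketch of a conditional write, shown here with the SDK's own AWS.DynamoDB.DocumentClient rather than the dynamodb-doc wrapper (the table, key, and attribute names are made up):

var AWS = require('aws-sdk');
var docClient = new AWS.DynamoDB.DocumentClient();

// Only apply the update if the version we read earlier is still the current one.
docClient.update({
  TableName: 'Users',
  Key: { userId: 'abc-123' },
  UpdateExpression: 'SET email = :email, version = :next',
  ConditionExpression: 'version = :expected',
  ExpressionAttributeValues: {
    ':email': 'new@example.com',
    ':next': 3,
    ':expected': 2
  }
}, function (err) {
  if (err && err.code === 'ConditionalCheckFailedException') {
    // another request wrote first: re-read the item and retry, or report a conflict
  }
});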
With respect to making too many requests to DynamoDB: unless you are overwhelming your own machine, there shouldn't be any way to overwhelm the DynamoDB API itself. If you are performing more reads/writes than you have provisioned, you will receive errors indicating that provisioned throughput has been exceeded, but the API is still functioning as intended under those conditions.

Related

What's a good design for making sure the Node.js Event Loop isn't blocked when adding potentially hundreds of records?

I just read this article from Node.js: Don't Block the Event Loop
The Ask
I'm hoping that someone can read over the use case I describe below and tell me whether or not I'm understanding how the event loop gets blocked, and whether or not my code actually blocks it. Also, any tips on how I can figure this out for myself would be useful.
My use case
I think I have a use case in my application that could potentially cause problems. I have functionality that enables a group to add members to their roster. Each member that doesn't represent an existing system user (the common case) gets an account created, including a dummy password.
The password is hashed with argon2 (using the default hash type), which means that even before I need to wait on a DB promise to resolve (with a Prisma transaction), I have to wait for each member's password to be generated.
I'm using Prisma for the ORM and Sendgrid for the email service and no other external packages.
A take-away I get from the article is that this blocks the event loop. Since there could potentially be hundreds of records generated (such as when importing contacts from a CSV or a cloud contact service), this seems significant.
To sum up what the route in question does, including some details omitted before:
Remove duplicates (requires one DB request & then some synchronous checking)
Check remaining for existing user
For non-existing users:
Synchronously create many records & push each to a separate array. One of these records requires async password generation for each non-existing user
Once the arrays are populated, send a DB transaction with all records
Once the transaction is cleared, create invitation records for each member
Once the invitation records are created, send emails in a MailData[] through SendGrid.
Clearly, there are quite a few tasks that must be done sequentially. If it matters, the asynchronous functions are also nested: createUsers calls createInvites calls sendEmails. In fact, from the controller, there is: updateRoster calls createUsers calls createInvites calls sendEmails.
There are architectural patterns aimed at avoiding the issues brought on by potentially long-running operations. Note that while your example is specific, any long-running process could be harmful here.
The first obvious pattern is the cluster. If your app is handled by multiple concurrent, independent event loops of a cluster, blocking one, ten, or even a thousand loops can be insignificant if your app is scaled to handle it.
Imagine a scenario where you have 10 concurrent loops and one is blocked for a longer time, but the 9 remaining ones are still serving short requests. Chances are, users would not even notice the temporary bottleneck caused by the one long-running request.
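A minimal sketch of that pattern with Node's built-in cluster module (the worker count and port are arbitrary here):

const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isMaster) {
  // one event loop per CPU core; a long-running request only ties up its own worker
  for (let i = 0; i < os.cpus().length; i++) {
    cluster.fork();
  }
  cluster.on('exit', (worker) => {
    console.log(`worker ${worker.process.pid} died, forking a replacement`);
    cluster.fork();
  });
} else {
  http.createServer((req, res) => {
    // your normal request handling (e.g. the Express app) goes here
    res.end(`handled by worker ${process.pid}\n`);
  }).listen(3000);
}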
Another, more general pattern is a separate long-running-process service, or Command-Query Responsibility Segregation (I'm bringing CQRS up here because the pattern description may introduce interesting ideas you might not be familiar with).
In this approach, long-running operations are not handled directly by the backend servers. Instead, the backend servers use a Message Queue to send requests to yet another service layer of your app, one that is dedicated solely to running these long-running jobs. The Message Queue is configured with a specific throughput, so if multiple long-running requests arrive in a short time they are queued; some of them may be delayed, but your resources stay under control. The backend that sends requests to the Message Queue doesn't wait synchronously; instead you need another form of return communication.
This auxiliary processing service can be maintained and scaled independently. The important part is that the service is never accessed directly from the frontend; it always sits behind a message queue with controlled throughput.
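One possible concrete shape of this, using BullMQ over Redis purely as an example queue (app is assumed to be an existing Express app, and the Redis connection details are placeholders):

// web process: enqueue the long-running job and answer the client immediately
const { Queue, Worker } = require('bullmq');
const connection = { host: 'localhost', port: 6379 };
const importQueue = new Queue('roster-import', { connection });

app.post('/groups/:id/members', async (req, res) => {
  const job = await importQueue.add('import', { groupId: req.params.id, members: req.body });
  res.status(202).json({ jobId: job.id }); // the client polls or is notified when it finishes
});

// worker process: deployed, throttled, and scaled independently of the web servers
new Worker('roster-import', async (job) => {
  // hash the passwords, run the Prisma transaction, create invites, send the emails
}, { connection, concurrency: 5 });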
Note that while the second approach is often implemented in real-life systems and solves most issues, it can still be incapable of handling some edge cases, e.g. when long-running requests come in faster than they are handled and the queue grows indefinitely.
Such cases require careful maintenance: you either scale your app to handle the traffic, or you introduce other rules that prevent users from running long processes too often.

Scaling Stateful NodeJS Services - Stickiness/Affinity Based on Object ID (not user session)

I am trying to find a good way to horizontally scale a stateful NodeJS service.
The Problem
The problem is that most of the options I find online assume the service is stateless. The NodeJS cluster documentation says:
Node.js [Cluster] does not provide routing logic. It is, therefore important to design an application such that it does not rely too heavily on in-memory data objects for things like sessions and login.
https://nodejs.org/api/cluster.html
We are using Kubernetes so scaling across multiple machines would also be easy if my service was stateless, but it is not.
Current Setup
I have a list of objects that stay in memory, each object alone is a transaction boundary. Requests to this service always have the object ID in the url. Requests to the same object ID are put into a queue and processed one at a time.
Desired Setup
I would like to keep this interface to the external world but internally spread this list of objects across multiple nodes and based on the ID in the URL the request would be routed to the appropriate node.
What is the usual way to do this in Node.js? I've seen people use the user session to make sure a given user always goes to the same node; what I would like to do is the same thing, but keyed on the ID in the URL instead of the user session.
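To illustrate the kind of routing I mean, a hand-rolled version would look roughly like the sketch below (the backend list, hash choice, and URL shape are just placeholders), though I assume there is a more idiomatic way to express this, especially on Kubernetes:

const http = require('http');
const crypto = require('crypto');
const httpProxy = require('http-proxy');

// internal nodes, each owning a shard of the in-memory objects
const backends = ['http://10.0.0.1:3000', 'http://10.0.0.2:3000', 'http://10.0.0.3:3000'];
const proxy = httpProxy.createProxyServer({});

function pickBackend(objectId) {
  // the same ID always maps to the same backend (as long as the backend list is stable)
  const hash = crypto.createHash('md5').update(objectId).digest();
  return backends[hash.readUInt32BE(0) % backends.length];
}

http.createServer((req, res) => {
  const objectId = req.url.split('/')[2]; // e.g. /objects/<id>/...
  proxy.web(req, res, { target: pickBackend(objectId) });
}).listen(8080);

I realise plain modulo would reshuffle object ownership whenever the backend list changes, which is presumably why consistent hashing (or hashing at the ingress/load balancer level) is normally used instead.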

How to share database connection between different lambda functions

I went through some articles about taking advantage of Lambda's container reuse and sharing things like database connections between multiple instances. However, what if I have multiple Lambda functions accessing the database and I want them to share the same connection, knowing that these functions call each other? For example, an API Gateway calls the authenticator Lambda function and then calls the insert-user function; both of these functions make calls to the database. Is it possible for them to share the same connection?
I'm using NodeJS but I can use a different language if it would support that.
You can't share connections between instances. Concurrent invocations do not use the same instance.
You can, however, share connections between invocations (which might be executed on the same container/instance). There you have to check whether your connection is still open, in which case you can reuse it; otherwise, open a new one.
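The usual shape of that, shown here with node-postgres purely as an example driver (the connection details are placeholders), is to declare the connection outside the handler so a warm container can reuse it, and to lazily (re)create it when needed:

const { Client } = require('pg');

let client = null; // survives between invocations for as long as the container stays warm

exports.handler = async (event) => {
  if (!client) {
    client = new Client({ connectionString: process.env.DATABASE_URL });
    await client.connect();
  }
  try {
    const result = await client.query('SELECT 1');
    return { statusCode: 200, body: JSON.stringify(result.rows) };
  } catch (err) {
    // if the connection went stale, drop it so the next invocation reconnects
    client = null;
    throw err;
  }
};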
If you are worried about too many connections to your DB, just close the connection when you exit your Lambda and instantiate a new one every time. You may also need to think about concurrency if that is a problem; a few weeks ago AWS added the ability to control concurrency on a per-function basis, which is neat.

How to share an object in multiple instances of nodejs?

I have functionality where users post data containing a few user IDs and some data related to those user IDs, and I am saving it into a PostgreSQL database. I want to save the returned user IDs in some object.
I just want to check whether a user ID is present in this object and only then call the database. This check happens very frequently, so I cannot hit the DB every time just to check whether there is any data present for that user ID.
The problem is that I have multiple Node.js instances running on different servers, so how can I have a common object?
I know I can use redis/riak for storing key-values on a server, but I don't want to increase complexity/learning just for a single case. (I have never used redis/riak before.)
Any suggestions?
If your data is in different node.js processes on different servers, then the ONLY option is to use networking to communicate across servers with some common server to get the value. There are lots of different ways to do that.
Put the value in a database and always read the value from the common database
Designate one of your node.js instances as the master and have all the other node.js instances ask the master for the value anytime they need it
Synchronize the value to each node.js process using networking so each node.js instance always has a current value in its own process
Use a shared file system (kind of like a poor man's database)
Since you already have a database, you probably want to just store the value in the database you already have and query it from there, rather than introduce another data store like redis for this one use. If possible, have each process cache the value for some interval of time to improve performance for frequent requests.
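As a sketch of that per-process cache (the TTL, table name, and query are placeholders, assuming a node-postgres style client):

const CACHE_TTL_MS = 30 * 1000;
const cache = new Map(); // userId -> { present: boolean, fetchedAt: number }

async function hasDataForUser(db, userId) {
  const hit = cache.get(userId);
  if (hit && Date.now() - hit.fetchedAt < CACHE_TTL_MS) {
    return hit.present; // answered from memory, no DB round trip
  }
  // fall back to the common database that all instances already share
  const result = await db.query('SELECT 1 FROM user_data WHERE user_id = $1 LIMIT 1', [userId]);
  const present = result.rowCount > 0;
  cache.set(userId, { present: present, fetchedAt: Date.now() });
  return present;
}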

Custom Logging mechanism: Master Operation with n-Operation Details or Child operations

I'm trying to implement a logging mechanism in a Service-Workflow-hybrid application. The requirement for logging is that instead of being an independent action, each log entry must be treated as a detail operation and placed against a parent/master operation. So it's a parent-child relationship, and it goes to database table(s). This is the primary reason NLog failed.
To help understand better, I'm going into generic detail. This is how the application flow goes:
Now, the main entry point of the application (normally called Program.cs) is Platform. It initializes an engine that is capable of listening for incoming calls from ISDN lines, VoIP, or web services. The interface is generic, so any call that reaches the Platform triggers OnConnecting(). OnConnecting() is a thread-safe event and can be triggered as many times as the system requires.
Within OnConnecting(), a new instance of our custom Workflow manager is launched and the context is a custom object called ProcessingInfo:
new WorkflowManager<ZeProcessingInfo>();
Where, ZeProcessingInfo:
var ZeProcessingInfo = new ProcessingInfo(this, new LogMaster());
As you can see, the ProcessingInfo is composed of Platform itself and a new instance of LogMaster. LogMaster is defined in an independent assembly.
Now this LogMaster is available throughout the WorkflowManager, all the Workflows it launches, all the activities within any running Workflow, and is passed on to external code called from within any Activity. When a new LogMaster is initialized, a Master Operation entry is created in the database, and this LogMaster object lives until the call is ended, after a series of very serious roller-coaster rides through different workflows. Upon every call of OnConnecting(), a new Master Operation is created and maintained.
The LogMaster exposes an AddDetail() method that adds a new child detail under the internally stored Master Operation (distinguished through a Guid primary key). The LogMaster is built on Entity Framework.
I'm able to log under the same Master Operation as many times as I require. But the application requirements are changing, and there is now a need to log from other assemblies. There is a Platform Server assembly, which is a Windows Service that acts as a server listening for web-service-based calls; once a client calls a method, OnConnecting in Platform is triggered.
I need a mechanism to somehow retrieve the related LogMaster object so that I can add detail to the same Master Operation. But Platform Server is the one triggering OnConnecting() on the Platform and thus instantiating LogMaster. This creates a redundancy loop.
Failure scenarios are being considered as well. If LogMaster fails, we need to revert from Database Logging to Event Logging. If Event Logging fails (or is not allowed through the unified configuration), we need to revert to file-based (XML) logging.
I hope I have given a rough idea. I don't expect code, but I need some strategy for a seamless, pluggable, configurable logging mechanism that supports Master-Child operations.
Thanks for reading. Any help would be much appreciated.
I've read this question a number of times and it was pretty hard to figure out what was going on; I don't think your diagram helps at all. If your question is about how to retrieve the master log record when writing child log records, then I would forget about trying to create normalised data in the log tables. You will just slow down the transactional system in trying to do so. You want the log/audit records to be written as fast as possible, and you can aggregate them later when you want to read them.
Create a de-normalised table for the log entries and use a single Guid in that table to track the session/parent log master. Yes, this will be a big table, but it will write fast.
As for guaranteed delivery of log messages to a destination, I would try not to create multiple destinations, since combining them later will be a nightmare. Rather, use something like MSMQ to emit the audit logs as fast as possible and have another service pick them up and process them in a guaranteed-delivery manner. ETW (Event Tracing for Windows) is not guaranteed under load, and you will not know that it has failed.
