How would one use ExpressJS as an orchestration layer?
I have five NodeJS / ExpressJS "API applications" for different business functions (security, human resources, asset management, fleet management, etc.). Each provides raw object / document APIs and has its own database, app server, routes, etc. I would like to build ANOTHER ExpressJS application to sit IN FRONT OF those five "stacks" and provide higher-level business operations (e.g., TerminateEmployee) by funneling multiple calls into the other five stacks via REST.
Am I insane? Is this common? Maybe I don't know what to search for, but I'm not finding any examples of doing this.
BTW: I'm also thinking of building highly-reusable "widgets" (basically, individual AngularJS services and UI elements) to call into that sixth front-end stack.
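To make it concrete, here is a minimal sketch of the kind of orchestration endpoint I have in mind (the downstream URLs and routes are made up, and I'm assuming Node 18+ so the global fetch API is available):

// Hypothetical higher-level operation that fans out REST calls to the HR and security stacks
const express = require('express');
const app = express();

app.post('/operations/terminate-employee/:id', async (req, res) => {
  try {
    // 1. Deactivate the employee record in the HR stack (placeholder URL)
    await fetch(`http://hr.internal/api/employees/${req.params.id}/deactivate`, { method: 'POST' });
    // 2. Revoke credentials in the security stack (placeholder URL)
    await fetch(`http://security.internal/api/credentials/${req.params.id}`, { method: 'DELETE' });
    res.status(204).end();
  } catch (err) {
    res.status(502).json({ error: err.message });
  }
});

app.listen(3000);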
Whoa, an old question left unanswered.
SOA, microservices, or whatever we call it, is only an abstract idea about system architecture; it needs to be applied. Now let us define the problem.
Problem: we need orchestration for our middleware.
Parameters: we need to define which middleware needs to be run (maybe a String for the service name and a Number for the service port), and these values need to be passed as CLI args or via process.env.
First we need to store our parameters in a JS-readable format, which could be an Array or an Object. In my case I need to run multiple services on one port and the app needs to be exposed on multiple ports, e.g.:
from String
graphql:80 auth:5000 usercrud:5000 trx:6969 ssr:7070 docs:4000
to JSON
{
"80":["graphql"],
"5000":["auth", "usercrud"],
"6969":["trx"],
"7070":["ssr"],
"4000":["docs"]
}
We can make the CLI accept args by using the popular yargs, but we also need to be able to read a string passed from a .env file. Parsing that string is quite easy since we control its format: with String.split() we can turn it into an Array. I prefer using a space to separate contexts (in this case a middleware/service), followed by : and the port of that particular service, like auth:5000. From this Array we can map the values into an Object (note that in "5000":["auth", "usercrud"] two services share the same port, so we need to accommodate both).
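A minimal sketch of that parsing step (the variable and env names are just illustrative):

// e.g. SERVICES="graphql:80 auth:5000 usercrud:5000 trx:6969 ssr:7070 docs:4000"
const service = process.env.SERVICES.split(' ').reduce((acc, pair) => {
  const [name, port] = pair.split(':');
  (acc[port] = acc[port] || []).push(name); // services sharing a port end up in the same array
  return acc;
}, {});
// => { "80": ["graphql"], "5000": ["auth", "usercrud"], ... }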
From this config we can iterate over the keys, which are the ports, so Object.keys(service) returns an Array that we can map/forEach over according to the modern ECMAScript standard. In each iteration we do:
const app = express() // a fresh Express app for this port (assumes const express = require('express') above)
service[port].forEach(name => app.use(require(`./service/${name}`)))
http.createServer(app)
  .listen(port, () =>
    log(`http://127.0.0.1:${port} >>> ${service[port].join(', ')}`)
  )
Usually we call http.createServer(app) only once, but here we do it once for every port we need to expose; this increases atomicity and decreases the dependencies between services.
Advantages of this approach:
can share library / helper
single codebase & consistency
single container development / staging / integration test
services grouped by resource consumption, not only by task (if one service is idle or consuming very little resource, we can join it with another service)
An up-and-running example: https://github.com/nsnull0/eService/blob/master/packages/%40nodejs-express/index.js
Our existing system uses App Services with API controllers.
This is not a good setup because our scaling support is poor; it's basically all or nothing
I am looking at changing over to use Azure Functions
So effectively each method in a controller would become a new function
Let's say that we have a taxi booking system
So we have the following
Taxis
GetTaxis
GetTaxiDrivers
Drivers
GetDrivers
GetDriversAvailableNow
In the app service approach we would simply have a TaxiController and DriverController with the methods as routes
How can I achieve the same thing with Azure Functions?
Ideally, I would have 2 function apps - Taxis and Drivers with functions inside for each
The problem with that approach is that 2 function apps means 2 config settings, and if that is expanded throughout the system it's far too big a change to make right now
Some of our routes are already quite long, so I can't really add the "controller" name to my function name because I will exceed the 32-character limit
Has anyone had similar issues migrating from App Services to Azure Functions?
Paul
The problem with that approach is that 2 function apps means 2 config settings, and if that is expanded throughout the system it's far too big a change to make right now
This is why application settings should be part of the release process. You should compile once, then deploy as many times as you want and to different environments using the same binaries from the compile step. If you're not there yet, I strongly recommend you start by automating the CI/CD pipeline.
Now answering your question, the proper way (IMHO) is to decouple taxis and drivers. When a taxi is requested, your controller should add a message to a queue; an Azure Function listening to that queue is triggered automatically to dequeue and process what needs to be processed.
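A rough Node.js sketch of that shape, assuming an Azure Storage queue; the queue name, connection setting and route are all placeholders:

// Controller side: enqueue instead of processing inline
const { QueueClient } = require('@azure/storage-queue');
const queue = new QueueClient(process.env.STORAGE_CONNECTION, 'taxi-bookings');

app.post('/taxis/book', async (req, res) => {
  // queue-triggered Functions expect Base64-encoded message bodies by default
  await queue.sendMessage(Buffer.from(JSON.stringify(req.body)).toString('base64'));
  res.status(202).json({ status: 'queued' });
});

function.json for the queue-triggered Function:
{
  "bindings": [
    { "name": "booking", "type": "queueTrigger", "direction": "in",
      "queueName": "taxi-bookings", "connection": "STORAGE_CONNECTION" }
  ]
}

index.js, run automatically for each dequeued message:
module.exports = async function (context, booking) {
  context.log('Processing booking', booking);
  // ...assign a driver, persist the booking, etc.
};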
Advantages:
Your controller response time will get faster as it will pass the processing to another process
The more messages in the queue, the more instances of the function spin up to consume them, so it scales only when needed.
HTTP requests (from one controller to another) are not reliable unless you properly implement a circuit breaker and a retry policy. With the proposed architecture, if something goes wrong, the message either remains in the queue or is not completed by the Azure Function and returns to the queue.
I am trying to find a good way to horizontally scale a stateful NodeJS service.
The Problem
The problem is that most of the options I find online assume the service is stateless. The NodeJS cluster documentation says:
Node.js [Cluster] does not provide routing logic. It is, therefore important to design an application such that it does not rely too heavily on in-memory data objects for things like sessions and login.
https://nodejs.org/api/cluster.html
We are using Kubernetes so scaling across multiple machines would also be easy if my service was stateless, but it is not.
Current Setup
I have a list of objects that stay in memory, each object alone is a transaction boundary. Requests to this service always have the object ID in the url. Requests to the same object ID are put into a queue and processed one at a time.
Desired Setup
I would like to keep this interface to the external world but internally spread this list of objects across multiple nodes and based on the ID in the URL the request would be routed to the appropriate node.
What is the usual way to do it in NodeJS? I've seen people use the user session to make sure a given user always goes to the same node; I would like to do the same thing, but based on the ID in the URL instead of the user session.
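Something like the following proxy is what I have in mind, sketched with the http-proxy package (the node addresses and the URL shape are placeholders): hash the ID from the URL so requests for the same object always land on the same node.

const http = require('http');
const crypto = require('crypto');
const httpProxy = require('http-proxy');

const nodes = ['http://10.0.0.1:3000', 'http://10.0.0.2:3000']; // placeholder targets
const proxy = httpProxy.createProxyServer({});

http.createServer((req, res) => {
  // e.g. /objects/<id>/... -> pick a node by a stable hash of <id>
  const id = req.url.split('/')[2] || '';
  const hash = crypto.createHash('md5').update(id).digest();
  proxy.web(req, res, { target: nodes[hash.readUInt32BE(0) % nodes.length] });
}).listen(8080);

A simple modulo like this reshuffles IDs whenever the node list changes, so a real setup would probably want consistent hashing.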
We are using DynamoDB with node.js and Express to create REST APIs. We have started to go with Dynamo on the backend, for simplicity of operations.
We have started to use the DynamoDB Document SDK from AWS Labs to simplify usage, and make it easy to work with JSON documents. To instantiate a client to use, we need to do the following:
var AWS = require('aws-sdk');
var Doc = require("dynamodb-doc");
var Dynamodb = new AWS.DynamoDB();
var DocClient = new Doc.DynamoDB(Dynamodb);
My question is, where do those last two steps need to take place in order to ensure data integrity? I’m concerned about an object that is waiting for something to happen in Dynamo being taken over by another process and getting the data swapped, resulting in incorrect data being sent back to a client, or incorrect data being written to the database.
We have three parts to our REST API. We have the main server.js file, that starts express and the HTTP server, and assigns resources to it, sets up logging, etc. We do the first two steps of creating the connection to Dynamo, creating the AWS and Doc requires, at that point. Those vars are global in the app. We then, depending on the route being followed through the API, call a controller that parses up the input from the rest call. It then calls a model file, that does the interacting with Dynamo, and provides the response back to the controller, which formats the return package along with any errors, and sends it to the client. The model is simply a group of methods that essentially cover the same area of the app. We would have a user model, for instance, that covers things like login and account creation in an app.
I have done the last two steps above for creating the dynamo object in two places. One, I have simply placed them in one spot, at the top of each model file. I do not reinstantiate them in the methods below, I simply use them. I have also instantiated them within the methods, when we are preparing to make the call to Dynamo, making them entirely local to the method, and passing them to a secondary function if needed. This second method has always struck me as the safest way to do it. However, under load testing, I have run into situations where we seem to have overwhelmed the outgoing network connections, and I start getting errors telling me that the DynamoDB endpoint is unavailable in the region I’m running in. I believe this is from the additional calls required to make the connections.
So, the question is: is creating those objects once at the top of the model file safe, or do they need to be created locally in each method that uses them? Any thoughts would be much appreciated.
You should be safe creating just one instance of those clients and sharing them in your code, but that isn't related to your underlying concern.
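For example, a model file might create the clients once at module scope and have every method reuse them (a sketch only; the table and key names are invented):

// user-model.js - clients are created once, when the module is first required
var AWS = require('aws-sdk');
var Doc = require('dynamodb-doc');
var docClient = new Doc.DynamoDB(new AWS.DynamoDB());

exports.getUser = function (userId, callback) {
  // every exported method shares the same docClient
  docClient.getItem({ TableName: 'Users', Key: { userId: userId } }, callback);
};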
Concurrent access to various records in DynamoDB is still something you have to deal with. It is possible to have different requests attempt writes to the same object at the same time. This is possible if you have concurrent requests on a single server, but is especially true when you have multiple servers.
Writes to DynamoDB are atomic only at the individual item level. This means if your logic requires multiple updates to separate items, potentially in separate tables, there is no way to guarantee that all or none of those changes are made; it is possible only some of them could be made.
DynamoDB natively supports conditional writes so it is possible to ensure specific conditions are met, such as specific attributes still have certain values, otherwise the write will fail.
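For instance, a sketch of a conditional write using the aws-sdk DocumentClient (the table, attributes and version values here are invented for illustration):

var AWS = require('aws-sdk');
var doc = new AWS.DynamoDB.DocumentClient();

doc.put({
  TableName: 'Orders',
  Item: { orderId: '123', status: 'SHIPPED', version: 2 },
  // only write if the stored version is still the one we read earlier
  ConditionExpression: 'version = :expected',
  ExpressionAttributeValues: { ':expected': 1 }
}, function (err) {
  if (err && err.code === 'ConditionalCheckFailedException') {
    // someone else updated the item first; reload and retry, or surface a conflict
  }
});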
With respect to making too many requests to DynamoDB... unless you are overwhelming your machine, there shouldn't be any way to overwhelm the DynamoDB API. If you are performing more reads/writes than you have provisioned, you will receive errors indicating provisioned throughput has been exceeded, but the API itself is still functioning as intended under these conditions.
I'm researching a good way to implement multiple databases for multi-tenant support using node.js + mongoose and mongodb.
I've found out that mongoose supports a method called createConnection() and I'm wondering about the best practice for using it. Currently I am storing all of those connections in an array, separated by tenant. It'd be like:
var connections = [
{ tenant: 'TenantA', connection: mongoose.createConnection('tenant-a') },
{ tenant: 'TenantB', connection: mongoose.createConnection('tenant-b') }
];
Let's say the user sends the tenant he will be logged into via request headers, and I read it in a very early middleware in express.
app.use(function (req, res, next) {
  var entry = connections.find(function (c) { return c.tenant === req.get('tenant'); });
  req.mongoConnection = entry && entry.connection;
  next();
});
The question is: is it OK to store those connections statically, or would it be better to create the connection every time a request is made?
Edit 2014-09-09 - More info on software requirements
At first we are going to have around 3 tenants, but our plan is to increase that number to 40 within a year or two. There are more read operations than write ones; it's basically a big-data system with machine learning. It is not freemium software. The databases are quite big because of the amount of historical data, but it is not a problem to move very old data to another location (we have already thought about that). We plan to shard later if we run out of available resources on our database machine; we could also separate some tenants onto different machines.
The thing that most intrigues me is that some people say it's not a good idea to have prefixed collections for multi-tenancy, but the reasons they give are very brief.
https://docs.compose.io/use-cases/multi-tenant.html
http://themongodba.wordpress.com/2014/04/20/building-fast-scalable-multi-tenant-apps-with-mongodb/
I would not recommend manually creating and managing those separate connections. I don't know the details of your multi-tenant requirements (number of tenants, size of databases, expected number of transactions, etc.), but I think it would be better to go with something like Mongoose's useDb function. Then Mongoose can handle all the connection pool details.
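A minimal sketch of that approach (the connection URI and database naming convention are assumptions):

var mongoose = require('mongoose');
// one underlying connection (and socket pool) shared by all tenants
var baseConnection = mongoose.createConnection('mongodb://localhost:27017/base');

app.use(function (req, res, next) {
  // useDb switches databases on the same pool instead of opening a new connection per tenant
  req.mongoConnection = baseConnection.useDb('tenant_' + req.get('tenant'));
  next();
});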
Update
The first direction I would explore is to set up each tenant on a separate node process. There are some interesting benefits to running your tenants in separate node processes. It makes sense from a security standpoint (isolated memory) and from a stability standpoint (one tenant process crashing doesn't affect the others).
Assuming you're basing the tenancy off of the URL, you would set up a proxy server in front of the actual tenant servers. Its job would be to look at the URL and route to the correct process based on that information. This is a very straightforward node http proxy setup. Each tenant instance could be the exact same codebase, but launched with a different config (which tells it what mongo connection string to use).
This means you're able to design your actual application as if it weren't multi-tenant. Each process only knows about one mongo database, and there is no multi-tenant logic necessary. It also enables you to easily split up traffic later based on load. If you need to split up the tenants for performance reasons, you can do it transparently at the proxy level. The DNS can all stay the same, and you can just move the server that the instances are on behind the scenes. You can even have the proxy balance the requests for a tenant between multiple servers.
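A sketch of that front proxy, assuming the http-proxy package and a static tenant-to-port mapping (hostnames and ports are placeholders):

var http = require('http');
var httpProxy = require('http-proxy');

// each tenant runs the same codebase as its own node process on its own port
var tenantTargets = {
  'tenant-a': 'http://127.0.0.1:4001',
  'tenant-b': 'http://127.0.0.1:4002'
};

var proxy = httpProxy.createProxyServer({});

http.createServer(function (req, res) {
  // e.g. tenant-a.example.com -> first label of the Host header picks the tenant
  var tenant = (req.headers.host || '').split('.')[0];
  var target = tenantTargets[tenant];
  if (!target) {
    res.statusCode = 404;
    return res.end('Unknown tenant');
  }
  proxy.web(req, res, { target: target });
}).listen(80);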
I have two identical sites which will consume RabbitMQ messages using the new RabbitMQ client. The producer should ideally be able to designate the site either by queue name or by routing key. The former I can do as a Publish parameter, but the latter I have no access to. Furthermore, on the service side, the consumer appears only able to subscribe to convention-based queue names, i.e. mq.myrequest.inq, and I don't seem to be able to take advantage of the routing key.
Is there a way I can publish and subscribe using my own routing key, or register the handler based on an explicit queue name, i.e. mq.myrequest.site1.inq?
There isn't. ServiceStack's RabbitMQ support is convention-based around Type names and is opinionated to function as a work queue. It was designed to be config-free and simple to use, so it automatically takes care of the details of which exchanges, routing keys and queue names to use.
If you need advanced or custom configuration it's best to instead use the underlying RabbitMQ.Client directly.