DDD: Using an External WebSocket

I am trying to design a microservice using a DDD approach.
The microservice has an aggregate whose state and logic depend on data received over a WebSocket (WS) connection to a third-party server. WS is used because of latency concerns.
According to my understanding, DDD seems to indicate that external APIs have to go through the application layer. Apart from instantiating the WS connection, this approach implies a lot of back-and-forth data flow through the application layer. I am not sure how to go about this.

A WebSocket is one form of connection between your microservice and another component. There are other forms of connection as well, e.g. request/response and message queues. Connectivity is an infrastructural concern, so its implementation (instantiating the connection, receiving data) should reside not in the application layer but in an infrastructure layer.
Once data is received through the WebSocket channel, it should be passed into an application service.
According to my understanding, DDD seems to indicate that external APIs have to go through the application layer
External API invocations, much like WebSockets, should be implemented in an infrastructure layer. If an API is used to retrieve data, it should probably be abstracted away as a repository, in line with DDD principles: the repository interface lives in the domain layer, and the concrete repository that invokes the API lives in the infrastructure layer. The application service then calls the repository interface.
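To make the layering concrete, here is a minimal sketch (all names are hypothetical, not from the question): the domain layer owns the repository contract, the infrastructure layer keeps it up to date from the third-party WS feed, and the application service depends only on the contract.

```javascript
// Hypothetical sketch: a price feed kept current by a third-party WS connection.

// Domain layer: the contract the application service depends on.
class PriceRepository {
  latestPrice(symbol) { throw new Error('not implemented'); }
}

// Infrastructure layer: concrete repository updated by the WS adapter.
class WsFedPriceRepository extends PriceRepository {
  constructor() { super(); this.prices = new Map(); }
  // Called by the WS client's 'message' handler (connection setup lives here too).
  onMessage(raw) {
    const { symbol, price } = JSON.parse(raw);
    this.prices.set(symbol, price);
  }
  latestPrice(symbol) { return this.prices.get(symbol); }
}

// Application layer: orchestrates the use case against the interface only.
class QuoteService {
  constructor(priceRepository) { this.priceRepository = priceRepository; }
  quote(symbol) {
    const price = this.priceRepository.latestPrice(symbol);
    if (price === undefined) throw new Error(`no data for ${symbol}`);
    return { symbol, price };
  }
}
```

The actual WS client (e.g. the `ws` package) would be instantiated during infrastructure bootstrap and wired to `onMessage`; neither the domain nor the application layer ever sees the connection.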


Data Aggregator/composition service in Microservices

I am developing an application where there is a dashboard for data insights.
The backend is a set of microservices written in Node.js with the Express framework, backed by MySQL. The pattern used is Database-per-Service, with a message broker in between.
The problem I am facing is that this dashboard derives its data from multiple backend services (different databases altogether: some SQL, some NoSQL, and some a graph DB).
I want to avoid multiple queries between the front end and the backend for this screen. However, I want to avoid a single point of failure as well. I have come up with the following solutions:
Use an API gateway aggregator/composer that makes multiple calls to backend services on behalf of a single frontend request, then composes all the responses together and sends them to the client. However, scaling even one service would require scaling the gateway itself, and it makes the gateway a single point of contact.
Create a facade service, maybe called a dashboard service, that issues calls to multiple backend services, composes the responses together, and sends a single payload back to the client. However, this creates a synchronous dependency.
I favor approach 2. However, I have a question there as well: since the services are written in Node.js, is there a way to enforce a time-bound SLA for each service, so that if a service doesn't respond to the facade aggregator in time, the client is returned partial or cached data? Is there any mechanism for this?
GraphQL has been designed for this.
You start by defining a global GraphQL schema that covers the schemas of all your microservices. Then you implement the fetchers (resolvers), which populate the response by querying the appropriate microservices. You can run several instances so that you do not have a single point of failure. You can return partial responses on a timeout (the answer will include resolver errors). GraphQL also knows how to manage caching.
Honestly, it is a bit confusing at first, but once you get it, it is really simple to extend the schema and include new microservices.
I can't speak to Node's technical implementation, but indeed the second approach lets you model the calls to the remote services in such a way that the answer is expected within some time boundary.
It depends on the way you interconnect the services. The easiest approach is to spawn an HTTP request from the aggregator service to the service that actually provides the data.
This HTTP request can be configured not to wait longer than X seconds for a response. So you spawn multiple HTTP requests to different services simultaneously and wait for the responses. I come from the Java world, where these settings are applied at the level of the HTTP client making the connections; I'm sure the Node ecosystem has something similar.
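In Node, that per-call time limit can be sketched with a plain timeout race; everything here (the service names, the cache object) is a hypothetical stand-in for real HTTP calls, which you would make with an HTTP client that supports timeouts (e.g. `fetch` with an abort signal):

```javascript
// Hypothetical sketch: fan out to several services, give each one slaMs
// milliseconds, and fall back to cached data for any service that misses it.
const withTimeout = (promise, ms) =>
  Promise.race([
    promise,
    new Promise((_, reject) =>
      setTimeout(() => reject(new Error('SLA missed')), ms)),
  ]);

async function aggregateDashboard(services, cache, slaMs) {
  const names = Object.keys(services);
  const results = await Promise.allSettled(
    names.map((name) => withTimeout(services[name](), slaMs))
  );
  const payload = {};
  names.forEach((name, i) => {
    payload[name] = results[i].status === 'fulfilled'
      ? { data: results[i].value, stale: false }
      : { data: cache[name] ?? null, stale: true }; // partial / cached response
  });
  return payload;
}
```

`Promise.allSettled` guarantees the aggregator always answers: services that miss their SLA are reported as stale/cached rather than failing the whole response.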
If you prefer an asynchronous style of communication between the services, the situation is somewhat more complicated. In this case you can design some kind of 'transactionId' into the message protocol. The requests from the aggregator service include such a transactionId (a UUID would work) and 'demand' that the answer carry the same transactionId. The sender, having sent the messages, waits for the responses for a certain amount of time and then 'quits waiting' after X seconds/milliseconds. Any responses that arrive after that time are discarded, because no one at the aggregator side expects to handle them anymore.
BTW, this aggregator approach is also good and simple from the front-end point of view, because the front end deals with only a single request instead of many requests to the backend, as in the gateway approach. So I completely agree that the aggregator approach is better here.

Backend for frontend DDD

We have an application landscape with many microservices and use a backend-for-frontend for the UI to aggregate data.
Would you put the aggregation logic (combining data from multiple microservices) in the domain layer or the application layer of the backend-for-frontend application?
It feels like business logic; however, there is no persistence, only data retrieval, so I am in doubt about where to put it.
Backend-for-frontend implementations shouldn't need to contain any real business logic, as they only serve as gateways focused on specific front-end clients (mobile, desktop, web, etc.).
So of course there is logic in your BFFs, but it will rather be aggregation logic: processing the front-end client requests, managing the workflow to communicate with the involved microservices (be it via synchronous or asynchronous communication), and aggregating the data and serving it back to the client.
What you usually end up with are some kind of controllers handling the API requests (e.g. via REST), some services containing the aggregation logic, some anemic models (i.e. without logic) representing the client requests and responses, and the like.
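As a hypothetical sketch of that shape (the service clients are stubs, not a real framework): an anemic view model, an aggregation service that fans out to the microservices, and a thin controller that would simply call it:

```javascript
// Hypothetical sketch of a BFF's internals. No domain rules live here.
class DashboardViewModel { // anemic: data only, no behavior
  constructor(userName, openOrders) {
    this.userName = userName;
    this.openOrders = openOrders;
  }
}

class DashboardAggregationService {
  constructor(userClient, orderClient) {
    this.userClient = userClient;
    this.orderClient = orderClient;
  }
  async dashboardFor(userId) {
    // Fan out to the involved microservices, then compose the view model.
    const [user, orders] = await Promise.all([
      this.userClient.getUser(userId),
      this.orderClient.openOrders(userId),
    ]);
    return new DashboardViewModel(user.name, orders.length);
  }
}
```

A REST controller would call `dashboardFor` and serialize the result back to the client; all business rules stay in the microservices themselves.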
I don't know what technology stack you are on, but you can have a look at some sample BFF implementations in a microservices architecture in the Microsoft-powered eShopOnContainers project.
Note: if you are really sure there is business logic required in your BFF, consider whether it should rather be part of an already existing microservice, or whether you have discovered a not-yet-identified bounded context this logic fits into.
Ideally a backend-for-frontend should not have any business logic; it should handle authentication/authorization and routing logic.
If you are doing lots of aggregation and have many frontends (each needing a different set of data), then you may want to use GraphQL.

Why does the NestJS framework use a transport layer different from HTTP in a microservices approach?

I have been developing microservices with Spring Boot for a while, using Feign clients, RestTemplate, and AMQP brokers to establish communication between microservices.
Now I am learning NestJS and its microservice approach. I've noticed that NestJS uses TCP as the default transport layer, which is different from the way it is done with Spring Boot.
Why does NestJS prefer those transport layers (TCP, AMQP) instead of HTTP? Isn't HTTP the transport protocol for REST microservices?
From the NestJS documentation:
"a microservice is fundamentally an application that uses a different transport layer than HTTP"
The main reason is that HTTP is slow. With HTTP, JSON can add unwanted processing time to send and translate the information.
One problem with HTTP+JSON is the time spent serializing the JSON being sent. This is an expensive process; imagine the serialization cost for a large payload.
In addition to the JSON body, there are a number of HTTP headers that have to be parsed and may then simply be discarded. The only concern should be maintaining a single layer for sending and receiving messages. Therefore, the HTTP protocol with JSON is a very slow way to communicate between microservices. There are some optimization techniques, but they are complex and do not add significant performance benefits.
Also, HTTP spends more time waiting than it does transferring data.
If you look at the OSI model, HTTP is part of Layer 7 (Application). TCP is Layer 4 (Transport).
At Layer 4 there is no determining characteristic that makes traffic HTTP, AMQP, gRPC, or RTSP. Layer 4 is concerned only with how data is transmitted to and received from the remote device.
Now, this is where the networking and software development worlds collide. Networking people use "transport" to mean Layer 4, while programming people use "transport" to mean the way a packet of data is transmitted to another component.
The meaning of "transport" (or "transporter" as used in the docs) is used as an abstraction from how messages are shared in this architecture.
Looking at the documentation, if you want something like AMQP for your microservice, you can use NATS or Redis (both transporter implementations are provided by Nest).
https://docs.nestjs.com/microservices/basics#getting-started

Azure Logic Apps - HTTP communication between microservices

Are Logic Apps considered microservices? If so, is making HTTP API calls from Logic Apps, whether via the HTTP, Function, or APIM connectors, not a violation of the rule against direct HTTP communication between microservices?
If possible, never depend on synchronous communication (request/response) between multiple microservices, not even for queries. The goal of each microservice is to be autonomous and available to the client consumer, even if the other services that are part of the end-to-end application are down or unhealthy. If you think you need to make a call from one microservice to other microservices (like performing an HTTP request for a data query) in order to be able to provide a response to a client application, you have an architecture that will not be resilient when some microservices fail.
Moreover, having HTTP dependencies between microservices, like when creating long request/response cycles with HTTP request chains, as shown in the first part of the Figure 4-15, not only makes your microservices not autonomous but also their performance is impacted as soon as one of the services in that chain is not performing well.
Source: https://learn.microsoft.com/en-us/dotnet/standard/microservices-architecture/architect-microservice-container-applications/communication-in-microservice-architecture
Yes, Logic Apps are primarily HTTP-based services. Whether or not they're 'micro' really doesn't matter, because 'micro' is too abstract to have any real meaning. It was a useful marketing term at one point, but its tour on the tech fashion runway has ended. So don't even think about that. ;)
What the authors are trying to express is that you should avoid chaining dependencies in an app's architecture: A waits for B, which waits for C, which waits for D, which waits for E, etc. That's the first line in the graphic.
Instead, Basket can check Catalog on its own, then call Ordering, while Inventory is checked in the background. You are only one level deep instead of four.
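The latency difference is easy to quantify: a chain waits for the sum of all hops, while a one-level fan-out waits only for the slowest hop. With made-up per-service latencies:

```javascript
// Made-up latencies (ms) for four services involved in one user request.
const latencies = { basket: 30, catalog: 50, ordering: 40, inventory: 80 };
const values = Object.values(latencies);

// Chained A -> B -> C -> D: each call waits for the previous one.
const chained = values.reduce((sum, ms) => sum + ms, 0); // 30+50+40+80 = 200 ms

// One-level fan-out: the calls run in parallel, so you wait for the slowest.
const fannedOut = Math.max(...values); // 80 ms
```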

Is a Node.js app that both serves a REST API and handles WebSockets a good idea?

Disclaimer: I'm new to Node.js, so I'm sorry if this is a weird question :)
I have a Node.js app using Express to serve a REST API. The data served by the REST API is fetched from a NoSQL database by the Node.js app. All clients only use HTTP GET. There is one exception, though: data is PUT and DELETEd by the master database (a relational database on another server).
The thought behind this setup is, of course, to let the Node.js/NoSQL database server(s) be a public front end, thereby protecting the master database from heavy traffic.
Potentially a number of different client applications will use the REST API, but mainly it will be used by a client app with a long lifetime (typically 0.5 to 2 hours). Instead of letting this app constantly poll the REST API for possible new data, I want to use WebSockets so that data is only sent to a client when there is new data. I will use a Node.js app for this, probably with Socket.IO so that it can fall back to API polling if WebSockets are not supported by the client. New data should be sent to clients each time the master database PUTs or DELETEs objects in the NoSQL database.
The question is whether I should use one Node.js app for both the API and the WebSockets, or one app for the API and one for the WebSockets.
Things to consider:
- Performance: the app(s) will be hosted on a cluster of servers with a load balancer and an HTTP accelerator in front. Would one app handling everything perform better than two apps with distinct tasks?
- Traffic between apps: if I choose a two-app solution, the API app that receives PUTs and DELETEs from the master database will have to notify the WebSocket app every time it receives new data (or the master database will have to notify both apps). Could the doubled traffic be a performance issue?
- Code cleanliness: I believe two apps will result in cleaner and better code, but then again there will surely be some code common to both apps, which leads to having two copies of it.
As to how heavy the load can be, it is very difficult to say, but a possible peak could involve:
50000 clients
each listening to up to 5 different channels
new data being sent from the master every 5 seconds
new data should be sent to approximately 25% of the clients (some data goes to all clients, other data probably to below 1% of them)
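A quick back-of-the-envelope for that peak (assuming one outbound message per matched client per update):

```javascript
// Assumptions taken from the figures above.
const clients = 50000;
const updateIntervalSeconds = 5;
const matchRate = 0.25; // ~25% of clients care about a given update

// Outbound messages the websocket tier must push per second at peak,
// spread across however many socket instances are running.
const messagesPerSecond = (clients * matchRate) / updateIntervalSeconds;
// 50000 * 0.25 / 5 = 2500 messages/s
```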
UPDATE:
Thanks for the answers, guys. More food for thought here. I have decided to have two Node.js apps: one for the REST API and one for WebSockets. The reason is that I believe it will be easier to scale them. To begin with, the whole system will be hosted on three physical servers, and one Node.js REST API app on each server should be sufficient, but the WebSocket app will probably need several instances on each physical server.
This is a very good question.
If you are looking at a legacy system and you already have a REST interface defined, there are not many advantages to adding WebSockets. Things that may point you toward WebSockets are:
a demand for server-to-client or client-to-client real-time data
a need to integrate with server components using a classic bi-directional protocol (e.g. you want to write an FTP or sendmail client in JavaScript).
If you are starting a new project, I would try to have a hard split in the project between:
the serving of static content (images, JS, CSS) using HTTP (which is what it was designed for), and
the serving of dynamic content (real-time data) using WebSockets (load-balanced, subscription/messaging based, automatic reconnect enabled to handle network blips).
So why should we try to have a hard separation? Let's consider the advantages of an HTTP-based REST protocol.
Using the HTTP protocol for REST semantics brings certain advantages:
Stateless Interactions: none of the client's context is to be stored on the server side between the requests.
Cacheable: Clients can cache the responses.
Layered System: intermediaries are undetectable to the client
Easy testing: it's easy to use curl to test an HTTP-based protocol
On the other hand...
The use of a messaging protocol (e.g. AMQP, JMS/STOMP) on top of WebSockets does not preclude any of these advantages.
WebSockets can be transparently load-balanced, messages and state can be cached, efficient stateful or stateless interactions can be defined.
A basic reactive analysis style can define which events trigger which messages between the client and the server.
Key additional advantages are:
a WebSocket is intended to be a long-lived, persistent connection, usable for multiple different messaging purposes over a single connection
a WebSocket connection allows full bi-directional communication, so data can be sent in either direction as network conditions allow.
one can use connection offloading to share subscriptions to common topics using intermediaries. This means that with very few connections to a core message broker, you can serve millions of connected users efficiently at scale.
monitoring and testing can be implemented with an admin interface for sending/receiving messages (provided with all message brokers).
the cost of all this is that one needs to deal with re-establishing state when the WebSocket reconnects after being dropped. Many protocol designers build in the notion of a "sync" message to provide context from the server to the client.
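A minimal sketch of that reconnect-and-sync idea (the socket is a stub and the message shapes are assumptions): the client remembers its subscriptions and the last sequence number it saw, and replays both on every (re)connect.

```javascript
// Hypothetical sketch: the client re-establishes the state the server lost
// when the connection dropped, then asks for everything it missed ("sync").
class ResumingClient {
  constructor(socketFactory) {
    this.socketFactory = socketFactory;
    this.subscriptions = new Set();
    this.lastSeq = 0;
  }
  connect() {
    this.socket = this.socketFactory();
    // Replay subscriptions, then request all messages missed while offline.
    for (const topic of this.subscriptions) {
      this.socket.send({ type: 'subscribe', topic });
    }
    this.socket.send({ type: 'sync', sinceSeq: this.lastSeq });
  }
  subscribe(topic) {
    this.subscriptions.add(topic);
    this.socket.send({ type: 'subscribe', topic });
  }
  handleMessage(msg) {
    this.lastSeq = Math.max(this.lastSeq, msg.seq); // remember progress
  }
}
```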
Either way, your model object could be the same whether you use REST or WebSockets, but that might mean you are still thinking too much in terms of request-response rather than publish/subscribe.
The first thing you must think about is how you're going to scale the servers and manage their state. With a REST API this is largely straightforward, as REST APIs are for the most part stateless, and every load balancer knows how to proxy HTTP requests. Hence REST APIs can be scaled horizontally, leaving the few bits of state to the persistence layer (database). With WebSockets it's often a different matter. You need to research which load balancer you're going to use (in a cloud deployment this often depends on the cloud provider), then figure out what WebSocket support or configuration the load balancer needs. Then, depending on your application, you need to figure out how to manage the state of your WebSocket connections across the cluster. Think about the different use cases: e.g., if a WebSocket event on one server alters the state of the data, will you need to propagate this change to a different user on a different connection? If the answer is yes, you'll probably need something like Redis to manage your WS connections and communicate changes between the servers.
As for performance, at the end of the day it's still just HTTP connections, so I doubt there will be a big difference in separating the server functionality. However, I think two servers would go a long way toward improving code cleanliness, as long as you have a shared 'core' module to isolate code common to both servers.
Personally, I would do them together, because you can share the models and most of the code between the REST and the WS parts.
At the end of the day, what Yuri said in his answer is correct, but it is not that much work to load-balance WS anyway; everyone does it nowadays. The approach I took is to have REST for everything and then create some WS "endpoints" for subscribing to real-time data pushed server-to-client.
From what I understood, your client would just get notifications from the server with updates, so I would definitely go with WS. You subscribe to some events and then get new results when there are any. Polling with repeated HTTP calls is not the best way.
We had this need and basically built a small framework around this idea: http://devalien.github.io/Axolot/
Basically you can understand our approach from the controller below (this is just an example; in our real-world app we have subscriptions so we can notify clients when we have new data or when a procedure finishes). Under actions are the REST endpoints, and under sockets the WebSocket endpoints.
module.exports = {
  model: 'user', // Attach the user model, so CRUD operations are available (good for dev purposes)
  path: '/user', // This is the endpoint
  actions: {
    'get /': [
      function (req, res) {
        var query = {};
        Model.user.find(query).then(function (user) { // Find from the user model declared above
          res.send(user);
        }).catch(function (err) {
          res.send(400, err);
        });
      }],
  },
  sockets: {
    getSingle: function (userId, cb) { // Callable from socket.io as "user:getSingle"
      Model.user.findOne(userId).then(function (user) {
        cb(user);
      }).catch(function (err) {
        cb({ error: err });
      });
    }
  }
};
