What's the right way to gather data from different microservices?

I'm having trouble understanding how basic communication between microservices should work, and I haven't been able to find a good solution or standard approach in the other questions. Let's use this basic example.
I have an invoice service that returns invoices; every invoice contains information (IDs) about the user and the products. If I have a view in which I need to render the invoices for a specific user, I just make a simple request.
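const request = require("request");
const url = "http://my-domain.com/api/v2/invoices";
const params = { qs: { userId: 1 }, json: true }; // sent as ?userId=1, body parsed as JSON
request(url, params, (err, res, body) => {
  const results = body; // An array of 1000 invoices for user 1
});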
Now, for this specific view I will need to make another request to get all the details for each product on each invoice.
results.forEach((invoice) => {
  invoice.items.forEach((itemId) => {
    // One request per product on every invoice
    const url = `http://my-domain.com/api/v2/products/${itemId}`;
    request(url, { json: true }, (err, res, body) => {
      const product = body;
      // Do something else...
    });
  });
});
I know the code example is not perfect, but you can see that this will generate a huge number of requests (at least 1000) to the product service for just one user. Now imagine 1000 users making this kind of request.
What is the right way to get the information for all the products without making this many requests, in order to avoid performance issues?
I found some workarounds for this kind of scenario, such as:
Create an API endpoint that accepts a list of IDs in order to make a single request.
Duplicate the information from the Product service within the invoice service and find a way to keep them in sync.
In a microservices architecture, are these the right ways to deal with this kind of issue? To me, they look like simple workarounds.
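To make the first workaround concrete, here is a minimal sketch. It assumes the product service exposes a batch endpoint that accepts a comma-separated `ids` query parameter; that endpoint, the parameter name, and the helper names `collectProductIds` and `buildBatchUrl` are hypothetical and not part of the existing API. The product IDs are deduplicated across all invoices so the view needs one request instead of one per item.

```javascript
// Collect each distinct product ID referenced by any invoice.
function collectProductIds(invoices) {
  const ids = new Set();
  for (const invoice of invoices) {
    for (const itemId of invoice.items) {
      ids.add(itemId);
    }
  }
  return [...ids];
}

// Build the URL for a hypothetical batch endpoint (assumed, not an existing route).
function buildBatchUrl(baseUrl, ids) {
  const url = new URL("/api/v2/products", baseUrl);
  url.searchParams.set("ids", ids.join(","));
  return url.toString();
}

const invoices = [
  { id: 10, items: [1, 2] },
  { id: 11, items: [2, 3] },
];
const productIds = collectProductIds(invoices);
const batchUrl = buildBatchUrl("http://my-domain.com", productIds);
```

With a very large number of IDs the query string can get too long, so a real endpoint would probably accept the list in a POST body or in pages.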
Edit #1: Based on Remus Rusanu's response.
As per Remus's recommendation, I decided to isolate my services and describe them a little better.
As shown in the image above, the microservices are now isolated (specifically the Billing-service) and they are now the owners of their data. With this structure I ensure that the Billing-service is able to work even if there are async jobs or even if the other two services are down.
If I need to create a new invoice, I can call the other two microservices (Users, Inventory) synchronously and then update the data in the "cache" tables (Users, Inventory) in my billing service.
Is it also safe to assume these "cache" tables are read-only? I assume they are, since only the user/inventory services should be able to modify this information, to preserve isolation and authority over the information.

You need to isolate the services so that they do not share state/data. The design in your question is a single macroservice split into 3 correlated storage silos. Case in point: you cannot interpret a result from the 'Invoicing' service without correlating the data with the 'Products' response(s).
Isolated microservices own their data and can operate independently. An invoice is complete as returned from the 'Invoices' service. It contains the product names, the customer name, and all the other information on the invoice. All the data comes from its own storage. A separate microservice could be 'Inventory', which manages the product inventories, current stock, etc. It would also have its own data, in its own storage. A 'product' can exist in both stores, and there was once a logical link between them (when the invoice was created), but that link is now severed. The 'Inventory' microservice can change its products (e.g. remove one, add new SKUs, etc.) without affecting the existing invoices (this is not only a microservice isolation requirement, it is also a basic accounting requirement). I'm not going to go into detail here about what a product 'identity' is in real life.
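As a rough illustration of an invoice that is complete on its own, the sketch below copies the customer and product details into the invoice at creation time. The field names (`sku`, `unitPrice`, and so on) and the helper names are assumptions made up for this example.

```javascript
// Build an invoice that copies the details it needs, so it stays readable
// even if the source records change or disappear later.
function createInvoice(id, customer, lines) {
  return {
    id,
    customerName: customer.name,
    lines: lines.map(({ product, quantity }) => ({
      sku: product.sku,
      name: product.name,
      unitPrice: product.price,
      quantity,
    })),
  };
}

// The total is computed from the invoice's own data only.
function invoiceTotal(invoice) {
  return invoice.lines.reduce((sum, line) => sum + line.unitPrice * line.quantity, 0);
}

const widget = { sku: "W-1", name: "Widget", price: 5 };
const invoice = createInvoice(1, { name: "Ada" }, [{ product: widget, quantity: 3 }]);

// Inventory later renames and reprices the product; the invoice is unaffected.
widget.name = "Widget v2";
widget.price = 7;
```

The invoice keeps the name and price as they were when it was issued, which is the severed link described above.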
If you find yourself asking questions like the ones you're asking, it likely means you do not have microservices. You should think about your microservice boundaries by considering what happens if you replace all communication with async, queue-based requests (a response can come 6 days later). If the semantics break, the boundary is probably wrong. If the semantics hold, you are on the right track.

It all depends on the resilience requirements that you have. Do you want your microservice to function when the other microservices are down or not?
The first solution that you presented is the least resilient: if either the Users or the Products microservice goes down, the Invoice microservice goes down as well. Is this what you want? On the other hand, this architecture is the simplest. A variation of this architecture is to let the client make the join requests; this leads to a chatty conversation, but it has the advantage that the client can replace the missing information with default information when the other microservices are down.
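A small sketch of that client-side join with default information is shown here. It is kept synchronous for brevity; real lookups would be asynchronous HTTP calls. The `lookupProduct` function and the shape of the placeholder are assumptions for illustration.

```javascript
// Placeholder shown when the Products service cannot be reached.
function defaultProduct(itemId) {
  return { id: itemId, name: "(details unavailable)", available: false };
}

// Join one invoice with product details on the client side.
// lookupProduct is a hypothetical function that throws when the lookup fails.
function joinInvoice(invoice, lookupProduct) {
  const items = invoice.items.map((itemId) => {
    try {
      return { ...lookupProduct(itemId), available: true };
    } catch (err) {
      return defaultProduct(itemId);
    }
  });
  return { id: invoice.id, items };
}

// Simulated lookup that fails for every product except product 1.
const reachable = new Map([[1, { id: 1, name: "Widget" }]]);
function lookupProduct(itemId) {
  if (!reachable.has(itemId)) throw new Error("Products service unavailable");
  return reachable.get(itemId);
}

const view = joinInvoice({ id: 10, items: [1, 2] }, lookupProduct);
```

The view still renders both lines; the second one simply shows the placeholder.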
The second solution offers the greatest resilience, but it's more complex. Having an event-driven architecture helps a lot in this case. In this architecture the microservices act as swim lanes: a failure in one of the microservices does not propagate to the others.

Related

Domain / integration events payload information in DDD CQRS architecture

I have a question about the integration events used in a microservice / CQRS architecture.
Can the payload of the event contain only references to aggregates, or can it carry more information?
If only reference ids can be sent, the only viable solution is to bring the rest of the information with some type of call but the origin would have to implement an endpoint and the services would end up more coupled.
For example, when a user is created and the event is raised:
UserCreated {
userId
name
lastname
document
...
}
Is this correct?

If only reference ids can be sent,
Why would only that be allowed? I have worked with a system that used microservices, CQRS, and DDD (similar to yours), and we did not have such restrictions. As in most cases, it comes down to what works best for your application and business domain. Do not follow any rule blindly. It is perfectly fine to put other information in the event's payload as well.
the only viable solution is to bring the rest of the information with
some type of call but the origin would have to implement an endpoint
and the services would end up more coupled.
This is fine in some cases as well, but it means making additional calls after the event has been processed. I would not do this unless you have a really heavy model (or models) and sending it would affect your performance. For example, handling an event might require loading a collection of related objects based on the userId. I had one similar case where I had to load a collection of other objects in response to a user action such as a UserCreated event. Of course, in that case you don't want to send all that data in one event payload. Instead, you send only the id of the user, and the consumer later calls a GET API on the other service to fetch that data and save it to its own microservice.
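The two options can be sketched side by side as follows. The handler names, the `localUsers` store, and the `fetchUser` callback are hypothetical; in a real system `fetchUser` would be an HTTP call to the originating service.

```javascript
// Local copy of user data kept by the consuming microservice.
const localUsers = new Map();

// Option A: the event payload carries everything the consumer needs.
function onUserCreated(event) {
  localUsers.set(event.userId, { name: event.name, lastname: event.lastname });
}

// Option B: the payload carries only the id, so the consumer has to call
// back to the originating service (fetchUser) to get the rest.
function onUserCreatedIdOnly(event, fetchUser) {
  const user = fetchUser(event.userId);
  localUsers.set(event.userId, { name: user.name, lastname: user.lastname });
}

onUserCreated({ userId: 1, name: "Ada", lastname: "Lovelace", document: "X1" });

let extraCalls = 0;
const fetchUser = (userId) => {
  extraCalls += 1; // counts the additional round trip that option B needs
  return { userId, name: "Alan", lastname: "Turing" };
};
onUserCreatedIdOnly({ userId: 2 }, fetchUser);
```

Option B costs one extra call per event and requires the origin to expose an endpoint, which is the coupling mentioned in the question.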
UserCreated
{
userId
name
lastname
document
... }
Is this correct?
Yes this is fine :)
What you could do instead:
Depending on your business scenario, you could publish the information as multiple events, in stages and with different states.
Let's say the UI has a wizard-like screen with multiple creation steps. You could publish:
event: UserCreatedDraft with some initial data from the first wizard page
event: UserPersonalDataCreated with only the part of the object related to private data
event: UserPaymentDataCreated with only the payment data
event: UserCreatedFinal with the last step
Of course, this is just an example for one specific scenario; it depends on your use case and your business requirements. It is only meant to give you an idea of what you could do in some cases.
Summary:
As you can see, there are multiple ways to work with these kinds of systems. Keep in mind that following the rules is good, but in some cases you need to do what is best for your business scenario, and what works for one application might not be the best solution for yours. Do what is most efficient for your system. When working with microservices we have to deal with latency and async operations anyway, so saving some performance in other parts of the system is always good.

Design Ecommerce Microservices

I am designing an e-commerce system using a microservices architecture. Suppose that I have three contexts: product catalog, inventory, and pricing.
It seems clear to me that each has a clear responsibility. But to serve the showcase (the product list) I need to make a request to the product catalog, get a list of IDs, and then use it to query the Inventory microservice to check inventory status (in stock or out of stock). Besides that, I need to make a request to Pricing to get the price of each product.
So serving a fundamental feature makes me execute three requests (like a join) against three microservices. I have been reading about microservices architecture, and when you are dealing with many "joins" it's possible that these contexts should be a single one. But, IMO, it seems clear that each context has a different set of responsibilities.
The other option is to create a "search" microservice that aggregates all this information (product + pricing + inventory). We can use a domain event to notify the "search" microservice that something has changed, so we can serve the showcase with a single request. This looks like CQRS.
The question is...
Is there a correct approach?
Which one is better? What are the trade-offs?

You can try to include some information from other domain contexts in a given domain context,
so your product catalog domain will contain the number of items and the price from the inventory and pricing domains.
This data will be read-only (value objects) and should be updated by events from the inventory and pricing domains.
In your use case, the source of truth remains the inventory domain, so if a synchronization failure happens, inventory will still reject any order for an item that is not available.

In your case I think it's better to create a separate search microservice to aggregate the data from all of them, as search almost always spans multiple domain areas such as product, inventory, and so on.
You can use events from the other microservices (Event Sourcing) to populate the data in search.
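Here is a minimal sketch of how such a search (showcase) read model could be populated from events. The event names `ProductAdded`, `PriceChanged`, and `StockChanged` and their fields are assumptions, and the in-memory `Map` stands in for whatever store the search service would actually use.

```javascript
// Read model for the showcase: one denormalized entry per product.
const showcase = new Map();

function entryFor(productId) {
  if (!showcase.has(productId)) {
    showcase.set(productId, { productId, name: null, price: null, inStock: false });
  }
  return showcase.get(productId);
}

// One handler per (hypothetical) event type published by the other services.
const handlers = {
  ProductAdded: (entry, event) => { entry.name = event.name; },
  PriceChanged: (entry, event) => { entry.price = event.price; },
  StockChanged: (entry, event) => { entry.inStock = event.quantity > 0; },
};

// Apply one event to the read model; unknown event types are skipped.
function apply(event) {
  const handler = handlers[event.type];
  if (!handler) return;
  handler(entryFor(event.productId), event);
}

const events = [
  { type: "ProductAdded", productId: "p1", name: "Widget" },
  { type: "PriceChanged", productId: "p1", price: 9.5 },
  { type: "StockChanged", productId: "p1", quantity: 4 },
  { type: "SomethingElse", productId: "p2" },
];
events.forEach(apply);
```

The showcase can then be served from `showcase` with a single read, at the cost of being eventually consistent with the three source services.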

It seems that what you need is to show, in one view, information coming from different microservices (contexts).
You can use the ViewModel Composition technique, where an infrastructure component (a request handler) intercepts the HTTP request and allows microservices to participate in the response, looking for a microservice that says "hey, I have that info" (Inventory has the info about stock, Pricing about price, and so on). This infrastructure component composes a dynamic view model on the fly, with the info coming from the different microservices.
I've never implemented it, but look at this video explaining it, from minute 17:35 to 21:00
https://www.youtube.com/watch?v=KkzvQSuYd5I
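A very rough sketch of the composition idea, not taken from the video: each service registers a contributor function, and the request handler merges whatever each one returns. The contributors below are hard-coded stand-ins and all names are assumptions; real contributors would query their own service and would likely be asynchronous.

```javascript
// Each service registers a contributor that returns its slice of the view.
// These three are hypothetical stand-ins for Catalog, Pricing, and Inventory.
const contributors = [
  (productId) => ({ name: "Widget" }),
  (productId) => ({ price: 9.5 }),
  (productId) => ({ inStock: true }),
];

// The infrastructure component merges every slice into one view model.
function composeViewModel(productId, contributors) {
  return contributors.reduce(
    (viewModel, contribute) => ({ ...viewModel, ...contribute(productId) }),
    { productId }
  );
}

const viewModel = composeViewModel("p1", contributors);
```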
Hope it helps.

Update on 14-Feb-2019
Probably this will answer your question in more detail https://stackoverflow.com/a/54676222/1235935
I think the right approach here is to use Event Sourcing to pre-aggregate the show case data with product description, inventory and price. A separate microservice is probably not needed. This pre-aggregated data (a.k.a. materialized view) can be stored in the same microservice that handles the user request to display products (probably the order creation service).
The events could be generated by log-based Change Data Capture (CDC) from the DBs of the product, inventory and pricing services, writing them to their respective topics in a log-structured streaming platform (e.g. Kafka or AWS Kinesis) as mentioned here. This will also ensure "read your own write" guarantees in the product, inventory and pricing services.

Create a VO from an Entity

I'm building an e-commerce system with DDD, Event Sourcing, and CQRS. My idea is to separate each AR into its own microservice.
In my AR ShoppingCart I need a VO Item with a productId and a Price, because the price doesn't change after the item is added to the cart.
I have another AR, Product, that controls the price.
My problem is: how do I get the Price from the AR Product without a synchronous request to Product, since I'm using an event architecture?

Fundamentally, what you are trying to do is copy information from one aggregate root to the other.
There are two approaches you might take.
One is to think in terms of a cache - we pass to the shopping cart an instance of a domain service that knows how to take a correlation id (product code?) and get a cached copy of a price. So we have a background process that copies pricing information from the pricing micro service to the shopping cart micro service, and then the autonomous shopping cart relies on its locally cached copy of the price.
Important note: there's nothing wrong with including timeliness metadata in the cache, so that the shopping cart can include intelligence about whether or not the cached information is "too old".
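A small sketch of such a cache with timeliness metadata follows. The `PriceCache` class, the `asOf` timestamp field, and the `maxAgeMs` threshold are assumptions chosen for illustration; timestamps are passed in explicitly so the behavior is easy to follow.

```javascript
// Locally cached prices, each stored with the time it was known to be valid.
class PriceCache {
  constructor(maxAgeMs) {
    this.maxAgeMs = maxAgeMs;
    this.entries = new Map();
  }

  // Called by the background process that copies prices from pricing.
  put(productId, price, asOf) {
    this.entries.set(productId, { price, asOf });
  }

  // Returns the price, or null when there is no entry or it is too old.
  get(productId, now) {
    const entry = this.entries.get(productId);
    if (!entry || now - entry.asOf > this.maxAgeMs) return null;
    return entry.price;
  }
}

const cache = new PriceCache(60000); // accept prices up to one minute old
cache.put("p1", 9.5, 1000);

const freshPrice = cache.get("p1", 30000);  // 29 s old: still usable
const stalePrice = cache.get("p1", 100000); // 99 s old: treated as too old
```

What the cart does with a `null` (refuse the item, show a warning, request a refresh) is a business decision.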
The other is more direct: have a method by which you can send a command with the price to the shopping cart, and build some orchestration logic that observes which shopping carts need prices and then sends a command to the cart with the appropriate price.

If you have two microservices, you can have each microservice publish a stream of events. Your ShoppingCart microservice can consume PriceChanged events from your Product microservice and maintain a local cache of the last price per Product. When you add a Product to a ShoppingCart, you would reference the local cache of prices.
This same approach of listening to events as a means of communication scales up from inter-Aggregate to inter-BoundedContext or inter-Microservice and even inter-System. Depending on your sensitivity to price changes, you might have to employ other approaches as described, but I assume you have some tolerance for eventual consistency given your choice of the CQRS+ES pattern.
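The event-consuming approach could look roughly like this. The shape of the `PriceChanged` event, the `latestPrices` map, and the `ShoppingCart` methods are assumptions for the sake of the example; subscribing to an actual broker is left out.

```javascript
// Service-level cache in the ShoppingCart microservice: productId -> last price.
const latestPrices = new Map();

// Handler for PriceChanged events published by the Product microservice.
function onPriceChanged(event) {
  latestPrices.set(event.productId, event.price);
}

class ShoppingCart {
  constructor(prices) {
    this.prices = prices;
    this.items = [];
  }

  // Copies the cached price into an immutable Item value object.
  addItem(productId) {
    if (!this.prices.has(productId)) {
      throw new Error(`No known price for product ${productId}`);
    }
    const item = Object.freeze({ productId, price: this.prices.get(productId) });
    this.items.push(item);
    return item;
  }
}

onPriceChanged({ productId: "p1", price: 10 });
const cart = new ShoppingCart(latestPrices);
cart.addItem("p1");

// A later price change does not touch the item already in the cart.
onPriceChanged({ productId: "p1", price: 12 });
cart.addItem("p1");
```

No synchronous request to Product is needed when adding an item; the cart only reads what earlier events have already delivered.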

Track multiple context for the same Bot

We have a bot that will be used by different customers, and depending on their database and sector of activity we're going to have different answers from the bot and different inputs from users. Intents etc. will be the same; for now we don't plan to make a custom bot for each customer.
What would be the best way to separate data per customer within Chatbase?
I'm not sure if we should use
A new API key for each customer (Do we have a limitation then?)
Differentiate them by the platform filter (seems not to be appropriate)
Differentiate them by the version filter (this would also feel a bit weird to me)
Use a Custom Event (not sure how, though)
For example, in Dialogflow we pass the customer name/id as a context parameter.

Thank you for your question. You listed the two workarounds I would suggest; here are the pros and cons of each:
New API key for each customer: It could become unwieldy to have to change bots every time you want to look at a different user's metrics. You should also create a general API (bot) where you send all messages in order to get the aggregate metrics. This would mean making two API calls per message.
Differentiate by version filter: This would be the preferred method; however, it could lengthen load times for your reports as your number of users grows. The advantage is that all of your metrics are in one place, and they are aggregated while you only have to send one API call per message.

DDD, external data and Repository

I'm thinking of using DDD for our next application. I have already found a lot of interesting papers and answers but cannot find a solution to my problem:
We have an SOA architecture where some services are known as the master of their data. That's nice, but I can't figure out how to use them nicely with DDD.
Take a service "employees" that is the master of the Employee data; it is CRUD over a couple of simple values (first and last name, birthdate, address).
My new app should track the trainings offered to those employees. So I have the concept of a Participant: a Participant has the same values as an Employee plus a list of trainings and a skill.
We can suppose that the "trainings" application has a database with a table of participants that contains a participant_id, a skill, and an employee_id used to retrieve the first and last name.
Am I correct?
But now, which component should I use to call the "employees" service? Is it the ParticipantRepository, so that when I get a participant I have its names? Or is it the application service that completes the Participant data before using it? Or should I explicitly call the employees service when needed?
Thanks a lot.

In your training application (I mean in the domain of your application) the concept of an employee might not exist as anything other than an external reference. As you correctly said, that will be a Participant.
I understand that you need to get some data from the employee service to populate the participant. I can think of a few options.
1) ParticipantRepository builds a Participant, which is an aggregate root; some of that data might be in a PersonalDetails value object. This value object is constructed by calling the employee app. This approach is easy, but might not be the best. This is the approach you mentioned, where the ParticipantRepository calls an interface PersonalDetailsService and the implementation of that interface does the actual call to the Employee service. In this way, your domain has no idea that it is dealing with employees, as it only sees PersonalDetails.
2) Eventual consistency by replicating data from the employee service: if the employee service can send a notification when an employee is updated (e.g. via messaging), you can listen to those events and keep a read-only copy of the data. The benefit of this is that your app will work even if the employee service goes down. The problem is that you might need to build something to re-send data that might have been lost.
Both of these approaches are explained quite well in the book Implementing Domain-Driven Design.
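Option 1 could be sketched like this. The class and method names (`detailsFor`, `findById`), the row layout, and the injected `fetchEmployee` function are assumptions; in a real application `fetchEmployee` would call the "employees" service and the rows would come from the trainings database.

```javascript
// Adapter around the "employees" service. The domain only ever sees a
// PersonalDetails value object, never the Employee concept.
class PersonalDetailsService {
  constructor(fetchEmployee) {
    this.fetchEmployee = fetchEmployee;
  }

  detailsFor(employeeId) {
    const employee = this.fetchEmployee(employeeId);
    return Object.freeze({ firstName: employee.firstName, lastName: employee.lastName });
  }
}

// Repository that assembles the Participant aggregate from local rows
// plus the personal details obtained through the service above.
class ParticipantRepository {
  constructor(rows, personalDetailsService) {
    this.rows = rows;
    this.personalDetailsService = personalDetailsService;
  }

  findById(participantId) {
    const row = this.rows.get(participantId);
    if (!row) return null;
    return {
      id: participantId,
      skill: row.skill,
      trainings: [...row.trainings],
      personalDetails: this.personalDetailsService.detailsFor(row.employeeId),
    };
  }
}

// Stand-ins for the employees service and the participants table.
const fetchEmployee = (id) => ({ id, firstName: "Ada", lastName: "Lovelace", address: "n/a" });
const rows = new Map([[7, { employeeId: 42, skill: "welding", trainings: ["safety-101"] }]]);

const repository = new ParticipantRepository(rows, new PersonalDetailsService(fetchEmployee));
const participant = repository.findById(7);
```

Swapping the `fetchEmployee` call for a lookup in a locally replicated table would turn the same repository into option 2 without changing the domain.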
