I ran into a microservices architecture for an e-commerce application where each table has its own microservice, basically with CRUD operations (something like a REST client for each table).
Now I am thinking about combining them and modeling them around business domains. Before that, I wanted to know whether anyone has encountered such a situation and whether it is the right architecture or not.
Any suggestions will be very helpful.
Thanks.
Each microservice should have its own set of SQL tables that no other microservice can access. But having one microservice per SQL table, and having each microservice just support CRUD operations is generally an anti-pattern: it turns a powerful DBMS and query language into a simple record manager: no cross-table transactions, joins, filtering, sorting, pagination, etc.
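To make the cost concrete, here is a minimal sketch using Python's built-in sqlite3 module. The customers and orders tables and the function names are made up for illustration. It computes the same report twice: once through per-table CRUD-style calls, and once with a single join inside one service's own database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 5.0);
""")

# Hypothetical per-table "CRUD services": each one can only read its own table.
def get_customers():
    return conn.execute("SELECT id, name FROM customers").fetchall()

def get_orders_for(customer_id):
    return conn.execute(
        "SELECT total FROM orders WHERE customer_id = ?", (customer_id,)
    ).fetchall()

def totals_via_crud_calls():
    # One call per customer (N+1 calls); aggregation and sorting are redone by hand.
    rows = [
        (name, sum(total for (total,) in get_orders_for(customer_id)))
        for customer_id, name in get_customers()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)

def totals_via_join():
    # One query: the DBMS does the join, the aggregation and the ordering.
    return conn.execute("""
        SELECT c.name, SUM(o.total) AS spent
        FROM customers c JOIN orders o ON o.customer_id = c.id
        GROUP BY c.id
        ORDER BY spent DESC
    """).fetchall()
```

Both functions return the same rows, but in the per-table design every get_* call would be a network round trip, and nothing wraps the calls in a transaction.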
You're mixing up different, unrelated things.
(Micro)services are logical entities that do some specific task. They communicate with other services to perform a larger-scope task.
Tables/CRUD/SQL/NoSQL come from an entirely different level: that's where data is saved and how it's accessed.
It's true that services use SQL and have tables. It's also probably a good idea to have separate tables for each service. I would even go as far as saying that if two services directly use the same table, you're probably looking at a design problem.
But you can't equate services with tables; conceptually, they belong in different worlds.
Microservices are logical blocks of an application; combining them at the SQL level doesn't make any sense.
For example, let's say you create an order service, which allows a customer to place an order.
Now, an order contains order items as well and may have a reference to a customer object; for all of these you might end up creating multiple tables. So don't think of SQL tables and microservices as the same thing.
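A rough sketch of that order service, assuming Python with sqlite3 as a stand-in database. The class, table and column names are hypothetical. The service owns two tables, keeps only the customer's id as a reference, and writes to both tables in one transaction.

```python
import sqlite3

class OrderService:
    """Hypothetical order service; its tables are private to it."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:")
        self._db.executescript("""
            CREATE TABLE orders (
                id INTEGER PRIMARY KEY,
                customer_id INTEGER NOT NULL  -- reference only; customer data lives elsewhere
            );
            CREATE TABLE order_items (
                order_id INTEGER NOT NULL REFERENCES orders(id),
                sku TEXT NOT NULL,
                quantity INTEGER NOT NULL CHECK (quantity > 0)
            );
        """)

    def place_order(self, customer_id, items):
        # One business operation, one transaction, several tables.
        # The connection context manager commits on success and rolls back on error.
        with self._db:
            cursor = self._db.execute(
                "INSERT INTO orders (customer_id) VALUES (?)", (customer_id,)
            )
            order_id = cursor.lastrowid
            self._db.executemany(
                "INSERT INTO order_items VALUES (?, ?, ?)",
                [(order_id, sku, quantity) for sku, quantity in items],
            )
        return order_id

    def item_count(self, order_id):
        row = self._db.execute(
            "SELECT COALESCE(SUM(quantity), 0) FROM order_items WHERE order_id = ?",
            (order_id,),
        ).fetchone()
        return row[0]

    def order_count(self):
        return self._db.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

If any item is invalid, the whole order is rolled back, which is exactly what you lose when orders and order items sit behind two separate CRUD services.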
If you still have doubts, post a more exact question; that will help :)
I'm trying to build a service architecture that includes two Node.js apps which share the same database. The overall service architecture looks like below (simplified version)
I'm planning to use Sequelize as an ORM to access the database. As far as I know, if a service uses Sequelize, it needs a model to get the structure of the data tables. In my case, api and service will access the same database, which means they should share the same Sequelize model.
So here is the question: where should I locate the common Sequelize relevant files? It seems I have two choices:
put them in a common parent location (assuming the project structure is a monorepo) so that each app can use the same single set of files
maintain copies of the files in each app's project folder. In this case, each app will be independent (let's say I want to dockerize each app), but whenever the Sequelize files are modified in one app, the same change has to be made in the other.
I'm not sure whether my understanding is correct. Is my question valid? If so, what is the better choice and practice? Thanks in advance for your answers.
There is no correct answer, it depends on the specific situation, but sharing a database between multiple microservices is a bad design.
Sharing a database means tight coupling at the data level. The direct consequence is that when a service modifies the database table structure, such as deleting the name field of the user table, it may break the APIs of every other service that uses the Sequelize user model. All of those services need to update the model definition and modify the implementation code of their APIs.
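The breakage is easy to reproduce. The sketch below uses Python and sqlite3 rather than Node.js and Sequelize, purely to stay self-contained; the function names are hypothetical and stand for code in the two apps.

```python
import sqlite3

# One database shared by two hypothetical apps, "api" and "service".
shared_db = sqlite3.connect(":memory:")
shared_db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
shared_db.execute("INSERT INTO users VALUES (1, 'Ann', 'ann@example.com')")

def api_get_user_name(user_id):
    # Code in "api" depends on the name column.
    row = shared_db.execute(
        "SELECT name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return row[0]

def service_migrate_drop_name():
    # Code in "service" rebuilds the table without the name column.
    shared_db.executescript("""
        CREATE TABLE users_new (id INTEGER PRIMARY KEY, email TEXT);
        INSERT INTO users_new SELECT id, email FROM users;
        DROP TABLE users;
        ALTER TABLE users_new RENAME TO users;
    """)

def api_still_works():
    try:
        api_get_user_name(1)
        return True
    except sqlite3.OperationalError:
        # SQLite reports the missing column as an OperationalError.
        return False
```

Calling service_migrate_drop_name() makes api_still_works() flip from True to False, even though nothing in "api" was deployed or changed.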
If all of your services are maintained by a single team, I suggest you choose the first solution, which costs less and is easier to maintain. If your services are maintained by different teams, the two solutions are actually similar, because whenever the table structure is modified, the application layer model needs to be modified or at least checked to confirm it still works.
Therefore, I recommend following the best practices of microservice architecture, first splitting the database vertically according to the business model, and building application APIs on top of it.
Core principles of microservices:
loose coupling
high cohesion
I'm new to DDD and CQRS and I'm planning to build a simple application to improve my skills a bit.
What I'm planning to do is a simple Taxi Corp application.
Requirements:
Client orders a taxi.
Client can have only one order at a time.
Driver picks an order.
Driver can have only one order at a time.
Driver goes to client.
Client enters cab.
Course starts.
Course finishes.
Client is charged and driver is paid.
And so on.
I can see there can be three aggregates: Client, Order and Driver. I want to split them into separate microservices. Do you think it's a good idea or I should start with one microservice?
I'm currently focused on ordering a taxi. First of all I need to check if client doesn't already have a course assigned, later on I can create an order. After the order is created, I need to assign it to client. As during one request only one aggregate can be updated/created I wonder how to do it correctly. I've read something about Process Managers and I think it will be very useful in this case. I even drew a schema of the communication. Can anyone tell me if my approach is correct and give me some tips on how to go further?
Process of creating an order
Do you think it's a good idea or I should start with one microservice?
I refer you to the wisdom of John Gall
A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over, beginning with a working simple system.
Instead of worrying about microservices, give your attention to messages.
Someone said: "If you have more microservices than customers, you are doing it wrong".
And if you really follow the CQRS/ES approach, the resulting system is much easier to split apart than a traditional ORM monolith.
So focus on the domain first and start with a monolith.
Start with the microservices design. Even if you get it wrong, you gain better insight into the desired architecture, because problems in a microservices design show themselves very soon.
Client and driver are both users of the system and have some commonalities, so you can consider them as one domain with one microservice.
Consider an order manager microservice that assigns a client and a driver to a trip by their ids. The order database may include a trips table with two id keys, driver id and client id, and some columns for the different states. After finishing each trip you can remove it from the trips table and insert it into an archive table. Alternatively, you can leave it there and partition the table daily to keep database performance high.
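A small sketch of that trips table, in Python with sqlite3. The schema and function names are assumptions, not something from the question. Because finished trips are moved to the archive, a UNIQUE constraint on the active table is one way to get "only one order at a time" for clients and drivers.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Active trips only. AUTOINCREMENT keeps ids from being reused after archiving.
    CREATE TABLE trips (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        client_id INTEGER NOT NULL UNIQUE,  -- one active order per client
        driver_id INTEGER UNIQUE,           -- NULL until a driver picks the order
        state TEXT NOT NULL DEFAULT 'ordered'
    );
    CREATE TABLE trips_archive (
        id INTEGER PRIMARY KEY,
        client_id INTEGER NOT NULL,
        driver_id INTEGER
    );
""")

def order_taxi(client_id):
    with db:
        return db.execute(
            "INSERT INTO trips (client_id) VALUES (?)", (client_id,)
        ).lastrowid

def pick_order(trip_id, driver_id):
    with db:
        db.execute(
            "UPDATE trips SET driver_id = ?, state = 'assigned' WHERE id = ?",
            (driver_id, trip_id),
        )

def finish_trip(trip_id):
    # Copy and delete in one transaction, so a trip is never lost or duplicated.
    with db:
        db.execute(
            "INSERT INTO trips_archive "
            "SELECT id, client_id, driver_id FROM trips WHERE id = ?",
            (trip_id,),
        )
        db.execute("DELETE FROM trips WHERE id = ?", (trip_id,))
```

A second order_taxi call for a client who already has an active trip raises sqlite3.IntegrityError; after finish_trip the same client can order again.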
Consider an accounting microservice for keeping payments and transactions. It's OK if you opt to use NoSQL databases for the other microservices, but do use an SQL database for your transactions.
You may need another microservice for reporting and dashboards. Mirror the other databases into a new one for reporting.
You also need an API gateway to route requests to the microservices and handle authentication.
Your process is a set of events. You will definitely expand the system later on and perhaps have some long-running tasks, so it is better to have a message broker and implement your flow as an event/task flow using patterns like event sourcing.
I can see there can be three aggregates: Client, Order and Driver. I
want to split them into separate microservices. Do you think it's a
good idea or I should start with one microservice?
They all belong to the same bounded context. A bounded context translates nicely to a microservice (see Eric Evans video: https://www.infoq.com/news/2015/06/dddx-microservices-boundaries). But don't start by designing a microservice; you are doing it in the wrong order. Design your bounded context first; then, if it makes sense, create a microservice around the hexagonal architecture.
After the order is created, I need to assign it to client. As during
one request only one aggregate can be updated/created I wonder how to
do it correctly.
This is the perfect example of why you need to do it all in the same process.
But if you want to go with multiple microservices, think of eventual consistency (https://en.wikipedia.org/wiki/Eventual_consistency) and create a message-driven architecture between your services. That might be too much work in my opinion, but for learning purposes it can be a good idea.
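If you do try the message-driven route, the flow could look roughly like this. It is an in-memory Python sketch, not a real broker; the Bus class and the event names (OrderCreated, ClientAssigned, ClientBusy) are invented for illustration. Each handler touches exactly one aggregate, and the order stays "pending" until the client side has answered.

```python
from collections import deque

class Bus:
    """Toy in-memory message bus standing in for a real broker."""

    def __init__(self):
        self._queue = deque()
        self._handlers = {}

    def subscribe(self, event_type, handler):
        self._handlers.setdefault(event_type, []).append(handler)

    def publish(self, event_type, payload):
        self._queue.append((event_type, payload))

    def drain(self):
        # Deliver queued events, including ones published by handlers.
        while self._queue:
            event_type, payload = self._queue.popleft()
            for handler in self._handlers.get(event_type, []):
                handler(payload)

class OrderService:
    def __init__(self, bus):
        self.bus = bus
        self.orders = {}  # order_id -> state
        bus.subscribe("ClientAssigned", self.on_client_assigned)
        bus.subscribe("ClientBusy", self.on_client_busy)

    def create_order(self, order_id, client_id):
        # Updates only the Order aggregate, then announces what happened.
        self.orders[order_id] = "pending"
        self.bus.publish("OrderCreated", {"order_id": order_id, "client_id": client_id})

    def on_client_assigned(self, event):
        self.orders[event["order_id"]] = "confirmed"

    def on_client_busy(self, event):
        self.orders[event["order_id"]] = "rejected"

class ClientService:
    def __init__(self, bus):
        self.bus = bus
        self.active_order = {}  # client_id -> order_id
        bus.subscribe("OrderCreated", self.on_order_created)

    def on_order_created(self, event):
        # Updates only the Client aggregate.
        client_id, order_id = event["client_id"], event["order_id"]
        if client_id in self.active_order:
            self.bus.publish("ClientBusy", {"order_id": order_id})
        else:
            self.active_order[client_id] = order_id
            self.bus.publish("ClientAssigned", {"order_id": order_id})
```

Between create_order and drain the two services disagree for a moment; that window is the eventual consistency you would be signing up for.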
In a standard microservice architecture, each service is responsible for its own data, with boundaries set. The only way to manipulate this data is through RESTful endpoints provided by the service.
I have a unique case where I would like to have a few clustered scraper processes running, populating a table with raw data. These scraper processes can also be configured for specific cases, say one to scrape text, one to scrape images, etc.
The raw data will then be consumed and aggregated into a normalized structure in another table by another process. I'd like to split all these processes out into small, deployable components, but that means that I must somehow share the model definitions across multiple repositories/projects, since the aggregation logic must consume all the raw data.
It's possible for the aggregation logic to make requests to each clustered scraper process, but the state control for that would be a lot more complex than just querying a table.
I know it's possible to define the model definitions in an isolated repo and then import as a dependency in other projects, but is this the correct architecture?
The best case for when to use microservices is when you have very distinct bounded contexts in your problem domain. When you have overlapping context boundaries like the scenario you've described, microservices will probably cost you more than you'd gain. Do you feel like you'd gain productivity by deconstructing your application into microservices despite this issue?
Without a better look at your application, it's hard to give definitive answers, but when you're bumping into problems like this at the outset, there's a good chance that this isn't a good case for a microservice architecture. Bear in mind, that's just my two cents.
Sharing physical repositories for configuration sounds pretty onerous, and I'd avoid it if at all possible!
We are currently working on a design using Azure functions with Azure storage queue binding.
Each message in the queue represents a complete transaction. An Azure function will be bound to that queue so that the function will be triggered as soon as there is a new message in the queue.
The function will then commit the transaction in a SQL DB.
The first-cut implementation is also complete, and it's working fine. However, in retrospect, we are considering the following:
In a typical DAL, there are well-established design patterns using Entity Framework, repository patterns, etc. However, we didn't find similar guidance or best practices for implementing a DAL within serverless code.
Therefore, my question is: should such patterns be implemented with Azure functions (this would be challenging :) ), or should the server-less code be kept as light as possible or this is not a use-case for azure functions, at all?
It doesn't take anything too special. We're using a routine set of library DLLs for all kinds of things: database access, interacting with other parts of Azure (like retrieving Key Vault secrets for connection strings), parsing file uploads, business rules, and so on. The libraries target netstandard2.0 so we can more easily migrate to Functions v2 when the right triggers become available.
Mainly just design your libraries so they're highly modularized, so you can minimize how much you load to get the job done (assuming reuse in other areas of the system is important, which it usually is).
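As a rough illustration of that split, here is a language-neutral sketch written in Python instead of C#. The repository class and the handler are hypothetical, sqlite3 stands in for the SQL DB, and no Azure SDK is involved: handle_queue_message only represents the body a queue trigger would call.

```python
import json
import sqlite3

class PaymentRepository:
    """Hypothetical data access class that would live in a shared library."""

    def __init__(self, connection):
        self._conn = connection
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS payments (id TEXT PRIMARY KEY, amount REAL NOT NULL)"
        )

    def save(self, payment_id, amount):
        # Commit on success, roll back if the insert fails.
        with self._conn:
            self._conn.execute(
                "INSERT INTO payments VALUES (?, ?)", (payment_id, amount)
            )

    def total(self):
        row = self._conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM payments"
        ).fetchone()
        return row[0]

def handle_queue_message(message_body, repository):
    """Stand-in for the function body: parse the message, delegate, nothing else."""
    data = json.loads(message_body)
    repository.save(data["id"], data["amount"])
```

The function stays a few lines long, and the repository can be reused and unit tested outside the Functions host, for example by passing it an in-memory connection.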
It would be easier if dependency injection was available today. See this for a few ways some of us have hacked it together until we get official DI support. (DI is on the roadmap for Functions, I believe the 3.0 release.)
At first I was a little worried about startup time with the library approach, but the underlying WebJobs stack itself is already pretty heavy, and Functions startup performance seems to vary wildly anyway (on the cheaper tiers, at least). During testing, one of our infrequently executed Functions has varied from just ~300ms to a peak of about ~3800ms to parse the exact same test file, with all but ~55ms spent on startup.
should such patterns be implemented with Azure functions (this would
be challenging :) ), or should the server-less code be kept as light
as possible or this is not a use-case for azure functions, at all?
My answer is NO.
There should be patterns to follow, but the traditional repository patterns and CRUD operations do not seem to be valid in the cloud era.
Many strong concepts we were raised to adhere to have become invalid these days.
Denormalizing the database has become something not only acceptable but preferable.
Now, designing a pattern will depend on the database you selected for your solution, and also on the type of your application and the type of your data.
This is a link to the general Table Storage design guidelines.
Is your application read-heavy or write-heavy ? The design will vary accordingly.
Are you using Azure Tables or Mongo? There are design decisions based on that. Indexing is important in Mongo, while in Azure Tables there is no indexing that you can configure.
Sharding consideration.
Redundancy Consideration.
In modern development and architecture many principles have changed; each microservice has its own database, which might be totally different from that of any other microservice.
If you read along the guidelines that I provided, you will see what I mean.
Designing your Table service solution to be read efficient:
Design for querying in read-heavy applications. When you are designing your tables, think about the queries (especially the latency sensitive ones) that you will execute before you think about how you will update your entities. This typically results in an efficient and performant solution.
Specify both PartitionKey and RowKey in your queries. Point queries such as these are the most efficient table service queries.
Consider storing duplicate copies of entities. Table storage is cheap so consider storing the same entity multiple times (with different keys) to enable more efficient queries.
Consider denormalizing your data. Table storage is cheap so consider denormalizing your data. For example, store summary entities so that queries for aggregate data only need to access a single entity.
Use compound key values. The only keys you have are PartitionKey and RowKey. For example, use compound key values to enable alternate keyed access paths to entities.
Use query projection. You can reduce the amount of data that you transfer over the network by using queries that select just the fields you need.
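A few of these read-side guidelines can be sketched with a toy model. The Table class below is a plain Python dictionary, not the Azure SDK, and the employee fields and key prefixes are made up. It shows a point query, a compound RowKey, and a duplicate copy stored under a second key.

```python
class Table:
    """Toy stand-in for a table: partition key -> row key -> entity."""

    def __init__(self):
        self._partitions = {}

    def insert(self, partition_key, row_key, entity):
        self._partitions.setdefault(partition_key, {})[row_key] = entity

    def point_query(self, partition_key, row_key):
        # Both keys are known, so this is a direct lookup.
        return self._partitions.get(partition_key, {}).get(row_key)

    def scan(self, predicate):
        # Filtering on a non-key property has to look at every entity.
        return [
            entity
            for rows in self._partitions.values()
            for entity in rows.values()
            if predicate(entity)
        ]

def store_employee(table, employee):
    department = employee["dept"]
    # Same entity stored twice, under two compound row keys,
    # so it can be found by id or by email with a point query.
    table.insert(department, "id_" + str(employee["id"]), employee)
    table.insert(department, "email_" + employee["email"], employee)
```

With this layout, table.point_query("sales", "email_ann@example.com") finds the employee without a scan; the price is that every update has to rewrite both copies.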
Designing your Table service solution to be write efficient:
Do not create hot partitions. Choose keys that enable you to spread your requests across multiple partitions at any point of time.
Avoid spikes in traffic. Smooth the traffic over a reasonable period of time and avoid spikes in traffic.
Don't necessarily create a separate table for each type of entity. When you require atomic transactions across entity types, you can store these multiple entity types in the same partition in the same table.
Consider the maximum throughput you must achieve. You must be aware of the scalability targets for the Table service and ensure that your design will not cause you to exceed them.
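The hot partition advice can be illustrated the same way. In this Python sketch the key functions and the bucket count of 8 are arbitrary assumptions. Using the date alone as PartitionKey sends all of a day's writes to one partition, while adding a hash-derived bucket suffix spreads them out.

```python
import hashlib
from collections import Counter

def date_partition_key(event_id, day):
    # Every write for the same day lands in the same partition.
    return day

def spread_partition_key(event_id, day, buckets=8):
    # Derive a stable bucket from the event id and append it to the day.
    digest = hashlib.sha256(event_id.encode("utf-8")).digest()
    return day + "_" + str(digest[0] % buckets)

def partition_load(key_function, event_ids, day):
    # Count how many writes each partition key would receive.
    return Counter(key_function(event_id, day) for event_id in event_ids)
```

The trade-off is on the read side: fetching everything for one day now means querying each bucket, so this only pays off when write throughput is the bottleneck.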
Another good source is this link:
Given the following "facts" I have gleaned from reading around this.
Federations are separate databases from the moment they are created.
As copies of the original, they will not alter automatically if I alter the original's schema.
As separate databases you cannot cross join.
Each federation is priced as a separate db.
I will have to provide a TenantId field to each table I want to federate.
If these are correct, what are the advantages of using federation to achieve multi-tenancy over simply using separate dbs? Or if they're not correct, please put me straight.
Note, we have a small number of tenants, maybe 20.
Your understanding is correct.
There are a few interesting aspects of Federations that you may find useful. First it is a relatively flexible partitioning environment. For example you can group 10 tenants into the first member, and 50 in the second, based on usage patterns of your customers. Or you could simply isolate a single customer that is using the system more than the others.
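To picture that flexibility, here is a conceptual Python sketch of range-based routing. FederationMap and its methods are invented for illustration and are not the SQL Database Federations API. Each member owns the tenant ids from its low bound up to the next bound, and adding a bound splits a member.

```python
import bisect

class FederationMap:
    """Hypothetical range map: sorted low bounds, one per federation member."""

    def __init__(self, low_bounds):
        self._low_bounds = sorted(low_bounds)

    def member_for(self, tenant_id):
        # Index of the last low bound that is <= tenant_id.
        index = bisect.bisect_right(self._low_bounds, tenant_id) - 1
        if index < 0:
            raise ValueError("tenant id is below the first range")
        return index

    def split_at(self, tenant_id):
        # Adding a bound splits the member that currently contains tenant_id.
        if tenant_id not in self._low_bounds:
            bisect.insort(self._low_bounds, tenant_id)
```

For example, FederationMap([0, 10, 60]) puts tenants 0 to 9 in the first member and 10 to 59 in the second; calling split_at(25) and split_at(26) afterwards leaves tenant 25 alone in a member of its own.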
Another important concept is that you can have multiple federations per database. So you could have a Customer federation and a SalesHistory federation for example.
Last but not least you may want to read this article that discusses connection pool fragmentation that occurs in traditional sharding models, but is not an issue with SQL Database Federations.