Questions pertaining to micro-service architecture - node.js

I have a couple of questions that exist around micro service architecture, for example take the following services:
orders,
account,
communication &
management
Question 1: From what I read I understand that each service is suppose to have ownership of the data pertaining to that service, so orders would have an orders database. How important is that data ownership? Would micro-services make sense if they all called from one traditional database such that all data pertaining to the services would exist in one database? If so, are there an implications of structuring the services this way.
Question 2: Services should be able to communicate with one and other. How would that statement be any different than simply curling an existing API? & basing the logic on that response? Is calling a service more efficient than simply curling the API?
Question 3: Is it worth it? Now I understand this is a massive generality , and it's fundamentally predicated on the needs of the business. But when that discussion has been had, was the re-build worth it? & what challenges can you expect to face

I will try to answer all the questions.
Respect to all services using the same database. If you do so you have two main problems. First the database would become a bottleneck because all requests will go to the same point. And second you will have coupled all your services, so if the database goes down or it needs to update, all your services will be affected. (The database will became a single point of failure)
The communication between services could be whatever your services need (syncrhonous, asynchronous, via message passing (message broker), etc..) it all depends on the use cases you have to support. The recommended way to do to avoid temporal decoupling is to use a message broker like kafka, doing this your services don't have to known each other and in case some of them go down the others will still working. And when they are up again, they can continue to process the messages that have pending. However, if your services need to respond in synchronous way, you can define synchronous communication between services and use a circuit breaker to behave properly in case the callee service is down.
Microservices architecture is far more complicated to make it work, to monitoring and to debug than a traditional monolith architecture so, it is only worth if you will have very large requirements of scalability and availability and/or if the system is very large and it will require several teams working in different parts of the system and it is recommendable to avoid dependencies among them. So each team can work at their own pace deploying their own services

Related

CQRS and Event Sourcing Guide

I want to create a CQRS and Event Sourcing architecture that is very cheap and very flexible and very uncomplicated.
I want to make sure that events never fail to at least reach the publisher/event store, ever, ever, because that's where business is.
Now, i have several options in mind:
Azure
With azure, i seem to not know what to use.
Azure service bus
Azure Function
Azure webjob (i suppose this can be replaced with Azure functions)
?? (something else i forgot or dont know?)
How reliable are these azure server-less solutions??
Custom
For this i am thinking of using RabbitMQ, the problem is the cost of a virtual machine to run it.
All in all, i want:
Ability to replay the messages/events in case of failure.
Ability to easily add subscribers.
Ability to select the subscribers upon which to replay the messages.
The Event store should be able to store very large sizes of event messages (or how else shall queue an image or file??).
The event store MUST NEVER EVER get chocked, or sleep.
Speed of implementation/prototyping would be an added
advantage.
What does your experience suggest?
What about other alternatives? (eg: apache-kafka)?
Why not run Event Store? Created by Greg Young himself. Host where you need.
I am a java user, I have been using hornetq (aka artemis which I dont use) an alternative to rabbitmq for the longest; the only problem is it does not support replication but gets the job done when it comes to eventsourcing. For your custom scenario, rabbitmq is a good choice but try running it on a digital ocean instance for low costs. If you are looking for simplicity and flexibility you have only 2 choices , build your own or forgo simplicity and pick up apache kafka with all its complexities but will give you flexibility. Again you can also build an eventstore with mongodb. https://www.mongodb.com/blog/post/event-sourcing-with-mongodb
Your requirements are too vague to make the optimal choice. You need to consider a lot of things, one of them would be, for instance, the numbers of events per one aggregate, the number of aggregates (note that this has to be statistical). Those are important primarily because if you allow tens of thousands of events for each aggregate then you would need to have snapshotting which adds complexity which you might not need.
But for regular use cases you could just use a relational database like Postgres as your (linearizable) event store. It also has a listen/notify functionality to you would not really need any message bus either and your application could be written in a reactive way.

Learning DDD and CQRS

I'm new to DDD and CQRS and I'm planning to build a simple application to improve my skills a bit.
What I'm planning to do is a simple Taxi Corp application.
Requirements:
Client orders a taxi.
Client can have only one order at a time.
Driver picks an order.
Driver can have only one order at a time.
Driver goes to client.
Client enters cab.
Course starts.
Course finishes.
Client is purchased and driver is paid
And so on.
I can see there can be three aggregates: Client, Order and Driver. I want to split them into separate microservices. Do you think it's a good idea or I should start with one microservice?
I'm currently focused on the ordering a taxi. First of all I need to check if client doesn't already have a course assigned, later on I can create an order. After the order is created, I need to assign it to client. As during one request only one aggregate can be updated/created I wonder how to do it correctly. I've read something about Process Managers and I think it will be very useful in this case. I even draw a schema of communication. Can anyone tell me if my approach is correct and give me some tips on how to going further?
Process of creating an order
Do you think it's a good idea or I should start with one microservice?
I refer you to the wisdom of John Gall
A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over, beginning with a working simple system.
Instead of worrying about microservices, give your attention to messages.
Someone said: "If you have more microservices than customers, you are doing it wrong".
And if you really follow CQRS/ES approach, resulting system is much easier to split apart than traditional ORM monolyths.
So focus on the domain first and start with monolyth.
start with the microservices design even in a wrong way, you get a better insight into desired architecture. because problems in microservices architecture design show themselves very soon.
client and driver are both users of systems and have some commonalities so you can consider them as one domain and one micro-service for them.
consider an order manager micro-service to assign client and driver to a trip by their ids. the order database may include trips table with two id keys for driver-Id and client-Id and some columns for the different states. after finishing each trip you can remove it from the trip table and insert that in an archive table. also, you can leave it there and partition your table daily to keep your database performance high.
consider an accounting micro-service for keeping payments and transactions. It's ok if you opt to use NoSql databases for other microservices, but do use SQL database for your transactions.
you may need another microservice for reporting and dashboards. mirror other dbs in a new one for reporting.
you also need an API gateway to route requests to micro-services or do authentication
your process is a set of events. definitely, you will expand the system later on and perhaps will have some long-running tasks, better to have a message broker and implement your flow as an event/task flow using patterns like event sourcing.
I can see there can be three aggregates: Client, Order and Driver. I
want to split them into separate microservices. Do you think it's a
good idea or I should start with one microservice?
They all belong to the same bounded context. Bounded context translates nicely to microservices (see Eric Evans video: https://www.infoq.com/news/2015/06/dddx-microservices-boundaries). But don't start by designing a micro service, you are doing it in the wrong order. Design first your bounded context then if it makes sense create a micro service around the hexagonal architecture.
After the order is created, I need to assign it to client. As during
one request only one aggregate can be updated/created I wonder how to
do it correctly.
This is the perfect example of why you need to do it all in the same process.
But in the case you want to go multiple micro services, think of eventual consistency (https://en.wikipedia.org/wiki/Eventual_consistency) and create a message driven architecture between your services. Might be too much work in my opinion but for learning purpose can be a good idea.

Should I be moving to a microservices based architecture?

I am working on a monolith system. All of it's code is in one repository (Web API and background workers). System is written in Nodejs and MongoDB (Mongoose) is used as a data store. My goal is to set a new path how project should evolve. At first I was wondering if I could move towards microservices based architecture.
Monolith architecture creates some problems:
If my background workers needs to scale. I have to deploy all the project to the server despite only using a small fraction of it.
All system must be redeployed when code changes. What if payment processor calls webhook while system is being redeployed?
Using microsevices advantages are quite obvious:
Smaller code base for individual microservice. Easier to reason about it.
Ability to select programming tools best for particular use case.
Easier to scale.
Looking at the current code I noticed that Mongoose ODM (Object Document Mapper) models are used across all the project to create, query and update models in database. As a principle of a good programming all such interactions with database should be abstracted. Business logic should not leak into other system layers. I could do that by introducing REPOSITORY pattern (Domain Driven Design). While code is still being shared across web api and it's background workers it is not a hard task to do.
If i decide to extract repositories into standalone microservices than all bunch of problems arise:
Some sort of query language must be introduced to accommodate complex search queries.
Interface must provide a way to iterate over search results (cursor based navigation) without returning all database documents over network.
Since project is in it's early stage and I am the only developer, going to microservices based architecture seems like an overkill. Maybe there are other approaches I should consider?
Extracting business logic and interaction with database into separate repository and sharing among services to avoid complex communication protocols between services?
Based on my experience with working in Microservices for last few years, it seems like an overkill in current scenario but pays off in long-term.
Based on the information stated above, my thoughts are:
Code Structure - Microservices Architecture (MSA) applying in above context means not separating DAO, Business Logic etc. rather is more on the designing system as per business functions. For example, if it is an eCommerce application, then you can shipping, cart, search as separate services, which can further be divided into smaller services. Read it more about domain-driven design here.
Deployment Unit - Keeping microservices apps as an independent deployment unit is a key principle. Hence, keep a vertical slice of the application and package them as Docker Image with Application Code, App Server (if any), Database and OS (Linux etc.)
Communication - With MSA, communication between services become a key and hence general practice is to remain with the message-oriented approach for communication (read about the reactive system and reactive programming for more insight).
PaaS Solution - There are multiple PaaS solutions available, which you can apply so that you don't need to worry about all the other aspects like container management, container orchestration, auto-scaling, configuration management, log management and monitoring etc. See following PaaS solutions:
https://www.nanoscale.io/ by TIBCO
https://fabric8.io/ - by RedHat
https://openshift.io - by RedHat
Cloud Vendor Platforms - AWS, Azure & Google Cloud all of them have specific support for Microservices App from the deployment perspective, which we can use as an alternative solution if you don't want to deploy PaaS solution in your organization.
Hope these pointers will have in understanding the overall landscape so that you can structure your architecture for future need.
I am working on a monolith system... My goal is to set a new path how project should evolve. At first I was wondering if I could move towards microservices based architecture.
In what ways do you need to evolve the project? Will it be mostly bugfixes, adding features, improving performance and/or scalability? Do you anticipate other developers collaborating in the future? Are you currently having maintenance issues? The answers to these questions (and many more) should be considered in guiding your choices.
You seem to be doing your homework around the pros and cons of a microservice architecture, so if you haven't asked yourself why you're even doing this in the first place, now would be good time to do so.
Maybe there are other approaches I should consider?
There's always the good old don't-break-what's-going ;)

Decision path for Azure Service Fabric Programming Models

Background
We are looking at porting a 'monolithic' 3 tier Web app to a microservices architecture. The web app displays listings to a consumer (think Craiglist).
The backend consists of a REST API that calls into a SQL DB and returns JSON for a SPA app to build a UI (there's also a mobile app). Data is written to the SQL DB via background services (ftp + worker roles). There's also some pages that allow writes by the user.
Information required:
I'm trying to figure out how (if at all), Azure Service Fabric would be a good fit for a microservices architecture in my scenario. I know the pros/cons of microservices vs monolith, but i'm trying to figure out the application of various microservice programming models to our current architecture.
Questions
Is Azure Service Fabric a good fit for this? If not, other recommendations? Currently i'm leaning towards a bunch of OWIN-based .NET web sites, split up by area/service, each hosted on their own machine and tied together by an API gateway.
Which Service Fabric programming model would i go for? Stateless services with their own backing DB? I can't see how Stateful or Actor model would help here.
If i went with Stateful services/Actor, how would i go about updating data as part of a maintenance/ad-hoc admin request? Traditionally we would simply login to the DB and update the data, and the API would return the new data - but if it's persisted in-memory/across nodes in a cluster, how would we update it? Would i have to expose this all via methods on the service? Similarly, how would I import my existing SQL data into a stateful service?
For Stateful services/actor model, how can I 'see' the data visually, with an object Explorer/UI. Our data is our Gold, and I'm concerned of the lack of control/visibility of it in the reliable services models
Basically, is there some documentation on the decision path towards which programming model to go for? I could model a "listing" as an Actor, and have millions of those - sure, but i could also have a Stateful service that stores the listing locally, and i could also have a Stateless service that fetches it from the DB. How does one decide as to which is the best approach, for a given use case?
Thanks.
What is it about your current setup that isn't meeting your requirements? What do you hope to gain from a more complex architecture?
Microservices aren't a magic bullet. You mainly get four benefits:
You can scale and distribute pieces of your overall system independently. Service Fabric has very sophisticated tools and advanced capabilities for this.
You can deploy and upgrade pieces of your overall system independently. Service Fabric again has advanced capabilities for this.
You can have a polyglot system - each service can be written in a different language/platform.
You can use conflicting dependencies - each service can have its own set of dependencies, like different framework versions.
All of this comes at a cost and introduces complexity and new ways your system can fail. For example: your fast, compile-time checked in-proc method calls now become slow (by comparison to an in-proc function call) failure-prone network calls. And these are not specific to Service Fabric, btw, this is just what happens you go from in-proc method calls to cross-machine I/O - doesn't matter what platform you use. The decision path here is a pro/con list specific to your application and your requirements.
To answer your Service Fabric questions specifically:
Which programming model do you go for? Start with stateless services with ASP.NET Core. It's going to be the simplest translation of your current architecture that doesn't require mucking around with your data layer.
Stateful has a lot of great uses, but it's not necessarily a replacement for your RDBMS. A good place to start is hot data that can be stored in simple key-value pairs, is accessed frequently and needs to be low-latency (you get local reads!), and doesn't need to be datamined. Some examples include user session state, cache data, a "snapshot" of the most recent items in a data stream (like the most recent stock quote in a stream of stock quotes).
Currently the only way to see or query your data is programmatically directly against the Reliable Collection APIs. There is no viewer or "management studio" tool. You have to write (and secure) an API in each service that can display and query data.
Finally, the actor model is a very niche model. It serves specific purposes but if you just treat it as a data store it will not work for you. Like in your example, a listing per actor probably wouldn't work because you can't query across that list, or even have multiple users reading the same listing simultaneously.

High Scalability in Domain-Driven Design

I'm using DDD for a service-oriented application intended to transmit a high volume of messages between a high volume of web clients (i.e., browsers).
Because in the context of required functionality, the need for transmission outweighs the need for storage, I love the idea of relying on RAM primarily and minimizing use of the database.
However I'm unclear on how to architect this from a scalability point of view. A web farm creates high availability of service endpoints and domain logic processing. But no matter how many servers I have, it seems they must all share a common repository so that their data is consistent.
How do I build this repository so that it's as scalable as possible? How can it be splashed across an array of physical machines in a manner such that all machines are consistent and each couldn't care less if another goes down?
Also since touching the database will be required occasionally (e.g., when a client goes missing and messages intended for it must be stored until it returns), how should I organize my memory-based code and data access layer? Are they both considered "the repository"?
There are several ways to solve this issue. No single answer can really cover it all...
One method to ensure your scalability is to simply scale the hardware. Write your web services to be stateless so that you can run a web farm (all running the same identical services, pointing to the same DB) and turn your DB into a cluster. Clustered databases run over multiple servers and work on the same storage. However, this scenario can get complicated and expensive quite quickly.
Some interesting links:
http://scale-out-blog.blogspot.com/2009/09/future-of-database-clustering.html
http://en.wikipedia.org/wiki/Server_farm
Another method is to look at architecture. CQRS is a common architectural model that ensures scalability. For instance, this architecture model -- its name stands for Command/Query Responsibility Segregation -- builds different databases for reading and writing. This seems contradictory, but if you study it, it becomes natural and you wonder why you've never thought of it before. Simply put, most apps do a lot more reading than writing, and writing tends to be a lot more complicated than reading (requiring business rule validation etc.), so why not separate the two? You can use your expensive transactional database for writing and then your cheap, maybe Non-SQL based or open source, database over multiple reading servers. Your read model is then optimized for the screens of your application(s), whereas the write model is optimized solely for writing and is in fact a DDD-based set of repositories.
There's just not enough room here to cover this option in detail, but CQRS is a good way of achieving scalability and even ease of development, once you have a CQRS framework in place. There are many other advantages to CQRS, such as ease of auditing (if you combine it with the "event sourcing" technique, which is common in CQRS-based environments).
Some interesting links:
http://cqrsinfo.com
http://abdullin.com/cqrs
http://blog.fossmo.net/post/Command-and-Query-Responsibility-Segregation-(CQRS).aspx
Are you ready for some reading? There are a lot of options, but I believe you should start by learning about the advantages of modern distributed NoSQL dbs, and enjoy learning from the experience learned in facebook, linkedin and other friends. Start here:
http://highscalability.com/
http://nosql-database.org/

Resources