Microservices: how to notify the backend when a task completes - node.js

For example, say I have a main application (backend) and some microservice, e.g. for image cropping.
A user uploads an image, making a request to the backend; the backend, using RabbitMQ, posts a new task to the queue; then the image-cropping service picks up the task and completes it, and I need to somehow notify the backend.
What are the options for this? Do I need another microservice for such notifications?

So... there are really many ways to do that.
On a high level, what you want to achieve is to produce an event that one or more services can react to. Depending on what you have available, you can produce that event in a number of different ways.
If you want to be completely platform-independent, you can use Apache Kafka. It's a popular service built for exactly what we need: publishing events and processing them at massive scale. Kafka can be clustered and partitioned, and can have multiple parallel consumers of the same type (like multiple instances of your main backend service) or of different types (three different microservices that all happen to be interested in a specific event). This bad boy just has it all and is famous for it. You can set up a cluster yourself or use one that comes out of the box on some cloud platforms (AWS, for instance), but that might be more expensive and harder to operate than some cloud-specific, fully managed alternatives.
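To make that concrete, here's a minimal sketch using the kafkajs client (just one of several Node clients); the broker address, topic name and group id are assumptions for illustration:

```js
// Minimal sketch with the kafkajs client (npm i kafkajs). Broker address,
// topic name and group id are illustrative assumptions.
const { Kafka } = require('kafkajs');

const kafka = new Kafka({ brokers: ['localhost:9092'] });
const producer = kafka.producer();

// Image-cropping service: publish a "task finished" event.
async function publishCropped(taskId, resultUrl) {
  await producer.connect();
  await producer.send({
    topic: 'image.cropped',
    // Keying by task id keeps events for one task in one partition, in order.
    messages: [{ key: taskId, value: JSON.stringify({ taskId, resultUrl }) }],
  });
}

// Main backend: all instances share one consumer group, so each event is
// handled by exactly one backend instance; other services can use their
// own group ids to receive the same events independently.
async function consumeCropped() {
  const consumer = kafka.consumer({ groupId: 'main-backend' });
  await consumer.connect();
  await consumer.subscribe({ topic: 'image.cropped' });
  await consumer.run({
    eachMessage: async ({ message }) => {
      const event = JSON.parse(message.value.toString());
      console.log('crop finished for task', event.taskId);
    },
  });
}
```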
If you're running your stuff on Google Cloud, you can make it easier and cheaper by using the Pub/Sub service. Pub/Sub is a fully managed service that scales out of the box (welcome to the cloud! you don't need to scale or cluster anything yourself!).
If you're running on AWS, you can use SNS, or a more recent alternative, EventBridge (kinda like SNS, but boy, what can it not do?). Yeah... I would recommend EventBridge. It can simply do more: target filtering rules, payload transformations, automatically triggering other things...
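A hedged sketch of emitting such an event with the AWS SDK v3 EventBridge client; the bus name, source and detail-type below are made up for illustration:

```js
// Sketch with AWS SDK v3 (npm i @aws-sdk/client-eventbridge); the bus name,
// source and detail-type are assumed examples, not real resources.
const {
  EventBridgeClient,
  PutEventsCommand,
} = require('@aws-sdk/client-eventbridge');

const client = new EventBridgeClient({ region: 'us-east-1' });

// The image-cropping service emits one event; EventBridge rules then decide
// which targets (your backend, other services, Lambdas...) receive it.
async function emitCropFinished(taskId, resultUrl) {
  await client.send(new PutEventsCommand({
    Entries: [{
      EventBusName: 'app-events',  // assumed custom event bus
      Source: 'image-cropper',     // assumed source name
      DetailType: 'image.cropped',
      Detail: JSON.stringify({ taskId, resultUrl }),
    }],
  }));
}
```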
Azure... ehm... Event Hubs... but I haven't worked with that one yet, so I can't vouch for it; I'm not much of an Azure person.
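And for completeness: since the question already uses RabbitMQ, you can get the same one-event-many-subscribers behavior there with a fanout exchange. A minimal sketch with amqplib, where the exchange and queue names are assumptions:

```js
// Minimal amqplib sketch (npm i amqplib); exchange/queue names are assumptions.
const amqp = require('amqplib');

// Cropping service: publish a completion event to a fanout exchange.
async function notifyDone(taskId, resultUrl) {
  const conn = await amqp.connect('amqp://localhost');
  const ch = await conn.createChannel();
  await ch.assertExchange('task.completed', 'fanout', { durable: true });
  ch.publish('task.completed', '', Buffer.from(JSON.stringify({ taskId, resultUrl })));
  await ch.close();
  await conn.close();
}

// Backend: bind its own queue to the exchange and react to events. Other
// services can bind their own queues to get the same events.
async function listenForDone() {
  const conn = await amqp.connect('amqp://localhost');
  const ch = await conn.createChannel();
  await ch.assertExchange('task.completed', 'fanout', { durable: true });
  const { queue } = await ch.assertQueue('backend.task-completed', { durable: true });
  await ch.bindQueue(queue, 'task.completed', '');
  ch.consume(queue, (msg) => {
    console.log('task done:', JSON.parse(msg.content.toString()));
    ch.ack(msg);
  });
}
```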

Related

Should I use a webhook or an AWS queue (SQS)?

I've been implementing an SQS service (AWS) for my project. My purpose for this implementation: I have two projects (microservices) and I want to sync data from one project to the other. So I intend to use SQS, but I'm also thinking about a webhook to solve my case. I know some of the basic pros and cons of each. So, my question is: should I use a webhook or SQS for my case?
Thanks for any help!
First of all, if you wish to sync two databases you probably want something that doesn't depend on your service code. Read up on change data capture: log scanners are a safe way to do it, and Debezium is a strong tool for it.
Second, if you wish to go with your own implementation, I would suggest the queueing approach. Its biggest advantage shows when the second service is down: with webhooks the information would be lost, while queues (SQS or any other) keep the data until the service is up again.
SQS is your best bet here, for a couple of reasons (a minimal sketch follows the list):
- Reliability in case something is down.
- Ability to repopulate other microservices. For example, if you decide to create another microservice and need to populate its data from the beginning, you can read everything from service 1 and put it in the queue for the new microservice.
- Scalability: queues make your architecture horizontally scalable. Just add machines that read from the queue in parallel.
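A minimal sketch of the send/receive/delete loop with the AWS SDK v3 SQS client; the queue URL and the applyChange() helper are placeholders:

```js
// Sketch with AWS SDK v3 (npm i @aws-sdk/client-sqs); the queue URL and
// applyChange() are hypothetical placeholders.
const {
  SQSClient,
  SendMessageCommand,
  ReceiveMessageCommand,
  DeleteMessageCommand,
} = require('@aws-sdk/client-sqs');

const sqs = new SQSClient({ region: 'us-east-1' });
const QueueUrl = 'https://sqs.us-east-1.amazonaws.com/123456789012/sync-queue';

// Service 1: push a data change onto the queue.
async function publishChange(change) {
  await sqs.send(new SendMessageCommand({
    QueueUrl,
    MessageBody: JSON.stringify(change),
  }));
}

// Service 2: poll, apply, and delete only after success, so nothing is
// lost while this service is down.
async function poll() {
  const { Messages = [] } = await sqs.send(new ReceiveMessageCommand({
    QueueUrl,
    MaxNumberOfMessages: 10,
    WaitTimeSeconds: 20, // long polling
  }));
  for (const msg of Messages) {
    applyChange(JSON.parse(msg.Body)); // hypothetical sync logic
    await sqs.send(new DeleteMessageCommand({
      QueueUrl,
      ReceiptHandle: msg.ReceiptHandle,
    }));
  }
}
```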

When to use RabbitMQ in a node REST API?

I have developed a node SDK which exposes certain REST APIs. These APIs interact with a blockchain framework for read and write operations.
There can be situations when many requests hit the node SDK at once.
So, for load balancing, I have used NGINX with one more replica of the SDK on another instance. This all works well.
It has been suggested to use RabbitMQ for load balancing as well. But my APIs only do a few straightforward read and write operations, with no heavy processing.
I have read that RabbitMQ should be used for purposes like:
- Integrating multiple microservices
- Executing heavy tasks such as image processing, image uploading, etc.
So how and when should I use RabbitMQ?
I think your design is OK. Simply put, your system had to manage more load, so you added more replicas of your services, with a load balancer in front that distributes the incoming load between the replicas. If your "sdk" is purely stateless (it doesn't remember client data collected from previous requests, but delegates all state to a DB/blockchain), you've done your job. A message-queuing technology can help in other scenarios:
- when your application does things in a purely asynchronous fashion
- when you have to manage big spikes of load
- when some of your architecture components react to events (e.g. receiving an alarm from a device, or sending an email when you become the one-millionth click)
- when you're into event sourcing
- when stateful services somehow consume data from the same batch of requests (e.g. all data from the user with id 1sw023)
- and various other possibilities
Adopting MQs has a big impact and needs some effort to integrate and manage. Don't do it if you are not sure you'll fully leverage the benefits.
RabbitMQ is a message queue. It's useful when your application receives more requests than it can handle simultaneously.
The way it works is that the queue stores incoming messages until they are processed by worker nodes (for example, your SDK). A worker node pulls a message from the queue, does the (usually heavy) work, and when it's done pulls the next message, and so on.
In your case, you might need it if you see that your blockchain is rejecting a lot of messages (for example because there were too many requests at once and the blockchain couldn't reach consensus quickly enough). A minimal worker sketch:
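```js
// Minimal RabbitMQ work-queue sketch with amqplib (npm i amqplib);
// the queue name and the submitToBlockchain() helper are hypothetical.
const amqp = require('amqplib');

async function startWorker() {
  const conn = await amqp.connect('amqp://localhost');
  const ch = await conn.createChannel();
  await ch.assertQueue('blockchain-writes', { durable: true });
  ch.prefetch(1); // take one message at a time; un-acked work gets redelivered
  ch.consume('blockchain-writes', async (msg) => {
    try {
      await submitToBlockchain(JSON.parse(msg.content.toString()));
      ch.ack(msg);               // done: remove from the queue
    } catch (err) {
      ch.nack(msg, false, true); // failed (e.g. consensus too slow): requeue
    }
  });
}
```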

Many ordered queues - how to auto-rebalance streams between app instances?

Problem description
I want to deploy a distributed, ordered-queues solution for my project, but I have questions/problems:
Which tool/solution should I use? Which would be the easiest to implement/learn and cost me the least in infrastructure? RabbitMQ, Kafka, Redis Streams?
How do I implement auto-rebalancing of topics/streams between consumers in failure situations, or when a new topic/stream is added to the system?
In other words, I want to realize something like this:
(diagram: distributed queues)
...but if one of my application instances fails, the other instances should take over all the traffic it was handling, with proper (equal) load distribution.
Note that my code is written in node.js v10 (TypeScript) and my infrastructure is based on Azure, so besides self-hosted solutions (like RabbitMQ), Azure-based solutions (like Azure Service Bus) are also possible; but the less vendor lock-in, the better the solution is for me.
My current architecture
Now I provide a more detailed background of my system:
I have 100,000 vehicle tracker devices (different ones, from many manufacturers, with many protocols), and each of them communicates with one of my custom apps called decoder. This small microservice decodes and unifies the payload from the tracker and sends it to a distributed queue. Each tracker sends a message every 10-30 seconds.
Note that I must keep the order of messages from a single device; this is very important!
In the next step, I have a processing app microservice which I want to scale (forking/clustering) depending on the number of tracker devices. Each fork of this app should subscribe to some of the topics/consumer groups to process messages from devices while keeping order. Processing a single message takes about 1-3 seconds.
Note that at any moment I can add or remove tracker devices; this information should auto-propagate to the forks of the processing app, and those instances should be able to auto-rebalance the traffic from the queue.
The question is how to do that with as few lines of (node.js) code as possible, while keeping the solution easy, clean and cheap? :)
As you can see in the picture above, if fork no. 3 fails, the system must decide which of the working forks should get the "blue" messages. Also, if fork no. 3 comes back, rebalancing is needed again.
My own research
I read about Apache Kafka with consumer groups, but Kafka is difficult for me to learn and implement.
I read about RabbitMQ with consumer groups / many topics, but I don't know how to write the auto-rebalancing feature, nor how exactly I should use RabbitMQ (which plugins? which settings/configurations? there are so many options...).
I read about Azure Service Bus with message sessions, but it means vendor lock-in (Azure cloud), it costs a lot, and like the other solutions it doesn't provide full auto-rebalancing out of the box.
I read about Redis Streams (with consumer groups), but it's a new feature (node.js libraries are lacking) and it also doesn't provide auto-rebalancing.
1. Message broker
For the first question, you should look for a mature M2M protocol broker which gives you the freedom to design your own intelligent data-switching algorithms.
2. Load balancer
For the second question, you must employ a well-performing load balancer to handle such a huge number (100,000) of connected cars. My suggestion is to use Azure API Gateway or an Nginx load balancer.
Now let's look at some connected-car solutions and analyze how AWS IoT or Azure IoT does the job nicely.
OpenSource IoT Solution
Nginx or an API gateway is used for load-balancing purposes, while the event processing is done on Kafka. Using Kafka you can implement your own rule engine for intelligent data switching. Similarly, any message broker acting as an IoT bridge would do. If I were you, I would use VerneMQ to get MQTT v5 features and data routing; in this case a queue is not required.
If you want to use an Azure queue instead, you have to concentrate on managing the queue forking and preemption. To control the queue seamlessly, you have to write an Azure Queue Trigger serverless Function; thus your goal of avoiding vendor lock-in would be impossible to achieve.
In a word: VerneMQ, an MQTT v5 implementation and Nginx would be great to implement, but as all of these are open-source products you must be strong in implementation and troubleshooting, otherwise your business operation could end up without support. A shared-subscription sketch in Node follows.
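For illustration, here is a hedged sketch of an MQTT v5 shared subscription with the mqtt package against a broker like VerneMQ; with $share groups the broker spreads messages across the subscribed forks and re-routes them when one fork dies (topic names are assumptions):

```js
// Hedged sketch using the mqtt package (npm i mqtt) against an MQTT v5
// broker such as VerneMQ; topic names are assumptions.
const mqtt = require('mqtt');

const client = mqtt.connect('mqtt://localhost:1883', { protocolVersion: 5 });

client.on('connect', () => {
  // Every processing fork subscribes with the same share group name, so the
  // broker load-balances messages between them and rebalances on failure.
  client.subscribe('$share/processors/vehicles/+/telemetry', { qos: 1 });
});

client.on('message', (topic, payload) => {
  console.log(topic, payload.toString());
});
```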
It's better to use professional IoT cloud services for a solution with thousands of connected cars. This pays off, as the SLA of such services is of a very high standard, and little effort is needed for system operation management.
Azure IoT Solution
If you go with Azure, you would use IoT Hub, where you don't have to worry about load balancing. Using the Azure device SDK you can connect all the cars (via a mobile LTE SIM, an OBD plugin, etc.) to the cloud. Then Azure Functions can handle the event processing, and so on.
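A hedged sketch of a device sending telemetry with the azure-iot-device SDK; the connection string is a placeholder:

```js
// Sketch with the azure-iot-device SDK (npm i azure-iot-device
// azure-iot-device-mqtt); the connection string is a placeholder.
const { Client, Message } = require('azure-iot-device');
const { Mqtt } = require('azure-iot-device-mqtt');

const client = Client.fromConnectionString(process.env.IOTHUB_DEVICE_CONN, Mqtt);

// Each car (LTE SIM, OBD plugin, ...) sends telemetry to IoT Hub; Azure
// Functions or other consumers then process the events.
async function sendTelemetry(data) {
  await client.open();
  await client.sendEvent(new Message(JSON.stringify(data)));
}
```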
AWS IoT Solution
AWS IoT likewise has SDKs for devices. But in this architecture we completed the connected-car project a little differently: for the sake of synchronizing the thing shadow with the actual device status, we used the AWS Greengrass Core solution on the edge side. Along with serverless IoT event processing, that settled the whole connected-car solution.
Similarly, Azure IoT Edge could be used to feed all the CAN-bus information to the device twin and synchronize the actual car with its twin.
Hope this gives you a clear idea of how to implement it, and of the cost-benefit trade-off between the vendor-locked and unlocked situations.
Thank you.

Splitting load of an API between multiple servers

I'm planning to build an API for one of my projects, and I'm looking for a good way to manage it and to manage server load.
Would I be better off just creating everything on one server, or should I create multiple servers?
Thoughts:
If I create one server and that server crashes, the whole system would go down. But if I create multiple servers to handle this, and one of them crashes, only that part would go down.
How I was thinking to accomplish this:
1) Create one API ENDPOINT.
2) When a user sends a REQUEST to that API ENDPOINT, the ENDPOINT sends another request to the correct server containing the specific task; when the task is done, it returns the data back to the user.
AKA:
User => ENDPOINT => ENDPOINT 1, ENDPOINT 2, ENDPOINT 3 => ENDPOINT => User
Is this how I should do it?
P.S. I don't know if this is the right terminology, but I'm trying to learn how to scale my ENDPOINTS/API/code.
About the load balancer: you should use a dedicated web server application for that, like nginx or Apache. These web servers already have load-balancing mechanisms implemented; you just need to configure them (a minimal config sketch below).
Also, I recommend packing your server into Docker images. That way you can use Docker Swarm or Kubernetes to deploy and scale your application up and down. It makes it easier to manage your services, check application state and deploy new versions.
You could use Docker with nginx, where each Docker container runs an instance of your application and nginx takes care of redirecting/distributing requests between the instances.
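A minimal nginx load-balancing config sketch, assuming three app containers; server names and ports are placeholders:

```nginx
# Minimal nginx load-balancing sketch; server names/ports are placeholders.
upstream api_backend {
    server app1:3000;  # container or host running instance 1
    server app2:3000;  # instance 2
    server app3:3000;  # instance 3
}

server {
    listen 80;
    location / {
        # nginx picks an instance (round-robin by default)
        proxy_pass http://api_backend;
    }
}
```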
What you are basically looking for is a comparison between a microservices-based architecture (or SOA) and a monolith.
In microservices, multiple services each perform a specific task, and they are combined to perform complex tasks. A monolith, on the other hand, is one big server which does everything and is also a single point of failure, as you pointed out.
Should you move to microservices?
It is widely agreed that a project should start with a monolithic architecture and move to microservices as the complexity grows. Martin Fowler's article explains this concept well.
This is because there are disadvantages and trade-offs associated with microservices -- inconsistency and latency, for instance.
TL;DR: stick to one server if you're starting out; break into services when it becomes complex.

Seneca.js role in microservices architecture across separate docker containers

I am in the planning phase of moving a C#/.NET monolithic application to node.js. I would like to implement a microservices architecture, event-driven, for this app, using Seneca.js and Docker to separate each microservice into its own container hosted on AWS Elastic Beanstalk. From what I have read, and per recommendations, this seems the way to go so far.
Here is where I am confused: reviewing the Seneca.js docs, I am not seeing how out-of-process communication occurs.
In particular, if I want to allow multiple clients to subscribe to the same event, should I use RabbitMQ with Seneca.js, since there are times when several microservices have to perform actions for a particular event? Going this route, how would I handle a scenario where one of the subscribers fails and needs to run again? It seems like the event would need to be replayed for that microservice only, and not for the others.
Also, using Seneca.js, how do I expose a REST API for each microservice so that clients can access its internal database and data with this approach?
Please let me know if I am incorrect in any aspect of this.

Resources