I am a little confused about the roles played by Kafka and SignalR in real-time communication. Can somebody help me by providing insight into whether Kafka can be used as a like-for-like replacement for SignalR, or whether they are complementary?
Thanks and Regards,
Nagarajan B P
SignalR is a library that simplifies the process of adding real-time web functionality to applications using WebSockets.
Kafka is open-source software which provides a framework for storing, reading, and analysing streaming data.
You can use both if you want, using Kafka as the entry point for events and SignalR to notify web/mobile apps in real time.
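A rough sketch of that complementary setup in Node.js, using kafkajs for the Kafka side and the ws package standing in for SignalR's push role (SignalR itself is a .NET library; the broker address, topic name, and port below are placeholders, not anything from the question):

```typescript
import { Kafka } from "kafkajs";
import { WebSocketServer, WebSocket } from "ws";

// Browsers (or a SignalR hub in a .NET setup) connect here for live updates
const wss = new WebSocketServer({ port: 8080 });

async function main() {
  const kafka = new Kafka({ brokers: ["localhost:9092"] }); // placeholder broker
  const consumer = kafka.consumer({ groupId: "web-notifier" });
  await consumer.connect();
  await consumer.subscribe({ topics: ["app-events"] }); // placeholder topic

  await consumer.run({
    eachMessage: async ({ message }) => {
      // Fan each Kafka event out to every connected client
      for (const client of wss.clients) {
        if (client.readyState === WebSocket.OPEN) {
          client.send(message.value?.toString() ?? "");
        }
      }
    },
  });
}

main().catch(console.error);
```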
Kafka aims to provide streaming data between many different programs. SignalR aims to provide real-time communication between a client and a server. So if you have a web app, browser cross-origin restrictions mean that page generally talks only to its own server. But if you have many custom services of your own that don't all need to go through one point, Kafka would be more appropriate.
To preface, I'd like to apologize if this question is too general or has been asked before. I'm struggling with some concepts here, truly don't know what I don't know, and have no idea where to look. For context, I'm a full-stack dev whose experience has mainly been with writing web servers using Node.js. In such applications, implementing BullMQ, bull-board, and error handling is fairly straightforward, since you can use middlewares and bull-board ships with server-framework-specific adapters.
Currently I am working on a project written in Node.js and TypeScript. It listens to events emitted by the new OpenSea Stream package (a socket-based application) and accordingly adds jobs to a BullMQ queue. So on startup it connects to the database, connects to the OpenSea socket, and then acts on emitted events. In my mind I would classify this as a job/worker/process sort of application. I've had a couple of issues building out this project and understanding how it's made and its place in web application architecture.
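For concreteness, the producing half of such an app might look roughly like this (a sketch: the Redis connection, queue name, job options, and event shape are assumptions, and the OpenSea Stream client is stubbed out since its exact API varies by version):

```typescript
import { Queue } from "bullmq";

// Stub standing in for the OpenSea Stream client; the real SDK exposes
// subscription methods, but the shape here is illustrative only.
declare const streamClient: {
  on(event: string, cb: (payload: unknown) => void): void;
};

const queue = new Queue("opensea-events", {
  connection: { host: "localhost", port: 6379 }, // assumed local Redis
});

streamClient.on("itemListed", async (payload) => {
  // Every incoming socket event becomes a durable, retryable job
  await queue.add("item-listed", payload, {
    attempts: 3,
    backoff: { type: "exponential", delay: 1000 },
  });
});
```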
Consequently I have come up with a couple of questions:
What are these sort of applications called? I've had a hard time looking up answers for my questions since I don't know what to categorize this application as. When I try to Google 'Node.js Job Application', as expected it gives me links to job postings.
Am I using the wrong tools for the job? Is Node.js only supposed to be used to write web servers, and not for such job-based applications?
How does error handling work here? With web servers, if an error shows up in the server, the middleware catches the error and sends a response to the client. In such job applications, what happens if an error is thrown? Is anything done with the error, or is it just logged? Is the job rerun/cancelled, etc.? (See the sketch after this list.)
How do I implement bull-board here to graphically observe my queue? Since I'm not using a web framework in this application, how is the integration done? Is it even possible?
How different are socket-based architectures from REST API servers? Is it a completely different use case/style of implementation, or can they have the same server architecture and just be different ways of communicating data between processes? Are sockets more for microservices?
Are there any resources or books that detail how these applications are supposed to be built? Any guidance on best practices or architecture would be greatly appreciated.
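On the error-handling question specifically, here is a minimal BullMQ worker sketch (the queue name, Redis connection, and handleEvent helper are assumptions): a job whose processor throws is marked failed, retried according to the attempts/backoff options it was added with, and surfaced through the worker's "failed" event, where you can log or alert.

```typescript
import { Worker } from "bullmq";

// Hypothetical domain logic for a job
async function handleEvent(data: unknown): Promise<void> {
  // ... process the OpenSea event ...
}

const worker = new Worker(
  "opensea-events",
  async (job) => {
    // Throwing here marks the job as failed; BullMQ retries it
    // according to the attempts/backoff set when the job was added.
    await handleEvent(job.data);
  },
  { connection: { host: "localhost", port: 6379 } } // assumed local Redis
);

worker.on("failed", (job, err) => {
  // Central place to log, alert on, or dead-letter failed jobs
  console.error(`Job ${job?.id} failed: ${err.message}`);
});
```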
Problem description
I want to deploy a distributed, ordered-queue solution for my project, but I have some questions/problems:
Which tool/solution should I use? Which would be the easiest to implement/learn and cost me the least in infrastructure? RabbitMQ, Kafka, Redis Streams?
How do I implement auto-rebalancing of topics/streams across consumers in failure situations, or when a new topic/stream is added to the system?
In other words, I want to realize something like that:
distributed queues
...but if one of my application instances fails, the other instances should take over all of the traffic that is left, with proper (equal) distribution.
Note that my code is written in Node.js v10 (TypeScript) and my infrastructure is based on Azure, so besides self-hosted solutions (like RabbitMQ), Azure-based solutions (like Azure Service Bus) are also possible; but the less vendor lock-in, the better the solution is for me.
My current architecture
Now I provide a more detailed background of my system:
I have 100,000 vehicle tracker devices (different ones, from many manufacturers and using many protocols); each of them communicates with one of my custom apps, called a decoder. This small microservice decodes and unifies the payload from the tracker and sends it to a distributed queue. Each tracker sends a message every 10-30 seconds.
Note that I must keep the order of messages from a single device; this is very important!
In the next step, I have a processing app microservice which I want to scale (forking/clustering) depending on the number of tracker devices. Each fork of this app should subscribe to some of the topics/consumer groups to process messages from devices while keeping their order. Processing each message takes about 1-3 seconds.
Note that at any moment I can add or remove tracker devices, this information should auto-propagate to the forks of the processing app, and these instances should be able to automatically rebalance traffic from the queue.
The question is how to do that with as few lines of (Node.js) code as possible, while at the same time keeping the solution easy, clean, and cheap? :)
As you can see in the picture above, if fork no. 3 fails, the system must decide which of the working forks should get the "blue" messages. Also, if fork no. 3 comes back, rebalancing is needed again.
My own research
I read about Apache Kafka with consumer groups, but Kafka seems difficult for me to learn and implement (though see the sketch below).
I read about RabbitMQ with consumer groups / many topics, but I don't know how to write an auto-rebalancing feature, or even how I should use RabbitMQ (which plugins? which settings/configurations? there are so many options...).
I read about Azure Service Bus with message sessions, but it is vendor-locked (Azure cloud), it costs a lot, and, like the other solutions, it doesn't provide full auto-rebalancing out of the box.
I read about Redis Streams (with consumer groups), but it's a new feature (there's a lack of libraries for Node.js) and it also doesn't provide auto-rebalancing.
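For scale on the Kafka option, here is roughly what the core of it looks like in Node.js with kafkajs (the broker address and topic name are placeholders). Keying each message by device ID pins all of a device's messages to one partition, which preserves per-device order, and consumer groups give automatic rebalancing when a fork dies or comes back:

```typescript
import { Kafka } from "kafkajs";

const kafka = new Kafka({ brokers: ["localhost:9092"] }); // placeholder broker

// Decoder side: the same key always maps to the same partition,
// so per-device message order is preserved.
async function publish(deviceId: string, payload: object) {
  const producer = kafka.producer();
  await producer.connect();
  await producer.send({
    topic: "tracker-events",
    messages: [{ key: deviceId, value: JSON.stringify(payload) }],
  });
  await producer.disconnect();
}

// Processing side: every fork joins the same group; Kafka reassigns
// partitions automatically when a member joins, fails, or returns.
async function startFork() {
  const consumer = kafka.consumer({ groupId: "processing-app" });
  await consumer.connect();
  await consumer.subscribe({ topics: ["tracker-events"] });
  await consumer.run({
    eachMessage: async ({ message }) => {
      // the 1-3 s of per-message work goes here
      console.log(message.key?.toString(), message.value?.toString());
    },
  });
}
```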
1 Message Broker
For the first question, you should look for a mature M2M protocol broker that will give you freedom in designing your own intelligent data-switching algorithms.
2 Load Balancer
For the second question, you must employ a well-performing load balancer to handle such a huge number (100,000) of connected cars. My suggestion is to use Azure API Gateway or an Nginx load balancer.
Now let's look at some connected-car solutions and analyze how AWS IoT and Azure IoT do the job nicely.
OpenSource IoT Solution
Nginx or an API gateway is used for load balancing, while the event processing is done on Kafka. Using Kafka you can implement your own rule engine for intelligent data switching. Similarly, any message broker acting as an IoT bridge would do well. If I were you, I would use VerneMQ to implement MQTT v5 features and data routing. In this case a queue is not required.
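As a sketch of what that looks like from Node.js with the mqtt package (the broker URL, topic, and group name are made up; VerneMQ supports MQTT v5 shared subscriptions, which let the broker itself spread messages across a group of consumers):

```typescript
import mqtt from "mqtt";

// VerneMQ speaks standard MQTT; this URL is a placeholder
const client = mqtt.connect("mqtt://broker.example.com:1883", {
  protocolVersion: 5, // MQTT v5
  clientId: "processor-1",
});

client.on("connect", () => {
  // $share/<group>/<topic> is an MQTT v5 shared subscription:
  // the broker load-balances matching messages across the group.
  client.subscribe("$share/processors/vehicles/+/telemetry");
});

client.on("message", (topic, payload) => {
  console.log(topic, payload.toString());
});
```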
Again, if you want to use an Azure queue, you have to concentrate on managing queue forking and preempting. To control the queue seamlessly you have to write an Azure Queue Trigger serverless Function. Thus your goal of avoiding vendor lock-in would be impossible to achieve.
In a word, VerneMQ with an MQTT v5 implementation behind Nginx would be great, but as these are all open-source products you must be strong in implementation and troubleshooting; otherwise your business operation could be left without support.
It's better to use professional IoT cloud services for a solution with thousands of connected cars. This pays off, as the SLA of the service is of a very high standard and little effort is needed for system operation management.
Azure IoT Solution
If you are using the Azure solution, you would be using IoT Hub, where you don't have to worry about load balancing. Using the Azure device SDK you can connect all the cars (via mobile LTE SIM, OBD plugin, etc.) to the cloud. Then Azure Functions can handle the event processing, and so on.
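A minimal device-to-cloud sketch with the Node.js flavor of the Azure device SDK (the connection string and payload are placeholders):

```typescript
import { Client, Message } from "azure-iot-device";
import { Mqtt } from "azure-iot-device-mqtt";

// Placeholder connection string from the IoT Hub portal
const connStr =
  "HostName=<hub>.azure-devices.net;DeviceId=car-001;SharedAccessKey=<key>";
const client = Client.fromConnectionString(connStr, Mqtt);

async function main() {
  await client.open();
  // IoT Hub ingests this event and routes it to downstream processing
  // (e.g. an Azure Function) without you running a load balancer.
  const msg = new Message(JSON.stringify({ speed: 62, fuelLevel: 0.4 }));
  await client.sendEvent(msg);
  await client.close();
}

main().catch(console.error);
```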
AWS IoT Solution
Like the Azure IoT device SDK, AWS IoT also has SDKs for devices. But in this architecture we want to complete the connected-car project a little differently. For the sake of synchronization between the thing shadow and the actual device status, we used the AWS Greengrass Core solution on the edge side. Along with serverless IoT event processing, we settled the whole connected-car solution.
Similarly, Azure IoT Edge could be used to provide all CAN-bus information to the device twin and synchronize between the actual car and its twin.
I hope this gives you a clear idea of how to implement this, and lets you see the cost-benefit in the vendor-locked versus unlocked situations.
Thank you.
I have an idea for a social network website, and as I'm currently learning web development I thought it was a great idea to practice on. I have already worked out the business logic and the front-end design with React. Now it's time for the backend, and I'm struggling.
I want to create a React + Node.js event-driven app. It seems logical to use Kafka right away. I looked through various Kafka architecture examples and have several questions:
Is it possible to create an app that moves data through Kafka via API calls from Node.js to React and vice versa, and uses a relational database only for long-term storage?
Is it better to use Kafka to handle all events but communicate with some NoSQL database like Cassandra or HBase? Then it seems that Node.js would have to make API calls to them and send the data to React.
Am I completely missing the point and talking nonsense?
Any answer to this question is going to be quite opinion-based, so I'm just going to try to stick to the facts.
It is totally possible to create such an application. Moreover, Kafka is basically a distributed log, so you can use it as an event store and build your state from that.
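A sketch of what "building state from the log" can look like with kafkajs (the broker, topic, and event shape are all assumptions for illustration):

```typescript
import { Kafka } from "kafkajs";

const kafka = new Kafka({ brokers: ["localhost:9092"] }); // placeholder broker
const followerCounts = new Map<string, number>();

// Rebuild in-memory state by replaying the topic from the beginning
async function main() {
  const consumer = kafka.consumer({ groupId: "state-builder" });
  await consumer.connect();
  await consumer.subscribe({ topics: ["user-events"], fromBeginning: true });
  await consumer.run({
    eachMessage: async ({ message }) => {
      const event = JSON.parse(message.value?.toString() ?? "{}");
      if (event.type === "FOLLOWED") {
        const n = followerCounts.get(event.userId) ?? 0;
        followerCounts.set(event.userId, n + 1);
      }
    },
  });
}

main().catch(console.error);
```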
That mainly depends on your architecture, and there are too many gaps here to answer this with any certainty - what kind of data are you saving? What kind of questions will you need answered? What does your domain model look like? You could use Kafka as a store, or as a persistent messaging service.
I don't think you're off the mark, but perhaps you're going for the big guns when in reality you don't really need them. Kafka is great for a very large volume of events going through. If you're building something new, you don't have that volume. Perhaps start with something simpler that doesn't require so much operational complexity.
Perhaps a silly question, but I keep reading about SI's "lightweight messaging within Spring-based applications". I want to know how (if) SI uses messaging internally. When I run an SI (Boot) application (one that doesn't require AMQP ... aka 'messaging' support), I don't have to run a Rabbit server. But, from what I gather, SI uses messaging internally. How is this accomplished? I can't seem to find any reference explaining this and what infrastructure is required to make it possible. Thanks!
The messages are simply Java objects (o.s.messaging.Message) passed between components. No external broker is needed, unless you need persistence.
I suggest you read Mark Fisher's book (Spring Integration in Action) and/or the reference manual.
The messaging inside Spring Integration consists of in-memory Java objects passed from one service to another via channels/queues. It provides a mechanism to define the flow and order of processing, while also allowing each service step to work in isolation. The Spring Integration queue is ultimately an implementation of the java.util.Queue interface.
It is different from commercial messaging tools like IBM MQ or ActiveMQ in that it doesn't offer persistence. This means that if you kill the JVM or the app process is stopped, all the messages in flight on a Spring queue/channel are lost. A lot of times this is acceptable if the process is idempotent, i.e. when the application comes back up, I can restart the process from the beginning.
For a notification project, I would like to push event notifications out. These are things like a login, a change in profile, etc., to be displayed to the appropriate client. I would like to discuss some ideas on putting it together and get some advice on the best approach.
I noticed here that changes made to a CouchDB can be detected with a _changes stream and picked up by Node, which kicks off a process. I would like to implement something like this (I'm using SQL Server, but an entry point at this level may not be the best solution).
Instead of following the CouchDB example (detecting database-based events; I think this just complicates things, since we're interested in client events), I was thinking that when an event occurs, such as a user login, a message is sent to the Node server with some event details (a RESTful request?). This message is then processed and broadcast to all connected clients, and the appropriate client displays the notification.
Proposed ecosystem:
.Net 4.0
IIS
IISNode
Socket.IO
Node.JS
SQL Server 2008
This will be built on top of an existing project using the .NET framework (IIS, etc.). Many of the clients' browsers do not support WebSockets, so using Socket.IO is a good option (fallback support). However, from what I can see, Socket.IO still only supports long polling through IISNode (which isn't really a problem).
An option would be to expose the Socket.IO/Node endpoint to all clients, so that client-based notifications can be sent through JS to the Node server, which broadcasts the message (this follows the basic chat-server client/server examples).
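That pattern is only a few lines with Express and Socket.IO (a sketch using current APIs, which differ from the 2012-era IISNode setup; the endpoint path and event name are made up):

```typescript
import express from "express";
import { createServer } from "http";
import { Server } from "socket.io";

const app = express();
app.use(express.json());
const httpServer = createServer(app);
const io = new Server(httpServer);

// Hypothetical endpoint the .NET app POSTs event details to
app.post("/events", (req, res) => {
  // Broadcast to every connected client; each client decides
  // whether the notification applies to it.
  io.emit("notification", req.body);
  res.sendStatus(202);
});

httpServer.listen(3000);
```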
Alternatively, an IIS endpoint could be used, but it could only support long polling (through Socket.IO). This would offer some additional .NET back-end processing, but may be over-complicating the architecture.
Is there SQL Server-based event notification available for Node?
What would be the best approach?
If I didn't get the terminology/ecosystem configuration right, please clarify.
Thanks.
I would recommend you check out SignalR first before considering adding iisnode/node.js to the mix of technologies of your pre-existing ASP.NET application.
Regarding WebSockets, regardless of whether you use ASP.NET or node.js (socket.io), you can only use HTTP long polling for low-latency notifications, as WebSockets are not supported by HTTP.SYS/IIS until Windows 8. iisnode does not currently support WebSockets (even on Windows 8), but such support could be added later.
I did some research lately regarding MSSQL access from node.js. There are a few OSS projects out there; some of them use native, platform-specific extensions, while others attempt to implement the TDS protocol purely in JavaScript. I am not aware of any that would enable you to access the SQL Notifications functionality. However, the MSSQL team itself is investing in a first-class MSSQL driver for node.js, so this is something to keep an eye on going forward (https://github.com/tjanczuk/iisnode/issues/139).
If you plan to use SQL Notifications to support low-latency notifications, I would strongly recommend starting with performance benchmarks that simulate the desired level of traffic at the SQL Server level. SQL Notifications were meant primarily as a mechanism to help keep an in-memory cache consistent with the contents of the database, so they may or may not meet the requirements of your notification scenario. At the very minimum these measurements will help you start with a better design.
I would highly recommend using Pusher. That is what we use, and since it is a hosted solution, plugging it in and making it work is really easy. It doesn't cost much unless you are going to push a crazy number of messages through it on a massive scale.