Heap memory full in Hazelcast Jet

I have a client and a server application, where the client receives data from the server over WebSockets.
The client application is built with Hazelcast Jet to process the data provided by the server. The client stores incoming data in a queue and polls it when it is ready to process further. The client is relatively slow compared to the server, so data keeps accumulating in the client's queue until the heap is full. I wonder if there are any Hazelcast Jet features I can use to resolve this issue.
Note: I don't want the producer to stop producing data or the client to stop receiving it. I want the client to continue receiving the data and to process it as fast as possible.
The solution I have in mind is to write the data received from the server to a file rather than storing it in the queue, and then read it back from there; this would resolve the memory issue. Is this the proper way to handle this use case, or are there better solutions?

Related

How to measure WebSocket backpressure or the network buffer from the client

I am using the ws Node.js package to create a simple WebSocket client connection to a server that is sending hundreds of messages per second. Even with a simple onMessage handler that just console.logs incoming messages, the client cannot keep up. My understanding is that this is referred to as backpressure: incoming messages may start piling up in a network buffer on the client side, or the server may throttle the connection or disconnect altogether.
How can I monitor backpressure, or the network buffer, from the client side? I've found several articles covering this issue from the perspective of the server, but I have no control over the server and need to know just how slow my client is.
So you don't have control over the server and want to know how slow your client is (it seems you have already read about backpressure). Then I can only think of using a load-testing tool like Artillery.
Check this blog post; it might help you set up a benchmarking scenario:
https://ma.ttias.be/benchmarking-websocket-server-performance-with-artillery/
Add timing metrics to your onMessage function to track how long it takes to process each message. You can also use RUM instrumentation from APM providers, such as New Relic or AppDynamics as paid options, or the free tier of Google Analytics timing.
If you can, include a unique identifier with each message sent, for correlation between the client and server.
Then, for a given window, you can correlate how long a message took to send from the server with how long it spent being processed by the client.
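A minimal sketch of that kind of instrumentation with the ws package (the URL, the message envelope, and the id and sentAt fields are all assumptions; the transit time is only meaningful if the client and server clocks are roughly in sync):

    // Sketch: per-message timing in a ws client.
    // Assumes each message carries an `id` and a server-side `sentAt` timestamp.
    const WebSocket = require('ws');

    const ws = new WebSocket('ws://example.com/feed'); // hypothetical URL

    ws.on('message', (raw) => {
      const received = Date.now();
      const msg = JSON.parse(raw);

      handleMessage(msg); // your actual processing

      const processed = Date.now();
      console.log(`msg ${msg.id}: transit ${received - msg.sentAt} ms, ` +
                  `processing ${processed - received} ms`);
    });

    function handleMessage(msg) {
      // ... application logic ...
    }

If processing time per message keeps growing while transit time stays flat, the bottleneck is your handler rather than the network.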
You can't get at the network socket buffer associated with your WebSocket traffic directly, since you're inside the browser sandbox. I checked the WebSocket APIs and there are no properties that expose receive-buffer information.
If you don't have control over the server, you are limited, but you could try some client-side tricks to simulate throttling.
This heavily assumes you don't mind skipping messages.
One approach would be to open the socket, start receiving events, and cap an in-memory queue/array at a maximum count of your own choosing. Once the queue is full, close the socket; process enough of the queue, then open the socket again, as in the sketch below.
Closing and reopening the socket is expensive, and events are lost while it is closed, but at least your client will not crash.
Once your client is no longer crashing, you can record timestamps and queue sizes to determine the threshold at which it starts to fall behind.
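A rough sketch of that pattern with the ws package (the URL, thresholds, and batch size are all made up, and messages arriving while the socket is closed are lost, as noted above):

    const WebSocket = require('ws');

    const MAX_QUEUE = 1000; // arbitrary cutoff: close the socket here
    const RESUME_AT = 100;  // arbitrary low-water mark: reconnect here
    const queue = [];
    let ws;

    function connect() {
      ws = new WebSocket('ws://example.com/feed'); // hypothetical URL
      ws.on('message', (msg) => {
        queue.push(msg);
        if (queue.length >= MAX_QUEUE) ws.close(); // back off before the heap fills
      });
    }

    function drain() {
      // Process a small batch per tick so the event loop stays responsive.
      for (let i = 0; i < 50 && queue.length > 0; i++) {
        handle(queue.shift());
      }
      if (queue.length <= RESUME_AT && ws.readyState === WebSocket.CLOSED) {
        connect(); // caught up: start receiving again
      }
      setImmediate(drain);
    }

    function handle(msg) { /* ... application logic ... */ }

    connect();
    drain();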

How to handle socket connections in a docker-swarm environment

I am building a web application using Node.js as the server and Docker Swarm to handle replication and load balancing.
Right now I need to handle real-time data updates between clients and the replicated servers, so I thought of using Socket.IO to handle the connections. All requests pass through an NGINX server that redirects them to the manager node of the swarm, and it is the manager that handles the balancing.
Since the topology of the network can change rapidly based on load, I am hesitant to let NGINX handle the balancing and apply sticky sessions... (maybe I'm wrong)
As I understand this setup, if a client connects to my server, Docker's load balancer will send the request to one of my N replicated servers, and this server, and only this server, will know that the client connected.
So it's possible that if some traditional HTTP request updates my data on another replica, the information will never reach the client, because the replica that handled the update has no knowledge of the client's connection.
Is there a way of handling situations like this? I thought of putting a message queue between the servers to send the data to all of them, so that the one holding the connection can forward it, but is that the recommended way of doing it?
Thank you very much
I investigated a bit further after asking the question. I'll post what I found in case it helps somebody with a similar issue.
One option I found is to use a message queue or something similar to broadcast the messages to all the replicas; each replica then filters out the messages it can actually deliver, since it knows which TCP connections it holds.
But I think that would put excessive load on the replicas, because all of them receive all of the messages. A better solution would be a queue, or a service, that maps each connection id to the replica holding it, and forwards messages only to the replicas that are interested.
I think this can be done easily with topics, or by creating a queue per TCP connection with some id as the identifier, and then pushing to the corresponding queue, as sketched below.
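As a concrete illustration, here is a rough sketch of the per-connection channel idea using Redis pub/sub (the node-redis v4 API; the channel naming, socket.id, and socket.emit are assumptions based on a Socket.IO setup):

    const { createClient } = require('redis');

    // On each replica: subscribe to one channel per connection it owns.
    async function registerConnection(socket, subscriber) {
      await subscriber.subscribe(`conn:${socket.id}`, (message) => {
        // Only the replica holding this TCP connection receives and delivers it.
        socket.emit('update', JSON.parse(message));
      });
      socket.on('disconnect', () => subscriber.unsubscribe(`conn:${socket.id}`));
    }

    // From any replica (e.g. after an HTTP request changes shared data):
    async function notify(publisher, connectionId, payload) {
      await publisher.publish(`conn:${connectionId}`, JSON.stringify(payload));
    }

    // One shared subscriber and one shared publisher client per replica:
    // const subscriber = createClient(); await subscriber.connect();
    // const publisher = createClient(); await publisher.connect();

In practice the Socket.IO Redis adapter implements essentially this routing for you, so it is worth checking before rolling your own.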
If anyone sees any problem or wants to add something, it will be very much appreciated!

How to detect network failure in Node.js sockets

I am trying to write an internal transport system.
Data should be transferred from client to server using net sockets.
It works fine, except for the handling of network issues.
If I place a firewall between the client and the server, neither side sees any error, so data keeps filling the kernel buffer on the client side.
And if I restart the app at that moment, I lose all the data in the buffer.
Questions:
Is there any way to detect network issues?
Is there any way to get the data back out of the kernel buffers?
Node.js exposes the low-level socket API to you fairly directly. I'm assuming that you are using a TCP socket to send and receive data.
One way to ensure that there is an active connection between the client and server is to send heartbeat signals back and forth. If you fail to receive a heartbeat from the server while sending data, you can assume that the connection has failed.
As for the second part of your question: there is no easy way to get data back out of kernel buffers. If losing the data would be a problem, I would make sure to write it to disk first.
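A minimal sketch of an application-level heartbeat on a plain net socket (host, port, intervals, and the ping/pong strings are arbitrary, and a real protocol would frame messages instead of string-matching the stream):

    const net = require('net');

    const HEARTBEAT_INTERVAL = 5000;  // ms, arbitrary
    const HEARTBEAT_TIMEOUT = 15000;  // ms, arbitrary

    const socket = net.connect(9000, 'example.com'); // hypothetical host/port
    let lastPong = Date.now();

    socket.on('data', (chunk) => {
      if (chunk.toString().includes('pong')) lastPong = Date.now();
      // ... handle application data ...
    });

    const timer = setInterval(() => {
      if (Date.now() - lastPong > HEARTBEAT_TIMEOUT) {
        // No pong in time: treat the connection as dead even though
        // the OS hasn't reported an error yet (e.g. a silent firewall drop).
        clearInterval(timer);
        socket.destroy(new Error('heartbeat timeout'));
        return;
      }
      socket.write('ping');
    }, HEARTBEAT_INTERVAL);

Node also exposes TCP keep-alive via socket.setKeepAlive(true, delayMs), but application-level heartbeats typically detect failures faster, especially through firewalls that silently drop traffic.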

Factors that determine socket.io+node.js emit performance

I have a node.js + socket.io server for sending messages. Since it is not multithreaded and handles one request at a time, I wanted to know what factors can make the emits faster.
I created a simple test server which only sends strings across sockets.
If I keep sending messages rapidly between just two users (say 1000 per minute), the socket.io + node.js server gets extremely slow and messages start getting delayed by minutes. So what can I do to make this faster?
Also, does this affect only the node.js server handling the messages, or all node.js servers? If I create two servers for handling messages, will performance get better?
Use Redis as your state store (see https://www.npmjs.org/package/socket.io-redis) and scale out (deploy to multiple servers and use a WebSocket-aware load balancer). Yes, performance will get better.
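For reference, wiring up that adapter is nearly a one-liner (a sketch using the linked socket.io-redis package; the port and Redis host are assumptions):

    const io = require('socket.io')(3000);
    const redisAdapter = require('socket.io-redis');

    // Emits are relayed through Redis pub/sub, so events reach clients
    // connected to any of the scaled-out server processes.
    io.adapter(redisAdapter({ host: 'localhost', port: 6379 }));

    io.on('connection', (socket) => {
      socket.on('chat', (msg) => {
        io.emit('chat', msg); // broadcast across every node
      });
    });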

What is the most efficient approach to processing data read from a socket?

I would like to use libev for a streaming server I am writing.
This is how everything is supposed to work:
client opens a TCP socket connection to server
server receives connection
client sends a list of images they would like
server reads request
server loops through all of the images
server reads image from NAS
server processes image file meta data
server sends image data to client
I found sample code that lets me read from and write to the socket using libev I/O events (epoll under the hood). But I am not sure how to handle the read from the NAS and the processing.
This could take some time, and I don't want to block the server while it is happening.
Should this be done in another thread, with the thread sending the image data back to the client?
I was planning on using a thread pool. But perhaps libev can support a processing step without blocking?
Any ideas or help would be greatly appreciated!
You'll need a file I/O library (such as Boost.ASIO) that supports asynchronous reads. The underlying POSIX APIs are aio_read, aio_suspend, and lio_listio.
