I have a web site which uses a Long Poll to wait for the server to finish processing some data. However, a timeout might occur or the user might close the browser, yet the server keeps processing its data.
I want the server to stop processing data as soon as the Long Poll connection is broken. There's no client left to receive the data, so there's no point in letting this long process continue running. How can I do this?
The server is adding files to a ZIP archive, which takes some time since these are reasonably big files. Once it's done, it sends the final ZIP file and closes the connection. But if the client disconnects before the task is finished, the server should stop its work and discard everything.
You should consider using the SignalR framework. It offers convenient connection events such as OnConnected() and OnDisconnected(). Under the hood it works with:
WebSockets
Server Sent Events
Forever Frame
Long polling
It uses whatever is available with the given environment, starting with WebSockets.
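The question does not say which server platform is in use, so as a rough illustration of the stop-on-disconnect idea (in Node.js terms rather than SignalR), the sketch below aborts a long task when the response's connection closes before the response has ended. The names processUntilDisconnect, files and addFileToArchive are hypothetical placeholders for the real ZIP work.

```javascript
// Sketch only: `files` and `addFileToArchive` stand in for the real ZIP work.
async function processUntilDisconnect(res, files, addFileToArchive) {
  const controller = new AbortController();

  // 'close' fires when the underlying connection ends. If the response has
  // not been ended by then, the client went away early.
  const onClose = () => {
    if (!res.writableEnded) controller.abort();
  };
  res.on('close', onClose);

  try {
    for (const file of files) {
      // Check before each unit of work so we stop at the next file boundary.
      if (controller.signal.aborted) return { finished: false };
      await addFileToArchive(file, controller.signal);
    }
    return { finished: !controller.signal.aborted };
  } finally {
    res.off('close', onClose);
  }
}
```

A request handler would await this function and send the archive only when finished is true; otherwise it would discard the partial result.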
So, I have an Express server that accepts a request. The request triggers web scraping that takes 3-4 minutes to finish. I'm using Bull to queue the jobs and process them as and when they are ready. The challenge is to send the results from processed jobs back as the response. Is there any way I can achieve this? I'm running the app on Heroku, but Heroku has a request timeout of 30 seconds.
You don't have to wait until the back end has finished. Identify who is making the request, authenticate the user, and then call res.status(202).send({message: "text"});
Even after the response has been sent to the client, you can keep processing.
NOTE: Do not put a return keyword before res.status(...) if the processing code follows it in the same function, because return would exit the handler before that code runs.
The HyperText Transfer Protocol (HTTP) 202 Accepted response status code indicates that the request has been accepted for processing, but the processing has not been completed; in fact, processing may not have started yet. The request might or might not eventually be acted upon, as it might be disallowed when processing actually takes place.
202 is non-committal, meaning that there is no way for the HTTP to later send an asynchronous response indicating the outcome of processing the request. It is intended for cases where another process or server handles the request, or for batch processing.
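As a minimal sketch of "respond with 202, then keep working", the handler below uses the plain Node.js response methods writeHead and end instead of Express, purely to stay dependency-free. The names acceptAndProcess and runScrape are hypothetical; runScrape stands in for the slow job.

```javascript
// Sketch only: `runScrape` is a hypothetical function returning a promise.
function acceptAndProcess(req, res, runScrape) {
  // Reply immediately so the platform's request timeout is never reached.
  res.writeHead(202, { 'Content-Type': 'application/json' });
  res.end(JSON.stringify({ message: 'accepted' }));

  // The response is already sent; the work starts afterwards.
  const work = Promise.resolve()
    .then(() => runScrape())
    .catch((err) => {
      // There is no response left to report to, so log the failure instead.
      console.error('job failed after the response was sent:', err);
    });

  // Returned only so callers (or tests) can wait for the job to settle.
  return work;
}
```

Because the client has already received its response, the outcome of the job has to reach it some other way, such as a status endpoint or a socket message.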
Because of the timeout, you always need to send a response immediately. Since your process takes about 3-4 minutes, it is better to respond right away, stating that the request was successfully received and will be processed.
Now, when the task is completed, you can use socket.io or web sockets to notify the client from the server side. You can also pass a response.
The client can also check repeatedly whether the job has completed on the server. This is called polling, and it is required for older browsers that don't support web sockets. socket.io falls back to polling when a browser doesn't support web sockets.
Visit socket.io for more information and documentation.
The best approach to this problem is the socket.io library. It can send data to the client whenever you want. It triggers a function on the client side which receives the data. Socket.io supports different languages and is really easy to use.
Create a jobs table in a database or persistent storage such as Redis.
Save each job in the table upon request with a unique id.
Update the status to running when the job starts.
Send HTTP 202 - Accepted.
On the client, implement a polling script. On the server, implement a job status route/API that accepts a job id, queries the jobs table and responds with the status.
When the job finishes, update the jobs table with status completed. When the job errors, update the jobs table with status failed, and perhaps use a description column to store the cause of the error.
This solution makes your system horizontally scalable and distributed. It also prevents the consequences of unexpected connection drops. The polling interval depends on the average job completion time; I would recommend an interval of about 5 seconds.
This can be improved further by storing job completion progress in the jobs table so that the client can display a progress bar.
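The steps above can be sketched roughly as follows. An in-memory Map stands in for the jobs table (a real deployment would use a database or Redis so that several server instances can share it), and JobStore, pollJob and fetchStatus are hypothetical names introduced only for this illustration.

```javascript
const { randomUUID } = require('crypto');

// Stand-in for the jobs table; swap the Map for a database or Redis.
class JobStore {
  constructor() {
    this.jobs = new Map();
  }
  create() {
    const id = randomUUID();
    this.jobs.set(id, { id, status: 'queued', progress: 0, error: null });
    return id;
  }
  update(id, patch) {
    const job = this.jobs.get(id);
    if (!job) throw new Error(`unknown job ${id}`);
    Object.assign(job, patch);
  }
  start(id) { this.update(id, { status: 'running' }); }
  complete(id) { this.update(id, { status: 'completed', progress: 100 }); }
  fail(id, reason) { this.update(id, { status: 'failed', error: String(reason) }); }
  // What the job status route would return for a given id.
  get(id) { return this.jobs.get(id) ?? null; }
}

// Client-side polling loop; `fetchStatus` would wrap the HTTP call
// to the status route. Resolves once the job is completed or failed.
async function pollJob(fetchStatus, id, intervalMs = 5000) {
  for (;;) {
    const job = await fetchStatus(id);
    if (!job) throw new Error(`unknown job ${id}`);
    if (job.status === 'completed' || job.status === 'failed') return job;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

The progress field is where a worker could record partial completion for the progress bar mentioned above.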
A request timeout occurs when a connection sits idle. Different servers implement this differently, so the timeout length varies.
The solution to this timeout problem is to keep the connection between client and server open.
For such scenarios use WebSockets, which ensure that the connection stays open after the initial request and response handshake between client and server.
There are many libraries for implementing real-time connections, e.g. PubNub and socket.io. This is the same technology used for live streaming.
Node.js can handle many concurrent connections and is lightweight, so it won't use many resources.
I am using the ws Node.js package to create a simple WebSocket client connection to a server that is sending hundreds of messages per second. Even with a simple onMessage handler that just console.logs incoming messages, the client cannot keep up. My understanding is that this is referred to as backpressure, and incoming messages may start piling up in a network buffer on the client side, or the server may throttle the connection or disconnect altogether.
How can I monitor backpressure, or the network buffer, from the client side? I've found several articles discussing this issue from the perspective of the server, but I have no control over the server and need to know just how slow my client is.
So you don't have control over the server and want to know how slow your client is (it seems you have already read about backpressure). In that case I can only think of using a stress tool like artillery.
Check this blog; it might help you set up a benchmarking scenario.
https://ma.ttias.be/benchmarking-websocket-server-performance-with-artillery/
Add timing metrics to your onMessage function to track how long it takes to process each message. You can also use RUM instrumentation from APM providers such as New Relic or AppDynamics (paid options), or the free tier of Google Analytics timing.
If you can, include a unique identifier for correlation between the client and server for each message sent.
Then you can correlate for a given window how long a message took to send from the server and how long it spent being processed by the client.
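A minimal sketch of the timing idea: wrap the message handler so every call records how long processing took. The name withTiming and the shape of the stats object are assumptions made for this example, and it only measures synchronous handlers.

```javascript
const { performance } = require('perf_hooks');

// Wraps a synchronous message handler and accumulates processing times.
function withTiming(handler) {
  const stats = { count: 0, totalMs: 0, maxMs: 0 };
  const wrapped = (message) => {
    const start = performance.now();
    try {
      return handler(message);
    } finally {
      const elapsed = performance.now() - start;
      stats.count += 1;
      stats.totalMs += elapsed;
      stats.maxMs = Math.max(stats.maxMs, elapsed);
    }
  };
  return { wrapped, stats };
}
```

Registering wrapped as the onMessage handler and logging stats.totalMs / stats.count periodically gives the average processing time per message, which can be compared with the rate at which messages arrive.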
You can't get directly to the network socket buffer associated with your websocket traffic since you're inside the browser sandbox. I checked the WebSocket APIs and there are no properties that expose receive buffer information.
If you don't have control over the server, you are limited. But you could try some client tricks to simulate throttling.
This heavily assumes you don't mind skipping messages.
One approach would be to enable the socket, start receiving events and set your own maximum count for an in-memory queue/array. Once the queue is full, turn off the socket. Process enough of the queue, then enable the socket again.
This has high cost to disable/enable the socket, as well as the loss of events, but at least your client will not crash.
Once your client is not crashing, you can put some additional counts on timestamp and the queue size to determine the threshold before the client starts crashing.
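The queue-and-switch-off approach could look roughly like this. BoundedInbox and its options are hypothetical names; pause and resume are callbacks you supply for whatever "turn the socket off" and "turn it on again" mean in your client.

```javascript
// Bounded in-memory queue that asks the caller to stop and restart the socket.
class BoundedInbox {
  constructor({ maxSize, resumeBelow, pause, resume }) {
    this.maxSize = maxSize;         // queue length at which to stop the socket
    this.resumeBelow = resumeBelow; // queue length at which to restart it
    this.pause = pause;
    this.resume = resume;
    this.queue = [];
    this.paused = false;
  }

  // Call from the message handler.
  push(message) {
    this.queue.push({ message, receivedAt: Date.now() });
    if (!this.paused && this.queue.length >= this.maxSize) {
      this.paused = true;
      this.pause();
    }
  }

  // Call from the processing loop; returns undefined when the queue is empty.
  take() {
    const item = this.queue.shift();
    if (this.paused && this.queue.length <= this.resumeBelow) {
      this.paused = false;
      this.resume();
    }
    return item;
  }
}
```

The receivedAt timestamp together with the queue length at each take() gives the counts mentioned above for finding the threshold at which the client starts falling behind.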
I am going to design a system where there is a two-way communication between clients and a web application. The web application can receive data from the client so it can persist it to a DB and so forth, while it can also send instructions to the client. For this reason, I am going to use Node.JS and Socket.IO.
I also need to use RabbitMQ: if the web application sends an instruction to a client and the client is down (hence the socket has dropped), I want the instruction to be queued so it can be sent when the client connects again and creates a new socket.
From the client to the web application it should be pretty straightforward, since the client uses the socket to send the data to the Node.JS app, which in turn sends it to the queue so it can ultimately be forwarded to the web application. From this direction, if the socket is down, there is no internet connection, and hence the data is not sent in the first place, or is cached on the client.
My concern lies with the other direction, and I would like an answer before I design it this way and actually implement it, so I can avoid hitting any brick walls. Let's say that the web application tries to send an instruction to the client. If the socket is available, the web app forwards the instruction to the queue, which in turn forwards it to the Node.JS app, which in turn uses the socket to forward it to the client. So far so good. If on the other hand, the internet connection from the client has dropped, and hence the socket is currently down, the web app will still send the instruction to the queue. My question is, when the queue forwards the instruction to Node.JS, and Node.JS figures out that the socket does not exist, and hence cannot send the instruction, will the queue receive a reply from Node.JS that it could not forward the data, and hence that it should remain in the queue? If that is the case, it would be perfect. When the client manages to connect to the internet, it will perform a handshake once again, the queue will once again try to send to Node.JS, only this time Node.JS manages to send the instruction to the client.
Is this the correct reasoning of how those components would interact together?
this won't work the way you want it to.
when the node process receives the message from rabbitmq and sees the socket is gone, you can easily nack the message back to the queue.
however, that message will be processed again immediately. it won't sit there doing nothing. the node process will just pick it up again. you'll end up with your node / rabbitmq thrashing as it just nacks a message over and over and over and over, waiting for the socket to come back online.
if you have dozens or hundreds of messages for a client that isn't connected, you'll have dozens or hundreds of messages thrashing round in circles like this. it will destroy the performance of both your node process and rabbitmq.
my recommendation:
when the node app receives the message from rabbitmq, and the socket is not available to the client, put the message in a database table and mark it as waiting for that client.
when the client re-connects, check the database for any pending messages and forward them all at that point.
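A rough sketch of that recommendation, with a Map standing in for the database table. OfflineMailbox and its method names are made up for this example, and socket is any object with a send method.

```javascript
// Holds messages for clients that are offline and flushes them on reconnect.
class OfflineMailbox {
  constructor() {
    this.sockets = new Map(); // clientId -> connected socket
    this.pending = new Map(); // clientId -> messages waiting for that client
  }

  // Called for each message consumed from the queue.
  deliver(clientId, message) {
    const socket = this.sockets.get(clientId);
    if (socket) {
      socket.send(message);
      return 'sent';
    }
    // Client is offline: store the message instead of requeueing it.
    const waiting = this.pending.get(clientId) ?? [];
    waiting.push(message);
    this.pending.set(clientId, waiting);
    return 'stored';
  }

  // Called when a client (re)connects; forwards everything that was waiting.
  connect(clientId, socket) {
    this.sockets.set(clientId, socket);
    const waiting = this.pending.get(clientId) ?? [];
    this.pending.delete(clientId);
    for (const message of waiting) socket.send(message);
    return waiting.length;
  }

  disconnect(clientId) {
    this.sockets.delete(clientId);
  }
}
```

In both the 'sent' and 'stored' cases the message can be acknowledged to the queue, so nothing is nacked and redelivered in a loop.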
I have a production app that uses socket.io (node.js back-end) to distribute messages to all the logged-in clients. Many of my users are experiencing disconnections from the socket.io server. The normal use case for a client is to keep the web app open the entire working day. For most of the working day the app sits idle but still open, until the socket.io connection is lost and the app kicks them out.
Is there any way I can make the connection more reliable so my users are not constantly losing their connection to the socket.io server?
It appears that all we can do here is give you some debugging advice so that you might learn more about what is causing the problem. So, here's a list of things to look into.
Make sure that socket.io is configured for automatic reconnect. In the latest versions of socket.io, auto-reconnect defaults to on, but you may need to verify that no piece of code is turning it off.
Make sure the client is not going to sleep such that all network connections become inactive and get disconnected.
In a working client (before it has disconnected), use the Chrome debugger, Network tab, webSockets sub-tab to verify that you can see regular ping messages going between client and server. You will have to open the debug window, get to the network tab and then refresh your web page with that debug window open to start to see the network activity. You should see a funky looking URL that has ?EIO=3&transport=websocket&sid=xxxxxxxxxxxx in it. Click on that. Then click on the "Frames" sub-tab. At that point, you can watch individual websocket packets being sent. You should see tiny packets with length 1 every once in a while (these are the ping and pong keep-alive packets). There's a sample screen shot below that shows what you're looking for. If you aren't seeing these keep-alive packets, then you need to resolve why they aren't there (likely some socket.io configuration or version issue).
Since you mentioned that you can reproduce the situation, one thing you want to know is how the socket is getting closed (client-end initiated or server-end initiated). One way to gather info on this is to install a network analyzer on your client so you can literally watch every packet that goes over the network to/from your client. There are many different analyzers and many are free. I personally have used Fiddler, but I regularly hear people talking about WireShark. What you want to see is exactly what happens on the network when the client loses its connection. Does the client decide to send a close socket packet? Does the client receive a close socket packet from someone? What happens on the network at the time the connection is lost?
[Screenshot: webSocket network view in Chrome Debugger]
The most likely cause is one end closing a WebSocket due to inactivity. This is commonly done by load balancers, but there may be other culprits. The fix for this is to simply send a message every so often (I use 30 seconds, but depending on the issue you may be able to go higher) to every client. This will prevent it from appearing to be inactive and thus getting closed.
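A minimal sketch of such a periodic message, assuming clients is a Set of objects with a send method. The function name startKeepAlive and the payload are illustrative, not part of socket.io.

```javascript
// Sends a small message to every client at a fixed interval so that
// intermediaries such as load balancers do not see the connection as idle.
function startKeepAlive(clients, intervalMs = 30000, payload = 'keep-alive') {
  const timer = setInterval(() => {
    for (const client of clients) {
      try {
        client.send(payload);
      } catch (err) {
        // A failed send usually means the connection is already gone.
        clients.delete(client);
      }
    }
  }, intervalMs);

  // Call the returned function to stop sending.
  return () => clearInterval(timer);
}
```

The 30-second default mirrors the interval mentioned above; it should stay shorter than the idle timeout of whatever sits between client and server.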
I have a NodeJS server set up that accepts TLS connections using the tls module: http://nodejs.org/api/tls.html
The clients are using the NodeJS TLS module for the connections. I'm also storing a list/hashmap of all connected clients and their IDs. If a client disconnects, I remove it from the list using the "error", "clientError" and "close" events.
This works in any normal case - however, when I "kill" the client (unplug power, unplug network cable) it seems like there is no event fired and the stream is open forever. Maybe I have overlooked something, but is there an event for something like this or how can I detect when the stream is not there any longer?
Sure, I could poll it at a certain interval, but that does not sound very good, since it will cause a lot of traffic (for almost no reason).
In the end, the stream is actually closed. If you try to call write, it will cause a "write after end" error. Sadly, it seems like there is no event fired when the stream itself closes.
So right now, I'm just trying to write something every few minutes to see if the stream is still alive.
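As an alternative that the post itself does not propose, the sketch below leans on two methods that TLS sockets inherit from net.Socket: setKeepAlive, which asks the operating system to probe an idle connection, and setTimeout, which emits 'timeout' after a period without activity. The function name trackClient and the default durations are assumptions for illustration.

```javascript
// Registers a connected socket and removes it once it goes dead or idle.
function trackClient(clients, id, socket, idleMs = 120000) {
  clients.set(id, socket);

  // Ask the OS to send TCP keep-alive probes after 30 s of silence, so an
  // unplugged peer eventually surfaces as an error on this side.
  socket.setKeepAlive(true, 30000);

  // Emit 'timeout' after idleMs without any reads or writes. This only
  // signals inactivity; the socket must be destroyed explicitly.
  socket.setTimeout(idleMs);
  socket.on('timeout', () => socket.destroy());

  // Destroying the socket leads to 'close', where the cleanup happens.
  socket.on('error', () => socket.destroy());
  socket.on('close', () => clients.delete(id));
}
```

With the idle timeout in place, clients that are legitimately quiet for longer than idleMs would need to send a small heartbeat of their own to avoid being dropped.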