Is the node thread the wrong place for a simulation loop? - multithreading

Let's say there is a weather simulator generating made-up weather, and every 0.5 seconds a setInterval callback fires and runs a bunch of calculations to get readings and process the data into human-readable form.
It then sends the relevant data to logged-in parties via a socket, perhaps only when the data actually changes.
So would it be better to run the weather simulation/generator in a child process by itself and keep the I/O in node's single thread?
Or, would that create locking requirements?

I think the principle that should guide you is separation of concerns. Your node.js server is a communications conduit. Your weather simulator is simply that. Changes made to one of them are very unlikely to involve the other. This is also the more scalable choice if your weather simulator later becomes heavier, or you start getting more users than you had expected.
If the web clients could, I'm sure they would want to subscribe directly to the weather events feed, but don't let that fool you into thinking that you should muddle the websockets in with the simulator.

If the calculations are intense, I'd highly recommend that you run them as a child process and keep the node.js event loop responsive. If you listen to the child's stdout for its data and end events, you can send out the data as soon as it is available.
Make sure you use the async reads and writes so that you don't start blocking things, and you won't have to worry about locks.
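As a rough sketch of that setup (not a definitive implementation), the parent below spawns the simulator and reads its stdout asynchronously. The script name `simulator.js`, the helper names, and the convention that the child prints one JSON reading per line are all assumptions made for illustration.

```javascript
const { spawn } = require('child_process');

// Collects stdout chunks and hands back complete lines. A chunk boundary can
// fall in the middle of a line, so the unfinished tail is kept for next time.
function createLineSplitter(onLine) {
  let buffered = '';
  return function push(chunk) {
    buffered += chunk;
    const lines = buffered.split('\n');
    buffered = lines.pop(); // last element is an incomplete line (or '')
    for (const line of lines) {
      if (line.trim() !== '') onLine(line);
    }
  };
}

// Starts the simulator in its own process and forwards each reading.
// Assumption: the child writes one JSON object per line to stdout.
function startSimulator(scriptPath, onReading) {
  const child = spawn(process.execPath, [scriptPath], {
    stdio: ['ignore', 'pipe', 'inherit'],
  });
  child.stdout.setEncoding('utf8');
  const push = createLineSplitter((line) => onReading(JSON.parse(line)));
  child.stdout.on('data', push); // fires asynchronously as output arrives
  child.stdout.on('end', () => console.log('simulator output closed'));
  return child;
}
```

It could be wired up with something like `startSimulator('simulator.js', (reading) => io.emit('weather', reading))`, where `io` is a hypothetical Socket.IO server instance. Because the two processes share no memory and only exchange messages, no locks are involved.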

Related

NodeJS Polling per User Structure best practice

My project is a full stack application where a web client subscribes to an unready object. When the subscription is triggered, the backend will run an observation loop on that unready object until it becomes ready. When that happens it sends a message to the frontend through socketIO (suggestions are welcome, I'm not quite sure if it's the best method). My question is how do I construct the observation loop.
My frontend basically subscribes to the backend and gets a 200 response, after which it connects to the server via WebSocket (socketIO) if it got subscribed correctly, or it gets a 4XX error code if something went wrong. On the backend, when the user subscribes, it should start for that user a "thread" (I know Nodejs doesn't support threads, it's just for the mental image) that polls information from an API every 10 or so seconds.
I do that, because the API that I poll from does not support WebHooks, so I need to observe the API response until it's at the state that I want it (this part I already got cleared).
What I'm asking is: is there a third party library that is actually meant for those kinds of tasks? Should I use worker threads or simple setTimeouts abstracted by classes? The response will be sent over SocketIO, that part I already got working as well; it's just the polling method that I'm not quite sure how to build.
I'm also open to use another fitting programming language that makes solving this case easier. I'm not in a hurry.
A polling network request (which it sounds like this is) is non-blocking and asynchronous so it doesn't really take much of your nodejs CPU unless you're doing some heavy-weight computation of the result.
So, a single nodejs thread can make a lot of network requests (for your polling and for sending data over socket.io connection) without adding WorkerThreads or clustering. This is something that nodejs is very, very good at.
I'm not aware of any third party library specifically for this, as you have to write custom code to examine the results of the network request anyway, and that's most of the coding. There are a bunch of libraries for making http requests of other servers from nodejs listed here. My favorite in that list is got(), but you can look at the choices and decide what you like.
As for making the repeated requests, I would probably just use either repeated setTimeout() calls or a setInterval() call.
You don't say whether you have to make separate requests for every single client that is subscribed to something or whether you can somehow combine all clients watching the same resource so that you use the same polling interval for all of them. If you can do the latter, that would certainly be more efficient.
If, as you scale, you run into scaling issues, you can then move the polling code to one or more child processes or WorkerThreads and then just communicate back to the main thread via messaging when you have found a new state that needs to be sent to the client. But, I would not anticipate you would need to code that extra step until you reach larger scale. As with most scaling things, you would need to code up the more basic option (which should scale well by itself) and then measure and benchmark and see where any bottlenecks are and modify the architecture based on data, not speculation. Far too often, the architecture is over-designed and over-implemented based on where people think the bottlenecks might be rather than where they actually turn out to be. Not only does this make the development take longer and end up with more complicated implementation than required, but it can target development at the wrong part of the problem. Profile, measure, then decide.
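To make the "repeated setTimeout(), shared between clients watching the same resource" idea concrete, here is a minimal sketch. The class name `SharedPoller` and the injected `fetchState` function (which would wrap the actual HTTP request to the polled API) are hypothetical names, not part of any library.

```javascript
// One polling loop per resource, shared by every subscriber of that resource.
class SharedPoller {
  // fetchState(resourceId) is a hypothetical async function that performs
  // the HTTP request and resolves to the current state of the resource.
  constructor(fetchState, intervalMs = 10000) {
    this.fetchState = fetchState;
    this.intervalMs = intervalMs;
    this.watchers = new Map(); // resourceId -> { listeners: Set, timer }
  }

  subscribe(resourceId, listener) {
    let entry = this.watchers.get(resourceId);
    if (!entry) {
      // First subscriber for this resource: start its polling loop.
      entry = { listeners: new Set(), timer: null };
      this.watchers.set(resourceId, entry);
      this.schedule(resourceId, entry);
    }
    entry.listeners.add(listener);
  }

  unsubscribe(resourceId, listener) {
    const entry = this.watchers.get(resourceId);
    if (!entry) return;
    entry.listeners.delete(listener);
    if (entry.listeners.size === 0) {
      // Last subscriber left: stop polling this resource.
      clearTimeout(entry.timer);
      this.watchers.delete(resourceId);
    }
  }

  schedule(resourceId, entry) {
    entry.timer = setTimeout(async () => {
      try {
        const state = await this.fetchState(resourceId);
        for (const listener of entry.listeners) listener(state);
      } catch (err) {
        console.error(`poll failed for ${resourceId}:`, err.message);
      }
      // Re-arm only if somebody is still subscribed to this resource.
      if (this.watchers.get(resourceId) === entry) {
        this.schedule(resourceId, entry);
      }
    }, this.intervalMs);
  }

  activeResources() {
    return this.watchers.size;
  }
}
```

The next timer is armed only after the previous request has finished, so slow responses never pile up on top of each other. A listener would typically check whether the state is the one it is waiting for, emit over its socket, and then unsubscribe.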

Building Websites only on NodeJs and Express blocking requests over http

I have a question regarding the examples out there when using Nodejs, Express and Jade for templates.
All the examples show how to build some sort of a user administrative interface where you can add user profiles, delete them and manage them.
Those are considered beginner's guides to NodeJs. My question is around the fact that if I have 10 users concurrently accessing the same interface and doing the same operations, surely NodeJs will block the requests for the other users as they are running on the same port.
So let's say I am pulling out a list of users which may be something like 10000. Yes I can do paging, but that is not the point. While I am getting the list from the server another 4 users want to access the application. They have to wait for my process to end. That is my question - how can one avoid that using NodeJS & Express?
I am on this issue for a couple of months! I currently have something in place that does the following:
Run the main processing of stuff on a port
Run a Socket.io process on a different port
Use a sticky session
The idea is that I do a request (like getting a list of items), and immediately respond with some request reference but without the requested items, thus releasing the port.
In the background, "asynchronously", I then do the process of getting the items. When that has completed, I do an http request from the main node process to the socket node's port, SENDING the items through.
When that is done I then perform a socket.io emit WITH the data and the initial request reference so that the correct user gets the message.
On the client side I have an event listening for the socket which then completes the ajax request by populating the list.
I have SOME success in doing this! It actually works to a degree! I have an issue online which complicates matters due to ip addresses, and socket.io playing funny.
I also have multiple workers using clustering. I use it in the following manner:
I create a master worker
I spawn workers
I take any connection request and pass it to the relevant worker.
I do that for the main node request as well as for the socket requests. Like I said I use 2 ports!
As you can see I have had a lot of work done on this and I am not getting a proper solution!
My question is this - have I gone all around the world 10 times only to have missed something simple? This sounds way too complicated to achieve a non-blocking nodejs only website.
I asked myself - surely all these tutorials would have not missed on something as important as this! But they did!
I have researched, read, and tested a lot of code - this is my very first time I ask anything on stackoverflow!
Thank you for any assistance.
P.S. One example of the same approach is this: I request a report using jasper, I pass parameters, and with the "delayed ajax response" approach as described above I simply release the port, and in the background a very intensive report is being generated (and this can be very intensive process as a lot of calculations are being performed)..! I really don't see a better approach - any help will be super appreciated!
Thank you for taking the time to read!
I'm sorry to say it, but yes, you have been going around the world 10 times only to have been missing something simple.
It's obvious that your previous knowledge of and experience with web servers comes from a blocking point of view, and under that model your concerns would be valid.
Node.js is a runtime built around using a single thread to execute your code, which means that while it is doing a blocking operation, no one else is able to get anything done.
There are some operations that can do this in node, such as the synchronous file-system calls (fs.readFileSync and friends) or long CPU-bound calculations. However, most node I/O operations are asynchronous.
I believe you are familiar with the term, so I won't go into details. What asynchronous operations allow node to do is keep this single thread idle as much as possible. By idle I mean open for other work. If your code is fully asynchronous, then handling 4 concurrent users (or even 400) shouldn't be a problem, even for a single thread.
Now, in regards to your initial problem of ports: once a request is received on a given port, node.js executes whatever code you have written for it until it encounters an asynchronous operation. As soon as that happens, it is available to pick up more requests on the same port.
The second problem you inquire about is the database operation. In this case, node.js sends the query to the database (which takes almost no time) and the database does the actual execution of the query. In the meantime, node is free to do whatever it wants until the database is finished and lets node know there is a result to fetch.
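A small experiment can make this visible. In the sketch below, `fakeDbQuery` and `handleRequest` are made-up stand-ins for a real database call and an Express route handler; the timer plays the role of the time the database spends executing the query.

```javascript
const log = [];

// Stand-in for a database query: while the timer runs, the JavaScript
// thread is not busy and can do other work.
function fakeDbQuery(ms) {
  return new Promise((resolve) => {
    setTimeout(() => resolve(['alice', 'bob']), ms);
  });
}

// Stand-in for a route handler that lists users.
async function handleRequest(user) {
  log.push(`${user}: request received`);
  const rows = await fakeDbQuery(50); // control returns to the event loop here
  log.push(`${user}: sending ${rows.length} rows`);
}

handleRequest('user1');
handleRequest('user2');

// Both requests were accepted before either query finished:
console.log(log); // [ 'user1: request received', 'user2: request received' ]
```

The second request is picked up as soon as the first one reaches its `await`, which is the same thing that happens with real requests arriving on a single port.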
You can recognize async operations by their structure: my_function(..., ..., callback). A function that takes a callback function is, in most cases, async.
So bottom line: Don't worry about the problems around blocking IO, as you will hardly encounter any in node. Use a single port if you want (by creating multiple worker processes, for example with the cluster module, you can even have multiple node instances listening on the same port).
Hope this explains it well enough. If you have any further questions, let me know :)

How do I track and handle an event in node

So let's say I want to make a twitter bot. I want to send a certain message to whoever has sent it a reply, so I need to make an event for it. Obviously one way is to get all the replies (or last n replies) in a certain time interval, find out which ones are new, etc; but first of all it's not live, and it requires an extra query to find new tweets.
Say we want to track some changes in a website. For instance, we want to handle an event when that change happens, instantly.
I used socket.io to handle some other kind of events, like when some changes happen in a particular port, but I couldn't figure out how I can handle these types of events.
The word "event" does not mean what you think it means!
In a DOM environment, an Event is a very specific (and core) concept which allows you to write code based on user interactions with elements on the screen.
In NodeJS, an Event is something that can be generated and announced by an instance of events.EventEmitter.
In your question, an Event seems to refer to anything that happens on the internet, potentially anywhere.
Under that last definition, there is simply no single answer for how to "track an event."
If you want to write code that can respond to change (which is just a more specific version of "react to an input") you need to create a mechanism to identify that a change has occurred, followed by a mechanism to trigger whatever code you want to be run in response (this last part is what you would normally call "emitting" an "event").
SocketIO accomplishes both of these things for certain situations, using a graceful degradation of protocols in order to explicitly emit local events that you can listen for and handle. It starts trying to use WebSockets, and eventually falls back to more expensive techniques such as polling.
SocketIO only works if the source of the information or change has decided to support the protocol. In those cases, the source is actually emitting the event (over websockets) and socketIO listens for it.
In cases where the source of the information you are looking for does not support websockets (and hasn't been coded to explicitly notify your servers of changes), you are going to have to come up with your own solution. However, you shouldn't think of this as a case of tracking "events". Rather, you are watching for changes.
How you watch for changes will depend on the nature of the change. Generally you'll probably have to poll for it.
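The two mechanisms described above (detect that a change has occurred, then trigger code in response) can be sketched with Node's own events.EventEmitter. The class name `ChangeWatcher`, its `observe` method, and the `'change'` event name are choices made for this illustration.

```javascript
const { EventEmitter } = require('events');

// Turns "I polled and got this value" into a 'change' event, emitted only
// when the value differs from the previous poll.
class ChangeWatcher extends EventEmitter {
  constructor() {
    super();
    this.hasBaseline = false;
    this.lastSerialized = undefined;
  }

  // Feed in the latest polled value. Returns true if 'change' was emitted.
  observe(value) {
    const serialized = JSON.stringify(value);
    if (!this.hasBaseline) {
      // The first observation only establishes what "unchanged" means.
      this.hasBaseline = true;
      this.lastSerialized = serialized;
      return false;
    }
    if (serialized === this.lastSerialized) return false;
    this.lastSerialized = serialized;
    this.emit('change', value);
    return true;
  }
}
```

A polling loop would then call something like `watcher.observe(await fetchLatestReplies())` on a timer, where `fetchLatestReplies` is a hypothetical function that queries the remote site, and the rest of the program just does `watcher.on('change', handler)`.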

socket.io send data to client based on id passed by client

Say I have a rest end point which when called starts a long running process server side e.g.
http://host/api/program/start
and I want to push any updates / output from that process from the server side to a client.
I'm thinking the rest call would return some sort of unique id which the client could then use when connecting to the websocket to only receive updates about that particular process.
I'd have to think about buffering the output / updates from the process to send to the client if they didn't connect before the first output from the process but irrespective of that, what would be the best way of achieving the socket data handling for this? Could I make use of the socket.io rooms / namespaces in some way?
If you really want to do it this way, I would suggest generating the ID via the initial start call, then passing that to the long running process as an argument. Then that process publishes all messages to that ID (which appropriate clients are listening to as well).
However, I would discourage you from going with this approach. There are plenty of ways to go about handling a child process in Node, so you might want to look into these options a little more so you don't end up dealing with zombie processes all over the place.
The first that comes to mind is ChildProcess. Another option would be something like WebWorker Threads. Either of these would be right in the vein of what (I think) you're trying to do, but allow you to maintain much more control over the child processes.
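For the ID and buffering part of the question, one possible shape is sketched below. It is plain Node with no Socket.IO dependency; `JobOutputHub` and its method names are hypothetical, and how the listener actually reaches the client is left open.

```javascript
const { randomUUID } = require('crypto');

// Keeps output per job id and replays it to a client that connects late.
class JobOutputHub {
  constructor() {
    this.jobs = new Map(); // id -> { backlog: [], listener }
  }

  // Called by the REST handler; the returned id is sent back to the client.
  createJob() {
    const id = randomUUID();
    this.jobs.set(id, { backlog: [], listener: null });
    return id;
  }

  // Called whenever the long-running process produces output.
  publish(id, message) {
    const job = this.jobs.get(id);
    if (!job) return;
    if (job.listener) job.listener(message);
    else job.backlog.push(message); // nobody connected yet, keep it
  }

  // Called when a client connects and presents its id.
  attach(id, listener) {
    const job = this.jobs.get(id);
    if (!job) return false;
    for (const message of job.backlog) listener(message); // replay first
    job.backlog = [];
    job.listener = listener;
    return true;
  }
}
```

With Socket.IO, the listener could be as simple as `(msg) => socket.emit('output', msg)`, or the socket could join a room named after the id so that `io.to(id).emit(...)` reaches it. Either way, the child process only needs to know its id.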

Node.js game logics

I'm in the process of making a realtime multiplayer racing game. Now I need help writing the game logic in a Node.js TCP (net) server. I don't know if it's possible, and I don't know if I'm doing it right, but I'm trying my best. I know it's hard to understand my broken English, so I made this "painting" :)
Thank you for your time
To elaborate on driushkin's answer, you should use remote procedure calls (RPC) and an event queue. This works like in the image you've posted, where each packet represents a 'command' or RPC with some arguments (i.e. movement direction). You'll also need an event queue to make sure RPCs are executed in order and on time. This will require a timestamp or framecount for each command to be executed on (at some point in the future, in a simple scheme), and synchronized watches (World War II style).
You might notice one critical weakness in this scheme: RPC messages can be late (arrive after the time they should be applied) due to network latency, malicious users, etc. In a simple scheme, late RPCs are dropped. This is fine since all clients (even the originator!) wait for the server to send an RPC before acting (if the originating client didn't wait for the server message, his game state would be out of sync with the server, and your game would be broken).
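A minimal sketch of such an event queue follows, assuming each command carries the frame number (`tick`) it should run on. `CommandQueue`, `enqueue`, and `advanceTo` are names invented for this example.

```javascript
// Holds commands until their frame comes up and drops the ones that arrive
// after their frame has already been simulated.
class CommandQueue {
  constructor() {
    this.pending = [];
    this.currentTick = 0;
    this.dropped = 0;
  }

  // command = { tick, player, action }, tick is the frame it should run on.
  enqueue(command) {
    if (command.tick <= this.currentTick) {
      this.dropped += 1; // late: that frame has already been processed
      return false;
    }
    this.pending.push(command);
    // Sorting by tick keeps execution order deterministic; the sort is
    // stable, so commands for the same tick stay in arrival order.
    this.pending.sort((a, b) => a.tick - b.tick);
    return true;
  }

  // Advance the simulation to `tick` and return the commands that are due.
  advanceTo(tick) {
    this.currentTick = tick;
    const due = this.pending.filter((c) => c.tick <= tick);
    this.pending = this.pending.filter((c) => c.tick > tick);
    return due;
  }
}
```

The server's game loop would call `advanceTo(frame)` once per frame and apply whatever comes back, while the socket handler calls `enqueue` for every packet it receives.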
Consider the impact of lag on such a scheme. Let's say the lag for Client A to the server was 100ms, and the return trip was also 100ms. This means that client input goes like:
Client A presses key, and sends RPC to server, but doesn't add it locally (0ms)
Server receives and rebroadcasts RPC (100ms)
Client A receives his own event, and now finally adds it to his event queue for processing (200ms)
As you can see, the client reacts to his own event 1/5 of a second after he presses the key. This is with fairly nice 100ms lag. Transoceanic lag can easily be over 200ms each way, and dialup connections (rare, but still existent today) can have lag spikes > 500ms. None of this matters if you're playing on a LAN or something similar, but on the internet this unresponsiveness could be unbearable.
This is where the notion of client side prediction (CSP) comes in. CSP is made out to be big and scary, but implemented correctly and thoughtfully it's actually very simple. The interesting feature of CSP is that clients can process their input immediately (the client predicts what will happen). Of course, the client can (and often will) be wrong. This means that the client will need a way of applying corrections from the Server. Which means you'll need a way for the server to validate, reject, or amend RPC requests from clients, as well as a way to serialize the gamestate (so it can be restored as a base point to resimulate from).
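Here is a deliberately tiny sketch of that idea, assuming a one-dimensional position and numbered inputs. The sequence numbers, the `applyInput` function, and the `PredictingClient` class are illustrative assumptions, not a prescribed protocol.

```javascript
// Applies one input to a state and returns the new state. It is a pure
// function so that the client and the server can share the same rule.
function applyInput(state, input) {
  return { x: state.x + input.dx };
}

class PredictingClient {
  constructor() {
    this.state = { x: 0 };
    this.nextSeq = 1;
    this.pending = []; // inputs sent but not yet acknowledged by the server
  }

  // Called on a key press: predict locally and return the message to send.
  sendInput(dx) {
    const input = { seq: this.nextSeq++, dx };
    this.pending.push(input);
    this.state = applyInput(this.state, input); // immediate local prediction
    return input;
  }

  // Called with the authoritative state and the last input it accounts for.
  onServerState(serverState, lastProcessedSeq) {
    // Forget inputs the server has already handled (accepted or rejected).
    this.pending = this.pending.filter((i) => i.seq > lastProcessedSeq);
    // Restart from the server's state and replay what it has not seen yet.
    this.state = this.pending.reduce(applyInput, { ...serverState });
  }
}
```

If the server rejects or amends an input, its state simply differs from the prediction, and the replay step brings the client back in line without the client having to wait before reacting to its own key presses.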
There are lots of good resources about doing this. I like http://www.gabrielgambetta.com/?p=22 in particular, but you should really look for a good multiplayer game programming book.
I also have to suggest socket.io, even after reading your comments regarding Flex and AS3. The ease of use (and simple integration with node) make it one of the best (the best?) option(s) for network gaming over HTTP that I've ever used. I'd make whatever adjustments necessary to be able to use it. I believe that AIR/AS3 has at least one WebSockets library, even if socket.io itself isn't available.
This sounds like something socket.io would be great for. It's a library that gives you real time possibilities on the browser and on your server.
You can model this as commands and events: the client sends a move command to the server, then the server validates this command and, if everything is OK, publishes an "is moving" event.
In your case, there is probably no need for different responses to P1 (OK, you can move) and the rest (P1 is moving); the latter suffices in both cases. The "is moving" event should contain all necessary info (like current position, velocity, etc.).
In this simplest form, the client issuing the command would experience some lag until the event from the server arrives. To avoid that, you could start moving immediately and then apply some compensating actions if necessary when the event arrives. But this can get complicated.
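On the server, the command-to-event step might look roughly like this. The speed limit, the field names, and the `isMoving` event type are all hypothetical; the point is only that one validated event is produced for everybody, the sender included.

```javascript
const MAX_SPEED = 10; // hypothetical game rule

// Validates a 'move' command and returns the event to broadcast to every
// player, or null if the command is rejected.
function handleMoveCommand(players, command) {
  const player = players.get(command.playerId);
  if (!player) return null; // unknown player

  const { vx, vy } = command;
  if (!Number.isFinite(vx) || !Number.isFinite(vy)) return null;
  if (Math.hypot(vx, vy) > MAX_SPEED) return null; // faster than allowed

  player.velocity = { vx, vy };
  return {
    type: 'isMoving',
    playerId: command.playerId,
    position: { ...player.position },
    velocity: { ...player.velocity },
  };
}
```

With a net server, the returned event would be serialized and written to every connected socket, including the one that sent the command.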
