Node.js Threads Shared Memory Access


Is there anything that is similar to PHP's APC (Alternative PHP Cache) for Node.js?
I mean something that lets every Node.js thread running on a server access the same cache. I know the architecture of Node.js may not easily, or at all, allow for an APC-like cache.
I know we can of course run memcache on each server to create a server-level cache, but I was curious whether there is any alternative.
thanks

Node is trying to keep only the basic stuff in its API, so you won't find such a thing "baked in" (for example WebSockets isn't included in Node core, but in external modules).
You would need to create such a cache layer using something like Redis or Memcached.
P.S. It is better to refer to Node processes rather than threads, since you don't have to handle threading yourself in Node.
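To illustrate why the cache has to live outside the individual Node processes, here is a minimal sketch in which one process owns the cache and workers reach it by message passing. It is not Redis or Memcached, only a same-machine stand-in built on the cluster module's messaging. The message shape (`type`, `op`, `id`) and the function names are made up for this example, and values must survive JSON serialization.

```javascript
const cluster = require('cluster');

// Pure request handler for the process that owns the cache.
// `store` is a Map living in exactly one process.
function handleCacheRequest(store, request) {
  switch (request.op) {
    case 'set':
      store.set(request.key, request.value);
      return { id: request.id, ok: true };
    case 'get':
      return { id: request.id, ok: true, value: store.get(request.key) };
    case 'del':
      return { id: request.id, ok: store.delete(request.key) };
    default:
      return { id: request.id, ok: false, error: 'unknown op' };
  }
}

// Primary side: answer cache requests from every forked worker.
// (The (worker, message) listener signature applies to Node 6 and later.)
function serveCacheToWorkers(store) {
  cluster.on('message', (worker, request) => {
    if (request && request.type === 'cache') {
      const reply = handleCacheRequest(store, request);
      worker.send(Object.assign({ type: 'cache-reply' }, reply));
    }
  });
}

// Worker side: send one request and wait for the matching reply.
let nextRequestId = 0;
function askCache(op, key, value, callback) {
  const id = ++nextRequestId;
  const onReply = (reply) => {
    if (reply && reply.type === 'cache-reply' && reply.id === id) {
      process.removeListener('message', onReply);
      callback(reply);
    }
  };
  process.on('message', onReply);
  process.send({ type: 'cache', id, op, key, value });
}
```

The primary would call `serveCacheToWorkers(new Map())` before `cluster.fork()`, and a worker would call something like `askCache('get', 'user:1', undefined, reply => ...)`. Every lookup costs an IPC round trip, and the cache is gone when the primary exits, which is part of why an external store is the usual recommendation.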

I don't know if this module helps at all.
I can't guarantee its reliability, and I never kept my promise to do a Windows API as I'm a bit of a Linux snob (as in nothing Microsoft comes near my PC).
https://github.com/dazhazit/node-ipcbuffer
It implements a simple byte buffer between processes. You could probably build any mechanism you like on top of it.

Related

NodeJS Performance Issue

I'm running an API server using NodeJS 6.10.3 LTS on Ubuntu 14.04 (trusty). I've noticed that my API server tops out at ~600 reqs/min running on a c4.large EC2 instance. By "tops out" I mean that I see the CPU go up to 100%. Note, I know that I'm not fully utilizing the instance because I'm not using the cluster module, but that's ok for now.
I took a .cpuprofile dump of my API server for 10 seconds, and noticed that every second, for ~300ms, the profiler shows my NodeJS code is sitting (idle).
Does anyone know what that (idle) implies? Is it a GC issue? Or is it an internal (to V8) lock that I'm triggering? Any help or pointers to tools to help debug this would be nice. I'm working on anonymizing some of the stack traces in the cpuprofile so I can share it.
The packages I'm using are ExpressJS 4, Couchbase NodeJS SDK, Socket.IO mainly. The codepaths are mainly reading requests, and pushing to Couchbase. And finally querying couchbase via Views API, and pushing some aggregated data on a Socket.IO channel. So all pretty I/O async friendly stuff. I've made sure that I'm not calling any synchronous functions. There are no patterns of function calls before the (idle) in the cpu profile.

It could also just be I/O wait, meaning none of the sockets have data ready to read yet and so the time is spent idle. If you are using a load testing library you should check that the requests are evenly distributed within a second.
Take a look at https://www.npmjs.com/package/gc-stats to check GC data. There are flags to increase heap space, and to change when GC runs, if the problem turns out to be GC related.
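One way to tell the two cases apart without extra packages is to watch event loop lag: a timer that fires on schedule suggests the loop really was idle (I/O wait), while a timer that fires late suggests something blocked the loop, such as a GC pause or synchronous code. The sketch below uses only built-in APIs; the function names and the 100 ms / 50 ms figures in the usage note are arbitrary choices for illustration.

```javascript
// Convert a process.hrtime() diff ([seconds, nanoseconds]) to milliseconds.
function hrtimeToMs(diff) {
  return diff[0] * 1e3 + diff[1] / 1e6;
}

// Calls onSample(lagMs, heapUsedBytes) roughly every intervalMs.
// lagMs is how much later than scheduled the timer actually fired.
function startLagMonitor(intervalMs, onSample) {
  let last = process.hrtime();
  const timer = setInterval(() => {
    const elapsedMs = hrtimeToMs(process.hrtime(last));
    last = process.hrtime();
    const lagMs = Math.max(0, elapsedMs - intervalMs);
    onSample(lagMs, process.memoryUsage().heapUsed);
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for monitoring
  return timer;
}
```

For example, `startLagMonitor(100, (lag, heap) => { if (lag > 50) console.log('loop blocked', lag, heap); })` logs only when the loop was held up. If the heap figure drops sharply around the same moments, GC becomes a more likely suspect, and running with the V8 flag `--trace-gc` would show the individual collections.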

Out of Memory with Node.js and Redis - Video game

We just installed Redis yesterday, because we are thinking about saving our entire map, with all positions, in Redis as a cache to make the game faster.
We did some tests, and when trying to save a 70000*70000 map we ran out of memory in Node.js,
because Node by default does not let a process use more than about 1 GB of RAM on an x64 machine.
https://github.com/joyent/node/wiki/FAQ (last chapter)
We tried again with a smaller one, 7500*7500 and it was okay.
We don't have such a big world map yet, but I'm thinking about the future.
I think that saving the map in Redis is really important for better performance (because we want to check the player movements from the server too), but maybe there is a better way?
We could simply run the server using
node --max-old-space-size=16000 app.js
This would allow Node to use more memory, because a video game obviously needs more memory than a web application, but I would like to discuss this more before making a choice.

Redis is the better choice. We do not recommend storing too much data in the Node.js heap.
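A quick back-of-the-envelope calculation shows why the large map fails and the small one does not. The sketch below assumes one byte per cell, which is hypothetical (the question does not say what a cell holds), and uses a flat Uint8Array as one compact way to hold a grid in process.

```javascript
// Bytes needed for a width x height grid at a given cell size.
function gridBytes(width, height, bytesPerCell) {
  return width * height * bytesPerCell;
}

// A flat typed array indexed as y * width + x, one byte per cell (assumed).
function createGrid(width, height) {
  const cells = new Uint8Array(width * height); // zero-filled
  return {
    width,
    height,
    get(x, y) { return cells[y * width + x]; },
    set(x, y, value) { cells[y * width + x] = value; },
  };
}

console.log(gridBytes(70000, 70000, 1)); // 4900000000 bytes, about 4.9 GB
console.log(gridBytes(7500, 7500, 1));   // 56250000 bytes, about 56 MB
```

Even at a single byte per cell the 70000*70000 map needs several gigabytes, and a map stored as JavaScript objects would need far more per cell. That supports keeping the full map in Redis (or loading it in regions) and holding only the active part in the Node process.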

Is there a way to share memory among workers/threads/something in Node.JS?

I have a Node app which accesses a static, large (>100M), complex, in-memory data structure, accepts queries, and then serves out little slices of that data to the client over HTTP.
Most queries can be answered in tenths of a second. Hurray for Node!
But, for certain queries, searching this data structure takes a few seconds. This sucks because everyone else has to wait.
To serve more clients efficiently, I would like to use some sort of parallelism.
But, because this data structure is so large, I would like to share it among the workers or threads or what have you, so I don't burn hundreds of megabytes. This would be perfectly safe, because the data structure is not going to be written to. A typical 'fork()' in any other language would do it.
However, as far as I can tell, all the standard ways of doing parallelism in Node explicitly make this impossible. For safety, they don't want you to share anything.
But is there a way?
Background:
It is impractical to put this data structure in a database, or use memcached, or anything like that.
WebWorker API libraries and similar only allow short serialized messages to be passed in and out of the workers.
Node's Cluster uses a call named 'fork', but it is not really a fork of the existing process, it is spawning a new one. So once again, no shared memory.
Probably the really correct answer would be to use filesystem-like access to shared memory, aka tmpfs, or mmap. There are some node libraries that make mount() and mmap() available for exactly something like this. Unfortunately then one has to implement complex data structure access on top of synchronous seeks and reads. My application uses arrays of arrays of dicts and so on. It would be nice to not have to reimplement all that.

I tried writing a C/C++ binding for shared memory access from Node.js: https://github.com/supipd/node-shm
It is still a work in progress (but working for me). Maybe it is useful; if you find a bug or have a suggestion, let me know.

Building with waf is the old style (Node 0.6 and below); the new build uses gyp.
You should look at node cluster (http://nodejs.org/api/cluster.html). Not clear this is going to help you without having more details, but this runs multiple node processes on the same machine using fork.
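For completeness: Node versions that ship the worker_threads module can share memory between threads through a SharedArrayBuffer. This is a sketch under that assumption, not something available in the Node versions discussed above, and it only covers flat binary data. A structure made of arrays of arrays of dicts would first have to be encoded into typed arrays, which is the same reimplementation cost the question mentions for mmap. The function names are made up for the example.

```javascript
const { Worker } = require('worker_threads');

// Copy the numbers once into shared memory; every thread then reads the same bytes.
function toSharedFloat64(values) {
  const shared = new SharedArrayBuffer(values.length * Float64Array.BYTES_PER_ELEMENT);
  new Float64Array(shared).set(values);
  return shared;
}

// Code run inside the worker: sum a slice of the shared array without copying it.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  const view = new Float64Array(workerData.shared);
  let sum = 0;
  for (let i = workerData.start; i < workerData.end; i++) sum += view[i];
  parentPort.postMessage(sum);
`;

// Run one slow query off the main thread; resolves with the slice sum.
function sumSliceInWorker(shared, start, end) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: { shared, start, end }, // the buffer is shared, not cloned
    });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}
```

A server could build the shared buffer once at startup and hand it to a small pool of workers, so that a query taking seconds runs in a worker while the main thread keeps answering the fast ones.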

Actually Node does support spawning processes. I'm not sure how close Node's fork is to real fork, but you can try it:
http://nodejs.org/api/child_process.html#child_process_child_process_fork_modulepath_args_options
By the way: it is not true that Node is unsuited for that. It is as suited as any other language/web server. You can always fire multiple instances of your server on different ports and put a proxy in front.
If you need more memory - add more memory. :) It is as simple as that. Also you should think about putting all of that data on a dedicated in-memory database like Redis or Memcached ( or even Couchbase if you need complex queries ). You won't have to worry about duplicating that data any more.

Most web applications spend the majority of their life waiting for network buffers and database reads. Node.js is designed to excel at this io bound work. If your work is truly bound by the CPU, you might be served better by another platform.
With that out of the way...
Use setImmediate (or process.nextTick on Node versions before 0.10, where it still deferred to the next loop iteration), perhaps in nested blocks, to break expensive CPU work into pieces so that it is not allowed to block your thread for long. This will make sure one client making expensive requests doesn't negatively impact all the others.
Use node.js cluster to add a worker process for each CPU in the system. Worker processes can all bind to a single HTTP port and use Memcached or Redis to share memory state. Workers also have a messaging API that can be used to keep an in-process memory cache synchronized, however it has some consistency limitations.
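The first suggestion, splitting CPU work so the event loop can breathe, might look like the sketch below. It schedules each slice with setImmediate rather than process.nextTick, because in current Node versions nextTick callbacks run before pending I/O and so would not let other requests through. Summing an array stands in for the real search, and the function name and chunk size are arbitrary.

```javascript
// Sum an array in slices of chunkSize, yielding to the event loop between slices.
// The callback is always invoked asynchronously with (err, total).
function sumInChunks(items, chunkSize, callback) {
  let index = 0;
  let total = 0;

  function processChunk() {
    const end = Math.min(index + chunkSize, items.length);
    for (; index < end; index++) {
      total += items[index];
    }
    if (index < items.length) {
      setImmediate(processChunk); // let pending I/O callbacks run first
    } else {
      callback(null, total);
    }
  }

  setImmediate(processChunk);
}
```

Calling `sumInChunks(bigArray, 10000, (err, total) => ...)` keeps any single stretch of blocking work bounded by the chunk size. The total work is unchanged, so this improves fairness between clients, not throughput; for throughput the cluster suggestion above is what adds CPU capacity.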

How to throttle a node.js application?

I am trying to throttle hosted node.js applications. Those applications are user created in a web-ide and it seems, it can knock out the entire server.
Do we need to implement this in C++ and rebuild Node.js ourselves?

If you are using linux, you can try something like "renice" to set the priority of each of these processes. Node.js is no different than hosting python, perl or PHP applications, any of them can take a lot of CPU if the program is written poorly or the application is processing many requests.
If by "knock out the entire server" you mean can cause a kernel panic, make sure you have the latest version of node.js and your server is up to date. This should never happen.
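If the hosting process is itself written in Node, the renice idea can be applied when launching each user application, without rebuilding Node. This is a rough sketch assuming a Node version that provides `os.setPriority`; the function names, the 128 MB heap cap, and the wall-clock timeout are illustrative choices of mine, not something from the question. Lowering priority only makes the child yield CPU under contention; hard limits would need OS-level tools.

```javascript
const { spawn } = require('child_process');
const os = require('os');

// Command-line arguments that cap the child's V8 old-space heap.
function sandboxArgs(scriptPath, maxHeapMb) {
  return ['--max-old-space-size=' + maxHeapMb, scriptPath];
}

// Start a user script as a separate, lower-priority Node process.
function runThrottled(scriptPath, options) {
  const opts = Object.assign({ maxHeapMb: 128, niceness: 10, timeoutMs: 30000 }, options);
  const child = spawn(process.execPath, sandboxArgs(scriptPath, opts.maxHeapMb), {
    stdio: 'inherit',
  });

  try {
    os.setPriority(child.pid, opts.niceness); // higher niceness = lower priority
  } catch (err) {
    console.error('could not lower priority:', err.message);
  }

  // Illustrative safety net: kill the child if it runs too long.
  const timer = setTimeout(() => child.kill('SIGKILL'), opts.timeoutMs);
  child.on('exit', () => clearTimeout(timer));
  child.on('error', () => clearTimeout(timer));
  return child;
}
```

A call such as `runThrottled('user-app.js', { niceness: 15 })` would start the (hypothetical) user script with a capped heap and reduced CPU priority.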

NodeJS + SocketIO: Scaling and preventing single point of failure

So the first app that people usually build with SocketIO and Node is usually a chatting app. This chatting app basically has 1 Node server that will broadcast to multiple clients. In the Node code, you would have something like.
// Pseudocode
for (const client of clients) {
  if (client !== messageSender) {
    client.send(message);
  }
}
This is great for a low number of users, but I see a problem with this. First of all, there is a single point of failure which is the Node server. Second of all, the app will slow down as the number of clients grow. What is there to do then when we reach this bottleneck? Is there an architecture (horizontal/vertical scaling) that can be used to alleviate this problem?

For that "one day" when your chat app needs multiple, fault-tolerant node servers, and you want to use socket.io to cross communicate between the server and the client, there is a node.js module that fits the bill.
https://github.com/hookio/hook.io
It's basically an event emitting framework to cross communicate between multiple "things" -- such as multiple node servers.
It's relatively complicated to use, compared to most modules, which is understandable since this is a complex problem to solve.
That being said, you'd probably have to have a few thousand simultaneous users and lots of other problems before you begin to have problems with this.
Another thing you can do, is try to develop your application in a way so that if a connection is lost (which happens all the time anyway), eg. server goes down, client has network issues (eg. mobile user), etc, your application should be able to handle that and recover from such issues gracefully.

Since Node.js has a single event-loop thread, this single point of failure is written into its DNA. Even reloading a server after code changes requires this thread to be stopped.
There are however a lot of tools available to handle such failures gracefully. You could use forever, a simple CLI tool for ensuring that a given script runs continuously. Other options include distribute and up. Distribute is a load balancing middleware for Node. Up builds on top of Distribute to offer zero downtime reloads using either a JavaScript API or a command line interface.
On further reading, I find you just need to use the Redis store with Socket.IO to maintain connection references between two or more processes or servers. These options have already been discussed extensively here and here.
There's also the option of using socket.io-clusterhub if you don't intend to use the Redis store.
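The idea shared by these options is that each server only knows its own connected clients, and a common channel relays every message to all servers. Below is a minimal sketch of that shape. An in-process EventEmitter stands in for the shared channel (in a real deployment that role would be played by something like Redis pub/sub), and the `ChatNode` class and its methods are invented for the example.

```javascript
const { EventEmitter } = require('events');

// One chat server. `bus` is the channel shared by all servers.
class ChatNode {
  constructor(bus) {
    this.bus = bus;
    this.clients = new Map(); // clientId -> function that sends to that client
    bus.on('chat', (event) => this.deliverLocally(event));
  }

  addClient(id, send) {
    this.clients.set(id, send);
  }

  removeClient(id) {
    this.clients.delete(id);
  }

  // Publish to the shared channel instead of looping over local clients only.
  publish(senderId, message) {
    this.bus.emit('chat', { senderId, message });
  }

  // Each server forwards the message to its own clients, skipping the sender.
  deliverLocally(event) {
    for (const [id, send] of this.clients) {
      if (id !== event.senderId) send(event.message);
    }
  }
}
```

With two `ChatNode` instances on the same bus, a message published on one reaches the clients of both, and losing one server only drops that server's connections. The channel itself then becomes the component that needs to be made redundant.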
