We're currently in the process of updating from node 0.10 to node 4.1.2 and we're seeing some weird patterns. The number of connections to our postgres database doubles, and we're seeing the same pattern with requests to external services. We are running a clustered app using the native cluster API, and the number of workers is the same for both versions.
I'm failing to understand why upgrading the runtime language would apparently change application behaviour by doubling requests to external services.
One of the interesting things I've noticed with 0.12 and 4.x is the change in garbage collection. I've not used the pg module before, so I don't know how it maintains its pools internally or whether it would be affected by memory or garbage collection. If you haven't defined a memory setting for node, you could try giving that a shot and see if you get different results.
node --max_old_space_size=<some sane value in MB>
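To confirm that the flag actually took effect, one option is to read the heap limit from inside the process. This is a minimal sketch, assuming the built-in v8 module is available in your Node version; the helper name heapLimitMb is made up for illustration.

```javascript
const v8 = require('v8');

// Hypothetical helper: reports V8's current heap size limit in megabytes.
function heapLimitMb() {
  const stats = v8.getHeapStatistics();
  return Math.round(stats.heap_size_limit / 1024 / 1024);
}

console.log('V8 heap limit: ' + heapLimitMb() + ' MB');
```

Running the app with and without the flag and comparing the printed value should show whether the setting is being picked up.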
I ran into something similar, but I was getting double file writes. I don't know your exact case, but I've seen a scenario where requests could almost exactly double.
In the update to 4.1.2, process.send and child.send have gone from synchronous to asynchronous.
I found an issue like this:
var fork = require('child_process').fork;
var child = fork('./request.js');
var test = null;
child.send(smallRequest); // smallRequest and largeRequest are defined elsewhere
child.send(largeRequest);
child.on('message', function (val) {
    console.log('small request came back: ' + val);
    test = val;
});
if (!test) {
    // retry request
} ...
So whereas previously the blocking sends allowed this code to work, with non-blocking sends the code assumes an error has occurred and retries. No error actually occurred, so double the requests come in.
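One way to avoid the false retry is to make the retry decision in a timeout that the message handler cancels, rather than checking right after the send. Below is a minimal sketch; a fake child built on EventEmitter stands in for the forked process so the example runs on its own, and makeFakeChild and sendWithRetry are illustrative names, not part of child_process.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for a forked worker: it replies asynchronously,
// so nothing has come back yet when send() returns.
function makeFakeChild() {
  const child = new EventEmitter();
  child.sent = 0;
  child.send = function (request) {
    child.sent++;
    setImmediate(() => child.emit('message', 'reply to ' + request));
  };
  return child;
}

// Sends a request and retries only if no reply arrives within timeoutMs.
function sendWithRetry(child, request, timeoutMs, done) {
  let answered = false;
  const timer = setTimeout(() => {
    if (!answered) child.send(request); // genuine retry: nothing came back in time
  }, timeoutMs);
  child.once('message', (val) => {
    answered = true;
    clearTimeout(timer);
    done(val);
  });
  child.send(request);
}

const demoChild = makeFakeChild();
sendWithRetry(demoChild, 'small request', 100, (val) => {
  console.log(val + ' (sends: ' + demoChild.sent + ')');
});
```

With a real forked child the same shape applies: the retry fires only when the timeout elapses without a 'message' event.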
We have recently started using TypeScript for one of our applications, where queued communication is expected between a server and one or more clients.
To achieve the queued communication, we are trying to use the ZeroMQ library, version 4.6.0, as an npm package: npm install -g zeromq and npm install -g @types/zeromq.
The exact scenario :
The client is going to send thousands of messages to the server over ZeroMQ. The server in turn will respond with an acknowledgement message for each incoming message from the client. Based on the acknowledgement message, the client will send the next message.
ZeroMQ pattern used :
The ROUTER/DEALER pattern (we cannot use any other pattern).
Client side code :
import Zmq = require('zeromq');

let clientSocket: Zmq.Socket;
let messageQueue = [];

export class ZmqCommunicator
{
    constructor(connString: string)
    {
        clientSocket = Zmq.socket('dealer');
        clientSocket.connect(connString);
        clientSocket.on('message', this.ReceiveMessage);
    }

    // Called for every incoming message; an 'ack' releases the next queued message.
    public ReceiveMessage = (msg) => {
        var json = JSON.parse(msg.toString('utf8'));
        if (json.type != "error" && json.type == 'ack') {
            if (messageQueue.length > 0) {
                this.Dispatch(messageQueue.splice(0, 1)[0]);
            }
        }
    }

    public Dispatch(message) {
        clientSocket.send(JSON.stringify(message));
    }

    public SendMessage(msg: Message, isHandshakeMessage: boolean) {
        // The if branch runs only once, for the first handshake message. All other messages take the else branch.
        if (isHandshakeMessage == true) {
            clientSocket.send(JSON.stringify(msg));
        }
        else {
            messageQueue.push(msg);
        }
    }
}
On the server side, we already have a ROUTER socket configured.
The above code is pretty straightforward. The SendMessage() function is essentially called for thousands of messages, and the code works, but with a lot of memory consumption.
Problem :
Because ZeroMQ behaves asynchronously, the client has to wait for the ReceiveMessage() callback before it can send a new message to the ZeroMQ ROUTER (as is evident from the flow into the Dispatch method).
Based on our limited knowledge of TypeScript and of using ZeroMQ with TypeScript, the problem is this: the default thread running the TypeScript code (which creates the required 1000+ messages and passes them to SendMessage()) continues its execution, creating and passing on more messages, after sending the first (handshake) message. SendMessage() does not send the data but queues it, because we want to interpret the acknowledgement message sent by the ROUTER socket and send the next message only on the basis of that acknowledgement. Until all the 1000+ messages have been created and passed to SendMessage(), control does not reach the ReceiveMessage() callback.
That is to say, control reaches ReceiveMessage() only after the default thread has finished creating the 1000+ messages and calling SendMessage() for each of them, and has nothing further to do.
Because ZeroMQ does not provide any synchronous mechanism of sending/receiving data using ROUTER/DEALER, we had to use a queue, the messageQueue object in the code above.
This mechanism loads a huge messageQueue (with 1000+ messages) into memory and dequeues only once the default thread finally gets to the ReceiveMessage() callback. The situation will only worsen if we have, say, 10000+ messages or more to send.
Questions :
We have validated this behavior, so we are sure of the understanding explained above. Is there any gap in our understanding of either/or TypeScript or ZeroMQ usage?
Is there any concept like a blocking queue or limited-size array in TypeScript, which would accept a limited number of entries and block any new additions until the existing ones are dequeued (which essentially means that the default thread pauses its processing until the ReceiveMessage() callback is called and dequeues entries from the queue)?
Is there any synchronous ZeroMQ methodology (we have used it in a similar setup for C#, where we poll on ZeroMQ and receive the data synchronously)?
Any leads on using multi-threading for such a scenario? We are not sure whether TypeScript supports multi-threading to a good extent.
Note: We have searched many forums and have not found any leads anywhere. The above description may contain multiple questions in one (against the rules of the Stack Overflow forum), but for us all of these questions are interlinked with using ZeroMQ effectively in TypeScript.
Looking forward to getting some leads from the community.
Welcome to ZeroMQ
If this is your first read about ZeroMQ, feel free to first take a five-second read about the main conceptual differences in the [ ZeroMQ hierarchy in less than a five seconds ] section.
1 ) ... Is there any gap in our understanding of either/or TypeScript or ZeroMQ usage ?
While I cannot speak to the TypeScript part, let me mention a few details that may help you move forward. While ZeroMQ is principally a broker-less, asynchronous signalling/messaging framework, it has many flavours of use, and there are tools to enforce both synchronous and asynchronous cooperation between the application code and the ZeroMQ Context()-instance, which is the cornerstone of all service designs.
The native API provides means to define whether a respective call ought to block until message processing across the Context()-instance's boundary has completed or, on the contrary, whether a call ought to obey ZMQ_DONTWAIT and asynchronously return control to the caller, irrespective of whether the operation(s) completed.
As additional tricks, one may configure ZMQ_SNDHWM + ZMQ_RCVHWM and other related .setsockopt() options to get specific blocking / silent-dropping behaviours.
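The blocking side of a high-water mark can be pictured in application code as a bounded queue whose push only completes once there is room. The following is a plain JavaScript sketch of that idea, not ZeroMQ's implementation; BoundedQueue and its methods are hypothetical names.

```javascript
// Hypothetical bounded queue: push() resolves only once the item is accepted.
class BoundedQueue {
  constructor(limit) {
    this.limit = limit;
    this.items = [];
    this.waiters = []; // producers waiting for free space
  }

  // Accepts the item immediately if there is room, otherwise waits.
  push(item) {
    if (this.items.length < this.limit) {
      this.items.push(item);
      return Promise.resolve();
    }
    return new Promise((resolve) => {
      this.waiters.push(() => {
        this.items.push(item);
        resolve();
      });
    });
  }

  // Removes the oldest item and admits one waiting producer, if any.
  shift() {
    const item = this.items.shift();
    const waiter = this.waiters.shift();
    if (waiter) waiter();
    return item;
  }
}
```

A producer written as an async function can await each push, which pauses message creation until the consumer (for example an acknowledgement handler) calls shift().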
Because ZeroMQ does not provide any synchronous mechanism of sending/receiving data
Well, the ZeroMQ API does provide means for a synchronous call to the .send()/.recv() methods, where the caller is blocked until a message can be delivered into / from the Context()-engine's domain of control.
Obviously, the TypeScript language binding/wrapper is responsible for exposing these native API services to you.
3 ) Is there any synchronous ZeroMQ methodology (we have used it in a similar setup for C#, where we poll on ZeroMQ and receive the data synchronously) ?
Yes, there are several such :
- the native API, unless instructed otherwise by a ZMQ_DONTWAIT flag, blocks until a message can be served
- the native API provides a Poller() object that can .poll(); if given -1 as the timeout, it waits for the sought-for events, blocking the caller until such an event arrives at the Poller() instance.
Again, the TypeScript language binding/wrapper is responsible for exposing these native API services to you.
... Large memory consumption ...
Well, this may signal poor resource management. ZeroMQ messages, once allocated, ought also to be freed where appropriate. Check your TypeScript code and the TypeScript language binding/wrapper sources to see whether resources are systematically disposed of and freed from memory.
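On the memory side specifically, one option is not to build all the messages up front but to create each one only when the previous acknowledgement arrives. Here is a minimal sketch in plain JavaScript; the send argument stands in for clientSocket.send, and createAckDrivenSender and generateMessages are illustrative names, not part of the zeromq package.

```javascript
// Hypothetical ack-driven sender: pulls the next message from an iterable
// only when asked, so at most one message exists at a time.
function createAckDrivenSender(messages, send) {
  const iterator = messages[Symbol.iterator]();
  let sentCount = 0;

  // Sends the next message, or returns false when the source is exhausted.
  function sendNext() {
    const next = iterator.next();
    if (next.done) return false;
    send(JSON.stringify(next.value));
    sentCount++;
    return true;
  }

  return {
    start: sendNext, // sends the first (handshake) message
    onAck: sendNext, // call this from the 'message' handler on each ack
    get sentCount() { return sentCount; }
  };
}

// Lazily produces messages instead of holding them all in an array.
function* generateMessages(count) {
  for (let i = 0; i < count; i++) {
    yield { type: 'data', seq: i };
  }
}
```

Because the generator yields one message per acknowledgement, memory use stays flat whether there are 1000 or 10000 messages.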
I believe this is more of a MongoDB question than a Meteor question, so don't get scared if you know a lot about mongo but nothing about meteor.
Running Meteor in development mode, but connecting it to an external Mongo instance instead of using Meteor's bundled one, results in the same problem. This leads me to believe this is a Mongo problem, not a Meteor problem.
The actual problem
I have a meteor project which continuously gets data added to the database and displays it live in the application. It works perfectly in development mode, but behaves strangely when built and deployed to production. It works as follows:
A tiny script running separately collects broadcast UDP packages and shoves them into a mongo collection
The Meteor application then publishes a subset of this collection so the client can use it
The client subscribes and live-updates its view
The problem here is that the subscription appears to get data only about every 10 seconds, while these UDP packages arrive and get shoved into the database several times per second. This makes the application behave weirdly.
It is most noticeable on the collection of UDP messages, but not limited to it. It happens with every collection that is subscribed to, even those not populated by the external script.
Querying the database directly, either through the mongo shell or through the application, shows that the documents are indeed added and updated as they are supposed to be. The publication just fails to notice and appears to fall back to querying on a 10-second interval.
Meteor uses oplog tailing on the MongoDB to find out when documents are added/updated/removed and updates the publications based on this.
Does anyone with a bit more Mongo experience than me have a clue about what the problem is?
For reference, this is the dead simple publication function
/**
 * Publishes a custom part of the collection. See {@link https://docs.meteor.com/api/collections.html#Mongo-Collection-find} for args
 *
 * @returns {Mongo.Cursor} A cursor to the collection
 *
 * @private
 */
function custom(selector = {}, options = {}) {
    return udps.find(selector, options);
}
and the code subscribing to it:
Tracker.autorun(() => {
    // Params for the subscription
    const selector = {
        "receivedOn.port": port
    };
    const options = {
        limit,
        sort: {"receivedOn.date": -1},
        fields: {
            "receivedOn.port": 1,
            "receivedOn.date": 1
        }
    };
    // Make the subscription
    const subscription = Meteor.subscribe("udps", selector, options);
    // Get the messages
    const messages = udps.find(selector, options).fetch();
    doStuffWith(messages); // Not actual code. Just for demonstration
});
Versions:
Development:
node 8.9.3
mongo 3.2.15
Production:
node 8.6.0
mongo 3.4.10
Meteor uses two modes of operation to provide real-time updates on top of MongoDB, which doesn't have any built-in real-time features: poll-and-diff and oplog-tailing.
1 - Oplog-tailing
It works by reading the mongo database’s replication log that it uses to synchronize secondary databases (the ‘oplog’). This allows Meteor to deliver realtime updates across multiple hosts and scale horizontally.
It's more complicated, and provides real-time updates across multiple servers.
2 - Poll and diff
The poll-and-diff driver works by repeatedly running your query (polling) and computing the difference between new and old results (diffing). The server will re-run the query every time another client on the same server does a write that could affect the results. It will also re-run periodically to pick up changes from other servers or external processes modifying the database. Thus poll-and-diff can deliver realtime results for clients connected to the same server, but it introduces noticeable lag for external writes.
(The default polling interval is 10 seconds, and this is what you are experiencing.)
This may or may not be detrimental to the application UX, depending on the application (eg, bad for chat, fine for todos).
This approach is simple and delivers easy-to-understand scaling characteristics. However, it does not scale well with lots of users and lots of data. Because each change causes all results to be refetched, CPU time and network bandwidth scale O(N²) with users. Meteor automatically de-duplicates identical queries, though, so if each user does the same query the results can be shared.
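To make the poll-and-diff idea concrete, here is a small sketch of the diff step in plain JavaScript. It is an illustration only, not Meteor's actual implementation; diffResults is a made-up name, and it assumes every document has a unique _id.

```javascript
// Hypothetical diff step: compares two polls of the same query by _id.
function diffResults(oldDocs, newDocs) {
  const oldById = new Map(oldDocs.map((doc) => [doc._id, doc]));
  const newById = new Map(newDocs.map((doc) => [doc._id, doc]));
  const added = [];
  const changed = [];
  const removed = [];

  for (const [id, doc] of newById) {
    if (!oldById.has(id)) {
      added.push(id);
    } else if (JSON.stringify(oldById.get(id)) !== JSON.stringify(doc)) {
      // Naive comparison for the sketch; it is sensitive to key order.
      changed.push(id);
    }
  }
  for (const id of oldById.keys()) {
    if (!newById.has(id)) removed.push(id);
  }
  return { added, changed, removed };
}

const previousPoll = [{ _id: 'a', n: 1 }, { _id: 'b', n: 1 }];
const currentPoll = [{ _id: 'b', n: 2 }, { _id: 'c', n: 1 }];
console.log(diffResults(previousPoll, currentPoll));
```

Every poll re-runs the whole query and walks both result sets, which is why the cost grows with the number of documents and subscribers.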
You can tune poll-and-diff by changing values of pollingIntervalMs and pollingThrottleMs.
You have to use the disableOplog: true option to opt out of oplog tailing on a per-query basis.
Meteor.publish("udpsPub", function (selector) {
    return udps.find(selector, {
        disableOplog: true,
        pollingThrottleMs: 10000,
        pollingIntervalMs: 10000
    });
});
Additional links:
https://medium.baqend.com/real-time-databases-explained-why-meteor-rethinkdb-parse-and-firebase-dont-scale-822ff87d2f87
https://blog.meteor.com/tuning-meteor-mongo-livedata-for-scalability-13fe9deb8908
How to use pollingThrottle and pollingInterval?
It's a DDP (WebSocket) heartbeat configuration.
Meteor's real-time communication and live updates are performed using DDP (a JSON-based protocol that Meteor implemented on top of SockJS).
It lets the client and the server change data and react to each other's changes.
The DDP (WebSocket) protocol implements so-called PING/PONG messages (heartbeats) to keep WebSockets alive. The server sends a PING message to the client through the WebSocket, and the client replies with PONG.
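The exchange itself is tiny. As I understand the DDP specification, a ping is a JSON message with msg set to "ping" and an optional id, and the pong echoes that id back. A minimal sketch of the replying side (replyToHeartbeat is a hypothetical helper, not a Meteor API):

```javascript
// Hypothetical helper: returns the pong for a DDP ping, or null for anything else.
function replyToHeartbeat(rawMessage) {
  const message = JSON.parse(rawMessage);
  if (message.msg !== 'ping') {
    return null; // not a heartbeat, leave it to the normal message handling
  }
  const pong = { msg: 'pong' };
  if (message.id !== undefined) {
    pong.id = message.id; // echo the id so the sender can match the reply
  }
  return JSON.stringify(pong);
}

console.log(replyToHeartbeat('{"msg":"ping","id":"42"}'));
```

If no pong arrives within the configured timeout, the side that sent the ping treats the connection as dead and closes it.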
By default, heartbeatInterval is configured at a little more than 17 seconds (17500 milliseconds).
Check here: https://github.com/meteor/meteor/blob/d6f0fdfb35989462dcc66b607aa00579fba387f6/packages/ddp-client/common/livedata_connection.js#L54
You can configure heartbeat time in milliseconds on server by using:
Meteor.server.options.heartbeatInterval = 30000;
Meteor.server.options.heartbeatTimeout = 30000;
Other Link:
https://github.com/meteor/meteor/blob/0963bda60ea5495790f8970cd520314fd9fcee05/packages/ddp/DDP.md#heartbeats
I have set up a node server with Express middleware. I get the ECONNABORTED error randomly on some files when loading an HTML file which triggers about 10 other loads (js, css, etc.). The exact error is:
{ [Error: Request aborted] code: 'ECONNABORTED' }
Generated by this simplified code (after I tried to debug the issue):
res.sendFile(res.locals.physicalUrl, function (err) {
    if (err)
        console.log(err);
    ...
});
Many posts talk about this error resulting from not specifying the full path name. That is not the situation here. I do specify the full path and indeed the error is randomly generated. There are times when the page and all its subsequent links load perfectly and there are times when they do not. I tried to flush the cache and did not find any pattern to connect it with this.
This specific error appears to be a generic term for a socket connection getting aborted and is discussed in the context of other applications like FTP.
Having realized that the node worker threads can be increased, I tried to do so using:
process.env.UV_THREADPOOL_SIZE = 20;
However, my understanding is that even absent this, at most the file transfer may have to wait for a worker thread to be free and not get aborted. I am not talking about big files here, all files are less than 1 MB.
I have a gut feeling that this has nothing to do with node directly.
Please point to any other possibilities (node or otherwise) to handle this error. Also, any other indirect solutions? Retrying a few times could be one but that would be clumsy. EDIT: No, I cannot retry. Headers are already sent with the error!
A SIDE NOTE:
Many examples on the use of sendFile skip using the callback thereby giving the impression that it is a synchronous call. It is not. Do use the callback at all times, check for success and only then move on to the "next" middleware or take appropriate steps if the send fails for whatever reason. Not doing so can make it difficult to debug the consequences in an asynchronous environment.
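As a sketch of what such a callback might look like: res.headersSent and the (err) callback signature are Express's, while handleSendFileResult, the log parameter, and the choice to only log aborted transfers are assumptions made for illustration.

```javascript
// Hypothetical helper that builds the callback passed to res.sendFile().
function handleSendFileResult(res, next, log) {
  return function (err) {
    if (!err) {
      return; // transfer completed, the response is finished
    }
    if (err.code === 'ECONNABORTED' || res.headersSent) {
      // The client went away or part of the response is already out:
      // nothing more can be sent, so just record it.
      log('sendFile failed after start: ' + err.message);
      return;
    }
    next(err); // nothing sent yet, let the error middleware respond
  };
}

// Intended use inside a route handler (Express itself is not loaded here):
//   res.sendFile(res.locals.physicalUrl, handleSendFileResult(res, next, console.log));
```

Separating the "already sent" case from the "nothing sent yet" case avoids trying to write a second response after the headers have gone out.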
See https://stackoverflow.com/a/36949631/2798152
Could it be possible that in some cases you terminate the connection by calling res.end before the asynchronous call to res.sendFile ends?
If that's not the case - can you pastebin more of your application code?
Uninstalling and Re-installing MongoDB solved this for me.
I was facing the same problem. It started happening when I had to force restart my laptop because it became unresponsive. On restarting, trying to connect to the mongo server using nodejs always threw an ECONNABORTED error.
I'm trying to set up a robust memcached configuration for a nodejs app with the node-memcached driver, but it does not seem to use the specified failover servers when one server dies.
My local experiment goes as follows:
shell
memcached -p 11212
node
MC = require('memcached')
c = new MC('localhost:11211', //this process does not exist
{failOverServers: ['localhost:11212']})
c.get('foo', console.log) //this will eventually time out
c.get('foo', console.log) //repeat 5 or 6 times to exceed the retries number
//wait until all the connection errors appear in the console
//at this point, the failover server should be in use
c.get('foo', console.log) //this still times out :(
Any ideas about what we might be doing wrong?
It seems that the failover feature is somewhat buggy in node-memcached.
To enable failover you must set the remove option:
c = new MC('localhost:11211', //this process does not exist
{failOverServers: ['localhost:11212'],
remove : true})
Unfortunately, this is not going to work because of the following error:
[depricated] HashRing#replaceServer is removed.
[depricated] the API has no replacement
That is, when trying to replace a dead server with a replacement from the failover list, node-memcached outputs a deprecation error from the HashRing library (which, in turn, is maintained by the same author of node-memcached). IMHO, feel free to open a bug :-)
This happens when your nodejs server is not getting any session id from memcached.
Please check that memcached is configured properly in your php.ini file:
session.save_handler = 'memcache'
session.save_path = 'tcp://localhost:11212'
I have a NodeJs application that listens to messages via subscribe on a Redis server. It collects the messages for a period of 5 seconds and then pushes them out to the connected clients. The code looks something like this:
io.sockets.on('connection', function (socket) {
    nClients++;
    console.log("Number of clients connected " + nClients);
    socket.on('disconnect', function () {
        nClients--;
        console.log("Number of clients remaining " + nClients);
    });
});
Receiving messages to send out to the clients:
cli_sub.on("message", function (channel, message) {
    oo = JSON.parse(message);
    ablv_last_message[oo[0]["base"] + "_" + oo[0]["alt"]] = message;
});
setInterval(function () {
    Object.keys(ablv_last_message).forEach(function (key) {
        io.sockets.emit('ablv', ablv_last_message[key]);
    });
    ablv_last_message = [];
}, 5000);
SOLUTION FOUND (at least I think so): Node didn't crash because it reached some internal memory limit; it looks as if it crashed because my VPS ran out of memory. It was a 2GB VPS running one or two other processes too. After upgrading it to 4GB, Node runs smoothly. Yes, memory usage is always around 1.6 to 2.0 GB, but I believe that is the GC doing its work.
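To see whether the process really plateaus instead of growing without bound, logging process.memoryUsage() at intervals is a cheap first check. A minimal sketch (formatMemoryUsage is an illustrative helper name):

```javascript
// Hypothetical helper: formats the fields of process.memoryUsage() in megabytes.
function formatMemoryUsage(usage) {
  const toMb = (bytes) => (bytes / 1024 / 1024).toFixed(1) + ' MB';
  return 'rss=' + toMb(usage.rss) +
    ' heapUsed=' + toMb(usage.heapUsed) +
    ' heapTotal=' + toMb(usage.heapTotal);
}

console.log(formatMemoryUsage(process.memoryUsage()));
```

Calling this from a timer, for example once a minute, gives a rough curve: a sawtooth that keeps returning to the same floor suggests normal GC behaviour, while a steadily rising floor suggests a leak.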
It would be better to try some tools for finding leaks in node.js.
Tools for Finding Leaks
Jimb Esser’s node-mtrace, which uses the GCC mtrace utility to profile heap usage.
Dave Pacheco’s node-heap-dump takes a snapshot of the V8 heap and serializes the whole thing out in a huge JSON file. It includes tools to traverse and investigate the resulting snapshot in JavaScript.
Danny Coates’s v8-profiler and node-inspector provide Node bindings for the V8 profiler and a Node debugging interface using the WebKit Web Inspector.
Felix Gnass’s fork of the same that un-disables the retainers graph.
Felix Geisendörfer’s Node Memory Leak Tutorial is a short and sweet explanation of how to use the v8-profiler and node-debugger, and is presently the state-of-the-art for most Node.js memory leak debugging.
Joyent’s SmartOS platform, which furnishes an arsenal of tools at your disposal for debugging Node.js memory leaks.
From Tracking Down Memory Leaks in Node.js – A Node.JS Holiday Season.
And another blog
It looks to me that you keep adding keys to the global ablv_last_message object and never clean it.
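A sketch of a buffer that is explicitly swapped out on every flush, so keys cannot accumulate between intervals. The key format follows the question's code; createLatestMessageBuffer and its methods are made-up names, and emit stands in for io.sockets.emit.

```javascript
// Hypothetical buffer: keeps only the latest raw message per base/alt pair.
function createLatestMessageBuffer() {
  let latest = new Map();
  return {
    // Stores the raw message under its "base_alt" key, replacing older ones.
    record(rawMessage) {
      const parsed = JSON.parse(rawMessage);
      latest.set(parsed[0].base + '_' + parsed[0].alt, rawMessage);
    },
    // Emits everything collected so far and starts over with an empty Map.
    flush(emit) {
      const batch = latest;
      latest = new Map();
      batch.forEach((message) => emit('ablv', message));
      return batch.size;
    }
  };
}

const demoBuffer = createLatestMessageBuffer();
demoBuffer.record('[{"base":"EUR","alt":"USD","rate":1.1}]');
console.log('flushed ' + demoBuffer.flush(() => {}) + ' message(s)');
```

In the question's setup, record would be called from the Redis "message" handler and flush from the 5-second setInterval.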
You may use Object.getOwnPropertyNames rather than Object.keys
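Note that the two calls are not interchangeable when the store is an array, which is what ablv_last_message becomes after the reset to [] in the question: Object.getOwnPropertyNames also returns non-enumerable own properties such as length. A small sketch of the difference (the store variable and the key are made up):

```javascript
// An array used as a key/value store, as in the question after the reset.
const store = [];
store['EUR_USD'] = 'latest message';

// Only enumerable own properties.
const enumerableKeys = Object.keys(store);
// All own string-keyed properties, including the array's non-enumerable "length".
const allOwnNames = Object.getOwnPropertyNames(store);

console.log(enumerableKeys);
console.log(allOwnNames);
```

With a plain object ({}) instead of an array, both calls would return the same keys for this kind of store.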