A better way to structure a Mongoose connection module - node.js

I have refactored some code to place all my mongoose.createConnection(...) calls in a single file. That file is then required by other files that use connections to the various databases specified. The connections are created lazily and are used both in an HTTP server and in utility scripts.
The connection file looks like this:
var mongoose = require("mongoose");

var serverString = "mongodb://localhost:27017";
var userDBString = "/USER";
var customerDBString = "/CUSTOMER";

var userConnection = null;
exports.getUserConnection = function () {
    if (userConnection === null) {
        userConnection = mongoose.createConnection(serverString + userDBString, { server: { poolSize: 4 } });
    }
    return userConnection;
};

var customerConnection = null;
exports.getCustomerConnection = function () {
    if (customerConnection === null) {
        customerConnection = mongoose.createConnection(serverString + customerDBString, { server: { poolSize: 4 } });
    }
    return customerConnection;
};
My models are stored in separate files (split by DB) that look a bit like this:
exports.UserSchema = UserSchema; //Just assume I know how to define a valid schema
exports.UserModel = connection.getUserConnection().model("User", UserSchema);
Later, I use getUserConnection() to refer to the connection I created and actually do work with the model.
TL;DR
In utilities that use this connection format, I have to call
connection.getUserConnection().on("open", function () {
    logger.info("Opened User DB");
    //Do What I Need To Do
});
It is possible that in some scenarios the task processor will have already broadcast the open event. In some, it won't be guaranteed to have happened yet. I noticed that it doesn't queue work if the connection isn't open (specifically, dropCollection) so I feel stuck.
How can I be certain that the connection is open before proceeding given that I can't depend on subscribing to the open event before the task processor runs?
Is there a better pattern for centralizing the managing of multiple connections?

I can answer part of my own question
How can I be certain that the connection is open before proceeding
given that I can't depend on subscribing to the open event before the
task processor runs?
if (connection.getUserConnection().readyState !== 1) {
    logger.info("User connection was not open yet. Adding open listener");
    connection.getUserConnection().on("open", function () {
        logger.info("User open event received");
        doStuff();
    });
} else {
    logger.info("User is already open.");
    doStuff();
}

function doStuff() {
    logger.info("Doing stuff");
}
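For what it's worth, the readyState check and the open listener can be folded into one small helper so each utility doesn't repeat the branching. A minimal sketch against the same connection module (whenOpen is a hypothetical name; readyState 1 means connected):
function whenOpen(conn) {
    return new Promise(function (resolve, reject) {
        if (conn.readyState === 1) return resolve(conn); // already open
        conn.once("open", function () { resolve(conn); });
        conn.once("error", reject);
    });
}

whenOpen(connection.getUserConnection()).then(function () {
    doStuff();
});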
If you see a better way then please comment or offer up an answer. I would still like to hear how other people manage connections without rebuilding the connection every time.

Related

Mongoose too many connections and commands

I'm here to request help with mongo/mongoose. I use AWS Lambda functions that access a MongoDB database, and sometimes my connections reach the limit of 500. I'm trying to fix this problem and have already applied the advice in https://dzone.com/articles/how-to-use-mongodb-connection-pooling-on-aws-lambd and https://www.mongodb.com/blog/post/optimizing-aws-lambda-performance-with-mongodb-atlas-and-nodejs: basically, use a singleton-like connection and set context.callbackWaitsForEmptyEventLoop = false. That indeed helped, but rarely it still opens 100 connections in less than a minute; it looks like some connection is not being reused, even though our logs show that they are. I also noticed a weird behavior: whenever MongoDB Atlas shows me an increased number of commands, my connection count increases heavily. The first chart is operations and the second is connections.
Looking at operations, there are too many commands and just a few queries. I have no idea what those commands are. My theory is that they are causing the problem, but I haven't found anything that explains the exact difference between a query and a command, so I can't tell whether that theory is valid. Another thing: how do you choose the pool size correctly? Our queries are really simple.
Here is our singleton class because maybe this is what we are doing wrong:
// Note: mongoose, config, and Model are imported/defined elsewhere in the real code.
class Database {
    options: [string, mongoose.ConnectionOptions];
    instance?: typeof mongoose | null;

    constructor(options = config) {
        console.log('[DatabaseService] Created database instance...');
        this.options = options;
        this.instance = null;
    }

    async checkConnection() {
        try {
            if (this.instance) {
                // ping the server through the existing connection
                const pingResponse = await this.instance.connection.db.admin().ping();
                console.log(`[DatabaseService] Connection status: ${pingResponse.ok}`);
                return pingResponse.ok === 1;
            }
            return false;
        } catch (error) {
            console.log(error);
            return false;
        }
    }

    async init() {
        const connectionActive = await this.checkConnection();
        if (connectionActive) {
            console.log(`[DatabaseService] Already connected, returning instance`);
            return this.instance;
        }
        console.log('[DatabaseService] Previous connection was not active, creating new connection...');
        this.instance = await mongoose.connect(...this.options);
        const timeId = Date.now();
        console.log(`Connection opened ${timeId}`);
        console.time(`Connection started at ${timeId}`);
        this.instance?.connection.on('close', () => {
            console.timeEnd(`Connection started at ${timeId}`);
            console.log(`Closing connection ${timeId}`);
        });
        return this.instance;
    }

    async getData(id: string) {
        await this.init();
        const response = await Model.findOne({ 'uuid': id });
        return response;
    }
}
I hope that is enough information. My main question is whether my theory that commands cause too many connections is plausible, and what exactly commands are, because every explanation I found makes them look the same as queries.
Based on the comment written by Matt, I changed my init function and now my connections are under control:
async init() {
    if (this.instance) {
        console.log(`[DatabaseService] Already connected, returning instance`);
        return this.instance;
    }
    console.log('[DatabaseService] Previous connection was not active, creating new connection...');
    this.instance = await mongoose.connect(...this.options);
    const timeId = Date.now();
    console.log(`Connection opened ${timeId}`);
    console.time(`Connection started at ${timeId}`);
    this.instance?.connection.on('close', () => {
        console.timeEnd(`Connection started at ${timeId}`);
        console.log(`Closing connection ${timeId}`);
    });
    return this.instance;
}
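For context, here is a minimal sketch of the Lambda-side wiring the linked articles describe, assuming the Database class above (the handler shape and event.id are illustrative): the instance lives outside the handler so warm invocations of the same container reuse this.instance, and callbackWaitsForEmptyEventLoop lets Lambda freeze the container while the socket stays open.
const database = new Database();

exports.handler = async (event, context) => {
    // Don't wait for the open MongoDB socket before freezing the container.
    context.callbackWaitsForEmptyEventLoop = false;
    // init() returns the cached instance on warm starts.
    await database.init();
    return database.getData(event.id);
};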

Wait for Node constructor to connect to api before issuing commands

Sorry if the question title is a tad ambiguous, but I'm not entirely sure how to word it.
I'm writing an NPM module that talks to a json-rpc api - this is the current setup.
// The module
function MyModule(config) {
    // do some connection stuff here
    connected = true;
}

MyModule.prototype.sendCommand = function () {
    if (connected) {
        // do command
    } else {
        // output an error
    }
};

module.exports = MyModule;
// The script interacting with the module
var MyModule = require('./MyModule');

var config = {
    // config stuff
};

var mod = new MyModule(config);
mod.sendCommand();
The command won't send, as at this point it hasn't connected. I assume this is due to Node.js' asynchronous, non-blocking architecture, and that I perhaps need to use promises to wait for a response from the API. Where would I implement this: in my module, or in the script interacting with the module?
You will need to use either a callback or promises or something like that to indicate when the connection is complete so you can then use the connection in further code that is started via that callback.
Though it is generally not considered the best practice to do asynchronous stuff in a constructor, it can be done:
function MyModule(config, completionCallback) {
    // do some connection stuff here
    connected = true;
    completionCallback(this);
}

var mod = new MyModule(config, function (mod) {
    // object has finished connecting
    // further code can run here that uses the connection
    mod.sendCommand(...);
});
A more common design pattern is to not put the connecting in the constructor, but to add a method just for that:
function MyModule(config) {
}

MyModule.prototype.connect = function (fn) {
    // code here that does the connection and calls
    // fn callback when connected
};

var mod = new MyModule(config);
mod.connect(function () {
    // object has finished connecting
    // further code can run here that uses the connection
    mod.sendCommand(...);
});
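If you'd rather expose a promise from connect() than take a callback, the same separation works. A minimal sketch (the actual connection work is elided, just as above):
MyModule.prototype.connect = function () {
    var self = this;
    return new Promise(function (resolve, reject) {
        // do the connection work here, then:
        connected = true;
        resolve(self); // call reject(err) instead if it fails
    });
};

var mod = new MyModule(config);
mod.connect().then(function (mod) {
    mod.sendCommand();
});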
Don't use promises; use Node's programming model, where you don't "call functions" but "call functions with a result handler for dealing with the data once it's actually available":
MyModule.prototype.sendCommand = function (handler) {
    if (connected) {
        // run stuff, obtain results, send that on:
        handler(null, result);
    } else {
        // output an error, although really we should
        // just try to connect if we're not, and say
        // there's an error only when it actually fails.
        handler(new Error("ohonoes"));
    }
};
and then you call the function as
var MyModule = require('./MyModule');
var mod = ...
mod.sendCommand(function (err, result) {
    // we'll eventually get here, at which point:
    if (err) { return console.error(err); }
    run();
    more();
    code();
    withResult(result);
});

node.js - Require Exports Variables Errors

I have a node.js script that does some database queries for me and works fine. The script is starting to get a bit long, so I thought I'd break it up, and moving the database connection code out to another file seemed sensible.
Below is the code that I have moved into another file and then included with a require statement.
The issue I'm having is with the 'exports' commands at the bottom of the script. It appears the function dbHandleDisconnectUsers() exports fine, but the variable dbConnectionUsers doesn't.
The script errors refer to missing methods on the object dbConnectionUsers (I hope that's the correct terminology) and give me the impression I'm not really passing a complete object. Note: I would include the exact errors, but I'm not in front of the machine.
var mysql = require('/usr/lib/node_modules/mysql');

// Users Database Configuration
var dbConnectionUsers;
var dbConfigurationUsers = ({
    host     : 'xxxxx',
    user     : 'xxxxx',
    password : 'xxxxx',
    database : 'xxxxxx',
    timezone : 'Asia/Singapore'
});

// Users Database Connection & Re-Connection
function dbHandleDisconnectUsers() {
    dbConnectionUsers = mysql.createConnection(dbConfigurationUsers);
    dbConnectionUsers.connect(function (err) {
        if (err) {
            console.log('Users Error Connecting to Database:', err);
        } else {
            dbConnectionUsers.query("SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE;");
            dbConnectionUsers.query("SET SESSION sql_mode = 'ANSI';");
            dbConnectionUsers.query("SET NAMES UTF8;");
            dbConnectionUsers.query("SET time_zone='Asia/Singapore';");
        }
    });
    dbConnectionUsers.on('error', function (err) {
        console.log('Users Database Protocol Connection Lost: ', err);
        if (err.code === 'PROTOCOL_CONNECTION_LOST') {
            dbHandleDisconnectUsers();
        } else {
            throw err;
        }
    });
}

dbHandleDisconnectUsers();

exports.dbHandleDisconnectUsers() = dbHandleDisconnectUsers();
exports.dbConnectionUsers = dbConnectionUsers;
In the core script I have this require statement:
var database = require('database-connect.js');
And I refer to the function/variable as
database.dbHandleDisconnectUsers()
database.dbConnectionUsers
Ignoring the syntax error that everybody else has pointed out in exports.dbHandleDisconnectUsers() = dbHandleDisconnectUsers(), I will point out that dbConnectionUsers is uninitialized.
JavaScript is a pass-by-copy-of-reference language; therefore, these lines:
var dbConnectionUsers;
exports.dbConnectionUsers = dbConnectionUsers;
are essentially identical to
exports.dbConnectionUsers = undefined;
Even though you set dbConnectionUsers later, you are not affecting exports.dbConnectionUsers because it holds a copy of the original dbConnectionUsers reference.
It's similar to what happens with primitive data types:
var x = 5;
var y = x;
x = 1;
console.log(x); // 1
console.log(y); // 5
For details on how require and module.exports work, I will refer you to a recent answer I posted on the same topic:
Behavior of require in node.js
It's odd that your function is working but your other variable isn't exporting. This shouldn't be the case.
When you export functions, you generally don't want to export them as evaluated functions (i.e. aFunction()). The only time you might is if you want to export whatever that function returns, or if you want to export an instance of a constructor function as part of your module.
The other thing, which is really odd and is mentioned in a comment above, is that you are trying to assign a value to exports.dbHandleDisconnectUsers(), which should be undefined and throw an error.
So, in other words: your code should not look like exports.whatever() = whatever().
Instead you should export both functions and other properties like this:
exports.dbHandleDisconnectUsers = dbHandleDisconnectUsers; // no evaluation ()
exports.dbConnectionUsers = dbConnectionUsers;
I don't know if this is the only thing wrong here, but this is definitely one thing that might be causing an execution error or two :)
Also, taking into consideration what Brandon has pointed out as well, you are initially exporting something undefined. But in your script, you are overwriting the reference anyway.
What you should do instead is make a new object reference, which is persistent and has a property on it that you can update, i.e.:
var dbConnection = {users: null};
exports.dbConnection = dbConnection;
Then when you run your function:
function dbHandleDisconnectUsers() {
    dbConnection.users = mysql.createConnection(dbConfigurationUsers);
    dbConnection.users.connect(function (err) {
        if (err) {
            console.log('Users Error Connecting to Database:', err);
        } else {
            dbConnection.users.query("SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE;");
            dbConnection.users.query("SET SESSION sql_mode = 'ANSI';");
            dbConnection.users.query("SET NAMES UTF8;");
            dbConnection.users.query("SET time_zone='Asia/Singapore';");
        }
    });
    dbConnection.users.on('error', function (err) {
        console.log('Users Database Protocol Connection Lost: ', err);
        if (err.code === 'PROTOCOL_CONNECTION_LOST') {
            dbHandleDisconnectUsers();
        } else {
            throw err;
        }
    });
}
This way, the object reference of dbConnection is never overwritten.
You will then refer to your users db connection in your module as:
database.dbConnection.users
Your function should still work as you were intending on using it before with:
database.dbHandleDisconnectUsers();
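To make that concrete, here is a minimal sketch of how the core script might consume the module (the 'SELECT 1' query is just illustrative; note the ./ prefix that require needs for local files):
var database = require('./database-connect.js');

database.dbHandleDisconnectUsers();

// dbConnection is a stable object reference, so .users stays "live":
// it always points at whatever connection the last (re)connect created.
database.dbConnection.users.query('SELECT 1', function (err, rows) {
    if (err) throw err;
    console.log(rows);
});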

Keeping open a MongoDB database connection

In so many introductory examples of using MongoDB, you see code like this:
var MongoClient = require('mongodb').MongoClient;

MongoClient.connect("mongodb://localhost:port/adatabase", function (err, db) {
    /* Some operation... CRUD, etc. */
    db.close();
});
If MongoDB is like any other database system, open and close operations are typically expensive time-wise.
So, my question is this: is it OK to simply do the MongoClient.connect("... once, assign the returned db value to some module global, have various functions in the module do various database-related work (insert documents into collections, update documents, etc.) when they're called by other parts of the application (thereby re-using that db value), and then, when the application is done, only then do the close?
In other words, open and close are done once - not every time you need to go and do some database-related operation. And you keep re-using that db object that was returned during the initial open\connect, only to dispose of it at the end, with the close, when you're actually done with all your database-related work.
Obviously, since all the I/O is async, before the close you'd make sure that the last database operation had completed before issuing the close. This seems like it should be OK, but I wanted to double-check just in case I'm missing something, as I'm new to MongoDB. Thanks!
Yes, that is fine and typical behavior: start your app, connect to the db, do operations against the db for a long time, maybe re-connect if the connection ever dies unexpectedly, and then just never close the connection (rely on the automatic close that happens when your process dies).
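A minimal sketch of that connect-once shape as a module, assuming a driver version where connect() returns a promise (the file and database names are illustrative):
// db.js: connect once, share the handle everywhere.
var MongoClient = require('mongodb').MongoClient;

var client = new MongoClient('mongodb://localhost:27017');
var dbPromise = client.connect().then(function () {
    return client.db('adatabase');
});

// Every caller awaits the same underlying connection.
module.exports = function getDb() {
    return dbPromise;
};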
mongodb version ^3.1.8
Initialize the connection as a promise:
const MongoClient = require('mongodb').MongoClient
const uri = 'mongodb://...'
const client = new MongoClient(uri)
const connection = client.connect() // initialized connection
And then call the connection whenever you wish to perform an action on the database:
// if I want to insert into the database...
const connect = connection

connect.then(() => {
    const doc = { id: 3 }
    const db = client.db('database_name')
    const coll = db.collection('collection_name')
    coll.insertOne(doc, (err, result) => {
        if (err) throw err
    })
})
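With async/await the same pattern reads a little more directly; a sketch reusing client and connection from the snippet above:
async function insertExample() {
    await connection // resolves once client.connect() has finished
    const coll = client.db('database_name').collection('collection_name')
    const result = await coll.insertOne({ id: 3 })
    console.log(result.insertedId)
}

insertExample().catch(console.error)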
The current accepted answer is correct in that you may keep the same database connection open to perform operations; however, it is missing details on how you can reconnect if the connection closes. Below are two ways to automatically reconnect. It's in TypeScript, but it can easily be translated into normal Node.js if you need to.
Method 1: MongoClient Options
The simplest way to allow MongoDB to reconnect is to define reconnectTries in the options passed to MongoClient. Any time a CRUD operation times out, the driver will use the parameters passed into MongoClient to decide how to retry (reconnect). Setting the option to Number.MAX_VALUE essentially makes it retry forever until it's able to complete the operation. You can check out the driver source code if you want to see which errors will be retried. (Note that reconnectTries and reconnectInterval are legacy-topology options; newer driver versions removed them.)
import { Db, MongoClient, MongoClientOptions } from 'mongodb';

class MongoDB {
    private db: Db;

    constructor() {
        this.connectToMongoDB();
    }

    async connectToMongoDB() {
        const options: MongoClientOptions = {
            reconnectInterval: 1000,
            reconnectTries: Number.MAX_VALUE
        };
        try {
            const client = new MongoClient('uri-goes-here', options);
            await client.connect();
            this.db = client.db('dbname');
        } catch (err) {
            console.error(err, 'MongoDB connection failed.');
        }
    }

    async insert(doc: any) {
        if (this.db) {
            try {
                await this.db.collection('collection').insertOne(doc);
            } catch (err) {
                console.error(err, 'Something went wrong.');
            }
        }
    }
}
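One caveat with this shape: the constructor fires connectToMongoDB() without awaiting it, so this.db may still be undefined on early calls, and insert()'s if (this.db) guard will silently drop those writes. A usage sketch that waits explicitly:
const mongoDb = new MongoDB();

(async () => {
    // Await the connection rather than relying on the constructor's
    // fire-and-forget call having finished in time. (In real code you'd
    // keep a single connect promise so this doesn't open a second client.)
    await mongoDb.connectToMongoDB();
    await mongoDb.insert({ name: 'example' });
})();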
Method 2: Try-catch Retry
If you want more granular support on trying to reconnect, you can use a try-catch with a while loop. For example, you may want to log an error when it has to reconnect or you want to do different things based on the type of error. This will also allow you to retry depending on more conditions than just the standard ones included with the driver. The insert method can be changed to the following:
async insert(doc: any) {
    if (this.db) {
        let isInserted = false;
        while (isInserted === false) {
            try {
                await this.db.collection('collection').insertOne(doc);
                isInserted = true;
            } catch (err) {
                // Add custom error handling if desired
                console.error(err, 'Attempting to retry insert.');
                try {
                    await this.connectToMongoDB();
                } catch {
                    // Do something if this fails as well
                }
            }
        }
    }
}
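One thing to watch with the while loop: as written it retries immediately and forever, which can hammer a database that is down. A sketch of the same retry idea with a capped exponential backoff added (insertWithBackoff and sleep are hypothetical names):
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function insertWithBackoff(db, doc) {
    for (let attempt = 0; ; attempt++) {
        try {
            await db.collection('collection').insertOne(doc);
            return;
        } catch (err) {
            console.error(err, 'Attempting to retry insert.');
            // Wait 1s, 2s, 4s, ... capped at 30s between attempts.
            await sleep(Math.min(1000 * 2 ** attempt, 30000));
        }
    }
}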

fs.watch fired twice when I change the watched file

var fs = require('fs');

fs.watch('example.xml', function (eventType, filename) {
    // on file change we can read the new xml
    fs.readFile('example.xml', 'utf8', function (err, data) {
        if (err) throw err;
        console.dir(data);
        console.log('Done');
    });
});
OUTPUT:
some data
Done X 1
some data
Done X 2
Is this my usage fault, or ..?
The fs.watch api:
is unstable
has known "behaviour" with regards repeated notifications. Specifically, the windows case being a result of windows design, where a single file modification can be multiple calls to the windows API
I make allowance for this by doing the following:
var fsTimeout;

fs.watch('file.js', function (e) {
    if (!fsTimeout) {
        console.log('file.js %s event', e);
        fsTimeout = setTimeout(function () { fsTimeout = null; }, 5000); // give 5 seconds for multiple events
    }
});
I suggest working with chokidar (https://github.com/paulmillr/chokidar), which is much better than fs.watch; a usage sketch follows the lists below.
Quoting its README.md:
Node.js fs.watch:
Doesn't report filenames on OS X.
Doesn't report events at all when using editors like Sublime on OS X.
Often reports events twice.
Emits most changes as rename.
Has a lot of other issues
Does not provide an easy way to recursively watch file trees.
Node.js fs.watchFile:
Almost as bad at event handling.
Also does not provide any recursive watching.
Results in high CPU utilization.
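For the single-file case in the question, a minimal chokidar sketch might look like this; its awaitWriteFinish option exists specifically to hold the change event until writes settle, which addresses the double fire directly:
const chokidar = require('chokidar');

chokidar
    .watch('example.xml', {
        // Emit only after the file size has been stable for 200ms,
        // so an editor's batched writes produce a single event.
        awaitWriteFinish: { stabilityThreshold: 200, pollInterval: 100 }
    })
    .on('change', (path) => {
        console.log(`${path} changed`);
    });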
If you need to watch your file for changes then you can check out my small library on-file-change. It checks file sha1 hash between fired change events.
Explanation of why we have multiple fired events:
You may notice in certain situations that a single creation event generates multiple Created events that are handled by your component. For example, if you use a FileSystemWatcher component to monitor the creation of new files in a directory, and then test it by using Notepad to create a file, you may see two Created events generated even though only a single file was created. This is because Notepad performs multiple file system actions during the writing process. Notepad writes to the disk in batches that create the content of the file and then the file attributes. Other applications may perform in the same manner. Because FileSystemWatcher monitors the operating system activities, all events that these applications fire will be picked up.
Source
My custom solution
I personally like using return to prevent a block of code from running when checking something, so here is my method:
var watching = false;

fs.watch('./file.txt', () => {
    if (watching) return;
    watching = true;

    // do something

    // the timeout is to prevent the script from running twice with short functions
    // the delay can be longer to disable the function for a set time
    setTimeout(() => {
        watching = false;
    }, 100);
});
Feel free to use this example to simplify your code. It may NOT be better than using a module from others, but it works pretty well!
Similar/same problem. I needed to do some stuff with images when they were added to a directory. Here's how I dealt with the double firing:
var fs = require('fs');
var working = false;

fs.watch('directory', function (event, filename) {
    if (filename && event == 'change' && working == false) {
        working = true;
        //do stuff to the new file added
        working = false;
    }
});
It will ignore the second firing until it finishes what it has to do with the new file.
I'm dealing with this issue for the first time, so all of the answers so far are probably better than my solution. However, none of them were 100% suitable for my case, so I came up with something slightly different: I used an XOR operation to flip an integer between 0 and 1, effectively keeping track of and ignoring every second event on the file:
var targetFile = "./watchThis.txt";
var flippyBit = 0;

fs.watch(targetFile, { persistent: true }, function (event, filename) {
    if (event == 'change') {
        if (!flippyBit) {
            fs.readFile(targetFile, "utf8", function (error, data) {
                gotUpdate(data);
            });
        } else {
            console.log("Doing nothing thanks to flippybit.");
        }
        flipBit(); // flip on every event, whether we acted on it or not
    }
});

// Whatever we want to do when we see a change
function gotUpdate(data) {
    console.log("Got some fresh data:");
    console.log(data);
}

// Toggling this gives us the "every second update" functionality
function flipBit() {
    flippyBit = flippyBit ^ 1;
}
I didn't want to use a time-related function (like jwymanm's answer) because the file I'm watching could hypothetically get legitimate updates very frequently. And I didn't want to use a list of watched files like Erik P suggests, because I'm only watching one file. Jan Święcki's solution seemed like overkill, as I'm working on extremely short and simple files in a low-power environment. Lastly, Bernado's answer made me a little nervous – it would only ignore the second update if it arrived before I'd finished processing the first, and I can't handle that kind of uncertainty. If anyone were to find themselves in this very specific scenario, there might be some merit to the approach I used? If there's anything massively wrong with it please do let me know/edit this answer, but so far it seems to work well?
NOTE: Obviously this strongly assumes that you'll get exactly 2 events per real change. I carefully tested this assumption, obviously, and learned its limitations. So far I've confirmed that:
Modifying a file in Atom editor and saving triggers 2 updates
touch triggers 2 updates
Output redirection via > (overwriting file contents) triggers 2 updates
Appending via >> sometimes triggers 1 update!*
I can think of perfectly good reasons for the differing behaviours but we don't need to know why something is happening to plan for it – I just wanted to stress that you'll want to check for yourself in your own environment and in the context of your own use cases (duh) and not trust a self-confessed idiot on the internet. That being said, with precautions taken I haven't had any weirdness so far.
* Full disclosure, I don't actually know why this is happening, but we're already dealing with unpredictable behaviour with the watch() function so what's a little more uncertainty? For anyone following along at home, more rapid appends to a file seem to cause it to stop double-updating but honestly, I don't really know, and I'm comfortable with the behaviour of this solution in the actual case it'll be used, which is a one-line file that will be updated (contents replaced) like twice per second at the fastest.
The first event is 'change' and the second is 'rename'. We can tell the difference in the listener function:
function(event, filename) {
}
The listener callback gets two arguments (event, filename). event is either 'rename' or 'change', and filename is the name of the file which triggered the event.
// rm sourcefile targetfile
fs.watch(sourcefile_dir, function (event, targetfile) {
    console.log(targetfile, 'is', event);
});
As a sourcefile is renamed to targetfile, it will in fact fire three events:
null is rename // sourcefile no longer exists
targetfile is rename
targetfile is change
Notice that if you want to catch all three events, watch the dir of the sourcefile.
I sometimes get multiple registrations of the watch event, causing it to fire several times. I solved it by keeping a list of watched files and avoiding registering the event if the file is already in the list:
var watchlist = {};

function initwatch(fn, callback) {
    if (!watchlist[fn]) {
        watchlist[fn] = true;
        fs.watch(fn).on('change', callback);
    }
}
......
As other answers say, this has a lot of problems, but I deal with it this way:
var folder = "/folder/path/";
var active = true; // flag control

fs.watch(folder, function (event, filename) {
    if (event === 'rename' && active) { // you can remove this "check" event
        active = false;
        // ... its just an example
        for (var i = 0; i < 100; i++) {
            console.log(i);
        }
        // ... other stuffs and delete the file
        if (!active) {
            try {
                fs.unlinkSync(folder + filename);
            } catch (err) {
                console.log(err);
            }
            active = true;
        }
    }
});
Hope this helps...
Easiest solution:
const watch = (path, opt, fn) => {
    var lock = false;
    fs.watch(path, opt, function () {
        if (!lock) {
            lock = true;
            fn();
            setTimeout(() => lock = false, 1000);
        }
    });
};

watch('/path', { interval: 500 }, function () {
    // ...
});
I was downloading a file with puppeteer, and once the file was saved I was sending an automatic email. Due to the problem above, I noticed I was sending 2 emails. I solved it by stopping my application using process.exit() and auto-starting with pm2. Using flags in code didn't save me.
If anyone has this problem in the future, this solution can be used as well: exit the program and restart it automatically with monitoring tools.
Here's my simple solution. It works well every time.
// Update obj as file updates
obj = JSON.parse(fs.readFileSync('./file.json', 'utf-8'));

fs.watch('./file.json', () => {
    const data = JSON.parse(fs.readFileSync('./file.json', 'utf-8') || '{}');
    if (Object.entries(data).length > 0) { // This checks fs.watch() isn't false-firing
        obj = data;
        console.log('File actually changed: ', obj);
    }
});
I came across the same issue. If you don't want to trigger multiple times, you can use a debounce function.
const _ = require('lodash');

fs.watch('example.xml', _.debounce(function (eventType, filename) {
    // on file change we can read the new xml
    fs.readFile('example.xml', 'utf8', function (err, data) {
        if (err) throw err;
        console.dir(data);
        console.log('Done');
    });
}, 100));
Debouncing The Observer
A solution I arrived at was that (a) there needs to be a workaround for the problem in question, and (b) there needs to be a way to ensure multiple rapid Ctrl+S actions do not cause race conditions. Here's what I have...
./**/utilities.js (somewhere)
export default {
    ...
    debounce(fn, delay) { // #thxRemySharp https://remysharp.com/2010/07/21/throttling-function-calls/
        var timer = null;
        return function execute(...args) {
            var context = this;
            clearTimeout(timer);
            timer = setTimeout(fn.bind(context, ...args), delay);
        };
    },
    ...
};
./**/file.js (elsewhere)
import utilities from './**/utilities.js'; // somewhere

...

function watch(server) {
    const debounced = utilities.debounce(observeFilesystem.bind(this, server), 1000 * 0.25);
    const observers = new Set()
        .add(fs.watch('./src', debounced))
        .add(fs.watch('./index.html', debounced))
        ;
    console.log(`watching... (${observers.size})`);
    return observers;
}

function observeFilesystem(server, type, filename) {
    if (!filename) console.warn(`Transfer Dev Server: filesystem observation made without filename for type ${type}`);
    console.log(`Filesystem event occurred:`, type, filename);
    server.close(handleClose);
}

...
This way, the observation-handler that we pass into fs.watch is (in this case) a bound function which gets debounced if multiple calls are made less than 1000 * 0.25 seconds (250ms) apart from one another.
It may be worth noting that I have also devised a pipeline of Promises to help avoid other types of Race Conditions as the code also leverages other callbacks. Please also note the attribution to Remy Sharp whose debounce function has repeatedly proven very useful over the years.
var watcher = fs.watch('example.xml', function (eventType, filename) {
    watcher.close();
    fs.readFile('example.xml', 'utf8', function (err, data) {
        if (err) throw err;
        console.dir(data);
        console.log('Done');
    });
});
I had a similar problem, but I was also reading the file in the callback, which caused a loop.
This is where I found how to close watcher:
How to close fs.watch listener for a folder
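If you still need to keep watching after the read, the same close() trick can be wrapped in a small re-arming helper (a sketch; watchOnce is a hypothetical name, and changes that happen during the read are missed):
var fs = require('fs');

function watchOnce(file, onChange) {
    var watcher = fs.watch(file, function () {
        watcher.close(); // swallow the duplicate event(s) for this change
        onChange(function rearm() {
            watchOnce(file, onChange); // start watching again when told to
        });
    });
}

watchOnce('example.xml', function (done) {
    fs.readFile('example.xml', 'utf8', function (err, data) {
        if (err) throw err;
        console.dir(data);
        done(); // re-arm only after the read completes
    });
});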
NodeJS does not fire multiple events for a single change; it is the editor you are using that updates the file multiple times.
Editors use stream API for efficiency, they read and write data in chunks which causes multiple updates depending on the chunks size and the amount of content. Here is a snippet to test if fs.watch fires multiple events:
const http = require('http');
const fs = require('fs');
const path = require('path');

const host = 'localhost';
const port = 3000;
const file = path.join(__dirname, 'config.json');

const requestListener = function (req, res) {
    const data = new Date().toString();
    fs.writeFileSync(file, data, { encoding: 'utf-8' });
    res.end(data);
};

const server = http.createServer(requestListener);
server.listen(port, host, () => {
    fs.watch(file, (eventType, filename) => {
        console.log({ eventType });
    });
    console.log(`Server is running on http://${host}:${port}`);
});
I believe a simple solution would be checking for the last modified timestamp:
const { stat } = require('fs/promises');

let lastModified;

fs.watch(file, (eventType, filename) => {
    stat(file).then(({ mtimeMs }) => {
        if (lastModified !== mtimeMs) {
            lastModified = mtimeMs;
            console.log({ eventType, filename });
        }
    });
});
Please note that you need to use all-sync or all-async methods consistently, otherwise you will have issues. Update the file in an editor and you will see that only a single event is logged:
const http = require('http');
const fs = require('fs');
const path = require('path');

const host = 'localhost';
const port = 3000;
const file = path.join(__dirname, 'config.json');

let lastModified;

const requestListener = function (req, res) {
    const data = Date.now().toString();
    fs.writeFileSync(file, data, { encoding: 'utf-8' });
    lastModified = fs.statSync(file).mtimeMs;
    res.end(data);
};

const server = http.createServer(requestListener);
server.listen(port, host, () => {
    fs.watch(file, (eventType, filename) => {
        const mtimeMs = fs.statSync(file).mtimeMs;
        if (lastModified !== mtimeMs) {
            lastModified = mtimeMs;
            console.log({ eventType });
        }
    });
    console.log(`Server is running on http://${host}:${port}`);
});
A few notes on the alternative solutions: storing files for comparison will be memory-inefficient, especially with large files; taking file hashes will be expensive; custom flags are hard to keep track of, especially if you want to detect changes made by other applications; and lastly, unsubscribing and re-subscribing requires unnecessary juggling.
If you don't need an instant result, you can use setTimeout to debounce successive events:
let timeoutId;

fs.watch(file, (eventType, filename) => {
    clearTimeout(timeoutId);
    timeoutId = setTimeout(() => {
        console.log({ eventType });
    }, 100);
});
