How to remove module after "require" in node.js? - node.js

Let's say I require a module and do something with it, as below:
var b = require('./b.js');
--- do something with b ---
Then I want to remove module b (i.e. clean up the cache). How can I do that?
The reason is that I want to dynamically load, remove, or update the module without restarting the Node server. Any ideas?
------- more --------
Based on the suggestion to delete from require.cache, it still doesn't work...
I did a few things:
1) delete require.cache[require.resolve('./b.js')];
2) loop over every require.cache entry's children and remove any child that is b.js
3) delete b
However, when I call b, it is still there! It is still accessible, unless I do this:
b = {};
Not sure if that is a good way to handle it,
because if I later require('./b.js') again after b.js has been modified, will it load the old cached b.js (which I tried to delete), or the new one?
----------- More finding --------------
OK, I did more testing and playing around with the code. Here is what I found:
1) delete require.cache[] is essential. Only once the entry is deleted does a modified b.js take effect the next time I load it.
2) Looping through require.cache[] and deleting any entry in children with the full filename of b.js has no effect, i.e. you can delete it or leave it. However, I'm unsure whether there are any side effects. I think it is a good idea to keep things clean and delete it if there is no performance impact.
3) Of course, assigning b = {} isn't really necessary, but I think it is useful to keep things clean.

You can use this to delete its entry in the cache:
delete require.cache[require.resolve('./b.js')]
require.resolve() will figure out the full path of ./b.js, which is used as a cache key.
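To see the effect end to end, here is a minimal sketch that writes a throwaway module to a temp directory, changes it on disk, and reloads it. The helper name requireFresh and the temp file are my own assumptions for illustration, not part of Node's API.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: drop the cache entry, then load the file again.
function requireFresh(modulePath) {
  const key = require.resolve(modulePath); // full filename, used as the cache key
  delete require.cache[key];
  return require(key);
}

// Demo with a throwaway module written to a temp directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'reload-'));
const file = path.join(dir, 'b.js');

fs.writeFileSync(file, 'module.exports = { version: 1 };');
const first = require(file);

fs.writeFileSync(file, 'module.exports = { version: 2 };');
const cached = require(file);     // still served from the cache
const fresh = requireFresh(file); // re-reads the file from disk

console.log(first.version, cached.version, fresh.version); // 1 1 2
```

Variables that already hold the old exports (first and cached here) keep pointing at the old copy; only new require calls see the change.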

Spent some time trying to clear cache in Jest tests for Vuex store with no luck. Seems like Jest has its own mechanism that doesn't need manual call to delete require.cache.
beforeEach(() => {
jest.resetModules();
});
And tests:
let store;
it("1", () => {
process.env.something = true;
store = require("#/src/store.index");
});
it("2", () => {
process.env.something = false;
store = require("#/src/store.index");
});
Both stores will be different modules.

One of the easiest ways (although not the best in terms of performance, as even unrelated modules' caches get cleared) would be to simply purge every module in the cache.
Note that clearing the cache for *.node files (native modules) might cause undefined behaviour and therefore is not supported (https://github.com/nodejs/node/commit/5c14d695d2c1f924cf06af6ae896027569993a5c), so there needs to be an if statement to ensure those don't get removed from the cache, too.
for (const path in require.cache) {
if (path.endsWith('.js')) { // only clear *.js, not *.node
delete require.cache[path]
}
}

I found this useful for client side applications. I wanted to import code as I needed it and then garbage collect it when I was done. This seems to work. I'm not sure about the cache, but it should get garbage collected once there is no more reference to module and CONTAINER.sayHello has been deleted.
/* my-module.js */
function sayHello() { console.log("hello"); }
export { sayHello };
/* somewhere-else.js */
const CONTAINER = {};
import("my-module.js").then(module => {
CONTAINER.sayHello = module.sayHello;
CONTAINER.sayHello(); // hello
delete CONTAINER.sayHello;
console.log(CONTAINER.sayHello); // undefined
});

I have found the easiest way to handle invalidating the cache is actually to reset the exposed cache object. When deleting individual entries from the cache, the child dependencies become a bit troublesome to iterate through.
require.cache = {};
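One caveat, as far as I can tell from how Node wires things up: assigning a new object only rebinds the cache property on the current module's require function, while Node keeps consulting its own internal cache object. A safer variant is to empty the existing object in place. The sketch below does that; clearRequireCache is a hypothetical helper name, and the temp module exists only for the demo.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: empty the shared cache object in place
// instead of replacing it with a new object.
function clearRequireCache() {
  for (const key of Object.keys(require.cache)) {
    if (!key.endsWith('.node')) {
      delete require.cache[key]; // leave native addons alone
    }
  }
}

// Demo with a throwaway module.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'purge-'));
const file = path.join(dir, 'thing.js');
fs.writeFileSync(file, 'module.exports = {};');

const a = require(file);
clearRequireCache();
const b = require(file); // evaluated again, so a new exports object

console.log(a === b); // false
```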

Related

Nodejs "closing directory handle on garbage collection"

Hello, the following is an excerpt from my code:
let dirUtility = async (...args) => {
let dir = await require('fs').promises.opendir('/path/to/some/dir...');
let entries = dir.entries();
for await (let childDir of dir) doStuffWithChildDir(childDir);
return entries;
};
This function is called a fair bit in my code. I have the following in my logs:
(node:7920) Warning: Closing directory handle on garbage collection
(Use `node --trace-warnings ...` to show where the warning was created)
(node:7920) Warning: Closing directory handle on garbage collection
(node:7920) Warning: Closing directory handle on garbage collection
(node:7920) Warning: Closing directory handle on garbage collection
(node:7920) Warning: Closing directory handle on garbage collection
What exactly is the significance of these errors?
Do they indicate a large issue? (Should I simply seek to silence these errors?)
What is the best way to avoid this issue?
Thanks!
Raina77ow’s answer tells you why the warning is displayed.
Basically, what's happening is that the Node.js runtime is implicitly calling the close() method on the dir object, but the best practice is to call close() on the handle explicitly, or even better, to wrap the work in a try..finally block.
Like this:
let dirUtility = async (...args) => {
let dir = await require('fs').promises.opendir('/path/to/some/dir...');
try {
let entries = dir.entries();
for await (let childDir of dir) doStuffWithChildDir(childDir);
return entries;
}
finally {
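// for await already closes the handle when the loop finishes, so close()
// can reject with ERR_DIR_CLOSED here; ignore that rejection.
await dir.close().catch(() => {});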
// return some dummy value, or return undefined.
}
};
Quoting the comments (source):
// If the close was successful, we still want to emit a process
// warning to notify that the file descriptor was gc'd. We want to be
// noisy about this because not explicitly closing the DirHandle is a
// bug.
While your code seems to be really similar to the code in this question, there's a difference:
let entries = dir.entries();
...
return entries;
That, in a nutshell, seems to create an additional iterator over the directory, which is passed outside as the function's return value. How exactly this iterator is employed is not clear (as you don't show what happens next with dirUtility), but either it's not exhausted before GC takes its toll, or it's done in a way that confuses Node.js.
Overall, the whole approach doesn't seem right to me: the function seems both to do something with a directory AND, essentially, give that directory back as its result, without actually caring how this object will be used. That, at least, looks like a leaky abstraction.
So it seems you need to decide: if you don't actually use the return value of dirUtility, just drop the corresponding lines of code. But if you do need to keep the directory open (for example, for performance reasons), consider creating a stateful wrapper around it that encapsulates the handle. That should prevent the handle from being garbage collected for as long as the wrapper object lives in your code.
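If the caller only needs the directory contents rather than the open handle, one option is to return plain data. This is a rough sketch under that assumption; listEntryNames and demo are hypothetical names, and it relies on the documented behaviour that iterating a Dir with for await closes the handle once the loop exits.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical rewrite: read every entry, then hand back plain names
// instead of an iterator tied to the open handle.
async function listEntryNames(dirPath) {
  const dir = await fs.promises.opendir(dirPath);
  const names = [];
  // The async iterator closes the handle when the loop exits,
  // including when the loop body throws.
  for await (const dirent of dir) {
    names.push(dirent.name);
  }
  return names;
}

// Usage with a throwaway directory containing two empty files.
async function demo() {
  const tmp = fs.mkdtempSync(path.join(os.tmpdir(), 'dir-'));
  fs.writeFileSync(path.join(tmp, 'a.txt'), '');
  fs.writeFileSync(path.join(tmp, 'b.txt'), '');
  const names = (await listEntryNames(tmp)).sort();
  console.log(names); // [ 'a.txt', 'b.txt' ]
  return names;
}

demo().catch(console.error);
```

Nothing that escapes the function refers to the handle, so there is nothing left for the garbage collector to close.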

Save value out of nodejs server

Is there a way where I can save a timestamp out of my application / object, so when I restart the nodeserver I can get that value?
I need this for my cronjob. I need to save the last synching even though I restart the server.
There are all sorts of ways to save this sort of information so you can load it when you restart your node process. One is to write it to a file in your file system, then read it when you start your program.
To write the current timestamp to a file do this.
const fs = require('fs')
...
fs.writeFile('timestamp.txt', Date.now().toString(), err => { if (err) console.error(err) })
To read it do this.
const fs = require('fs')
...
const timestamp = Number(fs.readFileSync('timestamp.txt'))
Obviously there's more programming to do to put the file in the correct directory, to handle errors, and to cope with the case where you attempt to read the file before writing it. But that's the idea.
You can also store it in some kind of database, but a file should do for now, unless you're using a system like Heroku where files don't always persist from run to run.
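Here is a small sketch of the file approach that also copes with reading before anything has been written. The helper names saveTimestamp and loadTimestamp and the temp-directory location are assumptions for the demo; in a real app you would pick a fixed path.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: persist a timestamp (defaults to now) as text.
function saveTimestamp(file, timestamp = Date.now()) {
  fs.writeFileSync(file, String(timestamp));
}

// Hypothetical helper: return the saved number, or null if there is none.
function loadTimestamp(file) {
  try {
    const value = Number(fs.readFileSync(file, 'utf8'));
    return Number.isFinite(value) ? value : null;
  } catch (err) {
    if (err.code === 'ENOENT') return null; // nothing saved yet
    throw err;
  }
}

const tsFile = path.join(fs.mkdtempSync(path.join(os.tmpdir(), 'sync-')), 'timestamp.txt');
const before = loadTimestamp(tsFile);
saveTimestamp(tsFile, 1700000000000);
const after = loadTimestamp(tsFile);

console.log(before, after); // null 1700000000000
```

The synchronous calls are fine at startup or in a cron-style job; inside a request handler the async equivalents would be the better choice.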
When a process dies, all data stored in its working memory (such as variables and functions) die with it.
I recently wrote an npm package cashola that makes it easier to store this data across process restarts.
You can run this example script twice and see how the print statements differ each time.
import { rememberSync } from 'cashola';
const myState = rememberSync('timestamp-example');
console.log('Before:', myState);
// First run: {}
// Second run: { <timeString1>: 'hi!' }
myState[new Date().getTime().toString()] = 'hi!';
console.log('After:', myState);
// First run: { <timeString1>: 'hi!' }
// Second run: { <timeString1>: 'hi!', <timeString2>: 'hi!' }

How to capture only the fields modified by user

I am trying to build a logging mechanism to log changes made to a record. I am currently logging the previous and the new record. However, as the site is very busy, I expect the log file to grow huge. To avoid this, I plan to capture only the modified fields.
Is there a way to capture only the modifications made to a record (in React), so that my {request.body} has fewer fields?
My server side is built with Node.js and the client side is React.
One approach you might want to consider is to add an onChange(universal) or onTextChanged(native) listener to the text field and store the form update in a local state/variables.
Finally, when a user makes an action (submit, etc.) you can send the updated data to the logging module.
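As a rough illustration of that idea, you can keep the record as it was loaded and, on submit, send only the keys whose values differ. The function name changedFields and the sample record are made up for the example, and the comparison is shallow, so nested objects would need a deep comparison instead.

```javascript
// Hypothetical helper: compare the record as loaded with the edited form state
// and keep only the fields whose values changed (shallow comparison).
function changedFields(original, edited) {
  const changes = {};
  for (const key of Object.keys(edited)) {
    if (!Object.is(original[key], edited[key])) {
      changes[key] = edited[key];
    }
  }
  return changes;
}

const original = { id: 7, name: 'Ann', email: 'ann@example.com' };
const edited = { id: 7, name: 'Ann', email: 'ann@example.org' };

console.log(changedFields(original, edited)); // { email: 'ann@example.org' }
```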
The best way I found, and the one that works for me, is this:
on the API server side, where I handle the update request, before hitting the database, I compute the difference between the previous record and {request.body} using lodash, and pass the result to my update database function.
var _ = require('lodash');
const difference = (object, base) => {
function changes(object, base) {
return _.transform(object, function (result, value, key) {
if (!_.isEqual(value, base[key])) {
result[key] = (_.isObject(value) && _.isObject(base[key])) ? changes(value, base[key]) : value;
}
});
}
return changes(object, base);
}
module.exports = difference
I saved the above code in a file named diff.js and included it in my server-side file.
It worked well.
Thanks for giving the idea...

Best way to reuse a large translation file within Node / Express

I'm new to Node but I figured I'd jump right in and start converting a PHP app into Node/Express. It's a bilingual app that uses gettext with PO/MO files. I found a Node module called node-gettext. I'd rather not convert the PO files into another format right now, so it seems this library is my only option.
So my concern is that right now, before every page render, I'm doing something like this:
exports.home_index = function(req, res)
{
var gettext = require('node-gettext'),
gt = new gettext();
var fs = require('fs');
gt.textdomain('de');
var fileContents = fs.readFileSync('./locale/de.mo');
gt.addTextdomain('de', fileContents);
res.render(
'home/index.ejs',
{ gt: gt }
);
};
I'll also be using the translations in classes, so with how it's set up now I'd have to load the entire translation file again every time I want to translate something in another place.
The translation file is about 50k and I really don't like having to do file operations like this on every page load. In Node/Express, what would be the most efficient way to handle this (aside from a database)? Usually a user won't even be changing their language after the first time (if they're changing it from English).
EDIT:
Ok, I have no idea if this is a good approach, but it at least lets me reuse the translation file in other parts of the app without reloading it everywhere I need to get translated text.
In app.js:
var express = require('express'),
app = express(),
...
gettext = require('node-gettext'),
gt = new gettext();
Then, also in app.js, I create the variable app.locals.gt to contain the gettext/translation object, and I include my middleware function:
app.locals.gt = gt;
app.use(locale());
In my middleware file I have this:
module.exports = function locale()
{
return function(req, res, next)
{
// do stuff here to populate lang variable
var fs = require('fs');
req.app.locals.gt.textdomain(lang);
var fileContents = fs.readFileSync('./locales/' + lang + '.mo');
req.app.locals.gt.addTextdomain(lang, fileContents);
next();
};
};
It doesn't seem like a good idea to assign the loaded translation file to app, since depending on the current request that file will be one of two languages. If I assigned the loaded translation file to app instead of a request variable, can that mix up users' languages?
Anyway, I know there's got to be a better way of doing this.
The simplest option would be to do the following:
Add this in app.js:
app.locals.languageDomains = {}; // must live on app.locals, since the middleware reads req.app.locals.languageDomains
Then modify your Middleware:
module.exports = function locale()
{
return function(req, res, next)
{
// do stuff here to populate lang variable
if ( !req.app.locals.languageDomains[lang] ) {
var fs = require('fs');
var fileContents = fs.readFileSync('./locales/' + lang + '.mo');
req.app.locals.languageDomains[lang] = true;
req.app.locals.gt.addTextdomain(lang, fileContents);
}
req.textdomain = req.app.locals.gt.textdomain(lang);
next();
};
};
By checking whether the file has already been loaded, you prevent the load from happening multiple times, and the domain data stays resident in the server's memory. The downside of this simple solution is that if you change the contents of your .mo files whilst the server is running, the changes won't be taken into account. However, this code could be extended to keep an eye on the mtime of the files and reload accordingly, or to make use of fs.watchFile if required:
if ( !req.app.locals.languageDomains[lang] ) {
var fs = require('fs'), filename = './locales/' + lang + '.mo';
var fileContents = fs.readFileSync(filename);
fs.watchFile(filename, function (curr, prev) {
req.app.locals.gt.addTextdomain(lang, fs.readFileSync(filename));
});
req.app.locals.languageDomains[lang] = true;
req.app.locals.gt.addTextdomain(lang, fileContents);
}
Warning: It should also be noted that using sync versions of functions outside of server initialisation is not a good idea because it can freeze the thread. You'd be better off changing your sync loading to the async equivalent.
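A sketch of what the async version could look like: read each language file at most once with fs.promises.readFile and remember the pending promise, so concurrent requests share one read. createDomainLoader, ensureDomain and the register callback are hypothetical names; in the middleware above, register would be the place to call gt.addTextdomain(lang, contents).

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical factory: returns a function that loads each language once.
function createDomainLoader(localeDir, register) {
  const pendingByLang = new Map(); // lang -> promise of "file read and registered"
  return function ensureDomain(lang) {
    if (!pendingByLang.has(lang)) {
      const file = path.join(localeDir, lang + '.mo');
      const pending = fs.promises.readFile(file)
        .then((contents) => { register(lang, contents); })
        .catch((err) => {
          pendingByLang.delete(lang); // allow a retry after a failed read
          throw err;
        });
      pendingByLang.set(lang, pending);
    }
    return pendingByLang.get(lang);
  };
}

// Demo with a fake catalogue file; two concurrent calls trigger one read.
async function demo() {
  const localeDir = fs.mkdtempSync(path.join(os.tmpdir(), 'locales-'));
  fs.writeFileSync(path.join(localeDir, 'de.mo'), 'fake catalogue bytes');
  const registered = [];
  const ensureDomain = createDomainLoader(localeDir, (lang) => registered.push(lang));
  await Promise.all([ensureDomain('de'), ensureDomain('de')]);
  console.log(registered); // [ 'de' ]
  return registered;
}

demo().catch(console.error);
```

In the middleware you would call ensureDomain(lang).then(() => next(), next), so the request waits for the load without blocking the thread.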
After the above changes, rather than passing gt to your template, you should be able to use req.textdomain instead. It seems that the gettext library supports a number of requests directly on each domain object, which means you hopefully don't need to refer to the global gt object on a per-request basis (which would be changing its default domain on each request):
Each domain supports:
getTranslation
getComment
setComment
setTranslation
deleteTranslation
compilePO
compileMO
Taken from here:
https://github.com/andris9/node-gettext/blob/e193c67fdee439ab9710441ffd9dd96d027317b9/lib/domain.js
update
A little bit of further clarity.
Once the server has loaded the file into memory the first time, it should remain there for all subsequent connections it receives (for any visitor/request) because it is stored globally and won't be garbage collected, unless you remove all references to the data, which would mean gettext would need to have some kind of unload/forget domain method.
Node is different to PHP in that its environment is shared and wraps its own HTTP server (if you are using something like Express), which means it is very easy to remember data globally as it has a constant environment that all the code is executed within. PHP is always executed after the HTTP server has received and dealt with the request (e.g. Apache). Each PHP response is then executed in its own separate run-time, which means you have to rely on databases, sessions and cache stores to share even simple information and most resources.
further optimisations
Obviously, with the above you are still running translations on each page load, which means the gettext library will still be using the translation data resident in memory, and that takes up processing time. To get around this, it would be best to make sure your URLs have something that makes them unique for each language, e.g. my-page/en/ or my.page.fr or even jp.domain.co.uk/my-page, and then enable some kind of full-page caching using something like memcached or express-view-cache. However, once you start caching pages you need to make certain there aren't any regions that are user specific; if there are, you need to start implementing more complicated systems that are sensitive to these areas.
Remember: The golden rule of optimisation, don't do so before you need to... basically meaning I wouldn't worry about page caching until you know it's going to be an issue, but it is always worth bearing in mind what your options are, as it should shape your code design.
update 2
Just to illustrate a bit further on the behaviour of a server running in JavaScript, and how the global behaviour is not just a property of req.app, but in fact any object that is further up the scope chain.
So, as an example, instead of setting up languageDomains in your app.js, you could instantiate it as a plain variable further up the scope of wherever your middleware is placed. It's best to keep your global entities in one place, however, so app.js is the better place; this is just for illustration.
var languageDomains = {};
module.exports = function locale()
{
/// you can still access languageDomains here, and it will behave
/// globally for the entire server.
languageDomains[lang]
}
So basically, whereas with PHP the entire code base is re-executed on each request (so languageDomains would be instantiated anew each time), in Node the only part of the code to be re-executed is the code within locale() (because it is triggered as part of a new request). This function will still have a reference to the already existing and defined languageDomains via the scope chain. Because languageDomains is never reset (on a per-request basis), it will behave globally.
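A tiny simulation of that behaviour, with no Express needed: the object at module scope is created once, and every call of the handler sees the same instance. The hits counter, the req.lang property, and the stand-in req/res objects are assumptions made up for the demo.

```javascript
// Module scope: created once, when this file is first loaded.
const hits = {};

function locale() {
  // The returned function runs once per request and shares the object above.
  return function (req, res, next) {
    hits[req.lang] = (hits[req.lang] || 0) + 1;
    next();
  };
}

// Simulate three requests with stand-in req/res objects.
const middleware = locale();
for (const lang of ['de', 'en', 'de']) {
  middleware({ lang }, {}, () => {});
}

console.log(hits); // { de: 2, en: 1 }
```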
Concurrent users
Node.js is single threaded. This means that in order for it to be concurrent i.e. handle multiple requests at the "same" time, you have to code your app in such a way that each little part can be executed very quickly and then slip into a waiting state, whilst another part of another request is dealt with.
This is the reason for the asynchronous and callback nature of Node, and the reason to avoid Sync calls whilst your app is running. Any one Sync request could halt or freeze execution of the thread and delay handling for all other requests. The reason why I state this is to give you a better idea of how multiple users might interact with your code (and global objects).
Basically, once a request is being dealt with by your server, that request is its only focus until that particular execution cycle ends, i.e. your request handler stops calling other code that needs to run synchronously. Once that happens, the next queued item is dealt with (a callback or something); this could be part of another request, or it could be the next part of the current request.
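You can watch this happen with a timer and a busy loop. In this sketch the loop stands in for a slow Sync call; the 50 ms duration is an arbitrary choice for the demo.

```javascript
const order = [];

// Queued immediately, but it cannot run while synchronous code is executing.
setTimeout(() => order.push('timer callback'), 0);

// Synchronous work: nothing else runs until this loop finishes.
const end = Date.now() + 50;
while (Date.now() < end) { /* busy wait stands in for a slow Sync call */ }
order.push('sync work done');

// Print once the queued timer has had its turn.
setTimeout(() => console.log(order), 10); // [ 'sync work done', 'timer callback' ]
```

Replace the timer callback with another user's request handler and you have the reason one long Sync call delays everyone.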

What does delete cache mean in Nodejs

Please find below a sample code in nodejs:
var hello_file = require.resolve('hello')
var hello = require('hello')
console.log(hello.hello()); // there is a method hello in module hello.js
delete require.cache[hello_file]
console.log(hello.hello()); // it still works
I thought the delete would remove the reference to module and hence the last line should throw an error. But it does not. What could be the reason and what does delete cache really mean?
The cache doesn't know about it anymore but your var hello still has a reference to what was previously loaded.
The next time you call require('hello') it will load the module from the file. But, until you update the reference that var hello is holding, it will continue to point to the originally loaded module.
As you know, Node loads a module only once even if you require it many times: modules are cached after the first time they are loaded. If you delete a module from the cache, the next require will reload it from the filesystem and cache it again.
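A short sketch that makes both points visible. It writes a throwaway hello.js to a temp directory (an assumption for the demo, since the original module isn't shown) and compares the old reference with a fresh require.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Throwaway module standing in for hello.js.
const file = path.join(fs.mkdtempSync(path.join(os.tmpdir(), 'hello-')), 'hello.js');
fs.writeFileSync(file, 'exports.hello = () => "hello";');

let hello = require(file);
delete require.cache[require.resolve(file)];

console.log(hello.hello());      // hello (the variable still holds the old exports)

const reloaded = require(file);  // cache miss, so the file is evaluated again
console.log(reloaded === hello); // false (a brand-new exports object)

hello = reloaded;                // update the reference to use the new copy
```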
