modify module but not by hacking - node.js

I'm using node.js modules installed via npm.
I'm wondering what the best way is to modify a node module's functionality.
Let's say I have a module called Handler with a method called foo that takes a request object and returns a response object.
1) What if I want to do something to the response before it gets returned?
Do I just modify the code itself?
Are there any articles on this?
UPDATE --
Also, the original function modifies a few objects that are not returned, but I want to modify them too. How would I handle that?

What I would do here is create a wrapper around the function and change the result in there. If that was unclear, here's some code:
var myModule = require('myModule');
var myModuleFunc = myModule.myFunc;
myModule.myFunc = function() {
    var res = myModuleFunc.apply(this, arguments); // call the original function, passing along context and arguments
    res = transform(res); // whatever you want to do to the response
    return res;
};
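The same wrapper can also cover the UPDATE part of the question, as long as the objects the original function modifies are reachable from the wrapper (for example, because they are arguments or properties of the module). The following is only a sketch: the module object, its stats property, and the handledBy and lastBody fields are all made up for illustration.

```javascript
// Hypothetical module standing in for require('myModule').
var myModule = {
  stats: { calls: 0 },
  myFunc: function (request) {
    this.stats.calls += 1;      // side effect on an object that is not returned
    request.handled = true;     // side effect on an argument
    return { body: 'hello ' + request.name };
  }
};

// Keep a reference to the original, then replace it with a wrapper.
var originalFunc = myModule.myFunc;
myModule.myFunc = function (request) {
  var res = originalFunc.apply(this, arguments); // run the original first
  res.body = res.body.toUpperCase();             // change the return value
  request.handledBy = 'wrapper';                 // adjust an argument the original mutated
  this.stats.lastBody = res.body;                // adjust module state the original touched
  return res;
};

var req = { name: 'world' };
var out = myModule.myFunc(req);
console.log(out.body, req.handledBy, myModule.stats.calls);
```

If the original function modifies objects that are private to the module (closed over and never exposed), a wrapper cannot reach them, and this approach does not apply.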

Related

Module that returns function based on bool not returning correct function

I am using this library to control ffprobe in my script. To avoid creating more child processes in my script I need to use the synchronous version of the command from the library. According to the documentation it should be as easy as setting probe.SYNC = true.
However, regardless of the value of probe.SYNC calling the probe function always uses doProbe instead of using doProbeSync.
// index.js
import probe from 'node-ffprobe';
probe.SYNC = true;
let data = probe('video.mp4'); // Incorrectly calls doProbe()
// node-ffprobe.js
module.exports = (function () {
    function doProbeSync(file) {
        ...
    }
    function doProbe(file) {
        ...
    }
    return module.exports.SYNC ? doProbeSync : doProbe
})()
The only reason I can think of for this not working is that my script is written with ES6 syntax, but I am not sure how that would affect this.
How can I get the node-ffprobe library to use doProbeSync?
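One possible explanation, assuming the library really is structured like the excerpt above: the immediately invoked function runs once, while the module is being loaded, so module.exports.SYNC is read before the calling script has had a chance to set it. The sketch below reproduces that evaluation order with a plain object standing in for module; fakeModule and the string return values are illustrative only, not part of node-ffprobe.

```javascript
// A plain object stands in for Node's `module` so the sketch is self-contained.
var fakeModule = { exports: {} };

fakeModule.exports = (function () {
  function doProbeSync(file) { return 'doProbeSync:' + file; }
  function doProbe(file) { return 'doProbe:' + file; }
  // Evaluated exactly once, right now, while exports.SYNC is still undefined.
  return fakeModule.exports.SYNC ? doProbeSync : doProbe;
})();

var probe = fakeModule.exports;
probe.SYNC = true;                 // too late: the choice has already been made
var result = probe('video.mp4');
console.log(result);               // doProbe:video.mp4
```

If that is what happens, ES6 syntax is not the cause; the flag would have to be read at call time rather than at load time for the assignment to have any effect.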

Accessing outside variables inside the 'then' function

I am new to Node.js. I am using Bluebird promises to get the responses of an array of HTTP API calls and storing the derived results in Elasticsearch.
Everything is working fine, except I am unable to access the variables within the 'then' function. Below is my code:
Promise.map(bucket_paths, function(path) {
    this.path = path;
    return getJson.getStoreJson(things,path.path);
}, {concurrency:1}).then(function(bucketStats){
    bucketStats.map(function(bucketStat) {
        var bucket_stats_json = {};
        bucket_stats_json.timestamp = new Date();
        bucket_stats_json.name = path.name; // ==> NOT WORKING
    });
});
How can I access the path.name variable within the 'then'? The error says 'path' is undefined.
The best way to do this is to package the data you need from one part of the promise chain into the resolved value that is sent on to the next part of the chain. In your case, Promise.map() sends an array of data on to the .then() handler, so the cleanest way to pass each path down to the next stage is to make it part of each array entry that Promise.map() resolves. It appears you can just add it to the bucketStat data structure with an extra .then(), as shown below. When you get the data that corresponds to a path, you add the path to that data structure, so that later, when you walk through all the results, each object has a .path property.
You don't show any actual result here so I don't know what you're ultimately trying to end up with, but hopefully you can get the general idea from this.
Also, I switched to Promise.mapSeries() since that's a shortcut when you want concurrency set to 1.
Promise.mapSeries(bucket_paths, function(path) {
    return getJson.getStoreJson(things,path.path).then(bucketStat => {
        // add the path into this item's data so we can get to it later
        bucketStat.path = path;
        return bucketStat;
    });
}).then(function(bucketStats){
    return bucketStats.map(function(bucketStat) {
        var bucket_stats_json = {};
        bucket_stats_json.timestamp = new Date();
        bucket_stats_json.name = bucketStat.path.name;
        return bucket_stats_json;
    });
});

How can i access variable inside callback nodejs?

I'm unable to get the values into the meals object, although I have created a new object at the top. Can anyone tell me the best way to access a variable inside a callback function?
var meals = new Object();
passObj.data = _.map(passObj.data, (x)=> {
    x.mealImageUrl = !_.isNull(x.image_url) ? `${config.image_path}${x.image_url}` : x.image;
    dbHelpder.query(`select * from meals where meal_category = ${x.category_id}`,(error,result)=>{
        meals = x.result;
        passObj.total = 555
    });
    return x;
});
You need to use a callback again inside the callback function. :)
You are doing something asynchronous, which means the code does not run in sequence. (At least, that is how I keep it in mind; I don't know how others think about this.) So, the code should be:
function somehow(callback) { // you get the result from callback
    var meals = new Object();
    passObj.data = _.map(passObj.data, (x)=> {
        dbHelpder.query(`select * from meals where meal_category = ${x.category_id}`,(error,result)=>{
            meals = result; // assuming the rows arrive in `result`; the question used x.result, which is never set
            passObj.total = 555;
            callback(meals); // Here you get the result (called once per item)
        });
        return x;
    });
}
So, when you are going to use this function, it should be
function afterMeals(resultMeals) {
// do something on the meals
}
somehow(afterMeals);
Another technique, such as promises, can make this a bit clearer, but you cannot actually avoid callbacks.
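As a rough sketch of the promise variant mentioned above: wrap the callback-style query in a promise, start one query per item, and wait for all of them with Promise.all. Here fakeQuery is a hypothetical stand-in for dbHelpder.query, the items array stands in for passObj.data, and the SQL string is only a placeholder.

```javascript
// Hypothetical stand-in for dbHelpder.query: calls back asynchronously with (error, result).
function fakeQuery(sql, callback) {
  setTimeout(function () { callback(null, [{ sql: sql }]); }, 5);
}

// Wrap the callback API in a promise.
function queryAsync(sql) {
  return new Promise(function (resolve, reject) {
    fakeQuery(sql, function (error, result) {
      if (error) return reject(error);
      resolve(result);
    });
  });
}

var items = [{ category_id: 1 }, { category_id: 2 }];

// One query per item; the combined promise resolves once every query has finished.
var allMeals = Promise.all(items.map(function (x) {
  return queryAsync('select * from meals where meal_category = ' + x.category_id)
    .then(function (result) {
      x.meals = result; // attach the rows to the item they belong to
      return x;
    });
}));

allMeals.then(function (withMeals) {
  console.log(withMeals.length + ' items now carry their meals');
});
```

The values are still only available inside a callback (the one passed to .then()), but there is now a single place where all results are known to be ready.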
First of all, I cannot see what passObj exactly is, apparently it is defined elsewhere.
Secondly, callback functions don't work the way you seem to think they do. One typical reason to use them is to implement asynchronous calls, where returning a value is of no use.
The idea is as follows. Usually you have a call like this:
var myFunc1 = function(){
return 42;
}
var x = myFunc1();
myFunc2(x);
However, when myFunc1 is an asynchronous call, returning a value is impossible without using some sort of promise, which is a topic of its own. So if myFunc1 were an asynchronous call and the 42 came from, e.g., a server, then a plain return would not give you the value (you would typically get undefined), because the value has not been calculated and received yet when execution reaches return.
This is why callbacks were introduced. They allow for asynchronous calls and let you proceed the way you want after the call has finished. To show this on the example above:
var myFunc1 = function( myFunc2, params ){
    // do async stuff here, then call the callback function from myFunc1
    ...
    myFunc2(x);
}
So the asynchronous function doesn't return anything. It makes the calls or calculations it needs to make and when those are done (in the example that is when x has been declared and assigned a value) myFunc2, which is the callback function in our example, is called directly from the asynchronous function.
Long story short - do what you need to do with x directly inside the callback function.
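To make the difference concrete, here is a small runnable sketch in which setTimeout stands in for the server call. The function name fetchAnswer and the 10 ms delay are made up for the example.

```javascript
// Hypothetical async function: setTimeout stands in for a round trip to a server.
function fetchAnswer(callback) {
  setTimeout(function () {
    var x = 42;   // the value only exists once the "server" has replied
    callback(x);  // hand it to the callback instead of returning it
  }, 10);
}

var returned = fetchAnswer(function (x) {
  console.log('inside callback:', x); // inside callback: 42
});

// This line runs before the callback fires; the function itself returned nothing.
console.log('return value:', returned); // return value: undefined
```

The second console.log prints first, which is exactly why the work that depends on x has to live inside the callback.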

call back on cheerio node.js

I'm trying to write a scraper using 'request' and 'cheerio'. I have an array of 100 urls. I'm looping over the array, using 'request' on each url and then doing cheerio.load(body). If I let i go above 3 (I cap the loop at i < 3 for testing), the scraper breaks because the value used to compute productNumber is undefined and I can't call split on undefined. I think the for loop is moving on before the webpage responds and has time to load the body with cheerio, and the question "nodeJS - Using a callback function with Cheerio" would seem to agree.
My problem is that I don't understand how I can make sure the webpage has 'loaded' or been parsed in each iteration of the loop so that I don't get any undefined variables. According to the other answer I don't need a callback, but then how do I do it?
for (var i = 0; i < productLinks.length; i++) {
    productUrl = productLinks[i];
    request(productUrl, function(err, resp, body) {
        if (err)
            throw err;
        $ = cheerio.load(body);
        var imageUrl = $("#bigImage").attr('src'),
            productNumber = $("#product").attr('class').split(/\s+/)[3].split("_")[1]
        console.log(productNumber);
    });
};
Example of output:
1461536
1499543
TypeError: Cannot call method 'split' of undefined
Since you're not creating a new $ variable for each iteration, it's being overwritten when a request is completed. This can lead to undefined behaviour, where one iteration of the loop is using $ just as it's being overwritten by another iteration.
So try creating a new variable:
var $ = cheerio.load(body);
^^^ this is the important part
Also, you are correct in assuming that the loop continues before the request is completed (in your situation, it isn't cheerio.load that is asynchronous, but request is). That's how asynchronous I/O works.
To coordinate asynchronous operations you can use, for instance, the async module; in this case, async.eachSeries might be useful.
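To show what "one request at a time" means in practice, here is a hand-rolled sketch of the idea. It is not the async module itself: eachInSeries is a simplified helper written for this example, and fakeRequest is a hypothetical stand-in for request() so the snippet runs without network access.

```javascript
// Hypothetical stand-in for request(): calls back asynchronously with a fake body.
function fakeRequest(url, callback) {
  setTimeout(function () { callback(null, {}, '<body of ' + url + '>'); }, 5);
}

// Process items one at a time: the next item starts only after `next` is called.
function eachInSeries(items, iteratee, done) {
  var index = 0;
  function next(err) {
    if (err || index >= items.length) return done(err || null);
    iteratee(items[index++], next);
  }
  next();
}

var seen = [];
eachInSeries(['a', 'b', 'c'], function (url, next) {
  fakeRequest(url, function (err, resp, body) {
    if (err) return next(err);
    seen.push(body); // this is where cheerio.load(body) would go
    next();          // only now does the next url get requested
  });
}, function (err) {
  if (err) throw err;
  console.log(seen);
});
```

Running requests strictly in order is slower than firing them all at once, but it keeps the output order predictable and avoids hammering the target site.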
You are scraping some external site(s). You can't be sure the HTML all fits exactly the same structure, so you need to be defensive on how you traverse it.
var product = $('#product');
// a cheerio selection is always truthy, so check its length instead
if (!product.length) return console.log('Cannot find a product element');
var productClass = product.attr('class');
if (!productClass) return console.log('Product element does not have a class defined');
var productNumber = productClass.split(/\s+/)[3].split("_")[1];
console.log(productNumber);
This'll help you debug where things are going wrong, and perhaps indicate that you can't scrape your dataset as easily as you'd hoped.

what should nodeJS/commonJS module.exports return

I know that I can set module.exports to either an object or a function
(and in some cases a function that will return an object).
I am also aware of the differences and ways to use exports vs. module.exports so no need to comment on that.
I also understand that whatever is returned is cached and will be returned on any consecutive call to require. So if I choose to return a function and not an object, then possibly this implies that on every require it is necessary to actually run this function.
I was wondering whether there is any de facto standard on which of those two should be used. Or, if there is no such standard, what considerations would apply when deciding whether my module should return an object, a function, or anything more complicated...
The module I intend to write is expected to be used as part of an express application, if it matters (maybe there is a "local de facto standard" for express/connect modules).
If the require'd code is standalone, and does not need access to any objects from the parent code, I export an object.
edit - the above is my preferred way to do it. I do the below only when I need to pass stuff into the module, like configuration data, or my database object. I haven't found a more elegant way to give the module access to variables that are in the parents' scope.
So to pass an object from parent into module I use a function:
//parent.js
var config = { DBname:'bar' };
var child = require('./child.js')(config);
//child.js
module.exports = function(cfg){
    var innerchild = {};
    innerchild.blah = function(){
        console.log(cfg.DBname); // this is out of scope unless you pass it in
    }
    return innerchild;
};
"so that if I choose to return a function and not an object then
possibly this implies that on every require It is necessary to
actually run this function."
It does not matter whether you return an individual function or an object. In neither case is a function (or functions) run unless you explicitly do so.
For instance, consider a module hello.js:
module.exports = function () { return 'Hello'; };
You can use require to get that function:
var hello = require('./hello');
If you want to run that function, you need to invoke it explicitly as follows:
var hello = require('./hello')();
You wrote you want to make sure your function is executed exactly once. Intuitively this could lead you to writing your hello.js as follows:
var hello = function () { return 'Hello'; };
module.exports = hello();
In which case you could just store the result from hello via require:
var hello = require('./hello');
However, if you do that, the module system caches your module. On later requires you do not get a fresh result from hello, but the cached value. This may or may not be what you want. If you want to make sure a function is invoked every time, you need to export a function and call it after require.
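The caching behaviour can be checked with a small experiment. The sketch below writes a throwaway hello.js into a temporary directory and requires it twice; the file contents and the call counter are made up for the demonstration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Write a throwaway hello.js into a temp directory (hypothetical module for this demo).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'hello-demo-'));
const file = path.join(dir, 'hello.js');
fs.writeFileSync(file, [
  'let calls = 0;',
  'module.exports = function () { calls += 1; return "Hello #" + calls; };'
].join('\n'));

const first = require(file);   // the module body runs here, once
const second = require(file);  // served from the require cache

console.log(first === second); // true: both are the same exported function
const greetingA = first();
const greetingB = second();
console.log(greetingA, greetingB); // Hello #1 Hello #2
```

The module body (and its calls counter) is set up only once, while the exported function runs each time it is called, which matches the distinction drawn above.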
