Optimization callbacks in MeteorJS - node.js

I don't know how to ask my question correctly, but for example I have a structure like this:
get_data: function () {
  this.unblock();
  request("example.com", Meteor.bindEnvironment(function (error, response, body) {
    if (!error && response.statusCode == 200) {
      var $ = Cheerio.load(body); // get HTML of example.com
      $(".someclass").each(function () {
        if (!somedata_doesnt_exist_in_Mongo) {
          request(nexturl, Meteor.bindEnvironment(function (error, response, body) {
            //... logic
          }));
        }
      });
    }
  }));
}
The main idea is that I collect data from many sites, like an aggregator, and have a lot of methods like this. They take a lot of time. So I have 2 questions:
1 - For Meteor guys: when I use this.unblock(), does it ensure that my method runs without taking time from other clients, like work in another thread?
2 - How can I optimize a code structure like the one above?
Sorry if it's not in StackOverflow format, but I am waiting for any help!

this.unblock is relevant only to each client individually. It allows subsequent method calls from client A to run without waiting for the previous method call from client A to finish. It is like working in a new thread asynchronously, in the sense that the method that calls this.unblock no longer blocks client A's later method calls. If you have a client B, his or her method invocations would not block A's, regardless of whether you use this.unblock.
I recommend using this.unblock whenever you are sure subsequent method calls will not rely on the result of the method you call this.unblock in. Sending out emails is the most common example: subsequent method calls do not need the emails to finish sending before doing their job. For your example, I think it should be fine to use this.unblock, but of course it depends on what you plan to do with the results of the code that runs after this.unblock.

Related

Bot framework async issues

I'm experimenting with the translation service on the Microsoft bot framework. I've written a method to which I pass a callback function which receives my translated text.
I've got an existing bot that calls an HTTP endpoint to create my output in English. I want to translate the output to a different language before returning it to the user. My unaltered code looks like this:
await request.post(ENDPOINT,
  {
    headers: HEADERS,
    json: BODY
  },
  async function (error, response, body) {
    if (response.statusCode == 202) {
      var msg = body.mainResponse.text;
      context.sendActivity(msg);
    }
  });
This runs just fine. Data passed in the HTTP response body gets parsed and sent back to the user.
Now I want to plug in my translation service. I've got a single function that I call to do this called Translator.translate(text, callback). I've added this call to my existing function to get:
await request.post(ENDPOINT,
  {
    headers: HEADERS,
    json: BODY
  },
  async function (error, response, body) {
    if (response.statusCode == 202) {
      var msg = body.mainResponse.text;
      await Translator.translate(msg, function (output) {
        context.sendActivity(output);
      });
    }
  }
);
My translation process runs and I get the translation in the output variable, but nothing gets sent back to the user. Looking at the terminal, I see the error "Cannot perform 'get' on a proxy that has been revoked" relating to the context.sendActivity line in my callback.
Can anyone suggest how I keep the context object active?
Thanks in advance.
Many thanks for the assistance everyone - I never completely got to the bottom of this, but I finally fixed it with a complete re-write of the code. I think the problem was caused by a large number of nested synchronous and asynchronous calls. My ultimate solution was to completely get rid of all the nesting - first calling the translation service (and waiting for it), then doing the original call.
I think there are a number of other asynchronous threads inside the methods of both pieces of functionality. I don't have a great understanding of how this works in node, but I'm guessing that the response was getting popped off the stack at the wrong point, which is why I wasn't seeing it. The "cannot perform get" error was a bit of a red herring, it turns out. I get the same error from some of Microsoft's working demo code. I'm sure there's a separate issue there that ought to be fixed, but it wasn't actually caused by this issue. The code was running, but the output was getting lost.
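The idea of removing the nesting and waiting for each step in turn can be sketched as follows. This is a minimal sketch, not the poster's actual rewrite: `translateAsync` and `handleMessage` are hypothetical names, and it assumes that `Translator.translate(text, callback)` invokes its callback exactly once with the translated text. The stub `Translator` and `context` objects exist only so the example runs on its own.

```javascript
// Stand-ins so the sketch runs on its own; a real bot would use its own
// Translator module and the turn context supplied by the framework.
const Translator = {
  translate(text, callback) {
    setTimeout(() => callback(`[translated] ${text}`), 10);
  },
};
const sent = [];
const context = {
  async sendActivity(text) {
    sent.push(text);
  },
};

// Hypothetical helper: wrap the callback-style API in a Promise so that
// callers can await the translated text instead of nesting a callback.
function translateAsync(text) {
  return new Promise((resolve) => Translator.translate(text, resolve));
}

// Each step is awaited in sequence, so the handler's promise does not
// resolve before the reply has been sent.
async function handleMessage(ctx, text) {
  const translated = await translateAsync(text);
  await ctx.sendActivity(translated);
}

const done = handleMessage(context, 'Hello');
```

Because every asynchronous step is awaited, the caller of `handleMessage` can tell when the reply has actually gone out, which is harder to guarantee when a callback is buried inside another callback.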

Is there any risk to read/write the same file content from different 'sessions' in Node JS?

I'm new to Node JS and I wonder whether the snippets of code below have a multi-session problem.
Consider I have a Node JS server (Express) that listens for a POST request:
app.post('/sync/:method', onPostRequest);

// Function declaration (hoisted), so it is already defined when app.post() runs
function onPostRequest(req, res) {
  // parse request and fetch email list
  var emails = [....]; // pseudocode
  doJob(emails);
  res.status(200).end('OK');
}

function doJob(_emails) {
  try {
    var emailsFromFile = fs.readFileSync(FILE_PATH, "utf8") || {};
    // the original checked `oldEmails` here, which is never defined
    if (_.isString(emailsFromFile)) {
      emailsFromFile = JSON.parse(emailsFromFile);
    }
    _emails.forEach(function (_email) {
      if (!emailsFromFile[_email]) {
        emailsFromFile[_email] = 0;
      }
      else {
        emailsFromFile[_email] += 1;
      }
    });
    // write object back
    fs.writeFileSync(FILE_PATH, JSON.stringify(emailsFromFile));
  } catch (e) {
    console.error(e);
  }
}
So the doJob method receives the _emails list, and I update (counter +1) these emails in the emailsFromFile object loaded from the file.
Consider that I get 2 requests at the same time, which triggers doJob twice. I am afraid that after one request has loaded emailsFromFile from the file, the second request might change the file content.
Can anybody shed some light on this issue?
Because the code in the doJob() function is all synchronous, there is no risk of multiple requests causing a concurrency problem.
If you were using async IO in that function, then there would be possible concurrency issues.
To explain, Javascript in node.js is single threaded. So, there is only one thread of Javascript execution running at a time and that thread of execution runs until it returns back to the event loop. So, any sequence of entirely synchronous code like you have in doJob() will run to completion without interruption.
If, on the other hand, you use any asynchronous operations such as fs.readFile() instead of fs.readFileSync(), then that thread of execution will return back to the event loop at the point you call fs.readFile() and another request can be run while it is reading the file. If that were the case, then you could end up with two requests conflicting over the same file. In that case, you would have to implement some form of concurrency protection (some sort of flag or queue). This is the type of thing that databases offer lots of features for.
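To make the difference concrete, here is a small simulation of the conflict that asynchronous I/O can introduce. It is only a sketch: the `store` object stands in for the file, and `readCounts`, `writeCounts` and `addEmail` are hypothetical helpers that use `setTimeout` to mimic the latency of real file I/O.

```javascript
// In-memory stand-in for the file contents.
const store = { json: '{}' };

// Hypothetical async helpers; setTimeout mimics the latency of real file I/O.
function readCounts() {
  return new Promise((resolve) =>
    setTimeout(() => resolve(JSON.parse(store.json)), 5));
}
function writeCounts(counts) {
  return new Promise((resolve) =>
    setTimeout(() => {
      store.json = JSON.stringify(counts);
      resolve();
    }, 5));
}

// Read-modify-write with awaits between the read and the write.
async function addEmail(email) {
  const counts = await readCounts(); // control returns to the event loop here
  counts[email] = (counts[email] || 0) + 1;
  await writeCounts(counts);
}

// Two "requests" start before either one has written its result back.
const finished = Promise.all([
  addEmail('a@example.com'),
  addEmail('a@example.com'),
]).then(() => JSON.parse(store.json));

// Both calls read the same empty object, so one increment is lost.
finished.then((counts) => console.log(counts)); // { 'a@example.com': 1 }
```

With the synchronous version from the question, the second call could not start until the first had written its result, so the count would end up at 2.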
I have a node.js app running on a Raspberry Pi that uses lots of async file I/O, and that code can run into conflicts when multiple requests arrive. I solved it by setting a flag any time I'm writing to a specific file. Any other request that wants to write to that file first checks the flag; if it is set, the request goes into my own queue and is served when the prior request finishes its write operation. There are many other ways to solve that too. If this happens in a lot of places, then it's probably worth just getting a database that offers features for this type of write contention.
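The flag-and-queue approach described above can be approximated with a promise chain per file. This is a sketch of one possible approach, not the answerer's actual code: `withFileLock` and `incrementCount` are hypothetical names, and the technique only protects against overlapping access within a single Node.js process.

```javascript
const fs = require('fs/promises');
const os = require('os');
const path = require('path');

// One promise chain per file path; each new task waits for the previous one.
const queues = new Map();

function withFileLock(filePath, task) {
  const previous = queues.get(filePath) || Promise.resolve();
  const next = previous.then(() => task());
  // Store a version that never rejects, so one failed task does not
  // prevent later tasks from running.
  queues.set(filePath, next.catch(() => {}));
  return next;
}

// Read-modify-write; safe only because every caller goes through withFileLock.
function incrementCount(filePath, email) {
  return withFileLock(filePath, async () => {
    let counts = {};
    try {
      counts = JSON.parse(await fs.readFile(filePath, 'utf8'));
    } catch (err) {
      if (err.code !== 'ENOENT') throw err; // a missing file means "no counts yet"
    }
    counts[email] = (counts[email] || 0) + 1;
    await fs.writeFile(filePath, JSON.stringify(counts));
    return counts[email];
  });
}

// Usage: ten overlapping "requests" against a temporary file.
const demoFile = path.join(os.tmpdir(), `counts-${process.pid}-${Date.now()}.json`);
const results = Promise.all(
  Array.from({ length: 10 }, () => incrementCount(demoFile, 'a@example.com'))
);
```

Each call still returns immediately with a promise, so the server stays responsive; only the operations on the same file are forced to run one after another.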

To async, or not to async in node.js?

I'm still learning the node.js ropes and am just trying to get my head around what I should be deferring, and what I should just be executing.
I know there are other questions relating to this subject generally, but I'm afraid without a more relatable example I'm struggling to 'get it'.
My general understanding is that if the code being executed is non-trivial, then it's probably a good idea to async it, as to avoid it holding up someone else's session. There's clearly more to it than that, and callbacks get mentioned a lot, and I'm not 100% on why you wouldn't just synch everything. I've got some ways to go.
So here's some basic code I've put together in an express.js app:
app.get('/directory', function(req, res) {
  process.nextTick(function() {
    Item.
      find().
      sort( 'date-modified' ).
      exec( function ( err, items ){
        if ( err ) {
          return next( err );
        }
        res.render('directory.ejs', {
          items : items
        });
      });
  });
});
Am I right to be using process.nextTick() here? My reasoning is that as it's a database call then some actual work is having to be done, and it's the kind of thing that could slow down active sessions. Or is that wrong?
Secondly, I have a feeling that if I'm deferring the database query then it should be in a callback, and I should have the actual page rendering happening synchronously, on condition of receiving the callback response. I'm only assuming this because it seems like a more common format from some of the examples I've seen - if it's a correct assumption can anyone explain why that's the case?
Thanks!
You are using it wrong in this case, because .exec() is already asynchronous (you can tell by the fact that it accepts a callback as a parameter).
To be fair, most of what needs to be asynchronous in nodejs already is.
As for page rendering, if you require the results from the database to render the page, and those arrive asynchronously, you can't really render the page synchronously.
Generally speaking it's best practice to make everything you can asynchronous rather than relying on synchronous functions ... in most cases that would be something like readFile vs. readFileSync. In your example, you're not doing anything synchronously with i/o. The only synchronous code you have is the logic of your program (which requires CPU and thus has to be synchronous in node) but these are tiny little things by comparison.
I'm not sure what Item is, but if I had to guess what .find().sort() does is build a query string internally to the system. It does not actually run the query (talk to the DB) until .exec is called. .exec takes a callback, so it will communicate with the DB asynchronously. When that communication is done, the callback is called.
Using process.nextTick does nothing useful in this case. It just delays the call to its callback until the currently running code has finished, which there is no need to do. It has no effect on whether the work is synchronous or asynchronous.
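A small sketch of what process.nextTick actually does: it only postpones a callback until the synchronous code that is currently running has finished, and it does not move any work off the main thread. The `order` array and the `recorded` promise are illustrative names used to capture the sequence.

```javascript
const order = [];

order.push('before nextTick');
process.nextTick(() => {
  // Runs only after the synchronous code below has finished.
  order.push('nextTick callback');
});
order.push('after nextTick');

// setImmediate fires on a later turn of the event loop, after nextTick callbacks.
const recorded = new Promise((resolve) => setImmediate(() => resolve(order)));

recorded.then((o) => console.log(o.join(' -> ')));
// prints "before nextTick -> after nextTick -> nextTick callback"
```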
I don't really understand your second question, but if the rendering of the page depends on the result of the query, you have to defer rendering of the page until the query completes -- you are doing this by rendering in the callback. The rendering itself res.render may not be entirely synchronous either. It depends on the internal mechanism of the library that defines the render function.
In your example, next is not defined. Instead your code should probably look like:
app.get('/directory', function(req, res) {
  Item.
    find().
    sort( 'date-modified' ).
    exec(function (err, items) {
      if (err) {
        console.error(err);
        res.status(500).end("Database error");
      }
      else {
        res.render('directory.ejs', {
          items : items
        });
      }
    });
});

Notification in ajax response orchard

I'm using ajax requests to get one of PartialViews in my project. I want to pass a message by INotifier.
Currently I'm using HttpStatusCodeResult, i.e. return new HttpStatusCodeResult(204, "Message");, to pass information about errors, but it is not a satisfying solution.
$(this).load($(this).attr("href"), function (response, status, xhr) {
  if (xhr.status == 204) {
    // show message
  }
});
I'm wondering whether this is possible using the standard INotifier.Error() in an ActionResult.
Nope. The default notifier is not suitable for AJAX requests. What it does, it queues notifications inside a temporary collection. Queued notifications are then written to the client when request ends - pushed into Layout.Messages zone.
In your case it would be best to implement your own INotifier that would follow the required logic. It's a very simple interface to implement so it's not actually that much work.
I didn't need to implement INotifier; I only placed this in my PartialView:
@Display(WorkContext.Layout.Zones["Messages"])
Now the message isn't rendered in main zone (in Layout.cshtml of used theme), but could be placed anywhere in your PartialView, for example under the affected table.

How to avoid the need to delay event emission to the next tick of the event loop?

I'm writing a Node.js application using a global event emitter. In other words, my application is built entirely around events. I find this kind of architecture working extremely well for me, with the exception of one side case which I will describe here.
Note that I do not think knowledge of Node.js is required to answer this question. Therefore I will try to keep it abstract.
Imagine the following situation:
A global event emitter (called mediator) allows individual modules to listen for application-wide events.
A HTTP Server is created, accepting incoming requests.
For each incoming request, an event emitter is created to deal with events specific to this request.
An example (purely to illustrate this question) of an incoming request:
mediator.on('http.request', function(request, response, emitter) {
  //deal with the new request here, e.g.:
  response.send("Hello World.");
});
So far, so good. One can now extend this application by identifying the requested URL and emitting appropriate events:
mediator.on('http.request', function(request, response, emitter) {
  //identify the requested URL
  if (request.url === '/') {
    emitter.emit('root');
  }
  else {
    emitter.emit('404');
  }
});
Following this one can write a module that will deal with a root request.
mediator.on('http.request', function(request, response, emitter) {
  //when root is requested
  emitter.once('root', function() {
    response.send('Welcome to the frontpage.');
  });
});
Seems fine, right? Actually, it is potentially broken code. The reason is that the line emitter.emit('root') may be executed before the line emitter.once('root', ...). The result is that the listener never gets executed.
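The ordering problem can be reproduced with a plain Node.js EventEmitter. This is a minimal sketch; the `calls` array and `hadListeners` variable are illustrative names, not part of the application described above.

```javascript
const { EventEmitter } = require('events');

const emitter = new EventEmitter();
const calls = [];

// The event fires while nobody is listening, so it is simply dropped.
const hadListeners = emitter.emit('root');

// This listener is registered too late and never runs for that emission.
emitter.once('root', () => calls.push('root handled'));

console.log(hadListeners); // false: emit() reports that there were no listeners
console.log(calls.length); // 0
```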
One could deal with this specific situation by delaying the emission of the root event to the next tick of the event loop:
mediator.on('http.request', function(request, response, emitter) {
  //identify the requested URL
  if (request.url === '/') {
    process.nextTick(function() {
      emitter.emit('root');
    });
  }
  else {
    process.nextTick(function() {
      emitter.emit('404');
    });
  }
});
The reason this works is because the emission is now delayed until the current event loop has finished, and therefore all listeners have been registered.
However, there are many issues with this approach:
One of the advantages of such an event-based architecture is that emitting modules do not need to know who is listening to their events. Therefore it should not be necessary to decide whether the event emission needs to be delayed, because one cannot know what is going to listen for the event and whether it needs the emission to be delayed.
It significantly clutters and complicates the code (compare the two examples).
It probably worsens performance.
As a consequence, my question is: how does one avoid the need to delay event emission to the next tick of the event loop, such as in the described situation?
Update 19-01-2013
An example illustrating why this behavior is useful: to allow a http request to be handled in parallel.
mediator.on('http.request', function(req, res) {
  req.onceall('json.parsed', 'validated', 'methodoverridden', 'authenticated', function() {
    //the request has now been validated, parsed as JSON, the kind of HTTP method has been overridden when requested to and it has been authenticated
  });
});
If each event like json.parsed would emit the original request, then the above is not possible because each event is related to another request and you cannot listen for a combination of actions executed in parallel for a specific request.
Having both a mediator which listens for events and an emitter which also listens and triggers events seems overly complicated. I'm sure there is a legit reason but my suggestion is to simplify. We use a global eventBus in our nodejs service that does something similar. For this situation, I would emit a new event.
bus.on('http:request', function(req, res) {
  if (req.url === '/')
    bus.emit('ns:root', req, res);
  else
    bus.emit('404');
});
// note the use of namespace here to target specific subsystem
bus.once('ns:root', function(req, res) {
  res.send('Welcome to the frontpage.');
});
It sounds like you're starting to run into some of the disadvantages of the observer pattern (as mentioned in many books/articles that describe this pattern). My solution is not ideal – assuming an ideal one exists – but:
If you can make a simplifying assumption that the event is emitted only 1 time per emitter (i.e. emitter.emit('root'); is called only once for any emitter instance), then perhaps you can write something that works like jQuery's $.ready() event.
In that case, subscribing to emitter.once('root', function() { ... }) will check whether 'root' was emitted already, and if so, will invoke the handler anyway. And if 'root' was not emitted yet, it'll defer to the normal, existing functionality.
That's all I got.
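The $.ready()-style idea above could look roughly like this. It is a sketch under the stated assumption that each event is emitted at most once per emitter; `StickyEmitter` and its `fired` map are hypothetical names, not an existing Node.js API.

```javascript
const { EventEmitter } = require('events');

// Hypothetical "sticky" emitter: remembers the arguments of the first
// emission of each event and replays them to listeners added via once() later.
class StickyEmitter extends EventEmitter {
  constructor() {
    super();
    this.fired = new Map();
  }

  emit(event, ...args) {
    if (!this.fired.has(event)) this.fired.set(event, args);
    return super.emit(event, ...args);
  }

  once(event, listener) {
    if (this.fired.has(event)) {
      // Already emitted: invoke the handler with the remembered arguments.
      listener.apply(this, this.fired.get(event));
      return this;
    }
    // Not emitted yet: fall back to the normal behaviour.
    return super.once(event, listener);
  }
}

// Usage: the listener is registered after the emission but still runs.
const emitter = new StickyEmitter();
const log = [];
emitter.emit('root', 'first');
emitter.once('root', (value) => log.push(value));
console.log(log); // [ 'first' ]
```

Note that a late handler is invoked synchronously inside once() here; deferring it with process.nextTick would be another option if handlers must never run during registration.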
I think this architecture is in trouble, as you're doing sequential work (I/O) that requires definite order of actions but still plan to build app on components that naturally allow non-deterministic order of execution.
What you can do
Include context selector in mediator.on function e.g. in this way
mediator.on('http.request > root', function( .. ) { } )
Or define it as submediator
var submediator = mediator.yield('http.request > root');
submediator.on(function( ... ) {
  emitter.once('root', ... )
});
This would trigger the callback only if root was emitted from http.request handler.
Another, trickier way is to do the ordering in the background, but it's not feasible with your current one-mediator-rules-them-all interface. Implement the code so that each .emit call does not actually send the event, but puts the produced event in a list. Each .once puts a consume-event record in the same list. When all mediator.on callbacks have been executed, walk through the list and sort it by dependency order (e.g. if the list has first consume 'root' and then produce 'root', swap them). Then execute the consume handlers in order. If you run out of events, stop executing.
Oi, this seems like a very broken architecture for a few reasons:
How do you pass around request and response? It looks like you've got global references to them.
If I answer your question, you will turn your server into a pure synchronous function and you'd lose the power of async node.js. (Requests would be queued effectively, and could only start executing once the last request is 100% finished.)
To fix this:
Pass request & response to the emit() call as parameters. Now you don't need to force everything to run synchronously anymore, because when the next component handles the event, it will have a reference to the right request & response objects.
Learn about other common solutions that don't need a global mediator. Look at the pattern that Connect was based on many Internet-years ago: http://howtonode.org/connect-it <- describes middleware/onion routing
