What is the order of execution of the same-type hooks in Fastify?

In Fastify you have at least two ways to register hooks: globally (via fastify.addHook()) or as a property inside the route declaration. In the example below I'm trying to use fastify-multer to handle file uploads, but the maximum number of files must be limited by a setting associated with a "room". As the app has many rooms, most requests contain a reference to a room, and each request is augmented with the room settings by the preHandler hook.
import fastify from 'fastify'
import multer from 'fastify-multer'

const server = fastify()
server.register(multer.contentParser)
// multer instance used below (the destination is just an example)
const upload = multer({ dest: 'uploads/' })

// For all requests containing the room ID, fetch the room options from the database
server.addHook('preHandler', async (request, reply) => {
  if (request.body.roomID) {
    const roomOptions = await getRoomOptions(request.body.roomID)
    if (roomOptions) {
      request.body.room = roomOptions
    } else {
      // handle an error if the room doesn't exist
    }
  }
})

server.post('/post', {
  // Limit the maximum number of files to be uploaded based on room options
  preHandler: upload.array('files', request.body.room.maxFiles)
})
In order for this setup to work, the global hook must always be executed before the file upload hook. How can I guarantee that?

Summary: As @Manuel Spigolon said:
How can I guarantee that? The framework does it
Now we can take Manuel's word for it (SPOILER ALERT: they are absolutely correct), or we can prove how this works by looking at the source code on GitHub.
The first thing to keep in mind is that arrays in JavaScript remain ordered by the way elements are pushed into them, but don't take my word for it. That is all explained here if you want to dive a little deeper into the evidence. If that were not true, everything below wouldn't matter and you could stop reading now.
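For example, this is trivial to check in a Node.js REPL:
const hooks = []
hooks.push('first')   // pushed by the first addHook call
hooks.push('second')  // pushed by the second addHook call
console.log(hooks)    // [ 'first', 'second' ]: iteration follows insertion order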
How addHook works
Now that we have established that arrays maintain their order, let's look at how the addHook code is executed. We can start by looking at the default export of Fastify in the fastify.js file located in the root directory. In that object, if you scroll down, you'll see the addHook property defined. When we look into the addHook function implementation, we can see that the add hook call in turn calls this[kHooks].add.
When we go back to see what the kHooks property is, we see that it is a new Hooks(). When we take a look at the add method on the Hooks object, we can see that it just validates the hook that is being added and then pushes it to the array property on the Hooks object with the matching hook name. This shows that hooks will always be in the order in which add was called for them.
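As a rough sketch of the idea (simplified, not Fastify's actual source), the storage amounts to one array per hook name and a push on every add call:
class Hooks {
  constructor () {
    // one array per supported lifecycle hook
    this.onRequest = []
    this.preHandler = []
    // ...
  }
  add (name, fn) {
    if (typeof fn !== 'function') throw new TypeError('hook handler must be a function')
    this[name].push(fn) // push preserves registration order
  }
}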
How fastify.route adds hooks
I hope you're following up to this point, because that only proves the order of the addHook calls in the respective array on the Hooks object. The next question is how these interact with the calls to the fastify.(get | post | route | ...) functions. We can walk through the fastify.get function, but they are all pretty much the same (you can do the same exercise with any of them). Looking at the get function, we see that the implementation just calls the router.prepareRoute function. When you look into the prepareRoute implementation, you see that this function returns a call to the route function. In the route function there is a section where the hooks are set up. It looks like this:
for (const hook of lifecycleHooks) {
  const toSet = this[kHooks][hook]
    .concat(opts[hook] || [])
    .map(h => h.bind(this))
  context[hook] = toSet.length ? toSet : null
}
What this does is go through every lifecycle hook and build, for that hook, the combined set of all the hooks from the Fastify instance (this[kHooks][hook]) and the hooks in the route options (opts[hook]), binding each of them to the Fastify instance (this). Because the instance hooks are concatenated first, the hooks passed in the route options are always added after the addHook handlers.
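Applied to the question's setup, the practical consequence looks like this (an illustrative sketch, not taken from the question; the request.order array exists only to show the execution order):
// The global hook registered with addHook is concatenated first...
server.addHook('preHandler', async (request, reply) => {
  request.order = ['global']
})

// ...so a route-level preHandler of the same type always runs after it.
server.post('/demo', {
  preHandler: async (request, reply) => {
    request.order.push('route')
  }
}, async (request, reply) => {
  return request.order // responds with ['global', 'route']
})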
How Fastify executes hooks
This is not everything we need, though. Now we know the order in which the hooks are stored, but how exactly are they executed? For that we can look at the hookRunner function in the hooks.js file. We see this function acts as a sort of recursive loop that continues running as long as the handlers do not error. It first creates a variable i to keep track of the handler function it is currently on, then tries to execute the current handler and increments the function tracker (i).
If the handler fails (handleReject), it runs a callback function and does not call the next function to continue. If the handler succeeds (handleResolve), it just runs the next function to try the same process on the following handler (functions[i++]) in the functions set.
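The shape of that runner is roughly the following (a simplified sketch, not the exact Fastify code):
function hookRunner (functions, request, reply, done) {
  let i = 0
  function next (err) {
    // stop on error, or finish once every handler has run, in array order
    if (err || i === functions.length) return done(err, request, reply)
    const result = functions[i++](request, reply, next)
    if (result && typeof result.then === 'function') {
      result.then(() => next(), next) // handleResolve / handleReject
    }
  }
  next()
}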
Why does this matter
This proves that the hook handlers are called in the order that they were pushed into the ordered collection. In other words:
How can I guarantee that? The framework does it

Related

How to add a custom dimension to request telemetry in a Nodejs/typescript azure function?

Goal
A request comes in and is handled by the Azure Functions run-time. By default it creates a Request entry, and a bunch of Trace entries in Application Insights. I want to add a custom dimension to that top level request item (on a per-request basis) so I can use it for filtering/analysis later.
(Screenshots: a query for requests in Application Insights, and the resulting list of requests including a custom dimensions column.)
The Azure Functions runtime adds a few custom dimensions already. I want to add a few of my own.
Approach
The most promising approach I've found is shown below (taken from https://github.com/microsoft/ApplicationInsights-node.js/issues/392):
appInsights.defaultClient.addTelemetryProcessor((envelope, context) => {
  var data = envelope.data.baseData;
  data.properties['mykey'] = 'myvalue';
  return true;
});
However, I find that this processor is only called for requests that I initiate within my function. For example, if I make an HTTP request to another service, then details of that request will be passed through the processor and I can add custom properties to it. But the main function's request does not seem to pass through here, so I can't add my custom property.
I also tried this
defaultClient.commonProperties['anotherCustomProp'] = 'bespokeProp2'
Same problem. The custom property doesn't arrive in Application Insights. I've played with many variations on this, and it appears that the logging done by azure-functions is walled off from anything I can do within my code.
The best workaround I have right now is to call trackRequest manually. This is okay, except I end up with each request logged twice in Application Insights: once by the framework and once by me. And both need to have the same operation_id, otherwise I can't find the associated trace/error items, so I'm having to extract the operationId in a slightly hacky way. This may be fine; my knowledge of Application Insights is pretty naive at this point.
// I have to import the specific functions, because "import ai from 'applicationinsights'" returns null
import { setup, defaultClient } from 'applicationinsights'
// Types provided by the Functions runtime package
import { AzureFunction, Context, HttpRequest } from '@azure/functions'

// Call this because otherwise defaultClient is null.
// Some examples call start(); I've tried with and without it.
// I think the start() function must be useful when you're adding application-insights to a project fresh,
// whereas the azure-functions runtime must be doing this already.
setup()

const httpTrigger: AzureFunction = async function (context: Context, req: HttpRequest): Promise<void> {
  // Extract the operation id from the traceparent as per the W3C standard https://www.w3.org/TR/trace-context/
  const operationId = context.traceContext.traceparent.split('-')[1]
  const operationIdOverride = { 'ai.operation.id': operationId }

  // Create my own trackRequest entry
  defaultClient.trackRequest({
    name: 'my func name',
    url: context.req.url.split('?')[0],
    duration: 123,
    resultCode: 200,
    success: true,
    tagOverrides: operationIdOverride,
    properties: {
      customProp: 'bespokeProp'
    }
  })
}

export default httpTrigger
The Dream
Our C# cousins seem to have an array of options, like Activity.Current.tags and the ability to add a TelemetryInitializer. It looks like what I'm trying to do should be supported, I'm just not finding the right combination of commands! Is there something similar for JavaScript/TypeScript/Node.js, where I can just add a tag on a per-request basis? Something along the lines of context.traceContext.attributes['myprop'] = 'myValue'.
Alternative
Alternatively, instrumenting my code using my own TelemetryClient (rather than the defaultClient) with trackRequest, trackTrace, trackError etc. is not a very big job and should work well; that would be more explicit. Should I just do that? Is there a way to disable the azure-functions tracking, or perhaps I just leave that as something running side-by-side?
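For what it's worth, a minimal sketch of that alternative (assuming the standard APPLICATIONINSIGHTS_CONNECTION_STRING environment variable is available, and reusing the operationId extracted above) might look like this:
import { TelemetryClient } from 'applicationinsights'

// A dedicated client, separate from defaultClient
const telemetry = new TelemetryClient(process.env.APPLICATIONINSIGHTS_CONNECTION_STRING)

// Inside the function: tag each item with the operation id extracted from traceparent
telemetry.trackTrace({
  message: 'processing request',
  properties: { customProp: 'bespokeProp' },
  tagOverrides: { 'ai.operation.id': operationId }
})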

Azure function run code on startup for Node

I am developing a chatbot using Azure Functions. I want to load some of the conversations for the chatbot from a file. I am looking for a way to load this conversation data before the function app starts, with some function callback. Is there a way to load the conversation data only once, when the function app is started?
This question is actually a duplicate of Azure Function run code on startup, but that question was asked for C# and I wanted a way to do the same thing in Node.js.
After like a week of messing around I got a working solution.
First some context:
The question at hand: running custom code at App Start for Node.js Azure Functions.
The issue is currently being discussed here and has been open for almost 5 years, and doesn't seem to be going anywhere.
As of now there is an Azure Functions "warmup" trigger feature, found here: AZ Funcs Warm Up Trigger. However, this trigger only runs on scale-out, so the first, initial instance of your app won't run the "warmup" code.
Solution:
I created a start.js file and put the following code in there
const ErrorHandler = require('./Classes/ErrorHandler');
const Validator = require('./Classes/Validator');
const delay = require('delay');

let flag = false;

module.exports = async () =>
{
    // Run the startup logic only once per process
    if (flag) return;
    flag = true;

    console.log('Initializing Globals');
    global.ErrorHandler = ErrorHandler;
    global.Validator = Validator;

    // this is just to test if it will work with async funcs
    await delay(5000);
    // add additional logic...
    // await db.connect(); etc // initialize a db connection
    console.log('Done Waiting');
}
To run this code I just have to do
require('../start')();
in any of my functions. Just one function is fine. Since all of the function dependencies are loaded when you deploy your code, as long as this line is in one of the functions, start.js will run and initialize all of your global/singleton variables (or whatever else you want it to do on function start). I made a literal function called "startWarmUp", which is just a timer-triggered function that runs once a day.
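For illustration (the function body and response here are hypothetical), a function that pulls in the startup module could look like this:
const start = require('../start');

module.exports = async function (context, req) {
    // Runs the one-time initialization on the first call; later calls return immediately
    await start();
    context.res = { body: 'Validator loaded: ' + (typeof global.Validator) };
};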
My use case is that almost every function relies on ErrorHandler and Validator class. And though generally making something a global variable is bad practice, in this case I didn't see any harm in making these 2 classes global so they're available in all of the functions.
Side note: when developing locally, you will have to include that function in your func start --functions <function requiring start.js> <other funcs> command in order to have that startup code actually run.
Additionally, there is a feature request for this functionality that can be voted on here: Azure Feedback.
I have a similar use case that I am also stuck on.
Based on this resource I have found a good way to approach the structure of my code. It is simple enough: you just need to run your initialization code before you declare your module.exports.
https://github.com/rcarmo/azure-functions-bot/blob/master/bot/index.js
I also read this thread, but it does not look like there is a recommended solution.
https://github.com/Azure/azure-functions-host/issues/586
However, in my case I have an additional complication in that I need to use promises, as I am waiting on external services to come back. These promises run within bot.initialise(). initialise() only seems to run when the first call to the bot occurs, which would be fine, but since it is running a promise my code doesn't block, which means that when it calls listener(req, context.res), the listener doesn't exist yet.
The next thing I will try is to restructure my code so that bot.initialise returns a promise, but the code would be much simpler if there were an initialisation webhook that guaranteed the code within it was executed at startup, before everything else.
Has anyone found a good workaround?
My code looks something like this:
var listener = null;

if (process.env.FUNCTIONS_EXTENSION_VERSION) {
    // If we are inside Azure Functions, export the standard handler.
    listener = bot.initialise(true);
    module.exports = function (context, req) {
        context.log("Passing body", req.body);
        listener(req, context.res);
    };
} else {
    // Local server for testing
    listener = bot.initialise(false);
}
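One way around that, sketched under the assumption that bot.initialise can be made to return a promise, is to cache that promise at module load and await it inside the handler:
var listenerPromise = bot.initialise(!!process.env.FUNCTIONS_EXTENSION_VERSION);

module.exports = async function (context, req) {
    // Resolves once on the first request; later requests reuse the same listener
    var listener = await listenerPromise;
    context.log("Passing body", req.body);
    listener(req, context.res);
};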
You can use a global variable to load data before function execution.
var data = [1, 2, 3];

module.exports = function (context, req) {
    context.log(data[0]);
    context.done();
};
The data variable is initialized only once and is reused across function calls.

Would the values inside request be mixed up in callback?

I am new to Node.js, and I have been reading questions and answers related to this issue, but I'm still not very sure if I fully understand the concept in my case.
Suggested Code
router.post('/test123', function(req, res) {
    someAsyncFunction1(parameter1, function(result1) {
        someAsyncFunction2(parameter2, function(result2) {
            someAsyncFunction3(parameter3, function(result3) {
                var theVariable1 = req.body.something1;
                var theVariable2 = req.body.something2;
            });
        });
    });
});
Question
I assume there will be multiple (10+, 100+, or whatever) requests to one endpoint (for example, AJAX requests to /test123, as shown above) at the same time, each with its own variables (something1 and something2). According to this, it should be impossible for one user's theVariable1 and theVariable2 to be mixed up with (i.e. overwritten by) another user's req.body.something1 and req.body.something2. I am wondering if this is still true when there are multiple nested callbacks (three, as above, or ten, just in case).
And I am also considering using res.locals to save some data from the callbacks (instead of using theVariable1 and theVariable2), but is that a good idea, given that the data must not be overwritten by multiple simultaneous requests from clients?
Each request a Node.js/Express server receives generates a new req object.
So in the line router.post('/test123', function(req, res), the req object that's being passed in as an argument is unique to that HTTP request.
You don't have to worry about multiple functions or callbacks. In a traditional application, if I have two objects, cat and dog, that I can pass to the same listen function, I would get back meow and bark respectively, even though there's only one listen function. That's sort of how you can view an Express app: even though you have all these get and post handlers, every user's request is passed to them as a unique entity.
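As a small sketch of that isolation (the setTimeout is only there to force requests to overlap):
router.post('/test123', function (req, res) {
    setTimeout(function () {
        // Even if other requests arrive during the delay, this `req` is still the
        // object this particular handler invocation closed over.
        res.json({ something1: req.body.something1, something2: req.body.something2 });
    }, 100);
});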

How to think asynchronously with nodejs?

I just started developing with Node.js, and I'm confused about the async model. I believe there is a way to turn most SYNC use cases into an ASYNC approach. For example, with SYNC we load some data and wait until it returns, then show it to the user; with ASYNC we load the data and return immediately, telling the user the data will be presented later. I can understand why ASYNC is used in this scenario.
But here I have a use case. I'm building a web app that allows a user to place an order (buying something). Before saving the order data to the db, I want to put some user data together with the order data (I'm using a document NoSQL db, by the way). So I think with SYNC, after I get the order data, I make a SYNC call to the database and wait for the user data it returns. After I get the returned data, I integrate the two and insert them into the db.
I think there might be an issue if I make an ASYNC call to the db to query the user data, because the user data may be returned after I save the order to the db, and that's not what I want.
So in this case, how can I do this ASYNCHRONOUSLY?
A couple of things here. First, if your application already has the user data (the user is already logged in), then this information should be stored in the session so you don't have to access the DB. If you are allowing the user to register at the time of purchase, you would simply want to pass a callback function that handles saving the order into the call that saves the user data. Without knowing specifically what your code looks like, something like this is what you would be looking for:
function saveOrder(userData, orderData, callback) {
    // save the user data to the DB
    db.save(userData, function(rec) {
        // if you need to add the user ID or something to the order...
        orderData.userId = rec.id; // this would be dependent on your DB of choice
        // save the order data to the DB
        db.save(orderData, callback);
    });
}
Sync code goes something like this: step by step, one after the other. There can be ifs and loops (for) etc. All of us get it.
fetchUserDataFromDB();
integrateOrderDataAndUserData();
updateOrderData();
Think of async programming with Node.js as event driven, like UI programming: code (a function) is executed when an event occurs. E.g. on a click event, the framework calls back the registered clickHandler.
Node.js async programming can be thought of along the same lines. When the (async) db query finishes executing, your callback is called. When the order data is updated, your callback is called. The above code goes something like this:
function nodejsOrderHandler(req, res)
{
    var orderData;
    db.queryAsync(/* ... */, onqueryasync);

    function onqueryasync(userdata)
    {
        // integrate user data with order data
        db.update(updateParams, onorderupdate);
    }

    function onorderupdate(e, r)
    {
        // handle error
        // write the response
    }
}
JavaScript closures provide the way to keep state in variables across functions.
There is certainly much more to async programming, and there are helper modules that help with basic constructs like chain, parallel, join, etc. as you write more involved async code, but this probably gives you a quick idea.
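As a hedged example of one such helper, the same flow could be written with the async module's waterfall; db.queryAsync and db.update are the placeholder calls from the snippet above, and integrate is a made-up helper:
var async = require('async');

function nodejsOrderHandler(req, res) {
    async.waterfall([
        function (cb) {
            // fetch user data
            db.queryAsync(req.body.userId, cb);
        },
        function (userdata, cb) {
            // integrate user data with order data, then save
            var updateParams = integrate(userdata, req.body.order);
            db.update(updateParams, cb);
        }
    ], function (err) {
        // handle error / write response
        if (err) {
            res.statusCode = 500;
            return res.end();
        }
        res.end('order saved');
    });
}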

CoffeeScript Preserve Multiple `this` objects?

I'm in a rather unique situation (one I've never found myself in before) and I can't find anything on how it should be handled, so I thought I'd ask here. (and perhaps start a good discussion on how it should be handled)
I'm writing a Node.js/Express application that does a series of database calls in various route handlers. I'm using the node-sqlite3 module to make the database calls. In this case the user is uploading a file, so I take it from the form POST data and save it to the filesystem (it will be moved to blob storage later) and generate some XML files for other reasons. I name these files by the id in the database to facilitate routes like '/file/:id' to GET the file later on, as well as to leverage the database to ensure I don't have name-collision issues.
I'm doing this in CoffeeScript, and I'm wrapping this model in a class, so it uses this (or @) to access these helper methods. Before, I was doing a last_insert_rowid() call to get the id of the row I just inserted, but this opens me up to potential race conditions. As it turns out, when you do an INSERT with the node-sqlite3 module, the callback it calls when it's done exposes the last inserted row id on the this object (so (err) -> fileid = @lastID). Now for some code (names changed and simplified).
uploadFile: (fileData, cb) ->
  @openConnectionIfClosed()
  @db.serialize =>
    # preserve the class this to allow calls to class methods like @saveFile
    @db.run 'insert into thing (val) values ($val)', $val: 'some_val', (err) ->
      if err
        cb err
        return
      # @ here refers to the this object provided by db.run
      fileid = @lastID
      async.waterfall [
        (async_cb) =>
          # uh oh, the @ object is the one from the db library, not my class
          # earning me a 'method undefined' error
          @saveFile fileid, async_cb
          # etc... you get the idea
        ...
      ], (err) -> cb err
  @closeConnection()
So I need to retain the class's this object to access instance methods, but I also need to get the lastID value out of the this object returned by default from the callback in the db call. Obviously, if I change the callback on the db call to a fat arrow I can access my class methods, but then @lastID returns undefined.
What's the "proper" way to achieve this? (in CoffeeScript or JavaScript, doesn't particularly matter) I think that the way I can solve this is by assigning a context variable like ctxt = # at the top of the method but obviously this isn't favorable. Any ideas?
EDIT: Oh, and I forgot to mention, I can't just name the file something else, because I also need that id to update the database. The call to async.waterfall does: save the file -> generate the XML metadata file & save it -> add the path of the XML file to the database entry retroactively. Though I could potentially use a statement to do that without completely isolated database calls; I haven't investigated that yet.
In JavaScript, if you need to preserve a particular context, it's standard practice to assign it to a variable named self, as such:
var self = this;
In CoffeeScript, the fat arrow is standard practice for cases like this, though using self is not wrong either.
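Applied to the question (a JavaScript sketch, since the asker said either language is fine; UploadModel is a hypothetical class name, saveFile is the method from the question):
UploadModel.prototype.uploadFile = function (fileData, cb) {
    var self = this; // capture the class instance
    this.db.run('insert into thing (val) values ($val)', { $val: 'some_val' }, function (err) {
        if (err) return cb(err);
        var fileid = this.lastID;   // `this` here is node-sqlite3's Statement object
        self.saveFile(fileid, cb);  // instance methods stay reachable through `self`
    });
};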
