Understanding middleware in Express - node.js

I am trying to figure out how middleware works in Express.
Whilst I understand the concept of middleware, I am confused by the middleware parameters.
Here is an example from the official docs regarding middleware:
app.use('/user/:id', function (req, res, next) {
  console.log('Request URL:', req.originalUrl)
  next()
}, function (req, res, next) {
  console.log('Request Type:', req.method)
  next()
})
In this example, I can see there are two functions that act as two middleware which are executed one after the other before this specific route is handled.
But what are the parameters passed to these functions?
Are req and res just "empty" objects?
If so how are we able to reference the property req.originalUrl?
And if not, where is that object and its properties coming from?
They also use res.send in the tutorial, so the res object also seems to have properties and not be an "empty" object.
(I understand that next is a callback argument.)

Summary
The request object represents the HTTP request and has properties for the request query string, parameters, body, HTTP headers, and so on.
The response object represents the HTTP response that an Express app sends when it gets an HTTP request.
Middleware functions are functions that have access to the request object, the response object, and the next function in the application’s request-response cycle. The next function is a function in the Express router which, when invoked, executes the middleware succeeding the current middleware.
Routes can have chained methods attached (for GET, POST and DELETE requests) that take middleware functions as arguments.
The request object is the data initially received from the request, which can be modified as it passes through various middleware functions, and the response object is the data sent out.
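To make the summary concrete, here is a minimal sketch in plain JavaScript. It is a simplified model, not Express's actual implementation: runChain, fakeReq and fakeRes are hypothetical names invented for illustration. It shows that every function in the chain receives the same req and res objects, and that calling next() hands control to the following function.

```javascript
// Simplified model of a middleware chain (not Express's real code).
// runChain is a hypothetical helper invented for this illustration.
function runChain(middlewares, req, res) {
  let index = 0;
  function next() {
    const fn = middlewares[index++];
    if (fn) fn(req, res, next); // the same req and res objects every time
  }
  next();
}

// Stand-ins for the objects Express would build from the HTTP request.
const fakeReq = { method: 'GET', originalUrl: '/user/42' };
const fakeRes = {
  body: null,
  send(body) { this.body = body; return this; },
};

const calls = [];
runChain([
  (req, res, next) => { calls.push('url: ' + req.originalUrl); next(); },
  (req, res, next) => { calls.push('type: ' + req.method); next(); },
  (req, res) => { res.send('done'); }, // ends the cycle, so no next()
], fakeReq, fakeRes);

console.log(calls);        // [ 'url: /user/42', 'type: GET' ]
console.log(fakeRes.body); // done
```

In a real app Express builds req and res from the incoming HTTP request before the first middleware runs, which is why properties such as req.originalUrl are already populated.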
Example Middleware
Below is an example middleware function you can copy and paste at the beginning of your app:
/**
* An example of a middleware function that logs various values of the Express() request object.
*
* @constant
* @function
* @param {object} req - The req object represents the HTTP request and has properties for the request query string, parameters, body, HTTP headers, and so on. In this documentation and by convention, the object is always referred to as req (and the HTTP response is res) but its actual name is determined by the parameters to the callback function in which you’re working.
* @param {object} res - The res object represents the HTTP response that an Express app sends when it gets an HTTP request. In this documentation and by convention, the object is always referred to as res (and the HTTP request is req) but its actual name is determined by the parameters to the callback function in which you’re working.
* @param {Function} next - `next` is used as an argument in the middleware function, and subsequently invoked in the function with `next()`, to indicate the application should "move on" to the next piece of middleware defined in a route's chained method.
* @see {@link https://expressjs.com/en/4x/api.html#req|Express Request}
* @see {@link https://expressjs.com/en/4x/api.html#res|Express Response}
* @see {@link http://expressjs.com/en/guide/writing-middleware.html|Writing Middleware}
*/
const my_logger = (req, res, next) => {
  console.log("req.headers: ");
  console.log(req.headers);
  console.log("req.originalUrl: " + req.originalUrl);
  console.log("req.path: " + req.path);
  console.log("req.hostname: " + req.hostname);
  console.log("req.query: " + JSON.stringify(req.query));
  console.log("req.route: " + JSON.stringify(req.route));
  console.log("req.secure: " + JSON.stringify(req.secure));
  console.log("req.ip: " + req.ip);
  console.log("req.method: " + req.method);
  console.log("req.params:");
  console.log(req.params);
  console.log("==========================");
  // if next() is not invoked below, my_logger is the only middleware that will run and the request will hang
  next();
}
// called for all requests
app.use(my_logger);
Example Routes
Below are some example routes.
The routes have chained methods attached that take middleware functions as arguments.
// some example routes
app.route("/api/:api_version/pages")
  .get(api_pages_get);
app.route("/api/:api_version/topics")
  .get(api_topics_get)
  .post(api_login_required, api_topics_post)
  .delete(api_login_required, api_topics_delete);
app.route("/api/:api_version/topics/ratings")
  .post(api_login_required, api_topics_ratings_post);
Using next() in a middleware function
In the above example, you can see some methods have two middleware functions as arguments.
The first one, api_login_required, verifies login credentials and, if successful, calls next() which prompts the next middleware function to run.
It looks like this:
const api_login_required = (req, res, next) => {
  // req.user exists if the user's request was previously verified, it is produced elsewhere in the code
  if (req.user) {
    next();
  } else {
    return res.status(401).json({ message: 'Unauthorized user!' });
  }
}
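The guard above relies on some earlier middleware having set req.user. As a rough sketch of how that could fit together, the following runs both steps against stub objects, so it works without Express. attach_user, known_tokens and the x-api-token header are hypothetical, invented for illustration; a real application would more likely verify a session or a signed token.

```javascript
// Hypothetical token table, for illustration only.
const known_tokens = new Map([['secret-token', { name: 'alice' }]]);

// Hypothetical earlier middleware: sets req.user when the token is known.
const attach_user = (req, res, next) => {
  const user = known_tokens.get(req.headers['x-api-token']);
  if (user) {
    req.user = user;
  }
  next(); // always continue; the guard decides whether to reject
};

// The same guard as above.
const api_login_required = (req, res, next) => {
  if (req.user) {
    next();
  } else {
    return res.status(401).json({ message: 'Unauthorized user!' });
  }
};

// Minimal stand-in for the Express response object.
const make_res = () => ({
  statusCode: 200,
  body: undefined,
  status(code) { this.statusCode = code; return this; },
  json(payload) { this.body = payload; return this; },
});

// Run both middleware in order against a stub request.
const try_request = (headers) => {
  const req = { headers };
  const res = make_res();
  let reached_handler = false;
  attach_user(req, res, () => {
    api_login_required(req, res, () => { reached_handler = true; });
  });
  return { reached_handler, status: res.statusCode };
};

const allowed = try_request({ 'x-api-token': 'secret-token' });
const denied = try_request({});
console.log(allowed); // { reached_handler: true, status: 200 }
console.log(denied);  // { reached_handler: false, status: 401 }
```

Because the guard ends the cycle itself with a 401, it does not call next() in the rejected case, and the route handler is never reached.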
Middleware without next()
However, the get() method attached to the route handler for /api/:api_version/pages only has a single middleware function argument: api_pages_get.
As shown below, api_pages_get does not call next() because no middleware functions need to run after it.
It uses the send() and json() methods of the response object to return a response.
const api_pages_get = async (req, res) => {
  var page_title = req.query.page_title;
  var collection = mongo_client.db("pages").collection("pages");
  var query = { page_title: page_title };
  var options = { projection: { page_title: 1, page_html: 1 } };
  try {
    // pass options so that the projection is actually applied
    var page = await collection.findOne(query, options);
    // if there is no result
    if (page === null) {
      res.status(404).send('404: that page does not exist');
      return;
    }
    // if there is a result
    res.json(page);
  } catch (err) {
    console.log("api_pages_get() error: " + err);
    // report the failure with an error status rather than the default 200
    res.status(500).send('500: could not retrieve that page');
  }
}
Notes on middleware
Some other notes I've previously written for my own reference that may help:
Middleware, or middleware functions, have access to the Express request and response objects and are passed as arguments to a route's chained method (or on all requests if passed as an argument to an instance of the use() method defined early in your code).
next is used as an argument in the middleware function, and subsequently invoked in the function with next(), to indicate the application should "move on" to the next piece of middleware defined in a route's chained method.
If a middleware function does not invoke next(), it will not move on to the next piece of middleware defined in a route or method handler.
Additionally, if next() is not invoked and the function does not end the cycle by sending a response, the request will stay in a "hanging" state.

Are req and res just "empty" objects?
No, req and res are never empty. The same two objects are passed to every middleware in the chain, so any modification you make to req or res persists into all subsequent middleware.
You can see all the available fields in the Express API documentation for the request object and the response object.
You can access req and res at any point in a middleware. If you wish to end the request-response cycle, use the response object to send a response, for example res.sendStatus(200). This ends the cycle, and you do not need to call next().
But what parameters are passed to these functions?
You don't need to pass any parameters yourself. Express always passes req, res and next to every middleware you define. This is the signature Express uses, and all middleware should follow it.
Note that if you don't end the request-response cycle, you must call next(), which passes control to the next middleware. If a middleware neither ends the cycle nor calls next(), the request will hang and will probably time out on the client side.

If I understand you correctly, the confusing part is the objects passed to the middleware functions? The docs you've linked already explain those (see below).
"But what parameters are passed to these functions?"
Middleware functions are functions that have access to the request object (req), the response object (res), and the next middleware function in the application’s request-response cycle. The next middleware function is commonly denoted by a variable named next.
(Source)
"Are req and res just "empty" objects? If so, how are we able to reference the property req.originalUrl? And if not, where is that object and its properties coming from?"
If you follow the links, you'll find the following explanation for the request object:
The req object represents the HTTP request and has properties for the request query string, parameters, body, HTTP headers, and so on.
(Source)
The originalUrl property mentioned in your question is a property of the req object.
and the response object:
The res object represents the HTTP response that an Express app sends when it gets an HTTP request.
send is a method of the res object that sends an HTTP response.
(Source)

Related

What does this app.use() function do here:

I have some code from a tutorial and I am trying to understand it. I can't figure out the purpose of this middleware function:
app.use((req, res, next) => {
  res.locals.path = req.path;
  next();
});
res.locals doc says:
An object that contains response local variables scoped to the request, and therefore available only to the view(s) rendered during that request / response cycle (if any).
This middleware copies the path part of the request URL onto the res.locals object and then calls the next middleware.
You can later read res.locals.path in your controller or view to get that value.
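As a small sketch of that flow, the snippet below runs the middleware from the question and then a later handler against stub objects. set_path and show_path are hypothetical names invented for illustration, and the stub res only imitates the one property and one method used here; in a real app Express creates res.locals for you.

```javascript
// The middleware from the question, given a name so it can be called directly.
const set_path = (req, res, next) => {
  res.locals.path = req.path;
  next();
};

// Hypothetical later handler that reads the stored value.
const show_path = (req, res) => {
  res.send('You requested ' + res.locals.path);
};

// Stand-ins for the Express request and response objects.
const req = { path: '/about' };
const res = {
  locals: {},
  body: undefined,
  send(body) { this.body = body; return this; },
};

// Run the middleware, then the handler, as the chain would.
set_path(req, res, () => show_path(req, res));
console.log(res.body); // You requested /about
```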

Built in Middleware function

Consider built-in middleware such as app.use(express.json()) and route handlers such as app.get('/', (req, res) => res.send('Hello')).
I have heard these called middleware functions too, though I don't know if "built-in middleware function" is the best term for them. Do they terminate the cycle, or do they automatically invoke next() to pass control to the next middleware function?
Neither app.get nor app.use is a middleware function. Middleware functions are the callbacks you pass to app.get, app.use, and similar methods.
In express, middleware functions have a predefined signature, either
function(req: express.Request, res: express.Response, next: express.NextFunction)
or
function(err: Error, req: express.Request, res: express.Response, next: express.NextFunction)
depending on whether it is normal or error middleware.
Any function that operates on an incoming request and has one of the above signatures can be called a middleware function.
There are only two things you can do in a middleware function: either send a response to the requester, or call next to pass the request to the next middleware (or forget to do either and wonder why the client seems stuck).
So yes, things like body parsers, loggers, and session handlers do call next; otherwise your own request handlers would never be executed (assuming they are registered after the mentioned middleware functions).
Callbacks passed to app.get and similar methods are also middleware functions, so you can call next within them and continue with another handler. For example:
// these are global middleware functions - they process every request
app.use(middleware1)
app.use(middleware2)
app.use(middleware3)
// functions middleware4 and middleware5 are executed only when GET /
// request is received
app.get('/', middleware4, middleware5, (req, res) => {
  return res.send({})
})
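The difference between the two signatures can be illustrated with a simplified model. Express identifies error-handling middleware by its four declared parameters; the dispatcher below imitates that idea with Function.length. runStack is a hypothetical helper written for this sketch, not Express's real code.

```javascript
// Simplified model of a chain that treats 3- and 4-argument functions
// differently. runStack is hypothetical; it is not Express's real code.
function runStack(stack, req, res) {
  let index = 0;
  function next(err) {
    const fn = stack[index++];
    if (!fn) return;
    const isErrorHandler = fn.length === 4;
    if (err && isErrorHandler) return fn(err, req, res, next);
    if (!err && !isErrorHandler) return fn(req, res, next);
    next(err); // wrong kind of function for the current state: skip it
  }
  next();
}

const log = [];
const res = {
  statusCode: 200,
  body: undefined,
  status(code) { this.statusCode = code; return this; },
  send(body) { this.body = body; return this; },
};

runStack([
  (req, res, next) => { log.push('first'); next(new Error('boom')); },
  (req, res, next) => { log.push('skipped'); next(); },
  (err, req, res, next) => {
    log.push('error handler: ' + err.message);
    res.status(500).send('Something went wrong');
  },
], {}, res);

console.log(log);            // [ 'first', 'error handler: boom' ]
console.log(res.statusCode); // 500
```

Once next() is called with an error, the normal middleware in between is skipped and control jumps to the first four-argument function.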
They automatically invoke the next() method if they execute successfully; otherwise they raise an error.
Middleware functions are functions that have access to the request object (req), the response object (res), and the next middleware function in the application’s request-response cycle. The next middleware function is commonly denoted by a variable named next.
As the name suggests, middleware sits in the middle of something, namely the request-response cycle.
Middleware has access to the request and response objects.
Middleware has access to the next function of the request-response cycle.
Middleware functions can perform the following tasks:
Execute any code.
Make changes to the request and the response objects.
End the request-response cycle.
Call the next middleware in the stack.
If the current middleware function does not end the request-response cycle, it must call next() to pass control to the next middleware function. Otherwise, the request will be left hanging.

Call Express router manually

Hello! I am looking to call a function which has been passed to an expressRouter.post(...) call.
This expressRouter.post(...) call is occurring in a file which I am unable to modify. The code has already been distributed to many clients and there is no procedure for me to modify their versions of the file. While I have no ability to update this file for remote clients, other developers are able to. I therefore face the issue of this POST endpoint's behaviour changing in the future.
I am also dealing with performance concerns. This POST endpoint expects req.body to be a parsed JSON object, and that JSON object can be excessively large.
My goal is to write a GET endpoint which internally activates this POST endpoint. The GET endpoint will need to call the POST endpoint with a very large JSON value, which has had URL query params inserted into it. The GET's functionality should always mirror the POST's functionality, including if the POST's functionality is updated in the future. For this reason I cannot copy/paste the POST's logic. Note also that the JSON format will never change.
I understand that the issue of calling an expressjs endpoint internally has conventionally been solved by either 1) extracting the router function into an accessible scope, or 2) generating an HTTP request to localhost.
Unfortunately, in my case neither of these options is viable:
I can't move the function into an accessible scope as I can't modify the source, nor can I copy-paste the function as the original version may change
Avoiding the HTTP request is a high priority due to performance considerations. The HTTP request will require serializing+deserializing an excessively large JSON body, re-visiting a number of authentication middlewares (which require waiting for further HTTP requests + database queries to complete), etc
Here is my (contrived) POST endpoint:
expressRouter.post('/my/post/endpoint', (req, res) => {
  if (!req.body.hasOwnProperty('val'))
    return res.status(400).send('Missing "val"');
  return res.status(200).send(`Your val: ${req.body.val}`);
});
If I make a POST request to localhost:<port>/my/post/endpoint I get the expected error or response based on whether I included "val" in the JSON body.
Now, I want to have exactly the same functionality available, but via GET, and with "val" supplied in the URL instead of in any JSON body. I have attempted the following:
expressRouter.get('/my/get/endpoint/:val', (req, res) => {
  // Make it seem as if "val" occurred inside the JSON body
  let fakeReq = {
    body: {
      val: req.params.val
    }
  };
  // Now call the POST endpoint
  // Pass the fake request, and the real response
  // This should enable the POST endpoint to write data to the
  // response, and it will seem like THIS endpoint wrote to the
  // response.
  manuallyCallExpressEndpoint(expressRouter, 'POST', '/my/post/endpoint', fakeReq, res);
});
Unfortunately I don't know how to implement manuallyCallExpressEndpoint.
Is there a solution to this problem which excludes both extracting the function into an accessible scope, and generating an HTTP request?
This seems possible, but it may make more sense to modify req and pass it along rather than create a whole new fakeReq object. What enables this appears to be the router.handle(req, res, next) function. As far as I can tell it is not part of the documented Express API, so it may change between versions, and I'm not sure this is the smartest way to go about it, but it certainly avoids the large overhead of a separate HTTP request!
app.get('/my/get/endpoint/:val', (req, res) => {
  // Modify `req`, don't create a whole new `fakeReq`
  req.body = {
    val: req.params.val
  };
  manuallyCallExpressEndpoint(app, 'POST', '/my/post/endpoint', req, res);
});
let manuallyCallExpressEndpoint = (router, method, url, req, res) => {
  req.method = method;
  req.url = url;
  router.handle(req, res, () => {});
};
How about a simple middleware?
function checkVal(req, res, next) {
  // req.body may be undefined on the GET route, so guard before reading it
  const val = req.params.val || (req.body && req.body.val)
  if (!val) {
    return res.status(400).send('Missing "val"');
  }
  return res.status(200).send(`Your val: ${val}`);
}
app.get('/my/get/endpoint/:val', checkVal)
app.post('/my/post/endpoint', checkVal)
This code isn't tested, but it gives you a rough idea of how the same code can run in both places.
The checkVal function serves as an Express handler, taking the request, response and next arguments. It checks the route params first, then the body.

Error: Can't set headers after they are sent because of res.?

I'm trying to set up a method that is called with Shopify's webhook. I get the data and I'm able to store with a fresh server but I get "Error: Can't set headers after they are sent" returned in the console. I believe this is because I'm calling res twice. Any ideas on how to structure this better?
This is my method:
function createProductsWebHook(req, res, next) {
  //if(req.headers){
  //  res.status(200).send('Got it')
  //  return next()
  // }
  res.sendStatus(200)
  next()
  const productResponse = req.body
  console.log(productResponse)
  const product = Product.build({
    body_html: req.body.body_html,
    title: req.body.title,
  });
  product.save()
    .then(saveProduct => res.json(saveProduct))
    .catch((e) => {
      console.log(e)
    });
}
This occurs because the middleware, createProductsWebHook(), is called first when a request is received and sends a 200 status code response with res.sendStatus(200). Then, in the same middleware function, product.save().then(...) is called. The callback passed to then() attempts to send a second response using res.json(saveProduct), after one has already been sent by the very same middleware.
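One possible restructuring, sketched under assumptions, is to acknowledge the webhook exactly once and then do the remaining work without touching res again. The Product object and the counting res below are hypothetical stand-ins so the snippet runs on its own. The call to next() is dropped on the assumption that no later middleware needs to run for this route.

```javascript
// Hypothetical stand-in for the Product model from the question.
const Product = {
  build(fields) {
    return { ...fields, save: () => Promise.resolve(fields) };
  },
};

// Stub response that counts how many times a response is sent.
const res = {
  sent: 0,
  statusCode: undefined,
  sendStatus(code) { this.sent += 1; this.statusCode = code; return this; },
};

// Acknowledge the webhook once, then do the work without using res again.
function createProductsWebHook(req, res) {
  res.sendStatus(200); // the only response for this request

  const product = Product.build({
    body_html: req.body.body_html,
    title: req.body.title,
  });

  return product.save()
    .then((saved) => console.log('saved product:', saved.title))
    .catch((e) => console.log(e));
}

const req = { body: { body_html: '<p>Mug</p>', title: 'Mug' } };
createProductsWebHook(req, res);
console.log(res.sent); // 1
```

The alternative is to remove the early res.sendStatus(200) and send a single response from inside then(), if the caller should wait for the save to finish.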
Key Takeaway
As a rule of thumb, middleware should not send the final response; doing so defeats its purpose. Middleware's job is to decorate a request or response (add or remove information such as headers, renew an auth session asynchronously, perform side effects, and so on) and pass it along, like a chain of responsibility, not to transmit it. Transmitting is the job of your route handler, the one you registered your HTTP path and method with, e.g. app.post(my_path, some_middleware, route_handler).

express js - How is one http request different from other?

I've been working on creating a better architecture for a REST API in Express and Node. Let's say I have 3 methods in my route middleware:
router.post('/users/:id', [
  UserService.getUserById,
  UserController.setUser,
  MailerService.sendSubscriptionMail
]);
I am setting up req.session.user in call to UserService.getUserById and then using that to set req.session.result in UserController.setUser. Now I am sending a mail to this user using the data stored in req.session.result.
UserService -
exports.getUserById = function(req, res, next) {
  .
  .
  req.session.user = data;
  .
  .
};
module.exports = exports;
UserController -
exports.setUser = function(req, res, next) {
  .
  .
  req.session.result = req.session.user;
  .
  .
};
module.exports = exports;
MailerService -
exports.sendSubscriptionMail = function(req, res, next) {
  // Using req.session.result here to send email
};
module.exports = exports;
Now I have two questions regarding above process -
(a) Is there any chance that a new HTTP request to another route (which also has methods of this kind that can modify req.session) could modify req.session.result, so that MailerService.sendSubscriptionMail does not get the data it needs to send to the user? Or will that req object be completely different from this one in memory?
(b) Is there any other way to transfer data between middleware besides setting properties on the req object?
Is there any chance that a new HTTP request to another route (which also has methods of this kind that can modify req.session) could modify req.session.result, so that MailerService.sendSubscriptionMail does not get the data it needs to send to the user? Or will that req object be completely different from this one in memory?
The req object is specific to this request. That object cannot be changed by another request. But if the session in req.session is a shared session object common to all requests from that particular user, then req.session.result could be changed by another request from that user that is being processed at around the same time (e.g. interleaved within various async operations).
If you want to make sure that no other request from this user could change your result, then put it in req.result, not req.session.result because no other requests will have access to req.result.
Is there any other way to transfer data between middleware besides setting properties on the req object?
The req or res objects are the right places to share info among middleware handlers for the same request as they are unique to this particular request. Be careful with the session object because it is shared among multiple requests from the same user.
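The difference can be modelled with plain objects. This is only a sketch of the behaviour described above, not a real session store: shared_session and make_req are hypothetical names, and the two requests are assumed to come from the same user and to be processed at overlapping times.

```javascript
// One session object per user, shared by all of that user's requests.
// This models the behaviour described above; it is not a real session store.
const shared_session = {};

// Each incoming request gets its own req object but the same session.
const make_req = (id) => ({ id, session: shared_session });

const req_a = make_req('a');
const req_b = make_req('b');

// Request A stores its result in both places.
req_a.result = 'result for A';
req_a.session.result = 'result for A';

// Request B, interleaved with A, does the same.
req_b.result = 'result for B';
req_b.session.result = 'result for B';

console.log(req_a.result);         // result for A (isolated per request)
console.log(req_a.session.result); // result for B (overwritten by request B)
```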
Another possible way to share the data among your three handlers is to make a single middleware handler that calls all three of your middleware handlers and then share the data within that single function (e.g. passing it to the other handlers).
For example, you could change the calling signature of your last two methods so that you can get the data out of setUser() and then pass it directly to sendSubscriptionMail() without using the req or session objects to store it.
router.post('/users/:id', [UserService.getUserById, function(req, res, next) {
  UserController.setUser(req, res, function(err, result) {
    if (err) return next(err);
    MailerService.sendSubscriptionMail(result, req, res, next);
  });
}]);

Resources