expressjs repo documentation - node.js

I want to understand the internal working of Expressjs (just curious). Much of the thing are clear but I am not able to understand the chaining of routing and middleware. How expressjs add all the route and middleware to path / and how it keep the stack of route with middleware internally
So I will be very thankful to you if you provide some documentation or link from where I get the understanding how expressjs work internally
Thanks

ExpressJS is just an HTTP server allowing you to route and manipulate the received requests and returning a response.
So if you carefully look at the HTTP packet format below,
you can find the method and the path in the first line.
So, basically here express-router has a regex matcher that tries to match the HTTP request it receives to the predefined routes declared in the express application.
If you check L:43 Router, here it shows that the route you declare is just a function containing 3 constants:
path - That would be a path to match.
stack - Following a proper format, the URL is broken down using the / separator and a stack is formed in order of it's parsing, comprising another function called layer.
methods - Methods are the HTTP methods that we declare along with the path.
Parsing
So when a request is made, let's suppose: http://localhost:8000/user/1/test
We get the path: /user/1/test
The router's handle function is executed. Then this path is broken down into layers and formed a stack: ['user', '*', 'test']
This stack is then matched with that of the route objects that are pre-declared in the application and are used as Route functional objects.
As soon as it finds the match the callback is executed!

Related

Efficient way of getting the route of request in Nest.JS

I'm writing a middleware in Nest.JS and need to get a route of the request in order to log it. Unfortunately req.route is undefined in Nest due to it's way of dealing with handlers and req.path is not good enough because it contains params in it (e.g. /user/23423). I want to get a route instead: '/user/:id'.
I can format the path with a trivial regex to get rid of the params, but doing so in the middleware on all requests feels inefficient. Is there a better way of capturing route in Nest.JS?

router handler returns an array of object but client doesn't get them in json though response with 200 status

I am implementing a express.js project with Typescript.
I have defined a enum and a interface :
export enum ProductType {
FOOD = 'food',
CLOTH = 'cloth',
TOOL = 'tool'
}
export interface MyProduct {
type: ProductType;
info: {
price: number;
date: Date;
};
}
One of my router handler needs to return an array of MyProduct to client. I tried this :
const productArr: MyProduct[] = // call another service returns an array of MyProduct
app.get('/products', (req, res) => {
res.status(200).send({products: productArr});
});
I use Postman tested this endpoint, it responses with status 200 but with a default HTML page instead of the array of objects in JSON.
What do I miss? Is it because express.js can't automatically parse the enum and interface to json object??
P.S. I have set up json parser, so it is not about that, other endpoints work fine with json response:
const app = express();
app.use(express.json());
...
As mentioned in the comments, your code should work. I'll list some steps which can be used to try to find the problem.
Show debug info
Set DEBUG=* in your environment. DEBUG is an environment variable which controls logging for many Node modules. You'll be able to see the flow of a request through Express. If there is too much info, you can limit the output like so: DEBUG=*,-babel,-babel:*,-nodemon,-nodemon:*,-router:layer,-follow-redirects,-send (use a comma-separated list and put a - in front of any module you'd like to exclude)
This should help you trace the life of a request through the various routers and routes. You're now in a position to...
Check for another route that is short-circuiting the request
The fact that you're seeing an HTML page when the Express route is sending an object might indicate that your request is matching a different route. Look for catch-all routes such as non-middleware app.use() or wildcard routes which appear ABOVE your route.
Other suggestions
Don't explicitly set the status
Adding .status(200) is more code and unnecessary.
Use res.json()
Use .json() instead of .send(). If will always add the Content-Type: application/json header, whereas .send() will not when it cannot determine the content type (e.g. .send(null) or .send('hello') will not set the Content Type header to application/json, which may confuse clients).
As there is a lack of full response headers and server environment, assuming you are using AWS service with reverse proxy. So, there might be few possibilities listed here that need to look upon :
If router handler returns an array of object but client doesn't get them in json though response with 200 status then there might be a reverse proxy acting as a backend server, serving default content with status code 200 for unknown routes from the client. So in this scenario, you need to whitelist a new route in your reverse proxy server, assuming you are using AWS Amplify for API rewrite and redirects then you need to whitelist this route in your AWS amplify settings, or else it will serve the default content like it is happening in current scenrio.
If issue still persists then :
Make sure you have proper CORS specification on your server.
Make sure productArr is an array returned by service, because if some service returns this value - it might be an unresolved promise. So, proper test cases will help you out here or for debugging purposes set DEBUG=* in your environment and make sure it should return value as expected.
Check for another route that is short-circuiting the request: The fact that you're seeing an HTML page when the Express route is sending an object might indicate that your request is matching a different route. Look for catch-all routes such as non-middleware app.use() or wildcard routes that appear above your route.

What is the difference between a route handler and middleware function in ExpressJS?

My understanding is that a middleware function is a route handler with the exception that it may invoke the next function parameter to pass control on to the middleware function on the stack. Is this the only difference between a standard route handler and a middleware function?
Most of what you're talking about is semantics. In ExpressJS, middleware can be a route handler or a route handler can behave as middleware. So, there is not a hard and fast line between the two. But, when people refer to middleware or a route handler in a programming discussion, they usually mean something slightly different for each...
Middleware
As generic terms, middleware is code that examines an incoming request and prepares it for further processing by other handlers or short circuits the processing (like when it discovered the user is not authenticated yet). Some examples:
Session Management. Parse cookies, look for session cookie, lookup session state for that cookie and add session info to the request so that other handlers down the line have ready access to the session object without any additional work on their part. In express, this would be express.session().
Authentication. Check if the user is trying to access a portion of the site that requires authentication. If so, check if their authentication credentials are good. If not, send an error response and prevent further processing. If so, allow further processing.
Parsing of Cookies. Parse incoming cookies into an easy-to-use data structure a request handler can have easy access to cookie data without each having to parse them on their own. This type of middleware is built into Express and happens automatically.
Parsing and Reading of POST/PUT bodies. If the incoming request is a POST or PUT, the body of the request may contain data that is needed for processing the request and needs to be read from the incoming stream. Middleware can centralize this task reading the body, then parsing it according to the data type and putting the result into a known request parameter (in Express this would be req.body). Express has some ready-to-use middleware for this type of body parsing with express.json() or express.urlencoded(). A middleware library like multer is for handling file uploads.
Serve static files (HTML, CSS, JS, etc...). For some groups of URLs, all the server needs to do is to serve a static file (no custom content added to the file). This is common for CSS files, JS files and even some HTML files. Express provides middleware for this which is called express.static().
Route Handler
As a generic term, a route handler is code that is looking for a request to a specific incoming URL such as /login and often a specific HTTP verb such as POST and has specific code for handling that precise URL and verb. Some examples:
Serve a specific web page. Handle a browser request for a specific web page.
Handle a specific form post. For example, when the user logs into the site, a login for is submitted to the server. This would be handled by a request handler in Express such as app.post("/login", ...).
Respond to a specific API request. Suppose you had an API for a book selling web-site. You might provide in that API the ability to get info on a book by its ISBN number. So, you design an api that supports a query for a particular book such as /api/book/list/0143105426 where 0143105426 is the ISBN number for the book (a universal book identifier). In that case, you'd create a request handler in Express for a URL that looks like that: app.get('/api/book/list/:isbn', ...). The request handler in Express could then programmatically examine req.parms.isbn to get the request isbn number, look it up in the database and return the desired info on the book.
So, those are somewhat generic descriptions of middleware vs. request handlers in any web server system in any language.
In Express, there is no hard and fast distinction between the two. Someone would generally call something middleware that examines a bunch of different requests and usually prepares the request for further processing. Someone would generally call something a route handler that is targeted at a specific URL (or type of URL) and whose main purpose is to send a response back to the client for that URL.
But, the way you program Express, the distinction is pretty blurry. Express offers features for handling routes such as:
app.use()
app.get()
app.post()
app.put()
app.delete()
app.all()
Anyone of these can be used for either middleware or a route handler. Which would would call a given block of code has more to do with the general intent of the code than exactly which tools in Express it uses.
More typically, one would use app.use() for middleware and app.get() and app.post() for route handlers. But there are use cases for doing it differently than that as it really depends upon the particular situation and what you're trying to do.
You can even pass more than one handler to a given route definition where the first one is middleware and followed by a route handler.
app.get("/admin", verifyAuth, (req, res) => {
// process the /admin URL, auth is already verified
req.sendFile("...");
});
It is common for middleware to be active for a large number of different requests. For example, you might have an authentication middleware that prevents access to 95% of the site if the user isn't already logged in (say everything except the a few generally information pages such as the homepage and the login and account creation pages).
It is also common to have middleware that is active for all HTTP verbs such as GET, POST, DELETE, PUT, etc... In express, this would usually be app.use() or app.all(). Request handlers are usually only for one particular verb such as app.get() or app.post().
You might have session middleware that loads a session object (if one is available) for every single request on the site and then passes control on to other handlers that can, themselves, decide whether they need to access the session object or not.
It is common for request handlers to be targeted at a specific URL and only active for that specific URL. For example, the /login URL would typically have one request handler that renders that particular page or responds to login form requests.
Path Matching for app.use() is Different
In Express, there's one other subtle difference. Middleware is typically specified with:
app.use(path, handler);
And, a route is typically specified with:
app.get(path, handler);
app.post(path, handler);
app.put(path, handler);
// etc...
app.use() is slightly more greedy than app.get() and the others in how it matches the path. app.get() requires a full match. app.use() is OK with a partial match. Here are some examples:
So, for a URL request /category:
app.use("/category", ...) matches
app.get("/category", ...) matches
For a URL request /category/fiction:
app.use("/category", ...) matches
app.get("/category", ...) does not match
You can see that app.use() accepts a partial URL match, app.get() and it's other cousins do not accept a partial URL match.
Now, of course, you can use app.get() for middleware if you want and can use app.use() for request handlers if you want, but typically one would use app.use() for middleware and app.get() and its cousins for request handlers.
Ok, let's explore some definitions first.
Middleware function
definition from the express documentation:
Middleware functions are functions that have access to the request object (req), the response object (res), and the next function in the application’s request-response cycle. The next function is a function in the Express router which, when invoked, executes the middleware succeeding the current middleware.
Handler function or callback function
definition from express Documentation:
These routing methods specify a callback function (sometimes called “handler functions”) called when the application receives a request to the specified route (endpoint) and HTTP method. In other words, the application “listens” for requests that match the specified route(s) and method(s), and when it detects a match, it calls the specified callback function.
Here Routing methods are the methods derived from HTTP requests and the all() method that matches all the HTTP methods. But remind, the use() method is not a routing method. Let's go to express documentation for clarification.
Routing Methods
From express documentation:
A route method is derived from one of the HTTP methods and is attached to an instance of the express class. Express supports methods that correspond to all HTTP request methods: get, post, and so on. There is a special routing method, app.all(), used to load middleware functions at a path for all HTTP request methods.
So we see that middleware functions and handler functions are not opposite of each other.
Middleware functions are functions that take an extra argument, next along with req and res which is used to invoke the next middleware function.
app.use() and the other HTTP derived methods(app.get(), app.all() etc) can use middleware functions. The difference between handler function and middleware function is same on both app and router object.
On the other hand, a handler function is a function that is specified by the routing methods.
So when any of the routing methods pass a function that has req, res, and next then it is both a middleware function and a handler function.
But for app.use() if the function has req, res, and next, then it is only a middleware function.
Now, what about the functions which only have req and res don't have the next argument?!
They are not middleware functions by definition.
If such function is used on routing methods then they are only handler functions. We use such a handler function which is not a middleware when it is the only one callback function. Because in such cases there is no need for next which calls the next middleware function.
If they are used on app.use() then they are not middleware, nor handler function.
Ok, enough of the definitions, But is there any difference between middleware functions of app.use() and handler & middleware functions of routing methods??
They look similar since both have the next argument and works almost the same. But there is a subtle difference.
The next('route') works only on handler & middleware functions. So app.use() can not invoke next('route') since they can have only middleware functions.
According to express Documentation:
next('route') will work only in middleware functions that were loaded by using the app.METHOD() or router.METHOD() functions.
METHOD is the HTTP method of the request that the middleware function handles (such as GET, PUT, or POST) in lowercase.
If you know what is next('route') then the answer is finished here for you :).
In case you don't, you can come along.
next('route')
So let's see what is next('route').
From here we will use the keyword METHOD instead of get, post, all, etc.
From the previous middleware function definition, we see app.use() or app.METHOD() can take several middleware functions.
From the express documentation:
If the current middleware function does not end the request-response cycle, it must call next() to pass control to the next middleware function. Otherwise, the request will be left hanging.
We see each middleware functions have to either call the next middleware function or end the response.
But sometimes in some conditions, you may want to skip all the next middleware functions for the current route but also don't want to end the response right now. Because maybe there are other routes which should be matched. So to skip all the middleware functions of the current route without ending the response, you can run next('route'). It will skip all the callback functions of the current route and search to match the next routes.
For Example (From express documentation):
app.get('/user/:id', function (req, res, next) {
// if the user ID is 0, skip to the next route
if (req.params.id === '0') next('route')
// otherwise pass the control to the next middleware function in this stack
else next()
}, function (req, res, next) {
// send a regular response
res.send('regular')
})
// handler for the /user/:id path, which sends a special response
app.get('/user/:id', function (req, res, next) {
res.send('special')
})
See, here in a certain condition(req.params.id === '0') we want to skip the next callback function but also don't want to end the response because there is another route of the same path parameter which will be matched and that route will send a special response. (Yeah, it is valid to use the same path parameter for the same METHOD several times. In such cases, all the routes will be matched until the response ends). So in such cases, we run the next('route') and all the callback function of the current route is skipped. Here if the condition is not met then we call the next callback function.
This next('route') behavior is only possible in the app.METHOD() functions.
Recalling from express documentation:
next('route') will work only in middleware functions that were loaded by using the app.METHOD() or router.METHOD() functions.
That means this next('route') can be invoked only from handler & middleware functions. The only middleware functions of app.use() can not invoke it.
Since skipping all callback functions of the current route is not possible in app.use(), we should be careful here. We should only use the middleware functions in app.use() which need not be skipped in any condition. Because we either have to end the response or traverse all the callback functions from beginning to end, we can not skip them at all.
You may visit here for more information

Node and Express, getting all (custom) methods with .all()

In Node and Express, I'm trying to get all traffic sent to a URL like this.
APP.all('/testCase', function(req, res) {
console.log('Im called with the method: ' + req.method);
});
If I now do:
curl -X GET http://localhost:3000/testCase it works fine, I get the response: Im called with the method: GET
But when I do:
curl -X INSERT http://localhost:3000/testCase I'm getting: curl: (52) Empty reply from server
What Am I doing wrong? I will have many custom methods
The INSERT method is not supported by the node http parser. To see a list of the HTTP methods supported, you can run node -pe "require('http').METHODS". In order to support custom HTTP methods, one would have to patch core itself (specifically the http parser).
app.all(path, callback [, callback ...])
This method is like the standard app.METHOD() methods, except it
matches all HTTP verbs.
It’s useful for mapping “global” logic for specific path prefixes or
arbitrary matches. For example, if you put the following at the top of
all other route definitions, it requires that all routes from that
point on require authentication, and automatically load a user. Keep
in mind that these callbacks do not have to act as end-points:
loadUser can perform a task, then call next() to continue matching
subsequent routes.
The most commons HTTP methods is:
GET
HEAD
POST
PUT
DELETE
TRACE
CONNECT

What is the purpose on the middleware Yeoman function implementation?

I'm new in grunt-contrib-connect and came across with this follow middleware function Yoeman implementation -
middleware: function(connect, options, middlewares) {
return [
proxySnippet,
connect.static('.tmp'),
connect().use('/bower_components', connect.static('./bower_components')),
connect.static(config.app)
];
}
What is the purpose of this implementation ?
These are connect middlewares. A middleware is a request callback function which may be executed on each request. It can either modify/end the curent request-response cycle or pass the request to the next middleware in the stack. You can learn more about middlewares from express guide.
In your code you have four middlewares in the stack. First one is for proxying current request to another server. And rest three middlewares are for serving static files from three different directories.
When a request is made to the server, it'll go through these middlewares in following order:
Check if the request should be proxied. If it is proxied to other server, then it's the end of the request/response cycle, rest three middlewares will be ignored.
If not proxied, it'll try to serve the requested file from ./tmp directory.
If the file isn't found in above, it'll look inside ./bower_components. Note that this middleware will be executed only for the requests that has `/bower_components/ in the path. e.g. http://localhost:9000/bower_components/bootstrap/bootstrap.js
Finally, if file isn't found in above two directories, it'll look for it in whatever the path is set in config.app.
That's the end of stack, after that you'll get a 404 Not found error.

Resources