Express response interception. Use body to modify the outgoing headers

Express response interception. Use body to modify the outgoing headers - node.js

I would like to try and improve site render times by making use of preload/push headers.
We have various assets which are required up front that I would like to preload, and various assets which are marked up in data attributes etc which will be required later via JS but not for initial paint. It would be good to get these flowing to the client early.
Our application is a bit of a hybrid, it uses http-proxy-middleware connected to various different applications, plus directly renders pages it self. I would like the middleware to be agnostic and work regardless of how to page is produced.
I've seen express-mung but this doesn't hold back the header so executes too late, and works with chunked buffers anyway not the entire response. Next up was express-interceptor, that works perfectly for pages rendered directly in express but causes request failures for pages run through the proxy. My next best idea is pulling apart the compression module to figure out how it works.
Does anyone have a better suggestion, or even better know of a working module for this kind of thing?
Thanks.

Related

What is the difference between server side rendering (Next.js) and Static Site rendering (Gatsby.js)?

I'm looking to create a website that does not rely on client-side JavaScript, however, I still want to use SPA features like client-side routing etc, so I am looking at using a framework that does not render on the client-side. These 2 seem to be the top options for this type of thing, however, I'm unsure as to the differences between the 2 different types of server processing.

Server side rendering is where a request is made from the client/browser to the server, and then at that point the HTML is generated on-the-fly run-time and sent back to the browser to be rendered.
Static site rendering is very similar, however the parsing is carried out during the build time instead. Therefore when a request is made the HTML is stored statically and can be sent straight back to the client.
They both have their pros and cons:
Although static sites will be faster at run-time as no server-side processing is required, it does mean that any changes to data require a full rebuild over the application server side.
Alternatively, with the server side approach, putting any caching aside, the data is processed on-the-fly and sent straight to the client.
Often the decision is best made depending on how dynamic and real-time your content must be vs how performant the application needs to be.
For example, Stackoverflow most likely uses a server-side rendering approach. There are far two many questions for it to rebuild static versions of every question page each time a new post is submitted. The data also needs to be very real-time with users being able to see posts submitted only seconds ago.
However, a blog site, or promo site, which hardly has any content changes, would benefit much more from a static site setup. The response time would be much greater and the server costs would be much lower.

How HTTP response is generated

I'm fairly new to programming and this question is about making sure I get the HTTP protocol correctly. My issue is that when I read about HTTP request/response, it looks like it needs to be in a very specific format with a status code, HTTP version number, headers, a blank line followed by the body.
However, after creating a web app with nodejs/express, I never once had to actually write code that made an HTTP response in this format (I'm assuming, although I don't know for sure that other frameworks like ruby on rails or python/Django are the same). In the express app, I just set up the route handlers to render the appropriate pages, when a request was made to that route.
Is this because express is actually putting the response in the correct HTTP format behind the scenes? In other words, if I looked at the expressJS code, would there be something in that code that actually makes an HTTP response in the HTTP format?
My confusion is that, it seems like the HTTP request/response format is so important but somehow I never had to write any code dealing with it for a node/express application. Maybe this is the entire point of a framework like express... to take out the details so that developers can deal with business logic. And if that is correct, does anyone ever write web apps without a framework to do this. Would you then be responsible for writing code that puts the server's response into the exact HTTP format?

I'm fairly new to programming and this question is about making sure I get the HTTP protocol correctly. My issue is that when I read about HTTP request/response, it looks like it needs to be in a very specific format with a status code, HTTP version number, headers, a blank line followed by the body.
Just to give you an idea, there are probably hundreds of specifications that have something to do with the HTTP protocol. They deal with not only the protocol itself, but also with the data format/encoding for everything you send including headers and all the various content types you can send, authentication schemes, caching, status codes, URL decoding, etc.... You can see some of the specifications involved just by looking here: https://www.w3.org/Protocols/.
Now a simple request and a simple text response could get away with only knowing a few of these specifications, but life is not always that simple.
Is this because express is actually putting the response in the correct HTTP format behind the scenes? In other words, if I looked at the expressJS code, would there be something in that code that actually makes an HTTP response in the HTTP format?
Yes, there would. A combination of Express and the HTTP library that is built into node.js handle all the details of the specification for you. That's the advantage of using a library/framework. They even handle different versions of the protocol and feedback from thousands of other developers have helped them to clean up edge case bugs. A good library/framework allows you to still control any detail about the response (headers, content types, status codes, etc..) without making you have to go through the detail work of actually creating the exact response. This is a good thing. It lets you write code faster and lets you ride on the shoulders of others who have already figured out minutiae details that have nothing to do with the logic of your app.
In fact, one could say the same about the TCP protocol below the HTTP protocol. No regular app developer wants to write their own TCP stack. Instead, you just want a working TCP stack that you can use that's already been tuned and debugged for you.
However, after creating a web app with nodejs/express, I never once had to actually write code that made an HTTP response in this format (I'm assuming, although I don't know for sure that other frameworks like ruby on rails or python/Django are the same). In the express app, I just set up the route handlers to render the appropriate pages, when a request was made to that route.
Yes, this is a good thing. The framework did the detail work for you. You just call res.setHeader(), res.status(), res.cookie(), res.send(), res.json(), etc... and Express makes the entire response for you.
And if that is correct, does anyone ever write web apps without a framework to do this. Would you then be responsible for writing code that puts the server's response into the exact HTTP format?
If you didn't use a framework or library of any kind and were programming at the raw TCP level, then yes you would be responsible for all the details of the HTTP protocol. But, hardly anybody other than library developers ever does this because frankly it's just a waste of time. Every single platform has at least one open source library that does this already and even if you were working on a brand new platform, you could go get an open source body of code and port it to your platform much quicker than you could write all this yourself.
Keep in mind that one of the HUGE advantages of node.js is that there's an enormous body of open source code (mostly in NPM and Github) already prepackaged to work with node.js. And, because node.js is server-side where code memory isn't usually tight and where code just comes from the local hard disk at server init time, there's little downside to grabbing a working and tested package that does what you already need, even if you're only going to use 5% of the functionality in the package. Or, worst case, clone an existing repository and modify it to perfectly suit your needs.

Is this because express is actually putting the response in the
correct HTTP format behind the scenes?
Yes, exactly, HTTP is so ubiquitous that almost all programming languages / frameworks handle the actual writing and parsing of HTTP behind the scenes.
Does anyone ever write web apps without a framework to do this. Would
you then be responsible for writing code that puts the server's
response into the exact HTTP format?
Never (unless you're writing code that needs very low level tweaking of HTTP code or something)

Server side response to allow client side routing

I am developing a single page application that has a client side router. so although the base url to run the application will be either http:://example.com or http:://example.com/index.html - skipping the domain name that is routes '/' and '/index.html'
But somewhere in my application, because of my client side router, I may call up a route something like '/appointments/20160113 and the client router will redirect me to the appropriate "Appointments Page" inside my SPA passing the parameter of todays date.
But if the user calls directly http://example.com/appointments/20160113, I am led to believe that the server should respond directly with /index.html so the browser doesn't get a 404.
I am building the server using nodejs (specifically the http2 module, but I don't think that is very important, and my examples don't use https, although with http2 they do). I just tried changing the server so if its hit with an unknown url it responds with the index.html file.
However, the browser sees this as its first response and then makes requests for the rest of its attached files relative to the url (so for instance follows up with /appointments/20160113/styles/main.css). This leads to an infinite loop, as the server responds with another copy in index.html (and immediately gets a request back for /appointments/20160113/styles/styles/main.css ).
Its too early in the lifecycle of the page for the javascript to be running yet (and specifically the router software), so clearly the approach is too simplistic.
I have also written the client side router (using the history api) so I can change things if I need to.
How should this situation be handled. I am thinking perhaps a 301 redirect to /index.html or something and then the router's initial dispatch knows this and can do a popstate or something. I ideally want to support the passing of urls via external means between users, but until I actually tried to implement it I hadn't realise the implications.

I don't know if this is the best way or not, but having not received any answers on here, I decided to try a number of different ways and see which worked out the best.
Each way involved doing a 301 redirect to /index.html, and then providing the url from which I was redirecting via different mechanisms
This is what I tried
Setting a cookie with a short expiry date the value of which was the url
Adding a query string with a ?redirect= parameter with the url
Adding a #fragment after /index.html with the url
In the end I rejected 1) because chrome wasn't deleting the cookie after I had used it and making the value shorted lived depends on accurate timing between client and server. The solution appeared too fragile.
I tried 2) and it was looking good until I came to test it. Unfortunately setting window.location.search causes a page reload, and I was really struggling with finding out what was happening. However, what I discovered in 3) about mocking could well be provided to a solution based on 2) so it is one that could be used. I still might return to this solution as it "feels" right to me.
I tried 3) and it worked quite well. However I was struggling with timing issues in testing since my router element was using the #fragment during initialisation, but I couldn't set the window.location.hash until after the router was established in the test suite. I wanted to mock window.location.hash with sinon so I could control it, but it turns out you can't
The solution to this was for the router to wrap its own calls to window.location.hash in a library, so that I could mock the library. And that is what I did in the end and it all worked.
I could go back to using a query string and wrapping window.location.search in a library call, so I could stub that call and avoid the problems of page reloading.

Serving images based on :foo in URL

I'm trying to limit data usage when serving images to ensure the user isn't loading bloated pages on mobile while still maintaining the ability to serve larger images on desktop.
I was looking at Twitter and noticed they append :large to the end of the url
e.g. https://pbs.twimg.com/media/CDX2lmOWMAIZPw9.jpg:large
I'm really just curious how this request is being handled, if you go directly to that link there is no scripts on the page so I'm assuming it's done serverside.
Can this be done using something like Restify/Express on a Node instance? More than anything I'm really just curious how it is done.

Yes, it can be done using Express in Node. However, it can't be done using express.static() since it is not a static request. Rather, the receiving function must handle the request by parsing the querystring (or whatever :large is) in order to dynamically respond with the appropriate image.
Generally the images will have already been pre-generated during the user-upload phase for a set of varying sizes (e.g. small, medium, large, original), and the function checks the querystring to determine which static request to respond with.
That is a much higher-performing solution than generating the appropriately-sized image server-side on every request from the original image, though sometimes that dynamic approach is necessary if the server is required to generate a non-finite set of image sizes.

Modifying content delivered by node-http-proxy

Due to some limitations about the web services I am proxying, I have to inject some JS code so that it allows the iframe to access the parent window and perform some actions.
I have built a proxy system with node-http-proxy which works pretty nicely. However I have spent unmeasurable hours trying to modify the content (on my own, using harmon as well, etc) that is being sent to the user without any success. I have found some articles and even some questions here but all of them are outdated and are not useful anymore.
I was wondering if someone can give me an actual example about how to do this, because I am unable to do it and maybe it is just that it is impossible to do at this point?

I haven't tried harmon, but I did try cheerio and it works.
However, I used http-mitm-proxy and not node-http-proxy.
If you are using http-mitm-proxy, you need to return a promise in the response handler. Otherwise, the proxy continues to send the original response without picking up your changes.
I have recently written another proxy at:
https://github.com/noeltimothy/noelsproxy
I'm going to add response handling to this soon. This one uses a callback mechanism, which means it wont return the response until the caller signals it to.
You should be able to use 'cheerio' and alter the content in JQuery style.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string