I'm trying to figure out how to make Google App Engine (standard environment) apply compression to the output of my Next.js/Node.js/Express application.
As far as I've gathered, the problem is that
1) Google's front end strips from the request the headers (such as Accept-Encoding) indicating that the client supports compression, and thus app.use(compression()) in server.js won't do anything. I've tried to force compression using a {filter: shouldCompress} function (see the sketch after this list), but it doesn't seem to matter, since Google's front end still returns an uncompressed result. (Locally, compression works fine.)
2) How and when Google's load balancer chooses to apply compression is a mystery to me. (And in particular, why not to my silly but large application/javascript content :))
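Roughly what I tried (a sketch; shouldCompress is the filter mentioned above, reduced to always returning true):

const express = require('express');
const compression = require('compression');

const app = express();

// Attempt to force compression regardless of what survives the proxy.
function shouldCompress(req, res) {
  return true; // skip the default content-type filter entirely
}

app.use(compression({ filter: shouldCompress }));

// Note: even with a filter that always returns true, the compression
// middleware still negotiates the actual encoding from the request's
// Accept-Encoding header, so it falls back to an uncompressed response
// when that header has been stripped upstream.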
Here's what they say in the docs:
If the client sends HTTP headers with the original request indicating that the client can accept compressed (gzipped) content, App Engine compresses the handler response data automatically and attaches the appropriate response headers. It uses both the Accept-Encoding and User-Agent request headers to determine if the client can reliably receive compressed responses.
(How Requests are Handled: Response Compression)
So there's that. I'd love to use App Engine for this project, but when index.js is 700KB instead of a compressed 200KB, it's kind of a showstopper.
As per the Request Headers and Responses documentation for Node.js, the Accept-Encoding header is removed from the request for security purposes.
Note: Entity headers (headers relating to the request body) are not sanitized or checked, so applications should not rely on them. In particular, the Content-MD5 request header is sent unmodified to the application, so may not match the MD5 hash of the content. Also, the Content-Encoding request header is not checked by the server, so if the client sends a gzipped request body, it will be sent in compressed form to the application.
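If your app does need to accept gzipped request bodies, here's a minimal sketch of inflating them yourself with Node's built-in zlib (assuming Express; illustrative, not App Engine-specific):

const express = require('express');
const zlib = require('zlib');

const app = express();

// Gzipped request bodies reach the app still compressed, so inflate
// them before any body-parsing middleware runs.
app.use((req, res, next) => {
  if (req.headers['content-encoding'] !== 'gzip') return next();
  const chunks = [];
  req.pipe(zlib.createGunzip())
    .on('data', (chunk) => chunks.push(chunk))
    .on('end', () => {
      req.body = Buffer.concat(chunks).toString('utf8');
      next();
    })
    .on('error', next);
});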
Also note this response on Google Groups, which states:
Today, we are not passing through the Accept-Encoding header, so it is not possible for your middleware to decide that it should compress. We will roll out a fix for this in the next few weeks.
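In the meantime, a quick way to check what the front end actually returns (myapp.appspot.com is a placeholder for the deployed app):

const https = require('https');

// Request a large asset with Accept-Encoding and inspect what the
// App Engine front end sends back.
https.get(
  {
    host: 'myapp.appspot.com',
    path: '/index.js',
    headers: { 'Accept-Encoding': 'gzip' },
  },
  (res) => {
    console.log('Content-Encoding:', res.headers['content-encoding']);
    console.log('Content-Length:', res.headers['content-length']);
    res.resume(); // drain the response
  }
);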
Related
Is it possible to identify the client / library which sent an HTTP request?
I am trying to fetch some data via an API. It is possible to query the API via cURL and Python, but when I try to use Node (it doesn't matter which library: axios, request, unirest, native, ...) or wget, I get a proprietary error back from the backend.
Now I am wondering: is the backend able to identify which library I am using?
More information:
The requests are exactly the same, so no way to distinguish them
The user-agent header field is set and overwritten for all requests
I already tried to monitor the traffic in Wireshark, but couldn't find any differences between the packets at the HTTP layer (only the order of the header fields differs, which according to the standard shouldn't make a difference)
It turns out that the problem was TLS fingerprinting.
See: https://httptoolkit.tech/blog/tls-fingerprinting-node-js/
Node.js uses Google's V8 JS engine, and V8-based HTTP request clients will not let you override headers that would compromise 'web safety'; so, for example, if you are setting the Origin, Host, or Referer headers, Node might refuse to do so. I had the same issue previously.
Un-opinionated HTTP clients, such as those written in C++ (curl) and Python, won't 'web safety'-check your requests, so that is what causes the difference in behavior.
In my case I used a C++ library that I called from JavaScript to make my 'unsafe' requests, and the problem was solved.
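In the same spirit, a minimal sketch of shelling out to curl from Node, so the request carries curl's TLS fingerprint rather than Node's (the URL and header here are placeholders):

const { execFile } = require('child_process');

// Let curl perform the TLS handshake and the request; Node only
// consumes the output.
execFile(
  'curl',
  ['-s', '-H', 'User-Agent: my-client/1.0', 'https://api.example.com/data'],
  (err, stdout, stderr) => {
    if (err) throw err;
    console.log(stdout);
  }
);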
I am studying Node.js and I have learned about streams and buffers. I found them very interesting, because it occurred to me to use them in an HTTP server: if a request is too big, the data can be sent little by little, especially when using the POST verb to upload big files, for example. The problem is that not all data (files, in this case) is very big. Is there a way to know the size of a request before it reaches its destination?
From the receiving end, you can look at the Content-Length header.
It may or may not be present, depending upon the sender. For info about that, see Is the Content-Length header required for a HTTP/1.0 response?.
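For example, in a plain Node HTTP server (a sketch; note the header value, when present, is a string):

const http = require('http');

http.createServer((req, res) => {
  const length = req.headers['content-length'];
  if (length !== undefined) {
    console.log(`Declared body size: ${length} bytes`);
  } else {
    // No Content-Length (e.g. a chunked upload): the size is only
    // known once the stream has been fully consumed.
    console.log('Body size unknown up front');
  }
  req.resume(); // drain the body
  req.on('end', () => res.end('ok'));
}).listen(3000);

Keep in mind the header is only the sender's declaration; a server shouldn't trust it blindly.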
I was going through this code (the Linnovate MEAN boilerplate) and saw that they use a package called compression as middleware.
On npm, compression is described as:
... The middleware will attempt to compress response bodies for all requests that traverse through the middleware, based on the given options.
What exactly does compressing the response body mean? Does it make it smaller, and wouldn't that change the data?
Why is it important? Why not just send the response body uncompressed?
-----EDIT-----
And how is it decompressed by the client? Is there a package that does that? How does it work?
It's just to make the website more responsive: compressing data reduces the overall size of the response, hence faster loading times.
Compressed data is decompressed on the client side, so the data stays the same.
The browser is responsible for automatically decompressing a response that the server compressed and sent. The only thing required from the client is to send a header listing the compression schemes it supports:
Accept-Encoding: gzip
So when the browser sends the above header, the server can send a gzip-compressed response, and the browser will decompress it automatically.
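A minimal sketch with Express and the compression package (the route and payload are just for illustration):

const express = require('express');
const compression = require('compression');

const app = express();
app.use(compression()); // compress responses when the request allows it

app.get('/', (req, res) => {
  // A large, repetitive body compresses very well under gzip.
  res.type('text/plain').send('hello '.repeat(10000));
});

app.listen(3000);

Requesting this with an Accept-Encoding: gzip header should come back with Content-Encoding: gzip and a much smaller body; without that request header, the same route returns the uncompressed text.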
I originally created a Logic App that, given a JSON payload, would run a stored procedure, transform the results into a CSV table, and then email the CSV to a specified email account. Unfortunately, requirements changed slightly: instead of emailing the CSV, they want it to download directly in the browser.
I am unable to get the HTTP Response action to tell the browser to download the file using the Content-Disposition header. It looks like this header is stripped from the response by design. Is anyone aware of another action (perhaps a Function?) that could be used in place of the HTTP Response to get a web browser to download the file rather than returning it as text in the response body?
It does indeed seem to be the case that the Response action doesn't support the Content-Disposition header for some reason. Probably the easiest workaround is to proxy the request through a simple HTTP-triggered Azure Function with CORS enabled (or an API on your server) that just fetches the file from the Logic App and then returns it with the Content-Disposition header attached.
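For example, a sketch of such a proxy as an HTTP-triggered Azure Function (Node.js, v3 programming model; the Logic App callback URL and filename are placeholders):

const https = require('https');

module.exports = async function (context, req) {
  // Fetch the CSV from the Logic App's HTTP trigger.
  const csv = await new Promise((resolve, reject) => {
    https.get('https://<your-logic-app-callback-url>', (res) => {
      let body = '';
      res.on('data', (chunk) => (body += chunk));
      res.on('end', () => resolve(body));
    }).on('error', reject);
  });

  // Return it with the header the Response action strips out.
  context.res = {
    status: 200,
    headers: {
      'Content-Type': 'text/csv',
      'Content-Disposition': 'attachment; filename="report.csv"',
    },
    body: csv,
  };
};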
NB. Don't rely on <a download="filename"> - most browsers that support the download attribute only respect it for same-origin requests.
I have installed Node.js and am running a simple Express server on my local machine. I have included the compression module and call app.use() on my Express instance to apply it. After debugging, my request appears to be passing through the filter with the option to be encoded with gzip, but when the response is transmitted it is not encoded. Are there other common reasons I am overlooking for why this is the case?
Please see the request headers and source code in the images linked below. (It should be noted that file 1 is actually being retrieved, just not encoded.)
Source
Headers
After several days of struggle, I have come to the conclusion that the issue was not with the server or the compression middleware, but rather with a proxy used on the network I am on. The data was indeed sent compressed (gzip), but the proxy intercepted the response and decompressed it before it reached the browser. Thus it appeared, judging by the response headers, to have been sent uncompressed.
Helpful hint: read thoroughly through the known issues!
cf. https://github.com/expressjs/compression/issues/31