AWS S3 returns 200 OK but parser fails if ContentEncoding: 'gzip' - node.js

My first deploy to AWS.
The files are all in place, and index.html loads.
There are two files in a subdir, one .js and one .css.
They both return 200 but fail to load. Chrome says the failure is in the 'parser'.
After trying a few things, I noted that this property is causing it: ContentEncoding: "gzip".
If I remove this property, the files load correctly.
Am I using this property incorrectly?
I am using the Node AWS SDK via this great project: https://github.com/MathieuLoutre/grunt-aws-s3
You can witness this behavior for yourself at http://tidepool.co.s3-website-us-west-1.amazonaws.com/

If you specify Content-Encoding: gzip then you need to make sure that the content is actually gzipped on S3.
From what I see in this CSS file:
http://tidepool.co.s3-website-us-west-1.amazonaws.com/08-26_6483218-dirty/all-min.css
the actual content is not gzipped, but the Content-Encoding: gzip header is present.
Also keep in mind that S3 cannot compress your content on the fly based on the Accept-Encoding header in the request. You can either store it uncompressed, in which case it will work for all browsers/clients, or store it in a compressed format (gzip/deflate), in which case it will only work on clients that can handle compressed content.

You could also take a look at the official AWS SDK for Node.js.
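For example, a minimal sketch with the official SDK (aws-sdk v2), assuming the bucket and key shown in the URLs above: the body is gzipped before upload, so it actually matches the ContentEncoding header.

const fs = require('fs');
const zlib = require('zlib');
const AWS = require('aws-sdk');

const s3 = new AWS.S3();

// Gzip the body yourself -- S3 will not do it for you.
const gzippedBody = zlib.gzipSync(fs.readFileSync('all-min.css'));

s3.putObject({
  Bucket: 'tidepool.co',                  // illustrative bucket name
  Key: '08-26_6483218-dirty/all-min.css',
  Body: gzippedBody,                      // actually gzipped bytes
  ContentType: 'text/css',
  ContentEncoding: 'gzip',                // header now matches the stored content
}, function (err) {
  if (err) throw err;
});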

Related

Setting custom header with API gateway non-proxy lambda and binary output

Is it possible to set a custom header when using lambda non proxy integrations?
At the moment I have enabled binary support and I am returning straight from my handler, but I have a requirement to set the file name of the download and was planning to use Content-Disposition: attachment; filename="filename.xlsx". I am not sure how I can do this with Lambda proxy integration turned off.
Reading this https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-integration-settings-integration-response.html I am not sure whether it only works for JSON responses.
The example shows the body as taking a JSON object, but then says there is a base64-encoding option for binary support. I am just returning my binary data straight from my function, and I had not planned to use Lambda proxy at all if possible.
I currently have files downloading but I am using temporary files and I want to name the downloads.
# In my service
with tempfile.NamedTemporaryFile(suffix=".xlsx") as tmp:
    pd.DataFrame(report_json).to_excel(tmp.name)
    bytes_io = BytesIO(tmp.read())
return bytes_io

# In my handler
return base64.b64encode(bytes_io.getvalue())
Using later versions of the Serverless Framework, a custom header such as Content-Disposition can be set like the following (under the function's http event):
integration: lambda
response:
  headers:
    Content-Type: "'text/csv'"
    Content-Disposition: "'attachment; filename=abc.csv'"
I am not sure yet whether it is possible to interpolate values from the context into these headers.

AWS S3 serving gzipped files but not readable

I'm using AWS S3 to host a static webpage; almost all assets are gzipped before being uploaded.
During the upload the "content-encoding" header is correctly set to "gzip" (and this also reflects when actually loading the file from AWS).
The thing is, the files can't be read and are still in gzip format, although the correct headers are set.
The files are uploaded using npm s3-deploy (the original question included screenshots of the request headers and of the garbled file contents in the browser).
If I upload the file manually and set the content-encoding header to "gzip" it works perfectly. Sadly I have a couple hundred files to upload for every deployment and cannot do this manually every time (I hope that's understandable ;) ).
Has anyone an idea of what's going on here? Anyone worked with s3-deploy and can help?
I use my own bash script for S3 deployments; you could try something like this:
webpath='path'
BUCKET='BUCKETNAME'

for file in "$webpath"/js/*.gz; do
  aws s3 cp "$file" "s3://$BUCKET/js/" --content-encoding 'gzip' --region 'eu-west-1'
done
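If you would rather gzip on the fly from Node instead of keeping pre-built .gz files, a rough sketch with aws-sdk v2 (bucket name and paths are illustrative, matching the script above) could look like this:

const fs = require('fs');
const path = require('path');
const zlib = require('zlib');
const AWS = require('aws-sdk');

const s3 = new AWS.S3();
const BUCKET = 'BUCKETNAME';   // illustrative
const dir = 'path/js';

fs.readdirSync(dir)
  .filter(function (name) { return name.endsWith('.js'); })
  .forEach(function (name) {
    s3.putObject({
      Bucket: BUCKET,
      Key: 'js/' + name,                                          // keep the plain .js key
      Body: zlib.gzipSync(fs.readFileSync(path.join(dir, name))), // gzip before upload
      ContentType: 'application/javascript',
      ContentEncoding: 'gzip',
    }, function (err) {
      if (err) throw err;
    });
  });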

Amazon S3 403 Forbidden Error for KML files but not JPG files

I am able to successfully upload (put object) jpg files to S3 with a particular code path, but receive a 403 forbidden error when using the same code path to upload a KML file. I am not restricting file types explicitly with "bucket policy," but feel that this must somehow be tied to bucket policy or CORS configuration.
I was using code based off the Heroku tutorial for uploading images to Amazon S3. The issue ended up being that the appropriate MIME type, "application/vnd.google-earth.kml+xml", contains a '+' symbol, and that '+' was being replaced with a space when our own S3 endpoint (which generates the signed requests) read the file-type query parameter. We were able to quickly fix this by forcing the ContentType to "application/vnd.google-earth.kml+xml" for all KML files going to our signed-request endpoint.
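As an illustration only (the route, parameter names, and bucket below are assumptions, not taken from the original code), a signed-request endpoint built on Express and aws-sdk v2 could force the KML type like this:

const express = require('express');
const AWS = require('aws-sdk');

const s3 = new AWS.S3();
const app = express();

app.get('/sign-s3', function (req, res) {
  const fileName = req.query['file-name'] || '';
  // An unencoded '+' in a query string is decoded as a space, so
  // 'application/vnd.google-earth.kml+xml' arrives mangled. Force the
  // correct type for .kml files instead of trusting the query parameter.
  const contentType = fileName.endsWith('.kml')
    ? 'application/vnd.google-earth.kml+xml'
    : req.query['file-type'];

  s3.getSignedUrl('putObject', {
    Bucket: 'my-bucket',        // illustrative bucket name
    Key: fileName,
    Expires: 60,
    ContentType: contentType,   // must match the Content-Type the client sends in the PUT
  }, function (err, url) {
    if (err) return res.status(500).end();
    res.json({ signedRequest: url });
  });
});

app.listen(3000);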

Difference between Transferred and Size columns in the Mozilla Firefox Network tab

I am trying to determine what the difference is between the Transferred column and Size column. Does it have to do with the difference between compressed files and uncompressed?
I am not compressing my files on my server (Node.js Express server) so I don't know why there would be a difference in file size.
Your Express application has gzip compression enabled, as indicated by the Content-Encoding: gzip header, so the response body is compressed with gzip before being sent over the network. Transferred is the compressed size on the wire; Size is the decompressed size in the browser. Express is doing this on the fly, so even though your file is not compressed on disk, it gets compressed before it is sent over the network.
Follow-up on your comments
You haven't posted any code, but it's probable that your Express application is using the compression middleware (perhaps from the boilerplate you started with). If so, that middleware uses mime-db to determine whether the response content type is compressible. Looking up application/javascript in mime-db shows it is marked as compressible:
mimeDb['application/javascript']
{ source: 'iana',
  charset: 'UTF-8',
  compressible: true,
  extensions: [ 'js' ] }
Note that a .gz file extension is not involved anywhere here. There is no .gz file on disk; the compression is applied to a .js file in memory. Also note that just setting the Content-Encoding: gzip header without actually encoding the body as gzip is not something you want to do. It will cause encoding errors for the client.
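If you want to see or control this behaviour directly, a minimal sketch (assuming the standard compression middleware and an illustrative static setup) looks like this:

const express = require('express');
const compression = require('compression');

const app = express();

// Compress compressible response bodies (as determined via mime-db) on the fly;
// the files on disk stay uncompressed.
app.use(compression());

app.use(express.static('public'));   // illustrative static directory

app.listen(3000);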

How to cache a Node.js site with a cache manifest

In relation to my earlier question about how to add a cache manifest in node.js, my question now is about how to cache the HTML generated by node.js. As we don't have a physical file like in PHP (index.php), we cannot cache that kind of file.
How can we cache a "non-existing" page? Just by adding it to the cache manifest:
CACHE MANIFEST
CACHE:
# plain files to cache
/javascripts/client.js
/stylesheets/style.css
/stylesheets/style.styl
# generated files like /
/
/content
Any idea of how to solve this problem?
Thanks!
Solution:
Add a route to return the cache.manifest file with the correct MIME type:
app.get("/offline.manifest", function(req, res){
res.header("Content-Type", "text/cache-manifest");
res.end("CACHE MANIFEST");
});
Found on Stack Overflow.
The cache manifest lists URLs that should be cached. The client accessing those URLs has no knowledge of whether they are static HTML files served by Apache, dynamic content generated by node.js, or anything else.
You are basically instructing the client:
1. Read my list of URLs
2. Go through each URL
3. Download the response and store it someplace safe
4. Check back on my cache.manifest; if it has changed, go back to step 1
So as long as the data generated by node.js is reachable via a URL, there is no problem in listing it as a line in the cache manifest.
And if you are worried about how you will know which URLs there are, you can always generate the cache.manifest file programmatically from node.js itself -- but remember to serve it with the correct content type, text/cache-manifest.
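For example, a minimal sketch (the URL list and Express setup here are illustrative, not from the original answer):

const express = require('express');
const app = express();

const MANIFEST_VERSION = '1';   // bump this when the cached assets change

app.get('/offline.manifest', function (req, res) {
  res.set('Content-Type', 'text/cache-manifest');   // the required MIME type
  res.send([
    'CACHE MANIFEST',
    '# version ' + MANIFEST_VERSION,
    'CACHE:',
    '/javascripts/client.js',
    '/stylesheets/style.css',
    '/',                                            // dynamically generated pages are just URLs
    '/content',
  ].join('\n'));
});

app.listen(3000);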
