I'm successfully generating a signed URL that I can then use for a limited time to download resources from my S3 bucket. However, I'm trying to use the ResponseContentDisposition attribute in the params, as documented here:
http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#getSignedUrl-property
http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#getObject-property
I'm not sure if I'm doing this wrong, but for some reason the header is not being set. For example, if I use the URL I get back from s3.getSignedUrl:
curl -i "https://foo-dev.s3.amazonaws.com/images/foo.jpg?AWSAccessKeyId=AKIAICBHUC26S6B446PQ&Expires=1468359314&Signature=EeBqx1G83oeusarBl2KUbbCCBgA%3D&response-content-disposition=attachment%3B%20filename%3Ddata.jpg"
the headers are:
x-amz-id-2: SG9rjYQCcuqgKfjBmMbDQC2CNLcnqBAFzP7zINa99VYUwNijPOm5Ea/5fllZ6cnt/Qti7e26hbE=
x-amz-request-id: 2670068008525B1D
Date: Tue, 12 Jul 2016 21:26:16 GMT
Content-Disposition: inline; filename=foo.jpg
Last-Modified: Tue, 12 Jul 2016 00:47:23 GMT
ETag: "2a8e36651b24769170f4faa429f40f54"
Accept-Ranges: bytes
Content-Type: image/jpeg
Content-Length: 43373
Server: AmazonS3
I'm setting this using the JavaScript S3 SDK like this:
function tempRedirect(req, res) {
  var filename = req.params[0];
  var contentDisposition = 'attachment; filename=data.jpg';
  var params = {
    Bucket: S3_BUCKET,
    ResponseContentDisposition: contentDisposition,
    Key: checkTrailingSlash(getFileKeyDir(req)) + filename
  };
  var s3 = new aws.S3(s3Options);
  s3.getSignedUrl('getObject', params, function(err, url) {
    res.redirect(url);
  });
}
The docs are pretty light, and I can only find PHP examples, but it does look like I'm setting the content disposition correctly.
Does anyone know what is going wrong here?
According to RFC 2616, your value is malformed.
The expected format is attachment; filename="funny-cat.jpg". The filename is a quoted string.
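In the asker's handler, the quoted form would look like this (just a sketch of the quoting, reusing the names from the question's code; as the tests below show, this alone may not be the whole story):

var contentDisposition = 'attachment; filename="data.jpg"';
var params = {
  Bucket: S3_BUCKET,
  ResponseContentDisposition: contentDisposition,
  Key: checkTrailingSlash(getFileKeyDir(req)) + filename
};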
And, my original assumption was that S3 was rejecting it as invalid and silently refusing to replace the value.
Subsequent tests reveal unexpected behavior: if Content-Disposition is not stored with the object, then &response-content-disposition=... works as expected, setting the response header. But if there is a header stored with the object, this query string parameter does not have the documented effect of "overriding" that value.
Conversely, &response-content-type=... does override a stored Content-Type: for the object.
That's what a few quick tests revealed for me.
But this appears to be a bug -- or more accurately, some kind of regression -- in S3. According to one support forum post, the behavior is actually inconsistent, sometimes working and sometimes not.
"S3 is aware of this issue and we are working to resolve it." (2016-07-12)
https://forums.aws.amazon.com/thread.jspa?threadID=235006
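Not something the answer or the forum thread confirms, but given the test result above (the override works when no Content-Disposition is stored with the object), one possible workaround is to strip the stored header by copying the object onto itself with replaced metadata. A hedged sketch with the aws-sdk; the bucket and key are placeholders:

// copy the object over itself, replacing the stored metadata so that no
// Content-Disposition remains on the object (assumption: clearing it lets
// response-content-disposition take effect again)
s3.copyObject({
  Bucket: S3_BUCKET,
  Key: 'images/foo.jpg',
  CopySource: S3_BUCKET + '/images/foo.jpg',
  MetadataDirective: 'REPLACE',
  ContentType: 'image/jpeg'
}, function(err, data) {
  if (err) console.error(err);
});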
From a node.js back end, I need to send an HTTP message to a REST endpoint. The endpoint requires some parameters that it will expect to find in the HTTP message. Some of the parameters are simple enough, just requiring a number or a string as an argument. But one of the parameters is to be "the raw binary file content being uploaded", and this has puzzled me. As far as I understand, the parameters need to be gathered together into a string to put in the body of the HTTP request, so how do I add raw binary data to a string? Obviously, for it to be in the string, it cannot be raw binary data; it needs to be encoded into characters.
The endpoint in question is the Twitter media upload API. The "raw binary data" parameter is called media. Below is an incomplete code snippet showing the basic gist of what I've tried. Specifically, the line where I build the requestBody string. I don't believe it is anywhere near correct, because the endpoint is returning a "bad request" message.
var https = require("https");

var base64ImageData = /* (some base 64 string here) */;

var options = {
  host: "api.twitter.com",
  path: "/1.1/media/upload.json",
  method: "POST",
  headers: {
    "Content-Type": "multipart/form-data"
  }
};

var request = https.request(options, function(response) {});

var requestBody = "media_id=18283918294&media=" + Buffer.from(base64ImageData, "base64").toString("binary");
request.write(requestBody);
request.end();
Also worth noting, Twitter themselves note the following extremely confusing statement:
"When posting base64 encoded images, be sure to set the “Content-Transfer-Encoding: base64” on the image part of the message."
Source: https://developer.twitter.com/en/docs/media/upload-media/uploading-media/media-best-practices
That might be part of the answer to my question, but what I don't understand is: How do I apply different headers to different parts of the HTTP message? Because apparently, the image data needs to have a Content-Transfer-Encoding header of "base64" while the rest of the HTTP message does not...
How do I apply different headers to different parts of the HTTP message?
This is the point of the multipart/form-data content type. A multi-part message looks like this:
Content-Type: multipart/form-data; boundary=foo

--foo
Content-Disposition: form-data; name="datafile1"; filename="r.gif"
Content-Transfer-Encoding: base64
Content-Type: image/gif

// data goes here

--foo
Content-Disposition: form-data; name="datafile2"; filename="g.png"
Content-Transfer-Encoding: base64
Content-Type: image/png

// another file's data goes here

--foo--

Note that each part is introduced by the boundary prefixed with two dashes, and the closing boundary carries an extra two dashes at the end.
You probably don't want to put all this together yourself. There are a bunch of good libraries for putting together complex POSTs. For example: https://www.npmjs.com/package/form-data
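For what it's worth, here is a minimal sketch with that form-data package. This is an assumption about how you might wire it up, not a working Twitter client: the OAuth authorization header is omitted, and the field name and endpoint are taken from the question as-is.

var https = require('https');
var FormData = require('form-data');

var form = new FormData();

// append the raw image bytes; form-data writes the part headers and the
// multipart boundary for you
form.append('media', Buffer.from(base64ImageData, 'base64'), {
  filename: 'image.jpg',
  contentType: 'image/jpeg'
});

// form.getHeaders() returns the Content-Type header including the boundary
var request = https.request({
  host: 'api.twitter.com',
  path: '/1.1/media/upload.json',
  method: 'POST',
  headers: form.getHeaders()
}, function(response) {
  console.log('status:', response.statusCode);
});

// stream the multipart body into the request
form.pipe(request);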
I am working on a Node.js server side validation of json web tokens received from cross origin ajax clients. Presumably the tokens are generated by Google OpenID Connect which states the following:
To use Google's OpenID Connect services, you should hard-code the Discovery-document URI into your application. Your application fetches the document, then retrieves endpoint URIs from it as needed.
You may be able to avoid an HTTP round-trip by caching the values from the Discovery document. Standard HTTP caching headers are used and should be respected.
source: https://developers.google.com/identity/protocols/OpenIDConnect#discovery
I wrote the following function that uses request.js to get the keys and moment.js to add some timestamp properties to a keyCache dictionary where I store the cached keys. This function is called when the server starts.
function cacheWellKnownKeys(uri) {
  var openid = 'https://accounts.google.com/.well-known/openid-configuration';

  // get the well known config from google
  request(openid, function(err, res, body) {
    var config = JSON.parse(body);
    var jwks_uri = config.jwks_uri;
    var timestamp = moment();

    // get the public json web keys
    request(jwks_uri, function(err, res, body) {
      keyCache.keys = JSON.parse(body).keys;
      keyCache.lastUpdate = timestamp;
      keyCache.timeToLive = timestamp.add(12, 'hours');
    });
  });
}
Having successfully cached the keys, my concern now is regarding how to effectively maintain the cache over time.
Since Google changes its public keys only infrequently (on the order of once per day), you can cache them and, in the vast majority of cases, perform local validation.
source: https://developers.google.com/identity/protocols/OpenIDConnect#validatinganidtoken
Since Google rotates their public keys roughly once per day, my idea with the timestamp and timeToLive properties of keyCache is to do two things:
1. Set a timeout every 12 hours to update the cache.
2. Deal with the case where Google rotates their public keys in between my 12-hour update cycle: the first failed token validation on my end triggers a refresh of the key cache, followed by one last attempt to validate the token.
This seems like a viable working algorithm until I consider an onslaught of invalid token requests that result in repeated round trips to the well known config and public keys while trying to update the cache.
Maybe there's a better way that will result in less network overhead. This one line from the first quote above may have something to do with developing a more efficient solution, but I'm not sure what to do about it: "Standard HTTP caching headers are used and should be respected."
I guess my question is really just this...
Should I be leveraging the HTTP caching headers from Google's discovery document to develop a more efficient caching solution? How would that work?
The discovery document has a property jwks_uri, which is the web address of another document containing the public keys. This other document is the one Google is referring to when they say...
Standard HTTP caching headers are used and should be respected.
An HTTP HEAD request to https://www.googleapis.com/oauth2/v3/certs reveals the following headers:
HTTP/1.1 200 OK
Expires: Wed, 25 Jan 2017 02:39:32 GMT
Date: Tue, 24 Jan 2017 21:08:42 GMT
Vary: Origin, X-Origin
Content-Type: application/json; charset=UTF-8
X-Content-Type-Options: nosniff
x-frame-options: SAMEORIGIN
x-xss-protection: 1; mode=block
Content-Length: 1472
Server: GSE
Cache-Control: public, max-age=19850, must-revalidate, no-transform
Age: 10770
Alt-Svc: quic=":443"; ma=2592000; v="35,34"
X-Firefox-Spdy: h2
You can access these header fields programmatically from the response object generated by request.js and parse the max-age value out of the Cache-Control header, something like this:
var cacheControl = res.headers['cache-control'];
var values = cacheControl.split(',');
var maxAge = parseInt(values[1].split('=')[1]);
The maxAge value is measured in seconds. The idea, then, is to set a timeout based on maxAge (times 1000 for millisecond conversion) and refresh the cache each time it fires. This solves the problem of refreshing the cache on every invalid authorization attempt, and you can drop the timestamp bookkeeping you're doing with moment.js.
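As an aside (my own tweak, not part of the original answer): directive order in Cache-Control isn't guaranteed, so a regex is a little safer than splitting on commas:

var match = /max-age=(\d+)/.exec(res.headers['cache-control'] || '');
// fall back to an hour if the directive is missing (arbitrary choice)
var maxAge = match ? parseInt(match[1], 10) : 3600;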
I propose the following function for handling the caching of these well known keys.
var request = require('request');

var keyCache = {};

/**
 * Caches Google's well known public keys
 */
function cacheWellKnownKeys() {
  var wellKnown = 'https://accounts.google.com/.well-known/openid-configuration';

  // get the well known config from google
  request(wellKnown, function(err, res, body) {
    var config = JSON.parse(body);
    var address = config.jwks_uri;

    // get the public json web keys
    request(address, function(err, res, body) {
      keyCache.keys = JSON.parse(body).keys;

      // example cache-control header:
      // public, max-age=24497, must-revalidate, no-transform
      var cacheControl = res.headers['cache-control'];
      var values = cacheControl.split(',');
      var maxAge = parseInt(values[1].split('=')[1], 10);

      // update the key cache when the max age expires
      setTimeout(cacheWellKnownKeys, maxAge * 1000);
    });
  });
}
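A hypothetical helper like the following (my addition, not part of the original answer) shows how the cached keys would typically be consumed: look up the JWK whose kid matches the key id in a token's header, so validation can stay local.

function findKey(kid) {
  // keyCache.keys is the JWK array cached above
  return (keyCache.keys || []).find(function(key) {
    return key.kid === kid;
  });
}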
I'm trying to send multiple binary files to a web service in a single multipart/mixed POST but can't seem to get it to work; the target server rejects it. One thing I noticed is that Node is doing the encoding as chunked, which I don't want:
POST /SomeFiles HTTP/1.1
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=123456789012345
Host: host.server.com
Connection: keep-alive
Transfer-Encoding: chunked
How do I get it to stop being chunked? The docs say it can be disabled by setting the Content-Length on the main request, but I can't set the Content-Length since, among other reasons, I'm loading one file at a time -- and it shouldn't be required anyway, since the body is multipart.
I think the rest is OK (apart from the chunking, unless req_post.write() is what's causing that). For example, for the initial request I do:
var req_post = https.request({
  hostname: 'host.server.com',
  path: '/ManyFiles',
  method: 'POST',
  headers: {
    'MIME-Version': '1.0',
    'Content-Type': 'multipart/mixed; boundary=' + boundary
  }
}, ...);
and then pseudo-code:
while (true) {
  // { get next file here }
  req_post.write('\r\n--' + boundary + '\r\nContent-Type: ' + filetype + '\r\nContent-Length: ' + filesize + '\r\n\r\n');
  req_post.write(chunk); // filesize
  if (end) {
    req_post.write('\r\n--' + boundary + '--\r\n');
    req_post.end();
    break;
  }
}
Any help/pointers are appreciated!
The quick answer is that you cannot disable chunked encoding without setting Content-Length. Chunked encoding was introduced for cases where you do not know the size of the payload when you start transmitting. Originally, Content-Length was required, and the recipient knew it had the full message once it had received Content-Length bytes. Chunked encoding removed that requirement by transmitting mini-payloads, each with its own length, followed by a zero-size chunk to denote the end of the payload. So, if you do not set Content-Length and you do not use the chunked methodology, the recipient will never know when it has the full payload.
To help solve your problem: if you cannot send chunked and do not want to read all of the files before sending, take a look at fs.stat. You can use it to get the file size without reading the file's contents.
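A rough sketch of that idea follows; the boundary, file names, and content type are placeholders, and it only shows how the total Content-Length could be computed up front with fs.statSync before making the request.

var fs = require('fs');

var boundary = '123456789012345';
var files = ['a.bin', 'b.bin']; // hypothetical file list

// build each part's header text using the size reported by fs.statSync
var partHeaders = files.map(function(name) {
  var size = fs.statSync(name).size;
  return '\r\n--' + boundary + '\r\n' +
    'Content-Type: application/octet-stream\r\n' +
    'Content-Length: ' + size + '\r\n\r\n';
});

var closing = '\r\n--' + boundary + '--\r\n';

// total body size = every part header + every file + the closing boundary
var contentLength = files.reduce(function(total, name, i) {
  return total + Buffer.byteLength(partHeaders[i]) + fs.statSync(name).size;
}, Buffer.byteLength(closing));

// Pass contentLength as the 'Content-Length' header in https.request(), then
// write each partHeaders[i] followed by the file's bytes (for example via
// fs.createReadStream(name).pipe(req_post, { end: false })), and finish with
// the closing boundary and req_post.end().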
I have the following code, where message is a JSON string. I am trying to upload this to S3 with the MD5 of message as the destination filename. I am getting a '505' statusCode. I am new to NodeJS and not sure what I am doing wrong here.
knoxInitParams =
  'key': awsKey
  'secret': awsPrivateKey
  'bucket': bucket
client = knox.createClient knoxInitParams

buff = new Buffer message
reqHeader =
  'Content-Length': buff.length
  'Content-Type': 'text/plain'
  'x-amz-acl': 'private'

req = client.put '/tmp/xxx.txt', reqHeader
req.on 'response', (res) ->
  console.log res.statusCode
  console.log res.headers
  if res.statusCode is 200
    console.log res.url
req.on 'error', (err) ->
  console.error "S3 Error: ", err
req.end buff
Edit:
Changed the destination to hardcode it, as a reply below pointed out that was causing the issue. However, I am now getting a 403 :(
Your code looks fine.
Make sure your date/time are correct. ntpdate -s pool.ntp.org.
Quick note, I ran into this issue, too, but my error was that I had a space in my filename.
var req = client.put('/tmp/x xx.txt', reqHeader);
I wrapped the filename, like this:
var req = client.put(encodeURIComponent('/tmp/x xx.txt'), reqHeader);
Most likely your bug is here:
req = client.put destination.toLowerCase + '.txt', reqHeader
You probably want to invoke destination.toLowerCase:
req = client.put destination.toLowerCase() + '.txt', reqHeader
On the other hand, I think it's wholly unnecessary -- it will be lowercase already.
On a side note, you may want to look into unit testing -- it's a great way of catching these kinds of bugs! If I were you, I would add a function, say getFileName:
getFileName = (contents) ->
  crypto.createHash('md5').update(contents).digest('hex') + '.txt'
Now you can easily test this function with nodeunit, mocha, jasmine or any of the other great test utilities, and make sure that it always returns what you expect -- and if not, help you notice immediately where the error is.
I can also heartily recommend node's debugger, which also helps you catch these bugs.
When I try your exact code with my own S3 account, it works fine:
$ coffee test
200
{ 'x-amz-id-2': 'f5C32nQHlE0WI8jtNFEZykRFAdrM8ZdBzgeAxc23bnJ2Ti4bYKmcY3pX/KpEzyZg',
  'x-amz-request-id': 'B41AACFF85661C2E',
  date: 'Tue, 01 May 2012 23:15:39 GMT',
  etag: '"44b25eb6d36a88713b7260d8db15b24b"',
  'content-length': '0',
  server: 'AmazonS3' }
Check your ID/key/bucket, and date/time as @skrewler suggests.
I have a node.js app that periodically pushes some data to Amazon S3. I'm using a Put request to push a buffer over to S3.
I know that the "content-md5" parameter of the S3 request needs to be the base64-encoded MD5 hash of the content that I'm pushing. What has me confused is that 90% of the time my requests succeed. The other 10% of the time, without my hashing method changing at all, Amazon gives me back a "BadDigest" error:
{ [Error: API error with HTTP Code: 400]
  headers:
   { 'content-type': 'application/xml',
     'transfer-encoding': 'chunked',
     date: 'Fri, 06 Apr 2012 02:20:14 GMT',
     connection: 'close',
     server: 'AmazonS3' },
  code: 400,
  document:
   { Code: 'BadDigest',
     Message: 'The Content-MD5 you specified did not match what we received.',
     ExpectedDigest: 'fPRrmxapcSHmI2gljme1Fg==',
     CalculatedDigest: 'w6PoDxh2ty478+Mw2UwTrA==',
     RequestId: '1018E7A80A8B0B00',
     HostId: 'W/SK/OovQHlsi593DJ154pkHdOrUk3oMWmIGNdOKj3WaHa8cBknhB+7H5IdZLUjt' } }
Has anyone else experienced this randomness from S3 before? Am I missing something obvious?
Thanks!
You likely forgot to specify 'utf8' as the encoding parameter for update().
var crypto = require('crypto');

var status = 'काक्नोम्यत्क्नोम्यत्चं शक्नोम्यत्तुमतुम् ।तुम् ।् । नोपहिनस्ति माम् ॥';
var contentMd5 = crypto
  .createHash('md5')
  .update(status, 'utf8')
  .digest('base64');
Without it, this works in most cases, but not when your string includes multibyte characters.
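A quick illustration of the difference (the exact digests will vary; the point is just that the two encodings hash different bytes):

var crypto = require('crypto');

var s = 'काक'; // any multibyte string

// hash the UTF-8 bytes (what S3 actually receives in the request body)
var utf8Digest = crypto.createHash('md5').update(s, 'utf8').digest('base64');

// hash the string interpreted as single-byte 'binary'/latin1 characters
var binaryDigest = crypto.createHash('md5').update(s, 'binary').digest('base64');

console.log(utf8Digest === binaryDigest); // false -- hence the BadDigest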
The aws-sdk will automatically calculate the ContentMD5 and ContentLength values for you. If you have a UTF-8 string and you are using '।्'.length to set the ContentLength value, S3 will return the BadDigest error, because String.prototype.length counts characters rather than UTF-8 bytes. So the solution in my case was just to let the aws-sdk calculate the ContentMD5 and ContentLength values.
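A minimal sketch of that approach, assuming the official aws-sdk (v2) and a bucket/key of your own; ContentMD5 and ContentLength are deliberately left out so the SDK computes them from the body:

var AWS = require('aws-sdk');
var s3 = new AWS.S3();

var message = JSON.stringify({ note: 'some multibyte content: काक' });

s3.putObject({
  Bucket: 'my-bucket',            // hypothetical bucket
  Key: 'tmp/message.txt',         // hypothetical key
  Body: message,                  // a string or Buffer
  ContentType: 'text/plain; charset=utf-8'
}, function(err, data) {
  if (err) return console.error(err);
  console.log('ETag:', data.ETag);
});

If you do need the byte length yourself for some other reason, Buffer.byteLength(message, 'utf8') is the value to use rather than message.length.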