Stream audio from S3 to client through REST API - node.js

Description
I am building a music-streaming SPA using Next.js.
Audio files are stored on AWS S3.
The goal is to stream the audio file from S3 to the client through a REST endpoint,
so that authentication is possible and the AWS endpoints stay "hidden".
Issue
When streaming data down to the client through the REST endpoint, the audio glitches and only the first ~15 seconds of the audio file load and play.
I tested this behaviour in a separate project by manually creating the read stream and providing options to it:
fs.createReadStream("path", {start: startByte, end: endByte})
and it works just fine.
However, createReadStream from S3 (I believe I'm using SDK v2) does not accept any options, so I am unable to fix the glitching this way.
I thought about many solutions, one of which involved manually converting the incoming buffer from S3 into streamable data, but I believe that would mean processing the data in the server's RAM, which I want to avoid even though the audio files are usually quite "small".
I also thought about creating a presigned URL to the file and redirecting to it, as a worst-case fallback.
Question
I will provide source code below. I believe my audio loops over the first ~15 seconds because the read stream lacks start and end positions.
How do I fix this behaviour and stream data correctly from S3 to the server to the client, without keeping whole files in the server's RAM?
Code
Part of the utility function for data streaming:
const downloadParams = {
  Key,
  Bucket: bucketName,
};
const fileStream = s3.getObject(downloadParams).createReadStream();
fileStream is returned from this function and accessed in the API endpoint like so:
const CHUNK_SIZE = 10 ** 3 * 500; // ~500 KB
const startByte = Number(range.replace(/\D/g, ""));
const endByte = Math.min(startByte + CHUNK_SIZE, attr.ObjectSize - 1);
const chunk = endByte - startByte + 1;
const headers = {
  "Content-Range": `bytes ${startByte}-${endByte}/${attr.ObjectSize}`,
  "Accept-Ranges": "bytes",
  "Content-Length": chunk,
  "Content-Type": "audio/*",
};
res.writeHead(206, headers);
fileStream.pipe(res);
Here is the audio receiver on the client:
"use client";
const Audio = () => {
  return (
    <audio src="http://localhost:3000/api/stream/FILE_KEY_HERE" controls></audio>
  );
};
export default Audio;
Here is what the request headers look like:
Accept: */*
Accept-Encoding: identity;q=1, *;q=0
Accept-Language: en,ru;q=0.9,sv-SE;q=0.8,sv;q=0.7,en-US;q=0.6
Connection: keep-alive
Cookie:
Host: localhost:3000
Range: bytes=65536-
Referer: http://localhost:3000/
sec-ch-ua: "Not_A Brand";v="99", "Google Chrome";v="109", "Chromium";v="109"
sec-ch-ua-mobile: ?1
sec-ch-ua-platform: "Android"
Sec-Fetch-Dest: video
Sec-Fetch-Mode: no-cors
Sec-Fetch-Site: same-origin
sec-gpc: 1
User-Agent: Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Mobile Safari/537.36
The second request's headers differ only by Range: bytes=65536-
first request:
Request URL: http://localhost:3000/api/stream/track/4
Request Method: GET
Status Code: 206 Partial Content
Remote Address: [::1]:3000
Referrer Policy: strict-origin-when-cross-origin
Response Headers:
Accept-Ranges: bytes
Connection: keep-alive
Content-Length: 500001
Content-Range: bytes 65536-565536/3523394
Content-Type: audio/*
Date: Wed, 25 Jan 2023 21:35:51 GMT
Keep-Alive: timeout=5
PS
I did check my network tab, and the headers contain truthful information about the objects being streamed. Requests seem to download the full file size (3.2 MB of 3.2 MB, for example), but the audio still loops over the first 15 seconds, even if I move the duration bar manually.
I haven't found any information like this here, so I thought this would be helpful to someone in the future.
Tried
On top of the things mentioned, I tried creating new streams and piping them, tried using stream events on createReadStream(), and read the (poorly written) AWS docs. Due to the lack of info, it is less time-consuming to ask someone than to keep fighting the same issue for four days straight.

The issue is that the first X bytes were always read from the source MP3, regardless of whether the client requested a later range.
The quick solution was to tell getObject to seek to the same bytes the request states in its Range header, since S3 itself also supports range requests.
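A minimal sketch of that fix, assuming the SDK v2 client from the question (s3, bucketName, Key, and the byte offsets come from the surrounding code; buildRange and createRangedStream are hypothetical helper names):

```javascript
// S3 accepts the same byte-range syntax as the HTTP Range header,
// e.g. "bytes=65536-565536".
const buildRange = (startByte, endByte) => `bytes=${startByte}-${endByte}`;

// Hypothetical helper: forward the client's requested range to S3 so the
// stream starts at the right offset instead of byte 0 every time.
function createRangedStream(s3, bucketName, Key, startByte, endByte) {
  const downloadParams = {
    Key,
    Bucket: bucketName,
    Range: buildRange(startByte, endByte),
  };
  return s3.getObject(downloadParams).createReadStream();
}
```

With the Range parameter set, S3 itself responds with only the requested slice, so nothing beyond the current chunk passes through the server's RAM.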

Related

node express send function not working with binary data

I have a GET route in express that should return a binary png image stored in mongodb. However, when I enter the url into chrome to test, the image is downloaded but the request never completes. From the Network tab in Chrome DevTools the request is just stuck in the 'pending' state. I'm only getting this problem with binary data it seems. I have plenty of other json GET requests that work just fine with send().
I am using the send() function like this:
exports.getProjectPng = (req, res) => {
  Project.findById(req.params.projectId).select('project.png')
    .then(project => {
      res.send(project.png.buffer);
    });
};
If I simply replace send() with end() the request completes as expected. Also, perhaps significantly, the png image is actually rendered within the browser rather than downloading as a file.
So why does end() work but send() doesn't?
If you point curl at an express server and see what the response looks like for both methods, it is quite interesting. The main difference is that when we call send, the Content-Type header is populated, which is consistent with the Express docs:
When the parameter is a Buffer object, the method sets the Content-Type response header field to “application/octet-stream”, unless previously defined
It's worthwhile noting that res.send() actually calls res.end() internally at the end of the call, so the different behaviour is likely down to something that res.send does in addition to res.end.
It might be worth populating the Content-Type Header in your example to "image/png" before sending.
e.g.
res.set('Content-Type', 'image/png');
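A minimal sketch of the route with the header set before send(). The Mongoose lookup is abstracted into a findPngBuffer callback here (an assumption, so the shape stays self-contained); with the real Project model you would pass in the findById/select chain:

```javascript
// Hypothetical factory: takes any function that resolves a project id to a
// PNG Buffer, and returns an Express-style handler that sets Content-Type
// before sending the binary body.
function makePngHandler(findPngBuffer) {
  return (req, res) =>
    findPngBuffer(req.params.projectId).then(buf => {
      res.set('Content-Type', 'image/png'); // set the header BEFORE send()
      res.send(buf);
    });
}
```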
For .end():
* Connected to localhost (::1) port 8081 (#0)
> GET /downloadpng_end HTTP/1.1
> User-Agent: curl/7.30.0
> Host: localhost:8081
> Accept: */*
>
< HTTP/1.1 200 OK
< X-Powered-By: Express
< Date: Mon, 25 Jun 2018 13:14:58 GMT
< Connection: keep-alive
< Content-Length: 69040
<
And for send():
* Connected to localhost (::1) port 8081 (#0)
> GET /downloadpng_send HTTP/1.1
> User-Agent: curl/7.30.0
> Host: localhost:8081
> Accept: */*
>
< HTTP/1.1 200 OK
< X-Powered-By: Express
< Content-Type: application/octet-stream
< Content-Length: 69040
< ETag: W/"10db0-KwFSGG5Ib/DQNZChAbluTiKSP0o"
< Date: Mon, 25 Jun 2018 13:15:25 GMT
< Connection: keep-alive
<
Node.js does not handle raw binary data very well on its own; that is what Buffer exists for, handling binary data.
The end() method
The 'end' event is fired when there is no more data to read, while send() gives no guarantee about when it has completed. You can read more about Buffer in the official docs.

using nodejs, express and basic auth how do i get username and password

I am using Node.js and Express.
A third party that will be calling my API will be using basic auth in the format of http://{username}:{password}@yourdomain.com/
I have tried
var auth = require('basic-auth')
app.use(function(req, res, next) {
  var credentials = auth(req)
  if (req.headers.authorization) {
    console.log("found headers");
  }
  next();
})
But this only works if the basic auth is passed in the header.
I cannot seem to get the username and password from the URL (which is the only way this external party can call my API).
I then tried using url as suggested.
Here is what I am seeing now when I do a POST to
http://myusername:mypassword@localhost:4050/api/callback
var express = require('express');
var http = require('http');
var url = require('url');
var app = express();
app.use(function(req, res, next) {
  console.log("req.protocol=", req.protocol);
  console.log("req.get('host')=", req.get('host'));
  console.log("req.originalUrl=", req.originalUrl);
  next();
});
http.createServer(app).listen(config.port, function () {
  console.log("HTTP BandWidth listening on port " + config.port);
});
My console looks like
req.protocol= http
req.get('host')= localhost:4050
req.originalUrl=/api/callback
If I dump the entire req object, I do not see myusername or mypassword.
I must be missing something obvious.
Thanks
Randy
You should be able to use the built-in url package in Node.
https://nodejs.org/dist/latest-v6.x/docs/api/url.html#url_urlobject_auth
const url = require('url');
app.use(function(req, res, next) {
  const urlObj = url.parse(req.protocol + '://' + req.get('host') + req.originalUrl);
  console.log('Auth info: ' + urlObj.auth);
  next();
});
Hope this helps!
EDIT: Well, I take that back. It looks like the use of username and password in the URI has been deprecated, and browsers may just be ignoring that information. See RFC 3986:
3.2.1. User Information
The userinfo subcomponent may consist of a user name and,
optionally, scheme-specific information about how to gain
authorization to access the resource. The user information, if
present, is followed by a commercial at-sign ("@") that delimits it
from the host.
userinfo = *( unreserved / pct-encoded / sub-delims / ":" )
Use of the format "user:password" in the userinfo field is
deprecated. Applications should not render as clear text any data
after the first colon (":") character found within a userinfo
subcomponent unless the data after the colon is the empty string
(indicating no password). Applications may choose to ignore or
reject such data when it is received as part of a reference and
should reject the storage of such data in unencrypted form. The
passing of authentication information in clear text has proven to be
a security risk in almost every case where it has been used.
And...
7.5. Sensitive Information
URI producers should not provide a URI that contains a username or
password that is intended to be secret. URIs are frequently
displayed by browsers, stored in clear text bookmarks, and logged by
user agent history and intermediary applications (proxies). A
password appearing within the userinfo component is deprecated and
should be considered an error (or simply ignored) except in those
rare cases where the 'password' parameter is intended to be public.
The emphasis in the quotes above is mine.
I tried this using the developer tools in Chrome and got the following:
General
Request URL:http://foobar:password@localhost:8888/test
Request Method:GET
Status Code:200 OK
Remote Address:127.0.0.1:8888
Response Headers
Connection:keep-alive
Content-Length:19
Content-Type:application/json; charset=utf-8
Date:Thu, 08 Dec 2016 03:52:35 GMT
ETag:W/"13-uNGID+rxNJ6jDZKj/wrpcA"
Request Headers
GET /test HTTP/1.1
Host: localhost:8888
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8
So, it doesn't even look like the username and password info is being passed along by Chrome. Unfortunately, I think you're out of luck if you're trying to use this schema. You may have to set the authorization headers, set your own custom headers (which is what I have done in the past), or pass your credentials in the query string.
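If the third party can be persuaded to send a standard Authorization header instead, decoding it is straightforward. A minimal sketch (parseBasicAuth is a hypothetical helper; the basic-auth package from the question does essentially this):

```javascript
// Decode an HTTP Basic Authorization header of the form
// "Basic <base64(user:pass)>" into its username and password.
function parseBasicAuth(header) {
  if (!header || !header.startsWith('Basic ')) return null;
  const decoded = Buffer.from(header.slice(6), 'base64').toString('utf8');
  const i = decoded.indexOf(':'); // username may not contain ':'
  if (i === -1) return null;
  return { name: decoded.slice(0, i), pass: decoded.slice(i + 1) };
}
```

Inside an Express middleware you would call it as parseBasicAuth(req.headers.authorization) and reject the request when it returns null.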

Chunked encoding, streams and content-length

I need to upload a gzipped file. For performance, in case my string gets too big, I decided to use streams, but I ran into an issue: the server requires a Content-Length header, which cannot be calculated because the gzipping happens inline. I then decided to use chunked transfer, but I am not sure whether I am doing it incorrectly or whether the server simply will not accept streams/chunks, as it still returns an error about needing a Content-Length header.
Here's the bit of the code:
const gzip = zlib.createGzip()
let stream = createStream(string) // I also use files, hence the streaming
  .pipe(gzip)
  .pipe(request.put(url, {
    headers: {
      'Transfer-Encoding': 'chunked',
      'x-ms-blob-type': 'blockblob'
    }
  }))
Response:
Content-Length HTTP header is missing
I've also played around with adding other headers such as:
'Content-Type': 'application/javascript'
'Content-Encoding': 'gzip'
Is my only option to forgo streaming, or to take gzip out of the flow and calculate the length that way? I can't tell if I am missing something or if the server is being persnickety.

s3.getSignedUrl ResponseContentDisposition parameter not working

I'm successfully generating a signed url that I can then use for a limited time to download resources from my s3 bucket. However I'm trying to use the ResponseContentDisposition attribute in the params as documented here:
http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#getSignedUrl-property
http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#getObject-property
I'm not sure if I'm doing this wrong, but for some reason the header is not being set. For example, if I use the URL I get back from s3.getSignedUrl:
curl -i "https://foo-dev.s3.amazonaws.com/images/foo.jpg?AWSAccessKeyId=AKIAICBHUC26S6B446PQ&Expires=1468359314&Signature=EeBqx1G83oeusarBl2KUbbCCBgA%3D&response-content-disposition=attachment%3B%20filename%3Ddata.jpg"
the headers are:
x-amz-id-2: SG9rjYQCcuqgKfjBmMbDQC2CNLcnqBAFzP7zINa99VYUwNijPOm5Ea/5fllZ6cnt/Qti7e26hbE=
x-amz-request-id: 2670068008525B1D
Date: Tue, 12 Jul 2016 21:26:16 GMT
Content-Disposition: inline; filename=foo.jpg
Last-Modified: Tue, 12 Jul 2016 00:47:23 GMT
ETag: "2a8e36651b24769170f4faa429f40f54"
Accept-Ranges: bytes
Content-Type: image/jpeg
Content-Length: 43373
Server: AmazonS3
I'm setting this, using the javascript s3 sdk like this:
function tempRedirect(req, res) {
  var filename = req.params[0];
  var contentDisposition = 'attachment; filename=data.jpg';
  var params = {
    Bucket: S3_BUCKET,
    ResponseContentDisposition: contentDisposition,
    Key: checkTrailingSlash(getFileKeyDir(req)) + filename
  };
  var s3 = new aws.S3(s3Options);
  s3.getSignedUrl('getObject', params, function(err, url) {
    res.redirect(url);
  });
}
The docs are pretty light and I can only find PHP examples but it does look like I'm setting content disposition correctly.
Anyone know what is going wrong here??
According to RFC 2616, your value is malformed.
The expected format is attachment; filename="funny-cat.jpg". The filename is a quoted string.
And, my original assumption was that S3 was rejecting it as invalid and silently refusing to replace the value.
Subsequent tests reveal unexpected behavior: if Content-Disposition is not stored with the object, then &response-content-disposition=... works as expected, setting the response header. But if there is a header stored with the object, this query string parameter does not have the documented effect of "overriding" that value.
Conversely, &response-content-type=... does override a stored Content-Type: for the object.
That's what a few quick tests revealed for me.
But this appears to be a bug -- or more accurately, some kind of regression -- in S3. According to one support forum post, the behavior is actually inconsistent, sometimes working and sometimes not.
S3 is aware of this issue and we are working to resolve it. (2016-07-12)
https://forums.aws.amazon.com/thread.jspa?threadID=235006
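As a sketch, the quoted-string form would look like this (buildDispositionParams is a hypothetical helper; the parameter names follow the AWS SDK v2 getSignedUrl call from the question):

```javascript
// Build getObject params with an RFC 2616 quoted-string filename in
// the Content-Disposition override.
function buildDispositionParams(bucket, key, filename) {
  return {
    Bucket: bucket,
    Key: key,
    // quoted-string form: attachment; filename="data.jpg"
    ResponseContentDisposition: `attachment; filename="${filename}"`,
  };
}
```

These params would then be passed to s3.getSignedUrl('getObject', params, callback) exactly as in the question, keeping in mind the override may still be ignored when a Content-Disposition is stored on the object.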

non-chunked multipart/mixed POST?

I'm trying to send multiple binary files to a web service in a single multipart/mixed POST, but can't seem to get it to work... the target server rejects it. One thing I noticed is that Node is doing the encoding as chunked, which I don't want:
POST /SomeFiles HTTP/1.1
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=123456789012345
Host: host.server.com
Connection: keep-alive
Transfer-Encoding: chunked
How do I get it to stop being chunked? The docs say it can be disabled by setting Content-Length in the main request header, but I can't set Content-Length since, among other reasons, I'm loading one file at a time -- and it shouldn't be required anyway, since the request is multipart.
I think the rest is OK (apart from the chunking, unless req_post.write() is causing that), e.g. for the initial part:
var req_post = https.request({
  hostname: 'host.server.com',
  path: '/ManyFiles',
  method: 'POST',
  headers: {
    'MIME-Version': '1.0',
    'Content-Type': 'multipart/mixed; boundary=' + boundary
  }
}, ...);
and then pseudo-code:
while ( true ) {
// { get next file here }
req_post.write('\r\n--' + boundary + '\r\nContent-Type: ' + filetype + '\r\nContent-Length: ' + filesize + '\r\n\r\n');
req_post.write(chunk);// filesize
if ( end ) {
req_post.write('\r\n--' + boundary + '--\r\n');
req_post.end();
break;
}
}
Any help/pointers is appreciated!
The quick answer is that you cannot disable chunked encoding without setting Content-Length. Chunked encoding was introduced for cases where you do not know the size of the payload when you start transmitting. Originally, Content-Length was required, and the recipient knew it had the full message when it had received Content-Length bytes. Chunked encoding removed that requirement by transmitting mini-payloads, each with its own length, followed by a zero-size chunk to denote completion of the payload. So, if you do not set Content-Length and you do not use chunking, the recipient will never know when it has the full payload.
To help solve your problem, if you cannot send chunked and do not want to read all the files before sending, take a look at fs.stat. You can use it to get the file size without reading the file.
