Why is fs.createReadStream ... pipe(res) locking the read file? - node.js

I'm using express to stream audio & video files according to this answer. Relevant code looks like this:
function streamMedia(filePath, req, res) {
    // code here to determine which bytes to send, compute response headers, etc.
    res.writeHead(status, headers);
    var stream = fs.createReadStream(filePath, { start, end })
        .on('open', function() {
            stream.pipe(res);
        })
        .on('error', function(err) {
            res.end(err);
        });
}
This works just fine for streaming bytes to <audio> and <video> elements on the client. However, after these requests are served, another express request can delete the streamed file from the filesystem. That second request fails, sort of.
What happens is that once the file has been streamed at least once (i.e. createReadStream was invoked for the file's path in the code above), and a different express request then comes in to delete the file, the file remains on the filesystem until express is stopped. As soon as express is stopped, the files are deleted from the filesystem.
What exactly is going on here? Is it fs or express that is locking the file, why, and how can I get the process to release the file so that it can be deleted (after its contents have been read and piped to a response, if any is pending)?
Update 1:
I've modified the above code to set autoClose: true for the second function arg, and added both 'end' and 'close' event handlers, like so:
res.writeHead(status, headers);
var streamReadOpts = { start: start, end: end, autoClose: true };
var stream = fs.createReadStream(filePath, streamReadOpts)
    // previous 'open' & 'error' event handlers are still here
    .on('end', function () {
        console.log('stream end');
    })
    .on('close', function () {
        console.log('stream close');
    })
What I have discovered is that when a page initially loads with a <video> or <audio> element, only the 'open' event fires. Then when the user clicks to play the video/audio, a second request is made, and this second time both the 'end' and 'close' events fire, and deleting the file subsequently succeeds.
So it appears that the file is being locked when a user loads the page that has the <video> or <audio> element that gets its source from the request that calls this function. It isn't until that media file is played that a second request is made, and the file is unlocked.
I've also discovered that closing the browser also causes the 'end' and 'close' events to fire, and the file to be unlocked. My guess is that I'm doing something wrong with the express res to make it not close properly, but I'm still not sure what that could be.

It turned out the solution to this was to read and pipe smaller blocks of data from the file during each request. In my test cases I was streaming a 6MB MP4 video file. Though I was able to reproduce the issue in both Firefox and Chrome, I debugged with the latter and found that the client was blocking the stream.
When the page initially loads, there is an element that looks something like this:
<video> <!-- or <audio> -->
    <source src="/path/to/express/request" type="video/mpeg" /> <!-- or audio/mpeg -->
</video> <!-- or </audio> -->
As documented in the other answer referenced in the OP, Chrome will send a request with a range header like so:
Range:bytes=0-
For this request, my function was sending the whole file, and my response looked like this:
Accept-Ranges:bytes
Connection:keep-alive
Content-Length:6070289
Content-Range:bytes 0-6070288/6070289
Content-Type:video/mp4
However, Chrome was not reading the whole stream. It read only the first 3-4MB, then blocked the connection until a user action required the rest of the file. This explains why closing the browser or stopping express unlocked the files: either one closed the connection, from the browser's end or the server's end respectively.
My current solution is to only send a maximum of 1MB (the old school 1MB, 1024 * 1024) chunk at a time. The relevant code can be found in an additional answer to the question referenced in the OP.
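That referenced code isn't reproduced here, but a rough sketch of the chunked approach might look like the following. The function name streamMediaChunked, the 206 status, the fixed content type, and the simplified range parsing are my assumptions, not the exact code from the linked answer:

var fs = require('fs');
var MAX_CHUNK = 1024 * 1024; // serve at most 1MB per request

function streamMediaChunked(filePath, req, res) {
    fs.stat(filePath, function (err, stats) {
        if (err) { res.writeHead(500); return res.end(); }

        // e.g. "Range: bytes=0-" -> start at 0, no explicit end requested
        var range = (req.headers.range || 'bytes=0-').replace(/bytes=/, '').split('-');
        var start = parseInt(range[0], 10) || 0;
        var requestedEnd = range[1] ? parseInt(range[1], 10) : stats.size - 1;

        // cap the slice so the client can't hold a large stream open indefinitely
        var end = Math.min(requestedEnd, start + MAX_CHUNK - 1, stats.size - 1);

        res.writeHead(206, {
            'Accept-Ranges': 'bytes',
            'Content-Type': 'video/mp4',
            'Content-Length': end - start + 1,
            'Content-Range': 'bytes ' + start + '-' + end + '/' + stats.size
        });

        fs.createReadStream(filePath, { start: start, end: end }).pipe(res);
    });
}

With something like this, each request serves at most one 1MB slice, the browser issues further Range requests for the rest, and no single read stream is left open while the client idles.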

Set autoClose: true in the options. If autoClose is false, you have to close the file descriptor manually, for example in the 'end' event.
Refer to the Node docs: https://nodejs.org/api/fs.html#fs_fs_createreadstream_path_options
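For illustration, a minimal sketch of the manual case this answer describes, assuming autoClose: false (the file path is a placeholder; destroy() is what releases the underlying file descriptor):

var fs = require('fs');

var stream = fs.createReadStream('/path/to/file', { autoClose: false });

stream.on('end', function () {
    // with autoClose: false the fd is not released for us,
    // so destroy the stream once it has been fully consumed
    stream.destroy();
});

stream.pipe(process.stdout);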

Related

How to download incoming file and prevent Backpressure while sending files through WebRTC data channels using streams?

I'm building a file sharing application with WebRTC and Node.js. It is a command line application, so there will be no HTML involved. I'm reading the file as a stream and sending it; on the receiver's side I'll download the file. Here's how I'll be writing the sender's code:
// code taken from https://github.com/coding-with-chaim/file-transfer-
// final/blob/master/client/src/routes/Room.js
const reader = stream.getReader();
reader.read().then(obj => {
    handlereading(obj.done, obj.value);
});

// recursive function for sending out chunks of stream
function handlereading(done, value) {
    if (done) {
        peer.write(JSON.stringify({ done: true, fileName: file.name }));
        return;
    }
    peer.write(value);
    reader.read().then(obj => {
        handlereading(obj.done, obj.value);
    });
}
On the receiver's side I'll be converting the incoming file (stream) to a Blob, but people online are saying there will be an issue of backpressure if the file is too large. How should I write the file downloading code to avoid backpressure so that it doesn't crash the receiver's side due to buffer overflow? Or should there be another approach to downloading the file?
You want to listen for the bufferedamountlow event (onbufferedamountlow) after setting bufferedAmountLowThreshold.
You will want to put all your logic on the sender side; the receiver doesn't have any control. I think MDN is your best resource, I didn't find any good single article on this.
I do have an example in Pion here but that is in Go. The same concept though so hopefully helpful!
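To make the suggestion concrete, here is a rough sender-side sketch against a plain RTCDataChannel. dataChannel, chunks, and the threshold values are assumptions, not code from the question's simple-peer setup:

// assume dataChannel is an open RTCDataChannel and chunks is an array of
// ArrayBuffer chunks already read from the file
const MAX_BUFFERED = 1024 * 1024;                      // stop queuing above ~1MB
dataChannel.bufferedAmountLowThreshold = 256 * 1024;   // resume once below 256KB

let i = 0;

function sendNext() {
    while (i < chunks.length) {
        // if the channel's internal buffer is full, pause until it drains
        if (dataChannel.bufferedAmount > MAX_BUFFERED) {
            dataChannel.onbufferedamountlow = () => {
                dataChannel.onbufferedamountlow = null;
                sendNext();
            };
            return;
        }
        dataChannel.send(chunks[i++]);
    }
    dataChannel.send(JSON.stringify({ done: true }));
}

sendNext();

The key design point is that the sender throttles itself by checking bufferedAmount before each send, rather than relying on the receiver to push back.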

readable.on('end',...) is never fired

I am trying to stream some audio to my server and then stream it on to a service specified by the user. The user provides me with someHostName, which can sometimes not support that type of request.
My problem is that when it happens the clientRequest.on('end',..) is never fired, I think it's because it's being piped to someHostReq which gets messed up when someHostName is "wrong".
My question is:
Is there anyway that I can still have clientRequest.on('end',..) fired even when the stream clientRequest pipes to has something wrong with it?
If not: how do I detect that something wrong happened with someHostReq "immediately"? someHostReq.on('error') doesn't fire up except after some time.
code:
someHostName = 'somexample.com'

function checkIfPaused(request){ //every 1 second check .isPaused
    console.log(request.isPaused()+'>>>>');
    setTimeout(function(){ checkIfPaused(request) }, 1000);
}

router.post('/', function (clientRequest, clientResponse) {
    clientRequest.on('data', function (chunk) {
        console.log('pushing data');
    });
    clientRequest.on('end', function () { //when done streaming audio
        console.log('im at the end');
    }); //end clientRequest.on('end',)

    options = {
        hostname: someHostName, method: 'POST', headers: {'Transfer-Encoding': 'chunked'}
    };

    var someHostReq = http.request(options, function(res){
        var data = '';
        someHostReq.on('data', function(chunk){ data += chunk; });
        someHostReq.on('end', function(){
            console.log('someHostReq.end is called');
        });
    });

    clientRequest.pipe(someHostReq);
    checkIfPaused(clientRequest);
});
output:
in the case of a correct hostname:
pushing data
.
.
pushing data
false>>>
pushing data
.
.
pushing data
pushing data
false>>>
pushing data
.
.
pushing data
console.log('im at the end');
true>>>
//continues to be true, that's fine
in the case of a wrong host name:
pushing data
.
.
pushing data
false>>>>
pushing data
.
.
pushing data
pushing data
false>>>>
pushing data
.
.
pushing data
true>>>>
true>>>>
true>>>>
//it stays true and clientRequest.on('end') is never called
//even tho the client is still streaming data, no more "pushing data" appears
if you think my question is a duplicate:
it's not the same as this: node.js http.request event flow - where did my END event go? , the OP was just making a GET instead of a POST
it's not the same as this: My http.createserver in node.js doesn't work? , the stream was in paused mode because none of the following happened:
You can switch to flowing mode by doing any of the following:
Adding a 'data' event handler to listen for data.
Calling the resume() method to explicitly open the flow.
Calling the pipe() method to send the data to a Writable.
source: https://nodejs.org/api/stream.html#stream_class_stream_readable
it's not the same as this: Node.js response from http request not calling 'end' event without including 'data' event , he just forgot to add the .on('data',..)
The behaviour in the case of a wrong host name looks like a buffering problem: if the destination stream's buffer is full (because someHost is not consuming the chunks being sent to it), the pipe will stop reading the origin stream, since pipe manages the flow automatically. Because pipe is no longer reading the origin stream, you never reach the 'end' event.
Is there anyway that I can still have clientRequest.on('end',..) fired even when the stream clientRequest pipes to has something wrong with it?
The 'end' event will not fire unless the data is completely consumed. To get 'end' to fire on a paused stream you need to call resume() (unpiping from the wrong host first, or you will get stuck on the buffer again) to put the stream back into flowing mode, or read() it to the end.
But how to detect when I should do any of the above?
someHostReq.on('error') is the natural place but if it takes too long to fire up:
First try setting a low request timeout (lower than the time someHostReq.on('error') takes to trigger, which seems too long for you) with request.setTimeout(timeout[, callback]), and check that it doesn't fire with a correct hostname. If that works, just use the callback or the 'timeout' event to detect when the server times out, and use one of the techniques above to reach the end.
If the timeout solution fails or doesn't fit your requirements, you have to play with flags in clientRequest.on('data'), clientRequest.on('end') and/or clientRequest.isPaused() to guess when you are stuck on the buffer. When you think you are stuck, just apply one of the techniques above to reach the end of the stream. Luckily, detecting a stuck buffer takes less time than waiting for someHostReq.on('error') (maybe two request.isPaused() === true readings without reaching a 'data' event is enough to determine that you are stuck).
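As a rough illustration of the timeout technique using the question's names (the 3-second value and the unpipe/abort/resume recovery steps are assumptions, not a definitive fix):

var someHostReq = http.request(options, function (res) {
    // ... handle the upstream response as before ...
});

// give up on the host if it doesn't respond within 3 seconds (arbitrary value)
someHostReq.setTimeout(3000, function () {
    clientRequest.unpipe(someHostReq); // stop feeding the stuck request
    someHostReq.abort();               // tear down the upstream connection
    clientRequest.resume();            // keep consuming so 'end' can still fire
});

someHostReq.on('error', function (err) {
    console.log('someHostReq error:', err.message);
});

clientRequest.pipe(someHostReq);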
How do I detect that something wrong happened with someHostReq "immediately"? someHostReq.on('error') doesn't fire up except after some time.
Errors trigger when they trigger; you cannot detect them "immediately". Why not just send a probe request to check that the service is supported before piping the streams? Something like:
"Checking service specified by the user..." If OK -> pipe the user request stream to the service, OR on FAIL -> notify the user about the wrong service.

How to handle undefined upload in NodeJS

I use the Busboy plugin to handle file uploads. It registers a handler for the 'file' event, which fires when the file has been uploaded completely, and then the handler does the next operation. But when the uploaded file is undefined, the event handler is registered but the 'file' event never happens. It takes a long time for the process to stop handling this request. When I do this more than five times, the Node process halts for several minutes and only returns to normal after a while.
How can I handle this situation in the Node.js process?
Don't tell me never to upload undefined; when a file upload fails, I think the same situation will also occur.
You can use the 'finish' event to stop processing:
busboy.on('finish', function() {
    // your code to stop processing
});
The 'finish' event triggers once all 'file'/'field' events in the uploaded (form) data have fired. (The trigger is skipped if a file/field is undefined.)
Further, in your case make sure you actually pipe your Node request through busboy:
req.pipe(busboy);
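Putting the pieces together, a minimal sketch assuming an Express handler and the pre-1.x busboy API used above (the hasFile flag and the JSON responses are illustrative, not part of the original answer):

var Busboy = require('busboy');

app.post('/upload', function (req, res) {
    var busboy = new Busboy({ headers: req.headers });
    var hasFile = false;

    busboy.on('file', function (fieldname, file, filename) {
        hasFile = true;
        file.resume(); // consume the stream; save it somewhere in real code
    });

    // fires once the whole form is parsed, even if no 'file' event ever fired
    busboy.on('finish', function () {
        if (!hasFile) {
            return res.status(400).json({ error: 'no file received' });
        }
        res.json({ ok: true });
    });

    req.pipe(busboy);
});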
I solved this problem as follows.
Set a flag for the 'file' event; when the event happens, the flag is set to true in its handler. Because the 'file' event happens before req's 'end' event, I register a handler for the 'end' event of req:
req.on('end', function () {
    if (!flag) {
        res.json({
            ....
        });
    }
});

NodeJS writable streams: how to wait for data to be flushed?

I have a simple situation in which an https.get pipes its response stream into a file stream created with fs.createWriteStream, something like this:
var file = fs.createWriteStream('some-file');

var downloadComplete = function() {
    // check file size with fs.stat
};

https.get(options, function(response) {
    file.on('finish', downloadComplete);
    response.pipe(file);
});
Almost all the time this works fine and the file size determined in downloadComplete is what is expected. Every so often, however, it's a bit too small, almost as if the underlying file stream hasn't finished writing to disk even though it has raised the 'finish' event.
Does anyone know what's happening here, or have a particular way to make this more robust against delays between 'finish' being called and the underlying data being written to disk?

Node.js request stream ends/stalls when piped to writable file stream

I'm trying to pipe() data from Twitter's Streaming API to a file using modern Node.js Streams. I'm using a library I wrote called TweetPipe, which leverages EventStream and Request.
Setup:
var TweetPipe = require('tweet-pipe')
  , fs = require('fs');

var tp = new TweetPipe(myOAuthCreds);
var file = fs.createWriteStream('./tweets.json');
Piping to STDOUT works and stream stays open:
tp.stream('statuses/filter', { track: ['bieber'] })
    .pipe(tp.stringify())
    .pipe(process.stdout);
Piping to the file writes one tweet and then the stream ends silently:
tp.stream('statuses/filter', { track: ['bieber'] })
    .pipe(tp.stringify())
    .pipe(file);
Could anyone tell me why this happens?
It's hard to say from what you have here; it sounds like the stream is getting cleaned up before you expect. This can be triggered a number of ways, see here: https://github.com/joyent/node/blob/master/lib/stream.js#L89-112
A stream could emit 'end', and then something just stops.
Although I doubt this is the problem, one thing that concerns me is this
https://github.com/peeinears/tweet-pipe/blob/master/index.js#L173-174
destroy should be called after emitting error.
I would normally debug a problem like this by adding logging statements until I can see what is not happening right.
Can you post a script that can be run to reproduce?
(for extra points, include a package.json that specifies the dependencies :)
According to this, you should create an error handler on the stream created by tp.
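For illustration, attaching an error handler to the question's own pipeline might look like this (the handler body is just an example; tp.stream and tp.stringify are the question's API):

tp.stream('statuses/filter', { track: ['bieber'] })
    .on('error', function (err) {
        // surface the failure instead of letting the pipe end silently
        console.error('tweet stream error:', err);
    })
    .pipe(tp.stringify())
    .pipe(file);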

Resources