Piping readstream into response makes it one-time-use - node.js

Right now I'm trying to use Node readstreams and stream transforms to edit my HTML data before sending it to the client. Yes, I'm aware templating engines exist; this is just the method I'm working with right now. The code I'm using looks like this:
const express = require('express')
const app = express()
const port = 8080
const fs = require('fs')
const Transform = require("stream").Transform
const parser = new Transform()
const newLineStream = require("new-line")
parser._transform = function(data, encoding, done) {
  const str = data.toString().replace('</body>', `<script>var questions = ${JSON.stringify(require('./questions.json'))};</script></body>`)
  this.push(str)
  done()
}

app.get('/', (req, res) => {
  console.log('Homepage served')
  res.write('<!-- Begin stream -->\n');
  let stream = fs.createReadStream('./index.html')
  stream.pipe(newLineStream())
    .pipe(parser)
    .on('end', () => {
      res.write('\n<!-- End stream -->')
    }).pipe(res)
})
This is just a rough draft to try and get this method working. Right now, the issue I'm running into is that the first time I load my webpage everything works fine, but every time after that the html I'm given looks like this:
<!-- Begin stream -->
<html>
<head></head>
<body></body>
</html>
It seems like the stream is getting hung up in the middle, because most of the data is never transmitted and the stream is never ended. Another thing I notice in the console is a warning after 10 reloads that there are 11 event listeners on [Transform] and there's a possible memory leak. I've tried clearing all event listeners on both the readstream and the parser once the readstream ends, but that didn't solve anything. Is there a way to change my code to fix this issue?
Original StackOverflow post that this method came from

The issue here was using a single parser and ._transform() instead of creating a new transform every time the app received a request. Putting const parser = ... and parser._transform = ... inside the app.get() fixed everything.
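For reference, a minimal sketch of that fix, reusing the question's own code with the transform created per request (index.html, questions.json and new-line as above):

app.get('/', (req, res) => {
  console.log('Homepage served')
  // Create a fresh Transform for every request so state and listeners
  // are not shared (and leaked) across responses.
  const parser = new Transform()
  parser._transform = function(data, encoding, done) {
    const str = data.toString().replace('</body>', `<script>var questions = ${JSON.stringify(require('./questions.json'))};</script></body>`)
    this.push(str)
    done()
  }

  res.write('<!-- Begin stream -->\n')
  fs.createReadStream('./index.html')
    .pipe(newLineStream())
    .pipe(parser)
    .on('end', () => res.write('\n<!-- End stream -->'))
    .pipe(res)
})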

Related

How to keep the request open to use the write() method after a long time

I need to keep the connection open so that, after one song finishes, I can write the next one's data. The problem is that, the way I did it, the stream simply stops after the first song.
How can I keep the connection open and play the next songs too?
const fs = require('fs');
const express = require('express');
const app = express();
const server = require('http').createServer(app)
const getMP3Duration = require('get-mp3-duration')
let sounds = ['61880.mp3', '62026.mp3', '62041.mp3', '62090.mp3', '62257.mp3', '60763.mp3']
app.get('/current', async (req, res) => {
  let readStream = fs.createReadStream('sounds/61068.mp3')
  let duration = await getMP3Duration(fs.readFileSync('sounds/61068.mp3'))
  let pipe = readStream.pipe(res, {end: false})

  async function put(){
    let file_path = 'sounds/'+sounds[Math.random() * sounds.length-1]
    duration = await getMP3Duration(fs.readFileSync(file_path))
    readStream = fs.createReadStream(file_path)
    readStream.on('data', chunk => {
      console.log(chunk)
      pipe.write(chunk)
    })
    console.log('Current Sound: ', file_path)
    setTimeout(put, duration)
  }

  setTimeout(put, duration)
})

server.listen(3005, async function () {
  console.log('Server is running on port 3005...')
});
You should use a library or look at the source code and see what they do.
A good one is:
https://github.com/obastemur/mediaserver
TIP:
Always start your research by learning from other projects (when possible, or when you are not reinventing the wheel ;)). You are not the first to do this or to hit this problem :)
A quick search with the phrase "nodejs stream mp3 github" gave me a few directions.
Good luck!
Express works by returning a single response to a single request. As soon as the response has been sent, a new request needs to come in to trigger a new response.
In your case, however, you want to keep generating new output from a single request.
Two approaches can be used to solve your problem:
Change the way you create your response to satisfy your use-case.
Use a real-time communication framework (WebSockets). The best and simplest that comes to mind is socket.io.
Adapting express
The solution here is to follow this procedure:
Request on endpoint /current comes in
The audio sequence is prepared
The stream of the entire sequence is returned
So your handler would look like this:
const fs = require('fs');
const express = require('express');
const app = express();
const server = require('http').createServer(app);
// Import the PassThrough class to concatenate the streams
const { PassThrough } = require('stream');
// The array of sounds now contains all the sounds
const sounds = ['61068.mp3','61880.mp3', '62026.mp3', '62041.mp3', '62090.mp3', '62257.mp3', '60763.mp3'];
// function which concatenates an array of streams into a single stream,
// piping them one after the other so their data does not interleave
const concatStreams = streamArray => {
  const pass = new PassThrough();
  const pipeNext = index => {
    if (index === streamArray.length) return pass.end();
    const soundStream = streamArray[index];
    soundStream.pipe(pass, {end: false});
    soundStream.once('end', () => pipeNext(index + 1));
  };
  pipeNext(0);
  return pass;
};

// function which returns a shuffled copy of an array
const shuffle = (array) => {
  const a = [...array]; // shallow copy of the array
  for (let i = a.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [a[i], a[j]] = [a[j], a[i]];
  }
  return a;
};

app.get('/current', (req, res) => {
  // Start by shuffling the array
  const shuffledSounds = shuffle(sounds);
  // Create a readable stream for each sound
  const streams = shuffledSounds.map(sound => fs.createReadStream(`sounds/${sound}`));
  // Concatenate all the streams into a single stream
  const readStream = concatStreams(streams);
  // Pipe the concatenated stream to the response (which goes to the client);
  // the response is automatically ended when the stream emits the "end" event
  readStream.pipe(res);
});
Notice that the handler no longer requires the async keyword. The process is still asynchronous, but the coding is emitter-based instead of promise-based.
If you want to loop the sounds you can create additional steps of shuffling/mapping to stream/concatenation.
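For instance, a minimal sketch of such a loop, reusing the shuffle and concatStreams helpers above on a hypothetical /loop route and keeping the response open with {end: false}:

app.get('/loop', (req, res) => {
  let closed = false;
  res.on('close', () => { closed = true; });
  const playRound = () => {
    if (closed) return;
    // Reshuffle and rebuild the concatenated stream for every round.
    const streams = shuffle(sounds).map(sound => fs.createReadStream(`sounds/${sound}`));
    const round = concatStreams(streams);
    // Keep the response open so the next round can be piped into it.
    round.pipe(res, {end: false});
    round.once('end', playRound);
  };
  playRound();
});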
I did not include the socket.io alternative, to keep things simple.
Final Solution After a Few Edits:
I suspect your main issue is with your random array element generator. You need to wrap what you have with Math.floor to round down to ensure you end up with a whole number:
sounds[Math.floor(Math.random() * sounds.length)]
Also, readable.pipe() returns the destination, so what you're doing makes sense. However, you might get unexpected results from calling on('data') on your readable after you've already piped from it. The Node.js streams docs mention this. I tested your code on my local machine and it doesn't seem to be an issue, but it might make sense to change this so you don't have problems in the future.
Choose One API Style
The Readable stream API evolved across multiple Node.js versions and provides multiple methods of consuming stream data. In general, developers should choose one of the methods of consuming data and should never use multiple methods to consume data from a single stream. Specifically, using a combination of on('data'), on('readable'), pipe(), or async iterators could lead to unintuitive behavior.
Instead of calling on('data') and res.write, I would just pipe from the readStream into res again. Also, unless you really want the duration, I would pull that library out and just use the readStream's 'end' event to make additional calls to put(). This works because you're passing {end: false} when piping, which disables the automatic ending of the write stream and leaves it open. The 'end' event still gets emitted on the readable, though, so you can use it as a marker to know when the readable has finished piping. Here's the refactored code:
const fs = require('fs');
const express = require('express');
const app = express();
const server = require('http').createServer(app)
//const getMP3Duration = require('get-mp3-duration') no longer needed

let sounds = ['61880.mp3', '62026.mp3', '62041.mp3', '62090.mp3', '62257.mp3', '60763.mp3']

app.get('/current', (req, res) => {
  let readStream = fs.createReadStream('sounds/61068.mp3')
  let pipe = readStream.pipe(res, {end: false})

  function put(){
    let file_path = 'sounds/'+sounds[Math.floor(Math.random() * sounds.length)]
    readStream = fs.createReadStream(file_path)
    // you may also be able to do readStream.pipe(res, {end: false})
    readStream.pipe(pipe, {end: false})
    console.log('Current Sound: ', file_path)
    readStream.on('end', () => {
      put()
    });
  }

  readStream.on('end', () => {
    put()
  });
})

server.listen(3005, function () {
  console.log('Server is running on port 3005...')
});

Node - Abstracting Pipe Steps into Function

I'm familiar with Node streams, but I'm struggling with best practices for abstracting code that I reuse a lot into a single pipe step.
Here's a stripped down version of what I'm writing today:
inputStream
  .pipe(csv.parse({columns: true}))
  .pipe(csv.transform(function(row) { return transform(row); }))
  .pipe(csv.stringify({header: true}))
  .pipe(outputStream);
The actual work happens in transform(). The only things that really change are inputStream, transform(), and outputStream. Like I said, this is a stripped down version of what I actually use. I have a lot of error handling and logging on each pipe step, which is ultimately why I'm trying to abstract the code.
What I'm looking to write is a single pipe step, like so:
inputStream
  .pipe(csvFunction(transform))
  .pipe(outputStream);
What I'm struggling to understand is how to turn those pipe steps into a single function that accepts a stream and returns a stream. I've looked at libraries like through2, but I'm not sure how that gets me to where I'm trying to go.
You can use the PassThrough class like this:
var PassThrough = require('stream').PassThrough;

var csvStream = new PassThrough();

csvStream.on('pipe', function (source) {
  // undo piping of source
  source.unpipe(this);
  // build own pipe-line and store internally
  this.combinedStream =
    source.pipe(csv.parse({columns: true}))
      .pipe(csv.transform(function (row) {
        return transform(row);
      }))
      .pipe(csv.stringify({header: true}));
});

csvStream.pipe = function (dest, options) {
  // pipe internal combined stream to dest
  return this.combinedStream.pipe(dest, options);
};

inputStream
  .pipe(csvStream)
  .pipe(outputStream);
Here's what I ended up going with. I used the through2 library and the streaming API of the csv library to create the pipe function I was looking for.
var csv = require('csv');
var through = require('through2');

module.exports = function(transformFunc) {
  var parser = csv.parse({columns: true, relax_column_count: true}),
      transformer = csv.transform(function(row) {
        return transformFunc(row);
      }),
      stringifier = csv.stringify({header: true});

  return through(function(chunk, enc, cb) {
    var stream = this;

    parser.on('data', function(data) {
      transformer.write(data);
    });
    transformer.on('data', function(data) {
      stringifier.write(data);
    });
    stringifier.on('data', function(data) {
      stream.push(data);
    });

    parser.write(chunk);

    parser.removeAllListeners('data');
    transformer.removeAllListeners('data');
    stringifier.removeAllListeners('data');
    cb();
  });
};
It's worth noting the part where I remove the event listeners towards the end; this was due to running into memory errors from creating too many event listeners. I initially tried solving this problem by listening to events with once, but that prevented subsequent chunks from being read and passed on to the next pipe step.
Let me know if anyone has feedback or additional ideas.
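One possible piece of feedback: the listener churn can be avoided entirely by wiring the internal parse -> transform -> stringify pipeline once, outside the per-chunk callback, so no listeners accumulate in the first place. A minimal sketch of that variant, using the same csv and through2 APIs (an alternative, not the code above):

var csv = require('csv');
var through = require('through2');

module.exports = function(transformFunc) {
  var parser = csv.parse({columns: true, relax_column_count: true});
  var transformer = csv.transform(function(row) {
    return transformFunc(row);
  });
  var stringifier = csv.stringify({header: true});

  // Wire the internal pipeline a single time.
  parser.pipe(transformer).pipe(stringifier);

  var wrapper = through(
    function(chunk, enc, cb) {
      // Feed each incoming chunk into the head of the internal pipeline;
      // the callback fires once the chunk has been accepted.
      parser.write(chunk, enc, cb);
    },
    function(cb) {
      // When the input ends, end the parser and finish once the tail drains.
      stringifier.on('end', cb);
      parser.end();
    }
  );

  // Forward everything the tail produces out of the wrapper stream.
  stringifier.on('data', function(data) {
    wrapper.push(data);
  });

  return wrapper;
};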

Cannot pipe after data has been emitted from the response nodejs

I've been experiencing a problem with the request library in Node.js. When I try to pipe to both a file and a stream on response, I get the error: "you cannot pipe after data has been emitted from the response". This is because I do some calculations before really piping the data.
Example:
var request = require('request')
var fs = require('fs')
var through2 = require('through2')

var options = {
  url: 'url-to-fetch-a-file'
};

var req = request(options)

req.on('response', function(res){
  // Some computations to remove files, potentially.
  // These computations take quite some time.

  // Function that creates the path recursively
  createPath(path, function(){
    var file = fs.createWriteStream(path + fname)
    var stream = through2.obj(function (chunk, enc, callback) {
      this.push(chunk)
      callback()
    })
    req.pipe(file)
    req.pipe(stream)
  })
})
If I just pipe to the stream without any calculations, it's just fine. How can I pipe to both a file and a stream using the request module in Node.js?
I found this: Node.js Piping the same readable stream into multiple (writable) targets, but it is not the same thing. There, the piping happens twice, in different ticks. This example pipes like the answer in that question and still receives the error.
Instead of piping directly to the file you can add a listener to the stream you defined. So you can replace req.pipe(file) with
stream.on('data', function(data){
  file.write(data)
})
stream.on('end', function(){
  file.end()
})
or
stream.pipe(file)
This will pause the stream until it is read, something that doesn't happen with the request module.
More info: https://github.com/request/request/issues/887
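For context, one way the suggestion might be wired into the question's handler (a sketch only; createPath, path and fname are the question's own helper and variables, and the key point is piping into the pass-through before the slow work):

var request = require('request')
var fs = require('fs')
var through2 = require('through2')

var options = {
  url: 'url-to-fetch-a-file'
};

var req = request(options)

req.on('response', function(res){
  var stream = through2.obj(function (chunk, enc, callback) {
    this.push(chunk)
    callback()
  })
  // Pipe into the pass-through right away, before any slow work,
  // so the response is never left flowing with no consumer.
  req.pipe(stream)

  createPath(path, function(){
    var file = fs.createWriteStream(path + fname)
    // Write to the file from the buffered stream instead of piping req again.
    stream.on('data', function(data){
      file.write(data)
    })
    stream.on('end', function(){
      file.end()
    })
  })
})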

JSONStream: handle one piece of data with different parsers

I'm using JSONStream to parse data from a server; the data can be either {"error": "SomeError"} or {"articles": [{"id": 123}]}.
My code looks like this:
var request = require('request');
var JSONStream = require('JSONStream');
var articleIDParser = JSONStream.parse(['articles', true, 'id']);
var errorParser = JSONStream.parse(['error']);
request({url: 'http://XXX/articles.json'})
  .pipe(articleIDParser).pipe(errorParser);

errorParser.on('data', function(data) {
  console.log(data);
});
articleIDParser.on('data', someFuncHere);
But unluckily, the second parser does not work, even when the server returns an error.
Am I using the pipe function or JSONStream incorrectly?
Thanks in advance.
Well, I used the following approach to solve the problem:
var dest = request({url: 'http://XXX/articles.json'});
dest.pipe(articleIDParser);
dest.pipe(errorParser);
An explanation can be found in the Node.js Stream documentation.
The callback of the 'end' event doesn't receive a data parameter; listen for the 'data' event instead. In the case of piping, listen for the 'pipe' event on the destination.
var request, JSONStream, articleIDParser, errorParser;
request = require('request');
JSONStream = require('JSONStream');
articleIDParser = JSONStream.parse(['articles', true, 'id']);
errorParser = JSONStream.parse(['error']);
articleIDParser.on('pipe', function (src) {
  // some code
});
errorParser.on('pipe', function (src) {
  // some code
});
request({url: 'http://XXX/articles.json'}).pipe(articleIDParser).pipe(errorParser);
Note: JSONStream.getParserStream is less ambiguous; with parse one might think you're already parsing, when you're really just getting the parser (a writable stream). If you still have issues, please give more information (code) about your JSONStream usage. The Stream module is still marked as unstable, by the way.

Node Streaming, Writing, and Memory

I'm attempting to dynamically concatenate files prior to serving their content. The following very simplified code shows an approach:
var http = require('http');
var fs = require('fs');
var start = '<!doctype html><html lang="en"><head><script>';
var funcsA = fs.readFileSync('functionsA.js', 'utf8');
var funcsB = fs.readFileSync('functionsB.js', 'utf8');
var funcsC = fs.readFileSync('functionsC.js', 'utf8');
var finish = '</script></head><body>some stuff here</body></html>';
var output = start + funcsA + funcsB + funcsC + finish;
http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/html'});
  res.end(output);
}).listen(9000);
In reality, how I concatenate might depend on clues from the userAgent. My markup and scripts could be several hundred kilobytes combined.
I like this approach because there is no file system I/O happening within createServer. I seem to have read somewhere that this response.write(...) approach is not as efficient/low-overhead as streaming data with fs.createReadStream, and I seem to recall this had something to do with what happens when the client cannot receive data as fast as Node can send it. We seem to be able to create a readable stream from a file system object, but not from memory. Is it possible to do what I have coded above with a streaming approach, with the file I/O happening initially, outside of the createServer callback?
Or, on the other hand, are my concerns not that critical, and does the approach above offer no less efficiency than a streaming approach?
Thanks.
res.write(start)

var A = fs.createReadStream('functionsA.js')
var B = fs.createReadStream('functionsB.js')
var C = fs.createReadStream('functionsC.js')

A.pipe(res, {
  end: false
})
A.on('end', function () {
  B.pipe(res, {
    end: false
  })
})
B.on('end', function () {
  C.pipe(res, {
    end: false
  })
})
C.on('end', function () {
  res.write(finish)
  res.end()
})
Defining streams prior to (and not inside) the createServer callback won't typically work; see here.
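As an aside, newer Node.js versions (12.3+) can also create a readable stream directly from memory with Readable.from, which lets the precomputed string from the question be piped with normal backpressure handling. A minimal sketch, assuming the same output variable built at startup:

const http = require('http');
const { Readable } = require('stream');

http.createServer((req, res) => {
  res.writeHead(200, {'Content-Type': 'text/html'});
  // Wrap the in-memory string in a readable stream and pipe it,
  // instead of sending the whole body in a single res.end() call.
  Readable.from([output]).pipe(res);
}).listen(9000);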

Resources