Node.js request stream ends/stalls when piped to writable file stream

Node.js request stream ends/stalls when piped to writable file stream - node.js

I'm trying to pipe() data from Twitter's Streaming API to a file using modern Node.js Streams. I'm using a library I wrote called TweetPipe, which leverages EventStream and Request.
Setup:
var TweetPipe = require('tweet-pipe')
, fs = require('fs');
var tp = new TweetPipe(myOAuthCreds);
var file = fs.createWriteStream('./tweets.json');
Piping to STDOUT works and stream stays open:
tp.stream('statuses/filter', { track: ['bieber'] })
.pipe(tp.stringify())
.pipe(process.stdout);
Piping to the file writes one tweet and then the stream ends silently:
tp.stream('statuses/filter', { track: ['bieber'] })
.pipe(tp.stringify())
.pipe(file);
Could anyone tell me why this happens?

it's hard to say from what you have here, it sounds like the stream is getting cleaned up before you expect. This can be triggered a number of ways, see here https://github.com/joyent/node/blob/master/lib/stream.js#L89-112
A stream could emit 'end', and then something just stops.
Although I doubt this is the problem, one thing that concerns me is this
https://github.com/peeinears/tweet-pipe/blob/master/index.js#L173-174
destroy should be called after emitting error.
I would normally debug a problem like this by adding logging statements until I can see what is not happening right.
Can you post a script that can be run to reproduce?
(for extra points, include a package.json that specifies the dependencies :)

According to this, you should create an error handler on the stream created by tp.

Related

Unable to use one readable stream to write to two different targets in Node JS

I have a client side app where users can upload an image. I receive this image in my Node JS app as readable data and then manipulate it before saving like this:
uploadPhoto: async (server, request) => {
try {
const randomString = `${uuidv4()}.jpg`;
const stream = Fse.createWriteStream(`${rootUploadPath}/${userId}/${randomString}`);
const resizer = Sharp()
.resize({
width: 450
});
await data.file
.pipe(resizer)
.pipe(stream);
This works fine, and writes the file to the projects local directory. The problem comes when I try to use the same readable data again in the same async function. Please note, all of this code is in a try block.
const stream2 = Fse.createWriteStream(`${rootUploadPath}/${userId}/thumb_${randomString}`);
const resizer2 = Sharp()
.resize({
width: 45
});
await data.file
.pipe(resizer2)
.pipe(stream2);
The second file is written, but when I check the file, it seems corrupted or didn't successfully write the data. The first image is always fine.
I've tried a few things, and found one method that seems to work but I don't understand why. I add this code just before the I create the second write stream:
data.file.on('end', () => {
console.log('There will be no more data.');
});
Putting the code for the second write stream inside the on-end callback block doesn't make a difference, however, if I leave the code outside of the block, between the first write stream code and the second write stream code, then it works, and both files are successfully written.
It doesn't feel right leaving the code the way it is. Is there a better way I can write the second thumb nail image? I've tried to use the Sharp module to read the file after the first write stream writes the data, and then create a smaller version of it, but it doesn't work. The file doesn't ever seem to be ready to use.

You have 2 alternatives, which depends on how your software is designed.
If possible, I would avoid to execute two transform operations on the same stream in the same "context", eg: an API endpoint. I would rather separate those two different tranform so they do not work on the same input stream.
If that is not possible or would require too many changes, the solution is to fork the input stream and the pipe it into two different Writable. I normally use Highland.js fork for these tasks.
Please also see my comments on how to properly handle streams with async/await to check when the write operation is finished.

node.js - res.end vs fs.createWriteStream

I am rather new to Node and am attempting to learn streaming; please correct me if my understanding is flawed.
Using fs.createReadStream and fs.createWriteStream together with .pipe method will effectively stream any kind of data.
Also res.end method utilizes streaming by default.
So could we use fs.createReadStream together with res.end to create the same streaming effect?
How would this look?
Under what circumstances would you normally use res.end?
Thank you

You can use pipe like:
readStream.pipe(res);
To stream some readable stream to the response.
See this answer for a working example of using it.
Basically it's something like:
var s = fs.createReadStream(file);
s.on('open', function () {
s.pipe(res);
});
plus some error handling and MIME types support - see this for full code:
How to serve an image using nodejs
where you can find it used in three examples using three node modules:
express
connect
http

Can npm request module be used in a .pipe() stream?

I am parsing a JSON file using a parsing stream module and would like to stream results to request like this:
var request = require("request")
fs.createReadStream('./data.json')
.pipe(parser)
.pipe(streamer)
.pipe(new KeyFilter({find:"URL"}) )
.pipe(request)
.pipe( etc ... )
(Note: KeyFilter is a custom transform that works fine when piping to process.stdout)
I've read the docs and source code. This won't work with 'request' or 'new request()' because the constructor wants a URL.

It will work with request.put() as this : yourStream.pipe(request.put('http://example.com/form'))

After more research and experimenting I've concluded that request cannot be used in this way. The simple answer is that request creates a readable stream and .pipe() methods expects a writable stream.
I tried several attempts to wrap request in a transform to get by this with no luck. While you can receive the piped url and create a new request, I can't figure out how to reset the pipe callbacks without some truly unnatural bending of the stream pattern. Any thoughts would be appreciated, but I have moved on to using an event in the url stream to kick off a new request(url).pipe(etc) type stream.

Piping a stream to a stream with custom results

After reading and marginally understanding the node stream handbook, I want to use streams whenever it seems appropriate/possible.
I have a request that uploads a file which should be written to another spot on the file system. This is done via:
readStream = fs.createReadStream(request.files.file.path);
readStream.pipe(fs.createWriteStream(targetPath));
This works great, but I want to pipe the result of the write stream to a response -- specifically I want the target path to be piped to the result when it's successful. Right now I'm doing:
readStream.pipe(fs.createWriteStream(targetPath)).on("close", function ()
serverResponse.send(200, targetPath);
});
This works fine, but I feel like it is more verbose than it needs to be and I should be able to call .pipe on the result as in read.pipe(write).pipe(respose).
Is there something I can do to get the write stream to pipe the target path to the response or better way I can go about doing what I'm doing?

Websockets with Streaming Archives

So this is the setup I'm working with:
I am on an express server which must stream an archived binary payload to a browser (does not matter if it is zip, tar or tar.gz - although zip would be nice).
On this server, I have a websocket open that connects to another server which is sending me binary payloads of individual files in a directory. I get these payloads streamed, piece-by-piece, as buffers, and I'm doing this serially (that is - file-by-file - there aren't multiple websockets open at one time, and there is one websocket per file). This is the websocket library I'm using: https://github.com/einaros/ws
I would like to go through each file, open a websocket, and then append the buffers to an archiver as they come through the websockets. When data is appended to the archiver, it would be nice if I could stream the ouput of the archiver to the browser (via the response object with response.write). So, basically, as I'm getting the payload from the websocket, I would like that payload streamed through an archiver and then to the response. :-)
Some things I have looked into:
node-zipstream - This is nice because it gives me an output stream I can pipe directly to response.write. However, it doesn't appear to support nested files/folders, and, more importantly, it only accepts an input stream. I have looked at the source code (which is quite terse and readable), and it seems as though, if I were able to have access to the update function within ZipStream.prototype.addFile, I could just call that each time on the message event when I get a binary buffer from the websocket. This is quite messy/hacky though, and, given that this library already doesn't seem to support nested files/folders, I'm not sure I will be going with it.
node-archiver - This suffers from the same issue as node-zipstream (probably because it was inspired by it) where it allows me to pipe the output, but I cannot append multiple buffers for the same file within the archive (and then manually signal when the last buffer has been appended for a given file). However, it does allow me to have nested folders, which is a clear win over node-zipstream.
Is there something I'm not aware of, or is this just a really crazy thing that I want to do?
The only alternative I see at this point is to wait for the entire payload to be streamed through a websocket and then append with node-archiver, but I really would like to reap the benefit of true streaming/archiving on-the-fly.
I've also thought about the possibility of creating a read stream of sorts just to serve as a proxy object that I can pass into node-archiver and then just append the buffers I get from the websocket to this read stream. Looking at various read streams, I'm not sure how to do this though. The only way I could think of was creating a writestream, piping buffers to it, and having a readstream read from that writestream. Am I on the correct thought process here?
As always, thanks for any help/direction you can offer SO community.
EDIT:
Since I just opened this question, and I'm new to node, there may be a better answer than the one I provided. I will keep this question open and accept a better answer if one presents itself within a few days. As always, I will upvote any other answers, even if they're ridiculous, as long as they're correct and allow me to stream on-the-fly as mine does.

I figured out a way to get this working with node-archiver. :-)
It was based off my hunch of creating a temporary "proxy stream" of sorts, inspired by this SO question: How to create streams from string in Node.Js?
The basic gist is (coffeescript syntax):
archive = archiver 'zip'
archive.pipe response // where response is the http response
// and then for each file...
fileName = ... // known file name
fileSize = ... // known file size
ws = .... // create websocket
proxyStream = new Stream()
numBytesStreamed = 0
archive.append proxyStream, name: fileName
ws.on 'message', (dataBuffer) ->
numBytesStreamed += dataBuffer.length
proxyStream.emit 'data', dataBuffer
if numBytesStreamed is fileSize
proxyStream.emit 'end'
// function/indicator to do this for the next file in the folder
// and then when you're completely done...
archive.finalize (err, bytesOfArchive) ->
if err?
// do whatever
else
// unless you somehow knew this ahead of time
res.addTrailers
'Content-Length': bytesOfArchive
res.end()
Note that this is not the complete solution I implemented. There is still a lot of logic dealing with getting the files, their paths, etc. Not to mention error-handling.
EDIT:
Since I just opened this question, and I'm new to node, there may be a better answer. I will keep this question open and accept a better answer if one presents itself within a few days. As always, I will upvote any other answers, even if they're ridiculous, as long as they're correct and allow me to stream on-the-fly as mine does.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string