res.end vs fs.createWriteStream

I am rather new to Node and am attempting to learn streaming; please correct me if my understanding is flawed.
Using fs.createReadStream and fs.createWriteStream together with the .pipe() method will effectively stream any kind of data.
Also, the res.end method utilizes streaming by default.
So could we use fs.createReadStream together with res.end to create the same streaming effect?
How would this look?
Under what circumstances would you normally use res.end?
Thank you

You can use pipe like:
readStream.pipe(res);
to stream some readable stream to the response.
See this answer for a working example of using it.
Basically it's something like:
var s = fs.createReadStream(file);
s.on('open', function () {
  s.pipe(res);
});
plus some error handling and MIME type support - see this for the full code (a fuller sketch also follows the list below):
How to serve an image using nodejs
where you can find it used in three examples using three Node modules:
express
connect
http
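For completeness, here is a minimal sketch of that fuller pattern using the plain http module. The images directory, port, and the small MIME map are illustrative assumptions, not code from the linked answer:
var http = require('http');
var fs = require('fs');
var path = require('path');

// Illustrative MIME map - extend as needed.
var mimeTypes = {
  '.png': 'image/png',
  '.jpg': 'image/jpeg',
  '.gif': 'image/gif'
};

http.createServer(function (req, res) {
  var file = path.join(__dirname, 'images', path.basename(req.url));
  var s = fs.createReadStream(file);

  s.on('open', function () {
    res.setHeader('Content-Type', mimeTypes[path.extname(file)] || 'application/octet-stream');
    s.pipe(res);
  });
  s.on('error', function () {
    // Missing or unreadable file: answer with a 404 instead of crashing.
    res.statusCode = 404;
    res.end('Not found');
  });
}).listen(3000);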

Related

Unable to use one readable stream to write to two different targets in Node JS

I have a client side app where users can upload an image. I receive this image in my Node JS app as readable data and then manipulate it before saving like this:
uploadPhoto: async (server, request) => {
  try {
    const randomString = `${uuidv4()}.jpg`;
    const stream = Fse.createWriteStream(`${rootUploadPath}/${userId}/${randomString}`);
    const resizer = Sharp()
      .resize({
        width: 450
      });
    await data.file
      .pipe(resizer)
      .pipe(stream);
This works fine and writes the file to the project's local directory. The problem comes when I try to use the same readable data again in the same async function. Please note, all of this code is in a try block.
const stream2 = Fse.createWriteStream(`${rootUploadPath}/${userId}/thumb_${randomString}`);
const resizer2 = Sharp()
  .resize({
    width: 45
  });
await data.file
  .pipe(resizer2)
  .pipe(stream2);
The second file is written, but when I check the file, it seems corrupted or didn't successfully write the data. The first image is always fine.
I've tried a few things and found one method that seems to work, but I don't understand why. I add this code just before I create the second write stream:
data.file.on('end', () => {
  console.log('There will be no more data.');
});
Putting the code for the second write stream inside the on-end callback doesn't make a difference; however, if I leave the code outside of the block, between the first write stream code and the second write stream code, then it works and both files are successfully written.
It doesn't feel right leaving the code this way. Is there a better way I can write the second thumbnail image? I've tried using the Sharp module to read the file after the first write stream writes the data, and then create a smaller version of it, but it doesn't work. The file never seems to be ready to use.
You have two alternatives; which one to use depends on how your software is designed.
If possible, I would avoid executing two transform operations on the same stream in the same "context", e.g. an API endpoint. I would rather separate those two different transforms so they do not work on the same input stream.
If that is not possible or would require too many changes, the solution is to fork the input stream and then pipe it into two different Writables (see the sketch below). I normally use Highland.js fork for these tasks.
Please also see my comments on how to properly handle streams with async/await to check when the write operation is finished.
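For illustration, here is a minimal sketch of the fork approach without Highland.js, assuming Node 15+ for stream/promises and reusing the question's Fse, Sharp, data.file, rootUploadPath, userId and randomString names:
const { PassThrough } = require('stream');
const { pipeline } = require('stream/promises'); // Node 15+

const branchA = new PassThrough();
const branchB = new PassThrough();

// A readable may be piped to several destinations, so each branch
// receives a full copy of the uploaded data.
data.file.pipe(branchA);
data.file.pipe(branchB);

// pipeline resolves only once the write stream has finished, which also
// addresses the "when is the write done?" concern mentioned above.
await Promise.all([
  pipeline(branchA, Sharp().resize({ width: 450 }),
    Fse.createWriteStream(`${rootUploadPath}/${userId}/${randomString}`)),
  pipeline(branchB, Sharp().resize({ width: 45 }),
    Fse.createWriteStream(`${rootUploadPath}/${userId}/thumb_${randomString}`))
]);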

Can npm request module be used in a .pipe() stream?

I am parsing a JSON file using a parsing stream module and would like to stream results to request like this:
var request = require("request");
var fs = require("fs");

fs.createReadStream('./data.json')
  .pipe(parser)
  .pipe(streamer)
  .pipe(new KeyFilter({ find: "URL" }))
  .pipe(request)
  .pipe( etc ... )
(Note: KeyFilter is a custom transform that works fine when piping to process.stdout)
I've read the docs and source code. This won't work with request or new request() because the constructor wants a URL.
It will work with request.put(), like this: yourStream.pipe(request.put('http://example.com/form'))
After more research and experimenting, I've concluded that request cannot be used in this way. The simple answer is that request creates a readable stream, and .pipe() expects a writable stream.
I made several attempts to wrap request in a transform to get by this, with no luck. While you can receive the piped URL and create a new request, I can't figure out how to reset the pipe callbacks without some truly unnatural bending of the stream pattern. Any thoughts would be appreciated, but I have moved on to using an event in the URL stream to kick off a new request(url).pipe(etc) type stream.
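For illustration, a minimal sketch of that event-based workaround; the object-mode Readable below stands in for the question's parser/KeyFilter pipeline, and the URLs are placeholders:
var request = require('request');
var stream = require('stream');

// Stand-in for the upstream pipeline that emits URL strings.
var urls = new stream.Readable({ objectMode: true, read: function () {} });

urls.on('data', function (url) {
  // Kick off a fresh request per URL and pipe its response onward.
  request(url).pipe(process.stdout);
});

urls.push('http://example.com/a.json');
urls.push('http://example.com/b.json');
urls.push(null); // no more URLs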

read (pull) vs pipe(control flow) vs data(push)

Node.js has different options to consume data: streams 0, 1, 2, 3 and so on.
My question is about the real-life application of these different options. I understand the difference between readable/read, the data event, and pipe fairly well, but I am not very confident about selecting a specific method.
For example, if I want flow control, both read with some manual work and pipe can be used. The data event ignores flow control, so should I stop using the plain data event?
For most things, you should be able to use
src.pipe(dest);
If you look at the source code for the Stream.prototype.pipe implementation, you can see that it's just a very handy wrapper that sets everything up for you.
For all the work I do with streams, I generally just choose the proper stream type (Readable, Writable, Duplex, Transform, or PassThrough) and then define the proper methods (_read, _write, and/or _transform) on the stream. Lastly, I use .pipe to connect everything together.
It's very common to see stream setups that appear to be "circular"
client.pipe(encoder).pipe(server).pipe(decoder).pipe(client)
As an example, here's a stream I'm using in my burro module. You can write objects to this stream, and you can read JSON strings from it.
// burro/encoder.js
var stream = require("stream"),
    util = require("util");

var Encoder = module.exports = function Encoder() {
  stream.Transform.call(this, {objectMode: true});
};
util.inherits(Encoder, stream.Transform);

Encoder.prototype._transform = function _transform(obj, encoding, callback) {
  this.push(JSON.stringify(obj));
  callback(null);
};
As a general recommendation, you will almost always write your Streams like this. That is, you write your own "class" that inherits from one of the built-in streams. It is not really practical for you to use a built-in stream directly.
To demonstrate how you might use this, start by creating a new instance of the stream:
var encoder = new Encoder();
See what the encoder outputs by piping it to stdout:
encoder.pipe(process.stdout);
Write some sample objects to it:
encoder.write({foo: "bar", a: "b"});
// '{"foo":"bar","a":"b"}'
encoder.write({hello: "world"});
// '{"hello":"world"}'

Which nodejs library should I use to write into HDFS?

I have a nodejs application and I want to write data into the Hadoop HDFS file system. I have seen two main nodejs libraries that can do it: node-hdfs and node-webhdfs. Has anyone tried them? Any hints? Which one should I use in production?
I am inclined to use node-webhdfs since it uses WebHDFS REST API. node-hdfs seem to be a c++ binding.
Any help will be greatly appreciated.
You may want to check out the webhdfs library. It provides a nice and straightforward interface (similar to the fs module API) for WebHDFS REST API calls.
Writing to the remote file:
var fs = require('fs');
var WebHDFS = require('webhdfs');

var hdfs = WebHDFS.createClient();

var localFileStream = fs.createReadStream('/path/to/local/file');
var remoteFileStream = hdfs.createWriteStream('/path/to/remote/file');

localFileStream.pipe(remoteFileStream);
remoteFileStream.on('error', function onError (err) {
  // Do something with the error
});

remoteFileStream.on('finish', function onFinish () {
  // Upload is done
});
Reading from the remote file:
var WebHDFS = require('webhdfs');
var hdfs = WebHDFS.createClient();
var remoteFileStream = hdfs.createReadStream('/path/to/remote/file');
remoteFileStream.on('error', function onError (err) {
  // Do something with the error
});

remoteFileStream.on('data', function onChunk (chunk) {
  // Do something with the data chunk
});

remoteFileStream.on('end', function onEnd () {
  // Done reading the remote file
});
Not good news!
Do not use node-hdfs. Although it seems promising, it is now two years out of date. I've tried to compile it, but it does not match the symbols of the current libhdfs. If you want to use something like that, you'll have to make your own nodejs binding.
You can use node-webhdfs, but IMHO there's not much advantage in that. It is better to use an http nodejs lib to make your own requests. The hardest part here is dealing with the very async nature of nodejs: you might first want to create a folder, then after successfully creating it, create a file, and then, at last, write or append data, everything through http requests that you must send and then wait for the answer before moving on.
At least node-webhdfs might be a good reference for you to take a look at and start your own code.
Br,
Fabio Moreira
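As a rough sketch of that roll-your-own approach (hostname, port, user.name and paths are placeholders; the two-step PUT follows the documented WebHDFS op=CREATE flow, in which the namenode replies with a 307 redirect pointing at a datanode):
var http = require('http');
var fs = require('fs');
var url = require('url');

// Step 1: ask the namenode where to write. No body is sent here; the
// namenode answers with a 307 redirect whose Location is a datanode URL.
var req = http.request({
  method: 'PUT',
  hostname: 'namenode.example.com', // placeholder
  port: 50070,                      // common WebHDFS port, adjust as needed
  path: '/webhdfs/v1/tmp/file.txt?op=CREATE&user.name=hduser'
}, function (res) {
  var target = url.parse(res.headers.location);

  // Step 2: stream the local file to the datanode we were redirected to.
  var put = http.request({
    method: 'PUT',
    hostname: target.hostname,
    port: target.port,
    path: target.path
  }, function (res2) {
    console.log('CREATE finished with HTTP status', res2.statusCode); // expect 201
  });

  fs.createReadStream('/path/to/local/file').pipe(put);
});
req.end();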

Node.js request stream ends/stalls when piped to writable file stream

I'm trying to pipe() data from Twitter's Streaming API to a file using modern Node.js Streams. I'm using a library I wrote called TweetPipe, which leverages EventStream and Request.
Setup:
var TweetPipe = require('tweet-pipe')
  , fs = require('fs');

var tp = new TweetPipe(myOAuthCreds);
var file = fs.createWriteStream('./tweets.json');
Piping to STDOUT works and stream stays open:
tp.stream('statuses/filter', { track: ['bieber'] })
  .pipe(tp.stringify())
  .pipe(process.stdout);
Piping to the file writes one tweet and then the stream ends silently:
tp.stream('statuses/filter', { track: ['bieber'] })
  .pipe(tp.stringify())
  .pipe(file);
Could anyone tell me why this happens?
It's hard to say from what you have here; it sounds like the stream is getting cleaned up before you expect. This can be triggered a number of ways - see here: https://github.com/joyent/node/blob/master/lib/stream.js#L89-112
A stream could emit 'end', and then something just stops.
Although I doubt this is the problem, one thing that concerns me is this:
https://github.com/peeinears/tweet-pipe/blob/master/index.js#L173-174
destroy should be called after emitting 'error'.
I would normally debug a problem like this by adding logging statements until I can see what is not happening right.
Can you post a script that can be run to reproduce?
(for extra points, include a package.json that specifies the dependencies :)
According to this, you should create an error handler on the stream created by tp.
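For example, a minimal sketch of attaching 'error' handlers to the question's setup, so failures surface instead of the pipeline ending silently:
var stream = tp.stream('statuses/filter', { track: ['bieber'] });

stream.on('error', function (err) {
  console.error('tweet stream error:', err);
});

stream
  .pipe(tp.stringify())
  .pipe(file) // .pipe returns the destination, so this handler is on `file`
  .on('error', function (err) {
    console.error('file stream error:', err);
  });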
