Getting a ReadableStream from something that writes to WritableStreams - node.js

I've never used streams in Node.js, so I apologize in advance if this is trivial.
I'm using the ya-csv library to create a CSV. I use a line like this:
csvwriter = csv.createCsvStreamWriter(process.stdout)
As I understand it, this takes a writable Stream and writes to it when I add a record.
I need to use this CSV as an email attachment.
From nodemailer's docs, here is how to do that:
attachments: [
  { // stream as an attachment
    fileName: "text4.txt",
    streamSource: fs.createReadStream("file.txt")
  }
]
As I understand it, this takes a readable Stream and reads from it.
Therein lies the problem. I need a readable Stream, I need a writable Stream, but at no point do I have a Stream.
It would be nice if ya-csv had a:
csvwriter = csv.createReadableCsvStream()
But it doesn't. Is there some built-in stream that makes whatever is written to it available for reading? I've looked for a library with no success (though there are a few things that could work but seem like overkill).

You can use a PassThrough stream for that:
var csv = require('ya-csv');
var PassThrough = require('stream').PassThrough;
var stream = new PassThrough();
var csvwriter = csv.createCsvStreamWriter(stream);
Now you can read from stream whatever is written to it.
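Here is a minimal sketch of that idea using only Node's built-in stream module. The writeRecord helper is a hypothetical stand-in for a CSV writer (it is not ya-csv's API); it only illustrates that whatever a producer writes into a PassThrough can be read back out of the same object.

```javascript
const { PassThrough } = require('stream');

// Hypothetical stand-in for a CSV writer that only knows how to write
// to a writable stream (NOT ya-csv's API, just an illustration).
function writeRecord(writable, fields) {
  writable.write(fields.join(',') + '\n');
}

// A PassThrough is both writable and readable: bytes written to it
// come out of its readable side unchanged.
const bridge = new PassThrough();
bridge.setEncoding('utf8');

writeRecord(bridge, ['id', 'name']);
writeRecord(bridge, [1, 'alice']);
bridge.end(); // signal that no more records will be written

// Anything that accepts a readable stream could be handed `bridge`.
// Here we simply read what has been buffered so far.
const csvText = bridge.read();
console.log(csvText);
```

In the question's setup, the same object would be passed both to the CSV writer (as its writable target) and to the attachment option (as its readable source).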

Related

Using streams to pipe data in node.js

Setup: I am reading data from a database and emailing the dataset as a CSV file. The data get read in chunks of 500 rows, and I'm using ya-csv to write the records as a CSV to a stream. I then want to use mailgun-js to email the file as an attachment.
Option 1 (what I don't want to do):
1. create temp file;
2. create write stream to that file;
3. write all CSV records;
4. read all of it back into memory to attach to an email.
Option 2 (what I want to do but don't quite know how to):
1. create a writable stream;
2. create a readable stream;
3. somehow pipe the writes from (1) into (2);
4. pass the writable stream to ya-csv;
5. pass the readable stream to mailgun;
6. fetch data and write to the write stream until there's no more data;
7. end the write stream, thus ending the read stream and sending the email.
I've been reading https://github.com/substack/stream-handbook and https://nodejs.org/api/stream.html, and the problem is that I can't use writable.pipe(readable);.
I have tried using a Duplex stream (i.e. both the write and read streams are just Duplex streams) but this doesn't work as Duplex is an abstract class and I'd have to implement several of the linking parts.
Question: how do I use streams to link up this writing of CSV records to streaming an attachment to mailgun?
Don't overthink it. mailgun-js can take a stream as an attachment, so it can be as easy as:
var csv = require('csv');
// api_key and domain are placeholders for your own Mailgun credentials
var mailgun = require('mailgun-js')({ apiKey: api_key, domain: domain });
// this will stream some csv
var file = myDbStream.pipe(csv.stringify());
var data = {
  from: 'Excited User <me@samples.mailgun.org>',
  to: 'serobnic@mail.ru',
  subject: 'Hello',
  text: 'Testing some Mailgun awesomness!',
  attachment: file // attach it to your message, mailgun should deal with it
};
mailgun.messages().send(data, function (error, body) {
  console.log(body);
});
I don't know what your DB is; maybe the driver already has support for streams, or you'll have to feed csv manually (which can be done very easily with event-stream).
Edit
ya-csv doesn't seem to be easy to pipe; csv works better.
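If the database driver has no stream support, feeding the data manually could look roughly like the sketch below. It uses a plain PassThrough from Node's stream module rather than event-stream, and hand-rolls the CSV quoting instead of calling the csv package. The names fetchBatch, fakeFetch and writeAllRows are made up for illustration; fetchBatch is assumed to resolve to an array of rows, or to an empty array once the data is exhausted.

```javascript
const { PassThrough } = require('stream');
const { once } = require('events');

// Quote a field only when it contains a comma, a quote, or a newline.
function toCsvLine(row) {
  return row.map((field) => {
    const text = String(field);
    return /[",\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
  }).join(',') + '\n';
}

// Pull batches from `fetchBatch` (hypothetical) and write them into
// `out`, waiting for 'drain' whenever write() reports a full buffer.
async function writeAllRows(fetchBatch, out) {
  let offset = 0;
  while (true) {
    const rows = await fetchBatch(offset);
    if (rows.length === 0) break;
    for (const row of rows) {
      if (!out.write(toCsvLine(row))) await once(out, 'drain');
    }
    offset += rows.length;
  }
  out.end(); // no more data: the readable side ends after it is consumed
}

// Demo with in-memory data standing in for the database.
const table = [[1, 'alice'], [2, 'bob, jr.'], [3, 'carol']];
const fakeFetch = async (offset) => table.slice(offset, offset + 2);

const attachment = new PassThrough(); // this is what the mailer would get
attachment.pipe(process.stdout);
writeAllRows(fakeFetch, attachment).catch(console.error);
```

Because the writer waits for 'drain', rows are only fetched as fast as the consumer reads them, so the whole dataset never has to sit in memory at once.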

Can npm request module be used in a .pipe() stream?

I am parsing a JSON file using a parsing stream module and would like to stream results to request like this:
var request = require("request")
fs.createReadStream('./data.json')
.pipe(parser)
.pipe(streamer)
.pipe(new KeyFilter({find:"URL"}) )
.pipe(request)
.pipe( etc ... )
(Note: KeyFilter is a custom transform that works fine when piping to process.stdout)
I've read the docs and the source code. This won't work with request or new request() because the constructor wants a URL.
It will work with request.put(), like this: yourStream.pipe(request.put('http://example.com/form'))
After more research and experimenting, I've concluded that request cannot be used in this way. The simple answer is that request creates a readable stream, while .pipe() expects a writable stream.
I made several attempts to wrap request in a transform to get around this, with no luck. While you can receive the piped URL and create a new request, I can't figure out how to reset the pipe callbacks without some truly unnatural bending of the stream pattern. Any thoughts would be appreciated, but I have moved on to using an event in the URL stream to kick off a new request(url).pipe(etc) type stream.
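For what it's worth, one possible shape for "a transform that makes one request per incoming URL" is sketched below. This is not something the thread confirms as working for this exact case; it assumes an object-mode Transform and an injected fetchBody(url, callback) function, which is a hypothetical name. In real code fetchBody could wrap request(url, cb); here it is a stub so the example runs without a network.

```javascript
const { Transform } = require('stream');

// Object-mode Transform: receives URL strings and emits one result
// object per URL. `fetchBody` is a hypothetical injected function.
class UrlFetcher extends Transform {
  constructor(fetchBody) {
    super({ objectMode: true });
    this.fetchBody = fetchBody;
  }

  _transform(url, encoding, callback) {
    // The transform callback is only invoked once the response arrives,
    // so at most one request is in flight and backpressure is preserved.
    this.fetchBody(String(url), (err, body) => {
      if (err) return callback(err);
      callback(null, { url: String(url), body });
    });
  }
}

// Demo with a stub instead of a real HTTP call.
const stubFetch = (url, cb) => cb(null, 'body of ' + url);
const fetcher = new UrlFetcher(stubFetch);
fetcher.write('http://example.com/a');
const firstResult = fetcher.read();
console.log(firstResult);
```

The design choice here is to emit the response body as data rather than trying to splice the request's own stream into the pipeline, which is the part that does not fit the .pipe() model.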

read (pull) vs pipe(control flow) vs data(push)

Node.js offers several ways to consume data from a stream (Streams 0, 1, 2, 3 and so on).
My question is about the real-life application of these options. I understand the difference between readable/read(), the data event, and pipe fairly well, but I'm not very confident about choosing a specific method.
For example, if I want flow control, I can use read() with some manual work, or I can use pipe.
The data event ignores flow control; should I stop using the plain data event?
For most things, you should be able to use
src.pipe(dest);
If you look at the source code for the Stream.prototype.pipe implementation, you can see that it's just a very handy wrapper that sets everything up for you.
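To make the "handy wrapper" point concrete, here is a deliberately simplified sketch of the flow control that pipe arranges. It is not the actual Node.js source; the real implementation also deals with errors, unpipe, and multiple destinations. simplePipe is a made-up name.

```javascript
const { Readable, PassThrough } = require('stream');

// Simplified picture of what src.pipe(dest) sets up.
function simplePipe(src, dest) {
  src.on('data', (chunk) => {
    // write() returns false when dest's internal buffer is full
    if (dest.write(chunk) === false) {
      src.pause();                             // stop the 'data' events
      dest.once('drain', () => src.resume());  // continue once flushed
    }
  });
  src.on('end', () => dest.end());
  return dest; // returning dest is what makes chaining possible
}

// Demo: push two chunks through the hand-written pipe.
const sink = new PassThrough();
sink.on('data', (chunk) => console.log('got', chunk.toString()));
simplePipe(Readable.from(['a', 'b']), sink);
```

This is also why a bare data handler "ignores flow control": without the pause/drain/resume bookkeeping, the source keeps emitting regardless of how busy the destination is.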
For all the work I do with streams, I generally just choose the proper stream type (Readable, Writable, Duplex, Transform, or PassThrough) and then define the proper methods (_read, _write, and/or _transform) on the stream. Lastly, I use .pipe to connect everything together.
It's very common to see stream setups that appear to be "circular":
client.pipe(encoder).pipe(server).pipe(decoder).pipe(client)
As an example, here's a stream I'm using in my burro module. You can write objects to this stream, and you can read JSON strings from it.
// burro/encoder.js
var stream = require("stream"),
    util = require("util");

var Encoder = module.exports = function Encoder() {
  stream.Transform.call(this, {objectMode: true});
};

util.inherits(Encoder, stream.Transform);

Encoder.prototype._transform = function _transform(obj, encoding, callback) {
  this.push(JSON.stringify(obj));
  callback(null);
};
As a general recommendation, you will almost always write your Streams like this. That is, you write your own "class" that inherits from one of the built-in streams. It is not really practical for you to use a built-in stream directly.
To demonstrate how you might use this, start by creating a new instance of the stream:
var encoder = new Encoder();
See what the encoder outputs by piping it to stdout:
encoder.pipe(process.stdout);
Write some sample objects to it:
encoder.write({foo: "bar", a: "b"});
// '{"foo":"bar","a":"b"}'
encoder.write({hello: "world"});
// '{"hello":"world"}'
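On newer Node.js versions the same kind of encoder can be written with class syntax instead of util.inherits. The sketch below is an equivalent I'd expect to behave the same way, not the burro module's actual code; it reads the results back with read() instead of piping to stdout so the output is easy to inspect.

```javascript
const { Transform } = require('stream');

// Objects in, JSON strings out.
class Encoder extends Transform {
  constructor() {
    super({ objectMode: true });
  }

  _transform(obj, encoding, callback) {
    // Passing a second argument to callback is shorthand for this.push()
    callback(null, JSON.stringify(obj));
  }
}

const encoder = new Encoder();
encoder.write({ foo: 'bar', a: 'b' });
encoder.write({ hello: 'world' });

// In object mode each read() returns exactly one pushed item.
const first = encoder.read();
const second = encoder.read();
console.log(first, second);
```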

Node.js request stream ends/stalls when piped to writable file stream

I'm trying to pipe() data from Twitter's Streaming API to a file using modern Node.js Streams. I'm using a library I wrote called TweetPipe, which leverages EventStream and Request.
Setup:
var TweetPipe = require('tweet-pipe')
, fs = require('fs');
var tp = new TweetPipe(myOAuthCreds);
var file = fs.createWriteStream('./tweets.json');
Piping to STDOUT works and stream stays open:
tp.stream('statuses/filter', { track: ['bieber'] })
.pipe(tp.stringify())
.pipe(process.stdout);
Piping to the file writes one tweet and then the stream ends silently:
tp.stream('statuses/filter', { track: ['bieber'] })
.pipe(tp.stringify())
.pipe(file);
Could anyone tell me why this happens?
It's hard to say from what you have here, but it sounds like the stream is getting cleaned up before you expect. This can be triggered in a number of ways; see https://github.com/joyent/node/blob/master/lib/stream.js#L89-112
A stream could emit 'end', and then something just stops.
Although I doubt this is the problem, one thing that concerns me is this
https://github.com/peeinears/tweet-pipe/blob/master/index.js#L173-174
destroy should be called after emitting error.
I would normally debug a problem like this by adding logging statements until I can see what is not happening right.
Can you post a script that can be run to reproduce?
(for extra points, include a package.json that specifies the dependencies :)
According to this, you should create an error handler on the stream created by tp.
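As a sketch of the "add logging until you see what stops" approach combined with an error handler, something like the helper below can be attached to every stage of the pipeline. traceStream, its label argument, and its log argument are illustrative names, not part of tweet-pipe or of Node.

```javascript
const { PassThrough } = require('stream');

// Attach listeners for the lifecycle events that usually explain why
// a pipeline stopped. Returns the stream so calls can be chained.
function traceStream(stream, label, log = console.log) {
  for (const event of ['error', 'end', 'finish', 'close']) {
    stream.on(event, (arg) => {
      const detail = arg instanceof Error ? ': ' + arg.message : '';
      log(label + ' ' + event + detail);
    });
  }
  return stream;
}

// Demo: with a listener attached, an 'error' event is reported instead
// of being thrown as an uncaught exception.
const events = [];
const source = traceStream(new PassThrough(), 'source', (line) => events.push(line));
source.emit('error', new Error('connection reset'));
console.log(events);
```

In the question's code this would mean wrapping each stage, for example traceStream(tp.stream(...), 'twitter'), traceStream(tp.stringify(), 'stringify') and traceStream(file, 'file'), and then checking which label logs first when the output stops.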

Node.js: Processing a stream without running out of memory

I'm trying to read a giant logfile (250,000 lines), parse each line into a JSON object, and insert each JSON object into CouchDB for analytics.
I'm trying to do this by creating a buffered stream that will process each chunk separately, but I always run out of memory after about 300 lines. It seems like using buffered streams and util.pump should avoid this, but apparently not.
(Perhaps there are better tools for this than node.js and CouchDB, but I'm interested in learning how to do this kind of file processing in node.js and think it should be possible.)
CoffeeScript below, JavaScript here: https://gist.github.com/5a89d3590f0a9ca62a23
fs = require 'fs'
util = require('util')
BufferStream = require('bufferstream')
files = [
  "logfile1",
]
files.forEach (file)->
  stream = new BufferStream({encoding:'utf8', size:'flexible'})
  stream.split("\n")
  stream.on("split", (chunk, token)->
    line = chunk.toString()
    # parse line into JSON and insert in database
  )
  util.pump(fs.createReadStream(file, {encoding: 'utf8'}), stream)
Maybe this helps:
Memory leak when using streams in Node.js?
Try to use pipe() to solve it.
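A rough sketch of what a pipe()-based version could look like in plain JavaScript is shown below. It replaces bufferstream with a small hand-written line splitter, and insertDoc is a hypothetical promise-returning function standing in for the CouchDB insert. The key point is that the writable only calls its callback after the insert finishes, so pipe() pauses the file read while the database is busy.

```javascript
const { Transform, Writable } = require('stream');
const { StringDecoder } = require('string_decoder');

// Splits incoming bytes into lines and emits one line per chunk.
class LineSplitter extends Transform {
  constructor() {
    super({ readableObjectMode: true });
    this.decoder = new StringDecoder('utf8'); // safe across chunk boundaries
    this.tail = '';
  }

  _transform(chunk, encoding, callback) {
    const lines = (this.tail + this.decoder.write(chunk)).split('\n');
    this.tail = lines.pop(); // the last piece may be an incomplete line
    for (const line of lines) this.push(line);
    callback();
  }

  _flush(callback) {
    const rest = this.tail + this.decoder.end();
    if (rest) this.push(rest);
    callback();
  }
}

// Writable that acknowledges a line only after `insertDoc` (hypothetical)
// has resolved, which is what gives the pipeline its backpressure.
function createInserter(insertDoc) {
  return new Writable({
    objectMode: true,
    write(line, encoding, callback) {
      insertDoc({ raw: line }).then(() => callback(), callback);
    },
  });
}

// Demo with an in-memory array standing in for the database.
const stored = [];
const fakeInsert = async (doc) => { stored.push(doc); };
const splitter = new LineSplitter();
splitter.pipe(createInserter(fakeInsert))
  .on('finish', () => console.log('inserted', stored.length, 'docs'));
splitter.end('one\ntwo\nthree');
```

With a real file, the chain would be along the lines of fs.createReadStream(file).pipe(new LineSplitter()).pipe(createInserter(insertDoc)), with the line parsing done inside the writable before the insert.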

Resources