JSONStream: handling one data source with different parsers - node.js

I'm using JSONStream to parse data from a server. The data can be either like {"error": "SomeError"} or like {"articles": [{"id": 123}]}.
My code goes like this:
var request = require('request');
var JSONStream = require('JSONStream');

var articleIDParser = JSONStream.parse(['articles', true, 'id']);
var errorParser = JSONStream.parse(['error']);

request({url: 'http://XXX/articles.json'})
  .pipe(articleIDParser)
  .pipe(errorParser);

errorParser.on('data', function (data) {
  console.log(data);
});
articleIDParser.on('data', someFuncHere);
Unluckily, the second parser does not work even when the server returns an error.
Am I using the pipe function wrong, or is it JSONStream?
Thanks in advance.

Well, I used the following approach to solve the problem: pipe the same source into both parsers instead of chaining them (.pipe() returns its destination, so chaining feeds the first parser's output into the second parser rather than the original response):
var dest = request({url: 'http://XXX/articles.json'});
dest.pipe(articleIDParser);
dest.pipe(errorParser);

An explanation can be found in the Node.js Stream documentation.
The callback of the 'end' event doesn't receive a data parameter, so listen for the 'data' event instead. In the case of piping, you can also listen for the 'pipe' event on the destination.
var request = require('request');
var JSONStream = require('JSONStream');

var articleIDParser = JSONStream.parse(['articles', true, 'id']);
var errorParser = JSONStream.parse(['error']);

articleIDParser.on('pipe', function (src) {
  // some code
});
errorParser.on('pipe', function (src) {
  // some code
});

request({url: 'http://XXX/articles.json'}).pipe(articleIDParser).pipe(errorParser);
Note: JSONStream.getParserStream is a less ambiguous name; with parse one might think you're already parsing, while you're really just getting the parser (a writable stream). If you still have issues, please give more information (code) about JSONStream. The Stream module is still marked as unstable, by the way.
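Putting it together, here is a minimal sketch of the working two-parser setup, reusing the question's URL and parser paths; someFuncHere is replaced by a simple logger for illustration:
var request = require('request');
var JSONStream = require('JSONStream');

var articleIDParser = JSONStream.parse(['articles', true, 'id']);
var errorParser = JSONStream.parse(['error']);

articleIDParser.on('data', function (id) {
  console.log('article id:', id); // stand-in for someFuncHere
});
errorParser.on('data', function (err) {
  console.log('server reported error:', err);
});

// Pipe the same response into both parsers; do not chain the pipes.
var res = request({url: 'http://XXX/articles.json'});
res.pipe(articleIDParser);
res.pipe(errorParser);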

Related

request.on in http.createServer(function(request,response) {});

var http = require('http');
var map = require('through2-map');

var uc = map(function (ch) {
  return ch.toString().toUpperCase();
});

var server = http.createServer(function (request, response) {
  request.on('data', function (chunk) {
    if (request.method == 'POST') {
      // change the data from request to uppercase letters and
      // pipe to response.
    }
  });
});
server.listen(8000);
I have two questions about the code above. First, I read the documentation for request; it says that request is an instance of http.IncomingMessage, which implements the Readable Stream interface. However, I couldn't find the .on method in the Stream documentation, so I don't know what chunk in the callback passed to request.on is. Secondly, I want to manipulate the data from request and pipe it to response. Should I pipe from chunk or from request? Thank you for your consideration!
Is chunk a stream?
Nope. The stream is the flow along which the chunks of the whole data are sent.
A simple example: if you read a 1 GB file, a stream might read it in chunks of 10 KB, and each chunk will go through your stream, from beginning to end, in the right order.
I used a file as an example, but sockets, requests, and other streams are based on the same idea.
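As a concrete illustration of that idea, here is a minimal sketch — the file name and the 10 KB chunk size are arbitrary assumptions — showing that fs.createReadStream delivers a large file piece by piece:
var fs = require('fs');

// highWaterMark controls the size of each chunk read into the buffer
var rs = fs.createReadStream('bigfile.bin', { highWaterMark: 10 * 1024 });

rs.on('data', function (chunk) {
  // Each chunk is a Buffer of at most 10 KB, delivered in order.
  console.log('got chunk of', chunk.length, 'bytes');
});
rs.on('end', function () {
  console.log('whole file consumed');
});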
Also, whenever someone sends a request to this server, would that entire thing be one chunk?
In the particular case of HTTP requests, only the request body is a stream; it can be the posted files/data (and likewise for the body of the response). Headers are treated as objects applied to the request before the body is written to the socket.
A small example to help you, with some concrete code:
var through2 = require('through2');
var Readable = require('stream').Readable;

var s1 = through2(function transform(chunk, enc, cb) {
  console.log("s1 chunk %s", chunk.toString());
  cb(null, chunk.toString() + chunk.toString());
});
var s2 = through2(function transform(chunk, enc, cb) {
  console.log("s2 chunk %s", chunk.toString());
  cb(null, chunk);
});

s2.on('data', function (data) {
  console.log("s2 data %s", data.toString());
});
s1.on('end', function () {
  console.log("s1 end");
});
s2.on('end', function () {
  console.log("s2 end");
});

var rs = new Readable();
rs._read = function () {}; // no-op; we push the data manually
rs.push('beep '); // this is a chunk
rs.push('boop');  // this is a chunk
rs.push(null);    // this is the signal to end the stream
rs.on('end', function () {
  console.log("rs end");
});

console.log(
  ".pipe always returns the destination stream: %s", rs.pipe(s1) === s1
);
s1.pipe(s2);
I would suggest some further reading:
https://github.com/substack/stream-handbook
http://maxogden.com/node-streams.html
https://github.com/maxogden/mississippi
All streams are instances of EventEmitter (docs); that is where the .on method comes from.
Regarding the second question, you MUST pipe from the Stream object (request in this case). The "data" event emits data as a Buffer or a String (the "chunk" argument in the event listener), not a stream.
Manipulating streams is usually done by implementing a Transform stream (docs). There are many npm packages that make this process simpler (like through2-map and the like), though under the hood they still produce Transform streams.
Consider the following:
var http = require('http');
var map = require('through2-map');

// Transform stream to uppercase
var uc = map(function (ch) {
  return ch.toString().toUpperCase();
});

var server = http.createServer(function (request, response) {
  // Pipe from the request to our transform stream,
  // then from the transform stream to the response.
  request
    .pipe(uc)
    .pipe(response);
});
server.listen(8000);
You can test by running curl:
$ curl -X POST -d 'foo=bar' http://localhost:8000
# logs FOO=BAR
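Since through2-map just wraps a Transform stream, the same server can be written with only core modules. A sketch of the hand-rolled equivalent:
var http = require('http');
var Transform = require('stream').Transform;

var server = http.createServer(function (request, response) {
  // Hand-rolled equivalent of the through2-map transform above
  var uc = new Transform({
    transform: function (chunk, enc, cb) {
      cb(null, chunk.toString().toUpperCase());
    }
  });
  request.pipe(uc).pipe(response);
});
server.listen(8000);
Creating the transform per request also avoids reusing one stream across connections (see the note at the end of this page about defining streams outside the createServer callback).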

how to get updated data from fs.watch in nodejs when ever there is update in file

I am trying to get updated data from a file whenever there is a change to the file. I am using fs.watch to watch for changes, but how do I get the updated data so that I can parse the CSV to JSON?
Node.js code:
var express = require("express");
var app = express();
var http = require('http');
var fs = require('fs');
var Converter = require("csvtojson").Converter;
var converter = new Converter({constructResult: false});

fs.createReadStream("test.csv").pipe(converter);
// record_parsed will be emitted for each csv row being processed
converter.on("record_parsed", function (jsonObj) {
  console.log(jsonObj); // here is your result json object
});

app.get("/", function (req, res) {
  console.log("listening\n");
  fs.watch('test.csv', function (event, filename) {
    console.log('event is: ' + event);
    // record_parsed will be emitted for each csv row being processed
    converter.on("record_parsed", function (jsonObj) {
      console.log(jsonObj); // here is your result json object
    });
  });
});

app.listen(8180);
console.log("running");
You may consider using tail-stream, or at least peek at its source code.
It extends stream.Readable and uses fs.watch to execute stream.read(0) when a change event occurs. Still, there is a lot of work involved to properly handle EOF, deletion of the file, etc., if you decide to implement it yourself.
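If you want to experiment with the underlying idea yourself, here is a minimal, simplified sketch (it ignores truncation, deletion, and rename edge cases) that remembers a byte offset and re-reads only the appended part of the file on each change event:
var fs = require('fs');

var file = 'test.csv';                // the file to follow
var offset = fs.statSync(file).size;  // start at the current end of file

fs.watch(file, function (event) {
  fs.stat(file, function (err, stats) {
    if (err || stats.size <= offset) return; // nothing new (or file truncated)
    // Read only the newly appended bytes (start/end are inclusive offsets)
    var rs = fs.createReadStream(file, { start: offset, end: stats.size - 1 });
    rs.on('data', function (chunk) {
      console.log('new data:', chunk.toString());
      // e.g. feed this into a csvtojson converter here
    });
    offset = stats.size;
  });
});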

Cannot pipe after data has been emitted from the response nodejs

I've been experiencing a problem with the request library for Node.js. When I try to pipe the response to both a file and a stream, I get the error: "You cannot pipe after data has been emitted from the response." This is because I do some computations before actually piping the data.
Example:
var request = require('request');
var fs = require('fs');
var through2 = require('through2');

var options = {
  url: 'url-to-fetch-a-file'
};
var req = request(options);

req.on('response', function (res) {
  // Some computations, potentially to remove files.
  // These computations take quite some time.
  // createPath is a function that creates the path recursively:
  createPath(path, function () {
    var file = fs.createWriteStream(path + fname);
    var stream = through2.obj(function (chunk, enc, callback) {
      this.push(chunk);
      callback();
    });
    req.pipe(file);
    req.pipe(stream);
  });
});
If I just pipe to the stream without any computations, it works fine. How can I pipe to both a file and a stream using the request module in Node.js?
I found this: Node.js Piping the same readable stream into multiple (writable) targets, but it is not the same thing. There, the piping happens twice, but in a different tick. This example pipes like the answer in that question and still receives an error.
Instead of piping directly to the file, you can add a listener to the stream you defined. So you can replace req.pipe(file) with:
stream.on('data', function (data) {
  file.write(data);
});
stream.on('end', function () {
  file.end();
});
or
stream.pipe(file);
This will pause the stream until it is read, something that doesn't happen with the request module.
More info: https://github.com/request/request/issues/887
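An alternative sketch of the same idea, using only the core PassThrough stream: pipe the response into it immediately (before any data event can fire), do the slow work, and attach the real destinations afterwards. Here createPath, path, and fname are the question's placeholders, and someOtherStream stands in for any second destination:
var request = require('request');
var fs = require('fs');
var PassThrough = require('stream').PassThrough;

var req = request({url: 'url-to-fetch-a-file'});
var buffered = new PassThrough();
req.pipe(buffered); // consume right away, so no data is emitted unpiped

req.on('response', function (res) {
  createPath(path, function () {
    // The PassThrough has been buffering; now attach the destinations.
    buffered.pipe(fs.createWriteStream(path + fname));
    buffered.pipe(someOtherStream); // hypothetical second target
  });
});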

HTTP request stream not firing readable when reading fixed sizes

I am trying to work with the new Streams API in Node.js, but I am having trouble when specifying a fixed read buffer size.
var http = require('http');
var req = http.get('http://143.226.75.100/waug_mp3_128k', function (res) {
  res.on('readable', function () {
    var receiveBuffer = res.read(1024);
    console.log(receiveBuffer.length);
  });
});
This code will receive a few buffers and then exit. However, if I add this line after the console.log() line:
res.read(0);
... all is well again. My program continues to stream as predicted.
Why is this happening? How can I fix it?
It's explained here.
As far as I understand it, by reading only 1024 bytes with each 'readable' event, Node is left to assume that you're not interested in the rest of the data in the stream's buffers and discards it. Issuing the read(0) (in the same event loop iteration) 'resets' this behaviour. I'm not sure why the process exits after reading a couple of 1024-byte buffers, though; I can recreate it, but I don't understand it yet :)
If you don't have a specific reason to use the 1024-byte reads, just read the entire buffer for each event:
var receiveBuffer = res.read();
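If you do need fixed-size reads, the usual non-flowing pattern (a sketch) is to keep calling read(size) until it returns null, so the internal buffer is drained on every 'readable' event:
var http = require('http');
http.get('http://143.226.75.100/waug_mp3_128k', function (res) {
  res.on('readable', function () {
    var chunk;
    // read(1024) returns null when fewer than 1024 bytes are buffered
    // (except at the end of the stream, when the remainder is returned)
    while ((chunk = res.read(1024)) !== null) {
      console.log(chunk.length);
    }
  });
  res.on('end', function () {
    console.log('done');
  });
});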
Or, instead of using non-flowing mode, use flowing mode via the 'data'/'end' events:
var http = require('http');
var req = http.get('http://143.226.75.100/waug_mp3_128k', function (res) {
  var chunks = [];
  res.on('data', function (chunk) {
    chunks.push(chunk);
    console.log('chunk:', chunk.length);
  });
  res.on('end', function () {
    var result = Buffer.concat(chunks);
    console.log('final result:', result.length);
  });
});

Node Streaming, Writing, and Memory

I'm attempting to dynamically concatenate files prior to serving their content. The following very simplified code shows an approach:
var http = require('http');
var fs = require('fs');
var start = '<!doctype html><html lang="en"><head><script>';
var funcsA = fs.readFileSync('functionsA.js', 'utf8');
var funcsB = fs.readFileSync('functionsB.js', 'utf8');
var funcsC = fs.readFileSync('functionsC.js', 'utf8');
var finish = '</script></head><body>some stuff here</body></html>';
var output = start + funcsA + funcsB + funcsC + finish;
http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/html'});
  res.end(output);
}).listen(9000);
In reality, how I concatenate might depend on clues from the userAgent. My markup and scripts could be several hundred kilobytes combined.
I like this approach because there is no file system I/O happening within createServer. But I seem to have read somewhere that this response.write(...) approach is not as efficient/low-overhead as streaming data with fs.createReadStream, and I recall this had something to do with what happens when the client cannot receive data as fast as Node can send it. We seem to be able to create a readable stream from a file system object, but not from memory. Is it possible to do what I have coded above with a streaming approach, with the file I/O happening initially, outside of the createServer function?
Or, on the other hand, are my concerns not that critical, with the approach above offering perhaps no less efficiency than a streaming approach?
Thanks.
res.write(start);

var A = fs.createReadStream('functionsA.js');
var B = fs.createReadStream('functionsB.js');
var C = fs.createReadStream('functionsC.js');

A.pipe(res, { end: false });
A.on('end', function () {
  B.pipe(res, { end: false });
});
B.on('end', function () {
  C.pipe(res, { end: false });
});
C.on('end', function () {
  res.write(finish);
  res.end();
});
Defining streams prior to (and not inside) the createServer callback won't typically work; see here.
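On the question's claim that you cannot create a readable stream from memory: you can, with the core stream.Readable class, which lets the file I/O stay outside createServer while each response is still streamed. A minimal sketch reusing the question's variables (whether this actually beats res.end(output) in practice depends on payload size and load):
var http = require('http');
var fs = require('fs');
var Readable = require('stream').Readable;

var start = '<!doctype html><html lang="en"><head><script>';
var funcs = fs.readFileSync('functionsA.js', 'utf8')
          + fs.readFileSync('functionsB.js', 'utf8')
          + fs.readFileSync('functionsC.js', 'utf8');
var finish = '</script></head><body>some stuff here</body></html>';

http.createServer(function (req, res) {
  // A fresh Readable per request, fed from the in-memory strings
  var rs = new Readable();
  rs._read = function () {}; // no-op; we push the data manually
  res.writeHead(200, {'Content-Type': 'text/html'});
  rs.pipe(res);
  rs.push(start);
  rs.push(funcs);
  rs.push(finish);
  rs.push(null); // end of stream
}).listen(9000);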
