How can I consume a stream of json chunks from http endpoint? - node.js

I have a server that is streaming json objects to an endpoint. Here is a simplified example:
app.get('/getJsonObjects', function (req, res) {
  res.write(JSON.stringify(json1));
  res.write(JSON.stringify(json2));
  res.write(JSON.stringify(json3));
  res.write(JSON.stringify(json4));
  res.write(JSON.stringify(json5));
  res.end();
});
Then client side using browser-request, I'm trying to do:
var r = request(url);
r.on('data', function(data) {
  console.log(JSON.parse(data));
});
The problem is that despite streaming chunks of valid stringified JSON to the endpoint, the chunks I get back from the request are just text chunks that don't necessarily align with the start/end of the JSON chunks the server wrote. This means that JSON.parse(data) will sometimes fail.
What is the best way to stream these chunks of json in the same way that they were written to the endpoint?

This is a streaming/framing problem: the data is written in order, but the chunk boundaries the client sees are not guaranteed to line up one-to-one with the res.write() calls on the server.
You will either have to accumulate the chunks on the client side and work out the object boundaries there (see the sketch at the end of this answer), or use some sort of accumulator on the server end and send the JSON out as one complete, well-delimited response once it has been processed.
Edit:
It appears the response can be sent with Transfer-Encoding: chunked (Node does this by default when no Content-Length is set), and res.write() accepts optional encoding and callback parameters:
https://nodejs.org/api/http.html#http_response_write_chunk_encoding_callback
If that is not enough, you can chain the writes via the callback parameter of res.write() to guarantee the order in which they complete.
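A minimal sketch of the client-side accumulation approach, assuming the server adds a newline after every object (newline-delimited JSON); the delimiter is an assumption on my part and not something the original server code does:
// Server: write one JSON object per line so the client can find boundaries.
app.get('/getJsonObjects', function (req, res) {
  [json1, json2, json3, json4, json5].forEach(function (obj) {
    res.write(JSON.stringify(obj) + '\n');
  });
  res.end();
});
// Client: accumulate text and parse only complete lines.
var buffered = '';
var r = request(url);
r.on('data', function (data) {
  buffered += data;                  // a chunk may start or end mid-object
  var lines = buffered.split('\n');
  buffered = lines.pop();            // keep any trailing partial line
  lines.forEach(function (line) {
    if (line) console.log(JSON.parse(line));
  });
});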

Related

Node.js syntax error: unexpected token in JSON at position 0

I'm trying to get data from the Keepa API for my app.
The status code of my request is 200, but I'm getting the syntax error "unexpected token in JSON at position 0" for every request.
response.on("data", function(data){
const asinData = JSON.parse(data);
console.log(asinData);
res.send();
});
Can you print this "data"? I guess there is an error in it. I think "data" is a "serverResponse" object and "serverResponse.data" is what you actually want to see there, so try to console.log it.
Is the response object coming from the http core module get() method? If so this may be helpful: https://nodejs.org/api/http.html#http_http_get_options_callback.
Basically, the response object you are getting is an http.IncomingMessage instance, which is a readable stream. The data event is triggered on this object not when the response body has been fully received, but every time a small part of it, a chunk, has arrived. You need to aggregate all of these chunks into a single piece of data before attempting to parse it into a JavaScript object.
Also, be aware that the chunks are emitted as Buffers by default, not as strings. You can have the stream emit strings instead by setting the stream encoding before you start reading the chunks.
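A minimal sketch of that aggregation, assuming response is the http.IncomingMessage passed to the http.get() callback:
let body = '';
response.setEncoding('utf8');          // emit strings instead of Buffers
response.on('data', function (chunk) {
  body += chunk;                       // collect every chunk
});
response.on('end', function () {
  const asinData = JSON.parse(body);   // parse only once the full body is in
  console.log(asinData);
});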

node-superagent responseType('blob') vs. buffer(true)

Due to the deprecation of request, we're currently rewriting the request service in our node app with superagent. So far all looks fine; however, we're not quite sure how to request binary data/octet-streams and process the actual response body as a Buffer. According to the docs (on the client side) one should use
superAgentRequest.responseType('blob');
which seems to work fine on NodeJS, but I've also found this github issue where they use
superAgentRequest.buffer(true);
which works just as well. So I'm wondering what the preferred method to request binary data in NodeJS is?
According to superagent's source code, the responseType() method internally sets the buffer flag to true, i.e. the same as setting it manually.
When dealing with binary data/octet-streams, a binary parser is used, which is in fact just a simple buffer accumulator:
module.exports = (res, fn) => {
  const data = []; // Binary data needs binary storage
  res.on('data', chunk => {
    data.push(chunk);
  });
  res.on('end', () => {
    fn(null, Buffer.concat(data));
  });
};
In both cases this parser is used, which explains the behaviour. So you can go with either of the mentioned methods to deal with binary data/octet-streams.
As per documentation https://visionmedia.github.io/superagent/
SuperAgent will parse known response-body data for you, currently supporting application/x-www-form-urlencoded, application/json, and multipart/form-data. You can setup automatic parsing for other response-body data as well:
You can set a custom parser (that takes precedence over built-in parsers) with the .buffer(true).parse(fn) method. If response buffering is not enabled (.buffer(false)) then the response event will be emitted without waiting for the body parser to finish, so response.body won't be available.
So to parse other response types you will need to set .buffer(true).parse(fn), but if you do not want to parse the response, there is no need to set buffer(true).
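For example, a custom binary parser can be attached like this; the URL is a placeholder and the parser simply mirrors the buffering one shown above:
const superagent = require('superagent');

superagent
  .get('https://example.com/file.bin')   // placeholder URL
  .buffer(true)                          // buffer the response body
  .parse((res, done) => {                // custom parser receives the raw stream
    const data = [];
    res.on('data', chunk => data.push(chunk));
    res.on('end', () => done(null, Buffer.concat(data)));
  })
  .then(response => {
    console.log(Buffer.isBuffer(response.body)); // true
  });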

Any performance concerns with sending a buffer array in express JSON response?

My nodejs server consumes data from a nodejs JSON API. Some endpoints on the API return image data like so:
let buffer = await getImageBuffer();
res.set('content-type', 'image/png');
res.end(buffer);
Which works great. However, for a number of complexity reasons, I'd love to include a buffer array in a JSON response instead, like so:
let buffer = await getBuffer();
res.json({
  contentType: 'image/png',
  buffer
});
Are there any performance issues with including a buffer array in a JSON response like that? Is there any inherent performance benefit to using res.end(buffer) instead? The consuming server is also running Node.js and will naturally JSON.parse() the response from the API.
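For reference, a quick sketch of what res.json() actually produces for a Buffer: JSON.stringify() calls the buffer's toJSON() method, which turns every byte into a decimal array entry, so the JSON body is considerably larger than the raw binary. The base64 alternative at the end is my suggestion, not something from the question:
const buffer = Buffer.from('PNG', 'utf8');  // placeholder data

// res.json() runs JSON.stringify(), which calls buffer.toJSON():
console.log(JSON.stringify({ contentType: 'image/png', buffer }));
// {"contentType":"image/png","buffer":{"type":"Buffer","data":[80,78,71]}}

// A base64 string is a more compact, JSON-friendly encoding:
console.log(JSON.stringify({ contentType: 'image/png', data: buffer.toString('base64') }));
// {"contentType":"image/png","data":"UE5H"}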

Why can't we do multiple response.send in Express.js?

Three years ago I could call res.send multiple times in Express.js, and even use a setTimeout to show live output.
response.send('<script class="jsbin" src="http://code.jquery.com/jquery-1.7.1.min.js"></script>');
response.send('<html><body><input id="text_box" /><button>submit</button></body></html>');
var initJs = function() {
  $('.button').click(function() {
    $.post('/input', { input: $('#text_box').val() }, function() { alert('has send'); });
  });
}
response.send('<script>' + initJs + '</script>');
Now it will throw:
Error: Can't set headers after they are sent
I know Node.js and Express have been updated. Why can't I do that now? Is there another way?
I found the solution, but res.write is not in the API reference: http://expressjs.com/4x/api.html
Maybe you need: response.write
response.write("foo");
response.write("bar");
//...
response.end()
res.send implicitly calls res.write followed by res.end. If you call res.send multiple times, it will work the first time. However, since the first res.send call ends the response, you cannot add anything to the response.
response.send sends an entire HTTP response to the client, including headers and content, which is why you are unable to call it multiple times. In fact, it even ends the response, so there is no need to call response.end explicitly when using response.send.
It appears to me that you are attempting to use send like a buffer: writing to it with the intention to flush later. This is not how the method works, however; you need to build up your response in code and then make a single send call.
Unfortunately, I cannot speak to why or when this change was made, but I know that it has been like this at least since Express 3.
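A minimal sketch of that approach, building the markup from the question into one string before a single send (the route path is a placeholder):
app.get('/', function (req, res) {
  var body = '';
  body += '<script class="jsbin" src="http://code.jquery.com/jquery-1.7.1.min.js"></script>';
  body += '<html><body><input id="text_box" /><button>submit</button></body></html>';
  res.send(body);  // headers and the complete body go out together, exactly once
});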
res.write immediately sends bytes to the client
I just wanted to make this point about res.write clearer.
It does not build up the reply and wait for res.end(). It just sends right away.
This means that the first time you call it, it will send the HTTP reply headers including the status in order to have a meaningful response. So if you want to set a status or custom header, you have to do it before that first call, much like with send().
Note that write() is not what you usually want to do in a simple web application. Having the browser receive the reply little by little increases complexity, so you will only want to do it if it is really needed.
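A small sketch of the ordering constraint, assuming an Express route (the path is a placeholder):
app.get('/stream', function (req, res) {
  res.status(200).set('Content-Type', 'text/plain'); // must happen before the first write()
  res.write('first piece\n');   // the status line and headers are flushed here
  res.write('second piece\n');  // subsequent writes only append body bytes
  res.end();
});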
Use res.locals to build the reply across middleware
This was my original use case, and res.locals fits well. I can just store data in an Array there, and then on the very last middleware join them up and do a final send to send everything at once, something like:
app.use(
  async (err, req, res, next) => {
    res.locals.msg = ['Custom handler']
    next(err)
  },
  async (err, req, res, next) => {
    res.locals.msg.push('Custom handler 2')
    res.status(500).send(res.locals.msg.join('\n'))
  }
)

HTTP - how to send multiple pre-cached gzipped chunks?

Let's say I have 2 individually gzipped HTML chunks in memory.
Can I send chunk1+chunk2 to an HTTP client? Does any browser support this?
Or is there no way to do this, and I have to gzip the whole stream rather than individual chunks?
I want to serve clients, for example, chunk1+chunk2 or chunk2+chunk1 (different orders), but I don't want to compress the whole page every time and I don't want to cache the whole page. I want to use precompressed cached chunks and send them.
nodejs code (node v0.10.7):
// creating pre cached data buffers
var zlib = require('zlib');
var chunk1, chunk2;
zlib.gzip(new Buffer('test1'), function(err, data){
  chunk1 = data;
});
zlib.gzip(new Buffer('test2'), function(err, data){
  chunk2 = data;
});
var http = require('http');
http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain', 'Content-Encoding': 'gzip'});
  // writing two pre gziped buffers
  res.write(chunk1); // if I send only this one everything is OK
  res.write(chunk2); // if I send two chunks Chrome trying to download file
  res.end();
}).listen(8080);
When my example server returns this kind of response, the Chrome browser displays a download window (it doesn't understand it :/).
I haven't tried it, but if the http clients are compliant with RFC 1952, then they should accept concatenated gzip streams, and decompress them with the same result as if the data were all compressed into one stream. The HTTP 1.1 standard in RFC 2616 does in fact refer to RFC 1952.
If by "chunks" you are referring to chunked transfer encoding, then that is independent of the compression. If the clients do accept concatenated streams, then there is no reason for chunked transfer encoded boundaries to have to align with the gzip streams within.
As to how to do it, simply gzip your pieces and directly concatenate them. No other formatting or preparation is required.
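A minimal sketch of that, assuming a newer Node than the v0.10.7 in the question so zlib.gzipSync is available:
var http = require('http');
var zlib = require('zlib');

// Each piece is compressed once into a complete gzip member.
var chunk1 = zlib.gzipSync(Buffer.from('test1'));
var chunk2 = zlib.gzipSync(Buffer.from('test2'));

http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain', 'Content-Encoding': 'gzip'});
  // Per RFC 1952 a compliant client treats back-to-back gzip members as one
  // stream, so the members can be served in any order without recompressing.
  res.end(Buffer.concat([chunk1, chunk2])); // or [chunk2, chunk1]
}).listen(8080);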
