POST request hangs (timeout) when trying to parse request body, running Koa on Firebase Cloud Functions - node.js

I'm working on a small website, serving static files using Firebase Hosting (FH) and rewriting all requests to a single function on Firebase Cloud Functions (FCF), where I'm using Koa (with koa-router) to handle the requests. However, when I try to parse the body of a POST request using koa-bodyparser, the service just hangs until it eventually times out.
The same thing occurs with other body parsers, such as koa-body, and it persists no matter where I put the parser. The only exception is putting it after the router, in which case the problem goes away, but then I still can't access the data, presumably because the body never gets a chance to be parsed.
The following is a stripped-down version of the code that's causing the problem:
import * as functions from 'firebase-functions'
import * as Koa from 'koa'
import * as KoaRouter from 'koa-router'
import * as KoaBodyParser from 'koa-bodyparser'

const app = new Koa()
const router = new KoaRouter()

app.use(KoaBodyParser())

router.post('/', (context) => {
  // do some stuff with the data
})

app.use(router.routes())

export const serve = functions.https.onRequest(app.callback())
I'm still pretty new to all of these tools and I might be missing something completely obvious, but I can't seem to find the solution anywhere. If I'm not mistaken, FCF automatically parses requests, but Koa is unable to access that data unless it does the parsing itself, so I'd assume that something is going wrong between FCF's automatic parsing and the parser used by Koa.
I haven't been able to produce any actual errors or useful error messages, other than a Gateway Timeout (504), so I don't have much to go on and won't be able to provide you with much more than I already have.
How do I go about getting a hold of the data?

Firebase already parses the body.
https://firebase.google.com/docs/functions/http-events#read_values_from_the_request
It appears that the available Koa body-parsing middlewares don't know what to do with an "already parsed" body (i.e. an object rather than an unparsed string), so the middleware ends up waiting for body data that has already been consumed and never finishes.
A solution is to read ctx.req.body (the body on the raw Node request), because Cloud Functions has already parsed it. :)
Koa rocks!
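For reference, here is a minimal sketch of reading the pre-parsed body inside the route handler, assuming a JSON POST body; ctx.req is the raw Node request that Cloud Functions has already populated, and the cast is only there because the stock typings don't know about the extra body property:

import * as functions from 'firebase-functions'
import * as Koa from 'koa'
import * as KoaRouter from 'koa-router'

const app = new Koa()
const router = new KoaRouter()

// No body parser middleware: Cloud Functions has already parsed the body
// onto the raw request before Koa ever sees it.
router.post('/', (context) => {
  const data = (context.req as any).body // parsed by Cloud Functions, not by Koa
  context.body = { received: data }
})

app.use(router.routes())

export const serve = functions.https.onRequest(app.callback())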

Related

How to download files with node-fetch

I need help implementing a file downloader in Node.js.
I need to download over 25,000 files from a server. I'm using node-fetch, but I don't exactly know how to implement this. I tried using Promise.allSettled(), but I also need a way to limit the number of concurrent requests to the server, otherwise I get rate-limited.
This is my code so far:
const fetch = require('node-fetch')

async function main () {
  const urls = [
    'https://www.example.com/foo.png',
    'https://www.example.com/bar.gif',
    'https://www.example.com/baz.jpg',
    // ... many more (~25k)
  ]

  // how to save each file on the machine with same file name and extension?
  // how to limit the amount of concurrent requests to the server?
  const files = await Promise.allSettled(
    urls.map((url) => fetch(url))
  )
}

main()
So my questions are:
How do I limit the number of concurrent requests to the server? Can this be solved using a custom https agent with node-fetch and setting maxSockets to something like 10?
How do I check whether the file exists on the server and, if it does, download it to my machine with the same file name and extension?
It would be very helpful if someone could show a small code example of how to implement such functionality.
Thanks in advance.
To control how many simultaneous requests are running at once, you can use any of these three options (a minimal sketch of the general pattern follows the list):
mapConcurrent() here and pMap() here: These let you iterate an array, sending requests to a host, but manage things so that you only ever have N requests in flight at the same time, where you decide the value of N.
rateLimitMap() here: Lets you manage how many requests per second are sent.
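Here is a minimal sketch of that pool idea, not the actual mapConcurrent()/pMap() implementation; the limitConcurrency name and the limit of 10 in the usage line are just for illustration:

// Run an async worker over every item, with at most `limit` requests in flight at once.
async function limitConcurrency<T, R> (
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>
): Promise<PromiseSettledResult<R>[]> {
  const results: Promise<R>[] = []
  let index = 0

  async function runner (): Promise<void> {
    while (index < items.length) {
      const i = index++
      results[i] = worker(items[i])
      try { await results[i] } catch { /* recorded as rejected by allSettled below */ }
    }
  }

  // Start `limit` runners that each keep pulling the next unprocessed item.
  await Promise.all(Array.from({ length: Math.min(limit, items.length) }, () => runner()))
  return Promise.allSettled(results)
}

// Usage: const files = await limitConcurrency(urls, 10, (url) => fetch(url))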
Can this be solved using a custom https agent with node-fetch and setting the maxSockets to something like 10?
I'm not aware of any solution using a custom https agent.
How do I check whether the file exists on the server and, if it does, download it to my machine with the same file name and extension?
You can't directly access a remote HTTP server's file system. So, all you can do is make an HTTP request for a specific resource (a URL) and examine the HTTP response to see whether it returned data or some sort of HTTP error such as a 404.
As for filenames and extensions, that depends entirely on whether you already know what to request and the server supports that being part of the URL, or whether the server returns that information to you in an HTTP header. If you're requesting a specific filename and extension, then you can just create a file with that name and extension and save the HTTP response data to it on your local drive.
As for coding examples, the doc for node-fetch() shows examples of downloading data to a file using streams here: https://www.npmjs.com/package/node-fetch#streams.
import {createWriteStream} from 'fs';
import {pipeline} from 'stream';
import {promisify} from 'util';
import fetch from 'node-fetch';

const streamPipeline = promisify(pipeline);

const url = 'https://github.githubassets.com/images/modules/logos_page/Octocat.png';
const response = await fetch(url);

if (!response.ok) throw new Error(`unexpected response ${response.statusText}`);

await streamPipeline(response.body, createWriteStream('./octocat.png'));
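Building on that stream example, here is a minimal sketch that also treats a 404 as "file does not exist" and reuses the file name and extension from the URL, as discussed above. It assumes node-fetch 2.x (where response.body is a Node stream) and that each URL already ends in the desired file name; downloadIfPresent and the use of path.basename are illustrative:

import {createWriteStream} from 'fs';
import {pipeline} from 'stream';
import {promisify} from 'util';
import {basename} from 'path';
import fetch from 'node-fetch';

const streamPipeline = promisify(pipeline);

async function downloadIfPresent (url: string): Promise<boolean> {
  const response = await fetch(url);
  if (response.status === 404) return false; // the resource does not exist on the server
  if (!response.ok) throw new Error(`unexpected response ${response.statusText}`);

  // Keep the same file name and extension as in the URL, e.g. ".../foo.png" -> "foo.png".
  const filename = basename(new URL(url).pathname);
  await streamPipeline(response.body, createWriteStream(`./${filename}`));
  return true;
}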
Personally, I wouldn't use node-fetch, as its design center is to mimic the browser's fetch() implementation in Node, which is not as friendly an API design as similar libraries built explicitly for Node.js. I use got(), and there are several other good libraries listed here. You can pick your favorite.
Here's a code example using the got() library:
import {promisify} from 'node:util';
import stream from 'node:stream';
import fs from 'node:fs';
import got from 'got';

const pipeline = promisify(stream.pipeline);

await pipeline(
  got.stream('https://sindresorhus.com'),
  fs.createWriteStream('index.html')
);

Streaming response body to file in typescript: Property 'pipe' does not exist on type 'ReadableStream<Uint8Array>'

I am able to fetch a binary body from an API and write it to a file in Node.
const fileStream = fs.createWriteStream(filePath);
fetch(apiURL).then((downloadResponse) => {
  downloadResponse.body.pipe(fileStream);
});
However, when doing so I get a linting error of:
Property 'pipe' does not exist on type 'ReadableStream<Uint8Array>'
It seems weird to me that the call would work when the linter gives an error.
I even initially thought my logic was wrong and wasted time debugging this working call...
Is my TypeScript version misidentifying the type for some reason, or should I not be able to perform this call?
I am barely beginning with typescript but on occasion I run into such idiosyncrasies that slow me down when I am doing something that would seem perfectly valid.
The answer provided in the comments was indeed correct, but the context in which this wasn't straightforward was a Next.js app using fetch from a server-side rendering function such as getServerSideProps.
On the client side it is pretty straightforward that the standard fetch API is used; however, on SSR it wasn't evident that one had to additionally install the types for node-fetch using npm i @types/node-fetch.
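As a minimal sketch of why that helps, assuming node-fetch 2.x with the separate @types/node-fetch typings: with those types, response.body is a Node.js stream rather than the DOM ReadableStream<Uint8Array>, so .pipe() exists on it and the original code type-checks:

import fetch from 'node-fetch';
import {createWriteStream} from 'fs';

async function download (apiURL: string, filePath: string): Promise<void> {
  const fileStream = createWriteStream(filePath);
  const downloadResponse = await fetch(apiURL);
  // With @types/node-fetch, body is typed as NodeJS.ReadableStream, so pipe() type-checks.
  downloadResponse.body.pipe(fileStream);
}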

At what point are request and response objects populated in express app

I'm always coding backend APIs and I don't really get how Express does its bidding with my code. I know what the request and response objects offer, I just don't understand how they come to be.
This simplified code for instance:
exports.getBlurts = function () {
  return function (req, res) {
    // build query…
    qry.exec(function (err, results) {
      res.json(results);
    });
  };
};
Then I’d call in one of my routes:
app.get('/getblurts/', middleware.requireUser, routes.api.blurtapi.getBlurts());
I get that the function is called upon the route request. It's very abstract to me, though, and I don't understand the when, where, or how as it pertains to the req/res params being injected.
For instance. I use a CMS that modifies the request object by adding a user property, which is then available globally on all requests made whether ajax or otherwise, making it easy at all times to determine if a user is logged in.
Are the req and res objects just pre-cooked by Express but left open for you to modify to your needs? When are they actually 'built'?
At its heart, Express uses Node's built-in http module and passes the Express application as the callback to the http.createServer function. The request and response objects are populated at that point, i.e. created by Node itself for every incoming connection. See the Node.js documentation for more details about the http module and what req/res are.
You might want to check out Express' source code, which shows how the Express application is passed as a callback to http.createServer.
https://github.com/expressjs/express/blob/master/lib/request.js and https://github.com/expressjs/express/blob/master/lib/response.js show how Node's request/response objects are extended with Express-specific functions.
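A minimal sketch of that relationship (the route and port below are only illustrative): the Express app is itself a request handler, so it can be handed straight to http.createServer, which creates req and res for every incoming connection before Express extends them and runs your middleware and routes:

import http from 'http';
import express from 'express';

const app = express();

app.get('/getblurts', (req, res) => {
  // req and res were created by Node's http server for this connection and
  // extended by Express (e.g. res.json) before reaching this handler.
  res.json({ ok: true });
});

// app.listen(3000) does essentially this: hand the app to http.createServer.
http.createServer(app).listen(3000);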

ExpressJS Middleware bodyParser has very poor performance

A few days ago we added New Relic APM to our REST API, which is written in Node.js and uses Express.js as the web framework.
Now we see that a lot of users experience poor response times because of the JSON parser middleware.
Here is one of those requests reported in New Relic (screenshot and drilled-down report omitted): most of the time is consumed by the JSON parser middleware.
We were thinking that maybe the issue comes from large JSON payloads, which are sometimes sent from the API. But for the response illustrated above, the API returned data with contentLength=598, which shouldn't be a huge JSON payload.
We also use the compression middleware, as visible in the drilled-down request execution screenshot, which should reduce the size of the I/O sent back and forth to clients.
At this moment we suspect the limit parameter passed to the middleware when it is initialized ({ limit: '50kb' }), but when testing locally it doesn't make any difference.
We were thinking of switching to protobuf, and also about a way to parse JSON payloads asynchronously, because JSON.parse, which is used by the middleware, is synchronous and blocks the event loop.
But before starting those changes and experiments, if anyone has had the same kind of problem, please suggest any possible solution.
Benchmarking:
For benchmarking on local/stage environments we use JMeter and generate load to check when such timeouts may happen, but we are not able to reproduce this when testing with JMeter.
Thank You.
Express comes with an embedded body parser, and you can try it if you want. It should perform better since it's integrated.
const express = require('express');
const app = express();

app.use(express.urlencoded()); // parse URL-encoded form bodies
app.use(express.json());       // parse JSON bodies

// If you want to add a request size limit:
app.use(express.json({ limit: "1mb" }));
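A small usage sketch with the built-in parser, assuming a JSON POST endpoint (the /items route is only illustrative):

app.post('/items', (req, res) => {
  // req.body has already been parsed by express.json() at this point.
  res.status(201).json({ received: req.body });
});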

Possible to use http-parser-js in an Electron app?

I need to make an HTTP request to a service that returns malformed headers that the native Node.js parser can't handle. In a test script, I've found that I can use the http-parser-js library to make the same request and it handles the bad headers gracefully.
Now I need to make that work within the Electron app that needs to actually make the call and retrieve the data and it's failing with the same HPE_INVALID_HEADER_TOKEN. I assume, for that reason, that the native HTTP parser is not getting overridden.
In my electron app, I have the same code I used in my test script:
process.binding('http_parser').HTTPParser = require('http-parser-js').HTTPParser;
var http = require('http');
var req = http.request( ... )
Is there an alternate process binding syntax I can use within Electron?
This was not an Electron issue. My app makes several different requests, and most of them are to services that return proper headers. Originally, I was using the request-promise library to handle all calls, but I needed to modify the one call that returned bad headers.
The problem was that I was still using request-promise for the other calls, and that library conflicts with the custom code I had to write to deal with the malformed headers. Once I modified my custom code to handle all requests, things worked much more smoothly.
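A minimal sketch of that arrangement, under the assumption that the parser override runs once before anything else loads the http module, and that every request then goes through the same plain-http helper instead of mixing in request-promise (the fetchText helper is illustrative, not the actual app code):

// Must run before any module that uses http is loaded.
process.binding('http_parser').HTTPParser = require('http-parser-js').HTTPParser;

const http = require('http');

function fetchText (url: string): Promise<string> {
  return new Promise((resolve, reject) => {
    http.get(url, (res) => {
      let data = '';
      res.on('data', (chunk) => { data += chunk; });
      res.on('end', () => resolve(data));
    }).on('error', reject);
  });
}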
