Extend node-fetch default timeout - node.js

I am using node-fetch to perform a GET request:
const fetch = require("node-fetch");

try {
  const response = await fetch(`https://someurl/?id=${id}`);
} catch (error) {
  console.error(error);
}
The API takes a long time to return the response, around 15 minutes. With Postman everything works fine, but with node-fetch I get a timeout error after 600000 ms, so I assume node-fetch has a default timeout. I found some ways to change the default timeout, but if I understand the code correctly, it will make the timeout shorter, not extend it. Any advice on that?
import * as fetch from "node-fetch"

export default function (url: any, options: any, timeout = 5000) {
  return Promise.race([
    fetch(url, options),
    new Promise((_, reject) => setTimeout(() => reject("timeout"), timeout)),
  ])
}
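A Promise.race wrapper like this can only shorten the effective timeout; the losing fetch keeps running and cannot be extended by it. node-fetch v2 accepts a timeout option (in milliseconds, 0 to disable) and supports an AbortSignal, so the limit can be raised explicitly. A minimal sketch, assuming node-fetch v2 and the abort-controller polyfill for older Node versions; the 20-minute cap is an arbitrary example:
const fetch = require("node-fetch");
const AbortController = require("abort-controller"); // polyfill for Node < 15

async function fetchWithTimeout(url, options = {}, timeoutMs = 20 * 60 * 1000) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    // node-fetch v2 also accepts a `timeout` option directly; 0 disables it
    return await fetch(url, { ...options, signal: controller.signal, timeout: 0 });
  } finally {
    clearTimeout(timer);
  }
}
If the request still fails at exactly 600000 ms after this, the limit is likely imposed by the server or an intermediary rather than by node-fetch, and no client-side setting will extend it.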

Related

puppeteer - how to intercept requests and responses only from a certain url in nodejs [duplicate]

Using Puppeteer, I'd like to load a URL in Chrome and capture the following information:
request URL
request headers
request post data
response headers text (including duplicate headers like set-cookie)
transferred response size (i.e. compressed size)
full response body
Capturing the full response body is what causes the problems for me.
Things I've tried:
Getting response content with response.buffer - this does not work if there are redirects at any point, since buffers are wiped on navigation
Intercepting requests and using getResponseBodyForInterception - this means I can no longer access the encodedLength, and I also had problems getting the correct request and response headers in some cases
Using a local proxy works, but this slowed down page load times significantly (and also changed some behavior, e.g. for certificate errors)
Ideally the solution should only have a minor performance impact and have no functional differences from loading a page normally. I would also like to avoid forking Chrome.
You can enable request interception with page.setRequestInterception() for each request, and then, inside page.on('request'), use the request-promise-native module as a middle man to gather the response data before continuing the request with request.continue() in Puppeteer.
Here's a full working example:
'use strict';

const puppeteer = require('puppeteer');
const request_client = require('request-promise-native');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  const result = [];

  await page.setRequestInterception(true);
  page.on('request', request => {
    request_client({
      uri: request.url(),
      resolveWithFullResponse: true,
    }).then(response => {
      const request_url = request.url();
      const request_headers = request.headers();
      const request_post_data = request.postData();
      const response_headers = response.headers;
      const response_size = response_headers['content-length'];
      const response_body = response.body;

      result.push({
        request_url,
        request_headers,
        request_post_data,
        response_headers,
        response_size,
        response_body,
      });

      console.log(result);
      request.continue();
    }).catch(error => {
      console.error(error);
      request.abort();
    });
  });

  await page.goto('https://example.com/', {
    waitUntil: 'networkidle0',
  });

  await browser.close();
})();
Puppeteer-only solution
This can be done with Puppeteer alone. The problem you are describing, that response.buffer is cleared on navigation, can be circumvented by processing each request one after another.
How it works
The code below uses page.setRequestInterception to intercept all requests. If a request is currently being processed (or waited for), new requests are put into a queue. Then, response.buffer() can be used without the risk that other requests asynchronously wipe the buffer, as there are no parallel requests. As soon as the currently processed request/response is handled, the next request is processed.
Code
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const [page] = await browser.pages();

  const results = []; // collects all results

  let paused = false;
  let pausedRequests = [];

  const nextRequest = () => { // continue the next request or "unpause"
    if (pausedRequests.length === 0) {
      paused = false;
    } else {
      // continue first request in "queue"
      (pausedRequests.shift())(); // calls the request.continue function
    }
  };

  await page.setRequestInterception(true);
  page.on('request', request => {
    if (paused) {
      pausedRequests.push(() => request.continue());
    } else {
      paused = true; // pause, as we are processing a request now
      request.continue();
    }
  });

  page.on('requestfinished', async (request) => {
    const response = await request.response();

    const responseHeaders = response.headers();
    let responseBody;
    if (request.redirectChain().length === 0) {
      // body can only be accessed for non-redirect responses
      responseBody = await response.buffer();
    }

    const information = {
      url: request.url(),
      requestHeaders: request.headers(),
      requestPostData: request.postData(),
      responseHeaders: responseHeaders,
      responseSize: responseHeaders['content-length'],
      responseBody,
    };
    results.push(information);

    nextRequest(); // continue with next request
  });
  page.on('requestfailed', (request) => {
    // handle failed request
    nextRequest();
  });

  await page.goto('...', { waitUntil: 'networkidle0' });
  console.log(results);

  await browser.close();
})();
I would suggest searching for a quick proxy server that can write request logs together with the actual content.
The target setup is to let the proxy server simply write a log file, and then analyze the log, searching for the information you need.
Don't intercept requests while the proxy is working (this will slow things down).
The performance issues you may encounter (with the proxy-as-logger setup) are mostly related to TLS support; pay attention to allowing a quick TLS handshake and the HTTP/2 protocol in the proxy setup.
E.g. Squid benchmarks show that it is able to process hundreds of requests per second, which should be enough for testing purposes.
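If you go this route, Puppeteer can point Chromium at the proxy via the --proxy-server flag. A minimal sketch, assuming a logging proxy (e.g. mitmproxy or Squid) is already listening on 127.0.0.1:8080; the address is an assumption, adjust it to your setup:
const puppeteer = require('puppeteer');

(async () => {
  // Route all browser traffic through the local logging proxy.
  const browser = await puppeteer.launch({
    args: ['--proxy-server=127.0.0.1:8080'],
  });
  const page = await browser.newPage();
  await page.goto('https://example.com/');
  await browser.close();
  // The request/response details are then read from the proxy's logs.
})();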
I would suggest using a tool called Fiddler. It will capture all the information you mentioned when you load a URL.
Here's my workaround, which I hope will help others.
I had issues with the await page.setRequestInterception(True) command blocking the flow and making the page hang until timeout.
So I added this function:
async def request_interception(req):
    """ await page.setRequestInterception(True) would block the flow, the interception is enabled individually """
    # enable interception
    req.__setattr__('_allowInterception', True)
    if req.url.startswith('http'):
        print(f"\nreq.url: {req.url}")
        print(f" req.resourceType: {req.resourceType}")
        print(f" req.method: {req.method}")
        print(f" req.postData: {req.postData}")
        print(f" req.headers: {req.headers}")
        print(f" req.response: {req.response}")
    return await req.continue_()
I removed the await page.setRequestInterception(True) call and instead registered the function above with
page.on('request', lambda req: asyncio.ensure_future(request_interception(req)))
in my main().
Without the req.__setattr__('_allowInterception', True) statement, Pyppeteer would complain that interception is not enabled for some requests, but with it everything works fine for me.
In case anyone is interested, this is the system I'm running Pyppeteer on:
Ubuntu 20.04
Python 3.7 (venv)
...
pyee==8.1.0
pyppeteer==0.2.5
python-dateutil==2.8.1
requests==2.25.1
urllib3==1.26.3
websockets==8.1
...
I also posted the solution at https://github.com/pyppeteer/pyppeteer/issues/198
Cheers
Go to Chrome, press F12, then go to the "Network" tab. There you can see all the HTTP requests the website sends, and you'll be able to see the details you mentioned.

Is there a way to make Axios return the data as the default response?

When we use Axios we always have to get the data from the response, like this:
const response = await Axios.get('/url')
const data = response.data
Is there a way to make Axios return the data directly? Like this:
const data = await Axios.get('/url')
We never used anything besides the data from the response.
You can use ES6 destructuring, like this:
const { data } = await Axios.get('/url');
That way you won't have to write another line of code.
Add a response interceptor:
axios.interceptors.response.use(function (response) {
  // Any status code within the range of 2xx causes this function to trigger
  // Do something with response data
  return response.data; // do like this
}, function (error) {
  // Any status codes outside the range of 2xx cause this function to trigger
  // Do something with response error
  return Promise.reject(error);
});
What I normally do is create a js file called interceptors.js:
import axios from 'axios';

export function registerInterceptors() {
  axios.interceptors.response.use(
    function (response) {
      // Any status code within the range of 2xx causes this function to trigger
      // Do something with response data
      return response.data;
    },
    function (error) {
      // Any status codes outside the range of 2xx cause this function to trigger
      // Do something with response error
      return Promise.reject(error);
    }
  );
}
In ./src/index.js:
import { registerInterceptors } from './path/to/interceptors';

registerInterceptors(); // this will register the interceptors.
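With the interceptors registered, every axios call in the app resolves to the body directly. A minimal usage sketch; '/url' is a placeholder endpoint:
import axios from 'axios';

async function loadData() {
  // resolves to response.data thanks to the registered interceptor
  const data = await axios.get('/url');
  console.log(data);
}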
As a best practice, don't use axios directly all over the app; if you ever want to migrate to a different HTTP provider, you would have to change every place that uses it.
Create a wrapper around axios and use that wrapper in your app.
For example, create a js file called http.js:
import axios from 'axios';

const execute = ({ url, method, params, data }) => {
  return axios({
    url,
    method, // GET or POST
    data,
    params,
  });
};

const get = (url, params) => {
  return execute({
    url, method: 'GET', params
  });
};

const post = (url, data) => {
  return execute({
    url, method: 'POST', data
  });
};

export default {
  get,
  post,
};
and use it like
import http from './http';
....
http.get('url', {a:1, b:2})
Now you can customize HTTP behavior across the whole app, and even changing the HTTP provider becomes simple.

Why is Express res.json() holding up the Google Sheets API for 30+ seconds

I am building an app that looks up a word definition in a Google sheet from a Slack slash command. The app is hosted in Google Cloud Functions and written in Node.js.
To meet the 3000 ms time limit on Slack slash commands, the app:
1. posts an immediate 200 OK response, then
2. does the lookup in Sheets, and finally
3. returns the full reply via Slack's response_url as defined in the Slack documentation.
So far, so good. But here's the kicker:
When I call res.json() in my main function glossary, Slack gets an initial reply, but sendMessageToSlackResponseURL() isn't called for another 10-40 seconds. I eventually get the reply in Slack as expected, albeit painfully slowly.
I have narrowed it down (with an embarrassing number of console.log() calls) to the line:
const reply = (await sheets.spreadsheets.values.get(request)).data.values;
This command takes 2-3 seconds to run if res.json() is not called prior, barely making the Slack time limit. But if res.json() is called prior, the command takes up to 40 seconds.
How is the Google Sheets API call affected by a prior res.json() call? What am I missing?
// Simplified code pasted below:
exports.glossary = async (req, res) => {
  // Give immediate response to prevent 3000ms Slack timeout.
  res.json(initSlackResponse(req.body.text)); // Commenting out this line speeds up the app

  // Get glossary result from Google Sheet
  let response = await getGlossaryResults(query);

  // Return late response
  await sendMessageToSlackResponseURL(req.body.response_url, response);
  return Promise.resolve();
};

const getGlossaryResults = async (query) => {
  const content = await readFile(CREDENTIALS_PATH);
  let oAuth2Client = await authorize(JSON.parse(content));
  const request = {
    spreadsheetId: spreadsheetId,
    range: range,
    auth: oAuth2Client
  };

  // The following command takes 10-40 seconds to run if res.json(initSlackResponse(query)); has been called.
  // If res.json() is *not* called, the command takes 2-3 seconds.
  const reply = (await sheets.spreadsheets.values.get(request)).data.values;

  // *Generate the results here
  return results;
};

function sendMessageToSlackResponseURL(responseURL, JSONmessage) {
  let postOptions = {
    uri: responseURL,
    method: 'POST',
    headers: {
      'Content-type': 'application/json'
    },
    json: JSONmessage
  };
  request(postOptions, (error) => {
    if (error) {
      console.error(error);
    }
  });
  return Promise.resolve();
}

const initSlackResponse = (query) => {
  return {
    // *Build simple json object here
  };
};

How can I change the result status in Axios with an adapter?

The why
We're using the axios-retry library, which uses this code internally:
axios.interceptors.response.use(null, error => {
Since it only specifies the error callback, the Axios documentation says:
Any status codes that falls outside the range of 2xx cause this function to trigger
Unfortunately we're calling a non-RESTful API that can return 200 with an error code in the body, and we need to retry that.
We've tried adding an Axios interceptor before axios-retry does and changing the result status in this case; that did not trigger the subsequent interceptor error callback though.
What did work was specifying a custom adapter. However this is not well-documented and our code does not handle every case.
The code
const axios = require('axios');
const httpAdapter = require('axios/lib/adapters/http');
const settle = require('axios/lib/core/settle');
const axiosRetry = require('axios-retry');

const myAdapter = async function(config) {
  return new Promise((resolve, reject) => {
    // Delegate to default http adapter
    return httpAdapter(config).then(result => {
      // We would have more logic here in the production code
      if (result.status === 200) result.status = 500;
      settle(resolve, reject, result);
      return result;
    });
  });
}
const axios2 = axios.create({
  adapter: myAdapter
});

function isErr(error) {
  console.log('retry checking response', error.response.status);
  return !error.response || (error.response.status === 500);
}

axiosRetry(axios2, {
  retries: 3,
  retryCondition: isErr
});

// httpstat.us can return various status codes for testing
axios2.get('http://httpstat.us/200')
  .then(result => {
    console.log('Result:', result.data);
  })
  .catch(e => console.error('Service returned', e.message));
This works in the error case, printing:
retry checking response 500
retry checking response 500
retry checking response 500
retry checking response 500
Service returned Request failed with status code 500
It works in the success case too (change the URL to http://httpstat.us/201):
Result: { code: 201, description: 'Created' }
The issue
Changing the URL to http://httpstat.us/404, though, results in:
(node:19759) UnhandledPromiseRejectionWarning: Error: Request failed with status code 404
at createError (.../node_modules/axios/lib/core/createError.js:16:15)
at settle (.../node_modules/axios/lib/core/settle.js:18:12)
A catch on the httpAdapter call will catch that error, but how do we pass that down the chain?
What is the correct way to implement an Axios adapter?
If there is a better way to handle this (short of forking the axios-retry library), that would be an acceptable answer.
Update
A coworker figured out that doing .catch(e => reject(e)) (or just .catch(reject)) on the httpAdapter call appears to handle the issue. However we'd still like to have a canonical example of implementing an Axios adapter that wraps the default http adapter.
Here's what worked (in node):
const httpAdapter = require('axios/lib/adapters/http');
const settle = require('axios/lib/core/settle');
const customAdapter = config =>
new Promise((resolve, reject) => {
httpAdapter(config).then(response => {
if (response.status === 200)
// && response.data contains particular error
{
// log if desired
response.status = 503;
}
settle(resolve, reject, response);
}).catch(reject);
});
// Then do axios.create() and pass { adapter: customAdapter }
// Now set up axios-retry and its retryCondition will be checked
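For completeness, here is how that adapter can be wired up, mirroring the axios.create() and axiosRetry() calls from the question; the retry condition now checks for the rewritten 503 status:
const axios = require('axios');
const axiosRetry = require('axios-retry');

const client = axios.create({ adapter: customAdapter });

axiosRetry(client, {
  retries: 3,
  // retry on network errors or on the status the adapter rewrote
  retryCondition: error => !error.response || error.response.status === 503,
});

client.get('http://httpstat.us/200')
  .then(result => console.log('Result:', result.data))
  .catch(e => console.error('Service returned', e.message));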
Workaround with interceptor and custom error
const axios = require("axios").default;
const axiosRetry = require("axios-retry").default;

axios.interceptors.response.use(async (response) => {
  if (response.status == 200) {
    const err = new Error("I want to retry");
    err.config = response.config; // axios-retry uses this
    throw err;
  }
  return response;
});

axiosRetry(axios, {
  retries: 1,
  retryCondition: (error) => {
    console.log("retryCondition");
    return false;
  },
});

axios
  .get("https://example.com/")
  .catch((err) => console.log(err.message)); // we end up here either way, as the interceptor makes every 200 response fail

Puppeteer: How to listen to a specific response?

I'm tinkering with the headless Chrome Node API called Puppeteer.
I'm wondering how to listen for a specific request's response and how to act on it accordingly.
I have looked at the requestfinished and response events, but they give me all the requests/responses already performed on the page.
How can I achieve the described behaviour?
One option is to do the following:
page.on('response', response => {
  if (response.url().endsWith("your/match"))
    console.log("response code: ", response.status());
  // do something here
});
This still catches all requests, but allows you to filter and act on the event emitter.
https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#event-response
This waits (up to 11 seconds) for a filtered response whose originating request used the PATCH or POST method, then parses the body as JSON, every time you call it:
const finalResponse = await page.waitForResponse(response =>
  response.url() === urlOfRequest
  && (response.request().method() === 'PATCH'
    || response.request().method() === 'POST'),
  { timeout: 11000 }); // the timeout option takes milliseconds

let responseJson = await finalResponse.json();
console.log(responseJson);
Since puppeteer v1.6.0 (I guess) you can use page.waitForResponse(urlOrPredicate[, options])
Example usage from docs:
const firstResponse = await page.waitForResponse('https://example.com/resource');
const finalResponse = await page.waitForResponse(response =>
  response.url() === 'https://example.com' && response.status() === 200
);
return finalResponse.ok();
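When the response is triggered by a page action rather than navigation, a common pattern is to start waiting before firing the trigger so the response cannot be missed. A sketch; the URL fragment and selector are hypothetical:
const [response] = await Promise.all([
  page.waitForResponse(r => r.url().includes('/api/data') && r.status() === 200),
  page.click('#load-button'), // the click triggers the request
]);
console.log(await response.json());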
I was using jest-puppeteer and trying to test for a specific response code of my test server. page.goto() resolves to the response of the original request.
Here is a simple test that a 404 response is returned when expected.
describe(`missing article page`, () => {
  let response;

  beforeAll(async () => {
    response = await page.goto('http://my-test-server/article/this-article-does-not-exist');
  });

  it('has a 404 response for missing articles', () => {
    expect(response.status()).toEqual(404);
  });

  it('has a footer', async () => {
    await expect(page).toMatch('My expected footer content');
  });
});
To get the XHR response, simply do:
const firstResponse = await page.waitForResponse('https://example.com/resource');
// the next line will extract the JSON response
console.log(await firstResponse.json());
