Nodejs proxy request coalescing - node.js

I'm running into an issue with my http-proxy-middleware stuff. I'm using it to proxy requests to another service which i.e. might resize images et al.
The problem is that multiple clients might call the method multiple times and thus create a stampede on the original service. I'm now looking into (what some services call request coalescing i.e. varnish) a solution that would call the service once, wait for the response and 'queue' the incoming requests with the same signature until the first is done, and return them all in a single go... This is different from 'caching' results due to the fact that I want to prevent calling the backend multiple times simultaneously and not necessarily cache the results.
I'm trying to find if something like that might be called differently or am i missing something that others have already solved someway... but i can't find anything...
As the use case seems pretty 'basic' for a reverse-proxy type setup, I would have expected alot of hits on my searches but since the problemspace is pretty generic i'm not getting anything...
Thanks!

A colleague of mine has helped my hack my own answer. It's currently used as a (express) middleware for specific GET-endpoints and basically hashes the request into a map, starts a new separate request. Concurrent incoming requests are hashed and checked and walked on the separate request callback and thus reused. This also means that if the first response is particularly slow, all coalesced requests are too
This seemed easier than to hack it into the http-proxy-middleware, but oh well, this got the job done :)
const axios = require('axios');
const responses = {};
module.exports = (req, res) => {
const queryHash = `${req.path}/${JSON.stringify(req.query)}`;
if (responses[queryHash]) {
console.log('re-using request', queryHash);
responses[queryHash].push(res);
return;
}
console.log('new request', queryHash);
const axiosConfig = {
method: req.method,
url: `[the original backend url]${req.path}`,
params: req.query,
headers: {}
};
if (req.headers.cookie) {
axiosConfig.headers.Cookie = req.headers.cookie;
}
responses[queryHash] = [res];
axios.request(axiosConfig).then((axiosRes) => {
responses[queryHash].forEach((coalescingRequest) => {
coalescingRequest.json(axiosRes.data);
});
responses[queryHash] = undefined;
}).catch((err) => {
responses[queryHash].forEach((coalescingRequest) => {
coalescingRequest.status(500).json(false);
});
responses[queryHash] = undefined;
});
};

Related

Progress bar for express / react communicating with backend

I want to make a progress bar kind of telling where the user where in process of fetching the API my backend is. But it seems like every time I send a response it stops the request, how can I avoid this and what should I google to learn more since I didn't find anything online.
React:
const {data, error, isError, isLoading } = useQuery('posts', fetchPosts)
if(isLoading){<p>Loadinng..</p>}
return({data&&<p>{data}</p>})
Express:
app.get("api/v1/testData", async (req, res) => {
try {
const info = req.query.info
const sortByThis = req.query.sortBy;
if (info) {
let yourMessage = "Getting Data";
res.status(200).send(yourMessage);
const valueArray = await fetchData(info);
yourMessage = "Data retrived, now sorting";
res.status(200).send(yourMessage);
const sortedArray = valueArray.filter((item) => item.value === sortByThis);
yourMessage = "Sorting Done now creating geojson";
res.status(200).send(yourMessage);
createGeoJson(sortedArray)
res.status(200).send(geojson);
}
else { res.status(400) }
} catch (err) { console.log(err) res.status(500).send }
}
You can only send one response to a request in HTTP.
In case you want to have status updates using HTTP, the client needs to poll the server i.e. request status updates from the server. Keep in mind though that every request needs to be processed on the server side and will take resources away which are then not available for other (more important) requests from other clients. So don't poll too frequently.
If you want to support long running operations using HTTP have a look at the following API design pattern.
Alternatively you could also use a WebSockets connection to push updates from the server to the client. I assume your computation on the backend will not be minutes long and you want to update the client in real-time, so probably WebSockets will be the best option for you. A WebSocket connection has, once established, considerably less overhead than sending huge HTTP requests/ responses between client and server.
Have a look at this thread which dicusses abovementioned and other possibilites.

How can I prevent my ServiceWorker from intercepting cross-origin requests?

I am trying to clean up some code on my application https://git.sequentialread.com/forest/sequentialread-password-manager
I am using a ServiceWorker to enable the application to run offline -- however, I noticed that the ServiceWorker is intercepting cross-origin requests to backblazeb2.com. The app makes these cross origin requests as a part of its normal operation.
You can see here how I am registering the ServiceWorker:
navigator.serviceWorker.register('/serviceworker.js', {scope: "/"}).then(
(reg) => {
...
And inside the serviceworker.js code, I manually avoid caching any requests to backblazeb2.com:
...
return fetch(event.request).then(response => {
const url = new URL(event.request.url);
const isServerStorage = url.pathname.startsWith('/storage');
const isVersion = url.pathname == "/version";
const isBackblaze = url.host.includes('backblazeb2.com');
const isPut = event.request.method == "PUT";
if(!isServerStorage && !isVersion && !isBackblaze && !isPut) {
... // cache the response
However, this seems silly, I wish there was a way to limit the ServiceWorker to only intercept requests for the current origin.
I already tried inserting the origin into the scope property during registration, but this didn't work:
navigator.serviceWorker.register('/serviceworker.js', {scope: window.location.origin}).then(
(reg) => {
...
It was behaving the same way. I am assuming that perhaps this is because there are CORS headers present on the responses from backblazeb2.com, making those requests "technically" within the "scope" of the current origin ?
One idea I had, I could serve a permanent redirect from / to /static/index.html and then configure the serviceworker with a scope of /static, meaning it would only cache resources in that folder. But that seems like an ugly hack I should not have to do.
Is there a clean and "correct" way to do this??
As far as I can tell, the answer is, you can't do this. the serviceworker api won't let you.
Someone explained it to me as "the scope for the serviceworker limits where FROM the requests can be intercepted, not where TO the requests can be intercepted. So in otherwords, if I register a serviceworker at /app/, then a javascript from / or /foo/ will be able to make requests without them being intercepted.
It turns out that actually what I REALLY needed was to understand how service worker error handling works.
In the old version of my app, when the fetch() promise rejected, my code would return null.
return fetch(event.request).then(response => {
...blahblahblah...
}).catch( e => {
....
return null;
});
This was bad news bears and it was causing me to want to skip the serviceworker. what I didn't understand was; its not the serviceworkers fault per se as much as the fact that my serviceworker did not handle errors correctly. So the solution was to handle errors better.
This is what I did. I introduced a new route on the server called /error that always returns the string "serviceworker request failed".
http.HandleFunc("/error", func(response http.ResponseWriter, request *http.Request) {
response.WriteHeader(200)
fmt.Fprint(response, "serviceworker request failed")
})
And then I made sure to cache that endpoint when the service worker is installed.
self.addEventListener('install', event => {
event.waitUntil(clients.get(event.clientId).then(client => {
return caches.open(cacheVersion).then(cache => {
return cache.addAll([
'/',
'/error',
....
Finally, when the serviceworker fetch() promise rejects, I fall back to returning the cached version.
return fetch(event.request).then(response => {
...blahblahblah...
}).catch( e => {
....
return caches.match('/error');
});
I got the idea from the MDN serviceworker example project which does a similar thing and simply returns a cached image of darth vader if the fetch() promise rejects.
This allowed me to gracefully handle these errors and retry instead of silently failing. I simply had to make sure that my code does the right thing when it encounters an http response that matches the literal string "serviceworker request failed".
const requestFailedBytes = app.sjcl.codec.bytes.fromBits(app.sjcl.codec.utf8String.toBits("serviceworker request failed"));
...
var httpRequest = new XMLHttpRequest();
....
httpRequest.onloadend = () => {
...
if(app.cryptoService.uint8ArrayEquals(new Uint8Array(httpRequest.response), requestFailedBytes)) {
reject(false);
return
}
The fetch event in the service worker has a property on the request called "mode." This property allows you to check if the mode was set to "cors."
Here is an example of how to prevent cors requests.
self.addEventListener('fetch', (e) =>
{
if(e.request.method !== 'GET')
{
return false;
}
if(e.request.mode === 'navigate')
{
e.respondWith(caches.match('index.html'));
return false;
}
else if(e.request.mode === 'cors')
{
return false;
}
const response = this.fetchFile(e);
e.respondWith(response);
});

Chain of endpoints in Node and Express: how to prevent that some of them stops all the series?

In some page I have to get information from 8 different endpoints. 2 of them are outside of my application and sometimes they cause an delay at displaying data. The web browser waits until the data is processed. Once they're outside of my app I can't refactor them in order to make them fast, but I need to show the information that they provide. In addition, sometimes one of them returns nothing. If so, I use default data to show to the user. The waiting time takes time for the user experience perspective.
I'm using promises to call these endpoints. Below is part of the code snippet that I am using.
The code is working fine. The issue is the delay.
First. Here is the array that contains all the service that I need to process:
var requests = [{
// 0
url: urlLocalApi + '/endpointURL_1/',
headers: {
'headers': 'apitoken'
},
}, {
// 1
url: urlLocalApi + '/endpointURL_2/',
headers: {
'headers': 'apitoken'
},
];
The code of array is encapsulated in this method:
const requests = homePageFunctions.createRequest();
Now, it is how the data is processed. I am using both 'request-promise' and 'bluebird', and a personal logger to check it out if everything goes fine.
const Promise = require("bluebird");
const request = require('request-promise');
var viewsHelper = {
getPageData: function (requests) {
return Promise.map(requests, function (obj) {
return request(obj).then(function (body) {
AppLogger.log(`Endpoint parsed`, statusLogger.infodate);
return JSON.parse(body);
});
});
}
}
module.exports = viewsHelper;
How do I call this?
viewsHelper.getPageData(requests)
.then(results => {
var output = [];
for (var i = 0; i < results.length; i++) {
output.push(results[i]);
}
// render data
res.render('homepage/index', output);
AppLogger.log(`PageData is rendered`, statusLogger.infodate);
})
.catch(err => {
console.log(err);
});
};
Take a look that inside of each index item of "output" array, there is the output of each data of each endpoint.
The problem here is:
If any of the endpoint takes long, the entire chain slows even though
if they are already processed. The web page waits in a blank mode.
How to prevent this behavior?
That is an interesting question but I have questions in order to answer it effectively.
You have Node server and client (HTML/JS)
You have 8 end points 2 are slow because you don’t have control over them.
Does the client (page) aware of the 8 end points? I .e you make 8 calls everytime you reload the page?
OR
Does the page makes one request to your node JS and your nodeJS synchronously calls the 8 end points
If it is 1 then lazy loading will work easily for you since the page is making the requests.
If it is 2 lazy loading will work only at the server side however the client will be blocked because it doesn’t know (or care how you load your data. The page made one request and it is blocked waiting for that request..
Obviously each method has pros and cons ..
One way you can solve this is to asynchronously call those end points on node and cache them and when the page makes the 1 request you have the data ready ..
Again we know very little about the situation there are many ways to solve this
Hope this helps

Call Express router manually

Нello! I am looking to call a function which has been passed to an expressRouter.post(...) call.
This expressRouter.post(...) call is occurring in a file which I am unable to modify. The code has already been distributed to many clients and there is no procedure for me to modify their versions of the file. While I have no ability to update this file for remote clients, other developers are able to. I therefore face the issue of this POST endpoint's behaviour changing in the future.
I am also dealing with performance concerns. This POST endpoint expects req.body to be a parsed JSON object, and that JSON object can be excessively large.
My goal is to write a GET endpoint which internally activates this POST endpoint. The GET endpoint will need to call the POST endpoint with a very large JSON value, which has had URL query params inserted into it. The GET's functionality should always mirror the POST's functionality, including if the POST's functionality is updated in the future. For this reason I cannot copy/paste the POST's logic. Note also that the JSON format will never change.
I understand that the issue of calling an expressjs endpoint internally has conventionally been solved by either 1) extracting the router function into an accessible scope, or 2) generating an HTTP request to localhost.
Unfortunately in my case neither of these options are viable:
I can't move the function into an accessible scope as I can't modify the source, nor can I copy-paste the function as the original version may change
Avoiding the HTTP request is a high priority due to performance considerations. The HTTP request will require serializing+deserializing an excessively large JSON body, re-visiting a number of authentication middlewares (which require waiting for further HTTP requests + database queries to complete), etc
Here is my (contrived) POST endpoint:
expressRouter.post('/my/post/endpoint', (req, res) => {
if (!req.body.hasOwnProperty('val'))
return res.status(400).send('Missing "val"');
return res.status(200).send(`Your val: ${req.body.val}`);
});
If I make a POST request to localhost:<port>/my/post/endpoint I get the expected error or response based on whether I included "val" in the JSON body.
Now, I want to have exactly the same functionality available, but via GET, and with "val" supplied in the URL instead of in any JSON body. I have attempted the following:
expressRouter.get('/my/get/endpoint/:val', (req, res) => {
// Make it seem as if "val" occurred inside the JSON body
let fakeReq = {
body: {
val: req.params.val
}
};
// Now call the POST endpoint
// Pass the fake request, and the real response
// This should enable the POST endpoint to write data to the
// response, and it will seem like THIS endpoint wrote to the
// response.
manuallyCallExpressEndpoint(expressRouter, 'POST', '/my/post/endpoint', fakeReq, res);
});
Unfortunately I don't know how to implement manuallyCallExpressEndpoint.
Is there a solution to this problem which excludes both extracting the function into an accessible scope, and generating an HTTP request?
This seems possible, but it may make more sense to modify req and pass it, rather than create a whole new fakeReq object. The thing which enables this looks to be the router.handle(req, res, next) function. I'm not sure this is the smartest way to go about this, but it will certainly avoid the large overhead of a separate http request!
app.get('/my/get/endpoint/:val', (req, res) => {
// Modify `req`, don't create a whole new `fakeReq`
req.body = {
val: req.params.val
};
manuallyCallExpressEndpoint(app, 'POST', '/my/post/endpoint', req, res);
});
let manuallyCallExpressEndpoint = (router, method, url, req, res) => {
req.method = method;
req.url = url;
router.handle(req, res, () => {});
};
How about a simple middleware?
function checkVal(req, res, next) {
const val = req.params.val || req.body.val
if (!val) {
return res.status(400).send('Missing "val"');
}
return res.status(200).send(`Your val: ${val}`);
}
app.get('/my/get/endpoint/:val', checkVal)
app.post('/my/post/endpoint', checkVal)
This code isn't tested but gives you rough idea on how you can have the same code run in both places.
The checkVal function serves as a Express handler, with request, response and next. It checks for params first then the body.

Reasonable design of using socket.io for RPC

I am building a web app that uses socket.io to trigger remote procedure calls that passes sessions specific data back go the client. As my app is getting bigger and getting more users, I wanted to check to see if my design is reasonable.
The server that receives websocket communications and triggers RPCs looks something like this:
s.on('run', function(input) {
client.invoke(input.method, input.params, s.id, function(error, res, more) {
s.emit('output', {
method: input.method,
error: error,
response: res,
more: more,
id: s.id
});
});
});
However, this means that the client has to first emit the method invocation, and then listen to all method returns and pluck out its correct return value:
socket.on('output', function(res) {
if (res.id === socket.sessionid) {
if (!res.error) {
if(res.method === 'myMethod') {
var myResponse = res.response;
// Do more stuff with my response
});
}
}
});
It is starting to seem like a messy design as I add more and more functions... is there a better way to do this?
The traditional AJAX way of attaching a callback to each function would be a lot nicer, but I want to take advantage of the benefits of websockets (e.g. less overhead for rapid communications).

Resources