Throttling HTTP request (Node.js)

I'm doing some basic web scraping and need to limit my request rate, because the server returns a "page not found" response when there are too many requests.
Currently I'm using request-promise wrapped in request-promise-retry to make these calls. I also found this article, which seems to be trying to achieve the same thing: Throttle and queue up API requests due to per second cap
I tried simple-rate-limiter because it looks simple enough to use, but got the following error:
TypeError: requestPromise(...).catch is not a function
at promiseRetry.retries (C:\project\node_modules\request-promise-retry\index.js:18:27)
at C:\project\node_modules\promise-retry\index.js:29:24
at <anonymous>
My guess is that simple-rate-limiter doesn't work with request-promise and only works with request.
Is there a simple way to throttle the requests without having to rewrite all my calls with "request" instead of "request-promise"?
Or is it a good idea to rewrite using plain "request"? That would be a pain, as I already have a few pieces of code written with request-promise that expect a promise to be returned.
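
One way to keep request-promise and still throttle is to put the delay in front of the call instead of wrapping the library itself. The error above suggests that the rate-limited wrapper no longer returns a promise, so a throttle that always returns a native promise should avoid it. This is only a minimal sketch: `createThrottle`, `throttledRequest`, and `fakeRequest` are hypothetical names (not part of any library mentioned here), and the 100 ms interval is an assumed value.

```javascript
// Minimal promise-friendly throttle: starts at most one call every `minIntervalMs`.
function createThrottle(minIntervalMs) {
  let nextSlot = 0; // earliest timestamp at which the next call may start

  return function throttle(fn) {
    const now = Date.now();
    const startAt = Math.max(now, nextSlot);
    nextSlot = startAt + minIntervalMs; // reserve the slot after this one

    return new Promise((resolve, reject) => {
      setTimeout(() => {
        // Promise.resolve().then(fn) also turns synchronous throws into rejections.
        Promise.resolve().then(fn).then(resolve, reject);
      }, startAt - now);
    });
  };
}

// Hypothetical stand-in for a request-promise call; swap in the real call here.
function fakeRequest(url) {
  return Promise.resolve({ url, startedAt: Date.now() });
}

// Assumed interval: one request every 100 ms.
const throttle = createThrottle(100);
const throttledRequest = (url) => throttle(() => fakeRequest(url));
```

Because `throttledRequest` returns an ordinary promise, existing `.then(...)` and `.catch(...)` chains should keep working, and the retry wrapper can call it like any other promise-returning function.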

Related

Practical applications of multiple callbacks?

I'm relatively new to Node.js and I've found that there's a lot of support for multiple callback functions. I'd love some help in understanding the applications of such a structure (why isn't one enough?). Thanks!
Might it have something to do with making async requests in our callback functions?
For example, I'm handling a GET request, and must first make a GET request of my own before ultimately returning the response to the client.

OPTIONS Preflight request executes POST's code - is that standard?

If I understand correctly, a preflight OPTIONS request is sent as a way of asking "what's allowed here?". Then, once the response comes, if allowed, the calling site sends the POST request (or GET but in my case it's a post). I have figured out that, at least with Azure Function Apps, the OPTIONS request is executing the code that I expected only the POST to execute. I believe this to be the case because once I added some null checking (since the OPTIONS request doesn't have a payload in the body) everything worked fine.
I'm wondering if this is standard.
Seems to me that if I had written the API without using Azure Function Apps, I'd have the OPTIONS request sent down a path that would set the appropriate headers and return a 200 response. And the POST request would be sent down a different path that would expect a payload in the body. If that's how it usually works then that means I've just found an idiosyncrasy of the Azure functionality. But if not it means that I have something to learn about the OPTIONS preflight request.
Thanks in advance for your advice.
Denise

As sideshowbarker mentioned, the OPTIONS request is sent automatically by the browser to check if the cross-origin request can be made.
In the case of Azure Functions, this is handled by Azure when running in the cloud.
If your function is being triggered, that would mean that you have "options" as a supported method for your HTTP Trigger:
In the HTTPTrigger attribute for C# functions
In function.json for non-C# functions
If you want to customize the CORS responses and/or are running functions in a container, you could always include "options" as supported and respond differently when the incoming HTTP method is OPTIONS.
Also, if you are using Azure API Management with Azure Functions, you could offload CORS handling to it instead or even use Functions Proxies as shown here.
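
Responding differently per HTTP method could look roughly like the sketch below. It is not Azure-specific: `handle` is a hypothetical function, `req` is a plain object with `method` and `body` (not a real Azure Functions or Express request), and the allowed origin is an assumed value.

```javascript
// Hypothetical handler that branches on the HTTP method.
// `req` is a plain { method, body } object used only for illustration.
function handle(req) {
  const method = String(req.method || '').toUpperCase();

  if (method === 'OPTIONS') {
    // Preflight: answer with CORS headers only and never run the POST logic.
    return {
      status: 200,
      headers: {
        'Access-Control-Allow-Origin': 'https://example.com', // assumed origin
        'Access-Control-Allow-Methods': 'POST, OPTIONS',
        'Access-Control-Allow-Headers': 'Content-Type',
      },
    };
  }

  if (method === 'POST') {
    // Only the POST branch expects a payload.
    if (req.body == null) {
      return { status: 400, body: 'Missing payload' };
    }
    return { status: 200, body: { received: req.body } };
  }

  // Anything else is not supported by this endpoint.
  return { status: 405, headers: { Allow: 'POST, OPTIONS' } };
}
```

With this shape, an OPTIONS call without a body gets a successful response with headers, while a POST without a body is still rejected as invalid.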

Thanks y'all! Sorry I was unclear. And sorry it took me a while to get back. Things have been a bit crazy on this end.
Yes, the function being called is mine. And now I understand the browser doesn't have much choice as to whether or not it makes the OPTIONS call.
And yes, I could make my Azure function handle an options call differently and thanks for that suggestion too. That's sort of what I ended up doing but basically I did it by handling an empty payload. I didn't follow that best practice originally because I thought any valid request would have a payload. Accordingly, any request that did not have a payload was invalid and should be turned away as a failure of some sort. This was before I knew that the OPTIONS call was actually executing that function.
My remaining question is if I had NOT been using Azure... if I had rolled my own solution and hosted it somewhere, I'd have a class or at least methods that handle calls to this particular API. (This is something I'm new to so bear with me if my terms aren't quite right and please do correct me). So if I'd done my own API, I'd have one method to handle a POST call and a different method to handle an OPTIONS call, wouldn't I? And the method that handles the OPTIONS call would return information about what's legally do-able with this API. And the method that handles a POST call would handle the payload sent with it. And the method that handles the POST wouldn't get executed when an OPTIONS request is sent. At least that's how I figured it would work. And that's my question -- is that how it's done when not letting something like Azure handle some of the infrastructure?
I'm just trying to learn if the OPTIONS request executing a POST's function is a standard practice or if it's some kind of idiosyncrasy to working with Azure functions.
Thanks again for the advice and for helping me understand these questions.

Node post request body gets truncated

When trying to post a WakeUp event with a JSON body to the Alexa events API using Node.js with axios or request-promise, the API always returns an error 500.
I posted to an online endpoint to see what actually gets posted and learned that the post body gets truncated, which obviously results in invalid JSON. I abstracted the problem and tried to run it from a fresh Node.js installation by using repl.it, and the result is the same.
Interestingly enough, there seems to be a relation between the length of the header and the body. When I shorten the auth token in the header, more characters of the body get transferred. If I shorten the long tokens in the body to about 450 to 500 characters (it seems to vary), the whole request gets through. Obviously this is not a solution, because the tokens are needed for authentication.
When I experimented with the axios version, lowering it to 0.10, I once got a result, but posting again led to another 500. If I post often enough, some requests get through complete, even on the current axios version. I also tried using request-promise with the same outcome.
I got the feeling that I made a really stupid mistake but I can't find it and I really couldn't find anything on this topic, so it's driving me crazy. Any help would be greatly appreciated!

This looks like a tricky one. First of all, I don't think you're making a really stupid mistake. It looks to me like one of the low-level modules doesn't like something in the POST body for some reason (really weird). I've played about with this and I'm getting exactly the same behaviour with both Axios and Request. If I comment out the tokens (correlationToken and bearer token), everything works fine.
If I test this locally, everything works as it should (e.g. set up an Express server and log the POST body).
Posting to https://postman-echo.com/post also works as expected (with the original post data).
I've created this here: https://repl.it/repls/YoungPuzzlingMonad
It looks to me like the original request to http://posthere.io is failing because of the request size only. If you try a very basic POST with a large JSON body you get the same result.
I get the same result with superagent too, which leads me to believe this is something server side.

This was not related to the post request at all. The reason for the error after sending the WakeUp event was the missing configuration parameter containing the MACAddresses in the Alexa.WakeOnLANController interface.
I used the AlexaResponse class to add the capability via createPayloadEndpointCapability, which had not been modified to support the "new" WakeOnLANController interface yet.
It's a pity that the discovery was accepted and my WOL-capable device was added to my smart home devices although a required parameter was missing :(
posthere.io cutting off long post bodies cost me quite a few hours... On the upside, I got to know many different ways of issuing a post request in node ;)
Thanks again Terry for investigating!

How can I see the whole request and response object in a node.js program?

I have written a web server in nodejs. Most of the time I am receiving a message from one service, doing something, and sending a message to another service. I am in the middle of all the communication.
Sometimes, the communication fails. I am trying to debug what's going on. I would like to examine the request that comes in.
I have a node service, written in express. I have routes, and the routes are passed a req object and a resp object. I should be able to just print out the req object. Problem solved!
But JSON.stringify throws an error. util.inspect doesn't throw an error, but many property values are marked [circular]. The actual property value isn't shown.
When I console.log(req.body) it prints undefined. When I look at req.body using util.inspect, it prints body: {}
I have the feeling the framework is hiding things from me. I don't know how to get the information without it being prettified.
At the tcp/ip level, it's too detailed. At the application level, it's not detailed enough. But at the http level, it should be just right. The request that is received is just text. I should be able to print it out.
I tried using Charles, but I'm having trouble configuring it.
Surely, other people have wanted to see the request as it comes in, before the framework massaged it. How did they do it?

You can use the morgan module; it's an HTTP request logger middleware for Node.js.

I made a more specific question, using a lower level of the node stack of middleware. I got an answer there:
Where did the information I passed in go?
Here is the discussion of how node came to be designed this way:
Node.js - get raw request body using Express
Basically, there used to be a rawBody attribute of the request object in node. People took it out. To accomplish the same thing requires a little bit of code.
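
The "little bit of code" could look something like the sketch below, which collects the raw bytes of a request stream before anything parses them. `readRawBody` and `fakeReq` are hypothetical names; `fakeReq` is a stand-in stream so the example runs without a server, whereas in a real route the incoming request object would be passed in, before any body-parsing middleware has consumed it.

```javascript
const { Readable } = require('stream');

// Collect the raw bytes of any readable stream and return them as a UTF-8 string.
// An incoming HTTP request in Node is a readable stream, so it can be passed here.
function readRawBody(stream) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    stream.on('data', (chunk) => chunks.push(Buffer.from(chunk)));
    stream.on('end', () => resolve(Buffer.concat(chunks).toString('utf8')));
    stream.on('error', reject);
  });
}

// Stand-in for an incoming request whose body arrives in two chunks.
const fakeReq = Readable.from([Buffer.from('{"hello":'), Buffer.from('"world"}')]);

readRawBody(fakeReq).then((raw) => {
  console.log(raw); // prints {"hello":"world"}
});
```

A stream can only be read once, so this has to happen before a parser drains the request; otherwise the promise resolves with an empty string.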

Modifying content delivered by node-http-proxy

Due to some limitations about the web services I am proxying, I have to inject some JS code so that it allows the iframe to access the parent window and perform some actions.
I have built a proxy system with node-http-proxy which works pretty nicely. However, I have spent countless hours trying to modify the content that is being sent to the user (on my own, using harmon as well, etc.) without any success. I have found some articles and even some questions here, but all of them are outdated and no longer useful.
I was wondering if someone can give me an actual example about how to do this, because I am unable to do it and maybe it is just that it is impossible to do at this point?

I haven't tried harmon, but I did try cheerio and it works.
However, I used http-mitm-proxy and not node-http-proxy.
If you are using http-mitm-proxy, you need to return a promise in the response handler. Otherwise, the proxy continues to send the original response without picking up your changes.
I have recently written another proxy at:
https://github.com/noeltimothy/noelsproxy
I'm going to add response handling to this soon. This one uses a callback mechanism, which means it won't return the response until the caller signals it to.
You should be able to use 'cheerio' and alter the content in jQuery style.
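
Whichever proxy is used, the modification step itself is a function from the original HTML to the altered HTML. The sketch below shows that step with plain string handling instead of cheerio, so it has no dependencies; `injectScript` and the `/bridge.js` path are hypothetical, and it assumes the full response body has already been buffered and decoded to a string.

```javascript
// Insert a <script> tag just before the last closing </body> tag.
// If the document has no </body>, the tag is appended at the end instead.
function injectScript(html, scriptSrc) {
  const tag = `<script src="${scriptSrc}"></script>`;
  // Matches the final </body> (case-insensitive): no other </body> may follow it.
  const lastBodyClose = /<\/body>(?![\s\S]*<\/body>)/i;
  if (!lastBodyClose.test(html)) return html + tag;
  // A replacer function avoids special handling of "$" in the inserted text.
  return html.replace(lastBodyClose, (match) => tag + match);
}

const page = '<html><body><p>Hello</p></body></html>';
console.log(injectScript(page, '/bridge.js'));
// prints <html><body><p>Hello</p><script src="/bridge.js"></script></body></html>
```

In a real proxy the body length changes after injection, so a Content-Length header copied from the upstream response would no longer match, and a compressed upstream response would have to be decompressed before it can be edited.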
