Throttling requests to third-party APIs when using cloud functions - node.js

We're running our Node backend on Firebase Functions, and have to frequently hit a third-party API (HubSpot), which is rate-limited to 100 requests / 10 seconds.
We're making these requests to HubSpot from our cloud functions, and we often exceed HubSpot's rate limit during campaigns or other website usage spikes. Also, since they are all write requests that update data on HubSpot, these requests cannot be made out of order.
Is there a way to throttle our requests to HubSpot so as not to exceed their rate limit? I'm open to suggestions that may not necessarily involve cloud functions, although that would be preferred.
Note: When I say "throttle", I mean that all requests to HubSpot need to go through. I'm trying to achieve something similar to what Lodash's throttle method does, if that makes sense.

What we usually do in this case is store the data in a database, then pass it over to HubSpot at a measured pace (i.e. without exceeding their rate limit) using a cron that runs every minute. Every data item that we successfully pass to HubSpot is marked as "success" in the database.
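A minimal sketch of that approach using Firebase scheduled functions and Firestore. The collection and field names, and the `pushToHubspot` wrapper, are hypothetical; adjust them to your schema:

```js
const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();

// Runs every minute and drains pending items in insertion order.
// HubSpot's 100 requests / 10 s allows at most 600/min; stay well under it.
exports.drainHubspotQueue = functions.pubsub
  .schedule('every 1 minutes')
  .onRun(async () => {
    const snap = await admin.firestore()
      .collection('hubspotQueue')           // hypothetical queue collection
      .where('status', '==', 'pending')
      .orderBy('createdAt')                 // preserves write order
      .limit(150)                           // kept small to fit the 60 s default timeout
      .get();

    for (const doc of snap.docs) {
      await pushToHubspot(doc.data().payload);      // hypothetical HubSpot API wrapper
      await doc.ref.update({ status: 'success' });
      await new Promise((r) => setTimeout(r, 150)); // ~6-7 req/s on average
    }
  });
```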

Cloud Functions cannot be rate limited; it will always attempt to service requests and events as fast as they arrive. But you can use Cloud Tasks to create a task queue that spreads out the load of some work over time using a configured rate limit. A task queue can target another HTTP function. This effectively makes your processing asynchronous, but it is really the only mechanism Google Cloud gives you to smooth out load.
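For illustration, a hedged sketch of enqueueing work with the `@google-cloud/tasks` client; the project, location, queue, and target function names are placeholders:

```js
const { CloudTasksClient } = require('@google-cloud/tasks');
const client = new CloudTasksClient();

// Enqueue one HubSpot update; the queue's configured rate limit controls
// how fast tasks are dispatched to the target HTTP function.
async function enqueueHubspotUpdate(payload) {
  const parent = client.queuePath('my-project', 'us-central1', 'hubspot-queue');
  await client.createTask({
    parent,
    task: {
      httpRequest: {
        httpMethod: 'POST',
        url: 'https://us-central1-my-project.cloudfunctions.net/processHubspotUpdate',
        headers: { 'Content-Type': 'application/json' },
        body: Buffer.from(JSON.stringify(payload)).toString('base64'),
      },
    },
  });
}
```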

Related

Rate limit a pubsub queue worker

There is a scenario where a Google Pub/Sub worker will call a third-party API. This third-party API has a limit of 500 requests per minute.
How can we handle this scenario?
Can we rate limit the Google Pub/Sub worker (and if that's possible, how do we achieve it)?
Is there any other way to check the limit before making the call to the third-party API?
Please share if there is another option. Thanks
Cloud Tasks is the tool designed for that. Instead of publishing your messages to a Pub/Sub topic, create a task in a Cloud Tasks queue with the target URL.
Define the rate limit in the task queue configuration and, out of the box, your feature is done.
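For example (the queue name is arbitrary, and 500 requests/minute is roughly 8 per second):

```sh
gcloud tasks queues create third-party-queue \
    --max-dispatches-per-second=8 \
    --max-concurrent-dispatches=5
```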
The simplest solution would be to use API Gateway: configure push delivery from the subscription to your rate-limited API Gateway endpoint: https://cloud.google.com/api-gateway/docs/quotas-overview
When you reach the threshold, the messages will fail and go back to the queue. Be sure to configure monitoring alerts on your Pub/Sub unacked message count, plus a dead-letter queue, to avoid losing information.
Another solution might be a custom implementation on App Engine: pull messages with a single instance of your subscriber app and keep the request count for the current time window in memory, though that is not very resilient. For a more resilient option, consider using Redis (Memorystore) to share the request rate across several instances; then you can use something like Cloud Functions.
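A minimal sketch of that shared Redis counter, using fixed one-minute windows; the key naming and the 500/min limit are assumptions matching the question:

```js
const { createClient } = require('redis');
// Assumes: const redis = createClient({ url: process.env.REDIS_URL }); await redis.connect();

// Fixed-window counter shared by every worker instance.
// Returns true if the caller may make another API call in this window.
async function tryAcquire(redis, limit = 500, windowSeconds = 60) {
  const window = Math.floor(Date.now() / (windowSeconds * 1000));
  const key = `thirdparty:rate:${window}`;
  const count = await redis.incr(key);
  if (count === 1) {
    await redis.expire(key, windowSeconds * 2); // stale windows clean themselves up
  }
  return count <= limit; // false => nack or delay the Pub/Sub message
}
```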

Azure Durable Functions as Message Queue

I have a serverless function that receives orders, about 30 per day. This function depends on a third-party API to perform some additional lookups and checks. However, this external endpoint isn't 100% reliable, and I need to be able to store order requests if the other API is unavailable for a couple of hours (or more).
My initial thought was to split the function in two: the first part would receive orders, do some initial checks such as validating the order, then post the request to a message queue or pub/sub system. On the other side, a consumer reads orders and tries to perform the API requests; if the API isn't available, the orders get posted back into the queue.
However, someone suggested simply using an Azure Durable Function for the requests and storing the current backlog in the function state, using the Aggregator Pattern (especially since the API will be working fine 99.99..% of the time). This would make the architecture a lot simpler.
What are the advantages/disadvantages of using one over the other, am I missing any important considerations?
I would appreciate any insight or other suggestions you have. Let me know if additional information is needed.
You could solve this problem with the Durable Task Framework, Azure Storage queues, or Service Bus queues, but at your transaction volume, I think that's overcomplicating the solution.
If you're dealing with ~30 orders per day, consider one of the simpler solutions:
Use Polly, a well-supported resilience and fault-tolerance framework.
Write request information to your database. Have an Azure Function Timer Trigger read occasionally and finish processing orders that aren't marked as complete.
Durable Task Framework is great when you get into serious volume. But there's a non-trivial learning curve for the framework.
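A sketch of the second option as a Node.js timer-triggered function; the data-access helpers are hypothetical, and the schedule lives in `function.json`:

```js
// function.json binds this to a timer trigger, e.g. schedule "0 */5 * * * *".
module.exports = async function (context, myTimer) {
  const pending = await db.findOrders({ status: 'pending' }); // hypothetical data layer
  for (const order of pending) {
    try {
      await thirdPartyLookup(order);            // the flaky external API call
      await db.markOrder(order.id, 'complete');
    } catch (err) {
      // Leave the order pending; the next timer run retries it.
      context.log(`Order ${order.id} still failing: ${err.message}`);
    }
  }
};
```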

How to handle multiple API Requests

I am working with the Google Admin SDK to create Google Groups for my organization. I can't add members to a group at creation time; ideally, when I create a new group, I'd like to add roughly 60 members. In addition, the ability to add members in bulk after the group is created (a batch request) was deprecated in August 2020. Right now, after I create a group, I need to make a separate web request to the API for each member I add (about 60 requests).
I am using Node.js and Express; is there a good way to handle 60 web requests to an API? I don't know how taxing this will be on my server. If anyone has resources to share on the impact this would have on a Node.js server, that would be great.
Fortunately, these groups aren't created often, maybe 15 a month.
One idea I have is to offload the work to something like a cloud function, so my Node server makes one request to the cloud function, and the cloud function makes all the additional requests to add members to the group. I'm not 100% sure this is necessary, and I'm curious about other approaches.
Limits and Quotas
Note that adding group members may take 10 minutes to propagate.
The rate limit for the Directory API is 3,000 queries per 100 seconds per IP address, which works out to around 30 per second. 60 requests is not a large number, but if you send them all within a few milliseconds the system may extrapolate the rate and deem you over the limit. I wouldn't expect it to, but it's probably best to test on your end with your own system and connection.
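If you want a safety margin, simply spacing the calls out keeps you far below the limit; a sketch, where `addMember` is a hypothetical wrapper around the Directory API members insert:

```js
// ~10 requests/second: 60 members take about 6 seconds total.
for (const email of members) {
  await addMember(groupKey, email);
  await new Promise((r) => setTimeout(r, 100));
}
```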
Exponential Backoff
If you do need to make many requests, this is the method Google recommends: when a request fails, retry it, exponentially increasing the wait between attempts up to 16 seconds. You can always implement a longer wait before retrying; it's only 60 requests, after all.
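A minimal sketch of that retry loop in Node.js (Google's guidance also suggests adding random jitter to each wait):

```js
// Retry fn with exponentially growing waits, capped at 16 seconds.
async function withBackoff(fn, maxRetries = 5) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxRetries) throw err;
      const wait = Math.min(1000 * 2 ** attempt, 16000) + Math.random() * 1000;
      await new Promise((r) => setTimeout(r, wait));
    }
  }
}

// Usage: await withBackoff(() => addMember(groupKey, email));
```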
Batch Requests
The previously mentioned methods should work for you without issue; with only 60 requests to make, they won't put any real stress on the system. That said, the most performant way to make many requests to the Directory API is a batch request, which lets you bundle up to 1,000 calls into a single request. This will also give you a nice cushion in case you need to increase your request volume in the future!
EDIT - I am sorry, I missed that you mentioned batching is deprecated. Only global batching is deprecated: a batch request sent to a single specific API is still supported. What you can no longer do is send one batch request that spans different APIs, like Directory and Sheets in one.
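For illustration, an API-specific batch request is a single `multipart/mixed` POST to that API's batch endpoint. A sketch using Node 18's built-in `fetch`; the endpoint and part format follow Google's batch documentation, but treat the details as something to verify:

```js
// One batch POST that inserts many members into a group.
async function addMembersBatch(accessToken, groupKey, emails) {
  const boundary = 'batch_members';
  const body = emails.map((email, i) =>
    `--${boundary}\r\n` +
    `Content-Type: application/http\r\n` +
    `Content-ID: <item-${i}>\r\n\r\n` +
    `POST /admin/directory/v1/groups/${groupKey}/members HTTP/1.1\r\n` +
    `Content-Type: application/json\r\n\r\n` +
    `${JSON.stringify({ email, role: 'MEMBER' })}\r\n`
  ).join('') + `--${boundary}--`;

  return fetch('https://www.googleapis.com/batch/admin/directory_v1', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${accessToken}`,
      'Content-Type': `multipart/mixed; boundary=${boundary}`,
    },
    body,
  });
}
```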
References
Limits and Quotas
Exponential Backoff
Batch Requests

How do Azure Functions scale out?

The scaling documentation for Azure Functions is a bit light on details about how Azure Functions decides when to add more instances of an app.
Say for example I have a function that is triggered by a Github webhook. 10,000 people simultaneously commit to the Github repo (with no merge conflicts ;) ), and Github calls my function 10,000 times in a very short period of time.
What can I expect to happen? Specifically,
Will Azure Functions throttle the webhook calls? i.e., will Azure Functions reject certain function calls if my function app is under high load?
Does Azure Functions queue the requests somehow? If so, where/how?
How many instances of my function app will Azure Functions create in this scenario? One for each request (i.e., 10,000), and each will run in parallel?
If my app function was scaled down to zero instances, because there was no load on it, could I expect to see some "warm-up time" before the first function is executed? Roughly how long?
Azure Functions won't reject a webhook call, but in the case of sudden, extreme load, some requests may time out. For web APIs, please include retry logic on the client as a best practice.
They aren't queued in any persistent place. They are (implementation detail) managed by IIS.
(Implementation detail) Number of instances isn't a hard set thing. We have certain, unpublished protections in place, but we're designed to scale quite far. Your requests will be handled by multiple instances.
Yes. Right now, it's pretty hefty (seconds), but we'll be working to improve it. For perf sensitive situations, a canary or a timer trigger to keep it awake is recommended.
I'm from the Azure Functions team. The things I marked as implementation details aren't promises and will likely also change as we evolve our service; just an attempt at transparency.
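For point 4, a keep-warm timer can be as small as this Node.js sketch (the 5-minute schedule is an assumption; any short interval works):

```js
// function.json: timer trigger, e.g. schedule "0 */5 * * * *".
module.exports = async function (context, myTimer) {
  // A no-op is enough: the timer firing keeps the function app's host loaded,
  // so HTTP-triggered functions in the same app skip most of the cold start.
  context.log('keep-warm tick');
};
```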
tested today. it took more than seconds :(
ACTUAL PERFORMANCE
--------------
ClientConnected: 13:58:41.589
ClientBeginRequest: 13:58:41.592
GotRequestHeaders: 13:58:41.592
ClientDoneRequest: 13:58:41.592
Determine Gateway: 0ms
DNS Lookup: 65ms
TCP/IP Connect: 40ms
HTTPS Handshake: 114ms
ServerConnected: 13:58:41.703
FiddlerBeginRequest: 13:58:41.816
ServerGotRequest: 13:58:41.817
ServerBeginResponse: 14:00:36.790
GotResponseHeaders: 14:00:36.790
ServerDoneResponse: 14:00:36.790
ClientBeginResponse: 14:00:36.790
ClientDoneResponse: 14:00:36.790
Overall Elapsed: 0:01:55.198

Instagram real-time API POST rate

I'm building an application using tag subscriptions in the real-time API and have a question related to capacity planning. We may have a large number of users posting to a subscribed hashtag at once, so the question is how often will the API actually POST to our subscription processing endpoint? E.g., if 100 users post to #testhashtag within a second or two, will I receive 100 POSTs or does the API batch those together as one update? A related question: is there a maximum rate at which POSTs can be sent (e.g., one per second or one per ten seconds, etc.)?
The Instagram API seems to lack detailed information about both how many updates are sent and what the rate limits are. From the API docs:
Limits
Be nice. If you're sending too many requests too quickly, we'll send back a 503 error code (server unavailable).
You are limited to 5000 requests per hour per access_token or client_id overall. Practically, this means you should (when possible) authenticate users so that limits are well outside the reach of a given user.
In other words, you'll need to check for a 503 and throttle your application accordingly. I haven't seen any information on how long they might block you, but it's best to avoid that completely. I would advise managing this with a rate-limiting mechanism in your own code, such as pushing your API requests through a queue with rate control. That also gives you the benefit of a retry if you're throttled, so you won't lose any of the updates.
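A minimal in-process sketch of such a queue. The 750 ms spacing keeps you at roughly 4,800 requests/hour, under the 5,000 cap; the `err.status` check is an assumption about your HTTP client:

```js
const queue = [];
let draining = false;

function enqueue(call) {
  queue.push(call);
  if (!draining) drain();
}

async function drain() {
  draining = true;
  while (queue.length) {
    const call = queue.shift();
    try {
      await call();
    } catch (err) {
      if (err.status === 503) {                     // throttled: back off, then retry
        await new Promise((r) => setTimeout(r, 5000));
        queue.unshift(call);
      }
    }
    await new Promise((r) => setTimeout(r, 750));   // ~4,800 requests/hour
  }
  draining = false;
}
```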
Moreover, a mechanism such as a queue is especially relevant for real-time updates because of the following from the API docs:
You should build your system to accept multiple update objects per payload - though often there will be only one included. Also, you should acknowledge the POST within a 2 second timeout--if you need to do more processing of the received information, you can do so in an asynchronous task.
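In Express, that acknowledgment pattern might look like this, handing work to the `enqueue` helper sketched above; the route and payload shape are assumptions:

```js
const express = require('express');
const app = express();
app.use(express.json());

app.post('/instagram/updates', (req, res) => {
  // Acknowledge immediately to stay inside the 2-second window.
  res.sendStatus(200);
  // The payload may contain multiple update objects; process them asynchronously.
  for (const update of req.body) {
    enqueue(() => handleUpdate(update)); // hypothetical per-update processing
  }
});

app.listen(3000);
```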
Regarding the number of updates, the API can send you one update or many. The problem is that you can absolutely murder your API quota, because as far as I have seen you can't batch calls to specific media items, at least not using the official Python or Ruby clients or the API console.
This means that if you receive 500 updates, whether as one request to your server or split across many, it doesn't matter: either way, you need to go and fetch those items. From what I observed in a real application, those fetches seemed to count against our quota, though the quota itself seemed to be consumed erratically. Sometimes we saw no calls consumed at all; other times the available calls dropped by far more than we actually made. My advice is to be conservative and treat the 5,000 as a best guess rather than an absolute. You can check the remaining calls by parsing one of the headers they send back.
Use common sense, don't be stupid, and a rate-limiting mechanism should keep you safe, with the added benefit of handling failures due to outages (these happen more often than you may think), network hiccups, and accidental rate limiting. You could try to be tricky and pool several API keys, but that is likely a violation of the TOS, and if they do anything by IP, you'd have to split the load across different machines with different IPs.
My final advice would be to restructure your application so it doesn't rely entirely on the subscription mechanism. It's less than reliable and very expensive API-wise. It's only truly useful if you just need to do something in your app that doesn't require calling back to Instagram, your number of items is small, or you can filter out the majority of items and only call back to Instagram when a specific business rule is matched.
Instead, you can do things like query the tag or the user (e.g. recent media) and scale it out that way. Normally this lets you grab 100 items with one request rather than 100 items with 100 requests. If you really want to be cute, you could at least merge the subscription notifications asynchronously and combine similar ones into a single batched request, bucketing duplicates by a shared characteristic such as the tag. Sort of like a map/reduce, but on a small data set. You could of course run an actual map/reduce over your own data from time to time as another way of keeping things async. Again, be careful not to thrash Instagram; use the batching to group your calls in a way that's useful to your app.
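A sketch of that merging step: buffer notifications for a short window, then make one recent-media query per distinct tag, reusing the `enqueue` helper from earlier. The field names and fetch helper are assumptions:

```js
const pendingTags = new Set();

// Called for every subscription notification received.
function onNotification(update) {
  pendingTags.add(update.object_id); // for tag subscriptions, assumed to be the tag name
}

// Every 10 seconds, collapse duplicates into one query per tag.
setInterval(() => {
  const tags = [...pendingTags];
  pendingTags.clear();
  for (const tag of tags) {
    enqueue(() => fetchRecentMediaForTag(tag)); // hypothetical: one call returns up to ~100 items
  }
}, 10000);
```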
Hope that helps.
