Call 5 APIs per minute in Node.js

I am new to Node.js. I want to limit my external API calls to 5 per minute. If I exceed 5 API calls per minute, I get the following error:
You have exceeded the maximum requests per minute.
This is my code. The tickerSymbol array passed to the scheduleTickerDetails function is large, with almost 100k elements in it.
public async scheduleTickerDetails(tickerSymbol: any) {
  for (let i = 0; i < tickerSymbol.length; i++) {
    if (i % 5 == 0) {
      await this.setTimeOutForSymbolAPICall(60000);
    }
    await axios.get('https://api.polygon.io/v1/meta/symbols/' + tickerSymbol[i] + '/company?apiKey=' + process.env.POLYGON_API_KEY)
      .then(async function (response: any) {
        console.log("Logo : " + response.data.logo + 'TICKER :' + tickerSymbol[i]);
        let logo = response.data.logo;
        if (await stockTickers.updateOne({ ticker: tickerSymbol[i] }, { $set: { "logo": logo } }))
          return true;
        else
          return false;
      })
      .catch(function (error: any) {
        console.log("Error from symbol service file : " + error + 'symbol:' + tickerSymbol[i]);
      });
  }
}
/**
 * Set time out for calling symbol API call
 * @param minute
 * @return Promise
 */
public setTimeOutForSymbolAPICall(minute: any) {
  return new Promise(resolve => {
    setTimeout(() => { resolve('') }, minute);
  })
}
I want to send the first 5 API calls, then after a minute send the next 5, and so on. I have created a setTimeOut function for this, but sometimes I still see the following in my console:
Error: Request failed with status code 429 : You've exceeded the maximum requests per minute.

The for loop in JS runs immediately to completion while all your asynchronous operations are started. Refer to this answer.
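One way to restructure the loop into explicit batches, sketched below with a simple sleep helper (the database update is left out for brevity, and error handling is reduced to a log):
import axios from 'axios';

// Simple sleep helper (assumption: a plain setTimeout wrapped in a promise).
const sleep = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms));

async function scheduleTickerDetails(tickerSymbols: string[]): Promise<void> {
  const batchSize = 5;
  for (let i = 0; i < tickerSymbols.length; i += batchSize) {
    const batch = tickerSymbols.slice(i, i + batchSize);
    // Fire the 5 requests of this batch together and wait for all of them.
    await Promise.all(batch.map(async symbol => {
      try {
        const response = await axios.get(
          'https://api.polygon.io/v1/meta/symbols/' + symbol + '/company?apiKey=' + process.env.POLYGON_API_KEY
        );
        console.log('Logo: ' + response.data.logo + ' TICKER: ' + symbol);
        // ...update the stockTickers document here, as in the original code
      } catch (error) {
        console.log('Error from symbol service file: ' + error + ' symbol: ' + symbol);
      }
    }));
    // Wait a full minute before the next batch (no wait needed after the last one).
    if (i + batchSize < tickerSymbols.length) {
      await sleep(60000);
    }
  }
}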

Related

RxJS: Parallel http call to paginated API using NestJS HttpService

I'm using NestJS, and this is my current implementation to make parallel HTTP requests:
@Injectable()
export class AppService {
  constructor(private readonly http: HttpService) {}

  private fetchData(index: number) {
    Logger.log(`Collect index: ${index}`);
    return this.http
      .get<Passenger>(
        `https://api.instantwebtools.net/v1/passenger?page=${index}&size=100`,
        { validateStatus: null },
      )
      .pipe(concatMap(response => of(response.data)));
  }

  async getAllData() {
    let a = 0;
    const collect: Passenger[] = [];
    const $subject = new BehaviorSubject(a);
    await $subject
      .pipe(
        flatMap(index =>
          forkJoin([
            this.fetchData(index),
            this.fetchData(index + 1),
            this.fetchData(index + 2),
          ]).pipe(mergeAll()),
        ),
        tap(data => {
          collect.push(data);
          if (data?.data?.length === 0) {
            $subject.complete(); // stop the stream
          } else {
            a += 3; // increment by 3, because the request is 3 times at a time
            $subject.next(a);
          }
        }),
      )
      .toPromise();
    return collect;
  }
}
This service collects 3rd-party data. Currently, the fetchData() function is called multiple times according to how many parallel requests I want at a time. I use a dummy API for the test, but in the real scenario the API endpoint's page size limit is 100 and it doesn't return any meta information about the total number of pages. It just returns empty data when the last page is reached.
The goal is to make the requests in parallel and combine the results at the end. I'm doing this to keep the total request time to a minimum, and because the API itself has a rate limit of 50 requests per second. How can I optimize this code?
To fetch all pages in one go you can use expand to recursively subscribe to an observable that fetches some pages. End the recursion by returning EMPTY when the last page you received is empty.
function fetchAllPages(batchSize: number = 3): Observable<any[][]> {
  let index = 0;
  return fetchPages(index, batchSize).pipe(
    // if the last page isn't empty fetch the next pages, otherwise end the recursion
    expand(pages => pages[pages.length - 1].length > 0
      ? fetchPages((index += batchSize), batchSize)
      : EMPTY
    ),
    // accumulate all pages in one array, filter out any trailing empty pages
    reduce((acc, curr) => acc.concat(curr.filter(page => page.length)), [])
  );
}

// fetch a given number of pages starting from 'index' as parallel requests
function fetchPages(index: number, numberOfPages: number): Observable<any[][]> {
  const requests = Array.from({ length: numberOfPages }, (_, i) =>
    fetchData(index + i)
  );
  return forkJoin(requests);
}
https://stackblitz.com/edit/rxjs-vkad5h?file=index.ts
This will obviously send a few unnecessary requests in the last batch if
(totalNumberOfPages + 1) % batchSize != 0.
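If you also need to respect the 50-requests-per-second limit mentioned in the question, one option is to pause briefly between batches. A sketch reusing the fetchPages helper above (the pauseMs value is an arbitrary assumption, not a tuned figure):
import { EMPTY, Observable, timer } from 'rxjs';
import { expand, reduce, switchMap } from 'rxjs/operators';

// Sketch: same recursion as fetchAllPages above, but each follow-up batch is
// delayed by pauseMs so the requests cannot exceed the API's rate limit.
function fetchAllPagesThrottled(batchSize: number = 3, pauseMs: number = 100): Observable<any[][]> {
  let index = 0;
  return fetchPages(index, batchSize).pipe(
    expand(pages => pages[pages.length - 1].length > 0
      ? timer(pauseMs).pipe(switchMap(() => fetchPages((index += batchSize), batchSize)))
      : EMPTY
    ),
    reduce((acc, curr) => acc.concat(curr.filter(page => page.length)), [] as any[][])
  );
}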

Firebase Nodejs : DEADLINE_EXCEEDED: Deadline exceeded on set() [duplicate]

I took one of the sample functions from the Firestore documentation and was able to successfully run it from my local firebase environment. However, once I deployed to my firebase server, the function completes, but no entries are made in the firestore database. The firebase function logs show "Deadline Exceeded." I'm a bit baffled. Anyone know why this is happening and how to resolve this?
Here is the sample function:
exports.testingFunction = functions.https.onRequest((request, response) => {
  var data = {
    name: 'Los Angeles',
    state: 'CA',
    country: 'USA'
  };

  // Add a new document in collection "cities" with ID 'LA'
  var db = admin.firestore();
  var setDoc = db.collection('cities').doc('LA').set(data);

  response.status(200).send();
});
Firestore has limits, and the "Deadline Exceeded" error probably happens because of them.
See https://firebase.google.com/docs/firestore/quotas; for example, the maximum write rate to a single document is 1 per second.
See also this discussion: https://groups.google.com/forum/#!msg/google-cloud-firestore-discuss/tGaZpTWQ7tQ/NdaDGRAzBgAJ
In my own experience, this problem can also happen when you try to write documents using a bad internet connection.
I use a solution similar to Jurgen's suggestion to insert documents in batches smaller than 500 at once, and this error appears if I'm using an unstable wifi connection. When I plug in the cable, the same script with the same data runs without errors.
I have written this little script which uses batch writes (max 500) and only writes one batch after the other.
Use it by first creating a batch worker: let batch: any = new FbBatchWorker(db);
Then add anything to the worker with batch.set(ref.doc(docId), MyObject); and finish it via batch.commit().
The API is the same as for the normal Firestore batch (https://firebase.google.com/docs/firestore/manage-data/transactions#batched-writes). However, currently it only supports set.
import { firestore } from "firebase-admin";

class FBWorker {
  callback: Function;

  constructor(callback: Function) {
    this.callback = callback;
  }

  work(data: {
    type: "SET" | "DELETE";
    ref: FirebaseFirestore.DocumentReference;
    data?: any;
    options?: FirebaseFirestore.SetOptions;
  }) {
    if (data.type === "SET") {
      // tslint:disable-next-line: no-floating-promises
      data.ref.set(data.data, data.options).then(() => {
        this.callback();
      });
    } else if (data.type === "DELETE") {
      // tslint:disable-next-line: no-floating-promises
      data.ref.delete().then(() => {
        this.callback();
      });
    } else {
      this.callback();
    }
  }
}

export class FbBatchWorker {
  db: firestore.Firestore;
  batchList2: {
    type: "SET" | "DELETE";
    ref: FirebaseFirestore.DocumentReference;
    data?: any;
    options?: FirebaseFirestore.SetOptions;
  }[] = [];
  elemCount: number = 0;
  private _maxBatchSize: number = 490;

  public get maxBatchSize(): number {
    return this._maxBatchSize;
  }

  public set maxBatchSize(size: number) {
    if (size < 1) {
      throw new Error("Size must be positive");
    }
    if (size > 490) {
      throw new Error("Size must not be larger than 490");
    }
    this._maxBatchSize = size;
  }

  constructor(db: firestore.Firestore) {
    this.db = db;
  }

  async commit(): Promise<any> {
    const workerProms: Promise<any>[] = [];
    const maxWorker = this.batchList2.length > this.maxBatchSize ? this.maxBatchSize : this.batchList2.length;
    for (let w = 0; w < maxWorker; w++) {
      workerProms.push(
        new Promise((resolve) => {
          const A = new FBWorker(() => {
            if (this.batchList2.length > 0) {
              A.work(this.batchList2.pop());
            } else {
              resolve();
            }
          });
          // tslint:disable-next-line: no-floating-promises
          A.work(this.batchList2.pop());
        }),
      );
    }
    return Promise.all(workerProms);
  }

  set(dbref: FirebaseFirestore.DocumentReference, data: any, options?: FirebaseFirestore.SetOptions): void {
    this.batchList2.push({
      type: "SET",
      ref: dbref,
      data,
      options,
    });
  }

  delete(dbref: FirebaseFirestore.DocumentReference) {
    this.batchList2.push({
      type: "DELETE",
      ref: dbref,
    });
  }
}
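As a rough usage sketch of the FbBatchWorker defined above (the collection and document names here are just placeholders), queue a few operations on the worker and commit them:
// Hypothetical usage of the FbBatchWorker defined above; collection and
// document names are placeholders.
async function writeCities(db: firestore.Firestore) {
  const batch = new FbBatchWorker(db);
  batch.set(db.collection("cities").doc("LA"), { name: "Los Angeles", state: "CA" });
  batch.set(db.collection("cities").doc("DC"), { name: "Washington", state: "DC" });
  batch.delete(db.collection("cities").doc("outdated-doc"));
  await batch.commit();
}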
I tested this by having 15 concurrent AWS Lambda functions writing 10,000 requests into the database into different collections / documents, milliseconds apart. I did not get the DEADLINE_EXCEEDED error.
Please see the Firebase documentation:
'deadline-exceeded': Deadline expired before operation could complete. For operations that change the state of the system, this error may be returned even if the operation has completed successfully. For example, a successful response from a server could have been delayed long enough for the deadline to expire.
In our case we are writing a small amount of data and it works most of the time, but losing data is unacceptable. I have not concluded why Firestore fails to write simple small bits of data.
SOLUTION:
I am using an AWS Lambda function that uses an SQS event trigger.
# This function receives requests from the queue and handles them
# by persisting the survey answers for the respective users.
QuizAnswerQueueReceiver:
  handler: app/lambdas/quizAnswerQueueReceiver.handler
  timeout: 180 # The SQS visibility timeout should always be greater than the Lambda function’s timeout.
  reservedConcurrency: 1 # optional, reserved concurrency limit for this function. By default, AWS uses account concurrency limit
  events:
    - sqs:
        batchSize: 10 # Wait for 10 messages before processing.
        maximumBatchingWindow: 60 # The maximum amount of time in seconds to gather records before invoking the function
        arn:
          Fn::GetAtt:
            - SurveyAnswerReceiverQueue
            - Arn
  environment:
    NODE_ENV: ${self:custom.myStage}
I am using a dead letter queue connected to my main queue for failed events.
Resources:
  QuizAnswerReceiverQueue:
    Type: AWS::SQS::Queue
    Properties:
      QueueName: ${self:provider.environment.QUIZ_ANSWER_RECEIVER_QUEUE}
      # VisibilityTimeout MUST be greater than the lambda function's timeout https://lumigo.io/blog/sqs-and-lambda-the-missing-guide-on-failure-modes/
      # The length of time during which a message will be unavailable after a message is delivered from the queue.
      # This blocks other components from receiving the same message and gives the initial component time to process and delete the message from the queue.
      VisibilityTimeout: 900 # The SQS visibility timeout should always be greater than the Lambda function’s timeout.
      # The number of seconds that Amazon SQS retains a message. You can specify an integer value from 60 seconds (1 minute) to 1,209,600 seconds (14 days).
      MessageRetentionPeriod: 345600 # The number of seconds that Amazon SQS retains a message.
      RedrivePolicy:
        deadLetterTargetArn:
          "Fn::GetAtt":
            - QuizAnswerReceiverQueueDLQ
            - Arn
        maxReceiveCount: 5 # The number of times a message is delivered to the source queue before being moved to the dead-letter queue.
  QuizAnswerReceiverQueueDLQ:
    Type: "AWS::SQS::Queue"
    Properties:
      QueueName: "${self:provider.environment.QUIZ_ANSWER_RECEIVER_QUEUE}DLQ"
      MessageRetentionPeriod: 1209600 # 14 days in seconds
If the error is generated after around 10 seconds, it's probably not your internet connection; it might be that your functions are not returning any promise. In my experience, I got the error simply because I had wrapped a Firebase set operation (which returns a promise) inside another promise.
You can do this
return db.collection("COL_NAME").doc("DOC_NAME").set(attribs).then(ref => {
  var SuccessResponse = {
    "code": "200"
  }
  var resp = JSON.stringify(SuccessResponse);
  return resp;
}).catch(err => {
  console.log('Quiz Error OCCURED ', err);
  var FailureResponse = {
    "code": "400",
  }
  var resp = JSON.stringify(FailureResponse);
  return resp;
});
instead of
return new Promise((resolve, reject) => {
  db.collection("COL_NAME").doc("DOC_NAME").set(attribs).then(ref => {
    var SuccessResponse = {
      "code": "200"
    }
    var resp = JSON.stringify(SuccessResponse);
    return resp;
  }).catch(err => {
    console.log('Quiz Error OCCURED ', err);
    var FailureResponse = {
      "code": "400",
    }
    var resp = JSON.stringify(FailureResponse);
    return resp;
  });
});

Set the delay dynamically on retryWhen

Is it possible to set the delay value dynamically after every retry? I tried it like this, but it looks like it keeps the value that was set initially.
imageController(epgData: EpgDataDTO[], showOrMovie: string) {
  var retryAfterMilliSeconds = 1000;
  epgData.forEach((data) => {
    this.getImagesFromMovieDB(data.title).pipe(
      retryWhen((error) => {
        return error.pipe(
          mergeMap((error: any) => {
            if (error.response.status === 429) {
              const retryAfter = error.response.headers;
              retryAfterMilliSeconds = +retryAfter['retry-after'] * 1000
              console.log(retryAfterMilliSeconds); // it tells me the correct value here but the retry happens every 1000ms
              console.log(data.title);
            } else {
              this.errorHandling(error)
              return of("error");
            }
            return of("error");
          }),
          delay(retryAfterMilliSeconds),
          take(5)
        )
      })
    )
    .subscribe((res) => {
      console.log(res.status);
      console.log(res.headers);
    });
  })
}
You were VERY close! To get this to work, all I had to do was move the delay(retryAfterMilliSeconds) onto the observable returned from inside the mergeMap() operator, so the delay is tied to that same observable. Without this, the delay was applied to the output of mergeMap() instead, so which Observable actually got delayed was effectively random.
I put this up in a Stackblitz to test it. Click on 'Console' at the bottom of the far right frame to see results.
Here is the function from that StackBlitz:
imageController(epgData: EpgDataDTO[], showOrMovie: string) {
  var retryAfterMilliSeconds = 1000;
  epgData.forEach((data) => {
    this.getImagesFromMovieDB(data.title).pipe(
      retryWhen((error) => {
        return error.pipe(
          mergeMap((error: any) => {
            if (error.response.status === 429) {
              const retryAfter = error.response.headers;
              retryAfterMilliSeconds = +retryAfter['retry-after'] * 1000
              console.log(retryAfterMilliSeconds); // it tells me the correct value here but the retry happens every 1000ms
              console.log(data.title);
            } else {
              this.errorHandling(error)
              // return of("error"); <-- unnecessary since this will be executed with next statement
            }
            return of("error").pipe(delay(retryAfterMilliSeconds));
          }),
          // delay(retryAfterMilliSeconds),
          take(5)
        )
      })
    )
    .subscribe(
      (res) => {
        // console.log(res.status);
        // console.log(res.headers);
        const elapsedTime = Math.round(((new Date()).getTime() - startTime) / 1000);
        console.log(`'${res.status}' is Ok - total elapsed time ${elapsedTime} seconds`);
      }
    );
  })
}
Some other notes:
The return from getImagesFromMovieDB() is actually important: it needs to return a unique observable for every call for this to work, so please ensure this is the case. I simulated this in the StackBlitz by constructing the returned Observable with a delay.
As you can see, I changed the first function inside the .subscribe() to print the total elapsed time taken to get valid data for each res.status. I did this only to show that each emission correctly accumulates the sum of all delays.
Each retry waits a random amount of time (I arbitrarily chose between 5 and 10 seconds), standing in for the retry-after value returned by the response header in your original function.
Minor point: you had two return of("error") statements inside mergeMap, but the first is unnecessary since the second is executed immediately afterwards, so I commented that one out.
I hope this helps.

Node.js - Limiting the number of requests I make

I have a Node app in which there is a Gremlin client:
var Gremlin = require('gremlin');

const client = Gremlin.createClient(
  443,
  config.endpoint,
  {
    "session": false,
    "ssl": true,
    "user": `/dbs/${config.database}/colls/${config.collection}`,
    "password": config.primaryKey
  }
);
With this client I then make calls to a Cosmos DB to add some records, using:
async.forEach(pData, function (data, innercallback) {
  if (data.type == 'Full') {
    client.execute("g.addV('test').property('id', \"" + data.$.id + "\")", {}, innercallback);
  } else {
    innercallback(null);
  }
}, outercallback);
However on my Azure side there is a limit of 400 requests / second and subsequently I get the error:
ExceptionType : RequestRateTooLargeException
ExceptionMessage : Message: {"Errors":["Request rate is large"]}
Does anyone have any ideas on how I can restrict the number of requests made per second, without having to scale up on Azure (as that costs more :) )
Additionally:
I tried using
async.forEachLimit(pData, 400, function (data, innercallback) {
  if (data.type == 'Full') {
    client.execute("g.addV('test').property('id', \"" + data.$.id + "\")", {}, innercallback);
  } else {
    innercallback(null);
  }
}, outercallback);
However, I keep seeing RangeError: Maximum call stack size exceeded if the limit is too high; otherwise, if I reduce it, I just get the same request-rate-too-large exception.
Thanks.
RangeError: Maximum call stack size exceeded
That might happen because innercallback is called synchronously in the else case. It should be:
} else {
  process.nextTick(function() {
    innercallback(null);
  });
}
The call to forEachLimit looks generally correct, but you need to make sure that when a request is really fired (the if block), innercallback is not called earlier than 1 second later, to guarantee that no more than 400 requests are fired in any one second. The easiest approach is to delay the callback execution by exactly 1 second:
client.execute("g.addV('test').property('id', \"" + data.$.id + "\")", {},
  function(err) {
    setTimeout(function() { innercallback(err); }, 1000);
  });
The more accurate solution would be to measure the actual request + response time and call setTimeout only for the time remaining until the full second has elapsed.
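A sketch of that more accurate pacing, reusing the client and callback from the snippet above (the helper name is made up):
// Sketch (helper name is hypothetical): measure how long the request actually
// took and only delay the callback for whatever is left of the one-second window.
function executeWithPacing(data: any, innercallback: (err?: any) => void) {
  const started = Date.now();
  client.execute("g.addV('test').property('id', \"" + data.$.id + "\")", {},
    function (err: any) {
      const elapsed = Date.now() - started;
      const remaining = Math.max(0, 1000 - elapsed);
      setTimeout(function () { innercallback(err); }, remaining);
    });
}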
As a further improvement, it looks like you can filter your pData array before doing the async work to get rid of the if...else, so eventually:
var pDataFull = pData.filter(function(data) {
  return data.type == 'Full';
});

async.forEachLimit(pDataFull, 400, function (data, innercallback) {
  client.execute("g.addV('test').property('id', \"" + data.$.id + "\")", {},
    function(err) {
      setTimeout(function() { innercallback(err); }, 1000);
    }
  );
}, outercallback);
Let's clear up something first. You don't have a 400 requests/second collection but a 400 RU/s collection. RU stands for request unit, and one RU does not translate to one request.
Roughly:
Retrieving a document that's 1KB in size costs about 1 RU.
Modifying a document that's 1KB in size costs about 5 RU.
Assuming your documents are 1KB in size, you can only add about 80 documents per second (400 RU/s divided by 5 RU per write).
Now that we have that out of the way, it sounds like async.queue() can do the trick for you.
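A rough sketch of how async.queue() could be used here, with the pacing derived from the RU estimate above (the exact delay is an assumption, not a verified figure):
import * as async from 'async';

// Sketch: an async.queue with concurrency 1; each worker waits a short time
// after its request so throughput stays near 80 writes/s
// (400 RU/s budget divided by ~5 RU per 1KB write, i.e. one write every ~12.5 ms).
const writeQueue = async.queue(function (data: any, done: (err?: any) => void) {
  client.execute("g.addV('test').property('id', \"" + data.$.id + "\")", {},
    function (err: any) {
      setTimeout(function () { done(err); }, 13); // pacing delay, adjust as needed
    });
}, 1);

pData.filter(function (d: any) { return d.type == 'Full'; })
  .forEach(function (d: any) { writeQueue.push(d); });

// async v3 style: drain() takes a callback invoked when the queue empties
writeQueue.drain(function () { console.log('all writes finished'); });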

How to stop a RethinkDB database call if it cannot find a result?

I run a RethinkDB command with a Node.js (Babel) asynchronous call:
let user = await r.table('users').filter({key: key}).limit(1).run();
How can I stop the asynchronous call if the database cannot find a result?
Using the await keyword means that Node will wait for the asynchronous r.table(...) command to return before continuing to the next line of code, so it behaves logically as if it were synchronous code.
Your specific command should return when RethinkDB finds the first 'user' document with the specified key. There is no need to "stop" it if it cannot find a result; it will stop as soon as it (a) finds a result or (b) finishes scanning the entire table.
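If the goal is simply to handle the "no result" case, a small sketch along these lines may help (assuming a driver, such as rethinkdbdash, where run() resolves directly to an array):
// Sketch: filter().limit(1) yields at most one element, so an empty array means "not found".
const users = await r.table('users').filter({ key: key }).limit(1).run();

if (users.length === 0) {
  // No user with that key; handle the "not found" case here instead of
  // trying to cancel the query.
  console.log('no user found for key', key);
} else {
  const user = users[0];
  // ...continue working with the matched user document
}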
In general, "stopping" asynchronous code in Node/JavaScript is not possible, but you can limit the amount of time you'll wait for an async method. Here is an example using the Promise.race() function.
/*
 * toy async function
 *
 * returns a promise that resolves to the specified number `n`
 * after the specified number of seconds `s` (default 2)
 */
const later = (n, s = 2) => {
  return new Promise(resolve => {
    setTimeout(() => resolve(n), s * 1000);
  })
}

/*
 * returns a promise that rejects with `TIMEOUT_ERROR` after the
 * specified number of seconds `s`
 */
const timeout = (s) => {
  return new Promise((resolve, reject) => {
    setTimeout(() => reject("TIMEOUT_ERROR"), s * 1000)
  })
}

/*
 * Example 1: later finishes before timeout
 * later resolves after 1 second, timeout function rejects after 3 seconds
 * so we end up in the `.then` block with `val == 42`
 */
Promise.race([later(42, 1), timeout(3)])
  .then(val => {
    // do something with val...
    console.log(val)
  }).catch(err => {
    if (err === "TIMEOUT_ERROR") {
      console.log("we timed out!")
    } else {
      console.log("something failed (but it was not a timeout)")
    }
  });

/*
 * Example 2 (using async/await syntax, inside an async function): we time out
 * before later returns.
 * later resolves after 3 seconds, timeout function rejects after 2 seconds
 * so we end up in the `.catch` block with `err == "TIMEOUT_ERROR"`
 */
try {
  const val = await Promise.race([later(11, 3), timeout(2)]);
  // do something with val...
} catch (err) {
  if (err === "TIMEOUT_ERROR") {
    console.error("we timed out!")
  } else {
    console.error("something failed (but it was not a timeout)")
  }
}
