NodeJS - While loop until certain condition is met or certain time is passed

I've seen some very similar questions and answers, but none describing exactly what I would like to achieve. Some background: this is a multi-step provisioning flow. In short, this is the goal:
1. POST an action.
2. GET status based on one variable submitted above. If response == "done" then proceed. Returns an ID.
3. POST an action. Returns an ID.
4. GET status based on ID returned above. If response == "done" then proceed. Returns an ID.
5. (..)
I think there are 6 or 7 steps in total.
The first question is: are there any modules that could help me achieve this? The only requirement is that each attempt to get the status should be separated by a fixed delay, and that the whole flow should expire and be marked as failed after a fixed amount of time.
The best I could come up with is this, taking step 2 as an example:
GetNewDeviceId : function(req, res) {
const delay = ms => new Promise((resolve, reject) => setTimeout(resolve, ms));
var ip = req;
async function main() {
let response;
while (true) {
try {
response = await service.GetNewDeviceId(ip);
console.log("Running again for: " + ip + " - " + response)
if (response["value"] != null) {
break;
}
} catch {
// In case it fails
}
console.log("Delaying for: " + ip)
await delay(30000);
}
//Call next step
console.log("Moving on for: "+ ip)
}
main();
}
This raises a couple of questions:
I'm not sure this is indeed the best/cleanest way.
How can I set a global timeout, let's say 30 minutes, forcing it to step out of the loop and call a "failure" function?
The other thing I'm not sure about (NodeJS newbie here): assuming this gets called, let's say, 4 times with different IPs before any of those 4 calls has finished, NodeJS will run each call in its own context, right? I quickly tested this and it seems so.

I'm not sure this is indeed the best/clean way.
I am unsure whether your function GetNewDeviceId involves recursion, that is, whether it invokes itself as service.GetNewDeviceId. That would not make sense; service.GetNewDeviceId should perform a GET request, right? If that is the case, your function seems clean to me.
How can I set a global timeout, let's say 30 minutes, forcing it to step out of the loop and call a "failure" function.
let response;
let failAt = new Date().getTime() + 30 * 60 * 1000; // 30 minutes after now
while (true) {
if (new Date().getTime() >= failAt)
return res.status(500).send("Failure");
try {...}
...
await delay(30000);
}
The other thing I'm not sure about (NodeJS newbie here): assuming this gets called, let's say, 4 times with different IPs before any of those 4 calls has finished, NodeJS will run each call in its own context, right?
Yes. Each invocation of the function GetNewDeviceId creates a new execution context with its own copies of the parameters req and res and of the local variables. The inner function main captures them in a closure, so concurrent calls do not share ip, response, or failAt.
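The delay and the overall deadline can also be folded into one reusable helper. The following is only a minimal sketch, not the method from the answer above: pollUntil, check, intervalMs, and timeoutMs are hypothetical names, and it assumes a failed attempt should be treated as "not ready yet".

```javascript
// Resolve after `ms` milliseconds without blocking the event loop.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Call `check()` repeatedly until it yields a non-null value or the overall
// deadline passes. All names here are hypothetical, chosen for this sketch.
async function pollUntil(check, { intervalMs, timeoutMs }) {
  const failAt = Date.now() + timeoutMs;
  while (true) {
    try {
      const value = await check();
      if (value != null) return value; // condition met
    } catch (err) {
      // Assumption: a failed attempt counts as "not ready yet".
    }
    // Give up if the next attempt would start after the deadline.
    if (Date.now() + intervalMs > failAt) {
      throw new Error('Timed out waiting for condition');
    }
    await delay(intervalMs);
  }
}

// Usage (hypothetical): poll every 30 s, fail after 30 min.
// pollUntil(() => service.GetNewDeviceId(ip).then((r) => r.value),
//           { intervalMs: 30000, timeoutMs: 30 * 60 * 1000 })
//   .then(nextStep, onFailure);
```

Because each call to pollUntil has its own failAt, several flows for different IPs can run concurrently without interfering with each other.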

Related

Is there a way to intercept multiple requests linked by an "OR" conditional operator?

I am a beginner with cypress. I've been looking for a way to intercept API calls to at least one of multiple URLs.
Let's say a button is clicked and something like this code is executed to check if a list of requests were called :
cy.get('@request1').should('have.been.called').log(`Request was made to 'REQUEST1_URL'`)
OR
cy.get('@request2').should('have.been.called').log(`Request was made to 'REQUEST2_URL'`)
I want to check if a request was sent to one url or the other, or both.
Has anyone encountered this problem before ? Any contribution is appreciated.
Thanks.

The URL you use in the intercept should be general enough to catch both calls.
For example if the calls have /api/ in common, this catches both
cy.intercept('**/api/*') // note wildcards in the URL
.as('apiRequest')
cy.visit('/')
cy.wait('@apiRequest')
If you have more paths in the url than you need to catch, for example /api/dogs/ /api/cats/ and /api/pigs/, then use a function to weed out the ones you want
cy.intercept('**/api/*', (req) => {
if (req.url.includes('dogs') || req.url.includes('cats')) { // no pigs
req.alias = 'dogsOrCats' // set alias
}
})
cy.visit('/')
cy.wait('@dogsOrCats')
Catching 0, 1, or 2 URLs
This is a bit tricky: if the number of calls isn't known, then you have to know within what time frame they would be made.
To catch requests which are fired fairly quickly by the app:
let count = 0;
cy.intercept('**/api/*', (req) => {
count = count +1;
})
cy.visit('/')
cy.wait(3000) // wait to see if calls are fired
cy.then(() => {
cy.wrap(count).should('be.gt', 0) // 0 calls fails, 1 or 2 passes
})

Do not process next job until previous job is completed (BullJS/Redis)?

Basically, each of the clients ---that have a clientId associated with them--- can push messages and it is important that a second message from the same client isn't processed until the first one is finished processing (Even though the client can send multiple messages in a row, and they are ordered, and multiple clients sending messages should ideally not interfere with each other). And, importantly, a job shouldn't be processed twice.
I thought that using Redis I might be able to fix this issue, I started with some quick prototyping using the bull library, but I am clearly not doing it well, I was hoping someone would know how to proceed.
This is what I tried so far:
Create jobs and add them to the same queue name for one process, using the clientId as the job name.
Consume jobs in 2 separate processes, each waiting a large random amount of time per job.
I tried adding the default locking provided by the library that I am using (bull) but it locks on the jobId, which is unique for each job, not on the clientId .
What I would want to happen:
One of the consumers can't take the job from the same clientId until the previous one is finished processing it.
They should be able to, however, get items from different clientIds in parallel without problem (asynchronously). (I haven't gotten this far, I am right now simply dealing with only one clientId)
What I get:
Both consumers consume as many items as they can from the queue without waiting for the previous item for the clientId to be completed.
Is Redis even the right tool for this job?
Example code
// ./setup.ts
import Queue from 'bull';
import * as uuid from 'uuid';
// Check that when a message is taken from a place, no other message is taken
// TO do that test, have two processes that process messages and one that sets messages, and make the job take a long time
// queue for each room https://stackoverflow.com/questions/54178462/how-does-redis-pubsub-subscribe-mechanism-works/54243792#54243792
// https://groups.google.com/forum/#!topic/redis-db/R09u__3Jzfk
// Make a job not be called stalled, waiting enough time https://github.com/OptimalBits/bull/issues/210#issuecomment-190818353
export async function sleep(ms: number): Promise<void> {
return new Promise((resolve) => {
setTimeout(resolve, ms);
});
}
export interface JobData {
id: string;
v: number;
}
export const queue = new Queue<JobData>('messages', 'redis://127.0.0.1:6379');
queue.on('error', (err) => {
console.error('Uncaught error on queue.', err);
process.exit(1);
});
export function clientId(): string {
return uuid.v4();
}
export function randomWait(minms: number, maxms: number): Promise<void> {
const ms = Math.random() * (maxms - minms) + minms;
return sleep(ms);
}
// Make a job not be called stalled, waiting enough time https://github.com/OptimalBits/bull/issues/210#issuecomment-190818353
// eslint-disable-next-line @typescript-eslint/ban-ts-comment
// @ts-ignore
queue.LOCK_RENEW_TIME = 5 * 60 * 1000;
// ./create.ts
import { queue, randomWait } from './setup';
const MIN_WAIT = 300;
const MAX_WAIT = 1500;
async function createJobs(n = 10): Promise<void> {
await randomWait(MIN_WAIT, MAX_WAIT);
// always same Id
const clientId = Math.random() > 1 ? 'zero' : 'one';
for (let index = 0; index < n; index++) {
await randomWait(MIN_WAIT, MAX_WAIT);
const job = { id: clientId, v: index };
await queue.add(clientId, job).catch(console.error);
console.log('Added job', job);
}
}
export async function create(nIds = 10, nItems = 10): Promise<void> {
const jobs = [];
await randomWait(MIN_WAIT, MAX_WAIT);
for (let index = 0; index < nIds; index++) {
await randomWait(MIN_WAIT, MAX_WAIT);
jobs.push(createJobs(nItems));
await randomWait(MIN_WAIT, MAX_WAIT);
}
await randomWait(MIN_WAIT, MAX_WAIT);
await Promise.all(jobs)
process.exit();
}
(function mainCreate(): void {
create().catch((err) => {
console.error(err);
process.exit(1);
});
})();
// ./consume.ts
import { queue, randomWait, clientId } from './setup';
function startProcessor(minWait = 5000, maxWait = 10000): void {
queue
.process('*', 100, async (job) => {
console.log('LOCKING: ', job.lockKey());
await job.takeLock();
const name = job.name;
const processingId = clientId().split('-', 1)[0];
try {
console.log('START: ', processingId, '\tjobName:', name);
await randomWait(minWait, maxWait);
const data = job.data;
console.log('PROCESSING: ', processingId, '\tjobName:', name, '\tdata:', data);
await randomWait(minWait, maxWait);
console.log('PROCESSED: ', processingId, '\tjobName:', name, '\tdata:', data);
await randomWait(minWait, maxWait);
console.log('FINISHED: ', processingId, '\tjobName:', name, '\tdata:', data);
} catch (err) {
console.error(err);
} finally {
await job.releaseLock();
}
})
.catch(console.error); // Catches initialization
}
startProcessor();
This is run using 3 different processes, which you might call like this (Although I use different tabs for a clearer view of what is happening)
npx ts-node consume.ts &
npx ts-node consume.ts &
npx ts-node create.ts &

I'm not familiar with node.js, but for Redis I would try this.
Let's say you have client_1, client_2, they are all publisher of events.
You have three machines, consumer_1,consumer_2, consumer_3.
Establish a list of tasks in redis, eg, JOB_LIST.
Clients put (LPUSH) jobs into this JOB_LIST in a specific form, like "CLIENT_1:[jobcontent]", "CLIENT_2:[jobcontent]".
Each consumer takes jobs out with a blocking pop (the Redis BRPOP command; plain RPOP does not block) and processes them.
For example, consumer_1 takes out a job whose content is CLIENT_1:[jobcontent]. It parses the content and recognizes that it is from CLIENT_1. It then checks whether some other consumer is already processing CLIENT_1; if not, it locks the key to indicate that it is processing CLIENT_1.
To do so, it sets a key "CLIENT_1_PROCESSING" with the content "consumer_1", using the Redis SETNX command (set if the key does not exist), with an appropriate expiry. SETNX itself takes no timeout, so either follow it with EXPIRE or use SET with the NX and EX options, which does both atomically. For example, if the task normally takes one minute to finish, set the key to expire after five minutes, in case consumer_1 crashes and would otherwise hold the lock indefinitely.
If SETNX returns 0, it failed to acquire the lock for CLIENT_1 (someone is already processing a job of client_1). It then returns the job (the value "CLIENT_1:[jobcontent]") to the left side of JOB_LIST using the Redis LPUSH command. Then it might wait a bit (sleep a few seconds) and pop another task from the right side of the list. If this time SETNX returns 1, consumer_1 has acquired the lock. It processes the job and, once finished, deletes the key "CLIENT_1_PROCESSING", releasing the lock. Then it pops another job, and so on.
Some things to consider:
The JOB_LIST is not fair, e.g. earlier jobs might be processed later.
The locking part is a bit rudimentary, but will suffice.
----------update--------------
I've figured another way to keep tasks in order.
For each client(producer), build a list. Like "client_1_list", push jobs into the left side of the list.
Save all the client names in a list "client_names_list", with values "client_1", "client_2", etc.
Each consumer (processor) iterates over "client_names_list". For example, consumer_1 gets "client_1" and checks whether the key of client_1 is locked (someone is already processing a task of client_1). If not, it right-pops a value (job) from client_1_list and locks client_1. If client_1 is locked, it (probably after sleeping one second) moves on to the next client, for example "client_2", checks its key, and so on.
This way, the tasks of each client (task producer) are processed in the order in which they were added.
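The per-client lists described in the update can be sketched as follows. This is an illustration only: plain in-memory structures stand in for the Redis lists and the lock key, and clientLists, locks, pushJob, takeNextJob, and releaseClient are hypothetical names. In a real multi-process setup, the "check lock, then lock" step would have to be atomic in Redis (for example SET with the NX and EX options), and the lock would need an expiry.

```javascript
// In-memory stand-ins for the Redis structures (assumption for this sketch).
const clientLists = new Map(); // client name -> pending jobs, oldest first
const locks = new Set();       // client names currently being processed

// Producer side: append a job to the list of its client.
function pushJob(client, job) {
  if (!clientLists.has(client)) clientLists.set(client, []);
  clientLists.get(client).push(job);
}

// Consumer side: take the oldest job of the first unlocked client that has
// work, and lock that client. Returns null if nothing can be taken now.
function takeNextJob() {
  for (const [client, jobs] of clientLists) {
    if (locks.has(client) || jobs.length === 0) continue;
    locks.add(client);
    return { client, job: jobs.shift() };
  }
  return null;
}

// Call this once the job has been processed, so the client's next job can run.
function releaseClient(client) {
  locks.delete(client);
}
```

A consumer loop would call takeNextJob(), process the job, and then call releaseClient(client); while a client is locked, its remaining jobs stay queued in order.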

EDIT: I found the reason BullJS starts jobs in parallel on one processor: we were using named jobs and defining many named process functions on one queue/processor. The default concurrency factor for a queue/processor is 1, so the queue should not process any jobs in parallel.
The problem with this setup is that if you define many (named) process handlers on one queue, the concurrency is added up with each process-handler function: if you define three named process handlers, you get a concurrency factor of 3 for that queue across all the defined named jobs.
So define just one named job per queue for queues where parallel processing should not happen and all jobs should run sequentially, one after the other.
That could be important, e.g., when pushing a high number of jobs onto the queue and the processing involves API calls that would give errors if handled in parallel.
The following text is my first attempt at answering the OP's question and describes only a workaround to the problem. So better just go with my edit :) and configure your queues the right way.
I found an easy solution to the OP's question.
In fact, BullJS processes many jobs in parallel on one worker instance:
Let's say you have one worker instance up and running and push 10 jobs onto the queue; that worker may then start all of them in parallel.
My research on BullJS queues showed that this is not the intended behavior: one worker (also called a processor by BullJS) should only start a new job from the queue when it is idle, that is, not processing a previous job.
Nevertheless, BullJS keeps starting jobs in parallel on one worker.
In our implementation that led to big problems during API calls, most likely caused by too many API calls at a time. Tests showed that when starting only one worker, the API calls finished just fine and returned status 200.
So how do you process one job after the other, each starting once the previous one has finished, if BullJS does not do that for us (just what the OP asked)?
We first experimented with delays and other BullJS options, but that is a workaround and not the exact solution we were looking for. At least, we did not manage to stop BullJS from processing more than one job at a time.
So we did it ourselves and started one job after the other.
The solution was rather simple for our use case after looking into the BullJS API reference.
We just used a for loop to start the jobs one after another. The trick was to use BullJS's
job.finished
method, which returns a Promise that resolves once the job is finished. By using await inside the for loop, the next job is started only after the job.finished Promise has resolved. That's the nice thing about for loops: await works in them!
Here is a small code example showing how to achieve the intended behavior:
for (let i = 0; i < theValues.length; i++) {
jobCounter++
const job = await this.processingQueue.add(
'update-values',
{
value: theValues[i],
},
{
// delay: i * 90000,
// lifo: true,
}
)
this.jobs[job.id] = {
jobType: 'socket',
jobSocketId: BackgroundJobTasks.UPDATE_VALUES,
data: {
value: theValues[i],
},
jobCount: theValues.length,
jobNumber: jobCounter,
cumulatedJobId
}
await job.finished()
.then((val) => {
console.log('job finished:: ', val)
})
}
The important part is really
await job.finished()
inside the for loop. theValues.length jobs get started one after the other, as intended.
The drawback is that horizontally scaling jobs across more than one worker is no longer possible. Nevertheless, this workaround is okay for us at the moment.
I will get in contact with OptimalBits, the maker of BullJS, to clear things up.
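The effect of awaiting inside the for loop can be seen without a queue at all. The sketch below uses a hypothetical runJob function as a stand-in for "add a job and wait for job.finished()"; it does not use BullJS, and the 10 ms timer merely simulates work.

```javascript
// Hypothetical stand-in for "add a job and wait until it has finished".
// It records when the job starts and ends in `log`.
const runJob = (value, log) =>
  new Promise((resolve) => {
    log.push(`start ${value}`);
    setTimeout(() => {
      log.push(`end ${value}`);
      resolve(value);
    }, 10);
  });

// Sequential: each job starts only after the previous one has finished.
async function runSequentially(values, log) {
  for (const value of values) {
    await runJob(value, log);
  }
}

// Parallel, for contrast: all jobs start before any of them finishes.
async function runInParallel(values, log) {
  await Promise.all(values.map((value) => runJob(value, log)));
}
```

With runSequentially the log alternates between start and end entries, while with runInParallel all start entries come first.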

How to specify HTTP timeout for DownloadURL() in Akavache?

I am developing an application targeting mobile devices, so I have to consider bad network connectivity. In one use case, I need to reduce the timeout for a request, because if no network is available, that's okay, and I'd fall back to default data immediately, without having the user wait for the HTTP response.
I found that HttpMixin.MakeWebRequest() has a timeout parameter (with default=null), but DownloadUrl() never makes use of it, so the aforementioned function always waits for up to 15 seconds:
request.Timeout(timeout ?? TimeSpan.FromSeconds(15),
BlobCache.TaskpoolScheduler).Retry(retries);
So actually I do not have the option to use a different timeout, or am I missing something?
Thanks for considering a helpful response.

After looking at the signature for DownloadUrl in
HttpMixin.cs
I see what you are talking about. I am not sure why it is there, but it looks like the timeout relates to building the request and is not a timeout for the request itself.
That said, in order to set a timeout on a download, you have a couple of options that should work.
Via TPL aka Async Await
var timeout = 1000;
var task = BlobCache.LocalMachine.DownloadUrl("http://stackoverflow.com").FirstAsync().ToTask();
if (await Task.WhenAny(task, Task.Delay(timeout)) == task) {
// task completed within timeout
//Do Stuff with your byte data here
//var result = task.Result;
} else {
// timeout logic
}
Via Rx Observables
var obs = BlobCache.LocalMachine
.DownloadUrl("http://stackoverflow.com")
.Timeout(TimeSpan.FromSeconds(5))
.Retry(retryCount: 2);
var result = obs.Subscribe((byteData) =>
{
//Do Stuff with your byte data here
Debug.WriteLine("Byte Data Length " + byteData.Length);
}, (ex) => {
Debug.WriteLine("Handle your exceptions here." + ex.Message);
});

Azure Search .net SDK- How to use "FindFailedActionsToRetry"?

Using the Azure Search .net SDK, when you try to index documents you might get an exception IndexBatchException.
From the documentation here:
try
{
var batch = IndexBatch.Upload(documents);
indexClient.Documents.Index(batch);
}
catch (IndexBatchException e)
{
// Sometimes when your Search service is under load, indexing will fail for some of the documents in
// the batch. Depending on your application, you can take compensating actions like delaying and
// retrying. For this simple demo, we just log the failed document keys and continue.
Console.WriteLine(
"Failed to index some of the documents: {0}",
String.Join(", ", e.IndexingResults.Where(r => !r.Succeeded).Select(r => r.Key)));
}
How should e.FindFailedActionsToRetry be used to create a new batch to retry the indexing for failed actions?
I've created a function like this:
public void UploadDocuments<T>(SearchIndexClient searchIndexClient, IndexBatch<T> batch, int count) where T : class, IMyAppSearchDocument
{
try
{
searchIndexClient.Documents.Index(batch);
}
catch (IndexBatchException e)
{
if (count == 5) //we will try to index 5 times and give up if it still doesn't work.
{
throw new Exception("IndexBatchException: Indexing Failed for some documents.");
}
Thread.Sleep(5000); // we got an error; wait 5 seconds and try again (in case it's an intermittent or network issue)
var retryBatch = e.FindFailedActionsToRetry<T>(batch, arg => arg.ToString());
UploadDocuments(searchIndexClient, retryBatch, count + 1); // count++ would pass the old value, so the limit of 5 would never be reached
}
}
But I think this part is wrong:
var retryBatch = e.FindFailedActionsToRetry<T>(batch, arg => arg.ToString());

The second parameter to FindFailedActionsToRetry, named keySelector, is a function that should return whatever property on your model type represents your document key. In your example, your model type is not known at compile time inside UploadDocuments, so you'll need to change UploadDocuments to also take the keySelector parameter and pass it through to FindFailedActionsToRetry. The caller of UploadDocuments would need to specify a lambda specific to type T. For example, if T is the sample Hotel class from the sample code in this article, the lambda must be hotel => hotel.HotelId since HotelId is the property of Hotel that is used as the document key.
Incidentally, the wait inside your catch block should not wait a constant amount of time. If your search service is under heavy load, waiting for a constant delay won't really help to give it time to recover. Instead, we recommend exponentially backing off (e.g., the first delay is 2 seconds, then 4 seconds, then 8 seconds, then 16 seconds, up to some maximum).
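The backoff schedule described above can be sketched as a small function. It is written in JavaScript to match the Node.js snippets elsewhere on this page; the same formula carries over to C#. The name backoffSeconds and the 60-second default cap are assumptions made for illustration.

```javascript
// Delay in seconds before retry number `attempt` (1-based): 2, 4, 8, 16, ...
// capped at `maxSeconds`. Both names are hypothetical.
function backoffSeconds(attempt, maxSeconds = 60) {
  return Math.min(2 ** attempt, maxSeconds);
}

// Delays for the first six retries with the default cap.
const schedule = [1, 2, 3, 4, 5, 6].map((n) => backoffSeconds(n));
```

Doubling the delay gives a loaded service progressively more time to recover, and the cap keeps the wait from growing without bound.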

I've taken Bruce's recommendations in his answer and comment and implemented it using Polly.
Exponential backoff up to one minute, after which it retries every other minute.
Retry as long as there is progress. Timeout after 5 requests without any progress.
IndexBatchException is also thrown for unknown documents. I chose to ignore such non-transient failures since they are likely indicative of requests which are no longer relevant (e.g., removed document in separate request).
int curActionCount = work.Actions.Count();
int noProgressCount = 0;
await Polly.Policy
.Handle<IndexBatchException>() // One or more of the actions has failed.
.WaitAndRetryForeverAsync(
// Exponential backoff (2s, 4s, 8s, 16s, ...) and constant delay after 1 minute.
retryAttempt => TimeSpan.FromSeconds( Math.Min( Math.Pow( 2, retryAttempt ), 60 ) ),
(ex, _) =>
{
var batchEx = ex as IndexBatchException;
work = batchEx.FindFailedActionsToRetry( work, d => d.Id );
// Verify whether any progress was made.
int remainingActionCount = work.Actions.Count();
if ( remainingActionCount == curActionCount ) ++noProgressCount;
curActionCount = remainingActionCount;
} )
.ExecuteAsync( async () =>
{
// Limit retries if no progress is made after multiple requests.
if ( noProgressCount > 5 )
{
throw new TimeoutException( "Updating Azure search index timed out." );
}
// Only retry if the error is transient (determined by FindFailedActionsToRetry).
// IndexBatchException is also thrown for unknown document IDs;
// consider them outdated requests and ignore.
if ( curActionCount > 0 )
{
await _search.Documents.IndexAsync( work );
}
} );

How to setTimeout in node.js?

I need to be able to retry in node.js in the event of a failure inside a function. I've set up a while loop as shown below, but I am getting slightly confused about how I should wrap the function call to make sure that it won't block my whole server.
What should I do?
while(retryCount < 10 && !success){
// Alternative one
while(new Date().getTime() < now + 1000) {
myFunction();
}
// Or:
setTimeout( myFunction(), 1000);
}

You can store the number of tries on the function object. That works fine for a cron job. If you need the same behaviour in a request context, you must store the attempt counter in the request scope (not on the function object).
var fnc = function() {
console.log('try');
if (true) { // Error condition
// Error here
if (!fnc.tryes) fnc.tryes = 0;
fnc.tryes++;
console.log(fnc.tryes);
if (fnc.tryes <= 10) {
setTimeout(fnc, 1000);
} else {
fnc.tryes = 0;
}
// Something wrong
} else {
// We have a result
}
};
fnc();

I'd say use the setTimeout method, that way the client won't be stuck inside the while loop that checks the time.
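A minimal sketch of a non-blocking retry built on setTimeout follows. It assumes the function being retried returns a Promise (or throws) to signal failure; retry, sleep, maxRetries, and waitMs are hypothetical names, not part of any answer above.

```javascript
// Resolve after `ms` milliseconds; the event loop stays free in the meantime.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Call `fn` until it succeeds or `maxRetries` attempts have failed.
// Rethrows the last error if every attempt fails.
async function retry(fn, maxRetries = 10, waitMs = 1000) {
  let lastError;
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Wait before the next attempt, but not after the final one.
      if (attempt < maxRetries) await sleep(waitMs);
    }
  }
  throw lastError;
}

// Usage (hypothetical): retry(() => myFunction(), 10, 1000).then(onOk, onFail);
```

Note that setTimeout expects a function reference; writing setTimeout(myFunction(), 1000) calls myFunction immediately and passes its return value instead.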

That outer while loop is going to block, you'd have to refactor using only setTimeout. However, the fact that you want this sort of thing indicates to me that your code structure is really terrible and needs more reworking. What is it that you are retrying? How are you detecting an error condition? Does doing it 10 times really make the chances of success higher?
I have a gist containing a generic function that will do this sort of thing for you, but I'm reluctant to share if this is an XY problem.
