I'm aware that a single workflow instance runs on a single thread at a time. I have a workflow with two receive activities inside a pick activity. Message correlation is implemented to make sure that requests to both activities are routed to the same instance.
In the first receive branch I have a parallel activity with a delay activity in one branch. The parallel activity completes when either the delay expires or a flag is set to true.
While the parallel activity is waiting for its completion condition to be met, how can I receive calls through the second receive activity? The flag can only be set to true through that branch. I'm waiting for your suggestions or ideas.
Check out my blog post, The Workflow Parallel Activity and Task Parallelism; it will help you understand how WF works.
Not quite sure what you are trying to achieve here.
If you have a Pick with two branches and both branches contain a Receive, the workflow will continue after it receives either of the two messages the Receive activities are waiting for. The other Receive will be canceled and will not receive anything. The fact that one Receive is inside a Parallel makes no difference here. So unless this is in a loop, you will not receive more than one WCF message in your workflow.
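For illustration, here is a minimal sketch of that Pick shape in code (contract and operation names are hypothetical, and correlation setup is omitted); the While around the Pick is what allows more than one message to be received:

using System.Activities;
using System.Activities.Expressions;
using System.Activities.Statements;
using System.ServiceModel.Activities;

// A Pick with two Receive triggers: whichever message arrives first wins,
// and the other Receive is canceled. Only the surrounding While lets the
// workflow go back and wait for further messages on later iterations.
Activity body = new While
{
    Condition = new Literal<bool>(true), // loop forever, for the sake of the sketch
    Body = new Pick
    {
        Branches =
        {
            new PickBranch
            {
                Trigger = new Receive
                {
                    OperationName = "StartJob",          // hypothetical operation
                    ServiceContractName = "IJobService", // hypothetical contract
                    CanCreateInstance = true
                },
                Action = new WriteLine { Text = "StartJob received" }
            },
            new PickBranch
            {
                Trigger = new Receive
                {
                    OperationName = "SetFlag",
                    ServiceContractName = "IJobService"
                },
                Action = new WriteLine { Text = "SetFlag received" }
            }
        }
    }
};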
I am building an email processing pipeline in Node.js with Google Pub/Sub as a message queue. The message queue has a limitation: it needs an acknowledgment for a sent message within 10 minutes. However, the jobs it sends to the Node.js server might take an hour to complete, so the same job might be delivered and run multiple times until one of them finishes. I'm also worried that this will block the Node.js event loop and slow down the server.
Find an architecture diagram attached. My questions are:
Should I be using a message queue to start this long-running job given that the message queue expects a response in 10 mins or is there some other architecture I should consider?
If multiple such jobs start, should I be worried about the Node.js event loop being blocked? Each job basically iterates through a MongoDB cursor, creating hundreds of thousands of emails.
Well, it sounds like you either should not be using that queue (given the timeout you can't change) or you should break your jobs up into pieces that easily finish long before the timeout. This is a case of matching the tool to the requirements of the job: if that queue doesn't match your requirements, you probably need a different mechanism. I don't fully understand what you need from Google's Pub/Sub, but creating a queue of your own, or finding a generic queue on NPM, is generally fairly easy if you just want to serialize access to a bunch of jobs (a sketch follows below).
I rather doubt you have Node.js event-loop blockage issues as long as all your I/O uses asynchronous methods. Nothing you're doing sounds CPU-heavy, and that's what blocks the event loop (long-running CPU-heavy operations). Your whole project is probably limited by MongoDB and by whatever you're using to send the emails, so you should make sure you're not overwhelming either of those to the point where they become sluggish and lose throughput.
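The question is Node-specific, but the "queue of your own" idea is language-agnostic. A minimal sketch of serializing access to a bunch of jobs, written here in C# (all names hypothetical):

using System;
using System.Threading.Tasks;

// A minimal serialized job queue: each enqueued job starts only after the
// previous one has finished, whether it succeeded or failed.
public class SerialJobQueue
{
    private readonly object _gate = new object();
    private Task _tail = Task.CompletedTask;

    public Task Enqueue(Func<Task> job)
    {
        lock (_gate)
        {
            _tail = _tail.ContinueWith(_ => job(), TaskScheduler.Default).Unwrap();
            return _tail;
        }
    }
}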
To answer the original question:
Should I be using a message queue to start this long-running job given that the message queue expects a response in 10 mins or is there some other architecture I should consider?
Yes, a message queue works well for dealing with these kinds of events. The important thing is to make sure the final action is idempotent, so that even if you accidentally process duplicate events, the final result is applied only once. This guide from Google Cloud is a helpful resource on making your subscriber idempotent.
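One common way to get that idempotence is to record each message ID under a unique key before performing the side effect, so a duplicate delivery is detected and skipped. A sketch in C# with the MongoDB driver (the pipeline already uses MongoDB; collection and names here are hypothetical):

using System;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

public static class IdempotentHandler
{
    // processedMessages uses the Pub/Sub message ID as _id, so inserting a
    // duplicate throws a duplicate-key error and the action is skipped.
    public static async Task<bool> TryProcessAsync(
        IMongoCollection<BsonDocument> processedMessages,
        string messageId,
        Func<Task> finalAction)
    {
        try
        {
            await processedMessages.InsertOneAsync(new BsonDocument("_id", messageId));
        }
        catch (MongoWriteException e)
            when (e.WriteError.Category == ServerErrorCategory.DuplicateKey)
        {
            return false; // already handled; just acknowledge the message
        }

        // Note: if the process crashes here, the message is marked handled but the
        // action never ran; the in-memory table described below trades the opposite
        // way (a crash lets the job run again).
        await finalAction();
        return true;
    }
}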
To get around the 10-minute limit of Pub/Sub, I ended up creating an in-memory table that tracked active jobs. If a job was actively being processed and Pub/Sub sent the message again, the handler would do nothing. If the server restarts and loses the job, the in-memory table also disappears, so the job can be processed once again if it was incomplete.
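A minimal sketch of that in-memory table, again in C# with hypothetical names:

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Tracks jobs currently being processed. A redelivery of an active job is a
// no-op; after a restart the table is empty, so an incomplete job gets
// processed again.
public class ActiveJobTable
{
    private readonly ConcurrentDictionary<string, bool> _active =
        new ConcurrentDictionary<string, bool>();

    public async Task HandleAsync(string jobId, Func<Task> runJob)
    {
        if (!_active.TryAdd(jobId, true))
            return; // already running; ignore the duplicate delivery

        try { await runJob(); }
        finally { _active.TryRemove(jobId, out _); }
    }
}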
If multiple such jobs start, should I be worried about the Node.js event loop being blocked? Each job is basically iterating through a MongoDB cursor creating hundreds of thousands of emails.
I have ignored this for now, as per the comment left by jfriend00. You can also rate-limit the number of jobs being processed, as sketched below.
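A sketch of one way to rate-limit concurrent jobs, using a counting semaphore (the limit of 4 is an arbitrary example):

using System;
using System.Threading;
using System.Threading.Tasks;

// At most four jobs run concurrently; additional jobs wait for a free slot.
public class JobLimiter
{
    private readonly SemaphoreSlim _slots = new SemaphoreSlim(4);

    public async Task RunAsync(Func<Task> job)
    {
        await _slots.WaitAsync();
        try { await job(); }
        finally { _slots.Release(); }
    }
}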
I have to develop a robot which has to work for 7 days. I've created the process, and my question is: do I have to create a Work Queue and configure my process, or how should I do that?
Creating a work queue is not compulsory; it depends on the process. You can skip it when you do not require any output from the bot and are not handling a large volume of data, i.e. when:
We do not need the status (error/completed) of each item.
The business does not need a status report.
We do not need to keep track of which items are completed and which are pending.
But I would suggest you create and use a work queue, because:
It keeps track of the number of records processed.
It makes it easy to generate a business report (how many requests were executed successfully and how many hit an exception).
For each item it gives us a status: executed successfully or raised an exception.
We can easily track errors.
And most importantly: if the bot execution fails for some reason and we need to restart the bot, then:
A. With a work queue, the bot will not pick up items that were already worked; it will pick the next pending item from the queue.
B. Without a work queue, the bot can pick up items that were processed previously, and there is no point in re-processing items that were already completed.
You can also refer to the documentation provided by Blue Prism on their portal:
Work Queue Guide
I'm creating an app that uses a job queue built on Amazon SQS.
Every time a user logs in, I create a bunch of jobs for that specific user, and I want them to wait until all their jobs have been processed before taking them to a specific screen.
My problem is that I don't know how to query the queue to see if there are still pending jobs for a specific user, or what the correct way to implement such a solution is.
Everything regarding the queue (job creation and processing) is working as expected. But I am missing that final step.
Just for the record:
In my previous implementation I was using Redis + Kue, and I had created a key with the user ID and the job count. Every time a job was added, that count was incremented, and every time a job finished or failed, I decremented it. But now I want to move away from Redis + Kue and I am not sure how to implement this step.
Amazon SQS is not the ideal tool for the scenario you describe. A queueing system is normally used in a "send and forget" situation, where the sending system doesn't remain interested in later processing.
You could investigate Amazon Simple Workflow (SWF), which allows work to be monitored as it goes through several steps. Your existing code could mostly be reused, just with the SWF framework added. You could even power it from Lambda, since you are already using Node.js.
What do people tend to do when they have a design that puts jobs on a Service Bus queue or topic, where the jobs take longer than the 5-minute maximum of PeekLock?
I have been using the OnMessage(...) async message pump of Service Bus, and I'm wondering whether that's such a good idea after all: if I start moving the jobs to a table while processing them, the message pump will just empty the queue, and I merely move the problem elsewhere, to making sure my jobs are scheduled evenly between servers.
If you have a long-running message-processing workflow, you can check the LockedUntilUtc property of the message and call RenewLock at the appropriate time.
http://msdn.microsoft.com/en-us/library/windowsazure/microsoft.servicebus.messaging.brokeredmessage.renewlock.aspx
In the next release of the SDK, the OnMessage processing loop will do that for you automatically, so that convenience API remains a good choice.
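A minimal sketch of that renewal pattern with the BrokeredMessage API linked above (processJobAsync and the 30-second safety margin are assumptions):

using System;
using System.Threading.Tasks;
using Microsoft.ServiceBus.Messaging;

public static class LongRunningHandler
{
    // Keep renewing the peek-lock while a long job runs, so the message is not
    // unlocked and redelivered to another consumer mid-processing.
    public static async Task ProcessWithLockRenewalAsync(
        BrokeredMessage message, Func<Task> processJobAsync)
    {
        var job = processJobAsync();
        while (!job.IsCompleted)
        {
            // Wake up a little before the lock actually expires.
            var wait = message.LockedUntilUtc - DateTime.UtcNow - TimeSpan.FromSeconds(30);
            if (wait > TimeSpan.Zero)
                await Task.WhenAny(job, Task.Delay(wait));
            if (!job.IsCompleted)
                message.RenewLock();
        }
        await job;          // surface any exception from the job itself
        message.Complete(); // done: remove the message from the queue
    }
}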
I have an Azure worker role whose job is to periodically run some code against a SQL Azure database. Here's my current code:
const int oneHour = 60 * 60 * 1000; // one hour in milliseconds

while (true)
{
    var numConversions = SaveSeedsToSQL.ConvertRemainingPotentialQueryURLsToSeeds();
    SaveLogEntryToSQL.Save(new LogEntry { Count = numConversions });
    Thread.Sleep(oneHour);
}
Is sleeping the thread for an hour the best way of programming such regular but infrequent events, or is there some kind of wake-up-and-run-again mechanism for Azure worker roles that I should be using?
This code works of course, but there are some problems:
1. You can fail somewhere and this schedule gets all thrown off. That is important if you must actually do the work at a specific time.
2. There is no concurrency control here. If you want something done only once, you need a mechanism such that a single instance will perform the work and the other instances won't.
There are a few solutions to this problem:
Run the Windows Scheduler on the role (built in). That solves problem 1, but not 2.
Run Quartz.NET and schedule things. That solves #1 and, depending on how you do it, also #2.
Use future scheduled queue messages in either Service Bus or Windows Azure queues. That solves both.
The first two options work with caveats, so I think the last option deserves more attention. You can simply create a message that your roles understand and post it to the queue. Once the scheduled time comes, the message becomes visible, and one of your normally polling roles will see it and can work on it. The benefit is that this is both time-accurate and single-instance: because it is a queue message, only one instance will pick it up. When the work is complete, that instance schedules the next message and posts it to the queue. We use this technique all the time. You only have to be careful that if for some reason your role fails before scheduling the next message, the whole cycle stops, so you should have some sanity checks and safeguards there. A minimal sketch of the pattern follows.
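A sketch of the scheduled-message variant using Service Bus (the queue name and connection string are placeholders; the handler body reuses the methods from the question's code):

using System;
using Microsoft.ServiceBus.Messaging;

public class HourlyTickScheduler
{
    private readonly QueueClient _client =
        QueueClient.CreateFromConnectionString("<connection string>", "hourly-tick");

    public void SeedOnce()
    {
        // Post the first message once (e.g. at deployment time) to start the cycle.
        _client.Send(new BrokeredMessage("tick"));
    }

    public void Start()
    {
        _client.OnMessage(message =>
        {
            var numConversions = SaveSeedsToSQL.ConvertRemainingPotentialQueryURLsToSeeds();
            SaveLogEntryToSQL.Save(new LogEntry { Count = numConversions });

            // Schedule the next tick an hour out; it stays invisible until then.
            // If the role dies before this Send, the cycle stops: that is the
            // failure mode the safeguards mentioned above need to cover.
            _client.Send(new BrokeredMessage("tick")
            {
                ScheduledEnqueueTimeUtc = DateTime.UtcNow.AddHours(1)
            });
        });
    }
}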