How to implement work stealing in SimPy 3?

I want to implement something akin to work stealing or task migration in multiprocessor systems. Details below.
I am simulating a scheduling system with multiple worker nodes (resources, each with capacity greater than one) and tasks (processes) that arrive randomly and are queued by the scheduler at a specific worker node. This is working fine.
However, I want to trigger an event when a worker node has spare capacity, so that it steals the front task from the worker with the longest wait queue.
I can implement the functionality described above. The problem is that all the tasks waiting on the worker queue from which we are stealing work receive the event notification. I want to notify ONLY the task at the front of the queue (or only N tasks at the front of the queue).
The Bank reneging example is the closest example to what I want to implement. However, in that example (1) ALL the customers leave the queue when they are notified that the event was triggered, and (2) when the event is triggered, the customers leave the system entirely; in my case, I want the task to wait at another worker instead (though it wouldn't actually wait, since that worker's queue is empty).
Old question: Can this be done in SimPy?
New questions: How can I do this in SimPy?
1) How can I have many processes waiting for a resource listen for an event, but notify only the first one?
2) How can I make a process migrate to another resource?
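For concreteness, here is a minimal sketch of one way to get both behaviors in SimPy 3, assuming each worker's queue is modeled as a simpy.Store rather than as processes blocking on a shared event (all names and parameters are hypothetical). A Store hands each item to exactly one pending get() request, so only one waiter is woken, and an idle worker "migrates" work simply by calling get() on another worker's store:

import random
import simpy

def worker(env, name, queue, all_queues):
    """Serve tasks from our own queue; steal from the longest other queue when idle."""
    while True:
        source = queue
        if not queue.items:
            # Idle: pick the busiest other queue and steal its front task.
            victim = max((q for q in all_queues if q is not queue),
                         key=lambda q: len(q.items))
            if victim.items:
                source = victim
        # Store.get() wakes exactly one waiter, so only this worker
        # receives the task -- nothing is broadcast to other processes.
        task = yield source.get()
        yield env.timeout(task['duration'])
        print('%5.1f  %s finished task %d' % (env.now, name, task['id']))

def scheduler(env, queues, n_tasks=10):
    """Tasks arrive randomly and are queued at a randomly chosen worker."""
    for i in range(n_tasks):
        yield env.timeout(random.expovariate(1.0))
        random.choice(queues).put({'id': i, 'duration': random.uniform(1, 3)})

env = simpy.Environment()
queues = [simpy.Store(env) for _ in range(3)]
for n, q in enumerate(queues):
    env.process(worker(env, 'W%d' % n, q, queues))
env.process(scheduler(env, queues))
env.run()

This sketch only checks for a steal when a worker goes idle between tasks; a fuller version would let a worker blocked on its own empty queue also race a get() against a victim's queue (e.g. via env.any_of).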

Related

Does a game server create threads for each user request (like dota 2)?

For a user base of 100,000 and 4 users per game session, should we create a new thread for each request, such as create_session, move_player, use_attack, etc.?
I want to know the optimal way to handle a large number of connections: if we create a large number of threads, context switching will eat up most of the CPU cycles, and if no new threads are created, each request has to wait for the previous request to complete.
I would avoid thread-per-connection if your goal is scalability. It would be better to have a queue of events and a thread pool.
A game company would probably use a connectionless internet protocol like UDP. All requests can theoretically come in on the same socket, so you only need one thread to handle that. That thread can assign work to other threads.
You can have a larger thread pool where any thread can be assigned any job. Or you could further organize the work into specific jobs, each with a thread pool to process a queue of tasks. But I wouldn't launch a new thread for each request.
How you design your thread pools and task distribution system depends on the libraries for whatever language you're using and the application requirements.
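As a rough sketch of the single-receiver-plus-pool idea described above (Python for illustration; the port, pool size and message handling are made-up assumptions, not a real game protocol):

import socket
from concurrent.futures import ThreadPoolExecutor

def handle_request(data, addr):
    """Parse and process one request; runs on a pool thread."""
    command = data.decode(errors='replace').strip()
    # ... dispatch on command: create_session, move_player, use_attack, ...
    print('handled %r from %s' % (command, addr))

def serve(host='0.0.0.0', port=9999, pool_size=8):
    # One UDP socket, one receiving thread (this one); a fixed pool of
    # workers processes requests, so no thread is spawned per request.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        while True:
            data, addr = sock.recvfrom(4096)
            pool.submit(handle_request, data, addr)

if __name__ == '__main__':
    serve()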

Amazon SQS better way of handling listeners

I have an SQS queue which has a lot of messages (typically in the thousands). Presently I have multiple listeners (threads created from the same source), and each listener listens to the queue and receives messages. As soon as a listener receives a message from the queue, that listener deletes the message from the queue. The message is processed only after it has been deleted from the queue. I have a visibility timeout of 30 seconds.
I am not using any locks or anything to handle duplicates, since I delete the message from the queue as soon as it is received. I haven't seen a duplicate so far, but I am worried it might happen.
Now, the question is: which is the better way, having multiple listeners this way, or listening to the queue in a single thread and then spinning up new threads to process each message you receive?
Firstly, it is worth understanding the concept of message invisibility timeout.
When a message is retrieved from an Amazon SQS queue (e.g. by your thread), the message is marked as invisible in Amazon SQS. Best practice is for your thread to process the message and delete it only after processing has completed. This way, if the thread fails, the message automatically becomes visible on the queue again and another thread can process it.
With your current application design, if a thread fails then the message is lost and will not be retried. You should consider changing your code to delete the message only after it has been processed.
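With boto3, for example, that receive, process, then delete flow looks roughly like this (the queue URL and handler are placeholders):

import boto3

QUEUE_URL = 'https://sqs.us-east-1.amazonaws.com/123456789012/my-queue'  # placeholder
sqs = boto3.client('sqs')

def process(body):
    """Placeholder for the real message handler."""
    print('processing', body)

while True:
    resp = sqs.receive_message(
        QueueUrl=QUEUE_URL,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,      # long polling
        VisibilityTimeout=30,    # hidden from other consumers while we work
    )
    for msg in resp.get('Messages', []):
        process(msg['Body'])     # do the work first...
        sqs.delete_message(      # ...then delete, so a crash mid-processing
            QueueUrl=QUEUE_URL,  # lets the message reappear and be retried
            ReceiptHandle=msg['ReceiptHandle'],
        )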
Using multiple threads to process messages is recommended, because it will allow higher message throughput by processing messages in parallel. It is also a simpler design, and simple is always best. Your alternate idea of having one process retrieve messages and then firing off threads to process the message is more complex and does not provide any benefits.
Amazon SQS queues can occasionally return the same message more than once. It is rare, but it can happen. The multiple-thread design will probably encounter this more often than the single-thread design, because multiple threads might simultaneously retrieve the same message. However, it could still happen in the single-thread model, too.
If processing the same message twice is a concern, then consider using a FIFO queue (not currently available in every AWS Region). This will guarantee that every message is received only once. Alternatively, your code would need to check whether a particular message has already been processed (e.g. by checking in a database).
The multiple-thread design will also allow you to scale horizontally by having multiple systems (even across multiple Availability Zones) process messages, whereas your single-thread design has a single point of failure and is less scalable.

Failure handling for Queue Centric work pattern

I am planning to use a queue-centric design as described here for one of my applications. That essentially consists of using an Azure queue where work requests are queued from the UI. A worker reads from the queue, processes the request, and deletes the message from the queue.
The 'work' done by the worker is within a transaction, so if the worker fails before completing, upon restart it picks up the same message again (as it has not been deleted from the queue) and tries to perform the operation again (up to a maximum number of retries).
To scale, I could use one of two methods:
Multiple workers, each with a separate queue. So if I have five workers W1 to W5, I have five queues Q1 to Q5; each worker knows which queue to read from, and failure handling is similar to the case with one queue and one worker.
One queue and multiple workers. Here, failure/retry handling would be more involved, and might end up using the 'invisibility' time in the message queue to make sure no two workers pick up the same job. The invisibility time would have to be calculated to make sure it is long enough for the job to complete, yet not so long that retries happen only after a long delay.
I would like to know if the first approach is the correct way to go. What are robust ways of handling failures in the second approach?
You would be better off taking approach 2 - a single queue, but with multiple workers.
This is better because:
The process that delivers messages to the queue only needs to know about a single queue endpoint. This reduces complexity at this end;
Scaling the number of workers that are pulling from the queue is now decoupled from any code / configuration changes - you can scale up and down much more easily (and at runtime)
If you are worried about the visibility, you can initially choose a default timespan, and then, if processing looks like it is taking too long, the worker can periodically call UpdateMessage() to extend the message's visibility timeout.
Finally, if your worker times out and fails to complete processing of the message, it will be picked up again by some other worker and retried. You can also use the DequeueCount property of the message to manage the number of retries.
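For illustration, the same pattern with the Python Storage Queue SDK (azure-storage-queue), where update_message and dequeue_count are the counterparts of UpdateMessage() and DequeueCount; the connection string, retry cap and work function are placeholders:

import time
from azure.storage.queue import QueueClient

CONN_STR = '...'   # placeholder storage connection string
MAX_RETRIES = 5    # illustrative retry cap

def do_work(content):
    """Placeholder for the real job."""
    time.sleep(1)

queue = QueueClient.from_connection_string(CONN_STR, queue_name='work-items')

for msg in queue.receive_messages(visibility_timeout=60):
    if msg.dequeue_count > MAX_RETRIES:
        queue.delete_message(msg)  # poison message: stop retrying it
        continue
    # Long-running job: push the visibility window out so no other
    # worker picks the message up while we are still working on it.
    msg = queue.update_message(msg, visibility_timeout=120)
    do_work(msg.content)
    queue.delete_message(msg)      # success: remove it for good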
Multiple workers, each with a separate queue. So if I have five workers W1 to W5, I have five queues Q1 to Q5; each worker knows which queue to read from, and failure handling is similar to the case with one queue and one worker.
With this approach I see following issues:
This approach makes your architecture tightly coupled (thus defeating the whole purpose of using queues). Because each worker role listens to a dedicated queue, the web application responsible for pushing messages into the queues always needs to know how many workers are running. Any time you scale your worker roles up or down, you somehow need to tell the web application so that it can start pushing messages to the appropriate queues.
If a worker role instance is taken down for whatever reason, there is a possibility that some messages may never be processed, as the other worker role instances are working on their own dedicated queues.
There is also a possibility of under- or over-utilization of worker role instances, depending on how the web application pushes messages into the queues. For optimal utilization, the web application would need to know about worker role utilization so that it can decide which queue to send each message to. This is certainly not a desirable thing for a web application to do.
I believe #2 is the correct way to go. @Brendan Green has covered your concerns about #2 excellently in his answer.

Concurrent message processing in RabbitMQ consumer

I am new to RabbitMQ, so please excuse me if my question sounds trivial. I want to publish messages to RabbitMQ which will be processed by a RabbitMQ consumer.
My consumer machine is a multi-core machine (preferably an Azure worker role), but QueueingBasicConsumer pushes one message at a time. How can I program it to utilize all cores and process multiple messages concurrently?
One solution could be to open multiple channels in multiple threads and then process messages there. But in this case, how would I decide the number of threads?
Another approach could be to read messages on the main thread and then create a task for each message. In this case I would have to stop consuming messages when there are too many messages (beyond a threshold) already in progress. I am not sure how this could be implemented.
Thanks in advance.
Your second option sounds much more reasonable - consume on a single channel and spawn multiple tasks to handle the messages. To implement concurrency control, you could use a semaphore to control the number of tasks in flight. Before starting a task, you would wait for the semaphore to become available, and after a task has finished, it would signal the semaphore to allow other tasks to run.
You haven't specified your language/technology stack of choice, but whatever you do, try to utilise a thread pool instead of creating and managing threads yourself. In .NET, that would mean using Task.Run to process messages asynchronously.
Example C# code:
using (var semaphore = new SemaphoreSlim(MaxMessages))
{
    while (true)
    {
        // Blocks until the consumer delivers the next message.
        var args = (BasicDeliverEventArgs)consumer.Queue.Dequeue();

        // Wait for a free slot so at most MaxMessages are in flight.
        semaphore.Wait();

        Task.Run(() => ProcessMessage(args))
            .ContinueWith(t => semaphore.Release()); // free the slot when done
    }
}
Instead of controlling the concurrency level yourself, you might find it easier to enable explicit ACK control on the channel and use RabbitMQ consumer prefetch to set the maximum number of unacknowledged messages. This way, you will never receive more messages than you want to handle at once.
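For illustration, the equivalent idea in Python with the pika client, using manual acks plus a prefetch limit (the queue name, prefetch count and handler are made up):

import pika

def process_message(body):
    """Placeholder for the real message handler."""
    print('processing', body)

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.queue_declare(queue='tasks', durable=True)

# Deliver at most 10 unacknowledged messages to this consumer at a time.
channel.basic_qos(prefetch_count=10)

def on_message(ch, method, properties, body):
    process_message(body)
    ch.basic_ack(delivery_tag=method.delivery_tag)  # the ack frees a prefetch slot

channel.basic_consume(queue='tasks', on_message_callback=on_message)
channel.start_consuming()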

Azure Service Bus or Queue

I have a process where I would like to use an Azure Queue or Service Bus to decouple the processing from the UI. A user will press a button, and I would like to place two messages in the queue, each with its own topic. One set of competing consumers will process topic A, and another set will process topic B. Only after both A and B complete should a third process, C, start. Said another way, the button press should launch two processes in parallel (both are intensive and need to start together), and then, when both have successfully completed, a third and final competing consumer should run to finish the task.
I am trying to avoid storing the success of process 1 and 2 in a DB or something, and instead do this all with a queue.
Thanks in advance...
Sounds like you need an Azure Service Bus topic for the first part: one topic with two subscriptions, each acting like a queue with its own set of competing consumers. This gives you the topic/subscription model you have described.
Automatically triggering another service after both have completed is not possible using a queue alone. It will require some sort of persistence layer to keep track of each process's state.
To keep things decoupled, you could have processes A and B send completion messages to another queue. Then you could place a message pump at the end of this queue that can decide when to start process C.
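A hedged sketch of such a pump, using the Python azure-servicebus package (the queue names, message format and in-memory state are all illustrative assumptions; a production pump would persist its state rather than keep it in a dict):

import json
from azure.servicebus import ServiceBusClient, ServiceBusMessage

CONN_STR = '...'  # placeholder Service Bus connection string

def run_pump():
    pending = {}  # work item id -> set of completed stages (in memory only!)
    with ServiceBusClient.from_connection_string(CONN_STR) as client:
        receiver = client.get_queue_receiver(queue_name='completions')
        sender = client.get_queue_sender(queue_name='start-c')
        with receiver, sender:
            for msg in receiver:
                done = json.loads(str(msg))  # e.g. {"work_id": 42, "stage": "A"}
                stages = pending.setdefault(done['work_id'], set())
                stages.add(done['stage'])
                if stages == {'A', 'B'}:
                    # Both A and B reported success: kick off process C.
                    sender.send_messages(
                        ServiceBusMessage(json.dumps({'work_id': done['work_id']})))
                    del pending[done['work_id']]
                receiver.complete_message(msg)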
