I'm trying to process multiple messages from the same session at the same time and want FIFO ordering guaranteed. It only works with MaxConcurrentCallsPerSession = 1 on ServiceBusSessionProcessorOptions.
When I set MaxConcurrentCallsPerSession > 1, my message handler receives the session's messages in no particular order.
So, if I want to guarantee FIFO processing within a session, does it only work with serial processing?
You cannot process messages in a specific order and also process many of them at the same time.
Even if you read the messages from the queue in order, there is no control over how long each message takes to process. The end processing time for each message would then appear to be random if you read messages concurrently.
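One way to keep both properties is to stay serial within a session while running several sessions in parallel; on the Service Bus processor that corresponds to leaving MaxConcurrentCallsPerSession at 1 and raising MaxConcurrentSessions. The Java sketch below only illustrates that idea with plain executors and is not Service Bus SDK code; SessionOrderedDispatcher and its methods are hypothetical names. Earlier messages deliberately take longer, so a fully concurrent handler would finish them in reverse order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration: one single-threaded "lane" per session keeps FIFO
// order inside a session, while different sessions are processed in parallel.
class SessionOrderedDispatcher {
    private final Map<String, ExecutorService> lanes = new ConcurrentHashMap<>();

    // Queue the handler on the lane that belongs to this session.
    void dispatch(String sessionId, Runnable handler) {
        lanes.computeIfAbsent(sessionId, id -> Executors.newSingleThreadExecutor())
             .execute(handler);
    }

    // Stop accepting work and wait for every lane to drain.
    void shutdownAndWait() {
        try {
            for (ExecutorService lane : lanes.values()) lane.shutdown();
            for (ExecutorService lane : lanes.values()) lane.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Sends messages 0..4 to sessions "A" and "B" and records the completion order per session.
    static Map<String, List<Integer>> demo() {
        SessionOrderedDispatcher dispatcher = new SessionOrderedDispatcher();
        Map<String, List<Integer>> processed = new ConcurrentHashMap<>();
        for (int seq = 0; seq < 5; seq++) {
            for (String session : new String[] {"A", "B"}) {
                final int n = seq;
                dispatcher.dispatch(session, () -> {
                    sleepQuietly((5 - n) * 5L); // earlier messages take longer
                    processed.computeIfAbsent(session, s -> new ArrayList<>()).add(n);
                });
            }
        }
        dispatcher.shutdownAndWait();
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Each session still completes its messages in the order 0 to 4, because only one handler per session runs at a time; throughput comes from the number of sessions, not from concurrency inside one session.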
I have an application that lets users upload up to 200 documents/resumes.
The objective of the application is to return a parsed and scored result for each of these documents.
The front end splits these 200 documents into batches of 10, i.e. 20 messages are put into a queue (RabbitMQ).
I have 6 worker processes listening to the queue (scripts that are triggered by an entry_point).
Each worker takes a message and, if it is a batch message, splits it into individual resumes. If it is not a batch message, the worker starts processing it. (The average processing time is around 8 seconds.)
The queue piles up with 200 resumes and the 6 workers get 5 messages, processing each message sequentially.
This means that if another user uploads even 1 resume, one of the workers needs to reach the end of the queue to pick up that message, and the user is left waiting until the 200 resumes have been processed.
I'm doing this using RabbitMQ and Python 2.7.
I'm using a BlockingConnection to connect to the queue and process the messages.
The only way to get to the last user's message is to finish processing all the messages as fast as I can, which could mean more processes or more containers. When I fire up more processes using multiprocessing (a pool of 6 workers), CPU utilization is at its highest and cannot handle any more messages.
How can I prevent my users from waiting for the response? Is adding more workers to listen and consume from the queue the only way?
The consumer is just a plain consumer with no API. The tasks are directly picked from the queue and processed.
The more workers I add, the faster the queue is consumed. But the user who uploaded last still has to wait a long time.
I have a business requirement where I have to process messages according to priority, say priority1 and priority2.
We have decided to use 2 JMS queues where priority1 messages will be sent to priority1Queue and priority2 messages will be sent to priority2Queue.
The response-time requirement for priority1Queue is that, from the moment a message is in the queue, I need to read it, process it, and send the response to, say, another queue within 1 second. This means I must process these messages as soon as they arrive on priority1Queue. I will have hundreds of such messages per second on priority1Queue, so I will definitely need multiple concurrent consumers on this queue so that each message is consumed and processed within 1 second.
The response-time requirement for priority2Queue is that I need to read, process, and send the response to, say, another queue within 1 minute. So the requirement for priority2 messages is less strict than for priority1 messages, but I still need to respond within a minute.
Can you suggest the best possible approach so that I can read messages from both queues concurrently and give higher priority to priority1 messages, so that each priority1 message is read and processed within 1 second?
Mainly, how can a message be read and fed to a processor so that the next message can be read, and so on?
I need to write a Java-based component that does the reading and processing.
I also need to ensure this component is highly available and doesn't run into OutOfMemory errors. It will run across multiple JVMs and multiple application servers, so I can have multiple clusters running this Java component.
First off, the requirement to process within 1 second is not going to depend on your messaging approach, but on the actual processing of the message and the raw CPUs available. Picking up hundreds of messages per second from a queue is child's play; the JMS provider is most likely not the issue. Depending on your deployment platform (Tomcat, Mule, JEE, whatever), there should be a way to have n listeners to scale up appropriately. Because the messages stay on the queue until you pick them up, it is doubtful you'll run out of memory. I've done these apps and processed many more messages without problems.
Second, there are a number of strategies for prioritizing messages that do not necessarily require different queues, such as using message priorities. I'm leaning towards using message priorities and message filters, where one group of listeners takes care of the highest priority messages and another listener filters off the lower priority ones but makes sure it does enough to get them out within a minute.
You could also do something where a lower priority message gets rewritten back to the same queue with a higher priority, based on how close to 1 minute you are. I know that sounds wrong, but reading/writing from JMS has very little overhead (at least compared to doing the equivalent with column-driven database transactions), and the listener for lower priority messages could just continually increase the priority until the message has to be processed.
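To make the escalation idea concrete, here is a rough sketch of how the new priority could be computed from a message's age. It assumes the standard JMS priority range of 0 to 9 and a linear ramp towards the deadline; the class name, the method name, and the linear rule are illustrative choices, not something prescribed by JMS. The result would be passed as the priority argument when the message is re-sent, e.g. via MessageProducer.send(message, deliveryMode, priority, timeToLive).

```java
// Hypothetical helper: decides which priority a low-priority message should be
// re-sent with, based on how close it is to its deadline.
class PriorityEscalation {
    static final int MAX_PRIORITY = 9; // JMS priorities run from 0 (lowest) to 9 (highest)

    // Linearly raises the priority from basePriority (age 0) to MAX_PRIORITY (deadline reached).
    static int escalate(int basePriority, long ageMillis, long deadlineMillis) {
        if (ageMillis >= deadlineMillis) {
            return MAX_PRIORITY;
        }
        if (ageMillis <= 0) {
            return basePriority;
        }
        int range = MAX_PRIORITY - basePriority;
        return basePriority + (int) (range * ageMillis / deadlineMillis);
    }

    public static void main(String[] args) {
        long deadline = 60_000; // the 1-minute limit for priority2 messages
        System.out.println(escalate(2, 0, deadline));      // 2
        System.out.println(escalate(2, 30_000, deadline)); // 5
        System.out.println(escalate(2, 59_999, deadline)); // 8
        System.out.println(escalate(2, 60_000, deadline)); // 9
    }
}
```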
Or simpler: just have more listeners on the high priority queue/messages than on the lower priority ones. An imbalance in the number of listeners might be all it needs.
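The "more listeners on the high priority queue" option could look roughly like the sketch below. It uses in-memory BlockingQueues as stand-ins for the two JMS queues, and the 8-to-2 split is an arbitrary assumption to be tuned in the PoC; WeightedListeners and startListeners are hypothetical names.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical illustration: the high-priority queue simply gets more listener
// threads than the low-priority queue.
class WeightedListeners {
    // Starts `count` threads that block on the queue and pass each message to the handler.
    static ExecutorService startListeners(BlockingQueue<String> queue, int count,
                                          Consumer<String> handler) {
        ExecutorService pool = Executors.newFixedThreadPool(count);
        for (int i = 0; i < count; i++) {
            pool.execute(() -> {
                try {
                    while (true) {
                        handler.accept(queue.take());
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // shutdownNow() ends the loop
                }
            });
        }
        return pool;
    }

    // Processes 20 high-priority and 5 low-priority messages; returns {highDone, lowDone}.
    static int[] demo() {
        BlockingQueue<String> high = new LinkedBlockingQueue<>();
        BlockingQueue<String> low = new LinkedBlockingQueue<>();
        AtomicInteger highDone = new AtomicInteger();
        AtomicInteger lowDone = new AtomicInteger();
        CountDownLatch allDone = new CountDownLatch(25);

        ExecutorService highPool = startListeners(high, 8, m -> {
            highDone.incrementAndGet();
            allDone.countDown();
        });
        ExecutorService lowPool = startListeners(low, 2, m -> {
            lowDone.incrementAndGet();
            allDone.countDown();
        });

        for (int i = 0; i < 20; i++) high.add("p1-" + i);
        for (int i = 0; i < 5; i++) low.add("p2-" + i);

        try {
            allDone.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            highPool.shutdownNow();
            lowPool.shutdownNow();
        }
        return new int[] {highDone.get(), lowDone.get()};
    }

    public static void main(String[] args) {
        int[] done = demo();
        System.out.println("high=" + done[0] + " low=" + done[1]);
    }
}
```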
Lots of possibilities, time for a PoC.
I am trying to create a Camel route that will process incoming IMAP messages in parallel. The mail component should distribute incoming mails to different threads (but every message should pass the two process steps in order).
Something like this:
from("imap://...")
    .threads(4)
    .process(new FirstProcessor())
    .process(new SecondProcessor());
This seems to send new messages to different threads, but not in parallel (thread n+1 starts after thread n finishes). How can I achieve parallel processing here?
This is not supported by the camel-mail consumer. It processes the mails in sequence using the same thread on the consumer side.
You need to use wireTap, or send the message to a seda queue in no-wait mode, etc.
I need to generate quite a number of reports, and a report can take about 5 minutes to generate (large amount of data, many different sources).
The client will post messages to an Azure Storage Queue. There is a worker role that processes the messages and generates the reports.
If I want to scale this up, let's say I end up with 10 worker roles that process the messages from the queue and generate the reports. Then I will add messages to the queue like this:
message 1: process reports from 1 - 5
message 2: process reports from 6 - 11
........
message 10: process reports from 50 - 55 (the range might not be accurate)
If worker role 1 takes the first message and puts a lock on it, but the processing takes 5 minutes, the lock will expire and the message will become visible in the queue again, so worker role 2 will take it and start processing it, and so forth.
How can I make sure that a queue message is consumed only once, keeping in mind that the task is a long one?
First of all: Using Azure Storage queues, you should be prepared for all of your operations to be idempotent: In case your queue item is processed multiple times, the same result should happen each time. The reason I bring this up: There's simply no way to guarantee you'll process the message one time (unless you check the DequeueCount property of the message and halt processing accordingly), due to unexpected events such as your role instance crashing/rebooting or your queue item processing code doing something unexpected like throwing an exception.
Next: The queue message invisibility timeout can be extended programmatically. This can be done via the queue API or via one of the language SDKs. In C# (something like this, I didn't test it), extending by an additional minute:
queueMessage.UpdateMessage(message,
    TimeSpan.FromSeconds(60),
    MessageUpdateFields.Visibility);
You can also modify the message along the way (maybe as a hint to your code, to let you know which of the 5 reports have been completed. This should help your specific issue: In the event the message gets reprocessed, you don't have to process all five reports if the message has been modified to say something like "process reports from 3-5"). Note: You can combine the MessageUpdateFields flags via |:
// Save progress in the message body and keep the message invisible for another minute
message.SetMessageContent("process reports from 3-5");
queueMessage.UpdateMessage(message,
    TimeSpan.FromSeconds(60),
    MessageUpdateFields.Content | MessageUpdateFields.Visibility);
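The checkpoint idea (rewrite the message as each report completes, so a redelivery resumes where the previous attempt stopped) can be illustrated independently of the Azure SDK. The Java sketch below uses a hypothetical in-memory Message class in place of a real queue message and simulates a crash before report 3; none of these names come from the Azure libraries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of checkpointing progress in the message itself.
class CheckpointedBatch {
    // Stand-in for a queue message whose content ("reports from..to") can be updated.
    static class Message {
        int from;
        final int to;

        Message(int from, int to) {
            this.from = from;
            this.to = to;
        }
    }

    static final List<Integer> generated = new ArrayList<>();

    // Generates reports from..to; after each one, moves `from` forward so that a
    // redelivered message skips the reports that are already done.
    // failBefore simulates a crash just before that report (-1 means no crash).
    static void process(Message message, int failBefore) {
        for (int report = message.from; report <= message.to; report++) {
            if (report == failBefore) {
                throw new IllegalStateException("simulated crash before report " + report);
            }
            generated.add(report);     // the actual report generation would go here
            message.from = report + 1; // checkpoint, analogous to updating the message content
        }
    }

    static List<Integer> demo() {
        generated.clear();
        Message message = new Message(1, 5);
        try {
            process(message, 3); // first worker crashes after reports 1 and 2
        } catch (IllegalStateException e) {
            // the message becomes visible again and is picked up by another worker
        }
        process(message, -1);    // second attempt resumes at report 3
        return new ArrayList<>(generated);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [1, 2, 3, 4, 5]
    }
}
```

Each report is generated exactly once in this run, but only because the crash happens between reports; a crash between generating a report and saving the checkpoint would still repeat that report, which is why the processing should stay idempotent.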
Lastly: If you're concerned with the length of time taken to process a batch of reports, perhaps rethink why you're processing five reports in each message, vs. one report per message. You can always read queue messages in batches. This is getting a bit subjective, as there's really no right or wrong way to do it, but it's just something for you to think about.
We have a scenario where lots of messages from an external system need to be processed asynchronously. The current design is to have a job wake up every 5 minutes to pull messages from the external system, persist the raw messages, and then send each message ID to an ExecutorChannel so that consumers (potentially many) can consume from the channel.
The problem we are facing is how to deal with a system crash while messages are in the queue. Every time the job wakes up, we need to look in our DB to find out whether there are any raw messages that are not already in the queue.
The easiest way is to query the current queue size and check whether there are more raw messages than messages in the queue. So my question is: is there any API on ExecutorChannel to find out the size of the queue? Or any other suggestion?
Thx
Jason
Spring Integration itself doesn't maintain a queue within an ExecutorChannel; the messages are executed by the underlying Executor.
If you are using a Spring ThreadPoolTaskExecutor that is dedicated to the channel, you could drill down to the channel's underlying ThreadPoolTaskExecutor's ThreadPoolExecutor, get a handle to its BlockingQueue (getQueue()), and get its count.
However, you'd have to add the active task count as well.
The total count would be approximate, though, because the ThreadPoolExecutor has no atomic method to get a count of queued and active tasks.
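As a rough sketch of that calculation, using a plain java.util.concurrent.ThreadPoolExecutor (with Spring's ThreadPoolTaskExecutor you would obtain it via getThreadPoolExecutor()). The class and method names here are made up for the example, and the result is only a snapshot since the two reads are not atomic:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: approximate number of tasks that are queued or running.
class PendingTaskCount {
    // Two separate reads, so the sum can be slightly off while tasks start or finish.
    static int pending(ThreadPoolExecutor executor) {
        return executor.getQueue().size() + executor.getActiveCount();
    }

    // Submits 5 blocking tasks to a 2-thread pool: 2 run, 3 wait in the queue.
    static int demo() {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        CountDownLatch started = new CountDownLatch(2);
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < 5; i++) {
            executor.execute(() -> {
                started.countDown();
                try {
                    release.await(); // hold the worker thread until the count is taken
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        try {
            started.await(5, TimeUnit.SECONDS); // make sure both workers are busy
            return pending(executor);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown();
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 5 (2 active + 3 queued)
    }
}
```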