parallel execution and aggregator locking - spring-integration

I am using Spring Integration to process files from some directories, and each file goes through a "flow". I would like to set up the overall processing so that a file poller monitors a directory (or several) for new files. Once a new file has been picked up by the poller it should be passed to the subsequent components in the flow, where it is processed without holding up the polling process. The second aspect is that all new files go through a few steps before they are aggregated by an aggregator based on, for example, the number of files (the criteria vary across directories). Once enough files have accumulated they are released from the aggregator and then processed by some time-consuming steps after the aggregator. So the overall process looks like this:
file-A picked up
file-A passed from poller to step1
file-A passed from step1 to aggregator
file-B picked up
file-B passed from poller to step1
file-B passed from step1 to aggregator
file-C picked up
file-C passed from poller to step1
file-C passed from step1 to aggregator
files A,B and C are released from aggregator
files A,B and C are processed by final-step
So overall there are two requirements:
1. Process each file in a separate thread to maximize the number of files being processed concurrently.
2. Files released from the aggregator belong to a correlation id; we want only one group of messages with the same correlation id to be processed by final-step at a time.
Here is how I attempted to satisfy these two requirements. For #1 I simply used a queue channel after the file poller where the new files are dropped, and step-a picks up files from that queue. This detaches the polling process, and the idea was to use a task executor on the step-a service activator so that each file is processed in its own thread.
The second requirement was automatically handled by simply executing final-step after the aggregator on the same thread as the aggregator. Since the aggregator takes a lock based on the correlation id, if another group is released for the same correlation id it simply waits until the previous group with that correlation id has finished processing.
The problem I ran into is that #1 wasn't being fulfilled, because the service activator was waiting for the current thread to complete before creating another thread for the second file. That makes having a task executor on the service activator pointless: it only creates the second thread after the first one completes. To fix this I replaced the queued channel with a dispatcher channel and placed the executor on the dispatcher channel. Now each file is processed in a separate thread and multiple files are processed at the same time.
Now for the second part: since the components after the aggregator are time consuming, I wanted to disconnect that processing from the first part, so I placed a queued channel after the aggregator. But with this approach the locking behavior I was previously getting from the aggregator is gone, because the thread that released the messages from the aggregator completes at the queued channel, before the final time-consuming step.
Any thoughts on the overall process? How can I accomplish both of my requirements while running things in parallel?
Thanks

It's not at all clear what your question is. Yes, for the downstream flow to run under the lock, it must run on the final "releasing" thread (the thread processing the last inbound message that completes the group), you can't use a queue or executor channel downstream of the aggregator.
However, this has no impact on the threads from the executor channel; other groups (with different correlation ids) will still be processed. But if you are using the same correlation id for the "next" group, its threads will block.
If you are saying you want to assemble the next group (with the same correlation id) while the first one is processing downstream, you'll have to use some other mechanism to enforce single-threading downstream - such as an executor channel with a single-thread executor, or another lock registry.
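For illustration, here is a rough Java DSL sketch of one way to wire this up along the lines suggested above; the directory, the correlation expression, the release size of 3, and the step1/finalStep service beans are all placeholders, not taken from the question:

import java.io.File;
import java.util.concurrent.Executors;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.dsl.IntegrationFlow;
import org.springframework.integration.dsl.IntegrationFlows;
import org.springframework.integration.dsl.Pollers;
import org.springframework.integration.file.dsl.Files;

@Configuration
public class FileFlowSketch {

    @Bean
    public IntegrationFlow fileFlow(Step1 step1, FinalStep finalStep) {
        return IntegrationFlows
                .from(Files.inboundAdapter(new File("/in/dir")),
                        e -> e.poller(Pollers.fixedDelay(1000)))
                // hand each picked-up file to its own thread so polling is never held up
                .channel(c -> c.executor(Executors.newCachedThreadPool()))
                .handle(step1, "process")
                .aggregate(a -> a
                        .correlationExpression("headers['sourceDirectory']") // placeholder expression
                        .releaseStrategy(group -> group.size() >= 3))
                // a single-thread executor serializes ALL released groups; if only groups
                // with the same correlation id must be serialized, a lock registry keyed
                // on that id is the alternative mentioned above
                .channel(c -> c.executor(Executors.newSingleThreadExecutor()))
                .handle(finalStep, "process")
                .get();
    }
}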

Related

How can I debug and see what gets stored over time into the FIFO queue of inter-thread communication plugin?

I have the following JMeter context:
In one Concurrency Thread Group 1, I have a JSR223 Sampler which sends request messages to an MQ queue1 and always gets the JMSMessageID and an epochTimestamp (derived from JMS_IBM_PutDate + JMS_IBM_PutTime) and puts them into one variable. Underneath this Sampler is an Inter-Thread Communication PostProcessor element which gets the data from this variable and puts it into a FIFO QUEUE.
In another Concurrency Thread Group 2, I have another JSR223 Sampler with code to get the response messages for all the messages sent on MQ queue 1 from an MQ queue2.
To do this (and to be able to calculate the response time for each message), before the JSR223 Sampler executes I use the Inter-Thread Communication PreProcessor element, which gets a message ID and a timestamp from the FIFO queue (60 seconds timeout) and passes them to a variable that the JSR223 Sampler can use to calculate the request-response time for each message.
I want to stress-test the system, which is why I am gradually and dynamically increasing the desired requests per second every minute (for script testing purposes) in both thread groups, like so:
I use the tstFeedback function of Concurrency Thread Group for this:
${__tstFeedback(ThroughputShapingTimerIn,1,1000,10)}
My problem is this:
When I gradually increase the desired TPS load, during the first 4 target TPS steps the consumer threads keep up (stay synchronized) with the producer threads, but as time passes and the load increases, the consumer threads seem to take more and more time to find and consume the messages. It's as though the consumer threads can no longer keep up with the load of the producer threads, despite both thread groups having the same load pattern. This eventually causes queue2, which holds the response messages, to fill up. Here is a visual representation of what I mean:
The consumer samples end up being far fewer than the producer samples. My expectation is that they should be more or less equal...
I need to understand how I can go about debugging this script and isolating the cause:
I think something happens at the inter-thread synchronization level, because sometimes I get null values from the FIFO queue in the consumer threads - I need to understand what gets put into that FIFO queue and what gets taken off of it.
How can I print what is present in the FIFO list at each iteration?
Does anyone have any suggestions for what could be the cause of this behavior and how to mitigate it?
Any help/suggestion is greatly appreciated.
First of all, take a look at the jmeter.log file; you have at least 865 errors there, so I strongly doubt your Groovy scripts are doing what they're supposed to be doing.
Don't run your test in GUI mode; it's only for test development and debugging. When it comes to execution you should be using command-line non-GUI mode.
When you call __fifoPop() you can save the value into a JMeter variable like ${__fifoPop(queue-name,some-variable)}; the variable can be visualized using the Debug Sampler. The size of the queue can be checked using the __fifoSize() function.
Alternatively, my expectation is that such a Groovy expert as you shouldn't have any problems printing queue items in Groovy code.
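As a rough sketch (not the original answer's code): with ${__fifoSize(messageQueue)} in the JSR223 element's Parameters field, and the popped value already saved into a variable by __fifoPop or the Inter-Thread Communication PreProcessor, a few lines of Groovy can log both on every iteration. The names 'messageQueue' and 'messageData' are placeholders:

// JSR223 PostProcessor or Sampler, language = groovy
// Parameters field: ${__fifoSize(messageQueue)}
def queueSize = args.length > 0 ? args[0] : 'n/a'   // __fifoSize result passed in as a parameter
def taken = vars.get('messageData')                 // value previously taken off the FIFO queue
log.info('Iteration ' + vars.getIteration() + ': FIFO size=' + queueSize + ', value taken=' + taken)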

Process messages from different inbounds in the same thread

I have two inbound-channel-adapter which collect files from two distinct sources.
I'd like to process the incoming files one at a time, by the same instance of the service-activator and on the same thread. At the moment, since there are two distinct Pollers, the files are actually processed by two different threads concurrently.
I thought that using a queueChannel to feed my service-activator would have solved the problem but I don't want to introduce another Poller (and hence, another delay).
Any idea?
Use an ExecutorChannel with an Executors.newSingleThreadExecutor().
You can also use a QueueChannel with a fixedDelay of 0; the poller blocks in the queue for 1 second by default (and can be increased - receiveTimeout) so with a 0 delay between polls, no additional latency will be added.
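For example, a minimal sketch of the first option (the channel and bean names are placeholders); both inbound-channel-adapters send to this channel and the service-activator reads from it:

import java.util.concurrent.Executors;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.channel.ExecutorChannel;
import org.springframework.messaging.MessageChannel;

@Configuration
public class SingleThreadChannelConfig {

    // One executor thread handles all files, one at a time, in arrival order,
    // without introducing another poller or polling delay.
    @Bean
    public MessageChannel filesIn() {
        return new ExecutorChannel(Executors.newSingleThreadExecutor());
    }
}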

Mule Exhausted Action RUN vs WAIT. Which one to choose and when

I have a question about the Mule threading profile Exhausted_Action. From the documentation, I understand that when the action is WAIT, any new request beyond maxActive will wait for a thread to become available, whereas RUN causes the original thread to process the request itself. Based on that, I thought WAIT was the better choice rather than RUN. However, it appears Mule has all the default values set to RUN. I just want to hear comments on my understanding, on the differences between these two actions, and on how to decide which one to use when.
Your understanding of WAIT and RUN is correct.
The reason all the default values are RUN is that message processing is then not stopped by the unavailability of a flow thread: since the original (receiver) thread is waiting anyway for a flow thread to take the message and process it, why not have it process the message itself? (This is my opinion.)
But there is a downside to using RUN.
Example: the number of receiver threads is restricted to 2, and the flow's processing strategy allows only one flow thread:
<asynchronous-processing-strategy name="customAsynchronous" maxThreads="1" />
<flow name="sample" processingStrategy="customAsynchronous" >
<file:inbound-endpoint ......>
............
..........
</flow>
File sizes: 1MB, 50MB, 100MB, 1MB, 5MB.
In the above flow, when 5 files come in, 3 files are processed right away, as there is 1 flow thread available plus 2 file receiver threads (Exhausted_Action = RUN). The flow thread finishes its processing quickly, since the first file is small, and then keeps waiting for the next message. Unfortunately the receiver threads, whose job is to pick up the next file and hand it to the flow thread for processing, are busy processing the big files. So there is a chance of receiver threads getting stuck in time-consuming processing while the flow threads sit waiting.
So it always depends on the use case you are dealing with.
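For reference, a rough sketch of where that action is configured in Mule 3 (the connector name is a placeholder, and attribute names may differ slightly between Mule versions):

<!-- Sketch only: with WAIT the receiver thread blocks until a flow thread is free
     (up to threadWaitTimeout); with RUN it processes the message itself, which is
     how a receiver thread can get stuck on a big file as described above. -->
<file:connector name="fileConnector">
    <receiver-threading-profile maxThreadsActive="2" poolExhaustedAction="WAIT"/>
</file:connector>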
Hope this helps.

Spring integration queue, any way to get current queue size?

We have a scenario where lots of messages from an external system need to be processed asynchronously. The current design is to have a job wake up every 5 minutes to pull messages from the external system, persist the raw messages, and then send the message ids to an ExecutorChannel so that consumers (potentially many) can consume from the channel.
The problem we are facing is how to deal with a system crash while messages are in the queue; every time the job wakes up, we need to look into our DB to find out whether there are any raw messages not already in the queue.
The easiest way would be to query the current queue size and find out whether there are more raw messages than messages in the queue. So the question I have is: is there any API for ExecutorChannel to find out the size of its queue? Or any other suggestion?
Thx
Jason
Spring Integration itself doesn't maintain a queue within an ExecutorChannel; the messages are executed by the underlying Executor.
If you are using a Spring ThreadPoolTaskExecutor that is dedicated to the channel, you could drill down from it to the underlying ThreadPoolExecutor, get a handle to its BlockingQueue (getQueue()), and get its count.
However, you'd have to add the active task count as well.
The total would only be approximate, though, because the ThreadPoolExecutor has no atomic method to get a combined count of queued and active tasks.
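A rough sketch of that drill-down, assuming a ThreadPoolTaskExecutor bean dedicated to the channel (the class and method names here are placeholders):

import java.util.concurrent.ThreadPoolExecutor;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

public final class ExecutorBacklog {

    private ExecutorBacklog() {
    }

    // Approximate backlog for the channel: tasks still queued in the executor
    // plus tasks currently executing. Only a snapshot - the two reads are not atomic.
    public static int approximateBacklog(ThreadPoolTaskExecutor channelExecutor) {
        ThreadPoolExecutor pool = channelExecutor.getThreadPoolExecutor();
        return pool.getQueue().size() + pool.getActiveCount();
    }
}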

Signalling a producer task from a consumer task when working with a BlockingCollection

I have a pretty basic application that uses a Producer task and a Consumer task to work with files. It is based on the example here: http://msdn.microsoft.com/en-us/library/dd267312.aspx
The basic idea of the program is that the Producer task enumerates the files on my hard drive, calculates their hash values and does a few other things. Once the Producer has finished working with a file, it enqueues the file and the Consumer then grabs it.
The Consumer task has to connect to a remote server and attempt to upload the file. However, if the Consumer encounters an error, such as not being able to connect to the remote server, I need it to signal the Producer task that it should stop what it is doing and terminate. If the server is down, or goes down, there is no need for the Producer to continue cycling through thousands of files.
I have seen plenty of samples of signalling the Consumer task from the Producer task by using .CompleteAdding() on the BlockingCollection object but I am lost as to how to send a signal to the Producer from the Consumer that it should stop producing.
You could use a return queue. If one of the items generates an error/exception, you could load it up with error data and queue it back to the producer. The producer should TryTake() from the return queue just before generating a new item and handle any returned item appropriately. This beats using some atomic boolean by enabling the item to signal back with extended error information that could be used to decide what action to take - the producer may not always want/need to stop. Also, you could then queue up errored items to a GUI list and/or logger.
It's tempting to say that the consumer should return items anyway, whether they are errored or not, so that they can be re-used instead of creating new ones all the time. This, however, introduces latency in detecting/acting on errors unless you use two return queues to prioritize error returns.
Oh - another thing - with the above design, if the producer does have to stop, it could retain the errored items in a local queue and re-issue one occasionally. If the server comes back up (as indicated by the return of a successful item), the producer could re-issue the errored jobs from the local queue before generating any more new ones. With care, this could make your upload system resilient to server reboots.
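A minimal sketch of the return-queue idea, in Java rather than the question's C# (the WorkItem type and the upload step are hypothetical):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class ReturnQueueExample {

    // Hypothetical item: carries the work, plus error details on the way back.
    record WorkItem(String path, String error) { }

    private final BlockingQueue<WorkItem> workQueue = new LinkedBlockingQueue<>();
    private final BlockingQueue<WorkItem> returnQueue = new LinkedBlockingQueue<>();

    // Producer: before generating a new item, check the return queue (the
    // non-blocking equivalent of TryTake()) and react to any error details.
    void produce(Iterable<String> paths) throws InterruptedException {
        for (String path : paths) {
            WorkItem returned = returnQueue.poll();
            if (returned != null && returned.error() != null) {
                // e.g. server unreachable: stop producing, or park the item for a later retry
                break;
            }
            workQueue.put(new WorkItem(path, null));
        }
    }

    // Consumer: on failure, send the item back with error information rather
    // than just flipping a shared boolean flag.
    void consume() throws InterruptedException {
        WorkItem item = workQueue.take();
        try {
            upload(item); // hypothetical upload to the remote server
        } catch (Exception e) {
            returnQueue.put(new WorkItem(item.path(), e.getMessage()));
        }
    }

    private void upload(WorkItem item) {
        // connect and upload; omitted
    }
}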
