JmsOutboundGateway : Manual Start and Stop - spring-integration

We have an application with around 75 partitioned steps spread across 100 jobs. Our configuration for the outbound gateway is:
<int-jms:outbound-gateway
id="outbound-gateway_1"
auto-startup="true"
connection-factory="jmsConnectionFactory"
request-channel="jms.requests_1"
request-destination="jms.requestsQueue"
reply-channel="jms.reply_1"
reply-destination="jms.repliesQueue"
receive-timeout="${timeout}"
correlation-key="JMSCorrelationID" >
<int-jms:reply-listener receive-timeout="1000"/>
</int-jms:outbound-gateway>
With auto-startup="true" we see a replyListener thread for each outbound gateway. To remove this extra load and resource consumption, we changed to auto-startup="false" and added a step listener to the partitioned step that starts and stops the gateway in its beforeStep and afterStep methods. At server startup the replyListener threads are absent, as expected. They appear during step execution but are not removed after the call to stop() on the outbound gateway, even after waiting a prolonged period.
Is something else needed to clean up the replyListener?

OK, I see what you mean. That looks like:
while ((active = isActive()) && !isRunning()) {
    if (interrupted) {
        throw new IllegalStateException("Thread was interrupted while waiting for " +
                "a restart of the listener container, but container is still stopped");
    }
    if (!wasWaiting) {
        decreaseActiveInvokerCount();
    }
    wasWaiting = true;
    try {
        lifecycleMonitor.wait();
    }
    catch (InterruptedException ex) {
        // Re-interrupt current thread, to allow other threads to react.
        Thread.currentThread().interrupt();
        interrupted = true;
    }
}
This is in DefaultMessageListenerContainer.AsyncMessageListenerInvoker.executeOngoingLoop().
The key point is lifecycleMonitor.wait(): while the container is stopped but still active, the invoker thread waits for a restart instead of exiting. Note also the message of the IllegalStateException.
I'm not sure what the purpose of such a design is, but we don't have a choice other than to live with it as is.
Furthermore, the logic in start() is based on the this.pausedTasks local cache, which is filled during doInitialize() if the container is not running (!this.running).
Feel free to raise a JIRA, if you think that logic must be changed somehow.
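To make the effect of that loop concrete, here is a minimal sketch in plain Java. It is a simplified model of the "stopped but still active" behaviour, not Spring's actual code: PausableWorker and its start/stop/shutdown methods are hypothetical names chosen for illustration. It shows why a thread survives stop() and only goes away on a full shutdown.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class PausableWorkerDemo {

    /** Hypothetical stand-in for a listener invoker; NOT the Spring class. */
    static class PausableWorker implements Runnable {
        private final Object lifecycleMonitor = new Object();
        private boolean active = true;    // becomes false only on shutdown()
        private boolean running = false;  // toggled by start() and stop()
        final AtomicInteger iterations = new AtomicInteger();

        void start() {
            synchronized (lifecycleMonitor) { running = true; lifecycleMonitor.notifyAll(); }
        }

        void stop() {
            synchronized (lifecycleMonitor) { running = false; lifecycleMonitor.notifyAll(); }
        }

        void shutdown() {
            synchronized (lifecycleMonitor) { active = false; running = false; lifecycleMonitor.notifyAll(); }
        }

        @Override
        public void run() {
            while (true) {
                synchronized (lifecycleMonitor) {
                    // Same shape as the quoted loop: active but not running means wait, not exit.
                    while (active && !running) {
                        try {
                            lifecycleMonitor.wait();
                        } catch (InterruptedException ex) {
                            Thread.currentThread().interrupt();
                            return;
                        }
                    }
                    if (!active) {
                        return; // only a shutdown lets the thread end
                    }
                }
                iterations.incrementAndGet(); // stands in for one receive attempt
                try {
                    Thread.sleep(5);
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        PausableWorker worker = new PausableWorker();
        Thread thread = new Thread(worker, "replyListener-model");
        thread.start();
        worker.start();
        Thread.sleep(50);
        worker.stop();
        Thread.sleep(50);
        System.out.println("alive after stop: " + thread.isAlive());
        worker.shutdown();
        thread.join(1000);
        System.out.println("alive after shutdown: " + thread.isAlive());
    }
}
```

In this model the thread is parked on the monitor after stop(), which matches what you observe: the thread is idle but still present until the owner is shut down.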

Related

Run Spring Integration flow concurrently for each Ftp file

I have an integration flow configured with the Java DSL. It pulls files from an FTP server using Ftp.inboundAdapter, transforms each one into a JobRequest, and then a .handle() method triggers my batch job. Everything works as required, but the files in the FTP folder are processed sequentially.
I logged the current thread name in my transformer endpoint, and it printed the same thread name for each file.
Here is what I have tried so far:
1. Task executor bean
@Bean
public TaskExecutor taskExecutor() {
    return new SimpleAsyncTaskExecutor("Integration");
}
2. Integration flow
@Bean
public IntegrationFlow integrationFlow(JobLaunchingGateway jobLaunchingGateway) throws IOException {
    return IntegrationFlows.from(Ftp.inboundAdapter(myFtpSessionFactory)
                    .remoteDirectory("/bar")
                    .localDirectory(localDir.getFile()),
            c -> c.poller(Pollers.fixedRate(1000).taskExecutor(taskExecutor()).maxMessagesPerPoll(20)))
            .transform(fileMessageToJobRequest(importUserJob(step1())))
            .handle(jobLaunchingGateway)
            .log(LoggingHandler.Level.WARN, "headers.id + ': ' + payload")
            .route(JobExecution.class, j -> j.getStatus().isUnsuccessful() ? "jobFailedChannel" : "jobSuccessfulChannel")
            .get();
}
3. I also read in another SO thread that I need an ExecutorChannel, so I configured one, but I don't know how to inject this channel into my Ftp.inboundAdapter. From the logs I see that the channel is always integrationFlow.channel#0, which I guess is a DirectChannel.
@Bean
public MessageChannel inputChannel() {
    return new ExecutorChannel(taskExecutor());
}
I don't know what I'm missing here, or I may not have properly understood the Spring messaging system, as I'm very new to Spring and Spring Integration.
Any help is appreciated
Thanks
You can simply inject the ExecutorChannel into the flow, and the framework will apply it to the SourcePollingChannelAdapter. So, with that inputChannel defined as a bean, you just add this:
.channel(inputChannel())
before your .transform(fileMessageToJobRequest(importUserJob(step1()))).
See more in docs: https://docs.spring.io/spring-integration/docs/current/reference/html/dsl.html#java-dsl-channels
On the other hand, to process your files in parallel according to your .taskExecutor(taskExecutor()) configuration, you just need to set .maxMessagesPerPoll() to 1 instead of 20. The logic in the AbstractPollingEndpoint is like this:
this.taskExecutor.execute(() -> {
    int count = 0;
    while (this.initialized && (this.maxMessagesPerPoll <= 0 || count < this.maxMessagesPerPoll)) {
        if (pollForMessage() == null) {
            break;
        }
        count++;
    }
});
So we do have tasks running in parallel, but each task first polls sequentially until it reaches maxMessagesPerPoll, which is 20 in your current case. There is also some explanation in the docs: https://docs.spring.io/spring-integration/docs/current/reference/html/messaging-endpoints.html#endpoint-pollingconsumer
The maxMessagesPerPoll property specifies the maximum number of messages to receive within a given poll operation. This means that the poller continues calling receive() without waiting, until either null is returned or the maximum value is reached. For example, if a poller has a ten-second interval trigger and a maxMessagesPerPoll setting of 25, and it is polling a channel that has 100 messages in its queue, all 100 messages can be retrieved within 40 seconds. It grabs 25, waits ten seconds, grabs the next 25, and so on.
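The batching described in that quote can be reproduced with a small stand-alone sketch. This is not the framework code: a plain in-memory Queue stands in for the message source, and pollOnce and drain are hypothetical helper names. The loop condition is copied from the excerpt above (minus the initialized flag).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class MaxMessagesPerPollDemo {

    /** One poll cycle: receive until the source is empty or the cap is hit. */
    static int pollOnce(Queue<String> source, int maxMessagesPerPoll) {
        int count = 0;
        // A cap of zero or less means "no limit", as in the quoted condition.
        while (maxMessagesPerPoll <= 0 || count < maxMessagesPerPoll) {
            if (source.poll() == null) {
                break; // nothing left to receive in this cycle
            }
            count++;
        }
        return count;
    }

    /** Polls repeatedly until the source is empty; returns messages taken per cycle. */
    static List<Integer> drain(int messages, int maxMessagesPerPoll) {
        Queue<String> source = new ArrayDeque<>();
        for (int i = 0; i < messages; i++) {
            source.add("file-" + i);
        }
        List<Integer> perPoll = new ArrayList<>();
        while (!source.isEmpty()) {
            perPoll.add(pollOnce(source, maxMessagesPerPoll));
        }
        return perPoll;
    }

    public static void main(String[] args) {
        System.out.println(drain(100, 25));       // prints [25, 25, 25, 25]
        System.out.println(drain(100, 1).size()); // prints 100
    }
}
```

With a cap of 25, each cycle handles 25 messages one after another inside a single task. With a cap of 1, every message gets its own cycle, and it is those separate cycles that a task executor can run on different threads.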

MSMQ architecture with dedicated processors per database

I have a web application in ASP.NET MVC and C#, and one specific use case takes a long time to process, so users have to wait until it completes. I want to use MSMQ and hand the heavy work off to a dedicated MSMQ consumer. Our application has multiple clients and each client has its own SQL database, so 100 clients means 100 separate SQL databases. The real challenge is to make the process faster using MSMQ while ensuring that one client's task does not affect the performance of the others. I have two solutions:
Option-1: A unique MSMQ private queue per database, so in my case 100 queues and growing, with one dedicated console application listening to each queue, so 100 processors (console applications).
Option-2: 1 big MSMQ private queue for all databases
A: 1 dedicated MSMQ consumer per database so 100 processors
B: 1 MSMQ consumer that listens to the big MSMQ
I want to stick with Option-1, but I would like to know whether it is a feasible, enterprise-grade solution.
You actually have two questions.
First, how do you set processor affinity for SQL Server?
Select the database in SQL Server Management Studio, right-click and follow this..
Clean your database regularly:
DBCC FREEPROCCACHE;
DBCC DROPCLEANBUFFERS;
For MSMQ, turn on journaling, but also consider another queuing system such as RabbitMQ, or write a simple one to enqueue the jobs, as in this sample:
public class MultiThreadQueue
{
BlockingCollection<string> _jobs = new BlockingCollection<string>();
public MultiThreadQueue(int numThreads)
{
for (int i = 0; i < numThreads; i++)
{
var thread = new Thread(OnHandlerStart)
{ IsBackground = true };//Mark 'false' if you want to prevent program exit until jobs finish
thread.Start();
}
}
public void Enqueue(string job)
{
if (!_jobs.IsAddingCompleted)
{
_jobs.Add(job);
}
}
public void Stop()
{
//This will cause '_jobs.GetConsumingEnumerable' to stop blocking and exit when it's empty
_jobs.CompleteAdding();
}
private void OnHandlerStart()
{
foreach (var job in _jobs.GetConsumingEnumerable(CancellationToken.None))
{
Console.WriteLine(job);
Thread.Sleep(10);
}
}
}
Hope this helps :)
The question has been reworded; he meant something else when he said "processors".
Update: added a consumer pattern with OnPeekCompleted.
You really need to post some code!
Consider using the OnPeekCompleted method. If there is an error, you can leave the message on the queue.
If you have some kind of header that identifies the message, you can switch to a different dedicated thread.
private static void OnPeekCompleted(Object sourceQueue, PeekCompletedEventArgs asyncResult)
{
// Set up and connect to the queue.
MessageQueue mq = (MessageQueue)sourceQueue;
// gets a new transaction going
using (var txn = new MessageQueueTransaction())
{
try
{
// retrieve message and process
txn.Begin();
// Receive (and remove) the message within the transaction.
var message = mq.Receive(txn);
#if DEBUG
// Display message information on the screen.
if (message != null)
{
Console.WriteLine("{0}: {1}", message.Label, (string)message.Body);
}
#endif
// message will be removed on txn.Commit.
txn.Commit();
}
catch (Exception ex)
{
// If there is an error you can leave the message on the queue, don't remove message from queue
Console.WriteLine(ex.ToString());
txn.Abort();
}
}
// Restart the asynchronous peek operation.
mq.BeginPeek();
}
You can also use a service broker

Akka.net Ask timeout when used in Azure WebJob

At work we have some code in an Azure WebJob where we use RabbitMQ.
The basic workflow is this
A message arrives on RabbitMQ Queue
We have a message handler for the incoming message
Within the message handler we start a top level (user) supervisor actor where we "ask" it to handle the message
The supervisor actor hierarchy is like this
And the relevant top level code is something like this (this is the WebJob code)
static void Main(string[] args)
{
try
{
//Bootstrap akka IoC resolver well ahead of any actor usages
new AutoFacDependencyResolver(ContainerOperations.Instance.Container, ContainerOperations.Instance.Container.Resolve<ActorSystem>());
var system = ContainerOperations.Instance.Container.Resolve<ActorSystem>();
var busQueueReader = ContainerOperations.Instance.Container.Resolve<IBusQueueReader>();
var dateTime = ContainerOperations.Instance.Container.Resolve<IDateTime>();
busQueueReader.AddHandler<ProgramCalculationMessage>("RabbitQueue", x =>
{
//This is code that gets called whenever we have a RabbitMQ message arrive
try
{
//SupervisorActor is a singleton
var supervisorActor = ContainerOperations.Instance.Container.ResolveNamed<IActorRef>("SupervisorActor");
var actorMessage = new SomeActorMessage();
var supervisorRunTask = supervisorActor.Ask(actorMessage, TimeSpan.FromMinutes(25));
//we want to wait this guy out
var supervisorRunResult = supervisorRunTask.GetAwaiter().GetResult();
switch (supervisorRunResult)
{
case CompletedEvent completed:
{
break;
}
case FailedEvent failed:
{
throw failed.Exception;
}
}
}
catch (Exception ex)
{
_log.Error(ex, "Error found in Webjob");
//throw it for the actual RabbitMqQueueReader Handler so message gets NACK
throw;
}
});
Thread.Sleep(Timeout.Infinite);
}
catch (Exception ex)
{
_log.Error(ex, "Error found");
throw;
}
}
And this is the relevant IOC code (we are using Autofac + Akka.NET DI for Autofac)
builder.RegisterType<SupervisorActor>();
_actorSystem = new Lazy<ActorSystem>(() =>
{
var akkaconf = ActorUtil.LoadConfig(_akkaConfigPath).WithFallback(ConfigurationFactory.Default());
return ActorSystem.Create("WebJobSystem", akkaconf);
});
builder.Register<ActorSystem>(cont => _actorSystem.Value);
builder.Register(cont =>
{
var system = cont.Resolve<ActorSystem>();
return system.ActorOf(system.DI().Props<SupervisorActor>(),"SupervisorActor");
})
.SingleInstance()
.Named<IActorRef>("SupervisorActor");
The problem
So the code is working fine and doing what we want it to, apart from the Akka.Net "ask" timeout shown above in the WebJob code.
Annoyingly, this seems to work fine when I run the WebJob locally, where I can simulate an "ask" timeout by providing a new supervisorActor that simply never responds with a message back to the "Sender".
This works perfectly running on my machine, but when we run this code in Azure, we DO NOT see a Timeout for the "ask" even though one of our workflow runs exceeded the "ask" timeout by a mile.
I just don't know what could be causing this behavior, does anyone have any ideas?
Could there be some Azure specific config value for the WebJob that I need to set.
The answer to this was to use the async RabbitMQ handlers, which apparently came out in v5.0 of the C# RabbitMQ client. The official docs still show the sync usage (sadly).
This article is quite good : https://gigi.nullneuron.net/gigilabs/asynchronous-rabbitmq-consumers-in-net/
Once we did this, all was good

How to parallelize an azure worker role?

I have got a Worker Role running in azure.
This worker processes a queue containing a large number of integers. For each integer I have to do fairly long processing (from 1 second to 10 minutes, depending on the integer).
As this is quite time consuming, I would like to do this processing in parallel. Unfortunately, my parallelization does not seem to be efficient when I test with a queue of 400 integers.
Here is my implementation :
public class WorkerRole : RoleEntryPoint {
private readonly CancellationTokenSource cancellationTokenSource = new CancellationTokenSource();
private readonly ManualResetEvent runCompleteEvent = new ManualResetEvent(false);
private readonly Manager _manager = Manager.Instance;
private static readonly LogManager logger = LogManager.Instance;
public override void Run() {
logger.Info("Worker is running");
try {
this.RunAsync(this.cancellationTokenSource.Token).Wait();
}
catch (Exception e) {
logger.Error(e, 0, "Error Run Worker: " + e);
}
finally {
this.runCompleteEvent.Set();
}
}
public override bool OnStart() {
bool result = base.OnStart();
logger.Info("Worker has been started");
return result;
}
public override void OnStop() {
logger.Info("Worker is stopping");
this.cancellationTokenSource.Cancel();
this.runCompleteEvent.WaitOne();
base.OnStop();
logger.Info("Worker has stopped");
}
private async Task RunAsync(CancellationToken cancellationToken) {
while (!cancellationToken.IsCancellationRequested) {
try {
_manager.ProcessQueue();
}
catch (Exception e) {
logger.Error(e, 0, "Error RunAsync Worker: " + e);
}
}
await Task.Delay(1000, cancellationToken);
}
}
}
And the implementation of the ProcessQueue:
public void ProcessQueue() {
try {
_queue.FetchAttributes();
int? cachedMessageCount = _queue.ApproximateMessageCount;
if (cachedMessageCount != null && cachedMessageCount > 0) {
var listEntries = new List<CloudQueueMessage>();
listEntries.AddRange(_queue.GetMessages(MAX_ENTRIES));
Parallel.ForEach(listEntries, ProcessEntry);
}
}
catch (Exception e) {
logger.Error(e, 0, "Error ProcessQueue: " + e);
}
}
And ProcessEntry
private void ProcessEntry(CloudQueueMessage entry) {
try {
int id = Convert.ToInt32(entry.AsString);
Service.GetData(id);
_queue.DeleteMessage(entry);
}
catch (Exception e) {
_queueError.AddMessage(entry);
_queue.DeleteMessage(entry);
logger.Error(e, 0, "Error ProcessEntry: " + e);
}
}
In the ProcessQueue function, I tried different values of MAX_ENTRIES: first 20 and then 2.
It seems to be slower with MAX_ENTRIES=20, but whatever the value of MAX_ENTRIES is, it seems quite slow.
My VM is an A2 medium.
I really don't know whether I am doing the parallelization correctly; maybe the problem comes from the work itself (which may be hard to run in parallel).
You haven't mentioned which Azure Messaging Queuing technology you are using, however for tasks where I want to process multiple messages in parallel I tend to use the Message Pump Pattern on Service Bus Queues and Subscriptions, leveraging the OnMessage() method available on both Service Bus Queue and Subscription Clients:
QueueClient OnMessage() - https://msdn.microsoft.com/en-us/library/microsoft.servicebus.messaging.queueclient.onmessage.aspx
SubscriptionClient OnMessage() - https://msdn.microsoft.com/en-us/library/microsoft.servicebus.messaging.subscriptionclient.onmessage.aspx
An overview of how this stuff works :-) - http://fabriccontroller.net/blog/posts/introducing-the-event-driven-message-programming-model-for-the-windows-azure-service-bus/
From MSDN:
When calling OnMessage(), the client starts an internal message pump
that constantly polls the queue or subscription. This message pump
consists of an infinite loop that issues a Receive() call. If the call
times out, it issues the next Receive() call.
This pattern allows you to use a delegate (or anonymous function in my preferred case) that handles the receipt of the Brokered Message instance on a separate thread on the WaWorkerHost process. In fact, to increase the level of throughput, you can specify the number of threads that the Message Pump should provide, thereby allowing you to receive and process 2, 4, 8 messages from the queue in parallel. You can additionally tell the Message Pump to automagically mark the message as complete when the delegate has successfully finished processing the message. Both the thread count and AutoComplete instructions are passed in the OnMessageOptions parameter on the overloaded method.
public override void Run()
{
var onMessageOptions = new OnMessageOptions()
{
AutoComplete = true, // Message-Pump will call Complete on messages after the callback has completed processing.
MaxConcurrentCalls = 2 // Max number of threads the Message-Pump can spawn to process messages.
};
sbQueueClient.OnMessage((brokeredMessage) =>
{
// Process the Brokered Message Instance here
}, onMessageOptions);
RunAsync(_cancellationTokenSource.Token).Wait();
}
You can still leverage the RunAsync() method to perform additional tasks on the main Worker Role thread if required.
Finally, I would also recommend that you look at scaling your Worker Role instances out to a minimum of 2 (for fault tolerance and redundancy) to increase your overall throughput. From what I have seen with multiple production deployments of this pattern, OnMessage() performs perfectly when multiple Worker Role Instances are running.
A few things to consider here:
Are your individual tasks CPU intensive? If so, parallelism may not help. However, if they are mostly waiting on data processing tasks to be processed by other resources, parallelizing is a good idea.
If parallelizing is a good idea, consider not using Parallel.ForEach for queue processing. Parallel.ForEach has two issues that prevent it from being optimal:
The code will wait until all kicked-off threads finish processing before moving on. So, if you have 5 threads that need 10 seconds each and 1 thread that needs 10 minutes, the overall processing time for Parallel.ForEach will be 10 minutes.
Even though you are assuming that all of the threads will start processing at the same time, Parallel.ForEach does not work this way. It looks at the number of cores on your server and other parameters and generally only kicks off the number of threads it thinks it can handle, without knowing much about what's in those threads. So, if you have a lot of non-CPU-bound threads that can be kicked off at the same time without causing CPU over-utilization, the default behaviour is unlikely to run them optimally.
How to do this optimally:
I am sure there are a ton of solutions out there, but for reference, the way we've architected it in CloudMonix (which must kick off hundreds of independent threads and complete them as fast as possible) is by using ThreadPool.QueueUserWorkItem and manually keeping track of the number of threads that are running.
Basically, we use a thread-safe collection to keep track of running threads that are started by ThreadPool.QueueUserWorkItem. Once threads complete, we remove them from that collection. The queue-monitoring loop is independent of the executing logic in that collection. The queue-monitoring logic gets messages from the queue whenever the processing collection is below the limit that you find optimal: if there is space in the collection, it tries to pick up more messages from the queue, adds them to the collection and kick-starts them via ThreadPool.QueueUserWorkItem. When processing completes, it kicks off a delegate that removes the thread from the collection.
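The idea can be sketched roughly as follows. This is an illustration in Java, not our production code: a Semaphore plays the role of the size-limited collection, an ExecutorService stands in for ThreadPool.QueueUserWorkItem, and dispatch and all other names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedDispatcherDemo {

    /** Runs jobs with at most `limit` in flight; returns the peak concurrency seen. */
    static int dispatch(List<Runnable> jobs, int limit) throws InterruptedException {
        Semaphore slots = new Semaphore(limit);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newCachedThreadPool();
        for (Runnable job : jobs) {
            slots.acquire(); // the monitoring loop waits here until a slot frees up
            pool.execute(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    job.run();
                } finally {
                    running.decrementAndGet();
                    slots.release(); // completion callback: give the slot back
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger done = new AtomicInteger();
        List<Runnable> jobs = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            jobs.add(() -> {
                try {
                    Thread.sleep(20); // simulated I/O-bound work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                done.incrementAndGet();
            });
        }
        int peak = dispatch(jobs, 4);
        System.out.println("completed=" + done.get() + ", peak concurrency=" + peak);
    }
}
```

Unlike a batch that waits for its slowest item, a freed slot is refilled as soon as any single job finishes, so one 10-minute job only occupies one slot while the others keep turning over.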
Hope this helps and makes sense

Message being retried when operation takes time

I have a messaging system using Azure ServiceBus but I'm using Nimbus on top of that. I have an endpoint that sends a command to another endpoint and at one point the handler class on the other side picks it up, so it is all working fine.
When the operation takes time, roughly more than 20 seconds or so, the handler gets 'another' call with the same message. It looks like Nimbus is retrying the message that is already being handled by another (or even the same) instance of the handler. I don't see any exceptions being thrown, and I could easily repro this with the following handler:
public class Synchronizer : IHandleCommand<RequestSynchronization>
{
public async Task Handle(RequestSynchronization synchronizeInfo)
{
Console.WriteLine("Received Synchronization");
await Task.Delay(TimeSpan.FromSeconds(30)); //Simulate long running process
Console.WriteLine("Got through first timeout");
await Task.Delay(TimeSpan.FromSeconds(30)); //Simulate another long running process
Console.WriteLine("Got through second timeout");
}
}
My question is: how do I disable this behavior? I am happy for the transaction to take time, as it is a heavy process that I have off-loaded from my website, which was the whole point of going with this architecture in the first place.
In other words, I was expecting the message not to be picked up by another handler while one has picked it up and is processing it, unless there's an exception and the message goes back to the queue and eventually gets picked up for a retry.
Any ideas how to do this? Anything I'm missing?
By default, ASB/WSB will give you a message lock of 30 seconds. The idea is that you pop a BrokeredMessage off the head of the queue but have to either .Complete() or .Abandon() that message within the lock timeout.
If you don't do that, the service bus assumes that you've crashed or otherwise failed and it will return that message to the queue to be re-processed.
You have a couple of options:
1) Implement ILongRunningTask on your handler. Nimbus will pay attention to the remaining lock time and automatically renew your message lock. Caution: the maximum message lock time supported by ASB/WSB is five minutes no matter how many times you renew, so if your handler takes longer than that you might want option #2.
public class Synchronizer : IHandleCommand<RequestSynchronization>, ILongRunningTask
{
public async Task Handle(RequestSynchronization synchronizeInfo)
{
Console.WriteLine("Received Synchronization");
await Task.Delay(TimeSpan.FromSeconds(30)); //Simulate long running process
Console.WriteLine("Got through first timeout");
await Task.Delay(TimeSpan.FromSeconds(30)); //Simulate another long running process
Console.WriteLine("Got through second timeout");
}
}
2) In your handler, call Task.Run(() => SomeService(yourMessage)) and return. If you do this, be careful about lifetime scoping of dependencies if your handler takes any. If you need an IFoo, take a dependency on a Func<Owned<IFoo>> (or equivalent depending on your container) and resolve that within your handling task.
public class Synchronizer : IHandleCommand<RequestSynchronization>
{
    private readonly Func<Owned<IFoo>> _fooFunc;
    public Synchronizer(Func<Owned<IFoo>> fooFunc)
    {
        _fooFunc = fooFunc;
    }
    public async Task Handle(RequestSynchronization synchronizeInfo)
    {
        // don't await!
        // The lambda must be async because it awaits inside.
        Task.Run(async () => {
            using (var foo = _fooFunc())
            {
                Console.WriteLine("Received Synchronization");
                await Task.Delay(TimeSpan.FromSeconds(30)); //Simulate long running process
                Console.WriteLine("Got through first timeout");
                await Task.Delay(TimeSpan.FromSeconds(30)); //Simulate another long running process
                Console.WriteLine("Got through second timeout");
            }
        });
    }
}
I think you are looking for the code here: http://www.uglybugger.org/software/post/support_for_long_running_handlers_in_nimbus

Resources