I have a Spring Integration context with multiple inbound channel adapters, each with its own poller (currently all the pollers use a fixed-delay trigger, but they may use fixed-rate in the future). All the inbound adapters output their messages to the same processing chain. The question is: what is the behaviour of polling and message consumption in such a situation?
Imagine that poller #1 has produced 1000 messages and they are handed to my processing chain. Since processing can take a significant amount of time, it may become time for poller #2 to do its job and possibly produce messages. But remember, my processing chain is still handling the messages passed by poller #1. What happens?
Poller #2 is not run at all until all the poller #1 messages are processed.
Poller #2 is run (but how could it be run if we have only one thread?), its messages are stored for later use when all the poller #1 messages are processed.
Processing initiated by poller #1 is interrupted, poller #2 is run, produced messages are passed to the processing chain immediately.
Some other answer
Note that all my channels are direct channels and no task executors are used.
Pollers are independent tasks handled by the common taskScheduler bean; as long as the task scheduler has sufficient threads, there is no coordination across pollers.
If the pool is exhausted, pollers will run "late".
By default the taskScheduler has 10 threads, but you can reconfigure it.
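For example, a minimal sketch of raising the pool size (assuming Java configuration; Spring Integration picks up a bean named taskScheduler instead of creating its own):

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.concurrent.ThreadPoolTaskScheduler;

@Configuration
public class SchedulerConfig {

    // Spring Integration uses the bean named "taskScheduler" for all pollers,
    // so overriding it lets you size the pool for the number of concurrent pollers.
    @Bean
    public ThreadPoolTaskScheduler taskScheduler() {
        ThreadPoolTaskScheduler scheduler = new ThreadPoolTaskScheduler();
        scheduler.setPoolSize(20); // default is 10
        scheduler.setThreadNamePrefix("int-scheduler-");
        return scheduler;
    }
}

Alternatively, the pool size can be set with the spring.integration.taskScheduler.poolSize property in META-INF/spring.integration.properties.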
I have almost the same case; however, the behaviour differs a bit.
My case is:
I have 4 pollers which request data independently from 4 different blocking queues (I have set a 1-second timeout for each of them).
I have 4 inbound channel adapters configured to use fixed-delay (100 ms) and the pollers above (one to one).
I have a thread pool with 4 core/max threads, configured to handle the inbound channel adapters (all adapters use this pool).
And now I see in the logs that each thread executes all pollers sequentially, and if there is an empty queue (I am using blocking queues) then all threads are delayed for 1 second. This means that even if you have enough threads, you may still get a delay on all of them if at least one poller is slow. For instance, if I used no timeout for the queue read at all, then all threads would stop on the empty queue and nothing would be read from the other, non-empty queues.
To solve this issue, I guess we need to configure a separate thread pool for each poller <-> inbound channel adapter pair, as sketched below.
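A minimal sketch of that idea with the Spring Integration Java DSL (the message source, channel name and executor are hypothetical; in XML the equivalent is the task-executor attribute on the poller element):

import java.util.concurrent.Executors;
import org.springframework.context.annotation.Bean;
import org.springframework.integration.core.MessageSource;
import org.springframework.integration.dsl.IntegrationFlow;
import org.springframework.integration.dsl.IntegrationFlows;
import org.springframework.integration.dsl.Pollers;

@Bean
public IntegrationFlow queue1Flow(MessageSource<?> queue1Source) {
    return IntegrationFlows
            .from(queue1Source, c -> c.poller(
                    // dedicated single-thread executor, so a slow poll on this queue
                    // cannot delay the pollers of the other adapters
                    Pollers.fixedDelay(100)
                           .taskExecutor(Executors.newSingleThreadExecutor())))
            .channel("processingChannel")
            .get();
}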
I am trying to pause/resume the Kafka container. Using the following code snippet to do so:
kafkaListenerEndpointRegistry.getListenerContainer("MAIN").pause();
When I call pause, I also need to do a Thread.sleep so that the messages in the batch are not processed. For every message in the batch, I call another API which has a rate limit. To stay within this rate limit, I need to stop processing messages for a while.
If the main thread sleeps, will it stop the listener from sending the heartbeat? Does it also stop the heartbeat thread in the background?
The documentation says: "When a container is paused, it continues to poll() the consumer, avoiding a rebalance if group management is being used, but it does not retrieve any records."
But I am pausing the container and making the thread sleep. How will this impact the flow?
You must never sleep the consumer thread; doing so risks triggering a rebalance.
Instead, reduce the max.poll.records so the pause will take effect more quickly (the consumer won't actually pause until the records received by the previous poll are processed).
You can throw an exception after pausing the consumer, but you will need to resume the container somehow.
I opened a new issue to improve this behavior https://github.com/spring-projects/spring-kafka/issues/2280
If you are subject to rate limits, consider using KafkaTemplate.receive() methods, on a schedule, or a polled Spring Integration adapter, instead of using a message-driven approach.
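If you stay with pause()/resume(), a minimal sketch of resuming on a schedule instead of sleeping the consumer thread (the listener id "MAIN" is from the question; the TaskScheduler bean and the delay are assumptions for illustration):

import java.time.Instant;
import org.springframework.kafka.config.KafkaListenerEndpointRegistry;
import org.springframework.scheduling.TaskScheduler;

public class RateLimitBackoff {

    private final KafkaListenerEndpointRegistry registry;
    private final TaskScheduler scheduler;

    public RateLimitBackoff(KafkaListenerEndpointRegistry registry, TaskScheduler scheduler) {
        this.registry = registry;
        this.scheduler = scheduler;
    }

    public void backOff(long millis) {
        // pause() only takes effect once the records from the previous poll are
        // processed, so keep max.poll.records small for it to kick in quickly
        registry.getListenerContainer("MAIN").pause();
        // resume later from a scheduler thread; the paused container keeps calling
        // poll(), so heartbeats continue and no rebalance is triggered
        scheduler.schedule(() -> registry.getListenerContainer("MAIN").resume(),
                Instant.now().plusMillis(millis));
    }
}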
We have recently experienced unexpected behaviour with our application that is powered by RabbitMQ.
RabbitMQ version is 3.6.12 and we are using .NET Client 5.0.1
The application subscribes to two queues, one for commands, and another for events - we also use manual acknowledgements.
Our application is configured to have 7 consumers. Each has its own channel (IModel) and its own EventingBasicConsumer.
We end up processing messages when EventingBasicConsumer.Received is fired.
Our application must process messages as close as possible to when they are routed onto the queues and to date we have not had issues.
However, recently we have seen that when one of the messages being processed takes a long time to complete, it delays when other messages are processed, even though there are many consumers available (6) that are not busy.
Note that we have observed that this issue does not happen when an application subscribes to only a single queue; it becomes an issue when multiple queues are involved.
This is best illustrated using the following example:
We have a simple consuming application that subscribes to two queues, one for commands and one for events. This application has 7 consumers, each with their own channel and EventingBasicConsumer.
We start a simple publishing application that publishes 20 messages, a second apart. Every message is an event and is published to the event queue, except for the 5th and 10th messages, which are commands and are sent to the command queue.
Note that every event is processed without delay, whereas commands take 30 seconds.
We observed the following in relation to how messages across the two queues are assigned to channels:
Once Message5 completes after 30 seconds on C1, Message9 is assigned immediately to C1 and is processed without delay.
Once Message10 completes after 30 seconds on C2, Message11 is assigned immediately to C2 and is processed without delay.
Hence, it looks to us like the assignment of channels is done independently per queue, meaning you can get delayed execution if some messages take a long time to process.
Is it possible that when multiple consumers are subscribing to multiple queues, RabbitMQ can assign a message to be handled by a consumer that is busy even if there are consumers that are currently idle?
Is there any documentation that explains the algorithm RabbitMQ uses to select which consumer's EventingBasicConsumer.Received event fires from a collection of consumers?
We have fixed this issue.
In the RMQ documentation (https://www.rabbitmq.com/api-guide.html#consuming) we came across the following:
"Each Channel has its own dispatch thread. For the most common use case of one Consumer per Channel, this means Consumers do not hold up other Consumers. If you have multiple Consumers per Channel be aware that a long-running Consumer may hold up dispatch of callbacks to other Consumers on that Channel.”
In our code, we had 2 consumers per channel, meaning consumers could hold up other consumers.
We changed to have one consumer per channel and that fixed the issue.
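The guide quoted above is for the Java client, but the same one-consumer-per-channel layout is what fixed it for us in .NET. As a minimal Java-client sketch of that layout (host and queue names are hypothetical):

import com.rabbitmq.client.AMQP;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.DefaultConsumer;
import com.rabbitmq.client.Envelope;

public class OneConsumerPerChannel {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // hypothetical broker address
        Connection connection = factory.newConnection();

        // One Channel per consumer: each Channel has its own dispatch thread,
        // so a long-running handler on one channel cannot hold up the others.
        for (String queue : new String[] { "commands", "events" }) {
            Channel channel = connection.createChannel();
            channel.basicConsume(queue, false, new DefaultConsumer(channel) {
                @Override
                public void handleDelivery(String consumerTag, Envelope envelope,
                        AMQP.BasicProperties properties, byte[] body) throws java.io.IOException {
                    // process the message, then acknowledge it manually
                    getChannel().basicAck(envelope.getDeliveryTag(), false);
                }
            });
        }
    }
}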
I am trying to understand the ways timeouts can be specified for partitioned steps:
1. jmsoutbound-gateway receive-timeout
2. jmsoutbound-gateway reply-timeout
3. jmsoutbound-gateway replyListener receive-timeout
4. partition handler messagingOperations receive-timeout
I want to be able to time out when a step takes too long, and clean up. Looking at the stack trace, the reply listener does not go away after the partition ends (and may receive a late reply after the job has completed).
1. The time the executor thread will wait in the gateway for a reply to arrive (i.e. for the partition to complete) before giving up.
2. A timeout when writing to the reply-channel; in general this only applies if the send can block, such as when the reply channel is a bounded queue channel that is full.
3. When using a reply listener, the container polls the JMS client for messages; this timeout is simply how long the thread blocks in the client waiting for a reply before looping around and waiting again. It has no bearing on messages timing out; it only really affects how quickly the container will respond to a stop().
4. The time the partition handler will wait for all partitions to complete (unless pollRepositoryForResults is true, in which case the handler's timeout property represents that and the receive timeout is not used).
So it sounds like #4 is what you want.
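For #4, a minimal sketch of wiring a receive timeout into the partition handler's MessagingTemplate (bean names, channels, step name and values are hypothetical examples):

import org.springframework.batch.integration.partition.MessageChannelPartitionHandler;
import org.springframework.context.annotation.Bean;
import org.springframework.integration.core.MessagingTemplate;
import org.springframework.messaging.MessageChannel;
import org.springframework.messaging.PollableChannel;

@Bean
public MessageChannelPartitionHandler partitionHandler(MessageChannel requestChannel,
                                                       PollableChannel replyChannel) {
    MessagingTemplate template = new MessagingTemplate();
    template.setDefaultChannel(requestChannel);
    // how long the handler waits for all partition replies before giving up
    template.setReceiveTimeout(600000); // 10 minutes, example value

    MessageChannelPartitionHandler handler = new MessageChannelPartitionHandler();
    handler.setMessagingOperations(template);
    handler.setReplyChannel(replyChannel);
    handler.setStepName("workerStep"); // hypothetical remote step name
    handler.setGridSize(4);            // example grid size
    return handler;
}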
I am new to RabbitMQ, so please excuse me if my question sounds trivial. I want to publish messages to RabbitMQ which will be processed by a RabbitMQ consumer.
My consumer machine is a multi-core machine (preferably a worker role on Azure), but QueueingBasicConsumer pushes one message at a time. How can I utilize all the cores and process multiple messages concurrently?
One solution could be to open multiple channels on multiple threads and then process messages there. But in this case, how would I decide the number of threads?
Another approach could be to read messages on the main thread, then create a task and pass the message to it. In this case I would have to stop consuming messages once there are many messages (beyond some threshold) already in progress. I am not sure how this could be implemented.
Thanks In Advance
Your second option sounds much more reasonable: consume on a single channel and spawn multiple tasks to handle the messages. To implement concurrency control, you can use a semaphore to limit the number of tasks in flight: before starting a task, wait on the semaphore, and after a task has finished, release the semaphore so other tasks can run.
You haven't specified your language/technology stack of choice, but whatever you do, try to use a thread pool instead of creating and managing threads yourself. In .NET, that would mean using Task.Run to process messages asynchronously.
Example C# code:
// Assumes consumer is a QueueingBasicConsumer registered on the channel with BasicConsume.
using (var semaphore = new SemaphoreSlim(MaxMessages))
{
    while (true)
    {
        var args = (BasicDeliverEventArgs)consumer.Queue.Dequeue();
        // Block until one of the MaxMessages "slots" is free.
        semaphore.Wait();
        Task.Run(() => ProcessMessage(args))
            .ContinueWith(_ => semaphore.Release()); // free the slot when done
    }
}
Instead of controlling the concurrency level yourself, you might find it easier to enable explicit ACK control on the channel and use RabbitMQ consumer prefetch to set the maximum number of unacknowledged messages. This way, you will never receive more messages at once than you want.
I am using Netty camel-netty:jar:2.10.0.redhat-60024.
Below is my configuration of the Netty listener:
netty:tcp://10.1.33.204:9001?textline=true&autoAppendDelimiter=true&delimiter=LINE&keepAlive=true&synchronous=false&orderedThreadPoolExecutor=false&sendBufferSize=2000&receiveBufferSize=2000&decoderMaxLineLength=2000&workerCount=20
Here I see, based on the debug log, that Netty is creating only one worker thread, so incoming messages are blocked until the existing message is processed.
Like:
2014-08-23 12:36:48,394 | DEBUG | w I/O worker #5 | NettyConsumer | ty.handlers.ServerChannelHandler 85 | 126 - org.apache.camel.camel-netty - 2.10.0.redhat-60024
The process runs for around 5 minutes, but I see only this thread active. Only when this thread sends a response does it accept the next request.
For TCP, Netty creates a number of worker threads, and assigns each connection to a specific worker thread. All events for that channel are handled by that single thread (note it can be more complex, but that's sufficient for this answer).
It sounds like you're processing your message in the Netty worker thread. Therefore you're blocking processing of any further events on that connection, and all other connections assigned to the worker thread, until your process returns.
Netty is actually creating multiple worker threads. You can see in the debug message that your channel is being handled by I/O worker #5. Netty creates 2 * Runtime.availableProcessors() worker threads by default, but each connection is handled by a single thread unless you intervene.
It's not clear whether you can process requests concurrently and out of order, or whether ordering is important. If ordering is important you can tell camel to use the ordered thread pool executor. This will process the request in a separate thread pool, but subsequent requests on the same connection will still be blocked by the first requests.
If ordering is not important you have a few options. Given that Camel appears to be using Netty 3, and allows you to create a custom pipeline, you could use Netty's MemoryAwareThreadPoolExecutor to process requests concurrently. Perhaps take a look at What happens when shared MemoryAwareThreadPoolExecutor's threshold is reached? if you do this.
Camel may offer other mechanisms to help but I'm not overly familiar with Camel. The SEDA component might be a good place to start.
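As a sketch of the SEDA idea (assuming the requests do not need a synchronous reply over the TCP connection; endpoint names, thread counts and the handler bean are hypothetical): let the Netty consumer hand the message off to a SEDA queue and process it with a pool of concurrent consumers.

import org.apache.camel.builder.RouteBuilder;

public class NettyRoutes extends RouteBuilder {
    @Override
    public void configure() throws Exception {
        // The Netty consumer only enqueues the message, so the I/O worker thread
        // is released immediately instead of being blocked by the processing.
        from("netty:tcp://10.1.33.204:9001?textline=true&sync=false")
            .to("seda:process");

        // The SEDA consumers do the real work on their own thread pool.
        from("seda:process?concurrentConsumers=20")
            .to("bean:messageHandler"); // hypothetical bean with the processing logic
    }
}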