ActiveMQ: Dispatched queue contains more messages then prefetch size

ActiveMQ: Dispatched queue contains more messages then prefetch size - spring-integration

I have prefetch size set to 1 (jms.prefetchPolicy.all=1 in url). In web console I can see that prefetch is 1 for all of my consumers. One consumer got stuck and there were 67 messages on his dispatch queue -see my screenshot
Could you help me understand how could it happen? I've read plenty of articles on this and my understanding is that Dispatch queue size should be up to prefetch size?!
I use following configuration to consume messages from queue:
ConnectionFactory getActiveMQConnectionFactory() {
// Configure the ActiveMQConnectionFactory
ActiveMQConnectionFactory activeMQConnectionFactory = new ActiveMQConnectionFactory();
activeMQConnectionFactory.setBrokerURL(brokerUrl);
activeMQConnectionFactory.setUserName(user);
activeMQConnectionFactory.setPassword(password);
activeMQConnectionFactory.setNonBlockingRedelivery(true);
// Configure the redeliver policy and the dead letter queue
RedeliveryPolicy redeliveryPolicy = new RedeliveryPolicy();
redeliveryPolicy.setInitialRedeliveryDelay(initialRedeliveryDelay);
redeliveryPolicy.setRedeliveryDelay(redeliveryDelay);
redeliveryPolicy.setUseExponentialBackOff(useExponentialBackOff);
redeliveryPolicy.setMaximumRedeliveries(maximumRedeliveries);
RedeliveryPolicyMap redeliveryPolicyMap = activeMQConnectionFactory.getRedeliveryPolicyMap();
redeliveryPolicyMap.put(new ActiveMQQueue(thumbnailQueue), redeliveryPolicy);
activeMQConnectionFactory.setRedeliveryPolicy(redeliveryPolicy);
return activeMQConnectionFactory;
}
public IntegrationFlow createThumbnailFlow(String concurrency, CreateThumbnailReceiver receiver) {
return IntegrationFlows.from(
Jms.messageDrivenChannelAdapter(
Jms.container(getActiveMQConnectionFactory(), thumbnailQueue)
.concurrency(concurrency)
.sessionTransacted(true)
.get()
))
.transform(new JsonToObjectTransformer(CreateThumbnailRequest.class, jsonObjectMapper()))
.handle(receiver)
.get();
}

The problem was cause by difference between version of broker (5.14.5) and client (5.15.3). After upgrading broker dispatched queue contains at most 2 message as expected.

Related

Run Spring Integration flow concurrently for each Ftp file

I have a Integration flow configured using Java DSL which pulls file from Ftp server using Ftp.inboundChannelAdapter then transforms it to JobRequest, then I have a .handle() method which triggers my batch job, everything is working as per required but the process in running sequentially for each file inside the FTP folder
I added currentThreadName in my Transformer Endpoint it was printing same thread name for each file
Here is what I have tried till now
1.task executor bean
#Bean
public TaskExecutor taskExecutor(){
return new SimpleAsyncTaskExecutor("Integration");
}
2.Integration flow
#Bean
public IntegrationFlow integrationFlow(JobLaunchingGateway jobLaunchingGateway) throws IOException {
return IntegrationFlows.from(Ftp.inboundAdapter(myFtpSessionFactory)
.remoteDirectory("/bar")
.localDirectory(localDir.getFile())
,c -> c.poller(Pollers.fixedRate(1000).taskExecutor(taskExecutor()).maxMessagesPerPoll(20)))
.transform(fileMessageToJobRequest(importUserJob(step1())))
.handle(jobLaunchingGateway)
.log(LoggingHandler.Level.WARN, "headers.id + ': ' + payload")
.route(JobExecution.class,j->j.getStatus().isUnsuccessful()?"jobFailedChannel":"jobSuccessfulChannel")
.get();
}
3.I also read in another SO thread that I need ExecutorChannel so I configured one but I don't know how to inject this channel into my Ftp.inboundAdapter, from logs is see that the channel is always integrationFlow.channel#0 which I guess is a DirectChannel
#Bean
public MessageChannel inputChannel() {
return new ExecutorChannel(taskExecutor());
}
I dont know what I'm missing here, or I might have not properly understood Spring Messaging System as I'm very much new to Spring and Spring-Integration
Any help is appreciated
Thanks

The ExecutorChannel you can simply inject into the flow and it is going to be applied to the SourcePollingChannelAdapter by the framework. So, having that inputChannel defined as a bean you just do this:
.channel(inputChannel())
before your .transform(fileMessageToJobRequest(importUserJob(step1()))).
See more in docs: https://docs.spring.io/spring-integration/docs/current/reference/html/dsl.html#java-dsl-channels
On the other hand to process your files in parallel according your .taskExecutor(taskExecutor()) configuration, you just need to have a .maxMessagesPerPoll(20) as 1. The logic in the AbstractPollingEndpoint is like this:
this.taskExecutor.execute(() -> {
int count = 0;
while (this.initialized && (this.maxMessagesPerPoll <= 0 || count < this.maxMessagesPerPoll)) {
if (pollForMessage() == null) {
break;
}
count++;
}
So, we do have tasks in parallel, but only when they reach that maxMessagesPerPoll where it is 20 in your current case. There is also some explanation in the docs: https://docs.spring.io/spring-integration/docs/current/reference/html/messaging-endpoints.html#endpoint-pollingconsumer
The maxMessagesPerPoll property specifies the maximum number of messages to receive within a given poll operation. This means that the poller continues calling receive() without waiting, until either null is returned or the maximum value is reached. For example, if a poller has a ten-second interval trigger and a maxMessagesPerPoll setting of 25, and it is polling a channel that has 100 messages in its queue, all 100 messages can be retrieved within 40 seconds. It grabs 25, waits ten seconds, grabs the next 25, and so on.

Custom partition in pulsar

Unable to send message from producer to pulsar, when Producer is set to customPartition (please refer below code).
Producer<byte[]> producer = client.newProducer()
.topic(pulsarTopic)
//.messageRoutingMode(MessageRoutingMode.RoundRobinPartition)
.messageRoutingMode(MessageRoutingMode.CustomPartition)
.messageRouter( new MessageRounterImpl())
.create();
Code to send Message :
producer.send(msg);
MessageRouterImpl has randomly generates number with range from 0 to 5, as below code
public class MessageRounterImpl implements MessageRouter {
#Override
public int choosePartition(Message<?> msg, TopicMetadata metadata) {
Random r = new Random();
return r.nextInt((0 - 5) + 1);
}
}
My question is why i am unable to send message from producer with CustomPartition and why i am getting below log messages
Batching the messages from the batch container from timer thread
Batching the messages from the batch container with 0 messages
With MessageRoutingMode.RoundRobinPartition and MessageRoutingMode.SinglePartition I was able to send message from producer.
It would be really helpful if someone through light on this .

First, please take into account that partitioned topics should be explicitly created before a publisher starts sending messages, for example:
bin/pulsar-admin topics create-partitioned-topic persistent://tenant/namespace/partitioned-topic-name --partitions 5
Second, the following line will throw an exception (it cannot handle negative values as input):
return r.nextInt((0 - 5) + 1);
It is possible to use the following one:
return r.nextInt(5);

How to handle errors after message has been handed off to QueueChannel?

I have 10 rabbitMQ queues, called event.q.0, event.q.2, <...>, event.q.9. Each of these queues receive messages routed from event.consistent-hash exchange. I want to build a fault tolerant solution that will consume messages for a specific event in sequential manner, since ordering is important. For this I have set up a flow that listens to those queues and routes messages based on event ID to a specific worker flow. Worker flows work based on queue channels so that should guarantee the FIFO order for an event with specific ID. I have come up with with the following set up:
#Bean
public IntegrationFlow eventConsumerFlow(RabbitTemplate rabbitTemplate, Advice retryAdvice) {
return IntegrationFlows
.from(
Amqp.inboundAdapter(new SimpleMessageListenerContainer(rabbitTemplate.getConnectionFactory()))
.configureContainer(c -> c
.adviceChain(retryAdvice())
.addQueueNames(queueNames)
.prefetchCount(amqpProperties.getPreMatch().getDefinition().getQueues().getEvent().getPrefetch())
)
.messageConverter(rabbitTemplate.getMessageConverter())
)
.<Event, String>route(e -> String.format("worker-input-%d", e.getId() % numberOfWorkers))
.get();
}
private Advice deadLetterAdvice() {
return RetryInterceptorBuilder
.stateless()
.maxAttempts(3)
.recoverer(recoverer())
.backOffPolicy(backOffPolicy())
.build();
}
private ExponentialBackOffPolicy backOffPolicy() {
ExponentialBackOffPolicy backOffPolicy = new ExponentialBackOffPolicy();
backOffPolicy.setInitialInterval(1000);
backOffPolicy.setMultiplier(3.0);
backOffPolicy.setMaxInterval(15000);
return backOffPolicy;
}
private MessageRecoverer recoverer() {
return new RepublishMessageRecoverer(
rabbitTemplate,
"error.exchange.dlx"
);
}
#PostConstruct
public void init() {
for (int i = 0; i < numberOfWorkers; i++) {
flowContext.registration(workerFlow(MessageChannels.queue(String.format("worker-input-%d", i), queueCapacity).get()))
.autoStartup(false)
.id(String.format("worker-flow-%d", i))
.register();
}
}
private IntegrationFlow workerFlow(QueueChannel channel) {
return IntegrationFlows
.from(channel)
.<Object, Class<?>>route(Object::getClass, m -> m
.resolutionRequired(true)
.defaultOutputToParentFlow()
.subFlowMapping(EventOne.class, s -> s.handle(oneHandler))
.subFlowMapping(EventTwo.class, s -> s.handle(anotherHandler))
)
.get();
}
Now, when lets say an error happens in eventConsumerFlow, the retry mechanism works as expected, but when an error happens in workerFlow, the retry doesn't work anymore and the message doesn't get sent to dead letter exchange. I assume this is because once message is handed off to QueueChannel, it gets acknowledged automatically. How can I make the retry mechanism work in workerFlow as well, so that if exception happens there, it could retry a couple of times and send a message to DLX when tries are exhausted?

If you want resiliency, you shouldn't be using queue channels at all; the messages will be acknowledged immediately after the message is put in the in-memory queue;if the server crashes, those messages will be lost.
You should configure a separate adapter for each queue if you want no message loss.
That said, to answer the general question, any errors on downstream flows (including after a queue channel) will be sent to the errorChannel defined on the inbound adapter.

Spring Integration: File polling memory consumption

I've following InboundChannelAdapter with Poller to process files every 30 seconds. The files are not large but I realize the memory consumptions keeps going up even when there's no files coming.
#Bean
#InboundChannelAdapter(value = "flowFileInChannel" ,poller = #Poller(fixedDelay ="30000", maxMessagesPerPoll = "1"))
public MessageSource<File> flowInboundFileAdapter(#Value("${integration.path}") File directory) {
FileReadingMessageSource source = new FileReadingMessageSource();
source.setDirectory(directory);
source.setFilter(flowPathFileFilter);
source.setUseWatchService(true);
source.setScanEachPoll(true);
source.setAutoCreateDirectory(false);
return source;
}
Is there an internal queue that is not cleared after each poll? How do I configure to avoid eating up memory.
After digging deeper, it looks like the below Spring IntegrationFlows which processes the data from the InboundChannelDapter is holding up the memory after each file polling. After I commenting out the middle part, the memory consumption seems stable (instead of increasing consumption). Now I'm wondering how do we force Spring IntegrationFlows to clear those Messages and Headers after they're passed through different channels (i.e. after the last channel below)
public IntegrationFlow incomingLocateFlow(){
return IntegrationFlows.from(locateIncomingChannel())
// .split("locateItemSplitter","split")
// .transform(locateItemEnrichmentTransformer)
// .transform(locateRequestTransformer)
// .aggregate(new Consumer<AggregatorSpec>() { // 32
//
// #Override
// public void accept(AggregatorSpec aggregatorSpec) {
// aggregatorSpec.processor(locateRequestProcessor, null); // 33
// }
//
// }, null)
// .transform(locateIncomingResultTransformer)
// .transform(locateExceptionReportWritingHandler)
.channel(locateIncomingCompleteChannel())
.get();
}

Indeed there is an AcceptOnceFileListFilter with the code like:
private final Queue<F> seen;
private final Set<F> seenSet = new HashSet<F>();
On each poll those internal collections are replenished with new files.
For this purpose you can consider to use FileSystemPersistentAcceptOnceFileListFilter with the persistent MetadataStore implementation to avoid memory consumption.
Also consider to use some tool to analyze the memory content. You might have something else downstream on the flowFileInChannel.
UPDATE
Since you use .aggregate() it is definitely the point where memory is consumed by default. That's because there is SimpleMessageStore to keep messages for grouping. Plus there is an option expireGroupsUponCompletion(boolean) which is false by default. Therefore even after successful releasing some info is still in the MessageStore. That's how your memory is consumed a bit from time to time.
That option is false by default to let to have logic when we discard late message for completed group. When it is true, you are able to form fresh group for the same correlationKey.
See more info about Aggregator in the Reference Manual.

Getting Data from EventHub is delayed

I have an EventHub configured in Azure, also a consumer group for reading the data. It was working fine for some days. Suddenly, I see there is a delay in incoming data(around 3 days). I use Windows Service to consume data in my server. I have around 500 incoming messages per minute. Can anyone help me out to figure this out ?

It might be that you are processing them items too slow. Therefore the work to be done grows and you will lag behind.
To get some insight in where you are in the event stream you can use code like this:
private void LogProgressRecord(PartitionContext context)
{
if (namespaceManager == null)
return;
var currentSeqNo = context.Lease.SequenceNumber;
var lastSeqNo = namespaceManager.GetEventHubPartition(context.EventHubPath, context.ConsumerGroupName, context.Lease.PartitionId).EndSequenceNumber;
var delta = lastSeqNo - currentSeqNo;
logWriter.Write(
$"Last processed seqnr for partition {context.Lease.PartitionId}: {currentSeqNo} of {lastSeqNo} in consumergroup '{context.ConsumerGroupName}' (lag: {delta})",
EventLevel.Informational);
}
the namespaceManager is build like this:
namespaceManager = NamespaceManager.CreateFromConnectionString("Endpoint=sb://xxx.servicebus.windows.net/;SharedAccessKeyName=yyy;SharedAccessKey=zzz");
I call this logging method in the CloseAsync method:
public Task CloseAsync(PartitionContext context, CloseReason reason)
{
LogProgressRecord(context);
return Task.CompletedTask;
}
logWriter is just some logging class I have used to write info to blob storage.
It now outputs messages like
Last processed seqnr for partition 3: 32780931 of 32823804 in consumergroup 'telemetry' (lag: 42873)
so when the lag is very high you could be processing events that have occurred a long time ago. In that case you need to scale up/out your processor.
If you notice a lag you should measure how long it takes to process a given number of item. You can then try to optimize performance and see whether it improves. We did it like:
public async Task ProcessEventsAsync(PartitionContext context, IEnumerable<EventData> events)
{
try
{
stopwatch.Restart();
// process items here
stopwatch.Stop();
await CheckPointAsync(context);
logWriter.Write(
$"Processed {events.Count()} events in {stopwatch.ElapsedMilliseconds}ms using partition {context.Lease.PartitionId} in consumergroup {context.ConsumerGroupName}.",
EventLevel.Informational);
}
}

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

ActiveMQ: Dispatched queue contains more messages then prefetch size - spring-integration

The problem was cause by difference between version of broker (5.14.5) and client (5.15.3). After upgrading broker dispatched queue contains at most 2 message as expected.

Related

Run Spring Integration flow concurrently for each Ftp file

Custom partition in pulsar

How to handle errors after message has been handed off to QueueChannel?

Spring Integration: File polling memory consumption

Getting Data from EventHub is delayed

Categories

Resources