My message handler for publishing messages to the Kinesis stream is as follows:
public MessageHandler kinesisMessageHandler(final AmazonKinesisAsync amazonKinesis,
@Qualifier("successChannel") MessageChannel successChannel,
@Qualifier("errorChannel") MessageChannel errorChannel) {
KinesisMessageHandler kinesisMessageHandler = new KinesisMessageHandler(amazonKinesis);
kinesisMessageHandler.setSync(false);
kinesisMessageHandler.setOutputChannel(successChannel);
kinesisMessageHandler.setFailureChannel(errorChannel);
return kinesisMessageHandler;
}
@Bean(name = "errorChannel")
public MessageChannel errorChannel() {
return MessageChannels.direct().get();
}
@Bean(name = "successChannel")
public MessageChannel successChannel() {
return MessageChannels.direct().get();
}
The setSync flag is set to false so that the messages are processed asynchronously. Also, I have created separate IntegrationFlows to receive and process the Kinesis responses from the success and error channels.
public IntegrationFlow successMessageIntegrationFlow(MessageChannel successChannel,
MessageChannel inboundKinesisMessageChannel,
MessageReceiverServiceActivator kinesisMessageReceiverServiceActivator) {
return IntegrationFlows.from(successChannel).channel(inboundKinesisMessageChannel)
.handle(kinesisMessageReceiverServiceActivator, "receiveMessage").get();
}
@Bean
public IntegrationFlow errorMessageIntegrationFlow(MessageChannel errorChannel,
MessageChannel inboundKinesisErrorChannel,
MessageReceiverServiceActivator kinesisErrorReceiverServiceActivator
) {
return IntegrationFlows.from(errorChannel).channel(inboundKinesisErrorChannel)
.handle(kinesisErrorReceiverServiceActivator, "receiveMessage").get();
}
I wanted to know if you see any issues in using a DirectChannel to receive success and error responses from Kinesis and processing them using an IntegrationFlow. As far as I know, with a DirectChannel the producer blocks during send until the consumer finishes its work and returns control to the caller. Is it a correct assumption that here the producer is executed on a different set of thread pools by the AmazonKinesisAsyncClient and the producer will not wait for the IntegrationFlow to process the messages? Let me know if I need to implement it differently.
Your assumption about blocking is correct: control does not come back to the producing thread until the consumer has finished. So, if you have a limited number of threads in that Kinesis client, you need to be sure that you free them as soon as possible. You might consider routing those callbacks through a queue channel instead. They are asynchronous anyway, but that way they won't hold up the Kinesis client.
You still have a flaw in your flows: .channel(inboundKinesisMessageChannel). That puts the same channel in the middle of two different flows. If it is a direct one, you end up with round-robin distribution. I would just remove it altogether.
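To make the threading point concrete, here is a minimal plain-Java model of the two handoff styles. It is not Spring Integration code: `directSend`, `queuedSend` and the "poller" thread are hypothetical names that only mimic how a direct channel runs the subscriber on the sender's thread, while a queue channel lets the sender return as soon as the message is enqueued.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Plain-Java model of the two handoff styles; every name here is hypothetical.
final class HandoffModel {

    // "Direct" handoff: the subscriber runs on the sender's thread,
    // so the sender is held until the subscriber returns.
    static void directSend(String message, Consumer<String> subscriber) {
        subscriber.accept(message);
    }

    // "Queued" handoff: the sender only enqueues and returns immediately.
    static void queuedSend(String message, BlockingQueue<String> queue) {
        queue.add(message);
    }

    // Name of the thread on which the subscriber ran for a direct handoff.
    static String threadUsedByDirect() {
        String[] seen = new String[1];
        directSend("ack", m -> seen[0] = Thread.currentThread().getName());
        return seen[0];
    }

    // Name of the thread on which the subscriber ran for a queued handoff.
    static String threadUsedByQueued() {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        ExecutorService poller = Executors.newSingleThreadExecutor(r -> new Thread(r, "poller"));
        try {
            queuedSend("ack", queue); // the sender is free again right here
            return poller.submit(() -> {
                queue.take(); // the consumer drains the queue on its own thread
                return Thread.currentThread().getName();
            }).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            poller.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("direct subscriber ran on: " + threadUsedByDirect());
        System.out.println("queued subscriber ran on: " + threadUsedByQueued());
    }
}
```

In the real flow the "sender" would be the Kinesis client's callback thread, which is why the queued variant frees it sooner.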
This is a follow-up question to Spring Integration AWS RabbitMQ Kinesis.
I have the following configuration. I am noticing that when I send a message to the input channel named kinesisSendChannel for the first time, the aggregator and release strategy are invoked and messages are sent to Kinesis Streams. I put debug breakpoints at different places and could verify this behavior. But when I publish messages to the same input channel again, the release strategy and the output processor are not invoked and messages are not sent to Kinesis. I am not sure why the aggregator flow is invoked only the first time and not for subsequent messages. For testing purposes, the TimeoutCountSequenceSizeReleaseStrategy is set with a count of 1 and a time of 60 seconds. There is no specific MessageStore used. Could you help identify the issue?
@Bean(name = "kinesisSendChannel")
public MessageChannel kinesisSendChannel() {
return MessageChannels.direct().get();
}
@Bean(name = "resultChannel")
public MessageChannel resultChannel() {
return MessageChannels.direct().get();
}
@Bean
@ServiceActivator(inputChannel = "kinesisSendChannel")
public MessageHandler aggregator(TestMessageProcessor messageProcessor,
MessageChannel resultChannel,
TimeoutCountSequenceSizeReleaseStrategy timeoutCountSequenceSizeReleaseStrategy) {
AggregatingMessageHandler handler = new AggregatingMessageHandler(messageProcessor);
handler.setCorrelationStrategy(new ExpressionEvaluatingCorrelationStrategy("headers['foo']"));
handler.setReleaseStrategy(timeoutCountSequenceSizeReleaseStrategy);
handler.setOutputProcessor(messageProcessor);
handler.setOutputChannel(resultChannel);
return handler;
}
@Bean
@ServiceActivator(inputChannel = "resultChannel")
public MessageHandler kinesisMessageHandler1(@Qualifier("successChannel") MessageChannel successChannel,
@Qualifier("errorChannel") MessageChannel errorChannel, final AmazonKinesisAsync amazonKinesis) {
KinesisMessageHandler kinesisMessageHandler = new KinesisMessageHandler(amazonKinesis);
kinesisMessageHandler.setSync(true);
kinesisMessageHandler.setOutputChannel(successChannel);
kinesisMessageHandler.setFailureChannel(errorChannel);
return kinesisMessageHandler;
}
public class TestMessageProcessor extends AbstractAggregatingMessageGroupProcessor {
@Override
protected Object aggregatePayloads(MessageGroup group, Map<String, Object> defaultHeaders) {
final PutRecordsRequest putRecordsRequest = new PutRecordsRequest().withStreamName("test-stream");
final List<PutRecordsRequestEntry> putRecordsRequestEntry = group.getMessages().stream()
.map(message -> (PutRecordsRequestEntry) message.getPayload()).collect(Collectors.toList());
putRecordsRequest.withRecords(putRecordsRequestEntry);
return putRecordsRequestEntry;
}
}
I believe the problem is here: handler.setCorrelationStrategy(new ExpressionEvaluatingCorrelationStrategy("headers['foo']"));. All your messages come with the same foo header, so all of them form the same message group. As long as you release the group and don't remove it, all new messages are going to be discarded.
Please review the aggregator documentation to familiarize yourself with all the possible behaviors: https://docs.spring.io/spring-integration/docs/current/reference/html/message-routing.html#aggregator
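The "released but not removed" behavior can be sketched with a small plain-Java model. This is not the framework's aggregator: `AggregatorModel`, `removeGroupOnRelease` and the other names are hypothetical, and the model only mimics the rule described above, where a completed group that stays behind causes later messages with the same correlation key to be discarded. In Spring Integration itself, I believe the switch to look at is `expireGroupsUponCompletion` on the aggregator, but please confirm that against the documentation linked above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java model of a count-based aggregator; names are hypothetical, not framework API.
final class AggregatorModel {
    private final int releaseCount;
    private final boolean removeGroupOnRelease;
    private final Map<String, List<String>> openGroups = new HashMap<>();
    private final Set<String> completedKeys = new HashSet<>();
    final List<List<String>> released = new ArrayList<>();
    final List<String> discarded = new ArrayList<>();

    AggregatorModel(int releaseCount, boolean removeGroupOnRelease) {
        this.releaseCount = releaseCount;
        this.removeGroupOnRelease = removeGroupOnRelease;
    }

    void accept(String correlationKey, String payload) {
        if (completedKeys.contains(correlationKey)) {
            discarded.add(payload); // late arrival for a group that is already complete
            return;
        }
        List<String> group = openGroups.computeIfAbsent(correlationKey, k -> new ArrayList<>());
        group.add(payload);
        if (group.size() >= releaseCount) {
            released.add(new ArrayList<>(group));
            openGroups.remove(correlationKey);
            if (!removeGroupOnRelease) {
                completedKeys.add(correlationKey); // the group stays behind, marked complete
            }
        }
    }

    public static void main(String[] args) {
        AggregatorModel keep = new AggregatorModel(1, false);
        keep.accept("foo", "first");
        keep.accept("foo", "second");
        System.out.println("kept group:    released=" + keep.released + " discarded=" + keep.discarded);

        AggregatorModel remove = new AggregatorModel(1, true);
        remove.accept("foo", "first");
        remove.accept("foo", "second");
        System.out.println("removed group: released=" + remove.released + " discarded=" + remove.discarded);
    }
}
```

With a release count of 1 and a single shared key, the first variant releases only the first message, which matches the symptom in the question.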
I am using the Spring Integration framework, with a Transformer:
inputChannel -> kafka consumer
outputChannel -> database jdbc writer
@Bean
public DirectChannel inboundChannel() {
return new DirectChannel();
}
@Bean
public DirectChannel outboundChannel() {
return new DirectChannel();
}
@Bean
@Transformer(inputChannel="inboundChannel", outputChannel="outboundChannel")
public JsonToObjectTransformer jsonToObjectTransformer() {
return new JsonToObjectTransformer(Item.class);
}
@Bean
@ServiceActivator(inputChannel = "outboundChannel")
public MessageHandler jdbcmessageHandler() {
JdbcMessageHandler jdbcMessageHandler = new ...
return ...;
}
@Bean
@ServiceActivator(inputChannel = "inboundChannel")
public MessageHandler kafkahandler() {
return new ...;
}
In both handlers I override
public void handleMessage(Message<?> message)
The problem: if there are N messages in total in Kafka,
then each handleMessage() is invoked exactly N/2 times!
I assumed that each handler would be invoked N times, because each handler is linked to a different channel and there are N messages in total.
What am I missing?
(If I disable the Kafka handler, the second handler gets all N messages.)
Update:
I need two subscribers to get all the messages from the same channel (the Kafka handler will do something with the raw data, and the JDBC handler will push the transformed data).
First of all, your inboundChannel and outboundChannel are not used: nowhere (at least in the question) do you refer to them by name.
Names like input and output are taken by the framework and used to create new MessageChannel beans, which are then used in other places.
Now see what you have:
@Transformer(inputChannel="input"
@ServiceActivator(inputChannel = "input")
Both of them are subscribers to the same input channel, which is created automatically by the framework as a DirectChannel. This channel is based on a round-robin LoadBalancingStrategy, so your Kafka handler sees only n/2 messages: its service activator deals only with every second message sent to that input channel.
Please, see more info in docs: https://docs.spring.io/spring-integration/reference/html/core.html#channel-configuration-directchannel
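The n/2 effect is easy to reproduce in a few lines of plain Java. The sketch below is only a model of the two dispatch styles, not framework code; `roundRobin` and `publishSubscribe` are hypothetical helper names, and each returns how many messages every subscriber would receive.

```java
import java.util.Arrays;

// Plain-Java model of channel dispatch; names are hypothetical, not framework API.
final class DispatchModel {

    // Round-robin (direct channel): each message goes to exactly one subscriber, in turn.
    static int[] roundRobin(int messages, int subscribers) {
        int[] counts = new int[subscribers];
        for (int i = 0; i < messages; i++) {
            counts[i % subscribers]++;
        }
        return counts;
    }

    // Publish-subscribe: every subscriber receives every message.
    static int[] publishSubscribe(int messages, int subscribers) {
        int[] counts = new int[subscribers];
        for (int i = 0; i < messages; i++) {
            for (int s = 0; s < subscribers; s++) {
                counts[s]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(roundRobin(10, 2)));       // prints [5, 5]
        System.out.println(Arrays.toString(publishSubscribe(10, 2))); // prints [10, 10]
    }
}
```

If both handlers must see every message, the second shape is the one to aim for, which in Spring Integration terms means a publish-subscribe channel rather than a direct one.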
How can I setup a reactive flow using DSL for the following steps:
1. Receive an SQS message using SqsMessageDrivenChannelAdapter
2. Validate the JSON message [JsonSchemaValidator class with validate method]
3. Transform the JSON to objects
4. Pass the objects to a service activator (BusinessService: business logic, state machine)
5. Persist the objects using the R2DBC outbound adapter
I was looking at this : https://github.com/spring-projects/spring-integration/blob/master/spring-integration-core/src/test/java/org/springframework/integration/dsl/reactivestreams/ReactiveStreamsTests.java
In the above example, there are dedicated flows created that return a Publisher and in the tests the Publishers are subscribed. However, my flow will be triggered when SqsMessageDrivenChannelAdapter brings in a message into a channel.
How can I achieve a reactive flow configuration for steps 1 to 5 of the scenario above?
Update : Sample code added
@Bean
public IntegrationFlow importFlow() {
return IntegrationFlows.from(sqsInboundChannel())
.handle((payload, messageHeaders) -> jsonSchemaValidator.validate(payload.toString()))
.transform(Transformers.fromJson(Entity.class))
.handle((payload, messageHeaders) ->businessService.process((Entity) payload))
.handle(
Jpa.outboundAdapter(this.entityManagerFactory)
.entityClass(Entity.class)
.persistMode(PersistMode.PERSIST),
ConsumerEndpointSpec::transactional)
.get();
}
@Bean
public MessageProducer sqsMessageDrivenChannelAdapter() {
SqsMessageDrivenChannelAdapter sqsMessageDrivenChannelAdapter =
new SqsMessageDrivenChannelAdapter(asyncSqsClient, queueName);
sqsMessageDrivenChannelAdapter.setAutoStartup(true);
sqsMessageDrivenChannelAdapter.setOutputChannel(sqsInboundChannel());
return sqsMessageDrivenChannelAdapter;
}
@Bean
public MessageChannel sqsInboundChannel() {
return MessageChannels.flux().get();
}
Update 2: Moved JPA to a different thread using an executor channel
@Bean
public IntegrationFlow importFlow() {
return IntegrationFlows.from(sqsInboundChannel())
.handle((payload, messageHeaders) -> jsonSchemaValidator.validate(payload.toString()))
.transform(Transformers.fromJson(Entity.class))
.handle((payload, messageHeaders) ->businessService.process((Entity) payload))
.channel(persistChannel())
.handle(
Jpa.outboundAdapter(this.entityManagerFactory)
.entityClass(Entity.class)
.persistMode(PersistMode.PERSIST),
ConsumerEndpointSpec::transactional)
.get();
}
@Bean
public MessageProducer sqsMessageDrivenChannelAdapter() {
SqsMessageDrivenChannelAdapter sqsMessageDrivenChannelAdapter =
new SqsMessageDrivenChannelAdapter(asyncSqsClient, queueName);
sqsMessageDrivenChannelAdapter.setAutoStartup(true);
sqsMessageDrivenChannelAdapter.setOutputChannel(sqsInboundChannel());
return sqsMessageDrivenChannelAdapter;
}
@Bean
public MessageChannel sqsInboundChannel() {
return MessageChannels.flux().get();
}
@Bean
public MessageChannel persistChannel() {
return MessageChannels.executor(Executors.newCachedThreadPool()).get();
}
You probably need to make yourself more familiar with what we have so far for Reactive Streams in Spring Integration: https://docs.spring.io/spring-integration/docs/current/reference/html/reactive-streams.html#reactive-streams
The sample you show with that test class is not relevant to your use case at all. In that test we try to cover some of the API we expose in Spring Integration, more or less as unit tests. It has nothing to do with a whole flow.
Your use case is really just a full black-box flow, starting with the SQS listener and ending in R2DBC. Therefore there is no point in trying to convert part of your flow into a Publisher and then bring it back into another part of the flow: you are not going to track that Publisher somehow and subscribe to it yourself.
You may consider placing a FluxMessageChannel in between endpoints in your flow, but it still does not make sense for your use case. It won't be fully reactive as you expect, because org.springframework.cloud.aws.messaging.listener.SimpleMessageListenerContainer does not block on the consumer thread, so it cannot respond to back-pressure from downstream.
The only really reactive part of your flow is the R2DBC outbound channel adapter, but it probably does not bring you much value because the source of data is not reactive.
As I said: you can try to place a channel(channels -> channels.flux()) just after the SqsMessageDrivenChannelAdapter definition to start a reactive flow from that point. At the same time, you should try setting maxNumberOfMessages to 1 so that it waits for free space downstream before pulling the next message from SQS.
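For readers unfamiliar with what "waiting for free space" means in Reactive Streams terms, here is a rough sketch using only the JDK's `java.util.concurrent.Flow` API. It is not SQS or Spring Integration code, and `OneAtATimeModel` and its members are hypothetical names; the subscriber simply signals demand for one item at a time, which is the kind of behavior the maxNumberOfMessages = 1 suggestion tries to approximate at the source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

// JDK-only model of "pull the next message only when ready"; names are hypothetical.
final class OneAtATimeModel {

    // Subscriber that signals demand for exactly one item at a time.
    static final class OneAtATimeSubscriber implements Flow.Subscriber<String> {
        final List<String> received = new ArrayList<>();
        final CountDownLatch done = new CountDownLatch(1);
        private Flow.Subscription subscription;

        @Override
        public void onSubscribe(Flow.Subscription subscription) {
            this.subscription = subscription;
            subscription.request(1); // initial demand: one message
        }

        @Override
        public void onNext(String item) {
            received.add(item);      // "process" the message
            subscription.request(1); // only now ask for the next one
        }

        @Override
        public void onError(Throwable error) {
            done.countDown();
        }

        @Override
        public void onComplete() {
            done.countDown();
        }
    }

    // Publishes the given messages and returns them in the order the subscriber saw them.
    static List<String> consume(List<String> messages) {
        OneAtATimeSubscriber subscriber = new OneAtATimeSubscriber();
        try (SubmissionPublisher<String> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            messages.forEach(publisher::submit); // submit() blocks only if the subscriber's buffer is full
        } // close() completes the subscriber once pending items are delivered
        try {
            subscriber.done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return subscriber.received;
    }

    public static void main(String[] args) {
        System.out.println(consume(List.of("m1", "m2", "m3")));
    }
}
```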
I have this direct channel:
@Bean
public DirectChannel emailingChannel() {
return MessageChannels
.direct( "emailingChannel")
.get();
}
Can I define multiple flows for the same channel like this:
@Bean
public IntegrationFlow flow1FromEmailingChannel() {
return IntegrationFlows.from( "emailingChannel" )
.handle( "myService" , "handler1")
.get();
}
@Bean
public IntegrationFlow flow2FromEmailingChannel() {
return IntegrationFlows.from( "emailingChannel" )
.handle( "myService" , "handler2" )
.get();
}
EDIT
@Service
public class MyService {
public void handler1(Message<String> message){
....
}
public void handler2(Message<List<String>> message){
....
}
}
Each flow's handle(...) method handles a different payload data type, but the goal is the same, i.e. reading data from the channel and calling the relevant handler. I would like to avoid many if...else checks on the data type in one handler.
Additional question: What happens when multiple threads call the same channel (no matter its type: Direct, PubSub or Queue) at the same time (since by default a @Bean has singleton scope)?
Thanks a lot
With a direct channel messages will be round-robin distributed to the consumers.
With a queue channel only one consumer will get each message; the distribution will be based on their respective pollers.
With a pub/sub channel both consumers will get each message.
You need to provide more information but it sounds like you need to add a payload type router to your flow to direct the messages to the right consumer.
EDIT
When the handler methods are in the same class you don't need two flows; the framework will examine the methods and (as long as there is no ambiguity) will call the method that matches the payload type.
.handle(myServiceBean())
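As an illustration of the idea behind dispatching on payload type, here is a small plain-Java model. It is not how the framework is implemented and not its API: `PayloadTypeDispatchModel`, `ROUTES` and `route` are hypothetical names. The point is only that the type check lives in one routing table instead of being scattered across if...else blocks inside the handlers.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Plain-Java model of payload-type dispatch; names are hypothetical, not framework API.
final class PayloadTypeDispatchModel {

    static String handler1(String payload) {
        return "handler1:" + payload;
    }

    static String handler2(List<?> payload) {
        return "handler2:" + payload.size();
    }

    // Routing table: payload type -> handler. Insertion order decides ties.
    private static final Map<Class<?>, Function<Object, String>> ROUTES = new LinkedHashMap<>();

    static {
        ROUTES.put(String.class, p -> handler1((String) p));
        ROUTES.put(List.class, p -> handler2((List<?>) p));
    }

    // Picks the first handler whose declared type matches the payload.
    static String route(Object payload) {
        for (Map.Entry<Class<?>, Function<Object, String>> entry : ROUTES.entrySet()) {
            if (entry.getKey().isInstance(payload)) {
                return entry.getValue().apply(payload);
            }
        }
        throw new IllegalArgumentException("No handler for " + payload.getClass().getName());
    }

    public static void main(String[] args) {
        System.out.println(route("hello"));
        System.out.println(route(List.of("a", "b")));
    }
}
```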
I am using the Java DSL to configure an SFTP outbound flow.
Gateway:
@MessagingGateway
public interface SftpGateway {
@Gateway(requestChannel = "sftp-channel")
void sendFiles(List<Message> messages);
}
Config:
@Bean
public IntegrationFlow sftpFlow(DefaultSftpSessionFactory sftpSessionFactory) {
return IntegrationFlows
.from("sftp-channel")
.split()
.handle(Sftp.outboundAdapter(sftpSessionFactory, FileExistsMode.REPLACE)
.useTemporaryFileName(false)
.remoteDirectory(REMOTE_DIR_TO_CREATE).autoCreateDirectory(true)).get();
}
@Bean
public DefaultSftpSessionFactory sftpSessionFactory() {
...
}
How can I configure the flow to make my gateway reply with the messages that failed?
In other words, I want my gateway to be able to return a list of the messages that failed, not void.
I marked the gateway with
@MessagingGateway(errorChannel = "errorChannel")
and wrote an error channel flow
@Bean
public IntegrationFlow errorFlow() {
return IntegrationFlows.from("errorChannel").handle(new GenericHandler<MessagingException>() {
public Message handle(MessagingException payload, Map headers) {
System.out.println(payload.getFailedMessage().getHeaders());
return payload.getFailedMessage();
}
})
.get();
}
@Bean
public MessageChannel errorChannel() {
return MessageChannels.direct().get();
}
and in case of errors (e.g. no connection to SFTP) I get only one error (the payload of the first message in the list).
Where should I put an Advice to aggregate all messages?
This is not really a Spring Integration Java DSL question.
It is mostly a design and architecture task.
Currently you don't have any choice, because you use Sftp.outboundAdapter(), which is one-way and therefore produces no reply. Your SftpGateway is ready for that behavior with its void return type.
If you have downstream errors, you can only throw them, or catch them and send them to some error channel.
Regarding your request:
i want my gateway to be able to return list of messages which were failed, not void.
I'd say it depends. It is just the return value of your gateway, so returning an empty list to the gateway may mean that there were no errors.
Since Java doesn't support multiple return values, we have no choice but to do something in the flow that builds a single message to return: as we decided, the list of failed messages.
Since you have .split() there, you should look into .aggregate() to build a single reply.
The aggregator correlates with the splitter easily enough, via the default applySequence = true.
To send to the aggregator, I'd suggest taking a look at ExpressionEvaluatingRequestHandlerAdvice on the Sftp.outboundAdapter() endpoint (the second parameter of .handle()). With that you should send both good and bad messages to the same .aggregate() flow. Then you can iterate over the result list to remove the good results from it. What remains can be sent to the SftpGateway using the replyChannel header.
I understand that it sounds a bit complicated, but what you want doesn't exist out of the box. You need to think and experiment yourself to figure out what can be achieved.
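The overall shape of that design (split, attempt each item, collect every outcome, then keep only the failures) can be sketched in plain Java. This is only a model of the data flow, not Spring Integration code: `FailedItemsModel`, `Outcome` and `sendAll` are hypothetical names, and the try/catch stands in for what the advice would do on the outbound endpoint.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Plain-Java model of split -> send -> aggregate -> filter; names are hypothetical.
final class FailedItemsModel {

    // One outcome per split item: the item plus whether sending it succeeded.
    static final class Outcome {
        final String item;
        final boolean ok;

        Outcome(String item, boolean ok) {
            this.item = item;
            this.ok = ok;
        }
    }

    // Sends every item and returns the ones that failed (an empty list means no errors).
    static List<String> sendAll(List<String> items, Consumer<String> sender) {
        List<Outcome> outcomes = new ArrayList<>();
        for (String item : items) {                    // "split": one attempt per item
            try {
                sender.accept(item);
                outcomes.add(new Outcome(item, true)); // success goes to the aggregation too
            } catch (RuntimeException e) {
                outcomes.add(new Outcome(item, false)); // failure is recorded, not rethrown
            }
        }
        List<String> failed = new ArrayList<>();       // "aggregate", then drop the good results
        for (Outcome outcome : outcomes) {
            if (!outcome.ok) {
                failed.add(outcome.item);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        List<String> failed = sendAll(List.of("a.txt", "bad-1.txt", "b.txt"), name -> {
            if (name.startsWith("bad")) {
                throw new IllegalStateException("simulated transfer failure: " + name);
            }
        });
        System.out.println("failed: " + failed);
    }
}
```

Recording successes as well as failures matters in the real flow, because the aggregator needs one entry per split message before it can release the group.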