Split/Aggregate never releases the groups (Spring Integration with Java DSL)

I am trying to group a list of GeoJSON Features by a shared ID, in order to aggregate a single field of those Features, using split/aggregate, like so:
@Bean
IntegrationFlow myFlow() {
    return IntegrationFlows.from(MY_DIRECT_CHANNEL)
            .handle(Http.outboundGateway(myRestUrl)
                    .httpMethod(HttpMethod.GET)
                    .expectedResponseType(FeatureCollection.class)
                    .mappedResponseHeaders(""))
            .split(FeatureCollection.class, FeatureCollection::getFeatures)
            .aggregate(aggregator -> aggregator
                    .outputProcessor(agg -> {
                        final List<String> collected = agg
                                .getMessages()
                                .stream()
                                .map(m -> ((Number) ((Feature) m.getPayload()).getProperties().get("my_field")).intValue() + "")
                                .collect(Collectors.toList());
                        return MyPojo.builder()
                                .myId(((Number) agg.getGroupId()).longValue())
                                .myListString(String.join(",", collected))
                                .build();
                    })
                    .correlationStrategy(m -> ((Feature) m.getPayload()).getProperties().get("shared_id"))
                    // .sendPartialResultOnExpiry(true)
                    // .groupTimeout(10000) // there's got to be a better way ...
                    // .expireGroupsUponTimeout(false)
            )
            .handle(Jpa.updatingGateway(myEntityManagerFactory).namedQuery(MyPojo.QUERY_UPDATE),
                    spec -> spec.transactional(myTransactionManager))
            .nullChannel();
}
Unless I uncomment those three lines, the aggregator never releases the groups and the database never receives any updates. If I set the groupTimeout to less than 5 seconds, partial results go missing.
I expected the release strategy to default to SimpleSequenceSizeReleaseStrategy, which I assumed would automatically release all the groups once all of the (split) Features had been processed (there are only 129 Features in total in the response from the REST service). Manually setting it as the releaseStrategy doesn't help.
What is the proper way to release the groups once all 129 messages have been processed?

I got it to work using a transformer instead of split/aggregate (in hindsight the aggregator could not release because the custom correlationStrategy keys the groups by shared_id, while the splitter stamps sequenceSize = 129 on every message, so a sequence-size release strategy never sees a complete group):
@Bean
IntegrationFlow myFlow(MyTransformer myTransformer) {
    return IntegrationFlows.from(MY_DIRECT_CHANNEL)
            .handle(Http.outboundGateway(myRestUrl)
                    .httpMethod(HttpMethod.GET)
                    .expectedResponseType(FeatureCollection.class)
                    .mappedResponseHeaders(""))
            .transform(myTransformer)
            .split()
            .handle(Jpa.updatingGateway(myEntityManagerFactory).namedQuery(MyEntity.QUERY_UPDATE),
                    spec -> spec.transactional(myTransactionManager))
            .nullChannel();
}
And the signature of the transformer is:
@Component
public class MyTransformer implements GenericTransformer<FeatureCollection, List<MyEntity>> {

    @Override
    public List<MyEntity> transform(FeatureCollection featureCollection) {
        ...
    }
}
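For reference, a minimal sketch of what that transform body might look like, assuming the property keys from the question (shared_id, my_field) and a MyEntity builder shaped like the MyPojo builder above (both assumptions, not taken from the original post):
@Override
public List<MyEntity> transform(FeatureCollection featureCollection) {
    // Group the features by their shared ID ...
    Map<Long, List<Feature>> grouped = featureCollection.getFeatures().stream()
            .collect(Collectors.groupingBy(
                    f -> ((Number) f.getProperties().get("shared_id")).longValue()));
    // ... then join each group's "my_field" values into one comma-separated string.
    return grouped.entrySet().stream()
            .map(group -> MyEntity.builder() // assumed builder, mirroring MyPojo above
                    .myId(group.getKey())
                    .myListString(group.getValue().stream()
                            .map(f -> String.valueOf(((Number) f.getProperties().get("my_field")).intValue()))
                            .collect(Collectors.joining(",")))
                    .build())
            .collect(Collectors.toList());
}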

Related

How does JdbcPollingChannelAdapter maxRows differ from Poller maxMessagesPerPoll?

I have multiple polling flows using the same flow logic but varying the channel (a database column, id_channel, separates rows by dependency).
On the JdbcPollingChannelAdapter, setMaxRows is fixed to 1. To my understanding, each round trip to the database will fetch one row.
If I have 5 polling flows and 10 threads, how will the polling flows "compete" with each other? Does setting Pollers.maxMessagesPerPoll make any difference to the concurrency, given that JdbcPollingChannelAdapter.setMaxRows is always 1?
My application.properties (custom datasource) has:
spring.task.scheduling.pool.size=10
spring.pgsql.hikari.maximum-pool-size=10
Flow logic:
private MessageSource<Object> buildJdbcMessageSource(final int channel) {
    JdbcPollingChannelAdapter adapter = new JdbcPollingChannelAdapter(dataSource, FETCH_QUERY);
    adapter.setMaxRows(1);
    adapter.setUpdatePerRow(true);
    adapter.setSelectSqlParameterSource(new MapSqlParameterSource(Map.of("idCanal", channel)));
    adapter.setRowMapper((RowMapper<IntControle>) (rs, i)
            -> new IntControle(rs.getLong(1), rs.getInt(2), rs.getString(3)));
    adapter.setUpdateSql(UPDATE_QUERY);
    return adapter;
}

private IntegrationFlow buildIntegrationFlow(final int channel, final long rate, final int maxMessages) {
    return IntegrationFlows.from(buildJdbcMessageSource(channel),
            c -> c.poller(Pollers.fixedDelay(rate)
                    .transactional(transactionInterceptor())
                    .maxMessagesPerPoll(maxMessages)))
            .split()
            .enrichHeaders(h -> h.header(MessageHeaders.ERROR_CHANNEL, ERROR_CHANNEL))
            .channel(SybaseFlowConfiguration.SYBASE_SINK)
            .get();
}

public IntegrationFlow pollingFlowChannel1() {
    return buildIntegrationFlow(1, properties.getChan1RateMs(), properties.getChan1MaxMessages());
}

public IntegrationFlow pollingFlowChannel2() {
    return buildIntegrationFlow(2, properties.getChan2RateMs(), properties.getChan2MaxMessages());
}
...
We have some explanation in the docs: https://docs.spring.io/spring-integration/docs/current/reference/html/jdbc.html#jdbc-max-rows-versus-max-messages-per-poll.
Please tell us if that is not enough for your expectations.
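In short (a hedged summary of that doc section): maxMessagesPerPoll is a poller-level setting that caps how many times the poller calls receive() on the message source within one poll cycle, while maxRows caps how many rows a single receive(), i.e. a single SELECT, may return as one message payload. A sketch to illustrate, reusing dataSource and FETCH_QUERY from the question (the bean name is made up):
@Bean
public IntegrationFlow maxRowsVersusMaxMessagesPerPoll() {
    JdbcPollingChannelAdapter adapter = new JdbcPollingChannelAdapter(dataSource, FETCH_QUERY);
    adapter.setMaxRows(1); // each receive() runs one SELECT and maps at most one row to one message
    return IntegrationFlows.from(adapter,
            c -> c.poller(Pollers.fixedDelay(1000)
                    .maxMessagesPerPoll(5))) // up to five receive() calls (five one-row SELECTs) per poll cycle
            .nullChannel();
}
So with five such flows, the competition is only for the 10 scheduler threads and the 10 pooled connections configured above; maxMessagesPerPoll changes how much work one flow does per poll cycle, not how many rows one query fetches.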

Spring Integration preventDuplicates(false) not working

Given the code below, I would expect preventDuplicates(false) to allow files with the same name, size, and last modification to be processed. However, the code still acts as if this flag were true, in that only a file with a new name is accepted.
The API says to configure an AcceptOnceFileListFilter if preventDuplicates == true, otherwise an AcceptAllFileListFilter.
I take this to mean that AcceptAllFileListFilter is used automatically; however, a breakpoint set on this filter is never hit.
The patternFilter is not working either: *.csv files are also being processed.
Removing either preventDuplicates() or patternFilter() (in case a filter chain was causing the problem) makes no difference.
Can anyone explain why this is happening?
@SpringBootApplication
public class ApplicationTest {

    public static void main(String[] args) {
        ConfigurableApplicationContext ctx =
                new SpringApplicationBuilder(ApplicationTest.class)
                        .web(WebApplicationType.NONE)
                        .run(args);
    }

    @Bean
    public IntegrationFlow readBackUpFlow() {
        return IntegrationFlows
                .from(
                        Files
                                .inboundAdapter(new File("d:/temp/in"))
                                .patternFilter("*.txt")
                                .preventDuplicates(false)
                                .get(),
                        e -> e.poller(Pollers.fixedDelay(5000))
                )
                .log("File", m -> m)
                .transform(Files.toStringTransformer())
                .log("Content", m -> m)
                .handle((p, h) -> h.get(FileHeaders.ORIGINAL_FILE, File.class))
                .handle(Files.outboundAdapter(new File("d:/temp/done")).deleteSourceFiles(true))
                .get();
    }
}
Don't call get() on that Files.inboundAdapter(). The spec does some extra work in getComponentsToRegister(), which the framework calls at a specific phase; calling get() yourself bypasses that, and that is why your filters are not applied to the target channel adapter.
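For reference, a corrected version of the flow from the question, identical except that the spec is handed to from() without the premature get():
@Bean
public IntegrationFlow readBackUpFlow() {
    return IntegrationFlows
            .from(Files.inboundAdapter(new File("d:/temp/in"))
                            .patternFilter("*.txt")
                            .preventDuplicates(false), // no .get(): the framework finishes the spec itself
                    e -> e.poller(Pollers.fixedDelay(5000)))
            .log("File", m -> m)
            .transform(Files.toStringTransformer())
            .log("Content", m -> m)
            .handle((p, h) -> h.get(FileHeaders.ORIGINAL_FILE, File.class))
            .handle(Files.outboundAdapter(new File("d:/temp/done")).deleteSourceFiles(true))
            .get();
}
The final get() on the IntegrationFlows builder itself is fine; it is only the spec's get() that must be left to the framework.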

Spring Integration DSL: Simple way to send some messages to a Flow inputChannel

If I want to generate some sample data for testing Spring Integration DSL functionality, one way I have come up with so far is this:
@Bean
public IntegrationFlow myFlow() {
    return IntegrationFlows
            .from(Http.inboundChannelAdapter("numbers").get())
            .scatterGather(s -> s
                    .applySequence(true)
                    .recipientFlow(f -> f.handle((a, b) -> Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)))
            )
            .split() // unpack the wrapped list
            .split() // unpack the elements of the list
            .log()
            .get();
}
Is there another/better way to do the same thing? Using the Scatter-Gather EIP seems like overkill for something so basic...
You really can inject a specific MessageChannel from your flow into some testing component and send messages directly to that channel, to be processed by the expecting endpoint in the flow. It is not clear what that scatter-gather does in your flow, or where the channel you mention in the title is, but another way is to use a @MessagingGateway interface.
See more in docs:
https://docs.spring.io/spring-integration/docs/current/reference/html/messaging-endpoints.html#gateway
https://docs.spring.io/spring-integration/docs/current/reference/html/dsl.html#java-dsl-channels
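For the gateway option, a minimal sketch, assuming the flow under test receives from a channel named "numbersChannel" (a made-up name; it must match your flow's input channel). A test can then autowire NumbersGateway and call sendNumbers(List.of(1, 2, 3)), or skip the gateway entirely and send a GenericMessage straight to the autowired MessageChannel:
// Hypothetical test gateway; "numbersChannel" is an assumed channel name.
@MessagingGateway
public interface NumbersGateway {

    @Gateway(requestChannel = "numbersChannel")
    void sendNumbers(List<Integer> numbers);
}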
Here's another way to get some test data injected into a flow:
@Bean
IntegrationFlow pollingFlow() {
    return IntegrationFlow
            .from(() -> new GenericMessage<>(List.of(1, 2, 3)),
                    e -> e.poller(Pollers.fixedRate(Duration.ofSeconds(1))))
            // do something here every second
            .get();
}
I found a simpler way of doing the same thing using .transform() instead of .scatterGather():
@Bean
public IntegrationFlow testFlow() {
    return IntegrationFlows
            .from(Http.inboundChannelAdapter("test").get())
            .transform(t -> Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
            .split() // unpack the wrapped list
            .split() // unpack the elements of the list
            .get();
}
And here's another way you could do it, using .gateway():
@Bean
public IntegrationFlow testFlow() {
    return IntegrationFlows
            .from(Http.inboundChannelAdapter("test").get())
            .gateway(f -> f.handle((a, b) -> Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)))
            .split() // unpack the elements of the list
            .get();
}
And yet another:
@Bean
IntegrationFlow fromSupplier() {
    return IntegrationFlow.fromSupplier(() -> List.of(1, 2, 3, 4),
            s -> s.poller(p -> p.fixedDelay(1, TimeUnit.DAYS)))
            .log(Message::getPayload)
            .nullChannel();
}

How to register the integration flows in runtime?

I'm building a microservice for multiple properties, and each property has a different configuration. To do that, I've implemented something like this:
@Autowired
IntegrationFlowContext flowContext;

@Bean
public void setFlowContext() {
    List<Login> loginList = DAO.getLoginList(); // a web service
    loginList.forEach(e -> {
        IntegrationFlow flow = IntegrationFlows.from(() -> e, c -> c.poller(Pollers.fixedRate(e.getPeriod(), TimeUnit.SECONDS, 5)))
                .channel("X_CHANNEL")
                .get();
        flowContext.registration(flow).register();
    });
}
With this implementation I fetch the loginList once, before the application has started; after the application is up I can no longer refresh the loginList from the web service, since there is no poller configured for it. The problem is that the loginList can change: new login credentials can be added or deleted. Therefore, I want something that runs every X time period to fetch the loginList from the web service and then registers a flow for each Login in the list. To achieve this, I've implemented something like this:
@Bean
public IntegrationFlow setFlowContext() {
    return IntegrationFlows
            .from(this::getSpecification, p -> p.poller(Pollers.fixedRate(X))) // the specification is constant
            .transform(payload -> DAO.getLoginList(payload))
            .split()
            .<Login>handle((payload, header) -> {
                IntegrationFlow flow = IntegrationFlows.from(() -> payload, c -> c.poller(Pollers.fixedRate(payload.getPeriod(), TimeUnit.SECONDS, 5)))
                        .channel("X_CHANNEL")
                        .get();
                flowContext.registration(flow).register().start();
                return null;
            })
            .get();
}
Basically, I've used the start() method, but this is not working as expected. See this:
flowContext.registration(flow).register().start();
Lastly, I've read the Dynamic and Runtime Integration Flows documentation, but I still couldn't implement this feature.
Dynamic flow registration cannot be used within a @Bean definition.
It is designed to be used at runtime AFTER the application context is fully initialized.
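A minimal sketch of the runtime approach, assuming the Login and DAO types from the question; the flow id scheme, the login.getId() accessor, and the refresh-rate property are made up for illustration:
// Requires @EnableScheduling somewhere in the configuration.
@Component
public class LoginFlowRegistrar {

    private final IntegrationFlowContext flowContext;

    public LoginFlowRegistrar(IntegrationFlowContext flowContext) {
        this.flowContext = flowContext;
    }

    // Runs periodically AFTER the context is up, so dynamic registration is safe here.
    @Scheduled(fixedRateString = "${login.refresh-rate-ms}") // hypothetical property
    public void refreshLoginFlows() {
        for (Login login : DAO.getLoginList()) {
            String flowId = "loginFlow-" + login.getId(); // hypothetical id accessor
            if (flowContext.getRegistrationById(flowId) == null) { // skip logins already registered
                IntegrationFlow flow = IntegrationFlows
                        .from(() -> login, c -> c.poller(Pollers.fixedRate(login.getPeriod(), TimeUnit.SECONDS, 5)))
                        .channel("X_CHANNEL")
                        .get();
                flowContext.registration(flow).id(flowId).register();
            }
        }
    }
}
Logins that disappear from the list could be cleaned up the same way with flowContext.remove(flowId).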

How to make an IntegrationFlow with transform?

I want to take records from the database and transform them to JSON. This runs on Spring Cloud Data Flow.
I suspect I am missing some call on the IntegrationFlow.
The error output is:
Caused by: org.springframework.messaging.core.DestinationResolutionException: no output-channel or replyChannel header available
at org.springframework.integration.handler.AbstractMessageProducingHandler.sendOutput(AbstractMessageProducingHandler.java:440)
at org.springframework.integration.handler.AbstractMessageProducingHandler.doProduceOutput(AbstractMessageProducingHandler.java:319)
at org.springframework.integration.handler.AbstractMessageProducingHandler.produceOutput(AbstractMessageProducingHandler.java:267)
at org.springframework.integration.handler.AbstractMessageProducingHandler.sendOutputs(AbstractMessageProducingHandler.java:231)
at org.springframework.integration.handler.AbstractReplyProducingMessageHandler.handleMessageInternal(AbstractReplyProducingMessageHandler.java:140)
at org.springframework.integration.handler.AbstractMessageHandler.handleMessage(AbstractMessageHandler.java:62)
@Bean
public MessageSource<Object> jdbcMessageSource() {
    String query = "select cd_int_controle, de_tabela from int_controle rowlock readpast " +
            "where id_status = 0 order by cd_int_controle";
    JdbcPollingChannelAdapter adapter =
            new JdbcPollingChannelAdapter(dataSource, query);
    adapter.setMaxRows(properties.getPollSize());
    adapter.setUpdatePerRow(true);
    adapter.setRowMapper((RowMapper<IntControle>) (rs, i) -> new IntControle(rs.getLong(1), rs.getString(2)));
    adapter.setUpdateSql("update int_controle set id_status = 1 where cd_int_controle = :cdIntControle");
    return adapter;
}

@Bean
public IntegrationFlow jsonSupplier() {
    return IntegrationFlows.from(jdbcMessageSource(),
            c -> c.poller(Pollers.fixedRate(properties.getPollRateMs(), TimeUnit.MILLISECONDS).transactional()))
            .transform((GenericTransformer<List<IntControle>, String>) ints -> {
                // transform to Json
            })
            .get();
}
You are missing several points:
The transform() in Spring Integration indeed requires an output channel or a replyChannel header. There is just no way in Spring Integration to bypass channels between endpoints: even if you don't declare one between the JDBC source and the transform, the framework puts one there anyway. Since you call get() at the end of the flow and give no hint about which channel to send the transform result to, that DestinationResolutionException is thrown.
The Spring Cloud Stream functional model deals with the basic Java interfaces Supplier, Function and Consumer. Naming a bean jsonSupplier doesn't make it a Supplier. You really need to tell the framework which bean to use for binding. See the docs for more info: https://cloud.spring.io/spring-cloud-static/spring-cloud-stream/3.0.6.RELEASE/reference/html/spring-cloud-stream.html#spring_cloud_function
So, you are missing a connection point between the IntegrationFlow and a Supplier declaration. Something like this could probably work for you:
@Bean
PollableChannel jsonChannel() {
    return new QueueChannel();
}

...

        .transform((GenericTransformer<List<IntControle>, String>) ints -> {
            // transform to Json
        })
        .channel(jsonChannel())
        .get();

...

@Bean
public Supplier<Message<?>> jsonSupplier() {
    return jsonChannel()::receive;
}
So, the idea is to dump the result of the flow into some channel and then bridge that data from a Supplier, which is visible to the Spring Cloud Stream binding logic.
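(One more hedged note: with the functional model, if the application exposes more than one functional bean you also have to tell the binder which one to bind, e.g. spring.cloud.function.definition=jsonSupplier in application.properties.)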
See also here: https://sagan-production.cfapps.io/blog/2019/10/25/spring-cloud-stream-and-spring-integration
