FTP IntegrationFlows Filters not working in Spring Integration 5.0.0.RC1 - spring-integration

I have upgraded an integration flow from 4.3.12 to 5.0.0.RC1 to take advantage of the inbound stream capabilities. I'm finding that both the patternFilter and regexFilter are not filtering at all. To check that it wasn't just the streaming interface, I tried with the file based interface and I'm seeing the same results.
In 4.3.12 I had my file based flow defined by:
return IntegrationFlows
.from(s -> s.ftp(ftpSessionFactory())
.preserveTimestamp(true)
.remoteDirectoryExpression(remoteDirectory())
.regexFilter("sn\\.[0-9]{4}\\.txt$")
.filter(ftpPersistantFilter())
.localFilter(fileSystemPersistantFilter())
.localFilename(f -> (currentUtcDay.toString("YYYYMMdd")) + "." + f)
.localDirectory(new File(this.localDirectory)),
e -> e.id("ftpInboundAdapter").autoStartup(true))
.channel(MessageChannels.queue("ftpInboundResultChannel"))
.get();
For consistency, here is the same definition in 5.0.0.RC1:
return IntegrationFlows
.from(Ftp.inboundAdapter(ftpSessionFactory())
.preserveTimestamp(true)
.remoteDirectoryExpression(remoteDirectory())
.regexFilter("sn\\.[0-9]{4}\\.txt$")
.filter(ftpPersistantFilter())
.localFilter(fileSystemPersistantFilter())
.localFilename(f -> (currentUtcDay.toString("YYYYMMdd")) + "." + f)
.localDirectory(new File(this.localDirectory)),
e -> e.id("ftpInboundAdapter").poller(Pollers.fixedDelay(100)))
.channel(MessageChannels.queue("ftpInboundResultChannel"))
.get();
It is not filtering at all in 5.0.0.RC1. Has the configuration for the filters changed? Is there anything additional I need to do?
Edit:
For the next person who encounters this, here is the fix.
return IntegrationFlows
.from(Ftp.inboundAdapter(ftpSessionFactory())
.preserveTimestamp(true)
.remoteDirectoryExpression(remoteDirectory())
.filter(ftpPersistantFilter())
.localFilter(fileSystemPersistantFilter())
.localFilename(f -> (currentUtcDay.toString("YYYYMMdd")) + "." + f)
.localDirectory(new File(this.localDirectory)),
e -> e.id("ftpInboundAdapter").poller(Pollers.fixedDelay(100)))
.channel(MessageChannels.queue("ftpInboundResultChannel"))
.get();
Then I changed my ftpPersistantFilter from:
@Bean
public FtpPersistentAcceptOnceFileListFilter ftpPersistantFilter() {
return new FtpPersistentAcceptOnceFileListFilter(metadataStore(), "ftpPersistentAcceptOnce");
}
to:
@Bean
public CompositeFileListFilter ftpPersistantFilter() {
CompositeFileListFilter filters = new CompositeFileListFilter();
filters.addFilter(new FtpPersistentAcceptOnceFileListFilter(metadataStore(), "ftpPersistentAcceptOnce"));
filters.addFilter(new FtpRegexPatternFileListFilter(regexFilter));
return filters;
}

The change in Spring Integration 5.0 is that .filter(ftpPersistantFilter()) now fully overrides any filter options configured before it:
/**
* Configure a {@link FileListFilter} to be applied to the remote files before
* copying them.
* @param filter the filter.
* @return the spec.
*/
public S filter(FileListFilter<F> filter) {
this.synchronizer.setFilter(filter);
return _this();
}
So, your .regexFilter("sn\\.[0-9]{4}\\.txt$") is ignored.
The change was made this way to avoid confusion from unexpected internal compositions. For example, the regex and pattern filters are composed together with the FtpPersistentAcceptOnceFileListFilter (see https://docs.spring.io/spring-integration/docs/5.0.0.RC1/reference/html/whats-new.html#__s_ftp_changes):
All the Inbound Channel Adapters (streaming and synchronization-based) now use an appropriate AbstractPersistentAcceptOnceFileListFilter implementation by default to prevent remote files duplicate downloads.
In other words, the filter-based options are mutually exclusive and the last one wins. That is much easier to support and spares end users from unexpected mutations.
To meet your requirements, combine your ftpPersistantFilter and an FtpRegexPatternFileListFilter in a CompositeFileListFilter.
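As a rough illustration of the two behaviours described above, here is a plain-Java model that uses no Spring classes: a single filter slot where the last setter call wins, and a composite that passes a file name only if every member accepts it. All names here (FilterSlotSketch, setFilter, regex, acceptOnce, composite) are made up for the sketch and are not the Spring Integration API. Matching the whole file name is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.regex.Pattern;

public class FilterSlotSketch {

    // Single filter slot: each setFilter call replaces whatever was set before.
    private Predicate<String> filter = name -> true;

    public void setFilter(Predicate<String> filter) {
        this.filter = filter;
    }

    public List<String> filterFiles(List<String> names) {
        List<String> accepted = new ArrayList<>();
        for (String name : names) {
            if (filter.test(name)) {
                accepted.add(name);
            }
        }
        return accepted;
    }

    // Hypothetical stand-in for a regex filter; the whole name must match (sketch assumption).
    public static Predicate<String> regex(String regex) {
        Pattern pattern = Pattern.compile(regex);
        return name -> pattern.matcher(name).matches();
    }

    // Hypothetical stand-in for an accept-once filter: passes a name only the first time it is seen.
    public static Predicate<String> acceptOnce() {
        Set<String> seen = new HashSet<>();
        return seen::add;
    }

    // Composite: a name passes only if every member accepts it.
    @SafeVarargs
    public static Predicate<String> composite(Predicate<String>... members) {
        return name -> {
            for (Predicate<String> member : members) {
                if (!member.test(name)) {
                    return false;
                }
            }
            return true;
        };
    }
}
```

Calling setFilter(regex(...)) and then setFilter(acceptOnce()) leaves only the accept-once check active, which mirrors the symptom in the question. Passing composite(acceptOnce(), regex(...)) to a single setFilter call keeps both checks.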
I think we have to add a Migration Guide bullet on the matter.
Thanks for understanding.

Related

Move file from inbound adapter after publish subscribe flow

I'm trying to implement the following flow:
1) files are read from inbound adapter
2) they are sent to different flows using a publish-subscribe channel with applied sequence
3) file is moved after all the subscriber flows are ready
This is the main flow
return IntegrationFlows
.from(Files.inboundAdapter(inboundOutDirectory)
.regexFilter(pattern)
.useWatchService(true)
.watchEvents(FileReadingMessageSource.WatchEventType.CREATE),
e -> e.poller(Pollers.fixedDelay(period)
.taskExecutor(Executors.newFixedThreadPool(poolSize))
.maxMessagesPerPoll(maxMessagesPerPoll)))
.publishSubscribeChannel(s -> s
.applySequence(true)
.subscribe(f -> f
.transform(Files.toStringTransformer())
.<String>handle((p, h) -> {
return "something";
})
.channel("consolidateFlow.input"))
.subscribe(f -> f
.transform(Files.toStringTransformer())
.handle(Http.outboundGateway(testUri)
.httpMethod(HttpMethod.GET)
.uriVariable("text", "payload")
.expectedResponseType(String.class))
.<String>handle((p, h) -> {
return "something";
})
.channel("consolidateFlow.input")))
.get();
And the aggregation:
public IntegrationFlow consolidateFlow() {
return flow -> flow
.aggregate()
.<List<String>>handle((p, h) -> "something").log();
}
}
Using the following code in the main flow after publish-subscribe
.handle(Files.outboundGateway(this.inboundProcessedDirectory).deleteSourceFiles(true))
ends up with
Caused by: org.springframework.messaging.core.DestinationResolutionException: no output-channel or replyChannel header available
If I go with this the consolidation/aggregation flow won't be reached at all.
.handle(Files.outboundAdapter(this.inboundProcessedDirectory))
Any idea how I could solve it? Currently I'm moving the file after the aggregation by reading the original file name from the header but it doesn't seem to be the right solution.
I was also thinking about applying spec/advice to the inbound adapter with success logic to move the file but not sure whether that's the right approach.
EDIT1
As suggested by Artem, I've added another subscriber to the publish-subscribe as follows:
...
.channel("consolidateNlpFlow.input"))
.subscribe(f -> f
.handle(Files.outboundAdapter(this.inboundProcessedDirectory).deleteSourceFiles(true))
...
The file is moved properly, but the consolidateFlow is not being executed at all. Any idea?
I've also tried adding the channel to the new flow .channel("consolidateNlpFlow.input") but it didn't change the behavior.
Your problem is that consolidateFlow is not able to return a result into the main flow, simply because there is nothing gateway-like there. You use an explicit .channel("consolidateFlow.input"), which means there is no way back.
That explains the issue you have so far.
Regarding a possible solution.
According to your configuration, both subscribers in the publishSubscribeChannel run on the same thread, one after the other. So it is very easy to add one more subscriber with that Files.outboundAdapter() and deleteSourceFiles(true). It will be called after the existing subscribers.
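To make the ordering argument concrete, here is a minimal plain-Java model of a publish-subscribe channel without an executor: subscribers are invoked in subscription order on the calling thread, so a subscriber added last runs after the others have returned. The class and the log strings are hypothetical. This is a sketch of the idea, not Spring Integration code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class OrderedSubscribersSketch {

    private final List<Consumer<String>> subscribers = new ArrayList<>();

    public OrderedSubscribersSketch subscribe(Consumer<String> subscriber) {
        subscribers.add(subscriber);
        return this;
    }

    // No executor involved: every subscriber runs on the caller's thread, in subscription order.
    public void send(String payload) {
        for (Consumer<String> subscriber : subscribers) {
            subscriber.accept(payload);
        }
    }

    // Records the order in which three hypothetical subscribers see the same file name.
    public static List<String> demo() {
        List<String> log = new ArrayList<>();
        new OrderedSubscribersSketch()
                .subscribe(p -> log.add("transform " + p))
                .subscribe(p -> log.add("http " + p))
                .subscribe(p -> log.add("move " + p))
                .send("a.txt");
        return log;
    }
}
```

In this model the "move" step always comes last, because send() does not return from one subscriber before calling the next.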

spring-integration: how to deliver deferred details as SSE

I have a list of items which I want to retrieve and return as fast as possible.
For each item I also need to retrieve details, they may be returned a few seconds later.
I could of course create two different routes with HTTP gateways and request first the list, then the details. However, I then have to wait until all details have arrived. I want to send back the list immediately and then the details as soon as I get them.
UPDATE
Following Artem Bilan's advice my flow returns a Flux as payload which merges the list of items as a Mono and the processed items as a Flux.
Note that the example below simulates detail processing of the items by calling toUpperCase; my real use case requires routing and outgoing calls to get the details for each item:
@Bean
public IntegrationFlow sseFlow() {
return IntegrationFlows
.from(WebFlux.inboundGateway("/strings/sse")
.requestMapping(m -> m.produces(MediaType.TEXT_EVENT_STREAM_VALUE))
.mappedResponseHeaders("*"))
.enrichHeaders(Collections.singletonMap("aHeader", new String[]{"foo", "bar"}))
.transform("headers.aHeader")
.<String[]>handle((p, h) -> {
return Flux.merge(
Mono.just(p),
Flux.fromArray(p)
.map(t -> {
return t.toUpperCase();
// return detailsResolver.resolveDetail(t);
}));
})
.get();
}
That comes closer to my goal. When I request data from this flow using curl, I get the list of items immediately and the processed items slightly later:
λ curl http://localhost:8080/strings/sse
data:["foo","bar"]
data:FOO
data:BAR
While simply converting the string to uppercase works fine, I have difficulty making an outgoing call for details using WebFlux.outboundGateway. The detailsResolver in the commented-out code above is defined as follows:
@MessagingGateway
public interface DetailsResolver {
@Gateway(requestChannel = "itemDetailsFlow.input")
Object resolveDetail(String item);
}
@Bean
IntegrationFlow itemDetailsFlow() {
return f -> f.handle(WebFlux.<String>outboundGateway(m ->
UriComponentsBuilder.fromUriString("http://localhost:3003/rest/path/")
.path(m.getPayload())
.build()
.toUri())
.httpMethod(HttpMethod.GET)
.expectedResponseType(JsonNode.class)
.replyPayloadToFlux(false));
}
When I comment in the detailsResolver call and comment out t.toUpperCase, the outboundGateway seems to be set up properly (the log says Subscriber present, Demand signaled) but never gets a response (doesn't reach a breakpoint in ExchangeFunctions.exchange#91).
I have ensured that the DetailsResolver itself is working by getting it as a bean from the context and invoking its method - that gives me a JsonNode response.
What can be the reason?
Yes, I wouldn't use toReactivePublisher() there because you have the context of the current request. You need fluxes per request. I would use something like Flux.merge(Publisher<? extends I>... sources), where the first Flux is for items and the second is for details per item (something like Tuple2).
For this purpose you really can use something like this:
IntegrationFlows
.from(WebFlux.inboundGateway("/sse")
.requestMapping(m -> m.produces(MediaType.TEXT_EVENT_STREAM_VALUE)))
And your downstream flow should produce Flux as a payload for reply.
I have a sample like this in test cases:
@Bean
public IntegrationFlow sseFlow() {
return IntegrationFlows
.from(WebFlux.inboundGateway("/sse")
.requestMapping(m -> m.produces(MediaType.TEXT_EVENT_STREAM_VALUE))
.mappedResponseHeaders("*"))
.enrichHeaders(Collections.singletonMap("aHeader", new String[] { "foo", "bar", "baz" }))
.handle((p, h) -> Flux.fromArray((String[]) h.get("aHeader")))
.get();
}

Using filter with a discard channel in Spring Integration DSL

I don't know if this question is about spring-integration, spring-integration-dsl or both, so I just added the 2 tags...
I spent a considerable amount of time today, first doing a simple flow with a filter
StandardIntegrationFlow flow = IntegrationFlows.from(...)
.filter(messagingFilter)
.transform(transformer)
.handle((m) -> {
(...)
})
.get();
The messagingFilter is a very simple implementation of a MessageSelector. So far so good, not much time spent. But then I wanted to log a message in case the MessageSelector returned false, and here is where I got stuck.
After quite some time I ended up with this:
StandardIntegrationFlow flow = IntegrationFlows.from(...)
.filter(messagingFilters, fs -> fs.discardFlow(i -> i.channel(discardChannel())))
.transform(transformer)
.handle((m) -> {
(...)
})
.get();
(...)
public MessageChannel discardChannel() {
MessageChannel channel = new MessageChannel(){
@Override
public boolean send(Message<?> message) {
log.warn((String) message.getPayload().get("msg-failure"));
return true;
}
@Override
public boolean send(Message<?> message, long timeout) {
return this.send(message);
}
};
return channel;
}
This is both ugly and verbose, so the question is, what have I done wrong here and how should I have done it in a better, cleaner, more elegant solution?
Cheers.
Your problem is that you don't see that Filter is an EI pattern implementation, and the most it can do is send the discarded message to some channel. It isn't going to log anything itself, because that approach would no longer be messaging-based.
The simplest way to cover your use case is:
.discardFlow(df -> df
.handle(message -> log.warn((String) message.getPayload().get("msg-failure")))))
That is your logic, which just logs. Other people might need more complicated logic. So, eventually you'll get used to the channel abstraction between endpoints.
I agree that the new MessageChannel() {} approach is wrong. The logging should indeed be done in a MessageHandler instead; that is the level of service responsibility. Also don't forget that there is a LoggingHandler, which can be used via the Java DSL as:
.filter(messagingFilters, fs -> fs.discardFlow( i -> i.log(message -> (String) message.getPayload().get("msg-failure"))))
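The division of responsibility described above can be sketched in plain Java as follows. The filter only decides where a message goes, and whatever sits behind the discard route decides what to do with it (log it, count it, forward it). The class and its constructor parameters are hypothetical and only model the idea. They are not part of Spring Integration.

```java
import java.util.function.Consumer;
import java.util.function.Predicate;

public class DiscardFilterSketch<T> {

    private final Predicate<T> selector;
    private final Consumer<T> output;
    private final Consumer<T> discard;

    public DiscardFilterSketch(Predicate<T> selector, Consumer<T> output, Consumer<T> discard) {
        this.selector = selector;
        this.output = output;
        this.discard = discard;
    }

    // The filter itself never logs: it routes accepted messages onward
    // and hands rejected ones to the discard handler.
    public void handle(T message) {
        if (selector.test(message)) {
            output.accept(message);
        } else {
            discard.accept(message);
        }
    }
}
```

A caller that only wants logging would pass a discard handler that writes a warning, which corresponds to the .handle(...) or .log(...) step inside discardFlow.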

Enriching in parallel after a split

This is a continuation of the shopping cart sample, where we have an external API that allows checkout from a shopping cart. To recap, we have a flow where we create an empty shopping cart, add line item(s) and finally check out. All of these operations happen as enrichments through HTTP calls to an external service. We would like to add line items concurrently (as part of the add line items call). Our current configuration looks like this:
@Bean
public IntegrationFlow fullCheckoutFlow() {
return f -> f.channel("inputChannel")
.transform(fromJson(ShoppingCart.class))
.enrich(e -> e.requestChannel(SHOPPING_CART_CHANNEL))
.split(ShoppingCart.class, ShoppingCart::getLineItems)
.enrich(e -> e.requestChannel(ADD_LINE_ITEM_CHANNEL))
.aggregate(aggregator -> aggregator
.outputProcessor(g -> g.getMessages()
.stream()
.map(m -> (LineItem) m.getPayload())
.map(LineItem::getName)
.collect(joining(", "))))
.enrich(e -> e.requestChannel(CHECKOUT_CHANNEL))
.<String>handle((p, h) -> Message.called("We have " + p + " line items!!"));
}
@Bean
public IntegrationFlow addLineItem(Executor executor) {
return f -> f.channel(MessageChannels.executor(ADD_LINE_ITEM_CHANNEL, executor).get())
.handle(outboundGateway("http://localhost:8080/api/add-line-item", restTemplate())
.httpMethod(POST)
.expectedResponseType(String.class));
}
@Bean
public Executor executor(Tracer tracer, TraceKeys traceKeys, SpanNamer spanNamer) {
return new TraceableExecutorService(newFixedThreadPool(10), tracer, traceKeys, spanNamer);
}
To add line items in parallel, we are using an executor channel. However, they still seem to be processed sequentially when viewed in Zipkin.
What are we doing wrong? The source for the whole project is on github for reference.
Thanks!
First of all the main feature of Spring Integration is MessageChannel, but it still isn't clear to me why people are missing .channel() operator in between endpoint definitions.
I mean that for your case it must be like:
.split(ShoppingCart.class, ShoppingCart::getLineItems)
.channel(c -> c.executor(executor()))
.enrich(e -> e.requestChannel(ADD_LINE_ITEM_CHANNEL))
Now about your particular problem.
Look, ContentEnricher (.enrich()) is a request-reply component: http://docs.spring.io/spring-integration/reference/html/messaging-transformation-chapter.html#payload-enricher.
Therefore it sends a request to its requestChannel and waits for the reply, regardless of the requestChannel type.
In raw Java we can demonstrate this behavior with the following snippet:
for (Object item: items) {
Data data = sendAndReceive(item);
}
where you can see that making ADD_LINE_ITEM_CHANNEL an ExecutorChannel doesn't add much value, because we block inside the loop waiting for the reply anyway.
A .split() runs exactly this kind of loop, and since it uses a DirectChannel by default, the iteration happens on the same thread. Therefore each item waits for the reply to the previous one.
That's why you should parallelize at the input of .enrich(), right after .split().
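The difference can be sketched in plain Java, without any Spring classes. Here sendAndReceive is a hypothetical stand-in for the blocking request-reply call (it just sleeps and upper-cases its input). The first method is the single-threaded loop from above. The second hands each item to an executor before the blocking call, which is the role played by the executor channel placed right after .split().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

public class SplitParallelSketch {

    // Hypothetical blocking request-reply call; stands in for the enricher's HTTP request.
    static String sendAndReceive(String item) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return item.toUpperCase();
    }

    // One thread iterates: each item waits for the reply to the previous one.
    public static List<String> sequential(List<String> items) {
        List<String> replies = new ArrayList<>();
        for (String item : items) {
            replies.add(sendAndReceive(item));
        }
        return replies;
    }

    // Each item is handed to the executor before the blocking call, so the calls can overlap.
    public static List<String> parallel(List<String> items, ExecutorService executor) {
        List<Callable<String>> calls = new ArrayList<>();
        for (String item : items) {
            calls.add(() -> sendAndReceive(item));
        }
        List<String> replies = new ArrayList<>();
        try {
            // invokeAll returns the futures in the same order as the submitted tasks.
            for (Future<String> reply : executor.invokeAll(calls)) {
                replies.add(reply.get());
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
        return replies;
    }
}
```

Both methods return the same replies in the same order. Only the second lets the blocking calls run at the same time, provided the executor has more than one thread.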

How do I specify default output channel on routeToRecipients using Spring Integration Java DSL 1.0.0.M3

Since upgrading to M3 of spring-integration java dsl I'm seeing the following error on any flow using a recipient list router:
org.springframework.messaging.MessageDeliveryException: no channel resolved by router and no default output channel defined
It's not clear how to actually specify this in M3. There is no output channel option on the endpoint configurer and nothing on the RecipientListRouterSpec. Any suggestions?
According to https://jira.spring.io/browse/INTEXT-113, there is no longer any reason to specify .defaultOutputChannel(), because the next .channel() (explicit or implicit) is used for that purpose. That is because the defaultOutputChannel plays exactly the role of the standard outputChannel. Therefore you now have a more formal integration flow:
@Bean
public IntegrationFlow recipientListFlow() {
return IntegrationFlows.from("recipientListInput")
.<String, String>transform(p -> p.replaceFirst("Payload", ""))
.routeToRecipients(r -> r.recipient("foo-channel", "'foo' == payload")
.recipient("bar-channel", m ->
m.getHeaders().containsKey("recipient")
&& (boolean) m.getHeaders().get("recipient")))
.channel("defaultOutputChannel")
.handle(m -> ...)
.get();
}
Where .channel("defaultOutputChannel") can be omitted.
