Integration flow accessing a paged http resource - spring-integration

I'm trying to consume a paged resource in its entirety as follows; however, my approach raises a StackOverflowError.
Any clue about this? Or a different approach?
Example: https://gist.github.com/daniel-frank/a88fa4553ed34c348528f51d33c3733b

OK. I see now. Let me simplify your recursive code to show the problem:
private IntegrationFlow getPageFlow() {
    return f -> f
            .publishSubscribeChannel(ps -> ps
                    .subscribe(this.nextPageFlow())
            );
}

private IntegrationFlow nextPageFlow() {
    return f -> f
            .publishSubscribeChannel(ps -> ps
                    .subscribe(this.getPageFlow())
            );
}
So, technically we have this structure in memory:
getPageFlow
    nextPageFlow
        getPageFlow
            nextPageFlow
                getPageFlow
and so on.
Another problem here is that each .subscribe(this.nextPageFlow()) creates a new instance of the IntegrationFlow, whereas logically you expect only one.
I understand that you can't declare beans in the IntegrationFlowAdapter impl, but that wouldn't help with the StackOverflowError anyway.
What I see as a problem in your approach is the lack of a MessageChannel abstraction.
You use publishSubscribeChannel everywhere, whereas you could simply separate the logic with explicit channel definitions in your flow.
To break the recursion and keep the code as close to your solution as possible, I'd do it like this:
private IntegrationFlow getPageFlow() {
    return f -> f
            .channel("pageServiceChannel")
            .handle(Http
                    .outboundGateway("https://jobs.github.com/positions.json?description={description}&page={page}")
            ...

private IntegrationFlow nextPageFlow() {
    return f -> f
            .filter("!payload.isEmpty()")
            .enrichHeaders(e -> e.headerExpression("page", "headers.getOrDefault('page', 0) + 1", true))
            .channel("pageServiceChannel");
}
Of course you still have recursion, but it now happens at run time and is purely logical (a message loop through pageServiceChannel), rather than in the flow definitions themselves.
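The run-time loop can be sketched in plain Java, without Spring. This is only an illustration under stated assumptions: the queue stands in for pageServiceChannel, fetchPage is a hypothetical stub replacing the HTTP gateway, and the first page number (0) is assumed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class PagedLoopSketch {

    // Hypothetical stub for the HTTP call: pages 0..2 hold data, later pages are empty.
    static List<String> fetchPage(int page) {
        return page < 3 ? List.of("job-" + page + "a", "job-" + page + "b") : List.of();
    }

    // The queue plays the role of "pageServiceChannel": each non-empty page
    // sends a new "message" (the next page number) back to the same channel.
    static List<String> fetchAll() {
        Deque<Integer> pageServiceChannel = new ArrayDeque<>();
        pageServiceChannel.add(0); // assumed first page
        List<String> all = new ArrayList<>();
        while (!pageServiceChannel.isEmpty()) {
            int page = pageServiceChannel.poll();
            List<String> payload = fetchPage(page);
            if (payload.isEmpty()) {
                continue; // like .filter("!payload.isEmpty()"): the message is dropped and the loop ends
            }
            all.addAll(payload);
            pageServiceChannel.add(page + 1); // like incrementing the "page" header and re-sending
        }
        return all;
    }

    public static void main(String[] args) {
        System.out.println(fetchAll());
    }
}
```

Nothing is built recursively up front here; the loop simply ends when an empty page arrives.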

Related

Multiple Independent IntegrationFlow

Is the following the correct way to configure multiple independent IntegrationFlows in the same Spring Boot application? Is there any further optimization that could be done?
@Bean("flow1")
public IntegrationFlow integrationFlow1() {
    return IntegrationFlows.from(jdbcMessageSource1(), p -> p.poller(pollerSpec1()))
            .split()
            .channel(c -> c.executor(Executors.newCachedThreadPool()))
            .transform(transformer1, "transform")
            .enrichHeaders(headerEnricherSpec -> headerEnricherSpec.header(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON_VALUE))
            .handle(Http.outboundGateway(url1)
                    .httpMethod(HttpMethod.POST)
                    .expectedResponseType(String.class)
                    .requestFactory(requestFactory))
            .get();
}

@Bean("flow2")
public IntegrationFlow integrationFlow2() {
    return IntegrationFlows.from(jdbcMessageSource2(), p -> p.poller(pollerSpec2()))
            .split()
            .channel(c -> c.executor(Executors.newCachedThreadPool()))
            .transform(transformer2, "transform")
            .enrichHeaders(headerEnricherSpec -> headerEnricherSpec.header(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON_VALUE))
            .handle(Http.outboundGateway(url2)
                    .httpMethod(HttpMethod.POST)
                    .expectedResponseType(String.class)
                    .requestFactory(requestFactory))
            .get();
}
What you have so far is fully OK and legit.
If your flows don't share any common logic, then a similar structure is to be expected. On the other hand, even if they look similar at the moment, that doesn't mean that their logic (in one or both) won't change in the future. Of course, it is much safer to divide your business logic into separate microservices, but it isn't wrong to have several units of work in the same application.
You may need to pay attention to the fact that the shared ThreadPoolTaskScheduler in Spring Boot has only one thread by default. So, to let those polling flows run in parallel, you may want to increase the pool size (for example, via the spring.task.scheduling.pool.size property): https://docs.spring.io/spring-boot/docs/current/reference/html/features.html#features.task-execution-and-scheduling
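Why a single scheduler thread matters can be shown with a small plain-Java sketch. It does not use Spring's ThreadPoolTaskScheduler; a JDK ScheduledExecutorService stands in for it, and the two "pollers" are hypothetical tasks that only succeed if they overlap in time.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class SchedulerPoolSketch {

    // Returns true only if two "pollers" managed to run at the same time
    // on a scheduler with the given number of threads.
    static boolean pollersOverlap(int poolSize) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(poolSize);
        CountDownLatch bothStarted = new CountDownLatch(2);
        Callable<Boolean> poller = () -> {
            bothStarted.countDown();
            // Wait (up to one second) for the other poller to start as well.
            return bothStarted.await(1, TimeUnit.SECONDS);
        };
        try {
            Future<Boolean> first = scheduler.schedule(poller, 0, TimeUnit.MILLISECONDS);
            Future<Boolean> second = scheduler.schedule(poller, 0, TimeUnit.MILLISECONDS);
            return first.get() && second.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("pool size 1 overlaps: " + pollersOverlap(1));
        System.out.println("pool size 2 overlaps: " + pollersOverlap(2));
    }
}
```

With one thread the second poller cannot start until the first has given up waiting, so the two never overlap; with two threads they can.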

Move file from inbound adapter after publish subscribe flow

I'm trying to implement the following flow:
1) files are read from an inbound adapter
2) they are sent to different flows using a publish-subscribe channel with a sequence applied
3) the file is moved after all the subscriber flows are done
This is the main flow:
return IntegrationFlows
        .from(Files.inboundAdapter(inboundOutDirectory)
                        .regexFilter(pattern)
                        .useWatchService(true)
                        .watchEvents(FileReadingMessageSource.WatchEventType.CREATE),
                e -> e.poller(Pollers.fixedDelay(period)
                        .taskExecutor(Executors.newFixedThreadPool(poolSize))
                        .maxMessagesPerPoll(maxMessagesPerPoll)))
        .publishSubscribeChannel(s -> s
                .applySequence(true)
                .subscribe(f -> f
                        .transform(Files.toStringTransformer())
                        .<String>handle((p, h) -> {
                            return "something";
                        })
                        .channel("consolidateFlow.input"))
                .subscribe(f -> f
                        .transform(Files.toStringTransformer())
                        .handle(Http.outboundGateway(testUri)
                                .httpMethod(HttpMethod.GET)
                                .uriVariable("text", "payload")
                                .expectedResponseType(String.class))
                        .<String>handle((p, h) -> {
                            return "something";
                        })
                        .channel("consolidateFlow.input")))
        .get();
And the aggregation:
public IntegrationFlow consolidateFlow() {
    return flow -> flow
            .aggregate()
            .<List<String>>handle((p, h) -> "something")
            .log();
}
Using the following code in the main flow after the publish-subscribe
.handle(Files.outboundGateway(this.inboundProcessedDirectory).deleteSourceFiles(true))
ends up with
Caused by: org.springframework.messaging.core.DestinationResolutionException: no output-channel or replyChannel header available
If I go with this instead, the consolidation/aggregation flow is not reached at all:
.handle(Files.outboundAdapter(this.inboundProcessedDirectory))
Any idea how I could solve this? Currently I'm moving the file after the aggregation by reading the original file name from the header, but that doesn't seem to be the right solution.
I was also thinking about applying a spec/advice to the inbound adapter with success logic to move the file, but I'm not sure whether that's the right approach.
EDIT1
As suggested by Artem, I've added another subscriber to the publish-subscribe as follows:
...
.channel("consolidateNlpFlow.input"))
.subscribe(f -> f
        .handle(Files.outboundAdapter(this.inboundProcessedDirectory).deleteSourceFiles(true))
...
The file is moved properly, but the consolidateFlow is not executed at all. Any idea?
I've also tried adding .channel("consolidateNlpFlow.input") to the new flow, but it didn't change the behavior.
Your problem is that consolidateFlow is not able to return a result to the main flow, simply because there is nothing gateway-like there. You use an explicit .channel("consolidateFlow.input") there, which means there is no way back.
That's for the issue you have so far.
Regarding a possible solution:
According to your configuration, both of your subscribers in the publishSubscribeChannel are executed on the same thread, one by one. So it is very easy for you to add one more subscriber with that Files.outboundAdapter() and deleteSourceFiles(true). This one is going to be called only after the existing subscribers.
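The ordering argument can be sketched in plain Java, without Spring. In this sketch the publish method is a hypothetical stand-in for a publish-subscribe channel that calls its subscribers one by one on the caller's thread, and the temporary directories and file name are made up for the illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class OrderedSubscribersSketch {

    // Calls every subscriber in registration order on the caller's thread.
    static <T> void publish(T payload, List<Consumer<T>> subscribers) {
        for (Consumer<T> subscriber : subscribers) {
            subscriber.accept(payload);
        }
    }

    static String read(Path file) {
        try {
            return Files.readString(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static void move(Path file, Path targetDir) {
        try {
            Files.move(file, targetDir.resolve(file.getFileName()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static List<String> run() {
        try {
            Path inbound = Files.createTempDirectory("inbound");
            Path processed = Files.createTempDirectory("processed");
            Path file = Files.writeString(inbound.resolve("data.txt"), "hello");
            List<String> log = new ArrayList<>();
            List<Consumer<Path>> subscribers = List.of(
                    p -> log.add("first read " + read(p)),
                    p -> log.add("second read " + read(p)),
                    // Registered last, so the file is still in place for the readers above.
                    p -> {
                        move(p, processed);
                        log.add("moved " + Files.exists(processed.resolve("data.txt")));
                    });
            publish(file, subscribers);
            return log;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Because the move is registered last and everything runs on one thread, both readers see the file before it disappears from the inbound directory.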

spring-integration: how to deliver deferred details as SSE

I have a list of items which I want to retrieve and return as fast as possible.
For each item I also need to retrieve details, they may be returned a few seconds later.
I could of course create two different routes with HTTP gateways and request first the list, then the details. However, I then have to wait until all details have arrived. I want to send back the list immediately and then the details as soon as I get them.
UPDATE
Following Artem Bilan's advice my flow returns a Flux as payload which merges the list of items as a Mono and the processed items as a Flux.
Note that the example below simulates detail processing of the items by calling toUpperCase; my real use case requires routing and outgoing calls to get the details for each item:
@Bean
public IntegrationFlow sseFlow() {
    return IntegrationFlows
            .from(WebFlux.inboundGateway("/strings/sse")
                    .requestMapping(m -> m.produces(MediaType.TEXT_EVENT_STREAM_VALUE))
                    .mappedResponseHeaders("*"))
            .enrichHeaders(Collections.singletonMap("aHeader", new String[]{"foo", "bar"}))
            .transform("headers.aHeader")
            .<String[]>handle((p, h) -> {
                return Flux.merge(
                        Mono.just(p),
                        Flux.fromArray(p)
                                .map(t -> {
                                    return t.toUpperCase();
                                    // return detailsResolver.resolveDetail(t);
                                }));
            })
            .get();
}
That comes closer to my goal. When I request data from this flow using curl, I get the list of items immediately and the processed items slightly later:
λ curl http://localhost:8080/strings/sse
data:["foo","bar"]
data:FOO
data:BAR
While simply converting the string to uppercase works fine, I have difficulty making an outgoing call for the details using WebFlux.outboundGateway. The detailsResolver in the commented-out code above is defined as follows:
@MessagingGateway
public interface DetailsResolver {

    @Gateway(requestChannel = "itemDetailsFlow.input")
    Object resolveDetail(String item);
}

@Bean
IntegrationFlow itemDetailsFlow() {
    return f -> f.handle(WebFlux.<String>outboundGateway(m ->
                    UriComponentsBuilder.fromUriString("http://localhost:3003/rest/path/")
                            .path(m.getPayload())
                            .build()
                            .toUri())
            .httpMethod(HttpMethod.GET)
            .expectedResponseType(JsonNode.class)
            .replyPayloadToFlux(false));
}
When I uncomment the detailsResolver call and comment out t.toUpperCase(), the outboundGateway seems to be set up properly (the log says Subscriber present, Demand signaled) but never gets a response (it doesn't reach a breakpoint in ExchangeFunctions.exchange#91).
I have made sure that the DetailsResolver itself works by getting it as a bean from the context and invoking its method; that gives me a JsonNode response.
What could be the reason?
Yes, I wouldn't use toReactivePublisher() there, because you are in the context of the current request: you need fluxes per request. I would use something like Flux.merge(Publisher<? extends I>... sources), where the first Flux is for the items and the second is for the details per item (something like Tuple2).
For this purpose you really can use something like this:
IntegrationFlows
        .from(WebFlux.inboundGateway("/sse")
                .requestMapping(m -> m.produces(MediaType.TEXT_EVENT_STREAM_VALUE)))
And your downstream flow should produce a Flux as the payload for the reply.
I have a sample like this in test cases:
@Bean
public IntegrationFlow sseFlow() {
    return IntegrationFlows
            .from(WebFlux.inboundGateway("/sse")
                    .requestMapping(m -> m.produces(MediaType.TEXT_EVENT_STREAM_VALUE))
                    .mappedResponseHeaders("*"))
            .enrichHeaders(Collections.singletonMap("aHeader", new String[] { "foo", "bar", "baz" }))
            .handle((p, h) -> Flux.fromArray((String[]) h.get("aHeader")))
            .get();
}

Using filter with a discard channel in Spring Integration DSL

I don't know if this question is about spring-integration, spring-integration-dsl or both, so I just added the 2 tags...
I spent a considerable amount of time today, first building a simple flow with a filter:
StandardIntegrationFlow flow = IntegrationFlows.from(...)
        .filter(messagingFilter)
        .transform(transformer)
        .handle((m) -> {
            (...)
        })
        .get();
The messagingFilter is a very simple implementation of a MessageSelector. So far so good, not much time spent. But then I wanted to log a message in case the MessageSelector returned false, and this is where I got stuck.
After quite some time I ended up with this:
StandardIntegrationFlow flow = IntegrationFlows.from(...)
        .filter(messagingFilters, fs -> fs.discardFlow(i -> i.channel(discardChannel())))
        .transform(transformer)
        .handle((m) -> {
            (...)
        })
        .get();
(...)
public MessageChannel discardChannel() {
    MessageChannel channel = new MessageChannel() {
        @Override
        public boolean send(Message<?> message) {
            log.warn((String) message.getPayload().get("msg-failure"));
            return true;
        }

        @Override
        public boolean send(Message<?> message, long timeout) {
            return this.send(message);
        }
    };
    return channel;
}
This is both ugly and verbose, so the question is: what have I done wrong here, and how could I have done it in a better, cleaner, more elegant way?
Cheers.
Your problem is that you don't see that Filter is an EI pattern implementation, and the most it can do is send the discarded message to some channel. It isn't going to log anything, because that approach would no longer be messaging-based.
The simplest way to cover your use case is something like:
.discardFlow(df -> df
        .handle(message -> log.warn((String) message.getPayload().get("msg-failure")))))
That is your logic, which just logs. Other people might need more complicated logic. So, eventually you'll get used to the channel abstraction between endpoints.
I agree that the new MessageChannel() {} approach is wrong. The logging should indeed be done in a MessageHandler instead; that is the level of service responsibility. Also, don't forget that there is a LoggingHandler, which can be used via the Java DSL like this:
.filter(messagingFilters, fs -> fs.discardFlow( i -> i.log(message -> (String) message.getPayload().get("msg-failure"))))
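The division of responsibility can be sketched in plain Java, without Spring. Here the filter method is a hypothetical stand-in for the Filter endpoint: it only decides and forwards, while whatever receives the discarded message does the logging. All names and sample messages are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

final class DiscardSketch {

    // Passes accepted messages on and hands rejected ones to the discard handler.
    // The filter itself never logs; it only routes.
    static <T> List<T> filter(List<T> messages, Predicate<T> selector, Consumer<T> discardHandler) {
        List<T> accepted = new ArrayList<>();
        for (T message : messages) {
            if (selector.test(message)) {
                accepted.add(message);
            } else {
                discardHandler.accept(message);
            }
        }
        return accepted;
    }

    // The "handler" at the end of the discard path collects warnings here
    // instead of writing to a real logger.
    static List<String> demo(List<String> warnings) {
        List<String> messages = List.of("ok-1", "bad-2", "ok-3");
        Predicate<String> selector = m -> m.startsWith("ok");
        Consumer<String> discardHandler = m -> warnings.add("discarded: " + m);
        return filter(messages, selector, discardHandler);
    }

    public static void main(String[] args) {
        List<String> warnings = new ArrayList<>();
        System.out.println(demo(warnings) + " " + warnings);
    }
}
```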

Enriching in parallel after a split

This is a continuation of the shopping cart sample, where we have an external API that allows checkout from a shopping cart. To recap, we have a flow where we create an empty shopping cart, add line item(s) and finally check out. All of the operations above happen as enrichments through HTTP calls to an external service. We would like to add line items concurrently (as part of the add line items call). Our current configuration looks like this:
@Bean
public IntegrationFlow fullCheckoutFlow() {
    return f -> f.channel("inputChannel")
            .transform(fromJson(ShoppingCart.class))
            .enrich(e -> e.requestChannel(SHOPPING_CART_CHANNEL))
            .split(ShoppingCart.class, ShoppingCart::getLineItems)
            .enrich(e -> e.requestChannel(ADD_LINE_ITEM_CHANNEL))
            .aggregate(aggregator -> aggregator
                    .outputProcessor(g -> g.getMessages()
                            .stream()
                            .map(m -> (LineItem) m.getPayload())
                            .map(LineItem::getName)
                            .collect(joining(", "))))
            .enrich(e -> e.requestChannel(CHECKOUT_CHANNEL))
            .<String>handle((p, h) -> Message.called("We have " + p + " line items!!"));
}

@Bean
public IntegrationFlow addLineItem(Executor executor) {
    return f -> f.channel(MessageChannels.executor(ADD_LINE_ITEM_CHANNEL, executor).get())
            .handle(outboundGateway("http://localhost:8080/api/add-line-item", restTemplate())
                    .httpMethod(POST)
                    .expectedResponseType(String.class));
}

@Bean
public Executor executor(Tracer tracer, TraceKeys traceKeys, SpanNamer spanNamer) {
    return new TraceableExecutorService(newFixedThreadPool(10), tracer, traceKeys, spanNamer);
}
To add line items in parallel, we are using an executor channel. However, they still seem to be processed sequentially when viewed in Zipkin.
What are we doing wrong? The source for the whole project is on GitHub for reference.
Thanks!
First of all, the main feature of Spring Integration is the MessageChannel, and it still isn't clear to me why people leave out the .channel() operator between endpoint definitions.
I mean that for your case it must be like:
.split(ShoppingCart.class, ShoppingCart::getLineItems)
.channel(c -> c.executor(executor()))
.enrich(e -> e.requestChannel(ADD_LINE_ITEM_CHANNEL))
Now about your particular problem.
Look, ContentEnricher (.enrich()) is a request-reply component: http://docs.spring.io/spring-integration/reference/html/messaging-transformation-chapter.html#payload-enricher.
Therefore it sends a request to its requestChannel and waits for the reply. It does so regardless of the requestChannel type.
In raw Java we can demonstrate this behavior with the following snippet:
for (Object item : items) {
    Data data = sendAndReceive(item);
}
where you can see that having ADD_LINE_ITEM_CHANNEL as an ExecutorChannel doesn't add much value, because we are blocked within the loop waiting for the reply anyway.
A .split() performs exactly the same kind of loop, and since by default it uses a DirectChannel, the iteration is done in the same thread. Therefore each next item waits for the reply to the previous one.
That's why you should definitely parallelize exactly at the input of .enrich(), just after .split().
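The contrast can be sketched in plain Java, without Spring. This is an illustration only: sendAndReceive is a hypothetical blocking request-reply call, the sequential method mimics the loop over a DirectChannel, and the parallel method mimics handing each split item to an executor before the blocking call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelEnrichSketch {

    // Hypothetical blocking request-reply call; the reply records the serving thread.
    static String sendAndReceive(String item) {
        return item + "@" + Thread.currentThread().getName();
    }

    // One blocking call after another on the caller's thread.
    static List<String> sequential(List<String> items) {
        List<String> replies = new ArrayList<>();
        for (String item : items) {
            replies.add(sendAndReceive(item));
        }
        return replies;
    }

    // Each item is handed to the executor first, so the blocking calls
    // no longer wait for each other; replies are collected in item order.
    static List<String> parallel(List<String> items, int threads) {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String item : items) {
                Callable<String> task = () -> sendAndReceive(item);
                futures.add(executor.submit(task));
            }
            List<String> replies = new ArrayList<>();
            for (Future<String> future : futures) {
                replies.add(future.get());
            }
            return replies;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> items = List.of("a", "b", "c");
        System.out.println(sequential(items));
        System.out.println(parallel(items, 3));
    }
}
```

In the sequential variant every reply carries the caller's thread name; in the parallel variant the replies come from pool threads, which is the effect the extra .channel(c -> c.executor(...)) after .split() is meant to achieve.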

Resources