Spring Integration: Framework Error Handling is Inconsistent - spring-integration

In my experience, the error-handling strategies of the various EIP components have little or no consistency.
Case 1: handle:
return IntegrationFlows.from(inputChannel)
        .enrichHeaders(spec -> spec.header(ERROR_CHANNEL, ARTIFACTORY_ERROR_CHANNEL, true))
        .handle(WebFlux.outboundGateway(uri, webClient)
                .expectedResponseType(ArtifactSearchResponse.class)
                .httpMethod(GET)
                .mappedRequestHeaders(ACCEPT))
        .log(LoggingHandler.Level.INFO, CLASS_NAME, Message::getPayload)
        .handle(transformer)
        .channel(outputChannel)
        .get();
In this case, if handle(transformer) throws an exception, the message is sent to the ARTIFACTORY_ERROR_CHANNEL as expected, but the exception is also propagated back to the caller. Thus, a test has to wrap the send in a try-catch to avoid failing.
try {
    inputChannel.send(new GenericMessage<>("start"));
} catch (Exception e) {
    // no-op
}
verify(mockMessageHandler, timeout.times(1)).handleMessage(any(ErrorMessage.class));
Case 2: transform:
Change handle(transformer) to transform(transformer) and the exception is never sent to ARTIFACTORY_ERROR_CHANNEL at all.
Case 3: Gateway:
public IntegrationFlow fileStreamingFlow() {
    return IntegrationFlows.from(inputChannel)
            .gateway(f -> f.handle(String.class, (fileName, headers) -> {
                throw new RuntimeException();
            }), spec -> spec.requiresReply(false).errorChannel(S3_ERROR_CHANNEL))
            .channel(outputChannel)
            .get();
}
In this case, the call blocks forever. See #2451.
Case 4: handle with routeByException:
return IntegrationFlows.from(s3Properties.getFileStreamingInputChannel())
        .enrichHeaders(spec -> spec.header(ERROR_CHANNEL, S3_ERROR_CHANNEL, true))
        .handle(String.class, (fileName, h) -> {
            return new ErrorMessage(new RuntimeException(), h);
        }, spec -> spec.requiresReply(false))
        .channel(outputChannel)
        .routeByException(r -> r.channelMapping(Exception.class, S3_ERROR_CHANNEL))
        .get();
In order for the exception to be sent to S3_ERROR_CHANNEL, I have to convert the exception to an ErrorMessage and also apply routeByException, even though an ERROR_CHANNEL header has already been configured.
What I expect: if the user defines an error channel, send all exceptions there. If the error handler associated with that channel returns null, terminate the flow; if it returns something else, continue. If the user doesn't define an error channel, send the exception to the framework's default error channel. Do this regardless of the flow definition.

transformer - if your transformer returns a Message<?>, it is responsible for propagating the errorChannel header.
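For illustration only, a transformer that returns a Message<?> and propagates the incoming headers (including errorChannel) could look roughly like this; the payload types and the convert(...) helper are assumptions, not part of the original flow:

GenericTransformer<Message<ArtifactSearchResponse>, Message<?>> transformer =
        message -> MessageBuilder
                // convert(...) is a hypothetical payload conversion
                .withPayload(convert(message.getPayload()))
                // copying the headers keeps errorChannel (and replyChannel) intact
                .copyHeaders(message.getHeaders())
                .build();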
When using a gateway, the error channel must be declared on the gateway itself rather than added later.
I don't understand what you are trying to do there.
In general, it's best not to manipulate framework headers in this way, but to declare the channel on the proper elements (gateways, pollers, etc.).
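As a sketch of what declaring the channel on the elements themselves can look like in the Java DSL (the errorChannel() bean reference is an assumption, reusing the constants from the flows above):

// on a gateway inside a flow (as in Case 3 above)
.gateway(subFlow, spec -> spec.errorChannel(ARTIFACTORY_ERROR_CHANNEL))

// on a poller for a polled endpoint, assuming an errorChannel() @Bean exists
Pollers.fixedDelay(5000).errorChannel(errorChannel())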

Related

How to handle errors after message has been handed off to QueueChannel?

I have 10 RabbitMQ queues, called event.q.0, event.q.2, <...>, event.q.9. Each of these queues receives messages routed from the event.consistent-hash exchange. I want to build a fault-tolerant solution that will consume messages for a specific event in a sequential manner, since ordering is important. For this I have set up a flow that listens to those queues and routes messages based on the event ID to a specific worker flow. Worker flows work based on queue channels, so that should guarantee FIFO order for an event with a specific ID. I have come up with the following setup:
@Bean
public IntegrationFlow eventConsumerFlow(RabbitTemplate rabbitTemplate, Advice retryAdvice) {
    return IntegrationFlows
            .from(
                    Amqp.inboundAdapter(new SimpleMessageListenerContainer(rabbitTemplate.getConnectionFactory()))
                            .configureContainer(c -> c
                                    .adviceChain(retryAdvice)
                                    .addQueueNames(queueNames)
                                    .prefetchCount(amqpProperties.getPreMatch().getDefinition().getQueues().getEvent().getPrefetch())
                            )
                            .messageConverter(rabbitTemplate.getMessageConverter())
            )
            .<Event, String>route(e -> String.format("worker-input-%d", e.getId() % numberOfWorkers))
            .get();
}
private Advice deadLetterAdvice() {
    return RetryInterceptorBuilder
            .stateless()
            .maxAttempts(3)
            .recoverer(recoverer())
            .backOffPolicy(backOffPolicy())
            .build();
}

private ExponentialBackOffPolicy backOffPolicy() {
    ExponentialBackOffPolicy backOffPolicy = new ExponentialBackOffPolicy();
    backOffPolicy.setInitialInterval(1000);
    backOffPolicy.setMultiplier(3.0);
    backOffPolicy.setMaxInterval(15000);
    return backOffPolicy;
}

private MessageRecoverer recoverer() {
    return new RepublishMessageRecoverer(
            rabbitTemplate,
            "error.exchange.dlx"
    );
}
@PostConstruct
public void init() {
    for (int i = 0; i < numberOfWorkers; i++) {
        flowContext.registration(workerFlow(MessageChannels.queue(String.format("worker-input-%d", i), queueCapacity).get()))
                .autoStartup(false)
                .id(String.format("worker-flow-%d", i))
                .register();
    }
}
private IntegrationFlow workerFlow(QueueChannel channel) {
    return IntegrationFlows
            .from(channel)
            .<Object, Class<?>>route(Object::getClass, m -> m
                    .resolutionRequired(true)
                    .defaultOutputToParentFlow()
                    .subFlowMapping(EventOne.class, s -> s.handle(oneHandler))
                    .subFlowMapping(EventTwo.class, s -> s.handle(anotherHandler))
            )
            .get();
}
Now, when, let's say, an error happens in eventConsumerFlow, the retry mechanism works as expected, but when an error happens in workerFlow, the retry doesn't work anymore and the message doesn't get sent to the dead-letter exchange. I assume this is because once a message is handed off to the QueueChannel, it gets acknowledged automatically. How can I make the retry mechanism work in workerFlow as well, so that if an exception happens there, it can retry a couple of times and send the message to the DLX when the retries are exhausted?
If you want resiliency, you shouldn't be using queue channels at all; a message is acknowledged as soon as it is put in the in-memory queue, so if the server crashes, those messages will be lost.
You should configure a separate adapter for each queue if you want no message loss.
That said, to answer the general question, any errors on downstream flows (including after a queue channel) will be sent to the errorChannel defined on the inbound adapter.
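As a rough sketch of that last point, the error channel can be declared directly on the inbound adapter (the channel name below is made up), and downstream failures are then delivered to it as ErrorMessages:

Amqp.inboundAdapter(new SimpleMessageListenerContainer(rabbitTemplate.getConnectionFactory()))
        .configureContainer(c -> c
                .adviceChain(retryAdvice)
                .addQueueNames(queueNames))
        .messageConverter(rabbitTemplate.getMessageConverter())
        // hypothetical channel name; exceptions thrown downstream end up here
        .errorChannel("eventErrorChannel")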

Spring Integration Java DSL: How to continue after error situation with the split and the aggregate methods?

My program does the following at a high level:
Task 1
    get the data from the System X
    the Java DSL split
    post the data to the System Y
    post the reply data to the X
    the Java DSL aggregate
Task 2
    get the data from the System X
    the Java DSL split
    post the data to the System Y
    post the reply data to the X
    the Java DSL aggregate
...
The problem is that when one "post the data to the System Y" sub-task fails, the error message is correctly sent back to the System X, but after that no other sub-tasks or tasks are executed.
My error handler does this:
...
Message<String> newMessage = MessageBuilder.withPayload("error occurred")
        .copyHeadersIfAbsent(message.getPayload().getFailedMessage().getHeaders())
        .build();
...
// set some extra headers, etc.
...
return newMessage;
What could be the problem?
Edit:
I debugged Spring Integration. In the error situation, only the first error message reaches the AbstractCorrelatingMessageHandler.handleMessageInternal method; the other successful and failing messages never reach it.
If there are no errors, all the messages reach the method and the group is finally released.
What could be wrong in my program?
Edit 2:
This is working:
Added the advice for the Http.outboundGateway:
.handle(Http.outboundGateway(...),
        c -> c.advice(myAdvice()))
and the myAdvice bean
@Bean
private Advice myAdvice() {
    return new MyAdvice();
}
and the MyAdvice class
public class MyAdvice<T> extends AbstractRequestHandlerAdvice {

    @SuppressWarnings("unchecked")
    @Override
    protected Object doInvoke(final ExecutionCallback callback, final Object target, final Message<?> message)
            throws Exception {
        // ...
        Object payload;
        try {
            MessageBuilder<T> result = (MessageBuilder<T>) callback.execute();
            payload = result.build().getPayload();
        } catch (final MessageHandlingException e) {
            // take the exception cause as the new payload
            payload = e.getCause();
        }
        // return a new message with the old headers (including replyChannel)
        // and either the result payload or the exception cause as the payload
        return MessageBuilder.withPayload(payload)
                .copyHeaders(message.getHeaders())
                .build();
    }
}
There is nothing wrong with your program. That's exactly how a regular loop works in Java: to catch an exception on each iteration and continue with the remaining items, you definitely need a try..catch inside the loop. Something similar needs to be applied here for the splitter. It can be achieved with an ExpressionEvaluatingRequestHandlerAdvice, with an ExecutorChannel as the output from the splitter, or with a gateway call via a service activator on the splitter's output channel.
Since the story is about an aggregator afterward, you still need to finish the group somehow, and that can be done only with some error compensation message emitted from the error handling flow back to the aggregator's input channel. In this case you need to make sure to copy the request headers from the failedMessage of the MessagingException thrown to the error flow. After the group is aggregated, you will need to separate the messages with errors from the normal ones. That can be done only with a special payload, or you may simply use the exception as the payload, so that errors can be properly distinguished from normal messages in the final result from the aggregator.
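For example (a sketch, not taken from the question; the bean name and the failure expression are assumptions), an ExpressionEvaluatingRequestHandlerAdvice on the "post to System Y" gateway can turn a failure into a compensation payload that continues on to the aggregator:

@Bean
public Advice systemYErrorAdvice() {
    ExpressionEvaluatingRequestHandlerAdvice advice = new ExpressionEvaluatingRequestHandlerAdvice();
    // evaluated when the handler throws; #exception holds the thrown exception
    advice.setOnFailureExpressionString("'error occurred: ' + #exception.cause.message");
    // return the expression result as the reply so the flow (and the aggregator) continues
    advice.setReturnFailureExpressionResult(true);
    advice.setTrapException(true);
    return advice;
}

// applied the same way as myAdvice above:
// .handle(Http.outboundGateway(...), c -> c.advice(systemYErrorAdvice()))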

How to handle errors from parallel web requests using Retrofit + RxJava?

I have a situation like this where I make some web requests in parallel. Sometimes I make these calls and all requests see the same error (e.g. no-network):
void main() {
    Observable.just("a", "b", "c")
            .flatMap(s -> makeNetworkRequest())
            .subscribe(
                    s -> {
                        // TODO
                    },
                    error -> {
                        // handle error
                    });
}

Observable<String> makeNetworkRequest() {
    return Observable.error(new NoNetworkException());
}

class NoNetworkException extends Exception {
}
Depending on the timing, if one request emits the NoNetworkException before the others can, Retrofit/RxJava will dispose/interrupt** the others. I'll see one of the following logs (not all three) for each request remaining in progress++:
<-- HTTP FAILED: java.io.IOException: Canceled
<-- HTTP FAILED: java.io.InterruptedIOException
<-- HTTP FAILED: java.io.InterruptedIOException: thread interrupted
I'll be able to handle the NoNetworkException error in the subscriber and everything downstream will get disposed of and all is OK.
However, based on timing, if two or more web requests emit NoNetworkException, then the first one will trigger the events above, disposing of everything downstream. The second NoNetworkException will have nowhere to go and I'll get the dreaded UndeliverableException. This is the same as example #1 documented here.
In the above article, the author suggested using an error handler. Obviously retry/retryWhen don't make sense if I expect to hear the same errors again. I don't understand how onErrorResumeNext/onErrorReturn help here, unless I map them to something recoverable to be handled downstream:
Observable.just("a", "b", "c")
        .flatMap(s ->
                makeNetworkRequest()
                        .onErrorReturn(error -> {
                            // eat actual error and return something else
                            return "recoverable error";
                        }))
        .subscribe(
                s -> {
                    if (s.equals("recoverable error")) {
                        // handle error
                    } else {
                        // TODO
                    }
                },
                error -> {
                    // handle error
                });
but this seems wonky.
I know another solution is to set a global error handler with RxJavaPlugins.setErrorHandler(). This doesn't seem like a great solution either. I may want to handle NoNetworkException differently in different parts of my app.
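For reference, such a global handler would look roughly like the pattern from the RxJava 2 error-handling docs; which exception types to swallow is an application decision, and the ones below are just an example:

RxJavaPlugins.setErrorHandler(e -> {
    if (e instanceof UndeliverableException) {
        e = e.getCause();
    }
    if (e instanceof IOException || e instanceof NoNetworkException) {
        // a late failure from a request whose chain was already disposed; ignore it here
        return;
    }
    // anything else is likely a real bug; surface it
    Thread.currentThread().getUncaughtExceptionHandler()
            .uncaughtException(Thread.currentThread(), e);
});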
So what other options do I have? What do other people do in this case? This must be pretty common.
** I don't fully understand who is interrupting/disposing of whom. Is RxJava disposing of all the other requests in flatMap, which in turn causes Retrofit to cancel the requests? Or does Retrofit cancel the requests, resulting in each request in flatMap emitting one of the above IOExceptions? I guess it doesn't really matter for answering the question, just curious.
++ It's possible that not all a, b, and c requests are in flight, depending on the thread pool.
Have you tried using flatMap() with delayErrors=true?
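If I understand the suggestion, that means using the flatMap overload that takes a delayErrors flag, so every inner request is allowed to terminate before any error is delivered (a sketch based on the code above; with several failures the subscriber may receive them wrapped in a CompositeException):

Observable.just("a", "b", "c")
        // delayErrors = true: errors are held back until all inner sources terminate,
        // so a late NoNetworkException has somewhere to go instead of becoming undeliverable
        .flatMap(s -> makeNetworkRequest(), true)
        .subscribe(
                s -> {
                    // TODO
                },
                error -> {
                    // a single error, or a CompositeException if several requests failed
                });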

How to use the await keyword inside a method without making the method async

I am developing a scheduled job to send a message to a message queue using Quartz.NET. The Execute method of IJob is not async, so I can't use async Task. But I want to call a method with the await keyword.
Please find my code below. I'm not sure whether I am doing this correctly. Can anyone please help me with this?
private async Task PublishToQueue(ChangeDetected changeDetected)
{
    _logProvider.Info("Publish to Queue started");
    try
    {
        await _busControl.Publish(changeDetected);
        _logProvider.Info($"ChangeDetected message published to RabbitMq. Message");
    }
    catch (Exception ex)
    {
        _logProvider.Error("Error publishing message to queue: ", ex);
        throw;
    }
}
public class ChangedNotificatonJob : IJob
{
    public void Execute(IJobExecutionContext context)
    {
        //Publish message to queue
        Policy
            .Handle<Exception>()
            .RetryAsync(3, (exception, count) =>
            {
                //Do something for each retry
            })
            .ExecuteAsync(async () =>
            {
                await PublishToQueue(message);
            });
    }
}
Is this the correct way? I have used .GetAwaiter():
Policy
    .Handle<Exception>()
    .RetryAsync(_configReader.RetryLimit, (exception, count) =>
    {
        //Do something for each retry
    })
    .ExecuteAsync(async () =>
    {
        await PublishToQueue(message);
    }).GetAwaiter()
Polly's .ExecuteAsync() returns a Task. With any Task, you can just call .Wait() on it (or other blocking methods) to block synchronously until it completes, or throws an exception.
As you have observed, since IJob.Execute(...) isn't async, you can't use await, so you have no choice but to block synchronously on the task, if you want to discover the success-or-otherwise of publishing before IJob.Execute(...) returns.
.Wait() will cause any exception from the task to be rethrown, wrapped in an AggregateException. This will occur if all Polly-orchestrated retries fail.
You'll need to decide what to do with that exception:
If you want the caller to handle it, rethrow it or don't catch it and let it cascade outside the Quartz job.
If you want to handle it before returning from IJob.Execute(...), you'll need a try {} catch {} around the whole .ExecuteAsync(...).Wait(). Or consider Polly's .ExecuteAndCaptureAsync(...) syntax: it avoids you having to provide that outer try-catch, by instead placing the final outcome of the execution into a PolicyResult instance. See the Polly doco.
There is a further alternative if your only intention is to log somewhere that message publishing failed, and you don't care whether that logging happens before IJob.Execute(...) returns or not. In that case, instead of using .Wait(), you could chain a continuation task on to ExecuteAsync() using .ContinueWith(...), and handle any logging in there. We adopt this approach, and capture failed message publishing to a special 'message hospital' - capturing enough information so that we can choose whether to republish that message again later, if appropriate. Whether this approach is valuable depends on how important it is to you never to lose a message.
EDIT: GetAwaiter() is irrelevant. It won't magically let you start using await inside a non-async method.

Fail early vs. robust methods

I've been wondering for years about the most sensible way to implement the following (it seems kind of paradoxical to me):
Imagine a function:
DoSomethingWith(value)
{
    if (value == null) { // Robust: Check parameter(s) first
        throw new ArgumentNullException(nameof(value));
    }
    // Some code ...
}
It's called like:
SomeFunction()
{
    if (value == null) { // Fail early
        InformUser();
        return;
    }
    DoSomethingWith(value);
}
But, to catch the ArgumentNullException, should I do:
SomeFunction()
{
    if (value == null) { // Fail early
        InformUser();
        return;
    }
    try { // If throwing an Exception, why not *not* check for it (even if you checked already)?
        DoSomethingWith(value);
    } catch (ArgumentNullException) {
        InformUser();
        return;
    }
}
or just:
SomeFunction()
{
    try { // No fail early anymore IMHO, because you could fail before calling DoSomethingWith(value)
        DoSomethingWith(value);
    } catch (ArgumentNullException) {
        InformUser();
        return;
    }
}
?
This is a very general question and the right solution depends on the specific code and architecture.
Generally regarding error handling
The main focus should be to catch the exception at the level where you can handle it.
Handling exceptions at the right place makes the code robust: the exception doesn't make the application fail, and it can be handled accordingly.
Failing early also makes the application robust, because it helps avoid inconsistent states.
This also means there should be a more general try-catch block at the root of the execution to catch any non-fatal application error that couldn't be handled elsewhere. Such an error often means you as a programmer didn't think of that error source; later you can extend your code to handle it as well. But the execution root shouldn't be the only place where you think about exception handling.
Your example
In your example regarding ArgumentNullException:
Yes, you should fail early: whenever your method is invoked with an invalid null argument, you should throw this exception.
But you should never catch this exception, because it should be possible to avoid it. See this post related to the topic: If catching null pointer exception is not a good practice, is catching exception a good one?
If you are working with user input or input from other systems, then you should validate the input. E.g. you can display a validation error on the UI after a null check, without throwing an exception. How to present issues to users is always a critical part of error handling, so define a proper strategy for your application. You should try to avoid throwing exceptions in the expected program execution flow. See this article: https://msdn.microsoft.com/en-us/library/ms173163.aspx
See general thoughts about exception handling below:
Handling exceptions in your method
If an exception is thrown in the DoSomethingWith method and you can handle it and continue the flow of execution without any issue, then of course you should do so.
This is a pseudo code example for retrying a database operation:
void DoSomethingAndRetry(value)
{
    try
    {
        SaveToDatabase(value);
    }
    catch(DeadlockException ex)
    {
        //deadlock happened, we are retrying the SQL statement
        SaveToDatabase(value);
    }
}
Letting the exception bubble up
Let's assume your method is public. If an exception happens which implies that the method failed and you can't continue execution, then the exception should bubble up so that the calling code can handle it accordingly. How the calling code reacts to the exception depends on the use case.
Before letting the exception bubble up, you may wrap it into another application-specific exception as an inner exception to add context information. You may also process the exception somehow, e.g. log it, or leave the logging to the calling code, depending on your logging architecture.
public bool SaveSomething(value)
{
    try
    {
        SaveToFile(value);
        return true;
    }
    catch(FileNotFoundException ex)
    {
        //process exception if needed, e.g. log it
        ProcessException(ex);
        //you may want to wrap this exception into another one to add context info
        throw WrapIntoNewExceptionWithSomeDetails(ex);
    }
}
Documenting possible exceptions
In .NET it is also helpful to document which exceptions your method throws and the reasons why, so that the calling code can take this into consideration. See https://msdn.microsoft.com/en-us/library/w1htk11d.aspx
Example:
/// <exception cref="System.Exception">Thrown when something happens.</exception>
DoSomethingWith(value)
{
    ...
}
Ignoring exceptions
For methods where you are OK with the method failing and don't want to add a try-catch block around it every time, you could create a method with a similar signature:
public bool TryDoSomethingWith(value)
{
    try
    {
        DoSomethingWith(value);
        return true;
    }
    catch(Exception ex)
    {
        //process exception if needed, e.g. log it
        ProcessException(ex);
        return false;
    }
}
