Using custom TaskExecutor in Spring Batch - multithreading

I want to set a Sync/Async TaskExecutor in Spring Batch. Is that possible?
I want to configure my step as follows:
<job id="myJob" xmlns="http://www.springframework.org/schema/batch">
<step id="step1">
<tasklet task-executor="MyTaskExecutor">
<chunk reader="ReaderFile" processor="ProcessFile" writer="WriterFile"
commit-interval="10" />
</tasklet>
</step>
</job>
Then create the bean "MyTaskExecutor" as follows:
<bean id="MyTaskExecutor" scope="step" class="batch.app.util.MyTaskExecutor"/>
Then in my class I configure the TaskExecutor (currently working as async):
package batch.app.util;

import org.springframework.core.task.SimpleAsyncTaskExecutor;
import org.springframework.core.task.TaskExecutor;

public class MyTaskExecutor extends SimpleAsyncTaskExecutor {

    public TaskExecutor taskExecutor() {
        return new SimpleAsyncTaskExecutor("spring_batch");
    }
}
I would like MyTaskExecutor to extend either SimpleAsyncTaskExecutor or SyncTaskExecutor depending on a condition... Or, if that is not possible, to be async, but before executing the step, check that condition and throw an error if the wrong TaskExecutor is executing the step.
I've been looking for a way to obtain the class of the TaskExecutor from the Reader (or the Processor or the Writer), but didn't find anything.
Thank you very much

You can use a condition inside your job config to pick the custom task executor. Below is a small snippet with annotation-driven bean creation for reference; you can use similar logic in your configuration approach as well.
The snippet has a condition on the TaskExecutor that is resolved at construction time, so we can create custom executors and attach them to the step config:
Job job = jobBuilderFactory.get("testJob").incrementer(new RunIdIncrementer())
        .start(testStep()).next(testStep1()).end()
        .listener(jobCompletionListener()).build();
@Bean
public Step testStep() {
    boolean sync = false;
    AbstractTaskletStepBuilder<SimpleStepBuilder<String, Test>> stepConfig = stepBuilderFactory
            .get("testStep").<String, Test>chunk(10)
            .reader(reader())
            .processor(processor())
            .writer(writer())
            .listener(testListener());
    if (sync) {
        stepConfig.taskExecutor(syncTaskExecutor());
    } else {
        stepConfig.taskExecutor(asyncTaskExecutor());
    }
    return stepConfig.build();
}
@Bean
public TaskExecutor asyncTaskExecutor() {
    SimpleAsyncTaskExecutor taskExecutor = new SimpleAsyncTaskExecutor();
    taskExecutor.setConcurrencyLimit(10);
    return taskExecutor;
}
// Similarly, other TaskExecutor can have its own bean config
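For instance, a minimal sketch of the matching synchronous bean, assuming Spring's built-in SyncTaskExecutor (which simply runs each task in the calling thread):
@Bean
public TaskExecutor syncTaskExecutor() {
    // Runs every task synchronously in the calling thread,
    // effectively making the step single-threaded
    return new SyncTaskExecutor();
}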

Related

Is it possible to apply bucketisation in Spring Boot (Tomcat)?

I exposed two APIs:
/endpoint/A and /endpoint/B.
#GetMapping("/endpoint/A")
public ResponseEntity<ResponseA> controllerA() throws InterruptedException {
ResponseA responseA = serviceA.responseClient();
return ResponseEntity.ok().body(responseA);
}
#GetMapping("/endpoint/B")
public ResponseEntity<ResponseA> controllerB() throws InterruptedException {
ResponseA responseB = serviceB.responseClient();
return ResponseEntity.ok().body(responseB);
}
The service behind endpoint A internally calls /endpoint/C, and the service behind endpoint B internally calls /endpoint/D.
Since the external service /endpoint/D takes more time, getting a response from /endpoint/A takes more time as well; hence all threads get stuck, which affects /endpoint/B.
I tried to solve this using an executor service with the following implementation:
#Bean(name = "serviceAExecutor")
public ThreadPoolTaskExecutor serviceAExecutor(){
ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
taskExecutor.setCorePoolSize(100);
taskExecutor.setMaxPoolSize(120);
taskExecutor.setQueueCapacity(50);
taskExecutor.setKeepAliveSeconds(120);
taskExecutor.setThreadNamePrefix("serviceAExecutor");
return taskExecutor;
}
Even after implementing this, if I receive more than 200 simultaneous requests on /endpoint/A (more than the default maximum number of threads in the Tomcat server), I get no responses from /endpoint/B, because all threads are busy getting responses for endpoint A or are waiting in the queue.
Can someone please suggest whether there is a way to apply bucketisation at the level of each exposed endpoint, allowing only a limited number of requests to be processed at a time and putting the rest into a bucket/queue, so that requests on other endpoints keep working?
Edit: the following is the solution approach.
#GetMapping("/endpoint/A")
public CompletableFuture<ResponseEntity<ResponseA>> controllerA() throws InterruptedException {
return CompletableFuture.supplyAsync(()->controllerHelperA());
}
#GetMapping("/endpoint/B")
public CompletableFuture<ResponseEntity<ResponseB>> controllerB() throws InterruptedException {
return CompletableFuture.supplyAsync(()->controllerHelperB());
}
private ResponseEntity<ResponseA> controllerHelperA(){
ResponseA responseA = serviceA.responseClient();
return ResponseEntity.ok().body(responseA);
}
private ResponseEntity<ResponseB> controllerHelperB(){
ResponseB responseB = serviceB.responseClient();
return ResponseEntity.ok().body(responseB);
}
Spring MVC supports the asynchronous servlet API introduced in Servlet 3.0. To make this easier, when your controller returns a Callable, CompletableFuture, or DeferredResult, the work will run in a background thread and free the request-handling thread for further processing.
#GetMapping("/endpoint/A")
public CompletableFuture<ResponseEntity<ResponseA>> controllerA() throws InterruptedException {
return () {
return controllerHelperA();
}
}
private ResponseEntity<ResponseA> controllerHelperA(){
ResponseA responseA = serviceA.responseClient();
return ResponseEntity.ok().body(responseA);
}
Now this will be executed in a background thread. Depending on your version of Spring Boot and whether you have configured your own TaskExecutor, it will either
use the SimpleAsyncTaskExecutor (which will issue a warning in your logs),
use the default ThreadPoolTaskExecutor provided by Spring Boot, which is configurable through the spring.task.execution namespace, or
use your own TaskExecutor, which requires additional configuration.
If you don't have a custom TaskExecutor defined and are on a relatively recent version of Spring Boot (2.1 or up, IIRC), you can use the following properties to configure the TaskExecutor:
spring.task.execution.pool.core-size=20
spring.task.execution.pool.max-size=120
spring.task.execution.pool.queue-capacity=50
spring.task.execution.pool.keep-alive=120s
spring.task.execution.thread-name-prefix=async-web-thread
Generally this executor will be used to run Spring MVC tasks in the background as well as regular @Async tasks.
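For example, a minimal sketch of a regular @Async method that would run on that same auto-configured executor (the ReportService class and generateReport method are made-up names for illustration):
@Configuration
@EnableAsync // required once for @Async methods to be honored
public class AsyncConfig {
}

@Service
public class ReportService {

    // Runs on Spring Boot's auto-configured TaskExecutor,
    // i.e. the pool tuned via the spring.task.execution.* properties above
    @Async
    public CompletableFuture<String> generateReport() {
        return CompletableFuture.completedFuture("report-done");
    }
}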
If you want to explicitly configure which TaskExecutor to use for your web processing, you can create a WebMvcConfigurer and implement the configureAsyncSupport method.
@Configuration
public class AsyncWebConfigurer implements WebMvcConfigurer {

    private final AsyncTaskExecutor taskExecutor;

    public AsyncWebConfigurer(AsyncTaskExecutor taskExecutor) {
        this.taskExecutor = taskExecutor;
    }

    @Override
    public void configureAsyncSupport(AsyncSupportConfigurer configurer) {
        configurer.setTaskExecutor(taskExecutor);
    }
}
You could use a @Qualifier on the constructor argument to specify which TaskExecutor you want to use.
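For instance, a sketch of the constructor from the configurer above with a qualifier added (the bean name webTaskExecutor is an assumption for illustration):
// Assumption: a TaskExecutor bean named "webTaskExecutor" exists among several executors
public AsyncWebConfigurer(@Qualifier("webTaskExecutor") AsyncTaskExecutor taskExecutor) {
    this.taskExecutor = taskExecutor;
}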

Basic Example using Spring Integration to copy file to another directory raising exception

I am a beginner in Spring Integration. I wrote the code below in a Spring Boot application, and it raises the exception "Bean named 'messageSource' is expected to be of type 'org.springframework.context.MessageSource' but was actually of type 'org.springframework.integration.file.FileReadingMessageSource'".
Code:
@Configuration
/* @EnableIntegration designates this class as a Spring Integration configuration. */
@EnableIntegration
public class SIConfig {

    // bean name defaults to the method name
    @Bean
    public MessageChannel channel() {
        return new DirectChannel();
    }

    @Bean
    public MessageSource messageSource() {
        FileReadingMessageSource ms = new FileReadingMessageSource();
        ms.setDirectory(new File("C:\\Users\\payal\\Pictures"));
        ms.setFilter(new SimplePatternFileListFilter("*.mp4"));
        return ms;
    }

    @Bean
    public MessageHandler handler() {
        FileWritingMessageHandler handler = new FileWritingMessageHandler(new File("C:\\Users\\payal\\Documents\\batch7"));
        handler.setFileExistsMode(FileExistsMode.IGNORE);
        handler.setExpectReply(false);
        return handler;
    }

    @Bean
    public IntegrationFlow flow() {
        return IntegrationFlows.from(messageSource(), configurer -> configurer.poller(Pollers.fixedDelay(10000)))
                .channel(channel())
                .handle(handler())
                .get();
    }
}
Since I am using Boot, versions are automatically managed.
I uploaded the code to GitHub too:
https://github.com/LearningNewTechnology/SpringIntegrationOne
Any help would be really appreciated.
Change the name of the bean to, e.g.
@Bean
public MessageSource myMessageSource() {
The Spring Framework (context) has another type called MessageSource, and Spring Boot autoconfiguration creates a bean of that type with the name messageSource, so your bean is colliding with it.
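A minimal sketch of the fix applied to the config above (only the bean name changes; the flow just calls the renamed method):
@Bean
public MessageSource myMessageSource() {
    // Same FileReadingMessageSource as before, registered under a name
    // that no longer collides with Spring's own "messageSource" bean
    FileReadingMessageSource ms = new FileReadingMessageSource();
    ms.setDirectory(new File("C:\\Users\\payal\\Pictures"));
    ms.setFilter(new SimplePatternFileListFilter("*.mp4"));
    return ms;
}

@Bean
public IntegrationFlow flow() {
    return IntegrationFlows.from(myMessageSource(), configurer -> configurer.poller(Pollers.fixedDelay(10000)))
            .channel(channel())
            .handle(handler())
            .get();
}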

Remote Partition: Why Steps not run in parallel?

I am using Spring Batch remote partitioning. My steps are not running in parallel; instead, the partitioned steps run sequentially. What is the root cause of the issue?
I'm relatively new to Spring Batch, but I had a similar problem when I first tried writing my own partitioned step.
In my case, the problem was my taskExecutor (which wasn't asynchronous).
I added a @Bean that initialized an async TaskExecutor and chained that to my partition step. Eureka, it worked.
Here is an example:
private Step partitionStep() throws SQLException {
    return stepBuilderFactory.get("example_partitionstep")
            .partitioner(step.getName(), columnRangePartitioner(partitionColumn, tableName))
            .partitionHandler(taskExecutorPartitionHandler(step))
            .build();
}
For the step:
private Step step() throws SQLException {
    return stepBuilderFactory.get("example_step")
            .<>chunk(1000)
            .reader(cursorItemReader(0L, 0L))
            .processor(compositeItemProcessor())
            .writer(itemWriter())
            .build();
}
And for the TaskExecutor:
@Bean
public TaskExecutor taskExecutor() {
    SimpleAsyncTaskExecutor taskExecutor = new SimpleAsyncTaskExecutor();
    taskExecutor.setConcurrencyLimit(6);
    return taskExecutor;
}

Exception Router using Annotation

I am trying to convert my code to Java annotations, but I am stuck with:
<int:exception-type-router input-channel="failed-email-fetch" default-output-channel="errorChannel">
    <int:mapping exception-type="com.XXXXXX.RateException" channel="customError" />
</int:exception-type-router>
If I use @Router, I don't know what to return, and this is what I used, but it did not work:
@ServiceActivator(inputChannel = "failedEmailFetch")
public ErrorMessageExceptionTypeRouter handleError(MessageHandlingException messageHandlingException) {
    ErrorMessageExceptionTypeRouter errorMessageExceptionTypeRouter = new ErrorMessageExceptionTypeRouter();
    errorMessageExceptionTypeRouter.setChannelMapping("com.XXXXXX.exception.MessageException", "customError");
    errorMessageExceptionTypeRouter.setDefaultOutputChannelName("errorChannel");
    return errorMessageExceptionTypeRouter;
}
You also need @Bean when the @ServiceActivator annotation is on a MessageHandler.
@ServiceActivator alone is for POJO messaging.
See Annotations on Beans.
Consuming endpoints have two beans, the handler and a consumer; the @ServiceActivator defines the consumer, and the @Bean is the handler.
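Applied to the router above, a minimal sketch of that combination (same mappings as the original attempt):
@Bean
@ServiceActivator(inputChannel = "failedEmailFetch")
public ErrorMessageExceptionTypeRouter handleError() {
    // The @Bean is the MessageHandler; the @ServiceActivator wires it
    // as the consumer on the failedEmailFetch channel
    ErrorMessageExceptionTypeRouter router = new ErrorMessageExceptionTypeRouter();
    router.setChannelMapping("com.XXXXXX.exception.MessageException", "customError");
    router.setDefaultOutputChannelName("errorChannel");
    return router;
}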
I ended up using the below; not sure if it is the best way:
@Router(inputChannel = "failedEmailFetch", defaultOutputChannel = "errorChannel")
public String handleError(Message<AggregateMessageDeliveryException> message) {
    log.info("{}", message.getPayload().getCause().getCause());
    if (message.getPayload().getRootCause() instanceof MessageException) {
        return "customError";
    } else {
        return "errorChannel";
    }
}

Spring Integration Cassandra persistence workflow

I am trying to realize the following workflow with Spring Integration:
1) Poll a REST API
2) Store the resulting POJO in a Cassandra cluster
It's my first try with Spring Integration, so I'm still a bit overwhelmed by the mass of information in the reference documentation. After some research, I could make the following work:
1) Poll the REST API
2) Transform the mapped POJO JSON result into a string
3) Save the string into a file
Here's the code:
@Configuration
public class ConsulIntegrationConfig {

    @InboundChannelAdapter(value = "consulHttp", poller = @Poller(maxMessagesPerPoll = "1", fixedDelay = "1000"))
    public String consulAgentPoller() {
        return "";
    }

    @Bean
    public MessageChannel consulHttp() {
        return MessageChannels.direct("consulHttp").get();
    }

    @Bean
    @ServiceActivator(inputChannel = "consulHttp")
    MessageHandler consulAgentHandler() {
        final HttpRequestExecutingMessageHandler handler =
                new HttpRequestExecutingMessageHandler("http://localhost:8500/v1/agent/self");
        handler.setExpectedResponseType(AgentSelfResult.class);
        handler.setOutputChannelName("consulAgentSelfChannel");
        LOG.info("Created bean 'consulAgentHandler'");
        return handler;
    }

    @Bean
    public MessageChannel consulAgentSelfChannel() {
        return MessageChannels.direct("consulAgentSelfChannel").get();
    }

    @Bean
    public MessageChannel consulAgentSelfFileChannel() {
        return MessageChannels.direct("consulAgentSelfFileChannel").get();
    }

    @Bean
    @ServiceActivator(inputChannel = "consulAgentSelfFileChannel")
    MessageHandler consulAgentFileHandler() {
        final Expression directoryExpression = new SpelExpressionParser().parseExpression("'./'");
        final FileWritingMessageHandler handler = new FileWritingMessageHandler(directoryExpression);
        handler.setFileNameGenerator(message -> "../../agent_self.txt");
        handler.setFileExistsMode(FileExistsMode.APPEND);
        handler.setCharset("UTF-8");
        handler.setExpectReply(false);
        return handler;
    }
}
@Component
public final class ConsulAgentTransformer {

    @Transformer(inputChannel = "consulAgentSelfChannel", outputChannel = "consulAgentSelfFileChannel")
    public String transform(final AgentSelfResult json) throws IOException {
        final String result = new StringBuilder(json.toString()).append("\n").toString();
        return result;
    }
}
This works fine!
But now, instead of writing the object to a file, I want to store it in a Cassandra cluster with spring-data-cassandra. For that, I commented out the file handler in the config file, returned the POJO from the transformer, and created the following:
@MessagingGateway(name = "consulCassandraGateway", defaultRequestChannel = "consulAgentSelfFileChannel")
public interface CassandraStorageService {

    @Gateway(requestChannel = "consulAgentSelfFileChannel")
    void store(AgentSelfResult agentSelfResult);
}

@Component
public final class CassandraStorageServiceImpl implements CassandraStorageService {

    @Override
    public void store(AgentSelfResult agentSelfResult) {
        // use spring-data-cassandra repository to store
        LOG.info("Received 'AgentSelfResult': {} in Cassandra cluster...");
        LOG.info("Trying to store 'AgentSelfResult' in Cassandra cluster...");
    }
}
But this seems to be the wrong approach; the service method is never triggered.
So my question is: what would be a correct approach for my use case? Do I have to implement the MessageHandler interface in my service component and use a @ServiceActivator in my config? Or is there something missing in my current gateway approach? Or maybe there is another solution that I'm not able to see...
As mentioned before, I'm new to SI, so this may be a stupid question...
Nevertheless, thanks a lot in advance!
It's not clear how you are wiring in your CassandraStorageService bean.
The Spring Integration Cassandra Extension Project has a message-handler implementation.
The Cassandra Sink in spring-cloud-stream-modules uses it with Java configuration, so you can use that as an example.
So I finally made it work. All I needed to do was:
@Component
public final class CassandraStorageServiceImpl implements CassandraStorageService {

    @ServiceActivator(inputChannel = "consulAgentSelfFileChannel")
    @Override
    public void store(AgentSelfResult agentSelfResult) {
        // use spring-data-cassandra repository to store
        LOG.info("Received 'AgentSelfResult': {}...");
        LOG.info("Trying to store 'AgentSelfResult' in Cassandra cluster...");
    }
}
The CassandraMessageHandler and spring-cloud-stream seemed like too big an overhead for my use case, and I didn't fully understand them yet... And with this solution, I keep control over what happens in my Spring component.
