My question is very similar to this one: @Async prevent a thread to continue until other thread have finished
Basically I need to run hundreds of computations on multiple threads, but I want to cap the number of threads running in parallel, e.g. 5 threads processing 5 computations at a time.
I am using the Spring Framework, and the @Async option is the natural choice. I do not need a full-featured JMS queue; that's a bit of an overhead for me.
Any ideas?
Thank you
If you are using Spring's Java configuration, your config class needs to implement AsyncConfigurer:
@Configuration
@EnableAsync
public class AppConfig implements AsyncConfigurer {

    [...]

    @Override
    public Executor getAsyncExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(2);
        executor.setMaxPoolSize(5);
        executor.setQueueCapacity(50);
        executor.setThreadNamePrefix("MyExecutor-");
        executor.initialize();
        return executor;
    }
}
See the @EnableAsync documentation for more details: http://docs.spring.io/spring/docs/3.1.x/javadoc-api/org/springframework/scheduling/annotation/EnableAsync.html
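To illustrate how this config is consumed, here is a minimal sketch of an @Async method that would then run on that executor (the service name and method are made up for the example):

import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;

@Service
public class ComputationService {

    // Each call returns immediately and the body runs on "MyExecutor-" threads.
    // With maxPoolSize = 5, at most 5 computations execute in parallel;
    // further submissions wait in the queue (queueCapacity = 50).
    @Async
    public void compute(int input) {
        // heavy computation here ...
    }
}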
Have you checked out TaskExecutor? You can define a thread pool with a maximum number of threads to execute your tasks.
If you want to use it with @Async, use this in your Spring config:
<task:annotation-driven executor="myExecutor" scheduler="myScheduler"/>
<task:executor id="myExecutor" pool-size="5"/>
<task:scheduler id="myScheduler" pool-size="10"/>
Full reference in the Spring documentation (section 25.5.3). Hope this helps.
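If you also want to submit tasks programmatically, the same executor can be injected as a Spring TaskExecutor; a minimal sketch, assuming the bean ids from the XML above (the ComputationSubmitter class is made up for the example):

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.core.task.TaskExecutor;
import org.springframework.stereotype.Component;

@Component
public class ComputationSubmitter {

    @Autowired
    @Qualifier("myExecutor")
    private TaskExecutor taskExecutor;

    public void submitAll(Runnable... computations) {
        // At most 5 tasks run concurrently (pool-size="5");
        // the rest are queued by the executor.
        for (Runnable computation : computations) {
            taskExecutor.execute(computation);
        }
    }
}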
Since Spring Boot 2.1 you can use auto-configuration and change the maximum number of threads in the application properties file:
spring.task.execution.pool.max-size=4
See the full documentation:
https://docs.spring.io/spring-boot/docs/current/reference/htmlsingle/#boot-features-task-execution-scheduling
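The auto-configured executor exposes a few related properties as well; example values below (the property names are from the same Spring Boot task execution documentation):

spring.task.execution.pool.core-size=2
spring.task.execution.pool.max-size=4
spring.task.execution.pool.queue-capacity=100
spring.task.execution.thread-name-prefix=task-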
I am creating an application which requires multiple processes to run in parallel. The number of processes to run is dynamic; it depends on the input received.
E.g., if the user wants information about three different things [car, bike, auto], then I need three separate threads, each running in parallel.
numberOfThreadsNeeded = getNumberOfThingsFromInput();
ExecutorService executor = Executors.newFixedThreadPool(numberOfThreadsNeeded);
Code Snippet:
public class ConsoleController {

    private static final Log LOG = LogFactory.getLog(ConsoleController.class);

    @Autowired
    ConsoleCache consoleCache;

    Metrics metrics;

    public List<Feature> getConsoleData(List<String> featureIds, Map<String, Object> input, Metrics metrics) {
        this.metrics = metrics;
        List<FeatureModel> featureModels =
                featureIds
                        .stream()
                        .map(consoleCache::getFeature)
                        .collect(toList());
        Integer numberOfThreadsNeeded = getThreadCount(featureModels);
        ExecutorService executor = Executors.newFixedThreadPool(numberOfThreadsNeeded);
        featureModels.stream()
                .map(f -> (Callable<Result>) () -> f.fetchData(input, metrics))
                .map(executor::submit)
                .collect(toList());
        // ...
    }
}
The number of threads to be created varies from 1 to 100. Is it safe to define the thread pool size during initialization?
Also, is it safe to run 100 threads in parallel?
There is no hard limit in Java itself, but there may be a limit in, for example, the JVM implementation or the operating system. So, practically speaking, there is a limit. There is also a point where adding more threads makes performance worse, not better, and there is the possibility of running out of memory.
The way you use ExecutorService is not the way it was intended to be used. Normally you would create a single ExecutorService with the thread limit best suited for your environment.
Keep in mind that even if you really want all your tasks executed in parallel, you won't be able to achieve that anyway, given the hardware and software concurrency limitations.
BTW, if you still want to create an ExecutorService per request, don't forget to call its shutdown() method; otherwise the JVM won't be able to exit gracefully, as there will be threads still hanging around.
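A minimal sketch of that intended usage, one shared bounded pool reused for all requests (the pool size of 10 and the class name are arbitrary for the example):

import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SharedPoolExample {

    // One pool for the whole application, sized for the environment,
    // not one pool per request.
    private static final ExecutorService EXECUTOR = Executors.newFixedThreadPool(10);

    public static List<Future<String>> submitAll(List<Callable<String>> tasks) throws InterruptedException {
        // invokeAll queues everything; at most 10 tasks run concurrently.
        return EXECUTOR.invokeAll(tasks);
    }

    public static void shutdown() {
        // Call once on application shutdown so the JVM can exit.
        EXECUTOR.shutdown();
    }
}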
I have jBPM 5.4 and I'm seeing that the amount of time it takes for jBPM on WildFly to burn through a bulk dump of workflows asynchronously is the same no matter what I change in the thread pool size in standalone.xml.
I'm afraid jBPM does this via a fixed pool size. Can anyone confirm or deny this?
Disclaimer: I have not tried this recently; this is from recollection of an old project (where 6.0 was on the horizon, discussed but not used) and from refreshing my memory against the docs. Also, I don't expect there is anything special about "workflows" here; the same principles should apply.
jBPM's engine is single-threaded:
We've chosen to implement logical multi-threading using one thread: a jBPM process that includes logical multi-threading will only be executed in one technical thread.
For async tasks in v5 you have to handle the threading yourself, as shown in this example from the docs:
public class MyServiceTaskHandler implements WorkItemHandler {

    public void executeWorkItem(WorkItem workItem, WorkItemManager manager) {
        new Thread(new Runnable() {
            public void run() {
                // Do the heavy lifting here ...
            }
        }).start();
    }

    public void abortWorkItem(WorkItem workItem, WorkItemManager manager) {
    }
}
My understanding is that if you don't do that, your async tasks are async in name only. And if you do, you have no control over the level of concurrency. So that's a terrible example; they should at least show how to use an ExecutorService or something reasonable.
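For illustration, a bounded-pool variant of that handler might look like the sketch below. This is my own adaptation, not from the jBPM docs; the pool size is arbitrary, and the completeWorkItem call assumes the work item should be completed once the heavy lifting is done (jBPM work item API imports omitted, as in the original example):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class MyServiceTaskHandler implements WorkItemHandler {

    // Bounded pool: at most 5 work items execute concurrently.
    private final ExecutorService executor = Executors.newFixedThreadPool(5);

    public void executeWorkItem(final WorkItem workItem, final WorkItemManager manager) {
        executor.submit(new Runnable() {
            public void run() {
                // Do the heavy lifting here ...
                // then signal completion so the process can continue:
                manager.completeWorkItem(workItem.getId(), null);
            }
        });
    }

    public void abortWorkItem(WorkItem workItem, WorkItemManager manager) {
    }
}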
Anyway, version 6 still has a single-threaded core engine, but it offers its own executor for async workloads:
In version 6, jBPM introduces new component called jbpm executor which provides quite advanced features for asynchronous execution. It delivers generic environment for background execution of commands.
Its internal thread pool can be configured with the system property org.kie.executor.pool.size (mentioned at the bottom of the page linked above).
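For example, passed as a JVM argument at startup (the value 10 and the jar name are placeholders):

java -Dorg.kie.executor.pool.size=10 -jar your-app.jar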
This was fixed in jBPM 6: see https://issues.jboss.org/browse/JBPM-4275
I've already implemented remote chunking using AMQP (RabbitMQ). Now I need to run parallel jobs from within a web container.
My simple controller (testJob uses remote chunking):
@Controller
public class JobController {

    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    private Job testJob;

    @RequestMapping("/job/test")
    public void test() {
        JobParametersBuilder jobParametersBuilder = new JobParametersBuilder();
        jobParametersBuilder.addDate("date", new Date());
        try {
            jobLauncher.run(testJob, jobParametersBuilder.toJobParameters());
        } catch (JobExecutionAlreadyRunningException | JobRestartException | JobParametersInvalidException | JobInstanceAlreadyCompleteException e) {
            e.printStackTrace();
        }
    }
}
testJob reads data from the filesystem (master chunk) and sends it to the remote chunk (slave chunk). The problem is that the ItemReader is not thread-safe.
There are some practical limitations of using multi-threaded Steps for some common Batch use cases. Many participants in a Step (e.g. readers and writers) are stateful, and if the state is not segregated by thread, then those components are not usable in a multi-threaded Step. In particular most of the off-the-shelf readers and writers from Spring Batch are not designed for multi-threaded use. It is, however, possible to work with stateless or thread safe readers and writers, and there is a sample (parallelJob) in the Spring Batch Samples that show the use of a process indicator (see Section 6.12, “Preventing State Persistence”) to keep track of items that have been processed in a database input table.
I'm looking at the parallelJob sample in the Spring Batch GitHub repository:
https://github.com/spring-projects/spring-batch/blob/master/spring-batch-samples/src/main/java/org/springframework/batch/sample/common/StagingItemReader.java
I'm a bit confused about the process indicator pattern. Where can I find more detailed information about this pattern?
If all you're concerned with is the ItemReader instance being shared across job invocations, you can declare the ItemReader as step scoped; you'll then get a new instance per invocation, which removes the threading concerns.
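A minimal sketch of a step-scoped reader bean; a trivial ListItemReader stands in for the real reader here, and the config class and bean names are placeholders:

import java.util.Arrays;

import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.support.ListItemReader;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ReaderConfig {

    // @StepScope: a fresh reader instance is created for every step execution,
    // so no reader state is shared between concurrent job invocations.
    @Bean
    @StepScope
    public ItemReader<String> itemReader() {
        return new ListItemReader<>(Arrays.asList("a", "b", "c"));
    }
}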
But to answer your direct question about the process indicator pattern: I'm not sure where good documentation on it by itself is. There is a sample implementation in the Spring Batch samples (the parallelJob sample uses it).
The idea behind it is that you add a processing status to the records you are going to process. At the beginning of the job/step, you mark those records as in-process. As records are committed, you mark them as processed. This removes the need to track state in the reader, because the state actually lives in the database (your query only looks for records marked as in-process).
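A rough sketch of the idea with plain JdbcTemplate; the table and column names are made up for the example (the real sample keeps this state in a staging table):

import java.util.List;
import javax.sql.DataSource;
import org.springframework.jdbc.core.JdbcTemplate;

public class ProcessIndicatorSketch {

    private final JdbcTemplate jdbc;

    public ProcessIndicatorSketch(DataSource dataSource) {
        this.jdbc = new JdbcTemplate(dataSource);
    }

    // Step 1: before processing, claim the records for this run.
    public void markInProcess() {
        jdbc.update("UPDATE staging SET status = 'IN_PROCESS' WHERE status = 'NEW'");
    }

    // Step 2: the reader only ever sees claimed, unfinished records,
    // so it needs no in-memory state of its own.
    public List<Long> readPendingIds() {
        return jdbc.queryForList("SELECT id FROM staging WHERE status = 'IN_PROCESS'", Long.class);
    }

    // Step 3: as each record is committed, flip its indicator.
    public void markProcessed(long id) {
        jdbc.update("UPDATE staging SET status = 'PROCESSED' WHERE id = ?", id);
    }
}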
I am using Spring to manage threads in GlassFish, and below is the code I use. For some reason, hundreds of threads are getting created even though I have set the thread pool max count to 10.
final WorkManagerTaskExecutor taskExecutor = new WorkManagerTaskExecutor();
final QPRunable runnable = new QPRunable(); // this class implements Runnable
taskExecutor.setWorkManagerName("Workmanager1");
taskExecutor.setBlockUntilCompleted(false);
taskExecutor.afterPropertiesSet();
taskExecutor.execute(runnable);
Any suggestions on how to make the pool reuse its threads, and on why the thread count is increasing so much?
Thanks in advance.
Spring's docs read:
On JBoss and GlassFish, obtaining the default JCA WorkManager requires special lookup steps. See the JBossWorkManagerTaskExecutor and GlassFishWorkManagerTaskExecutor classes, which are the direct equivalent of this generic JCA adapter class.
Maybe that's the issue here?
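If that's the case, the fix would be to swap in the GlassFish-specific variant instead of the generic adapter. A sketch of the bean definition, untested here, with the class name taken from the Spring 3.x javadoc:

<bean id="taskExecutor"
      class="org.springframework.jca.work.glassfish.GlassFishWorkManagerTaskExecutor"/>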
I am using Spring Batch and I've created a tasklet that is run by a SimpleAsyncTaskExecutor. In this step, I am retrieving the StepExecution with:
@BeforeStep
public void saveStepExecution(StepExecution stepExecution) {
    this.stepExecution = stepExecution;
}
In the processing method of the tasklet, I try to update the context:
stepExecution.getExecutionContext().put("info", contextInfo);
This leads to ConcurrentModificationExceptions on the stepExecution.
How can I avoid these and update my context in this multi-threaded environment?
The step execution context is a shared resource. Are you really trying to put one "info" per thread? Depending on your context there are many ways to solve this, since it is a threading issue, not a Spring Batch one.
1) If there is one info per thread, have each thread put a ThreadLocal in the context (once), then use that ThreadLocal to store its "info".
2) If the context info is "global", do the put in a synchronized block and check for its existence before putting, as in the sketch below.
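A minimal sketch of option 2; the "info" key comes from the question, while the wrapper class and lock object are assumptions for the example:

import org.springframework.batch.item.ExecutionContext;

public class ContextUpdater {

    private final Object lock = new Object();
    private final ExecutionContext executionContext;

    public ContextUpdater(ExecutionContext executionContext) {
        this.executionContext = executionContext;
    }

    public void putInfoOnce(Object contextInfo) {
        // Serialize access so concurrent tasklet threads cannot
        // modify the underlying map at the same time.
        synchronized (lock) {
            if (!executionContext.containsKey("info")) {
                executionContext.put("info", contextInfo);
            }
        }
    }
}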
Hope this helps.