I use the async HTTP client (AsyncHttpClient) in my code to handle GET responses asynchronously.
I can run 100 requests simultaneously.
I use just one instance of httpClient in the container:
@Bean(destroyMethod = "close")
open fun httpClient() = Dsl.asyncHttpClient()
The code looks like this:
fun method(): CompletableFuture<String> {
return httpClient.prepareGet("someUrl").execute()
.toCompletableFuture()
.thenApply(::getResponseBody)
}
It works fine functionally. In my tests I use a mock endpoint with the same URL. I expected all the requests to be handled by a few threads, but in the profiler I can see that 16 threads are created for AsyncHttpClient, and they aren't destroyed even when there are no requests to send.
My expectations were that:
there would be fewer threads for an async client
threads would be destroyed after some configured timeout
Is there an option to control how many threads asyncHttpClient can create?
Am I missing something in my expectations?
UPDATE 1
I read the instructions at https://github.com/AsyncHttpClient/async-http-client/wiki/Connection-pooling
but found no information on the thread pool.
UPDATE 2
I also created a method that does the same, but with a handler and an additional executor pool.
The utility method looks like this:
fun <Value, Result> CompletableFuture<Value>.handleResultAsync(executor: Executor, initResultHandler: ResultHandler<Value, Result>.() -> Unit): CompletableFuture<Result> {
val rh = ResultHandler<Value, Result>()
rh.initResultHandler()
val handler = BiFunction { value: Value?, exception: Throwable? ->
if (exception == null) rh.success?.invoke(value) else rh.fail?.invoke(exception)
}
return handleAsync(handler, executor)
}
The updated method looks like this:
fun method(): CompletableFuture<String> {
return httpClient.prepareGet("someUrl").execute()
.toCompletableFuture()
.handleResultAsync(executor) {
success = {response ->
logger.info("ok")
getResponseBody(response!!)
}
fail = { ex ->
logger.error("Failed to execute request", ex)
throw ex
}
}
}
Then I can see that the result of the GET request is handled on threads provided by the thread pool (previously it was handled on "AsyncHttpClient-3-x" threads), but the additional AsyncHttpClient threads are still created and not destroyed.
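For reference, a minimal Java sketch of the effect described in UPDATE 2, using only java.util.concurrent rather than the AsyncHttpClient API. The two pools and their thread names ("io-thread", "callback-thread") are assumptions made for illustration: the first stands in for the client's own I/O threads, the second for the extra executor. Moving the callback to another executor changes where the result is handled, but it does not remove the pool that produced the result.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class CallbackThreadDemo {
    // Returns the name of the thread that ran the completion callback.
    static String callbackThread() throws Exception {
        // Hypothetical stand-in for the HTTP client's I/O threads.
        ExecutorService ioPool =
                Executors.newFixedThreadPool(2, r -> new Thread(r, "io-thread"));
        // The additional executor that should run the result handler.
        ExecutorService callbackPool =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "callback-thread"));
        try {
            return CompletableFuture
                    .supplyAsync(() -> "body", ioPool) // "response" produced on the I/O pool
                    .handleAsync((body, error) -> Thread.currentThread().getName(), callbackPool)
                    .get();
        } finally {
            ioPool.shutdown();
            callbackPool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(callbackThread()); // prints "callback-thread"
    }
}
```

The I/O pool here still has to exist and keeps its threads until it is shut down, which matches what the profiler shows for the client's own threads.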
AHC has two types of threads:
For I/O operations. On your screen, these are the AsyncHttpClient-x-x threads. AHC creates 2*core_number of those.
For timeouts. On your screen, this is the AsyncHttpClient-timer-1-1 thread. There should be only one.
Source: issue on GitHub: https://github.com/AsyncHttpClient/async-http-client/issues/1658
I have a Worker Role running in Azure.
This worker processes a queue that contains a large number of integers. For each integer I have to do fairly long processing (from 1 second to 10 minutes, depending on the integer).
As this is quite time-consuming, I would like to do this processing in parallel. Unfortunately, my parallelization does not seem to be efficient when I test with a queue of 400 integers.
Here is my implementation:
public class WorkerRole : RoleEntryPoint {
private readonly CancellationTokenSource cancellationTokenSource = new CancellationTokenSource();
private readonly ManualResetEvent runCompleteEvent = new ManualResetEvent(false);
private readonly Manager _manager = Manager.Instance;
private static readonly LogManager logger = LogManager.Instance;
public override void Run() {
logger.Info("Worker is running");
try {
this.RunAsync(this.cancellationTokenSource.Token).Wait();
}
catch (Exception e) {
logger.Error(e, 0, "Error Run Worker: " + e);
}
finally {
this.runCompleteEvent.Set();
}
}
public override bool OnStart() {
bool result = base.OnStart();
logger.Info("Worker has been started");
return result;
}
public override void OnStop() {
logger.Info("Worker is stopping");
this.cancellationTokenSource.Cancel();
this.runCompleteEvent.WaitOne();
base.OnStop();
logger.Info("Worker has stopped");
}
private async Task RunAsync(CancellationToken cancellationToken) {
while (!cancellationToken.IsCancellationRequested) {
try {
_manager.ProcessQueue();
}
catch (Exception e) {
logger.Error(e, 0, "Error RunAsync Worker: " + e);
}
}
await Task.Delay(1000, cancellationToken);
}
}
}
And the implementation of the ProcessQueue:
public void ProcessQueue() {
try {
_queue.FetchAttributes();
int? cachedMessageCount = _queue.ApproximateMessageCount;
if (cachedMessageCount != null && cachedMessageCount > 0) {
var listEntries = new List<CloudQueueMessage>();
listEntries.AddRange(_queue.GetMessages(MAX_ENTRIES));
Parallel.ForEach(listEntries, ProcessEntry);
}
}
catch (Exception e) {
logger.Error(e, 0, "Error ProcessQueue: " + e);
}
}
And ProcessEntry
private void ProcessEntry(CloudQueueMessage entry) {
try {
int id = Convert.ToInt32(entry.AsString);
Service.GetData(id);
_queue.DeleteMessage(entry);
}
catch (Exception e) {
_queueError.AddMessage(entry);
_queue.DeleteMessage(entry);
logger.Error(e, 0, "Error ProcessEntry: " + e);
}
}
In the ProcessQueue function, I tried different values of MAX_ENTRIES: first 20 and then 2.
It seems to be slower with MAX_ENTRIES=20, but whatever the value of MAX_ENTRIES is, it seems quite slow.
My VM is an A2 medium.
I really don't know whether I am doing the parallelization correctly; maybe the problem comes from the worker itself (which may be hard to run in parallel).
You haven't mentioned which Azure messaging/queuing technology you are using. However, for tasks where I want to process multiple messages in parallel, I tend to use the Message Pump pattern on Service Bus Queues and Subscriptions, leveraging the OnMessage() method available on both Service Bus Queue and Subscription clients:
QueueClient OnMessage() - https://msdn.microsoft.com/en-us/library/microsoft.servicebus.messaging.queueclient.onmessage.aspx
SubscriptionClient OnMessage() - https://msdn.microsoft.com/en-us/library/microsoft.servicebus.messaging.subscriptionclient.onmessage.aspx
An overview of how this stuff works :-) - http://fabriccontroller.net/blog/posts/introducing-the-event-driven-message-programming-model-for-the-windows-azure-service-bus/
From MSDN:
When calling OnMessage(), the client starts an internal message pump
that constantly polls the queue or subscription. This message pump
consists of an infinite loop that issues a Receive() call. If the call
times out, it issues the next Receive() call.
This pattern allows you to use a delegate (or anonymous function in my preferred case) that handles the receipt of the Brokered Message instance on a separate thread on the WaWorkerHost process. In fact, to increase the level of throughput, you can specify the number of threads that the Message Pump should provide, thereby allowing you to receive and process 2, 4, 8 messages from the queue in parallel. You can additionally tell the Message Pump to automagically mark the message as complete when the delegate has successfully finished processing the message. Both the thread count and AutoComplete instructions are passed in the OnMessageOptions parameter on the overloaded method.
public override void Run()
{
var onMessageOptions = new OnMessageOptions()
{
AutoComplete = true, // Message-Pump will call Complete on messages after the callback has completed processing.
MaxConcurrentCalls = 2 // Max number of threads the Message-Pump can spawn to process messages.
};
sbQueueClient.OnMessage((brokeredMessage) =>
{
// Process the Brokered Message Instance here
}, onMessageOptions);
RunAsync(_cancellationTokenSource.Token).Wait();
}
You can still leverage the RunAsync() method to perform additional tasks on the main Worker Role thread if required.
Finally, I would also recommend that you look at scaling your Worker Role instances out to a minimum of 2 (for fault tolerance and redundancy) to increase your overall throughput. From what I have seen with multiple production deployments of this pattern, OnMessage() performs perfectly when multiple Worker Role Instances are running.
A few things to consider here:
Are your individual tasks CPU intensive? If so, parallelism may not help. However, if they mostly wait on work done by other resources, parallelizing is a good idea.
If parallelizing is a good idea, consider not using Parallel.ForEach for queue processing. Parallel.ForEach has two issues that prevent it from being optimal:
The code will wait until all kicked-off threads finish processing before moving on. So, if you have 5 threads that need 10 seconds each and 1 thread that needs 10 minutes, the overall processing time for Parallel.ForEach will be 10 minutes.
Even though you are assuming that all of the threads will start processing at the same time, Parallel.ForEach does not work this way. It looks at the number of cores on your server and other parameters and generally only kicks off the number of threads it thinks it can handle, without knowing much about what's in those threads. So, if you have a lot of non-CPU-bound threads that /can/ be kicked off at the same time without causing CPU over-utilization, the default behaviour will likely not run them optimally.
How to do this optimally:
I am sure there are a ton of solutions out there, but for reference, the way we've architected it in CloudMonix (which must kick off hundreds of independent threads and complete them as fast as possible) is by using ThreadPool.QueueUserWorkItem and manually keeping track of the number of threads that are running.
Basically, we use a thread-safe collection to keep track of running threads that are started by ThreadPool.QueueUserWorkItem. Once threads complete, they are removed from that collection. The queue-monitoring loop is independent of the execution logic in that collection. The queue-monitoring logic gets messages from the queue only if the processing collection is not full up to the limit that you find optimal. If there is space in the collection, it tries to pick up more messages from the queue, adds them to the collection and kick-starts them via ThreadPool.QueueUserWorkItem. When processing completes, it kicks off a delegate that removes the thread from the collection.
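The scheme above can be sketched roughly as follows. This is an illustration in Java rather than C#, and it swaps the thread-safe collection for a Semaphore that counts free slots; it is not the CloudMonix code. The names BoundedDispatcher, dispatch, maxInFlight and pause are made up for the example. The monitoring loop only hands out another message when a slot is free, and each worker frees its slot when it finishes, so one slow message never holds up the others.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

class BoundedDispatcher {
    // Hands every message to the pool, never running more than maxInFlight at once.
    // Returns the highest number of handlers observed running at the same time.
    static <T> int dispatch(List<T> messages, int maxInFlight, Consumer<T> handler)
            throws InterruptedException {
        ExecutorService pool = Executors.newCachedThreadPool();
        Semaphore freeSlots = new Semaphore(maxInFlight);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (T message : messages) {
            freeSlots.acquire(); // the monitoring loop waits here until a slot is free
            pool.execute(() -> {
                peak.accumulateAndGet(running.incrementAndGet(), Math::max);
                try {
                    handler.accept(message);
                } finally {
                    running.decrementAndGet();
                    freeSlots.release(); // "remove from the collection"
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return peak.get();
    }

    // Helper for the demo: sleep without a checked exception.
    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        List<Integer> messages = new ArrayList<>();
        for (int i = 0; i < 20; i++) messages.add(i);
        int peak = dispatch(messages, 4, m -> pause(20));
        System.out.println("peak concurrency: " + peak);
    }
}
```

The limit (4 here) is something to tune by measurement; for work that mostly waits on other resources it can be well above the core count.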
Hope this helps and makes sense
Alright, brand new to gpars so please forgive me if this has an obvious answer.
Here is my scenario. We currently have a piece of our code wrapped in a Thread.start {} block. It does this so it can send messages to a message queue in the background and not block the user request. An issue we have recently run into is that, for large blocks of work, the user can perform another action which causes this block to execute again. As it is threaded, the second batch of messages can be sent before the first, causing corrupted data.
I would like to change this process to work as a queue flow with gpars. I've seen examples of creating pools such as
def pool = GParsPool.createPool()
or
def pool = new ForkJoinPool()
and then using the pool as
GParsPool.withExistingPool(pool) {
...
}
This seems like it would account for the case that if the user performs an action again, I could reuse the created pool and the actions would not be performed out of order, provided I have a pool size of one.
My question is: is this the best way to do this with gpars? Furthermore, how do I know when the pool has finished all of its work? Does it terminate when all the work is finished? If so, is there a method that can be used to check whether the pool has finished/terminated, so that I know I need a new one?
Any help would be appreciated.
No, explicitly created pools do not terminate by themselves. You have to call shutdown() on them explicitly.
Using the withPool() {} command, however, will guarantee that the pool is destroyed once the code block is finished.
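To illustrate the same lifecycle with the plain JDK (a sketch in Java, not GPars; the class and method names are made up): a pool with a single thread runs submitted work strictly in submission order, it keeps living until shutdown() is called, and awaitTermination()/isTerminated() tell you when the queued work is done.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class OrderedSender {
    // Submits two batches to a one-thread pool and returns the order in which
    // the "messages" were sent.
    static List<String> sendTwoBatches() throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        List<String> sent = new CopyOnWriteArrayList<>();
        for (String batch : new String[] {"first", "second"}) {
            pool.execute(() -> {
                for (int i = 1; i <= 3; i++) {
                    sent.add(batch + "-" + i); // stand-in for sending to the message queue
                }
            });
        }
        pool.shutdown(); // no new work accepted; already queued work still runs
        if (!pool.awaitTermination(10, TimeUnit.SECONDS) || !pool.isTerminated()) {
            throw new IllegalStateException("pool did not finish in time");
        }
        return sent;
    }

    public static void main(String[] args) throws InterruptedException {
        // prints [first-1, first-2, first-3, second-1, second-2, second-3]
        System.out.println(sendTwoBatches());
    }
}
```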
Here is the current solution we have to our issue. It should be noted that we followed this route due to our requirements:
Work is grouped by some context
Work within a given context is ordered
Work within a given context is synchronous
Additional work for a context should execute after the preceding work
Work should not block the user request
Contexts are asynchronous between each other
Once work for a context is finished, the context should clean up after itself
Given the above, we've implemented the following:
class AsyncService {
def queueContexts
def AsyncService() {
queueContexts = new QueueContexts()
}
def queue(contextString, closure) {
queueContexts.retrieveContextWithWork(contextString, true).send(closure)
}
class QueueContexts {
def contextMap = [:]
def synchronized retrieveContextWithWork(contextString, incrementWork) {
def context = contextMap[contextString]
if (context) {
if (!context.hasWork(incrementWork)) {
contextMap.remove(contextString)
context.terminate()
}
} else {
def queueContexts = this
contextMap[contextString] = new QueueContext({->
queueContexts.retrieveContextWithWork(contextString, false)
})
}
contextMap[contextString]
}
class QueueContext {
def workCount
def actor
def QueueContext(finishClosure) {
workCount = 1
actor = Actors.actor {
loop {
react { closure ->
try {
closure()
} catch (Throwable th) {
log.error("Uncaught exception in async queue context", th)
}
finishClosure()
}
}
}
}
def send(closure) {
actor.send(closure)
}
def terminate(){
actor.terminate()
}
def hasWork(incrementWork) {
workCount += (incrementWork ? 1 : -1)
workCount > 0
}
}
}
}
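For comparison, here is a rough Java sketch of the same requirements without actors. It is an alternative technique, not a translation of the Groovy code above: each context keeps the "tail" future of its queue, new work is chained onto that tail, and the map entry is removed once the last piece of work for a context has finished. ContextQueues, queue and activeContexts are names invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ContextQueues {
    private final ConcurrentHashMap<String, CompletableFuture<Void>> tails = new ConcurrentHashMap<>();
    private final Executor executor;

    ContextQueues(Executor executor) {
        this.executor = executor;
    }

    // Runs work after all earlier work queued for the same context; never blocks the caller.
    CompletableFuture<Void> queue(String context, Runnable work) {
        CompletableFuture<Void> next = tails.compute(context, (key, tail) -> {
            CompletableFuture<Void> previous =
                    (tail == null) ? CompletableFuture.<Void>completedFuture(null) : tail;
            // handleAsync runs even if the previous work failed, so one error
            // does not stall the rest of the context's queue.
            return previous.<Void>handleAsync((ignored, error) -> {
                work.run();
                return null;
            }, executor);
        });
        // Clean up the context if nothing was queued behind this work in the meantime.
        return next.whenComplete((ignored, error) -> tails.remove(context, next));
    }

    int activeContexts() {
        return tails.size();
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        ContextQueues queues = new ContextQueues(pool);
        List<String> log = new CopyOnWriteArrayList<>();
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        for (int i = 1; i <= 3; i++) {
            final int n = i;
            pending.add(queues.queue("a", () -> log.add("a" + n)));
            pending.add(queues.queue("b", () -> log.add("b" + n)));
        }
        pending.forEach(CompletableFuture::join);
        System.out.println(log + " active=" + queues.activeContexts());
        pool.shutdown();
    }
}
```

Work for "a" and "b" may interleave with each other, but within one context the order of submission is kept.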
I have one thread in the thread pool servicing blocking requests.
def sync = Action {
import Contexts.blockingPool
Future {
Thread.sleep(100)
}
Ok("Done")
}
Contexts.blockingPool is configured as:
custom-pool {
fork-join-executor {
parallelism-min = 1
parallelism-max = 1
}
}
In theory, if the above action receives 100 simultaneous requests, the expected behaviour should be: 1 request should sleep(100) and the remaining 99 requests should be rejected (or queued until timeout?). However, I observed that extra worker threads are created to service the rest of the requests. I also observed that latency increases (requests get slower to service) when the number of threads in the pool is smaller than the number of requests received.
What is the expected behaviour if more requests are received than the configured thread-pool size?
Your test is not correctly structured to test your hypothesis.
If you go over this section in the docs you will see that Play has a few thread pools/execution contexts. The one that is important with regards to your question is the default thread pool and how that relates to the HTTP requests served by your action.
As the doc describes, the default thread pool is where all application code is run by default. I.e. all action code, including all Futures (that do not explicitly define their own execution context), will run in this execution context/thread pool. So using your example:
def sync = Action {
// *** import Contexts.blockingPool
// *** Future {
// *** Thread.sleep(100)
// ***}
Ok("Done")
}
All the code in your action not commented by // *** will run in the default thread pool.
I.e. when a request gets routed to your action:
the Future with the Thread.sleep will be dispatched to your custom execution context
then, without waiting for that Future to complete (because it's running in its own thread pool [Contexts.blockingPool] and therefore not blocking any threads on the default thread pool),
your Ok("Done") statement is evaluated and the client receives the response
approx. 100 milliseconds after the response has been received, your Future completes
So to explain your observation: when you send 100 simultaneous requests, Play will gladly accept those requests, route them to your controller action (executing on the default thread pool), dispatch your Future and then respond to the client.
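The same flow can be reproduced outside Play with a small Java sketch. The assumptions are stated up front: a fixed pool with one thread stands in for blockingPool (Play's pool is a fork-join pool, but both keep pending tasks in a queue instead of rejecting them), and a latch stands in for the sleep so the result is deterministic. The class and method names are invented for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class FireAndForgetDemo {
    // Returns {responses sent, background tasks finished when the last response
    // was sent, background tasks finished at the end}.
    static int[] run(int requests) throws InterruptedException {
        ExecutorService blockingPool = Executors.newFixedThreadPool(1);
        CountDownLatch release = new CountDownLatch(1);
        AtomicInteger finished = new AtomicInteger();
        int responses = 0;
        for (int i = 0; i < requests; i++) {
            // Dispatch the blocking work; nothing waits for the returned future.
            CompletableFuture.runAsync(() -> {
                try {
                    release.await(); // stand-in for Thread.sleep(100)
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                finished.incrementAndGet();
            }, blockingPool);
            responses++; // the equivalent of returning Ok("Done") right away
        }
        int finishedWhenResponded = finished.get();
        release.countDown(); // let the backlog drain on the single thread
        blockingPool.shutdown();
        blockingPool.awaitTermination(10, TimeUnit.SECONDS);
        return new int[] {responses, finishedWhenResponded, finished.get()};
    }

    public static void main(String[] args) throws InterruptedException {
        int[] r = run(100);
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints "100 0 100"
    }
}
```

All 100 "responses" go out while none of the background tasks has finished; the tasks are queued and then completed one after another by the single thread.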
The default configuration of the default pool is
play {
akka {
...
actor {
default-dispatcher = {
fork-join-executor {
parallelism-factor = 1.0
parallelism-max = 24
}
}
}
}
}
That is, it uses 1 thread per core, up to a maximum of 24.
Given that your action does very little (excluding the Future), you will be able to handle thousands of requests per second without breaking a sweat. Your Futures will however take much longer to work through the backlog, because you are blocking the only thread in your custom pool (blockingPool).
If you use my slightly adjusted version of your action, you will see what confirms the above explanation in the log output:
object Threading {
def sync = Action {
val defaultThreadPool = Thread.currentThread().getName;
import Contexts.blockingPool
Future {
val blockingPool = Thread.currentThread().getName;
Logger.debug(s"""\t>>> Done on thread: $blockingPool""")
Thread.sleep(100)
}
Logger.debug(s"""Done on thread: $defaultThreadPool""")
Results.Ok
}
}
object Contexts {
implicit val blockingPool: ExecutionContext = Akka.system.dispatchers.lookup("blocking-pool-context")
}
All your requests are swiftly dealt with first, and then your Futures complete one by one afterwards.
So in conclusion, if you really want to test how Play will handle many simultaneous requests with only one thread handling requests, then you can use the following config:
play {
akka {
akka.loggers = ["akka.event.Logging$DefaultLogger", "akka.event.slf4j.Slf4jLogger"]
loglevel = WARNING
actor {
default-dispatcher = {
fork-join-executor {
parallelism-min = 1
parallelism-max = 1
}
}
}
}
}
You might also want to add a Thread.sleep to your action like this (to slow the default thread pool's lonesome thread down a bit):
...
Thread.sleep(100)
Logger.debug(s"""<<< Done on thread: $defaultThreadPool""")
Results.Ok
}
Now you will have 1 thread for requests and 1 thread for your Futures.
If you run this with many concurrent connections, you will notice that the client blocks while Play handles the requests one by one, which is what you expected to see.
Play uses AkkaForkJoinPool which extends scala.concurrent.forkjoin.ForkJoinPool.
It has an internal queue of tasks.
You may also find this description interesting with respect to how a fork-join pool handles blocking code: Scala: the global ExecutionContext makes your life easier
Consider this code:
Thread thread = new Thread(() -> tasks.parallelStream().forEach(Runnable::run));
tasks is a list of Runnables that should be executed in parallel.
When we start this thread and it begins its execution, then, depending on some calculations, we need to interrupt (cancel) all those tasks.
Interrupting the Thread will only stop one of the executions. How do we handle the others? Or maybe Streams should not be used that way? Or do you know a better solution?
You can use a ForkJoinPool to interrupt the threads:
@Test
public void testInterruptParallelStream() throws Exception {
final AtomicReference<InterruptedException> exc = new AtomicReference<>();
final ForkJoinPool forkJoinPool = new ForkJoinPool(4);
// use the pool with a parallel stream to execute some tasks
forkJoinPool.submit(() -> {
Stream.generate(Object::new).parallel().forEach(obj -> {
synchronized (obj) {
try {
// task that is blocking
obj.wait();
} catch (final InterruptedException e) {
exc.set(e);
}
}
});
});
// wait until the stream got started
Thread.sleep(500);
// now we want to interrupt the task execution
forkJoinPool.shutdownNow();
// wait for the interrupt to occur
Thread.sleep(500);
// check that we really got an interruption in the parallel stream threads
assertTrue(exc.get() instanceof InterruptedException);
}
The worker threads do really get interrupted, terminating a blocking operation. You can also call shutdown() within the Consumer.
Note that those sleeps are not tuned for a proper unit test; you might have better ideas for waiting only as long as necessary. But it is enough to show that it works.
You aren't actually running the Runnables on the Thread you are creating. You are running a thread which submits to a pool, so:
Thread thread = new Thread(() -> tasks.parallelStream().forEach(Runnable::run));
In simpler terms, this example does roughly the following:
List<Runnable> tasks = ...;
Thread thread = new Thread(new Runnable(){
public void run(){
for(Runnable r : tasks){
ForkJoinPool.commonPool().submit(r);
}
}
});
This is because you are using a parallelStream, which delegates to the common pool when handling parallel executions.
As far as I know, you cannot get a handle on the Threads that are executing your tasks with a parallelStream, so you may be out of luck. You can always do tricky stuff to get the thread, but it probably isn't the best idea to do so.
Something like the following should work for you:
AtomicBoolean shouldCancel = new AtomicBoolean();
...
tasks.parallelStream().allMatch(task->{
task.run();
return !shouldCancel.get();
});
The documentation for the method allMatch specifically says that it "may not evaluate the predicate on all elements if not necessary for determining the result." So if the predicate doesn't match when you want to cancel, then it doesn't need to evaluate any more. Additionally, you can check the return result to see if the loop was cancelled or not.
Is there any way to time out a scheduled task (kill the thread) in Spring if the task takes too long or even hangs because a remote resource is unavailable?
In my case, tasks can take too long or even hang because they're based on an HtmlUnitDriver (Selenium) sequence of steps. From time to time it hangs, and I would like to be able to set a time limit for the thread to execute, something like 1 minute at most.
I set up a fixed-rate execution of 5 minutes with an initial delay of 1 minute.
Thanks in advance
I did the same some time ago, following an example.
The basic idea is to put your code in a class implementing Callable or Runnable, then create a FutureTask, wherever you are going to invoke your thread, with the Callable or Runnable class as its parameter. Define an executor, submit your FutureTask to the executor, and then wait for the result for at most x time inside a try/catch block. If the wait ends with a TimeoutException, you know that the task took too long.
Here is my code:
CallableServiceExecutor callableServiceExecutor = new CallableServiceExecutor();
FutureTask<Object> task = new FutureTask<>(callableServiceExecutor); // assumes CallableServiceExecutor implements Callable<Object>; use its actual result type
ExecutorService executor = Executors.newSingleThreadExecutor();
executor.submit(task);
Boolean exito = true;
try {
result = task.get(getTimeoutValidacion() , TimeUnit.SECONDS);
} catch (InterruptedException e) {
exito = false;
} catch (ExecutionException e) {
exito = false;
} catch (TimeoutException e) {
exito = false;
}
task.cancel(true);
executor.shutdown();
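A self-contained variant of the same idea, as a sketch: the hung task is simulated with a long Thread.sleep, and TimeoutDemo and cancelHungTask are names made up for the example. Note that cancel(true) only interrupts the worker thread; it stops the task only if the task reacts to interruption, as a sleeping or otherwise interruptible blocking call does.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimeoutDemo {
    // Runs a task that hangs, waits at most timeoutMillis for it, then cancels it.
    // Returns true if the hung task observed the interrupt.
    static boolean cancelHungTask(long timeoutMillis) throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch interrupted = new CountDownLatch(1);
        Future<String> task = executor.submit(() -> {
            try {
                Thread.sleep(60_000); // stand-in for a hung remote call
                return "finished";
            } catch (InterruptedException e) {
                interrupted.countDown(); // the task noticed the cancellation
                throw e;
            }
        });
        try {
            task.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            task.cancel(true); // took too long: interrupt the worker thread
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            executor.shutdown();
        }
        return interrupted.await(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(cancelHungTask(100)); // prints "true"
    }
}
```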
See: How to timeout a thread
The short answer is that there is no easy or reliable way to kill a thread, due to the limitations of Java's thread implementation. ExecutorService#shutdown() is sort of a hack, and heavy. It's best to deal with this in the task itself, e.g. at the network request level: if you're making a REST request, set a timeout on the socket.
Or better, if you do some sort of message passing à la the Actor model (see Akka), you can send a message from a "supervisor" telling the Actor to die. Avoiding blocking by using something like Netty will also help.