Terminate existing pool when all work is done - groovy

Alright, brand new to gpars so please forgive me if this has an obvious answer.
Here is my scenario. We currently have a piece of our code wrapped in a Thread.start {} block so that it can send messages to a message queue in the background without blocking the user request. An issue we have recently run into is that, for large blocks of work, the user can perform another action that causes this block to execute again. Because it is threaded, the second batch of messages can be sent before the first, causing corrupted data.
I would like to change this process to work as a queue flow with gpars. I've seen examples of creating pools such as
def pool = GParsPool.createPool()
or
def pool = new ForkJoinPool()
and then using the pool as
GParsPool.withExistingPool(pool) {
...
}
This seems like it would account for the case that if the user performs an action again, I could reuse the created pool and the actions would not be performed out of order, provided I have a pool size of one.
My question is, is this the best way to do this with gpars? Furthermore, how do I know when the pool has finished all of its work? Does it terminate when all the work is finished? If so, is there a method I can use to check whether the pool has finished/terminated, so that I know I need a new one?
Any help would be appreciated.

No, explicitly created pools do not terminate by themselves. You have to call shutdown() on them explicitly.
Using the withPool() {} command, however, guarantees that the pool is destroyed once the code block is finished.
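For comparison, here is a minimal sketch of the same lifecycle using only java.util.concurrent rather than GPars. It assumes a one-thread executor is an acceptable stand-in for a pool of size one; the names SerialPoolDemo and runInOrder are made up for illustration. It shows that work runs in submission order, that the pool stays alive until shutdown() is called, and that awaitTermination()/isTerminated() tell you when everything has finished.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class SerialPoolDemo {
    /** Submits `count` tasks to a one-thread pool and returns the order in which they ran. */
    static List<Integer> runInOrder(int count) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        List<Integer> order = Collections.synchronizedList(new ArrayList<>());
        for (int i = 0; i < count; i++) {
            final int id = i;
            pool.execute(() -> order.add(id)); // one worker thread, so tasks run one after another
        }
        // The pool never terminates by itself; shutdown() stops new submissions
        // while letting already queued tasks run to completion.
        pool.shutdown();
        try {
            if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not terminate in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // isTerminated() is true only after shutdown() and after all tasks have finished.
        if (!pool.isTerminated()) {
            throw new IllegalStateException("expected a terminated pool");
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(runInOrder(5)); // prints [0, 1, 2, 3, 4]
    }
}
```

A terminated executor cannot be restarted, so a caller that sees isTerminated() return true has to create a new one.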

Here is the current solution we have to our issue. It should be noted that we followed this route because of our requirements:
Work is grouped by some context
Work within a given context is ordered
Work within a given context is synchronous
Additional work for a context should execute after the preceding work
Work should not block the user request
Contexts are asynchronous between each other
Once work for a context is finished, the context should clean up after itself
Given the above, we've implemented the following:
import groovyx.gpars.actor.Actors
class AsyncService {
    def queueContexts
    def AsyncService() {
        queueContexts = new QueueContexts()
    }
    // queue a closure on the given context; work within one context runs in order
    def queue(contextString, closure) {
        queueContexts.retrieveContextWithWork(contextString, true).send(closure)
    }
    class QueueContexts {
        def contextMap = [:]
        // incrementWork == true: new work is being queued for the context
        // incrementWork == false: a piece of work for the context has just finished
        def synchronized retrieveContextWithWork(contextString, incrementWork) {
            def context = contextMap[contextString]
            if (context) {
                if (!context.hasWork(incrementWork)) {
                    // nothing left to do: the context cleans up after itself
                    contextMap.remove(contextString)
                    context.terminate()
                }
            } else {
                def queueContexts = this
                contextMap[contextString] = new QueueContext({->
                    queueContexts.retrieveContextWithWork(contextString, false)
                })
            }
            contextMap[contextString]
        }
        class QueueContext {
            def workCount
            def actor
            def QueueContext(finishClosure) {
                workCount = 1
                // one actor per context processes its messages one at a time
                actor = Actors.actor {
                    loop {
                        react { closure ->
                            try {
                                closure()
                            } catch (Throwable th) {
                                // log is assumed to be available here (e.g. injected by the framework)
                                log.error("Uncaught exception in async queue context", th)
                            }
                            finishClosure()
                        }
                    }
                }
            }
            def send(closure) {
                actor.send(closure)
            }
            def terminate() {
                actor.terminate()
            }
            def hasWork(incrementWork) {
                workCount += (incrementWork ? 1 : -1)
                workCount > 0
            }
        }
    }
}
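The same requirements (ordered work per context, contexts independent of each other, cleanup once a context is idle) can also be sketched without actors. The following is only an illustration in plain Java, not the GPars solution above: it assumes each context can be represented by the tail of a CompletableFuture chain, and the names KeyedSerialQueue, queue and activeContexts are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class KeyedSerialQueue {
    // For every active context, the future of the most recently queued piece of work.
    private final Map<String, CompletableFuture<Void>> tails = new ConcurrentHashMap<>();
    private final ExecutorService pool;

    KeyedSerialQueue(ExecutorService pool) {
        this.pool = pool;
    }

    /** Runs `work` after all earlier work for the same context; other contexts are unaffected. */
    CompletableFuture<Void> queue(String context, Runnable work) {
        CompletableFuture<Void> next = tails.compute(context, (key, tail) -> {
            CompletableFuture<Void> previous =
                    tail == null ? CompletableFuture.<Void>completedFuture(null) : tail;
            // Chain onto the previous work so that order within the context is preserved.
            return previous.<Void>handleAsync((ignored, error) -> {
                try {
                    work.run();
                } catch (Throwable t) {
                    System.err.println("Uncaught exception in context " + key + ": " + t);
                }
                return null;
            }, pool);
        });
        // Clean up: drop the context if nothing newer was queued behind this work.
        return next.whenComplete((result, error) -> tails.remove(context, next));
    }

    int activeContexts() {
        return tails.size();
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        KeyedSerialQueue queue = new KeyedSerialQueue(pool);
        CompletableFuture<Void> last = null;
        for (int i = 0; i < 3; i++) {
            final int id = i;
            last = queue.queue("user-42", () -> System.out.println("message " + id));
        }
        last.join(); // message 0, 1, 2 are printed in that order
        pool.shutdown();
    }
}
```

The conditional remove(context, next) only deletes the entry when it still points at the work that just finished, which plays the role of the workCount bookkeeping in the actor version.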

Related

AsyncHttpClient creates how much threads?

I use async http client in my code to handle GET responses asynchronously.
I can run 100 requests simultaneously.
I use just one instance of httpClient in the container
@Bean(destroyMethod = "close")
open fun httpClient() = Dsl.asyncHttpClient()
Code looks like
fun method(): CompletableFuture<String> {
    return httpClient.prepareGet("someUrl").execute()
        .toCompletableFuture()
        .thenApply(::getResponseBody)
}
It works fine functionally. In my testing I use a mock endpoint with the same URL. I expected all the requests to be handled by a few threads, but in the profiler I can see that 16 threads are created for AsyncHttpClient, and they aren't destroyed even when there are no requests to send.
My expectation was that
there would be fewer threads for the async client
threads would be destroyed after some configured timeout
Is there some option to control how many threads can be created by asyncHttpClient?
Am I missing something in my expectations?
UPDATE 1
I saw the instructions at https://github.com/AsyncHttpClient/async-http-client/wiki/Connection-pooling
but found no info on the thread pool there.
UPDATE 2
I also created a method to do the same, but with a handler and an additional executor pool.
The utility method looks like
fun <Value, Result> CompletableFuture<Value>.handleResultAsync(executor: Executor, initResultHandler: ResultHandler<Value, Result>.() -> Unit): CompletableFuture<Result> {
    val rh = ResultHandler<Value, Result>()
    rh.initResultHandler()
    val handler = BiFunction { value: Value?, exception: Throwable? ->
        if (exception == null) rh.success?.invoke(value) else rh.fail?.invoke(exception)
    }
    return handleAsync(handler, executor)
}
The updated method looks like
fun method(): CompletableFuture<String> {
    return httpClient.prepareGet("someUrl").execute()
        .toCompletableFuture()
        .handleResultAsync(executor) {
            success = { response ->
                logger.info("ok")
                getResponseBody(response!!)
            }
            fail = { ex ->
                logger.error("Failed to execute request", ex)
                throw ex
            }
        }
}
Then I can see that the result of the GET is handled on threads from the provided thread pool (previously it was handled on "AsyncHttpClient-3-x"), but additional threads for AsyncHttpClient are still created and not destroyed.
AHC has two types of threads:
For I/O operations. On your screen, these are the AsyncHttpClient-x-x threads. AHC creates 2*core_number of those.
For timeouts. On your screen, it's the AsyncHttpClient-timer-1-1 thread. There should be only one.
Source: issue on GitHub: https://github.com/AsyncHttpClient/async-http-client/issues/1658
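The observation in UPDATE 2 (callbacks move to the extra pool while the client's own threads stay) follows from how CompletableFuture picks the thread for a callback. The sketch below uses only the JDK and no AsyncHttpClient API: a thread named fake-io-1 is an assumed stand-in for one of the client's I/O threads, and CallbackThreadDemo is a made-up name.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class CallbackThreadDemo {
    /** Returns the names of the threads that ran the plain and the async callback. */
    static String[] callbackThreads() {
        ExecutorService callbackPool =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "callback-pool-1"));
        CompletableFuture<String> response = new CompletableFuture<>();

        // Registered before completion: runs on whichever thread completes the future.
        CompletableFuture<String> plain =
                response.thenApply(body -> Thread.currentThread().getName());
        // The *Async variant with an executor: always runs on that executor.
        CompletableFuture<String> async =
                response.handleAsync((body, error) -> Thread.currentThread().getName(), callbackPool);

        // Stand-in for the I/O thread that receives the HTTP response.
        Thread ioThread = new Thread(() -> response.complete("body"), "fake-io-1");
        ioThread.start();

        String[] names = { plain.join(), async.join() };
        callbackPool.shutdown();
        return names;
    }

    public static void main(String[] args) {
        String[] names = callbackThreads();
        System.out.println("thenApply ran on   " + names[0]);
        System.out.println("handleAsync ran on " + names[1]);
    }
}
```

Moving callbacks to another executor changes where the result is processed, but something still has to perform the I/O and complete the future, which is why the I/O threads remain.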

Batch up requests in Groovy?

I'm new to Groovy and am a bit lost on how to batch up requests so they can be submitted to a server as a batch, instead of individually, as I currently have:
class Handler {
private String jobId
// [...]
void submit() {
// [...]
// client is a single instance of Client used by all Handlers
jobId = client.add(args)
}
}
class Client {
//...
String add(String args) {
response = postJson(args)
return parseIdFromJson(response)
}
}
As it is now, something calls Client.add(), which POSTs to a REST API and returns a parsed result.
The issue I have is that the add() method is called maybe thousands of times in quick succession, and it would be much more efficient to collect all the args passed in to add(), wait until there's a moment when the add() calls stop coming in, and then POST to the REST API a single time for that batch, sending all the args in one go.
Is this possible? Potentially, add() can return a fake id immediately, as long as the batching occurs, the submit happens, and Client can later know the lookup between fake id and the ID coming from the REST API (which will return IDs in the order corresponding to the args sent to it).
As mentioned in the comments, this might be a good case for gpars which is excellent at these kinds of scenarios.
This really is less about groovy and more about asynchronous programming in java and on the jvm in general.
If you want to stick with the java concurrent idioms I threw together a code snippet you could use as a potential starting point. This has not been tested and edge cases have not been considered. I wrote this up for fun and since this is asynchronous programming and I haven't spent the appropriate time thinking about it, I suspect there are holes in there big enough to drive a tank through.
That being said, here is some code which makes an attempt at batching up the requests:
import java.util.concurrent.*
import java.util.concurrent.locks.*
// test code
def client = new Client()
client.start()
def futureResponses = []
1000.times {
    futureResponses << client.add(it as String)
}
client.stop()
futureResponses.each { futureResponse ->
    // resolve future...will wait if the batch has not completed yet
    def response = futureResponse.get()
    println "received response with index ${response.responseIndex}"
}
// end of test code
class FutureResponse extends CompletableFuture<String> {
    String args
}
class Client {
    int minMillisLullToSubmitBatch = 100
    int maxBatchSizeBeforeSubmit = 100
    int millisBetweenChecks = 10
    long lastAddTime = Long.MAX_VALUE
    def batch = []
    def lock = new ReentrantLock()
    // volatile so the monitoring thread sees the change made by stop()
    private volatile boolean running = true
    def start() {
        running = true
        Thread.start {
            while (running) {
                checkForSubmission()
                sleep millisBetweenChecks
            }
        }
    }
    def stop() {
        running = false
        // flush whatever is still queued, even if neither the lull nor the
        // size criterion has been met, so that no future is left unresolved
        withLock {
            if (batch) submitBatch()
        }
    }
    def withLock(Closure c) {
        // acquire the lock before entering try, so unlock only runs if we hold it
        lock.lock()
        try {
            c.call()
        } finally {
            lock.unlock()
        }
    }
    FutureResponse add(String args) {
        def future = new FutureResponse(args: args)
        withLock {
            batch << future
            lastAddTime = System.currentTimeMillis()
        }
        future
    }
    def checkForSubmission() {
        withLock {
            // never post an empty batch
            if (batch && (System.currentTimeMillis() - lastAddTime > minMillisLullToSubmitBatch ||
                    batch.size() > maxBatchSizeBeforeSubmit)) {
                submitBatch()
            }
        }
    }
    def submitBatch() {
        // here you would need to put the combined args on a format
        // suitable for the endpoint you are calling. In this
        // example we are just creating a list containing the args
        def combinedArgs = batch.collect { it.args }
        // further there needs to be a way to map one specific set of
        // args in the combined args to a specific response. If the
        // endpoint responds with the same order as the args we submitted
        // were in, then that can be used otherwise something else like
        // an id in the response etc would need to be figured out. Here
        // we just assume responses are returned in the order args were submitted
        List<String> combinedResponses = postJson(combinedArgs)
        combinedResponses.indexed().each { index, response ->
            // here the FutureResponse gets a value, can be retrieved with
            // futureResponse.get()
            batch[index].complete(response)
        }
        // clear the batch
        batch = []
    }
    // bogus method to fake post
    def postJson(combinedArgs) {
        println "posting json with batch size: ${combinedArgs.size()}"
        combinedArgs.collect { [responseIndex: it] }
    }
}
A few notes:
something needs to be able to react to the fact that there were no calls to add for a while. This implies a separate monitoring thread and is what the start and stop methods manage.
if we have an infinite sequence of adds without pauses, you might run out of resources. Therefore the code has a max batch size where it will submit the batch even if there is no lull in the calls to add.
the code uses a lock to make sure (or try to, as mentioned above, I have not considered all potential issues here) we stay thread safe during batch submissions etc
assuming the general idea here is sound, you are left with implementing the logic in submitBatch where the main problem is dealing with mapping specific args to specific responses
CompletableFuture is a java 8 class. This can be solved using other constructs in earlier releases, but I happened to be on java 8.
I more or less wrote this without executing or testing, I'm sure there are some mistakes in there.
as can be seen in the printout below, the "maxBatchSizeBeforeSubmit" setting is more a recommendation than an actual max. Since the monitoring thread sleeps for some time and then wakes up to check how we are doing, the threads calling the add method might have accumulated any number of requests in the batch. All we are guaranteed is that every millisBetweenChecks we will wake up and check how we are doing, and if the criteria for submitting a batch have been reached, then the batch will be submitted.
If you are unfamiliar with java Futures and locks, I would recommend you read up on them.
If you save the above code in a groovy script code.groovy and run it:
~> groovy code.groovy
posting json with batch size: 153
posting json with batch size: 234
posting json with batch size: 243
posting json with batch size: 370
received response with index 0
received response with index 1
received response with index 2
...
received response with index 998
received response with index 999
~>
it should work and print out the "responses" received from our fake json submissions.
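If a hard upper bound on the batch size matters, one alternative is to let the consumer pull from a BlockingQueue instead of polling on a timer. The sketch below is a different technique from the lock-based code above, shown in plain Java under the assumption that a single consumer thread builds the batches; LullBatcher and nextBatch are hypothetical names. A batch ends either when maxSize is reached or when no new item arrives within lullMillis.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

final class LullBatcher {
    /**
     * Blocks until at least one item is available, then keeps collecting until
     * either maxSize items are gathered or no item arrives within lullMillis.
     */
    static <T> List<T> nextBatch(BlockingQueue<T> queue, int maxSize, long lullMillis) {
        List<T> batch = new ArrayList<>();
        try {
            batch.add(queue.take()); // wait for the first item of the batch
            while (batch.size() < maxSize) {
                T next = queue.poll(lullMillis, TimeUnit.MILLISECONDS);
                if (next == null) {
                    break; // a lull in the calls: submit what we have
                }
                batch.add(next);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // return the partial batch
        }
        return batch;
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < 250; i++) {
            queue.add("args-" + i);
        }
        while (!queue.isEmpty()) {
            // prints batch sizes 100, 100 and 50
            System.out.println("posting batch of size " + nextBatch(queue, 100, 20).size());
        }
    }
}
```

Producers would simply call queue.add(...) and keep a future per item, as in the Groovy version; the mapping of responses back to futures is unchanged.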

Changing State when Using Scala Concurrency

I have a function in my Controller that takes user input, and then, using an infinite loop, queries a database and sends the object returned from the database to a webpage. This all works fine, except that I needed to introduce concurrency in order to both run this logic and render the webpage.
The code is given by:
def getSearchResult = Action { request =>
  val search = request.queryString.get("searchInput").head.head
  val databaseSupport = new InteractWithDatabase(comm, db)
  val put = Future {
    while (true) {
      val data = databaseSupport.getFromDatabase(search)
      if (data.nonEmpty) {
        if (data.head.vendorId.equals(search)) {
          comm.communicator ! data.head
        }
      }
    }
  }
  Ok(views.html.singleElement.render)
}
The issue arises when I want to call this again, but with a different input. Because the first thread is in an infinite loop, it never ceases to run and is still running even when I start the second thread. Therefore, both objects are being sent to the webpage at the same time in two separate threads.
How can I stop the first thread once I call this function again? Or, is there a better implementation of this whole idea so that I could do it without using multithreading?
Note: I tried removing the concurrency from this function (as multithreading has been the thing giving me all of these problems) and instead moving it to the web socket itself, but this posed problems as the web socket is connected to a router, and everything connects to the web socket through the router.
Try Action.async, where the block returns a Future[Result]. Make the database call inside that Future. E.g. (pseudo code):
def getSearchResult = Action.async { request =>
  val search = request.queryString.get("searchInput").head.head
  val databaseSupport = new InteractWithDatabase(comm, db)
  // Future { ... } needs an implicit ExecutionContext in scope
  Future {
    val data = databaseSupport.getFromDatabase(search)
    if (data.nonEmpty) {
      if (data.head.vendorId.equals(search)) {
        comm.communicator ! data.head // A
      }
    }
    Ok(views.html.singleElement.render)
  }
}
It would be better if databaseSupport.getFromDatabase(search) returned a Future itself, but that is a story for another day. The tricky part is figuring out how to deal with the actor at "A". Just remember that the block must end by returning a Future[Result].
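If a background loop is kept after all, the question of stopping the first loop when the function is called again can be handled by remembering the running task and cancelling it before starting the next one. This is a rough sketch in plain Java, not Play or Akka code; LatestSearchPoller, start and pollOnce are hypothetical names, and pollOnce stands in for one database query plus the send to the communicator.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicReference;

final class LatestSearchPoller {
    private final ExecutorService executor = Executors.newCachedThreadPool();
    // The polling loop started by the most recent call, if any.
    private final AtomicReference<Future<?>> current = new AtomicReference<>();

    /** Starts a new polling loop and cancels the loop started by the previous call. */
    Future<?> start(Runnable pollOnce) {
        Runnable loop = () -> {
            // Checking the interrupt flag is what makes the loop stoppable.
            while (!Thread.currentThread().isInterrupted()) {
                pollOnce.run();
                try {
                    Thread.sleep(10); // small pause between polls
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // restore the flag and leave the loop
                }
            }
        };
        Future<?> started = executor.submit(loop);
        Future<?> previous = current.getAndSet(started);
        if (previous != null) {
            previous.cancel(true); // interrupts the older loop
        }
        return started;
    }

    void shutdown() {
        executor.shutdownNow(); // interrupts whatever loop is still running
    }

    public static void main(String[] args) {
        LatestSearchPoller poller = new LatestSearchPoller();
        Future<?> first = poller.start(() -> { });
        Future<?> second = poller.start(() -> { });
        System.out.println("first cancelled:  " + first.isCancelled());  // true
        System.out.println("second cancelled: " + second.isCancelled()); // false
        poller.shutdown();
    }
}
```

The key change compared with while (true) is that the loop has an exit condition that another thread can trigger.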

Function/Code Design with Concurrency in Swift

I'm trying to create my first app in Swift which involves making multiple requests to a website. These requests are each done using the block
var task = NSURLSession.sharedSession().dataTaskWithRequest(request, completionHandler: {data, response, error -> Void in ... })
task.resume()
From what I understand this block uses a thread different to the main thread.
My question is, what is the best way to design code that relies on the values in that block? For instance, the ideal design (however not possible due to the fact that the thread executing these blocks is not the main thread) is
func prepareEmails() {
    var names = getNames()
    var emails = getEmails()
    ...
    sendEmails()
}
func getNames() -> NSArray {
    var names = nil
    ....
    var task = NSURLSession.sharedSession().dataTaskWithRequest(request, completionHandler: {data, response, error -> Void in
        names = ...
    })
    task.resume()
    return names
}
func getEmails() -> NSArray {
    var emails = nil
    ....
    var task = NSURLSession.sharedSession().dataTaskWithRequest(request, completionHandler: {data, response, error -> Void in
        emails = ...
    })
    task.resume()
    return emails
}
However, in the above design, getNames() and getEmails() will most likely return nil, as the task will not have updated emails/names by the time the function returns.
The alternative design (which I currently implement) is by effectively removing the 'prepareEmails' function and doing everything sequentially in the task functions
func prepareEmails() {
    getNames()
}
func getNames() {
    ...
    var task = NSURLSession.sharedSession().dataTaskWithRequest(request, completionHandler: {data, response, error -> Void in
        getEmails(names)
    })
    task.resume()
}
func getEmails(names: NSArray) {
    ...
    var task = NSURLSession.sharedSession().dataTaskWithRequest(request, completionHandler: {data, response, error -> Void in
        sendEmails(emails, names)
    })
    task.resume()
}
Is there a more effective design than the latter? This is my first experience with concurrency, so any advice would be greatly appreciated.
The typical pattern when calling an asynchronous method that has a completionHandler parameter is to use the completionHandler closure pattern, yourself. So the methods don't return anything, but rather call a closure with the returned information as a parameter:
func getNames(completionHandler:(NSArray!)->()) {
    ....
    let task = NSURLSession.sharedSession().dataTaskWithRequest(request) {data, response, error -> Void in
        let names = ...
        completionHandler(names)
    }
    task.resume()
}
func getEmails(completionHandler:(NSArray!)->()) {
    ....
    let task = NSURLSession.sharedSession().dataTaskWithRequest(request) {data, response, error -> Void in
        let emails = ...
        completionHandler(emails)
    }
    task.resume()
}
Then, if you need to perform these sequentially, as suggested by your code sample (i.e. if the retrieval of emails was dependent upon the names returned by getNames), you could do something like:
func prepareEmails() {
    getNames() { names in
        getEmails() { emails in
            sendEmails(names, emails) // I'm assuming the names and emails are in the input to this method
        }
    }
}
Or, if they can run concurrently, then you should do so, as it will be faster. The trick is how to make a third task dependent upon two other asynchronous tasks. The two traditional alternatives include
Wrapping each of these asynchronous tasks in its own asynchronous NSOperation, and then create a third task dependent upon those other two operations. This is probably beyond the scope of the question, but you can refer to the Operation Queue section of the Concurrency Programming Guide or see the Asynchronous vs Synchronous Operations and Subclassing Notes sections of the NSOperation Class Reference.
Use dispatch groups, entering the group before each request, leaving the group within the completion handler of each request, and then adding a dispatch group notification block (called when all of the group "enter" calls are matched by their corresponding "leave" calls):
func prepareEmails() {
    let group = dispatch_group_create()
    var emails: NSArray!
    var names: NSArray!
    dispatch_group_enter(group)
    getNames() { results in
        names = results
        dispatch_group_leave(group)
    }
    dispatch_group_enter(group)
    getEmails() { results in
        emails = results
        dispatch_group_leave(group)
    }
    dispatch_group_notify(group, dispatch_get_main_queue()) {
        if names != nil && emails != nil {
            self.sendEmails(names, emails)
        } else {
            // one or both of those requests failed; tell the user
        }
    }
}
Frankly, if there's any way to retrieve both the emails and names in a single network request, that's going to be far more efficient. But if you're stuck with two separate requests, you could do something like the above.
Note, I wouldn't generally use NSArray in my Swift code, but rather use an array of String objects (e.g. [String]). Furthermore, I'd put in error handling where I return the nature of the error if either of these fail. But hopefully this illustrates the concepts involved in (a) writing your own methods with completionHandler blocks; and (b) invoking a third bit of code dependent upon the completion of two other asynchronous tasks.
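The "third task that depends on two concurrent asynchronous tasks" idea is not specific to Swift or GCD. As a language-neutral illustration, here is how the same shape looks with Java's CompletableFuture, where thenCombine plays the role of the dispatch group notification. This is only an analogy under stated assumptions: getNames and getEmails are hypothetical stand-ins that return canned data instead of making network requests.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class CombineDemo {
    // Hypothetical stand-in for the first network request.
    static CompletableFuture<List<String>> getNames() {
        return CompletableFuture.supplyAsync(() -> Arrays.asList("Ann", "Bob"));
    }

    // Hypothetical stand-in for the second network request.
    static CompletableFuture<List<String>> getEmails() {
        return CompletableFuture.supplyAsync(() -> Arrays.asList("ann@example.com", "bob@example.com"));
    }

    /** Starts both requests concurrently and runs the combining step once both have finished. */
    static CompletableFuture<String> prepareEmails() {
        return getNames().thenCombine(getEmails(),
                (names, emails) -> "sending " + emails.size() + " emails to " + names.size() + " names");
    }

    public static void main(String[] args) {
        // join() blocks only for the demo; real code would keep chaining instead.
        System.out.println(prepareEmails().join()); // prints "sending 2 emails to 2 names"
    }
}
```

If either request fails, the combined future completes exceptionally and the combining function is not called, which corresponds to the else branch of the notification block above.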
The answers above (particularly Rob's DispatchQueue based answer) describe the concurrency concepts necessary to run two tasks in parallel and then respond to the result. The answers lack error handling for clarity because traditionally, correct solutions to concurrency problems are quite verbose.
Not so with HoneyBee.
HoneyBee.start()
.setErrorHandler(handleErrorFunc)
.branch {
$0.chain(getNames)
+
$0.chain(getEmails)
}
.chain(sendEmails)
This code snippet manages all of the concurrency, routes all errors to handleErrorFunc and looks like the concurrent pattern that is desired.

Grails promise callback not called

I can't get a callback to execute that should be called from within a Promise's onComplete().
// some service
def initiateDbLoad() {
    def p1 = task { dbLoad() }
    p1.onComplete { result ->
        dbLoadCallback()
    }
}
def dbLoad() {
    // some long-running process here
}
def dbLoadCallback() {
    // I am never called
}
The use case is that I want to kick off a long-running process in a separate thread. The calling thread should return, not wait for the thread to finish. When the long-running process is complete, I want it to execute a callback. Is this possible? Or should it look like the code below?
def initiateDbLoad() {
    def p1 = task {
        dbLoad()
        dbLoadCallback()
    }
}
onComplete is only called on success. You have to add an onError too to get actual errors.
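The split between a success callback and an error callback exists in most promise libraries. The sketch below shows the same behaviour with Java's CompletableFuture rather than the Grails Promise API: thenAccept corresponds loosely to a success-only callback and exceptionally to an error callback. The class name CallbackDemo and the simulated failure are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class CallbackDemo {
    /** Runs a fake long-running load and records which callbacks fired. */
    static List<String> run(boolean fail) {
        List<String> events = Collections.synchronizedList(new ArrayList<>());
        CompletableFuture<String> load = CompletableFuture.supplyAsync(() -> {
            if (fail) {
                throw new IllegalStateException("db load failed"); // simulated failure
            }
            return "loaded";
        });
        load.thenAccept(result -> events.add("onComplete: " + result)) // success only
            .exceptionally(error -> {                                  // failure only
                events.add("onError");
                return null;
            })
            .join(); // wait so the demo can inspect the events
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // prints [onComplete: loaded]
        System.out.println(run(true));  // prints [onError]
    }
}
```

When the success callback never fires, registering the error callback is the quickest way to find out whether the background work threw an exception.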
