I'm using this library to talk to HBase:
org.hbase:asynchbase:1.7.0
Why does the following code take ~70 ms?
long start = System.currentTimeMillis();
Deferred<ArrayList<KeyValue>> meta = hbaseClient.get(new GetRequest(propsCtx.hbaseTable, rowIdMeta(tsnfdr.id)));
long end = System.currentTimeMillis();
log.info("supposed 'non-blocking' async hbase call took {} millis", end - start);
(Evaluating 'propsCtx.hbaseTable' and 'rowIdMeta(tsnfdr.id)' is not the problem.)
The docs are very limited, but the method's signature, observations in VisualVM of Netty threads being used under the hood, and a quick look at the source code all tell me that I'm supposed to be using an async API.
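For reference, the non-blocking usage pattern is to attach callbacks to the Deferred rather than wait on it; the get(...) call itself only queues the RPC. A minimal sketch reusing the names from the snippet above (the callback bodies are illustrative only):
import com.stumbleupon.async.Callback;
import com.stumbleupon.async.Deferred;
import org.hbase.async.GetRequest;
import org.hbase.async.KeyValue;
import java.util.ArrayList;

Deferred<ArrayList<KeyValue>> meta =
    hbaseClient.get(new GetRequest(propsCtx.hbaseTable, rowIdMeta(tsnfdr.id)));
meta.addCallbacks(
    new Callback<Object, ArrayList<KeyValue>>() {
      public Object call(final ArrayList<KeyValue> row) {
        // Runs later, on an asynchbase/Netty I/O thread, once the RPC completes.
        return row;
      }
    },
    new Callback<Object, Exception>() {
      public Object call(final Exception e) {
        // Runs if the RPC fails.
        return e;
      }
    });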
Related question
As far as I know, programs using async/await use only one thread. That means we do not have to worry about any conflicting threads. When the compiler sees "await", it simply does other things on that same thread until whatever is being awaited is done.
I mean, the thing we're awaiting may run in another thread. However, the program doesn't create another thread; it simply does something else on that same thread.
Hence, we shouldn't have to worry about conflicts.
Yet today I discovered that something is running on at least two different threads.
Public Sub LogEvents(ByVal whatToLog As String, Optional ByVal canRandom As Boolean = True)
    Static logNumber As Integer
    Dim timeStamp As String
    timeStamp = CStr(Microsoft.VisualBasic.Now)
    whatToLog = timeStamp & " " & " " & whatToLog & Microsoft.VisualBasic.vbNewLine
    Try
        Debug.Print(whatToLog)
        System.IO.File.AppendAllText("log.txt", whatToLog, defaultEncoding)
        ...
Looking at the threads in the debugger:
One is a worker thread, and the other is the main thread. Both threads are stopped at the same place.
What confuses me is that I thought everything should have been running on the main thread; that's just how async/await works. How can anything run on a worker thread?
The task is created like this:
For Each account In uniqueAccounts().Values
    Dim newtask = account.getMarketInfoAsync().ContinueWith(Sub() account.LogFinishTask("Getting Market Info", starttime))
    LogEvents("marketinfo account " + account.Exchange + " is being done by task " + newtask.Id.ToString + " " + newtask.ToString)
    tasklist.Add(newtask)
    'newtask.ContinueWith(Sub() LogEvents(account.ToString))
Next
This is the screenshot.
That is followed by:
LogEvents("Really Start Getting Market Detail of All")
Try
    Await jsonHelper.whenAllWithTimeout(tasklist.ToArray, 500000)
Catch ex As Exception
    Dim b = 1
End Try
That calls:
Public Shared Async Function whenAllWithTimeout(taskar As Task(), timeout As Integer) As Task
    Dim timeoutTask = Task.Delay(timeout)
    Dim maintask = Task.WhenAll(taskar)
    Await Task.WhenAny({timeoutTask, maintask})
    If maintask.IsCompleted Then
        Dim b = 1
        For Each tsk In taskar
            LogEvents("Not Time Out. Status of task " + tsk.Id.ToString + " is " + tsk.IsCompleted.ToString)
        Next
    End If
    If timeoutTask.IsCompleted Then
        Dim b = 1
        For Each tsk In taskar
            LogEvents("status of task " + tsk.Id.ToString + " is " + tsk.IsCompleted.ToString)
        Next
    End If
End Function
So I create a bunch of tasks and I use Task.WhenAll and Task.WhenAny.
Is that why they run on a different thread than the main thread?
How do I make everything run on the main thread only?
As far as I know, programs using async/await use only one thread.
This is incorrect.
When the compiler sees "await", it simply does other things on that same thread until whatever is being awaited is done.
Also incorrect.
I recommend reading my async intro.
await actually causes a return from the method. The thread may or may not be returned to the runtime.
How can anything run on a worker thread?
When async methods resume executing after an await, by default they will resume executing on a context captured by that await. If there was no context (common in console applications), then they resume on a thread pool thread.
How do I make everything run on the main thread only?
Give them a single-threaded context. GUI main threads use a single-threaded context, so you could run this on a GUI main thread. Or if you are writing a console application, you can use AsyncContext from my AsyncEx library.
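For example, a minimal sketch assuming the Nito.AsyncEx package is referenced (MainAsync is a stand-in for your existing async entry point):
Imports Nito.AsyncEx

Module Program
    Sub Main()
        ' AsyncContext.Run installs a single-threaded context, so every
        ' Await inside MainAsync resumes on this same thread.
        AsyncContext.Run(Function() MainAsync())
    End Sub

    Async Function MainAsync() As Task
        ' ... create the tasks and Await jsonHelper.whenAllWithTimeout(...) here ...
    End Function
End Module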
I have multiple producer processes and one consumer process, each launched by the MPIPoolExecutor class. First I launch the consumer process, then I start launching producer processes using the starmap method. The consumer receives data and saves it to the hard drive. Each producer process creates a buffer of the same size as the data to be sent and sends the data with the buffered method bsend. I expect each producer process to dump its data into the buffer and exit. However, I am noticing a delay where it looks like each producer process waits for the data to be consumed by the consumer process. What am I missing? My code goes like this:
def consumer(args...):
    comm = MPI.COMM_WORLD
    file = tb.open_file(file_name, 'w')
    filters = tb.Filters(complevel=5, complib='blosc')
    array = file.create_carray(file.root, 'data', tb.Float32Atom(), shape=(n_, n_), filters=filters)
    for i in range(num_tasks):
        t = time.time()
        idxs, data = comm.recv()  # receives from any source by default
        print("time for waiting --consumer ", time.time() - t)
        array[idxs, :] = data
def producer(args...):
    comm = MPI.COMM_WORLD
    # adding 1000 just to be on the safe side
    mem = MPI.Alloc_mem(data.nbytes + idxs.nbytes + 1000)
    MPI.Attach_buffer(mem)
    # Since the consumer is launched first, it is guaranteed to get rank 1.
    comm.bsend([idxs, data], dest=1)
    MPI.Detach_buffer()
    ....
with MPIPoolExecutor() as executor:
    executor.starmap(consumer, [(args)])
    executor.starmap(producer, list_of_args)
If the consumer is launched first, it gets rank zero, not 1. Also, you're misunderstanding buffered communication: the buffer detach call blocks until all messages in the buffer have completed. If you want the producer to return immediately, use MPI_Isend.
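In mpi4py terms, a minimal sketch of that suggestion, reusing comm, idxs, and data from the question (consumer_rank is a placeholder for whatever rank the consumer actually gets):
# isend is nonblocking: it returns a Request immediately instead of
# waiting for buffer space or delivery.
req = comm.isend([idxs, data], dest=consumer_rank)
# ... the producer can keep working while the message is in flight ...
req.wait()  # completes the send; do not reuse idxs/data before this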
I'm using Monix Task for async control.
scenario
tasks are executed in parallel
if failures occur more than X times
stop all tasks that are not yet complete (as quickly as possible)
my solution
I came up with the idea of racing between (1) the result and (2) an error counter, and cancelling the loser.
Via Task.race, if the error counter reaches the threshold first, the tasks are cancelled by Task.race.
experiment
on Ammonite REPL
{
  import $ivy.`io.monix::monix:3.1.0`
  import monix.eval.Task
  import monix.execution.atomic.Atomic
  import scala.concurrent.duration._
  import monix.execution.Scheduler
  //import monix.execution.Scheduler.Implicits.global
  implicit val s = Scheduler.fixedPool("race", 2) // pool size

  val taskSize = 100
  val errCounter = Atomic(0)
  val threshold = 3

  val tasks = (1 to taskSize).map(_ => Task.sleep(100.millis).map(_ => errCounter.increment()))
  val guard = Task(f"stop because too many error: ${errCounter.get()}")
    .restartUntil(_ => errCounter.get() >= threshold)

  val race = Task
    .race(guard, Task.gather(tasks))
    .runToFuture
    .onComplete { case x => println(x); println(f"completed task: ${errCounter.get()}") }
}
issue
The outcome depends on the thread pool size!?
For pool size 1
the outcome is almost always a task success, i.e. no stop.
Success(Right(.........))
completed task: 100 // all tasks succeeded!
For pool size 2
it is very nondeterministic between success and failure, and the cancelling is not accurate.
for example:
Success(Left(stop because too many error: 1))
completed task: 98
the cancelling happens as late as after 98 tasks have completed.
the error count is oddly small compared to the threshold.
The default global scheduler shows the same behavior.
For pool size 200
it is more deterministic and the stopping happens earlier, thus it is more accurate in the sense that fewer tasks completed.
Success(Left(stop because too many error: 2))
completed task: 8
The larger the pool size, the better.
If I change Task.gather to sequential execution with Task.sequence, all the issues disappear!
What causes this dependency on the pool size?
How can I improve it, or is there a better alternative for stopping the tasks once too many errors occur?
What you're seeing is likely an effect of the Monix scheduler and how it aims for fairness. It's a fairly complex topic, but the documentation and scaladocs are excellent (see https://monix.io/docs/3x/execution/scheduler.html#execution-model).
When you have only one thread (or few) it takes a while until the "guard" Task gets another turn to check. With Task.gather you start 100 tasks at once, so the scheduler is very busy and the "guard" cannot check again until the other tasks are already done.
If you have one thread per task the scheduler cannot guarantee fairness and therefore the "guard" unfairly checks much more frequently and can finish sooner.
If you use Task.sequence, those 100 tasks are executed sequentially, which is why the "guard" task gets many more opportunities to finish as soon as needed. If you want to keep your code the way it is, you could use Task.gatherN(parallelism = 4), which will limit the parallelism and therefore allow your "guard" to check more often (a middle ground between Task.sequence and Task.gather).
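For instance, a sketch that keeps the original race but bounds the parallelism, reusing guard and tasks from the question's snippet:
val race = Task
  .race(guard, Task.gatherN(parallelism = 4)(tasks))
  .runToFuture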
It seems a bit like Go code to me (using Task.race like Go's select), and you're also using unconstrained side effects, which further complicates understanding what's going on. I've tried to rewrite your program in a way that's more idiomatic; for complicated concurrency I usually reach for streams like Observable:
import cats.effect.concurrent.Ref
import monix.eval.Task
import monix.execution.Scheduler
import monix.reactive.Observable
import scala.concurrent.duration._

object ErrorThresholdDemo extends App {
  //import monix.execution.Scheduler.Implicits.global
  implicit val s: Scheduler = Scheduler.fixedPool("race", 2) // pool size

  val taskSize = 100
  val threshold = 30

  val program = for {
    errCounter <- Ref[Task].of(0)

    tasks = (1 to taskSize).map(n => Task.sleep(100.millis).flatMap(_ => errCounter.update(_ + (n % 2))))

    tasksFinishedCount <- Observable
      .fromIterable(tasks)
      .mapParallelUnordered(parallelism = 4) { task =>
        task
      }
      .takeUntilEval(errCounter.get.restartUntil(_ >= threshold))
      .map(_ => 1)
      .sumL

    errorCount <- errCounter.get
    _ <- Task(println(f"completed tasks: $tasksFinishedCount, errors: $errorCount"))
  } yield ()

  program.runSyncUnsafe()
}
As you can see, I no longer use global mutable side effects; instead I use Ref, which internally also uses Atomic but provides a functional API that we can use with Task.
For demonstration purposes I also changed the threshold to 30, and only every other task will "error". So the expected output is always around completed tasks: 60, errors: 30, no matter the thread pool size.
I'm still using polling with errCounter.get.restartUntil(_ >= threshold), which might burn a bit too much CPU for my taste, but it's close to your original idea and works well.
Usually I don't create a list of tasks up front; instead I throw the inputs into the Observable and create the tasks inside .mapParallelUnordered. This code keeps your list, which is why there is no real mapping involved (it already contains tasks).
You can choose your desired parallelism much like with Task.gatherN, which is pretty nice imo.
Let me know if anything is still unclear :)
I would like to use the threads library (or perhaps parallel) for loading/preprocessing data into a queue, but I am not entirely sure how it works. In summary:
Load data (tensors), preprocess the tensors (this takes time, hence why I am here) and put them in a queue. I would like to have as many threads as possible doing this so that the model is never waiting, or at least not waiting for long.
Take the tensor at the front of the queue, forward it through the model, and remove it from the queue.
I don't really understand the example at https://github.com/torch/threads well enough. A hint or example of where I would load data into the queue and train on it would be great.
EDIT 14/03/2016
In this example, https://github.com/torch/threads/blob/master/test/test-low-level.lua, which uses a low-level thread, does anyone know how I can extract data from these threads into the main thread?
Look at this multi-threaded data provider:
https://github.com/soumith/dcgan.torch/blob/master/data/data.lua
It runs this file in the thread:
https://github.com/soumith/dcgan.torch/blob/master/data/data.lua#L18
by calling it here:
https://github.com/soumith/dcgan.torch/blob/master/data/data.lua#L30-L43
And afterwards, if you want to queue a job into the thread, you provide two functions:
https://github.com/soumith/dcgan.torch/blob/master/data/data.lua#L84
The first one runs inside the thread, and the second one runs in the main thread after the first one completes.
Hopefully that makes it a bit more clear.
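Schematically, the pattern is (a sketch: pool is assumed to be a threads.Threads instance, and loadAndPreprocess and queue are hypothetical names):
pool:addjob(
   -- first function: runs inside the worker thread
   function()
      local batch = loadAndPreprocess() -- hypothetical: load + preprocess one batch
      return batch
   end,
   -- second function: runs in the main thread with the worker's return value
   function(batch)
      table.insert(queue, batch) -- hand the batch to the training loop
   end
)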
If Soumith's examples in the previous answer are not very easy to use, I suggest you build your own pipeline from scratch. Here is an example with two synchronized threads: one writing data and one reading it:
local t = require 'threads'
t.Threads.serialization('threads.sharedserialize')
local tds = require 'tds'

local dict = tds.Hash() -- only local variables work here, and only tables or tds.Hash()
dict[1] = torch.zeros(4)

local m1 = t.Mutex()
local m2 = t.Mutex()
local m1id = m1:id()
local m2id = m2:id()
m1:lock()

local pool = t.Threads(
   1,
   function(threadIdx)
   end
)

pool:addjob(
   function()
      local t = require 'threads'
      local m1 = t.Mutex(m1id)
      local m2 = t.Mutex(m2id)
      while true do
         m2:lock()
         dict[1] = torch.randn(4)
         m1:unlock()
         print('W ===> ')
         print(dict[1])
         collectgarbage()
         collectgarbage()
      end
      return __threadid
   end,
   function(id)
   end
)

-- Code executing on master:
local a = 1
while true do
   m1:lock()
   a = dict[1]
   m2:unlock()
   print('R --> ')
   print(a)
end
I have an actor Dispenser. What it does is:
dispense some objects by request
listen for newly arriving ones
Code follows:
class Dispenser extends Actor {
  override def receive: Receive = {
    case Get =>
      context.sender ! getObj()
    case x: SomeType =>
      addObj(x)
  }
}
In real processing it doesn't matter whether 1 ms or even a few seconds pass from the moment a new object is sent until the dispenser starts to dispense it, so there's no code tracking that.
But now I'm writing a test for the dispenser, and I want to be sure that it first receives the new object and only then receives the Get request.
Here's the test code I came up with:
val dispenser = system.actorOf(Props.create(classOf[Dispenser]))
dispenser ! obj
Thread.sleep(100)
val task = dispenser ? Get()
val result = Await.result(task, timeout)
check(result)
It satisfies one important requirement: it doesn't change the original code. But it is
at least 100 ms slow, even on very high-performance boxes
unstable, failing sometimes, because 100 ms or any other constant doesn't provide any guarantees.
And the question is how to make a test that satisfies the requirement and doesn't have the cons above (nor any other obvious ones).
You can take out the Thread.sleep(..) and your test will be fine. Akka guarantees the ordering you need.
With the code
dispenser ! obj
val task = dispenser ? Get()
dispenser will process obj before Get deterministically because:
the same thread puts obj and then Get into the actor's mailbox, so they sit in the correct order there
actors process messages sequentially, one at a time, so the two messages will be processed in the order they were queued
(..assuming there's nothing else going on that's not in your sample code: routers, async processing in getObj or addObj, stashing, ..)
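So the test reduces to the original code with the sleep removed (same names as in the question):
val dispenser = system.actorOf(Props.create(classOf[Dispenser]))
dispenser ! obj
val task = dispenser ? Get()
val result = Await.result(task, timeout)
check(result)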
The Akka FSM module is really handy for testing the underlying state and behavior of an actor, and it does not require changing the actor's implementation specifically for tests.
Using TestFSMRef, one can get the actor's current state and data:
val testActor = TestFSMRef(<actors constructor or Props>)
testActor.stateName shouldBe <state name>
testActor.stateData shouldBe <state data>
http://doc.akka.io/docs/akka/2.4.1/scala/fsm.html