I'm trying to run tasks on specific threads, so that two threads are dedicated to a time-consuming task that puts to a channel while the others process those items.
I'm stuck on how to assign a specific task to a specific thread. I thought I could use something like @spawnat, but that does not seem to work. I wrote the code below to illustrate what I want to achieve:
channel = Channel{Tuple{Int64, Int64}}(1000)
function stream()
# won't actually use this later, but
# easier to see what threads are used
for new_item in channel
println(new_item)
end
end
function cool_function(x::Int64)
sleep(1)
data = (Threads.threadid(), ~(x)+1)
put!(channel, data)
end
function spawner(x::Array{Int64})
for (i, number) in enumerate(x)
if iseven(i)
# Add to queue for thread X
Threads.@spawn cool_function(i)
else
# Add to queue for thread Y
Threads.@spawn cool_function(i)
end
end
end
@async stream()
spawner([1,2,3,4,5])
Any ideas on how to add tasks to a specific thread's queue in spawner? :) Something like "add to the queue of thread 1".
There is currently no Julia-level API for accessing the scheduler to assign a task to a specific thread,
but the ThreadPools.jl package exposes the @tspawnat macro, which allows direct thread assignment.
Probably not the greatest idea, but it's possible to do this:
task = Task(() -> cool_function(i))   # create the task without scheduling it yet
tid = iseven(i) ? 1 : 2               # thread 1 for even i, thread 2 for odd i
# jl_set_task_tid pins the task to a thread; it takes a zero-based index, hence tid - 1
ccall(:jl_set_task_tid, Cvoid, (Any, Cint), task, tid - 1)
schedule(task)
I have 3 classes that represent nearly isolated processes that can be run concurrently (meant to be persistent, like 3 main() loops).
class DataProcess:
...
def runOnce(self):
...
class ComputeProcess:
...
def runOnce(self):
...
class OtherProcess:
...
def runOnce(self):
...
Here's the pattern I'm trying to achieve:
start various streams
start each process
allow each process to publish to any stream
allow each process to listen to any stream (at various points in its loop) and behave accordingly (allow for interruption of its current task or not, etc.)
For example, one 'process' listens for external data. Another process does computation on some of that data. The computation process might be busy for a while, so by the time it comes back and checks the stream, many values may have piled up. I don't want to just use a queue, because I don't want to be forced to process each item in order; I'd rather be able to implement logic like: "if there are one or more things waiting, just run your process one more time; otherwise go do this interruptible task while you wait for something to show up."
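Roughly the kind of loop I mean, as a non-blocking sketch (pending and do_interruptible_work are placeholder names, not real code I have):
import queue
import time

pending = queue.Queue()   # whatever the data process publishes lands here

def do_interruptible_work():
    time.sleep(0.1)       # stand-in for the interruptible background task

def compute_loop(process):
    while True:
        if pending.empty():
            # nothing waiting: do the interruptible task while we wait
            do_interruptible_work()
        else:
            # one or more items piled up: drain them, then run once
            while True:
                try:
                    pending.get_nowait()
                except queue.Empty:
                    break
            process.runOnce()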
That's like a lot, right? So I was thinking of using an actor model until I discovered RxPy. I saw that a stream is like a subject:
from reactivex.subject import BehaviorSubject
newData = BehaviorSubject(None)   # BehaviorSubject requires an initial value
newModel = BehaviorSubject(None)
then I thought I'd start a thread for each of my three high-level processes:
import threading

threads = {
    'data': threading.Thread(target=data),
    'compute': threading.Thread(target=compute),
    'other': threading.Thread(target=other),
}
for thread in threads.values():
    thread.start()
and I thought the functions of those threads should listen to the streams:
def data():
    while True:
        DataProcess().runOnce()  # publishes to stream inside process

def compute():
    def run(value):
        ComputeProcess().runOnce()
    # pass the callback itself; subscribe(run()) would call run immediately
    newData.subscribe(run)
    newModel.subscribe(run)

def other():
    ''' not done '''
    OtherProcess().runOnce()
Ok, so that's what I have so far. Is this pattern going to give me what I'm looking for?
Should I use threading in conjunction with RxPy, or just use RxPy's scheduler facilities to achieve concurrency? If so, how?
I hope this question isn't too vague; I suppose I'm looking for the simplest framework where I can have a small number of computational-memory units (like objects, because they have internal state) that communicate with each other and work in parallel (or concurrently). At the highest level I want to be able to treat these computational-memory units (which I've called processes above) as individuals who mostly work on their own stuff but occasionally broadcast or send a message to a specific other individual, requesting or providing information.
Am I perhaps actually looking for an actor-model framework? Or is this RxPy setup versatile enough to achieve that without extreme complexity?
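To make that concrete, here is a rough sketch of the kind of unit I'm imagining (the Actor class and its methods are names I made up, not from any framework):
import queue
import threading

class Actor:
    """A unit with internal state that works on its own but accepts messages."""
    def __init__(self):
        self.inbox = queue.Queue()
        self.thread = threading.Thread(target=self._loop, daemon=True)

    def start(self):
        self.thread.start()

    def send(self, message):
        self.inbox.put(message)    # other units call this to communicate

    def _loop(self):
        while True:
            try:
                message = self.inbox.get(timeout=0.1)
                self.handle(message)   # react to another unit's message
            except queue.Empty:
                self.work()            # otherwise keep doing our own stuff

    def handle(self, message):
        pass  # subclasses react to messages here

    def work(self):
        pass  # subclasses do their ongoing work here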
Thanks so much!
I have a simple REST service which allows you to create tasks. When a client requests a task, it returns a unique task number and starts executing in a separate thread. The easiest way to implement it:
import json
from concurrent.futures import ThreadPoolExecutor

class Executor:
    def __init__(self, max_workers=1):
        self.executor = ThreadPoolExecutor(max_workers)

    def execute(self, body, task_number):
        # some logic
        pass

def some_rest_method(request):
    body = json.loads(request.body)
    task_id = generate_task_id()
    Executor(max_workers=1).execute(body, task_id)
    return Response({'taskId': task_id})
Is it a good idea to create a ThreadPoolExecutor with one (!) worker each time, if I know that one request means one new task (one new thread)? Perhaps it is worth putting them in a queue somehow? Maybe the best option is to create a plain thread every time?
Is it a good idea to create each time ThreadPoolExecutor...
No. That completely defeats the purpose of a thread pool. The reason for using a thread pool is so that you don't create and destroy a new thread for every request. Creating and destroying threads is expensive. The idea of a thread pool is that it keeps the worker thread(s) alive and re-uses them for each subsequent request.
...with just one thread
There's a good use-case for a single-threaded executor, though it probably does not apply to your problem. The use-case is, you need a sequence of tasks to be performed "in the background," but you also need them to be performed sequentially. A single-thread executor will perform the tasks, one after another, in the same order that they were submitted.
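For illustration, here is a minimal sketch of that use-case (the printed strings are made up):
from concurrent.futures import ThreadPoolExecutor

# one worker: tasks run in the background, one at a time, in submission order
background = ThreadPoolExecutor(max_workers=1)

background.submit(print, "first")   # runs first
background.submit(print, "second")  # runs only after "first" finishes
background.submit(print, "third")   # and so on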
Perhaps it is worth putting them in the queue somehow?
You already are putting them in a queue. Every thread pool has a queue of pending tasks. When you submit a task (i.e., executor.execute(...)) that puts the task into the queue.
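Applied to your snippet, that means creating the pool once, at module level, and submitting each task to it instead of constructing a new Executor per request. A sketch based on your code (generate_task_id, Response, and the worker count are assumptions):
import json
from concurrent.futures import ThreadPoolExecutor

# created once and shared by every request; its internal queue holds pending tasks
POOL = ThreadPoolExecutor(max_workers=4)

def run_task(body, task_number):
    # some logic
    pass

def some_rest_method(request):
    body = json.loads(request.body)
    task_id = generate_task_id()
    POOL.submit(run_task, body, task_id)   # queued, then run by a pooled worker
    return Response({'taskId': task_id})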
what's the best way...in my case?
The bones of a simplistic server look something like this (pseudo-code):
POOL = ThreadPoolExecutor(...with however many threads seem appropriate...)
def service():
socket = create_a_socket_that_listens_on_whatever_port()
while True:
client_connection = socket.accept()
POOL.submit(request_handler, connection=client_connection)
def request_handler(connection):
request = receive_request_from(connection)
reply = generate_reply_based_on(request)
send_reply_to(reply, connection)
connection.close()
def main():
initialize_stuff()
service()
Of course, there are many details that I have left out, and I can't design it for you, especially not in Python: I've written servers like this in other languages, but I'm pretty new to Python.
I have a service running the following loop
while True:
feedback = f1()
if check1(feedback):
break
feedback = f2()
if check2(feedback):
break
feedback = f3()
if check3(feedback):
break
time.sleep(10)
do_cleanup(feedback)
Now I would like to run these feedback checks at different time intervals. One naive way is to move the time.sleep() into the f functions, but that causes blocking. What would be the easiest way to achieve periodic checks with different intervals? All the f functions here are cheap to run.
The event loop in asyncio sounds like the way to go. But due to my inexperience, I don't know where the check and break logic should go for the event loop.
Or is there any other packages/code patterns to do this kind of monitoring logic?
In asyncio you might split the service into three separate tasks, each with its own loop and timing - you can think of them as three threads, except they are all scheduled in the same thread, and multi-task cooperatively by suspending at await.
For this purpose let's start with a utility function that calls a function and checks its result at a regular interval:
import asyncio

async def at_interval(f, check, seconds):
    while True:
        feedback = f()
        if check(feedback):
            return feedback
        await asyncio.sleep(seconds)
The return is equivalent to the break in your original code.
With that in place, the service spawns three such loops and waits for any of them to finish. Whichever completes first carries the "feedback" we're waiting for, and we can dispose of the others.
async def service():
    t1 = asyncio.create_task(at_interval(f1, check1, 3))
    t2 = asyncio.create_task(at_interval(f2, check2, 5))
    t3 = asyncio.create_task(at_interval(f3, check3, 7))
    done, pending = await asyncio.wait(
        [t1, t2, t3], return_when=asyncio.FIRST_COMPLETED)
    for t in pending:
        t.cancel()
    feedback = await next(iter(done))
    do_cleanup(feedback)

asyncio.run(service())
A small difference between this and your code is that here it is possible (though very unlikely) for more than one check to fail before the service picks up on it. For example, if through a stroke of bad luck two of the above tasks end up sharing the absolute time of wakeup to the microsecond, they will be scheduled in the same event loop iteration. Both will return from their corresponding at_interval coroutines, and done will contain more than one feedback. The code handles it by picking a feedback and calling do_cleanup on that one, but it could also loop over all.
If this is not acceptable, you can easily pass each at_interval a callable that cancels all tasks except itself. The cancellation is done in service above for brevity, but it can be done in at_interval as well; one task cancelling the others would ensure that only one feedback can exist.
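A sketch of that variant (cancel_others is an invented helper, not part of asyncio):
import asyncio

async def at_interval(f, check, seconds, cancel_others):
    while True:
        feedback = f()
        if check(feedback):
            cancel_others()   # stop the other loops before returning
            return feedback
        await asyncio.sleep(seconds)

async def service():
    tasks = set()

    def cancel_others():
        # cancel every task except the one currently running
        for t in tasks - {asyncio.current_task()}:
            t.cancel()

    tasks.add(asyncio.create_task(at_interval(f1, check1, 3, cancel_others)))
    tasks.add(asyncio.create_task(at_interval(f2, check2, 5, cancel_others)))
    tasks.add(asyncio.create_task(at_interval(f3, check3, 7, cancel_others)))

    done, _ = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    winner = next(t for t in done if not t.cancelled())
    do_cleanup(winner.result())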
So I have been working with coroutines for a little while and I'm having some trouble making what I want. I want a class I can use in an object-oriented way, creating objects as tasks or processes. I think showing you my existing code would be irrelevant, and it's not what I want either, so I'm just going to show you the functionality I want:
local task1 = Tasker.newTask(function()
while true do
print("task 1")
end
end)
local task2 = Tasker.newTask(function()
while true do
print("task 2")
end
end)
task1:start()
task2:start()
This way I can run multiple tasks at once, and I want to be able to add new tasks whenever I like at runtime. I would also like a way to stop tasks, for example:
task2:stop()
But I don't want the stop command to delete the task instance entirely, only stop the task itself, so that I can later invoke
task2:start()
Then maybe I could use a command to delete it.
task2:delete()
This would be very helpful, and thank you for your help; if you need more info, please ask. Also, I posted this from my phone, so there may be typos and formatting issues.
Lua doesn't natively support operating system threads, i.e. preemptive multitasking.
You can use coroutines to implement your own cooperative "threads", but each thread must relinquish control before another can do anything.
local task1 = Tasker.newTask(function()
while true do
print("task 1")
coroutine.yield()
end
end)
local task2 = Tasker.newTask(function()
while true do
print("task 2")
coroutine.yield()
end
end)
Your Tasker class must take the task function and wrap it in a coroutine, then take care of calling coroutine.resume on them. Operations like stop and start would set flags on the task that tell Tasker whether or not to resume that particular coroutine in the main loop.
You can get real OS threads via C; for example, you might be able to use LuaLanes and its Linda objects.
In my code I have a loop; inside this loop I send several requests to a remote web service. The WS provider said: "The web service can host at most n threads", so I need to cap my code, since I can't send n+1 threads.
If I have to send m threads, I would like the first n threads to be executed immediately, and as soon as one of them completes, a new thread (one of the remaining m-n threads) to start, and so on, until all m threads have executed.
I have thought of a thread pool with the maximum thread count explicitly set to n. Is this enough?
For this I would avoid the use of multiple threads and instead wrap the entire loop up so that it can run on a single thread. However, if you do want to launch multiple threads using a thread pool, then I would use the Semaphore class to enforce the required thread limit; here's how...
A semaphore is like a mean night-club bouncer: it has been given a club capacity and is not allowed to exceed it. Once the club is full, no one else can enter and a queue builds up outside. Then, as one person leaves, another can enter (analogy thanks to J. Albahari).
A Semaphore with a value of one is equivalent to a Mutex or Lock, except that the Semaphore has no owner, so it is thread-agnostic. Any thread can call Release on a Semaphore, whereas with a Mutex/Lock only the thread that obtained it can release it.
Now, in your case we can use a semaphore to limit concurrency and prevent too many threads from executing a particular piece of code at once. In the following example, five threads try to enter a night club that only allows entry to three...
using System;
using System.Threading;

class BadAssClub
{
    static SemaphoreSlim sem = new SemaphoreSlim(3);

    static void Main()
    {
        for (int i = 1; i <= 5; i++)
            new Thread(Enter).Start(i);
    }

    // Enforce only three threads running this method at once.
    static void Enter(object id)
    {
        int i = (int)id; // Thread.Start(object) passes the argument boxed
        try
        {
            Console.WriteLine(i + " wants to enter.");
            sem.Wait();
            Console.WriteLine(i + " is in!");
            Thread.Sleep(1000 * i);
            Console.WriteLine(i + " is leaving...");
        }
        finally
        {
            sem.Release();
        }
    }
}
I hope this helps.
Edit: you can also use the ThreadPool.SetMaxThreads method. This restricts the number of threads allowed to run in the thread pool, but it does so 'globally', for the thread pool itself. That means if your application also uses the thread pool for SQL queries or other library methods, those will be throttled as well. If that is not a concern for you, use SetMaxThreads; if you want to limit a particular method, however, it is safer to use semaphores.