Lua execute several tasks at once - multithreading

I have been looking for a way to execute several tasks (at least 2) simultaneously. I found something called coroutines in Lua. Can anybody please explain in detail how to handle two or more tasks? Basically, what I am trying to do is run some event and measure the memory consumption of that process using a Lua script. Any quick solution or ideas would be highly appreciated.

Take a look at io.popen(prog [, mode]).
From the documentation for io.popen, it "starts prog in an other process".
Here is how I would implement this:
-- launch the prog you want to measure. I assume it does not quit immediately.
-- lua will not block here: it does not wait for prog to exit.
local prog = io.popen('prog', 'r') -- 'prog' is a placeholder for your program
-- launch the second process to measure the first.
local measure = io.popen('measuring-process', 'r')
-- read all will block until the measure program exits
local measure_output = measure:read('*a')
-- do what you want with the output

I don't know if this can help:
-- task-like example --
local tasks = { } -- queue
local task = coroutine.wrap -- alias

local function suspend ( )
  return coroutine.yield (true)
end

local function shift (xs)
  return table.remove (xs, 1) -- removes the first element
end

local function push (xs, x)
  return table.insert (xs, x) -- inserts in the last position
end

local function enqueue (f)
  return push (tasks, task (f))
end

-- begin to run the tasks --
local function start ( )
  local curr
  while #tasks > 0 do -- while the tasks queue isn't empty
    curr = shift (tasks)
    if curr ( ) then push (tasks, curr) end -- enqueue the task if still alive
  end
end

-- create a task and begin to run the tasks --
local function spawn (f)
  local curr = task (f) -- wrap f and run it up to its first suspend
  if curr ( ) then push (tasks, curr) end
  return start ( )
end

-- adds task to tasks queue --
enqueue (function ( )
  for i = 1, 3 do
    print ("Ping. <<", i)
    -- os.execute "sleep 1" -- uncomment if you're using *nix
    suspend( )
  end
end)

-- yet another task --
enqueue (function ( )
  for i = 1, 5 do
    print "..."
    suspend( )
  end
end)

-- begins to run the tasks --
spawn (function ( )
  for i = 1, 5 do
    print ("Pong. >>", i)
    -- os.execute "sleep 1" -- uncomment if you're using *nix
    suspend( )
  end
end)
-- end of script --


When will Go scheduler create a new M and P?

I just learned Go's GMP model, and I now understand how goroutines, OS threads, and Go contexts/processors cooperate with each other. But I still don't understand when an M and a P are created.
For example, I have a test code to run some operations on DB and there are two test cases (two batches of goroutines):
func Test_GMP(t *testing.T) {
	for range []struct {
		name string
	}{
		{"first batch"},
		{"second batch"},
	} {
		goroutineSize := 50
		done := make(chan error, goroutineSize)
		for i := 0; i < goroutineSize; i++ {
			go func() {
				// do some database operations...
				// each goroutine should be blocked here for some time...
				// propagate the result
				done <- nil
			}()
		}
		for i := 0; i < goroutineSize; i++ {
			select {
			case err := <-done:
				assert.NoError(t, err)
			case <-time.After(10 * time.Second):
				t.Fatal("timeout waiting for txFunc goroutine")
			}
		}
	}
}
In my understanding, an M is created on demand. For the first batch of goroutines, 8 OS threads (the number of virtual cores on my computer) will be created, and the second batch will just reuse those 8 OS threads without creating new ones. Is that correct?
I would appreciate it if you could provide more materials or blog posts on this topic.
An M is reused only if your goroutines do not block, for example in system calls. In your case you have blocking tasks inside your go func(), so the number of Ms is not limited to 8 (the number of virtual cores on your computer). When goroutines in the first batch block, their Ms are detached from P and wait for the blocking operations to finish, while a new M is created and associated with that P.
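The claim that the number of Ms can exceed the number of cores can be observed with a small sketch. It is not the test from the question: the helper names threadsCreated and pinAndCount are made up for illustration, and runtime.LockOSThread is used as a stand-in for a blocking system call, because it reliably ties one OS thread to each goroutine while that goroutine is blocked.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/pprof"
	"sync"
)

// threadsCreated reports how many OS threads (Ms) the runtime currently
// tracks, using the built-in "threadcreate" profile.
func threadsCreated() int {
	return pprof.Lookup("threadcreate").Count()
}

// pinAndCount starts n goroutines that each hold an OS thread while blocked
// and returns the thread count observed while all of them are blocked.
// (Hypothetical helper; LockOSThread stands in for a blocking syscall.)
func pinAndCount(n int) int {
	var ready, finished sync.WaitGroup
	release := make(chan struct{})
	ready.Add(n)
	finished.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer finished.Done()
			runtime.LockOSThread() // this goroutine now ties up one M
			defer runtime.UnlockOSThread()
			ready.Done()
			<-release // block while still holding the thread
		}()
	}
	ready.Wait() // every goroutine is pinned from here on
	count := threadsCreated()
	close(release)
	finished.Wait()
	return count
}

func main() {
	fmt.Println("Ps (GOMAXPROCS):", runtime.GOMAXPROCS(0))
	fmt.Println("threads before:", threadsCreated())
	fmt.Println("threads while 50 goroutines are pinned:", pinAndCount(50))
}
```

On a machine with 8 cores the last number should be well above 8, since each pinned goroutine needs its own M. Goroutines that merely wait on channels, timers, or network I/O are parked by the runtime instead and do not each require a thread.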
We create a goroutine with go func();
There are two kinds of queues that store Gs: the local queue of each scheduler context P, and the global G queue. A newly created G is placed in the local queue of P, and if that local queue is full, it is placed in the global queue;
A G can only run on an M, and an M must hold a P; a running M and its P are in a 1:1 relationship. M pops a runnable G from the local queue of P. If the local queue is empty, M steals a runnable G from another M-P pair;
The process of M scheduling and executing Gs is a loop;
When M executes a syscall or another blocking operation, M blocks. If there are still Gs waiting to run, the runtime detaches this thread M from P and then creates a new operating system thread (or reuses an idle thread, if one is available) to serve this P;
When the system call on M ends, the G tries to acquire an idle P and is put into that P's local queue to run. If it cannot get a P, the thread M goes to sleep and joins the idle threads, and the G is placed in the global queue.
1. Number of Ps:
It is set by the environment variable $GOMAXPROCS or by calling the runtime function GOMAXPROCS(). Since Go 1.5, GOMAXPROCS defaults to the number of available cores; before that, the default was 1. This means that at most $GOMAXPROCS goroutines are running at the same time at any moment.
2. Number of Ms:
The Go runtime itself imposes a limit: when a Go program starts, the maximum number of Ms is set to 10000 by default. However, the kernel can hardly support that many threads, so this limit can be ignored in practice.
The SetMaxThreads function in runtime/debug sets the maximum number of Ms.
When an M blocks, a new M is created.
The numbers of Ms and Ps have no fixed relationship: when an M blocks, its P creates or switches to another M, so even if the number of Ps is 1, there may be many Ms.
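Both limits described above can be read from inside a program. The following is a minimal sketch; the helper names numPs and maxMs are made up for illustration. Since debug.SetMaxThreads only returns the previous limit when a new one is set, the sketch sets a temporary value and restores the old one right away.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// numPs returns the current number of Ps without changing it:
// an argument below 1 turns runtime.GOMAXPROCS into a pure query.
func numPs() int {
	return runtime.GOMAXPROCS(0)
}

// maxMs returns the current limit on OS threads (Ms).
// It sets a temporary limit only to learn the previous one,
// then restores that previous value.
func maxMs() int {
	prev := debug.SetMaxThreads(10000)
	debug.SetMaxThreads(prev)
	return prev
}

func main() {
	fmt.Println("logical CPUs:", runtime.NumCPU())
	fmt.Println("Ps (GOMAXPROCS):", numPs())
	fmt.Println("M limit:", maxMs())
}
```

With default settings the number of Ps matches the number of logical CPUs, while the M limit is far larger, which is why the two quantities are independent.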
Please refer following for more details,

Testing captured IO from a spawned process

I want to test the return value and the IO output on the following method:
defmodule Speaker do
  def speak do
    receive do
      { :say, msg } ->
        IO.puts(msg)
        speak
      _other ->
        speak # throw away the message
    end
  end
end
In the ExUnit.CaptureIO docs, there is an example test that does this which looks like the following:
test "checking the return value and the IO output" do
  fun = fn ->
    assert Enum.each(["some", "example"], &(IO.puts &1)) == :ok
  end
  assert capture_io(fun) == "some\nexample\n"
end
Given that, I thought I could write the following test that performs a similar action but with a spawned process:
test ".speak with capture io" do
  pid = Kernel.spawn(Speaker, :speak, [])
  fun = fn ->
    assert send(pid, { :say, "Hello" }) == { :say, "Hello" }
  end
  assert capture_io(fun) == "Hello\n"
end
However, I get the following error message telling me there was no output, even though I can see output on the terminal:
1) test .speak with capture io (SpeakerTest)
Assertion with == failed
code: capture_io(fun) == "Hello\n"
lhs: ""
rhs: "Hello\n"
test/speaker_test.exs:30: (test)
So, am I missing something perhaps with regards to testing spawned processes or methods that use the receive macro? How can I change my test to make it pass?
CaptureIO might not be suited for what you're trying to do here. It runs a function and returns the captured output when that function returns. But your function never returns, so it seems this won't work. I came up with the following workaround:
test ".speak with capture io" do
  test_process = self()
  pid = spawn(fn ->
    Process.group_leader(self(), test_process)
    Speaker.speak
  end)
  send(pid, {:say, "Hello"})
  assert_receive {:io_request, _, _, {:put_chars, :unicode, "Hello\n"}}
  # Just to cleanup pid which dies upon not receiving a correct response
  # to the :io_request after a timeout
  Process.exit(pid, :kill)
end
It uses Process.group_leader to set the current process as the receiver of IO messages for the tested process and then asserts that these messages arrive.
I had a similar problem: I had a registered process in my application that would time out every 10 seconds and write to stdio with IO.binwrite. To simulate multiple timeouts, I built on @Pawel-Obrok's answer but changed it to answer the :io_request with an :io_reply. That way the process would not hang, allowing me to send multiple messages.
defp assert_io() do
  send(MyProcess, :timeout)
  receive do
    {:io_request, _, reply_as, {:put_chars, _, msg}} ->
      assert msg == "Some IO message"
      send(Stats, {:io_reply, reply_as, :ok})
    _ ->
      flunk()
  end
end
test "get multiple messages" do
  Process.group_leader(Process.whereis(MyProcess), self())
  assert_io()
  assert_io()
end
If you want to know more about the IO protocol, take a look at the Erlang docs about it.

How to add code to exit this ZeroMQ example by pressing the q key
I would like to terminate the program by pressing q.
main :: IO ()
main =
    runZMQ $ do
        async $ clientTask "A"
        async $ clientTask "B"
        async $ clientTask "C"
        async serverTask
        liftIO $ threadDelay $ 5 * 1000 * 1000
Process-to-process message passing is the very strength of ZeroMQ, so use it:
design a central aKbdMONITOR thread that monitors the keyboard and scans for Q | q
    async $ clientTask "C"
    async $ aKbdMONITOR -- Add central-service async thread
equip this aKbdMONITOR thread with a PUB service to broadcast any such event to every SUB side
    aKbdSCANNER <- socket Pub -- PUB-side adequate ZeroMQ Archetype
    bind aKbdSCANNER "tcp://*:8123" -- yes, can serve even for remote hosts
equip all other threads with the SUB part of the pattern and have each one review every event notification arriving from the aKbdMONITOR thread, so that it can decide locally to terminate itself once the aKbdMONITOR thread announces that an exit was requested
    aKbdSCANNER <- socket Sub -- SUB-side adequate ZeroMQ Archetype
    connect aKbdSCANNER "tcp://ipKBD:8123" -- tcp transport-class remote ipKBD
    -- + do not forget to subscribe
    -- + use poll to scan

perl system call causing hang when using threads

I am a newbie to perl, so please excuse my ignorance. (I'm using windows 7)
I have borrowed echicken's threads example script and wanted to use it as a basis for a script to make a number of system calls, but I have run into an issue which is beyond my understanding. To illustrate the issue I am seeing, I am doing a simple ping command in the example code below.
$nb_process is the number of simultaneously running threads allowed.
$nb_compute is the number of times we want to run the subroutine (i.e. the total number of times we will issue the ping command).
When I set $nb_compute and $nb_process to the same value, it works perfectly.
However, when I reduce $nb_process (to restrict the number of threads running at any one time), it seems to lock up once the number of threads defined in $nb_process have started.
It works fine if I remove the system call (ping command).
I see the same behaviour for other system calls (it's not just ping).
Please could someone help? I have provided the script below.
#!/opt/local/bin/perl -w
use threads;
use strict;
use warnings;

my @a = ();
my @b = ();
sub sleeping_sub ( $ $ $ );

print "Starting main program\n";
my $nb_process = 3;
my $nb_compute = 6;
my $i=0;
my @running = ();
my @Threads;
while (scalar @Threads < $nb_compute) {
    @running = threads->list(threads::running);
    print "LOOP $i\n";
    print " - BEGIN LOOP >> NB running threads = ".(scalar @running)."\n";
    if (scalar @running < $nb_process) {
        my $thread = threads->new( sub { sleeping_sub($i, \@a, \@b) });
        push (@Threads, $thread);
        my $tid = $thread->tid;
        print " - starting thread $tid\n";
    }
    @running = threads->list(threads::running);
    print " - AFTER STARTING >> NB running Threads = ".(scalar @running)."\n";
    foreach my $thr (@Threads) {
        if ($thr->is_running()) {
            my $tid = $thr->tid;
            print " - Thread $tid running\n";
        }
        elsif ($thr->is_joinable()) {
            my $tid = $thr->tid;
            $thr->join;
            print " - Results for thread $tid:\n";
            print " - Thread $tid has been joined\n";
        }
    }
    @running = threads->list(threads::running);
    print " - END LOOP >> NB Threads = ".(scalar @running)."\n";
}
print "\nJOINING pending threads\n";
while (scalar @running != 0) {
    foreach my $thr (@Threads) {
        $thr->join if ($thr->is_joinable());
    }
    @running = threads->list(threads::running);
}
print "NB started threads = ".(scalar @Threads)."\n";
print "End of main program\n";

sub sleeping_sub ( $ $ $ ) {
    my @res2 = `ping`;
    print "\n@res2";
}
The main problem with your program is that you have a busy loop that tests whether a thread can be joined. This is wasteful. Furthermore, you could reduce the number of global variables to make your code easier to understand.
Other eyebrow-raisers:
Never ever use prototypes, unless you know exactly what they mean.
The sleeping_sub does not use any of its arguments.
You use the threads::running list a lot without contemplating whether this is actually correct.
It seems you only want to run N workers at once, but want to launch M workers in total. Here is a fairly elegant way to implement this. The main idea is that we have a queue between threads where threads that just finished can enqueue their thread ID. This thread will then be joined. To limit the number of threads, we use a semaphore:
use threads; use strict; use warnings;
use feature 'say'; # "say" works like "print", but appends newline.
use Thread::Queue;
use Thread::Semaphore;

my @pieces_of_work = 1..6;
my $num_threads = 3;
my $finished_threads = Thread::Queue->new;
my $semaphore = Thread::Semaphore->new($num_threads);

for my $task (@pieces_of_work) {
    $semaphore->down; # wait for permission to launch a thread
    say "Starting a new thread...";
    # create a new thread in scalar context
    threads->new({ scalar => 1 }, sub {
        my $result = worker($task); # run actual task
        $finished_threads->enqueue(threads->tid); # report as joinable "in a second"
        $semaphore->up; # allow another thread to be launched
        return $result;
    });
    # maybe join some threads
    while (defined( my $thr_id = $finished_threads->dequeue_nb )) {
        join_thread($thr_id);
    }
}

# wait for all threads to be finished, by "down"ing the semaphore:
$semaphore->down for 1..$num_threads;
# end the finished thread ID queue:
$finished_threads->enqueue(undef);
# join any threads that are left:
while (defined( my $thr_id = $finished_threads->dequeue )) {
    join_thread($thr_id);
}
With join_thread and worker defined as
sub worker {
    my ($task) = @_;
    sleep rand 2; # sleep random amount of time
    return $task + rand; # return some number
}
sub join_thread {
    my ($tid) = @_;
    my $thr = threads->object($tid);
    my $result = $thr->join;
    say "Thread #$tid returned $result";
}
we could get the output:
Starting a new thread...
Starting a new thread...
Starting a new thread...
Starting a new thread...
Thread #3 returned 3.05652608754778
Starting a new thread...
Thread #1 returned 1.64777186731541
Thread #2 returned 2.18426146087901
Starting a new thread...
Thread #4 returned 4.59414651998983
Thread #6 returned 6.99852684265667
Thread #5 returned 5.2316971836585
(order and return values are not deterministic).
The usage of a queue makes it easy to tell which thread has finished. Semaphores make it easier to protect resources, or limit the amount of parallel somethings.
The main benefit of this pattern is that it uses far less CPU than your busy loop. This also shortens overall execution time.
While this is a very big improvement, we could do better! Spawning threads is expensive: This is basically a fork() without all the copy-on-write optimizations on Unix systems. The whole interpreter is copied, including all variables, all state etc. that you have already created.
Therefore, threads should be used sparingly and be spawned as early as possible. I already introduced you to queues that can pass values between threads. We can extend this so that a few worker threads constantly pull work from an input queue and return results via an output queue. The difficulty now is to have the last thread to exit finish the output queue.
use threads; use strict; use warnings;
use feature 'say';
use Thread::Queue;
use Thread::Semaphore;

# define I/O queues
my $input_q = Thread::Queue->new;
my $output_q = Thread::Queue->new;

# spawn the workers
my $num_threads = 3;
my $all_finished_s = Thread::Semaphore->new(1 - $num_threads); # a negative start value!
my @workers;
for (1 .. $num_threads) {
    push @workers, threads->new( { scalar => 1 }, sub {
        while (defined( my $task = $input_q->dequeue )) {
            my $result = worker($task);
            $output_q->enqueue([$task, $result]);
        }
        # we get here when the input queue is exhausted.
        $all_finished_s->up;
        # end the output queue if we are the last thread (the semaphore is > 0).
        if ($all_finished_s->down_nb) {
            $output_q->enqueue(undef);
        }
    });
}

# fill the input queue with tasks
my @pieces_of_work = 1 .. 6;
$input_q->enqueue($_) for @pieces_of_work;
# finish the input queue
$input_q->enqueue(undef) for 1 .. $num_threads;

# do something with the data
while (defined( my $result = $output_q->dequeue )) {
    my ($task, $answer) = @$result;
    say "Task $task produced $answer";
}

# join the workers:
$_->join for @workers;
With worker defined as before, we get:
Task 1 produced 1.15207098293783
Task 4 produced 4.31247785766295
Task 5 produced 5.96967474718984
Task 6 produced 6.2695013168678
Task 2 produced 2.02545636412421
Task 3 produced 3.22281619053999
(The three threads get joined only after all output is printed, so showing that output would be boring.)
This second solution gets a bit simpler when we detach the threads: the main thread won't exit before all threads have finished their work, because it is listening to the output queue, which is finished by the last thread.

What are Lua coroutines even for? Why doesn't this code work as I expect it?

I'm having trouble understanding this code... I was expecting something similar to threading, where I would get output with random "nooo"s and "yaaaay"s interspersed with each other as both coroutines print asynchronously. Instead I discovered that the main thread seems to block on the first call to coroutine.resume() and thus prevents the next coroutine from being started until the first has yielded.
If this is the intended operation of coroutines, what are they useful for, and how would I achieve the goal I was hoping for? Would I have to implement my own scheduler for these coroutines to operate asynchronously? That seems messy, and I may as well use functions!
co1 = coroutine.create(function ()
    local i = 1
    while i < 200 do
        print("nooo")
        i = i + 1
    end
end)
co2 = coroutine.create(function ()
    local i = 1
    while i < 200 do
        print("yaaaay")
        i = i + 1
    end
end)
coroutine.resume(co1)
coroutine.resume(co2)
Coroutines aren't threads.
Coroutines are like threads that are never actively scheduled. So yes, you are kind of correct that you would have to write your own scheduler to have both coroutines run simultaneously.
However you are missing the bigger picture when it comes to coroutines. Check out wikipedia's list of coroutine uses. Here is one concrete example that might guide you in the right direction.
-- level script
-- a volcano erupts every 2 minutes
function level_with_volcano( interface )
    while true do
        s = play("rumble_sound")
        wait( end_of(s) )
        -- more stuff
    end
end
The above script could be written to run iteratively with a switch statement and some clever state variables, but it is much clearer when written as a coroutine. The above script could also be a thread, but do you really need to dedicate a kernel thread to this simple code? A busy game level could have hundreds of these coroutines running without impacting performance. However, if each of these were a thread, you might get away with 20-30 before performance started to suffer.
A coroutine is meant to allow me to write code that stores state on the stack so that I can stop running it for a while (the wait functions) and start it again where I left off.
Since there have been a number of comments asking how to implement the wait function that would make deft_code's example work, I've decided to write a possible implementation. The general idea is that we have a scheduler with a list of coroutines, and the scheduler decides when to return control to the coroutines after they give up control with their wait calls. This is desirable because it makes asynchronous code readable and easy to reason about.
This is only one possible use of coroutines, they are a more general abstraction tool that can be used for many different purposes (such as writing iterators and generators, writing stateful stream processing objects (for example, multiple stages in a parser), implementing exceptions and continuations, etc.).
First: the scheduler definition:
local function make_scheduler()
    local script_container = {}
    return {
        continue_script = function(frame, script_thread)
            if script_container[frame] == nil then
                script_container[frame] = {}
            end
            table.insert(script_container[frame], script_thread)
        end,
        run = function(frame_number, game_control)
            if script_container[frame_number] ~= nil then
                local i = 1
                --recheck length every time, to allow coroutine to resume on
                --the same frame
                local scripts = script_container[frame_number]
                while i <= #scripts do
                    local success, msg =
                        coroutine.resume(scripts[i], game_control)
                    if not success then error(msg) end
                    i = i + 1
                end
            end
        end
    }
end
Now, initialising the world:
local fps = 60
local frame_number = 1
local scheduler = make_scheduler()
scheduler.continue_script(frame_number, coroutine.create(function(game_control)
    while true do
        --instead of passing game_control as a parameter, we could
        --have equivalently put these values in _ENV.
        s = game_control.play("rumble_sound")
        game_control.wait( game_control.end_of(s) )
        -- more stuff
    end
end))
The (dummy) interface to the game:
local game_control = {
    seconds = function(num)
        return math.floor(num*fps)
    end,
    minutes = function(num)
        return math.floor(num*fps*60)
    end,
    frames = function(num) return num end,
    end_of = function(sound)
        return sound.start+sound.duration-frame_number
    end,
    wait = function(frames_to_wait_for)
        -- re-register the running coroutine for a later frame, then give up control
        scheduler.continue_script(frame_number+math.floor(frames_to_wait_for), coroutine.running())
        coroutine.yield()
    end,
    start_eruption_volcano = function()
        --obviously in a real game, this could
        --affect some datastructure in a non-immediate way
        print(frame_number..": The volcano is erupting, BOOM!")
    end,
    start_camera_shake = function()
        print(frame_number..": SHAKY!")
    end,
    play = function(soundname)
        print(frame_number..": Playing: "..soundname)
        return {name = soundname, start = frame_number, duration = 30}
    end
}
And the game loop:
while true do
    scheduler.run(frame_number, game_control)
    frame_number = frame_number+1
end
co1 = coroutine.create(
for i = 1, 100 do
co2 = coroutine.create(
for i = 1, 100 do
for i = 1, 100 do
