It would be spiffy if Io had this, so that you could control whether code is run, e.g. a combination API-CLI coolstuff.io would run a command line interface, but only if run directly, not when coolstuff.io is imported by other Io code (which may have its own command line interface).
ScriptedMain.io:
#!/usr/bin/env io
ScriptedMain := Object clone
ScriptedMain meaningOfLife := 42
main := method(
    "Main: The meaning of life is #{ScriptedMain meaningOfLife}" interpolate println
)
if (System args size > 0 and System args at(0) containsSeq("ScriptedMain"), main)
test.io:
#!/usr/bin/env io
doFile("ScriptedMain.io") // load ScriptedMain; its own guard keeps its main from running here
main := method(
    "Test: The meaning of life is #{ScriptedMain meaningOfLife}" interpolate println
)
if (System args size > 0 and System args at(0) containsSeq("test"), main)
Example:
$ ./ScriptedMain.io
Main: The meaning of life is 42
$ ./test.io
Test: The meaning of life is 42
Related
I was reading about exec in Go https://gobyexample.com/execing-processes, and tried to do the same using goroutines.
In the following code, I'm trying to make Go run ls, then print a success message in the main thread. However, it only prints the ls output, not the success message.
What's going on?
Thanks.
package main
import "syscall"
import "os"
import "os/exec"
import "fmt"
func main() {
    p := fmt.Println
    done := make(chan bool)
    binary, lookErr := exec.LookPath("ls")
    if lookErr != nil {
        panic(lookErr)
    }
    args := []string{"ls", "-a", "-l", "-h"}
    env := os.Environ()
    go func() {
        execErr := syscall.Exec(binary, args, env)
        if execErr != nil {
            panic(execErr)
        }
        done <- true
    }()
    <-done
    p("Done with exec")
}
Here's the output:
Valeriys-MacBook-Pro:test valeriy$ go run test.go
total 8
drwxr-xr-x 3 valeriy staff 96B Dec 17 15:46 .
drwxr-xr-x 8 valeriy staff 256B Dec 17 00:06 ..
-rw-r--r-- 1 valeriy staff 433B Dec 17 15:38 test.go
syscall.Exec replaces the current process with the one invoked.
If you want to run an external command while keeping the original program running, you need to use exec.Command.
By the way, the link you included does say:
Sometimes we just want to completely replace the current Go process with another (perhaps non-Go) one.
If you really want to use the syscall package, you can use syscall.StartProcess which does a fork/exec as opposed to a plain exec.
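For comparison, here is a rough sketch (my own, not from the linked page) of the exec.Command version, keeping the same ls arguments as above; since the child runs as a separate process rather than replacing this one, the final print is reached:
package main

import (
    "fmt"
    "os"
    "os/exec"
)

func main() {
    // exec.Command starts ls as a child process instead of replacing
    // the current process, so control returns here when it finishes.
    cmd := exec.Command("ls", "-a", "-l", "-h")
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr
    if err := cmd.Run(); err != nil {
        panic(err)
    }
    fmt.Println("Done with exec")
}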
Short version:
Is it possible in Golang to spawn a number of external processes (shell commands) in parallel, such that it does not start one operating system thread per external process ... and still be able to receive its output when it is finished?
Longer version:
In Elixir, if you use ports, you can spawn thousands of external processes without really increasing the number of threads in the Erlang virtual machine.
E.g. the following code snippet, which starts 2500 external sleep processes, is managed by only 20 operating system threads under the Erlang VM:
defmodule Exmultiproc do
  for _ <- 1..2500 do
    cmd = "sleep 3600"
    IO.puts "Starting another process ..."
    Port.open({:spawn, cmd}, [:exit_status, :stderr_to_stdout])
  end
  System.cmd("sleep", ["3600"])
end
(Provided you set ulimit -n to a high number, such as 10000)
On the other hand, the following Go code, which is supposed to do the same thing - start 2500 external sleep processes - also starts 2500 operating system threads. So it apparently starts one operating system thread per (blocking?) system call (so as not to block the whole CPU, or similar, if I understand correctly):
package main
import (
    "fmt"
    "os/exec"
    "sync"
)
func main() {
    wg := new(sync.WaitGroup)
    for i := 0; i < 2500; i++ {
        wg.Add(1)
        go func(i int) {
            fmt.Println("Starting sleep ", i, "...")
            cmd := exec.Command("sleep", "3600")
            _, err := cmd.Output()
            if err != nil {
                panic(err)
            }
            fmt.Println("Finishing sleep ", i, "...")
            wg.Done()
        }(i)
    }
    fmt.Println("Waiting for WaitGroup ...")
    wg.Wait()
    fmt.Println("WaitGroup finished!")
}
Thus, I was wondering if there is a way to write the Go code so that it does something similar to the Elixir code, without opening one operating system thread per external process?
I'm basically looking for a way to manage at least a few thousand external long-running (up to 10 days) processes, in a way that causes as little problems as possible with any virtual or physical limits in the operating system.
(Sorry for any mistakes in the code, as I'm new to Elixir and quite new to Go. I'm eager to learn about any mistakes I'm making.)
EDIT: Clarified about the requirement to run the long-running processes in parallel.
I find that if we do not wait for the processes, the Go runtime does not start 2500 operating system threads, so use cmd.Start() rather than cmd.Output().
But it seems impossible to read a process's stdout without consuming an OS thread via Go's os package. I think this is because the os package does not use non-blocking I/O to read the pipe.
Bottom line: the following program runs well on my Linux machine, although it can block the process on its stdout, as @JimB said in a comment; maybe it works here because the output is small and fits in the system buffers.
package main

import (
    "bytes"
    "fmt"
    "io"
    "os"
    "os/exec"
    "time"
)

// result pairs a started process with its stdout pipe so the
// main goroutine can wait on it and read its output later.
type result struct {
    p *os.Process
    b io.ReadCloser
}

func main() {
    concurrentProcessCount := 50
    wtChan := make(chan *result, concurrentProcessCount)
    for i := 0; i < concurrentProcessCount; i++ {
        go func(i int) {
            fmt.Println("Starting process ", i, "...")
            cmd := exec.Command("bash", "-c", "for i in 1 2 3 4 5; do echo to sleep $i seconds;sleep $i;echo done;done;")
            outPipe, _ := cmd.StdoutPipe()
            err := cmd.Start()
            if err != nil {
                panic(err)
            }
            <-time.Tick(time.Second)
            fmt.Println("Finishing process ", i, "...")
            wtChan <- &result{cmd.Process, outPipe}
        }(i)
    }

    fmt.Println("root:", os.Getpid())

    waitDone := 0
forLoop:
    for {
        select {
        case r := <-wtChan:
            r.p.Wait()
            waitDone++
            output := &bytes.Buffer{}
            io.Copy(output, r.b)
            fmt.Println(waitDone, output.String())
            if waitDone == concurrentProcessCount {
                break forLoop
            }
        }
    }
}
The example is taken from "A Tour of Go": https://tour.golang.org/concurrency/1
Obviously, the program output should have 10 rows: 5 for "hello" and 5 for "world".
But we get:
Linux - 9 rows
MacOS - 10 rows
Linux output (9 rows):
$ go run 1.go
hello
world
hello
world
hello
world
world
hello
hello
MacOS X output (10 rows):
$ go run 1.go
hello
world
world
hello
hello
world
hello
world
hello
world
Can anyone explain - why?
Linux uname -a:
Linux desktop 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt11-1 (2015-05-24) x86_64 GNU/Linux
MacOS X uname -a:
Darwin 14.5.0 Darwin Kernel Version 14.5.0: Thu Jul 9 22:56:16 PDT 2015; root:xnu-2782.40.6~1/RELEASE_X86_64 x86_64
Source code from tour:
package main
import (
    "fmt"
    "time"
)
func say(s string) {
    for i := 0; i < 5; i++ {
        time.Sleep(1000 * time.Millisecond)
        fmt.Println(s)
    }
}
func main() {
    go say("world")
    say("hello")
}
From the specification:
Program execution begins by initializing the main package and then invoking the function main. When that function invocation returns, the program exits. It does not wait for other (non-main) goroutines to complete.
So there is no guarantee that the goroutine printing "world" will have time to complete before the program exits.
I suspect that if you run the program enough times, you will see both the 9-line and 10-line outputs on both platforms. Setting the GOMAXPROCS environment variable to 2 might also help in triggering the problem.
You can fix it by making the main goroutine explicitly wait for completion of the other goroutine. For instance, using a channel:
func say(s string, done chan<- bool) {
    for i := 0; i < 5; i++ {
        time.Sleep(1000 * time.Millisecond)
        fmt.Println(s)
    }
    done <- true
}
func main() {
    c := make(chan bool, 2)
    go say("world", c)
    say("hello", c)
    <-c
    <-c
}
I've added a buffer to the channel so that the say function can send a value without blocking (primarily so the "hello" invocation actually returns). I then receive two values from the channel to make sure both invocations have completed.
For more complex programs, the sync.WaitGroup type can provide a more convenient way to wait on multiple goroutines.
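For instance, here is a minimal sketch of the same example rewritten with sync.WaitGroup (my own sketch of the standard Add/Done/Wait pattern, not code from the tour):
package main

import (
    "fmt"
    "sync"
    "time"
)

func say(s string, wg *sync.WaitGroup) {
    defer wg.Done() // signal completion when this goroutine returns
    for i := 0; i < 5; i++ {
        time.Sleep(1000 * time.Millisecond)
        fmt.Println(s)
    }
}

func main() {
    var wg sync.WaitGroup
    wg.Add(2)
    go say("world", &wg)
    go say("hello", &wg)
    wg.Wait() // block until both goroutines have called Done
}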
TL;DR: Please just go to the last part and tell me how you would solve this problem.
I've begun using Go this morning, coming from Python. I want to call a closed-source executable from Go several times, with a bit of concurrency, with different command line arguments. My resulting code works fine, but I'd like your input on how to improve it. Since I'm at an early learning stage, I'll also explain my workflow.
For the sake of simplicity, assume here that this "external closed-source program" is zenity, a Linux command line tool that can display graphical message boxes from the command line.
Calling an executable file from Go
So, in Go, I would go like this:
package main
import "os/exec"
func main() {
    cmd := exec.Command("zenity", "--info", "--text='Hello World'")
    cmd.Run()
}
This should be working just right. Note that .Run() is a functional equivalent to .Start() followed by .Wait(). This is great, but if I wanted to execute this program just once, the whole programming stuff would not be worth it. So let's just do that multiple times.
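In other words, the call above is roughly equivalent to this sketch (error handling omitted):
cmd := exec.Command("zenity", "--info", "--text='Hello World'")
cmd.Start() // launch zenity without blocking
cmd.Wait()  // block until zenity exits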
Calling an executable multiple times
Now that I had this working, I'd like to call my program multiple times, with custom command line arguments (here just i for the sake of simplicity).
package main
import (
    "os/exec"
    "strconv"
)
func main() {
    NumEl := 8 // Number of times the external program is called
    for i := 0; i < NumEl; i++ {
        cmd := exec.Command("zenity", "--info", "--text='Hello from iteration n." + strconv.Itoa(i) + "'")
        cmd.Run()
    }
}
Ok, we did it! But I still can't see the advantage of Go over Python … This piece of code is actually executed serially. I have a multi-core CPU and I'd like to take advantage of it. So let's add some concurrency with goroutines.
Goroutines, or a way to make my program parallel
a) First attempt: just add "go"s everywhere
Let's rewrite our code to make things easier to call and reuse and add the famous go keyword:
package main
import (
    "os/exec"
    "strconv"
)
func main() {
    NumEl := 8
    for i := 0; i < NumEl; i++ {
        go callProg(i) // <--- There!
    }
}
func callProg(i int) {
    cmd := exec.Command("zenity", "--info", "--text='Hello from iteration n." + strconv.Itoa(i) + "'")
    cmd.Run()
}
Nothing! What is the problem? All the goroutines are launched at once. I don't really know why zenity is not executed, but AFAIK the Go program exits before the external zenity instances can even be initialized. This was confirmed by the use of time.Sleep: waiting a couple of seconds was enough to let the 8 instances of zenity launch themselves. I don't know if this can be considered a bug though.
To make it worse, the real program I'd actually like to call takes a while to execute. If I execute 8 instances of this program in parallel on my 4-core CPU, it's gonna waste some time doing a lot of context switching … I don't know how plain Go goroutines behave, but exec.Command will launch zenity 8 times in 8 different threads. To make it even worse, I want to execute this program more than 100,000 times. Doing all of that at once in goroutines won't be efficient at all. Still, I'd like to leverage my 4-core CPU!
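For reference, the time.Sleep check mentioned above looked roughly like this (a crude diagnostic sketch, not a fix; the sleep duration is arbitrary):
package main

import (
    "os/exec"
    "strconv"
    "time"
)

func main() {
    NumEl := 8
    for i := 0; i < NumEl; i++ {
        go callProg(i)
    }
    // Crude workaround: keep main alive long enough for the
    // 8 zenity instances to start before the program exits.
    time.Sleep(5 * time.Second)
}

func callProg(i int) {
    cmd := exec.Command("zenity", "--info", "--text='Hello from iteration n." + strconv.Itoa(i) + "'")
    cmd.Run()
}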
b) Second attempt: use pools of goroutines
The online resources tend to recommend the use of sync.WaitGroup for this kind of work. The problem with that approach is that you are basically working with batches of goroutines: if I create a WaitGroup of 4 members, the Go program will wait for all 4 external programs to finish before calling a new batch of 4 programs. This is not efficient: CPU is wasted, once again.
Some other resources recommended the use of a buffered channel to do the work:
package main
import (
    "os/exec"
    "strconv"
)
func main() {
    NumEl := 8   // Number of times the external program is called
    NumCore := 4 // Number of available cores
    c := make(chan bool, NumCore - 1)
    for i := 0; i < NumEl; i++ {
        go callProg(i, c)
        c <- true // At the NumCoreth iteration, c is blocking
    }
}
func callProg(i int, c chan bool) {
    defer func() { <-c }()
    cmd := exec.Command("zenity", "--info", "--text='Hello from iteration n." + strconv.Itoa(i) + "'")
    cmd.Run()
}
This seems ugly. Channels were not intended for this purpose: I'm exploiting a side-effect. I love the concept of defer but I hate having to declare a function (even a lambda) to pop a value out of the dummy channel that I created. Oh, and of course, using a dummy channel is, by itself, ugly.
c) Third attempt: die when all the children are dead
Now we are nearly finished. I just have to take into account yet another side effect: the Go program closes before all the zenity pop-ups are closed. This is because when the loop is finished (at the 8th iteration), nothing prevents the program from exiting. This time, sync.WaitGroup will be useful.
package main
import (
    "os/exec"
    "strconv"
    "sync"
)
func main() {
    NumEl := 8   // Number of times the external program is called
    NumCore := 4 // Number of available cores
    c := make(chan bool, NumCore - 1)
    wg := new(sync.WaitGroup)
    wg.Add(NumEl) // Set the number of goroutines to (0 + NumEl)
    for i := 0; i < NumEl; i++ {
        go callProg(i, c, wg)
        c <- true // At the NumCoreth iteration, c is blocking
    }
    wg.Wait() // Wait for all the children to die
    close(c)
}
func callProg(i int, c chan bool, wg *sync.WaitGroup) {
    defer func() {
        <-c
        wg.Done() // Decrease the number of alive goroutines
    }()
    cmd := exec.Command("zenity", "--info", "--text='Hello from iteration n." + strconv.Itoa(i) + "'")
    cmd.Run()
}
Done.
My questions
Do you know any other proper way to limit the number of goroutines executed at once?
I don't mean threads; how Go manages goroutines internally is not relevant. I really mean limiting the number of goroutines launched at once: exec.Command creates a new thread each time it is called, so I should control the number of times it is called.
Does that code look fine to you?
Do you know how to avoid the use of a dummy channel in that case?
I can't convince myself that such dummy channels are the way to go.
I would spawn 4 worker goroutines that read tasks from a common channel. Goroutines that are faster than others (because they are scheduled differently or happen to get simple tasks) will receive more tasks from this channel than others. In addition to that, I would use a sync.WaitGroup to wait for all workers to finish. The remaining part is just the creation of the tasks. You can see an example implementation of that approach here:
package main
import (
    "os/exec"
    "strconv"
    "sync"
)
func main() {
    tasks := make(chan *exec.Cmd, 64)
    // spawn four worker goroutines
    var wg sync.WaitGroup
    for i := 0; i < 4; i++ {
        wg.Add(1)
        go func() {
            for cmd := range tasks {
                cmd.Run()
            }
            wg.Done()
        }()
    }
    // generate some tasks
    for i := 0; i < 10; i++ {
        tasks <- exec.Command("zenity", "--info", "--text='Hello from iteration n."+strconv.Itoa(i)+"'")
    }
    close(tasks)
    // wait for the workers to finish
    wg.Wait()
}
There are probably other possible approaches, but I think this is a very clean solution that is easy to understand.
A simple approach to throttling (execute f() N times but maximum maxConcurrency concurrently), just a scheme:
package main
import (
    "sync"
)
const maxConcurrency = 4 // for example
var throttle = make(chan int, maxConcurrency)
func main() {
    const N = 100 // for example
    var wg sync.WaitGroup
    for i := 0; i < N; i++ {
        throttle <- 1 // whatever number
        wg.Add(1)
        go f(i, &wg, throttle)
    }
    wg.Wait()
}
func f(i int, wg *sync.WaitGroup, throttle chan int) {
    defer wg.Done()
    // whatever processing
    println(i)
    <-throttle
}
Playground
I probably wouldn't call the throttle channel "dummy". IMHO it's an elegant way (not my invention, of course) to limit concurrency.
BTW: Please note that you're ignoring the returned error from cmd.Run().
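A minimal sketch of what checking that error could look like inside the worker (assuming the standard library log package is imported):
if err := cmd.Run(); err != nil {
    // Surface the failure instead of silently dropping it.
    log.Printf("command %v failed: %v", cmd.Args, err)
}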
🧩 Modules
Golang Concurrency Manager
📃 Template
package main
import (
    "fmt"
    "github.com/zenthangplus/goccm"
    "math/rand"
    "runtime"
)
func main() {
    semaphore := goccm.New(runtime.NumCPU())
    // Bounded loop so that WaitAllDone below is reachable.
    for i := 0; i < 10000; i++ {
        semaphore.Wait()
        go func() {
            fmt.Println(rand.Int())
            semaphore.Done()
        }()
    }
    semaphore.WaitAllDone()
}
🎰 Optimal routine quantity
If the operation is CPU bound: runtime.NumCPU()
Otherwise test with: time go run *.go
🔨 Configure
export GOPATH="$(pwd)/gopath"
go mod init *.go
go mod tidy
🧹 CleanUp
find "${GOPATH}" -exec chmod +w {} \;
rm --recursive --force "${GOPATH}"
try this:
https://github.com/korovkin/limiter
limiter := NewConcurrencyLimiter(10)
limiter.Execute(func() {
    zenity(...)
})
limiter.Wait()
You could use the worker pool pattern described in this post.
This is what an implementation would look like ...
package main

import (
    "os/exec"
    "strconv"
    "sync"
)

func main() {
    NumEl := 8
    pool := 4
    intChan := make(chan int)
    var wg sync.WaitGroup
    for i := 0; i < pool; i++ {
        wg.Add(1)
        go callProg(intChan, &wg) // <--- launch the worker routines
    }
    for i := 0; i < NumEl; i++ {
        intChan <- i // <--- push data which will be received by workers
    }
    close(intChan) // <--- will safely close the channel & terminate worker routines
    wg.Wait()      // <--- wait for the workers; otherwise main exits while commands are still running
}

func callProg(intChan chan int, wg *sync.WaitGroup) {
    defer wg.Done()
    for i := range intChan {
        cmd := exec.Command("zenity", "--info", "--text='Hello from iteration n." + strconv.Itoa(i) + "'")
        cmd.Run()
    }
}
There is a function in the wiringPi 'C' library called delay with type
void delay(unsigned int howLong);
This function delays execution of code for howLong milliseconds. I wrote binding code in Haskell to be able to call this function. The Haskell code is as follows:
foreign import ccall "wiringPi.h delay" c_delay :: CUInt -> IO ()
hdelay :: Int -> IO ()
hdelay howlong = c_delay (fromIntegral howlong)
After this, I wrote a simple Haskell program to call this function. The Haskell code is as follows:
-- After importing relevant libraries I did
main = wiringPiSetup
    >> delay 5000
But the delay does not happen; rather, the executable generated by the GHC compiler exits right away.
Could someone tell me what could possibly go wrong here? A small nudge in the right direction would help.
Cheers and Regards.
Please ignore the part in block quote, and see update below - I am preserving the original non-solution because of comments associated with it.
You should mark the import as unsafe since you want the main thread to block while the function is executing (see comment below by @carl). By default, imports are safe, not unsafe. So, changing the function signature to this should make the main thread block:
foreign import ccall unsafe "wiring.h delay" c_delay :: CUInt -> IO ()
Also, if you plan to write multi-threaded code, the GHC docs on the multi-threaded FFI are very useful. This also seems like a good starter.
Update
The behavior seems to be due to signal interrupt handling (if I recall correctly, this was added in GHC 7.4+ to fix some bugs). More details here:
http://hackage.haskell.org/trac/ghc/wiki/Commentary/Rts/Signals
Please note the comment on the above page: Signal handling differs between the threaded version of the runtime and the non-threaded version.
Approach 1 - Handle signal interrupt in FFI code:
Toy code which handles the interrupt during sleep is below. I tested it on Linux 2.6.18 with GHC 7.6.1.
C code:
/** ctest.c **/
#include <unistd.h>
#include <stdio.h>
#include <time.h>
unsigned delay(unsigned sec)
{
    struct timespec req = {0};
    req.tv_sec = sec;
    req.tv_nsec = 0;
    while (nanosleep(&req, &req) == -1) {
        printf("Got interrupt, continuing\n");
        continue;
    }
    return 1;
}
Haskell code:
{-# LANGUAGE ForeignFunctionInterface #-}
-- Filename Test.hs
module Main (main) where
import Foreign.C.Types
foreign import ccall safe "delay" delay :: CUInt -> IO CUInt
main = do
putStrLn "Sleeping"
n <- delay 2000
putStrLn $ "Got return code from sleep: " ++ show n
Now, after compiling with ghc 7.6.1 (command: ghc Test.hs ctest.c), it waits until sleep finishes, and prints a message every time it gets an interrupt signal during sleep:
./Test
Sleeping
Got interrupt, continuing
Got interrupt, continuing
Got interrupt, continuing
Got interrupt, continuing
....
....
Got return code from sleep: 1
Approach 2 - Disable SIGVTALRM before calling FFI code, and re-enable:
I am not sure what the implications of disabling SIGVTALRM are. This is an alternative approach which disables SIGVTALRM during the FFI call, in case you can't alter the FFI code. That way, the FFI code is not interrupted during sleep (assuming it is SIGVTALRM that is causing the interrupt).
{-# LANGUAGE ForeignFunctionInterface #-}
-- Test.hs
module Main (main) where
import Foreign.C.Types
import System.Posix.Signals
foreign import ccall safe "delay" delay :: CUInt -> IO CUInt
main = do
putStrLn "Sleeping"
-- Block SIGVTALRM temporarily to avoid interrupts while sleeping
blockSignals $ addSignal sigVTALRM emptySignalSet
n <- delay 2
putStrLn $ "Got return code from sleep: " ++ show n
-- Unblock SIGVTALRM
unblockSignals $ addSignal sigVTALRM emptySignalSet
return ()