I've noticed that processes started with exec.Command get interrupted even when the interrupt call has been intercepted via signal.Notify. I've done the following example to show the issue:
package main

import (
    "log"
    "os"
    "os/exec"
    "os/signal"
    "syscall"
)

func sleep() {
    log.Println("Sleep start")
    cmd := exec.Command("sleep", "60")
    cmd.Run()
    log.Println("Sleep stop")
}

func main() {
    var doneChannel = make(chan bool)
    go sleep()
    c := make(chan os.Signal, 1)
    signal.Notify(c, os.Interrupt)
    signal.Notify(c, syscall.SIGTERM)
    go func() {
        <-c
        log.Println("Received Ctrl + C")
    }()
    <-doneChannel
}

If Ctrl+C is pressed while this program is running, it's going to print:

2015/10/16 10:05:50 Sleep start
^C2015/10/16 10:05:52 Received Ctrl + C
2015/10/16 10:05:52 Sleep stop
showing that the sleep command gets interrupted. Ctrl+C is successfully caught, though, and the main program doesn't quit; it's just the sleep command that gets affected.
Any idea how to prevent this from happening?
The shell signals the entire foreground process group when you press Ctrl+C. If you signal the parent process directly, the child process won't receive the signal.
To prevent the shell from signaling the children, you need to start the command in its own process group, using the Setpgid and Pgid fields in syscall.SysProcAttr before starting the processes:
cmd := exec.Command("sleep", "60")
cmd.SysProcAttr = &syscall.SysProcAttr{
    Setpgid: true,
}
You can ignore the syscall.SIGINT signal, then it won't be passed to the exec.Command.
func main() {
    var doneChannel = make(chan bool)
    signal.Ignore(syscall.SIGINT)
    go func() {
        log.Println("Sleep start")
        cmd := exec.Command("sleep", "10")
        cmd.Run()
        log.Println("Sleep stop")
        doneChannel <- true
    }()
    <-doneChannel
}
Related
In Go, I can usually use context.WithTimeout() in combination with exec.CommandContext() to have a command automatically killed (with SIGKILL) after the timeout.
But I'm running into a strange issue: if I wrap the command with sh -c AND buffer the command's output by setting cmd.Stdout = &bytes.Buffer{}, the timeout no longer works and the command runs forever.
Why does this happen?
Here is a minimal reproducible example:
package main

import (
    "bytes"
    "context"
    "os/exec"
    "time"
)

func main() {
    ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
    defer cancel()

    cmdArgs := []string{"sh", "-c", "sleep infinity"}
    bufferOutputs := true

    // Uncommenting *either* of the next two lines will make the issue go away:
    // cmdArgs = []string{"sleep", "infinity"}
    // bufferOutputs = false

    cmd := exec.CommandContext(ctx, cmdArgs[0], cmdArgs[1:]...)
    if bufferOutputs {
        cmd.Stdout = &bytes.Buffer{}
    }
    _ = cmd.Run()
}
I've tagged this question with Linux because I've only verified that this happens on Ubuntu 20.04 and I'm not sure whether it would reproduce on other platforms.
My issue was that the child sleep process was not being killed when the context timed out. The sh parent process was being killed, but the child sleep was being left around.
This would normally still allow the cmd.Wait() call to succeed, but the problem is that cmd.Wait() waits for both the process to exit and for outputs to be copied. Because we've assigned cmd.Stdout, we have to wait for the read-end of the sleep process' stdout pipe to close, but it never closes because the process is still running.
In order to kill child processes too, we can instead start the process as its own process group leader by setting the Setpgid bit; that lets us signal its negative PID to kill the whole group, i.e. the process as well as any subprocesses.
Here is a drop-in replacement for exec.CommandContext I came up with that does exactly this:
import (
    "context"
    "os/exec"
    "syscall"
)

type Cmd struct {
    ctx context.Context
    *exec.Cmd
}

// NewCommand is like exec.CommandContext but ensures that subprocesses
// are killed when the context times out, not just the top-level process.
func NewCommand(ctx context.Context, command string, args ...string) *Cmd {
    return &Cmd{ctx, exec.Command(command, args...)}
}

func (c *Cmd) Start() error {
    // Force-enable the setpgid bit so that we can kill child processes
    // when the context times out or is canceled.
    if c.Cmd.SysProcAttr == nil {
        c.Cmd.SysProcAttr = &syscall.SysProcAttr{}
    }
    c.Cmd.SysProcAttr.Setpgid = true
    err := c.Cmd.Start()
    if err != nil {
        return err
    }
    go func() {
        <-c.ctx.Done()
        p := c.Cmd.Process
        if p == nil {
            return
        }
        // Kill by negative PID to kill the process group, which includes
        // the top-level process we spawned as well as any subprocesses
        // it spawned.
        _ = syscall.Kill(-p.Pid, syscall.SIGKILL)
    }()
    return nil
}

func (c *Cmd) Run() error {
    if err := c.Start(); err != nil {
        return err
    }
    return c.Wait()
}
Currently, I am terminating a process using Go's exec.Cmd.Process.Kill() method (on an Ubuntu box).
This seems to terminate the process immediately instead of gracefully. Some of the processes that I am launching also write to files, and the abrupt kill causes the files to become truncated.
I want to terminate the process gracefully with a SIGTERM instead of a SIGKILL, using Go.
Here is a simple example of a process that is started and then terminated using cmd.Process.Kill(). I would like an alternative in Go to the Kill() method that uses SIGTERM instead of SIGKILL. Thanks!
import (
    "log"
    "os/exec"
)

cmd := exec.Command("nc", "example.com", "80")
if err := cmd.Start(); err != nil {
    log.Print(err)
}
go func() {
    cmd.Wait()
}()
// Kill the process - this seems to kill the process ungracefully
cmd.Process.Kill()
You can use the Signal() API. The supported syscalls are listed here.
So basically you might want to use
cmd.Process.Signal(syscall.SIGTERM)
Also please note, per the documentation:
The only signal values guaranteed to be present in the os package on
all systems are os.Interrupt (send the process an interrupt) and
os.Kill (force the process to exit). On Windows, sending os.Interrupt
to a process with os.Process.Signal is not implemented; it will return
an error instead of sending a signal.
cmd.Process.Signal(syscall.SIGTERM)
You may use:
cmd.Process.Signal(os.Interrupt)
Tested example:
package main

import (
    "fmt"
    "log"
    "net"
    "os"
    "os/exec"
    "sync"
    "time"
)

func main() {
    cmd := exec.Command("nc", "-l", "8080")
    cmd.Stderr = os.Stderr
    cmd.Stdout = os.Stdout
    cmd.Stdin = os.Stdin
    err := cmd.Start()
    if err != nil {
        log.Fatal(err)
    }

    var wg sync.WaitGroup
    wg.Add(1)
    go func() {
        err := cmd.Wait()
        if err != nil {
            fmt.Println("cmd.Wait:", err)
        }
        fmt.Println("done")
        wg.Done()
    }()

    fmt.Println("TCP Dial")
    fmt.Println("Pid =", cmd.Process.Pid)
    time.Sleep(200 * time.Millisecond)

    // or comment this and use: nc 127.0.0.1 8080
    w1, err := net.DialTimeout("tcp", "127.0.0.1:8080", 1*time.Second)
    if err != nil {
        log.Fatal("tcp DialTimeout:", err)
    }
    defer w1.Close()

    fmt.Fprintln(w1, "Hi")
    time.Sleep(1 * time.Second)

    // cmd.Process.Kill()
    cmd.Process.Signal(os.Interrupt)
    wg.Wait()
}
Output:
TCP Dial
Pid = 21257
Hi
cmd.Wait: signal: interrupt
done
I recently realized that I don't know how to properly Read and Close in Go concurrently. In my particular case, I need to do that with a serial port, but the problem is more generic.
If we do that without any extra effort to synchronize things, it leads to a race condition. Simple example:
package main

import (
    "fmt"
    "os"
    "time"
)

func main() {
    f, err := os.Open("/dev/ttyUSB0")
    if err != nil {
        panic(err)
    }

    // Start a goroutine which keeps reading from a serial port
    go reader(f)

    time.Sleep(1000 * time.Millisecond)
    fmt.Println("closing")
    f.Close()
    time.Sleep(1000 * time.Millisecond)
}

func reader(f *os.File) {
    b := make([]byte, 100)
    for {
        f.Read(b)
    }
}
If we save the above as main.go, and run go run --race main.go, the output will look as follows:
closing
==================
WARNING: DATA RACE
Write at 0x00c4200143c0 by main goroutine:
os.(*file).close()
/usr/local/go/src/os/file_unix.go:143 +0x124
os.(*File).Close()
/usr/local/go/src/os/file_unix.go:132 +0x55
main.main()
/home/dimon/mydata/projects/go/src/dmitryfrank.com/testfiles/main.go:20 +0x13f
Previous read at 0x00c4200143c0 by goroutine 6:
os.(*File).read()
/usr/local/go/src/os/file_unix.go:228 +0x50
os.(*File).Read()
/usr/local/go/src/os/file.go:101 +0x6f
main.reader()
/home/dimon/mydata/projects/go/src/dmitryfrank.com/testfiles/main.go:27 +0x8b
Goroutine 6 (running) created at:
main.main()
/home/dimon/mydata/projects/go/src/dmitryfrank.com/testfiles/main.go:16 +0x81
==================
Found 1 data race(s)
exit status 66
Ok, but how do we handle this properly? Of course, we can't just lock some mutex before calling f.Read(), because the mutex would end up locked basically all the time. To make it work properly, we'd need some sort of cooperation between reading and locking, like condition variables provide: the mutex gets unlocked before putting the goroutine to wait, and is locked back when the goroutine wakes up.
I would implement something like this manually, but then I need some way to select things while reading. Like this: (pseudocode)
select {
case b := <-f.NextByte():
// process the byte somehow
default:
}
I examined docs of the packages os and sync, and so far I don't see any way to do that.
I believe you need two signals:
main -> reader, to tell it to stop reading
reader -> main, to tell that the reader has terminated
Of course, you can pick whichever Go signaling primitive (channel, WaitGroup, context, etc.) you prefer.
In the example below I use a WaitGroup and a context. The reason is that you can spin up multiple readers and only need to cancel the context to tell all the reader goroutines to stop. I started multiple goroutines just as an example to show that you can coordinate several of them this way.
package main

import (
    "context"
    "fmt"
    "os"
    "sync"
    "time"
)

func main() {
    ctx, cancelFn := context.WithCancel(context.Background())
    f, err := os.Open("/dev/ttyUSB0")
    if err != nil {
        panic(err)
    }

    var wg sync.WaitGroup
    for i := 0; i < 3; i++ {
        wg.Add(1)
        // Start a goroutine which keeps reading from a serial port
        go func(i int) {
            defer wg.Done()
            reader(ctx, f)
            fmt.Printf("reader %d closed\n", i)
        }(i)
    }

    time.Sleep(1000 * time.Millisecond)
    fmt.Println("closing")
    cancelFn() // signal all readers to stop
    wg.Wait()  // wait until all readers have finished
    f.Close()
    fmt.Println("file closed")
    time.Sleep(1000 * time.Millisecond)
}

func reader(ctx context.Context, f *os.File) {
    b := make([]byte, 100)
    for {
        select {
        case <-ctx.Done():
            return
        default:
            f.Read(b)
        }
    }
}
I have an issue when trying to recover a process in Go. My Go app launches a bunch of processes, and when it crashes those processes are left running; when I rerun my app I want to recover them. On Windows everything works as expected: I can Wait() on the process, Kill() it, etc. But on Linux it just goes through my Wait() without any error.
Here is the code
func (proc *process) Recover() {
    pr, err := os.FindProcess(proc.Cmd.Process.Pid)
    if err != nil {
        return
    }
    log.Info("Recovering " + proc.Name + proc.Service.Version)

    Processes.Lock()
    Processes.Map[proc.Name] = proc
    Processes.Unlock()

    proc.Cmd.Process = pr
    if proc.Service.Reload > 0 {
        proc.End = make(chan bool)
        go proc.KillRoutine()
    }

    proc.Cmd.Wait()
    if proc.Status != "killed" {
        proc.Status = "finished"
    }
    proc.Time = time.Now()
    channelProcess <- proc

    // confirmation that process was killed
    if proc.End != nil {
        proc.End <- true
    }
}
process is my own struct for handling processes; the important part is Cmd, which comes from the package "os/exec". I have also tried calling pr.Wait() directly, with the same issue.
You're not handling the error returned by Wait. Try:
ps, err := proc.Cmd.Wait()
if err != nil {
    /* handle it */
}
Also the documentation says:
Wait waits for the Process to exit, and then returns a ProcessState
describing its status and an error, if any. Wait releases any
resources associated with the Process. On most operating systems, the
Process must be a child of the current process or an error will be
returned.
In your case, since you're "recovering", your process is not the parent of the process you found using os.FindProcess.
So why does it work on Windows? I suspect it is because on Windows this boils down to WaitForSingleObject, which doesn't have that requirement.
In Java I can make threads run for long periods of time, and I don't need to stay within the function that started the thread.
Goroutines, Go's answer to threads, seem to stop running after I return from the function that started the routine.
How can I make these routines keep running, and still return from the calling function?
Thanks
Goroutines do continue running after the function that invokes them exits: Playground
package main

import (
    "fmt"
    "time"
)

func countToTen() chan bool {
    done := make(chan bool)
    go func() {
        for i := 0; i < 10; i++ {
            time.Sleep(1 * time.Second)
            fmt.Println(i)
        }
        done <- true
    }()
    return done
}

func main() {
    done := countToTen()
    fmt.Println("countToTen() exited")

    // reading from the 'done' channel will block the main thread
    // until there is something to read, which won't happen until
    // countToTen()'s goroutine is finished
    <-done
}
Note that we need to block the main thread until countToTen()'s goroutine completes. If we don't do this, the main thread will exit and all other goroutines will be stopped even if they haven't completed their task yet.
You can.
If you want a goroutine to run in the background forever, you need some kind of infinite loop with a graceful stopping mechanism in place, usually via a channel. Invoke the goroutine from some other function, so that even after that function terminates, your goroutine will still be running.
For example:
// Goroutine which will run indefinitely,
// unless you send a signal on the quit channel.
func goroutine(quit chan bool) {
    for {
        select {
        case <-quit:
            fmt.Println("quit")
            return
        default:
            fmt.Println("Do your thing")
        }
    }
}

// The goroutine will still be running
// after you return from this function.
func invoker() {
    q := make(chan bool)
    go goroutine(q)
}
Here, you can call invoker when you want to start the goroutine. Even after invoker returns, your goroutine will still be running in the background.
The only exception to this is that when the main function returns, all goroutines in the application are terminated.