Sleeping process until completed in Go - linux

I am trying to automate a process in Go. I have been able to run the tasks concurrently in goroutines, but their output comes out mixed together.
I was wondering if there is a way to show each task's output in the order in which the tasks complete. So if task A completes before task B, we show A's output before B's, or vice versa.
package main
import (
"fmt"
"log"
"os"
"os/exec"
"sync"
)
var url string
var wg sync.WaitGroup
func nikto() {
cmd := exec.Command("nikto", "-h", url)
cmd.Stdout = os.Stdout
err := cmd.Run()
if err != nil {
log.Fatal(err)
}
wg.Done()
}
func whois() {
cmd := exec.Command("whois", "google.co")
cmd.Stdout = os.Stdout
err := cmd.Run()
if err != nil {
log.Fatal(err)
}
wg.Done()
}
func main() {
fmt.Printf("Please input URL")
fmt.Scanln(&url)
wg.Add(1)
go nikto()
wg.Add(1)
go whois()
wg.Wait()
}

In your process, you pass the os.Stdout file descriptor directly to the commands you invoke to run your child processes. This means the STDOUT pipe of the child processes will be connected directly to your Go program's standard output, and will likely be interleaved if both child processes write simultaneously.
The simplest way to fix this requires you to buffer the output from the STDOUT pipe of the child process in your Go program, so you can intercept the output and control when it is printed.
The Cmd type in the os/exec package provides a method called Output(), which runs the child process and returns the contents of STDOUT as a byte slice. Your code can easily be adapted to use this pattern and process the results, for example:
func whois() {
cmd := exec.Command("whois", "google.co")
out, err := cmd.Output()
if err != nil {
log.Fatal(err)
}
fmt.Println(string(out))
wg.Done()
}
Interleaving of output
If you use functions in the fmt package to print output, there is no guarantee that concurrent calls to fmt.Println will not be interleaved.
To prevent interleaving, you may choose to serialize access to STDOUT, or use a logger which is safe for concurrent use (such as the log package). Here is an example of serializing access to STDOUT in the Go process:
package main
import (
"fmt"
"log"
"os/exec"
)
var url string
func nikto(outChan chan<- []byte) {
cmd := exec.Command("nikto", "-h", url)
bs, err := cmd.Output()
if err != nil {
log.Fatal(err)
}
outChan <- bs
}
func whois(outChan chan<- []byte) {
cmd := exec.Command("whois", "google.com")
bs, err := cmd.Output()
if err != nil {
log.Fatal(err)
}
outChan <- bs
}
func main() {
outChan := make(chan []byte)
fmt.Printf("Please input URL")
fmt.Scanln(&url)
go nikto(outChan)
go whois(outChan)
for i := 0; i < 2; i++ {
bs := <-outChan
fmt.Println(string(bs))
}
}
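If the goal is to print each command's output as soon as that command finishes, a mutex around the write is another way to serialize access to STDOUT. The following is only a minimal sketch under stated assumptions: `task`, `runAll`, and `collect` are hypothetical names, and the two closures in `main` stand in for calls such as `exec.Command(...).Output()`.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
	"sync"
	"time"
)

// task is a hypothetical stand-in for a call such as
// exec.Command(...).Output().
type task func() ([]byte, error)

// runAll starts every task in its own goroutine and writes each task's
// complete output to w as soon as that task finishes. The mutex ensures
// that two results are never written at the same time.
func runAll(w io.Writer, tasks ...task) {
	var (
		mu sync.Mutex
		wg sync.WaitGroup
	)
	for _, t := range tasks {
		wg.Add(1)
		go func(t task) {
			defer wg.Done()
			out, err := t()
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				fmt.Fprintln(w, "error:", err)
				return
			}
			w.Write(out)
		}(t)
	}
	wg.Wait()
}

// collect runs the tasks and returns everything runAll wrote.
func collect(tasks ...task) string {
	var buf bytes.Buffer
	runAll(&buf, tasks...)
	return buf.String()
}

func main() {
	slow := func() ([]byte, error) {
		time.Sleep(100 * time.Millisecond)
		return []byte("slow task done\n"), nil
	}
	fast := func() ([]byte, error) {
		return []byte("fast task done\n"), nil
	}
	// The fast task's output is normally written first,
	// and the two outputs are never interleaved.
	runAll(os.Stdout, slow, fast)
}
```

Unlike the channel version, this one returns errors to the caller's writer instead of calling log.Fatal, so one failing command does not discard the output of the other.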

Related

Persistent Reader() object

In Go, I am trying to create a function that reads and processes the next line of input:
// Read a string of hex from stdin and parse to an array of bytes
func ReadHex() []byte {
r := bufio.NewReader(os.Stdin)
t, _ := r.ReadString('\n')
data, _ := hex.DecodeString(strings.TrimSpace(t))
return data
}
Unfortunately, this only works the first time it is called. It captures the first line but is unable to capture subsequent lines piped via standard input.
I suspect, if the same persistent bufio.Reader() object was used on each subsequent call, it would work but I haven't been able to achieve this without passing it manually on each function call.
Yes, try this:
package main
import (
"bufio"
"encoding/hex"
"fmt"
"log"
"os"
"strings"
)
func ReadFunc() func() []byte {
r := bufio.NewReader(os.Stdin)
return func() []byte {
t, err := r.ReadString('\n')
if err != nil {
log.Fatal(err)
}
data, err := hex.DecodeString(strings.TrimSpace(t))
if err != nil {
log.Fatal(err)
}
return data
}
}
func main() {
r, w, err := os.Pipe()
if err != nil {
log.Fatal(err)
}
os.Stdin = r
w.Write([]byte(`ffff
cafebabe
ff
`))
w.Close()
ReadHex := ReadFunc()
fmt.Println(ReadHex())
fmt.Println(ReadHex())
fmt.Println(ReadHex())
}
Output:
[255 255]
[202 254 186 190]
[255]
Using a struct, try this:
package main
import (
"bufio"
"encoding/hex"
"fmt"
"io"
"log"
"os"
"strings"
)
// InputReader struct
type InputReader struct {
bufio.Reader
}
// New creates an InputReader
func New(rd io.Reader) *InputReader {
return &InputReader{Reader: *bufio.NewReader(rd)}
}
// ReadHex reads one line of hex from the underlying reader and decodes it into a byte slice
func (r *InputReader) ReadHex() []byte {
t, err := r.ReadString('\n')
if err != nil {
log.Fatal(err)
}
data, err := hex.DecodeString(strings.TrimSpace(t))
if err != nil {
log.Fatal(err)
}
return data
}
func main() {
r, w, err := os.Pipe()
if err != nil {
log.Fatal(err)
}
os.Stdin = r
w.Write([]byte(`ffff
cafebabe
ff
`))
w.Close()
rdr := New(os.Stdin)
fmt.Println(rdr.ReadHex())
fmt.Println(rdr.ReadHex())
fmt.Println(rdr.ReadHex())
}
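The original ReadHex fails on later calls because every call builds a new bufio.Reader, and each new reader may read far more than one line from stdin into its own buffer, which is then thrown away. Here is a rough sketch of the same idea built on bufio.Scanner, which also reports end of input as an error value instead of exiting. `HexReader`, `NewHexReader`, `Next`, and `decodeLines` are names made up for this illustration; in a real program the reader would wrap os.Stdin rather than a string.

```go
package main

import (
	"bufio"
	"encoding/hex"
	"fmt"
	"io"
	"strings"
)

// HexReader keeps one bufio.Scanner alive between calls, so input that
// has already been buffered is not lost.
type HexReader struct {
	sc *bufio.Scanner
}

// NewHexReader wraps any io.Reader (os.Stdin in a real program).
func NewHexReader(r io.Reader) *HexReader {
	return &HexReader{sc: bufio.NewScanner(r)}
}

// Next decodes the next line of hex. It returns io.EOF once the input
// is exhausted.
func (h *HexReader) Next() ([]byte, error) {
	if !h.sc.Scan() {
		if err := h.sc.Err(); err != nil {
			return nil, err
		}
		return nil, io.EOF
	}
	return hex.DecodeString(strings.TrimSpace(h.sc.Text()))
}

// decodeLines decodes every line of input until EOF or the first error.
func decodeLines(input string) ([][]byte, error) {
	rdr := NewHexReader(strings.NewReader(input))
	var out [][]byte
	for {
		data, err := rdr.Next()
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return out, err
		}
		out = append(out, data)
	}
}

func main() {
	lines, err := decodeLines("ffff\ncafebabe\nff\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```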

Does it make sense to make expensive syscalls from different goroutines?

If an application does some heavy lifting with multiple file descriptors (e.g., opening, writing data, syncing, closing), what actually happens in the Go runtime? Does it block all goroutines while an expensive syscall (like syscall.Fsync) is in progress? Or is only the calling goroutine blocked while the others keep running?
So does it make sense to write programs with multiple workers that do a lot of context switching between user space and kernel space? Does it make sense to use multithreading patterns for disk I/O?
package main
import (
"log"
"os"
"sync"
)
var data = []byte("some big data")
func worker(filenameChan chan string, wg *sync.WaitGroup) {
defer wg.Done()
for {
filename, ok := <-filenameChan
if !ok {
return
}
// opening a file is a fairly expensive operation
// because a new descriptor has to be allocated
f, err := os.OpenFile(filename, os.O_CREATE|os.O_WRONLY, os.FileMode(0644))
if err != nil {
log.Fatal(err)
continue
}
// write is a cheap operation,
// because it just moves data from user space to kernel space
if _, err := f.Write(data); err != nil {
log.Fatal(err)
continue
}
// syscall.Fsync is an expensive, disk-bound operation
if err := f.Sync(); err != nil {
log.Fatal(err)
continue
}
if err := f.Close(); err != nil {
log.Fatal(err)
}
}
}
func main() {
// launch workers
filenameChan := make(chan string)
wg := &sync.WaitGroup{}
for i := 0; i < 2; i++ {
wg.Add(1)
go worker(filenameChan, wg)
}
// send tasks to workers
filenames := []string{
"1.txt",
"2.txt",
"3.txt",
"4.txt",
"5.txt",
}
for i := range filenames {
filenameChan <- filenames[i]
}
close(filenameChan)
wg.Wait()
}
https://play.golang.org/p/O0omcPBMAJ
If a syscall blocks, the Go runtime will launch a new thread so that the number of threads available to run goroutines remains the same. Only the goroutine that made the call is blocked; the others keep running.
A fuller explanation can be found here: https://morsmachine.dk/go-scheduler
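Since only the calling goroutine waits on the syscall, a small worker pool for disk writes is a reasonable pattern. Below is a hedged sketch of such a pool that reports errors instead of exiting; `writeSynced`, `writeAll`, and `demo` are hypothetical helper names, and the files go into a temporary directory so the example cleans up after itself.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sync"
)

// writeSynced creates the file, writes data, and flushes it to disk.
// Only the goroutine calling Sync waits for the disk.
func writeSynced(name string, data []byte) error {
	f, err := os.OpenFile(name, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()
	if _, err := f.Write(data); err != nil {
		return err
	}
	return f.Sync()
}

// writeAll hands the file names to n workers and returns how many
// writes failed.
func writeAll(names []string, data []byte, n int) int {
	jobs := make(chan string)
	var (
		wg     sync.WaitGroup
		mu     sync.Mutex
		failed int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for name := range jobs {
				if err := writeSynced(name, data); err != nil {
					mu.Lock()
					failed++
					mu.Unlock()
				}
			}
		}()
	}
	for _, name := range names {
		jobs <- name
	}
	close(jobs)
	wg.Wait()
	return failed
}

// demo writes five small files into a fresh temporary directory.
func demo() (int, error) {
	dir, err := os.MkdirTemp("", "syncdemo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	var names []string
	for i := 1; i <= 5; i++ {
		names = append(names, filepath.Join(dir, fmt.Sprintf("%d.txt", i)))
	}
	return writeAll(names, []byte("some big data"), 2), nil
}

func main() {
	failed, err := demo()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Println("failed writes:", failed)
}
```

Whether more workers actually speed things up depends on the disk and the filesystem, so the worker count is worth measuring rather than guessing.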

How to properly wait for an event/process to finish not being the parent?

I am using Go to check whether a process (of which I am not the parent) has terminated, basically something like the pwait command in FreeBSD but written in Go.
Currently I am trying a for loop with kill -0, but I noticed that CPU usage is very high (99%) with this approach. Here is the code:
package main
import (
"fmt"
"os"
"strconv"
"syscall"
"time"
)
func main() {
if len(os.Args) != 2 {
fmt.Printf("usage: %s pid", os.Args[0])
os.Exit(1)
}
pid, err := strconv.ParseInt(os.Args[1], 10, 64)
if err != nil {
panic(err)
}
process, err := os.FindProcess(int(pid))
err = process.Signal(syscall.Signal(0))
for err == nil {
err = process.Signal(syscall.Signal(0))
time.Sleep(500 * time.Millisecond)
}
fmt.Println(err)
}
Any idea how to improve this or implement it properly?
Thanks in advance.
UPDATE
Adding a sleep within the loop, as suggested, helps reduce the load.
From the provided links, it seems possible to attach to the existing PID. I will give PtraceAttach a try, but I don't know whether this may have side effects. Any idea?
As suggested, I was able to use kqueue:
package main
import (
"fmt"
"log"
"os"
"strconv"
"syscall"
)
func main() {
if len(os.Args) != 2 {
fmt.Printf("usage: %s pid", os.Args[0])
os.Exit(1)
}
pid, err := strconv.ParseInt(os.Args[1], 10, 64)
if err != nil {
panic(err)
}
process, _ := os.FindProcess(int(pid))
kq, err := syscall.Kqueue()
if err != nil {
fmt.Println(err)
}
ev1 := syscall.Kevent_t{
Ident: uint64(process.Pid),
Filter: syscall.EVFILT_PROC,
Flags: syscall.EV_ADD,
Fflags: syscall.NOTE_EXIT,
Data: 0,
Udata: nil,
}
for {
events := make([]syscall.Kevent_t, 1)
n, err := syscall.Kevent(kq, []syscall.Kevent_t{ev1}, events, nil)
if err != nil {
log.Println("Error creating kevent")
}
if n > 0 {
break
}
}
fmt.Println("fin")
}
This works fine, but I am wondering how to achieve the same on Linux, since I think kqueue is not available there. Any ideas?
One solution would be to use the netlink proc connector, which is a socket the kernel uses to let userspace know about different process events. The official documentation is somewhat lacking, although there are a couple of good examples in C which are probably better to read.
The main caveat to using the proc connector is the process must be run as root. If running your program as a non-root user is a requirement, you should consider other options, such as periodically polling /proc to watch for changes. Any approach which uses polling, as others have pointed out, is susceptible to a race condition if the process is terminated and another one is started with the same PID in between polls.
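For the non-root fallback, polling /proc can be as simple as checking whether /proc/PID still exists. This is a rough sketch, assuming a Linux-style /proc; `waitGone` and `errTimeout` are names invented for the example, and it is subject to the PID-reuse race described above. A zombie that has not been reaped yet still has a /proc entry, so the function returns only once the process is fully gone.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strconv"
	"time"
)

var errTimeout = errors.New("timed out waiting for path to disappear")

// waitGone polls path every interval and returns nil once it no longer
// exists. It gives up with errTimeout after timeout has elapsed.
func waitGone(path string, interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		_, err := os.Stat(path)
		if errors.Is(err, os.ErrNotExist) {
			return nil
		}
		if err != nil {
			return err
		}
		if time.Now().After(deadline) {
			return errTimeout
		}
		time.Sleep(interval)
	}
}

func main() {
	if len(os.Args) != 2 {
		fmt.Printf("usage: %s pid\n", os.Args[0])
		return
	}
	pid, err := strconv.Atoi(os.Args[1])
	if err != nil {
		fmt.Println("invalid pid:", err)
		return
	}
	// Assumption: the process is visible under /proc/<pid> (Linux).
	err = waitGone("/proc/"+strconv.Itoa(pid), 500*time.Millisecond, time.Hour)
	fmt.Println("done:", err)
}
```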
Anyway, to use the proc connector in Go, we will have to do some translation from C. Specifically, we need to define the proc_event and exit_proc_event structs from cn_proc.h, and the cn_msg and cb_id structs from connector.h.
// CbID corresponds to cb_id in connector.h
type CbID struct {
Idx uint32
Val uint32
}
// CnMsg corresponds to cn_msg in connector.h
type CnMsg struct {
ID CbID
Seq uint32
Ack uint32
Len uint16
Flags uint16
}
// ProcEventHeader corresponds to proc_event in cn_proc.h
type ProcEventHeader struct {
What uint32
CPU uint32
Timestamp uint64
}
// ExitProcEvent corresponds to exit_proc_event in cn_proc.h
type ExitProcEvent struct {
ProcessPid uint32
ProcessTgid uint32
ExitCode uint32
ExitSignal uint32
}
We also need to make a netlink socket and call bind.
sock, err := unix.Socket(unix.AF_NETLINK, unix.SOCK_DGRAM, unix.NETLINK_CONNECTOR)
if err != nil {
fmt.Printf("socket: %v\n", err)
return
}
addr := &unix.SockaddrNetlink{Family: unix.AF_NETLINK, Groups: C.CN_IDX_PROC, Pid: uint32(os.Getpid())}
err = unix.Bind(sock, addr)
if err != nil {
fmt.Printf("bind: %v\n", err)
return
}
Next, we have to send the PROC_CN_MCAST_LISTEN message to the kernel to let it know we want to receive events. We can import this directly from C, where it's defined as an enum, to save some typing, and put it in a function since we will have to call it again with PROC_CN_MCAST_IGNORE when we are done receiving data from the kernel.
// #include <linux/cn_proc.h>
// #include <linux/connector.h>
import "C"
func send(sock int, msg uint32) error {
destAddr := &unix.SockaddrNetlink{Family: unix.AF_NETLINK, Groups: C.CN_IDX_PROC, Pid: 0} // the kernel
cnMsg := CnMsg{}
header := unix.NlMsghdr{
Len: unix.NLMSG_HDRLEN + uint32(binary.Size(cnMsg) + binary.Size(msg)),
Type: uint16(unix.NLMSG_DONE),
Flags: 0,
Seq: 1,
Pid: uint32(unix.Getpid()),
}
cnMsg.ID = CbID{Idx: C.CN_IDX_PROC, Val: C.CN_VAL_PROC}
cnMsg.Len = uint16(binary.Size(msg))
cnMsg.Ack = 0
cnMsg.Seq = 1
buf := bytes.NewBuffer(make([]byte, 0, header.Len))
binary.Write(buf, binary.LittleEndian, header)
binary.Write(buf, binary.LittleEndian, cnMsg)
binary.Write(buf, binary.LittleEndian, msg)
return unix.Sendto(sock, buf.Bytes(), 0, destAddr)
}
After we let the kernel know we're ready to receive events, we can receive them on the socket we've created. Once we receive them, we need to parse them and check for relevant data. We only care about messages that meet the following criteria:
Come from the kernel
Have a header type of NLMSG_DONE
Have a proc_event.what value of PROC_EVENT_EXIT
Match our PID
If they meet these criteria, we can extract the relevant process information into an exit_proc_event struct, which contains the PID of the process.
for {
p := make([]byte, 1024)
nr, from, err := unix.Recvfrom(sock, p, 0)
if sockaddrNl, ok := from.(*unix.SockaddrNetlink); !ok || sockaddrNl.Pid != 0 {
continue
}
if err != nil {
fmt.Printf("Recvfrom: %v\n", err)
continue
}
if nr < unix.NLMSG_HDRLEN {
continue
}
// the sys/unix package doesn't include the ParseNetlinkMessage function
nlmessages, err := syscall.ParseNetlinkMessage(p[:nr])
if err != nil {
fmt.Printf("ParseNetlinkMessage: %v\n", err)
continue
}
for _, m := range nlmessages {
if m.Header.Type == unix.NLMSG_DONE {
buf := bytes.NewBuffer(m.Data)
msg := &CnMsg{}
hdr := &ProcEventHeader{}
binary.Read(buf, binary.LittleEndian, msg)
binary.Read(buf, binary.LittleEndian, hdr)
if hdr.What == C.PROC_EVENT_EXIT {
event := &ExitProcEvent{}
binary.Read(buf, binary.LittleEndian, event)
pid := int(event.ProcessTgid)
fmt.Printf("%d just exited.\n", pid)
}
}
}
}
A full code example is here.
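The receive loop above prints every exit it sees; the "Match our PID" criterion still has to be applied to the decoded event. The decoding and filtering step can be tried without a netlink socket or root, as in the sketch below. It is an illustration only: `exitedPid` and `encodeExit` are hypothetical helpers, the payloads are synthetic, the value 0x80000000 for PROC_EVENT_EXIT is taken from cn_proc.h, and little-endian byte order is assumed as in the code above.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// procEventExit mirrors PROC_EVENT_EXIT from linux/cn_proc.h.
const procEventExit = 0x80000000

// The structs below repeat the definitions from the answer.
type CbID struct{ Idx, Val uint32 }

type CnMsg struct {
	ID         CbID
	Seq, Ack   uint32
	Len, Flags uint16
}

type ProcEventHeader struct {
	What      uint32
	CPU       uint32
	Timestamp uint64
}

type ExitProcEvent struct {
	ProcessPid, ProcessTgid uint32
	ExitCode, ExitSignal    uint32
}

// exitedPid decodes one connector payload. It returns the TGID of the
// exited process, or ok == false if the payload is too short or is not
// an exit event.
func exitedPid(payload []byte) (pid uint32, ok bool) {
	r := bytes.NewReader(payload)
	var (
		msg CnMsg
		hdr ProcEventHeader
		ev  ExitProcEvent
	)
	if binary.Read(r, binary.LittleEndian, &msg) != nil ||
		binary.Read(r, binary.LittleEndian, &hdr) != nil {
		return 0, false
	}
	if hdr.What != procEventExit {
		return 0, false
	}
	if binary.Read(r, binary.LittleEndian, &ev) != nil {
		return 0, false
	}
	return ev.ProcessTgid, true
}

// encodeExit builds a synthetic payload for demonstration purposes;
// real payloads come from the kernel.
func encodeExit(what, tgid uint32) []byte {
	var buf bytes.Buffer
	binary.Write(&buf, binary.LittleEndian, CnMsg{})
	binary.Write(&buf, binary.LittleEndian, ProcEventHeader{What: what})
	binary.Write(&buf, binary.LittleEndian, ExitProcEvent{ProcessPid: tgid, ProcessTgid: tgid})
	return buf.Bytes()
}

func main() {
	target := uint32(1234) // the PID we are waiting for
	payloads := [][]byte{
		encodeExit(procEventExit, 99), // another process exits
		encodeExit(1, 1234),           // a non-exit event for our PID
		encodeExit(procEventExit, 1234),
	}
	for _, p := range payloads {
		if pid, ok := exitedPid(p); ok && pid == target {
			fmt.Printf("%d just exited.\n", pid)
		}
	}
}
```

Only the last payload passes both checks, so the program prints a single line for PID 1234.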

How to read input from console in a non blocking way with Go?

So I have:
import (
"bufio"
"os"
)
//...
var reader = bufio.NewReader(os.Stdin)
str, err := reader.ReadString('\n')
But reader.ReadString('\n') is blocking execution. I would like to read input in a non blocking way. Is it possible to achieve non blocking buffered input from os.Stdin using bufio package or any other std lib package from Go?
In general there isn't a concept of non-blocking IO APIs in Go. You accomplish the same thing by using goroutines.
Here's an example on the Go Playground; stdin is simulated since the Playground doesn't allow reading from it.
package main
import "fmt"
import "time"
func main() {
ch := make(chan string)
go func(ch chan string) {
/* Uncomment this block to actually read from stdin
reader := bufio.NewReader(os.Stdin)
for {
s, err := reader.ReadString('\n')
if err != nil { // Maybe log non io.EOF errors, if you want
close(ch)
return
}
ch <- s
}
*/
// Simulating stdin
ch <- "A line of text"
close(ch)
}(ch)
stdinloop:
for {
select {
case stdin, ok := <-ch:
if !ok {
break stdinloop
} else {
fmt.Println("Read input from stdin:", stdin)
}
case <-time.After(1 * time.Second):
// Do something when there is nothing read from stdin
}
}
fmt.Println("Done, stdin must be closed")
}
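If the main loop should never wait at all, not even for the one-second timeout, a select with a default case polls the channel and returns immediately. A minimal sketch follows; `tryRecv` is a hypothetical helper, and the buffered channel stands in for the channel fed by the stdin goroutine above.

```go
package main

import "fmt"

// tryRecv returns immediately. ok is false when no line is waiting
// (or when the channel has been closed).
func tryRecv(ch <-chan string) (string, bool) {
	select {
	case line, open := <-ch:
		return line, open
	default:
		return "", false
	}
}

func main() {
	// Stand-in for the channel filled by the stdin-reading goroutine.
	ch := make(chan string, 1)

	if _, ok := tryRecv(ch); !ok {
		fmt.Println("nothing to read yet")
	}

	ch <- "A line of text"
	if line, ok := tryRecv(ch); ok {
		fmt.Println("Read input from stdin:", line)
	}
}
```

Polling like this in a tight loop burns CPU, so it fits best inside a loop that already does other periodic work.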

how to use multiple processes with http

How to make use of all CPUs and spawn a http process for each CPU?
Get num of CPUs
numCPU := runtime.NumCPU()
Start http
package main
import (
"fmt"
"net/http"
)
func handler(w http.ResponseWriter, r *http.Request) {
fmt.Fprintf(w, "Hi there, I love %s!", r.URL.Path[1:])
}
func main() {
http.HandleFunc("/", handler)
http.ListenAndServe(":8080", nil)
}
If your goal is just to have your request-processing code run on all CPU cores, net/http already starts a goroutine (a vaguely thread-like thing with a Go-specific implementation) per connection, and Go arranges for NumCPU OS threads to run by default so that goroutines can be spread across all available CPU cores.
The Accept loop runs in a single goroutine, but the actual work of parsing requests and generating responses runs in one per connection.
You can't do that natively; you have to write your own wrapper:
// copied from http://golang.org/src/pkg/net/http/server.go#L1942
type tcpKeepAliveListener struct {
*net.TCPListener
}
func (ln tcpKeepAliveListener) Accept() (c net.Conn, err error) {
tc, err := ln.AcceptTCP()
if err != nil {
return
}
tc.SetKeepAlive(true)
tc.SetKeepAlivePeriod(3 * time.Minute)
return tc, nil
}
func ListenAndServe(addr string, num int) error {
if addr == "" {
addr = ":http"
}
ln, err := net.Listen("tcp", addr)
if err != nil {
return err
}
var wg sync.WaitGroup
for i := 0; i < num; i++ {
wg.Add(1)
go func(i int) {
log.Println("listener number", i)
log.Println(http.Serve(tcpKeepAliveListener{ln.(*net.TCPListener)}, nil))
wg.Done()
}(i)
}
wg.Wait()
return nil
}
func main() {
num := runtime.NumCPU()
runtime.GOMAXPROCS(num) //so the goroutine listeners would try to run on multiple threads
log.Println(ListenAndServe(":9020", num))
}
Or, if you use a recent enough Linux kernel, you can use the patch from http://comments.gmane.org/gmane.comp.lang.go.general/121122 and actually spawn multiple processes.

Resources