If an application does heavy work with multiple file descriptors (e.g., opening, writing data, syncing, closing), what actually happens in the Go runtime? Does it block all goroutines while an expensive syscall (like syscall.Fsync) is in progress, or is only the calling goroutine blocked while the others keep running?
So does it make sense to write programs with multiple workers that do a lot of user space to kernel space context switching? Does it make sense to use multithreading patterns for disk I/O?
package main
import (
"log"
"os"
"sync"
)
var data = []byte("some big data")
func worker(filenamechan chan string, wg *sync.WaitGroup) {
defer wg.Done()
for {
filename, ok := <-filenamechan
if !ok {
return
}
// opening a file is a fairly expensive operation because
// a new descriptor has to be allocated
f, err := os.OpenFile(filename, os.O_CREATE|os.O_WRONLY, os.FileMode(0644))
if err != nil {
log.Fatal(err)
continue
}
// write is a cheap operation,
// because it just moves data from user space to kernel space
if _, err := f.Write(data); err != nil {
log.Fatal(err)
continue
}
// syscall.Fsync is an expensive, disk-bound operation
if err := f.Sync(); err != nil {
log.Fatal(err)
continue
}
if err := f.Close(); err != nil {
log.Fatal(err)
}
}
}
func main() {
// launch workers
filenamechan := make(chan string)
wg := &sync.WaitGroup{}
for i := 0; i < 2; i++ {
wg.Add(1)
go worker(filenamechan, wg)
}
// send tasks to workers
filenames := []string{
"1.txt",
"2.txt",
"3.txt",
"4.txt",
"5.txt",
}
for i := range filenames {
filenamechan <- filenames[i]
}
close(filenamechan)
wg.Wait()
}
https://play.golang.org/p/O0omcPBMAJ
If a syscall blocks, the Go runtime will launch a new thread so that the number of threads available to run goroutines remains the same.
A fuller explanation can be found here: https://morsmachine.dk/go-scheduler
I was trying to build a client-server model to learn some things, and I tried sending (writing) data from the client to the server in a loop, but it did not work well. I think there is a concurrency issue: the client writes faster than the server reads, and the server then reads multiple messages in one go. How can I make sure the server reads only one message written by the client at a time? Here is the code that illustrates the problem.
Here is the server handleConnection Function
func main() {
conn, err := net.Listen("tcp", ":8080")
if err != nil {
// without a listener there is nothing to accept on, so stop here
log.Fatal("Error: ", err)
}
for {
ln, err := conn.Accept()
if err != nil {
log.Println("Error:", err)
continue
}
go handleConnection(ln)
}
}
func handleConnection(conn net.Conn) {
buffer := make([]byte, 4096)
for i := 0; i < 10; i++ {
n, err := conn.Read(buffer)
if err != nil {
fmt.Println(err, i)
}
fmt.Printf("%s\n", buffer[:n])
}
fmt.Println("Done")
conn.Close()
}
Here is the client writing data to server in loop.
func main() {
conn, err := net.Dial("tcp", ":8080")
if err != nil {
log.Println("Error:", err)
os.Exit(1)
}
for i := 0; i < 10; i++ {
_, err = conn.Write([]byte("Rehan"))
if err != nil {
fmt.Println(err, i)
}
}
fmt.Println("Done")
conn.Close()
}
This is the output from the server. [screenshot not preserved]
It isn't a concurrency issue. It's a networking issue.
TCP is a stream protocol, as such, a single read() from a socket doesn't correspond to a single write() from the other side.
Instead, a read returns whatever is in the TCP buffer at the time of the call, regardless of whether it was sent by a single call to write() or a hundred.
If you want to read the data from the socket as separate messages, you need a way of separating them by using a delimiter, counting bytes, or some other method.
I am trying to automate a process in Go. I have been able to run the tasks concurrently, but the output of the tasks gets interleaved.
I was wondering if there is a way to show the output in the order in which the tasks finish. So if task A completes before task B, we show A's output before B's, or vice versa.
package main
import (
"fmt"
"log"
"os"
"os/exec"
"sync"
)
var url string
var wg sync.WaitGroup
func nikto() {
cmd := exec.Command("nikto", "-h", url)
cmd.Stdout = os.Stdout
err := cmd.Run()
if err != nil {
log.Fatal(err)
}
wg.Done()
}
func whois() {
cmd := exec.Command("whois", "google.co")
cmd.Stdout = os.Stdout
err := cmd.Run()
if err != nil {
log.Fatal(err)
}
wg.Done()
}
func main() {
fmt.Printf("Please input URL")
fmt.Scanln(&url)
wg.Add(1)
go nikto()
wg.Add(1)
go whois()
wg.Wait()
}
In your process, you pass the os.Stdout file descriptor directly to the commands you invoke to run your child processes. This means the STDOUT pipe of the child processes will be connected directly to your Go program's standard output, and will likely be interleaved if both child processes write simultaneously.
The simplest way to fix this requires you to buffer the output from the STDOUT pipe of the child process in your Go program, so you can intercept the output and control when it is printed.
The Cmd type in the os/exec package provides a method called Output(), which runs the child process and returns the contents of its STDOUT as a byte slice. Your code can easily be adapted to use this pattern and process the results, for example:
func whois() {
cmd := exec.Command("whois", "google.co")
out, err := cmd.Output()
if err != nil {
log.Fatal(err)
}
// convert to string; printing the []byte directly would show numeric byte values
fmt.Println(string(out))
wg.Done()
}
Interleaving of output
If you use functions in the fmt package to print output, there is no guarantee that concurrent calls to fmt.Println will not be interleaved.
To prevent interleaving, you may choose to serialize access to STDOUT, or use a logger which is safe for concurrent use (such as the log package). Here is an example of serializing access to STDOUT in the Go process:
package main
import (
"fmt"
"log"
"os/exec"
)
var url string
func nikto(outChan chan<- []byte) {
cmd := exec.Command("nikto", "-h", url)
bs, err := cmd.Output()
if err != nil {
log.Fatal(err)
}
outChan <- bs
}
func whois(outChan chan<- []byte) {
cmd := exec.Command("whois", "google.com")
bs, err := cmd.Output()
if err != nil {
log.Fatal(err)
}
outChan <- bs
}
func main() {
outChan := make(chan []byte)
fmt.Printf("Please input URL")
fmt.Scanln(&url)
go nikto(outChan)
go whois(outChan)
for i := 0; i < 2; i++ {
bs := <-outChan
fmt.Println(string(bs))
}
}
I was trying to implement multithreading in Go. I am able to start goroutines, but they do not work as expected. Below is the sample program I have prepared:
func test(s string, fo *os.File) {
var s1 [105]int
count :=0
for x :=1000; x<1101;x++ {
s1[count] = x;
count++
}
//fmt.Println(s1[0])
for i := range s1 {
runtime.Gosched()
sd := s + strconv.Itoa(i)
var fileMutex sync.Mutex
fileMutex.Lock()
fmt.Fprintf(fo,sd)
defer fileMutex.Unlock()
}
}
func main() {
fo,err :=os.Create("D:/Output.txt")
if err != nil {
panic(err)
}
for i := 0; i < 4; i++ {
go test("bye",fo)
}
}
OUTPUT - good0bye0bye0bye0bye0good1bye1bye1bye1bye1good2bye2bye2bye2bye2.... etc.
The above program creates a file and writes "Hello" and "bye" to it.
My problem is that I am trying to create 5 threads and want each thread to process different values. As you can see in the example above, it prints "bye" 4 times.
I want output like the following, using 5 threads:
good0bye0good1bye1good2bye2....etc....
Any idea how I can achieve this?
First, you need to block in your main function until all other goroutines return. The mutexes in your program aren't blocking anything, and since they're re-initialized in each loop iteration, they don't even block within their own goroutine. You can't defer an unlock if you're not returning from the function; you need to explicitly unlock in each iteration of the loop. You aren't using any of the values in your array (though you should use a slice instead), so we can drop that entirely. You also don't need runtime.Gosched in a well-behaved program, and it does nothing here.
An equivalent program that will run to completion would look like:
var wg sync.WaitGroup
var fileMutex sync.Mutex
func test(s string, fo *os.File) {
defer wg.Done()
for i := 0; i < 105; i++ {
fileMutex.Lock()
fmt.Fprintf(fo, "%s%d", s, i)
fileMutex.Unlock()
}
}
func main() {
fo, err := os.Create("D:/output.txt")
if err != nil {
log.Fatal(err)
}
for i := 0; i < 4; i++ {
wg.Add(1)
go test("bye", fo)
}
wg.Wait()
}
Finally though, there's no reason to try and write serial values to a single file from multiple goroutines, and it's less efficient to do so. If you want the values ordered over the entire file, you will need to use a single goroutine anyway.
I am using Go to check whether a process (one that is not a child of my program) has terminated, basically something like the pwait command in FreeBSD but written in Go.
Currently I am trying a for loop with a kill -0, but I notice that the CPU usage is very high (99%) with this approach. Here is the code:
package main
import (
"fmt"
"os"
"strconv"
"syscall"
"time"
)
func main() {
if len(os.Args) != 2 {
fmt.Printf("usage: %s pid", os.Args[0])
os.Exit(1)
}
pid, err := strconv.ParseInt(os.Args[1], 10, 64)
if err != nil {
panic(err)
}
process, err := os.FindProcess(int(pid))
err = process.Signal(syscall.Signal(0))
for err == nil {
err = process.Signal(syscall.Signal(0))
time.Sleep(500 * time.Millisecond)
}
fmt.Println(err)
}
Any idea of how to improve or properly implement this.
Thanks in advance.
UPDATE
Adding a sleep within the loop, as suggested, helps reduce the load.
From the provided links, it seems to be possible to attach to the existing PID. I will give PtraceAttach a try, but I don't know whether this may have side effects. Any idea?
As suggested, I was able to use kqueue:
package main
import (
"fmt"
"log"
"os"
"strconv"
"syscall"
)
func main() {
if len(os.Args) != 2 {
fmt.Printf("usage: %s pid", os.Args[0])
os.Exit(1)
}
pid, err := strconv.ParseInt(os.Args[1], 10, 64)
if err != nil {
panic(err)
}
process, _ := os.FindProcess(int(pid))
kq, err := syscall.Kqueue()
if err != nil {
fmt.Println(err)
}
ev1 := syscall.Kevent_t{
Ident: uint64(process.Pid),
Filter: syscall.EVFILT_PROC,
Flags: syscall.EV_ADD,
Fflags: syscall.NOTE_EXIT,
Data: 0,
Udata: nil,
}
for {
events := make([]syscall.Kevent_t, 1)
n, err := syscall.Kevent(kq, []syscall.Kevent_t{ev1}, events, nil)
if err != nil {
log.Println("Error creating kevent")
}
if n > 0 {
break
}
}
fmt.Println("fin")
}
This works fine, but I am wondering how to achieve the same on Linux, since I think kqueue is not available there. Any ideas?
One solution would be to use the netlink proc connector, which is a socket the kernel uses to let userspace know about different process events. The official documentation is somewhat lacking, although there are a couple of good examples in C which are probably better to read.
The main caveat to using the proc connector is the process must be run as root. If running your program as a non-root user is a requirement, you should consider other options, such as periodically polling /proc to watch for changes. Any approach which uses polling, as others have pointed out, is susceptible to a race condition if the process is terminated and another one is started with the same PID in between polls.
Anyway, to use the proc connector in Go, we will have to do some translation from C. Specifically, we need to define the proc_event and exit_proc_event structs from cn_proc.h, and the cn_msg and cb_id structs from connector.h.
// CbID corresponds to cb_id in connector.h
type CbID struct {
Idx uint32
Val uint32
}
// CnMsg corresponds to cn_msg in connector.h
type CnMsg struct {
ID CbID
Seq uint32
Ack uint32
Len uint16
Flags uint16
}
// ProcEventHeader corresponds to proc_event in cn_proc.h
type ProcEventHeader struct {
What uint32
CPU uint32
Timestamp uint64
}
// ExitProcEvent corresponds to exit_proc_event in cn_proc.h
type ExitProcEvent struct {
ProcessPid uint32
ProcessTgid uint32
ExitCode uint32
ExitSignal uint32
}
We also need to make a netlink socket and call bind.
sock, err := unix.Socket(unix.AF_NETLINK, unix.SOCK_DGRAM, unix.NETLINK_CONNECTOR)
if err != nil {
fmt.Printf("socket: %v\n", err)
return
}
addr := &unix.SockaddrNetlink{Family: unix.AF_NETLINK, Groups: C.CN_IDX_PROC, Pid: uint32(os.Getpid())}
err = unix.Bind(sock, addr)
if err != nil {
fmt.Printf("bind: %v\n", err)
return
}
Next, we have to send the PROC_CN_MCAST_LISTEN message to the kernel to let it know we want to receive events. We can import this directly from C, where it's defined as an enum, to save some typing, and put it in a function since we will have to call it again with PROC_CN_MCAST_IGNORE when we are done receiving data from the kernel.
// #include <linux/cn_proc.h>
// #include <linux/connector.h>
import "C"
func send(sock int, msg uint32) error {
destAddr := &unix.SockaddrNetlink{Family: unix.AF_NETLINK, Groups: C.CN_IDX_PROC, Pid: 0} // the kernel
cnMsg := CnMsg{}
header := unix.NlMsghdr{
Len: unix.NLMSG_HDRLEN + uint32(binary.Size(cnMsg) + binary.Size(msg)),
Type: uint16(unix.NLMSG_DONE),
Flags: 0,
Seq: 1,
Pid: uint32(unix.Getpid()),
}
// fill in the connector header; msg itself is the uint32 payload
cnMsg.ID = CbID{Idx: C.CN_IDX_PROC, Val: C.CN_VAL_PROC}
cnMsg.Len = uint16(binary.Size(msg))
cnMsg.Ack = 0
cnMsg.Seq = 1
buf := bytes.NewBuffer(make([]byte, 0, header.Len))
binary.Write(buf, binary.LittleEndian, header)
binary.Write(buf, binary.LittleEndian, cnMsg)
binary.Write(buf, binary.LittleEndian, msg)
return unix.Sendto(sock, buf.Bytes(), 0, destAddr)
}
After we let the kernel know we're ready to receive events, we can receive them on the socket we've created. Once we receive them, we need to parse them and check for relevant data. We only care about messages that meet the following criteria:
Come from the kernel
Have a header type of NLMSG_DONE
Have a proc_event_header.what value of PROC_EVENT_EXIT
Match our PID
If they meet these criteria, we can extract the relevant process information into an ExitProcEvent struct (exit_proc_event in C), which contains the PID of the process.
for {
p := make([]byte, 1024)
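nr, from, err := unix.Recvfrom(sock, p, 0)
// check the error first; from is not meaningful if the call failed
if err != nil {
fmt.Printf("Recvfrom: %v\n", err)
continue
}
// only accept messages sent by the kernel (netlink PID 0)
if sockaddrNl, ok := from.(*unix.SockaddrNetlink); !ok || sockaddrNl.Pid != 0 {
continue
}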
if nr < unix.NLMSG_HDRLEN {
continue
}
// the sys/unix package doesn't include the ParseNetlinkMessage function
nlmessages, err := syscall.ParseNetlinkMessage(p[:nr])
if err != nil {
fmt.Printf("ParseNetlinkMessage: %v\n", err)
continue
}
for _, m := range nlmessages {
if m.Header.Type == unix.NLMSG_DONE {
buf := bytes.NewBuffer(m.Data)
msg := &CnMsg{}
hdr := &ProcEventHeader{}
binary.Read(buf, binary.LittleEndian, msg)
binary.Read(buf, binary.LittleEndian, hdr)
if hdr.What == C.PROC_EVENT_EXIT {
event := &ExitProcEvent{}
binary.Read(buf, binary.LittleEndian, event)
pid := int(event.ProcessTgid)
fmt.Printf("%d just exited.\n", pid)
}
}
}
}
A full code example is here.
How to make use of all CPUs and spawn a http process for each CPU?
Get num of CPUs
numCPU := runtime.NumCPU()
Start http
package main
import (
"fmt"
"net/http"
)
func handler(w http.ResponseWriter, r *http.Request) {
fmt.Fprintf(w, "Hi there, I love %s!", r.URL.Path[1:])
}
func main() {
http.HandleFunc("/", handler)
http.ListenAndServe(":8080", nil)
}
If your goal is just to have your request-processing code run on all CPU cores, net/http already starts a goroutine (a vaguely thread-like thing with a Go-specific implementation) per connection, and Go arranges for NumCPU OS threads to run by default so that goroutines can be spread across all available CPU cores.
The Accept loop runs in a single goroutine, but the actual work of parsing requests and generating responses runs in one per connection.
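To see this behavior without any extra setup, here is a small sketch using net/http/httptest. The function name concurrentHits and the barrier approach are assumptions made for this illustration. The handler refuses to answer until n requests are in flight at once, so the function can only return if the server really handles connections concurrently.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// concurrentHits starts a test server whose handler blocks until n requests
// are being handled at the same time, then fires n requests at it.
// It returns the number of requests that got a 200 response.
func concurrentHits(n int) int {
	var barrier sync.WaitGroup
	barrier.Add(n)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		barrier.Done()
		barrier.Wait() // released only once all n handlers are running
		fmt.Fprint(w, "ok")
	}))
	defer srv.Close()

	var clients sync.WaitGroup
	var mu sync.Mutex
	ok := 0
	for i := 0; i < n; i++ {
		clients.Add(1)
		go func() {
			defer clients.Done()
			resp, err := http.Get(srv.URL)
			if err != nil {
				return
			}
			defer resp.Body.Close()
			if resp.StatusCode == http.StatusOK {
				mu.Lock()
				ok++
				mu.Unlock()
			}
		}()
	}
	clients.Wait()
	return ok
}

func main() {
	fmt.Println("requests served concurrently:", concurrentHits(4))
}
```

If the server handled one request at a time, the first handler would wait on the barrier forever and the program would hang instead of returning.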
You can't do this natively; you have to write your own wrapper:
// copied from http://golang.org/src/pkg/net/http/server.go#L1942
type tcpKeepAliveListener struct {
*net.TCPListener
}
func (ln tcpKeepAliveListener) Accept() (c net.Conn, err error) {
tc, err := ln.AcceptTCP()
if err != nil {
return
}
tc.SetKeepAlive(true)
tc.SetKeepAlivePeriod(3 * time.Minute)
return tc, nil
}
func ListenAndServe(addr string, num int) error {
if addr == "" {
addr = ":http"
}
ln, err := net.Listen("tcp", addr)
if err != nil {
return err
}
var wg sync.WaitGroup
for i := 0; i < num; i++ {
wg.Add(1)
go func(i int) {
log.Println("listener number", i)
log.Println(http.Serve(tcpKeepAliveListener{ln.(*net.TCPListener)}, nil))
wg.Done()
}(i)
}
wg.Wait()
return nil
}
func main() {
num := runtime.NumCPU()
runtime.GOMAXPROCS(num) //so the goroutine listeners would try to run on multiple threads
log.Println(ListenAndServe(":9020", num))
}
Or if you use a recent enough Linux Kernel you can use the patch from http://comments.gmane.org/gmane.comp.lang.go.general/121122 and actually spawn multiple processes.