Unable to print lines in a loop conditionally - python-3.x

need help on this script where I try to get output related to that command. For example, in the below code
"info related to process and the output should be ps -ef command output and should continue to the next command and print statement likewise"
But i get the lines saying
info related to process and all the commands are being displayed at once.
#!/usr/bin/env python3.7
import os
state = ['process' , 'http status' , 'date info' , 'system']
def comm(com):
for i in state:
for j in com:
print (f"info related to {i}")
os.system(j)
cmd = ['ps -ef | head -2' , 'systemctl status httpd' , 'date' , 'uptime']
comm(cmd)
OUTPUT:
info related to process
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 10:13 ? 00:00:19 /usr/lib/systemd/systemd -
-switched-root --system --deserialize 22
info related to process
● httpd.service - The Apache HTTP Server
Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor
preset: disabled)
Active: active (running) since Wed 2019-03-27 18:27:50 IST; 1 day 2h ago
Docs: man:httpd(8)
man:apachectl(8)
Process: 8585 ExecReload=/usr/sbin/httpd $OPTIONS -k graceful (code=exited,
status=0/SUCCESS)
Main PID: 1367 (httpd)
Status: "Total requests: 0; Current requests/sec: 0; Current traffic: 0
B/sec"
Tasks: 6
CGroup: /system.slice/httpd.service
├─1367 /usr/sbin/httpd -DFOREGROUND
├─8597 /usr/sbin/httpd -DFOREGROUND
├─8598 /usr/sbin/httpd -DFOREGROUND
├─8599 /usr/sbin/httpd -DFOREGROUND
├─8600 /usr/sbin/httpd -DFOREGROUND
└─8601 /usr/sbin/httpd -DFOREGROUND
info related to process
Thu Mar 28 21:03:57 IST 2019
info related to process
21:03:57 up 10:50, 4 users, load average: 0.35, 0.09, 0.14

You have two loops, one being nested in the other. That means anything the inner loop does will be executed in every iteration of the outer loop. That's just how loops work, but not what (I presume) you want to do here.
You have commands to be executed by the os module and several state names that are associated with them. From a data-first point of view we could structure them in a dictionary:
commands = {
'process': 'ps -ef',
'http status': 'systemctl status httpd',
'date info': 'date',
'sytem': 'uptime',
}
Now when we iterate over this dictionary, in each iteration we will have both the state name and the command to be run as loop variables. The loops become a single for loop and we end up with:
def comm(commands):
for name, command in commands.items():
print (f"info related to {name}")
os.system(command)

Related

Run command in golang and detach it from process

Problem:
I'm writing program in golang on linux that needs to execute long running process so that:
I redirect stdout of running process to file.
I control the user of process.
Process doesn't die when my program exits.
The process doesn't become a zombie when it crashes.
I get PID of running process.
I'm running my program with root permissions.
Attempted solution:
func Run(pathToBin string, args []string, uid uint32, stdLogFile *os.File) (int, error) {
cmd := exec.Command(pathToBin, args...)
cmd.SysProcAttr = &syscall.SysProcAttr{
Credential: &syscall.Credential{
Uid: uid,
},
}
cmd.Stdout = stdLogFile
if err := cmd.Start(); err != nil {
return -1, err
}
go func() {
cmd.Wait() //Wait is necessary so cmd doesn't become a zombie
}()
return cmd.Process.Pid, nil
}
This solution seems to satisfy almost all of my requirements except that when I send SIGTERM/SIGKILL to my program the underlying process crashes. In fact I want my background process to be as separate as possible: it has different parent pid, group pid etc. from my program. I want to run it as daemon.
Other solutions on stackoverflow suggested to use cmd.Process.Release() for similar use cases, but it doesn't seem to work.
Solutions which are not applicable in my case:
I have no control over code of process I'm running. My solution has to work for any process.
I can't use external commands to run it, just pure go. So using systemd or something similar is not applicable.
I can in fact use library that is easily importable using import from github etc.
TLDR;
Just use https://github.com/hashicorp/go-reap
There is a great Russian expression which reads "don't try to give birth to a bicycle" and it means don't reinvent the wheel and keep it simple. I think it applies here. If I were you, I'd reconsider using one of:
https://github.com/krallin/tini
https://busybox.net/
https://software.clapper.org/daemonize/
https://wiki.gentoo.org/wiki/OpenRC
https://www.freedesktop.org/wiki/Software/systemd/
This issue has already been solved ;)
Your question is imprecise or you are asking for non-standard features.
In fact I want my background process to be as separate as possible: it has different parent pid, group pid etc. from my program. I want to run it as daemon.
That is not how process inheritance works. You can not have process A start Process B and somehow change the parent of B to C. To the best of my knowledge this is not possible in Linux.
In other words, if process A (pid 55) starts process B (100), then B must have parent pid 55.
The only way to avoid that is have something else start the B process such as atd, crond, or something else - which is not what you are asking for.
If parent 55 dies, then PID 1 will be the parent of 100, not some arbitrary process.
Your statement "it has different parent pid" does not makes sense.
I want to run it as daemon.
That's excellent. However, in a GNU / Linux system, all daemons have a parent pid and those parents have a parent pid going all the way up to pid 1, strictly according to the parent -> child rule.
when I send SIGTERM/SIGKILL to my program the underlying process crashes.
I can not reproduce that behavior. See case8 and case7 from the proof-of-concept repo.
make case8
export NOSIGN=1; make build case7
unset NOSIGN; make build case7
$ make case8
{ sleep 6 && killall -s KILL zignal; } &
./bin/ctrl-c &
sleep 2; killall -s TERM ctrl-c
kill with:
{ pidof ctrl-c; pidof signal ; } | xargs -r -t kill -9
main() 2476074
bashed 2476083 (2476081)
bashed 2476084 (2476081)
bashed 2476085 (2476081)
zignal 2476088 (2476090)
go main() got 23 urgent I/O condition
go main() got 23 urgent I/O condition
zignal 2476098 (2476097)
go main() got 23 urgent I/O condition
zignal 2476108 (2476099)
main() wait...
p 2476088
p 2476098
p 2476108
p 2476088
go main() got 15 terminated
sleep 1; killall -s TERM ctrl-c
p 2476098
p 2476108
p 2476088
go main() got 15 terminated
sleep 1; killall -s TERM ctrl-c
p 2476098
p 2476108
p 2476088
Bash c 2476085 EXITs ELAPSED 4
go main() got 17 child exited
go main() got 23 urgent I/O condition
main() children done: 1 %!s(<nil>)
main() wait...
go main() got 15 terminated
go main() got 23 urgent I/O condition
sleep 1; killall -s KILL ctrl-c
p 2476098
p 2476108
p 2476088
balmora: ~/src/my/go/doodles/sub-process [main]
$ p 2476098
p 2476108
Bash _ 2476083 EXITs ELAPSED 6
Bash q 2476084 EXITs ELAPSED 8
The bash processes keep running after the parent is killed.
killall -s KILL ctrl-c;
All 3 "zignal" sub-processes are running until killed by
killall -s KILL zignal;
In both cases the sub-processes continue to run despite main process being signaled with TERM, HUP, INT. This behavior is different in a shell environment because of convenience reasons. See the related questions about signals. This particular answer illustrates a key difference for SIGINT. Note that SIGSTOP and SIGKILL cannot be caught by an application.
It was necessary to clarify the above before proceeding with the other parts of the question.
So far you have already solved the following:
redirect stdout of sub-process to a file
set owner UID of sub-process
sub-process survives death of parent (my program exits)
the PID of sub-process can be seen by the main program
The next one depends on whether the children are "attached" to a shell or not
sub-process survives the parent being killed
The last one is hard to reproduce, but I have heard about this problem in the docker world, so the rest of this answer is focused on addressing this issue.
sub-process survives if the parent crashes and does not become a zombie
As you have noted, the Cmd.Wait() is necessary to avoid creating zombies. After some experimentation I was able to consistency produce zombies in a docker environment using an intentionally simple replacement for /bin/sh. This "shell" implemented in go will only run a single command and not much else in terms of reaping children. You can study the code over at github.
The zombie solution
the simple wrapper which causes zombies
package main
func main() {
Sh()
}
The reaper wrapper
package main
import (
"fmt"
"sync"
"github.com/fatih/color"
"github.com/hashicorp/go-reap"
)
func main() {
if reap.IsSupported() {
done := make(chan struct{})
var reapLock sync.RWMutex
pids := make(reap.PidCh, 1)
errors := make(reap.ErrorCh, 1)
go reap.ReapChildren(pids, errors, done, &reapLock)
go report(pids, errors, done)
Sh()
close(done)
} else {
fmt.Println("Sorry, go-reap isn't supported on your platform.")
}
}
func report(pids reap.PidCh, errors reap.ErrorCh, done chan struct{}) {
sprintf := color.New(color.FgWhite, color.Bold).SprintfFunc()
for ;; {
select {
case pid := <-pids:
println(sprintf("raeper pid %d", pid))
case err := <-errors:
println(sprintf("raeper er %s", err))
case <-done:
return
}
}
}
The init / sh (pid 1) process which runs other commands
package main
import (
"os"
"os/exec"
"strings"
"time"
"github.com/google/shlex"
"github.com/tox2ik/go-poc-reaper/fn"
)
func Sh() {
args := os.Args[1:]
script := args[0:0]
if len(args) >= 1 {
if args[0] == "-c" {
script = args[1:]
}
}
if len(script) == 0 {
fn.CyanBold("cmd: expecting sh -c 'foobar'")
os.Exit(111)
}
var cmd *exec.Cmd
parts, _ := shlex.Split(strings.Join(script, " "))
if len(parts) >= 2 {
cmd = fn.Merge(exec.Command(parts[0], parts[1:]...), nil)
}
if len(parts) == 1 {
cmd = fn.Merge(exec.Command(parts[0]), nil)
}
if fn.IfEnv("HANG") {
fn.CyanBold("cmd: %v\n start", parts)
ex := cmd.Start()
if ex != nil {
fn.CyanBold("cmd %v err: %s", parts, ex)
}
go func() {
time.Sleep(time.Millisecond * 100)
errw := cmd.Wait()
if errw != nil {
fn.CyanBold("cmd %v err: %s", parts, errw)
} else {
fn.CyanBold("cmd %v all done.", parts)
}
}()
fn.CyanBold("cmd: %v\n dispatched, hanging forever (i.e. to keep docker running)", parts)
for {
time.Sleep(time.Millisecond * time.Duration(fn.EnvInt("HANG", 2888)))
fn.SystemCyan("/bin/ps", "-e", "-o", "stat,comm,user,etime,pid,ppid")
}
} else {
if fn.IfEnv("NOWAIT") {
ex := cmd.Start()
if ex != nil {
fn.CyanBold("cmd %v start err: %s", parts, ex)
}
} else {
ex := cmd.Run()
if ex != nil {
fn.CyanBold("cmd %v run err: %s", parts, ex)
}
}
fn.CyanBold("cmd %v\n dispatched, exit docker.", parts)
}
}
The Dockerfile
FROM scratch
# for sh.go
ENV HANG ""
# for sub-process.go
ENV ABORT ""
ENV CRASH ""
ENV KILL ""
# for ctrl-c.go, signal.go
ENV NOSIGN ""
COPY bin/sh /bin/sh ## <---- wrapped or simple /bin/sh or "init"
COPY bin/sub-process /bin/sub-process
COPY bin/zleep /bin/zleep
COPY bin/fork-if /bin/fork-if
COPY --from=busybox:latest /bin/find /bin/find
COPY --from=busybox:latest /bin/ls /bin/ls
COPY --from=busybox:latest /bin/ps /bin/ps
COPY --from=busybox:latest /bin/killall /bin/killall
Remaining code / setup can be seen here:
https://github.com/tox2ik/go-poc-reaper
Case 5 (simple /bin/sh)
The gist of it is we start two sub-processes from go, using the "parent" sub-process binary. The first child is zleep and the second fork-if. The second one starts a "daemon" that runs a forever-loop in addition to a few short-lived threads. After a while, we kill the sub-procss parent, forcing sh to take over the parenting for these children.
Since this simple implementation of sh does not know how to deal with abandoned children, the children become zombies.
This is standard behavior. To avoid this, init systems are usually responsible for cleaning up any such children.
Check out this repo and run the cases:
$ make prep build
$ make prep build2
The first one will use the simple /bin/sh in the docker container, and the socond one will use the same code wrapped in a reaper.
With zombies:
$ make prep build case5
(…)
main() Daemon away! 16 (/bin/zleep)
main() Daemon away! 22 (/bin/fork-if)
(…)
main() CRASH imminent
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x49e45c]
goroutine 1 [running]:
main.main()
/home/jaroslav/src/my/go/doodles/sub-process/sub-process.go:137 +0xfc
cmd [/bin/sub-process /log/case5 3 /bin/zleep 111 2 -- /dev/stderr 3 /bin/fork-if --] err: exit status 2
Child '1' done
thread done
STAT COMMAND USER ELAPSED PID PPID
R sh 0 0:02 1 0
S zleep 3 0:02 16 1
Z fork-if 3 0:02 22 1
R fork-child-A 3 0:02 25 1
R fork-child-B 3 0:02 26 25
S fork-child-C 3 0:02 27 26
S fork-daemon 3 0:02 28 27
R ps 0 0:01 30 1
Child '2' done
thread done
daemon
(…)
STAT COMMAND USER ELAPSED PID PPID
R sh 0 0:04 1 0
Z zleep 3 0:04 16 1
Z fork-if 3 0:04 22 1
Z fork-child-A 3 0:04 25 1
R fork-child-B 3 0:04 26 1
S fork-child-C 3 0:04 27 26
S fork-daemon 3 0:04 28 27
R ps 0 0:01 33 1
(…)
With reaper:
$ make -C ~/src/my/go/doodles/sub-process case5
(…)
main() CRASH imminent
(…)
Child '1' done
thread done
raeper pid 24
STAT COMMAND USER ELAPSED PID PPID
S sh 0 0:02 1 0
S zleep 3 0:01 18 1
R fork-child-A 3 0:01 27 1
R fork-child-B 3 0:01 28 27
S fork-child-C 3 0:01 30 28
S fork-daemon 3 0:01 31 30
R ps 0 0:01 32 1
Child '2' done
thread done
raeper pid 27
daemon
STAT COMMAND USER ELAPSED PID PPID
S sh 0 0:03 1 0
S zleep 3 0:02 18 1
R fork-child-B 3 0:02 28 1
S fork-child-C 3 0:02 30 28
S fork-daemon 3 0:02 31 30
R ps 0 0:01 33 1
STAT COMMAND USER ELAPSED PID PPID
S sh 0 0:03 1 0
S zleep 3 0:02 18 1
R fork-child-B 3 0:02 28 1
S fork-child-C 3 0:02 30 28
S fork-daemon 3 0:02 31 30
R ps 0 0:01 34 1
raeper pid 18
daemon
STAT COMMAND USER ELAPSED PID PPID
S sh 0 0:04 1 0
R fork-child-B 3 0:03 28 1
S fork-child-C 3 0:03 30 28
S fork-daemon 3 0:03 31 30
R ps 0 0:01 35 1
(…)
Here is a picture of the same output, which may be less confusing to read.
Zombies
Reaper
How to run the cases in the poc repo
Get the code
git clone https://github.com/tox2ik/go-poc-reaper.git
One terminal:
make tail-cases
Another terminal
make prep
make build
or make build2
make case0 case1
...
Related questions:
go
How to create a daemon process in Golang?
How to start a Go program as a daemon in Ubuntu?
how to keep subprocess running after program exit in golang?
Prevent Ctrl+C from interrupting exec.Command in Golang
signals
https://unix.stackexchange.com/questions/386999/what-terminal-related-signals-are-sent-to-the-child-processes-of-the-shell-direc
https://unix.stackexchange.com/questions/6332/what-causes-various-signals-to-be-sent
https://en.wikipedia.org/wiki/Signal_(IPC)#List_of_signals
Related discussions:
https://github.com/golang/go/issues/227
https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem/
Relevant projects:
http://software.clapper.org/daemonize/ (what I would use)
https://github.com/hashicorp/go-reap (if you must have run go on pid 1)
https://github.com/sevlyar/go-daemon (mimics posix fork)
Relevant prose:
A zombie process is a process whose execution is completed but it still has an entry in the process table. Zombie processes usually occur for child processes, as the parent process still needs to read its child’s exit status. Once this is done using the wait system call, the zombie process is eliminated from the process table. This is known as reaping the zombie process.
from https://www.tutorialspoint.com/what-is-zombie-process-in-linux

Why the node child spawn process detached from parent process and start running independently ?

I have a node child spawn process which is continuously write in to the file using writestream on every "data event" received. The script is run under ssh and facing an edge case problem.
Consider multiple terminals open with same ssh host and the script is start to run in one terminal. Someone accidentally close the ssh terminal during it's execution and doesn't want to stop child process execution. while using a command ps -ef | grep "command name" the child process still running with different parent process id (it shows 1) but the writestream in the child process stops writing to the file. It seems like child process become zombie process eventhough i detached the process from parent. You can find the script below:
var execSpawn = require('child_process').spawn;
var Promise = require('bluebird');
var spawnAction = function(path, cmd, cb){
return function(resolve, reject, onCancel){
cmdExec = execSpawn(path, cmd, {detached: true});
//cmdExec = execSpawn(path, cmd, {detached: true}).unref();
var fileData = {}
var count = 0;
var stream = fs.createWriteStream('filepath');
cmdExec.stdout.setEncoding('utf8');
cmdExec.stdout.on('data', function(data){
//Certain actions with filedata and count;
stream.write(data);
});
cmdExec.stderr.on('data', function(data){
//some actions
stream.write("error");
});
cmdExec.on('close', function(){
stream.end();
if(cb){
resolve(cb(fileData));
}else{
resolve(count);
}
});
}
}
This script is running properly when it is allow to run completely without any interruption. When the script execution terminal closes the child process stop the writestream to the file. If i try with detach along with unref() it throws an error like it couldn't figure out the event stdout.on over the child process.
Cannot read property 'stdout' of undefined
More information during the script running. This is taken in the same host in different terminal
ps -ef | grep command_name
root 19904 19191 0 20:16 ? 00:00:00 cli command_name pfitzner7 /dev/sdb
root 19905 19191 0 20:16 ? 00:00:00 cli command_name pfitzner7 /dev/sdc
root 19906 19191 0 20:16 ? 00:00:00 cli command_name pfitzner7 /dev/sdd
root 19907 19191 0 20:16 ? 00:00:00 cli command_name pfitzner7 /dev/sde
root 23101 13105 0 20:16 pts/0 00:00:00 grep --color=auto command_name
After closing the script running terminal before it finishes. I am getting this:
ps -ef | grep command_name
root 19904 1 0 20:16 ? 00:00:00 cli command_name pfitzner7 /dev/sdb
root 19905 1 0 20:16 ? 00:00:00 cli command_name pfitzner7 /dev/sdc
root 19906 1 0 20:16 ? 00:00:00 cli command_name pfitzner7 /dev/sdd
root 19907 1 0 20:16 ? 00:00:00 cli command_name pfitzner7 /dev/sde
root 23163 13105 0 20:16 pts/0 00:00:00 grep --color=auto command_name
I am trying to figure out why it's happening this issue and will it be there any possible ways to do in different way. Why did the parent process id changed to 1 even if it's detached child process? How can a child spawn process can run independently from the parent process?
Please let me know your suggestions on this approach or the reason for the error.
Thanks in advance.
I think, you don't need to detach your childrens. The childrens are spawned asynchronously, so you may manipulate with a few child's stdio at the same time

Linux Service unexpectedly died

Running Ubuntu 17.04, I'd like to have a systemctl service which oversees a main bash script, where three programs (here substituted by dummy script foo_script tagged with an argument) run under an endless loop (because of possible program crashes).
The main script, foo_main.sh, works correctly if called from a command line; but the service I'm trying to set up from it crashes soon.
File foo_script.sh:
#!/bin/bash
echo "FooScripting "$1 >> "foo.d/"$1
File loop.sh:
#!/bin/bash
nLoop=0
prgName=$1
prgArg=$2
echo "<< START of "${prgName} ${prgArg}" loop >>"
while :
do
let nLoop=nLoop+1
echo "<< looping "${prgName} ${prgArg}" >>" ${nLoop}
"./"${prgName} ${prgArg}
sleep 1
done
echo "<< END of "${prgName} ${prgArg}" loop >>"
File foo_main.sh:
#!/bin/bash
echo "foo_main start in "${PWD}
./loop.sh "foo_script.sh" "fb" &
sleep 2
./loop.sh "foo_script.sh" "gc" &
./loop.sh "foo_script.sh" "gb" &
echo "foo_main end"
File /etc/systemd/system/food.service:
[Unit]
Description = Foo Daemon
After = network.target
[Service]
Type = simple
# User = <<USER>>
# PIDFile=/var/food.pid
WorkingDirectory = /home/john/bin
ExecStart = /home/john/bin/foo_main.sh
# ExecStop = killall loop.sh
# ExecReload = killall loop.sh && /home/john/bin/foo_main.sh
# Restart = on-abort
[Install]
WantedBy = multi-user.target
What I obtain from every sudo systemctl status food.service (after a start ofc) is almost the same output
● food.service - Foo Daemon
Loaded: loaded (/etc/systemd/system/food.service; disabled; vendor preset: enabled)
Active: inactive (dead)
Sep 28 14:54:30 john-host systemd[1]: Started Foo Daemon.
Sep 28 14:54:30 john-host foo_main.sh[7376]: foo_main script start in /home/john/bin
Sep 28 14:54:30 john-host foo_main.sh[7376]: << START of foo_script.sh fb loop >>
Sep 28 14:54:30 john-host foo_main.sh[7376]: << looping foo_script.sh fb >> 1
Sep 28 14:54:31 john-host foo_main.sh[7376]: << looping foo_script.sh fb >> 2
Sep 28 14:54:32 john-host foo_main.sh[7376]: foo_main script end
Sep 28 15:24:30 john-host foo_main.sh[7921]: << START of foo_script.sh gb loop >>
Sep 28 15:24:30 john-host foo_main.sh[7921]: << START of foo_script.sh gc loop >>
Sep 28 15:24:30 john-host foo_main.sh[7921]: << looping foo_script.sh gb >> 1
Sep 28 15:24:30 john-host foo_main.sh[7921]: << looping foo_script.sh gc >> 1
Another solution is to use Type=oneshot + RemainAfterExit=yes in your /etc/systemd/system/food.service
Look at https://www.freedesktop.org/software/systemd/man/systemd.service.html to refer onshot type of service.
The service file like following should solve your issue too.
[Unit]
Description = Foo Daemon
After = network.target
[Service]
Type = oneshot
RemainAfterExit=yes
# User = <<USER>>
# PIDFile=/var/food.pid
WorkingDirectory = /home/john/bin
ExecStart = /home/john/bin/foo_main.sh
# ExecStop = killall loop.sh
# ExecReload = killall loop.sh && /home/john/bin/foo_main.sh
# Restart = on-abort
[Install]
WantedBy = multi-user.target
Solved... the service was stopped simply because its execution flow ended with foo_main.sh. It was enough to add something like
# ...
./loop.sh "foo_script.sh" "endless_dummy_loop"
Without background ampersand, at the end of foo_main.sh.
Clearly real services are far different from this, but I got the point now.

Self duplicate background script

This is a background script test.
When run it launch two processes and I don't understand why.
One stop after sleep 20. And other forgets.
#!/bin/bash
back(){
n=0
while [ 1 ]
do
echo $n
n=$(($n+1))
sleep 5
done
}
back &
sleep 20
exit
command "ps -a" in call:
PID TTY TIME CMD
8964 pts/2 00:00:00 backgroundtest
8965 pts/2 00:00:00 backgroundtest
8966 pts/2 00:00:00 sleep
8982 pts/2 00:00:00 sleep
after sleep 20:
PID TTY TIME CMD
8965 pts/2 00:00:00 backgroundtest
9268 pts/2 00:00:00 sleep
then run forever...
why?
while [ 1 ] is an infinite loop. [ 1 ] is always true.
So back & is an infinite loop, started in background (&), then execution continues with sleep 20, which does end after 20 seconds, leaving you with two processes for 20 seconds (& starts a new process in background), then the infinite one after that.

How can I see which CPU core a thread is running in?

In Linux, supposing a thread's pid is [pid], from the directory /proc/[pid] we can get many useful information. For example, these proc files, /proc/[pid]/status,/proc/[pid]/stat and /proc/[pid]/schedstat are all useful. But how can I get the CPU core number that a thread is running in? If a thread is in sleep state, how can I know which core it will run after it is scheduled again?
BTW, is there a way to dump the process(thread) list of running and sleeping tasks for each CPU core?
The "top" command may help towards this, it does not have CPU-grouped list of threads but rather you can see the list of threads (probably for a single process) and which CPU cores the threads are running on by
top -H -p {PROC_ID}
then pressing f to go into field selection, j to enable the CPU core column, and Enter to display.
The answer below is no longer accurate as of 2014
Tasks don't sleep in any particular core. And the scheduler won't know ahead of time which core it will run a thread on because that will depend on future usage of those cores.
To get the information you want, look in /proc/<pid>/task/<tid>/status. The third field will be an 'R' if the thread is running. The sixth from the last field will be the core the thread is currently running on, or the core it last ran on (or was migrated to) if it's not currently running.
31466 (bc) S 31348 31466 31348 34819 31466 4202496 2557 0 0 0 5006 16 0 0 20 0 1 0 10196934 121827328 1091 18446744073709551615 4194304 4271839 140737264235072 140737264232056 217976807456 0 0 0 137912326 18446744071581662243 0 0 17 3 0 0 0 0 0
Not currently running. Last ran on core 3.
31466 (bc) R 31348 31466 31348 34819 31466 4202496 2557 0 0 0 3818 12 0 0 20 0 1 0 10196934 121827328 1091 18446744073709551615 4194304 4271839 140737264235072 140737264231824 4235516 0 0 0 2 0 0 0 17 2 0 0 0 0 0
Currently running on core 2.
To see what the rest of the fields mean, have a look at the Linux kernel source -- specifically the do_task_stat function in fs/proc/array.c or Documentation/filesystems/stat.txt.
Note that all of this information may be obsolete by the time you get it. It was true at some point between when you made the open call on the file in proc and when that call returned.
You can also use ps, something like this:
ps -mo pid,tid,%cpu,psr -p `pgrep BINARY-NAME`
The threads are not necessary to bound one particular Core (if you did not pin it). Therefore to see the continuous switching of the core you can use (a modified answer of Dmitry):
watch -tdn0.5 ps -mo pid,tid,%cpu,psr -p \`pgrep BINARY-NAME\`
For example:
watch -tdn0.5 ps -mo pid,tid,%cpu,psr -p \`pgrep firefox\`
This can be done with top command. The default top command output does not show these details. To view this detail you will have to press f key while on top command interface and then press j(press Enter key after you pressed j). Now the output will show you details regarding a process and which processor its running. A sample output is shown below.
top - 04:24:03 up 96 days, 13:41, 1 user, load average: 0.11, 0.14, 0.15
Tasks: 173 total, 1 running, 172 sleeping, 0 stopped, 0 zombie
Cpu(s): 7.1%us, 0.2%sy, 0.0%ni, 88.4%id, 0.1%wa, 0.0%hi, 0.0%si, 4.2%st
Mem: 1011048k total, 950984k used, 60064k free, 9320k buffers
Swap: 524284k total, 113160k used, 411124k free, 96420k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ P COMMAND
12426 nginx 20 0 345m 47m 29m S 77.6 4.8 40:24.92 7 php-fpm
6685 mysql 20 0 3633m 34m 2932 S 4.3 3.5 63:12.91 4 mysqld
19014 root 20 0 15084 1188 856 R 1.3 0.1 0:01.20 4 top
9 root 20 0 0 0 0 S 1.0 0.0 129:42.53 1 rcu_sched
6349 memcache 20 0 355m 12m 224 S 0.3 1.2 9:34.82 6 memcached
1 root 20 0 19404 212 36 S 0.0 0.0 0:20.64 3 init
2 root 20 0 0 0 0 S 0.0 0.0 0:30.02 4 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:12.45 0 ksoftirqd/0
The P column in the output shows the processor core number where the process is currently being executed. Monitoring this for a few minutes will make you understand that a pid is switching processor cores in between. You can also verify whether your pid for which you have set affinity is running on that particular core only
top f navigation screen ( a live system example ) :
Fields Management for window 1:Def, whose current sort field is forest view
Navigate with Up/Dn, Right selects for move then <Enter> or Left commits,
'd' or <Space> toggles display, 's' sets sort. Use 'q' or <Esc> to end!
* PID = Process Id
* USER = Effective User Name
* PR = Priority
* NI = Nice Value
* VIRT = Virtual Image (KiB)
* RES = Resident Size (KiB)
* SHR = Shared Memory (KiB)
* S = Process Status
* %CPU = CPU Usage
* %MEM = Memory Usage (RES)
* TIME+ = CPU Time, hundredths
* COMMAND = Command Name/Line
PPID = Parent Process pid
UID = Effective User Id
RUID = Real User Id
RUSER = Real User Name
SUID = Saved User Id
SUSER = Saved User Name
GID = Group Id
GROUP = Group Name
PGRP = Process Group Id
TTY = Controlling Tty
TPGID = Tty Process Grp Id
SID = Session Id
nTH = Number of Threads
* P = Last Used Cpu (SMP)
TIME = CPU Time
SWAP = Swapped Size (KiB)
CODE = Code Size (KiB)
DATA = Data+Stack (KiB)
nMaj = Major Page Faults
nMin = Minor Page Faults
nDRT = Dirty Pages Count
WCHAN = Sleeping in Function
Flags = Task Flags <sched.h>
CGROUPS = Control Groups
SUPGIDS = Supp Groups IDs
SUPGRPS = Supp Groups Names
TGID = Thread Group Id
ENVIRON = Environment vars
vMj = Major Faults delta
vMn = Minor Faults delta
USED = Res+Swap Size (KiB)
nsIPC = IPC namespace Inode
nsMNT = MNT namespace Inode
nsNET = NET namespace Inode
nsPID = PID namespace Inode
nsUSER = USER namespace Inode
nsUTS = UTS namespace Inode
Accepted answer is not accurate. Here are the ways to find out which CPU is running the thread (or was the last one to run) at the moment of inquiry:
Directly read /proc/<pid>/task/<tid>/stat. Before doing so, make sure format didn't change with latest kernel. Documentation is not always up to date, but at least you can try https://www.kernel.org/doc/Documentation/filesystems/proc.txt. As of this writing, it will be the 14th value from the end.
Use ps. Either give it -F switch, or use output modifiers and add code PSR.
Use top with Last Used Cpu column (hitting f gets you to column selection)
Use htop with PROCESSOR column (hitting F2 gets you to setup screen)
To see the threads of a process :
ps -T -p PID
To see the thread run info
ps -mo pid,tid,%cpu,psr -p PID
Example :
/tmp # ps -T -p 3725
PID SPID TTY TIME CMD
3725 3725 ? 00:00:00 Apps
3725 3732 ? 00:00:10 t9xz1d920
3725 3738 ? 00:00:00 XTimer
3725 3739 ? 00:00:05 Japps
3725 4017 ? 00:00:00 QTask
3725 4024 ? 00:00:00 Kapps
3725 4025 ? 00:00:17 PTimer
3725 4026 ? 00:01:17 PTask
3725 4027 ? 00:00:00 RTask
3725 4028 ? 00:00:00 Recv
3725 4029 ? 00:00:00 QTimer
3725 4033 ? 00:00:01 STask
3725 4034 ? 00:00:02 XTask
3725 4035 ? 00:00:01 QTimer
3725 4036 ? 00:00:00 RTimer
3725 4145 ? 00:00:00 t9xz1d920
3725 4147 ? 00:00:02 t9xz1d920
3725 4148 ? 00:00:00 t9xz1d920
3725 4149 ? 00:00:00 t9xz1d920
3725 4150 ? 00:00:00 t9xz1d920
3725 4865 ? 00:00:02 STimer
/tmp #
/tmp #
/tmp # ps -mo pid,tid,%cpu,psr -p 3725
PID TID %CPU PSR
3725 - 1.1 -
- 3725 0.0 2
- 3732 0.1 0
- 3738 0.0 0
- 3739 0.0 0
- 4017 0.0 6
- 4024 0.0 3
- 4025 0.1 0
- 4026 0.7 0
- 4027 0.0 3
- 4028 0.0 7
- 4029 0.0 0
- 4033 0.0 4
- 4034 0.0 1
- 4035 0.0 0
- 4036 0.0 2
- 4145 0.0 2
- 4147 0.0 0
- 4148 0.0 5
- 4149 0.0 2
- 4150 0.0 7
- 4865 0.0 0
/tmp #

Resources