What exactly does ps -aux do in the terminal? - linux

The ps command prints a static snapshot of the running processes. What exactly does the -aux option do?
a - all processes
u - user
x - execute
Anything else?

I found a nice explanation here: http://www.linfo.org/ps.html
A common and convenient way of using ps to obtain much more complete information about the processes currently on the system is to use the following:
ps -aux | less
The -a option tells ps to list the processes of all users on the system rather than just those of the current user, with the exception of group leaders and processes not associated with a terminal. A group leader is the first member of a group of related processes.
The -u option tells ps to provide detailed information about each process. The -x option adds to the list processes that have no controlling terminal, such as daemons, which are programs that are launched during booting (i.e., computer startup) and run unobtrusively in the background until they are activated by a particular event or condition.
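A short sketch of the three flags in action, assuming the procps version of ps found on most Linux distributions. Its man page notes that ps -aux (with a dash) is formally distinct from ps aux: the standards read -aux as the -a selection plus all processes of a user named x, and ps falls back to the BSD meaning only when no such user exists. The dash-free form below avoids that ambiguity.

```shell
# a: processes of all users, u: user-oriented columns,
# x: also include processes without a controlling terminal.
ps aux | head -n 5

# Count the processes that have no controlling terminal (TTY column is "?").
# These are the ones the x flag adds to the listing.
ps aux | awk 'NR > 1 && $7 == "?"' | wc -l
```

The TTY column is the seventh field of the ps aux layout, which is why the awk filter tests $7.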

Related

How to reference a process reliably (using a tag or something similar)?

I have multiple processes (web-scrapers) running in the background (one scraper for each website). The processes are python scripts that were spawned/forked a few weeks ago. I would like to control them from one central place (kinda like a dispatcher/manager python script); they listen on sockets to enable IPC. The scrapers should remain individual, unrelated processes.
I thought about using the PID to reference each process, but that would require storing the PID whenever I (re)launch one of the scrapers, because there is no semantic relation between a number and my use case. I just want to supply some text tag along with the process when I launch it, so that I can reference it later on.
pgrep -f matches its pattern against the full command line of every process (the name plus all arguments).
E.g. if you spawned a process as python myscraper --scrapernametag=uniqueid01 then you can run:
TAG=uniqueid01; pgrep -f "scrapernametag=$TAG"
to discover the PID of a process later down the line.
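The lookup above can be sketched end to end as follows. The function name find_by_tag and the stand-in process are assumptions for illustration only: a small sh loop plays the role of the scraper, and the tag is built from the shell's own PID so that it is unique for the demo.

```shell
# find_by_tag TAG: print the PIDs whose command line contains scrapernametag=TAG.
find_by_tag() {
    pgrep -f "scrapernametag=$1"
}

# Stand-in for a scraper: an sh loop that carries the tag as an extra argument.
TAG="demo$$"
sh -c 'while :; do sleep 1; done' "scrapernametag=$TAG" >/dev/null 2>&1 &
SCRAPER_PID=$!
sleep 1

echo "launched as PID $SCRAPER_PID"
echo "found by tag:   $(find_by_tag "$TAG")"

kill "$SCRAPER_PID"
```

Because pgrep -f treats the pattern as a regular expression, tags made of plain letters and digits are the safest choice.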

How to set process ID in Linux for a specific program

I was wondering if there is some way to force Linux to use a specific process ID for an application before running it. I need to know the process ID in advance.
Actually, there is a way to do this. Since kernel 3.3 with CONFIG_CHECKPOINT_RESTORE set (which is the case in most distros), there is /proc/sys/kernel/ns_last_pid, which contains the last PID generated by the kernel. So, if you want to set the PID for a forked program, you need to perform these actions:
Open /proc/sys/kernel/ns_last_pid and get fd
flock it with LOCK_EX
write PID-1
fork
Voilà! The child will have the PID that you wanted.
Also, don't forget to unlock (flock with LOCK_UN) and close ns_last_pid.
You can check out the C code at my blog here.
As many have already suggested, you cannot set a PID directly, but shells usually have a way to find out the ID of the last forked process.
For example, in bash you can launch an executable in the background (by appending &) and find its PID in the variable $!.
Example:
$ lsof >/dev/null &
[1] 15458
$ echo $!
15458
On CentOS7.2 you can simply do the following:
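Let's say you want to execute the sleep command with a PID of 1894. Run the following from a root shell: the redirection is performed by the calling shell, not by sudo, so as a normal user the write to ns_last_pid is refused.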
sudo echo 1893 > /proc/sys/kernel/ns_last_pid; sleep 1000
(However, keep in mind that if by chance another process executes in the extremely brief amount of time between the echo and sleep command you could end up with a PID of 1895+. I've tested it hundreds of times and it has never happened to me. If you want to guarantee the PID you will need to lock the file after you write to it, execute sleep, then unlock the file as suggested in Ruslan's answer above.)
There's no way to force a specific PID for a process. As Wikipedia says:
Process IDs are usually allocated on a sequential basis, beginning at
0 and rising to a maximum value which varies from system to system.
Once this limit is reached, allocation restarts at 300 and again
increases. In Mac OS X and HP-UX, allocation restarts at 100. However,
for this and subsequent passes any PIDs still assigned to processes
are skipped
You could just repeatedly call fork() to create new child processes until you get a child with the desired PID. Remember to call wait() often, or you will hit the per-user process limit quickly.
This method assumes that the OS assigns new PIDs sequentially, which appears to be the case e.g. on Linux 3.3.
The advantage over the ns_last_pid method is that it doesn't require root permissions.
Every process on a Linux system is generated by fork(), so there should be no way to force a specific PID.
From Linux 5.5 you can pass an array of PIDs to the clone3 system call to be assigned to the new process, up to one for each nested PID namespace, from the inside out. This requires CAP_SYS_ADMIN or (since Linux 5.9) CAP_CHECKPOINT_RESTORE over the PID namespace.
If you are not concerned with PID namespaces, use an array of size one.

How to resume stopped job on a remote machine given pid?

I have a process on a machine which I stopped (with a Ctrl-Z). After ssh'ing onto the machine, how do I resume the process?
You will need to find the PID and then issue kill -CONT <pid>.
You can find the PID by using ps with some options to produce extended output. Stopped jobs have a T in the STAT (or S) column.
If you succeed in continuing the process but it no longer has a controlling terminal (and it needs one) then it could possibly hang or go into a loop: just keep your eye on its CPU usage.
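A minimal sketch of the stop, inspect, continue cycle, assuming procps ps. The helper proc_state and the throwaway sleep process are illustrative stand-ins for the real stopped job. Ctrl-Z sends SIGTSTP; the sketch uses SIGSTOP instead because it cannot be caught or ignored, which keeps the demo predictable.

```shell
# proc_state PID: print the first character of the STAT column for PID.
proc_state() {
    ps -o stat= -p "$1" | tr -d ' ' | cut -c1
}

# Throwaway process standing in for the stopped job.
sleep 30 &
pid=$!

kill -STOP "$pid"
sleep 1
echo "after STOP: $(proc_state "$pid")"   # a stopped process shows T

kill -CONT "$pid"
sleep 1
echo "after CONT: $(proc_state "$pid")"   # no longer T once it runs again

kill "$pid"
```

From a separate ssh session the same kill -CONT call works on any process you own, since signals do not depend on the shell's job table.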
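You can type fg to resume the process, as long as you are in the shell session that stopped it; job control is per shell, so a fresh ssh login will not see the job. If you have multiple jobs, you can type fg processname (e.g. fg vim) or fg job_id.
To find out the job IDs, use the jobs command.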
Relevant quote from wikipedia on what it does:
fg is a job control command in Unix and Unix-like operating systems that resumes execution of a suspended process by bringing it to the foreground and thus redirecting its standard input and output streams to the user's terminal.
To find out job-id and pid, use "jobs -l", like this:
$ jobs -l
[1]+ 3729 Stopped vim clustertst.cpp
The first column is job_id, and the second is pid.

Confusion with pid's and processes on linux

From reading the docs and online posts, most people say that to kill a process in Linux, only the command kill <pid> is needed.
For example, to kill memcached you would run kill $(cat memcached.pid)
But for pretty much every process that I've tried to kill, including the one above, this did not work. I managed to get it to work with a different command:
ps aux | grep (process name here)
That command, for whatever reason, returned a different PID, which did work when killing the program.
I guess my question is: why are there different PIDs? Isn't the point of an ID to be unique? Why do celery, memcached, and other processes all show a different PID with ps aux | grep than the one in the .pid file? Is this some kind of error in my configuration, or is it meant to be like this?
Also, where is it possible to get all arguments and descriptions for an executable in Linux?
I know the "man" command is useful for some programs, but it won't work for many executables, like celery for example.
Thanks!
The process ID (PID) is assigned by the operating system on the fly when a process starts up. It's unique in the sense that no two running processes have the same ID at the same time. However, the actual value is not guaranteed to be the same from one run of the process to another. The best way to think of it is like those "now serving" tickets: you get whatever number comes up next, and a different one on your next visit.
You are correct that you can look up an ID via ps and grep, though you may find it easier to just use:
pgrep (process name here)
Also, if you just want to kill the process, you can even skip the above step and use:
pkill (process name here)
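One possible reason for the mismatch described in the question is a stale .pid file, left over from an earlier run of the program. The sketch below shows a way to check for that; the helper name pid_file_is_live and the temporary file are assumptions for illustration, not part of memcached or celery.

```shell
# pid_file_is_live FILE: succeed if FILE names a PID that we can signal right now.
# kill -0 delivers no signal; it only checks that the process exists and that
# we are allowed to signal it, so it also fails for another user's process.
pid_file_is_live() {
    local pid
    pid=$(cat "$1" 2>/dev/null) || return 1
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Demo with a temporary PID file for a throwaway background process.
pidfile=$(mktemp)
sleep 30 &
echo $! > "$pidfile"

pid_file_is_live "$pidfile" && echo "PID file is current"

kill "$(cat "$pidfile")"
wait 2>/dev/null
pid_file_is_live "$pidfile" || echo "PID file is stale"
rm -f "$pidfile"
```

If the check reports a stale file, the PID found through pgrep or ps is the one that belongs to the running instance.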

Advanced pgrep-like process search in bash

I need to find the pid of a certain java process in bash on linux.
If there's only one java process,
PID=$(pgrep java)
works.
For multiple java processes it becomes more complicated. Manually, I run pstree, find the ancestor of the java process that I need first, then find the java process in question. Is it possible to do this in bash? Basically I need the functionality that in pseudo-code looks like:
Having `processname1` and `processname2`
and knowing that `processname2` is in the subtree of 'processname1',
find the pid of `processname2`.
In this example the java process will be processname2.
Reformulating your pseudo-code question: find all processname2 processes which have a processname1 process as parent. This can be directly expressed using the following nested pgrep call:
pgrep -P $(pgrep -d, processname1) processname2
Here's the documentation for those flags straight from the pgrep(1) manpage:
-d delimiter
Sets the string used to delimit each process ID in the output
(by default a newline).
-P ppid,...
Only match processes whose parent process ID is listed.
Note that this will only work if processname2 is an immediate child process of processname1.
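To lift the immediate-child restriction, the same pgrep -P lookup can be applied recursively. The following is a sketch, assuming bash and procps pgrep; the function names descendants and find_in_subtree are made up for this example.

```shell
# descendants PID: print every PID in the subtree below PID
# (children, grandchildren, and so on).
descendants() {
    local child
    for child in $(pgrep -P "$1"); do
        echo "$child"
        descendants "$child"
    done
}

# find_in_subtree NAME1 NAME2: print the PIDs of processes named NAME2 that
# sit anywhere below a process named NAME1.
find_in_subtree() {
    local ancestor node
    for ancestor in $(pgrep -x "$1"); do
        for node in "$ancestor" $(descendants "$ancestor"); do
            # NAME2 processes whose parent is this node of the subtree.
            pgrep -P "$node" -x "$2" || true
        done
    done | sort -un
}
```

Usage would look like PID=$(find_in_subtree processname1 java). The sort -un step removes duplicates that appear when one processname1 process is itself a descendant of another.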

Resources