How can I identify a program by PID - Linux

My title is not very explicit, so feel free to change it (I don't really know how to name it).
I use a PHP script to check whether a list of PIDs is still running. My issue is that the PID alone is not enough to identify a process: another program can be assigned the same PID number later on, once mine is over.
So, is there something I can do to verify that a PID is the one I need to check and not another process?
I thought about hashing /proc/<pid>/cmdline, but even that is not 100% safe (another process could be the same software run with the same parameters; it's rare, but possible).
If an example is needed:
I run several instances of wget.
One of them has PID 8426.
Some time later…
I check whether PID 8426 is running. It is, so my PHP script reacts and doesn't check the downloaded file. But in fact the wget that had PID 8426 is over, and it's another program that is now running as PID 8426.
If the new program runs for a long time (e.g. a service), my PHP script can wait a long time before checking the downloaded file.

Have you tried employing an object-oriented paradigm, where you could encapsulate the specific PID number into its specific object (i.e., specific program)? To accomplish this, you need to create a class (say you give it the arbitrary name "SOURCE") from which these programs can be obtained as objects belonging to that class. Doing so will encapsulate any information (e.g., PID), including the methods of that specific program to that program alone and, therefore, provide a safer way than doing a hash. Similar methods can be found in the object-oriented programming paradigm of Python.

You can read which binary /proc/<pid>/exe points to. The following concept is shown in a shell, but you can probably do the same in any language, including PHP:
$ readlink "/proc/$$/exe"
/bin/bash
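
To guard against PID reuse, a minimal sketch of this idea (shell again for brevity; the path /usr/bin/wget is only an assumption, adjust it to where your binary actually lives) could combine the liveness check with the readlink check:

pid=8426
expected=/usr/bin/wget

# The PID is only "ours" if the process still exists AND its executable
# still matches the program we originally launched.
if [ -d "/proc/$pid" ] && [ "$(readlink "/proc/$pid/exe")" = "$expected" ]; then
    echo "PID $pid is still our wget"
else
    echo "PID $pid has exited or was reused by another program"
fi

Note that /proc/<pid>/exe is only readable for your own processes (or as root), so run the check as the same user that started the downloads.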

Related

UNIX - Can process communicate with another process by using bash program?

I'm having a discussion with my Operating Systems teacher about whether a shell program can be used to communicate with another process. He says it can't, while I believe it actually can.
For example: if we write echo "123" >> file.txt, and there is a process named P1 that reads data from this file, isn't that communication between those two processes?
Another example: there is a process P1 that waits for a file to be created in order to proceed. If we create that file by using touch file.txt, isn't that also considered communication?
Is my teacher really right? If so, could someone please explain why? He gave me some examples of how processes can communicate with one another, such as shared memory, pipes, or signals.
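
To make the first example concrete, here is a minimal sketch of that exchange, with tail -f merely standing in for the reader process P1 (run the two commands in separate shells):

# shell 1: the "reader" process P1, waiting for new data in file.txt
tail -f file.txt

# shell 2: the "writer" process, appending a line that P1 will then see
echo "123" >> file.txt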

Best approach to get all recursive child processes of a PID from either the `/proc` filesystem or equivalent kernel API?

Let's say I have Chrome running, which has 100 different processes, not all of which are direct children. What's the best way to programmatically get all of those processes, from either procfs or whatever syscall there may be (I believe getrusage only covers the calling process), given the PID of the main Chrome parent in the hierarchy?
Also, is there any API that's equivalent to PSAPI in Windows which provides OpenProcess, GetProcessMemoryInfo etc, that allows you to iterate through memory efficiently, rather than parsing the procfs?
Most efficient way please. No calling other processes like ps, pstree, pgrep, etc.
Side context: This is mostly an educational exercise to find the most efficient way to do this. I started down this path while trying to write a simple script in Node.js to get all the processes programmatically and then calculate the sum of the memory taken by the process tree, including each process.
I'm actually the author of a C++ library that is designed to do exactly that - pfs.
pfs attempts to make all the interesting information inside procfs accessible through a very simple API. If you find it lacking any useful information, please create an issue, and I'll try to add it.
Seeing that you require that information from Node.js, you might be able to use the library for "inspiration" or for research purposes (as in, understand where the information is located).
Regarding the process tree: procfs contains the parent PID of every process (you can find it under stat and/or status). You can enumerate all the running processes, store them in a container, and then iterate over it to build or traverse the tree in whatever order you require.
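
A rough shell sketch of that enumeration, for illustration only (the question ultimately wants an in-process solution without spawning external tools; the root PID 1234 is a placeholder):

ROOT=1234

children_of() {
    # print every PID whose PPid line in /proc/<pid>/status matches $1
    for s in /proc/[0-9]*/status; do
        pid=${s#/proc/}; pid=${pid%/status}
        ppid=$(awk '/^PPid:/ {print $2}' "$s" 2>/dev/null)
        [ "$ppid" = "$1" ] && echo "$pid"
    done
}

walk() {
    # depth-first: print each child, then recurse into it
    for c in $(children_of "$1"); do
        echo "$c"
        walk "$c"
    done
}

walk "$ROOT"

A real implementation would read /proc once, keep the pid/ppid pairs in a map, and walk that map instead of rescanning /proc at every level.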

Is there a way to make a bash script process messages that have been sent to it using the write command

Is there a way to make a bash script process messages that have been sent to it using the "write" command? So for example, if a user wants to activate a feature in my script, could I make it so that they can send the script a command using the write command?
One possible method I thought of was to configure logging for a screen session and then have the bash script parse the text from there, but I'm not sure whether there is a simpler or more efficient way to tackle this.
EDIT: I was thinking that, as an alternative solution, I could use a named pipe. I'm worried, though, that it would break if the /tmp partition fills up completely (not sure if this would impact write as well?). I'm going to be running this script on a shared box, and every once in a while someone will completely fill up the /tmp partition and then just leave it like that until people start complaining.
Hmm, you are really trying to bend a poor Unix command into doing something it was not specified for. From the man page (emphasis mine):
The write utility allows you to communicate with other users, by copying
lines from your terminal to theirs
That means that write is intended to copy lines directly onto terminals. As soon as you say "I will dump the terminal output with screen and then parse the dump file", you lose the simplicity of write (and also need disk space, with the problem of removing old lines from a sequential file).
Worse, as your script lives on its own, it could (should?) be a daemon attached to no terminal.
So if I have correctly understood your question, your requirements are:
a script that does some tasks and should be able to respond to asynchronous requests. Common approaches are named pipes or network/Unix-domain sockets; less common are files in a dedicated folder with an optional signal for immediate processing. Appending lines to a sequential file, while possible, is uncommon because of access-synchronization problems.
a simple and convivial way for users to pass requests. OK, write is nice for that part, but much too hard to interface with, IMHO.
If you do not want to waste time on that part and prefer standard tools, I would recommend the mail system. It is trivial to alias a mail address to a program that will be called with the mail message as input. But I am not sure it is worth it, because the user could directly call the program with the request as input or as a command-line parameter.
So the client part could simply be a program that:
creates a temporary file in a dedicated folder (mkstemp is your friend in C or C++, or mktemp in the shell, but beware of race conditions)
writes the request to that file
optionally sends a signal to a PID, provided the script writes its own PID to a dedicated file on startup
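
A corresponding client sketch in shell, under those assumptions (the spool directory /var/spool/myscript, the PID file /var/run/myscript.pid, and the SIGUSR1 choice are all placeholders to adapt):

# create the request file atomically in the dedicated folder
req=$(mktemp /var/spool/myscript/req.XXXXXX) || exit 1

# write the request (here: whatever was passed on the command line)
printf '%s\n' "$*" > "$req"

# optionally poke the daemon so it processes the request right away
kill -USR1 "$(cat /var/run/myscript.pid)"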

Linux non-su script indirectly triggering su script?

I'd like to create an auto-testing/grading script for students on a Linux system such that:
Any student user can initiate the script at any time.
A separate script (with root privileges) copies student code to a non-student-accessible file space, using non-student-accessible unit tests, etc.
The user receives limited feedback in the form of a text file generated by the grading script.
In short, I'm looking to create something similar to programming contest submission systems, but allowing richer feedback without revealing all of the teacher's unit tests.
I would imagine that a spooling behavior between one initiating script and one root-permission cron script might be in order. Are there any models/examples of how one might best structure communication between a user-initiated script and a separate root-initiated script for such purposes?
There are many options.
The things I would mention first:
Don't use su; use sudo. There are several reasons for this, the main one being that to use su you need the password of the user you want to become, while with sudo you don't.
Scripts can't be setuid; you must use binaries, or just a normal script that is started using sudo (of course, students must have a sudoers entry that allows them to use the script; see the sketch after this list).
Cron is not as fast as you may theoretically need; cron runs tasks every minute. Please consider using inotify instead.
To communicate between components of your system you need something that reacts in real time; there are many open-source components/libraries/frameworks that could help you, but I would recommend taking a look at ZeroMQ and Redis.
Results of the scripts' executions/tests can be written either to the filesystem (I think that would be better) or to a DBMS.
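
For the sudo point above, a hypothetical sudoers entry, edited with visudo, could look like this; the group name and script path are placeholders:

# members of the "students" group may run only the grading script, as root,
# without being asked for a password
%students ALL=(root) NOPASSWD: /usr/local/bin/grade.sh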
If you want to stick to shell scripting, the method I suggest for communicating between processes would be to have the root script continually check a named pipe for input (i.e. keep reopening it after each EOF) and send each input through whatever tests must be done. Have part of the input be a 'return address': where to send the result.
This should allow the tests to be performed in a privileged space without exposing any control over the privileged space to the students. The students don't need sudo, and you don't need to pull in libraries. Just have the students pipe their code into a non-privileged script that adds the return address and whatever other markup you may need, which then gives it to the named pipe.
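
A minimal sketch of that privileged loop, assuming a FIFO at /var/spool/grader/requests (a made-up path), request lines of the form "<return-address> <path-to-code>", and run_tests standing in for whatever grading you actually do:

pipe=/var/spool/grader/requests
[ -p "$pipe" ] || mkfifo -m 622 "$pipe"   # others may write to it, but not read it

while true; do
    # opening the FIFO blocks until a writer shows up; reopen it after each EOF
    while read -r reply_to code_path; do
        run_tests "$code_path" > "$reply_to"
    done < "$pipe"
done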

Init process interaction with shell scripts

Nearly all Linux courses say that the init process, given the runlevel, will execute the appropriate shell scripts to initialize the environment. But none of the courses describe in detail how the init process does it.
As I understand it, the init process is basically a C program, much like any Hello World C code, only much more sophisticated. Does anyone know how this C program actually runs through all the scripts and invokes them?
I would really appreciate any answer, and especially a link to example source code.
You can find explanations of what it does in different documentation:
http://www.centos.org/docs/5/html/5.1/Installation_Guide/s2-boot-init-shutdown-init.html
http://www.gentoo.org/doc/en/handbook/handbook-x86.xml?part=2&chap=4
and you can find its source code over there:
init.h
init.c
Basically, init, as process 1, has the role of fork()ing every application on your system. If you boot Linux with init=/bin/sh on the kernel command line, the process 1 started by the kernel will be a shell. The sysvinit program makes it a bit easier to handle a complex boot: it adds the concept of runlevels, defines a basic environment, etc., which makes it easy to boot a system with many services, and not only a shell. All that part is well explained in the documentation I gave you.
Does anyone knows how this C program actually runs through all the scripts and invokes them?
Well, it is as simple as in your question. When you boot your system, init reads the inittab file, figures out your preferences (what is the default runlevel? what programs to spawn? how many consoles?), and for the chosen runlevel forks a shell that executes the startup script. That shell script then works its way through to the shell scripts you activated from /etc/init.d. The shell-script part is usually very distribution-specific, which is why I gave you two links about it; you may find it is different on Ubuntu and Debian...
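For illustration, a typical SysV-style /etc/inittab excerpt (the paths are distribution-specific; these are the Red Hat/CentOS-style ones used in the first link above). Each line has the form id:runlevels:action:process:

# default runlevel
id:3:initdefault:
# run the system initialization script once at boot
si::sysinit:/etc/rc.d/rc.sysinit
# for runlevel 3, run the rc script, which in turn runs the scripts in /etc/rc3.d/
l3:3:wait:/etc/rc.d/rc 3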
For more details on the source code, you may want to start at the bottom of init.c which contains init's mainloop.
And +1 on your question for your curiosity!
