Environment variables set with setenv() aren't shown in /proc/<PID>/environ - linux

An app sets three environment variables with the setenv() C function; they configure the paths for the OpenSSL config file, other config files, and a modules folder. From the app's behavior it is clear that the variables are set properly.
Below is a minimal example that illustrates the issue.
// setenv_example.c
#include <stdlib.h>
#include <stdio.h>

int main(void) {
    setenv("my_var", "my_value", 1);
    printf("Press any key to exit");
    getchar();
    return 0;
}
In one terminal window ($ is command prompt):
$ gcc setenv_example.c -o setenv_example
$ ./setenv_example
Press any key to exit
In another terminal window:
$ ps -a | grep setenv
164615 pts/5 00:00:00 setenv_example
$ cat /proc/164615/environ | tr '\0' '\n' | grep my
<empty output>
How does one see the environment variables set with setenv() in a running process on Linux (Debian 11)?

How does one see the environment variables set with setenv() in a running process
It is not (that easily) possible:
/proc/<pid>/environ contains the initial environment of the program, as it was passed to execve() when the process was started. Calls to setenv() afterwards only modify the process's own copy of the environment in its address space, which the kernel does not track.
If your program is compiled with debugging symbols, you can attach a debugger to the running process and read the global environ variable.
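For example, a minimal sketch with the program above (assuming the PID 164615 from the question, and that ptrace-attaching is permitted, e.g. by yama/ptrace_scope):
$ gdb -p 164615
(gdb) print ((char **)environ)[0]
(gdb) call (char *) getenv("my_var")
(gdb) detach
The cast on environ helps when the debugger lacks full type information for it; calling getenv() in the target process reads the variable exactly as the process itself sees it.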

Related

Can "tee" command in linux print both the input and the output of a C program?

I have a simple C program that asks the user for an integer and then prints it.
#include <stdio.h>

int main(void) {
    int number;
    printf("Enter an integer: ");
    scanf("%d", &number);
    printf("You entered: %d", number);
    return 0;
}
When I use these commands:
gcc program.c -o test
./test | tee text.txt
When run this way, the program does not print the "Enter an integer:" prompt; instead it waits for input, and only after I provide the input does everything appear, both on the terminal and in the text.txt file. I want to run the program as usual and store everything that appears on the terminal, including both the input and the output, in the text.txt file. Is there any way to do that?
The tee command works with one stream, but you want to capture two. With some care, you could use two separate tee commands to copy both the input and the output to the same file, but you would be better off with a utility designed for your purpose, such as script.
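For example, a sketch using the script utility from util-linux:
$ script -c ./test text.txt
Because script runs the program under a pseudo-terminal, the prompt appears immediately, and the typed input is captured in text.txt along with the output (plus whatever terminal control characters script records).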
For Debian-based Linuxes, run apt install devscripts, and then try the annotate-output utility. For example, run cat using a process substitution and a file that's not there at all:
annotate-output cat <(echo hello) /bin/nosuchfile
...which shows what otherwise would be input, output, and standard error output, all sent to standard output which could then be piped to a file:
13:01:03 I: Started cat /dev/fd/63 /bin/nosuchfile
13:01:03 O: hello
13:01:03 E: cat: /bin/nosuchfile: No such file or directory
13:01:03 I: Finished with exitcode 1

How to get cwd for relative paths?

How can I get the current working directory in strace output for system calls that are called with relative paths? I'm trying to debug a complex application that spawns multiple processes and fails to open a particular file.
stat("some_file", 0x7fff6b313df0) = -1 ENOENT (No such file or directory)
Since some_file exists, I believe it's located in the wrong directory. I tried to trace chdir calls too, but since the output is interleaved, it's hard to deduce the working directory that way. Is there a better way?
You can use the -y option and it will print the full path. Another useful flag in this situation is -P which only traces syscalls relating to a specific path, e.g.
strace -y -P "some_file"
Unfortunately -y will only print the path of file descriptors, and since your call doesn't involve a file descriptor, it has nothing to print. A possible workaround is to interrupt the process in a debugger when that syscall is run; then you can get its working directory by inspecting /proc/<PID>/cwd. Something like this (totally untested!):
gdb --args strace -P "some_file" -e inject=open:signal=SIGSEGV
Or you may be able to use a conditional breakpoint. Something like this should work, but I had difficulty getting GDB to follow child processes after a fork. If you only have one process it should be fine, I think.
gdb your_program
break open if $_streq((char*)$rdi, "some_file")
run
print getpid()
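Once the process is stopped there, its working directory can be read either from inside gdb or from another shell (a sketch; 12345 stands in for the actual PID):
(gdb) info proc cwd
$ readlink /proc/12345/cwd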
It is quite easy: use the function char *realpath(const char *path, char *resolved_path) to get the current directory.
This is my example:
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    /* With a NULL second argument, realpath() allocates the buffer. */
    char *abs = realpath(".", NULL);
    if (abs != NULL) {
        printf("%s\n", abs);
        free(abs);
    }
    return 0;
}
Output:
root@ubuntu1504:~/patches_power_spec# pwd
/root/patches_power_spec
root@ubuntu1504:~/patches_power_spec# ./a.out
/root/patches_power_spec
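For completeness, getcwd(3) returns the same information directly, without going through realpath (a minimal sketch):
#include <limits.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    char buf[PATH_MAX];    /* PATH_MAX comes from <limits.h> */
    if (getcwd(buf, sizeof buf) != NULL)
        printf("%s\n", buf);
    return 0;
}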

Bash script execution with and without shebang in Linux and BSD

How, and by what mechanism, is it determined what executes when a Bash-like script without a shebang is run as a binary?
I guess that running a normal script with a shebang is handled by the binfmt_script Linux module, which checks for a shebang, parses the command line, and runs the designated script interpreter.
But what happens when someone runs a script without a shebang? I've tested the direct execv approach and found out that there's no kernel magic in there - i.e. given a file like this:
$ cat target-script
echo Hello
echo "bash: $BASH_VERSION"
echo "zsh: $ZSH_VERSION"
Running a compiled C program that does just an execv call yields:
$ cat test-runner.c
#include <stdio.h>
#include <unistd.h>

int main(void) {
    char *argv[] = { "./target-script", NULL };
    if (execv("./target-script", argv) == -1)
        perror("./target-script");
}
$ ./test-runner
./target-script: Exec format error
However, if I do the same thing from another shell script, it runs the target script using the same shell interpreter as the original one:
$ cat test-runner.bash
#!/bin/bash
./target-script
$ ./test-runner.bash
Hello
bash: 4.1.0(1)-release
zsh:
If I do the same trick with other shells (for example, Debian's default sh - /bin/dash), it also works:
$ cat test-runner.dash
#!/bin/dash
./target-script
$ ./test-runner.dash
Hello
bash:
zsh:
Mysteriously, it doesn't quite work as expected with zsh and doesn't follow the general scheme. Looks like zsh executed /bin/sh on such files after all:
greycat@burrow-debian ~/z/test-runner $ cat test-runner.zsh
#!/bin/zsh
echo ZSH_VERSION=$ZSH_VERSION
./target-script
greycat@burrow-debian ~/z/test-runner $ ./test-runner.zsh
ZSH_VERSION=4.3.10
Hello
bash:
zsh:
Note that ZSH_VERSION in the parent script worked, while ZSH_VERSION in the child didn't!
How does a shell (Bash, dash) determine what gets executed when there's no shebang? I've tried to dig up that place in the Bash/dash sources, but, alas, I'm kind of lost in there. Can anyone shed some light on the magic that determines whether a target file without a shebang should be executed as a script or as a binary in Bash/dash? Or maybe there is some sort of interaction with the kernel / libc, in which case I'd welcome an explanation of how it works in the Linux and FreeBSD kernels / libcs.
Since this happens in dash and dash is simpler, I looked there first.
Seems like exec.c is the place to look, and the relevant function is tryexec, which is called from shellexec, which is called whenever the shell thinks a command needs to be executed. A simplified version of the tryexec function is as follows:
STATIC void
tryexec(char *cmd, char **argv, char **envp)
{
    char *const path_bshell = _PATH_BSHELL;

repeat:
    execve(cmd, argv, envp);
    if (cmd != path_bshell && errno == ENOEXEC) {
        *argv-- = cmd;
        *argv = cmd = path_bshell;
        goto repeat;
    }
}
So, it simply always replaces the command to execute with the path to itself (_PATH_BSHELL defaults to "/bin/sh") if ENOEXEC occurs. There's really no magic here.
I find that FreeBSD exhibits identical behavior in bash and in its own sh.
The way bash handles this is similar but much more complicated. If you want to look in to it further I recommend reading bash's execute_command.c and looking specifically at execute_shell_script and then shell_execve. The comments are quite descriptive.
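Note that the C library offers the same fallback: POSIX requires execvp()/execlp() to retry the file as a sh script when execve() fails with ENOEXEC, so the question's test program would have succeeded with execvp. A minimal sketch (assuming the target-script from the question is present and executable):
#include <stdio.h>
#include <unistd.h>

int main(void) {
    /* Unlike plain execv(), execvp() falls back to running
       /bin/sh ./target-script when the kernel returns ENOEXEC. */
    char *argv[] = { "./target-script", NULL };
    execvp("./target-script", argv);
    perror("execvp");   /* reached only if even the fallback failed */
    return 1;
}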
(Looks like Sorpigal has covered it but I've already typed this up and it may be of interest.)
According to Section 3.16 of the Unix FAQ, the shell first looks at the magic number (first two bytes of the file). Some numbers indicate a binary executable; #! indicates that the rest of the line should be interpreted as a shebang. Otherwise, the shell tries to run it as a shell script.
Additionally, it seems that csh looks at the first byte, and if it's #, it'll try to run it as a csh script.

on-the-fly output redirection, seeing the file redirection output while the program is still running

If I use a command like this one:
./program >> a.txt &
and the program is a long-running one, then I can only see the output once the program has ended. That means I have no way of knowing whether the computation is going well until it actually stops. I want to be able to read the redirected output in the file while the program is still running.
This is similar to opening a file, appending to it, and closing it again after every write. If the file is only closed at the end of the program, then no data can be read from it until the program ends. The only redirection I know behaves as if the file were closed at the end of the program.
You can test it with this little Python script. The language doesn't matter: any program that writes to standard output has the same problem.
l = range(0, 100000)
for i in l:
    if i % 1000 == 0:
        print i
    for j in l:
        s = i + j
One can run this with:
python program.py >> a.txt &
Then cat a.txt: you will only get results once the script is done computing.
From the stdout manual page:
    The stream stderr is unbuffered. The stream stdout is line-buffered
    when it points to a terminal. Partial lines will not appear until
    fflush(3) or exit(3) is called, or a newline is printed.
Bottom line: Unless the output is a terminal, your program will have its standard output in fully buffered mode by default. This essentially means that it will output data in large-ish blocks, rather than line-by-line, let alone character-by-character.
Ways to work around this:
Fix your program: If you need real-time output, you need to fix your program. In C you can use fflush(stdout) after each output statement, or setvbuf() to change the buffering mode of the standard output; for Python there is sys.stdout.flush(), among other options. See the short C sketch after this list.
Use a utility that can record from a PTY, rather than an outright stdout redirection. GNU Screen can do this for you:
screen -d -m -L python test.py
would be a start. This will log the output of your program to a file called screenlog.0 (or similar) in your current directory with a default delay of 10 seconds, and you can use screen to connect to the session where your command is running to provide input or terminate it. The delay and the name of the logfile can be changed in a configuration file or manually once you connect to the background session.
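As a sketch of the "fix your program" workaround in C (mirroring the Python example's loop), switching stdout to line buffering before the first output makes each line appear in the file as soon as it is printed, even when redirected:
#include <stdio.h>

int main(void) {
    /* Must run before anything is written to stdout. */
    setvbuf(stdout, NULL, _IOLBF, 0);
    for (int i = 0; i < 100000; i++) {
        if (i % 1000 == 0)
            printf("%d\n", i);   /* flushed at each newline */
    }
    return 0;
}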
EDIT:
On most Linux systems there is a third workaround: you can use the LD_PRELOAD variable and a preloaded library to override selected functions of the C library, and use them to set the stdout buffering mode when those functions are called by your program. This method may work, but it has a number of disadvantages:
It won't work at all on static executables
It's fragile and rather ugly.
It won't work at all with SUID executables - the dynamic loader will refuse to read the LD_PRELOAD variable when loading such executables for security reasons.
It's fragile and rather ugly.
It requires that you find and override a library function that is called by your program after it initially sets the stdout buffering mode and preferably before any output. getenv() is a good choice for many programs, but not all. You may have to override common I/O functions such as printf() or fwrite() - if push comes to shove you may just have to override all functions that control the buffering mode and introduce a special condition for stdout.
It's fragile and rather ugly.
It's hard to ensure that there are no unwelcome side-effects. To do this right you'd have to ensure that only stdout is affected and that your overrides will not crash the rest of the program if e.g. stdout is closed.
Did I mention that it's fragile and rather ugly?
That said, the process is relatively simple. You put the replacement functions in a C file, e.g. linebufferedstdout.c:
#define _GNU_SOURCE
#include <stdlib.h>
#include <stdio.h>
#include <dlfcn.h>

char *getenv(const char *s) {
    static char *(*getenv_real)(const char *s) = NULL;
    if (getenv_real == NULL) {
        /* Look up the real getenv, then switch stdout to line buffering. */
        getenv_real = (char *(*)(const char *)) dlsym(RTLD_NEXT, "getenv");
        setlinebuf(stdout);
    }
    return getenv_real(s);
}
Then you compile that file as a shared object:
gcc -O2 -o linebufferedstdout.so -fpic -shared linebufferedstdout.c -ldl -lc
Then you set the LD_PRELOAD variable to load it along with your program:
$ LD_PRELOAD=./linebufferedstdout.so python test.py | tee -a test.out
0
1000
2000
3000
4000
If you are lucky, your problem will be solved with no unfortunate side-effects.
You can set the LD_PRELOAD library in the shell, if necessary, or even specify that library system-wide (definitely NOT recommended) in /etc/ld.so.preload.
If you're trying to modify the behavior of an existing program, try stdbuf (part of GNU coreutils starting with version 7.5, apparently).
This buffers stdout up to a line:
stdbuf -oL command > output
This disables stdout buffering altogether:
stdbuf -o0 command > output
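Applied to the question's example, something like this should make lines show up in a.txt as they are printed (a sketch; note that stdbuf cannot help programs that explicitly set their own buffering):
$ stdbuf -oL python program.py >> a.txt &
$ tail -f a.txt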
Have you considered piping to tee?
./program | tee a.txt
However, even tee won't work if "program" doesn't write anything to stdout until it is done. So, the effectiveness depends a lot on how your program behaves.
If the program writes to a file, you can read it while it is being written using tail -f a.txt.
Your problem is that most programs check whether the output is a terminal or not. If the output is a terminal, then output is buffered one line at a time (so each line is output as it is generated); if the output is not a terminal, then the output is buffered in larger chunks (4096 bytes at a time is typical). This behaviour is normal in the C library (when using printf, for example) and in the C++ library (when using cout, for example), so any program written in C or C++ will do this.
Most other scripting languages (like perl, python, etc.) are written in C or C++ and so they have exactly the same buffering behaviour.
The answer above (using LD_PRELOAD) can be made to work on perl or python scripts, since the interpreters are themselves written in C.
The unbuffer command from the expect package does exactly what you are looking for.
$ sudo apt-get install expect
$ unbuffer python program.py | cat -
<watch output immediately show up here>

/usr/bin/env questions regarding shebang line peculiarities

Questions:
What does the kernel do if you stick a shell-script into the shebang line?
How does the kernel know which interpreter to launch?
Explanation:
I recently wanted to write a wrapper around /usr/bin/env because my CGI Environment does not allow me to set the PATH variable, except globally (which of course sucks!).
So I thought, "OK. Let's set PREPENDPATH and set PATH in a wrapper around env.". The resulting script (here called env.1) looked like this:
#!/bin/bash
/usr/bin/env PATH=$PREPENDPATH:$PATH $*
which looks like it should work. I checked how they both react, after setting PREPENDPATH:
$ which /usr/bin/env python
/usr/bin/env
/usr/bin/python
$ which /usr/bin/env.1 python
/usr/bin/env
/home/pi/prepend/bin/python
Looks absolutely perfect! So far, so good. But look what happens to "Hello World!".
# Shebang is #!/usr/bin/env python
$ test-env.py
Hello World!
# Shebang is #!/usr/bin/env.1 python
$ test-env.1.py
Warning: unknown mime-type for "Hello World!" -- using "application/*"
Error: no such file "Hello World!"
I guess I am missing something pretty fundamental about UNIX.
I'm pretty lost, even after looking at the source code of the original env. It sets the environment and launches the program (or so it seems to me...).
First of all, you should very seldom use $* and you should almost always use "$@" instead. There are a number of questions here on SO which explain the ins and outs of why.
Second - the env command has two main uses. One is to print the current environment; the other is to completely control the environment of a command when it is run. The third use, which you are demonstrating, is to modify the environment, but frankly there's no need for that - the shells are quite capable of handling that for you.
Mode 1:
env
Mode 2:
env -i HOME=$HOME PATH=$PREPENDPATH:$PATH ... command args
This version cancels all inherited environment variables and runs command with precisely the environment set by the ENVVAR=value options.
The third mode - amending the environment - is less important because you can do that fine with regular (civilized) shells. (That means "not C shell" - again, there are other questions on SO with answers that explain that.) For example, you could perfectly well do:
#!/bin/bash
export PATH=${PREPENDPATH:?}:$PATH
exec python "$#"
This insists that $PREPENDPATH is set to a non-empty string in the environment, and then prepends it to $PATH, and exports the new PATH setting. Then, using that new PATH, it executes the python program with the relevant arguments. The exec replaces the shell script with python. Note that this is quite different from:
#!/bin/bash
PATH=${PREPENDPATH:?}:$PATH exec python "$@"
Superficially, this is the same. However, this will execute the python found on the pre-existing PATH, albeit with the new value of PATH in the process's environment. So, in the example, you'd end up executing Python from /usr/bin and not the one from /home/pi/prepend/bin.
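A quick way to observe the difference (a sketch, reusing the question's paths and assuming a second python in $PREPENDPATH):
$ PATH=${PREPENDPATH:?}:$PATH python -c 'import sys; print(sys.executable)'
/usr/bin/python
$ export PATH=${PREPENDPATH:?}:$PATH
$ python -c 'import sys; print(sys.executable)'
/home/pi/prepend/bin/python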
In your situation, I would probably not use env and would just use an appropriate variant of the script with the explicit export.
The env command is unusual because it does not recognize the double-dash to separate options from the rest of the command. This is in part because it does not take many options, and in part because it is not clear whether the ENVVAR=value options should come before or after the double dash.
I actually have a series of scripts for running (different versions of) a database server. These scripts really use env (and a bunch of home-grown programs) to control the environment of the server:
#!/bin/ksh
#
# @(#)$Id: boot.black_19.sh,v 1.3 2008/06/25 15:44:44 jleffler Exp $
#
# Boot server black_19 - IDS 11.50.FC1
IXD=/usr/informix/11.50.FC1
IXS=black_19
cd $IXD || exit 1
IXF=$IXD/do.not.start.$IXS
if [ -f $IXF ]
then
echo "$0: will not start server $IXS because file $IXF exists" 1>&2
exit 1
fi
ONINIT=$IXD/bin/oninit.$IXS
if [ ! -f $ONINIT ]
then ONINIT=$IXD/bin/oninit
fi
tmpdir=$IXD/tmp
DAEMONIZE=/work1/jleffler/bin/daemonize
stdout=$tmpdir/$IXS.stdout
stderr=$tmpdir/$IXS.stderr
if [ ! -d $tmpdir ]
then asroot -u informix -g informix -C -- mkdir -p $tmpdir
fi
# Specialized programs carried to extremes:
# * asroot sets UID and GID values and then executes
# * env, which sets the environment precisely and then executes
# * daemonize, which makes the process into a daemon and then executes
# * oninit, which is what we really wanted to run in the first place!
# NB: daemonize defaults stdin to /dev/null and could set umask but
# oninit dinks with it all the time so there is no real point.
# NB: daemonize should not be necessary, but oninit doesn't close its
# controlling terminal and therefore causes cron-jobs that restart
# it to hang, and interactive shells that started it to hang, and
# tracing programs.
# ??? Anyone want to integrate truss into this sequence?
asroot -u informix -g informix -C -a dbaao -a dbsso -- \
env -i HOME=$IXD \
INFORMIXDIR=$IXD \
INFORMIXSERVER=$IXS \
INFORMIXCONCSMCFG=$IXD/etc/concsm.$IXS \
IFX_LISTEN_TIMEOUT=3 \
ONCONFIG=onconfig.$IXS \
PATH=/usr/bin:$IXD/bin \
SHELL=/usr/bin/ksh \
TZ=UTC0 \
$DAEMONIZE -act -d $IXD -o $stdout -e $stderr -- \
$ONINIT "$#"
case "$*" in
(*v*) track-oninit-v $stdout;;
esac
You should carefully read the Wikipedia article about shebang.
When your system sees the magic number corresponding to the shebang, it does an execve on the path given after the shebang and gives the script itself as an argument.
Your script fails because the file you give (/usr/bin/env.1) is not a binary executable, but itself begins with a shebang....
Ideally, you could resolve it using... env on your script with this line as a shebang:
#!/usr/bin/env /usr/bin/env.1 python
It won't work on Linux, though, as Linux treats "/usr/bin/env.1 python" as a single path (it doesn't split shebang arguments).
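(As an aside, sufficiently recent GNU coreutils versions of env, 8.30 and later, support -S, which does split a single shebang argument into words; assuming such an env, a shebang like this could work:)
#!/usr/bin/env -S /usr/bin/env.1 python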
Short of that, the only way I see is to write your env.1 in C.
EDIT: seems like no one believes me ^^, so I've written a simple and dirty env.1.c:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

const char *prependpath = "/your/prepend/path/here:";

int main(int argc, char **argv) {
    char *args[argc + 1];
    char *env = "/usr/bin/env";
    int i;

    /* arguments: the same, but with env as argv[0] */
    args[0] = env;
    for (i = 1; i < argc; i++)
        args[i] = argv[i];
    args[argc] = NULL;

    /* environment: prepend our path to PATH */
    char *p = getenv("PATH");
    if (p == NULL)
        p = "";
    char *newpath = malloc(strlen(prependpath) + strlen(p) + 1);  /* +1 for the NUL */
    sprintf(newpath, "%s%s", prependpath, p);
    setenv("PATH", newpath, 1);

    execv(env, args);
    perror("execv");    /* only reached if execv fails */
    return 1;
}
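Compile it and point the shebang at the result (a sketch with hypothetical install paths):
$ gcc -O2 -o /usr/local/bin/env.1 env.1.c
$ head -1 test-env.1.py
#!/usr/local/bin/env.1 python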
