How to get cwd for relative paths? - linux

How can I get the current working directory in strace output for system calls that are called with relative paths? I'm trying to debug a complex application that spawns multiple processes and fails to open a particular file.
stat("some_file", 0x7fff6b313df0) = -1 ENOENT (No such file or directory)
Since some_file exists, I believe the process is looking for it in the wrong directory. I tried tracing chdir calls too, but since the output is interleaved, it's hard to deduce the working directory that way. Is there a better way?

You can use the -y option, which prints the full path. Another useful flag in this situation is -P, which traces only syscalls relating to a specific path, e.g.
strace -y -P "some_file"
Unfortunately -y only prints the paths of file descriptors, and since your call fails, it never gets one. A possible workaround is to interrupt the process in a debugger when that syscall runs; then you can get its working directory by inspecting /proc/<PID>/cwd. Something like this (totally untested!):
gdb --args strace -P "some_file" -e inject=open:signal=SIGSEGV
Or you may be able to use a conditional breakpoint. Something like this should work, but I had difficulty getting GDB to follow child processes after a fork. If you only have one process, it should be fine, I think.
gdb your_program
break open if $_streq((char*)$rdi, "some_file")
run
print getpid()
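Once the conditional breakpoint fires, the process is stopped and you can read its working directory from another shell, using the PID printed above (readlink resolves the /proc symlink):
readlink /proc/<PID>/cwd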

It is quite easy: use the function char *realpath(const char *path, char *resolved_path) to resolve the current directory.
This is my example:
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* With a NULL second argument, realpath() malloc()s the result */
    char *abs = realpath(".", NULL);
    if (abs == NULL) {
        perror("realpath");
        return 1;
    }
    printf("%s\n", abs);
    free(abs);
    return 0;
}
output
root@ubuntu1504:~/patches_power_spec# pwd
/root/patches_power_spec
root@ubuntu1504:~/patches_power_spec# ./a.out
/root/patches_power_spec
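For comparison, getcwd(3) returns the same string directly, without resolving "." through realpath(). A minimal equivalent sketch:
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char buf[4096];

    /* getcwd() writes the absolute path of the working directory into buf */
    if (getcwd(buf, sizeof buf) == NULL) {
        perror("getcwd");
        return 1;
    }
    printf("%s\n", buf);
    return 0;
}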

Related

Understanding file descriptor duplication in bash

I'm having a hard time understanding something about redirections in bash.
I'll start with what I know:
Each process has file descriptors opened which it can write to/read from. These file descriptors may represent files on disk, terminals, devices, etc.
When we start a terminal with bash, we have the file descriptors stdin (0), stdout (1), and stderr (2) open, pointing to the terminal. Whenever we run a command (a new process), that process inherits the file descriptors of its parent (bash), so by default it will print stdout and stderr messages to the terminal, and read from the terminal as well.
When we redirect, for example:
$ ls 1>filelist
We're actually changing file descriptor 1 of the ls process to point to the filelist file instead of the terminal. So when ls calls write(1, ...), the data goes to the file.
So to sum it up, a redirection basically changes which file the descriptor that a program writes to/reads from refers to.
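Mechanically, the shell implements such a redirect in the forked child with open() and dup2() before exec'ing the command. A minimal sketch (untested, error checking omitted) of what ls 1>filelist amounts to:
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    /* In the shell's child process, before exec'ing ls: */
    int fd = open("filelist", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    dup2(fd, 1);   /* FD 1 now refers to filelist instead of the terminal */
    close(fd);     /* the temporary descriptor is no longer needed */
    execlp("ls", "ls", (char *)NULL);
    return 1;      /* reached only if execlp() fails */
}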
Now, let's say I have the following C program:
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int main()
{
    /* O_CREAT requires a third (mode) argument */
    int fd = open("info.log", O_CREAT | O_RDWR, 0644);
    printf("%d\n", fd);
    write(fd, "INFO::", 6);
    return 0;
}
This program opens a file info.log, which is referred to by a file descriptor (usually 3).
Indeed, if I now compile this program and run it:
$ ./app
3
It creates the file info.log which contains the "INFO::" text in it.
But here's what I don't get: according to the logic described above, if I now redirect FD 3 to another file:
$ ./app 3> another_file
The text should be written to this other file, but for some reason it isn't.
Can someone explain?
Hint: when you run ./app 3> another_file, it'll print "4" instead of "3".
More detailed explanation: when you run ./app 3> another_file in the shell, a series of things happens:
The shell fork()s a subprocess that'll run ./app. The subprocess is basically a clone of its parent process, so it'll still be running the shell program.
In that subprocess, the shell opens "another_file" on file descriptor #3 for writing.
Then it uses one of the execl() family of calls to execute the ./app binary (with "another_file" still open on FD#3).
The program runs open("info.log", O_CREAT | O_RDWR), which creates "info.log" and opens it on the next available file descriptor. Since FD#3 is already in use, that's FD#4.
The program writes "INFO::" to FD#4, which is "info.log".
Since open() uses a new FD, it's not really affected by any active redirects. And actually, if the program did open something on FD#3, that'd replace the connection to "another_file" with whatever it had opened instead, essentially overriding the redirect.
If the program wanted to use the redirect, it'd have to write to FD#3 without first opening anything on it. This is what's normally done with FD#1 and 2 (standard output and error), and that's why redirecting those works.
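For illustration, a hypothetical variant of the program that cooperates with the redirect (a sketch, assuming the shell really has opened FD#3 via 3>):
#include <unistd.h>

int main(void)
{
    /* No open() here: FD 3 is assumed to have been set up by the
       shell's `3> another_file` redirect, so this write lands in
       another_file. */
    write(3, "INFO::", 6);
    return 0;
}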

Cannot locate core file with abrt-hook-cpp installed

I've been led to understand that if abrt-ccpp.service is installed on a Linux PC, it supersedes/overwrites (I've read both, not sure which is true) the file /proc/sys/kernel/core_pattern, which otherwise specifies the location and filename pattern of core files.
Question:
When I execute systemctl, why does abrt-ccpp.service report exited under the SUB column? I don't understand the combination of active and exited: is the service "alive"/active/running or not?
> systemctl
UNIT LOAD ACTIVE SUB
abrt-ccpp.service loaded active exited ...
Question:
Where are core files generated? I wrote this program to generate a SIGSEGV:
#include <iostream>

int main(int argc, char* argv[], char* envz[])
{
    int* pInt = NULL;
    std::cout << *pInt << std::endl;  // dereferencing NULL raises SIGSEGV
    return 0;
}
Compilation and execution as follows:
> g++ main.cpp
> ./a.out
Segmentation fault (core dumped)
But I cannot locate where the core file is generated.
What I have tried:
Looked in the same directory as my main.cpp. Core file is not there.
Looked in /var/tmp/abrt/ because of the following comment in /etc/abrt/abrt.conf. Core file is not there.
...
# Specify where you want to store coredumps and all files which are needed for
# reporting. (default:/var/tmp/abrt)
#
# Changing dump location could cause problems with SELinux. See man_abrt_selinux(8).
#
#DumpLocation = /var/tmp/abrt
...
Looked in /var/spool/abrt/ because of a comment at this link. Core file is not there.
Edited /etc/abrt/abrt.conf and uncommented and set DumpLocation = ~/foo which is an existing directory. Followed this by restarting abrt-hook-ccpp (sudo service abrt-ccpp restart) and rerunning a.out. Core file was not generated in ~/foo/
Verified that ulimit -c reports unlimited.
I am out of ideas of what else to try and where else to look.
In case helpful, this is the content of my /proc/sys/kernel/core_pattern:
> cat /proc/sys/kernel/core_pattern
|/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e
Can someone help explain how the abrt-hook-ccpp service works and where it generates core files? Thank you.
I'd like to credit https://unix.stackexchange.com/users/119298/meuh, who answered this at https://unix.stackexchange.com/questions/343240/cannot-locate-core-file-with-abrt-hook-cpp-installed.
The answer was to add this line to the file /etc/abrt/abrt-action-save-package-data.conf:
ProcessUnpackaged = yes
By default ABRT only processes crashes in binaries that belong to an installed package, so the unpackaged a.out was being silently ignored.
The comment from @daniel-kamil-kozar was also a viable workaround.

Unable to make executable that properly communicates with node.js

I'm testing the communication between node.js and executables launched as child processes. An executable will be launched from within node.js via child_process.spawn() and its output will be monitored by node.js. I'm testing this capability both on Linux and Windows OSs.
I've successfully spawned tail -f /var/log/syslog and listened to its output, but my own executables can't seem to write correctly to stdout (in whatever form it exists when captured by node.js).
Test code:
#include <iostream>
#include <stdio.h>
#include <unistd.h>

int main()
{
    using namespace std;
    long x = 1;
    while (true)
    {
        fprintf(stdout, "xtime - %ld\n", x++);
        usleep(1000000);
    }
}
(Note: some includes may be useless; I've not checked them)
stdout is not automatically flushed (at least on *nix) when it is not a tty, even if the output contains a newline; when stdout is a tty, a newline generally triggers a flush.
So you can either disable stdout buffering entirely via setbuf(stdout, NULL); or you can manually flush output via fflush(stdout);.
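Applied to the test program above, either fix might look like this (a sketch; note that setvbuf()/setbuf() must be called before the first write to stdout):
#include <stdio.h>
#include <unistd.h>

int main()
{
    /* Option 1: force line buffering even when stdout is a pipe. */
    setvbuf(stdout, NULL, _IOLBF, 0);

    long x = 1;
    while (1) {
        printf("xtime - %ld\n", x++);
        fflush(stdout);   /* Option 2: flush explicitly after each write */
        usleep(1000000);
    }
}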

Getting current working directory within kernel code

I am working on a project in which I need to know the current working directory of the process that invoked the system call. I think it should be possible, since some system calls like open make use of that information.
Could you please tell me how I can get the current working directory path as a string?
You can look at how the getcwd syscall is implemented to see how to do that.
That syscall is in fs/dcache.c and calls:
get_fs_root_and_pwd(current->fs, &root, &pwd);
root and pwd are struct path variables.
That function is defined as an inline function in include/linux/fs_struct.h, which also contains:
static inline void get_fs_pwd(struct fs_struct *fs, struct path *pwd)
and that seems to be what you are after.
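Putting those together, a rough and untested sketch (kernel-internal APIs; details vary between kernel versions):
#include <linux/sched.h>
#include <linux/fs_struct.h>
#include <linux/path.h>
#include <linux/dcache.h>
#include <linux/slab.h>
#include <linux/err.h>
#include <linux/limits.h>
#include <linux/printk.h>

static void print_current_cwd(void)
{
    struct path pwd;
    char *buf, *cwd;

    get_fs_pwd(current->fs, &pwd);          /* takes a reference on pwd */

    buf = kmalloc(PATH_MAX, GFP_KERNEL);    /* PATH_MAX is too large for the kernel stack */
    if (buf) {
        cwd = d_path(&pwd, buf, PATH_MAX);  /* render the path as a string */
        if (!IS_ERR(cwd))
            printk(KERN_INFO "cwd: %s\n", cwd);
        kfree(buf);
    }
    path_put(&pwd);                         /* drop the reference taken above */
}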
How do you do that in a terminal? You use pwd, which looks at the environment variable named PWD.
#include <stdio.h>
#include <stdlib.h>

int main(int ac, char **av) {
    printf("%s\n", getenv("PWD"));
    return 0;
}
If you want to know in which directory the executable is located you can combine the information from getenv and from argv[0].

on-the-fly output redirection, seeing the file redirection output while the program is still running

If I use a command like this one:
./program >> a.txt &
and the program is a long-running one, then I can only see the output once the program has ended. That means I have no way of knowing whether the computation is going well until it actually stops. I want to be able to read the redirected output in the file while the program is running.
What I want is similar to opening the file, appending to it, and closing it again after every write; if the file is only closed at the end of the program, then no data can be read from it until the program ends. The only redirection I know of behaves like the latter.
You can test it with this little python script. The language doesn't matter. Any program that writes to standard output has the same problem.
l = range(0, 100000)
for i in l:
    if i % 1000 == 0:
        print i
    for j in l:
        s = i + j
One can run this with:
python program.py >> a.txt &
Then cat a.txt: you will only get results once the script is done computing.
From the stdout manual page:
The stream stderr is unbuffered. The stream stdout is line-buffered when it points to a terminal. Partial lines will not appear until fflush(3) or exit(3) is called, or a newline is printed.
Bottom line: Unless the output is a terminal, your program will have its standard output in fully buffered mode by default. This essentially means that it will output data in large-ish blocks, rather than line-by-line, let alone character-by-character.
Ways to work around this:
Fix your program: If you need real-time output, you need to fix your program. In C you can use fflush(stdout) after each output statement, or setvbuf() to change the buffering mode of the standard output. For Python there is sys.stdout.flush() or even some of the suggestions here.
Use a utility that can record from a PTY, rather than outright stdout redirections. GNU Screen can do this for you:
screen -d -m -L python test.py
would be a start. This will log the output of your program to a file called screenlog.0 (or similar) in your current directory with a default delay of 10 seconds, and you can use screen to connect to the session where your command is running to provide input or terminate it. The delay and the name of the logfile can be changed in a configuration file or manually once you connect to the background session.
EDIT:
On most Linux systems there is a third workaround: you can use the LD_PRELOAD variable and a preloaded library to override select functions of the C library and use them to set the stdout buffering mode when those functions are called by your program. This method may work, but it has a number of disadvantages:
It won't work at all on static executables
It's fragile and rather ugly.
It won't work at all with SUID executables - the dynamic loader will refuse to read the LD_PRELOAD variable when loading such executables for security reasons.
It's fragile and rather ugly.
It requires that you find and override a library function that is called by your program after it initially sets the stdout buffering mode and preferably before any output. getenv() is a good choice for many programs, but not all. You may have to override common I/O functions such as printf() or fwrite() - if push comes to shove you may just have to override all functions that control the buffering mode and introduce a special condition for stdout.
It's fragile and rather ugly.
It's hard to ensure that there are no unwelcome side-effects. To do this right you'd have to ensure that only stdout is affected and that your overrides will not crash the rest of the program if e.g. stdout is closed.
Did I mention that it's fragile and rather ugly?
That said, the process is relatively simple. You put the replacement functions in a C file, e.g. linebufferedstdout.c:
#define _GNU_SOURCE
#include <stdlib.h>
#include <stdio.h>
#include <dlfcn.h>

char *getenv(const char *s) {
    static char *(*getenv_real)(const char *s) = NULL;
    if (getenv_real == NULL) {
        /* First call: look up the real getenv() and switch stdout
           to line-buffered mode while we're at it. */
        getenv_real = (char *(*)(const char *))dlsym(RTLD_NEXT, "getenv");
        setlinebuf(stdout);
    }
    return getenv_real(s);
}
Then you compile that file as a shared object:
gcc -O2 -o linebufferedstdout.so -fpic -shared linebufferedstdout.c -ldl -lc
Then you set the LD_PRELOAD variable to load it along with your program:
$ LD_PRELOAD=./linebufferedstdout.so python test.py | tee -a test.out
0
1000
2000
3000
4000
If you are lucky, your problem will be solved with no unfortunate side-effects.
You can set the LD_PRELOAD library in the shell, if necessary, or even specify that library system-wide (definitely NOT recommended) in /etc/ld.so.preload.
If you're trying to modify the behavior of an existing program try stdbuf (part of coreutils starting with version 7.5 apparently).
This buffers stdout up to a line:
stdbuf -oL command > output
This disables stdout buffering altogether:
stdbuf -o0 command > output
Have you considered piping to tee?
./program | tee a.txt
However, even tee won't work if "program" doesn't write anything to stdout until it is done. So, the effectiveness depends a lot on how your program behaves.
If the program writes to a file, you can read it while it is being written using tail -f a.txt.
Your problem is that most programs check whether the output is a terminal or not. If the output is a terminal, then output is buffered one line at a time (so each line is output as it is generated), but if the output is not a terminal, then the output is buffered in larger chunks (4096 bytes at a time is typical). This is the normal behaviour of the C library (when using printf, for example) and of the C++ library (when using cout, for example), so any program written in C or C++ will do this.
Most other scripting languages (like perl, python, etc.) have interpreters written in C or C++, and so they show exactly the same buffering behaviour.
The answer above (using LD_PRELOAD) can be made to work on perl or python scripts, since the interpreters are themselves written in C.
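The terminal check these libraries perform boils down to isatty(3); a minimal sketch showing which default buffering mode a program will get:
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Report on stderr, which is unbuffered either way */
    if (isatty(STDOUT_FILENO))
        fprintf(stderr, "stdout is a terminal: line-buffered by default\n");
    else
        fprintf(stderr, "stdout is a pipe or file: fully buffered by default\n");
    return 0;
}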
The unbuffer command from the expect package does exactly what you are looking for.
$ sudo apt-get install expect
$ unbuffer python program.py | cat -
<watch output immediately show up here>
