What does dup2 actually do in this case? - linux

I need some clarification here:
I have some code like this:
child_map[0] = fileno(fd[0]);
..
pid = fork();
if(pid == 0)
/* child process*/
dup2(child_map[0], STDIN_FILENO);
Now, will STDIN_FILENO and child_map[0] point to the same file descriptor? Will future inputs be taken from the file pointed to by child_map[0] and STDIN_FILENO?
I thought STDIN_FILENO meant standard output (the terminal).

After the dup2(), child_map[0] and STDIN_FILENO will continue to be separate file descriptors, but they will refer to the same open file description. That means that if, for example, child_map[0] == 5 and STDIN_FILENO == 0, then both file descriptor 5 and 0 will remain open after the dup2().
Referring to the same open file description means that the file descriptors are interchangeable - they share attributes like the current file offset. If you perform an lseek() on one file descriptor, the current file offset is changed for both.
To close the open file description, all file descriptors that point to it must be closed.
It is common to execute close(child_map[0]) after the dup2(), which leaves only one file descriptor open to the file.

It causes all functions which read from stdin to get their data from the specified file descriptor, instead of the parent's stdin (often a terminal, but could be a file or pipe depending on shell redirection).
In fact, this is how a shell would launch processes with redirected input.
e.g.
cat somefile | uniq
uniq's standard input is bound to a pipe, not the terminal.

STDIN_FILENO is stdin, not stdout. (There's a STDOUT_FILENO too.) Traditionally the former is 0 and the latter 1.
This code is using dup2() to redirect the child's stdin from another file descriptor that the parent had opened. (It is in fact the same basic mechanism used for redirection in shells.) What usually happens afterward is that some other program that reads from its stdin is execed, so the code has set up its stdin for that.


Save "perf stat" output per second to a file

I want to get perf output and analyse it. I used
while (true) {
system("sudo perf kvm stat -e r10f4 -a sleep 1 2>&1 | sed '1,3d' | sed '2,$d' > 'system.log' 2>&1");
sleep(0.5);
}
The code above frequently uses perf, which is costly. Instead, I'm running perf stat like: perf kvm stat -a -I 1000 > system.log 2>&1.
This command keeps writing data to "system.log", but I only need the data from the most recent second.
I'm wondering how to make the new data overwrite the older data every second.
Does anybody know if this is feasible, or of other ways to solve my problem?
You mean have perf keep overwriting, instead of appending, so you only have the final second?
One way would be to pipe perf stat -I 1000 into your own program (e.g. via C stdio popen, or with POSIX pipe / dup2 / fork/exec). Then you get to implement the logic that chooses where and how to write the lines to a file. Instead of just writing them normally, you can lseek to the start of the output file before every write, or use pwrite instead of write to always write at file position 0. You can pad each line with spaces out to a fixed width, to make sure a longer previous line didn't leave stray characters in the file that you're not overwriting. (Or ftruncate the output file after writing a line shorter than the previous one.)
Or: redirect perf stat -I 1000 >> file.log with O_APPEND, and periodically truncate the length.
A file opened for append will automatically write at wherever the current end is, so you can leave perf ... -I 1000 running and truncate the file every second or every 5 seconds or so. So at most you have 5 lines to read through to get to the final one you actually want. If using system to run it through a shell, use >>. Or use open(O_APPEND)/dup2 if using fork / execve.
To do the actual truncation, truncate() by path, or ftruncate() by open fd, in a sleep loop. Ideally you'd truncate right before a new line was about to be written, so most of the time there'd be a line. But unless you make extra system calls like fstat or inotify you won't know when exactly to expect another one, although dead reckoning and assuming that sleep sleeps for the minimum time can work ok.
Or: redirect perf stat -I 1000 (without append), and lseek its fd from the parent process, to create the same effect as piping through a process / thread that seeks before write.
"Why does forking my process cause the file to be read infinitely" shows that parent/child processes with file descriptors that share the same open file description can influence each other's file position with lseek.
This only works if you do the redirect yourself with open / dup2, not if you leave it to a shell via system. You need your own descriptor that refers to the same open file description.
Not sure if you could play tricks like lseek on the same fd that a child process had open; in that case avoiding O_APPEND could let you set the write position to 0 without truncating, so the next write will overwrite the contents. But that only works if you open / dup2 / fork/exec and the fd in the parent has its file position tied to the fd in the child process.
Totally untested example
This might not actually compile even with the right headers, but hopefully illustrates the idea.
// must *not* use O_APPEND, otherwise the file position is ignored for write()
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int fd = open("system.log", O_WRONLY|O_CREAT|O_TRUNC, 0666);
pid_t pid;
if ((pid = fork()) != 0) {
    // parent
    while (1) {
        sleep(1);
        lseek(fd, 0, SEEK_SET); // reset the position at which perf will do its next write
    }
} else {
    // child
    dup2(fd, STDOUT_FILENO); // redirect stdout to the open file
    // TODO: look up best practices for fork/exec, and error checking
    // note argv[0]: the first "sudo" is the program to run, the second is its argv[0]
    execlp("sudo", "sudo", "perf", "kvm", "stat", "-er10f4", "-a", "sleep", "1", (char *)NULL);
    perror("execlp sudo perf failed");
}
Note the total lack of error checking of return values. This is more like pseudocode, even though it's written in (hopefully) valid C syntax.
And BTW, you can reduce that pipeline to one sed command: sed -n 4p prints line 4 and suppresses the rest. But perf ... stat -I 1000 doesn't waste lines, so there'd be no need to filter on line numbers that way.

linux dup2 doens't seem to work with pipe?

I was trying dup2 on Linux. My test program: I open a pipe, try to dup stdin to the pipe's write end and stdout to the pipe's read end. My hope was that when I run this program, whatever arrives on stdin is written into the pipe, and the pipe automatically dumps the content to stdout:
#include <stdio.h>
#include <unistd.h>

int main() {
    int pipefd[2];
    pipe(pipefd);
    int &readfd = pipefd[0];
    int &writefd = pipefd[1];
    dup2(STDIN_FILENO, writefd);
    dup2(STDOUT_FILENO, readfd);
    char buf[1024];
    scanf("%s", buf);
    return 0;
}
I ran this program and didn't see any extra output on stdout. My questions:
(1) Is my scanf input being written to the pipe's write end (writefd)?
(2) If yes, could the content be auto-directed to my console output? If not, how do I fix it?
If I read man dup2 correctly, the dup2(oldfd, newfd) system call creates a copy of the oldfd file descriptor numbered newfd, silently closing newfd if it was previously open. So your dup2(STDIN_FILENO, writefd) line closes the write end of the pipe and replaces it with a copy of stdin. The next line does the same for the read end and stdout, respectively.
So you don't get your stdin and stdout connected through the pipe. Instead, you create a pipe, then close both its ends and replace them with copies of your original stdin and stdout descriptors. After that, your scanf("%s", buf); just reads a string from the original stdin as usual. You can add a line like printf("%c\n", buf[1]); right after it, and it will print the second character of the string to the original stdout. Note that at that point there is in fact no pipe left: both ends of the pipe created with pipe(pipefd) were already closed.

bash: Creating many descriptors in a loop

I am trying to create multiple descriptors to files named 1, 2, 3, etc. in bash.
For example, exec 9>abc/1 works just fine, but when I try to create descriptors in a for loop, like this: exec $[$i+8]>abc/$i, it doesn't work. I tried many different ways, but it seems that exec just does not accept variables. Is there any way to do what I want to?
EDIT: If not, maybe there is a way to use flock without descriptors?
Right, exec doesn't accept variables for file descriptor numbers. As pointed out in the comments, you can use
eval "exec $((i + 8))>"'"abc/$i"'
which, if $i is 1, is equivalent to
exec 9>"abc/$i"
Those complex quotes ensure that the eval-ed and then exec-ed command is safe even if the file name is changed to something other than abc/1.
But there is a warning:
Redirections using file descriptors greater than 9 should be used with care, as they may conflict with file descriptors the shell uses internally.
So if your task doesn't require consecutive file descriptor numbers, you can use automatically allocated descriptors:
Each redirection that may be preceded by a file descriptor number may instead be preceded by a word of the form {varname}. In this case, for each redirection operator except >&- and <&-, the shell will allocate a file descriptor greater than 10 and assign it to varname.
So,
exec {fd}>"abc/$i"
echo "$fd"
will open file descriptor 10 (or greater) for writing to abc/1 and print that file descriptor number (e.g. 10).

Does piping write to stdin?

Does running something like below cause the textfile lines to be directed to the STDIN of program.sh?
cat textfile | program.sh
Yes; and the rest of this answer comes to satisfy SO's requirement of minimum 30 characters per answer (excluding links).
http://en.wikipedia.org/wiki/Pipeline_(Unix)
Yes. You're writing the stdout from cat to the stdin of program.sh. Because cat isn't doing much except reading the file, you can also write it as:
program.sh < textfile
...which does the same thing.
From a technical standpoint, stdin is accessed through file descriptor 0, while stdout is file descriptor 1 and stderr is file descriptor 2. With this information, you can make more complicated redirections, such as redirecting stderr to the same place (or a different place!) than stdout. For a cheat sheet about redirections, see Peteris Krumins's Bash Redirections Cheat Sheet.
Yes.
You are running cat on a text file; its output (the file's lines) goes to the stdin of program.sh.

Shell script to nullify a file everytime it reaches a specific size

I am in the middle of writing a shell script to nullify/truncate a file when it reaches a certain size, while the file is being held open and written by another process the whole time. Every time I nullify the file, will the writer's file pointer be repositioned to the start of the file, or will it remain at its previous position? Is there a way to reset the file pointer once the file has been truncated?
The position of the file pointer depends on how the file was opened by the process that has it open.
If it was opened in append mode, truncating the file means that new data will be written at the (new) end of the file, which for the first write after the truncation is also the beginning.
If it was not opened in append mode, truncating the file simply means that there is a run of virtual zero bytes (a hole) at the start of the file, and the real data continues to be written at the point where the last write finished.
If the file is being reopened by the other process, rather than held open, roughly the same rules apply, but there is a better chance that the file will be written at the beginning. It all depends on how the first process to write to the file after the truncation manages its file pointer.
You can't reset the file pointer of another process, AFAIK.
A cron job or something similar can do the task; this finds every file bigger than 4096 bytes and nullifies it:
$ find . -type f -size +4096c -print0 | while IFS= read -r -d '' line; do cat /dev/null > "$line"; done
(Note the +4096c: -size 4096c alone matches only files that are exactly 4096 bytes.)
