about fork and execve system call - linux

It is said that the fork system call creates a clone of the calling process, and then (usually) the child process issues the execve system call to replace its image and run a new program. Why this two-step approach?
BTW, what does execve stand for?

The reason for the two-step is flexibility. Between the two steps you can modify the context of the child process that the newly exec'ed program will inherit.
Some things you may want to change are:
File descriptors
User/group ID
Process group and session IDs
Current directory
Resource limits
Scheduling priority and affinity
File creation mask (umask)
If you did not split up fork and exec and instead had a single spawn-like system call, it would need to take arguments for each of these process attributes if you wanted them set differently in a child process. For example, see the argument list to CreateProcess in the Windows API.
With fork/exec, you change whatever inheritable process attributes you want to in the child before you exec the new program.
Setting up file descriptors is one of the more common things to change in a child's process context. If you want to capture the output of a program, you will typically create a pipe in the parent with the pipe(2) system call, and after fork(2)ing, you will close the write end in the parent process and close the read end in the child process before calling execve(2). (You'll also use dup(2) to set the child end of the pipe to be file descriptor 1 (stdout)). This would either be impossible or restrictive in a single system call.
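The pipe setup described above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the function name capture_output and its buffer-based interface are assumptions made for the example, and it uses dup2(2) rather than dup(2) so the target descriptor is named explicitly.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] (looked up via PATH) and copies up to
   bufsize-1 bytes of its stdout into buf, NUL-terminated.
   Returns the number of bytes captured, or -1 on error. */
ssize_t capture_output(char *const argv[], char *buf, size_t bufsize)
{
    int fds[2];
    if (bufsize == 0 || pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                      /* child */
        close(fds[0]);                   /* the child never reads the pipe */
        dup2(fds[1], STDOUT_FILENO);     /* pipe write end becomes fd 1 */
        close(fds[1]);
        execvp(argv[0], argv);
        _exit(127);                      /* only reached if exec failed */
    }

    close(fds[1]);                       /* the parent never writes */
    size_t total = 0;
    ssize_t n;
    while (total < bufsize - 1 &&
           (n = read(fds[0], buf + total, bufsize - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);               /* reap the child */
    return (ssize_t)total;
}
```

Everything between fork() and execvp() runs in the child only, which is exactly the window the two-step design provides.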

exec: execute a new program
v: arguments are passed as an array (vector)
e: the environment is specified explicitly as well
Other variations of exec abound:
int execl(const char *path, const char *arg, ...);
int execlp(const char *file, const char *arg, ...);
int execle(const char *path, const char *arg,..., char * const envp[]);
int execv(const char *path, char *const argv[]);
int execvp(const char *file, char *const argv[]);
l: arguments are listed individually in the function call
p: use $PATH to locate the executable file

Each step is relatively simple.
In Unix, your process has, roughly, two parts -- a read-only memory area with the application code ("text") and a read-write memory area ("data", which here includes the heap and the stack).
A fork clones the read-write area and shares the text. You now have two processes running the same code. They differ by a register value -- the return value from fork -- which separates parent from child.
An exec replaces the whole image, text and data alike, with that of the new program. What survives is the process itself: its PID, its open file descriptors, and the other attributes set up before the call. There are many forms of exec, depending on how you pass the arguments and how much environment information you're passing to it. See http://linux.die.net/man/3/exec for a list of variants.

The "exec" family of functions replaces the current process image with a new process image. For example, if you run the 'ls' command from a shell (/bin/sh or /bin/csh), the shell forks a new process, which then executes ls. Once ls exits, control returns to the parent process, which in this example is the shell.
If there were no fork functionality, the shell would be replaced by the 'ls' process, which upon exit would leave you with an inaccessible terminal, since the shell's image in memory was replaced by the exec call to ls.
For variations in the 'exec' family look at 0x6adb015's answer.
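What the shell does for a command like ls can be sketched as below. This is only an assumed, simplified model (a real shell also handles job control, signals, and redirection), and run_command is a name invented for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, exec argv[0] in the child, wait in the parent.
   Returns the command's exit status, or -1 if it could not be run or was
   terminated by a signal. */
int run_command(char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);   /* replaces the child's image */
        _exit(127);              /* only reached if exec failed */
    }
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    /* The parent (the "shell") is still here after the command exits. */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because only the child calls execvp(), the caller keeps running and can read the next command once waitpid() returns.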

What does execve stand for?
The six variations of the exec functions in C are exec{l,v}{,e,p}. See the function prototypes below for details.
Command-line arguments
v - Command-line arguments are passed to the function as an array (vector) of pointers.
l - Command-line arguments are passed individually (a list) to the function.
Environment variables (optional)
e - An array of pointers to environment variables is explicitly passed to the new process image
Locate the file to be executed (optional)
p - Uses the PATH environment variable to find the file named in the file argument to be executed
int execl (char const *path, char const *arg0, ...);
int execle(char const *path, char const *arg0, ..., char *const envp[]);
int execlp(char const *file, char const *arg0, ...);
int execv (char const *path, char *const argv[]);
int execve(char const *path, char *const argv[], char *const envp[]);
int execvp(char const *file, char *const argv[]);
Source
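To make the "v" and "e" letters concrete, here is a small sketch that calls execve with an argument vector and an explicit environment. The wrapper name run_with_env is hypothetical; since there is no "p" in execve, the path must name the executable directly.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs the program at path with the given argument
   vector ("v") and exactly the given environment ("e").
   Returns its exit status, or -1 on failure. */
int run_with_env(const char *path, char *const argv[], char *const envp[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        execve(path, argv, envp);  /* no PATH search: path is used as given */
        _exit(127);                /* only reached if exec failed */
    }
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child sees only the variables listed in envp; nothing is inherited from the caller's environment unless it is passed in explicitly.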

Related

dup2 vs close+open for stdout redirection

Redirecting stdout with close and open:
close(STDOUT_FILENO);
int fd = open("log", O_RDWR);
printf("My output\n");
differs from redirection through dup2:
int fd = open("log", O_RDWR);
dup2(fd, STDOUT_FILENO);
printf("My output\n");
With strace I see that in the first case write fails with EBADF:
$ strace -e write ./a.out
write(1, "My output\n", 10) = -1 EBADF (Bad file descriptor)
How does dup2 differ from close+open?
When you do
int fd = open("log", O_RDWR);
The new descriptor is not necessarily STDOUT_FILENO. Just because you closed STDOUT_FILENO does not mean it will be used for open(). open() returns the lowest-numbered descriptor that is currently free, so the close-then-open trick usually works, but it is still possible that:
stdin is closed as well and would get chosen first.
a signal arrives between the close() and the open(), and the signal handler opens a file.
the open() fails, leaving you with a closed stdout.
The dup2() version, on the other hand, ensures the file descriptor will be the correct one by specifying it explicitly. In addition, dup2() is guaranteed to be atomic. That is, at any point in time, STDOUT_FILENO is valid, and either it's still the old one, or it is the new one and the old one has been closed.
As a sidenote, after a dup2, both descriptor numbers refer to the same file, and both need to be close()ed. So if you're just redirecting stdout, you probably want to close(fd) after the dup2() call.
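Putting these points together, a redirect with error checking might look like the sketch below. The helper names redirect_stdout and restore_stdout are invented for this example, and O_CREAT | O_TRUNC are added on the assumption that the log file may not exist yet (the open() in the question fails in that case).

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: points stdout at path. Returns a duplicate of the
   old stdout (to pass to restore_stdout), or -1 on failure, in which case
   stdout is left untouched. */
int redirect_stdout(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;                 /* stdout is still valid here */
    fflush(stdout);                /* don't carry buffered text across */
    int saved = dup(STDOUT_FILENO);
    if (saved == -1 || dup2(fd, STDOUT_FILENO) == -1) {
        if (saved != -1)
            close(saved);
        close(fd);
        return -1;
    }
    close(fd);                     /* fd 1 now refers to the file */
    return saved;
}

/* Hypothetical helper: undoes redirect_stdout. Returns 0 on success. */
int restore_stdout(int saved)
{
    fflush(stdout);
    int r = dup2(saved, STDOUT_FILENO);
    close(saved);
    return r == -1 ? -1 : 0;
}
```

Opening the file before touching descriptor 1 means a failed open() never leaves the process without a stdout.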

What does the function ttyn(3) return?

The man page is here: http://man.cat-v.org/unix-6th/3/ttyn
This example:
if (ttyn(0) == 'x') {
...
}
The man page says "x is returned if the indicated file does not correspond to a typewriter."
The indicated file would be argument 0, so the standard input, right?
And what is a typewriter? My keyboard?
What are you checking with this line?
if (ttyn(0) == 'x')
At that point in time, a typewriter (or teletype, or tty) was an RS-232 terminal connected to the computer via a serial port. The device entries in /dev corresponding to these ports were named /dev/tty0, /dev/tty1, /dev/ttya, etc. Each of those files was a character special file, as opposed to an ordinary file.
When a terminal was detected by the system, typically by being turned on or connected through a modem, the init process opened the device on file descriptors 0, 1, and 2 in a new process, and those file descriptors persisted through the login process, a user's shell, and any processes forked from the shell.
As you said in your question, file descriptor 0 is also called standard input.
The ttyn function calls fstat on its argument, which returns some info about the file such as its inode number, permissions, etc. ttyn then reads through /dev, looking at each file that starts with "tty", to see which one has the same inode number as ttyn's argument. When it finds a match, it returns the 4th character of the filename, which would be '0', '1', 'a', etc. If no matches are found, it returns 'x'.
There were generally a console and a few 8-port serial interfaces on a PDP-11, so there was no ttyx. And you could name devices in /dev anything you wanted, so it was easy to avoid /dev/ttyx being an actual device.
Commands like goto could use ttyn(0) != 'x' to determine whether the user was actually typing the command on a terminal.
Here is the default config file, /etc/ttys, used by init in V6. The console was tty8.
In V7 Unix, the functionality of ttyn was replaced by ttyname, which could accommodate longer device names, and isatty, which returned true if the file descriptor was a terminal device. The goto command was not present in V7.
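On a current system the same check can be approximated with isatty and ttyname. The sketch below is an assumption-laden stand-in, not the V6 implementation: ttyn_like is an invented name, and it returns the last character of the terminal name (modern names such as /dev/pts/3 no longer fit the old "tty" plus one character pattern).

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for V6 ttyn(): returns the last character of the
   terminal name for fd, or 'x' if fd does not refer to a terminal. */
char ttyn_like(int fd)
{
    if (!isatty(fd))
        return 'x';
    const char *name = ttyname(fd);
    if (name == NULL || name[0] == '\0')
        return 'x';
    return name[strlen(name) - 1];
}
```

A command that wants to know whether the user is typing at a terminal would test ttyn_like(0) != 'x', or more directly isatty(0).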
I've never seen this library call before; I'm used to the more familiar ttyname. The webpage doesn't give a return value, but based on what the text says, it would give the last char value in the string returned by ttyname(3). So if stdin (fd 0) was connected to "/dev/tty2", then the return value would be the char '2'. And in C, you would be able to check it with:
if (ttyn(0) == '2') { ... }
Granted, the documentation is not clear. And it is using bad terminology; instead of "typewriter", it should be using "teletype" or "terminal", which are the accepted terms. Remember that stdin can be different from stdout; it is perfectly possible to run cat </dev/tty1 > /dev/tty2, assuming you have the permissions for it.

How to move files in C drive using MoveFileEx API

When I use MoveFileEx to move files on the C drive, I get an ACCESS DENIED error. Any solutions?
int i ;
DWORD dw ;
String^ Source = "C:\\Folder\\Program\\test.exe" ;
String^ Destination = "C:\\test.exe"; // move to program Files Folder
pin_ptr<const wchar_t> WSource = PtrToStringChars(Source);
pin_ptr<const wchar_t> WDestination = PtrToStringChars(Destination);
i = MoveFileEx( WSource, WDestination ,MOVEFILE_REPLACE_EXISTING | MOVEFILE_COPY_ALLOWED ) ;
dw = GetLastError() ;
You need to make sure that the user account your process runs under has read access to the file being moved and write access to where it's being written to. And that the file being moved isn't locked by another process and that there isn't a file with the same name in the destination directory that's locked by another process.
Try moving the same file manually in Windows Explorer and see what errors you get, when you can do that without problems your app will probably work too (assuming they're running under the same account).
Is the code posted in your question the real code you are using?
If so, check your file names. \ is the escape character in C and C++ string literals, so it must be doubled if you want a real \ character in the resulting string.
So your paths should be:
String^ Source = "C:\\Folder\\Program\\test.exe";
String^ Destination = "C:\\test.exe";
Also, ^ is not valid in a declaration in standard C or C++. String^ is C++/CLI syntax for a handle to a managed object, so this code has to be compiled as C++/CLI.

Checking if subfolders exist linux

I'm trying to check if a folder has any subfolders without iterating through its children, in Linux. The closest I've found so far is using ftw and stopping at the first subfolder - or using scandir and filtering through the results. Both, are, however, an overkill for my purposes, I simply want a yes/no.
On Windows, this is done by calling SHGetFileInfo and then testing dwAttributes & SFGAO_HASSUBFOLDER on the returned structure. Is there such an option on Linux?
The standard answer is to call stat on the directory, then check the st_nlink field ("number of hard links"). On a standard filesystem, each directory is guaranteed to have 2 hard links (. and the link from the parent directory to the current directory), so each hard link beyond 2 indicates a subdirectory (specifically, the subdirectory's .. link to the current directory).
However, it's my understanding that filesystems aren't required to implement this (see, e.g., this mailing list posting), so it's not guaranteed to work.
Otherwise, you have to do as you're doing:
Iterate over the directory's contents using glob with the GNU-specific GLOB_ONLYDIR flag, or scandir, or readdir.
Call stat on each result and check S_ISDIR(s.st_mode) to verify that files found are directories. Or, nonportably, check struct dirent.d_type: if it's DT_DIR then it's a directory, and if it's DT_UNKNOWN, you'll have to stat it after all.
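The d_type check with a stat fallback could be sketched like this. It is an illustration under assumptions: has_subdir_dtype is an invented name, d_type is a BSD/glibc extension rather than portable POSIX, and symbolic links to directories are deliberately not counted.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes d_type and DT_* on glibc */
#endif
#include <dirent.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if dir contains at least one
   subdirectory, 0 if it does not, -1 if dir cannot be opened. */
int has_subdir_dtype(const char *dir)
{
    DIR *d = opendir(dir);
    if (d == NULL)
        return -1;

    int found = 0;
    struct dirent *e;
    while (!found && (e = readdir(d)) != NULL) {
        if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
            continue;
        if (e->d_type == DT_DIR) {
            found = 1;                      /* the filesystem told us */
        } else if (e->d_type == DT_UNKNOWN) {
            /* Filesystem gave no type: fall back to a stat call,
               resolved relative to the open directory. */
            struct stat st;
            if (fstatat(dirfd(d), e->d_name, &st, AT_SYMLINK_NOFOLLOW) == 0
                && S_ISDIR(st.st_mode))
                found = 1;
        }
    }
    closedir(d);
    return found;
}
```

The loop stops at the first subdirectory, so the cost is proportional to how many non-directory entries come before it.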
The possibilities you've mentioned (as well as e.James's) seem to me like they're better suited to a shell script than a C++ program. Presuming the "C++" tag was intentional, I think you'd probably be better off using the POSIX API directly:
// warning: untested code.
#include <dirent.h>
#include <sys/stat.h>
#include <string>

bool has_subdir(char const *dir) {
    DIR *directory = opendir(dir);
    if (directory == NULL)
        return false;
    bool found_subdir = false;
    struct dirent *entry;
    while (!found_subdir && (entry = readdir(directory)) != NULL) {
        std::string name(entry->d_name);
        if (name != "." && name != "..") {
            // stat() resolves relative names against the current directory,
            // so build the full path from dir first.
            std::string path = std::string(dir) + "/" + name;
            struct stat status;
            found_subdir = stat(path.c_str(), &status) == 0
                           && S_ISDIR(status.st_mode);
        }
    }
    closedir(directory);
    return found_subdir;
}
Does getdirentries do what you want it to do? I think it should return nothing if there are no directories. I would have tried this myself but am temporarily without access to a Linux box :(

capturing command line output in ncurses

How can I capture command-line output in a window using ncurses?
Suppose I am executing a command like "ls". I want to print its output in a specific window created with ncurses. I am new to ncurses. Thanks in advance.
One thing I can think of is using system() to execute the command, redirecting its output to a temp file:
system("ls > temp");
Then opening the file temp, reading its content and displaying it on the window.
Not an elegant solution, but works.
The more elegant solution might be to implement the redirect within your program. Look into the dup() and dup2() system calls (see the dup(2) manpage). So, what you would want to do is (this is essentially what the shell called by system() ends up doing):
Code Snippet:
// Needs <assert.h>, <stdlib.h>, <string.h>, <sys/wait.h>, and <unistd.h>.
char *tmpname;
int tmpfile;
pid_t pid;
int r;

tmpname = strdup("/tmp/ls_out_XXXXXX");
assert(tmpname);
tmpfile = mkstemp(tmpname);
assert(tmpfile >= 0);
pid = fork();
if (pid == 0) { // child process
    r = dup2(tmpfile, STDOUT_FILENO); // make stdout refer to the temp file
    assert(r == STDOUT_FILENO);
    execl("/bin/ls", "ls", (char *)NULL);
    assert(0); // only reached if execl() failed
} else if (pid > 0) { // parent
    waitpid(pid, &r, 0);
    /* you can mmap(2) tmpfile here, and read from it like it was a memory buffer, or
     * read and write as normal, mmap() is nicer--the kernel handles the buffering
     * and other memory management hassles.
     */
} else {
    /* fork() failed, bail violently for this example, you can handle differently as
     * appropriate.
     */
    assert(0);
}
// tmpfile is the file descriptor for the ls output.
unlink(tmpname); // file stays around until close(2) for this process only
For more picky programs (ones that care that they have a terminal for input and output), you'll want to look into pseudo ttys; see the pty(7) manpage. (Or google 'pty'.) This would be needed if you want ls to do its multicolumn pretty-printing: ls will detect that it is outputting to a file and write one filename per line. If you want ls to do the hard work for you, you'll need a pty. Also, you should be able to set the $LINES and $COLUMNS environment variables after the fork() to get ls to pretty print to your window size--again, assuming you are using a pty. The essential change is that you would delete the tmpfile = mkstemp(...); line and replace that and the surrounding logic with the pty opening logic, and expand the dup2() call to handle stdin and stderr as well, dup2()ing them from the pty file handles.
If the user can execute arbitrary programs in the window, you'll want to be careful of ncurses programs--ncurses translates the move() and printw()/addch()/addstr() commands into the appropriate console codes, so blindly printing the output of ncurses programs will stomp your program's output and ignore your window location. GNU screen is a good example to look into for how to handle this--it implements a VT100 terminal emulator to catch the ncurses codes, and implements its own 'screen' terminal with its own termcap/terminfo entries. Screen's subprograms are run in pseudo-terminals. (xterm and other terminal emulators perform a similar trick.)
Final note: I haven't compiled the above code. It may have small typos, but should be generally correct. If you mmap(), make sure to munmap(). Also, after you are done with the ls output, you'll want to close(tmpfile). unlink() might be able to go much earlier in the code, or right before the close() call--depends on if you want people to see the output you're playing with--I usually put the unlink() directly after the mkstemp() call--this prevents the kernel from writing the file back to disk if the tmp directory is disk backed (this is less and less common thanks to tmpfs). Also, you'll want to free(tmpname) after you unlink() to keep from leaking memory. The strdup() is necessary, as tmpname is modified by mkstemp().
Norman Matloff shows a way on page five of his Introduction to the Unix Curses Library:
// runs "ps ax" and stores the output in cmdoutlines
runpsax()
{ FILE* p; char ln[MAXCOL]; int row; char *tmp;
  p = popen("ps ax","r"); // open Unix pipe (enables one program to read
                          // output of another as if it were a file)
  for (row = 0; row < MAXROW; row++) {
    tmp = fgets(ln,MAXCOL,p); // read one line from the pipe
    if (tmp == NULL) break; // if end of pipe, break
    // don’t want stored line to exceed width of screen, which the
    // curses library provides to us in the variable COLS, so truncate
    // to at most COLS characters
    strncpy(cmdoutlines[row],ln,COLS);
    // remove EOL character
    cmdoutlines[row][MAXCOL-1] = 0;
  }
  ncmdlines = row;
  pclose(p); // close pipe
}
...
He then calls mvaddstr(...) to put out the lines from the array through ncurses.
