Why doesn't waitid block until child terminates?

void *stack;
stack = malloc(STACK_SIZE);
if (-1 == clone(child_thread, stack + STACK_SIZE, 0, NULL)) {
    perror("clone failed:");
}
while (waitid(P_ALL, 0, NULL, WEXITED) != 0) {
    perror("waitid failed:");
    sleep(1);
}
The manual says:
If a child has already changed state, then these calls return
immediately. Otherwise they block until either a child changes state
But in fact it returns immediately with an error:
waitid failed:: No child processes
waitid failed:: No child processes
...
Any advice?

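Your child was created with clone() and no termination signal in the flags, so it is a "clone" child, and a plain waitid() does not wait for it. Either pass SIGCHLD in the clone() flags, or use waitpid() with __WCLONE or __WALL. Look further in the man page: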
The following Linux-specific options are for use with children created
using clone(2); they cannot be used with waitid():
__WCLONE
Wait for "clone" children only. If omitted then wait for "non-
clone" children only. (A "clone" child is one which delivers no
signal, or a signal other than SIGCHLD to its parent upon termi-
nation.) This option is ignored if __WALL is also specified.
__WALL (Since Linux 2.4) Wait for all children, regardless of type
("clone" or "non-clone").
__WNOTHREAD
(Since Linux 2.4) Do not wait for children of other threads in
the same thread group. This was the default before Linux 2.4.

I do not know the specifics of what you are trying to do here, but using waitid in the following way might help:
#include <sys/types.h>
#include <sys/wait.h>
...
siginfo_t signalInfo;
waitid(P_ALL, 0, &signalInfo, WEXITED | WSTOPPED | WNOWAIT | WNOHANG);
Then check the following fields of signalInfo to find out what happened to the child:
signalInfo.si_signo : Signal number, always SIGCHLD
signalInfo.si_code : Reason for the state change (CLD_EXITED, CLD_KILLED, CLD_DUMPED, CLD_STOPPED, ...)
signalInfo.si_errno : Any error code set
signalInfo.si_status : Exit status of the child, or the number of the signal that caused the state change
Note: With WNOWAIT the child is left in a waitable state (it remains a zombie) after waitid returns. You may or may not want this option. If you use it, you must call waitid on the child again without WNOWAIT to reap it.
Reference: See man pages for waitid for more information on this.
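For the blocking behaviour the question asks about, here is a minimal sketch that assumes the fix is to pass SIGCHLD in the clone() flags, so that the child counts as an ordinary child. The names child_fn and wait_for_clone_child, the 64 KiB stack and the exit status 7 are made up for illustration.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (64 * 1024)

/* Hypothetical child body: terminate at once with exit status 7. */
static int child_fn(void *arg)
{
    (void)arg;
    return 7;
}

/* Returns the child's exit status, or -1 on error. */
int wait_for_clone_child(void)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* SIGCHLD in the flags makes this an ordinary ("non-clone") child. */
    if (clone(child_fn, stack + STACK_SIZE, SIGCHLD, NULL) == -1) {
        perror("clone");
        free(stack);
        return -1;
    }

    siginfo_t info = {0};
    /* Blocks until the child has exited. */
    if (waitid(P_ALL, 0, &info, WEXITED) == -1) {
        perror("waitid");
        free(stack);
        return -1;
    }
    free(stack);
    return info.si_code == CLD_EXITED ? info.si_status : -1;
}
```

With flags set to 0, as in the question, the same waitid() call fails with ECHILD instead of blocking.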

Related

How to set pthread name at the time of creation?

I am using pthreads in my program and create threads with pthread_create(). Right after creation I call pthread_setname_np() to set the new thread's name.
I observe that the name I set takes a short time to show up; initially the thread inherits the program name.
Any suggestions on how I can set the thread name at the time I create the thread with pthread_create()? I looked through the available pthread_attr_*() functions but did not find one that helps.
A quick way to reproduce what I am observing is as follows:
void * thread_loop_func(void *arg) {
    // some code goes here
    pthread_getname_np(pthread_self(), thread_name, sizeof(thread_name));
    // Output to console the thread_name here
    // some more code
}
int main() {
    // some code
    pthread_t test_thread;
    pthread_create(&test_thread, &attr, thread_loop_func, &arg);
    pthread_setname_np(test_thread, "THREAD-FOO");
    // some more code, rest of pthread_join etc follows.
    return 0;
}
Output:
<program_name>
<program_name>
THREAD-FOO
THREAD-FOO
....
I am looking for the first console output to reflect THREAD-FOO.

how I can set the thread name at the time I create the thread using pthread_create()?
That is not possible. Instead, you can use a barrier or mutex to hold the child thread back until its name has been set. Alternatively, you can set the thread name from inside the thread itself (provided no other thread relies on its name before then).
Do not use pthread_setname_np. It is a nonstandard GNU extension; the _np suffix literally means "non-portable". Write portable code and keep the thread names in storage of your own instead.

Instead of pthread_setname_np(3) you can use prctl(2) with PR_SET_NAME. The only limitation of this function is that it sets the name of the calling thread only, so the call has to be made from inside the new thread, as the first thing the thread function does. Note that prctl(2) is Linux-specific, so it is no more portable than pthread_setname_np.
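As a rough sketch of the "name the thread from inside the thread" approach: the thread function below calls prctl(PR_SET_NAME) before doing anything else, so every later read from that thread sees the new name. prctl(2) is a Linux call; struct thread_args, its fields and run_named_thread are hypothetical names used only for this illustration.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <string.h>
#include <sys/prctl.h>

/* Argument block; the field names are illustrative. */
struct thread_args {
    const char *name;    /* name to apply, at most 15 characters */
    char observed[16];   /* name as seen from inside the thread */
};

static void *thread_loop_func(void *arg)
{
    struct thread_args *ta = arg;

    /* First action in the thread: name the calling thread. */
    prctl(PR_SET_NAME, ta->name, 0, 0, 0);

    /* Reading the name back from this thread now yields the new name. */
    prctl(PR_GET_NAME, ta->observed, 0, 0, 0);
    return NULL;
}

/* Starts a thread that names itself `name`; copies what it observed to out. */
int run_named_thread(const char *name, char out[16])
{
    struct thread_args ta = { .name = name };
    pthread_t t;

    if (pthread_create(&t, NULL, thread_loop_func, &ta) != 0)
        return -1;
    pthread_join(t, NULL);
    memcpy(out, ta.observed, sizeof ta.observed);
    return 0;
}
```

Because the name is set before any other work in the thread, there is no window in which the thread's own output can show the inherited program name.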

Why hangup signal is caught even with nohup?

package main

import (
	"os"
	"os/signal"

	log "github.com/sirupsen/logrus"
	"golang.org/x/sys/unix"
)

func main() {
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, unix.SIGHUP)
	go func() {
		s := <-sigs
		log.Info("OS signal: " + s.String())
	}()
	DoSomething()
}
I compiled the Go code above and ran it with the following command:
nohup ./build_linux > /dev/null 2>&1 &
But the process still catches the hangup signal when I exit the terminal.
It seems that signal.Notify takes priority and the nohup command has no effect.
What is going on, and why does nohup not prevent the hangup signal from reaching the process?

TL;DR
Check signal.Ignored() first:
if !signal.Ignored(unix.SIGHUP) {
signal.Notify(sigs, unix.SIGHUP)
}
Long
tkausl's comment has the correct answer: running:
nohup ./build_linux
from the command line launches the ./build_linux program with SIGHUP set to SIG_IGN (all of these signal names being generic Linux names, rather than Go package names). If you install your own handler for SIGHUP, this overrides the current setting of SIGHUP.
In general, in Unix/Linux programs, the correct pattern is to test whether the signal is currently ignored before (or as part of) installing a signal-catch function. If the signal is ignored, restore it to being ignored.
To make this process completely reliable, the most efficient correct pattern to use is:
hold off the signal (perhaps all signals);
install any desired handler, which returns the current disposition of the signal;
if the current disposition was SIG_IGN, return the disposition to SIG_IGN;
release the signal.
The holding-and-releasing is done with the Unix/Linux sigprocmask or pthread_sigmask system call. The one to use depends on whether you're using threads. Go of course does use threads; see, e.g., this patch to the Cgo runtime startup from 2013 (fixes issue #6811).
Since Go 1.11, which introduced signal.Ignored, you can just use that, as the Go runtime has already done all the appropriate hold / set-and-test / restore sequence at startup, and cached the result. One should definitely use this for SIGHUP so as to obey the nohup convention. One should generally use this for SIGINT and other keyboard signals as well, and there's almost[1] no reason not to use it for all signals.
[1] Jenkins, or some version or versions of Jenkins at least, apparently (incorrectly) sets all signals to be ignored at startup when running test suites.

killall(1) equivalent system call or C library call

I have to stop earlier instances of a process before starting a new instance. For this I need a system call or a C library call.
At present I use system("killall name"). This works, but I want to replace it with an equivalent system call (section 2) or library call (section 3). What are the options?
Also, to remove files from a directory, as in system("rm -f /opt/files*"),
what would be the alternative library(3)/system(2) call?
Please note the * in the file names: all matching files should be removed with one call.
regards,
AK

As far as I know there is no general way to do it, as there is no single call that returns the PID for a process name.
You have to collect the PIDs of the matching processes and call int kill(pid_t pid, int signo); on each of them.
At the very least, you can check how killall itself implements this.
A small addition from Ben's link: killall runs the following lines, i.e. it collects the PIDs of the matching processes with the find_pid_by_name function, whose implementation can be found in the linked source:
pidList = find_pid_by_name(arg);
if (*pidList == 0) {
errors++;
if (!quiet)
bb_error_msg("%s: no process killed", arg);
} else {
pid_t *pl;
for (pl = pidList; *pl; pl++) {
if (*pl == pid)
continue;
if (kill(*pl, signo) == 0)
continue;
errors++;
if (!quiet)
bb_perror_msg("can't kill pid %d", (int)*pl);
}
}
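For a version without busybox, one possible sketch is to scan /proc directly. It assumes a Linux /proc filesystem and matches on /proc/<pid>/comm, which the kernel truncates to 15 characters, so it is an approximation and not how killall itself matches names. The function name kill_by_name is made up for this example.

```c
#include <ctype.h>
#include <dirent.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Send `signo` to every process whose /proc/<pid>/comm equals `name`,
 * skipping the caller. With signo 0 nothing is sent; kill() only checks
 * that the process can be signalled.
 * Returns the number of processes signalled, or -1 if /proc is unreadable.
 */
int kill_by_name(const char *name, int signo)
{
    DIR *proc = opendir("/proc");
    if (proc == NULL)
        return -1;

    int count = 0;
    struct dirent *de;
    while ((de = readdir(proc)) != NULL) {
        if (!isdigit((unsigned char)de->d_name[0]))
            continue;                   /* not a PID directory */

        char path[288], comm[64];
        snprintf(path, sizeof path, "/proc/%s/comm", de->d_name);
        FILE *f = fopen(path, "r");
        if (f == NULL)
            continue;                   /* process has already gone */
        if (fgets(comm, sizeof comm, f) != NULL) {
            comm[strcspn(comm, "\n")] = '\0';
            pid_t pid = (pid_t)atol(de->d_name);
            if (strcmp(comm, name) == 0 && pid != getpid()
                && kill(pid, signo) == 0)
                count++;
        }
        fclose(f);
    }
    closedir(proc);
    return count;
}
```

A call such as kill_by_name("name", SIGTERM) would then stand in for system("killall name"). Processes can start or exit while the scan is running, so the result is only a snapshot.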

You can see the implementation in busybox here: http://git.busybox.net/busybox/tree/procps/kill.c
You can also link with busybox as a shared library and invoke its kill_main instead of launching a separate process. It looks fairly well behaved for embedding like this (it always returns normally and never calls exit()), although you may have difficulty getting error information beyond the return code. (But you aren't getting that via system() either.)

Why can't I catch the SIGINT signal?

Good day, I have the following code:
server s;
namespace signals_handler
{
    //sig_atomic_t is_stop=0;
    void signal_handler(int sig)
    {
        if(sig==SIGHUP)
        {
            printf("received :SIGHUP\n");
            s.restart();
        }
        else if(sig==SIGINT)
        {
            printf("received :SIGINT\n");
            //is_stop = 1;
            s.interupt();
        }
    }
}
int main(int argc, char* argv[])
{
    signal(SIGHUP, signals_handler::signal_handler);
    signal(SIGINT, signals_handler::signal_handler);
    s.start();
    s.listen();
    return 0;
}
When I run this code I can catch SIGHUP, but SIGINT is not delivered to my application: the debugger stops in the "listen" function and never enters the signal handler function. Why does this happen, and what am I doing wrong?

It's normal. gdb catches the signal. From the manual:
Normally, gdb is set up to let the non-erroneous signals like SIGALRM
be silently passed to your program (so as not to interfere with their
role in the program's functioning) but to stop your program
immediately whenever an error signal happens. You can change these
settings with the handle command.
To change the behaviour, use:
handle SIGINT nostop pass
handle signal [keywords...]
Change the way gdb handles signal signal. signal can be the number of a signal or its name (with or without the ‘SIG’ at the beginning);
a list of signal numbers of the form ‘low-high’; or the word ‘all’,
meaning all the known signals. Optional arguments keywords, described
below, say what change to make.
The keywords allowed by the handle command can be abbreviated. Their
full names are:
nostop
gdb should not stop your program when this signal happens. It may still print a message telling you that the signal has come in.
stop
gdb should stop your program when this signal happens. This implies the print keyword as well.
print
gdb should print a message when this signal happens.
noprint
gdb should not mention the occurrence of the signal at all. This implies the nostop keyword as well.
pass
noignore
gdb should allow your program to see this signal; your program can handle the signal, or else it may terminate if the signal is fatal and not handled. pass and noignore are synonyms.
nopass
ignore
gdb should not allow your program to see this signal. nopass and ignore are synonyms.

Convert call from spawn to fork-exec in C

I have code which looks like this in QNX:
return_code= spawnp(cmd, 3, fd_map, NULL, argv, environ);
I need to convert this from QNX to Linux, so I need to use fork-exec since spawn is not available in Linux.
1) How can that be done? Is this right?
pid = fork();
if (pid ==0) /* child */
exec(cmd, argv, environ);
2) How do I pass the parameters fd_map and "3", which are passed to spawn, on to exec?

I don't know what "3" does.
If you want to change the file descriptors available to the child process, you do not do this in the call to exec or fork; you do it in between, by calling close, dup2, etc. The function posix_spawn basically does this for you, and on Linux/glibc, it is implemented using fork and exec (so you can read the source code...)
pid = fork();
if (!pid) {
    // close, dup2 go here
    exec(...);
    // error
}

The 3 indicates the number of file descriptors you are passing into the fd_map and in the spawnp() call it allows you to conveniently select only those file descriptors you want to pass along to the child process.
So after your call to fork() you will have all of the file descriptors in the child process so you can close out those file descriptors you aren't interested in and then, assuming that the file descriptors are not marked as CLOEXEC (close on exec) they will also carry through to the exec()'ed code.
Note that the fork() will fail however if your application is multi-threaded since, until recent versions, QNX doesn't support forking threaded processes.