Read from standard input with all MPI processes - io

So far I've been using OPEN(fid, FILE='IN', ...) and it seems that all MPI processes read the same file IN without interfering with each other.
Furthermore, to allow the input file to be chosen among several, I simply made the IN file a symbolic link pointing to the desired input. This means that when I want to change the input file I have to run ln -sf desired-input IN before running the program (mpirun -n $np ./program).
I'd really like to be able to run the program as mpirun -n $np ./program < input-file. To do so I removed the OPEN statement and the corresponding CLOSE statement, and changed all READ(fid,*) statements to READ(INPUT_UNIT,*) (I'm using the ISO_FORTRAN_ENV module).
But, after all edits, I've realized that only one process (always 0, I noticed) reads from it, since all others reach EOF immediately. Here is an MWE, using OpenMPI 2.0.1.
! cat main.f90
program main
use, intrinsic :: iso_fortran_env
use mpi
implicit none
integer :: myid, x, ierr, stat
x = 12
call mpi_init(ierr)
call mpi_comm_rank(mpi_comm_world, myid, ierr)
read(input_unit,*, iostat=stat) x
if (is_iostat_end(stat)) write(output_unit,*) myid, "I'm out"
if (.not. is_iostat_end(stat)) write(output_unit,*) myid, "I'm in", myid, x
call mpi_finalize(ierr)
end program main
that can be compiled with mpifort -o main main.f90, run with mpirun -np 4 ./main, and which results in this output
1 I'm out
2 I'm out
3 I'm out
17 this is my input from keyboard
0 I'm in 0 17
I know that MPI has proper routines to perform parallel I/O, but I've found nothing about reading from standard input.

You are seeing the expected behaviour with OpenMPI. By default, mpirun directs UNIX standard input to /dev/null on all processes except the MPI_COMM_WORLD rank 0 process. The MPI_COMM_WORLD rank 0 process inherits standard input from mpirun.
The option --stdin can be used to direct standard input to another process, but not to direct to all.
One could also note that the behaviour of redirection of standard input isn't consistent across MPI implementations (the notion isn't specified by the MPI standard). For example, Intel MPI's mpirun has the -s option: mpirun -np 4 -s all ./main does allow all processes access to mpirun's standard input. There's also no guarantee that processes without that redirection will fail to read rather than wait.
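To see why the other ranks hit EOF straight away, here is a small Python sketch. It is not MPI code, and the names CHILD and read_stdin are made up for illustration. It starts a child process whose standard input is connected to /dev/null, which is what mpirun does by default for every rank except 0, and compares that with a child that is actually given input.

```python
import subprocess
import sys

# Hypothetical child program: reads all of stdin and reports what it saw.
CHILD = (
    "import sys\n"
    "data = sys.stdin.read()\n"
    "print('EOF at once' if data == '' else 'read: ' + data.strip())\n"
)


def read_stdin(stdin_text=None):
    """Run the child; with stdin_text=None its stdin is /dev/null."""
    cmd = [sys.executable, "-c", CHILD]
    if stdin_text is None:
        # Same situation as a non-zero rank under mpirun's default setup.
        proc = subprocess.run(cmd, stdin=subprocess.DEVNULL,
                              capture_output=True, text=True)
    else:
        # Same situation as rank 0, which inherits a real stdin.
        proc = subprocess.run(cmd, input=stdin_text,
                              capture_output=True, text=True)
    return proc.stdout.strip()
```

Reading from /dev/null returns end-of-file immediately, so the first case never sees any data no matter what is typed or redirected on the launching terminal.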

Related

GNU Parallel | pipe command

I am completely new in using GNU parallel and I need your advice in running the command below using GNU parallel:
/home/admin/Gfinal/decoder/decdr.pl --gh --w14b /data/tmp/KRX12/a.bin |
perl /home/admin/decout/decoder/flow.pl >> /data/tmp/decodedgfile/out_1.txt
I will run this command on a list of files (.bin), so what is the best (fastest) approach to achieve that using GNU parallel noting that the output of the first part of the command (/home/admin/Gfinal/decoder/decdr.pl --gh --w14b) is very large (> 2 GB).
Any help would be appreciated.
Will this work:
parallel /home/admin/Gfinal/decoder/decdr.pl --gh --w14b {} '|' perl /home/admin/decout/decoder/flow.pl >> /data/tmp/decodedgfile/out_1.txt ::: /data/tmp/KRX12/*.bin
(If the output from flow.pl is more than your disk I/O can handle, try parallel --compress).
Or maybe:
parallel /home/admin/Gfinal/decoder/decdr.pl --gh --w14b {} '|' perl /home/admin/decout/decoder/flow.pl '>>' /data/tmp/decodedgfile/out_{#}.txt ::: /data/tmp/KRX12/*.bin
It depends on whether you want a single output file or one per input file.
Also spend an hour walking through the tutorial. Your command line will love you for it. man parallel_tutorial
Here are some great videos for gnu-parallel / parallel
Ref youtube Part 1: GNU Parallel script processing and execution
Here is a link from the GNU web site for platform specific information.
Ref gnu parallel download information
"Multiple input sources
GNU parallel can take multiple input sources given on the command line. GNU
parallel then generates all combinations of the input sources:
parallel echo ::: A B C ::: D E F
Output (the order may be different):
A D
A E
A F
B D
B E
............
The input sources can be files:
parallel -a abc-file -a def-file echo"
Ref GNU-Parallel-Tutorial
With reference to the pipe
Pipe capacity
A pipe has a limited capacity. If the pipe is full, then a write(2)
will block or fail, depending on whether the O_NONBLOCK flag is set
(see below). Different implementations have different limits for the
pipe capacity. Applications should not rely on a particular
capacity: an application should be designed so that a reading process
consumes data as soon as it is available, so that a writing process
does not remain blocked.
In Linux versions before 2.6.11, the capacity of a pipe was the same
as the system page size (e.g., 4096 bytes on i386). Since Linux
2.6.11, the pipe capacity is 65536 bytes. Since Linux 2.6.35, the
default pipe capacity is 65536 bytes, but the capacity can be queried
and set using the fcntl(2) F_GETPIPE_SZ and F_SETPIPE_SZ operations.
See fcntl(2) for more information.
PIPE_BUF
POSIX.1 says that write(2)s of less than PIPE_BUF bytes must be
atomic: the output data is written to the pipe as a contiguous
sequence. Writes of more than PIPE_BUF bytes may be nonatomic: the
kernel may interleave the data with data written by other processes.
POSIX.1 requires PIPE_BUF to be at least 512 bytes. (On Linux,
PIPE_BUF is 4096 bytes.) The precise semantics depend on whether the
file descriptor is nonblocking (O_NONBLOCK), whether there are
multiple writers to the pipe, and on n, the number of bytes to be
written:
Ref man7.org pipe
You could have a look at fcntl F_GETPIPE_SZ and F_SETPIPE_SZ operations for more information.
Ref fcntl
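If you want to check the capacity on your own machine, a minimal Python sketch follows. It assumes Linux; the fallback value 1032 for F_GETPIPE_SZ is the Linux constant and is only used when the Python build does not export it, and the function name pipe_capacity is made up for illustration.

```python
import fcntl
import os

# fcntl.F_GETPIPE_SZ is only exported by newer Python versions;
# 1032 is the Linux value of the constant (assumption: running on Linux).
F_GETPIPE_SZ = getattr(fcntl, "F_GETPIPE_SZ", 1032)


def pipe_capacity():
    """Create a throwaway pipe and ask the kernel for its capacity in bytes."""
    read_fd, write_fd = os.pipe()
    try:
        return fcntl.fcntl(write_fd, F_GETPIPE_SZ)
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

Calling pipe_capacity() returns the size in bytes of a freshly created pipe, which is the amount the first command can write before it has to wait for the second one to read.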
All the best

Time taken by `less` command to show output

I have a script that produces a lot of output. The script pauses for a few seconds at point T.
Now I am using the less command to analyze the output of the script.
So I execute ./script | less. I leave it running for sufficient time so that the script would have finished executing.
Now I go through the output of the less command by pressing Pg Down key. Surprisingly while scrolling at the point T of the output I notice the pause of few seconds again.
The script does not expect any input and would have definitely completed by the time I start analyzing the output of less.
Can someone explain how the pause of a few seconds is noticeable in the output of less when the script would have finished executing?
Your script is communicating with less via a pipe. Pipe is an in-memory stream of bytes that connects two endpoints: your script and the less program, the former writing output to it, the latter reading from it.
As pipes are in-memory, it would be not pleasant if they grew arbitrarily large. So, by default, there's a limit of data that can be inside the pipe (written, but not yet read) at any given moment. By default it's 64k on Linux. If the pipe is full, and your script tries to write to it, the write blocks. So your script isn't actually working, it stopped at some point when doing a write() call.
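Here is a rough Python sketch of that limit. It creates a pipe that nobody reads from and writes to it in non-blocking mode until the kernel refuses to take more; the function name bytes_until_pipe_full and the 4096-byte chunk size are arbitrary choices for illustration. In blocking mode, which is what your script uses, the last write would simply hang at that point instead of raising an error.

```python
import os


def bytes_until_pipe_full(chunk=4096):
    """Write into an unread pipe until it is full; return the bytes accepted."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(write_fd, False)  # a full pipe now raises instead of blocking
    total = 0
    try:
        while True:
            try:
                total += os.write(write_fd, b"x" * chunk)
            except BlockingIOError:
                # The pipe is full: a blocking writer would be stuck right here.
                return total
    finally:
        os.close(read_fd)
        os.close(write_fd)
```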
How to overcome this? Adjusting defaults is a bad option; what is used instead is allocating a buffer in the reader, so that it reads into the buffer, freeing the pipe and thus letting the writing program work, but shows to you (or handles) only a part of the output. less has such a buffer and, by default, expands it automatically. However, it doesn't fill it in the background; it only fills it as you read the input.
So what would solve your problem is reading the file until the end (like you would normally press G), and then going back to the beginning (like you would normally press g). The thing is that you may specify these commands via command line like this:
./script | less +Gg
You should note, however, that you will have to wait until the whole script's output loads into memory, so you won't be able to view it at once. less is insufficiently sophisticated for that. But if that's what you really need (browsing the beginning of the output while the ./script is still computing its end), you might want to use a temporary file:
./script >x & less x ; rm x
The pipe is full at the OS level, so script blocks until less consumes some of it.
Flow control. Your script is effectively being paused while less is paging.
If you want to make sure that your command completes before you use less interactively, invoke less as less +G and it will read to the end of the input; you can then return to the start by typing 1G into less.
For some background information there's also a nice article by Alexander Sandler called "How less processes its input"!
http://www.alexonlinux.com/how-less-processes-its-input
Can I externally enforce line buffering on the script?
Is there an off the shelf pseudo tty utility I could use?
You may try to use the script command to turn on line-buffering output mode.
script -q /dev/null ./script | less # FreeBSD, Mac OS X
script -c "./script" /dev/null | less # Linux
For more alternatives in this respect please see: Turn off buffering in pipe.

Confused about stdin, stdout and stderr?

I am rather confused with the purpose of these three files. If my understanding is correct, stdin is the file in which a program writes into its requests to run a task in the process, stdout is the file into which the kernel writes its output and the process requesting it accesses the information from, and stderr is the file into which all the exceptions are entered. On opening these files to check whether these actually do occur, I found nothing seem to suggest so!
What I would want to know is what exactly is the purpose of these files, absolutely dumbed down answer with very little tech jargon!
Standard input - this is the file handle that your process reads to get information from you.
Standard output - your process writes conventional output to this file handle.
Standard error - your process writes diagnostic output to this file handle.
That's about as dumbed-down as I can make it :-)
Of course, that's mostly by convention. There's nothing stopping you from writing your diagnostic information to standard output if you wish. You can even close the three file handles totally and open your own files for I/O.
When your process starts, it should already have these handles open and it can just read from and/or write to them.
By default, they're probably connected to your terminal device (e.g., /dev/tty) but shells will allow you to set up connections between these handles and specific files and/or devices (or even pipelines to other processes) before your process starts (some of the manipulations possible are rather clever).
An example being:
my_prog <inputfile 2>errorfile | grep XYZ
which will:
create a process for my_prog.
open inputfile as your standard input (file handle 0).
open errorfile as your standard error (file handle 2).
create another process for grep.
attach the standard output of my_prog to the standard input of grep.
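The same wiring can be sketched in Python with subprocess, which makes each step explicit. This is only an illustration: MY_PROG and grep_cmd are hypothetical stand-ins written as Python one-liners (so nothing external is needed), and run_pipeline is a made-up helper name.

```python
import subprocess
import sys

# Hypothetical stand-in for my_prog: copies stdin to stdout, logs to stderr.
MY_PROG = [sys.executable, "-c",
           "import sys\n"
           "for line in sys.stdin:\n"
           "    sys.stdout.write(line)\n"
           "sys.stderr.write('my_prog: done\\n')\n"]


def grep_cmd(pattern):
    """Hypothetical stand-in for grep: keep lines containing pattern."""
    return [sys.executable, "-c",
            "import sys\n"
            "for line in sys.stdin:\n"
            "    if sys.argv[1] in line:\n"
            "        sys.stdout.write(line)\n",
            pattern]


def run_pipeline(input_path, error_path, pattern):
    """Rough equivalent of: my_prog <input_path 2>error_path | grep pattern"""
    with open(input_path, "rb") as fin, open(error_path, "wb") as ferr:
        # File handle 0 comes from input_path, file handle 2 goes to error_path.
        producer = subprocess.Popen(MY_PROG, stdin=fin,
                                    stdout=subprocess.PIPE, stderr=ferr)
        # The producer's standard output becomes the consumer's standard input.
        consumer = subprocess.Popen(grep_cmd(pattern), stdin=producer.stdout,
                                    stdout=subprocess.PIPE)
        producer.stdout.close()  # leave the read end to the consumer only
        matched, _ = consumer.communicate()
        producer.wait()
    return matched.decode()
```

Neither child program knows or cares where its three handles point; the parent sets them up before the programs start, just as the shell does.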
Re your comment:
When I open these files in /dev folder, how come I never get to see the output of a process running?
It's because they're not normal files. While UNIX presents everything as a file in a file system somewhere, that doesn't make it so at the lowest levels. Most files in the /dev hierarchy are either character or block devices, effectively a device driver. They don't have a size but they do have a major and minor device number.
When you open them, you're connected to the device driver rather than a physical file, and the device driver is smart enough to know that separate processes should be handled separately.
The same is true for the Linux /proc filesystem. Those aren't real files, just tightly controlled gateways to kernel information.
It would be more correct to say that stdin, stdout, and stderr are "I/O streams" rather
than files. As you've noticed, these entities do not live in the filesystem. But the
Unix philosophy, as far as I/O is concerned, is "everything is a file". In practice,
that really means that you can use the same library functions and interfaces (printf,
scanf, read, write, select, etc.) without worrying about whether the I/O stream
is connected to a keyboard, a disk file, a socket, a pipe, or some other I/O abstraction.
Most programs need to read input, write output, and log errors, so stdin, stdout,
and stderr are predefined for you, as a programming convenience. This is only
a convention, and is not enforced by the operating system.
As a complement of the answers above, here is a sum up about Redirections:
EDIT: This graphic is not entirely correct.
The first example does not use stdin at all, it's passing "hello" as an argument to the echo command.
The graphic also says 2>&1 has the same effect as &> however
ls Documents ABC > dirlist 2>&1
#does not give the same output as
ls Documents ABC > dirlist &>
This is because &> requires a file to redirect to, and 2>&1 is simply sending stderr into stdout
I'm afraid your understanding is completely backwards. :)
Think of "standard in", "standard out", and "standard error" from the program's perspective, not from the kernel's perspective.
When a program needs to print output, it normally prints to "standard out". A program typically prints output to standard out with printf, which prints ONLY to standard out.
When a program needs to print error information (not necessarily exceptions, those are a programming-language construct, imposed at a much higher level), it normally prints to "standard error". It normally does so with fprintf, which accepts a file stream to use when printing. The file stream could be any file opened for writing: standard out, standard error, or any other file that has been opened with fopen or fdopen.
"standard in" is used when the program needs to read input, using fread or fgets, or getchar.
Any of these files can be easily redirected from the shell, like this:
cat /etc/passwd > /tmp/out # redirect cat's standard out to /tmp/out
cat /nonexistant 2> /tmp/err # redirect cat's standard error to /tmp/err
cat < /etc/passwd # redirect cat's standard input to /etc/passwd
Or, the whole enchilada:
cat < /etc/passwd > /tmp/out 2> /tmp/err
There are two important caveats: First, "standard in", "standard out", and "standard error" are just a convention. They are a very strong convention, but it's all just an agreement that it is very nice to be able to run programs like this: grep echo /etc/services | awk '{print $2;}' | sort and have the standard outputs of each program hooked into the standard input of the next program in the pipeline.
Second, I've given the standard ISO C functions for working with file streams (FILE * objects) -- at the kernel level, it is all file descriptors (int references to the file table) and much lower-level operations like read and write, which do not do the happy buffering of the ISO C functions. I figured to keep it simple and use the easier functions, but I thought all the same you should know the alternatives. :)
I think people saying stderr should be used only for error messages is misleading.
It should also be used for informative messages that are meant for the user running the command and not for any potential downstream consumers of the data (i.e. if you run a shell pipe chaining several commands you do not want informative messages like "getting item 30 of 42424" to appear on stdout as they will confuse the consumer, but you might still want the user to see them).
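As a small illustration of that split, here is a Python sketch; the function process_items and its upper-casing "work" are invented for the example. Data goes to standard output, progress messages go to standard error, so a downstream consumer reading the pipe only ever sees the data.

```python
import sys


def process_items(items, out=None, log=None):
    """Write results to out (stdout) and progress messages to log (stderr)."""
    out = sys.stdout if out is None else out
    log = sys.stderr if log is None else log
    total = len(items)
    for index, item in enumerate(items, start=1):
        # Meant for the person running the command, not for the next program.
        print(f"getting item {index} of {total}", file=log)
        # Meant for whoever consumes the output (a file, a pipe, a terminal).
        print(item.upper(), file=out)
```

Run in a pipeline, the "getting item ..." lines still show up on the terminal while only the upper-cased items travel down the pipe.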
See this for historical rationale:
"All programs placed diagnostics on the standard output. This had
always caused trouble when the output was redirected into a file, but
became intolerable when the output was sent to an unsuspecting
process. Nevertheless, unwilling to violate the simplicity of the
standard-input-standard-output model, people tolerated this state of
affairs through v6. Shortly thereafter Dennis Ritchie cut the Gordian
knot by introducing the standard error file. That was not quite enough.
With pipelines diagnostics could come from any of several programs
running simultaneously. Diagnostics needed to identify themselves."
stdin
Reads input through the console (e.g. Keyboard input).
Used in C with scanf
scanf(<formatstring>,<pointer to storage> ...);
stdout
Produces output to the console.
Used in C with printf
printf(<string>, <values to print> ...);
stderr
Produces 'error' output to the console.
Used in C with fprintf
fprintf(stderr, <string>, <values to print> ...);
Redirection
The source for stdin can be redirected. For example, instead of coming from keyboard input, it can come from a file (cat < file.txt), or another program (ps | grep <userid>).
The destinations for stdout, stderr can also be redirected. For example stdout can be redirected to a file: ls . > ls-output.txt, in this case the output is written to the file ls-output.txt. Stderr can be redirected with 2>.
Using ps -aux reveals current processes, all of which are listed in /proc/ as /proc/(pid)/. The open file descriptors of each process appear under /proc/(pid)/fd/, and calling cat /proc/(pid)/fd/0 reads from whatever that process's standard input is connected to, I think. So perhaps,
/proc/(pid)/fd/0 - Standard Input File
/proc/(pid)/fd/1 - Standard Output File
/proc/(pid)/fd/2 - Standard Error File
for example
But only worked this well for /bin/bash other processes generally had nothing in 0 but many had errors written in 2
For authoritative information about these files, check out the man pages, run the command on your terminal.
$ man stdout
But for a simple answer, each file is for:
stdout for a stream out
stdin for a stream input
stderr for printing errors or log messages.
Each unix program has each one of those streams.
stderr is not fully buffered, so use it when your application needs to print critical messages (errors, exceptions) to the console or to a file. Use stdout for general log output: because stdout is buffered, there is a chance that the application terminates before its messages are written out, which makes debugging harder.
A file with associated buffering is called a stream and is declared to be a pointer to a defined type FILE. The fopen() function creates certain descriptive data for a stream and returns a pointer to designate the stream in all further transactions. Normally there are three open streams with constant pointers declared in the <stdio.h> header and associated with the standard open files.
At program startup three streams are predefined and need not be opened explicitly: standard input (for reading conventional input), standard output (for writing conventional output), and standard error (for writing diagnostic output). When opened the standard error stream is not fully buffered; the standard input and standard output streams are fully buffered if and only if the stream can be determined not to refer to an interactive device.
https://www.mkssoftware.com/docs/man5/stdio.5.asp
Here is a lengthy article on stdin, stdout and stderr:
What Are stdin, stdout, and stderr on Linux?
To summarize:
Streams Are Handled Like Files
Streams in Linux—like almost everything else—are treated as though
they were files. You can read text from a file, and you can write text
into a file. Both of these actions involve a stream of data. So the
concept of handling a stream of data as a file isn’t that much of a
stretch.
Each file associated with a process is allocated a unique number to
identify it. This is known as the file descriptor. Whenever an action
is required to be performed on a file, the file descriptor is used to
identify the file.
These values are always used for stdin, stdout, and stderr:
0: stdin
1: stdout
2: stderr
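Those numbers can be used directly, without going through the named streams. A minimal Python sketch (CHILD and talk_to_child are names made up for illustration): the child only ever touches descriptors 0, 1 and 2 with raw reads and writes, and the parent sees the result arrive on the child's stdin, stdout and stderr.

```python
import subprocess
import sys

# Hypothetical child: reads from descriptor 0, writes to descriptors 1 and 2.
CHILD = (
    "import os\n"
    "data = os.read(0, 100)\n"
    "os.write(1, b'out:' + data)\n"
    "os.write(2, b'err:' + data)\n"
)


def talk_to_child(message):
    """Send message (bytes) to the child's stdin; return (stdout, stderr)."""
    proc = subprocess.run([sys.executable, "-c", CHILD],
                          input=message, capture_output=True)
    return proc.stdout, proc.stderr
```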
Ironically I found this question on stack overflow and the article above because I was searching for information on abnormal / non-standard streams. So my search continues.

Bash: infinite sleep (infinite blocking)

I use startx to start X which will evaluate my .xinitrc. In my .xinitrc I start my window manager using /usr/bin/mywm. Now, if I kill my WM (in order to f.e. test some other WM), X will terminate too because the .xinitrc script reached EOF.
So I added this at the end of my .xinitrc:
while true; do sleep 10000; done
This way X won't terminate if I kill my WM. Now my question: how can I do an infinite sleep instead of looping sleep? Is there a command which will kinda like freeze the script?
sleep infinity does exactly what it suggests and works without cat abuse.
tail does not block
As always: For everything there is an answer which is short, easy to understand, easy to follow and completely wrong. Here tail -f /dev/null falls into this category ;)
If you look at it with strace tail -f /dev/null you will notice that this solution is far from blocking! It's probably even worse than the sleep solution in the question, as it uses (under Linux) precious resources like the inotify system. Also, other processes which write to /dev/null make tail loop. (On my Ubuntu64 16.10 this adds several tens of syscalls per second on an already busy system.)
The question was for a blocking command
Unfortunately, there is no such thing ..
Read: I do not know any way to achieve this with the shell directly.
Everything (even sleep infinity) can be interrupted by some signal. So if you want to be really sure it does not exceptionally return, it must run in a loop, like you already did for your sleep. Please note, that (on Linux) /bin/sleep apparently is capped at 24 days (have a look at strace sleep infinity), hence the best you can do probably is:
while :; do sleep 2073600; done
(Note that I believe sleep loops internally for higher values than 24 days, but this means: It is not blocking, it is very slowly looping. So why not move this loop to the outside?)
.. but you can come quite near with an unnamed fifo
You can create something which really blocks as long as there are no signals sent to the process. The following uses bash 4, 2 PIDs and 1 fifo:
bash -c 'coproc { exec >&-; read; }; eval exec "${COPROC[0]}<&-"; wait'
You can check that this really blocks with strace if you like:
strace -ff bash -c '..see above..'
How this was constructed
read blocks if there is no input data (see some other answers). However, the tty (aka. stdin) usually is not a good source, as it is closed when the user logs out. Also it might steal some input from the tty. Not nice.
To make read block, we need to wait for something like a fifo which will never return anything. In bash 4 there is a command which can provide us with exactly such a fifo: coproc. If we also wait for the blocking read (which is our coproc), we are done. Sadly this needs to keep open two PIDs and a fifo.
Variant with a named fifo
If you do not mind using a named fifo, you can do this as follows:
mkfifo "$HOME/.pause.fifo" 2>/dev/null; read <"$HOME/.pause.fifo"
Not using a loop on the read is a bit sloppy, but you can reuse this fifo as often as you like and make the reads terminate using touch "$HOME/.pause.fifo" (if there are more than a single read waiting, all are terminated at once).
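The blocking and releasing behaviour of a named fifo can be tried out with a short Python sketch. It is an illustration only: the function name demo_fifo_release is made up, a temporary directory is used instead of $HOME, and a thread plays the role of the waiting read.

```python
import os
import tempfile
import threading


def demo_fifo_release():
    """Block a reader on a fifo, then release it by opening it for writing."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "pause.fifo")
        os.mkfifo(path)
        result = []

        def reader():
            # Like `read <fifo`: opening blocks until some writer shows up.
            with open(path, "rb") as fifo:
                result.append(fifo.read())

        thread = threading.Thread(target=reader)
        thread.start()
        # Like `touch fifo`: open for writing and close again without
        # writing anything, which hands the reader an end-of-file.
        with open(path, "wb"):
            pass
        thread.join(timeout=5)
        return result
```

The reader comes back with an empty read, which is why a single read without a loop ends as soon as anything opens the fifo for writing.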
Or use the Linux pause() syscall
For the infinite blocking there is a Linux kernel call, called pause(), which does what we want: Wait forever (until a signal arrives). However there is no userspace program for this (yet).
C
Creating such a program is easy. Here is a snippet to create a very small Linux program called pause which pauses indefinitely (needs diet, gcc etc.):
printf '#include <unistd.h>\nint main(){for(;;)pause();}' > pause.c;
diet -Os cc pause.c -o pause;
strip -s pause;
ls -al pause
python
If you do not want to compile something yourself, but you have python installed, you can use this under Linux:
python -c 'while 1: import ctypes; ctypes.CDLL(None).pause()'
(Note: Use exec python -c ... to replace the current shell, this frees one PID. The solution can be improved with some IO redirection as well, freeing unused FDs. This is up to you.)
How this works (I think): ctypes.CDLL(None) loads the standard C library and runs the pause() function in it within some additional loop. Less efficient than the C version, but works.
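As a side note, Python's standard library also exposes this call on Unix as signal.pause(), so ctypes is not strictly needed. A minimal sketch (the two function names are made up for illustration):

```python
import signal


def pause_until_signal():
    """Sleep until a signal has been delivered and handled, then return."""
    signal.pause()


def block_forever():
    """Keep pausing; only a signal that terminates the process ends this."""
    while True:
        signal.pause()
```

As with the C version, the loop is there because pause() returns whenever a handled signal arrives.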
My recommendation for you:
Stay at the looping sleep. It's easy to understand, very portable, and blocks most of the time.
Maybe this seems ugly, but why not just run cat and let it wait for input forever?
TL;DR: since GNU coreutils version 9, sleep infinity does the right thing on Linux systems. Previously (and in other systems) the implementation was to actually sleep the maximum time allowed, which is finite.
Wondering why this is not documented anywhere, I bothered to read the sources from GNU coreutils and I found it executes roughly what follows:
Use strtod from C stdlib on the first argument to convert 'infinity' to a double precision value. So, assuming IEEE 754 double precision the 64-bit positive infinity value is stored in the seconds variable.
Invoke xnanosleep(seconds) (found in gnulib), this in turn invokes dtotimespec(seconds) (also in gnulib) to convert from double to struct timespec.
struct timespec is just a pair of numbers: integer part (in seconds) and fractional part (in nanoseconds).
Naïvely converting positive infinity to integer would result in undefined behaviour (see §6.3.1.4 from C standard), so instead it truncates to TYPE_MAXIMUM(time_t).
The actual value of TYPE_MAXIMUM(time_t) is not set in the standard (even sizeof(time_t) isn't); so, for the sake of example let's pick x86-64 from a recent Linux kernel.
This is TIME_T_MAX in the Linux kernel, which is defined (time.h) as:
(time_t)((1UL << ((sizeof(time_t) << 3) - 1)) - 1)
Note that time_t is __kernel_time_t and time_t is long; the LP64 data model is used, so sizeof(long) is 8 (64 bits).
Which results in: TIME_T_MAX = 9223372036854775807.
That is: sleep infinity results in an actual sleep time of 9223372036854775807 seconds (about 2.9 × 10^11 years). And for 32-bit Linux systems (sizeof(long) is 4 (32 bits)): 2147483647 seconds (68 years; see also year 2038 problem).
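The arithmetic is easy to check with Python's arbitrary-precision integers. This is just a sketch of the formula above; the function names and the choice of a 365.25-day year are mine.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumption: Julian year


def time_t_max(sizeof_time_t):
    """Same formula as TIME_T_MAX: largest signed value in that many bytes."""
    return (1 << ((sizeof_time_t << 3) - 1)) - 1


def in_years(seconds):
    """Convert a number of seconds to (approximate) years."""
    return seconds / SECONDS_PER_YEAR
```

With sizeof_time_t set to 8 this gives 9223372036854775807 seconds, and with 4 it gives 2147483647 seconds, a little over 68 years.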
Edit: apparently the nanosleep function called is not directly the syscall, but an OS-dependent wrapper (also defined in gnulib).
There's an extra step as a result: for some systems where HAVE_BUG_BIG_NANOSLEEP is true the sleep is truncated to 24 days and then called in a loop. This is the case for some (or all?) Linux distros. Note that this wrapper may be not used if a configure-time test succeeds (source).
In particular, that would be 24 * 24 * 60 * 60 = 2073600 seconds (plus 999999999 nanoseconds); but this is called in a loop in order to respect the specified total sleep time. Therefore the previous conclusions remain valid.
In conclusion, the resulting sleep time is not infinite but high enough for all practical purposes, even if the resulting actual time lapse is not portable; that depends on the OS and architecture.
To answer the original question, this is obviously good enough but if for some reason (a very resource-constrained system) you really want to avoid a useless extra countdown timer, I guess the most correct alternative is to use the cat method described in other answers.
Edit: recent GNU coreutils versions will try to use the pause syscall (if available) instead of looping. The previous argument is no longer valid when targeting these newer versions in Linux (and possibly BSD).
Portability
This is an important and valid concern:
sleep infinity is a GNU coreutils extension not contemplated in POSIX. GNU's implementation also supports a "fancy" syntax for time durations, like sleep 1h 5.2s while POSIX only allows a positive integer (e.g. sleep 0.5 is not allowed).
Some compatible implementations: GNU coreutils, FreeBSD (at least from version 8.2?), Busybox (requires to be compiled with options FANCY_SLEEP and FLOAT_DURATION).
The strtod behaviour is C and POSIX compatible (i.e. strtod("infinity", 0) is always valid in C99-conformant implementations, see §7.20.1.3).
sleep infinity looks most elegant, but sometimes it doesn't work for some reason. In that case, you can try other blocking commands such as cat, read, tail -f /dev/null, grep a etc.
Let me explain why sleep infinity works though it is not documented. jp48's answer is also useful.
The most important thing: By specifying inf or infinity (both case-insensitive), you can sleep for the longest time your implementation permits (i.e. the smaller value of HUGE_VAL and TYPE_MAXIMUM(time_t)).
Now let's dig into the details. The source code of sleep command can be read from coreutils/src/sleep.c. Essentially, the function does this:
double s; //seconds
xstrtod (argv[i], &p, &s, cl_strtod); //`p` is not essential (just used for error check).
xnanosleep (s);
Understanding xstrtod (argv[i], &p, &s, cl_strtod)
xstrtod()
According to gnulib/lib/xstrtod.c, the call of xstrtod() converts string argv[i] to a floating point value and stores it to *s, using a converting function cl_strtod().
cl_strtod()
As can be seen from coreutils/lib/cl-strtod.c, cl_strtod() converts a string to a floating point value, using strtod().
strtod()
According to man 3 strtod, strtod() converts a string to a value of type double. The manpage says
The expected form of the (initial portion of the) string is ... or (iii) an infinity, or ...
and an infinity is defined as
An infinity is either "INF" or "INFINITY", disregarding case.
Although the document tells
If the correct value would cause overflow, plus or minus HUGE_VAL (HUGE_VALF, HUGE_VALL) is returned
, it is not clear how an infinity is treated. So let's see the source code gnulib/lib/strtod.c. What we want to read is
else if (c_tolower (*s) == 'i'
&& c_tolower (s[1]) == 'n'
&& c_tolower (s[2]) == 'f')
{
s += 3;
if (c_tolower (*s) == 'i'
&& c_tolower (s[1]) == 'n'
&& c_tolower (s[2]) == 'i'
&& c_tolower (s[3]) == 't'
&& c_tolower (s[4]) == 'y')
s += 5;
num = HUGE_VAL;
errno = saved_errno;
}
Thus, INF and INFINITY (both case-insensitive) are regarded as HUGE_VAL.
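You can also try this against the C library on your own system instead of reading the gnulib source. A small Python sketch using ctypes follows; it assumes Linux (or another system where ctypes.CDLL(None) exposes the C library), and c_strtod is a wrapper name made up for illustration.

```python
import ctypes
import math

# Assumption: CDLL(None) gives access to the C library, as on Linux.
libc = ctypes.CDLL(None)
libc.strtod.restype = ctypes.c_double
libc.strtod.argtypes = [ctypes.c_char_p, ctypes.POINTER(ctypes.c_char_p)]


def c_strtod(text):
    """Call the C library's strtod() on text, ignoring the end pointer."""
    return libc.strtod(text.encode(), None)
```

On an IEEE 754 system both c_strtod("inf") and c_strtod("INFINITY") should compare equal to math.inf.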
HUGE_VAL family
Let's use N1570 as the C standard. HUGE_VAL, HUGE_VALF and HUGE_VALL macros are defined in §7.12-3
The macro
HUGE_VAL
expands to a positive double constant expression, not necessarily representable as a float. The macros
HUGE_VALF
HUGE_VALL
are respectively float and long double analogs of HUGE_VAL.
HUGE_VAL, HUGE_VALF, and HUGE_VALL can be positive infinities in an implementation that supports infinities.
and in §7.12.1-5
If a floating result overflows and default rounding is in effect, then the function returns the value of the macro HUGE_VAL, HUGE_VALF, or HUGE_VALL according to the return type
Understanding xnanosleep (s)
Now we understand the essence of xstrtod(). From the explanations above, it is crystal-clear that xnanosleep(s) we've seen first actually means xnanosleep(HUGE_VAL).
xnanosleep()
According to the source code gnulib/lib/xnanosleep.c, xnanosleep(s) essentially does this:
struct timespec ts_sleep = dtotimespec (s);
nanosleep (&ts_sleep, NULL);
dtotimespec()
This function converts an argument of type double to an object of type struct timespec. Since it is very simple, let me cite the source code gnulib/lib/dtotimespec.c. All of the comments are added by me.
struct timespec
dtotimespec (double sec)
{
if (! (TYPE_MINIMUM (time_t) < sec)) //underflow case
return make_timespec (TYPE_MINIMUM (time_t), 0);
else if (! (sec < 1.0 + TYPE_MAXIMUM (time_t))) //overflow case
return make_timespec (TYPE_MAXIMUM (time_t), TIMESPEC_HZ - 1);
else //normal case (looks complex but does nothing technical)
{
time_t s = sec;
double frac = TIMESPEC_HZ * (sec - s);
long ns = frac;
ns += ns < frac;
s += ns / TIMESPEC_HZ;
ns %= TIMESPEC_HZ;
if (ns < 0)
{
s--;
ns += TIMESPEC_HZ;
}
return make_timespec (s, ns);
}
}
Since time_t is defined as an integral type (see §7.27.1-3), it is natural we assume the maximum value of type time_t is smaller than HUGE_VAL (of type double), which means we enter the overflow case. (Actually this assumption is not needed since, in all cases, the procedure is essentially the same.)
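To watch the overflow case happen, here is a rough Python port of dtotimespec(). It is a sketch, not the gnulib code: it assumes a 64-bit signed time_t (the TIME_T_MIN and TIME_T_MAX constants below), returns a plain (seconds, nanoseconds) tuple instead of struct timespec, and uses divmod in place of the C division, remainder and negative fix-up, which yields the same result.

```python
import math

TIMESPEC_HZ = 1_000_000_000
TIME_T_MAX = (1 << 63) - 1   # assumption: 64-bit signed time_t
TIME_T_MIN = -(1 << 63)


def dtotimespec(sec):
    """Convert seconds (float) to a (tv_sec, tv_nsec) pair, clamping overflow."""
    if not (TIME_T_MIN < sec):              # underflow case (also NaN)
        return (TIME_T_MIN, 0)
    if not (sec < 1.0 + TIME_T_MAX):        # overflow case, e.g. infinity
        return (TIME_T_MAX, TIMESPEC_HZ - 1)
    s = int(sec)                            # truncates toward zero, like C
    frac = TIMESPEC_HZ * (sec - s)
    ns = int(frac)
    ns += ns < frac                         # round a leftover fraction up
    carry, ns = divmod(ns, TIMESPEC_HZ)     # keeps 0 <= ns < TIMESPEC_HZ
    return (s + carry, ns)
```

Feeding it math.inf lands in the overflow branch and returns (TIME_T_MAX, 999999999), which is the timespec that nanosleep() ends up receiving.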
make_timespec()
The last wall we have to climb up is make_timespec(). Very fortunately, it is so simple that citing the source code gnulib/lib/timespec.h is enough.
_GL_TIMESPEC_INLINE struct timespec
make_timespec (time_t s, long int ns)
{
struct timespec r;
r.tv_sec = s;
r.tv_nsec = ns;
return r;
}
What about sending a SIGSTOP to itself?
This should pause the process until SIGCONT is received. Which is in your case: never.
kill -STOP "$$";
# grace time for signal delivery
sleep 60;
I recently had a need to do this. I came up with the following function that will allow bash to sleep forever without calling any external program:
snore()
{
local IFS
[[ -n "${_snore_fd:-}" ]] || { exec {_snore_fd}<> <(:); } 2>/dev/null ||
{
# workaround for MacOS and similar systems
local fifo
fifo=$(mktemp -u)
mkfifo -m 700 "$fifo"
exec {_snore_fd}<>"$fifo"
rm "$fifo"
}
read ${1:+-t "$1"} -u $_snore_fd || :
}
NOTE: I previously posted a version of this that would open and close the file descriptor each time, but I found that on some systems doing this hundreds of times a second would eventually lock up. Thus the new solution keeps the file descriptor between calls to the function. Bash will clean it up on exit anyway.
This can be called just like /bin/sleep, and it will sleep for the requested time. Called without parameters, it will hang forever.
snore 0.1 # sleeps for 0.1 seconds
snore 10 # sleeps for 10 seconds
snore # sleeps forever
There's a writeup with excessive details on my blog here
This approach will not consume any resources for keeping process alive.
while :; do :; done & kill -STOP $! && wait
Breakdown
while :; do :; done & Creates a dummy process in background
kill -STOP $! Stops the background process
wait Waits for the background process; this will block forever, because the background process was stopped before
Notes
works only from within a script file.
Instead of killing the window manager, try running the new one with --replace or -replace if available.
while :; do read; done
no waiting for child sleeping process.

Limiting syscall access for a Linux application

Assume a Linux binary foobar which has two different modes of operation:
Mode A: A well-behaved mode in which syscalls a, b and c are used.
Mode B: A things-gone-wrong mode in which syscalls a, b, c and d are used.
Syscalls a, b and c are harmless, whereas syscall d is potentially dangerous and could cause instability to the machine.
Assume further that which of the two modes the application runs is random: the application runs in mode A with probability 95 % and in mode B with probability 5 %. The application comes without source code so it cannot be modified, only run as-is.
I want to make sure that the application cannot execute syscall d. When executing syscall d the result should be either a NOOP or an immediate termination of the application.
How do I achieve that in a Linux environment?
Is the application linked statically?
If not, you may override some symbols, for example, let's redefine socket:
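#include <unistd.h>      /* write() */
#include <sys/socket.h>  /* prototype of the socket() being overridden */

/* Replacement for socket(): print a message and always report failure. */
int socket(int domain, int type, int protocol)
{
    write(1,"Error\n",6);
    return -1;
}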
Then build a shared library:
gcc -fPIC -shared test.c -o libtest.so
Let's run:
nc -l -p 6000
Ok.
And now:
$ LD_PRELOAD=./libtest.so nc -l -p 6000
Error
Can't get socket
What happens when you run with the variable LD_PRELOAD=./libtest.so? The symbols defined in libtest.so override those defined in the C library.
It seems that systrace does exactly what you need. From the Wikipedia page:
An application is allowed to make only those system calls specified as permitted in the policy. If the application attempts to execute a system call that is not explicitly permitted an alarm gets raised.
This is one possible application of sandboxing (specifically, Rule-based Execution). One popular implementation is SELinux.
You will have to write the policy that corresponds to what you want to allow the process to do.
That's exactly what seccomp-bpf is for. See an example how to restrict access to syscalls.

Resources