OpenMPI mpirun universe size

I may be misunderstanding something, but here is what I want to achieve with OpenMPI, starting with mpirun:
I want to create a single process, using the -np parameter to set the world size to 1.
I then want to set the universe size to some arbitrary number (for argument's sake, 10). How do I do this?
The following two calls:
MPI_Comm_size(MPI_COMM_WORLD, &world_size);
MPI_Attr_get(MPI_COMM_WORLD, MPI_UNIVERSE_SIZE, &universe_size,
&flag);
yield the output of world_size as 1 and universe_size as 1.

OK, so I found two ways of doing this:
Implicit: mpirun -np 1 -H localhost,localhost,...,localhost executable
Explicit: just assign a value to universe_size in the application itself; it will work fine.
Thank you to anyone who looked at this.
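For the implicit approach, typing localhost ten times by hand is error-prone. The following is a minimal Python sketch that builds the command line, assuming (as the answer above reports) that each repetition of a host in -H adds one to the universe size. The helper name mpirun_command and the ./executable path are hypothetical, chosen only for illustration.

```python
import shlex


def mpirun_command(executable, universe_size, host="localhost"):
    """Build an mpirun argument list that starts one process but lists
    `host` `universe_size` times in -H (hypothetical helper)."""
    hosts = ",".join([host] * universe_size)
    return ["mpirun", "-np", "1", "-H", hosts, executable]


# Print the command for a universe size of 3 so it can be inspected or pasted.
print(shlex.join(mpirun_command("./executable", 3)))
```

The list form can be passed directly to subprocess.run if the command should be launched from Python instead of printed.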

Related

Read from standard input with all MPI processes

So far I've been using OPEN(fid, FILE='IN', ...) and it seems that all MPI processes read the same file IN without interfering with each other.
Furthermore, to allow the input file to be chosen from among several, I simply made the IN file a symbolic link pointing to the desired input. This means that when I want to change the input file I have to run ln -sf desired-input IN before running the program (mpirun -n $np ./program).
I'd really like to be able to run the program as mpirun -n $np ./program < input-file. To do so I removed the OPEN statement and the corresponding CLOSE statement, and changed all READ(fid,*) statements to READ(INPUT_UNIT,*) (I'm using the ISO_FORTRAN_ENV module).
But, after all these edits, I realized that only one process (always rank 0, I noticed) reads from it, since all the others reach EOF immediately. Here is an MWE, using OpenMPI 2.0.1.
! cat main.f90
program main
  use, intrinsic :: iso_fortran_env
  use mpi
  implicit none
  integer :: myid, x, ierr, stat
  x = 12
  call mpi_init(ierr)
  call mpi_comm_rank(mpi_comm_world, myid, ierr)
  read(input_unit,*, iostat=stat) x
  if (is_iostat_end(stat)) write(output_unit,*) myid, "I'm out"
  if (.not. is_iostat_end(stat)) write(output_unit,*) myid, "I'm in", myid, x
  call mpi_finalize(ierr)
end program main
It can be compiled with mpifort -o main main.f90 and run with mpirun -np 4 ./main, which results in this output:
1 I'm out
2 I'm out
3 I'm out
17 this is my input from keyboard
0 I'm in 0 17
I know that MPI has proper routines to perform parallel I/O, but I've found nothing about reading from standard input.
You are seeing the expected behaviour with OpenMPI. By default, mpirun directs UNIX standard input to /dev/null on all processes except the MPI_COMM_WORLD rank 0 process. The MPI_COMM_WORLD rank 0 process inherits standard input from mpirun.
The option --stdin can be used to direct standard input to another process, but not to all of them.
Note also that the redirection of standard input isn't consistent across MPI implementations (the MPI standard doesn't specify it). For example, Intel MPI's mpirun has the -s option: mpirun -np 4 -s all ./main does allow all processes access to mpirun's standard input. There's also no guarantee that processes without that redirection will fail to read rather than wait.

Top command: How to stick to one unit (KB/KiB)

I'm using the top command in several distros to feed a Bash script. Currently I'm calling it with top -b -n1.
I'd prefer a unified output in KiB or KB. However, it will display large units in megabytes or gigabytes. Is there an option to avoid these large units?
Please consider the following example:
4911 root 20 0 274m 248m 146m S 0 12.4 0:07.19 example
Edit: To answer 123's question, I transform the columns and send them to a log monitoring appliance. If there's no alternative, I'll convert the units via awk beforehand as per this thread.
Consider cutting out the middleman top and reading directly from /proc/[1-9]*/statm. Each of those files consists of one line of numbers, of which the first three correspond to top's VIRT, RES and SHR, respectively, in units of pages. A page is normally 4096 B, so multiplying by 4 gives units of KiB.
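As a rough illustration of the statm approach, here is a minimal Python sketch. It asks the system for the page size instead of assuming 4096 B. The function names statm_to_kib and read_process_memory are made up for this example.

```python
import os


def statm_to_kib(statm_line, page_size):
    """Convert the first three statm fields (size, resident, shared),
    given in pages, to KiB. Returns (virt, res, shr)."""
    pages = [int(field) for field in statm_line.split()[:3]]
    virt, res, shr = (count * page_size // 1024 for count in pages)
    return virt, res, shr


def read_process_memory(pid="self"):
    """Read /proc/<pid>/statm (Linux only) and return (virt, res, shr) in KiB."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    with open(f"/proc/{pid}/statm") as statm:
        return statm_to_kib(statm.readline(), page_size)
```

Calling read_process_memory() with no argument reports on the current process; passing a PID such as 4911 reports on that process, provided it is readable.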
You need a config file. You can create it yourself as $HOME/.toprc or using top interactively. The latter is easy. You just need to press W while top is running in interactive mode.
But first you need to set top interactively to the state you want. To change the memory scale press e until you see what you want. (Then save with W.)
Either way, you need this set in your config: Task_mscale=0 for the lowest scale.

Open MPI, determine rank of process to send to

I have two different executables, each with a specific role. One of the two processes sends the other information by calling MPI_Isend. But how do I know the rank of the other process?
I found that when I run my stack as follows, exe1, the receiving process, always seems to have rank 0, and exe2 always seems to have rank 1. Therefore, if I send to rank 0 from exe2, the message is received. But am I missing anything here? It seems so complicated.
mpirun -np 1 exe1 : -np 1 exe2
Mapping of processes to ranks in Open MPI can be controlled with various CLI arguments to mpiexec with newer versions (like 1.7.x) supporting much finer control than older versions. By default ranks follow the order in which processes are placed in the slots provided. Therefore -np 1 exe1 : -np 1 exe2 will always result in exe1 being rank 0 and exe2 being rank 1 in MPI_COMM_WORLD. If you use -np 3 exe1 : -np 2 exe2 instead, you will get the following:
rank   executable
-----------------
0      exe1
1      exe1
2      exe1
3      exe2
4      exe2
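The default ordering described above can be sketched as a small Python function. This only models the rule stated in the answer (ranks are assigned in the order the executables appear on the command line); it does not query Open MPI, and the name rank_table is hypothetical.

```python
def rank_table(app_contexts):
    """Given [(executable, process_count), ...] in command-line order,
    return [(rank, executable), ...] under the default rank ordering."""
    table = []
    for executable, count in app_contexts:
        for _ in range(count):
            # The next free rank is simply the number of ranks assigned so far.
            table.append((len(table), executable))
    return table


# Models: mpirun -np 3 exe1 : -np 2 exe2
for rank, executable in rank_table([("exe1", 3), ("exe2", 2)]):
    print(rank, executable)
```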
It is also possible to start exe1 and exe2 as separate MPI jobs and make them connect to each other over an intercommunicator, but that is considered an advanced MPI topic.
Another solution would be to have the receiving process, exe1, first send a message containing its rank. When the second process listens for messages from any source with the tag of that message, it will receive the rank of the first process.

Some doubts with Shell Scripting.

I have a couple of requirements to satisfy using shell scripts. Since I am new to this area, I would certainly appreciate your help.
1) I have a script which invokes an env-function that asks for user input before proceeding with execution. I want my script to supply the answer. How can I implement this?
A bit of googling pointed me to the "expect" command, which is unfortunately not installed on my system. Is there any other way to achieve this?
2) The script should also find the total number of CPUs in my PC and append "-j(2*no. of CPU)" to my make command.
Could somebody please shed some light on how this can be done?
Thanks,
Sen
Since the first part has been answered, for the second part you can try something like
#!/bin/sh
# Count the logical processors listed in /proc/cpuinfo
cpu=$(grep -c '^processor' /proc/cpuinfo)
# Run twice as many make jobs as there are processors
jobs=$((cpu * 2))
make -j"$jobs"
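For comparison, the same calculation can be sketched in Python, which avoids parsing /proc/cpuinfo. This is only an illustration under the assumption that twice the logical CPU count is the wanted job count; make_command is a hypothetical helper name.

```python
import os


def make_command(cpu_count=None, factor=2):
    """Return a make argument list with -j set to factor * CPU count."""
    if cpu_count is None:
        # os.cpu_count() can return None when the count is unknown.
        cpu_count = os.cpu_count() or 1
    return ["make", f"-j{cpu_count * factor}"]


print(" ".join(make_command()))
```

The returned list can be handed to subprocess.run to actually start the build.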
I have another requirement like, the script should find the total number of CPUs in my pc
You can read the contents of /proc/cpuinfo or, even better, narrow down the output with the following command, which prints one line per processor (use grep -c to print only the count):
grep processor /proc/cpuinfo
For the first part, you can simply redirect input to that program. An example with checkinstall:
checkinstall < /path/to/file/with/answers
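Feeding prepared answers to a program's standard input can also be done without a separate answers file. Below is a minimal Python sketch using subprocess. The child command is a stand-in that reads one line and echoes it, not the asker's env-function, and run_with_answers is a hypothetical helper name.

```python
import subprocess
import sys


def run_with_answers(command, answers):
    """Run `command`, supplying each answer as one line on its stdin."""
    stdin_text = "".join(answer + "\n" for answer in answers)
    return subprocess.run(
        command, input=stdin_text, text=True, capture_output=True, check=True
    )


# Stand-in for an interactive program: reads one answer and reports it.
child = [sys.executable, "-c", "print('got: ' + input())"]
result = run_with_answers(child, ["y"])
print(result.stdout.strip())
```

As with shell redirection, this only works for programs that read their answers from standard input and not directly from the terminal.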

File output redirection in Linux

I have two programs, A and B. I can't change program A; I can only run it with some parameters. But I wrote B myself, and I can modify it any way I like.
Program A runs for a long time (20-40 hours) and during that time it writes output to a file, so that the file grows constantly and can be huge by the end of the run (around 100-200 GB). Program B then reads the file and calculates some stuff. The special property of the file is that its content is not correlated: I can divide the file in half and run calculations on each part independently, so I don't need to store all the data at once. I can calculate on the first part, then throw it away, calculate on the second one, and so on.
The problem is that I don't have enough space to store such big files. I wonder if it is possible to somehow pipe the output of A to B without storing all the data at once and without creating huge files. Is it possible to do something like that?
Thank you in advance, this is crucial for me now, Roman.
If program A supports it, simply pipe.
A | B
Otherwise, use a fifo.
mkfifo /tmp/fifo
ls -la > /tmp/fifo &
cat /tmp/fifo
EDIT: Check the pipe buffer size with ulimit -p (bash reports it in 512-byte blocks and may not allow changing it), and then:
cat /tmp/fifo | B
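Since B can be modified, it could also read the fifo itself. The following Python sketch shows the idea with stand-ins for both programs: a writer thread plays the role of A and a reader processes one record at a time without ever storing the whole stream. The function names and the one-number-per-line record format are assumptions made for the example.

```python
import os
import tempfile
import threading


def produce(path, n_lines):
    """Stand-in for program A: writes records to the fifo as it goes."""
    with open(path, "w") as fifo:
        for i in range(n_lines):
            fifo.write(f"{i}\n")


def consume(path):
    """Stand-in for program B: handles each record, then discards it."""
    total = 0
    with open(path) as fifo:
        for line in fifo:
            total += int(line)
    return total


def run_demo(n_lines=1000):
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "fifo")
        os.mkfifo(path)
        # Opening a fifo blocks until both ends are open, so the writer
        # runs in its own thread while the reader runs here.
        writer = threading.Thread(target=produce, args=(path, n_lines))
        writer.start()
        total = consume(path)
        writer.join()
    return total


if __name__ == "__main__":
    print(run_demo())
```

In the real setup, A would be started with the fifo path as its output file and B would open the same path for reading.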
It is possible to pipeline output of one program into another.
Read here to know the syntax and know-hows of Unix pipelining.
You can use socat, which can take stdout and feed it to the network, and take data from the network and feed it to stdin.
Named and unnamed pipes have the problem of a small (4k?) buffer, which means too many process context switches if you are writing multiple GB.
Or, if you are adventurous enough, you can LD_PRELOAD a shared object (.so) into process A and trap the open/write calls to do whatever you need.

Resources