My MPI code deadlocks when I run this simple code on 512 processes on a cluster. I am far from the memory limit. If I increase the number of processes to 2048, which is far too many for this problem, the code runs again. The deadlock occurs on the line containing MPI_File_write_all.
Any suggestions?
int count = imax*jmax*kmax;
// CREATE THE SUBARRAY
MPI_Datatype subarray;
int totsize [3] = {kmax, jtot, itot};
int subsize [3] = {kmax, jmax, imax};
int substart[3] = {0, mpicoordy*jmax, mpicoordx*imax};
MPI_Type_create_subarray(3, totsize, subsize, substart, MPI_ORDER_C, MPI_DOUBLE, &subarray);
MPI_Type_commit(&subarray);
// SET THE VALUE OF THE GRID EQUAL TO THE PROCESS ID FOR CHECKING
if(mpiid == 0) std::printf("Setting the value of the array\n");
for(int i=0; i<count; i++)
u[i] = (double)mpiid;
// WRITE THE FULL GRID USING MPI-IO
if(mpiid == 0) std::printf("Write the full array to disk\n");
char filename[] = "u.dump";
MPI_File fh;
if(MPI_File_open(commxy, filename, MPI_MODE_CREATE | MPI_MODE_WRONLY | MPI_MODE_EXCL, MPI_INFO_NULL, &fh))
    return 1;
// select noncontiguous part of 3d array to store the selected data
MPI_Offset fileoff = 0; // the offset within the file (header size)
char name[] = "native";
if(MPI_File_set_view(fh, fileoff, MPI_DOUBLE, subarray, name, MPI_INFO_NULL))
    return 1;
if(MPI_File_write_all(fh, u, count, MPI_DOUBLE, MPI_STATUS_IGNORE))
    return 1;
if(MPI_File_close(&fh))
    return 1;
Your code looks right upon quick inspection. I would suggest that you let your MPI-IO library help tell you what's wrong: instead of just returning on error, why don't you at least display the error? Here's some code that might help:
/* str is const because callers pass string literals */
static void handle_error(int errcode, const char *str)
{
    char msg[MPI_MAX_ERROR_STRING];
    int resultlen;
    MPI_Error_string(errcode, msg, &resultlen);
    fprintf(stderr, "%s: %s\n", str, msg);
    MPI_Abort(MPI_COMM_WORLD, 1);
}
Is MPI_SUCCESS guaranteed to be 0? I'd rather see
errcode = MPI_File_routine();
if (errcode != MPI_SUCCESS) handle_error(errcode, "MPI_File_open(1)");
Put that in and if you are doing something tricky like setting a file view with offsets that are not monotonically non-decreasing, the error string might suggest what's wrong.
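While you are adding error reporting, it can also help to sanity-check the arguments that define the file view. The sketch below is only an illustration and not necessarily the cause of this deadlock: subarray_args_valid is a hypothetical helper (not part of MPI) that checks the range constraints MPI_Type_create_subarray places on the sizes, subsizes and starts arrays.

```cpp
// Hypothetical helper (not an MPI call): returns true when the three arrays
// satisfy the range constraints of MPI_Type_create_subarray, i.e. for every
// dimension: size >= 1, 1 <= subsize <= size, 0 <= start <= size - subsize.
bool subarray_args_valid(int ndims, const int *totsize,
                         const int *subsize, const int *substart)
{
    for (int d = 0; d < ndims; d++) {
        if (totsize[d] < 1)
            return false;
        if (subsize[d] < 1 || subsize[d] > totsize[d])
            return false;
        if (substart[d] < 0 || substart[d] > totsize[d] - subsize[d])
            return false;
    }
    return true;
}
```

In the question's code this would be called as subarray_args_valid(3, totsize, subsize, substart) on every rank just before MPI_Type_create_subarray, printing mpiid when it returns false.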
Please Help!
I am using MPI (Message Passing Interface) in Python for a ring communication, which means that every rank sends to and receives from its neighbours. I know one way to realize this is by using, for instance, MPI.COMM_WORLD.issend() and MPI.COMM_WORLD.recv(); this is working and done.
Now I want to produce the same output in a different way, by using MPI.Topocomm.Neighbor_alltoallw, but this is not working. I wrote C code that works, so the same output can be reached with this function, but when I implement it in Python it does not work. Please find the C code and the Python code below.
The definition of the Function says (mpi4py Package for Python):
Neighbor_alltoallw(...)
Topocomm.Neighbor_alltoallw(self, sendbuf, recvbuf)
Neighbor All-to-All Generalized
I do not understand following things:
Why is recvbuf not a return value? It seems to be an argument here.
How can this be implemented for a ring communication in Python?
Thank you for your time and support!
my working C Code:
#include <stdio.h>
#include <mpi.h>
#define to_right 201
#define max_dims 1
int main (int argc, char *argv[])
{
int my_rank, size;
int snd_buf, rcv_buf;
int right, left;
int sum, i;
MPI_Comm new_comm;
int dims[max_dims],
periods[max_dims],
reorder;
MPI_Aint snd_displs[2], rcv_displs[2];
int snd_counts[2], rcv_counts[2];
MPI_Datatype snd_types[2], rcv_types[2];
MPI_Status status;
MPI_Request request;
MPI_Init(&argc, &argv);
/* Get process info. */
MPI_Comm_size(MPI_COMM_WORLD, &size);
/* Set cartesian topology. */
dims[0] = size;
periods[0] = 1;
reorder = 1;
MPI_Cart_create(MPI_COMM_WORLD, max_dims, dims, periods,
reorder,&new_comm);
/* Get coords */
MPI_Comm_rank(new_comm, &my_rank);
/* MPI_Cart_coords(new_comm, my_rank, max_dims, my_coords); */
/* Get nearest neighbour rank. */
MPI_Cart_shift(new_comm, 0, 1, &left, &right);
/* Compute global sum. */
sum = 0;
snd_buf = my_rank;
rcv_buf = -1000; /* unused value, should be overwritten by first MPI_Recv; only for test purpose */
rcv_counts[0] = 1; MPI_Get_address(&rcv_buf, &rcv_displs[0]); snd_types[0] = MPI_INT;
rcv_counts[1] = 0; rcv_displs[1] = 0 /*unused*/; snd_types[1] = MPI_INT;
snd_counts[0] = 0; snd_displs[0] = 0 /*unused*/; rcv_types[0] = MPI_INT;
snd_counts[1] = 1; MPI_Get_address(&snd_buf, &snd_displs[1]); rcv_types[1] = MPI_INT;
for( i = 0; i < size; i++)
{
/* Substituted by MPI_Neighbor_alltoallw() :
MPI_Issend(&snd_buf, 1, MPI_INT, right, to_right,
new_comm, &request);
MPI_Recv(&rcv_buf, 1, MPI_INT, left, to_right,
new_comm, &status);
MPI_Wait(&request, &status);
*/
MPI_Neighbor_alltoallw(MPI_BOTTOM, snd_counts, snd_displs, snd_types,
MPI_BOTTOM, rcv_counts, rcv_displs, rcv_types, new_comm);
snd_buf = rcv_buf;
sum += rcv_buf;
}
printf ("PE%i:\tSum = %i\n", my_rank, sum);
MPI_Finalize();
}
My not working Python Code:
from mpi4py import MPI
size = MPI.COMM_WORLD.Get_size()
my_rank = MPI.COMM_WORLD.Get_rank()
to_right =201
max_dims=1
dims = [max_dims]
periods=[max_dims]
dims[0]=size
periods[0]=1
reorder = True
new_comm=MPI.Intracomm.Create_cart(MPI.COMM_WORLD,dims,periods,True)
my_rank= new_comm.Get_rank()
left_right= MPI.Cartcomm.Shift(new_comm,0,1)
left=left_right[0]
right=left_right[1]
sum=0
snd_buf=my_rank
rcv_buf=-1000 #unused value, should be overwritten, only for test purpose
for counter in range(0,size):
    MPI.Topocomm.Neighbor_alltoallw(new_comm,snd_buf,rcv_buf)
    snd_buf=rcv_buf
    sum=sum+rcv_buf
print('PE ', my_rank,'sum=',sum)
My doubts are as follows :
1 : how to send 'str' from function 'fun' , So that i can display it in main function.
2 : And is the return type correct in the code ?
3 : the current code is displaying some different output.
char * fun(int *arr)
{
char *str[5];
int i;
for(i=0;i<5;i++)
{
char c[sizeof(int)] ;
sprintf(c,"%d",arr[i]);
str[i] = malloc(sizeof(c));
strcpy(str[i],c);
}
return str;
}
int main()
{
int arr[] = {2,1,3,4,5},i;
char *str = fun(arr);
for(i=0;i<5;i++)
{
printf("%c",str[i]);
}
return 0;
}
how to send 'str' from function 'fun' , So that i can display it in main function.
This is the way:
char* str = malloc( size );
if( str == NULL ) {
fprintf( stderr,"Failed to malloc\n");
}
/* Do stuff with str, use str[index],
* remember to free it in main*/
free(str);
And is the return type correct in the code ?
No. You probably need to return char**.
the current code is displaying some different output.
Consider explaining what you want to do and why. The way you have written it seems completely messed up to me. You're passing an array of integers but not its length, so how is fun() supposed to know the length of the array? Another problem is the array of pointers in fun(): it is a local variable, so it no longer exists once fun() returns.
You can't store the text of an int in a single char (compare their sizes), so I used a char array instead.
However, I'm not sure if this is what you want to do (it might be a quick and dirty way of doing it):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char**
fun(int *arr, int size)
{
    char **str = malloc( sizeof(char*)*size );
    if( str == NULL ) {
        fprintf( stderr, "Failed malloc\n");
        exit(EXIT_FAILURE);
    }
    int i;
    for(i=0;i<size;i++) {
        /* 12 bytes hold any 32-bit int as text: sign, 10 digits, '\0' */
        str[i] = malloc(12);
        if( str[i] == NULL ) {
            fprintf( stderr, "Failed malloc\n");
            exit(EXIT_FAILURE);
        }
        snprintf(str[i],12,"%d",arr[i]);
    }
    return str;
}
int
main()
{
int arr[] = {2,1,3,4,5},i;
char **str = fun(arr, 5);
for(i=0;i<5;i++) {
printf("%s\n",str[i]);
free(str[i]);
}
free(str);
return 0;
}
I made these changes to your code to get it working:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char **fun(int *arr)
{
char **str = malloc(sizeof(char *) * 5);
int i;
for(i = 0; i < 5; i++) {
if ((arr[i] >= 0) && (arr[i] <= 9)) {
char c[2] ;
sprintf(c, "%d", arr[i]);
str[i] = (char *) malloc(strlen(c) + 1);
strcpy(str[i],c);
}
}
return str;
}
int main()
{
int arr[] = {2, 1, 3, 4, 5}, i;
char **str = fun(arr);
for(i = 0; i < 5; i++) {
printf("%s", str[i]);
free(str[i]);
}
printf("\n");
free(str);
return 0;
}
Output
21345
I added a check to make sure that arr[i] is a single-digit number. Note that if a value fails this check, str[i] is left uninitialized, so main must not print or free that element. Also, returning a pointer to a stack variable results in undefined behavior, so I changed the code to allocate an array of strings. I don't check the return value of the malloc calls, which means this program could crash due to a NULL pointer dereference.
This solution differs from the others in that it attempts to answer your question based on the intended use.
how to send 'str' from function 'fun' , So that i can display it in main function.
First, you need to define a function that returns a pointer to array.
char (*fun(int arr[]))[]
Allocating variable-length strings doesn't buy you anything. The longest string you'll need for a 64-bit unsigned int is 20 digits. All you need is to allocate one array of 5 elements, each a fixed number of characters long. This sample uses 3 characters per element: up to 2 digits and 1 null character. You may adjust the length to suit your needs, for example 21 (20 digits and 1 null). Note that the allocation is done only once.
For readability on which values here are related to the number of digits including the terminator, I'll define a macro that you can modify to suit your needs.
#define NUM_OF_DIGITS 3
You can then use this macro in the whole code.
char (*str)[NUM_OF_DIGITS] = malloc(5 * NUM_OF_DIGITS);
Finally the receiving variable in main() can be declared and assigned the returned array.
char (*str)[NUM_OF_DIGITS] = fun(arr);
Your complete code should look like this:
Code
#include <stdio.h>
#include <stdlib.h>
#define NUM_OF_DIGITS 3
char (*fun(int arr[]))[]
{
    char (*str)[NUM_OF_DIGITS] = malloc(5 * NUM_OF_DIGITS);
    int i;
    for(i=0;i<5;i++)
    {
        snprintf(str[i],NUM_OF_DIGITS,"%d",arr[i]); //limit to NUM_OF_DIGITS-1 digits + null
    }
    return str;
}
int main()
{
int arr[] = {24,1,33,4,5},i;
char (*str)[NUM_OF_DIGITS] = fun(arr);
for(i=0;i<5;i++)
{
printf("%s",str[i]);
}
free(str);
return 0;
}
Output
2413345
With this method you only need to free the allocated memory once.
const int SIZE = 20;
struct Node { Node* next; };
std::atomic<Node*> head (nullptr);
void push (void* p)
{
Node* n = (Node*) p;
n->next = head.load ();
while (!head.compare_exchange_weak (n->next, n));
}
void* pop ()
{
Node* n = head.load ();
while (n &&
!head.compare_exchange_weak (n, n->next));
return n ? n : malloc (SIZE);
}
void thread_fn()
{
std::array<char*, 1000> pointers;
for (int i = 0; i < 1000; i++) pointers[i] = nullptr;
for (int i = 0; i < 10000000; i++)
{
int r = random() % 1000;
if (pointers[r] != nullptr) // allocated earlier
{
push (pointers[r]);
pointers[r] = nullptr;
}
else
{
pointers[r] = (char*) pop (); // allocate
// stamp the memory
for (int i = 0; i < SIZE; i++)
pointers[r][i] = 0xEF;
}
}
}
int main(int argc, char *argv[])
{
int N = 8;
std::vector<std::thread*> threads;
threads.reserve (N);
for (int i = 0; i < N; i++)
threads.push_back (new std::thread (thread_fn));
for (int i = 0; i < N; i++)
threads[i]->join();
}
What is wrong with this usage of compare_exchange_weak? The above code crashes about 1 in 5 times using clang++ (Mac OS X).
At the time of the crash, head.load() holds "0xEFEFEFEFEF". pop is like malloc and push is like free. Each of the 8 threads randomly allocates or deallocates memory from head.
It could be a nice lock-free allocator, but the ABA problem arises:
A: Assume that some thread1 executes pop(), which reads the current value of head into the variable n, but immediately after this the thread is preempted and a concurrent thread2 executes a full pop() call; that is, it reads the same value from head and performs a successful compare_exchange_weak.
B: Now the object referred to by n in thread1 no longer belongs to the list and can be modified by thread2. So n->next is garbage in general: reading from it can return any value. For example, it can be 0xEFEFEFEFEF, where the first 5 bytes are the stamp (EF), which has been written by thread2, and the last 3 bytes are still 0, from nullptr. (The total value is interpreted numerically in little-endian order.) It seems that, because the value of head has changed, thread1 will fail its compare_exchange_weak call, but...
A: The concurrent thread2 push()es the resulting pointer back into the list. So thread1 sees the initial value of head and performs a successful compare_exchange_weak, which writes an incorrect value into head. The list is corrupted.
Note that the problem is more than the possibility that another thread can modify the content of n->next. The problem is that the value of n->next is no longer coupled with the list. So even if it is not modified concurrently, it becomes invalid (as a replacement for head) when, for example, other threads pop() 2 elements from the list but push() back only the first of them. (n->next will then point to the second element, which no longer belongs to the list.)
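One common way around ABA is to pair the head with a version tag that changes on every successful update, so a stale head value fails the compare-exchange even when the same node is back on top. The sketch below is an illustration under assumptions that are not in the question: nodes live in a fixed pool that is never freed, they are addressed by 32-bit index rather than by pointer, and the names pool, tagged_head, push_index and pop_index are hypothetical. The user payload would have to live outside the next field so stamping it cannot overwrite the link, and the tag can in principle wrap around after 2^32 updates.

```cpp
#include <atomic>
#include <cstdint>

constexpr std::uint32_t POOL_SIZE = 1024;     // assumed fixed pool size
constexpr std::uint32_t NIL = 0xFFFFFFFFu;    // index meaning "no node"

// next is atomic because a thread may read it after the node was popped
struct PoolNode { std::atomic<std::uint32_t> next{NIL}; };

PoolNode pool[POOL_SIZE];                     // nodes are never freed

// Low 32 bits: index of the top node. High 32 bits: version tag.
std::atomic<std::uint64_t> tagged_head{NIL};

static std::uint64_t pack(std::uint32_t idx, std::uint32_t tag)
{
    return (static_cast<std::uint64_t>(tag) << 32) | idx;
}

void push_index(std::uint32_t idx)
{
    std::uint64_t old_head = tagged_head.load();
    std::uint64_t new_head;
    do {
        // link the node to the current top, then try to publish it
        pool[idx].next.store(static_cast<std::uint32_t>(old_head));
        new_head = pack(idx, static_cast<std::uint32_t>(old_head >> 32) + 1);
    } while (!tagged_head.compare_exchange_weak(old_head, new_head));
}

// Returns NIL when the stack is empty.
std::uint32_t pop_index()
{
    std::uint64_t old_head = tagged_head.load();
    for (;;) {
        std::uint32_t idx = static_cast<std::uint32_t>(old_head);
        if (idx == NIL)
            return NIL;
        std::uint32_t next = pool[idx].next.load();
        std::uint64_t new_head =
            pack(next, static_cast<std::uint32_t>(old_head >> 32) + 1);
        // fails if any push or pop happened in between, because the tag moved
        if (tagged_head.compare_exchange_weak(old_head, new_head))
            return idx;
    }
}
```

With this layout the A-B-A sequence above no longer succeeds: when thread2 pops and pushes the same node back, the tag has advanced by two, so thread1's compare-exchange sees a different 64-bit value and retries with a fresh next.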
I am trying to calculate row count from a large file based on presence of a certain character and would like to use StreamReader and ReadBlock - below is my code.
protected virtual long CalculateRowCount(FileStream inStream, int bufferSize)
{
long rowCount=0;
String line;
inStream.Position = 0;
TextReader reader = new StreamReader(inStream);
char[] block = new char[4096];
const int blockSize = 4096;
int indexer = 0;
int charsRead = 0;
long numberOfLines = 0;
int count = 1;
do
{
charsRead = reader.ReadBlock(block, indexer, block.Length * count);
indexer += blockSize ;
numberOfLines = numberOfLines + string.Join("", block).Split(new string[] { "&ENDE" }, StringSplitOptions.None).Length;
count ++;
} while (charsRead == block.Length);//charsRead !=0
reader.Close();
fileRowCount = rowCount;
return rowCount;
}
But I get error
Offset and length were out of bounds for the array or count is greater than the number of elements from index to the end of the source collection.
I am not sure what is wrong... Can you help. Thanks ahead!
For one, read the StreamReader.ReadBlock() documentation carefully (http://msdn.microsoft.com/en-us/library/system.io.streamreader.readblock.aspx) and compare it with what you're doing:
The 2nd argument (index) is the position in the buffer at which writing starts, so it must lie within the block you're passing in. You increase it by the block size on every iteration, so it exceeds the array bounds after the first one. Since it looks like you want to reuse the memory block, pass 0 here.
The 3rd argument (count) indicates how many characters to read into your memory block. index + count must not exceed the length of the block; passing block.Length * count violates this from the second iteration on, which matches the exception you are seeing.
ReadBlock() returns the number of characters actually read, but you increment indexer as if it will always return exactly the size of the block (the last block of a file is usually shorter).
I am new to C++ programming. I have to call a function with the following arguments:
int Start (int argc, char **argv).
When I try to call the above function with the code below, I get run-time exceptions. Can someone help me resolve this problem?
char * filename=NULL;
char **Argument1=NULL;
int Argument=0;
int j = 0;
int k = 0;
int i=0;
int Arg()
{
filename = "Globuss -dc bird.jpg\0";
for(i=0;filename[i]!=NULL;i++)
{
if ((const char *)filename[i]!=" ")
{
Argument1[j][k++] = NULL; // Here I get An unhandled
// exception of type
//'System.NullReferenceException'
// occurred
j++;
k=0;
}
else
{
(const char )Argument1[j][k] = filename [j]; // Here I also i get exception
k++;
Argument++;
}
}
Argument ++;
return 0;
}
Start (Argument,Argument1);
Two things:
char **Argument1=NULL;
This is a pointer to a pointer; you need to allocate memory for it before you use it.
Argument1 = new char*[10];
for(i=0; i<10; ++i) Argument1[i] = new char[32]; // 32 is an arbitrary example size
Don't forget to delete[] in the same style (each Argument1[i] first, then Argument1 itself).
You appear not to have allocated any memory for your arrays; you just have a NULL pointer.
char * filename=NULL;
char **Argument1=NULL;
int Argument=0;
int j = 0;
int k = 0;
int i=0;
int Arg()
{
filename = "Globuss -dc bird.jpg\0";
//I don't know why you have 2D here, you are going to need to allocate
//sizes for both parts of the 2D array
Argument1 = new char *[TotalFileNames];
for(int x = 0; x < TotalFileNames; x++)
    Argument1[x] = new char[SIZE_OF_WHAT_YOU_NEED];
for(i=0;filename[i]!='\0';i++)
{
    if (filename[i]==' ') // compare characters, not pointers
    {
        Argument1[j][k] = '\0'; // terminate the current word
        j++;
        k=0;
        Argument++;
    }
    else
    {
        Argument1[j][k] = filename[i]; // copy the current character
        k++;
    }
}
Argument1[j][k] = '\0'; // terminate the last word
Argument ++;
return 0;
}
The first thing you have to do is to find the number of strings you will have. That's easily done with something like:
int len = strlen(filename);
int numwords = 1;
for(i = 0; i < len; i++) {
if(filename[i] == ' ') {
numwords++;
// eating up all spaces to not count following ' '
// dont checking if i exceeds len, because it will auto-stop at '\0'
while(filename[i] == ' ') i++;
}
}
In the above code i assume there will be at least one word in the filename (i.e. it wont be an empty string).
Now you can allocate memory for Argument1.
Argument1 = new char *[numwords];
After that you have two options:
use strtok (http://www.cplusplus.com/reference/clibrary/cstring/strtok/)
implement your function to split a string
That can be done like this:
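int i,cur,last;
for(i = last = cur = 0; cur <= len; cur++) {
    // a word ends at a space or at the terminating '\0' (cur == len)
    if(filename[cur] == ' ' || filename[cur] == '\0') {
        if(last < cur) {
            Argument1[i] = new char[cur-last+1]; // +1 for string termination '\0'
            strncpy(Argument1[i], &filename[last], cur-last);
            Argument1[i][cur-last] = '\0'; // strncpy does not terminate here
            i++; // move on to the next word slot
        }
        last = cur + 1; // the next word starts after this separator
    }
}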
The above code is not optimized, i just tried to make it as easy as possible to understand.
I also did not test it, but it should work. Assumptions i made:
string is null terminated
there is at least 1 word in the string.
Also whenever im referring to a string, i mean a char array :P
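If you are free to use the C++ standard library, the counting, allocating and copying steps above can be left to std::string and std::vector. This is a minimal sketch under that assumption; split_words and make_argv are hypothetical helper names, and the resulting pointers stay valid only while the vector of words is alive and unmodified.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits a command line on whitespace.
// Runs of spaces are skipped, so no empty words are produced.
std::vector<std::string> split_words(const std::string &line)
{
    std::vector<std::string> words;
    std::istringstream in(line);
    std::string word;
    while (in >> word)
        words.push_back(word);
    return words;
}

// Hypothetical helper: builds an argv-style array that points into words.
// No characters are copied, so words must outlive the returned vector.
std::vector<char *> make_argv(std::vector<std::string> &words)
{
    std::vector<char *> argv;
    for (std::string &w : words)
        argv.push_back(&w[0]);
    argv.push_back(nullptr);  // conventional argv terminator
    return argv;
}
```

With these, the call from the question becomes Start(static_cast<int>(words.size()), argv.data()); after words = split_words("Globuss -dc bird.jpg") and argv = make_argv(words), and there is nothing to delete by hand.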
Some mistakes i noticed in your code:
In C/C++, " " is a string literal: an array of const char that contains a space and a terminating '\0'.
If you compare it with another " " using == or !=, you compare the pointers to them, not the contents. They may (and probably will) be different. Use strcmp (http://www.cplusplus.com/reference/clibrary/cstring/strcmp/) to compare strings, or compare a single character against ' ' (single quotes).
You should learn how to allocate memory dynamically. In C you can do it with malloc, in C++ with malloc or new (better to use new instead of malloc).
Hope i helped!
PS if there is an error in my code tell me and ill fix it.