Writing array of Struct to a binary file

I don't understand what is wrong with this program. I define a pointer to an array of structures, malloc enough memory for it, and initialize the array elements. I then use fwrite to write the array to a binary file and try to read it back into another pointer to a similar array, which also has enough memory malloc'ed for it.
#include<stdio.h>
typedef struct ss{
int *p;
char c;
double d;
char g;
float f;
} dd;
main(){
dd (*tt)[5];
int i=0,a[5]={4,1,6,9,3};
tt=malloc(sizeof(struct ss[5]));
for(i=0;i<5;i++){
tt[i]->p=malloc(sizeof(int));
tt[i]->p=&a[i];
tt[i]->c=(char)('a'+i);
tt[i]->d=(double)(5.234234+i);
tt[i]->g=(char)('A'+i);
tt[i]->f=(float)(15.234234+i);
}
FILE *F;
F=fopen("myfile","w+b");
size_t l;
l=fwrite(tt,sizeof(*tt),1,F);
fseek(F,0,SEEK_SET);
//printf("sizeof(dd)=%d sizeof(*tt) =%d bytes written %d\n",sizeof(dd),sizeof(*tt),l);
dd (*xx)[5];
xx=malloc(sizeof(struct ss[5]));
l=fread(xx,sizeof(*xx),1,F);
for(i=0;i<5;i++){
printf("%d, %c,%f,%c,%f\n",*(xx[i]->p),xx[i]->c,xx[i]->d,xx[i]->g,xx[i]->f);
}
printf("Date Read %d \n",l);
for(i=0;i<5;i++){
free(xx[i]->p);
}
free(xx);
free(tt);
fclose(F);
remove("myfile");
}
Output:
4,a,5.234234,A,15.234234
Segmentation fault

You weren't writing your data where you thought you were, because you were accessing tt incorrectly. The incorrect access was consistent, so you could read back the first record, but the second record was nowhere near where you thought it was: it was being written into memory you never allocated, and it was never saved. Trying to access the reloaded data shows this.
Additionally, the int* in your struct can't be written out meaningfully this way. fwrite and fread cannot follow your int*, because they treat your struct as a raw byte pattern. The pointer value is faithfully reconstructed, but in a separate run of the program it would point to an arbitrary chunk of memory that you never set up. In this particular program the pointers would remain valid, because the file is written and read back within the same process while the pointed-to data is still alive, but that is not a realistic scenario for file I/O. There's another StackOverflow question that explains this bug in more detail.
Anyway, here's the much bigger problem with how you're accessing memory, with other lines removed:
dd (*tt)[5];
//...
tt=malloc(sizeof(struct ss[5]));
for(i=0;i<5;i++){
tt[i]->p=malloc(sizeof(int));
tt[i]->p=&a[i];
//...
}
C declarations are read with The Clockwise Spiral Rule, so let's look at what we've said about tt and compare it to how we're using it.
tt is the variable name. To its right is a closing parenthesis, so we keep processing the current scope. We encounter a *, then the matching parenthesis, then a static array size, then a type. Using The Clockwise Spiral Rule, tt is a pointer to an array (size 5) of dd. This means that if you dereference tt (using (*tt)), you get a dd[5], which in most expressions decays to a pointer to the first dd in a block of memory large enough to hold all five structures. More importantly, that's what you've said it is. C is not very picky here: an array silently decays to a pointer, and that's why your code compiles even though it doesn't access the memory you meant.
Your malloc statement is correct: it initializes tt with a memory location that the allocator promises has enough space for your five struct ss. Because sizeof a struct already includes any trailing padding, and arrays add no header or gaps between elements, a 5-element array of struct ss is guaranteed to be exactly five times the size of a single struct ss, so you could also have written malloc(5 * sizeof(dd)); either way of writing it is fine.
But let's look at what happens here:
tt[i]->p=malloc(sizeof(int));
Uh-oh. tt is a pointer to a single array of dd, but you've just indexed it as if it pointed to the first of many such arrays.
What you wanted:
Dereference tt
Find the ith element in the resulting array of dd
Go to field p
Assign it a pointer to space for an int
What you actually got:
Find the ith element in an array of arrays of dd (each one a dd[5])
Let that dd[5] decay to a pointer to its first dd
Go indirectly to field p
Assign it a pointer to space for an int
When i is 0, this works properly, because the zeroth element in an array and the array itself are in the same location. (An array has no header, and C lets an array decay to a pointer to its first element in most expressions, which is why this compiles at all.)
When i is not 0, you make an enormous mess of memory. Now you are writing to whatever memory happens to follow your allocation! tt points to a single dd[5], but tt[i] asks for the ith dd[5] after it; C believed you, added i array-widths to the address, and performed all those operations there. You're using arrays exactly where you should be using pointers, and pointers exactly where you should be using arrays.
You only write to the memory you allocated for element 0. Beyond that, you're writing into unrelated memory, and it's luck (bad luck, in your case) that kept your program from crashing right there. (If it had, you'd have had an easier time finding this as the guilty line.) When you fwrite your allocated memory, the first element is valid, the rest is garbage, and your fread results in a data structure that has one valid element, then random heap garbage that resulted in a crash when you tried to dereference a pointer (that would only be valid because the program didn't end).
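To make the difference in step size concrete, here is a minimal sketch that measures how far tt[i] and (*tt)[i] move per index. The function names are made up for illustration, and the struct is copied from the question. A stack array of two dd[5] is used only so that tt + 1 stays in bounds.

```c
#include <stddef.h>

typedef struct ss {
    int *p;
    char c;
    double d;
    char g;
    float f;
} dd;

/* Bytes between tt[0] and tt[1] when tt is a pointer to a dd[5].
   (hypothetical helper, not part of the original program) */
size_t stride_pointer_to_array(void)
{
    dd block[2][5];               /* two whole arrays, so tt + 1 is valid */
    dd (*tt)[5] = &block[0];
    return (size_t)((char *)(tt + 1) - (char *)tt);
}

/* Bytes between (*tt)[0] and (*tt)[1], i.e. two neighbouring structs.
   (hypothetical helper, not part of the original program) */
size_t stride_element(void)
{
    dd block[2][5];
    dd (*tt)[5] = &block[0];
    return (size_t)((char *)&(*tt)[1] - (char *)&(*tt)[0]);
}
```

The first function should report sizeof(dd[5]) and the second sizeof(dd), so each tt[i] step in the question jumps five structs at a time and leaves the allocation as soon as i is 1.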
Here's the right way to access your pointer-to-array:
(*tt)[i].p=malloc(sizeof(int));...
Also, you're allocating memory, then immediately forgetting your only reference to it, which is a memory leak, since you're overwriting the pointer with a reference to the static array you're initializing everything with. Use this instead:
*((*tt)[i].p)=a[i]
I strongly encourage you to study A Tutorial on Pointers and Arrays in its entirety. It will help you avoid this class of issue in the future.
Be aware that you are reading xx incorrectly when printing its contents in exactly the same manner.
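Putting both fixes together, a sketch of the initialization loop could look like the following. The typedef dd5 and the helper names make_records and free_records are my own additions for readability; the field values are the ones from the question.

```c
#include <stdlib.h>

typedef struct ss {
    int *p;
    char c;
    double d;
    char g;
    float f;
} dd;

typedef dd dd5[5];   /* hypothetical alias: "array of 5 dd" */

/* Allocates one dd[5] and gives every element its own heap int.
   Returns NULL if any allocation fails. */
dd5 *make_records(const int *a)
{
    dd5 *tt = malloc(sizeof *tt);
    if (tt == NULL)
        return NULL;
    for (int i = 0; i < 5; i++) {
        (*tt)[i].p = malloc(sizeof(int));
        if ((*tt)[i].p == NULL) {
            while (i-- > 0)           /* undo the allocations made so far */
                free((*tt)[i].p);
            free(tt);
            return NULL;
        }
        *((*tt)[i].p) = a[i];         /* store the value, keep the pointer */
        (*tt)[i].c = (char)('a' + i);
        (*tt)[i].d = 5.234234 + i;
        (*tt)[i].g = (char)('A' + i);
        (*tt)[i].f = (float)(15.234234 + i);
    }
    return tt;
}

/* Frees each element's int and then the array itself. */
void free_records(dd5 *tt)
{
    if (tt == NULL)
        return;
    for (int i = 0; i < 5; i++)
        free((*tt)[i].p);
    free(tt);
}
```

Every access goes through (*tt)[i], so all five elements land inside the allocated block, and each p keeps pointing at the int that was malloc'ed for it, which makes the later free calls valid.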

Your pointer usage is incorrect. In this code snippet:
dd (*xx)[5];
xx=malloc(sizeof(struct ss[5]));
l=fread(xx,sizeof(*xx),1,F);
for(i=0;i<5;i++){
printf("%d, %c,%f,%c,%f\n",*(xx[i]->p),xx[i]->c,xx[i]->d,xx[i]->g,xx[i]->f);
}
You are declaring xx as a pointer to an array of 5 'dd' structures. This is where it gets weird. It's a pointer to five structures and not an array of five structures.
It would look something like this in memory:
dd[0] = [{p, c, d, g, f}, {p, c, d, g, f}, {p, c, d, g, f}, {p, c, d, g, f}, {p, c, d, g, f}]
dd[1] = [{p, c, d, g, f}, {p, c, d, g, f}, {p, c, d, g, f}, {p, c, d, g, f}, {p, c, d, g, f}]
...
dd[4] = [{p, c, d, g, f}, {p, c, d, g, f}, {p, c, d, g, f}, {p, c, d, g, f}, {p, c, d, g, f}]
Instead of the intended:
dd[0] = {p, c, d, g, f}
dd[1] = {p, c, d, g, f}
...
dd[4] = {p, c, d, g, f}
As you iterate from 0 to 4, each array access advances sizeof(dd[5]) bytes in memory instead of sizeof(dd) bytes. Take out the extra pointer.
dd* xx;
xx = (dd*)malloc(sizeof(dd) * 5);
l = fread(xx, sizeof(dd), 5, F);
for(i = 0; i < 5; ++i) {
/* p is still an int* at this point, so dereference it to print the int */
printf("%d, %c, %f, %c, %f\n", *(xx[i].p), xx[i].c, xx[i].d, xx[i].g, xx[i].f);
}
Additionally you have a problem with your structure. If it is meant to be directly written to disk like this, it cannot contain pointers. Thus your 'int *p;' member needs to instead be 'int p;'. Otherwise if you read this file from a separate application, the pointer you stored will not point at the integer anymore, but at unallocated memory.
Writing application:
int *p = 0x12345 ---> 5
0x12345 gets stored in the file for p.
Writing application reads the file.
int *p = 0x12345 ---> 5
The pointer still points at the same memory because it is still the same memory
layout.
New application reads the file.
int *p = 0x12345 ---> ?????
The pointer doesn't point to a known piece of memory because the memory layout
has changed in this new instance of the application. This could crash or
cause a security issue.
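A minimal sketch of the pointer-free variant follows. The type rec and the function round_trip are names I made up, and tmpfile() stands in for the question's fopen("myfile","w+b"). Note that a raw struct dump like this is still only readable by a program built with the same struct layout and byte order.

```c
#include <stdio.h>

/* Same fields as the question's struct, but p is stored by value. */
typedef struct rec {
    int p;
    char c;
    double d;
    char g;
    float f;
} rec;

/* Writes n records to a temporary binary file, rewinds, and reads them
   back into out. Returns the number of records read (0 on failure). */
size_t round_trip(const rec *in, rec *out, size_t n)
{
    FILE *F = tmpfile();              /* opened in "wb+" mode */
    size_t got = 0;

    if (F == NULL)
        return 0;
    if (fwrite(in, sizeof *in, n, F) == n) {
        rewind(F);
        got = fread(out, sizeof *out, n, F);
    }
    fclose(F);                        /* the temporary file is removed */
    return got;
}
```

Because the int now lives inside the record, the bytes in the file carry the actual value rather than an address, so a different process could read them back as well.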

Related

iterate through 2D vector, cant dereference?

I am trying to initialize two iterators to my 2D vector, one for the rows and one for the columns. I have done it this way:
vector<vector<int> > v;
vector<vector<int> >::iterator r;
vector<int>::iterator c;
r = v.begin();
c = r->begin();
and I get the following pop-up window when I run the code:
Debug Assertion Failed!
Expression: can't dereference value initialized vector iterator.
There is some problem with this statement:
c = r->begin();
But I can't see why.
Thanks

v is empty, so r doesn't point to a valid vector<int> instance (there is no instance to point to). You are essentially dereferencing v.end(), whereupon your program exhibits undefined behavior.

Static types and conversions

Suppose I have an algol-like language, with static types and the following piece of code:
a := b + c * d;
where a is a float, b an integer, c a double and d a long. The language will convert d to double to operate with c, and b to double to operate with the result of c*d. After that, the double result of b+c*d will be converted to float to assign it to a. But when does this happen? I mean, do all the conversions happen at runtime or at compile time?
And if I have:
int x; //READ FROM USER KEYBOARD.
if (x > 5) {
a:= b + c * d;
}
else {
a := b + c;
}
The above code has conditionals. If the compiler converts this at compile time, some portion may never run. Is this correct?

You cannot do a conversion at compile-time any more than you can do an addition at compile time (unless the compiler can determine the value of the variable, perhaps because it is actually constant).
The compiler can (and does) emit a program with instructions which add and multiply the value of variables. It also emits instructions which convert the type of a stored value into a different type prior to computation, if that is necessary.
Languages which do not have variable types fixed at compile-time do have to perform checks at runtime and conditionally convert values to different types. But I don't believe that is the case with any of the languages included in the general category of "Algol-like".
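As an illustration in C (used here as a stand-in for the Algol-like language in the question), the two functions below should compute the same thing. The function names are hypothetical. The first relies on implicit conversions; the second spells out the conversions the compiler selects at compile time and emits as instructions that run only when that statement is actually executed.

```c
/* Implicit conversions, as in the question's a := b + c * d. */
float mixed_implicit(int b, double c, long d)
{
    float a = b + c * d;
    return a;
}

/* The same computation with every conversion written explicitly. */
float mixed_explicit(int b, double c, long d)
{
    double product = c * (double)d;   /* long   -> double */
    double sum = (double)b + product; /* int    -> double */
    return (float)sum;                /* double -> float  */
}
```

In the if/else example from the question, each branch simply contains its own conversion instructions; whichever branch is not taken never executes them.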

How to allocate a string with c api of R?

I've asked a question here, and that led me to another question.
In R, there's no fundamental distinction between a string and a
character. A "string" is just a character variable that contains one
or more characters.
and
There is a distinction between a scalar character variable, and a
vector. A character vector is a set of strings stored as a single
object.
So I wonder how to allocate a string with the C API of R. For example, what do I get from:
result = Rf_allocVector(STRSXP, dst_size);
Is the result a scalar character variable or a vector? Or could I use another API for allocating a string?
Thanks.

We have that as a motivating example in our introductory vignette in the Rcpp package (and this was also published as a paper in JSS in 2011):
In the C API you must allocate a vector of STRSXP:
SEXP ab;
PROTECT(ab = allocVector(STRSXP, 2));
SET_STRING_ELT( ab, 0, mkChar("foo") );
SET_STRING_ELT( ab, 1, mkChar("bar") );
UNPROTECT(1);
which imposes on the programmer knowledge of PROTECT, UNPROTECT,
SEXP, allocVector, SET_STRING_ELT, and mkChar.
Whereas with Rcpp and
using the Rcpp::CharacterVector class, we can express the same code more concisely:
Rcpp::CharacterVector ab(2);
ab[0] = "foo";
ab[1] = "bar";

Are numbers, bools or nils garbage collected in Lua?

This article implies that all types beside numbers, bools and nil are garbage collected.
The field gc is used for the other values (strings, tables, functions, heavy userdata, and threads), which are those subject to garbage collection.
Would this mean under certain circumstances that overusing these non-gc types might result in memory leaks?

In Lua, you have actually 2 kinds of types: Ones which are always passed by value, and ones passed by reference ( as per chapter 2.1 in the Lua Manual ).
The ones you cite are all of the "passed-by-value" type, hence they are directly stored in a variable.
If you delete the variable, the value will be gone instantly.
So it will not start leaking memory, unless, of course, you keep generating new variables containing new values. But in that case it's your own fault ;).

In the article you linked to they write down the C code that shows how values are represented:
/*You can also find this in lobject.h in the Lua source*/
/*I paraphrased a bit to remove some macro magic*/
/*unions in C store one of the values at a time*/
union Value {
GCObject *gc; /* collectable objects */
void *p; /* light userdata */
int b; /* booleans */
lua_CFunction f; /* light C functions */
numfield /* numbers */
};
typedef union Value Value;
/*the _tt tag tells what kind of value is actually stored in the union*/
struct lua_TObject {
int _tt;
Value value_;
};
As you can see in here, booleans and numbers are stored directly in the TObject struct. Since they are not "heap-allocated" it means that they can never "leak" and therefore garbage collecting them would have made no sense.
One interesting thing to note, however, is that the garbage collector does not collect references created to things on the C side (userdata and C functions). These need to be managed manually from the C side, but that is to be expected, since in that case you are writing C instead of Lua.
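To illustrate the inline-versus-heap distinction, here is a small self-contained sketch of a tagged value. It is not Lua's actual code: the type tvalue, the enum, and the helper functions are invented for this example, and a plain malloc'ed string stands in for a collectable GCObject.

```c
#include <stdlib.h>
#include <string.h>

enum tag { TAG_NIL, TAG_BOOLEAN, TAG_NUMBER, TAG_STRING };

typedef struct {
    enum tag tt;          /* says which union member is valid */
    union {
        int b;            /* boolean, stored inline */
        double n;         /* number, stored inline */
        char *gc;         /* heap object, stands in for GCObject* */
    } value;
} tvalue;

tvalue make_number(double n)
{
    tvalue v;
    v.tt = TAG_NUMBER;
    v.value.n = n;
    return v;
}

tvalue make_boolean(int b)
{
    tvalue v;
    v.tt = TAG_BOOLEAN;
    v.value.b = (b != 0);
    return v;
}

/* Copies s to the heap; falls back to nil if allocation fails. */
tvalue make_string(const char *s)
{
    tvalue v;
    v.tt = TAG_NIL;
    v.value.gc = malloc(strlen(s) + 1);
    if (v.value.gc != NULL) {
        strcpy(v.value.gc, s);
        v.tt = TAG_STRING;
    }
    return v;
}

/* Only values that point at heap memory need reclaiming. */
int is_collectable(const tvalue *v)
{
    return v->tt == TAG_STRING;
}

/* Frees heap storage if there is any and resets the value to nil. */
void release(tvalue *v)
{
    if (is_collectable(v))
        free(v->value.gc);
    v->tt = TAG_NIL;
}
```

A number or boolean here occupies only the tvalue itself, so it disappears together with whatever variable or table slot holds it; only the string variant owns memory that something has to reclaim later.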

printf issue in linux

Following is a simple program to print formatted "1.2" on HP & Linux.
However, the behavior is different.
I do not want to make the question bigger but the program where this is actually occurring has a float value in a string, so using %f is not an option (even using sprintf).
Has anyone encountered this before? Which behavior is correct?
This should not be a compiler issue but still have tried it on gcc, icpc, icc, g++.
#include <stdio.h>
int main()
{
printf("%s = [%010s]\n", "[%010s]", "1.2");
return 0;
}
**HP:**
cc test2.c -o t ; ./t
[%010s] = [00000001.2]
**Linux:**
icc test2.c -o t ; ./t
[%010s] = [       1.2]
Edit: Thank you all very much for the responses :)

From the glibc printf(3) man page:
0 The value should be zero padded. For d, i, o, u, x, X, a, A, e,
E, f, F, g, and G conversions, the converted value is padded on
the left with zeros rather than blanks. If the 0 and - flags
both appear, the 0 flag is ignored. If a precision is given
with a numeric conversion (d, i, o, u, x, and X), the 0 flag is
ignored. For other conversions, the behavior is undefined.
So a 0 flag with s cannot be expected to pad the string with 0s on glibc-based systems.

According to the man page, the behaviour of the 0 flag for anything other than d, i, o, u, x, X, a, A, e, E, f, F, g, and G conversions is undefined. So both are fine.
EDIT: When I say "fine", I mean from the compiler/libc standpoint. From your application's point of view, the behaviour you're relying on (on both Linux & HP) is a bug and you should do your formatted printing correctly.
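If zero padding of a string is really what you need, one portable option is to do the padding yourself instead of relying on the 0 flag. The helper below is a sketch under that assumption; zero_pad is a name I made up, not a library function.

```c
#include <string.h>

/* Copies s into out, left-padded with '0' up to width characters.
   Strings already longer than width are copied unchanged.
   Returns 0 on success, -1 if out (outsz bytes) is too small. */
int zero_pad(char *out, size_t outsz, const char *s, size_t width)
{
    size_t len = strlen(s);
    size_t pad = width > len ? width - len : 0;

    if (pad + len + 1 > outsz)
        return -1;
    memset(out, '0', pad);            /* leading zeros */
    memcpy(out + pad, s, len + 1);    /* the string plus its terminator */
    return 0;
}
```

After zero_pad(buf, sizeof buf, "1.2", 10), printing buf with a plain %s should give the same text on every platform. Be aware that this pads blindly, so a string such as "-1.2" would end up with the zeros in front of the minus sign.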

If you don't want leading zero fill, omit the leading zero fill indicator:
printf("%s = [%10s]\n", "[%010s]", "1.2");
It is somewhat surprising that an implementation honors filling a string with zeros, but it is easily corrected.

Adding to what Ignacio Vazquez-Abrams said, according to the documentation for printf, the result of what you are doing is undefined behavior. The fact that two OSes produce different results is not unexpected.
In fact, compiling your code with gcc 4.5.2 on Ubuntu gives the following warning:
warning: '0' flag used with ‘%s’ gnu_printf format
