C Function to return a String resulting in corrupted top size - string

I am trying to write a program that calls upon an [external library (?)] (I'm not sure that I'm using the right terminology here) that I am also writing to clean up a provided string. For example, if my main.c program were to be provided with a string such as:
asdfFAweWFwseFL Wefawf JAWEFfja FAWSEF
it would call upon a function in externalLibrary.c (lets call it externalLibrary_Clean for now) that would take in the string, and return all characters in upper case without spaces:
ASDFFAWEWFWSEFLWEFAWFJAWEFFJAFAWSEF
The crazy part is that I have this working... so long as my string doesn't exceed 26 characters in length. As soon as I add a 27th character, I end up with an error that says
malloc(): corrupted top size.
Here is externalLibrary.c:
#include "externalLibrary.h"
#include <ctype.h>
#include <malloc.h>
#include <assert.h>
#include <string.h>
char * restrict externalLibrary_Clean(const char* restrict input) {
// first we define the return value as a pointer and initialize
// an integer to count the length of the string
char * returnVal = malloc(sizeof(input));
char * initialReturnVal = returnVal; //point to the start location
// until we hit the end of the string, we use this while loop to
// iterate through it
while (*input != '\0') {
if (isalpha(*input)) { // if we encounter an alphabet character (a-z/A-Z)
// then we convert it to an uppercase value and point our return value at it
*returnVal = toupper(*input);
returnVal++; //we use this to move our return value to the next location in memory
}
input++; // we move to the next memory location on the provided character pointer
}
*returnVal = '\0'; //once we have exhausted the input character pointer, we terminate our return value
return initialReturnVal;
}
int * restrict externalLibrary_getFrequencies(char * ar, int length){
static int freq[26];
for (int i = 0; i < length; i++){
freq[(ar[i]-65)]++;
}
return freq;
}
the header file for it (externalLibrary.h):
#ifndef LEARNINGC_EXTERNALLIBRARY_H
#define LEARNINGC_EXTERNALLIBRARY_H
#ifdef __cplusplus
extern "C" {
#endif
char * restrict externalLibrary_Clean(const char* restrict input);
int * restrict externalLibrary_getFrequencies(char * ar, int length);
#ifdef __cplusplus
}
#endif
#endif //LEARNINGC_EXTERNALLIBRARY_H
my main.c file from where all the action is happening:
#include <stdio.h>
#include "externalLibrary.h"
int main() {
char * unfilteredString = "ASDFOIWEGOASDGLKASJGISUAAAA";//if this exceeds 26 characters, the program breaks
char * cleanString = externalLibrary_Clean(unfilteredString);
//int * charDist = externalLibrary_getFrequencies(cleanString, 25); //this works just fine... for now
printf("\nOutput: %s\n", unfilteredString);
printf("\nCleaned Output: %s\n", cleanString);
/*for(int i = 0; i < 26; i++){
if(charDist[i] == 0){
}
else {
printf("%c: %d \n", (i + 65), charDist[i]);
}
}*/
return 0;
}
I'm extremely well versed in Java programming and I'm trying to translate my knowledge over to C as I wish to learn how my computer works in more detail (and have finer control over things such as memory).
If I were solving this problem in Java, it would be as simple as creating two class files: one called main.java and one called externalLibrary.java, where I would have static String Clean(string input) and then call upon it in main.java with String cleanString = externalLibrary.Clean(unfilteredString).
Clearly this isn't how C works, but I want to learn how (and why my code is crashing with corrupted top size)

The bug is this line:
char * returnVal = malloc(sizeof(input));
The reason it is a bug is that sizeof(input) is the size of the pointer itself, so the call requests only enough space to store a pointer: 8 bytes in a 64-bit program. What you want is to allocate enough space to store the modified string, which you can do with the following line:
char *returnVal = malloc(strlen(input) + 1);
So the other part of your question is why the program doesn't crash when your string is less than 26 characters. The reason is that malloc is allowed to give the caller slightly more than the caller requested.
In your case, the message "malloc(): corrupted top size" suggests that you are using glibc's malloc, which is the default on Linux. In a 64-bit process, that implementation always gives you at least 0x18 (24) usable bytes (the minimum chunk size of 0x20, minus 8 bytes for the size/status field). In the specific case where the allocation immediately precedes the "top" chunk, writing past the end of the allocation clobbers the size of "top".
If your cleaned string is longer than 23 (0x17) characters, you start to clobber the size/status field of the next chunk, because you also need 1 byte to store the trailing NUL terminator. Any result of 23 characters or fewer will not cause a problem.
As to why you didn't get an error with a 26-character string: one would have to see the exact program and input that did not crash to give a more precise answer. For example, if the 26-character input contained 3 blanks, the result would require only 26 + 1 - 3 = 24 bytes in the allocation, which would fit.
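If you want to see the rounding-up for yourself, here is a small sketch. It assumes glibc (or another libc that provides the non-standard malloc_usable_size in <malloc.h>); usable_bytes is a name I made up for illustration. Treat the number it reports as a diagnostic only. Writing past the size you asked for is still undefined behavior, even when the allocator happens to have handed out more.

```c
#include <malloc.h>   /* malloc_usable_size: a libc extension, not standard C */
#include <stdlib.h>

/* Hypothetical helper: reports how many bytes the allocator actually made
 * usable for a request of `request` bytes. Returns 0 if malloc fails. */
size_t usable_bytes(size_t request) {
    void *p = malloc(request);
    if (p == NULL) {
        return 0;
    }
    size_t usable = malloc_usable_size(p);
    free(p);
    return usable;
}
```

Calling usable_bytes(sizeof(char *)) from main and printing the result with %zu should show a value at least as large as the request. The exact figure is an implementation detail of the allocator and can differ between libc versions and platforms.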
If you are not interested in that level of detail, fixing the malloc call to request the proper amount will fix your crash.
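For completeness, here is a minimal sketch of the cleaning function with the allocation fixed. The name clean_copy is mine, not from your code, and I have added a NULL check and an unsigned char cast that your original did not have, so treat it as one possible version rather than the definitive one.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of input containing only its letters,
 * upper-cased. The caller must free() the result. Returns NULL if the
 * allocation fails. */
char *clean_copy(const char *input) {
    /* strlen counts the characters; +1 leaves room for the terminator.
     * The output can never be longer than the input. */
    char *out = malloc(strlen(input) + 1);
    if (out == NULL) {
        return NULL;
    }
    char *dst = out;
    for (const char *src = input; *src != '\0'; src++) {
        /* The cast avoids undefined behavior for negative char values. */
        unsigned char c = (unsigned char)*src;
        if (isalpha(c)) {
            *dst++ = (char)toupper(c);
        }
    }
    *dst = '\0';
    return out;
}
```

This allocates for the worst case (every input character is kept), which wastes a few bytes when the input has spaces but avoids a second pass over the string. Unlike Java, nothing frees the result for you, so main should call free(cleanString) when it is done.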

Related

How to use memcpy in function that is passed a char pointer?

I'm quite new to pointers in c.
Here is a snippet of code I'm working on. I am probably not passing the pointer correctly but I can't figure out what's wrong.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
__uint16_t CCrc8();
__uint16_t process_command();
int main () {
//command format: $SET,<0-1023>*<checksum,hex>\r\n
char test_payload[] = "SET,1023*6e";
process_command(test_payload);
return 0;
}
__uint16_t process_command(char *str1) {
char local_str[20];
memcpy(local_str, str1, sizeof(str1));
printf(str1);
printf("\n");
printf(local_str);
}
This results in:
SET,1023*6e
SET,1023
I'm expecting both lines to be the same. Anything past 8 characters is left off.
The only thing I can determine is that the problem is something with sizeof(str1). Any help appreciated.
Update: I've learned sizeof(char *) is 2 on 16-bit systems, 4 on 32-bit systems and 8 on 64-bit systems.
So how can I use memcpy to get a local copy of str1 when I'm unsure of the size it will be?
sizeof is an operator that the compiler evaluates, not a function. What you need is strlen from #include <string.h>.
The value of sizeof is determined at compile time. For example, sizeof(char[10]) is simply 10. strlen, on the other hand, is a libc function that determines the length of a string at run time.
sizeof on a pointer tells you the size of the pointer itself, not of what it points to. Since you're on a 64-bit system, pointers are 8 bytes long, so your memcpy is always copying 8 bytes. Since your string is null terminated, you should use stpncpy instead, like this:
if(stpncpy(local_str, str1, 20) == local_str + 20) {
// too long - handle it somehow
}
That will copy the string until it gets to a NUL terminator or runs out of space in the destination, and in the latter case you can handle it.
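If you would rather stay with memcpy, as the question asks, the length has to come from strlen and be checked against the destination size first. Here is a minimal sketch; copy_string and its return convention are my own invention for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the NUL-terminated string src into dst, which
 * has room for dst_size bytes. Returns 0 on success, or -1 (leaving dst
 * untouched) if src plus its terminator does not fit. */
int copy_string(char *dst, size_t dst_size, const char *src) {
    size_t len = strlen(src);      /* characters, excluding the terminator */
    if (len + 1 > dst_size) {
        return -1;
    }
    memcpy(dst, src, len + 1);     /* +1 copies the terminator as well */
    return 0;
}
```

Inside process_command you could call copy_string(local_str, sizeof local_str, str1). Here sizeof local_str really is 20, because local_str is an array and not a pointer.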

How can i realloc the dynamic array after returning null by malloc in c?

I want to write a program that receives its input as a string and stores it in a dynamic array. I use malloc with, for example, 20*sizeof(char), and if the string is longer than the memory I allocated, I want to increase its size. But I get a crash and cannot increase the size with realloc.
What can I do?
this is my code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main() {
char *user;
int n = 0;
user = (char*)malloc(20*sizeof(char));
scanf("%s",user);
n = strlen(user);
user = (char*)realloc(user,n);
return 0;
}
The easiest way is to use the m modifier in scanf:
char *user = 0;
scanf("%ms", &user);
// use 'user' -- will be null if there was an error reading.
Unfortunately, this is only available on POSIX systems. On other systems, you'll need to write your own loop reading characters with getchar, and reallocate as you read (as needed).
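A rough sketch of such a loop follows. read_word is a made-up name, the starting capacity of 20 and the doubling policy are arbitrary choices, and I use getc on a FILE * so the same code works for stdin or any other stream. Note also that a string of n characters needs n + 1 bytes, so realloc(user, n) in the question would be one byte short.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads one whitespace-delimited word from `in`,
 * growing the buffer as needed. Returns a malloc'd string the caller must
 * free(), or NULL on allocation failure or if no word could be read. */
char *read_word(FILE *in) {
    size_t cap = 20;
    size_t len = 0;
    char *buf = malloc(cap);
    if (buf == NULL) {
        return NULL;
    }

    int c;
    /* Skip leading whitespace, as %s does. */
    while ((c = getc(in)) != EOF && isspace(c)) {
    }

    while (c != EOF && !isspace(c)) {
        if (len + 1 >= cap) {          /* keep one byte free for the terminator */
            size_t new_cap = cap * 2;
            char *tmp = realloc(buf, new_cap);
            if (tmp == NULL) {         /* realloc failed: the old block is still ours */
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap = new_cap;
        }
        buf[len++] = (char)c;
        c = getc(in);
    }
    if (c != EOF) {
        ungetc(c, in);                 /* leave the delimiter for the next reader */
    }
    if (len == 0) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

Assigning the result of realloc to a temporary first matters: if you write buf = realloc(buf, ...) and it fails, you lose the only pointer to the original block.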

Splitting a line and printing it results in a core dump

When I try to read a line from standard input and split it into words, after removing the \n character, I get a core dumped error. Could anyone explain the reason to me? What is the correct way to do this?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define LINE_LEN 50
#define MAX_PARTS 50
int main ()
{
char* token;
char *str;
char* arr[MAX_PARTS];
int i,j;
printf("Write a line: \n $:");
fgets(str, LINE_LEN, stdin);
str = strncpy(str, str, strlen(str)-1);
fflush(stdin);
i=0;
token = strtok(str, " ");
while( token != NULL )
{
arr[i] = token;
printf("%s",arr[i]);
i++;
token = strtok(NULL," ");
}
return 0;
}
Make sure you never pass a NULL pointer to printf() on the last pass through the while() loop. The printf() call has to come before the next strtok() call, like this:
while( token != NULL )
{
arr[i] = token;
printf("%s",arr[i]); // Must come first to avoid printing a NULL on final loop
i++;
token = strtok(NULL," ");
}
You are reading into unallocated memory.
char *str;
This declares a pointer str, which is pointing nowhere. (In fact, it points to a random location, but "nowhere" stops the guys who try to second-guess undefined behaviour.)
fgets(str, LINE_LEN, stdin);
This writes to the location str is pointing at, which is nowhere (see above). This is undefined behaviour. If your program happens to survive this (instead of SEGFAULTing right there), you cannot rely on it behaving in any sane manner from this point on.
While we're at it:
fflush(stdin);
Note that the C standard does not define the behaviour of fflush() when called on input streams, i.e. while this is well-defined under Linux (which does define this behaviour), this is a non-standard, non-portable construct that could well crash on other platforms.
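Putting those points together, here is one way the splitting could be written once str is a real array. split_words is a hypothetical helper name. It uses strcspn to drop the newline in place, which also avoids the strncpy(str, str, ...) call in the question (copying a buffer onto itself with strncpy is undefined behavior).

```c
#include <string.h>

/* Hypothetical helper: strips a trailing newline from `line`, splits it on
 * spaces in place, and stores up to max_parts token pointers in parts.
 * Returns the number of tokens stored. `line` must be writable. */
int split_words(char *line, char *parts[], int max_parts) {
    line[strcspn(line, "\n")] = '\0';   /* safe even if there is no newline */

    int count = 0;
    char *token = strtok(line, " ");
    while (token != NULL && count < max_parts) {
        parts[count++] = token;
        token = strtok(NULL, " ");
    }
    return count;
}
```

In main you would declare char str[LINE_LEN]; instead of char *str;, check that fgets(str, sizeof str, stdin) did not return NULL, and then call split_words(str, arr, MAX_PARTS). The fflush(stdin) call can simply be dropped.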

Pointer initialization doubt

We could initialize a character pointer like this in C.
char *c="test";
Where c points to the first character(t).
But when I wrote the code below, it gave a segmentation fault.
#include<stdio.h>
#include<stdlib.h>
main()
{
int *i=0;
printf("%d",*i);
}
Also when I give
#include<stdio.h>
#include<stdlib.h>
main()
{
int *i;
i=(int *)malloc(2);
*i=0;
printf("%d",*i);
}
It worked (gave output 0).
When I used malloc(0) instead, it also worked (gave output 0).
Please tell me what is happening.
Your first example is seg faulting because you are trying to de-reference a null pointer which you have created with the line:
int *i=0;
You can't de-reference a pointer that doesn't point to anything and expect good things to happen. =)
The second code segment appears to work because malloc gave your pointer memory that you may de-reference. However, an int is typically 4 bytes and you asked for only 2, so *i=0 writes past the end of what you requested, and reading *i reads past it too. That is undefined behavior: it happens to print 0 here because the allocator usually rounds small requests up to a larger block, but you cannot rely on it, and the same goes for malloc(0). You should always malloc the size of memory needed for the type you are trying to use/point at.
(i.e. int *i = (int *) malloc(sizeof(int)); )
Once you have the pointer pointing at memory that is of the correct size, you can then set the values as such:
#include <stdlib.h>
#include <stdio.h>
int main (int argc, char *argv[])
{
int *i = (int *)malloc(sizeof(int));
*i = 25;
printf("i = %d\n",*i);
*i = 12;
printf("i = %d\n",*i);
return 0;
}
Edit based on comment:
A pointer points to memory, not to values. When initializing char *ptr="test"; you're not assigning the value of "test", you're assigning the memory address of where the compiler places "test", which is typically in a read-only part of your process's data segment. If you tried to modify the string "test", your program would likely seg-fault.
What you need to realize about a char * is that it points at a single (i.e. the first) character in the string. When you de-reference the char *, you will see one character and one character only. C uses null-terminated strings, and notice that you do not de-reference ptr when calling printf: you pass it the pointer itself, which points at just the first character.
How this is displayed depends on the format passed to printf. With the '%c' format, passing *ptr prints the single character ptr points at; with the '%p' format, passing ptr prints the address that ptr holds. To get the entire string, you pass '%s' as the format. This makes printf start at the pointer you passed in and read each successive byte until a null is reached. Below is some code that demonstrates these.
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int main (int argc, char *argv[])
{
// Initialize to data segment/read only string
char *ptr = "test";
printf("ptr points at = %p\n", ptr); // Prints the address ptr points to
printf("ptr dereferenced = %c\n", *ptr); // Prints the value at address ptr
printf("ptr value = %s\n", ptr); // Prints the string of chars pointed to by ptr
// Uncomment this to see bad behavior!
// ptr[1] = 'E'; // SEG FAULT -> Attempting to modify read-only memory
printf("--------------------\n");
// Use memory you have allocated explicitly and can modify
ptr = malloc(10);
strncpy(ptr, "foo", 10);
printf("ptr now points at = %p\n", ptr); // Prints the address ptr points to
printf("ptr dereferenced = %c\n", *ptr); // Prints the value at address ptr
printf("ptr value = %s\n", ptr); // Prints the string of chars pointed to by ptr
ptr[1] = 'F'; // Change the second char in string to F
printf("ptr value (mod) = %s\n", ptr);
return 0;
}

Buffer Overrun Issues VC++

When I execute my code I get a buffer overrun error:
LPTSTR lpBuffer;
::GetLogicalDriveStrings(1024,lpBuffer);
while(*lpBuffer != NULL)
{
printf("%s\n", lpBuffer); // or MessageBox(NULL, temp, "Test", 0); or whatever
lpBuffer += lstrlen(lpBuffer)+1;
printf("sizeof(lpBuffer) %d\n",lstrlen(lpBuffer));
}
Output
C
sizeof(lpBuffer) 3
D
sizeof(lpBuffer) 3
E
sizeof(lpBuffer) 3
F
sizeof(lpBuffer) 0
lpBuffer points to random memory. You need something like this:
LPTSTR lpBuffer = new TCHAR[1025];
edit: Corrected the array size to be 1025 instead of 1024, because the length parameter is 1024. This API requires careful reading.
You are supposed to pass the address of a buffer into which the strings will be copied, but you have not allocated any space to hold the characters. You need to allocate space before passing it to the GetLogicalDriveStrings function. You can allocate the memory on the heap as @Windows programmer suggested or, if the maximum length of the string is known at compile time, you can allocate it on the stack using TCHAR lpBuffer[1024];. Additionally, you are using printf to print what may be Unicode text (it depends on the compiler flags). In a Unicode build this will not work and will print only the first character.
You need to actually pass in a buffer - note that the size of the buffer you pass in needs to be one less than the actual size of the buffer to account for the final terminating '\0' character (I have no idea why the API was designed like that).
Here's a slightly modified version of your example:
#include <windows.h>
#include <tchar.h>
#include <stdio.h>
enum {
BUFSIZE = 1024
};
int _tmain (int argc, TCHAR *argv[])
{
TCHAR szTemp[BUFSIZE];
LPTSTR lpBuffer = szTemp; // point lpBuffer to the buffer we've allocated
szTemp[0] = _T( '\0'); // I'm not sure if this is necessary, but it was
// in the example given for GetLogicalDriveStrings()
GetLogicalDriveStrings( BUFSIZE-1, lpBuffer); // note: BUFSIZE minus 1
while(*lpBuffer != _T('\0'))
{
_tprintf( _T("%s\n"), lpBuffer);
lpBuffer += lstrlen(lpBuffer)+1;
_tprintf( _T("length of lpBuffer: %d\n"),lstrlen(lpBuffer));
}
return 0;
}
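The loop above relies on the layout of the buffer that GetLogicalDriveStrings fills in: a sequence of NUL-terminated strings with one extra NUL after the last. Here is a portable sketch of the same walk that you can try without the Windows headers. It assumes plain char rather than TCHAR, and count_multi_strings is a name made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the strings in a buffer laid out as
 * "first\0second\0...\0last\0\0" (each string NUL-terminated, with one
 * extra NUL after the last). */
size_t count_multi_strings(const char *buf) {
    size_t count = 0;
    while (*buf != '\0') {           /* an empty string marks the end of the list */
        count++;
        buf += strlen(buf) + 1;      /* step over this string and its terminator */
    }
    return count;
}
```

The walk is only safe when the buffer really ends with that double terminator, which is why the buffer has to be allocated and filled in successfully before the loop runs.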

Resources