Logical Error in C++ Hexadecimal Converter Code - linux

I've been working on this Hexadecimal Converter and there seems to be a logical error somewhere in the program. I've run it on Ubuntu using the g++ tool and every time I run the program, it gives me a massive heap of garbage values. I can't figure out the source of the garbage values and neither can I find the source of the logical error. I'm a newbie at programming, so please help me figure out my mistake.
#include <iostream>
#include <math.h>
using namespace std;
int main()
{
int bin[20],finhex[10],num,bc=0,i,j,k,l=0,r=10,n=1,binset=0,m=0;
int hex[16]= {0000,0001,0010,0011,0100,0101,0110,0111,1000,1001,1010,1011,1100,1101,1110,1111};
char hexalph='A';
cout<<"\nEnter your Number: ";
cin>>num;
while(num>0)
{
bin[bc]=num%2;
num=num/2;
bc++;
}
if(bc%4!=0)
bc++;
for(j=0;j<bc/4;j++)
for(i=0;i<4;i++)
{
binset=binset+(bin[m]*pow(10,i));
m++;
}
for(k=0;k<16;k++)
{
if(hex[k]==binset)
{
if(k<=9)
finhex[l]=k;
else
while(n>0)
{
if(k==r)
{
finhex[l]=hexalph;
break;
}
else
{
hexalph++;
r++;
}
}
l++;
r=10;
binset=0;
hexalph='A';
break;
}
}
while(l>=0)
{
cout<<"\n"<<finhex[l];
l--;
}
return 0;
}

int hex[16]= {0000,0001,0010,0011,0100,0101,0110,0111,1000,1001,1010,1011,1100,1101,1110,1111};
Allow me to translate those values into decimal for you:
int hex[16] = {0, 1, 8, 9, 64, 65, 72, 73, 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111};
If you want them to be considered binary literals then you need to either specify them as such or put them in some other form that the compiler understands:
int hex[16] = {0b0000, 0b0001, 0b0010, 0b0011, 0b0100, 0b0101, 0b0110, 0b0111, 0b1000, 0b1001, 0b1010, 0b1011, 0b1100, 0b1101, 0b1110, 0b1111};
int hex[16] = {0x0, 0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7, 0x8, 0x9, 0xa, 0xb, 0xc, 0xd, 0xe, 0xf};
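To make the prefix rules concrete, here is a minimal sketch (the constant names are made up for illustration) that lets the compiler confirm how each spelling is read. It assumes a C++14 compiler, since the 0b prefix is not available before that.

```cpp
// How the same-looking digits are interpreted depending on the prefix.
constexpr int octal_literal  = 0010;    // leading 0  -> octal 10, i.e. decimal 8
constexpr int binary_literal = 0b0010;  // leading 0b -> binary 10, i.e. decimal 2 (C++14)
constexpr int hex_literal    = 0x2;     // leading 0x -> hexadecimal 2, i.e. decimal 2
constexpr int plain_decimal  = 10;      // no prefix  -> decimal ten

// Checked at compile time, so a wrong assumption fails the build.
static_assert(octal_literal == 8, "0010 is an octal literal");
static_assert(binary_literal == 2, "0b0010 is a binary literal");
static_assert(hex_literal == binary_literal, "0x2 and 0b0010 are the same value");
static_assert(plain_decimal == 10, "10 without a prefix is decimal");
```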

While Ignacio Vazquez-Abrams rightly hinted at the fact that some of your initializers of hex are (due to the prefix 0) octal constants, he overlooked that you chose the unusual, but possible way of representing binary literals as decimal constants with only the digits 0 and 1. Thus, you only have to remove the prefix 0 from all constants greater than 7:
int hex[16] = {0000,0001,10,11,100,101,110,111,1000,1001,1010,1011,1100,1101,1110,1111};
Next, you store the characters 'A' etc. in int finhex[] and print them with cout<<"\n"<<finhex[l], but this prints not A but its character code value, e.g. 65 in ASCII. To actually output the character A etc., we could change the finhex array element type to char:
char bin[20],finhex[10]; int num,bc=0,i,j,k,l=0,r=10,n=1,binset=0,m=0;
- but consequently we also have to store the digits 0 to 9 as their character code values:
if (k<=9)
finhex[l]='0'+k;
Furthermore, with the lines
if(bc%4!=0)
bc++;
you rightly pondered on the need to have a multiple of 4 bits for the conversion, but you overlooked that more than one bit could be missing, and also that the additional elements of bin[] are uninitialized, so change to:
while (bc%4!=0) bin[bc++] = 0;
Besides, you omitted block braces around the (appropriately indented) two inner for loops; since C++ is not Python, the indentation has no significance and without surrounding braces only the first of the indented for loops is nested into the outer for loop.
The final while loop should be outdented and go outside the big for loop. There's also an indexing error in it, as the finhex array is indexed with an l which is one too high; you could change this to:
while (l--) cout<<finhex[l];
cout<<"\n";
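Putting those points together, a compact version of the same idea (collect the bits, zero-pad to a multiple of 4, then turn each group of 4 bits into one character) might look like the sketch below. It is not the original program with patches applied; the function name to_hex and the use of std::vector and std::string are choices made for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Converts a non-negative number to its hexadecimal representation.
std::string to_hex(unsigned num) {
    if (num == 0) return "0";

    std::vector<int> bits;                 // least significant bit first
    while (num > 0) {
        bits.push_back(num % 2);
        num /= 2;
    }
    while (bits.size() % 4 != 0) {
        bits.push_back(0);                 // pad with explicit zeros, possibly more than one
    }

    std::string out;
    // Walk the groups of 4 bits from the most significant group down.
    for (std::size_t i = bits.size(); i > 0; i -= 4) {
        int nibble = bits[i - 1] * 8 + bits[i - 2] * 4 + bits[i - 3] * 2 + bits[i - 4];
        out += "0123456789ABCDEF"[nibble]; // store characters, not integer codes
    }
    return out;
}
```

For instance, to_hex(26) yields "1A" and to_hex(255) yields "FF".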

Related

C Function to return a String resulting in corrupted top size

I am trying to write a program that calls upon an [external library (?)] (I'm not sure that I'm using the right terminology here) that I am also writing to clean up a provided string. For example, if my main.c program were to be provided with a string such as:
asdfFAweWFwseFL Wefawf JAWEFfja FAWSEF
it would call upon a function in externalLibrary.c (let's call it externalLibrary_Clean for now) that would take in the string, and return all characters in upper case without spaces:
ASDFFAWEWFWSEFLWEFAWFJAWEFFJAFAWSEF
The crazy part is that I have this working... so long as my string doesn't exceed 26 characters in length. As soon as I add a 27th character, I end up with an error that says
malloc(): corrupted top size.
Here is externalLibrary.c:
#include "externalLibrary.h"
#include <ctype.h>
#include <malloc.h>
#include <assert.h>
#include <string.h>
char * restrict externalLibrary_Clean(const char* restrict input) {
// first we define the return value as a pointer and initialize
// an integer to count the length of the string
char * returnVal = malloc(sizeof(input));
char * initialReturnVal = returnVal; //point to the start location
// until we hit the end of the string, we use this while loop to
// iterate through it
while (*input != '\0') {
if (isalpha(*input)) { // if we encounter an alphabet character (a-z/A-Z)
// then we convert it to an uppercase value and point our return value at it
*returnVal = toupper(*input);
returnVal++; //we use this to move our return value to the next location in memory
}
input++; // we move to the next memory location on the provided character pointer
}
*returnVal = '\0'; //once we have exhausted the input character pointer, we terminate our return value
return initialReturnVal;
}
int * restrict externalLibrary_getFrequencies(char * ar, int length){
static int freq[26];
for (int i = 0; i < length; i++){
freq[(ar[i]-65)]++;
}
return freq;
}
the header file for it (externalLibrary.h):
#ifndef LEARNINGC_EXTERNALLIBRARY_H
#define LEARNINGC_EXTERNALLIBRARY_H
#ifdef __cplusplus
extern "C" {
#endif
char * restrict externalLibrary_Clean(const char* restrict input);
int * restrict externalLibrary_getFrequencies(char * ar, int length);
#ifdef __cplusplus
}
#endif
#endif //LEARNINGC_EXTERNALLIBRARY_H
my main.c file from where all the action is happening:
#include <stdio.h>
#include "externalLibrary.h"
int main() {
char * unfilteredString = "ASDFOIWEGOASDGLKASJGISUAAAA";//if this exceeds 26 characters, the program breaks
char * cleanString = externalLibrary_Clean(unfilteredString);
//int * charDist = externalLibrary_getFrequencies(cleanString, 25); //this works just fine... for now
printf("\nOutput: %s\n", unfilteredString);
printf("\nCleaned Output: %s\n", cleanString);
/*for(int i = 0; i < 26; i++){
if(charDist[i] == 0){
}
else {
printf("%c: %d \n", (i + 65), charDist[i]);
}
}*/
return 0;
}
I'm extremely well versed in Java programming and I'm trying to translate my knowledge over to C as I wish to learn how my computer works in more detail (and have finer control over things such as memory).
If I were solving this problem in Java, it would be as simple as creating two class files: one called main.java and one called externalLibrary.java, where I would have static String Clean(string input) and then call upon it in main.java with String cleanString = externalLibrary.Clean(unfilteredString).
Clearly this isn't how C works, but I want to learn how (and why my code is crashing with corrupted top size)
The bug is this line:
char * returnVal = malloc(sizeof(input));
The reason it is a bug is that it requests an allocation only large enough to store a pointer, meaning 8 bytes in a 64-bit program. What you want is to allocate enough space to store the modified string, which you can do with the following line:
char *returnVal = malloc(strlen(input) + 1);
So the other part of your question is why the program doesn't crash when your string is less than 26 characters. The reason is that malloc is allowed to give the caller slightly more than the caller requested.
In your case, the message "malloc(): corrupted top size" suggests that you are using libc malloc, which is the default on Linux. That variant of malloc, in a 64-bit process, would always give you at least 0x18 (24) bytes (minimum chunk size 0x20 - 8 bytes for the size/status). In the specific case that the allocation immediately precedes the "top" allocation, writing past the end of the allocation will clobber the "top" size.
If your string is longer than 23 (0x17) characters, you will start to clobber the size/status of the subsequent allocation, because you also need 1 byte to store the trailing NUL. However, any string of 23 characters or fewer will not cause a problem.
As to why you didn't get an error with a string of 26 characters: one would have to see the exact program, with the 26-character string that does not crash, to give a more precise answer. For example, if the program was given a 26-character input that contained 3 blanks, this would require only 26 + 1 - 3 = 24 bytes in the allocation, which would fit.
If you are not interested in that level of detail, fixing the malloc call to request the proper amount will fix your crash.
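As a rough illustration of the corrected allocation, here is the cleaning function again with the buffer sized from strlen. This sketch is compiled as C++ (hence the cast on malloc, which plain C does not need), and the shorter name clean is used only for this example.

```cpp
#include <cctype>
#include <cstdlib>
#include <cstring>

// Returns a newly malloc'ed, NUL-terminated string holding only the letters of
// `input`, upper-cased. The caller is responsible for calling free() on it.
char* clean(const char* input) {
    // Worst case: every character is kept, plus one byte for the terminator.
    char* out = static_cast<char*>(std::malloc(std::strlen(input) + 1));
    if (out == nullptr) {
        return nullptr;                    // allocation failed
    }
    char* write = out;
    for (; *input != '\0'; ++input) {
        // The <cctype> functions expect a value representable as unsigned char.
        unsigned char c = static_cast<unsigned char>(*input);
        if (std::isalpha(c)) {
            *write++ = static_cast<char>(std::toupper(c));
        }
    }
    *write = '\0';
    return out;
}
```

Because the result comes from malloc, the caller should free() it once it is no longer needed.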

CUDA equivalent of pragma omp task

I am working on a problem where the work per thread may vary drastically: for example, one thread may handle 1000000 elements this time, while another handles only 1 or 2. I stumbled upon this, where the answer solves the unbalanced workload by using OpenMP tasks on the CPU. So my question is: can I achieve the same in CUDA?
In case you want more context:
The problem I'm trying to solve is this: I have n tuples, each with a starting point, an ending point and a value.
(0, 3, 1), (3, 6, 2), (6, 10, 3), ...
So for each tuple I want to write the value to every position between starting point and ending point of another empty array.
1, 1, 1, 2, 2, 2, 3, 3, 3, 3, ...
It is guaranteed that the start/end ranges do not overlap.
My current approach is one thread per tuple, but the range lengths can vary a lot, so the imbalanced workload between threads might become a bottleneck for the program. That may be rare, but it can very well happen.
The most common thread strategy I can think of in CUDA is to assign one thread per output point, and then have each thread do the work necessary to populate its output point.
For your stated objective (have each thread do roughly equal work) this is a useful strategy.
I will suggest using thrust for this. The basic idea is to:
1) determine the necessary size of the output based on the input
2) spin up a set of threads equal to the output size, where each thread determines its "insert index" in the output array by using a vectorized binary search on the input
3) with the insert index, insert the appropriate value in the output array.
I have used your data, the only change is that I changed the insert values from 1,2,3 to 5,2,7:
$ cat t1871.cu
#include <thrust/device_vector.h>
#include <thrust/transform.h>
#include <thrust/binary_search.h>
#include <thrust/copy.h>
#include <thrust/iterator/counting_iterator.h>
#include <thrust/iterator/permutation_iterator.h>
#include <thrust/iterator/transform_iterator.h>
#include <iostream>
using namespace thrust::placeholders;
typedef thrust::tuple<int,int,int> mt;
// returns selected item from tuple
struct my_cpy_functor1
{
__host__ __device__ int operator()(mt d){ return thrust::get<1>(d); }
};
struct my_cpy_functor2
{
__host__ __device__ int operator()(mt d){ return thrust::get<2>(d); }
};
int main(){
mt my_data[] = {{0, 3, 5}, {3, 6, 2}, {6, 10, 7}};
int ds = sizeof(my_data)/sizeof(my_data[0]); // determine data size
int os = thrust::get<1>(my_data[ds-1]) - thrust::get<0>(my_data[0]); // and output size
thrust::device_vector<mt> d_data(my_data, my_data+ds); // transfer data to device
thrust::device_vector<int> d_idx(ds+1); // create index array for searching of insertion points
thrust::transform(d_data.begin(), d_data.end(), d_idx.begin()+1, my_cpy_functor1()); // set index array
thrust::device_vector<int> d_ins(os); // create array to hold insertion points
thrust::upper_bound(d_idx.begin(), d_idx.end(), thrust::counting_iterator<int>(0), thrust::counting_iterator<int>(os), d_ins.begin()); // identify insertion points
thrust::transform(thrust::make_permutation_iterator(d_data.begin(), thrust::make_transform_iterator(d_ins.begin(), _1 -1)), thrust::make_permutation_iterator(d_data.begin(), thrust::make_transform_iterator(d_ins.end(), _1 -1)), d_ins.begin(), my_cpy_functor2()); // insert
thrust::copy(d_ins.begin(), d_ins.end(), std::ostream_iterator<int>(std::cout, ","));
std::cout << std::endl;
}
$ nvcc -o t1871 t1871.cu -std=c++14
$ ./t1871
5,5,5,2,2,2,7,7,7,7,
$
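For readers who find the iterator composition hard to follow, the same per-output-point idea can be written as an ordinary host-side loop. The sketch below is plain C++, not CUDA; each loop iteration does what one GPU thread would do. The names Segment and expand are invented for this example, and it assumes the tuples are sorted and contiguous, as in the data above.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Segment {   // hypothetical name for one (start, end, value) tuple
    int start;
    int end;       // exclusive
    int value;
};

// Fills one output element per position; every position does the same amount
// of work (one binary search), no matter how long its segment is.
std::vector<int> expand(const std::vector<Segment>& segs) {
    if (segs.empty()) return {};

    std::vector<int> ends;                           // sorted end points
    for (const Segment& s : segs) ends.push_back(s.end);

    const int base = segs.front().start;
    std::vector<int> out(static_cast<std::size_t>(segs.back().end - base));

    for (std::size_t i = 0; i < out.size(); ++i) {   // on a GPU: one thread per i
        const int pos = base + static_cast<int>(i);
        // The first segment whose end is greater than pos is the one covering pos.
        const std::size_t seg = static_cast<std::size_t>(
            std::upper_bound(ends.begin(), ends.end(), pos) - ends.begin());
        out[i] = segs[seg].value;
    }
    return out;
}
```

The cost per output element is logarithmic in the number of tuples, which is what keeps the per-thread work roughly equal.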

How to sort a variable-length string array with radix sort?

I know that radix sort can sort same-length string arrays, but is it possible to do so with variable-length strings? If it is, what is the C-family code or pseudo-code to implement this?
It might not be a fast algorithm for variable-length strings, but radix sort is easy to implement, so it's useful if a sort needs to be coded quickly.
I'm not quite sure what you mean by "variable-length strings" but you can perform a binary MSB radix sort in-place so the length of the string doesn't matter since there are no intermediate buckets.
#include <stdio.h>
#include <algorithm>
static void display(const char *str, int *data, int size)
{
printf("%s: ", str);
for(int v=0;v<size;v++) {
printf("%d ", data[v]);
}
printf("\n");
}
static void sort(int *data, int size, int bit)
{
if (bit < 0) // stop only after bit 0 has been partitioned as well
return;
int b = 0;
int e = size;
if (size > 0) {
while (b != e) {
if (data[b] & (1u << bit)) { // unsigned shift avoids overflow when bit is the top bit
std::swap(data[b], data[--e]);
}
else {
b++;
}
}
sort(data, e, bit - 1);
sort(data + b, size - b, bit - 1);
}
}
int main()
{
int data[] = { 13, 12, 22, 20, 3, 4, 14, 92, 11 };
int size = sizeof(data) / sizeof(data[0]);
display("Before", data, size);
sort(data, size, sizeof(int)*8 - 1);
display("After", data, size);
}
You can do a MSB-first radix sort on variable-length strings.
There are a couple non-obvious details:
Pass #N will partition (scatter) strings from the input vector into 256 partitions, according to strvec[i][N]. It then will scan the partitions in order, and put (reinsert) strings back into the input vector.
Now the slightly complicated bit...
When you reach the end of a string, it is in its final position, and should never be touched again. That splits the strings before and after it into separate RANGES. The result of each pass is a set of ranges of yet-unsorted rows.
That means that pass #N, after the first, scans the strings in each range, and stores the source range id (index) along with the string, in the partition. In the "reinsert" step, it puts the string back into its source range; and again, it generates a new set of unsorted-row ranges.
You keep the stable-sort bonus of radix sort, if you forward-scan the input ranges and then backward-scan the partitions and reinsert starting at the back of each source range.
You can also use recursion (doing a complete sort from scratch on any subrange) but the above saves on setup and is faster.
There are more details ... quicksort falls through to doing an insertion sort for tiny ranges (e.g. up to 16); radix sort benefits from doing the same.
Using multiple bytes as a partition index is possible. One approach for that is in: Radix Sort-Mischa Sandberg-2010. There are other approaches.
Sorry I can't post code; it's now proprietary.
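For readers who want something concrete, here is a minimal sketch of the simpler recursive variant mentioned above (a complete sort on each subrange), not the range-tracking scheme. It uses 257 buckets: bucket 0 for strings that have already ended at the current position, and one bucket per byte value. The function names are invented for this example, and it is written for clarity rather than speed.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Sorts strs[lo, hi) by the bytes at position `depth` and beyond.
void msd_radix_sort(std::vector<std::string>& strs,
                    std::size_t lo, std::size_t hi, std::size_t depth) {
    if (hi - lo < 2) return;

    // Bucket 0: strings that end here (already in final position).
    // Buckets 1..256: byte value at `depth`, shifted by one.
    std::array<std::vector<std::string>, 257> buckets;
    for (std::size_t i = lo; i < hi; ++i) {
        std::string& s = strs[i];
        const std::size_t b =
            depth < s.size() ? static_cast<unsigned char>(s[depth]) + 1u : 0u;
        buckets[b].push_back(std::move(s));
    }

    // Reinsert in bucket order, then sort each non-terminal bucket on the next byte.
    std::size_t pos = lo;
    for (std::size_t b = 0; b < buckets.size(); ++b) {
        const std::size_t start = pos;
        for (std::string& s : buckets[b]) strs[pos++] = std::move(s);
        if (b != 0) msd_radix_sort(strs, start, pos, depth + 1);
    }
}

void radix_sort_strings(std::vector<std::string>& strs) {
    msd_radix_sort(strs, 0, strs.size(), 0);
}
```

Strings are compared byte by byte as unsigned values, so a shorter string sorts before any longer string it is a prefix of. Allocating 257 buckets per call is wasteful for small ranges, which is where the insertion-sort fallback described above would help.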

CString to UTF8 conversion fails for "ý"

In my application I want to convert a string that contains the character ý to UTF-8. But it's not giving the exact result.
I am using the WideCharToMultiByte function, and it is converting the particular character to Ã½.
For Example :
Input - "ý"
Output - "Ã½"
Please see the code below..
String strBuffer("ý" );
char *utf8Buffer = (char*)malloc(strBuffer.GetLength()+1);
int utf8bufferLength = WideCharToMultiByte(CP_UTF8, 0, (LPCWSTR)strBuffer.GetBuffer(strBuffer.GetLength() + 1)),
strBuffer.GetLength(), utf8Buffer, strBuffer.GetLength() * 4,0,0);
Please give your suggestions...
Binoy Krishna
The Unicode code point for the letter ý, according to this page, is 253 decimal or FD hexadecimal. Its UTF-8 representation is 195 189 decimal or C3 BD hexadecimal. These two bytes may show up as the letters Ã½ in your program and/or debugger, but they are UTF-8 code units, so they are bytes, not letters.
In other words, the output and the code are fine, and your expectations are wrong. I can't say why they are wrong because you haven't mentioned what exactly you were expecting.
EDIT: The code should be improved. See Rudolfs' answer for more info.
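The two bytes can be derived by hand from the code point. The sketch below is a small portable encoder written only to show that arithmetic; it is not a replacement for WideCharToMultiByte, the name encode_utf8 is made up for this example, and it does not reject surrogates or values above U+10FFFF.

```cpp
#include <string>

// Encodes a single Unicode code point as UTF-8 bytes.
std::string encode_utf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {                       // 1 byte:  0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For U+00FD (binary 11111101) the top two bits go into the first byte, 0xC0 | 0x03 = 0xC3, and the low six bits into the second, 0x80 | 0x3D = 0xBD, which matches the C3 BD quoted above.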
While I was writing this an answer explaining the character values you are seeing was already posted, however, there are two things to mention about your code:
1) you should use the _T() macro when initializing the string: CString strBuffer(_T("ý")); The _T() macro is defined in tchar.h and maps to the correct string type depending on the value of the _UNICODE macro.
2) do not use the GetLength() to calculate the size of the UTF-8 buffer, see the documentation of WideCharToMultiByte in MSDN, it shows how to use the function to calculate the needed length for the UTF-8 buffer in the comments section.
Here is a small example that verifies the output according to the codepoints and demonstrates how to use the automatic length calculation:
#define _AFXDLL
#include <afx.h>
#include <iostream>
int main(int argc, char** argv)
{
CString wideStrBuffer(_T("ý"));
// The length calculation assumes wideStrBuffer is zero terminated
CStringA utf8Buffer('\0', WideCharToMultiByte(CP_UTF8, 0, wideStrBuffer.GetBuffer(), -1, NULL, 0, NULL, NULL));
WideCharToMultiByte(CP_UTF8, 0, wideStrBuffer.GetBuffer(), -1, utf8Buffer.GetBuffer(), utf8Buffer.GetLength(), NULL, NULL);
if (static_cast<unsigned char>(utf8Buffer[0]) == 195 && static_cast<unsigned char>(utf8Buffer[1]) == 189)
{
std::cout << "Conversion successful!" << std::endl;
}
return 0;
}

What is the point of using arrays of one element in ddk structures?

Here is an excerpt from ntdddisk.h
typedef struct _DISK_GEOMETRY_EX {
DISK_GEOMETRY Geometry; // Standard disk geometry: may be faked by driver.
LARGE_INTEGER DiskSize; // Must always be correct
UCHAR Data[1]; // Partition, Detect info
} DISK_GEOMETRY_EX, *PDISK_GEOMETRY_EX;
What is the point of UCHAR Data[1];? Why not just UCHAR Data; ?
And there are a lot of structures in DDK which have arrays of one element in declarations.
Thanks, that's clear now. The one thing that is not clear is the implementation of offsetof.
It's defined as
#ifdef _WIN64
#define offsetof(s,m) (size_t)( (ptrdiff_t)&(((s *)0)->m) )
#else
#define offsetof(s,m) (size_t)&(((s *)0)->m)
#endif
How this works:
((s *)0)->m ???
This
(size_t)&((DISK_GEOMETRY_EX *)0)->Data
is like
sizeof (DISK_GEOMETRY) + sizeof( LARGE_INTEGER);
But there are two additional questions:
1)
What type is this? And why we should use & for this?
((DISK_GEOMETRY_EX *)0)->Data
2) ((DISK_GEOMETRY_EX *)0)
This gives me 00000000. Is it converting to the address alignment? Or is it interpreted like an address?
This is very common in the winapi as well; these are variable-length structures. The array is always the last element in the structure, and the structure always includes a field that indicates the actual array size. A bitmap for example is declared that way:
typedef struct tagBITMAPINFO {
BITMAPINFOHEADER bmiHeader;
RGBQUAD bmiColors[1];
} BITMAPINFO, FAR *LPBITMAPINFO, *PBITMAPINFO;
The color table has a variable number of entries: 2 for a monochrome bitmap, 16 for a 4bpp and 256 for an 8bpp bitmap. Since the actual length of the structure varies, you cannot declare a variable of that type. The compiler won't reserve enough space for it. So you always need the free store to allocate it using code like this:
#include <stddef.h> // for offsetof() macro
....
size_t len = offsetof(BITMAPINFO, bmiColors) + 256 * sizeof(RGBQUAD);
BITMAPINFO* bmp = (BITMAPINFO*)malloc(len);
bmp->bmiHeader.biClrUsed = 256;
// etc...
//...
free(bmp);
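The same pattern can be tried without the Windows headers. The sketch below uses a made-up struct (Packet and make_packet are hypothetical names, not part of any SDK) with a count field followed by a one-element array, and sizes the allocation with offsetof exactly as above. Note that indexing past the declared bound is a long-standing idiom that compilers tolerate rather than something the C++ standard promises.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

struct Packet {               // hypothetical variable-length structure
    std::uint32_t count;      // number of valid bytes in data
    unsigned char data[1];    // really `count` bytes; must be the last member
};

// Allocates a Packet with room for n payload bytes and copies them in.
// The caller releases it with std::free().
Packet* make_packet(const unsigned char* src, std::uint32_t n) {
    // Header size up to the array, plus the real payload length.
    std::size_t len = offsetof(Packet, data) + n;
    if (len < sizeof(Packet)) {
        len = sizeof(Packet);             // never allocate less than the declared struct
    }
    Packet* p = static_cast<Packet*>(std::malloc(len));
    if (p == nullptr) {
        return nullptr;
    }
    p->count = n;
    std::memcpy(p->data, src, n);
    return p;
}
```

Using offsetof rather than sizeof(Packet) avoids counting the placeholder element and any trailing padding twice.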

Resources