I try to issue a scsi read(10) and write(10) to a SSD. I use this example code as a reference/basic code.
This is my scsi read:
#define READ_REPLY_LEN 32
#define READ_CMDLEN 10
void scsi_read()
{
unsigned char Readbuffer[ SCSI_OFF + READ_REPLY_LEN ];
unsigned char cmdblk [ READ_CMDLEN ] =
{ 0x28, /* command */
0, /* lun/reserved */
0, /* lba */
0, /* lba */
0, /* lba */
0, /* lba */
0, /* reserved */
0, /* transfer length */
READ_REPLY_LEN, /* transfer length */
0 };/* reserved/flag/link */
memset(Readbuffer,0,sizeof(Readbuffer));
memcpy( cmd + SCSI_OFF, cmdblk, sizeof(cmdblk) );
/*
* +------------------+
* | struct sg_header | <- cmd
* +------------------+
* | copy of cmdblk | <- cmd + SCSI_OFF
* +------------------+
*/
if (handle_scsi_cmd(sizeof(cmdblk), 0, cmd,
sizeof(Readbuffer) - SCSI_OFF, Readbuffer )) {
fprintf( stderr, "read failed\n" );
exit(2);
}
hex_dump(Readbuffer,sizeof(Readbuffer));
}
And this is my scsi write:
void scsi_write ( void )
{
unsigned char Writebuffer[SCSI_OFF];
unsigned char cmdblk [] =
{ 0x2A, /* 0: command */
0, /* 1: lun/reserved */
0, /* 2: LBA */
0, /* 3: LBA */
0, /* 4: LBA */
0, /* 5: LBA */
0, /* 6: reserved */
0, /* 7: transfer length */
0, /* 8: transfer length */
0 };/* 9: control */
memset(Writebuffer,0,sizeof(Writebuffer));
memcpy( cmd + SCSI_OFF, cmdblk, sizeof(cmdblk) );
cmd[SCSI_OFF+sizeof(cmdblk)+0] = 'A';
cmd[SCSI_OFF+sizeof(cmdblk)+1] = 'b';
cmd[SCSI_OFF+sizeof(cmdblk)+2] = 'c';
cmd[SCSI_OFF+sizeof(cmdblk)+3] = 'd';
cmd[SCSI_OFF+sizeof(cmdblk)+4] = 'e';
cmd[SCSI_OFF+sizeof(cmdblk)+5] = 'f';
cmd[SCSI_OFF+sizeof(cmdblk)+6] = 'g';
cmd[SCSI_OFF+sizeof(cmdblk)+7] = 0;
/*
* +------------------+
* | struct sg_header | <- cmd
* +------------------+
* | copy of cmdblk | <- cmd + SCSI_OFF
* +------------------+
* | data to write |
* +------------------+
*/
if (handle_scsi_cmd(sizeof(cmdblk), 8, cmd,
sizeof(Writebuffer) - SCSI_OFF, Writebuffer )) {
fprintf( stderr, "write failed\n" );
exit(2);
}
}
In the following example I do
scsi read
scsi write
scsi read
And I print the hexdumps of the data which is written (scsi write) and what is read (scsi read)
Read(10)
[0000] 00 00 00 44 00 00 00 44 00 00 00 00 00 00 00 00 ...D...D ........
[0010] 00 2C 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ........ ........
[0020] 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ........ ........
[0030] 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ........ ........
[0040] 00 00 00 00 ....
Write(10):
[0000] 00 00 00 00 00 00 00 24 00 00 00 00 00 00 00 00 ........ ........
[0010] 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ........ ........
[0020] 00 00 00 00 2A 00 00 00 00 00 00 00 00 00 41 62 ........ ......Ab
[0030] 63 64 65 66 67 00 cdefg.
Read(10):
[0000] 00 00 00 44 00 00 00 44 00 00 00 00 00 00 00 00 ...D...D ........
[0010] 04 00 20 00 70 00 02 00 00 00 00 0A 00 00 00 00 ....p... ........
[0020] 04 00 00 00 41 62 63 64 65 66 67 00 00 00 00 00 ....Abcd efg.....
[0030] 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ........ ........
[0040] 00 00 00 00 ....
fter running the three commands again, I should read Abcdefg with the first read. Right? But running them again changes nothing. You could now assume, that the memory I use has still the data from previous funcions, but I get the same result even though I run memset(Readbuff,0,sizeof(Readbuff)) before the sys_read() happens.
I assumed, that the LBA I try to write is maybe forbidden to write, and I read the cache. But interating over LBA Adresses from 0x00-0xFF changes nothing - That means, I read the same data (Abcdefg).
Do you know an example implementation of scsi read or writes with the scsi generic interface?
In SCSI, the units of the LBA and the transfer length are in blocks, sometimes called sectors. This is almost always 512 bytes. So, you can't read or write just 32 bytes. At a minimum, you'll have to do 512 bytes == one block. This one point is most of what you need to fix.
Your transfer length is zero in your scsi_write implementation, so it's not actually going to write any data.
You should use different buffers for the CDB and the write/read data. I suspect that confusion about these buffers is leading your implementation to write past the end of one of your statically-allocated arrays and over your ReadBuffer. Run it under valgrind and see what shows up.
And lastly, a lot could go wrong in whatever is in handle_scsi_cmd. It can be tricky to set up the data transfer... in particular, make sure you're straight on which way the data is going in the I/O header's dxfer_direction: SG_DXFER_TO_DEV for write, SG_DXFER_FROM_DEV for read.
Check out this example of how to do a read(16). This is more along the lines of what you're trying to accomplish.
https://github.com/hreinecke/sg3_utils/blob/master/examples/sg_simple16.c
Related
I have the following AMD64 ELF file on 64-bit (Arch)linux (not formatted to make it easier to copy-paste)
7F 45 4C 46 02 01 01 00 00 00 00 00 00 00 00 00 02 00 3E 00 01 00 00 00 78 00 40 00 00 00 00 00 40 00 00 00 00 00 00 00 84 00 00 00 00 00 00 00 00 00 00 00 40 00 38 00 01 00 40 00 03 00 02 00 01 00 00 00 05 00 00 00 78 00 00 00 00 00 00 00 78 00 40 00 00 00 00 00 00 00 00 00 00 00 00 00 0C 00 00 00 00 00 00 00 0C 00 00 00 00 00 00 00 00 00 20 00 00 00 00 00 48 B8 3C 00 00 00 00 00 00 00 0F 05 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 01 00 00 00 06 00 00 00 00 00 00 00 78 00 40 00 00 00 00 00 78 00 00 00 00 00 00 00 0C 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 0C 00 00 00 00 00 00 00 07 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 44 01 00 00 00 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00 00 2E 74 65 78 74 00 2E 73 68 73 74 72 74 61 62 00
which does nothing but immediately exit.
The output of readelf -a is
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x400078
Start of program headers: 64 (bytes into file)
Start of section headers: 132 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 1
Size of section headers: 64 (bytes)
Number of section headers: 3
Section header string table index: 2
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .text PROGBITS 0000000000400078 00000078
000000000000000c 000000000000000c AX 0 0 8
[ 2] .shstrtab STRTAB 0000000000000000 00000144
0000000000000010 0000000000000010 0 0 0
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
l (large), p (processor specific)
There are no section groups in this file.
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000078 0x0000000000400078 0x0000000000000000
0x000000000000000c 0x000000000000000c R E 0x200000
Section to Segment mapping:
Segment Sections...
00 .text
There is no dynamic section in this file.
There are no relocations in this file.
The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported.
No version information found in this file.
Although the executable runs perfectly fine, when I execute gdb <file> I am greeted with "0x7ffd7e078db0s": not in executable format: file format not recognized
The weird thing is, when I remove all sections (so an ELF remains with only file header, program header and code) GDB does recognize it as an executable.
Thus, my question is, how can I let GDB recognize my file as an executable? Alternatively, what information does GDB use to determine that a file is executable?
Thanks for your time and effort.
Your .shstrtab section has length 0x10, but should have length 0x11:
0 1 2 3 4 5 6 7 8 9 A B C D E F <-- byte offset
\0 . t e x t \0 . s h s t r t a b \0 <-- value
Changing 293rd byte from 0x10 to 0x11 makes the program run under GDB.
P.S. eu-readelf is more robust than readelf, and makes the error clearer. Using original (broken) binary:
$ readelf -WS junk.elf
There are 3 section headers, starting at offset 0x84:
Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 0] NULL 0000000000000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 0000000000400078 000078 00000c 0c AX 0 0 8
[ 2] .shstrtab STRTAB 0000000000000000 000144 000010 10 0 0 0
Compare to eu-readelf:
$ eu-readelf -WS junk.elf
There are 3 section headers, starting at offset 0x84:
Section Headers:
[Nr] Name Type Addr Off Size ES Flags Lk Inf Al
[ 0] NULL 0000000000000000 00000000 00000000 0 0 0 0
[ 1] .text PROGBITS 0000000000400078 00000078 0000000c 12 AX 0 0 8
[ 2] <corrupt> STRTAB 0000000000000000 00000144 00000010 16 0 0 0
I have written a program to use an external (wireless) numpad as an input device for gaming. I am using NUMPAD0, NUMPAD. and ENTER for the modifier keys shift, ctrl and alt respectively and I've keymapped the rest of the numpad keys to WASD and some other keys. I've setup a LowLevelKeyboardProc to intercept and eat the "real" input of the numpad and send a custom WM_MESSAGE to my application's Windows Message loop to then send the keymapped "new" input via SendInput (using the scan codes of the keys I want to simulate).
This all works fine and as expected. Except for the left shift key:
Whenever I press NUMPAD0, which is mapped to left shift, not only the correct scan code 0x2A = 42 is send, but also another key with scan code 0x022a = 554. This seems like some sort of flag in bit #9, since 554 is 42 + 2^9, but I can find any documentation on this.
I've read about extended 2-byte scan codes with prefix 0xE0, but not 0x02 as in this case. This also only happens with the simulated shift key; ctrl and alt a behaving just fine.
Relevant parts of both the keyboard hook and the windows messaging stuff:
LowLevelKeyboardProc:
LPWSTR convertToHex(LPBYTE in, size_t size_in_Bytes)
{
std::stringstream tempS;
tempS << std::hex;
for(int i = 0; i<size_in_Bytes; i++)
{
tempS << std::setfill('0') << std::setw(2) << int(in[i]) << " ";
}
std::string tempString = tempS.str();
std::wstring tempW(tempString.begin(), tempString.end());
LPWSTR out= (LPWSTR) malloc((tempW.size()+2)* sizeof(WCHAR));
StringCbCopyW(out, tempW.size()* sizeof(WCHAR), tempW.c_str());
return out;
}
LRESULT CALLBACK LowLevelKeyboardProc(__in int nCode, __in WPARAM wParam, __in LPARAM lParam)
{
KBDLLHOOKSTRUCT *key=(KBDLLHOOKSTRUCT *)lParam;
wchar_t tempString[128];
LPWSTR hexOut = convertToHex((LPBYTE) key , sizeof(KBDLLHOOKSTRUCT));
StringCbPrintfW(tempString, sizeof(tempString), L"HEX: %s \n", hexOut);
OutputDebugStringW(tempString);
free(hexOut);
//if(key->scanCode > 0xFF)
// return -1; //ignore strange scancode
if(nCode==HC_ACTION)
{
if(isNumpad)
{
auto it = KeyMap.find(key->scanCode);
if(it != KeyMap.end())
{
PostMessage(hwndMain, WM_MY_KEYDOWN, (key->flags & 128), it->first);
StringCbPrintfW(tempString, sizeof(tempString), L"Intercept Key, VK: %d, scan: %d, flags: %d, time: %d, event: %d \n", key->vkCode, key->scanCode, key->flags, key->time, wParam);
OutputDebugStringW(tempString);
return -1;
}
}
}
// delete injected flag
//UINT mask = (1<<4);
//if(key->flags & mask)
// key->flags ^= mask;
//key->scanCode = MapVirtualKey(key->vkCode, MAPVK_VK_TO_VSC);
StringCbPrintfW(tempString, sizeof(tempString), L"Output key: VK: %d, scan: %d, flags: %d, time: %d, event: %d \n", key->vkCode, key->scanCode, key->flags, key->time, wParam);
OutputDebugStringW(tempString);
return CallNextHookEx(hhkHook,nCode,wParam,lParam);
}
Windows Message stuff:
case WM_MY_KEYDOWN:
{
DWORD OLD_KEY = lParam;
DWORD NEW_KEY = KeyMap[OLD_KEY];
bool keyUP = wParam;
INPUT inputdata;
inputdata.type=INPUT_KEYBOARD;
inputdata.ki.dwFlags=KEYEVENTF_SCANCODE ;
inputdata.ki.wScan=NEW_KEY;
inputdata.ki.wVk=MapVirtualKey(NEW_KEY, MAPVK_VSC_TO_VK);
inputdata.ki.time = 0;//key->time;
inputdata.ki.dwExtraInfo = 0;//key->dwExtraInfo;
if(keyUP)
{
inputdata.ki.dwFlags |= KEYEVENTF_KEYUP;
}
Sleep(1);
bool sendKey=false;
switch(inputdata.ki.wScan)
{
//there once was some additional logic here to handle the modifier keys ctrl, alt and shift seperately, but not anymore
default:
{
sendKey=true;
}
break;
}
if (sendKey)
{
UINT result = SendInput(1, &inputdata, sizeof(INPUT));
if(!result)
ErrorExit(L"SendInput didn't work!");
}
}
break;
Output when pressing the normal shift key on a physical keyboard:
//key down
HEX: a0 00 00 00 2a 00 00 00 00 00 00 00 29 cf d2 00 00 00 00 00 00 00 00 00
Output key: VK: 160, scan: 42, flags: 0, time: 13815593, event: 256
//key up
HEX: a0 00 00 00 2a 00 00 00 80 00 00 00 e5 cf d2 00 00 00 00 00 00 00 00 00
Output key: VK: 160, scan: 42, flags: 128, time: 13815781, event: 257
Output when pressing the simulated shift key on the numpad:
//keydown event numpad
HEX: 60 00 00 00 52 00 00 00 00 00 00 00 a1 ea d2 00 00 00 00 00 00 00 00 00
Intercept Key, VK: 96, scan: 82, flags: 0, time: 13822625, event: 256
//keydown event simulated shift key
HEX: a0 00 00 00 2a 00 00 00 10 00 00 00 a1 ea d2 00 00 00 00 00 00 00 00 00
Output key: VK: 160, scan: 42, flags: 0, time: 13822625, event: 256
// key up event with strange scancode 554 <- THIS IS WHAT I AM TALKING ABOUT!
HEX: a0 00 00 00 2a 02 00 00 80 00 00 00 3d eb d2 00 00 00 00 00 00 00 00 00
Output key: VK: 160, scan: 554, flags: 128, time: 13822781, event: 257
// key up event numpad
HEX: 2d 00 00 00 52 00 00 00 80 00 00 00 3d eb d2 00 00 00 00 00 00 00 00 00
Intercept Key, VK: 45, scan: 82, flags: 128, time: 13822781, event: 257
// key down with strange scan code
HEX: a0 00 00 00 2a 02 00 00 00 00 00 00 3d eb d2 00 00 00 00 00 00 00 00 00
Output key: VK: 160, scan: 554, flags: 0, time: 13822781, event: 256
//key up event simulated shift key
HEX: a0 00 00 00 2a 00 00 00 90 00 00 00 3d eb d2 00 00 00 00 00 00 00 00 00
Output key: VK: 160, scan: 42, flags: 128, time: 13822781, event: 257
Another thing to note is that the key-up event of this strange key scan code 554 is also exactly in reverse to what the rest is (see output).
As I can simply eat the key event for this strange scan code 554, this is actually not much of a problem functionality-wise, but I would still like to know what this scan code 554 is all about.
C Data Structure:
typedef struct dpfpdd_dev_info {
unsigned int size;
char name[MAX_DEVICE_NAME_LENGTH];
DPFPDD_HW_DESCR descr;
DPFPDD_HW_ID id;
DPFPDD_HW_VERSION ver;
DPFPDD_HW_MODALITY modality;
DPFPDD_HW_TECHNOLOGY technology;
} DPFPDD_DEV_INFO;
typedef struct dpfpdd_hw_descr {
char vendor_name[MAX_STR_LENGTH];
char product_name[MAX_STR_LENGTH];
char serial_num[MAX_STR_LENGTH];
} DPFPDD_HW_DESCR;
typedef struct dpfpdd_hw_id {
unsigned short vendor_id;
unsigned short product_id;
} DPFPDD_HW_ID;
typedef struct dpfpdd_hw_version {
DPFPDD_VER_INFO hw_ver;
DPFPDD_VER_INFO fw_ver;
unsigned short bcd_rev;
} DPFPDD_HW_VERSION;
typedef struct dpfpdd_ver_info {
int major;
int minor;
int maintenance;
} DPFPDD_VER_INFO;
typedef unsigned int DPFPDD_HW_MODALITY;
typedef unsigned int DPFPDD_HW_TECHNOLOGY;
Emulated JS code of the C code above:
const ref = require('ref');
const StructType = require('ref-struct');
const FixedBuffer = require('./FixedBuffer'); // from https://gist.github.com/TooTallNate/80ac2d94b950216a2705
const DPFPDD_DEV_INFO = StructType({
size: ref.types.uint,
name: FixedBuffer(MAX_DEVICE_NAME_LENGTH, 'utf-8'),
descr: DPFPDD_HW_DESCR,
id: DPFPDD_HW_ID,
ver: DPFPDD_HW_VERSION,
modality: DPFPDD_HW_MODALITY,
technology: DPFPDD_HW_TECHNOLOGY
});
const DPFPDD_HW_DESCR = StructType({
vendor_name: FixedBuffer(MAX_STR_LENGTH, 'utf-8'),
product_name: FixedBuffer(MAX_STR_LENGTH, 'utf-8'),
serial_num: FixedBuffer(MAX_STR_LENGTH, 'utf-8')
});
const DPFPDD_HW_ID = StructType({
vendor_id: ref.types.ushort,
product_id: ref.types.ushort
});
const DPFPDD_HW_VERSION = StructType({
hw_ver: DPFPDD_VER_INFO,
fw_ver: DPFPDD_VER_INFO,
bcd_rev: ref.types.ushort
});
const DPFPDD_VER_INFO = StructType({
major: ref.types.int,
minor: ref.types.int,
maintenance: ref.types.int
});
const DPFPDD_HW_MODALITY = ref.types.uint;
const DPFPDD_HW_TECHNOLOGY = ref.types.uint;
DLL API:
int DPAPICALL dpfpdd_query_devices (unsigned int *dev_cnt, DPFPDD_DEV_INFO *dev_infos)
node-ffi code of the DLL API above:
const lib = ffi.Library('theDLL'), {
'dpfpdd_init': [ 'int', [] ],
'dpfpdd_query_devices': [ 'int', [ ref.refType('uint'), ref.refType(DPFPDD_DEV_INFO) ] ]
});
ffi function invocation in node:
lib.dpfpdd_init();
let deviceEntries = ref.alloc('uint');
let deviceInfo = ref.alloc(DPFPDD_DEV_INFO);
lib.dpfpdd_query_devices(deviceEntries, deviceInfo);
console.log(deviceInfo.deref());
The console log would return something like this:
{ size: 0,
name: <Buffer#0x019572AC 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ... >,
descr: { vendor_name: <Buffer#0x019576AC 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ... >,
product_name: <Buffer#0x0195772C 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ... >,
serial_num: <Buffer#0x019577AC 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ... >,
'ref.buffer': <Buffer#0x019576AC 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ... > },
id: { vendor_id: 0,
product_id: 0,
'ref.buffer': <Buffer#0x0195782C 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00> },
ver: { hw_ver: { major: 0,
minor: 0,
maintenance: 0,
'ref.buffer': <Buffer#0x01957830 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00> },
fw_ver: { major: 0,
minor: 0,
maintenance: 0,
'ref.buffer': <Buffer#0x0195783C 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00> },
bcd_rev: 0,
'ref.buffer': <Buffer#0x01957830 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00> },
modality: 0,
technology: 0,
'ref.buffer': <Buffer#0x019572A8 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ... > }
From all the Buffer type logs it seems like I'm receiving back no values at all from my OUT StructType parameter. I have no Idea why this is happening. I'm very new to using node-ffi and node-ref. I managed to get to communicate to another USB device but data structures are much simpler, no nested struct and array types. But this one I cannot get to work. Anyone enlighten me to what with what I'm doing wrong?
I wrote a basic helloworld.exe with C with the simple line printf("helloworld!\n");
Then I used UltraEdit to view the bytes of the EXE file and used also PE Explorer to see the header values. When it comes to Address of Entry Point, PE Explorer displays 0x004012c0.
Magic 010Bh PE32
Linker Version 1902h 2.25
Size of Code 00008000h
Size of Initialized Data 0000B000h
Size of Uninitialized Data 00000C00h
Address of Entry Point 004012C0h
Base of Code 00001000h
Base of Data 00009000h
Image Base 00400000h
But in UltraEdit I see 0x000012c0 after counting 16 bytes after magic 0x010B.
3F 02 00 00 E0 00 07 03 0B 01 02 19 00 80 00 00
00 B0 00 00 00 0C 00 00 C0 12 00 00 00 10 00 00
00 90 00 00 00 00 40 00 00 10 00 00 00 02 00 00
04 00 00 00 01 00 00 00 04 00 00 00 00 00 00 00
00 10 01 00 00 04 00 00 91 F6 00 00 03 00 00 00
00 00 20 00 00 10 00 00 00 00 10 00 00 10 00 00
00 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00
00 E0 00 00 C0 06 00 00 00 00 00 00 00 00 00 00
Which one is correct?
simply read about IMAGE_OPTIONAL_HEADER structure
AddressOfEntryPoint
A pointer to the entry point function, relative to the image base
address. For executable files, this is the starting address. For
device drivers, this is the address of the initialization function.
The entry point function is optional for DLLs. When no entry point is
present, this member is zero.
so absolute address of EntryPoint is AddressOfEntryPoint ? ImageBase + AddressOfEntryPoint : 0
in your case AddressOfEntryPoint == 12c0 and ImageBase == 400000
as result absolute address of EntryPoint is 12c0+400000==4012c0
I'm analyzing this tiny ELF file:
00000000 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 |.ELF............|
00000010 02 00 3e 00 01 00 00 00 78 00 40 00 00 00 00 00 |..>.....x.#.....|
00000020 40 00 00 00 00 00 00 00 98 00 00 00 00 00 00 00 |#...............|
00000030 00 00 00 00 40 00 38 00 01 00 40 00 03 00 02 00 |....#.8...#.....|
00000040 01 00 00 00 05 00 00 00 00 00 00 00 00 00 00 00 |................|
00000050 00 00 40 00 00 00 00 00 00 00 40 00 00 00 00 00 |..#.......#.....|
00000060 7e 00 00 00 00 00 00 00 7e 00 00 00 00 00 00 00 |~.......~.......|
00000070 00 00 20 00 00 00 00 00 31 c0 ff c0 cd 80 00 2e |.. .....1.......|
00000080 73 68 73 74 72 74 61 62 00 2e 74 65 78 74 00 00 |shstrtab..text..|
00000090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000000d0 00 00 00 00 00 00 00 00 0b 00 00 00 01 00 00 00 |................|
000000e0 06 00 00 00 00 00 00 00 78 00 40 00 00 00 00 00 |........x.#.....|
000000f0 78 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00 |x...............|
00000100 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 |................|
00000110 00 00 00 00 00 00 00 00 01 00 00 00 03 00 00 00 |................|
00000120 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000130 7e 00 00 00 00 00 00 00 11 00 00 00 00 00 00 00 |~...............|
00000140 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 |................|
00000150 00 00 00 00 00 00 00 00 |........|
00000158
I found documentation on the ELF header and the program header and decoded both of those, but I'm having problems decoding what's after this (starting with 31 c0 ff c0 cd 80 00 2e). Judging by the "shstrtab" text, I am looking at the section table, but what does 31 c0 ff c0 cd 80 00 2e mean? Where is this part documented?
OK, judging by the information in the first 16 bytes of the header:
00000000 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 |.ELF............|
E L F | | '--- Pudding :) ---'
| '--- Little-endian (ELFDATA2LSB)
'------ 64-bit (ELFCLASS64)
we're dealing with a 64-bit ELF with little-endian encoding of multi-byte numbers. So the ELF header is the first 4 rows in the hex editor. We're interested in these fields in the last two rows of it:
Prog Hdr Tab offset Sect Hdr Tab offset
.----------^----------. .----------^----------.
00000020 40 00 00 00 00 00 00 00 98 00 00 00 00 00 00 00 |#...............|
00000030 00 00 00 00 40 00 38 00 01 00 40 00 03 00 02 00 |....#.8...#.....|
'-.-' '-.-' '-.-' '-.-' '-.-'
PHT entry size ---' | | | '-- Sect names in #2
PHT num entries ----------' | '-- SHT num entries
'-------- SHT entry size
So we know that the Program Headers Table starts at offset 0x40 in the file (right after this header) and contains 1 entry of size 0x38 (56 bytes). So it ends at offset 0x40 + 1*0x38 = 0x78 (this is the first byte after this table, and this is also where your "mysterious data" begins, so keep this in mind).
The Section Headers Table starts at offset 0x98 in the file and contains 3 entries of size 0x40 (64 bytes), that is, each entry in SHT takes 4 consecutive rows in a hex editor, and the entire table is 3*4 = 12 such rows, so the offset 0x158 is the first byte after this table. But this is just the end of the file, so there's nothing more after the SHT.
The SHT entry at index 2 (the third=last one) should be a string table that contains the names for the sections.
So let's look at those sections now, shall we?
Section #2
Let's start with section #2, since it is supposed to contain the string table with the names for all the sections, so it will be very useful in further analysis. Here's its header (the last one in the table):
Name index Type=SHT_STRTAB (bingo!)
Flags .----^----. .----^----.
00000118 .----------^----------. 01 00 00 00 03 00 00 00 |........|
00000120 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000130 7e 00 00 00 00 00 00 00 11 00 00 00 00 00 00 00 |~...............|
'----------.----------' '----------.----------'
Starting offset Size
00000140 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 |................|
00000150 00 00 00 00 00 00 00 00 |........|
00000158
So this is indeed a string table (0x03 = SHT_STRTAB). It starts from offset 0x7E in the file and takes 0x11 (17) consecutive bytes. The first byte after the string table is therefore 0x8F. This byte is not a part of any section (garbage).
The string table
So let's see what's in the section containing the string table, so that we could name our sections:
0000007E 00 2e |..|
00000080 73 68 73 74 72 74 61 62 00 2e 74 65 78 74 00 |shstrtab..text.|
0000008F
Here's the string table, with addresses relative to its beginning:
+0 +1 +2 +3 +4 +5 +6 +7 +8 +9 +A +B +C +D +E +F
00: 00 2E 73 68 73 74 72 74 61 62 00 2e 74 65 78 74
10: 00
or the same in ASCII, with the NULL characters marked as ∎:
+0 +1 +2 +3 +4 +5 +6 +7 +8 +9 +A +B +C +D +E +F
00: ∎ . s h s t r t a b ∎ . t e x t
10: ∎
So we have just 3 full string in it, with the following relative offsets:
00: "" (Just the empty string)
01: ".shstrtab" (Name for this section)
0B: ".text" (Name for the section that contains the executable code)
(Keep in mind, though, that sections can also address substrings inside those strings, if they share the common ending.)
We can now verify that this section (#2) is indeed named .shstrtab: its name index was 0x01 after all, wasn't it? ;)
Section #1
Now let's take apart section #1's header:
Name index Type=SHT_PROGBITS
Flags .----^----. .----^----.
000000d8 .----------^----------. 0b 00 00 00 01 00 00 00 |........|
000000e0 06 00 00 00 00 00 00 00 78 00 40 00 00 00 00 00 |........x.#.....|
000000f0 78 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00 |x...............|
'----------.----------' '----------.----------'
Starting offset Size
00000100 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 |................|
00000110 00 00 00 00 00 00 00 00 |........|
00000118
So this section is named .text (note the name index 0x0B) and it is of type SHT_PROGBITS, so it contains some program-defined data; the executable code in this case. It starts from the offset 0x78 in the file and takes the next 6 bytes, so the first byte after this section is at offset 0x7E (where the string table begins). Here's its contents:
00000070 31 c0 ff c0 cd 80 |1.....|
0000007E
But wait! Remember where your "mysterious data" starts? Yes! It's the 0x78 offset! :) So this "mysterious data" is actually your executable payload :) After decoding it as Intel x86-64 opcodes we get this tiny little program:
31 C0 xor %eax,%eax ; Clear the EAX register to 0 (the short way).
FF C0 inc %eax ; Increase the EAX, so now it contains 1.
CD 80 int $0x80 ; Interrupt 0x80 is the system call on Linux.
which is basically equivalent to calling exit(0) in C ;) because the syscall interrupt expects the operation number in EAX, which in this case is sys_exit (operation number 1).
So yeah, mystery solved :) But let's continue anyway, to learn something more, and this way we'll find out where this piece of code will be loaded in memory.
Section #0
And finally section #0. It has some part missing, but I assume it was all 0s, since the first section is always a NULL section after all. Here's its (butchered) header:
00000098 00 00 00 00 00 00 00 00 | ........|
*
000000d0 00 00 00 00 00 00 00 00
But it's of no use to us. Nothing interesting here.
Program Headers Table
The last thing what's left to decode is the Program Headers Table, which – according to the information from the ELF header – starts from the offset 0x40 and takes 56 bytes, the first byte after it being at offset 0x78. Here's the dump:
Type=PHT_EXEC Flags=RX Starting offset in file
.----^----. .----^----. .----------^----------.
00000040 01 00 00 00 05 00 00 00 00 00 00 00 00 00 00 00 |................|
00000050 00 00 40 00 00 00 00 00 00 00 40 00 00 00 00 00 |..#.......#.....|
'----------.----------' '----------.----------'
Virtual address Physical address
Size in file Size in memory
.----------^----------. .----------^----------.
00000060 7e 00 00 00 00 00 00 00 7e 00 00 00 00 00 00 00 |~.......~.......|
00000070 00 00 20 00 00 00 00 00
00000078 '----------.----------'
Alignment
So it says that we load the first 126 (0x7E) bytes of the file into a memory segment of the same size, and the memory segment is supposed to start from the virtual address 0x400000. Our code starts from the offset 0x78 in the file and the first byte after it has the offset 0x7E, so it basically loads the entire beginning of the file, with the ELF header and the program header table into memory, as well as our executable payload at the end of it, and stops loading afterwards, ignoring the rest of the file.
So if the beginning of the file is loaded at address 0x400000, and our program starts 120 (0x78) bytes from its beginning, it will be located at the address 0x400078 in memory :>
Now let's see what entry point is specified in the ELF header for our program:
Executable x86-64 Version=1 Program's entry point
.-^-. .-^-. .----^----. .----------^----------.
00000010 02 00 3e 00 01 00 00 00 78 00 40 00 00 00 00 00 |..>.....x.#.....|
Bingo! :> It's 0x400078, so it points at the start of our little piece of code in the memory image.
And that's all, folks! ;)