How do I take operands as registers from the byte value? - emulation

I have a fairly simple program so far to start off my emulation experience. I load in an instruction and determine how many (if any) operands there are, then I grab those operands and use them. For things like jumps and pushes it's somewhat straightforward, until I get to registers. How do I know when an operand is a register? Or how can I tell if it's the value at an address instead of just an address (i.e., when they use something like ld (hl),a)?
I'm rather new to emulation and all, but I have a decent bit of experience with assembly, even for the z80.
Question
How do I tell the difference between what is meant as a register and what is meant as an address or dereference of an address?

Because you decode the instruction. For example, in ld (hl), a, which is 0x77, or 0b01110111, the top two bits 01 tell you it's an ld reg8, reg8 and that you have to decode two groups of 3 bits, each a reg8. So 110 and 111, and you look them up in the reg8 decoding table, where 110 means (hl) and 111 means a. Alternatively you could just make a Giant Switch of Death and directly decode 0x77 to ld (hl), a, but that's more of a difference in implementation than anything deep or significant.
The instruction completely specifies what the operands are, so this "how do I tell" question strikes me as a bit silly - the answer is already staring you right in the face when you're decoding the instruction.
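To make that concrete, here is a minimal sketch in C of decoding the ld reg8, reg8 group straight from the opcode byte. The reg8 table follows the standard Z80 encoding; the function names are just for illustration.

#include <stdint.h>
#include <stdio.h>

/* Standard Z80 reg8 encoding; 110 is the memory operand (hl). */
static const char *reg8[8] = { "b", "c", "d", "e", "h", "l", "(hl)", "a" };

static void decode(uint8_t opcode) {
    if ((opcode >> 6) == 0x01) {            /* 01 ddd sss: ld reg8, reg8 */
        if (opcode == 0x76) {               /* the ld (hl),(hl) slot is halt */
            printf("halt\n");
            return;
        }
        uint8_t dst = (opcode >> 3) & 0x07; /* bits 5-3: destination reg8 */
        uint8_t src = opcode & 0x07;        /* bits 2-0: source reg8 */
        printf("ld %s, %s\n", reg8[dst], reg8[src]);
    }
}

int main(void) {
    decode(0x77); /* prints: ld (hl), a */
    return 0;
}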
See also: decoding z80 opcodes

Related

Obfuscation of checksum guards

As part of my project, I have to insert small pieces of code called checksum guards into a C program. These guards calculate the checksum value of a portion of code using a function (add, xor, etc.) which operates on the instruction opcodes. So, if somebody has tampered with the instructions (add, modify, delete) in that region of code, the checksum value will change and the intrusion will be detected.
Here is the research paper which talks about this technique:
https://www.cerias.purdue.edu/assets/pdf/bibtex_archive/2001-49.pdf
Here is the guard template:
guard:
    add ebp, -checksum    ; fold the precomputed checksum into ebp
    mov eax, client_addr  ; start of the guarded code region
for:
    cmp eax, client_end   ; past the end of the region?
    jg end
    mov ebx, dword[eax]   ; fetch the next 4 bytes of code
    add ebp, ebx          ; accumulate them into the running sum
    add eax, 4
    jmp for
end:
Now, I have two questions.
Would putting the guards in the assembly be better than putting them in the source program?
Assuming I put them in the assembly (at an appropriate place), what kind of obfuscation should I use to prevent the guard template from being easily visible? (When I have more than one guard, the attacker should not easily find all the guard templates and disable all the guards together, as that would leave the code with no security.)
Thank you in advance.
From the attacker's (without sources) point of view, the first question doesn't matter; he's tampering with the final binary machine code, and whether it was produced from .c or .s will make zero difference. So I would worry mainly about how to generate the correct binary with the appropriate checksums. I'm not aware of any way to get a proper checksum inside the C source, but I can imagine an external tool running over the assembler files created by the C compiler, as a post-processing step before the .s files are compiled into .o. But... keep in mind that some calls and addresses are just relative offsets, and the binary loaded into memory is patched by the OS loader according to the linker's table to make those point to real memory addresses. Thus the data bytes will change (the opcodes will stay fixed).
Your guard template doesn't take that into account, and checksums whole instructions, data bytes included. (Some advanced guards have opcode definitions and checksum/encrypt/decipher only the opcodes themselves, without the operand bytes.)
Otherwise it's neat that the result is a damaged ebp value, ruining any C code around (*) that works with stack variables. But it's still an artificial test; you can simply comment out both add ebp,-checksum and add ebp,ebx, making the guard harmless.
(*) Notice you have to put the guard in between some classic C code to get real runtime problems from the invalid ebp value. If you put it at the end of a subroutine which ends with pop ebp, everything would still work fine.
So to the second question:
You definitely want more malicious ways to guard the correct value than only ebp damage. Usually the hardest (to remove) way is to make the checksum value part of some calculation, skewing results just slightly, so serious usage of the SW will be impossible, but it will take the user time to notice.
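As a sketch of that idea (all names and values here are hypothetical, and EXPECTED_SUM would be patched in by a post-build tool), the checksum can be folded into a genuine calculation so that patching the guarded region skews every result instead of crashing:

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical guarded region; in real use these words would be actual code. */
static const uint32_t client_code[] = { 0x8B480789u, 0x0474C085u };
#define EXPECTED_SUM (0x8B480789u + 0x0474C085u)

static uint32_t checksum_region(const uint32_t *p, size_t n) {
    uint32_t sum = 0;
    while (n--) sum += *p++;
    return sum;
}

/* Untampered code gives delta == 0 and a correct result; a patched
   region silently skews every computed value. */
uint32_t compute_price(uint32_t base, uint32_t qty) {
    uint32_t delta = checksum_region(client_code,
                                     sizeof client_code / sizeof *client_code)
                     - EXPECTED_SUM;
    return base * qty + delta;
}

int main(void) {
    printf("%u\n", compute_price(100, 3)); /* 300 while unmodified */
    return 0;
}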
You can also take some genuine code loop and add your checksumming to it, so simply skipping the whole loop will also skip valid code (but I can imagine this one only being added by hand into the assembly generated from C, so you will have to redo it after every new compilation of that particular C source).
Then the particular guard template can be obfuscated by any imaginable mutation (different registers used, modified order of instructions, instruction variants); try searching for viruses with mutation encoding to get some ideas.
And I didn't read that paper, but from the figures I would say the main point is to make the guarded areas overlap, so patching out one of them will affect another one. That sounds to me like the extra sugar needed to make this somewhat functional (although it still looks like a normal challenge for 8-bit game crackers ;), not even "hard" level). But it also means you would need either a very cunning external tool to calculate that cyclic tree of dependencies and insert the guard templates in the correct order, or to do it all manually again.
Of course, when doing it manually, you have to redo it after each new C compilation, so it's worth the effort only for something very precious and expensive, or rock-solid stable, where you will not produce another revision for the next 10 years or so... :D

6502 and little-endian conversion

For fun I'm implementing an NES emulator. I'm currently reading through documentation for the 6502 CPU and I'm a little confused.
I've seen documentation stating that because the 6502 is little-endian, when using absolute addressing mode you need to swap the bytes. I'm writing this on an x86 machine, which is also little-endian, so I don't understand why I couldn't simply cast to a uint16_t*, dereference that, and let the compiler work out the details.
I've written some simple tests in Google Test and they seem to agree with me.
// implementation of READ16
#define READ16(addr) (*(uint16_t*)addr)

TEST(MemMacro, READ16) {
  uint8_t arr[] = {0xFF, 0xCC};
  uint8_t *mem = &arr[0];
  EXPECT_EQ(0xCCFF, READ16(mem));
}
This passes, so it appears my supposition is correct, but I thought I'd ask someone with more experience than I.
Is this correct for pulling out the operand in 6502 absolute addressing mode? Am I possibly missing something?
It will work for simple cases on little-endian systems, but tying your implementation to those feels unnecessary when the corresponding portable implementation is simple. Sticking to the macro, you could do this instead:
#define READ16(addr) (addr[0] + (addr[1] << 8))
(Just to be pedantic, you should also make sure that addr[1] can't be out-of-bounds, and would need to add some more parentheses if addr could be a complex expression.)
However, as you keep developing your emulator, you will find that it's most natural to use a pair of general-purpose read_mem() and write_mem() functions that operate on single bytes. Remember that the address space is split up into multiple regions (RAM, ROM, and memory-mapped registers from the PPU and APU), so having e.g. a single array that you index into won't work well. The fact that memory regions can be remapped by mappers also complicates things. (You won't have to worry about that for simple games though -- I recommend starting with Donkey Kong.)
What you need to do is to figure out what region or memory-mapped register the address belongs to inside your read_mem() and write_mem() functions (this is called address decoding), and do the right thing for the address.
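As a rough illustration (ram, prg_rom, and ppu_read_reg are hypothetical stand-ins, and real mappers add more cases), read_mem() with address decoding might look like this:

#include <stdint.h>

static uint8_t ram[0x0800];      /* 2 KB internal RAM */
static uint8_t prg_rom[0x8000];  /* cartridge PRG ROM (simple NROM layout) */

static uint8_t ppu_read_reg(uint16_t reg) { return 0; /* PPU elided */ }

uint8_t read_mem(uint16_t addr) {
    if (addr < 0x2000)  return ram[addr & 0x07FF];          /* RAM, mirrored */
    if (addr < 0x4000)  return ppu_read_reg(addr & 0x0007); /* PPU regs, mirrored */
    if (addr >= 0x8000) return prg_rom[addr - 0x8000];      /* PRG ROM */
    return 0;           /* APU/IO registers and cartridge space elided */
}

/* Absolute addressing then fetches its 16-bit operand byte by byte,
   independent of host endianness: */
uint16_t read16(uint16_t addr) {
    return (uint16_t)(read_mem(addr) | (read_mem((uint16_t)(addr + 1)) << 8));
}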
Returning to the original question, the fact that you'll end up using read_mem() to read the individual bytes of the address anyway means that the uint16_t casting trickery is even less likely to be useful. This is the simplest and most robust approach w.r.t. handling corner cases, and what every emulator I've seen does in practice (Nestopia, Nintendulator, and FCEUX).
In case you've missed it, the #nesdev channel on EFNet is very active and a good resource by the way. I assume you're already familiar with the NESDev wiki. :)
I've also been working on an emulator which can be found here.

68000 Assembly Language - How to know whether an address is an absolute long or short operand

For example: MOVE.W $1234,$8000
Could someone tell me what the source is using (long or short) and what the destination is using (long or short)? Can you explain how to find this out?
Thanks.
It is probably whatever the assembler decides to use.
To force it, use an appropriate suffix:
move.w ($1234).w, ($8000).l
to use a short (also called "near") source but a long (aka "far") destination address.
In my (semi-ancient) experience, you don't need to care about this very often, just let the assembler do its job.
Unless explicitly specified by hinting the assembler (the notation may differ slightly depending on the assembler used; $1234.w would hint the assembler to use short mode), what is done by default depends on the assembler you're using.
A common and sensible choice is to use the shorter variant where possible; e.g. anything between -32768 and 32767 inclusive is assembled as short, anything else as long. Applying this rule, $1234 would be assembled as short, while $8000 would assemble as long (because $8000.w would yield an effective address of $FFFF8000 when evaluated by the processor; as explicitly stated in the 68k family manual, address operands less than 32 bits in size are sign-extended to 32 bits before being used).
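A quick C illustration of that sign-extension rule, just to show the arithmetic:

#include <inttypes.h>
#include <stdio.h>

int main(void) {
    /* A short (.w) absolute address is sign-extended to 32 bits: */
    uint32_t ea1 = (uint32_t)(int16_t)0x1234; /* 0x00001234: reachable as short */
    uint32_t ea2 = (uint32_t)(int16_t)0x8000; /* 0xFFFF8000, not $00008000!     */
    printf("%08" PRIX32 " %08" PRIX32 "\n", ea1, ea2);
    return 0;
}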

Decoding 68k instructions

I'm writing an interpreted 68k emulator as a personal/educational project. Right now I'm trying to develop a simple, general decoding mechanism.
As I understand it, the first two bytes of each instruction are enough to uniquely identify the operation (with two rare exceptions) and the number of words left to be read, if any.
Here is what I would like to accomplish in my decoding phase:
1. read two bytes
2. determine which instruction it is
3. extract the operands
4. pass the opcode and the operands on to the execute phase
I can't just pass the first two bytes into a lookup table like I could with the first few bits in a RISC arch, because operands are "in the way". How can I accomplish part 2 in a general way?
Broadly, my question is: How do I remove the variability of operands from the decoding process?
More background:
Here is a partial table from section 8.2 of the Programmer's Reference Manual:
Table 8.2. Operation Code Map
Bits 15-12   Operation
0000         Bit Manipulation/MOVEP/Immediate
0001         Move Byte
...
1110         Shift/Rotate/Bit Field
1111         Coprocessor Interface...
This made great sense to me, but then I looked at the bit patterns for each instruction and noticed that there isn't a single instruction where bits 15-12 are 0001, 0010, or 0011. There must be some big piece of the picture that I'm missing.
This Decoding Z80 Opcodes site explains decoding explicitly, which is something I haven't found in the 68k programmer's reference manual or by googling.
I've decided to simply create a look-up table with every possible pattern for each instruction. It was my first idea, but I discarded it as "wasteful, inelegant". Now, I'm accepting it as "really fast".
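As a sketch of that approach (handler names made up), one function pointer per possible 16-bit opcode word removes the operand variability from the decode path entirely; it also shows where the "missing" 0001 pattern went, since bits 15-12 == 0001 is the whole MOVE byte group:

#include <stdint.h>
#include <stdio.h>

typedef void (*handler_t)(uint16_t opcode);

static void op_illegal(uint16_t op) { printf("illegal %04X\n", op); }
static void op_move_b(uint16_t op)  { printf("move.b  %04X\n", op); }

static handler_t decode_table[0x10000];

void build_decode_table(void) {
    for (uint32_t op = 0; op <= 0xFFFF; op++) {
        /* Bits 15-12 == 0001 is MOVE byte (0010 = long, 0011 = word),
           which is why no individual instruction lists those patterns;
           a real table would also validate the EA fields here. */
        decode_table[op] = ((op >> 12) == 0x1) ? op_move_b : op_illegal;
    }
}

int main(void) {
    build_decode_table();
    decode_table[0x1200](0x1200); /* move.b d0,d1 */
    return 0;
}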

Hex 0x0001 vs 0x00000001

Often in code that uses permission checking, I see some folks use hex 0x0001 and others use 0x00000001. These both look equivalent to a decimal 1, if I'm not mistaken.
Why use one over the other? Is it just a matter of preference?
Assuming that this is C, C++, Java, C# or something similar, they are the same. Writing 0x0001 suggests a 16-bit value to the reader while 0x00000001 suggests a 32-bit one, but the literal's actual type is determined by the language's rules (from its value, not from how many digits were written). This is a question of coding style; it makes no difference in the compiled code.
What's going on here is that these are bitmasks, for which it is traditional to pad with leading zeros out to the width of the mask. I would furthermore guess the width of the bitmask changed at some point to add more specialized permissions.
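For example (the permission names here are hypothetical), padding each flag to the mask's full width keeps the bit positions visually aligned:

#include <stdio.h>

#define PERM_READ   0x00000001
#define PERM_WRITE  0x00000002
#define PERM_EXEC   0x00000004
#define PERM_ADMIN  0x00010000  /* a later, more specialized permission */

int main(void) {
    unsigned perms = PERM_READ | PERM_WRITE;
    printf("%d\n", 0x0001 == 0x00000001);            /* prints 1: same value */
    printf("writable: %d\n", (perms & PERM_WRITE) != 0);
    return 0;
}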

Resources