GameBoy 16-bit load into 8-bit memory - emulation

I have begun programming an emulator for the Gameboy classic, my next project after a successful Chip 8 Emulator.
As a reference I use the GameBoy CPU Manual.
Now on page 66 it says:
LD A,(HL) 7E 8
Basically, load the value HL into register A.
However, as I understand this, it would load the 16-bit value HL into the 8-bit register A, which of course doesn't fit.
Do you have any idea what is meant here? All other references are just simple tables without explanation, but they say the same thing.
Thanks for your answers!

With this instruction, the value pointed to by HL is loaded into A, not the value of HL itself.
For example, if HL has the value 0xABCD and the byte at memory address 0xABCD is 0x50, then 0x50 is loaded into register A.
Pseudo implementation
register.A = memory.ReadByte(register.HL);
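For concreteness, here is a minimal C sketch of a handler for opcode 0x7E; the register struct and memory array are made-up names for illustration:

#include <stdint.h>

uint8_t mem[0x10000];              /* hypothetical flat 64 KiB address space */
struct { uint8_t a, h, l; } reg;   /* hypothetical register file */

/* Opcode 0x7E: LD A,(HL) - read the byte stored at address HL into A. */
void ld_a_hl(void)
{
    uint16_t hl = ((uint16_t)reg.h << 8) | reg.l;  /* H is the high byte, L the low byte */
    reg.a = mem[hl];
}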

I think LD A,(HL) is a synonym for something more widely written as LD a,[hl], based on the documentation for a similar instruction on page 71.
LDD A,(HL)
Description:
Put value at address HL into A. Decrement HL.
Same as: LD A,(HL) - DEC HL
Therefore, LD A,(HL) means "Put value at address HL into A." HL is a 16-bit value, but the byte stored at the address it references is only 8 bits wide, so that value fits into A.

Related

RISC-V user level reference or reference implementation

Summary: What is the definitive reference or reference implementation for the RISC-V user-level ISA?
Context: The RISC-V website has "The RISC-V Instruction Set Manual" which explains the user-level instructions very well, but does not give an exact specification for them. I am trying to build a user-level ISA simulator now and intend to write an FPGA implementation later, so the exact behavior is important to me.
A reference implementation would be sufficient, but should preferably be as simple as possible -- i.e. I would try to understand a pipelined implementation only as a last resort. What is important is to have an understanding of the specified ISA and not of a single CPU implementation or compiler implementation.
One example to show my problem is the AUIPC instruction: The prose explanation says that "AUIPC forms a 32-bit offset from the 20-bit U-immediate, filling in the lowest 12 bits with zeros, adds this offset to the pc, then places the result in register rd." I wanted to know whether this refers to the old or new PC, i.e. the position of the AUIPC instruction or the next instruction. I looked at the "RISCV Angel" implementation, but that seems to mask out the lower bits of the (old) PC -- not just of the immediate -- which I could not find any reason for in the spec, not even in the change history of the spec (since Angel is a bit older). Instead of an answer, I now have two questions about AUIPC. Many other instructions pose similar problems to me.
AFAICT the RISC-V Instruction Set Manual you cite is the closest thing there is to a definitive reference. If there are things that are unclear or incorrect in there then you could open issues at the Github site where that document is maintained: https://github.com/riscv/riscv-isa-manual
As far as AUIPC is concerned, the answer is implied, but not stated explicitly, by this sentence at the bottom of page 9 in the current manual:
There is one additional user-visible register: the program counter pc holds the address of the current instruction.
Based on that statement I would expect that the pc value that is seen and manipulated by the AUIPC instruction is the address of the AUIPC instruction itself.
This interpretation is supported by the discussion of the JALR instruction:
The indirect jump instruction JALR (jump and link register) uses the I-type encoding. The target address is obtained by adding the 12-bit signed I-immediate to the register rs1, then setting the least-significant bit of the result to zero. The address of the instruction following the jump (pc+4) is written to register rd.
Given that the address of the following instruction is expressed as pc+4, it seems clear that the pc value visible during the execution of JALR is the address of the JALR instruction itself.
The latest draft of the manual (at https://github.com/riscv/riscv-isa-manual/releases/download/draft-20190321-ba17106/riscv-spec.pdf) makes the situation slightly clearer. In place of this in the current manual:
AUIPC appends 12 low-order zero bits to the 20-bit U-immediate, sign-extends the result to 64 bits, then adds it to the pc and places the result in register rd.
the latest draft says:
AUIPC forms a 32-bit offset from the 20-bit U-immediate, filling in the lowest 12 bits with zeros, adds this offset to the pc of the AUIPC instruction, then places the result in register rd.
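A minimal C sketch of that reading for RV32, where pc is assumed to still hold the address of the AUIPC instruction itself while it executes (the function names are illustrative):

#include <stdint.h>

/* AUIPC, RV32: rd = pc of this instruction + (U-immediate << 12).
   insn is the raw 32-bit instruction word; its bits 31:12 are the U-immediate. */
uint32_t auipc_result(uint32_t pc, uint32_t insn)
{
    uint32_t offset = insn & 0xFFFFF000u;  /* imm[31:12] with the low 12 bits zeroed */
    return pc + offset;
}

/* JALR link value for comparison: pc+4 is the address of the next instruction,
   so pc here must be the address of the JALR instruction itself. */
uint32_t jalr_link_value(uint32_t pc)
{
    return pc + 4;
}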

Linux x86 CPU Instruction Layout Confusion

In x86, I understand multi-byte objects are stored in memory little endian style.
Now generally speaking, when it comes to CPU instructions, the opcode determines the purpose of the instruction, and data/memory addresses may follow the opcode in its encoded format. My understanding is that the opcode portion of the instruction should be the most significant byte and should thus appear at the highest address of any given instruction encoding.
Can someone explain the memory layout in this x86 Linux GDB example? I would imagine that the opcode 0xb8 would appear at a higher address, since it is the most significant byte.
(gdb) disassemble _start
Dump of assembler code for function _start:
0x08048080 <+0>: mov eax,0x11223344
(gdb) x/1xb _start+0
0x8048080 <_start>: 0xb8
(gdb) x/1xb _start+1
0x8048081 <_start+1>: 0x44
(gdb) x/1xb _start+2
0x8048082 <_start+2>: 0x33
(gdb) x/1xb _start+3
0x8048083 <_start+3>: 0x22
(gdb) x/1xb _start+4
0x8048084 <_start+4>: 0x11
It appears the instruction mov eax, 0x11223344 is encoded as 0x11 0x22 0x33 0x44 0xb8.
Questions.
1.) How does the CPU know how many bytes the instruction will take up if the first byte it sees is not an opcode?
2.) I'm wondering if perhaps x86 CPU instructions do not even have endianness and are treated as some type of string? (probably way off here)
x86 is a variable-length instruction set: you start with a single byte, which has no endianness; it is wherever it is.
Then, depending on the opcode, there may be more bytes, and those might for example be a 32-bit immediate; if that group of bytes is an immediate or an address of some sort, THOSE bytes will be little-endian. Say you have the five bytes A B C D E (no endianness, think array or string). The A byte is the opcode, the B byte is the lower 8 bits of the immediate, and the E byte is the upper 8 bits of the immediate.
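For the mov eax,0x11223344 example above, a small C sketch of how a decoder might reassemble the little-endian immediate from the bytes that follow the B8 opcode:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* B8 is "mov eax, imm32"; the following four bytes are the immediate,
       stored least-significant byte first. */
    uint8_t insn[5] = { 0xB8, 0x44, 0x33, 0x22, 0x11 };

    uint32_t imm = (uint32_t)insn[1]
                 | ((uint32_t)insn[2] << 8)
                 | ((uint32_t)insn[3] << 16)
                 | ((uint32_t)insn[4] << 24);

    printf("opcode 0x%02X, immediate 0x%08X\n", insn[0], imm);  /* prints 0x11223344 */
    return 0;
}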
Opcode is a hard term to use. In these older 8/16-bit CISC processors like x86, the entire byte was an opcode that you basically looked up in a table to see what it meant (and inside the processor a table was indeed used to figure out how to execute it). When you look at MIPS or ARM or other (certainly RISC) instruction sets, only a portion of the 32 bits is the "opcode", and in neither of those cases is it the same set of bits from one instruction to another; you have to look at various places in the instruction (yes, there is overlap to make the decoding sane). MIPS is a lot more consistent: there is one blob in one place you look at, but one of those patterns requires you to look at another blob of bits to fully decode. With ARM, you start at a common bit and further decode the instruction as you work your way across; then you may have to grab some random-looking spots to fully decode it. The rest of the bits are operands: which register to use, an immediate, or whatever; the kind of thing that in a CISC you needed a lookup table for (they are implied by the opcode but not defined by bits in the opcode).
1) The next byte after the prior instruction will be interpreted as an opcode, even if it was not intended to be one (assuming execution continues to that byte and doesn't branch away). I don't remember my x86 table offhand well enough to know whether there are any undefined instructions; if the byte is undefined, it will hit a handler, otherwise the CPU will decode whatever it finds as machine code, and if those are not properly formed instructions it will likely crash. Sometimes you get lucky and it just messes something up and keeps going, or you are even luckier and you can't tell that it almost crashed.
2) You are right: for these 8/16-bit CISC or similar instruction sets, the instructions are treated more like strings that you parse through linearly.

What do opcodes 0xE9 (JP HL) and 0xF8 (LD HL,SP+r8) do?

I think I am struggling to correctly define the following ambiguous opcodes: LD HL,SP+r8 and JP (HL) (0xF8 and 0xE9 respectively).
In my implementation, LD HL,SP+r8 sets HL to the value of SP+r8, but I have a feeling that it may be to do with loading memory from RAM.
JP (HL), I have as PUSHing the PC onto the stack and setting the Program Counter to the value of HL (like JP a16, except with the value of HL), but I've read a few forum posts that seem to say that's wrong.
Any clarification of what either of these instructions do would be great, as I'm pretty lost at the moment.
In my implementation, LD HL,SP+r8 sets HL to the value of SP+r8, but I have a feeling that it may be to do with loading memory from RAM.
No. It just takes an 8-bit immediate, sign-extends it, adds the value of SP to it and stores the result in HL.
JP (HL), I have as PUSHing the PC onto the stack and setting the Program Counter to the value of HL (like JP a16, except with the value of HL)
JP doesn't push the current PC on the stack (maybe you're confusing it with CALL). What JP (HL) does is just PC = HL.
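Put in emulator terms, a minimal C sketch of both instructions; the cpu struct and function names are made up for illustration, and the flag updates of LD HL,SP+r8 are omitted:

#include <stdint.h>

struct cpu { uint16_t pc, sp, hl; };   /* hypothetical CPU state */

/* 0xF8: LD HL,SP+r8 - sign-extend the 8-bit immediate and add it to SP.
   (The real instruction also sets the H and C flags; not shown here.) */
void ld_hl_sp_r8(struct cpu *c, uint8_t imm)
{
    c->hl = (uint16_t)(c->sp + (int8_t)imm);   /* the int8_t cast does the sign extension */
}

/* 0xE9: JP (HL) - jump to the address held in HL; nothing is pushed on the stack. */
void jp_hl(struct cpu *c)
{
    c->pc = c->hl;
}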

What does a hexadecimal number, with a register in parenthesis mean in Assembly?

lea 0x1c(%ebp),%eax
So, I understand vaguely what the lea instruction does, and I know those are registers, but what is this structure: 0x1c(%ebp)? I got this code out of objdump.
It is one of the many x86 addressing modes. Specifically, this is referred to as "displacement" addressing.
Since you said you used objdump and didn't specify that you used the -M flag, I'm going to assume this is in the GAS (AT&T) syntax, as opposed to Intel syntax. This means that the first operand is the source and the second operand is the destination.
The lea 0x1C(%ebp),%eax instruction means, "Take the value in %ebp, add 0x1C (28 in decimal), then store that value in %eax".
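A rough C analogue of the difference between lea (compute the address) and mov (load from the address), using a plain variable to stand in for %ebp:

#include <stdint.h>

uint32_t lea_vs_mov(uint32_t ebp)
{
    /* lea 0x1c(%ebp),%eax : eax = ebp + 0x1c, with no memory access at all */
    uint32_t eax = ebp + 0x1c;

    /* mov 0x1c(%ebp),%eax would instead dereference that address:
       eax = *(uint32_t *)(ebp + 0x1c); */
    return eax;
}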

operand of LIDT is displacement/absolute address

I stumbled upon a statement in Intel Software developers manual:
"For LGDT, LIDT, LLDT, LTR, SGDT, SIDT, SLDT, STR, the exit qualification receives the value of the instruction’s displacement field, which is sign-extended to 64 bits if necessary (32 bits on processors that do not support Intel 64 architecture). If the instruction has no displacement (for example, has a register operand), zero is stored into the exit qualification. "
Now if I have an instruction LIDT 0xf290, then is "0xf290" a displacement? I think the answer is yes.
So my confusion is: what all constitutes a displacement? I was under the impression that a displacement is something which is calculated with respect to the current eip value.
For example, jmp xxx (in intrasegment jumps this will be a displacement, but for intersegment jumps it should be an absolute address). If that is the case, then why does LIDT load a relative address?
A displacement is just an offset from some origin, which may be a Base+Index*Scale, or 0. The other x86 operand field that can hold large values is the immediate, which is useful for things like adding constants (e.g. ADD $42, %eax).
Incidentally, it appears that relative jumps use the immediate field, probably because they modify EIP by a constant.
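To see where the displacement fits, here is a rough C sketch of the general x86 effective-address formula (base + index*scale + displacement); with no base or index register, as in LIDT 0xf290, the displacement alone gives the absolute address:

#include <stdint.h>

/* Simplified 32-bit effective-address calculation. */
uint32_t effective_address(uint32_t base, uint32_t index, uint32_t scale,
                           int32_t displacement)
{
    return base + index * scale + (uint32_t)displacement;
}

/* LIDT 0xf290: no base, no index, so the displacement 0xF290 is simply
   the absolute address of the operand in memory. */
uint32_t lidt_operand_address(void)
{
    return effective_address(0, 0, 1, 0xF290);
}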
