Double up bit gb-emu - emulation

Apologies in advance, as this is an old topic. I was reading the following post on how the Nintendo logo data is decompressed and scaled before being copied into VRAM during bootstrap. Interestingly, the data written does indeed look like gibberish (as pointed out by the questioner), and I have tried my best (with a GB emulator I wrote) to produce that same output, but without success.
Link to post
The assembly code in question is this part of the boot rom:
LD C,A ; $0095 "Double up" all the bits of the graphics data
LD B,$04 ; $0096 and store in Video RAM
Addr_0098:
PUSH BC ; $0098
RL C ; $0099
RLA ; $009b
POP BC ; $009c
RL C ; $009d
RLA ; $009f
DEC B ; $00a0
JR NZ, Addr_0098 ; $00a1
LD (HL+),A ; $00a3
INC HL ; $00a4
LD (HL+),A ; $00a5
INC HL ; $00a6
RET
In the reply to the above post, the output written to VRAM is shown to be:
8000: 00000000000000000000000000000000
8010: F000F000FC00FC00FC00FC00F300F300
8020: 3C003C003C003C003C003C003C003C00
8030: F000F000F000F00000000000F300F300
8040: 000000000000000000000000CF00CF00
... and so on
Can anyone explain how this output is generated and if it is indeed correct?
Many thanks in advance.
P.S. I am assuming that the Nintendo logo is explicitly copied (by some C/Java code in the emulator) into memory starting at address 0104h during the boot process, so that the bootstrap can be tested.
.DB $CE,$ED,$66,$66,$CC,$0D,$00,$0B,$03,$73,$00,$83,$00,$0C,$00,$0D
.DB $00,$08,$11,$1F,$88,$89,$00,$0E,$DC,$CC,$6E,$E6,$DD,$DD,$D9,$99
.DB $BB,$BB,$67,$63,$6E,$0E,$EC,$CC,$DD,$DC,$99,$9F,$BB,$B9,$33,$3E

After going through my code and finding a silly bug (maybe a couple), I was finally able to get the same result as above. Please consider this resolved.
Basically, I was forgetting to update the flags in the F register for INC n, ADD n and SUB n.
So the above output does seem to be correct.
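For anyone comparing their own emulator against that dump, here is a rough Python model of what the routine does. It is only a sketch under a few assumptions: VRAM starts out zeroed, HL starts at $8010, and the boot ROM calls the routine twice per logo byte (once at $0095 for the high nibble and once at $0096 for the low nibble, as far as I understand the boot ROM listing). The names double_nibble and expand_logo are made up for illustration.

```python
# First 8 logo bytes, taken from the .DB lines in the question.
LOGO = bytes([0xCE, 0xED, 0x66, 0x66, 0xCC, 0x0D, 0x00, 0x0B])

def double_nibble(nibble):
    """Repeat each of the 4 bits twice, e.g. 0b1100 -> 0b11110000."""
    out = 0
    for bit in range(3, -1, -1):           # most significant bit first, like RL C / RLA
        b = (nibble >> bit) & 1
        out = (out << 2) | (b << 1) | b    # shift the same bit in twice
    return out

def expand_logo(logo, vram_size=0x100, start=0x10):
    """Return a bytearray standing in for VRAM at $8000 (assumed zeroed)."""
    vram = bytearray(vram_size)
    hl = start                             # HL = $8010 (assumption)
    for byte in logo:
        for nibble in (byte >> 4, byte & 0x0F):
            a = double_nibble(nibble)
            vram[hl] = a                   # LD (HL+),A then INC HL
            vram[hl + 2] = a               # LD (HL+),A then INC HL
            hl += 4                        # skipped bytes keep their zero value
    return vram

vram = expand_logo(LOGO)
for addr in range(0x00, 0x50, 0x10):
    print(f"{0x8000 + addr:04X}: {vram[addr:addr + 16].hex().upper()}")
```

Each nibble becomes one doubled byte that is written to two consecutive tile rows, which is why every value in the dump appears twice with 00 in between.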

Related

How to place code to a specific address using gas?

I used IDA Pro to disassemble an existing software running on a Motorola 68K CPU. The output of IDA is a disassembly that follows the MRI notation. I analyzed it and found that it consists of five separate parts:
Default Vectors
Bootloader SW
Application SW
Constants
Test Routines
The memory layout of the disassembly is roughly as shown below. The "stuffing" sections you'll find between the parts are filled with 0xFF and IDA translated them into constants.
Now, my task is to add custom code. As I need to put it somewhere, I decided to use the "stuffing" section that follows the Application SW (0x31600 onwards). As this increases the overall size, I simply remove the corresponding number of constants 0xFF to compensate.
This works nicely the first time, but soon gets annoying: each time I adapt my custom code, I need to adjust the number of constants accordingly.
To find a more convenient solution, my idea was to get rid of the "stuffing" sections. That is, I would explicitly assign each of my five plus one parts (Bootloader SW, Application SW, ...) to its designated address. Using the MRI assembler, the following would do:
.org 0x00000
<Default Vectors>
.org 0x00100
<Bootloader SW>
.org 0x10000
<Application SW>
.org 0x31600
<My new custom code>
.org 0x60000
<Constants>
.org 0x69300
<Test Routines>
Unfortunately, I do not use the MRI assembler, but the GNU m68k-elf-as for some reason. But m68k-elf-as does not support using the .org directive!
So instead of using the .org directive, I tried to use the .section directive:
.section MyVectors, "r"
<Default Vectors>
.section MyBootloader, "x"
<Bootloader SW>
...
Then I tried to make the linker aware of these sections in the linker script:
MEMORY
{
ROM(rx) : ORIGIN = 0x00000, LENGTH = 512K
}
SECTIONS
{
.MyVectors :
{
KEEP(*(.MyVectors))
} > ROM = 0xFF
.MyBootloader :
...
}
Unfortunately, the above does not work properly. Despite my effort, the linker puts all of my sections next to each other without any "stuffing" in between.
How can I solve my issue?
In the linker script, you can assign a start address for each of your named sections. Use the ". =" syntax to do that, effectively setting the location counter manually:
MEMORY
{
ROM(rx) : ORIGIN = 0x00000, LENGTH = 512K
}
SECTIONS
{
. = 0;
.MyVectors :
{
KEEP(*(.MyVectors))
} > ROM = 0xFF
. = 0x100;
.MyBootloader :
{
...
} > ROM = 0xFF
. = 0x10000;
...
}

Include binary file in as86/bin86

I have written a bit of code in i8086 assembler that is supposed to put an 80x25 image into VRAM and show it on screen.
entry start
start:
mov di,#0xb800 ; Point ES:DI at VRAM
mov es,di
mov di,#0x0000
mov si,#image ; And DS:SI at Image
mov cx,#0x03e8 ; Image is 1000 bytes
mov bl,#0x20 ; Print spaces
; How BX is used:
; |XXXX XXXX XXXXXXXX|
; ^^^^^^^^^ BL contains ascii whitespace
; ^^^^ BH higher 4 bits contain background color
; ^^^^ BH lower 4 bits contain unused foreground color
img_loop:
seg ds ; Load color
mov bh,[si]
seg es ; Write a whitespace and color to VRAM
mov [di],bx
add di,#2 ; Advance one 'pixel'
sal bh,#4 ; Shift the unused lower 4-bits so that they become background color for the 2nd pixel
seg es
mov [di],bx
add di,#2
add si,#1
sub cx,#1 ; Repeat until all 1000 bytes are read
jnz img_loop
endless:
jmp endless
image:
GET splash.bin
The problem is that I cannot get the as86 assembler to include the binary data from the image file. I have looked at the man page, but I could not find anything that works.
If I try to build the above code, it gives me no error, but the output file produced by the linker is only 44 bytes in size, so clearly it did not bother to put in the 1000-byte image.
Can anybody help me with that? What am I doing wrong?
I am not certain that this will help you, as I have never tried it for 8086 code. But you might be able to make it work.
The objcopy program can convert binary objects to various different formats. Like this example from the man objcopy page:
objcopy -I binary -O <output_format> -B <architecture> \
--rename-section .data=.rodata,alloc,load,readonly,data,contents \
<input_binary_file> <output_object_file>
So from that you'd have an object file with your <input_binary_file> in a section named .rodata. But you could name it whatever you wanted. Then use a linker to link your machine code to the image data.
The symbol names are created for you too. Also from the man page:
-B
--binary-architecture=bfdarch
Useful when transforming a architecture-less input file into an object file. In this case the output architecture can be set to
bfdarch. This option will be ignored if the input file has a known
bfdarch. You can access this binary data inside a program by
referencing the special symbols that are created by the conversion
process. These symbols are called _binary_objfile_start,
_binary_objfile_end and _binary_objfile_size. e.g. you can transform a picture file into an object file and then access it in your code using
these symbols.
If your whole code is pure code (no executable headers, no relocation...) you can just manually concatenate the image at the end of the code (and of course remove GET splash.bin). In Linux for example you can do cat code-binary image-binary > final-binary.
Thank you, everybody, for trying to help me. Unfortunately I did not get objcopy to work (maybe I am just too stupid, who knows), and while I actually used cat at first, I soon had to include multiple binary files, which should still be accessible via labels in my assembler code, so that was not a solution either.
What I ended up doing was the following: You reserve the exact amount of bytes in your assembler source code directly after the label you wanna put in your binary file, i.e.:
splash_img:
.SPACE 1000
snake_pit:
.SPACE 2000
Then you assemble your source code and create a symbol table by adding the -s option, i.e. -s snake.symbols, to your call to as86. The linker call does not change. Now you have a binary file that has a bunch of zeroes at the position where you wanna have your binary data, and you have a symbol table that should look similar to this:
0 00000762 ---R- snake_pit
0 0000037A ---R- splash_img
All you gotta do now is get a program to overwrite the binary file created by the linker with your binary include file, starting at the addresses found in the symbol table. It is up to you how you wanna do it, there are a lot of ways; I ended up writing a small C program that does this.
Then I just call ./as86_binin snake snake.symbols splash_img splash.bin and it copies the binary include into my linked assembler program.
I am sorry for answering my own question now, but I felt like this is the best way to do it. It is quite unfortunate bin86 doesn't have a simple binary include macro on its own. If anybody else runs into this problem in the future, I hope this will help you.
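The patching step could look roughly like the Python sketch below. This is not the as86_binin tool from the answer, just an illustration under some assumptions: the symbol table uses the four-column layout shown above, and the addresses in it are also offsets into the linked file (which should hold for a flat binary linked at origin 0). The function names parse_symbols and patch are made up.

```python
def parse_symbols(text):
    """Map symbol name -> offset from lines like '0 0000037A ---R- splash_img'."""
    symbols = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 4:                      # assumed four-column layout
            symbols[fields[3]] = int(fields[1], 16)
    return symbols

def patch(image, offset, payload):
    """Return a copy of image with payload written at offset."""
    if offset + len(payload) > len(image):
        raise ValueError("payload does not fit in the reserved space")
    patched = bytearray(image)
    patched[offset:offset + len(payload)] = payload
    return bytes(patched)

symbols = parse_symbols(
    "0 00000762 ---R- snake_pit\n"
    "0 0000037A ---R- splash_img\n"
)
linked = bytes(0x762 + 2000)     # stand-in for the linker output (all zeroes)
splash = bytes([0x1F]) * 1000    # stand-in for the contents of splash.bin
result = patch(linked, symbols["splash_img"], splash)
print(hex(symbols["splash_img"]), len(result))
```

With real files you would read the linked binary and the include file with open(..., "rb") and write the result back. Note that 0x37A + 1000 is exactly 0x762, so the two reserved areas in the example table sit directly next to each other.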

Assembly parenthesis explanation

Hello, I'm looking at an executable and don't have access to the source code. I haven't really come across this before, and what I have found online doesn't match the data that I am getting. Code:
0x08048d4c <+45>: movsbl (%ebx,%eax,1),%esi
0x08048d50 <+49>: and $0xf,%esi
0x08048d53 <+52>: add (%ecx,%esi,4),%edx
My confusion is in the +52 line. "x/d $ecx" yields 2, and the value in %esi before the line is executed is 7. After that line is executed, %edx is equal to 3 (it was zero beforehand).
I thought that it would be 2 + (7*4), but that is not the case. Can someone please enlighten me? This is AT&T syntax, I believe.
Yes, it's AT&T syntax, and if you are confused by it, then switch gdb to Intel syntax (set disassembly-flavor intel). You would see something like: add edx, [ecx + esi*4]
Anyway, this fetches an operand from memory, from address ecx + esi*4. You can see what that is using x/d $ecx+$esi*4. x/d $ecx doesn't tell you anything useful here, because it only shows the value stored at address ecx; the scaled index is added to the address, not to that value.
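To make the difference concrete, here is a small Python model of that instruction. The base address 0x1000 and most of the table contents are hypothetical; only the first entry (2, what x/d $ecx showed) and the eighth entry (3, what ended up in edx) are taken from the question.

```python
import struct

BASE = 0x1000                           # hypothetical value of ecx
TABLE = [2, 10, 6, 1, 12, 16, 9, 3]     # hypothetical, except TABLE[0] == 2 and TABLE[7] == 3
RAM = struct.pack("<8i", *TABLE)        # eight little-endian 32-bit ints

def effective_address(base, index, scale, disp=0):
    """AT&T operand disp(base,index,scale) refers to address disp + base + index*scale."""
    return disp + base + index * scale

def load_dword(addr):
    """Read the 32-bit value stored at addr in the toy RAM."""
    return struct.unpack_from("<i", RAM, addr - BASE)[0]

ecx, esi, edx = BASE, 7, 0
print(load_dword(ecx))                                # what x/d $ecx shows
edx += load_dword(effective_address(ecx, esi, 4))     # add (%ecx,%esi,4),%edx
print(edx)
```

The 2 at address ecx never takes part in the addition; the instruction reads the dword 28 bytes further on and adds that to edx.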

"Hello World" function without using C printf

UPDATED
It's my second day working with NASM. After thoroughly understanding this
section .programFlow
global _start
_start:
mov edx,len
mov ecx,msg
mov ebx,0x1 ;select STDOUT stream
mov eax,0x4 ;select SYS_WRITE call
int 0x80 ;invoke SYS_WRITE
mov ebx,0x0 ;select EXIT_CODE_0
mov eax,0x1 ;select SYS_EXIT call
int 0x80 ;invoke SYS_EXIT
section .programData
msg: db "Hello World!",0xa
len: equ $ - msg
I wanted to wrap this stuff inside an assembly function. All (or most of) the examples on the web are using extern and calling printf function of C (see code below) - and I don't want that. I want to learn to create a "Hello World" function in assembly without using C printf (or even other external function calls).
global _main
extern _printf
section .text
_main:
push message
call _printf
add esp, 4
ret
section .data
message: db "Hello, World", 10, 0
Update
I am practicing assembly for Linux, but since I do not own a Linux box, I am running my assembly code here compile_assembly_online.
Assuming you mean in a Windows command prompt environment, writing to standard out:
Since those provide a virtualized version of the old DOS environment, I believe you can use the old DOS interrupts for it:
int 21h, function 9 can output a string: Set AH to 9, DS:DX to a string terminated with a $, and trigger the interrupt.
int 21h, function 2 can output a single character, so you could use that repeatedly if you need to output $ (or you don't want Ctrl+C and such checking). Set AH to 2, DL to the ASCII (I expect) character code, and trigger the interrupt.
int 0x80 won't work in Windows or DOS simply because it's a Linux thing. So that's the first thing that has to change.
In terms of doing it under Windows, at some point you're going to need to call a Windows API function, such as (in this case) WriteConsole(). That's bypassing the C library as desired.
It does use the OS to do the heavy lifting in getting output to the "screen" but that's the same as int 0x80 and is probably required whether it's Linux, Windows or DOS.
If it is genuine DOS, your best place to start is the excellent Ralf Brown's Interrupt List, specifically Int21/Fn9.
I want to point out that Nasm "knows" certain section names - ".text", ".data", and ".bss" (a couple others that you don't need yet). The leading '.' is required, and the names are case sensitive. Using other names, as you've done in your first example, may "work" but may not give you the "attributes" you want. For example, section .programData is going to be read-only. Since you don't try to write to it this isn't going to do any harm... but section .data is supposed to be writable.
Trying to learn asm for Linux without being able to try it out must be difficult. Maybe that online site is enough for you. There's a thing called "andlinux" (I think) that will let you run Linux programs in Windows. Or you could run Linux in a "virtual machine". Or you could carve out a partition on one of your many spare drives and actually install Linux.
For DOS, there's DosBox... or you could install "real DOS" on one of those extra partitions. From there, you can write "direct to screen" at B800h:xxxx. (one byte for "character" and the next for "color"). If you want to do this "without help from the OS", that may be what you want. In a protected mode OS, forget it. They're protected from US!
Maybe you just want to know how to write a subroutine, in general. We could write a subroutine with "msg" and "len" hard coded into it - not very flexible. Or we could write a subroutine that takes two parameters - either in registers or on the stack. Or we could write a subroutine that expects a zero-terminated string (printf does, sys_write does not) and figure out the length to put in edx. If that's what you need help with, we've gotten distracted talking about int 80h vs int 21h vs WriteFile. You may need to ask again...
EDIT: Okay, a subroutine. The non-obvious part of this is that call puts the return address (the address of the instruction right after the call) on the stack, and ret gets the address to return to off the stack, so we don't want to alter where ss:sp points in between. We can change it, but we need to put it back where it was before we hit the ret.
; purpose: to demonstrate a subroutine
; assemble with: nasm -f bin -o myfile.com myfile.asm
; (for DOS)
; Nasm defaults to 16-bit code in "-f bin" mode
; but it won't hurt to make it clear
bits 16
; this does not "cause" our code to be loaded
; at 100h (256 decimal), but informs Nasm that
; this is where DOS will load a .com file
; (needed to calculate the address of "msg", etc.)
org 100h
; we can put our data after the code
; or we can jump over it
; we do not want to execute it!
; this will cause Nasm to move it after the code
section .data
msg db "Hello, World!", 13, 10, "$"
msg2 db "Goodbye cruel world!", 13, 10, "$"
section .text
; execution starts here
mov dx, msg ; address/offset of msg
call myprint
; "ret" comes back here
; no point in a subroutine if we're only going to do it once
mov dx, msg2
call myprint
; when we get back, do something intelligent
exit:
mov ah, 4Ch ; DOS's exit subfunction
int 21h
; ---------------------------------------
; subroutines go here, after the exit
; we don't want to "fall through" into 'em!
myprint:
; expects: address of a $-terminated string in dx
; returns: nothing
push ax ; don't really need to do this
mov ah, 9 ; DOS's print subfunction
int 21h ; do it
pop ax ; restore caller's ax - and our stack!
ret
; end of subroutine
That's untested, but subject to typos and stupid logic errors, it "should" work. We can make it more complicated - pass the parameter on the stack instead of just in dx. We can provide an example for Linux (same general idea). I suggest taking it in small steps...
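If the point about not disturbing the stack between call and ret seems abstract, here is a tiny Python model of it. It is only a toy: the stack is a list, the return address 0x0106 and the initial ax value are made-up illustration values, and run is a hypothetical helper that plays through myprint with or without the final pop ax.

```python
def run(restore_stack):
    """Simulate 'call myprint' ... 'ret'; return (address ret jumps to, final ax)."""
    stack = []
    ax = 0x1234                        # hypothetical value the caller had in ax
    return_address = 0x0106            # hypothetical address of the instruction after the call
    stack.append(return_address)       # call: push the return address
    stack.append(ax)                   # push ax
    ax = (ax & 0x00FF) | 0x0900        # mov ah, 9 (only the high byte changes)
    if restore_stack:
        ax = stack.pop()               # pop ax
    return stack.pop(), ax             # ret: pop the address to jump to

print([hex(v) for v in run(True)])     # balanced push/pop
print([hex(v) for v in run(False)])    # forgot the pop
```

With the pop in place, ret finds the return address on top of the stack. Without it, ret pops the saved ax instead and would jump to whatever address that value happens to be.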

80x86 Assembly - Very basic I/O program conversion to Linux from Windows

So my first day of Assembly class, and what do you know? My professor teaches everything on her Windows box, using Windows API calls, etc. which is fine except that I'm running Ubuntu on my box..
Basically, I'm hoping I can find either a workaround or some form of common-grounds in order for me to get my assignments done.
Today, our first programming assignment was to input two integers and output the sum. I followed my professor's code as follows:
.386
.model flat
ExitProcess PROTO NEAR32 stdcall, dwExitCode:DWORD
include io.h
cr EQU 0dh
lf EQU 0ah
.stack 4096
.data
szPrompt1 BYTE "Enter first number: ", 0
szPrompt2 BYTE "Enter second number: ", 0
szLabel1 BYTE cr, lf, "The sum is ", 0
dwNumber1 DWORD ? ; numbers to be added
dwNumber2 DWORD ?
szString BYTE 40 DUP (?) ; input string for numbers
szSum BYTE 12 DUP (0) ; sum in string form
szNewline BYTE cr,lf,0
.code ; start of main program code
_start:
output szPrompt1 ; prompt for first number
input szString,40 ; read ASCII characters
atod szString ; convert to integer
mov dwNumber1,eax ; store in memory
output szPrompt2 ; repeat for second number
input szString,40
atod szString
mov dwNumber2,eax
mov eax,dwNumber1 ; first number to EAX
add eax,dwNumber2 ; add second number
dtoa szSum,eax ; convert to ASCII characters
output szLabel1 ; output label and results
output szSum
output szNewline
INVOKE ExitProcess,0 ; exit with return code 0
PUBLIC _start ; make entry point public
END ; end of source code
Simple and straightforward enough, yeah? So I turned it in today, all linked up from the crappy school computers. I completely understand all the concepts involved; however, I see 2 main issues here if I actually want to assemble it on my box:
1) .model flat
2) ExitProcess PROTO NEAR32 stdcall, dwExitCode:DWORD
Both of which I've heard are very Windows-specific. So my question is: how can I change this code so that it assembles on Linux?
Sorry If I'm missing any details, but I'll let you know if you need.
Thanks!
Assembly code is, generally speaking, almost always platform specific. Indeed, the very syntax varies between assemblers, even within the same hardware and OS platform!
You'll also probably have problems with that io.h there - I would bet it's making a lot of calls into win32 APIs.
I would recommend simply using wine, along with a copy of whatever assembler your professor is using, to run your professor's examples. If it can run things like Microsoft Office and Steam, it can certainly run some trivial example code :)
