80x86 Assembly - Very basic I/O program conversion to Linux from Windows

So, it's my first day of Assembly class, and what do you know? My professor teaches everything on her Windows box, using Windows API calls and so on, which is fine, except that I'm running Ubuntu on my box.
Basically, I'm hoping to find either a workaround or some common ground so that I can get my assignments done.
Today, our first programming assignment was to input two integers and output the sum. I followed my professor's code as follows:
.386
.model flat
ExitProcess PROTO NEAR32 stdcall, dwExitCode:DWORD
include io.h
cr EQU 0dh
lf EQU 0ah
.stack 4096
.data
szPrompt1 BYTE "Enter first number: ", 0
szPrompt2 BYTE "Enter second number: ", 0
szLabel1 BYTE cr, lf, "The sum is ", 0
dwNumber1 DWORD ? ; numbers to be added
dwNumber2 DWORD ?
szString BYTE 40 DUP (?) ; input string for numbers
szSum BYTE 12 DUP (0) ; sum in string form
szNewline BYTE cr,lf,0
.code ; start of main program code
_start:
output szPrompt1 ; prompt for first number
input szString,40 ; read ASCII characters
atod szString ; convert to integer
mov dwNumber1,eax ; store in memory
output szPrompt2 ; repeat for second number
input szString,40
atod szString
mov dwNumber2,eax
mov eax,dwNumber1 ; first number to EAX
add eax,dwNumber2 ; add second number
dtoa szSum,eax ; convert to ASCII characters
output szLabel1 ; output label and results
output szSum
output szNewline
INVOKE ExitProcess,0 ; exit with return code 0
PUBLIC _start ; make entry point public
END ; end of source code
Simple and straightforward enough, yeah? I turned it in today, all linked up on the crappy school computers, and I understand all the concepts involved. However, I see 2 main issues if I actually want to assemble it on my own box:
1) .model flat
2) ExitProcess PROTO NEAR32 stdcall, dwExitCode:DWORD
Both of which I've heard are very Windows-specific. So my question is: how can I change this code so that it assembles on Linux?
Sorry if I'm missing any details; let me know if you need more.
Thanks!

Assembly code is, generally speaking, almost always platform specific. Indeed, the very syntax varies between assemblers, even within the same hardware and OS platform!
You'll also probably have problems with that io.h; I would bet it makes a lot of calls into Win32 APIs.
I would recommend simply using Wine, along with a copy of whatever assembler your professor is using, to run your professor's examples. If it can run things like Microsoft Office and Steam, it can certainly run some trivial example code :)

Related

Double up bit gb-emu

Apologies in advance, as this is an old topic. I was reading the following post on how the Nintendo logo data is decompressed and scaled before being copied into VRAM during bootstrap. Interestingly enough, the data written does indeed look like gibberish (as pointed out by the questioner), and I have tried my best (with a GB emulator I wrote) to reproduce that same output, but without success.
Link to post
The assembly code in question is this part of the boot rom:
LD C,A ; $0095 "Double up" all the bits of the graphics data
LD B,$04 ; $0096 and store in Video RAM
Addr_0098:
PUSH BC ; $0098
RL C ; $0099
RLA ; $009b
POP BC ; $009c
RL C ; $009d
RLA ; $009f
DEC B ; $00a0
JR NZ, Addr_0098 ; $00a1
LD (HL+),A ; $00a3
INC HL ; $00a4
LD (HL+),A ; $00a5
INC HL ; $00a6
RET
In a reply to the above post, the output to VRAM is shown to be:
8000: 00000000000000000000000000000000
8010: F000F000FC00FC00FC00FC00F300F300
8020: 3C003C003C003C003C003C003C003C00
8030: F000F000F000F00000000000F300F300
8040: 000000000000000000000000CF00CF00
... and so on
Can anyone explain how this output is generated and if it is indeed correct?
Many thanks in advance.
P.S. Assumption is made that the Nintendo logo is explicitly (inside some C/Java code) copied over to v-ram starting at address 0104h during boot process to test the bootstrap.
.DB $CE,$ED,$66,$66,$CC,$0D,$00,$0B,$03,$73,$00,$83,$00,$0C,$00,$0D
.DB $00,$08,$11,$1F,$88,$89,$00,$0E,$DC,$CC,$6E,$E6,$DD,$DD,$D9,$99
.DB $BB,$BB,$67,$63,$6E,$0E,$EC,$CC,$DD,$DC,$99,$9F,$BB,$B9,$33,$3E
After going through my code and finding a silly bug (maybe a couple), I was finally able to get the same result as above. Please consider this resolved.
Basically, I was forgetting to update the F register after the flags changed for INC n, ADD n and SUB n.
So technically, the above output seems to be correct.
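For anyone who wants to check the listing by hand, here is a minimal Python sketch of what the boot ROM routine appears to do. It assumes the routine is entered once per nibble of each logo byte (high nibble first), that every doubled row is written twice, and that the skipped bytes stay zero. The names double_bits and decompress_logo are made up for illustration; they are not from any emulator.

```python
def double_bits(nibble):
    """Duplicate each of the 4 bits of a nibble: 0b1100 -> 0b11110000."""
    out = 0
    for i in range(3, -1, -1):
        bit = (nibble >> i) & 1
        out = (out << 2) | (bit * 0b11)   # append the same bit twice
    return out


def decompress_logo(logo):
    """Expand logo bytes the way the listing above suggests.

    Each nibble becomes one doubled byte, stored on two consecutive
    tile rows. The second byte of every row is left at 0x00, which
    matches the 'LD (HL+),A / INC HL' pattern.
    """
    vram = bytearray()
    for byte in logo:
        for nibble in (byte >> 4, byte & 0x0F):
            row = double_bits(nibble)
            vram += bytes([row, 0x00, row, 0x00])
    return bytes(vram)


# First eight bytes of the logo data quoted in the question.
LOGO_START = bytes([0xCE, 0xED, 0x66, 0x66, 0xCC, 0x0D, 0x00, 0x0B])

tile_data = decompress_logo(LOGO_START)
for offset in range(0, len(tile_data), 16):
    chunk = tile_data[offset:offset + 16]
    print(f"{0x8010 + offset:04X}: {chunk.hex().upper()}")
```

With those eight bytes it prints the same four lines as the dump above, from 8010 through 8040, so the quoted VRAM contents are consistent with the routine.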

Linux sys_open in 64-bit NASM returns negative value

I am opening an existing file to write into it, using sys_open and sys_write. The sys_write works correctly when I create a new file as shown below. But if I use sys_open, the return value is negative (-13, which is "Permission denied") and the write doesn't work (of course).
This works:
section .data
File_Name: db '/opt/Test_Output_Files/Linux_File_Test',0
File_Mode: dq 754q
Write_Buffer: db 'This is what I want to write',0
section .text
; Create file
mov rax,85 ; sys_creat
mov rdi,File_Name
mov rsi,File_Mode ; mode (permissions)
syscall
mov rdi,rax ; return code from sys_creat
mov rax,1 ; sys_write
mov rsi,Write_Buffer
mov rdx,29
syscall
But when I open the existing file, the sys_open command fails:
mov rax,2 ; sys_open
mov rdi,File_Name
mov rsi,2 ;read-write
mov rdx,[File_Mode]
syscall
Because this is a permissions error, the issue is most likely the flags value in rsi because the mode value in rdx is the same as I use with sys_creat (754). According to the Linux man pages at http://man7.org/linux/man-pages/man2/open.2.html and https://linux.die.net/man/3/open, there are three required options:
O_RDONLY - Open for reading only.
O_WRONLY - Open for writing only.
O_RDWR - Open for reading and writing. The result is undefined if this flag is applied to a FIFO.
I know that read-only is zero, so I assumed write only and read-write are 1 and 2, but I haven't found any listing of the numeric values that we would use in assembly language, unlike the mode which is based on chmod -- and it's the same mode value I used for create, which works.
I've researched this extensively, but there is sparse information on 64-bit syscalls -- most of it is 32-bit. For NASM I need to use a numeric value for the flags in rsi. The man pages say "In addition, zero or more file creation flags and file status flags can be bitwise-or'd in flags. The file creation flags are O_CLOEXEC, O_CREAT, O_DIRECTORY, O_EXCL, O_NOCTTY, O_NOFOLLOW, O_TMPFILE, and O_TRUNC." I could bitwise OR them if I knew what their values are.
Thanks for any help with this.
My guess is that you do not have read-write permission on the file you are trying to open.
You should try O_RDONLY.
Anyway, to answer your question: your guess about the access modes is right (O_RDONLY is 0, O_WRONLY is 1, O_RDWR is 2).
As far as the other flag values are concerned, those will be:
O_CREAT (0x40)
O_TRUNC (0x200)
O_APPEND (0x400)
You can find the entire list in:
/usr/include/asm-generic/fcntl.h
Note: if O_CREAT is not set then 'mode' (the value that you set in rdx) will be ignored.
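To show how the values combine, here is a small Python sketch that ORs them together to get the number you would load into rsi. The constants are written out by hand for Linux on x86-64 (the header lists them in octal); other architectures may use different values, and open_flags is just a helper name invented for this example.

```python
# Values for Linux on x86-64, as found in asm-generic/fcntl.h.
# They are hard-coded here on purpose, so treat them as an assumption
# if you are on another architecture.
O_RDONLY = 0o0
O_WRONLY = 0o1
O_RDWR = 0o2
O_CREAT = 0o100      # 0x40
O_TRUNC = 0o1000     # 0x200
O_APPEND = 0o2000    # 0x400


def open_flags(*flags):
    """Bitwise-OR any number of open(2) flags into one integer."""
    value = 0
    for flag in flags:
        value |= flag
    return value


rdwr_create = open_flags(O_RDWR, O_CREAT)
append_only = open_flags(O_WRONLY, O_APPEND)
truncate_write = open_flags(O_WRONLY, O_CREAT, O_TRUNC)

print(hex(rdwr_create))     # 0x42
print(hex(append_only))     # 0x401
print(hex(truncate_write))  # 0x241
```

So, for example, opening an existing file for appending would use mov rsi,0x401 in the NASM code.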

Include binary file in as86/bin86

I have written a bit of code in i8086 assembler that is supposed to put an 80x25 image into the VRAM and show it on screen.
entry start
start:
mov di,#0xb800 ; Point ES:DI at VRAM
mov es,di
mov di,#0x0000
mov si,#image ; And DS:SI at Image
mov cx,#0x03e8 ; Image is 1000 bytes
mov bl,#0x20 ; Print spaces
; How BX is used:
; |XXXX XXXX XXXXXXXX|
; ^^^^^^^^^ BL contains ascii whitespace
; ^^^^ BH higher 4 bits contain background color
; ^^^^ BH lower 4 bits contain unused foreground color
img_loop:
seg ds ; Load color
mov bh,[si]
seg es ; Write a whitespace and color to VRAM
mov [di],bx
add di,#2 ; Advance one 'pixel'
sal bh,#4 ; Shift the unused lower 4-bits so that they become background color for the 2nd pixel
seg es
mov [di],bx
add di,#2
add si,#1
sub cx,#1 ; Repeat until all 1000 bytes are read
jnz img_loop
endless:
jmp endless
image:
GET splash.bin
The problem is that I cannot get the as86 assembler to include the binary data from the image file. I have looked at the man page, but I could not find anything that works.
If I try to build the above code, it gives me no error, but the output file produced by the linker is only 44 bytes in size, so clearly it did not put in the 1000-byte image.
Can anybody help me with that? What am I doing wrong?
I am not certain that this will help you, as I have never tried it for 8086 code. But you might be able to make it work.
The objcopy program can convert binary objects to various different formats. Like this example from the man objcopy page:
objcopy -I binary -O <output_format> -B <architecture> \
--rename-section .data=.rodata,alloc,load,readonly,data,contents \
<input_binary_file> <output_object_file>
So from that you'd have an object file with your <input_binary_file> in a section named .rodata. But you could name it whatever you wanted. Then use a linker to link your machine code to the image data.
The symbol names are created for you too. Also from the man page:
-B
--binary-architecture=bfdarch
Useful when transforming a architecture-less input file into an object file. In this case the output architecture can be set to
bfdarch. This option will be ignored if the input file has a known
bfdarch. You can access this binary data inside a program by
referencing the special symbols that are created by the conversion
process. These symbols are called _binary_objfile_start,
_binary_objfile_end and _binary_objfile_size. e.g. you can transform a picture file into an object file and then access it in your code using
these symbols.
If your whole code is pure code (no executable headers, no relocation...) you can just manually concatenate the image at the end of the code (and of course remove GET splash.bin). In Linux for example you can do cat code-binary image-binary > final-binary.
Thank you to everybody who tried to help me. Unfortunately I did not get objcopy to work (maybe I am just too stupid, who knows), and while I actually used cat at first, I soon had to include multiple binary files that should still be accessible via labels in my assembler code, so that was not a solution either.
What I ended up doing was the following: You reserve the exact amount of bytes in your assembler source code directly after the label you wanna put in your binary file, i.e.:
splash_img:
.SPACE 1000
snake_pit:
.SPACE 2000
Then you assemble your source code and create a symbol table by adding the -s option, i.e. -s snake.symbols, to your call to as86. The linker call does not change. Now you have a binary file that has a bunch of zeroes at the position you wanna have your binary data, and you have a symbol table that should look similar to this:
0 00000762 ---R- snake_pit
0 0000037A ---R- splash_img
All you gotta do now is get a program to overwrite part of the binary file created by the linker with your binary include file, starting at the address found in the symbol table. It is up to you how you wanna do it, there are a lot of ways; I ended up writing a small C program that does this.
Then I just call ./as86_binin snake snake.symbols splash_img splash.bin and it copies the binary include into my linked assembler program.
I am sorry for answering my own question now, but I felt like this is the best way to do it. It is quite unfortunate bin86 doesn't have a simple binary include macro on its own. If anybody else runs into this problem in the future, I hope this will help you.
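For illustration, the patching step could look roughly like the Python sketch below. This is not the C program mentioned above, only a guess at the idea. It assumes the linked file is a flat binary with no header, so that the address in the symbol table equals the file offset, and that the second column of each symbol line is that address in hex. parse_symbols and patch are hypothetical names.

```python
def parse_symbols(text):
    """Map label -> offset from lines like '0 0000037A ---R- splash_img'."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 4:                    # skip anything unexpected
            table[fields[3]] = int(fields[1], 16)
    return table


def patch(image, symbols, label, payload):
    """Return a copy of image with payload written at the label's offset."""
    offset = symbols[label]
    if offset + len(payload) > len(image):
        raise ValueError("payload does not fit in the linked binary")
    patched = bytearray(image)
    patched[offset:offset + len(payload)] = payload
    return bytes(patched)


# Symbol table lines as shown above; the binary and image are stand-ins.
SYMBOLS = "0 00000762 ---R- snake_pit\n0 0000037A ---R- splash_img\n"
table = parse_symbols(SYMBOLS)

linked = bytes(0x762 + 2000)        # zero-filled, like the .SPACE areas
splash = bytes([0x1F]) * 1000       # made-up 1000-byte image
result = patch(linked, table, "splash_img", splash)

print(hex(table["splash_img"]), len(result))
```

In real use you would read the linked binary and the image from disk, patch once per label, and write the result back out.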

Assembly parenthesis explanation

Hello, I'm looking at an executable and don't have access to the source code. I haven't really come across this before, and what I have found online doesn't match the data that I am getting. Code:
0x08048d4c <+45>: movsbl (%ebx,%eax,1),%esi
0x08048d50 <+49>: and $0xf,%esi
0x08048d53 <+52>: add (%ecx,%esi,4),%edx
My confusion is about the +52 line. "x/d $ecx" yields 2, and the value in %esi before the line is executed is 7. After that line is executed, %edx is equal to 3 (it was zero beforehand).
I thought that it would be 2 + (7*4), but that is not the case. Can someone please enlighten me? This is AT&T syntax, I believe.
Yes, it's AT&T syntax, and if you are confused by it, switch gdb to Intel syntax (set disassembly-flavor intel). You would then see something like: add edx, [ecx + esi*4]
Anyway, this fetches an operand from memory, from address ecx + esi*4. You can see what that is using x/d $ecx+$esi*4. x/d $ecx doesn't help here because esi*4 is added to the address in ecx, not to the value stored there.
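A rough Python model of that instruction may make the difference clearer. The base address and most of the table contents below are invented; only the first entry (2) and the eighth entry (3) come from the numbers in the question.

```python
def effective_address(base, index, scale, disp=0):
    """Address computed for an AT&T operand disp(base,index,scale)."""
    return (disp + base + index * scale) & 0xFFFFFFFF


# Hypothetical base address for the table that %ecx points at.
ecx = 0x0804A1C0

# Table of 32-bit values. Only entries 0 and 7 are taken from the
# question; the others are made up for the example.
table = [2, 10, 6, 1, 12, 16, 9, 3]
memory = {ecx + 4 * i: value for i, value in enumerate(table)}

esi = 7
edx = 0

# add (%ecx,%esi,4),%edx  ->  edx += memory[ecx + esi*4]
addr = effective_address(ecx, esi, 4)
edx = (edx + memory[addr]) & 0xFFFFFFFF

print(hex(addr), edx)   # 0x804a1dc 3
```

So the 2 you saw with x/d $ecx is just element 0 of the table, while the add reads element 7.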

"Hello World" function without using C printf

UPDATED
It's my second day working with NASM. After thoroughly understanding this
section .programFlow
global _start
_start:
mov edx,len
mov ecx,msg
mov ebx,0x1 ;select STDOUT stream
mov eax,0x4 ;select SYS_WRITE call
int 0x80 ;invoke SYS_WRITE
mov ebx,0x0 ;select EXIT_CODE_0
mov eax,0x1 ;select SYS_EXIT call
int 0x80 ;invoke SYS_EXIT
section .programData
msg: db "Hello World!",0xa
len: equ $ - msg
I wanted to wrap this stuff inside an assembly function. All (or most of) the examples on the web are using extern and calling printf function of C (see code below) - and I don't want that. I want to learn to create a "Hello World" function in assembly without using C printf (or even other external function calls).
global _main
extern _printf
section .text
_main:
push message
call _printf
add esp, 4
ret
section .data
message: db "Hello, World", 10, 0
Update
I am practicing assembly for Linux, but since I do not own a Linux box, I am running my assembly code here compile_assembly_online.
Assuming you mean in a Windows command prompt environment, writing to standard out:
Since those provide a virtualized version of the old DOS environment, I believe you can use the old DOS interrupts for it:
int 21h, function 9 can output a string: set AH to 9, point DS:DX at a string terminated with a $, and trigger the interrupt.
int 21h, function 2 can output a single character, so you could use that repeatedly if you need to output $ (or you don't want Ctrl+C and such checking). Set AH to 2, DL to the ASCII (I expect) character code, and trigger the interrupt.
int 0x80 won't work in Windows or DOS simply because it's a Linux thing. So that's the first thing that has to change.
In terms of doing it under Windows, at some point you're going to need to call a Windows API function, such as (in this case) WriteConsole(). That's bypassing the C library as desired.
It does use the OS to do the heavy lifting in getting output to the "screen" but that's the same as int 0x80 and is probably required whether it's Linux, Windows or DOS.
If it is genuine DOS, your best place to start is the excellent Ralf Brown's Interrupt List, specifically Int21/Fn9.
I want to point out that Nasm "knows" certain section names - ".text", ".data", and ".bss" (a couple others that you don't need yet). The leading '.' is required, and the names are case sensitive. Using other names, as you've done in your first example, may "work" but may not give you the "attributes" you want. For example, section .programData is going to be read-only. Since you don't try to write to it this isn't going to do any harm... but section .data is supposed to be writable.
Trying to learn asm for Linux without being able to try it out must be difficult. Maybe that online site is enough for you. There's a thing called "andlinux" (I think) that will let you run Linux programs in Windows. Or you could run Linux in a "virtual machine". Or you could carve out a partition on one of your many spare drives and actually install Linux.
For DOS, there's DosBox... or you could install "real DOS" on one of those extra partitions. From there, you can write "direct to screen" at B800h:xxxx. (one byte for "character" and the next for "color"). If you want to do this "without help from the OS", that may be what you want. In a protected mode OS, forget it. They're protected from US!
Maybe you just want to know how to write a subroutine, in general. We could write a subroutine with "msg" and "len" hard coded into it - not very flexible. Or we could write a subroutine that takes two parameters - either in registers or on the stack. Or we could write a subroutine that expects a zero-terminated string (printf does, sys_write does not) and figure out the length to put in edx. If that's what you need help with, we've gotten distracted talking about int 80h vs int 21h vs WriteFile. You may need to ask again...
EDIT: Okay, a subroutine. The non-obvious part of this is that call puts the return address (the address of the instruction right after the call) on the stack, and ret gets the address to return to off the stack, so we don't want to alter where ss:sp points in between. We can change it, but we need to put it back where it was before we hit the ret.
; purpose: to demonstrate a subroutine
; assemble with: nasm -f bin -o myfile.com myfile.asm
; (for DOS)
; Nasm defaults to 16-bit code in "-f bin" mode
; but it won't hurt to make it clear
bits 16
; this does not "cause" our code to be loaded
; at 100h (256 decimal), but informs Nasm that
; this is where DOS will load a .com file
; (needed to calculate the address of "msg", etc.)
org 100h
; we can put our data after the code
; or we can jump over it
; we do not want to execute it!
; this will cause Nasm to move it after the code
section .data
msg db "Hello, World!", 13, 10, "$"
msg2 db "Goodbye cruel world!", 13, 10, "$"
section .text
; execution starts here
mov dx, msg ; address/offset of msg
call myprint
; "ret" comes back here
; no point in a subroutine if we're only going to do it once
mov dx, msg2
call myprint
; when we get back, do something intelligent
exit:
mov ah, 4Ch ; DOS's exit subfunction
int 21h
; ---------------------------------------
; subroutines go here, after the exit
; we don't want to "fall through" into 'em!
myprint:
; expects: address of a $-terminated string in dx
; returns: nothing
push ax ; don't really need to do this
mov ah, 9 ; DOS's print subfunction
int 21h ; do it
pop ax ; restore caller's ax - and our stack!
ret
; end of subroutine
That's untested, but subject to typos and stupid logic errors, it "should" work. We can make it more complicated - pass the parameter on the stack instead of just in dx. We can provide an example for Linux (same general idea). I suggest taking it in small steps...
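As a language-neutral illustration of the "zero-terminated string" variant mentioned earlier, here is a small Python sketch of the same idea on Linux: find the terminator, then hand the pointer and the computed length to the write system call. os.write is a thin wrapper over write(2); write_cstring is a name made up for this example, and it assumes the buffer really does contain a zero byte.

```python
import os


def write_cstring(fd, buf):
    """Write the bytes of buf up to (not including) the first NUL.

    Mirrors what an assembly subroutine would do: scan for the
    terminator to get the length, then issue one write call.
    """
    length = buf.index(0)            # like a strlen loop
    return os.write(fd, buf[:length])


MSG = b"Hello World!\n\x00"

if __name__ == "__main__":
    write_cstring(1, MSG)            # fd 1 is standard output
```

In assembly, the scan would be a short loop (or repne scasb) that leaves the length in edx before the int 0x80.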

Resources