"Hello World" function without using C printf - linux

UPDATED
It's my second day working with NASM. After thoroughly understanding this:
section .programFlow
global _start
_start:
mov edx,len
mov ecx,msg
mov ebx,0x1 ;select STDOUT stream
mov eax,0x4 ;select SYS_WRITE call
int 0x80 ;invoke SYS_WRITE
mov ebx,0x0 ;select EXIT_CODE_0
mov eax,0x1 ;select SYS_EXIT call
int 0x80 ;invoke SYS_EXIT
section .programData
msg: db "Hello World!",0xa
len: equ $ - msg
I wanted to wrap this stuff inside an assembly function. All (or most of) the examples on the web use extern and call C's printf function (see code below), and I don't want that. I want to learn to create a "Hello World" function in assembly without using C's printf (or any other external function calls).
global _main
extern _printf
section .text
_main:
push message
call _printf
add esp, 4
ret
section .data
message: db "Hello, World", 10, 0
Update
I am practicing assembly for Linux, but since I do not own a Linux box, I am running my assembly code here compile_assembly_online.

Assuming you mean in a Windows command prompt environment, writing to standard out:
Since those provide a virtualized version of the old DOS environment, I believe you can use the old DOS interrupts for it:
int 21h, function 9 can output a string: set AH to 9, DS:DX to the address of a string terminated with a $, and trigger the interrupt.
int 21h, function 2 can output a single character, so you could use that repeatedly if you need to output $ (or you don't want Ctrl+C and such checking). Set AH to 2, DL to the ASCII (I expect) character code, and trigger the interrupt.
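As a rough sketch of the function 2 approach, the following .com program prints a zero-terminated string one character at a time, so the text can contain a $. The label names and the message are made up for illustration, and it assumes a DOS environment such as DOSBox.

```nasm
; assumed build: nasm -f bin -o putc.com putc.asm
bits 16
org 100h

start:
    cld                 ; make sure lodsb walks forward
    mov si, msg         ; si = address of the string
.next:
    lodsb               ; al = byte at ds:si, then si = si + 1
    test al, al         ; zero byte marks the end of the string
    jz .done
    mov dl, al          ; DL = character to print
    mov ah, 2           ; DOS function 2: write character to stdout
    int 21h             ; AL may be changed by the call, so AH is reloaded each time
    jmp .next
.done:
    mov ax, 4C00h       ; DOS function 4Ch: exit, return code 0
    int 21h

msg db "Cost: $5", 13, 10, 0
```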

int 0x80 won't work in Windows or DOS simply because it's a Linux thing. So that's the first thing that has to change.
In terms of doing it under Windows, at some point you're going to need to call a Windows API function, such as (in this case) WriteConsole(). That's bypassing the C library as desired.
It does use the OS to do the heavy lifting in getting output to the "screen" but that's the same as int 0x80 and is probably required whether it's Linux, Windows or DOS.
If it is genuine DOS, your best place to start is the excellent Ralf Brown's Interrupt List, specifically Int21/Fn9.

I want to point out that Nasm "knows" certain section names - ".text", ".data", and ".bss" (and a couple of others that you don't need yet). The leading '.' is required, and the names are case sensitive. Using other names, as you've done in your first example, may "work" but may not give you the "attributes" you want. For example, section .programData is going to be read-only. Since you don't try to write to it this isn't going to do any harm... but section .data is supposed to be writable.
Trying to learn asm for Linux without being able to try it out must be difficult. Maybe that online site is enough for you. There's a thing called "andlinux" (I think) that will let you run Linux programs in Windows. Or you could run Linux in a "virtual machine". Or you could carve out a partition on one of your many spare drives and actually install Linux.
For DOS, there's DosBox... or you could install "real DOS" on one of those extra partitions. From there, you can write "direct to screen" at B800h:xxxx. (one byte for "character" and the next for "color"). If you want to do this "without help from the OS", that may be what you want. In a protected mode OS, forget it. They're protected from US!
Maybe you just want to know how to write a subroutine, in general. We could write a subroutine with "msg" and "len" hard coded into it - not very flexible. Or we could write a subroutine that takes two parameters - either in registers or on the stack. Or we could write a subroutine that expects a zero-terminated string (printf does, sys_write does not) and figure out the length to put in edx. If that's what you need help with, we've gotten distracted talking about int 80h vs int 21h vs WriteFile. You may need to ask again...
EDIT: Okay, a subroutine. The non-obvious part of this is that call puts the return address (the address of the instruction right after the call) on the stack, and ret gets the address to return to off the stack, so we don't want to alter where ss:sp points in between. We can change it, but we need to put it back where it was before we hit the ret.
; purpose: to demonstrate a subroutine
; assemble with: nasm -f bin -o myfile.com myfile.asm
; (for DOS)
; Nasm defaults to 16-bit code in "-f bin" mode
; but it won't hurt to make it clear
bits 16
; this does not "cause" our code to be loaded
; at 100h (256 decimal), but informs Nasm that
; this is where DOS will load a .com file
; (needed to calculate the address of "msg", etc.)
org 100h
; we can put our data after the code
; or we can jump over it
; we do not want to execute it!
; this will cause Nasm to move it after the code
section .data
msg db "Hello, World!", 13, 10, "$"
msg2 db "Goodbye cruel world!", 13, 10, "$"
section .text
; execution starts here
mov dx, msg ; address/offset of msg
call myprint
; "ret" comes back here
; no point in a subroutine if we're only going to do it once
mov dx, msg2
call myprint
; when we get back, do something intelligent
exit:
mov ah, 4Ch ; DOS's exit subfunction
int 21h
; ---------------------------------------
; subroutines go here, after the exit
; we don't want to "fall through" into 'em!
myprint:
; expects: address of a $-terminated string in dx
; returns: nothing
push ax ; don't really need to do this
mov ah, 9 ; DOS's print subfunction
int 21h ; do it
pop ax ; restore caller's ax - and our stack!
ret
; end of subroutine
That's untested, but barring typos and stupid logic errors, it "should" work. We can make it more complicated - pass the parameter on the stack instead of just in dx. We can provide an example for Linux (same general idea). I suggest taking it in small steps...
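For Linux, the same idea might look like the sketch below: a subroutine that takes the address of a zero-terminated string in ecx, counts its length, and calls sys_write through int 0x80, with no C library involved. The name print_z and the choice of registers are my own assumptions, not anything standard.

```nasm
; assumed build: nasm -f elf32 hello.asm && ld -m elf_i386 -o hello hello.o
global _start

section .data
msg:  db "Hello World!", 0xa, 0          ; zero-terminated
msg2: db "Goodbye cruel world!", 0xa, 0

section .text
_start:
    mov ecx, msg
    call print_z
    mov ecx, msg2       ; no point in a subroutine if we only use it once
    call print_z
    mov ebx, 0          ; exit code 0
    mov eax, 1          ; sys_exit
    int 0x80

; print_z (hypothetical name)
; expects:   ecx = address of a zero-terminated string
; returns:   eax = return value of sys_write
; preserves: ebx, ecx, edx
print_z:
    push ebx
    push edx
    xor edx, edx                ; edx counts the bytes before the zero
.count:
    cmp byte [ecx + edx], 0
    je .write
    inc edx
    jmp .count
.write:
    mov ebx, 1                  ; STDOUT
    mov eax, 4                  ; sys_write(ebx = fd, ecx = buffer, edx = length)
    int 0x80
    pop edx                     ; pops in reverse order of the pushes,
    pop ebx                     ; so the return address is on top again for ret
    ret
```

Passing the address in a register keeps the stack handling to a minimum; passing it on the stack would work the same way, with the subroutine reading it from [esp+4] before pushing anything.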


Linux sys_open in 64-bit NASM returns negative value

I am opening an existing file to write into it, using sys_open and sys_write. The sys_write works correctly when I create a new file as shown below. But if I use sys_open, the return value is negative (-13, which is "Permission denied") and the write doesn't work (of course).
This works:
section .data
File_Name: db '/opt/Test_Output_Files/Linux_File_Test',0
File_Mode: dq 754q
Write_Buffer: db 'This is what I want to write',0
section .text
; Create file
mov rax,85 ; sys_creat
mov rdi,File_Name
mov rsi,File_Mode ; mode (permissions)
syscall
mov rdi,rax ; return code from sys_creat
mov rax,1 ; sys_write
mov rsi,Write_Buffer
mov rdx,29
syscall
But when I open the existing file, the sys_open command fails:
mov rax,2 ; sys_open
mov rdi,File_Name
mov rsi,2 ;read-write
mov rdx,[File_Mode]
syscall
Because this is a permissions error, the issue is most likely the flags value in rsi because the mode value in rdx is the same as I use with sys_creat (754). According to the Linux man pages at http://man7.org/linux/man-pages/man2/open.2.html and https://linux.die.net/man/3/open, there are three required options:
O_RDONLY - Open for reading only.
O_WRONLY - Open for writing only.
O_RDWR - Open for reading and writing. The result is undefined if this flag is applied to a FIFO.
I know that read-only is zero, so I assumed write only and read-write are 1 and 2, but I haven't found any listing of the numeric values that we would use in assembly language, unlike the mode which is based on chmod -- and it's the same mode value I used for create, which works.
I've researched this extensively, but there is sparse information on 64-bit syscalls -- most of it is 32-bit. For NASM I need to use a numeric value for the flags in rsi. The man pages say "In addition, zero or more file creation flags and file status flags can be bitwise-or'd in flags. The file creation flags are O_CLOEXEC, O_CREAT, O_DIRECTORY, O_EXCL, O_NOCTTY, O_NOFOLLOW, O_TMPFILE, and O_TRUNC." I could bitwise OR them if I knew what their values are.
Thanks for any help with this.
My guess is that you do not have read-write permission on the file that you are trying to open.
You should try O_RDONLY.
Anyway, to answer your question.
As far as flag values are concerned, those will be:
O_CREAT = 0x40
O_TRUNC = 0x200
O_APPEND = 0x400
You can find the entire list in:
/usr/include/asm-generic/fcntl.h
Note: if O_CREAT is not set then 'mode' (the value that you set in rdx) will be ignored.
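In NASM those values can be given names with equ and combined with |. The sketch below is one way the open-and-write could be written on x86-64 Linux; the particular flag combination (O_WRONLY | O_CREAT | O_APPEND) and the error handling are my assumptions about what is wanted. One thing worth checking in the question's sys_creat snippet: mov rsi,File_Mode loads the address of File_Mode, not the 754q stored there, so the file may have been created with permissions other than 754 (ls -l would show what it actually got).

```nasm
; assumed build: nasm -f elf64 open.asm && ld -o open open.o
; flag values as found in asm-generic/fcntl.h
O_RDONLY equ 0
O_WRONLY equ 1
O_RDWR   equ 2
O_CREAT  equ 0x40
O_TRUNC  equ 0x200
O_APPEND equ 0x400

global _start

section .data
File_Name:    db '/opt/Test_Output_Files/Linux_File_Test', 0
Write_Buffer: db 'This is what I want to write', 0
Write_Len     equ $ - Write_Buffer      ; 29 bytes, including the trailing zero

section .text
_start:
    mov rax, 2                              ; sys_open
    mov rdi, File_Name                      ; address of the path
    mov rsi, O_WRONLY | O_CREAT | O_APPEND  ; flags, OR'd together
    mov rdx, 754q                           ; mode as a value; only used if the file is created
    syscall
    test rax, rax
    js .fail                                ; negative return value = -errno

    mov rdi, rax                            ; file descriptor
    mov rax, 1                              ; sys_write
    mov rsi, Write_Buffer
    mov rdx, Write_Len
    syscall

    mov rax, 3                              ; sys_close, rdi still holds the descriptor
    syscall
    xor edi, edi                            ; exit code 0
    jmp .exit
.fail:
    mov edi, 1                              ; exit code 1
.exit:
    mov eax, 60                             ; sys_exit
    syscall
```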

Addressing an array of pointers in asm

I have a routine which I can call like this:
mov rdi, struc_point
mov rsi, struc_color
call put_pixel
Now, I would like to create something like an array of pointers to have a color table. What I have now is this, and it's not working:
color_table:
dq 0 ; null color
dq struc_color1
dq struc_color2
dq struc_color3
; etc..., the colors are defined somewhere else
Now, I would like to do something like that with it in the end:
mov rbx, 2 ; index into color table
mov rdi, struc_point
mov rsi, qword [color_table+8*rbx]
call put_pixel
What is going wrong? There are no compiler errors, but when I run this, all animations stop. rsi should contain the address of struc_color, see first snippet. The program works if I hardcode this color (mov rsi, struc_color).
This is in x86_64 asm, booted directly without any OS.
I found the problem. The code I wrote above was in fact correct and worked well.
The problem was located inside of put_pixel, which did not save rax. I was using rax just a couple of lines earlier to hold the data to be displayed. This led to put_pixel throwing the program off course on the first run.
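To illustrate both points (indexing a table of pointers, and a routine that saves what it clobbers), here is a small sketch written as a Linux program so that it can be run. It is not the original bare-metal code: put_name is a hypothetical stand-in for put_pixel, and the table points at fixed-width strings instead of color structures.

```nasm
; assumed build: nasm -f elf64 table.asm && ld -o table table.o
global _start

ENTRY_LEN equ 6                 ; every entry is 6 bytes long

section .data
color1: db "red  ", 10
color2: db "green", 10
color3: db "blue ", 10

color_table:
    dq 0                        ; null entry, never printed
    dq color1
    dq color2
    dq color3

section .text
_start:
    mov rax, 3                          ; caller keeps its loop index in rax
.next:
    mov rsi, [color_table + 8*rax]      ; load the pointer stored in entry rax
    call put_name
    dec rax                             ; only works because put_name preserved rax
    jnz .next
    mov eax, 60                         ; sys_exit
    xor edi, edi
    syscall

; put_name (hypothetical)
; expects:   rsi = address of ENTRY_LEN bytes to print
; preserves: every register it touches, including rax
put_name:
    push rax
    push rcx                    ; the syscall instruction overwrites rcx and r11
    push r11
    push rdi
    push rdx
    mov eax, 1                  ; sys_write
    mov edi, 1                  ; STDOUT
    mov edx, ENTRY_LEN
    syscall
    pop rdx
    pop rdi
    pop r11
    pop rcx
    pop rax
    ret
```

If the push rax / pop rax pair is removed, rax comes back holding the return value of sys_write and the loop index is lost, which is the same kind of failure described above.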

Assembly parenthesis explanation

Hello, I'm looking at an executable and don't have access to the source code. I haven't really come across this before, and what I have found online doesn't match the data that I am getting. Code:
0x08048d4c <+45>: movsbl (%ebx,%eax,1),%esi
0x08048d50 <+49>: and $0xf,%esi
0x08048d53 <+52>: add (%ecx,%esi,4),%edx
My confusion is in the +52 line. "x/d $ecx" yields 2, and the value in %esi before the line is executed is 7. After that line is executed, %edx is equal to 3 (it was zero beforehand).
I thought that it would be 2 + (7*4), but that is not the case. Can someone please enlighten me? This is AT&T syntax, I believe.
Yes, it's AT&T syntax, and if you are confused by it, then switch gdb to Intel syntax (set disassembly-flavor intel). You would see something like: add edx, [ecx + esi*4]
Anyway, this fetches an operand from memory, from address ecx + esi*4. You can see what that is using x/d $ecx+$esi*4. x/d $ecx doesn't tell you anything useful here, because esi*4 is added to the address held in ecx, not to the value stored at that address.
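A small sketch in NASM (Intel) syntax that reproduces the numbers from the question may make this concrete. The table contents are invented, apart from the first element being 2 and element 7 being 3.

```nasm
; assumed build: nasm -f elf32 ea.asm && ld -m elf_i386 -o ea ea.o
; run with: ./ea ; echo $?
global _start

section .data
table: dd 2, 10, 6, 1, 12, 16, 9, 3     ; table[0] = 2, table[7] = 3

section .text
_start:
    mov ecx, table              ; ecx holds an address; x/d $ecx would show 2
    mov esi, 7                  ; index
    xor edx, edx                ; edx = 0
    add edx, [ecx + esi*4]      ; adds the dword at address ecx + 28, i.e. table[7]
    mov ebx, edx                ; use the result as the exit status
    mov eax, 1                  ; sys_exit
    int 0x80
```

The exit status should be 3, not 2 + 7*4 = 30: the scaled index selects which dword to read, and only the dword that is read gets added to edx.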

80x86 Assembly - Very basic I/O program conversion to Linux from Windows

So it's my first day of Assembly class, and what do you know? My professor teaches everything on her Windows box, using Windows API calls, etc., which is fine except that I'm running Ubuntu on my box.
Basically, I'm hoping I can find either a workaround or some form of common ground in order for me to get my assignments done.
Today, our first programming assignment was to input two integers and output the sum. I followed my professor's code as follows:
.386
.model flat
ExitProcess PROTO NEAR32 stdcall, dwExiteCode:DWORD
include io.h
cr EQU 0dh
lf EQU 0ah
.stack 4096
.data
szPrompt1 BYTE "Enter first number: ", 0
szPrompt2 BYTE "Enter second number: ", 0
szLabel1 BYTE cr, lf, "The sum is ", 0
dwNumber1 DWORD ? ; numbers to be added
dwNumber2 DWORD ?
szString BYTE 40 DUP (?) ; input string for numbers
szSum BYTE 12 DUP (0) ; sum in string form
szNewline BYTE cr,lf,0
.code ; start of main program code
_start:
output szPrompt1 ; prompt for first number
input szString,40 ; read ASCII characters
atod szString ; convert to integer
mov dwNumber1,eax ; store in memory
output szPrompt2 ; repeat for second number
input szString,40
atod szString
mov dwNumber2,eax
mov eax,dwNumber1 ; first number to EAX
add eax,dwNumber2 ; add second number
dtoa szSum,eax ; convert to ASCII characters
output szLabel1 ; output label and results
output szSum
output szNewline
INVOKE ExitProcess,0 ; exit with return code 0
PUBLIC _start ; make entry point public
END ; end of source code
Simple and straightforward enough, yeah? So I turned it in today, all linked up on the crappy school computers. I completely understand all the concepts involved; however, I see 2 main issues if I actually want to assemble it on my box:
1) .model flat
2) ExitProcess PROTO NEAR32 stdcall, dwExiteCode:DWORD
Both of which I've heard are very Windows-specific. So my question is: how can I change this code so that it assembles on Linux?
Sorry If I'm missing any details, but I'll let you know if you need.
Thanks!
Assembly code is, generally speaking, almost always platform specific. Indeed, the very syntax varies between assemblers, even within the same hardware and OS platform!
You'll also probably have problems with that io.h there - I would bet it's making a lot of calls into win32 APIs.
I would recommend simply using wine, along with a copy of whatever assembler your professor is using, to run your professor's examples. If it can run things like Microsoft Office and Steam, it can certainly run some trivial example code :)

x86 Assembly: Before Making a System Call on Linux Should You Save All Registers?

I have the below code that opens up a file, reads it into a buffer and then closes the file.
The close file system call requires that the file descriptor number be in the ebx register. The ebx register gets the file descriptor number before the read system call is made. My question is should I save the ebx register on the stack or somewhere before I make the read system call, (could int 80h trash the ebx register?). And then restore the ebx register for the close system call? Or is the code I have below fine and safe?
I have run the below code and it works, I'm just not sure if it is generally considered good assembly practice or not because I don't save the ebx register before the int 80h read call.
;; open up the input file
mov eax,5 ; open file system call number
mov ebx,[esp+8] ; null terminated string file name, first command line parameter
mov ecx,0o ; access type: O_RDONLY
int 80h ; file handle or negative error number put in eax
test eax,eax
js Error ; test sign flag (SF) for negative number which signals error
;; read in the full input file
mov ebx,eax ; assign input file descriptor
mov eax,3 ; read system call number
mov ecx,InputBuff ; buffer to read into
mov edx,INPUT_BUFF_LEN ; total bytes to read
int 80h
test eax,eax
js Error ; if eax is negative then error
jz Error ; if no bytes were read then error
add eax,InputBuff ; add size of input to the beginning of InputBuff location
mov [InputEnd],eax ; assign address of end of input
;; close the input file
;; file descriptor is already in ebx
mov eax,6 ; close file system call number
int 80h
The int 80h call itself will not corrupt anything, apart from putting the return value in eax. So the code fragment you have is fine. (But if your code fragment is part of a larger routine which is expected to be called by other code following the usual Linux x86 ABI, you will need to preserve ebx, and possibly other registers, on entry to your routine, and restore on exit.)
The relevant code in the kernel can be found in arch/x86/kernel/entry_32.S. It's a bit hard to follow, due to extensive use of macros, and various details (support for syscall tracing, DWARF debugging annotations, etc.) but: the int 80h handler is system_call (line 493 in the version I've linked to); the registers are saved via the SAVE_ALL macro (line 497); and they're restored again via RESTORE_REGS (line 534) just before returning.
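The parenthetical about the ABI can be sketched as follows. close_fd is a hypothetical routine written so that other code can call it with the usual 32-bit calling convention (argument on the stack, ebx preserved for the caller); the _start part is only there so that the sketch can be run on its own.

```nasm
; assumed build: nasm -f elf32 close.asm && ld -m elf_i386 -o close close.o
global _start

section .text
_start:
    push dword 0        ; argument: fd 0 (stdin), just to have something to close
    call close_fd
    add esp, 4          ; caller removes the argument
    mov ebx, 0          ; exit code 0
    mov eax, 1          ; sys_exit
    int 0x80

; close_fd (hypothetical)
; expects:   file descriptor as the first stack argument
; returns:   eax = return value of sys_close
; preserves: ebx, which the caller is entitled to find unchanged
close_fd:
    push ebx            ; save the caller's ebx on entry
    mov ebx, [esp+8]    ; argument: +4 for the return address, +4 for the saved ebx
    mov eax, 6          ; sys_close
    int 0x80
    pop ebx             ; restore on exit
    ret
```

The save and restore is there because the routine itself overwrites ebx, not because int 80h would.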
Yes, you should save and restore as in http://www.linuxjournal.com/files/linuxjournal.com/linuxjournal/articles/040/4048/4048l1.html
