I was skimming through an implementation of scanf here, and I couldn't find the exact method by which the program is picking up input from the keyboard. I know there's always another layer deeper to go, but could someone explain perhaps a step below C code, i.e. scanf, how keyboard input gets made available to my program?
Despite the fact the file is called scanf.c, it contains only the sscanf and vsscanf functions (plus another internal function that's unimportant to the C library API).
Hence it's only used for scanning strings rather than reading from files.
In terms of how an actual scanf would work, it would usually just use the lower level functions in the C library, such as getchar() and ungetc().
In terms of how those functions work, that's up to the implementation. It may call a lower-level function yet again, it may have a memory-mapped keyboard so it can just read memory to get a keystroke, or it may receive interrupts and store keys in an ISR for later extraction.
The possibilities are very wide ranging.
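To make the layering concrete, here is a minimal C sketch of how a scanf-style number conversion could sit on top of getc() and ungetc(). The name read_uint is hypothetical, and a real library also handles signs, overflow and field widths, which this sketch does not.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: parse an unsigned decimal number from a stream,
 * roughly what a "%u" conversion has to do. Returns 1 if a number was
 * read, 0 otherwise. Overflow is deliberately not handled here. */
int read_uint(FILE *in, unsigned *out)
{
    int c;
    unsigned value = 0;
    int digits = 0;

    /* Skip leading whitespace, as scanf does for numeric conversions. */
    while ((c = getc(in)) != EOF && isspace(c))
        ;

    while (c != EOF && isdigit(c)) {
        value = value * 10 + (unsigned)(c - '0');
        digits++;
        c = getc(in);
    }

    /* The first character that is not part of the number goes back
     * into the stream so the next reader still sees it. */
    if (c != EOF)
        ungetc(c, in);

    if (digits == 0)
        return 0;
    *out = value;
    return 1;
}
```

Everything below getc() (buffering, the system call, the keyboard driver) stays hidden from this code, which is why the scanf source alone doesn't show where the keystrokes come from.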
For some concrete examples, I've developed a micro-processor emulator for the educational market that uses I/O ports for the keyboard (and other devices). So the lowest level code there is along the following lines:
:keyin equ 07d2 ; memory mapped keyboard port
push rf ; preserve register
setw rf :keyin ; use register f for input
:loop inb r0 rf ; get keyboard value to register 0
jz :loop ; 0 means none available, so try again
pop rf ; restore register f, register 0 now holds key
In contrast, (the much more successful) Linux can use one of the system calls for reading a character from the input file descriptor:
buffer: db ?
getch: mov eax, 3 ; sys_read
mov ebx, 0 ; descriptor 0 (input)
mov ecx, buffer ; address to store data
mov edx, 1 ; get one character
int 80h ; make system call (old style).
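For comparison, a C program normally reaches the same sys_read system call through the read(2) wrapper in libc. The sketch below assumes a POSIX system; read_one_byte is a hypothetical name.

```c
#include <unistd.h>

/* Read exactly one byte from a file descriptor with the read(2) wrapper,
 * which ends up in the kernel's sys_read just like the assembly above.
 * Returns 1 on success, 0 on end of file, -1 on error. */
int read_one_byte(int fd, unsigned char *out)
{
    ssize_t n = read(fd, out, 1);
    if (n < 0)
        return -1;
    return (int)n;
}
```

Calling read_one_byte(0, &c) reads from standard input. On a terminal in its default mode the call only returns once a whole line has been typed.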
Is it possible to start reading a file from a specific line or byte. Currently I use this code to read 4 bytes of a file:
section .data
filename db "file.txt", 0
section .bss
read_data resb 4
section .text
global _start
_start:
mov rax, SYS_OPEN
mov rdi, filename
mov rsi, O_RDONLY
mov rdx, 0
syscall
push rax
mov rdi, rax
mov rax, SYS_READ
mov rsi, read_data
mov rdx, 4
syscall
mov rax, SYS_CLOSE
pop rdi
syscall
This code always reads the first 4 bytes, but I want to start reading from other parts of the file, like the middle for example. What do I need to add or change?
A freshly-opened file descriptor starts at position = 0. If you keep reading from the same fd in a loop, you'll get successive chunks. (Use a larger buffer like 8kiB and loop over dwords in user-space, though, using the value that read returned as an upper limit! A system call is very expensive in CPU time.)
Is it possible to start reading a file from a specific line or byte.
Byte: yes
Line: no. In Unix/Linux, the kernel doesn't have an index of line-start byte offsets or any other line-oriented API. The line handling in stdio fgets, for example, is done purely in user-space. There have been some historical OSes with record-based files, but Unix files are flat arrays of bytes. (They can have holes, unwritten extents, and extended attributes... But the kernel APIs for the main file contents only operate with byte offsets.)
If you want to do lines, read a big block and loop forward until you've seen some number of newlines. If you're not there yet, read another block; repeat until you find the start and end of the line number you want, or you hit EOF. x86-64 can efficiently search 16 bytes at a time with pcmpeqb / pmovmskb / popcnt (popcnt requires SSE4.2 or the specific popcnt feature bit).
Or with just SSE2, or when optimizing for large blocks, with pcmpeqb / psadbw (against all-zero) to hsum bytes to qwords / paddd. Then check how many lines you went every so often with some scalar code. Or keep it simple and branch on finding the first newline in a SIMD vector.
Obviously the slow and simple option is a byte-at-a-time loop that counts '\n' characters - if you know how to do strchr with SSE2 it should be straightforward to vectorize that search using the above suggestions.
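The slow and simple byte-at-a-time version might look like this in C: a sketch that reads 8 KiB blocks and counts newlines until the wanted line starts. The name line_start_offset is hypothetical, and the descriptor is assumed to be positioned at the start of the file.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the byte offset at which line `target`
 * (1-based) starts. Returns -1 if the file does not contain enough
 * newlines or a read fails. Reads sequentially from the descriptor's
 * current position, which is assumed to be 0. */
off_t line_start_offset(int fd, unsigned long target)
{
    char buf[8192];
    unsigned long line = 1;
    off_t pos = 0;          /* file offset of buf[0] */
    ssize_t n;

    if (target <= 1)
        return 0;

    while ((n = read(fd, buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < n; i++) {
            if (buf[i] == '\n' && ++line == target)
                return pos + i + 1;     /* byte after the newline */
        }
        pos += n;
    }
    return -1;
}
```

The inner loop is the part a SIMD search would replace. The block handling around it stays the same.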
But if you only want some specific byte positions, you have two main options:
seek with lseek(2) before read(2) (see @Nicolae Natea's answer)
Use POSIX/Linux pread(2) to read from a specified offset, without moving the fd's file offset for future read calls. The Linux system call name is pread64 (__NR_pread64 equ 17 from asm/unistd_64.h)
ssize_t pread(int fd, void *buf, size_t count, off_t offset); The only difference from read is the offset arg, the 4th arg thus passed in R10 (not RCX like the user-space function calling convention). off_t is a 64-bit type simply passed in a single register in 64-bit code.
Other than the pread64 name in the .h, there's nothing special about the asm interface compared to the C interface; it follows the standard system-calling convention. (It has existed since Linux 2.1.60; before that, glibc's wrapper emulated it with lseek.)
There are other things you can do like mmap, or a preadv system call, but pread is most exactly what you're looking for if you have a known position you want to read from.
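As a rough illustration of the C side of that interface, the sketch below wraps pread(2) in a loop that tolerates short reads. The name read_at is hypothetical; only pread itself comes from POSIX.

```c
#include <sys/types.h>
#include <unistd.h>

/* Read up to `count` bytes starting at byte `offset`, without moving the
 * descriptor's own file position. Returns the number of bytes read
 * (less than `count` only at end of file), or -1 on error. */
ssize_t read_at(int fd, void *buf, size_t count, off_t offset)
{
    size_t done = 0;

    while (done < count) {
        ssize_t n = pread(fd, (char *)buf + done, count - done,
                          offset + (off_t)done);
        if (n < 0)
            return -1;
        if (n == 0)
            break;                  /* end of file */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

For example, read_at(fd, buf, 4, 100) fetches 4 bytes from offset 100, and a later plain read on the same descriptor still continues from wherever it was before.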
Before performing the read you should perform a lseek, so that the file position is updated.
so something along the lines:
mov rdi, rax ; fd
mov rax, SYS_LSEEK
mov rsi, <whatever offset you want>
mov rdx, 0 ; keep 0 (SEEK_SET) if the offset should be from the beginning of the file
syscall
note: RDI will still hold the same fd value after a syscall so you don't need extra save/restore for the fd across lseek / read / close.
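The same seek-then-read sequence, written against the libc wrappers, could be sketched in C as follows. seek_and_read is a hypothetical name; lseek and read are the POSIX calls behind SYS_LSEEK and SYS_READ.

```c
#include <sys/types.h>
#include <unistd.h>

/* Move the descriptor's file position to `offset` bytes from the start
 * of the file (SEEK_SET), then read up to `count` bytes from there.
 * Returns what read(2) returns, or -1 if the seek fails. */
ssize_t seek_and_read(int fd, off_t offset, void *buf, size_t count)
{
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1)
        return -1;
    return read(fd, buf, count);
}
```

Unlike pread, this changes the file position, so the next read continues right after the bytes just returned.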
Tip:
It might be easier to write the code in c and compile it with gcc -g -S -fverbose-asm -Og -c main.c and then look at main.s. (How to remove "noise" from GCC/clang assembly output?). But that will only show the compiler making calls to libc wrapper functions, unless you use inline system call macros like MUSL libc provides.
I am using NASM assembler on linux 64 bit.
There is something with variables and registers I can't understand.
I create a variable named "msg":
msg db "hello, world"
Now when I want to write to the stdout I move the msg to rsi register, however I don't understand the mov instruction bitwise ... the rsi register consists of 64 bit , while the msg variable has 12 symbols which is 8 bits each , which means the msg variable has a size of 12 * 8 bits , which is greater than 64 bits obviously.
So how is this even possible to make an instruction like:
mov rsi, msg , without overflowing the memory allocated for rsi.
Or does the rsi register contain the memory location of the first symbol of the string and after writing 1 symbol it changes to the memory location of the next symbol?
Sorry if I wrote complete nonsense, I'm new to assembly and I just haven't been able to get a grasp of it for a while.
In NASM syntax (unlike MASM syntax) mov rsi, symbol puts the address of the symbol into RSI. (Inefficiently with a 64-bit absolute immediate; use a RIP-relative LEA or mov esi, symbol instead. How to load address of function or label into register in GNU Assembler)
mov rsi, [symbol] would load 8 bytes starting at symbol. It's up to you to choose a useful place to load 8 bytes from when you write an instruction like that.
mov rsi, msg ; rsi = address of msg. Use lea rsi, [rel msg] instead
movzx eax, byte [rsi+1] ; rax = 'e' (upper 7 bytes zeroed)
mov edx, [msg+6] ; rdx = ' wor' (upper 4 bytes zeroed)
Note that you can use mov esi, msg because symbol addresses always fit in 32 bits (in the default "small" code model, where all static code/data goes in the low 2GB of virtual address space). NASM makes this optimization for you with assemble-time constants (like mov rax, 1), but probably it can't with link-time constants. Why do x86-64 instructions on 32-bit registers zero the upper part of the full 64-bit register?
and after writing 1 symbol it changes to the memory location of the next symbol?
No, if you want that you have to inc rsi. There is no magic. Pointers are just integers that you manipulate like any other integers, and strings are just bytes in memory.
Accessing registers doesn't magically modify them.
There are instructions like lodsb and pop that load from memory and increment a pointer (rsi or rsp respectively), but x86 doesn't have any pre/post-increment/decrement addressing modes, so you can't get that behaviour with mov even if you want it. Use add/sub or inc/dec.
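If it helps to see the same ideas in C terms, the sketch below mirrors the three cases: holding an address, loading one byte through it, and loading four bytes from it. The function names are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

const char msg[] = "hello, world";

/* Like movzx eax, byte [rsi+1]: load one byte through a pointer
 * and zero-extend it. */
unsigned byte_at(const char *p, size_t index)
{
    return (unsigned char)p[index];
}

/* Like mov edx, [msg+6]: load 4 bytes starting at an address.
 * memcpy sidesteps alignment and aliasing rules in C. */
uint32_t dword_at(const char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Walk the string by incrementing the pointer ourselves; nothing
 * advances it automatically. */
size_t count_bytes(const char *p)
{
    size_t n = 0;
    while (*p != '\0') {
        p++;            /* the explicit "inc rsi" step */
        n++;
    }
    return n;
}
```

Here the pointer p plays the role of RSI: it is an 8-byte value no matter how long the string it points at is.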
Disclaimer: I'm not familiar with the flavor of assembly that you're dealing with, so the following is more general. The particular flavor may have more features than what I'm used to. In general, assembly deals with single byte/word entities where the size depends on the processor. I've done quite a bit of work on 8 and 16-bit processors, so that is where my answer is coming from.
General statements about Assembly:
Assembly is just like a high level language, except you have to handle a lot more of the details. So if you're used to some operation in say C, you can start there and then break the operation down even further.
For instance, if you have declared two variables that you want to add, that's pretty easy in C:
x = a + b;
In assembly, you have to break that down further:
mov R1, a * get value from a into register R1
mov R2, b * get value from b into register R2
add R1,R2 * perform the addition (typically goes into a particular location I'll call the accumulator)
mov x, acc * store the result of the addition from the accumulator into x
Depending on the flavor of assembly and the processor, you may be able to directly refer to variables in the addition instruction, but like I said I would have to look at the specific flavor you're working with.
Comments on your specific question:
If you have a string of characters, then you would have to move each one individually using a loop of some sort. I would set up a register to contain the starting address of your string, and then increment that register after each character is moved. It acts like a pointer in C. You will need to have some sort of indication for the termination of the string or another value that tells the size of the string, so you know when to stop.
How do I test to ensure only an integer is entered and ensure length of input is 5 bytes or less in the following code?
I am trying to understand how to properly control input so that the input beyond 5 bytes is not outputted to the terminal upon exiting of the program.
In addition, how would I test to ensure only a string is entered and finally in the last scenario, only a double is entered?
*** Updated code based on x82 and Peter C's guidance. I did some C disas and was able to amend my original code below. It still has some flaws but you are both a great deal of help! I am just stuck on when more than 5 integer bytes are entered it wont re-prompt as it does when I enter in a character data as it continues to dump extra bytes data to tty.
SECTION .data ; initialized data section
promptInput db 'Enter Number: ', 0
lenPromptInput equ $ - promptInput
displayInput db 'Data Entered: ', 0
lenDisplayInput equ $ - displayInput
SECTION .bss ; uninitialized data section
number resb 1024 ; allocate 1024 bytes for number variable
SECTION .text ; code section
global _start ; linker entry point
_start:
nop ; used for debugging
Read:
mov eax, 4 ; specify sys_write call
mov ebx, 1 ; specify stdout file descriptor
mov ecx, promptInput ; display promptInput
mov edx, lenPromptInput ; length of promptInput
int 0x80 ; call sys_write
mov eax, 3 ; specify sys_read call
mov ebx, 0 ; specify stdin file descriptor
mov ecx, number ; pass address of the buffer to read to
mov edx, 1024 ; specify sys_read to read 1024 bytes stdin
int 0x80 ; call sys_read
cmp eax, 0 ; examine sys_read return value in eax
je Exit ; je if end of file
cmp byte [number], 0x30 ; test input against numeric 0
jb Read ; jb if below 0 in ASCII chart
cmp byte [number], 0x39 ; test input against numeric 9
ja Read ; ja if above 9 in ASCII chart
Write:
mov eax, 4 ; specify sys_write call
mov ebx, 1 ; specify stdout file descriptor
mov ecx, displayInput ; display displayInput
mov edx, lenDisplayInput ; length of displayInput
int 0x80 ; call sys_write
mov eax, 4 ; specify sys_write call
mov ebx, 1 ; specify stdout file descriptor
mov ecx, number ; pass address of the number to write
mov edx, 5 ; pass number of numbers to write
int 0x80 ; call sys_write
Exit:
mov eax, 1 ; specify sys_exit call
mov ebx, 0 ; return code 0 to OS
int 0x80 ; call sys_exit
(Since you accepted this answer, I'll point out that the actual answer to this question about using read on TTYs is my other answer on this question.)
Here's an answer to your low-quality followup question which I was about to post when you deleted it.
Note that I said "you can ask for debugging help in a new question", not that you should ask 3 different questions in one, and re-post your whole code barely changed with no serious attempt at solving your own problem. It's still up to you to make the new question a good question.
I probably wouldn't have answered it if I hadn't sort of led to you posting it in the first place. Welcome to StackOverflow, I'm being generous since you're new and don't know what's a good question yet.
The usual term for the characters '0' through '9' is "digit", not "integer". It's much more specific.
ensure only integers are inputted in the buffer
You can't. You have to decide what you want to do if you detect such input.
Need help creating an array to loop through
Your buffer is an array of bytes.
You can loop over it with something like
; eax has the return value from the read system call, which you've checked is strictly greater than 0
mov esi, number ; start pointer
scan_buffer:
movzx edx, byte [esi]
; do something with the character in dl / edx
...
inc esi ; move to the next character
dec eax
jnz scan_buffer ; loop n times, where n = number of characters read by the system call.
ensure characters over the 1024 buffer do not send data to the tty
If you're worried that 1024 isn't necessarily big enough for this toy program, then use select(2) or poll(2) to check if there's more input to be read without blocking if there isn't.
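A zero-timeout poll(2) check could be prototyped in C like this before porting it to asm. input_pending is a hypothetical name. Note that on a terminal in canonical mode, typed input only counts as readable once a full line has been entered.

```c
#include <poll.h>

/* Return 1 if a read on `fd` would not block right now (data or end of
 * file is available), 0 if nothing is pending, -1 on error. */
int input_pending(int fd)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    int r = poll(&p, 1, 0);     /* timeout 0: return immediately */

    if (r < 0)
        return -1;
    return r > 0 && (p.revents & (POLLIN | POLLHUP)) != 0;
}
```

A program could keep calling read while input_pending(0) returns 1, which drains whatever is left on stdin without blocking.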
I'm just going to answer the POSIX systems programming part of the question, and leave it up to you to make the right system calls once you know what you want your program to do. Use gdb to debug it (see the bottom of the x86 tag wiki for debug tips), and strace to trace system calls.
You might want to write your program in C instead of trying to learn asm and the Unix system call API at the same time. Write something in C to test the idea, and then implement it in asm. (Then you can look at the compiler output when you get stuck, to see how the compiler did things. As long as you carefully read and understand how the compiler-generated code works, you're still learning. I'd suggest compiling with -O2 or -O3 as a starting point for an asm implementation. At least -O1, definitely not -O0.).
I am trying to understand how to properly control input so that the input beyond 5 bytes is not outputted to the terminal upon exiting of the program.
This is just a POSIX semantics issue, nothing to do with asm. It would be the same if you were doing systems programming in C, calling the read(2) system call. You're calling it in asm with mov eax,3 / int 0x80, instead of calling the glibc wrapper like C compiler output would, but it's the same system call.
If there is unread data on the terminal (tty) when your program exits, the shell will read it when it checks for input.
In an interactive shell running on a tty, programs you run (like ./a.out or /bin/cat) have their stdin connected to the same tty device that the shell takes interactive input from. So unread data on your program's stdin is the same thing as unread data that the shell will see.
Things are different if you redirected your program's input from a file. (./a.out < my_file.txt). Then your program won't start with an already-open file descriptor for the tty. It could still open("/dev/tty") (which is a "magic" symlink that always refers to the controlling tty) and vacuum up anything that was typed while it was running.
ensure only an integer is entered and ensure length of input is 5 bytes or less in the following code?
You can't control what your input will be. You can detect input you don't like, and print an error message or anything else you want to do.
If you want input characters to stop echoing to the screen after 5 bytes, you'd need to put the tty into raw mode (instead of the default line-buffered "cooked" mode) and either do the echo manually, or disable echo after 5 bytes. (The latter wouldn't be reliable, though. There'd be a race condition between disabling echo and the user typing a 6th byte, e.g. as part of a paste.)
RE: edit
I am just stuck on when more than 5 integer bytes are entered it wont re-prompt as it does when I enter in a character data as it continues to dump extra bytes data to tty.
You broke your program, because the logic is still designed around re-read()ing a character if you don't like the digit you read. But your read call reads up to 5 bytes.
The normal thing to do is one big read and then parse the whole line by looping over the bytes in the buffer. So use a big buffer (like 1024 bytes) in the .bss section, and make a read system call.
Don't make another read system call unless you want to prompt the user to enter another line of text.
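As a C prototype of the "one big read, then parse" approach, the check below looks at the whole line that read returned and decides whether it is 1 to 5 digits. The name is_short_number and the exact acceptance rule (a single optional trailing newline) are assumptions made for this sketch.

```c
#include <stddef.h>

/* Hypothetical check: does the buffer hold 1 to 5 ASCII digits,
 * optionally followed by a single trailing newline?
 * `n` is the byte count that read() returned. */
int is_short_number(const char *buf, size_t n)
{
    if (n > 0 && buf[n - 1] == '\n')
        n--;                        /* ignore the line terminator */
    if (n < 1 || n > 5)
        return 0;
    for (size_t i = 0; i < n; i++) {
        if (buf[i] < '0' || buf[i] > '9')
            return 0;               /* not a digit */
    }
    return 1;
}
```

If the check fails, the program can print an error and prompt again. Because the whole line has already been consumed from the tty, no extra bytes are left over for the shell.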
After reading about at least the first 3 or 4 chapters of about 4 different books on assembly programming I got to a stage where I can put "Hello World" on a dosbox console using MASM 6.11. Imagine my delight!!
The first version of my program used DOS Function 13h.
The second version of my program used BIOS Function 10h
I now want to do the third version using direct hardware output. I have read the parts of the books that explain the screen is divided into 80x25 on VGA monitors (I'm not bothered about detecting CGA and all that, so my program uses memory address 0B800h for colour VGA, because DOSBox is great and all, and because of my desire to move to Win Assembler sometime before I'm 90 years old). I have read that each character on the hardware screen is 2 bytes (one for the attribute and one for the character itself, therefore you have 80x25x2=4000 bytes). The odd bytes describe the attribute, and the even bytes the ASCII character.
But my problem is this. No matter how I try, I can't get my program to output a simple black and white (which is just the attribute, I assume I can change this reasonably easily) string (which is just an array of bytes) 5 lines from the top of the screen, and 20 characters in from the left edge (which is just the number of blank characters away from a zero-based index into an array 4000 bytes long). (If my calculation is correct, that is 5x80=400, 400+20=420, 420x2=840 as the starting position of my string within the array of 4000 bytes.)
How do I separate the attribute from the character (I got it to work partially, but it only shows every second character, then a bunch of random junk; that's how I figured I need some sort of byte pair for the attribute and text), or how do I set it up such that both are recognised together? How do I control the position of the text on the screen once the calculations are made? Where am I going wrong?
I have tried looking around the web for this seemingly simple question but am unable to find a solution. Is there anyone who used to program in DOS and x86 Assembly that can tell me how to do this easy little program by not using BIOS or DOS functions, just with the hardware.
I would really appreciate a simple code snippet if possible, or a reference to some site or free e-book. I don't want to buy a big book on DOS console programming which will end up useless when I move to Windows shortly. The only reason I am focused on this is because I want to learn true assembly, not some macro language or some pretentious high level language that claims to be assembly.
I am trying to build a library of routines that will make Assembly easier to learn, so people don't have to work through all the 3 to 6 chapters across 10 books of theory essentially explaining the same stuff again and again, when really all that is needed is enough to know how to get some output, assign values to variables, get some input, and do some loops and decisions. The theory can come along later, and by the time they get to loops and decisions most people will have done enough assembler to have all the theory anyway. I believe assembly should be taught no differently than any other language, starting with a simple hello world program, then getting input, etc. I want to make this possible. But hey, I'm just a beginner, maybe my thoughts will change when I learn more.
One other note: I know for a fact the problem is NOT with DOSBox, as I have a very old PC running true MS-DOS V6.2 and the program still doesn't work (but gives almost identical output). In fact, DOSBox actually runs some of my old programs even better than true DOS, GEM Desktop being one example. I just wanted to get that cleared up before people try suggesting it's a problem with the emulator. It can't be, not with such simple programs. No, I'm afraid the problem is with my little brain not fully understanding what is needed.
Can anyone out there please help!!
Below is the program I used (MASM 6.1 under DOSBox on Win 7 64-bit). It uses BIOS Interrupt 10h Function 13h sub-function 0. I want to do the very same using direct hardware IO.
.model small
.stack
.data ;part of the program containing data
;Constants - None
;Variables
MyMsg db 'Hello World'
.code
Main:
GetAddress:
mov ax,@data ;Gets address of data segment
mov es,ax ;Loads segment address into es register
mov bp,OFFSET MyMsg ;Load Offset into BP
SetAttributes:
mov bl,01001111b ;BG/FG Colour attributes
mov cx,11 ;Length of string in data segment
SetRowAndCol:
mov dh,24 ;Set the row to start printing at
mov dl,68 ;Set the column to start printing at
GetFunctionAndSub:
mov ah,13h ;BIOS Int 10h Function 13h - String Output
mov al,0 ;BIOS Sub-Function (0-3)
Execute:
int 10h ;BIOS Interrupt 10h
EndProg:
mov ax,4c00h ;Terminate program return 0 to OS
int 21h ;DOS Interrupt 21h
end Main
end
I want to have this in a format that is easy to explain. So here is my current working version. I've almost got it. But it only prints the attributes; getting the characters on screen is a problem. (Occasionally when I modify it slightly, I get every second character with random attributes (I think I know the technicalities of why, but don't know enough assembler to fix it).)
.model small
.stack
.data
;Constants
ScreenSeg equ 0B800h
;Variables
MyMsg db 'Hello World'
StrLen equ $-MyMsg
.code
Main:
SetSeg:
mov ax, ScreenSeg ;set segment register:
mov ds, ax
InitializeStringLoop: ;Display all characters: - Not working :( Y!
mov cx, StrLen ;number of characters.
mov di, 00h ;start from byte 'h'
OutputString:
mov [di], offset byte ptr MyMsg[di]
add di, 2 ;skip over next attribute code in vga memory.
loop OutputString
InitializeAttributeLoop:;Color all characters: - Attributes are working fine.
mov cx, StrLen ;number of characters.
mov di, 01h ;start from byte after 'h'
;Assuming I have all chars with same attributes - fine for now - later I would make this
;into a procedure that I will just pass the details into. - But for now I just want a
;basic output tutorial.
OutputAttributes:
mov [di], 11101100b ;light red(1100) on yellow(1110)
add di, 2 ;skip over next ascii code in vga memory.
loop OutputAttributes
EndPrg:
mov ax, 4C00h
int 21h
end Main
Of course I want to reduce the instructions used to the bare-bones essentials (for proper tuition purposes, less to cover when teaching others). Hence the reason I did not use MOVSB/W/D etc. with REP. I opted instead for an easy-to-explain manual loop using standard MOV, INC, ADD, etc. These are instructions that are basic enough and easy to explain to newcomers. So if possible I would like to keep it as close to this as possible.
I know essentially all that seems to be wrong is the loop for the actual string handler. It's not letting me increment the address the way I want it to. It's embarrassing to me because I am actually quite a good programmer using C++, C#, VB, and Delphi (way back when). I know you wouldn't think that given I can't even get a loop right in assembler, but it is such a different language. There are 2 or 3 loops in high level languages, and it seems there is an infinite combination of ways to do loops in assembler depending on the instructions. So I say "Simple Loop", but in reality there is little simple about it.
I hope someone can help me with this; you would be saving my assembly career, and ensuring I eventually become a good assembly teacher. Thanks in advance, and especially for reading this far.
The typical convention would be to use ds:si as source, and es:di as destination.
So it would end up being similar to (untested):
mov ax, @data
mov ds, ax
mov ax, ScreenSeg
mov es, ax
...
mov si, offset MyMsg
OutputString:
mov al, byte ptr ds:[si]
mov byte ptr es:[di], al
add si, 1 ; next character from string
add di, 2 ; skip over next attribute code in vga memory.
loop OutputString
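The cell layout and the offset arithmetic can be checked away from real hardware with a small C model, where an ordinary byte array stands in for the memory at segment 0B800h. The names cell_offset and put_string are hypothetical, and the sketch writes the attribute in the same pass as the character rather than in a second loop.

```c
#include <stddef.h>
#include <stdint.h>

enum { COLS = 80, ROWS = 25, CELL_BYTES = 2 };

/* Byte offset of the cell at (row, col), both zero-based. */
size_t cell_offset(unsigned row, unsigned col)
{
    return ((size_t)row * COLS + col) * CELL_BYTES;
}

/* Write a string into a text-mode buffer: character code in the even
 * byte, attribute in the odd byte that follows it. `screen` stands in
 * for the video memory and must hold ROWS * COLS * CELL_BYTES bytes. */
void put_string(uint8_t *screen, unsigned row, unsigned col,
                const char *s, uint8_t attr)
{
    size_t off = cell_offset(row, col);

    while (*s != '\0' && off + 1 < (size_t)ROWS * COLS * CELL_BYTES) {
        screen[off]     = (uint8_t)*s++;   /* ASCII code */
        screen[off + 1] = attr;            /* colour attribute */
        off += CELL_BYTES;
    }
}
```

With this model, cell_offset(5, 20) gives 840, matching the calculation in the question. The source index advances by 1 per character while the destination advances by 2, which is exactly the si/di split in the assembly above.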
I would suggest getting the Masm32 Package if you don't already have it. It is mainly geared towards easily using Assembly Language in "Windows today", which is very nice, but also will teach you a lot about Asm and also shows where to get the Intel Chip manuals that were mentioned in an earlier reply that are indispensable.
I started programming in the 80's and so I can see why you are so interested in the nitty gritty of it, I know I miss it. If more people were to start out there, it would pay off for them very much. You are doing a great service!
I am playing with exactly what you are talking about, Direct Hardware, and I have also learned that Windows has changed some of the DOS services and BIOS services have changed too, so that some don't work any more. I am in fact writing a small .com program and running it from Win7 in a Command Prompt Window, Prints a msg and waits for a key, Pretty cool considering it's Win7 in 2012!
In fact it was BIOS 10h - 0Eh that did not work, so I tried DOS 21h 02h to write to the screen and it worked. The code is below; because it is a .com (Command Program), I thought it might be of use to you.
; This makes a .com program (64k Limit, Code, Data and all
; have to fit in this space. Used for small utilities and
; good for very fast tasks. In fact DOS Commands are mostly
; small .com programs like this (except more useful)!
;
; Assemble with Masm using
; c:\masm32\bin\ml /AT /c bfc.asm
; Link with Masm's Link16 using
; c:\masm32\bin\link16 bfc.obj,bfc.com;
;
; Link16 is the key to making this 16bit .com (Command) file
SEGMT SEGMENT
org 100h
Start:
push CS
pop DS
MOV SI, OFFSET Message
Next:
MOV ah, 02h ; Write Char to Standard out
MOV dl, [si] ; Char
INT 21h ; Write it
INC si ; Next Char
CMP byte ptr[si], 0 ; Done?
JNE Next ; Nope
WaitKey:
XOR ah, ah ; 0
INT 16h ; Wait for any Key
ExitHere:
MOV ah, 4Ch ; Exit with Return Code
xor al, al ; Return Code
INT 21h
Message db "It Works in Windows 7!", 0
SEGMT ENDS
END Start
I used to do all of what you are talking about. Trying to remember the details. Michael Abrash is a name you should be googling for. Mode-X, for example, a 320x240 256 color mode, was very popular as it broke the 16 color boundary and at the time the games looked really good.
I think that on-the-metal register programming was doable but painful, and you likely need to get the programmer's reference/datasheet for the chip you are interested in. As time passed from CGA to EGA to VGA to VESA, the way things worked changed as well. I think typically you use int something calls to get access to the page frame, then you could fill that in directly. VESA I think worked that way. VESA was a big lifesaver when it came to video card support; you used to have to write your own drivers for each chip before then (if you didn't want the ugly standard modes).
I would look at mode-x or vesa and go from there. You need to have a bit of a hacker inside to get through some of this anyway, it is very rare to find a datasheet/programmers reference manual that is complete and accurate, you always have to just shove some bytes around to see what happens. Start filling those memory blocks that are supposed to be the page frames until you see something change on the screen...
I don't see any specific graphics programming books in my library other than the Abrash books like the Graphics Programming Black Book, which was at the tail end of this period of time. I have BIOS and DOS programmer's references and used Ralf Brown's Interrupt List too. I am sure that I had copies of the manuals for popular video chips (before the internet, remember, you called a human on that phone thing with a cord hanging out of it; the human took a printed manual, sometimes nicely bound, sometimes just a staple in the corner if that, put it in an envelope and mailed it to you, and that was your only copy unless you ran it through the copier). I have stacks of printed stuff that, sorry to say, I am not going to go through to answer this question. I will keep this question in my mind though and look around some more for info. Actually, I may have some of my old programs handy, drawing fractals and other such things (as direct as practical to the video card/memory).
EDIT.
I know you are looking for text mode stuff, and this is a graphics mode but it may or may not shed some light on what you are trying to do. combination of int calls and filling pages and palette memory directly.
http://dwelch.s3.amazonaws.com/fly.tar.gz
I'm using NASM under Ubuntu. I need to get a single input character from the user's keyboard (like when a program asks you for y/n?), so I need to read the entered character as soon as the key is pressed, without pressing Enter. I googled it a lot, but all I found was somehow related to this line (int 21h), which results in "Segmentation Fault". Please help me figure out how to get a single character, or how to overcome this segmentation fault.
It can be done from assembly, but it isn't easy. You can't use int 21h, that's a DOS system call and it isn't available under Linux.
To get characters from the terminal under UNIX-like operating systems (such as Linux), you read from STDIN (file number 0). Normally, the read system call will block until the user presses enter. This is called canonical mode. To read a single character without waiting for the user to press enter, you must first disable canonical mode. Of course, you'll have to re-enable it if you want line input later on, and before your program exits.
To disable canonical mode on Linux, you send an IOCTL (IO ControL) to STDIN, using the ioctl syscall. I assume you know how to make Linux system calls from assembler.
The ioctl syscall has three parameters. The first is the file to send the command to (STDIN), the second is the IOCTL number, and the third is typically a pointer to a data structure. ioctl returns 0 on success, or a negative error code on fail.
The first IOCTL you need is TCGETS (number 0x5401) which gets the current terminal parameters in a termios structure. The third parameter is a pointer to a termios structure. From the kernel source, the termios structure is defined as:
struct termios {
tcflag_t c_iflag; /* input mode flags */
tcflag_t c_oflag; /* output mode flags */
tcflag_t c_cflag; /* control mode flags */
tcflag_t c_lflag; /* local mode flags */
cc_t c_line; /* line discipline */
cc_t c_cc[NCCS]; /* control characters */
};
where tcflag_t is 32 bits long, cc_t is one byte long, and NCCS is currently defined as 19. See the NASM manual for how you can conveniently define and reserve space for structures like this.
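Those sizes pin down the layout an assembly program has to reserve space for. The C sketch below mirrors the structure with fixed-width types so the compiler can confirm the arithmetic; the struct tag and the KERNEL_NCCS macro are illustrative names, not taken from a header.

```c
#include <stddef.h>
#include <stdint.h>

#define KERNEL_NCCS 19

/* Mirror of the structure described above, using fixed-width types. */
struct kernel_termios_layout {
    uint32_t c_iflag;             /* offset 0 */
    uint32_t c_oflag;             /* offset 4 */
    uint32_t c_cflag;             /* offset 8 */
    uint32_t c_lflag;             /* offset 12 */
    uint8_t  c_line;              /* offset 16 */
    uint8_t  c_cc[KERNEL_NCCS];   /* offset 17 */
};

/* 4 * 4 + 1 + 19 = 36 bytes, with no padding needed. */
_Static_assert(sizeof(struct kernel_termios_layout) == 36,
               "the structure occupies 36 bytes");
_Static_assert(offsetof(struct kernel_termios_layout, c_lflag) == 12,
               "c_lflag sits 12 bytes into the structure");
```

So an assembly program needs a 36-byte buffer, and the local mode flags are the dword at offset 12 within it.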
So once you've got the current termios, you need to clear the canonical flag. This flag is in the c_lflag field, with mask ICANON (0x00000002). To clear it, compute c_lflag AND (NOT ICANON) and store the result back into the c_lflag field.
Now you need to notify the kernel of your changes to the termios structure. Use the TCSETS (number 0x5402) ioctl, with the third parameter set to the address of your termios structure.
If all goes well, the terminal is now in non-canonical mode. You can restore canonical mode by setting the canonical flag (by ORing c_lflag with ICANON) and calling the TCSETS ioctl again. Always restore canonical mode before you exit.
As I said, it isn't easy.
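For reference, the same get / modify / set sequence can be prototyped in C. This sketch swaps in the libc wrappers tcgetattr() and tcsetattr() instead of raw ioctl calls (on Linux they are typically implemented on top of the terminal ioctls), and set_canonical is a hypothetical name.

```c
#include <termios.h>

/* Switch canonical (line) mode on or off for a terminal descriptor.
 * Returns 0 on success, -1 if fd is not a terminal or a call fails. */
int set_canonical(int fd, int enable)
{
    struct termios t;

    if (tcgetattr(fd, &t) != 0)         /* fetch current settings */
        return -1;

    if (enable)
        t.c_lflag |= ICANON;            /* OR the flag back in */
    else
        t.c_lflag &= ~(tcflag_t)ICANON; /* c_lflag AND (NOT ICANON) */

    /* TCSANOW: apply the change immediately. */
    return tcsetattr(fd, TCSANOW, &t) == 0 ? 0 : -1;
}
```

A program would call set_canonical(0, 0) before reading single keys and set_canonical(0, 1) before exiting.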
I needed to do this recently, and inspired by Callum's excellent answer, I wrote the following (NASM for x86-64):
DEFAULT REL
section .bss
termios: resb 36
stdin_fd: equ 0 ; STDIN_FILENO
ICANON: equ 1<<1
ECHO: equ 1<<3
section .text
canonical_off:
call read_stdin_termios
; clear canonical bit in local mode flags
and dword [termios+12], ~ICANON
call write_stdin_termios
ret
echo_off:
call read_stdin_termios
; clear echo bit in local mode flags
and dword [termios+12], ~ECHO
call write_stdin_termios
ret
canonical_on:
call read_stdin_termios
; set canonical bit in local mode flags
or dword [termios+12], ICANON
call write_stdin_termios
ret
echo_on:
call read_stdin_termios
; set echo bit in local mode flags
or dword [termios+12], ECHO
call write_stdin_termios
ret
; clobbers RAX, RCX, RDX, R8..11 (by int 0x80 in 64-bit mode)
; allowed by x86-64 System V calling convention
read_stdin_termios:
push rbx
mov eax, 36h
mov ebx, stdin_fd
mov ecx, 5401h
mov edx, termios
int 80h ; ioctl(0, 0x5401, termios)
pop rbx
ret
write_stdin_termios:
push rbx
mov eax, 36h
mov ebx, stdin_fd
mov ecx, 5402h
mov edx, termios
int 80h ; ioctl(0, 0x5402, termios)
pop rbx
ret
(Editor's note: don't use int 0x80 in 64-bit code: What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code? - it would break in a PIE executable (where static addresses aren't in the low 32 bits), or with the termios buffer on the stack. It does actually work in a traditional non-PIE executable, and this version can be easily ported to 32-bit mode.)
You can then do:
call canonical_off
If you're reading a line of text, you probably also want to do:
call echo_off
so that each character isn't echoed as it's typed.
There may be better ways of doing this, but it works for me on a 64-bit Fedora installation.
More information can be found in the man page for termios(3), or in the termbits.h source.
The easy way: For a text-mode program, use libncurses to access the keyboard; for a graphical program, use Gtk+.
The hard way: Assuming a text-mode program, you have to tell the kernel that you want single-character input, and then you have to do a whole lot of bookkeeping and decoding. It's really complicated. There is no equivalent of the good old DOS getch() routine. You can start learning about how to do it here: Terminal I/O. Graphical programs are even more complicated; the lowest-level API for that is Xlib.
Either way, you're going to go mad coding whatever this is in assembly; use C instead.