Linux termios non-canonical sys_call getch() doesn't work

I recently tried to read keyboard input in Linux assembly (x86_64). I found this code and tried it, but it doesn't work as I expected: it should wait for a keystroke, but input arrives automatically from nowhere. I suspect the termios flags. The code is below:
;Get current settings
Mov EAX, 16 ; SYS_ioctl
Mov EDI, 0 ; STDIN_FILENO
Mov ESI, 0x5401 ; TCGETS
Mov RDX, termios
Int 80h
And dword [c_cflag], 0xFD ; Clear ICANON to disable canonical mode
; Write termios structure back
Mov EAX, 16 ; SYS_ioctl
Mov EDI, 0 ; STDIN_FILENO
Mov ESI, 0x5402 ; TCSETS
Mov RDX, termios
Int 80h
Mov EAX,0 ;sys_read kernel call
Mov EBX,0 ;stdin (standard input)
Mov ECX,Nada ;Enter the offset/number of bytes to be read
Mov EDX,1 ;Number of bytes to read
Int 80h ;Call Kernel
The termios struct:
SECTION .bss ;declarations for uninitialized variables
Enter: Resb 1 ;Reserve 1 byte for Enter
Nada: Resb 1
termios:
c_iflag Resd 1 ; input mode flags
c_oflag Resd 1 ; output mode flags
c_cflag Resd 1 ; control mode flags
c_lflag Resd 1 ; local mode flags
c_line Resb 1 ; line discipline
c_cc Resb 64 ; control characters
The output:
nasm -f elf64 -g -F stabs key.asm
ld -o KeyPress key.o
./KeyPress
Untuk memulai tekan tombol enter:
Tekan tombol untuk memainkan satu not: (1,2,3,4,5,6,7,8)
// This is where the error occurs. I check whether the user entered a valid value;
if not, the program jumps to the error label and prints the message below //
Error note not found please contact the app developer !!
reference : Linux Getch(), My Github Repo
PS: The newest code is already pushed to my repository. I use Ubuntu 20.04 on an Intel i7 (64-bit). Thanks for the help.

... i got this code and i try it eventually the code are doesn't work like what i expected ...
Mov ESI, 0x5401
Mov RDX, termios
Int 80h
This won't work:
Int 80h is the 32-bit system call used in 32-bit programs. The first three arguments are passed in EBX, ECX and EDX, and definitely not in ESI.
And the values of EAX required for Int 80h differ from the method used in 64-bit programs: read() would be EAX=3, not EAX=0.
Int 80h seems to work in 64-bit programs, too. However, passing 64-bit values won't work, so you cannot use Int 80h for system calls that take an address (in the example: the address of termios) as an argument.
Either you assemble and link your code as 32-bit program, use int 80h, pass the arguments in EBX, ECX and EDX and use the values in EAX required for 32-bit programs (for example: EAX=3 for read()):
mov eax, 54 ; sys_ioctl when using "int 80h"
mov ebx, 0 ; stdin
mov ecx, 0x5402 ; TCSETS
mov edx, termios
int 80h
Or you build a 64-bit program and use the syscall instruction to call system calls (see this question):
mov eax, 0 ; sys_read when using "syscall"
; note that this instruction will actually set RAX to 0
mov edi, 0 ; set RDI to stdin (implicitly sets rdi)
mov rsi, Nada ; Address of the buffer (see below)
; we explicitly have to use "rsi" here!
mov edx, 1 ; number of bytes
syscall
mov ecx, Nada
I don't use "nasm" but another assembler, so a note on syntax: in "nasm", a bare label stands for the address of that label, so the instruction above means:
Write the address of Nada to the ecx register.
Reading the value stored in RAM at the address Nada would be written with brackets: mov ecx, [Nada].
In "masm" it is the other way around: there a bare name reads the value, and the address is written as mov ecx, offset Nada.
So in "nasm" the corresponding line in my example above, mov rsi, Nada, is already correct.
And dword [c_cflag], 0xFD ; Clear ICANON to disable canonical mode
This line contains two errors:
ICANON is located in C_LFLAG, not in C_CFLAG.
And this instruction would be identical to the C/C++ instruction: c_cflag &= ~0xFFFFFF02, but you want to do: c_lflag &= ~2.
To clear bit 1 only, you have two possibilities:
And byte [c_lflag], 0xFD
; OR:
And dword [c_lflag], 0xFFFFFFFD
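The difference between the two masks can be checked in C. This is only an illustrative sketch: the function names and the sample value 0x8A3B are made up for the example, and ICANON is assumed to be bit 1 (value 2) as stated above.

```c
#include <assert.h>
#include <stdint.h>

/* ICANON on Linux/x86 is bit 1 (value 2), as stated above. */
#define ICANON_BIT 0x2u

/* What "and dword [x], 0xFD" computes: bit 1 is cleared, but so is
   every bit above bit 7. */
uint32_t and_with_0xfd(uint32_t lflag) {
    return lflag & 0x000000FDu;
}

/* What was intended: clear bit 1 and keep everything else. */
uint32_t clear_icanon(uint32_t lflag) {
    return lflag & ~ICANON_BIT;
}

/* Self-check with an arbitrary sample value. */
void mask_demo(void) {
    assert(and_with_0xfd(0x00008A3Bu) == 0x00000039u);
    assert(clear_icanon(0x00008A3Bu) == 0x00008A39u);
}
```

With the narrow mask the upper 24 bits of the flags word are lost, which would switch off every flag stored there.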

Related

Interrupt "console input without echo" in Linux

DOS has int 21h / AH=08H: Console input without echo.
Is there something similar for Linux? I need to process the entered value before it is displayed in the terminal.
Under Linux, it is the tty that buffers the typed chars before "sending" them to the requesting program.
This is controlled through the terminal mode: raw (no buffering) or cooked (also known respectively as non-canonical and canonical mode).
These modes are actually attributes of the tty, which can be controlled with tcgetattr and tcsetattr.
The code to set the terminal in non-canonical mode without echo can be found, for example, here (more info on the VTIME and VMIN control chars can be found here).
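For reference, a minimal C sketch of that approach might look like the following. It is not the linked code: the function names enable_raw, restore_mode and noncanonical_lflag are hypothetical, and only the standard tcgetattr/tcsetattr calls from termios.h are used.

```c
#include <assert.h>
#include <termios.h>
#include <unistd.h>

/* Pure helper: the local-mode flags with canonical mode and echo off. */
tcflag_t noncanonical_lflag(tcflag_t lflag) {
    return lflag & ~(tcflag_t)(ICANON | ECHO);
}

/* Put fd into non-canonical, no-echo mode. The previous settings are
   stored in *saved so they can be restored later. Returns 0 on success
   and -1 on error (for example when fd is not a terminal). */
int enable_raw(int fd, struct termios *saved) {
    struct termios t;
    assert(saved != NULL);
    if (tcgetattr(fd, &t) == -1)
        return -1;
    *saved = t;
    t.c_lflag = noncanonical_lflag(t.c_lflag);
    t.c_cc[VMIN] = 1;   /* read() returns as soon as 1 byte is available */
    t.c_cc[VTIME] = 0;  /* no inter-byte timeout */
    return tcsetattr(fd, TCSANOW, &t);
}

/* Restore the settings captured by enable_raw(). */
int restore_mode(int fd, const struct termios *saved) {
    return tcsetattr(fd, TCSANOW, saved);
}
```

Saving the original settings and writing them back unchanged is a design choice of this sketch: it restores whatever mode the terminal was in, instead of assuming that ECHO and ICANON were set before.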
That's C, so we need to translate it into assembly.
From the source of tcgetattr we can see that the tty attributes are retrieved through an IOCTL to stdin with the command TCGETS (value 0x5401) and, similarly, they are set with an IOCTL with the command TCSETS (value 0x5402).
The structure read is not the struct termios but struct __kernel_termios which is basically a shortened version of the former.
The IOCTL must be sent to the stdin file (file descriptor STDIN_FILENO of value 0).
Knowing how to implement tcgetattr and tcsetattr we only need to get the value of the constants (like ICANON and similar).
I advise using a compiler (e.g. here) to find the values of the public constants and to check the structure's offsets.
For non-public constants (not visible outside their translation units) we must resort to reading the source (this is not particularly hard, but care must be taken to find the right source).
Below is a 64-bit program that invokes the IOCTLs to get, modify and set the TTY attributes in order to enable raw mode.
Then the program waits for a single char and displays it incremented (e.g. a -> b).
Note that this program has been tested under Linux (5.x) and, as the various constants change values across different clones of Unix, it is not portable.
I used NASM and defined a structure for struct __kernel_termios. I also used a lot of symbolic constants to make the code more readable. I don't really like using structures in assembly, but NASM ones are just a thin macro layer (it's better to get used to them if you aren't already).
Finally, I assume familiarity with 64-bit Linux assembly programming.
BITS 64
GLOBAL _start
;
; System calls numbers
;
%define SYS_READ 0
%define SYS_WRITE 1
%define SYS_IOCTL 16
%define SYS_EXIT 60
;
; __kernel_termios structure
;
%define KERNEL_NCC 19
struc termios
.c_iflag: resd 1 ;input mode flags
.c_oflag: resd 1 ;output mode flags
.c_cflag: resd 1 ;control mode flags
.c_lflag: resd 1 ;local mode flags
.c_line: resb 1 ;line discipline
.c_cc: resb KERNEL_NCC ;control characters
endstruc
;
; IOCTL commands
;
%define TCGETS 0x5401
%define TCSETS 0x5402
;
; TTY local flags
;
%define ECHO 8
%define ICANON 2
;
; TTY control chars
;
%define VMIN 6
%define VTIME 5
;
; Standard file descriptors
;
%define STDIN_FILENO 0
%define STDOUT_FILENO 1
SECTION .bss
;The char read (reserve a DWORD to make termios_data be aligned on DWORDs boundary)
data resd 1
;The TTY attributes
termios_data resb termios_size
SECTION .text
_start:
;
;Get the terminal settings by sending the TCGETS IOCTL to stdin
;
mov edi, STDIN_FILENO ;Send IOCTL to stdin (Less efficient but more readable)
mov esi, TCGETS ;The TCGETS command
lea rdx, [REL termios_data] ;The arg, the buffer where to store the TTY attribs
mov eax, SYS_IOCTL ;Do the syscall
syscall
;
;Set the raw mode by clearing ECHO and ICANON and setting VMIN = 1, VTIME = 0
;
and DWORD [REL termios_data + termios.c_lflag], ~(ICANON | ECHO) ;Clear ECHO and ICANON
mov BYTE [REL termios_data + termios.c_cc + VMIN], 1
mov BYTE [REL termios_data + termios.c_cc + VTIME], 0
;
;Set the terminal settings
;
mov edi, STDIN_FILENO ;Send to stdin (Less efficient but more readable)
mov esi, TCSETS ;Use TCSETS as the command
lea rdx, [REL termios_data] ;Use the same data read (and altered) before
mov eax, SYS_IOCTL ;Do the syscall
syscall
;
;Read a char
;
mov edi, STDIN_FILENO ;Read from stdin (Less efficient but more readable)
lea rsi, [REL data] ;Read into data
mov edx, 1 ;Read only 1 char
mov eax, SYS_READ ;Do the syscall (Less efficient but more readable)
syscall
;
;Increment the char (as an example)
;
inc BYTE [REL data]
;
;Print the char
;
mov edi, STDOUT_FILENO ;Write to stdout
lea rsi, [REL data] ;Write the altered char
mov edx, 1 ;Only 1 char to write
mov eax, SYS_WRITE ;Do the syscall
syscall
;
;Restore the terminal settings (similar to the code above)
;
mov edi, STDIN_FILENO
mov esi, TCGETS
lea rdx, [REL termios_data]
mov eax, SYS_IOCTL
syscall
;Set ECHO and ICANON
or DWORD [REL termios_data + termios.c_lflag], ICANON | ECHO
mov edi, STDIN_FILENO
mov esi, TCSETS
lea rdx, [REL termios_data]
mov eax, SYS_IOCTL
syscall
;
;Exit
;
xor edi, edi
mov eax, SYS_EXIT
syscall

Reading a single-key input on Linux (without waiting for return) using x86_64 sys_call

I want to read just one keystroke from the keyboard using sys_read, but sys_read waits until I press Enter. How can I read a single keystroke? This is my code:
Mov EAX,3
Mov EBX,0
Mov ECX,Nada
Mov EDX,1
Int 80h
Cmp ECX,49
Je Do_C
Jmp Error
I already tried using a BIOS interrupt, but it failed (segmentation fault). I want to capture the digits 1 to 8 from the keyboard.
Syscalls in 64-bit linux
The tables from man syscall provide a good overview here:
arch/ABI    instruction    syscall #    retval    Notes
──────────────────────────────────────────────────────────────────
i386        int $0x80      eax          eax
x86_64      syscall        rax          rax       See below

arch/ABI    arg1   arg2   arg3   arg4   arg5   arg6   arg7   Notes
──────────────────────────────────────────────────────────────────
i386        ebx    ecx    edx    esi    edi    ebp    -
x86_64      rdi    rsi    rdx    r10    r8     r9     -
I have omitted the lines that are not relevant here. In 32-bit mode, the parameters were transferred in ebx, ecx, etc and the syscall number is in eax. In 64-bit mode it is a little different: All registers are now 64-bit wide and therefore have a different name. The syscall number is still in eax, which now becomes rax. But the parameters are now passed in rdi, rsi, etc. In addition, the instruction syscall is used here instead of int 0x80 to trigger a syscall.
The order of the parameters can also be read in the man pages, here man 2 ioctl and man 2 read:
int ioctl(int fd, unsigned long request, ...);
ssize_t read(int fd, void *buf, size_t count);
So here the value of int fd is in rdi, the second parameter in rsi etc.
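The same mapping can be tried from C through the generic syscall() wrapper of the C library, which places its arguments in these registers. This is a sketch only: raw_read, raw_write and pipe_roundtrip are hypothetical names. Unlike the bare syscall instruction, the wrapper returns -1 and sets errno on failure instead of returning -errno.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* read(fd, buf, count): on x86_64 the arguments end up in rdi, rsi and
   rdx, and the syscall number SYS_read in rax. */
long raw_read(int fd, void *buf, unsigned long count) {
    assert(buf != NULL);
    return syscall(SYS_read, fd, buf, count);
}

/* write(fd, buf, count): same register order, syscall number SYS_write. */
long raw_write(int fd, const void *buf, unsigned long count) {
    assert(buf != NULL);
    return syscall(SYS_write, fd, buf, count);
}

/* Round trip through a pipe: returns 1 if the byte written comes back. */
int pipe_roundtrip(void) {
    int p[2];
    char c = 0;
    if (pipe(p) == -1)
        return 0;
    int ok = raw_write(p[1], "7", 1) == 1
          && raw_read(p[0], &c, 1) == 1
          && c == '7';
    close(p[0]);
    close(p[1]);
    return ok;
}
```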
How to get rid of waiting for a newline
Firstly create a termios structure in memory (in .bss section):
termios:
c_iflag resd 1 ; input mode flags
c_oflag resd 1 ; output mode flags
c_cflag resd 1 ; control mode flags
c_lflag resd 1 ; local mode flags
c_line resb 1 ; line discipline
c_cc resb 19 ; control characters
Then get the current terminal settings and disable canonical mode:
; Get current settings
mov eax, 16 ; syscall number: SYS_ioctl
mov edi, 0 ; fd: STDIN_FILENO
mov esi, 0x5401 ; request: TCGETS
mov rdx, termios ; request data
syscall
; Modify flags
and byte [c_lflag], 0FDh ; Clear ICANON to disable canonical mode
; Write termios structure back
mov eax, 16 ; syscall number: SYS_ioctl
mov edi, 0 ; fd: STDIN_FILENO
mov esi, 0x5402 ; request: TCSETS
mov rdx, termios ; request data
syscall
Now you can use sys_read to read in the keystroke:
mov eax, 0 ; syscall number: SYS_read
mov edi, 0 ; int fd: STDIN_FILENO
mov rsi, buf ; void* buf
mov rdx, len ; size_t count
syscall
Afterwards check the return value in rax: It contains the number of characters read.
(Or a -errno code on error, e.g. if you closed stdin by running ./a.out <&- in bash. Use strace to print a decoded trace of the system calls your program makes, so you don't need to actually write error handling in toy experiments.)
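In C, the return-value check and the test for the keys 1 to 8 from the question could be sketched as follows. The names read_key and note_index are hypothetical, and the mapping of keys to note indices 0..7 is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Read exactly one byte from fd. Returns the byte (0..255), or -1 on
   end-of-file or error. A read interrupted by a signal is retried. */
int read_key(int fd) {
    unsigned char c;
    for (;;) {
        ssize_t n = read(fd, &c, 1);
        if (n == 1)
            return c;
        if (n == -1 && errno == EINTR)
            continue;
        return -1;  /* n == 0 (EOF) or a real error */
    }
}

/* Map the keys '1'..'8' to note indices 0..7; anything else gives -1. */
int note_index(int key) {
    if (key >= '1' && key <= '8')
        return key - '1';
    return -1;
}

/* Self-check of the mapping. */
void note_index_demo(void) {
    assert(note_index('1') == 0);
    assert(note_index('8') == 7);
    assert(note_index('x') == -1);
}
```

Note that the comparison is made on the byte that was read, not on the register that held the buffer address, which is what Cmp ECX,49 in the question does.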
References:
What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?
Why does the sys_read system call end when it detects a new line?
How do i read single character input from keyboard using nasm (assembly) under ubuntu?
Using the raw keyboard mode under Linux (external site with example in 32-bit assembly)

Display contents of register

Hi, I need help displaying the contents of a register. My code is below. I have been able to display the values of the data registers, but I want to display the flag states (e.g. 1 or 0). It would also be helpful to display the contents of other registers like esi and ebp.
My code is not printing the states of the flags. What am I missing?
section .text
global _start ;must be declared for using gcc
_start : ;tell linker entry point
mov eax,msg ; moves message "rubi" to eax register
mov [reg],eax ; moves message from eax to reg variable
mov edx, 8 ;message length
mov ecx, [reg];message to write
mov ebx, 1 ;file descriptor (stdout)
mov eax, 4 ;system call number (sys_write)
int 0x80 ;call kernel
mov eax, 100
mov ebx, 100
cmp ebx,eax
pushf
pop dword eax
mov [save_flags],eax
mov edx, 8 ;message length
mov ecx,[save_flags] ;message to write
mov ebx, 1 ;file descriptor (stdout)
mov eax, 4 ;system call number (sys_write)
int 0x80
mov eax, 1 ;system call number (sys_exit)
int 0x80 ;call kernel
section .data
msg db "rubi",10
section .bss
reg resb 100
save_flags resw 100
I'm not going for anything fancy here since this appears to be a homework assignment (two people have asked the same question today). This code should be made as a function, and it can have its performance enhanced. Since I don't get an honorary degree or an A in the class it doesn't make sense to me to offer the best solution, but one you can work from:
BITS_TO_DISPLAY equ 32 ; Number of least significant bits to display (1-32)
section .text
global _start ; must be declared for using gcc
_start : ; tell linker entry point
mov edx, msg_len ; message length
mov ecx, msg ; message to write
mov ebx, 1 ; file descriptor (stdout)
mov eax, 4 ; system call number (sys_write)
int 0x80 ; call kernel
mov eax, 100
mov ebx, 100
cmp ebx,eax
pushf
pop dword eax
; Convert binary to string by shifting the right most bit off EAX into
; the carry flag (CF) and convert the bit into a '0' or '1' and place
; in the save_flags buffer in reverse order. Nul terminate the string
; in the event you ever wish to use printf to print it
mov ecx, BITS_TO_DISPLAY ; Number of bits of EAX register to display
mov byte [save_flags+ecx], 0 ; Nul terminate binary string in case we use printf
bin2ascii:
xor bl, bl ; BL = 0
shr eax, 1 ; Shift right most bit into carry flag
adc bl, '0' ; bl = bl + '0' + Carry Flag
mov [save_flags-1+ecx], bl ; Place '0'/'1' into string buffer in reverse order
dec ecx
jnz bin2ascii ; Loop until all bits processed
mov edx, BITS_TO_DISPLAY ; message length
mov ecx, save_flags ; address of binary string to write
mov ebx, 1 ; file descriptor (stdout)
mov eax, 4 ; system call number (sys_write)
int 0x80
mov eax, 1 ;system call number (sys_exit)
int 0x80 ;call kernel
section .data
msg db "rubi",10
msg_len equ $ - msg
section .bss
save_flags resb BITS_TO_DISPLAY+1 ; Add one byte for nul terminator in case we use printf
The idea behind this code is that we continually shift the bits (using the SHR instruction) in the EAX register to the right one bit at a time. The bit that gets shifted out of the register gets placed in the carry flag (CF). We can use ADC to add the value of the carry flag (0/1) to ASCII '0' to get an ASCII value of '0' or '1'. We place these bytes into the destination buffer in reverse order since we are moving from right to left through the bits.
BITS_TO_DISPLAY can be set between 1 and 32 (since this is 32-bit code). If you are interested in the bottom 8 bits of a register set it to 8. If you want to display all the bits of a 32-bit register, specify 32.
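The same loop, written as a C sketch for comparison. The names bits_to_string and zero_flag_is_set are hypothetical, and 0x246 is an assumed sample EFLAGS value with ZF (bit 6) set, not output captured from the program above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Write the lowest nbits (1..32) of value into out as '0'/'1' characters,
   most significant bit first, and NUL-terminate. out needs nbits + 1
   bytes. Returns out for convenience. */
char *bits_to_string(uint32_t value, unsigned nbits, char *out) {
    assert(nbits >= 1 && nbits <= 32);
    out[nbits] = '\0';
    for (unsigned i = nbits; i > 0; i--) {
        out[i - 1] = (char)('0' + (value & 1u));  /* like SHR + ADC bl, '0' */
        value >>= 1;
    }
    return out;
}

/* ZF is bit 6 of EFLAGS. */
int zero_flag_is_set(uint32_t eflags) {
    return (int)((eflags >> 6) & 1u);
}

/* Self-check with small sample values. */
void bits_demo(void) {
    char buf[33];
    assert(strcmp(bits_to_string(5u, 8, buf), "00000101") == 0);
    assert(zero_flag_is_set(0x246u) == 1);
}
```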
Note that you can pop directly into memory.
And if you want to binary dump register and flag data with write(2), your system call needs to pass a pointer to the buffer, not the data itself. Use a mov-immediate to get the address into the register, rather than doing a load. Or lea to use a RIP-relative addressing mode. Or pass a pointer to where it's sitting on the stack, instead of copying it to a global!
mov edx, 8 ;message length
mov ecx,[save_flags] ;message to write ;;;;;;; <<<--- problem
mov ebx, 1 ;file descriptor (stdout)
mov eax, 4 ;system call number (sys_write)
int 0x80
Passing a bad address to write(2) won't cause your program to receive a SIGSEGV, like it would if you used that address in user-space. Instead, the system call returns -EFAULT in eax. And you're not checking the return status from your system calls, so your code doesn't notice.
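This behaviour can be observed from C as well. The sketch below is an illustration under assumptions: write_errno and bad_pointer_demo are hypothetical names, 0x246 stands in for a flags value used as a pointer, and a pipe is used as the destination so that the kernel really has to read the buffer. The C library wrapper reports the error as -1 with errno set to EFAULT rather than as a raw -EFAULT.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/* Try to write n bytes from buf to fd. Returns 0 on success, otherwise
   the errno value reported by write(2). */
int write_errno(int fd, const void *buf, size_t n) {
    if (write(fd, buf, n) == -1)
        return errno;
    return 0;
}

/* Pass a flags-like value where a pointer is expected, as the question's
   code does. The address is not mapped, so the kernel rejects it instead
   of the process receiving SIGSEGV. Returns the resulting errno value,
   or -1 if the pipe could not be created. */
int bad_pointer_demo(void) {
    int p[2];
    volatile uintptr_t fake = 0x246;  /* assumed sample value */
    if (pipe(p) == -1)
        return -1;
    int err = write_errno(p[1], (const void *)fake, 8);
    close(p[0]);
    close(p[1]);
    assert(err != 0);  /* the write cannot have succeeded */
    return err;
}
```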
mov eax,msg ; moves message "rubi" to eax register
mov [reg],eax ; moves message from eax to reg variable
mov ecx, [reg];
This is silly. You should just mov ecx, msg to get the address of msg into ecx, rather than bouncing it through memory.
Are you building for 64bit? I see you're using 8 bytes for a message length. If so, you should be using the 64bit function call ABI (with syscall, not int 0x80). The system-call numbers are different. See the table in one of the links at x86. The 32bit ABI can only accept 32bit pointers. You will have a problem if you try to pass a pointer that has any of the high32 bits set.
You're probably also going to want to format the number into a string, unless you want to pipe your program's output into hexdump.

Why do I need to use [ ] (square brackets) when moving data from register to memory, but not when other way around?

This is the code I have and it works fine:
section .bss
bufflen equ 1024
buff: resb bufflen
whatread: resb 4
section .data
section .text
global main
main:
nop
read:
mov eax,3 ; Specify sys_read
mov ebx,0 ; Specify standard input
mov ecx,buff ; Where to read to...
mov edx,bufflen ; How long to read
int 80h ; Tell linux to do its magic
; Eax currently has the return value from linux system call..
add eax, 30h ; Convert number to ASCII digit
mov [whatread],eax ; Store how many bytes has been read to memory at loc **whatread**
mov eax,4 ; Specify sys_write
mov ebx,1 ; Specify standart output
mov ecx,whatread ; Get the address of whatread to ecx
mov edx,4 ; number of bytes to be written
int 80h ; Tell linux to do its work
mov eax, 1;
mov ebx, 0;
int 80h
Here is a simple run and output:
koray#koray-VirtualBox:~/asm/buffasm$ nasm -f elf -g -F dwarf buff.asm
koray#koray-VirtualBox:~/asm/buffasm$ gcc -o buff buff.o
koray#koray-VirtualBox:~/asm/buffasm$ ./buff
p
2koray#koray-VirtualBox:~/asm/buffasm$ ./buff
ppp
4koray#koray-VirtualBox:~/asm/buffasm$
My question is: What is with these 2 instructions:
mov [whatread],eax ; Store how many byte reads info to memory at loc whatread
mov ecx,whatread ; Get the address of whatread in ecx
Why the first one works with [] but the other one without?
When I try replacing the second line above with:
mov ecx,[whatread] ; Get the address of whatread in ecx
the executable does not run properly: it shows nothing in the console.
Using brackets and not using brackets are basically two different things:
A bracket means that the value in the memory at the given address is meant.
An expression without a bracket means that the address (or value) itself is meant.
Examples:
mov ecx, 1234
Means: Write the value 1234 to the register ecx
mov ecx, [1234]
Means: Write the value that is stored in memory at address 1234 to the register ecx
mov [1234], ecx
Means: Write the value stored in ecx to the memory at address 1234
mov 1234, ecx
... makes no sense (in this syntax) because 1234 is a constant number which cannot be changed.
Linux "write" syscall (INT 80h, EAX=4) requires the address of the value to be written, not the value itself!
This is why you do not use brackets at this position!
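For readers who know C, the distinction corresponds roughly to taking the address of a variable versus reading it. This is only an analogy sketched with hypothetical function names; the variable is called whatread as in the question.

```c
#include <assert.h>

static unsigned int whatread = '2';

/* mov ecx, whatread   -> ecx holds the address of the variable */
const unsigned int *without_brackets(void) {
    return &whatread;
}

/* mov ecx, [whatread] -> ecx holds the value stored in the variable */
unsigned int with_brackets(void) {
    return whatread;
}

void brackets_demo(void) {
    /* write(2) wants what without_brackets() returns; dereferencing that
       pointer gives what with_brackets() returns. */
    assert(*without_brackets() == with_brackets());
    assert(with_brackets() == '2');
}
```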

How should I work with dynamically-sized input in NASM Assembly?

I'm trying to learn assembly with NASM on 64 bit Linux.
I managed to make a program that reads two numbers and adds them. The first thing I realized was that the program will only work with one-digit numbers (and results):
; Calculator
SECTION .data
msg1 db "Enter the first number: "
msg1len equ $-msg1
msg2 db "Enter the second number: "
msg2len equ $-msg2
msg3 db "The result is: "
msg3len equ $-msg3
SECTION .bss
num1 resb 1
num2 resb 1
result resb 1
SECTION .text
global main
main:
; Ask for the first number
mov EAX,4
mov EBX,1
mov ECX,msg1
mov EDX,msg1len
int 0x80
; Read the first number
mov EAX,3
mov EBX,1
mov ECX,num1
mov EDX,2
int 0x80
; Ask for the second number
mov EAX,4
mov EBX,1
mov ECX,msg2
mov EDX,msg2len
int 0x80
; Read the second number
mov EAX,3
mov EBX,1
mov ECX,num2
mov EDX,2
int 0x80
; Prepare to announce the result
mov EAX,4
mov EBX,1
mov ECX,msg3
mov EDX,msg3len
int 0x80
; Do the sum
; Store read values to EAX and EBX
mov EAX,[num1]
mov EBX,[num2]
; From ASCII to decimal
sub EAX,'0'
sub EBX,'0'
; Add
add EAX,EBX
; Convert back to EAX
add EAX,'0'
; Save the result back to the variable
mov [result],EAX
; Print result
mov EAX,4
mov EBX,1
mov ECX,result
mov EDX,1
int 0x80
As you can see, I reserve one byte for the first number, another for the second, and one more for the result. This isn't very flexible. I would like to make additions with numbers of any size.
How should I approach this?
First of all you are generating a 32-bit program, not a 64-bit program. This is no problem as Linux 64-bit can run 32-bit programs if they are either statically linked (this is the case for you) or the 32-bit shared libraries are installed.
Your program contains a real bug: You are reading and writing the "EAX" register from a 1-byte field in RAM:
mov EAX, [num1]
This will normally work on little-endian computers (x86). However, if the byte you want to read is the last byte of the last mapped memory page of your program, the 4-byte read reaches into unmapped memory and the program crashes (on Linux/x86 with a segmentation fault).
Even more critical is the write command:
mov [result], EAX
This command will overwrite 3 bytes of memory following the "result" variable. If you extend your program by additional bytes:
num1 resb 1
num2 resb 1
result resb 1
newVariable1 resb 1
You'll overwrite these variables! To correct your program you must use the AL (and BL) register instead of the complete EAX register:
mov AL, [num1]
mov BL, [num2]
...
mov [result], AL
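The effect of the 4-byte store can be simulated in C with a byte array that mirrors the variable layout. This is a sketch with hypothetical names; the 0xAA fill pattern and the spare bytes at the end of the array are assumptions that keep the example well defined.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte layout mirroring the .bss section: num1, num2, result,
   newVariable1, followed by spare bytes so that the 4-byte store stays
   inside the array. */
enum { NUM1, NUM2, RESULT, NEW_VARIABLE1, LAYOUT_SIZE = 8 };

/* mov [result], AL: only the one byte is written. */
unsigned char neighbour_after_byte_store(void) {
    unsigned char mem[LAYOUT_SIZE];
    memset(mem, 0xAA, sizeof mem);
    mem[RESULT] = '7';
    return mem[NEW_VARIABLE1];
}

/* mov [result], EAX: four bytes are written, starting at result. */
unsigned char neighbour_after_dword_store(void) {
    unsigned char mem[LAYOUT_SIZE];
    uint32_t eax = '7';
    assert(RESULT + sizeof eax <= LAYOUT_SIZE);
    memset(mem, 0xAA, sizeof mem);
    memcpy(&mem[RESULT], &eax, sizeof eax);
    return mem[NEW_VARIABLE1];
}
```

After the byte store the neighbour still holds 0xAA; after the dword store it has been overwritten with 0.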
Another finding in your program is: You are reading from file handle #1. This is the standard output. Your program should read from file handle #0 (standard input):
mov EAX, 3 ; read
mov EBX, 0 ; standard input
...
int 0x80
But now the answer to the actual question:
The C library functions (e.g. fgets()) use buffered input. Doing it like this would be a bit too complicated for a beginning, so reading one byte at a time is a possibility.
Think about it this way: "How would I solve this problem using a high-level language like C?" If you don't use libraries in your assembler program, you can only use system calls (section 2 man pages) as functions (e.g. you cannot use "fgets()" but only "read()").
In your case a C program reading a number from standard input could look like this:
int num1;
char c;
...
num1 = 0;
while(1)
{
if(read(0,&c,1)!=1) break;
if(c=='\r' || c=='\n') break;
num1 = 10*num1 + c - '0';
}
Now you may think about the assembler code (I typically use GNU assembler, which has another syntax, so maybe this code contains some bugs):
c resb 1
num1 resb 4
...
; Set "num1" to 0
mov EAX, 0
mov [num1], EAX
; Here our while-loop starts
next_digit:
; Read one character
mov EAX, 3
mov EBX, 0
mov ECX, c
mov EDX, 1
int 0x80
; Check for the end-of-input
cmp EAX, 1
jnz end_of_loop
; This will cause EBX to be 0.
; When modifying the BL register the
; low 8 bits of EBX are modified.
; The high 24 bits remain 0.
; So clearing the EBX register before
; reading an 8-bit number into BL is
; a method for converting an 8-bit
; number to a 32-bit number!
xor EBX, EBX
; Load the character read into BL
; Check for "\r" or "\n" as input
mov BL, [c]
cmp BL, 10
jz end_of_loop
cmp BL, 13
jz end_of_loop
; read "num1" into EAX
mov EAX, [num1]
; Multiply "num1" with 10
mov ECX, 10
mul ECX
; Add one digit
sub EBX, '0'
add EAX, EBX
; write "num1" back
mov [num1], EAX
; Do the while loop again
jmp next_digit
; The end of the loop...
end_of_loop:
; Done
Writing decimal numbers with more digits is more difficult!
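Following the same "think in C first" approach, the output direction could be sketched like this. The function name u32_to_decimal is hypothetical, and this is one common way to do it, not the only one: divide by 10 repeatedly and store the remainders from the end of the buffer backwards.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Convert value to decimal text. Digits are produced least significant
   first, so the buffer is filled from the end. buf must hold 11 bytes
   (10 digits for UINT32_MAX plus the terminating NUL). Returns a pointer
   to the first digit inside buf. */
char *u32_to_decimal(uint32_t value, char buf[11]) {
    char *p = buf + 10;
    *p = '\0';
    do {
        *--p = (char)('0' + value % 10);  /* remainder is the next digit */
        value /= 10;                      /* quotient carries on */
    } while (value != 0);
    return p;
}

/* Self-check with two sample values. */
void decimal_demo(void) {
    char buf[11];
    assert(strcmp(u32_to_decimal(1234u, buf), "1234") == 0);
    assert(strcmp(u32_to_decimal(0u, buf), "0") == 0);
}
```

In 32-bit assembly the division step maps onto the div instruction, which leaves the quotient in EAX and the remainder in EDX.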

Resources