How do canary words allow gcc to detect buffer overflows? - security

I could test using strncpy() with larger source string then the destination:
int main() {
char *ptr = malloc(12);
strcpy(ptr,"hello world!");
return 0;
}
Compiling with the flag -fstack-protector and using the -S option I got:
.file "malloc.c"
.text
.globl main
.type main, #function
main:
.LFB2:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $32, %rsp
movl %edi, -20(%rbp)
movq %rsi, -32(%rbp)
movq %fs:40, %rax
movq %rax, -8(%rbp)
xorl %eax, %eax
movq $0, -16(%rbp)
movl $12, %edi
call malloc
movq %rax, -16(%rbp)
movq -16(%rbp), %rax
movabsq $8022916924116329800, %rdx
movq %rdx, (%rax)
movl $560229490, 8(%rax)
movb $0, 12(%rax)
movl $0, %eax
movq -8(%rbp), %rcx
xorq %fs:40, %rcx
je .L3
call __stack_chk_fail
.L3:
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE2:
.size main, .-main
Could someone explain to me how this works? And why isn't the "canary word" also overwritten by the \0 of the hello world! string?

Could someone explain to me how does this work ?
Canary word is read from fs:40 and store at top of frame here:
movq %fs:40, %rax
movq %rax, -8(%rbp)
It's below the return address so if your code happens to overflow the buffer (which will be below -8(%rbp)), it'll first overwrite the -8(%rbp) location. This will be detected by GCC prior to issuing ret here:
movq -8(%rbp), %rcx
xorq %fs:40, %rcx ; Checks that %fs:40 == -8(%rbp)
je .L3 ; Ok, return
call __stack_chk_fail ; Die
as overwritten contents of -8(%rbp) will likely to be different from proper value (installed from fs:40).
And why is not the canary word also overwritten by the \0 of the hello world!?
Your code has heap overflow, not buffer overflow so SSP can't help...

Related

call printf system subroutine to output a integer number error in assembly code [duplicate]

This question already has answers here:
Why does Windows64 use a different calling convention from all other OSes on x86-64?
(4 answers)
How to write hello world in assembly under Windows?
(9 answers)
Closed last year.
Fro
run gcc s2.asm in windows7 console window; then a exe file is generated.
run a.exe,then crash, why.
s2.asm code is generated from source code following:
{
int m;
m = 1;
iprint (m) ;
}
s2.asm eplease refer to the following:
IO:
.string "%lld"
.text
.globl main
main:
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
pushq $1
movq %rbp, %rax
leaq -8(%rax), %rax
popq (%rax)
movq %rbp, %rax
leaq -8(%rax), %rax
movq (%rax), %rax
pushq %rax
popq %rsi
leaq IO(%rip), %rdi
movq $0, %rax
callq printf
leaveq
retq
I installed tdm64-gcc-10.3.0-21.exe on my windows, therefore I have a gcc 64 bits.
But why a.exe crashed?
thank you.
hi all, thank your reply and ...
I am a fan of compiler technology, I want to realize my toy compiler by my self, it's very sample but support 64bits, which can be compiled by gcc on windows OS,and run on dindows console.
I meet a compiler which written by ocaml by Mune Professor on site ,
the generated assembly seems very sample.
ocaml64 is setup on my PC.
but only a ocaml compiler in it, there is no gcc.
then but by the toy compiler, 80x86 assembly code can be generated,
To convert assembly code to execute file, then Embarcadero_Dev-Cpp_6.3_TDM-GCC_9.2 is setup,
then gcc tmp.s, a a.exe file is generated,
but a.exe cannot be run successfully on windows.
the code is provided on site.1
But I have limited knowledge on assembly.
On this site, the assembly code emmiter module:
for linux, for cygwin, for old ocaml.
At last I have to reconsider the code again: I select cygwin emitter.
then generate assembly like the folowwing, I run the output a.exe file final succcesslly .
IO:
.string "%lld"
.text
.globl main
main:
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
pushq $1
movq %rbp, %rax
leaq -8(%rax), %rax
popq (%rax)
movq %rbp, %rax
leaq -8(%rax), %rax
movq (%rax), %rax
pushq %rax
popq %rdx
leaq IO(%rip), %rcx
subq $32, %rsp
callq printf
addq $32, %rsp
leaveq
retq
$ ./a.exe
1
Note: the above assembly was not optimized.
After optimization, the code become the following:
IO:
.string "%lld"
.text
.globl main
main:
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
pushq $1
movq %rbp, %rax
leaq -8(%rax), %rax
popq (%rax)
movq %rbp, %rax
leaq -8(%rax), %rax
movq (%rax), %rax
movq %rax, %rdx
leaq IO(%rip), %rcx
subq $32, %rsp
callq printf
addq $32, %rsp
leaveq
retq
I found the two rows
pushq %rax
popq %rdx
become one row.
movq rax, rdx
Here, the issue was resolved, it's caused by my mistake, since I am not really clear about the assembly code emitter module,
Thank all of you.
https://www.ed.tus.ac.jp/j-mune/ccp/

unable to print characters in assembly [duplicate]

This question already has an answer here:
Using interrupt 0x80 on 64-bit Linux [duplicate]
(1 answer)
Closed 1 year ago.
I am trying to print the character h in assembly, but it is not outputting anything right now. I see no reason, nor can I understand why this is not working.
I would believe that it is because I am using %rbp instead of %eax but I am reasonably new to assembly, and I do not know whether writing to the %rbp register instead of %eax makes a difference.
.section .text
.global _start
_start:
mov %eax, %edi
call main
movl $1, %eax
int $0x80
main:
pushq %rbp
movq %rsp, %rbp
movl $4, %eax
movl $1, %ebx
push $0x068
movl $5, %edx
movq %rbp, %rsp
syscall
popq %rbp
ret
The code is compiled with
> as $(BIN_DIR)/assembly.asm -o $(BIN_DIR)/a.o
> ld $(BIN_DIR)/a.o -o $(BIN_DIR)/a
I looked up the structure in e.g. Free Pascal sources which somewhat illustrates how parameters are allocated and how success is determined.
movq sysnr, %rax { Syscall number -> rax. }
// for calls that have less parameters, just skip the relevant lines that load it
movq param1, %rdi { shift arg1 - arg5. }
movq param2, %rsi
movq param3, %rdx
movq param4, %r10
movq param5, %r8
movq param6, %r9
syscall { Do the system call. }
cmpq $-4095, %rax { Check %rax for error. }
jnae .LSyscOK { Jump to error handler if error. }
negq %rax
movq %rax,%rdi
call seterrno // call some function to set errno threadvar
movq $-1,%rax
.LSyscOK: // end of procedure

Assembly x86 64 Linux AT&T: print routine segmentation error

I am new to assembly and am aware that my assembly code may not be efficient or could be better. The comments of the assembly may be messed up a little due to constant changes. The goal is to print each character of the string individually and when comes across with a format identifier like %s, it prints a string from one of the parameters in place of %s.
So for example:
String: Hello, %s
Parameter (RSI): Foo
Output: Hello, Foo
So the code does what it suppose to do but give segmentation error at the end.
.bss
char: .byte 0
.text
.data
text1: .asciz "%s!\n"
text2: .asciz "My name is %s. I think I’ll get a %u for my exam. What does %r do? And %%?\n"
word1: .asciz "Piet"
.global main
main:
pushq %rbp # push the base pointer (and align the stack)
movq %rsp, %rbp # copy stack pointer value to base pointer
movq $text1, %rdi
movq $word1, %rsi
movq $word1, %rdx
movq $word1, %rcx
movq $word1, %r8
movq $word1, %r9
call myPrint
end:
movq %rbp, %rsp # clear local variables from stack
popq %rbp # restore base pointer location
movq $60, %rax
movq $0, %rdi
syscall
myPrint:
pushq %rbp
movq %rsp, %rbp
pushq %rsi
pushq %rdx
pushq %rcx
pushq %r8
pushq %r9
movq %rdi, %r12
regPush:
movq $0, %rbx
#rbx: counter
printLooper:
movb (%r12), %r14b #Get a byte of r12 to r14
cmpb $0, %r14b #Check if r14 is a null byte
je endPrint #If it is a null byte then go to 'endPrint'
cmpb $37, %r14b
je formatter
incq %r12 #Increment r12 to the next byte
skip:
mov $char, %r15 #Move char address to r15
mov %r14b, (%r15) #Move r14 byte into the value of r15
mov $char, %rcx #Move char address into rcx
movq $1, %r13 #For the number of byte
printer:
movq $0, %rsi #Clearing rsi
mov %rcx, %rsi #Move the address to rsi
movq $1, %rax #Sys write
movq $1, %rdi #Output
movq %r13, %rdx #Number of byte to rdx
syscall
jmp printLooper
formatter:
incq %r12 #Moving to char after "%"
movb (%r12), %r14b #Moving the char byte into r14
cmpb $115, %r14b #Compare 's' with r14
je formatString #If it is equal to 's' then jump to 'formatString'
movb -1(%r12), %r14b #Put back the previous char into r14
jmp skip
####String Formatter Start ##################################################
formatString:
addq $1, %rbx
movq $8, %rax
mulq %rbx
subq %rax, %rbp
movq (%rbp), %r15
pushq %r15 ### into the stack
movq $0, %r13 ### Byte counter
formatStringLoop:
movb (%r15), %r14b #Move char into r14
cmpb $0, %r14b #Compare r14 with null byte
je formatStringEnd #If it is equal, go to 'formatStringEnd'
incq %r15 #Increment to next char
addq $1, %r13 #Add 1 to the byte counter
jmp formatStringLoop#Loop again
formatStringEnd:
popq %rcx #Pop the address into rcx
incq %r12 #Moving r12 to next char
jmp printer
#######String Formatter End #############################################
endPrint:
movq %rbp, %rsp
popq %rbp
ret
In formatString you modify %rbp with subq %rax, %rbp, forgetting that you will restore %rsp from it. So when you mov %rbp, %rsp just before the function returns, you end up with %rsp pointing somewhere else, and so you get the wrong return address.
I guess you are subtracting some offset from %rbp to get some space on the stack. This seems unsafe because you've pushed lots of other stuff there. It is safe to use up to 128 bytes below the stack pointer as this is the red zone, but it would be more natural to use an offset from %rsp instead. Using SIB addressing you can access data at constant or variable offsets to %rsp without actually changing its value.
How I found this with gdb: by setting breakpoints at myPrint and endPrint, I found that %rsp was different at the ret than it was on entry. Its value could only have come from %rbp, so I did watch $rbp to have the debugger break when %rbp changed, and it pointed straight to the offending instruction in formatString. (Which I could also have found by searching the source code for %rbp.)
Also, your .text at the top of the file is misplaced, so all your code gets placed in the .data section. This actually works but it surely is not what you intended.

how do I call a external function in assembler?

I have tried to use an external function in assembly code:
.section .rodata
.LC0:
.string "My number is: %lld"
.text
.globl start
start:
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
movq $12345, -8(%rbp)
movq -8(%rbp), %rax
movq %rax, %rsi
movl $.LC0, %edi
movl $0, %eax
call printf # my external function
# exit-syscall
mov $1, %eax
mov $0, %ebx
int $0x80
I assembled and linked with:
as -o myObjfile.o mySourcefile.s
ld -e start -o myProgram -lc myObjfile.o
The executable is build, but it doesn't run, so what's wrong with it?
If I understand correctly if the function you want to call is in the C library you can just declare it as an external function.
.section .rodata
.LC0:
.string "My number is: %lld"
.text
.extern printf #declaring the external function
.globl start
start:
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
movq $12345, -8(%rbp)
movq -8(%rbp), %rax
movq %rax, %rsi
movl $.LC0, %edi
movl $0, %eax
call printf # my external function
# exit-syscall
mov $1, %eax
mov $0, %ebx
int $0x80
If this doesn't work try linking it explicitly with stdlib.h

Swift string via string literal vs initializer

In other languages such as Java, under the hood there is actually a difference between string obtained via string literal vs initializer. In Swift, are they equivalent under the hood?
e.g.
var string:String = ""
var string:String = String()
Refer to this SO post for info on differences between literal and object in Java.
The declarations are equivalent according to the Apple docs:
Initializing an Empty String
To create an empty String value as the starting point for building a longer string, either assign an empty string literal to a variable, or initialize a new String instance with initializer syntax:
var emptyString = "" // empty string literal
var anotherEmptyString = String() // initializer syntax
// these two strings are both empty, and are equivalent to each other
Reference: https://developer.apple.com/library/prerelease/ios/documentation/Swift/Conceptual/Swift_Programming_Language/StringsAndCharacters.html
If we look at the assembly, we will see that the two constructors use identical instructions.
string.swift:
let str = String()
let str2 = ""
Compiled assembly (swiftc -emit-assembly string.swift):
.section __TEXT,__text,regular,pure_instructions
.macosx_version_min 14, 3
.globl _main
.align 4, 0x90
_main:
.cfi_startproc
pushq %rbp
Ltmp0:
.cfi_def_cfa_offset 16
Ltmp1:
.cfi_offset %rbp, -16
movq %rsp, %rbp
Ltmp2:
.cfi_def_cfa_register %rbp
subq $16, %rsp
movq _globalinit_33_1BDF70FFC18749BAB495A73B459ED2F0_token4#GOTPCREL(%rip), %rax
movq _globalinit_33_1BDF70FFC18749BAB495A73B459ED2F0_func4#GOTPCREL(%rip), %rcx
xorl %edx, %edx
movl %edi, -4(%rbp)
movq %rax, %rdi
movq %rsi, -16(%rbp)
movq %rcx, %rsi
callq _swift_once
movq _globalinit_33_1BDF70FFC18749BAB495A73B459ED2F0_token5#GOTPCREL(%rip), %rdi
movq _globalinit_33_1BDF70FFC18749BAB495A73B459ED2F0_func5#GOTPCREL(%rip), %rax
xorl %r8d, %r8d
movl %r8d, %edx
movq __TZvOSs7Process5_argcVSs5Int32#GOTPCREL(%rip), %rcx
movl -4(%rbp), %r8d
movl %r8d, (%rcx)
movq %rax, %rsi
callq _swift_once
movq __TZvOSs7Process11_unsafeArgvGVSs20UnsafeMutablePointerGS0_VSs4Int8__#GOTPCREL(%rip), %rax
movq -16(%rbp), %rcx
movq %rcx, (%rax)
callq __TFSSCfMSSFT_SS
leaq L___unnamed_1(%rip), %rdi
xorl %r8d, %r8d
movl %r8d, %esi
movl $1, %r8d
movq %rax, __Tv6string3strSS(%rip)
movq %rdx, __Tv6string3strSS+8(%rip)
movq %rcx, __Tv6string3strSS+16(%rip)
movl %r8d, %edx
callq __TFSSCfMSSFT21_builtinStringLiteralBp8byteSizeBw7isASCIIBi1__SS
xorl %r8d, %r8d
movq %rax, __Tv6string4str2SS(%rip)
movq %rdx, __Tv6string4str2SS+8(%rip)
movq %rcx, __Tv6string4str2SS+16(%rip)
movl %r8d, %eax
addq $16, %rsp
popq %rbp
retq
.cfi_endproc
.globl __Tv6string3strSS
.zerofill __DATA,__common,__Tv6string3strSS,24,3
.globl __Tv6string4str2SS
.zerofill __DATA,__common,__Tv6string4str2SS,24,3
.section __TEXT,__cstring,cstring_literals
L___unnamed_1:
.space 1
.no_dead_strip __Tv6string3strSS
.no_dead_strip __Tv6string4str2SS
.linker_option "-lswiftCore"
.section __DATA,__objc_imageinfo,regular,no_dead_strip
L_OBJC_IMAGE_INFO:
.long 0
.long 512
.subsections_via_symbols
Notice that the declarations for str and str2 have identical instructions:
xorl %r8d, %r8d
movl %r8d, %esi
movl $1, %r8d
movq %rax, __Tv6string3strSS(%rip)
movq %rdx, __Tv6string3strSS+8(%rip)
movq %rcx, __Tv6string3strSS+16(%rip)
movl %r8d, %edx
# ...
xorl %r8d, %r8d
movq %rax, __Tv6string4str2SS(%rip)
movq %rdx, __Tv6string4str2SS+8(%rip)
movq %rcx, __Tv6string4str2SS+16(%rip)
movl %r8d, %eax
You can learn more about String literals by reviewing the Apple's documentation.

Resources