ARM Assembly Branch Segmentation Fault

ARM Assembly Branch Segmentation Fault - linux

I'm new to assembly and I'm currently getting a segmentation fault when executing the following:
.global _start # Provide program starting address to linker
_start: mov R0,#0 # A value of 1 indicates "True"
bl v_bool # Call subroutine to display "True" or "False"
mov R0,#0 # Exit Status code of 0 for "normal completion"
mov R7,#1 # Service command 1 terminates this program
svc 0 # Issue Linux command to terminate program
# Subroutine v_bool wil display "True" or "False" on the monitor
# R0: contains 0 implies false; non-zero implies true
# LR: Contains the return address
# Registers R0 through R7 will be used by v_bool and not saved
v_bool: cmp R0,#0 # Set condition flags for True or False
beq setf
bne sett
mov R2,#6 # Number of characters to be displayed at a time.
mov R0,#1 # Code for stdout (standard output, monitor)
mov R7,#4 # Linux service command code to write.
svc 0 # Call Linux command
bx LR # Return to the calling program
sett: ldr R1,=T_msg
setf: ldr R1,=F_msg
.data
T_msg: .ascii "True " # ASCII string to display if true
F_msg: .ascii "False " # ASCII string to display if false
.end
I've used the debugger to find that the causes of the segmentation fault are the two branches sett and setf, and I understand that this is caused by the program trying to write to an illegal memory location.
However, I do not understand why these branches are not able to write to R1, or what I should do to fix this. Any help is greatly appreciated.

The issue is not the instructions themselves. The problem is, after executing the instruction at, for instance setf, the execution continues on to undefined memory. You need to make sure the execution after setf and sett goes back to the code of v_bool.

Related

How does gdb start an assembly compiled program and step one line at a time?

Valgrind says the following on their documentation page
Your program is then run on a synthetic CPU provided by the Valgrind core
However GDB doesn't seem to do that. It seems to launch a separate process which executes independently. There's also no c library from what I can tell. Here's what I did
Compile using clang or gcc gcc -g tiny.s -nostdlib (-g seems to be required)
gdb ./a.out
Write starti
Press s a bunch of times
You'll see it'll print out "Test1\n" without printing test2. You can also kill the process without terminating gdb. GDB will say "Program received signal SIGTERM, Terminated." and won't ever write Test2
How does gdb start the process and have it execute only one line at a time?
.text
.intel_syntax noprefix
.globl _start
.p2align 4, 0x90
.type _start,#function
_start:
lea rsi, [rip + .s1]
mov edi, 1
mov edx, 6
mov eax, 1
syscall
lea rsi, [rip + .s2]
mov edi, 1
mov edx, 6
mov eax, 1
syscall
mov eax, 60
xor edi, edi
syscall
.s1:
.ascii "Test1\n"
.s2:
.ascii "Test2\n"

starti implementation
As usual for a process that wants to start another process, it does a fork/exec, like a shell does. But in the new process, GDB doesn't just make an execve system call right away.
Instead, it calls ptrace(PTRACE_TRACEME) to wait for the parent process to attach to it, so GDB (the parent) is already attached before the child process makes an execve() system call to make this process start executing the specified executable file.
Also note in the execve(2) man page:
If the current program is being ptraced, a SIGTRAP signal is sent
to it after a successful execve().
So that's how the kernel debugging API supports stopping before the first user-space instruction is executed in a newly-execed process. i.e. exactly what starti wants. This doesn't depend on setting a breakpoint; that can't happen until after execve anyway, and with ASLR the correct address isn't even known until after execve picks a base address. (GDB by default disables ASLR, but it still works if you tell it not to disable ASLR.)
This is also what GDB use if you set breakpoints before run, manually, or by using start to set a one-time breakpoint on main. Before the starti command existed, a hack to emulate that functionality was to set an invalid breakpoint before run, so GDB would stop on that error, giving you control at that point.
If you strace -f -o gdb.trace gdb ./foo or something, you'll see some of what GDB does. (Nested tracing apparently doesn't work, so running GDB under strace means GDB's ptrace system call fails, but we can see what it does leading up to that.)
...
231566 execve("/usr/bin/gdb", ["gdb", "./foo"], 0x7ffca2416e18 /* 57 vars */) = 0
# the initial GDB process is PID 231566.
... whole bunch of stuff
231566 write(1, "Starting program: /tmp/foo \n", 28) = 28
231566 personality(0xffffffff) = 0 (PER_LINUX)
231566 personality(PER_LINUX|ADDR_NO_RANDOMIZE) = 0 (PER_LINUX)
231566 personality(0xffffffff) = 0x40000 (PER_LINUX|ADDR_NO_RANDOMIZE)
231566 vfork( <unfinished ...>
# 231584 is the new PID created by vfork that would go on to execve the new PID
231584 openat(AT_FDCWD, "/proc/self/fd", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 13
231584 newfstatat(13, "", {st_mode=S_IFDIR|0500, st_size=0, ...}, AT_EMPTY_PATH) = 0
231584 getdents64(13, 0x558403e20360 /* 16 entries */, 32768) = 384
231584 close(3) = 0
... all these FDs
231584 close(12) = 0
231584 getdents64(13, 0x558403e20360 /* 0 entries */, 32768) = 0
231584 close(13) = 0
231584 getpid() = 231584
231584 getpid() = 231584
231584 setpgid(231584, 231584) = 0
231584 ptrace(PTRACE_TRACEME) = -1 EPERM (Operation not permitted)
231584 write(2, "warning: ", 9) = 9
231584 write(2, "Could not trace the inferior pro"..., 37) = 37
231584 write(2, "\n", 1) = 1
231584 write(2, "warning: ", 9) = 9
231584 write(2, "ptrace", 6) = 6
231584 write(2, ": ", 2) = 2
231584 write(2, "Operation not permitted", 23) = 23
231584 write(2, "\n", 1) = 1
# gotta love unbuffered stderr
231584 exit_group(127) = ?
231566 <... vfork resumed>) = 231584 # in the parent
231584 +++ exited with 127 +++
# then the parent is running again
231566 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=231584, si_uid=1000, si_status=127, si_utime=0, si_stime=0} ---
231566 rt_sigreturn({mask=[]}) = 231584
... then I typed "quit" and hit return
There some earlier clone system calls to create more threads in the main GDB process, but those didn't exit until after the vforked PID that attempted ptrace(PTRACE_TRACEME). They were all just threads since they used clone with CLONE_VM. There was one earlier vfork / execve of /usr/bin/iconv.
Annoyingly, modern Linux has moved to PIDs wider than 16-bit so the numbers get inconveniently large for human minds.
step implementation:
Unlike stepi which would use PTRACE_SINGLESTEP on ISAs that support it (e.g. x86 where the kernel can use the TF trap flag, but interestingly not ARM), step is based on source-level line number <-> address debug info. That's usually pointless for asm, unless you want to step past macro expansions or something.
But for step, GDB will use ptrace(PTRACE_POKETEXT) to write an int3 debug-break opcode over the first byte of an instruction, then ptrace(PTRACE_CONT) to let execution run in the child process until it hits a breakpoint or other signal. (Then put back the original opcode byte when this instruction needs to execute). The place at which it puts that breakpoint is something it finds by looking for the next address of a line-number in the DWARF or STABS debug info (metadata) in the executable. That's why only stepi (aka si) works when you don't have debug info.
Or possibly it would use PTRACE_SINGLESTEP one or two times as an optimization if it saw it was close.
(I normally only use si or ni for debugging asm, not s or n. layout reg is also nice, when GDB doesn't crash. See the bottom of the x86 tag wiki for more GDB asm debugging tips.)
If you meant to ask how the x86 ISA supports debugging, rather than the Linux kernel API which exposes those features via a target-independent API, see the related Q&As:
How is PTRACE_SINGLESTEP implemented?
Why Single Stepping Instruction on X86?
How to tell length of an x86-64 instruction opcode using CPU itself?
Also How does a debugger work? has some Windowsy answers.

How to print value of a register using spike?

I have an assembly code for RISCV machine.
I have added an instruction to access floating point control and status register and store floating point flags in register a3. I want to print its value to demonstrate that flag gets set when floating point exception occurs.
I tried using spike. There is an instruction in spike(in debug mode) to print value of a register:
: reg 0 a3
to print value of a3.
But first i have to reach my desired point.
I do not know how will i be able to reach that point.
.file "learn_Assembly.c"
.option nopic
.text
.comm a,4,4
.comm b,4,4
.align 1
.globl main
.type main, #function
main:
addi sp,sp,-32
sd s0,24(sp)
addi s0,sp,32
lui a5,%hi(a)
lui a4,%hi(.LC0)
flw fa5,%lo(.LC0)(a4)
fsw fa5,%lo(a)(a5)
lui a5,%hi(b)
lui a4,%hi(.LC1)
flw fa5,%lo(.LC1)(a4)
fsw fa5,%lo(b)(a5)
lui a5,%hi(a)
flw fa4,%lo(a)(a5)
lui a5,%hi(b)
flw fa5,%lo(b)(a5)
fmul.s fa5,fa4,fa5
frflags a3
fsw fa5,-20(s0)
li a5,0
mv a0,a5
ld s0,24(sp)
addi sp,sp,32
jr ra
.size main, .-main
.section .rodata
.align 2
.LC0:
.word 1082130432
.align 2
.LC1:
.word 1077936128
.ident "GCC: (GNU) 8.2.0"
The other option is to somehow write print it using assembly instruction which i am not sure how to do.

To understand the flow of your program , you could create object dump of your program from compiled elf .
To create elf :-
riscv64-unknown-elf-gcc assmebly_code.s -o executable.elf
Then you could create the object dump by :-
riscv64-unknown-elf-objdump -d executable.elf > executable.dump
executable.dump will contains the program flow like this :-
executable.elf: file format elf64-littleriscv
Disassembly of section .text:
00000000000100b0 <_start>:
100b0: 00002197 auipc gp,0x2
100b4: 35018193 addi gp,gp,848 # 12400 <__global_pointer$>
100b8: 81818513 addi a0,gp,-2024 # 11c18 <_edata>
100bc: 85818613 addi a2,gp,-1960 # 11c58 <_end>
100c0: 8e09 sub a2,a2,a0
100c2: 4581 li a1,0
100c4: 1e6000ef jal ra,102aa <memset>
100c8: 00000517 auipc a0,0x0
100cc: 13850513 addi a0,a0,312 # 10200 <__libc_fini_array>
100d0: 104000ef jal ra,101d4 <atexit>
100d4: 174000ef jal ra,10248 <__libc_init_array>
100d8: 4502 lw a0,0(sp)
100da: 002c addi a1,sp,8
100dc: 4601 li a2,0
100de: 0be000ef jal ra,1019c <main>
100e2: 0fe0006f j 101e0 <exit>
....... ........ .................
....... ........ .................
....... ........ .................
Recognize the required pc with required a3 value .
then on spike use command until to run till that pc value :
: until pc 0 <*required pc*>
Note : Your compiler and assembler names may vary.

You can use until spike instruction to execute until a desired equality is reached:
: until pc 0 2020 (stop when pc=2020)
As explain here (interactive debug).
Once value reached you can use reg to read value you want.

Understanding how $ works in assembly [duplicate]

len: equ 2
len: db 2
Are they the same, producing a label that can be used instead of 2? If not, then what is the advantage or disadvantage of each declaration form? Can they be used interchangeably?

The first is equate, similar to C's:
#define len 2
in that it doesn't actually allocate any space in the final code, it simply sets the len symbol to be equal to 2. Then, when you use len later on in your source code, it's the same as if you're using the constant 2.
The second is define byte, similar to C's:
int len = 2;
It does actually allocate space, one byte in memory, stores a 2 there, and sets len to be the address of that byte.
Here's some pseudo-assembler code that shows the distinction:
line addr code label instruction
---- ---- -------- ----- -----------
1 0000 org 1234h
2 1234 elen equ 2
3 1234 02 dlen db 2
4 1235 44 02 00 mov ax, elen
5 1238 44 34 12 mov ax, dlen
Line 1 simply sets the assembly address to be 1234h, to make it easier to explain what's happening.
In line 2, no code is generated, the assembler simply loads elen into the symbol table with the value 2. Since no code has been generated, the address does not change.
Then, when you use it on line 4, it loads that value into the register.
Line 3 shows that db is different, it actually allocates some space (one byte) and stores the value in that space. It then loads dlen into the symbol table but gives it the value of that address 1234h rather than the constant value 2.
When you later use dlen on line 5, you get the address, which you would have to dereference to get the actual value 2.

Summary
NASM 2.10.09 ELF output:
db does not have any magic effects: it simply outputs bytes directly to the output object file.
If those bytes happen to be in front of a symbol, the symbol will point to that value when the program starts.
If you are on the text section, your bytes will get executed.
Weather you use db or dw, etc. that does not specify the size of the symbol: the st_size field of the symbol table entry is not affected.
equ makes the symbol in the current line have st_shndx == SHN_ABS magic value in its symbol table entry.
Instead of outputting a byte to the current object file location, it outputs it to the st_value field of the symbol table entry.
All else follows from this.
To understand what that really means, you should first understand the basics of the ELF standard and relocation.
SHN_ABS theory
SHN_ABS tells the linker that:
relocation is not to be done on this symbol
the st_value field of the symbol entry is to be used as a value directly
Contrast this with "regular" symbols, in which the value of the symbol is a memory address instead, and must therefore go through relocation.
Since it does not point to memory, SHN_ABS symbols can be effectively removed from the executable by the linker by inlining them.
But they are still regular symbols on object files and do take up memory there, and could be shared amongst multiple files if global.
Sample usage
section .data
x: equ 1
y: db 2
section .text
global _start
_start:
mov al, x
; al == 1
mov al, [y]
; al == 2
Note that since the symbol x contains a literal value, no dereference [] must be done to it like for y.
If we wanted to use x from a C program, we'd need something like:
extern char x;
printf("%d", &x);
and set on the asm:
global x
Empirical observation of generated output
We can observe what we've said before with:
nasm -felf32 -o equ.o equ.asm
ld -melf_i386 -o equ equ.o
Now:
readelf -s equ.o
contains:
Num: Value Size Type Bind Vis Ndx Name
4: 00000001 0 NOTYPE LOCAL DEFAULT ABS x
5: 00000000 0 NOTYPE LOCAL DEFAULT 1 y
Ndx is st_shndx, so we see that x is SHN_ABS while y is not.
Also see that Size is 0 for y: db in no way told y that it was a single byte wide. We could simply add two db directives to allocate 2 bytes there.
And then:
objdump -dr equ
gives:
08048080 <_start>:
8048080: b0 01 mov $0x1,%al
8048082: a0 88 90 04 08 mov 0x8049088,%al
So we see that 0x1 was inlined into instruction, while y got the value of a relocation address 0x8049088.
Tested on Ubuntu 14.04 AMD64.
Docs
http://www.nasm.us/doc/nasmdoc3.html#section-3.2.4:
EQU defines a symbol to a given constant value: when EQU is used, the source line must contain a label. The action of EQU is to define the given label name to the value of its (only) operand. This definition is absolute, and cannot change later. So, for example,
message db 'hello, world'
msglen equ $-message
defines msglen to be the constant 12. msglen may not then be redefined later. This is not a preprocessor definition either: the value of msglen is evaluated once, using the value of $ (see section 3.5 for an explanation of $) at the point of definition, rather than being evaluated wherever it is referenced and using the value of $ at the point of reference.
See also
Analogous question for GAS: Difference between .equ and .word in ARM Assembly? .equiv seems to be the closes GAS equivalent.

equ: preprocessor time. analogous to #define but most assemblers are lacking an #undef, and can't have anything but an atomic constant of fixed number of bytes on the right hand side, so floats, doubles, lists are not supported with most assemblers' equ directive.
db: compile time. the value stored in db is stored in the binary output by the assembler at a specific offset. equ allows you define constants that normally would need to be either hardcoded, or require a mov operation to get. db allows you to have data available in memory before the program even starts.
Here's a nasm demonstrating db:
; I am a 16 byte object at offset 0.
db '----------------'
; I am a 14 byte object at offset 16
; the label foo makes the assembler remember the current 'tell' of the
; binary being written.
foo:
db 'Hello, World!', 0
; I am a 2 byte filler at offset 30 to help readability in hex editor.
db ' .'
; I am a 4 byte object at offset 16 that the offset of foo, which is 16(0x10).
dd foo
An equ can only define a constant up to the largest the assembler supports
example of equ, along with a few common limitations of it.
; OK
ZERO equ 0
; OK(some assemblers won't recognize \r and will need to look up the ascii table to get the value of it).
CR equ 0xD
; OK(some assemblers won't recognize \n and will need to look up the ascii table to get the value of it).
LF equ 0xA
; error: bar.asm:2: warning: numeric constant 102919291299129192919293122 -
; does not fit in 64 bits
; LARGE_INTEGER equ 102919291299129192919293122
; bar.asm:5: error: expression syntax error
; assemblers often don't support float constants, despite fitting in
; reasonable number of bytes. This is one of the many things
; we take for granted in C, ability to precompile floats at compile time
; without the need to create your own assembly preprocessor/assembler.
; PI equ 3.1415926
; bar.asm:14: error: bad syntax for EQU
; assemblers often don't support list constants, this is something C
; does support using define, allowing you to define a macro that
; can be passed as a single argument to a function that takes multiple.
; eg
; #define RED 0xff, 0x00, 0x00, 0x00
; glVertex4f(RED);
; #undef RED
;RED equ 0xff, 0x00, 0x00, 0x00
the resulting binary has no bytes at all because equ does not pollute the image; all references to an equ get replaced by the right hand side of that equ.

Where const strings are saved in assembly?

When i declare a string in assembly like that:
string DB "My string", 0
where is the string saved?
Can i determine where it will be saved when declaring it?

db assembles output bytes to the current position in the output file. You control exactly where they go.
There is no indirection or reference to any other location, it's like char string[] = "blah blah", not char *string = "blah blah" (but without the implicit zero byte at the end, that's why you have to use ,0 to add one explicitly.)
When targeting a modern OS (i.e. not making a boot-sector or something), your code + data will end up in an object file and then be linked into an executable or library.
On Linux (or other ELF platforms), put read-only constant data including strings in section .rodata. This section (along with section .text where you put code) becomes part of the text segment after linking.
Windows apparently uses section .rdata.
Different assemblers have different syntax for changing sections, but I think section .whatever works in most of the one that use DB for data bytes.
;; NASM source for the x86-64 System V ABI.
section .rodata ; use section .rdata on Windows
string DB "My string", 0
section .data
static_storage_for_something: dd 123 ; one dword with value = 123
;; usually you don't need .data and can just use registers or the stack
section .bss ; zero-initialized memory, bytes not stored in the executable, just size
static_array: resd 12300000 ;; 12300000 dwords with value = 0
section .text
extern puts ; defined in libc
global main
main:
mov edi, string ; RDI = address of string = first function arg
;mov [rdi], 1234 ; would segfault because .rodata is mapped read-only
jmp puts ; tail-call puts(string)
peter#volta:/tmp$ cat > string.asm
(and paste the above, then press control-D)
peter#volta:/tmp$ nasm -f elf64 string.asm && gcc -no-pie string.o && ./a.out
My string
peter#volta:/tmp$ echo $?
10
10 characters is the return value from puts, which is the return value from main because we tail-called it, which becomes the exit status of our program. (Linux glibc puts apparently returns the character count in this case. But the manual just says it returns non-negative number on success, so don't count on this)
I used -no-pie because I used an absolute address for string with mov instead of a RIP-relative LEA.
You can use readelf -a a.out or nm to look at what went where in your executable.

How to find the main function's entry point of elf executable file without any symbolic information?

I developed a small cpp program on platform of Ubuntu-Linux 11.10.
Now I want to reverse engineer it. I am beginner. I use such tools: GDB 7.0, hte editor, hexeditor.
For the first time I made it pretty easy. With help of symbolic information I founded the address of main function and made everything I needed.
Then I striped (--strip-all) executable elf-file and I have some problems.
I know that main function starts from 0x8960 in this program.
But I haven't any idea how should I find this point without this knowledge.
I tried debug my program step by step with gdb but it goes into __libc_start_main
then into the ld-linux.so.3 (so, it finds and loads the shared libraries needed by a program). I debugged it about 10 minutes. Of course, may be in 20 minutes I can reach the main function's entry point, but, it seems, that more easy way has to exist.
What should I do to find the main function's entry point without any symbolic info?
Could you advise me some good books/sites/other_sources from reverse engineering of elf-files with help of gdb?
Any help would be appreciated.

Locating main() in a stripped Linux ELF binary is straightforward. No symbol information is required.
The prototype for __libc_start_main is
int __libc_start_main(int (*main) (int, char**, char**),
int argc,
char *__unbounded *__unbounded ubp_av,
void (*init) (void),
void (*fini) (void),
void (*rtld_fini) (void),
void (*__unbounded stack_end));
The runtime memory address of main() is the argument corresponding to the first parameter, int (*main) (int, char**, char**). This means that the last memory address saved on the runtime stack prior to calling __libc_start_main is the memory address of main(), since arguments are pushed onto the runtime stack in the reverse order of their corresponding parameters in the function definition.
One can enter main() in gdb in 4 steps:
Find the program entry point
Find where __libc_start_main is called
Set a break point to the address last saved on stack prior to the call to _libc_start_main
Let program execution continue until the break point for main() is hit
The process is the same for both 32-bit and 64-bit ELF binaries.
Entering main() in an example stripped 32-bit ELF binary called "test_32":
$ gdb -q -nh test_32
Reading symbols from test_32...(no debugging symbols found)...done.
(gdb) info file #step 1
Symbols from "/home/c/test_32".
Local exec file:
`/home/c/test_32', file type elf32-i386.
Entry point: 0x8048310
< output snipped >
(gdb) break *0x8048310
Breakpoint 1 at 0x8048310
(gdb) run
Starting program: /home/c/test_32
Breakpoint 1, 0x08048310 in ?? ()
(gdb) x/13i $eip #step 2
=> 0x8048310: xor %ebp,%ebp
0x8048312: pop %esi
0x8048313: mov %esp,%ecx
0x8048315: and $0xfffffff0,%esp
0x8048318: push %eax
0x8048319: push %esp
0x804831a: push %edx
0x804831b: push $0x80484a0
0x8048320: push $0x8048440
0x8048325: push %ecx
0x8048326: push %esi
0x8048327: push $0x804840b # address of main()
0x804832c: call 0x80482f0 <__libc_start_main#plt>
(gdb) break *0x804840b # step 3
Breakpoint 2 at 0x804840b
(gdb) continue # step 4
Continuing.
Breakpoint 2, 0x0804840b in ?? () # now in main()
(gdb) x/x $esp+4
0xffffd110: 0x00000001 # argc = 1
(gdb) x/s **(char ***) ($esp+8)
0xffffd35c: "/home/c/test_32" # argv[0]
(gdb)
Entering main() in an example stripped 64-bit ELF binary called "test_64":
$ gdb -q -nh test_64
Reading symbols from test_64...(no debugging symbols found)...done.
(gdb) info file # step 1
Symbols from "/home/c/test_64".
Local exec file:
`/home/c/test_64', file type elf64-x86-64.
Entry point: 0x400430
< output snipped >
(gdb) break *0x400430
Breakpoint 1 at 0x400430
(gdb) run
Starting program: /home/c/test_64
Breakpoint 1, 0x0000000000400430 in ?? ()
(gdb) x/11i $rip # step 2
=> 0x400430: xor %ebp,%ebp
0x400432: mov %rdx,%r9
0x400435: pop %rsi
0x400436: mov %rsp,%rdx
0x400439: and $0xfffffffffffffff0,%rsp
0x40043d: push %rax
0x40043e: push %rsp
0x40043f: mov $0x4005c0,%r8
0x400446: mov $0x400550,%rcx
0x40044d: mov $0x400526,%rdi # address of main()
0x400454: callq 0x400410 <__libc_start_main#plt>
(gdb) break *0x400526 # step 3
Breakpoint 2 at 0x400526
(gdb) continue # step 4
Continuing.
Breakpoint 2, 0x0000000000400526 in ?? () # now in main()
(gdb) print $rdi
$3 = 1 # argc = 1
(gdb) x/s **(char ***) ($rsp+16)
0x7fffffffe35c: "/home/c/test_64" # argv[0]
(gdb)
A detailed treatment of program initialization and what occurs before main() is called and how to get to main() can be found be found in Patrick Horgan's tutorial "Linux x86 Program Start Up
or - How the heck do we get to main()?"

If you have a very stripped version, or even a binary that is packed, as using UPX, you can gdb on it in the tough way as:
$ readelf -h echo | grep Entry
Entry point address: 0x103120
And then you can break at it in GDB as:
$ gdb mybinary
(gdb) break * 0x103120
Breakpoint 1 at 0x103120gdb)
(gdb) r
Starting program: mybinary
Breakpoint 1, 0x0000000000103120 in ?? ()
and then, you can see the entry instructions:
(gdb) x/10i 0x0000000000103120
=> 0x103120: bl 0x103394
0x103124: dcbtst 0,r5
0x103128: mflr r13
0x10312c: cmplwi r7,2
0x103130: bne 0x103214
0x103134: stw r5,0(r6)
0x103138: add r4,r4,r3
0x10313c: lis r0,-32768
0x103140: lis r9,-32768
0x103144: addi r3,r3,-1
I hope it helps

As far as I know, once a program has been stripped, there is no straightforward way to locate the function that the symbol main would have otherwise referenced.
The value of the symbol main is not required for program start-up: in the ELF format, the start of the program is specified by the e_entry field of the ELF executable header. This field normally points to the C library's initialization code, and not directly to main.
While the C library's initialization code does call main() after it has set up the C run time environment, this call is a normal function call that gets fully resolved at link time.
In some cases, implementation-specific heuristics (i.e., the specific knowledge of the internals of the C runtime) could be used to determine the location of main in a stripped executable. However, I am not aware of a portable way to do so.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string