Swap sections in ELF - linux

Is there a way to force gcc or ld to place the code section at the end of the output ELF file?
Maybe I can force them not to produce any section other than .text if, for example, I don't have anything in .data, .rodata, .bss and other sections?

The minimal version of script that worked for me looked like:
ENTRY(_start)
SECTIONS
{
.data : { *(.data) }
.bss : { *(.bss) *(COMMON) }
.text : { *(.text) }
}
But after doing some more research (docs here), I replaced this script with the default one (printed by ld --verbose) and moved the code section to the very end of that script. It worked perfectly.

Related

.text, .bss, .data sections are not shown in kernel module

I have the following kernel module and Makefile for Linux running on a BeagleBone board.
#include <linux/module.h>
#include <linux/kernel.h>
s32 gval = 200;
static s32 __init test_init(void)
{
pr_info("%s done : gval:%d\n", __FUNCTION__, gval);
return 0;
}
static void __exit test_deinit(void)
{
pr_info("%s done : gval:%d\n", __FUNCTION__, gval);
}
module_init(test_init);
module_exit(test_deinit);
MODULE_LICENSE("GPL");
export ARCH=arm
export CROSS_COMPILE=arm-linux-gnueabi-
BBB_KERNEL_SRC=kernel_source_path
EXTRA_CFLAGS += -g -DDEBUG
obj-m += test_km.o
test_km-objs := kmodule.o
all:
make -C $(BBB_KERNEL_SRC) M=$(PWD) modules
clean:
make -C $(BBB_KERNEL_SRC) M=$(PWD) clean
The module builds fine and the test_km.ko file is generated. When test_km.ko is loaded with insmod, /sys/module/test_km/sections shows the following.
.ARM.exidx.exit.text
.ARM.exidx.init.text
.exit.text
.gnu.linkonce.this_module
.init.plt
.init.text
.note.Linux
.note.gnu.build-id
.plt
.rodata
.rodata.str1.4
.strtab
.symtab
Why are the .text, .data and .bss sections not present for this kernel module?
I have downloaded the kernel source from https://github.com/beagleboard
Linux kernel version : 5.10.120
Why are the .text, .data and .bss sections not present for this kernel module?
TL;DR : Because you haven't coded any.
The macros, module_init() and module_exit() place those functions in the init and exit sections.
s32 gval = 200; is only referenced by the init and exit code, and the tools have deduced that just the constant 200 can be used.
You need to add non-init and non-exit code and then the tools will start to put things in .text, .data and .bss.
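The effect of the init/exit macros can be imitated in userspace. This is only a sketch under stated assumptions: demo_init and the section name demo.init.text are hypothetical stand-ins, and the real kernel macros add further attributes. It assumes GCC or Clang targeting ELF.

```c
/* Hypothetical userspace stand-in for the kernel's __init: all it does here
   is move the function into a named section instead of .text. */
#define demo_init __attribute__((section("demo.init.text")))

int gval = 200;

/* Placed in demo.init.text, so it contributes nothing to .text. */
demo_init int test_init(void)
{
    return gval;
}

/* No section attribute: ordinary code like this is what ends up in .text. */
int helper(void)
{
    return gval + 1;
}
```

Compiling this with gcc -c and listing the section headers with readelf -S should show test_init in demo.init.text and helper in .text.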

Where are const strings saved in assembly?

When I declare a string in assembly like this:
string DB "My string", 0
where is the string saved?
Can I determine where it will be saved when declaring it?
db assembles output bytes to the current position in the output file. You control exactly where they go.
There is no indirection or reference to any other location, it's like char string[] = "blah blah", not char *string = "blah blah" (but without the implicit zero byte at the end, that's why you have to use ,0 to add one explicitly.)
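The array-versus-pointer distinction can be seen from C. A minimal sketch; the names string_array and string_pointer are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Like `string DB "My string", 0`: the bytes themselves live at this symbol. */
static const char string_array[] = "My string";

/* Unlike DB: a pointer object that holds the address of bytes stored elsewhere. */
static const char *const string_pointer = "My string";

/* 9 characters plus the terminating zero that C adds implicitly. */
static_assert(sizeof string_array == 10, "array size includes the zero byte");

size_t array_bytes(void)   { return sizeof string_array; }
size_t pointer_bytes(void) { return sizeof string_pointer; }
```

array_bytes() reports the size of the string data, while pointer_bytes() reports only the size of a pointer, whatever the string length.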
When targeting a modern OS (i.e. not making a boot-sector or something), your code + data will end up in an object file and then be linked into an executable or library.
On Linux (or other ELF platforms), put read-only constant data including strings in section .rodata. This section (along with section .text where you put code) becomes part of the text segment after linking.
Windows apparently uses section .rdata.
Different assemblers have different syntax for changing sections, but I think section .whatever works in most of the ones that use DB for data bytes.
;; NASM source for the x86-64 System V ABI.
section .rodata ; use section .rdata on Windows
string DB "My string", 0
section .data
static_storage_for_something: dd 123 ; one dword with value = 123
;; usually you don't need .data and can just use registers or the stack
section .bss ; zero-initialized memory, bytes not stored in the executable, just size
static_array: resd 12300000 ;; 12300000 dwords with value = 0
section .text
extern puts ; defined in libc
global main
main:
mov edi, string ; RDI = address of string = first function arg
;mov dword [rdi], 1234 ; would segfault because .rodata is mapped read-only
jmp puts ; tail-call puts(string)
peter@volta:/tmp$ cat > string.asm
(and paste the above, then press control-D)
peter@volta:/tmp$ nasm -f elf64 string.asm && gcc -no-pie string.o && ./a.out
My string
peter@volta:/tmp$ echo $?
10
10 characters is the return value from puts, which is the return value from main because we tail-called it, which becomes the exit status of our program. (Linux glibc puts apparently returns the character count in this case. But the manual just says it returns a non-negative number on success, so don't count on this.)
I used -no-pie because I used an absolute address for string with mov instead of a RIP-relative LEA.
You can use readelf -a a.out or nm to look at what went where in your executable.

How to compile STM32f103 program on ubuntu?

I've some experience with programming stm32 arm cortex m3 micro controllers on Windows using Keil. I now want to move to linux environment and use open source tools to program STM32 cortex m3 devices.
I've researched a bit and found that I can use OpenOCD or Texane's ST Link to flash the chip. I also found out that I'll need a cross compiler to compile the code viz. gcc-arm-none-eabi toolchain.
I want to know what basic source and header files are needed. Which core and system files are required to make a simple blink program?
I'm not intending to use HAL libraries as of now. I'm using stm32f103zet6 mcu (a very generic board). I went to http://regalis.com.pl/en/arm-cortex-stm32-gnulinux/ , but couldn't exactly pinpoint the files.
If there is any tutorial to start stm32 programming on linux environment, please let me know.
Any help is appreciated. Thanks!
Here is a very simple example that is fairly portable across the stm32 family. It doesn't do anything useful; you have to fill in the blanks to blink an LED or something (read the schematic and the manuals, enable the clocks to the GPIO, follow the instructions to make the pin a push/pull output, then set the bit or clear the bit, etc).
I have my reasons for how I do it, others have theirs, and we all have various numbers of years or decades of experience behind those opinions. But at the end of the day they are opinions and many different solutions will work.
On the last so many releases of ubuntu you can simply do this to get a toolchain:
apt-get install gcc-arm-linux-gnueabi binutils-arm-linux-gnueabi
Or you can go here and get a pre-built for your operating system
https://launchpad.net/gcc-arm-embedded
flash.s
.cpu cortex-m0
.thumb
.thumb_func
.global _start
_start:
stacktop: .word 0x20001000
.word reset
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.thumb_func
reset:
bl notmain
b hang
.thumb_func
hang: b .
.align
.thumb_func
.globl PUT16
PUT16:
strh r1,[r0]
bx lr
.thumb_func
.globl PUT32
PUT32:
str r1,[r0]
bx lr
.thumb_func
.globl GET32
GET32:
ldr r0,[r0]
bx lr
.thumb_func
.globl dummy
dummy:
bx lr
.end
flash.ld
MEMORY
{
rom : ORIGIN = 0x08000000, LENGTH = 0x1000
ram : ORIGIN = 0x20000000, LENGTH = 0x1000
}
SECTIONS
{
.text : { *(.text*) } > rom
.rodata : { *(.rodata*) } > rom
.bss : { *(.bss*) } > ram
}
sram.s
.cpu cortex-m0
.thumb
.thumb_func
.global _start
_start:
ldr r0,stacktop
mov sp,r0
bl notmain
b hang
.thumb_func
hang: b .
.align
stacktop: .word 0x20001000
.thumb_func
.globl PUT16
PUT16:
strh r1,[r0]
bx lr
.thumb_func
.globl PUT32
PUT32:
str r1,[r0]
bx lr
.thumb_func
.globl GET32
GET32:
ldr r0,[r0]
bx lr
.thumb_func
.globl dummy
dummy:
bx lr
.end
sram.ld
MEMORY
{
ram : ORIGIN = 0x20000000, LENGTH = 0x1000
}
SECTIONS
{
.text : { *(.text*) } > ram
.rodata : { *(.rodata*) } > ram
.data : { *(.data*) } > ram
.bss : { *(.bss*) } > ram
}
notmain.c
void PUT32 ( unsigned int, unsigned int );
unsigned int GET32 ( unsigned int );
void dummy ( unsigned int );
#define STK_CSR 0xE000E010
#define STK_RVR 0xE000E014
#define STK_CVR 0xE000E018
#define STK_MASK 0x00FFFFFF
int delay ( unsigned int n )
{
unsigned int ra;
while(n--)
{
while(1)
{
ra=GET32(STK_CSR);
if(ra&(1<<16)) break;
}
}
return(0);
}
int notmain ( void )
{
unsigned int rx;
PUT32(STK_CSR,4);
PUT32(STK_RVR,1000000-1);
PUT32(STK_CVR,0x00000000);
PUT32(STK_CSR,5);
for(rx=0;;rx++)
{
dummy(rx);
delay(50);
dummy(rx);
delay(50);
}
return(0);
}
Makefile
#ARMGNU ?= arm-none-eabi
ARMGNU ?= arm-linux-gnueabi
AOPS = --warn --fatal-warnings -mcpu=cortex-m0
COPS = -Wall -Werror -O2 -nostdlib -nostartfiles -ffreestanding -mcpu=cortex-m0
all : notmain.gcc.thumb.flash.bin notmain.gcc.thumb.sram.bin
clean:
rm -f *.bin
rm -f *.o
rm -f *.elf
rm -f *.list
rm -f *.bc
rm -f *.opt.s
rm -f *.norm.s
rm -f *.hex
#---------------------------------
flash.o : flash.s
$(ARMGNU)-as $(AOPS) flash.s -o flash.o
sram.o : sram.s
$(ARMGNU)-as $(AOPS) sram.s -o sram.o
notmain.gcc.thumb.o : notmain.c
$(ARMGNU)-gcc $(COPS) -mthumb -c notmain.c -o notmain.gcc.thumb.o
notmain.gcc.thumb.flash.bin : flash.ld flash.o notmain.gcc.thumb.o
$(ARMGNU)-ld -o notmain.gcc.thumb.flash.elf -T flash.ld flash.o notmain.gcc.thumb.o
$(ARMGNU)-objdump -D notmain.gcc.thumb.flash.elf > notmain.gcc.thumb.flash.list
$(ARMGNU)-objcopy notmain.gcc.thumb.flash.elf notmain.gcc.thumb.flash.bin -O binary
notmain.gcc.thumb.sram.bin : sram.ld sram.o notmain.gcc.thumb.o
$(ARMGNU)-ld -o notmain.gcc.thumb.sram.elf -T sram.ld sram.o notmain.gcc.thumb.o
$(ARMGNU)-objdump -D notmain.gcc.thumb.sram.elf > notmain.gcc.thumb.sram.list
$(ARMGNU)-objcopy notmain.gcc.thumb.sram.elf notmain.gcc.thumb.sram.hex -O ihex
$(ARMGNU)-objcopy notmain.gcc.thumb.sram.elf notmain.gcc.thumb.sram.bin -O binary
You can also try/use this approach if you prefer. I have my reasons not to, TL;DW.
void dummy ( unsigned int );
#define STK_MASK 0x00FFFFFF
#define STK_CSR (*((volatile unsigned int *)0xE000E010))
#define STK_RVR (*((volatile unsigned int *)0xE000E014))
#define STK_CVR (*((volatile unsigned int *)0xE000E018))
int delay ( unsigned int n )
{
unsigned int ra;
while(n--)
{
while(1)
{
ra=STK_CSR;
if(ra&(1<<16)) break;
}
}
return(0);
}
int notmain ( void )
{
unsigned int rx;
STK_CSR=4;
STK_RVR=1000000-1;
STK_CVR=0x00000000;
STK_CSR=5;
for(rx=0;;rx++)
{
dummy(rx);
delay(50);
dummy(rx);
delay(50);
}
return(0);
}
For documentation, use the ARM docs (ST publishes a derivative of them to some extent; not every vendor does that, and you should still go to ARM) plus the ST docs.
There is a UART-based bootloader built in (it might be USB, etc.) that is pretty easy to interface with. My host code to download programs is in the hundreds of lines of code and probably took an evening or an afternoon to write. YMMV. If you don't already have one, get one of the Discovery or Nucleo boards. I recommend those anyway: you can use the debug end of one to program other stm32 or even other non-ST ARM chips (not all, it depends on what openocd supports, but some), you can get them for 30% cheaper than the dedicated stlink USB dongles, and you don't need a USB extension cable. YMMV. You can certainly use an stlink with openocd or texane stlink as you have already mentioned.
Due to the way the cortex-m boots I have provided two examples, one for burning to flash and the other for downloading via openocd to RAM and running that way. You could arguably use the flash one for both, but you would have to tweak the start address when you run. I prefer this method. YMMV.
With this approach you are portable and completely unencumbered by HAL limitations or requirements, build environments, etc. But I recommend you try the various methods: bare metal like this, the HAL types of bare metal with one or more of the ST solutions, and the CMSIS approach. Every year or so try again and see if the one you picked is still the one you like.
This example demonstrates, though, that it does not take a whole lot. I picked the cortex-m0 simply to avoid the armv7m thumb2 extensions; thumb without those extensions is the most portable ARM instruction set. So again, the code does mostly nothing, but it does nothing on any stm32 cortex-m with a systick timer.
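As a rough sketch of filling in the blanks for an LED, the register values could be computed as below. Everything specific here is an assumption: the LED is taken to be on PC13 (check your board's schematic), and the STM32F103 register addresses and bit positions are quoted from memory and should be verified against the ST reference manual. The helper names are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed STM32F103 register addresses; verify against the reference manual. */
#define RCC_APB2ENR 0x40021018u /* APB2 peripheral clock enable register */
#define GPIOC_CRH   0x40011004u /* port C configuration, pins 8..15 */
#define GPIOC_BSRR  0x40011010u /* port C bit set/reset register */
#define IOPCEN      (1u << 4)   /* port C clock enable bit in RCC_APB2ENR */

/* New CRH value that makes `pin` (8..15) a 2 MHz push/pull output. */
uint32_t crh_pushpull_output(uint32_t crh, unsigned int pin)
{
    assert(pin >= 8 && pin <= 15);
    unsigned int shift = (pin - 8u) * 4u;
    crh &= ~(0xFu << shift);   /* clear the 4 CNF/MODE bits of this pin */
    crh |= 0x2u << shift;      /* CNF = 00 (push/pull), MODE = 10 (2 MHz) */
    return crh;
}

/* Values to write to BSRR to drive a pin high or low. */
uint32_t bsrr_set(unsigned int pin)   { return 1u << pin; }
uint32_t bsrr_clear(unsigned int pin) { return 1u << (pin + 16u); }

/* On the target, with PUT32/GET32 from flash.s:
       PUT32(RCC_APB2ENR, GET32(RCC_APB2ENR) | IOPCEN);
       PUT32(GPIOC_CRH, crh_pushpull_output(GET32(GPIOC_CRH), 13));
       PUT32(GPIOC_BSRR, bsrr_set(13));    then delay(...)
       PUT32(GPIOC_BSRR, bsrr_clear(13));  then delay(...)
*/
```

Keeping the bit arithmetic in small pure functions means it can be checked on the host before anything is flashed.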
EDIT
This along with whatever you need to feed the linker would be the minimal non-C code.
.global _start
_start:
.word 0x20001000
.word reset
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
And this is abbreviated: depending on the chip vendor and core there can be dozens to hundreds of vectors for every little interrupt of every little thing. The labels reset and hang in this case would be the names of C functions to handle those vectors (the documentation for the chip and core determines what vector handles what). The first vector is always the initialization value of the stack pointer. The second is always reset, the next few are common, and after that they are generic interrupt pins on the core that the chip vendor wires up, so you have to look at the chip vendor documentation.
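For comparison, the same 16-entry table can be written as a C array of function pointers. This is a sketch of a common alternative, not the method used in this answer; the names handler_t and vector_table are made up, and placing the array at the start of flash is left to the linker script (not shown).

```c
#include <stdint.h>

typedef void (*handler_t)(void);

/* Handlers named after the labels in the assembly table. */
void hang(void)  { for (;;) { } }
void reset(void) { for (;;) { } }  /* would normally call into the program */

const handler_t vector_table[16] = {
    (handler_t)(uintptr_t)0x20001000u, /* initial stack pointer, not a function */
    reset,                             /* reset vector */
    hang, hang, hang, hang, hang, hang, hang,
    hang, hang, hang, hang, hang, hang, hang,
};
```

The first entry is only stored as a pointer-sized word; it is never called.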
The core design is such that registers are preserved for you, so you don't need even a little bit of assembly. If you go without any bootstrap, then you have to assume that .bss is not zeroed and .data is not initialized, and you can't return from the reset function. In a real implementation you wouldn't, but for demonstration tests you might (blink an LED 10 times, then the program is finished).
Your toolchain may have some other way to do this. Since all toolchains should have an assembler and assemblers can generate tables of words, there is always that option. It doesn't really make sense to create yet another tool and language for this, but some folks feel the need. Your toolchain may not require the entry point to be named _start, and/or it may have a different entry point name requirement.
Even if you use Keil, you should also try the GNU tools. They are easy (easier) to get, and there is significantly more support and experience in the world for them than for Keil. They may not produce as "good" code as Keil, performance-wise or otherwise, but you should always have them in your back pocket, as you will always be able to find help with GNU tools.
http://gnuarmeclipse.github.io/
There you'll find everything, including an IDE (Eclipse), toolchain, debugger, headers.
Look at this package. It is an IDE + toolchain + debugger and it is available for Linux platforms. You can study it and get ideas for doing what you want. I hope most Linux programs have a command-line interface.
In addition, I suggest trying the LL API if it is already available for your MCU.

How to understand such sample in GNU ld manual about linker script?

I am learning the GNU linker ld script sample about memory region alias.
I see the following ld script snippet:
SECTIONS
{
.text :
{
*(.text)
} > REGION_TEXT
.rodata :
{
*(.rodata)
rodata_end = .;
} > REGION_RODATA <=========== PLACE 1
.data : AT (rodata_end) <=========== PLACE 2
{
data_start = .;
*(.data)
} > REGION_DATA <=========== PLACE 3
data_size = SIZEOF(.data);
data_load_start = LOADADDR(.data);
.bss :
{
*(.bss)
} > REGION_BSS
}
One possible system memory region layout given in the sample is like this (C in that sample):
MEMORY
{
ROM : ORIGIN = 0, LENGTH = 2M /*0M ~ 2M*/
ROM2 : ORIGIN = 0x10000000, LENGTH = 1M /*256M ~ 257M*/
RAM : ORIGIN = 0x20000000, LENGTH = 1M /*512M ~ 513M*/
}
REGION_ALIAS("REGION_TEXT", ROM); /*0M ~ 2M*/
REGION_ALIAS("REGION_RODATA", ROM2); /*256M ~ 257M*/
REGION_ALIAS("REGION_DATA", RAM); /*512M ~ 513M*/
REGION_ALIAS("REGION_BSS", RAM); /*512M ~ 513M*/
So,
PLACE 1 says .rodata MUST go into REGION_RODATA, that is 256M~257M
PLACE 2 says the .data section MUST be placed immediately after the .rodata section. So .data section MUST start from at most 257M.
But PLACE 3 says the .data section MUST go into the REGION_DATA region. So .data section MUST start from at least 512M.
So how could it be possible?
The key concepts to understand this example are those of Virtual Memory Address (VMA) and Load Memory Address (LMA).
The GNU Linker official documentation defines those two terms as follows.
Every loadable or allocatable output section has two addresses. The
first is the VMA, or virtual memory address. This is the address the
section will have when the output file is run. The second is the LMA,
or load memory address. This is the address at which the section will
be loaded.
In the example, for all output sections but .data, the VMA and LMA addresses are the same. For section .data the LMA is specified by AT (rodata_end) while the VMA address is the first available address of the REGION_DATA memory region.
With this in mind, we can read again the example and see that it leads to the situation represented below.
ROM (alias REGION_TEXT)
+---------+------------------------------+
| .text | |
+---------+------------------------------+
ROM2 (alias REGION_RODATA)
+-----------+---------+--------+
| .rodata | .data | |
+-----------+---------+--------+
RAM (alias REGION_DATA)
+---------+--------+-----------+
| .data | .bss | |
+---------+--------+-----------+
The .data section appears twice: once in ROM2 and once in RAM. It is placed at its load address (LMA) when the image is loaded; the startup code then copies it to its virtual address (VMA) before the program runs.
By the way, this is why, a few lines later in the documentation you mentioned, we can read that
It is possible to write a common system initialization routine to copy
the .data section from ROM or ROM2 into the RAM if necessary.
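Such an initialization routine could look roughly like the sketch below. The function name copy_data is made up; the commented call uses the symbols data_start, data_load_start and data_size from the linker script in the question.

```c
#include <stddef.h>

/* Copy the initial values of .data from their load address (LMA, in ROM)
   to their run-time address (VMA, in RAM). */
void copy_data(unsigned char *vma, const unsigned char *lma, size_t size)
{
    for (size_t i = 0; i < size; i++)
        vma[i] = lma[i];
}

/* On the target this would be called from the startup code:

       extern unsigned char data_start[], data_load_start[], data_size[];
       copy_data(data_start, data_load_start, (size_t)data_size);

   data_size is an absolute symbol, so its "address" is the size itself. */
```

A plain byte loop is used so that the routine does not depend on a C library being available this early in startup.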

Update linker variables after --gc-sections

I wrote a small binary in cortex-a9 board, and defined a linker script like this:
SECTIONS
{
.text :
{
__text = . ;
*(.vector)
*(.text)
*(.text.*)
}
.rodata :
{
*(.rodata)
*(.rodata.*)
}
.data : {
__data_start = . ;
*(.data)
*(.data.*)
}
. = ALIGN(4);
__bss_start = . ;
.bss :
{
*(.bss)
*(.bss.*)
*(COMMON)
. = ALIGN(4);
}
__bss_end = .;
. = ALIGN(4);
__heap_start = .;
. = . + 0x1000;
. = ALIGN(4);
__heap_end = .;
_end = . ;
PROVIDE (end = .) ;
}
But it seems that after --gc-sections has run and removed unused sections, __heap_start still has the value it had before --gc-sections ran (I print it in code and checked the ld flags):
arm-linux-gnueabihf-gcc -mcpu=cortex-a7 -msoft-float -nostdlib
-Wl,--gc-sections -Wl,--print-gc-sections -Wl,-Ttext,0x04000000 -T csrvisor.lds -Wl,-Map,binary.map
Does anyone know how to get __heap_start updated to the correct value after --gc-sections has removed unused sections?
Check your compiler flags: Do they really contain -ffunction-sections -fdata-sections?
The heap normally (and in your case as well) starts right after the .bss section. So as far as the start of the heap is concerned, your linker script looks fine.
Check if the linker really removes unused variables. If it only removes unused text sections, the value of __heap_start won't change.
Code, read-only data, initialized data et al. normally go into flash. If something is garbage-collected there, it won't affect your heap.
Data (initialized and uninitialized) will (eventually) turn up in the RAM. If something is garbage-collected there, it will affect your heap. So check if you really have variables which are removed by the garbage collection.
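A small test file can help check whether any variables are actually collected. This is a sketch with made-up names (used_table, unused_table, used_sum), assuming GCC with -fdata-sections so that each variable gets its own input section.

```c
#include <stdint.h>

/* Referenced by used_sum(): with -fdata-sections it goes into
   .data.used_table, which the linker has to keep. */
uint32_t used_table[4] = {1, 2, 3, 4};

/* Never referenced: with -fdata-sections it goes into .data.unused_table,
   a candidate for removal by --gc-sections. */
uint32_t unused_table[1024] = {1};

uint32_t used_sum(void)
{
    uint32_t sum = 0;
    for (unsigned int i = 0; i < 4; i++)
        sum += used_table[i];
    return sum;
}
```

Built with -ffunction-sections -fdata-sections and linked with -Wl,--gc-sections -Wl,--print-gc-sections, the linker output should list .data.unused_table among the removed sections if data is really being collected. Without -fdata-sections both arrays share one .data input section and neither can be dropped on its own.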
As for your linker script
There is no KEEP statement. Normally something like a reset handler, main et al. must not be removed by the linker garbage collection.
Your data section does not define the handling of initial values.
Your linker script does not contain region declarations (MEMORY). Check which defaults apply.
Your sections do not have a target region. Again, check which defaults apply in your case.
Examples with target regions:
.rodata :
{
*(.rodata)
*(.rodata.*)
} >rom
.data : {
__data_start = . ;
*(.data)
*(.data.*)
} >ram
