how to boot this code? - linux

i am a newbie to assembly and program in c (use GCC in Linux)
can anyone here tell me how to compile c code into assembly and boot from it using pen drive
i use the command (in linux terminal) :
gcc -S bootcode.c
the code gives me a bootcode.S file
what do i do with that ???
i just wanna compile the following code and run it directly from a USB stick
#include<stdio.h>
void main()
{
printf ("hi");
}
any help here ???

First of all,
You Should be aware that when you are writing bootloader codes , you should know that you are CREATING YOUR OWN ENVIRONMENT of CODE, that means, there is nothing such ready made C Library available to you or anything similar , ONLY and ONLY BIOS SERVICES (or INTERRUPT ROUTINES).
Now, if you got this, you will probably figure out that the above code won't boot since, you don't have the "stdio.h" header, this means that the CPU when executing your compiled code won't find this header and thereby won't understand what is "printf" (since printf is a method of the stdio.h header).
So if you want to print any string you need to write this function by YOUR OWN either in a separate file as a header and link its object file at compilation time when creating the final binary file or in the same file. it is up to you. There could be other ways, I'm not well familiar with them, just do some researches.
Another thing you should know, it is the BIOS who is responsible for loading this boot code (your above code in your case) into memory location 0x07C00 (0x0000h:0x7C00 in segment:offset representation), so you HAVE to mention in your code that you are writing this code on this memory location, either by
1-using the ORG instruction
2-Or by loading the appropriate registers for that (cs,ds,es)
Also, you should get yourself familiar with the segment:offset memory representation scheme, just google it or read intel manuals.
Finally, for the BIOS to load your code into the 0x07C00, the boot code must not exceed 512byte (ONLY ON FIRST SECTOR OF THE BOOTABLE MEDIA, since a sectore is 512byte) and he must find at the last two byte of this first sector (byte 510 & byte 511) of your code the boot signature 0x55AA, otherwise the BIOS won't consider this code AS BOOTABLE.
Usually this is coded as :
ORG 0x7C00
...
your boot code and to load more codes since 512byte won't be sufficient.
...
times 510 - ($ - $$) db 0x00 ; Zerofill up to 510 bytes
dw 0xAA55 ;Boot Sector signature,written in reverse order since it
will be stored as little endian notation
Just to let you know, I'm not covering everything here, because if so, I'll be writing pages about it, you need to look for more resources on the net, and here is a link to start with(coding in assembly):
http://www.brokenthorn.com/Resources/OSDevIndex.html
That's all, hopefully this was helpful to you...^_^
Khilo - ALGERIA

Booting a computer is not that easy. A bootloader needs to be written. The bootloader must obey certain rules and correspond with hardware such as ROM. You also need to disable interrupts, reserve some memory etc. Look up MikeOS, it's a great project that can better help you understand the process.
Cheers

Related

Cannot Make Code Segment Execute-Only (Not Readable)

I'm trying to make the Code Segment Execute-Only (Not Readable).
But I FAILED after I tried everything the Manual told me to. Here is what I did to make the code segment unreadable.
>uname -a
Linux Emmet-VM 3.19.0-25-generic #26~14.04.1-Ubuntu SMP Fri Jul 24 21:18:00 UTC 2015 i686 i686 i686 GNU/Linux
>lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.3 LTS
Release: 14.04
Codename: trusty
First, I've found this in "Intel(R)64 and IA-32 Architectures Software Developer's Manual(Combined Volumes 1,2A,2B,2C,2D,3A,3B,3C and 3D)":
Set read-enable bit to enable read and Segment Types.(Sorry, I'm still not allowed to embed pictures in my posts, so links instead)
So, I guess if I change %CS, and let it point to a Segment Descriptor which has read-enable bit set as 0, I should make the Code Segment not readable.
Then, I use the code below to insert a new Segment into LDT.entry[2], and I do set the code segment type to 8, aka 1000B, which means "Execute-Only" according to "Segment Types" link posted above:
typedef struct user_desc UserDesc;
UserDesc *seg = (UserDesc*)malloc(sizeof(UserDesc));
seg->entry_number = 0x2;
seg->base_addr = 0x00000000;
seg->limit = 0xffffffff;
seg->seg_32bit = 0x1;
seg->contents = 0x02;
seg->read_exec_only = 0x1;
seg->limit_in_pages = 0x1;
seg->seg_not_present = 0x0;
seg->useable = 0x0;
int ret = modify_ldt(1, (void*)seg, sizeof(UserDesc));
After that, I change %CS to 0x17(00010111B, meaning the entry 2 in LDT) with ljmp.
asm("ljmp $0x17, $reload_cs\n"
"reload_cs:");
But, even with this, I still can read the byte code in code segment:
void foo() {printf("foo\n");}
void test(){
char* a = (char*)foo;
printf("0x%x\n", (unsigned int)a[0]);// This prints 0x55
}
If the code segment is unreadable, code above should throw a segment fault error. But it prints 0x55 successfully.
So, I wonder, is there any mistake I've made during my test?
Or is this just a mistake in Intel's Manual?
You are still accessing the code through DS when doing (unsigned int)a[0].
Write only segments don't exist (and if they did, it would be a bad idea to set DS write only).
If you did everything correctly mov eax, [cs:...] (NASM syntax) will fail (but mov eax, [ds:...] won't).
After a quick glance at the Intel Manual execute only pages should not exist (at least directly), so using mprotect with PROT_EXEC may be of limited use (the code would still be readable).
Worth a shot, though.
There are three ways around this.
None of which can be implemented without the aid of the OS though, so they are more theoretical than practical.
Protection keys
If the CPU supports them (See section 4.6.2 of the Intel manual 3), they introduce an asymmetry in how code and data are read.
Reading data is subject to the key protection.
Fetching however is not:
How a linear address’s protection key controls access to the address depends on the mode of a linear address:
A linear address’s protection controls only data accesses to the address. It does not in any way affect instructions fetches from the address.
So it's possible to set a protection key for the code pages that your application don't have in its PKRU register.
You would still be allowed to execute the code but not to read it.
Desync the TLBs
If your application has never touched the code pages for reading, they will occupy some entries in the ITLB but not in the DTLB.
If then, the OS map them as supervisor-only without flushing the TLBs, access to them is prevented when accessed as data (since no DTLB entries for those pages are present, forcing a walk on the memory) but thanks to the ITLB the code can still be fetched.
This is more involved in practice as code span multiple pages and is actually read as data by the OS.
EPT
The Extended Data Pages are used during virtualization to translate Guest physical addresses to Host physical addresses.
Though they seems just another level of indirection, they have separate Read, Write and Execute control bits.
A paper has been written about preventing the leakage of the kernel code (to counteract dynamic Return Oriented Programming).

How to tell compiler to pad a specific amount of bytes on every C function?

I'm trying to practice some live instrumentation and I saw there was a linker option -call-nop=prefix-nop, but it has some restriction as it only works with GOT function (I don't know how to force compiler to generate GOT function, and not sure if it's good idea for performance reason.) Also, -call-nop=* cannot pad more than 1 byte.
Ideally, I'd like to see a compiler option to pad any specific amount of bytes, and compiler will still perform all the normal function alignment.
Once I have this pad area, I can at run time to reuse these padding area to store some values or redirect the control flow.
P.S. I believe Linux kernel use similar trick to dynamically enable some software tracepoint.
-pg is intended for profile-guided optimization. The correct option for this is -fpatchable-function-entry
-fpatchable-function-entry=N[,M]
Generate N NOPs right at the beginning of each function, with the function entry point before the Mth NOP. If M is omitted, it defaults to 0 so the function entry points to the address just at the first NOP. The NOP instructions reserve extra space which can be used to patch in any desired instrumentation at run time, provided that the code segment is writable. The amount of space is controllable indirectly via the number of NOPs; the NOP instruction used corresponds to the instruction emitted by the internal GCC back-end interface gen_nop. This behavior is target-specific and may also depend on the architecture variant and/or other compilation options.
It'll insert N single-byte 0x90 NOPs and doesn't make use of multi-byte NOPs thus performance isn't as good as it should, but you probably don't care about that in this case so the option should work fine
I achieved this goal by implement my own mcount function in an assembly file and compile the code with -pg.

How can I compile C code to get a bare-metal skeleton of a minimal RISC-V assembly program?

I have the following simple C code:
void main(){
int A = 333;
int B=244;
int sum;
sum = A + B;
}
When I compile this with
$riscv64-unknown-elf-gcc code.c -o code.o
If I want to see the assembly code I use
$riscv64-unknown-elf-objdump -d code.o
But when I explore the assembly code I see that this generates a lot of code which I assume is for Proxy Kernel support (I am a newbie to riscv). However, I do not want that this code has support for Proxy kernel, because the idea is to implement only this simple C code within an FPGA.
I read that riscv provides three types of compilation: Bare-metal mode, newlib proxy kernel and riscv Linux. According to previous research, the kind of compilation that I should do is bare metal mode. This is because I desire a minimum assembly without support for the operating system or kernel proxy. Assembly functions as a system call are not required.
However, I have not yet been able to find as I can compile a C code for get a skeleton of a minimal riscv assembly program. How can I compile the C code above in bare metal mode or for get a skeleton of a minimal riscv assembly code?
Warning: this answer is somewhat out-of-date as of the latest RISC-V Privileged Spec v1.9, which includes the removal of the tohost Control/Status Register (CSR), which was a part of the non-standard Host-Target Interface (HTIF) which has since been removed. The current (as of 2016 Sep) riscv-tests instead perform a memory-mapped store to a tohost memory location, which in a tethered environment is monitored by the front-end server.
If you really and truly need/want to run RISC-V code bare-metal, then here are the instructions to do so. You lose a bunch of useful stuff, like printf or FP-trap software emulation, which the riscv-pk (proxy kernel) provides.
First things first - Spike boots up at 0x200. As Spike is the golden ISA simulator model, your core should also boot up at 0x200.
(cough, as of 2015 Jul 13, the "master" branch of riscv-tools (https://github.com/riscv/riscv-tools) is using an older pre-v1.7 Privileged ISA, and thus starts at 0x2000. This post will assume you are using v1.7+, which may require using the "new_privileged_isa" branch of riscv-tools).
So when you disassemble your bare-metal program, it better
start at 0x200!!! If you want to run it on top of the proxy-kernel, it
better start at 0x10000 (and if Linux, it’s something even larger…).
Now, if you want to run bare metal, you’re forcing yourself to write up the
processor boot code. Yuck. But let’s punt on that and pretend that’s not
necessary.
(You can also look into riscv-tests/env/p, for the “virtual machine”
description for a physically addressed machine. You’ll find the linker script
you need and some macros.h to describe some initial setup code. Or better
yet, in riscv-tests/benchmarks/common.crt.S).
Anyways, armed with the above (confusing) knowledge, let’s throw that all
away and start from scratch ourselves...
hello.s:
.align 6
.globl _start
_start:
# screw boot code, we're going minimalist
# mtohost is the CSR in machine mode
csrw mtohost, 1;
1:
j 1b
and link.ld:
OUTPUT_ARCH( "riscv" )
ENTRY( _start )
SECTIONS
{
/* text: test code section */
. = 0x200;
.text :
{
*(.text)
}
/* data: Initialized data segment */
.data :
{
*(.data)
}
/* End of uninitalized data segement */
_end = .;
}
Now to compile this…
riscv64-unknown-elf-gcc -nostdlib -nostartfiles -Tlink.ld -o hello hello.s
This compiles to (riscv64-unknown-elf-objdump -d hello):
hello: file format elf64-littleriscv
Disassembly of section .text:
0000000000000200 <_start>:
200: 7810d073 csrwi tohost,1
204: 0000006f j 204 <_start+0x4>
And to run it:
spike hello
It’s a thing of beauty.
The link script places our code at 0x200. Spike will start at
0x200, and then write a #1 to the control/status register
“tohost”, which tells Spike “stop running”. And then we spin on an address
(1: j 1b) until the front-end server has gotten the message and kills us.
It may be possible to ditch the linker script if you can figure out how to
tell the compiler to move <_start> to 0x200 on its own.
For other examples, you can peruse the following repositories:
The riscv-tests repository holds the RISC-V ISA tests that are very minimal
(https://github.com/riscv/riscv-tests).
This Makefile has the compiler options:
https://github.com/riscv/riscv-tests/blob/master/isa/Makefile
And many of the “virtual machine” description macros and linker scripts can
be found in riscv-tests/env (https://github.com/riscv/riscv-test-env).
You can take a look at the “simplest” test at (riscv-tests/isa/rv64ui-p-simple.dump).
And you can check out riscv-tests/benchmarks/common for start-up and support code for running bare-metal.
The "extra" code is put there by gcc and is the sort of stuff required for any program. The proxy kernel is designed to be the bare minimum amount of support required to run such things. Once your processor is working, I would recommend running things on top of pk rather than bare-metal.
In the meantime, if you want to look at simple assembly, I would recommend skipping the linking phase with '-c':
riscv64-unknown-elf-gcc code.c -c -o code.o
riscv64-unknown-elf-objdump -d code.o
For examples of running code without pk or linux, I would look at riscv-tests.
I'm surprised no one mentioned gcc -S which skips assembly and linking altogether and outputs assembly code, albeit with a bunch of boilerplate, but it may be convenient just to poke around.

mpc85xx(P2040) startup using Nor Flash, where is Nor Flash mapped to?

I'm porting u-boot to P2040 based board these days.
As u-boot/arch/powerpc/mpc85xx/start.s commented:
The processor starts at 0xffff_fffc and the code is first executed in the last 4K page in flash/rom.
In u-boot/arch/powerpc/mpc85xx/resetvec.S:
.section .resetvec,"ax"
b _start_e500
And in u-boot.lds linker script:
.resetvec RESET_VECTOR_ADDRESS :
{
KEEP(*(.resetvec))
} :text = 0xffff
The reset vector is at 0xffff_fffc, which contains a jump instruction to _start_e500.
The E500MCRM Chapter 6.6 mentioned:
This default TLB entry translates the first instruction fetch out of reset(at effective address 0xffff_fffc).
This instruction should be a branch to the beginning of this page.
So, if I configure the HCW to let powerpc boot from Nor Flash, why should I suppose that the Nor Flash's
last 4K is mapped to 0xffff_f000~0xffff_ffff? Since there're no LAW setup yet and the default BR0/OR0 of Local Bus
does not match that range. I’m confused about how Nor Flash be access at the very beginning of core startup.
Another question is:
Does P2040 always have MMU enabled so as to translate effective address to real address even at u-boot stage?
If so, beside accessing to CCSRBAR, all other memory access should have TLB entry setup first.
Best Regards,
Hook Guo

instruction set emulator guide

I am interested in writing emulators like for gameboy and other handheld consoles, but I read the first step is to emulate the instruction set. I found a link here that said for beginners to emulate the Commodore 64 8-bit microprocessor, the thing is I don't know a thing about emulating instruction sets. I know mips instruction set, so I think I can manage understanding other instruction sets, but the problem is what is it meant by emulating them?
NOTE: If someone can provide me with a step-by-step guide to instruction set emulation for beginners, I would really appreciate it.
NOTE #2: I am planning to write in C.
NOTE #3: This is my first attempt at learning the whole emulation thing.
Thanks
EDIT: I found this site that is a detailed step-by-step guide to writing an emulator which seems promising. I'll start reading it, and hope it helps other people who are looking into writing emulators too.
Emulator 101
An instruction set emulator is a software program that reads binary data from a software device and carries out the instructions that data contains as if it were a physical microprocessor accessing physical data.
The Commodore 64 used a 6502 Microprocessor. I wrote an emulator for this processor once. The first thing you need to do is read the datasheets on the processor and learn about its behavior. What sort of opcodes does it have, what about memory addressing, method of IO. What are its registers? How does it start executing? These are all questions you need to be able to answer before you can write an emulator.
Here is a general overview of how it would look like in C (Not 100% accurate):
uint8_t RAM[65536]; //Declare a memory buffer for emulated RAM (64k)
uint16_t A; //Declare Accumulator
uint16_t X; //Declare X register
uint16_t Y; //Declare Y register
uint16_t PC = 0; //Declare Program counter, start executing at address 0
uint16_t FLAGS = 0 //Start with all flags cleared;
//Return 1 if the carry flag is set 0 otherwise, in this example, the 3rd bit is
//the carry flag (not true for actual 6502)
#define CARRY_FLAG(flags) ((0x4 & flags) >> 2)
#define ADC 0x69
#define LDA 0xA9
while (executing) {
switch(RAM[PC]) { //Grab the opcode at the program counter
case ADC: //Add with carry
A = X + RAM[PC+1] + CARRY_FLAG(FLAGS);
UpdateFlags(A);
PC += ADC_SIZE;
break;
case LDA: //Load accumulator
A = RAM[PC+1];
UpdateFlags(X);
PC += MOV_SIZE;
break;
default:
//Invalid opcode!
}
}
According to this reference ADC actually has 8 opcodes in the 6502 processor, which means you will have 8 different ADC in your switch statement, each one for different opcodes and memory addressing schemes. You will have to deal with endianess and byte order, and of course pointers. I would get a solid understanding of pointer and type casting in C if you dont already have one. To manipulate the flags register you have to have a solid understanding of bitwise operations in C. If you are clever you can make use of C macros and even function pointers to save yourself some work, as the CARRY_FLAG example above.
Every time you execute an instruction, you must advance the program counter by the size of that instruction, which is different for each opcode. Some opcodes dont take any arguments and so their size is just 1 byte, while others take 16-bit integers as in my MOV example above. All this should be pretty well documented.
Branch instructions (JMP, JE, JNE etc) are simple: If some flag is set in the flags register then load the PC to the address specified. This is how "decisions" are made in a microprocessor and emulating them is simply a matter of changing the PC, just as the real microprocessor would do.
The hardest part about writing an instruction set emulator is debugging. How do you know if everything is working like it should? There are plenty of resources for helping you. People have written test codes that will help you debug every instruction. You can execute them one instruction at a time and compare the reference output. If something is different, you know you have a bug somewhere and can fix it.
This should be enough to get you started. The important thing is that you have A) A good solid understanding of the instruction set you want to emulate and B) a solid understanding of low level data manipulation in C, including type casting, pointers, bitwise operations, byte order, etc.

Resources