Linux x86 bootloader - linux

I am trying to build a simple x86 Linux bootloader in nasm.
The Linux bzImage is stored on disk partition sda1 starting from the first sector.
I read the real mode code from the bzImage (15 sectors) into memory starting from 0x7E00.
However when i jump into the code it just hangs, nothing happens.
I have created code for the master boot record on sda. I's probably best if i just attach
the whole thing. I would like to know why it just hangs after the far jump instruction.
[BITS 16]
%define BOOTSEG 0x7C0
%define BOOTADDR (BOOTSEG * 0x10)
%define HDRSEG (BOOTSEG + 0x20)
%define HDRADDR (HDRSEG * 0x10)
%define KERNSEG (HDRSEG + 0x20)
[ORG BOOTADDR]
entry_section:
cli
jmp start
start:
; Clear segments
xor ax, ax
mov ds, ax
mov es, ax
mov gs, ax
mov fs, ax
mov ss, ax
mov sp, BOOTADDR ; Lots of room for it to grow down from here
; Read all 15 sectors of realmode code in the kernel
mov ah, 0x42
mov si, dap
mov dl, 0x80
int 0x13
jc bad
; Test magic number of kernel header
mov eax, dword [HDRADDR + 0x202]
cmp eax, 'HdrS'
jne bad
; Test jump instruction is there
mov al, byte [KERNSEG * 16]
cmp al, 0xEB
jne bad
xor ax, ax ; Kernel entry code will set ds = ax
xor bx, bx ; Will also set ss = dx
jmp dword KERNSEG:0
; Simple function to report an error and halt
bad:
mov al, "B"
call putc
jmp halt
; Param: char in al
putc:
mov ah, 0X0E
mov bh, 0x0F
xor bl, bl
int 0x10
ret
halt:
hlt
jmp halt
; Begin data section
dap: ; Disk address packet
db 0x10 ; Size of dap in bytes
db 0 ; Unused
dw 15 ; Number of sectors to read
dw 0 ; Offset where to place data
dw HDRSEG ; Segment where to place data
dd 0x3F ; Low order of start addres in sectors
dd 0 ; High order of start address in sectors
; End data section
times 446-($-$$) db 0 ; Padding to make the MBR 512 bytes
; Hardcoded partition entries
part_boot:
dw 0x0180, 0x0001, 0xFE83, 0x3c3f, 0x003F, 0x0000, 0xF3BE, 0x000E
part_sda2:
dw 0x0000, 0x3D01, 0xFE83, 0xFFFF, 0xF3FD, 0x000E, 0x5AF0, 0x01B3
part_sda3:
dw 0xFE00, 0xFFFF, 0xFE83, 0xFFFF, 0x4EED, 0x01C2, 0xb113, 0x001D
part_sda4:
dw 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000
dw 0xAA55 ; Magic number at relative address 510
mbrend: ; Relative address 512

Assuming your code is a boot loader (and therefore is not an MBR):
Never disable IRQs unless you have to. The BIOS needs them to function correctly, and will enable them inside some BIOS functions anyway (e.g. waiting for a "sectors transferred" IRQ inside disk functions). Because your code is only loading and passing control to more real mode code (e.g. and no switch to protected mode or anything is involved) you have no reason to disable IRQs anywhere in your entire boot loader.
For real mode addressing, it's typically cleaner/easier to use 0x0000:0x7C00 rather than 0x07C0:0x0000. You seem to be attempting to mix both (e.g. set segment registers for the former but define BOOTSEG and HDRSEG for the latter).
The partition table contains "extended partitions" and not "primary partitions" and your partition table is therefore wrong (and should probably be blank/empty).
The boot loader should not assume any specific/hard-coded "starting LBA" (the "starting LBA" for the partition depends on how the end user felt like partitioning their disk when the OS is installed). You need to determine the partition's "starting LBA" from the MBR's primary partition table, which is typically done by hoping that the MBR left DS:SI pointing to your partition's partition table entry.
You should not assume that your are booting from "BIOS device 0x80". The MBR should leave DL set to the correct device number, and there should be no reason why your code shouldn't work if (e.g.) the OS is installed on the second hard drive or something else.
Your hard-coded "starting LBA to read" (in the DAP) is wrong. For historical reasons there's probably 63 sectors per track and your partition starts on the 64th sector. This means that LBA sector 0x3F is the first sector in the partition (which is your boot loader) and is not the first sector of the kernel. I'm assuming the first sector of the kernel might be LBA sector 0x40 (the second sector of the partition).
The "number of sectors" shouldn't be hard-coded either. You'd want to load the beginning of the kernel and examine it, and determine how many sectors to load where from that.
Typically 512 bytes (actually more like 446 bytes) is far too little for a decent boot loader. The first 512 bytes of a boot loader should load the rest of the boot loader (with every spare byte left over used to improve error handling - e.g. puts("Read error while trying to load boot loader") rather than just putc('B')). Everything else (loading the pieces of the kernel, setting up a video mode, setting correct values in the "real mode kernel header" fields, etc) should be in the additional sectors and not be in the first sector.
Note that the way the computer boots has been carefully designed such that any MBR can chainload any OS on any partition of any disk; and the MBR may be part of something larger (e.g. a boot manager) that allows multiple OSs to be installed (e.g. where the user can use a pretty menu or something to choose which partition the MBR's code should chain-load). This design allows the user to replace the MBR (or boot manager) with anything else at any time without effecting any installed OS (or causing all of their installed OSs to need fixing). For a simple example, a user should be able to have 12 different partitions that all contain your boot loader and a separate/independent version of Linux, and then install any boot manager (e.g. GRUB, Plop, GAG, MasterBooter, etc) that they want at any time.
For why your code hangs, it's not very important given that all of the code needs to be rewritten anyway . If you're curious, I'd strongly recommend running it inside an emulator with a debugger (e.g. Bochs) so that you can examine exactly what has happened (e.g. dump memory at 0x00007E00 to see what it contains, single-step the JMP to see what is being executed, etc).

Comment does not match code!
xor bx, bx ; Will also set ss = dx
I seriously doubt if that is your problem...
Disclaimer: I have not done this! I'm "chicken" and have always done my bootin' from floppy.
What I "expect" to see in an MBR is for it to move itself out of the way, and then load the first sector on the active partition to 7C00h again, and then jump there. This "real bootloader" loads the rest. I'm not familiar with the layout of a bzImage - maybe it'll work loaded at 7E00h...
I think I'm in over my head. I'll get me coat...

Related

How to comprehend the flow of this assembly code

I can' t understand how this works.
Here's a part of main() program disassembled by objdump and written in intel notation
0000000000000530 <main>:
530: lea rdx,[rip+0x37d] # 8b4 <_IO_stdin_used+0x4>
537: mov DWORD PTR [rsp-0xc],0x0
53f: movabs r10,0xedd5a792ef95fa9e
549: mov r9d,0xffffffcc
54f: nop
550: mov eax,DWORD PTR [rsp-0xc]
554: cmp eax,0xd
557: ja 57c <main+0x4c>
559: movsxd rax,DWORD PTR [rdx+rax*4]
55d: add rax,rdx
560: jmp rax
The rodata section dump:
.rodata
08b0 01000200 ecfdffff d4fdffff bcfdffff ................
08c0 9cfdffff 7cfdffff 6cfdffff 4cfdffff ....|...l...L...
08d0 3cfdffff 2cfdffff 0cfdffff ecfcffff <...,...........
08e0 d4fcffff b4fcffff 0cfeffff ............
In 530, rip is [537] so [rdx] = [537 + 37d] = 8b4.
First question is the value of rdx is how large? Is the valueis ec, or ecfdffff or something else? If it has DWORD, I can understand that has 'ecfdffff' (even this is wrong too?:() but this program don't declare it. How can I judge the value?
Then the program continues.
In 559, rax is first appeared.
The second question is this rax can interpret as a part of eax and in this time is the rax = 0? If rax is 0, in 559 means rax = DWORD[rdx] and the value of rax become ecfdffff and next [55d] do rax += rdx, and I think this value can't jamp. There must be something wrong, so tell me where, or how i make any wrongs.
I think I'll diverge from what Peter discussed (he provides good information) and get to the heart of some issues I think are causing you problems. When I first glanced at this question I assumed that the code was likely compiler generated and the jmp rax was likely the result of some control flow statement. The most likely way to generate such a code sequence is via a C switch. It isn't uncommon for a switch statement to be made of a jump table to say what code should execute depending on the control variable. As an example: the control variable for switch(a) is a.
This all made sense to me, and I wrote up a number of comments (now deleted) that ultimately resulted in bizarre memory addresses that jmp rax would go to. I had errands to run but when I returned I had the aha moment that you may have had the same confusion I did. This output from objdump using the -s option appeared as:
.rodata
08b0 01000200 ecfdffff d4fdffff bcfdffff ................
08c0 9cfdffff 7cfdffff 6cfdffff 4cfdffff ....|...l...L...
08d0 3cfdffff 2cfdffff 0cfdffff ecfcffff <...,...........
08e0 d4fcffff b4fcffff 0cfeffff ............
One of your questions seems to be about what values get loaded here. I never used the -s option to look at data in the sections and was unaware that although the dump splits the data out in groups of 4 bytes (32-bit values) they are shown in byte order as it appears in memory. I had at first assumed the output was displaying these values from Most Significant Byte to Least significant byte and objdump -s had done the conversion. That is not the case.
You have to manually reverse the bytes of each group of 4 bytes to get the real value that would be read from memory into a register.
ecfdffff in the output actually means ec fd ff ff. As a DWORD value (32-bit) you need to reverse the bytes to get the HEX value as you would expect when loaded from memory. ec fd ff ff reversed would be ff ff fd ec or the 32-bit value 0xfffffdec. Once you realize that then this makes a lot more sense. If you make this same adjustment for all the data in that table you'd get:
.rodata
08b0: 0x00020001 0xfffffdec 0xfffffdd4 0xfffffdbc
08c0: 0xfffffd9c 0xfffffd7c 0xfffffd6c 0xfffffd4c
08d0: 0xfffffd3c 0xfffffd2c 0xfffffd0c 0xfffffcec
08e0: 0xfffffcd4 0xfffffcb4 0xfffffe0c
Now if we look at the code you have it starts with:
530: lea rdx,[rip+0x37d] # 8b4 <_IO_stdin_used+0x4>
This doesn't load data from memory, it is computing the effective address of some data and places the address in RDX. The disassembly from OBJDUMP is displaying the code and data with the view that it is loaded in memory starting at 0x000000000000. When it is loaded into memory it may be placed at some other address. GCC in this case is producing position independent code (PIC). It is generated in such a way that the first byte of the program can start at an arbitrary address in memory.
The # 8b4 comment is the part we are concerned about (you can ignore the information after that). The disassembly is saying if the program was loaded at 0x0000000000000000 then the value loaded into RDX would be 0x8b4. How was that arrived at? This instruction starts at 0x530 but with RIP relative addressing the RIP (instruction pointer) is relative to the address just after the current instruction. The address the disassembler used was 0x537 (the byte after the current instruction is the address of the first byte of the next instruction). The instruction adds 0x37d to RIP and gets 0x537+0x37d=0x8b4. The address 0x8b4 happens to be in the .rodata section which you are given a dump of (as discussed above).
We now know that RDX contains the base of some data. The jmp rax suggests this is likely going to be a table of 32-bit values that are used to determine what memory location to jump to depending on the value in the control variable of a switch statement.
This statement appears to be storing the value 0 as a 32-bit value on the stack.
537: mov DWORD PTR [rsp-0xc],0x0
These appear to be variables that the compiler chose to store in registers (rather than memory).
53f: movabs r10,0xedd5a792ef95fa9e
549: mov r9d,0xffffffcc
R10 is being loaded with the 64-bit value 0xedd5a792ef95fa9e. R9D is the lower 32-bits of the 64-bit R9 register.The value 0xffffffcc is being loaded into the lower 32-bits of R9 but there is something else occurring. In 64-bit mode if the destination of an instruction is a 32-bit register the CPU automatically zero extends the value into the upper 32-bits of the register. The CPU is guaranteeing us that the upper 32-bits are zeroed.
This is a NOP and doesn't do anything except align the next instruction to memory address 0x550. 0x550 is a value that is 16-byte aligned. This has some value and may hint that the instruction at 0x550 may be the first instruction at the top of a loop. An optimizer may place NOPs into the code to align the first instruction at the top of a loop to a 16-byte aligned address in memory for performance reasons:
54f: nop
Earlier the 32-bit stack based variable at rsp-0xc was set to zero. This reads the value 0 from memory as a 32-bit value and stores it in EAX. Since EAX is a 32-bit register being used as the destination for the instruction the CPU automatically filled the upper 32-bits of RAX to 0. So all of RAX is zero.
550: mov eax,DWORD PTR [rsp-0xc]
EAX is now being compared to 0xd. If it is above (ja) it goes to the instruction at 0x57c.
554: cmp eax,0xd
557: ja 57c <main+0x4c>
We then have this instruction:
559: movsxd rax,DWORD PTR [rdx+rax*4]
The movsxd is an instruction that will take a 32-bit source operand (in this case the 32-bit value at memory address RDX+RAX*4) load it into the bottom 32-bits of RAX and then sign extend the value into the upper 32-bits of RAX. Effectively if the 32-bit value is negative (the most significant bit is 1) the upper 32-bits of RAX will be set to 1. If the 32-bit value is not negative the upper 32-bits of RAX will be set to 0.
When this code is first encountered RDX contains the base of some table at 0x8b4 from the beginning of the program loaded in memory. RAX is set to 0. Effectively the first 32-bits in the table are copied to RAX and sign extended. As seen earlier the value at offset 0xb84 is 0xfffffdec. That 32-bit value is negative so RAX contains 0xfffffffffffffdec.
Now to the meat of the situation:
55d: add rax,rdx
560: jmp rax
RDX still holds the address to the beginning of a table in memory. RAX is being added to that value and stored back in RAX (RAX = RAX+RDX). We then JMP to the address stored in RAX. So this code all seems to suggest we have a JUMP table with 32-bit values that we are using to determine where we should go. So then the obvious question. What are the 32-bit values in the table? The 32-bit values are the difference between the beginning of the table and the address of the instruction we want to jump to.
We know the table is 0x8b4 from the location our program is loaded in memory. The C compiler told the linker to compute the difference between 0x8b4 and the address where the instruction we want to execute resides. If the program had been loaded into memory at 0x0000000000000000 (hypothetically), RAX = RAX+RDX would have resulted in RAX being 0xfffffffffffffdec + 0x8b4 = 0x00000000000006a0. We then use jmp rax to jump to 0x6a0. You didn't show the entire dump of memory but there is going to be code at 0x6a0 that will execute when the value passed to the switch statement is 0. Each 32-bit value in the JUMP table will be a similar offset to the code that will execute depending on the control variable in the switch statement. If we add 0x8b4 to all the entries in the table we get:
08b0: 0x000006a0 0x00000688 0x00000670
08c0: 0x00000650 0x00000630 0x00000620 0x00000600
08d0: 0x000005F0 0x000005e0 0x000005c0 0x000005a0
08e0: 0x00000588 0x00000568 0x000006c0
You should find that in the code you haven't provided us that these addresses coincide with code that appears after the jmp rax.
Given that the memory address 0x550 was aligned, I have a hunch that this switch statement is inside a loop that keeps executing as some kind of state machine until the proper conditions are met for it to exit. Likely the value of the control variable used for the switch statement is changed by the code in the switch statement itself. Each time the switch statement is run the control variable has a different value and will do something different.
The control variable for the switch statement was originally checked for the value being above 0x0d (13). The table starting at 0x8b4 in the .rodata section has 14 entries. One can assume the switch statement probably has 14 different states (cases).
but this program don't declare it
You're looking at disassembly of machine code + data. It's all just bytes in memory. Any labels the disassembler does manage to show are ones that got left in the executable's symbol table. They're irrelevant to how the CPU runs the machine code.
(The ELF program headers tell the OS's program loader how to map it into memory, and where to jump to as an entry point. This has nothing to do with symbols, unless a shared library references some globals or functions defined in the executable.)
You can single-step the code in GDB and watch register values change.
In 559, rax is first appeared.
EAX is the low 32 bits of RAX. Writing to EAX zero-extends into RAX implicitly. From mov DWORD PTR [rsp-0xc],0x0 and the later reload, we know that RAX=0.
This must have been un-optimized compiler output (or volatile int idx = 0; to defeat constant propagation), otherwise it would know at compile time that RAX=0 and could optimize away everything else.
lea rdx,[rip+0x37d] # 8b4
A RIP-relative LEA puts the address of static into a register. It's not a load from memory. (That happens later when movsxd with an indexed addressing mode uses RDX as the base address.)
The disassembler worked out the address for you; it's RDX = 0x8b4. (Relative to the start of the file; when actually running the program would be mapped at a virtual address like 0x55555...000)
554: cmp eax,0xd
557: ja 57c <main+0x4c>
559: movsxd rax,DWORD PTR [rdx+rax*4]
55d: add rax,rdx
560: jmp rax
This is a jump table. First it checks for an out-of-bounds index with cmp eax,0xd, then it indexes a table of 32-bit signed offsets using EAX (movsxd with an addressing mode that scales RAX by 4), and adds that to the base address of the table to get a jump target.
GCC could just make a jump table of 64-bit absolute pointers, but chooses not to so that .rodata is position-independent as well and doesn't need load-time fixups in a PIE executable. (Even though Linux does support doing that.) See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84011 where this is discussed (although the main focus of that bug is that gcc -fPIE can't turn a switch into a table lookup of string addresses, and actually still uses a jump table)
The jump-offset table address is in RDX, this is what was set up with the earlier LEA.

How to read the second floppy with BIOS interrupt

I need to make a mini OS(boot from floppy A) that can write/read floppy B.
The environment is vmware workstation. The floppies are A.img and B.img.
In A.img, I set a MBR program and a func.bin. I need to archieve above function in func.bin. Following is a code snippet in it.
... ; set es:bx
mov ah, 0x03 ; read sectors
mov al, 0x01 ; 1 sector
mov ch, 0x00 ; cylinder
mov cl, 0x03 ; sector
mov dh, 0x00 ; head
mov dl, 0x01 ; B.img
int 0x13
Then I got the return code: ah = 0x01. It means "illegal command" but I don't know what caused it.
I tried to change mov dl, 0x01 to mov dl, 0x00(A.img) or mov dl, 0x80(hard disk), they all succeed. So I want to know how I can solve it.
update my question:
The B.img has been set to "be auto-connected when vm runs".
size of A: 31.5KB
size of B: 1.44MB
The problem has been sovled by myself.
It caused by I don't know the config of VM.
By default, only one floppy drive is enabled in the virtual machine's BIOS. If you are adding a second floppy drive to the virtual machine, click inside the virtual machine window and press F2 as the virtual machine boots to enter the BIOS setup utility. On the main screen, choose Legacy Diskette B: and use the plus (+) and minus (-) keys on the numerical keypad to select the type of floppy drive you want to use. Then press F10 to save your changes and close the BIOS setup utility.
https://www.vmware.com/support/ws5/doc/ws_disk_add_floppy.html

How to limit the address space of 32bit application on 64bit Linux to 3GB?

Is it possible to make 64bit Linux loader to limit the address space of the loaded 32bit program to some upper limit?
Or to set some holes in the address space that to not be allocated by the kernel?
I mean for specific executable, not globally for all processes, neither through kernel configuration. Some code or ELF executable flags are examples of appropriate solution.
The limit should be forced for all loaded shared libraries as well.
Clarification:
The problem I want to fix is that my code uses the numbers above 0xc0000000 as a handle values and I want to clearly distinct between handle values and memory addresses, even when the memory addresses are allocated and returned by some third party library function.
As long as the address space in 64bit Linux is very close to 4G limit, there is no enough addressing space left for the handle values.
On the other hand 3GB or even less is far enough for all my needs.
OK, I found the answer of this question elsewhere.
The solution is to change the "personality" of your program to PER_LINUX32_3GB, using the Linux system call sys_personality.
But there is a problem. After switching to PER_LINUX32_3GB Linux kernel will not allocate space in the upper 1GB, but the already allocated space, for example the application stack, remains there.
The solution is to "restart" your program through sys_execve system call.
Here is the code where I packed everything in one:
proc ___SwitchLinuxTo3GB
begin
cmp esp, $c0000000
jb .finish ; the system is native 32bit
; check the current personality.
mov eax, sys_personality
mov ebx, -1
int $80
; and exit if it is what intended
test eax, ADDR_LIMIT_3GB
jnz .finish ; everything is OK.
; set the needed personality
mov eax, sys_personality
mov ebx, PER_LINUX32_3GB
int $80
; and restart the process
mov eax, [esp+4] ; argument count
mov ebx, [esp+8] ; the filename of the executable.
lea ecx, [esp+8] ; the arguments list.
lea edx, [ecx+4*eax+4] ; the environment list.
mov eax, sys_execve
int $80
; if something gone wrong, it comes here and stops!
int3
.finish:
return
endp

NASM - Use labels for code loaded from disk

As a learning experience, I'm writing a boot-loader for BIOS in NASM on 16-bit real mode in my x86 emulator, Qemu.
BIOS loads your boot-sector at address 0x7C00. NASM assumes you start at 0x0, so your labels are useless unless you do something like specify the origin with [org 0x7C00] (or presumably other techniques). But, when you load the 2nd stage boot-loader, its RAM origin is different, which complicates the hell out of using labels in that newly loaded code.
What's the recommended way to deal with this? It this linker territory? Should I be using segment registers instead of org?
Thanks in advance!
p.s. Here's the code that works right now:
[bits 16]
[org 0x7c00]
LOAD_ADDR: equ 0x9000 ; This is where I'm loading the 2nd stage in RAM.
start:
mov bp, 0x8000 ; set up the stack
mov sp, bp ; relatively out of the way
call disk_load ; load the new instructions
; at 0x9000
jmp LOAD_ADDR
%include "disk_load.asm"
times 510 - ($ - $$) db 0
dw 0xaa55 ;; end of bootsector
seg_two:
;; this is ridiculous. Better way?
mov cx, LOAD_ADDR + print_j - seg_two
jmp cx
jmp $
print_j:
mov ah, 0x0E
mov al, 'k'
int 0x10
jmp $
times 2048 db 0xf
You may be making this harder than it is (not that this is trivial by any means!)
Your labels work fine and will continue to work fine. Remember that, if you look under the hood at the machine code generated, your short jumps (everything after seg_two in what you've posted) are relative jumps. This means that the assembler doesn't actually need to compute the real address, it simply needs to calculate the offset from the current opcode. However, when you load your code into RAM at 0x9000, that is an entirely different story.
Personally, when writing precisely the kind of code that you are, I would separate the code. The boot sector stops at the dw 0xaa55 and the 2nd stage gets its own file with an ORG 0x9000 at the top.
When you compile these to object code you simply need to concatenate them together. Essentially, that's what you're doing now except that you are getting the assembler to do it for you.
Hope this makes sense. :)

Where does code of GRUB stage 1.5 reside on disk and what is the address it is loaded?

I have grub v1.98 installed and after disassembling the MBR I find the following code snippet that I don't understand:
xor ax,ax
mov [si+0x4],ax
inc ax
mov [si-0x1],al
mov [si+0x2],ax
mov word [si],0x10
mov ebx,[0x7c5c]
mov [si+0x8],ebx
mov ebx,[0x7c60]
mov [si+0xc],ebx
mov word [si+0x6],0x7000
mov ah,0x42
int 0x13
It seems this piece of code tries to set up disk address of stage 1.5 code, then load and run it. However, how could I figure out which physical block it tries to read? What's more, what is the destination of the stage 1.5 code? 0x7000?
I refer to MBR for Windows 7, where subsequent boot up code is loaded 0x7c00. Given MBR is first loaded at address 0x7c00, it contains a piece of code copying MBR from 0x7c00 to 0x0600 and then branch to 0x0600 in case the original code corrupted. Will loading stage 1.5 code to address 0x7000 conflict the original code? What's more, I also find:
jmp short 0x65
nop
sar byte [si+0x7c00],1
mov es,ax
mov ds,ax
mov si,0x7c00
mov di,0x600
mov cx,0x200
cld
rep movsb
push ax
push word 0x61c
retf
at the beginning of the MBR. It seems the code tries to do the same thing as in MBR of windows 7 to copy the original MBR from 0x7c00 to 0x0600, except for the first jmp instruction. Will these codes in fact executed? If yes, when will control jumps here.(I believe the answer is YES, but am confused by the leading jmp).
GRUB 1.98 is GRUB version 2. In version 2, there is no stage 1.5 anymore.
The stage 1.5 had a fixed place between MBR and first partition. It was (most often) unused space on the hard drive. GPT partitioning and other (unusual) layouts do not provide this space.
In GRUB v2 stage 1 loads core.img, which can be stored at any LBA48 location, typically between MBR and first partition, but it can also be stored within a partition. In the non-EFI case of GPT, a custom partition should be created for it. The location is hardwired into stage 1.
See also: http://www.gnu.org/software/grub/manual/grub.html#Images

Resources