I am trying to understand memory mapping in Linux, and wrote the program below to experiment with it.
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

#define MEM_SIZE (1*1024*1024*1024) // 1GB
#define PAGE_SIZE (4*1024)
#define NO_OF_PAGES ((MEM_SIZE)/(PAGE_SIZE))

// Global array of size 1 GB
int mem[MEM_SIZE/sizeof(int)] = {
    0,1,2,
};

void *func(void *args)
{
    while (1) {
        for (int i=0; i<NO_OF_PAGES; i++) {
            mem[(i*PAGE_SIZE)/sizeof(int)] = 1; // Update first 4-bytes of each page.
        }
        // All pages are updated.
        printf("."); // Each dot(.) represents update of all pages in global data.
        fflush(stdout); // Flush after printing so that each dot shows up immediately.
    }
    return NULL;
}

int main() {
    pthread_t t;
    pthread_create(&t, NULL, func, NULL);
    printf("A thread is continually updating all pages of global 1GB data. (pid : %d)\n", getpid());
    pthread_join(t, NULL);
    return 0;
}
It runs like this:
[ ] gcc test.c
[ ] ./a.out
A thread is continually updating all pages of global 1GB data. (pid : 4086)
.......................
Here, a thread continually updates all pages of the global data used in this program,
so this process should consume ~1GB (MEM_SIZE) of memory.
But the process memory map looks like this (I cannot see 1GB of data memory being used here):
[ ]# cat /proc/4086/maps
55eac3b7c000-55eac3b7d000 r--p 00000000 08:01 1197055 /root/testing/a.out
55eac3b7d000-55eac3b7e000 r-xp 00001000 08:01 1197055 /root/testing/a.out
55eac3b7e000-55eac3b7f000 r--p 00002000 08:01 1197055 /root/testing/a.out
55eac3b7f000-55eac3b80000 r--p 00002000 08:01 1197055 /root/testing/a.out
55eac3b80000-55eb03b81000 rw-p 00003000 08:01 1197055 /root/testing/a.out
55eb047b5000-55eb047d6000 rw-p 00000000 00:00 0 [heap]
7f19c791b000-7f19c791c000 ---p 00000000 00:00 0
7f19c791c000-7f19c811e000 rw-p 00000000 00:00 0
7f19c811e000-7f19c8140000 r--p 00000000 08:01 1969450 /usr/lib/libc.so.6
7f19c8140000-7f19c829b000 r-xp 00022000 08:01 1969450 /usr/lib/libc.so.6
7f19c829b000-7f19c82f2000 r--p 0017d000 08:01 1969450 /usr/lib/libc.so.6
7f19c82f2000-7f19c82f6000 r--p 001d4000 08:01 1969450 /usr/lib/libc.so.6
7f19c82f6000-7f19c82f8000 rw-p 001d8000 08:01 1969450 /usr/lib/libc.so.6
7f19c82f8000-7f19c8307000 rw-p 00000000 00:00 0
7f19c8311000-7f19c8312000 r--p 00000000 08:01 1969441 /usr/lib/ld-linux-x86-64.so.2
7f19c8312000-7f19c8338000 r-xp 00001000 08:01 1969441 /usr/lib/ld-linux-x86-64.so.2
7f19c8338000-7f19c8342000 r--p 00027000 08:01 1969441 /usr/lib/ld-linux-x86-64.so.2
7f19c8342000-7f19c8344000 r--p 00031000 08:01 1969441 /usr/lib/ld-linux-x86-64.so.2
7f19c8344000-7f19c8346000 rw-p 00033000 08:01 1969441 /usr/lib/ld-linux-x86-64.so.2
7ffe8b517000-7ffe8b538000 rw-p 00000000 00:00 0 [stack]
7ffe8b599000-7ffe8b59d000 r--p 00000000 00:00 0 [vvar]
7ffe8b59d000-7ffe8b59f000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0 [vsyscall]
[ ]#
However, meminfo does reflect the consumption of ~1GB.
Before starting this program:
[ ]# cat /proc/meminfo | grep 'MemFree\|SwapFree'
MemFree: 6740324 kB
SwapFree: 0 kB
and after starting the program:
[ ]# cat /proc/meminfo | grep 'MemFree\|SwapFree'
MemFree: 5699312 kB
SwapFree: 0 kB
In the map, we can see the following zones:
7f19c791b000-7f19c791c000 ---p 00000000 00:00 0 : Red zone protecting the thread's stack = one 4KB page with no read/write permission (stack overflow detection)
7f19c791c000-7f19c811e000 rw-p 00000000 00:00 0 : Stack of the thread (it grows from high to low memory addresses)
55eac3b80000-55eb03b81000 rw-p 00003000 08:01 1197055 : The 1 GB memory space (mem[] table) + 4KB
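As a cross-check of the sizes quoted for these zones, the address range at the start of a maps line can be turned into a byte count. This is only a minimal sketch: map_region_size is a hypothetical helper name, and it assumes the line begins with the usual start-end range in hex.

```c
#include <stdio.h>

/* Parse the "start-end" field of one /proc/<pid>/maps line and return
 * the size of the region in bytes, or 0 if the field cannot be parsed.
 * map_region_size is a hypothetical helper name, not a libc function. */
unsigned long long map_region_size(const char *line)
{
    unsigned long long start, end;

    if (sscanf(line, "%llx-%llx", &start, &end) != 2 || end < start)
        return 0;
    return end - start;
}
```

For the 55eac3b80000-55eb03b81000 line this returns 0x40001000, that is 1 GB plus one 4KB page; for the red zone line it returns 4096, and for the thread stack line 0x802000.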
You could make your program print the address of the mem[] table:
printf("A thread is continually updating all pages of global 1GB data (mem[]#%p). (pid : %d)\n", (void *)mem, getpid());
The size command displays the size of the data segment which is ~1GB:
$ gcc test.c -l pthread
$ size a.out
   text       data    bss        dec      hex filename
   2571 1073742504     16 1073745091 40000cc3 a.out
Running it shows mem[] at address 0x562c48575020:
$ ./a.out
A thread is continually updating all pages of global 1GB data (mem[]#0x562c48575020). (pid : 10029)
.............................................
The map, displayed from another terminal, shows that the address of mem[] lies in the zone 562c48575000-562c88576000:
$ cat /proc/`pidof a.out`/maps
562c48571000-562c48572000 r--p 00000000 08:05 9044419 /home/xxx/a.out
562c48572000-562c48573000 r-xp 00001000 08:05 9044419 /home/xxx/a.out
562c48573000-562c48574000 r--p 00002000 08:05 9044419 /home/xxx/a.out
562c48574000-562c48575000 r--p 00002000 08:05 9044419 /home/xxx/a.out
562c48575000-562c88576000 rw-p 00003000 08:05 9044419 /home/xxx/a.out
562c89668000-562c89689000 rw-p 00000000 00:00 0 [heap]
7f02946be000-7f02946bf000 ---p 00000000 00:00 0
7f02946bf000-7f0294ebf000 rw-p 00000000 00:00 0
7f0294ebf000-7f0294ee1000 r--p 00000000 08:05 6031519 /usr/lib/x86_64-linux-gnu/libc-2.31.so
7f0294ee1000-7f0295059000 r-xp 00022000 08:05 6031519 /usr/lib/x86_64-linux-gnu/libc-2.31.so
7f0295059000-7f02950a7000 r--p 0019a000 08:05 6031519 /usr/lib/x86_64-linux-gnu/libc-2.31.so
7f02950a7000-7f02950ab000 r--p 001e7000 08:05 6031519 /usr/lib/x86_64-linux-gnu/libc-2.31.so
7f02950ab000-7f02950ad000 rw-p 001eb000 08:05 6031519 /usr/lib/x86_64-linux-gnu/libc-2.31.so
7f02950ad000-7f02950b1000 rw-p 00000000 00:00 0
7f02950b7000-7f02950bd000 r--p 00000000 08:05 6031549 /usr/lib/x86_64-linux-gnu/libpthread-2.31.so
7f02950bd000-7f02950ce000 r-xp 00006000 08:05 6031549 /usr/lib/x86_64-linux-gnu/libpthread-2.31.so
7f02950ce000-7f02950d4000 r--p 00017000 08:05 6031549 /usr/lib/x86_64-linux-gnu/libpthread-2.31.so
7f02950d4000-7f02950d5000 r--p 0001c000 08:05 6031549 /usr/lib/x86_64-linux-gnu/libpthread-2.31.so
7f02950d5000-7f02950d6000 rw-p 0001d000 08:05 6031549 /usr/lib/x86_64-linux-gnu/libpthread-2.31.so
7f02950d6000-7f02950da000 rw-p 00000000 00:00 0
7f02950fc000-7f02950ff000 rw-p 00000000 00:00 0
7f02950ff000-7f0295100000 r--p 00000000 08:05 6031511 /usr/lib/x86_64-linux-gnu/ld-2.31.so
7f0295100000-7f0295123000 r-xp 00001000 08:05 6031511 /usr/lib/x86_64-linux-gnu/ld-2.31.so
7f0295123000-7f029512b000 r--p 00024000 08:05 6031511 /usr/lib/x86_64-linux-gnu/ld-2.31.so
7f029512c000-7f029512d000 r--p 0002c000 08:05 6031511 /usr/lib/x86_64-linux-gnu/ld-2.31.so
7f029512d000-7f029512e000 rw-p 0002d000 08:05 6031511 /usr/lib/x86_64-linux-gnu/ld-2.31.so
7f029512e000-7f029512f000 rw-p 00000000 00:00 0
7f0295131000-7f0295133000 rw-p 00000000 00:00 0
7ffcb340f000-7ffcb3430000 rw-p 00000000 00:00 0 [stack]
7ffcb35ff000-7ffcb3603000 r--p 00000000 00:00 0 [vvar]
7ffcb3603000-7ffcb3605000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0 [vsyscall]
Here is an example of /proc/pid/maps output:
0022a000-00245000 r-xp 00000000 ca:01 11633540 /lib/ld-2.5.so
00245000-00246000 r--p 0001a000 ca:01 11633540 /lib/ld-2.5.so
00246000-00247000 rw-p 0001b000 ca:01 11633540 /lib/ld-2.5.so
00249000-003a3000 r-xp 00000000 ca:01 11633640 /lib/i686/nosegneg/libc-2.5.so
003a3000-003a5000 r--p 0015a000 ca:01 11633640 /lib/i686/nosegneg/libc-2.5.so
003a5000-003a6000 rw-p 0015c000 ca:01 11633640 /lib/i686/nosegneg/libc-2.5.so
003a6000-003a9000 rw-p 003a6000 00:00 0
00ada000-00adb000 r-xp 00ada000 00:00 0 [vdso]
08048000-08049000 r-xp 00000000 00:16 4735574 /home/yimingwa/test/Ctest/link_test/SectionMapping.elf
08049000-0804a000 rw-p 00000000 00:16 4735574 /home/yimingwa/test/Ctest/link_test/SectionMapping.elf
b7fcf000-b7fd0000 rw-p b7fcf000 00:00 0
b7fe1000-b7fe2000 rw-p b7fe1000 00:00 0
bfe82000-bfe98000 rw-p bffe8000 00:00 0 [stack]
The 4th column means “If the region was mapped from a file, this is the major and minor device number (in hex) where the file lives”.
For ca:01 above, I can find the device through /proc/devices and /dev.
My question: in "00:16", which device does major number 00 refer to?
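To make the 4th column concrete, here is a minimal sketch that pulls the major and minor numbers out of one maps line. parse_map_device is a hypothetical helper name, and the sketch assumes the standard column order (range, permissions, offset, device). As background, the Linux device list reserves major 0 for unnamed devices (non-device mounts), so a 00:xx entry would fit a file living on something like a network or virtual filesystem; whether that applies to this particular system is an assumption.

```c
#include <stdio.h>

/* Extract the device field (4th column, "major:minor" in hex) from one
 * /proc/<pid>/maps line. Returns 0 on success, -1 on a malformed line.
 * parse_map_device is a hypothetical helper name, not a libc function. */
int parse_map_device(const char *line, unsigned *major_no, unsigned *minor_no)
{
    /* %*s and %*x skip the address range, permissions and offset. */
    if (sscanf(line, "%*s %*s %*x %x:%x", major_no, minor_no) != 2)
        return -1;
    return 0;
}
```

For the SectionMapping.elf line it yields major 0x00 and minor 0x16 (22 in decimal); for the ld-2.5.so line it yields major 0xca and minor 0x01.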
When I look at a process's memory map using
cat /proc/pid/maps
There are entries like this:
40321000-40336000 r-xp 00000000 b3:15 875 /system/lib/libm.so
40336000-40337000 r--p 00014000 b3:15 875 /system/lib/libm.so
40337000-40338000 rw-p 00015000 b3:15 875 /system/lib/libm.so
40338000-40345000 r-xp 00000000 b3:15 789 /system/lib/libcutils.so
40345000-40346000 r--p 0000c000 b3:15 789 /system/lib/libcutils.so
40346000-40347000 rw-p 0000d000 b3:15 789 /system/lib/libcutils.so
40347000-40355000 rw-p 00000000 00:00 0
40355000-403bc000 r-xp 00000000 b3:15 877 /system/lib/libmedia.so
403bc000-403bd000 ---p 00000000 00:00 0
403bd000-403d0000 r--p 00067000 b3:15 877 /system/lib/libmedia.so
403d0000-403d1000 rw-p 0007a000 b3:15 877 /system/lib/libmedia.so
403d1000-403d5000 rw-p 00000000 00:00 0
403d5000-403d8000 rw-p 00000000 00:00 0
I understand the .so represents the shared libraries the process maps. It seems each .so has 3 entries and their permissions are
r-xp
r--p
rw-p
So how do I interpret this? Can I assume the r-xp is the code section of the library, since it has the x (execute) permission? How about the r--p and rw-p, are they the data sections?
What about the empty entries? For example, the last 6 entries about libmedia have three empty entries (00:00 0). What are these?
403bc000-403bd000 ---p 00000000 00:00 0
403bd000-403d0000 r--p 00067000 b3:15 877 /system/lib/libmedia.so
403d0000-403d1000 rw-p 0007a000 b3:15 877 /system/lib/libmedia.so
403d1000-403d5000 rw-p 00000000 00:00 0
403d5000-403d8000 rw-p 00000000 00:00 0
Can I assume the r-xp is the code section of the library, since it has
the x (execute) permission?
Yes; this is known as the text segment (it stores the instructions). Note also that it has no write permission, as it should not.
How about the r--p and rw-p, are they the data sections?
Yes. These segments store the static/global variables. Constant global variables, however, are stored in the r--p segment, since no program should be able to modify them.
What about the empty entries? For example, the last 6 entries about
libmedia have three empty entries (00:00 0). What are these?
These might be guard segments (the kernel inserts them to protect against overflow). The "p" indicates that the mapping is private.
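To illustrate what a guard region looks like from user space, here is a minimal sketch that creates one with mmap and mprotect. alloc_with_guard is a hypothetical helper, and this only shows how a "---p" anonymous entry can come about; it does not claim to be how the entries in the listing above were created.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Reserve `usable` bytes of anonymous memory preceded by one
 * inaccessible guard page. Returns a pointer to the usable part, or
 * NULL on failure. alloc_with_guard is a hypothetical helper name. */
void *alloc_with_guard(size_t usable)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char *base = mmap(NULL, page + usable, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (base == MAP_FAILED)
        return NULL;
    /* The first page now shows up as a "---p ... 00:00 0" entry in
     * /proc/<pid>/maps; touching it raises SIGSEGV. */
    if (mprotect(base, page, PROT_NONE) != 0) {
        munmap(base, page + usable);
        return NULL;
    }
    return base + page;
}
```

After a call such as alloc_with_guard(4096), the process map gains a "---p" line directly followed by an "rw-p" line, both with device 00:00 and inode 0.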
EDIT
For complete information, you may want to refer to the following link:
http://linux.die.net/man/5/proc
My basic question is why is the VSIZE for a 64 bit process so much larger than that of the exact same program compiled for 32 bit?
The following is the output of the /proc/<pid>/maps file for the 32 bit process.
00148000-00149000 r-xp 00000000 00:00 0 [vdso]
00149000-002d2000 r-xp 00000000 fd:02 8914142 /lib/libc-2.12.so
002d2000-002d3000 ---p 00189000 fd:02 8914142 /lib/libc-2.12.so
002d3000-002d5000 r--p 00189000 fd:02 8914142 /lib/libc-2.12.so
002d5000-002d6000 rw-p 0018b000 fd:02 8914142 /lib/libc-2.12.so
002d6000-002d9000 rw-p 00000000 00:00 0
005c9000-005da000 r-xp 00000000 fd:02 17059392 /tmp/vsizetest/lib/libtesting.so
005da000-005db000 rw-p 00010000 fd:02 17059392 /tmp/vsizetest/lib/libtesting.so
005db000-0061b000 rw-p 00000000 00:00 0
00661000-00689000 r-xp 00000000 fd:02 8917713 /lib/libm-2.12.so
00689000-0068a000 r--p 00027000 fd:02 8917713 /lib/libm-2.12.so
0068a000-0068b000 rw-p 00028000 fd:02 8917713 /lib/libm-2.12.so
00694000-006ab000 r-xp 00000000 fd:02 8917680 /lib/libpthread-2.12.so
006ab000-006ac000 r--p 00016000 fd:02 8917680 /lib/libpthread-2.12.so
006ac000-006ad000 rw-p 00017000 fd:02 8917680 /lib/libpthread-2.12.so
006ad000-006af000 rw-p 00000000 00:00 0
006e5000-00703000 r-xp 00000000 fd:00 3150403 /lib/ld-2.12.so
00703000-00704000 r--p 0001d000 fd:00 3150403 /lib/ld-2.12.so
00704000-00705000 rw-p 0001e000 fd:00 3150403 /lib/ld-2.12.so
00983000-009a0000 r-xp 00000000 fd:02 8914997 /lib/libgcc_s-4.4.5-20110214.so.1
009a0000-009a1000 rw-p 0001d000 fd:02 8914997 /lib/libgcc_s-4.4.5-20110214.so.1
00ca5000-00d86000 r-xp 00000000 fd:02 6300601 /usr/lib/libstdc++.so.6.0.13
00d86000-00d8a000 r--p 000e0000 fd:02 6300601 /usr/lib/libstdc++.so.6.0.13
00d8a000-00d8c000 rw-p 000e4000 fd:02 6300601 /usr/lib/libstdc++.so.6.0.13
00d8c000-00d92000 rw-p 00000000 00:00 0
08048000-08049000 r-xp 00000000 fd:02 21134666 /tmp/vsizetest/bin/testvsz
08049000-0804a000 rw-p 00000000 fd:02 21134666 /tmp/vsizetest/bin/testvsz
09b8d000-09bae000 rw-p 00000000 00:00 0 [heap]
f7796000-f779c000 rw-p 00000000 00:00 0
ff998000-ff9ae000 rw-p 00000000 00:00 0 [stack]
Which results in a total VSIZE of 3656.
The following is the output of the /proc/<pid>/maps file for the 64 bit process.
00400000-00401000 r-xp 00000000 fd:02 21134667 /tmp/vsizetest/bin64/testvsz
00600000-00601000 rw-p 00000000 fd:02 21134667 /tmp/vsizetest/bin64/testvsz
02301000-02322000 rw-p 00000000 00:00 0 [heap]
3b7c800000-3b7c820000 r-xp 00000000 fd:00 661349 /lib64/ld-2.12.so
3b7ca1f000-3b7ca20000 r--p 0001f000 fd:00 661349 /lib64/ld-2.12.so
3b7ca20000-3b7ca21000 rw-p 00020000 fd:00 661349 /lib64/ld-2.12.so
3b7ca21000-3b7ca22000 rw-p 00000000 00:00 0
3b7cc00000-3b7cd86000 r-xp 00000000 fd:00 661350 /lib64/libc-2.12.so
3b7cd86000-3b7cf86000 ---p 00186000 fd:00 661350 /lib64/libc-2.12.so
3b7cf86000-3b7cf8a000 r--p 00186000 fd:00 661350 /lib64/libc-2.12.so
3b7cf8a000-3b7cf8b000 rw-p 0018a000 fd:00 661350 /lib64/libc-2.12.so
3b7cf8b000-3b7cf90000 rw-p 00000000 00:00 0
3b7d000000-3b7d083000 r-xp 00000000 fd:00 661365 /lib64/libm-2.12.so
3b7d083000-3b7d282000 ---p 00083000 fd:00 661365 /lib64/libm-2.12.so
3b7d282000-3b7d283000 r--p 00082000 fd:00 661365 /lib64/libm-2.12.so
3b7d283000-3b7d284000 rw-p 00083000 fd:00 661365 /lib64/libm-2.12.so
3b7d800000-3b7d817000 r-xp 00000000 fd:00 661352 /lib64/libpthread-2.12.so
3b7d817000-3b7da16000 ---p 00017000 fd:00 661352 /lib64/libpthread-2.12.so
3b7da16000-3b7da17000 r--p 00016000 fd:00 661352 /lib64/libpthread-2.12.so
3b7da17000-3b7da18000 rw-p 00017000 fd:00 661352 /lib64/libpthread-2.12.so
3b7da18000-3b7da1c000 rw-p 00000000 00:00 0
3b7e000000-3b7e007000 r-xp 00000000 fd:00 661361 /lib64/librt-2.12.so
3b7e007000-3b7e206000 ---p 00007000 fd:00 661361 /lib64/librt-2.12.so
3b7e206000-3b7e207000 r--p 00006000 fd:00 661361 /lib64/librt-2.12.so
3b7e207000-3b7e208000 rw-p 00007000 fd:00 661361 /lib64/librt-2.12.so
3b87000000-3b87016000 r-xp 00000000 fd:00 664219 /lib64/libgcc_s-4.4.6-20110824.so.1
3b87016000-3b87215000 ---p 00016000 fd:00 664219 /lib64/libgcc_s-4.4.6-20110824.so.1
3b87215000-3b87216000 rw-p 00015000 fd:00 664219 /lib64/libgcc_s-4.4.6-20110824.so.1
3d44c00000-3d44ce8000 r-xp 00000000 fd:00 3019214 /usr/lib64/libstdc++.so.6.0.13
3d44ce8000-3d44ee8000 ---p 000e8000 fd:00 3019214 /usr/lib64/libstdc++.so.6.0.13
3d44ee8000-3d44eef000 r--p 000e8000 fd:00 3019214 /usr/lib64/libstdc++.so.6.0.13
3d44eef000-3d44ef1000 rw-p 000ef000 fd:00 3019214 /usr/lib64/libstdc++.so.6.0.13
3d44ef1000-3d44f06000 rw-p 00000000 00:00 0
7f30ab397000-7f30ab39c000 rw-p 00000000 00:00 0
7f30ab39c000-7f30ab3ad000 r-xp 00000000 fd:02 21127804 /tmp/vsizetest/lib64/libtesting.so
7f30ab3ad000-7f30ab5ac000 ---p 00011000 fd:02 21127804 /tmp/vsizetest/lib64/libtesting.so
7f30ab5ac000-7f30ab5ad000 rw-p 00010000 fd:02 21127804 /tmp/vsizetest/lib64/libtesting.so
7f30ab5ad000-7f30ab5ee000 rw-p 00000000 00:00 0
7f30ab606000-7f30ab609000 rw-p 00000000 00:00 0
7fff69512000-7fff69528000 rw-p 00000000 00:00 0 [stack]
7fff695ff000-7fff69600000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Which results in a VSIZE of 18480.
The major difference between the 2 maps is the following entries from the 64 bit data:
3b7cd86000-3b7cf86000 ---p 00186000 fd:00 661350 /lib64/libc-2.12.so
3b7d083000-3b7d282000 ---p 00083000 fd:00 661365 /lib64/libm-2.12.so
3b7d817000-3b7da16000 ---p 00017000 fd:00 661352 /lib64/libpthread-2.12.so
3b7e007000-3b7e206000 ---p 00007000 fd:00 661361 /lib64/librt-2.12.so
3b87016000-3b87215000 ---p 00016000 fd:00 664219 /lib64/libgcc_s-4.4.6-20110824.so.1
3d44ce8000-3d44ee8000 ---p 000e8000 fd:00 3019214 /usr/lib64/libstdc++.so.6.0.13
7f30ab3ad000-7f30ab5ac000 ---p 00011000 fd:02 21127804 /tmp/vsizetest/lib64/libtesting.so
These account for 14316 KB of the 18480 KB VSIZE.
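The 14316 figure can be reproduced by summing the sizes of the "---p" ranges. The following is a minimal sketch; prot_none_kib is a hypothetical helper name, and it assumes the maps text is already in memory with one mapping per line.

```c
#include <stdio.h>
#include <string.h>

/* Sum the sizes, in KB, of all regions whose permission field is
 * "---p" in a buffer holding /proc/<pid>/maps text.
 * prot_none_kib is a hypothetical helper name. */
unsigned long long prot_none_kib(const char *maps_text)
{
    unsigned long long total = 0;
    const char *line = maps_text;

    while (line && *line) {
        unsigned long long start, end;
        char perms[5];

        if (sscanf(line, "%llx-%llx %4s", &start, &end, perms) == 3 &&
            strcmp(perms, "---p") == 0)
            total += (end - start) / 1024;
        line = strchr(line, '\n');
        if (line)
            line++;            /* step past the newline */
    }
    return total;
}
```

Fed the seven "---p" lines listed above, it returns 14316: two regions of 2048 KB and five of 2044 KB.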
Experiments with other programs suggest that in 64 bit you get one of these private, non-readable, non-writeable, non-executable chunks of memory for each shared library used by the process, while in 32 bit there are hardly any of these chunks.
Does anyone know what these chunks of memory are?
Note: Based on some answers to a similar question, What these memory regions for, from a Linux process?, this is not a multi-threaded process and it is already compiled -fPIC.
The major VSIZE difference comes from how the PROT_NONE mappings (mode "---p") of the shared libraries are done in the case of the 32-bit and 64-bit versions.
These are exactly the mappings that you spotted as producing the difference.
In general for each shared library loaded we will have four mappings:
3b7cc00000-3b7cd86000 r-xp 00000000 fd:00 661350 /lib64/libc-2.12.so
3b7cd86000-3b7cf86000 ---p 00186000 fd:00 661350 /lib64/libc-2.12.so
3b7cf86000-3b7cf8a000 r--p 00186000 fd:00 661350 /lib64/libc-2.12.so
3b7cf8a000-3b7cf8b000 rw-p 0018a000 fd:00 661350 /lib64/libc-2.12.so
The first one is the code segment with executable permissions, the second is the PROT_NONE (mode ---) mapping (its pages may not be accessed), and the last two are the data segment (a read-only part and a read-write part).
The size of the PROT_NONE mapping follows MAXPAGESIZE, so it differs between the 32-bit and 64-bit versions: 4KB for the 32-bit version (MAXPAGESIZE for i386) and 2MB for the 64-bit version (the standard MAXPAGESIZE for x86_64 systems).
Note that this memory is not actually consumed (it only takes up addresses in the address space), as explained here:
http://www.greenend.org.uk/rjk/tech/dataseg.html
"This extra doesn’t cost you any RAM or swap space, just address space within each process, which is in plentiful supply on 64-bit platforms. The underlying reason is to do with keeping libraries efficiently sharable, but the implementation is a little odd."
One last tip: I find it easier to check memory mappings with the pmap utility, which parses the maps file and produces output that is simpler to read:
For basic info:
pmap <PID>
For extended info:
pmap -x <PID>
[Not really an answer... speaking past my knowledge]
If the memory segments are really "private, non-readable, non-writeable, non-executable" then they should never be referred to, and even though they exist in the VIRTUAL memory space, they will never occupy any real memory, and therefore not much to worry about. (?)
It must be some sort of book-keeping or fragmentation issue. Since these are part of the shared libraries (*.so) it's just how those libraries were built. It really has nothing to do with your program, other than it's linked to those libraries. Unless you want to rebuild those libraries, or not use them, there isn't much to do about it (and not much to gain anyway as they should use no real memory anyway).
Maybe related?
In "What these memory regions for, from a Linux process?",
@caf says that some "---p" memory segments are "guard pages".
That suggests they exist just to catch a stray pointer or a stack that grows too far... a sort of hard separator in memory so the system can catch a common error and stop processing rather than let it slip by (it's a fatal error to refer to them at all, and they really will NEVER use any real memory).
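The idea that an inaccessible region takes up address space but no real memory can be checked on a Linux system. This is a minimal sketch, assuming /proc is mounted; read_statm and reserve_inaccessible are hypothetical helper names, and the first two fields of /proc/self/statm are the total program size and the resident set size, both in pages.

```c
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <sys/mman.h>

/* Read total program size and resident set size, both in pages, from
 * /proc/self/statm. Returns 0 on success, -1 on failure. */
int read_statm(long *size_pages, long *resident_pages)
{
    FILE *f = fopen("/proc/self/statm", "r");
    int n;

    if (!f)
        return -1;
    n = fscanf(f, "%ld %ld", size_pages, resident_pages);
    fclose(f);
    return n == 2 ? 0 : -1;
}

/* Reserve `len` bytes of address space that cannot be accessed, like
 * the "---p" regions discussed above. Returns NULL on failure.
 * reserve_inaccessible is a hypothetical helper name. */
void *reserve_inaccessible(size_t len)
{
    void *p = mmap(NULL, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}
```

Calling read_statm before and after reserve_inaccessible(64UL << 20) should show the size field growing by the whole reservation while the resident field stays essentially unchanged.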
To explain why a 64-bit shared library has an additional chunk of memory, and what that chunk is, take the loading of libc.so as an example and look at how the loader loads dynamic libraries. Below are strace outputs for both the 32-bit and 64-bit executables, which show calls to mmap and mprotect.
esunboj@L9AGC12:~/32_64bit$ strace ./crash-x86-64
...
open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\200\30\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1811128, ...}) = 0
mmap(NULL, 3925208, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fa354f8a000
mprotect(0x7fa35513f000, 2093056, PROT_NONE) = 0
mmap(0x7fa35533e000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1b4000) = 0x7fa35533e000
mmap(0x7fa355344000, 17624, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fa355344000
close(3) = 0
...
esunboj@L9AGC12:~/32_64bit$ strace ./crash
...
open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0000\226\1\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1730024, ...}) = 0
mmap2(NULL, 1743580, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xfffffffff7546000
mprotect(0xf76e9000, 4096, PROT_NONE) = 0
mmap2(0xf76ea000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a3) = 0xfffffffff76ea000
mmap2(0xf76ed000, 10972, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xfffffffff76ed000
close(3) = 0
...
Looking closely at both strace outputs, two things need investigating:
1. Each of them maps memory 3 times and makes 1 call to mprotect right after the first mmap.
2. The mprotect calls protect a region of 2093056 bytes in the 64-bit case and 4096 bytes in the 32-bit case.
In dl-load.c, the subroutine _dl_map_object_from_fd() maps a dynamic library's memory segments into virtual address space, sets the required permissions, zero-fills the library's .bss section, and updates the link map structure. Here is part of that code for further analysis:
struct link_map *
_dl_map_object_from_fd ( )
{
  ...
  /* Scan the program header table, collecting its load commands. */
  struct loadcmd
  {
    ElfW(Addr) mapstart, mapend, dataend, allocend;
    off_t mapoff;
    int prot;
  } loadcmds[l->l_phnum], *c; /* l is the link_map struct described for each object of the dynamic linker */
  size_t nloadcmds = 0;
  bool has_holes = false;
  ...
  for (ph = phdr; ph < &phdr[l->l_phnum]; ++ph)
    switch (ph->p_type)
      {
      ...
      case PT_LOAD:
        ...
        c = &loadcmds[nloadcmds++];
        c->mapstart = ph->p_vaddr & ~(GLRO(dl_pagesize) - 1);
        c->mapend = ((ph->p_vaddr + ph->p_filesz + GLRO(dl_pagesize) - 1)
                     & ~(GLRO(dl_pagesize) - 1));
        ...
        if (nloadcmds > 1 && c[-1].mapend != c->mapstart)
          has_holes = true;
        ...
      }
  ...
  if (has_holes)
    __mprotect ((caddr_t) (l->l_addr + c->mapend),
                loadcmds[nloadcmds - 1].mapstart - c->mapend, PROT_NONE);
  ...
}
In the code above, l_phnum, used in the for statement, holds the number of entries in the ELF program header table, and each iteration handles one entry. The first PT_LOAD entry is essentially the .text/.rodata part, which gets mmapped (1st mmap in strace); the second PT_LOAD entry represents the .data part (2nd mmap in strace). Before the second PT_LOAD segment is mapped, mapstart and mapend, which mark the start and end of the text segment, are preserved. In the next PT_LOAD iteration, if the previous segment's mapend does not equal the current (.data) segment's mapstart, there is a hole between the two PT_LOAD segments, that is, a gap between the .text and .data sections. If there is such a hole, the loader makes it inaccessible by giving it no permissions (the mprotect call in strace). The protected region is 511 pages for the 64-bit process versus just 1 page for the 32-bit process, which adds up to a huge chunk of virtual memory for 64-bit libraries.
Proof for the 64-bit inaccessible region: the objdump output for libc.so below gives the following virtual addresses (VA), rounded to page boundaries:
             PT_LOAD(1)          PT_LOAD(2)
mapstart VA  0x0000000000000000  0x00000000003b4000
mapend VA    0x00000000001b5000  0x00000000003ba000
Here PT_LOAD(1) mapend (0x00000000001b5000) is not equal to PT_LOAD(2) mapstart (0x00000000003b4000), resulting in a memory hole of 0x00000000001FF000 (2093056 bytes in decimal).
esunboj@L9AGC12:~/32_64bit$ objdump -x -s -d -D /lib/x86_64-linux-gnu/libc.so.6
Program Header:
...
LOAD off 0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**21
filesz 0x00000000001b411c memsz 0x00000000001b411c flags r-x
LOAD off 0x00000000001b4700 vaddr 0x00000000003b4700 paddr 0x00000000003b4700 align 2**21
filesz 0x0000000000005160 memsz 0x0000000000009dd8 flags rw-
...
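The rounding from the dl-load.c excerpt can be replayed on the two LOAD headers of this objdump output. This is a minimal sketch, assuming a 4KB run-time page size; PAGE_SZ, map_start, map_end and load_hole are hypothetical names that mirror the mapstart/mapend computation, not glibc symbols.

```c
#include <stdint.h>

#define PAGE_SZ 0x1000ULL   /* 4KB run-time page size, an assumption */

/* mapstart: the segment's vaddr rounded down to a page boundary. */
uint64_t map_start(uint64_t vaddr)
{
    return vaddr & ~(PAGE_SZ - 1);
}

/* mapend: vaddr + filesz rounded up to a page boundary. */
uint64_t map_end(uint64_t vaddr, uint64_t filesz)
{
    return (vaddr + filesz + PAGE_SZ - 1) & ~(PAGE_SZ - 1);
}

/* Size of the gap between the end of the first PT_LOAD segment and
 * the start of the second. load_hole is a hypothetical helper name. */
uint64_t load_hole(uint64_t vaddr1, uint64_t filesz1, uint64_t vaddr2)
{
    return map_start(vaddr2) - map_end(vaddr1, filesz1);
}
```

With the values above, load_hole(0x0, 0x1b411c, 0x3b4700) gives 0x1ff000, which is 2093056 bytes or 511 pages of 4KB, matching the length passed to mprotect in the 64-bit strace.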
On top of that, 64-bit code generally takes more instruction bytes than 32-bit code. Pointers are 8 bytes on 64-bit, 4 more than on 32-bit, and data structures are 8-byte aligned, all of which makes the mapped regions larger.
Running the size command on the binaries shows the difference between the memory regions of the 32-bit and 64-bit programs:
esunboj@L9AGC12:~/32_64bit$ ls -lrt
total 10368
-rwxrwxrwx 1 esunboj ei 5758776 Oct 10 11:35 crash-x86-64
-rwxrwxrwx 1 esunboj ei 4855676 Oct 10 11:36 crash
esunboj@L9AGC12:~/32_64bit$ size crash
   text    data     bss     dec     hex filename
4771286   82468  308704 5162458  4ec5da crash
esunboj@L9AGC12:~/32_64bit$ size crash-x86-64
   text    data     bss     dec     hex filename
5634861  121164 1623728 7379753  709b29 crash-x86-64
I have a strange ELF binary. I can run this binary in 32bit linux.
But if I open this binary with IDA disassembler, IDA says "invalid entry point".
Result of readelf is as below:
root@meltdown-VirtualBox:/home/meltdown# readelf -S -l SimpleVM
There are no sections in this file.
Elf file type is EXEC (Executable file)
Entry point 0xc023dc
There are 2 program headers, starting at offset 52
Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x00c01000 0x00c01000 0x013c7 0x013c7 RWE 0x1000
  LOAD           0x00019c 0x0804b19c 0x0804b19c 0x00000 0x00000 RW  0x1000
There are no sections, so I thought this binary was packed. But the last virtual address of the first LOAD segment is 0xc023c7, and the virtual address of the entry point is 0xc023dc, which is out of range...
Can someone tell me what's going on?
Thank you in advance.
/proc/PID/maps is as follows (two processes are created...)
root@meltdown-VirtualBox:/proc/3510# cat maps
00110000-00111000 rwxp 00000000 00:00 0
006c0000-006c1000 r-xp 00000000 00:00 0 [vdso]
007d2000-007d4000 rwxp 00000000 00:00 0
00c01000-00c02000 rwxp 00000000 08:01 3801242 /home/meltdown/SimpleVM
00ca4000-00e43000 r-xp 00000000 08:01 17171359 /lib/i386-linux-gnu/libc-2.15.so
00e43000-00e45000 r-xp 0019f000 08:01 17171359 /lib/i386-linux-gnu/libc-2.15.so
00e45000-00e46000 rwxp 001a1000 08:01 17171359 /lib/i386-linux-gnu/libc-2.15.so
00e46000-00e49000 rwxp 00000000 00:00 0
08048000-0804b000 r-xp 00000000 00:00 0
0804b000-0804c000 rwxp 00000000 00:00 0
b77a7000-b77c7000 r-xp 00000000 08:01 17171339 /lib/i386-linux-gnu/ld-2.15.so
b77c7000-b77c8000 r-xp 0001f000 08:01 17171339 /lib/i386-linux-gnu/ld-2.15.so
b77c8000-b77c9000 rwxp 00020000 08:01 17171339 /lib/i386-linux-gnu/ld-2.15.so
bfa90000-bfab1000 rwxp 00000000 00:00 0 [stack]
root@meltdown-VirtualBox:/proc/3511# cat maps
00110000-00111000 rwxp 00000000 00:00 0
006c0000-006c1000 r-xp 00000000 00:00 0 [vdso]
007d2000-007d4000 rwxp 00000000 00:00 0
00c01000-00c02000 rwxp 00000000 08:01 3801242 /home/meltdown/SimpleVM
00ca4000-00e43000 r-xp 00000000 08:01 17171359 /lib/i386-linux-gnu/libc-2.15.so
00e43000-00e45000 r-xp 0019f000 08:01 17171359 /lib/i386-linux-gnu/libc-2.15.so
00e45000-00e46000 rwxp 001a1000 08:01 17171359 /lib/i386-linux-gnu/libc-2.15.so
00e46000-00e49000 rwxp 00000000 00:00 0
08048000-0804b000 r-xp 00000000 00:00 0
0804b000-0804c000 rwxp 00000000 00:00 0
b77a7000-b77c7000 r-xp 00000000 08:01 17171339 /lib/i386-linux-gnu/ld-2.15.so
b77c7000-b77c8000 r-xp 0001f000 08:01 17171339 /lib/i386-linux-gnu/ld-2.15.so
b77c8000-b77c9000 rwxp 00020000 08:01 17171339 /lib/i386-linux-gnu/ld-2.15.so
bfa90000-bfab1000 rwxp 00000000 00:00 0 [stack]
It's probably because of the granularity of mapping length. The length of the mapping is going to be rounded up to be a multiple of the page size. On my system the page size is 4k so the mapping would be rounded up to 4k and encompass the entry point. Even with a page size of 1k the length would round up to 0x1400, enough to include the entry point. If the file is long enough then the extra bytes would probably come from the file instead of the page initialization.
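The rounding argument can be checked with the numbers from the readelf output. This is a minimal sketch; round_up and in_rounded_mapping are hypothetical helper names, and the page size is passed in so that both the 4k and the 1k case can be tried.

```c
#include <stdint.h>

/* Round len up to a multiple of page (page must be a power of two). */
uint64_t round_up(uint64_t len, uint64_t page)
{
    return (len + page - 1) & ~(page - 1);
}

/* Return 1 if addr lies inside the page-rounded mapping of a segment
 * that starts at vaddr and is memsz bytes long, else 0.
 * in_rounded_mapping is a hypothetical helper name. */
int in_rounded_mapping(uint64_t addr, uint64_t vaddr, uint64_t memsz,
                       uint64_t page)
{
    return addr >= vaddr && addr < vaddr + round_up(memsz, page);
}
```

For the first LOAD segment (vaddr 0xc01000, size 0x13c7), round_up gives 0x2000 with 4k pages and 0x1400 with 1k pages, and in_rounded_mapping reports the entry point 0xc023dc as inside the mapping in both cases.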