Void Buffer Incorrectly Referenced Still Works - linux

While I was working on a bit of C code, I came across this strange bug.
I made a mistake in my code and passed buf to read() rather than &buf, but it still appeared to work.
...
void* buf;
int ret;
int fd = open("1", O_CREAT | O_RDWR, 0777);
write(fd, "test\n", 5);
lseek(fd, 0, SEEK_SET);
ret = read(fd, buf, 5); // Yes, this should be &buf
printf("Ret: %d Str: %s\n", ret, buf);
---- output ----
Ret: 5 Str: test\n
This code works and I get test\n in my stdout, even though I should have had &buf in my read call. Please, I am aware that changing buf to &buf works. That is not the question.
This is what does not work:
...
void* buf;
void* blah = "a"; // Using char* still did not work
int ret;
int fd = open("1", O_CREAT | O_RDWR, 0777);
write(fd, "test\n", 5);
lseek(fd, 0, SEEK_SET);
ret = read(fd, buf, 5);
printf("Ret: %d Str: %s\n", ret, buf);
---- output ----
Ret: -1 Str: 1�I��^H��H���PTI��`#
The binary for file 1 is the same for both programs. No error in writing to 1.
Why does the first code snippet work?
How does adding a variable that is never used make this no longer work?
Why did writing to buf and not &buf work in the first place?
Here is the strings section in each binary:
Functioning code:
0000770: 0100 0200 0000 0000 0000 0000 0000 0000 ................
0000780: 3100 7465 7374 0a00 4572 723a 2025 640a 1.test..Err: %d.
0000790: 0a00 5374 723a 2025 730a 0000 011b 033b ..Str: %s......;
00007a0: 3000 0000 0500 0000 34fd ffff 7c00 0000 0.......4...|...
Malfunctioning code:
0000770: 0100 0200 0000 0000 0000 0000 0000 0000 ................
0000780: 6100 3100 7465 7374 0a00 4572 723a 2025 a.1.test..Err: %
0000790: 640a 0a00 5374 723a 2025 730a 0000 0000 d...Str: %s.....
00007a0: 011b 033b 3400 0000 0500 0000 30fd ffff ...;4.......0...
Thanks.

(Since it's pretty much impossible to put more than a line of code in comments)
Warnings from compiling with -Wall -Wextra:
x.c: In function ‘main’:
x.c:15:25: warning: format ‘%s’ expects argument of type ‘char *’, but argument 3 has type ‘void *’ [-Wformat=]
printf("Ret: %d Str: %s\n", ret, buf);
~^
%p
x.c:14:9: warning: ‘buf’ is used uninitialized in this function [-Wuninitialized]
ret = read(fd, buf, 5);
^~~~~~~~~~~~~~~~
The results of running your program through valgrind:
==6978== Memcheck, a memory error detector
==6978== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==6978== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==6978== Command: ./a.out
==6978==
==6978== Syscall param read(buf) contains uninitialised byte(s)
==6978== at 0x4F4C081: read (read.c:27)
==6978== by 0x1087BF: main (x.c:14)
==6978==
==6978== Syscall param read(buf) points to unaddressable byte(s)
==6978== at 0x4F4C081: read (read.c:27)
==6978== by 0x1087BF: main (x.c:14)
==6978== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==6978==
==6978== Conditional jump or move depends on uninitialised value(s)
==6978== at 0x4E97A41: vfprintf (vfprintf.c:1643)
==6978== by 0x4EA0F25: printf (printf.c:33)
==6978== by 0x1087DC: main (x.c:18)
==6978==
Ret: -1 Str: (null)
==6978==
==6978== HEAP SUMMARY:
==6978== in use at exit: 0 bytes in 0 blocks
==6978== total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==6978==
==6978== All heap blocks were freed -- no leaks are possible
==6978==
==6978== For counts of detected and suppressed errors, rerun with: -v
==6978== Use --track-origins=yes to see where uninitialised values come from
==6978== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)
You have to make buf point to valid memory you have permission to use or you get undefined behavior where anything can happen. If you're lucky, it'll just crash your program, but you can't count on that.

Related

build-id data offset in the ELF file

I need to modify the build-id of the ELF notes section. I found out that it is possible here. Also found out that I can do it by modifying this code. What I can't figure out is data location. Here is what I'm talking about.
$ eu-readelf -S myelffile
Section Headers:
[Nr] Name Type Addr Off Size ES Flags Lk Inf Al
...
[ 2] .note.ABI-tag NOTE 000000000000028c 0000028c 00000020 0 A 0 0 4
[ 3] .note.gnu.build-id NOTE 00000000000002ac 000002ac 00000024 0 A 0 0 4
...
$ eu-readelf -n myelffile
Note section [ 2] '.note.ABI-tag' of 32 bytes at offset 0x28c:
Owner Data size Type
GNU 16 GNU_ABI_TAG
OS: Linux, ABI: 3.14.0
Note section [ 3] '.note.gnu.build-id' of 36 bytes at offset 0x2ac:
Owner Data size Type
GNU 20 GNU_BUILD_ID
Build ID: d75a086c288c582036b0562908304bc3a8033235
.note.gnu.build-id section is 36 bytes. The build id is 20 bytes. What are the other 16 bytes?
I played with the code a bit and read 36 bytes of myelffile at offset 0x2ac. Got the following 040000001400000003000000474e5500d75a086c288c582036b0562908304bc3a8033235.
Then I decided to use the Elf64_Shdr definition, so I read data at address 0x2ac + sizeof(Elf64_Shdr.sh_name) + sizeof(Elf64_Shdr.sh_type) + sizeof(Elf64_Shdr.sh_flags) and I got my build id, d75a086c288c582036b0562908304bc3a8033235. It makes sense that I got it, since sizeof(Elf64_Shdr.sh_name) + sizeof(Elf64_Shdr.sh_type) + sizeof(Elf64_Shdr.sh_flags) = 16 bytes, but according to the Elf64_Shdr definition I should then be pointing at Elf64_Addr sh_addr, i.e. the section's virtual address.
So what is not clear to me is what are the other 16 bytes of the section? What do they represent? I can't reconcile the Elf64_Shdr definition and the results I'm getting from my experiments.
.note.gnu.build-id section is 36 bytes. The build id is 20 bytes. What are the other 16 bytes?
Each .note.* section starts with Elf64_Nhdr (12 bytes), followed by (4-byte aligned) note name of variable size (GNU\0 here), followed by (4-byte aligned) actual note data. Documentation.
Looking at /bin/date on my system:
eu-readelf -Wn /bin/date
Note section [ 2] '.note.ABI-tag' of 32 bytes at offset 0x2c4:
Owner Data size Type
GNU 16 GNU_ABI_TAG
OS: Linux, ABI: 3.2.0
Note section [ 3] '.note.gnu.build-id' of 36 bytes at offset 0x2e4:
Owner Data size Type
GNU 20 GNU_BUILD_ID
Build ID: 979ae4616ae71af565b123da2f994f4261748cc9
What are the bytes at offset 0x2e4?
dd bs=1 skip=$((0x2e4)) count=36 < /bin/date | xxd
00000000: 0400 0000 1400 0000 0300 0000 474e 5500 ............GNU.
00000010: 979a e461 6ae7 1af5 65b1 23da 2f99 4f42 ...aj...e.#./.OB
00000020: 6174 8cc9 at..
So we have: .n_namesz == 4, .n_descsz == 20, .n_type == 3 == NT_GNU_BUILD_ID, followed by 4-byte GNU\0 note name, followed by 20 bytes of actual build-id bytes 0x97, 0x9a, etc.

Loaded glibc base address different for each function

I'm trying to calculate the base address of the library of a binary file.
I have the addresses of printf, puts, etc., and then I subtract each function's offset to get the base address of the library.
I was doing this for printf, puts and signal, but every time I got a different base address.
I also tried to do the things in this post, but I couldn't get the right result either.
ASLR is disabled.
This is where I take the addresses of the library functions:
gdb-peda$ x/20wx 0x804b018
0x804b018 <signal@got.plt>: 0xf7e05720 0xf7e97010 0x080484e6 0x080484f6
0x804b028 <puts@got.plt>: 0xf7e3fb40 0x08048516 0x08048526 0xf7df0d90
0x804b038 <memset@got.plt>: 0xf7f18730 0x08048556 0x08048566 0x00000000
then I have:
gdb-peda$ info proc mapping
process 114562
Mapped address spaces:
Start Addr End Addr Size Offset objfile
0x8048000 0x804a000 0x2000 0x0 /home/ofey/CTF/Pwnable.tw/applestore/applestore
0x804a000 0x804b000 0x1000 0x1000 /home/ofey/CTF/Pwnable.tw/applestore/applestore
0x804b000 0x804c000 0x1000 0x2000 /home/ofey/CTF/Pwnable.tw/applestore/applestore
0x804c000 0x806e000 0x22000 0x0 [heap]
0xf7dd8000 0xf7fad000 0x1d5000 0x0 /lib/i386-linux-gnu/libc-2.27.so
0xf7fad000 0xf7fae000 0x1000 0x1d5000 /lib/i386-linux-gnu/libc-2.27.so
0xf7fae000 0xf7fb0000 0x2000 0x1d5000 /lib/i386-linux-gnu/libc-2.27.so
0xf7fb0000 0xf7fb1000 0x1000 0x1d7000 /lib/i386-linux-gnu/libc-2.27.so
0xf7fb1000 0xf7fb4000 0x3000 0x0
0xf7fd0000 0xf7fd2000 0x2000 0x0
0xf7fd2000 0xf7fd5000 0x3000 0x0 [vvar]
0xf7fd5000 0xf7fd6000 0x1000 0x0 [vdso]
0xf7fd6000 0xf7ffc000 0x26000 0x0 /lib/i386-linux-gnu/ld-2.27.so
0xf7ffc000 0xf7ffd000 0x1000 0x25000 /lib/i386-linux-gnu/ld-2.27.so
0xf7ffd000 0xf7ffe000 0x1000 0x26000 /lib/i386-linux-gnu/ld-2.27.so
0xfffdd000 0xffffe000 0x21000 0x0 [stack]
and :
gdb-peda$ info sharedlibrary
From To Syms Read Shared Object Library
0xf7fd6ab0 0xf7ff17fb Yes /lib/ld-linux.so.2
0xf7df0610 0xf7f3d386 Yes /lib/i386-linux-gnu/libc.so.6
I then found the offset of signal and puts to calculate the base libc address.
base_with_signal_offset = 0xf7e05720 - 0x3eda0 = 0xf7dc6980
base_with_puts_offset = 0xf7e3fb40 - 0x809c0 = 0xf7dbf180
I was expecting base_with_signal_offset = base_with_puts_offset = 0xf7dd8000, but that's not the case.
What am I doing wrong?
EDIT (to show where I got those offsets):
readelf -s /lib/x86_64-linux-gnu/libc-2.27.so | grep puts
I get :
191: 00000000000809c0 512 FUNC GLOBAL DEFAULT 13 _IO_puts@@GLIBC_2.2.5
422: 00000000000809c0 512 FUNC WEAK DEFAULT 13 puts@@GLIBC_2.2.5
496: 00000000001266c0 1240 FUNC GLOBAL DEFAULT 13 putspent@@GLIBC_2.2.5
678: 00000000001285d0 750 FUNC GLOBAL DEFAULT 13 putsgent@@GLIBC_2.10
1141: 000000000007f1f0 396 FUNC WEAK DEFAULT 13 fputs@@GLIBC_2.2.5
1677: 000000000007f1f0 396 FUNC GLOBAL DEFAULT 13 _IO_fputs@@GLIBC_2.2.5
2310: 000000000008a640 143 FUNC WEAK DEFAULT 13 fputs_unlocked@@GLIBC_2.2.5
I was expecting base_with_signal_offset = base_with_puts_offset = 0xf7dd8000
There are 3 numbers in your calculation:
&puts_at_runtime - symbol_value_from_readelf == &first_executable_pt_load_segment_libc.
The readelf output shows that you got one of these almost correct: the value of puts in 64-bit /lib/x86_64-linux-gnu/libc-2.27.so is indeed 0x809c0, but that is not the library you are actually using. You need to repeat the same on the actually used 32-bit library: /lib/i386-linux-gnu/libc-2.27.so.
For the first number -- &puts_at_runtime -- you are using the value from the puts@got.plt import stub. That value is only guaranteed to have been resolved (to point to the actual puts in libc.so) IFF you have LD_BIND_NOW=1 set in the environment, or you linked your executable with the -z now linker flag, or you have actually called puts already.
It may be better to print &puts in GDB.
The last number -- &first_executable_pt_load_segment_libc -- is correct, because info shared shows that the libc.so.6 .text section starts at 0xf7df0610, which is between 0xf7dd8000 and 0xf7fad000.
So putting it all together, the only error was that you used the wrong version of libc.so to extract the symbol_value_from_readelf.
On my system:
#include <signal.h>
#include <stdio.h>
int main() {
puts("Hello");
signal(SIGINT, SIG_IGN);
return 0;
}
gcc -m32 t.c -fno-pie -no-pie
gdb -q a.out
... set breakpoint on exit from main
Breakpoint 1, 0x080491ae in main ()
(gdb) p &puts
$1 = (<text variable, no debug info> *) 0xf7e31300 <puts>
(gdb) p &signal
$2 = (<text variable, no debug info> *) 0xf7df7d20 <ssignal>
(gdb) info proc map
process 114065
Mapped address spaces:
Start Addr End Addr Size Offset objfile
0x8048000 0x8049000 0x1000 0x0 /tmp/a.out
...
0x804d000 0x806f000 0x22000 0x0 [heap]
0xf7dc5000 0xf7de2000 0x1d000 0x0 /lib/i386-linux-gnu/libc-2.29.so
...
(gdb) info shared
From To Syms Read Shared Object Library
0xf7fd5090 0xf7ff0553 Yes (*) /lib/ld-linux.so.2
0xf7de20e0 0xf7f2b8d6 Yes (*) /lib/i386-linux-gnu/libc.so.6
Given the above, we expect readelf -s to give us 0xf7e31300 - 0xf7dc5000 == 0x6c300 for puts and 0xf7df7d20 - 0xf7dc5000 == 0x32d20 for signal.
readelf -Ws /lib/i386-linux-gnu/libc-2.29.so | egrep ' (puts|signal)\W'
452: 00032d20 68 FUNC WEAK DEFAULT 14 signal@@GLIBC_2.0
458: 0006c300 400 FUNC WEAK DEFAULT 14 puts@@GLIBC_2.0
QED.

Place .text section at the very end of ELF

For a specific reason I need to place the .text section at the very end of my ELF file.
I tried to achieve this the following way:
I took the default large linker script and moved the .text section to the very end of the SECTIONS { ... } part.
$ readelf -S beronew
[ #] Name Type Address Offset
Size Size.Ent Flags - - Alignment
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .data PROGBITS 00000000006000b0 000000b0
000000000000003b 0000000000000000 WA 0 0 1
[ 2] .text PROGBITS 0000000000a000f0 000000f0
00000000000003e9 0000000000000000 AX 0 0 1
[ 3] .shstrtab STRTAB 0000000000000000 000004d9
0000000000000027 0000000000000000 0 0 1
[ 4] .symtab SYMTAB 0000000000000000 00000680
0000000000000438 0000000000000018 5 41 8
[ 5] .strtab STRTAB 0000000000000000 00000ab8
0000000000000258 0000000000000000 0 0 1
What I see is that ld added extra sections after my "ending" section. To get rid of them I used the -nostdlib and -s linker options (to not use the standard library, just in case, and to omit all symbol information).
Run $ readelf -S beronew one more time:
[ #] Name Type Address Offset
Size Size.Ent Flags - - Alignment
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .data PROGBITS 00000000006000b0 000000b0
000000000000003b 0000000000000000 WA 0 0 1
[ 2] .text PROGBITS 0000000000a000f0 000000f0
00000000000003e9 0000000000000000 AX 0 0 1
[ 3] .shstrtab STRTAB 0000000000000000 000004d9
0000000000000017 0000000000000000 0 0 1
The section header string table is still there. I tried $ strip -R .shstrtab beronew. It had no effect; the section is still there.
This section is only 0x17 bytes long, but I still couldn't achieve my goal. Then I looked at a hexdump of my file:
$ hexdump beronew
...
00004d0 0060 c748 bfc6 6000 0000 732e 7368 7274
00004e0 6174 0062 642e 7461 0061 742e 7865 0074
00004f0 0000 0000 0000 0000 0000 0000 0000 0000
*
0000530 000b 0000 0001 0000 0003 0000 0000 0000
0000540 00b0 0060 0000 0000 00b0 0000 0000 0000
0000550 003b 0000 0000 0000 0000 0000 0000 0000
...
What I see is that there is more data after the sections. According to the ELF structure, it is the section header table at the end of the file. So even if I somehow remove the .shstrtab section, this table will still be at the end.
So my question is: how can I place my .text section at the very end of the file? I don't really need to remove all the sections and headers, so if you know a (better) way to achieve this, it would be highly appreciated.
P.S. For those who wonder why I need this:
This ELF file (beronew) contains a runtime library. It will be used as a "header" for another file that generates asm instructions with some logic in opcode form. This opcode will be appended to the very end of beronew. Then I'm going to patch the sh_size field in .text's section header to be able to run my newly added 'code'.
(One more question: is this all that I need to patch if the 'text' section is the last one in the file?)
P.P.S. I know that this is a bad architecture, but it is my course project (porting an app that was built this way from Win32 to Linux64), and now I'm stuck at the point where I merge the runtime library "header" file and the "logic" part, because I can't place the .text section at the end of the ELF.
Thanks one more time!
UPD:
Based on fuz's comment I tried adding PHDRS to a simple linker script like this:
PHDRS
{
headers PT_PHDR PHDRS ;
data PT_LOAD ;
bss PT_LOAD ;
text PT_LOAD ;
}
SECTIONS
{
. = 0x200000;
.data : { *(.data) *(COMMON) } :data
.bss : { *(.bss) } :bss
.text : { *(.text) } :text
}
but it doesn't seem to work now.
For anyone wondering how I eventually got this working, here is the answer:
I made a linker script that places the text section after the data section (here is the script). There were still some extra sections at the end of the file, such as the .shstrtab section (picture below), so I converted the file byte by byte into a string.
After this I had the ELF shown in the picture below in string-of-bytes form. Then I read some of the headers and found out where the code section ends (right before the .shstrtab section), so I could split the string into two pieces. The first one contained the data for the loader, the second one the data for the linker.
Then I converted my 'extra' code into opcode form, which is also an array (string) of bytes, and concatenated it to the original .text section. Next I concatenated the ending part to it, so I got a single file with my extra code in it.
To get this working I edited the values from the picture below:
The first column is the name of the field that needs to be edited (it corresponds to the picture of the ELF structure).
The second column is the offset from the beginning of the file to the beginning of the field. Let s(xxx) be sizeof(some_header_structure) and injSize be the size of my injected extra code. Values like 0x18, 0x20 and 0x28 are the offsets of fields inside their structures (section headers, program headers, ELF header).
The third column is the value that should replace the original one.
Note that the ELF shown is an ELF64, so the widths of some fields differ from ELF32's.
After all this I could simply run the new ELF file and it worked perfectly! It is surely not the best solution, but it works and it was nice material for my research work.

is there any static code analyzer which can catch this memory leak?

Such leaks seem trivial to the naked eye, and I think static code analysis tools should be able to find them.
Ex1:
void foo(void) {
u32 *ptr = kmalloc(512, GFP_KERNEL);
ptr = (u32 *)0xffffffff;
kfree(ptr);
}
I know Coverity can find leaks like the ones below, but I am not sure about the one above. Can anyone tell me whether it would be detected by Coverity or by tools like Sparse?
Ex2:
void foo(void) {
kmalloc(512, GFP_KERNEL);
}
Ex3:
void foo(void) {
void * ptr = kmalloc(512, GFP_KERNEL);
if (true)
return;
kfree(ptr);
}
I don't know about kmalloc (and I don't have a Linux system with a Coverity license to test that on), but Coverity detects leaks of this form with malloc easily. So I doubt kmalloc would give it trouble.
If it does give trouble, you can always provide a user model of the kmalloc function that just wraps around the malloc function so Coverity knows how to treat the function.
Valgrind can be used to detect the memory leak in Ex1, e.g.:
#include <stdlib.h>
void foo(void) {
int *ptr = (int *)malloc(512);
ptr = (int *)0xffffffff;
free(ptr);
}
int main(){
foo();
return 1;
}
Valgrind output:
[test@myhost /tmp]# valgrind --tool=memcheck --leak-check=full ./Ex1
==23780== Memcheck, a memory error detector
==23780== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==23780== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==23780== Command: ./Ex1
==23780==
==23780== Invalid free() / delete / delete[]
==23780== at 0x4A05A31: free (vg_replace_malloc.c:325)
==23780== by 0x400509: foo (in /tmp/Ex1)
==23780== by 0x400514: main (in /tmp/Ex1)
==23780== Address 0xffffffff is not stack'd, malloc'd or (recently) free'd
==23780==
==23780==
==23780== HEAP SUMMARY:
==23780== in use at exit: 512 bytes in 1 blocks
==23780== total heap usage: 1 allocs, 1 frees, 512 bytes allocated
==23780==
==23780== 512 bytes in 1 blocks are definitely lost in loss record 1 of 1
==23780== at 0x4A05E1C: malloc (vg_replace_malloc.c:195)
==23780== by 0x4004E9: foo (in /tmp/Ex1)
==23780== by 0x400514: main (in /tmp/Ex1)
==23780==
==23780== LEAK SUMMARY:
==23780== definitely lost: 512 bytes in 1 blocks
==23780== indirectly lost: 0 bytes in 0 blocks
==23780== possibly lost: 0 bytes in 0 blocks
==23780== still reachable: 0 bytes in 0 blocks
==23780== suppressed: 0 bytes in 0 blocks
==23780==
==23780== For counts of detected and suppressed errors, rerun with: -v
==23780== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 4 from 4)

Why using conv=notrunc when cloning a disk with dd?

If you look up on the web how to clone an entire disk to another one, you will find something like this:
dd if=/dev/sda of=/dev/sdb conv=notrunc,noerror
While I understand noerror, I am having a hard time understanding why people think that notrunc is required for "data integrity" (as ArchLinux's Wiki states, for instance).
I do agree that it matters if you are copying a partition to another partition on another disk and you do not want to overwrite the entire disk, just one partition. In this case notrunc, according to dd's manual page, is what you want.
But if you're cloning an entire disk, what does notrunc change for you? Is it just a time optimization?
TL;DR version:
notrunc is only important to prevent truncation when writing into a file. This has no effect on a block device such as sda or sdb.
Educational version
I looked into the coreutils source code which contains dd.c to see how notrunc is processed.
Here's the segment of code that I'm looking at:
int opts = (output_flags
| (conversions_mask & C_NOCREAT ? 0 : O_CREAT)
| (conversions_mask & C_EXCL ? O_EXCL : 0)
| (seek_records || (conversions_mask & C_NOTRUNC) ? 0 : O_TRUNC));
/* Open the output file with *read* access only if we might
need to read to satisfy a `seek=' request. If we can't read
the file, go ahead with write-only access; it might work. */
if ((! seek_records
|| fd_reopen (STDOUT_FILENO, output_file, O_RDWR | opts, perms) < 0)
&& (fd_reopen (STDOUT_FILENO, output_file, O_WRONLY | opts, perms) < 0))
error (EXIT_FAILURE, errno, _("opening %s"), quote (output_file));
We can see here that if notrunc is not specified (and no seek= is given), the output file will be opened with O_TRUNC. The description of O_TRUNC below, from the open(2) manual page, shows that a regular file is truncated to length 0 when it is opened this way for writing.
O_TRUNC
If the file already exists and is a regular file and the open
mode allows writing (i.e., is O_RDWR or O_WRONLY) it will be truncated
to length 0. If the file is a FIFO or terminal device file, the
O_TRUNC flag is ignored. Otherwise the effect of O_TRUNC is
unspecified.
Effects of notrunc / O_TRUNC I
In the following example, we start out by creating junk.txt of size 1024 bytes. Next, we write 512 bytes to the beginning of it with conv=notrunc. We can see that the size stays the same at 1024 bytes. Finally, we try it without the notrunc option and we can see that the new file size is 512. This is because it was opened with O_TRUNC.
$ dd if=/dev/urandom of=junk.txt bs=1024 count=1
$ ls -l junk.txt
-rw-rw-r-- 1 akyserr akyserr 1024 Dec 11 17:08 junk.txt
$ dd if=/dev/urandom of=junk.txt bs=512 count=1 conv=notrunc
$ ls -l junk.txt
-rw-rw-r-- 1 akyserr akyserr 1024 Dec 11 17:10 junk.txt
$ dd if=/dev/urandom of=junk.txt bs=512 count=1
$ ls -l junk.txt
-rw-rw-r-- 1 akyserr akyserr 512 Dec 11 17:10 junk.txt
Effects of notrunc / O_TRUNC II
I still haven't answered your original question of why conv=notrunc would be important when doing a disk-to-disk clone. According to the above definition, O_TRUNC seems to be ignored when opening certain special files, and I would expect this to be true for block device nodes too. However, I don't want to assume anything and will attempt to prove it here.
openclose.c
I've written a simple C program here which opens and closes a file given as an argument with the O_TRUNC flag.
#include <stdio.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
int main(int argc, char * argv[])
{
if (argc < 2)
{
fprintf(stderr, "Not enough arguments...\n");
return (1);
}
int f = open(argv[1], O_RDWR | O_TRUNC);
if (f >= 0)
{
fprintf(stderr, "%s was opened\n", argv[1]);
close(f);
fprintf(stderr, "%s was closed\n", argv[1]);
} else {
perror("Opening device node");
}
return (0);
}
Normal File Test
We can see below that the simple act of opening and closing a file with O_TRUNC will cause it to lose anything that was already there.
$ dd if=/dev/urandom of=junk.txt bs=1024 count=1
$ ls -l junk.txt
-rw-rw-r-- 1 akyserr akyserr 1024 Dec 11 17:26 junk.txt
$ ./openclose junk.txt
junk.txt was opened
junk.txt was closed
$ ls -l junk.txt
-rw-rw-r-- 1 akyserr akyserr 0 Dec 11 17:27 junk.txt
Block Device File Test
Let's try a similar test on a USB flash drive. We can see that we start out with a single partition on the USB flash drive. If it gets 'truncated', perhaps the partition will go away (considering it's defined in the first 512 bytes of the disk)?
$ ls -l /dev/sdc*
brw-rw---- 1 root disk 8, 32 Dec 11 17:22 /dev/sdc
brw-rw---- 1 root disk 8, 33 Dec 11 17:22 /dev/sdc1
$ sudo ./openclose /dev/sdc
/dev/sdc was opened
/dev/sdc was closed
$ sudo ./openclose /dev/sdc1
/dev/sdc1 was opened
/dev/sdc1 was closed
$ ls -l /dev/sdc*
brw-rw---- 1 root disk 8, 32 Dec 11 17:31 /dev/sdc
brw-rw---- 1 root disk 8, 33 Dec 11 17:31 /dev/sdc1
It looks like opening either the disk or the disk's partition 1 with the O_TRUNC option has no effect whatsoever. From what I can tell, the filesystem is still mountable and the files are accessible and intact.
Effects of notrunc / O_TRUNC III
Okay, for my final test I will use dd on my flash drive directly. I will start by writing 512 bytes of random data, then write 256 bytes of zeros at the beginning without conv=notrunc. Finally, we will verify that the last 256 bytes remained unchanged.
$ sudo dd if=/dev/urandom of=/dev/sdc bs=256 count=2
$ sudo hexdump -n 512 /dev/sdc
0000000 3fb6 d17f 8824 a24d 40a5 2db3 2319 ac5b
0000010 c659 5780 2d04 3c4e f985 053c 4b3d 3eba
0000020 0be9 8105 cec4 d6fb 5825 a8e5 ec58 a38e
0000030 d736 3d47 d8d3 9067 8db8 25fb 44da af0f
0000040 add7 c0f2 fc11 d734 8e26 00c6 cfbb b725
0000050 8ff7 3e79 af97 2676 b9af 1c0d fc34 5eb1
0000060 6ede 318c 6f9f 1fea d200 39fe 4591 2ffb
0000070 0464 9637 ccc5 dfcc 3b0f 5432 cdc3 5d3c
0000080 01a9 7408 a10a c3c4 caba 270c 60d0 d2f7
0000090 2f8d a402 f91a a261 587b 5609 1260 a2fc
00000a0 4205 0076 f08b b41b 4738 aa12 8008 053f
00000b0 26f0 2e08 865e 0e6a c87e fc1c 7ef6 94c6
00000c0 9ced 37cf b2e7 e7ef 1f26 0872 cd72 54a4
00000d0 3e56 e0e1 bd88 f85b 9002 c269 bfaa 64f7
00000e0 08b9 5957 aad6 a76c 5e37 7e8a f5fc d066
00000f0 8f51 e0a1 2d69 0a8e 08a9 0ecf cee5 880c
0000100 3835 ef79 0998 323d 3d4f d76b 8434 6f20
0000110 534c a847 e1e2 778c 776b 19d4 c5f1 28ab
0000120 a7dc 75ea 8a8b 032a c9d4 fa08 268f 95e8
0000130 7ff3 3cd7 0c12 4943 fd23 33f9 fe5a 98d9
0000140 aa6d 3d89 c8b4 abec 187f 5985 8e0f 58d1
0000150 8439 b539 9a45 1c13 68c2 a43c 48d2 3d1e
0000160 02ec 24a5 e016 4c2d 27be 23ee 8eee 958e
0000170 dd48 b5a1 10f1 bf8e 1391 9355 1b61 6ffa
0000180 fd37 7718 aa80 20ff 6634 9213 0be1 f85e
0000190 a77f 4238 e04d 9b64 d231 aee8 90b6 5c7f
00001a0 5088 2a3e 0201 7108 8623 b98a e962 0860
00001b0 c0eb 21b7 53c6 31de f042 ac80 20ee 94dd
00001c0 b86c f50d 55bc 32db 9920 fd74 a21e 911a
00001d0 f7db 82c2 4d16 3786 3e18 2c0f 47c2 ebb0
00001e0 75af 6a8c 2e80 c5b6 e4ea a9bc a494 7d47
00001f0 f493 8b58 0765 44c5 ff01 42a3 b153 d395
$ sudo dd if=/dev/zero of=/dev/sdc bs=256 count=1
$ sudo hexdump -n 512 /dev/sdc
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
0000100 3835 ef79 0998 323d 3d4f d76b 8434 6f20
0000110 534c a847 e1e2 778c 776b 19d4 c5f1 28ab
0000120 a7dc 75ea 8a8b 032a c9d4 fa08 268f 95e8
0000130 7ff3 3cd7 0c12 4943 fd23 33f9 fe5a 98d9
0000140 aa6d 3d89 c8b4 abec 187f 5985 8e0f 58d1
0000150 8439 b539 9a45 1c13 68c2 a43c 48d2 3d1e
0000160 02ec 24a5 e016 4c2d 27be 23ee 8eee 958e
0000170 dd48 b5a1 10f1 bf8e 1391 9355 1b61 6ffa
0000180 fd37 7718 aa80 20ff 6634 9213 0be1 f85e
0000190 a77f 4238 e04d 9b64 d231 aee8 90b6 5c7f
00001a0 5088 2a3e 0201 7108 8623 b98a e962 0860
00001b0 c0eb 21b7 53c6 31de f042 ac80 20ee 94dd
00001c0 b86c f50d 55bc 32db 9920 fd74 a21e 911a
00001d0 f7db 82c2 4d16 3786 3e18 2c0f 47c2 ebb0
00001e0 75af 6a8c 2e80 c5b6 e4ea a9bc a494 7d47
00001f0 f493 8b58 0765 44c5 ff01 42a3 b153 d395
Summary
Through the above experimentation, it seems that notrunc is only important for when you have a file you want to write into, but don't want to truncate it. This seems to have no effect on a block device such as sda or sdb.

Resources