Reverse engineer firmware image and rebuild Linux kernel for TI-AR7

I am trying to build my own Linux derivative to run on a TI-AR7 board. I took the board from an old Telekom Speedport W 501V router. To understand how firmware is flashed onto the device, I downloaded the most recent official firmware. Using the Linux file command I determined that the image is a tar archive, which can be extracted easily.
ubuntu@ip-172-31-23-210:~/reverse$ ls
fw_speedport_w501v_v_28.04.38.image
ubuntu@ip-172-31-23-210:~/reverse$ file fw*
fw_speedport_w501v_v_28.04.38.image: POSIX tar archive (GNU)
ubuntu@ip-172-31-23-210:~/reverse$ tar -xvf fw*
./var/
./var/tmp/
./var/tmp/kernel.image
./var/tmp/filesystem.image
./var/flash_update.ko
./var/flash_update.o
./var/info.txt
./var/install
./var/chksum
./var/regelex
./var/signature
ubuntu@ip-172-31-23-210:~/reverse$
According to a wiki (Firmware-Image) that I have found, ./var/tmp/kernel.image contains the actual firmware. During the update process this image is written to the mtd1 device. As stated in the wiki (LZMA-Kernel), the LZMA-compressed kernel starts with the magic number 0xfeed1281. A hexdump of kernel.image shows that number at its beginning.
ubuntu@ip-172-31-23-210:~/reverse/var/tmp$ hexdump -n 4 kernel.image
0000000 1281 feed
0000004
ubuntu@ip-172-31-23-210:~/reverse/var/tmp$
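Per the wiki's layout, which the script below decodes, the start of kernel.image holds seven little-endian 32-bit fields: magic, image length, load address, an unknown word, compressed length, decompressed length and checksum. A quick way to eyeball those fields first (plain hexdump usage; the field meanings are the wiki's, not confirmed by the vendor):
hexdump -n 32 -e '1/4 "0x%08x\n"' kernel.image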
The following script, given in the LZMA-Kernel wiki entry, should decompress the kernel.
#! /usr/bin/perl
use Compress::unLZMA;
use Archive::Zip;

open INPUT, "<$ARGV[0]" or die "can't open $ARGV[0]: $!";

# 32-byte header: magic, image length, load address, unknown word,
# compressed length, decompressed length, checksum
read INPUT, $buf, 4;
$magic = unpack("V", $buf);
if ($magic != 0xfeed1281) {
    die "bad magic";
}
read INPUT, $buf, 4;
$len = unpack("V", $buf);
read INPUT, $buf, 4*2; # address, unknown
read INPUT, $buf, 4;
$clen = unpack("V", $buf); # compressed length
read INPUT, $buf, 4;
$dlen = unpack("V", $buf); # decompressed length
read INPUT, $buf, 4;
$cksum = unpack("V", $buf); # CRC32 over the compressed payload
printf "Archive checksum: 0x%08x\n", $cksum;

# rebuild a standard LZMA header: properties byte, dictionary size,
# then the 8-byte uncompressed size
read INPUT, $buf, 1+4; # properties, dictionary size
read INPUT, $dummy, 3; # alignment
$buf .= pack('VV', $dlen, 0); # 8 bytes of real size
#$buf .= pack('VV', -1, -1); # 8 bytes of real size

read INPUT, $buf2, $clen;
$crc = Archive::Zip::computeCRC32($buf2);
printf "Input CRC32: 0x%08x\n", $crc;
if ($cksum != $crc) {
    die "wrong checksum";
}

$buf .= $buf2;
$data = Compress::unLZMA::uncompress($buf);
unless (defined $data) {
    die "uncompress: $@";
}

open OUTPUT, ">$ARGV[1]" or die "can't write $ARGV[1]: $!";
print OUTPUT $data;
#truncate OUTPUT, $dlen;
To use the script, you may need to install the Compress::unLZMA and Archive::Zip Perl modules.
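Both modules are available from CPAN, so instead of building them from tarballs as shown below, a configured cpan client should be able to install them in one go:
sudo cpan Compress::unLZMA Archive::Zip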
ubuntu@ip-172-31-23-210:~/reverse/var/tmp$ tar -xvf Compress*
Compress-unLZMA-0.04/
Compress-unLZMA-0.04/Makefile.PL
Compress-unLZMA-0.04/ppport.h
Compress-unLZMA-0.04/Changes
Compress-unLZMA-0.04/lzma_sdk/
[...]
ubuntu@ip-172-31-23-210:~/reverse/var/tmp$ cd Compress*
ubuntu@ip-172-31-23-210:~/reverse/var/tmp/Compress-unLZMA-0.04$ perl Makefile.PL
Checking if your kit is complete...
Looks good
Writing Makefile for Compress::unLZMA
Writing MYMETA.yml and MYMETA.json
ubuntu@ip-172-31-23-210:~/reverse/var/tmp/Compress-unLZMA-0.04$ make
cp lib/Compress/unLZMA.pm blib/lib/Compress/unLZMA.pm
/usr/bin/perl /usr/share/perl/5.18/ExtUtils/xsubpp -typemap /usr/share/perl/5.18/ExtUtils/typemap unLZMA.xs > unLZMA.xsc && mv unLZMA.xsc unLZMA.c
cc -c -I. -Ilzma_sdk/Source -D_REENTRANT -D_GNU_SOURCE
[...]
ubuntu@ip-172-31-23-210:~/reverse/var/tmp/Compress-unLZMA-0.04$ sudo make install
Files found in blib/arch: installing files in blib/lib into architecture dependent library tree
Installing /usr/local/lib/perl/5.18.2/auto/Compress/unLZMA/unLZMA.bs
Installing /usr/local/lib/perl/5.18.2/auto/Compress/unLZMA/unLZMA.so
Installing /usr/local/lib/perl/5.18.2/Compress/unLZMA.pm
Installing /usr/local/man/man3/Compress::unLZMA.3pm
Appending installation info to /usr/local/lib/perl/5.18.2/perllocal.pod
ubuntu@ip-172-31-23-210:~/reverse/var/tmp/Compress-unLZMA-0.04$ # same for Archive::Zip module
After installing these dependencies, the script decompressed the kernel successfully.
ubuntu@ip-172-31-23-210:~/reverse/var/tmp$ ./decompress.pl kernel.image kernel.decompressed
Archive checksum: 0x29176e12
Input CRC32: 0x29176e12
ubuntu@ip-172-31-23-210:~/reverse/var/tmp$
But what kind of file is kernel.decompressed, and how do I generate a similar file from my Linux kernel source? I continued analyzing it using file and binwalk.
ubuntu@ip-172-31-23-210:~/reverse/var/tmp$ file kernel.decompressed
kernel.decompressed: data
ubuntu@ip-172-31-23-210:~/reverse/var/tmp$ binwalk kernel.decompressed
DECIMAL HEXADECIMAL DESCRIPTION
--------------------------------------------------------------------------------
1509632 0x170900 Linux kernel version "2.6.13.1-ohio (686) (gcc version 3.4.6) #9 Wed Apr 4 13:48:08 CEST 2007"
1516240 0x1722D0 CRC32 polynomial table, little endian
1517535 0x1727DF Copyright string: "Copyright 1995-1998 Mark Adler "
1549488 0x17A4B0 Unix path: /usr/gnemul/irix/
1550920 0x17AA48 Unix path: /usr/lib/libc.so.1
1618031 0x18B06F Neighborly text, "neighbor %.2x%.2x.%.2x:%.2x:%.2x:%.2x:%.2x:%.2x lost on port %d(%s)(%s)"
1966080 0x1E0000 gzip compressed data, maximum compression, from Unix, last modified: 2007-04-04 11:45:13
ubuntu@ip-172-31-23-210:~/reverse/var/tmp$
So the Linux kernel starts at 1509632 and ends at 1516240. What kind of data is stored in front of the Linux kernel (0 to 1509632)? I extracted the kernel and that piece of unknown data using dd.
ubuntu@ip-172-31-23-210:~/reverse/var/tmp$ dd if=kernel.decompressed of=unknown.data bs=1 count=1509632
1509632+0 records in
1509632+0 records out
1509632 bytes (1.5 MB) copied, 1.62137 s, 931 kB/s
ubuntu@ip-172-31-23-210:~/reverse/var/tmp$ dd if=kernel.decompressed of=kernel bs=1 skip=1509632 count=6608
6608+0 records in
6608+0 records out
6608 bytes (6.6 kB) copied, 0.0072771 s, 908 kB/s
ubuntu@ip-172-31-23-210:~/reverse/var/tmp$
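binwalk also reported gzip-compressed data at 1966080 (0x1E0000); the same dd approach can carve it out for inspection (trailing.gz is a hypothetical name, and zcat may warn about garbage after the stream):
dd if=kernel.decompressed of=trailing.gz bs=1 skip=1966080
zcat trailing.gz | file -   # identify whatever the gzip stream contains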
I need to ask again: what kind of file is kernel, and how do I generate a similar file from my Linux kernel source? I used xxd and strings to look at the file more closely.
ubuntu@ip-172-31-23-210:~/reverse/var/tmp$ xxd -l 100 kernel
0000000: 4c69 6e75 7820 7665 7273 696f 6e20 322e Linux version 2.
0000010: 362e 3133 2e31 2d6f 6869 6f20 2836 3836 6.13.1-ohio (686
0000020: 2920 2867 6363 2076 6572 7369 6f6e 2033 ) (gcc version 3
0000030: 2e34 2e36 2920 2339 2057 6564 2041 7072 .4.6) #9 Wed Apr
0000040: 2034 2031 333a 3438 3a30 3820 4345 5354 4 13:48:08 CEST
0000050: 2032 3030 370a 0000 0000 0000 0000 0000 2007...........
0000060: 0000 0000 ....
ubuntu@ip-172-31-23-210:~/reverse/var/tmp$ strings kernel
Linux version 2.6.13.1-ohio (686) (gcc version 3.4.6) #9 Wed Apr 4 13:48:08 CEST 2007
do_be
do_bp
do_tr
do_ri
do_cpu
nmi_exception_handler
do_ade
emulate_load_store_insn
do_page_fault
context_switch
__put_task_struct
do_exit
local_bh_enable
run_workqueue
2.6.13.1-ohio gcc-3.4
enable_irq
__free_pages_ok
free_hot_cold_page
prep_new_page
kmem_cache_destroy
kmem_cache_create
pageout
vunmap_pte_range
vmap_pte_range
__vunmap
__brelse
sync_dirty_buffer
bio_endio
queue_kicked_iocb
proc_get_inode
remove_proc_entry
sysfs_get
sysfs_fill_super
kref_get
kref_put
0123456789abcdefghijklmnopqrstuvwxyz
0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ
vsnprintf
{zt^f
pw0Gm
0cIZ-
68BG+
QC]S%
v,;Zk
ubuntu@ip-172-31-23-210:~/reverse/var/tmp$
This GitHub repository contains the extracted files for further analysis.
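Regarding how to generate a similar file from kernel source: a kernel build produces an ELF image (vmlinux), and a raw binary like the extracted kernel is conventionally derived from it with objcopy. A sketch under that assumption, with a hypothetical MIPS cross toolchain prefix:
# strip the ELF container, leaving only the raw loadable sections
mips-linux-gnu-objcopy -O binary vmlinux vmlinux.bin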

Related

The Yocto bbclass folder structure misunderstanding

I have a question that is relatively simple, but probably not for me.
I created a bbclass named spi-nand-ubi.bbclass. In this file, I have a few records like
dd if=${DEPLOY_DIR_IMAGE}/${KERNEL_DEVICETREE} of=${SPINAND} bs=1k seek=512 conv=notrunc
dd if=${DEPLOY_DIR_IMAGE}/${KERNEL_IMAGETYPE} of=${SPINAND} bs=1k seek=640 conv=notrunc
The records above are fine, but the following one doesn't work:
dd if=${IMGDEPLOYDIR}/${IMAGE_NAME}${IMAGE_NAME_SUFFIX}.ubi of=${SPINAND} bs=1k seek=5504 conv=notrunc
I always get an error like
dd: failed to open '/home/mw/yocto/tmp/work/indus-poky-linux-gnueabi/console-image/1.0-r0/deploy-console-image-image-complete/console-image-indus-20230120084720.rootfs.ubi': No such file or directory
When I manually cd into
cd /home/mw/yocto/tmp/work/indus-poky-linux-gnueabi/console-image/1.0-r0/deploy-console-image-image-complete
the file with the provided name exists.
What did I miss or misunderstand? Please give me a hint on how to resolve this issue.
Thanks
directory content of
/home/mw/yocto/tmp/work/indus-poky-linux-gnueabi/console-image/1.0-r0/deploy-console-image-image-complete
console-image-indus-20230120082905.rootfs.ext4
console-image-indus-20230120082905.rootfs.jffs2
console-image-indus-20230120082905.rootfs.manifest
console-image-indus-20230120082905.rootfs.sunxi-spinand
console-image-indus-20230120082905.rootfs.tar
console-image-indus-20230120082905.rootfs.tar.gz
console-image-indus-20230120082905.rootfs.tar.xz
console-image-indus-20230120082905.rootfs.ubi
console-image-indus-20230120082905.rootfs.ubifs
console-image-indus-20230120082905.testdata.json
console-image-indus-20230120084720.rootfs.sunxi-spinand
console-image-indus-20230120084720.rootfs.tar
console-image-indus-20230120084720.rootfs.tar.gz
console-image-indus-20230120084720.rootfs.tar.xz
console-image-indus.ext4
console-image-indus.jffs2
console-image-indus.manifest
console-image-indus.testdata.json
console-image-indus.ubi
console-image-indus.ubifs
ubinize-console-image-indus-20230120082905.cfg
the content of my spi-nand-ubi.bbclass
inherit image_types
#
# Create an image that can be written into a SPI NAND (128 MBytes) flash using the dd application.
# Written for the Indus board to simplify the programming process and to write only one combined image file.
#
# The image layout (valid for 128 MBytes flashes) is:
#
# OFFSET PARTITION SIZE PARTITION
# 0 -> 524288(0x80000) - SPL U-Boot with NAND offset
# 512*1024 -> 131072(0x20000) - The dtb file
# 640*1024 -> 4980736(0x4C0000) - Kernel
# 5504*1024 -> 128581632(0x7AA0000) - Ubifs rootfs (*.ubi for NAND)
#
# SUM of all partition sizes should give the whole flash memory size,
# SUM = 0x80000+0x20000+0x4C0000+0x7AA0000 = 0x8000000 (134217728 = 128 MBytes)
#
# Before changing partition offsets here, do it first for the U-Boot DTS, defconfig and kernel DTS
# This image depends on the rootfs image
RDEPENDS_mtd-utils-tests += "bash"
SPINAND_ROOTFS_TYPE ?= "ubifs"
IMAGE_TYPEDEP_sunxi-spinand= "${SPINAND_ROOTFS_TYPE}"
do_image_sunxi_spinand[depends] += " \
mtools-native:do_populate_sysroot \
dosfstools-native:do_populate_sysroot \
virtual/kernel:do_deploy \
virtual/bootloader:do_deploy \
"
# The NAND SPI Flash image name
SPINAND = "${IMGDEPLOYDIR}/${IMAGE_NAME}.rootfs.sunxi-spinand "
IMAGE_CMD:sunxi-spinand () {
${DEPLOY_DIR_IMAGE}/mknanduboot.sh ${DEPLOY_DIR_IMAGE}/${SPL_BINARY} ${DEPLOY_DIR_IMAGE}/${NAND_SPL_BINARY}
dd if=/dev/zero of=${SPINAND} bs=1M count=16
dd if=${DEPLOY_DIR_IMAGE}/${NAND_SPL_BINARY} of=${SPINAND} bs=1k conv=notrunc
dd if=${DEPLOY_DIR_IMAGE}/${KERNEL_DEVICETREE} of=${SPINAND} bs=1k seek=512 conv=notrunc
dd if=${DEPLOY_DIR_IMAGE}/${KERNEL_IMAGETYPE} of=${SPINAND} bs=1k seek=640 conv=notrunc
mkfs.ubifs -F -r ${WORKDIR}/rootfs -m 2048 -e 126976 -c 2048 -o ${DEPLOY_DIR_IMAGE}/rootfs.ubifs
#ubinize -vv -o ${DEPLOY_DIR_IMAGE}/rootfs.ubi -m 2048 -p 131072 -s 2048 ${DEPLOY_DIR_IMAGE}/config.ini
#ubinize -vv -o ${DEPLOY_DIR_IMAGE}/rootfs.ubi -m 2048 -p 131072 -s 2048 ${DEPLOY_DIR_IMAGE}/ubinize${vname}-${IMAGE_NAME}.cfg
dd if=${IMGDEPLOYDIR}/${IMAGE_NAME}${IMAGE_NAME_SUFFIX}.ubi of=${SPINAND} bs=1k seek=5504 conv=notrunc
#dd if=${DEPLOY_DIR_IMAGE}/rootfs.ubi of=${SPINAND} bs=1k seek=5504 conv=notrunc
}
I did a few experiments and I think I found the reason. The problem is that the ubi and ubifs images are generated later than the recipe which uses my bbclass runs, which is why it doesn't see the demanded files. The question is: what should I add to wait for ubi and ubifs to be finalized?
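A hedged sketch of one direction to explore (an assumption based on how image_types works, not a verified fix): IMAGE_TYPEDEP declares that another image type must be built before yours, so declaring a dependency on the ubi type, whose artifact the failing dd line reads, should order its generation before the sunxi-spinand image is assembled:
# assumption: the failing dd consumes the .ubi artifact, so declare
# the ubi image type (not only ubifs) as a dependency
IMAGE_TYPEDEP:sunxi-spinand = "ubi ubifs"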

how to extract AT_EXECFN from a coredump file

I need to extract the value of AT_EXECFN from a coredump file. For those who don't know, this value is part of the auxiliary vector. It is really simple to get this value in your running process: just use getauxval. However, I couldn't find any information on how to do it when you have an ELF coredump file. I can find the NT_AUXV section of this file, but I can't figure out how to find the exact string that AT_EXECFN points to. Say I found AT_EXECFN in the coredump. According to this, I'm going to get a struct with an address where the actual value is stored. My question is: how do I find this address in the coredump file?
Here is an example:
int main() { abort(); }
gcc -w -O2 t.c && ulimit -c unlimited && ./a.out
Aborted (core dumped)
First we can look with GDB:
gdb -q ./a.out core
Reading symbols from ./a.out...
(No debugging symbols found in ./a.out)
[New LWP 86]
Core was generated by `./a.out'.
Program terminated with signal SIGABRT, Aborted.
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) info auxv
33 AT_SYSINFO_EHDR System-supplied DSO's ELF header 0x7ffd4037a000
16 AT_HWCAP Machine-dependent CPU capability hints 0x178bfbff
...
31 AT_EXECFN File name of executable 0x7ffd40374ff0 "./a.out"
15 AT_PLATFORM String identifying platform 0x7ffd403739c9 "x86_64"
0 AT_NULL End of vector 0x0
This proves that the info is in fact present in the core, and also shows what values to expect.
The steps therefore are:
read / decode the NT_AUXV note until you find an entry with .a_type == AT_EXECFN (31), and take the pointer to the string ($addr; here you would find 0x7ffd40374ff0).
Using eu-readelf -n core is helpful to verify that you are reading expected values. Here is the output:
CORE 320 AUXV
SYSINFO_EHDR: 0x7ffd4037a000
HWCAP: 0x178bfbff <fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht>
...
EXECFN: 0x7ffd40374ff0
PLATFORM: 0x7ffd403739c9
NULL
iterate over all program headers with .p_type == PT_LOAD until you find one with .p_vaddr <= $addr and $addr < .p_vaddr + .p_memsz (that is, a LOAD segment which "covers" the desired address). In the above case, that's this entry:
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
...
LOAD 0x016000 0x00007ffd40354000 0x0000000000000000 0x021000 0x021000 RW 0x1000
...
finally, you can find the location of the string in the core file at offset .p_offset + $addr - .p_vaddr. Using the above numbers, we expect the string at offset 0x016000 + 0x7ffd40374ff0 - 0x00007ffd40354000 = 225264 bytes into the file.
And we do find it there:
dd status=none bs=1 skip=225264 count=10 if=core | xxd -g1
00000000: 2e 2f 61 2e 6f 75 74 00 00 00 ./a.out...
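The three steps can be strung together with stock tools. A hedged sketch (gdb for the auxv value, binutils readelf plus GNU awk's strtonum for the segment math; file names a.out and core are the ones from this example):
#!/bin/bash
# 1. the address AT_EXECFN points at, taken from gdb's auxv decoder
addr=$(gdb -q -batch -ex 'info auxv' ./a.out core |
       awk '$2 == "AT_EXECFN" { print $(NF-1) }')
# 2. the PT_LOAD segment covering that address gives the file offset
off=$(readelf -Wl core | awk -v a="$addr" '
    $1 == "LOAD" && strtonum(a) >= strtonum($3) &&
    strtonum(a) <  strtonum($3) + strtonum($6) {
        print strtonum($2) + strtonum(a) - strtonum($3); exit }')
# 3. dump the bytes found at that offset
dd status=none bs=1 skip="$off" count=16 if=core | xxd -g1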

build-id data offset in the ELF file

I need to modify the build-id of the ELF notes section. I found out that it is possible here. I also found out that I can do it by modifying this code. What I can't figure out is the data location. Here is what I'm talking about.
$ eu-readelf -S myelffile
Section Headers:
[Nr] Name Type Addr Off Size ES Flags Lk Inf Al
...
[ 2] .note.ABI-tag NOTE 000000000000028c 0000028c 00000020 0 A 0 0 4
[ 3] .note.gnu.build-id NOTE 00000000000002ac 000002ac 00000024 0 A 0 0 4
...
$ eu-readelf -n myelffile
Note section [ 2] '.note.ABI-tag' of 32 bytes at offset 0x28c:
Owner Data size Type
GNU 16 GNU_ABI_TAG
OS: Linux, ABI: 3.14.0
Note section [ 3] '.note.gnu.build-id' of 36 bytes at offset 0x2ac:
Owner Data size Type
GNU 20 GNU_BUILD_ID
Build ID: d75a086c288c582036b0562908304bc3a8033235
.note.gnu.build-id section is 36 bytes. The build id is 20 bytes. What are the other 16 bytes?
I played with the code a bit and read 36 bytes of myelffile at offset 0x2ac. I got the following: 040000001400000003000000474e5500d75a086c288c582036b0562908304bc3a8033235.
Then I decided to use the Elf64_Shdr definition, so I read data at address 0x2ac + sizeof(Elf64_Shdr.sh_name) + sizeof(Elf64_Shdr.sh_type) + sizeof(Elf64_Shdr.sh_flags) and I got my build id, d75a086c288c582036b0562908304bc3a8033235. It does make sense why I got it: sizeof(Elf64_Shdr.sh_name) + sizeof(Elf64_Shdr.sh_type) + sizeof(Elf64_Shdr.sh_flags) = 16 bytes. But according to the Elf64_Shdr definition I should then be pointing to Elf64_Addr sh_addr, i.e. the section's virtual address.
So what is not clear to me is: what are the other 16 bytes of the section? What do they represent? I can't reconcile the Elf64_Shdr definition with the results I'm getting from my experiments.
.note.gnu.build-id section is 36 bytes. The build id is 20 bytes. What are the other 16 bytes?
Each .note.* section starts with Elf64_Nhdr (12 bytes), followed by (4-byte aligned) note name of variable size (GNU\0 here), followed by (4-byte aligned) actual note data. Documentation.
Looking at /bin/date on my system:
eu-readelf -Wn /bin/date
Note section [ 2] '.note.ABI-tag' of 32 bytes at offset 0x2c4:
Owner Data size Type
GNU 16 GNU_ABI_TAG
OS: Linux, ABI: 3.2.0
Note section [ 3] '.note.gnu.build-id' of 36 bytes at offset 0x2e4:
Owner Data size Type
GNU 20 GNU_BUILD_ID
Build ID: 979ae4616ae71af565b123da2f994f4261748cc9
What are the bytes at offset 0x2e4?
dd bs=1 skip=$((0x2e4)) count=36 < /bin/date | xxd
00000000: 0400 0000 1400 0000 0300 0000 474e 5500 ............GNU.
00000010: 979a e461 6ae7 1af5 65b1 23da 2f99 4f42 ...aj...e.#./.OB
00000020: 6174 8cc9 at..
So we have: .n_namesz == 4, .n_descsz == 20, .n_type == 3 == NT_GNU_BUILD_ID, followed by 4-byte GNU\0 note name, followed by 20 bytes of actual build-id bytes 0x97, 0x9a, etc.
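Since the note data starts 16 bytes into the section (the 12-byte Elf64_Nhdr plus the 4-byte GNU\0 name), a build-id can be patched in place once the section offset is known. A hedged sketch using the /bin/date offsets from above, writing zeros purely for illustration:
cp /bin/date ./patched
# overwrite the 20 build-id bytes at section offset 0x2e4 + 16
dd if=/dev/zero of=./patched bs=1 seek=$((0x2e4 + 16)) count=20 conv=notrunc
eu-readelf -n ./patched | grep 'Build ID'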

Problem related to the Zip64 extension: directing standard input content into a zip archive using zip, and using zipnote to change the zipped file name

I have an annoying problem with the zip and zipnote programs (both in version 3.0) on my Debian stable platform.
I wish to create a zip archive storing (not compressing) data from standard input, without extra attributes/fields, and giving a name to the resulting file inside the zip file.
My first try was
printf "foodata" | zip -X0 bar.zip -
printf "# -\n#=foofile\n" | zipnote -w bar.zip
where zip creates a bar.zip archive with a stored file "-" containing "foodata", and zipnote renames the file from "-" to "foofile".
First problem (solved): zip, as we can see from zipdetails
001E Filename '-'
001F Extra ID #0001 0001 'ZIP64'
0021 Length 0010
0023 Uncompressed Size 0000000000000007
002B Compressed Size 0000000000000007
receiving data from standard input, doesn't know the size of the resulting file, so it creates a PKZIP 4.5 compatible zip archive (one that can exceed 4 GB) using the Zip64 extension and adds a Zip64 extra field to the file.
And the -X option removes extra file attributes but doesn't remove the Zip64 extra field.
This problem is easily solved by adding the -fz- option, as stated in the zip man page:
// .................................VVVV
printf "foodata" | zip -X0 -fz- bar.zip -
Now bar.zip is a PKZIP 2 compatible file and there isn't a Zip64 extra field.
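A quick check that the extra field is really gone (zipdetails labels Zip64 records explicitly, so no output is expected here):
zipdetails bar.zip | grep -i zip64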
Second problem (not solved): zipnote changes the name of the contained file but adds the Zip64 field to it.
I don't know why.
According to the zip man page,
zip removes the Zip64 extensions if not needed when archive entries are copied (see the -U (--copy) option).
So I understand that
zip bar.zip --out bar-corrected.zip
should create a new bar-corrected.zip archive where the file foofile is Zip64-free (foofile is very short, so the Zip64 extension isn't needed, I presume).
Unfortunately, this doesn't work: I get the warning
copying: foofile
zip warning: Local Version Needed To Extract does not match CD: foofile
and the resulting file keeps the Zip64 extension.
And it seems it doesn't work specifying the filename explicitly or adding the -fz- option either: I've tried a lot of combinations but (maybe it is my fault) without success.
Questions:
(1) Can I avoid (and how) zipnote adding the Zip64 fields to a file when it changes the file's name?
(2) Otherwise, how can I use zip (with --copy? with -fz-?) to create a new zip archive free of the Zip64 extension?
[Edit: Updated to use Store rather than Deflate]
Not sure how to achieve what you want with zip and zipnote, but here is an alternative.
echo abc | perl -MIO::Compress::Zip=zip -e ' zip "-" => "out.zip", Method => 0, Name => "member.txt" '
$ unzip -lv out.zip
Archive: out.zip
Length Method Size Cmpr Date Time CRC-32 Name
-------- ------ ------- ---- ---------- ----- -------- ----
4 Stored 4 0% 2019-10-10 21:54 4788814e member.txt
-------- ------- --- -------
4 4 0% 1 file
No Zip64 or extra attributes are present in the zip file.
$ zipdetails out.zip
0000 LOCAL HEADER #1 04034B50
0004 Extract Zip Spec 14 '2.0'
0005 Extract OS 00 'MS-DOS'
0006 General Purpose Flag 0008
[Bit 3] 1 'Streamed'
0008 Compression Method 0000 'Stored'
000A Last Mod Time 4F4AAECA 'Thu Oct 10 21:54:20 2019'
000E CRC 00000000
0012 Compressed Length 00000000
0016 Uncompressed Length 00000000
001A Filename Length 000A
001C Extra Length 0000
001E Filename 'member.txt'
0028 PAYLOAD abc.
002C STREAMING DATA HEADER 08074B50
0030 CRC 4788814E
0034 Compressed Length 00000004
0038 Uncompressed Length 00000004
003C CENTRAL HEADER #1 02014B50
0040 Created Zip Spec 14 '2.0'
0041 Created OS 03 'Unix'
0042 Extract Zip Spec 14 '2.0'
0043 Extract OS 00 'MS-DOS'
0044 General Purpose Flag 0008
[Bit 3] 1 'Streamed'
0046 Compression Method 0000 'Stored'
0048 Last Mod Time 4F4AAECA 'Thu Oct 10 21:54:20 2019'
004C CRC 4788814E
0050 Compressed Length 00000004
0054 Uncompressed Length 00000004
0058 Filename Length 000A
005A Extra Length 0000
005C Comment Length 0000
005E Disk Start 0000
0060 Int File Attributes 0000
[Bit 0] 0 'Binary Data'
0062 Ext File Attributes 81A40000
0066 Local Header Offset 00000000
006A Filename 'member.txt'
0074 END CENTRAL HEADER 06054B50
0078 Number of this disk 0000
007A Central Dir Disk no 0000
007C Entries in this disk 0001
007E Total Entries 0001
0080 Size of Central Dir 00000038
0084 Offset to Central Dir 0000003C
0088 Comment Length 0000
Done
I created a simple script that wraps the previous answer. See streamzip.
Usage is
printf "foodata" | streamzip -method=store -member-name=foofile -zipfile=/tmp/bar.zip
This is what unzip thinks is in the zip file
unzip -lv /tmp/bar.zip
Archive: /tmp/bar.zip
Length Method Size Cmpr Date Time CRC-32 Name
-------- ------ ------- ---- ---------- ----- -------- ----
7 Stored 7 0% 2019-10-15 20:25 1b7dd7cd foofile
-------- ------- --- -------
7 7 0% 1 file

Why use conv=notrunc when cloning a disk with dd?

If you look up how to clone an entire disk to another one on the web, you will find something like this:
dd if=/dev/sda of=/dev/sdb conv=notrunc,noerror
While I understand the noerror, I am having a hard time understanding why people think that notrunc is required for "data integrity" (as ArchLinux's Wiki states, for instance).
Indeed, I do agree with that if you are copying a partition to another partition on another disk and you do not want to overwrite the entire disk, just one partition. In this case notrunc is, according to dd's manual page, what you want.
But if you're cloning an entire disk, what does notrunc change for you? Just a time optimization?
TL;DR version:
notrunc is only important to prevent truncation when writing into a file. This has no effect on a block device such as sda or sdb.
Educational version
I looked into the coreutils source code which contains dd.c to see how notrunc is processed.
Here's the segment of code that I'm looking at:
int opts = (output_flags
            | (conversions_mask & C_NOCREAT ? 0 : O_CREAT)
            | (conversions_mask & C_EXCL ? O_EXCL : 0)
            | (seek_records || (conversions_mask & C_NOTRUNC) ? 0 : O_TRUNC));

/* Open the output file with *read* access only if we might
   need to read to satisfy a `seek=' request.  If we can't read
   the file, go ahead with write-only access; it might work.  */
if ((! seek_records
     || fd_reopen (STDOUT_FILENO, output_file, O_RDWR | opts, perms) < 0)
    && (fd_reopen (STDOUT_FILENO, output_file, O_WRONLY | opts, perms) < 0))
  error (EXIT_FAILURE, errno, _("opening %s"), quote (output_file));
We can see here that if notrunc is not specified, then the output file will be opened with O_TRUNC. Looking below at how O_TRUNC is treated, we can see that a normal file will get truncated if written into.
O_TRUNC
If the file already exists and is a regular file and the open
mode allows writing (i.e., is O_RDWR or O_WRONLY) it will be truncated
to length 0. If the file is a FIFO or terminal device file, the
O_TRUNC flag is ignored. Otherwise the effect of O_TRUNC is
unspecified.
Effects of notrunc / O_TRUNC I
In the following example, we start out by creating junk.txt of size 1024 bytes. Next, we write 512 bytes to the beginning of it with conv=notrunc. We can see that the size stays the same at 1024 bytes. Finally, we try it without the notrunc option and we can see that the new file size is 512. This is because it was opened with O_TRUNC.
$ dd if=/dev/urandom of=junk.txt bs=1024 count=1
$ ls -l junk.txt
-rw-rw-r-- 1 akyserr akyserr 1024 Dec 11 17:08 junk.txt
$ dd if=/dev/urandom of=junk.txt bs=512 count=1 conv=notrunc
$ ls -l junk.txt
-rw-rw-r-- 1 akyserr akyserr 1024 Dec 11 17:10 junk.txt
$ dd if=/dev/urandom of=junk.txt bs=512 count=1
$ ls -l junk.txt
-rw-rw-r-- 1 akyserr akyserr 512 Dec 11 17:10 junk.txt
Effects of notrunc / O_TRUNC II
I still haven't answered your original question of why conv=notrunc is important when doing a disk-to-disk clone. According to the above definition, O_TRUNC seems to be ignored when opening certain special files, and I would expect this to be true for block device nodes too. However, I don't want to assume anything and will attempt to prove it here.
openclose.c
I've written a simple C program here which opens and closes a file given as an argument with the O_TRUNC flag.
#include <stdio.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char * argv[])
{
    if (argc < 2)
    {
        fprintf(stderr, "Not enough arguments...\n");
        return (1);
    }

    /* open the given path with O_TRUNC, then immediately close it */
    int f = open(argv[1], O_RDWR | O_TRUNC);
    if (f >= 0)
    {
        fprintf(stderr, "%s was opened\n", argv[1]);
        close(f);
        fprintf(stderr, "%s was closed\n", argv[1]);
    } else {
        perror("Opening device node");
    }

    return (0);
}
Normal File Test
We can see below that the simple act of opening and closing a file with O_TRUNC will cause it to lose anything that was already there.
$ dd if=/dev/urandom of=junk.txt bs=1024 count=1
$ ls -l junk.txt
-rw-rw-r-- 1 akyserr akyserr 1024 Dec 11 17:26 junk.txt
$ ./openclose junk.txt
junk.txt was opened
junk.txt was closed
$ ls -l junk.txt
-rw-rw-r-- 1 akyserr akyserr 0 Dec 11 17:27 junk.txt
Block Device File Test
Let's try a similar test on a USB flash drive. We can see below that we start out with a single partition on the USB flash drive. If it gets 'truncated', perhaps the partition will go away (considering it's defined in the first 512 bytes of the disk)?
$ ls -l /dev/sdc*
brw-rw---- 1 root disk 8, 32 Dec 11 17:22 /dev/sdc
brw-rw---- 1 root disk 8, 33 Dec 11 17:22 /dev/sdc1
$ sudo ./openclose /dev/sdc
/dev/sdc was opened
/dev/sdc was closed
$ sudo ./openclose /dev/sdc1
/dev/sdc1 was opened
/dev/sdc1 was closed
$ ls -l /dev/sdc*
brw-rw---- 1 root disk 8, 32 Dec 11 17:31 /dev/sdc
brw-rw---- 1 root disk 8, 33 Dec 11 17:31 /dev/sdc1
It looks like it has no effect whatsoever to open either the disk or the disk's partition 1 with the O_TRUNC option. From what I can tell, the filesystem is still mountable and the files are accessible and intact.
Effects of notrunc / O_TRUNC III
Okay, for my final test I will use dd on my flash drive directly. I will start by writing 512 bytes of random data, then write 256 bytes of zeros at the beginning. Finally, we will verify that the last 256 bytes remained unchanged.
$ sudo dd if=/dev/urandom of=/dev/sdc bs=256 count=2
$ sudo hexdump -n 512 /dev/sdc
0000000 3fb6 d17f 8824 a24d 40a5 2db3 2319 ac5b
0000010 c659 5780 2d04 3c4e f985 053c 4b3d 3eba
0000020 0be9 8105 cec4 d6fb 5825 a8e5 ec58 a38e
0000030 d736 3d47 d8d3 9067 8db8 25fb 44da af0f
0000040 add7 c0f2 fc11 d734 8e26 00c6 cfbb b725
0000050 8ff7 3e79 af97 2676 b9af 1c0d fc34 5eb1
0000060 6ede 318c 6f9f 1fea d200 39fe 4591 2ffb
0000070 0464 9637 ccc5 dfcc 3b0f 5432 cdc3 5d3c
0000080 01a9 7408 a10a c3c4 caba 270c 60d0 d2f7
0000090 2f8d a402 f91a a261 587b 5609 1260 a2fc
00000a0 4205 0076 f08b b41b 4738 aa12 8008 053f
00000b0 26f0 2e08 865e 0e6a c87e fc1c 7ef6 94c6
00000c0 9ced 37cf b2e7 e7ef 1f26 0872 cd72 54a4
00000d0 3e56 e0e1 bd88 f85b 9002 c269 bfaa 64f7
00000e0 08b9 5957 aad6 a76c 5e37 7e8a f5fc d066
00000f0 8f51 e0a1 2d69 0a8e 08a9 0ecf cee5 880c
0000100 3835 ef79 0998 323d 3d4f d76b 8434 6f20
0000110 534c a847 e1e2 778c 776b 19d4 c5f1 28ab
0000120 a7dc 75ea 8a8b 032a c9d4 fa08 268f 95e8
0000130 7ff3 3cd7 0c12 4943 fd23 33f9 fe5a 98d9
0000140 aa6d 3d89 c8b4 abec 187f 5985 8e0f 58d1
0000150 8439 b539 9a45 1c13 68c2 a43c 48d2 3d1e
0000160 02ec 24a5 e016 4c2d 27be 23ee 8eee 958e
0000170 dd48 b5a1 10f1 bf8e 1391 9355 1b61 6ffa
0000180 fd37 7718 aa80 20ff 6634 9213 0be1 f85e
0000190 a77f 4238 e04d 9b64 d231 aee8 90b6 5c7f
00001a0 5088 2a3e 0201 7108 8623 b98a e962 0860
00001b0 c0eb 21b7 53c6 31de f042 ac80 20ee 94dd
00001c0 b86c f50d 55bc 32db 9920 fd74 a21e 911a
00001d0 f7db 82c2 4d16 3786 3e18 2c0f 47c2 ebb0
00001e0 75af 6a8c 2e80 c5b6 e4ea a9bc a494 7d47
00001f0 f493 8b58 0765 44c5 ff01 42a3 b153 d395
$ sudo dd if=/dev/zero of=/dev/sdc bs=256 count=1
$ sudo hexdump -n 512 /dev/sdc
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
0000100 3835 ef79 0998 323d 3d4f d76b 8434 6f20
0000110 534c a847 e1e2 778c 776b 19d4 c5f1 28ab
0000120 a7dc 75ea 8a8b 032a c9d4 fa08 268f 95e8
0000130 7ff3 3cd7 0c12 4943 fd23 33f9 fe5a 98d9
0000140 aa6d 3d89 c8b4 abec 187f 5985 8e0f 58d1
0000150 8439 b539 9a45 1c13 68c2 a43c 48d2 3d1e
0000160 02ec 24a5 e016 4c2d 27be 23ee 8eee 958e
0000170 dd48 b5a1 10f1 bf8e 1391 9355 1b61 6ffa
0000180 fd37 7718 aa80 20ff 6634 9213 0be1 f85e
0000190 a77f 4238 e04d 9b64 d231 aee8 90b6 5c7f
00001a0 5088 2a3e 0201 7108 8623 b98a e962 0860
00001b0 c0eb 21b7 53c6 31de f042 ac80 20ee 94dd
00001c0 b86c f50d 55bc 32db 9920 fd74 a21e 911a
00001d0 f7db 82c2 4d16 3786 3e18 2c0f 47c2 ebb0
00001e0 75af 6a8c 2e80 c5b6 e4ea a9bc a494 7d47
00001f0 f493 8b58 0765 44c5 ff01 42a3 b153 d395
Summary
Through the above experimentation, it seems that notrunc is only important when you have a file you want to write into but don't want to truncate. It has no effect on a block device such as sda or sdb.
