Extracting from bin file - squashfs

So I tried this:
root#kali:~/Desktop/fmk# binwalk upgrade-2.4.0.bin
DECIMAL HEXADECIMAL DESCRIPTION
--------------------------------------------------------------------------------
512 0x200 LZMA compressed data, properties: 0x6D, dictionary size: 8388608 bytes, uncompressed size: 2805816 bytes
927576 0xE2758 Squashfs filesystem, little endian, version 4.0, compression:xz, size: 12316692 bytes, 2963 inodes, blocksize: 262144 bytes, created: 2015-08-04 02:40:49
And then I used the following dd:
sudo dd if=upgrade-2.4.0.bin of=pineapple.squashfs bs=1 count=12316692
And I can't unsquashfs pineapple.squashfs.
Can't find a SQUASHFS superblock on pineapple.squashfs

You have to set the offset where the squashfs is
Usage: dd [OPERAND]...
or: dd OPTION
Copy a file, converting and formatting according to the operands.
bs=BYTES read and write up to BYTES bytes at a time
cbs=BYTES convert BYTES bytes at a time
conv=CONVS convert the file as per the comma separated symbol list
count=N copy only N input blocks
ibs=BYTES read up to BYTES bytes at a time (default: 512)
if=FILE read from FILE instead of stdin
iflag=FLAGS read as per the comma separated symbol list
obs=BYTES write BYTES bytes at a time (default: 512)
of=FILE write to FILE instead of stdout
oflag=FLAGS write as per the comma separated symbol list
seek=N skip N obs-sized blocks at start of output
skip=N skip N ibs-sized blocks at start of input
status=LEVEL The LEVEL of information to print to stderr;
'none' suppresses everything but error messages,
'noxfer' suppresses the final transfer statistics,
'progress' shows periodic transfer statistics
...
So, to extract the filesystem
dd if=upgrade-2.4.0.bin of=pineapple.squashfs bs=1 skip=927576

I did it with:
binwalk -Me upgrade-2.4.0.bin

Related

MMLS (Sleuth Kit) not working in some situations using DCFLDD

I am experiencing some issues when using mmls command after having created an image with dcfldd/guymager in some particular situations. Usually this approach seems to be working fine to create physical images of devices, but with some USBs (working fine and undamaged) I manage to create the .dd disk image file, but then it won't be opened by mmls, nor fsstat.
fls does open the file system structure, but it seems like it won't show me any unallocated files just as if this was a logical image.
This is the command run to create a disk image using dcfldd:
sudo dcfldd if=/dev/sda hash=sha256 hashlog=usb.sha256hash of=./usb.dd bs=512 conv=noerror,sync,notrunc
Also, this is the output of usb.info, generated by guymager:
GUYMAGER ACQUISITION INFO FILE
==============================
Guymager
========
Version : 0.8.13-1
Version timestamp : 2022-05-11-00.00.00 UTC
Compiled with : gcc 12.1.1 20220507 (Red Hat 12.1.1-1)
libewf version : 20140812 (not used as Guymager is configured to use its own EWF module)
libguytools version: 2.0.2
Host name : lucafedora
Domain name : (none)
System : Linux lucafedora 6.1.7-100.fc36.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Jan 18 18:37:43 UTC 2023 x86_64
Device information
==================
Command executed: bash -c "search="`basename /dev/sda`: H..t P.......d A..a de.....d" && dmesg | grep -A3 "$search" || echo "No kernel HPA messages for /dev/sda""
Information returned:
----------------------------------------------------------------------------------------------------
No kernel HPA messages for /dev/sda
Command executed: bash -c "smartctl -s on /dev/sda ; smartctl -a /dev/sda"
Information returned:
----------------------------------------------------------------------------------------------------
/usr/bin/bash: line 1: smartctl: command not found
/usr/bin/bash: line 1: smartctl: command not found
Command executed: bash -c "hdparm -I /dev/sda"
Information returned:
----------------------------------------------------------------------------------------------------
/usr/bin/bash: line 1: hdparm: command not found
Command executed: bash -c "CIDFILE=/sys/block/$(basename /dev/sda)/device/cid; echo -n "CID: " ; if [ -e $CIDFILE ] ; then cat $CIDFILE ; else echo "not available" ; fi "
Information returned:
----------------------------------------------------------------------------------------------------
CID: not available
Hidden areas: unknown
Acquisition
===========
Linux device : /dev/sda
Device size : 8053063680 (8.1GB)
Format : Linux dd raw image - file extension is .dd
Image path and file name: /home/HOMEDIR/case_usb/usb.dd
Info path and file name: /home/HOMEDIR/case_usb/usb.info
Hash calculation : SHA-256
Source verification : on
Image verification : on
No bad sectors encountered during acquisition.
No bad sectors encountered during verification.
State: Finished successfully
MD5 hash : --
MD5 hash verified source : --
MD5 hash verified image : --
SHA1 hash : --
SHA1 hash verified source : --
SHA1 hash verified image : --
SHA256 hash : 7285a8b0a2b472a8f120c4ca4308a94a3aaa3e308a1dd86e3670041b07c27e76
SHA256 hash verified source: 7285a8b0a2b472a8f120c4ca4308a94a3aaa3e308a1dd86e3670041b07c27e76
SHA256 hash verified image : 7285a8b0a2b472a8f120c4ca4308a94a3aaa3e308a1dd86e3670041b07c27e76
Source verification OK. The device delivered the same data during acquisition and verification.
Image verification OK. The image contains exactely the data that was written.
Acquisition started : 2023-01-28 12:27:07 (ISO format YYYY-MM-DD HH:MM:SS)
Verification started: 2023-01-28 12:30:11
Ended : 2023-01-28 12:35:24 (0 hours, 8 minutes and 16 seconds)
Acquisition speed : 41.97 MByte/s (0 hours, 3 minutes and 3 seconds)
Verification speed : 24.62 MByte/s (0 hours, 5 minutes and 12 seconds)
Generated image files and their MD5 hashes
==========================================
No MD5 hashes available (configuration parameter CalcImageFileMD5 is off)
MD5 Image file
n/a usb.dd
Worth to mention that when mmls is run against usb.dd it produces no output whatsoever. I have to forcefully add -v option for it to spit out this kind of information:
tsk_img_open: Type: 0 NumImg: 1 Img1: usb.dd
aff_open: Error determining type of file: usb.dd
aff_open: Success
Error opening vmdk file
Error checking file signature for vhd file
tsk_img_findFiles: usb.dd found
tsk_img_findFiles: 1 total segments found
raw_open: segment: 0 size: 8053063680 max offset: 8053063680 path: usb.dd
dos_load_prim: Table Sector: 0
raw_read: byte offset: 0 len: 65536
raw_read: found in image 0 relative offset: 0 len: 65536
raw_read_segment: opening file into slot 0: usb.dd
dos_load_prim_table: Testing FAT/NTFS conditions
dos_load_prim_table: MSDOS OEM name exists
bsd_load_table: Table Sector: 1
gpt_load_table: Sector: 1
gpt_open: Trying other sector sizes
gpt_open: Trying sector size: 512
gpt_load_table: Sector: 1
gpt_open: Trying sector size: 1024
gpt_load_table: Sector: 1
gpt_open: Trying sector size: 2048
gpt_load_table: Sector: 1
gpt_open: Trying sector size: 4096
gpt_load_table: Sector: 1
gpt_open: Trying sector size: 8192
gpt_load_table: Sector: 1
gpt_open: Trying secondary table
gpt_load_table: Sector: 15728639
raw_read: byte offset: 8053063168 len: 512
raw_read: found in image 0 relative offset: 8053063168 len: 512
gpt_open: Trying secondary table sector size: 512
gpt_load_table: Sector: 15728639
gpt_open: Trying secondary table sector size: 1024
gpt_load_table: Sector: 7864319
raw_read: byte offset: 8053062656 len: 1024
raw_read: found in image 0 relative offset: 8053062656 len: 1024
gpt_open: Trying secondary table sector size: 2048
gpt_load_table: Sector: 3932159
raw_read: byte offset: 8053061632 len: 2048
raw_read: found in image 0 relative offset: 8053061632 len: 2048
gpt_open: Trying secondary table sector size: 4096
gpt_load_table: Sector: 1966079
raw_read: byte offset: 8053059584 len: 4096
raw_read: found in image 0 relative offset: 8053059584 len: 4096
gpt_open: Trying secondary table sector size: 8192
gpt_load_table: Sector: 983039
raw_read: byte offset: 8053055488 len: 8192
raw_read: found in image 0 relative offset: 8053055488 len: 8192
sun_load_table: Trying sector: 0
sun_load_table: Trying sector: 1
mac_load_table: Sector: 1
mac_load: Missing initial magic value
mac_open: Trying 4096-byte sector size instead of 512-byte
mac_load_table: Sector: 1
mac_load: Missing initial magic value

build-id data offset in the ELF file

I need to modify the build-id of the ELF notes section. I found out that it is possible here. Also found out that I can do it by modifying this code. What I can't figure out is data location. Here is what I'm talking about.
$ eu-readelf -S myelffile
Section Headers:
[Nr] Name Type Addr Off Size ES Flags Lk Inf Al
...
[ 2] .note.ABI-tag NOTE 000000000000028c 0000028c 00000020 0 A 0 0 4
[ 3] .note.gnu.build-id NOTE 00000000000002ac 000002ac 00000024 0 A 0 0 4
...
$ eu-readelf -n myelffile
Note section [ 2] '.note.ABI-tag' of 32 bytes at offset 0x28c:
Owner Data size Type
GNU 16 GNU_ABI_TAG
OS: Linux, ABI: 3.14.0
Note section [ 3] '.note.gnu.build-id' of 36 bytes at offset 0x2ac:
Owner Data size Type
GNU 20 GNU_BUILD_ID
Build ID: d75a086c288c582036b0562908304bc3a8033235
.note.gnu.build-id section is 36 bytes. The build id is 20 bytes. What are the other 16 bytes?
I played with the code a bit and read 36 bytes of myelffile at offset 0x2ac. Got the following 040000001400000003000000474e5500d75a086c288c582036b0562908304bc3a8033235.
Then I decided to use Elf64_Shdr definition, so I read data at address 0x2ac + sizeof(Elf64_Shdr.sh_name) + sizeof(Elf64_Shdr.sh_type) + sizeof(Elf64_Shdr.sh_flags) and I got my build id, d75a086c288c582036b0562908304bc3a8033235. It does makes sense why I got it, sizeof(Elf64_Shdr.sh_name) + sizeof(Elf64_Shdr.sh_type) + sizeof(Elf64_Shdr.sh_flags) = 16 bytes, but according to Elf64_Shdr definition I should be pointing to Elf64_Addr sh_addr, i.e. section virtual address.
So what is not clear to me is what are the other 16 bytes of the section? What do they represent? I can't reconcile the Elf64_Shdr definition and the results I'm getting from my experiments.
.note.gnu.build-id section is 36 bytes. The build id is 20 bytes. What are the other 16 bytes?
Each .note.* section starts with Elf64_Nhdr (12 bytes), followed by (4-byte aligned) note name of variable size (GNU\0 here), followed by (4-byte aligned) actual note data. Documentation.
Looking at /bin/date on my system:
eu-readelf -Wn /bin/date
Note section [ 2] '.note.ABI-tag' of 32 bytes at offset 0x2c4:
Owner Data size Type
GNU 16 GNU_ABI_TAG
OS: Linux, ABI: 3.2.0
Note section [ 3] '.note.gnu.build-id' of 36 bytes at offset 0x2e4:
Owner Data size Type
GNU 20 GNU_BUILD_ID
Build ID: 979ae4616ae71af565b123da2f994f4261748cc9
What are the bytes at offset 0x2e4?
dd bs=1 skip=$((0x2e4)) count=36 < /bin/date | xxd
00000000: 0400 0000 1400 0000 0300 0000 474e 5500 ............GNU.
00000010: 979a e461 6ae7 1af5 65b1 23da 2f99 4f42 ...aj...e.#./.OB
00000020: 6174 8cc9 at..
So we have: .n_namesz == 4, .n_descsz == 20, .n_type == 3 == NT_GNU_BUILD_ID, followed by 4-byte GNU\0 note name, followed by 20 bytes of actual build-id bytes 0x97, 0x9a, etc.

Handle EOF with multi-line command in Dockerfile

I'm trying to use sfdisk to create an image file inside a docker container and I can use below command without any problem:
root#c8e9be2eb26f:/# sfdisk bbb_image.img << EOF
> 1M,48M,0xE,*
> ,,,-
> EOF
Checking that no-one is using this disk right now ... OK
Disk bbb_image.img: 1 GiB, 1073741824 bytes, 2097152 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
>>> Created a new DOS disklabel with disk identifier 0x34b8e793.
bbb_image.img1: Created a new partition 1 of type 'W95 FAT16 (LBA)' and of size 48 MiB.
bbb_image.img2: Created a new partition 2 of type 'Linux' and of size 975 MiB.
bbb_image.img3: Done.
New situation:
Disklabel type: dos
Disk identifier: 0x34b8e793
Device Boot Start End Sectors Size Id Type
bbb_image.img1 * 2048 100351 98304 48M e W95 FAT16 (LBA)
bbb_image.img2 100352 2097151 1996800 975M 83 Linux
The partition table has been altered.
Syncing disks.
Now in my Dockerfile this seems doesn't work and reproduce incomplete results:
RUN sfdisk bbb_image.img << "EOF\n\
1M,48M,0xE,*\n\
,,,-\n\
EOF"
And reproduce this in the console which is wrong:
Step 4/4 : RUN sfdisk bbb_image.img << "EOF\n1M,48M,0xE,*\n,,,-\nEOF\n"
---> Running in afc86ffef92a
Checking that no-one is using this disk right now ... OK
Disk bbb_image.img: 1 GiB, 1073741824 bytes, 2097152 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
>>> Done.
New situation:
ERROR: Service 'test' failed to build: The command '/bin/sh -c sfdisk bbb_image.img << "EOF\n1M,48M,0xE,*\n,,,-\nEOF\n"' returned a non-zero code: 1
I'm not sure how to handle the EOF in the Dockerfile.
Maybe try this one:
RUN sfdisk bbb_image.img << "\n1M,48M,0xE,*\n,,,-\n"

dd utility outputs different lines under the same command for sh and bash, how to force output of the last line inside the docker container?

Inside the docker container I test the
dd if=/dev/zero of=/tmp/test2.img bs=512 count=1000 2> >( grep copied )
line execution. I drill into container using 2 ways.
1) the classical one is docker exec -it 2b65c84ddce2 /bin/sh
the execution the line inside the contained inherinted from the alpine I'm greeting
/bin/sh: syntax error: unexpected redirection beacuse of something near >(
2) when I enter the container into the bash executor like docker exec -it 2b65c84ddce2 /bin/bash
dd if=/dev/zero of=/tmp/test2.img bs=512 count=1000 2> >( grep copied ) returns no output
dd if=/dev/zero of=/tmp/test2.img bs=512 count=1000 returns 2 lines output only, while the expectation is 3:
1+0 records in
1+0 records out
At the host level the same dd command is returning 3 lines like this:
dd if=/dev/zero of=/tmp/test2.img bs=512 count=1000
1000+0 records in
1000+0 records out
512000 bytes (512 kB, 500 KiB) copied, 0.0109968 s, 46.6 MB/s
and with redirection the output is the last line:
dd if=/dev/zero of=/tmp/test2.img bs=512 count=1000 2> >( grep copied )
512000 bytes (512 kB, 500 KiB) copied, 0.0076261 s, 67.1 MB/s
So how can I get the last line of dd output from the inside of the docker container?
PS.
The redirecting stderr to stdout doesn't help in general:
/ # dd if=/dev/zero of=/tmp/test2.img bs=512 count=1000 2>&1 | grep copied
/ # dd if=/dev/zero of=/tmp/test2.img bs=512 count=1000 2>&1
1000+0 records in
1000+0 records out
while at the host system it works
$ dd if=/dev/zero of=/tmp/test2.img bs=512 count=1000 2>&1 | grep copied
512000 bytes (512 kB, 500 KiB) copied, 0.00896706 s, 57.1 MB/s
host:
dd --v
dd (coreutils) 8.30
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Paul Rubin, David MacKenzie, and Stuart Kemp.
container:
/ # dd --v
BusyBox v1.31.1 () multi-call binary.
Usage: dd [if=FILE] [of=FILE] [ibs=N obs=N/bs=N] [count=N] [skip=N] [seek=N]
[conv=notrunc|noerror|sync|fsync]
[iflag=skip_bytes|fullblock] [oflag=seek_bytes|append]
Copy a file with converting and formatting
if=FILE Read from FILE instead of stdin
of=FILE Write to FILE instead of stdout
bs=N Read and write N bytes at a time
ibs=N Read N bytes at a time
obs=N Write N bytes at a time
count=N Copy only N input blocks
skip=N Skip N input blocks
seek=N Skip N output blocks
conv=notrunc Don't truncate output file
conv=noerror Continue after read errors
conv=sync Pad blocks with zeros
conv=fsync Physically write data out before finishing
conv=swab Swap every pair of bytes
iflag=skip_bytes skip=N is in bytes
iflag=fullblock Read full blocks
oflag=seek_bytes seek=N is in bytes
oflag=append Open output file in append mode
status=noxfer Suppress rate output
status=none Suppress all output
N may be suffixed by c (1), w (2), b (512), kB (1000), k (1024), MB, M, GB, G
they are in fact, different
For anyone searching this question:
the DD used was being used with BusyBox. The third line is an optional output which is defined when compiling BusyBox from Source. The pre compiled versions have this disabled
ENABLE_FEATURE_DD_THIRD_STATUS_LINE must be defined.
see https://git.busybox.net/busybox/tree/coreutils/dd.c line 166.
#if ENABLE_FEATURE_DD_THIRD_STATUS_LINE
# if ENABLE_FEATURE_DD_STATUS
if (G.flags & FLAG_STATUS_NOXFER) /* status=noxfer active? */
return;
//TODO: should status=none make dd stop reacting to USR1 entirely?
//So far we react to it (we print the stats),
//status=none only suppresses final, non-USR1 generated status message.
# endif
fprintf(stderr, "%llu bytes (%sB) copied, ",
G.total_bytes,
/* show fractional digit, use suffixes */
make_human_readable_str(G.total_bytes, 1, 0)
);
/* Corner cases:
* ./busybox dd </dev/null >/dev/null
* ./busybox dd bs=1M count=2000 </dev/zero >/dev/null
* (echo DONE) | ./busybox dd >/dev/null
* (sleep 1; echo DONE) | ./busybox dd >/dev/null
*/
seconds = (now_us - G.begin_time_us) / 1000000.0;
bytes_sec = G.total_bytes / seconds;
fprintf(stderr, "%f seconds, %sB/s\n",
seconds,
/* show fractional digit, use suffixes */
make_human_readable_str(bytes_sec, 1, 0)
);
#endif
}

Specific data formatting using awk or sed

I am currently working with large datasets containing file information which is formatted into blocks of data. I am trying to take a piece of data from the file path line and append it as a new column on certain lines. The dataset contains file information formatted like so:
File path: /d9b50a6f54d5a1f8/7b3d459a3454703c/a6d1040ea2c84e10/afcbe93ced71e5e6/2b517a561f5da8a6/aab17eb15d782d7b/af38f2bcc4998af0/0d8eb680024af333.jar
Inode Num: 22525898
Chunk Hash Chunk Size (bytes) Compression Ratio (tenth)
45:97:2a:60:e3:69 3208 10
7a:8b:8e:20:7b:38 1982 10
b9:45:3d:f4:97:88 1849 10
Whole File Hash: 865999b40fd9
File path: /d9b50a6f54d5a1f8/7b3d459a3454703c/a6d1040ea2c84e10/afcbe93ced71e5e6/2b517a561f5da8a6/1e82b13443330bb3/12fd3e87b2f62dc8/6e1a9f0b0a281564.c
Inode Num: 31881221
Chunk Hash Chunk Size (bytes) Compression Ratio (tenth)
e8:b0:cb:6f:76:ff 1344 10
19:c5:b2:aa:b3:60 613 10
11:7c:7e:76:4b:d5 1272 10
36:e0:59:49:b6:4a 581 10
9c:31:bc:8a:39:94 3296 10
01:f0:56:3a:e1:a9 1140 10
Whole File Hash: 4b28b44ae03d
What I am wanting to do is take the file type (.jar and .c in this example) and append it to their respective Chunk Hash lines so the final formatting would look like:
File path: /d9b50a6f54d5a1f8/7b3d459a3454703c/a6d1040ea2c84e10/afcbe93ced71e5e6/2b517a561f5da8a6/aab17eb15d782d7b/af38f2bcc4998af0/0d8eb680024af333.jar
Inode Num: 22525898
Chunk Hash Chunk Size (bytes) Compression Ratio (tenth)
45:97:2a:60:e3:69 3208 10 .jar
7a:8b:8e:20:7b:38 1982 10 .jar
b9:45:3d:f4:97:88 1849 10 .jar
Whole File Hash: 865999b40fd9
File path: /d9b50a6f54d5a1f8/7b3d459a3454703c/a6d1040ea2c84e10/afcbe93ced71e5e6/2b517a561f5da8a6/1e82b13443330bb3/12fd3e87b2f62dc8/6e1a9f0b0a281564.c
Inode Num: 31881221
Chunk Hash Chunk Size (bytes) Compression Ratio (tenth)
e8:b0:cb:6f:76:ff 1344 10 .c
19:c5:b2:aa:b3:60 613 10 .c
11:7c:7e:76:4b:d5 1272 10 .c
36:e0:59:49:b6:4a 581 10 .c
9c:31:bc:8a:39:94 3296 10 .c
01:f0:56:3a:e1:a9 1140 10 .c
Whole File Hash: 4b28b44ae03d
I already have the awk code to pull the file type and the chunk hash lines:
awk 'match($0,/\..+/) {print substr($0,RSTART,RLENGTH)}'
awk '/Chunk Hash/{flag=1;next}/Whole File Hash:/{flag=0}flag'
I am just not sure on how to connect these pieces using awk (or sed) to append the file type as a new column onto each line in their respective data block. Another thing to note is I am trying to do this in a bash script if that makes a difference.
Here is a (GNU) sed solution:
/File path:/ { # If line matches "File path:"
h # Copy pattern space to hold space
s/.*(\.[^.]*)$/\1/ # Remove everything but extension from pattern space
x # Swap pattern space and hold space
} # Hold space now contains extension
/Chunk Hash/ { # If line matches "Chunk Hash"
n # Get next line into pattern space
:loop # Anchor for loop
/Whole File Hash/b # If line matches "Whole File Hash", jump out of loop
G # Append extension from hold space to pattern space
s/\n/\t\t\t\t/ # Substitute newline with a bunch of tabs
n # Get next line
b loop # Jump back to ":loop" label
}
This can be stored in a separate file (say, so.sed), and has to be called like
sed -r -f so.sed infile
resulting in
File path: /d9b50a6f54d5a1f8/7b3d459a3454703c/a6d1040ea2c84e10/afcbe93ced71e5e6/2b517a561f5da8a6/aab17eb15d782d7b/af38f2bcc4998af0/0d8eb680024af333.jar
Inode Num: 22525898
Chunk Hash Chunk Size (bytes) Compression Ratio (tenth)
45:97:2a:60:e3:69 3208 10 .jar
7a:8b:8e:20:7b:38 1982 10 .jar
b9:45:3d:f4:97:88 1849 10 .jar
Whole File Hash: 865999b40fd9
File path: /d9b50a6f54d5a1f8/7b3d459a3454703c/a6d1040ea2c84e10/afcbe93ced71e5e6/2b517a561f5da8a6/1e82b13443330bb3/12fd3e87b2f62dc8/6e1a9f0b0a281564.c
Inode Num: 31881221
Chunk Hash Chunk Size (bytes) Compression Ratio (tenth)
e8:b0:cb:6f:76:ff 1344 10 .c
19:c5:b2:aa:b3:60 613 10 .c
11:7c:7e:76:4b:d5 1272 10 .c
36:e0:59:49:b6:4a 581 10 .c
9c:31:bc:8a:39:94 3296 10 .c
01:f0:56:3a:e1:a9 1140 10 .c
Whole File Hash: 4b28b44ae03d
Non-GNU seds have to jump through the usual hoops to insert tabs and can't use the -r option (but probably -E, which should be equivalent here; -r was just used for convenience to having to escape ()).
Solution in TXR language:
#(repeat)
# (cases)
File path: #*path.#suff
Inode Num: #inode
#header
# (collect)
#hashline
# (last)
Whole File Hash: #wfh
# (end)
# (output)
File path: #path.#suff
Inode Num: #inode
#header
# (repeat)
#{hashline 88}.#suff
# (end)
Whole File Hash: #wfh
# (end)
# (or)
#other
# (do (put-line other))
# (end)
#(end)
Run:
$ txr suffixes.txr data
File path: /d9b50a6f54d5a1f8/7b3d459a3454703c/a6d1040ea2c84e10/afcbe93ced71e5e6/2b517a561f5da8a6/aab17eb15d782d7b/af38f2bcc4998af0/0d8eb680024af333.jar
Inode Num: 22525898
Chunk Hash Chunk Size (bytes) Compression Ratio (tenth)
45:97:2a:60:e3:69 3208 10 .jar
7a:8b:8e:20:7b:38 1982 10 .jar
b9:45:3d:f4:97:88 1849 10 .jar
Whole File Hash: 865999b40fd9
File path: /d9b50a6f54d5a1f8/7b3d459a3454703c/a6d1040ea2c84e10/afcbe93ced71e5e6/2b517a561f5da8a6/1e82b13443330bb3/12fd3e87b2f62dc8/6e1a9f0b0a281564.c
Inode Num: 31881221
Chunk Hash Chunk Size (bytes) Compression Ratio (tenth)
e8:b0:cb:6f:76:ff 1344 10 .c
19:c5:b2:aa:b3:60 613 10 .c
11:7c:7e:76:4b:d5 1272 10 .c
36:e0:59:49:b6:4a 581 10 .c
9c:31:bc:8a:39:94 3296 10 .c
01:f0:56:3a:e1:a9 1140 10 .c
Whole File Hash: 4b28b44ae03d
In awk:
$ cat script.awk
/File path/ {
match($0,/\..+/)
ext=substr($0,RSTART,RLENGTH)
}
/Chunk Hash/ {
flag=1 # flag on
print # print here to...
next # avoid printing ext
}
/Whole File Hash:/ {
flag=0 # flag off
}
flag==1 {
print $0, ext # add space here to your liking, left it short...
next # ... to show output on screen without sidescrolling
} 1 # print non-flagged records
Run:
$ awk -f script.awk data.txt
File path: /d9b50a6f54d5a1f8/7b3d459a3454703c/a6d1040ea2c84e10/afcbe93ced71e5e6/2b517a561f5da8a6/aab17eb15d782d7b/af38f2bcc4998af0/0d8eb680024af333.jar
Inode Num: 22525898
Chunk Hash Chunk Size (bytes) Compression Ratio (tenth)
45:97:2a:60:e3:69 3208 10 .jar
7a:8b:8e:20:7b:38 1982 10 .jar
b9:45:3d:f4:97:88 1849 10 .jar
Whole File Hash: 865999b40fd9
File path: /d9b50a6f54d5a1f8/7b3d459a3454703c/a6d1040ea2c84e10/afcbe93ced71e5e6/2b517a561f5da8a6/1e82b13443330bb3/12fd3e87b2f62dc8/6e1a9f0b0a281564.c
Inode Num: 31881221
Chunk Hash Chunk Size (bytes) Compression Ratio (tenth)
e8:b0:cb:6f:76:ff 1344 10 .c
19:c5:b2:aa:b3:60 613 10 .c
11:7c:7e:76:4b:d5 1272 10 .c
36:e0:59:49:b6:4a 581 10 .c
9c:31:bc:8a:39:94 3296 10 .c
01:f0:56:3a:e1:a9 1140 10 .c
Whole File Hash: 4b28b44ae03d
awk --re-interval '
/^File/{ #If the beginning of line matches "File"
s=gensub("[^.]+(.*)","\\1","1",$0); #Gain the keywords like ".c,.jar" and assign them to s
}
/(..:){3,}/{ #If line matches "**:" three times or more
gsub("[0-9]+$","&\t\t\t\t\t" s,$0) #At the end add s
}
1' file #Print line

Resources