How to create a zip file with all entries deflated - zip

I have a zip file that I think was created by zip, but when I unzip it and try to create a zip from those unzipped files, the result is different according to the zipinfo command.
This is the result when I run the zipinfo command on Linux against the source file:
root@TLBBServer:/home/web/Patch/u# zipinfo u1.zip
Archive: u1.zip
Zip file size: 662378 bytes, number of entries: 3
-rw-a-- 2.3 ntf 3133736 bx defX 08-Jan-28 07:20 Data/LaunchSkin.axp
-rw-a-- 2.3 ntf 1196154 bx defX 08-Feb-03 03:52 (version)
-rw-a-- 2.3 ntf 36 bx defX 08-Feb-03 03:53 (command)
3 files, 4329926 bytes uncompressed, 661972 bytes compressed: 84.7%
This is the result when I do the same with a clone file:
root@TLBBServer:/home/web/Patch/u# zipinfo u.zip
Archive: u.zip
Zip file size: 661897 bytes, number of entries: 3
-rw---- 2.3 ntf 1217589 tx defX 08-Feb-03 03:52 (version)
-rw---- 2.3 ntf 3135715 bx defX 08-Jan-28 07:20 Data/LaunchSkin.axp
-rw---- 2.3 ntf 38 tx stor 08-Feb-03 03:53 (command)
3 files, 4353342 bytes uncompressed, 661255 bytes compressed: 84.8%
This is the source file:
https://github.com/HadesD/TLBB-Web/raw/master/u1.zip

So? Why do you care?
A different zip program, or the same zip program with different settings, or a different version of the same zip program with the same settings can all produce different output. However, they will all be valid, decompressible zip files, ensuring that the contents are extracted exactly.
The only guarantee of the lossless compressor is that compression followed by decompression will always give you exactly the same thing back. There is no guarantee, and there does not need to be a guarantee, that decompression followed by compression will give you the same thing back.
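To see this concretely, here is a minimal sketch (assuming Info-ZIP's zip and unzip; the directory and archive names are just placeholders): re-zipping the same extracted tree at two different compression levels yields archives whose bytes differ, yet whose contents extract identically.
unzip -q u1.zip -d tree                  # extract the source archive
(cd tree && zip -q -r -9 ../max.zip .)   # re-zip with maximum deflate effort
(cd tree && zip -q -r -1 ../fast.zip .)  # re-zip with fastest deflate
cmp -s max.zip fast.zip || echo "archive bytes differ"
unzip -q max.zip -d out1 && unzip -q fast.zip -d out2
diff -r out1 out2 && echo "extracted contents are identical"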

Related

List files in zip subdirectory

To list all the files in a spreadsheet (xlsx) file, I can do:
$ unzip -l /Users/david/Desktop/myspreadsheet.xlsx
698 01-01-1980 00:00 xl/_rels/workbook.xml.rels
1415625191 01-01-1980 00:00 xl/worksheets/sheet1.xml
6798 01-01-1980 00:00 xl/theme/theme1.xml
2315 01-01-1980 00:00 xl/styles.xml
779218 01-01-1980 00:00 xl/sharedStrings.xml
322 01-01-1980 00:00 xl/worksheets/_rels/sheet1.xml.rels
9840 01-01-1980 00:00 xl/printerSettings/printerSettings1.bin
640 01-01-1980 00:00 docProps/core.xml
797 01-01-1980 00:00 docProps/app.xml
From here we can see that the spreadsheet has one sheet -- xl/worksheets/sheet1.xml. Is there a way to only see the zip contents of the xl/worksheets/ folder? For example, doing something like:
$ unzip -l /Users/david/Desktop/myspreadsheet.xlsx xl/worksheets
The most I could find from the man page was:
-l   list archive files (short format). The names, uncompressed file sizes
     and modification dates and times of the specified files are printed,
     along with totals for all files specified. If UnZip was compiled with
     OS2_EAS defined, the -l option also lists columns for the sizes of
     stored OS/2 extended attributes (EAs) and OS/2 access control lists
     (ACLs). In addition, the zipfile comment and individual file comments
     (if any) are displayed. If a file was archived from a single-case
     file system (for example, the old MS-DOS FAT file system) and the -L
     option was given, the filename is converted to lowercase and is
     prefixed with a caret (^).
But it seems like there are tons of other options. Is there a way to do the above?
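For what it's worth, unzip does accept member-file patterns after the archive name (this is in the synopsis of the same man page), so restricting the listing to one folder should work with a glob, quoted so the shell does not expand it:
$ unzip -l /Users/david/Desktop/myspreadsheet.xlsx "xl/worksheets/*"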

Combining and compressing using "tar czf" vs. "tar + gzip": the resulting file in both cases is packname.tar.gz, but why are the sizes different?

There are three text files, test1, test2 and test3, with file sizes as follows:
test1 - 121 B
test2 - 4 B
test3 - 26 B
I am trying to combine and compress these files using different methods.
Method-A
Combine the files using tar and then compress it using gzip.
$tar cf testpack1.tar test1 test2 test3
$gzip testpack1.tar
Output is testpack1.tar.gz with size 276 B
Method-B
Combine and compress the files using tar.
$tar czf testpack2.tar.gz test1 test2 test3
Output is testpack2.tar.gz with size 262 B
Why are the sizes of the two files different? (B means bytes.)
If you un-gzip the archive created by your step B, I bet it will be 10240 bytes. The reason for the difference in size is that tar will align the compressed archive to the block size (using the zero character), but it will not align the uncompressed archive. Here is an excerpt from the GNU tar documentation:
-b blocks
--blocking-factor=blocks
Set record size to blocks * 512 bytes.
This option is used to specify a blocking factor for the archive. When
reading or writing the archive, tar will do reads and writes of the
archive in records of blocks*512 bytes. This is true even when the
archive is compressed. Some devices require that all write operations
be a multiple of a certain size, and so tar pads the archive out to
the next record boundary. The default blocking factor is set when tar
is compiled, and is typically 20. Blocking factors larger than 20
cannot be read by very old versions of tar, or by some newer versions
of tar running on old machines with small address spaces. With a
magnetic tape, larger records give faster throughput and fit more data
on a tape (because there are fewer inter-record gaps). If the archive
is in a disk file or a pipe, you may want to specify a smaller
blocking factor, since a large one will result in a large number of
null bytes at the end of the archive.
You can create the same compressed tar archive like this:
tar -b 20 -cf test.tar test1 test2 test3
gzip test.tar
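A quick way to check this claim (assuming GNU tar and gzip; file names as in the question):
zcat testpack1.tar.gz | wc -c               # size of the tar stream that gzip saw
zcat testpack2.tar.gz | wc -c
gzip -l testpack1.tar.gz testpack2.tar.gz   # gzip's own compressed/uncompressed report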

Recover/crack/unzip a Zip 2.0 (legacy) file with a lost password on Linux

I have an old zip file with a lost password.
zipinfo -z tells me it's version Zip 2.0, which uses PKWARE encryption. Good news, as that's apparently weak.
The bad news is that nothing I've searched for and found tells me whether it's possible to crack without using brute force.
PKCrack looked like an option, but it requires an unencrypted version of one of the files, which I don't have.
I tried fcrackzip with a dictionary, but it seems I don't remember anything about the password.
Does anyone know a good method to recover or unzip the files, recover the password, or crack the zip without using brute force?
I believe it was encrypted on Windows - probably XP. I'm now using Ubuntu 18.04 LTS.
Here's the zipinfo -v output:
offset of local header from start of archive: 10089220 (000000000099F304h) bytes
file system or operating system of origin: NTFS
version of encoding software: 2.0
minimum file system compatibility required: MS-DOS, OS/2 or NT FAT
minimum software version required to extract: 2.0
compression method: deflated
compression sub-type (deflation): normal
file security status: encrypted
extended local header: no
file last modified on (DOS date/time): 2004 Feb 19 14:13:00
32-bit CRC value (hex): 498a9daf
compressed size: 39271 bytes
uncompressed size: 39855 bytes
length of filename: 16 characters
length of extra field: 0 bytes
length of file comment: 0 characters
disk number on which file begins: disk 1
apparent file type: binary
non-MSDOS external file attributes: 000000 hex
MS-DOS file attributes (20 hex): arc
Thank you for any help; these files are important to me.
Try fcrackzip using the RockYou.txt password list.
fcrackzip -v -u -D -p PATH/rockyou.txt NAME_OF_ZIP.zip
-v: verbose mode.
-u: use unzip to weed out wrong passwords.
-D: use a dictionary.
-p: the initial password string; with -D, the file to read passwords from.
https://github.com/brannondorsey/naive-hashcat/releases/download/data/rockyou.txt
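For example (hypothetical paths; adjust the wordlist location and archive name to your setup):
wget https://github.com/brannondorsey/naive-hashcat/releases/download/data/rockyou.txt
fcrackzip -v -u -D -p rockyou.txt NAME_OF_ZIP.zip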

Difference between .tar.gz and gzip first, then tar

I made two compressed copies of my folder, first by using the command tar czf dir.tar.gz dir
This gives me an archive of size ~16kb. Then I tried another method: I gzipped all the files inside the dir and then tarred them:
gzip ./dir/*
tar cf dir.tar dir/*.gz
but the second method gave me dir.tar of size ~30kb (almost double). Why is there so much difference in size?
Because compression is generally more efficient on one big sample than on many small files. Say you have gzipped 100 files of 1 KB each. Each file will get a certain compression, plus the overhead of the gzip format:
file1 -> file1.gz (assume 30 bytes of headers/footers)
file2 -> file2.gz (assume 30 bytes of headers/footers)
...
file100 -> file100.gz (assume 30 bytes of headers/footers)
------------------------------
30*100 = 3 KB of overhead.
But if you compress one tar file of 100 KB (which contains your 100 files), the overhead of the gzip format is added only once (instead of 100 times), and the compression can be better.
The overhead comes from the per-file metadata and from the suboptimal compression gzip achieves when processing files individually: it never sees the data as a whole, so it compresses each file with a dictionary that is reset every time.
tar cf creates an uncompressed archive, which means the size of your archive should be about the same as that of your directory, maybe even a bit more.
tar czf will additionally run gzip compression over it.
This can be checked by running man tar at a shell prompt on Linux:
-z, --gzip, --gunzip, --ungzip
filter the archive through gzip
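A small experiment reproduces the effect (a sketch, assuming GNU tar and a gzip recent enough to support -k; the file names are made up for the demo):
seq 1 2000 > f1
seq 1 2000 > f2
seq 1 2000 > f3
gzip -k f1 f2 f3                # three gzip headers, dictionary reset per file
ls -l f1.gz f2.gz f3.gz
tar czf all.tar.gz f1 f2 f3     # one header; later files can reference earlier data
ls -l all.tar.gz                # noticeably smaller than the three .gz files combined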

How to validate equivalency between two zip packages

I'm trying to validate whether two zip packages are equivalent. I cannot rely on md5sum: when I extract the two packages and diff the md5sums of all the files inside, there is no difference and all the files have identical md5sums, yet the zip packages themselves have different md5sum values. My question is: how can I validate that two zip packages are equivalent?
When you list the archive's content with
unzip -v archive.zip
you get a list of files with these column headings
Length Method Size Cmpr Date Time CRC-32 Name
Depending on what you consider equivalent (e.g. Size, CRC, Name), you can extract the relevant columns for both archives, sort them and do a diff over the output.
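For example, a sketch using the Name, Length and CRC-32 columns (it assumes entry names contain no spaces; a.zip and b.zip are placeholders):
for z in a.zip b.zip; do
  unzip -v "$z" | awk 'NF==8 && $7 ~ /^[0-9a-fA-F]+$/ {print $8, $1, $7}' | sort > "$z.list"
done
diff a.zip.list b.zip.list && echo "equivalent"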
Without unzipping the file, you can use zipinfo.
e.g.:
zipinfo 5.zip
Archive: 5.zip 158 bytes 1 file
drwxr-xr-x 3.0 unx 0 bx stor 18-Nov-13 07:23 501/
1 file, 0 bytes uncompressed, 0 bytes compressed: 0.0%
