print content of more than one file in a zip archive - linux

I have some zip files that are really large and I want to print them without extracting first. I am using zcat and zless to do that and then I redirect the output to a different application. When my zip file contains more than one text file I receive the following error:
zcat tweets.zip >a
gzip: tweets.zip has more than one entry--rest ignored
How can I do what I want with zip files that contain more than one text file?

You can do this to output a file without extracting:
$ unzip -p <zip_file> <file_to_print>
For example:
$ unzip -p MyEar.ear META-INF/MANIFEST.MF
As cur4so mentioned you can also list all files using:
$ unzip -l <zip_file>

Use the -p option of unzip to pipe the output. Multiple files are concatenated. The -c option does the same thing, but includes the file name in front of each file.
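If the downstream application needs to tell the files apart, one option is to stream each member separately instead of concatenating them. The following is only a sketch: the function name zip_heads and the three-line default are made up for illustration, and it assumes Info-ZIP unzip, whose -Z1 mode lists one stored member name per line.

```shell
# Hypothetical helper: print the first N lines of each file in a zip archive
# without extracting anything to disk.
zip_heads() {
    archive=$1
    lines=${2:-3}
    # unzip -Z1 prints one stored member name per line (zipinfo mode).
    unzip -Z1 "$archive" | while IFS= read -r name; do
        case $name in
            */) continue ;;   # skip directory entries
        esac
        printf '== %s ==\n' "$name"
        # -p writes the member to stdout; stdin is redirected so unzip
        # cannot consume the list of names the loop is reading.
        unzip -p "$archive" "$name" </dev/null | head -n "$lines"
    done
}
```

Something like zip_heads tweets.zip 5 would then show the start of every file. Note that unzip treats characters such as * and [ in a member name as wildcards, so unusual file names may need escaping.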

If you just want to see a list of the files in your zip archive, use:
unzip -l tweets.zip
If you want to extract just one file:
unzip tweets.zip file-of-interest-as-it-is-pointed-in-the-archive
If you want something else, could you clarify your question?

Related

get the text inside a zipped text file without extraction

I want to get the content of file1.txt in my archive.zip without extracting the file.
For all of these commands I get caution: filename not matched: file1.txt
unzip -p ~/archive.zip ~/archive.zip/file1.txt | less
unzip -p ~/archive.zip ~/archive/file1.txt | less
unzip -p ~/archive.zip file1.txt | less
The archive.zip is in the home directory, and the respective names are correct.
Hardcoding the path produces the same undesired outcome:
unzip -p /home/pi/archive.zip ~/archive.zip/file1.txt | less
unzip -p /home/pi/archive.zip ~/archive/file1.txt | less
unzip -p /home/pi/archive.zip file1.txt | less
I am trying to do this on a Raspberry Pi.
The expected output is the content of the file1.txt.
It's possible that the zip extracts into a directory and the file is inside it, e.g. file.zip creates a someproject-name top-level directory and the contents are under that. So you can do something like this:
unzip -p /home/pi/archive.zip '*/file1.txt'
The glob makes unzip match file1.txt inside that top-level directory as well.
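To find out what path the file is actually stored under, it can help to filter the member list by base name. This is a small sketch; match_basename is a hypothetical helper that reads member names on stdin, such as the output of unzip -Z1 (Info-ZIP's one-name-per-line listing).

```shell
# Hypothetical helper: read archive member names on stdin and print those
# whose last path component equals the wanted file name.
match_basename() {
    # With "/" as the field separator, $NF is the final path component.
    awk -F/ -v want="$1" '$NF == want'
}
```

Running unzip -Z1 ~/archive.zip | match_basename file1.txt prints the stored path, which can then be passed unchanged to unzip -p.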

How to extract first few lines from a csv file inside a tar file without extracting it in linux?

I have a tar file which has lot of csv files in it.
How to get the first few lines of each csv file without extracting it?
I tried:
$(tar -Oxf $tarfile $file | head -n "$NL") >> cdn.log
But got error saying:
time(http:index: command not found
This is some line in one of the csv files. Similar errors are reported for all csv files...
Any idea??
Using -O you can tell tar to extract a file to standard output instead of to a file. So you should be able to first use tar tf <YOUR_FILE> to list the files in the archive, filter the list with grep to find the CSV files, and then for each file use tar xf <YOUR_FILE> <NAME_OF_CSV> -O | head to send the beginning of the file to stdout. This may be a bit inefficient, since you read the archive as many times as there are CSV files, but it should work.
You can use perl and its Archive::Tar module. Here is a one-liner that extracts the first two lines of each file:
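The list-then-extract approach could look roughly like the sketch below. The function name csv_heads and the five-line default are illustrative assumptions. The pipeline is run directly rather than wrapped in $(...), since command substitution would make the shell try to execute the extracted text as a command, which is where the "command not found" errors come from.

```shell
# Hypothetical helper: print the first N lines of every .csv member of a
# tar archive without unpacking it to disk.
csv_heads() {
    tarfile=$1
    nl=${2:-5}
    # List the member names and keep only those ending in .csv.
    tar -tf "$tarfile" | grep '\.csv$' | while IFS= read -r member; do
        printf '== %s ==\n' "$member"
        # -O sends the member to stdout instead of creating a file.
        tar -xOf "$tarfile" "$member" | head -n "$nl"
    done
}
```

For the original use case this would be called as csv_heads "$tarfile" "$NL" >> cdn.log.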
perl -MArchive::Tar -E '
for (Archive::Tar->new(shift)->get_files) {
say (join qq|\n|, (split /\n/, $_->get_content, 3)[0..1])
}
' file.tar
It assumes that the tar file only has text files and they are csv. Otherwise you will have to grep the list to filter those you want.

Combine files in one

Currently I am in this directory-
/data/real/test
When I run ls -lt at the command prompt, I get something like this:
REALTIME_235000.dat.gz
REALTIME_234800.dat.gz
REALTIME_234600.dat.gz
REALTIME_234400.dat.gz
REALTIME_234200.dat.gz
How can I consolidate the above five dat.gz files into one dat.gz file in Unix without any data loss? I am new to Unix and I am not sure about this. Can anyone help me with this?
Update:
I am not sure which is the better way: should I unzip each of the five files and then combine them into one, or combine all five dat.gz files into one dat.gz directly?
If it's OK to concatenate the files' contents in the order the shell expands the glob (sorted by name), then the following command will do the trick:
zcat REALTIME*.dat.gz | gzip > out.dat.gz
Update
This should solve the ordering problem by sorting on modification time, newest first:
zcat $(ls -t REALTIME*.dat.gz) | gzip > out.dat.gz
What do you want to happen when you gunzip the result? If you want the five files to reappear, then you need to use something other than the gzip (.gz) format. You would need to either use tar (.tar.gz) or zip (.zip).
If you want the result of the gunzip to be the concatenation of the gunzip of the original files, then you can simply cat (not zcat or gzcat) the files together. gunzip will then decompress them to a single file.
cat [files in whatever order you like] > combined.gz
Then:
gunzip combined.gz
will produce an output that is the concatenation of the gunzip of the original files.
The suggestion to decompress them all and then recompress them as one stream is completely unnecessary.
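The cat approach works because the gzip format allows several compressed members back to back in one file, and gunzip decompresses them in sequence. A minimal sketch, with combine_gz as a made-up helper name:

```shell
# Hypothetical helper: join gzip files by plain byte concatenation.
# Usage: combine_gz OUTPUT INPUT...
combine_gz() {
    out=$1
    shift
    # No decompression happens here; the inputs are copied as-is.
    cat -- "$@" > "$out"
}
```

Calling combine_gz out.dat.gz REALTIME_*.dat.gz joins the files in the order the shell expands the glob, which is sorted by name, so names like REALTIME_234200 end up ordered by their timestamp. Running gzip -t out.dat.gz afterwards checks that the result is still a valid gzip file.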

Linux: Adding named files to a zip archive, from a pipe

Is it possible to use something like:
command.exe | zip >> archive.zip
command2.exe | zip >> archive.zip
...and end up with two named files inside one zip archive.
This way, if at all possible, would be neater than having temp files.
Create two named pipes in a new dir (with mkfifo), pipe the output of the commands to these two pipes, and then zip the dir. The writers have to run in the background, because opening a FIFO for writing blocks until zip opens it for reading.
mkdir tmp
mkfifo tmp/1.out
mkfifo tmp/2.out
command1.exe > tmp/1.out &
command2.exe > tmp/2.out &
zip -FI -r tmp.zip tmp/
EDIT: Added the -FI flag to zip, which does make this possible. The only caveat is that you need zip 3.0 for this to work. Tarring FIFOs is not implemented (according to the tar devs) because you need the file size in advance in order to write it to the tar header.
Alternatively, use FUSE, specifically fuse-zip.

How to check for an exploding zip file in bash?

I have a bash shell script that unzips a zip file, and manipulates the resulting files. Because of the process, I expect all the content I am interested in to be within a single folder like so:
file.zip
    /file
        /contentFolder1
        /contentFolder2
        stuff1.txt
        stuff2.txt
        ...
I've noticed users on Windows typically don't create a sub folder but instead submit an exploding zip file that looks like:
file.zip
    /contentFolder1
    /contentFolder2
    stuff1.txt
    stuff2.txt
    ...
How can I detect these exploding zips, so that I may handle them accordingly? Is it possible without unzipping the file first?
If you want to check, unzip -l will print the contents of the zip file without extracting them. You'll have to massage the output a bit, though, since it's printing all sorts of additional crud.
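One way to avoid parsing the unzip -l table is to work from a bare list of member names, which Info-ZIP's unzip -Z1 prints one per line. The sketch below counts the distinct top-level entries; both function names are hypothetical, and an archive holding a single top-level file (rather than a folder) would also count as 1, so that case may need its own check.

```shell
# Hypothetical helper: read member names on stdin and print how many
# distinct top-level entries (first path components) they contain.
count_top_level() {
    cut -d/ -f1 | sort -u | wc -l | tr -d ' '
}

# Hypothetical helper: succeed if the archive has more than one
# top-level entry, i.e. it would "explode" into the current directory.
is_exploding_zip() {
    [ "$(unzip -Z1 "$1" | count_top_level)" -ne 1 ]
}
```

A script could then branch with if is_exploding_zip file.zip; then ... fi before deciding where to unpack.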
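Unzip to a directory first, and then remove the extra layer if the zip is not a bomb.
# Unpack into a fresh temporary directory
tempdir=$(mktemp -d)
unzip -d "$tempdir" file.zip
# Exactly one top-level entry: move it out and drop the temporary layer
if [ $(ls "$tempdir" | wc -l) -eq 1 ]; then
    mv "$tempdir"/* .
    rmdir "$tempdir"
else
    # Several top-level entries: keep them together under "file"
    mv "$tempdir" file
fi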
I wouldn't try to detect it. I'd just force unzip to do what I want. With Info-ZIP:
$ unzip -j -d unzip-output-dir FileFromUntrustedSource.zip
-j makes it ignore any directory structure within the file, and -d tells it to put files in a particular directory, creating it if necessary.
If there are two files with the same name but in different subdirectories, the above command will make unzip ask if you want to overwrite the first with the second. You can add -o to force it to overwrite without asking, or -f to only overwrite if the second file is newer.

Resources