Decompress .gz files from subfolders (recursively) into root folder - linux

Here is the situation: I have a folder containing a lot of subfolders, some of which contain .gz compressed files (NOT tar, just compressed text files). I want to recursively decompress all these .gz files into the root folder, but I can't figure out the exact way to do it.
My folders look like this:
/folderX/subfolder1/file.gz
by using
gzip -c -d -r *.gz
I can probably extract all the files at once, but they will remain in their respective subfolders. I want them all in /folderX/
find . -name '*.gz'
gives me the correct list of the files I am looking for, but I have no idea how to combine the two commands. Should I combine them in a script? Or is there a gzip feature I have missed that lets you decompress everything into the folder you run the command from?
Thanks for the help!

You can use a while...done loop that iterates over the input:
find dirname -name '*.gz' | while IFS= read -r i; do gzip -c -d "$i"; done
(Quote the glob so the shell doesn't expand it, and quote "$i" so filenames survive word splitting; note that -c writes the decompressed data to standard output.)
You can also use xargs, which with -print0/-0 has the additional benefit of dealing with spaces (" ") in file names:
find dirname -name '*.gz' -print0 | xargs -0 -L 1 gzip -c -d -r
-print0 outputs all the files found, separated by NUL characters. The -0 switch of xargs rebuilds the argument list by splitting on those NUL characters and applies the gzip command to each. Pay attention to the "-L 1" parameter, which tells xargs to pass only ONE file at a time to gzip.
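Putting the pieces together for the original question, here is a minimal self-contained sketch (the demo/ tree and file names are invented for illustration): every .gz found anywhere under the tree is decompressed into the tree's root folder, and spaces in filenames are handled by the NUL-separated read loop.

```shell
#!/usr/bin/env bash
# Build a throwaway tree so the example is self-contained.
mkdir -p demo/sub1 demo/sub2
printf 'hello\n' > demo/sub1/a.txt
printf 'world\n' > demo/sub2/b.txt
gzip demo/sub1/a.txt demo/sub2/b.txt   # leaves a.txt.gz and b.txt.gz behind

# The actual recipe: find every .gz, decompress it to stdout,
# and redirect into the root folder under the original basename.
find demo -name '*.gz' -print0 |
while IFS= read -r -d '' f; do
    base=$(basename "$f" .gz)          # strip directory part and .gz suffix
    gzip -c -d "$f" > "demo/$base"
done
```

Because -c sends the output to stdout, the original .gz files stay in their subfolders; drop them afterwards with a second find pass if you no longer need them.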


How to gunzip without delete?

How can I unzip all .gz archives into a specified folder without deleting them?
I tried: gunzip *.gz /folder
You can simply use the -k option described in the man page:
-k --keep
Keep (don't delete) input files during compression or decompression.
For example if you use the command
gunzip -k my_file.gz
my_file.gz will not be deleted after decompression.
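A quick round trip shows the effect (note that -k requires gzip 1.6 or newer; the file name here is just a placeholder):

```shell
#!/usr/bin/env bash
# Demonstrate that -k keeps the compressed input around.
printf 'some text\n' > my_file
gzip my_file                 # creates my_file.gz and removes my_file
gunzip -k my_file.gz         # restores my_file but keeps my_file.gz
ls my_file my_file.gz        # both files now exist side by side
```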
This can be done with some bash scripting, using gunzip's --stdout option and computing the uncompressed name with bash parameter expansion:
for i in *.gz
do
    echo "unzipping $i to /folder/${i%.gz}"
    gunzip --stdout "$i" > "/folder/${i%.gz}"
done
Note that leaving the filename unquoted makes such loops fail on names containing spaces. This find-based version quotes everything and also works recursively:
find /source/dir/ -type f -name '*.gz' \
  -exec bash -c 'f=${0##*/}; gunzip -c "$0" > "/folder/${f%.gz}"' {} \;
This command finds all *.gz files under the /source/dir/ directory and executes a bash command for each one. The bash command uncompresses the file ($0) to standard output and redirects the result to "/folder/" followed by the bare filename with its .gz extension removed. The ${0##*/} expansion strips the longest match of */ from the front of the path (i.e. the directory part), and ${f%.gz} removes the shortest match of .gz from the end.
The {} sequence is replaced with the path of each gzipped file found; find passes it to the bash command as its first argument ($0).
Replace /source/dir/ with a dot (.) if the source directory is the current directory.
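For example, the same find/-exec pattern flattens a small tree like this (src/ and out/ are made-up demo paths standing in for /source/dir/ and /folder/):

```shell
#!/usr/bin/env bash
# Set up a nested demo tree and an output folder.
mkdir -p src/deep/deeper out
printf 'payload\n' > src/deep/deeper/notes.txt
gzip src/deep/deeper/notes.txt         # produces src/deep/deeper/notes.txt.gz

# Decompress each .gz found under src/ into out/, dropping the
# directory part of the path so everything lands in one folder.
find src -type f -name '*.gz' \
  -exec bash -c 'f=${0##*/}; gunzip -c "$0" > "out/${f%.gz}"' {} \;
```

After this runs, out/notes.txt holds the decompressed content while the original archive stays under src/.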
Use gzip to decompress to stdout and redirect that into the file you want (the -k option doesn't work for me either):
gzip -dc somefile.gz > outputfile.txt

Finding a file within recursive directory of zip files

I have an entire directory structure with zip files. I would like to:
Traverse the entire directory structure recursively grabbing all the zip files
I would like to find a specific file "*myLostFile.ext" within one of these zip files.
What I have tried
1. I know that I can list files recursively pretty easily:
find . -type f
2. I know that I can list files inside zip archives:
unzip -l myfilename.zip
How do I find a specific file within a directory structure of zip files?
For single-level searches of .zip files (or recursive searches in bash 4 with globstar) you can skip find and use a for loop:
for i in *.zip; do grep -iq "mylostfile" < <( unzip -l "$i" ) && echo "$i"; done
for recursive searching in bash 4:
shopt -s globstar
for i in **/*.zip; do grep -iq "mylostfile" < <( unzip -l "$i" ) && echo "$i"; done
You can use xargs to process the output of find or you can do something like the following:
find . -type f -name '*zip' -exec sh -c 'unzip -l "$1" | grep -q myLostfile' sh {} \; -print
which will start searching in . for files that match *zip, then run unzip -l on each and grep the listing for your filename. If the filename is found, the name of the matching zip file is printed. (The path is passed to the inner shell as a positional argument rather than substituted into the script text, which avoids problems with special characters in file names.)
Some have suggested using ugrep to search zip files and tarballs. To find the zip files that contain a myLostfile file, specify it as a -g glob pattern like so:
ugrep -z -l -g'myLostfile' ''
With the empty regex pattern '', this recursively searches all files under the working directory, including any zip, tar, and cpio/pax archives, for myLostfile. If you only want to search the zip files located in the working directory:
ugrep -z -l -g'myLostfile' '' *.zip

tar/gzip excluding certain files

I have a directory with many sub-directories. In some of those sub-directories I have files with *.asc extension and some with *.xdr.
I want to create a SINGLE tarball/gzip file which maintains the directory structure but excludes all files with the *.xdr extension.
How can I do this?
I did something like find . -depth -name '*.asc' -exec gzip -r9 {} + but this gzips every *.asc file individually, which is not what I want to do.
You need to use the --exclude option:
tar -zc -f test.tar.gz --exclude='*.xdr' *
gzip will always handle files individually. If you want a bundled archive you will have to tar the files first and then gzip the result, hence you will end up with a .tar.gz file or .tgz for short.
To get better control over what you are doing, you can first find the files using the command you already posted (with -print instead of the gzip command) and put them into a file, then use this file (filelist.txt) to tell tar what to archive:
tar -T filelist.txt -c -v -f myarchive.tar
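The two-step filelist approach might look like this end to end (the proj/ tree and file names are invented; -z is added so tar gzips the archive in one step, matching the question's goal of a single .tar.gz):

```shell
#!/usr/bin/env bash
# Build a sample tree with wanted (*.asc) and unwanted (*.xdr) files.
mkdir -p proj/sub
printf 'keep\n' > proj/sub/data.asc
printf 'skip\n' > proj/sub/data.xdr

# Step 1: collect only the files you want into a list.
find proj -type f -name '*.asc' -print > filelist.txt

# Step 2: archive exactly that list, gzipping as we go.
tar -czvf myarchive.tar.gz -T filelist.txt

tar -tzf myarchive.tar.gz    # the listing shows only the .asc file
```

Because tar records the paths as listed, the directory structure under proj/ is preserved inside the archive while the *.xdr files never enter it.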

Extract and delete all .gz in a directory- Linux

I have a directory. It has about 500K .gz files.
How can I extract all .gz in that directory and delete the .gz files?
This should do it:
gunzip *.gz
@techedemic's answer is correct but is missing the '.' denoting the current directory, and this command goes through all subdirectories:
find . -name '*.gz' -exec gunzip '{}' \;
There's more than one way to do this obviously.
# This will find files recursively (you can limit it with find
# options; see the man pages).
# The final backslash is required for the -exec example to work.
find . -name '*.gz' -exec gunzip '{}' \;
# This will do it only in the current directory
for a in *.gz; do gunzip "$a"; done
I'm sure there's other ways as well, but this is probably the simplest.
And to remove any remaining archives, just run rm -f *.gz in the applicable directory (note that gunzip already deletes each .gz file after successfully decompressing it).
Extract all gz files in current directory and its subdirectories:
find . -name '*.gz' -print0 | xargs -0 gunzip
If you want to extract a single file use:
gunzip file.gz
It will extract the file and remove the .gz file.
If your .gz files are actually gzipped tarballs (.tar.gz), you can untar and delete each one:
for foo in *.gz
do
tar xf "$foo"
rm "$foo"
done
Try:
ls -1 | grep -E "\.tar\.gz$" | xargs -n 1 tar xvfz
Then try:
ls -1 | grep -E "\.tar\.gz$" | xargs -n 1 rm
This will untar all .tar.gz files in the current directory and then delete the archives. If you want an explanation: the "|" takes the stdout of the command before it and uses it as the stdin of the command after it. Use "man command" (without the quotes) to figure out what those commands and arguments do. Or you can research online.
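Parsing ls output is fragile with unusual file names; a sketch of the same untar-and-delete batch using find instead (bundle.tar.gz and payload/ are demo names, and -maxdepth/-delete are GNU/BSD extensions). Since -exec acts as a test, -delete only fires for archives that tar extracted successfully:

```shell
#!/usr/bin/env bash
# Create a sample tarball, then remove the source tree.
mkdir -p payload
printf 'data\n' > payload/file.txt
tar -czf bundle.tar.gz payload
rm -r payload

# Extract every *.tar.gz in the current directory; delete each
# archive only after its tar invocation exits successfully.
find . -maxdepth 1 -name '*.tar.gz' -exec tar -xzf {} \; -delete
```

After this runs, payload/file.txt is restored and bundle.tar.gz is gone; a corrupt archive would be left in place for inspection.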

Unzipping from a folder of unknown name?

I have a bunch of zip files, and I'm trying to make a bash script to automate the unzipping of certain files from it.
Thing is, although I know the name of the file I want, I don't know the name of the folder it's in; it is one folder deep.
How can I extract these files, preferably discarding the folder?
Here's how to unzip any given file at any depth and junk the folder paths on the way out:
unzip -j somezip.zip "*somefile.txt"
The -j junks any folder structure in the zip file, and the asterisk is a wildcard that matches along any path (quote it so the shell doesn't expand it before unzip sees it).
if you're in:
some_directory/
and the zip files are in any number of subdirectories, say:
some_directory/foo
find ./ -name myfile.zip -exec unzip {} -d /directory \;
Edit: As for the second part, removing the directory that contained the zip file I assume?
find ./ -name myfile.zip -exec unzip {} -d /directory \; -exec sh -c 'echo rm -rf "$(dirname "$1")"' sh {} \;
(The dirname call has to run inside the -exec'd shell; a backtick substitution on the outer command line would be expanded by your shell before find ever runs.) Notice the "echo": that's a sanity check. I always echo first when executing something destructive like rm -rf in a loop/iterative sequence like this. Remove the echo once the output looks right. Good luck!
Have you tried unzip somefile.zip "*/blah.txt"?
You can use find to find the file that you need to unzip, and xargs to call unzip:
find /path/to/root/ -name 'zipname.zip' -print0 | xargs -0 unzip
-print0 makes find output NUL-separated paths, so the command works with files or paths that contain whitespace; -0 is the xargs option that makes it consume that NUL-separated input.
