Recursively unzip all subdirectories while retaining file structure - linux

I'm new to bash scripting, and I'm finding it hard to solve this one.
I have a parent folder containing a mixture of sub directories and zipped sub directories.
Within those sub directories are also more nested zip files.
Not only are there .zip files, but also .rar and .7z files which also contain nested zips/rars/7zs.
I want to unzip, unrar and un7z all my nested sub directories recursively until the parent folder no longer contains any .rar, .zip or .7z files (the archives need to be removed once they have been extracted). There could be thousands of sub directories, all at different nesting depths. An archive may hold a zipped folder or a zipped file.
However, I want to retain my folder structure, so each unzipped folder must stay in the same place as the archive it came from.
I have tried this script, which works for unzipping, but it does not retain the file structure.
#!/bin/bash
while [ "`find . -type f -name '*.zip' | wc -l`" -gt 0 ]
do
find . -type f -name "*.zip" -exec unzip -- '{}' \; -exec rm -- '{}' \;
done
I want for example:
Folder 'a' contains a zipped folder 'b.zip', which in turn contains a zipped text file 'pear.zip' (pear.txt zipped to pear.zip), i.e. a/b.zip(/pear.zip).
I would like folder 'a' to contain 'b', and 'b' to contain pear.txt: 'a/b/pear.txt'.
The script above puts both 'b' (which is empty) and pear.txt into folder 'a', where the script is executed, which is not what I want: 'a/b' and 'a/pear.txt'.

You could try this:
#!/bin/bash
while :; do
    # The parentheses make -type f and -print0 apply to both name tests.
    mapfile -td '' archives \
        < <(find . -type f \( -name '*.zip' -o -name '*.7z' \) -print0)
    [[ ${#archives[@]} -eq 0 ]] && break
    for i in "${archives[@]}"; do
        case $i in
            *.zip) unzip -d "$(dirname "$i")" -- "$i";;
            *.7z) 7z x "-o$(dirname "$i")" -- "$i";;
        esac
    done
    rm -rf "${archives[@]}" || break
done
Every archive is listed by find. That list is extracted in the correct location and the archives removed. This repeats, until zero archives are found.
You can add an equivalent unrar command (I'm not familiar with it).
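Add -o -name '*.rar' to the find expression, and another branch to the case statement. If there's no option to specify a target directory with unrar, you could run it from the archive's directory in a subshell: (cd "$(dirname "$i")" && unrar x "$(basename "$i")"). The subshell keeps the cd from affecting later iterations, and basename is needed because $i is relative to the starting directory.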
There are some issues with this script. In particular, if extraction fails, the archive is still removed. Otherwise it would cause an infinite loop. You can use unzip ... || exit 1 to exit if extraction fails, and deal with that manually.
It's possible to both avoid removal and also an infinite loop, by counting files which aren't removed, but hopefully not necessary.
I couldn't test this properly. YMMV.
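The last idea above (keep archives that fail to extract, yet still terminate) could be sketched roughly as follows. This is not the answerer's code: extract_all is a hypothetical helper name, the -q/-o options for unzip and -y for 7z are assumptions chosen to suppress prompts, and the stop condition (a pass that leaves the set of remaining archives unchanged) is just one possible way to detect that nothing more can be extracted.

```shell
# extract_all ROOT (hypothetical helper): extract nested .zip/.7z archives in
# place, deleting each archive only after it extracted successfully.
# Returns 0 when no archives are left, 1 when only failing archives remain.
extract_all() (
    root=$1
    before=
    while :; do
        find "$root" -type f \( -name '*.zip' -o -name '*.7z' \) -exec sh -c '
            for a do
                d=$(dirname "$a")
                case $a in
                    *.zip) unzip -q -o -d "$d" "$a" ;;
                    *.7z)  7z x -y "-o$d" "$a" >/dev/null ;;
                esac && rm -f -- "$a"
            done
        ' sh {} +
        # Archives still present after this pass (failed or newly revealed).
        after=$(find "$root" -type f \( -name '*.zip' -o -name '*.7z' \) | sort)
        [ -z "$after" ] && return 0
        # Same set as after the previous pass: nothing was extracted, so stop.
        [ "$after" = "$before" ] && return 1
        before=$after
    done
)
```

Calling extract_all . from the parent folder would then leave any broken archives where they are, for manual inspection. An archive that fails is retried once more before the loop gives up.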

Related

shell command to delete all directories with empty __init__.py file

I'm looking for a command for Linux shell, that will recursively delete all directories containing just empty __init__.py file and/or other empty directories. So if any file in that directory actually contains at least one byte, it shouldn't be removed.
So, in other words, remove all empty python modules recursively.
Please note that if a directory contains anything other than an empty __init__.py file, it shouldn't be deleted.
What i've found/tried so far was:
find . -type d -empty -delete
And
find . -type d -size -5k -delete
And
find . -type d -size 0 -delete
The first one deletes only directories without any files (in my example, they contain an empty __init__.py file).
The second one, for some reason, captures all directories.
The third doesn't capture anything.
It may be possible to do this with one complicated find command, but it's more manageable if you break it up into stages:
Delete empty __init__.py files.
Delete empty directories.
If you do this bottom up using -depth then it'll naturally remove directories containing only empty init files and/or nested empty directories.
find . -depth '(' -type d -o -name __init__.py ')' -print0 |
while IFS= read -d '' -r path; do
    [[ -f $path && ! -s $path ]] && (($(ls -A1 "$(dirname "$path")" | wc -l) == 1)) && rm "$path"
    rmdir "$path" 2> /dev/null || :
done
Steps:
Use -depth to process children before parents.
Find directories and __init__.pys.
Process each match in a loop. -print0 pairs up with read -d '' to make sure we handle paths with spaces and newlines properly.
The only files we matched were __init__.py, so [[ -f && ! -s ]] matches empty init files. (($(ls -A1 "$(dirname "$path")" | wc -l) == 1)) checks that the init file is the only one in its directory. If both conditions are met, the init file is removed.
Try to rmdir the path. If it's an empty directory it'll be removed. If it's a file or a non-empty directory it won't be. That's fine: errors are suppressed with 2> /dev/null. || : ignores the failed exit code, making it safe to run this script with set -e.
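The same bottom-up idea can also be written without parsing ls output. The following is a minimal sketch, assuming a find that supports -mindepth and -maxdepth (GNU and BSD find do); prune_empty_modules is a hypothetical helper name, not something from the answer above.

```shell
# prune_empty_modules ROOT (hypothetical helper name)
prune_empty_modules() {
    find "$1" -depth -type d -exec sh -c '
        dir=$1
        init=$dir/__init__.py
        # Count direct children other than __init__.py; any such child means
        # the directory is not an empty module and must be kept.
        others=$(find "$dir" -mindepth 1 -maxdepth 1 ! -name __init__.py | wc -l)
        if [ "$others" -eq 0 ] && [ -f "$init" ] && [ ! -s "$init" ]; then
            rm -- "$init"
        fi
        # Succeeds only if the directory is now empty; otherwise it is kept.
        rmdir -- "$dir" 2>/dev/null || :
    ' sh {} \;
}
```

Because -depth visits children first, a parent whose only contents were empty modules is itself empty by the time it is visited. Note that ROOT itself is removed too if it ends up empty.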
It sounds like you've already covered how to delete empty directories with your first find command. To delete any directories that have empty files in them you can use:
find . -size 0 -exec dirname {} + | xargs rm -rf
Here the find command prints the directory name of every empty file it finds, and those directory names are piped to xargs, which removes them. Be aware that rm -rf removes each of those directories entirely, including any non-empty files that sit next to the empty one.
If you have files other than __init__.py that may be empty, you can restrict the match to 0-byte files named __init__.py with:
find . -size 0 -and -name "__init__.py" -exec dirname {} + | xargs rm -rf

Rename all files in multiple folders with some condition in single linux command or script.

I have multiple folders with multiple files. I need to rename each file to the name of the folder it is stored in, with a "_partN" suffix.
For example,
I have a folder named "new_folder_for_upload" which has 2 files. I need to rename these 2 files to:
new_folder_for_upload_part1
new_folder_for_upload_part2
I have many folders like the one above, each with multiple files. I need to rename all the files as described above.
Can anybody help me find a single linux command or script to do this automatically?
Assuming a bash shell, that you want the file numbering to restart for each subdirectory, and that all files should be moved to the top directory (leaving empty subdirectories). Formatted as a script for easier reading:
find . -type f -print0 | while IFS= read -r -d '' file
do
    myfile=${file#./}    # strip the leading "./" that find adds
    mydir=$(dirname "$myfile")
    if [[ $mydir != "$lastdir" ]]
    then
        NR=1
    fi
    lastdir=$mydir
    mv "$myfile" "${mydir}_part${NR}"
    ((NR++))
done
Or as one-line command:
find . -type f -print0 | while IFS= read -r -d '' file; do myfile=${file#./}; mydir=$(dirname "$myfile"); if [[ $mydir != "$lastdir" ]]; then NR=1; fi; lastdir=$mydir; mv "$myfile" "${mydir}_part${NR}"; ((NR++)); done
Beware. This is armed, and will do a bulk renaming / moving of every file in or below your current work directory. Use at your own risk.
To delete the empty subdirs:
find . -depth -empty -type d -delete
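If the files should stay inside their own folders, as the question describes, a variant along these lines might do. It is only a sketch under assumptions: rename_parts is a hypothetical helper name, numbering follows the shell's glob (alphabetical) order, hidden files are skipped, and file extensions are dropped just as in the answer above.

```shell
# rename_parts TOP (hypothetical helper): rename the regular files in every
# directory below TOP to <dirname>_part1, <dirname>_part2, ... in place.
rename_parts() {
    find "$1" -mindepth 1 -type d -exec sh -c '
        dir=$1
        base=$(basename "$dir")
        n=1
        for f in "$dir"/*; do
            [ -f "$f" ] || continue      # skip subdirectories and empty globs
            # -n refuses to overwrite an existing file of the same name.
            mv -n -- "$f" "$dir/${base}_part$n"
            n=$((n + 1))
        done
    ' sh {} \;
}
```

Running rename_parts . from the directory that holds the folders would process each folder independently, so the counter cannot be reset by the order in which find happens to list files.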

Command for moving subfolders with files, with keeping the original structure

I have a parent/ folder with a couple of subfolders in it. Structure:
/parent/
/subfolder_1/
- file_1.txt
- file_2.txt
/subfolder_2/
- file_3.txt
- file_4.txt
Now, I need to recursively move the contents of parent/ folder to the empty parent_tmp/ directory. Thing is, I need to keep the original folder structure in parent/.
Expected outcome after moving:
/parent/
/subfolder_1/
(empty)
/subfolder_2/
(empty)
/parent_tmp/
/subfolder_1/
- file_1.txt
- file_2.txt
/subfolder_2/
- file_3.txt
- file_4.txt
Normally, I would simply do
mv parent/* parent_tmp
but this will, of course, move the subfolders permanently.
Is there a way to adjust the mv command to keep the original structure of the source directory?
Note:
I realize that I can e.g. copy parent/ to parent_tmp, and then remove the files in parent/ subfolders. This is plan B to me.
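You can use find from the directory that contains both parent and parent_tmp:
find parent -type f -exec bash -c 'rel=${1#parent/}; mkdir -p "parent_tmp/$(dirname "$rel")" &&
mv "$1" "parent_tmp/$rel"' - {} \;
Stripping the leading parent/ from each path keeps the files from ending up under parent_tmp/parent/.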
You could copy the files
cp -r parent/* parent_tmp/
or create hard links (should be a lot faster for big files)
cp -l -r parent/* parent_tmp/
and then delete the original files
find parent -type f -delete
while keeping the directory structure.
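The copy-then-delete steps can be chained so that nothing is deleted unless the copy succeeded. A minimal sketch, where copy_then_empty is a hypothetical helper name; it copies parent/. rather than parent/* so that hidden files are included, which is a small departure from the commands above.

```shell
# copy_then_empty SRC DST (hypothetical helper): copy the tree, then delete
# the regular files from SRC only if the copy reported success.
copy_then_empty() {
    mkdir -p "$2" &&
    cp -R "$1"/. "$2"/ &&
    find "$1" -type f -exec rm -- {} +
}
```

For the layout in the question this would be called as copy_then_empty parent parent_tmp.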
Zip the content of the parent folder and Unzip it in the target folder.
Quick and Dirty:
I don't think you'll find a tool or option in the mv command to do what you want, but you should be able to achieve the desired goal by using find:
cd parent && while IFS= read -r file ; do dirname="$(dirname "$file")" ; mkdir -p "../parent_tmp/$dirname" ; mv "$file" "../parent_tmp/$file" ; done < <( find . -type f ) && cd -
Function
If you use this a lot then you can add the above to your ~/.bashrc like so (append to the end of the file):
alias mvkp=moveandkeep
moveandkeep() {
    cd "$1" || return    # stop if the source directory is missing
    while IFS= read -r file
    do
        dirname="$(dirname "$file")"
        mkdir -p "$2/$dirname"
        mv "$file" "$2/$file"
    done < <(find . -type f)
    cd - > /dev/null
}
Now you could simply do the following: (Full path to directories required)
mvkp /home/user/parent /home/user/parent_tmp

Finding a file within recursive directory of zip files

I have an entire directory structure with zip files. I would like to:
Traverse the entire directory structure recursively grabbing all the zip files
I would like to find a specific file "*myLostFile.ext" within one of these zip files.
What I have tried
1. I know that I can list files recursively pretty easily:
find myLostfile -type f
2. I know that I can list files inside zip archives:
unzip -ls myfilename.zip
How do I find a specific file within a directory structure of zip files?
You can omit using find for single-level (or recursive in bash 4 with globstar) searches of .zip files using a for loop approach:
for i in *.zip; do grep -iq "mylostfile" < <( unzip -l "$i" ) && echo "$i"; done
for recursive searching in bash 4:
shopt -s globstar
for i in **/*.zip; do grep -iq "mylostfile" < <( unzip -l "$i" ) && echo "$i"; done
You can use xargs to process the output of find or you can do something like the following:
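find . -type f -name '*zip' -exec sh -c 'unzip -l "$1" | grep -q myLostfile' sh {} \; -print
which will start searching in . for files that match *zip, then run unzip -l on each and search the listing for your filename. If that filename is found, it will print the name of the zip file that matched it. The archive path is passed to sh as an argument ($1) rather than embedded in the command string, so unusual file names cannot be misread as shell code.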
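The find approach can be wrapped in a small function so the search term is passed in rather than hard-coded. This is a sketch only: zips_containing is a hypothetical helper name, and matching with grep -iF (case-insensitive, fixed string) is an assumption about what is wanted.

```shell
# zips_containing NAME [DIR] (hypothetical helper): print every .zip below DIR
# (default: current directory) whose listing mentions NAME, case-insensitively.
zips_containing() {
    find "${2:-.}" -type f -name '*.zip' -exec sh -c '
        needle=$1
        shift
        for z do
            # -F treats NAME as a fixed string rather than a regex.
            unzip -l "$z" 2>/dev/null | grep -qiF -- "$needle" && printf "%s\n" "$z"
        done
    ' sh "$1" {} +
}
```

For example, zips_containing myLostFile.ext . would search below the current directory. Since unzip -l also prints the archive's own path in its header, a search term that occurs in the archive path itself will match as well.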
Some have suggested to use ugrep to search zip files and tarballs. To find the zip files that contain a mylostfile file, specify it as a -g glob pattern like so:
ugrep -z -l -g'myLostfile' ''
With the empty regex pattern '', this recursively searches all files below the working directory, including any zip, tar, and cpio/pax archives, for mylostfile. If you only want to search the zip files located in the working directory:
ugrep -z -l -g'myLostfile' '' *.zip

Merging Sub-Folders together, Linux

I have a main folder "Abc" which has about 800 sub-folders. Each of these sub-folders contains numerous files (all of the same format, say ".doc"). How do I create one master folder with all these files (and not being distributed into subfolders). I am doing this on a Windows 7 machine, using cygwin terminal.
The cp -r command copies it but leaves the files in the sub-folders, so it doesn't really help much. I'd appreciate assistance with this. Thank you!
Assuming there could be name collisions and multiple extensions, this will create unique names, changing directory paths to dashes (e.g. a/b/c.doc would become a-b-c.doc). Run this from within the folder you want to collapse:
# if globstar is not enabled, you'll need it.
shopt -s globstar
for file in */**; do [ -f "$file" ] && mv -i "$file" "${file//\//-}"; done
# get rid of the now-empty subdirectories.
find . -type d -empty -delete
If you can guarantee unique names, this will move the files and remove the subdirectories. You can change the two .s to the name of a folder and run it from outside said folder:
find . -depth \( -type f -exec mv -i {} . \; \) -o \( -type d -empty -delete \)
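For shells without globstar, the same dash-joined renaming can be sketched with find alone. The names here are assumptions for illustration: flatten is a hypothetical helper, and mv -n is used so that an existing file is never overwritten (a colliding file simply stays where it is).

```shell
# flatten DIR (hypothetical helper): move every file in a subdirectory of DIR
# up into DIR itself, turning the relative path into the new name
# (a/b/c.doc becomes a-b-c.doc), then remove the emptied subdirectories.
flatten() (
    cd "$1" || exit
    find . -mindepth 2 -type f -exec sh -c '
        for f do
            rel=${f#./}
            flat=$(printf "%s" "$rel" | tr / -)
            # -n refuses to overwrite if the flattened name already exists.
            mv -n -- "$f" "./$flat"
        done
    ' sh {} +
    find . -mindepth 1 -type d -empty -delete
)
```

The function body runs in a subshell, so the cd does not change the caller's working directory; flatten Abc would be run from the folder that contains Abc.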
This may not be the most elegant or efficient way to do it, but I believe it'd accomplish what you want:
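find abc -mindepth 2 -type f -print0 |
while IFS= read -r -d '' file
do
    # -n never overwrites, so a name collision leaves the file where it is
    mv -n "$file" "abc/$(basename "$file")"
done
Iterate through every regular file in the subfolders of abc (-mindepth 2 skips files that already sit directly in abc) and move each one from its current location (e.g. abc/d/example.txt) to abc/. Reading NUL-delimited names keeps paths with spaces intact.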
Edit: This would leave all the subfolders in place (but they'd be empty now)
