Recursively compressing images in a folder-structure, preserving the folder-structure - linux

I have this folder structure, with very large, high-quality images in each subfolder:
tree -d ./
Output:
./
├── 1_2dia-pedro
├── 3dia-pedro
│   ├── 101MSDCF
│   └── 102MSDCF
├── 4dia-pedro
└── Wagner
    ├── 410_0601
    ├── 411_0701
    ├── 412_0801
    ├── 413_2101
    ├── 414_2801
    ├── 415_0802
    ├── 416_0902
    ├── 417_1502
    ├── 418_1602
    ├── 419_1702
    ├── 420_2502
    └── 421_0103
18 directories
I want to compress them, just as I would with ffmpeg or ImageMagick,
e.g.,
ffmpeg -i %%F -q:v 10 processed/%%F
mogrify -quality 3 $F.png
I'm currently thinking of creating an array (vector) of the directories, using shopt, as discussed here
shopt -s globstar dotglob nullglob
printf '%q\n' **/*/
Then, create a new folder-compressed, with the same structure
mkdir folder-compressed
<<iterate the array-vector-out-of-shopt>>
Finally, compress the images in each subfolder, with something along the lines of
mkdir processed
for f in *.jpg;
do ffmpeg -i "$f" -q:v 1 processed/"${f%.jpg}.jpg";
done
Also, I read this question, and this procedure seems close to what I would like,
for f in mydir/**/*
do
# operations here
done
Major problem: I'm a bash newbie. I know all the tools needed are at my disposal!
EDIT: There is a program that, for the particular purpose of compressing images losslessly, gives us a one-liner and a lot of options for the compression. The caveat: make a copy of the original folder structure first, because it permanently changes the files in the folder you give it.
cp -r ./<folder-structure-files> ./<folder-structure-files-copy>
image_optim -r ./<folder-structure-files-copy>
I think @m0hithreddy's solution is pretty cool, though. I will certainly be using that logic elsewhere soon.

Instead of creating the directories up front with mkdir, you can create the required directories on the fly. To me, a recursive solution looks more elegant than loops here. Below is a straightforward approach. I echo the file names and directories to keep track of what's going on. I am not an ffmpeg pro, so I used cp instead, but it should work fine for your use case.
Shell script:
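source=original/
destination=compressed/
# f1 REL: mirror ${source}REL into ${destination}REL, copying every .jpg.
# REL is a path relative to ${source}; it is empty or ends with a slash.
f1() {
    mkdir -p "${destination}${1}"
    for file in "${source}${1}"*.jpg
    do
        [ -f "$file" ] || continue   # the glob matched nothing
        echo 'Original Path:' "$file"
        echo 'Compressed Path:' "${destination}${1}$(basename "$file")"
        cp "$file" "${destination}${1}$(basename "$file")"
    done
    for dir in "${source}${1}"*/
    do
        [ -d "$dir" ] || continue    # no sub-directories here
        echo 'Enter sub-directory:' "$dir"
        f1 "${dir#*/}"               # strip the leading "original/"
    done
}
f1 ''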
Terminal Session:
$ ls
original script.sh
$ tree original/
original/
├── f1
│   ├── f16
│   │   └── f12.jpg
│   ├── f5
│   │   └── t4.jpg
│   └── t3.txt
├── f2
│   └── t5.txt
├── f3
├── f4
│   └── f10
│       ├── f2
│       │   └── f6.jpg
│       └── f3.jpg
├── t1.jpg
└── t2.txt
8 directories, 8 files
$ sh script.sh
Original Path: original/t1.jpg
Compressed Path: compressed/t1.jpg
Enter sub-directory: original/f1/
Enter sub-directory: original/f1/f16/
Original Path: original/f1/f16/f12.jpg
Compressed Path: compressed/f1/f16/f12.jpg
Enter sub-directory: original/f1/f5/
Original Path: original/f1/f5/t4.jpg
Compressed Path: compressed/f1/f5/t4.jpg
Enter sub-directory: original/f2/
Enter sub-directory: original/f3/
Enter sub-directory: original/f4/
Enter sub-directory: original/f4/f10/
Original Path: original/f4/f10/f3.jpg
Compressed Path: compressed/f4/f10/f3.jpg
Enter sub-directory: original/f4/f10/f2/
Original Path: original/f4/f10/f2/f6.jpg
Compressed Path: compressed/f4/f10/f2/f6.jpg
$ tree compressed/
compressed/
├── f1
│   ├── f16
│   │   └── f12.jpg
│   └── f5
│       └── t4.jpg
├── f2
├── f3
├── f4
│   └── f10
│       ├── f2
│       │   └── f6.jpg
│       └── f3.jpg
└── t1.jpg
8 directories, 5 files
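The same "create directories on the fly" idea can also be written without recursion by letting find walk the tree. The following is only a sketch under assumptions: mirror_jpgs is a hypothetical name, and cp again stands in for the real compressor (for example the ffmpeg call from the question).

```shell
# mirror_jpgs SRC DST: process every .jpg under SRC into the same
# relative location under DST, creating directories as needed.
mirror_jpgs() {
    find "${1%/}" -type f -name '*.jpg' -exec sh -c '
        src=$1 dst=$2
        shift 2
        for file do
            rel=${file#"$src"/}                  # path relative to SRC
            mkdir -p "$(dirname "$dst/$rel")"    # create the target directory on demand
            cp "$file" "$dst/$rel"               # stand-in: put the real compressor here
        done
    ' sh "${1%/}" "${2%/}" {} +
}
```

Called as mirror_jpgs original/ compressed/, it handles file names with spaces. Unlike the recursive script, it only creates directories that end up containing at least one .jpg.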

Related

Using bash command to copy files from a subfolder to another

I have the following structure:
.
├── dag_1
│   ├── dag
│   │   ├── current
│   │   └── deprecated
│   └── sparkjobs
│       ├── current
│       │   └── spark_3.py
│       └── deprecated
│           ├── spark_1.py
│           └── spark_2.py
├── dag_2
│   ├── dag
│   │   ├── current
│   │   └── deprecated
│   └── sparkjobs
│       ├── current
│       │   └── spark_3.py
│       └── deprecated
│           ├── spark_1.py
│           └── spark_2.py
I want to create a new folder getting only current spark jobs, my expected output folder is:
.
├── dag_1
│   └── spark_3.py
└── dag_2
    └── spark_3.py
I've tried to use
find /mnt/c/Users/User/Test/ -type f -wholename "sparkjob/current" | xargs -i cp {} /mnt/c/Users/User/Test/output/
However, my script does not write any files and returns no error. How can I solve this?
Use the following. The install command copies the input file into another directory tree, creating any missing directories along the way, just as mkdir -p would:
(you need the * wildcards in -wholename for find to match any files; also note that, per your tree output, the directory is named sparkjobs, not sparkjob)
find . -type f -wholename "*/sparkjobs/current/*" -exec bash -c '
dir=${1#./} dir=${dir%%/*} file=${1##*/}
install -D "$1" "./$dir/$file"
' bash {} \;
Example of what is done:
install -D ./dag_2/sparkjobs/current/spark_3.py ./dag_2/spark_3.py
install -D ./dag_1/sparkjobs/current/spark_3.py ./dag_1/spark_3.py
The source path here is just an example; a longer path works the same way.
First you should check what find returns by removing everything after |. You'll see find doesn't find any files. The reasons:
as the name implies, -wholename matches the whole name, so you need */sparkjob/current/*
according to your tree output, the folder is not named sparkjob but sparkjobs.
I'd start with something like this:
find /mnt/c/Users/User/Test/ -type f -wholename "*/sparkjobs/current/*" -print0 | while IFS= read -r -d '' file; do
echo mv "$file" "$(realpath "$(dirname "$file")"/../..)"
done
I added an echo so you can check all paths and commands are correct.
You may want to trade simplicity for performance. See https://mywiki.wooledge.org/BashFAQ/001 if performance is important (many files or frequent runs).
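Both reasons are easy to confirm on a throwaway tree before touching real data. The helper below is a hypothetical sketch (count_matches is not part of find); it uses -path, which behaves like GNU find's -wholename, and counts how many regular files a pattern matches.

```shell
# count_matches DIR PATTERN: number of regular files under DIR whose
# whole path matches the shell pattern PATTERN.
count_matches() {
    find "$1" -type f -path "$2" | wc -l
}
```

On a tree containing dag_1/sparkjobs/current/spark_3.py, the patterns "sparkjob/current" and "*/sparkjob/current/*" both count 0, while "*/sparkjobs/current/*" counts 1.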
You'll want to do:
mkdir ../new_folder
find . -type f \
-path '*/sparkjobs/current/*' \
-exec bash -c 'f=$1
new=${f/sparkjobs\/current\//}
dest="../new_folder/$(dirname "$new")"
mkdir -p "$dest"
cp -v "$f" "$dest"' bash '{}' \;
‘./dag_1/sparkjobs/current/spark_3.py’ -> ‘../new_folder/./dag_1/spark_3.py’
‘./dag_2/sparkjobs/current/spark_3.py’ -> ‘../new_folder/./dag_2/spark_3.py’
This looks pretty straightforward.
for d in "$old_loc"/dag_*
do mkdir -p "$new_loc/${d##*/}"
cp "$d"/sparkjobs/current/spark_*.py "$new_loc/${d##*/}"
done

Bash brace expansion - operand behavior

When using brace expansion with certain commands, the actual behavior differed from what I expected: one member of the brace list was treated as an argument alongside the other member's expansion.
For instance,
$ mkdir -p {folder1,folder2,folder3}/{folderA,folderB,folderC}
Works as expected -
$ tree .
.
├── folder1
│   ├── folderA
│   ├── folderB
│   └── folderC
├── folder2
│   ├── folderA
│   ├── folderB
│   └── folderC
└── folder3
    ├── folderA
    ├── folderB
    └── folderC
However, if we do
$ cp -r folder1/ folder2/{folderA,folderB}
Instead of folder1 being copied to both folder2/folderA and folder2/folderB, 'folderA' is interpreted as a second source. Thus we get -
.
├── folder1
│   ├── folderA
│   ├── folderB
│   └── folderC
├── folder2
│   ├── folderA
│   ├── folderB
│   │   ├── folder1
│   │   │   ├── folderA
│   │   │   ├── folderB
│   │   │   └── folderC
│   │   └── folderA
│   └── folderC
└── folder3
    ├── folderA
    ├── folderB
    └── folderC
Can anyone explain why this is the case? I would have thought the above to be evaluated as -
$ cp -r folder1/ folder2/folderA
$ cp -r folder1/ folder2/folderB
Brace expansion doesn't result in multiple commands, it's just expanded in place in the original command. So the result is
cp -r folder1/ folder2/folderA folder2/folderB
When cp gets more than two arguments, the last one is the destination directory and the rest are the source files and directories.
If you want multiple commands, you can use an explicit loop:
for dest in folder2/{folderA,folderB}; do
cp -r folder1/ "$dest"
done
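A quick way to see this for any command is to put echo in front of it, so the shell prints the expanded argument list instead of running the command. The sketch below wraps the call in bash -c only because brace expansion is a bash feature that a plain sh may not provide; the variable name preview is just for illustration.

```shell
# Brace expansion happens before the command runs, so echo shows
# exactly which arguments cp would have received.
preview=$(bash -c 'echo cp -r folder1/ folder2/{folderA,folderB}')
echo "$preview"
# prints: cp -r folder1/ folder2/folderA folder2/folderB
```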

Renaming files with the same names as directory - bash script

I want to rename my files so that they are named after the folder that contains them.
I have a main folder that contains around 1000 folders. Each of these folders has another folder within it. In that innermost folder, I have files with different extensions, and I want to rename the files that have the pdb extension.
Here's the structure of my folders:
pv----|
|--m10\ pk\ result0.pdb result1.pdb result2.pdb
|--m20\ pk\ result0.pdb result1.pdb result2.pdb
|--m30\ pk\ result0.pdb result1.pdb result2.pdb
I want something like this :
pv----|
|--m10\ pk\ m10_result0.pdb m10_result1.pdb m10_result2.pdb
|--m20\ pk\ m20_result0.pdb m20_result1.pdb m20_result2.pdb
|--m30\ pk\ m30_result0.pdb m30_result1.pdb m30_result2.pdb
This is the code I wrote, but it's not working:
for d in MD_PR2 / * / * /
do
(cd "$d" && for file in *.pdb ; do mv "$file" "${file/result/$d_result}" ; done)
done
My code removes "result" from each file's name and I don't know why. The files become 0.pdb, 1.pdb, etc.
thank you very much
Before:
user@pc:~$ tree
.
├── m10
│   └── pk
│       ├── result0.pdb
│       ├── result1.pdb
│       └── result2.pdb
├── m20
│   └── pk
│       ├── result0.pdb
│       ├── result1.pdb
│       └── result2.pdb
└── m30
    └── pk
        ├── result0.pdb
        ├── result1.pdb
        └── result2.pdb
Your code is not working because $d_result is being interpreted as a variable name, not as a concatenation of $d and _result. I suggest using ${d}_result.
However I would suggest another approach, one that doesn't need to cd into each directory.
Code:
shopt -s globstar
for file in **; do
if [[ "$file" =~ ".pdb" ]] ; then
mv "$file" "$(echo "$file" | sed -e 's/\(.*\)\/\(.*\)\/\(.*\.pdb\)/\1\/\2\/\1_\2_\3/')";
fi;
done;
After:
user@pc:~$ tree
.
├── m10
│   └── pk
│       ├── m10_pk_result0.pdb
│       ├── m10_pk_result1.pdb
│       └── m10_pk_result2.pdb
├── m20
│   └── pk
│       ├── m20_pk_result0.pdb
│       ├── m20_pk_result1.pdb
│       └── m20_pk_result2.pdb
└── m30
    └── pk
        ├── m30_pk_result0.pdb
        ├── m30_pk_result1.pdb
        └── m30_pk_result2.pdb
Code explanation:
shopt -s globstar: Allow for ** to be expanded into "all files and directories recursively"
Variable "file" contains filenames including directories
Check "file" against "$file" =~ ".pdb" to ignore working with directories
Generate newfilename with sed:
Search and replace: s/search/replace/
Find something like dir1/dir2/smthg.pdb: (.*)/(.*)/(.*.pdb)
Replace with dir1/dir2/dir1_dir2_smthg.pdb: \1/\2/\1_\2_\3 (replace with \1_\2_\3 if you also want to move renamed files into parent dir)
(I removed some backslashes for readability)
mv file to newfilename
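If only the top-level folder name is wanted as the prefix (m10_result0.pdb, as in the question), plain parameter expansion is enough and neither cd nor sed is needed. This is a sketch, assuming the fixed two-level layout from the question; prefix_with_top_dir is a hypothetical name. Note the braces in ${top}_: without them the shell would look up a variable called top_, which is the same pitfall as $d_result.

```shell
# prefix_with_top_dir ROOT: rename ROOT/<dir>/<sub>/<name>.pdb
# to ROOT/<dir>/<sub>/<dir>_<name>.pdb
prefix_with_top_dir() {
    for file in "$1"/*/*/*.pdb; do
        [ -f "$file" ] || continue    # the glob matched nothing
        sub=${file%/*}                # e.g. ROOT/m10/pk
        top=${sub%/*}                 # e.g. ROOT/m10
        top=${top##*/}                # e.g. m10
        mv -- "$file" "$sub/${top}_${file##*/}"
    done
}
```

Run it once, for example as prefix_with_top_dir . from inside the main folder; running it a second time would add the prefix again.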

Linux: Batch rename multiple files to parent dir + suffix in order of name

I need to batch rename multiple images and want to use the parent directory as the base name. To prevent one file from overwriting another, a numeric suffix must be added. The renaming must follow the timestamp order of the files, because the 'first' file is the featured image on the site I'm using them for.
Tree:
└── main
    ├── white tshirt
    │   ├── IMG_1.jpg
    │   ├── IMG_2.jpg
    │   ├── IMG_3.jpg
    │   └── IMG_4.jpg
    ├── black tshirt
    │   ├── IMG_1.jpg
    │   ├── IMG_2.jpg
    │   ├── IMG_3.jpg
    │   └── IMG_4.jpg
    └── red tshirt
        ├── IMG_1.jpg
        ├── IMG_2.jpg
        ├── IMG_3.jpg
        └── IMG_4.jpg
Goal:
└── main
    ├── white tshirt
    │   ├── white-tshirt-1.jpg
    │   ├── white-tshirt-2.jpg
    │   ├── white-tshirt-3.jpg
    │   └── white-tshirt-4.jpg
    ├── black tshirt
    │   ├── black-tshirt-1.jpg
    │   ├── black-tshirt-2.jpg
    │   ├── black-tshirt-3.jpg
    │   └── black-tshirt-4.jpg
    └── red tshirt
        ├── red-tshirt-1.jpg
        ├── red-tshirt-2.jpg
        ├── red-tshirt-3.jpg
        └── red-tshirt-4.jpg
Replacing spaces with dashes is not required, but preferred. Platform: Debian 8
I think this should do the job:
#!/bin/sh
for dir in *; do
if [ ! -d "$dir" ]; then
continue
fi
cd "$dir"
pref=$(echo "$dir" | tr ' ' -)
i=1
ls -tr | while read f; do
ext=$(echo "$f" | sed 's/.*\.//')
mv "$f" "${pref}-${i}.$ext"
i=$(expr $i + 1)
done
cd ..
done
Invoke the script inside your main directory and make sure it contains only your target folders. Also make sure your file names do not contain the character '\', because read without -r treats it as an escape character.

Linux/shell - Remove all (sub)subfolders from a directory except one

I've inherited a structure like the below, a result of years of spaghetti code...
gallery
├── 1
│   ├── deleteme1
│   ├── deleteme2
│   ├── deleteme3
│   └── full
│   ├── file1
│   ├── file2
│   └── file3
├── 2
│   ├── deleteme1
│   ├── deleteme2
│   ├── deleteme3
│   └── full
│   ├── file1
│   ├── file2
│   └── file3
└── 3
├── deleteme1
├── deleteme2
├── deleteme3
└── full
├── file1
├── file2
└── file3
In reality, this folder contains thousands of subfolders. I only need to keep ./gallery/{number}/full/* (i.e. the full folder and all files within it, from each numbered directory within gallery); everything else is no longer required and needs to be deleted.
Is it possible to construct a one-liner to handle this? I've experimented with find/maxdepth/prune but could not find an arrangement that met my needs.
(Update: To clarify, all folders contain files - none are empty)
Building on PaddyD's answer, you can first delete the unwanted files and then delete the directories that are left empty:
find . -type f -not -path "./gallery/*/full/*" -exec rm {} + && find . -type d -empty -delete
This can easily be done with bash extglobs, which allow matching all files that don't match a pattern:
shopt -s extglob
rm -ri ./gallery/*/!(full)
How about:
find . -type d -empty -delete
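Since every approach here deletes data, it may be worth listing the candidates before removing anything. The following is a minimal sketch, assuming a find that supports -mindepth and -maxdepth (GNU and BSD find do); list_unwanted is a hypothetical helper name.

```shell
# list_unwanted GALLERY: print every entry directly inside
# GALLERY/<number>/ whose name is not "full". Nothing is deleted.
list_unwanted() {
    find "$1" -mindepth 2 -maxdepth 2 ! -name full -print
}
```

Once the output of list_unwanted ./gallery looks right, replacing -print with -exec rm -r {} + in the same find command would perform the deletion.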
