convert images to pdfs in subdirectories - linux

I have a bunch of sub-folders with images in them. I am trying to recursively convert the images in each folder into a PDF file, using the directory name as the PDF name. With the help of some Google searches I tried this script:
#!/bin/bash
for D in `find . -type d` do
convert *.jpg ${PWD##*/}.pdf
end
It did not work. How can I get this to work?
Each folder has several .jpg files in it, numbered 01-10 etc. Running convert *.jpg name.pdf inside a folder combines all of its images into one PDF file. I want a script that does this for every folder and names each PDF after its directory.
I would also like the script to then move the converted PDF up a level, into the parent directory.

A while loop is more appropriate here. Try this:
find . -type d | while read d; do convert "${d}"/*.jpg ./"${d##*/}.pdf"; done
find . -type d returns all directories under the current directory.
while read d reads each directory path into the variable $d.
convert "${d}"/*.jpg performs the conversion on all .jpg images in directory $d.
./"${d##*/}.pdf" strips the path down to just the directory name, appends .pdf, and writes the PDF into the directory the command is run from (the parent directory in your layout).
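If any of the directory names contain spaces or other awkward characters, a null-delimited variant of the same idea is safer. This is a minimal sketch assuming GNU or BSD find and bash; the ls test simply skips folders that contain no .jpg files:
# -print0 / read -d '' keep odd directory names intact; -mindepth 1 skips "." itself
find . -mindepth 1 -type d -print0 | while IFS= read -r -d '' d; do
    ls "$d"/*.jpg >/dev/null 2>&1 || continue   # nothing to convert here
    convert "$d"/*.jpg ./"${d##*/}.pdf"
done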

Thanks to @savanto for the answer. I modified it a bit for my needs: read through the files in a directory's subdirectories, convert one file type to a different file type, name the file, and save it in the same directory (using OS X 10.10, Terminal, bash 3.2).
cd /parent_directory
find . -type d | while read d; do mogrify -format png "${d}"/*.eps; done
Or, letting find match the files directly and run mogrify next to each one:
find . -name '*.eps' -execdir mogrify -format png {} \;
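If your find supports terminating -execdir with + (GNU and BSD find both do), a small variant batches the .eps files per directory instead of starting one mogrify process per file; each output .png still lands next to its source file:
find . -name '*.eps' -execdir mogrify -format png {} +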

Here's a version of savanto's answer that uses img2pdf, which is faster, lossless, and makes smaller file sizes than ImageMagick.
find . -type d | while read d; do img2pdf "${d}"/*.jp2 --output ./"${d##*/}.pdf"; done
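The same pattern should also cover the .jpg folders from the original question. This is an untested sketch assuming img2pdf is installed (for example via pip install img2pdf); a folder with no matching images will just make img2pdf report an error and produce no PDF:
find . -type d | while read d; do img2pdf "${d}"/*.jpg --output ./"${d##*/}.pdf"; done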

Easy:
find path/to/images -iname "*.jpg" | xargs -I %this convert %this %this.pdf
Very simple and clean!
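Note that this makes one PDF per image (name.jpg.pdf) rather than one PDF per folder. A hedged variant that also copes with spaces in file names, assuming GNU find and xargs:
find path/to/images -iname "*.jpg" -print0 | xargs -0 -I %this convert %this %this.pdf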

Related

ImageMagick - How do you convert files recursively in the shell?

I have a folder tree, with subfolders within subfolders, of BMP files I'd like to convert to JPG format, but I can't figure out a working command to get mogrify/ImageMagick to go through every folder recursively.
Any solutions?
The easiest way to pass the list of files is probably backticks around a recursive find command:
mogrify -format jpg `find . -name '*.bmp'`
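The backtick form breaks if any path contains spaces, since the shell word-splits the output of find. A hedged alternative that side-steps that, and batches the files per mogrify invocation:
find . -iname '*.bmp' -exec mogrify -format jpg {} +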

list base files in a folder with numerous date-stamped versions of a file

I've got a folder with numerous versions of files (thousands of them), each with a unique date/time stamp as the file extension. For example:
./one.20190422
./one.20190421
./one.20190420
./folder/two.txt.20190420
./folder/two.txt.20190421
./folder/folder/three.mkv.20190301
./folder/folder/three.mkv.20190201
./folder/folder/three.mkv.20190101
./folder/four.doc.20190401
./folder/four.doc.20190329
./folder/four.doc.20190301
I need to get a unique list of the base files. For the above example, this would be the expected output:
./one
./folder/two.txt
./folder/folder/three.mkv
./folder/four.doc
I've come up with the below code, but am wondering if there is a better, more efficient way.
# find all directories
find ./ -type d | while read folder ; do
# go into that directory
# then find all the files in that directory, excluding sub-directories
# remove the extension (date/time stamp)
# sort and remove duplicates
# then loop through each base file
cd "$folder" && find . -maxdepth 1 -type f -exec bash -c 'printf "%s\n" "${#%.*}"' _ {} + | sort -u | while read file ; do
# and find all the versions of that file
ls "$file".* | customFunctionToProcessFiles
done
done
If it matters, the end goal is to find all the versions of a specific file, grouped by base file, and process them. So my plan was to get the base files, then loop through the list and find all the version files for each one. So, using the above example again, I'd process all the one.* files first, then the two.* files, etc...
Is there a better, faster, and/or more efficient way to accomplish this?
Some notes:
There are potentially thousands of files. I know I could just search for all files from the root folder, remove the date/time extension, then sort and de-duplicate, but with that many files I thought it might be more efficient to loop through the directories.
The date/time stamp extension of the file is not in my control and it may not always be just numbers. The only thing I can guarantee is it is on the end after a period. And, whatever format the date/time is in, all the files will share it -- there won't be some files with one format and other files with another format.
You can use find ./ -type f -regex to look for the files directly:
find ./ -type f -regex '.*\.[0-9]+'
./some_dir/asd.mvk.20190422
./two.txt.20190420
Also, you can pipe the result to your function through xargs without needing the while loops:
re='(.*)(\.[0-9]{8,8})'
find ./ -type f -regextype posix-egrep -regex "$re" -print0 | \
sed -zre "s/$re/\1/" | \
xargs -0r customFunctionToProcessFiles
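If the end goal is to process each base file's versions as a group, as the question describes, one possible sketch building on the same regex is below. It assumes GNU find/sed/sort for the null-delimited options, and customFunctionToProcessFiles is the question's own placeholder:
re='(.*)(\.[0-9]{8,8})'
find ./ -type f -regextype posix-egrep -regex "$re" -print0 |
sed -zre "s/$re/\1/" |
sort -zu |
while IFS= read -r -d '' base; do
    # all date-stamped versions of this base file go to the function as one group
    customFunctionToProcessFiles "$base".*
done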

List all the directories which contain a *.pdf file in linux and pipe the result to a file

Is there a way in Linux to list all the directories which contain a *.pdf file using something like grep and piping all the results to a file?
The reason why I want to do this is I have a large static html based site and I want to list all the urls which contain a pdf.
The site's structure is your typical tree with many levels and sub-levels; here is a small, cut-down example:
public_html/
    content/
        help/
        advice/
        marketing/
            campaign1/
                pdfs/
            campaign2/
                pdfs/
    shop/
        templates/
            nav/
                guarantees/
    includes/
    etc...
I want to search through this structure, list all the PDFs contained in it, and pipe them to a file. The example output would look like below, with each result on a new line:
public_html/content/marketing/campaign1/pdfs/example1.pdf
public_html/content/marketing/campaign1/pdfs/example2.pdf
public_html/shop/templates/nav/guarantees/guarantee.pdf
I would only want the pdfs from the public_html folder. I wouldn't want to search through my home, bin, tmp, var etc... folders on the hard drive.
You could do this with the find command:
find /public_html -mindepth 1 -iname "*.pdf" -type f > output-file
Explanation:
/public_html # The directory the find operation starts from.
-mindepth 1 # Skip the starting directory itself; find descends into subdirectories by default.
-iname "*.pdf" # Match names ending in .pdf, case-insensitively.
-type f # Only files.
Hope this helps.
find "Directory" | grep .pdf> outputfile
example : find Desktop/|grep pdf > 1.txt
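If you actually want the directories rather than the PDF paths, as the title suggests, one hedged option is to strip each match back to its directory and de-duplicate (GNU find users could use -printf '%h\n' instead of calling dirname):
find public_html -type f -iname '*.pdf' -exec dirname {} \; | sort -u > output-file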

How to convert jpg files into png files with linux command? + Difficulty = Subfolders

I want to convert several jpg files into png files. As far as I know, one can use this command
mogrify -format png *.*
I have one problem: I have a lot of subfolders. Let's say a is my main folder and b, c and d are subfolders. The images are in the subfolders.
How can I convert all images without having to open every folder manually?
I would like to write a command that works when I am in folder a but applies to all the files in the subfolders.
Assuming you're in folder a, the following might work for you:
find . -name "*.jpg" -exec mogrify -format png {} \;
You can use the find command to get all the jpg files in all the subfolders and pass your conversion command to find via -exec.
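A hedged refinement of the command above: -iname also picks up .JPG/.JPEG spellings, and terminating -exec with + passes the files to mogrify in batches instead of starting one process per file:
find . \( -iname '*.jpg' -o -iname '*.jpeg' \) -exec mogrify -format png {} +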

imagemagick convert files by directory

I wrote this simple shell script to convert JPGs using ImageMagick. It works fine, but I would like to include PNGs, GIFs, JPEGs, etc., passing the file extension through the script on each iteration of the find. I prefer this approach of looping over a find so that I can better report on each item processed, and so the script scales to adding other sizes and transformations to each step (rather than a simple convert * command).
Any suggestions?
find cdn/ -name '*.jpg' -print | sort |
while read f;
do
b=$(basename "$f" .jpg)
in="${b}.jpg"
thumb="${b}_150x150.jpg"
if [ -e "$thumb" ];
then
true
else
convert -resize 150 "$in" "$thumb"
fi
done
+1 for "Splitting up the problem into 1) finding the files, 2) deciding what to do with a file and 3) process the file. Making it modular will split the problem into parts which you can tackle separately." as previously suggested. This allow a more scalable approach for adding more processings.
This way, you don't need to pass the files by extensions. Is that what you want, passing the files by extensions? Do you have to do that?
Also,
Use -iname '*.jpg' instead of -name '*.jpg' so as to do a case-insensitive search.
Use more -iname parameters on the find to find all other extensions that you want. E.g.,
find cdn/ \( -iname '*.jpg' -o -iname '*.jpeg' -o -iname '*.png' -o -iname '*.gif' \) -print
I.e., you find all the files you want in one pass, instead of running find over and over for different extensions.
Make it more modular, call a second script to process the image files.
find /path -type f -print |
while read filename ; do
sh /path/to/process_image "$filename"
done
Within the process_image script you can then choose what to do based on the file's extension or file type. The script could call other scripts depending on what you want to do based on the image type, size, etc.
Splitting up the problem into 1) finding the files, 2) deciding what to do with a file and 3) process the file. Making it modular will split the problem into parts which you can tackle separately.
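As a hedged illustration of that split, a hypothetical process_image script might branch on the extension; the 150-pixel thumbnail rule is taken from the question, everything else here is an assumption:
#!/bin/sh
# Hypothetical process_image sketch: decide what to do based on the extension.
f=$1
ext=${f##*.}
case "$ext" in
    jpg|JPG|jpeg|JPEG|png|PNG|gif|GIF)
        thumb="${f%.*}_150x150.$ext"
        # only create the thumbnail if it does not already exist,
        # mirroring the check in the question's script
        [ -e "$thumb" ] || convert -resize 150 "$f" "$thumb"
        ;;
    *)
        echo "skipping non-image file: $f" >&2
        ;;
esac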
