I'm trying to create a shell file I can set and forget...
Images with resolutions higher than 1920px should be resized. Higher than 1920 seems too high to me for an e-commerce store anno 2018 ...
On an 8K+ image, an iPhone in Chrome, for example, simply crashes...
I have already replaced the scripts in as many places as possible so that appropriate resolutions are served on the front end, rather than original files simply being scaled down in the browser. For example, 4000 x 6720 originals are now 500 x 1000 and are shown at 250 x 500 (so they are still sharp on retina).
These will then be in this lower resolution in the Magento Cache and then in Varnish Cache.
The idea is now to create a shell script that cleans up these "too large" files by resizing them in place:
find /home/customer/customer.com \
-type f \
-regex "^.*\.\(png\|jpg\|jpeg\)$" \
-exec identify -format "%d/%f,%w,%h\n" {} \; \
| awk -F ',' '$2 > 1080 && $3 > 1920' \
| grep "png\|jpg\|jpeg"
Gives me:
/home/sample/sample.com/media/catalog_fm/1453138191_LIU.JO_Dames_kleed_P16049T1633_nero_2_x15148601020101.jpg, 1440, 2160
/home/sample/sample.com/media/catalog_fm/1446052276_LIU.JO JEANS_Dames_vest_W65221E0139_22222_2_x14864501010101.jpg, 1440, 2160
/home/sample/sample.com/media/catalog_fm/1446655568_LIU-JO ACCESOIRES_Dames_shoes & boots_S65019P0055_22222_4_x14874601010101.jpg, 1440, 2160
...
The intention is to resize this now as follows:
convert FILENAME -verbose -resize x1920 FILENAME
I want to do all of these separate steps in a single shell file. Right now I redirect the generated list to a .txt file, extract the filenames, and run the resize command manually on the copy-pasted data.
I don't think your approach is the best; you can tell convert to only resize images if they are larger, IIRC:
convert "${FILENAME}" -resize "${MAXWIDTH}x${MAXHEIGHT}>" "${FILENAME}"
So you can just find all the pictures and unleash find on them:
find . -type f \( -name "*.png" -or -name "*.jpg" -or -name "*.jpeg" \) -exec convert {} -resize "${MAXWIDTH}x${MAXHEIGHT}>" {} \;
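A hedged sketch of the whole thing as a single set-and-forget shell file (the 1080x1920 bound and the path are taken from your own pipeline; adjust to taste):
#!/bin/bash
# Resize any catalog image that exceeds the bounding box, in place.
MAXWIDTH=1080
MAXHEIGHT=1920
find /home/customer/customer.com -type f \
  \( -iname "*.png" -o -iname "*.jpg" -o -iname "*.jpeg" \) \
  -exec convert {} -verbose -resize "${MAXWIDTH}x${MAXHEIGHT}>" {} \;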
Related
I have the following cmd that fetches all .pdf files with an STP pattern in the filename and places them into a folder:
find /home/OurFiles/Images/ -name '*.pdf' |grep "STP*" | xargs cp -t /home/OurFiles/ImageConvert/STP/
I have another cmd that converts pdf to jpg.
find /home/OurFiles/ImageConvert/STP/ -type f -name '*.pdf' -print0 |
while IFS= read -r -d '' file
do convert -verbose -density 500 -resize 800 "${file}" "${file%.*}.jpg"
done
Is it possible to combine these commands into one? Also, I would like to prepend a prefix to the converted image file name in the single command, if possible. Example: STP_OCTOBER.jpg to MSP-STP_OCTOBER.jpg. Any feedback is much appreciated.
find /home/OurFiles/Images/ -type f -name '*STP*.pdf' -exec sh -c '
destination=$1; shift # get the first argument
for file do # loop over the remaining arguments
fname=${file##*/} # get the filename part
cp "$file" "$destination" &&
convert -verbose -density 500 -resize 800 "$destination/$fname" "$destination/MSP-${fname%pdf}jpg"
done
' sh /home/OurFiles/ImageConvert/STP {} +
You could pass the destination directory and all PDFs found to find's -exec option to execute a small script.
The script removes the first argument and saves it to variable destination and then loops over the given PDF paths. For each filepath, extract the filename, copy the file to the destination directory and run the convert command if the copy operation was successful.
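For example, with the hypothetical filename from the question, the parameter expansion builds the new name like this:
fname=STP_OCTOBER.pdf
echo "MSP-${fname%pdf}jpg"    # prints MSP-STP_OCTOBER.jpg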
Maybe something like:
find /home/OurFiles/Images -type f -name 'STP*.pdf' -print0 |
while IFS= read -r -d '' file; do
destfile="/home/OurFiles/ImageConvert/STP/MSP-$(basename "$file" .pdf).jpg"
convert -verbose -density 500 -resize 800 "$file" "$destfile"
done
The only really new thing in this merged one compared to your two separate commands is using basename(1) to strip off the directories and extension from the filename in order to create the output filename.
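For reference, basename drops the directory part and an optional suffix, e.g.:
basename /home/OurFiles/Images/STP_OCTOBER.pdf .pdf    # prints STP_OCTOBER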
I wrote a shell script which:
gets a list of all image files from a directory
creates a new folder for the new image if needed
optimizes the image in order to save storage resources
I've tried to use parallel -j "$(nproc)" before mogrify, but found that it was wrong, because DIR and mkdir are used before mogrify; what I need instead is something like & at the end of mogrify, but limited to n processes at a time.
The current code looks like:
#!/bin/bash
find $1 -iname "*.jpg" -o -iname "*.jpeg" -o -iname "*.png" -o -iname "*.gif" -type f | while read IMAGE
do
DIR="$2"/`dirname $IMAGE`
echo "$IMAGE > $DIR"
mkdir -p $DIR
mogrify -path "$DIR" -resize "6000000#>" -filter Triangle -define filter:support=2 -unsharp 0.25x0.08+8.3+0.045 -dither None -posterize 136 -quality 82 -define jpeg:fancy-upsampling=off -define png:compression-filter=5 -define png:compression-level=9 -define png:compression-strategy=1 -define png:exclude-chunk=all -interlace none -colorspace sRGB "$IMAGE"
done
exit 0
Can someone suggest the right way to run such a script in parallel? Each run takes about 15 seconds.
When you have a shell loop that does some setup and invokes an expensive command, the way to parallelize it is to use sem from GNU parallel:
for i in {1..10}
do
echo "Doing some stuff"
sem -j +0 sleep 2
done
sem --wait
This allows the loop to run and do its thing as normal, while also scheduling the commands to run in parallel (-j +0 runs one job per CPU core).
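Applied to the script in the question, a rough sketch might look like this (the mogrify options are copied unchanged from the question; only the expensive call is queued through sem, while the cheap setup stays sequential):
#!/bin/bash
find "$1" \( -iname "*.jpg" -o -iname "*.jpeg" -o -iname "*.png" -o -iname "*.gif" \) -type f | while read -r IMAGE
do
    DIR="$2"/$(dirname "$IMAGE")
    echo "$IMAGE > $DIR"
    mkdir -p "$DIR"        # cheap setup runs sequentially
    sem -j +0 mogrify -path "$DIR" -resize "6000000#>" -filter Triangle -define filter:support=2 -unsharp 0.25x0.08+8.3+0.045 -dither None -posterize 136 -quality 82 -define jpeg:fancy-upsampling=off -define png:compression-filter=5 -define png:compression-level=9 -define png:compression-strategy=1 -define png:exclude-chunk=all -interlace none -colorspace sRGB "$IMAGE"
done
sem --wait                 # wait for all queued mogrify jobs to finish
exit 0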
Make a bash function that deals correctly with one file and call that in parallel:
#!/bin/bash
doit() {
IMAGE="$1"
DIR="$2"/`dirname $IMAGE`
echo "$IMAGE > $DIR"
mkdir -p $DIR
mogrify -path "$DIR" -resize "6000000#>" -filter Triangle -define filter:support=2 -unsharp 0.25x0.08+8.3+0.045 -dither None -posterize 136 -quality 82 -define jpeg:fancy-upsampling=off -define png:compression-filter=5 -define png:compression-level=9 -define png:compression-strategy=1 -define png:exclude-chunk=all -interlace none -colorspace sRGB "$IMAGE"
}
export -f doit
find "$1" \( -iname "*.jpg" -o -iname "*.jpeg" -o -iname "*.png" -o -iname "*.gif" \) -type f |
parallel doit {} "$2"
The default for GNU Parallel is to run one job per CPU thread, so -j "$(nproc)" is not needed.
This has less overhead than starting sem for each file (sem = 0.2 sec per call, parallel = 7 ms per call).
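Assuming you save this as, say, optimize.sh (the name is just an example), it is called with the source and destination directories, like the original script:
chmod +x optimize.sh
./optimize.sh /path/to/images /path/to/optimized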
I have the code below for image conversion.
I have a directory with many images, and I would like to convert all images whose width is less than 200 pixels, regardless of the extension: jpg, gif or png.
find . -iname \*.jpg -exec convert -verbose -resize 200x140! "{}" "{}" \;
I think you want this - or something very close to it - so make a backup first!
find . \( -iname \*.jpg -o -iname \*.png -o -iname \*.gif \) \
-exec bash -c '[ $(identify -format %w "$0" ) -lt 200 ] && convert "$0" -resize 200x140\! "$0"' {} \;
That says... "find, starting in the current directory (.), any files whose names end, in a case-insensitive fashion (-iname), in JPG, PNG or GIF and start a new bash shell for each one. Once inside the shell, get the width of the file and if it is less than 200 pixels, execute the convert command to resize the file to 200x140, ignoring aspect ratio."
The "first" part in there is: acquiring the width of all images in that folder. And if I read your question correctly, that is where you have problems with; thus you can look into the identify command coming with ImageMagick. It works like this
identify -format "%wx%h" pic.jpg
See the ImageMagick documentation on format escapes for the details of the formatting. As soon as you have your list of "width matching" files, you should be able to convert them further.
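A rough sketch of that second step, assuming the 200x140 target from your own command (make a backup first):
for f in *.jpg *.jpeg *.png *.gif; do
    [ -e "$f" ] || continue                    # skip patterns that matched nothing
    w=$(identify -format "%w" "$f[0]")         # width of the first frame
    if [ "$w" -lt 200 ]; then
        convert "$f" -resize '200x140!' "$f"   # force the exact size, overwriting in place
    fi
done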
The situation
I'm looking for a way to batch-resize approximately 15 million images of different file types to fit a certain bounding box resolution (in this case the image(s) cannot be bigger than 1024*1024), without distorting the image and thus retaining the correct aspect ratio. All files are currently located on a Linux server on which I have sudo access, so if I need to install anything, I'm good to go.
Things I've tried
After dabbling with some tools under Windows (Adobe Photoshop and others), I am no longer willing to run this on my own machine, as it becomes virtually unusable while processing. Considering the size of this job, I'm really looking for some command-line magic to run it directly on Linux, but my endeavors with ImageMagick so far haven't given me anything to work with, as I'm getting nothing but errors.
To be honest, ImageMagick's documentation could use some work... or someone should put in the effort to make a good web interface for creating one of these mythical image conversion one-liners.
Required output format
I need the images to be resized in place (same filename, same format) so that they fit inside a certain maximum dimension, for example 1024*1024, meaning:
a JPG of 2048*1024 becomes a JPG of 1024*512 at 75% quality
a PNG of 1024*2048 becomes a PNG of 512*1024
The resulting image should contain no additional transparent pixels to fill up the remaining pixels; I'm just looking for a way to convert the images to a limited resolution.
Thanks for any help!
The best way I found to convert millions of images like these is by creating a simple bash script which starts converting all the images it finds, like the one listed below:
To edit this bash script, I use nano (if you don't have nano: "apt-get install nano" for Ubuntu/Debian or "yum install nano" for CentOS/CloudLinux; for other distributions, use Google), but you're free to use any editor you want.
Bash script
First, create the bash script by starting your favorite editor (mine's nano):
nano -w ~/imgconv.sh
Then fill it with this content:
#!/bin/bash
find ./ -type f -iname "*.jpeg" -exec mogrify -verbose -format jpeg -layers Dispose -resize 1024\>x1024\> -quality 75% {} +
find ./ -type f -iname "*.jpg" -exec mogrify -verbose -format jpg -layers Dispose -resize 1024\>x1024\> -quality 75% {} +
find ./ -type f -iname "*.png" -exec mogrify -verbose -format png -alpha on -layers Dispose -resize 1024\>x1024\> {} +
Then all you need to do is make it executable with chmod +x ~/imgconv.sh and run it from the main images directory where you want to resize the images in all subdirectories:
cd /var/www/webshop.example.com/public_html/media/
~/imgconv.sh
That should start the conversion process.
Explanation
The way the script works is that it uses find to find files with the extension .jpeg in any capitalization and then runs a command:
find ./ -type f -iname "*.jpeg" -exec <COMMAND> {} +
... and then executes the appropriate mogrify job using the "-exec {} +" parameter:
mogrify -verbose -format jpeg -layers Dispose -resize 1024x1024\> -quality 75% <### the filename goes here, in this case *.jpeg ###>
If you're working with files older than today and you want to avoid re-doing files you've already converted today, you can even tell find to only pick up files modified more than a day ago by adding the option -mtime +1, like so:
#!/bin/bash
find ./ -type f -mtime +1 -iname "*.jpeg" -exec mogrify -verbose -format jpeg -layers Dispose -resize 1024\>x1024\> -quality 75% {} +
find ./ -type f -mtime +1 -iname "*.jpg" -exec mogrify -verbose -format jpg -layers Dispose -resize 1024\>x1024\> -quality 75% {} +
find ./ -type f -mtime +1 -iname "*.png" -exec mogrify -verbose -format png -alpha on -layers Dispose -resize 1024\>x1024\> {} +
Performance
A really simple way to use more cores to perform this process is to fork each job to the background by adding a & after each line. Another way would be to use GNU Parallel, especially with the -X parameter as it will use all your CPU cores and get the job done many times quicker.
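For example, a hedged sketch of the GNU Parallel route for one of the extensions (same mogrify options as above; -X packs as many filenames as fit onto each mogrify invocation):
find ./ -type f -iname "*.jpg" -print0 |
parallel -0 -X mogrify -verbose -format jpg -layers Dispose -resize 1024x1024\> -quality 75%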
But no matter what kind of parallelization technique you'll be using, be sure only to do that on your own system and not on a shared disk system where your production platform resides, since going for maximum performance will bog down your hardware or hypervisor performance.
This job is going to take a while, so be sure to set it up in a screen session or a terminal without timeout/noop packets beforehand. On my system, it churned through about 5000 files per minute, so the entire job should take roughly 50-60 hours... sounds like a fine job to run over the weekend.
Just be sure to keep the file extensions separated in separate commands as above; piling all the options on top of each other and having mogrify apply every option to every image format won't work.
ImageMagick is a powerful tool in the right hands.
I used the following command to convert and merge all the JPG files in a directory to a single PDF file:
convert *.jpg file.pdf
The files in the directory are numbered from 1.jpg to 123.jpg. The conversion went fine but after converting, the pages were all mixed up. I wanted the PDF to have pages from 1.jpg to 123.jpg in the same order as they are named. I tried it with the following command as well:
cd 1
FILES=$( find . -type f -name "*jpg" | cut -d/ -f 2)
mkdir temp && cd temp
for file in $FILES; do
BASE=$(echo $file | sed 's/.jpg//g');
convert ../$BASE.jpg $BASE.pdf;
done &&
pdftk *pdf cat output ../1.pdf &&
cd ..
rm -rf temp
But still no luck. Operating system is Linux.
From the manual of ls:
-v natural sort of (version) numbers within text
So, doing what we need in a single command:
convert $(ls -v *.jpg) foobar.pdf
Mind that convert is part of ImageMagick.
The problem is because your shell is expanding the wildcard in a purely alphabetical order, and because the lengths of the numbers are different, the order will be incorrect:
$ echo *.jpg
1.jpg 10.jpg 100.jpg 101.jpg 102.jpg ...
The solution is to pad the filenames with zeros as required so they're the same length before running your convert command:
$ for i in *.jpg; do num=`expr match "$i" '\([0-9]\+\).*'`;
> padded=`printf "%03d" $num`; mv -v "$i" "${i/$num/$padded}"; done
Now the files will be matched by the wildcard in the correct order, ready for the convert command:
$ echo *.jpg
001.jpg 002.jpg 003.jpg 004.jpg 005.jpg 006.jpg 007.jpg 008.jpg ...
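After the rename, the wildcard expands in the desired order, so the original command from the question produces correctly ordered pages:
convert *.jpg file.pdf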
You could use
convert '%d.jpg[1-123]' file.pdf
via https://www.imagemagick.org/script/command-line-processing.php:
Another method of referring to other image files is by embedding a
formatting character in the filename with a scene range. Consider the
filename image-%d.jpg[1-5]. The command
magick image-%d.jpg[1-5] causes ImageMagick to attempt to read images
with these filenames:
image-1.jpg image-2.jpg image-3.jpg image-4.jpg image-5.jpg
See also https://www.imagemagick.org/script/convert.php
All of the above answers failed for me when I wanted to merge many high-resolution JPEG images (from a scanned book).
ImageMagick tried to load all files into RAM, so I used the following two-step approach:
find -iname "*.JPG" | xargs -I'{}' convert {} {}.pdf
pdfunite *.pdf merged_file.pdf
Note that with this approach, you can also use GNU parallel to speed up the conversion:
find -iname "*.JPG" | parallel -I'{}' convert {} {}.pdf
This is how I do it:
The first line converts all jpg files to pdf using the convert command.
The second line merges all the pdf files into a single one, one page per pdf, using gs (the PostScript and PDF language interpreter and previewer).
for i in $(find . -maxdepth 1 -name "*.jpg" -print); do convert $i ${i//jpg/pdf}; done
gs -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=merged_file.pdf -dBATCH `find . -maxdepth 1 -name "*.pdf" -print`
https://gitlab.mister-muffin.de/josch/img2pdf
In all of the proposed solutions involving ImageMagick, the JPEG data gets fully decoded and re-encoded. This results in generation loss, as well as performance "ten to hundred" times worse than img2pdf.
img2pdf is also available from many Linux distros, as well as via pip3.
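A minimal usage sketch of that route, assuming the numbered files from the original question (ls -v keeps the natural numeric order):
img2pdf $(ls -v *.jpg) -o file.pdf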
Mixing the first idea with their reply, I think this code may be satisfactory:
jpgs2pdf.sh
#!/bin/bash
cd $1
FILES=$( find . -type f -name "*jpg" | cut -d/ -f 2)
mkdir temp > /dev/null
cd temp
for file in $FILES; do
BASE=$(echo $file | sed 's/.jpg//g');
convert ../$BASE.jpg $BASE.pdf;
done &&
pdftk `ls -v *pdf` cat output ../`basename $1`.pdf
cd ..
rm -rf temp
How to create a PDF document from a list of images
Step 1: Install parallel from the repository. This will speed up the process.
Step 2: Convert each jpg to pdf file
find -iname "*.JPG" | sort -V | parallel -I'{}' convert -compress jpeg -quality 25 {} {}.pdf
The sort -V will sort the file names in natural order.
Step 3: Merge all PDFs into one
pdfunite $(find -iname '*.pdf' | sort -V) output_document.pdf
Credit Gregor Sturm
Combining Felix Defrance's and Delan Azabani's answers (from above):
convert `for file in $FILES; do echo $file; done` test_2.pdf