Batch JPEG optimization in Linux

I have many JPEG images on my server and more are added every day. I need to optimize these images.
To optimize them I use the following command:
find . -iname "*.jpg" -exec jpegoptim -m85 --strip-all {} \;
But the find command finds all images, not only the new ones! I know I can pass -ctime and -mtime to find, but when jpegoptim optimizes an image, the file's timestamps change to the time of optimization, so I cannot filter by last modification time.
I think the solution is to save the names of already-processed files in a text file and, when find runs again, exclude the files that are already listed there.
How can I do this? How do I append each found file name to the text file, and how do I check, on the next pass, whether a file name is already in it?

You could use inotifywait from inotify-tools (or some other inotify front end) to continuously monitor the folder that contains your JPEGs and have them optimized on the fly, as soon as they are uploaded.
$ inotifywait -mrq --format '%w%f' -e create . | while read -r file; do
      [ "${file##*.}" = "jpg" ] || continue
      [ -f "$file" ] || continue
      echo "jpegoptim -m85 --strip-all '$file'"
  done
This watches for all file creations (-e create) in the current folder (.) recursively (-r). You have to check for the .jpg extension yourself, as shown, which means that if there are lots of non-JPEG creations in the same folder this adds unnecessary overhead. If that is a problem for you, you could use the --exclude argument with a negated regex to filter out files that do not match the desired extension (which is cumbersome and not very straightforward), or you can check out the current version from the git repository, which provides an --include filter (v3.14 does not have it), and have inotifywait only report files that already match your criteria.
Also note that this will silently skip files with newlines in their filenames.
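If you would rather keep the log-file approach described in the question, a minimal sketch could look like the following (the log location ./processed.txt is just an assumption):
#!/bin/bash
# Sketch: remember which files were already optimized and skip them next time.
log=./processed.txt
touch "$log"
find . -iname '*.jpg' | while IFS= read -r file; do
    grep -Fxq "$file" "$log" && continue          # already optimized on an earlier run
    jpegoptim -m85 --strip-all "$file" \
        && printf '%s\n' "$file" >> "$log"        # record it only if optimization succeeded
done
Like the read loop above, this does not cope with file names that contain newlines.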

Related

Need script to move and rename files without overwriting duplicate filenames

Maybe I'm just going about this wrong and making it harder than it has to be.
This is my problem: I have two different scripts that download various picture files. The first downloads from email, and the downloaded files go into the /attachments/ directory. The second script copies the contents of Google Drive; all files and folders get copied into the ~/gdrive/ directory. I want to move all picture files from both of these folders, as well as any subfolders, to ~/Pictures/$today and prevent any overwriting in the case of duplicate file names. I don't mind having two separate scripts to handle the pictures in the two different directories, but I do need it to get all files in subdirectories of the starting point. It also needs to handle a variety of file extensions. My current solution adds a numbered suffix such as .~1~ after the file's normal extension (.jpg, .png, .tiff, etc.). I don't lose any files this way, but any that wind up with a backup number after the extension are rendered useless to my project. This is what I am currently using:
TODAY=$(date +"%m-%d-%Y")
mkdir -p ~/Pictures/$TODAY &&
sudo find /attachments -type f -exec mv --backup=numbered -t ~/Pictures/$TODAY {} +
My result if there are duplicate file names looks like this:
DSC07286.JPG
DSC07286.JPG.~1~
Is there a better approach than what I am doing? Is there a way to dissect the filename parts, reorganize them, and do it recursively for all files in the directory? Thanks
Something like this should do it (untested; uses standard lowercase variable names and puts the index just before the extension to not mess with sorting):
for path in ~/Pictures/"$today"/*.JPG
do
    index=0
    for duplicate_path in "$path".~[0-9]*
    do
        [ -e "$duplicate_path" ] || continue  # skip if the glob matched nothing
        new_path="${duplicate_path%%.*}${index}.JPG"
        echo "$duplicate_path" "$new_path"
        ((++index))
    done
done
When you're confident it's doing the right thing, simply replace echo with mv to actually move the files.
Here is my solution.
#!/bin/bash
TODAY=$(date +"%m-%d-%Y")
NOW=$(date +"%D %T")
sudo mkdir -p /home/pi/Pictures/emailpics/$TODAY &&
sudo find /attachments -type f -exec mv --backup=numbered -t /home/pi/Pictures/emailpics/$TODAY/ {} + &&
for f in /home/pi/Pictures/emailpics/$TODAY/*.~?~
do
    fullfilename=$f
    filepath=$(dirname "$fullfilename")
    filename=$(basename "$fullfilename")
    fname="${filename%.*}"
    bkpnum="${filename##*.}"
    file="${fname%.*}"
    ext="${fname##*.}"
    sudo mv "$f" "$filepath/$file$bkpnum.$ext"
done
I can't say I fully understand all the syntax for the parsing bits, but it works. Maybe someone else can explain what is going on.
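For what it's worth, the parsing bits are plain bash parameter expansion; here is the same sequence annotated with a made-up backup name:
filename='DSC07286.JPG.~1~'   # example name produced by mv --backup=numbered
fname="${filename%.*}"        # drop the shortest '.*' suffix   -> DSC07286.JPG
bkpnum="${filename##*.}"      # keep what follows the last dot  -> ~1~
file="${fname%.*}"            # drop the extension              -> DSC07286
ext="${fname##*.}"            # keep the extension              -> JPG
echo "$file$bkpnum.$ext"      # reassembled name                -> DSC07286~1~.JPG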

How to identify renamed files on linux?

I am using the 'find' command to identify modified files. But I've noticed that my method only identifies content-modified files and new files. It does not identify files where the only change was a rename. Is there a way to use 'find' to identify renamed files? If not, is there some other linux command that can be used for this?
Here is my current method for identifying changed files going back roughly one month (this method does NOT identify renamed files):
$ touch --date "2017-09-10T16:00:00" ~/Desktop/tmp
$ find ~/Home -newer ~/Desktop/tmp -type f > modified-files
You should replace the -newer test with \( -newer ~/Desktop/tmp -o -cnewer ~/Desktop/tmp \) in order to catch changes to file metadata as well; renaming a file updates its status change time (ctime), which -cnewer checks.
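Applied to the commands from the question, that would be something like:
$ touch --date "2017-09-10T16:00:00" ~/Desktop/tmp
$ find ~/Home \( -newer ~/Desktop/tmp -o -cnewer ~/Desktop/tmp \) -type f > modified-files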

Bash script to delete a file in all subdirectories.

I have a directory filled with subdirectories, exceeding 450 GB in total. Each subdirectory contains an instruction file. I have a script that copies the instruction file from the directory I am currently in into every subdirectory via:
#!/bin/bash
for d in */; do cp "INSTALLATION INSTRUCTIONS.rtf" "$d"; done
I need to remove all of these files in the subdirectories and replace them with new instructions. Can I simply write another script that does this:
#!/bin/bash
for d in */; do rm "INSTALLATION INSTRUCTIONS.rtf" "$d"; done
I am very hesitant and wanted to make sure, as these files are vitally important; I don't want to accidentally remove anything, and making a backup of 450+ GB is very taxing.
find . -mindepth 2 -name "INSTALLATION INSTRUCTIONS.rtf" -exec rm -f '{}' +
Since this is "vitally important" data, I would first list all files that match the file name you want to delete/overwrite, without taking any action on them (other than listing):
find /folder/ -type f -name "INSTALLATION INSTRUCTIONS.rtf" -print > /tmp/holder
That would create a list of matches in /tmp/holder. You can then analyze this list before taking any action (either visually or programmatically) to make sure it does not include anything you don't want to delete (when dealing with large amounts of data, strange things can happen, so be proactive about protecting the data).
If you are happy with what the list shows, then you could delete the old instructions, or if possible, overwrite them with the new file. Here's an example to overwrite the old file with the new one:
while read -r line; do cp --no-preserve=all /folder/newfile "$line"; done < /tmp/holder
The cp --no-preserve=all command (a GNU coreutils option, not a bash feature) ensures that the new file gets permissions appropriate to the folder where it ends up instead of carrying over the source file's attributes. You may change it to a plain cp if you don't want that to happen.
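If you decide to delete rather than overwrite, the same reviewed list can drive the removal; keeping an echo in front until you trust the output is a cheap safety net:
while read -r line; do echo rm -- "$line"; done < /tmp/holder
Remove the echo once the printed commands look right.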

Zipping and deleting files of a certain age

I'm trying to put together a command that will find files that haven't been modified in over 6 months and zip them in one command. Afterwards I want to delete all the files I just archived.
My current command to find the directories with the files is
find /var/www -type d -mtime -400 ! -mtime -180 | xargs ls -l > testd.txt
This gave me all the directories, including the files, that are older than 6 months.
Now I was wondering if there is a way of zipping all the results and deleting them afterwards. Something along the lines of:
find /var/www -type f -mtime -400 ! -mtime -180 | gzip -c archive.gz
If anyone knows the proper syntax to achieve this, I'd love to know. Thanks!
Edit: after a few tests, this command results in a corrupted file:
find /var/www -mtime -900 ! -mtime -180 | xargs tar -cf test4.tar
Any ideas?
Break this into several distinct steps that you can implement and thoroughly test separately:
1. Build a list of files to be archived and then deleted, saved to a temp file.
2. Use the list from step 1 to add the files to a .tar.gz archive. Give the archive file a name following a specific pattern that won't appear in the files being archived, and put it in a directory outside the hierarchy of files being archived.
3. Read the files back from the .tar.gz and compare them (or their hashes) to the original files to ENSURE that you got them all without corruption.
4. Use the list from step 1 to delete the files. Do not use a wildcard for deletion. Put in some guard code to prevent deletion of any file matching the name pattern of the archive .tar.gz file(s) created in step 2.
When testing a script that can do irreversible damage, always code the dangerous command with a leading echo and leave it that way until you are sure everything works. Only then remove the echo.
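A rough sketch of those steps, with the dangerous part still behind echo (the archive location /var/archives and the 180-day cutoff are assumptions to adapt):
#!/bin/bash
# 1. Build the list of files not modified in roughly the last 6 months.
list=$(mktemp)
find /var/www -type f -mtime +180 > "$list"

# 2. Archive them outside /var/www; -P keeps the absolute paths from the list.
mkdir -p /var/archives
archive=/var/archives/www-old-$(date +%F).tar.gz
tar -czPf "$archive" --files-from="$list"

# 3. Compare the archive contents against the files on disk before deleting anything.
tar -dzPf "$archive" || { echo "archive does not match the originals, aborting"; exit 1; }

# 4. Delete from the same list, guarded by echo until you are sure it is right.
while IFS= read -r f; do
    echo rm -- "$f"
done < "$list"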
Consider zip, it should meet your requirements.
find ... | zip -m -@ archive.zip
-m (move) deletes the input directories/files after making the specified zip archive.
-@ takes the list of input files from standard input.
You may find more useful options in the zip manual, e.g.
-r (recurse) travels the directory structure recursively.
-sf (show-files) shows the files that would be operated on, then exits.
-t or --from-date operates on files not modified prior to the specified date.
-tt or --before-date operates on files not modified after or at the specified date.
This could possibly make find expendable.
zip -mr --from-date 2012-09-05 --before-date 2013-04-13 archive /var/www

Bash script to recursively step through folders and delete files

Can anyone give me a bash script or one-line command I can run on Linux to recursively go through each folder from the current folder and delete all files or directories starting with '._'?
Change directory to the root directory you want (or change . to the directory) and execute:
find . -name "._*" -print0 | xargs -0 rm -rf
xargs allows you to pass several parameters to a single command, so it will be faster than using the find -exec syntax with \;. Also, you can run the find part on its own first, without the | xargs -0 rm -rf, to view the files it will delete and make sure it is safe.
find . -name '._*' -exec rm -Rf {} \;
I had a similar problem a while ago (I assume you are trying to clean up a drive that was connected to a Mac, which saves a lot of these files), so I wrote a simple Python script that deletes these and other useless files; maybe it will be useful to you:
http://github.com/houbysoft/short/blob/master/tidy
find /path -name "._*" -exec rm -fr "{}" +
Instead of deleting the AppleDouble files, you could merge them with the corresponding files. You can use dot_clean.
dot_clean -- Merge ._* files with corresponding native files.
For each dir, dot_clean recursively merges all ._* files with their corresponding native files according to the rules specified with the given arguments. By default, if there is an attribute on the native file that is also present in the ._ file, the most recent attribute will be used.
If no operands are given, a usage message is output. If more than one directory is given, directories are merged in the order in which they are specified.
Because dot_clean works recursively by default, use:
dot_clean <directory>
If you want to turn off the recursive merge, use -f for a flat merge.
dot_clean -f <directory>
find . -name '._*' -delete
A bit shorter, and it performs better with extremely long lists of files.
