I wrote this simple shell script to convert JPGs using ImageMagick. It works fine, but I would like to include PNGs, GIFs, JPEGs, etc., while passing the file extension through the script for each iteration of the find. I prefer this approach of looping over a find (rather than a simple convert * command) so that I can better report on each item processed and keep the script scalable for adding other sizes and transformations to each step.
Any suggestions?
find cdn/ -name '*.jpg' -print | sort |
while read -r f; do
    b=$(basename "$f" .jpg)
    in="${b}.jpg"
    thumb="${b}_150x150.jpg"
    if [ ! -e "$thumb" ]; then
        convert -resize 150 "$in" "$thumb"
    fi
done
+1 for "Splitting up the problem into 1) finding the files, 2) deciding what to do with a file and 3) process the file. Making it modular will split the problem into parts which you can tackle separately." as previously suggested. This allow a more scalable approach for adding more processings.
This way, you don't need to pass the files by extensions. Is that what you want, passing the files by extensions? Do you have to do that?
Also,
Use -iname '*.jpg' instead of -name '*.jpg' to do a case-insensitive search.
Add more -iname tests to the same find to match all the other extensions you want. E.g.,
find cdn/ \( -iname '*.jpg' -o -iname '*.jpeg' -o -iname '*.png' -o -iname '*.gif' \) -print
I.e., you find all the files you want in one pass, instead of running find over and over for each extension.
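For instance, here is a minimal sketch (assuming bash and the same cdn/ layout as in the question; filenames containing newlines are not handled) of how the loop can derive the extension from each file it receives, so the same convert call works for every type:
# Sketch: one-pass find, deriving the extension per file
find cdn/ \( -iname '*.jpg' -o -iname '*.jpeg' -o -iname '*.png' -o -iname '*.gif' \) -print | sort |
while IFS= read -r f; do
    ext="${f##*.}"                    # extension of this file
    base="${f%.*}"                    # path without the extension
    thumb="${base}_150x150.${ext}"
    if [ ! -e "$thumb" ]; then
        convert -resize 150 "$f" "$thumb"
        echo "created $thumb"
    fi
done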
Make it more modular, call a second script to process the image files.
find /path -type f -print |
while read -r filename; do
    sh /path/to/process_image "$filename"
done
Within the process_image script you can then choose what to do based on the file's extension or file type. That script could in turn call other scripts depending on what you want to do for each image type, size, etc.
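For example, a minimal sketch of what such a process_image script might look like (the thumbnail naming and the 150px size are illustrative assumptions borrowed from the question, not requirements of this approach):
#!/bin/sh
# process_image: decide what to do with one file based on its extension (illustrative sketch)
f="$1"
case "$f" in
    *.jpg|*.jpeg|*.png|*.gif)
        thumb="${f%.*}_150x150.${f##*.}"    # e.g. photo.png -> photo_150x150.png
        if [ ! -e "$thumb" ]; then
            convert -resize 150 "$f" "$thumb" && echo "created $thumb"
        fi
        ;;
    *)
        echo "skipping $f (not an image type we handle)" >&2
        ;;
esac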
Splitting up the problem into 1) finding the files, 2) deciding what to do with a file and 3) processing the file makes it modular and splits the work into parts which you can tackle separately.
I'm writing a shell script on a Linux machine, to be run via crontab, which is meant to move all files older than the current day to a new folder and then tar and gzip the entire folder. It seems like a simple task, but for some reason I'm running into all kinds of roadblocks. I'm new to this and self-taught, so any help or redirection would be greatly appreciated.
Specific criteria for which files to archive:
All log files are in /home/tech/logs/ and all pdfs are in /home/tech/logs/pdf
All files are over a day old as indicated by the file name (file name does not include $CURRENT_DATE)
All files must be *.log or *.pdf (i.e. don't archive a file that doesn't include $CURRENT_DATE if it isn't a log or pdf file).
Filename formatting specifics:
All the log files are in /home/tech/logs in the format NAME 00_20180510.log, and all the pdf files are in a "pdf" subdirectory (/home/tech/logs/pdf) with the format NAME 00_20180510_00000000.pdf ("20180510" would be whenever the file was created and the 0's would be any number). I need to use the name rather than the file metadata for the creation date, and all files (pdf/log) whose name does not include the current date are "old". I also can't simply move every file that doesn't contain $CURRENT_DATE in its name, because that would take any non-*.pdf or non-*.log files with it.
Right now the script creates a new folder with a new pdf subdir for the old files (mkdir -p /home/tech/logs/$ARCHIVE_NAME/pdf). I then want to move the old logs into $ARCHIVE_NAME, and move all old pdfs from the original pdf subdirectory into $ARCHIVE_NAME/pdf.
Current code:
find /home/tech/logs -maxdepth 1 -name ( "*[^$CURRENT_DATE].log" "*.log" ) -exec mv -t "$ARCHIVE_NAME" '{}' ';'
find /home/tech/logs/pdf -maxdepth 1 -name ( "*[^$CURRENT_DATE]*.pdf" "*.pdf" ) -exec mv -t "$ARCHIVE_NAME/pdf" '{}' ';'
This hasn't been working because the [^...] bracket expression treats the digits of $CURRENT_DATE as a set of characters to exclude rather than as a literal string.
I've considered just using tar's exclude options like this:
tar -cvzPf "$ARCHIVE_NAME.tgz" --directory /home/tech/logs --exclude="$CURRENT_DATE" --no-unquote --recursion --remove-files --files-from="/home/tech/logs/"
But a) it doesn't work, and b) it would theoretically include all files that weren't *.pdf or *.log files, which would be a problem.
Am I overcomplicating this? Is there a better way to go about this?
I would go about this using bash's extended glob features, which allow you to negate a pattern:
#!/bin/bash
shopt -s extglob
mv /home/tech/logs/!(*"$CURRENT_DATE"*).log "$ARCHIVE_NAME"
mv /home/tech/logs/pdf/!(*"$CURRENT_DATE"*).pdf "$ARCHIVE_NAME"/pdf
With extglob enabled, !(pattern) expands to everything that doesn't match the pattern (or list of pipe-separated patterns).
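As a quick sanity check before moving anything (a small sketch, with extglob already enabled as above), you can expand the same glob with echo first:
echo /home/tech/logs/!(*"$CURRENT_DATE"*).log    # prints exactly what the first mv above would move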
Using find it should also be possible:
find /home/tech/logs -name '*.log' -not -name "*$CURRENT_DATE*" -exec mv -t "$ARCHIVE_NAME" {} +
Building on @tom-fenech's answer, optimized to avoid many mv invocations (xargs -d '\n' keeps file names containing spaces intact):
find /home/tech/logs -maxdepth 1 -name '*.log' -not -name "*_${CURRENT_DATE?}.log" |
    xargs -d '\n' mv -t "${ARCHIVE_NAME?}"
An interesting side effect of processing the files through a pipe is the ability to filter them with extra tools (such as grep), which can arguably be more readable:
find /home/tech/logs -maxdepth 1 -name '*.log' | fgrep -v "_${CURRENT_DATE?}" |
    xargs -d '\n' mv -t "${ARCHIVE_NAME?}"
Then do the same for the pdf files; a sketch is shown below. By the way, you can "dry-run" any of the above by simply replacing mv with echo mv.
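A hedged sketch of the pdf counterpart, under the same assumptions about variables and filename layout as above:
find /home/tech/logs/pdf -maxdepth 1 -name '*.pdf' | fgrep -v "_${CURRENT_DATE?}" |
    xargs -d '\n' mv -t "${ARCHIVE_NAME?}/pdf"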
--jjo
I am using the 'find' command to identify modified files. But I've noticed that my method only identifies content-modified files and new files. It does not identify files where the only change was a rename. Is there a way to use 'find' to identify renamed files? If not, is there some other linux command that can be used for this?
Here is my current method for identifying changed files going back roughly one month (this method does NOT identify renamed files):
$ touch --date "2017-09-10T16:00:00" ~/Desktop/tmp
$ find ~/Home -newer ~/Desktop/tmp -type f > modified-files
You should replace the -newer ~/Desktop/tmp test with \( -newer ~/Desktop/tmp -o -cnewer ~/Desktop/tmp \) in order to catch changes to file metadata as well; a rename updates the file's status-change time (ctime), which -cnewer compares against.
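Put together, a minimal sketch of the adjusted commands from the question:
touch --date "2017-09-10T16:00:00" ~/Desktop/tmp
# -cnewer catches metadata-only changes such as renames (on Linux they update ctime)
find ~/Home \( -newer ~/Desktop/tmp -o -cnewer ~/Desktop/tmp \) -type f > modified-files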
I am trying to figure out a command to copy files that were modified on a Saturday.
find -type f -printf '%Ta\t%p\n'
This way the line starts with the weekday.
When I combine this with an 'egrep' command using a regular expression (matching lines that start with "za"), it shows only the lines which start with "za".
find -type f -printf '%Ta\t%p\n' | egrep "^(za)"
("za" is a Dutch abbreviation for "zaterdag", which means Saturday,
This works just fine.
Now I want to copy the files with this command:
find -type f -printf '%Ta\t%p\n' -exec cp 'egrep "^(za)" *' /home/richard/test/ \;
Unfortunately it doesn't work.
Any suggestions?
The immediate problem is that -printf and -exec are independent of each other. You want to process the result of -printf to decide whether or not to actually run the -exec part. Also, of course, passing an expression in single quotes simply passes a static string, and does not evaluate the expression in any way.
The immediate fix to the evaluation problem is to use a command substitution instead of single quotes, but the problem that the -printf function's result is not available to the command substitution still remains (and anyway, the command substitution would happen before find runs, not while it runs).
A common workaround would be to pass a shell script snippet to -exec, but that still doesn't expose the -printf function to the -exec part.
find whatever -printf whatever -exec sh -c '
case $something in za*) cp "$1" "$0"; esac' "$DEST_DIR" {} \;
so we have to figure out a different way to pass the $something here.
(The above uses a cheap trick to pass the value of $DEST_DIR into the subshell so we don't have to export it. The first argument to sh -c ... ends up in $0.)
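As a quick standalone illustration of that $0 trick (not part of the solution itself):
# The first argument after the inline script lands in $0, the next in $1
sh -c 'echo "dest=$0 file=$1"' /some/dest/dir /some/file.txt
# prints: dest=/some/dest/dir file=/some/file.txt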
Here is a somewhat roundabout way to accomplish this. We create a format string which can be passed to sh for evaluation. In order to avoid pesky file names, we print the inode numbers of matching files, then pass those to a second instance of find for performing the actual copying.
find \( -false $(find -type f \
-printf 'case %Ta in za*) printf "%%s\\n" "-o -inum %i";; esac\n' |
sh) \) -exec cp -t "$DEST_DIR" {} +
Using the inode number means any file name can be processed correctly (including one containing newlines, single or double quotes, etc) but may increase running time significantly, because we need two runs of find. If you have a large directory tree, you will probably want to refactor this for your particular scenario (maybe run only in the current directory, and create a wrapper to run it in every directory you want to examine ... thinking out loud here; not sure it helps actually).
This uses features of GNU find which are not available e.g. in *BSD (including OSX). If you are not on Linux, maybe consider installing the GNU tools.
What you can do is use a shell command substitution. Something like
cp $(find -type f -printf '%Ta\t%p\n' | egrep "^(za)" | cut -f2-) "$DEST_DIR"
The cut strips off the weekday column, so what is left is just the filenames (with their relative paths); this will copy all the files that match your criteria to whatever you set $DEST_DIR to.
EDIT As mentioned in the comments, this won't work if your filenames contain spaces. If that's the case, you can do something like this:
find -type f -printf '%Ta\t%p\n' | egrep "^(za)" | while IFS=$'\t' read -r day file; do cp "$file" "$DEST_DIR"; done
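If GNU find, grep and coreutils are available, a null-delimited variant sidesteps both the weekday column and any filename-splitting issues (a sketch, using the same $DEST_DIR assumption as above):
# Keep only records whose weekday field is "za", drop that column, then copy the remaining paths
find . -type f -printf '%Ta\t%p\0' | grep -z '^za' | cut -z -f2- | xargs -0 cp -t "$DEST_DIR"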
I'm trying to rename a load of files (I count over 200) that either have the company name in the filename, or in the text contents. I basically need to change any references to "company" to "newcompany", maintaining capitalisation where applicable (i.e. "Company" becomes "Newcompany", "company" becomes "newcompany"). I need to do this recursively.
Because the name could occur pretty much anywhere I've not been able to find example code anywhere that meets my requirements. It could be any of these examples, or more:
company.jpg
company.php
company.Class.php
company.Company.php
companysomething.jpg
Hopefully you get the idea. I not only need to do this with filenames, but also the contents of text files, such as HTML and PHP scripts. I'm presuming this would be a second command, but I'm not entirely sure what.
I've searched the codebase and found nearly 2000 mentions of the company name in nearly 300 files, so I don't fancy doing it manually.
Please help! :)
bash has powerful looping and substitution capabilities:
for filename in `find /root/of/where/files/are -name '*company*'`; do
    mv "$filename" "${filename/company/newcompany}"
done
for filename in `find /root/of/where/files/are -name '*Company*'`; do
    mv "$filename" "${filename/Company/Newcompany}"
done
For the file and directory names, use for, find, mv and sed.
For each path (f) that has company in the name, rename it (mv) from f to the new name where company is replaced by newcompany.
for f in `find -name '*company*'` ; do mv "$f" "`echo "$f" | sed 's/company/newcompany/'`" ; done
For the file contents, use find, xargs and sed.
For every file, change company to newcompany in its contents, keeping the original file with a .backup extension.
find -type f -print0 | xargs -0 sed -i.backup 's/company/newcompany/g'
I'd suggest you take a look at man rename: an extremely powerful Perl utility for, well, renaming files.
Standard syntax is
rename 's/\.htm$/\.html/' *.htm
The clever part is that the tool accepts any Perl regexp as the pattern for a filename to be changed.
You might want to run it with the -n switch, which makes the tool only report what it would have changed.
I can't figure out a nice way to keep the capitalization right now, but since you can already search through the file structure, you could issue several rename runs with different capitalization until all files are changed; see the sketch below for one way to combine them.
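For instance, a hedged sketch (assuming the Perl rename, sometimes packaged as prename or perl-rename) that handles both capitalizations in a single pass; note the glob only covers the current directory, so run it per directory or feed it paths from find:
# Dry run first: -n only reports what would be renamed
rename -n 's/Company/Newcompany/g; s/company/newcompany/g' *[Cc]ompany*
# Drop -n to actually perform the renames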
To loop through all files below the current folder and search for a particular string, you can use
find . -type f -exec grep -n -i STRING_TO_SEARCH_FOR /dev/null {} \;
The output from that command can be directed to a file (after some filtering to just extract the file names of the files that need to be changed).
find . -type f ... > files_to_operate_on
Then wrap that in a while read loop and do some Perl magic for in-place replacement:
while read -r file
do
    perl -pi -e 's/stringtoreplace/replacementstring/g' "$file"
done < files_to_operate_on
There are few right ways to recursively process files. Here's one:
while IFS= read -d $'\0' -r file ; do
newfile="${file//Company/Newcompany}"
newfile="${newfile//company/newcompany}"
mv -f "$file" "$newfile"
done < <(find /basedir/ -iname '*company*' -print0)
This will work with all possible file names, not just ones without whitespace in them.
Presumes bash.
For changing the contents of files I would advise caution because a blind replacement within a file could break things if the file is not plain text. That said, sed was made for this sort of thing.
while IFS= read -d $'\0' -r file ; do
    sed -i -e 's/Company/Newcompany/g;s/company/newcompany/g' "$file"
done < <(find /basedir/ -iname '*company*' -print0)
For this run I recommend adding some additional switches to find to limit the files it will process, perhaps
find /basedir/ \( -iname '*company*' -and \( -iname '*.txt' -or -iname '*.html' \) \) -print0
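Combined with the loop above, a sketch of the restricted run would look like:
while IFS= read -d $'\0' -r file ; do
    sed -i -e 's/Company/Newcompany/g;s/company/newcompany/g' "$file"
done < <(find /basedir/ \( -iname '*company*' -and \( -iname '*.txt' -or -iname '*.html' \) \) -print0)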
cat `find . -name '*.css'`
This will print the contents of every css file. Now I want to do two things.
1) How do I add *.js to this as well? I want to look inside all css and javascript files.
2) I want to look for any css or image files referenced within those css or js files and push them into an array. So I guess: look for .png, .jpg, .gif, .tif and .css references, and capture everything from the preceding double or single quote up to the extension. I want an array because this command will go into a shell script, and once I have all the file names I need, I'll loop through the array and download those files later.
Any help would be appreciated.
Extra hackery, in case someone needs it:
find ./ -name "*.css" | xargs grep -o -h -E '[A-Za-z0-9:./_-]+\.(png|jpg|gif|tif|css)'| sed -e 's/\.\./{{url here}}/g'|xargs wget
will download every missing resource
Do the command:
find ./ -name "*.css" -or -name "*.js" > fileNames.txt
Then read each line of fileNames.txt in the loop and download them.
Or if you are going to use wget to download the images you could do:
find ./ -name "*.css" -or -name "*.js" | xargs grep '*.png' | xargs wget
May need a little refinement like a cut after the grep but you get the idea
1) simple answer: you can add the names of all .js files to your cat command, by instructing find to find more files:
cat `find . -name '*.css' -or -name '*.js'`
2) a text-searching tool such as grep is probably what you're after:
find . -name '*.css' -or -name '*.js' | xargs grep -o -h -E '[A-Za-z0-9:./_-]+\.(png|jpg|gif|tif|css)'
Note: my grep pattern isn't universal or perfect, but it's a starting example. It matches any string made up of alphanumeric, colon, dot, slash, underscore or hyphen characters, followed by one of the given extensions.
The -o option causes grep to output only the parts of the .css/.js files that match the pattern (i.e. only the apparent filenames).
If you want to download them you could add | xargs wget -v to the command, which would instruct wget to fetch all those filenames.
NOTE: this won't work for relative filenames; some other magic will be required (i.e. you'll have to resolve them with respect to the grepped file's location). Perhaps some extra hackery, such as sed or awk.
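Since the question asked for the matches to end up in an array, here is a minimal bash sketch (assuming bash 4+ for mapfile) that collects the grep output for later looping:
# Collect the matched resource paths into an array, de-duplicated
mapfile -t resources < <(
    find . -name '*.css' -or -name '*.js' |
    xargs grep -o -h -E '[A-Za-z0-9:./_-]+\.(png|jpg|gif|tif|css)' |
    sort -u
)
for r in "${resources[@]}"; do
    echo "would fetch: $r"    # swap in wget once relative paths are resolved
done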
Also: How often do you see references to TIFFs in your CSS/JS?