I would like to find all .html files in a directory, but ignore some of them.
$ find /Volumes/WORKSPACE/MyProject/ -name '*.html' ! -name 'MyFile.html'
The command above ignores only MyFile.html. However, I have a list of files to ignore and want to maintain it in a single txt file.
MyFile1.html
MyFile2.html
Above is the content of the txt file. I would like to keep the list of files to ignore in this single txt file and use it in the find command. Please advise. Thanks in advance.
You would have to build an array from the file and then use it in the find syntax, i.e.:
#!/bin/bash
arr=()
while IFS= read -r file; do
    arr+=(-name "$file" -o)
done < list_of_filenames_to_omit.txt
if ((${#arr[@]} != 0)); then
    # remove trailing -o
    unset "arr[$((${#arr[@]}-1))]"
    find /Volumes/WORKSPACE/MyProject/ -name '*.html' ! \( "${arr[@]}" \)
else
    # list_of_filenames_to_omit.txt is empty
    find /Volumes/WORKSPACE/MyProject/ -name '*.html'
fi
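A shorter, looser alternative (a sketch, not from the answer above) filters find's output through grep, reading the names as fixed-string patterns from the same file. Beware that it matches the listed names anywhere in the path, so a name that happens to be a substring of another path would also be filtered out:
# sketch: exclude any path containing a name from the list (fixed strings, inverted match)
find /Volumes/WORKSPACE/MyProject/ -name '*.html' | grep -vFf list_of_filenames_to_omit.txt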
Using a one-line bash command with GitBash on Windows, using find and cp, I am backing up a bunch of script files that exist in multiple sub-directories. I am currently backing them up to a single directory. As you can imagine, naming conflicts arise. This is easy enough to avoid with the --backup=numbered option, which creates a copy of the file. However, the problem with this is that it puts the number AFTER the file extension, naming the file like this: example.js.~2~. What I want is to preserve the file extension and name the file like this: example2.js rather than putting the number after the file extension. Is there any way to do this?
Another option would be to prepend the directory name (from the directory that it is being copied from) to the file that is being copied instead of adding a number. I would accept either of these as a solution.
Here is what I have so far:
find . -path "*node_modules*" -prune -o -type f \( -name '*.js' -or -name '*.js.map' -or -name '*.ts' -or -name '*.json' \) -printf "%h\n" -exec cp {} --backup=numbered "/c/test/" \;
Any help would be appreciated! Thank you!
What about:
#!/bin/bash
# your find command here
# loop through the files and build a new filename with the path flattened
# (slashes replaced by underscores); read line by line so spaces survive
find . -type f ..... | while IFS= read -r FILE; do
    NEW_FILENAME=$(printf "%s" "$FILE" | sed 's|/|_|g')
    cp "$FILE" "/c/test/${NEW_FILENAME}"
done
From your question, I am unsure whether a one-liner is mandatory...
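If it is, the same flattening can be done inline with find alone (a sketch reusing the /c/test/ destination from the question, shown for the *.js case only; the bash -c wrapper is the only new part):
# sketch: strip leading ./, replace remaining slashes with underscores, then copy
find . -path '*node_modules*' -prune -o -type f -name '*.js' -exec bash -c 'f=${1#./}; cp -- "$1" "/c/test/${f//\//_}"' _ {} \;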
I have a .txt file that contains a list of directories. I want to make a script that goes through this .txt file and copies anything of a certain file type from each listed directory to another directory.
I've never done this with directories, only files.
How can I edit this simple script so it reads a directory list, looks for .csv files, and copies them to another directory?
cat filenames.list | \
while read FILENAME
do
find . -name "$FILENAME" -exec cp '{}' new_dir \;
done
for DIRNAME in $(< dirname.list); do find "$DIRNAME" -type f -name "*.csv" -exec cp {} dest \; ; done
Sorry, in my first answer I didn't understand what you were asking for.
The line of code above simply takes each directory entry in your list as a path, searches it for every file ending with the ".csv" extension, and copies them to the destination you want.
But you could do with less code:
for DIRNAME in $(< dirname.list); do cp "$DIRNAME"/*.csv dest ; done
Despite the filename of the list filenames.list, let me assume the file contains the list of directory names, not filenames. Then would you please try:
while IFS= read -r dir; do
find "$dir" -type f -name "*.mp3" -exec cp -p -- {} new_dir \;
done < filenames.list
The find command searches "$dir" for files with the .mp3 extension, then copies them to new_dir.
The script above does not guard against duplicate filenames. If you want to keep the original directory tree and/or need a countermeasure for duplicate filenames, please let me know.
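For instance, GNU cp's --parents option recreates the source path under the destination, which sidesteps the name collisions (a sketch, not part of the original answer; --parents is a GNU coreutils extension and is not available in BSD/macOS cp):
while IFS= read -r dir; do
    # --parents recreates dir/sub/file.mp3 under new_dir instead of flattening
    find "$dir" -type f -name "*.mp3" -exec cp -p --parents -- {} new_dir \;
done < filenames.list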
Using find inside a while loop works, but find will run once per line of the file. An alternative is to save the list in an array; that way find can search all the directories in the list in one invocation.
If you have bash 4+, you can use mapfile:
mapfile -t directories < filenames.list
If you're stuck on bash 3:
directories=()
while IFS= read -r line; do
directories+=("$line")
done < filenames.list
Now, if you're just after one file type, like files ending in *.csv:
find "${directories[#]}" -type f -name '*.csv' -exec sh -c 'cp -v -- "$#" /newdirectory' _ {} +
If you have multiple file types to match and multiple directories to copy the files to:
while IFS= read -r -d '' file; do
case $file in
*.csv) cp -v -- "$file" /foodirectory;; ##: csv file copy to foodirectory
*.mp3) cp -v -- "$file" /bardirectory;; ##: mp3 file copy to bardirectory
*.avi) cp -v -- "$file" /bazdirectory;; ##: avi file copy to bazdirectory
esac
done < <(find "${directories[@]}" -type f -print0)
find's -print0 works with read's -d '' when dealing with file names containing white space and newlines; see How can I find and deal with file names containing newlines, spaces or both?
The -- is there so if you have a problematic filename that starts with a dash - cp will not interpret it as an option.
Given find's ability to process multiple folders, and assuming the goal is to 'flatten' all csv files into a single destination, consider the following.
Note that it assumes folder names do not contain special characters (including spaces, tabs, newlines, etc.).
As a side benefit, it will minimize the number of cp calls, making the process efficient across a large number of files/folders.
find $(<filename.list) -name '*.csv' | xargs cp -t DESTINATION/
For the more complex case, where folder names/file name can be anything (including space, '*', etc.), consider using NUL separator (-print0 and -0).
xargs -I{} -t find '{}' -name '*.csv' -print0 < filename.list | xargs -0 -I{} -t cp -t new/ '{}'
This will fork multiple find and multiple cp processes.
I'm writing a shell script on a Linux machine to be run via a crontab which is meant to move all files older than the current day to a new folder, and then tar and zip the entire folder. Seems like a simple task but for some reason, I'm running into all kinds of roadblocks. I'm new to this and self-taught so any help or redirection would be greatly appreciated.
Specific criteria for which files to archive:
All log files are in /home/tech/logs/ and all pdfs are in /home/tech/logs/pdf
All files are over a day old as indicated by the file name (file name does not include $CURRENT_DATE)
All files must be *.log or *.pdf (i.e. don't archive files that don't include $CURRENT_DATE if they aren't log or pdf files).
Filename formatting specifics:
All the log file names are in /home/tech/logs in the format NAME 00_20180510.log, and all the pdf files are in a "pdf" subdirectory (/home/tech/logs/pdf) with the format NAME 00_20180510_00000000.pdf ("20180510" would be whenever the file was created and the 0's would be any number). I need to use the name rather than the file metadata for the creation date, and all files (pdf/log) whose names do not include the current date are "old". I also can't just move all files that don't contain $CURRENT_DATE in the name, because that would take any non-*.pdf or *.log files with it.
Right now the script creates a new folder with a new pdf subdir for the old files (mkdir -p /home/tech/logs/$ARCHIVE_NAME/pdf). I then want to move the old logs into $ARCHIVE_NAME, and move all old pdfs from the original pdf subdirectory into $ARCHIVE_NAME/pdf.
Current code:
find /home/tech/logs -maxdepth 1 -name ( "*[^$CURRENT_DATE].log" "*.log" ) -exec mv -t "$ARCHIVE_NAME" '{}' ';'
find /home/tech/logs/pdf -maxdepth 1 -name ( "*[^$CURRENT_DATE]*.pdf" "*.pdf" ) -exec mv -t "$ARCHIVE_NAME/pdf" '{}' ';'
This hasn't been working because it treats the numbers in $CURRENT_DATE as a list of numbers to exclude rather than a literal string.
I've considered just using tar's exclude options like this:
tar -cvzPf "$ARCHIVE_NAME.tgz" --directory /home/tech/logs --exclude="$CURRENT_DATE" --no-unquote --recursion --remove-files --files-from="/home/tech/logs/"
But a) it doesn't work, and b) it would theoretically include all files that weren't *.pdf or *.log files, which would be a problem.
Am I overcomplicating this? Is there a better way to go about this?
I would go about this using bash's extended glob features, which allow you to negate a pattern:
#!/bin/bash
shopt -s extglob
mv /home/tech/logs/*!("$CURRENT_DATE")*.log "$ARCHIVE_NAME"
mv /home/tech/logs/pdf/*!("$CURRENT_DATE")*.pdf "$ARCHIVE_NAME"/pdf
With extglob enabled, !(pattern) expands to everything that doesn't match the pattern (or list of pipe-separated patterns).
Using find it should also be possible:
find /home/tech/logs -name '*.log' -not -name "*$CURRENT_DATE*" -exec mv -t "$ARCHIVE_NAME" {} +
Building on @Tom Fenech's answer, optimized to avoid many mv invocations:
find /home/tech/logs -maxdepth 1 -name '*.log' -not -name "*_${CURRENT_DATE?}.log" | \
xargs mv -t "${ARCHIVE_NAME?}"
An interesting side effect of processing the files through pipes is the ability to filter them with extra tools (e.g. grep), which can arguably be more readable:
find /home/tech/logs -maxdepth 1 -name '*.log' | fgrep -v "_${CURRENT_DATE?}" | \
xargs mv -t "${ARCHIVE_NAME?}"
Then do similarly for the pdf ones. BTW, you can dry-run either of the above by just replacing mv with echo mv.
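For example (a minimal sketch of the dry-run, using the first command above):
find /home/tech/logs -maxdepth 1 -name '*.log' -not -name "*_${CURRENT_DATE?}.log" | \
xargs echo mv -t "${ARCHIVE_NAME?}"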
I understand I can use find . -name ".DS_Store" to find all the .DS_Store files in the current folder and all subfolders. But how can I delete them from the command line in one go? I find it really annoying to switch back and forth between all the folders and delete them one by one.
find can do that. Just add -delete:
find . -name ".DS_Store" -delete
Extend it even further to also print their relative paths
find . -name ".DS_Store" -print -delete
For extra caution, you can exclude directories and filter only for files
find . -name ".DS_Store" -type f -delete
find . -name ".DS_Store" -print -delete
This will delete all the files named .DS_Store in the current path while also displaying their relative paths
Here is how to recursively remove the .DS_Store files:
Open up Terminal
In the command line, go to the location of the folder where all files and folders are:
cd to/your/directory
Then finally, type in the below command:
find . -name '.DS_Store' -type f -delete
Press Enter
Cheers!!
You can also use extended globbing (**):
rm -v **/.DS_Store
in zsh, bash 4 and similar shells (if not enabled, activate by: shopt -s globstar).
The best way to do this cleanly is using:
find . -type f \( -name ".DS_Store" -o -name "._.DS_Store" \) -delete -print 2>&1 | grep -v "Permission denied"
This removes the files, hides "permission denied" errors (while keeping other errors), printing out a clean list of files removed.
All the answers above work, but there is a bigger problem if you are on a Mac and staying on a Mac. The lines described do delete all the .DS_Store files, but Finder recreates them immediately because that is the default behaviour. You can read about how this all works here. To quote from there, if you are on a Mac you should remove them only if you really need to:
If you don’t have a particular reason to delete these .DS_Store files (windows sharing might be a solid reason,) it’s best to leave them “as is.” There’s no performance benefit in deleting .DS_Store files. They are harmless files that don’t usually cause any problems. Remember that the .DS_Store file saves your personalized folder settings, such as your icon arrangement and column sortings. And that’s why you normally don’t want to delete them but rather HIDE them.
If you really do, there is one more way which was not mentioned here:
sudo find / -name ".DS_Store" -depth -exec rm {} \;
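Related to the recreation issue above: macOS has a documented preference that stops Finder from writing .DS_Store files to network volumes (a commonly cited command, not part of the original answer; it takes effect after you log out and back in):
# stop Finder creating .DS_Store on network volumes (local folders are unaffected)
defaults write com.apple.desktopservices DSDontWriteNetworkStores -bool true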
Make a new file with a text editor, copy and paste the following text into it, save it with the ".sh" file extension, then open the file with Terminal. Make sure the text editor actually saves raw text, not Rich Text Format or some other format that stores additional information in the file.
#!/bin/bash
echo -e "\nDrag a folder here and press the Enter or Return keys to delete all files whose names begin with a dot in its subfolders:\n"
read -p "" FOLDER
echo -e "\nThe following files will be deleted:\n"
find "$FOLDER" -name ".*"
echo -e "\nDelete these files? (y/n): "
read -p "" DECISION
while true
do
case $DECISION in
[yY]* ) find "$FOLDER" -name ".*" -delete
echo -e "\nThe files were deleted.\n"
break;;
[nN]* ) echo -e "\nAborting without file deletion.\n"
exit;;
* ) echo -e "\nAborting without file deletion.\n"
exit;;
esac
done
This wasn't exactly the question, but if you want to zip the directory without those .DS_Store files, this works a treat...
zip -r -X archive_name.zip folder_to_compress
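To be explicit about skipping the .DS_Store files themselves (zip's -X omits extra file attributes rather than whole files), zip's -x exclusion patterns can be added; a minimal sketch:
# -x excludes any archive path ending in .DS_Store
zip -r -X archive_name.zip folder_to_compress -x '*.DS_Store'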
I'm trying to rename a load of files (I count over 200) that either have the company name in the filename or in their text contents. I basically need to change any references to "company" to "newcompany", maintaining capitalisation where applicable (i.e. "Company" becomes "Newcompany", "company" becomes "newcompany"). I need to do this recursively.
Because the name could occur pretty much anywhere I've not been able to find example code anywhere that meets my requirements. It could be any of these examples, or more:
company.jpg
company.php
company.Class.php
company.Company.php
companysomething.jpg
Hopefully you get the idea. I not only need to do this with filenames, but also with the contents of text files, such as HTML and PHP scripts. I'm presuming this would be a second command, but I'm not entirely sure what it would be.
I've searched the codebase and found nearly 2000 mentions of the company name in nearly 300 files, so I don't fancy doing it manually.
Please help! :)
bash has powerful looping and substitution capabilities:
for filename in `find /root/of/where/files/are -name '*company*'`; do
    mv "$filename" "${filename/company/newcompany}"
done
for filename in `find /root/of/where/files/are -name '*Company*'`; do
    mv "$filename" "${filename/Company/Newcompany}"
done
For the file and directory names, use for, find, mv and sed.
For each path (f) that has company in the name, rename it (mv) from f to the new name where company is replaced by newcompany.
for f in `find -name '*company*'` ; do mv "$f" "`echo "$f" | sed s/company/newcompany/`" ; done
For the file contents, use find, xargs and sed.
For every file, change company by newcompany in its content, keeping original file with extension .backup.
find -type f -print0 | xargs -0 sed -i .backup 's/company/newcompany/g'
I'd suggest you take a look at man rename, an extremely powerful perl utility for, well, renaming files.
Standard syntax is
rename 's/\.htm$/\.html/' *.htm
The clever part is that the tool accepts any perl regexp as a pattern for the filenames to be changed.
You might want to run it with the -n switch, which makes the tool only report what it would have changed.
I can't figure out a nice way to keep the capitalization right now, but since you can already search through the file structure, issue several rename commands with different capitalization until all files are changed.
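Alternatively, both capitalizations can be handled in one pass with an /e (evaluate) replacement, since this rename accepts arbitrary Perl expressions (a sketch, shown with -n so it only reports what it would do):
# the /e flag evaluates the replacement as Perl code, picking the right case
rename -n 's/(C|c)ompany/$1 eq "C" ? "Newcompany" : "newcompany"/e' *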
To loop through all files below current folder and to search for a particular string, you can use
find . -type f -exec grep -n -i STRING_TO_SEARCH_FOR /dev/null {} \;
The output from that command can be directed to a file (after some filtering to just extract the file names of the files that need to be changed).
find . -type ... > files_to_operate_on
Then wrap that in a while read loop and do some perl-magic for inplace-replacement
while read file
do
perl -pi -e 's/stringtoreplace/replacementstring/g' "$file"
done < files_to_operate_on
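For the filtering step mentioned above, grep's -l switch (print only the names of matching files) can produce that list directly; a sketch using the same placeholder search string:
# -l lists each matching file once, ready for the while read loop
find . -type f -exec grep -l STRING_TO_SEARCH_FOR {} + > files_to_operate_on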
There are only a few right ways to recursively process files. Here's one:
while IFS= read -d $'\0' -r file ; do
newfile="${file//Company/Newcompany}"
newfile="${newfile//company/newcompany}"
mv -f "$file" "$newfile"
done < <(find /basedir/ -iname '*company*' -print0)
This will work with all possible file names, not just ones without whitespace in them.
Presumes bash.
For changing the contents of files I would advise caution because a blind replacement within a file could break things if the file is not plain text. That said, sed was made for this sort of thing.
while IFS= read -d $'\0' -r file ; do
sed -i '' -e 's/Company/Newcompany/g;s/company/newcompany/g' "$file"
done < <(find /basedir/ -iname '*company*' -print0)
For this run I recommend adding some additional switches to find to limit the files it will process, perhaps
find /basedir/ \( -iname '*company*' -and \( -iname '*.txt' -or -iname '*.html' \) \) -print0