Using bash, how do I find all files containing a specific string and replace them with an existing file?

I am using Linux and would like to replace all files containing the string 000000 with an existing file /home/user/offblack.png but keep the existing filename. I've been working at this for a while with various combinations of -exec and xargs but no luck. So far I have:
find | grep 000000
Which does list all the files I want to change fine. How do I copy and replace these files with my existing offblack.png file?

Here's what I would use:
find (your find args here) \
| xargs fgrep '000000' /dev/null \
| awk -F: '{print $1}' \
| xargs -n 1 -I ORIGINAL_FILENAME /bin/echo /bin/cp /path/to/offblack.png ORIGINAL_FILENAME
Expanding, find all the files you're interested in, grep inside of them for the string '000000' (adding /dev/null to the list of files in case one of the generated fgreps ended up with only one filename - it ensures the output is always formatted as "filename: <line containing '000000'>"), strip out only the filenames, then one-by-one, copy in offblack.png over those files. Note that I inserted a /bin/echo in there. That's your dry-run. Remove the echo to get it to run for real.
If what you mean is that the filenames contain "000000":
find . -type f -a -name '*000000*' -exec /bin/echo /bin/cp /path/to/offblack.png {} \;
Much simpler. :-) Find every file under the current directory with a name containing your string and exec the copy of offblack.png over it. Again, what I've given you there is a dry-run. Remove the echo for your live fire drill. :-)
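The content-search variant above can also be sketched without parsing grep's "filename: line" output. This is only a sketch, assuming GNU grep and GNU xargs; replace_matching is a hypothetical helper name, not something from the answer.

```bash
# replace_matching STRING REPLACEMENT [DIR]
# Hypothetical helper: overwrite every file under DIR whose contents
# contain STRING with a copy of REPLACEMENT, keeping the original name.
replace_matching() {
    local string=$1 replacement=$2 dir=${3:-.}
    # -r recurse, -l print only names, -F fixed string, -Z NUL-terminate names
    grep -rlFZ -- "$string" "$dir" |
        # -0 read NUL-separated names, -r do nothing on empty input,
        # -n 1 run one cp per matching file
        xargs -0 -r -n 1 cp -- "$replacement"
}
```

Put echo in front of cp first if you want the same kind of dry run as above.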

find . -type f | grep 000000 | tr \\n \\0 | xargs -0i+ cp ~/offblack.png "+"

Let's try and use Bash a bit more:
while IFS= read -r filename
do
hit=""
while IFS= read -r
do
if [[ $REPLY == *000000* ]]
then
hit=$filename
break
fi
done < "$filename"
[[ -n $hit ]] && cp /path/offblack.png "$filename"
done < <(find . -type f)
Fewer man pages to search!

Related

LINUX Copy the name of the newest folder and paste it in a command [duplicate]

I would like to find the newest sub directory in a directory and save the result to variable in bash.
Something like this:
ls -t /backups | head -1 > $BACKUPDIR
Can anyone help?

BACKUPDIR=$(ls -td /backups/*/ | head -1)
$(...) evaluates the statement in a subshell and returns the output.

There is a simple solution to this using only ls:
BACKUPDIR=$(ls -td /backups/*/ | head -1)
-t orders by time (latest first)
-d lists the directories themselves instead of their contents
*/ only lists directories
head -1 returns the first item
I didn't know about */ until I found Listing only directories using ls in bash: An examination.

This is a pure Bash solution:
topdir=/backups
BACKUPDIR=
# Handle subdirectories beginning with '.', and empty $topdir
shopt -s dotglob nullglob
for file in "$topdir"/* ; do
[[ -L $file || ! -d $file ]] && continue
[[ -z $BACKUPDIR || $file -nt $BACKUPDIR ]] && BACKUPDIR=$file
done
printf 'BACKUPDIR=%q\n' "$BACKUPDIR"
It skips symlinks, including symlinks to directories, which may or may not be the right thing to do. It skips other non-directories. It handles directories whose names contain any characters, including newlines and leading dots.

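Well, I think this solution is the most efficient, as long as the directory names sort chronologically (date-stamped names, for example), because the glob expands in alphabetical order and find does not sort by modification time: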
path="/my/dir/structure/*"
backupdir=$(find $path -type d -prune | tail -n 1)
Explanation why this is a little better:
We do not need sub-shells (aside from the one for getting the result into the bash variable).
We do not need a useless -exec ls -d at the end of the find command, it already prints the directory listing.
We can easily alter this, e.g. to exclude certain patterns. For example, if you want the second newest directory, because backup files are first written to a tmp dir in the same path:
backupdir=$(find $path -type d -prune -not -name "*temp_dir" | tail -n 1)

The above solution doesn't take into account things like files being written and removed from the directory resulting in the upper directory being returned instead of the newest subdirectory.
The other issue is that this solution assumes that the directory only contains other directories and not files being written.
Let's say I create a file called "test.txt" and then run this command again:
echo "test" > test.txt
ls -t /backups | head -1
test.txt
The result is test.txt showing up instead of the last modified directory.
The proposed solution "works" but only in the best case scenario.
Assuming you have a maximum of 1 directory depth, a better solution is to use:
find /backups/* -type d -prune -exec ls -d {} \; |tail -1
Just swap the "/backups/" portion for your actual path.
If you want to avoid showing an absolute path in a bash script, you could always use something like this:
LOCALPATH=/backups
DIRECTORY=$(cd $LOCALPATH; find * -type d -prune -exec ls -d {} \; |tail -1)

With GNU find you can get list of directories with modification timestamps, sort that list and output the newest:
find . -mindepth 1 -maxdepth 1 -type d -printf "%T@\t%p\0" | sort -z -n | cut -z -f2- | tail -z -n1
or newline separated
find . -mindepth 1 -maxdepth 1 -type d -printf "%T@\t%p\n" | sort -n | cut -f2- | tail -n1
With POSIX find (that does not have -printf) you may, if you have it, run stat to get file modification timestamp:
find . -mindepth 1 -maxdepth 1 -type d -exec stat -c '%Y %n' {} \; | sort -n | cut -d' ' -f2- | tail -n1
Without stat a pure shell solution may be used by replacing [[ bash extension with [ as in this answer.
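Wrapped up as a function, the GNU find approach could look like the sketch below. newest_subdir is a hypothetical name, and the sketch assumes GNU find and directory names without newlines.

```bash
# newest_subdir [TOPDIR]
# Hypothetical helper: print the most recently modified direct
# subdirectory of TOPDIR (assumes GNU find, no newlines in names).
newest_subdir() {
    local top=${1:-.}
    # %T@ is the modification time in seconds since the epoch
    find "$top" -mindepth 1 -maxdepth 1 -type d -printf '%T@\t%p\n' |
        sort -n |      # oldest first, newest last
        tail -n 1 |    # keep only the newest entry
        cut -f2-       # drop the timestamp column
}
```

Usage would then be BACKUPDIR=$(newest_subdir /backups).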

Your "something like this" was almost a hit:
BACKUPDIR=$(ls -t ./backups | head -1)
Combining what you wrote with what I have learned solved my problem too. Thank you for raising this question.
Note: I run the line above from GitBash within Windows environment in file called ./something.bash.

How to grep files that has different letters?

I have thousands of files in a directory with names such as abc.txt, srr.txt, eek.txt, abb.txt and so on. I want to grep only those files whose last two letters are different. Example:
Good output: abc.txt eek.txt
Bad output: ekk.txt dee.txt.
Here is what I am trying to do:
#!/bin/bash
ls -l directory |grep .txt
It greps every file that has .txt in it.
How do I grep files that has two different last letters?

I'd go with find to list the *.txt files, and grep to filter out the ones that have the last two letters the same (using a backreference):
find . -type f -name '*.txt' | grep -v '\(.\)\1\.txt$'
It essentially picks up a character then immediately tries to back-reference it before .txt, and -v provides a reverse match leaving only files that do not have the same last two characters.
UPDATE: To move the found files you can chain mv to the command:
find . -type f -name '*.txt' | grep -v '\(.\)\1\.txt$' | xargs -i -t mv {} DESTINATION
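To see what the backreference filter keeps, you can feed it a few names by hand. A small sketch (filter_distinct_tail is just a hypothetical wrapper name around the same grep):

```bash
# filter_distinct_tail: read names on stdin and drop those whose last two
# characters before ".txt" are identical (hypothetical wrapper name).
filter_distinct_tail() {
    # \(.\) captures one character, \1 requires the same character again
    grep -v '\(.\)\1\.txt$'
}

# Sample names taken from the question
printf '%s\n' abc.txt eek.txt ekk.txt dee.txt | filter_distinct_tail
# prints:
# abc.txt
# eek.txt
```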

It's not a good idea to parse the result of ls (read this doc to understand why). Here is what you could do in pure Bash, without using any external commands:
#!/bin/bash
shopt -s nullglob # make sure glob yields nothing if there are no matches
for file in *.txt; do # grab all .txt files
[[ -f $file ]] || continue # skip if not a regular file
last6="${file: -6}" # get the last 6 characters of file name, e.g. "bc.txt"
[[ "${last6:0:1}" != "${last6:1:1}" ]] && printf '%s\n' "$file" # print the file if the two letters before .txt differ
# change printf to mv "$file" "$target_dir" above if you want to move the files
done
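The comparison itself can be pulled out into a small test function. This is only a sketch; last_two_differ is a hypothetical name and it strips the suffix instead of counting characters from the end.

```bash
# last_two_differ NAME
# Hypothetical helper: succeed if the two characters before ".txt" differ.
last_two_differ() {
    local stem=${1%.txt}              # name without the .txt suffix
    (( ${#stem} >= 2 )) || return 1   # need at least two characters
    # ${stem: -2:1} is the second-to-last character, ${stem: -1} the last
    [[ ${stem: -2:1} != "${stem: -1}" ]]
}
```

In the loop you would then write: last_two_differ "$file" && printf '%s\n' "$file"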

I seem to have accomplished what I wanted by using this:
ls -l |awk '{print $9}' | grep -vE "(.).?\1.?\."
awk '{print $9}' prints only the ninth field of each ls -l line, which is the file name
grep -vE '(.).?\1.?\.' filters out any name in which the three characters before the period are not all different: aaa.txt, aab.txt, aba.txt and baa.txt are all filtered.

How to use sed to change file extensions?

I have to do a sed line (also using pipes in Linux) to change a file extension, so I can do some kind of mv *.1stextension *.2ndextension like mv *.txt *.c. The thing is that I can't use batch or a for loop, so I have to do it all with pipes and sed command.

you can use string manipulation
filename="file.ext1"
mv "${filename}" "${filename/%ext1/ext2}"
Or, if your system has it, you can use rename.
Update
You can also do something like this:
mv "${filename%ext1}"{ext1,ext2}
which uses brace expansion: the braces turn the word into two arguments, file.ext1 and file.ext2.

sed is for manipulating the contents of files, not the filename itself. My suggestion:
rename 's/\.ext/\.newext/' ./*.ext
Or, there's this existing question which should help.

This may work:
find . -name "*.txt" |
sed -e 's|./||g' |
awk '{print "mv",$1, $1"c"}' |
sed -e "s|\.txtc|\.c|g" > table;
chmod u+x table;
./table
I don't know why you can't use a loop. It makes life much easier :
newex="c"; # Give your new extension
for file in *.*; # You can replace with *.txt instead of *.*
do
ex="${file##*.}"; # This retrieves the file extension
ne=$(echo "$file" | sed -e "s|$ex|$newex|g"); # Replaces current with the new one
echo "$ex";echo "$ne";
mv "$file" "$ne";
done

You can use find to find all of the files and then pipe that into a while read loop:
find . -name "*.ext1" -print0 | while IFS= read -r -d $'\0' file
do
mv "$file" "${file%.*}.ext2"
done
The ${file%.*} is the small right pattern filter. The % marks the pattern to remove from the right side (matching the smallest glob pattern possible), The .* is the pattern (the last . followed by the characters after the .).
The -print0 will separate file names with the NUL character instead of \n. The -d $'\0' will read in file names separated by the NUL character. This way, file names with spaces, tabs, \n, or other wacky characters will be processed correctly.
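Here is the same idea as a reusable function, as a rough sketch. change_ext and its three parameters are hypothetical names; it assumes bash and a find that supports -print0.

```bash
# change_ext DIR OLD NEW
# Hypothetical helper: rename every *.OLD file under DIR to *.NEW.
change_ext() {
    local dir=$1 old=$2 new=$3 file
    find "$dir" -type f -name "*.$old" -print0 |
        # IFS= keeps leading/trailing blanks, -r keeps backslashes,
        # -d '' splits on the NUL bytes written by -print0
        while IFS= read -r -d '' file; do
            mv -- "$file" "${file%.*}.$new"
        done
}
```

For example, change_ext . txt c renames a b.txt to a b.c, spaces included.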

You may try following options
Option 1 find along with rename
find . -type f -name "*.ext1" -exec rename -f 's/\.ext1$/.ext2/' {} \;
Option 2 find along with mv
find . -type f -name "*.ext1" -exec sh -c 'mv -f "$0" "${0%.ext1}.ext2"' {} \;
Note: rename is not the same program on every system. The Perl-expression syntax above works only with the Perl version of rename, not with the util-linux one.

Another solution only with sed and sh
printf "%s\n" *.ext1 |
sed "s/'/'\\\\''/g"';s/\(.*\)'ext1'/mv '\''\1'ext1\'' '\''\1'ext2\''/g' |
sh
for better performance: only one process created
perl -le '($e,$f)=@ARGV;map{$o=$_;s/$e$/$f/;rename$o,$_}<*.$e>' ext2 ext3

well this should work
mv $file $(echo $file | sed -E -e 's/.xml.bak.*/.xml/g' | sed -E -e 's/.\///g')
output
abc.xml.bak.foobar -> abc.xml

How to recursively rename files and folder with iconv from Bash

I have been trying to recursively rename files AND folders with iconv without success: the files are renamed correctly but the folders are not.
What I use for files is (works perfect):
find . -name * -depth \ -exec bash -c 'mv "$1" "${1%/*}/$(iconv -f UTF8 -t ASCII//TRANSLIT <<< ${1##*/})"' -- {} \;
What I tried for files AND folders (fail: Only rename folders):
find . -exec bash -c 'mv "$1" "$(iconv -f UTF8 -t ASCII//TRANSLIT <<< $1)"' -- {} \;
ORIGINAL problem:
I just want to bulk rename lots of files to make them "web friendly", things like removing spaces, weird characters and so on. Currently I have
find . -name '*' -depth \
| while read f ;
do
mv -i "$f" "$(dirname "$f")/$(basename "$f"|tr -s ' ' _|tr -d "'"|tr -d ","|tr - _|tr "&" "y"|tr "#" "a")" ;
done
Is there any way to do the tr stuff above and the iconv at a single run? because I am talking around 300,000 files to rename, I would like to avoid a second search if possible.
If needed, I am working with Bash 4.2.24
Thanks in advance.

I think the following does everything you want in one pass.
# read -d '' splits on the NUL bytes that -print0 emits
find . -print0 | while IFS= read -r -d '' f ;
do
orig_f="$f"
# Below is pure bash. You can replace with tr if you like
# f="$( echo "$f" | tr -d ",'" | tr '&@ -' 'ya__' )"
f="${f// /_}" # Replace spaces with _
f="${f//\'}" # Remove single quote
f="${f//-/_}" # Replace - with _
f="${f//,}" # Remove commas
f="${f//&/y}" # Replace ampersand with y
f="${f//@/a}" # Replace at sign with a
f=$( iconv -f UTF8 -t ASCII//TRANSLIT <<< "$f" )
new_dir="$(dirname "$f")"
new_f="$(basename "$f")"
mkdir -p "$new_dir"
mv -i "$orig_f" "$new_dir/$new_f"
done
The find command (no real options needed, other than -print0 to handle filenames with spaces) will send null-separated file names to the while loop (and someone will correct my errors there, no doubt). A long list of assignments utilizing parameter expansion removes/replaces various characters; I include what I think is the equivalent pipeline using tr as a comment. Then we run the filename through iconv to deal with character set issues. Finally, we split the name into its path and filename components, since we may have to make a new directory before executing the mv.
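If you want to try the character replacements on their own before touching any files, they can be sketched as a function. sanitize_name is a hypothetical name; the sketch covers only the parameter-expansion step and leaves iconv out, and it assumes the at sign is the character meant by "Replace at sign with a".

```bash
# sanitize_name NAME
# Hypothetical helper: apply the character replacements to one name
# and print the result (the iconv step is not included here).
sanitize_name() {
    local name=$1
    name=${name// /_}    # spaces -> _
    name=${name//\'/}    # drop single quotes
    name=${name//-/_}    # - -> _
    name=${name//,/}     # drop commas
    name=${name//&/y}    # & -> y
    name=${name//@/a}    # @ -> a (assumed to be the intended character)
    printf '%s\n' "$name"
}

sanitize_name "Tom & Jerry's, part-1 @home"
# prints: Tom_y_Jerrys_part_1_ahome
```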

Here is an update I offer after chepner's answer to avoid nesting bugs. Reverse the output of find with tac to act on folders content before the folders themselves. This way, there is no need to mkdir anymore:
echo "renaming:"
find . -print0 | tac -s '' | while IFS= read -d '' f ;
do
Odir=$(dirname "$f") # original location
Ofile=$(basename "$f") # original filename
newFile=$Ofile
# remove unwanted characters
newFile=$(printf '%s\n' "$newFile" | tr -d ",'\"?()[]{}\\!")
newFile="${newFile// /_}" # Replace spaces with _
newFile="${newFile//&/n}" # Replace ampersand with n
newFile="${newFile//@/a}" # Replace at sign with a
newFile=$( iconv -f UTF8 -t ASCII//TRANSLIT <<< "$newFile" )
if [[ "$Ofile" != "$newFile" ]]; then # act if something has changed
echo "$Odir/$Ofile to"
echo "$Odir/$newFile"
mv -i "$Odir/$Ofile" "$Odir/$newFile"
echo ""
fi
done
echo "done."
Enjoy ;)
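The "contents before the folder" ordering can also come from find itself through -depth. A minimal sketch, assuming bash and a find with -mindepth and -print0; rename_spaces_depth_first is a hypothetical name and it only replaces spaces, to keep the example short.

```bash
# rename_spaces_depth_first TOPDIR
# Hypothetical helper: replace spaces with _ in every file and folder
# name under TOPDIR, renaming a folder's contents before the folder.
rename_spaces_depth_first() {
    local top=$1 path dir base new
    # -depth lists a directory's entries before the directory itself
    find "$top" -mindepth 1 -depth -print0 |
        while IFS= read -r -d '' path; do
            dir=${path%/*}      # parent directory (not renamed yet)
            base=${path##*/}    # last path component
            new=${base// /_}
            if [[ $base != "$new" ]]; then
                mv -- "$path" "$dir/$new"
            fi
        done
}
```

Because only the last path component changes on each step, the parent path read from find is still valid when mv runs.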

Shell script to count files, then remove oldest files

I am new to shell scripting, so I need some help here. I have a directory that fills up with backups. If I have more than 10 backup files, I would like to remove the oldest files, so that the 10 newest backup files are the only ones that are left.
So far, I know how to count the files, which seems easy enough, but how do I then remove the oldest files, if the count is over 10?
if [ls /backups | wc -l > 10]
then
echo "More than 10"
fi

Try this:
ls -t | sed -e '1,10d' | xargs -d '\n' rm
This should handle all characters (except newlines) in a file name.
What's going on here?
ls -t lists all files in the current directory in decreasing order of modification time. Ie, the most recently modified files are first, one file name per line.
sed -e '1,10d' deletes the first 10 lines, ie, the 10 newest files. I use this instead of tail because I can never remember whether I need tail -n +10 or tail -n +11.
xargs -d '\n' rm collects each input line (without the terminating newline) and passes each line as an argument to rm.
As with anything of this sort, please experiment in a safe place.
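If newlines in file names are a concern, the same pipeline can be sketched with NUL separators. This assumes GNU find and a GNU coreutils recent enough that sort, tail and cut accept -z; prune_old is a hypothetical name.

```bash
# prune_old DIR [KEEP]
# Hypothetical helper: delete all but the KEEP newest regular files
# directly inside DIR (default 10).
prune_old() {
    local dir=$1 keep=${2:-10}
    # print "mtime path" records, each terminated by a NUL byte
    find "$dir" -maxdepth 1 -type f -printf '%T@ %p\0' |
        sort -z -rn |                       # newest first
        tail -z -n +"$((keep + 1))" |       # skip the KEEP newest records
        cut -z -d ' ' -f 2- |               # strip the timestamp
        xargs -0 -r rm -f --
}
```

Swap rm -f for echo while you experiment.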

find is the common tool for this kind of task :
find ./my_dir -mtime +10 -type f -delete
EXPLANATIONS
./my_dir your directory (replace with your own)
-mtime +10 older than 10 days
-type f only files
-delete no surprise. Remove it to test your find filter before executing the whole command
And take care that ./my_dir exists to avoid bad surprises !
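A dry run of that filter might look like this sketch; list_older_than is a hypothetical name and it only prints, it never deletes.

```bash
# list_older_than DIR DAYS
# Hypothetical helper: print the regular files under DIR that were last
# modified more than DAYS days ago, without removing anything.
list_older_than() {
    local dir=$1 days=$2
    find "$dir" -type f -mtime +"$days" -print
}
```

Once the list looks right, the same find arguments with -delete in place of -print do the removal.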

Make sure your pwd is the correct directory to delete the files, then (assuming only regular characters in the file names):
ls -A1t | tail -n +11 | xargs rm
This keeps the newest 10 files. I use this with the camera program 'motion' to keep the most recent frame grab files. Thanks to all preceding answers, because you showed me how to do it.

The proper way to do this type of thing is with logrotate.

I like the answers from @Dennis Williamson and @Dale Hagglund. (+1 to each)
Here's another way to do it using find (with the -newer test) that is similar to what you started with.
This was done in bash on cygwin...
if [[ $(ls /backups | wc -l) -gt 10 ]]
then
find /backups ! -newer $(ls -t | sed '11!d') -exec rm {} \;
fi
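One thing to watch with the count test: inside [[ ]] the > operator compares strings, so [[ 9 > 10 ]] is true. A numeric check can be sketched with an arithmetic test instead; count_exceeds is a hypothetical name, and it counts entries with find rather than ls.

```bash
# count_exceeds DIR LIMIT
# Hypothetical helper: succeed if DIR holds more than LIMIT entries.
count_exceeds() {
    local dir=$1 limit=$2 count
    # one output line per direct entry (assumes no newlines in names)
    count=$(find "$dir" -mindepth 1 -maxdepth 1 | wc -l)
    (( count > limit ))   # arithmetic context: numeric comparison
}
```

Used as: if count_exceeds /backups 10; then ... fi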

Straightforward file counter:
max=12
n=0
ls -1t *.dat |
while read file; do
n=$((n+1))
if [[ $n -gt $max ]]; then
rm -f "$file"
fi
done

I just found this topic, and the solution from mikecolley helped me as a first step. As I needed a single-line solution for a homematic (raspberrymatic) script, I ran into the problem that this command only gave me the file names and not the whole path, which is needed for rm. The CUxD Exec command I use cannot start in a selected folder.
So here is my solution:
ls -A1t $(find /media/usb0/backup/ -type f -name 'homematic-raspi*.sbk') | tail -n +11 | xargs rm
Explaining:
find /media/usb0/backup/ -type f -name 'homematic-raspi*.sbk' searches the folder /media/usb0/backup/ for regular files only (-type f) whose names match homematic-raspi*.sbk (-name is case sensitive; use -iname for a case-insensitive match)
ls -A1t $(...) lists the files given by find, sorted by mtime with the newest first (-t), one per line (-1); -A leaves out the "." and ".." entries
tail -n +11 prints everything from line 11 onward, that is, all files except the 10 newest, as input for the following rm
xargs rm finally removes the remaining files in the list
Maybe this helps others from longer searching and makes the solution more flexible.

stat -c "%Y %n" * | sort -n | head -n -10 | \
cut -d ' ' -f 1 --complement | xargs -d '\n' rm
Breakdown: Get last-modified times for each file (in the format "time filename"), sort them from oldest to newest, keep all but the last ten entries, keep all but the first field (that is, only the filename portion), and pass the resulting names to rm.
Edit: Using cut instead of awk since the latter is not always available
Edit 2: Now handles filenames with spaces

On a very limited chroot environment, we had only a couple of programs available to achieve what was initially asked. We solved it that way:
MIN_FILES=5
FILE_COUNT=$(ls -l | grep -c ^d )
if [ $MIN_FILES -lt $FILE_COUNT ]; then
while [ $MIN_FILES -lt $FILE_COUNT ]; do
FILE_COUNT=$[$FILE_COUNT-1]
FILE_TO_DEL=$(ls -t | tail -n1)
# be careful with this one
rm -rf "$FILE_TO_DEL"
done
fi
Explanation:
FILE_COUNT=$(ls -l | grep -c ^d ) counts the directories in the current folder (the lines of ls -l output that start with d). Instead of grep we could also use wc -l, but wc was not installed on that host.
FILE_COUNT=$[$FILE_COUNT-1] update the current $FILE_COUNT
FILE_TO_DEL=$(ls -t | tail -n1) Save the oldest file name in the $FILE_TO_DEL variable. tail -n1 returns the last element in the list.

Based on others' suggestions and some awk foo, I got this to work. I know this is an old thread, but I didn't find a decent answer here and this sorted it for me. This just deletes the oldest file, but you can change the head -n 1 to 10 and get the oldest 10.
find "$DIR" -type f -printf '%T+ %p\n' | sort | head -n 1 | awk '{sub(/^[^ ]+ /, ""); print}' | xargs -d '\n' rm

Using inode numbers via stat & find command (to avoid pesky-chars-in-file-name issues):
stat -f "%m %i" * | sort -rn -k 1,1 | tail -n +11 | cut -d " " -f 2 | \
xargs -n 1 -I '{}' find "$(pwd)" -type f -inum '{}' -print
#stat -f "%m %i" * | sort -rn -k 1,1 | tail -n +11 | cut -d " " -f 2 | \
# xargs -n 1 -I '{}' find "$(pwd)" -type f -inum '{}' -delete
