search and remove specific file using linux command [duplicate] - linux

This question already has answers here:
Delete files with string found in file - Linux cli
(8 answers)
Closed 5 years ago.
I am using this command to search for all files containing this word. I want to remove all files containing this word in a specific directory. The grep command works perfectly. Suggest how I can use
rm -rf
with the command below:
grep -l -r -i "Pending" . | grep -n . | wc -l

This could be done by using the -l flag and piping the filenames to xargs:
-l
(The letter ell.) Write only the names of files containing selected
lines to standard output. Pathnames are written once per file searched.
If the standard input is searched, a pathname of (standard input) will
be written, in the POSIX locale. In other locales, standard input may be
replaced by something more appropriate in those locales.
grep -l -r 'Pending' . | xargs rm
The above will delete all files in the current directory containing the word Pending.
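If any of the matched file names might contain spaces or other unusual characters, a null-delimited variant is safer. This is a minimal sketch assuming GNU grep and GNU xargs, which support -Z and -0:
grep -l -r -Z 'Pending' . | xargs -0 rm --
The -Z makes grep terminate each file name with a NUL byte, -0 tells xargs to split on NUL, and the -- stops rm from treating names beginning with a dash as options.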

How can I delete the oldest n group of files with the same prefix?

In Linux I use InfluxDB which can make a backup of the database for archival purposes. Each backup comprises a series of files with the same prefix "/tank/Backups/var/Influxdb/20191225T235655Z." and different extensions.
I wanted to write a bash script which first deletes the oldest existing backups, then creates a new one (here I paste only the removal):
ls -tp /tank/Backups/var/Influxdb/* | grep -v '/$' | sed -E 's/\..+//' | \
sort -ru | sed 's/$/.*/' | tail -n +4 | xargs -d '\n' -r rm --
However, when I run the script as "sudo", I get
rm: cannot remove '/tank/Backups/var/Influxdb/20191225T235655Z.*': No such file or directory
When I run the quoted script without the last part, I get:
/tank/Backups/var/Influxdb/20190930T215357Z.*
/tank/Backups/var/Influxdb/20190930T215352Z.*
which is correct. Also, if I manually write
sudo rm /tank/Backups/var/Influxdb/20190930T215357Z.*
the command succeeds.
Why is the script reporting an error?
I'm using Ubuntu 18.04 and the folder "/tank" is a ZFS volume.
Better do:
find /tank/Backups/var/Influxdb/* -mtime +5 -delete
to remove files older than 5 days.
Then you can run the next command (the one that creates the new backup).
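If you want to check first which files would go, you can preview the matches before deleting them (a cautious sketch; -print is find's default action, and -type f restricts the match to regular files):
find /tank/Backups/var/Influxdb/ -type f -mtime +5 -print    # dry run: list only
find /tank/Backups/var/Influxdb/ -type f -mtime +5 -delete   # actually remove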
Explaining the Error
This answer is only here to explain the error and give a deeper understanding of what is happening. If you are simply looking for an elegant solution, see the other answers.
When I run the quoted script without the last part, I get:
/tank/Backups/var/Influxdb/20190930T215357Z.*
/tank/Backups/var/Influxdb/20190930T215352Z.*
which is correct
The listed strings are not what you want. When you pass these paths to rm it sees them just as literal strings, that is, two files whose names end with a literal *. Since you don't have such files you get an error.
When you type rm * manually into your console, bash (not rm!) does the globbing. bash searches for matching files and replaces the * with the list of files found. Only after that does bash execute rm foundFile1 foundFile2 .... rm never sees the *.
Strings inside a pipeline are not processed by bash, but by the commands in the pipeline, in your case rm. rm does not glob.
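As a small illustration of that point (the path and error message are the ones from the question):
rm /tank/Backups/var/Influxdb/20191225T235655Z.*
typed at an interactive prompt works, because bash expands the * before rm runs, whereas
echo '/tank/Backups/var/Influxdb/20191225T235655Z.*' | xargs rm
fails with "rm: cannot remove '/tank/Backups/var/Influxdb/20191225T235655Z.*': No such file or directory", because the literal string travels through the pipe untouched and rm never globs it.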
You could run bash inside your pipeline and let it expand the * you inserted earlier. To this end, replace the last command in your pipeline with xargs -r bash -c 'rm -- $*' --. However, note that your paths are not quoted here; if there are spaces or a literal * in your filenames, the command will break. This lack of quoting is necessary for globbing, because a quoted "*" is not expanded by bash.
To quote your files you have to insert the * glob inside the bash command:
ls -tp /tank/Backups/var/Influxdb/* | grep -v '/$' | sed -E 's/\..+//' |
sort -ru | tail -n +4 | xargs -d\\n -L1 -r bash -c 'rm -- "$0."*'
The above command is only a simple fix for your command. It is neither elegant nor very robust; using tools like find is strongly recommended.
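For completeness, here is one rough find-based sketch that keeps the 3 newest backup sets. It is not from the original answers; it assumes GNU find and that the timestamped prefixes sort chronologically as plain strings (which the 20191225T235655Z-style format guarantees):
find /tank/Backups/var/Influxdb -maxdepth 1 -type f -printf '%f\n' |
  sed -E 's/\..+//' | sort -ru | tail -n +4 |
  while read -r prefix; do
    rm -- /tank/Backups/var/Influxdb/"$prefix".*
  done
Here the glob is expanded by the shell running the while loop, so it does not suffer from the literal-* problem described above.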

Execute script for all but certain files in directory [duplicate]

This question already has answers here:
How do I exclude a directory when using `find`?
(46 answers)
Closed 3 years ago.
I need a bash script to iterate over all files in a directory except those with specific names. Maybe it can be done with the help of awk/sed during script execution?
Here is my script, which simply merges all files in the directory into one:
#!/bin/bash
(find $DIR_NAME -name app.gz\* | sort -rV | xargs -L1 gunzip -c 2> /dev/null || :)
How can I put certain file names in a list and not iterate over them?
Put the names of the files to be excluded into a file, say "blacklist.txt", one filename per line. Then use
... | grep -v -F -f blacklist.txt | sort ...
to exclude them from the input to xargs.
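Applied to the script above, it would look something like this (blacklist.txt is just a hypothetical name; -v inverts the match so that listed names are dropped rather than kept, and -F matches them as fixed substrings of the full paths):
(find $DIR_NAME -name app.gz\* | grep -v -F -f blacklist.txt | sort -rV | xargs -L1 gunzip -c 2> /dev/null || :)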

List files that do not contain pattern [duplicate]

This question already has answers here:
How do I find files that do not contain a given string pattern?
(18 answers)
Closed 6 years ago.
In bash, how to list files that do NOT contain a given string?
Given that
grep --include=*.c -rlw './' -e "pattern"
returns any file that matches the pattern, I was expecting that
grep --include=*.c -rlwv './' -e "pattern"
would return any file that does not match the pattern, but it just returns all the *.c files regardless of whether they match the pattern.
You can use the -L option, which lists the names of files with no matching lines:
grep -L -r -i --include \*.c "pattern" ./
Note that -v does not help here: it inverts the match per line, not per file, so any file that contains at least one non-matching line is still listed, which is exactly why your -lwv attempt returned every *.c file.
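A quick way to convince yourself, with two throwaway files (hypothetical names, purely for illustration):
printf 'contains pattern\n' > with.c
printf 'nothing relevant\n' > without.c
grep -rLw --include='*.c' "pattern" .     # prints only ./without.c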

Best way to find the numeric value in UNIX file system [duplicate]

This question already has answers here:
How to find all files containing specific text (string) on Linux?
(54 answers)
Closed 8 years ago.
I need to grep for a particular port number from a huge set of files.
I am using a command:
find . |xargs grep "9461"
But it does not find all the occurrences of the number 9461.
Can anyone suggest a better Unix/Linux command to do so?
The kinds of files it finds matches in are x.log, y.txt, z.htm, a.out, etc.,
but it was not able to find matches in abc.conf files.
You surely have some reason for using find in combination with grep, but just in case:
You can replace your command by:
grep -r "9461" .
and if you also want line numbers
grep -rn "9461" .
As Jonathan Leffler commented, there is also the -e option, which makes grep match against a regular expression, so the ultimate command would be
grep -rne 9461
You should take a look at the grep man page.
A final note: check whether what you want to grep for is "9461" (with the quotes) or 9461 without them.
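If the port number can appear inside longer numbers (for example 94610), the -w flag restricts the match to whole words; this is just a suggestion on top of the answer above:
grep -rnw "9461" .   # match 9461 only as a whole word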

Clearing archive files with linux bash script

Here is my problem,
I have a folder where multiple files are stored with a specific format:
Name_of_file.TypeMM-DD-YYYY-HH:MM
where MM-DD-YYYY-HH:MM is the time of its creation. There could be multiple files with the same name but not the same time of course.
What I want is a script that can keep the 3 newest versions of each file.
So, I found one example there:
Deleting oldest files with shell
But I don't want to delete a fixed number of files; I want to keep a certain number of the newest ones. Is there a way to make that find command parse the Name_of_file and keep the 3 newest?
Here is the code I've tried so far, but it's not exactly what I need.
find /the/folder -type f -name 'Name_of_file.Type*' -mtime +3 -delete
Thanks for help!
So I decided to add my final solution in case anyone would like to use it. It's a combination of the 2 solutions given.
ls -r | grep -P "(.+)\d{4}-\d{2}-\d{2}-\d{2}:\d{2}" | awk 'NR > 3' | xargs rm
One line, super efficient. If anything changes in the pattern of the date or name, just change the grep -P pattern to match it. This way you can be sure that only the files fitting this pattern will get deleted.
Can you be extra, extra sure that the timestamp on the file is the exact same timestamp on the file name? If they're off a bit, do you care?
The ls command can sort files by timestamp order. You could do something like this:
$ ls -t | awk 'NR > 3' | xargs rm
The ls -t lists the files by modification time, newest first.
The awk 'NR > 3' prints out the list of files except for the first three lines, which are the three newest.
The xargs rm will remove the files that are older than the first three.
Now, this isn't the exact solution. There are possible problems with xargs because file names might contain weird characters or whitespace. If you can guarantee that's not the case, this should be okay.
Also, you probably want to group the files by name, and keep the last three. Hmm...
ls | sed -E 's/[0-9]{2}-[0-9]{2}-[0-9]{4}-[0-9]{2}:[0-9]{2}$//' | sort -u | while read file
do
    ls -t "$file"* | awk 'NR > 3' | xargs rm
done
The ls will list all of the files in the directory. The sed will strip the trailing MM-DD-YYYY-HH:MM date-time stamp from each name. The sort -u will make sure you only have the unique file names. Thus
file1.txt01-12-1950-10:00
file2.txt02-12-1978-14:30
file2.txt03-12-1991-09:15
Will be reduced to just:
file1.txt
file2.txt
These are fed through the loop, and the ls -t "$file"* lists all of the files that start with that name, newest first; the awk strips the newest three from the list, and the xargs rm deletes the rest, leaving only the newest three.
Assuming we're using the date in the filename to date the archive file, and that it is possible to change the date format to YYYY-MM-DD-HH:MM (as established in comments above), here's a quick and dirty shell script to keep the newest 3 versions of each file within the present working directory:
#!/bin/bash
KEEP=3  # number of versions to keep
while read FNAME; do
    NODATE=${FNAME:0:-16}  # get filename without the date (remove last 16 chars)
    if [ "$NODATE" != "$LASTSEEN" ]; then  # new file found
        FOUND=1; LASTSEEN="$NODATE"
    else  # same file, different date
        let FOUND="FOUND + 1"
        if [ $FOUND -gt $KEEP ]; then
            echo "- Deleting older file: $FNAME"
            rm "$FNAME"
        fi
    fi
done < <(\ls -r | grep -P "(.+)\d{4}-\d{2}-\d{2}-\d{2}:\d{2}")
Example run:
[me#home]$ ls
another_file.txt2011-02-11-08:05
another_file.txt2012-12-09-23:13
delete_old.sh
not_an_archive.jpg
some_file.exe2011-12-12-12:11
some_file.exe2012-01-11-23:11
some_file.exe2012-12-10-00:11
some_file.exe2013-03-01-23:11
some_file.exe2013-03-01-23:12
[me#home]$ ./delete_old.sh
- Deleting older file: some_file.exe2012-01-11-23:11
- Deleting older file: some_file.exe2011-12-12-12:11
[me#home]$ ls
another_file.txt2011-02-11-08:05
another_file.txt2012-12-09-23:13
delete_old.sh
not_an_archive.jpg
some_file.exe2012-12-10-00:11
some_file.exe2013-03-01-23:11
some_file.exe2013-03-01-23:12
Essentially, by changing the date in the file name to the form YYYY-MM-DD-HH:MM, a normal string sort (such as that done by ls) will automatically group similar files together, sorted by date-time.
The ls -r on the last line simply lists all files within the current working directory and prints the results in reverse order so newer archive files appear first.
We pass the output through grep to extract only files that are in the correct format.
The output of that command combination is then looped through (see the while loop) and we can simply start deleting after 3 occurrences of the same filename (minus the date portion).
This pipeline will get you the 3 newest files (by modification time) in the current dir
stat -c $'%Y\t%n' file* | sort -n | tail -3 | cut -f 2-
To get all but the 3 newest:
stat -c $'%Y\t%n' file* | sort -rn | tail -n +4 | cut -f 2-
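To go one step further and actually delete everything except the 3 newest, that second pipeline can be fed into xargs (a sketch assuming GNU coreutils and findutils, and file names without embedded newlines):
stat -c $'%Y\t%n' file* | sort -rn | tail -n +4 | cut -f 2- | xargs -d '\n' -r rm --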
