How to use the find command to find the directory of a filename and remove duplicates? - linux

I'm using find / -name "*.dbf" to find all .dbf files.
It gives me the directories together with the filenames.
The output should be only the directories, with no duplicates; I don't need to see the filenames.

You can pipe the result through dirname and then remove duplicates like this:
find / -name \*.dbf -print0 | xargs -0 -n1 dirname | sort | uniq

Another solution: find / -name "*.dbf" -exec dirname {} \; 2> /dev/null | sort -u
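With GNU find you can skip dirname entirely: the -printf '%h\n' directive prints just the leading directory of each match, avoiding one dirname process per file. A minimal sketch on a throwaway tree (the /tmp/dbfdemo paths are invented for illustration):

```shell
# Build a small throwaway tree with a few .dbf files
mkdir -p /tmp/dbfdemo/a /tmp/dbfdemo/b
touch /tmp/dbfdemo/a/x.dbf /tmp/dbfdemo/a/y.dbf /tmp/dbfdemo/b/z.dbf

# %h prints the directory part of each match; sort -u drops duplicates
find /tmp/dbfdemo -name '*.dbf' -printf '%h\n' | sort -u
# → /tmp/dbfdemo/a and /tmp/dbfdemo/b, one line each
```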

I can read your question in two ways:
To find only directories matching <name_pattern>, with no duplicates, you can use find's -type option piped into sort | uniq:
find / -name '<name_pattern>' -type d | sort | uniq
To find all matching files, but return only the directories that contain them, with no duplicates:
find / -name '<name_pattern>' | perl -pe 's/(.*\/).*$/$1/' | sort | uniq

Related

delete all files except a pattern list file

I need to delete all the files in the current directory except a list of patterns that are described in a whitelist file (delete_whitelist.txt) like this:
(.*)dir1(/)?
(.*)dir2(/)?
(.*)dir2/ser1(/)?(.*)
(.*)dir2/ser2(/)?(.*)
(.*)dir2/ser3(/)?(.*)
(.*)dir2/ser4(/)?(.*)
(.*)dir2/ser5(/)?(.*)
How can I perform this in one bash line?
Any bash script can fit on one line:
find . -type f -print0 | grep -EzZvf delete_whitelist.txt | xargs -0 printf '%s\n'
Check the output and then, if it's OK:
find . -type f -print0 | grep -EzZvf delete_whitelist.txt | xargs -0 rm
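To sanity-check the whitelist approach before deleting anything, you can rehearse it on a throwaway tree first (the /tmp/wldemo paths and the two patterns below are invented for illustration, not from the original question):

```shell
mkdir -p /tmp/wldemo/dir1 /tmp/wldemo/dir2/ser1
cd /tmp/wldemo
touch dir1/keep.txt dir2/ser1/keep.txt junk.txt

# Whitelist of regexes for paths that must survive
printf '%s\n' '(.*)dir1(/)?(.*)' '(.*)dir2(/)?(.*)' > delete_whitelist.txt

# -z/-Z keep the list NUL-separated end to end, so unusual filenames survive
find . -type f -print0 | grep -EzZvf delete_whitelist.txt | xargs -0 printf '%s\n'
```

Only junk.txt (and delete_whitelist.txt itself, which no pattern matches) should be listed as deletion candidates; the files under dir1 and dir2 are filtered out.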

Bash command to find files that have the same name but not the same extension

I want to find files that do not co-exist with another file extension, i.e. all the .c files that don't have a corresponding .o file.
I tried find $HOME \( -name '*.c' ! -a -name '*.o' \) but it doesn't work.
You can do the following:
Find the names of all files
Strip the trailing extension, if any (assuming a dot is only used before the extension)
Sort to group duplicates
List only the duplicates
The remaining lines are names that occur with more than one extension
find yourdirectory -type f | sed 's#\..*##' | sort | uniq -d
If you are only interested in extensions .c and .o, then confine the find accordingly.
find yourdirectory -type f \( -name '*.c' -o -name '*.o' \) | sed 's#\..*##' | sort | uniq -d
As it turns out, what you actually wanted to know (and what should have been your question from the very beginning) is: "How do I find .c files that have no .o file?"
find yourdir -name '*.c' | sed 's#\.c$##' | sort > c-files
find yourdir -name '*.o' | sed 's#\.o$##' | sort > o-files
diff c-files o-files | grep '^<'
The final grep keeps the lines that appear only in the left file (c-files).
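An alternative to diff | grep for the last step is comm, which compares two sorted files directly; comm -23 prints only the lines unique to the first file, without diff's "< " prefix. A sketch with made-up filenames in a throwaway directory:

```shell
mkdir -p /tmp/cdemo && cd /tmp/cdemo
touch main.c main.o util.c        # util.c deliberately has no util.o

find . -name '*.c' | sed 's#\.c$##' | sort > c-files
find . -name '*.o' | sed 's#\.o$##' | sort > o-files

# -2 suppresses lines unique to o-files, -3 suppresses common lines
comm -23 c-files o-files          # → ./util
```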

find oldest accessed files

How can I find, say, the 100 oldest-accessed files? I've tried the following, but it just prints files in what seems like random order.
find /home/you -iname "*.pdf" -atime -100000 -type f | tail -n100
find /home/you -iname '*.pdf' -printf '%A@ %p\n' | sort -n | head -n 100
You could use the stat command:
stat -c '%X %n' *.pdf | sort -n | head -n100
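You can check the %A@ approach against files with known access times by back-dating one of them with touch -a (the /tmp/atdemo filenames are invented; assumes GNU find):

```shell
mkdir -p /tmp/atdemo
touch /tmp/atdemo/new.pdf /tmp/atdemo/old.pdf
touch -a -t 200001010000 /tmp/atdemo/old.pdf   # set atime to Jan 1 2000

# %A@ is the access time in seconds since the epoch; oldest sorts first
find /tmp/atdemo -iname '*.pdf' -printf '%A@ %p\n' | sort -n | head -n 1
# → the old.pdf line
```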

Script to delete all folders except the two most recently modified?

I need to write a recursive script to delete all folders named like 'date-2012-01-01_12_30' in a subfolder, but leave the two latest.
Under /var/www/temp/updates/ there are hundreds of folders, organized by 'date' and by 'code',
e.g.
/var/www/temp/updates/2012-01-01/temp1/date-2012-01-_12_30
/var/www/temp/updates/2012-01-01/temp1/date-2012-02-_13_30
/var/www/temp/updates/2012-01-01/temp1/date-2013-11-_12_30
/var/www/temp/updates/2012-01-01/temp2/date-2012-01-_12_30
I was thinking about using find to get the folders, but I'm unsure how to decide which ones can be deleted, as the script will have to know how many date- folders are in each subfolder and which ones are the latest.
Hmm, any help would be great?
Code:
PATTERN=/var/www/temp/updates/*/*
find $PATTERN -type d -name "date-*" -printf '%T@ %p\n' | sort -n | head -n -2 | cut -d' ' -f2- | xargs ls -ld
The script will need to go through thousands of different folders and keep the two most recent folders - Someone on here helped before but I haven't changed it for the thousands of folders to search through
Can you try this script:
PATH1=/var/www/temp/updates
find "$PATH1" -type d -name 'date-*' -printf '%T@ %p\n' | sort -n | head -n -2 | cut -d' ' -f2- | xargs -r rm -rf
thanx
Actually I think the script will work fine, as the find will go through all the folders under /updates/:
PATTERN=/var/www/temp/updates/*/*
find $PATTERN -type d -name "date-*" -printf '%T@ %p\n' | sort -n | head -n -2 | cut -d' ' -f2- | xargs -r rm -rf
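The keep-the-newest-two idea can be rehearsed end to end on an invented tree before pointing it at real data. This sketch assumes GNU find/head/xargs and directory names without spaces or newlines (the /tmp/keepdemo paths are made up):

```shell
base=/tmp/keepdemo/temp1
mkdir -p "$base"/date-2012-01-01_12_30 "$base"/date-2012-02-01_13_30 "$base"/date-2013-11-01_12_30

# Give each directory a distinct, known mtime
touch -t 201201011230 "$base"/date-2012-01-01_12_30
touch -t 201202011330 "$base"/date-2012-02-01_13_30
touch -t 201311011230 "$base"/date-2013-11-01_12_30

# Oldest first; head -n -2 drops the two newest from the kill list
find "$base" -mindepth 1 -type d -name 'date-*' -printf '%T@ %p\n' \
  | sort -n | head -n -2 | cut -d' ' -f2- | xargs -r rm -rf

ls "$base"   # → only the two newest directories remain
```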

Shell command for counting words in files

I want to run a command that counts how many times a string occurs across a selected set of files.
If I do find ABG-Development/ -name "*.php" | grep "<?" | wc -l, it searches only the filenames, not the file contents.
I also tried
find ABG-Development/ -name "*.php" -exec grep "<?" {} \; | wc -l, but I got an error.
In the above example I need to count how many times "<?" occurs.
Please help.
Use xargs:
find ABG-Development/ -name "*.php" -print0 | xargs -0 grep "<?" | wc -l
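Note that wc -l counts matching lines, not matches: a line containing "<?" twice still counts once. If you want every occurrence, grep -o prints each match on its own line. A sketch on invented files under /tmp/phpdemo:

```shell
mkdir -p /tmp/phpdemo
printf '<?php echo 1; ?>\n' > /tmp/phpdemo/a.php
printf '<? a ?><? b ?>\n'   > /tmp/phpdemo/b.php   # one line, two "<?" tags

# Matching lines: one per file here
find /tmp/phpdemo -name '*.php' -print0 | xargs -0 grep '<?' | wc -l
# → 2

# Individual occurrences, one output line per match
find /tmp/phpdemo -name '*.php' -print0 | xargs -0 grep -o '<?' | wc -l
# → 3
```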
