How to only get file name with Linux 'find'?

I'm using find to get all files in a directory, so I get a list of paths. However, I need only the file names, i.e. I get ./dir1/dir2/file.txt and I want to get file.txt

In GNU find you can use the -printf action for that, e.g.:
find /dir1 -type f -printf "%f\n"

If your find doesn't have a -printf option, you can also use basename:
find ./dir1 -type f -exec basename {} \;

Use -execdir, which automatically holds the current file in {}, for example:
find . -type f -execdir echo '{}' ';'
You can also use $PWD instead of . (on some systems it won't produce an extra dot at the front).
If you still get an extra dot, you can run this instead:
find . -type f -execdir basename '{}' ';'
-execdir utility [argument ...] ;
The -execdir primary is identical to the -exec primary with the exception that utility will be executed from the directory that holds the current file.
When + is used instead of ;, {} is replaced with as many pathnames as possible for each invocation of utility. In other words, it will print all the file names on one line.
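For instance, combining the above (with + each batch of names, grouped by directory, is printed on one line):
find . -type f -execdir echo '{}' '+'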

If you are using GNU find:
find . -type f -printf "%f\n"
Or you can use a programming language such as Ruby (1.9+):
$ ruby -e 'Dir["**/*"].each{|x| puts File.basename(x)}'
If you fancy a bash (at least version 4) solution:
shopt -s globstar
for file in **; do echo ${file##*/}; done
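Here ${file##*/} is standard shell parameter expansion: it strips the longest prefix matching */, i.e. everything up to and including the last slash. A quick check:
file=./dir1/dir2/file.txt; echo "${file##*/}"   # prints file.txt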

If you want to run some action against the filename only, using basename can be tough.
For example, this:
find ~/clang+llvm-3.3/bin/ -type f -exec echo basename {} \;
will just echo basename /my/found/path, which is not what we want if we want to execute something on the filename.
But you can then pipe the output to xargs, for example to delete files in one directory based on names found in another:
cd dirIwantToRMin;
find ~/clang+llvm-3.3/bin/ -type f -exec basename {} \; | xargs rm
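Another way (a sketch, not from the original answer) is to have -exec spawn a small shell, so the basename result can feed any command directly; here echo stands in for the action you want to run:
find ~/clang+llvm-3.3/bin/ -type f -exec sh -c 'echo "$(basename "$1")"' _ {} \;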

On macOS (BSD find) use:
find /dir1 -type f -exec basename {} \;

As others have pointed out, you can combine find and basename, but by default the basename program will only operate on one path at a time, so the executable will have to be launched once for each path (using either find ... -exec or find ... | xargs -n 1), which may potentially be slow.
If you use the -a option on basename, then it can accept multiple filenames in a single invocation, which means that you can then use xargs without the -n 1, to group the paths together into a far smaller number of invocations of basename, which should be more efficient.
Example:
find /dir1 -type f -print0 | xargs -0 basename -a
Here I've included the -print0 and -0 (which should be used together), in order to cope with any whitespace inside the names of files and directories.
Here is a timing comparison between the xargs basename -a and xargs -n1 basename versions. (For the sake of a like-with-like comparison, the timings reported here are from after an initial dummy run, so that both are done after the file metadata has already been copied to the I/O cache.) I have piped the output to cksum in both cases, just to demonstrate that the output is independent of the method used.
$ time sh -c 'find /usr/lib -type f -print0 | xargs -0 basename -a | cksum'
2532163462 546663
real 0m0.063s
user 0m0.058s
sys 0m0.040s
$ time sh -c 'find /usr/lib -type f -print0 | xargs -0 -n 1 basename | cksum'
2532163462 546663
real 0m14.504s
user 0m12.474s
sys 0m3.109s
As you can see, it really is substantially faster to avoid launching basename every time.

Honestly, the basename and dirname solutions are easier, but you can also check these out:
find . -type f | grep -oP "[^/]*$"
or
find . -type f | rev | cut -d '/' -f1 | rev
or
find . -type f | sed "s/.*\///"
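For completeness, an equivalent awk one-liner (an addition, not part of the original answer); like the grep, cut, and sed variants above, it assumes no newlines in the file names:
find . -type f | awk -F/ '{print $NF}'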

-exec and -execdir are slow, xargs is king.
$ alias f='time find /Applications -name "*.app" -type d -maxdepth 5'; \
f -exec basename {} \; | wc -l; \
f -execdir echo {} \; | wc -l; \
f -print0 | xargs -0 -n1 basename | wc -l; \
f -print0 | xargs -0 -n1 -P 8 basename | wc -l; \
f -print0 | xargs -0 basename | wc -l
139
0m01.17s real 0m00.20s user 0m00.93s system
139
0m01.16s real 0m00.20s user 0m00.92s system
139
0m01.05s real 0m00.17s user 0m00.85s system
139
0m00.93s real 0m00.17s user 0m00.85s system
139
0m00.88s real 0m00.12s user 0m00.75s system
xargs's parallelism also helps.
Funnily enough, I cannot explain the last case of xargs without -n1.
It gives the correct result, and it's the fastest ¯\_(ツ)_/¯
(basename takes only one path argument, but xargs sends them all (actually 5000) without -n1; this does not work on Linux and OpenBSD, only macOS...)
Some bigger numbers from a Linux system, showing how -execdir helps, though it is still much slower than a parallel xargs:
$ alias f='time find /usr/ -maxdepth 5 -type d'
$ f -exec basename {} \; | wc -l; \
f -execdir echo {} \; | wc -l; \
f -print0 | xargs -0 -n1 basename | wc -l; \
f -print0 | xargs -0 -n1 -P 8 basename | wc -l
2358
3.63s real 0.10s user 0.41s system
2358
1.53s real 0.05s user 0.31s system
2358
1.30s real 0.03s user 0.21s system
2358
0.41s real 0.03s user 0.25s system

I've found a solution (on a makandracards page) that gives just the newest file name:
ls -1tr * | tail -1
(thanks goes to Arne Hartherz)
I used it for cp:
cp $(ls -1tr * | tail -1) /tmp/
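Note that parsing ls output like this assumes file names without newlines. A find-based sketch of the same idea (newest regular file in the current directory; GNU find and sort assumed):
find . -maxdepth 1 -type f -printf '%T@\t%f\n' | sort -n | tail -1 | cut -f2-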

Related

I want to get an output of the find command in shell script

I am trying to write a script that finds the files that are older than 10 hours in the sub-directories listed in "HS_client_list", and sends the output to a file "find.log".
#!/bin/bash
while IFS= read -r line; do
echo Executing cd /moveit/$line
cd /moveit/$line
#Find files more than 600 minutes (10 hours) old.
find $PWD -type f -iname "*.enc" -mmin +600 -execdir basename '{}' ';' | xargs ls > /home/infa91punv/find.log
done < HS_client_list
The script is able to cd to the folders from HS_client_list (this file contains the names of the subdirectories), but the find command (find $PWD -type f -iname "*.enc" -mmin +600 -execdir basename '{}' ';' | xargs ls > /home/infa91punv/find.log) is not working: the output file is empty. Yet when I run find $PWD -type f -iname "*.enc" -mmin +600 -execdir basename '{}' ';' | xargs ls > /home/infa91punv/find.log as a standalone command it works; from the script it doesn't.
You are overwriting the file in each iteration.
You can use xargs to perform find on multiple directories, but you have to use an alternate replacement string to avoid having xargs populate the {} in the -execdir command.
sed 's%^%/moveit/%' HS_client_list |
xargs -I '<>' find '<>' -type f -iname "*.enc" -mmin +600 -execdir basename {} \; > /home/infa91punv/find.log
The xargs ls did not seem to perform any useful function, so I took it out. Generally, don't use ls in scripts.
With GNU find, you could avoid the call to an external utility, and use the -printf predicate to print just the part of the path name that you care about.
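For example (mirroring the loop body, with the %f format shown in the first answer above):
find /moveit/"$line" -type f -iname "*.enc" -mmin +600 -printf '%f\n'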
For added efficiency, you could invoke a shell to collect the arguments:
sed 's%^%/moveit/%' HS_client_list |
xargs sh -c 'find "$@" -type f -iname "*.enc" -mmin +600 -execdir basename {} \;' _ >/home/infa91punv/find.log
This will pass as many directories as possible to each find invocation.
If you want to keep your loop, the solution is to put the redirection after done. I would still factor out the cd, and take care to quote the variable interpolation.
while IFS= read -r line; do
find /moveit/"$line" -type f -iname "*.enc" -mmin +600 -execdir basename '{}' ';'
done < HS_client_list >/home/infa91punv/find.log

How to limit the number of results from find?

I would like to do something like:
find ./ -name "*.jpg" -nbresult 50 -exec cp {} /50randomsjpgfrommyharddrive
I can use head and xargs, but with -print0, head doesn't work any more.
Provided that you do not have newlines in the filenames, -print0 is unnecessary, and you can instead use:
find ./ -name "*.jpg" | head -n 50 | xargs -d'\n' -n1 -I'{}' cp '{}' /50randomsjpgfrommyharddrive
In this command, the -d'\n' makes xargs delimit on newlines only. Other whitespace in filenames, which would by default be treated as a delimiter by xargs, is then not a problem.
Alternatively, if you still need to use -print0, the following command line incorporates a filter which is analogous to head -n 50 but is based on null delimiters (rather than newlines) on its input and output. Note that -0 is needed on xargs in this case.
find ./ -name "*.jpg" -print0 | perl -p0e 'exit if $i++ == 50' | xargs -0 -n1 -I'{}' cp '{}' /50randomsjpgfrommyharddrive
GNU head has an option called -z for changing the line terminator to NUL, which can be used for this task as shown below.
find -name '*.jpg' -print0 \
| head -z -n 50 \
| xargs -0 cp -t /destination

shell script to delete all files except the last updated file in different folders

My application logs are created in the below folders on a Linux system:
Folder 1: 100001_1001
Folder 2: 200001_1002
Folder 3: 300061_1003
Folder 4: 300001_1004
Folder 5: 400011_1008
I want to delete all files except the latest file in the above folders, and want to add this to a cron job.
I tried the below, but it is not working; I need help:
30 1 * * * ls -lt /abc/cde/etc/100* | awk '{if(NR!=1) print $9}' | xargs -i rm -rf {} \;
30 1 * * * ls -lt /abc/cde/etc/200* | awk '{if(NR!=1) print $9}' | xargs -i rm -rf {} \;
30 1 * * * ls -lt /abc/cde/etc/300* | awk '{if(NR!=1) print $9}' | xargs -i rm -rf {} \;
30 1 * * * ls -lt /abc/cde/etc/400* | awk '{if(NR!=1) print $9}' | xargs -i rm -rf {} \;
You can use this pipeline consisting of all GNU utilities (so that we can also handle file paths with special characters, whitespace, and glob characters):
find /parent/log/dir -type f -name '*.zip' -printf '%T@\t%p\0' |  # print mtime<TAB>path, NUL-terminated
sort -zk1,1rn |  # sort the NUL-terminated records, newest first
cut -zf2- |      # drop the timestamp, keep the full path
tail -z -n +2 |  # skip the first record (the newest file)
xargs -0 rm -f   # delete the rest
Using a slightly modified approach to your own:
find /abc/cde/etc/100* -printf "%A+\t%p\n" | sort -k1,1r | awk 'NR!=1{print $2}' | xargs -i rm "{}"
The find version doesn't suffer from the missing paths, so this MIGHT work (I don't know anything about the directory structure, or whether 100* points at a directory, a file, or a group of files...).
You should use find instead. It has a -delete action that deletes the files it finds that match your specification. Warning: it is very easy to go wrong with -delete, so test your command first. For example, to find all files named *.zip under a/b/c (and only files):
find a/b/c -depth -name '*.zip' -type f -print
This is the test; it will print all files that the final command will delete (do not forget the -depth, it is important). Once you are sure, the command that does the deletion is:
find a/b/c -depth -name '*.zip' -type f -delete
find also has options to select files by last modification date, by size, and so on. You could, for instance, find all files that were modified at least 24 hours ago:
find a/b/c -depth -type f -mtime +0 -print
and, after careful check, delete them:
find a/b/c -depth -type f -mtime +0 -delete

Using find and grep in a specific file in a specific directory

I have the following directory structure:
./A1
./A2
./A3
./B
./C
In each one of the A* directories I have:
./A*/logs
./A*/test
in the logs directory I have:
./log-jan-1
./log-jan-2
./log-feb-1
How do I grep for a string in all January logs in the A directories?
I tried this, but it did not find the string although it is present in the log files:
find . -type d -name 'A*' print | xargs -n1 -I PATH grep string - PATH/logs/log-jan*
What am I doing wrong?
Why don't you simply use
grep string ./A*/logs/log-jan*
?
If it is not a typo, you should use -print (or -print0) instead of print.
But as find + xargs + grep constructs are hard to debug, you should test in sequence:
find . -type d -name 'A*' -print
find . -type d -name 'A*' -print0 | xargs -0 -n1 -I PATH echo grep string - PATH
and finally:
find . -type d -name 'A*' -print0 | xargs -0 -n1 -I PATH grep string - PATH/logs/log-jan*
In your use case, -print and -print0 should give the same results, but having been burnt by -print before, I always use -print0 before xargs.

How to pipe the results of 'find' to mv in Linux

How do I pipe the results of a 'find' (in Linux) to be moved to a different directory? This is what I have so far:
find ./ -name '*article*' | mv ../backup
but it's not right yet (I get a "missing file argument" error, because I didn't specify a file, because I was trying to get it from the pipe)
find ./ -name '*article*' -exec mv {} ../backup \;
OR
find ./ -name '*article*' | xargs -I '{}' mv {} ../backup
xargs is commonly used for this, and mv on Linux has a -t option to facilitate that.
find ./ -name '*article*' | xargs mv -t ../backup
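With GNU find and xargs you can also make this whitespace-safe (a sketch combining the -print0/-0 pairing shown elsewhere in this thread):
find ./ -name '*article*' -print0 | xargs -0 mv -t ../backup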
If your find supports -exec ... \+ you could equivalently do
find ./ -name '*article*' -exec mv -t ../backup {} \+
The -t option is a GNU extension, so it is not portable to systems which do not have GNU coreutils (though every proper Linux I have seen has that, with the possible exception of Busybox). For complete POSIX portability, it's of course possible to roll your own replacement, maybe something like
find ./ -name '*article*' -exec sh -c 'mv "$@" "$0"' ../backup {} \+
where we shamelessly abuse the convenient fact that the first argument after sh -c 'commands' ends up as the "script name" parameter in $0 so that we don't even need to shift it.
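A quick way to see that parameter layout (illustration only):
sh -c 'echo "0=$0 rest=$@"' first second third
prints 0=first rest=second third.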
Probably see also https://mywiki.wooledge.org/BashFAQ/020
I found this really useful when having thousands of files in one folder:
ls -U | head -10000 | egrep '\.png$' | xargs -I '{}' mv {} ./png
This moves the PNG files among the first 10000 directory entries into the png subfolder.
mv $(find . -name '*article*') ../backup
(Note that this breaks if any of the found file names contain whitespace.)
Here are a few solutions.
find . -type f -newermt "2019-01-01" ! -newermt "2019-05-01" \
-exec mv {} path \;
or
find path -type f -newermt "2019-01-01" ! -newermt "2019-05-01" \
-exec mv {} path \;
or
find /Directory/filebox/ -type f -newermt "2019-01-01" \
! -newermt "2019-05-01" -exec mv {} ../filemove/ \;
The backslash + newline is just for legibility; you can equivalently use a single long line.
xargs is your buddy here (when you have multiple actions to take)!
And using it the way shown here will give you good control as well.
find ./ -name '*article*' | xargs -I '{}' sh -c "mv {} <path/to/target/directory>"
Explanation:
-I '{}'
Tells xargs to take one line of input at a time and substitute it for every {} in the command ahead
sh -c
The shell command to execute, with the line substituted in as per the previous option
"mv {} <path/to/target/directory>"
The move command takes two arguments:
1) The line from the find output, i.e. {}, substituted automatically
2) The target path for the move command, as specified
Note: the "Double Quotes" are specified so that the shell command, with any number of spaces or arguments, is passed to sh -c as a single string.
