Count lines found with find command - linux

I have configured GlusterFS on two servers.
I want to implement a script which monitors the replication. My idea is to execute the following:
find "/replica_path/" -mmin +1 -exec ls -l {} \; |wc -l
This finds the files modified more than 1 minute ago and should return the same count on both servers.
I'll use spawn to execute this line remotely.
But when I run that line from the command line, the server takes so long to return that I have to interrupt the execution.
How could I implement this?

ls -l might need quite some time to resolve owner names etc., and -exec starts a new ls process for every match.
Perhaps you just need to count the number of matches:
find "/replica_path/" -mmin +1 | wc -l

If you just want to count the matches, it might help to avoid executing /bin/ls for each matched item.
Try
find "/replica_path/" -mmin +1 -print | wc -l
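The counting approach from these answers can be wrapped in a small function that the monitoring script calls on each server. This is only a sketch: the name count_older_than is made up for illustration, it assumes GNU find (for -printf), and it counts regular files only, which is a narrower match than the commands above.

```shell
# count_older_than DIR MINUTES  (hypothetical helper, not from the thread)
# Prints the number of regular files under DIR whose last modification
# is more than MINUTES minutes old. Assumes GNU find for -printf.
count_older_than() {
    dir=$1
    minutes=$2
    # Emit one dot per match and count bytes instead of lines, so a
    # newline inside a file name cannot inflate the count.
    find "$dir" -type f -mmin +"$minutes" -printf '.' | wc -c
}
```

Running count_older_than /replica_path 1 on both servers and comparing the two numbers would then be the replication check.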

Related

searching an exact number of files in a Linux directory

I want recursively count files in a Linux directory using this
find DIR_NAME -type f | wc -l
My question is: how can I stop the find execution above once more than 1000 files have been found in that folder?
Is that possible, or do I need to wait for find to finish?
You can use head to limit the number of lines returned by find:
find DIR_NAME -type f | head -n 1000 | wc -l
The head program exits after the first thousand lines; find then receives SIGPIPE the next time it writes to the pipe and exits as well.
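To answer "are there more than N files?" rather than "how many, capped at N?", one option is to ask head for one line beyond the limit. A minimal sketch, with has_more_than as a made-up name; it assumes no file name contains a newline, since it counts lines.

```shell
# has_more_than DIR LIMIT  (hypothetical helper)
# Succeeds (exit status 0) if DIR contains more than LIMIT regular files.
has_more_than() {
    dir=$1
    limit=$2
    # Ask head for one line past the limit; it exits as soon as it has
    # read that many, so find is not left to walk the rest of the tree.
    found=$(find "$dir" -type f | head -n "$((limit + 1))" | wc -l)
    [ "$found" -gt "$limit" ]
}
```

It can be used directly in a condition, for example: if has_more_than DIR_NAME 1000; then echo "too many files"; fi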

Find and sort files by date modified

I know that there are many answers to this question online. However, I would like to know if this alternate solution would work:
ls -lt `find . -name "*.jpg" -print | head -10`
I'm aware of course that this will only give me the first 10 results. The reason I'm asking is because I'm not sure whether the ls is executing separately for each result of find or not. Thanks
In your solution:
ls is executed once, after find (inside the backticks) has finished, not once per result
without the head, find could yield more results than ls can take on one command line, in which case you might want to look at the xargs command
This should work better:
find . -type f -print0 | xargs -0 stat -f"%m %Sm %N" | sort -rn
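The three parts of the command do this (the stat -f format shown is the BSD/macOS form; GNU stat on Linux uses -c with different format codes):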
find all files and print their path
use xargs to process the (long) list of files and print out the modification unixtime, human readable time, and filename for each file
sort the resulting list in reverse numerical order
The main trick is to add the numerical unixtime when the files were last modified to the beginning of the lines, and then sort them.
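On Linux, GNU find can print the timestamp itself, which applies the same "prefix with the epoch time, then sort" trick without calling stat. The function below is a sketch under that assumption; newest_first and its parameters are illustrative names, and paths containing newlines are not handled.

```shell
# newest_first DIR PATTERN [COUNT]  (hypothetical helper)
# Prints the COUNT (default 10) most recently modified files under DIR
# whose names match PATTERN, newest first. Assumes GNU find.
newest_first() {
    dir=$1
    pattern=$2
    count=${3:-10}
    # %T@ is the modification time in seconds since the epoch,
    # %p is the path as find sees it.
    find "$dir" -type f -name "$pattern" -printf '%T@ %p\n' |
        sort -rn |
        head -n "$count" |
        cut -d' ' -f2-   # drop the timestamp column again
}
```

For the question's case this would be called as newest_first . '*.jpg' 10. Unlike the backtick version, the sorting happens before head, so the result is the ten newest files rather than the first ten that find happened to print.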

find -exec doesn't recognize argument

I'm trying to count the total lines in the files within a directory. To do this I am trying to use a combination of find and wc. However, when I run find . -exec wc -l {}\;, I receive the error find: missing argument to -exec. I can't see any apparent issues. Any ideas?
You simply need a space between {} and \;
find . -exec wc -l {} \;
Note that if there are any sub-directories below the current location, wc will generate an error message for each of them that looks something like this:
wc: ./subdir: Is a directory
To avoid that problem, you may want to tell find to restrict the search to files:
find . -type f -exec wc -l {} \;
Another note: using the -exec option is a good idea. Too often people pipe commands together expecting the same result. Here, for instance, that would be:
find . -type f | xargs wc -l
The problem with piping commands in such a manner is that it breaks if any file name contains whitespace, because xargs splits its input on blanks and newlines by default. For instance, if a file were named "a b", wc would receive "a" and then "b" separately and you would get two error messages: a: no such file and b: no such file.
Unless you know for a fact that your file names never contain spaces (or non-printable characters), if you do need to pipe commands together, you need to tell all the tools in the pipeline to use the NUL character (\0) as the separator instead of whitespace. So the previous command would become:
find . -type f -print0 | xargs -0 wc -l
With version 4.0 or later of bash, you don't need your find command at all:
shopt -s globstar
wc -l **/*
There's no simple way to skip directories, which as pointed out by Gui Rava you might want to do, unless you can differentiate files and directories by name alone. For example, maybe directories never have . in their name, while all the files have at least one extension:
wc -l **/*.*
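If the goal is a single total rather than one count per file, find can also batch the file names itself with the "{} +" form of -exec, which avoids both the pipe-splitting problem and one process per file. A sketch, with total_lines as an illustrative name:

```shell
# total_lines DIR  (hypothetical helper)
# Prints the combined number of lines in all regular files under DIR.
total_lines() {
    # "{} +" hands many file names to each cat invocation. The names are
    # passed as arguments, never through a pipe, so spaces are harmless.
    find "$1" -type f -exec cat {} + | wc -l
}
```

Called as total_lines . it prints one number for the whole tree. Using cat instead of wc inside -exec means the total stays correct even when find has to split the file list across several invocations.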

List all files (with full paths) in a directory (and subdirectories), order by access time

I'd like to construct a Linux command to list all files (with their full paths) within a specific directory (and subdirectories) ordered by access time.
ls can order by access time, but doesn't give the full path. find gives the full path, but the only control you have over the access time is to specify a range with -atime N (accessed at least 24*N hours ago), which isn't what I want.
Is there a way to order by access time and get the full path at once? I could just write a script, but it seems there should be a way to do this with the standard Linux programs.
find . -type f -exec ls -l {} \; 2> /dev/null | sort -t' ' -k +6,6 -k +7,7
This will find all files and sort them by date and then time. You can then use awk or cut to extract the dates and file names from the ls -l output.
You could try:
ls -l $(find /foo/bar -type f )
You can add other options (e.g. -t for sorting) to the ls command to achieve your goal.
You could also add your search criteria to the find command.
find . -type f | xargs ls -ldt should do the trick, as long as there are not so many files that you hit the command line argument limit and xargs spawns 2 instances of ls.
pwd | xargs -I % find % -type f
find . -type f -exec ls -l --full-time {} \; 2> /dev/null | sort -t' ' -k +6,6 -k +7,7
Alex's answer did not work for me since I had files older than one year and the sorting got messed up. The above adds the --full-time option, which prints every timestamp in the same full format and makes them sortable regardless of how old the files are.
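Note that ls -l shows the modification time unless told otherwise, so the commands above do not actually sort by access time. With GNU find, one way to get full paths ordered by access time is to print the access time as a number and sort on it. This is a sketch, not something proposed in the thread; by_access_time is a made-up name and paths containing newlines are not handled.

```shell
# by_access_time DIR  (hypothetical helper)
# Lists regular files under DIR with absolute paths, least recently
# accessed first. Assumes GNU find for -printf.
by_access_time() {
    # Resolve DIR to an absolute path so that find prints full paths.
    abs=$(cd "$1" && pwd) || return 1
    # %A@ is the access time in seconds since the epoch, %p is the path.
    find "$abs" -type f -printf '%A@ %p\n' | sort -n | cut -d' ' -f2-
}
```

Replacing sort -n with sort -rn lists the most recently accessed files first.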

How can I use FIND to recursively backup multiple subversion repositories

At the moment our backup script explicitly runs svnadmin hotcopy on each of our repositories every night. Our repos are all stored under a parent directory (/usr/local/svn/repos)
Our backup script has a line for each of the repos under that directory along the lines of:
svnadmin hotcopy /usr/local/svn/repos/myrepo1 /usr/local/backup/myrepo1
Instead of having to manually add a new line for every new repo we bring online, I was hoping to use the find command to run svnadmin hotcopy for every directory it finds under /usr/local/svn/repos.
So far I've got:
find /usr/local/svn/repos/ -maxdepth 1 -mindepth 1 -type d -exec echo /usr/local/backup{} \;
where I'm substituting "echo" for "svnadmin hotcopy" for simplicity's sake.
The output of which is:
/usr/local/backup/usr/local/svn/repos/ure
/usr/local/backup/usr/local/svn/repos/cheetah
/usr/local/backup/usr/local/svn/repos/casemgt
/usr/local/backup/usr/local/svn/repos/royalliver
/usr/local/backup/usr/local/svn/repos/ure_andras
/usr/local/backup/usr/local/svn/repos/profserv
/usr/local/backup/usr/local/svn/repos/frontoffice
/usr/local/backup/usr/local/svn/repos/ure.orig
/usr/local/backup/usr/local/svn/repos/projectcommon
/usr/local/backup/usr/local/svn/repos/playground
/usr/local/backup/usr/local/svn/repos/casegen
The problem is that the full path is included in {}. I need only the last element of the directory name passed to -exec.
The output I want is:
/usr/local/backup/ure
/usr/local/backup/cheetah
/usr/local/backup/casemgt
/usr/local/backup/royalliver
/usr/local/backup/ure_andras
/usr/local/backup/profserv
/usr/local/backup/frontoffice
/usr/local/backup/ure.orig
/usr/local/backup/projectcommon
/usr/local/backup/playground
/usr/local/backup/casegen
I'm pretty much stuck at this point. Can anyone help me out here?
Thanks in advance,
Dave
You were on the right track. Try this:
find /usr/local/svn/repos/ -maxdepth 1 -mindepth 1 -type d -printf "%f\0" | xargs -0 -I{} echo svnadmin hotcopy /usr/local/svn/repos/\{\} /usr/local/backup/\{\}
The %f is like basename and the null plus the -0 on xargs ensures that names with spaces, etc., get passed through successfully.
Just remove the echo and make any adjustments you might need and it should do the trick.
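An alternative that stays entirely inside find is to let -exec start a small sh script and strip the directory part with parameter expansion. This is a sketch, not part of the answer above: hotcopy_commands is a made-up name, the two directories are passed in as arguments, and like the original it only prints the commands.

```shell
# hotcopy_commands REPOS_DIR BACKUP_DIR  (hypothetical helper)
# Prints (does not run) one "svnadmin hotcopy" command per directory
# found directly under REPOS_DIR.
hotcopy_commands() {
    find "$1" -mindepth 1 -maxdepth 1 -type d -exec sh -c '
        backup=$1
        shift
        for repo in "$@"; do
            # ${repo##*/} removes everything up to the last slash,
            # leaving only the final path element.
            echo svnadmin hotcopy "$repo" "$backup/${repo##*/}"
        done
    ' sh "$2" {} +
}
```

For the paths in the question the call would be hotcopy_commands /usr/local/svn/repos /usr/local/backup. Dropping the echo inside the loop would run the commands instead of printing them.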
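Put a cut command at the end, keeping the first four "/"-separated fields (the empty one before the leading slash, usr, local, backup) and the ninth (the repo name):
find /usr/local/svn/repos/ -maxdepth 1 -mindepth 1 -type d -exec echo /usr/local/backup{} \; | cut -d"/" -f1-4,9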
How about adding a sed filter that cuts out the middle part?
sed 's/usr.local.svn.repos.//g'
Added like this:
find /usr/local/svn/repos/ -maxdepth 1 -mindepth 1 -type d -exec echo /usr/local/backup{} ";" | sed 's/usr.local.svn.repos.//g'
ls -al /usr/local/svn/repos/ |grep '^d' |sed "s/^...............................................................//" |xargs -L 1 -I zzyggy echo /usr/local/svn/repos/zzyggy
It's a bit long but it does the trick. You don't have to do everything with find when there are lots of other shell commands, although if I had to write this kind of script, I would do it in Python and leave the shell for interactive work.
ls -al lists all the files in the named directory with attributes
grep '^d' selects the lines beginning with d which are directories
sed strips off all the characters to the left of the actual directory name. You may need to add or delete some dots
xargs takes the list of directory names and issues it one at a time. I specified zzyggy as the name to substitute in the executed command, but you can choose what you like. Of course, you would replace echo with your svnadmin command.
If this were in a shell script, you should really put the path in a variable:
SVNDIRNAME="/usr/local/svn/repos"
ls -al "$SVNDIRNAME" |grep '^d' |sed "s/^...............................................................//" |xargs -L 1 -I zzyggy echo "$SVNDIRNAME"/zzyggy
but I decided to show the wrong and right way just to explain this point. I'm going to tag this with some shell tag, but I still think that a Python script is a superior way to solve this kind of problem in the 21st century.
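Staying in the shell, the ls | grep | sed pipeline can be avoided altogether. It depends on the column widths of ls -l and also picks up the . and .. entries, whereas a glob ending in a slash matches directories directly. A minimal sketch, with list_repo_dirs as an illustrative name; note that a plain glob skips hidden directories and also matches symlinks that point to directories.

```shell
# list_repo_dirs PARENT  (hypothetical helper)
# Prints each immediate subdirectory of PARENT, one per line.
list_repo_dirs() {
    for dir in "$1"/*/; do
        # With no subdirectories the pattern stays unexpanded; skip it.
        [ -d "$dir" ] || continue
        # Drop the trailing slash left by the */ pattern.
        printf '%s\n' "${dir%/}"
    done
}
```

In a backup script the printf line would be replaced by the svnadmin hotcopy call, using "${dir%/}" as the source and the same ${name##*/} expansion to build the target.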

Resources