Bash script to find large files modified in the past 24 hours - Linux

How can I search through a massive amount of data (28TB) to find the largest 10 files in the past 24 hours?
From the current answers below I've tried:
$ find . -type f -mtime -1 -printf "%p %s\n" | sort -k2nr | head -5
This command takes over 24 hours, which defeats the purpose of searching for the most recently modified files in the past 24 hours. Are there any solutions known to be faster than the one above that can drastically cut the search time? Solutions that monitor the system also will not work: there is simply too much to monitor, and doing so could cause performance issues.

something like this?
$ find . -type f -mtime -1 -printf "%p %s\n" | sort -k2nr | head -5
That lists the top 5 files by size among those modified in the past 24 hours.

You can use the standard yet very powerful find command like this (start_directory is the directory to scan):
find start_directory -type f -mtime -1 -size +3000G
-mtime -1 option: files modified 1 day ago or less
-size +3000G option: files larger than 3000 GiB, i.e. roughly 3 TB (use +3G if you actually mean a 3 GiB threshold)
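A sketch combining the two answers in this thread: -mtime and -size narrow the candidates, then sort and head rank what is left by size. The 1G threshold here is a placeholder to adjust; the directory walk itself still has to visit every file, so this mainly cuts the sorting and output stages.
# Ten largest files modified in the last 24 hours, ignoring anything under 1 GiB
find start_directory -type f -mtime -1 -size +1G -printf "%s\t%p\n" 2>/dev/null \
    | sort -rn \
    | head -10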


Shell script to find recently modified files [duplicate]

E.g., a MySQL server is running on my Ubuntu machine. Some data has been changed during the last 24 hours.
What (Linux) scripts can find the files that have been changed during the last 24 hours?
Please list the file names, file sizes, and modified time.
To find all files modified in the last 24 hours in a specific directory and its sub-directories:
find /directory_path -mtime -1 -ls
Should be to your liking
The - before the 1 is important: it means anything changed less than one day (24 hours) ago.
A + before the 1 would instead mean anything changed more than one full day ago (48 hours or more, because find rounds the file's age down to whole 24-hour periods), while having nothing before the 1 would mean it was changed exactly one full day ago, i.e. between 24 and 48 hours.
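For a quick side-by-side sanity check of the three forms (same /directory_path placeholder as above):
find /directory_path -mtime -1 -ls   # modified less than 24 hours ago
find /directory_path -mtime 1 -ls    # modified between 24 and 48 hours ago
find /directory_path -mtime +1 -ls   # modified 48 hours ago or more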
Another, more human-friendly way is to use the -newermt option, which understands human-readable time specifications.
Unlike the -mtime option, which requires the user to read the find documentation to figure out what units -mtime expects and then convert their own time units into those, which is error-prone and plain user-unfriendly. -mtime was barely acceptable in the 1980s, but in the 21st century -mtime has the convenience and safety of stone-age tools.
Example uses of -newermt option with the same duration expressed in different human-friendly units:
find /<directory> -newermt "-24 hours" -ls
find /<directory> -newermt "1 day ago" -ls
find /<directory> -newermt "yesterday" -ls
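-newermt also makes a bounded window easy to express; negating a second -newermt test gives the upper bound (a sketch in the same placeholder style as above):
find /<directory> -newermt "2 days ago" ! -newermt "1 day ago" -ls   # modified between 48 and 24 hours ago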
You can do that with
find . -mtime 0
From man find:
[The] time since each file was last modified is divided by 24 hours and any remainder is discarded. That means that to
match -mtime 0, a file will have to have a modification in the past which is less than 24 hours ago.
On GNU-compatible systems (i.e. Linux):
find . -mtime 0 -printf '%T+\t%s\t%p\n' 2>/dev/null | sort -r | more
This will list files and directories that have been modified in the last 24 hours (-mtime 0). It will list them with the last modified time in a format that is both sortable and human-readable (%T+), followed by the file size (%s), followed by the full filename (%p), each separated by tabs (\t).
2>/dev/null throws away any stderr output, so that error messages don't muddy the waters; sort -r sorts the results by most recently modified first; and | more lists one page of results at a time.
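If the goal is the biggest recent files rather than the newest, the same -printf trick works with the size field placed first; this is a variation on the command above, not part of the original answer:
# Largest files modified in the last 24 hours, biggest first
find . -mtime 0 -printf '%s\t%T+\t%p\n' 2>/dev/null | sort -rn | head -20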
For others who land here in the future (including myself), add a -name option to find specific file types, for instance: find /var -name "*.php" -mtime -1 -ls
This command worked for me
find . -mtime -1 -print
Find the files...
Use -type f to restrict the results to regular files:
find /directory_path -type f -mtime -1 -exec ls -lh {} \;
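With \; find runs one ls process per file; ending the -exec with + instead passes many files to each ls invocation, which is noticeably faster on large trees. The output format is the same:
find /directory_path -type f -mtime -1 -exec ls -lh {} +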

Find files modified over 1 hour ago but less than 3 days

In Linux, using bash, what's the easiest way to find files that were modified more than an hour ago but less than 3 days ago?
Surely, there's got to be an easy way to do this. I keep searching and can't find an easy solution.
find has both -mtime and -mmin, which can be combined:
find . -mtime -3 -mmin +60
From the find manual:
Numeric arguments can be specified as:
+n for greater than n
-n for less than n
n for exactly n
This should suffice: find . -mtime -3 -mmin +60
I just tried it:
find ./ -mtime -3 -mmin +60 -exec ls -lhrt {} \; | awk '{print $5" "$6" "$7" "$8}'
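Parsing ls output with awk is fragile (column positions shift with locales and with filenames containing spaces); GNU find can print the equivalent fields directly. A sketch:
# Modification date, time, size and path without going through ls/awk (GNU find)
find . -mtime -3 -mmin +60 -printf '%TY-%Tm-%Td %TH:%TM %s %p\n'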

find files which have been modified in the last 30 minutes in Linux

How do I find files based on time information, such as creation, modification and access times? It is useful to find files before a certain time, after a certain time, and between two times. What command in Linux would I have to use?
I understand that to find setuid files on Linux computers I would have to use:
find / -xdev \( -perm -4000 \) -type f -print0 | xargs -0 ls -l
How do I check for files which have been modified in the last 30 minutes? (I created a new file called FILE2.)
Just add -mmin -30 (-mtime counts in whole 24-hour days, -mmin counts in minutes). See man find.
The answer to your question is:
find . -cmin -30 -exec ls -l {} \;
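Note that -cmin tests the status-change time (ctime), which also updates when permissions or ownership change; if you strictly want content modifications, -mmin is the closer match. A quick comparison:
find . -mmin -30 -ls   # content modified in the last 30 minutes
find . -cmin -30 -ls   # status changed (content, permissions or ownership) in the last 30 minutes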

Grep inside all files created within date range

I am on the Ubuntu OS. I want to grep a word (say XYZ) inside all log files which are created within date range 28-may-2012 to 30-may-2012.
How do I do that?
This is a little different from Banthar's solution, but it will work with versions of find that don't support -newermt and it shows how to use the xargs command, which is a very useful tool.
You can use the find command to locate files "of a certain age". This will find all files modified between 5 and 10 days ago:
find /directory -type f -mtime -10 -mtime +5
To then search those files for a string:
find /directory -type f -mtime -10 -mtime +5 -print0 |
xargs -0 grep -l expression
You can also use the -exec switch, but I find xargs more readable (and it will often perform better, too, but possibly not in this case).
(Note that the -0 flag is there to let this command operate on files with embedded spaces, such as this is my filename.)
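The -exec form mentioned above looks like this; with a trailing + it batches filenames much like xargs does, so the performance difference is usually small:
find /directory -type f -mtime -10 -mtime +5 -exec grep -l expression {} +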
Update for question in comments
When you provide multiple expressions to find, they are ANDed together. E.g., if you ask for:
find . -name foo -size +10k
...find will only return files that are both (a) named foo and (b) larger than 10 kbytes. Similarly, if you specify:
find . -mtime -10 -mtime +5
...find will only return files that are (a) newer than 10 days ago and (b) older than 5 days ago.
For example, on my system it is currently:
$ date
Fri Aug 19 12:55:21 EDT 2016
I have the following files:
$ ls -l
total 0
-rw-rw-r--. 1 lars lars 0 Aug 15 00:00 file1
-rw-rw-r--. 1 lars lars 0 Aug 10 00:00 file2
-rw-rw-r--. 1 lars lars 0 Aug 5 00:00 file3
If I ask for "files modified more than 5 days ago" (-mtime +5), I get:
$ find . -mtime +5
./file3
./file2
But if I ask for "files modified more than 5 days ago but less than 10 days ago" (-mtime +5 -mtime -10), I get:
$ find . -mtime +5 -mtime -10
./file2
Combine grep with find:
find -newermt "28 May 2012" -not -newermt "30 May 2012" -exec grep XYZ \{\} \;
find doesn't seem to have options where you can specify specific dates for timestamp comparison, at least the version on my laptop doesn't (newer GNU versions support -newermt, as shown above), so you'll have to use the number of days. So, as of 2012/06/05, you want to find files newer than 9 days but older than 6 days:
find . -type f -ctime -9 -ctime +6 -print0 | xargs -0 grep XYZ

Linux command to check new files in file system

We have a Linux machine and we would like to check what new files have been added between a certain date range.
I only have SSH access to this box and it's openSUSE 11.1.
Is there some sort of command that can give me a list of files that have been added to the filesystem between, say, 04/05/2011 and 05/05/2011?
Thanks
Regards
Gabriel
There are a bunch of ways to do that.
First one:
start_date=201105040000
end_date=201105042359
touch -t ${start_date} start
touch -t ${end_date} end
find /you/path -type f -name '*you*pattern*' -newer start ! -newer end -exec ls -s {} \;
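With a find that supports -newermt (reasonably recent GNU findutils), the same window can be expressed without the two temporary reference files. A sketch using the same timestamps:
find /you/path -type f -name '*you*pattern*' -newermt '2011-05-04 00:00' ! -newermt '2011-05-04 23:59' -ls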
Second one:
find files whose status changed (ctime) between 20 and 21 days ago:
find -ctime +20 -ctime -21
find files whose status changed between 2500 and 2800 minutes ago:
find -cmin +2500 -cmin -2800
Well, you could use find to get a list of all the files that were last-modified in a certain time window, but that isn't quite what you want. I don't think you can tell just from a file's metadata when it came into existence.
Edit: To list the files along with their modification dates, you can pipe the output of find through xargs to run ls -l on all the files, which will show the modification time.
find /somepath -type f ... -print0 | xargs -0 -- ls -l
I misunderstood your question. Depending on what filesystem you are using, it may or may not store creation time.
My understanding is that ext2/3/4 do not store creation time, but modification, status-change (which is slightly different), and access times are stored.
FAT32, on the other hand, does contain creation timestamps, IIRC.
If you are using an ext filesystem, you have two options it seems:
1. Settle for finding all of the files that were modified between two dates (which will include created files, but also files that were just edited). You could do this using find.
2. Create a script/cron job that will document the contents of your filesystem at some interval, e.g.
find / > filesystem.$(date +%s).log
and then run diffs to see what has been added. This, of course, would prevent you from looking backwards to time before you started making these logs.
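A sketch of what comparing two such snapshots could look like; comm needs sorted input, and lines that appear only in the newer snapshot are paths that did not exist before (the OLD/NEW filenames below are illustrative):
# Take a snapshot (run periodically, e.g. from cron); sort it so snapshots can be compared with comm
find / | sort > filesystem.$(date +%s).log
# Later: list paths present only in the newer snapshot, i.e. files added in between
comm -13 filesystem.OLD.log filesystem.NEW.log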
You can try one of these:
find -newerct "1 Aug 2013" ! -newerct "1 Sep 2013" -ls
find . -newermt "2013-01-01 23:59:59" ! -newermt "2016-01-02 23:59:59"
find /media/WD/backup/osool/olddata/ -newermt 20120101T1200 -not -newermt 20130101T1400
find . -mtime +1 -mtime -3
find . -mtime +1 -mtime -3 > files_from_yesterday.txt 2>&1
find . -mtime +1 -mtime -3 -ls > files_from_yesterday.txt 2>&1
touch -t 200506011200 first
touch -t 200507121200 last
find / -newer first ! -newer last
#!/bin/bash
# Move every file in Your_Mail_Dir/ modified between 2011-01-01 and 2011-12-31
# (word splitting in the for loop breaks on filenames containing whitespace)
for i in $(find Your_Mail_Dir/ -newermt "2011-01-01" ! -newermt "2011-12-31"); do
    mv "$i" /moved_emails_dir/
done
Hope this helps.
