grep/touch -mtime check log within last XX minutes - linux

i want to run a query on a linux machine. i want to check a log file and only check within the last 15 minutes. assuming there were changes and additions to the log file. what is the correct query?
grep 'test condition' /home/somelogpath/file.log -mtime 15
thanks.

To check only files modified in the last day (-mtime counts in 24-hour units, so -mtime -1 means "less than one day ago"):
find /path/to/logfiles -mtime -1 -print
so you could
find /path/to/logfiles -mtime -1 -print | xargs grep "some condition" /dev/null
(note: I add a file (/dev/null) so that grep always has at least 2 files and is thus forced to display the name(s) of the matching files. As it's /dev/null, it won't match anything you grep for, so you won't see "/dev/null" in the output of grep. But it serves its purpose: grep will prefix every match with the filename, even if only 1 file matched.)
For minutes and other units, check:
man find
(the exact options depend on your OS; GNU find has -mmin, for example) (and I don't have a Linux box at hand right now)
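On GNU find, -mmin filters by minutes, which answers the original 15-minute question directly. A runnable sketch (the temp-directory setup below is only for illustration; in practice point find at your real log path, e.g. /home/somelogpath):

```shell
# Demo setup with a temporary directory (substitute your real log path).
logdir=$(mktemp -d)
echo 'test condition met' > "$logdir/recent.log"
echo 'test condition met' > "$logdir/stale.log"
touch -d '2 days ago' "$logdir/stale.log"   # backdate the second file

# Grep only logs modified within the last 15 minutes (GNU find's -mmin).
# /dev/null forces grep to print filenames, as explained above.
matches=$(find "$logdir" -name '*.log' -mmin -15 \
    -exec grep 'test condition' /dev/null {} +)
echo "$matches"
```

Only recent.log should appear in the output, prefixed with its path.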
If you meant "only match the last X seconds inside a log file": use awk with a condition on each line. It depends on the log format: if it uses "seconds since startup" or epoch seconds, you can simply test that the relevant field is >= some value. If it uses "2014-05-15 hh:mm" you can also find ways to do it, but it will be cumbersome.
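If the goal is filtering lines inside the log, here is one awk sketch under the assumption (and it is only an assumption, as the format varies) that each line begins with an epoch timestamp:

```shell
# Fake log whose lines start with epoch seconds (an assumed format).
log=$(mktemp)
now=$(date +%s)
printf '%s recent event\n%s old event\n' "$((now - 30))" "$((now - 7200))" > "$log"

# Keep only lines whose first field is within the last 15 minutes.
cutoff=$(date -d '15 minutes ago' +%s)   # GNU date
kept=$(awk -v c="$cutoff" '$1 >= c' "$log")
echo "$kept"
```

Only the "recent event" line survives; adjust the field test to whatever format your log actually uses.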

Related

how to find a last updated file with the prefix name in bash?

How can I find a last updated file with the specific prefix in bash?
For example, I have three files, and I just want the file whose name contains "ABC" with the latest Last_UpdatedDateTime.
fileName Last_UpdatedDateTime
abc123 7/8/2020 10:34am
abc456 7/6/2020 10:34am
def123 7/8/2020 10:34am
You can list files sorted in the order they were modified with ls -t:
-t sort by modification time, newest first
You can use globbing (abc*) to match all files starting with abc.
Since you will get more than one match and only want the newest (that is first):
head -1
Combined:
ls -t abc* | head -1
If there are a lot of these files scattered across a variety of directories, find might be better.
find -name 'abc*' -printf "%T@ %f\n" | sort -nr | sed 's/^.* //; q'
Breaking that out -
find -name 'abc*' -printf "%T@ %f\n" |
find has a ton of options. This is the simplest case, assuming the current directory as the root of the search. You can add a lot of refinements, or just give / to search the whole system.
-name 'abc*' picks just the filenames you want. Quote it to protect any globs, but you can use normal globbing rules. -iname makes the search case-insensitive.
-printf defines the output. %f prints the filename, but you want it ordered on the date, so print that first for sorting so the filename itself doesn't change the order. %T accepts another character to define the date format - @ is the unix epoch, seconds since 00:00:00 UTC on 1970-01-01, so it is easy to sort numerically. On my git bash emulation it returns fractions as well, so the granularity is great.
$: find -name abc\* -printf "%T@ %f\n"
1594219755.7741618000 abc123
1594219775.5162510000 abc321
1594219734.0162554000 abc456
find may not return them in the order you want, though, so -
sort -nr |
-n makes it a numeric sort. -r sorts in reverse order, so that the latest file will pop out first and you can ignore everything after that.
sed 's/^.* //; q;'
Since the first record is the one we want, sed can use s/^.* //; to strip everything up to the last space, leaving only the filename (we controlled the output, so everything before that space is the timestamp; this does assume the filenames themselves contain no spaces). q explicitly quits after the s/// scrubs the record, so sed prints the filename and stops without reading the rest, which avoids needing another process (head -1) in the pipeline.
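One caveat: that final s/^.* // strips up to the last space, so a filename containing spaces would be truncated. A sketch that survives spaces by tab-separating the fields and using cut (GNU find and coreutils assumed; the demo files are hypothetical):

```shell
# Throwaway directory with two 'abc*' files at different mtimes.
demo=$(mktemp -d)
cd "$demo"
touch -d '2020-01-01' 'abc old file'     # name contains spaces
touch -d '2021-06-01' 'abc newer'

# Tab-separate timestamp and name, then keep everything after the first tab.
newest=$(find . -name 'abc*' -printf '%T@\t%f\n' | sort -nr | head -1 | cut -f2-)
echo "$newest"
```

cut's default delimiter is the tab, so the filename comes through intact even with embedded spaces.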

Quickly list random set of files in directory in Linux

Question:
I am looking for a performant, concise way to list N randomly selected files in a Linux directory using only Bash. The files must be randomly selected from different subdirectories.
Why I'm asking:
In Linux, I often want to test a random selection of files in a directory for some property. The directories contain 1000's of files, so I only want to test a small number of them, but I want to take them from different subdirectories in the directory of interest.
The following returns the paths of 50 "randomly"-selected files:
find /dir/of/interest/ -type f | sort -R | head -n 50
The directory contains many files, and resides on a mounted file system with slow read times (accessed through ssh), so the command can take many minutes. I believe the issue is that find first enumerates every file (slow), and only then is a random selection printed.
If you are using locate and updatedb updates regularly (daily is probably the default), you could:
$ locate /home/james/test | sort -R | head -5
/home/james/test/10kfiles/out_708.txt
/home/james/test/10kfiles/out_9637.txt
/home/james/test/compr/bar
/home/james/test/10kfiles/out_3788.txt
/home/james/test/test
How often do you need it? Do the work periodically in advance to have it quickly available when you need it.
Create a refreshList script.
#!/usr/bin/env bash
find /dir/of/interest/ -type f | sort -R | head -n 50 >/tmp/rand.list
mv -f /tmp/rand.list ~
Put it in your crontab.
0 7-20 * * 1-5 nice -n 19 ~/refreshList
Then during working hours you will always have a ~/rand.list that's under an hour old.
If you don't want to use cron and aren't too picky about how old it is, just write a function that refreshes the file after you use it every time.
randFiles() {
    cat ~/rand.list
    { find /dir/of/interest/ -type f |
        sort -R | head -n 50 > /tmp/rand.list
      mv -f /tmp/rand.list ~
    } &
}
If you can't run locate and the find command is too slow, is there any reason this has to be done in real time?
Would it be possible to use cron to dump the output of the find command into a file and then do the random pick out of there?
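If sort -R itself is part of the cost, coreutils shuf -n does a reservoir-style pick without sorting the entire list, though find still has to walk the tree. A sketch (the generated files just stand in for the real directory):

```shell
# Stand-in tree: 200 empty files in a temp directory.
demo=$(mktemp -d)
for i in $(seq 1 200); do : > "$demo/f$i"; done

# Pick 50 random paths without a full sort of all candidates.
picks=$(find "$demo" -type f | shuf -n 50)
printf '%s\n' "$picks" | wc -l   # prints 50
```

shuf -n keeps only the sample in memory, so it scales better than sort -R on directories with many thousands of files.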

Using grep to identify a pattern

I have several documents hosted on a cloud instance. I want to extract all words conforming to a specific pattern into a .txt file. This is the pattern:
ABC123A
ABC123B
ABC765A
and so on. Essentially the words start with a specific character string 'ABC', have a fixed number of numerals, and end with a letter. This is my code:
grep -oh ABC[0-9].*[a-zA-Z]$ > /home/user/abcLetterMatches.txt
When I execute the query, it runs for several hours without generating any output. I have over 1100 documents. However, when I run this query:
grep -r ABC[0-9].*[a-zA-Z]$ > /home/user/abcLetterMatches.txt
the list of files containing the strings is generated in a matter of seconds.
What do I need to correct in my query? Also, what is causing the delay?
UPDATE 1
Based on the answers, it's evident that the command is missing the file name on which it needs to be executed. I want to run the code on multiple document files (>1000)
The documents I want searched are in multiple sub-directories within a directory. What is a good way to search through them? Doing
grep -roh ABC[0-9].*[a-zA-Z]$ > /home/user/abcLetterMatches.txt
only returns the file names.
UPDATE 2
If I use the updated code from the answer below:
find . -exec grep -oh "ABC[0-9].*[a-zA-Z]$" >> ~/abcLetterMatches.txt {} \;
I get a "No such file or directory" error.
UPDATE 3
The pattern can be anywhere in the line.
You can use this regexp:
grep -E "^ABC[0-9]{3}[A-Z]$" docs > filename
ABC123A
ABC123B
ABC765A
There is no delay; grep is just waiting for the input you didn't give it (by default it reads standard input). You can correct your command by supplying a filename argument:
grep -oh "ABC[0-9].*[a-zA-Z]$" file.txt > /home/user/abcLetterMatches.txt
Source (man grep):
SYNOPSIS
grep [OPTIONS] PATTERN [FILE...]
To perform the same grepping on several files recursively, combine it with find command:
find . -exec grep -oh "ABC[0-9].*[a-zA-Z]$" >> ~/abcLetterMatches.txt {} \;
This does what you ask for:
grep -hr '^ABC[0-9]\{3\}[A-Za-z]$'
-h to not get the filenames.
-r to search recursively. If no directory is given (as above) the current one is used. Otherwise just specify one as the last argument.
Quotes around the pattern to avoid accidental globbing, etc.
^ at the beginning of the pattern to — together with $ at the end — only match whole lines. (Not sure if this was a requirement, but the sample data suggests it.)
\{3\} to specify that there should be three digits.
No .* as that would match a whole lot of other things.
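Since UPDATE 3 says the pattern can sit anywhere in a line, one more sketch: drop the anchors and combine -r with -o to extract just the matching tokens across a directory tree (GNU grep; the sample docs below are made up):

```shell
# Hypothetical document tree with matches mid-line and in a subdirectory.
docs=$(mktemp -d)
mkdir "$docs/sub"
printf 'see ABC123A and ABC765B mid-line\n' > "$docs/doc1.txt"
printf 'trailing ABC000C\n' > "$docs/sub/doc2.txt"

# -r recurse, -h drop filenames, -o print only the matched text.
found=$(grep -rhoE 'ABC[0-9]{3}[A-Za-z]' "$docs" | sort)
printf '%s\n' "$found"
```

This prints one matched token per line (ABC000C, ABC123A, ABC765B), which can be redirected straight into the .txt output file.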

Bash - Get files for last 12 hours / sophisticated name format

I have a set of logs which have the names as follows:
SystemOut_15.07.20_23.00.00.log SystemOut_15.07.21_10.27.17.log
SystemOut_15.07.21_16.48.29.log SystemOut_15.07.22_15.57.46.log
SystemOut_15.07.22_13.03.46.log
From that list I need to get only files for last 12 hours.
So as an output I will receive:
SystemOut_15.07.22_15.57.46.log SystemOut_15.07.22_13.03.46.log
I had similar issue with files having below names but was able to resolve that quickly as the date comes in an easy format:
servicemix.log.2015-07-21-11 servicemix.log.2015-07-22-12
servicemix.log.2015-07-22-13
So I created a variable called 'day':
day=$(date -d '-12 hour' +%F-%H)
And used below command to get the files for last 12 hours:
ls | awk -F. -v x=$day '$3 >= x'
Can you help to have that done with SystemOut files as they have such name syntax containing underscore which confuses me.
Assuming the date-time in log file's name is in the format
YY.MM.DD_HH24.MI.SS,
day=$(date -d '-12 hour' +%Y.%m.%d_%H.%M.%S.log)
Prepend the century to the 2 digit year in the log file name and then compare
ls | awk -F_ -v x=$day '"20"$2"_"$3 >= x'
Alternatively, as Ed Morton suggested, find can be used like so:
find . -type f -name '*.log' -cmin -720
This returns the log files changed within the last 720 minutes. To be precise, -cmin checks when the file status (ctime) was last changed; the -mmin option can be used to search by modification time instead.
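A runnable sketch of the mtime-based route, plus a variant using -newer against a reference file touched 12 hours ago (file names and dates below are invented; touch -d is GNU, so on other systems compute a stamp for touch -t instead):

```shell
# Hypothetical log directory with one fresh and one old file.
logs=$(mktemp -d)
touch -d '1 hour ago' "$logs/SystemOut_15.07.22_15.57.46.log"
touch -d '2 days ago' "$logs/SystemOut_15.07.20_23.00.00.log"

# Files modified within the last 720 minutes.
recent=$(find "$logs" -name 'SystemOut_*.log' -mmin -720)
echo "$recent"

# Same result via a 12-hour-old reference file and -newer.
touch -d '12 hours ago' "$logs/.ref"
recent2=$(find "$logs" -name 'SystemOut_*.log' -newer "$logs/.ref")
```

Both variants select only the fresh file, without parsing the encoded name at all.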

Bash - find exec return value

I need a way to tell if grep does find something, and ideally pass that return value to an if statement.
Let's say I have a tmp folder (current folder), in that there are several files and sub-folders. I want to search all files named abc for a pattern xyz. The search is assumed to be successful if it finds any occurrence of xyz (it does not matter how many times xyz is found). The search fails if no occurrence of xyz is found.
In bash, it can be done like this:
find . -name "abc" -exec grep "xyz" {} \;
That would show whether xyz is found at all. But I'm not sure how to pass the result (successful or not) back to an if statement.
Any help would be appreciated.
You can try
x=$(find . -name abc | xargs grep xyz)
echo $x
That is, x holds grep's output (not its exit status); it is empty when there is no match.
If you want to know that find finds some files abc and that at least one of them contains the string xyz, then you can probably use:
if find . -name 'abc' -type f -exec grep -q xyz {} +
then : All invocations of grep found at least one xyz and nothing else failed
else : One or more invocations of grep failed to find xyz or something else failed
fi
This relies on find returning an exit status for its own operations, and returning a non-zero exit status if any of the command(s) it executes fails. The + at the end groups as many file names as find thinks reasonable into a single command line. You need quite a lot of file names (a large number of fairly long names) to make find run grep multiple times. On a Mac running Mac OS X 10.10.4, I got to about 3,000 files, each with about 32 characters in the name, for an argument list of just over 100 KiB, without grep being run multiple times. OTOH, when I had just under 8000 files, I got two runs of grep, with around 130 KiB of argument list for each.
Someone briefly left a comment asking whether the exit status of find is guaranteed. The answer is 'yes' — though I had to modify my description of the exit status a bit. Under the description of -exec, POSIX specifies:
If any invocation [of the 'utility', such as grep in this question] returns a non-zero value as exit status, the find utility shall return a non-zero exit status.
And under the general 'Exit status' it says:
The following exit values shall be returned:
0 — All path operands were traversed successfully.
>0 — An error occurred.
Thus, find will report success as long as none of its own operations fails and as long as none of the grep commands it invokes fails. Zero files found but no failures will be reported as success. (Failures might be lack of permission to search a directory, for example.)
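A runnable sketch of that behavior (throwaway tree; with {} + both files land in a single grep invocation, so one matching line anywhere is enough for success):

```shell
# Throwaway tree: one file contains the pattern, one does not.
tree=$(mktemp -d)
mkdir "$tree/sub"
echo 'contains xyz here' > "$tree/abc"
echo 'no match'          > "$tree/sub/abc"

# Succeeds: the single grep invocation finds xyz in at least one file.
if find "$tree" -name abc -type f -exec grep -q xyz {} +; then
    result=found
else
    result=missing
fi

# Fails: no file contains the pattern, so grep (and thus find) is non-zero.
if find "$tree" -name abc -type f -exec grep -q nowhere {} +; then
    result2=found
else
    result2=missing
fi
echo "$result $result2"   # prints: found missing
```

Note the caveat from the text still applies: with enough files, find may split them across several grep runs, and any run whose batch has no match makes the whole thing fail.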
Since find prints whatever the grep it -exec's prints, you can also test whether the whole command produced any output by placing it in an if statement:
if [[ -n $(find . -name "abc" -exec grep "xyz" {} \;) ]]
then
# do something
fi
The grep command can be given a -q option that will "quiet" its output. Grep will return a success or a failure on the basis of whether it found anything in the file you pointed it at.
If your shell is capable of it, rather than trying to parse the output of a find command, you might want to try using a for loop instead. For example in bash:
shopt -s globstar
cd /some/directory
for file in **/abc; do
grep -q "xyz" "$file" && echo "Found something in $file"
done
Is this what you're looking for?
You should be able to do this with just grep with --include I believe.
Like this:
if grep -r --include=abc -q xyz; then
: Found at least one match
else
: No matching lines found
fi

Resources