Bash - Find directory containing specific logs file - linux

I've created a script that quickly analyzes some logs and automatically suggests fixes based on the errors it finds.
Everything works as expected.
However, it appears that the folder structure containing these logs can change (it depends on the system configuration), and then my script no longer works.
I would like a way to find the directory containing specific files, such as the logs or the appinfo.txt file.
Once I have it, I could store it in a variable and solve my problem.
Here is an example:
AppLogDirectory='Your_Special_Command_You_Will_HelpMe_To_Find'
grep -i "Error" "$AppLogDirectory"/esl*.log
Log format is: ESL.randomValue.log
Files analyzed: appinfo.txt, system.txt, etc.
As suggested in the comments, I have edited my original post with more detail to clarify the context. Here is an example:
Log files (esl.xxx.tt.ss.log) can be in an arbitrary directory, such as:
/var/log/ApplicationName/logs/
/opt/ApplicationName/logs/
/var/data/ApplicationName/Extended/logs/
Because the directory varies, I need a way to print the names of the directories containing files that match the esl*.log pattern (without the esl file name itself).

Use find and pipe its output to xargs with grep. This runs grep on multiple files and prints each matching line together with the name of the file where the pattern was found:
find /path/to/files /another/path/to/other/files \( -name 'appinfo.txt' -o -name 'system.txt' -o -name 'esl*.log' \) -print0 | xargs -0 grep -i 'Error'
Or simply use -exec ... {} +, which has the same effect without the need for xargs:
find /path/to/files /another/path/to/other/files \( -name 'appinfo.txt' -o -name 'system.txt' -o -name 'esl*.log' \) -exec grep -i 'Error' {} +
To find the directories which contain the files that contain the desired pattern, use grep -l to print file names only (not the lines that match), and pipe the results to xargs dirname to print the directory names. If you need the unique dir names, pipe it further to sort -u:
find /path/to/files /another/path/to/other/files \( -name 'appinfo.txt' -o -name 'system.txt' -o -name 'esl*.log' \) -exec grep -il 'Error' {} + | xargs dirname | sort -u
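To get the directory into a variable, as the question asks, the same idea can be wrapped in command substitution. This is a minimal sketch rather than a definitive solution: the temporary tree and the file name esl.123.log are made up for illustration, and it assumes all esl*.log files live in one directory (head -n 1 keeps only the first one if there are several).

```shell
# Hypothetical sandbox tree standing in for the real installation.
root=$(mktemp -d)
mkdir -p "$root/opt/ApplicationName/logs"
printf 'ok\nError: disk full\n' > "$root/opt/ApplicationName/logs/esl.123.log"

# Print the parent directory of every esl*.log file, de-duplicate,
# and keep the first result.
AppLogDirectory=$(find "$root" -type f -name 'esl*.log' -exec dirname {} \; | sort -u | head -n 1)

# The variable can now be used exactly as in the question.
grep -i 'Error' "$AppLogDirectory"/esl*.log

# Remove the sandbox.
rm -rf "$root"
```

Running dirname once per file with -exec ... \; is slower than batching, but it works with any POSIX dirname and with file names that contain spaces.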
SEE ALSO:
GNU find manual
To search for files based on their contents
xargs

Solution found thanks to you. Thank you again!
# Ask for the directory where the tar.gz file was extracted
read -r -p "Where did you extract the tar.gz file? " r1
# Directory where the esl files are located
logpath=$(find "$r1" -name "esl*.log" | xargs dirname | sort -u)
# Search for the value (here "Error") in all esl*.log files
grep 'Error' "$logpath"/esl*.log | awk '{print $8}'
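One caveat with the script above: if esl*.log files turn up in more than one directory, logpath holds several lines and the final glob no longer expands as intended. A possible way around this, sketched here on a made-up sandbox tree (the directory and file names are hypothetical), is to loop over the directories one at a time.

```shell
# Hypothetical sandbox with logs in two different directories.
root=$(mktemp -d)
mkdir -p "$root/a/logs" "$root/b/logs"
printf 'Error one\n' > "$root/a/logs/esl.1.log"
printf 'fine\n' > "$root/b/logs/esl.2.log"

# Visit each log directory separately and list the files that contain "Error".
matches=$(
  find "$root" -type f -name 'esl*.log' -exec dirname {} \; | sort -u |
  while IFS= read -r dir; do
    # grep exits non-zero when nothing matches; that is not a failure here.
    grep -l 'Error' "$dir"/esl*.log || true
  done
)
printf '%s\n' "$matches"

# Remove the sandbox.
rm -rf "$root"
```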

Related

Grep files in subdirectories and write out files for each directory

I am working on a bioinformatics workflow in which the tool in question, 'salmon' creates multiple directories having a 'quant.sf' file. I want to find all 'lnc' entries within these files and save them as 'lnc.sf' for all directories.
I was previously running
cat quant.sf | grep 'lnc' > lnc.sf
in each directory individually, which solved my problem. Now I want to write a script that goes into each directory and generates a lnc.sf file there.
I have tried doing
find . -name "quant.sf" | while read A
do
cat $A | grep 'lnc' > lnc.sf
done
But this just creates a concatenated lnc.sf file in the current directory. Any help is highly appreciated.
Thank You!
If all your quant.sf files are at the same hierarchy level and a single combined lnc.sf is acceptable, the following should work, assuming a folder structure like month/day/quant.sf:
grep -h 'lnc' */*/quant.sf > lnc.sf
Otherwise, find the files and write each result to the correct directory. Prefer -exec or xargs over find + read where you can, quote variables so that paths with whitespace survive expansion, and drop the redundant cat process:
find . -name 'quant.sf' | while IFS= read -r A
do
grep 'lnc' "$A" > "${A%/*}/lnc.sf"
done
If you have GNU find + xargs, use -print0 combined with -0:
find . -name 'quant.sf' -print0 | xargs -0 -n1 sh -c 'grep "lnc" "$1" > "${1%/*}/lnc.sf"' -
Or use -exec of find, which avoids problems with unusual file names:
find . -name 'quant.sf' -exec sh -c 'grep "lnc" "$1" > "${1%/*}/lnc.sf"' - ';'
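All three variants rely on the ${A%/*} parameter expansion, which removes the shortest suffix matching /* and therefore leaves the directory part of a path. A small sketch of what it does, using a made-up path:

```shell
# Hypothetical path of the kind find would print.
path='./sample1/run2/quant.sf'

# Strip the shortest trailing "/..." component to get the directory.
dir=${path%/*}

# Build the output file name next to the input file.
out="$dir/lnc.sf"

printf '%s\n' "$dir" "$out"
# prints:
# ./sample1/run2
# ./sample1/run2/lnc.sf
```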

Find specific folders then search specific files inside them for a word

I am trying to combine find and grep in a way to find folder names that start with k0 and search a specific file "test.log" for a word ERROR.
Something like:
find . -type d -name "k0*" -print | xargs grep ERROR test.log
Unfortunately this command doesn't work as intended.
Try this. I am assuming you have multiple files named test.log in the folders whose names start with k0:
for file in $(find ./k0* -name 'test.log'); do
grep -w 'ERROR' "$file"
done
You can make this into a one-liner like this:
for file in $(find ./k0* -name 'test.log'); do grep -w 'ERROR' "$file"; done
You can run it in a terminal by pasting it as is.
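The for loop over $(find ...) splits on whitespace, so it breaks on directory names that contain spaces. A possible alternative, sketched on a made-up sandbox tree, lets find match the k0 directories with -path and run grep itself. Note that * in -path also matches slashes, so a test.log nested deeper below a k0 directory is included as well.

```shell
# Hypothetical sandbox: one k0 directory (with a space in its name) and one other.
root=$(mktemp -d)
mkdir -p "$root/k01 run" "$root/other"
printf 'ERROR here\n' > "$root/k01 run/test.log"
printf 'ERROR elsewhere\n' > "$root/other/test.log"

# Match test.log only below directories whose names start with k0;
# -H makes grep print the file name even when there is a single file.
hits=$(cd "$root" && find . -type f -path './k0*/test.log' -exec grep -Hw 'ERROR' {} +)
printf '%s\n' "$hits"
# prints: ./k01 run/test.log:ERROR here

# Remove the sandbox.
rm -rf "$root"
```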

Piping find results into grep for fast directory exclusion

I am successfully using find to create a list of all files in the current subdirectory, excluding those in the subdirectory "cache." Here's my first bit of code:
find . -wholename './cach*' -prune -o -print
I now wish to pipe this into a grep command. It seems like that should be simple:
find . -wholename './cach*' -prune -o -print | xargs grep -r -R -i "samson"
... but this is returning results that are mostly from the cache directory. I've tried removing the xargs reference, but that does what you'd expect, running the grep on text of the file names, rather than on the files themselves. My goal is to find "samson" in any files that aren't cached content.
I'll probably get around this issue by just using doubled greps in this instance, but I'm very curious about why this one-liner behaves this way. I'd love to hear thoughts on a way to modify it while still using these two commands (as there are speed advantages to doing it this way).
(This is in CentOS 5, btw.)
The wholename match may be the reason why it's still including "cache" files. If you're executing the find command in the directory that contains the "cache" folder, it should work. If not, try changing it to -name '*cache*' instead.
Also, you do not need the -r or -R for your grep, that tells it to recurse through directories - but you're testing individual files.
You can update your command using the piped version, or a single-command:
find . -name '*cache*' -prune -o -print0 | xargs -0 grep -il "samson"
or
find . -name '*cache*' -prune -o -exec grep -iq "samson" {} \; -print
Note, the -l in the first command tells grep to "list the file" and not the line(s) that match. The -q in the second does the same; it tells grep to respond quietly so find will then just print the filename.
You've told grep itself to recurse (twice! -r and -R are synonyms). Since one of the arguments you're passing is . (the top directory), grep is searching in every file (some of them twice, or even more if they're in subdirectories).
If you're going to use find and grep, do this:
find . -path './cach*' -prune -o -print0 | xargs -0 grep -i "samson"
Using -print0 and -0 makes your script work even with file names that contain spaces or punctuation characters.
However, you probably don't need to bother with find here, since GNU grep is capable of excluding directories:
grep -R --exclude-dir='cach*' -i "samson" .
(This also excludes ./deeply/nested/directory/cache. If you only want to exclude cache directories at the toplevel, use find as you did.)
Use the -exec option of find instead of piping the results to another command. From there you can use grep "samson" {} \; to look for samson in each file listed, or end the command with + to pass many files to each grep call.
For example:
find . -wholename './cach*' -prune -o -exec grep "samson" "{}" +
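The answers above can be combined into one command that also adds -type f, so that directories such as . are never handed to grep in the first place. This is a sketch on a made-up sandbox tree, not a drop-in replacement for the original command; the file names are hypothetical.

```shell
# Hypothetical sandbox: one cached file and one live file, both mentioning samson.
root=$(mktemp -d)
mkdir -p "$root/cache" "$root/src"
printf 'samson cached\n' > "$root/cache/a.txt"
printf 'Samson live\n' > "$root/src/b.txt"

# Prune ./cach*, then grep regular files only; -l lists matching file names.
hits=$(cd "$root" && find . -path './cach*' -prune -o -type f -exec grep -il 'samson' {} +)
printf '%s\n' "$hits"
# prints: ./src/b.txt

# Remove the sandbox.
rm -rf "$root"
```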

search through files and put into array

cat `find . -name '*.css'`
This will print the contents of every css file. I now want to do two things.
1) How do I add *.js to this as well. So I want to look inside all css and javascript files.
2) I want to look for any css or image files within those (css or js files) and push those into an array. So I guess look for a .png, .jpg, .gif, .tif, .css and put everything before that until the quote or single quote into an array. I want an array because this command will go into a shell script and after I get all the names of the files that I need I will need to loop through and download those files later.
Any help would be appreciated.
Extra hackery, in case someone needs it:
find ./ -name "*.css" | xargs grep -o -h -E '[A-Za-z0-9:./_-]+\.(png|jpg|gif|tif|css)'| sed -e 's/\.\./{{url here}}/g'|xargs wget
will download every missing resource
Do the command:
find ./ -name "*.css" -or -name "*.js" > fileNames.txt
Then read each line of fileNames.txt in the loop and download them.
Or if you are going to use wget to download the images you could do:
find ./ -name "*.css" -or -name "*.js" | xargs grep '\.png' | xargs wget
It may need a little refinement, such as a cut after the grep to isolate the file name, but you get the idea.
1) simple answer: you can add the names of all .js files to your cat command, by instructing find to find more files:
cat `find . -name '*.css' -or -name '*.js'`
2) a text-searching tool such as grep is probably what you're after:
find . -name '*.css' -or -name '*.js' | xargs grep -o -h -E '[A-Za-z0-9:./_-]+\.(png|jpg|gif|tif|css)'
Note: my grep pattern isn't universal or perfect, but it's a starting example. It matches any string that includes alpha-numeric,colon,dot,slash,underscore or hyphens in it, followed by any one of the given extensions.
The -o option causes grep to output only the parts of the .css/.js files that match the pattern (i.e. only the apparent filenames).
If you want to download them you could add | xargs wget -v to the command, which would instruct wget to fetch all those filenames.
NOTE: this won't work for relative filenames; some other magic will be required (i.e. you'll have to resolve them with respect to the grepped file's location). Perhaps some extra hackery, such as sed or awk.
Also: How often do you see references to TIFFs in your CSS/JS?
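Since the question asks for an array, here is one way the grep output could be collected into one. This is a sketch under a few assumptions: it needs bash 4 or later for mapfile, the two sample files are made up, and the pattern (anything up to a quote, parenthesis, or space, followed by one of the extensions) is only a starting point, like the pattern above.

```shell
#!/usr/bin/env bash
# Hypothetical sandbox with one stylesheet and one script.
root=$(mktemp -d)
printf 'body { background: url("img/bg.png"); }\n' > "$root/site.css"
printf 'var logo = "img/logo.gif";\n' > "$root/app.js"

# Read one referenced file name per line into the array "assets".
mapfile -t assets < <(
  find "$root" -type f \( -name '*.css' -o -name '*.js' \) \
    -exec grep -ohE "[^\"'() ]+\.(png|jpg|gif|tif|css)" {} + | sort -u
)

# Loop over the array; a download command would go here instead of printf.
for asset in "${assets[@]}"; do
  printf '%s\n' "$asset"
done

# Remove the sandbox.
rm -rf "$root"
```

Relative names such as img/bg.png would still have to be resolved against a base URL before downloading, as noted above.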

How to list specific type of files in recursive directories in shell?

How can we find files of a specific type, e.g. doc or pdf files, in nested directories?
Command I tried:
$ ls -R | grep .doc
but if there is a file name like alok.doc.txt the command will display that too which is obviously not what I want. What command should I use instead?
If you are more comfortable with ls and grep, you can do what you want using a regular expression in the grep command. The trailing '$' anchors .doc to the end of the line, which excludes "file.doc.txt":
ls -R |grep "\.doc$"
More information about using grep with regular expressions is in the man page.
The output of ls is mainly intended for reading by humans. For advanced querying and automated processing, you should use the more powerful find command:
find /path -type f \( -iname "*.doc" -o -iname "*.pdf" \)
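A quick way to check that this find command skips names like alok.doc.txt is to try it on a throwaway tree. The sketch below uses made-up file names and assumes a find that supports -iname (GNU and BSD find both do).

```shell
# Hypothetical sandbox with two wanted files and one decoy.
root=$(mktemp -d)
mkdir -p "$root/sub"
: > "$root/report.doc"
: > "$root/sub/paper.PDF"
: > "$root/alok.doc.txt"

# -iname matches case-insensitively, and the pattern must match the whole name,
# so alok.doc.txt is not selected.
found=$(cd "$root" && find . -type f \( -iname '*.doc' -o -iname '*.pdf' \) | sort)
printf '%s\n' "$found"
# prints:
# ./report.doc
# ./sub/paper.PDF

# Remove the sandbox.
rm -rf "$root"
```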
Or, if you have bash 4.0 or later:
#!/bin/bash
shopt -s globstar
shopt -s nullglob
for file in **/*.{pdf,doc}
do
echo "$file"
done
find . | grep "\.doc$"
This will show the path as well.
Some of the other methods that can be used:
echo *.{pdf,docx,jpeg}
stat -c %n * | grep 'pdf\|docx\|jpeg'
We had a similar question. We wanted a list - with paths - of all the config files in the etc directory. This worked:
find /etc -type f \( -iname "*.conf" \)
It gives a nice list of all the .conf file with their path. Output looks like:
/etc/conf/server.conf
But, we wanted to DO something with ALL those files, like grep those files to find a word, or setting, in all the files. So we use
find /etc -type f \( -iname "*.conf" \) -print0 | xargs -0 grep -Hi "ServerName"
to find via grep ALL the config files in /etc that contain a setting like "ServerName" Output looks like:
/etc/conf/server.conf: ServerName "default-118_11_170_172"
Hope you find it useful.
Sid
Similarly, if you prefer using the wildcard character * (not quite like the regex suggestions), you can use ls with the -l flag, which prints a long listing with one file per line, and the -R flag like you had. Then you can specify the files you want with *.doc. Note that the shell expands *.doc in the current directory only, so this does not pick up .doc files in subdirectories.
For example, either
ls -l -R *.doc
or, if you want the files listed on fewer lines:
ls -R *.doc
If you have files with extensions that don't match the file type, you could use the file utility.
find "$PWD" -type f -exec file -N {} \; | grep "PDF document" | awk -F: '{print $1}'
Instead of $PWD you can use the directory you want to start the search in. file even prints the PDF version.
