How to get my expected search result using grep on Linux

I am trying to find out which files contain the text 'roads' on the server, so I use this command: grep -rl 'roads'. But it shows lots of files like these:
./res/js/.svn/entries
./res/js/.svn/all-wcprops
./res/styles/.svn/entries
./res/styles/.svn/all-wcprops
./res/images/.svn/entries
I do not want the .svn folders to show up in the search results because they are only for version control and mean nothing to me. Is there a way to do this: if a result path contains .svn, it is excluded from the final output? E.g., if these three files contain the text 'roads':
check.php
./res/js/.svn/entries
./res/js/.svn/all-wcprops
Then the result only shows:
check.php

One simple way is to grep away the false positives after the fact:
grep -rl roads . | grep -v '/\.svn/'
If you want to be more efficient and not spend time searching through the SVN files, you can filter them out before grepping through them:
find -type f | grep -v '/\.svn/' | xargs grep -l roads
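If any of the filenames contain spaces, the find | xargs pipeline above can split them incorrectly. A minimal null-delimited sketch, assuming GNU find and GNU xargs:
find . -type f -not -path '*/.svn/*' -print0 | xargs -0 grep -l roads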

grep has an option to exclude a particular directory, --exclude-dir, so you can simply pass .svn to it, like below:
grep -rl --exclude-dir=.svn -e roads ./

Related

Linux: listing files that contain several words

I am trying to find a way to list all the files in a directory tree (recursively) that contain several words.
While searching I found examples such as egrep -R -l 'toto|tata' ., but | means OR. I would like AND...
Thank you for your help
Using GNU grep with GNU xargs:
grep -ERl 'toto' . | xargs -r grep -l 'tata'
The first grep lists the files containing the pattern toto; that list is fed to xargs, and the second grep then keeps only the files that also contain tata. The -r flag ensures the second grep doesn't run on empty input.
The -r flag in xargs, from the man page:
-r, --no-run-if-empty
    If the standard input does not contain any nonblanks, do not run the command. Normally, the command is run once even if there is no input. This option is a GNU extension.
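If the filenames may contain spaces or other odd characters, the same idea works with null-delimited output; a sketch assuming GNU grep (-Z prints a NUL after each filename instead of a newline) and GNU xargs:
grep -rlZ 'toto' . | xargs -0 -r grep -l 'tata'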
The agrep tool is designed to provide AND semantics for grep. Usage:
agrep 'pattern1;pattern2' file
In your case you could run:
find . -type f -exec agrep 'toto;tata' {} \;   # add -l to display the file names
PS1: For the current directory you can just run agrep 'pattern1;pattern2' *.*
PS2: Unfortunately, agrep does not support the -R option.
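The xargs pipeline above also extends naturally to more than two words by chaining another grep -l per word; a sketch with a hypothetical third word 'titi':
grep -rl 'toto' . | xargs -r grep -l 'tata' | xargs -r grep -l 'titi'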

How to know which file holds the grep result?

There is a directory which contains 100 text files. I used grep to search for a given text in the directory as follows:
cat *.txt | grep Ya_Mahdi
and grep shows Ya_Mahdi.
I need to know which file holds the text. Is it possible?
Just get rid of cat and provide the list of files to grep:
grep Ya_Mahdi *.txt
While this would generally work, depending on the number of .txt files in that folder, the argument list for grep might get too large.
You can use find for a bulletproof solution:
find . -maxdepth 1 -name '*.txt' -exec grep -H Ya_Mahdi {} +
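If you also want to see the matching lines together with the file names, a small variant (assuming GNU grep; -H forces the file name prefix, -n adds line numbers):
grep -Hn Ya_Mahdi *.txt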

'ls | grep -c' and full path

Can I use ls | grep -c /full/path/to/file to count the occurrences of a file, but while executing the command from a different directory than the one the files I'm looking for are in?
Let's say I want to see how many .txt files I have in my "results" directory. Can I do something like ls | grep -c /full/path/to/results/*.txt while I'm in another directory?
Although I have .txt files in that directory, I always get zero when I run the command from another directory :( What's happening? Can I only use ls on the current directory?
You have to use ls <dirname>. Plain ls only lists the current directory.
What you are trying to do can be accomplished by find <dir> -name "*.txt" | grep -c txt or find <dir> -name "*.txt" | wc -l
But you can do ls * | grep '\.txt$' as well. Please read the manual to learn the differences.
grep accepts regular expressions, not globs. /foo/bar/*.txt is a glob. Try /foo/bar/.*\.txt instead.
Also, ls lists files and directories under your current directory. It will not list the full path. Do some tests and you will see this easily.
ls may output results in a single line, and this could make grep -c give an incorrect result, because grep does line-based matching.
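To count the .txt files in the results directory from any working directory, one non-recursive sketch (assuming GNU find for -maxdepth):
find /full/path/to/results -maxdepth 1 -name '*.txt' | wc -l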

Grep into given filenames

I have a directory that contains many subdirectories that contain many files.
I list the contents of the current directory using ls *. I see that certain files are relevant, judging by their names. Therefore, the relevant files can be obtained like this: ls * | grep "abc\|def\|ghi".
Now I want to search within the files with those names. So I try something like:
ls * | grep "abc\|def\|ghi" | zgrep -i "ERROR" *; however, this searches the names, not the file contents. Is there an easy way to do this with pipes?
To use grep to search the contents of files within a directory, try the find command, using xargs to couple it with grep, like so:
find . -type f | xargs grep '...'
You can do it like this:
find -E . -type f -regex ".*/.*(abc|def).*" -exec grep -H ERROR {} \+
The -E allows the use of extended regexes, so you can use the pipe (|) to express alternations. The + at the end lets each invocation of grep search as many files as possible, rather than starting a whole new process for every single file.
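Note that -E is the BSD find spelling; with GNU find (the usual find on Linux), an equivalent sketch would be:
find . -regextype posix-extended -type f -regex '.*/.*(abc|def).*' -exec grep -H ERROR {} +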
You should use xargs to grep each file's contents:
ls * | grep "abc\|def\|ghi" | xargs zgrep -i "ERROR"
I know you asked for a solution with pipes, but they are not necessary for this task. grep has many options and can solve this problem alone:
grep . -rh --include "*abc*" --include "*def*" -e "ERROR"
Parameters:
--include : Search only files whose base name matches the given wildcard pattern (not a regex!)
-h : Suppress the prefixing of file names on output.
-r : recursive
-e : regex filter pattern
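Applied to the original question (three name fragments, case-insensitive ERROR, and listing the matching files), a hedged sketch with GNU grep, assuming plain-text files (use zgrep instead if they are gzipped):
grep -ril --include='*abc*' --include='*def*' --include='*ghi*' -e ERROR .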
grep -i "ERROR" `ls * | grep "abc\|def\|ghi"`

List of files to check if present

I have a large list of files, and I need to check whether they exist somewhere on my Linux server. Some of them may be there and some may not.
Is there a command line tool to do this?
Or must I resort to looping over find in a shell script?
There is another alternative that relies on find. The idea is to run find once, save all the filenames, and then compare them against the list of files.
First, the list of files must be sorted; let us call it /tmp/sortedFiles.txt.
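A minimal way to produce it, assuming the unsorted list lives in /tmp/filesToFind.txt (the name used later in this answer):
sort -u /tmp/filesToFind.txt > /tmp/sortedFiles.txt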
Then run:
find / -type f | xargs -n1 -I# basename '#' | sort -u > /tmp/foundFiles.txt
Now compare them, and print only those present in the first file but not in the second:
comm -23 /tmp/sortedFiles.txt /tmp/foundFiles.txt
This will tell you which ones are not on the computer.
If you want the ones that are on the computer, then use:
comm -12 /tmp/sortedFiles.txt /tmp/foundFiles.txt
This will tell you which ones are on the computer. The disadvantage is that you don't know where they are. :)
Alternatively run find:
find / -type f > /tmp/allFiles.txt
then iterate using grep, matching from the last / to the end of the line:
cat /tmp/filesToFind.txt | xargs -n1 -I# egrep '/#$' /tmp/allFiles.txt
This will print only the locations of the files that were found; it will not print those that were not found.
I assume you have a list of filenames without paths (all unique). I would suggest using locate.
Assuming you have the filenames in a file called files.txt:
cat files.txt | xargs -n1 -I# locate -b '\#' | xargs -n1 -I# basename # | uniq > found.txt
Then just diff the files:
diff files.txt found.txt
Oh, one clarification: this will tell you whether the files EXIST on your computer, not where. :)
If you want to know where, simply run:
cat files.txt | xargs -n1 -I# locate -b '\#'
If you do the loop, it's better to use locate instead of find. It's faster!
If lista contains the file names, you can use:
cat lista | xargs locate
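If you do end up looping, here is a minimal sketch of such a shell loop, assuming the list is in files.txt (one bare filename per line) and an mlocate-style locate (-b matches the basename; the leading backslash disables the implicit wildcards so the match is exact):
while IFS= read -r name; do
    # locate exits 0 when at least one match is found
    if locate -b "\\$name" > /dev/null 2>&1; then
        echo "found:   $name"
    else
        echo "missing: $name"
    fi
done < files.txt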

Resources