grep -v to exclude .html files doesn't work - Linux

I have three files in a directory:
a.html:
<html>
a
</html>
b.html:
<html>
b
</html>
htmlfile:
this is just a html file
I want to get the files whose filename extension is not .html, so the command I use is:
ls|grep -v *html
but the result is:
<html>
b
</html>
Why?
Thank you, but I still don't know why ls|grep -v *html prints out the content of b.html. If this command prints the content of files ending in .html, why doesn't it print the content of a.html?

Since you did not put *html in quotes, the shell expands your command to
ls | grep -v a.html b.html
Now grep has been given two arguments: a.html is taken as the pattern and b.html as a file operand. Since grep was given a file to read, it ignores stdin. So the result is equivalent to
grep -v "a.html" b.html
which prints the contents of b.html.
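You can watch the expansion happen yourself (a sketch, assuming the same three files; with set -x, bash prints each command after expansion, roughly like this):
$ set -x
$ ls | grep -v *html
+ ls
+ grep -v a.html b.html
<html>
b
</html>
$ set +x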
Edit:
To make it work, use either
ls | grep -v "html$"
# Note that html$ is the regexp equivalent to the shell pattern *html
or
shopt -s extglob # turn on extended globbing
ls -d !(*html)
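A quick check in the example directory (assuming bash, since extglob is a bash feature):
$ shopt -s extglob
$ ls -d !(*html)
htmlfile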

Use
ls | grep -v .html
this will filter out names containing .html. (Note that the unescaped dot matches any character, so a name like ahtml would also be dropped; the quoted, anchored patterns in the other answers are stricter.)
Quick test:
$ ls
a.html b.html htmlfile
$ ls | grep -v .html
htmlfile

This should do the trick (with the dot escaped so it matches a literal dot):
ls -1 | grep -v '\.html$'

This has something to do with shell globbing. I'm not sure exactly why grep -v behaves this way, but here are some similar results with zsh on Arch Linux:
ls | grep -v *html
<html>b</html>
grep -v *html
<html>b</html>
Note that both give the same result. grep is using the shell-expanded glob (*html) as its arguments and operating on files in the current working directory; the pipe from ls has nothing to do with it. Its output is ignored because grep was given file operands.
To see how this works more clearly, move up a directory and issue:
ls <dirname> | grep -v *html
zsh: no matches found: *html
Edit: See Pumbaa80's answer for why this happens.

Loop to filter out lines from apache log files

I have several apache access files that I would like to clean up a bit before I analyze them. I am trying to use grep in the following way:
grep -v term_to_grep apache_access_log
I have several terms that I want to filter out, so I am piping the grep actions as follows:
grep -v term_to_grep_1 apache_access_log | grep -v term_to_grep_2 | grep -v term_to_grep_3 | grep -v term_to_grep_n > apache_access_log_cleaned
Up to this point my rudimentary approach works as expected! But I have many apache access logs, and I don't want to do that for every file. I have started to write a bash script, but so far I couldn't make it work. This is my attempt:
for logs in ./access_logs/*;
do
cat $logs | grep -v term_to_grep | grep -v term_to_grep_2 | grep -v term_to_grep_3 | grep -v term_to_grep_n > $logs_clean
done;
Could anyone point me out what I am doing wrong?
If you have a variable and you append _clean to its name, that's a new variable, and not the value of the old one with _clean appended. To fix that, use curly braces:
$ var=file.log
$ echo "<$var>"
<file.log>
$ echo "<$var_clean>"
<>
$ echo "<${var}_clean>"
<file.log_clean>
Without it, your pipeline tries to redirect to the empty string, which results in an error. Note that "$var"_clean would also work.
As for your pipeline, you could combine that into a single grep command:
grep -Ev 'term_to_grep|term_to_grep_2|term_to_grep_3|term_to_grep_n' "$logs" > "${logs}_clean"
No cat needed, only a single invocation of grep.
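Putting that together with your loop (a sketch; the terms are the placeholders from your question):
for logs in ./access_logs/*; do
    grep -Ev 'term_to_grep_1|term_to_grep_2|term_to_grep_3|term_to_grep_n' "$logs" > "${logs}_clean"
done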
Or you could stick all your terms into a file:
$ cat excludes
term_to_grep_1
term_to_grep_2
term_to_grep_3
term_to_grep_n
and then use the -f option:
grep -vf excludes "$logs" > "${logs}_clean"
If your terms are strings and not regular expressions, you might be able to speed this up by using -F ("fixed strings"):
grep -vFf excludes "$logs" > "${logs}_clean"
I think GNU grep checks that for you on its own, though.
You are looping over several files, but $logs_clean is never set inside your loop, so the redirection fails (and even with a fixed output name, each iteration would overwrite it, leaving only the last file's results).
You don't need a loop if a single combined output file is acceptable; use this instead (all_logs_clean is just a literal output name):
egrep -v 'term_to_grep|term_to_grep_2|term_to_grep_3' ./access_logs/* > all_logs_clean
Note: it is always helpful to start a Bash script with set -eEuCo pipefail. This catches most common errors; here it would have stopped with an error at the redirection to the unset $logs_clean.
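For instance, set -u (the u in that list) makes bash abort as soon as an unset variable such as $logs_clean is referenced (a sketch; access.log stands in for one of the log files):
$ set -u
$ cat access.log > "$logs_clean"
bash: logs_clean: unbound variable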

Filtering file names on Linux using the grep command

I have a series of files, for example:
ABC_DDS_20150212_CD.csv
ABC_DDS_20150210_20150212_CD.csv
ABC_DFG_20150212_20150217_CD.csv
I want to use grep on Linux to extract the first two files, but not the third, from the ls output.
I tried the following:
grep -l "" *20150212* -- exclude *20150212_201502*
You can pipe into grep -v:
grep -l "" *20150212* | grep -v "20150212_201502"
I haven't seen people use grep -l like that though. Using ls seems like a cleaner solution:
ls -l *20150212* | grep -v "20150212_201502"
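A quick test with the three file names from the question (a sketch):
$ ls *20150212* | grep -v "20150212_201502"
ABC_DDS_20150210_20150212_CD.csv
ABC_DDS_20150212_CD.csv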
Since, as you mentioned, you want the first two files of the ls output, you can use head as well, like this:
ls -l *your_search_name* | head -2
You can extract your matches by using the command below:
grep -l "" *20150212*
Output:
ABC_DDS_20150210_20150212_CD.csv
ABC_DDS_20150212_CD.csv
ABC_DFG_20150212_20150217_CD.csv
and then, to get the first two lines, you can use the head command:
grep -l "" *20150212* | head -2
Output:
ABC_DDS_20150210_20150212_CD.csv
ABC_DDS_20150212_CD.csv

grep lines with X but not with Y

The problem:
I want to get all lines of code in my project folder that contain ".js", in order to check that I don't have un-minified JavaScript references.
When I try the following: grep -H ".js\"" *
I get the matches I want, but I still have a problem: I don't want the lines containing ".min.js".
Is it possible, using grep, to search my project folder for all files/lines that contain ".js" but not ".min.js"?
Thanks.
GalT.
Just pipe the output to another grep:
grep -H ".js" * | grep -v ".min.js"
You can do this with awk (dots escaped so they match literally):
awk '/\.js/ && !/\.min\.js/'
To print the filename:
awk '/\.js/ && !/\.min\.js/ {print FILENAME}' *
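Note that FILENAME is printed once per matching line, so a file with several hits shows up several times; a small tweak prints each file name only once (a sketch):
awk '/\.js/ && !/\.min\.js/ && !seen[FILENAME]++ {print FILENAME}' *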
The following will work through a whole folder as well.
For the current directory you can use:
find . | xargs grep ".js" | grep -v "min.js"
For a specific folder:
find (folder path) | xargs grep ".js" | grep -v "min.js"
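With GNU grep you can also skip find entirely and recurse directly (a sketch, with the dots escaped so they match literally):
grep -rH '\.js' . | grep -v '\.min\.js'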

grep command working in testdir but not in "real" directory

I just thought I had found my solution because the command works in my test directory.
grep -H -e 'author="[^"].*' *.xml | cut -d: -f1 | xargs -I '{}' mv {} mydir/.
But in the real (non-test) directory the command did not work. This is the error message:
grep: unknown option -- O
Usage: grep [OPTION]... PATTERN [FILE]...
Try `grep --help' for more information.
Not even this worked:
$ grep -H author *.xml
or this:
$ grep -H 'author' *.xml
(same error message)
I suspect it has some relation to the file names or the number of files.
I have almost 3000 files in the non-test-directory and only 20 in my test directory.
In both directories almost all file names contain spaces and " - ".
Some more info:
I'm using Cygwin.
I am not allowed to change the filenames
Try this (updated):
grep -HlZ 'author="[^"].*' -- *.xml | xargs -0 -I {} mv -- {} mydir/
EXPLANATION (updated)
In your "real" directory you have a file with name starting with -O.
Your shell expands the file list *.xml and grep takes your - starting filename as an option (not valid). Same thing happens with mv. As explained in the Common options section of info coreutils, you can use -- to delimit the option list. What comes after -- is considered as an operand, not an option.
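You can reproduce this in a scratch directory (a sketch; -Options.xml is a made-up name starting with a dash):
$ touch -- '-Options.xml' normal.xml
$ grep -H author *.xml
grep: unknown option -- O
$ grep -H author -- *.xml    # with --, the dashed name is treated as a file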
Using the -l (lowercase L) option, grep outputs only the filename of matching files, so you don't need to use cut.
To correctly handle every strange filename, you have to use the pair -Z in grep (NUL-terminated output) and -0 in xargs (NUL-separated input).
No need to use -e because your pattern does not begin with -.
Hope this will help!

Pipe output to use as the search specification for grep on Linux

How do I pipe the output of grep as the search pattern for another grep?
As an example:
grep <Search_term> <file1> | xargs grep <file2>
I want the output of the first grep as the search term for the second grep. The above command is treating the output of the first grep as the file name for the second grep. I tried using the -e option for the second grep, but it does not work either.
You need to use xargs's -i switch (on many systems the capital -I form is required; see the note further down):
grep ... | xargs -ifoo grep foo file_in_which_to_search
This takes the placeholder after -i (foo in this case) and replaces every occurrence of it in the command with each line of output from the first grep.
This is the same as:
grep `grep ...` file_in_which_to_search
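A minimal end-to-end sketch using the portable capital -I form (terms.txt and data.txt are hypothetical files; one grep runs per line of output):
grep 'needle' terms.txt | xargs -I{} grep {} data.txt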
Try
grep ... | fgrep -f - file1 file2 ...
If using Bash then you can use backticks:
> grep -e "`grep ... ...`" files
the -e flag and the double quotes are there to ensure that any output from the initial grep that starts with a hyphen isn't then interpreted as an option to the second grep.
Note that the double quoting trick (which also ensures that the output from grep is treated as a single parameter) only works with Bash. It doesn't appear to work with (t)csh.
Note also that backticks are the standard way to get the output from one program into the parameter list of another. Not all programs have a convenient way to read parameters from stdin the way that (f)grep does.
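The modern $() form of command substitution is equivalent to backticks and nests more cleanly (a sketch):
grep -e "$(grep pattern1 file1)" file2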
I wanted to search for text in files (using grep) that had a certain pattern in their file names (found using find) in the current directory. I used the following command:
grep -i "pattern1" $(find . -name "pattern2")
Here pattern2 is the pattern in the file names and pattern1 is the pattern searched for
within files matching pattern2.
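If the matching file names may contain spaces (a recurring problem in the questions above), find's -exec avoids the word splitting that $() performs (a sketch):
find . -name "pattern2" -exec grep -i "pattern1" {} +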
edit: Not strictly piping but still related and quite useful...
This is what I use to search for a file from a listing:
ls -la | grep 'file-in-which-to-search'
Okay, breaking the rules, as this isn't an answer but a note: I can't get any of these solutions to work.
% fgrep -f test file
works fine.
% cat test | fgrep -f - file
fgrep: -: No such file or directory
fails.
% cat test | xargs -ifoo grep foo file
xargs: illegal option -- i
usage: xargs [-0opt] [-E eofstr] [-I replstr [-R replacements]] [-J replstr]
[-L number] [-n number [-x]] [-P maxprocs] [-s size]
[utility [argument ...]]
fails. Note that a capital I is necessary. If I use that, all is good.
% grep "`cat test`" file
kinda works, in that it returns a line for the terms that match, but it also returns a line "grep: line 3 in test: No such file or directory" for each file that doesn't find a match.
Am I missing something or is this just differences in my Darwin distribution or bash shell?
I tried it this way, and it works great.
[opuser@vjmachine abc]$ cat a
not problem
all
problem
first
not to get
read problem
read not problem
[opuser@vjmachine abc]$ cat b
not problem xxy
problem abcd
read problem werwer
read not problem 98989
123 not problem 345
345 problem tyu
[opuser@vjmachine abc]$ grep -e "`grep problem a`" b --col
not problem xxy
problem abcd
read problem werwer
read not problem 98989
123 not problem 345
345 problem tyu
[opuser@vjmachine abc]$
You should run grep in such a way as to extract filenames only; see the -l parameter (the lowercase L):
grep -l someSearch * | xargs grep otherSearch
With plain grep, the output contains much more than the file names. For instance, when you do
grep someSearch *
You will pipe to xargs info like this
filename1: blablabla someSearch blablabla something else
filename2: bla someSearch bla otherSearch
...
Piping lines like these to xargs makes no sense.
But when you do grep -l someSearch *, your output will look like this:
filename1
filename2
Such output can now be passed to xargs.
I have found the following command to work, using $() with my first command inside the parentheses so the shell executes it first:
grep "$(dig +short <hostname>)" file
I use this to look through files for an IP address when I am given a host name.
