How to grep while excluding some words?

How to grep while excluding some words? - linux

I wanted to grep the word "force" but most of the output listed is from the command -force.
When I did grep -v "-force" filename , it says grep : orce most probably because of the -f command.
I just want to find a force signal from files using grep. How?

use grep -v -- "-force" - the double - signals that there are no more options being expected.

If you want to grep specific word from file then we can use cat command
# cat filename.txt | grep force
For other basic Commands

this line maybe simpler:
grep '[^-]force' tmp
it says: grep "force", but only if it does not has a prefix - by using [^]. See some simple regular expression examples here.

Use [-] to remove the special significance. Check this out:
> cat rand_file.txt
1. list items of random text
2. -force
3. look similar as the first batch
4. force
5. some random text
> grep -v "-force" rand_file.txt
grep: orce: No such file or directory
> grep -v "[-]force" rand_file.txt | grep force
4. force
>

Related

Loop to filter out lines from apache log files

I have several apache access files that I would like to clean up a bit before I analyze them. I am trying to use grep in the following way:
grep -v term_to_grep apache_access_log
I have several terms that I want to grep, so I am piping every grep action as follow:
grep -v term_to_grep_1 apache_access_log | grep -v term_to_grep_2 | grep -v term_to_grep_3 | grep -v term_to_grep_n > apache_access_log_cleaned
Until here my rudimentary script works as expected! But I have many apache access logs, and I don't want to do that for every file. I have started to write a bash script but so far I couldn't make it work. This is my try:
for logs in ./access_logs/*;
do
cat $logs | grep -v term_to_grep | grep -v term_to_grep_2 | grep -v term_to_grep_3 | grep -v term_to_grep_n > $logs_clean
done;
Could anyone point me out what I am doing wrong?

If you have a variable and you append _clean to its name, that's a new variable, and not the value of the old one with _clean appended. To fix that, use curly braces:
$ var=file.log
$ echo "<$var>"
<file.log>
$ echo "<$var_clean>"
<>
$ echo "<${var}_clean>"
<file.log_clean>
Without it, your pipeline tries to redirect to the empty string, which results in an error. Note that "$file"_clean would also work.
As for your pipeline, you could combine that into a single grep command:
grep -Ev 'term_to_grep|term_to_grep_2|term_to_grep_3|term_to_grep_n' "$logs" > "${logs}_clean"
No cat needed, only a single invocation of grep.
Or you could stick all your terms into a file:
$ cat excludes
term_to_grep_1
term_to_grep_2
term_to_grep_3
term_to_grep_n
and then use the -f option:
grep -vf excludes "$logs" > "${logs}_clean"
If your terms are strings and not regular expressions, you might be able to speed this up by using -F ("fixed strings"):
grep -vFf excludes "$logs" > "${logs}_clean"
I think GNU grep checks that for you on its own, though.

You are looping over several files, but in your loop you constantly overwrite your result file, so it will only contain the last result from the last file.
You don't need a loop, use this instead:
egrep -v 'term_to_grep|term_to_grep_2|term_to_grep_3' ./access_logs/* > "$logs_clean"
Note, it is always helpful to start a Bash script with set -eEuCo pipefail. This catches most common errors -- it would have stopped with an error when you tried to clobber the $logs_clean file.

grep 2 words at if statements in Bash

I am trying to see if my nohup file contains the words that I am looking for. If it does, then I need to put that into tmp file.
So I am currently using:
if grep -q "Started|missing" $DIR3/$dirName/nohup.out
then
grep -E "Started|missing" "$DIR3/$dirName/nohup.out" > tmp
fi
But it never goes into the if statement even if there are words that I am looking for.
How can I fix this?

Since basic sed uses BRE, regex alternation operator is represented by \| . | matches a literal | symbol. And you don't need to touch | symbol in the grep which uses ERE.
if grep -q "Started\|missing" $DIR3/$dirName/nohup.out

You should use egrep instead of grep (Avinash Raj has explained that in other words already in his answer).
I would generally recommend using egrep as a default for everyday use (even though many expressions only contain the basic regular expression syntax). From a practical point the standard grep is only interesting for performance reasons.
Details about the advantages of grep vs. egrep can be found in that superuser question.

When you only put the grep results into the tmp-file, you do not want to grep the file twice.
You can not use
egrep "Started|missing" $DIR3/$dirName/nohup.out > tmp
since that would create an empty tmp file when nothing is found.
You can remove empty files with if [ ! -s tmp ] or use another solution:
Redirectong the grep results without grepping again can be done with
rm -f tmp 2>/dev/null
egrep "Started|missing" $DIR3/$dirName/nohup.out | while read -r strange_line; do
echo "${strange_line}" >> tmp
done

how can I extract a pattern only using grep

How can we extract the contents present inside KeyProviderType tag only using grep command from the folllowing pattern?
<ContentProtectKeyProfiles-row><Name>PREM7</Name><Domain>42.0.112.121</Domain<ProfileType>4</ProfileType>
<Protocol>HTTP</Protocol><Port>80</Port><KeyProviderType>HLS-AES-128</KeyProviderType</ContentProtectKeyProfiles-row>

a#x:/tmp$ cat s.xml
<ContentProtectKeyProfiles-row> <Name>PREM7</Name> <Domain>42.0.112.121</Domain> <ProfileType>4</ProfileType> <Protocol>HTTP</Protocol> <Port>80</Port> <KeyProviderType>HLS-AES-128</KeyProviderType> </ContentProtectKeyProfiles-row>dhruv#dhruv-pathak:/tmp$
a#x:/tmp$ cat s.xml | grep -oe "<KeyProviderType>.*</KeyProviderType>"
<KeyProviderType>HLS-AES-128</KeyProviderType>

Don't use grep to process XML files. Use a proper XML parser. For example, using xsh, I can just run
open in.xml ;
echo (//KeyProviderType) ;
BTW, I had to fix 2 tags that were missing > in your input.

You can try to use gnu awk (due to RS)
awk -v RS="KeyProviderType" 'NR%2==0 {gsub(/>|<\//,"");print}' file
HLS-AES-128

You may use regex lookahead's and lookbehind's, provided your grep support -P flag.
cat s.xml | grep -oP "(?<=<KeyProviderType>).*(?=</KeyProviderType>)"

Preserve colouring after piping grep to grep

There is a simlar question in Preserve ls colouring after grep’ing but it annoys me that if you pipe colored grep output into another grep that the coloring is not preserved.
As an example grep --color WORD * | grep -v AVOID does not keep the color of the first output. But for me ls | grep FILE do keep the color, why the difference ?

grep sometimes disables the color output, for example when writing to a pipe. You can override this behavior with grep --color=always
The correct command line would be
grep --color=always WORD * | grep -v AVOID
This is pretty verbose, alternatively you can just add the line
alias cgrep="grep --color=always"
to your .bashrc for example and use cgrep as the colored grep. When redefining grep you might run into trouble with scripts which rely on specific output of grep and don't like ascii escape code.

A word of advice:
When using grep --color=always, the actual strings being passed on to the next pipe will be changed. This can lead to the following situation:
$ grep --color=always -e '1' * | grep -ve '12'
11
12
13
Even though the option -ve '12' should exclude the middle line, it will not because there are color characters between 1 and 2.

The existing answers only address the case when the FIRST command is grep (as asked by the OP, but this problem arises in other situations too).
More general answer
The basic problem is that the command BEFORE | grep, tries to be "smart" by disabling color when it realizes the output is going to a pipe. This is usually what you want so that ANSI escape codes don't interfere with your downstream program.
But if you want colorized output emanating from earlier commands, you need to force color codes to be produced regardless of the output sink. The forcing mechanism is program-specific.
Git: use -c color.status=always
git -c color.status=always status | grep -v .DS_Store
Note: the -c option must come BEFORE the subcommand status.
Others
(this is a community wiki post so feel free to add yours)

Simply repeat the same grep command at the end of your pipe.
grep WORD * | grep -v AVOID | grep -v AVOID2 | grep WORD

Pipe output to use as the search specification for grep on Linux

How do I pipe the output of grep as the search pattern for another grep?
As an example:
grep <Search_term> <file1> | xargs grep <file2>
I want the output of the first grep as the search term for the second grep. The above command is treating the output of the first grep as the file name for the second grep. I tried using the -e option for the second grep, but it does not work either.

You need to use xargs's -i switch:
grep ... | xargs -ifoo grep foo file_in_which_to_search
This takes the option after -i (foo in this case) and replaces every occurrence of it in the command with the output of the first grep.
This is the same as:
grep `grep ...` file_in_which_to_search

Try
grep ... | fgrep -f - file1 file2 ...

If using Bash then you can use backticks:
> grep -e "`grep ... ...`" files
the -e flag and the double quotes are there to ensure that any output from the initial grep that starts with a hyphen isn't then interpreted as an option to the second grep.
Note that the double quoting trick (which also ensures that the output from grep is treated as a single parameter) only works with Bash. It doesn't appear to work with (t)csh.
Note also that backticks are the standard way to get the output from one program into the parameter list of another. Not all programs have a convenient way to read parameters from stdin the way that (f)grep does.

I wanted to search for text in files (using grep) that had a certain pattern in their file names (found using find) in the current directory. I used the following command:
grep -i "pattern1" $(find . -name "pattern2")
Here pattern2 is the pattern in the file names and pattern1 is the pattern searched for
within files matching pattern2.
edit: Not strictly piping but still related and quite useful...

This is what I use to search for a file from a listing:
ls -la | grep 'file-in-which-to-search'

Okay breaking the rules as this isn't an answer, just a note that I can't get any of these solutions to work.
% fgrep -f test file
works fine.
% cat test | fgrep -f - file
fgrep: -: No such file or directory
fails.
% cat test | xargs -ifoo grep foo file
xargs: illegal option -- i
usage: xargs [-0opt] [-E eofstr] [-I replstr [-R replacements]] [-J replstr]
[-L number] [-n number [-x]] [-P maxprocs] [-s size]
[utility [argument ...]]
fails. Note that a capital I is necessary. If i use that all is good.
% grep "`cat test`" file
kinda works in that it returns a line for the terms that match but it also returns a line grep: line 3 in test: No such file or directory for each file that doesn't find a match.
Am I missing something or is this just differences in my Darwin distribution or bash shell?

I tried this way , and it works great.
[opuser#vjmachine abc]$ cat a
not problem
all
problem
first
not to get
read problem
read not problem
[opuser#vjmachine abc]$ cat b
not problem xxy
problem abcd
read problem werwer
read not problem 98989
123 not problem 345
345 problem tyu
[opuser#vjmachine abc]$ grep -e "`grep problem a`" b --col
not problem xxy
problem abcd
read problem werwer
read not problem 98989
123 not problem 345
345 problem tyu
[opuser#vjmachine abc]$

You should grep in such a way, to extract filenames only, see the parameter -l (the lowercase L):
grep -l someSearch * | xargs grep otherSearch
Because on the simple grep, the output is much more info than file names only. For instance when you do
grep someSearch *
You will pipe to xargs info like this
filename1: blablabla someSearch blablabla something else
filename2: bla someSearch bla otherSearch
...
Piping any of above line makes nonsense to pass to xargs.
But when you do grep -l someSearch *, your output will look like this:
filename1
filename2
Such an output can be passed now to xargs

I have found the following command to work using $() with my first command inside the parenthesis to have the shell execute it first.
grep $(dig +short) file
I use this to look through files for an IP address when I am given a host name.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

How to grep while excluding some words? - linux

I wanted to grep the word "force" but most of the output listed is from the command -force. When I did grep -v "-force" filename , it says grep : orce most probably because of the -f command. I just want to find a force signal from files using grep. How?

use grep -v -- "-force" - the double - signals that there are no more options being expected.

If you want to grep specific word from file then we can use cat command # cat filename.txt | grep force For other basic Commands

this line maybe simpler: grep '[^-]force' tmp it says: grep "force", but only if it does not has a prefix - by using [^]. See some simple regular expression examples here.

Related

Loop to filter out lines from apache log files

grep 2 words at if statements in Bash

how can I extract a pattern only using grep

Preserve colouring after piping grep to grep

Pipe output to use as the search specification for grep on Linux

Categories

Resources