grep character not followed by character - linux

I'm trying to print lines that have b not followed by e in a file. I tried using negative look-ahead, but it's not working.
grep 'b(?!e)' filename
grep '(?!e)b)' filename
egrep 'b(?!e)' f3.txt
When I run these commands, nothing shows up, no error or anything else. I checked other people's similar posts as well but was unable to run it.

grep -E 'b([^e]|$)' filename
That should match 'b' followed by a character which is not 'e', or 'b' at end-of-line.
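A quick way to check this, as a minimal sketch: the sample words and the variable name `out` are made up for illustration and are not from the question.

```shell
# Feed a few test lines to the pattern; only lines containing a "b"
# that is NOT immediately followed by "e" should survive.
out=$(printf '%s\n' 'be' 'bb' 'b' 'tab' 'bed' 'bebop' | grep -E 'b([^e]|$)')
printf '%s\n' "$out"
# Prints:
# bb
# b
# tab
# bebop
```

"be" and "bed" are dropped because their only b is followed by e; "bebop" is kept because its second b is followed by o.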

If your grep supports Perl regular expressions with -P, look-arounds work:
$ grep -P 'b(?!e)' <<< 'be' # Gets no output
$ grep -P 'b(?!e)' <<< 'bb'
bb
$ grep -P 'b(?!e)' <<< 'b'
b
The only difference from grep -E (in this case) is that you don't have to handle the end-of-line case yourself (see pilcrow's answer).
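One more difference becomes visible if you add -o: the -E pattern consumes the character after the b, while the look-ahead does not. A small sketch, assuming GNU grep built with PCRE support (the input "bat" and the variable names are only for illustration):

```shell
# -E: the bracket expression is part of the match, so "ba" is printed.
ere=$(printf 'bat\n' | grep -oE 'b([^e]|$)')
# -P: the look-ahead is zero-width, so only "b" is printed.
pcre=$(printf 'bat\n' | grep -oP 'b(?!e)')
printf 'ERE matched: %s\nPCRE matched: %s\n' "$ere" "$pcre"
# Prints:
# ERE matched: ba
# PCRE matched: b
```

For just printing whole lines this does not matter, but it does if you post-process the matched text.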

Related

Grep special character #‘

I'm trying to grep "#‘om" but am not able to escape or account for the quote char. I tried grep -F, grep -e, grep -n or simply grep "#\‘om" to no avail.
That quote is not the simple quote character it appears to be. It's not clear whether copying-and-pasting the quote character from this website is accurate.
$ echo '‘' | cat -v
M-bM-^@M-^X
$ echo '‘' | xxd
00000000: e280 980a                                ....
So, it appears the problem is one of character sets.
Note, however, that the following works for me:
$ echo '‘' | grep -F '‘'
‘
As does the following:
$ echo '#‘om' | grep -F '#‘om'
#‘om
It would help to see exactly what is being tried. Perhaps use xxd to confirm what bytes are making up that quote.
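If xxd is not installed, od can do the same check. The sketch below builds the curly quote from its UTF-8 bytes (e2 80 98, written in octal for printf) so nothing depends on what the clipboard pasted; the variable names are hypothetical.

```shell
# U+2018 LEFT SINGLE QUOTATION MARK as raw UTF-8 bytes (octal escapes).
curly=$(printf '\342\200\230')

# Dump both quote characters as hex to see that they differ.
hex_curly=$(printf '%s' "$curly" | od -An -tx1 | tr -d ' \n')
hex_ascii=$(printf '%s' "'" | od -An -tx1 | tr -d ' \n')
printf 'curly: %s  ascii: %s\n' "$hex_curly" "$hex_ascii"
# Prints: curly: e28098  ascii: 27

# A fixed-string search using the exact same bytes finds the line.
match=$(printf '#%som\n' "$curly" | grep -F "#${curly}om")
printf '%s\n' "$match"
```

If the bytes in your file differ from the bytes in your pattern (for example a different quote character or a different encoding), grep -F will not match.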

sed no output on no pattern match

I want sed to give me a single line output irrespective of whether the matched pattern is found and substituted, or even if there is no pattern match, with same command options.
1. echo "700K" | sed -n 's/[A-Z]//gp' # gives one line of output
2. echo "700" | sed -n 's/[A-Z]//gp' # no output
Is there any way in sed I can get a single line of output for the second case without removing the "-n" option, forcing it to print the input whether or not a substitution was made?
It is not clear to me why you need to keep the -n option, but if you really do need to keep it, you can use the following sed command:
echo "700" | sed -n 's/[A-Z]//g;p'
This first makes the substitution if possible, then prints the line.
output:
700
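To see that each input line yields exactly one output line with this command, here is a small sketch (the loop, the extra input 3500A, and the variable `out` are only for illustration):

```shell
# With -n, automatic printing is off; the explicit "p" after the
# substitution prints every line once, changed or not.
out=$(for input in 700K 700 3500A; do
    printf '%s\n' "$input" | sed -n 's/[A-Z]//g;p'
done)
printf '%s\n' "$out"
# Prints:
# 700
# 700
# 3500
```

The difference from 's/[A-Z]//gp' is that there the p is a flag of the s command and only fires when a substitution happened.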
You don't need to mess with all these sed options. Use sed in its simplest form, which makes the substitution if the pattern is found and prints every line either way:
$ echo "700K" | sed 's/[A-Z]//g'
700
$ echo "700" | sed 's/[A-Z]//g'
700
$ sed --version
sed (GNU sed) 4.4
$ sed 's/[A-Z]//g' <<<$'700\n700K\n500\n3500A'
700
700
500
3500

Grep multiple strings

I want to find a list of files that have A but do not have B and C.
grep -r -L 'B\|C' finds the ones without B and C, but how do I add the condition of having A as well?
If I understand your question correctly:
grep -l "A" $(grep -r -E -L "B|C" *)
i.e. search for files containing "A" in the list of files that your original command generates.
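The $(...) form splits on whitespace, so it can trip over file names containing spaces. A hedged variant, assuming a grep that supports -L and -Z (GNU and BSD grep do) and an xargs with -0; the temporary directory and file names are invented for the demo:

```shell
# Build a throwaway directory with four sample files.
dir=$(mktemp -d)
printf 'A\n'    > "$dir/only a.txt"     # has A, no B or C -> wanted
printf 'A\nB\n' > "$dir/a_and_b.txt"    # has B -> excluded
printf 'A C\n'  > "$dir/a_and_c.txt"    # has C -> excluded
printf 'x\n'    > "$dir/none.txt"       # no A -> excluded

# Step 1: list files WITHOUT B or C, NUL-separated (-Z).
# Step 2: among those, list files that contain A.
result=$(grep -rLZ -E 'B|C' "$dir" | xargs -0 grep -l 'A')
printf '%s\n' "$result"

rm -rf "$dir"
```

Only the path ending in "only a.txt" is printed. If the first grep finds no files at all, xargs may still run the second grep with no file arguments; GNU xargs has -r to avoid that.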
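You can use negative lookahead in grep with the -P (--perl-regexp) option:
grep -r -P -L '^(?!.*A).*$|B|C'
Note that -L lists only files with no matching line, so this reports files in which every line contains A and no line contains B or C; a file with A on just some of its lines is not listed.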
If I understood your question correctly, you can do it like this:
grep "A" file.txt | grep -v -e "B" -e "C"
The first grep finds lines containing A; the second grep takes the result and removes lines containing either "B" or "C". This works because the -v flag inverts the match.
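Keep in mind that this pipeline filters lines within one input, not whole files. A minimal sketch with made-up input lines:

```shell
# Keep lines that contain A, then drop any of those that contain B or C.
out=$(printf '%s\n' 'A only' 'A and B' 'A and C' 'nothing' \
    | grep 'A' | grep -v -e 'B' -e 'C')
printf '%s\n' "$out"
# Prints:
# A only
```

So it answers "which lines have A but not B or C", which may or may not be what the question is after.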

Grep substrings in string/word

Is there a way on grep or any other unix tool to search for a sequence of substrings in a string?
To clarify:
$ grep "substring1.*substring2"
substring1_mySubstring2 # OK substrings forming a single string
substring1 substring2 # WRONG substrings are separated
You can tell grep to look for substring1 + some characters + substring2:
grep -iE 'substring1\w+substring2' file
Note the use of -i to ignore case and -E for extended regex syntax (without -E, the same can be done with \w\+ instead). \w is a GNU extension that matches letters, digits and underscore.
Test
$ cat a
substring1_mySubstring2
substring1 substring2
substring1_and_other_things12345substring2 blabla
Let's see how this matches only when there are no spaces in between:
$ grep -iE 'substring1\w+substring2' a
substring1_mySubstring2
substring1_and_other_things12345substring2 blabla
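Since \w does not match characters such as a hyphen, a variant that accepts anything except whitespace between the two substrings may be closer to "a single word". This is only a sketch; the third sample line is invented to show the difference.

```shell
# [^[:space:]]+ = one or more characters that are not whitespace (POSIX class).
out=$(printf '%s\n' \
    'substring1_mySubstring2' \
    'substring1 substring2' \
    'substring1-x-substring2' \
    | grep -iE 'substring1[^[:space:]]+substring2')
printf '%s\n' "$out"
# Prints:
# substring1_mySubstring2
# substring1-x-substring2
```

With \w+ the hyphenated line would not match. Use * instead of + if the two substrings may also be directly adjacent.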

Grep Syntax with Capitals

I'm trying to write a script with a file as an argument that greps the text file to find any word that starts with a capital and has 8 letters following it. I'm bad with syntax so I'll show you my code, I'm sure it's an easy fix.
grep -o '[A-Z][^ ]*' $1
I'm not sure how to specify that:
a) it starts with a capital letter, and
b) that it's a 9 letter word.
Cheers
EDIT:
As an edit I'd like to add my new code:
while read p
do
echo $p | grep -Eo '^[A-Z][[:alpha:]]{8}'
done < $1
I still can't get it to work, any help on my new code?
'[A-Z][^ ]*' will match one character between A and Z, followed by zero or more non-space characters. So it would match any A-Z character on its own.
Use \b to indicate a word boundary, and a quantifier inside braces, for example:
grep '\b[A-Z][a-z]\{8\}\b'
If you just did grep '[A-Z][a-z]\{8\}' that would match (for example) "aaaaHellosailor".
I use \{8\}; the braces need to be escaped unless you use grep -E, also known as egrep, which uses Extended Regular Expressions. Vanilla grep, which you are using, uses Basic Regular Expressions. Also note that \b is not part of the standard, but is commonly supported.
If you use ^ at the beginning and $ at the end then it will not find "Wiltshire" in "A Wiltshire pig makes great sausages"; it will only find lines which consist of just a 9-character capitalized word and nothing else.
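The difference between word boundaries and line anchors can be seen on that same sentence. A small sketch, assuming a grep that supports \b and -o (GNU grep does); the variable names are hypothetical:

```shell
line='A Wiltshire pig makes great sausages'

# \b...\b finds the word anywhere in the line.
with_boundaries=$(printf '%s\n' "$line" | grep -o '\b[A-Z][a-z]\{8\}\b')

# ^...$ requires the whole line to be the word, so nothing matches here.
# "|| true" keeps grep's non-zero "no match" status from aborting a script.
anchored=$(printf '%s\n' "$line" | grep -o '^[A-Z][a-z]\{8\}$' || true)

printf 'boundaries: [%s]\nanchored: [%s]\n' "$with_boundaries" "$anchored"
# Prints:
# boundaries: [Wiltshire]
# anchored: []
```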
This works for me:
$ echo "one-Abcdefgh.foo" | grep -o -E '[A-Z][[:alpha:]]{8}'
$ echo "one-Abcdefghi.foo" | grep -o -E '[A-Z][[:alpha:]]{8}'
Abcdefghi
$
Note that this doesn't handle extensions or prefixes. If you want to FORCE the input to be a 9-letter capitalized word, we need to be more explicit:
$ echo "one-Abcdefghij.foo" | grep -o -E '\b[A-Z][[:alpha:]]{8}\b'
$ echo "Abcdefghij" | grep -o -E '\b[A-Z][[:alpha:]]{8}\b'
$ echo "Abcdefghi" | grep -o -E '\b[A-Z][[:alpha:]]{8}\b'
Abcdefghi
$
I have a test file named 'testfile' with the following content:
Aabcdefgh
Babcdefgh
cabcdefgh
eabcd
Now you can use the following command to grep in this file:
grep -Eo '^[A-Z][[:alpha:]]{8}' testfile
The command above is equivalent to:
cat testfile | grep -Eo '^[A-Z][[:alpha:]]{8}'
This matches
Aabcdefgh
Babcdefgh
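Putting the pieces together, the while loop from the question is not needed, because grep already reads the file line by line. A possible sketch of the script body as a function: the name find_capitalized, the temporary file and the extra "A Wiltshire pig" line are assumptions for the demo, and LC_ALL=C is set so that [A-Z] means ASCII capitals only.

```shell
# Print every 9-letter word that starts with a capital letter.
# $1 is the file to search.
find_capitalized() {
    LC_ALL=C grep -Eo '\b[A-Z][[:alpha:]]{8}\b' "$1"
}

# Demo on a throwaway copy of the test file plus one sentence.
tmp=$(mktemp)
printf '%s\n' 'Aabcdefgh' 'Babcdefgh' 'cabcdefgh' 'eabcd' 'A Wiltshire pig' > "$tmp"
found=$(find_capitalized "$tmp")
printf '%s\n' "$found"
# Prints:
# Aabcdefgh
# Babcdefgh
# Wiltshire
rm -f "$tmp"
```

Unlike the ^-anchored version, this also finds words in the middle of a line, and the \b on both sides rejects longer words such as "Abcdefghij".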
