Grep special character #‘ - linux

I'm trying to grep "#‘om" but I'm not able to escape or account for the quote character. I tried grep -F, grep -e, grep -n, and simply grep "#\‘om", to no avail.

That quote is not the simple quote character it appears to be. It's not clear whether copying-and-pasting the quote character from this website is accurate.
$ echo '‘' | cat -v
M-bM-^@M-^X
$ echo '‘' | xxd
00000000: e280 980a                                ....
So, it appears the problem is one of character sets.
Note, however, that the following works for me:
$ echo '‘' | grep -F '‘'
‘
As does the following:
$ echo '#‘om' | grep -F '#‘om'
#‘om
It would help to see exactly what is being tried. Perhaps use xxd to confirm what bytes are making up that quote.
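As a minimal sketch of that check, assuming the character really is the three UTF-8 bytes e2 80 98 shown by xxd above, the quote can be built from its bytes instead of being pasted. The variable names (quote, line, match, hex) are only illustrative.

```shell
# Build the curly quote from its UTF-8 bytes (e2 80 98) given in octal,
# so the test does not depend on how the character was copied.
quote=$(printf '\342\200\230')
line="#${quote}om"

# Fixed-string search: -F disables regex interpretation of the pattern.
match=$(printf '%s\n' "$line" | grep -F "#${quote}om")

# Dump the bytes of the quote as plain hex to compare with the xxd output.
hex=$(printf '%s' "$quote" | od -An -tx1 | tr -d ' \n')

printf '%s\n' "$match" "$hex"
```

If the hex dump of the character in your own file differs from this one, the pattern and the data contain different characters, which would explain the failed match.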


How can I delete multiple characters in my bash script?

I want to create a bash script on Linux that tells me only my IP address, netmask, and broadcast address. Right now it shows more than that, so I would like to remove a specific number of characters from my variable.
Example:
What I have
ip=Hello world!
What I want
ip=Hello
So how can I remove a specific number of characters from the end of the variable?
I tried multiple things that I found online, but couldn't get it working the way I want it to.
With bash's substring extraction (a negative length needs bash 4.2 or newer):
$ my_var='Hello world!'
$ my_var=${my_var:0:-7}
$ echo "$my_var"
Hello
You could pipe through grep and output only the matching portion:
echo "ip=hello world" | grep -o "ip=\w*"
Output: ip=hello
Another option is the sed command. For example:
$ my_var="Hello world"
$ my_var=$(echo $my_var | sed -e 's/ .*//')
$ echo $my_var
Hello
Or another approach, using cut:
$ my_var="Hello world"
$ my_var=$(echo $my_var | cut -d ' ' -f1)
$ echo $my_var
Hello
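Important: in an interactive bash session, do not type the ! symbol inside double quotes, because it can trigger history expansion. Use single quotes instead.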
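For comparison, here is a sketch that uses only POSIX parameter expansion, with no external command and no dependence on the bash version. The names first_word and trimmed are made up for the example.

```shell
my_var='Hello world!'

# Remove the longest suffix that starts with a space: keeps the first word.
first_word=${my_var%% *}

# Remove exactly seven characters from the end: each ? matches one character.
trimmed=${my_var%???????}

printf '%s\n' "$first_word" "$trimmed"
```

The first form is usually the more robust one when the goal is "everything before the first space", since it does not depend on how long the unwanted tail is.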

sed no output on no pattern match

I want sed to give me a single line output irrespective of whether the matched pattern is found and substituted, or even if there is no pattern match, with same command options.
1. echo "700K" | sed -n 's/[A-Z]//gp' // gives one output
2. echo "700" | sed -n 's/[A-Z]//gp' // no output
Is there any way in sed I can get a single line of output for the second case without removing the "-n" option, forcing it to print the input whether or not a substitution was made?
It is not clear to me why you need to keep the -n option, but if you really do need to keep it, you can use the following sed command:
echo "700" | sed -n 's/[A-Z]//g;p'
This first makes the substitution if possible, then prints the line.
Output:
700
You don't need to mess with all these sed options. Use sed in its simplest form, which makes the substitution if the pattern is found and prints every line either way:
$ echo "700K" | sed 's/[A-Z]//g'
700
$ echo "700" | sed 's/[A-Z]//g'
700
$ sed --version
sed (GNU sed) 4.4
$ sed 's/[A-Z]//g' <<<$'700\n700K\n500\n3500A'
700
700
500
3500
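The difference between the p flag of the s command and a separate p command can be seen side by side in the sketch below. The variable names are illustrative only.

```shell
# p as a flag of s///: the line is printed only when a substitution happened.
with_p_flag=$(printf '700K\n700\n' | sed -n 's/[A-Z]//gp')

# p as its own command after s///: every line is printed, substituted or not.
with_p_cmd=$(printf '700K\n700\n' | sed -n 's/[A-Z]//g;p')

printf 'flag:\n%s\ncommand:\n%s\n' "$with_p_flag" "$with_p_cmd"
```

With the flag, only the first input line survives. With the separate command, both lines are printed as 700.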

grep character not followed by character

I'm trying to print lines that have b not followed by e in a file. I tried using negative look-ahead, but it's not working.
grep 'b(?!e)' filename
grep '(?!e)b)' filename
egrep 'b(?!e)' f3.txt
When I run these commands, nothing shows up, no error or anything else. I checked other people's similar posts as well but was unable to run it.
grep -E 'b([^e]|$)' filename
That should match 'b' followed by a character which is not 'e', or 'b' at end-of-line.
If your grep supports Perl regular expressions with -P, look-arounds work:
$ grep -P 'b(?!e)' <<< 'be' # Gets no output
$ grep -P 'b(?!e)' <<< 'bb'
bb
$ grep -P 'b(?!e)' <<< 'b'
b
The only difference from grep -E (in this case) is that you don't have to handle the end-of-line case yourself (see pilcrow's answer).
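For a grep without -P, the -E alternative can be checked against a few sample lines. This is a sketch, and the input lines are invented for the test.

```shell
# Lines are kept if they contain a b followed by a non-e character,
# or a b at the very end of the line.
result=$(printf 'be\nbb\nb\nabet\ncab\n' | grep -E 'b([^e]|$)')

printf '%s\n' "$result"
```

Here "be" and "abet" are dropped because their only b is followed by e, while "bb", "b", and "cab" are kept.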

Grep Syntax with Capitals

I'm trying to write a script with a file as an argument that greps the text file to find any word that starts with a capital and has 8 letters following it. I'm bad with syntax so I'll show you my code, I'm sure it's an easy fix.
grep -o '[A-Z][^ ]*' $1
I'm not sure how to specify that:
a) it starts with a capital letter, and
b) that it's a 9 letter word.
Cheers
EDIT:
As an edit I'd like to add my new code:
while read p
do
echo $p | grep -Eo '^[A-Z][[:alpha:]]{8}'
done < $1
I still can't get it to work, any help on my new code?
'[A-Z][^ ]*' will match one character between A and Z, followed by zero or more non-space characters. So it would match any A-Z character on its own.
Use \b to indicate a word boundary, and a quantifier inside braces, for example:
grep '\b[A-Z][a-z]\{8\}\b'
If you just did grep '[A-Z][a-z]\{8\}' that would match (for example) "aaaaHellosailor".
I use \{8\} because the braces need to be escaped unless you use grep -E (also known as egrep), which uses Extended Regular Expressions. Plain grep, which you are using, uses Basic Regular Expressions. Also note that \b is not part of the POSIX standard, but it is commonly supported.
If you use ^ at the beginning and $ at the end, then it will not find "Wiltshire" in "A Wiltshire pig makes great sausages"; it will only find lines that consist of a single 9-letter capitalized word and nothing else.
This works for me:
$ echo "one-Abcdefgh.foo" | grep -o -E '[A-Z][[:alpha:]]{8}'
$ echo "one-Abcdefghi.foo" | grep -o -E '[A-Z][[:alpha:]]{8}'
Abcdefghi
$
Note that this doesn't handle extensions or prefixes. If you want to FORCE the input to be a 9-letter capitalized word, we need to be more explicit:
$ echo "one-Abcdefghij.foo" | grep -o -E '\b[A-Z][[:alpha:]]{8}\b'
$ echo "Abcdefghij" | grep -o -E '\b[A-Z][[:alpha:]]{8}\b'
$ echo "Abcdefghi" | grep -o -E '\b[A-Z][[:alpha:]]{8}\b'
Abcdefghi
$
I have a test file named 'testfile' with the following content:
Aabcdefgh
Babcdefgh
cabcdefgh
eabcd
Now you can use the following command to grep in this file:
grep -Eo '^[A-Z][[:alpha:]]{8}' testfile
The code above is equivalent to:
cat testfile | grep -Eo '^[A-Z][[:alpha:]]{8}'
This matches
Aabcdefgh
Babcdefgh
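To pull such words out of running text rather than from the start of a line, one possible sketch combines -o with -w, assuming a grep that supports both options (GNU and BSD grep do). The sample sentence is borrowed from the answer above and extended with one longer word.

```shell
# -o prints only the matched text, and -w requires the match to be a whole
# word, so the 11-letter "Southampton" is not cut down to 9 letters.
words=$(printf '%s\n' 'A Wiltshire pig makes great sausages near Southampton' |
  grep -owE '[A-Z][[:alpha:]]{8}')

printf '%s\n' "$words"
```

Only "Wiltshire" is reported, because it is the only capitalized word with exactly nine letters.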

Using sed to get an env var from /proc/*/environ weirdness with \x00

I'm trying to grovel through another process's environment to get a specific env var.
So I've been trying a sed command like:
sed -n "s/\x00ENV_VAR_NAME=\([^\x00]*\)\x00/\1/p" /proc/pid/environ
But I'm getting as output the full environ file. If I replace the \1 with just a static string, I get that string plus the entire environ file:
sed -n "s/\x00ENV_VAR_NAME=\([^\x00]*\)\x00/BLAHBLAH/p" /proc/pid/environ
I should just be getting "BLAHBLAH" in the last example. This doesn't happen if I get rid of the null chars and use some other test data set.
This led me to try transforming the \x00 bytes to \x01, which does seem to work:
cat /proc/pid/environ | tr '\000' '\001' | sed -n "s/\x01ENV_VAR_NAME=\([^\x01]*\)\x01/\1/p"
Am I missing something simple about sed here? Or should I just stick to this workaround?
A lot of programs written in C tend to fail on strings with embedded NULs, because a NUL terminates a C-style string, unless they are specially written to handle it.
I process /proc/*/environ on the command line with xargs:
xargs -n 1 -0 < /proc/pid/environ
This gives you one env var per line. Without a command, xargs just echoes the argument. You can then easily use grep, sed, awk, etc. on that by piping to it.
xargs -n 1 -0 < /proc/pid/environ | sed -n 's/^ENV_VAR_NAME=\(.*\)/\1/p'
I use this often enough that I have a shell function for it:
pidenv()
{
xargs -n 1 -0 < /proc/${1:-self}/environ
}
This gives you the environment of a specific pid, or self if no argument is supplied.
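You could process the list with gawk, setting the record separator to \0 and the field separator to =, and printing everything after the first = so that values containing = are not cut short:
gawk -v 'RS=\0' -F= '$1=="ENV_VAR_NAME" {print substr($0, index($0, "=") + 1)}' /proc/pid/environ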
Or you could use read in a loop to read each NUL-delimited record. For instance:
while IFS= read -r -d '' ENV; do declare "$ENV"; done < /proc/pid/environ
echo "$ENV_VAR_NAME"
(Do this in a sub-shell to avoid clobbering your own environment.)
cat /proc/PID/environ | tr '\0' '\n' | sed 's/^/export /' ;
then copy and paste as needed.
Although this is an old question that has already been answered, I am adding one very simple one-liner, which is probably simpler for getting text output for further processing:
strings /proc/$PID/environ
For some reason sed does not match \0 with .
% echo -n "\00" | xxd
0000000: 00 .
% echo -n "\00" | sed 's/./a/g' | xxd
0000000: 00 .
% echo -n "\01" | xxd
0000000: 01 .
% echo -n "\01" | sed 's/./a/g' | xxd
0000000: 61 a
Solution: do not use sed or use your workaround.
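A sketch of the workaround on sample data follows, so it can be tried without reading a real process. The temporary file and its contents are hypothetical stand-ins for /proc/pid/environ.

```shell
# Sample data standing in for /proc/<pid>/environ (hypothetical contents):
# NAME=value records, each terminated by a NUL byte.
sample=$(mktemp)
printf 'HOME=/root\000ENV_VAR_NAME=some value\000PATH=/bin\000' > "$sample"

# Turn each NUL into a newline first, so sed sees one variable per line
# and never has to match a NUL itself.
value=$(tr '\000' '\n' < "$sample" | sed -n 's/^ENV_VAR_NAME=//p')

rm -f "$sample"
printf '%s\n' "$value"
```

This assumes the value itself contains no newline. For values that may contain one, a NUL-aware reader such as the gawk or read -d approaches above is the safer choice.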
