passing grep into a variable in bash - linux

I have a file named email.txt like these one :
Subject:My test
From:my email <myemail#gmail.com>
this is third test
I want to take out only the email address in this file by using bash script.So i put this script in my bash script named myscript:
#!/bin/bash
file=$(myscript)
var1=$(awk 'NR==2' $file)
var2=$("$var1" | (grep -Eio '\b[A-Z0-9._%+-]+#[A-Z0-9.-]+\.[A-Z]{2,4}\b'))
echo $var2
But I failed to run this script.When I run this command manually in bash i can obtain the email address:
echo $var1 | grep -Eio '\b[A-Z0-9._%+-]+#[A-Z0-9.-]+\.[A-Z]{2,4}\b'
I need to put the email address to store in a variable so i can use it in other function.Can someone show me how to solve this problem?
Thanks.

I think this is an overly complicated way to go about things, but if you just want to get your script to work, try this:
#!/bin/bash
file="email.txt"
var1=$(awk 'NR==2' $file)
var2=$(echo "$var1" | grep -Eio '\b[A-Z0-9._%+-]+#[A-Z0-9.-]+\.[A-Z]{2,4}\b')
echo $var2
I'm not sure what file=$(myscript) was supposed to do, but on the next line you want a file name as argument to awk, so you should just assign email.txt as a string value to file, not execute a command called myscript. $var1 isn't a command (it's just a line from your text file), so you have to echo it to give grep anything useful to work with. The additional parentheses around grep are redundant.

What is happening is this:
var2=$("$var1" | (grep -Eio '\b[A-Z0-9._%+-]+#[A-Z0-9.-]+\.[A-Z]{2,4}\b'))
^^^^^^^ Execute the program named (what is in variable var1).
You need to do something like this:
var2=$(echo "$var1" | grep -Eio '\b[A-Z0-9._%+-]+#[A-Z0-9.-]+\.[A-Z]{2,4}\b')
or even
var2=$(awk 'NR==2' $file | grep -Eio '\b[A-Z0-9._%+-]+#[A-Z0-9.-]+\.[A-Z]{2,4}\b')

There are very helpful flags for bash: -xv
The line with
var2=$("$var1" | (grep...
should be
var2=$(echo "$var1" | (grep...
Also my version of grep doesn't have -o flag.
And, as far as grep patterns are "greedy" even as the following code runs, it's output is not exactly what you want.
#!/bin/bash -xv
file=test.txt
var1=$(awk 'NR==2' $file)
var2=$(echo "$var1" | (grep -Ei '\b[A-Z0-9._%+-]+#[A-Z0-9.-]+.[A-Z]{2,4}\b'))
echo $var2

Use Bash parameter expansion,
var2="${var1#*:}"

There's a cruder way:
cat $file | grep # | tr '<>' '\012\012' | grep #
That is, extract the line(s) with # signs, turn the angle brackets into newlines, then grep again for anything left with an # sign.
Refine as needed...

Related

how to remove the extension of multiple files using cut in a shell script?

I'm studying about how to use 'cut'.
#!/bin/bash
for file in *.c; do
name = `$file | cut -d'.' -f1`
gcc $file -o $name
done
What's wrong with the following code?
There are a number of problems on this line:
name = `$file | cut -d'.' -f1`
First, to assign a shell variable, there should be no whitespace around the assignment operator:
name=`$file | cut -d'.' -f1`
Secondly, you want to pass $file to cut as a string, but what you're actually doing is trying to run $file as if it were an executable program (which it may very well be, but that's beside the point). Instead, use echo or shell redirection to pass it:
name=`echo $file | cut -d. -f1`
Or:
name=`cut -d. -f1 <<< $file`
I would actually recommend that you approach it slightly differently. Your solution will break if you get a file like foo.bar.c. Instead, you can use shell expansion to strip off the trailing extension:
name=${file%.c}
Or you can use the basename utility:
name=`basename $file .c`
You should use the command substitution (https://www.gnu.org/software/bash/manual/bashref.html#Command-Substitution) to execute a command in a script.
With this the code will look like this
#!/bin/bash
for file in *.c; do
name=$(echo "$file" | cut -f 1 -d '.')
gcc $file -o $name
done
With the echo will send the $file to the standard output.
Then the pipe will trigger after the command.
The cut command with the . delimiter will split the file name and will keep the first part.
This is assigned to the name variable.
Hope this answer helps

Bash - Piping output of command into while loop

I'm writing a Bash script where I need to look through the output of a command and do certain actions based on that output. For clarity, this command will output a few million lines of text and it may take roughly an hour or so to do so.
Currently, I'm executing the command and piping it into a while loop that reads a line at a time then looks for certain criteria. If that criterion exists, then update a .dat file and reprint the screen. Below is a snippet of the script.
eval "$command"| while read line ; do
if grep -Fq "Specific :: Criterion"; then
#pull the sixth word from the line which will have the data I need
temp=$(echo "$line" | awk '{ printf $6 }')
#sanity check the data
echo "\$line = $line"
echo "\$temp = $temp"
#then push $temp through a case statement that does what I need it to do.
fi
done
So here's the problem, the sanity check on the data is showing weird results. It is printing lines that don't contain the grep criteria.
To make sure that my grep statement is working properly, I grep the log file that contains a record of the text that is output by the command and it outputs only the lines that contain the specified criteria.
I'm still fairly new to Bash so I'm not sure what's going on. Could it be that the command is force feeding the while loop a new $line before it can process the $line that met the grep criteria?
Any ideas would be much appreciated!
How does grep know what line looks like?
if ( printf '%s\n' "$line" | grep -Fq "Specific :: Criterion"); then
But I cant help feel like you are overcomplicating a lot.
function process() {
echo "I can do anything I want"
echo " per element $1"
echo " that I want here"
}
export -f process
$command | grep -F "Specific :: Criterion" | awk '{print $6}' | xargs -I % -n 1 bash -c "process %";
Run the command, filter only matching lines, and pull the sixth element. Then if you need to run an arbitrary code on it, send it to a function (you export to make it visible in subprocesses) via xargs.
What are you applying the grep on ?
Modify
if grep -Fq "Specific :: Criterion"; then
as below
if ( echo $line | grep -Fq "Specific :: Criterion" ); then

Why is my shell command working at the prompt, but not as a bash script?

New to bash scripting. I'm getting pretty familiar with shell scripting pretty well. I wrote this text transform script for a feed for a client. And extracts the url's I want, and the titles of articles. Awesome.
echo $(var=$(curl -L website.com/news)) |
grep -Po '<h3 class="article-link"><a href="\K[^<]+' <<< $var |
result=$(sed 's/"/\n/g' | sed 's/ \//\n\//g' | sed 's/>//g') ; let this=0 ; echo "$result" | while read line ; do if ((this % 2 == 0 )) ; then echo website.com/news$line ; else echo $line ; fi ; let this+=1 ; done
When I try to extract it to a file and run it with bash OR sh myThing.sh, it doesn't work at all. The only thing that echo's is 'webiste.com/news', when I try to echo $this, all I get is 1. What am I doing wrong?
#!/bin/bash
echo $(var=$(curl -L website.com/news)) |
grep -Po '<h3 class="article-link"><a href="\K[^<]+' <<< $var |
result=$(sed 's/"/\n/g' | sed 's/ \//\n\//g' | sed 's/>//g')
let this=0
echo "$result" | while read line
do
if ((this % 2 == 0 ))
then
echo website.com/news$line
else
echo $line
fi
let this+=1
done
edit:
#!/bin/bash
var=$(curl -L linux.com/news)
select=$(grep -Po '<h3 class="article-list__title"><a href="\K[^<]+' <<< $var)
result=$(sed 's/"/\n/g' | sed 's/ \//\n\//g' | sed 's/>//g')
let this=0
echo "$result" | while read line
do
if ((this % 2 == 0 ))
then
echo website.com/news$line
else
echo $line
fi
let this+=1
done
This answer solves the OP's specific problem, but to address the question "Why is my shell command working at the prompt, but not as a bash script?" generally, Etan Reisner provides an excellent answer in the comments:
"You are either not running that exact command or it "works" because you have shell state that is affecting things in ways you take to be "working" and your script doesn't have that state. Try launching an entirely new shell session and see if that command, on its own, works for you there."
echo $(var=...) will assign a value to variable $var, but will not output anything, so the echo command will simply print a newline.
Furthermore, because the assignment to $var happens inside $(...) (a command substitution), it is confined to the subshell that the command inside the substitution ran in, so $var will not be defined in the calling shell.
(A subshell is a child process that contains a duplicate of the current shell's environment, without being able to modify the current shell's environment).
More generally, you cannot meaningfully define variables inside a pipeline - they will neither be visible to other pipeline segments, nor after the pipeline finishes.[1]
The only reason your [original] command could ever have worked is if $var had a preexisting value in your shell.
In fact, given that you provide input to grep via a here-string (<<<), the first segment of your pipeline (echo ...) is entirely ignored.
To pass the output of curl through the pipeline to grep and then to sed, no intermediate variables are needed at all.
Furthermore, your sed command is lacking input: you probably meant to feed it $var in your first attempt, and $select in the 2nd (your 2nd attempt came close to a correct solution).
What you were probably ultimately looking for:
result=$(curl -L website.com/news |
grep -Po '<h3 class="article-link"><a href="\K[^<]+' |
sed 's/"/\n/g' | sed 's/ \//\n\//g' | sed 's/>//g')
# ... processing of "$result"
Some additional notes:
You could combine the 3 sed calls into a single one.
You could feed the pipeline output directly into your while loop, without the need for intermediate variable $result.
You should generally double-quote variable references (e.g., use "$line" instead of $line to protect them from interpretation by the shell (word-splitting, globbing).
let this+=1 is better expressed as (( ++this )) in modern Bash.
This answer of mine contains links to resources for learning about bash.
[1] All commands involved in a pipeline by default run in a subshell in bash, so they all see copies of the parent shell's variables. Bash 4.2+ offers the lastpipe option (off by default) to allow you to create variables in the current shell instead of in a subshell, by running the last pipeline segment (only) in the current shell instead of in a subshell, to facilitate scenarios such as ... | while read -r line ... and have $line continue to exist after the pipeline finishes.
Note that this still doesn't enable defining a variable in an earlier pipeline segment in the hopes that a later segment will see it - this can never work, because the commands that make up a pipeline are launched at the same time, and it is only through coordination of the input and output streams that effective left-to-right processing happens.
This line is totally wrong. You are attempting to pass thru pipes the standard output of each process when none of them ever prints anything except standard error.
echo $(var=$(curl -L website.com/news)) | grep -Po '<h3 class="article-link"><a href="\K[^<]+' <<< $var | result=$(sed 's/"/\n/g' | sed 's/ \//\n\//g' | sed 's/>//g')
I'll break down what I believe you are attempting to do.
echo $(var=$(curl -: website.com/news))
The above code will only print the standard error, which is a separate stream than standard output. The standard output is assigned to $var. However you are attempting to pass the standard output to the next process which is nothing but a newline at this time.
grep -Po '<h3 class="article-link"><a href="\K[^<]+' <<< $var
The here-string <<< takes precedence over pipe. But variable $var is lost as it was defined inside a sub-shell and not in the parent shell. Thanks to #mklement0.
The proper way to accomplish all this is to not use $var. All you wanted is the value stored in $result.
result=$(curl -L website.com/news | grep -Po '<h3 class="article-link"><a href="\K[^<]+'| sed 's/"/\n/g' | sed 's/ \//\n\//g' | sed 's/>//g')
I don't intend to optimize your script. This is more of a suggested solution. A more comprehensive answer to your question Why is my shell command working at the prompt, but not as a bash script? is answered by mklement0 here.

Backticks can't handle pipes in variable

I have a problem with one script in bash with CAT command.
This works:
#!/bin/bash
fil="| grep LSmonitor";
log="/var/log/sys.log ";
lines=`cat $log | grep LSmonitor | wc -l`;
echo $lines;
Output: 139
This does not:
#!/bin/bash
fil="| grep LSmonitor";
log="/var/log/sys.log ";
string="cat $log $fil | wc -l";
echo $string;
`$string`;
Output:
cat /var/log/sys.log | grep LSmonitor | wc -l
cat: opcion invalida -- 'l'
Pruebe 'cat --help' para mas informacion.
$fil is a parameter in this example static, but in real script, parameter is get from html form POST, and if I print I can see that the content of $fil is correct.
In this case, since you're building a pipeline as a string, you would need:
eval "$string"
But DON'T DO THIS!!!! -- someone can easily enter the filter
; rm -rf *
and then you're hosed.
If you want a regex-based filter, get the user to just enter the regex, and then you'll do:
grep "$fil" "$log" | wc -l
Firstly, allow me to say that this sounds like a really bad idea:
[…] in real script, parameter is get from html form POST, […]
You should not be allowing the content of POST requests to be run by your shell. This is a massive attack vector, and whatever mechanisms you have in place to try to protect it are probably not as effective as you think.
Secondly, | inside variables are not treated as special. This isn't specific to backticks. Parameter expansion (e.g., replacing $fil with | grep LSmonitor) happens after the command is parsed and mostly processed. There's a little bit of post-processing that's done on the results of parameter expansion (including "word splitting", which is why $fil is equivalent to the three arguments '|' grep LSmonitor rather than to the single argument '| grep LSmonitor'), but nothing as dramatic as you describe. So, for example, this:
pipe='|'
echo $pipe cat
prints this:
| cat
Since your use-case is so frightening, I'm half-tempted to not explain how you can do what you want — I think you'll be better off not doing this — but since Stack Overflow answers are intended to be useful for more people than just the original poster, an example of how you can do this is below. I encourage the OP not to read on.
fil='| grep LSmonitor'
log=/var/log/sys.log
string="cat $log $fil | wc -l"
lines="$(eval "$string")"
echo "$lines"
Try using eval (taken from https://stackoverflow.com/a/11531431/2687324).
It looks like it's interpreting | as a string, not a pipe, so when it reaches -l, it treats it as if you're trying to pass in -l to cat instead of wc.
The other answers outline why you shouldn't do it this way.
grep LSmonitor /var/log/syslog | wc -l will do what you're looking for.

BASH: can grep on command line, but not in script

Did this million times already, but this time it's not working
I try to grep "$TITLE" from a file. on command line it's working, "$TITLE" variable is not empty, but when i run the script it finds nothing
*title contains more than one word
echo "$TITLE"
cat PAGE.$TEMP.2 | grep "$TITLE"
what i've tried:
echo "cat PAGE.$TEMP.2 | grep $TITLE"
to see if title is not empty and file name is actually there
Are you sure that $TITLE does not have leading or trailing whitespace which is not in the file? Your fix with the string would strip out whitespace before execution, so it would not see it.
For example, with a file containing 'Line one':
/home/user1> TITLE=' one '
/home/user1> grep "$TITLE" text.txt
/home/user1> cat text.txt | grep $TITLE
Line one
Try echo "<$TITLE>", or echo "$TITLE"|od -xc which sould enable you to spot errant chars.
This command
echo "cat PAGE.$TEMP.2 | grep $TITLE"
echoes a string that starts with 'cat'. It does not run a command. You would want
echo "$( cat PAGE.$TEMP.2 | grep $TITLE )"
although that is identical in functionality to the simpler
cat PAGE.$TEMP.2 | grep $TITLE
And as pointed out by others, there is no need to pipe a single file using cat; grep can read from files just fine:
grep "$TITLE" "PAGE.$TEMP.2"
(Your default behavior should be to quote parameter expansions, unless you can show it is incorrect to do so.)
Works for me:
~> cat test.dat
abc
cda
xyz
~> export GRP=cda
~> cat test.dat | grep $GRP
cda
Edit:
Also the proper way to use grep is:
~> grep $GRP test.dat

Resources