How to use awk to print "hello" if pattern is found - linux

I want to search for a pattern in a tab-separated .txt-file and, if the pattern is found in a line, print the third field of that line.
I only need to find the first occurrence in the line, since the pattern is guaranteed to appear only once.
Structure of .txt-file:
XXX01 foo target1
XXX02 bar target2
XXX03 foobar target3
My first idea was to print "hello" if the pattern is found, to check whether my code works. I also included echoes of the variables I pass to my bash script.
Command line call and Script:
$ ./script.sh file.txt foo
#!/bin/bash
file=$1
pattern=$2
awk '/"$pattern"/{print "hello"}' "$file"
echo "$file"
echo "$pattern"
From what I found about awk, to get the third field printed I would have to replace print "hello" with print $3.
But printing "hello" already does not work:
Actual output:
file.txt
foo
Desired output:
hello (respectively target1)
file.txt
foo
I also verified that "foo" is in file.txt.
Progress (see comments and answer please):
#!/bin/bash
awk -v p="$2"'$2=="$p"{print "hello",$3}' "$1"
echo "$1"
echo "$2"
new output:
awk: 1:unexpected character '.'
file.txt
foo

I believe you want something like:
$ ./script.sh file.txt foo
#!/bin/bash
file=$1
pattern=$2
awk -v pattern="$pattern" '$2==pattern{print "hello",$3}' "$file"
echo "$file"
echo "$pattern"
Here we get rid of the loop since awk checks every record when it is fed a file. We also use the -v flag to pass in the $pattern variable into the awk script. Then we check that the second field $2 is pattern and print "hello" as well as the contents of the third field $3.
You could change that awk condition to $2~pattern to use a regex match instead, but I suspect it will print the 1st and 3rd lines, as foo shows up in both. Note that $2~/pattern/ would not work: inside slashes, pattern is the literal text "pattern", not the variable.
If you want to check whether your pattern exists anywhere in the line, match against the whole record instead, so it's just '$0~pattern{print "hello",$3}'.

Look:
$ x="foo"'bar' && echo "$x"
foobar
$ x="foo" 'bar' && echo "$x"
-bash: bar: command not found
Your script is:
awk -v p="$2"'$2=="$p"{print "hello",$3}' "$1"
so guess what not leaving a space between -v p="$2" and '$2=="$p" is doing. Right, it's concatenating them so don't do that - add a space:
awk -v p="$2" '$2=="$p"{print "hello",$3}' "$1"
The unexpected . btw was the . in your file name file.txt: awk was trying to evaluate the string file.txt as its script, because the concatenation had absorbed the actual script into the assignment to p.
Now to actually USE the variable p in the comparison you'd have to use it as a variable instead of putting it inside a string:
awk -v p="$2" '$2==p{print "hello",$3}' "$1"
The above simply answers your question about the syntax error. To actually do what you WANT would require one of these, depending on whether you want a string or regexp match and whether you want partial or full matching:
awk -v p="$2" '$2==p{print "hello",$3}' "$1"
awk -v p="$2" '$2~p{print "hello",$3}' "$1"
awk -v p="$2" '$2~"\\<"p"\\>"{print "hello",$3}' "$1"
or some other solution depending on your so far unstated requirements.
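To make the difference between the string and regexp variants concrete, here is a minimal sketch that rebuilds the sample file from the question in a temporary file. The -F'\t' option is an addition of mine, based on the asker's statement that the file is tab-separated; the variable names are only for illustration.

```bash
#!/bin/bash
# Recreate the three-line, tab-separated sample from the question.
sample=$(mktemp)
printf 'XXX01\tfoo\ttarget1\nXXX02\tbar\ttarget2\nXXX03\tfoobar\ttarget3\n' > "$sample"

# String comparison: field 2 must be exactly "foo".
exact=$(awk -F'\t' -v p="foo" '$2 == p { print "hello", $3 }' "$sample")

# Regexp match: field 2 only has to contain "foo", so "foobar" matches too.
partial=$(awk -F'\t' -v p="foo" '$2 ~ p { print "hello", $3 }' "$sample")

printf 'exact:\n%s\n' "$exact"       # hello target1
printf 'partial:\n%s\n' "$partial"   # hello target1, then hello target3

rm -f -- "$sample"
```

The string comparison is usually the safer default when the search term comes from user input, since regexp metacharacters in it are then not interpreted.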

Related

parsing complex string using shell script

I have been trying all day to find a good way to parse some strings with a shell script. The strings are used as calling parameters for some applications.
They look like this:
parsingParams -c "id=uid5 prog=/opt/bin/example arg=\"-D -t5 >/dev/null 1>&2\" info='fdhff fd'" start
I'm only allowed to use shell script. I tried some sed and cut commands, but nothing works well.
My attempts look like this:
prog=$(echo $# | cut -d= -f3 | sed 's|\s.*$||')
That returns the correct value of prog, but I couldn't find a good way to get the value of arg.
The info parameter is optional, so it may be left out.
Does anyone have a good idea for solving this problem?
Many thanks in advance.
Looks like you could use eval to let the shell parse your input string, but if you don't control the input (if it comes from an unreliable source), that will introduce a major vulnerability (imagine an attacker somehow passes -c "rm -rf /" to your program).
A safer way would be to explicitly specify allowed forms of user input.
The problem you have with splitting on space (with cut) if the space is quoted, can be avoided if you specify valid fields (content, not separator), for example in GNU awk, you can use FPAT:
$ params="id=uid5 prog=/opt/bin/example arg=\"-D -t5 >/dev/null 1>&2\" info='fdhff fd'"
$ awk -v FPAT="[^=]+=(\"[^\"]*\"|'[^']*'|[^ ]*) *" '{for (i=1; i<=NF; i++) print $i}' <<<"$params"
id=uid5
prog=/opt/bin/example
arg="-D -t5 >/dev/null 1>&2"
info='fdhff fd'
Valid fields will be in one of the following forms:
var="val with spaces"
var='val with spaces'
var=val_no_spaces
Now with assignments split (one per line, assuming newline is not allowed in params), you can process them further, even with cut:
$ awk ... | cut -d $'\n' -f3
arg="-D -t5 >/dev/null 1>&2"
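Since the question asks for shell only, the same three field forms can also be split in pure bash without eval or GNU awk. This is a rough sketch, not a full parser: parse_params is a hypothetical helper name, keys are assumed to look like shell identifiers, and escaped quotes inside a quoted value are not handled.

```bash
#!/bin/bash
# parse_params (hypothetical name): print one "key<TAB>value" line per
# key=value pair found in the string passed as $1. Nothing is evaluated.
parse_params() {
    local rest=$1 key val
    # Optional leading whitespace, an identifier, "=", then everything else.
    local key_re='^[[:space:]]*([A-Za-z_][A-Za-z0-9_]*)=(.*)'
    while [[ $rest =~ $key_re ]]; do
        key=${BASH_REMATCH[1]}
        rest=${BASH_REMATCH[2]}
        case $rest in
            \"*\"*)  # var="val with spaces"
                rest=${rest#\"}; val=${rest%%\"*}; rest=${rest#*\"} ;;
            \'*\'*)  # var='val with spaces'
                rest=${rest#\'}; val=${rest%%\'*}; rest=${rest#*\'} ;;
            *)       # var=val_no_spaces
                val=${rest%%[[:space:]]*}; rest=${rest#"$val"} ;;
        esac
        printf '%s\t%s\n' "$key" "$val"
    done
}

parse_params "id=uid5 prog=/opt/bin/example arg=\"-D -t5 >/dev/null 1>&2\" info='fdhff fd'"
```

Because the values are only printed, never executed, a hostile input string cannot run commands the way it could with eval.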
eval
$ eval "id=uid5 prog=/opt/bin/example arg=\"-D -t5 >/dev/null 1>&2\" info='fdhff fd'"
$ echo $id
uid5
$ echo $prog
/opt/bin/example
$ echo $arg
-D -t5 >/dev/null 1>&2
$ echo $info
fdhff fd

Unable to delete a line in file in shell script

I have to delete a line in a file from inside a shell script.
I am trying this:
linenumber=0
##CHeck If server IP exists
if grep -wq $serverip $FILE; then
echo "IP exists"
linenumber=$(awk -v serverip="$serverip" '$0 ~ serverip {print NR}' $FILE)
echo "$linenumber"
sed -e '${$linenumber}d' $FILE
fi
Basically I extract the line number and then want to delete it.
sed -e '1d' $FILE --> works on the CLI, but does not work inside the script
Why? How to get it working ?
This is simply a case of using the incorrect quotes around your sed command, so the variable isn't being used. Ignoring the fact that you're unnecessarily using 3 tools when 1 would suffice, the fix is this:
sed -e "${linenumber}d" "$FILE"
Perhaps your requirement is more complex than it appears but I would suggest changing your entire script to this:
awk -v serverip="$serverip" '!($0 ~ serverip)' "$FILE"
This prints every line that doesn't contain the shell variable $serverip. It is assumed that you have escaped any regex meta-characters present in the variable.
Alternatively (and more succinctly):
sed "/$serverip/d" "$FILE"
If you actually want the messages to be printed out (I assumed that they were for debugging), then that's easy enough to achieve:
awk -v serverip="$serverip" '$0 ~ serverip { print "IP exists"; print NR; next } 1' "$FILE"
If you're not familiar with the 1 at the end, it's just a common shorthand which causes awk to print each line (1 is always true and the default action is { print }).
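Note that the commands above print the filtered result and treat the IP as a regular expression, in which every dot matches any character. If the file itself should be changed and the IP matched literally, something along these lines could work. It is only a sketch: remove_ip is a hypothetical helper name, and going through a temporary file is my assumption about how the rewrite should happen.

```bash
#!/bin/bash
# remove_ip (hypothetical name): delete every line of FILE that contains IP
# as a whole word, treating IP as a fixed string rather than a regexp.
remove_ip() {
    local ip=$1 file=$2 tmp status
    tmp=$(mktemp) || return 1
    # -v: keep non-matching lines, -w: whole words only, -F: fixed string
    grep -vwF -- "$ip" "$file" > "$tmp"
    status=$?
    # grep exits with 1 when it selects no lines (every line was removed);
    # only a status above 1 signals a real error.
    if (( status > 1 )); then
        rm -f -- "$tmp"
        return "$status"
    fi
    # Overwrite in place so the original file keeps its permissions.
    cat -- "$tmp" > "$file"
    rm -f -- "$tmp"
}
```

It would be called as remove_ip "$serverip" "$FILE". With -w, an entry such as 10.0.0.10 is left alone when 10.0.0.1 is removed.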

Why am I getting command not found error on numeric comparison?

I am trying to parse each line of a file and look for a particular string. The script seems to be doing its intended job, however, in parallel it tries to execute the if command on line 6:
#!/bin/bash
for line in $(cat $1)
do
echo $line | grep -e "Oct/2015"
if($?==0); then
echo "current line is: $line"
fi
done
and I get the following (my script is readlines.sh)
./readlines.sh: line 6: 0==0: command not found
First: As Mr. Llama says, you need more spaces. Right now your script tries to look for a file named something like /usr/bin/0==0 to run. Instead:
[ "$?" -eq 0 ] # POSIX-compliant numeric comparison
[ "$?" = 0 ] # POSIX-compliant string comparison
(( $? == 0 )) # bash-extended numeric comparison
Second: Don't test $? at all in this case. In fact, you don't even have good cause to use grep; the following is both more efficient (because it uses only functionality built into bash and requires no invocation of external commands) and more readable:
if [[ $line = *"Oct/2015"* ]]; then
echo "Current line is: $line"
fi
If you really do need to use grep, write it like so:
if echo "$line" | grep -q "Oct/2015"; then
echo "Current line is: $line"
fi
That way if operates directly on the pipeline's exit status, rather than running a second command testing $? and operating on that command's exit status.
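Putting the built-in pattern test into a complete loop might look like the sketch below. The function name print_matching_lines and the while read loop around the test are my additions for illustration; the search string is passed in rather than hard-coded.

```bash
#!/bin/bash
# print_matching_lines (hypothetical name): report every line of FILE
# that contains NEEDLE as a literal substring, using only bash built-ins.
print_matching_lines() {
    local needle=$1 file=$2 line
    # IFS= keeps leading/trailing whitespace, -r keeps backslashes intact;
    # the || clause also handles a last line without a trailing newline.
    while IFS= read -r line || [[ -n $line ]]; do
        # Quoting "$needle" makes it match literally, not as a glob pattern.
        if [[ $line = *"$needle"* ]]; then
            printf 'current line is: %s\n' "$line"
        fi
    done < "$file"
}
```

For the script in the question this would be called as print_matching_lines 'Oct/2015' "$1".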
@Charles Duffy has a good answer which I have up-voted as correct (and it is), but here's a detailed, line by line breakdown of your script and the correct thing to do for each part of it.
for line in $(cat $1)
As I noted in my comment elsewhere this should be done as a while read construct instead of a for cat construct.
This construct will wordsplit each line making spaces in the file separate "lines" in the output.
All empty lines will be skipped.
In addition when you cat $1 the variable should be quoted. If it is not quoted spaces and other less-usual characters appearing in the file name will cause the cat to fail and the loop will not process the file.
The complete line would read:
while IFS= read -r line
An illustrative example of the tradeoffs can be found here. The linked test script follows. I tried to include an indication of why IFS= and -r are important.
#!/bin/bash
mkdir -p /tmp/testcase
pushd /tmp/testcase >/dev/null
printf '%s\n' '' two 'three three' '' ' five with leading spaces' 'c:\some\dos\path' '' > testfile
printf '\nwc -l testfile:\n'
wc -l testfile
printf '\n\nfor line in $(cat) ... \n\n'
let n=1
for line in $(cat testfile) ; do
echo line $n: "$line"
let n++
done
printf '\n\nfor line in "$(cat)" ... \n\n'
let n=1
for line in "$(cat testfile)" ; do
echo line $n: "$line"
let n++
done
let n=1
printf '\n\nwhile read ... \n\n'
while read line ; do
echo line $n: "$line"
let n++
done < testfile
printf '\n\nwhile IFS= read ... \n\n'
let n=1
while IFS= read line ; do
echo line $n: "$line"
let n++
done < testfile
printf '\n\nwhile IFS= read -r ... \n\n'
let n=1
while IFS= read -r line ; do
echo line $n: "$line"
let n++
done < testfile
rm -- testfile
popd >/dev/null
rmdir /tmp/testcase
Note that this is a bash-heavy example. let, pushd and popd are not portable to other shells; read -r, on the other hand, is specified by POSIX. On to the next line of your script.
do
As a matter of style I prefer do on the same line as the for or while declaration, but there's no convention on this.
echo $line | grep -e "Oct/2015"
The variable $line should be quoted here. In general, meaning always unless you specifically know better, you should double-quote all expansion--and that means subshells as well as variables. This insulates you from most unexpected shell weirdness.
You declared your shell as bash, which means you will have the "here string" operator <<< available to you. When available it can be used to avoid the pipe; each element of a pipeline executes in a subshell, which incurs extra overhead and can lead to unexpected behavior if you try to modify variables. This would be written as
grep -e "Oct/2015" <<<"$line"
Note that I have quoted the line expansion.
You have called grep with -e, which is not incorrect but is needless since your pattern does not begin with -. In addition you have full-quoted a string in shell but you don't attempt to expand a variable or use other shell interpolation inside of it. When you don't expect and don't want the contents of a quoted string to be treated as special by the shell you should single quote them. Furthermore, your use of grep is inefficient: because your pattern is a fixed string and not a regular expression you could have used fgrep or grep -F, which does string contains rather than regular expression matching (and is far faster because of this). So this could be
grep -F 'Oct/2015' <<<"$line"
Without altering the behavior.
if($?==0); then
This is the source of your original problem. In shell scripts, words are separated by whitespace and by metacharacters such as ( and ). When you say if($?==0), the $? expands, probably to 0, and bash parses the parentheses as a subshell in which it tries to execute a command called 0==0. That is a legal command name, but no such command exists, hence the "0==0: command not found" error. What you wanted to do was invoke the if command and give it a proper test to run, which requires more whitespace. I believe others have covered this sufficiently.
You should never need to test the value of $? in a shell script. The if command exists for branching behavior based on the return code of whatever command you pass to it, so you can inline your grep call and have if check its return code directly, thus:
if grep -F 'Oct/2015' <<<"$line" ; then
Note the generous whitespace around the ; delimiter. I do this because in shell whitespace is usually required and can only sometimes be omitted. Rather than try to remember when you can do which, I recommend an extra space of padding around everything. It's never wrong and can make other mistakes easier to notice.
As others have noted this grep will print matched lines to stdout, which is probably not something you want. If you are using GNU grep, which is standard on Linux, you will have the -q switch available to you. This will suppress the output from grep
if grep -q -F 'Oct/2015' <<<"$line" ; then
If you are trying to be strictly standards compliant or are in any environment with a grep that doesn't know -q, the standard way to achieve this effect is to redirect stdout to /dev/null
if printf '%s\n' "$line" | grep -F 'Oct/2015' >/dev/null ; then
In this example I also removed the here string bashism just to show a portable version of this line.
echo "current line is: $line"
There is nothing wrong with this line of your script, except that although echo is standard implementations vary to such an extent that it's not possible to absolutely rely on its behavior. You can use printf anywhere you would use echo and you can be fairly confident of what it will print. Even printf has some caveats: Some uncommon escape sequences are not evenly supported. See mascheck for details.
printf 'current line is: %s\n' "$line"
Note the explicit newline at the end; printf doesn't add one automatically.
fi
No comment on this line.
done
In the case where you did as I recommended and replaced the for line with a while read construct this line would change to:
done < "$1"
This directs the contents of the file in the $1 variable to the stdin of the while loop, which in turn passes the data to read.
In the interests of clarity I recommend copying the value from $1 into another variable first. That way when you read this line the purpose is more clear.
I hope no one takes great offense at the stylistic choices made above, which I have attempted to note; there are many ways to do this (but not a great many correct ways).
Be sure to always run interesting snippets through the excellent shellcheck and explain shell when you run into difficulties like this in the future.
And finally, here's everything put together:
#!/bin/bash
input_file="$1"
while IFS= read -r line ; do
if grep -q -F 'Oct/2015' <<<"$line" ; then
printf 'current line is: %s\n' "$line"
fi
done < "$input_file"
If you like one-liners, you may use AND operator (&&), for example:
echo "$line" | grep -e "Oct/2015" && echo "current line is: $line"
or:
grep -qe "Oct/2015" <<<"$line" && echo "current line is: $line"
Spacing is important in shell scripting.
Also, double-parens is for numerical comparison, not single-parens.
if (( $? == 0 )); then

If condition giving error in shell script when checking two strings

In following shell script I want to perform two different tasks depending on file type,
but it is giving an error: "[==c]: command not found"
echo "enter file name"
read num
var_check= echo $str |awk -F . '{if (NF>1) {print $NF}}'
if ["$var_check"=="c"];then
echo "Some task for c"
elif ["$var_check"=="cpp"];then
echo "Some task for cpp"
else
echo "Wrong file extension"
fi
You wrote:
if ["$var_check"=="c"];then
The [ command is a command; its name must be surrounded by spaces (put simplistically).
if [ "$var_check" == "c" ]; then
The last argument, ], must also be preceded by a space. The operands within must also be space separated; they need to be separate arguments. The rules for the [[ ... ]] operator are a bit different, but using spaces helps people read the code even there. What you wrote is a bit like expecting:
ls"-l"/dev/tty
to work; it won't.
You also need to double check whether your test or [ operator supports ==; the normal form is =.
The line:
var_check= echo $str |awk -F . '{if (NF>1) {print $NF}}'
This runs the echo command with var_check set as an environment variable, which is unlikely to be what you wanted. You almost certainly intended to write:
var_check=$(echo $str |awk -F . '{if (NF>1) {print $NF}}')
This runs the echo and awk commands and captures the output in var_check. Use the $(...) notation in preference to the older but more complex to use `...` notation. In simple cases, they look the same; when you nest them, the $(...) notation is far, far simpler to understand and use.
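A small sketch of the nesting difference mentioned above. The path /a/b/c.txt is a made-up example and is not taken from the question.

```bash
#!/bin/bash
path=/a/b/c.txt

# $(...) nests without any escaping, and inner quotes stay independent.
nested_new=$(basename "$(dirname "$path")")

# With backticks, every inner backtick has to be backslash-escaped.
nested_old=`basename \`dirname $path\``

printf '%s %s\n' "$nested_new" "$nested_old"   # b b
```

Both forms produce the same result here; the backtick version simply becomes harder to read and to quote correctly with each additional level.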
Also, looking on the larger scale (3 lines instead of just 1 line):
echo "enter file name"
read num
var_check=$(echo $str |awk -F . '{if (NF>1) {print $NF}}')
You read the file name into variable num; you then echo $str instead of $num. If you've already got $str set somewhere earlier in the script (in unshown code), what you've got may be fine. Taken as a standalone fragment, it isn't right.
You could also simplify the awk a little:
var_check=$(echo $str |awk -F . 'NF > 1 {print $NF}')
This would work the same as what you wrote, but uses fewer parentheses and braces.
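As an alternative sketch, the extension can be taken with parameter expansion and dispatched with case, which avoids echo, awk and the [ spacing pitfalls altogether. The function name classify_file is hypothetical; the messages are the ones from the question.

```bash
#!/bin/bash
# classify_file (hypothetical name): pick a task based on the text after
# the last dot in the file name passed as $1.
classify_file() {
    local name=$1 ext
    case $name in
        *.*) ext=${name##*.} ;;   # strip everything up to the last dot
        *)   ext= ;;              # no dot at all: no extension
    esac
    case $ext in
        c)   echo "Some task for c" ;;
        cpp) echo "Some task for cpp" ;;
        *)   echo "Wrong file extension" ;;
    esac
}

classify_file main.c      # Some task for c
classify_file widget.cpp  # Some task for cpp
classify_file Makefile    # Wrong file extension
```

Like the awk version, this only treats the part after the last dot as the extension, and a name without any dot yields no extension.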

Bash matching binary pattern

I want to check inside a file if it matches a binary pattern.
For that, I'm using clamAV signature database
Trojan.Bancos-166:1:*:3d415d736715ab5ee347238cacac61c7123fe35427224d25253c7b035558baf19e54e8d1a82742d6a7b37afc6d91015f751de1102d0a31e66ec33b74034b1ab471cc1381884dfdf0bb3e4233bd075fef235f342302ffd72ecabfa5aedf1b3dc99b3348346db4d9001026aef44c592fee61493f7262ad2bd1bce8a7ce60d81022533f6473ae184935f25cf6cc07c3aebfdf70a5a09139
I wrote this to retrieve the hex string representation of the signature:
signature=$(echo "$line" |awk -F':' '{ print $4 }')
Then I convert the hex string to binary:
printf -v variable $(sed 's/\(..\)/\\x\1/g;' <<< "$signature")
Up to this point it works perfectly.
Finally, I would like to check whether my file ($raw_file_path) matches my binary pattern (now in $variable).
I tried this:
test_var=$(grep -qU "$variable" "$raw_file_path")
or
test_var=$(grep -qU --regexp="$variable" "$raw_file_path")
I don't know why it doesn't work; grep doesn't match anything.
And sometimes some errors:
grep: Trailing backslash
grep: Invalid regular expression
I know this has to do with how the pattern is being interpreted.
In my test I don't want to use regular expressions.
Any ideas, or suggestions for another bash tool, are welcome.
Thanks.
You are currently using the --quiet option for grep by specifying q in -qU. This prevents grep from printing anything to stdout, therefore nothing will be saved to test_var.
Change your code to:
test_var=$(grep -UE "$variable" "$raw_file_path")
First the extra sub-shell can be avoided:
#!/bin/bash
signature="Trojan.Bancos-166:1:*:3d415d736715ab5ee347238cacac61c7123fe35427224d25253c7b035558baf19e54e8d1a82742d6a7b37afc6d91015f751de1102d0a31e66ec33b74034b1ab471cc1381884dfdf0bb3e4233bd075fef235f342302ffd72ecabfa5aedf1b3dc99b3348346db4d9001026aef44c592fee61493f7262ad2bd1bce8a7ce60d81022533f6473ae184935f25cf6cc07c3aebfdf70a5a09139"
variable=$(echo "${signature//*:/}" | sed 's/\(..\)/\\x\1/g;')
Require only confirmation of a match:
if grep -qU "$variable" "$raw_file_path"; then
# matches
fi
Or require the result for further processing:
test_var=$(grep -U "$variable" "$raw_file_path")
# contents of match in test_var
When capturing the output in a variable, remember that grep's -q option suppresses stdout, so the variable would stay empty.
Edit
Tested working example
> signature="Trojan.Bancos-166:1:All_text before-the last : should be trimed:3d415d736715ab5ee347238cacac61c7123fe35427224d25253c7b035558baf19e54e8d1a82742d6a7b37afc6d91015f751de1102d0a31e66ec33b74034b1ab471cc1381884dfdf0bb3e4233bd075fef235f342302ffd72ecabfa5aedf1b3dc99b3348346db4d9001026aef44c592fee61493f7262ad2bd1bce8a7ce60d81022533f6473ae184935f25cf6cc07c3aebfdf70a5a09139" \
> hex_string=$( echo "${signature//*:/}" | sed 's/\(..\)/\\x\1/g;' ) \
> echo "$hex_string"
\x3d\x41\x5d\x73\x67\x15\xab\x5e\xe3\x47\x23\x8c\xac\xac\x61\xc7\x12\x3f\xe3\x54\x27\x22\x4d\x25\x25\x3c\x7b\x03\x55\x58\xba\xf1\x9e\x54\xe8\xd1\xa8\x27\x42\xd6\xa7\xb3\x7a\xfc\x6d\x91\x01\x5f\x75\x1d\xe1\x10\x2d\x0a\x31\xe6\x6e\xc3\x3b\x74\x03\x4b\x1a\xb4\x71\xcc\x13\x81\x88\x4d\xfd\xf0\xbb\x3e\x42\x33\xbd\x07\x5f\xef\x23\x5f\x34\x23\x02\xff\xd7\x2e\xca\xbf\xa5\xae\xdf\x1b\x3d\xc9\x9b\x33\x48\x34\x6d\xb4\xd9\x00\x10\x26\xae\xf4\x4c\x59\x2f\xee\x61\x49\x3f\x72\x62\xad\x2b\xd1\xbc\xe8\xa7\xce\x60\xd8\x10\x22\x53\x3f\x64\x73\xae\x18\x49\x35\xf2\x5c\xf6\xcc\x07\xc3\xae\xbf\xdf\x70\xa5\xa0\x91\x39
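One caveat with holding the raw bytes in a shell variable is that bash variables cannot store NUL bytes, and this signature contains 00. A different technique, sketched below, is to compare in hex space instead: dump the file as hex bytes with od and search that text for the signature with a fixed-string grep. This is a swapped-in approach, not the method from the answers above. hex_match is a hypothetical helper name, and the sketch assumes a plain hex signature without ClamAV wildcards such as * or ??.

```bash
#!/bin/bash
# hex_match (hypothetical name): succeed if FILE contains the byte
# sequence given as a plain hex string (no wildcards).
hex_match() {
    local signature=$1 file=$2 needle
    # "3d415d" -> " 3d 41 5d": one space before every byte, lowercase.
    needle=$(printf '%s' "$signature" | tr 'A-F' 'a-f' | sed 's/../ &/g')
    # Dump the file as space-separated hex bytes on a single line. The
    # space before each byte keeps matches aligned to byte boundaries.
    od -An -v -tx1 -- "$file" | tr -s '\n' ' ' | grep -qF -- "$needle"
}
```

It could be used as if hex_match "$signature" "$raw_file_path"; then ... fi. The whole dump is processed as one long line, so this is better suited to small and medium files than to very large ones.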

Resources