Bash matching binary pattern - linux

I want to check whether a file contains a binary pattern.
For that, I'm using the ClamAV signature database:
Trojan.Bancos-166:1:*:3d415d736715ab5ee347238cacac61c7123fe35427224d25253c7b035558baf19e54e8d1a82742d6a7b37afc6d91015f751de1102d0a31e66ec33b74034b1ab471cc1381884dfdf0bb3e4233bd075fef235f342302ffd72ecabfa5aedf1b3dc99b3348346db4d9001026aef44c592fee61493f7262ad2bd1bce8a7ce60d81022533f6473ae184935f25cf6cc07c3aebfdf70a5a09139
I wrote this to extract the hex string representation of the signature:
signature=$(echo "$line" |awk -F':' '{ print $4 }')
Then I convert the hex string to binary:
printf -v variable $(sed 's/\(..\)/\\x\1/g;' <<< "$signature")
Up to here it works perfectly.
Finally, I would like to check whether my file ( *$raw_file_path* ) matches my binary pattern (now in $variable).
I tried this:
test_var=$(grep -qU "$variable" "$raw_file_path")
or
test_var=$(grep -qU --regexp="$variable" "$raw_file_path")
I don't know why it doesn't work; grep doesn't match anything.
And sometimes some errors:
grep: Trailing backslash
grep: Invalid regular expression
I know it has to do with pattern matching.
In my test I don't want to use regular expressions.
I'd welcome any idea, or another bash tool.
Thanks.

You are currently using the --quiet option for grep by specifying q in -qU. This prevents grep from printing anything to stdout, therefore nothing will be saved to test_var.
Change your code to:
test_var=$(grep -UE "$variable" "$raw_file_path")

First the extra sub-shell can be avoided:
#!/bin/bash
signature="Trojan.Bancos-166:1:*:3d415d736715ab5ee347238cacac61c7123fe35427224d25253c7b035558baf19e54e8d1a82742d6a7b37afc6d91015f751de1102d0a31e66ec33b74034b1ab471cc1381884dfdf0bb3e4233bd075fef235f342302ffd72ecabfa5aedf1b3dc99b3348346db4d9001026aef44c592fee61493f7262ad2bd1bce8a7ce60d81022533f6473ae184935f25cf6cc07c3aebfdf70a5a09139"
variable=$(echo "${signature//*:/}" | sed 's/\(..\)/\\x\1/g;')
Require only confirmation of a match:
if grep -qU "$variable" "$raw_file_path"; then
# matches
fi
Or require the result for further processing:
test_var=$(grep -U "$variable" "$raw_file_path")
# contents of match in test_var
When capturing the output in a variable, note that grep's -q option suppresses stdout, so the variable stays empty.
Edit
Tested working example
> signature="Trojan.Bancos-166:1:All_text before-the last : should be trimed:3d415d736715ab5ee347238cacac61c7123fe35427224d25253c7b035558baf19e54e8d1a82742d6a7b37afc6d91015f751de1102d0a31e66ec33b74034b1ab471cc1381884dfdf0bb3e4233bd075fef235f342302ffd72ecabfa5aedf1b3dc99b3348346db4d9001026aef44c592fee61493f7262ad2bd1bce8a7ce60d81022533f6473ae184935f25cf6cc07c3aebfdf70a5a09139"
> hex_string=$( echo "${signature//*:/}" | sed 's/\(..\)/\\x\1/g;' )
> echo "$hex_string"
\x3d\x41\x5d\x73\x67\x15\xab\x5e\xe3\x47\x23\x8c\xac\xac\x61\xc7\x12\x3f\xe3\x54\x27\x22\x4d\x25\x25\x3c\x7b\x03\x55\x58\xba\xf1\x9e\x54\xe8\xd1\xa8\x27\x42\xd6\xa7\xb3\x7a\xfc\x6d\x91\x01\x5f\x75\x1d\xe1\x10\x2d\x0a\x31\xe6\x6e\xc3\x3b\x74\x03\x4b\x1a\xb4\x71\xcc\x13\x81\x88\x4d\xfd\xf0\xbb\x3e\x42\x33\xbd\x07\x5f\xef\x23\x5f\x34\x23\x02\xff\xd7\x2e\xca\xbf\xa5\xae\xdf\x1b\x3d\xc9\x9b\x33\x48\x34\x6d\xb4\xd9\x00\x10\x26\xae\xf4\x4c\x59\x2f\xee\x61\x49\x3f\x72\x62\xad\x2b\xd1\xbc\xe8\xa7\xce\x60\xd8\x10\x22\x53\x3f\x64\x73\xae\x18\x49\x35\xf2\x5c\xf6\xcc\x07\xc3\xae\xbf\xdf\x70\xa5\xa0\x91\x39
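One caveat with any approach that stores the raw bytes in a shell variable: bash variables cannot hold NUL bytes, and this signature contains one (the 00 in ...d9001026...). A possible alternative, sketched below, is to leave the signature as hex and compare it against a hex dump of the file instead. The function name file_contains_hex and the shortened signature line are made up for illustration, and the sketch assumes a plain hex signature with no ClamAV wildcards. Each byte is kept space-separated so a match cannot start in the middle of a byte.

```bash
#!/usr/bin/env bash
# file_contains_hex is a hypothetical helper, not part of the thread above.
# usage: file_contains_hex <hex signature> <file>
file_contains_hex() {
  local needle
  # "3d415d" -> " 3d 41 5d" (lower-cased, one space before every byte)
  needle=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/../ &/g')
  # Dump the file as space-separated hex bytes on one line, then do a
  # fixed-string search; the exit status is grep's.
  od -An -v -tx1 "$2" | tr '\n' ' ' | tr -s ' ' | grep -qF -- "$needle"
}

# Demo with a made-up, shortened signature line and a throwaway file
line='Trojan.Example:1:*:001026aef4'
signature=${line##*:}                  # keep only the text after the last colon
sample=$(mktemp)
printf 'junk\000\020\046\256\364more' > "$sample"   # bytes 00 10 26 ae f4 inside
if file_contains_hex "$signature" "$sample"; then
  echo "match"
else
  echo "no match"
fi
rm -f "$sample"
```

For large files the whole dump becomes one very long line, so this is only a starting point rather than a replacement for a real scanner.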

Related

Is it possible to retrieve one string between 2 special characters from text file using bash?

Let's say I have the following text file
test.txt
ABC_01:Testing-ABCDEFG
If I want to retrieve the string after colon, I will be using
awk -F ":" '/ABC_01/{print $NF}' test.txt
which will return Testing-ABCDEFG
But what should I do if I only want to retrieve the string after the colon and before the hyphen?
You are so close. That is where split() comes in, e.g.
awk -F: '/ABC_01/{ split($NF,arr,"-"); print arr[1] }' test.txt
Which will output
Testing
The GNU Awk User's Guide - String Manipulation Functions provides the details on split(). Give it a try and let me know if you have any further questions.
Using Bash's built-in extended regex engine:
#!/usr/bin/env bash
while read -r; do
[[ $REPLY =~ :(.*)- ]] || :
echo "${BASH_REMATCH[1]}"
done
Using standard POSIX shell IFS field separators:
#!/usr/bin/env sh
while IFS=':-' read -r _ m _; do
echo "$m"
done
Using (GNU) grep and look-around:
$ grep -oP '(?<=:)[^-]*(?=-)' file
Testing
Explained:
grep    GNU grep supports PCRE and look-around
-o      Print only the matched (non-empty) parts of a matching line
-P      Interpret PATTERNS as Perl-compatible regular expressions
(?<=:)  positive look-behind, i.e. preceded by a colon
[^-]*   anything but a hyphen
(?=-)   positive look-ahead, i.e. followed by a hyphen
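For a single string already held in a shell variable, plain parameter expansion may be enough, with no external tool at all. A minimal sketch, assuming the text of interest sits between the first colon and the first hyphen that follows it (the function name is made up for illustration):

```bash
#!/usr/bin/env bash
# between_colon_and_hyphen is a hypothetical helper name.
between_colon_and_hyphen() {
  local s=${1#*:}            # drop everything up to and including the first colon
  printf '%s\n' "${s%%-*}"   # drop everything from the first hyphen onward
}

between_colon_and_hyphen 'ABC_01:Testing-ABCDEFG'   # prints: Testing
```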

parsing complex string using shell script

I've spent the whole day trying to find a good way to parse some strings with a shell script. The strings are used as call parameters for some applications.
They look like:
parsingParams -c "id=uid5 prog=/opt/bin/example arg=\"-D -t5 >/dev/null 1>&2\" info='fdhff fd'" start
I'm only allowed to use shell script. I tried some sed and cut commands, but nothing works well.
My attempts look like:
prog=$(echo $# | cut -d= -f3 | sed 's|\s.*$||')
That returns the correct value of prog, but I couldn't find a good way to get the value of arg.
The info parameter is optional, so it may be left out.
Does anyone have a good idea for solving this problem?
Many thanks in advance.
Looks like you could use eval to let the shell parse your input string, but if you don't control the input (if it comes from an unreliable source), that will introduce a major vulnerability (imagine an attacker somehow passes -c "rm -rf /" to your program).
A safer way would be to explicitly specify allowed forms of user input.
The problem you have with splitting on space (with cut) if the space is quoted, can be avoided if you specify valid fields (content, not separator), for example in GNU awk, you can use FPAT:
$ params="id=uid5 prog=/opt/bin/example arg=\"-D -t5 >/dev/null 1>&2\" info='fdhff fd'"
$ awk -v FPAT="[^=]+=(\"[^\"]*\"|'[^']*'|[^ ]*) *" '{for (i=1; i<=NF; i++) print $i}' <<<"$params"
id=uid5
prog=/opt/bin/example
arg="-D -t5 >/dev/null 1>&2"
info='fdhff fd'
Valid fields will be in one of the following forms:
var="val with spaces"
var='val with spaces'
var=val_no_spaces
Now with assignments split (one per line, assuming newline is not allowed in params), you can process them further, even with cut:
$ awk ... | cut -d $'\n' -f3
arg="-D -t5 >/dev/null 1>&2"
With eval (only if you fully trust the input):
$ eval "id=uid5 prog=/opt/bin/example arg=\"-D -t5 >/dev/null 1>&2\" info='fdhff fd'"
$ echo $id
uid5
$ echo $prog
/opt/bin/example
$ echo $arg
-D -t5 >/dev/null 1>&2
$ echo $info
fdhff fd
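If eval is too risky and GNU awk is not available, the same "only accept known forms" idea can be sketched in pure bash with a regular expression. This is an illustration under assumptions, not a hardened parser: the helper name parse_params is made up, keys must look like shell identifiers, and a value must be double-quoted, single-quoted, or a bare word without quotes or spaces. Parsing stops at the first thing that does not fit.

```bash
#!/usr/bin/env bash
# parse_params is a hypothetical helper, not part of the thread above.
# It prints one "key=value" line per recognised assignment, quotes removed.
parse_params() {
  local rest=$1 dq='"' sq="'"
  # key = ( "double quoted" | 'single quoted' | bare word ), followed by
  # whitespace plus the remainder, or by the end of the string
  local re="^[[:space:]]*([A-Za-z_][A-Za-z0-9_]*)=(${dq}([^${dq}]*)${dq}|${sq}([^${sq}]*)${sq}|([^[:space:]${dq}${sq}]*))([[:space:]].*)?\$"
  while [[ $rest =~ $re ]]; do
    # At most one of groups 3, 4 and 5 is non-empty for any given match.
    printf '%s=%s\n' "${BASH_REMATCH[1]}" \
      "${BASH_REMATCH[3]}${BASH_REMATCH[4]}${BASH_REMATCH[5]}"
    rest=${BASH_REMATCH[6]}
  done
}

params="id=uid5 prog=/opt/bin/example arg=\"-D -t5 >/dev/null 1>&2\" info='fdhff fd'"
parse_params "$params"
```

Nothing in the input is ever executed, so a value such as $(rm -rf /) would simply be printed as text.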

How to search with grep exactly string in a file via shell linux?

I have a file, the content of file has a string like this:
'/ad/e','#'.base64_decode("ZXZhbA==").'($zad)', 'add'
I want to check whether the file contains this string, but when I use grep it always returns false. I tried several ways:
grep "'/ad/e','#'.base64_decode("ZXZhbA==").'($zad)', 'add'" foo.txt
grep "'/ad/e','#'\.base64_decode\("ZXZhbA\=\="\)\.'\(\$zad\)', 'add'" foo.txt
str="'/ad/e','#'\.base64_decode\("ZXZhbA\=\="\)\.'\(\$zad\)', 'add'"
grep "$str" foo.txt
Can you help me? Maybe, another command line.
This is my case:
while read str; do
if [ ! -z "$str" ]; then
if grep -Fxq "$str" "$file_path"; then
: # do something
fi
fi
done < <(cat /usr/local/caotoc/db.dat)
Thank you so much!
First, you need to make sure the string is quoted properly. This is a bit of an art form, since your string contains both single and double quotes.
One thought would be to use read and a here-document to avoid having to escape anything.
Second, you need to use -F to perform exact string matching instead of more general regular-expression matching.
IFS= read -r str <<'EOF'
'/ad/e','#'.base64_decode("ZXZhbA==").'($zad)', 'add'
EOF
grep -F "$str" foo.txt
Based on the update, you can use a simple loop to read them one at a time.
while IFS= read -r str; do
grep -F "$str" foo.txt
done < /usr/local/caotoc/db.dat
You may be able to simply use the -f option to grep, which will cause grep to output lines from foo.txt that match any line from db.dat.
grep -f /usr/local/caotoc/db.dat -F foo.txt
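One thing to watch with the -f variant: an empty line in the pattern file is an empty pattern, which matches every line of the input. A loop makes it easy to skip such lines and to report which entry matched. The sketch below uses a made-up helper name (report_matches) and throwaway files in place of db.dat and foo.txt; the `--` keeps grep from treating a pattern that starts with a dash as an option.

```bash
#!/usr/bin/env bash
# report_matches is a hypothetical helper: it prints every non-empty line of
# the pattern file ($1) that occurs literally somewhere in the target ($2).
report_matches() {
  local db=$1 target=$2 str
  while IFS= read -r str || [ -n "$str" ]; do
    [ -n "$str" ] || continue            # skip blank lines
    if grep -Fq -- "$str" "$target"; then
      printf 'found: %s\n' "$str"
    fi
  done < "$db"
}

# Demo with throwaway files standing in for db.dat and foo.txt
db=$(mktemp)
target=$(mktemp)
cat > "$db" <<'EOF'
'/ad/e','#'.base64_decode("ZXZhbA==").'($zad)', 'add'

not present
EOF
cat > "$target" <<'EOF'
<?php preg_replace('/ad/e','#'.base64_decode("ZXZhbA==").'($zad)', 'add'); ?>
EOF
report_matches "$db" "$target"
rm -f "$db" "$target"
```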
Instead of trying to work around regexes, the simplest way is to turn off regular expressions using the -F (or --fixed-strings) option, which makes grep act like a simple string search:
-F, --fixed-strings PATTERN is a set of newline-separated strings
like this:
grep -F "'/ad/e','#'.base64_decode(\"ZXZhbA==\").'(\$zad)', 'add'" test
Note: because the pattern is inside double quotes, the shell still requires you to escape:
double quotes
the dollar sign, or else $zad is expanded as a shell variable

Bash Script - Nested $(..) Commands - Not working correctly

I was trying to do these few operations/commands on a single line and assign it to a variable. I have it working about 90% of the way except for one part of it.
I was unaware you could do this, but I read that you can nest $(..) inside other $(..).... So I was trying to do that to get this working, but can't seem to get it the rest of the way.
So basically, what I want to do is:
1. Take the output of a file and assign it to a variable
2. Then pre-pend some text to the start of that output
3. Then append some text to the end of the output
4. And finally remove newlines and replace them with "\n" character...
I can do this just fine in multiple steps but I would like to try and get this working this way.
So far I have tried the following:
My 1st attempt, before reading about nested $(..):
MY_VAR=$(echo -n "<pre style=\"display:inline;\">"; cat input.txt | sed ':a;N;$!ba;s/\n/\\n/g'; echo -n "</pre>")
This one worked 99% of the way except there was a newline being added between the cat command's output and the last echo command. I'm guessing this is from the cat command since sed removed all newlines except for that one, maybe...?
Other tries:
MY_VAR=$( $(echo -n "<pre style=\"display:inline;\">"; cat input.txt; echo -n "</pre>") | sed ':a;N;$!ba;s/\n/\\n/g')
MY_VAR="$( echo $(echo -n "<pre style=\"display:inline;\">"; cat input.txt; echo "</pre>") | sed ':a;N;$!ba;s/\n/\\n/g' )"
MY_VAR="$( echo "$(echo -n "<pre style=\"display:inline;\">"; cat input.txt; echo "</pre>")" | sed ':a;N;$!ba;s/\n/\\n/g' )"
*Most of these were tried with and without the extra double quotes surrounding the different $(..) parts...
I had a few other attempts, but they didn't have any luck either... On a few of the other attempts above, it seemed to work except sed was NOT inserting the replacement part of it. The output was correct for the most part, except instead of seeing "\n" between lines it just showed each of the lines smashed together into one line without anything to separate them...
I'm thinking there is something small I am missing here if anyone has any idea..?
*P.S. Does Bash have a name for the $(..) structure? It's hard trying to Google for that since it doesn't really search symbols...
You have no need to nest command substitutions here.
your_var='<pre style="display:inline;">'"$(<input.txt)"'</pre>'
your_var=${your_var//$'\n'/'\n'}
"$(<input.txt)" expands to the contents of input.txt, but without any trailing newline. (Command substitution always strips trailing newlines; printf '%s' "$(cat ...)" has the same effect, albeit less efficiently as it requires a subshell, whereas cat ... alone does not).
${foo//bar/baz} expands to the contents of the shell variable named foo, with all instances of bar replaced with baz.
$'\n' is bash syntax for a literal newline.
'\n' is bash syntax for a two-character string, beginning with a backslash.
Thus, tying all this together, it first generates a single string with the prefix, the contents of the file, and the suffix; then replaces literal newlines inside that combined string with '\n' two-character sequences.
Granted, this is multiple lines as implemented above -- but it's also much faster and more efficient than anything involving a command substitution.
However, if you really want a single, nested command substitution, you can do that:
your_var=$(printf '%s' '<pre style="display:inline;">' \
"$(sed '$ ! s/$/\\n/g' <input.txt | tr -d '\n')" \
'</pre>')
The printf %s combines its arguments without any delimiter between them
The sed operation adds a literal \n to the end of each line except the last
The tr -d '\n' operation removes literal newlines from the file
However, even this approach could be done more efficiently without the nesting:
printf -v your_var '%s' '<pre style="display:inline;">' \
"$(sed '$ ! s/$/\\n/g' <input.txt | tr -d '\n')" \
'</pre>'
...which has the printf assign its results directly to your_var, without any outer command substitution required (and thus saving the expense of the outer subshell).
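To see the parameter-expansion approach from this answer end to end, here is a small self-contained run. The temporary file and its two lines are made-up stand-ins for input.txt; everything else follows the answer.

```bash
#!/usr/bin/env bash
tmp=$(mktemp)                          # throwaway stand-in for input.txt
printf 'first line\nsecond line\n' > "$tmp"

# Prefix + file contents (trailing newline stripped) + suffix
your_var='<pre style="display:inline;">'"$(<"$tmp")"'</pre>'
# Replace each remaining newline with the two characters backslash and n
your_var=${your_var//$'\n'/'\n'}
rm -f "$tmp"

printf '%s\n' "$your_var"
# prints: <pre style="display:inline;">first line\nsecond line</pre>
```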

Using a variable to replace lines in a file with backslashes

I want to add the string %%% to the beginning of some specific lines in a text file.
This is my script:
#!/bin/bash
a="c:\Temp"
sed "s/$a/%%%$a/g" <File.txt
And this is my File.txt content:
d:\Temp
c:\Temp
e:\Temp
But nothing changes when I execute it.
I think the 'sed' command is not finding the pattern, possibly due to the \ backslashes in the variable a.
I can find the c:\Temp line if I use grep with -F option (to not interpret strings):
cat File.txt | grep -F "$a"
But sed does not seem to implement such an -F option.
These don't work either:
sed 's/$a/%%%$a/g' <File.txt
sed 's/"$a"/%%%"$a"/g' <File.txt
I have found similar threads about replacing with sed, but they don't refer to variables.
How can I use a variable to prepend the %%% string to the desired lines?
EDIT: It would be fine that the $a variable could be entered via parameter when calling the script, so it will be assigned like:
a=$1
Try it like this:
#!/bin/sh
a='c:\\Temp' # single quotes
sed "s/$a/%%%$a/g" <File.txt # double quotes
Output:
Johns-MacBook-Pro:sed jcreasey$ sh x.sh
d:\Temp
%%%c:\Temp
e:\Temp
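You need the double backslash '\\' so that sed sees an escaped, literal '\'.
Single quotes neither expand variables nor interpret backslashes, so the value is stored exactly as typed.
So you escape the backslash inside single quotes and then expand the variable inside the double-quoted sed command.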
Of course, if every line containing Temp should get the prefix, you could also just do this:
#!/bin/sh
sed 's/\(.*Temp\)/%%%&/' <File.txt
If you want to get input from the command line you have to allow for the fact that \ is an escape character there too. So the user needs to type 'c:\\' or the interpreter will just wait for another character. Then once you get it, you will need to escape it again (printf %q, which is a bash feature rather than plain sh).
#!/bin/bash
b=$(printf '%q' "$1")
sed "s/\($b\)/%%%&/" < File.txt
The issue you are having is that, after substitution of your variable, sed receives a regular expression that looks for a literal c:Temp, because the \ is interpreted as an escape character rather than as a literal backslash. There are a number of workarounds. Seeing the comments and having worked through the possibilities, the following will allow the unquoted entry of the search term:
#!/bin/bash
## validate that needed input is given on the command line
[ -n "$1" -a "$2" ] || {
printf "Error: insufficient input. Usage: %s <term> <file>\n" "${0//*\//}" >&2
exit 1
}
## validate that the filename given is readable
[ -r "$2" ] || {
printf "Error: file not readable '%s'\n" "$2" >&2
exit 1
}
a="$1" # assign a
filenm="$2" # assign filename
## test and fix the search term entered
[[ "$a" =~ '\' ]] || a="${a/:/:\\}" # test if \ was removed by the shell, if so put it back
a="${a/\\/\\\\}" # add second \
sed -e "s/$a/%%%$a/g" "$filenm" # call sed with output to stdout
Usage:
$ bash sedwinpath.sh c:\Temp dat/winpath.txt
d:\Temp
%%%c:\Temp
e:\Temp
Note: This allows both single-quoted or unquoted entry of the dos path search term. To edit in place use sed -i. Additionally, the [[ operator and =~ operator are limited to bash.
I could have sworn the original question said replace, but it is to prepend, just as you suggest in the comments. I have updated the code with:
sed -e "s/$a/%%%$a/g" "$filenm"
Which provides the new output:
$ bash sedwinpath.sh c:\Temp dat/winpath.txt
d:\Temp
%%%c:\Temp
e:\Temp
Remember: If you want to edit the file in place use sed -i or sed -i.bak which will edit the actual file (and if -i.bak is given create a backup of the original in originalname.bak). Let me know if that is not what you intended and I'm happy to edit again.
Create your script with the positional parameter $1:
#!/bin/bash
a="$1"
sed "s/$a/%%%$a/g" "<file path>" > "temporary file"
Now whenever you want sed to find "c:\Temp" you need to call your script as follows:
bash <my executing script> c:\\\\Temp
The shell turns each pair of backslashes on the command line into one literal backslash, so what is saved in variable "a" in your script is "c:\\Temp". When this variable is substituted into the sed command, sed in turn reads the two backslashes as one literal backslash, in both the pattern and the replacement.
When you open your temporary file you will see:
d:\Temp
%%%c:\Temp
e:\Temp
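The answers above all rely on the caller doubling the backslashes. Another option is to let the script do the escaping, so the term can be passed exactly as it appears in the file (for example in single quotes). The sketch below is one way to do that, with made-up helper names; it assumes the search term contains no newline and that sed uses basic regular expressions (no -E).

```bash
#!/usr/bin/env bash
# Hypothetical helpers, not taken from the answers above.

# Escape characters that are special in a basic regular expression, plus the
# "/" delimiter, so the text is matched literally.
escape_sed_pattern() {
  printf '%s' "$1" | sed 's|[][\.*^$/]|\\&|g'
}

# Escape characters that are special in the replacement part of s///.
escape_sed_replacement() {
  printf '%s' "$1" | sed 's|[\&/]|\\&|g'
}

# usage: prefix_literal <text> <file>
# Prepends %%% to every literal occurrence of <text>; output goes to stdout.
prefix_literal() {
  local pat rep
  pat=$(escape_sed_pattern "$1")
  rep=$(escape_sed_replacement "$1")
  sed "s/$pat/%%%$rep/g" "$2"
}

# Demo with a throwaway copy of File.txt
file=$(mktemp)
printf '%s\n' 'd:\Temp' 'c:\Temp' 'e:\Temp' > "$file"
prefix_literal 'c:\Temp' "$file"
rm -f "$file"
```

Called as `prefix_literal "$1" File.txt` inside a script, this would accept `'c:\Temp'` on the command line without any extra backslashes.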

Resources