mod_rewrite & /+Result:+chosen+nickname+%22preorrinkap%22 - .htaccess

We've got WordPress installed on our site and we're getting a lot of 404 pages for URLs that have the text below at the end of them.
/+Result:+chosen+nickname+%22preorrinkap%22
The nickname at the end differs in almost all requests, but the URL that precedes it is valid. Adding this text to the end of a valid URL makes the page come up as a 404.
As I understand it, there is a WordPress exploit which bots (more than likely, given the number of requests we are getting) are trying to use.
(see here https://security.stackexchange.com/questions/26598/strange-request-uri-with-lot-of-spaces-and-chosen-nickname) What I want to do is write a mod_rewrite rule to include in our .htaccess file to detect the end text and redirect the bot to the correct URL.
For example here are a couple of our links that are coming up:
/News/2010/press-releases/launch-of-ondemand-video-subscription-service/+Result:+chosen+nickname+%22preorrinkap%22
/News/2010/press-releases/launch-of-ondemand-video-subscription-service/+Result:+chosen+nickname+%22coughiscout%22
Due to the structure of our site the links will always start with /News/ if that helps.
I've tried all manner of ways of getting the regex to pick up this pattern, but I just can't get it to work.
Any help would be great.

In case someone comes across this, you can set up the following redirect rule in your .htaccess:
RewriteRule ^(.*)\+Result:\+chosen\+nickname\+(.*)$ /News/$1 [R=301,L]
This should redirect the bot to the actual article. It turns out I wasn't escaping the plus signs correctly in my previous attempts.
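One caveat on the rule above: it prepends /News/ to the captured text, which fits an .htaccess file placed inside the News directory, where the matched path no longer includes that prefix. If the file sits in the document root instead (an assumption, since the question does not say), the capture already starts with News/ and the prefix would be doubled. A minimal sketch for that root-level case:

```apache
RewriteEngine On

# Assumes this .htaccess is in the document root, so the path that the
# pattern sees has no leading slash and starts with "News/".
# $1 keeps everything before the bot's "+Result:+chosen+nickname+" suffix.
RewriteRule ^(News/.*)\+Result:\+chosen\+nickname\+ /$1 [R=301,L]
```

The trailing nickname does not need to be captured, since only the part before the suffix is used in the redirect target.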

Related

Why doesn't this simple RewriteRule work anymore and why does google keep seeing it?

Some time ago (I think a couple of years) this simple RewriteRule in my htaccess stopped working.
RewriteRule tags/ tags.php [L]
It worked for years, then a day after a server change or a server upgrade or switch to php-fpm (I don't remember) it stopped working.
I solved it by deleting it, and sending all my links directly to the tags.php file.
This rule is part of a small CMS that I use for many of my sites. The sites work and everything works correctly.
But every time I create a new site, after a few days Google sends me a warning telling me that the URL mysite.com/tags/ returns a 404 error.
And this is strange, because the URL mysite.com/tags/ has not existed for years on my sites or in my sitemaps. I am sure of this because it was used only once, in the main menu of my sites, and has been replaced with mysite.com/tags.php.
Above all, it cannot exist on new sites. At first I didn't pay much attention to it on old sites. Google may have seen it on the old sites and not forgotten it yet, but it surely can't have seen it on the new sites.
So, I have a couple of unanswered questions.
The first and perhaps most important to understand: how does Google see the URL mysite.com/tags/? Is it possible that Google reads my .htaccess to understand what kind of URL I'm going to create?
Second: how can I solve the problem permanently?
--------------------------update---------------------
Sorry for the delay with which I reply (summer vacation).
Regarding anubhava's answer, I have a doubt, but it's my fault, maybe I omitted part of the code.
The next rule says:
RewriteRule ^tags-([^/]+)\/$ tags.php?letter=$1 [L]
and makes URLs like this one work:
mysite.com/tags-k/
and these urls work, but if I put a 301 redirect on tags.php, will they still work?
No, Google (or any other search bot) cannot read your .htaccess.
It is difficult to figure out how the search bot found the /tags URI, but it is almost certainly linked from somewhere in your web pages.
The way to tell search bots that /tags has moved is to use a redirect rule with R=301:
RewriteEngine On
RewriteRule ^tags(/.*)?$ /tags.php [L,NC,R=301]
With this 301 rule, search bots will eventually drop the old /tags/ result and remember only /tags.php.
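On the question in the update: the redirect pattern only matches "tags" followed by either a slash or the end of the path, so it does not touch tags-k/ or tags.php itself. A sketch of the two rules side by side, assuming both live in the same root-level .htaccess:

```apache
RewriteEngine On

# External 301 for /tags and /tags/... only. "tags-k/" and "tags.php"
# do not match, because after "tags" the pattern allows only "/" or
# the end of the path.
RewriteRule ^tags(/.*)?$ /tags.php [L,NC,R=301]

# Internal rewrite from the question's update; mysite.com/tags-k/
# is still served by tags.php?letter=k.
RewriteRule ^tags-([^/]+)/$ tags.php?letter=$1 [L]
```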

Apache .htaccess redirect to an anchor

I'm trying to do a one-off damage-limitation redirection to an anchor on a page on a website. A wrong URL got published in some publicity material, like this:
https://mydomain.org.uk/A/B
when what I really wanted to publish was
https://mydomain.org.uk/A#B
Having looked at some other answers it seems that any redirect with an anchor needs to be an absolute URL. So I put this in my .htaccess:
RewriteRule A/B https://mydomain.org.uk/A.php#B [NE,L]
(note, the .php is correct, A.php is the page file). And it just simply doesn't work. The browser simply loads A.php and displays it from the top.
I know that the rule pattern is matching, because if I make the target be a completely nonexistent page I get a 404 as expected.
Unfortunately my web hosting service doesn't let me use the Apache log, so it's hard to trace what's going wrong. Can anyone guide me to how to do the rewrite properly so that I pass the #anchor all the way through to the user's browser?
Thanks in advance!
When the RewriteRule is processed by the server, it basically changes internally which resource to access, without the browser noticing.
The only way to change the URL in the browser is to use the redirect flag. This makes the web server send an HTTP 302 response with a Location header, which results in the browser changing the URL and requesting the new page. This new URL can contain an anchor.
In your case the following rule should work:
RewriteRule A/B https://mydomain.org.uk/A.php#B [NE,R,L]
Please keep in mind that anchors are a browser feature so they are normally not sent to the server and therefore neither appear in access logs nor can be used in a RewriteRule.
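A slightly stricter variant, offered as a sketch rather than a required change: the pattern A/B is unanchored, so it also matches any longer path that merely contains A/B. Assuming the .htaccess is in the document root and the move is meant to be permanent, the rule could be narrowed like this:

```apache
RewriteEngine On

# Anchored so that only the exact path /A/B (with or without a trailing
# slash) is redirected.
# NE stops "#" from being escaped to %23 in the Location header.
# R=301 marks the redirect as permanent; plain R (302) is safer while
# testing, because browsers cache 301 responses.
RewriteRule ^A/B/?$ https://mydomain.org.uk/A.php#B [NE,R=301,L]
```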

Strange 404 error - trying to avoid with .htaccess

I've got a fairly simple web site in which I trap any 404 errors and one that's started coming through is trying to access /home/sycamore/public_html/...valid url... instead of /...valid url...
Obviously /home/sycamore/public_html/ is the file path to where the site lives, but firstly, how on earth is this request being generated? I can find nothing in my code that does anything like that, and yet it is happening in just one area that's been recently added. Any idea what could have caused this?
In an attempt to avoid this I've added a rule to the .htaccess file so it now starts
Options +FollowSymLinks
RewriteEngine on
RewriteRule ^/home/sycamore/public_html/(.*)$ /$1 [R,L]
However, this doesn't appear to do anything, I'm still getting the 404 reported on /home/sycamore/public_html/...valid url... although there are many other rewrite and redirect rules in .htaccess which work perfectly.
Any ideas as to (1) why the problem might be there in the first place and (2) why my htaccess attempt to 'correct' the url is failing?
I would also add that most of the requests for this URL are coming from search bots; google, yahoo, etc. and there is no mention of /home/... in the sitemap.xml file.
I'm pretty sure the original cause was down to some redirects in the htaccess file which were redirecting to a non-existent page and, I assume, causing it to generate this sort of error.
Still no idea why the attempted fix to rewrite the url didn't work but hopefully mending the htaccess file will have got rid of a lot of the problems (though there are still some unaccounted ones).
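A likely reason the attempted fix never fired: in a .htaccess file, mod_rewrite strips the per-directory prefix before matching, so the path that the pattern sees has no leading slash. A pattern beginning with ^/ therefore cannot match there. A minimal sketch of the corrected rule, assuming the .htaccess sits in the document root:

```apache
Options +FollowSymLinks
RewriteEngine On

# In .htaccess the pattern is tested against
# "home/sycamore/public_html/...", without the leading slash,
# so the pattern must not start with "^/".
RewriteRule ^home/sycamore/public_html/(.*)$ /$1 [R,L]
```

The rule should sit above any other rules that could match or rewrite the same request first.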

.htaccess file - forward everything, but not?

First time user, been looking all night.
We recently changed our site from .NET to WordPress. We transferred over half of the news articles and not the other half. So now we get old users coming to the site and getting a 404.
The news articles that exist in the WordPress site have been redirected and work fine, for example,
www.example.com/news/transfered-news-story.aspx
redirects to
www.example.com/blog/news/transfered-news-story
this was done manually.
What I need help with is if someone comes to the site with any other request, e.g.
www.example.com/news/this-didnt-get-moved.aspx
or
www.example.com/news/anything-else
or
www.example.com/news/2010/02
it should all just get redirected to
www.example.com/blog/news
I have been reading on and off for a couple of weeks and tried a few things but they all append the additional stuff on the end of the redirected string.
so www.example.com/news/my-stuff-ok
becomes www.example.com/blog/news/my-stuff-ok (and I want to drop the my-stuff-ok)
I hope you get what I'm after, any help would be very much appreciated.
Thanks
Phil
You can simply write a directive that serves a given URL whenever a 404 occurs (documentation):
ErrorDocument 404 /blog/news
However, you really should go through the motions of adding manual redirects (permanent redirect) to the new url for each of the other articles because you will take a considerable SEO hit if those urls no longer serve up the content that was linked by the search engine.
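If a real redirect is wanted instead of an error page, one option is mod_alias. A plain Redirect on /news appends whatever follows the matched prefix to the target, which is the behaviour described in the question. RedirectMatch only appends what the target explicitly references. The sketch below assumes mod_alias is available and that the existing manual redirects are (or can be written as) Redirect lines; the first line is a hypothetical example of one of them:

```apache
# Specific redirects for migrated articles come first, because the
# first matching mod_alias redirect wins.
Redirect 301 /news/transfered-news-story.aspx /blog/news/transfered-news-story

# Catch-all: anything else under /news goes to the blog news index.
# The target has no $1, so the rest of the path is dropped.
RedirectMatch 301 ^/news(/.*)?$ /blog/news
```

The pattern is anchored at ^/news, so it does not match /blog/news and cannot loop.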

problems getting nice browser url using redirect/rewriterule

Currently I use a .htaccess redirect to send a (nice) url /offices/london/whatever to my script (nasty url) /db/db.pl?offices-london-whatever
I want the browser URL to be nice. With the 301 redirect it isn't, so I tried a RewriteRule instead, but the browser URL is still the nasty one.
e.g. RewriteRule Offices/London/(.*)$ /db/db.pl?Offices-London-$1 [NC]
It all navigates and I get the pages I want with either method, but I want the nice URL, not the nasty one, for both the user's browser and SEO. At present I only get the nasty URL.
Any clues what I am doing wrong?
Let's assume the following rule in your .htaccess (in that context the pattern is matched without a leading slash):
RewriteRule ^Offices/London/(.*)$ /db/db.pl?Offices-London-$1 [L,NC]
This makes your page accessible through www.yourdomain.com/offices/london. So you can just use that URL in your browser. As for SEO the crawlers will see you are using that URL in your links and will index it.
Remember that the other URL (the nasty one) still works as well; just don't use it (except for testing, of course).
OK, thanks for that.
The problem is not one of 'accessing' the script, that all works fine. It is getting the browser address bar to NOT display the ugly path/URL, which still happens with the example above.
As for the SEO, that is not the case: Google is currently displaying the ugly URL.
By reading through http://www.webmasterworld.com/forum92/6079.htm (and www.askapache.com/htaccess/mod_rewrite-basic-examples.html) I am slowly getting there, with two rewrites and a cond, but I have been lazy in my Perl and the relative paths are broken, so I have more to do on it.
For now, though, I have to do some other pesky customer things for a while.
Will post my full solution here shortly!
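The full solution was never posted in this thread. For reference, one common shape for "two rewrites and a cond" is sketched below. It is an illustration under assumptions, not the poster's actual code: it assumes a root-level .htaccess and the URL layout from the question.

```apache
RewriteEngine On

# 1. If the client itself asked for the ugly URL, send it to the nice one.
#    THE_REQUEST holds the original request line, so this condition is
#    false after the internal rewrite below and cannot cause a loop.
RewriteCond %{THE_REQUEST} \s/db/db\.pl\?Offices-London-([^\s&]*) [NC]
#    The trailing "?" drops the old query string from the redirect target.
RewriteRule ^db/db\.pl$ /offices/london/%1? [R=301,L]

# 2. Internally map the nice URL onto the script; the address bar
#    keeps showing /offices/london/...
RewriteRule ^offices/london/(.*)$ /db/db.pl?Offices-London-$1 [NC,L]
```

Links generated by the script should also point at the nice URLs, and relative paths for images or stylesheets will resolve against /offices/london/, which is the relative-path problem mentioned above.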
