A search bot is currently scanning pages on my site with a lot of strange GET params, for example ?x?, ?728%10%02, ?%18%9B%D9%DF%05, etc. I don't know where the bot found those URLs, but it is overloading my CPU because the cache system doesn't process URLs with GET params.
I can't modify the cache system, but I want to redirect requests with GET params to the same URL without GET params through .htaccess. However, some important GET params shouldn't be redirected: ?s=... for site search, and UTM labels.
In summary I want to redirect the following urls
/some-url?x?
/some-url?728%10%02
/some-url?%18%9B%D9%DF%05
and many other GET params to
/some-url
But keep untouched urls like this:
/some-url?s=search_term or
/some-url?utm_campaign=my_campaign
If only a known set of GET parameters is allowed, you can check against them in your .htaccess file and redirect every request whose query string doesn't start with one of the allowed parameters.
RewriteEngine On
# check that there is indeed a query string
RewriteCond %{QUERY_STRING} ^.+$
# check that it doesn't start with one of allowed parameters
RewriteCond %{QUERY_STRING} !^(utm_campaign|s|other|parameters|list)= [NC]
RewriteRule ^(.*)$ /$1? [R=301,L]
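The rules above only look at the first parameter in the query string. As a rough sketch of a variant, assuming Apache 2.4+ (for the QSD flag) and assuming you want to allow s or any utm_* parameter wherever it appears in the query string, something like the following might work. The utm_ prefix pattern is my assumption about what "utm labels" covers; adjust it to your real parameter names.

```apache
RewriteEngine On
# Only act when a query string is present
RewriteCond %{QUERY_STRING} .
# Skip the redirect if ANY parameter (not just the first) is "s" or starts with "utm_"
RewriteCond %{QUERY_STRING} !(^|&)(s|utm_[a-z_]+)= [NC]
# QSD discards the query string (Apache 2.4+); 302 while testing
RewriteRule ^ %{REQUEST_URI} [QSD,R=302,L]
```

Switch R=302 to R=301 only once you have confirmed the behaviour, since browsers cache permanent redirects.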
Related
I am using .htaccess to redirect certain subfolders of my domain so that the question mark is removed from my URLs.
Currently my URLs are like this:
www.example.com/post/?sometitle
I am trying to remove the question mark, so it is the following URL:
www.example.com/post/sometitle
Currently I have the following code in my .htaccess file:
RewriteCond %{THE_REQUEST} /post/?([^\s&]+) [NC]
RewriteRule ^ /post/%1 [R=302,L,NE]
I am using PHP GET parameters. What I'm attempting is that when the browser visits example.com/post/sometitle, the page currently served at example.com/post/?sometitle is displayed.
In that case you need to do the opposite of what you are asking in your question: you need to internally rewrite (not externally "redirect") the request from example.com/post/sometitle to example.com/post/?sometitle.
However, you must have already changed all the URLs in your application to use the new URL format (without the query string). You shouldn't be using .htaccess alone for this.
I also assume that /post is a physical directory and that you are really serving index.php in that directory (mod_dir is issuing an internal subrequest to this file). So, instead of /post/?sometitle, it's really /post/index.php?sometitle?
For example:
RewriteEngine On
# Rewrite /post/sometitle to filesystem path
RewriteRule ^post/([\w-]+)$ /post/index.php?$1 [L]
So, now when you request /post/sometitle the request is internally rewritten and handled by /post/index.php?sometitle instead.
I have assumed that "sometitle" can consist of 1 or more of the characters a-z, A-Z, 0-9, _ and -. Hence the regex [\w-]+.
If this is a new site then you can stop there. However, if you are changing an existing URL structure that has already been indexed by search engines and linked to by external third parties then you'll need to redirect the old URLs to the new. (Just to reiterate, you must have already changed the URL in your application, otherwise users will experience repeated redirects as they navigate your site.)
To implement the redirect, you can add something like the following before the above rewrite:
# Redirect any "stray" requests to the old URL
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{QUERY_STRING} ([\w-]+)
RewriteRule ^post/$ /post/%1 [R=302,NE,QSD,L]
The check against the REDIRECT_STATUS environment variable ensures we only redirect "direct requests", thus avoiding a redirect loop.
(Change to 301 only when tested as OK, to avoid caching issues.)
In Summary:
RewriteEngine On
# Redirect any "stray" requests to the old URL
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{QUERY_STRING} ([\w-]+)
RewriteRule ^post/$ /post/%1 [R=302,NE,QSD,L]
# Rewrite /post/sometitle to filesystem path
RewriteRule ^post/([\w-]+)$ /post/index.php?$1 [L]
UPDATE: If you have multiple URLs ("folders") that all follow the same pattern, such as /post/<title>, /home/<title> and /build/<title> then you can modify the above to cater for all three, for example:
# Redirect any "stray" requests to the old URL
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{QUERY_STRING} ([\w-]+)
RewriteRule ^(post|home|build)/$ /$1/%1 [R=302,NE,QSD,L]
# Rewrite /<folder>/sometitle to filesystem path
RewriteRule ^(post|home|build)/([\w-]+)$ /$1/index.php?$2 [L]
Aside: (With my Webmasters hat on...) This is not really much of an "improvement" to the URL structure. If this is an established website with many backlinks and good SE ranking then you should think twice about making this change as you could see a dip in rankings at least initially.
If all you need is to move the title out of the query string, try the rules below. The QSD flag discards the original query string once the rule has matched.
RewriteCond %{QUERY_STRING} ([^\s&]+) [NC]
RewriteRule ^ /post/%1 [R=302,L,NE,QSD]
My website does not use any GET parameters except on one page. Nonetheless, I can see that Google managed to index a bunch of my pages with GET parameters. This is not great for SEO (duplicate content)...
So I'm trying to edit my .htaccess to do 301 redirects from every URL with GET parameters to the same URL without GET parameters (except for one URL). Some examples:
example.com/?foo=42 => example.com/
example.com/about?bar=42 => example.com/about
example.com/r.php?foobar=42 => the url r.php should keep the GET parameters
So far I'm trying to remove all GET parameters, and it doesn't work.
RewriteEngine On
RewriteRule ^(.*)\?(.*)$ http://www.example.com/$1 [L,NC,R=301]
Any idea how to fix that?
You cannot match the query string using RewriteRule; its pattern is only matched against the URL path.
You can use this generic rule to remove the query string from all requests except those whose path contains a dot (such as r.php):
RewriteEngine On
RewriteCond %{QUERY_STRING} .
RewriteRule ^([^.]*)$ /$1? [L,NE,R=301]
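If excluding every path that contains a dot is too broad, a sketch of a narrower variant would exempt only the one script by name. This assumes Apache 2.4+ (for QSD) and that r.php sits in the document root, as in the examples in the question.

```apache
RewriteEngine On
# Only act when a query string is present
RewriteCond %{QUERY_STRING} .
# Leave r.php alone so it keeps its GET parameters
RewriteCond %{REQUEST_URI} !^/r\.php$
# Redirect to the same path with the query string discarded
RewriteRule ^ %{REQUEST_URI} [QSD,R=302,L]
```

On Apache 2.2, where QSD is not available, the trailing ? used in the rule above serves the same purpose.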
I've taken my site down for some prolonged maintenance and am using mod_rewrite to send all requests to a single page: www.mysite.com/temp/503.php
This is my .htaccess file which works fine.
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_URI} !^/temp/503.php [NC]
RewriteRule .* /temp/503.php [R,L]
However, what I'd also like to be able to do is to hide /temp/503.php in the resulting URL from the visitor.
I know this is perhaps trivial and I'm sure fairly simple to achieve, but with my limited mod_rewrite skills I can't seem to get it to work.
Any help would be much appreciated.
Thanks.
Just get rid of the R flag in the rewrite rule, which tells the rule to redirect the request, thus changing the URL in the browser's location bar. So the rule would look like:
RewriteRule .* /temp/503.php [L]
which internally rewrites the requested URI instead of externally telling the browser that it's been moved to a new URL.
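Put together with the condition from the question, the whole file would look roughly like the sketch below. I have escaped the dot and anchored the pattern, which is a small tightening of the original condition and not something the question requires.

```apache
RewriteEngine On
RewriteBase /
# Don't rewrite the maintenance page itself, or the rule would loop
RewriteCond %{REQUEST_URI} !^/temp/503\.php$ [NC]
# No R flag: internal rewrite, so the visitor's URL stays unchanged
RewriteRule ^ /temp/503.php [L]
```

Note that an internal rewrite does not set the response status by itself; whatever status /temp/503.php sends is what the visitor receives.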
Excuse my English.
I am building a brand directory website.
Previously, to access the brand pages, I used requests like this:
mydomain.com/fiche.php?id=115
where id is the id of the brand in my directory
I have changed the structure of the brand pages and now use this request:
mydomain.com/annuaire.php?type=fiche&id_marq=115
where id has become id_marq
I tried to use a RewriteRule like this:
RewriteRule ^fiche.php$ http://www.annuaire-sites-officiels.com/annuaire.php?detail=fiche&id_marq=$1 [L,QSA,R=301]
to redirect the old links to the new pages, but the result doesn't pass the id_marq value, and the URL is:
http://www.annuaire-sites-officiels.com/annuaire.php?detail=fiche&id_marq=&id=115
The original &id= parameter is appended too.
What am I doing wrong?
Your rule does not evaluate the query string, which is why it isn't capturing the id query parameter.
Change your code to:
Options +FollowSymLinks -MultiViews
# Turn mod_rewrite on
RewriteEngine On
RewriteBase /
RewriteCond %{QUERY_STRING} ^id=([^&]+) [NC]
RewriteRule ^fiche\.php$ /annuaire.php?detail=fiche&id_marq=%1 [R=302,L,QSA,NC]
Once you verify it is working fine, replace R=302 with R=301. Avoid using R=301 (Permanent Redirect) while testing your mod_rewrite rules.
Check out Regex Back Reference Availability:
You have to capture the query string. [QSA] passes it forward unaltered, so unless you're using id for anything you don't need that bit of code. Your 301 redirect is correct since this is a permanent redirect. Remember if you add a failed redirect your browser may cache that redirect so it might not look like it's working.
In this pattern I only match digits, to prevent someone from passing something like an asterisk * and attempting an XSS exploit against your site.
I've not included any [NC] flags in my code because when you allow multiple cases they can seem like different URLs to search engines (bad for SEO).
RewriteCond %{QUERY_STRING} (?:^|&)id=([0-9]+)
RewriteRule ^fiche\.php$ http://%{HTTP_HOST}/annuaire.php?detail=fiche&id_marq=%1 [R=301,L]
I'm trying to rewrite a URL as shown below, but when I read the seo query parameter, it always returns index.php. Any idea why it isn't returning the correct value for 'seo'?
RewriteRule ^([^/]*)$ index.php?c=home&m=details&seo=$1 [L]
The URL it should forward from would be something like this: http://domain.com/The-Name-of-the-Product. That URL should be rewritten to http://domain.com/index.php?c=home&m=details&seo=The-Name-of-the-Product, but instead it ends up as http://domain.com/index.php?c=home&m=details&seo=index.php
Various events cause a URL to go back through the rewrite process. You can use RewriteCond to prevent this:
RewriteCond $1 !^index\.php$
RewriteRule ^/?([^/]+)$ index.php?c=home&m=details&seo=$1 [L,NS]
From the mod_rewrite technical details:
When you manipulate a URL/filename in per-directory context mod_rewrite first rewrites the filename back to its corresponding URL (which is usually impossible, but see the RewriteBase directive below for the trick to achieve this) and then initiates a new internal sub-request with the new URL. This restarts processing of the API phases.
This catches people all the time.
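Another common way to avoid the second pass rewriting index.php is to skip anything that already exists on disk. This is only a sketch, assuming that product names never collide with real files or directories in the document root:

```apache
RewriteEngine On
# Skip requests that map to an existing file (e.g. index.php after the first pass)
RewriteCond %{REQUEST_FILENAME} !-f
# Skip requests that map to an existing directory
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)$ index.php?c=home&m=details&seo=$1 [L]
```

On the second pass the request is for index.php, which exists, so the first condition fails and the rule is not applied again.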