I would like to remove certain URL parameters from my site so that Googlebot doesn't get confused and treat the pages as duplicate content.
The parameters are:
?sort=
?limit=
?order=
Based on some examples I've come across, here's what I'm currently using in .htaccess:
RewriteCond %{QUERY_STRING} "sort=" [NC]
RewriteRule (.*) /$1? [R=301,L]
RewriteCond %{QUERY_STRING} "limit=" [NC]
RewriteRule (.*) /$1? [R=301,L]
RewriteCond %{QUERY_STRING} "order=" [NC]
RewriteRule (.*) /$1? [R=301,L]
What is the proper syntax to combine these parameters into one rule?
Removing the parameters is not a good solution if you actually need them.
The best way to avoid duplicate-content problems is to add a canonical link to the HTML <head>:
<link rel="canonical" href="http://www.domain.com/url-file.php?param=xxx">
The href should be the complete URL of the page, including only the parameters you want Google to index.
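You can use alternation in the regex. The (?:^|&) prefix lets the parameter appear anywhere in the query string, not only as the first parameter:
RewriteCond %{QUERY_STRING} (?:^|&)(limit|sort|order)= [NC]
RewriteRule ^ %{REQUEST_URI}? [R=301,L,NE]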
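Note that the rule above drops the entire query string, including any parameters you might want to keep. If that is a problem, here is a rough sketch that strips only the sort/limit/order pairs and leaves the rest alone. It assumes the rules live in the site root .htaccess and that an occasional trailing & in the resulting URL is acceptable. I have not tested it against your setup:

```apache
RewriteEngine On

# Remove one sort/limit/order pair per redirect and keep the other parameters.
# %1 = everything before the pair (ends with & if non-empty), %2 = everything after it.
# (.*&)? requires the name to start the query string or follow an &,
# so a parameter such as "resort=" is not touched.
# Caveat: if the removed pair was last, the result keeps a trailing "&" (e.g. "a=1&").
RewriteCond %{QUERY_STRING} ^(.*&)?(?:sort|limit|order)=[^&]*&?(.*)$ [NC]
RewriteRule ^ %{REQUEST_URI}?%1%2 [R=301,L,NE]
```

A URL carrying several of these parameters is cleaned up over a short chain of redirects, one parameter per hop, so the single-rule version is still preferable when you do not need any of the other parameters.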
#anubhava provided an excellent answer for my previous question of doing an .htaccess internal rewrite with the below code, which worked for my one search query.
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{QUERY_STRING} id=([0-9]+) [NC]
RewriteRule ^file\.php$ /directory/%1? [R=301,L,NC]
RewriteRule ^directory/(\d+)/?$ /directory/file.php?id=$1 [L,QSA,NC]
I wanted to make this a separate question since my next question is slightly different. How could I adapt this to also work with two parameters? For instance, I would also like http://ipaddress/directory/file.php?id=47?name=value1 to redirect to http://ipaddress/directory/47/value1
name= can also be any combination of letters and numbers, like value1050, etc.
Thank you #anubhava for your previous answer above, and maybe there's a way to add on this second parameter as well?
Assuming your query string has the form id=1234&name=value123 (a second ? is not a valid separator inside a query string, so the parameters must be joined with &), you could try the following, which is a fix of your shown attempt.
RewriteEngine on
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{QUERY_STRING} ^id=(\d+)&name=(.*)$ [NC]
RewriteRule ^file\.php/?$ /directory/%1/%2? [R=301,L,NC]
RewriteRule ^directory/(\d+)/(.*)/?$ /directory/file.php?id=$1&name=$2 [L,QSA,NC]
2nd solution: alternatively, use the following rules. Use either the rules above or these, not both at the same time.
RewriteEngine on
RewriteCond %{THE_REQUEST} \s/file\.php\?id=(\d+)&name=(\S+)\s [NC]
RewriteRule ^ /directory/%1/%2? [R=301,L]
RewriteRule ^directory/(\d+)/(.*)/?$ /directory/file.php?id=$1&name=$2 [NC,L,QSA]
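One thing to watch with (.*)/?$ is that .* is greedy, so a request like /directory/47/value1/ puts the trailing slash into the name value. Since the question says name is only letters and digits, a tighter character class avoids that. The variant below is only a sketch. It assumes the rules are in the site root .htaccess and that the script really lives at /directory/file.php, as in the URLs in the question:

```apache
RewriteEngine On

# External redirect: /directory/file.php?id=47&name=value1 -> /directory/47/value1
# THE_REQUEST holds the original request line, so the internal rewrite below
# cannot trigger this redirect again.
RewriteCond %{THE_REQUEST} \s/directory/file\.php\?id=(\d+)&name=([a-z0-9]+)\s [NC]
RewriteRule ^ /directory/%1/%2? [R=301,L]

# Internal rewrite: /directory/47/value1 (optional trailing slash) -> file.php
# [a-z0-9]+ cannot match "/", so the trailing slash never ends up in name.
RewriteRule ^directory/(\d+)/([a-z0-9]+)/?$ /directory/file.php?id=$1&name=$2 [L,QSA,NC]
```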
I have a bunch of ugly links coming in to a site like this:
https://www.somedomain.com/mm5/mch.mvc?Session_ID=5f59c6e0&Screen=PRODFB&Product_Code=pcode1&Category_Code=category_a&Store_Code=store1&fb=1
I'm trying to use htaccess to recreate the url like this:
https://www.somedomain.com/mm5/mch.mvc?Screen=PROD&Product_Code=pcode1
I've come up with this, but of course it isn't working. I think I just need to ignore the rest of the URL parameters after the Product_Code param, but I'm not sure how to do it.
RewriteCond %{REQUEST_FILENAME} !-s
RewriteRule ^Screen=PRODFB&Product_Code=([^/.]+)$ /mm5/mch.mvc?Screen=PROD&Product_Code=$1 [R=301,L,NE]
What am I missing? Thanks!
You cannot match the query string in a RewriteRule directive; its pattern is tested against the URL path only.
As long as the query parameters are the same as in your question, you may use this rule in your site root .htaccess:
RewriteEngine On
RewriteCond %{THE_REQUEST} /(mm5/mch\.mvc)\?.*&Screen=PRODFB&(Product_Code=[^\s&]+) [NC]
RewriteRule ^ /%1?Screen=PROD&%2 [R=301,L,NE]
Or using %{QUERY_STRING}:
RewriteCond %{QUERY_STRING} (?:^|&)Screen=PRODFB&(Product_Code=[^&]+) [NC]
RewriteRule ^mm5/mch\.mvc/?$ %{REQUEST_URI}?Screen=PROD&%1 [R=301,L,NE]
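Both rules above expect Product_Code to come immediately after Screen=PRODFB. If the incoming links might order the parameters differently, one option is to test each parameter with its own condition. This is only a sketch, under the assumption that it sits in the site root .htaccess. It relies on %1 referring to the captures of the last matched RewriteCond:

```apache
RewriteEngine On

# First condition: Screen=PRODFB is present somewhere (no capture groups here).
RewriteCond %{QUERY_STRING} (?:^|&)Screen=PRODFB(?:&|$) [NC]
# Second condition: capture the Product_Code pair wherever it appears -> %1
RewriteCond %{QUERY_STRING} (?:^|&)(Product_Code=[^&]+) [NC]
# Rebuild the query string with only the two wanted parameters.
RewriteRule ^mm5/mch\.mvc$ %{REQUEST_URI}?Screen=PROD&%1 [R=301,L,NE]
```

After the redirect the query string contains Screen=PROD, which no longer satisfies the first condition, so the rule does not loop.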
I was able to get it to work like this:
RewriteCond %{QUERY_STRING} Screen=PRODFB&Product_Code=([^&]+)
RewriteRule ^(.*)$ /mm5/mch.mvc?Screen=PROD&Product_Code=%1& [R=301,L,NE]
I have dozens of redirects from an old page, e.g. index.php?mode=1,2,3,0, and I want to get rid of all GET parameters because the new page is just plain HTML anyway.
RewriteCond %{REQUEST_URI} ^/index\.php$
RewriteCond %{QUERY_STRING} mode=17,0,0,0,0$
RewriteRule (.*) /big-mamas-house/ [R=301,L]
I thought removing (.*) would already do the trick, but then the rule is no longer applied, according to:
http://htaccess.madewithlove.be/
Your rule can be simplified to:
RewriteCond %{QUERY_STRING} ^mode=17,0,0,0,0$
RewriteRule ^index\.php$ /big-mamas-house/? [R=301,L,NC]
The ? at the end of the target is needed to strip off the existing query string.
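If the server runs Apache 2.4 or later, the QSD (query string discard) flag does the same job as the trailing ?. A sketch of the same rule with that flag, assuming the version requirement is met:

```apache
RewriteEngine On

# QSD drops the incoming query string instead of appending it to the target.
# Requires Apache 2.4.0 or later; on 2.2 keep the trailing "?" form.
RewriteCond %{QUERY_STRING} ^mode=17,0,0,0,0$
RewriteRule ^index\.php$ /big-mamas-house/ [R=301,L,NC,QSD]
```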
Is there a more efficient way of doing this?
The last /(.*)$ is an ID that I don't need; I only care about what comes before it.
RewriteRule ^about-us/news-room/(.*)_(.*)_(.*)_(.*)_(.*)_(.*)_(.*)_(.*)_(.*)_(.*)/(.*)$ index.php?go=/news/press-releases/$1-$2-$3-$4-$5-$6-$7-$8-$9-$10 [NC]
RewriteRule ^about-us/news-room/(.*)_(.*)_(.*)_(.*)_(.*)_(.*)_(.*)_(.*)_(.*)/(.*)$ index.php?go=/news/press-releases/$1-$2-$3-$4-$5-$6-$7-$8-$9 [NC]
RewriteRule ^about-us/news-room/(.*)_(.*)_(.*)_(.*)_(.*)_(.*)_(.*)_(.*)/(.*)$ index.php?go=/news/press-releases/$1-$2-$3-$4-$5-$6-$7-$8 [NC]
RewriteRule ^about-us/news-room/(.*)_(.*)_(.*)_(.*)_(.*)_(.*)_(.*)/(.*)$ index.php?go=/news/press-releases/$1-$2-$3-$4-$5-$6-$7 [NC]
RewriteRule ^about-us/news-room/(.*)_(.*)_(.*)_(.*)_(.*)_(.*)/(.*)$ index.php?go=/news/press-releases/$1-$2-$3-$4-$5-$6 [NC]
RewriteRule ^about-us/news-room/(.*)_(.*)_(.*)_(.*)_(.*)/(.*)$ index.php?go=/news/press-releases/$1-$2-$3-$4-$5 [NC]
RewriteRule ^about-us/news-room/(.*)_(.*)_(.*)_(.*)/(.*)$ index.php?go=/news/press-releases/$1-$2-$3-$4 [NC]
RewriteRule ^about-us/news-room/(.*)_(.*)_(.*)/(.*)$ index.php?go=/news/press-releases/$1-$2-$3 [NC]
RewriteRule ^about-us/news-room/(.*)_(.*)/(.*)$ index.php?go=/news/press-releases/$1-$2 [NC]
I found some solutions online, but I get really confused about using the [N] flag. Can anyone explain a better, more efficient way to do this?
You can just let the rewrite engine loop internally for this:
RewriteRule ^about-us/news-room/(.+)/(.*)$ index.php?go=/news/press-releases/$1 [L,NC]
RewriteCond %{QUERY_STRING} ^go=/news/press-releases/(.*)_(.*)$
RewriteRule ^index\.php$ /index.php?go=/news/press-releases/%1-%2 [L]
The first rule sends the request to index.php, and the second rule removes the underscores and replaces them with dashes. Because the rewrite engine loops, it'll keep applying the rule until either the recursion limit is reached or all the underscores are gone.
I'm trying to write an .htaccess rewrite for a page with categories and search filters.
I want to block certain URLs with .htaccess. I have already listed them in robots.txt, but spiders are still crawling them.
URLs I want to allow crawling of:
www.domain.com/path1.html
www.domain.com/path1/path2.html
www.domain.com/path1/path2/path3.html
www.domain.com/path4/path5.html
URLs I want to disallow crawling of:
www.domain.com/path1.html?search[param1]=value&...
www.domain.com/path1/path2.html?search=param2&...
www.domain.com/path1/path2/path3.html?searchHash=param3
As I understand it, the .htaccess code for the search param should look something like this, but it's not correct and I'm stuck:
RewriteCond %{HTTP_USER_AGENT} (googlebot|bingbot|yandex) [NC]
RewriteRule ^(.*).html\?search=.*$ http://www.domain.com/$1 [R=301,L]
No, you cannot match QUERY_STRING in RewriteRule. You need to use RewriteCond %{QUERY_STRING} like this (the trailing ? in the target drops the query string; without it the redirect would carry the same search parameters and loop):
RewriteCond %{QUERY_STRING} ^search=.+ [NC]
RewriteCond %{HTTP_USER_AGENT} (googlebot|bingbot|yandex) [NC]
RewriteRule ^(.+?\.html)$ http://www.domain.com/$1? [R=301,L,NC]
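The condition ^search=.+ only covers the plain search= form. The question also lists search[param1]= and searchHash=, so here is a sketch of a broader condition. It assumes those three spellings are the only ones in use and that crawlers may send the bracket either literally or percent-encoded as %5B:

```apache
RewriteEngine On

# Only act on the listed crawlers.
RewriteCond %{HTTP_USER_AGENT} (googlebot|bingbot|yandex) [NC]
# Match search=, searchHash=, search[...] or search%5B... anywhere in the query string.
RewriteCond %{QUERY_STRING} (?:^|&)search(?:Hash)?(?:=|\[|%5B) [NC]
# Redirect to the same .html path with the query string removed (trailing "?").
RewriteRule \.html$ %{REQUEST_URI}? [R=301,L,NC]
```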