.htaccess - fixing duplicate content issues

Some pages are showing up as duplicate content when I run a page crawl with SEOmoz.
For example:
/index.php
and
/index.php/
are being crawled as two separate pages. How would I implement a mod_rewrite rule to remove trailing slashes from .php files only?
Also
mysite.com/dir/
and
mysite.com/dir/index.php
are being flagged as duplicate content. I would prefer to have all "/dir/file.php" links redirected to "/dir/" for aesthetic reasons, but I'm not sure how to do this, or whether it is the best thing to do from an SEO standpoint.
Thanks for help and advice.

A couple of ideas:
Add a rel="canonical" link to the <head> section of the non-canonical version of each HTML page.
Taken from http://support.google.com/webmasters/bin/answer.py?hl=en&answer=139394
This means you can designate one page as the "original" or "authoritative" page to be indexed, instead of the other pages that contain the same content. This is great for pages that show posts by tags, for example.
ALSO
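You can do a redirect. Crack open your .htaccess and redirect every request that carries a query string to the same URL without it.
# Assumes this .htaccess sits in the document root; the trailing "?" drops the query string
RewriteCond %{QUERY_STRING} .
RewriteRule ^(.*)$ /$1? [R=301,L]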
There are a lot of ways to handle this, however.
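For the two cases in the question itself, a redirect-based sketch might look like the following. It is untested and rests on several assumptions: the .htaccess sits in the document root, index.php is the DirectoryIndex, and no forms POST to the affected URLs. Matching on %{THE_REQUEST} (the raw request line) rather than the rewritten path is a design choice, so that Apache's internal lookup of index.php for "/dir/" does not trigger the second rule again.

```apache
RewriteEngine On

# /index.php/ -> /index.php (strip trailing slashes after a .php file)
RewriteCond %{THE_REQUEST} \s/+([^?\s]*\.php)/+[?\s]
RewriteRule ^ /%1 [R=301,L]

# /dir/index.php -> /dir/ (and /index.php -> /)
RewriteCond %{THE_REQUEST} \s/+((?:[^?\s]*/)?)index\.php[?\s]
RewriteRule ^ /%1 [R=301,L]
```

A request for /index.php/ would take two hops here (first to /index.php, then to /). Any query string is carried over unchanged by the redirects.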

Related

htaccess redirect if the url does not contain a specific character

I'm moving the site to a subdomain and need certain tag strings to go to the subdomain and some to remain on the main site. Problem is both have a similar tag system.
I need this type of request
https://www.site.co.uk/tags/example-tag
to go here:
https://sub.site.co.uk/tags/example-tag
but this type of request
https://www.site.co.uk/tags/view?tags=14-some-varriable
to remain unchanged and be passed through to the content without redirecting.
What would be the most recommended and best solution?
I have written some code to work around other redirects but this one is causing me a headache.
Cheers
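For the example you mention, the following should work. It redirects everything under /tags/ to the subdomain except /tags/view, which stays on the main site.
RewriteEngine On
RewriteCond %{REQUEST_URI} !^/tags/view$
RewriteRule ^tags/(.+)$ https://sub.site.co.uk/tags/$1 [R=302,L]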
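If both hostnames are served from the same document root (an assumption; the question does not say), a redirect like this would also fire on the subdomain and loop. A sketch that limits the rule to the main host and leaves /tags/view alone:

```apache
RewriteEngine On
# Only act on the main host, so the rule cannot loop on sub.site.co.uk
RewriteCond %{HTTP_HOST} ^(www\.)?site\.co\.uk$ [NC]
# Leave the query-driven view (/tags/view?tags=...) on the main site
RewriteCond %{REQUEST_URI} !^/tags/view$
RewriteRule ^tags/(.+)$ https://sub.site.co.uk/tags/$1 [R=302,L]
```

The 302 is deliberate while testing; switching to R=301 once the behaviour is confirmed avoids browsers caching a wrong permanent redirect.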

Sitewide redirects

I'm working on a site-wide redirect, while still maintaining a consistent URL pattern.
http://www.site1.com/folder/page
should first redirect to
http://www.site1.com/redirectHandler?dest=folder/page
which would ultimately have a link to http://www.site2.com/folder/page
I can obviously code the last part, but since there are several hundred pages, I'm hoping someone can show how to do the first redirect via htaccess, instead of individual code on each page?
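After some investigation, here's what I found that does the trick (much simpler than I thought). The condition keeps the handler itself from being redirected again, which would otherwise loop.
RewriteEngine on
RewriteCond %{REQUEST_URI} !^/redirectHandler
RewriteRule ^(.*)$ http://www.site1.com/redirectHandler?dest=$1 [R=301,L]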

.htaccess rewrite to add query parameter

I need to modify all requests bearing the form
http://example.com/dw2/dokuwiki/doku.php/page to
http://example.com/dw2/dokuwiki/doku.php/page?do=export_xhtml
The page bit is variable - it corresponds to each page in the wiki. I should mention that, given the way dokuwiki syntax works, page could contain one or more colons, e.g. glossary:archive.
The intent here is to extract the bare page content (shorn of the header, sidebar etc) of the wiki for distribution via a CDN. This does not give a complete solution since dokuwiki still leaves in a lot of unrequired verbiage in the exported markup file but gets me most of the way there. I'd much appreciate any help with this.
Place this rule as your very first rule in /dw2/dokuwiki/.htaccess:
RewriteEngine On
RewriteBase /dw2/dokuwiki/
RewriteCond %{QUERY_STRING} ^$
RewriteRule ^(doku\.php/[^/]+)/?$ $1?do=export_xhtml [L,NC,QSA,R,NE]

How to redirect a folder and its pages to a single page

I'm trying to redirect some files and I'm pretty stuck. There are far too many to handle on a "page-per-page" basis, so I need a quick way, as these pages are insignificant but return 404s at the moment.
I have a page like this "/old-blog/tag/page", I previously redirected the "old-blog" to "new-blog" so I get "/new-blog/tag/page" but now I want "tag" and all pages after this to be sent to "new-blog". I hope this example makes sense, please ask if I've missed something.
I'm doing my redirects with my .htaccess file so I'd like a method I can use with this in mind.
Thanks, Dan.
You may already have rewrite rules. If any of them do some type of routing, then mod_rewrite and mod_alias are going to conflict (RedirectMatch is mod_alias), so try sticking with just mod_rewrite. Try adding these rules, above any other rules, in the htaccess file in your document root:
RewriteEngine On
RewriteRule ^/?old-blog/tag /new-blog/ [L,R=301]
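Since the question says the earlier redirect already turns these requests into "/new-blog/tag/page", a variant that catches both prefixes may be useful. This is a sketch, assuming nothing under /new-blog/tag/ needs to stay reachable; the "(?:/.*)?$" part is added so that unrelated paths such as a hypothetical /new-blog/tagline are not caught.

```apache
RewriteEngine On
# Match /old-blog/tag, /new-blog/tag and anything beneath them,
# but not other paths that merely start with "tag"
RewriteRule ^/?(?:old|new)-blog/tag(?:/.*)?$ /new-blog/ [L,R=301]
```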

using mod_rewrite to create SEO friendly URLS

I've been searching Google for this but can't find a solution to my exact needs. Basically I've already got my URLs named how I like them, i.e. "http://mysite.com/blog/page1.php"
What I'm trying to achieve (if it's possible!) is to use rewrite to alter the existing URLs to: "http://mysite.com/blog/page1"
The problem I've come across is that the examples I've found work if the user enters "http://mysite.com/blog/page1" into the browser, which is great. However, I need it to work for the existing links in Google so as not to lose traffic, so that incoming URLs of the form "http://mysite.com/blog/page1.php" are directed to "http://mysite.com/blog/page1".
The first example (Canonical URLs) in the following guide is pretty much what you want:
http://httpd.apache.org/docs/2.0/misc/rewriteguide.html#url
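This should do the trick: it internally rewrites requests without .php to the matching .php file, so the change is invisible to the user. The optional leading slash lets the pattern match in both .htaccess and server config.
RewriteEngine On
RewriteRule ^/?blog/([^.]+)$ /blog/$1.php [L]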
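You will need to write a rewrite rule that maps your old URLs to your new URLs as a permanent redirect. This will let the search engine know that the new, SEO-friendly URLs are the ones to be used. Matching on the original request line keeps the redirect from firing on an internally rewritten request.
RewriteCond %{THE_REQUEST} \s/blog/page1\.php[?\s]
RewriteRule ^ /blog/page1 [R=301,L]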
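Putting the two ideas together for every page under /blog/ rather than just page1, a combined sketch might look like this. It assumes the rules live in the document root's .htaccess and that each extensionless URL has a matching .php file; the file-existence check is an addition not present in the answers above.

```apache
RewriteEngine On

# 1. External: /blog/<name>.php -> /blog/<name>, only when the client asked for .php
RewriteCond %{THE_REQUEST} \s/blog/([^.?\s]+)\.php[?\s]
RewriteRule ^ /blog/%1 [R=301,L]

# 2. Internal: /blog/<name> -> /blog/<name>.php, only if that file exists
RewriteCond %{DOCUMENT_ROOT}/blog/$1.php -f
RewriteRule ^/?blog/([^.]+)$ /blog/$1.php [L]
```

The redirect comes first and is keyed on %{THE_REQUEST}, so the internal rewrite in step 2 does not send the request back through step 1.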

Resources