What to do with dynamic url after rewrite? - .htaccess

Once you've successfully created a URL rewrite, how do you handle the original URL and the other possible ways to access the page? The goal, of course, is to prevent duplicate content. For example, if I have this:
RewriteEngine on
RewriteBase /
RewriteRule ^blog/(\d+)/([\w\-/\.]+)/?$ blog.php?id=$1&article_title=$2 [L]
I'm able to access the page by the url
https://www.mysite.com/blog/10/mysite.com (the mysite.com is the article title)
The problem is I'm also able to access the site by going to
https://www.mysite.com/blog.php?id=10&article_title=sitetitle
https://www.mysite.com/blog.php?id=10
etc.
How are you supposed to handle those particular URLs?
Also should I change the blog.php?id=10 to the rewritten url? Can I rely on something else and just start using the full rewritten url now? The site is new.

For my web site, the script that the rewrite invokes checks the URI it was fetched with (using the "REQUEST_URI" variable, which at least Apache sets) and, if it was called via the internal URL, redirects to the canonical one with a 301 response.
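The same check can also be sketched in .htaccess itself rather than in the script. This is only a sketch, assuming the rule from the question and assuming the query string contains id followed by article_title (a request with only id=10 carries no title to build the clean URL from, so that case would still have to be handled inside blog.php). It tests %{THE_REQUEST}, the original request line, which internal rewrites do not change, so the redirect should not loop with the existing rule.

```apache
RewriteEngine On
RewriteBase /

# Externally redirect direct requests for blog.php to the clean URL.
# THE_REQUEST looks like "GET /blog.php?id=10&article_title=x HTTP/1.1"
# and is not altered by the internal rewrite further down.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/blog\.php\?
# Assumption: parameters arrive in this order; %1 and %2 refer to this condition.
RewriteCond %{QUERY_STRING} ^id=(\d+)&article_title=([\w\-/.]+)$
# The trailing "?" drops the original query string from the redirect target.
RewriteRule ^blog\.php$ /blog/%1/%2? [R=301,L]

# Existing internal rewrite from the question.
RewriteRule ^blog/(\d+)/([\w\-/\.]+)/?$ blog.php?id=$1&article_title=$2 [L]
```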

Related

Apache .htaccess redirect to an anchor

I'm trying to do a one-off damage-limitation redirection to an anchor on a page on a website. A wrong URL got published in some publicity material, like this:
https://mydomain.org.uk/A/B
when what I really wanted to publish was
https://mydomain.org.uk/A#B
Having looked at some other answers it seems that any redirect with an anchor needs to be an absolute URL. So I put this in my .htaccess:
RewriteRule A/B https://mydomain.org.uk/A.php#B [NE,L]
(note, the .php is correct, A.php is the page file). And it just simply doesn't work. The browser simply loads A.php and displays it from the top.
I know that the rule pattern is matching, because if I make the target be a completely nonexistent page I get a 404 as expected.
Unfortunately my web hosting service doesn't let me use the Apache log, so it's hard to trace what's going wrong. Can anyone guide me to how to do the rewrite properly so that I pass the #anchor all the way through to the user's browser?
Thanks in advance!
When the RewriteRule is processed by the server, it basically changes internally which resource to access, without the browser noticing.
The only way to change the URL in the browser is to use the redirect flag (R). This makes the web server send an HTTP 302 response with a Location header, which in turn makes the browser change the URL and request the new page. This new URL can contain an anchor.
In your case the following rule should work:
RewriteRule A/B https://mydomain.org.uk/A.php#B [NE,R,L]
Please keep in mind that anchors are a browser feature so they are normally not sent to the server and therefore neither appear in access logs nor can be used in a RewriteRule.
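A slightly tighter variant, offered as a sketch: the pattern is anchored so that only the exact path A/B matches (the optional trailing slash is an assumption), and the status code is spelled out. R on its own defaults to 302; R=301 could be used instead if the redirect is meant to be permanent. It assumes the .htaccess file sits in the document root.

```apache
RewriteEngine On

# R turns this into an external redirect, so the browser receives the fragment.
# NE (noescape) keeps the "#" from being encoded as %23 in the Location header.
RewriteRule ^A/B/?$ https://mydomain.org.uk/A.php#B [NE,R=302,L]
```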

Old pages PR to

I have a site that has many links pointing to its pages. I am going to completely rebuild the site: update the CMS, the content and the page structure. The domain remains the same.
What will users get in their browser if they find an old link to one of the old site's pages
somewhere on the Internet and follow it, when the new site is already live and
the old page the link leads to no longer exists?
How can I redirect these old links to the root of the domain if the old
pages no longer open?
What about Google in that case: how do I avoid losing PR and
pass the PR weight of the old pages to the main domain?
I would appreciate any relevant discussion on this topic, because it's really interesting from all sides.
What will users get in their browser if they find an old link to one of the old site's pages somewhere on the Internet and follow it, when the new site is already live and the old page the link leads to no longer exists?
They will get a 404 Not Found.
How can I redirect these old links to the root of the domain if the old pages no longer open?
You'll need to create a 301 redirect from every old page to the equivalent new page. You can do it using mod_alias:
RedirectMatch 301 ^/old_page/(.*)$ /new_page/$1
or mod_rewrite:
RewriteRule ^old_page/(.*)$ /new_page/$1 [L,R=301]
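You'll obviously need to tailor the matching expressions and targets to your specific needs. If you have a lot of individual URLs that need redirecting, you may want to look into creating a RewriteMap (note that the RewriteMap directive itself has to go in the server or virtual host configuration, not in .htaccess).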
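As a rough illustration of the RewriteMap approach, the sketch below looks each old path up in a plain-text map. The map name oldpages, the file path /etc/apache2/oldpages.map and the sample entries are all hypothetical, and the block assumes it lives in the virtual host configuration, where patterns see the leading slash.

```apache
RewriteEngine On

# Hypothetical map file, one "old-path new-path" pair per line, e.g.:
#   about-us.html   /company/about
#   contact.html    /contact
RewriteMap oldpages "txt:/etc/apache2/oldpages.map"

# Redirect only when the requested path has an entry in the map;
# NOTFOUND is the default value returned for keys that are missing.
RewriteCond ${oldpages:$1|NOTFOUND} !=NOTFOUND
RewriteRule ^/(.+)$ ${oldpages:$1} [R=301,L]
```

Requests with no entry in the map fall through untouched, so a generic fallback rule can follow if one is wanted.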
What about Google in that case: how do I avoid losing PR and pass the PR weight of the old pages to the main domain?
As long as you use 301 redirects (a permanent redirect, as opposed to 302, a temporary redirect) Google's page ranking will transfer to the new URL.
What's better to use in this situation, mod_alias or mod_rewrite, and why?
Either is fine, but mod_rewrite gives you a lot more options and allows for rewrite maps. But if you are doing something simple, mod_alias is fine.
And the same question for 301 and 302: which is better to use here?
You want 301 here. It means "the resource that you requested has permanently moved to HERE", as opposed to a 302, which means "the resource that you requested isn't here right now, but in the meantime you can find it HERE". Also, Google won't transfer any page ranking to the new page if you only do a 302 redirect, since that is meant for temporary redirects, not for a page that has permanently moved to a new URL.
And lastly, I just looked at the old pages and they all look like domain.tld/index.php?id=77. Is this rule correct in that case, to send any id number to the root: RewriteRule ^index.php?id=(*)$ / [L,R=301]
This rule will not work. You cannot match against the query string (the ?id= part) in a RewriteRule, only against the %{QUERY_STRING} variable in a RewriteCond. Also, (*) is probably not what you want.
This will 301 redirect any request for /index.php?id=N, where N is any number, to the document root. The trailing ? on the target drops the original query string, which would otherwise be carried over to the redirect target (giving /?id=N).
RewriteCond %{QUERY_STRING} ^id=([0-9]+)
RewriteRule ^index\.php$ /? [L,R=301]

.htaccess for 301 redirect: which syntax is best?

I am permanently redirecting my website
http://www.oldsite.com
to
http://newsite.com/blog
Is there a difference between using
Redirect 301 / http://newsite.com/blog/
or
RewriteEngine On
RewriteRule ^(.*)$ http://newsite.com/blog/$1 [R=301,L]
Any reason I should use one over the other?
The first uses mod_alias to redirect all requests under / to http://newsite.com/blog/ with a 301 Moved Permanently response code. Redirect matches on the URL-path prefix and appends whatever follows the matched prefix to the target, so a request for, say, /some-post is sent to http://newsite.com/blog/some-post.
The other loads the Apache rewriting engine and rewrites all of the incoming requests that match ^(.*)$ to http://newsite.com/blog/ (appending the matched part of the request URI to the target URI), also with a 301 Moved Permanently response code.
The difference? For a plain site-wide move like this one, very little: both keep the requested path. The first is somewhat cheaper because it does not involve the rewriting engine, while the second gives you regular expressions and RewriteCond conditions if you later need to treat some URLs differently.
I believe the performance difference between the two would be imperceptible to a user.
What matters more is that every old URL lands on its equivalent on the new site. Assuming that all of the URLs on the old blog site cleanly map to the new site, either form above does that.
If you instead send everything to a single fixed URL (for example with RedirectMatch 301 .* http://newsite.com/blog/), all links to your old blog posts will end up on the home page of your new site, which is not a great experience for users who may have bookmarked links etc.
If you care about SEO, then it's the same story: all of your page rank will go from your old blog posts to your new site's home page.

Does mod_rewrite only translate external requests to internal files and not vice versa?

I think this is a very stupid question, so I apologise, as I think I may completely misunderstand mod_rewrite.
Say you have a URL
www.domain.com/products/item.php?id=1234
mod_rewrite can rewrite that to a friendly URL
www.domain.com/products/item/1234
(for example)
So, if I type in www.domain.com/products/item/1234 this will be rewritten to www.domain.com/products/item.php?id=1234 and that page is served. Fine.
But what if you type in www.domain.com/products/item.php?id=1234 - that page will be served but not rewritten to the friendly URL.
So my question is can you rewrite internal file names automatically? For example, all URLs on my site are currently in the www.domain.com/products/item.php?id=1234 format. When a user clicks this link can this be rewritten to the friendly URL? Or should you always hard code in the friendly URL?
I'm sorry if that made little sense! I'm getting confused because I want to rewrite the non-friendly URL to the friendly URL, but then serve the non-friendly URL, so won't that cause an infinite redirect loop?
Mod_rewrite can't really internally rewrite URLs across domains, though it could proxy them (using the P flag in a RewriteRule). Assuming that the domain is the same, you could redirect the client's browser to a friendly URL if the old one is used while internally rewriting the friendly URL back to the old one, but both have to be on the same domain. You do this by looking at the actual request (the %{THE_REQUEST} variable) instead of the URI, which changes as it gets rewritten internally.
This redirects the browser to the friendly URLs when the old URLs are used:
RewriteCond %{THE_REQUEST} ^([A-Z]{3,9})\ /products/item\.php
RewriteCond %{QUERY_STRING} id=([0-9]+)
RewriteRule ^products/item\.php$ /products/item/%1? [R=301,L]
This rewrites internally when a friendly URL is used:
RewriteCond %{THE_REQUEST} ^([A-Z]{3,9})\ /products/item/[0-9]+
RewriteRule ^products/item/([0-9]+) /products/item.php?id=$1 [QSA,L]
Mod_rewrite does not automatically rewrite the "non-friendly" URLs to the friendly URLs. You have to add some rules yourself to do this.
Also, mod_rewrite does not modify the links inside your HTML, CSS, or whatever you use. You need to change those yourself.
If a user uses the friendly URL, they will never know that it is rewritten. Mod_rewrite is transparent from the user's point of view. You can add an [R] flag to your rules, which makes Apache send a redirect to the client. This way the client does see the rewritten URL.
Redirecting the unfriendly URL to the friendly one should only be done to help search engines (and to prevent link rot, but that's rarer). This can be done without a redirect loop, contrary to what Sergey says.
Try looking around here on SO to find a script that does the redirect from the unfriendly to the friendly URL. Let me know if you can't find it, and I'll help.

Rewrite rules to hide an url

I need to hide the full path and show only the short URL:
www.site.com/this/is/path/to/hide/page.php --> www.site.com
Any idea how to do it with .htaccess, Apache and rewrite rules?
EXAMPLE:
If I type www.site.com I want to open index.php (in /),
but if I go to /hidden/path I want to open file.php (in hidden/path)
while keeping the browser URL as www.site.com.
EDIT:
I want to see www.site.com in the browser bar and I want to open the page at /this/is/path/to/hide/page.php.
Thanks
As I explained in "How does url rewrite works?", every external redirect (HTTP 3xx code) triggers a new request from the client for the new URL.
So the client will ask for www.site.com/this/is/path/to/hide/page.php, be redirected to www.site.com, and be served the index page like any other user.
There is no way to tell the client to display one URL in the browser bar instead of the one it requested; after a redirect the browser will always make a new request. (Otherwise you could impersonate any site, for example.)
If you are not against using a cookie, or can use an environment variable, you may be able to do something like:
RewriteRule this/is/path/to/hide/page.php / [R,CO=knowHiddenPath:true:www.site.com]
The environment variable flag has a similar syntax, with E instead of CO (E=knowHiddenPath:true).
(See http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html for cookie information)
Your index page should then check this cookie/variable to decide whether to serve the hidden page.
Another solution would be to protect the file with a password, so that even someone who knows the URL could not access it. Security by obscurity does not exist.
I believe you can do that with an Alias:
Alias / /this/is/path/to/hide/page.php
This directive needs to be in your <VirtualHost>
This will use mod_rewrite and you can put this into your .htaccess
# Activate Rewrite Engine
RewriteEngine On
# Home page rewrite rule
RewriteRule ^$ /this/is/path/to/hide/page.php [QSA,L]
This will ONLY work if you hit website root (e.g. http://www.example.com/)
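To also cover the second case from the question (/hidden/path opening file.php), a sketch along the same lines might look like the following. It assumes the .htaccess sits in the document root and that file.php lives at /hidden/path/file.php, as the question describes. Both rules are internal rewrites (no R flag), so the address bar keeps whatever URL the visitor typed; it cannot be made to show www.site.com for a request to a different path.

```apache
# Activate Rewrite Engine
RewriteEngine On

# Site root: serve the hidden page internally, without a redirect.
RewriteRule ^$ /this/is/path/to/hide/page.php [QSA,L]

# /hidden/path or /hidden/path/: serve file.php from that directory.
RewriteRule ^hidden/path/?$ /hidden/path/file.php [QSA,L]
```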

Resources