IIS URL Rewrite condition file exists using capture groups? - iis

Seems like this would be really easy but can't seem to find an example.
User types in the following URL:
http://www.mysite.com/business/chelan/internet
I want it to go to:
/business/internet_chelan.php
ONLY if the file exists there. Otherwise, go to:
/business/internet.php
"Chelan" is a county in my area. If I want to create a specific page for that county, I could simply add "_chelan" to my file. Otherwise, it just goes to the general internet page.
So I use the following to grab URL parts:
^(business|residential)/([^/]+)/(.+)
and rewrite to:
{R:1}/{R:3}_{R:2}.php
Works beautifully... but I want to have the condition:
if {R:1}/{R:3}_{R:2}.php "Is a file" rewrite to {R:1}/{R:3}_{R:2}.php
But it seems like {R:} does not work in conditions??? Is this true or is there some other way I can write it? I've seen {C:1} and $1 but they don't seem to work either.
Any ideas?
IIS 7.5 on Windows 7 using URL Rewrite module
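To make the intended fallback explicit, here is a minimal sketch of the logic in Python. It is not URL Rewrite configuration, only an illustration of "use the county-specific file if it exists, otherwise the general page". The function name `resolve` and the `site_root` parameter are hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

# Same pattern as the rule above: section / county / page
PATTERN = re.compile(r"^(business|residential)/([^/]+)/(.+)")

def resolve(url_path: str, site_root: Path) -> Optional[str]:
    """Return the rewritten path, or None if the pattern does not match."""
    m = PATTERN.match(url_path)
    if m is None:
        return None
    section, county, page = m.groups()
    # Equivalent of {R:1}/{R:3}_{R:2}.php
    specific = f"{section}/{page}_{county}.php"
    if (site_root / specific).is_file():
        return specific
    # Otherwise fall back to the general page, i.e. {R:1}/{R:3}.php
    return f"{section}/{page}.php"
```

With a file business/internet_chelan.php under `site_root`, a request for business/chelan/internet resolves to the county page, while any other county falls through to business/internet.php.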

Related

.htaccess: redirect specific link to another?

I have these three links:
localhost/my_projects/my_website.php
localhost/my_projects/my_website.html
localhost/my_projects/my_website
The paths of the php and html files are as follows:
C:\xampp\htdocs\my_projects\my_website.php
C:\xampp\htdocs\my_projects\my_website.html
The link without an extension is "artificial" and I want to use said link:
localhost/my_projects/my_website
to get the contents of either of these links:
localhost/my_projects/my_website.php
localhost/my_projects/my_website.html
The reason for the two example files, instead of just one, is that I want to be able to switch between those two files when I edit the htaccess file. Obviously I only want to access one of those files at a time.
What do I need to have in my .htaccess file inside the my_projects folder to accomplish that? How can I make one specific link redirect to another specific link?
After reading your comment clarifying your folder structure I corrected the RewriteRule. (By the way, it would be best if you add that info to the question itself instead of in comments).
The url you want to target is: http://localhost/my_projects/my_website
http:// is the protocol
localhost is your domain (it could also be 127.0.0.1 or a domain name like www.example.com on the Internet)
I assume you are running Apache on port 80, otherwise in the url you need to also specify the port. For port 8086 for example it would be http://localhost:8086/my_projects/my_website.
The real path is htdocs/my_projects/my_website.php or htdocs/my_projects/my_website.html depending on your needs (obviously both won't work at the same time).
Here the my_projects in the "fake" url collides with the real folder "my_projects" so Apache will go for the folder and see there is no my_website (with no extension) document there (it won't reach the rewrite rules).
There is a question on SO that provides a workaround for this, but it is not a perfect solution: it has edge cases where the URL will still fail or make other URLs fail. I posted it yesterday, but I can't seem to find it now.
The simple solution if you have the flexibility for doing it is to change the "fake" url for it not to collide with the real path.
One option is for example to replace the underscores with hyphens.
Then you would access the page as http://localhost/my-projects/my-website if you want to keep a sort of "fake" folder structure in the url. Otherwise you could simply use http://localhost/my-website.
Here are both alternatives:
# This is for the directory not to be shown. You can remove it if you don't mind that happening.
Options -Indexes
RewriteEngine On
#Rule for http://localhost/my-projects/my-website
RewriteRule ^my-projects/my-website(.+)?$ my_projects/my_website.php$1 [NC,L]
#Rule for http://localhost/my-website
RewriteRule ^my-website(.+)?$ my_projects/my_website.php$1 [NC,L]
(Don't use both, just choose one of these two, or use them to adapt it to your needs)
The first part of the rewrite rule is the regular expression for your "fake" URL; the second part is the relative path of your real folder structure up to the page you want to show.
In the regular expression we capture whatever follows .../my-website and paste it after my_website.php in the second part of the rule (the $1). Note that a query string (?a=b) is not part of what RewriteRule matches; Apache carries it over to the rewritten URL unchanged by default, so the capture only picks up extra path text.
Later on, if you want to point the URL to my_website.html, change the second part of the rule: where it says .php, replace it with .html.
By the way, it is perfectly valid, and you'll see it on most SEO-friendly web sites, to write a URL as http://www.somesite.com/some-page-locator and have a rewrite rule that translates that URL to a page on the website, which is what I had written in my first answer.
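What the first rule above does to a request path can be sketched in Python. This only imitates the pattern match and substitution, not Apache itself; the function name `rewrite` is made up for illustration, and the input is assumed to be the path as .htaccess sees it (no leading slash, no query string).

```python
import re

# Mirrors: RewriteRule ^my-projects/my-website(.+)?$ my_projects/my_website.php$1 [NC,L]
# re.IGNORECASE plays the role of the NC flag.
RULE = re.compile(r"^my-projects/my-website(.+)?$", re.IGNORECASE)

def rewrite(path: str) -> str:
    """Return the internal path for a request path, or the path unchanged."""
    m = RULE.match(path)
    if m is None:
        return path  # rule does not apply
    rest = m.group(1) or ""  # the optional group is None when nothing follows
    return f"my_projects/my_website.php{rest}"
```

Switching the target to the .html file would mean changing only the string returned on the last line, just as in the rule.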

Notes 9, rewriting URLs

How do you rewrite a URL in Notes 9 XPages?
Let's say I have:
www.example.com/myapp.nsf/page-name
How do I get rid of that .nsf part:
www.example.com/page-name
I don't want to do lots of manual re-direct because my pages are dynamically formed like wordpress.
I've read this: http://www.ibm.com/developerworks/lotus/library/ls-Web_site_rules/
It does not address the issue.
If you use substitution rules like the following, you can get rid of the db.nsf part and call your XPages directly as example.com/xpage1.xsp:
Rule (substitution): /db.nsf/* -> /db.nsf/*
Rule (substitution): /* -> /db.nsf/*
However, you have to "manually" generate your URLs without the db.nsf part in e.g. menus because the XPages runtime will include the db.nsf part in the URLs if you use for instance the openPage simple action.
To completely control what goes in and out, put your Domino server behind an Apache HTTP server and use mod_rewrite. On Domino 9.0 for Windows you can use mod_domino.
You can do it with a mix of substitutions, a "URL pattern", and partial refresh.
I had the same problem: my customers want clean URLs for SEO.
My URLs now look like this:
www.myserver.de/products/financesoftware/anyproduct
First I used one substitution to cover the folder, database, and XPage part of the URL.
My substitution: "/products" -> "/web/techdemo.nsf/product.xsp"
The problem with this is that any update on the page (in redirect mode) gives the user back the "dirty" URL.
I solved this by using partial refreshes only.
Last but not least, I use my own slash pattern at the end of the XPage call (.xsp).
In my case that's the "/financesoftware/anyproduct/" part.
I used facesContext.getExternalContext().getRequestPathInfo() to resolve that URL part.
Currently I use a good old RegExp to get the slash-separated parameters back out of the URL, but I am investigating a REST solution at the moment.
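As a rough illustration of pulling the slash-separated parameters out of the path info, here is a Python sketch. It uses a plain split rather than the regular expression mentioned above, and the function name is hypothetical; in the XPage itself the input would be the string returned by getRequestPathInfo().

```python
from typing import List

def split_path_info(path_info: str) -> List[str]:
    """Split a path like '/financesoftware/anyproduct/' into its non-empty segments."""
    return [segment for segment in path_info.split("/") if segment]
```

Empty segments from leading, trailing, or doubled slashes are dropped, so the caller gets only the parameter values in order.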
I haven't actually done this, but I saw the option yesterday while looking for something else. In your XPage, go to All Properties and look at 'navigationRules' and 'pageBaseUrl'. I think you will find what you are looking for there.

CakePHP nice urls - how to prevent normal urls from working

I have a website that's written using CakePHP. I've added some rewrite rules in the .htaccess file to change the default URLs to different ones (instead of /controller1/action1/parameter I have /some-string-about-controller-and-action/parameter, for example).
The problem is that now both the normal URL and the nice one are available, and Google seems to be indexing both, which is a problem. I'd like to keep only the nice one. What is the proper way to handle this so that it affects the Google results as little as possible?
I don't know why you don't want to use Cake's own routing (if you are having trouble doing what you want, you can accomplish it with a custom route class). In any case, make sure that you redirect all relevant URLs in your .htaccess file to the desired URL using a 301 Moved Permanently redirect.
This way Google will index the target URL instead of the undesirable one. You are right to be concerned about this; double indexing is a great way to harm your SEO rankings.

mod_rewrite so a specific directory is removed/hidden from the URL, but still displays the content

I'd like to create a rewrite in .htaccess for my site so that when a user asks for URL A, the content comes from URL B, but the user still sees the URL as being URL A.
So, for example, let's say I have content at mydomain.com/projects/project-example. I want users to be able to ask for mydomain.com/project-example, still see that URL in their address bar, but the browser should display the content from mydomain.com/projects/project-example.
I've looked through several .htaccess rewrite tips and FAQs, but unfortunately none of them seemed to present a solution for exactly what I've described above. Not everything on my domain will be coming from the /projects/ directory, so I'd imagine the rewrite should check to see if the page exists first so it's not appending /projects/ to every url. I'm really stumped.
If a rewrite is not exactly what I need, or if there is a simple solution for this problem, I'd love to hear it.
This tutorial should have everything that you need, including addressing exactly what you are asking: http://httpd.apache.org/docs/2.0/misc/rewriteguide.html . It may just be a matter of terminology.
So, for example, let's say I have content at mydomain.com/projects/project-example. I want users to be able to ask for mydomain.com/project-example, still see that URL in their address bar, but the browser should display the content from mydomain.com/projects/project-example.
With something like:
RewriteRule ^project-example$ /projects/project-example [L]
When someone requests http://mydomain.com/project-example, the URI /project-example gets rewritten internally to /projects/project-example. Note that when this is in an .htaccess file, the URI /project-example has its leading slash removed before matching.
If you have a directory of stuff, you can use regular expressions and back-references. For example, if you want any request for http://mydomain.com/stuff/ to map to /internal/stuff/:
RewriteRule ^stuff/(.*)$ /internal/stuff/$1 [L]
So requests for http://mydomain.com/stuff/file1.html, http://mydomain.com/stuff/image1.png, etc. get rewritten to /internal/stuff/file1.html, /internal/stuff/image1.png, etc.
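The leading-slash behaviour and the back-reference can be imitated in a few lines of Python. This is a sketch only, assuming the rule lives in an .htaccess file at the document root so that exactly one leading slash is stripped before matching; the function name is made up.

```python
import re

# Mirrors: RewriteRule ^stuff/(.*)$ /internal/stuff/$1 [L]
RULE = re.compile(r"^stuff/(.*)$")

def rewrite_in_htaccess(uri: str) -> str:
    """Return the internally rewritten URI, or the URI unchanged if the rule does not match."""
    # Per-directory context: the pattern is matched without the leading slash
    path = uri[1:] if uri.startswith("/") else uri
    m = RULE.match(path)
    if m is None:
        return uri
    # m.group(1) is what $1 refers to in the substitution
    return "/internal/stuff/" + m.group(1)
```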

Getting "mywebsite.org/" to resolve to "mywebsite.org/index.php"

At my work we have various web pages that, my boss feels, are being ranked lower than they should be because "mywebsite.org/category/" looks like a different URL to search engines than "mywebsite.org/category/index.php" does, even though they show the same file. I don't think it works this way but he's convinced. Maybe I'm wrong though. I have two questions:
How do I make it so that it will say "index.php" in the address bar of all subcategories?
Is this really how pagerank works?
Besides changing all the links everywhere, a simpler solution is to use a rewrite rule. Make sure it is a permanent redirect, or Google will keep using the old link (without index.php). How you do this exactly depends on your web server, but for Apache HTTPd it looks something like the example given below.
Yes. Or so I've heard. Very few people know for sure. But Google mentions this guideline (as "Be consistent"). Make sure to check out all of Google's Webmaster guidelines.
Apache config for rewrite rule:
# in the generic config
LoadModule rewrite_module modules/mod_rewrite.so
# in your virtual host
RewriteEngine On
# redirect everything that ends in a slash to the same, but with index.php added
RewriteRule ^(.*)/$ $1/index.php [R=301,L]
# or the other way around, as suggested
# RewriteRule ^(.*)/index\.php$ $1/ [R=301,L]
Adding this code to the top of every page should also work:
<?php
if (substr($_SERVER['REQUEST_URI'], -1) == '/') {
    $new_request_uri = $_SERVER['REQUEST_URI'].'index.php';
    header('HTTP/1.1 301 Moved Permanently');
    header('Location: '.$new_request_uri);
    exit;
}
?>
You don't tell us if you're using straight PHP or some other framework, but for PHP, probably you just need to change all the links on your site to "mywebsite.org/category/index.php".
I think it's possible that this does affect your search engine rank. However, you would be better off using only "mywebsite.org/category" rather than adding "index.php" to each one.
Bottom line is that you need to make sure all your links in your website use one or the other. What actually gets shown in the address bar is unimportant.
A simple solution is to put in the <head> tag:
<link rel="canonical" href="http://mywebsite.org/category/" />
Then, no matter which page the search engine ends up on, it will know it is simply a different view of /category/
And for your second question: yes, it can affect your results, if Google thinks you are spamming. If it didn't, they wouldn't have added support for rel="canonical", although I wouldn't be surprised if they treat somedir/index.* the same as somedir/.
I'm not sure if /category/ and /category/index.php are considered two URLs for SEO, but there is a good chance that it will affect your rankings one way or another. There is nothing wrong with making a quick change just to be sure.
A few thoughts:
URLs
Rather than adding /index.php, you will be better off making it so there is no index.php on any of them, since the keyword 'index' is probably not what you want.
You can make a script that will check if the URL of the current page ends in index.php and remove it, then forward to the resulting URL.
For example, on one of my sites I require the 'www.' for my domain (www.domain.com and domain.com are considered two URLs for search purposes, though not always), so I have a script that checks each page and, if there is no www., adds it and forwards.
if (APPLICATION_LIVE) {
    // Redirect any request whose host is not the canonical www host
    if (strtolower($_SERVER["HTTP_HOST"]) != "www.domain.com") {
        header("HTTP/1.1 301 Moved Permanently"); // Recognized by search engines and may count the link toward the correct URL...
        // REQUEST_URI already starts with "/", and Location needs an absolute URL with a scheme
        header("Location: http://www.domain.com" . $_SERVER["REQUEST_URI"]);
        exit();
    }
}
You could modify that to do what you need.
That way, if a crawler visits the wrong URL, it will be notified that it was replaced with the correct URL. If a person visits the wrong URL, they will be forwarded to the correct URL (most won't notice), and then if they copy the url from the browser to send someone or link to that page, they will end up linking to the correct url for that page.
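The index.php-stripping check described above could look roughly like this. It is a Python sketch of the decision only (the actual 301 response would be sent by whatever serves the page), and the function name is hypothetical.

```python
from typing import Optional

def canonical_redirect(request_uri: str) -> Optional[str]:
    """Return the URI to 301-redirect to, or None if the URI is already canonical."""
    # Separate the path from any query string so the query survives the redirect
    path, sep, query = request_uri.partition("?")
    if not path.endswith("/index.php"):
        return None
    stripped = path[: -len("index.php")]  # keeps the trailing slash
    return stripped + sep + query
```

A caller would send a 301 with the returned value in the Location header whenever the result is not None, and serve the page normally otherwise.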
LINKING URLS
The way other pages link to your pages is more important for SEO. Make sure all your in-site links use the proper URL (without /index.php), and that if you have a 'link to this page' feature, it doesn't include the /index.php part. You can't control how everyone links to you, but you can take some control over it, like with the script in item 1.
URL ROUTING
You may also want to consider using some sort of framework or stand-alone URL rerouting scheme. It could make it so there were more keywords, etc.
See here for an example: http://docs.kohanaphp.com/general/routing
I agree with everyone who's saying to ditch the index.php. Please don't force your visitor to type index.php if not typing it could get them the same result.
You didn't say if you're on an IIS or Apache server.
IIS can be set to assume index.php is the default page so that http://mywebsite.org/ will resolve correctly without including index.php.
I would say that if you want to include the default page and force your users to type the page name in the url, make the page name meaningful to a search engine and to your visitors.
Example:
http://mywebsite.org/teaching-web-scripting.php
is far more descriptive and beneficial for SEO rankings than just
http://mywebsite.org/index.php
Might want to take a look at robots.txt files? Not quite the best solution, but you should be able to implement something workable with them...

Resources