Complicated .htaccess rules to convert to web.config - .htaccess

I have the following .htaccess rules that I've used for a number of years; however, I now need them converted to work in a web.config file.
Does anybody have a clue where to start?
#Set the mapfiles for the category and product pages
RewriteMap mapfile1 txt:maps/map_categories.txt
RewriteMap mapfile2 txt:maps/map_products.txt
RewriteMap lower int:tolower
#FIX the trailing slash
RewriteRule ^([^.?]+[^.?/])$ $1/ [R=301,L]
#Define the rules for the CATEGORY pages
RewriteCond %{URL} ^(?!/(cms|cms/.*)$)
RewriteCond %{URL} ^(?!/(vadmin|vadmin/.*)$)
RewriteCond %{URL} ^(?!/(bathroom-blog|bathroom-blog/.*)$)
RewriteRule ^(?:[^/]+/)?([^/.]+)/?$ product_list.asp?catid=${mapfile1:${lower:$1}}&%{QUERY_STRING} [NC,L]
#Define the rules for the PRODUCT pages
RewriteCond %{URL} ^(?!/(cms|cms/.*)$)
RewriteCond %{URL} ^(?!/(vadmin|vadmin/.*)$)
RewriteCond %{URL} ^(?!/(bathroom-blog|bathroom-blog/.*)$)
#RewriteCond %{REQUEST_FILENAME}.asp -d
#RewriteCond %{REQUEST_FILENAME}.asp -f
RewriteRule ^.+/.+/([^/.]+)/$ product_detail.asp?prodID=${mapfile2:${lower:$1}}&%{QUERY_STRING} [NC,L]
Very grateful for any help!
Cheers, Chris.

I don't know a lot about .NET web.config files; the last time I used them was four years ago. So I'll start by explaining what your .htaccess file does, step by step. I focus on what is done and not how, i.e., I don't explain the many expressions like ^([^.?]+, which look like comic figures cursing around.
It's a long answer, so get yourself a coffee.
Let's start with the first three lines.
RewriteMap mapfile1 txt:maps/map_categories.txt
RewriteMap mapfile2 txt:maps/map_products.txt
RewriteMap lower int:tolower
Do you have a RewriteEngine on somewhere in your .htaccess file or perhaps in your host configuration?
RewriteMap sets up a mapping you can use in the rules below like this: ${mapfile1:argument}. You should have two files, map_categories.txt and map_products.txt, in your maps subdirectory. I can imagine that your application updates these text files. The third mapping converts upper case letters to lower case.
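As a rough illustration of what the two map lookups do, here is a small Python sketch (not Apache code). It assumes the usual txt: map layout of one "key value" pair per line; the sample entries and the helper names parse_txt_map and lookup are made up for the example.

```python
SAMPLE_MAP = """\
# hypothetical contents of maps/map_categories.txt
basin-taps 12
shower-trays 27
"""

def parse_txt_map(text):
    """Parse 'key value' lines; blank lines and lines starting with # are skipped."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            mapping[parts[0]] = parts[1]
    return mapping

def lookup(mapping, key):
    """Mimic ${mapfile1:${lower:key}}: lower-case the key, '' if it is missing."""
    return mapping.get(key.lower(), "")

categories = parse_txt_map(SAMPLE_MAP)
print(lookup(categories, "Basin-Taps"))  # 12
```

A key that is not in the file yields an empty string here, which is also what the rewritten URL would receive as its id when no default is given in the rule.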
#FIX the trailing slash
RewriteRule ^([^.?]+[^.?/])$ $1/ [R=301,L]
The comment is true. Let's say you have http://example.com/product, this gets redirected (HTTP 301 Moved Permanently) to http://example.com/product/.
#Define the rules for the CATEGORY pages
RewriteCond %{URL} ^(?!/(cms|cms/.*)$)
RewriteCond %{URL} ^(?!/(vadmin|vadmin/.*)$)
RewriteCond %{URL} ^(?!/(bathroom-blog|bathroom-blog/.*)$)
RewriteRule ^(?:[^/]+/)?([^/.]+)/?$ product_list.asp?catid=${mapfile1:${lower:$1}}&%{QUERY_STRING} [NC,L]
The three conditions are negative lookaheads, so this rule applies only when the URL is not one of these: http://example.com/cms, http://example.com/cms/, http://example.com/cms/any_path, and likewise for vadmin or bathroom-blog instead of cms. For any other URL with one or two path segments, product_list.asp is invoked with the corresponding category id. The rule uses the mapping map_categories.txt to map names to IDs, and also allows mixed case, even if the mapping only contains lower case names.
Summary: http://example.com/some_category/ calls product_list.asp with the category id of some_category, while anything under cms, vadmin or bathroom-blog is left alone.
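To see which URLs the three conditions let through and what the category rule captures, the patterns can be tried out in Python. This is only a sketch under a few assumptions: the conditions are tested against the path with its leading slash, the rule pattern against the path without it (as in a .htaccess file), the query string is ignored, and the category table is hypothetical.

```python
import re

# The three RewriteCond patterns, copied from the rules above.
CONDITIONS = [
    r"^(?!/(cms|cms/.*)$)",
    r"^(?!/(vadmin|vadmin/.*)$)",
    r"^(?!/(bathroom-blog|bathroom-blog/.*)$)",
]
# The RewriteRule pattern; [NC] corresponds to re.IGNORECASE.
CATEGORY_RULE = re.compile(r"^(?:[^/]+/)?([^/.]+)/?$", re.IGNORECASE)

def category_rewrite(url_path, categories):
    """Return the rewritten target, or None if the rule does not apply."""
    if not all(re.search(cond, url_path) for cond in CONDITIONS):
        return None  # one of the excluded areas (cms, vadmin, bathroom-blog)
    match = CATEGORY_RULE.match(url_path.lstrip("/"))
    if match is None:
        return None
    catid = categories.get(match.group(1).lower(), "")
    return "product_list.asp?catid=" + catid

cats = {"basin-taps": "12"}  # hypothetical map_categories.txt entry
print(category_rewrite("/Basin-Taps/", cats))  # product_list.asp?catid=12
print(category_rewrite("/cms/pages/", cats))   # None
```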
#Define the rules for the PRODUCT pages
RewriteCond %{URL} ^(?!/(cms|cms/.*)$)
RewriteCond %{URL} ^(?!/(vadmin|vadmin/.*)$)
RewriteCond %{URL} ^(?!/(bathroom-blog|bathroom-blog/.*)$)
#RewriteCond %{REQUEST_FILENAME}.asp -d
#RewriteCond %{REQUEST_FILENAME}.asp -f
RewriteRule ^.+/.+/([^/.]+)/$ product_detail.asp?prodID=${mapfile2:${lower:$1}}&%{QUERY_STRING} [NC,L]
The three RewriteCond conditions are exactly the same as the first three above. That is expected: RewriteCond directives apply only to the RewriteRule that immediately follows them, so they have to be repeated for every rule. This rule handles URLs with at least three path segments and a trailing slash, and maps the last segment to a product id via map_products.txt. Whether anything else is going on, for example that after a successful rewrite (without redirect) the rules are applied again to the now rewritten URL, I can't tell without access to your system.
I already know that I didn't see everything: RewriteEngine on is missing, and also the mapping files.
URL rewriting is one of those subjects that are notoriously hard to get right without some experimenting. No wonder it is difficult to get working help right out of the box here on Stack Overflow. The problem is that one almost always needs access to the systems involved, because it is impossible to give an answer without some research on the live systems.
As Apache itself says in: http://shib.ametsoc.org/manual/misc/rewriteguide.html#page-header:
The Apache module mod_rewrite is a killer one, i.e. it is a really sophisticated module which provides a powerful way to do URL manipulations. With it you can do nearly all types of URL manipulations you ever dreamed about. The price you have to pay is to accept complexity, because mod_rewrite's major drawback is that it is not easy to understand and use for the beginner. And even Apache experts sometimes discover new aspects where mod_rewrite can help.
In other words: With mod_rewrite you either shoot yourself in the foot the first time and never use it again or love it for the rest of your life because of its power.
Emphasis is mine.

After an hour or so of Googling, I have found that IIS7 supports mapfiles via its rewriteMaps functions. The following two web-pages have given me the solution that I need, and I have created a separate "rewritemaps.config" file to store the ID->URL mappings.
http://www.iis.net/learn/extensions/url-rewrite-module/using-rewrite-maps-in-url-rewrite-module
http://blogs.iis.net/ruslany/archive/2010/05/19/storing-url-rewrite-mappings-in-a-separate-file.aspx
For anyone wishing to convert .htaccess mapfile configurations over to web.config, these two pages will get you going.
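If the existing map files are long, the conversion can be scripted. The following Python sketch turns "key value" lines into the rewriteMaps/rewriteMap/add structure that the URL Rewrite module uses. The function name, the map name "CategoryMap" and the sample entry are made up, and whether the keys need adjusting (for example a leading slash) depends on what your rule actually looks up.

```python
import xml.etree.ElementTree as ET

def txt_map_to_rewrite_maps(text, map_name):
    """Convert 'key value' lines into a <rewriteMaps> XML fragment."""
    root = ET.Element("rewriteMaps")
    rewrite_map = ET.SubElement(root, "rewriteMap", name=map_name)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        if len(parts) >= 2:
            ET.SubElement(rewrite_map, "add", key=parts[0], value=parts[1])
    return ET.tostring(root, encoding="unicode")

# Hypothetical entry from maps/map_categories.txt
fragment = txt_map_to_rewrite_maps("basin-taps 12\n", "CategoryMap")
print(fragment)
```

The printed fragment could then be saved as the separate rewritemaps.config file mentioned above.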

Related

Don't initiate old-to-new-domain RewriteRule if a certain variable is specified in query string

I've searched SO and Google long and hard for this one and I'm really surprised not to have found an answer (or stumbled upon the solution by trial and error!). It's a slightly tough search as most of the keywords lead to people wanting to exclude query strings as part of their redirect, whereas I want to exclude certain query strings from the subsequent redirect entirely.
We have migrated a content site from olddomain.com (running Drupal) to newdomain.com (running Wordpress). All the paths stay the same and so we want to redirect like-for-like from one domain to the other. However, we still want to be able to access Drupal's admin panel (and other associated admin URLs) for a variety of reasons. These exceptions must be done by exclusion so that, when we are not redirecting to the new domain, Drupal's existing generic mod_rewrite rules still activate in order to serve the redirect-excluded URLs correctly.
The main "like for like" redirect rule looked like this, and works well:
RewriteCond %{REQUEST_URI} !^/?(admin/|index.php|install.php|authorize.php|cron.php|update.php|xmlrpc.php|batch)
RewriteRule ^(.*) https://newdomain.com/$1 [R=301,L]
However, some admin-only paths (typically for editing a piece of content) don't always use "admin" in the path, e.g.:
/node/4823/edit
So what I want to do is to be able to manually add a noredirect query string variable which is then used as a further negated RewriteCond of my existing RewriteRule so in essence I am saying "do a like-for-like redirect on all paths as long as they are not in any of these folders and noredirect doesn't appear in the query string".
This is as far as I've got, you can see some of the steps I've taken, all of which have failed so far:
RewriteCond %{REQUEST_URI} !^/?(admin/|index.php|install.php|authorize.php|cron.php|update.php|xmlrpc.php|batch)
#RewriteCond %{QUERY_STRING} !^noredirect=([^&]+)
#RewriteCond %{QUERY_STRING} !^/noredirect/
#RewriteCond %{QUERY_STRING} !^.*(\bnoredirect\b)
#RewriteCond %{QUERY_STRING} !^.*(noredirect)
RewriteCond %{QUERY_STRING} !(noredirect)
RewriteRule ^(.*) https://newdomain.com/$1 [R=301,L]
As I've gone on I've tried to make it more and more generic in order to try and simplify the task; all I care about is checking for "noredirect" anywhere in the query string, so I'd be happy with all of these query strings matching the exclusion and thus preventing the redirect:
?noredirect
?noredirect=
?foo=bar&noredirect=whatever
?thisbitsaysnoredirectinit&foo=bar
As always, I look forward to basking in your expertise...
After much fiddling, and reordering and testing the rules separately (stupid I didn't do this before, this clearly showed the query string rule in itself was fine), this was the solution:
RewriteCond %{REQUEST_URI} !(^|/)(admin/|index.php|install.php|authorize.php|cron.php|update.php|xmlrpc.php|batch|node)
RewriteCond %{QUERY_STRING} !(noredirect)
RewriteRule ^(.*) https://newdomain.com/$1 [R=301,L]
So it required the change from ^/? to (^|/) in the "exclude these folders" rule.
I'll happily admit I don't understand why this has fixed it. Unfixed, this condition itself was working fine, it was just preventing the next condition from being looked at. If there was a problem with the syntax it seems to me that the condition should have been broken entirely rather than messing with other conditions. If anyone can shed any light on that for me, thank you!
For what it's worth, RewriteBase is unspecified (so I assume defaults to root).
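For anyone who wants to check the two conditions outside Apache, here is a small Python sketch of the final rule set. The function name should_redirect is made up, the patterns are copied from the solution above, and the query string is passed without the leading "?" (as in %{QUERY_STRING}).

```python
import re

# First condition: paths that must stay on the old domain.
EXCLUDED_PATHS = re.compile(
    r"(^|/)(admin/|index.php|install.php|authorize.php|cron.php"
    r"|update.php|xmlrpc.php|batch|node)"
)
# Second condition: "noredirect" anywhere in the query string.
NOREDIRECT = re.compile(r"(noredirect)")

def should_redirect(request_uri, query_string):
    """True if both negated conditions pass and the redirect would fire."""
    if EXCLUDED_PATHS.search(request_uri):
        return False
    if NOREDIRECT.search(query_string):
        return False
    return True

print(should_redirect("/some-article", ""))                            # True
print(should_redirect("/some-article", "foo=bar&noredirect=whatever")) # False
print(should_redirect("/node/4823/edit", ""))                          # False
```

All four example query strings from the question suppress the redirect with this check, which matches the observation that the query string condition on its own was fine.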
Drupal-specific bonus
Also note the addition of "node" to the list of folders I'm excluding from the 1:1 redirect to the new domain.
When I thought about this more I realised that this would mean the admin-only extensions of "node" (e.g. /node/123/edit) would be safe as they are not rewritten to an alias. The public views of nodes (e.g. /node/123) would be initially ignored but then subsequently rewritten to their aliases by Drupal's own functionality, at which point .htaccess is called a second time and the redirect to the new domain (as the alias does not begin with "node") is activated.
This is not only a better system (anyone trying to go to an original Drupal node URL rather than an alias will not get redirected too soon) but of course means I now only have to use the ?noredirect query string in much rarer use cases.

.htaccess redirect whole website structure but leave some of the old structure intact

I have searched but can't quite find the answer to this exact situation:
My website structure is as follows
www.example.co.uk/old-folder1/old-folder2/old-page-name
I needed to redirect the whole structure to:
www.example.co.uk/new-folder1/old-folder2/old-page-name
I successfully did this with:
RewriteRule ^old-folder1/(.*)$ /new-folder1/$1 [R=302,L]
However, if possible for the moment I still want to serve images from the old structure ie.
www.example.co.uk/old-folder1/old-images-folder/old-image.jpg
At the moment the above is being rewritten as:
www.example.co.uk/new-folder1/old-images-folder/old-image.jpg
Which makes sense but this leads me to my question, is there any way of excluding some of the sub directories in 'old-folder1' from the RewriteRule above so that for example www.example.co.uk/old-folder1/old-images-folder/old-image.jpg is still accessible?
I don't have much experience with this but from research I came up with the following but it doesn't work.
RewriteCond %{REQUEST_URI} !^/old-folder1/old-images-folder/.*$
I am beginning to think this might not even be possible with the approach I have taken with:
RewriteRule ^old-folder1/(.*)$ /new-folder1/$1 [R=302,L]
Thanks
Entire contents of the .htaccess file at the moment is:
Options +SymLinksIfOwnerMatch
RewriteEngine on
RewriteBase /
RewriteCond %{REQUEST_URI} !^/old-folder1/old-images-folder/.*$
RewriteRule ^old-folder1/(.*)$ /new-folder1/$1 [R=302,L]
Try:
RewriteRule ^old-folder1/(?!old-images-folder)(.*)$ /new-folder1/$1 [R=302,L]
However, what's probably happening is that your images are linked using relative URLs, so they may be relatively linked to the "new" images folder.
You should check your links and make sure they're linked correctly to the old path.
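The effect of the negative lookahead in that rule can be checked quickly in Python. This is only a sketch; redirect_target is a made-up helper and the paths are the ones from the question, written without the leading slash as a .htaccess rule sees them.

```python
import re

# Same pattern as the suggested RewriteRule.
RULE = re.compile(r"^old-folder1/(?!old-images-folder)(.*)$")

def redirect_target(path):
    """Return the redirect target, or None if the path is left untouched."""
    match = RULE.match(path)
    if match is None:
        return None
    return "/new-folder1/" + match.group(1)

print(redirect_target("old-folder1/old-folder2/old-page-name"))
# /new-folder1/old-folder2/old-page-name
print(redirect_target("old-folder1/old-images-folder/old-image.jpg"))
# None
```

Note that the lookahead as written also skips any folder whose name merely starts with old-images-folder; adding a trailing slash inside the lookahead would narrow it to that one folder.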

Redirect Desktop Internal Pages to Correct Mobile Internal Pages with Htaccess

I have built a Mobile site in a sub-domain.
I have successfully implemented the redirect 302 from:
www.domain.com to m.domain.com in htaccess.
What I'm looking to achieve now is to redirect users from:
www.domain.com/internal-page/ > 302 > m.domain.com/internal-page.html
Notice that URL name for desktop and mobile is not the same.
The code I'm using looks like this:
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
# Mobile Redirect
# Verify Desktop Version Parameter
RewriteCond %{QUERY_STRING} (^|&)ViewFullSite=true(&|$)
# Set cookie and expiration
RewriteRule ^ - [CO=mredir:0:www.domain.com:60]
# Prevent looping
RewriteCond %{HTTP_HOST} !^m.domain.com$
# Define Mobile agents
RewriteCond %{HTTP_ACCEPT} "text\/vnd\.wap\.wml|application\/vnd\.wap\.xhtml\+xml" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "sony|symbian|nokia|samsung|mobile|windows ce|epoc|opera" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "mini|nitro|j2me|midp-|cldc-|netfront|mot|up\.browser|up\.link|audiovox" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "blackberry|ericsson,|panasonic|philips|sanyo|sharp|sie-" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "portalmmm|blazer|avantgo|danger|palm|series60|palmsource|pocketpc" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "smartphone|rover|ipaq|au-mic,|alcatel|ericy|vodafone\/|wap1\.|wap2\.|iPhone|android" [NC]
# Verify if not already in Mobile site
RewriteCond %{HTTP_HOST} !^m\.
# We need to read and write at the same time to set cookie
RewriteCond %{QUERY_STRING} !(^|&)ViewFullSite=true(&|$)
# Verify that we previously haven't set the cookie
RewriteCond %{HTTP_COOKIE} !^.*mredir=0.*$ [NC]
# Now redirect the users to the Mobile Homepage
RewriteRule ^$ http://m.domain.com [R]
RewriteRule $/internal-page/ http://m.domain.com/internal-page.html [R,L]
At the end, you have two RewriteRule lines which I believe should be changed to:
RewriteRule ^\/?$ http://m.domain.com [R=302]
RewriteRule ^\/?(.*?)\/?$ http://m.domain.com/$1.html [R=302,L]
The ^\/?(.*?)\/?$ means give me a string that starts at the beginning (^) and gives me all characters ((.*?)) until the end ($) without the leading/trailing slash (/) if there is one (?). The group is lazy (.*? rather than .*) because a greedy group would swallow the trailing slash as well.
The http://m.domain.com/$1.html means that if the address is http://www.domain.com/internal-page/ then it becomes http://m.domain.com/internal-page.html.
The [R=302,L] should mean a 302 redirect (R=302) and the last rewrite (L), so no other rewrites can occur on our URL.
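Whether the captured group includes the trailing slash depends on whether it is greedy or lazy, and that decides what ends up in front of .html. A quick way to check is this Python sketch (mobile_url is a made-up helper; the path is given without the leading slash, as a .htaccess rule sees it):

```python
import re

GREEDY = re.compile(r"^/?(.*)/?$")   # (.*) takes as much as it can
LAZY = re.compile(r"^/?(.*?)/?$")    # (.*?) takes as little as it can

def mobile_url(pattern, path):
    """Build the redirect target from the first captured group."""
    match = pattern.match(path)
    return "http://m.domain.com/" + match.group(1) + ".html"

print(mobile_url(GREEDY, "internal-page/"))  # http://m.domain.com/internal-page/.html
print(mobile_url(LAZY, "internal-page/"))    # http://m.domain.com/internal-page.html
```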
EDIT:
I believe that in the case of your RewriteRules the first one was redirecting to http://m.domain.com in the event that the URL was just the domain, but if there was anything else then the second rewrite was failing because it was not actually literally /internal-page/ and you needed a regex variable to put into the new URL.
EDIT (2):
To redirect to each mobile page from a specific desktop page:
RewriteRule ^\/?foo\/?$ http://m.domain.com/bar.html [R=302]
RewriteRule ^\/?hello\/?$ http://m.domain.com/world.html [R=302]
The (/?) means that a / is optional in that position (in a .htaccess file the leading slash is stripped before the pattern is matched, so it has to be optional there too) and the (^) denotes beginning and ($) denotes ending in this case (the ^ can also be used to indicate something like [^\.] which means anything except a period).
Just put however many of those you need in a row to do the redirecting and that should do the trick. To make sure there are no misconceptions, the first line would mean that http://www.domain.com/foo/ would become http://m.domain.com/bar.html and because the trailing slash is made optional http://www.domain.com/foo (notice the trailing forward slash is absent) would also redirect to http://m.domain.com/bar.html.
You can play with the syntax a bit to customize it, but hopefully I've pointed you in the right direction. If you need anything else, let me know, I'll do my best to assist.
I don't want to sound like a broken record or anything, but I feel that I could not, in good conscience, end this edit without pointing out that modifying the mobile site would be a much better way to do this. If it is not possible or you feel that a few static redirects are not a big deal versus modifying some pages, then I totally understand, but here are a few things for you to think about:
If the mobile site and desktop site are in separate folders then the exact same name scheme can be used for both making the Rewrites simpler and meaning that as new pages/content are added you will not need more Rewrite statements (making more rewrites means you have to create the new pages and then you have to create the redirects. that's extra work and more files which require your attention.)
If the mobile site is actually hosted from the same directory as the desktop site, then changing the files for one or the other so it becomes something like /desktop-foo/ or /d-foo/ then it is very easy to make the rewrite (redirect) go to something like /m-foo.html. You could forego modifying the desktop pages and make /foo/ become /m-foo.html and make all your mobile versions begin with an 'm'.
The third option that comes to mind is the most difficult and time consuming, depending on the content of the site, but it is a pretty cool one and ultimately would make the site the easiest to work on (after the initial work, of course). It is quite possible to use the same page for desktop, mobile, tablet, etc without the use of mod_rewrite or separate pages. Things like media queries in your CSS would allow you to change the look of the page depending on what the client is viewing it from. I came across a tutorial on the subject earlier which used media queries and the max-width of the screen to determine how the page should look. This would require a good bit of work now, but could save some hassle down the road as well as being an interesting learning experience if you are up to the challenge.
Again, sorry that this veered off topic at the end there, but I got the impression from your original question and your responses that you might find the alternatives interesting if you haven't already considered and dismissed them and that even if the alternatives do not interest you that you aren't going to be like some people and respond with, "Hey, $*%& you, buddy! I asked for Rewrites not all that other garbage!" I hope you take it as nothing more than what it is intended to be...helpful.

Can I create a search engine friendly URL from this custom ColdFusion CMS URL?

I have inherited a custom ColdFusion CMS app. The URLs that it creates are horrendous: not at all suitable for SEO, or for readability for that matter. An example of a URL in this CMS is:
http://www.mysite.com/Index2.cfm?a=000003,000010,000019,001335
Basically, each level of hierarchy is stored in the database based upon that long string of comma separated values. So in the case of the example I used, that particular page is 4 levels deep in the CMS hierarchy.
Basically what I would like to see is a format similar to this
http://www.mysite.com/level-1/level-2/level-3/level-4
Is this possible? Any help would be greatly appreciated. For what it's worth we are using ColdFusion 6 at present time, but will be upgrading to 8 in the near future.
First of all, are you willing to have the index.cfm in the URL? Like: http://www.mysite.com/index.cfm/level-1/level-2/level-3/level-4 ? If not, then you'll need to be doing a rewrite to remove the index.cfm, but still allow CF to process the page. Your .htaccess would look something like this:
RewriteEngine On
# If it's a real path, just serve it
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule . - [L]
# Redirect if no trailing slash
RewriteRule ^(.+[^/])$ $1/ [R=301,L]
# Rewrite URL paths
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-d
RewriteRule ^([a-zA-Z0-9/-]+)$ /index.cfm%{REQUEST_URI} [PT]
Next step, you'll need to "catch" the URLs and serve up the correct pages based on the SEO-friendly URLs. You can grab the incoming URL from the CGI.path_info variable. It's hard to know what your code should look like without knowing how it currently processes those URL variables, but essentially you'd have some kind of mapping function that grabbed the SEO-friendly names and substituted in the numbers to grab the content.
The third step is rewriting any URLs that are generated by your CMS to output the SEO-friendly URLs. Same mapping happens here, only in reverse.
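The mapping in both directions could look roughly like the sketch below. It is written in Python rather than ColdFusion purely for illustration. The lookup table is hypothetical (a real CMS would read it from the database, and names may only be unique within their parent level), as are the function names.

```python
# Hypothetical slug <-> id table for the example URL in the question.
SLUG_TO_ID = {
    "level-1": "000003",
    "level-2": "000010",
    "level-3": "000019",
    "level-4": "001335",
}
ID_TO_SLUG = {value: key for key, value in SLUG_TO_ID.items()}

def path_to_query(path_info):
    """'/level-1/level-2' -> '000003,000010' (incoming request)."""
    slugs = [part for part in path_info.split("/") if part]
    return ",".join(SLUG_TO_ID[slug] for slug in slugs)

def query_to_path(a):
    """'000003,000010' -> '/level-1/level-2' (links generated by the CMS)."""
    return "/" + "/".join(ID_TO_SLUG[item] for item in a.split(","))

print(path_to_query("/level-1/level-2/level-3/level-4"))
# 000003,000010,000019,001335
print(query_to_path("000003,000010,000019,001335"))
# /level-1/level-2/level-3/level-4
```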

Using .htaccess to rewrite clean & custom URL structure

How do I redirect
website.com/about.php
to
website.com/about
Also, is it possible to manually create the appearance of subdirectories using .htaccess?
i.e. website.com/project1.php
to
website.com/projects/project1
Much appreciated!
Yes!
You can use the Apache RewriteEngine module to achieve this end. The documentation is here: http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html
There are a number of good tutorials around also.
Some examples of rules are:
RewriteRule ^/(\S+)\.php /$1
RewriteRule ^/(project\d+)\.php /projects/$1
However, these are just guesses at what you might want to achieve, there are always corner cases. (Also, these are not tested!)
Bear in mind the above two rules would not necessarily be good together as written. Take note in the documentation about setting up the RewriteEngine and using appropriate RewriteRule options.
For example, when specifying many RewriteRules it is common to specify the [L] option so that the RewriteEngine stops rewriting after the rule is applied. Ordering of rules, therefore, can be significant.
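Why ordering matters once [L] is in play can be shown with a tiny first-match-wins simulation in Python. This is a sketch only: first_match is a made-up helper, and the two patterns are escaped, anchored variants of the example rules above.

```python
import re

GENERIC = (r"^/(\S+)\.php$", r"/\1")                  # /about.php -> /about
PROJECT = (r"^/(project\d+)\.php$", r"/projects/\1")  # /project1.php -> /projects/project1

def first_match(rules, path):
    """Apply the first rule whose pattern matches, then stop (like [L])."""
    for pattern, replacement in rules:
        if re.search(pattern, path):
            return re.sub(pattern, replacement, path, count=1)
    return path

print(first_match([PROJECT, GENERIC], "/project1.php"))  # /projects/project1
print(first_match([GENERIC, PROJECT], "/project1.php"))  # /project1
```

With the generic rule first, the project rule never gets a chance, so the more specific rule has to come first.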
Here are your examples putting the .htaccess on your root directory and using mod_rewrite
RewriteEngine on
RewriteRule ^about\.php$ about [R]
RewriteRule ^projects/project([0-9]+)$ project$1.php

Resources