RewriteRule - redirect multi variable URL to multi variable URL - .htaccess

Our old website has a search URL structure like this:
example.com/Country/United States/Region/California/Area/Southern California/City/San Diego/Suburb/South Park/Type/House/Bedrooms/4/Bathrooms/3/
This is currently rewritten to point to the physical page:
/search/index.aspx
The parameters in the URL can be mixed up in different orders, and the URL can include one or more parameters.
We want to 301 redirect these old URLs to a new structure that is ordered in a logical way and more concise:
example.com/united-states/california/southern-california/san-diego/south-park/?type=house&bedrooms=4&bathrooms=3
example.com/united-states/california/?type=house&bedrooms=4&bathrooms=3
Is there a way with URL rewriting to interrogate the old URL, work out what parameters are existing and then write out the new URL structure?
Even if we can limit it to just the Country, Region, Area, City and Suburb, that may be good enough to at least return some results even if it's not perfect.
Also, spaces should be turned into hyphens and all text made lowercase.
I already have the RewriteRule to turn the new URL structure into a URL to point to a physical page. It's just transforming the old URL in to the new URL I need help with. I've googled endlessly and it's just beyond me!
Can anyone help? Thanks.

Since you already have the old search page with rewriting rules set up for it and which is capable of parsing all parameters you need, the easiest and most appropriate solution I see here is to issue a redirect you require from this old search page's code. Just put the code that composes new URL with all parameters needed and redirects from this page - this should be a lot easier than trying to parse all these parameters in .htaccess and combine them into the new format.

Related

HTAccess Redirect using main parameter, ignore all others

firstly I know and understand how to redirect based on parameters :)
My issue is that I need to redirect all links based on the supplied MenuID parameter and ignore any other information in the query string, as not all parameters are used in each web page request, e.g. menuid=2738421; is New Products
http://www.domain.com/shop.php?menuid=2738421&menuref=Ja&menutitle=New+Products& limit=5&page=24
or,
http://www.domain.com/shop.php?menuid=2738421&menuref=Ja&menutitle=New+Products&limit=20&page=3
or,
http://www.domain.com/shop.php?menuid=2738421&menuref=Ja&page=12&limit=15
to
http://www.domain.com/new.html?page=x&limit=x
The reason for the redirection is that search-engines have indexed these pages and so I need to prettify the URLs.
Is this actually possible to create a fuzzy redirect criteria?
## 301 Redirects
# 301 Redirect 1 - works for this explicit URL, but need a partial result
RewriteCond %{QUERY_STRING} ^$
RewriteRule ^new\.html$ http://www.monarchycatering.com/shop.php?menuid=2738421&menuref=Ja&menutitle=New+Products&limit=5&page=24 [R=301,NE,NC,L]
Any help gratefully taken, thank you in advance
Mark.
Sorry for the delay, but StackOverflow doesn't seem to have a way to flag answers that have been replied to and need my attention.
OK, if I understand you correctly, you have an incoming "reduced" semi-SEF URL out in the real world (produced by your links), such as
http://www.domain.com/new.html&limit=5&page=24
("real" SEF would be something like http://www.domain.com/new/limit/5/page/24.html)
and you need to use .htaccess to map it to real files and more Query String information:
http://www.domain.com/shop.php?menuid=2738421&menuref=Ja&menutitle=New+Products&limit=5&page=24
You want to detect new.html for example, and replace it by a fixed string shop.php?menuid=2738421&menuref=Ja&menutitle=New+Products&, using the [QSA] flag to append the existing Query String on to the end?
RewriteEngine On
RewriteRule ^new\.html /shop.php?menuid=2738421&menuref=Ja&menutitle=New+Products [QSA]
RewriteRule ^sale\.html /shop.php?menuid=32424&menuref=Ja&menutitle=Products+On+Sale [QSA]
...etc...
I believe that a & will be stuck on the end of the rewritten address if the user supplied a non-empty Query String, but be sure to test it both ways.
P.S. It probably would have been cleaner to use "comment" to reply to my question, rather than adding another answer.
It's not clear to me what your starting point is and where you're trying to end up. Do you have "pretty" URLs that you want to convert into "non-pretty" Query Strings that your scripts can actually digest?
The reason for the redirection is that search-engines have indexed
these pages and so I need to prettify the URLs.
If the search engines have already indexed the Query String non-pretty version, they'll have to now re-index with pretty URLs. Ditto for all your customers' bookmarks.
If you want to generate "pretty" links within your site, and decode them (in .htaccess) upon re-entry to non-pretty Query Strings, that should work. Your customers' existing bookmarks should even continue to work, while the search engines will replace the non-pretty with the pretty URLs.
and thanks for the interest in my question...
I have rewritten parts of my website and Google still has references to the old MenuID parameter and shop.php configuration, but now I rewriten the Query to a prettier format, e.g.
http://www.domain.com/shop.php?menuid=2738421&menuref=Ja&menutitle=New+Products&limit=5&page=24
is now
http://www.domain.com/new.html&limit=5&page=24
The pages represent product categories, and so needed to be displayed in a more meaningful manner. Customer bookmarking is not an issue, as long as I can redirect the pages.
I hope that makes sense, best wishes,
Mark.

IIS URL Rewrite - need help rewriting friendly urls for a forum

I'm new to using the URL Rewrite module and I'm having trouble with what I thought would be a simple URL rewrite for forum threads (using IIS 7.5)
I need to rewrite:
/forum/100/2534/friendly-title
or:
/forum/100/2534/334/comment/friendly-thread-title
to:
/forum/?forum=100&thread=2534&post=334&postType=comment
The rule that I have written (not working) is:
^forum/([1-9][0-9][0-9]*)/([1-9]*)/(([1-9]*)/(post|comment)/)?([a-zA-Z0-9-]{5,50})$
Which maps to:
/forum/?forum={R:1}&thread={R:2}&post={R:4}&postType={R:5}
I'm getting a 404 error.
It's correct that {R:4} and {R:5} are empty when you use the first URL. That's because there are no values for these fields. The RegEx still matches though so the URL will still be rewritten. Your code should properly handle empty values for the post and postType querystring parameters to display the entire thread and not just a specific comment (at least that what I assume is suppose to happen).
By the way, a more logical URL structure would be:
/forum/100/2534/friendly-thread-title/comment/334
This won't help you this this particular problem though but just on a side note.

Is it possible with canonical URL for this pattern in htaccess: /a/*/id/uniqueid?

A big problem is that I am not a programmer….! So I need to solve this with means within my own competence… I would be very happy for help!
I have an issue with a lot of duplicated URLs in the Google index and there are strong signs that it is causing SEO problems.
I don’t have duplicate links on the site itself, but as it once was set-up, for certain pages the system allows all sorts of variations in the URL. As long as is it has a specific article-id, the same content will be presented under an infinite number of URLs.
I guess the duplicates in Google's index has been growing over long time and is due to links gone wrong from other sites that links to mine. The problem is that the system have accepted the variations.
Here are examples of variations that exists in the Google index:
site.com/a/Cow_Cat/id/5272
site.com/a/cow_cat/id/5272
site.com/a/cow…cat/id/5272
site.com/a/cowcat/id/5272
site.com/a/bird/id/5272
The first URL with mixed case is the one used site-wide and for now I have to live with it, it would take too long time to make a change to all lower case. I cannot make a manual effort via htaccess as it is a total of 300.000 articles. I believe there are 10 ‘s of thousands that have one or more duplicates.
My question is this:
Is it possible to create rules for canonical URLs in htaccess in order to make the above URLs to be handled as one as well as for the rest of the 300.000?
I e, is there a way to say that all URLs having
/a/*/id/uniqueid
should be seen as one = based only on the unique ID and not give any regard to the text expressed with the “*”?
My hope is that it would be possible to say that a certain pattern like above should only be differentiated by the last unique segment.
If it is not possible in htaccess, how would it be done with link rel="canonical" on each page, can the code include wildcards?
I should add that the majority of the duplicates are caused by incoming links being lower case where the site itself is using a mix. Would it be OK to assign a canonical URL only with lower case although the site itself is basically always using a mix of lower/upper case?
If this is possible, I would be very happy to be helped with how to do it!!!!
Jonas
Hi Michael! I am not an expert but this is how I think it could be done:
1) My problem is that the URLs have mixed cases and I cannot change that now.
2) If it is OK for the searchengines, it would be fine for me to make the canonical URL identical to the actual URLs with the difference that it was all lower case, that would solve approx 90% of the duplicates. I e this would be the used URL: site.com/a/Cow_Cat/id/5272 and this would be the canonical: site.com/a/cow_cat/id/5272. As I understand, that would be good SEO...or...?
My idea was NOT to change the address browser address bar (i e using 301 redirect) but rather just telling the search engines which URLs that are duplicates, as I understand, that can be done by defining a canonical URL either in htaccess (as a pattern - I hope) or as a tag on each page.
3) IF, it would be possible to find a wildcard solution...I am not sure if this is possible at all, but that would mean it was possible to NOT assign a specific canonical URL but rather a "group pattern", i e "Please search engine, see all URLs with this patter - having the unique identifier in the end - as if they are one and the same URL, you SE, decide which one you prefer": /a/*/id/uniqueid
Would that work? It will only work in htaccess if canonical URLs can be defined as a group where the group is defined as a pattern with a defined part as the unique id.
Is it possible when adding a tag for each page to say that "all URLs containing this unique id should be treated the same"? If that would work it would look something similar to this
link rel="canonical" /a/*/id/5272
I dont know if this syntax with wildcard exist but it would be nice : )
My advice would be to use 301 redirects, with URL rewriting. Ask your webmaster to place this in your apache config or virtual host config:
RewriteMap lc int:tolower
Then inside your .htaccess file you can use the map ${lc:$1} to convert matches to lower case. Here, the $1 part is a match (backreference from brackets in a regex in the RewriteRule) and the ${lc: } part is just how you apply the lc (lowercase) function set up earlier. Here is an example of what you might want in your .htaccess file:
RewriteCond %{REQUEST_URI} [A-Z] #this matches a url with any uppercase characters
RewriteRule (.*) /${lc:$1} [L,R=301] #this makes it lowercase
As for matching the IDs, presuming your examples mean "always end with the ID" you could use a regex like:
^(.+/)(\d+))$
The first match (brackets) gets everything up to and including the forward slash before the ID, and the second part grabs the ID. We can then use it to point to a single, specific URL (like canonical, but with a 301).
If you do just want to use canonical tags, then you'll have to say what you're using code wise, but an example I use (so as not add tags to hundreds of individual pages, for instance) in PHP would be:
if ($_SERVER["REDIRECT_URL"] != "") {
$canonicalUrl = $_SERVER["SERVER_NAME"] . $_SERVER["REDIRECT_URL"];
} else if ($_SERVER["REQUEST_URI"] != "") {
$canonicalUrl = $_SERVER["SERVER_NAME"] . preg_replace('/^([^?]+)\?.*$/', "$1", $_SERVER['REQUEST_URI']);
}
Here, the redirect URL is used if it's available, and if not the request uri is used. This code strips off the query string (this bold bit in http://www.mysite.com/a/blah/12345/?something=true). Of course you can add to this code to specify a custom path, not just taking off the query string, by playing with the regex.

CakePHP nice urls - how to prevent normal urls from working

I have a website that's written using CakePHP. I've added some rewrite rules in the .htacces file to change the default urls to different ones (instead of /controller1/action1/parameter I have /some-string-about-controller-and-action/parameter, for example).
The problem is that now both the normal url and the nice one are available, and google seems to be indexing both, which is a problem. I'd like to only keep the nice one, which is the proper way to handle this so that it affects the google results as little as possible?
I don't know why you don't want to use cakes own routing (if you are having trouble doing what you want, you can accomplish what you want with a custom route class), then make sure that you redirect all relevant URL's in your .htaccess file to the desired URL using a MOVED PERMANENTLY redirect.
This way google will index the target url instead of the one that is undesirable. You are right to take offense to this, double indexing is a great way to harm your SEO rankings.

mod_rewrite so a specific directory is removed/hidden from the URL, but still displays the content

I'd like to create a rewrite in .htaccess for my site so that when a user asks for URL A, the content comes from URL B, but the user still sees the URL as being URL A.
So, for example, let's say I have content at mydomain.com/projects/project-example. I want users to be able to ask for mydomain.com/project-example, still see that URL in their address bar, but the browser should display the content from mydomain.com/projects/project-example.
I've looked through several .htaccess rewrite tips and FAQs, but unfortunately none of them seemed to present a solution for exactly what I've described above. Not everything on my domain will be coming from the /projects/ directory, so I'd imagine the rewrite should check to see if the page exists first so it's not appending /projects/ to every url. I'm really stumped.
If a rewrite is not exactly what I need, or if there is a simple solution for this problem, I'd love to hear it.
This tutorial should have everything that you need, including addressing exactly what you are asking: http://httpd.apache.org/docs/2.0/misc/rewriteguide.html . It may just be a matter of terminology.
So, for example, let's say I have content at mydomain.com/projects/project-example. I want users to be able to ask for mydomain.com/project-example, still see that URL in their address bar, but the browser should display the content from mydomain.com/projects/project-example.
With something like:
RewriteRule ^project-example$ /projects/project-example [L]
When someone requests http://mydomain.com/project-example and the URI /project-example gets rewritten internally to /projects/project-example. Note that when this is in an .htaccess file, the URI /project-example gets the leading slash removed when matching.
If you have a directory of stuff, you can use regular expressions and back-references, for example you want any request for http://mydomain.com/stuff/ to map to /internal/stuff/:
RewriteRule ^stuff/(.*)$ /internal/stuff/$1 [L]
So requests for http://mydomain.com/stuff/file1.html, http://mydomain.com/stuff/image1.png, etc. get rewritten to /internal/stuff/file1.html, /internal/stuff/image1.png, etc.

Resources