remove part of url via mod_rewrite - .htaccess

Is there any way to hide part of a URL via mod_rewrite? I am currently using part of the URL, .htm, to split the page that is being requested from the query string.
Example
http://www.example.com/page/article/single.htm/articleid=8
This would let me know that the page requested is:
http://www.example.com/page/article/single
And the query string is:
articleid=8
Ideally I would like this to work the same way with a URL that does not show the .htm:
http://www.example.com/page/article/single/articleid=8
The number of variables in the query string varies, as does the number of levels before the .htm, so the rule would need to be dynamic.
Thanks

To also do multiple querystring parameters, how do you want it to look? I started with this, which keeps things simple; it gets trickier below.
http://www.example.com/page/article/single/articleid=8&anothervar=abc
Try this rule:
RewriteRule ^([^=]+)/(.+)$ $1.htm?$2 [NC,L]
This handles one or more querystring parameters, but does require at least one. This looks for anything without an = up to a slash, then everything else. Basically, it uses the = as the indicator of the path vs. the querystring portions; but actually splits it on the slash. (The NC is a habit of mine; not needed in this case, but when I leave it out I forget it when it's needed.)
To let querystrings be optional, so it could handle just
http://www.example.com/page/article/single
I found it easiest with two rules, instead of trying to mingle this into one rule:
RewriteRule ^([^=]+)$ $1.htm [NC,L]
RewriteRule ^([^=]+)/(.+)$ $1.htm?$2 [NC,L]
You can do something even prettier, using slashes for everything including multiple querystring parameters, like this:
http://www.example.com/page/article/single/articleid=8/anothervar=abc
It's a little hairy, but I think this works (couldn't let it go...)
Another rule handles replacing the slashes with ampersands, then doing the rewrite as above. This was easier to keep straight - maybe there's a way to do it all at once, but this was tricky enough for me:
RewriteRule ^([^=]+)$ $1.htm [NC,L]
RewriteRule ^([^=]+)/([^=]+=[^/]+)/([^=]+=.+)$ $1/$2&$3 [NC,N]
RewriteRule ^([^=]+)/(.+)$ $1.htm?$2 [NC,L]
The first rule is as above, handling no querystrings at all. That just gets it out of the way.
The second rule is a loop (the N, or "next", flag, which restarts rule processing from the top), which is what I tend to find in examples whenever you have an unknown number of replacements. In this case, it's replacing the first querystring-slash with an ampersand, and looping until there's only one slash left (leaving that for the question mark in the third rule).
On each pass it's looking for a pair like articleid=8/anothervar=abc, where at least two parameters are still separated by a slash. It replaces that slash with an ampersand, like articleid=8&anothervar=abc
In words, it's looking for (and capturing in parentheses):
(not-equalsign) slash (not-equalsign equalsign not-slash) slash (not-equalsign equalsign anything)
This lines up as:
(not-equalsign) /page/article/single
slash /
(not-equalsign equalsign not-slash) articleid = 8
slash /
(not-equalsign equalsign anything) anothervar = abc
Each pass replaces one slash with an ampersand, and after looping, the URL becomes the first draft above: http://www.example.com/page/article/single/articleid=8&anothervar=abc . The third rule handles this as described above.
A note: These also assume all your URLs will look like this, since they're going to tack on .htm to everything. If you still want to allow explicit /something/page.htm then these rules would need to not match when .htm is already there, or something like that. Or maybe an initial rule up front that looks for .htm and just stops rewriting there. Or maybe only do this for the /page paths.
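One way to sketch the "initial rule up front" idea, assuming the rules live in a .htaccess file in the document root and that every real page file ends in .htm (both are assumptions, not something the question states). A first rule matches any path that already ends in .htm and stops without changing it. That lets explicit /something/page.htm requests through, and it also keeps the later rules from firing again on a URL they have already rewritten.

```apache
RewriteEngine On

# Assumption: every real page file ends in .htm.
# A path without "=" that already ends in .htm is left untouched ("-")
# and rule processing stops here.
RewriteRule ^[^=]+\.htm$ - [L]

# No parameters: page/article/single -> page/article/single.htm
RewriteRule ^([^=]+)$ $1.htm [L]

# One or more parameters:
#   page/article/single/articleid=8&anothervar=abc
#   -> page/article/single.htm?articleid=8&anothervar=abc
RewriteRule ^([^=]+)/(.+)$ $1.htm?$2 [L]
```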

RewriteRule giving me issues with my regex

I'm trying to do a simple redirect where going to a url like www.example.com/foo will take me to www.example.com/quokka/inquiry/ask.php?user=foo.
For testing purposes I started with this:
RewriteRule ^(m.*)$ /quokka/inquiry/ask.php?user=$1
This works great for use cases where the foo starts with the letter: m, but I want it to be super customizable. So then I make this my redirect (note the removal of the letter m):
RewriteRule ^(.*)$ /quokka/inquiry/ask.php?user=$1
Why isn't the RewriteRule above working for every value of foo? Is there something wrong with my regex?
Any help would be greatly appreciated.
RewriteRule ^(.*)$ /quokka/inquiry/ask.php?user=$1
Depending on what other directives you have in your .htaccess file, this is possibly causing an internal rewrite loop, which is preventing the URL from ever resolving correctly (do you get a 500 Internal Server Error?). Or, at best, an invalid rewrite to /quokka/inquiry/ask.php?user=quokka/inquiry/ask.php.
Aside: Note that, as mentioned, this is an internal rewrite, not strictly a "redirect" as you stated in your question. The term "redirect" usually refers to an "external 3xx redirect". (Although admittedly the Apache docs also confuse these terms, but do at least qualify this as an "internal redirect".)
In the case of the above directive, the rewritten URL is also captured by the ^(.*)$ pattern (which captures anything), which results in a loop something like:
Request: www.example.com/foo
Rewritten to: /quokka/inquiry/ask.php?user=foo
Rewritten to: /quokka/inquiry/ask.php?user=quokka/inquiry/ask.php
Rewritten to: /quokka/inquiry/ask.php?user=quokka/inquiry/ask.php
:
URL-rewriting does not stop when it gets to the end of the .htaccess file. Processing loops until the URL passes through unchanged. (Although what is considered a "change" is not always entirely clear, as you can get loops simply by rewriting the URL, even when the rewritten URL is the same, as in step#4 above.)
The pattern ^(m.*)$ "works" because the rewritten URL does not start with an "m". But if you have any other URLs that start with an "m", then these will also be rewritten and become inaccessible.
You need to have a unique URL that only captures "user IDs" (in this case). For example, all URLs that reference "user IDs" could have a specific prefix, eg. example.com/u/<userid>.
RewriteRule ^u/(.*)$ /quokka/inquiry/ask.php?user=$1
Or perhaps are of a maximum length that does not conflict with any other URL (eg. between 3 and 8 chars):
RewriteRule ^(.{3,8})$ /quokka/inquiry/ask.php?user=$1
Also, if you are as restrictive as possible on the format of the user ID then this might also be sufficient, e.g. only lowercase letters:
RewriteRule ^([a-z]+)$ /quokka/inquiry/ask.php?user=$1
However, using a prefix and restriction (regex should always be as restrictive as possible) would be my preference, as it avoids potential conflicts in the future. For example:
RewriteRule ^u/([a-z]{3,8})$ /quokka/inquiry/ask.php?user=$1 [L]
Also, include the L flag to ensure that no other directives that immediately follow are processed.
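If a prefix is not an option and a catch-all pattern has to stay, the loop described earlier can also be broken with conditions. A minimal sketch, assuming the .htaccess file sits in the document root and that user IDs never collide with the names of real files or directories (both are assumptions):

```apache
RewriteEngine On

# Skip anything that maps to an existing file or directory. That includes
# /quokka/inquiry/ask.php itself, so the rewritten URL is not rewritten again.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# (.+) rather than (.*) so that a request for the site root is left alone.
RewriteRule ^(.+)$ /quokka/inquiry/ask.php?user=$1 [L]
```

On the first pass /foo is not a real file, so it is rewritten. On the second pass the request already points at ask.php, the first condition fails, and the URL passes through unchanged.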

.htaccess, virtual directories, and semi-complex URLs

I'm basically just trying to have a master syntax for predictable URLs. Simple URL is no problem
RewriteEngine on
# RewriteRule ^friendlyUrl/content/?$ /index.php?app=main&module=content
Which to my understanding looks for the url structure and allows 1 or 0 trailing "/"'s
But some parts of the website have a /urlPrefix/ to access, eg. mysite.com/membersArea/
and /membersArea/ will be a part of every query there. I'm having trouble accommodating trailing ?s and &s in URLs like these.
RewriteRule ^secureUrl/\?(.*)$ /index.php?app=admin&$1
This is my attempt to handle everything from mysite.com/secureUrl/ to mysite.com/secureUrl/?var1=foo&var2=bar and after many server errors and a search, I find myself here.
This is the most complex line I have and between you and me, I couldn't tell you exactly what's happening other than it looks for /friendlyUrl/10DIGITKEY/(possible task)/?possiblevars=foo&var2=bar
RewriteRule ^friendlyUrl/([a-zA-Z0-9]{10})/?([a-z]*)/?\??(.*)$ /index.php?app=main&module=web&id=$1&$2&$3
Htaccess has always been my weakest subject, and as a webmaster I pay the price constantly, any help would be appreciated.
I need to pass the same request to the PHP file (plus ANY query, with or without ? or &) whether it's just /friendlyUrl/ or /friendlyUrl/?var=1, /friendlyUrl/&var=1, /friendlyUrl/var=1
You want the query string of the request URI to be kept as is and included in the rewritten URL once the rewrite is done.
For this purpose, you use the QSA flag in your RewriteRule directive. So, to rewrite /friendlyUrl/10DIGITKEY/(possible task)/?possiblevars=foo&var2=bar, you'd have:
RewriteRule ^friendlyUrl/([a-zA-Z\d]{10})/([^/]*)/?$ /index.php?app=main&module=web&id=$1&task=$2 [QSA]
Notice the QSA flag at the end. Also, keep in mind that I'm passing the second match (the possible task of your URL) as another variable (named task). This variable will be empty if nothing was found.
QSA|qsappend
When the replacement URI contains a query string, the default behavior
of RewriteRule is to discard the existing query string, and replace it
with the newly generated one. Using the [QSA] flag causes the
query strings to be combined.
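The same flag covers the /secureUrl/ case from the question. A RewriteRule pattern is matched against the URL path only, never the query string, so a \? in the pattern cannot match it. A minimal sketch, assuming the .htaccess file is in the document root:

```apache
RewriteEngine On

# Matches /secureUrl and /secureUrl/ whatever the query string is.
# QSA appends the original query string to the new one, so
#   /secureUrl/?var1=foo&var2=bar -> /index.php?app=admin&var1=foo&var2=bar
RewriteRule ^secureUrl/?$ /index.php?app=admin [QSA,L]
```

Forms such as /secureUrl/var=1 carry the variables in the path rather than in the query string, so this rule does not cover them.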

How to redirect only when there is something after .html?

I have found that there are some people with bad syntax links to our articles.
For example, we have an article with URL
http://www.oursite.com/demo/article-179.html
The issue is that a lot of people have linked back to this article with bad syntax such as
http://www.oursite.com/demo/article-179.html%5Cohttp:/www.oursite.com/demo/glossary.php
Now, I added the following ReWrite Rule in the .htaccess file to take care of such links.
RewriteRule article-179\.html(.*)$ "http\:\/\/www\.oursite\.com\/demo\/article-179\.html [301,L]
But this has resulted in a Redirect Loop message. How can we fix this issue via an htaccess rewrite rule? Basically, we need something in our rewrite rule that works only when there is one or more characters after the .html. If not, then it should not redirect.
Any help would be highly appreciated!
With best regards!
Use + instead of *. * matches zero or more, which causes the pattern to match for the redirected path too, + instead matches one or more.
Also you should make the pattern as precise as possible, ie don't just check whether it ends with article-179.html, better check for the full path. And if this all happens on the same domain, then there's no need to use the absolute URL for the redirect.
There's also no need for escaping the substitution parameter like you did, it's treated as a simple string except for:
back-references ($N) to the RewriteRule pattern
back-references (%N) to the last matched RewriteCond pattern
server-variables as in rule condition test-strings (%{VARNAME})
mapping-function calls (${mapname:key|default})
http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html#rewriterule
Long story short, theoretically this should do it:
RewriteRule ^demo/article-179\.html(.+)$ /demo/article-179.html [R=301,L]
or this if you really need the absolute URL:
RewriteRule ^demo/article-179\.html(.+)$ http://www.oursite.com/demo/article-179.html [R=301,L]
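If other articles attract the same kind of malformed links, the rule can be generalized. A sketch, assuming every article lives under /demo/ and follows the article-<number>.html naming seen in the question (the generalization is an assumption, not something the question confirms):

```apache
RewriteEngine On

# Anything after ".html" triggers the redirect. ".+" requires at least one
# extra character, so the clean URL itself never matches and cannot loop.
RewriteRule ^demo/(article-[0-9]+\.html).+$ /demo/$1 [R=301,L]
```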

Rewrite rule for seo - title in url

Let's say I want users to be able to type this URL in:
www.website.com/blog/2453/I-gained-0.1%-more-scripting-knowledge-!
I'm trying to include title information in the url for seo benefits.
I also want to include an id for my query. Effectively I want to pick up the id and ignore the title stuff that comes after, bearing in mind it's user-generated text so it could contain any special characters.
How can I write a .htaccess rewrite rule so that the server reads it as the following with the appropriate GET data:
www.website.com/blog.php?id=2453
This is what I have tried but frankly I am way out of my depth here:
RewriteRule ^blog/([A-Za-z0-9-]+)/([A-Za-z0-9-]+)/?$ blog.php?id=$1 [NC,L]
The rewrite rule you are using should work except for the ., %, and ! characters that are in your URL. The % character is not safe to use in URLs because it has a special meaning in the URL syntax (it introduces a percent-encoded character). I wouldn't use exclamation points either.
If the ID is always going to be numeric, use ([0-9]+) instead of ([A-Za-z0-9-]+).
Try this URL:
www.website.com/blog/2453/I-gained-0.1-more-scripting-knowledge
With this rule:
RewriteRule ^blog/([0-9]+)/[A-Za-z0-9\-\.]+/?$ blog.php?id=$1 [NC,L]
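Since the title is only decoration, another option is to not validate it at all. A sketch, assuming the ID is always numeric and blog.php sits in the same directory as the .htaccess file:

```apache
RewriteEngine On

# Capture the numeric ID and ignore whatever follows the next slash.
#   blog/2453                  -> blog.php?id=2453
#   blog/2453/any-title-at-all -> blog.php?id=2453
RewriteRule ^blog/([0-9]+)(?:/.*)?$ blog.php?id=$1 [L]
```

This does not make a raw % safe: a % that is not part of a valid percent-encoding is likely to be rejected by the server before the rule ever runs, so titles should still be encoded or cleaned when the links are generated.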

mod_rewrite Redirect Rule Variables question

I'm a bit of an .htaccess n00b, and can't for the life of me get a handle of regular expressions.
I have the following piece of RewriteRule code that works just fine:
RewriteRule ^logo/?$ /pages/logo.html
Basically, it takes /pages/logo.html and makes it /logo.
Is there a way for me to generalize that code with variables, so that it works automatically without having to have an independent line for each page?
I know $1 can work as a variable, but that's usually for queries, and I can't get it to work in this instance.
First you need to know that mod_rewrite can only handle requests to the server. So you would need to request /logo to have it rewritten to /pages/logo.html. And that’s what the rule does, it rewrites requests with the URL path /logo internally to /pages/logo.html and not vice versa.
If you now want to use portions of the matched string, you need to group them with parentheses, (expr), and can then reference them with $n. In your case the pattern [^/]+, which matches one or more characters other than the slash /, will be suitable:
RewriteRule ^([^/]+)$ /pages/$1.html
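To keep a catch-all rule like this from swallowing unrelated URLs, it can be limited to names that have a matching file. A sketch, assuming the pages live in a pages directory directly under the document root:

```apache
RewriteEngine On

# $1 is the name captured by the RewriteRule pattern below; the condition
# passes only if /pages/<name>.html exists on disk.
RewriteCond %{DOCUMENT_ROOT}/pages/$1.html -f
# The optional trailing slash keeps the behaviour of the original ^logo/?$ rule.
RewriteRule ^([^/]+)/?$ /pages/$1.html [L]
```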
Try this:
RewriteRule ^/pages/(.*)\.html$ /$1
The (.*) matches anything between pages/ and .html. Whatever it matches is used in $1. So, /pages/logo.html becomes /logo, and /pages/subdir/other_page.html would become /subdir/other_page
