htaccess RewriteRule - pattern not matching - .htaccess

I'm trying to use categories in a clean way in my urls like this:
website.com/category
In the URL the categories are written like this (some random examples):
Animals
Consumer-Electronics
Books-&-Comics
External-Hard-Discs
Form,-Beauty-&-Health
Black-&-White-TV
The-Adventures-Of-Tintin
Fryers,-Waffle-makers-&-Cooking
etc...
As you can see, each category is an arbitrary combination of words (each starting with an upper-case letter) and the characters "-", ",", and "&". There are more combinations than the examples shown.
With rewrite I'm trying to get the categories in a variable like this:
RewriteRule ^([\w-&]+)$ /categories.php?mcn=$1 [L,NC]
This is not working. If I read out the variable I wanted with "Books-&-Comics" in categories.php, I only get "Books-" while it should be "Books-&-Comics".
When I add a "," in the character class like this:
RewriteRule ^([\w,-&]+)$ /categories.php?mcn=$1 [L,NC]
I get an internal server error.
What should my RewriteRule look like to match the category examples and get them correctly into the variable?

For your first problem, the issue is that the captured value is inserted into the substitution unescaped, so the & in it starts a new URL parameter and mcn ends at "Books-". You can fix this by adding the B flag to your rule, which escapes non-alphanumeric characters in backreferences (& becomes %26).
Your second issue is that the pattern ^([\w,-&]+)$ is invalid. It tries to match any word character, or any character in the range from , to & (ASCII 44 to 38). Because the range is out of order, the regex fails to compile, which causes the internal server error. As you want to match the - character literally rather than use it as a range indicator, it should be escaped.
With these changes made your rule is:
RewriteRule ^([\w,\-&]+)$ /categories.php?mcn=$1 [L,NC,B]
A regex helper like regex101 can be a huge help in creating your rules.
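As a variation on the rule above, the hyphen can also be placed last in the character class, where it is taken literally and needs no backslash. This is only a sketch: it assumes the same categories.php target and mcn parameter as the question, and it drops the NC flag on the assumption that it is redundant, since \w already matches both upper- and lower-case letters.

```apache
RewriteEngine On

# A hyphen at the end of a character class is literal, so no escape is needed.
# B escapes non-alphanumeric characters in $1, so "&" reaches the script as %26
# instead of starting a new query parameter.
RewriteRule ^([\w,&-]+)$ /categories.php?mcn=$1 [L,B]
```

The pattern does not match categories.php itself (the class contains no "."), so the rewritten request should not be picked up by the same rule again.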

Related

How to efficiently match a .htaccess RewriteRule for a complete word only if it's the last part of the URL

I'm unsure how to word this request, so please bear with me as I explain with an example. I'll try to make it clear.
I wish to redirect a URL if it ends with one of two words, let's say foo or bar. It must match only as a complete word, so food or new-foo shouldn't match. The URL might end with a slash, so /foo and /foo/ are both valid.
Also, the word might be by itself at the beginning of the URL or at the end of a longer path.
Thus, any of the following should match, with or without a trailing slash:
https://example.com/foo
https://example.com/new/foo
https://example.com/bar
https://example.com/some/other/bar
But, none of the following should match (with or without a trailing slash):
https://example.com/foo-new
https://example.com/old-bar
https://example.com/bar/thud
https://example.com/plugh/foo/xyzzy
Clarification: It's OK if the word is repeated, e.g. the following should still redirect, because foo is at the end of the URL:
https://example.com/foo/new/foo
The best that I've managed to come up with is to use two redirects, the first checking for the word on its own, and the second checking for the word being the last part of a path:
RewriteRule ^(foo|bar)/?$ https://redirect.com/$1/ [last,redirect=permanent]
RewriteRule /(foo|bar)/?$ https://redirect.com/$1/ [last,redirect=permanent]
There will, eventually, be several words, not just the two…
RewriteRule ^(foo|bar|baz|qux|quux|corge|grault|garply)/?$ https://redirect.com/$1/ [last,redirect=permanent]
RewriteRule /(foo|bar|baz|qux|quux|corge|grault|garply)/?$ https://redirect.com/$1/ [last,redirect=permanent]
… so using two RewriteRule statements seems error-prone and possibly inefficient. Is there a way to combine the two RewriteRule statements into one? Or, maybe, you have a better idea? (I toyed with FilesMatch, but I was confused as to how to go about it.)
Thank you
This probably is what you are looking for:
RewriteEngine on
RewriteRule (?:^|/)(foo|bar)/?$ https://example.com/$1/ [L,R=301]
(?:^|/) is a "non-capturing group", so $1 still refers to what is captured by (foo|bar), while the whole expression matches a requested URL that consists of only one of those words or has one of those words as the final folder in a path sequence.
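Applied to the longer word list from the question, the same single-rule approach might look like the sketch below. The redirect.com target and the word list are taken from the question; the rule is assumed to live in a .htaccess file in the document root, where the leading slash is stripped from the path before matching (which is why the ^ alternative is needed).

```apache
RewriteEngine On

# (?:^|/) anchors the word at the start of the path or directly after a slash.
# /?$ allows an optional trailing slash at the very end of the path.
RewriteRule (?:^|/)(foo|bar|baz|qux|quux|corge|grault|garply)/?$ https://redirect.com/$1/ [L,R=301]
```

Because the word must be preceded by the start of the path or a slash and followed only by an optional slash, paths such as foo-new, old-bar and bar/thud are not matched.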

Is it possible to put a line break into a 'mailto:' rewrite in htaccess?

For complex reasons I've had to remove an enquiry form from a web site and use a 'mailto:' instead. For simplicity I've changed the htaccess file so that the former 'contact' link to the form now becomes a 'mailto:' as follows:
RewriteRule ^contact$ mailto:myname#mydomain.com?subject=BusinessName\ BandB\ Enquiry&body=You\ can\ find\ our\ availability\ on\ line.\ Delete\ this\ content\ if\ inapplicable
That does work, my local e-mail client (Thunderbird) opens with the information correctly shown in subject and body. (My TB is set to compose in plain text, I've yet to test with HTML)
I would like to introduce a new line in the body so that 'Delete this content if inapplicable' is on a separate line. Is there any way to do this? Given mod_rewrite's intended purpose I could understand if there isn't but I thought I'd ask before giving up.
I would like to introduce a new line in the body so that 'Delete this content if inapplicable' is on a separate line.
New lines in the body are represented by two characters: carriage return (char 13) + line feed (char 10) (see RFC 2368). This would need to be URL-encoded in the resulting URL as %0D%0A.
When used in the RewriteRule substitution string, the literal % characters need to be backslash-escaped to negate their special meaning as a backreference to the preceding CondPattern (of which there is none here), i.e. \%0D\%0A. Otherwise, you will end up with the string DA, because there is no %0 backreference in this example, so each %0 is replaced with an empty string.
You can also avoid having to backslash-escape all the literal spaces by enclosing the entire argument (substitution string) in double quotes.
So, try the following instead:
RewriteRule ^contact$ "mailto:myname#mydomain.com?subject=BusinessName BandB Enquiry&body=You can find our availability on line.\%0D\%0ADelete this content if inapplicable" [R,L]

htaccess rewrite query string vs back reference?

I'm trying to rewrite this:
http://www.domain.com/johns-wishlist-12
to this:
index.php?route=wishlist/shared_wishlist&id=12&name=johns
I've read some good tutorials, but none of them really explain how back references work (when using more than one)... I also don't understand when to use {QUERY_STRING}, as opposed to just back references?
Could use a little help... this is what I have for the above:
RewriteRule ^([a-z0-9]*)-wishlist-([0-9]*)/?$ index.php?route=wishlist/shared_wishlist&id=$1&name=$2 [L,QSA]
Obviously "johns" and "12" will change based on the user...
so should I be using a rewrite condition {QUERY_STRING} in this case? why?
The %{QUERY_STRING} variable is used to match against the request's query string. In your case, the request is for http://www.example.com/johns-wishlist-12, so there is no query string there. You are rewriting to a URI with a query string, though, so it only matters the next time around when the rules loop (which may not happen): if you had another rule that matched against the %{QUERY_STRING} variable, the query string that you created would show up there.
The $N references (such as $1 and $2) in your rule's target are backreferences to a "grouped" match in the rule's pattern. Whenever you have a () in your pattern, that groups the match, which can then be backreferenced using $ followed by the group's number. In the case of a condition (RewriteCond), the backreferences use % instead (%1, %2, and so on).
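To make the two kinds of backreference concrete, here is a minimal sketch. The first rule is the question's rule with the groups numbered left to right, so $1 is the name and $2 is the id (the rule as posted appears to have them swapped), and with + instead of * so that empty values do not match. The second rule is purely hypothetical: the wishlist.php URL and its owner and list parameters are invented here only to show %N referring to a RewriteCond capture.

```apache
RewriteEngine On

# $1 and $2 refer to the first and second (...) groups of the RewriteRule pattern.
# For /johns-wishlist-12: $1 = "johns", $2 = "12".
RewriteRule ^([a-z0-9]+)-wishlist-([0-9]+)/?$ index.php?route=wishlist/shared_wishlist&id=$2&name=$1 [L,QSA]

# %1 and %2 refer to the (...) groups of the preceding RewriteCond pattern.
# Hypothetical legacy URL: /wishlist.php?owner=johns&list=12
RewriteCond %{QUERY_STRING} ^owner=([a-z0-9]+)&list=([0-9]+)$
RewriteRule ^wishlist\.php$ /%1-wishlist-%2? [R=302,L]
```

The trailing ? in the second substitution drops the original query string from the redirect target. %{QUERY_STRING} is only needed in cases like this one, where the values to capture arrive in the query string rather than in the path.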

mod_rewrite RewriteRule backreference use in pattern

Is it possible to use a backreference in the middle of the RewriteRule pattern?
I envision this as such:
RewriteRule ^(.)([a-z]*)$1([a-z]*) $2-$3
                        ^
(note, this fellow, here, refers to that first "(.)")
If the received url is XboopXdoop, the result would of course be boop-doop.
I am attempting to use this to specify a delimiter at the beginning of the incoming url that can be used to parse the rest of the string, without forcing the use of a specific character as that delimiter.
Thank you.
$1 works on the right side (rewrite), but not in the regex. You need to use \1.
Try:
RewriteRule ^(.)([a-z]+)\1([a-z]+) $2-$3
I ran into a bizarre edge case with the * where it split based on the second character of the string, and not the first. XtestingXtest resulted in es-ing, so I'm not sure what was happening there. If I use a + it works fine.
Also, since * and + are greedy, if you have multiple delimiter characters, it will split on the last occurrence of the character:
XbaseXtest -> base-test
XbaseXteXst -> baseXte-st
XbaseXtestX -> baseXtest-
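If splitting on the first repeat of the delimiter is preferred, a lazy quantifier on the first group is one option. This is a sketch rather than a tested configuration: it assumes a lowercase delimiter, since the greedy behaviour described above only arises when the [a-z] class can itself match the delimiter, and it adds a $ anchor that the original rule did not have.

```apache
RewriteEngine On

# [a-z]+? is lazy: it stops at the first later occurrence of the delimiter
# captured by (.), rather than the last one.
# xbasextexst -> base-texst (the greedy form gives basexte-st)
RewriteRule ^(.)([a-z]+?)\1([a-z]+)$ $2-$3 [L]
```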

Need a mod_rewrite .htaccess solution to replace %20 spaces with -'s in the finished URL

I need an .htaccess mod_rewrite solution that will take a .cgi search query like this:
www.mydomain.com/cgi-bin/finda/therapist.cgi?Therapy_Type=Pilates Training&City=Los Angeles&State=CA
and return matching results in the browser's address bar to look like this:
www.mydomain.com/therapists/Pilates-Training-Los-Angeles-CA.html
or better yet:
www.mydomain.com/therapists/pilates-training-los-angeles-ca.html
Notice the database includes values with one, two or three words + spaces...
For example:
Therapy_Type=Pilates Training <- includes a space
City=Los Angeles <- includes a space
State=CA <- no space
I used the tool at: http://www.generateit.net/mod-rewrite/ to generate the following RewriteRule:
RewriteEngine On
RewriteRule ^([^-]*)-([^-]*)-([^-]*)\.html$ /cgi-bin/finda/therapist.cgi?Therapy_Types=$1&City=$2&State=$3 [L]
This does work (finds the search matches) and generates the results page, but because the parameter values have spaces in them, we end up with a URL that looks like this:
www.mydomain.com/therapists/Pilates%20Training-Los%20Angeles-CA.html
I've spent days in this forum and others trying to find a solution to get rid of these %20 (encoded spaces) so the final returned URL will look like either of the two forms above.
I know someone on here must know how to do this... Help ;-)
If you replace the %20 with -, then how would you know where the therapy type ends and the city starts?
pilates-training-los-angeles-ca
would be
type=pilates
city=training
state=los
So I don't think you want to replace the %20 with -. You could, however, replace it with another character, like _:
pilates_training-los_angeles-ca
You then would have to translate every _ to a space within your PHP script (or whatever language you are using server side).
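With underscores standing in for spaces, the rewrite side could be sketched as follows. The therapists/ prefix comes from the desired URL in the question and the Therapy_Type parameter name from the question's query URL (the generated rule used Therapy_Types); the rule is assumed to sit in a .htaccess file in the document root.

```apache
RewriteEngine On

# Hypothetical URL form: /therapists/pilates_training-los_angeles-ca.html
# Hyphens separate the three fields; underscores stand in for spaces inside a field.
RewriteRule ^therapists/([^-]+)-([^-]+)-([^-]+)\.html$ /cgi-bin/finda/therapist.cgi?Therapy_Type=$1&City=$2&State=$3 [L]
```

The script still receives pilates_training and los_angeles, so the translation from _ back to a space has to happen on the server side as described above.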

Resources