htaccess: How to replace a string containing a special character?

I would like to redirect the URL :
https://www.example.com/blablabla,?trx_id=TX-23
to
https://www.example.com/blablabla,TX-23
Basically, I would like to remove the string : ?trx_id=
I tried the following, but it's not working. It seems to be related to special characters:
RewriteRule ^(.+)?trx_id=(.+)$ $1$2 [R=301,L]
Can anyone help please ?
Thanks

It does not work because when Apache parses the request, it splits it into several parts: host, port, path and query string. The RewriteRule directive matches only against the URL path, which is "blablabla," in your example, while the parameters go into the query string, trx_id=TX-23 (with the question mark removed).
To match against the query string, you have to use a condition, like this:
RewriteCond %{QUERY_STRING} ^trx_id=(.+)$
RewriteRule ^(.+) /$1%1 [R=301,QSD,L]
The condition only affects the rule that follows it. Backreferences of the form $N refer to groups in the rule pattern, and %N to groups in the condition pattern.
Also note the QSD flag, which discards the original query string; without it, the query string would be kept in the rewritten request. QSD requires Apache 2.4; on Apache 2.2, end the substitution with a ? instead.
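To make the split concrete, here is a minimal Python sketch (a model of the behaviour, not Apache itself) built on urllib.parse.urlsplit and re. It assumes a per-directory (.htaccess) context, where the rule pattern sees the path without its leading slash; the variable names are made up for illustration.

```python
import re
from urllib.parse import urlsplit

parts = urlsplit("https://www.example.com/blablabla,?trx_id=TX-23")
path = parts.path.lstrip("/")  # what the RewriteRule pattern is matched against
query = parts.query            # what %{QUERY_STRING} holds (no leading "?")

# The pattern from the question is tested against the path only, so it fails.
original_rule_matches = re.search(r"^(.+)?trx_id=(.+)$", path) is not None

# The condition captures %1 from the query string, the rule captures $1 from the path.
cond = re.search(r"^trx_id=(.+)$", query)
rule = re.search(r"^(.+)", path)

target = None
if cond and rule:
    target = "/" + rule.group(1) + cond.group(1)  # models the substitution /$1%1
```

The sketch stops at the substitution; dropping the original query string (the job of QSD) is not modelled.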

Related

htaccess RewriteRule with literal question marks (not query string)

I need to be able to match question marks because there was a translated text encoding mistake, and part of the URL ended up hardcoded with question marks in them. Here's a URL example that I need to rewrite:
https://example.com/Documentation/Product????/index.html
Here is my current rewrite rule. It works when the characters following "Product" are not question marks, but when they are, the rule doesn't apply.
RewriteRule "^Documentation/Product[^/]+/(.*)$" "https://s3.amazonaws.com/company-documentation/Help/Product/$1" [L,NC]
How would I make sure that question marks are considered to be characters too in this rule? I can't expect that only question marks and not the original non-English characters will be in the URL, so I want the rule above to match both question marks and any other character.
I found this topic which seems relevant, but the flags don't help, and the answer doesn't explain how to overcome the problem mentioned in the "Aside".
https://webmasters.stackexchange.com/questions/107259/url-path-with-encoded-question-mark-results-in-incorrect-redirect-when-copied-to
https://example.com/Documentation/Product????/index.html
You say it's "not a query string", but actually that is exactly what it is. And that is why you can't match it with the RewriteRule pattern. The above URL is split as follows:
URL-path: /Documentation/Product (matched by the RewriteRule pattern)
Query string: ???/index.html (note 3 ? - the first one starts the query string)
To match the query string you'll need an additional RewriteCond directive that checks against the QUERY_STRING server variable.
For example, to match the above URL, you would need to do something like:
RewriteCond %{QUERY_STRING} ^\?*/index\.html
RewriteRule ^Documentation/Product$ https://s3.amazonaws.com/company-documentation/Help/Product/index.html [NC,R,L]
This matches any number of erroneous ? at the start of the query string.
I've added the R (redirect) flag. Your directive (without the R flag) would trigger an external redirect anyway (because you are specifying an absolute URL in the substitution), but it is far better to be explicit here. This is also a temporary (302) redirect. If it should be permanent (301), change it to R=301, but only once you have confirmed that it works (301s are cached hard by the browser, which can make testing problematic).
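The split described above can be checked with a short Python sketch. It uses urllib.parse.urlsplit as a stand-in for Apache's request parsing (an assumption made for illustration) and then tests the condition pattern against the resulting query string.

```python
import re
from urllib.parse import urlsplit

parts = urlsplit("https://example.com/Documentation/Product????/index.html")
url_path = parts.path       # everything before the first "?"
query_string = parts.query  # everything after the first "?"

# Same pattern as the RewriteCond: any number of stray "?" and then /index.html
cond_matches = re.search(r"^\?*/index\.html", query_string) is not None
```

Only the first of the four question marks acts as the separator, which is why three of them remain in the query string.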
UPDATE:
...so I want the rule above to match both question marks and any other character.
Only if there are question marks in the URL will there be a query string, so I think it is advisable to keep these two rules separate.
If there could be any erroneous characters at the start of the query string and if you want to capture the end part of the URL (like you are doing in your original directive, eg. index.html) then you can modify the above to read:
RewriteCond %{QUERY_STRING} /(.*)$
RewriteRule ^Documentation/Product$ https://s3.amazonaws.com/company-documentation/Help/Product/%1 [NC,R,L]
Note the %1 (as opposed to $1) backreference in the substitution string. This is a backreference to the captured group in the last matched CondPattern (ie. /(.*)$).
You can follow this with your existing directive (but remember to include the R flag) for more "normal" URLs that don't contain a ? (ie. query string).
NB: Surrounding the arguments with double quotes is entirely optional in this example. Quotes are only required if you have unescaped spaces in the pattern or substitution arguments.
In summary
# Redirect URLs of the form:
# "/Documentation/Product?<anything#1>/<anything#2>"
RewriteCond %{QUERY_STRING} /(.*)$
RewriteRule ^Documentation/Product$ https://s3.amazonaws.com/company-documentation/Help/Product/%1 [NC,R,L]
# Redirect URL-paths of the form (no query string):
# "/Documentation/Product<something>/<anything>"
RewriteRule ^Documentation/Product[^/]+/(.*) https://s3.amazonaws.com/company-documentation/Help/Product/$1 [NC,R,L]
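As a rough check of how the two rules in the summary divide the work, the following Python sketch models only the pattern matching and the backreference substitution. The function name redirect_target is hypothetical, and the sketch assumes a per-directory context (no leading slash on the path). It does not model what Apache does with the original query string after the redirect target is built.

```python
import re
from urllib.parse import urlsplit

BASE = "https://s3.amazonaws.com/company-documentation/Help/Product/"

def redirect_target(url):
    """Return the redirect target chosen by the two summary rules, or None."""
    parts = urlsplit(url)
    path = parts.path.lstrip("/")

    # First rule: the path stops at "Product"; the remainder is in the query string.
    cond = re.search(r"/(.*)$", parts.query)
    if cond and re.search(r"^Documentation/Product$", path, re.IGNORECASE):
        return BASE + cond.group(1)  # %1 from the condition

    # Second rule: no "?" in the URL, so the whole thing is the path.
    rule = re.search(r"^Documentation/Product[^/]+/(.*)", path, re.IGNORECASE)
    if rule:
        return BASE + rule.group(1)  # $1 from the rule pattern

    return None
```

For example, redirect_target("https://example.com/Documentation/Product????/index.html") takes the first branch, while a URL such as https://example.com/Documentation/ProductXY/guide/page.html (a made-up path) takes the second.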

mod_rewrite pass variable and

What I would need to do is to pass 1 variable as variable, and the rest as a rest of URL intact, so I can get them by $_GET in php later. The below does not work:
RewriteRule ^store/([a-zA-Z0-9\_\-]+).html?(.*)/?$ store.php?var1=$1&$2 [L]
Possible Links could be:
store/products.html
store/products.html?sort=asc&price=down
store/products.html?price=down&here_we_can_have_a_lot_of_different_params_in_whatever_order
Basically, just take this $var1 and the rest forward to URL? How can I do that?
P.S. I think I found a solution:
RewriteRule ^store/([a-zA-Z0-9\_\-]+).html?(.*)/?$ store.php?var1=$1&%{QUERY_STRING} [L]
The problem is your premise about how a RewriteRule works. A RewriteRule only matches against the URI path (in Apache terms, the request URI) and not against the query string. This means the (.*)/?$ at the end of your regex only works because it can reduce to just $ (the end of the string), and the ? before it merely makes the preceding l optional, so the regexp matches .htm as well as .html.
The simpler version of your rule is as follows:
RewriteRule ^store/([a-zA-Z0-9\_\-]+).html store.php?var1=$1 [L,QSA]
The QSA stands for Query String Append, which simply adds back any existing query string to the rewritten URL.
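A small Python sketch of the effect, under the assumption that QSA places the original query string after the new one, joined with an ampersand. The function name rewrite_store is hypothetical, and the dot before html is escaped here, which is slightly stricter than the rule above.

```python
import re

def rewrite_store(path, query):
    """Model the rule plus QSA: new query first, then the original one."""
    m = re.search(r"^store/([a-zA-Z0-9_-]+)\.html", path)
    if m is None:
        return None  # the rule does not apply
    target = "store.php?var1=" + m.group(1)
    if query:
        target += "&" + query  # the part that QSA adds back
    return target
```

With the links from the question, rewrite_store("store/products.html", "sort=asc&price=down") gives store.php?var1=products&sort=asc&price=down, so PHP sees var1, sort and price in $_GET.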

Why does my RewriteRule not work when there is a `?` in the URL

I am learning how to write regular expressions for .htaccess redirects.
So far I've managed to figure out everything I needed, except for a couple of regular expressions which don't behave as I expected. I am testing my regular expressions using a desktop application, and they work fine there, but not in the .htaccess file.
FYI: The RewriteBase is set to /site/
This is the incoming URL:
/site/view-by-tag/politics/?el_mcal_month=3&el_mcal_year=2009
I want to grab "politics" and redirect to /site/tags/politics/
Here is what I used:
RewriteRule ^view-by-tag/([a-zA-Z\-]+)/([a-zA-Z0-9\-\/\.\_\=\?\&]+) /tags/$1/ [R=301,L]
I added the capture of all the characters after politics because I am having the issue that when there is a ? in the URL the redirect does not work, and I can't figure out why. In the URL given above, if I remove the ? it works fine, but if the ? is in there, nothing happens. Is there a reason for this?
The same thing happens when I try to capture 307 from /site/?option=com_content&view=article&id=307&catid=89&Itemid=55
I used this regular expression, article&id=([0-9]+) /?p=$1 [R=301,L], but again, when there is a ? in the URL it stops the redirect from doing anything.
What is the reason for that?
The .htaccess file in question is on a Wordpress blog (3.4.1)
The point that you've missed is that the rewrite engine splits the URI into two parts: the REQUEST_URI and the QUERY_STRING. The query string part isn't used in the rule match string so there is no point in constructing rule regexp patterns to look for it.
You can probe and pick out parameters from the query string by using rewrite conditions and condition regexps to set %N variables.
By default the query string is appended to the output substitution string unless the substitution contains a ? of its own (for example ?someparam), in which case the original query string is dropped unless you use the [QSA] (query string append) flag.
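The three cases in that paragraph can be written down as a small Python function. This is a simplified model for illustration only; the name resulting_query and its parameters are made up, and real Apache behaviour has more corner cases than this covers.

```python
def resulting_query(original, substitution, qsa=False):
    """Return the query string the rewritten URL ends up with."""
    if "?" not in substitution:
        # No "?" in the substitution: the original query string passes through.
        return original
    new = substitution.split("?", 1)[1]
    if qsa and original:
        # QSA: keep the new query string and append the original one.
        return new + "&" + original if new else original
    # Otherwise the substitution's query string replaces the original.
    return new
```

A bare trailing ? in the substitution is the degenerate case of the last branch: the new query string is empty, so the original one is simply discarded.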
The way that you'd pick up the id in /site/?option=com_content&view=article&id=307&catid=89&Itemid=55 is to use something like:
RewriteCond %{QUERY_STRING} \bid=(\d+)
Place this before the rule, and it will set %1 to 307. Read the rewrite documentation for a more general discussion of how to do this.
The query string must be processed separately in a RewriteCond if you need to manipulate it, and should not be matched inside the RewriteRule. Instead, just match the request without the query string, and use QSA to append the query string onto the redirect:
RewriteRule ^view-by-tag/([A-Za-z-]+)/?$ /tags/$1/ [R=301,L,QSA]
# OR, if you don't want the rest of the query string appended, put a `?` onto
# the redirect to replace it with nothing
RewriteRule ^view-by-tag/([A-Za-z-]+)/?$ /tags/$1/? [R=301,L]
Actually, the QSA may not be needed in an R redirect. I think the default behavior is to pass the query string along with the redirect.
If you need to capture 307 from the query string, do it in a RewriteCond and capture in %1:
# Capture the id in %1
RewriteCond %{QUERY_STRING} id=([\d]+)
# Redirect everything to /, pass %1 into p
RewriteRule . /?p=%1 [R=301,L]
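The two condition patterns shown in the answers above, one with a leading \b and one without, behave the same on the query string from the question, but not in general. The Python sketch below compares them; Python's re is used as a stand-in for Apache's regex engine, and the reordered query string is a made-up example.

```python
import re

query = "option=com_content&view=article&id=307&catid=89&Itemid=55"
with_boundary = re.search(r"\bid=(\d+)", query).group(1)

# Hypothetical query string where catid comes before id.
reordered = "catid=89&id=307"
loose = re.search(r"id=(\d+)", reordered).group(1)     # matches inside "catid="
strict = re.search(r"\bid=(\d+)", reordered).group(1)  # requires a word boundary
```

The word boundary keeps the pattern from matching the tail of another parameter name such as catid or Itemid, so it is the safer choice when parameter order is not fixed.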

Two question marks and two ampersand in query string?

My query string looks like this:
http://localhost/index.php?page=public&another=http://www.google.com?omg=tt&nop=asd
And of course I rewrite it with a regex, using this rule:
RewriteRule ^pass-([^=]*)=([^=]*)$ index.php?page=$1&another=$2 [L]
so that the URL looks like this:
http://localhost/pass-url=http://www.google.com?omg=tt&nop=asd
(1) But then the URL becomes just http://www.google.com.
If I try urlencode on this URL, without the regex:
http://localhost/index.php?page=public&another=http://www.google.com?omg=tt&nop=asd
it echoes:
http%3A%2F%2Fwww.google.com%3Fomg%3Dtt
(2) In this case the &nop=asd part is gone.
So how do I make (1) work, and why does (2) happen?
The biggest question is how to pass two question marks and ampersands in a query string.
Any suggestions regarding this situation?
Problem is in your rewrite rule. Have it like this:
RewriteRule ^pass-([^=]*)=([^=]*)$ hw.php?page=$1&another=$2 [L,QSA]
Note the additional QSA flag, which makes sure the original query parameters are preserved.
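As for why &nop=asd disappears in case (2): a query string is split into parameters on every ampersand, so an unencoded nested URL is cut at its own &. The Python sketch below illustrates this with urllib.parse.parse_qs, used here as a stand-in for PHP's $_GET parsing (an assumption), and shows that percent-encoding the nested URL before putting it into the link keeps it in one piece.

```python
from urllib.parse import parse_qs, quote

nested = "http://www.google.com?omg=tt&nop=asd"

# Unencoded: the "&" inside the nested URL starts a new top-level parameter.
raw = parse_qs("page=public&another=" + nested)

# Encoded first: "?", "&" and "=" in the nested URL become %3F, %26 and %3D.
encoded = parse_qs("page=public&another=" + quote(nested, safe=""))
```

In the first case another holds only http://www.google.com?omg=tt and nop arrives as a separate parameter; in the second case another holds the full nested URL.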

301 Htaccess RewriteRule Query_String

Problem: Visitors open the url website.com/?i=133r534|213213|12312312 but this url isn't valid anymore and they need to be forwarded to website.com/#Videos:133r534|213213|12312312
What I've tried: Over the last few hours I tried many mod_rewrite (.htaccess) rules using QUERY_STRING, and all of them failed. The last message in this topic shows a solution for this problem, but what would the rule be in my situation?
I'm very curious how you would solve this problem :)!
The following will handle the simple case you show. You'll need to add additional logic if you need to allow for other parameters in the query string or file names before the ?.
RewriteEngine On
RewriteCond %{QUERY_STRING} ^i=(.*)
RewriteRule ^.* /#Videos:%1? [NE,R=permanent]
Why is this tricky?
RewriteRule doesn't look at the query string, so you have to use RewriteCond to evaluate the QUERY_STRING variable and capture the part you'll need later (referenced via %1).
The hash character (#) is normally escaped in the redirect, so you must specify the [NE] (no escape) flag to keep it literal.
The trailing ? on the substitution string is required to suppress the original query string.
I tested this on Apache 2.2.
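The capture and substitution can be traced in a few lines of Python, using the #Videos: target named in the question. The call to urllib.parse.quote is only there to show what a percent-encoded hash looks like; it is an illustration and not Apache's actual escaping routine.

```python
import re
from urllib.parse import quote

query = "i=133r534|213213|12312312"

# Same pattern as the RewriteCond: capture everything after "i=" as %1.
cond = re.search(r"^i=(.*)", query)
target = "/#Videos:" + cond.group(1)

# What the "#" would look like if it were percent-encoded instead of kept literal.
escaped_hash = quote("#")
```

An encoded hash would be sent to the server as part of the path instead of being treated by the browser as the start of the fragment, which is why the literal # matters here.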
