htaccess RewriteRule with literal question marks (not query string) - .htaccess

I need to be able to match question marks because there was a translated text encoding mistake, and part of the URL ended up hardcoded with question marks in them. Here's a URL example that I need to rewrite:
https://example.com/Documentation/Product????/index.html
Here is my current rewrite rule. It works when the characters following "Product" are not question marks, but when they are, the rule doesn't apply.
RewriteRule "^Documentation/Product[^/]+/(.*)$" "https://s3.amazonaws.com/company-documentation/Help/Product/$1" [L,NC]
How would I make sure that question marks are considered to be characters too in this rule? I can't expect that only question marks and not the original non-English characters will be in the URL, so I want the rule above to match both question marks and any other character.
I found this topic which seems relevant, but the flags don't help, and the answer doesn't explain how to overcome the problem mentioned in the "Aside".
https://webmasters.stackexchange.com/questions/107259/url-path-with-encoded-question-mark-results-in-incorrect-redirect-when-copied-to

https://example.com/Documentation/Product????/index.html
You say it's "not a query string", but actually that is exactly what it is. And that is why you can't match it with the RewriteRule pattern. The above URL is split as follows:
URL-path: /Documentation/Product (matched by the RewriteRule pattern)
Query string: ???/index.html (note 3 ? - the first one starts the query string)
To match the query string you'll need an additional RewriteCond directive that checks against the QUERY_STRING server variable.
For example, to match the above URL, you would need to do something like:
RewriteCond %{QUERY_STRING} ^\?*/index\.html
RewriteRule ^Documentation/Product$ https://s3.amazonaws.com/company-documentation/Help/Product/index.html [NC,R,L]
This matches any number of erroneous ? at the start of the query string.
I've added the R (redirect) flag. Your directive (without the R flag) would trigger an external redirect anyway (because you specifying an absolute URL in the substitution), but it is far better to be explicit here. This is also a temporary (302) redirect. If this should be permanent (301) then change it to R=301, but only once you have confirmed that it's working OK (301s are cached hard by the browser so can make testing problematic).
UPDATE:
...so I want the rule above to match both question marks and any other character.
Only if there are question marks in the URL will there be a query string, so I think it is advisable to keep these two rules separate.
If there could be any erroneous characters at the start of the query string and if you want to capture the end part of the URL (like you are doing in your original directive, eg. index.html) then you can modify the above to read:
RewriteCond %{QUERY_STRING} /(.*)$
RewriteRule ^Documentation/Product$ https://s3.amazonaws.com/company-documentation/Help/Product/%1 [NC,R,L]
Note the %1 (as opposed to $1) backreference in the substitution string. This is a backreference to the captured group in the last matched CondPattern (ie. /(.*)$).
You can follow this with your existing directive (but remember to include the R flag) for more "normal" URLs that don't contain a ? (ie. query string).
NB: Surrounding the arguments in double quotes are entirely optional in this example. They are only required if you have unescaped spaces in the pattern or substitution arguments.
In summary
# Redirect URLs of the form:
# "/Documentation/Product?<anything#1>/<anything#2>"
RewriteCond %{QUERY_STRING} /(.*)$
RewriteRule ^Documentation/Product$ https://s3.amazonaws.com/company-documentation/Help/Product/%1 [NC,R,L]
# Redirect URL-paths of the form (no query string):
# "/Documentation/Product<something>/<anything>"
RewriteRule ^Documentation/Product[^/]+/(.*) https://s3.amazonaws.com/company-documentation/Help/Product/$1 [NC,R,L]

Related

Replace part of a long URL and redirect

Is there a way to redirect the URL as follows:
URL is generated based on a filtering system so it is like this
https://example.com/product-category-no-slash-generated-part-is-autoadded-here
Due to the massive product number, it is impossible for me to change all generated URL-s but I need to change, for example, only no-slash part to something-else, so redirect does this:
Old URL:
https://example.com/product-category-no-slash-generated-part-is-autoadded-here
New URL:
https://example.com/product-category-something-else-generated-part-is-autoadded-here
I hope I managed to explain the problem.
I tried to use stuff like RewriteRule ^/no-slash/(.*)$ /something-else/$1 [L] but I think this does not work for what I need.
To replace no-slash with something-else in the URL-path that only consists of a single path-segment then you can do something like the following using mod-rewrite, near the top of the root .htaccess file.
RewriteEngine On
# Replace "no-slash" in URL-path with "something-else"
RewriteRule ^([\w-]+)no-slash([\w-]+)$ /$1something-else$2 [R=302,L]
This assumes the URL-path can only consist of the characters 0-9, a-z, A-Z, _ (underscore) and - (hyphen).
The $1 and $2 backreferences contain the matched URl-path before and after the string to replace repsectively.
I tried to use stuff like RewriteRule ^/no-slash/(.*)$ /something-else/$1 [L]
In this you are matching slashes in the URL-path - which do not occur in your example. You are also not allowing for anything before the string you want to replace (eg. product-catgeory-).
In a .htaccess context, the URL-path matched by the RewriteRule pattern does not start with a slash. S, a pattern like ^/no-slash will never match.
UPDATE:
another example. example.com/demo-tools-for-construction-work So word TOOLS in URL must be replaced with EQUIPMENT-AND-TOOLS.
(I'm assuming this should all be lowercase.)
A problem with your second example (in comments) is that tools also exists in the target URL, so this would naturally result in an endless redirect loop.
To prevent this "loop" you would need to exclude the URL you are redirecting to. eg. You could exclude URLs that already contain equipment-and-tools.
For example:
# Replace "tools" in URL-path with "equipment-and-tools"
# - except if it already contains "equipment-and-tools"
RewriteCond %{REQUEST_URI} !equipment-and-tools
RewriteRule ^([\w-]+)tools([\w-]+)$ /$1equipment-and-tools$2 [R=302,L]
The ! prefix on the CondPattern (2nd argument to the RewriteCond directive) negates the expression. So, in this case it is successful when equipment-and-tools is not contained in the requested URL.

Use htaccess to change query parameter to iOS app-specific deep-link [duplicate]

I am trying to do the following:
User visits URL with query parameter: http://www.example.com/?invite=1234
I then want them to be deep linked into the app on their iOS device, so they go to: app_name://1234
Any suggestions on how to accomplish this in my .htaccess file?
I tried this but it doesn't work:
RewriteEngine On # Turn on the rewriting engine
RewriteRule ^invite/(.*)/$ app_name://$1 [NC,L]
If RewriteRule won't work, can anyone send me an example code for RewriteCond or JavaScript to achieve what I need?
Not sure how this will work with the iOS device, but anyway...
RewriteRule ^invite/(.*)/$ app_name://$1 [NC,L]
This doesn't match the given URL. This would match a requested URL of the form example.com/invite/1234/. However, you are also matching anything - your example URL contains digits only.
The RewriteRule pattern matches against the URL-path only, you need to use a RewriteCond directive in order to match the query string. So, to match example.com/?invite=1234 (which has an empty URL-path), you would need to do something like the following instead:
RewriteCond %{QUERY_STRING} ^invite=([^&]+)
RewriteRule ^$ app_name://%1 [R,L]
The %1 backreference refers back to the last matched CondPattern.
I've also restricted the invite parameter value to at least 1 character - or do you really want to allow empty parameter values through? If the value can be only digits then you should limit the pattern to only digits. eg. ^invite=(\d+).
I've include the R flag - since this would have to be an external redirect - if it's going to work at all.
However, this may not work at all unless Apache is aware of the app_name protocol. If its not then it will simply be seen as a relative URL and result in a malformed redirect.

htaccess : How to replace string with special character?

I would like to redirect the URL :
https://www.example.com/blablabla,?trx_id=TX-23
to
https://www.example.com/blablabla,TX-23
Basically, I would like to remove the string : ?trx_id=
I tried the following but it it's not working. It's seems like it's related to special characters
RewriteRule ^(.+)?trx_id=(.+)$ $1$2 [R=301,L]
Can anyone help please ?
Thanks
It does not work because when the request is parsed by apache, it's split in several parts : host, port, path and query string. The directive RewriteRule matches against the url path which is : blabla, in your example, while the parameters are put in the query string trx_id=TX-23 (question mark removed).
To match against the query string, you have to use a condition, like this.
RewriteCond %{QUERY_STRING} ^trx_id=(.+)$
RewriteRule ^(.+) /$1%1 [R=301,QSD,L]
The condition only affects its following rule. Back references such as $N refer to the rule pattern, and %N refer to the condition pattern.
Also note the QSD flag to discard the initial query string, otherwise it would be kept in the rewritten request (at least in apache 2.4, note sure for apache 2.2).

Why does my RewriteRule not work when there is a `?` in the URL

I am learning how to write regular expressions for .htaccess redirects.
So far I've managed to figure out everything I needed, except for a couple of regular expressions which don't behave as I expected. I am testing my regular expressions using a desktop application, and they work fine there, but not in the .htaccess file.
FYI: The RewriteBase is set to /site/
This is the incoming URL:
/site/view-by-tag/politics/?el_mcal_month=3&el_mcal_year=2009
I want to grab "politics" and redirect to /site/tags/politics/
Here is what I used:
RewriteRule ^view-by-tag/([a-zA-Z\-]+)/([a-zA-Z0-9\-\/\.\_\=\?\&]+) /tags/$1/ [R=301,L]
I added the capture of all the characters after politics because I am having the issue that when there is a ? in the URL the redirect does not work, and I can't figure out why. In the URL given above, if I remove the ? it works fine, but if the ? is in there, nothing happens. Is there a reason for this?
The same thing happens when I try to capture 307 from /site/?option=com_content&view=article&id=307&catid=89&Itemid=55
I used this regular expression, article&id=([0-9]+) /?p=$1 [R=301,L] but again, when there is a ? in the URL it stops the redirect for doing anything.
What is the reason for that?
The .htaccess file in question is on a Wordpress blog (3.4.1)
The point that you've missed is that the rewrite engine splits the URI into two parts: the REQUEST_URI and the QUERY_STRING. The query string part isn't used in the rule match string so there is no point in constructing rule regexp patterns to look for it.
You can probe and pick out parameters from the query string by using rewrite conditions and condition regexps to set %N variables.
By default the query string is appended to the output substitution string unless you have a ?someparam in it -- in which case it is ignored unless you used the [QSA] (query string append) parameter.
The way that you'd pick up the id in /site/?option=com_content&view=article&id=307&catid=89&Itemid=55 is to use something like:
RewriteCond %{QUERY_STRING} \bid=(\d+)
Before the rule and this would set %1 to 307. Read the rewrite documentation for more general discussion of how to do this.
The query string is must be processed separately in a RewriteCond if you need to manipulate it, and should not be matched inside the RewriteRule Instead, just match the request not including the query string, and use QSA to append the query string onto the redirect:
RewriteRule ^view-by-tag/([A-Za-z-]+)/?$ /tags/$1/ [R=301,L,QSA]
# OR, if you don't want the rest of the query string appended, put a `?` onto
# the redirect to replace it with nothing
RewriteRule ^view-by-tag/([A-Za-z-]+)/?$ /tags/$1/? [R=301,L]
Actually, the QSA may not be needed in a R redirect - I think that the default behavior is to pass the query string with the redirect.
If you need to capture 307 from the query string, do it in a RewriteCond and capture in %1:
# Capture the id in %1
RewriteCond %{QUERY_STRING} id=([\d]+)
# Redirect everything to /, pass %1 into p
RewriteRule . /?p=%1 [LR=301,L]

301 Htaccess RewriteRule Query_String

Problem: Visitors open the url website.com/?i=133r534|213213|12312312 but this url isn't valid anymore and they need to be forwarded to website.com/#Videos:133r534|213213|12312312
What I've tried: During the last hours I tried many mod_rewrite (.htaccess) rules with using Query_String, all failed. The last message in this topic shows a solution for this problem, but what would be the rule in my situation.
I'm very curious how you would solve this problem :)!
The following will handle the simple case you show. You'll need to add additional logic if you need to allow for other parameters in the query string or file names before the ?.
RewriteEngine On
RewriteCond %{QUERY_STRING} ^i=(.*)
RewriteRule ^.* /#Video:%1? [NE,R=permanent]
Why is this tricky?
RewriteRule doesn't look at the query string, so you have to use RewriteCond to evaluate the QUERY_STRING variable and capture the part you'll need later (referenced via %1)
the hash character (#) is normally escaped, you must specify the [NE] flag
The trailing ? on the substitution string is required to suppress the original query string
I tested this on Apache 2.2.

Resources