What does ::$1 mean in an htaccess? - .htaccess

I've been browsing the symfony2 framework source. In the htaccess file for their example website, I found the %{REQUEST_URI}::$1 written as follows:
RewriteCond %{REQUEST_URI}::$1 ^(/.+)(.+)::\2$
RewriteRule ^(.*) - [E=BASE:%1]
The comment above that rule explains
The following rewrites all other queries to the front controller. The condition ensures that if you are using Apache aliases to do mass virtual hosting, the base path will be prepended to allow proper resolution of the app.php file; it will work in non-aliased environments as well, providing a safe, one-size fits all solution.
However, that doesn't explain the ::$1 or ::\2.
Are they backreferences? If not, what are they? What is their purpose?

I ran into almost the same htaccess file in my Zend project. Here is my understanding of it; I hope it helps.
The htaccess file (located at the Zend project directory, same as index.php) says
RewriteCond %{REQUEST_URI}::$1 ^(/.+)(.+)::\2$
RewriteRule ^(.*)$ - [E=BASE:%1]
RewriteRule ^(.*)$ %{ENV:BASE}index.php [NC,L]
Suppose Zend is installed at http://mydomain.tld/zend
and we are requesting http://mydomain.tld/zend/mycontroller/myaction
Therefore %{REQUEST_URI} will be /zend/mycontroller/myaction.
Note that $1 refers to the first captured group of the pattern in the RewriteRule directive. In the htaccess context [1], that pattern "will initially be matched against the filesystem path, after removing the prefix that led the server to the current RewriteRule (e.g. app1/index.html or index.html depending on where the directives are defined)".
Therefore $1 will be mycontroller/myaction.
And %{REQUEST_URI}::$1 will be /zend/mycontroller/myaction::mycontroller/myaction.
The above string will be matched against ^(/.+)(.+)::\2$. Note that the two capturing groups in parentheses, i.e., (/.+)(.+), can split the part before :: in many ways. For example:
Group 1: /z
Group 2: end/mycontroller/myaction
or
Group 1: /zend/mycontroller/myactio
Group 2: n
and anything in between is a valid match. In fact, the most interesting one would be
Group 1: /zend/
Group 2: mycontroller/myaction
which is the only split that lets the backreference \2 (after ::) match the second group.
In this case, /zend/ will be stored in the environment variable BASE, which is what the first RewriteRule does. The %1 refers to the first captured group of the RewriteCond pattern, which is /zend/.
Looking at the second RewriteRule, it is clear why this is needed. Since index.php can only be found at /zend/index.php, we need to add /zend/ in front of index.php.
Here we assume that the Substitution of the second RewriteRule directive is treated as a URL-path. Refer to [1] and search for "A DocumentRoot-relative path to the resource to be served" under the RewriteRule Directive section.
All the above leave the query string unchanged/untouched. It is up to index.php how to parse the query string (as well as the URI).
Lastly, consider the case where Zend is installed at the domain root.
%{REQUEST_URI} will be /mycontroller/myaction.
$1 will be mycontroller/myaction.
The string to be matched by RewriteCond will be /mycontroller/myaction::mycontroller/myaction.
This time the second group in (/.+)(.+) can never be exactly mycontroller/myaction, because the first group needs at least one character after the initial forward slash. The closest the second group can get is ycontroller/myaction, which is not equal to mycontroller/myaction, so there cannot be a match.
As a result, the first RewriteRule is not used. The BASE environment variable will not be set, and when the second RewriteRule uses it, it will simply be empty.
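The two cases above can be checked outside Apache. The following is a minimal sketch in Python, assuming Python's re module handles this pattern the same way Apache's regex engine does for the simple constructs used here. The function name base_for is hypothetical and only serves the illustration.

```python
import re

# The CondPattern from the RewriteCond directive.
COND_PATTERN = re.compile(r'^(/.+)(.+)::\2$')

def base_for(request_uri, rule_match):
    """Return what %1 (and therefore BASE) would hold, or '' if the condition fails.

    request_uri stands for %{REQUEST_URI}; rule_match stands for $1.
    """
    m = COND_PATTERN.match(f"{request_uri}::{rule_match}")
    return m.group(1) if m else ""

# Installed in a subdirectory: the prefix /zend/ is captured.
print(base_for("/zend/mycontroller/myaction", "mycontroller/myaction"))
# Installed at the domain root: no match, so BASE stays empty.
print(repr(base_for("/mycontroller/myaction", "mycontroller/myaction")))
```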
References
[1] http://httpd.apache.org/docs/current/mod/mod_rewrite.html

The $1 in %{REQUEST_URI}::$1 references the matched string of the RewriteRule directive, i.e., the matched string of .* in ^(.*). So %{REQUEST_URI}::$1 is expanded to the requested URI path as supplied by the user and the current internal URI path that the RewriteRule pattern matched (the query string is not part of it), separated by ::.
The pattern ^(/.+)(.+)::\2$ is used to find a prefix (first capturing group) which makes the remaining part match the part behind the :: (\2 is a back reference to the matched string of the second capturing group of the pattern).
If such a match is found, the prefix is stored in the environment variable BASE ([E=BASE:%1], where %1 references the matched string of the previous successful RewriteCond pattern match).

Related

Replace part of a long URL and redirect

Is there a way to redirect the URL as follows:
URL is generated based on a filtering system so it is like this
https://example.com/product-category-no-slash-generated-part-is-autoadded-here
Due to the massive product number, it is impossible for me to change all generated URL-s but I need to change, for example, only no-slash part to something-else, so redirect does this:
Old URL:
https://example.com/product-category-no-slash-generated-part-is-autoadded-here
New URL:
https://example.com/product-category-something-else-generated-part-is-autoadded-here
I hope I managed to explain the problem.
I tried to use stuff like RewriteRule ^/no-slash/(.*)$ /something-else/$1 [L] but I think this does not work for what I need.
To replace no-slash with something-else in a URL-path that consists of only a single path segment, you can do something like the following using mod_rewrite, near the top of the root .htaccess file.
RewriteEngine On
# Replace "no-slash" in URL-path with "something-else"
RewriteRule ^([\w-]+)no-slash([\w-]+)$ /$1something-else$2 [R=302,L]
This assumes the URL-path can only consist of the characters 0-9, a-z, A-Z, _ (underscore) and - (hyphen).
The $1 and $2 backreferences contain the matched URL-path before and after the string to replace, respectively.
I tried to use stuff like RewriteRule ^/no-slash/(.*)$ /something-else/$1 [L]
In this rule you are matching slashes in the URL-path, which do not occur in your example. You are also not allowing for anything before the string you want to replace (eg. product-category-).
In a .htaccess context, the URL-path matched by the RewriteRule pattern does not start with a slash. So, a pattern like ^/no-slash will never match.
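As a rough check of the rule above, here is a small Python sketch that applies the same pattern to the example URL-path. It assumes Python's re module behaves like Apache's regex engine for this pattern; the function name rewrite_no_slash is made up for the illustration.

```python
import re

# Pattern from the RewriteRule above; re.ASCII keeps \w limited to [A-Za-z0-9_].
NO_SLASH_RULE = re.compile(r'^([\w-]+)no-slash([\w-]+)$', re.ASCII)

def rewrite_no_slash(url_path):
    """url_path is the path as .htaccess sees it, i.e. without the leading slash."""
    m = NO_SLASH_RULE.match(url_path)
    if m is None:
        return None  # rule does not apply
    return f"/{m.group(1)}something-else{m.group(2)}"

path = "product-category-no-slash-generated-part-is-autoadded-here"
print(rewrite_no_slash(path))

# The pattern from the question expects a leading slash and literal slashes,
# so it does not match the same path.
print(re.match(r'^/no-slash/(.*)$', path))
```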
UPDATE:
another example. example.com/demo-tools-for-construction-work So word TOOLS in URL must be replaced with EQUIPMENT-AND-TOOLS.
(I'm assuming this should all be lowercase.)
A problem with your second example (in comments) is that tools also exists in the target URL, so this would naturally result in an endless redirect loop.
To prevent this "loop" you would need to exclude the URL you are redirecting to. eg. You could exclude URLs that already contain equipment-and-tools.
For example:
# Replace "tools" in URL-path with "equipment-and-tools"
# - except if it already contains "equipment-and-tools"
RewriteCond %{REQUEST_URI} !equipment-and-tools
RewriteRule ^([\w-]+)tools([\w-]+)$ /$1equipment-and-tools$2 [R=302,L]
The ! prefix on the CondPattern (2nd argument to the RewriteCond directive) negates the expression. So, in this case it is successful when equipment-and-tools is not contained in the requested URL.
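The loop protection can be sketched in Python as follows. This is only a simulation of the two directives above under the assumption that the regex behaves the same in Python; rewrite_tools is a hypothetical helper name.

```python
import re

# Pattern from the RewriteRule above; re.ASCII keeps \w limited to [A-Za-z0-9_].
TOOLS_RULE = re.compile(r'^([\w-]+)tools([\w-]+)$', re.ASCII)

def rewrite_tools(url_path):
    """Return the redirect target, or None if no redirect is issued.

    url_path is the path without the leading slash, as seen in .htaccess.
    """
    m = TOOLS_RULE.match(url_path)
    if m is None:
        return None
    # RewriteCond %{REQUEST_URI} !equipment-and-tools
    if "equipment-and-tools" in "/" + url_path:
        return None
    return f"/{m.group(1)}equipment-and-tools{m.group(2)}"

first = rewrite_tools("demo-tools-for-construction-work")
print(first)
# The redirected URL is requested next; the condition now fails, so the loop stops.
print(rewrite_tools(first.lstrip("/")))
```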

Use htaccess to change query parameter to iOS app-specific deep-link [duplicate]

I am trying to do the following:
User visits URL with query parameter: http://www.example.com/?invite=1234
I then want them to be deep linked into the app on their iOS device, so they go to: app_name://1234
Any suggestions on how to accomplish this in my .htaccess file?
I tried this but it doesn't work:
RewriteEngine On # Turn on the rewriting engine
RewriteRule ^invite/(.*)/$ app_name://$1 [NC,L]
If RewriteRule won't work, can anyone send me an example code for RewriteCond or JavaScript to achieve what I need?
Not sure how this will work with the iOS device, but anyway...
RewriteRule ^invite/(.*)/$ app_name://$1 [NC,L]
This doesn't match the given URL. This would match a requested URL of the form example.com/invite/1234/. However, you are also matching anything - your example URL contains digits only.
The RewriteRule pattern matches against the URL-path only, you need to use a RewriteCond directive in order to match the query string. So, to match example.com/?invite=1234 (which has an empty URL-path), you would need to do something like the following instead:
RewriteCond %{QUERY_STRING} ^invite=([^&]+)
RewriteRule ^$ app_name://%1 [R,L]
The %1 backreference refers to the first captured group of the last matched CondPattern.
I've also restricted the invite parameter value to at least 1 character - or do you really want to allow empty parameter values through? If the value can be only digits then you should limit the pattern to only digits. eg. ^invite=(\d+).
I've included the R flag - since this would have to be an external redirect - if it's going to work at all.
However, this may not work at all unless Apache is aware of the app_name protocol. If it's not then it will simply be seen as a relative URL and result in a malformed redirect.
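To see what the condition captures, here is a minimal Python sketch of the two directives above. It only models the matching step (empty URL-path plus the query string pattern), not whether the resulting redirect is usable; deep_link is a hypothetical name.

```python
import re

# CondPattern from the RewriteCond above, matched against %{QUERY_STRING}.
INVITE_COND = re.compile(r'^invite=([^&]+)')

def deep_link(url_path, query_string):
    """Return the redirect target, or None if the rule does not apply."""
    if url_path != "":  # RewriteRule ^$ only matches an empty URL-path
        return None
    m = INVITE_COND.search(query_string)
    if m is None:
        return None
    return f"app_name://{m.group(1)}"  # %1 is the captured invite value

print(deep_link("", "invite=1234"))
print(deep_link("", "invite="))          # empty value: condition fails
print(deep_link("invite/1234/", ""))     # non-empty URL-path: rule does not match
```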

How to write this .htaccess rewrite rule

I am setting up a MVC style routing system using mod rewrite within an .htaccess file (and some php parsing too.)
I need to be able to direct different URLs to different php files that will be used as controllers. (index.php, admin.php, etc...)
I have found and edited a rewrite rule that does this well by looking at the first word after the first slash:
RewriteCond %{REQUEST_URI} ^/stats(.*)
RewriteRule ^(.*)$ /hello.php/$1 [L]
However, my problem is I want it to rewrite based on the 2nd word, not the first. I want the first word to be a username. So I want this:
http://www.samplesite.com/username/admin to redirect to admin.php
instead of:
http://www.samplesite.com/admin
I think I just need to edit the rewrite rule slightly with a 'anything can be here' type variable, but I'm unsure how to do that.
I guess you can prefix [^/]+/ to match and ignore that username/
RewriteCond %{REQUEST_URI} ^/[^/]+/stats(.*)
RewriteRule ^[^/]+/(.*)$ /hello.php/$1 [L]
then http://www.samplesite.com/username/statsadmin will be rewritten to http://www.samplesite.com/hello.php/statsadmin (or so, I do not know the rest of the .htaccess file)
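A quick way to check the effect of the [^/]+/ prefix is to run both patterns in Python. This is a sketch under the assumption that the regexes behave the same as in Apache; rewrite_stats is a made-up helper name.

```python
import re

STATS_COND = re.compile(r'^/[^/]+/stats(.*)')  # matched against %{REQUEST_URI}
STATS_RULE = re.compile(r'^[^/]+/(.*)$')       # matched against the path without leading slash

def rewrite_stats(request_uri):
    """Return the rewritten path, or None if the rule or its condition fails."""
    url_path = request_uri[1:]  # .htaccess sees the path without the leading slash
    m = STATS_RULE.match(url_path)
    if m is None or STATS_COND.search(request_uri) is None:
        return None
    return f"/hello.php/{m.group(1)}"  # $1 is everything after "username/"

print(rewrite_stats("/username/statsadmin"))
print(rewrite_stats("/statsadmin"))       # no username segment
print(rewrite_stats("/username/other"))   # second segment does not start with "stats"
```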
To answer your question, "an anything can be here type variable" would be something like a full-stop . - it means "any character". Also the asterisk * means "zero or more of the preceding character or parenthesized grouped characters".
But I don't think you need that...If your matching url will always end in "admin" then you can use the dollar sign $ to match the end of the string.
RewriteRule admin$ /admin.php [R,NC,L]
Rewrites www.anything.at/all/that/ends/in/admin to www.anything.at/admin.php

Ampersands in URL problems

I have a php page which creates URL like:
vendors/London City/cat-DJ & Entertainment/keywords
which my .htaccess redirects as shown below
RewriteRule vendors/(.+)/cat-(.+)/(.+)$ vendors.php?location=$1&category=$2&freetext=$3 [L]
RewriteRule vendors/(.+)/cat-(.+)/(.+)/$ vendors.php?location=$1&category=$2&freetext=$3 [L]
problem 1 is : in the vendors.php file, I am getting only "DJ ; Entertainment" as category. The ampersand is missing.
Problem 2 is : My complete .htaccess file is shown below... 6 rules are defined.
RewriteRule vendors/(.+)/(.+)/$ vendors.php?location=$1&freetext=$2 [L]
RewriteRule vendors/(.+)/(.+)$ vendors.php?location=$1&freetext=$2 [L]
RewriteRule vendors/(.+)/cat-(.+)/$ vendors.php?location=$1&category=$2 [L]
RewriteRule vendors/(.+)/cat-(.+)$ vendors.php?location=$1&category=$2 [L]
RewriteRule vendors/(.+)/cat-(.+)/(.+)$ vendors.php?location=$1&category=$2&freetext=$3 [L]
RewriteRule vendors/(.+)/cat-(.+)/(.+)/$ vendors.php?location=$1&category=$2&freetext=$3 [L]
Why the URL vendors/London City/cat-DJ & Entertainment/keywords is matching with rule 3 or 4 and redirecting to vendors.php?location=$1&category=$2 ?
Does .htaccess process the rules from top to bottom, one by one?
I had solved the problem by putting the rules 5 and 6 at the top of other rules. Did I make the correct fix?
1. I don't really like the idea of having spaces and other special characters in the URLs. I don't know if it's possible with your site, but instead of this kind of URL
vendors/London City/cat-DJ & Entertainment/keywords
you should have this one:
vendors/london-city/cat-dj-and-entertainment/keywords
For that, of course, you will have to perform some additional transformations / lookups in your database to convert london-city back to London City and dj-and-entertainment back to DJ & Entertainment. This can be done by storing these "from-to" pairs in database.
2. In any case -- order of rules matters. Therefore you should start with more specific rules and end up with more generic rules.
Also -- the (.+) pattern is way too broad, as it can match hello as well as hello/pink/kitten. To ensure that you always grab only one section (the part of the URL between slashes), use the ([^/]+) pattern instead -- this will address one of the aspects of your "prob #2".
Therefore, try these optimized rules (each rule will match the URL with and without trailing slash):
RewriteRule ^vendors/([^/]+)/cat-([^/]+)/([^/]+)/?$ vendors.php?location=$1&category=$2&freetext=$3 [L]
RewriteRule ^vendors/([^/]+)/cat-([^/]+)/?$ vendors.php?location=$1&category=$2 [L]
RewriteRule ^vendors/([^/]+)/([^/]+)/?$ vendors.php?location=$1&freetext=$2 [L]
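The first-match-wins behaviour of these three rules can be sketched in Python. This is an illustration only: the RULES list mirrors the patterns above, and the route function is a hypothetical stand-in for Apache walking the rules in order and stopping at the first [L] match.

```python
import re

# Same order as the rules above: most specific first, most generic last.
RULES = [
    (re.compile(r'^vendors/([^/]+)/cat-([^/]+)/([^/]+)/?$'), ("location", "category", "freetext")),
    (re.compile(r'^vendors/([^/]+)/cat-([^/]+)/?$'), ("location", "category")),
    (re.compile(r'^vendors/([^/]+)/([^/]+)/?$'), ("location", "freetext")),
]

def route(url_path):
    """Return the parameters of the first matching rule, or None."""
    for pattern, names in RULES:
        m = pattern.match(url_path)
        if m:
            return dict(zip(names, m.groups()))
    return None

print(route("vendors/London City/cat-DJ & Entertainment/keywords"))
print(route("vendors/London City/cat-DJ/"))
print(route("vendors/London City/keywords"))

# (.+) swallows slashes, ([^/]+) does not.
print(re.match(r'^(.+)$', "hello/pink/kitten").group(1))
print(re.match(r'^([^/]+)$', "hello/pink/kitten"))
```

Note that the second example would also match the third pattern; it is only the ordering that sends it to the category rule.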
Also I'm not getting the value of 'category' with the Ampersand as given in the url. I am getting only semi-colon. What can be the reason?
I do not have an Apache box running at the moment, so I cannot check this right now, but try adding the B or NE flag next to the L flag (e.g. [L,B]) -- one of them should help:
http://httpd.apache.org/docs/current/rewrite/flags.html#flag_b
http://httpd.apache.org/docs/current/rewrite/flags.html#flag_ne
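One plausible reason the ampersand gets lost (an assumption, not something verified against the asker's server) is that an unescaped & copied from the backreference into the query string is read as a parameter separator. The Python sketch below uses the standard library's query string parser to show the general effect and how percent-encoding the value, which is roughly what the B flag does for backreferences, avoids it. The exact output of PHP on the asker's setup may differ.

```python
from urllib.parse import parse_qs, quote

# Query string as it would look if $2 were inserted without escaping.
raw = "location=London City&category=DJ & Entertainment&freetext=keywords"
print(parse_qs(raw)["category"])      # value is cut off at the ampersand

# The same value with non-alphanumeric characters percent-encoded.
escaped = "category=" + quote("DJ & Entertainment", safe="")
print(escaped)
print(parse_qs(escaped)["category"])  # the full value survives
```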
From the docs:
The order in which these rules are defined is important - this is the order in which they will be applied at run-time.

Match "full stop" in mod rewrite

At the moment I am just matching numbers, letters, dashes and underscores in my .htaccess file:
RewriteRule ^([A-Za-z0-9-_]+)/?$ index.php?folder=$1
I also want to match full stops in the string. I don't want to use:
(.*)
I have tried:
([.A-Za-z0-9-_]+)
([\.A-Za-z0-9-_]+)
([\\.A-Za-z0-9-_]+)
([A-Za-z0-9-_\.]+)
None of which seem to work.... how can I escape the full stop so it matches a full stop!
---------- Additional information ----------------
As an example:
mydomain.com/groups/green/ should go to index.php?folder=green
In addition I am also re-writing subdomains over the top of this (I think this is causing the complication)...
anotherdomain.com should map to index.php?folder=anotherdomain.com
I have successfully re-written the subdomain with the following rule:
# external group domain name
RewriteCond %{ENV:Rewrite-Done} !^Yes$
## exclude requests from myhost.com
RewriteCond %{HTTP_HOST} !^www\.myhost\.com
## allowed list of domain masking domains
RewriteCond %{HTTP_HOST} ^(anotherdomain.com|extra.domain.com|external.otherdomain.com)
RewriteRule (.*) /groups/%1/$1
I think this is where the complication lies.
---------------- Solution ----------------------
Despite not finding a solution to the exact problem above, I have worked around it by changing the first re-direct (which maps the external domains) from:
RewriteRule (.*) /groups/%1/$1
to:
RewriteRule (.*) /groups/external/$1&external_domain=%1
The second re-write (on the folder) can then interpret the "external domain" variable instead of the folder.
Your first option is the simplest and is correct. Inside square brackets . has no special meaning, so you include it verbatim without any special escaping needed.
Actually, there is a small problem with the second dash in 0-9-_, because its meaning is ambiguous. If you want a literal dash inside square brackets, it is safest to place it at the beginning (or end) of the character class, where it cannot be read as defining a character range:
([-.A-Za-z0-9_]+)
If that doesn't work there is something else wrong with your RewriteRule. For instance, if this is a global rule rather than per-directory (no RewriteBase) then URLs will begin with a slash /.
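The suggested character class can be tried out in Python before touching the server. This sketch assumes the class behaves the same in Python's re module as in Apache; folder_for is a hypothetical helper that returns what $1 would contain.

```python
import re

# Pattern from the question with the suggested character class swapped in.
FOLDER_RULE = re.compile(r'^([-.A-Za-z0-9_]+)/?$')

def folder_for(url_path):
    """Return the value that would be passed as folder=, or None if there is no match."""
    m = FOLDER_RULE.match(url_path)
    return m.group(1) if m else None

print(folder_for("green/"))
print(folder_for("anotherdomain.com"))
print(folder_for("groups/green/"))  # contains a slash inside the name: no match
```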

Resources