Hello, I am writing a profile page script. In this script I check the value of an incoming $_GET variable and validate that it is an integer. I then compare this value against a $_SESSION value to confirm that users can only access their own accounts. The code looks like this:
// validate $_GET field (min_range has to be nested under the 'options' key to take effect)
if (isset($_GET['identity']) && filter_var($_GET['identity'], FILTER_VALIDATE_INT, array('options' => array('min_range' => 1)))) {
    // if session exists and is === $_GET['identity']
    if (isset($_SESSION['user_identity']) && ((int)$_SESSION['user_identity'] === (int)$_GET['identity'])) {
        // Proceed with code
    }
}
This works fine. For instance, if I pass '0', '2-2', 'abc' or no value as the $_GET value, the check correctly fails and the user is redirected to the home page.
What I then tried to do was alter my .htaccess file to map the URLs to 'profile/1' just to tidy it up.
RewriteRule ^profile$ profile.php
RewriteRule ^profile/([0-9]+)$ profile.php?identity=$1 [NC,L]
What I found is that the page no longer redirects when given the invalid $_GET values above. It just tries to find 'profile/abc'.
Does anyone know why?
I use this and it works for me:
RewriteEngine On
RewriteBase /
RewriteRule ^profile$ profile.php
RewriteRule ^profile/([a-z0-9\-]+)$ profile.php?identity=$1 [NC,L,QSA]
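Now, how did you get profile/abc? If you try to pass letters, the rule won't match, since your pattern only allows digits ([0-9]+). The request is therefore never rewritten to profile.php, so your PHP validation never runs and Apache looks for 'profile/abc' instead. If you want to pass letters you will need to use: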
RewriteRule ^profile/([a-z0-9\-]+)/?$ profile.php?identity=$1 [NC,L,QSA]
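To see why the invalid values stop reaching the PHP validation, it can help to emulate the original rule's pattern outside Apache. The following is only a rough sketch: mod_rewrite uses PCRE, so preg_match() should behave similarly, but the helper name rewrite_target() is made up for illustration and is not part of Apache or PHP.

```php
<?php
// Hypothetical helper that emulates:
//   RewriteRule ^profile/([0-9]+)$ profile.php?identity=$1 [NC,L]
// Returns the rewritten target, or null when the rule does not match
// (in which case Apache looks for a real file and typically answers 404).
function rewrite_target(string $urlPath): ?string
{
    // In .htaccess the pattern is matched against the URL-path without the leading slash.
    $path = ltrim($urlPath, '/');
    if (preg_match('#^profile/([0-9]+)$#i', $path, $m) === 1) {
        return 'profile.php?identity=' . $m[1];
    }
    return null;
}

var_dump(rewrite_target('/profile/1'));   // rewritten, so profile.php runs
var_dump(rewrite_target('/profile/abc')); // NULL, so profile.php never runs
var_dump(rewrite_target('/profile/0'));   // rewritten; the PHP check then rejects 0
```

In other words, once the rule only accepts digits, values such as 'abc' or '2-2' are filtered out by Apache before PHP is involved, and only digit strings like '0' still depend on the PHP check.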
Hoping this isn't a duplicate, done a lot of looking and I just get more confused as I don't use .htaccess often.
I would like to have some pretty URLs and see lots of help regarding getting information where for example index.php is passed a parameter such as page. So I can currently convert www.example.com/index.php?page=help to www.example.com/help.
Obviously I'm not clued up on this but I would like to parse a URL such as www.example.com/?page=help.
Can't seem to find much info and adapting the original I am obviously going wrong somewhere.
Any help or pointers in the right direction would be greatly appreciated. I'm sure it's probably stupidly simple.
My alterations so far which do not seem to work are:
RewriteCond %{THE_REQUEST} ^.*/?page=$1
RewriteRule ^(.*)/+page$ /$1[QSA,L]
Also recently tried QUERY_STRING but just getting server error.
RewriteCond %{QUERY_STRING} ^page=([a-zA-Z]*)
RewriteRule ^(.*) /$1 [QSA,L]
Given up as dead to the world so thought I would ask. Hoping to ensure the request/url etc starts ?page and wanting to make a clean URL from the page parameter.
This is the whole/basic process...
1. HTML Source
Make sure you are linking to the "pretty/canonical" URL in your HTML source. This should be a root-relative URL starting with a slash (or absolute), in case you rewrite from different URL path depths later. For example:
<a href="/help">Help Page</a>
2. Rewrite the "pretty" URL
In .htaccess (using mod_rewrite), internally rewrite the "pretty" URL back to the file that actually handles the request, ie. the "front-controller" (eg. index.php, passing the page URL parameter if you wish). For example:
DirectoryIndex index.php
RewriteEngine On
# Rewrite URL of the form "/help" to "index.php?page=help"
RewriteRule ^[^.]+$ index.php?page=$0 [L]
The RewriteRule pattern ^[^.]+$ matches any URL-path that does not include a dot. By excluding a dot we can easily omit any request that would map to a physical file (that includes a file extension delimited by a dot).
The $0 backreference contains the entire URL-path that is matched by the RewriteRule pattern.
The DirectoryIndex is required when the "homepage" (root-directory) is requested, when the URL-path is otherwise empty. In this case the page URL parameter is not passed to our script.
3. Implement the front-controller / router (ie. index.php)
In index.php (your "front-controller" / router) we read the page URL parameter and serve the appropriate content. For example:
<?php
$pages = [
'home' => '/content/homepage.php',
'help' => '/content/help-page.php',
'about' => '/content/about-page.php',
'404' => '/content/404.php',
];
// Default to "home" if "page" URL param is omitted or is empty
$page = empty($_GET['page']) ? 'home' : $_GET['page'];
// Default to 404 "page" if not found in the array/DB of pages
$handler = $pages[$page] ?? $pages['404'];
include($_SERVER['DOCUMENT_ROOT'].$handler);
As seen in the above script, the actual "content" is stored in the /content subdirectory. (This could also be a location outside of the document root.) By storing these files in a separate directory they can be easily protected from direct access.
4. Redirect the "old/ugly" URL to the "new/pretty" URL [OPTIONAL]
This is only strictly necessary (in order to preserve SEO) if you are changing an existing URL structure and the "old/ugly" (original) URLs have been exposed (indexed by search engines, linked to by third parties, etc.). Without the redirect, the "old" URL (ie. /index.php?page=abc) remains accessible. This is the same whenever you change an existing URL structure.
If the site is new and you are implementing the "new/pretty" URLs from the start then this is not so important, but it does prevent users from accessing the old URLs if they were ever exposed/guessed.
The following would go before the internal rewrite and after the RewriteEngine directive. For example:
# Redirect "old" URL of the form "/index.php?page=help" to "/help"
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{REQUEST_URI} ^/index\.php$ [OR]
RewriteCond %{QUERY_STRING} ^page=([^.&]*)
RewriteRule ^(index\.php)?$ /%1 [R=301,L]
The check against the REDIRECT_STATUS environment variable prevents a redirect-loop by not redirecting requests that have already been rewritten by the later rewrite.
The %1 backreference contains the value of the page URL parameter, as captured from the preceding CondPattern (RewriteCond directive). (Note how this is different to the $n backreference as used in the rewrite above.)
The above redirects all URL variants both with/without index.php and with/without the page URL parameter. For example:
/index.php?page=help -> /help
/?page=help -> /help
/index.php -> / (homepage)
/?page= -> / (homepage)
TIP: Test first with 302 (temporary) redirects to prevent potential caching issues.
Comments / improvements / Exercises for the reader
The above does not handle additional URL parameters. You can use the QSA (Query String Append) flag on the initial rewrite to append additional URL parameters on the initially requested URL. However, implementing the reverse redirect is not so trivial.
You don't need to pass the page URL parameter in the rewrite. The entire (original) URL is available in the PHP superglobal $_SERVER['REQUEST_URI'] (which also includes the query string - if any). You can then parse this variable to extract the required part of the URL instead of relying on the page URL parameter. This generally allows greatest flexibility, without having to modify .htaccess later.
However, being able to pass a page URL parameter can be "useful" if you ever want to manually rewrite (override) a URL route using .htaccess.
Incorporate regex (wildcard pattern matching) in the "router" script so you can generate URLs with "parameters". eg. /<page>/<param1>/<param2> like /photo/cat/large.
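As a rough illustration of the last two points (reading $_SERVER['REQUEST_URI'] directly and matching routes with regex), a router could look something like the sketch below. The route table, the handler names and the resolve_route() function are all assumptions made up for this example, not part of the setup described above.

```php
<?php
// Hypothetical route table: regex pattern => handler name (illustrative only).
const ROUTES = [
    '#^/$#'                            => 'home',
    '#^/help$#'                        => 'help',
    '#^/photo/([a-z0-9-]+)/([a-z]+)$#' => 'photo',
];

// Resolve a raw REQUEST_URI to [handler, params]; falls back to '404'.
function resolve_route(string $requestUri): array
{
    // REQUEST_URI includes the query string, so keep only the path component.
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path) || $path === '') {
        $path = '/';
    }
    foreach (ROUTES as $pattern => $handler) {
        if (preg_match($pattern, $path, $m) === 1) {
            // Captured groups become the route "parameters".
            return [$handler, array_slice($m, 1)];
        }
    }
    return ['404', []];
}

// Usage sketch: [$handler, $params] = resolve_route($_SERVER['REQUEST_URI']);
```

With this approach the .htaccess rewrite only needs to send everything to index.php, and new URL shapes are added in PHP rather than in .htaccess.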
Reference:
https://httpd.apache.org/docs/2.4/rewrite/
https://httpd.apache.org/docs/2.4/rewrite/intro.html
https://httpd.apache.org/docs/2.4/mod/mod_rewrite.html
RewriteCond %{QUERY_STRING} ^page=([^&]+)
RewriteRule ^$ /%1? [R=302,L]
Can't delete and didn't want to waste anyone's time responding.
I am trying to do the following:
User visits URL with query parameter: http://www.example.com/?invite=1234
I then want them to be deep linked into the app on their iOS device, so they go to: app_name://1234
Any suggestions on how to accomplish this in my .htaccess file?
I tried this but it doesn't work:
RewriteEngine On # Turn on the rewriting engine
RewriteRule ^invite/(.*)/$ app_name://$1 [NC,L]
If RewriteRule won't work, can anyone send me an example code for RewriteCond or JavaScript to achieve what I need?
Not sure how this will work with the iOS device, but anyway...
RewriteRule ^invite/(.*)/$ app_name://$1 [NC,L]
This doesn't match the given URL. This would match a requested URL of the form example.com/invite/1234/. However, you are also matching anything - your example URL contains digits only.
The RewriteRule pattern matches against the URL-path only, you need to use a RewriteCond directive in order to match the query string. So, to match example.com/?invite=1234 (which has an empty URL-path), you would need to do something like the following instead:
RewriteCond %{QUERY_STRING} ^invite=([^&]+)
RewriteRule ^$ app_name://%1 [R,L]
The %1 backreference refers back to the last matched CondPattern.
I've also restricted the invite parameter value to at least 1 character - or do you really want to allow empty parameter values through? If the value can be only digits then you should limit the pattern to only digits. eg. ^invite=(\d+).
I've included the R flag, since this would have to be an external redirect, if it's going to work at all.
However, this may not work at all unless Apache is aware of the app_name protocol. If it's not then it will simply be seen as a relative URL and result in a malformed redirect.
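If the .htaccess redirect turns out to be malformed, one alternative is to build the redirect in PHP instead. This is only a sketch under assumptions: it assumes the invite code is digits only, that the request reaches a PHP script, and that the iOS device will follow a redirect to a custom scheme, which is not guaranteed. The function name invite_deep_link() is hypothetical.

```php
<?php
// Hypothetical helper: build the custom-scheme URL for a numeric invite code,
// or return null if the parameter is missing or not digits-only.
function invite_deep_link(array $query, string $scheme = 'app_name'): ?string
{
    $invite = $query['invite'] ?? '';
    if (!is_string($invite) || !ctype_digit($invite)) {
        return null;
    }
    return $scheme . '://' . $invite;
}

// Usage sketch (e.g. at the top of index.php):
// $target = invite_deep_link($_GET);
// if ($target !== null) {
//     header('Location: ' . $target, true, 302);
//     exit;
// }
```

Validating the value before echoing it into a Location header also avoids passing arbitrary user input into the redirect.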
So, after searching for a solution all over this community, my question is as follows:
Im working within the Wordpress enviroment, Apache server. I have a folder within uploads named /restricted/. Everything in here (any file extension) can only be accessed if:
A cookie named 'custom_cookie' is set
And this cookie value must be a partial match of the URL request
If these conditions fail, an image is served. Inside this /restricted/ folder I have a .htaccess file. Everything must (preferably) be done in that .htaccess file, not in the root .htaccess file.
The cookie is set by functions.php, no problem with that part. And comments about security are not the question here.
This is an url example (localhost): http://localhost/komfortkonsult/wp-content/uploads/restricted/some-file.jpg?r=870603c9d23f2b7ea7882e89923582d7
The first condition (a cookie named custom_cookie is set) is working with this:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /komfortkonsult/
RewriteCond %{REQUEST_URI} ^.*uploads/restricted/.*
RewriteCond %{HTTP_COOKIE} !custom_cookie
RewriteRule . /komfortkonsult/restricted.png [R,L]
</IfModule>
However, with the next part I'm totally lost. I tried and failed with the following approaches:
RewriteCond %{HTTP_COOKIE} custom_cookie=(.*)$
RewriteCond %1::%{REQUEST_URI} ^(.*?)::/\1/?
RewriteRule . /komfortkonsult/restricted.png [R,L]
Likewise:
RewriteCond %{QUERY_STRING} ^r=(.*)$
RewriteRule ^/ - [E=COOKIE_MATCH:%1]
RewriteCond %{HTTP_COOKIE} !custom_cookie="%{ENV:COOKIE_MATCH}"
RewriteRule . /komfortkonsult/restricted.png [R,L]
Likewise:
RewriteCond %{HTTP_COOKIE} custom_cookie=([^;]+) [NC]
RewriteCond %{REQUEST_URI} !%1 [NC]
RewriteRule . /komfortkonsult/restricted.png [R,L]
And so on. I really want to keep this inside the .htaccess, instead of using validation through a .php file call. But if that is the only solution for my architecture, please provide a full working example (not foo=bar, your redirects goes here...)
Any other approaches of my objectives are welcome.
Thanks so much for helping me out with this.
/ Intervik
Update (after accepted answer and working) example of usage
The objectives are one layer of protection in a Wordpress single install. All media, images or other files, uploaded and attached to pages, are hidden (replaced by an image) if A) the user is not logged-in or B) The user is logged in but not with the capability of 'edit_post'.
But the restriction is only for files uploaded into a unique folder called /restricted/. The folder resides in the original Wordpress /uploads/ root. This restricted material must not be direct-linked or accessible to search engines etc. No browser cache is allowed and the restriction must work immediately after log-out. And more... but I think you get it.
The namespace 'custom_cookie' is just a providing example. And the examples showing the Wordpress install is within a subfolder on localhost. LIKE h**p://example.com/workspace/. Remove 'workspace/' if in root.
The cookie architecture, functions.php
function intervik_theme_set_custom_cookie(){
    global $current_user;
    $has_cookie = isset($_COOKIE['custom_cookie']);
    if(is_user_logged_in() && current_user_can('edit_posts')){
        // $current_user->roles is an array, so join it before building the token
        $cookie_value = $current_user->ID . '|' . $current_user->user_login . '|' . implode(',', (array) $current_user->roles);
        $cookie_hash = hash_hmac('md5', $cookie_value, wp_salt('auth'));
        if(!$has_cookie){
            setcookie('custom_cookie', $cookie_hash, time()+36, '/');
            $_COOKIE['custom_cookie'] = $cookie_hash;
        } elseif(!hash_equals($cookie_hash, (string) $_COOKIE['custom_cookie'])){
            // Stale or forged cookie: expire it
            setcookie('custom_cookie', '', 1, '/');
            unset($_COOKIE['custom_cookie']);
        }
    } elseif($has_cookie){
        // Not logged in, or lacking the capability: expire any existing cookie
        setcookie('custom_cookie', '', 1, '/');
        unset($_COOKIE['custom_cookie']);
    }
}
add_action('init', 'intervik_theme_set_custom_cookie');
As you can see, each cookie is unique for each valid user, for each +36 seconds period (enough for a page load, but use +120 for 2 minutes). This "token" is applied to every request sent to the server:
The link to attachment url filter:
function intervik_restricted_wp_get_attachment_url($url, $post_id){
if(strpos($url, '/restricted/') !== FALSE){
if(isset($_COOKIE['custom_cookie'])){
$url = add_query_arg('r', $_COOKIE['custom_cookie'], $url);
}
}
return $url;
}
add_filter('wp_get_attachment_url', 'intervik_restricted_wp_get_attachment_url', 10, 2);
We are not allowing any other query strings. Note that more filters must be added for sizes, like wp_get_attachment_image_src etc. But for direct links to media, this is enough.
Replacing the if(current_user_can('edit_posts')) with another if(is_user_logged_in()) ... changes everything to just logged-in/logged-out users. Then skip the filters in the admin backend with if(!is_admin() && strpos($url, '/restricted/') !== FALSE) ...
And finally the .htaccess file, in the root of the uploads/restricted/ folder:
# BEGIN Intervik
Options +FollowSymLinks -Indexes
<IfModule !mod_rewrite.c>
Deny from all
</IfModule>
<IfModule mod_headers.c>
Header set Cache-Control "no-cache, no-store, must-revalidate"
Header set Pragma "no-cache"
Header set Expires 0
</IfModule>
RewriteEngine On
RewriteCond %{HTTP_COOKIE}::%{QUERY_STRING} !\bcustom_cookie=([0-9a-f]{32})\b.*::r=\1(&|$)
RewriteRule . /workspace/restricted.png? [R,L]
# END Intervik
I also placed the nice PNG image "Restricted Access timeout" in the Wordpress install root. This is also served as the thumbnail in the Library admin area for non-valid administrators. The upload filter or backend is another area.
We are not protecting England's financial plans here, but we want to keep some paperwork for an organisation and some pictures away from Google and from your wife.
Please comment
It's actually working, and you are welcome to comment on the flaws or security risks. However, there is also another validation layer with PHP above this layer in our install, but we need speed for the not-so-important stuff.
You've got some of the correct bits in your different attempts, but you need to bring them together in the correct order.
Try the following instead:
RewriteEngine On
# custom_cookie value is 32 char hex and must match the value of the "r" URL parameter
RewriteCond %{HTTP_COOKIE}::%{QUERY_STRING} !\bcustom_cookie=([0-9a-f]{32})\b.*::r=\1(&|$)
RewriteRule ^ /komfortkonsult/restricted.png [QSD,R,L]
The QSD flag (Apache 2.4+) is required to remove the query string from the redirected URL. Alternatively, if you are still using Apache 2.2 then you can append a ? to the substitution instead.
Note that the RewriteBase is not required here. The <IfModule> should also be removed. The <IfModule mod_rewrite.c> wrapper is only required if this is intended to work without mod_rewrite being available. It is not. If mod_rewrite is not available then your conditions will simply fail silently and access will be unrestricted. In this case, it is preferable to fail with an error and access is forbidden (for everyone).
Assumptions:
The cookie value is a 32 character hex value (as in your example).
The r URL parameter is always the first URL parameter (as in your example).
You mentioned "any file extension", however, redirecting to an image only really "works" if an image is being requested in the first place. If you have files other than images it may be preferable to simply return a 403 Forbidden. (Strictly speaking, sending a 403 is the correct response rather than a 302, followed by 200 OK.) To send a 403 instead, just change the RewriteRule directive to read:
RewriteRule ^ - [F]
How this works...
An important point, that is missed from all but one of your examples, is the r URL parameter is part of the query string, not the URL-path. The REQUEST_URI server variable contains the URL-path only, which notably excludes the query string. To match the query string you need to compare against the QUERY_STRING server variable.
%{HTTP_COOKIE}::%{QUERY_STRING} - The cookie HTTP request header is joined with the query string using a separator (::) that is guaranteed to not appear in either value. This forms the TestString.
!\bcustom_cookie=([0-9a-f]{32})\b.*::r=\1(&|$) - This is the CondPattern that matches the TestString. \b is a word boundary, so we match only this specific cookie. The value of this cookie is captured using ([0-9a-f]{32}). We then skip over any remaining characters in the cookie header until we get to our separator (::). After this we are matching against the query string (value of the QUERY_STRING server variable in the TestString). The "magic" is the \1 backreference to the first captured group, ie. the cookie value.
The ! prefix on the CondPattern negates the entire pattern. So, the condition is successful when this pattern does not match, ie. when the values of the cookie and URL parameter are different (or not present at all).
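The condition described above can be tried out away from the server with a small PHP sketch. This assumes preg_match() is a close enough stand-in for the PCRE engine mod_rewrite uses; the function name cookie_matches_query() is hypothetical. It tests the pattern without the ! prefix, so a return value of true means the request would be let through and false means the RewriteRule would fire.

```php
<?php
// Emulates the (non-negated) CondPattern against "%{HTTP_COOKIE}::%{QUERY_STRING}".
function cookie_matches_query(string $cookieHeader, string $queryString): bool
{
    // Join both values with the "::" separator to form the TestString.
    $testString = $cookieHeader . '::' . $queryString;
    // \1 refers back to the captured 32-char hex cookie value.
    $pattern = '/\bcustom_cookie=([0-9a-f]{32})\b.*::r=\1(&|$)/';
    return preg_match($pattern, $testString) === 1;
}

$hash = '870603c9d23f2b7ea7882e89923582d7';
var_dump(cookie_matches_query("other=1; custom_cookie=$hash; x=2", "r=$hash")); // same value on both sides
var_dump(cookie_matches_query("custom_cookie=$hash", 'r=' . str_repeat('0', 32))); // different values
var_dump(cookie_matches_query('', "r=$hash")); // no cookie at all
```

The first call should report a match, while the other two should not, which corresponds to the redirect (or 403) being triggered only for them.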
Why your attempts were not working...
RewriteCond %{HTTP_COOKIE} custom_cookie=(.*)$
RewriteCond %1::%{REQUEST_URI} ^(.*?)::/\1/?
This assumes your cookie is the last cookie in the Cookie header. This is difficult to guarantee.
You are trying to match the cookie value with the entire URL-path (REQUEST_URI), so this will never match. It assumes your URL is of the form: http://localhost/870603c9d23f2b7ea7882e89923582d7.
RewriteCond %{QUERY_STRING} ^r=(.*)$
RewriteRule ^/ - [E=COOKIE_MATCH:%1]
RewriteCond %{HTTP_COOKIE} !custom_cookie="%{ENV:COOKIE_MATCH}"
Good, you are checking the query string for the URL parameter value. However...
The first RewriteRule never matches because the URL-path never starts with a slash in per-directory (.htaccess) context. Consequently, the COOKIE_MATCH environment variable is never set.
The CondPattern is a regex, not a plain string, so %{ENV:COOKIE_MATCH} is not evaluated - it is seen as a literal string. You've also enclosed this in double quotes, which aren't part of the cookie value either.
RewriteCond %{HTTP_COOKIE} custom_cookie=([^;]+) [NC]
RewriteCond %{REQUEST_URI} !%1 [NC]
Again, you are comparing against the URL-path, not the query string. However, as mentioned above, the %1 backreference is not evaluated in the CondPattern, so this is seen as a literal string anyway.
It is because the %{VARIABLE} (and %1 etc.) expressions are not evaluated in the CondPattern that we need to use the seemingly complex expression that uses a regex backreference of the form:
%{VAR1}##%{VAR2} ^(.+)##\1$
Normally, the practice or very old way of displaying some profile page is like this:
www.domain.com/profile.php?u=12345
where u=12345 is the user id.
In recent years, I found some website with very nice urls like:
www.domain.com/profile/12345
How do I do this in PHP?
Just as a wild guess, is it something to do with the .htaccess file? Can you give me more tips or some sample code on how to write the .htaccess file?
According to this article, you want a mod_rewrite (placed in an .htaccess file) rule that looks something like this:
RewriteEngine on
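RewriteRule ^news/([0-9]+)\.html /news.php?news_id=$1
(Note there is no leading slash on the pattern: in .htaccess the URL-path being matched never starts with one.) This maps requests for
/news/63.html
to
/news.php?news_id=63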
Another possibility is doing it with ForceType, which forces anything under a particular path to be evaluated by PHP. So, in your .htaccess file, put the following:
<Files news>
ForceType application/x-httpd-php
</Files>
And then the news script (a PHP file named news, with no extension) can take action based on the $_SERVER['PATH_INFO'] variable:
<?php
echo $_SERVER['PATH_INFO'];
// outputs '/63.html'
?>
I recently used the following in an application that is working well for my needs.
.htaccess
<IfModule mod_rewrite.c>
# enable rewrite engine
RewriteEngine On
# if requested url does not exist pass it as path info to index.php
RewriteRule ^$ index.php?/ [QSA,L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.*) index.php?/$1 [QSA,L]
</IfModule>
index.php
foreach (explode ("/", $_SERVER['REQUEST_URI']) as $part)
{
// Figure out what you want to do with the URL parts.
}
I will try to explain this problem step by step in the following example.
0) Question
Let me restate the question like this:
I want to open a page like a Facebook profile, e.g. www.facebook.com/kaila.piyush
The ID is taken from the URL and passed to profile.php, which fetches the data from the database and shows the user their profile.
Normally, when we develop a website, its links look like
www.website.com/profile.php?id=username
example.com/weblog/index.php?y=2000&m=11&d=23&id=5678
Now we want the new style instead: www.website.com/username or example.com/weblog/2000/11/23/5678 as a permalink
http://example.com/profile/userid (get a profile by the ID)
http://example.com/profile/username (get a profile by the username)
http://example.com/myprofile (get the profile of the currently logged-in user)
1) .htaccess
Create a .htaccess file in the root folder or update the existing one :
Options +FollowSymLinks
# Turn on the RewriteEngine
RewriteEngine On
# Rules
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php
What does that do ?
If the request is for a real directory or file (one that exists on the server), it is served as-is and index.php isn't used; otherwise every URL is internally rewritten to index.php.
2) index.php
Now, we want to know what action to trigger, so we need to read the URL :
In index.php :
// index.php
// This is necessary when index.php is not in the root folder, but in some subfolder...
// We compare $requestURI and $scriptName to remove the inappropriate values
$requestURI = explode('/', $_SERVER['REQUEST_URI']);
$scriptName = explode('/', $_SERVER['SCRIPT_NAME']);
for ($i= 0; $i < sizeof($scriptName); $i++)
{
if ($requestURI[$i] == $scriptName[$i])
{
unset($requestURI[$i]);
}
}
$command = array_values($requestURI);
With the URL http://example.com/profile/19837, $command would contain (note that every element is a string):
$command = array(
[0] => 'profile',
[1] => '19837',
)
Now, we have to dispatch the URLs. We add this in the index.php :
// index.php
require_once("profile.php"); // We need this file
switch($command[0])
{
case 'profile' :
// We run the profile function from the profile.php file.
profile($command[1]);
break;
case 'myprofile' :
// We run the myProfile function from the profile.php file.
myProfile();
break;
default:
// Wrong page ! You could also redirect to your custom 404 page.
echo "404 Error : wrong page.";
break;
}
3) profile.php
Now in the profile.php file, we should have something like this :
// profile.php
function profile($chars)
{
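// We check if $chars is all digits (ie. an ID) or not (ie. a potential username).
// URL segments always arrive as strings, so is_int() would never be true here.
if (ctype_digit((string)$chars)) {
$id = (int)$chars;
// Do the SQL to get the $user from his ID
// ........
} else {
// $link is assumed to be your open mysqli connection, made available in this scope
$username = mysqli_real_escape_string($link, $chars);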
// Do the SQL to get the $user from his username
// ...........
}
// Render your view with the $user variable
// .........
}
function myProfile()
{
// Get the currently logged-in user ID from the session :
$id = ....
// Run the above function :
profile($id);
}
Simple way to do this. Try this code. Put code in your htaccess file:
Options +FollowSymLinks
RewriteEngine on
RewriteRule ^profile/([^/]+)/?$ profile.php?u=$1 [L]
This handles pretty URLs of this type:
http://www.domain.com/profile/12345/
For more .htaccess pretty URL rules: http://www.webconfs.com/url-rewriting-tool.php
It's actually not PHP, it's apache using mod_rewrite. What happens is the person requests the link, www.example.com/profile/12345 and then apache chops it up using a rewrite rule making it look like this, www.example.com/profile.php?u=12345, to the server. You can find more here: Rewrite Guide
mod_rewrite is not the only answer. You could also use Options +MultiViews in .htaccess and then check $_SERVER['REQUEST_URI'] to find everything that is in the URL.
There are lots of different ways to do this. One way is to use the RewriteRule techniques mentioned earlier to mask query string values.
One of the ways I really like is if you use the front controller pattern, you can also use urls like http://yoursite.com/index.php/path/to/your/page/here and parse the value of $_SERVER['REQUEST_URI'].
You can easily extract the /path/to/your/page/here bit with the following bit of code:
$route = substr($_SERVER['REQUEST_URI'], strlen($_SERVER['SCRIPT_NAME']));
From there, you can parse it however you please, but for pete's sake make sure you sanitise it ;)
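As one possible way of doing that sanitising, the sketch below splits the route into segments and rejects anything outside a conservative character set. The function name route_segments() and the allowed characters are assumptions chosen for illustration; adjust them to whatever your URLs really need.

```php
<?php
// Hypothetical helper: split a route such as "/path/to/page" into safe segments.
// Returns null if any segment contains characters outside [A-Za-z0-9_-].
function route_segments(string $route): ?array
{
    // REQUEST_URI may carry a query string; drop it, then trim surrounding slashes.
    $path = trim(explode('?', $route, 2)[0], '/');
    if ($path === '') {
        return [];
    }
    $segments = explode('/', $path);
    foreach ($segments as $segment) {
        // \A and \z anchor the whole segment (no trailing-newline loophole).
        if (preg_match('/\A[A-Za-z0-9_-]+\z/', $segment) !== 1) {
            return null;
        }
    }
    return $segments;
}

// Usage sketch: $segments = route_segments($route); treat null as a 404.
```

A whitelist like this rejects segments such as ".." or empty segments outright, rather than trying to strip out dangerous characters afterwards.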
It looks like you are talking about a RESTful webservice.
http://en.wikipedia.org/wiki/Representational_State_Transfer
The .htaccess file does rewrite all URIs to point to one controller, but that is more detailed than you want to get at this point. You may want to look at Recess
It's a RESTful framework all in PHP
I am learning how to write regular expressions for .htaccess redirects.
So far I've managed to figure out everything I needed, except for a couple of regular expressions which don't behave as I expected. I am testing my regular expressions using a desktop application, and they work fine there, but not in the .htaccess file.
FYI: The RewriteBase is set to /site/
This is the incoming URL:
/site/view-by-tag/politics/?el_mcal_month=3&el_mcal_year=2009
I want to grab "politics" and redirect to /site/tags/politics/
Here is what I used:
RewriteRule ^view-by-tag/([a-zA-Z\-]+)/([a-zA-Z0-9\-\/\.\_\=\?\&]+) /tags/$1/ [R=301,L]
I added the capture of all the characters after politics because I am having the issue that when there is a ? in the URL the redirect does not work, and I can't figure out why. In the URL given above, if I remove the ? it works fine, but if the ? is in there, nothing happens. Is there a reason for this?
The same thing happens when I try to capture 307 from /site/?option=com_content&view=article&id=307&catid=89&Itemid=55
I used this regular expression, article&id=([0-9]+) /?p=$1 [R=301,L] but again, when there is a ? in the URL it stops the redirect for doing anything.
What is the reason for that?
The .htaccess file in question is on a Wordpress blog (3.4.1)
The point that you've missed is that the rewrite engine splits the URI into two parts: the REQUEST_URI and the QUERY_STRING. The query string part isn't used in the rule match string so there is no point in constructing rule regexp patterns to look for it.
You can probe and pick out parameters from the query string by using rewrite conditions and condition regexps to set %N variables.
By default the query string is appended to the output substitution string unless you have a ?someparam in it -- in which case it is ignored unless you used the [QSA] (query string append) parameter.
The way that you'd pick up the id in /site/?option=com_content&view=article&id=307&catid=89&Itemid=55 is to use something like:
RewriteCond %{QUERY_STRING} \bid=(\d+)
Before the rule and this would set %1 to 307. Read the rewrite documentation for more general discussion of how to do this.
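To make the path/query split concrete, here is a small PHP sketch that mirrors what the condition above would capture. It assumes preg_match() behaves like the PCRE matching in mod_rewrite; capture_id() is a made-up name used only for this illustration.

```php
<?php
// Hypothetical helper: mirror what
//   RewriteCond %{QUERY_STRING} \bid=(\d+)
// would capture as %1 for a given request URL, or null if it would not match.
function capture_id(string $url): ?string
{
    // Only the part after "?" is the query string; the rule pattern never sees it.
    $query = parse_url($url, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null;
    }
    return preg_match('/\bid=(\d+)/', $query, $m) === 1 ? $m[1] : null;
}

var_dump(capture_id('/site/?option=com_content&view=article&id=307&catid=89&Itemid=55'));
var_dump(capture_id('/site/?catid=89')); // \b stops "catid" from counting as "id"
```

The word boundary matters here: without it, a parameter such as catid or Itemid could be picked up instead of id, depending on the parameter order.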
The query string must be processed separately in a RewriteCond if you need to manipulate it, and should not be matched inside the RewriteRule. Instead, just match the request path, not including the query string, and use QSA to append the query string onto the redirect:
RewriteRule ^view-by-tag/([A-Za-z-]+)/?$ /tags/$1/ [R=301,L,QSA]
# OR, if you don't want the rest of the query string appended, put a `?` onto
# the redirect to replace it with nothing
RewriteRule ^view-by-tag/([A-Za-z-]+)/?$ /tags/$1/? [R=301,L]
Actually, the QSA is not needed in an R redirect here: by default the original query string is passed through unchanged whenever the substitution does not contain a query string of its own.
If you need to capture 307 from the query string, do it in a RewriteCond and capture in %1:
# Capture the id in %1
RewriteCond %{QUERY_STRING} \bid=(\d+)
# Redirect everything to /, pass %1 into p
RewriteRule . /?p=%1 [R=301,L]