How to make a webpage act like a subdirectory with .htaccess

I'm building a website which has a profile page for each user.
I'd like the profile pages to be accessed through visiting a web address, eg:
example.com/profile/username
Obviously it would be impractical to create a new PHP file for each registered user, and so what I'm trying to do is redirect traffic from example.com/profile/(whatever) to example.com/profile, where I can then interrogate the URL and load the correct profile from a database.
However, I'm really struggling to work out how to do this with .htaccess. I've tried all sorts of redirects, etc, but can't seem to get it to work. Can anyone point me in the right direction?
The only other thing I've got in my .htaccess file at the moment is the below to remove .php from URLs:
# Remove .php
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^\.]+)$ $1.php [NC,L]
Any advice gratefully appreciated!

redirect traffic from example.com/profile/(whatever) to example.com/profile
Note that you should internally rewrite directly to the file that handles the request, which I assume is /profile.php, rather than relying on another rewrite to later append the extension. I assume that code in /profile.php already processes the request as necessary.
I will also assume that "username" can consist only of the characters a-z, A-Z, 0-9, _ (underscore) and - (hyphen). Generally, you want to be as specific as possible.
Try something like the following:
RewriteEngine On
# Disable MultiViews to avoid conflicts with mod_negotiation
# (Since you are using "extensionless URLs")
Options -MultiViews
# Handle requests for "/profile/<username>"
RewriteRule ^profile/([\w-]+)$ profile.php [L]
# Append the ".php" extension to any request that does not end in a file extension
RewriteRule ^([^.]+)$ $1.php [L]
Whilst the comment in your code states "Remove .php", the code actually appends the file extension to URLs that don't have one.
This currently only works for files in the document root (as per your original code), eg. /foo. It won't do anything with /foo/bar for instance.
The condition that checks against the file system (ie. RewriteCond %{REQUEST_FILENAME} !-f) was not required in your existing code, unless you have actual files that don't contain a file extension? I'm also assuming you do not need to access filesystem directories directly (as per your original code).
There is no need to escape literal dots when used inside a character class.
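If profile.php would rather receive the username as a query parameter than parse it out of the requested URL, the first rule can capture it and pass it along. This is only a sketch: the parameter name "user" is a hypothetical choice, not something taken from the question, and it assumes the .htaccess file sits in the document root next to profile.php.

```apache
RewriteEngine On
Options -MultiViews

# /profile/<username> -> /profile.php?user=<username>
# "user" is an assumed parameter name; QSA keeps any query string already on the request
RewriteRule ^profile/([\w-]+)$ profile.php?user=$1 [QSA,L]

# Append ".php" to any other request that does not contain a dot
RewriteRule ^([^.]+)$ $1.php [L]
```

With this in place, a request for /profile/alice should reach profile.php with user=alice available in $_GET.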

Related

mod_rewrite rule to redirect specific URL matches

I am using mod_rewrite to redirect long URLs to specific pages. It's for a shop, so basically if the URL is one folder deep it takes the user to a specific page, if the URL is two folders deep it takes them to another, etc. I achieved this using the following...
RewriteRule ^([^/\.]+)/?$ shop.php?category=$1 [L]
RewriteRule ^([^/\.]+)/([^/\.]+)/?$ brand.php?category=$1&brand=$2 [L]
RewriteRule ^([^/\.]+)/([^/\.]+)/([^/\.]+)/?$ handler.php?category=$1&brand=$2&product=$3 [L]
RewriteRule ^([^/\.]+)/([^/\.]+)/([^/\.]+)/([^/\.]+)/?$ handler.php?category=$1&brand=$2&product=$3&car=$4 [L]
Notice each rule is one folder deeper than the previous.
Rather than use another rule for the next stage which is the final product page I would rather take the user to http://www.domain.com/PRODUCT/DB/ID so I wrote a rule to check if the first folder was PRODUCT and if so take the user to PRODUCT.PHP?DB=$1&ID=$2...
RewriteRule ^product/([^/\.]+)/([^/\.]+)/?$ /product.php?db=$1&id=$2 [L]
It keeps returning a 404 error though. I placed this new rule just before the others in the hope it would execute this rule before any others (this appears to be 3 folders deep for which there is another rule when the first folder isn't PRODUCT)
The .htaccess and subsequent .php files are at the root level of the site.
Have I written the rule correctly? I have tried all sorts and looked everywhere, but questions like this are generally related to ignoring a specific folder, which I don't actually want to do.
Thanks
Try turning off Multiviews:
Options -Multiviews
MultiViews is part of mod_negotiation, which tries to match a request to existing files, including ones that are missing the extension. So it's possible that mod_negotiation sees the request for /product/something, sees the file /product.php, and serves up that file using PATH_INFO, which will completely bypass mod_rewrite.
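Putting this together with the rule from the question, a minimal sketch might look like the following. It assumes product.php and the .htaccess file are both in the document root, as the question states, and that the product rule stays above the generic depth-based rules.

```apache
# Stop mod_negotiation from mapping /product/... onto product.php by itself
Options -MultiViews

RewriteEngine On

# Must come before the generic one-, two-, three- and four-segment rules
RewriteRule ^product/([^/.]+)/([^/.]+)/?$ product.php?db=$1&id=$2 [L]
```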

Trying to stop urls such as mydomain.com/index.php/garbage-after-slash

I know very little about .htaccess files and mod-rewrite rules. Looking at my statcounter information today, I noticed that a visitor to my site entered a url as follows:
http://mywebsite.com/index.php/contact-us
Since there is no such folder or file on the website and no broken links on the site, I'm assuming this was a penetration attempt. What was displayed to the visitor was the output of the index.php file, but without benefit of the associated CSS layout.
I need to create a rewrite rule that will either remove the information after index.php (or any .php file), or perhaps more appropriately, insert a question mark (after the .php filename), so that any following garbage will be treated like a parameter (and will be gracefully ignored if no parameters are required).
Thank you for any assistance.
If you're only expecting real directories and real files that do exist, then you can add this to an .htaccess file. What it does is it takes a non-existent file or directory request and gives the user the index.php page with the original request as a query string. [QSA] appends any existing query string.
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.*) index.php?$1 [PT,QSA]
I found a solution, using information provided by AbsoluteZero as well as other threads that popped up on the right side of the screen as the solution came closer.
Here's the code that worked for me...
Options -Multiviews -Indexes +FollowSymLinks
RewriteEngine On
RewriteBase /
DirectorySlash Off
# remove trailing slash
RewriteRule ^(.*)\/(\?.*)?$ $1$2 [R=301,L]
# translate PATH_INFO information into a parameter
RewriteRule ^(.*)\.php(\/)(.*) $1.php?$3 [R=301,L]
# rewrite /dir/file?query to /dir/file.php?query
RewriteRule ^([\w\/-]+)(\?.*)?$ $1.php$2 [L,T=application/x-httpd-php]
I got the removal of trailing slash from another post on StackOverflow. However, even after removing the trailing slash, the rewrite rule did not account for someone appending what looks to be valid information after the .php file
(For example: mysite.com/index.php/somethingelse)
My goal was to either remove the "/somethingelse", or render it harmless. The PATH_INFO rule locates a "/" after the .php file, and turns everything else from that point forward into a query string (which will usually be ignored by the PHP file).
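A different approach, not used in the thread above, is to have Apache refuse trailing path information outright instead of rewriting it. The core AcceptPathInfo directive controls this. The sketch below assumes the host allows the directive in .htaccess (it needs AllowOverride FileInfo) and that no script on the site relies on PATH_INFO; behaviour can also differ depending on how PHP is wired into the server.

```apache
# Requests such as /index.php/contact-us should then return 404 Not Found
# instead of running index.php with PATH_INFO set
AcceptPathInfo Off
```

This rejects the request instead of redirecting it, which may or may not be preferable to the 301 approach.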

Make files not accessible via URL

How can I use an .htaccess file so that only index.html can be accessed via URL?
I already got this:
RewriteEngine on
RewriteBase /
Options +FollowSymlinks
RewriteRule ^/?login/?$ /php/login.php [NC,R=301,L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.html [L,QSA]
It works fine if somebody types in for example "www.mypage.com/skd/lasnd"
but if somebody types in a file which exists on the webserver, e.g. "www.mypage.com/php/login.php", they will be taken to that page. How can I forbid that?
To be more exact: my JavaScript & PHP scripts should still be allowed to access every file on my webserver.
These lines:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
are conditions meaning "if REQUEST_FILENAME is not a file, and REQUEST_FILENAME is not a directory"; only if both are met does the RewriteRule take effect. This is usually done to allow "friendly URLs" to work without also rewriting requests for images, CSS, etc. You can block access to files in many ways, but you have to take care not to block too much (such as said images). The simplest approach would be to put your files in a subdirectory and add another .htaccess file in that directory with the line
Deny From All
This will make httpd deny any request to whatever is in that directory and its subdirectories (unless another .htaccess overrides these rules), while your server-side scripts will still be able to access the files without a problem.
I strongly recommend reading the mod_rewrite docs.
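Note that "Deny From All" is the Apache 2.2 access-control syntax. On Apache 2.4 the equivalent is "Require all denied" (the old form only works there when mod_access_compat is loaded). A sketch of an .htaccess that covers both versions, assuming it is placed in the subdirectory you want to protect:

```apache
# Apache 2.4 and later (mod_authz_core)
<IfModule mod_authz_core.c>
    Require all denied
</IfModule>

# Apache 2.2 and earlier
<IfModule !mod_authz_core.c>
    Order allow,deny
    Deny from all
</IfModule>
```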
EDIT
There's no "my javascript" and "their javascript". There's a request, and that's all you can tell for sure. You cannot tell which requests are yours and which are not. "i only want to deny request via typing in the browser address line" - you can't tell that either. You theoretically could check the REFERER, and if there's none set then assume it's a direct hit, but the REFERER comes from the browser, so it can be faked as well. And I personally block all REFERERs by default, so all my requests are without any REFERER, even those that are not direct. You could try cookies, but again, these can be grabbed by a script and sent back too. The only real option is to Deny from all to these files and "tunnel" them through some sort of script (e.g. PHP) that would do e.g. file() on the target file only if the user has previously authenticated with a login and password. Any other attempts are broken from the start.
try the following
RewriteRule /.* http://www.new-domain.com/index.html

Redirect to fallback file if first attempt fails

I have this in my .htaccess:
RewriteRule ^images/([^/\.]+)/(.+)$ themes/current/images/$1/$2 [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^images/([^/\.]+)/(.+)$ modules/$1/images/$2 [L,NC]
The idea is that it does the following:
// Rewrite this...
images/calendar/gear.png
// ... to this
themes/current/images/calendar/gear.png
// HOWEVER, if that rewritten path doesn't exist, rewrite the original URL to this:
modules/calendar/images/gear.png
The only things that change here are calendar and gear.png, the first of which could be any other single word and the latter the file name (possibly with path) to an image file.
I can rewrite the original URL to the first rewrite as shown in the example just fine, but what I cannot do is get my .htaccess to serve up the file from the other, fallback location if the first location 404s. I was under the impression that not using [L] in my first RewriteRule would rewrite the URL for RewriteCond.
The problem I'm having is that instead of serving the fallback file, the browser just shows a 404 for the first rewritten path (themes/current/images/calendar/gear.png), instead of falling back to modules/calendar/images/gear.png. What am I doing wrong?
Please note that my regex isn't perfect, but I can refine that later. Right now I'm concerning myself with the rewrite logic itself.
Fallthrough rules are fraught with bugs. My general recommendation is that any rule with a replacement string other than - should trigger an internal redirect to restart the .htaccess parse. This avoids the subrequest and URI_PATH bugs.
Next once you go to 404, again in my experience this is unrecoverable. I have a fragment which does something similar to what you are trying to do:
# For HTML cacheable blog URIs (a GET to a specific list, with no query params,
# guest user and the HTML cache file exists) then use it instead of executing PHP
RewriteCond %{HTTP_COOKIE} !blog_user
RewriteCond %{REQUEST_METHOD}%{QUERY_STRING} =GET [NC]
RewriteCond %{ENV:DOCUMENT_ROOT_REAL}/blog/html_cache/$1.html -f
RewriteRule ^(article-\d+|index|sitemap.xml|search-\w+|rss-[0-9a-z]*)$ \
blog/html_cache/$1.html [L,E=END:1]
Note that I do the conditional test in filesystem space and not URI (Location) space. So this would map in your case to
RewriteCond %{DOCUMENT_ROOT}/themes/current/images/$1/$2 -f
RewriteRule ^images/(.+?)/(.+)$ themes/current/images/$1/$2 [L]
Though do run phpinfo() to check whether your hosting provider uses an alternative to DOCUMENT_ROOT if it is a shared hosting offering, e.g. an alternative environment variable; mine uses DOCUMENT_ROOT_REAL.
The second rule will be picked up on the second processing pass after the internal redirect.
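Combined with the fallback rule from the question, the whole thing might look like the sketch below. It assumes the .htaccess file lives in the document root and that DOCUMENT_ROOT points at the real filesystem path on your host; swap in the alternative variable if it does not.

```apache
RewriteEngine On

# Use the themed image only when that file actually exists on disk.
# $1 and $2 refer to the groups captured by the RewriteRule pattern below.
RewriteCond %{DOCUMENT_ROOT}/themes/current/images/$1/$2 -f
RewriteRule ^images/([^/.]+)/(.+)$ themes/current/images/$1/$2 [L,NC]

# Otherwise fall back to the module's own copy
RewriteRule ^images/([^/.]+)/(.+)$ modules/$1/images/$2 [L,NC]
```

Because the existence test happens before the first rewrite, a missing themed image never reaches the 404 stage; the request simply falls through to the second rule.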

.htaccess, proper rewriting of directory and file with same name

As of now my website has a few static pages, one of which is /portfolio. Among other things, my htaccess hides the .html extension. I'd like to add a portfolio directory, but I do not want to move my existing portfolio page into the portfolio directory as the default index file. My /portfolio page is one of my Google sitelinks and I am afraid that if it is moved or if the URL changes in some way, Google will consider it to be a brand new page.
My problem is once I add the /portfolio/ directory, whenever I try to visit the original /portfolio page, a trailing slash is automatically added and it links to the directory itself.
I've tried countless options, one being a rewrite of /portfolio/ to /portfolio, however this creates an infinite loop. I also tried "DirectorySlash Off" but that only removed the trailing slash while being inside the directory, it didn't revert access to the original /portfolio page.
Ultimately, I would like to keep my /portfolio page as-is, linking to pages inside the directory like so /portfolio/example and if either /portfolio or /portfolio/ is accessed it will result in showing the same page which is outside of the directory without Google thinking it is duplicate content.
A similar question exists here:
.htaccess rewriting url to page or directory though this still resulted in an infinite loop for me for some reason; I'm guessing it has something to do with the hidden extensions.
Here's my htaccess-
RewriteEngine On
# HTML to PHP
RemoveHandler .html .htm
AddType application/x-httpd-php .htm .html
# Hide extension
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.html -f
RewriteRule ^(.*)$ $1.html
# Force WWW
RewriteCond %{HTTP_HOST} ^mydomain\.net
RewriteRule ^(.*)$ http://www.mydomain.net/$1 [R=301,L]
# Blog Subdomain
RewriteCond %{HTTP_HOST} ^blog.mydomain.net$
RewriteRule ^(.*)$ http://www.mydomain.net/blog/$1 [R=301,L]
I know it's not a great idea having a directory with the same name as a static page, but I really would rather not alter the existing page and lose the Google sitelink, so a clean and proper way to handle this would be a help.
There are two things going "wrong" here, and two ways to fix it.
The first is that apache "figures out" that there is a directory by the name of "portfolio" before the rewrite conditions are applied. That means that the rewrite conditions are receiving "portfolio/" instead of "portfolio".
Second, the "!-d" rule is specifically avoiding the rewrite that you want to make if there is in fact a directory by that name.
Solution 1: Manually re-route requests for the portfolio directory to remove the slash.
# Manually re-route portfolio/ requests to portfolio
RewriteCond %{REQUEST_FILENAME} portfolio/$
RewriteRule ^(.*)/$ $1
# Hide extension
RewriteCond %{REQUEST_FILENAME}\.html -f
RewriteRule ^(.*)$ $1.html
Note the removal of the "!-d" condition.
The downside to this is that you have to hard-code the "portfolio" edge case directly into the rewrite rules, and it will still result in the browser first being redirected to portfolio/.
Solution 2: Set DirectorySlash Off and remove directory exists test
# Disable Automatic Directory detection
DirectorySlash Off
# Hide extension
RewriteCond %{REQUEST_FILENAME}\.html -f
RewriteRule ^(.*)$ $1.html
Setting DirectorySlash Off would fix this issue the best, but may break other parts of your site where you actually want the auto DirectorySlash. Best of Luck, and I hope this helps.
Note when testing solution 2, your browser may remember the redirect of "portfolio" to "portfolio/" and perform the redirect before it even sends the request to the server. Be sure to test in a cache-clear, clean environment for best results.
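To keep /portfolio/ and /portfolio from both serving the page, Solution 2 could be combined with an explicit redirect. This is an untested sketch that assumes portfolio.html sits next to the portfolio directory in the document root; the redirect rule is an addition and not part of the solutions above.

```apache
# Stop mod_dir from redirecting /portfolio to /portfolio/
DirectorySlash Off

RewriteEngine On

# Send /portfolio/ to the canonical /portfolio
# (browsers cache 301s, so test in a clean session)
RewriteRule ^portfolio/$ /portfolio [R=301,L]

# Hide extension: serve <name>.html when that file exists
RewriteCond %{REQUEST_FILENAME}\.html -f
RewriteRule ^(.*)$ $1.html [L]
```

Pages inside the directory, such as /portfolio/example, would still be handled by the extension rule as long as portfolio/example.html exists.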
