htaccess RewriteRules not working on subdirectories - .htaccess

I have a static HTML site that I'm trying to add some URL rewriting rules to via my .htaccess file.
The site file directory looks like this:
- index.htm
- about-us [subdirectory]
    - index.htm
- careers [subdirectory]
    - index.htm
- contact [subdirectory]
    - index.htm
    - map.htm
    - privacy.htm
- projects [subdirectory]
    - index.htm
    - education.htm
    - healthcare.htm
    - recreation.htm
    - residential.htm
    - hospitality.htm
- services [subdirectory]
    - index.htm
My goal is to remove the file extensions from the page URLs, append a trailing slash, and force a 301 redirect so that anyone requesting a file in its original format (e.g. https://example.com/projects/education.htm) is automatically redirected to its cleaner format (e.g. https://example.com/projects/education/).
I already have the index.htm files rewritten/redirected in the .htaccess file on the root of my site. Here's what I have so far:
DirectoryIndex index.php index.html index.htm
RewriteEngine On
RewriteCond %{REQUEST_URI} (.*)/$
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.htm -f
RewriteRule ^(.+)/$ /$1/ [R=301,L]
RewriteRule ^index\.htm$ / [R=301,L]
RewriteRule ^(.*)/index\.htm$ /$1/ [R=301,L]
I have successfully removed the directory index files (index.htm) from the URLs, but for the life of me, I can't get the non-index files (e.g. education.htm) rewritten in the desired format.
Can anyone shed some light on what I'm doing wrong?
UPDATE
As per @misorude's comment, I removed the 1st rewrite rule. I also changed the 301s to 302s (at least temporarily, to avoid redirect caching issues), and I modified my existing rules to use the "+" quantifier in my regex so that a real file or folder name is required.
Finally, I added a new 3rd rule. This one is an attempt to find all subdirectory "non-index" pages and remove their file extensions. The redirect itself appears to work, but I'm getting 404 errors on those non-index pages.
All of my previously working index.htm rewrite rules still work fine; only the non-index ones throw 404 errors. Here's my updated .htaccess file:
DirectoryIndex index.php index.html index.htm
RewriteEngine On
RewriteCond %{REQUEST_URI} (.*)/$
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.htm -f
# If we're on the root index page of the site,
# remove index.htm from URL
RewriteRule ^index\.htm$ / [R=302,L]
# If we're on a subdirectory index page, remove the index.htm from the URL
RewriteRule ^(.+)/index\.htm$ /$1/ [R=302,L]
# If we're on a non-index page of a subdirectory,
# remove the extension from the URL
RewriteRule ^(.+)/(.+)\.htm$ /$1/$2/ [R=302,L]
Any advice?

Here is a complete solution that disables the MultiViews option and handles each case separately:
DirectoryIndex index.php index.html index.htm
Options -MultiViews
RewriteEngine On
# redirect "/index.htm" or "/xxx/index.htm" to "/" or "/xxx/"
RewriteCond %{REQUEST_FILENAME} -f
RewriteCond %{THE_REQUEST} \s/([^/]+/)?index\.htm\s [NC]
RewriteRule ^ /%1 [R=302,L]
# redirect "/xxx/page.htm" to "/xxx/page/"
RewriteCond %{REQUEST_FILENAME} -f
RewriteCond %{THE_REQUEST} \s/([^/]+)/([^/]+)\.htm\s [NC]
RewriteRule ^ /%1/%2/ [R=302,L]
# rewrite back "/xxx/page/" to "/xxx/page.htm"
RewriteCond %{DOCUMENT_ROOT}/$1/$2\.htm -f
RewriteRule ^([^/]+)/([^/]+)/$ /$1/$2.htm [L]
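The last two rules above match exactly one directory level ("/xxx/page.htm"). As a hedged sketch only, not part of the original answer, they could be generalised to any depth as shown below. This assumes the index.htm redirect stays first in the file and that the same file layout is used; note that an index.htm nested deeper than one level would still need its own handling.

```apache
# Assumption: these two rules replace the last two rules above;
# the "index.htm" redirect must still come before them.

# Redirect a direct request for "/any/depth/page.htm" to "/any/depth/page/"
RewriteCond %{REQUEST_FILENAME} -f
RewriteCond %{THE_REQUEST} \s/([^\s?]+)\.htm[\s?] [NC]
RewriteRule ^ /%1/ [R=302,L]

# Internally rewrite "/any/depth/page/" back to "/any/depth/page.htm",
# but only when that file actually exists
RewriteCond %{DOCUMENT_ROOT}/$1.htm -f
RewriteRule ^(.+)/$ /$1.htm [L]
```

Because the redirect tests THE_REQUEST (the original request line) rather than the rewritten URL, the internal rewrite back to the .htm file does not trigger another redirect.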

Related

How to remove part of a url using htaccess

I have looked at a good few similar questions on Stack Overflow, but what I have tried doesn't seem to work.
'quotes' is a subfolder with a wordpress installation in it. So, the main root has a wordpress site in it and the 'quotes' subfolder also has a wordpress installation in it.
I have a path something like
https://example.com/quotes/uk/travel-packages
https://example.com/quotes/us/travel-packages
But I don't want 'quotes' to be in the url, it should just be
https://example.com/uk/travel-packages
I currently have this
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /example/
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /example/index.php [L]
RewriteRule ^quotes/(.*)$ /$1 [L,NC,R]
RewriteRule ^uk/(.*) /quotes/$1 [L]
RewriteRule ^us/(.*) /quotes/$1 [L]
</IfModule>
EDIT: I was trying this on my localhost hence the rewrite base being /example/ so I understand the confusion there now.
Here is the live server .htaccess file for both the root directory and the quotes subfolder in the root directory.
ROOT:
# BEGIN WordPress
# The directives (lines) between "BEGIN WordPress" and "END WordPress" are
# dynamically generated, and should only be modified via WordPress filters.
# Any changes to the directives between these markers will be overwritten.
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
QUOTES SUBDIRECTORY
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /quotes/
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /quotes/index.php [L]
</IfModule>
# END WordPress
If /quotes is a subdirectory that contains the WordPress installation and you have already removed /quotes from the URL in WordPress itself then you need to configure two .htaccess files:
One in the document root that internally rewrites (unconditionally) all requests to the /quotes subdirectory.
And another .htaccess file in the /quotes subdirectory that routes the request to WordPress - containing the "standard" WordPress front-controller.
In the (root) /.htaccess file:
RewriteEngine On
# Internally rewrite all requests to WordPress subdirectory
RewriteRule ^ quotes%{REQUEST_URI} [L]
In the /quotes/.htaccess file:
RewriteEngine On
# OPTIONAL...
# Redirect any direct requests for "/quotes/<anything>" back to root
# NB: WP itself *must* already be correctly configured to omit "/quotes" from the URL
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule (.*) /$1 [R=301,L]
# WordPress front-controller
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . index.php [L]
The check against the REDIRECT_STATUS environment variable is to ensure only direct requests are redirected and not rewritten requests by the WP front-controller - which would otherwise cause a redirect loop.
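To illustrate that check in isolation, here is a minimal sketch for a single root .htaccess file. The "old-prefix" directory name is hypothetical and used for illustration only.

```apache
RewriteEngine On

# REDIRECT_STATUS is empty on the initial request from the client and is
# set (e.g. to "200") once the request has been internally rewritten.

# Direct request for "/old-prefix/<anything>": redirect externally to "/<anything>"
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^old-prefix/(.*)$ /$1 [R=302,L]

# Any other initial request: internally rewrite into the subdirectory.
# On the next pass REDIRECT_STATUS is no longer empty, so neither rule fires again.
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^(.*)$ old-prefix/$1 [L]
```

Without the condition on the first rule, the internally rewritten URL would itself match "old-prefix/" and be redirected back, producing a loop.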
Alternatively, you have just a single .htaccess file in the document root (and remove the .htaccess file in the /quotes - WordPress - subdirectory). For example:
RewriteEngine On
RewriteBase /quotes
# OPTIONAL...
# Redirect any direct requests for "/quotes/<anything>" back to root
# NB: WP itself *must* already be correctly configured to omit "/quotes" from the URL
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^quotes/(.*) /$1 [R=301,L]
# WordPress front-controller
RewriteRule ^quotes/index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . index.php [L]
It seems from your question update that you also have a 2nd WordPress installation in the document root. It's not possible to have two separate WordPress installations and make them appear as a single WP site in the document root (i.e. without specifying the subdirectory as part of the URL) unless there is some discernible difference in the URL structure between the two sites. Otherwise, there is no way to know which URLs should be routed to the WP installation in the subdirectory and which to the site in the document root.
"Any URL with /us or /uk, or any country extension in future, should go to /quotes; otherwise it should go to the root, or whatever page is being called from the root site. For instance, https://example.com/contact-us should go to the root contact-us page."
Yes, this is possible. For "any" country extension I assume a string of 2 lowercase letters a-z. This does mean, however, that the WP site in the document root cannot have any URLs that start with a 2-letter path segment.
Ordinarily, you would modify the existing directives (as above). However, since this is WordPress, the directives in the # BEGIN WordPress code block should not be modified (unless you prevent WP from modifying this). Instead, we will just add some directives before the WP front-controller.
In the (root) /.htaccess file:
# Rewrite any URLs that start with a two-letter country code prefix to the subdirectory
RewriteRule ^[a-z]{2}/ quotes%{REQUEST_URI} [L]
# BEGIN WordPress
# : (Remainder of existing .htaccess file goes here)
In the /quotes/.htaccess file:
# OPTIONAL...
# Redirect any direct requests for "/quotes/<anything>" back to root
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule (.*) /$1 [R=301,L]
# BEGIN WordPress
# : (Remainder of existing .htaccess file goes here)
This assumes you will be accessing your static resources directly, e.g. using the /quotes subdirectory in the URLs of resources that belong to this site, so the /quotes subdirectory is not entirely hidden.
See my answer to part two of this question with regards to missing static resources (images, CSS and JS) for the second WordPress installation in the subdirectory that could result from implementing the above rewrites/redirects.
wordpress site missing images after htaccess change
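As a variation on the two-letter prefix rule above: if reserving every two-letter path segment is too broad for the root site, the pattern could list only the country codes actually in use. This is a sketch that assumes just the "uk" and "us" prefixes mentioned in the question; new countries would have to be added to the alternation by hand.

```apache
RewriteEngine On

# Assumption: only the "uk" and "us" prefixes from the question are in use.
# Internally rewrite just those prefixes to the /quotes subdirectory
RewriteRule ^(?:uk|us)/ quotes%{REQUEST_URI} [L]

# BEGIN WordPress
# : (Remainder of existing root .htaccess file goes here)
```

The trade-off is maintenance: the generic [a-z]{2} pattern needs no edits when a country is added, while the explicit list leaves all other two-letter URLs available to the root site.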

htaccess rewrite url to hide variable

My knowledge of htaccess is pretty limited, so please be gentle...
I have a php page that displays an article, based on a variable passed through the url. For each article, I want the url to be changed to the article title, but I can't lose the article id being passed to the page...
So, for ...
www.somesite.com/news/article?id=4
... I can change it to ...
www.somesite.com/news/here-is-the-article-title
Or for
www.somesite.com/news/article?id=7
... I can change it to ...
www.somesite.com/news/some-other-title-that-I-choose
Basically, I want the content for "article?id=4" to be shown on the page, but the URL to read "here-is-the-article-title". Is this even possible using htaccess? Or is there a similar solution? I know that I'll probably have to rewrite EACH URL, and I'm fine with that.
I have some standard htaccess code I use...
Options +FollowSymLinks -MultiViews
# enable the rewrite engine
RewriteEngine On
# Set your root directory
RewriteBase /
# remove the .php extension
RewriteCond %{THE_REQUEST} ^GET\ (.*)\.php\ HTTP
RewriteRule (.*)\.php$ $1 [R=301]
# remove index and reference the directory
RewriteRule (.*)/index$ $1/ [R=301]
# remove trailing slash if not a directory
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} /$
RewriteRule (.*)/ $1 [R=301]
# forward request to html file, **but don't redirect (bot friendly)**
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteCond %{REQUEST_URI} !/$
RewriteRule (.*) $1\.php [L]
# Disable Directory Browsing
Options -Indexes
# Protect htaccess File
<files ~ "^.*\.([Hh][Tt][Aa])">
order allow,deny
deny from all
satisfy all
</files>
Basic REDIRECT doesn't work, since I lose the article id.
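For illustration, the per-URL mapping described above is typically expressed as an internal rewrite rather than a redirect, so the address bar keeps the title while PHP still receives the id. The following is only a sketch: it assumes the article script is /news/article.php (the question shows it as /news/article) and that these lines are placed directly after the RewriteBase directive, before the other rules.

```apache
# Assumption: the article script lives at /news/article.php.
# One internal rewrite per article; no R flag, so the visible URL does not change.
RewriteRule ^news/here-is-the-article-title$ news/article.php?id=4 [L]
RewriteRule ^news/some-other-title-that-I-choose$ news/article.php?id=7 [L]
```

The existing rule that strips ".php" tests THE_REQUEST, i.e. the original request line, so it should not redirect these internally rewritten requests.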

Site accessible with multiple URLs

I'm using the micro framework Silex on my website hosted on a VPS.
So, the site files are in the /site_name/public_html/ folder but, with Silex, the site must point to the /site_name/public_html/web/ folder.
In the public_html directory, I have the following .htaccess file :
Options -Indexes -MultiViews
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
# Redirect to https & www
RewriteCond %{HTTPS} off [OR]
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^ https://www.example.com%{REQUEST_URI} [R=301,L,NE]
# Redirect incoming URLs to web folder
RewriteCond %{REQUEST_URI} !web/
RewriteRule (.*) /web/$1 [L]
</IfModule>
And, in the /public_html/web/ folder, the following .htaccess :
<IfModule mod_rewrite.c>
# Redirect incoming URLs to index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^ index.php [QSA,L]
</IfModule>
Now, everything works fine but my pages are accessible with three different patterns :
example.com/page/ (the one I want to keep)
example.com/web/page/
example.com/web/index.php/page/
I have used the meta canonical to avoid duplicate content but I still want these last two options to not exist.
I guess I have something to change in both .htaccess files but I can't find what it is.
I would actually remove the .htaccess file in the /web subdirectory altogether and rewrite directly to /web/index.php in the root .htaccess file. By having two .htaccess files you are seemingly creating extra work. The mod_rewrite directives in the subdirectory will completely override the parent directives (by default), so your canonical HTTPS and www redirects are also being overridden.
(Presumably you had a RewriteEngine On directive in the /web/.htaccess file?)
Having removed the /web/.htaccess file, try something like the following in your root .htaccess file:
Options -Indexes -MultiViews
RewriteEngine On
RewriteBase /web
# Redirect to https & www
RewriteCond %{HTTPS} off [OR]
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^ https://www.example.com%{REQUEST_URI} [R=302,L,NE]
# If /web or /index.php is present at the start of the requested URL then remove it (via redirect)
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^(?:web|index\.php)/(.*) /$1 [R=302,L]
# Front-controller...
# Internally rewrite all requests to /web/index.php (uses RewriteBase set above)
RewriteRule index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^ index.php [L]
The check against the REDIRECT_STATUS environment variable ensures we only test initial requests and not requests that have been later rewritten.
The <IfModule> wrapper is not required, unless your site is intended to work without mod_rewrite.
Note that a request like /web/index.php/page/ would result in two redirects. First to /index.php/page then to /page. Since this is an edge case I would consider a double redirect to be acceptable.
UPDATE: I've removed the "directory" check in the above, as this would have prevented the document root (example.com/) from being rewritten to the /web subdirectory. That would have resulted in a 403 if you didn't have a directory index document (e.g. index.php) in the document root of your site. (However, requests for example.com/page/ should still have worked OK.)
Test with 302 (temporary) redirects and only change to 301 (permanent) when you are sure it's working OK - to avoid any caching issues in the browser. Be sure to clear the browser cache before testing.

Hidden redirect of / to subfolder using .htaccess in TYPO3

We're setting up a TYPO3 installation, and if the user calls example.com/ we'd like the server to redirect to /typo/index.php?id=106.
This should happen without a change in the address bar. Every other file access on the server (for example example.com/test.png) should be redirected to the subfolder (example.com/typo/test.png).
This is the .htaccess file in the root directory. As I understand, it will redirect everything which doesn't have /typo in the URL to the subfolder and attach the parameters:
RewriteCond %{REQUEST_URI} !^/typo/
RewriteRule ^(.*)$ typo/$1 [L]
Now, this already seems to work, when I call example.com/index.php?id=106 I'm not getting a 404. Unfortunately TYPO3 seems to have some trouble (or the .htaccess configuration isn't correct), because we get a message saying "No input file specified".
What's also missing is the initial redirect when no path is specified. It should then go to /typo/index.php?id=106.
You may try this in one .htaccess file in root directory:
Options +FollowSymlinks -MultiViews
RewriteEngine On
RewriteBase /
# URL with no path
RewriteCond %{REQUEST_URI} ^/?$ [NC]
RewriteCond %{REQUEST_URI} !index\.php [NC]
RewriteRule .* /typo/index.php?id=106 [NC,L]
# URL with path
RewriteCond %{REQUEST_URI} !^/typo [NC]
RewriteRule ^(.+) /typo/$1 [NC,L]
Maps silently:
http://domain.com/ to
http://domain.com/typo/index.php?id=106
and
http://domain.com/anything to
http://domain.com/typo/anything
For a permanent redirect instead (which changes the URL shown in the address bar), replace [NC,L] with [R=301,NC,L].

.htaccess rewrite file uri to either file-name.html or file-name.php - whichever exists

OBJECTIVE: To cause the browser to rewrite to file-name.php, if it exists; else return file-name.html - whether the visitor has typed the url as any one of the following:
http://mydomain.com/file-name
http://mydomain.com/file-name.html
http://mydomain.com/file-name.php
Had good success with the following rules in my .htaccess file at root:
# REWRITE FILE URI TO file.php IF EXISTS
Options Indexes +FollowSymLinks +MultiViews
Options +ExecCGI
RewriteEngine on
RewriteBase /
# parse out basename, but remember the fact
RewriteRule ^(.*).html$ $1 [C,E=WasHTML:yes]
# rewrite to document.php if exists
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^(.*)$ $1.php [S=1]
# else reverse the previous basename cutout
RewriteCond %{ENV:WasHTML} ^yes$
RewriteRule ^(.*)$ $1.html
However, I have since installed WP at root, alongside pre-existing website, and these rules are no longer working.
WHAT DOES WORK: file-name is rewritten to either file-name.html or file-name.php - whichever file exists.
WHAT DOES NOT WORK: file-name.html is not rewritten to file-name.php even when there is no file-name.html and file-name.php is there. Also, file-name.php is not rewritten to file-name.html when there is no file-name.php but there is file-name.html.
The entire .htaccess as it is now:
# BEGIN WP MULTISITE RULES
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
# uploaded files
RewriteRule ^([_0-9a-zA-Z-]+/)?files/(.+) wp-includes/ms-files.php?file=$2 [L]
# add a trailing slash to /wp-admin
RewriteRule ^([_0-9a-zA-Z-]+/)?wp-admin$ $1wp-admin/ [R=301,L]
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^ - [L]
RewriteRule ^[_0-9a-zA-Z-]+/(wp-(content|admin|includes).*) $1 [L]
RewriteRule ^[_0-9a-zA-Z-]+/(.*\.php)$ $1 [L]
RewriteRule . index.php [L]
# END WP MULTISITE RULES
# REWRITE FILE URI TO file.php IF EXISTS
Options Indexes +FollowSymLinks +MultiViews
Options +ExecCGI
RewriteEngine on
RewriteBase /
# parse out basename, but remember the fact
RewriteRule ^(.*).html$ $1 [C,E=WasHTML:yes]
# rewrite to document.php if exists
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^(.*)$ $1.php [S=1]
# else reverse the previous basename cutout
RewriteCond %{ENV:WasHTML} ^yes$
RewriteRule ^(.*)$ $1.html
Any advice?
A quick look suggests that your original rules will most likely never be reached, because the WP rules intercept all requests first.
The line RewriteRule ^ - [L], together with its conditions, stops rewriting for any file or folder that already exists, while the line RewriteRule . index.php [L] rewrites all remaining requests to index.php.
If you move your rules above the WordPress ones, they should work again.
To rewrite a request for a non-existent .php file to an existing .html file, use this rule:
# rewrite non-existing .php file to existing .html
RewriteCond %{DOCUMENT_ROOT}/$1.php !-f
RewriteCond %{DOCUMENT_ROOT}/$1.html -f
RewriteRule ^(.+)\.php$ $1.html [L,PT]
Place it below your rules (but above the WP ones). The rule checks that the .php file does not exist, and the rewrite only occurs if the .html file is present. If neither file exists, nothing happens.
Keep in mind that, because of these checks and the fact that the rule sits at the top of the rewrite chain, it will be evaluated for every request for a .php file (even WP pages), which may put extra load on a very busy server. Ideally you would have the proper URLs in the first place, so that no such manipulation is needed.
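Putting the ordering advice together, the overall layout of the file might look like the outline below. It is a sketch only: the placeholder comments stand in for the blocks already shown in the question, and nothing here has been tested against the actual site.

```apache
RewriteEngine On
RewriteBase /

# 1. Custom rules first (the existing "REWRITE FILE URI TO file.php IF EXISTS" block)
# : (existing custom rules go here)

# 2. Fallback: rewrite a request for a missing .php file to an existing .html file
RewriteCond %{DOCUMENT_ROOT}/$1.php !-f
RewriteCond %{DOCUMENT_ROOT}/$1.html -f
RewriteRule ^(.+)\.php$ $1.html [L,PT]

# 3. WordPress rules last, so they only see requests the rules above did not handle
# BEGIN WP MULTISITE RULES
# : (existing WordPress rules go here)
# END WP MULTISITE RULES
```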
