.htaccess, proper rewriting of directory and file with same name

As of now my website has a few static pages, one of which is /portfolio. Among other things, my htaccess hides the .html extension. I'd like to add a portfolio directory, but I do not want to move my existing portfolio page into the portfolio directory as the default index file. My /portfolio page is one of my Google sitelinks, and I am afraid that if it is moved or its URL changes in some way, Google will consider it to be a brand new page.
My problem is that once I add the /portfolio/ directory, whenever I try to visit the original /portfolio page, a trailing slash is automatically added and the request resolves to the directory itself.
I've tried countless options, one being a rewrite of /portfolio/ to /portfolio, but this creates an infinite loop. I also tried "DirectorySlash Off", but that only removed the trailing slash while inside the directory; it didn't restore access to the original /portfolio page.
Ultimately, I would like to keep my /portfolio page as-is and link to pages inside the directory like so: /portfolio/example. If either /portfolio or /portfolio/ is accessed, the same page (the one outside the directory) should be shown, without Google treating it as duplicate content.
A similar question exists here:
.htaccess rewriting url to page or directory, though this still resulted in an infinite loop for me for some reason. I'm guessing it has something to do with the hidden extensions.
Here's my htaccess:
RewriteEngine On
# HTML to PHP
RemoveHandler .html .htm
AddType application/x-httpd-php .htm .html
# Hide extension
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.html -f
RewriteRule ^(.*)$ $1.html
# Force WWW
RewriteCond %{HTTP_HOST} ^mydomain\.net
RewriteRule ^(.*)$ http://www.mydomain.net/$1 [R=301,L]
# Blog Subdomain
RewriteCond %{HTTP_HOST} ^blog.mydomain.net$
RewriteRule ^(.*)$ http://www.mydomain.net/blog/$1 [R=301,L]
I know it's not a great idea having a directory with the same name as a static page, but I really would rather not alter the existing page and lose the Google sitelink, so a clean and proper way to handle this would be a help.

There are two things going "wrong" here, and two ways to fix them.
The first is that Apache "figures out" that there is a directory named "portfolio" before the rewrite conditions are applied. That means that the rewrite conditions receive "portfolio/" instead of "portfolio".
Second, the "!-d" condition specifically prevents the rewrite you want from being applied whenever a directory by that name exists.
Solution 1: Manually re-route requests for the portfolio directory to remove the slash.
# Manually re-route portfolio/ requests to portfolio
RewriteCond %{REQUEST_FILENAME} portfolio/$
RewriteRule ^(.*)/$ $1
# Hide extension
RewriteCond %{REQUEST_FILENAME}\.html -f
RewriteRule ^(.*)$ $1.html
Note the removal of the "!-d" condition.
The downside is that you have to hard-code the "portfolio" edge case directly into the rewrite rules, and the browser will still be redirected to portfolio/ first.
Solution 2: Set DirectorySlash Off and remove directory exists test
# Disable Automatic Directory detection
DirectorySlash Off
# Hide extension
RewriteCond %{REQUEST_FILENAME}\.html -f
RewriteRule ^(.*)$ $1.html
Setting DirectorySlash Off would fix this issue best, but may break other parts of your site where you actually want the automatic trailing slash. Best of luck, and I hope this helps.
Note: when testing solution 2, your browser may remember the earlier redirect from "portfolio" to "portfolio/" and perform it before it even sends a request to the server. Be sure to test in a clean environment with the cache cleared for best results.
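For reference, here is a minimal sketch that combines solution 2 with a canonical redirect for the trailing-slash form. It assumes the .htaccess file sits in the document root and that portfolio.html lives next to the portfolio/ directory. The 301 from /portfolio/ to /portfolio is an addition that is not part of the answer above; it is meant to address the duplicate-content concern from the question and should be treated as untested.

```apache
# Stop mod_dir from redirecting /portfolio to /portfolio/
DirectorySlash Off

RewriteEngine On

# Assumption (not in the original answer): send the trailing-slash form
# to the slashless URL so only one address serves the page
RewriteRule ^portfolio/$ /portfolio [R=301,L]

# Hide extension: serve foo.html for /foo, even if a directory foo/ also exists
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^(.*)$ $1.html [L]
```

With these rules, /portfolio/example is still rewritten to portfolio/example.html, because the extension rule no longer skips paths inside an existing directory. One thing to check before relying on DirectorySlash Off: the Apache documentation warns that, with mod_autoindex enabled, a slashless request for a directory can show a directory listing instead of the index document, so it may be worth confirming that listings are disabled (Options -Indexes).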

Related

htaccess rewrite rule permanently redirects certain pages, even though I don't want it to

I like to deploy little and often, so my deploy script creates a dated folder in the docroot (e.g. /202206281019/), and I update my .htaccess file to point to it.
There is another folder for images (/static/), as they rarely get updated and so I don't want to deploy them every time. That is also referenced in the .htaccess.
- public_html/
|
-- .htaccess
-- 202206281019/
-- static/
In both of these cases, I'm re-writing internally, so that the visitor does not know anything about my site structure.
This works exactly as I expect across most of the site, except for one section where the rule appears to send a 301 redirect message back to the browser.
I really, really don't want this, and do not understand why it is happening in only one section of the site.
I'd be really grateful for another pair of eyes checking my work ...
Options -Indexes
DirectoryIndex index.html
<IfModule mod_rewrite.c>
RewriteEngine on
# /wp-content only holds images/styles/scripts
RewriteCond %{REQUEST_URI} ^/wp-content/
RewriteRule ^(.*)$ /static/$1 [NC,END]
# /wp-includes only holds images/styles/scripts
RewriteCond %{REQUEST_URI} ^/wp-includes/
RewriteRule ^(.*)$ /static/$1 [NC,END]
# everything else gets re-written
RewriteCond %{REQUEST_URI} !^/202206281019
RewriteRule ^(.*)$ /202206281019/$1 [NC,L]
</IfModule>
My staging site: https://staging.achaneich.co.uk/uk-clubs-directory/. As you click around the site, all requests are silently rewritten as I want. However, if you click on a link to a club (e.g. https://staging.achaneich.co.uk/uk-clubs-directory/name/lochaber-aikido), that gets redirected (to https://staging.achaneich.co.uk/202206281019/uk-clubs-directory/name/lochaber-aikido/). The image rewrite works as expected throughout the site.
Thanks in advance for your thoughts!
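One plausible cause, which the question does not confirm, is that the club pages are physical directories and the links to them lack a trailing slash. After the internal rewrite, mod_dir would see a directory path without a slash and issue its own 301 to add one, using the already rewritten path. The sketch below assumes that is what is happening and adds the slash on the public URL before the internal rewrite runs; the dated folder name is taken from the question, and the rule would go above the existing catch-all.

```apache
RewriteEngine on

# Assumption: the request maps to a directory inside the dated folder.
# Add the trailing slash on the public URL first, so mod_dir never has to
# redirect against the internal /202206281019/... path.
RewriteCond %{DOCUMENT_ROOT}/202206281019/$1 -d
RewriteRule ^(.+[^/])$ /$1/ [R=302,L]

# everything else gets re-written (unchanged from the question)
RewriteCond %{REQUEST_URI} !^/202206281019
RewriteRule ^(.*)$ /202206281019/$1 [NC,L]
```

If this is the cause, a request for /uk-clubs-directory/name/lochaber-aikido would be redirected once to the same public URL with a trailing slash, and the second request would be rewritten internally as intended. The 302 can be switched to 301 once the behaviour is confirmed, and any cached 301 from earlier tests needs clearing first.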

htaccess directory management

I'm going to ask a really simple question. I don't want my link to show this when I run my page:
http://localhost/example/assets/gallery.php
What I want is:
http://localhost/example/assets/
So how do I do it in the .htaccess file?
I would really appreciate it if you can help, because I'm so confused after reading forums.
My htaccess looks like this right now, but it only helps to remove the extension:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^\.]+)$ $1.php [NC,L]
How to make assets/gallery.php -> assets/
In the assets folder, create a .htaccess file.
Paste in this code:
DirectoryIndex gallery.php
This code sets the directory index to gallery.php, meaning gallery.php is served when the directory itself is requested, just as index.php normally would be.
The DirectoryIndex method that @RyanTheGhost suggests in his answer should have worked for the specific example you posted (where you request a directory and serve a file from within that directory). However, the mod_rewrite directives you currently have in the document root [1] will conflict with any requests for directories [2] (although the DirectoryIndex should take priority).
However, the DirectoryIndex method is not very practical if you have many such files. And if you are not requesting a directory then this method naturally won't work anyway.
You could instead rewrite the URL using mod_rewrite in your existing .htaccess file, before your current directives.
[1] I'm assuming your .htaccess file is in the document root.
For example:
# Rewrite "/example/assets/" to "/example/assets/gallery.php"
RewriteRule ^example/assets/$ example/assets/gallery.php [L]
Or, to avoid repetition:
# Rewrite "/example/assets/" to "/example/assets/gallery.php"
RewriteRule ^example/assets/$ $0gallery.php [L]
Where the $0 backreference contains the entire URL-path that is matched by the RewriteRule pattern. ie. example/assets/ in this case. NB: There is no slash prefix on the RewriteRule pattern or the substitution string.
Note that since you are requesting a directory (ie. /example/assets/) you need to ensure there is no DirectoryIndex document in that directory (eg. index.html or index.php), otherwise this will be served (by mod_dir) instead, overriding your internal rewrite above.
[2] Your current directives that append the .php extension are arguably incorrect:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^\.]+)$ $1.php [NC,L]
This rule appends the .php extension to any request that does not map to a physical file, even if the file with a .php extension does not exist either. This can result in the incorrect URL being reported back to the user in the 404 error document (depending on how this is implemented). For example, the default Apache 404 error document will report that /foo.php does not exist, when the user requested /foo.
This rule will also rewrite directories (since they are "not files") which will result in a 404 (as opposed to a 403 or directory listing, if enabled). Although a DirectoryIndex document will override this.
Additionally, the NC flag is superfluous and there is no need to backslash-escape the literal dot inside a character class.
You could instead check that the corresponding .php file exists before rewriting, instead of checking that the requested URL does not map to a file.
For example:
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^([^.]+)$ $1.php [L]
The request is now only rewritten when the corresponding .php file exists, which naturally avoids any conflicts with directories.

Removed .html extensions with htaccess, now index.html gives 403 error

After entering the code below, my home page gives a 403 error. The rest of the site works perfectly. All instances of .html were removed.
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^\.]+)$ $1.html [NC,L]
Any advice?
Thank you!
example.com leads to the 403 error. If I write example.com/index it works fine.
Something else must have changed for this to result in a 403 error. The code you posted won't actually do anything when you request example.com/ and behaves the same as if it didn't exist at all. (UPDATE: However, this assumes your .htaccess file is located in the document root. It appears this is not the case; see below.)
However, what will trigger a 403 in such cases is when "formatted directory listings" are disabled and the directory index document cannot be found (or has been disabled).
So, try setting the appropriate directory index at the top of your .htaccess file:
DirectoryIndex index.html
It is the DirectoryIndex that serves the appropriate file when requesting your "home page", not your directives in .htaccess.
UPDATE:
"It [.htaccess] is located in my root directory. Would it be better to put it in the public_html folder?"
Yes, the code you posted should go in the /public_html directory (i.e. your document root). If these directives are in a .htaccess file above the document root, then the RewriteRule pattern will match the URL-path public_html/ and rewrite the URL to public_html/.html, which is possibly where your 403 error is coming from. ("Dot" files are usually hidden/protected OS files, and you may also have a directive in your server config blocking access to them, although this behaviour may depend on other factors in the server config/OS.) With that code in the document root, a request for example.com/ (your home page) won't be processed by these directives at all, which is good: mod_dir should then serve the index.html file.
However, you don't want to process directories anyway (public_html is obviously a directory, not a file), which is what's happening above: .html shouldn't be appended to public_html/ to begin with (or to example.com/path/to/directory/ or any other directory). This can be avoided by adding a condition to your rule block that excludes directories (as well as files). For example:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^.]+)$ $1.html [L]
Simply adding that additional RewriteCond directive might be enough and still allow you to keep your .htaccess file above the document root. (However, you may still need to move the .htaccess file as well, as described above.)
Also, the NC flag is not required here and literal dots don't need to be escaped when used inside a character class.
You could also extend this code to first check the existence of the file (with a .html extension) before rewriting, although this may be unnecessary in your case. For example:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^([^.]+)$ $1.html [L]
This requires an additional "file check" which may be an unnecessary overhead.

htaccess - need to fix broken links coming from other sites to mine

I am having an issue where Google Webmaster Tools is reporting a ton of 404 links to my site, all coming from ask.com.
I have tried to get ask.com to fix their side, but of course they have not, so now I am stuck with over 11k bad links to my site, which I suspect is affecting my rankings right now.
Anyway, I have a possible way to 301 them, but I'm not sure how to do it with .htaccess.
Here is the bad link pointing to my site
http://www.freescrabbledictionary.com/sentence-examples/fere-film/feverous/about.php
It should be
http://www.freescrabbledictionary.com/sentence-examples/fere-film/feverous/
Besides the about.php there are other variations of endings as well, I basically need to be able to remove the ending.
Problem is that the URL after /sentence-examples/ can change. The beginning is always:
http://www.freescrabbledictionary.com/sentence-examples/
So basically:
http://www.freescrabbledictionary.com/sentence-examples/<-keep but can change->/<-keep but can change->/<-remove this->
This .htaccess should be placed in the folder above sentence-examples:
RewriteEngine on
# Redirect /sentence-examples/anything/anything/remove to /sentence-examples/anything/anything/
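# Only match when something follows the second segment, so the corrected URL does not redirect to itself
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s/+(sentence-examples/[^/?\s]+/[^/?\s]+)/[^?\s]+ [NC]
RewriteRule ^ /%1/? [R=302,L]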
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/(.*)$ /sentence-examples/examplesentence.php?havethis=$1&word=$2 [L]
Change 302 to 301 once you confirm it's working as expected.
If you have a CMS installed you might need a different rule that works alongside it without conflicting.
Keep in mind that if you previously tried different redirects using 301 (a permanent redirect), it's recommended that you test this rule in a different browser to avoid caching.
This is possibly quick and dirty but I've done a simple test on localhost and here just to make sure it works.
RewriteEngine On
RewriteRule ^sentence-examples/(.*)/(.*)/(.*)\.php http://www.freescrabbledictionary.com/sentence-examples/$1/$2/ [R=301,L]
You can see that I've added wildcard groups (.*) to the RewriteRule so that we can pick up the elements of the URL that we need for the redirect, i.e. $1 and $2. You can also use the third one ($3) to see which destinations are being targeted a lot, for your SEO needs.
NB: The rule above assumes that the bad URL always ends in .php. To redirect regardless of whatever comes after the 3rd URL segment, replace the RewriteRule with the one below. It requires at least one character after the 3rd segment, so the corrected URL does not match and redirect to itself:
RewriteRule ^sentence-examples/([^/]+)/([^/]+)/.+$ http://www.freescrabbledictionary.com/sentence-examples/$1/$2/ [R=301,L]

.htaccess rewrite rule for /

I have a website where if I go to the URL http://mysite.com/community it shows page not found. But, the URL http://mysite.com/community/ correctly displays the page. How can I set up a rewrite for that "/" after community?
This is my present .htaccess:
Options +FollowSymLinks
Options +Indexes
RewriteEngine On
RewriteRule ^admin$ Admin/index.php?qstr=$1 [L]
RewriteRule ^(.*)/$ index.php?qstr=$1 [L]
These were the ones tried by me, but failed
First,
RewriteRule ^(.*)/community $1/community/ [L]
second,
RewriteRule /community /community/ [L]
All with different combinations of with and without [L].
From the Apache URL Rewrite Guide:
Trailing Slash Problem
Description:
Every webmaster can sing a song about the problem of the trailing slash on URLs referencing directories. If they are missing, the server dumps an error, because if you say /~quux/foo instead of /~quux/foo/ then the server searches for a file named foo. And because this file is a directory it complains. Actually it tries to fix it itself in most of the cases, but sometimes this mechanism need to be emulated by you. For instance after you have done a lot of complicated URL rewritings to CGI scripts etc.
Solution:
The solution to this subtle problem is to let the server add the trailing slash automatically. To do this correctly we have to use an external redirect, so the browser correctly requests subsequent images etc. If we only did a internal rewrite, this would only work for the directory page, but would go wrong when any images are included into this page with relative URLs, because the browser would request an in-lined object. For instance, a request for image.gif in /~quux/foo/index.html would become /~quux/image.gif without the external redirect!
So, to do this trick we write:
RewriteEngine on
RewriteBase /~quux/
RewriteRule ^foo$ foo/ [R]
The crazy and lazy can even do the following in the top-level .htaccess file of their homedir. But notice that this creates some processing overhead.
RewriteEngine on
RewriteBase /~quux/
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.+[^/])$ $1/ [R]
Well, after trying out all the above solutions as well as some of my own, I finally solved this. I'm definitely sure that this is NOT a complete solution but it sure solved it for the time being.
Solution: Just created an empty directory named "community" in the root folder. That's it!
But I'm still on the lookout for the actual solution to this.
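A likely reason the two attempts in the question never fired: in a .htaccess file, the RewriteRule pattern is matched against the path with the directory prefix removed, so it never starts with a slash and a pattern such as /community cannot match. The sketch below is one way to apply the external redirect from the quoted guide to this case. It assumes the .htaccess file is in the document root and keeps the other rules from the question as they are.

```apache
Options +FollowSymLinks
RewriteEngine On

# Assumption: this file lives in the document root.
# Externally redirect /community to /community/ so the catch-all below handles it.
RewriteRule ^community$ /community/ [R=302,L]

RewriteRule ^admin$ Admin/index.php?qstr=$1 [L]
RewriteRule ^(.*)/$ index.php?qstr=$1 [L]
```

With a rule like this in place, the empty "community" directory should no longer be needed, since the redirect is issued by mod_rewrite rather than by the server's automatic directory handling. Switch the 302 to a 301 only after confirming the behaviour, because browsers cache permanent redirects.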

Resources