Removing extensions with .htaccess on a Drupal site

How do you use .htaccess to remove extensions? I want old links to keep working, but the site layout has changed. I want
/arithmetic/whole_numbers/practice/whole_number_rounding.html
to redirect (map) to
/arithmetic/whole_numbers/practice/whole_number_rounding
I have moved from a standard HTML site to a Drupal site, and the new page names have been defined to match the original page names, except that they lack the .html extension.

I know this doesn't answer your question directly, but what about the Path Redirect module:
http://drupal.org/project/path_redirect
Edit: if that's not enough, try the Path module documentation:
drupal.org/documentation/modules/path
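For reference, the redirect asked about in the question can also be sketched directly with mod_rewrite. This is a minimal sketch, not a tested Drupal configuration. It assumes Apache with mod_rewrite enabled, Drupal installed at the document root, and that the rule is placed in the root .htaccess before Drupal's own rewrite rules:

```apache
RewriteEngine On

# Only redirect when no real .html file exists at the requested path,
# so any static pages that were kept are still served as-is.
RewriteCond %{REQUEST_FILENAME} !-f

# /some/path/page.html -> /some/path/page (permanent redirect)
RewriteRule ^(.+)\.html$ /$1 [R=301,L]
```

With this in place, a request for /arithmetic/whole_numbers/practice/whole_number_rounding.html should be answered with a 301 to the same path without the extension, which Drupal can then resolve through its path alias. If Drupal lives in a subdirectory, the substitution would need that prefix.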

Related

Setting up .htaccess for the first time

I'm totally lost on how to set up a .htaccess file. I've tried a bunch of things and have only been able to redirect and set the index.
I have a site at https://subdomain.domain.com/views/list.html and I want it to show up as https://subdomain.domain.com. I've been able to hide views/list.html on the main page with DirectoryIndex views/list.html, but when I come back to it from within the website it still shows up with the subfolder. Also, is it possible for other files in the subfolder to show up as something else instead of under the subfolder? E.g. have https://subdomain.domain.com/views/add.html show up as https://subdomain.domain.com/addproduct.
Have you thought about trying PHP indexing? Make a folder structure and place the indexer in the correct folder. As for the subbing, it should be possible, at least from what I recall.
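Both parts of the question can also be approached with mod_rewrite alone. The following is a rough sketch, assuming Apache with mod_rewrite enabled and a .htaccess file in the document root of the subdomain. The /addproduct name comes from the question; everything else is illustrative:

```apache
# Serve views/list.html when the bare site root is requested
DirectoryIndex views/list.html

RewriteEngine On

# If the browser explicitly asked for /views/list.html, send it back to the root.
# THE_REQUEST holds the original request line, so the internal DirectoryIndex
# lookup does not trigger this rule again.
RewriteCond %{THE_REQUEST} \s/views/list\.html[\s?]
RewriteRule ^ / [R=301,L]

# Show /addproduct in the address bar while serving /views/add.html internally
RewriteRule ^addproduct/?$ /views/add.html [L]
```

Links inside the site would still need to point at / and /addproduct rather than at the files under /views/, otherwise visitors keep landing on the long URLs first.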

Keep people from opening PDFs directly on my website

I am not sure if this is possible, but I figure I would ask.
I have hundreds of PDFs stored on my website, and they are all getting indexed directly by Google, so people do a search and the engine takes them directly to the PDF. The issue here is that the PDFs are related to language learning and have audio files that go with them. If a visitor goes directly to the PDF, they never see the audio.
I have another page I have designed which opens up the PDF in an Iframe, and shows the audios right next to them so the users can use it.
So my question is, is it possible to redirect a user who opens:
www.mywebsite.com/something.pdf
And have it redirect them to:
www.mywebsite.com/page-with-audios/
The key here is that the pdf should still open in the IFrame on my domain.
Thanks in advance for any assistance.
If you use routing, you could make a route which has the PDF name as parameter. The route could look something like this:
/{PDF_NAME}.pdf
This could be used to match all PDFs, like example.com/foo.pdf or example.com/bar-baz.pdf. Since you then have the name of the PDF they would like to view, you can redirect them to /page-with-audio-files with some extra data, such as the name of the PDF. Then you can handle opening the iframe.
EDIT
Since I now see your question was directed at .htaccess, I think the following might work too.
Add this rewrite to your .htaccess:
RewriteEngine On
RewriteRule ^([a-zA-Z\-]+\.pdf)$ /page-with-audio/$1 [R=301,L]
This will make the $1 variable somepdf.pdf if your request url is http://example.com/somepdf.pdf.
Then it redirects the user to http://example.com/page-with-audio/somepdf.pdf so you know which pdf was requested.
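One caveat: a rule like this also catches the request the iframe itself makes for the PDF, which the question wants to keep working. A possible variation is sketched below. It assumes the site is served from www.mywebsite.com as in the question, and that checking the Referer header is acceptable (it can be missing or spoofed, so this is a convenience rather than access control):

```apache
RewriteEngine On

# Let the PDF through when the request comes from a page on this site,
# for example the iframe on the audio page.
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?mywebsite\.com/ [NC]

# Everyone else (search results, direct links) goes to the audio page.
RewriteRule ^([a-zA-Z0-9_\-]+\.pdf)$ /page-with-audio/$1 [R=302,L]
```

The sketch uses a temporary redirect (302) on purpose: browsers cache a 301, and a cached permanent redirect could be replayed for the iframe request as well.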

.htaccess or symbolic link (symlink)

I have a website with multiple folders and I was trying to handle them in my .htaccess. After a little while, I ended up with a big .htaccess full of rules that conflict.
Now every time I want to add a folder I have to add it to the .htaccess.
I did some research and found out I can create symbolic links instead, so no more .htaccess edits.
With both solutions I have to create or modify something, so for me the end result is the same, but is it better practice to create symbolic links instead?
Symbolic links are faster, yes (like Aki said), but here are my thoughts on this.
If you have images, CSS or JS files, then you don't need to rewrite or create symbolic links. You can use the full URL (e.g. /images/...) or use a common domain like i.domain.com (or anything you want) and serve all your JS, images and CSS from there. E.g. i.domain.com/logo.jpg or js.domain.com/site.js.
This way, you never have to think about rewrite rules or create links you might forget one day.
This is very easy to manage and maintain if you need to add images, change JS or update your CSS, since you have only one point of entry and everything is updated automatically.
Use symlinks. .htaccess has to be processed by Apache, whereas symlinks are resolved by the OS, which is faster.
Compare creating 100 rules with creating 100 symlinks: if the rule you are looking for is the last one, Apache has to evaluate all the others before it reaches the one you need.
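If you go the symlink route, note that Apache only follows symbolic links when the directory allows it. A minimal sketch, assuming the server configuration permits overriding Options from .htaccess (AllowOverride Options or All):

```apache
# Allow Apache to follow symbolic links found in this directory
Options +FollowSymLinks

# Stricter alternative: only follow links whose target has the same owner
# Options +SymLinksIfOwnerMatch
```

Without one of these options, requests that resolve through a symlink are typically refused with a 403.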

Using mod_rewrite to rewrite site root

I'm trying to have relative links that are preceded with a forward slash (/) get rewritten with mod_rewrite to refer to the site root.
I have site:
http://localhost/mysite/
and I have numerous references, for example, in my css directory, formatted like such:
background: url('/img/background.jpg');
I would like to use mod_rewrite to point that at:
http://localhost/mysite/img/background.jpg
But right now, it is pointing to:
http://localhost/img/background.jpg
I apologize in advance if this is a no-brainer, but I'm new to mod_rewrite, and I have so far been unsuccessful in getting this to work!
This is not a good thing to do. It is duct-taping a mistake that should be fixed at its core.
By rewriting URLs this way, you risk breaking links in other sites on localhost, for example, and it is a long-term maintenance nightmare having to deal with URLs outside your site's root directory.
Consider using relative links in your CSS instead, e.g.
background: url('../img/background.jpg');
if the images are in a sibling directory of the one your CSS style sheet is in.
Links in style sheets are always relative to the location the style sheet is in - not the HTML file that uses it. That makes it very easy to use relative links in style sheets.

Getting "mywebsite.org/" to resolve to "mywebsite.org/index.php"

At my work we have various web pages that, my boss feels, are being ranked lower than they should be because "mywebsite.org/category/" looks like a different URL to search engines than "mywebsite.org/category/index.php" does, even though they show the same file. I don't think it works this way but he's convinced. Maybe I'm wrong though. I have two questions:
1. How do I make it say "index.php" in the address bar for all subcategories?
2. Is this really how PageRank works?
1. Besides changing all the links everywhere, a simpler solution is to use a rewrite rule. Make sure it is a permanent redirect, or Google will keep using the old link (without index.php). How you do this exactly depends on your web server, but for Apache HTTPd it looks something like the example given below.
2. Yes, or so I've heard. Very few people know for sure. But Google mentions this guideline (as "Be consistent"). Make sure to check out all of Google's Webmaster guidelines.
Apache config for rewrite rule:
# in the generic config
LoadModule rewrite_module modules/mod_rewrite.so
# in your virtual host
RewriteEngine On
# redirect everything that ends in a slash to the same, but with index.php added
RewriteRule ^(.*)/$ $1/index.php [R=301,L]
# or the other way around, as suggested
# RewriteRule ^(.*)/index\.php$ $1/ [R=301,L]
Adding this code to the top of every page should also work:
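<?php
// If the requested URI ends in a slash, redirect permanently to the same URI plus index.php.
// This must run before any output is sent, because it sets headers.
if (substr($_SERVER['REQUEST_URI'], -1) === '/') {
    $new_request_uri = $_SERVER['REQUEST_URI'] . 'index.php';
    header('Location: ' . $new_request_uri, true, 301);
    exit;
}
?>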
You don't tell us if you're using straight PHP or some other framework, but for PHP, probably you just need to change all the links on your site to "mywebsite.org/category/index.php".
I think it's possible that this does affect your search engine rank. However, you would be better off using only "mywebsite.org/category" rather than adding "index.php" to each one.
Bottom line is that you need to make sure all your links in your website use one or the other. What actually gets shown in the address bar is unimportant.
A simple solution is to put in the <head> tag:
<link rel="canonical" href="http://mywebsite.org/category/" />
Then, no matter which page the search engine ends up on, it will know it is simply a different view of /category/
And for your second question: yes, it can affect your results, if Google thinks you are spamming. If it didn't, they wouldn't have added support for rel="canonical". Although I wouldn't be surprised if they treat somedir/index.* the same as somedir/
I'm not sure if /category/ and /category/index.php are considered two URLs for SEO, but there is a good chance that it will affect them, one way or another. There is nothing wrong with making a quick change just to be sure.
A few thoughts:
URLs
Rather than adding /index.php, you will be better off making it so there is no index.php on any of them, since the keyword 'index' is probably not what you want.
You can make a script that will check if the URL of the current page ends in index.php and remove it, then forward to the resulting URL.
For example, on one of my sites, I require the 'www.' for my domain (www.domain.com and domain.com are considered two URLs for search purposes, though not always), so I have a script that checks each page and, if there is no www., adds it and forwards.
if (APPLICATION_LIVE) {
    // Redirect any host other than www.domain.com to the www. host.
    if (strtolower($_SERVER["HTTP_HOST"]) != "www.domain.com") {
        header("HTTP/1.1 301 Moved Permanently"); // Recognized by search engines and may count the link toward the correct URL...
        // REQUEST_URI already starts with "/", and the Location header needs a full URL including the scheme.
        header("Location: http://www.domain.com" . $_SERVER["REQUEST_URI"]);
        exit();
    }
}
You could modify that to do what you need.
That way, if a crawler visits the wrong URL, it will be notified that it was replaced with the correct URL. If a person visits the wrong URL, they will be forwarded to the correct URL (most won't notice), and then if they copy the url from the browser to send someone or link to that page, they will end up linking to the correct url for that page.
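The same two redirects can also be handled by the web server instead of a per-page script. This is only a sketch, assuming Apache with mod_rewrite, a .htaccess file in the document root, and www.domain.com as the placeholder canonical host from the script above:

```apache
RewriteEngine On

# Force the www. host: any other host name is redirected permanently
RewriteCond %{HTTP_HOST} !^www\.domain\.com$ [NC]
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]

# Strip a trailing index.php, but only when the client literally requested it.
# THE_REQUEST is the original request line, so the internal lookup of
# index.php for a directory URL does not cause a redirect loop.
RewriteCond %{THE_REQUEST} \s/+(.*/)?index\.php[\s?]
RewriteRule ^(.*/)?index\.php$ /$1 [R=301,L]
```

Under these assumptions a request for /category/index.php would be redirected to /category/, and the query string, if any, is carried over unchanged.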
LINKING URLS
The way other pages link to your pages is more important for SEO. Make sure all your in-site links use the proper URL (without /index.php), and that if you have a 'link to this page' feature, it doesn't include the /index.php part. You can't control how everyone links to you, but you can take some control over it, as with the script under URLs above.
URL ROUTING
You may also want to consider using some sort of framework or stand-alone URL routing scheme. That could let your URLs carry more keywords, etc.
See here for an example: http://docs.kohanaphp.com/general/routing
I agree with everyone who's saying to ditch the index.php. Please don't force your visitor to type index.php if not typing it could get them the same result.
You didn't say if you're on an IIS or Apache server.
IIS can be set to assume index.php is the default page so that http://mywebsite.org/ will resolve correctly without including index.php.
I would say that if you want to include the default page and force your users to type the page name in the url, make the page name meaningful to a search engine and to your visitors.
Example:
http://mywebsite.org/teaching-web-scripting.php
is far more descriptive and beneficial for SEO rankings than just
http://mywebsite.org/index.php
Might want to take a look at robots.txt files? Not quite the best solution, but you should be able to implement something workable with them...
