I guess, this is an easy one but anyway, I haven't figured it out yet.
After migrating my website from Joomla 3 to Joomla 4 the structure of categories and articles will change. That's why I will need some rules in .htaccess to redirect the old urls to the new ones.
The website is hosted on an Apache server.
The old URL structure looks something like this:
https://www.mydomain.de/category/subcategory/item/[articleID]-[articleAlias].html
[articleID] is a digit.
[articleAlias], e.g. „this-is-article-number-233“
This should be redirected to...
https://www.mydomain.de/newcategory/newsubcategory/[articleAlias].html
An example:
https://www.mydomain.de/category/subcategory/item/2324-this-is-my-latest-article.html
… should be redirected to...
https://www.mydomain.de/newcategory/newsubcategory/this-is-my-latest-article.html
I've played around with RedirectMatch and RewriteRule but haven't managed to make it work. How do I get rid of the article ID?
My latest try failed with...
RedirectMatch ^category/subcategory/item/([0-9]+)-(.*)$ /newcategory/newsubcategory/$1
Is there a simple and elegant solution to this? Thanks in advance!
UPDATE
Maybe it's more complex than I thought it was.
Main problem is that not only my categories changed but also the ids of the articles.
So, to stick with my example...
https://www.mydomain.de/category/subcategory/item/2324-this-is-my-latest-article.html
first turns into something like:
https://www.mydomain.de/newcategory/newsubcategory/1223-this-is-my-latest-article.html
Anyway, Joomla 4 is able to drop the article id automatically (guess with an internal rewrite) for seo-friendly URLs. I activated that feature to make the new URLs look like
https://www.mydomain.de/newcategory/newsubcategory/[articleAlias].html
The [articleAlias] stays the same.
A redirection according to what you actually ask should be possible like that:
RewriteEngine on
RewriteRule ^/?category/subcategory/item/[0-9]+-(.*)\.html$ /newcategory/newsubcategory/$1.html [R,L]
However I doubt that this really is what you want: this completely drops the numeric ID of the resource. Which means that it won't be available for processing when the redirected request comes back requesting the new, stripped URL. How do you want to internally rewrite that request back to the internal resource then, without that ID?
I ran some more tests, and this is the final solution to my problem:
RedirectMatch 301 ^/?category/subcategory/item/[0-9]+-(.*)\.html$ /newcategory/newsubcategory/$1.html
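For anyone who wants to check the pattern before touching the server, here is a minimal sketch in Python. It assumes Python's re module treats this particular expression the same way Apache's regex engine does, which holds for the simple constructs used here. The function name redirect_target is made up for illustration.

```python
import re

# Same pattern as in the RedirectMatch rule above.
OLD_URL = re.compile(r"^/?category/subcategory/item/[0-9]+-(.*)\.html$")


def redirect_target(path):
    """Return the new path for an old article URL, or None if the rule does not apply."""
    match = OLD_URL.match(path)
    if match is None:
        return None
    # The numeric ID is matched but not captured, so only the alias survives.
    return "/newcategory/newsubcategory/" + match.group(1) + ".html"


print(redirect_target("/category/subcategory/item/2324-this-is-my-latest-article.html"))
# /newcategory/newsubcategory/this-is-my-latest-article.html
```

Paths that do not match the old structure return None, so they would be left alone by the rule.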
I have a problem in redirecting a URL on a Silverstripe website. I have a news feed page with a summary of articles in a paginated style. It displays 20 articles initially and switches to the next 20 based on the page number chosen. It is just the standard blog layout. When I click on page 2 then it should navigate to https://*****/news/?count=20 and for page 3 as https://*****/news/?count=40 etc. However upon clicking the blog page number it navigates to https://*****/news/news/?count=20. So the navigation link is not rewriting the parent URL.
All of my other Silverstripe websites work fine with the same blog layout except this and I don't see any reason to tweak the default code. I thought of adding a .htaccess redirect like this
Redirect 301 /news/news/?start=20 https://******/news/?start=20
but I couldn't get it to work. Could anyone suggest a solution for this?
The output I expect is to redirect to the right URL
https://******/news/?start=20
Here is a simple redirection rule that should fix the symptom you describe:
RewriteEngine on
RewriteRule ^/?news/news/(.*)$ /news/$1 [R=301,L]
But I doubt that approach is a good idea. Simply because it tries to fix a symptom, not the cause. The cause is that you actually create requests to URLs that contain the /news/news/ issue which should never happen. I assume the cause of that issue is that you hand out relative references (so something like news/...) instead of absolute references (/news/...). I strongly suggest that you handle the cause instead of trying to fix the symptom.
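To illustrate the assumed cause, here is a small Python sketch of how a browser resolves a relative versus an absolute reference against the news page. The host example.com is a placeholder, since the real one is masked in the question, and the link values are my guess at what the template emits.

```python
from urllib.parse import urljoin

# Hypothetical address of the news page (the real host is masked in the question).
page = "https://example.com/news/"

# A relative reference is appended to the directory of the current page.
relative_link = urljoin(page, "news/?count=20")

# An absolute reference (leading slash) is resolved from the site root.
absolute_link = urljoin(page, "/news/?count=20")

print(relative_link)  # https://example.com/news/news/?count=20
print(absolute_link)  # https://example.com/news/?count=20
```

If the pagination links are generated without the leading slash, that would produce exactly the doubled /news/news/ path described in the question.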
I have a series of URLs on my website:
http://www.example.com/sub1/sub2/content.html
But I would like to remove "sub1" completely - not hide it so it still attempts to access that directory. Finished result would be this URL:
http://www.example.com/sub2/content.html
Many similar posts on SE seem to demonstrate how to "hide" a URL from the user. I want to rewrite the URL so that it treats it as if it isn't even there.
Example of what I'm trying not to do: Hide Part of URL htaccess
NOTE: I do not want to actually delete files as suggested by the comment below. I'm trying to redirect the request to another directory.
This worked for me:
RewriteRule ^sub1/sub2/(.*)$ /sub2/$1 [R=302,NC,L]
Helpful page: http://coolestguidesontheplanet.com/redirecting-a-web-folder-directory-to-another-in-htaccess/
I will be really grateful if someone helps me with this.
Let's consider these 2 URLs (both returning 200 in the response header):
www.foo.com/something
www.foo.com/something/
Google considers these 2 URLs different despite both having the same content which leads to a duplicated content problem. To solve the issue it is advised to either use the 301 permanent redirect to redirect one URL to the other or use the rel="canonical" attribute. source
WordPress blogs deal perfectly with this matter. When adding the trailing slash to my internal links, I was redirected to URLs without the trailing slash (301 response).
The problem is that the redirect only happens with internal pages. My homepage seems to return a 200 response with or without the trailing slash. Should I leave it as it is or force a redirect with the .htaccess file?
p.s.: The backlinks to my website have 2 different hrefs (with and without the trailing slash). Should I change those backlinks to a unique href or redirect one to the other?
Use this link to add trailing slash to end of your url
It doesn't matter whether your backlinks have the slash or not, because after implementing the techniques mentioned at the above address, search engines will only see your pages with slashes. Remember that because of past indexing you will have to wait until the former index entries are dropped, or you may use a 301 redirect to the pages with the slash. Either way, it will take some time until search engines come back and find your redirect rules.
By home page I assume you mean the page shown when you enter just the domain.
With or without the slash represents exactly the same URL. Nothing to worry about and nothing you can do.
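A quick way to see this is to look at the path a client actually sends. The sketch below is only an illustration: the http:// scheme is added by me (the question gives the URLs without one), and request_target is a made-up helper name.

```python
from urllib.parse import urlsplit


def request_target(url):
    """Return the path a client would put in the HTTP request line."""
    path = urlsplit(url).path
    # An empty path is sent as "/", so both spellings of the bare domain are the same request.
    return path or "/"


print(request_target("http://www.foo.com"))             # /
print(request_target("http://www.foo.com/"))            # /
print(request_target("http://www.foo.com/something"))   # /something
print(request_target("http://www.foo.com/something/"))  # /something/
```

Only the inner pages produce two different requests, which is why a redirect makes sense there but not for the home page.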
At my work we have various web pages that, my boss feels, are being ranked lower than they should be because "mywebsite.org/category/" looks like a different URL to search engines than "mywebsite.org/category/index.php" does, even though they show the same file. I don't think it works this way but he's convinced. Maybe I'm wrong though. I have two questions:
How do I make it so that it will say "index.php" in the address bar of all subcategories?
Is this really how pagerank works?
Besides changing all the links everywhere, a simpler solution is to use a rewrite rule. Make sure it is a permanent redirect, or Google will keep using the old link (without index.php). How you do this exactly depends on your web server, but for Apache HTTPd it looks something like the example given below.
Yes. Or so I've heard. Very few people know for sure. But Google mentions this guideline (as "Be consistent"). Make sure to check out all of Google's Webmaster guidelines.
Apache config for rewrite rule:
# in the generic config
LoadModule rewrite_module modules/mod_rewrite.so
# in your virtual host
RewriteEngine On
# redirect everything that ends in a slash to the same, but with index.php added
RewriteRule ^(.*)/$ $1/index.php [R=301,L]
# or the other way around, as suggested
# RewriteRule ^(.*)/index\.php$ $1/ [R=301,L]
Adding this code to the top of every page should also work:
<?php
// Redirect any request ending in a slash to the same URL with index.php appended
if (substr($_SERVER['REQUEST_URI'], -1) == '/') {
    $new_request_uri = $_SERVER['REQUEST_URI'].'index.php';
    header('HTTP/1.1 301 Moved Permanently');
    header('Location: '.$new_request_uri);
    exit;
}
?>
You don't tell us if you're using straight PHP or some other framework, but for PHP, probably you just need to change all the links on your site to "mywebsite.org/category/index.php".
I think it's possible that this does affect your search engine rank. However, you would be better off using only "mywebsite.org/category" rather than adding "index.php" to each one.
Bottom line is that you need to make sure all your links in your website use one or the other. What actually gets shown in the address bar is unimportant.
A simple solution is to put in the <head> tag:
<link rel="canonical" href="http://mywebsite.org/category/" />
Then, no matter which page the search engine ends up on, it will know it is simply a different view of /category/
And for your second question: yes, it can affect your results, if Google thinks you are spamming. If it couldn't, they wouldn't have added support for rel="canonical". Although I wouldn't be surprised if they treat somedir/index.* the same as somedir/
I'm not sure if /category/ and /category/index.php are considered two URLs for SEO, but there is a good chance that it will affect them, one way or another. There is nothing wrong with making a quick change just to be sure.
A few thoughts:
URLs
Rather than adding /index.php, you will be better off making it so there is no index.php on any of them, since the keyword 'index' is probably not what you want.
You can make a script that will check if the URL of the current page ends in index.php and remove it, then forward to the resulting URL.
For example, on one of my sites, I require the 'www.' for my domain (www.domain.com and domain.com are considered two URLs for search purposes, though not always), so I have a script that checks each page and, if there is no www., adds it and forwards.
if (APPLICATION_LIVE) {
    if (strtolower($_SERVER["HTTP_HOST"]) != "www.domain.com") {
        header("HTTP/1.1 301 Moved Permanently"); // Recognized by search engines and may count the link toward the correct URL...
        // A Location header needs a full URL with scheme (http:// assumed here); REQUEST_URI already starts with "/"
        header("Location: http://www.domain.com" . $_SERVER["REQUEST_URI"]);
        exit();
    }
}
You could modify that to do what you need.
That way, if a crawler visits the wrong URL, it will be notified that it was replaced with the correct URL. If a person visits the wrong URL, they will be forwarded to the correct URL (most won't notice), and then if they copy the url from the browser to send someone or link to that page, they will end up linking to the correct url for that page.
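The index.php check itself is small. Here is a rough sketch of the decision logic in Python, only to show the idea; canonical_path is a made-up name, the input is assumed to be the request path without the query string, and the real thing would live in your PHP pages or a rewrite rule.

```python
def canonical_path(path):
    """Return the path to redirect to, or None if the path is already canonical."""
    suffix = "index.php"
    if path.endswith("/" + suffix):
        # Keep the trailing slash, drop only the file name.
        return path[: -len(suffix)]
    return None


print(canonical_path("/category/index.php"))  # /category/
print(canonical_path("/category/"))           # None
```

When the function returns a path, the page would answer with a 301 and that path in the Location header; when it returns None, the page is served normally.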
LINKING URLS
The way other pages link to your pages is more important for SEO. Make sure all your in-site links use the proper URL (without /index.php), and that if you have a 'link to this page' feature, it doesn't include the /index.php part. You can't control how everyone links to you, but you can take some control over it, like with the script in item 1.
URL ROUTING
You may also want to consider using some sort of framework or stand-alone URL rerouting scheme. It could make it so there were more keywords, etc.
See here for an example: http://docs.kohanaphp.com/general/routing
I agree with everyone who's saying to ditch the index.php. Please don't force your visitor to type index.php if not typing it could get them the same result.
You didn't say if you're on an IIS or Apache server.
IIS can be set to assume index.php is the default page so that http://mywebsite.org/ will resolve correctly without including index.php.
I would say that if you want to include the default page and force your users to type the page name in the url, make the page name meaningful to a search engine and to your visitors.
Example:
http://mywebsite.org/teaching-web-scripting.php
is far more descriptive and beneficial for SEO rankings than just
http://mywebsite.org/index.php
Might want to take a look at robots.txt files? Not quite the best solution, but you should be able to implement something workable with them...