.htaccess: can I delete old redirection rules?

Last year I altered existing URLs, so I created hundreds of redirection rules to point each old URL to the new one.
Can I remove these rules today? How can I check whether Google is no longer indexing the old URLs?
Thanks :)

Check the "advanced search" google offers:
it allows to use the site keyword, so a search for site:example.com will present all URLs indexed in the google catalog.
That should be the list you are interested in.
Apart from that you should have uploaded your current sitemap and checked that those current links are indexed, obviously.
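Once the old URLs have dropped out of that list, redirect rules of this kind are safe to remove from .htaccess (hypothetical paths, shown only to illustrate what can eventually be deleted):
# Hypothetical one-to-one rules created after last year's URL change;
# removable once site:example.com no longer shows the old URLs
Redirect 301 /old-article https://example.com/new-article
Redirect 301 /old-category/old-post https://example.com/new-category/new-post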

Related

How to stop indexing of links with included subfolders

My real links on my website should be indexed in Google as (example):
www.mywebsite.com/title,id,sometext,sometext
Unexpectedly, Google search is indexing my website with subfolders, which should not occur, for example:
www.mywebsite.com/include/title,id,sometext,sometext
www.mywebsite.com/img/min/title,id,sometext,sometext
and so on.
How can I stop these from being indexed? What do I have to change in .htaccess or robots.txt? Help me, thanks.
You need to update your robots.txt to prevent bots from crawling those pages, and you should set a noindex on these pages to remove them from rankings. You may also want to explore canonical links if the same page can be served from multiple URLs.
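A sketch of both pieces, assuming the /include/ and /img/min/ folders from the question. In robots.txt:
User-agent: *
Disallow: /include/
Disallow: /img/min/
And the noindex, placed in the <head> of the pages you want dropped from rankings:
<meta name="robots" content="noindex">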

Duplicate URLs on my site, best solution?

I have a website that writes URLs like this:
mypage.com/post/3453/post-title-name-person
In fact, what is important is the post ID part (3453). The title I just add for SEO.
I changed some title names recently, but people can still use the old URLs, because I only read the ID to open the page, so:
mypage.com/post/3453/post-title-name-person
mypage.com/post/3453/name-person
...
will open the same page.
Is this wrong? Google Webmaster Tools tells me that I have 8765 duplicate pages. So, to try to solve this, I am redirecting the old title to post/id/current-title, but it seems that Google doesn't understand this redirect and still reports duplicates.
Should I return a 404 if the title doesn't match the actual database? (But this could be a problem because links that people shared won't open.) Or what?
Maybe Google has not processed your redirects yet. It may take several weeks, and sometimes several months, to process all pages, especially if they are not revisited often. Make sure your redirects are 301 (permanent) and not 302 (temporary).
That being said, there is a better method than redirects for duplicate pages: the canonical tag. If you can, implement it. There is less risk of mixing up redirects.
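For example (a sketch using the URLs from the question), every title variant of post 3453 would emit the same canonical link in its <head>:
<link rel="canonical" href="http://mypage.com/post/3453/post-title-name-person">
Google then folds all the variants into that one URL instead of counting them as duplicates.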
Google can pick up your new URLs only after the implementation of 301 redirects, e.g. through the .htaccess file. Always remember that each 301 redirect should be proper and map one-to-one to the new URL. After this implementation, fetch the new URLs via Google Search Console so that Google indexes them faster.
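A one-to-one rule for the example above might look like this in .htaccess (hypothetical slugs):
Redirect 301 /post/3453/name-person /post/3453/post-title-name-person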

301 redirect all ugly permalinks from old site to new site

So I overhauled a complete website the other day and found snippets of some of the old pages in the Google search results. The old site had an ugly link structure such as domain.com/index.php?article_id=123. The new site uses pretty permalinks such as domain.com/pagetitle.
Is there a piece of code I could put into the .htaccess file in order to redirect all ugly permalinks to the new site?
Edit
Additional info: The old links don't exist anymore. The old site's structure and the new one's differ a lot; not all content from the old site was adapted. The main problem is that I don't want the old links in the Google search results to keep throwing a 404 at the user.
Maybe something like
RedirectMatch 301 ^/index\.php$ http://www.example.com/somepage
This will redirect every request for index.php to another location (RedirectMatch only matches the path, not the query string, so all article_id values end up at the same target).
I don't have the rep to comment on the other answer, but that is a very improper solution if you value your SEO at all. A redirect is your way of telling Google "I've got the same page, I just moved it". There's a much better way to do this that won't negatively affect your SEO at all.
You should create some logic to redirect those old links to your new links.
Here's an example of how you could do it (a code sketch follows the steps):
Go to the beginning of your program, before any logic takes place.
Use code to retrieve the requested page. In this case, you might be able to get away with simply checking for a GET variable named article_id.
If the requested page is a match for your GET variable, run a query to see if the article exists. (Obviously, you'll still want to 404 articles that don't exist).
Retrieve the content used to generate the new, more SEO-friendly URLs. This is probably the article title or something similar.
Write some code to generate the new article URL. At this point, if this is working properly, you should be able to print that new URL to make sure it's correct.
301 redirect to the new URL. Don't 302 or any other number, 301 redirect it. This lets search engines know it's the same page and content, but it has permanently moved.
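A minimal PHP sketch of those steps, assuming two hypothetical helpers: find_article_by_id() for the database lookup and slugify() for building the pretty title slug:
<?php
// Runs at the top of index.php, before any other logic.
// find_article_by_id() and slugify() are hypothetical helpers
// standing in for your own database and slug code.
if (isset($_GET['article_id'])) {
    $article = find_article_by_id((int) $_GET['article_id']);
    if ($article === null) {
        http_response_code(404); // the article really doesn't exist anymore
        exit;
    }
    // Build the new SEO-friendly URL from the article title.
    $newUrl = 'https://domain.com/' . slugify($article['title']);
    header('Location: ' . $newUrl, true, 301); // permanent move
    exit;
}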

How to properly split a site?

Suppose I have a new version of a website:
http://www.mywebsite.com
and I would like to keep the older site in a subdirectory and treat it separately:
http://www.mywebsite.com/old/
My new site has a link to the old one on the main page, but not vice-versa.
1) Should I create 2 sitemaps? One for the new and one for the old?
2) When my site gets crawled, how can I limit the path of the crawler? In other words, since the new site has a link to the old one, the crawler will reach the old site. If I do the following in my robots.txt:
User-agent: *
Disallow: /old/
I'm worried that it won't crawl the old site (using the 2nd sitemap) since it's blocked. Is that correct?
1) You could include all URLs in one file, or you could create separate files. One could understand a sitemap as being "per (web) site"; e.g. see http://www.sitemaps.org/:
In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL
Since you now have two sites, you may create two sitemaps. But again, I don't think it is strictly defined that way.
2) Well, if you block the URLs in robots.txt, those URLs won't be visited by conforming bots. That doesn't mean the URLs will never be indexed by search engines, but the pages (= the content) will not be.
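If you do keep two sitemaps, you can point crawlers at both from robots.txt alongside the Disallow rule (hypothetical filenames):
User-agent: *
Disallow: /old/
Sitemap: http://www.mywebsite.com/sitemap-new.xml
Sitemap: http://www.mywebsite.com/old/sitemap-old.xml
Note that listing the /old/ URLs in a sitemap does not override the Disallow: blocked URLs still won't be crawled.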

Symbol "?" in alias or Dirty url

I want to move the website to the Drupal CMS with the original paths. They look like
website.com/search.php?q=blablabla
website.com/index.php?q=blablabla
website.com/category.php?q=auto&page=2
etc
How can I use these aliases in Drupal? Thank you.
I think you will have great difficulty setting this up, if it's even possible. It would be much better to let Drupal use its standard clean URLs and to setup URL rewrite rules to translate requests for legacy URLs to the new ones.
For example, Drupal's search URL looks like:
website.com/search/node/blahblah
And in .htaccess you could define:
RewriteCond %{QUERY_STRING} ^q=(.*)$
RewriteRule ^search\.php$ /search/node/%1? [R=301,NC,L]
This matches the format of your legacy search URL, extracts the query, and rewrites the URL so the query is in Drupal's clean form. (A RewriteRule pattern never sees the query string, which is why the RewriteCond is needed; %1 refers to the RewriteCond's capture, and the trailing ? on the target drops the old query string.) That way requests to website.com/search.php?q=blah get translated to website.com/search/node/blah before getting sent to Drupal. The user, however, will see the new, Drupal-style URL.
mod_rewrite is well documented.
This is of course going to be harder to do if your legacy URLs make use of unique IDs that do not exist in Drupal. In that case I'd take care to make sure that node IDs and taxonomy IDs etc all correspond between your legacy site and your new site. That way you could translate something like /view.php?articleID=121 to /node/121.
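Following the same pattern, that translation might look like this (a sketch, assuming the legacy articleID matches the Drupal node ID):
RewriteCond %{QUERY_STRING} ^articleID=([0-9]+)$
RewriteRule ^view\.php$ /node/%1? [R=301,NC,L]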
This has the effect of handling any incoming links from search engines, third party sites, or users' bookmarks, but leaves you with an entirely new URL structure. I've used this approach before when migrating to Drupal.
