301 redirect all ugly permalinks from old site to new site - .htaccess

So I overhauled a complete website the other day and found snippets of some of the old pages in the Google search results. The old site had an ugly link structure such as domain.com/index.php?article_id=123. The new site uses pretty permalinks such as domain.com/pagetitle.
Is there a piece of code I could put into the .htaccess file in order to redirect all ugly permalinks to the new site?
Edit
Additional info: The old links don't exist anymore. The structure of the old site and the new one differ a lot, and not all content from the old site was carried over. The main problem is that I don't want the old links in the Google search results to always throw a 404 at the user.

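Maybe something like:
RedirectMatch 301 ^/index\.php$ http://www.example.com/somepage
This will redirect every request for /index.php to another location. Note that RedirectMatch only matches the URL path, not the query string, so it cannot tell one article_id from another.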

I don't have the rep to comment on the other answer, but that is a very improper solution if you value your SEO at all. A redirect is your way of telling Google "I've got the same page, I just moved it". There's a much better way to do this that won't negatively affect your SEO at all.
You should create some logic to redirect those old links to your new links.
Here's an example of how you could do it:
Go to the beginning of your program, before any logic takes place.
Use code to retrieve the requested page. In this case, you might be able to get away with simply checking for GET variables that match article_id.
If the requested page is a match for your GET variable, run a query to see if the article exists. (Obviously, you'll still want to 404 articles that don't exist).
Retrieve the content used to generate the new, more SEO-friendly URLs. This is probably the article title or something.
Write some code to generate the new URL from that content. At this point, if this is working properly, you should be able to print that new URL to make sure it's correct.
301 redirect to the new URL. Don't 302 or any other number, 301 redirect it. This lets search engines know it's the same page and content, but it has permanently moved.
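When only a handful of old articles matter, the same one-to-one mapping can also be written by hand in .htaccess instead of in application code. This is a minimal sketch, assuming mod_rewrite is enabled and the file sits in the document root; the ID 123 and the target /pagetitle are taken from the question purely as an illustration, and each old article would need its own pair of lines.

```apache
RewriteEngine On

# Hypothetical mapping: old article 123 now lives at /pagetitle.
# RewriteRule only sees the URL path, so the query string is checked
# separately with RewriteCond.
# The trailing "?" on the target drops the old query string.
RewriteCond %{QUERY_STRING} (?:^|&)article_id=123(?:&|$)
RewriteRule ^index\.php$ /pagetitle? [R=301,L]
```

Old IDs without a matching rule fall through to whatever the new site does with /index.php, which per the question is a 404.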

Related

redirect numerous dynamic urls to home page via .htaccess

I am trying to clean up a previously hacked WordPress site and its domain name reputation. The site has new hosting and is now on a different CMS, but there are hundreds of spam links in Google that I need to get rid of. They look like example.com/votes.php?10054nzwzm75042pw205039
Domain name, then votes.php?**** etc.. Numbers letters all sorts.
So how do I redirect ANYTHING that starts with the domain name then /votes.php?***
Any help greatly appreciated
Unless you have multiple domains, you don't need to explicitly check the domain name.
To send a "410 Gone" for any request to /votes.php (with any query string), you can do something like the following at the top of your root .htaccess file using mod_rewrite:
RewriteEngine On
# Serve a 410 Gone for any requests to "/votes.php"
RewriteRule ^votes\.php$ - [G]
A 410 is preferable to a "redirect" if you want to get these URLs removed from the search engines as quickly as possible.
To expedite URL removal from Google, use Google's Removal Tool as well.
If you redirect these pages to the homepage then it will likely be seen as a soft-404 by Google and these URLs are likely to remain in the search results for a lot longer.
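If mod_rewrite is not available, mod_alias can return the same status. This is only a sketch, assuming mod_alias is enabled on the server; with the gone status the Redirect directive takes no target URL.

```apache
# Return "410 Gone" for /votes.php, whatever query string follows it
Redirect gone /votes.php
```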

duplicate URLs in my page, best solution?

I have a website that writes URLs like this:
mypage.com/post/3453/post-title-name-person
In fact, what is important is the post and ID part (3453). The title I just add for SEO.
I changed some title names recently, but people can still use the old URLs to access the pages, because I only use the ID to open the page, so:
mypage.com/post/3453/post-title-name-person
mypage.com/post/3453/name-person
...
Will open the same page.
Is it wrong? Google Webmaster Tools tells me that I have 8765 duplicate pages. To try to solve this I am redirecting old titles to post/id/current-title, but it seems that Google doesn't understand this redirect and still reports duplicates.
Should I redirect to "not found" if the title doesn't match the one in the database? (But this can be a problem because links that people shared won't open.) Or what?
Maybe Google has not processed your redirections yet. It may take several weeks and sometimes several months to process all pages, especially if they are not revisited often. Make sure your redirects are 301 and not 302 (temporary).
That being said, there is a better method than redirections for duplicate pages: the canonical tag. If you can, implement it. There is less risk of mixing up redirections.
Google can pick up your new URLs only after the 301 redirects are in place in the .htaccess file. Keep in mind that each 301 redirect should map one-to-one to the new URL. After that, fetch the new URLs via Google Search Console so that Google indexes them quickly.

Redirects required for pages no longer in sitemap

I have a relatively new site that has just started to pick up a bit of traction in the SERPs. My problem is that I have published it and had it indexed with PHP URL extensions, as follows:
www.example.com/page.php
www.example.com/product.php
And so on. Obviously it is a fairly easy matter of editing the .htaccess file to remove these extensions. So I will end up with:
www.example.com/page
www.example.com/product
No problems there.
Because the site is still quite small, I can easily change all the links manually to drop the .php extension, and then update the sitemap. So Google, and all users, should have no way of reaching the .php pages, although of course they still exist if you were to manually type them in.
But, because Google has a 'record' of these pages existing (even though there are no direct links to reach them now), do I need to implement 301 redirects from the .php pages to the new non-php pages? I.e. will Google try to crawl those pages that are no longer in the sitemap, but once existed? In other words, since you can still reach www.example.com/page.php, even though there will be no link on the site or in the sitemaps that will take you there, would I get penalised for having duplicate content? Are 301 redirects basically required when doing this kind of thing, even if there are no links to the content anymore?
Thanks very much.
It is better to keep 301 redirects in place for some time (a month or two), even though you can change all your links to the non-.php URLs. This way any residual URLs (there will always be some) are taken care of, and Google will index the non-.php URLs via your 301 redirects. Once your logs (depending on your system) show that no more old URLs are being requested, you can remove the 301 redirects. This is a gentler way of moving your old URLs than abruptly throwing 404s. A 301 also helps transfer the SEO value of the old URLs to the new ones.
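As an illustration of the 301 approach, here is a common .htaccess pattern. It is a sketch only, assuming mod_rewrite is enabled, the file lives in the document root, and the extensionless URLs are served by internally mapping them back to the .php files; the asker's actual setup may differ.

```apache
RewriteEngine On

# Externally redirect direct requests for /page.php to /page (301).
# THE_REQUEST holds the original request line, so the internal rewrite
# further down does not trigger this redirect again.
RewriteCond %{THE_REQUEST} \s/+([^?\s]+)\.php[?\s] [NC]
RewriteRule ^ /%1 [R=301,L]

# Internally serve /page from page.php when that file exists.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^(.+?)/?$ $1.php [L]
```

Any query string on the old URL is carried over to the redirect target automatically.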
Another item to look out for is using rel="canonical" if you want your .php and non-.php pages to coexist. This tells search engines which of the two URLs is the preferred version, so they are not treated as duplicates.

.htaccess file - forward everything, but not?

First time user, been looking all night.
We recently changed our site from .NET to WordPress. We transferred about half of the news articles and left the rest behind. So now we get old users coming to the site and getting a 404.
The news articles that exist in the WordPress site have been redirected and work fine, for example,
www.example.com/news/transfered-news-story.aspx
redirects to
www.example.com/blog/news/transfered-news-story
this was done manually.
What I need help with is if someone comes to the site with any other request, e.g.
www.example.com/news/this-didnt-get-moved.aspx
or
www.example.com/news/anything-else
or
www.example.com/news/2010/02
all just gets redirected to
www.example.com/blog/news
I have been reading on and off for a couple of weeks and tried a few things but they all append the additional stuff on the end of the redirected string.
so www.example.com/news/my-stuff-ok
becomes www.example.com/blog/news/my-stuff-ok (and I want to drop the my-stuff-ok)
I hope you get what I'm after, any help would be very much appreciated.
Thanks
Phil
You can simply write a directive that converts a 404 to a url (documentation):
ErrorDocument 404 /blog/news
However, you really should go through the motions of adding manual (permanent) redirects to the new URL for each of the other articles, because you will take a considerable SEO hit if those URLs no longer serve up the content that the search engine linked to.
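If a real redirect is preferred over an error document, mod_alias can send everything else under /news/ to the news index without appending the rest of the path. This is a sketch, assuming mod_alias is enabled and that the manual redirects were written as plain Redirect lines (the question does not say how they were done); the first line just repeats the example from the question.

```apache
# Manually mapped articles go first: within one context mod_alias
# uses the first matching directive.
Redirect 301 /news/transfered-news-story.aspx /blog/news/transfered-news-story

# Catch-all: anything else under /news/ goes to the news index.
# RedirectMatch uses the target exactly as written, so the remainder
# of the old path is dropped rather than appended.
RedirectMatch 301 ^/news/ /blog/news
```

The pattern is anchored to the start of the path, so /blog/news itself does not match and no redirect loop occurs.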

Dynamically creating URLs for other websites

I'd like to know how websites have created URLs with other domains like these on trafficestimate.com.
I'm guessing it's some .htaccess stuff to redirect domain names to a dynamic page?
Thanks
Your URL contains a GET request. When someone calls the page http://google.com/search with the parameters hl=en, safe=off etc., the page can process those parameters. For instance, safe=off means that you want unfiltered search results. The q=site:... is your search string. In this case Google will look it up in its database and give you the results. So when you call this URL there is probably no .htaccess processing done. However, you can process the URL and GET request with .htaccess and, e.g., redirect the user to another page.
Maybe you could describe a bit further what exactly you are trying to do or want to know. That would make explaining easier.
EDIT: After reading Gumbo's comment I looked at the Google result page. So maybe your question means the trafficestimate-URLs. They look like http://trafficestimate.com/example.org. This is really a good case for .htaccess. So using .htaccess they take the URL and redirect it to http://www.trafficestimate.com/websites/?domain=example.org. Here you have again a GET request and an application builds the page.
Some URL rewriting is probably involved. Otherwise they would have to create an existing file for every possible request.
Using Apache’s mod_rewrite in a .htaccess file is one option. But since the server identifies itself with “Microsoft-IIS/7.5”, they are probably rather using ISAPI_Rewrite, a mod_rewrite derivative for Microsoft’s IIS.
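For an Apache server, the rewriting described in the answers could be sketched as follows. This is an illustration only, not how trafficestimate.com actually does it (the site appears to run IIS, as noted); it assumes mod_rewrite is enabled and reuses the /websites/?domain= target mentioned above.

```apache
RewriteEngine On

# Leave requests for real files and directories alone.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Treat a single path segment that looks like a host name
# (e.g. /example.org) as the "domain" parameter of one dynamic page.
RewriteRule ^([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)/?$ /websites/?domain=$1 [L,QSA]
```

Because the rule has no R flag, it is an internal rewrite: the visitor keeps seeing /example.org in the address bar while the application receives the domain as a GET parameter.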
