Duplicate URLs on my site, best solution?

I have a website that writes URLs like this:
mypage.com/post/3453/post-title-name-person
What actually matters is the post ID part (3453); the title is only added for SEO.
I changed some titles recently, but people can still reach the pages through the old URLs, because I only use the ID to open the page, so:
mypage.com/post/3453/post-title-name-person
mypage.com/post/3453/name-person
...
Will open the same page.
Is this wrong? Google Webmaster Tools tells me that I have 8765 duplicate pages. To try to solve this I am redirecting the old titles to post/id/current-title, but it seems that Google doesn't understand the redirect and still reports duplicates.
Should I return a 404 if the title doesn't match what is in the database? (That could be a problem, because links people have already shared would stop working.) Or what else?

Google may simply not have processed your redirects yet. It can take several weeks, and sometimes several months, to process all pages, especially ones that are not revisited often. Make sure your redirects are 301 (permanent) and not 302 (temporary).
That being said, there is a better method than redirects for duplicate pages: the canonical tag. Implement it if you can; there is less risk of mixing up redirects.
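For example, every URL variant of post 3453 would then carry the same tag in its <head> (assuming post-title-name-person is the current title):

<link rel="canonical" href="http://mypage.com/post/3453/post-title-name-person">

Google then consolidates all the title variants onto that one URL, with no redirects involved.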

Google can pick up your new URLs only after you implement 301 redirects, for example through the .htaccess file. Remember that each 301 redirect should be a proper one-to-one mapping to the new URL. After implementing them, fetch the new URLs via Google Search Console so that Google indexes them faster.
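A one-to-one mapping in .htaccess looks like this (URLs taken from the question; which title is the current one is an assumption):

# Each outdated title URL maps permanently to the current title URL.
Redirect 301 /post/3453/name-person /post/3453/post-title-name-person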

Related

Block or redirect website page URLs using .htaccess

I am having issues with visitors following spam links to my site and hitting 404 errors.
My site was hacked: a hidden folder of spam links on public_html redirected users to pornographic sites, and those links were plastered across the internet. I have since remedied the malware issue, but several hundred visitors are still hitting a 404 page because the links no longer exist, messing up my analytics accounts, using bandwidth, and so on.
I have searched for a way to block anyone who tries to access these URL paths (so that they never hit my website), but I cannot possibly redirect every single link individually (there were over 2000); I need a wildcard or something similar. My search led me here: Block Spam Referrer Traffic, but it is not quite the solution I need.
The searches go to pages like this: www.mywebsite.com/spampage/morespam/ (which have been deleted and are now 404 errors)
There are several iterations of the /spampage/ and /morespam/ URLs.
The referrer is generally a Google search, so I can't block by referrer using .htaccess. I'd like to somehow block www.mywebsite.com/spampage/*/ and all its iterations.
Apologies, I am by no means a programmer. I do appreciate any help that can be offered.
Update #1:
It seems that perhaps the best way is to block these links/directories using the robots.txt file. I have done so and will report back if I have success!
Update #2:
Reporting back. I am new to all of this, so I was going about the solution the wrong way in my original question. Essentially, I found that I needed all of the links de-indexed, since they were generating all the traffic by being indexed by Google. I was able to request de-indexing of the directories in question manually through my Google Webmaster Tools account. One requirement for de-indexing was to have the site's robots.txt block those directories from being crawled. Once I did that, I submitted the request to remove the directories from the Google index. Google took those pages off in about 3 hours (thanks Google!), so it was pretty quick once I found the proper way to go about it. No .htaccess editing needed. Once the pages were no longer indexed, traffic went back down to normal levels, and my keywords etc. should be back to normal.
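For reference, the robots.txt entries for this look like the following (directory name taken from the question; add one Disallow line per spam directory):

# Keep all crawlers out of the removed spam directories.
User-agent: *
Disallow: /spampage/

With those in place, the URL-removal request in Google Webmaster Tools can be submitted for each blocked directory.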

Redirects required for pages no longer in sitemap

I have a relatively new site that has just started to pick up a bit of traction in the SERPs. My problem is that I published it and had it indexed with PHP URL extensions, as follows:
www.example.com/page.php
www.example.com/product.php
And so on. Obviously it is a fairly easy matter of editing the .htaccess file to remove these extensions, so I will end up with:
www.example.com/page
www.example.com/product
No problems there.
Because the site is still quite small, I can easily change all the links manually to drop the .php extension, and then update the sitemap. So Google, and all users, should have no way of reaching the .php pages, although of course they still exist if you were to manually type them in.
But because Google has a 'record' of these pages existing (even though there are no direct links to reach them now), do I need to implement 301 redirects from the .php pages to the new extensionless pages? That is, will Google try to crawl pages that are no longer in the sitemap but once existed? Since you can still reach www.example.com/page.php directly, even though no link on the site or in the sitemap will take you there, would I get penalised for having duplicate content? In other words, are 301 redirects basically required when doing this kind of thing, even if nothing links to the old content anymore?
Thanks very much.
It is better to keep a 301 redirect in place for some time (a month or two) even though you can change all your links to the non-PHP URLs. That way any residual URLs (there will always be some) still hanging around are taken care of, and Google will index the non-PHP URLs from your 301 redirects. Once you are sure from your logs (depending on your system) that no more old URLs are coming in, you can remove the redirects. This is a little easier way of moving all your old URLs than abruptly throwing 404s, and a 301 transfers the SEO value of the old URLs to the new ones.
Another item to look out for is using rel="canonical" if you want your .php and non-PHP pages to coexist. This tells Google which version is the preferred one, so the pair is not treated as duplicate content.
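A commonly used .htaccess sketch for this on Apache (assuming mod_rewrite is enabled; test it on a staging copy first):

RewriteEngine On
# Externally 301 any direct request for /page.php to /page.
RewriteCond %{THE_REQUEST} \s/([^.?\s]+)\.php[\s?]
RewriteRule ^ /%1 [R=301,L]
# Internally map the clean URL back onto the real .php file.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^(.+)$ $1.php [L]

The external redirect covers the old indexed URLs, while the internal rewrite keeps the clean URLs working without a second hop.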

301 redirect all ugly permalinks from old site to new site

So I overhauled a complete website the other day and found snippets of some of the old pages in the Google search results. The old site had an ugly link structure such as domain.com/index.php?article_id=123. The new site uses pretty permalinks such as domain.com/pagetitle.
Is there a piece of code I could put into the .htaccess file in order to redirect all ugly permalinks to the new site?
Edit
Additional info: The old links don't exist anymore. The structure of the old site and the new one differ a lot, and not all content from the old site was carried over. The main problem is that I don't want the old links in the Google search results to keep throwing a 404 at the user.
Maybe something like:
RedirectMatch 301 ^/index\.php$ http://www.example.com/somepage
This will redirect every request for index.php, regardless of its query string, to that one location.
I don't have the rep to comment on the other answer, but that is a very improper solution if you value your SEO at all. A redirect is your way of telling Google "I've got the same page, I just moved it". There's a much better way to do this that won't negatively affect your SEO at all.
You should create some logic to redirect those old links to your new links.
Here's an example of how you could do it:
Go to the beginning of your program, before any logic takes place.
Use code to retrieve the requested page. In this case, you might be able to get away with simply checking for GET variables that match article_id.
If the requested page is a match for your GET variable, run a query to see if the article exists. (Obviously, you'll still want to 404 articles that don't exist).
Retrieve the content used to generate the new, more SEO-friendly URLs. This is probably the article title or something similar.
Write some code to generate the new URL. At this point, if everything is working properly, you should be able to print that new URL out to make sure it's correct.
301 redirect to the new URL. Don't 302 it or use any other code; 301 it. This lets search engines know it's the same page and content, but that it has permanently moved.
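Putting those steps together, a minimal PHP sketch (the table and column names are hypothetical; adapt the slug lookup to however your pretty permalinks are generated):

<?php
// Run this before any other logic. Assumes a connected PDO instance
// in $db; the articles/slug names are placeholders for your schema.
if (isset($_GET['article_id']) && ctype_digit($_GET['article_id'])) {
    $stmt = $db->prepare('SELECT slug FROM articles WHERE id = ?');
    $stmt->execute([(int) $_GET['article_id']]);
    $slug = $stmt->fetchColumn();

    if ($slug === false) {
        http_response_code(404); // the article genuinely no longer exists
        exit;
    }

    // Permanent redirect: tells search engines the page has moved.
    header('Location: http://domain.com/' . $slug, true, 301);
    exit;
}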

Pointing multiple domain names at same place

I have a website ranking well in Google. My current domain has dashes in it and looks like so:
this-is-mine.com
I've just also bought
thisismine.com
I'd like to point the latter to my first site, but I don't want it to be classed as duplicate content.
I'm unsure whether I can just do this through 123-reg. Will this affect my Google rankings, or is there a correct way of doing it without penalising myself?
According to the link below, my thoughts are confirmed.
A 301 is fine, as it forwards everything, including PageRank, to the "new" site; in your case, this-is-mine.com.
A 302 could be a problem for SEO.
http://seo-hacker.com/301-302-redirect-affect-seo/
If your current website is ranking well then don't disturb it. There is no benefit in pointing multiple domains at one website. You could instead make a single-page website on the new domain, optimise it for Google, and link it to your old one.
If you still want to do this, then do a 301 redirect, but first make sure the new domain has no spammy backlinks pointing to it.
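A whole-domain 301 in .htaccess might look like this (a sketch, assuming Apache with mod_rewrite and both domains pointing at the same hosting):

RewriteEngine On
# Send every request for the new domain to the established one.
RewriteCond %{HTTP_HOST} ^(www\.)?thisismine\.com$ [NC]
RewriteRule ^(.*)$ http://this-is-mine.com/$1 [R=301,L]

Many registrars (possibly including 123-reg) can also set up this kind of permanent forward from their control panel.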

Dynamically creating URLs for other websites

I'd like to know how websites create URLs that contain other domain names, like these on trafficestimate.com.
I'm guessing it's some .htaccess trick that redirects those domain names to a dynamic page?
Thanks
Your URL contains a GET request. So when someone calls the page http://google.com/search with the parameters hl=en, safe=off, etc., the page can process those parameters. For instance, safe=off means that you want to get back any search result, and q=site:... is your search string. In this case Google will look it up in its database and give you the results. So when you call this URL, there is probably no .htaccess processing done. However, you can process the URL and GET request with .htaccess and, for example, redirect the user to another page.
Maybe you can describe a bit further what exactly you are trying to do or want to know; that would make explaining easier.
EDIT: After reading Gumbo's comment I looked at the Google result page, so maybe your question is about the trafficestimate URLs. They look like http://trafficestimate.com/example.org. This really is a good case for .htaccess: they take the URL and rewrite it to http://www.trafficestimate.com/websites/?domain=example.org. Here you again have a GET request, and an application builds the page.
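As a sketch (guessed from the visible URL structure; the site's actual rules are unknown), a mod_rewrite rule doing that mapping could look like this:

RewriteEngine On
# Internally rewrite /example.org to the script that builds the page.
RewriteRule ^([a-z0-9-]+(\.[a-z0-9-]+)+)$ /websites/?domain=$1 [NC,L]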
Some URL rewriting is probably involved. Otherwise they would have to create an existing file for every possible request.
Using Apache’s mod_rewrite in a .htaccess file is one option. But since the server identifies itself with “Microsoft-IIS/7.5”, they are probably rather using ISAPI_Rewrite, a mod_rewrite derivative for Microsoft’s IIS.
