Redirect numerous dynamic URLs to home page via .htaccess

I am trying to clean up a previously hacked WordPress site and repair the domain name's reputation. The site has new hosting and now runs on a different CMS, but there are hundreds of spam links in Google that I need to get rid of. They look like example.com/votes.php?10054nzwzm75042pw205039
That is, the domain name, then votes.php?**** and so on: numbers, letters, all sorts.
So how do I redirect ANYTHING that starts with the domain name followed by /votes.php?***
Any help greatly appreciated.

Unless you have multiple domains, you don't need to explicitly check the domain name.
To send a "410 Gone" for any request to /votes.php, whatever the query string, you can do something like the following at the top of your root .htaccess file using mod_rewrite. The RewriteRule pattern is matched against the URL-path only, so the query string does not need to appear in it:
RewriteEngine On
# Serve a 410 Gone for any requests to "/votes.php"
RewriteRule ^votes\.php$ - [G]
A 410 is preferable to a "redirect" if you want to get these URLs removed from the search engines as quickly as possible.
To expedite the removal of these URLs from Google, use Google's Removal Tool as well.
If you redirect these pages to the homepage then it will likely be seen as a soft-404 by Google and these URLs are likely to remain in the search results for a lot longer.
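If the same .htaccess file does serve several domains, the domain check mentioned at the start of this answer could look roughly like the sketch below. The host name example.com is taken from the question; the optional www. prefix is an assumption about how the site is reached.

```apache
RewriteEngine On

# Assumption: several domains share this .htaccess file and only
# example.com (with or without "www.") carries the spam URLs.
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com$ [NC]

# Serve a 410 Gone for any request to "/votes.php" on that host,
# regardless of the query string.
RewriteRule ^votes\.php$ - [G]
```

The RewriteCond applies only to the RewriteRule that immediately follows it, so requests for other host names fall through to whatever rules come next.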

Related

Nginx URL rewrite to remove folder from URL when its followed by certain subfolders

After upgrading my site, I see that once I go live with the new version, some of the website's URLs for the gallery, blogs and files will not be redirected because of the new structure. There is no way of fixing this within the CMS, so my goal is to use NGINX redirects.
Does anyone know any NGINX rewrite tricks that make such redirects possible?
website.com/forums/blogs/ into website.com/blogs/
website.com/forums/gallery/ into website.com/gallery/
website.com/forums/files/ into website.com/files/
I need the forums part dropped from the URL ONLY when the address is forums followed by blogs, gallery or files. I don't want to lose that Google traffic.
So for example
website.com/forums/blogs/entry123/my-dog/ is redirected to
website.com/blogs/entry123/my-dog/
BUT
website.com/forums/topic/my-dog/
is left alone and keeps working just like before, because the subfolder that follows is not blogs, gallery or files.
I needed this once on Apache, where the following worked, but on Nginx I have no idea.
RewriteRule ^forums/(blogs|gallery|files)/(.*)$ /$1/$2 [L,R=301]
You can try something like
rewrite ^/forums/(blogs|gallery|files)/(.*)$ /$1/$2 permanent;
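Please note that the rewrite directive accepts flags, some of whose meaning depends on where the directive is placed (inside a server or a location block). The permanent flag makes it return a 301 redirect; without a flag the URI is only rewritten internally. Detailed documentation is in the reference for the ngx_http_rewrite_module.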

duplicate URLs in my page, best solution?

I have a website that writes URLs like this:
mypage.com/post/3453/post-title-name-person
In fact, the only important parts are post and the ID (3453). I add the title just for SEO.
I changed some titles recently, but people can still use the old URLs to access the pages, because I only use the ID to open the page, so:
mypage.com/post/3453/post-title-name-person
mypage.com/post/3453/name-person
...
Will open the same page.
Is this wrong? Google Webmaster Tools tells me that I have 8765 duplicate pages. To try to solve this, I am redirecting old titles to post/id/current-title, but it seems that Google doesn't understand the redirect and still reports duplicates.
Should I return "not found" if the title doesn't match the one in the database? (That could be a problem, because links that people have shared won't open.) Or what?
Maybe Google has not processed your redirects yet. It may take several weeks, and sometimes several months, to process all pages, especially if they are not revisited often. Make sure your redirects are 301 (permanent) and not 302 (temporary).
That being said, there is a better method than redirects for duplicate pages: the canonical tag. If you can, implement it. It carries less risk of mixing up redirects.
Google can pick up your new URLs only after the 301 redirects are in place, for example through the .htaccess file. Always make sure each 301 redirect is correct and maps one-to-one onto the new URL. Once that is done, fetch the new URLs via Google Search Console so that Google indexes them faster.

Redirects required for pages no longer in sitemap

I have a relatively new site that has just started to pick up a bit of traction in the SERPs. My problem is that I published it and had it indexed with PHP URL extensions, as follows:
www.example.com/page.php
www.example.com/product.php
And so on. Obviously it is a fairly easy matter of editing the .htaccess file to remove these extensions. So I will end up with:
www.example.com/page
www.example.com/product
No problems there.
Because the site is still quite small, I can easily change all the links manually to drop the .php extension, and then update the sitemap. So Google, and all users, should have no way of reaching the .php pages, although of course they still exist if you were to manually type them in.
But because Google has a 'record' of these pages existing (even though there are no direct links to reach them now), do I need to implement 301 redirects from the .php pages to the new non-.php pages? That is, will Google try to crawl pages that are no longer in the sitemap but once existed? In other words, since www.example.com/page.php can still be reached even though no link on the site or in the sitemaps will take you there, would I get penalised for having duplicate content? Are 301 redirects basically required when doing this kind of thing, even if there are no links to the content anymore?
Thanks very much.
It is better to keep 301 redirects in place for some time (a month or two), even though you can change all your links to the non-.php URLs. That way any residual URLs still out there (there will always be some) are taken care of, and Google will index the non-.php URLs via your 301 redirects. Once your logs (depending on your system) show that no more requests for the OLD URLs are coming in, you can remove the 301 redirects. This is a gentler way of moving all your old URLs than abruptly throwing 404s, and a 301 helps transfer the SEO value of the old URLs to the new ones.
Another option, if you want your .php and non-.php pages to coexist, is rel="canonical". It tells search engines which of the two URLs is the preferred version, so they are not treated as duplicates.
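As a rough illustration of such a 301 setup, the following .htaccess sketch redirects direct requests for .php URLs to their extensionless form and then maps the extensionless URL back to the PHP file internally. It assumes mod_rewrite is available and that the PHP files live where the URL path suggests; it is a common pattern, not necessarily what the asker's site already uses.

```apache
RewriteEngine On

# Externally redirect direct requests such as /page.php to /page (301).
# THE_REQUEST holds the original request line, so requests that were
# rewritten internally by the rule below do not trigger another redirect.
RewriteCond %{THE_REQUEST} \s/+(.+?)\.php[\s?] [NC]
RewriteRule ^ /%1 [R=301,L]

# Internally map /page to page.php when that file exists.
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^(.+)$ $1.php [L]
```

Note that this redirects every request method, so forms that POST directly to a .php URL would need to be pointed at the extensionless URL as well.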

Block Bots from crawling one of my sites on a multistore multidomain prestashop

Hello, I have a multistore, multidomain PrestaShop installation with the main domain example.com. I want to block all bots from crawling a subdomain site, subdomain.example.com, made for resellers, where they can buy at lower prices, because its content duplicates the original site, and I am not exactly sure how to do it. Usually, if I want to block bots for a site, I would use
User-agent: *
Disallow: /
But how do I use it without hurting the whole store? And is it possible to block the bots from .htaccess too?
Regarding your first question:
If you don't want search engines to gain access to the subdomain, using a robots.txt file ON the subdomain (sub.example.com/robots.txt) is the way to go. Don't put it on your regular domain (example.com/robots.txt); see the Robots.txt reference guide.
Additionally, I would verify both domains in Google Search Console. There you can monitor and control the indexation of the subdomain and main domain.
Regarding your second question:
I've found an SO thread which explains what you want to know: Block all bots/crawlers/spiders for a special directory with htaccess.
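Because a multistore installation typically serves all domains from one document root, a single robots.txt file would apply to every store. One possible way around that, sketched below, is to have .htaccess answer /robots.txt with a different file on the subdomain only. The file name robots-subdomain.txt is hypothetical; it would contain the two-line "User-agent: *" / "Disallow: /" block from the question.

```apache
RewriteEngine On

# Only for the reseller subdomain: serve a separate, blocking robots file.
# Assumption: robots-subdomain.txt exists in the shared document root.
RewriteCond %{HTTP_HOST} ^subdomain\.example\.com$ [NC]
RewriteRule ^robots\.txt$ robots-subdomain.txt [L]
```

These lines would need to sit above the rewrite rules that PrestaShop generates, so that the request for robots.txt is handled before the store's own routing.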
We use a canonical URL to tell the search engines where to find the original content.
https://yoast.com/rel-canonical/
A canonical URL allows you to tell search engines that certain similar
URLs are actually one and the same. Sometimes you have products or
content that is accessible under multiple URLs, or even on multiple
websites. Using a canonical URL (an HTML link tag with attribute
rel=canonical) these can exist without harming your rankings.

Changing website name using mod_rewrite and htaccess

I want to rename a folder on my site from http://mywebsite.com/myfolder/ to http://mywebsite.com/mynewfolder/. The URLs for the old folder name are all indexed by Google, and many other sites have linked to mine. What is the correct way to ensure that visitors coming to the site via the old folder name will now see the new folder name? Should I change the name of the folder on my server and then use mod_rewrite to force the new URL (folder name)?
This seems to work, but is it correct: Redirect 301 /myfolder /mynewfolder
Also, for SEO, would it be better to use: /my-folder-name/
RewriteEngine on
RewriteRule ^oldfolder/(.*)$ /newfolder/$1 [R=301,NC,L]
It is widely acknowledged that hyphenating (-) your URLs has a small impact on SEO, as it separates the keywords in your URL rather than having them read as one long string. That said, I'm pretty sure Google is clever enough to have a go at working this out for itself. I don't suppose it would hurt, and at the very least it makes the URL easier for your users to read.

Resources