Return 410 (Gone) for URLs containing hash symbol - googlebot

An earlier, Flash-based version of the site used URLs of the form http://www.example.com/#!
Google has indexed these URLs, and I need them removed from the search index. According to the Google documentation, I should return a 404 or 410 error for them.
However, I can't do that on the server side because the # and everything after it is not included in the GET request. So there is no way to check for URLs of this form in my .htaccess file.
I can detect those URLs with the JavaScript line below
window.location.hash.match('#!')
then change window.location to a URL that my .htaccess file can match and then use a RewriteRule to return a 410 error.
But will this work with the search bots and/or is there a better way?
Thanks.
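A minimal sketch of the client-side approach described in the question, for illustration only. The path /gone-hashbang is a hypothetical name, not something from the original site; the server would still need a rule that returns 410 for that path. Whether search bots run the script and treat the result as removal of the original #! URL is exactly the open question above, and this sketch does not settle it.

```javascript
// Sketch only: "/gone-hashbang" is a hypothetical path chosen for illustration.
const GONE_PATH = '/gone-hashbang';

// Returns the URL to navigate to when the address uses the old "#!" scheme,
// or null when the address is a normal one and nothing should happen.
function hashbangRedirectTarget(href) {
  const url = new URL(href);
  if (!url.hash.startsWith('#!')) {
    return null;
  }
  // Drop the fragment and point at a path the server can actually see.
  return url.origin + GONE_PATH;
}

// In the browser, run the check once on page load.
if (typeof window !== 'undefined') {
  const target = hashbangRedirectTarget(window.location.href);
  if (target !== null) {
    window.location.replace(target);
  }
}
```

The decision logic is kept in a plain function so it can be checked outside a browser; only the last block touches window.location.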

Related

redirect numerous dynamic urls to home page via .htaccess

I am trying to clean up a previously hacked WordPress site and restore the domain's reputation. The site is on new hosting and now runs a different CMS, but there are hundreds of spam links in Google that I need to get rid of. They look like example.com/votes.php?10054nzwzm75042pw205039
That is, the domain name, then votes.php? followed by an arbitrary mix of numbers and letters.
So how do I redirect ANYTHING that starts with the domain name followed by /votes.php?***
Any help greatly appreciated.
Unless you have multiple domains, you don't need to explicitly check the domain name.
To send a "410 Gone" for any request to /votes.php, whatever the query string, you can do something like the following at the top of your root .htaccess file using mod_rewrite (the RewriteRule pattern is matched against the URL-path only, so the query string does not need to appear in it):
RewriteEngine On
# Serve a 410 Gone for any requests to "/votes.php"
RewriteRule ^votes\.php$ - [G]
A 410 is preferable to a "redirect" if you want to get these URLs removed from the search engines as quickly as possible.
To expedite the removal of these URLs from Google, use Google's Removal Tool as well.
If you redirect these pages to the homepage then it will likely be seen as a soft-404 by Google and these URLs are likely to remain in the search results for a lot longer.
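To see why the pattern needs no mention of the query string, here is a rough JavaScript model of the match. It is not Apache itself, only an approximation under the assumption that the rule lives in the root .htaccess file; the function name is made up for this sketch.

```javascript
// Rough model of a per-directory RewriteRule match; this is not Apache itself.
// In a root .htaccess file the pattern is tested against the URL-path
// without its leading slash, and the query string is not part of that input.
const pattern = /^votes\.php$/;

// Hypothetical helper: true when the rule would answer with 410 Gone.
function wouldBeGone(href) {
  const url = new URL(href);
  const pathForRule = url.pathname.replace(/^\//, '');
  return pattern.test(pathForRule);
}

console.log(wouldBeGone('http://example.com/votes.php?10054nzwzm75042pw205039'));
console.log(wouldBeGone('http://example.com/blog/votes.php?abc'));
```

Because the pattern is anchored at both ends, only /votes.php at the site root matches in this model; a copy of the file in a subdirectory would need its own rule or a looser pattern.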

Can an old Redirect 301 from .htaccess be removed if the source URL is marked as excluded in the Google Search Console?

I'm cleaning up my huge .htaccess file which got bloated over the years.
There are many Redirect 301s in my .htaccess file which are years old. I see a few of the old/source URLs from the .htaccess file marked as excluded in the Google Search Console. Is it safe to remove these entries from the .htaccess file? Can I assume that these redirects are unnecessary now?
Like say,
.htaccess
Redirect 301 /xxxxx https://www.yyyyyy.com
Now, in Google Search Console -> Index -> Coverage report, I see that /xxxxx is marked as Excluded. So am I good to remove the above entry from .htaccess, since Google has already indexed a canonical version of it?

.htaccess rewriting URL to add a character to a URL

I'm doing this to fix an error with my AMP pages. I'm using the Automatic AMP plugin, and this plugin lets you access the AMP pages using 2 different methods
site.com/post/amp/
site.com/post/?amp
Using the AMP Page validator (https://validator.ampproject.org) I see that all AMP pages using just /amp/ get multiple errors, while /?amp is validated correctly.
Unfortunately, Google is checking /amp/ for all my pages, hence I'm getting tons of errors.
What I'd like to know is how to use the .htaccess redirect rule to add the ? to the AMP queries so all /amp/ requests are redirected (with a 301) to /?amp/
I'd appreciate suggestions on this. Thank you
Something like this:
RewriteEngine on
RewriteRule ^(.*)/amp/$ /$1/?amp [R=301,L]
This will redirect any URL ending in /amp/ to the same URL ending in /?amp instead (for example /post/amp/ to /post/?amp), which seems to be what you want. It goes in your root .htaccess file.
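As a way to reason about which URLs such a rule touches, here is a rough JavaScript model of the capture and substitution. It is not Apache itself, and the function name is invented for this sketch. This variant keeps the slash before ?amp so the result has the site.com/post/?amp form listed in the question.

```javascript
// Rough model of the rewrite, not Apache itself. This variant keeps the
// slash before "?amp" so the result has the site.com/post/?amp form.
function ampRedirectTarget(pathname) {
  // Per-directory rules see the path without its leading slash.
  const input = pathname.replace(/^\//, '');
  const match = /^(.*)\/amp\/$/.exec(input);
  if (match === null) {
    return null; // not an /amp/ URL, so no redirect
  }
  return '/' + match[1] + '/?amp';
}

console.log(ampRedirectTarget('/post/amp/'));
console.log(ampRedirectTarget('/post/'));
```

In this model a bare /amp/ at the site root is not matched, because the pattern requires at least a slash before "amp/" once the leading slash has been stripped.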

301 redirects not working properly. Numbers in URL being ignored

I have just replaced my old non-WordPress site with a WordPress site. Now I need to redirect approximately 400 old URLs to their equivalent pages on the new site.
I have already recorded all the old URLs and new URLs, but when I put the code in the .htaccess file I get a strange result.
If the target URL has a number in it, the redirect goes to that URL minus the numbers.
For example:
Redirect 301 /international_organizations/africaamerica_institute http://nyintl.net/international-organizations-in-new-york/2792/africa-america-institute/
Redirects to http://nyintl.net/international-organizations-in-new-york/africa-america-institute/
Which isn't actually a page and thus returns a 404 error.
Anyone have any idea what's going wrong? All my posts on the new site have the month/year syntax in the url, so this means that 95% of my redirects aren't working.
All the urls that don't contain numbers are redirecting perfectly.
I have made sure to put all my redirects ABOVE the WP rewrite rules, but that hasn't made a difference (tried them below as well).

Htaccess - Rewrite and Redirect URL for Kunena Forum in Joomla

I have a website with a feed that shows all new forum posts. The links in the feed have the form url/#postID
I want to redirect all these links to the correct URL without the postID, for better on-site SEO.
Example:
www.mysite.com/forum/category/the-post-titel/#48394 -> ..../the-post-titel/
What would the correct line in .htaccess be to redirect all URLs ending in "/#43244" to just "/"?
You cannot do this via .htaccess: the # and everything after it (the URL fragment identifier) is not transmitted to the server, so .htaccess cannot act on it.
If you are doing this for SEO purposes, the best (and perhaps only) way is to remove the fragment from the links in the feed itself, before people or search engines get to it.
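A small JavaScript sketch of both points, for illustration only. The function names are made up here, and the feed itself is presumably generated server-side by the forum software, so the second function only shows the idea of cleaning a link rather than where that code would live.

```javascript
// The request target a browser sends is the path plus the query string;
// the fragment (everything from "#") stays on the client.
function requestTarget(href) {
  const url = new URL(href);
  return url.pathname + url.search;
}

// Hypothetical cleanup step for feed links: drop the fragment entirely.
function stripFragment(href) {
  const url = new URL(href);
  url.hash = '';
  return url.href;
}

const link = 'http://www.mysite.com/forum/category/the-post-titel/#48394';
console.log(requestTarget(link));
console.log(stripFragment(link));
```

With and without the #48394 part, requestTarget returns the same string, which is why no server-side rule can tell the two links apart.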

Resources