.htaccess file - forward everything, but not? - .htaccess

First time user, been looking all night.
We recently changed our site from .net to wordpress. We transferred over half of the news articles and not the other half. So now we get old users coming to the site and getting a 404.
The news articles that exist in the wordpress site have been reditected and work fine, for example,
www.example.com/news/transfered-news-story.aspx
redirects to
www.example.com/blog/news/transfered-news-story
this was done manually.
What I need help with is if someone comes to the site with any other request, e.g.
www.example.com/news/this-didnt-get-moved.aspx
or
www,example.com/news/anything-else
or
www.example.com/news/2010/02
all just gets redirected to
www.example.com/blog/news
I have been reading on and off for a couple of weeks and tried a few things but they all append the additional stuff on the end of the redirected string.
so www.example.com/news/my-stuff-ok
becomes www.example.com/blog/news/my-stuff-ok (and I want to drop the my-stuff-ok)
I hope you get what I'm after, any help would be very much appreciated.
Thanks
Phil

You can simply write a directive that converts a 404 to a url (documentation):
ErrorDocument 404 /blog/news
However, you really should go through the motions of adding manual redirects (permanent redirect) to the new url for each of the other articles because you will take a considerable SEO hit if those urls no longer serve up the content that was linked by the search engine.

Related

redirect numerous dynamic urls to home page via .htaccess

I am trying to clean up a previously hacked WordPress site, and domain name reputation, the site has new hosting and is now on a different CMS system, but there are hundreds of spam links in Google I need to get rid of, they look like example.com/votes.php?10054nzwzm75042pw205039
Domain name, then votes.php?**** etc.. Numbers letters all sorts.
So how do I redirect ANYTHING that starts with the domain name then /votes.php?***
Any help greatly appreciated
Unless you have multiple domains, you don't need to explicitly check the domain name.
To send a "410 Gone" for anything that contains /votes.php in the URL-path (and any query string), you can do something like the following at the top of your root .htaccess file using mod_rewrite:
RewriteEngine On
# Serve a 410 Gone for any requests to "/votes.php"
RewriteRule ^votes\.php$ - [G]
A 410 is preferable to a "redirect" if you want to get these URLs removed from the search engines as quickly as possible.
To expedite the process of URL removal from Google then use Google's Removal Tool as well.
If you redirect these pages to the homepage then it will likely be seen as a soft-404 by Google and these URLs are likely to remain in the search results for a lot longer.

duplicate URLs in my page, best solution?

I have a website that write URLs like this:
mypage.com/post/3453/post-title-name-person
In fact, what is important is the post and ID part (3453). The title I just add for SEO.
I changed some title names recently, but people can still using the old URL to access, because I just get the ID to open the page, so:
mypage.com/post/3453/post-title-name-person
mypage.com/post/3453/name-person
...
Will open the same page.
Is it wrong? Google webmaster tools tells me that I have 8765 duplications pages. So, to try to solve this I am redirecting old title to post/id/current-title but it seems that Google doesn't understand this redirecting and still give me duplications.
Should i redirect to not found if title doesn't match with the actual data base? (But this can be a problem because links that people shared won't open) Or what?
Maybe Google has not processed your redirections yet. It may take several weeks and sometimes several months to process all pages, especially if they are not revisited often. Make sure your redirects are 301 and not 302 (temporary).
That being said, there is a better method than redirections for duplicate pages: the canonical tag. If you can, implement it. There is less risk to mix up redirections.
Google can pick your new URL's only after the implementation of 301 redirection through .htaccess file. You should always need to remember that 301 re-direct should be proper and one to one to the new url. After this implementation you need to fetch those new URL via Google Search console so that Google index those URL's fast.

Block or redirect website page URLs using .htaccess

I am having some issues with spam links visiting my site returning a 404 error.
My site was hacked with a secret spam links folder on public_html that redirected users to pornographic sites, those links were plastered across the internet. I have since remedied the malware issue, but have several hundred visitors hitting a 404 page because these links no longer exist, messing up all my analytics accounts, using bandwidth, etc.
I have searched for a way to block (so that they never hit my website) anyone that tries to access these URL paths, but cannot possibly redirect every single link (there were over 2000) using a wildcard, or something. My search led me here: Block Spam Referrer Traffic and it is not quite the solution I need.
The searches go to pages like this: www.mywebsite.com/spampage/morespam/ (which have been deleted and are now 404 errors)
There are several iterations of the /spampage/ and /morespam/ urls.
The referrer is generally a google search, so I can't block the referrer using .htaccess. I'd like to somehow block www.mywebsite.com/spampage/*/ and all iterations.
Apologies, I am by no means a programmer. I do appreciate any help that can be offered.
Update#1:
Seems that perhaps the best way is to block these links/directories using the robots.txt file, I have done so and will report back if I have success!
Update#2:
Reporting back. I am new to this all, so I was going about the solution wrong in my original question. Essentially, I found that I needed all of the links de-indexed, as they were generating all the traffic by being indexed by google. I was able to request de-indexing of the directories in question manually through the google webmaster tools account. One requirement for de-indexing was to have the robots.txt on the site block the directories in question from being crawled. Once I did that I submitted the request to remove the directory from the google index. Those pages were taken off in about 3 hours by google (thanks google!), so it was pretty quick once I found out the proper way to go about it. No .htaccess editing needed. Once the pages were no longer index, traffic went back down to normal levels and my keywords, etc, will be back to normal.

301 redirect all ugly permalinks from old site to new site

So I overhauled a complete website the other day and found some of the old pages snippets in the google search results. The old page had an ugly link structure such as domain.com/index.php?article_id=123. The new site uses pretty permalinks such as domain.com/pagetitle.
Is there a piece of code I could put into the .htaccess file in order to redirect all ugly permalinks to the new site?
Edit
Additional info: The old links don't exist anymore. The old site and the new one's structure differs a lot, not all contents from the all site were adapted. Main problem is that I don't want the old links in the google search results to always throw a 404 at the user.
Maybe something of a
RedirectMatch ^/index.php?$ http://www.example.com/somepage
This will redirect all pages starting from index.php to another location
I don't have the rep to comment on the other answer, but that is a very improper solution if you value your SEO at all. A redirect is your way of telling Google "I've got the same page, I just moved it". There's a much better way to do this that won't negatively affect your SEO at all.
You should create some logic to redirect those old links to your new links.
Here's an example of how you could do it:
Go to the beginning of your program, before any logic takes place.
Use code to retrieve the requested page. In this case, you might be able to get away with simply checking for GET variables that match article_id.
If the requested page is a match for your GET variable, run a query to see if the article exists. (Obviously, you'll still want to 404 articles that don't exist).
Retrieve the content used to generate the new, more SEO-friendly URL's. This is probably the article title or something.
Write some code to generate the new article title. At this point, if this is working properly, you should be able to system print that new URL to make sure it's correct.
301 redirect to the new URL. Don't 302 or any other number, 301 redirect it. This lets search engines know it's the same page and content, but it has permanently moved.

Magento SEO URL's suddenly changed, can't reenable

I’m fairly new to the Magento platform but I have a decent amount of experience in web development on apache servers.
A few days ago I was asked to look into an issue that was first made aware of with failing filters.
I had a look at the google analytics data and it seems the SEO friendly URLs have all stopped displaying. The navigation URLs still use friendly words however on the page return the URL is redirected to a basic catalog URL.
http://www.camera-camera.com/cameras-and-accessories.html
instead now it goes to
https://www.camera-camera.com/index.php/catalog/category/view/id/9
I checked the admin config. The Web > SEO URL rewrites are set to YES
I toggled them to No saved and back to yes then saved. Tried clearing the catalog URL rewrite cache
Checked the htaccess file and it hasn’t been touched for months.
Emptied the core rewrite table and reindexed it.
So I’m outta ideas now, was hoping some of you more experienced users can have some input as to what else I can check.
I also found it strange that the URL is now ignoring postback parameters. If you look at their filters they are simply an a link to the same page with a post parameter. This gets striped and ignored now might be related?
A file restore was on the day it happened. Any files I should check it against?
Thanks for any help you can provide !
I just discovered that it was related to HTTPS. I didn't notice but seems the site keeps redirecting to HTTPS even though the filter links etc are pointing to HTTP, in the redirect the parameters are dropped. Now to figure out why its going into HTTPS

Resources