Over-zealous RedirectMatch in .htaccess file

I just converted a website that had thousands of bad links, and I have been cleaning up tens of thousands of indexing errors, some with redirect 301s and some with RedirectMatch. However, I am getting one error that puzzles me.
Either of these two lines of code in the .htaccess file
RedirectMatch 301 /faqs(.*) http://www.belleviewanimalclinic.com/pet-care-faqs/
RedirectMatch 301 /faq(.*) http://www.belleviewanimalclinic.com/pet-care-faqs/
cause this page
http://www.belleviewanimalclinic.com/denver-veterinarian-articles/faqs-about-pet-dentistry/
to be redirected here
http://www.belleviewanimalclinic.com/pet-care-faqs/
There are lots of old URLs that started with /faqXXXXXX that need to be redirected to the new FAQ page. However, I don't want this to apply to the article above.
Is there a way I can rewrite the RedirectMatch rules so that only URLs that start with /faq are redirected, rather than somewhere in the middle of the request URL?
Thanks!

Use a caret to match the start of a path:
RedirectMatch 301 ^/faqs?(.*) http://www.belleviewanimalclinic.com/pet-care-faqs/
I merged the two lines into one by making the "s" optional (by following it with a "?"). The caret should make it so the path has to begin with faq, not just include it.
Note: I initially removed the leading slash, since RewriteRule patterns in .htaccess match without it. RedirectMatch, however, matches against the full URL-path, which does begin with a slash, so the slash belongs in the pattern.
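The effect of the caret can be checked outside Apache. The following is a minimal sketch that uses Python's re module as a stand-in for Apache's regex engine (an assumption; the syntax used here behaves the same way in both). The helper name redirects and the sample path /faq12345 are hypothetical.

```python
import re

# The original, unanchored pattern and the anchored replacement.
UNANCHORED = re.compile(r"/faq(.*)")
ANCHORED = re.compile(r"^/faqs?(.*)")

def redirects(pattern, url_path):
    """Return True if the pattern matches somewhere in the URL-path.

    re.search is used because an unanchored regex may match at any
    position, which is the behaviour that caused the unwanted redirect.
    """
    return pattern.search(url_path) is not None

article = "/denver-veterinarian-articles/faqs-about-pet-dentistry/"
old_faq = "/faq12345"  # hypothetical old URL of the form /faqXXXXXX

print(redirects(UNANCHORED, article))  # True: "/faq" occurs mid-path
print(redirects(ANCHORED, article))    # False: path does not start with /faq
print(redirects(ANCHORED, old_faq))    # True
```

Only the anchored pattern leaves the article URL alone while still catching the old /faq... URLs.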

Related

How to redirect double URL to single URL with htaccess

Google Search Console is showing 404 Page Not Found error for
https://example.com/page/https://example.com/page/
and the link is coming from an external website.
I want to redirect with .htaccess:
https://example.com/page/https://example.com/page/
to
https://example.com/page/
Can anyone help me in this regard?
Try the following mod_rewrite directives at the top of your .htaccess file:
RewriteEngine On
RewriteRule ^(.*?)https?:/ /$1 [R=301,L]
This just removes any trailing part on the URL-path that starts http:/ (or https:/).
UPDATE: The ? in the capturing subpattern (.*?) makes it non-greedy, so it only captures up to the first occurrence of https:/ and discards the rest, rather than up to the last occurrence (greedy) and looping (redirect loop) until all occurrences of https:/ were removed.
Additional notes:
First test with 302 (temporary) redirect to make sure it works. Only change to 301 when confirmed, to avoid caching issues.
The URL-path that is matched by the RewriteRule pattern has already had sequences of slashes reduced to single slashes, so you can't match // (double slash) here (but I don't think you need to).
If there are query strings involved then you may need a slightly different approach and another directive, since the query string itself (as opposed to the URL-path) might contain the "repeated URL" that needs to be removed (we would need to see an example first). The RewriteRule pattern matches against the URL-path only, not the query string.
On Windows: If the (scheme and) colon (:) appears in the first path segment (ie. the malformed link is for the document root) then Apache will generate a 403 Forbidden before .htaccess is able to redirect. There is nothing you can do to avoid this since it is a limitation of the OS (colons are not allowed in filesystem paths - the 403 occurs when Apache tries to map the URL to a filesystem path). This does not happen on Linux. For example: https://example.com/https://example.com/.
UPDATE: If you are not seeing a redirect, just a 404, then you may need to enable additional pathname information (PATH_INFO) on your URLs. For example, at the top of your .htaccess file:
AcceptPathInfo On
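The difference between the non-greedy (.*?) and a greedy (.*) can be seen with a small sketch. It uses Python's re module as a stand-in for Apache's regex engine (an assumption), and the sample paths are written the way the RewriteRule pattern would see them in .htaccess: no leading slash, and // already reduced to /. The helper name redirect_target is hypothetical.

```python
import re

LAZY = re.compile(r"^(.*?)https?:/")   # pattern from the answer
GREEDY = re.compile(r"^(.*)https?:/")  # greedy variant, for comparison

def redirect_target(pattern, url_path):
    """Return the path the rule would redirect to ("/" + $1), or None if no match."""
    m = pattern.search(url_path)
    return "/" + m.group(1) if m else None

once = "page/https:/example.com/page/"
twice = "page/https:/example.com/page/https:/example.com/page/"

print(redirect_target(LAZY, once))     # /page/
print(redirect_target(LAZY, twice))    # /page/
print(redirect_target(GREEDY, twice))  # /page/https:/example.com/page/
print(redirect_target(LAZY, "page/"))  # None: a clean URL is left alone
```

With the greedy variant, a URL that repeats more than once needs a second redirect to be cleaned up; the non-greedy pattern gets to /page/ in one step.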

How can I 301 redirect a subfolder and everything after it?

I am trying to redirect a subfolder as well as anything after it to the home page.
For example:
example.com/subfolder/extra-stuff > example.com
The extra-stuff is constantly changing and auto generated, so I want the redirect to remove that as well.
I am using:
Redirect 301 /subfolder(.*) http://www.example.com
However, this will result in http://www.example.com/extra-stuff.
Is there a way I can say if /subfolder(and anything else after subfolder) redirect to home?
Thanks for any suggestions!
The Redirect directive uses simple prefix-matching and everything after the match is copied onto the end of the target URL (which is what you are seeing here). However, the Redirect directive also does not support regex syntax, so a "pattern" like (.*) on the end will actually match the literal characters (, ., * and ) - which shouldn't have worked in your example?!
You'll need to use RedirectMatch instead (also part of mod_alias), which does use regex, and is not prefix matching.
For example:
RedirectMatch 301 ^/subfolder http://www.example.com/
Any request that starts /subfolder will be redirected to http://www.example.com/ exactly.
You'll need to clear your browser cache before testing.
You tagged your question "Magento" (which is probably using mod_rewrite). You should note, however, if you are already using mod_rewrite for rewrites/redirects then you should probably be using mod_rewrite instead of mod_alias to do this redirect, since you can potentially get conflicts.
For example, the equivalent mod_rewrite directive would be:
RewriteRule ^subfolder http://www.example.com/ [R=301,L]
Note there is no slash prefix on the RewriteRule pattern. This would need to go near the top of your .htaccess file.
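To make the difference between the two mod_alias directives concrete, here is a rough model in Python. It is a sketch of the behaviour described above, not Apache's implementation; the function names redirect_prefix and redirect_match are hypothetical, and Python's re module stands in for Apache's regex engine.

```python
import re

def redirect_prefix(url_path, prefix, target):
    """Rough model of Redirect: prefix match on whole path segments,
    with everything after the prefix appended to the target."""
    if url_path == prefix or url_path.startswith(prefix.rstrip("/") + "/"):
        return target + url_path[len(prefix):]
    return None

def redirect_match(url_path, pattern, target):
    """Rough model of RedirectMatch with a fixed target (no $1 in it):
    nothing from the request is appended."""
    return target if re.search(pattern, url_path) else None

request = "/subfolder/extra-stuff"

print(redirect_prefix(request, "/subfolder", "http://www.example.com"))
# http://www.example.com/extra-stuff
print(redirect_match(request, r"^/subfolder", "http://www.example.com/"))
# http://www.example.com/
```

Note that ^/subfolder also matches a path such as /subfolder-old. If that matters, a pattern like ^/subfolder(/|$) restricts the match to that directory.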

301 Redirects behaving strangely

Running into this issue where the redirect in the second line is redirecting to the URL in the first line.
Redirect 301 /academics/degrees http://mydomain.edu/folder1/location1/
Redirect 301 /academics/degrees/phd http://mydomain.edu/folder1/location2/
At first I thought it had something to do with the locations to be redirected containing hyphens, but haven't been able to find anything on that.
Does it have something to do with the locations to be redirected sharing the same folder/permalink structure?
I've never encountered this before and am totally lost. I tried RedirectMatch but that didn't have any effect.
This happens because Redirect uses prefix matching: /academics/degrees matches both URLs, so the rule for /academics/degrees/phd never fires. Either change the order of your rules or, better, use RedirectMatch, whose regex lets you match only the desired URL pattern:
RedirectMatch 301 ^/academics/degrees/?$ http://mydomain.edu/folder1/location1/
RedirectMatch 301 ^/academics/degrees/phd/?$ http://mydomain.edu/folder1/location2/
Make sure to clear your browser cache before testing this change.
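The following sketch models why the first rule shadows the second, and why the anchored patterns do not. It assumes the rules are tried in file order and that the first match wins; the function names are hypothetical, and Python's re module stands in for Apache's regex engine.

```python
import re

PREFIX_RULES = [
    ("/academics/degrees", "http://mydomain.edu/folder1/location1/"),
    ("/academics/degrees/phd", "http://mydomain.edu/folder1/location2/"),
]

REGEX_RULES = [
    (r"^/academics/degrees/?$", "http://mydomain.edu/folder1/location1/"),
    (r"^/academics/degrees/phd/?$", "http://mydomain.edu/folder1/location2/"),
]

def first_prefix_rule(url_path):
    """Index of the first Redirect-style rule whose prefix matches, else None."""
    for i, (prefix, _target) in enumerate(PREFIX_RULES):
        if url_path == prefix or url_path.startswith(prefix + "/"):
            return i
    return None

def first_regex_target(url_path):
    """Target of the first RedirectMatch-style rule that matches, else None."""
    for pattern, target in REGEX_RULES:
        if re.search(pattern, url_path):
            return target
    return None

print(first_prefix_rule("/academics/degrees/phd"))   # 0: the first rule wins
print(first_regex_target("/academics/degrees/phd"))  # http://mydomain.edu/folder1/location2/
print(first_regex_target("/academics/degrees/"))     # http://mydomain.edu/folder1/location1/
```

Because each anchored pattern matches exactly one URL (with or without a trailing slash), the order of the two RedirectMatch lines no longer matters.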

htaccess 301 redirect error

I have set up a redirect:
RedirectMatch 301 /data(.*) http://www.site.com/sites/default/files/datasheets$1
and I am getting the following error:
http://www.site.com/sites/default/files/datasheetssheetssheetssheetssheetssheetssheetssheetssheetssheetssheetssheetssheetssheetssheetssheets/doc3542.pdf
When I rename the datasheets directory to something else it works, but this is not an option.
Is this an Apache error, or am I doing something wrong?
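Your RedirectMatch regular expression /data(.*) is not anchored, so it matches /data anywhere in the URL-path, including the /datasheets part of the redirect target. Each redirected request therefore matches the rule again, and the redirect repeats until the client gives up.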
What the complete redirect rule will look like depends on your use-case. The following rule takes care of the endless loop issue and redirects the content following /data/ to the new structure at http://www.site.com/sites/default/files/datasheets/.
RedirectMatch 301 ^/data/(.+) http://www.site.com/sites/default/files/datasheets/$1
/data/my-cool-file =>
http://www.site.com/sites/default/files/datasheets/my-cool-file
The (.*) portion that you have after /data is matching sheets in your URL. You are then taking that match and appending it to the redirect target. That's what's giving you the repeating word. I'm guessing you're also redirecting to your own site, which is why it's repeating so many times.
What are you expecting to come after data that you want to append to the redirect? If it's a query string, nothing extra is needed: mod_alias passes the query string through to the target automatically. RedirectMatch does not accept flags such as [QSA]; those belong to mod_rewrite's RewriteRule.
Example:
RedirectMatch 301 /data/(.*) http://www.site.com/sites/default/files/datasheets/$1
Also, consider that you are telling everyone that any page that starts with data in any directory shouldn't exist, yet you are redirecting them to a page that matches the very same pattern you are supposedly getting rid of. You probably need to expand the regex to only match what you intend.
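The loop can be reproduced with a short simulation. This is a sketch only: it assumes the target directory is named datasheets (as in the reported URL), works on URL-paths rather than full URLs, and uses Python's re module as a stand-in for Apache's regex engine. The function name follow is hypothetical.

```python
import re

def follow(pattern, target_template, url_path, max_hops=5):
    """Apply one RedirectMatch-style rule repeatedly and return every path visited.

    "$1" in target_template is replaced by the first captured group.
    """
    hops = [url_path]
    for _ in range(max_hops):
        m = re.search(pattern, hops[-1])
        if m is None:
            break
        hops.append(target_template.replace("$1", m.group(1)))
    return hops

# Original rule: unanchored, so it also matches its own target.
for path in follow(r"/data(.*)", "/sites/default/files/datasheets$1",
                   "/data/doc3542.pdf", max_hops=3):
    print(path)
# /data/doc3542.pdf
# /sites/default/files/datasheets/doc3542.pdf
# /sites/default/files/datasheetssheets/doc3542.pdf
# /sites/default/files/datasheetssheetssheets/doc3542.pdf

# Anchored rule: the target no longer matches, so there is exactly one redirect.
print(follow(r"^/data/(.+)", "/sites/default/files/datasheets/$1",
             "/data/doc3542.pdf"))
# ['/data/doc3542.pdf', '/sites/default/files/datasheets/doc3542.pdf']
```

Each pass through the unanchored rule captures "sheets..." from the previous target and appends it again, which is where the growing datasheetssheetssheets... URL comes from.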

Massive .htaccess Redirect

I have a problem with some redirects.
I tried to use Redirect 301 /link/link/link to /link/link
Is there an easier way to do this? There are 100+ links I need to redirect.
Like this
/blog/category/energy-savings/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing
/blog/category/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/
/blog/category/uncategorized/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/about
/blog/category/uncategorized/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/about
/blog/category/uncategorized/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing
/blog/category/uncategorized/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/about
/blog/category/water-conservation/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing
/blog/category/water-conservation/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/plumbing/about
To the Homepage?
Thanks
The directive:
Redirect 301 /link/link/link /link/link
Doesn't do what you think it does. It maps two nodes together, meaning a request for:
/link/link/link/foo/bar.html
gets redirected to:
/link/link/foo/bar.html
So maybe you need to fix that. You could try:
RedirectMatch 301 ^/link/link/link/?$ /link/link
so the nodes aren't connected like in the Redirect directive. As for trying to "fix" this problem of yours, you can try:
RedirectMatch 301 ^/([^/]+)/([^/]+)/([^/]+)/([^/]+)/\4/ /
This redirects anything that looks like /blah1/blah2/blah3/same/same/same/ etc. to the homepage at /. The \4 is a backreference to the fourth captured path segment, so the rule fires whenever the fourth segment is immediately repeated as the fifth.
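Before deploying a backreference pattern like this, it may help to check which URLs it catches. The sketch below uses Python's re module as a stand-in for Apache's regex engine (an assumption; \4 means the same thing in both). The helper name is_repeated and the shortened sample paths are hypothetical.

```python
import re

# Same pattern as the RedirectMatch above: four path segments,
# then the fourth one again, followed by a slash.
REPEATED = re.compile(r"^/([^/]+)/([^/]+)/([^/]+)/([^/]+)/\4/")

def is_repeated(url_path):
    """Return True if the fourth path segment is immediately repeated."""
    return REPEATED.search(url_path) is not None

print(is_repeated("/blog/category/energy-savings/plumbing/plumbing/plumbing/about"))  # True
print(is_repeated("/blog/category/plumbing/plumbing/plumbing/"))                      # True
print(is_repeated("/blog/category/energy-savings/plumbing/"))                         # False
print(is_repeated("/blog/category/plumbing/plumbing/"))                               # False
```

The last case shows a limit of the pattern: it compares only the fourth and fifth segments, so a single repeat at the third and fourth positions is not caught. The repeated segment must also be followed by a slash.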
