Htaccess redirect and remove last 5 characters - .htaccess

I have 20k+ indexed pages but ~200 have /feed/ added at the end:
http://www.domain.com/page-ID-TITLE/feed/
ID and TITLE are dynamic.
TITLE can have multiple words, it doesn't have a fixed length: word1-word2-.....
The normal URL is:
http://www.domain.com/page-ID-TITLE/
The problem is that these pages count as duplicate content. How can I redirect the URLs ending in /feed/ to the normal URL?
Thank you for your time!

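Quite simply, actually. You don't need to specify the number of characters, only what they are:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule (.+)/feed/?$ /$1/ [R=301,L]
The substitution is written as /$1/ so that the redirect lands on the normal URL with its trailing slash (this assumes the .htaccess file sits in the document root). While testing on a production site, I would suggest using the [R] flag, which sends a temporary 302, instead of [R=301], because browsers cache permanent redirects. Once it works, switch to [R=301].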
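To check what the pattern captures before touching the live site, the regular expression can be tried outside Apache. The following is only a minimal sketch in Python, assuming Python's `re` behaves like Apache's PCRE for this simple pattern; the function name `strip_feed` and the sample paths are made up for illustration. Note that in .htaccess the path handed to RewriteRule has no leading slash.

```python
import re

# Same pattern as in the RewriteRule; Apache applies it as an unanchored search.
FEED_RULE = re.compile(r"(.+)/feed/?$")

def strip_feed(path):
    """Return what $1 would hold, or None if the rule would not fire."""
    match = FEED_RULE.search(path)
    return match.group(1) if match else None

print(strip_feed("page-123-word1-word2/feed/"))  # page-123-word1-word2
print(strip_feed("page-123-word1-word2/feed"))   # page-123-word1-word2
print(strip_feed("page-123-word1-word2/"))       # None
```

The title length does not matter because (.+) takes everything up to the final /feed segment.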

Related

.htaccess - Heal malformed url

I need a solution. At some point in the past I seem to have generated some "bad" links that only bots follow.
Summary: a malformed URL contains a fake "page" parameter. When there are two "page" parameters, the first one is fake and must be removed.
X is random.
Remove the page parameter only when "?/?page" is detected.
Good: search?page=496
Bad: search?/?page=X
Good: https://example.com/search?page=496
Good: https://example.com/search?page=496&orderBy=oldest
Bad: https://example.com/search?/?page=X&page=496&orderBy=oldest
RewriteCond %{QUERY_STRING} ^(.*)&?^XXX[^&]+&?(.*)$ [NC]
RewriteRule...
Thank you guys!
UPDATE
In the end, I found a solution myself:
RewriteCond %{QUERY_STRING} ^(.*)&?^/\?page=[^&]+&?(.*)$ [NC]
RewriteRule ^/?(.*)$ /search$1?%1%2 [R=301,L]
RewriteCond %{QUERY_STRING} ^/\?page=.+&(page=.*)
RewriteRule ^(search)$ $1?%1 [R=301]
This will do the rewrite for all your URLs that have a fake page parameter followed by a genuine one you want to keep.
To make the last part optional, we would have to wrap &(page=.*) in another set of parentheses and add a ? as quantifier: (&(page=.*))?.
The back reference would then need to change from %1 to %2 (because we only need the inner part, not the &). But for a URL without any real page parameter to keep, there would be no match in this place, and therefore the %2 would not be substituted with anything but added to the URL literally.
So it is better to leave the above as-is, and simply add
RewriteCond %{QUERY_STRING} ^/\?page=.+
RewriteRule ^(search)$ $1 [QSD,R=301]
below those two existing lines. The pattern does not need to be any more specific, because the URLs that have a genuine page parameter at the end have already been handled by the previous two lines. The QSD flag (available since Apache 2.4) simply drops the existing query string, so that https://example.com/search?/?page=20 results in https://example.com/search (which I assume is what you want here, since there is no actual page parameter to keep).
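The interplay of the two rule pairs can be checked outside Apache. This is a minimal sketch in Python, assuming Python's `re` matches these patterns the same way Apache's PCRE does; the function name `healed_query` is hypothetical, and it models only the query string, not the redirect itself.

```python
import re

# Patterns copied from the two RewriteCond lines.
KEEP_REAL_PAGE = re.compile(r"^/\?page=.+&(page=.*)")
FAKE_PAGE_ONLY = re.compile(r"^/\?page=.+")

def healed_query(query_string):
    """Return the query string after the redirect, or the input unchanged if no rule fires."""
    match = KEEP_REAL_PAGE.search(query_string)
    if match:
        # %1: the genuine page parameter and everything after it
        return match.group(1)
    if FAKE_PAGE_ONLY.search(query_string):
        # QSD: the whole query string is dropped
        return ""
    return query_string

print(healed_query("/?page=X&page=496&orderBy=oldest"))  # page=496&orderBy=oldest
print(healed_query("/?page=20"))                          # (empty string)
print(healed_query("page=496&orderBy=oldest"))            # page=496&orderBy=oldest
```

The order matters: the more specific pattern has to be tried first, which is why the QSD rule goes below the existing lines.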

.htaccess to remove arbitrary text in filenames

I've got a client who uploads thousands of images with names like 1057_1.jpg , 1057_2.jpg, 1083_1H.jpg etc - always a number, an underscore, a number and an optional letter. The CMS uses these to link them to relevant entries.
For SEO reasons we want those image filenames to contain some keywords taken from the CMS. So they would become, say, 1057_1-some-keywords-here.jpg. Is there a way, with .htaccess, to keep the filenames the same, but redirect 1057_1-any-arbitrary-words.jpg to 1057_1.jpg? Basically to remove everything from the first dash up to the dot?
Thanks for your help - I must learn htaccess properly sometime but need to find a quick solution for now!
You may try this:
RewriteEngine On
RewriteBase /
#RewriteCond %{REQUEST_FILENAME} !-f
#RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} /([^\-]+)-.+\.([^/]+)/?
RewriteRule .* %1.%2 [R=301,L]
Redirects permanently any URL like:
http://example.com/1057_1-anything.jpg or
http://example.com/any/number/of/folders/1057_1-anything.jpg
To:
http://example.com/1057_1.jpg or
http://example.com/any/number/of/folders/1057_1.jpg
Effectively removing -anything from the last string in the URL-path.
The image name has to be the last string in the URL-path for the rule-set to work.
For a silent mapping, remove R=301 from [R=301,L].
UPDATE:
Originally, anything could not include a period, because the period marked the end of the name and the start of the extension. I have modified the rule-set so that anything may contain any number of periods; only the last one is treated as the extension separator, as the OP requested in a comment.
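The condition can be tried against sample filenames outside Apache. This is a minimal sketch in Python, assuming Python's `re` treats the pattern like Apache's PCRE; `redirect_target` is a made-up helper that joins %1 and %2 and prepends the `/` from RewriteBase.

```python
import re

# Pattern copied from the RewriteCond line.
CONDITION = re.compile(r"/([^\-]+)-.+\.([^/]+)/?")

def redirect_target(request_uri):
    """Return the redirect path built from %1.%2, or None if the condition fails."""
    match = CONDITION.search(request_uri)
    if match is None:
        return None
    return "/" + match.group(1) + "." + match.group(2)

print(redirect_target("/1057_1-some-keywords-here.jpg"))               # /1057_1.jpg
print(redirect_target("/any/number/of/folders/1083_1H-anything.jpg"))  # /any/number/of/folders/1083_1H.jpg
print(redirect_target("/1057_1-some.words.here.jpg"))                  # /1057_1.jpg
print(redirect_target("/1057_1.jpg"))                                  # None
```

One caveat this makes visible: because [^\-]+ stops at the first dash anywhere in the path, a folder name that itself contains a dash would be cut there, so the rule appears to rely on dash-free folder names.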

htaccess make rewrite condition page specific

I have the following in my htaccess which works fine for the page called portfolio:
RewriteCond %{QUERY_STRING} ^_escaped_fragment_=([^&]*)
RewriteRule .* http://%{HTTP_HOST}/portfolio#!%1? [R=301,L,NE]
However, I now need to keep this and also apply the same rewrite to another page called PrivateGallery.
How can I do this without the two clashing? Is there a way to make the condition page-specific, so that I can copy the rule and change the page name accordingly?
Thanks in advance
I've eventually found what I was looking for after countless experiments. I achieved it with this:
RewriteCond %{QUERY_STRING} ^_escaped_fragment_=([^&]*)
RewriteRule .* %{REQUEST_URI}#!%1? [R=301,L,NE]
Because %{REQUEST_URI} holds the path of whichever page was requested, both pages containing escaped fragments now redirect to their own hashbang URLs, and neither page redirects to the other.
This was mainly needed so that I could get Facebook to parse these URLs for a project where I needed like button/fb comments to represent separate image URLs in the document.
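The effect of using %{REQUEST_URI} as the target can be illustrated outside Apache. This is a minimal sketch in Python; the function name `hashbang_target` and the fragment value `image5` are made up, and URL decoding of the fragment is left out.

```python
import re

# Pattern copied from the RewriteCond line.
CONDITION = re.compile(r"^_escaped_fragment_=([^&]*)")

def hashbang_target(request_path, query_string):
    """Return the redirect target, or None if the query string has no escaped fragment."""
    match = CONDITION.search(query_string)
    if match is None:
        return None
    # %{REQUEST_URI} is the requested path, so each page redirects to itself
    return request_path + "#!" + match.group(1)

print(hashbang_target("/portfolio", "_escaped_fragment_=image5"))       # /portfolio#!image5
print(hashbang_target("/PrivateGallery", "_escaped_fragment_=image5"))  # /PrivateGallery#!image5
```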

Can I create a search engine friendly URL from this custom ColdFusion CMS URL?

I have inherited a custom ColdFusion CMS app. The URLs that it creates are horrendous: not at all suitable for SEO, or for readability for that matter. An example of a URL in this CMS is:
http://www.mysite.com/Index2.cfm?a=000003,000010,000019,001335
Each level of the hierarchy is stored in the database as part of that long string of comma-separated values, so the page in this example is 4 levels deep in the CMS hierarchy.
What I would like is a format similar to this:
http://www.mysite.com/level-1/level-2/level-3/level-4
Is this possible? Any help would be greatly appreciated. For what it's worth, we are currently on ColdFusion 6 but will be upgrading to 8 in the near future.
First of all, are you willing to have the index.cfm in the URL? Like: http://www.mysite.com/index.cfm/level-1/level-2/level-3/level-4 ? If not, then you'll need to be doing a rewrite to remove the index.cfm, but still allow CF to process the page. Your .htaccess would look something like this:
RewriteEngine On
# If it's a real path, just serve it
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule . - [L]
# Redirect if no trailing slash
RewriteRule ^(.+[^/])$ $1/ [R=301,L]
# Rewrite URL paths
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-d
RewriteRule ^([a-zA-Z0-9/-]+)$ /index.cfm%{REQUEST_URI} [PT]
Next, you'll need to "catch" the incoming URLs and serve the correct pages based on the SEO-friendly paths. You can read the path from the CGI.path_info variable. It's hard to know what your code should look like without knowing how it currently processes those URL variables, but essentially you'd have some kind of mapping function that takes the SEO-friendly names and substitutes the numeric IDs needed to fetch the content.
The third step is changing any URLs generated by your CMS so that they output the SEO-friendly form. The same mapping happens here, only in reverse.
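The two-way mapping can be sketched as follows. This is illustrative Python rather than ColdFusion, and everything in it is an assumption: the lookup table pairing the level-N names with the IDs from the example URL is invented, and a real CMS would read it from the database and would have to cope with the same name appearing under different parents.

```python
# Hypothetical lookup table: SEO-friendly name -> ID used by the CMS.
SLUG_TO_ID = {
    "level-1": "000003",
    "level-2": "000010",
    "level-3": "000019",
    "level-4": "001335",
}
ID_TO_SLUG = {cms_id: slug for slug, cms_id in SLUG_TO_ID.items()}

def path_info_to_ids(path_info):
    """Turn '/level-1/level-2/' into the comma-separated ID list the CMS expects."""
    slugs = [part for part in path_info.split("/") if part]
    return ",".join(SLUG_TO_ID[slug] for slug in slugs)

def ids_to_path(id_list):
    """Reverse mapping, for the URLs that the CMS generates."""
    return "/" + "/".join(ID_TO_SLUG[cms_id] for cms_id in id_list.split(",")) + "/"

print(path_info_to_ids("/level-1/level-2/level-3/level-4/"))  # 000003,000010,000019,001335
print(ids_to_path("000003,000010,000019,001335"))             # /level-1/level-2/level-3/level-4/
```

An unknown name raises KeyError here; in the real application that case would more likely be turned into a 404 response.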

.htaccess Conditional Redirect When Pattern Does Not Match

I want to enable my users to enter search queries using a URL like this:
www.domain.com/searchterm
or with a trailing slash like this:
www.domain.com/searchterm/
However, I want to capture certain search terms and redirect them to an actual directory. For example, a query like this:
www.domain.com/css/site.css
should actually point to the CSS file, and should not pass "css/site.css" as the search term.
Here's my non working code:
RewriteRule ^/(.+)/?$ /index.php?search=$1 [L]
RewriteRule ^/css/(.+)$ /css/$1 [L]
This doesn't work - can anyone tell me what I'm doing wrong?
Instead of excluding all your existing URLs, it would be a far better solution to use a script like this as a 404 page. While capturing the 404 you could still send a 200 response, but at least it would make your rewrite rules far easier.
Or, if you really want to do it without the 404, use this:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .? /search_script [L]
It's not working because of how your expressions are written and ordered.
First, if you flip those two rules it should work fine. Second, take a look at your first RewriteRule. Your expression is ^/(.+)/?$. Basically, it matches ANYTHING up to the end of the string, so long as it starts with /. Note also that in .htaccess the path matched by RewriteRule has no leading slash, so a pattern starting with ^/ never matches there.
If I were working on this file, I'd move the "search" RewriteRule to the end of the file. It would be evaluated last, and therefore it would have less chance of catching requests meant for other rules.
And as I'm writing this, I see Rick's option, which I would prefer even over my own. I'm not too familiar with RewriteCond yet, but his conditions check whether the request maps to an existing file or directory, and only if it does not is the request handed to the search.
Cheers!
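The decision made by the two RewriteCond lines can be modelled outside Apache. This is a minimal sketch in Python, assuming a plain mapping of URL paths onto a document root; the function name `route` and the temporary directory layout are made up, while /search_script is the target used in the rule above.

```python
import os
import tempfile

def route(doc_root, url_path):
    """Serve existing files and directories as-is; send everything else to the search script."""
    candidate = os.path.join(doc_root, url_path.lstrip("/"))
    if os.path.isfile(candidate) or os.path.isdir(candidate):
        # the !-f / !-d conditions fail, so the rule is skipped
        return url_path
    return "/search_script"

# Small demonstration with a throwaway document root containing css/site.css
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "css"))
    with open(os.path.join(root, "css", "site.css"), "w") as handle:
        handle.write("body {}")
    print(route(root, "/css/site.css"))  # /css/site.css
    print(route(root, "/searchterm/"))   # /search_script
```

With this approach there is no need to list directories such as css by hand, because anything that exists on disk is excluded automatically.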
