.htaccess - Heal malformed url - .htaccess

I need a solution. It seems that at some point in the past I generated some "bad" links that only bots follow.
Summary: a malformed URL contains a fake "page" parameter. When there are two "page" parameters, the first one is fake and must be removed.
X is a random value.
Remove the page parameter only when "?/?page" is detected.
Good: search?page=496
Bad: search?/?page=X
Good: https://example.com/search?page=496
Good: https://example.com/search?page=496&orderBy=oldest
Bad: https://example.com/search?/?page=X&page=496&orderBy=oldest
RewriteCond %{QUERY_STRING} ^(.*)&?^XXX[^&]+&?(.*)$ [NC]
RewriteRule...
Thank you guys!
UPDATE
In the end, I found a solution myself:
RewriteCond %{QUERY_STRING} ^(.*)&?^/\?page=[^&]+&?(.*)$ [NC]
RewriteRule ^/?(.*)$ /search$1?%1%2 [R=301,L]

RewriteCond %{QUERY_STRING} ^/\?page=.+&(page=.*)
RewriteRule ^(search)$ $1?%1 [R=301]
This will do the rewrite for all your URLs that have the extra page parameter you want to keep.
To make the last part optional, we would have to wrap &(page=.*) in another set of parentheses and add a ? as quantifier: (&(page=.*))?.
Then the back reference would need to be changed from %1 to %2 (because we only need that inner part, we don't want the &). But for your URL without any real page parameter to keep, there would be no match in this place, and therefore the %2 would not be substituted with anything, but added to the URL literally.
So better to leave the above as-is, and simply add
RewriteCond %{QUERY_STRING} ^/\?page=.+
RewriteRule ^(search)$ $1 [QSD,R=301]
below those two existing lines. The pattern does not need to be any more specific, because the URLs that have a genuine page parameter at the end have already been handled by the previous two lines. The QSD flag simply drops the existing query string, so that https://example.com/search?/?page=20 results in https://example.com/search (which I assume is what you wanted here, because there is no actual page parameter to keep, correct?)
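To see how the two conditions divide the work, here is a rough Python sketch that applies the same two regexes to a query string. It is only a model of the pattern matching, not of Apache itself; the function name heal and the returned strings are made up for illustration.

```python
import re

# Same patterns as the two RewriteCond lines above.
KEEP_REAL_PAGE = re.compile(r"^/\?page=.+&(page=.*)")
FAKE_PAGE_ONLY = re.compile(r"^/\?page=.+")

def heal(path, query):
    """Return the redirect target, or None when no rule applies (hypothetical helper)."""
    if path != "search":
        return None
    m = KEEP_REAL_PAGE.search(query)
    if m:
        # First rule: keep everything from the genuine page parameter onwards.
        return f"/{path}?{m.group(1)}"
    if FAKE_PAGE_ONLY.search(query):
        # Second rule: nothing worth keeping, drop the query string (QSD).
        return f"/{path}"
    return None

print(heal("search", "/?page=X&page=496&orderBy=oldest"))
print(heal("search", "/?page=20"))
print(heal("search", "page=496"))
```

One thing this makes visible: a query string such as /?page=X&orderBy=oldest has no genuine page parameter, so it falls through to the second rule and orderBy is dropped along with the fake parameter.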

Related

Correct .htaccess RewriteCond?

I have spent hours looking for a solution, but .htaccess rules seem way over my head. I have this rule:
RewriteRule ^(.*)$ wikka.php?wakka=$1 [QSA,L]
and I need it to be applied only if there is anything beyond the domain name, i.e. www.example.com/xyz but NOT with just www.example.com, because then I only need to display a simple index.php instead (no address translation).
How do I do that?
RewriteRule ^(.*)$ wikka.php?wakka=$1 [QSA,L]
You just need to change the * (0 or more) in the RewriteRule pattern to + (1 or more). For example:
RewriteRule (.+) wikka.php?wakka=$1 [QSA,L]
I also removed the anchors ^ and $, since they are not necessary here. You are grabbing everything anyway, so saying you are grabbing everything from the start to the end is just not necessary.
Also, unless you require the query string from the original request, I would remove the QSA flag. By itself this rule will result in a repeated query string of the form ?wakka=wikka.php&wakka=xyz, where xyz is the initial request. This still "works" when reading the URL params with PHP, because wakka=xyz overrides the earlier parameter.
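The difference between * and + can be checked outside Apache. The following Python sketch only models the pattern match (it assumes the .htaccess file sits in the document root, so a request for the bare domain is matched as an empty path); the helper name rewrite is hypothetical.

```python
import re

def rewrite(path, pattern):
    """Apply a RewriteRule-style pattern; return the path unchanged if it does not match."""
    m = re.search(pattern, path)
    if m is None:
        return path
    return "wikka.php?wakka=" + m.group(1)

# Original pattern: also matches the empty path of a bare-domain request.
print(rewrite("", r"^(.*)$"))
# Changed pattern: needs at least one character, so the empty path is left alone.
print(rewrite("", r"(.+)"))
print(rewrite("xyz", r"(.+)"))
```

Because .+ is greedy, the unanchored pattern still captures the whole path, which is why dropping ^ and $ makes no difference here.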

URL Rewriting not possible?

I have a website with this URL:
http://www.example.com/stage?dept=01
I want to transform this URL into this one:
http://www.example.com/stage-ain.html
I want the new URL to replace the standard one and become the only URL available (for SEO).
I tried this, but it does not work:
RewriteEngine On
RewriteRule ^stage?dept=01$ stages-ain.html [R=301]
Do you have any idea?
Thanks
The first argument of RewriteRule cannot match the query string; it is matched against the URL path only. What you have written is a regex that matches the paths stagedept=01 and stagdept=01, because the ? makes the preceding e optional.
You want to use a RewriteCond instead. You could use the %{QUERY_STRING} variable, but this will likely cause an infinite loop. Instead you probably want to match on %{THE_REQUEST}.
RewriteCond %{THE_REQUEST} /stage\?dept=01
RewriteRule ^ stages-ain.html [R,L,QSD]
Change the [R] flag to [R=301] when you have tested everything works as expected. Always use the L flag with external redirects, unless you have a very good reason to continue rewriting, as this can cause some weird problems.
See the documentation for more information.
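Both points in this answer can be tried with plain regexes. The Python sketch below is only an approximation of what mod_rewrite does: the sample request lines are assumptions about what %{THE_REQUEST} would contain, and the helper names are invented.

```python
import re

# The pattern from the question: "?" makes the "e" optional, it is not a literal "?".
RULE_PATTERN = re.compile(r"^stage?dept=01$")
# The pattern from the RewriteCond: "\?" is a literal question mark.
REQUEST_PATTERN = re.compile(r"/stage\?dept=01")

def rule_matches(path):
    return RULE_PATTERN.search(path) is not None

def cond_matches(request_line):
    return REQUEST_PATTERN.search(request_line) is not None

print(rule_matches("stagedept=01"), rule_matches("stagdept=01"), rule_matches("stage"))
# Original request line versus the request line after the redirect.
print(cond_matches("GET /stage?dept=01 HTTP/1.1"))
print(cond_matches("GET /stages-ain.html HTTP/1.1"))
```

The last check shows why the redirect does not repeat: once the browser requests the new URL, the condition no longer matches.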

Mod rewrite to redirect except on a certain page

Trying to write a rewrite rule in my htaccess so that any request to /en/shop/index.html or /fr/shop/index.html stays on the server, but if the user goes to any other page it redirects to a different server. Here's what I've got so far and it doesn't work.
RewriteRule ^(.*)/shop/(.*) [L]
RewriteRule ^(.*)$ http://www.newwebsite.com/$1 [R=301]
Add a dash to tell the first RewriteRule that you want the matches to be passed through unchanged:
RewriteRule ^.*/shop(/.*)?$ - [L]
I also removed the first set of parentheses since you're not using the results of the match so there's no need to store the matched patterns. I assumed you might need to match /shop without a trailing slash so that's why the (/.*)? section is there.
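As a quick check of which paths the pass-through pattern keeps on the server, here is a small Python sketch. It only models the two rules from this answer; the function name route is hypothetical, and paths are written without a leading slash, as they are seen in .htaccess.

```python
import re

PASS_THROUGH = re.compile(r"^.*/shop(/.*)?$")

def route(path):
    """Return None if the request stays local, else the redirect target."""
    if PASS_THROUGH.search(path):
        return None  # first rule: "-" with [L], nothing is rewritten
    return "http://www.newwebsite.com/" + path  # second rule

for p in ["en/shop/index.html", "fr/shop", "en/about.html", "en/shopping/cart"]:
    print(p, "->", route(p))
```

Note that the pattern keeps everything under /shop/ local, not only index.html; if only that one page should stay, the pattern would need to be tightened.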

rewritecond, rewriterule and ignoring extra querystrings

I have an old url:
www.example.com/content.aspx?ID=227&ParentID=33&MicrositeID=0&Page=1
that I wish to rewrite to:
www.example.com/product/item
The only important bit is ID=227, everything after that can be stripped and is not required for the redirect. I need to not pass any querystrings to the new address, this is basically a hard rewrite from one address to another.
I have my rewrite rule:
RewriteCond %{QUERY_STRING} ^ID=227(.*)$ [NC]
RewriteRule ^content\.aspx$ http://www.example.com/product/item [R=301,L]
But as I'm a total noob at mod_rewrite I'm struggling - can any htaccess gurus out there help me out?
RewriteCond %{QUERY_STRING} (^|&)ID=(\d+)(&|$) [NC]
RewriteRule ^content\.aspx$ /product/item/%2? [R=301,L]
A few comments...
There's no need to add .* to match the whole string in this case. As long as you can pinpoint what you want to match, you'll be fine. (^|&) matches either the start of the string or & whereas (&|$) matches either the end of the string or &. This allows id=xxx to be anywhere in the query string, which is good practice. \d matches one digit whereas + is a repetition operator for "one or more".
Furthermore, you don't actually need to include the domain name so long as the resulting page is on the same domain. Just start the resulting string with a / to make it relative to the root level.
%2 means that you're inserting a submatch from the RewriteCond statement rather than the RewriteRule. The latter would be $1, $2, as you might know.
The trailing ? tells the rewrite engine not to append the querystring to the URL. (Don't worry, the question mark won't show up in the redirect URL)
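The effect of the (^|&) and (&|$) delimiters is easy to see by comparing with an undelimited pattern. This is a Python sketch of the regex only, with re.IGNORECASE standing in for the NC flag; extract_id and NAIVE are names made up for the comparison.

```python
import re

ANCHORED = re.compile(r"(^|&)ID=(\d+)(&|$)", re.IGNORECASE)
NAIVE = re.compile(r"ID=(\d+)", re.IGNORECASE)  # for comparison only

def extract_id(query):
    """Return the value of the ID parameter (what %2 would hold), or None."""
    m = ANCHORED.search(query)
    return m.group(2) if m else None

print(extract_id("ID=227&ParentID=33&MicrositeID=0&Page=1"))
print(extract_id("ParentID=33&id=227"))
# No real ID parameter: the delimited pattern refuses to match...
print(extract_id("ParentID=33&MicrositeID=0"))
# ...while the naive one picks up the tail of ParentID.
print(NAIVE.search("ParentID=33&MicrositeID=0").group(1))
```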

Conditional .htaccess rules for same URL

What is the best way to combine the following two rules into one, so that users can visit both domain.com/schedule and domain.com/schedule/{day}?
Both rules should forward to the same controller, where I will then check the parameter:
RewriteRule ^schedule/?$ index.php?_orn_shows_action=view-schedule [NC,QSA,L]
RewriteRule ^schedule/([a-zA-Z0-9-]+)/?$ index.php?_orn_shows_action=view-schedule&day=$1 [NC,QSA,L]
If you don't care whether or not you get the day parameter without a value, you can just do this:
RewriteRule ^schedule(/(([a-zA-Z0-9-]+)/?)?)?$ index.php?_orn_shows_action=view-schedule&day=$3 [NC,QSA,L]
This would capture these variations:
/schedule
/schedule/
/schedule/day
/schedule/day/
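The four variations can be run through the same regex to see what the third capture group holds in each case. This is a Python sketch of the pattern only; day_for is a hypothetical helper, and it converts Python's None for an unmatched group to the empty string, on the assumption that this is how an unmatched $3 ends up in the rewritten URL.

```python
import re

SCHEDULE = re.compile(r"^schedule(/(([a-zA-Z0-9-]+)/?)?)?$", re.IGNORECASE)

def day_for(path):
    """Return the captured day ("" if absent), or None if the rule does not match."""
    m = SCHEDULE.search(path)
    if m is None:
        return None
    return m.group(3) or ""

for p in ["schedule", "schedule/", "schedule/day", "schedule/day/", "schedule/day/extra"]:
    print(repr(p), "->", repr(day_for(p)))
```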
If you want to make sure not to get an empty day parameter, build the &day=... fragment in a RewriteCond and insert it with %1, so that it is only added when a day was captured:
RewriteCond $3 ="" [OR]
RewriteCond &day=$3 ^(.*)$
RewriteRule ^schedule(/(([a-zA-Z0-9-]+)/?)?)?$ index.php?_orn_shows_action=view-schedule%1 [NC,QSA,L]
Finally, if you don't expect to get a query string from the browser on these URLs, I would drop the QSA flag, since someone going to /schedule/monday?day=1 would end up setting day equal to 1 instead of "monday". There are ways to prevent that, but if you don't need the query string, it's easier just to ignore it.
Maybe you don't have to combine them into one. Try switching their positions, with the second rule first and the first rule after it, so that the first rule does not catch requests that should be handled by the second.

Resources