Redirect parameter to its parent post url - .htaccess

I have several sets of error URLs with parameters that I need to redirect to the parent post URL:
http://www.mysite.co/post1.html?amp=1
http://www.mysite.co/post1.html?amp=0
http://www.mysite.co/post1.html?utm_source=xxxxx
What I'm trying to achieve: http://www.mysite.co/post1.html?amp=1 (status 404) should redirect to http://www.mysite.co/post1.html (200 OK).
I tried adding .htaccess code but it always gave me 500 errors. Can someone help me with the proper .htaccess code?
.htaccess
# BEGIN WordPress
# The directives (lines) between "BEGIN WordPress" and "END
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
Current permalinks are set to
/%postname%.html
The ?amp and ?utm_source URL parameters are added by third-party services/plugins, which results in a 404 error.

Try the following at the top of the .htaccess file, before the # BEGIN WordPress section:
RewriteCond %{QUERY_STRING} ^(amp=[01]|utm_source=[^&]+)$
RewriteRule ^[\w-]+\.html$ /$0 [QSD,R=301,L]
This matches any URL-path of the form /<postname>.html, where the post name consists of one or more letters, digits, underscores or hyphens. Only the specific query strings mentioned in the question are matched, i.e. amp=0 or amp=1 or utm_source=<something>. It will not redirect amp=2 or utm_source= or utm_source=<something>&foo=1 etc.
The QSD flag (Apache 2.4) discards the original query string.
Test first with 302 (temporary) redirects to avoid potential caching issues.
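For placement, a minimal sketch of how the top of the file might look while testing. It assumes Apache 2.4 (for QSD) and uses the + quantifier so that post names longer than one character match; the WordPress block is abbreviated here and should stay exactly as it is in your file.

```apache
# Strip the specific tracking parameters from post URLs.
# 302 while testing; switch to 301 once confirmed working.
# No RewriteEngine directive here: the one in the WordPress block
# below applies to the whole file.
RewriteCond %{QUERY_STRING} ^(amp=[01]|utm_source=[^&]+)$
RewriteRule ^[\w-]+\.html$ /$0 [QSD,R=302,L]

# BEGIN WordPress
# (existing WordPress directives, unchanged)
# END WordPress
```

With this in place, a request for /post1.html?amp=1 should be answered with a redirect to /post1.html, while /post1.html?amp=2 is left alone.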
UPDATE:
#REDIRECTION UTM CLEAR
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{QUERY_STRING} "utm" [NC]
RewriteRule (.*) /$1? [R=301,L,QSD]
<IfModule mod_rewrite.c>
Which one do you think serves better?
What do you mean by "better"?
As written, this code is strictly invalid (and contains superfluous directives). But (when corrected) this code arguably matches too much (and does not handle the amp URL parameter at all). It matches utm anywhere in the query string which could potentially create conflicts with existing code. It also matches any URL-path, so is potentially checking 1000s of requests that don't need checking. eg. It would match /image.jpg?nutmeg=5 and /?scoutmaster=1 - which clearly have nothing to do with utm tracking parameters (which all start utm_).
The code I posted above matches precisely the criteria you've stated in the question. And thus avoids potential conflicts. So, from that perspective, the code I posted above is "better".
However, to match amp or any URL parameter that simply starts utm_ and only whole URL parameters that might occur anywhere in the query string then use something like the following instead:
RewriteCond %{QUERY_STRING} (?:^|&)(amp|utm_\w+)=
RewriteRule ^[\w-]+\.html$ /$0 [QSD,R=301,L]
This matches URLs of the form /%postname%.html - your permalink structure. It does not match /image.jpg etc.
You do not need to repeat the RewriteEngine directive. The RewriteBase directive is entirely superfluous. You should not wrap these directives in a <IfModule> container.
Note that if you have any legitimate URL parameters mixed in then they will also be removed.
This matches the following:
/<postname>.html?amp=<anything>
/<postname>.html?utm_source=<anything>
/<postname>.html?utm_campaign=<anything>&bar=1
/<postname>.html?foo=1&utm_<something>=<anything>
etc.
But does not match:
/<postname>.html?wamp=<anything>
/<postname>.html?nutmeg_source=<anything>
/image.jpg?utm_source=<anything>
etc.

Related

301 redirect with wildcard in htaccess file not working

I am trying to redirect these pages
/reviews/page/2/
/reviews/page/3/
to this
/reviews/
using this line:
Redirect 301 /reviews/page/./ /reviews/
But it's not working. I've tried other combinations like .* and ^.*$ but nothing works. Only a specific URL will get redirected to the new one.
Is there anything else that could interfere with the line I'm trying to work? Maybe space, uppercase, lower case, indent, etc?
The whole file is pasted below.
RewriteEngine On
RewriteCond %{SERVER_PORT} 80
RewriteRule ^(.*)$ https://example.com/$1 [R,L]
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
# --------------------------------
# | ADDITIONAL RULES |
# --------------------------------
<FilesMatch "^robots\.txt">
Order Allow,Deny
Allow from all
</FilesMatch>
<FilesMatch "^\.htaccess|.*\.cgi">
Order Deny,Allow
Deny from all
</FilesMatch>
Redirect 301 /reviews/page/./ /reviews/
The mod_alias Redirect directive is prefix-matching; it does not accept wildcards or regex. So, you could theoretically do something like:
Redirect 301 /reviews/page/ /reviews/
However, as mentioned, the Redirect directive is prefix-matching and everything after the match is copied onto the end of the target URL. So, a request for /reviews/page/2/ would be redirected to /reviews/2/ - which is not desirable.
You could use RedirectMatch instead, which uses regex rather than simple prefix matching. However, since you are already using mod_rewrite (RewriteRule) directives, it is preferable to use mod_rewrite for this in order to avoid potential conflicts. Different Apache modules execute at different times during the request, regardless of the apparent order of these directives in the config file.
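For comparison, a minimal sketch of the RedirectMatch alternative mentioned above. It assumes the paginated URLs always end in a numeric page segment followed by a slash. As noted, mixing mod_alias with the existing mod_rewrite rules is not the preferred approach here, so treat this as an illustration only.

```apache
# mod_alias: regex-based redirect. Unlike a RewriteRule pattern in .htaccess,
# the regex is matched against the full URL-path, leading slash included.
RedirectMatch 302 ^/reviews/page/\d+/$ /reviews/
```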
Instead, try the following mod_rewrite directive at the top of your .htaccess file, immediately after the RewriteEngine directive:
RewriteRule ^reviews/page/[23]/$ /reviews/ [R=302,L]
This matches any request for /reviews/page/2/ or /reviews/page/3/ and redirects to /reviews/. The 2 or 3 are matched using a character class. Note that in per-directory .htaccess files the URL-path that the RewriteRule pattern matches against does not start with a slash.
This is also a 302 (temporary) redirect. Change this to 301 (permanent) redirect - if that is the intention - but only after you have tested that it's working OK. 301s are cached hard by the browser, so can make testing problematic.
You'll need to clear your browser cache before testing.
UPDATE: To match any digit you can change [23] to \d (the shorthand character class for digits). To match 1 or 2 digits, you can use \d{1,2}. For example:
RewriteRule ^reviews/page/\d{1,2}/$ /reviews/ [R=302,L]
You could use . (dot) to match any character. However, this might be too broad for what looks like a page number. Regular expressions (regex) should be as restrictive as possible.
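As a rough sketch of the placement, the top of the file from the question might look like this once the new rule is added. The HTTP to HTTPS rule is copied from the question unchanged; the rest of the file is assumed to stay as it is.

```apache
RewriteEngine On

# Collapse paginated review URLs onto the main reviews page (302 while testing)
RewriteRule ^reviews/page/\d{1,2}/$ /reviews/ [R=302,L]

# Existing HTTP to HTTPS redirect from the question
RewriteCond %{SERVER_PORT} 80
RewriteRule ^(.*)$ https://example.com/$1 [R,L]
```

The old "Redirect 301 /reviews/page/./ /reviews/" line at the bottom of the file would then be removed, since it never matches.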

mod-rewrite redirect but prevent direct access

I want to redirect all content to:
www.example.com/public/...
but prevent direct access to
www.example.com/public/file1/
www.example.com/public/file2/
etc
The final URL should be:
www.example.com/file1/
I've tried this for redirecting and it works, but I don't know how to prevent direct access:
RewriteCond %{REQUEST_URI} !/public/
RewriteRule ^(.*) public/$1 [L]
After spending an inordinate amount of time trying to solve this problem, I found that the solution lies with the under-documented REDIRECT_STATUS environment variable.
Add this to the beginning of your top-level /.htaccess code, and also to any .htaccess files you have under it (e.g. /public/.htaccess):
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{ENV:REDIRECT_STATUS} !=200
RewriteRule ^ /public%{REQUEST_URI} [L]
</IfModule>
Now, if the user requests example.com/file1 then they are served the file at /public/file1. However, if they request example.com/public/file1 directly then the server will attempt to serve the file at /public/public/file1, which will fail (unless you happen to have a file at that location).
IMPORTANT:
You need to add those lines to all .htaccess files, not just the top-level one in the web root, because if you have any .htaccess files below the web root (e.g. /public/.htaccess) then these will override the top-level .htaccess and users will again be able to access files in /public directly.
Note about variables and redirects:
Performing a redirect (or a rewrite) causes the whole process to start again with the new URI, so any variables that you set before the redirect will no longer be set afterwards. This is done deliberately, because usually you do not want the final result to depend on how you got there (i.e. whether it was via a direct request or via a redirect).
However, for those special occasions where you do want to know how you got to a particular URI, you can use REDIRECT_STATUS. Also, any environment variables set before the redirect (e.g. with SetEnvIf) will still be available after the redirect, but with REDIRECT_ prefixed to the name of the variable (so MY_VAR becomes REDIRECT_MY_VAR).
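To illustrate the REDIRECT_ prefix, here is a minimal sketch that uses a custom marker variable instead of REDIRECT_STATUS. The variable name VIA_REWRITE is hypothetical, and the sketch assumes only a single internal rewrite pass (each further internal redirect would add another REDIRECT_ prefix to the name).

```apache
RewriteEngine On

# On the first pass the marker is not set, so the rule applies.
# The E flag sets VIA_REWRITE=1; after the internal rewrite it is
# visible as REDIRECT_VIA_REWRITE, so the rule is skipped on the second pass.
RewriteCond %{ENV:REDIRECT_VIA_REWRITE} !=1
RewriteRule ^ /public%{REQUEST_URI} [E=VIA_REWRITE:1,L]
```

This should behave much like the REDIRECT_STATUS version above; REDIRECT_STATUS is usually the simpler choice because Apache maintains it for you.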
Maybe you should clarify what's the expected behaviour when user tries to reach the real URL:
www.example.com/public/file1/
If by prevent you mean forbid, you could add a rule to respond with a 403
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_URI} !/public/
RewriteRule ^(.*)$ /public/$1 [L]
RewriteCond %{REQUEST_URI} /public/
RewriteRule ^(.*)$ / [R=403,L]
</IfModule>
Update: The solution above doesn't work!
I realized my previous solution always throws the 403 so it's worthless. Actually, this is kinda tricky because the redirection itself really contains /public/ in the URL.
The solution that really worked for me is to append a secret query string to the redirection and check for this value on URL's containing /public/:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_URI} !/public/
RewriteRule ^(.*)$ /public/$1?token=SECRET_TOKEN [L]
RewriteCond %{REQUEST_URI} /public/
RewriteCond %{QUERY_STRING} !token=SECRET_TOKEN
RewriteRule ^(.*)$ / [R=403,NC,L]
</IfModule>
This way www.example.com/file1/ will show file1, but www.example.com/public/file1/ will throw a 403 Forbidden error response.
Concerns about security of this SECRET_TOKEN are discussed here: How secure is to append a secret token as query string in a htaccess rewrite rule?
If your URLs are expected to have their own query strings, like www.example.com/file1/?param=value, be sure to add the QSA flag.
RewriteRule ^(.*)$ /public/$1?token=SECRET_TOKEN [QSA,L]

.htaccess RewriteCond for existing and non-existing files

I am trying to use .htaccess to redirect all url requests of a certain subpath ("URL/somefolders/main/..") to one basefile named "_index.php". So I implemented the following .htaccess to the "folder" URL/somefolders/main/ :
<IfModule mod_rewrite.c>
DirectoryIndex index.php
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^?]*)$ /main/_index.php?oldpath=$1 [NC,L,QSA]
</IfModule>
The redirection works fine for all non-existent files, but if the file exists then it is served without redirection. I suppose this is because of the "!" in the RewriteCond directives, but all my attempts to change it have failed.
How do I have to change the above code to redirect all files (existing or not)?
Edit:
All my attempts still end up either ineffective or erroneous.
The latter with the Apache log error:
Request exceeded the limit of 10 internal redirects due to probable configuration error. Use 'LimitInternalRecursion' to increase the limit if necessary. Use 'LogLevel debug' to get a backtrace.
I now suspect that most of my earlier attempts did correctly allow non-existent files, but ran into an endless loop due to included files. Is this possible? And if so, can .htaccess distinguish between "internal" and "external" file requests?
Your original rules are the most common implementation you'll see, where REQUEST_FILENAME is checked for existing files or directories to prevent things like CSS and images from being rewritten. But that's not what you want.
So you correctly attempted to remove the RewriteCond directives but ended up with an infinite rewrite loop. That is likely because the subsequent RewriteRule is also attempting to rewrite _index.php back to itself.
You can fix that by adding a RewriteCond which specifically matches _index.php to prevent it from looping on itself.
<IfModule mod_rewrite.c>
DirectoryIndex index.php
RewriteEngine on
# Don't apply the rewrite to _index.php to prevent looping
RewriteCond %{REQUEST_URI} !main/_index\.php
RewriteRule ^([^?]*)$ /main/_index.php?oldpath=$1 [NC,L,QSA]
</IfModule>
I'll also simplify the matched group in RewriteRule. ([^?]*) captures everything up to the first ?, but the expression received by RewriteRule will never include the query string or ? anyway. You may instead simply use (.*) to capture whatever is present.
RewriteRule (.*) /main/_index.php?oldpath=$1 [NC,L,QSA]
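On the question of telling "internal" from "external" requests, an alternative loop guard can be sketched with the REDIRECT_STATUS environment variable, which is empty on the initial request and set once mod_rewrite has performed an internal rewrite. This is a sketch under the assumption that the file lives in the same /main/ directory as in the question. Note that files pulled in by a PHP include from the filesystem do not pass through mod_rewrite at all.

```apache
DirectoryIndex index.php
RewriteEngine On

# Only rewrite the initial (external) request; after the internal rewrite
# to _index.php, REDIRECT_STATUS is no longer empty and the rule is skipped.
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule (.*) /main/_index.php?oldpath=$1 [L,QSA]
```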

.htaccess | no redirect if there is an specific parameter

I'm building a CMS at the moment and am struggling with the AJAX implementation.
I have everything running except for a mod_rewrite problem.
RewriteEngine On
RewriteRule \.html index.php [L]
This rewrites only .html files to index.php and nothing else.
I need a second rule which checks the REQUEST_URI for a parameter, to prevent the full site from being loaded by AJAX.
That is probably hard to follow, so here is what I want to achieve:
RewriteEngine On
RewriteCond %{REQUEST_URI} !(?rewrite=no$)
RewriteRule \.html index.php [L]
I want nothing rewritten except .html files, and also no rewrite for URLs with "(.html)?rewrite=no" at the end.
I hope someone can help, since rewrites and regex are not my strongest subjects.
Thanks in advance.
From the Apache docs:
REQUEST_URI
The path component of the requested URI, such as "/index.html". This notably excludes the query string which is available as its own variable named QUERY_STRING.
So you are actually looking to match on %{QUERY_STRING} rather than %{REQUEST_URI}. Don't include the ? on the query string when matching its condition:
RewriteEngine On
# Match the absence of rewrite=no in the query string
RewriteCond %{QUERY_STRING} !rewrite=no [NC]
# Then rewrite .html into index.php
RewriteRule \.html index.php [L]
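The condition above matches rewrite=no anywhere in the query string, so it would also fire for something like norewrite=no or rewrite=nothing. If that matters, a slightly stricter sketch is shown below. It assumes the parameter should be matched only as a whole name=value pair and that only URL-paths ending in .html should be rewritten.

```apache
RewriteEngine On
# Skip the rewrite only when rewrite=no appears as a whole parameter
RewriteCond %{QUERY_STRING} !(?:^|&)rewrite=no(?:&|$) [NC]
# Rewrite only URL-paths that end in .html
RewriteRule \.html$ index.php [L]
```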

url rewrite how to send everything else to index.php condition

Hi there,
This should be a simple task for anyone who knows, but I am new to Apache rewrites, so please bear with me.
I wrote 2 rewrite rules and they work. I need to write a third, so that everything else goes to the index.php file. The problem is that if I add the third rule, it is always applied regardless of the first 2.
RewriteEngine On
RewriteRule ^new/?$ new.php [NC,L]
RewriteRule ^thanks(.*)$ thankyou.php [NC,L]
RewriteRule ^(.*)$ index.php
Thanks for help.
I believe the answer lies in the following paragraph about the L flag used with the RewriteRule directive:
If you are using RewriteRule in either .htaccess files or in
<Directory> sections, it is important to have some understanding of
how the rules are processed. The simplified form of this is that once
the rules have been processed, the rewritten request is handed back to
the URL parsing engine to do what it may with it. It is possible that
as the rewritten request is handled, the .htaccess file or
<Directory> section may be encountered again, and thus the ruleset may be run
again from the start. Most commonly this will happen if one of the
rules causes a redirect - either internal or external - causing the
request process to start over.
I think what happens is that after the rewrite is executed, somehow control is given back to the URL parsing engine and the rules are run again.
You can prevent this behaviour by adding a few rewrite conditions to the last rule:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule ^new/?$ new.php [NC,L]
RewriteRule ^thanks(.*)$ thankyou.php [NC,L]
# Only rewrite to index.php if the current request is not for an existing file or directory
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . index.php [L]
</IfModule>
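On Apache 2.4, another way to stop the ruleset from being run again is the END flag, which, unlike L, prevents any further rewrite processing for the request. The following is a sketch under that version assumption. Unlike the version above, it sends every other request to index.php, including requests for existing files such as CSS or images, which is what the question literally asks for but may not be what you want.

```apache
RewriteEngine On
RewriteRule ^new/?$ new.php [NC,END]
RewriteRule ^thanks(.*)$ thankyou.php [NC,END]
# Everything else goes to index.php; END stops any further rewrite passes,
# so index.php is not rewritten back onto itself.
RewriteRule ^ index.php [END]
```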
