Remove parentheses from the URL's query string in rewrite rule - .htaccess

I would like to clean up the URLs by removing parentheses from all query strings.
I tried the following code, but couldn't get it to work.
RewriteCond %{REQUEST_URI} [\(\)]+
RewriteRule ^(.*)[\(]+([^\)]*)[\)]+(.*)$ /$1$2$3 [R=301,L]
Here's an example of a URL:
http://www.example.com/blog/abc-post/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+blogname+(Blog+Name+New+York)

In order to match the query string you need to check the QUERY_STRING server variable in a RewriteCond directive.
Here are some ways of doing this:
1. Any number of parentheses - multiple redirects
For example, to remove any number of opening/closing parentheses in the query string part of the URL:
RewriteCond %{QUERY_STRING} (.*)[()]+(.*)
RewriteRule (.*) /$1?%1%2 [R,NE,L]
The NE flag is required in your example to avoid the %-encoded character (ie. %3A) being doubly encoded.
This will, however, result in multiple redirects, depending on the number of "groups" of parentheses. In your example, this will result in two redirects, because there are two "groups" of parentheses (a single parenthesis in each "group").
2. Any number of parentheses pairs - multiple (but fewer) redirects
If the parentheses are always in matching pairs, then you can specifically check for the opening/closing parentheses and potentially reduce the number of redirects.
RewriteCond %{QUERY_STRING} (.*)\((.*)\)(.*)
RewriteRule (.*) /$1?%1%2%3 [R,NE,L]
In your example, this results in a single redirect because there is just a single pair of parentheses. But /abc?foo=(bar)&one=(two) would result in two redirects.
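The redirect counts for methods 1 and 2 can be checked with a small simulation. This is only a sketch: Python's re module stands in for Apache's PCRE, and count_redirects is a hypothetical helper that re-applies the condition pattern and rebuilds the query string from the captured groups, the way each redirect would.

```python
import re

ANY_PAREN = r"(.*)[()]+(.*)"    # method 1: CondPattern from the first rule
PAIRED = r"(.*)\((.*)\)(.*)"    # method 2: CondPattern from the second rule

def count_redirects(query, pattern):
    """Re-apply the pattern until it stops matching.

    Returns (cleaned query string, number of passes). Each pass models
    one external redirect issued by the rule.
    """
    passes = 0
    while True:
        m = re.search(pattern, query)
        if m is None:
            return query, passes
        # Equivalent of %1%2 (method 1) or %1%2%3 (method 2)
        query = "".join(m.groups())
        passes += 1

example = "utm_campaign=Feed%3A+blogname+(Blog+Name+New+York)"
print(count_redirects(example, ANY_PAREN))
print(count_redirects(example, PAIRED))
print(count_redirects("foo=(bar)&one=(two)", PAIRED))
```

Because the leading (.*) is greedy, each pass removes the right-most match first, which is why the number of passes grows with the number of parentheses (or pairs).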
3. Any number of parentheses - single redirect
This method performs multiple internal rewrites to remove the parentheses, followed by a single redirect once all the parentheses have been replaced:
# Remove parentheses from query string
RewriteCond %{QUERY_STRING} (.*)[()]+(.*)
RewriteRule (.*) /$1?%1%2 [E=REPLACED_PARENS:1,NE,L]
# Redirect to "clean" URL
RewriteCond %{ENV:REDIRECT_REPLACED_PARENS} 1
RewriteCond %{THE_REQUEST} ^GET\ /(.*)\?
RewriteRule ^ /%1 [R,NE,L]
The first rule internally rewrites the request and sets an environment variable if a replacement is required.
The second rule checks for this environment variable (note that REPLACED_PARENS becomes REDIRECT_REPLACED_PARENS after the first rewrite) and ultimately redirects to the cleaned URL. The URL-path is grabbed from the initial request (contained in the THE_REQUEST server variable) to avoid inadvertently redirecting to the directory index (eg. index.php) when a bare directory is requested (or front-controller is used).
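As a rough model of method 3, the following sketch strips the parentheses in a loop (the internal rewrites) and then builds a single redirect target from the original request line. It assumes Python's re behaves like PCRE for these patterns; clean_redirect_target is a hypothetical name, and the function returns None where the Apache rules would not redirect.

```python
import re

def clean_redirect_target(the_request):
    """Model of the two rules: strip parentheses, then redirect once.

    the_request is the first line of the HTTP request, as held in
    the THE_REQUEST server variable.
    """
    # Second rule's condition: URL-path taken from the original request
    m = re.match(r"^GET /(.*)\?", the_request)
    if m is None:
        return None  # no query string, nothing to clean
    url_path = m.group(1)
    query = the_request.split(" ")[1].split("?", 1)[1]

    replaced = False  # plays the role of REPLACED_PARENS
    while True:
        q = re.search(r"(.*)[()]+(.*)", query)
        if q is None:
            break
        query = q.group(1) + q.group(2)
        replaced = True

    if not replaced:
        return None
    return "/" + url_path + "?" + query

line = "GET /blog/abc-post/?utm_campaign=Feed%3A+blogname+(Blog+Name+New+York) HTTP/1.1"
print(clean_redirect_target(line))
```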

Related

Rewrite URL with query parameters in .htaccess

Say I have these urls:
https://example.com/bbs/board.php?bo_table=cad
https://example.com/bbs/board.php?bo_table=videos
https://example.com/bbs/board.php?bo_table=news
How can I rewrite these in .htaccess to something like this:
https://example.com/cad
https://example.com/videos
https://example.com/news
This is my attempt thus far. I know that my rewrite method is solid because it works on URLs without query strings. I tried the QSA flag (Query String Append) to no avail.
Options -MultiViews
RewriteRule ^bbs/board.php?bo_table=cad$ /caster-cad-downloads [R=301,L,QSA]
RewriteRule ^caster-cad-downloads$ bbs/board.php?bo_table=cad [END]
RewriteRule ^bbs/board.php?bo_table=video$ /caster-videos [R=301,L,QSA]
RewriteRule ^caster-videos$ bbs/board.php?bo_table=video [END]
RewriteRule ^bbs/board.php?bo_table=news$ /news [R=301,L,QSA]
RewriteRule ^news$ bbs/board.php?bo_table=news [END]
How can I rewrite to a different URL instead of the query string while still using the %{QUERY_STRING} method?
RewriteCond %{QUERY_STRING} ^bo_table=(cad|videos|news)$
RewriteRule ^bbs/board\.php$ /%1 [QSD,R=301,L]
# RewriteRule ^(caster-cad-downloads|caster-videos|news)$ bbs/board.php?bo_table=$1 [END]
RewriteRule ^(?:caster-(cad)-downloads|caster-(videos)|(news))$ bbs/board.php?bo_table=$1 [END]
How can I rewrite these in .htaccess to something like this:
The "rewrite" is the other way round (as mentioned previously). The incoming request is for /cad and this is internally rewritten to /bbs/board.php?bo_table=cad that actually handles the request.
This can be achieved with a single rule since these 3 URLs follow the same pattern (although that conflicts with the code sample you've posted). For example:
RewriteRule ^(cad|videos|news)$ bbs/board.php?bo_table=$1 [END]
The $1 backreference contains the value of the first capturing group in the RewriteRule pattern. ie. either cad, videos or news.
The external redirect is not strictly necessary, unless you are changing an existing URL structure. Note that the RewriteRule pattern matches against the URL-path only, which notably excludes the query string. (So your rules that include a query string would never match.) To match the query string you need an additional condition (RewriteCond directive) and match against the QUERY_STRING server variable. For example, the following would go before the above rewrite:
RewriteCond %{QUERY_STRING} ^bo_table=(cad|videos|news)$
RewriteRule ^bbs/board\.php$ /%1 [QSD,R=301,L]
Note that we need to use the QSD flag here in order to discard the original query string; we don't want to append it.
The %1 backreference (as opposed to $1) refers to the capturing group in the last matched CondPattern (RewriteCond directive).
Don't forget to backslash-escape literal dots in the regex in order to negate their special meaning.
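To make the difference between $1 and %1 concrete, here is a sketch of the two rules as Python functions. Python's re is used as a stand-in for Apache's regex engine, the function names are hypothetical, and the URL-path is passed without its leading slash, as a RewriteRule pattern sees it in .htaccess.

```python
import re

def external_redirect(url_path, query_string):
    """Old URL -> new URL. The group comes from the condition (%1)."""
    if re.search(r"^bbs/board\.php$", url_path) is None:
        return None
    cond = re.search(r"^bo_table=(cad|videos|news)$", query_string)
    if cond is None:
        return None
    return "/" + cond.group(1)  # QSD: the original query string is dropped

def internal_rewrite(url_path):
    """New URL -> real script. The group comes from the rule pattern ($1)."""
    rule = re.search(r"^(cad|videos|news)$", url_path)
    if rule is None:
        return None
    return "bbs/board.php?bo_table=" + rule.group(1)

print(external_redirect("bbs/board.php", "bo_table=cad"))
print(internal_rewrite("cad"))
```

The escaped dot matters: with it, a path such as bbs/boardXphp is not treated as a match.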
UPDATE:
RewriteRule ^(cad-downloads|cad-videos|news)$ bbs/board.php?bo_table=$1 [END]
To pass cad, videos (video?) or news as the URL parameter, you could do it like this:
RewriteRule ^(?:(cad)-downloads|cad-(videos)|(news))$ bbs/board.php?bo_table=$1$2$3 [END]
This is made possible because cad, videos and news are still part of the requested URL. The outer regex group is made non-capturing (with the ?: prefix). Each alternative has its own capturing group that captures the necessary part of the requested URL. Only one of the three groups is set for any given request (the other two are empty), so $1$2$3 together give the matched value.
However, the reverse is not possible without hardcoding the mappings.
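One detail worth checking with an alternation like this is which group number each alternative fills. The sketch below uses Python's re as a stand-in; groups that did not take part in the match are None in Python, and the sketch assumes they expand to an empty string in a mod_rewrite substitution. The helper name expand is hypothetical.

```python
import re

PATTERN = r"^(?:(cad)-downloads|cad-(videos)|(news))$"

def expand(url_path):
    """Return (value of $1 alone, value of $1$2$3) for a matching path."""
    m = re.search(PATTERN, url_path)
    if m is None:
        return None
    groups = [g or "" for g in m.groups()]  # unset groups become ""
    return groups[0], "".join(groups)

for path in ("cad-downloads", "cad-videos", "news"):
    print(path, expand(path))
```

Only the first alternative fills group 1, so $1 on its own is empty for cad-videos and news, while the concatenation always carries the captured value.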
I'll see if I can get back to your other queries/chat tomorrow...

remove all querystring from any url [duplicate]

Have been trying to write a redirect rule with query string but did not succeed.
I have the URL example.com/blog/?page=1 or example.com/blog/?hello, so it does not really matter what goes in the query string. How do I write a Redirect rule, so that it cuts the query string and redirects to the URL before the query string. For example, both of those URLs have to redirect to example.com/blog/ so that URL does not contain any query string.
I was trying
RewriteRule ^blog/?$ blog/ [R=301,L,NE] but got redirected to 404 page.
Also tried
RewriteRule ^blog/?$ /blog/ [R=301,L,NE] and got the message that page is not working, 'URL' redirected you too many times.
BTW, technology I am using is Gatsby with htaccess plugin.
To remove the query string you first need to check that there is a query string to remove, otherwise, it should do nothing.
For example, to remove the query string from /blog/?<query-string> you would do something like this:
RewriteCond %{QUERY_STRING} .
RewriteRule ^(blog)/?$ /$1/ [QSD,R=302,L]
This matches the URL-path blog/ (trailing slash optional) and redirects to /blog/ (with a trailing slash). Your example URL includes the trailing slash, but your regex appears to suggest the trailing slash is optional?
The preceding condition (RewriteCond directive) checks the QUERY_STRING server variable to make sure this is non-empty (ie. it matches a single character, denoted by the dot).
The $1 backreference in the substitution string contains the value from the captured group in the preceding RewriteRule pattern. ie. "blog" in this example. This simply saves repetition. You could just as easily write RewriteRule ^blog/?$ /blog/ [QSD,R,L] instead.
The QSD (Query String Discard) flag removes the original query string from the redirected response, otherwise, this would be passed through by default (which would create a redirect-loop).
If the request does not contain a query string then this rule does nothing (since the condition will fail).
If this is intended to be permanent then change the 302 (temporary) redirect to 301 (permanent), but only once you have confirmed this works as intended. 301s are cached persistently by the browser so can make testing problematic.
A look at your existing rules:
was trying RewriteRule ^blog/?$ blog/ [R=301,L,NE] but got redirected to 404 page.
By default, the relative substitution string (ie. blog/) is seen as relative to the directory that contains the .htaccess file and this "directory-prefix" is then prefixed back to the relative URL, so this will (by default) result in a malformed redirect of the form https://example.com/path/to/public_html/blog/.
Also tried RewriteRule ^blog/?$ /blog/ [R=301,L,NE] and got message that page is not working, 'url' redirected you too many times.
This is not checking for (or removing) the query string so this is basically just redirecting to itself - an endless redirect-loop.
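The loop can be illustrated with a toy model of a browser following redirects. This is a sketch, not Apache's actual behaviour in every detail: respond and follow are hypothetical helpers, and the hop limit of 10 is an arbitrary stand-in for the browser giving up.

```python
import re

def respond(path, query, with_condition):
    """Return the redirect target (path, query), or None if the rule does not fire."""
    if re.search(r"^blog/?$", path.lstrip("/")) is None:
        return None
    if with_condition:
        if re.search(r".", query) is None:
            return None            # empty query string: condition fails
        return ("/blog/", "")      # QSD discards the query string
    return ("/blog/", query)       # no QSD: query string passed through

def follow(path, query, with_condition, limit=10):
    """Count redirects until the rule stops firing or the limit is hit."""
    hops = 0
    while hops < limit:
        target = respond(path, query, with_condition)
        if target is None:
            break
        path, query = target
        hops += 1
    return hops

print(follow("/blog/", "page=1", with_condition=True))
print(follow("/blog/", "page=1", with_condition=False))
```

With the condition and QSD there is exactly one redirect; without them the redirect target matches the rule again on every hop.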
Remove any query string from any URL
What rule do i write, to remove query string from any URL.
Modify the RewriteRule pattern to match any URL and redirect to the same. For example:
RewriteCond %{QUERY_STRING} .
RewriteRule (.*) /$1 [QSD,R=302,L]
This needs to go at the top of the root .htaccess file before any existing rewrites.
If the .htaccess file is in a subdirectory (not the root) then you will need to do something like the following instead, since the $1 backreference (as used above) won't contain the complete root-relative URL-path.
RewriteCond %{QUERY_STRING} .
RewriteRule ^ %{REQUEST_URI} [QSD,R=302,L]
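The reason $1 is not enough in a subdirectory is that, in .htaccess, the RewriteRule pattern matches the URL-path relative to the directory containing the file. A small sketch of that relationship, using a hypothetical rule_input helper and made-up example paths:

```python
def rule_input(request_uri, htaccess_dir):
    """URL-path as matched by a RewriteRule pattern in a per-directory context."""
    prefix = htaccess_dir.rstrip("/") + "/"
    if not request_uri.startswith(prefix):
        raise ValueError("request is outside this directory")
    return request_uri[len(prefix):]

# Root .htaccess: "/" + $1 rebuilds the full URL-path
print("/" + rule_input("/blog/post", "/"))

# .htaccess in /sub/: "/" + $1 loses the /sub prefix
print("/" + rule_input("/sub/page", "/sub/"))
```

REQUEST_URI always holds the full root-relative path, which is why the second form of the rule uses it instead of a backreference.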

HTACCESS How to "cut" URL at one point

I am new to .htaccess and I don't understand it well. Recently I have built the following code:
RewriteEngine On
RewriteCond %{HTTP_HOST} (.*)
RewriteCond %{REQUEST_URI} /api/v2/
RewriteRule ^api/v2(.*) /api/v2/api.php?input=$1
This was in the root public folder (example.com/.htaccess). But now I have to create second Rewrite and I want to make .htaccess file in example.com/api/v2/ folder. I tried to remove /api/v2/ part in each Rewrite Rule, but only thing I got was error 500.
What I want to achieve:
If someone uses this link: https://example.com/api/v2/test/test/123, I'd like to make it into https://example.com/api/v2/api?input=test/test/123 with .htaccess located in example.com/api/v2 folder.
Addressing your existing rule first:
RewriteCond %{HTTP_HOST} (.*)
RewriteCond %{REQUEST_URI} /api/v2/
RewriteRule ^api/v2(.*) /api/v2/api.php?input=$1
The first RewriteCond (condition) is entirely superfluous and can simply be removed. The second condition simply asserts that there is a slash after the v2 and this can be merged with the RewriteRule pattern. So, the above is equivalent to a single RewriteRule directive as follows:
RewriteRule ^api/v2(/.*) /api/v2/api.php?input=$1 [L]
This would internally rewrite the request from /api/v2/test/test/123 to /api/v2/api.php?input=/test/test/123 - note the slash prefix on the input URL parameter value.
However, unless you have another .htaccess file in a subdirectory that also contains mod_rewrite directives then this will create a rewrite loop (500 error).
Also note that you should probably include the L flag here to prevent the request being further rewritten (if you have other directives).
If someone uses this link: https://example.com/api/v2/test/test/123, I'd like to make it into https://example.com/api/v2/api?input=test/test/123 with .htaccess located in example.com/api/v2 folder.
I assume /api? is a typo and this should be /api.php?. Note also that the slash is omitted from the start of the URL parameter value (different to the rule above).
I tried to remove /api/v2/ part in each Rewrite Rule, but only thing I got was error 500.
This is the right idea, however, you need to be careful of rewrite loops (ie. 500 error response) since the rewritten URL is likely matching the regex you are trying to rewrite.
Try the following instead in the /api/v2/.htaccess file:
RewriteEngine On
RewriteCond %{REQUEST_URI} !api\.php$
RewriteRule (.*) api.php?input=$1 [L]
The preceding RewriteCond directive checks that the request is not already for api.php, thus avoiding a rewrite loop, since the pattern .* will naturally match anything, including api.php itself.
You could avoid the additional condition by making the regex more specific. For example, if the requested URL-path cannot contain a dot then the above RewriteCond and RewriteRule directives can be written as a single directive:
RewriteRule ^([^.]*)$ api.php?input=$1 [L]
The regex [^.]* matches anything except a dot, so avoids matching api.php.
Alternatively, only match the characters that are permitted. For example, lowercase a-z, digits and slashes (which naturally excludes the dot), which covers your test string test/test/123:
RewriteRule ^([a-z0-9/]*)$ api.php?input=$1 [L]
Or, if there should always be 3 path segments, /<letters>/<letters>/<digits>, then be specific:
RewriteRule ^([a-z]+/[a-z]+/\d+)$ api.php?input=$1 [L]
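The difference between these patterns shows up when the rule is applied to its own output. The sketch below is a simplified model: it re-applies a pattern to the rewritten path up to an arbitrary cap, using Python's re as a stand-in. It does not model how Apache itself reacts when a rule keeps matching; apply_rule is a hypothetical helper.

```python
import re

def apply_rule(path, pattern, limit=10):
    """Re-apply 'pattern -> api.php?input=$1' until it no longer matches.

    Returns (final path, final query string, number of passes).
    """
    query = ""
    passes = 0
    while passes < limit:
        m = re.search(pattern, path)
        if m is None:
            break
        query = "input=" + m.group(1)
        path = "api.php"
        passes += 1
    return path, query, passes

print(apply_rule("test/test/123", r"(.*)"))
print(apply_rule("test/test/123", r"^([^.]*)$"))
print(apply_rule("test/test/123", r"^([a-z]+/[a-z]+/\d+)$"))
```

The catch-all pattern keeps matching api.php and overwrites the input value, whereas the more specific patterns match once and then leave api.php alone.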

Htaccess - Redirect if URL does not contain at least three numbers

I'm struggling to get this htaccess redirect to work. I want to redirect any URL that does not contain at least three numbers in a row. I started with the following and it worked perfectly for redirecting any URL that DID have three numbers in a row:
RewriteCond %{REQUEST_URI} [0-9]{3,20} [NC]
RewriteRule (.*) "https\:\/\/info\.mywebsite\.com\/" [R=301,L]
However, I tried to modify that with the exclamation mark to make the condition NOT match three numbers in a row:
RewriteCond %{REQUEST_URI} !([0-9]{3,20}) [NC]
RewriteRule (.*) "https\:\/\/info\.mywebsite\.com\/" [R=301,L]
But that doesn't seem to work as expected. Am I missing something with turning this expression into a not match?
Having previously experimented with the opposite 301 (permanent) redirect then the results are most probably cached (by the browser) from the earlier redirect. It is a good idea to test with 302 (temporary) redirects to avoid caching issues.
Note also that the REQUEST_URI server variable contains the URL-path only, so if the digits appear only in the query string part of the URL then your condition will not see them.
The quantifier {3,20} matches from 3 to 20 digits; if you want "at least three" then use the quantifier {3,} (no upper bound).
You don't need the capturing subpatterns, ie. surrounding parentheses (...) on the regex since you are not using backreferences anywhere. Incidentally, you can't capture a subpattern on a negated regex.
You don't need the additional condition (RewriteCond directive) - this can all be done with the RewriteRule directive only.
The NC flag is not required here - you are checking digits only.
For example:
RewriteRule !\d{3,} https://info.mywebsite.com/ [R=302,L]
As noted in comments, the RewriteRule substitution string is a regular string, not a regex, so does not require any special backslash escaping (although colons and slashes don't need escaping anyway in Apache regex).
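The negated pattern can be tried out away from the server. This sketch uses Python's re in place of Apache's regex engine, and should_redirect is a hypothetical helper that returns True when the rule would fire, ie. when the URL-path does not contain three digits in a row.

```python
import re

def should_redirect(url_path):
    """True when the URL-path has no run of three or more digits."""
    return re.search(r"\d{3,}", url_path) is None

print(should_redirect("page/12"))    # fewer than three digits in a row
print(should_redirect("item/1234"))  # contains a run of digits
print(should_redirect("a1b2c3"))     # three digits, but not consecutive
```

Since the regex is not anchored, an upper bound on the quantifier makes no practical difference to whether it matches: \d{3,20} still finds a match inside a longer run of digits.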

URL mod-rewriting

I want to mod_rewrite this Url:
Before:
website.altervista.org/page.php?name=value
After:
website.altervista.org/value
Solution:
RewriteCond %{REQUEST_URI} !page.php$
RewriteRule ^(.+)$ /page.php?name=$1 [L]
Explanation:
The mod_rewrite RewriteRule has 3 parameters:
Pattern
Substitution
Flags
Implemented as such:
RewriteRule pattern substitution [flags]
Starting at server root, enter the requested URL path in the RewriteRule "pattern" parameter, and the desired path in the "substitution" parameter. In this case:
RewriteRule ^(.+)$ /page.php?name=$1 [L]
If the URL varies and you don't want to (or can't) write a rule for every situation then use the regular expression ^(.+)$ to capture the dynamic value and inject it into your substituted path using the RE capture variable $1. The first set of parentheses is $1, the second set is $2, etc. And capturing parentheses can be nested.
^(.+)$ This regular expression can be read as: ^ at the start of the string, $ all the way to the end of the string, look for . any character + one or more times and () capture that value into a variable.
Problem:
Even though we have the flag [L] (last rule evaluated), the mod_rewrite engine (behind the scenes) sends the newly constructed request /page.php?name=somevalue back through the mod_rewrite engine until no rules are met or, apparently, there are no changes to the request. Fortunately there is a supplementary directive to expand on the conditional power provided by the RewriteRule called RewriteCond.
The mod_rewrite RewriteCond applies to the next occurring RewriteRule and also has 3 parameters:
Test String
Conditional Pattern
Flags (optional)
The Test String can be derived from a few sources. Often a Server Variable, relating to the current request, is used here as the subject of this condition.
The Conditional Pattern is, again, text or a regular expression, but has some additional special conditions that may be evaluated. Read the Apache online mod_rewrite documentation for a detailed explanation.
In this case: RewriteRule ^(.+)$ /page.php?name=$1 [L], our newly substituted request is sent back through mod_rewrite as /page.php?name=somevalue and matches our "catch-all" rule, therefore our original "somevalue" is lost and replaced with our newly requested resource page.php. To prevent our "catch all" from catching our "page.php" requests let's exclude it from the rule using RewriteCond.
RewriteCond %{REQUEST_URI} !page.php$
RewriteRule ^(.+)$ /page.php?name=$1 [L]
This RewriteCond can be read as: %{REQUEST_URI} get the requested resource and does it ! NOT $ end with page.php. If this condition is true, continue to the next condition or rule. If this condition is not true, skip this rule set and continue to the next rule set.
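The condition and rule together can be modelled as one small function. This is a sketch with Python's re standing in for Apache's regex engine; rewrite is a hypothetical name, and it returns None when the rule set is skipped.

```python
import re

def rewrite(request_uri):
    """Model of: RewriteCond %{REQUEST_URI} !page.php$ plus the catch-all rule."""
    # Condition: skip the rule when the request already ends with page.php
    if re.search(r"page.php$", request_uri) is not None:
        return None
    # Rule: capture the whole URL-path (no leading slash in .htaccess)
    m = re.search(r"^(.+)$", request_uri.lstrip("/"))
    if m is None:
        return None
    return "/page.php?name=" + m.group(1)

print(rewrite("/value"))
print(rewrite("/page.php"))
print(rewrite("/pageXphp"))
```

The last call is also skipped because the unescaped dot matches any character; writing the condition as !page\.php$ would restrict the exclusion to the literal filename.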
