.htaccess regular expression in Yii when having path url format - .htaccess

I am trying to write a regular expression that extracts some parameters from the URL. Here is an example URL:
mydomain.com/files/images/versions/large/uploadedGal/1-36.png
I need to extract two parameters from it, large and 36, and build a URL like this:
mydomain.com/index.php?r=image/default/create&id=36&version=large
This is my rewrite rule in the .htaccess file:
Options -Indexes
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteBase /
# If the requested file or directory does not exist...
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# ...and if the source URL points to an image, we redirect to the create image URL.
RewriteRule versions/([^/]+)/[^\-\d]*\-?(\d+)\.(gif|jpg|png)$ index.php?r=image/default/create&id=$2&version=$1 [L,R,QSA]
</IfModule>
This doesn't work. I cannot figure out where the problem is; I think it doesn't capture the parameters right, and I get a 404 Not Found error. I am using path as the urlFormat, by the way.

Well ... your pattern is wrong/bad. Considering your example URL
files/images/versions/large/uploadedGal/1-36.png
and how it needs to be transformed/rewritten
index.php?r=image/default/create&id=36&version=large
try this one (definitely works, but you may need to adjust it for different URL types, since you have provided only 1 URL example):
versions/([^/]+)/[^\-\d]*(?:\d+\-)?(\d+)\.(gif|jpg|png)$
or like this (it really depends on the other possible URLs, but most likely this is what you wanted in the first place):
versions/([^/]+)/[^\-]*\-?(\d+)\.(gif|jpg|png)$
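Your current pattern can only match up to files/images/versions/large/uploadedGal/ and then fails: [^\-\d]* stops before the 1, \-? matches nothing, (\d+) takes the 1, and \. then meets the hyphen instead of a dot.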
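Dropped into the rule from the question, the second pattern might look like the sketch below. It keeps the original conditions and flags and only swaps the pattern; the backslashes before the hyphens are omitted because they are not needed there. It assumes the .htaccess file sits in the document root, as the RewriteBase / in the question suggests.

```apache
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteBase /
# Only act when the requested image does not exist on disk
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# $1 = version (e.g. large), $2 = digits between the hyphen and the extension (e.g. 36)
RewriteRule versions/([^/]+)/[^-]*-?(\d+)\.(gif|jpg|png)$ index.php?r=image/default/create&id=$2&version=$1 [L,R,QSA]
</IfModule>
```

The R flag makes this an external redirect, as in the question; drop it if an internal rewrite is wanted instead.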

Related

Redirect parameter to its parent post url

I have several sets of broken URLs with parameters that I need to redirect to the parent post URL:
http://www.mysite.co/post1.html?amp=1
http://www.mysite.co/post1.html?amp=0
http://www.mysite.co/post1.html?utm_source=xxxxx
So what I'm trying to achieve is that http://www.mysite.co/post1.html?amp=1 (status 404) redirects to http://www.mysite.co/post1.html (200 OK).
I tried to add .htaccess code, but it always gave me 500 errors. Can someone help me with the proper .htaccess code?
.htaccess:
# BEGIN WordPress
# The directives (lines) between "BEGIN WordPress" and "END
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
Current permalinks are set to
/%postname%.html
The URL parameters ?amp and ?utm_source are added by third-party services/plugins, which results in the 404 errors.
Try the following at the top of the .htaccess file, before the # BEGIN WordPress section:
RewriteCond %{QUERY_STRING} ^(amp=[01]|utm_source=[^&]+)$
RewriteRule ^[\w-]+\.html$ /$0 [QSD,R=301,L]
This matches any URL-path of the form /<postname>.html. Only the specific query strings mentioned in the question are matched, i.e. amp=0, amp=1 or utm_source=<something>. It will not redirect amp=2, utm_source= or utm_source=<something>&foo=1, etc.
The QSD flag (Apache 2.4) discards the original query string.
Test first with 302 (temporary) redirects to avoid potential caching issues.
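For that testing phase, a temporary-redirect variant of the same rule could look like this. It is only a sketch: the one intended difference is R=302, to be switched back to R=301 once the behaviour is confirmed.

```apache
# Temporary (302) redirect while testing, so browsers do not cache the result
RewriteCond %{QUERY_STRING} ^(amp=[01]|utm_source=[^&]+)$
RewriteRule ^[\w-]+\.html$ /$0 [QSD,R=302,L]
```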
UPDATE:
#REDIRECTION UTM CLEAR
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{QUERY_STRING} "utm" [NC]
RewriteRule (.*) /$1? [R=301,L,QSD]
<IfModule mod_rewrite.c>
Which one do you think serves better?
What do you mean by "better"?
As written, this code is strictly invalid (and contains superfluous directives). But (when corrected) this code arguably matches too much (and does not handle the amp URL parameter at all). It matches utm anywhere in the query string which could potentially create conflicts with existing code. It also matches any URL-path, so is potentially checking 1000s of requests that don't need checking. eg. It would match /image.jpg?nutmeg=5 and /?scoutmaster=1 - which clearly have nothing to do with utm tracking parameters (which all start utm_).
The code I posted above matches precisely the criteria you've stated in the question. And thus avoids potential conflicts. So, from that perspective, the code I posted above is "better".
However, to match amp or any URL parameter that simply starts utm_ and only whole URL parameters that might occur anywhere in the query string then use something like the following instead:
RewriteCond %{QUERY_STRING} (?:^|&)(amp|utm_\w+)=
RewriteRule ^[\w-]+\.html$ /$0 [QSD,R=301,L]
This matches URLs of the form /%postname%.html - your permalink structure. It does not match /image.jpg etc.
You do not need to repeat the RewriteEngine directive. The RewriteBase directive is entirely superfluous. You should not wrap these directives in a <IfModule> container.
Note that if you have any legitimate URL parameters mixed in then they will also be removed.
This matches the following:
/<postname>.html?amp=<anything>
/<postname>.html?utm_source=<anything>
/<postname>.html?utm_campaign=<anything>&bar=1
/<postname>.html?foo=1&utm_<something>=<anything>
etc.
But does not match:
/<postname>.html?wamp=<anything>
/<postname>.html?nutmeg_source=<anything>
/image.jpg?utm_source=<anything>
etc.

Tricky .htaccess mod_rewrite syntax problem

I got stuck, and even reading through tons of forum posts didn't help me.
The challenge:
I need URIs to be rewritten and queries to be maintained
Examples 1:
example.com/test/23/result/7
shall be redirected to a script under
example.com/test/
That works quite well with an .htaccess entry like this:
RewriteCond %{QUERY_STRING} ^$
RewriteRule ^test/(.+)$ test/?s=$1
The URI is displayed unaltered. The called script is called, and the additional subdirectory definitions can be retrieved in PHP either through variable $_GET['s'] or $_SERVER['REQUEST_URI']. All is fine so far. The problem starts when adding a query string:
Example 2:
example.com/test/23/result/7?id=16
shall be redirected to the same script under
example.com/test/?id=16
Even when I add [QSA] to the rewrite rule, the URI is not parsed correctly. I tried several ways to initiate a redirect. All failed. The redirect either points to a non-existing address or the query string gets lost. Besides the initial URI subdirectory information, here I would need the query string to be evaluated in my script. Both pieces of data need to be transferred to it.
Does anyone have a solution?
Thanks a lot for sharing your expertise!
I would go with the following .htaccess rules. This assumes that you have an index.php file that handles requests for non-existent pages in your later rules.
RewriteEngine ON
##Rules for handling index.php url here.
RewriteCond %{THE_REQUEST} \s/([^/]*)/.*\?index\.php\s [NC]
RewriteRule ^ %1?%{QUERY_STRING} [NC,L]
##Rules for non-existing pages here.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^ index.php [L]
###Rest of your rules go here.....
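If the goal from the question is simply to pass both the path remainder and the original query string to the script in /test/, a smaller change might be enough. The sketch below is an assumption, not part of the answer above. It drops the empty-query-string condition, which is what stops the question's rule from firing once ?id=16 is present. File and directory checks take its place, so that the rule does not match the script itself again.

```apache
RewriteEngine On
# Skip real files and directories (e.g. test/index.php) to avoid a rewrite loop
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# /test/23/result/7?id=16 is rewritten to /test/?s=23/result/7&id=16
# QSA appends the original query string to the new one
RewriteRule ^test/(.+)$ test/?s=$1 [QSA,L]
```

In PHP, $_GET['s'] and $_GET['id'] would then both be available to the script.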

Surprising rewriting of URL by htaccess rule

I've narrowed down my problem and I have a specific question.
With only the following code in the .htaccess, why does index2.php get called if I type www.mysite.com/url2 as my URL?
RewriteEngine On
RewriteCond %{REQUEST_URI} (.html|.htm|.feed|.pdf|.raw)$ [NC]
RewriteRule (.*) index2.php [L]
I've also tested it at http://www.regextester.com, and according to that the rule should not rewrite this URL to index2.php.
In the end I want this rule to skip any URL starting with /url2 or /url2/*.
EDIT: I've made screen recording of this problem: http://screenr.com/BBBN
You have this in your .htaccess:
RewriteEngine On
RewriteCond %{REQUEST_URI} (.html|.htm|.feed|.pdf|.raw)$ [NC]
RewriteRule (.*) index2.php [L]
What does it do? It rewrites anything that ends with html, htm, feed, pdf or raw to index2.php. So if you are getting results as if your URL ends with one of those extensions, there are two possible explanations:
There is another rewrite rule in an .htaccess in a parent directory (or in the server config files) that causes the URL to be rewritten.
Your URL actually ends with one of those extensions. Keep in mind that what you enter in your address bar may be edited and rewritten. For example, if you enter www.mysite.com/url2 in your address bar and that file doesn't exist on the server, the server will try to load the appropriate error document. So if your error document is /404.html, it will be rewritten to index2.php in the end.
Update:
I think this is the case. Create a file named 404.php in your document root. Inside your main .htaccess (in your document root), put this:
ErrorDocument 404 /404.php
Delete all other ErrorDocument directives.
Inside 404.php, put this:
<?php
echo 'From 404.php file';
?>
Logic behind it:
When mod_rewrite behaves strangely, the best solution in my experience is the rewrite log. To enable it, put this in your virtual host or in another server config section of your choice:
RewriteLogLevel 9
RewriteLog "logs/RewriteLog.log"
Be careful: the code above enables the rewrite log at the highest possible level (logging everything). It will slow down your server, and the log file will become huge very quickly. Do this only on your dev server.
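RewriteLog and RewriteLogLevel exist only in Apache 2.2 and earlier. On Apache 2.4 the rewrite trace is written to the error log and is switched on through LogLevel. A rough equivalent, again for the server or virtual host config rather than .htaccess, would be:

```apache
# Apache 2.4: per-module log level; trace1 to trace8 control the verbosity
LogLevel alert rewrite:trace3
```

The same caution applies: higher trace levels produce a lot of output, so this is best kept to a development server.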
Explanation: When you try to access www.mysite.com/url2, Apache hands your URL to the rewrite module. The rewrite module checks whether any of the RewriteRules applies to your URL. Because you have one rule and it doesn't apply to your URL, Apache tries to load the normal file. But this file does not exist, so Apache moves on to the next step, which is showing the appropriate error message. When you set a custom error file, Apache runs the test against the new address. For example, if the error document is /404.html, Apache checks whether your rule applies to /404.html or not. Since it does, it rewrites it.
The point to remember is that Apache will do this every time there is a change in the URL, whether the change is made by the rewrite module or not!
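One common way to keep a rule from firing on such internal redirects is to test the REDIRECT_STATUS environment variable. Apache leaves it empty on the initial request and sets it on internal redirects such as an ErrorDocument. A sketch, assuming the rule from the question is the one to protect:

```apache
RewriteEngine On
# REDIRECT_STATUS is empty only on the original client request
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{REQUEST_URI} \.(html|htm|feed|pdf|raw)$ [NC]
RewriteRule ^ index2.php [L]
```

With this guard, a request for /url2 that ends up at /404.html should be served the error document itself rather than index2.php.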
The rule you list should work as you expect if this is the only rule. Fact is that theory is fun, but apparently it doesn't work as expected. Please note that . will match ANY CHARACTER. If you want to match the full stop/period character, you'll need to escape it. That's why I use \.(html|htm|feed|pdf|raw)$ instead of (.html|.htm|.feed|.pdf|.raw)$ below.
You can add another RewriteCond that simply doesn't match if the url starts with /url2, like below. This might not be a viable solution if there are lots of urls that shouldn't be matched.
RewriteCond %{REQUEST_URI} !^/url2
RewriteCond %{REQUEST_URI} \.(html|htm|feed|pdf|raw)$ [NC]
RewriteRule (.*) index2.php [L]
To get a better understanding of what is happening, you can alter the rule to something like this. Now simply enter the URLs you don't want to be matched in the address bar and inspect the address bar after the redirect happens. The url parameter now shows which URL actually triggered this rule. This screencast shows a similar version at work, with a sneaky RewriteRule that alters the URL.
#A way of finding out what is -actually- matched
RewriteCond %{REQUEST_URI} \.(html|htm|feed|pdf|raw)$ [NC]
RewriteCond %{REQUEST_URI} !/foo
RewriteRule (.*) /foo?url=$1 [R,L]
You can decide to match the %{THE_REQUEST} variable instead. This will always contain the request itself. If something else is rewriting the url, this variable doesn't change, meaning you can use this to overwrite any changes. Make sure the url won't be matching itself. You would get something like below. An example screencast can be found here.
#If it doesn't end on .html/htm/feed etc, this one won't match
RewriteCond %{THE_REQUEST} ^(GET|POST)\ /.*\.(html|htm|feed|pdf|raw)\ HTTP [NC]
RewriteCond %{REQUEST_URI} !^/index2\.php$
RewriteRule (.*) /index2.php [L]

Trying to stop urls such as mydomain.com/index.php/garbage-after-slash

I know very little about .htaccess files and mod-rewrite rules. Looking at my statcounter information today, I noticed that a visitor to my site entered a url as follows:
http://mywebsite.com/index.php/contact-us
Since there is no such folder or file on the website and no broken links on the site, I'm assuming this was a penetration attempt. What was displayed to the visitor was the output of the index.php file, but without benefit of the associated CSS layout.
I need to create a rewrite rule that will either remove the information after index.php (or any .php file), or perhaps more appropriately, insert a question mark (after the .php filename), so that any following garbage will be treated like a parameter (and will be gracefully ignored if no parameters are required).
Thank you for any assistance.
If you're only expecting real directories and real files that do exist, then you can add this to an .htaccess file. What it does is it takes a non-existent file or directory request and gives the user the index.php page with the original request as a query string. [QSA] appends any existing query string.
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.*) index.php?$1 [PT,QSA]
I found a solution, using information provided by AbsoluteZero as well as other threads that popped up on the right side of the screen as the solution came closer.
Here's the code that worked for me...
Options -Multiviews -Indexes +FollowSymLinks
RewriteEngine On
RewriteBase /
DirectorySlash Off
# remove trailing slash
RewriteRule ^(.*)\/(\?.*)?$ $1$2 [R=301,L]
# translate PATH_INFO information into a parameter
RewriteRule ^(.*)\.php(\/)(.*) $1.php?$3 [R=301,L]
# rewrite /dir/file?query to /dir/file.php?query
RewriteRule ^([\w\/-]+)(\?.*)?$ $1.php$2 [L,T=application/x-httpd-php]
I got the removal of trailing slash from another post on StackOverflow. However, even after removing the trailing slash, the rewrite rule did not account for someone appending what looks to be valid information after the .php file
(For example: mysite.com/index.php/somethingelse)
My goal was to either remove the "/somethingelse", or render it harmless. The PATH_INFO rule locates a "/" after the .php file, and turns everything else from that point forward into a query string (which will usually be ignored by the PHP file).
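An alternative to rewriting is to have Apache reject trailing path information outright. The core AcceptPathInfo directive controls this. The sketch below assumes PHP runs through a handler that honours the directive and that the host allows it in .htaccess (AllowOverride FileInfo). Neither is stated in the question.

```apache
# Requests such as /index.php/contact-us get a 404 instead of running index.php
<Files "*.php">
    AcceptPathInfo Off
</Files>
```

This rejects the request instead of turning the extra path into a query string, so it only fits if no script on the site relies on PATH_INFO.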

Redirect to fallback file if first attempt fails

I have this in my .htaccess:
RewriteRule ^images/([^/\.]+)/(.+)$ themes/current/images/$1/$2 [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^images/([^/\.]+)/(.+)$ modules/$1/images/$2 [L,NC]
The idea is that it does the following:
// Rewrite this...
images/calendar/gear.png
// ... to this
themes/current/images/calendar/gear.png
// HOWEVER, if that rewritten path doesn't exist, rewrite the original URL to this:
modules/calendar/images/gear.png
The only things that change here are calendar and gear.png, the first of which could be any other single word and the latter the file name (possibly with path) to an image file.
I can rewrite the original URL to the first rewrite as shown in the example just fine, but what I cannot do is get my .htaccess to serve up the file from the other, fallback location if the first location 404s. I was under the impression that not using [L] in my first RewriteRule would rewrite the URL for RewriteCond.
The problem I'm having is that instead of serving the fallback file, the browser just shows a 404 for the first rewritten path (themes/current/images/calendar/gear.png) instead of falling back to modules/calendar/images/gear.png. What am I doing wrong?
Please note that my regex isn't perfect, but I can refine that later. Right now I'm concerning myself with the rewrite logic itself.
Fallthrough rules are fraught with bugs. My general recommendation is that any rule with a replacement string other than - should trigger an internal redirect to restart the .htaccess parse. This avoids the subrequest and URI_PATH bugs.
Next, once you go to 404, again in my experience, this is unrecoverable. I have a fragment which does something similar to what you are trying to do:
# For HTML cacheable blog URIs (a GET to a specific list, with no query params,
# guest user and the HTML cache file exists) then use it instead of executing PHP
RewriteCond %{HTTP_COOKIE} !blog_user
RewriteCond %{REQUEST_METHOD}%{QUERY_STRING} =GET [NC]
RewriteCond %{ENV:DOCUMENT_ROOT_REAL}/blog/html_cache/$1.html -f
RewriteRule ^(article-\d+|index|sitemap.xml|search-\w+|rss-[0-9a-z]*)$ \
blog/html_cache/$1.html [L,E=END:1]
Note that I do the conditional test in filesystem space and not URI (Location) space. So this would map in your case to
RewriteCond %{DOCUMENT_ROOT}/themes/current/images/$1/$2 -f
RewriteRule ^images/(.+?)/(.+)$ themes/current/images/$1/$2 [L]
If this is a shared hosting offering, though, run phpinfo() to check whether your hosting provider uses an alternative to DOCUMENT_ROOT, e.g. a different environment variable; mine uses DOCUMENT_ROOT_REAL.
The second rule will be picked up on the second processing pass after the internal redirect.
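Put together with the fallback rule from the question, the whole thing might look like the sketch below. It assumes the .htaccess file is in the document root and that DOCUMENT_ROOT points at the real filesystem path; the patterns are taken from the question unchanged apart from dropping the unneeded backslash inside the character class.

```apache
RewriteEngine On

# 1. Serve the themed image if that file actually exists on disk
RewriteCond %{DOCUMENT_ROOT}/themes/current/images/$1/$2 -f
RewriteRule ^images/([^/.]+)/(.+)$ themes/current/images/$1/$2 [L,NC]

# 2. Otherwise fall back to the module's own image
RewriteRule ^images/([^/.]+)/(.+)$ modules/$1/images/$2 [L,NC]
```

Neither rewritten path starts with images/, so the rules do not match again when the .htaccess file is re-processed after the internal redirect.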
