While reviewing the documentation of a popular framework, I stumbled upon the .htaccess code below. I pretty much understand what it does except for the (?s) part. What does it do?
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^((?s).*)$ index.php?_url=/$1 [QSA,L]
It turns on single-line mode, which makes . also match newline characters (which it normally does not).
In this case it's redundant (and looks awkward), since the URI is a single line anyway.
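To see the effect of (?s) in isolation, here is a minimal sketch. It uses Python's re module rather than Apache's PCRE engine (an assumption made purely for illustration; the inline (?s) flag has the same meaning in both), and the sample string is made up.

```python
import re

# Hypothetical two-line input; a real request URI would not contain a newline.
text = "first\nsecond"

# Without (?s), "." stops at the newline.
default_match = re.match(r"(.*)", text)

# With (?s), "." also matches the newline, so the whole string is captured.
dotall_match = re.match(r"(?s)(.*)", text)

print(repr(default_match.group(1)))  # 'first'
print(repr(dotall_match.group(1)))   # 'first\nsecond'
```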
References:
regular-expressions.info - Specifying Modes Inside The Regular Expression
I recently installed MediaWiki 1.23.9 on a HostGator-hosted server (Apache-based, I believe). I got it all configured, got pretty URLs up and running, got action URLs rewriting properly as well, and everything was nice. I noticed, however, that anchor links, specifically the auto-generated section headers, aren't quite so pretty. They undergo "dot encoding" for some reason I'm not 100% sure of.
This results in /w/MyPage#Section_1_(Stuff_Here) becoming /w/MyPage#Section_1_.28Stuff_Here.29.
With parentheses being valid URI characters (and in fact, if used in a page title, they are properly not encoded in the URI), I don't understand why this is happening, nor how to stop it. I looked through all manner of bug reports and even tried glancing through the MediaWiki source. I found the function that performs the encoding, but as far as I can tell parentheses shouldn't be getting encoded.
My question is: Is there a way to prevent MediaWiki from encoding parentheses in section header anchors? Failing that, can I mask this behavior using .htaccess rules? For reference, my current .htaccess file is below, though I would very much prefer turning it off rather than masking it.
RewriteEngine On
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-d
RewriteRule ^(.*)$ %{DOCUMENT_ROOT}/w/index.php [L]
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-d
RewriteRule ^/?w/images/thumb/[0-9a-f]/[0-9a-f][0-9a-f]/([^/]+)/([0-9]+)px-.*$ %{DOCUMENT_ROOT}/w/thumb.php?f=$1&width=$2 [L,QSA,B]
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-d
RewriteRule ^/?w/images/thumb/archive/[0-9a-f]/[0-9a-f][0-9a-f]/([^/]+)/([0-9]+)px-.*$ %{DOCUMENT_ROOT}/w/thumb.php?f=$1&width=$2&archived=1 [L,QSA,B]
Note: This answer to a different question provides a quick explanation of what the "dot encoding" process is, though not how to exclude parentheses from it.
MediaWiki encodes section ids to honor HTML4 restrictions. This is a relic of the past as MediaWiki uses HTML5 these days, which removed those restrictions. You can set $wgExperimentalHtmlIds to true to make MediaWiki follow HTML5 rules (where only whitespace needs to be converted).
This is called "experimental" because at the time (the setting was introduced in 2010) browser support for HTML5 was somewhat unreliable. Today that's probably fine but no one actually tested that so use it at your own risk.
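To make the "dot encoding" concrete, here is a rough approximation in Python. It is not MediaWiki's actual code, and the function name legacy_dot_encode is made up for this sketch; it only assumes the behaviour described in the question: spaces become underscores, the result is percent-encoded, and each % is then replaced by a dot.

```python
from urllib.parse import quote


def legacy_dot_encode(heading: str) -> str:
    """Approximate the HTML4-style anchor encoding (illustrative only)."""
    underscored = heading.replace(" ", "_")
    # Percent-encode everything except letters, digits and "_.-~".
    percent_encoded = quote(underscored, safe="")
    # Swap "%" for "." so the id only uses characters HTML4 allowed.
    return percent_encoded.replace("%", ".")


print(legacy_dot_encode("Section 1 (Stuff Here)"))  # Section_1_.28Stuff_Here.29
```

The parentheses turn into .28 and .29 because ( and ) are percent-encoded as %28 and %29 before the % is swapped for a dot.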
I am trying to rewrite the URLs on my site so that whatever comes after the slash is passed as an argument (example.com/Page goes to example.com/index.php?page=Page).
Here is the code that isn't working (it gives a Forbidden):
RewriteEngine On
RewriteRule ^/(.+)/$ /index.php?page=$1 [L]
Any Help will be appreciated
This is what I suggested in the comment to your question:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !^/index\.php
RewriteRule ^(.+)$ /index.php?page=$1 [L,B]
The leading slash does not make sense in .htaccess style files, since you do not process an absolute path in there, but a relative one. About the trailing slash: your example does not show such a slash, so why do you want to have it in the regular expression? It results in your pattern matching nothing but requests terminated by a slash, which is not what you want.
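The point about the two slashes can be checked outside Apache. A rough sketch in Python (not Apache itself; it assumes, as described above, that a rule in .htaccess sees the path relative to the directory, so a request for example.com/Page is matched as Page):

```python
import re

# What the pattern is matched against in .htaccess context (assumption: no leading slash).
relative_path = "Page"

original_pattern = re.compile(r"^/(.+)/$")   # pattern from the question
suggested_pattern = re.compile(r"^(.+)$")    # pattern from the answer

original_result = original_pattern.match(relative_path)
suggested_result = suggested_pattern.match(relative_path)

print(original_result)            # None
print(suggested_result.group(1))  # Page
```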
The RewriteCond lines are there to still allow access to physically existing files and directories and to prevent an endless loop, though that should not occur with internal-only rewriting. And you need the B flag to escape the part of the request URL you want to pass as a GET argument.
The last condition is actually redundant, since obviously /index.php should be a file. I leave it in for demonstration purposes.
In general it is a very good idea to take a look at the documentation of Apache's rewriting module: httpd.apache.org/docs/current/mod/mod_rewrite.html
It is well written, very precise and contains lots of really good examples. It should answer all your questions.
I've got a rewriterule in my .htaccess which allows me to add unlimited parameters separated by /'s.
RewriteRule ^(.*)/?$ index.php?params=$1 [L,NC]
This works properly until I send a URL-encoded string to it (using cURL) with an encoded \n in it (%0A).
So server/param1/param2/param3text works, but server/param1/param2/param3text1%0Aparam3text2 doesn't.
I found one Q on Stack Overflow mentioning a similar problem:
How can I apply an htaccess rewrite rule to a URL containing a linefeed character (%0A)?
But I can't/don't know how to implement [\r\n] in my (.*).
Any help?
Ok, so first, I had to add a check to make sure that the file didn't exist (the two RewriteConds take care of that). Then I had to create a pattern that matches any character, or a \r, or a \n, one or more times (+). The zero or more times operator (*) didn't return the results properly.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^((.|\r|\n)+)/? index.php?params=$1 [L,NC]
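Here is a small sketch of why the original pattern fails and the alternation works. It uses Python's re module instead of Apache, and assumes the %0A has already been decoded to a real newline by the time the pattern is applied, which is what the behaviour in the question suggests.

```python
import re

# Path as the rule would see it, with %0A decoded to a newline (assumption).
path = "param1/param2/param3text1\nparam3text2"

# Original rule: "." does not match "\n", so the pattern cannot reach the end.
original_result = re.match(r"^(.*)/?$", path)

# Pattern from this answer: the alternation lets the group consume "\r" and "\n" too.
fixed_result = re.match(r"^((.|\r|\n)+)/?", path)

print(original_result)                # None
print(fixed_result.group(1) == path)  # True
```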
Just an FYI here: A common hacking method called Whitespace filtering uses %0A
Filtering can be bypassed on the space character by using alternative
whitespace characters to the space character (%20). Most SQL engines
consider a line return (%0a in a *NIX environment, %0a%0d in a Windows
environment), tab characters, or the + character as valid whitespace:
You need to use the %{THE_REQUEST} variable to grab the actual path from the original request line as Apache received it.
Try this code:
RewriteCond %{QUERY_STRING} !^params=.+ [NC]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s/+[^/]+/([^\s]+) [NC]
RewriteRule ^ index.php?params=%1 [L,QSA]
Then inside index.php check $_SERVER["QUERY_STRING"] for the full unadulterated path with %0A in it.
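To see what %1 ends up holding, the THE_REQUEST condition can be tried against a sample request line. This is only a sketch in Python, with re.IGNORECASE standing in for [NC]; the request line is a made-up example built from the URL in the question.

```python
import re

# Same pattern as in the RewriteCond on %{THE_REQUEST} above.
the_request_pattern = re.compile(r"^[A-Z]{3,}\s/+[^/]+/([^\s]+)", re.IGNORECASE)

# Hypothetical raw request line; %0A is still encoded at this point.
request_line = "GET /param1/param2/param3text1%0Aparam3text2 HTTP/1.1"

captured = the_request_pattern.match(request_line).group(1)
print(captured)  # param2/param3text1%0Aparam3text2
```

Note that, as written, the pattern skips the first path segment, so with this sample request only the part after param1/ is captured.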
This is my .htaccess code:
RewriteBase /kajak/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^moduli/([^/]+)/(.*)$ moduli/$1/index.php/$2 [L]
Now / is appended to every URL. For example, http://127.0.0.1/moduli/novice becomes http://127.0.0.1/moduli/novice/.
How can I prevent getting / at the end?
While I do not know the answer to your question, I will note two oddities about your question and your code that may be related to the problem at hand.
With the RewriteBase you have in your code, those rules should not even be triggered.
While I am new to regex myself, I look at ([^/]+) and am a little unsure why you are capturing it. Note that ^ only matches the START of the string when it appears outside square brackets; inside [ ] it negates the class, so [^/]+ means one or more characters that are not a slash.
This being said, I would probably write the code as below:
RewriteBase /moduli/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/(.*)$ $1/index.php/$2 [L]
This would rewrite URLs as below:
http://www.website.com/moduli/novice/view
http://www.website.com/moduli/novice/index.php/view
Based on your block of code, this seems to be what you are trying to do. If it is not, then I am sorry.
I don't think that's related to your rewrite rule, (it does not match it).
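A quick way to check that claim is to try the pattern on the path from the question. The sketch below uses Python's re module, and assumes the rule is matched against moduli/novice, i.e. the path relative to the directory holding the .htaccess file; the second path is a made-up example.

```python
import re

# Pattern from the RewriteRule in the question.
rule_pattern = re.compile(r"^moduli/([^/]+)/(.*)$")

# The URL from the question has no slash after "novice", so the rule cannot match.
no_slash_result = rule_pattern.match("moduli/novice")

# A longer, hypothetical path does match and yields both captures.
longer_result = rule_pattern.match("moduli/novice/view")

print(no_slash_result)         # None
print(longer_result.groups())  # ('novice', 'view')
```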
The / is added because when you request http://example.com/xx/zz and the web server detects that zz is a directory, it redirects to http://example.com/xx/zz/ with a 301 (the browser makes another request; check your Apache logs).
Read about the trailing slash redirect thing here.
Then you must ask yourself what you want to happen when the requested URL is http://127.0.0.1/moduli/novice/. (Do you want it to be caught by your rewrite or not? Currently it is not caught, because of RewriteCond %{REQUEST_FILENAME} !-d.)
BTW, I don't quite understand your RewriteBase /kajak/ line there - are you sure it's correct?
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^admin/([^/]+)/([^/]+).php website.com/admin/index.php?route=$1/$2 [NS]
RewriteRule ^modules/([^/]+)/([^/]+).php website.com/admin/index.php?route=$1/$2 [NS]
The above works when you log in: it goes to /admin/modules/catalog. But when you click on a link that shows up in the status bar as /admin/vendors/, it doesn't work. vendors is a subdirectory of modules/.
Any ideas?
Considering what is said in the documentation about RewriteCond, and what you get, I'd say that the RewriteConditions are only applied to the one RewriteRule that follows :
The RewriteCond directive defines a
rule condition. One or more
RewriteCond can precede a RewriteRule
directive. The following rule is then
only used if both the current state of
the URI matches its pattern, and if
these conditions are met.
What if you try duplicating those RewriteCond lines in front of the second RewriteRule? At least as a test?
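The scoping described in that quote can be pictured with a toy model. This is not how mod_rewrite is implemented; it is a simplified Python sketch in which every name (Rule, first_applicable, the file set) is made up, and each rule carries only the conditions written directly above it.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Optional

Condition = Callable[[str], bool]


@dataclass
class Rule:
    pattern: str
    name: str
    # Only the conditions placed directly before this rule belong to it.
    conditions: List[Condition] = field(default_factory=list)


def first_applicable(path: str, rules: List[Rule]) -> Optional[Rule]:
    """Return the first rule whose pattern and own conditions all hold."""
    for rule in rules:
        if re.search(rule.pattern, path) and all(c(path) for c in rule.conditions):
            return rule
    return None


# Hypothetical files that exist on disk.
existing_files = {"admin/tools/report.php", "modules/catalog/list.php"}


def is_not_existing_file(path: str) -> bool:
    return path not in existing_files


rules = [
    # The "!-f" style condition is attached to the first rule only.
    Rule(r"^admin/([^/]+)/([^/]+)\.php", "admin rule", [is_not_existing_file]),
    # The second rule has no conditions of its own.
    Rule(r"^modules/([^/]+)/([^/]+)\.php", "modules rule"),
]

admin_hit = first_applicable("admin/tools/report.php", rules)
modules_hit = first_applicable("modules/catalog/list.php", rules)

print(admin_hit)         # None
print(modules_hit.name)  # modules rule
```

In this model the existing file under admin/ is protected by the condition, while the existing file under modules/ is still rewritten, because the second rule has no conditions of its own.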
And here is an interesting thing about the S|Skip flag. Amongst other things and an example, it is said :
The [S] flag is used to skip rules
that you don't want to run. This can
be thought of as a goto statement in
your rewrite ruleset.
And also :
This technique is useful because a
RewriteCond only applies to the
RewriteRule immediately following it.
Thus, if you want to make a
RewriteCond apply to several
RewriteRules, one possible technique
is to negate those conditions and use
a [Skip] flag.
I've never tried this, but it might be useful, in your situation... maybe ^^
Still, this part of Apache's documentation seems to indicate that what I said earlier is right.
Assuming poor English?
It sounds like you're requesting /admin/vendors; but you're saying that vendors is located in /modules.