htaccess - check if file exists, otherwise run php - .htaccess

Background:
I recently wrote a dynamic image-resizing script in PHP. I give it the path of an image somewhere on the server plus a width as a GET parameter, and it returns the image resized to that width and saves the result in a cache folder for future use.
If the same image is later requested with the same width, the request still goes to the PHP script, which checks whether the cached file already exists and, if it does, simply outputs it. (It calls file_get_contents() and echoes the result with the appropriate header.)
The challenge:
What I want to do is to bypass the PHP script if the file already exists. I want .htaccess to check if the file exists in the cache folder, and then simply go to that file, instead of the PHP script. That would be simple enough, except the filename includes the modified width.
So, for instance:
If the file's relative path is images/products/36sh542dhs.jpg and I want it resized to a width of 100px, the request URL will look like this: si/images/products/36sh542dhs.jpg?d=100. The file will be stored as resized_images/cache/images/products/36sh542dhs---D100.jpg.
Is there any conceivable way to get .htaccess to take in si/images/products/36sh542dhs.jpg?d=100, break it up, remove si, add resized_images/cache instead, AND modify the filename to insert ---D100 right before the "dot extension" part?
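To pin down the transformation being asked for, here is a small sketch in Python. It is only meant to illustrate the string mapping, not to be part of the server setup, and the function name cache_path_for is made up for this example.

```python
import posixpath


def cache_path_for(request_path: str, width: int) -> str:
    """Map 'si/<dirs>/<name>.<ext>' plus a width to the cached file path."""
    prefix = "si/"
    if not request_path.startswith(prefix):
        raise ValueError("not a resize URL: " + request_path)
    relative = request_path[len(prefix):]      # images/products/36sh542dhs.jpg
    stem, ext = posixpath.splitext(relative)   # split off the ".jpg" part
    # Insert the width marker right before the extension.
    return f"resized_images/cache/{stem}---D{width}{ext}"


print(cache_path_for("si/images/products/36sh542dhs.jpg", 100))
# resized_images/cache/images/products/36sh542dhs---D100.jpg
```

The awkward part for mod_rewrite is that the width comes from the query string while the rest comes from the path, so a rule has to combine both.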

The best approach that comes to mind is to always link to the cached path: /resized_images/cache/images/products/36sh542dhs---D100.jpg.
In .htaccess you can add the line ErrorDocument 404 /404-error-handler.php. This catches all 404 errors.
In 404-error-handler.php, check the value of $_SERVER['REQUEST_URI']. If it matches your cache path, run the image generation and caching logic; otherwise return a custom 404 error page. When the handler does generate an image, it should also set a 200 status, because the response otherwise keeps the 404 status.
This gives you what you asked for: if the file exists, Apache serves it to the client directly. If it does not, the request falls through to the custom 404 PHP script, which checks whether the URL matches your image-generation URL and either generates and caches the image or shows a 404 page.
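The decision the 404 handler has to make can be sketched as follows. This is written in Python purely to show the logic; the real handler would be the PHP script named above. The pattern and the name parse_cache_request are assumptions based on the ---D<width> naming from the question.

```python
import re

# Assumed shape of a cached-image URL: /resized_images/cache/<path>---D<width>.<ext>
CACHE_URL = re.compile(r"^/resized_images/cache/(.+)---D(\d{2,4})(\.\w+)$")


def parse_cache_request(request_uri: str):
    """Return (source_path, width) for a cached-image URL, or None otherwise."""
    path = request_uri.split("?", 1)[0]   # ignore any query string
    match = CACHE_URL.match(path)
    if match is None:
        return None                        # caller shows the normal 404 page
    stem, width, ext = match.groups()
    return stem + ext, int(width)          # caller resizes, caches, and outputs
```

A None result means "show the normal 404 page"; anything else tells the handler which source image to resize and at what width.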

Another way (kind of the way you were suggesting)...
Link to:
/si/images/products/36sh542dhs---D100.jpg
Cached images stored in:
/resized_images/cache/images/products/36sh542dhs---D100.jpg
Script that actually resizes images:
/scripts/image-resizer.php?image=36sh542dhs&size=100
By linking to a different URL you can keep your cache directory and image script "private" - which can be easily moved to different locations if you wish.
Using mod_rewrite in .htaccess:
RewriteEngine On
# Serve cached image if it exists
RewriteCond %{DOCUMENT_ROOT}/resized_images/cache/$1 -f
RewriteRule ^si/(images/products/\w+---D\d{2,4}\.jpg)$ resized_images/cache/$1 [L]
# Otherwise send request to PHP image resizing script
RewriteRule ^si/images/products/(\w+)---D(\d{2,4})\.jpg$ scripts/image-resizer.php?image=$1&size=$2 [L]
Alternatively, you can link directly to the cached image (as mentioned in @besciualex's answer) and rewrite the request to your image resizer script when the file doesn't exist, rather than relying on the error handler:
# Send request to PHP image resizing script when cached image does not exist
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^resized_images/cache/images/products/(\w+)---D(\d{2,4})\.jpg$ scripts/image-resizer.php?image=$1&size=$2 [L]
Only requests that look like cached images are checked.
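A pattern like this can be sanity-checked outside Apache. The sketch below uses Python's re module, which behaves the same as PCRE for these simple constructs; the helper name first_group is made up for the example. It makes one detail visible: the image name needs \w+ (one or more word characters), because a bare \w matches exactly one character.

```python
import re

# In .htaccess the RewriteRule pattern is matched against the URL path
# without its leading slash, so the test string has none either.
url_path = "si/images/products/36sh542dhs---D100.jpg"

one_char = re.compile(r"^si/(images/products/\w---D\d{2,4}\.jpg)$")
one_or_more = re.compile(r"^si/(images/products/\w+---D\d{2,4}\.jpg)$")


def first_group(pattern, path):
    """Return what $1 would hold, or None when the rule would not match."""
    match = pattern.match(path)
    return match.group(1) if match else None


print(first_group(one_char, url_path))     # None: the name is longer than one character
print(first_group(one_or_more, url_path))  # images/products/36sh542dhs---D100.jpg
```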

Related

Is there any way to open another file when accessing an image via htaccess?

I would like when accessing the url https://www.italinea.com.br/uploads/jx5rufam7adfwd75pi6c.jpg, which is an existing file on the server, open the php file https://www.italinea.com.br/image.php.
Which htaccess rule do I use?
I tried to use:
RewriteRule ^(.+)\.jpg image.php [L,QSA]
But as it is an existing file, it opens the image and not the .php file
RewriteRule ^(.+)\.jpg image.php [L,QSA]
This is the right idea and should "work", although it could be simplified a bit. And it is rather generic, so matches every .jpg request.
If this is not working then either:
You have a conflict with other directives in the .htaccess file (the order of directives can be important).
You have a front-end proxy that is serving your static content. This is a problem if the URL being requested maps to a physical file as the application server (ie. .htaccess) is then completely bypassed.
If this is the case then see my answer to the following related question:
WordPress: can't achieve direct image access redirection via .htaccess

Adding a subdirectory to an invisible .htaccess URL rewrite

I wanted to add a subdirectory to my url so it would become easier to read:
Example of what I'd like:
localhost/testwebsite/users.php?firstname=john
should become
localhost/testwebsite/users/john
I tried using the .htaccess mod_rewrite rules and came up with this:
RewriteEngine On
RewriteBase /testwebsite/
RewriteRule ^users/(.*)$ users.php?firstname=$1
What happens when I use that code: it rewrites the URL successfully, it shows the HTML of the correct user, and it processes the argument correctly. However, all stylesheets, images, scripts, and anything else with a relative path cannot be found and respond with a 404 message, because of the extra subdirectory added in the new URL, I reckon.
Am I doing something wrong? Is there another technique I should be using? Or should I simply make all paths in my project absolute with regards to the root?
You're doing it right. The browser doesn't know that the actual path of the file is different. Use absolute paths or make paths relative to the easier to read URL.
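To see why the relative paths break, here is how a browser resolves them against each page URL. The sketch uses urljoin from Python's standard library, which follows the same resolution rules; the stylesheet path css/style.css is just an assumed example.

```python
from urllib.parse import urljoin

original = "http://localhost/testwebsite/users.php?firstname=john"
pretty = "http://localhost/testwebsite/users/john"


def resolve(page_url: str, href: str) -> str:
    """Resolve a link the way a browser would, relative to the page URL."""
    return urljoin(page_url, href)


# Relative link: resolved against the "directory" of the page URL.
print(resolve(original, "css/style.css"))
# http://localhost/testwebsite/css/style.css
print(resolve(pretty, "css/style.css"))
# http://localhost/testwebsite/users/css/style.css

# Root-relative link: the same from either page.
print(resolve(pretty, "/testwebsite/css/style.css"))
# http://localhost/testwebsite/css/style.css
```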

Clean URL rewriting rule

right now my url looks like this:
http://domain.com/en/c/product%2C-product2%2C-product3/82
where the last number is the category number.
I'm trying to rewrite it and redirect the user to a URL that looks like this:
http://domain.com/82/product-product2-product3
The point is that I want to hide the "en/c/" part and strip commas and blank spaces from the URL. I'm completely new to rewriting.
Any ideas?
You can use these 2 rules in your root .htaccess for that:
RewriteEngine On
RewriteBase /
RewriteRule ^en/c/([^,\s]*)[,\s]+([^,\s]*)/(\d+)/?$ $3/$1$2 [NC,L,NE,R=302]
RewriteRule ^(en/c)/([^,\s]*)[,\s]+(.*)/(\d+)/?$ $1/$2$3/$4 [NC,L]
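How the two rules cooperate is easier to follow with a small model. The sketch below is Python, not Apache: it assumes the rules run in a root .htaccess, where processing starts over after an internal rewrite, and that the pattern sees the decoded path (so %2C is already a comma). The function name final_redirect is made up for the example.

```python
import re

# Rule 1: exactly one run of commas/spaces left -> external redirect.
RULE_REDIRECT = re.compile(r"^en/c/([^,\s]*)[,\s]+([^,\s]*)/(\d+)/?$", re.IGNORECASE)
# Rule 2: more than one run left -> strip the first run internally and go around again.
RULE_STRIP = re.compile(r"^(en/c)/([^,\s]*)[,\s]+(.*)/(\d+)/?$", re.IGNORECASE)


def final_redirect(path: str, max_passes: int = 20):
    """Return the redirect target for a decoded URL path, or None if no rule applies."""
    for _ in range(max_passes):
        m = RULE_REDIRECT.match(path)
        if m:
            return f"{m.group(3)}/{m.group(1)}{m.group(2)}"
        m = RULE_STRIP.match(path)
        if m is None:
            return None  # nothing left to clean up
        path = f"{m.group(1)}/{m.group(2)}{m.group(3)}/{m.group(4)}"
    return None


print(final_redirect("en/c/product,-product2,-product3/82"))
# 82/product-product2-product3
```

On the first pass only the second rule matches and removes one comma; on the next pass a single comma is left, so the first rule matches and issues the redirect.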
In order for this to work, we need to tell the server to internally redirect all requests for the URL "url1" to "url2.php". We want this to happen internally, because we don't want the URL in the browser's address bar to change.
To accomplish this, we need to first create a text document called ".htaccess" to contain our rules. It must be named exactly that (not ".htaccess.txt" or "rules.htaccess"). This would be placed in the root directory of the server (the same folder as "url2.php" in our example). There may already be an .htaccess file there, in which case we should edit that rather than overwrite it.
The .htaccess file is a configuration file for the server. If there are errors in the file, the server will display an error message (usually with an error code of "500").
If you are transferring the file to the server using FTP, you must make sure it is transferred using the ASCII mode, rather than BINARY. We use this file to perform 2 simple tasks in this instance - first, to tell Apache to turn on the rewrite engine, and second, to tell apache what rewriting rule we want it to use. We need to add the following to the file:
# Turn on the rewriting engine
RewriteEngine On
# Handle requests for "url1"
RewriteRule ^url1/?$ url2.php [NC,L]

What does the RewriteRule in .htaccess convert to in an app.yaml file?

How do I convert this .htaccess file to an app.yaml file?
Here is the .htaccess file:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(?!.*?public).* index.php [QSA,L]
I need to do this to run an app using PHP on Google App Engine.
The reason I'm asking is that Google specifically recommends, in its official documentation, a code example hosted on GitHub called Dr Edit. The Dr Edit example has a .htaccess file but no app.yaml file, and the very first step in its README for setting up the application is to create a Google App Engine application. So Google has provided a code example that implies it will run on Google App Engine, but it won't.
Supposedly Google is monitoring Stack Overflow for issues related to GAE, so I hope they read this.
Here is information about how to simulate Apache mod_rewrite $_GET['q'] routing for a PHP App Engine app, in case that's helpful.
As the name .htaccess implies, it can control access to a directory. The .htaccess files are read on every request. I guess that means that every HTTP request to the domain name must go through the .htaccess file. So it seems like the .htaccess file is like a gatekeeper. From Wikipedia, it seems that the .htaccess file can be used to block certain IP address from loading a web page.
The first line of the .htaccess file is: RewriteEngine On
That line of code: RewriteEngine is an Apache Directive which turns the Apache Rewriting Engine on or off.
So that .htaccess file is looking at every request, taking the request, then pushing out a different request.
It seems that the app.yaml file can not directly take one URL request, and push out a different request, BUT the app.yaml file can take a URL request, and then run a PHP script, that will change the request.
So, in the app.yaml file, the section that takes the URL, then redirects to a script, would look like this:
Catch the incoming URL requests and cause the mod_rewrite.php file to run.
- url: /Put the incoming URL you want to monitor, catch and redirect here
  script: mod_rewrite.php
Redirect URL requests with app.yaml
The second line of the .htaccess file is:
RewriteCond %{REQUEST_FILENAME} !-f
This is a test, looking for a match in the incoming URL request. The -f at the end is for:
-f (is regular file) Treats the TestString as a pathname and tests whether or not it exists, and is a regular file.
Oh wait! There is an exclamation point in front of it. That negates the test, so the condition is true when the path is NOT an existing regular file.
A regular file is an ordinary file, as opposed to a directory, symbolic link, device, or other special file. So the rule only applies when the request does not map to a real file on disk.
From the Apache documentation, here is a quote:
Server-Variables: These are variables of the form %{ NAME_OF_VARIABLE } where NAME_OF_VARIABLE can be a string taken from the following list:
So that %{REQUEST_FILENAME} part of the second line is what Apache is calling a Server-Variable, and the specific Server-Variable is REQUEST_FILENAME.
Here is a quote from the Apache documentation:
REQUEST_FILENAME
The full local filesystem path to the file or script matching the request, if this has already been determined by the server at the time
REQUEST_FILENAME is referenced. Otherwise, such as when used in
virtual host context, the same value as REQUEST_URI. Depending on the
value of AcceptPathInfo, the server may have only used some leading
components of the REQUEST_URI to map the request to a file.
So the .htaccess file is testing the filesystem path on the server that the requested URL maps to, not anything on your local computer.
Finally there is the Rewrite Rule Apache RewriteRule
What is the QSA at the end for? I guess it's called a flag, and:
Using the [QSA] flag causes the query strings to be combined
QSA seems to stand for Query String Append. If the incoming URL has a query string on the end of it, how do you want it dealt with?
The [L] flag stops any further rewrite rules from being processed in the current pass
Apache L rewrite flag
Apache Documentation QSA
The character (^) is called a caret. At the beginning of a pattern it anchors the match to the start of the URL path. The syntax for the Apache RewriteRule directive is:
RewriteRule pattern target [Flag1,Flag2,Flag3]
So the caret is part of the pattern being matched. The target is simply index.php, so lots of different possible requests simply get routed to the index.php file, which is the very first thing the application runs.
The .* at the end matches any sequence of characters (not just an extension), and the (?!.*?public) part before it is a negative lookahead that excludes any path containing "public".
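Putting the pieces together, the three .htaccess lines amount to a small routing decision. Here is a rough model of it in Python, under the assumption that a set of paths stands in for the files that exist on disk; the names route and existing_files are made up for the example. Any replacement for the .htaccess file would need to reproduce this behavior.

```python
import re

# Same pattern as the RewriteRule: match anything that does not contain "public".
FRONT_CONTROLLER = re.compile(r"^(?!.*?public).*")


def route(url_path: str, existing_files: set) -> str:
    """Return the script or file that ends up handling url_path."""
    if url_path in existing_files:
        return url_path          # RewriteCond !-f fails: a real file, served as-is
    if FRONT_CONTROLLER.match(url_path):
        return "index.php"       # everything else goes to the front controller
    return url_path              # contains "public": left untouched


print(route("drive/files/42", set()))                 # index.php
print(route("public/css/app.css", set()))             # public/css/app.css
print(route("favicon.ico", {"favicon.ico"}))          # favicon.ico
```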
I think you're out of luck :-( Rewrite is an Apache module but AppEngine is based on Jetty, so you must redesign for that. You might be able to use one of the Bundled Servlets, Filters, and Handlers but none of them are a direct substitute for Apache Module mod_rewrite. Some people have used Apache as a front end before Jetty, but that is a clumsy approach. Sorry.

Rewrite existing but broken symlinks with .htaccess?

I would like to use a rewrite rule that is executed when a symlink exists but is broken.
So the scenarios would be:
Symlink does not exist: normal 404/403 error.
Symlink exists but is broken: generate-cache.php is called.
Symlink exists and is working: target file is loaded normally.
For example:
## Symlink does not exist.
GET /links/cache/secret.jpg
404 Not Found
## Symlink is broken.
GET /links/cache/secret.jpg
Links to /images/cache/secret.jpg
Because it's broken, rewrites to: generate-cache.php?path=cache/secret.jpg
200 OK
## Symlink works.
GET /links/cache/secret.jpg
Links to /images/cache/secret.jpg
200 OK
Update: I want to avoid using PHP to do these checks, because it causes a performance bottleneck. Outputting the file through PHP if it exists causes PHP to lock. Also I have no option to use multiple PHP threads or install additional apache modules.
I don't know of a way of testing for a broken symlink in mod_rewrite (-l checks for the existence of a symlink, but doesn't attempt to follow it), which may mean you'd need to write some kind of callback in PHP (or some other language).
An alternative approach would be to rewrite all requests, and build this logic in PHP:
if the file exists in the cache directory, set appropriate headers and use readfile() to output the data
if the symlink exists (or just an empty file with the right name in a "control" directory; I presume you have some other process creating the symlinks, so this could be amended to touch files instead), do appropriate generation
if the symlink/control file doesn't exist, send a 404 header and immediately exit
Another variation, slightly more efficient, would be to let Apache serve the cached image directly if it exists, and rewrite to PHP for steps 2 and 3. Something like this:
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
RewriteRule ^/?links/cache/(.*)$ generate-cache.php?path=$1 [L]
And in PHP
// NOTE: $_GET['path'] is user input; reject values containing '..' before using it in a path
if ( ! file_exists('cache_control/' . $_GET['path']) )
{
    // No control file: this is not an allowable image
    header('HTTP/1.1 404 Not Found');
    exit;
}
else
{
    // Control file exists, so this is an allowable file; carry on...
    generate_file_by_whatever_magic_you_have( 'links/cache/' . $_GET['path'] );
    header('Content-Type: image/jpeg'); // May need to support different types
    readfile( 'links/cache/' . $_GET['path'] );
    exit;
}
Assuming you can replace the symlinks with control files, and the names match up directly (i.e. the target of your symlink can be "guessed" from its name), you could move the control file check into mod_rewrite as well:
# If the requested file doesn't exist (if it does, let Apache serve it)
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
# Match the basic path format and capture image name into %1
RewriteCond %{REQUEST_URI} ^/links/cache/(.*)
# Check if a cache control file exists with that image name
RewriteCond %{DOCUMENT_ROOT}/cache_control/%1 -f
# If so, serve via PHP; if not, no rewrite will happen, so Apache will return a 404
RewriteRule ^/?links/cache/(.*)$ generate-cache.php?path=$1 [L]
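For reference, the three symlink states from the question are easy to tell apart in a script, which is what a callback would have to do. The sketch below is Python rather than PHP, and the names symlink_state and demo are made up for the example. It relies on the fact that an "is a link" check looks at the link itself while an "exists" check follows it.

```python
import os
import tempfile


def symlink_state(path: str) -> str:
    """Classify a path as 'missing', 'broken', or 'ok'."""
    if not os.path.islink(path):     # looks at the link itself, does not follow it
        return "missing"             # no symlink here: normal 404/403
    if not os.path.exists(path):     # follows the link to its target
        return "broken"              # link is there but the target is gone
    return "ok"                      # link and target both exist


def demo():
    """Build one symlink of each kind in a temporary directory and classify them."""
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "target.jpg")
        open(target, "wb").close()

        good = os.path.join(root, "good.jpg")
        os.symlink(target, good)

        broken = os.path.join(root, "broken.jpg")
        os.symlink(os.path.join(root, "gone.jpg"), broken)

        absent = os.path.join(root, "absent.jpg")
        return symlink_state(absent), symlink_state(broken), symlink_state(good)


print(demo())
# ('missing', 'broken', 'ok')
```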

Resources