Simple redirect rules for Amazon S3 - .htaccess

I'm using S3 and CloudFront to store the images, CSS, and JS files of my web site, which is not static and is hosted on a proper web server.
Since the CSS file changes frequently, I'm using a version number to make sure the user's browser reloads it when it changes. When I was hosting the CSS file on my Apache web server, I was using the following redirect rule:
RewriteEngine On
# CSS Redirection (whatever.min.5676.css is redirected to whatever.min.css)
RewriteRule ^(.*)\.min\.[0-9]+\.css$ $1.min.css
With this simple rule, a request for http://www.example.com/all.min.15.css was served the content of http://www.example.com/all.min.css
How can I reproduce such a rule with Amazon S3 and/or CloudFront?
i.e. to have http://example.amazonaws.com/mybucket/css/all.min.3.css or http://example.amazonaws.com/mybucket/css/all.min.42.css redirected to http://example.amazonaws.com/mybucket/css/all.min.css
(Note: my S3 bucket is NOT configured as a website; would it need to be in order to enable redirection rules?)

NOTE: this answer does not use any rewrite rule, so it may not be exactly what you asked for.
I would use a query parameter to handle different versions, like:
http://example.amazonaws.com/mybucket/css/all.min.css?ver42
http://example.amazonaws.com/mybucket/css/all.min.css?42
http://example.amazonaws.com/mybucket/css/all.min.css?ver=42
http://example.amazonaws.com/mybucket/css/all.min.css?20141014
To be exact, in my dynamic web page the version parameter is stored in a variable and appended to the URLs (both CSS and JS). During development I only have to increase or set one variable to force the browser to load a new version. This way, there is no need for rewrite rules, even on Apache.
Caching also works, as the Last-Modified and ETag headers are kept intact.
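For example, a minimal sketch of that in a PHP template (the $assetVersion variable and the js/all.min.js path are illustrative):
<?php
// One variable, defined in one place; bump it to force browsers to re-fetch.
$assetVersion = 42;
?>
<link rel="stylesheet" href="http://example.amazonaws.com/mybucket/css/all.min.css?ver=<?php echo $assetVersion; ?>">
<script src="http://example.amazonaws.com/mybucket/js/all.min.js?ver=<?php echo $assetVersion; ?>"></script>
S3 serves the same all.min.css object regardless of the query string, so nothing changes on the bucket side; only the URL the browser caches against does.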
Hope this helps.

Related

Force 200 response codes from Azure Static Website - SPA (Google won't index routes)

I have a React SPA that is being hosted as an Azure Static Website. The configuration is rather simple: HTML, JS, and other files are deployed to Azure Storage. I then enable the static website feature and expose this via a Verizon Premium CDN Endpoint.
The Static Website is configured to serve index.html as the index and error document. The issue I am seeing is that when a route such as /faqs is requested, the response is a 404 with the index.html doc as the response body. This works fine in the browser, but Google will not crawl it because it sees the response as a 404.
Is there any way around this? Is there any way to force 2xx response codes?
Well, after messing around trying to configure Azure to force status codes, I found a solution. It's not ideal, but it works and will be fine for now.
SOLUTION: I cloned my index.html as faqs (no extension, so I set the content type manually) so that the respective version is served when requested. Happy days! Glad I only have a small number of public pages.
Since you have the CDN layer in front of your website, you can have the CDN deliver the index.html via a URL rewrite rather than relying on the static website's "error page" delivery mechanism. This holds up even if you have a variable number of routes in your application.
Configure a rule in your CDN's Rules Engine that takes any path without a file extension (since we want normal requests for assets or script/style files to return those actual files) and rewrites it to /index.html. A rewrite means the URL of the actual request stays the same, but the file that gets delivered comes from the rewritten URL.
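The exact rules-engine syntax is CDN-specific, but the decision the rule encodes is simple. Here is that matching logic sketched in PHP (illustration only, not CDN configuration; the function name is made up):
<?php
// Requests whose last path segment has no dot (/faqs, /about/team) get the
// SPA shell; requests for actual files (/main.js, /styles.css) pass through.
function resolveToFile(string $path): string
{
    return strpos(basename($path), '.') === false ? '/index.html' : $path;
}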
See this article for more.

Can I redirect a user to a Google Drive PDF while keeping my URL in the address bar?

If I have a user follow a link to my site such as
mydomain.com/pdf/google_token
is there a way for me to redirect them to the Google pdf
drive.google.com/file/d/google_token/view
while keeping
mydomain.com/pdf/google_token
in the address bar?
Right now I am redirecting to Google successfully using
RewriteRule ^pdf/([a-zA-Z0-9]+)$ https://drive.google.com/file/d/$1/view
in my .htaccess file, but it is replacing the URL with
drive.google.com/file/d/google_token/view
Thanks.
You are not looking for a way to redirect. A redirect always changes the URL in the client; that is the whole purpose of a redirect. What you are looking for is a proxy solution, maybe in combination with an internal rewrite. That creates a kind of mapping: the content published on that Google resource is re-published through your HTTP host.
This would be an example for such setup:
ProxyPass /google-drive/ https://drive.google.com/
ProxyPassReverse /google-drive/ https://drive.google.com/
RewriteEngine on
RewriteRule ^/?pdf/([a-zA-Z0-9]+)$ /google-drive/file/d/$1/view [END]
An alternative would be to only re-publish a section of that remote resource:
ProxyPass /google-drive/ https://drive.google.com/file/d/
ProxyPassReverse /google-drive/ https://drive.google.com/file/d/
RewriteEngine on
RewriteRule ^/?pdf/([a-zA-Z0-9]+)$ /google-drive/$1/view [END]
That setup will work likewise in the HTTP server's host configuration, and you can probably also get it to work using a dynamic configuration file (".htaccess" style file). If you really need to use such a file, take care that its interpretation is enabled in the host configuration. You also definitely need Apache's proxy module loaded. You should prefer to place such rules in the HTTP server's host configuration, though, for various reasons.
If that setup is not possible, for example because you do not have access to the proxy module, then you can implement a simple routing solution that fetches the PDF in the background, for example using PHP's cURL extension, and forwards the payload along with the correct HTTP headers to the client that requested the PDF. That is usually done for resources kept locally, but there is no reason why you can't do it with remote resources too.
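A minimal sketch of that fallback using PHP's cURL extension, assuming requests for /pdf/<token> are rewritten to this script with a token query parameter (the script name and parameter are hypothetical):
<?php
// Sanitize the token taken from the rewritten query string.
$token = preg_replace('/[^a-zA-Z0-9]/', '', $_GET['token'] ?? '');

// Fetch the remote document in the background.
$ch = curl_init('https://drive.google.com/file/d/' . $token . '/view');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // capture the body instead of printing it
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow any redirects Google issues
$body = curl_exec($ch);
$status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
$contentType = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
curl_close($ch);

// Forward the status, content type and payload; the address bar keeps your domain.
http_response_code($status);
header('Content-Type: ' . $contentType);
echo $body;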
Some additional notes:
if you only deliver documents from that Google Drive resource, then you probably do not need the ProxyPassReverse directive, only ProxyPass.
if you run into an internal server error (HTTP status 500) using the above setup, chances are that you operate a very old version of the Apache HTTP server. In that case you will find a hint about an unsupported END flag in your HTTP server's error log file. Try using the [L] flag instead; it will probably work the same here, but that depends on the rest of your setup.

.htaccess to limit access to directory not working properly

I'm trying to limit access to a directory based on the results of a PHP script. I have the following in the .htaccess file in the directory where the files are located:
RewriteCond %{REQUEST_URI} !=league_access.php
RewriteRule .* league_access.php
I have also tried:
RewriteEngine on
RewriteRule .* league_access.php
If you go to the directory http://www.bowling-tracker.com/bowl/league_documents/1/ you will note that it fires the league_access.php script (it currently just prints "Running the Test Script / Restricted access" to the page).
So that is acting correctly.
However, if you go to http://www.bowling-tracker.com/bowl/league_documents/1/test.html you will see that you're granted access to the page (rather than being sent to the league_access.php script).
This website is hosted with FastComet (a public hosting company), so I cannot change server settings or any files except the .htaccess file.
Any help to resolve this would be greatly appreciated.
Thanks....
FastComet Team here! Part of our shared hosting environment is using Nginx as a reverse proxy in front of the Apache web service. This configuration combines the advantages of both services and ensures better performance for your project. Nginx processes all requests for static content, such as PDF files or HTML pages. Here's a list of all file types that will be processed by the Nginx service:
3gp|gif|jpg|jpeg|png|ico|wmv|avi|asf|asx|mpg|mpeg|mp4|pls|mp3|mid|wav|swf|flv|html|htm|js|css|exe|zip|tar|rar|gz|tgz|bz2|uha|7z|doc|docx|xls|xlsx|pdf|iso
However, if the request is for dynamic content, such as a PHP script, it is passed from Nginx to the Apache service. You are correctly setting the rule in question in the .htaccess file of your website, but this file is only read by the Apache service, not by Nginx. In other words, if there is a request for static content, such as a PDF file here:
http://www.bowling-tracker.com/bowl/league_documents/1/Rules_Thurs_Night_Mixed.pdf
or an HTML page here:
http://www.bowling-tracker.com/bowl/league_documents/1/test.html
it will be processed by Nginx without considering the .htaccess rules that you have set. There is an easy way of resolving this: exclude the HTML, HTM, and PDF file types from Nginx processing for your domain, or even for your entire hosting account. That way, those requests will be processed by the Apache web server instead of Nginx, so the .htaccess rules you apply will be taken into consideration and will work without any issues.

Cloudflare page rules to serve static files without extension

I have managed to configure my Nginx (running in front of Node.js) to serve static files without the .html extension (e.g. going to site.com/about serves the about.html page), with help from these past questions: how to serve html files in nginx without showing the extension in this alias setup and https://serverfault.com/questions/346994/hide-html-file-extensions-using-nginx-rewrites
But I am unable to figure out how to set up Cloudflare page rules to work with this setup (the current page rules are setup to include static html files as well as js, css, etc.).
How do I configure Cloudflare to serve the about.html page when the user goes to site.com/about, and the team.html page when the user goes to site.com/about/team? Do I need to do anything special, or is the Nginx setup sufficient?
If CloudFlare caching of your static pages isn't required, there's no need for you to do anything; everything should work out of the box.
If you want CloudFlare to also cache those static pages, try setting up page rules to Cache Everything on your site:
Domain > Page Rules
Pattern: *site.com/*
Custom Caching > Cache everything
Once you set up the page rules, CloudFlare should cache your static pages and site.com/page1 should work. To clarify, your server is still serving the pages, not CloudFlare. With the page rules, you are simply instructing CF to cache what your server sends for site.com/page1, as opposed to fetching the page from your server for every visitor.
You can then add other Page Rules with higher priorities should you want to exclude certain endpoints from caching (e.g. an admin section). You won't need to do this if you're just hosting static HTML.
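For example, a higher-priority rule excluding a hypothetical admin area could look like (setting names may differ slightly in the current dashboard):
Pattern: *site.com/admin/*
Custom Caching > Bypass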
If this doesn't work, or if you need more control over what's being cached, check this CloudFlare support doc for more options.
Good luck!

Redirect in htaccess to limit sending of cookies

I would like to write a redirect to avoid cookies being sent with requests for graphics and CSS files. I think what I want is to redirect HTML and PHP to www, and everything else to the root domain, possibly keeping JS on www so scripts can process cookies. This is for Joomla installations that are not cookie-aware, and I don't want to have to change the template files etc. A related question: can I just redirect the no-cookie files to the root domain if the HTML is sent to www, or do I need to create a subdomain (which would complicate the no-change policy for the templates)?
Thanks.
For reference, here's another SO question along the same lines: .htaccess, YSlow, and “Use cookie-free domains”.
As the accepted answer in that question mentions, creating a redirect from a cookie domain to non-cookie domain would be counterproductive and result in extra round-trips.
I'm not familiar with Joomla, but if, as you mentioned, the goal is not to mess with the Joomla templates too much, you could do one of the following (a sketch follows the two options):
Register a new domain which is an alias (CNAME) of your original domain. For example, if you already have www.example.com, register examplestatic.com and point it at www.example.com. Then adjust your templates to include static files from examplestatic.com. Those requests should be cookie-free.
Use Amazon CloudFront as a CDN. You would use their Custom Origin feature to pull files from your server as the origin. Then adjust your templates to refer to the CloudFront domain instead of yours.
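With either option, a small helper keeps the template changes in one place. A sketch in PHP, assuming a hypothetical examplestatic.com alias (point it at your CloudFront domain instead for the second option):
<?php
// Hypothetical cookie-free base URL: examplestatic.com or a CloudFront domain.
define('STATIC_BASE', 'http://examplestatic.com');

// Route every asset URL through one function so templates only change once.
function static_url($path)
{
    return STATIC_BASE . '/' . ltrim($path, '/');
}

// Template usage:
echo '<link rel="stylesheet" href="' . static_url('templates/css/site.css') . '">';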
Going down this path may or may not provide much benefit for your situation. You didn't mention it, but I would make sure to start with the higher-impact performance rules, like minimizing HTTP requests by combining static files, enabling gzip compression, optimizing images, and so on.
