I want to follow Google's guidelines regarding cache headers for images, scripts and styles.
After reading Symfony's documentation about HTTP cache, I decided to install FOSHttpCacheBundle. I then set up rules for paths like ^/Resources/ or ^/css/, but I fail to see the proper headers for my images in Chrome's console.
Alternatively, I have read that, since my web server is the one serving the resource, it is not Symfony that deals with this matter (yet I read in the docs that the Symfony Proxy was good for shared-hosting servers, which is what I have).
So should I just add lines to my .htaccess as explained here, or am I simply misusing FOSHttpCacheBundle? (Or both.)
Static files (including JavaScript files, CSS stylesheets, images, fonts...) are served directly by the web server. As the PHP module is not even loaded for such files, you must configure the server itself to set the proper HTTP headers. You can do this with a .htaccess file if you use Apache, but doing it directly in httpd.conf/apache2.conf/the vhost configuration (depending on your setup) is better from a performance point of view.
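For illustration, a minimal .htaccess sketch using mod_expires (the lifetimes are illustrative and assume the module is enabled):

<IfModule mod_expires.c>
    ExpiresActive On
    # Illustrative lifetimes; tune them to your needs
    ExpiresByType image/png "access plus 1 month"
    ExpiresByType image/jpeg "access plus 1 month"
    ExpiresByType text/css "access plus 1 week"
    ExpiresByType application/javascript "access plus 1 week"
</IfModule>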
If you also want to set HTTP cache headers for dynamic content (HTML generated by Symfony...), then you must use FOSHttpCache or any other method provided by Symfony, such as the @Cache annotation.
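For example, a minimal sketch with the @Cache annotation from SensioFrameworkExtraBundle (the values are illustrative):

use Sensio\Bundle\FrameworkExtraBundle\Configuration\Cache;

/**
 * The response will carry Cache-Control: public, max-age=3600.
 *
 * @Cache(maxage=3600, public=true)
 */
public function indexAction()
{
    // ...
}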
Related
I am using two application load balancers that route requests to 4 backend Varnish servers. I have received answers about configuring a PHP file to purge the cache, but I have no idea where to put it or how to execute it.
For which type of PHP application are you trying to configure cache purging?
A custom application?
WordPress?
Drupal?
Magento?
Some other CMS or framework?
If you're using an existing platform, CMS, or framework, the documentation will probably state how to configure purging.
Varnish Configuration
Of course, the Varnish VCL code should also be tuned to process purges.
You can find more information about purging (and banning) in Varnish at http://varnish-cache.org/docs/6.0/users-guide/purging.html
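To give the idea, a minimal sketch of the VCL side of purging (closely following the example in that guide; the ACL addresses are illustrative):

acl purge {
    "localhost";
    "192.168.0.0"/24;
}

sub vcl_recv {
    if (req.method == "PURGE") {
        if (!client.ip ~ purge) {
            return (synth(405, "Not allowed."));
        }
        # removes the single object that this request hashes to
        return (purge);
    }
}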
Here are the questions you should ask yourself regarding purging. The documentation of your CMS or framework may answer these as well:
Are you trying to purge individual URLs?
Does your code use pattern matching to invalidate multiple URLs at once (this uses bans in VCL)?
If pattern matching is used, are you sending the invalidation pattern via an HTTP request header?
Does your invalidation code use the URL to identify objects in cache, or does it rely on tagged content?
Are you restricting access to the purging mechanism based on IP address or subnet? If so, please configure an ACL in VCL.
Many WordPress plugins rely on individual URL purging. Other WordPress plugins use bans through request header patterns (see the sketch below).
Drupal uses bans, but has a system in place that tags content. The ban patterns don't match URLs, but tags.
Magento uses bans.
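For illustration, a hedged sketch of the ban-through-request-header pattern (the X-Ban-Pattern header name is hypothetical, and the purge ACL from the sketch above is reused):

sub vcl_recv {
    if (req.method == "BAN") {
        if (!client.ip ~ purge) {
            return (synth(405, "Not allowed."));
        }
        # invalidate every cached object whose URL matches the
        # regular expression sent in the X-Ban-Pattern header
        ban("req.url ~ " + req.http.X-Ban-Pattern);
        return (synth(200, "Ban added."));
    }
}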
Conclusion
If you use a CMS or framework, the purging strategy is set in advance. It's just a matter of configuring your app and making sure the VCL can handle it.
If you have custom code, you have a choice, and you can implement purging or banning.
Please have a look at the user guide section about purging I mentioned above. It should help you understand the underlying mechanism.
I'm currently using Babel to transform ES6 code to ES5 and Browserify to bundle it for use in the browser. Now I've begun using an HTTP/2 server (Nginx).
HTTP/2 is more effective when it can load multiple small files instead of one big bundle.
How to best serve multiple js files instead of one big bundle?
I know that SystemJS can load multiple files in development without bundling, and for production you can use a DepCache to define the dependency trees of the modules you are importing:
https://github.com/systemjs/systemjs/blob/master/docs/production-workflows.md
This approach would require you to ditch Browserify and switch to SystemJS, since Browserify only works with bundles.
I see that you haven't received an answer to your question so far, so I'll try to help even though HTTP/2 is new to me too (which explains the length of my answer :-)).
Good information about HTTP/2 can be found on the page https://blog.cloudflare.com/http-2-for-web-developers/. To summarize briefly:
stop concatenating files
stop inlining assets
stop sharding domains
continue minifying CSS/JavaScript files
continue loading from CDNs
continue DNS prefetching via <link rel='dns-prefetch' href='...' /> included in <head>
...
I want to add two points about the importance of setting the Cache-Control and Link HTTP headers:
think about setting Cache-Control HTTP headers (especially max-age, expires and etag) on all content of your page. See details below. I strongly recommend reading the Caching Tutorial.
set the Link HTTP header to make use of HTTP/2 SERVER PUSH.
Setting the Link: HTTP header is important for using the server push feature of HTTP/2 (see here, here). RFC 5988 and Section 19.6.1.2 of RFC 2068 describe the feature, which existed in HTTP 1.1 already. Everybody knows Content-Type: application/json, but in the same way one can set the less known Link: <...>; rel=prefetch, described here. For example, one can use
Link: </app/script.js>; rel=preload; as=script
Link: </fonts/font.woff>; rel=preload; as=font
Link: </app/style.css>; rel=preload; as=style
Such Link headers, set on the response for an HTML page (like index.html), inform the HTTP server to push the resources together with the response for the HTML page. As a result you save unneeded round-trips: the later requests (issued after parsing the HTML file) are avoided and the resources are displayed immediately. You can consider setting Link headers on all images of your page to improve its visibility. See here for additional information with nice pictures, which demonstrate the advantage of HTTP/2 server push. If you use PHP then this code could be interesting for you.
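On the Nginx side mentioned in the question, a minimal sketch of server push (assuming nginx 1.13.9 or newer with HTTP/2 enabled; the paths are illustrative):

server {
    listen 443 ssl http2;
    # push resources that the application announces
    # via Link: ...; rel=preload response headers
    http2_push_preload on;
    location = /index.html {
        # or push fixed assets explicitly
        http2_push /app/script.js;
        http2_push /app/style.css;
    }
}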
Most web developers perform some optimization steps, directly or indirectly, either during the build process or by setting HTTP headers on the responses. One has to review these processes, switch some off and include others. I'll try to summarize my results.
you can consider using webpack instead of Browserify to exclude some dependencies from merging. I don't know Browserify well enough, but I know that webpack supports externals (see here), which allow you to load some modules from a CDN (see the sketch below). As a next step you can remove merging altogether, and instead minify and set cache-control headers on all your modules.
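A minimal sketch of webpack's externals (assuming jQuery as the externalized module; names are illustrative):

// webpack.config.js
module.exports = {
  entry: './src/main.js',
  output: { filename: 'bundle.js' },
  externals: {
    // "require('jquery')" now resolves to the global jQuery object
    // provided by a <script> tag that loads jQuery from a CDN
    jquery: 'jQuery'
  }
};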
It's strongly recommended to load the CSS/JS/fonts which you use but didn't develop yourself from a CDN. You should never merge such resources with your own JavaScript files (which is probably what you do with Browserify now). Loading Bootstrap CSS from your own server is not a good idea. One should rather follow the advice from here and use a CDN instead of downloading all files locally.
The main reason for using a CDN is very easy to understand if you examine the HTTP headers of the response from https://cdnjs.cloudflare.com/ajax/libs/jquery/2.2.1/jquery.min.js for example. You will find something like cache-control: public, max-age=30672000 and expires: Mon, 06 Mar 2017 21:25:04 GMT. Chrome will typically show Status Code: 200 (from cache) and you will see no traffic over the wire. If you explicitly reload the page (by pressing F5) then you will see a response with 222 bytes and Status Code: 304. In other words, the file is typically not loaded at all. jQuery 2.2.1 stays forever the same; the next version will have another URL. The usage of HTTPS makes sure that the user really loads jQuery 2.2.1. If that's not enough then you can use https://www.srihash.org/ to calculate the sha384 value and use the extended form of <link> or <script>:
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/2.2.1/jquery.min.js"
integrity="sha384-8C+3bW/ArbXinsJduAjm9O7WNnuOcO+Bok/VScRYikawtvz4ZPrpXtGfKIewM9dK"
crossorigin="anonymous"></script>
If the user opens your page with that link then the sha384 hash will be recalculated and verified (by Chrome and Firefox). If the file is not yet in the local cache then it will be loaded really quickly too. One short remark: loading the same file from https://code.jquery.com/jquery-2.2.1.min.js uses HTTP 1.1 today, while loading it from https://cdnjs.cloudflare.com/ajax/libs/jquery/2.2.1/jquery.min.js uses the HTTP/2 protocol. I recommend testing the protocol when choosing a CDN. You can find here the list of CDNs which currently support HTTP/2. In the same way, loading Bootstrap from https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/css/bootstrap.min.css uses HTTP 1.1 today, but one would use HTTP/2 by loading the same data from https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.3.6/css/bootstrap.min.css.
I've spent a lot of time on CDNs to make clear that the main advantage of a CDN is the setting of caching headers on the HTTP response and the usage of immutable URLs. You can do the same with your own modules.
One should think about the caching time of every piece of content returned from the server. You can use URLs for your modules which contain the version number of the component (like /script/mycomponent1.1.12341) and change the last part of the version number every time you change the module. You can set a long enough max-age value in cache-control and your components will be cached by the client's web browser.
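A minimal Nginx sketch of such long-lived caching for versioned module URLs (the location pattern and lifetime are illustrative):

location ~ ^/script/ {
    # safe to cache for a year because every new module
    # version is published under a new URL
    add_header Cache-Control "public, max-age=31536000, immutable";
}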
Finally, I'd recommend that you verify that you have installed the latest versions of OpenSSL and nginx. I also recommend verifying your web site with http://www.webpagetest.org/ and https://www.ssllabs.com/ssltest/ to be sure that you haven't forgotten any simple steps.
Using Puppet Labs' Apache module, how would I enable caching for a particular URL? (I'm running a web application that puts all static content under a particular URL.)
I understand the resulting configuration line should be CacheEnable type /foo; however, neither apache::mod::cache nor apache::mod::disk_cache accepts any parameters, and from puppetlabs-apache/templates/mod/disk_cache.conf.erb it appears that including apache::mod::disk_cache will enable caching for / (which I don't want).
So how would I enable caching for just a particular URL via Puppet? Should I just use a concat::fragment to add the CacheEnable directive to the vhost, or is there another way to do it?
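For illustration, a hedged sketch of one way to inject such a directive with puppetlabs-apache: its apache::vhost defined type accepts a custom_fragment parameter, which avoids hand-rolling a concat::fragment (names and paths are illustrative; loading the disk cache module without the unwanted global CacheEnable is left aside):

apache::vhost { 'example.com':
  port            => 80,
  docroot         => '/var/www/example',
  # injected verbatim into the generated vhost configuration
  custom_fragment => 'CacheEnable disk /static',
}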
Is there any way for Varnish to read a list of backend URLs from a text file, and then proxy cache misses to a random URL taken from that file?
What I imagine is something like this pseudocode...
/var/services/backend-urls.conf
http://backend-host-1/path/to/application
http://backend-host-2/path/to/application
http://backend-host-3/path/to/application
# etc
varnish config
sub vcl_miss {
// read a list of urls from a text file
backendHosts = readFile("/var/services/backend-urls.conf");
//choose a random url from the file
randomHost = chooseLineAtRandom(backendHosts);
//proxy the request to the random host
set req.backend = randomHost;
}
To provide some background, I work on a server system that comprises a number of backend applications that currently sit behind a front-end running Apache. We are evaluating replacing the Apache layer with Varnish so we can benefit from its caching capabilities. We also have a service discovery framework that knows the endpoint locations for each backend application (the endpoint URLs change periodically as new hosts emerge or are taken out of service).
Currently we use the RewriteMap functionality in mod_rewrite to route requests to the backend services. Then we have a process to maintain the lists of backend services based upon the contents of the service discovery framework.
All this works well for us in Apache, except that Apache is like using a sledgehammer to crack a nut: all we really want is the reverse proxy logic, and the caching in Varnish would be helpful too.
Is there any way to have Varnish read the list of backend URLs from an external resource?
Without resorting to custom VMODs/C modules, the quick answer is no.
The VCL instructions are compiled within Varnish, and that rules out run-time inclusion.
But why not include within your main VCL a separate backends VCL file which defines the current backends? That VCL file could be written out on demand. Then, using the varnishadm CLI, you could request a new compile of the VCL, thereby bringing the config live.
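A minimal sketch of that approach (paths and backend names are illustrative):

# /etc/varnish/backends.vcl -- regenerated from your service discovery data
backend app1 { .host = "backend-host-1"; .port = "80"; }
backend app2 { .host = "backend-host-2"; .port = "80"; }

# in your main VCL:
include "/etc/varnish/backends.vcl";

# then compile and activate the new configuration:
varnishadm vcl.load cfg_20240101 /etc/varnish/main.vcl
varnishadm vcl.use cfg_20240101

Note that vcl.load needs a fresh configuration name each time, so a timestamp suffix is a common choice.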
I can see two potential solutions.
The first is to have something generate your VCL and backends, such as Chef or some custom scripting. You can then process the text file into backend definitions and the necessary VCL to invoke them. To handle the requirement for a random backend you could use a director; I've not dealt with directors myself, but it looks like they are meant to solve exactly that requirement (see the sketch below). When the backends change, you could rerun the generation script/Chef and tell Varnish to reload its configuration, either using varnishadm or service varnish reload, to avoid a full restart.
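A sketch of the director part, using the random director from the directors VMOD bundled with Varnish 4+ (backend names are illustrative):

import directors;

backend app1 { .host = "backend-host-1"; .port = "80"; }
backend app2 { .host = "backend-host-2"; .port = "80"; }

sub vcl_init {
    new cluster = directors.random();
    cluster.add_backend(app1, 1.0);
    cluster.add_backend(app2, 1.0);
}

sub vcl_recv {
    # each request (and thus each cache miss) is fetched from a random backend
    set req.backend_hint = cluster.backend();
}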
The second would be to implement it in C, either via a VMOD as Marcel Dumont suggests or possibly using inline C in your VCL.
With vmod_dynamic you can just use any DNS name as a backend, or even DNS service (SRV) records.
For your use case, one option would be to set up an SRV record in DNS pointing to all your servers and then just use that, as for example in the basic-stub.vtc test case.
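A minimal sketch of the DNS-name variant, based on vmod_dynamic's documented director interface (the hostname is illustrative):

import dynamic;

sub vcl_init {
    new d = dynamic.director(port = "80");
}

sub vcl_recv {
    # the name is resolved via DNS at run time, so updating the DNS
    # record updates the set of backends without recompiling the VCL
    set req.backend_hint = d.backend("backends.example.com");
}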
I'm using S3 and CloudFront to store the images, CSS and JS files of my web site, which is not static and is hosted on a proper web server.
Since the CSS file changes frequently, I'm using a version number to make sure the user's browser reloads it when it changes. When I was hosting the CSS file on my Apache web server, I was using the following redirect rule:
RewriteEngine On
# CSS Redirection (whatever.min.5676.css is redirected to whatever.min.css)
RewriteRule ^(.*)\.min\.[0-9]+\.css$ $1.min.css
With this simple rule, http://www.example.com/all.min.15.css redirected to http://www.example.com/all.min.css
How can I reproduce such a rule with Amazon S3 and/or CloudFront?
i.e. to have http://example.amazonaws.com/mybucket/css/all.min.3.css or http://example.amazonaws.com/mybucket/css/all.min.42.css redirected to http://example.amazonaws.com/mybucket/css/all.min.css
(Note: my S3 bucket is NOT configured as a website; should it be, to enable redirection rules?)
NOTE: this answer does not use any rewrite rule, so it might not be the proper answer.
I would be using a query parameter to handle different versions, like:
http://example.amazonaws.com/mybucket/css/all.min.css?ver42
http://example.amazonaws.com/mybucket/css/all.min.css?42
http://example.amazonaws.com/mybucket/css/all.min.css?ver=42
http://example.amazonaws.com/mybucket/css/all.min.css?20141014
To be exact, in my dynamic web page the version parameter is stored in a variable and appended to the URLs (both CSS and JS). While developing, I only have to increase/set one variable to force the browser to load a new version. This way there is no need for rewrite rules, even on Apache.
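A minimal sketch of that idea in PHP (the variable name and the JS file name are hypothetical):

<?php $assetVersion = '42'; // bump this on every CSS/JS change ?>
<link rel="stylesheet" href="http://example.amazonaws.com/mybucket/css/all.min.css?ver=<?= $assetVersion ?>">
<script src="http://example.amazonaws.com/mybucket/js/all.min.js?ver=<?= $assetVersion ?>"></script>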
Caching also works, as the Last-Modified and ETag headers are kept intact.
Hope this helps.