image not cached when loaded with meta refresh - .htaccess

Not really an issue, but how come images cached in .htaccess with:
<FilesMatch "\.(jpg|jpeg|gif|bmp)$">
Header set Cache-Control "max-age=6048000, public"
</FilesMatch>
are reloaded every time the URL of the page is called by a meta refresh tag:
<meta http-equiv="refresh" content="2;URL=page_with_large_jpg.php">
This meta tag is in another PHP script. PHP scripts are cached for only 1 second:
<FilesMatch "\.(html|htm|php)$">
Header set Cache-Control "max-age=1, no-cache, private, must-revalidate"
</FilesMatch>
In most situations the cache works well. If I enter the URL of a page containing a large JPG manually, or reach that page by clicking a link to it, the JPG is clearly served from cache (provided I visited the page previously) and is displayed instantly. But if the page containing the large JPG is called by a meta refresh tag in the head section, the image is loaded again and takes a few seconds or more to display fully if it is very large.
Is there a way to prevent this?

Related

Content disposition link conflict

I use MODx Evolution and I included into my htaccess file the following:
<IfModule mod_headers.c>
<FilesMatch "\.jpg$">
Header append Content-Disposition "attachment;"
</FilesMatch>
<FilesMatch "\.jpeg$">
Header append Content-Disposition "attachment;"
</FilesMatch>
<FilesMatch "\.png$">
Header append Content-Disposition "attachment;"
</FilesMatch>
</IfModule>
I have a download button for each image that can be downloaded, like this:
<div class="box download-box">
<a class="button" href="[*template-variable-image*]">Download</a>
</div>
The above code works perfectly.
Now I've added another button for users to see the image in full scale in a separate browser tab with this code:
<h2 class="thumb-caption"><span data-href="[*template-variable-image*]" target="_blank">PREVIEW</span></h2>
Now when the user clicks "PREVIEW", the Content-Disposition attachment dialog appears and offers a download. How can I get "PREVIEW" to show the image the way I planned and NOT the download dialog?
This is more an HTML question than a MODX one. Many modern browsers support the download attribute on the a tag.
So drop the .htaccess additions and use:
<a class="button" href="[*template-variable-image*]" download>Download</a>
You could also use JavaScript for this to cover all browsers. John Culviner has written a jQuery plugin, jquery-file-download, for this.
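If the .htaccess rule has to stay, for example for browsers without download attribute support, one possible variation is to send the attachment header only when the URL carries a marker in its query string. The sketch below assumes Apache 2.4.10 or later (for the expr= condition of the Header directive) and uses a hypothetical download=1 parameter that is not part of the original setup.

```apache
# Sketch only: force a download just for image URLs that end in ?download=1.
# "download=1" is a made-up marker; any query parameter name would do.
<IfModule mod_headers.c>
    <FilesMatch "\.(jpe?g|png)$">
        # The header is added only when the query string contains download=1.
        Header set Content-Disposition "attachment" "expr=%{QUERY_STRING} =~ /(^|&)download=1(&|$)/"
    </FilesMatch>
</IfModule>
```

With this in place, the Download button would link to [*template-variable-image*]?download=1, while the PREVIEW link would use the plain image URL and open it in the browser.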

How to compress html pages using SetOutputFilter DEFLATE

I am not able to get compressed html pages in my browser even though I am 100% sure mod_deflate is activated on my server.
My htaccess file has this code snippet :
<IfModule mod_deflate.c>
<Files *.html>
SetOutputFilter DEFLATE
</Files>
</IfModule>
A non compressed excerpt of my content is:
<div>
	<div>
		Content
	</div>
</div>
With the htaccess code I am using, I would expect to get the output below in my browser (no space and no tabs at the beginning of each line):
<div>
<div>
Content
</div>
</div>
Is there something wrong with the code I am using in the htaccess file?
Is keeping all tabs in front of each html line after compression the normal behavior of mod_deflate?
If so, would you recommend that I switch tabs with spaces in my html code to get the desired effect?
Thanks for your insights on this
For the DEFLATE output filter to compress the content, two conditions must be met:
Your content should be at least 120 bytes; compressing smaller responses can increase the output size.
The HTTP client making the request must support gzip/deflate encoding.
Note that mod_deflate only compresses the response for transfer; it does not remove tabs or spaces from your HTML. Most modern web browsers support gzip encoding and automatically decompress the gzipped content for you, so what you see with the browser's View Page Source option is the decompressed content, whitespace included. To verify that your browser received compressed content, press F12, select the Network tab, and pick your requested page. If the response headers include Content-Encoding: gzip, the compression worked.
In Firefox, you can remove support for gzip,deflate by going to about:config and emptying the value for network.http.accept-encoding. Now with no support for gzip, Firefox will receive uncompressed content from your Apache server.
Alternatively, if you want to see the compressed content itself, use a client that does not automatically decompress it.
curl works for this, since it only decompresses the response when given the --compressed option:
curl -H "Accept-Encoding: gzip,deflate" http://example.com/page.html > page.gz
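As a point of comparison with the Files-based snippet in the question, compression is often enabled by MIME type rather than by file name. The following is a minimal sketch, assuming Apache 2.4 with both mod_deflate and mod_filter loaded; the list of types is only an example.

```apache
# Sketch: compress responses by content type instead of by file extension.
# On Apache 2.4, AddOutputFilterByType is provided by mod_filter.
<IfModule mod_deflate.c>
    <IfModule mod_filter.c>
        AddOutputFilterByType DEFLATE text/html text/plain text/css application/javascript
    </IfModule>
</IfModule>
```

Either variant only changes how the bytes travel over the wire. The page source shown in the browser stays identical, tabs included.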

Clearing site cache after a site content update

I've just rolled out a major update to my work site.
Some of the resources are still being called from cached versions, such as the style sheets and javascript files. Some visitors to the site will still see the old resources.
Is there an .htaccess directive to force all assets to be re-cached from a set date?
Instead of changing my .htaccess rules every time I do an update, I append a random number or the date/time of the change to the names of the CSS and JS files I changed.
That way the browser sees it as a new file, loads it, and caches the new version.
For example, add a version query string to the URL in your link tag:
style.css?version=3.2 or use a hash style.css?version=V6CbUlTe7M94ol8
This can be done with your JS files too. It is much easier and better than changing your Apache config. Any time you make an update, just change the version and the file will be re-cached for everyone.
FYI, Stack Overflow uses the same technique. Look at the source code for this page:
<link rel="stylesheet" type="text/css" href="//cdn.sstatic.net/stackoverflow/all.css?v=dc5a5d7ef830">
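Versioned URLs pair naturally with a long cache lifetime, because a changed version string makes the browser treat the file as a new resource. A minimal sketch of the matching .htaccess rule, assuming mod_headers is available and that every CSS and JS file is always referenced with a version query string:

```apache
# Sketch: long-lived caching for assets that are always linked with a
# version query string (e.g. style.css?version=3.2).
<IfModule mod_headers.c>
    <FilesMatch "\.(css|js)$">
        # One year. Only safe if every update also changes the version in the URL.
        Header set Cache-Control "max-age=31536000, public"
    </FilesMatch>
</IfModule>
```

The one-year value is an illustrative choice, not something taken from the answers here.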
I found a solution:
<FilesMatch "\.(html|htm|js|css|php)$">
FileETag None
Header unset ETag
Header set Cache-Control "max-age=0, no-cache, no-store, must-revalidate"
Header set Pragma "no-cache"
Header set Expires "Wed, 11 Jan 1984 05:00:00 GMT"
</FilesMatch>

How to crawl only HTML in Nutch?

Is it possible to crawl/fetch only plain HTML pages via Nutch (i.e. no pictures, video, flash, excel, exe, pdf or word files)?
How to check Content-Type of the page and fetch only text/html pages via Nutch?
Edit conf/regex-urlfilter.txt and add a rule listing the file suffixes to ignore (a leading - excludes matching URLs):
-\.(jpg|gif|zip|ico)$

Stop IE8 from opening or downloading a text/plain MIME type

I'm dynamically generating a text file in PHP, so it has a .php extension but a text/plain MIME type. All browsers display the file as nicely preformatted text, except IE8.
Googling tells me that they've added security so that if the HTTP Content-Type header doesn't match the expected content type (I think based on the extension and some sniffing), the file is forced to be downloaded. In my case I have to open it, and also give it permission to open the file I just told it to open! That's probably a Win7 annoyance though. Serving a static plain text file works fine, of course.
So can I stop IE8 from downloading the file and get it to view it normally? The code has to run on multiple shared hosting environments, so I think I'm stuck with the .php extension.
Add this header to your HTTP response:
X-Content-Type-Options: nosniff
It is an IE8 feature for opting out of its MIME sniffing.
Alternatively, you can "trick" IE8 into thinking that it is indeed receiving a text file. These 2 lines do it for me and don't involve using non-standardized "X-" headers:
header("Content-Type: text/plain");
header("Content-Disposition: inline; filename=\"whatever.txt\"");
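If the hosts allow .htaccess overrides and have mod_headers enabled, the nosniff header from the first answer could also be sent from .htaccess instead of from the script. This is only a sketch under those assumptions; report.php is a hypothetical file name standing in for the script that generates the text output, and behaviour may differ depending on how PHP is run on a given host.

```apache
# Sketch: send the nosniff header for one PHP script that outputs text/plain.
# "report.php" is a placeholder name, not taken from the question.
<IfModule mod_headers.c>
    <Files "report.php">
        Header set X-Content-Type-Options "nosniff"
    </Files>
</IfModule>
```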
