Cannot exclude robots.txt from Varnish cache - varnish

I tried to exclude robots.txt from the Varnish cache by using the following lines of code in default.vcl:
if(req.url ~ "^/robots\.txt$") {
return(pass);
}
Now the Network tab in dev tools shows Age: 0 and X-Cache: MISS. But for some reason Varnish still does not seem to exclude the file from being cached. I even deleted the file from its location, but the URL https://www.example.com/robots.txt still loads.
I also purged the Varnish cache using the following commands:
curl -X PURGE www.example.com/robots.txt
and
varnishadm "ban req.http.host == www.example.com && req.url ~ ^/robots.txt"
and
varnishadm "ban req.http.host ~ www.example.com && req.url ~ ^/robots.txt"
It shows the 200 Purged message, but still no luck.
Can anyone help me out?

"I even deleted the file from its location. But still its loading the url https://www.example.com/robots.txt" -- maybe your browser is caching it.
Sending a PURGE request to Varnish only removes the object from the cache, not from the backend. If you PURGE something from Varnish and then send a GET request for it, Varnish fetches it from the backend again and serves it to you.
If you want it gone for good, you need to remove it from your backend.
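The difference between purging the cache and deleting from the backend can be illustrated with a toy model. This is only a minimal sketch in Python, not Varnish itself: the `Backend` and `Cache` classes and their methods are hypothetical names invented for the illustration.

```python
class Backend:
    """Hypothetical origin server: maps a path to a response body."""

    def __init__(self, files):
        self.files = dict(files)

    def get(self, path):
        # None stands in for a 404 from the origin.
        return self.files.get(path)


class Cache:
    """Toy caching proxy in front of a Backend (an illustration, not Varnish)."""

    def __init__(self, backend):
        self.backend = backend
        self.store = {}

    def get(self, path):
        if path in self.store:
            return self.store[path], "HIT"
        body = self.backend.get(path)
        if body is not None:
            self.store[path] = body  # cache only successful responses
        return body, "MISS"

    def purge(self, path):
        # Removes the cached copy only; the backend is untouched.
        self.store.pop(path, None)


backend = Backend({"/robots.txt": "User-agent: *\n"})
cache = Cache(backend)

first = cache.get("/robots.txt")         # fetched from the backend and stored
cache.purge("/robots.txt")
after_purge = cache.get("/robots.txt")   # refetched, the backend still has it

del backend.files["/robots.txt"]         # remove the file at the origin
cache.purge("/robots.txt")
after_delete = cache.get("/robots.txt")  # nothing left to serve
```

In this model the object only stops being served once it is gone from the backend and the cached copy has been purged as well.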

Related

How to disable TRACE in nginx conf file

Assume my web app is hosted on https://example.com. I am testing the HTTP verbs on my server, which runs nginx:
curl -v -X TRACE https://example.com
The request above gets a response. Instead, it should return 404 or another error.
I tried the configuration below in the nginx server location block:
if ($request_method !~ ^(GET|HEAD|POST)$ ){
return 444;
}
This works for me but TRACE is still allowed. How should I stop it?
You could try using limit_except instead, like this:
location / {
limit_except GET HEAD POST { deny all; }
}

Virtual host works, but any link gets an instant 404 error message

I created and set up a LAMP stack, imported the database, files, and code, and went to localhost. The main page loads (good!), but every link on the main page, no matter which one, gives me this:
Not Found
The requested URL /user/register was not found on this server.
Apache/2.4.18 (Ubuntu) Server at cbirc.com Port 80
I am assuming this is an Apache issue? Could someone fill me in? It's strange that the main page loads fine, but any link to any other part of the site gives me this error.
If the index page works, it sounds like RewriteEngine is not set to On in the Apache configuration. Try adding this to your .htaccess file or vhost (/etc/apache2/sites-available/yourdomain.com):
<ifModule mod_rewrite.c>
RewriteEngine On
</ifModule>
If you modify the vhost, don't forget to restart Apache:
sudo apachectl configtest #Test before restart
sudo apachectl restart
If the rewrite module is not enabled:
sudo a2enmod rewrite
You can also check the base URL in your HTML, if you have one.

Varnish 3.0.2 behind an SSL terminator

Does anyone know whether Varnish 3.0.2 supports redirecting HTTP to HTTPS?
I have a Varnish cache server behind an SSL terminator (an AWS external load balancer on which I set up an HTTP listener and an HTTPS listener).
When Varnish receives an HTTP request, I would like it to answer with a redirect to HTTPS and send that response straight back through the load balancer. The client would then repeat the request over HTTPS, and the load balancer would forward it to Varnish, which would forward it to its own backend.
But it seems that Varnish does not send the redirect back to the load balancer and instead forwards the HTTPS request to its backend.
The backend behind Varnish is not an HTTPS backend, and I get a timeout when I issue an HTTP request.
When the client enters https in the browser, it works. The problem is only with HTTP requests.
Here is my configuration:
In vcl_recv:
if (client.ip != "127.0.0.1" && req.http.host ~ "^(?i)mydomain.com" && req.http.X-Forwarded-Proto !~ "(?i)https") {
set req.http.x-redir = "https://" + req.http.host + req.url;
#return(synth(850, "Moved permanently"));
error 850 "Moved permanently";
}
In vcl_error:
if (obj.status == 850) {
set obj.http.Location = req.http.x-redir;
set obj.status = 302;
return (deliver);
}
Can someone help, please? I can't upgrade my Varnish version manually at the moment.
Thanks
I solved the problem.
The security group of the ELB only allowed connections on port 443. I added port 80 for the HTTP listener, and now it works.

Wordpress MU Cloudfront. Trailing slash redirect goes to wrong domain

I'm having some trouble with WordPress redirects that cause the domain to change.
Example:
Site - noncdn.somedomain.com
CDN URL - www.domain.com
When I open links w/o a trailing slash there is a 301 redirect:
Going here: www.domain.com/page
Takes you here: noncdn.somedomain.com/page/
Since Cloudfront is hitting the server using Origin Domain, the server doesn't even know that requests are coming in from a different domain.
How do I force this 301 to use FQDN w/ correct CDN domain instead of doing a relative redirect?
I've already added this so that links on the site and images all load from Cloudfront domain, but it seems to have no effect on the redirect behavior:
add_filter('home_url','home_url_cdn',10,2);
function home_url_cdn( $path = '', $scheme = null ) {
return get_home_url_cdn( null, $path, $scheme );
}
function get_home_url_cdn( $blog_id = null, $path = '', $scheme = null ) {
$cdn_url = get_option('home');
if(get_option('bapi_site_cdn_domain')){
$cdn_url = get_option('bapi_site_cdn_domain');
}
$home_url = str_replace(get_option('home'),$cdn_url,$path);
//echo $home_url;
return $home_url;
}
Any Help is much appreciated!
Thanks!
I was tracking down a very similar issue for a while with a CloudFront distribution of a standard static website running on nginx. The symptoms were the same: links with a trailing slash (e.g. www.acme.com/products/) worked correctly, but omitting the trailing slash caused the user to be redirected to the origin.
The issue is that the web server itself does not try to resolve the URI as requested and instead responds with a redirect to a URL it can serve. You can test this by using curl against your site:
$ curl -I http://myhost.com/noslashurl
HTTP/1.1 301 Moved Permanently [...]
CloudFront is returning exactly what your server returns, in this case a 301 redirect to your origin URL. Instead of following the redirect and caching that, CloudFront caches the redirect itself. The only way to correct this is to ensure that your origin properly handles the requests and does not respond with a 301.
In my particular case, that meant updating the try_files directive for my location in the nginx configuration. As I mentioned, this is a static site, so my try_files became:
location / {
[...]
try_files $uri $uri/index.shtml /index.shtml =404;
}
Make sure that try_files has an endgame to avoid a redirection cycle, which causes your server to return 500 Server Errors when a non-existent URL is requested. In this case, /index.shtml is the last-ditch attempt, and failing that, it returns a 404.
I know this doesn't precisely answer your question, but yours was one of a very few I found when searching for "cloudfront without trailing slash redirects to origin", and you've not had an answer for a year, so I figured it was worth sending a response.
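The curl check described above can also be expressed as a small helper that decides whether a redirect would send the visitor to a different host. This is a minimal sketch in Python using only the standard library; the function name `redirect_leaves_host` is hypothetical, and the URLs reuse the placeholder domains from the question.

```python
from urllib.parse import urljoin, urlsplit


def redirect_leaves_host(requested_url, location):
    """Return True if following `location` would change the hostname.

    `location` is the value of the Location header from a 301/302 response.
    It may be relative (for example "/page/") or absolute.
    """
    # Resolve a relative Location against the URL that was requested.
    target = urljoin(requested_url, location)
    return urlsplit(target).hostname != urlsplit(requested_url).hostname


# A relative redirect keeps the visitor on the CDN domain.
stays = redirect_leaves_host("http://www.domain.com/page", "/page/")

# An absolute redirect to the origin leaks the origin hostname.
leaks = redirect_leaves_host(
    "http://www.domain.com/page", "http://noncdn.somedomain.com/page/"
)
```

If the helper reports a host change for the Location header your origin returns, that redirect is the one CloudFront will pass along to visitors.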
I had the same problem.
I fixed the issue by changing some WordPress parameters.
In Elastic Beanstalk I set the parameter CUSTOM_URL to my custom domain, and in the file /var/www/html/wp-includes/load.php
I set HTTP_HOST and SERVER_NAME to the same value as CUSTOM_URL, which stopped the redirect to the Elastic Beanstalk URL.
$_SERVER['HTTP_HOST'] = $_SERVER['CUSTOM_URL'];
$_SERVER['SERVER_NAME'] = $_SERVER['CUSTOM_URL'];

Varnish VCL & Req.Url matching for redirects

I currently have Varnish set up for general caching etc., but it also acts as a redirector for the mobile version of our website.
It works great (as Varnish does!) and redirects as intended. I decided to add functionality to the VCL config to not just redirect mobiles to the mobile version of the site, but to also redirect desktops accessing a link to the mobile site (for example, on Google) to the desktop version of the site.
However, I can't seem to get this to work in the most puzzling of ways. Here is the VCL:
# Ignoring certain shared assets
if (req.url !~ ".(jpg|png|gif|gz|tgz|bz2|tbz|mp3|ogg|css)$") {
# Let's detect if we're a Mobile
if (req.http.User-Agent ~ "iP(hone|od)" || req.http.User-Agent ~ "Android" || req.http.User-Agent ~ "Symbian" || req.http.User-Agent ~ "^BlackBerry" || req.http.User-Agent ~ "^SonyEricsson" || req.http.User-Agent ~ "^Nokia" || req.http.User-Agent ~ "^SAMSUNG" || req.http.User-Agent ~ "^LG" || req.http.User-Agent ~ "webOS" || req.http.User-Agent ~ "^PalmSource") {
# If we're a mobile, set the X-Device header.
set req.http.X-Device = "mobile";
# If we've not set a preference to the fullsite to override the redirect, and we're not accessing the mobile site, redirect. This all works fine.
if ((req.http.Cookie !~ "fullsite")&&(req.url !~ "mobile")){
error 750 "Moved Temporarily";
}
}
else{
# We're not mobile. I can see this header is set in the logs.
set req.http.X-Device = "desktop";
# If we're a desktop AND accessing the mobile site....
if (req.url ~ "mobile"){
# -------------------- THIS NEVER HAPPENS
error 750 "Moved Temporarily";
}
}
}
Have I made a glaring error in the logic here? There aren't any cookies or any other things that might interfere with the redirect that I can see. If anyone has any insight on this, I'd be eternally grateful :)
Best regards
B
Thought I'd revisit in case anyone is reading this with the same problem. We did solve this in the end, and it was due to an incredibly basic oversight.
There was another VCL condition earlier that was interfering with the redirect: a partial match with the ~ operator was redirecting ALL matched mobile URLs to the desktop version, because it came earlier in the VCL.
It's an obvious point, but my advice to anyone with this problem is to check these partial matches, and to remember that an unanchored pattern can match any part of the URL.
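The partial-match pitfall is easy to reproduce outside Varnish. The sketch below uses Python's `re` module as a stand-in for the regex matching behind VCL's `~` operator (the two engines are similar but not identical). The `/mobile` path prefix and the sample URLs are assumptions made for illustration; the original post does not say how the mobile URLs are laid out.

```python
import re


def partial_match(url):
    """Unanchored, like `req.url ~ "mobile"`: matches anywhere in the URL."""
    return re.search(r"mobile", url) is not None


def anchored_match(url):
    """Anchored to a hypothetical /mobile path prefix."""
    return re.search(r"^/mobile(/|$)", url) is not None


# Hypothetical URLs: only the first two belong to the mobile site.
urls = ["/mobile", "/mobile/home", "/news/automobile-review", "/shop?tag=mobile"]

partial_hits = [u for u in urls if partial_match(u)]
anchored_hits = [u for u in urls if anchored_match(u)]
```

Here `partial_hits` contains all four URLs, while `anchored_hits` contains only the two that start with the `/mobile` prefix, which is why an early unanchored condition can swallow requests meant for a later rule.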
