CDNs and domains - dns

A lot of big websites (facebook etc) are settings up CDN's for their content. Now I notice, that these CDN's are not always on the original domain.
Example: Facebook pictures are on "photos-a.ak.fbcdn.net"
Why is that? Is there a performance-gain in not having lots of subdomains on the "primary" domain (facebook.com)

There is quite a large performance gain for the user because with a different domain, the browser won't have to send the cookies anymore.
Read this article for more info about cookieless domains: http://code.google.com/speed/page-speed/docs/request.html#ServeFromCookielessDomain

Stackoverflow itself expains it very well on its static content site.
When the browser makes a request for a
static image and sends cookies
together with the request, the server
doesn't have any use for those
cookies. So they only create network
traffic for no good reason. You should
make sure static components are
requested with cookie-free requests.
Create a subdomain and host all your
static components there.
If your domain is www.example.org, you can host your static components on
static.example.org. However, if you've
already set cookies on the top-level
domain example.org as opposed to
www.example.org, then all the requests
to static.example.org will include
those cookies. In this case, you can
buy a whole new domain, host your
static components there, and keep this
domain cookie-free. Yahoo! uses
yimg.com, YouTube uses ytimg.com,
Amazon uses images-amazon.com and so
on.
Another benefit of hosting static components on a cookie-free domain is that some proxies might refuse to
cache the components that are
requested with cookies. On a related
note, if you wonder if you should use
example.org or www.example.org for
your home page, consider the cookie
impact. Omitting www leaves you no
choice but to write cookies to
*.example.org, so for performance reasons it's best to use the www
subdomain and write the cookies to
that subdomain.
Originally from Yahoo.

A CDN is typically spread across the world so that the user is closer to files than if all files were in the main data center.
It is difficult (impossible?) to route a single domain to different data centers.
Also, cookies no longer need to be sent. For Facebook, which has a ton of cookies due to being embedded all over the web, this can be a big saver.

Related

Use cookie-free domains - subdomain solution not working

I'm trying to optimize an html webpage, and one of the suggestions from yslow is:
Use cookie-free domains There are 11 components that are not
cookie-free
So I followed one of the standard solutions I've seen and created a subdomain static.mysite.com and put the images there.
But I'm still getting the exact same problem -- a cookie is still being delivered with each image, and same yslow message.
So how do I get this subdomain to be cookie free?
If you are using subdomain for cookie-free delivery then your main page has to use www prefix.
I had the same problem. The subdomain simply didn't work, so I used a different domain name and it solved the problem.
When the browser makes a request for a static image and sends cookies together with the request, the server doesn't have any use for those cookies. So they only create network traffic for no good reason. You should make sure static components are requested with cookie-free requests. Create a subdomain and host all your static components there.
If your domain is www.example.org, you can host your static components on static.example.org. However, if you've already set cookies on the top-level domain example.org as opposed to www.example.org, then all the requests to static.example.org will include those cookies. In this case, you can buy a whole new domain, host your static components there, and keep this domain cookie-free. Yahoo! uses yimg.com, YouTube uses ytimg.com, Amazon uses images-amazon.com and so on.
Another benefit of hosting static components on a cookie-free domain is that some proxies might refuse to cache the components that are requested with cookies. On a related note, if you wonder if you should use example.org or www.example.org for your home page, consider the cookie impact. Omitting www leaves you no choice but to write cookies to *.example.org, so for performance reasons it's best to use the www subdomain and write the cookies to that subdomain.
Source - http://developer.yahoo.com/performance/rules.html
EDIT
If you set your cookies on a top-level domain (e.g. yourwebsite.com) all of your sub-domains (e.g. static.yourwebsite.com) will also include the cookies that are set. Therefore, in this case, it is required that you use a separate domain name to deliver your static content if you want to use cookie-free domains. However, if you set your cookies on a www subdomain such as www.yourwebsite.com, you can create another subdomain (e.g. static.yourwebsite.com) to host all of your static files which will no longer result in any cookies being sent.
For Wordpress you can use this config:
define("WP_CONTENT_URL", "http://static.yourwebsite.com");
define("COOKIE_DOMAIN", "www.yourwebsite.com");
Details - https://www.keycdn.com/support/how-to-use-cookie-free-domains/
EDIT 2
You will need to move your static content over to the wp-content folder of your newly created subdomain!

Is there any reason not to serve https content on a page served over http?

I currently have image content being served on a domain that is only accessible over https. What is the downside of serving an image with an https path on a page accessed over http? Are there any caching considerations? I'm using an HttpRuntime.Cache object to store the absolute image path, which is retrieved from a database.
I assume there is no benefit to using protocol-relative URLs if the image is only accessible over https?
Is there a compelling reason why I should set up a separate virtual directory to also serve the image content over http?
If the content served over HTTPS within the HTTP page isn't particularly sensitive and could equally be served over HTTP, there is no downside (perhaps some performance issues, not necessarily much, and lack of caching, depending on how your server is configured: you can cache some HTTPS content).
If the content server over HTTPS is sufficiently sensitive to motivate the usage of HTTPS, this is really bad practice.
Checking that HTTPS is used and used correctly is solely the responsibility of the client and its user (this is why automatic redirections from HTTP to HTTPS are only partly useful, for example). Although some of it has to do with the technicalities of certificate verification, a lot of the security offered by HTTPS comes from the fact that the user:
expects to be using HTTPS (otherwise they could easily be downgraded),
is able to verify the validity of the certificate: green/blue bar, corresponding to the host name on which they expect to be.
The first point can be addressed by HTTP Strict Transport Security, from a technical point of view.
The second needs used interaction. If you go to your bank's website, it must not only be a site with a valid certificate, but you should also check that it's indeed the domain name of your bank, for example.
Embedding HTTPS content in an HTTP page defeats this, since the user can't check which site is being used, and that HTTPS is used at all in fact. To some extent, embedding HTTPS content from a third party in an HTTPS page also presents this problem (this is one of the problems with 3-D Secure, which may well be served using HTTPS, but using an iframe doesn't make which site is actually used visible.)

why do CDNs always use a seperate host instead of subdomain?

a mirroring CDN can't have the same hostname as you application server, because you need a way for the CDN to explicitly reference the application.
Why, in general, do sites like facebook run their CDN on a totally seperate host, not just a subdomain like cdn.facebook.com? example: http://profile.ak.fbcdn.net/hprofile-ak-snc4/173706_6103645_790537_q.jpg
Is the reason, that they can construct resource URLs with many different hostnames, to avoid the 4-connections-per-host limit on some browsers?
If your domain is www.example.org, you can host your static components on static.example.org. However, if you've already set cookies on the top-level domain example.org as opposed to www.example.org, then all the requests to static.example.org will include those cookies.
From: http://developer.yahoo.com/performance/rules.html#cookie_free
Because user generated content can contain nasties that may be able to access data hosted on the primary domain.
It also stops things like cookies and authentication getting sent in the request to CDN content.
Preventing users from inserting
scripts, and at the same time allowing
user submitted html is extremely
difficult to do on the server side -
ergo we must have sandboxing.
Borrowed from a fairly old whatwg post

What are the pros and cons of a 100% HTTPS site?

First, let me admit that what I know about HTTPS is pretty rudimentary. I don't know much about session security, encryption, or how either of those things is supposed to be done.
What I do know is that web security is important; that horror stories of XSS, CSRF, and database injections pop up over and over again. I know that a preventative stance against such exploits is better than a reactive one.
But the motivation for this question comes from a different point of view. I work at a site that regularly accepts payment from users. Obviously, the payments are sent over a secure channel (HTTPS). I mainly work on the CSS, HTML, and JavaScript of the site. What I've been told is that it is necessary to duplicate CSS, JavaScript, and image files before they can be called over HTTPS. So assume I have the following files:
css/global.css
js/global.js
images/
logo.png
bg.png
The way I understand it, these files need to be duplicated before they can be "added" to the HTTPS. So a file can either be under security (HTTPS) or not.
If this is true, then this is a major hindrance. In even the smallest site, it would be a major pain to duplicate files and then have to maintain them every time you make a CSS or JS change. Obviously this could be alleviated by moving everything into the HTTPS.
So what I want to know is, what are the pros and cons of a site that is completely behind HTTPS? Does it cause noticeable overhead? Is it just foolish to place the entire site under encryption? Would users feel safer seeing the "secure" notifications in their browser during their entire visit? And last but not least, does it truly make for a more secure site? What can HTTPS not protect against?
You can serve the same content via HTTPS as you do via HTTP (just point it to the same document root).
Cons that may be major or minor, depending:
serving content over HTTPS is slower than serving it via HTTP.
certificates signed by well-known authorities can be expensive
if you don't have a certificate signed by a trusted authority (eg, you sign it yourself), visitors will get a warning
Those are pretty basic, but just a few things to note. Also, personally, I feel much better seeing that the entire site is HTTPS if it's anything related to financial stuff, obviously, but as far as general browsing, no, I don't care.
Noticeable overhead? Yes, but that matters less and less these days as clients and servers are much faster.
You don't need to make a copy of everything, but you do need to make those files accessible via HTTPS. Your HTTPS and HTTP services can use the same doc root.
Is it foolish to put the whole site under encryption? Typically no.
Would users feel safer? Probably.
Does it truly make for a more secure site? Only when dealing with the communication channel between the client and the server. Everything else is still up for grabs.
You've been misinformed. The css, js, and image files need not be duplicated assuming you've set up the http and https mapping to point to the same physical website on the server. The only important thing is that these files are referenced with https when the page you're looking at is also under https. This will prevent the dreaded security message that says that some objects on the page are not secured.
For every other page where you're running the site under http (unsecured) you can reference those same files in the same locations, but with an http address.
To answer your other question, there would indeed be a performance penalty to put the entire site under https. The server has to work hard to encrypt everything it sends over the wire. And then some not-so-old browsers won't cache https content to disk by default, which of course will result in an even heavier load on the server.
Because I like my sites to be as responsive as possible, I'm always selective about which sections of a site I choose to be SSL-encrypted. In most typical e-commerce sites, the only pages that need SSL encryption are the login, registration, and checkout pages.
The traditional reason for not having the entire site behind SSL is processing time. It does take more work for both the client and the server to use SSL. However, this overhead is fairly small compared to modern processors.
If you are running a very large site, you may need to scale slightly faster if you are encrypting everything.
You also need to buy a certificate, or use a self signed one which may not be trusted by your users.
You also need a dedicated IP address. If you are on a shared hosting system, you need to have an IP that you can dedicate to only having SSL on your site.
But if you can afford a certificate and private ip and don't mind needed a slightly faster server, using SSL on your entire site is a great idea.
With the number of attacks that SSL mitigates, I would say do it.
You do not need multiple copies of these files for them to work with HTTPs. You may need to have 2 copies of these files if the hosting setup has been configured in such that you have a separate https directory. So to answer your question - no duplicate files are not required for HTTPs but depending on the web hosting configuration - they may be.
In regards to the pros and cons of https vs http there are already a few posts addressing that.
HTTP vs HTTPS performance
HTTPS vs HTTP speed comparison
HTTPs only encrypts the data between the client computer and the server. It does not software holes or issues such as remote javascript includes. HTTPs doesn't make your application better - it only helps secure the data between the user and your app. You need to make sure your app has no security holes, practice filtering all data, SQL, and review security logs frequently.
However if you're only responsible for the frontend part of the site I wouldn't worry about it but would bring up concerns of security with the main developer for the backend.
One of the concerns is that https traffic could be blocked, for example on Apple computers if you set parental control on it blocks https traffic because it can't read the encrypted content, you can read here:
http://support.apple.com/kb/ht2900
https note: For websites that use SSL
encryption (the URL will usually begin
with https), the Internet content
filter is unable to examine the
encrypted content of the page. For
this reason, encrypted websites must
be explicitly allowed using the Always
Allow list. Encrypted websites that
are not on the Always Allow list will
be blocked by the automatic Internet
content filter.
An important "pro" for more https at your site is the following:
a user connecting thru an unencrypted WiFi, like at an airport, can give their password in https, but if the site then switches back to http after the password page, the session cookie becomes exposed and can be immediately used by an eavesdropper.
See this article http://steve.grc.com/2010/10/28/why-firesheeps-time-has-come/#comment-2666

How do I setup IIS with a cookieless domain to improve performance?

I was reading in google's documentation their new pagespeed plugin, that they recommend using cookieless domains to improve performance:
Static content, such as images, JS and CSS files, don't need to be accompanied by cookies, as there is no user interaction with these resources. You can decrease request latency by serving static resources from a domain that doesn't serve cookies.
Does anybody know how to do this in IIS?
What the Google article is suggesting is that you serve all your static content from another domain where cookies are not initially set by that serving domain.
Cookies are set in two ways - by session cookies (e.g. by ASP or ASP.NET requests) or explicitly by your application.
These will be posted back TO the server on each subsequent request for the domain that set the cookie (regardless of whether the request is for static or dynamic content) thus increasing the request payload.
What you're doing by having a second domain to serve static content (HTML, CSS, Images etc) is making cookie free requests because no initial cookie would be set in the first place for that domain.
In IIS it's your application, ISAPI Filter or ISAPI extension that will add a cookie. If your IIS server is not intercepting requests for static content (because this is usually handled by the kernel mode http.sys driver) then no cookies will be added to the response. It's only further up the request pipeline that cookies come into play.
So basically there isn't a way to explicitly configure cookie-less domains in IIS.
If you simply put all your static resources into, for instance, static.mysite.com, and if you never set any cookies in that domain, then the browser will never send a cookie when retrieving a resource from your static domain.
That's all Google is saying. There's nothing to configure, just to organize.
AFAIK google analytics sets cookie for all subdomains, so it would be useless if you are using analytics?
I've experienced this also, you'd have to use a different domain altogether to avoid analytics/adsense cookies being set. Using static.yourdomain.com won't cut it.
Here's to hoping Google will change their analytics cookies so that we won't all have to buy new domains to serve cookie-less content.
AFAIK google analytics sets cookie for all subdomains, so it would be useless if you are using analytics?
Here's an example using the Google Analytics asynchronous tracking code, of how to set the domain for tracking:
_gaq.push(['_setAccount', 'UA-XXXXXXX-x'],['_setDomainName', 'www.example.com'],['_trackPageview']);
Here's an example using the previous version of tracking code:
var pageTracker = _gat._getTracker("UA-XXXXXXX-x");
pageTracker._setDomainName("www.example.com");
pageTracker._trackPageview();
and here's what Google has to say about this: Google Analytics & Cookies

Resources