I have some basic questions regarding configuring the CDN. I am using Amazon CloudFront for that.
1) Let's suppose my website is example.com. In the origin of cloudfront, do I mention example.com as the origin or create a CNAME like cdn.example.com which points to the server and then enter cdn.example.com as the origin?
2) Once the configuration is done, do I redirect example.com to the cloudfront domain like dxxxxxx.cloudfront.net?
3) I will update all the links in my website to http://dxxxxxx.cloudfront.net/xxx. Now when I browse example.com, I will be redirected to cloudfront. But cloudfront is also using the example.com as the origin. Isn't it like cloudfront is trying to pull data from itself? Won't that create a dead loop?
I am not able to get my head around this. I will be really grateful if someone could help. Thanks!!
Here is how it works.
Your website is at example.com where all the static files are hosted that you want to serve through Cloudfront. This example.com is called the Origin Server, Origin Host, or simply the Origin.
CloudFront will create a distribution for you (some CDNs call this a Pull Zone) with a host name like http://dxxxxxx.cloudfront.net. You now use this host instead of the original example.com for your static assets. All HTML files or dynamic files will still be loaded directly through example.com. Users will still enter example.com in their browser. Only the scripts, styles, images, fonts, icons and similar static files that the browser loads behind the scenes need to be changed to use the CDN host.
Your CDN setup is complete at this point. However, if someone looks at the source code of your page, they can see the CloudFront URLs being used to deliver static assets. This may look unprofessional. To hide the third-party host name and get a branded feel with your own host name, you can create a new subdomain cdn.example.com at your DNS provider and CNAME it to dxxxxxx.cloudfront.net. In CloudFront you also need to add cdn.example.com as an alternate domain name (CNAME) on the distribution, otherwise requests for that host name are rejected.
If you created a CNAME subdomain above, you now have to update all URLs of static files again, and change their URL to use cdn.example.com . Your website will still be loaded over example.com, but assets will now be delivered through cdn.example.com that will point to dxxxxxx.cloudfront.net
When dxxxxxx.cloudfront.net receives a request from the browser for a static file, it forwards that request to the specified origin server, example.com, where the files actually live. The origin sends the file to CloudFront, and CloudFront saves it for future use and sends a copy to the browser.
The two CNAME steps (creating cdn.example.com and switching your asset URLs to it) are optional and not part of the core CDN integration process. Also, the subdomain name cdn.example.com is not a requirement. You can use some other subdomain, or some other domain. For example, the following are valid:
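The pull behaviour described above can be modelled in a few lines of Python. This is only a toy sketch, not how CloudFront is implemented: the class name, the in-memory dictionary and the sample file are all made up for illustration. It shows why there is no loop: the edge contacts the origin only on a cache miss, and the origin never calls back into the CDN.

```python
from typing import Callable, Dict


class PullThroughCache:
    """Toy model of a CDN edge: serve from cache, otherwise pull from the origin."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch_from_origin = fetch_from_origin
        self._cache: Dict[str, bytes] = {}
        self.origin_hits = 0  # how many times the origin was contacted

    def get(self, path: str) -> bytes:
        if path not in self._cache:
            # Cache miss: the edge asks the origin server directly.
            self.origin_hits += 1
            self._cache[path] = self._fetch_from_origin(path)
        return self._cache[path]


# Hypothetical content standing in for files hosted at example.com.
ORIGIN_FILES = {"/css/site.css": b"body { margin: 0; }"}


def fetch_from_example_com(path: str) -> bytes:
    return ORIGIN_FILES[path]


edge = PullThroughCache(fetch_from_example_com)
first = edge.get("/css/site.css")   # miss: pulled from the origin
second = edge.get("/css/site.css")  # hit: served from the edge cache
```

After the two calls the origin has been contacted once, and both responses are identical.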
cdn2.example.com
static-assets.example.com
static.assets.example.com
images.example.parent-company-website.com
Similarly, it is not a requirement to fetch assets from example.com only. You can specify my-other-website.net as origin, and cloudfront will happily fetch resources from there for your example.com site.
In your scenario, the following four items are independent of each other. You can change any or all of them and the process will not break, provided you make the necessary adjustments to the configuration and the code.
Your website: example.com
CDN origin: example.com (since currently assets are at this host)
Pull Zone: http://dxxxxxx.cloudfront.net/
CNAME Host: cdn.example.com
Hope this clears the picture.
I have a static site through Cloudfront with an S3 origin & custom domain via Route 53. All works well, except that Google has also indexed the Cloudfront distribution url (d123etc.cloudfront.net) as well as my custom domain, leading to duplicate content issues.
I've tried canonical urls, but the distribution remains indexed. It has been suggested to serve up a different robots.txt depending on what domain is being used, which sounds fine, but there is no .htaccess or web server, leaving it to a Lambda Edge function to try and send the different robots.txt.
The problem is that I can't find how in the function to determine if a request is coming from my custom domain or from the direct distribution url. I've tried white-listing the Origin, but it is not sent through when using an S3 origin. I've also tried white-listing the Referer header, but no referrer is sent through when accessing the robots.txt file as it's a direct request.
For the time-being, I'm adding a meta noindex client-side using js on page load (which I realise is too late), and also redirecting client-side to my actual domain in case someone follows the google indexed cloudfront.net domain.
Does anyone know how to detect in Lambda Edge which domain is being used to make the request? Or some other way of blocking Google from indexing the Cloudfront url, just leaving it to index the custom domain.
So I think the way to do this would be to set up a redirect on your hosted webserver. Check the 'host' in the request header, and if it is a cloudfront.net host name, send a 301 response code along with your custom domain name.
S3 has a UI way to do this:
https://medium.com/tensult/how-to-do-site-redirection-using-aws-522a4002c645
It seems you'll need a second bucket behind the same cloudfront url but without the custom domain. Then you can set it to redirect all requests to your custom domain.
The browser or bots would then stop using the cloudfront.net address because it no longer serves any content. They would be redirected automatically (without the user really noticing) to your custom domain, and all the links would point to your own domain.
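If you would rather do the host check in Lambda@Edge than with a second bucket, something along these lines might work. It is a sketch, assuming the function is attached as a viewer-request trigger (at that stage the Host header should still be the name the visitor typed, whereas with an S3 origin the origin-facing request carries the bucket's host name). CUSTOM_DOMAIN is a placeholder, not a value from the question.

```python
CUSTOM_DOMAIN = "www.example.com"  # placeholder: replace with your own domain


def lambda_handler(event, context):
    """Viewer-request handler: redirect *.cloudfront.net hits to the custom domain."""
    request = event["Records"][0]["cf"]["request"]
    host = request["headers"]["host"][0]["value"].lower()

    if host.endswith(".cloudfront.net"):
        # Rebuild the same path (and query string, if any) on the custom domain.
        location = "https://" + CUSTOM_DOMAIN + request["uri"]
        if request.get("querystring"):
            location += "?" + request["querystring"]
        return {
            "status": "301",
            "statusDescription": "Moved Permanently",
            "headers": {
                "location": [{"key": "Location", "value": location}],
            },
        }

    # Any other host: let CloudFront continue processing the request unchanged.
    return request
```

The same host check could instead return a different robots.txt body for the cloudfront.net host rather than a redirect.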
I am fairly new to using CDNs, but I've found that there are two types of CDN.
You redirect your DNS to your CDN and they automatically take over the traffic as a proxy and do the caching and content delivery. No change in URLs and it's basically no work. It is even hard to tell whether my content is being delivered through the CDN (you have to check headers or use website tools that look for it). A good example is CloudFlare.
You do not redirect your DNS. You give it an origin server, then everything gets copied over to the CDN servers and your content is available on the new CDN URLs.
Now, I have a website with a lot of images. I want to use Microsoft Azure CDN. I created my profile (Standard Microsoft CDN) and created the CDN endpoint. I tested it and it works fine:
https://xxxx.com/images/example.png
https://xxxx.azureedge.net/images/example.png
All good - my image is there, along with the others.
So what comes next? I have an image (img src tag), for example, pointing to /images/example.png. It seems like I need to change it to https://xxxx.azureedge.net/images/example.png
My website has a lot of images, and manually redoing all the img src tags seems like a lot of work. And what happens if I decide to move to another CDN or stop using a CDN? All this leads me to believe I might be missing a point here and not doing this correctly.
Is that the correct way a CDN like this should work? If yes, may I get some help on how I can achieve that with the minimum amount of labour, rather than redoing all my CSS, JS and images to the new URLs? I am using Joomla CMS.
Documentation on how to deal with something as simple as this is unbelievably limited.
Basically you are right. Most CDN services "pull" static content (for example images) from your website, and then serve it to your visitors from multiple locations (servers) under the CDN URL you are given. For example:
Your origin url
mydomain.com/image.jpg
CDN url
mycdn.cdnservice.com/image.jpg
If the URL was the SAME as your existing url, then it wouldn't really work as a CDN now would it. There are often options so that you can use your own subdomain, for example cdn.mydomain.com/image.jpg, but it's still a change of URL. Most CMS's will often have options, or at least plugins, to set CDN url for static assets, which will dynamically replace the paths to point to the CDN url. If you have set file paths manually, then these will need to be replaced manually also with the full CDN path.
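To give an idea of what such a plugin does, here is a rough Python sketch of the "replace the paths dynamically" approach. It is not Joomla code: the function name, the extension list and the regular expression are my own assumptions, and it only handles double-quoted, root-relative src/href values with no query string. The CDN host is the one from the question.

```python
import re

CDN_BASE = "https://xxxx.azureedge.net"  # CDN endpoint from the question
STATIC_EXTENSIONS = ("png", "jpg", "jpeg", "gif", "css", "js")  # assumed list

# Matches src="/..." or href="/..." where the path ends in a static extension.
# The [^"/] after the first slash skips protocol-relative URLs such as //host/x.png.
_ASSET_ATTR = re.compile(
    r'(?P<attr>\b(?:src|href)=")(?P<path>/[^"/][^"]*\.(?:%s))(?P<quote>")'
    % "|".join(STATIC_EXTENSIONS),
    re.IGNORECASE,
)


def rewrite_asset_urls(html: str, cdn_base: str = CDN_BASE) -> str:
    """Prefix root-relative static asset paths in the HTML with the CDN host."""
    return _ASSET_ATTR.sub(
        lambda m: m.group("attr") + cdn_base + m.group("path") + m.group("quote"),
        html,
    )


page = '<img src="/images/example.png"> <a href="/about.html">About</a>'
rewritten = rewrite_asset_urls(page)
```

Because the CDN host lives in one constant, moving to another CDN (or turning it off) means changing a single value instead of editing every tag.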
There are a few hacks like server rewrite which might allow you to use the same URL, but this is not recommended to pursue. Generally speaking, using a CDN requires changing url to your static assets.
Option #2 is to use a reverse proxy CDN service like Cloudflare. This requires changing your nameservers to route ALL your traffic through Cloudflare, and then Cloudflare will work as a CDN for static assets without you having to change url paths. However, it must be noted that Cloudflare is much more than just a CDN, and you can't really control how your assets are cached on their CDN/servers.
I am hosting some videos on Google Drive.
Basically, I'd like to "mask" the download URL's of the videos with that of my own domain.
Currently, the links look like
https://drive.google.com/uc?id={id}&export=download
I'd like the links to be my example.com instead of google.com
At first, I tried inserting a cname record (eg. drive.mydomain.com -> drive.google.com) however, Google returns a 404 error in that case.
Can this be done?
In order to download a file from a server, the server must recognize the requested URL, including the host name. If you add a CNAME record like yourdomain.com CNAME google.com, the client will learn the IP of the Google server, but the request will still carry your host name, which the Google web server does not recognize, so it responds with a 404 error.
That said, there is no way to get a correct response from a server "masking" the domain name.
One workaround (maybe overkill) could be to create a script that temporarily downloads the file from Google to your server and then sends that file from your server to the final client.
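A minimal sketch of that workaround in Python, using only the standard library. The function names are made up, the URL pattern is the one from the question, and it does not handle any confirmation page, quota or access control that Drive may apply. The opener parameter exists only so the network call can be swapped out.

```python
import shutil
import urllib.request
from typing import BinaryIO, Callable
from urllib.parse import urlencode


def drive_download_url(file_id: str) -> str:
    """Build the download URL in the form shown in the question."""
    query = urlencode({"id": file_id, "export": "download"})
    return "https://drive.google.com/uc?" + query


def relay_file(
    file_id: str,
    out: BinaryIO,
    opener: Callable = urllib.request.urlopen,
) -> None:
    """Fetch the file from Drive and copy it, chunk by chunk, to `out`."""
    with opener(drive_download_url(file_id)) as upstream:
        shutil.copyfileobj(upstream, out)
```

In a real setup `out` would be the response stream of whatever web framework serves example.com, so the visitor only ever sees your own URL. Keep in mind that all the video traffic then flows through your server.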
I have set up a CloudFront origin pull distribution. It allows me to set an origin domain name, which I have done. This works.
But I don't want the whole domain to be the origin. I want
mydomain.com/folder/subfolder
to be the origin. Also, the CloudFront distribution has a CNAME (cdn.mydomain.com), which is set up via DNS to point to CloudFront. This seems to work.
So, basically, instead of this URL:
xyz.cloudfront.net/folder/subfolder/1.jpg
I want this instead:
cdn.mydomain.com/1.jpg
Currently I have achieved, via CNAME and origin pull:
cdn.mydomain.com/folder/subfolder/1.jpg
The question is: on CloudFront how do I setup an origin pull from a folder, not from the main domain name?
The accepted answer is out of date. This is possible using the "Origin Path" setting in AWS which will rewrite the request to a sub-folder on the origin:
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/distribution-web-values-specify.html#DownloadDistValuesOriginPath
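To illustrate what the Origin Path setting does, here is a small Python sketch of the mapping, using the names from the question. The helper is purely illustrative and not part of any AWS SDK: CloudFront puts the origin path in front of the path the viewer requested before it contacts the origin.

```python
def origin_request_target(origin_domain: str, origin_path: str, request_path: str) -> str:
    """Return the host and path the origin is asked for, given an Origin Path."""
    # The origin path is expected to start with "/" and not end with one.
    return origin_domain + origin_path.rstrip("/") + request_path


# Viewer asks for cdn.mydomain.com/1.jpg, Origin Path is /folder/subfolder.
target = origin_request_target("mydomain.com", "/folder/subfolder", "/1.jpg")
```

With Origin Path set to /folder/subfolder, a request for cdn.mydomain.com/1.jpg is fetched from mydomain.com/folder/subfolder/1.jpg.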
I am not aware of a way to do this in CloudFront. However, you could create a virtual host at your origin for subfolder.example.com and have its root directory be the directory you mentioned. Then you could set subfolder.example.com as your origin for the default cache behavior.
I'm trying to optimize an html webpage, and one of the suggestions from yslow is:
Use cookie-free domains: There are 11 components that are not cookie-free
So I followed one of the standard solutions I've seen and created a subdomain static.mysite.com and put the images there.
But I'm still getting the exact same problem -- a cookie is still being delivered with each image, and same yslow message.
So how do I get this subdomain to be cookie free?
If you are using subdomain for cookie-free delivery then your main page has to use www prefix.
I had the same problem. The subdomain simply didn't work, so I used a different domain name and it solved the problem.
When the browser makes a request for a static image and sends cookies together with the request, the server doesn't have any use for those cookies. So they only create network traffic for no good reason. You should make sure static components are requested with cookie-free requests. Create a subdomain and host all your static components there.
If your domain is www.example.org, you can host your static components on static.example.org. However, if you've already set cookies on the top-level domain example.org as opposed to www.example.org, then all the requests to static.example.org will include those cookies. In this case, you can buy a whole new domain, host your static components there, and keep this domain cookie-free. Yahoo! uses yimg.com, YouTube uses ytimg.com, Amazon uses images-amazon.com and so on.
Another benefit of hosting static components on a cookie-free domain is that some proxies might refuse to cache the components that are requested with cookies. On a related note, if you wonder if you should use example.org or www.example.org for your home page, consider the cookie impact. Omitting www leaves you no choice but to write cookies to *.example.org, so for performance reasons it's best to use the www subdomain and write the cookies to that subdomain.
Source - http://developer.yahoo.com/performance/rules.html
EDIT
If you set your cookies on a top-level domain (e.g. yourwebsite.com) all of your sub-domains (e.g. static.yourwebsite.com) will also include the cookies that are set. Therefore, in this case, it is required that you use a separate domain name to deliver your static content if you want to use cookie-free domains. However, if you set your cookies on a www subdomain such as www.yourwebsite.com, you can create another subdomain (e.g. static.yourwebsite.com) to host all of your static files which will no longer result in any cookies being sent.
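The rule above can be checked with a small Python sketch. It is a simplified version of cookie domain matching (it ignores paths, the Secure flag and public-suffix rules), and the function name is made up. A cookie with no Domain attribute is treated as host-only, which is what RFC 6265 specifies.

```python
from typing import Optional


def cookie_is_sent(
    request_host: str,
    set_by_host: str,
    domain_attr: Optional[str] = None,
) -> bool:
    """Would a cookie set by `set_by_host` be sent with a request to `request_host`?"""
    request_host = request_host.lower()
    if domain_attr is None:
        # Host-only cookie: sent only to the exact host that set it.
        return request_host == set_by_host.lower()
    domain = domain_attr.lower().lstrip(".")
    # Domain cookie: sent to that domain and to every subdomain of it.
    return request_host == domain or request_host.endswith("." + domain)


# Cookie scoped to the top-level domain: static.yourwebsite.com receives it.
leaks = cookie_is_sent("static.yourwebsite.com", "yourwebsite.com", "yourwebsite.com")

# Cookie scoped to www only: static.yourwebsite.com stays cookie-free.
clean = not cookie_is_sent("static.yourwebsite.com", "www.yourwebsite.com", "www.yourwebsite.com")
```

Both `leaks` and `clean` come out true, matching the two cases described above.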
For WordPress you can use this config:
define("WP_CONTENT_URL", "http://static.yourwebsite.com");
define("COOKIE_DOMAIN", "www.yourwebsite.com");
Details - https://www.keycdn.com/support/how-to-use-cookie-free-domains/
EDIT 2
You will need to move your static content over to the wp-content folder of your newly created subdomain!