Azure scaling with Orchard CMS

I was reading through these questions:
Scaling Orchard with Azure Web Sites
Orchard CMS Performance
How to deploy Orchard CMS in Windows Azure?
I started to think about an e-commerce project I am undertaking and would like to clarify a few things if possible.
Please forgive me because I am finding it very difficult to articulate this question in a way I feel I have clearly communicated what I am thinking.
Firstly, what factors should make me start thinking about scaling to handle my web site's traffic, and when would those factors kick in? The factors I am aware of include:
Session handling
Caching
The amount of data being served in a request (though I am not sure of the full implications of request size)
Secondly, as with all things, a certain level of up-front planning is needed when setting up a web site that must handle a certain level of traffic. Would the Azure scaling need to be done up front, or is it a simple matter to make it work now for what is needed and then scale up at a later date when necessary?
Let me give a real-life scenario to illustrate where my fear lies:
A radio broadcast was put out for a certain web site trying to sell
their wares. The web site was not planned very well. The web site
started to receive visits from people listening to the radio show. So
many visitors that the web site was not able to handle the traffic and
an error message was displayed telling the world that they should
'talk to the administrator' or words to that effect. You know the
picture I am sure and I am also very certain it would be embarrassing
for any web developer to be told that this was happening to a web site
they had designed.
I would really like to distil a proper question out of this, but there are many things that I am simply not aware of. To try to make this question less vague, I will summarise what I would like to achieve:
I want a web site that is able to handle a lot of traffic following successful advertising/marketing campaigns. I want to walk the tightrope of budget versus functionality, which is why I would like to do the least amount possible to start with and be able to easily scale up as demand dictates.
Bearing this in mind, what approach/considerations should I take to avoid nasty pitfalls with performance/availability/reliability when using an Orchard CMS/Azure combination to deliver my project?

Orchard on Azure Web Sites is working great for us, see http://nublr.pt
A few things to bear in mind with the site configuration are:
follow the guidelines in http://docs.orchardproject.net/Documentation/Optimizing-Performance-of-Orchard-with-Shared-Hosting
set up caching (the Contrib.Cache module, available in the gallery), which will use IIS's application cache.
set up the Warmup feature to keep the site alive.
also ensure that dynamic compilation is off via the Config/HostComponents.config file (a sketch of that file follows this list).
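For reference, the HostComponents.config override for that last point looks roughly like the sketch below. The exact component type name differs between Orchard versions (the one shown here is an assumption from memory of the linked guide), so verify it against the documentation before relying on it:

    <HostComponents>
      <Components>
        <!-- Type name is an assumption; check the Orchard performance guide for your version -->
        <Component Type="Orchard.Environment.Extensions.Loaders.DynamicExtensionLoader">
          <Properties>
            <Property Name="Disabled" Value="true"/>
          </Properties>
        </Component>
      </Components>
    </HostComponents>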
We are currently on the "shared" mode of Azure Web Sites. We don't have much traffic yet, but our load testing with https://loadimpact.com has not taken the site down once. At any time we can move to the "reserved" mode (it does take up to 24h for the switch to happen).
Version 1.6 will bring a lot of improvements to Orchard; try to get started with your development on it.
Hope this has helped.

Related

Block traffic from referral spam bots in Azure Web App with DNN

I am sure many of you have found fake referral traffic in your Google Analytics reports/views. This makes it difficult for low- to medium-traffic sites to have accurate data for marketing. I am wondering what others are doing to exclude this traffic from their analytics reports.
If you go to your Analytics account and open Acquisition -> All Traffic -> Referrals, you will see sites like floating-share-buttons.com. These are the sites I want to filter out, which you can do by setting up a custom filter for the view as described at the bottom of this page. I have done this and it works.
I would rather block these bots from hitting the site altogether. Just a note: my sites are running as web apps in Azure.
I am not sure whether setting up the URL rewrite rules described here will work in Azure web apps, or whether this will mess with the existing URL rewrite functions of the content management system I am using (DotNetNuke/DNN Platform 7).
I am really just looking to hear what others have done to block bots, rather than setting up filters in the analytics view's settings.
Thanks
PS
for those who are interested, this is the current filter list I am using:
webmonetizer\.net|trafficmonetizer\.org|success-seo\.com|event-tracking\.com|Get-Free-Traffic-Now\.com|buttons-for-website\.com|4webmasters\.org|floating-share-buttons\.com|free-social-buttons\.com|e-buyeasy\.com
With regard to this issue, there are a number of things you can do. You are going the route I see most commonly used, which is to exclude the traffic using filters in Google Analytics.
You can go the route of an IIS filter as well, just like you have linked. DNN's Friendly URLs will not necessarily be impacted by this, as the rewrite rules are processed BEFORE DNN gets the request. There is a marginal performance impact from having two things processing rewrites, but nothing to be concerned about until you reach incredibly high user volume.
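If you do go the IIS route on an Azure web app, a referrer-based rule in web.config (under <system.webServer>) is one way to sketch it; the URL Rewrite module is available on Azure Web Apps. The pattern below only covers a few of the domains from your filter list and would need extending:

    <rewrite>
      <rules>
        <rule name="Block referrer spam" stopProcessing="true">
          <match url=".*" />
          <conditions>
            <!-- extend the pattern with the rest of the spam domains -->
            <add input="{HTTP_REFERER}" pattern="(floating-share-buttons|buttons-for-website|4webmasters)\." />
          </conditions>
          <action type="AbortRequest" />
        </rule>
      </rules>
    </rewrite>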
This is also a great collection of options.
First, you need to know that there are mainly two types of spam affecting GA right now: ghosts and crawlers.
The first (ghosts) never interact with your page, so any server-side solution like HTTP rules or an htaccess file won't have any effect and will only clutter your config files.
The crawlers, as the name implies, do access your website and can be blocked this way, but there are only a few of them compared with the ghosts. To give you an idea, there are around 8 active crawlers while there are more than 100 ghosts, and the number increases each week.
This is because the ghost method is easier to implement for the spammers.
From your expression, only success-seo is a crawler. The rest should be filtered. Now, there is a better way to get rid of all the ghosts with just one filter based on your valid hostnames, instead of creating or updating one every week (an example pattern is sketched after the links below).
You can find more information about the ghost spam and the solution here
https://stackoverflow.com/a/28354319/3197362
https://moz.com/ugc/stop-ghost-spam-in-google-analytics-with-one-filter
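To illustrate the hostname approach: the filter is a custom Include filter on the Hostname field, and the pattern lists only the hosts that legitimately serve your pages (your own domains plus any third-party services such as payment gateways or translation proxies). The domains below are placeholders, not a recommendation:

    yoursite\.com|shop\.yoursite\.com|translate\.googleusercontent\.com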
Hope it helps.

What server should I use if I am expecting over 20k visits/day?

I recently launched a fantasy football online game for the English Premier League called Myfpl11.com, and I want to know what server I should choose if I am expecting 20k visits a day. My visits are going up and I want the site to keep performing smoothly. Please help.
This is probably the wrong area of StackExchange to ask this sort of question. However ...
The first thing you should do is get prepared to scale horizontally instead of vertically. If you keep growing you will soon grow out of any single server that you purchase.
Instead, what you need to do is start looking at ways to modify your website to work across multiple systems. If you're experiencing load issues on the server you currently have, spin up a second instance of the exact same size and move the database to it, so you will then have two servers: one dedicated to the database (which will really help it do its job) and one dedicated to serving traffic.
From there, look at how you can scale out to multiple web processes and databases, and add caching layers.
You can add cloudflare.com as your DNS service, which will help you by caching your assets better; most importantly, they will deliver a technical-issues page to your users if your site does fall over at any stage. This is really helpful for saving face, because visitors will get an actual page with a message instead of a continually loading white page.
Look at using services like digitalocean.com or linode.com (both very affordable and great staff) where you can easily add/remove resources as you need them.

Google Analytics A/B testing with 2 site instances

I am getting ready to release a new web site in the coming weeks and would like the ability to run multivariate or A/B tests between two versions of the site.
The site is hosted on Azure, and I am using the Service Gateway to split traffic between the instances of the site, both of which are deployed from Visual Studio Online: one from the main branch and the other from an "experimental" branch.
Can I configure Google Analytics to assist me in tracking the success of my tests? From what I have read, Google Analytics seems to focus on multiple versions of a page within the same site for running its experiments.
I have thought of perhaps using 2 separate tracking codes, but my customers are not overly technically savvy, so I would like to keep things as simple as possible. I have also considered collecting my own metrics inside the application, but I would prefer to use an existing tool as I don't really have the time to implement something like that.
Can this be done? Are there better options? Is there a good NuGet package that might fulfil my needs? Any advice welcome.
I'd suggest setting a custom dimension that tells you which version of the site the user is on. Then in the reports you can segment and compare the data.
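As a minimal sketch, assuming the Universal Analytics (analytics.js) tracker and a custom dimension registered at index 1 under Admin > Custom Definitions (the property ID and dimension value below are placeholders), each deployment would set its own value before sending the pageview:

    // hypothetical property ID; use 'experimental' on the experimental branch, 'main' on the main one
    ga('create', 'UA-XXXXXXX-1', 'auto');
    ga('set', 'dimension1', 'experimental');
    ga('send', 'pageview');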

What are the tools to parse/analyze IIS logs - ideally free/open source?

Note: there are a few similar questions already asked here, but they are from 2009. Maybe something has changed since then.
I'm responsible for a bunch of websites hosted on different servers. I do not do any log analysis right now, but I would like to change this. First question: what is the best tool to spot ISSUES with a website based on its IIS logs (i.e. 404 and 500 responses, long page processing times, etc.)? Ideally with grouping/sorting options? I do not want to spend a lot of time on this; I just want to periodically check that all is well with the website.
Second question (and I know I'm most likely asking for too much): is there any way to expose the processed logs to the web, so I can review the things mentioned above without RDPing into the server?
Ideally I'm looking for a free/open source solution, but I'm ready to pay for a good software as well (but not a lot of $$).
Thank you.
You can take a look at our log monitoring solution, EventSentry, which can monitor text-based logs like IIS logs. We have standard templates set up for IIS, and we can consolidate the logs in a database with web access, so that you can review the logs without using RDP.
It's a pretty flexible solution that allows you to pick the fields you are interested in, and ignore the ones you are not - and thus save space in your database.
You can also set up real-time alerts, so that you can get an email when a critical error is encountered in a log file, like a 500 error.
http://www.eventsentry.com/features/log-file-monitoring
Finally, you can also plug-in command line tools which can verify that a given web page is accessible, or get alerted when it changes: http://www.eventsentry.com/features/application-monitoring.
I'm biased of course, but I would say that our solution is pretty affordable. Since it also offers additional functionality, such as service monitoring (to monitor your IIS services) and event log monitoring (IIS does log critical messages to the event log), you can set up comprehensive monitoring with a single product.
I'd look into @LuckyLuke's solution (or similar) - the classic "build vs buy" decision. Based on your post, this isn't going to be your "full time" job, so IMHO it's best to leave it to those for whom it is...
I don't know what "legacy" answers you are referring to, but if you want to tinker you can use Microsoft's own Log Parser and, depending on how far you want to go with it, use it (it is a COM DLL) to write your own "admin web pages" in .NET/ASP.NET and host them on each of your servers....
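To give a flavour of that route, Log Parser queries IIS W3C logs with a SQL-like syntax; something along these lines summarises error responses (the log folder is a placeholder and the available fields depend on what your sites actually log):

    LogParser.exe -i:IISW3C "SELECT cs-uri-stem, sc-status, COUNT(*) AS Hits FROM C:\inetpub\logs\LogFiles\W3SVC1\*.log WHERE sc-status >= 400 GROUP BY cs-uri-stem, sc-status ORDER BY Hits DESC" -o:NAT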
If you're very specific about the errors you want to be alerted about, another "hacky" way would be to provide your own custom error pages (either the default IIS error pages, or configure your ASP.NET apps to use specific error pages).

Difference between Ad company statistics, Google Analytics and Awstats on adult sites

I have this problem: I have a web page with adult content, and for the past several months I have had PPC advertising on it. I've noticed a big difference between the ad company's statistics for my page, the Google Analytics data, and the AWStats data on my server.
For example, the ad company tells me I have 10K pageviews per day, Google Analytics tells me I have 15K pageviews, and AWStats shows around 13K pageviews. Which system should I trust? Should I write my own (and reinvent the wheel again)? If so, how? :)
The funny thing is that I have another web page with "normal" content (an MMORPG fan site), and those numbers are roughly equal in all three systems (ad company, GA, AWStats). Do you think it's because that one is not an adult-oriented page?
And a final question that is totally off-topic: do you know of an ad company that pays per impression and doesn't mind adult sites?
Thanks for the answers!
First, you should make sure not to mix up »hits«, »files«, »visits« and »unique visits«. They all have different meanings and are sometimes named differently. I recommend looking up some definitions if you are confused about the terms.
AWStats probably has the most accurate statistics, because it has access to the access.log from the web server. Unfortunately, a cached page (whether cached by the browser, a proxy at an ISP, or your own caching server) might not produce a hit on the web server. Especially if your site is served with good caching hints which don't force a revalidation, and you are running your own web cache (e.g. Squid) in front of your site, the number will be considerably lower, because it only measures the work of the web server itself.
Google Analytics, on the other hand, is only able to count requests from users who haven't blocked Google Analytics and have JavaScript enabled (but it will count pages served by a web cache). So this count can be influenced by the user, but isn't affected by web caches.
The ad company is probably simply counting the number of requests it gets from your site (probably based on its own access.log). So, to get counted there, the ad must not be cached and must not be blocked by the user.
So, as you can see, it's not that easy to get a single correct value. But as long as you use the measured values in comparison to those from the previous months, you should get at least a (nearly) correct rate of growth.
And your porn site probably serves a large amount of static content (e.g. images from disk), and most web servers are really good at sending caching hints automatically for static files. Your MMORPG site, on the other hand, might consist mostly of dynamic scripts (PHP?) which don't send any caching hints at all, and web servers aren't able to determine those caching headers for dynamic content automatically. That's at least my explanation, without knowing your application and server configuration :)
