Is a Windows Azure worker role instance a whole VM? - azure

When I run a worker role instance on Azure, is it a complete VM running in a shared host (like EC2)? Or is it running in a shared system (like Heroku)?
For example, what happens if my application starts requesting 100 GB of memory? Will it get killed off automatically for violation of limits (à la Google App Engine), or will it just exhaust the VM, so that the Azure fabric restarts it?
Do two roles ever run in the same system?

It's a whole VM, and the resources allocated are based directly on the size of VM you choose, from 1.75 GB (Small) to 14 GB (Extra Large), with 1-8 cores. There's also an Extra Small instance with 768 MB RAM and a shared core. Full VM size details are here.
With Windows Azure, your VM is allocated on a physical server, and it's the fabric's responsibility to find servers that can host all of your web or worker role instances. If you have multiple instances, this means allocating those VMs across fault domains.
With your VM, you don't have to worry about being killed off if you try to allocate too many resources: it's just like having a physical machine, and you can't go beyond what's there.
As far as two roles running on the same system goes: each role has instances, and with multiple instances, as I mentioned above, your instances are divided across fault domains. If, say, you had 4 instances and 2 fault domains, it's possible that you may have two instances on the same rack (or maybe even the same server).
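If you're curious which fault (and update) domain a given instance actually landed in, the ServiceRuntime API exposes that. A minimal sketch, assuming the Azure SDK's Microsoft.WindowsAzure.ServiceRuntime assembly is referenced:

```csharp
// Sketch: log the fault/update domain this particular role instance was placed in.
using System.Diagnostics;
using Microsoft.WindowsAzure.ServiceRuntime;

public static class PlacementInfo
{
    public static void LogDomains()
    {
        if (!RoleEnvironment.IsAvailable) return;   // not running under the Azure fabric

        RoleInstance me = RoleEnvironment.CurrentRoleInstance;
        Trace.TraceInformation(
            "Instance {0}: fault domain {1}, update domain {2}",
            me.Id, me.FaultDomain, me.UpdateDomain);
    }
}
```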

I ran a quick test to check this. I'm using a "small" instance that has something like 1.75 GB of memory. My code uses an ArrayList to store references to large arrays so that those arrays are not garbage collected. Each array is one billion bytes, and once it is allocated I run a loop that sets each element to zero, and then another loop that checks each element is zero, to ensure the memory is actually allocated from the operating system (not sure if it matters in C#, but it certainly mattered in C++). Once the array is created, written to and read from, it is added to the ArrayList.
So my code successfully allocated five such arrays, and the attempt to allocate the sixth one resulted in a System.OutOfMemoryException. Since 5 billion bytes plus overhead is definitely more than the 1.75 GB of physical memory allocated to the machine, I believe this proves that the page file is enabled on the VM and that the behavior is the same as on a regular Windows Server 2008 installation, subject to the limits of the machine it is running on.
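For reference, a rough reconstruction of that test (not the exact original code):

```csharp
// Allocates 1-billion-byte arrays until the runtime throws OutOfMemoryException,
// touching every element so the pages are actually committed by the OS.
using System;
using System.Collections;

class MemoryProbe
{
    static void Main()
    {
        const int OneBillion = 1000 * 1000 * 1000;
        var keepAlive = new ArrayList();   // holds references so nothing is collected

        try
        {
            while (true)
            {
                byte[] block = new byte[OneBillion];
                for (int i = 0; i < block.Length; i++) block[i] = 0;        // write every page
                for (int i = 0; i < block.Length; i++)                      // read it back
                    if (block[i] != 0) throw new InvalidOperationException();
                keepAlive.Add(block);
                Console.WriteLine("Allocated {0} arrays so far", keepAlive.Count);
            }
        }
        catch (OutOfMemoryException)
        {
            Console.WriteLine("OutOfMemoryException after {0} arrays", keepAlive.Count);
        }
    }
}
```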

Related

Umbraco 7.2 on Azure Websites takes 1.5 GB memory

I have a simple Umbraco 7.2 website (no plugins/custom code/etc.) on a Shared Azure Website. Azure has suspended the website twice in the last week because of over-quota memory usage. I scaled it up to 6 instances for now. Looking at the dashboard right now, it shows the site using 1.5 GB. I went into the Kudu interface (.scm.azurewebsites.net), and in the Process Explorer it shows that the process is only taking ~150 MB of both private memory and working set. It also says virtual memory is ~750 MB.
Why is Azure saying it's taking up so much memory? Does increasing the instance count actually mean more memory for my app, or does it just mean more instances are running the same app... so it's basically 200 MB * 6 instances = 1200 MB?
Thanks!
When you scale up the number of instances for an Azure Website, it changes the number of VMs your website is running on - so it means more instances running the same app.
Similarly, when you log into the Kudu interface, it connects to the process running on one particular instance - so it won't show you the total memory being used by all instances at once, just what's used by the particular instance you're connected to.
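If you want to see the same counters Kudu reports from inside the app itself (private bytes vs. working set vs. virtual memory), a small sketch:

```csharp
// Prints the current process's memory counters, roughly matching what
// Kudu's Process Explorer displays for one instance.
using System;
using System.Diagnostics;

class MemoryReport
{
    static void Main()
    {
        using (Process p = Process.GetCurrentProcess())
        {
            Console.WriteLine("Private bytes : {0:N0} MB", p.PrivateMemorySize64 / (1024 * 1024));
            Console.WriteLine("Working set   : {0:N0} MB", p.WorkingSet64 / (1024 * 1024));
            Console.WriteLine("Virtual memory: {0:N0} MB", p.VirtualMemorySize64 / (1024 * 1024));
        }
    }
}
```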

Azure changing hardware

I have a product which uses the CPU ID, network MAC address, and disk volume serial numbers for validation. Basically, when my product is first installed these values are recorded, and then when the app loads, the current values are compared against the old ones.
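(For context, this is roughly how such identifiers are typically read on Windows, e.g. via WMI; a hedged sketch, not necessarily the asker's implementation:)

```csharp
// Illustrative only: reads the kinds of hardware identifiers described above via WMI.
// Requires a reference to System.Management.
using System;
using System.Management;

class HardwareIds
{
    static void Dump(string wmiClass, string property)
    {
        using (var searcher = new ManagementObjectSearcher(
            string.Format("SELECT {0} FROM {1}", property, wmiClass)))
        {
            foreach (ManagementObject obj in searcher.Get())
                Console.WriteLine("{0}.{1} = {2}", wmiClass, property, obj[property]);
        }
    }

    static void Main()
    {
        Dump("Win32_Processor", "ProcessorId");                   // CPU ID
        Dump("Win32_NetworkAdapterConfiguration", "MACAddress");  // network MACs
        Dump("Win32_LogicalDisk", "VolumeSerialNumber");          // disk volume serials
    }
}
```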
Something very mysterious happened recently. Inside an Azure VM that had not been restarted in weeks, my app failed to load because some of these values were different. Unfortunately the person who caught the error deleted the VM before it was brought to my attention.
My question is: while an Azure VM is running, what hardware resources may change? Is that even possible?
Thanks!
Answering this requires a short rundown of how Azure works.
In each data centre there are thousands of individual machines. Each machine runs a hypervisor which allows a number of operating systems to share the same underlying hardware.
When you start a role, Azure looks for available resources - disk space, CPU, RAM, etc. - and boots up a copy of the appropriate OS VM on those available resources. I understand from your question that this is a VM role - so this VM is the one you uploaded or created.
As long as your VM is running, the underlying virtual resources provided by the hypervisor are not likely to change. (The caveat is that Windows Server 2012's hypervisor can move virtual machines around over the network even while they are running; whether Azure takes advantage of this, I don't know.)
Now, Azure keeps charging you even when your role has stopped, because it considers your role "deployed". So in theory, those underlying resources still "belong" to your role.
This is not guaranteed. Azure could decide to boot up your VM on a different set of virtualized hardware for any number of reasons - hardware failure being at the top of the list, with insufficient capacity being second.
It is even possible (though unlikely) for your resources to be provided by different hardware nodes.
An additional point of consideration is that it is Azure policy that disaster recovery (or another major event) may include transferring your roles to run in a separate data centre entirely.
My point is that the underlying hardware is virtual and treating it otherwise is most unwise. Roles are at the mercy of the Azure Management Routines, and we can't predict in advance what decisions they may make.
So the answer to your question is that ALL of the underlying resources may change. And it is very, very possible.

Allocating resources for SQL Server instances - dba

I have a question regarding allocating resources for SQL Server 2008 R2. We have a physical Windows Server 2008 R2 machine with three installations of SQL Server 2008 R2: production, test and development. The server has 64 GB of RAM and 24 cores. If we want to allocate a specific amount of resources to a specific instance, can we do it? For instance, we want to allocate 32 GB and 12 cores to the production instance, and 12 GB and 6 cores each to test and development.
Because all three instances are on the same physical server, we do not want the test and development instances to consume more resources than we want them to. Is there a way to set this at the server level or in SQL Server?
You have some options. The three options I see are:
(1) Use a VM for each instance on the same physical machine and allocate resources to the VMs as you see fit.
(2) Use SQL Server Resource Governor (http://blog.sqlauthority.com/2012/06/04/sql-server-simple-example-to-configure-resource-governor-introduction-to-resource-governor/).
With this you can set up resource limits by user. So for your test instance, just set up its users with a lower resource allocation.
(3) In SQL Server you can also set the max memory of an instance (visible in sys.configurations) using sp_configure. For CPU you could try experimenting with the affinity mask in a test environment, or assigning a specific CPU ID range or NUMA node ID range to the instance. For more details: http://msdn.microsoft.com/en-us/library/ee210585(v=sql.110).aspx
I would recommend always setting max server memory on any instance. I would not recommend messing with the affinity or CPU range settings without a lot of testing. Instead, for your case I would recommend using Resource Governor. This is because it does not have the drawback of VMs, which have their own overhead and thus use up some server resources. With Resource Governor you can also leave CPU scheduling to the OS and the server. This option also gives you the most flexibility, since you can limit resources by specific user, in case you ever need that.
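As a minimal sketch of capping an instance's memory with sp_configure (option 3), run once against each instance - the connection string, instance name and memory value below are just placeholders taken from the question:

```csharp
// Sketch: cap the production instance at 32 GB via sp_configure's
// "max server memory (MB)" option (an advanced option, hence the first call).
using System.Data.SqlClient;

class SetMaxMemory
{
    static void Main()
    {
        const string connStr = @"Server=MYSERVER\PROD;Integrated Security=true"; // hypothetical instance

        const string sql = @"
            EXEC sp_configure 'show advanced options', 1; RECONFIGURE;
            EXEC sp_configure 'max server memory (MB)', 32768; RECONFIGURE;";

        using (var conn = new SqlConnection(connStr))
        using (var cmd = new SqlCommand(sql, conn))
        {
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}
```

The same statement with 12288 instead of 32768 would cap the test and development instances at 12 GB each.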

Azure VM pricing - Is it better to have 80 single core machines or 10 8-core machines?

I am limited by a piece of software that utilizes a single core per instance of the program. It runs off a SQL Server work queue and deposits results back to the server. So the more instances I have running, the faster the overall project is done. I have played with Azure VMs a bit and can speed up the process in two ways.
1) I can run the app on a single-core VM, clone that VM and run it on as many as I feel necessary to speed up the job sufficiently.
OR
2) I can run the app 8 times on an 8-core VM, ...again clone that VM and run it on as many as I feel necessary to speed up the job sufficiently.
I have noticed in testing that the speed-up is roughly the same for adding 8 single-core VMs and 1 8-core VM. Assuming this is true, would it be better price-wise to have single-core machines?
The pricing is a bit of a mystery - whether it is based on real CPU usage time, or what. It is a bit easier using the one 8-core approach, as spinning up machines and taking them down takes time, but I guess that could be automated.
It does seem from some pricing pages that the multiple single-core VM approach would cost less?
Side question: could I use some PowerShell scripts to just keep adding VMs from a certain image and running the app, and then start shutting them down once I get close to finishing? After generating the VMs, would there be some way to kick off the app without having to remote into each one and run it?
I would argue that, all else being equal, and this code truly being CPU-bound and not benefiting from any memory sharing that running multiple processes on the same machine would provide, you should opt for single-core machines rather than multi-core machines.
Reasons:
Isolate fault domains
Scaling out rather than up is better when possible, because it naturally isolates faults. If one of your small nodes crashes, only one process is affected. If a large node crashes, multiple processes go down.
Load balancing
Windows Azure, like any multi-tenant system, is a shared resource. This means you will likely be competing for CPU cycles with other workloads. Having small VMs gives you a better chance of having them distributed across the physical servers in the datacenter that have the best resource situation at the time the machines are provisioned (you would want to stop and deallocate the VMs before starting them again, to allow the Azure fabric's placement algorithms to select the best hosts). With large VMs, it is less likely that a suitable host with low contention will be found to accommodate that many virtual cores.
Virtual processor scheduling
It's not widely understood how scheduling a virtual CPU differs from scheduling a physical one, but it is something worth reading up on. The main thing to remember is that hypervisors like VMware ESXi and Hyper-V (which runs Azure) schedule multiple virtual cores together rather than separately. So if you have an 8-core VM, the physical host must have 8 physical cores free simultaneously before it can allow the virtual CPU to run. The more virtual cores, the less likely the host will have sufficient physical cores free at any given time (even if 7 physical cores are free, the VM cannot run). This can have the paradoxical effect of making the VM perform worse as more virtual CPU cores are added to it. http://www.perfdynamics.com/Classes/Materials/BradyVirtual.pdf
In short, a single vCPU machine is more likely to get a share of the physical processor than an 8 vCPU machine, all else equal.
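As a toy illustration (with made-up numbers): if at a given scheduling instant each physical core on a busy host were free independently with probability 0.5, a 1-vCPU VM could be scheduled about half the time, while an 8-vCPU VM that needs all eight cores free at once could be scheduled only about 0.5^8 ≈ 0.4% of the time.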
And I agree that the pricing is basically the same, except for a little more storage cost to store many small VMs versus fewer large ones. But storage in Azure is far less expensive than compute, so it likely doesn't tip the economic scale.
Hope that helps.
Billing
According to Windows Azure Virtual Machines Pricing Details, Virtual Machines are charged by the minute (of wall clock time). Prices are listed as hourly rates (60 minutes) and are billed based on total number of minutes when the VMs run for a partial hour.
In July 2013, 1 Small VM (1 virtual core) costs $0.09/hr; 8 Small VMs (8 virtual cores) cost $0.72/hr; and 1 Extra Large VM (8 virtual cores) costs $0.72/hr (the same as 8 Small VMs).
VM Sizes and Performance
The VM sizes differ not only in the number of cores and RAM, but also in network I/O performance, ranging from 100 Mbps for Small to 800 Mbps for Extra Large.
Extra Small VMs are rather limited in CPU and I/O power and are inadequate for workloads such as the one you described.
For single-threaded, I/O-bound applications such as the one described in the question, an Extra Large VM could have an edge because of faster response times for each request.
It's also advisable to benchmark workloads running 2, 4 or more processes per core - for instance, 2 or 4 processes on a Small VM and 16, 32 or more processes on an Extra Large VM - to find the right balance between CPU and I/O load (provided you don't use more RAM than is available).
Auto-scaling
Auto-scaling Virtual Machines is built into Windows Azure directly. It can be based either on CPU load or on the length of a Windows Azure Queue.
Another alternative is to use specialized tools or services to monitor load across the servers and run PowerShell scripts to add or remove virtual machines as needed.
Auto-run
You can use the Windows Scheduler to automatically run tasks when Windows starts.
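For instance, a hedged sketch of registering such a run-at-boot task from code by shelling out to schtasks (the task name and executable path below are made up):

```csharp
// Sketch: create a scheduled task that starts the worker app whenever the VM boots.
using System.Diagnostics;

class RegisterStartupTask
{
    static void Main()
    {
        var psi = new ProcessStartInfo
        {
            FileName = "schtasks.exe",
            Arguments = "/Create /F /SC ONSTART /RU SYSTEM " +
                        "/TN \"RunWorkQueueApp\" /TR \"C:\\apps\\worker.exe\"",
            UseShellExecute = false
        };
        using (var p = Process.Start(psi))
        {
            p.WaitForExit();    // non-zero exit code means the task was not created
        }
    }
}
```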
The pricing is "Uptime of the machine in hours * rate of the VM size/hour * number of instances"
e.g. You have a 8 Core VM (Extra Large) running for a month (30 Days)
(30 * 24) * 0.72$ * 1= 518.4$
for 8 single cores it will be
(30 * 24) * 0.09 * 8 = 518.4$
So I doubt if there will be any price difference. One advantage of using smaller machines and "scaling out" is when you have more granular control over scalability. An Extra-large machine will eat more idle dollars than 2-3 small machines.
Yes, you can definitely script this. Assuming they are IaaS machines, you could add the script to Windows startup; if on PaaS, you could use a "Startup Task".

Azure Worker Role data volatility

I would like to create an application that holds a large amount of volatile data in memory. Only a small part of this data needs to be persisted when the host machine shuts down, or in case of maintenance. Outages should be rare and this in-memory data needs to be accessible most of the time, but rare restarts of the service are bearable.
If I were developing for a server, I would create a Windows Service, which runs reliably while the machine is up, and I would persist a fraction of the data in the OnStop() method.
I'm thinking of moving this whole thing to the cloud. The question is whether a Worker Role is similar to a Windows Service from this point of view. Does it run most of the time with rare outages, or is it recycled / restarted from time to time or when it is idle?
Like a Windows Service, a worker role is meant for processing background tasks. However, one thing you need to keep in mind is that your worker role can go down at any time. It may be because of a hardware failure or software updates. Thus you can't always assume it will be highly available. That's why Windows Azure recommends deploying multiple instances of your application.
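A worker role does give you an OnStop hook analogous to the service's OnStop(), but only for graceful shutdowns (updates, scale-down, planned maintenance) and only with a limited time budget, so it won't fire on a hardware failure. A minimal sketch, where SaveCriticalState() is a placeholder for your own persistence code:

```csharp
// Sketch: a worker role mirroring the Windows Service pattern described above,
// persisting the small critical slice of in-memory state on graceful shutdown.
using System.Threading;
using Microsoft.WindowsAzure.ServiceRuntime;

public class WorkerRole : RoleEntryPoint
{
    private readonly ManualResetEvent _stop = new ManualResetEvent(false);

    public override void Run()
    {
        // Keep serving the volatile in-memory data until the fabric asks us to stop.
        _stop.WaitOne();
    }

    public override void OnStop()
    {
        // Only a limited time window is available here, so persist just the essentials.
        SaveCriticalState();   // hypothetical: write the persistent part to blob/table storage
        _stop.Set();
        base.OnStop();
    }

    private void SaveCriticalState() { /* persistence goes here */ }
}
```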
What you could do is have multiple instances of your worker role running, all of them sharing a common cache where you would put the volatile data. Do take a look at Windows Azure Caching (http://msdn.microsoft.com/en-us/library/windowsazure/gg278356.aspx), where you can either dedicate some memory of a VM (i.e. an instance) for caching purposes or dedicate a full VM to caching. That way your volatile data lives somewhere outside of your worker roles, making it available to all instances.
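A minimal sketch of what that could look like, assuming the Windows Azure Caching client package is referenced and a cache is configured for the role (the key/value usage is just illustrative):

```csharp
// Sketch: sharing volatile state across role instances through Windows Azure Caching.
using Microsoft.ApplicationServer.Caching;

public class VolatileStore
{
    // DataCacheFactory reads the cache client settings from the role's configuration.
    private static readonly DataCacheFactory Factory = new DataCacheFactory();
    private static readonly DataCache Cache = Factory.GetDefaultCache();

    public void Save(string key, object value)
    {
        Cache.Put(key, value);      // overwrites if the key already exists
    }

    public object Load(string key)
    {
        return Cache.Get(key);      // null if the key is not in the cache
    }
}
```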
