We have a number of Service Fabric clusters provisioned in Azure, for dev and testing. I would like to find a way to 'pause' these over night to save paying for them when they're not being used.
This seems to be what the Azure Dev Labs are for, but as far as I can see they don't support Service Fabric Clusters.
I'm thinking of writing a script to completely tear these environments down at night and rebuild them in the morning, but before doing that I'm wondering if there are any better ways.
Service Fabric clusters cannot be safely "paused". If you shut down all VMs, there is a chance that the cluster's state - the applications and their data - will be lost.
If you don't mind starting with a fresh set of clusters every morning, it's pretty straightforward to automate. You can define your environments using ARM templates and write a short script to provision, then create another script to delete the resource groups at the end of the day, which will remove the VMs and all associated resources.
Related
I am investigating a robust way to scan my Azure AKS clusters and randomly change the numbers of pods, allocated resources, throttling and if possible limit connections to other resources (E.g. database, queues, cache).
The idea is to have this running against any environment (test, QA, live)
Log what changes where made and when
Email that the script has run
Return environment to desired state
My questions are:
Is there tooling for this already?
If this possible via CRON/ Azure pipelines?
This is part of my stress development work cycle that includes API integration and load testing to help find weakness and feedback ways we can improve our offering and teams reputation
Google "Kubernetes chaos engineering".
Look at Azure Chaos Studio https://azure.microsoft.com/en-us/products/chaos-studio/#overview
Create a chaos experiment that uses a Chaos Mesh fault to kill AKS pods with the Azure portal https://learn.microsoft.com/en-us/azure/chaos-studio/chaos-studio-tutorial-aks-portal
I want to host an Orleans project on Azure, but don't want to use the (classic) Cloud Services model (I want an ARM template project). The web app sample uses the old web / worker model - what is best option? There is a Service Fabric sample - is that the best route? The nearest equivalent to the web/worker model is VM Scale Sets - is that a well tested option?
IMO, app service is closet to web role.
Worker role however, depending on the point of view
From system architecture point of view, I think Scale Set is the closet. You get an identical set of VMs running your application. However you lost all management features. How your cluster handle application configurations, work loads on each node, service interruptions from server failure or deployments are pretty much DIY. Also you need to provision the VM with dependencies for your application.
From operations point of view, I think Service Fabric is the closest. It handles problems above but then you are dealing with design/implementation changes and learning curve from the added fabric layer in the architecture. Could be small, could be big depending on the complexity of your project. Besides, service fabric is still relatively new and nothing is for sure. Best case you follow the sample change a few lines of code and it works like a charm. Worst case you may want to complete refactor orleans solution into service fabric solution.
App service would be the easiest among the three. If it doesn't meet your requirement, I personally would try Service Fabric. Same reason why people are moving to cloud and you would opt for ARM solution.
We are working on an application that processes excel files and spits off output. Availability is not a big requirement.
Can we turn the VM sets off during night and turn them on again in the morning? Will this kind of setup work with service fabric? If so, is there a way to schedule it?
Thank you all for replying. I've got a chance to talk to a Microsoft Azure rep and documented the conversation in here for community sake.
Response for initial question
A Service Fabric cluster must maintain a minimum number of Primary node types in order for the system services to maintain a quorum and ensure health of the cluster. You can see more about the reliability level and instance count at https://azure.microsoft.com/en-gb/documentation/articles/service-fabric-cluster-capacity/. As such, stopping all of the VMs will cause the Service Fabric cluster to go into quorum loss. Frequently it is possible to bring the nodes back up and Service Fabric will automatically recover from this quorum loss, however this is not guaranteed and the cluster may never be able to recover.
However, if you do not need to save state in your cluster then it may be easier to just delete and recreate the entire cluster (the entire Azure resource group) every day. Creating a new cluster from scratch by deploying a new resource group generally takes less than a half hour, and this can be automated by using Powershell to deploy an ARM template. https://azure.microsoft.com/en-us/documentation/articles/service-fabric-cluster-creation-via-arm/ shows how to setup the ARM template and deploy using Powershell. You can additionally use a fixed domain name or static IP address so that clients don’t have to be reconfigured to connect to the cluster. If you have need to maintain other resources such as the storage account then you could also configure the ARM template to only delete the VM Scale Set and the SF Cluster resource while keeping the network, load balancer, storage accounts, etc.
Q)Is there a better way to stop/start the VMs rather than directly from the scale set?
If you want to stop the VMs in order to save cost, then starting/stopping the VMs directly from the scale set is the only option.
Q) Can we do a primary set with cheapest VMs we can find and add a secondary set with powerful VMs that we can turn on and off?
Yes, it is definitely possible to create two node types – a Primary that is small/cheap, and a ‘Worker’ that is a larger size – and set placement constraints on your application to only deploy to those larger size VMs. However, if your Service Fabric service is storing state then you will still run into a similar problem that once you lose quorum (below 3 replicas/nodes) of your worker VM then there is no guarantee that your SF service itself will come back with all of the state maintained. In this case your cluster itself would still be fine since the Primary nodes are running, but your service’s state may be in an unknown replication state.
I think you have a few options:
Instead of storing state within Service Fabric’s reliable collections, instead store your state externally into something like Azure Storage or SQL Azure. You can optionally use something like Redis cache or Service Fabric’s reliable collections in order to maintain a faster read-cache, just make sure all writes are persisted to an external store. This way you can freely delete and recreate your cluster at any time you want.
Use the Service Fabric backup/restore in order to maintain your state, and delete the entire resource group or cluster overnight and then recreate it and restore state in the morning. The backup/restore duration will depend entirely on how much data you are storing and where you export the backup.
Utilize something such as Azure Batch. Service Fabric is not really designed to be a temporary high capacity compute platform that can be started and stopped regularly, so if this is your goal you may want to look at an HPC platform such as Azure Batch which offers native capabilities to quickly burst up compute capacity.
No. You would have to delete the cluster and recreate the cluster and deploy the application in the morning.
Turning off the cluster is, as Todd said, not an option. However you can scale down the number of VM's in the cluster.
During the day you would run the number of VM's required. At night you can scale down to the minimum of 5. Check this page on how to scale VM sets: https://azure.microsoft.com/en-us/documentation/articles/service-fabric-cluster-scale-up-down/
For development purposes, you can create a Dev/Test Lab Service Fabric cluster which you can start and stop at will.
I have also been able to start and stop SF clusters on Azure by starting and stopping the VM scale sets associated with these clusters. But upon restart all your applications (and with them their state) are gone and must be redeployed.
I thought one of the advantages of Azure was that I could turn services on and off depending on when I want them to be available.
However I cant see how to pause my App Service Plan.
Is it possible?
I want to use the S1 tier so that I can play with what it offers. However I want to be able to pause the cost accumulation when I am not using it.
I see from the app service pricing help that an app will still be billed for even though it is in the stopped state.
Yet the link also clearly states that I only pay for what I use. So how does that work?
If you put your hosting plan onto the free tier, you will stop being charged for it. However if you have things like deployment slots and certificates these will be deleted.
The ability to turn services on and off, is more to do with being able to scale services, so if you need 50 servers for an hour you can easily do that.
What you can do to make your solution temporary is to create a deployment script, using Powershell or Resource manager Templates then you can deploy your solution for exactly as long as you need it and then delete it again when you don't. In this sense you can turn your services on and off at a whim.
Azure provides building blocks for you to create the solution you need, it is up to you to figure out how to best use those building blocks to create the solution you seek.
Edited to answer extended question.
If you want to use the S1 pricing plan, and not have it charge when you are not using it, the only way of achieving that is by using automation. Fortunately, this is reasonably trivial to achieve.
If you look at this template it is pretty much all configured to deploy a website from Github to Azure on demand. If you edit that to configure it to your needs you can have a new Azure website online with 2 minutes of running the script.
Then you would have another script that deleted it once you had finished.
Doing it this way you would loose no functionality, and probably learn quite a bit about what is possible with Azure along the way.
App Service Plan
An app service plan is the hardware that a web app runs on. In the free and shared tier your web apps share an instance with other web apps. In the other tiers you have a dedicated virtual machine. It is this virtual machine that you pay for. In that case it is irrelevant whether or not you have web apps running on your app service or not, you still have a virtual machine running and you will be charged for that.
To change the App Service Plan via PowerShell, you can run the following command
Set-AzureRmAppServicePlan -ResourceGroupName $rg -Name $AppServicePlan -Tier Free
I was able to accomplish this using the dashboard by selecting the App Service Plan, clicking Scale up (App Service Plan), and then from there if you click Dev/Test you can select the Free tier.
As others have mentioned, you need to script this. Fortunately, I created a repository with one-click deployment to your Azure resources.
https://github.com/jraps20/jrap-AzureVerticalScaling
The steps are intended to be as simple and generic as possible:
Execute the one-click deployment from the repo readme
Select the subscription, resource group etc.
Deploy resource to Azure
Set up your schedule to scale up and scale down as-needed
The scripting relies on runbooks and variables to maintain the previous state of each App Service Plan and App Services within those plans. Some App Services cannot be scaled due to specific settings being used (AlwaysOn, Use32BitWOrkerProcess, ClientCertEnabled, etc.). In those cases, the previous values are stored as variables prior to down scaling and then the original values are reapplied when the services are scaled up.
For more clarity, I have written a blog post that goes into detail. The post is pertaining to Sitecore, but applies to any App Service setup- Drastically Reduce Azure PaaS Hosting Costs in Non-Prod Environments With Scheduled Vertical Scaling. It also includes a brief video tutorial to show its use case.
Myself and others have been using this repository/approach for well over a year and it works great. I mostly use it for POC's to reduce costs when I'm not actively working on something. However, its main intention was for targeting non-prod environments to save costs during non-work hours.
Azure App Service Plan is just an logical concept of a set of features and capacity that you can share across multiple apps. I don`t think you can "pause" a plan, instead you can pause your service. and depends on billing model of each service, you might or might not get charged.
Pausing = Delete or lower tier.
Scripting is the key.
Design Diagram
Use scripts to create (also consider shared resources)
Delete using scripts
Use scripts to recreate.
eg: If we use resource group properly per environment then
Export-AzureRmResourceGroup will create a template for us (everything in the resource group will be pulled out as script). So we can delete it and recreate it anytime.
To pause a VM and stop billing you need to shut is down and deallocate it. Just shutting down still has the capacity reserved as if its running.
Storage can't be shutdown - it can be moved to lower cost tiers.
We are working on architecture of new web application to be hosted on Azure. This application would run only in day time (Say 9AM to 5PM). What I read so far about Azure is we would continue be billed even when we stop the deployment.
However in case of Azure VM (IAAS) billing stops when we stop the VM.
Client is keenly interested in running the IT cost to the minimum. We are planning to use WASABi/Auto-scaling block to auto shutdown & auto-start the app to run only during (9AM-5PM)
Deploying application every morning & deleting every evening even programmatically doesn't sound like a good architecture.
Should we target the app for VM rather than Azure web role?
While hourly billing cost is definitely a consideration and it is true that if you stop a VM in IaaS, billing stops, there are other considerations as well. Some of them are:
With Cloud Services, you have to architect the application in a certain way to take advantage of statelessness there. So there may be a bit of a learning curve there. With Virtual Machines, in theory you can build an application the way you are used to and deploy that in the cloud.
With Cloud Services, the major advantage is that you don't have to maintain the VM. This is something Microsoft does that for you. So there's little or no IT-admin overhead. With VMs, maintaining the VM is your responsibility so that's an additional cost which is recurring as well (assuming you (or your client) have an IT Admin kind of guy on the payroll).
Generally speaking, if the application is a stand-alone application with quite simple deployment topology and is brand new application it is recommended that you write them as Cloud Service but do take the costs (development / IT admin) into consideration as well.
When you stop the virtual machine through a standard shutdown (through the machine itself for example), it does continue to incur charges. The portal will eventually show it as shutdown, but the VM still has resources allocated.
However, if you stop the machine through the portal, API, or PowerShell, it will stop and DEALLOCATE the machine. This means the VM will use storage space, but will not incur compute charges.
Simply schedule the deallocation of the machines during off-hours, and you will only page for the usage during the day.