I'm trying to figure out the pricing model for Azure SQL databases. Comparing vCore to DTU with the default settings, 20 DTUs' worth of server would cost an estimated £27.96 a month, while vCore would cost £326.34 a month.
Why the disparity? I looked up what DTUs are and I'm happy with the overall concept of a blended measure calculated from CPU and so on, but I can't figure out whether each database transaction adds up so that I would eventually "use up" the 20 DTUs and get charged for another set of 20, or whether the database will simply only ever run as fast as "20" allows.
This isn't a question about the DTU calculator (I'm happy with all that); it's a question about why there is such a significant difference between the two prices.
The reason for the difference is that the vCore model provides dedicated cores for your database and can be scaled independently of storage. In the DTU model you are essentially paying for a share of CPU, memory and storage (including IO); when you choose a larger DTU size, all of those specs move up together.
This article provides some detail: https://learn.microsoft.com/en-us/azure/sql-database/sql-database-service-tiers-vcore
https://learn.microsoft.com/en-us/azure/sql-database/sql-database-service-tiers-vcore#choosing-service-tier-compute-memory-storage-and-io-resources
If you decide to convert from the DTU-model to vCore-model, you should select the performance level using the following rule of thumb: each 100 DTU in Standard tier requires at least 1 vCore in General Purpose tier; each 125 DTU in Premium tier requires at least 1 vCore in Business Critical tier.
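As a quick illustration of that rule of thumb, here is a minimal sketch (the helper name and the round-up-to-at-least-one-core behaviour are my own assumptions; the ratios are the ones quoted above):

```python
import math

def estimate_vcores(dtus: int, tier: str) -> int:
    """Rough vCore estimate from a DTU figure, using the documented rule of thumb:
    100 Standard DTUs ~ 1 General Purpose vCore, 125 Premium DTUs ~ 1 Business Critical vCore."""
    dtus_per_vcore = {"standard": 100, "premium": 125}
    return max(1, math.ceil(dtus / dtus_per_vcore[tier.lower()]))

# e.g. an S3 (100 DTU) database maps to at least 1 General Purpose vCore,
# and a P6 (1000 DTU) database maps to at least 8 Business Critical vCores.
print(estimate_vcores(100, "standard"))   # 1
print(estimate_vcores(1000, "premium"))   # 8
```

This also hints at why the default vCore quote looks so much more expensive: the smallest vCore configuration is a whole dedicated core, which by this rule is roughly 100 Standard DTUs of capacity, far more than a 20 DTU database needs.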
https://learn.microsoft.com/en-us/azure/sql-database/sql-database-vcore-resource-limits#what-happens-when-database-and-elastic-pool-resource-limits-are-reached
The maximum number of sessions and workers are determined by the service tier and performance level. New requests are rejected when session or worker limits are reached, and clients receive an error message.
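In practice that means clients should treat these rejections as transient and retry with backoff. A minimal sketch, assuming pyodbc, a placeholder connection string, and the error numbers Azure SQL DB commonly uses for these limits (10928/10929) plus a couple of other transient codes:

```python
import time
import pyodbc

# Placeholder connection string - replace with your own server/database/credentials.
CONN_STR = (
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=tcp:myserver.database.windows.net;Database=mydb;UID=app_user;PWD=...;"
)

# Error numbers commonly associated with worker/session limits and other transient conditions.
RETRYABLE = {10928, 10929, 40501, 40613}

def run_with_retry(sql: str, attempts: int = 5):
    for attempt in range(attempts):
        conn = None
        try:
            conn = pyodbc.connect(CONN_STR, timeout=30)
            return conn.execute(sql).fetchall()
        except pyodbc.Error as exc:
            # Crude transient-error check: look for a retryable error number in the message text.
            if attempt == attempts - 1 or not any(str(code) in str(exc) for code in RETRYABLE):
                raise
            time.sleep(2 ** attempt)  # exponential backoff before retrying
        finally:
            if conn is not None:
                conn.close()

rows = run_with_retry("SELECT TOP 10 * FROM dbo.Orders")  # hypothetical table
```

The error-number check here is deliberately crude; a production app would use a proper retry policy from its data-access library.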
Below is a screenshot of various Azure SQL DW performance tiers, showing concurrent queries and concurrency slots.
Let's look at the DW500 tier. It offers 32 concurrent queries but only 20 concurrency slots. Since each query consumes at least one concurrency slot (I think), how could this tier actually achieve 32 concurrent queries? It seems to me it would max out at 20. Am I missing something?
Some queries in Azure SQL Data Warehouse do not use the concurrency slots but will contribute to the concurrent query count, as listed here.
A note on concurrency slots from the official documentation:
Only resource governed queries consume concurrency slots. System queries and some trivial queries don't consume any slots. The exact number of concurrency slots consumed is determined by the query's resource class.
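A quick back-of-the-envelope check of the DW500 row, assuming (as the question does) that resource-governed queries in the default smallrc class take one slot each:

```python
# DW500 from the table in the question: 32 max concurrent queries, 20 concurrency slots.
max_concurrent_queries = 32
concurrency_slots = 20

# Queries in the smallest resource class take 1 slot each, so at most 20 of those run at once...
governed_queries = min(concurrency_slots, max_concurrent_queries)

# ...but system queries and some trivial queries take 0 slots, so they can fill the
# remaining headroom up to the 32-query cap.
non_governed_headroom = max_concurrent_queries - governed_queries

print(governed_queries, non_governed_headroom)  # 20 resource-governed + up to 12 slot-free queries
```

So the 32-query figure is a cap that only slot-free queries can help you reach; resource-governed work still tops out at 20 concurrent queries on DW500.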
I will be using Azure Cosmos DB with Azure Functions deployed in the same regions, with a gateway (Cloudflare or an Azure option) that routes to the Azure Function in the closest region, each deployed alongside a Cosmos DB replica.
The benefits in perceived latency should be logarithmic, right? For example, having 2 regions is roughly 3x better, 3 regions roughly 5x better perceived latency, and so on.
According to Microsoft, Cosmos DB is available in all regions.
Given that our customers aren't clustered around a specific region and are spread all over the world, which are the optimal regions to deploy to for replication in:
1 region
2 regions
3 regions
4 regions
You can use http://www.azurespeed.com/ to see the closest data centre from the client and pick the optimal location.
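If you'd rather measure from your actual clients than rely on a third-party page, a rough probe is easy to script. A minimal sketch, assuming you host a small HTTPS endpoint (a Function, static blob, etc.) in each candidate region; the URLs are placeholders:

```python
import time
import urllib.request

# Placeholder probe endpoints - replace with something you actually host per region.
ENDPOINTS = {
    "westeurope": "https://myprobe-westeurope.example.net/ping",
    "eastus": "https://myprobe-eastus.example.net/ping",
    "southeastasia": "https://myprobe-southeastasia.example.net/ping",
}

def probe(url: str, samples: int = 5) -> float:
    """Median round-trip time in milliseconds for a simple GET."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        urllib.request.urlopen(url, timeout=5).read()
        timings.append((time.perf_counter() - start) * 1000)
    return sorted(timings)[len(timings) // 2]

for region, url in ENDPOINTS.items():
    print(region, round(probe(url)), "ms")
```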
As an extreme/unrealistic case you can imagine each customer/client having a copy of the DB running next to them. This should give the customer the lowest latency, right?
The answer is that it depends. If you talk about local read/write latency then that would be true. However, the more you replicate your database the more time write operations will take to synchronise across all nodes (and in turn affect what is available when you read). See consistency models here. Although you have customers spread across the globe, it would be better if you start from regions with the most load/requests and then spread out from there.
Deciding this is also where the proverbial rubber meets the road, as you would soon realise that the business might be willing to relax some latency requirements at the edges, given the cost increase needed to achieve 100% coverage.
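On the original "is it logarithmic?" question, a toy model makes the diminishing returns visible. This is purely illustrative: it assumes clients are spread uniformly around a circle and always read from the nearest region, and it ignores real geography, routing, and the consistency/write costs described above:

```python
import random

def avg_distance_to_nearest_region(num_regions: int, clients: int = 50_000) -> float:
    """Average normalised distance from a random client to the nearest of
    `num_regions` regions spread evenly around a unit-circumference circle."""
    regions = [i / num_regions for i in range(num_regions)]
    total = 0.0
    for _ in range(clients):
        c = random.random()
        total += min(min(abs(c - r), 1 - abs(c - r)) for r in regions)
    return total / clients

for n in (1, 2, 3, 4):
    print(n, "regions -> average distance to nearest region:",
          round(avg_distance_to_nearest_region(n), 3))
# Roughly 0.25, 0.125, 0.083, 0.062: each extra region shaves off less than the one before.
```

In this toy model each extra region removes a smaller absolute chunk of read distance than the previous one did, while (as noted above) every extra replica adds write-propagation cost, which is why starting from the regions carrying the most load tends to be the pragmatic answer.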
If I have an S2 SQL Database and I create a secondary geo-replicated database, should it be the same size (S2)? I see that you get charged for the secondary DB, but the DTUs reported against that secondary are at 0%, which seems to indicate that S2 is too large.
Obviously, we'd like to save the cost and move the secondary to a smaller size if at all possible.
Considerations
I understand that if we need to fail over to the secondary, it would at that point need to be bumped up to S2 to meet the production workload, but am I right in assuming we could do this at the time of failover?
I also get that if we were actively using the replicated DB for reporting, etc., we'd have to size it accordingly to meet that demand. But currently we are not actively using the secondary for anything other than as a failover point if it is ever needed.
At this point both primary and secondary must be in the same edition but can have different performance objectives (DTU sizes). We are working on lifting that limitation so that geo-replicated databases could scale to a different edition when needed without breaking the replication links (e.g. Standard to Premium).
Regarding sizing the secondary, you *can* make it smaller in DTUs than the primary if you believe that the updates take less capacity than reads (a high read/write ratio). But as noted earlier, you will have to upsize it right after the failover, and that may take time during which your app's performance will be impacted. In general, we do not recommend having the secondary more than one level smaller; e.g. S3 -> S1 is not a good idea, as it will likely cause replication lag and may result in excessive data loss after failover.
You can safely change the tier of the secondary database, but bear in mind that in the case of failover you will face performance issues. Also, you can't scale past your current performance tier (so both databases ought to be in the same tier).
And yes, you can change the size after failover, but the process is manual.
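For what it's worth, the resize itself is a single T-SQL statement against the logical server, so it is easy to script as part of a failover runbook. A minimal sketch with pyodbc (server, credentials and database name are placeholders, and the S2 target just mirrors the example above):

```python
import pyodbc

# Placeholder connection to the logical server's master database.
MASTER_CONN = (
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=tcp:myserver.database.windows.net;Database=master;"
    "UID=admin_user;PWD=...;"
)

def scale_database(db_name: str, service_objective: str) -> None:
    """Kick off an online scale operation, e.g. bump a failed-over secondary up to S2."""
    # ALTER DATABASE cannot run inside a transaction, hence autocommit.
    conn = pyodbc.connect(MASTER_CONN, autocommit=True)
    try:
        conn.execute(
            f"ALTER DATABASE [{db_name}] MODIFY (SERVICE_OBJECTIVE = '{service_objective}')"
        )
    finally:
        conn.close()

scale_database("mydb", "S2")  # hypothetical database name
```

The statement returns quickly and the scale operation completes in the background, so a runbook should check the database's current service objective (e.g. via sys.database_service_objectives) before declaring the upsize done.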
I am considering migration from Azure SQL to Azure SQL Data Warehouse. It seems to offer some of the features that we need, however price is a concern for starting small. 100 DWU Data Warehouse is priced considerably higher ($521/month) than a seemingly comparable 100 DTU Azure SQL S2 tier ($150/month).
To make sure I am comparing apples to apples, can someone shed some light on how DWU compare to DTU (assuming basic configuration with a single database)?
Edit: to everyone inclined to answer that Azure SQL DW and Azure SQL are not comparable and that it therefore makes no sense to compare DTUs to DWUs: then how does it make sense to talk about migration to DW?
For what it's worth, 1 DWU = 7.5 DTU with respect to server capacity
When you look at the server instance that you provision a DW instance on:
100 DWU instance consumes 750 DTUs of server capacity
400 DWU instance consumes 3,000 DTUs of server capacity
While this information is interesting, it may not be very useful in terms of comparing pricing because DW pricing is exclusively based on DWU, while Azure SQL pricing is the combination of DTU and database size.
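Putting that together with the prices quoted in the question gives a crude cost-per-unit-of-capacity comparison. Purely illustrative: it uses the question's own figures, ignores storage billing on the SQL DB side, and prices will have changed since:

```python
DTU_PER_DWU = 7.5

# Figures quoted in the question (USD/month).
dw100_price, dw100_dwu = 521, 100
s2_price, s2_dtu = 150, 100   # the question quotes S2 at $150/month for ~100 DTUs

dw_equivalent_dtu = dw100_dwu * DTU_PER_DWU   # 750 "DTUs worth" of server capacity
print(dw100_price / dw_equivalent_dtu)        # ~0.69 $ per DTU-equivalent per month
print(s2_price / s2_dtu)                      # 1.50 $ per DTU per month
```

By that crude measure DW capacity is actually cheaper per DTU-equivalent; the catch, as the other answers point out, is whether your workload can make use of capacity of that shape at all.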
You can't and really shouldn't compare the two for the same workload; they're designed for different things based on completely different architectures. As such, DTU and DWU are not comparable measures. Also, how deeply have you looked into the technical differences? The high level features are not the major issue, details are what might wreck your app (e.g. can you live with a limited TSQL surface area or transaction isolation level?)
Azure SQL DB is intended to be a general-purpose DB as a service. The few feature gaps aside, you should think about Azure SQL DB functionally the same way you do SQL Server, minus a lot of the administrative tasks and with a different programming model. It works great for OLTP apps and most reporting apps (or a mix), but not so great for complex analytical apps against very large datasets (you can't really store that much in SQL DB anyway).
SQL DW is intended for data warehousing, analytical type workloads. Its MPP architecture is particularly well suited for complex queries against very large data sets. It will not perform well for typical OLTP applications that have lots of small or singleton queries especially when it's a mix of insert, update and delete operations. If you get a trial instance of SQL DW, you can easily test and verify the behavior for your workload compared to what it currently looks like on SQL DB.
SQL DW also has some limitations on its TSQL surface area, types, concurrency, isolation levels (deal breaker for almost all OLTP apps), etc... so be sure to look into the documentation to get the whole picture as you evaluate feasibility. It might work great but I suspect it's not the best solution if you're running an OLTP workload. Reporting/analytical type workloads however might find a happy home in SQL DW.
The best way to figure out what you need is to look at your current IO requirements. Data warehouses tend to be IO hogs and consequently are optimized by maximizing IO throughput. The DWU Calculator site walks you through the process of capturing your disk metrics and estimates how many DWUs you need to fulfill your workload.
http://dwucalculator.azurewebsites.net/
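The calculator is essentially asking for the disk throughput your current workload drives. The official capture uses perfmon counters on the SQL Server host, but if you just want a rough first look, here is a sketch using psutil (OS-wide counters, so isolate the box or treat the numbers as an upper bound):

```python
import time
import psutil

def sample_disk_throughput(seconds: int = 60, interval: float = 1.0) -> None:
    """Print rough read/write MB/s, sampled from OS-wide disk counters."""
    prev = psutil.disk_io_counters()
    for _ in range(int(seconds / interval)):
        time.sleep(interval)
        cur = psutil.disk_io_counters()
        read_mb = (cur.read_bytes - prev.read_bytes) / interval / 1_000_000
        write_mb = (cur.write_bytes - prev.write_bytes) / interval / 1_000_000
        print(f"read {read_mb:6.1f} MB/s  write {write_mb:6.1f} MB/s")
        prev = cur

sample_disk_throughput(10)
```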
The new Azure SQL Database service tiers look good. However, I am trying to work out how scalable they really are.
So, for example, assume a 200 concurrent user system.
For Standard
Workgroup and cloud applications with "multiple" concurrent transactions
For Premium
Mission-critical, high transactional volume with "many" concurrent users
What does "Multiple" and "Many" mean?
Also Standard/S1 offers 15 DTUs while Standard/S2 offers 50 DTUs. What does this mean?
Going back to my 200 user example, what option should I be going for?
Azure SQL Database Link
Thanks
EDIT
Useful page on definitions
However what is "max sessions"? Is this the number of concurrent connections?
There are some great MSDN articles on Azure SQL Database, this one in particular has a great starting point for DTUs. http://msdn.microsoft.com/en-us/library/azure/dn741336.aspx and http://channel9.msdn.com/Series/Windows-Azure-Storage-SQL-Database-Tutorials/Scott-Klein-Video-02
In short, it's a way to understand the resources powering each performance level. One of the things we know from talking with Azure SQL Database customers is that they are a varied group. Some are most comfortable with the absolute details (cores, memory, IOPS), and others are after a much more summarized level of information. There is no one-size-fits-all. DTUs are meant for this latter group.
Regardless, one of the benefits of the cloud is that it's easy to start with one service tier and performance level and iterate. In Azure SQL Database specifically, you can change the performance level while your application is up. During the change there is typically less than a second of elapsed time during which DB connections are dropped. The internal workflow in our service for moving a DB from one service tier/performance level to another follows the same pattern as the workflow for failing over nodes in our data centers, and nodes failing over happens all the time, independent of service tier changes. In other words, you shouldn't notice any difference in this regard relative to your past experience.
If DTUs aren't your thing, we also have a more detailed benchmark workload that may appeal. http://msdn.microsoft.com/en-us/library/azure/dn741327.aspx
Thanks Guy
It is really hard to tell without doing a test. By 200 users I assume you mean 200 people sitting at their computers at the same time doing stuff, not 200 users who log on twice a day. S2 allows 49 transactions per second, which sounds about right, but you need to test. Also, doing a lot of caching can't hurt.
Check out the new Elastic DB offering (Preview) announced at Build today. The pricing page has been updated with Elastic DB price information.
DTUs are based on a blended measure of CPU, memory, reads, and writes. As DTUs increase, the power offered by the performance level increases. Azure has different limits on concurrent connections, memory, IO and CPU usage. Which tier to pick really depends upon:
Number of concurrent users
Log rate
IO rate
CPU usage
Database size
For example, if you are designing a system where multiple users are reading and there are only a few writers, and if your application's middle tier can cache the data as much as possible so that only selective queries or application restarts hit the database, then you may not need to worry too much about IO and CPU usage.
If many users are hitting the database at the same time, you may hit the concurrent connection limit and requests will be throttled. If you can control the user requests coming to the database in your application, then this shouldn't be a problem.
Log rate: depends upon the volume of data changes (including additional data being pumped into the system). I have seen applications steadily pumping data versus data being pumped in all at once. Selecting the right DTU size again depends upon how well you can throttle at the application end to get a steady rate.
Database size: Basic, Standard and Premium have different maximum allowed sizes, and this is another deciding factor. Using features such as table compression helps reduce the total size, and hence total IO.
Memory: tuning expensive queries (joins, sorts, etc.) and enabling lock escalation / NOLOCK scans help control memory usage.
A very common mistake people make in database systems is scaling up the database instead of tuning the queries and application logic. So testing and monitoring the resources and queries with different DTU limits is the best way of dealing with this.
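For the monitoring part, Azure SQL DB reports recent resource usage as a percentage of the current performance level's limits in the sys.dm_db_resource_stats DMV. A minimal sketch of pulling it with pyodbc (connection string is a placeholder):

```python
import pyodbc

CONN_STR = (
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=tcp:myserver.database.windows.net;Database=mydb;"
    "UID=app_user;PWD=...;"
)

# sys.dm_db_resource_stats keeps roughly an hour of 15-second samples, each expressed
# as a percentage of the limits of the current service objective.
QUERY = """
SELECT TOP 20 end_time,
       avg_cpu_percent,
       avg_data_io_percent,
       avg_log_write_percent,
       avg_memory_usage_percent
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC;
"""

with pyodbc.connect(CONN_STR) as conn:
    for row in conn.execute(QUERY):
        print(row.end_time, row.avg_cpu_percent, row.avg_data_io_percent,
              row.avg_log_write_percent, row.avg_memory_usage_percent)
```

If any of those percentages sit near 100 for sustained periods, you are being capped by the current DTU size; if they are all consistently low, scaling down is worth testing.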
If you choose the wrong DTU size, don't worry: you can always scale up or down in SQL DB, and it is a completely online operation.
Also, unless you have a strong reason not to, migrate to V12 to get even better performance and features.