The facts:
1 Azure SQL S0 instance
a few tables, one of them containing ~8.6 million rows and one PK
Running a COUNT query on this table takes nearly 30 minutes (!) to complete.
Upscaling the instance from S0 to S1 reduces the query time to 13 minutes:
Looking at the Azure Portal (new version), the resource usage monitor shows the following:
Questions:
1) Does anyone else consider even 13 minutes ridiculous for a simple COUNT()?
2) Does the second screenshot mean that during the 100% period my instance isn't responding to other requests?
3) Why are my metrics limited to 100% in both S0 and S1? (See under "Which Service Tier is Right for My Database?", which states: "These values can be above 100% (a big improvement over the values in the preview that were limited to a maximum of 100).") I'd expect the S0 to be at 150% or so if the quoted statement is true.
I'm interested in other people's experiences with databases of more than 1,000 records or so. At the moment I don't see how an S*-scaled Azure SQL database for €22 - €55 per month could fit into my scaling strategy.
Azure SQL Database editions provide increasing levels of DTUs from Basic -> Standard -> Premium (CPU, IO, memory, and other resources; see https://msdn.microsoft.com/en-us/library/azure/dn741336.aspx). Once your query reaches the DTU limit (100%) in any of these resource dimensions, it will continue to receive resources at that level (but no more), and that may increase the latency of the request. It looks like in your scenario above the query is hitting its DTU limit (10 DTUs for S0 and 20 for S1). You can see the individual resource usage percentages (CPU, Data IO or Log IO) by adding these metrics to the same graph, or by querying the DMV sys.dm_db_resource_stats.
Here is a blog that provides more information on appropriately sizing your database performance levels. http://azure.microsoft.com/blog/2014/09/11/azure-sql-database-introduces-new-near-real-time-performance-metrics/
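For example, a minimal sketch of the DMV query mentioned above (sys.dm_db_resource_stats keeps roughly the last hour of history in 15-second intervals); it just reads the documented percentage columns so you can see which dimension is pegged:

-- Which resource dimension (CPU, Data IO, Log IO) is hitting the DTU cap?
SELECT end_time,
       avg_cpu_percent,
       avg_data_io_percent,
       avg_log_write_percent
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC;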
To your specific questions:
1) As you have 8.6 million rows, the database needs to scan the index entries to get the count back, so it may be hitting the IO limit of the edition here. (If an exact count isn't required, see the metadata-based sketch after point 3.)
2) If you have multiple concurrent queries running against your DB, they will be scheduled appropriately so that no single request is starved, but latencies may increase further for all queries since you will be hitting the available resource limits.
3) For the older Web/Business editions, you may see the metric values go beyond 100% (they are normalized to the limits of an S2 level), as those editions don't have specific limits and run in a resource-shared environment with other customer loads. For the new editions, metrics will never exceed 100%, because the system guarantees you resources up to 100% of that edition's limits, but no more. This provides a predictable, guaranteed amount of resources for your DB, unlike Web/Business, where you may get very little or a lot more at different times depending on other competing customer DB workloads running on the same machine.
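If an exact, transactionally consistent count isn't required, here is a minimal sketch of the metadata-based alternative mentioned in point 1 (the table name is hypothetical); it avoids scanning 8.6 million index entries:

-- Approximate row count from partition metadata; no full index scan.
SELECT SUM(p.row_count) AS approx_row_count
FROM sys.dm_db_partition_stats AS p
WHERE p.object_id = OBJECT_ID('dbo.MyBigTable')  -- hypothetical table name
  AND p.index_id IN (0, 1);                      -- heap (0) or clustered index (1)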
Hope this helps.
-- Srini
Related
I'm struggling to understand how the pricing mechanism for RU/s works. Specifically, my confusion comes in when the word "first" is used.
I've done my research here: https://devblogs.microsoft.com/cosmosdb/build-apps-for-free-with-azure-cosmos-db-free-tier/?WT.mc_id=aaronpowell-blog-aapowell
In the second paragraph it’s mentioned:
“With Azure Cosmos DB Free Tier enabled, you’ll get the first 400 RU/s throughput and 5 GB storage in your account for free each month, for the lifetime of the account.”
So, hypothetically speaking, if I have an app that does one query, and that query evaluates to 1 RU, can I safely assume that:
400 users can execute the query once per second for free?
500 users can execute the query once per second and I will only be charged for 100 RU?
If the RU consumption is consistently less than 401 per second, there will be no charge?
Please also mention any other costs I should be aware of, i.e. any Cosmos DB dependencies or App Service costs.
You're not really thinking about RU/sec correctly.
If you have, say, 400 RU/sec, then that is your allocated # of RU within a one-second window. It has nothing to do with # of users (as I doubt you're giving end-users direct access to your Cosmos DB instance).
In the case of operations costing only 1 RU, then yes, you should be able to execute 400 operations within a one-second window without issue (although there is only one type of operation costing 1 RU, and that's a point-read).
In the case where you run some type of operation that puts you over the 400 RU quota for that one-second period, that operation completes, but now you're "in debt", so to speak: you will be throttled until the end of the one-second period, and likely a bit of time into the next period (or, depending on how deep in RU-debt you went, maybe a few seconds).
When you exceed your RU/sec allocation, you do not get charged. In your example, you asked what happens if you try to consume 500 RU in a one-second window, and asserted you'd be charged for 100 RU. Nope. You'll just be throttled after exhausting the 400 RU allocation (a rough estimate of the throttle window is sketched below).
The only way you'll end up being charged for more RU/sec, is if you increase your RU/sec allocation.
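As a very rough rule of thumb only (this is not an exact formula; as noted above, how long you're throttled depends on how deep in RU-debt you went), the time until the debt is cleared works out to roughly:

time to clear debt ≈ RU consumed / RU-per-second provisioned

So consuming 500 RU against a 400 RU/s allocation is roughly 500 / 400 = 1.25 seconds before requests flow normally again, and you are never billed for the 100 RU overage.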
There is some more reading out there you can do:
Azure Cosmos DB free tier
Pricing examples
Setup:
3-member embedded cluster deployed as a Spring Boot jar.
Total keys on each member: 900K
Get operations are being attempted via a REST API.
Background:
I am trying to benchmark Hazelcast's replicated map.
Management Center UI shows around 10k requests/s being executed, but the avg get latency is showing as 0 ms.
I believe it is not showing because it might be in microseconds.
Please let me know how to configure the Management Center UI to show latency in micro/nanoseconds.
Management Center UI shows around 10k requests/s being executed, but the avg get latency is showing as 0 ms.
I believe you're talking about the Replicated Map Throughput Statistics in the replicated map details page. The Avg Get Latency column in that table shows, on average, how long it took a cluster member to execute the get operations for the time period selected in the top-right corner of the table. For example, if you select Last Minute there, you only see the average time the get operations took in the last minute.
I believe it is not showing because it might be in microseconds.
The cluster sends it in milliseconds (newer cluster versions calculate it in nanoseconds but still send it in milliseconds). However, since a replicated map replicates all data on all members and every member contains the whole data set, get latency is typically very low, as there's no network trip.
I guess the way we render very small metric values confused you. In the Management Center UI, we only show two fractional digits. You can see it in action in the screenshots below:
As you can see, since the value is very low, it is shown as 0. I believe we can do a better job rendering these values though (using a smaller time unit for example). I will create an issue for this on our private issue tracker.
I have a Cosmos DB Gremlin API account set up with 400 RU/s. If I have to run a query that needs 800 RUs, does it mean that this query takes 2 seconds to execute? If I increase the throughput to 1600 RU/s, does this query execute in half a second? I am not seeing any significant changes in query performance by playing around with the RUs.
As I explained in a different, but somewhat related answer here, Request Units are allocated on a per-second basis. In the event a given query will cost more than the number of Request Units available in that one-second window:
The query will be executed
You will now be in "debt" by the overage in Request Units
You will be throttled until your "debt" is paid off
Let's say you had 400 RU/sec, and you executed a query that cost 800 RU. It would complete, but then you'd be in debt for around 2 seconds (400 RU per second, times two seconds). At this point, you wouldn't be throttled anymore.
The speed in which a query executes does not depend on the number of RU allocated. Whether you had 1,000 RU/second OR 100,000 RU/second, a query would run in the same amount of time (aside from any throttle time preventing the query from running initially). So, aside from throttling, your 800 RU query would run consistently, regardless of RU count.
A single query is charged a given amount of Request Units, so it's not quite accurate to say "the query needs 800 RU/s". A 1 KB document read is 1 RU, and writes are more expensive, starting at around 10 RU each. Generally you should avoid any request that individually costs more than, say, 50 RU, and even that is probably high. In my experience, I try to keep the individual charge for each operation as low as possible, usually under 20-30 RU for large list queries.
The upshot is that 400 RU/s is more than enough to complete at least one such query. It's when you have multiple requests that combine to exceed the allowance within the time window that Cosmos tells you to wait some time before being allowed to succeed again. This is dynamic and based on a more or less black-box formula. It's not necessarily a simple division of allowance by charge, and no individual request is faster or slower based on the limit.
You can see if you're getting throttled by inspecting the response, or monitor by checking the Azure dashboard metrics.
I just launched an Azure SQL Database, and the DTU and CPU usage is behaving strangely. The database is only receiving about 30 requests per minute; the CPU/DTU will be extremely low for hours and then jump up to 100% and stay there, with no increase in the number of requests to trigger it. When I click to view the top queries, none of them are above 1% CPU usage. I started out on a 5 DTU plan, yesterday upgraded to 20 DTUs, and the same behavior is occurring. Any idea what else might cause the DTU/CPU to get maxed out? See images below:
https://i.imgur.com/LdbYTPw.png
https://i.imgur.com/jlus3FM.png
Thanks in advance for any advice!
Joe
EDIT: I'm getting closer. I found these repeated entries in the error log (about 8-10 per second):
"The incoming request has too many parameters. The server supports a maximum of 2100 parameters. Reduce the number of parameters and resend the request."
The thing is, the App Service that queries the database is only doing simple selects, updates, and inserts... none of which uses any complex WHERE IN statement. Furthermore, every query is wrapped in a try/catch block, and I'm never seeing an exception like this.
Where could these large queries be originating from?
You are only seeing the CPU component of the DTU graph; what about the "Data IO" and "Log IO" components? Look at the top 5 queries in all 3 sections, and let me know if you find a query that starts with "SELECT StatMan...". If you see that, then the Auto Update Statistics process is creating those DTU spikes.
I would suggest installing the sp_whoisactive script so that you can see what's going on more easily:
http://whoisactive.com/
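If you can't install sp_whoisactive, here is a minimal sketch using only built-in DMVs to sample what's running while the DTUs are pegged. Run it a few times during a spike; the "SELECT StatMan..." statements, or the statements carrying the huge parameter lists, should show up in the output:

-- Snapshot of currently executing requests, heaviest CPU first.
SELECT r.session_id,
       r.status,
       r.cpu_time,
       r.logical_reads,
       SUBSTRING(t.text, 1, 400) AS running_statement
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id <> @@SPID
ORDER BY r.cpu_time DESC;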
Now that Microsoft made the new SQL Azure service tiers available (Basic, Standard, Premium) we are trying to figure out how they map to the existing ones (Web and Business).
Essentially, there are six performance levels in the new tier breakdown: Basic, S1, S2, P1, P2 and P3 (details here: http://msdn.microsoft.com/library/dn741336.aspx)
Does anyone know how the old database tiers map to those six levels? For instance, is Business the equivalent of an S1? An S2?
We need to be able to answer this question in order to figure out what service tiers/levels to migrate our existing databases to.
We just finished a performance comparison.
I can't publish our SQL queries, but we used 3 different test cases that match our normal activity. In each test case, we performed several queries with table joins and aggregate calculations (SUM, AVG, etc) for a few thousand rows. Our test database is modest - about 5GB in size with a few million rows.
A few notes: for each test case, we tested my local machine, which is a 5-year-old iMac running Windows/SQL Server in a virtual machine ("Local"), SQL Azure Business ("Business"), SQL Azure Premium P1, SQL Azure Standard S2, and SQL Azure Standard S1. The Basic tier seemed so slow that we didn't test it. All of these tests were done with no other activity on the system. The queries did not return data, so network performance was hopefully not a factor.
Here were our results:
Test One
Local: 1 second
Business: 2 seconds
P1: 2 seconds
S2: 4 seconds
S1: 14 seconds
Test Two
Local: 2 seconds
Business: 5 seconds
P1: 5 seconds
S2: 10 seconds
S1: 30 seconds
Test Three
Local: 5 seconds
Business: 12 seconds
P1: 13 seconds
S2: 25 seconds
S1: 77 seconds
Conclusions:
After working with the different tiers for a few days, our team concluded a few things:
P1 appears to perform at the same level as SQL Azure Business. (P1 is 10x the price)
Basic and S1 are way too slow for anything but a starter database.
The Business tier is a shared service, so performance depends on what other users are on your server. Our database shows a max of 4.01% CPU, 0.77% Data IO, 0.14% Log IO, and yet we're experiencing major performance problems and timeouts. Microsoft Support confirmed that we are "just on a really busy server."
The Business tier delivers inconsistent service across servers and regions. In our case, we moved to a different server in a different region and our service is back to normal. (we view that as a temporary solution)
S1, S2, P1 tiers seem to provide the same performance across regions. We tested West and North Central.
Considering the results above, we're generally worried about the future of SQL Azure. The Business tier has been great for us for a few years, but it's scheduled to go out of service in 12 months. The new tiers seem overpriced compared to the Business tier.
I'm sure there are 100 ways this could be more scientific, but I'm hoping those stats help others getting ready to evaluate.
UPDATE:
Microsoft Support sent us a very helpful query to assess your database usage:
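-- Run against the master database of the logical server; sys.resource_stats is aggregated
-- in 5-minute intervals and retained for roughly 14 days.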
SELECT
avg(avg_cpu_percent) AS 'Average CPU Percentage Used',
max(avg_cpu_percent) AS 'Maximum CPU Percentage Used',
avg(avg_physical_data_read_percent) AS 'Average Physical IOPS Percentage',
max(avg_physical_data_read_percent) AS 'Maximum Physical IOPS Percentage',
avg(avg_log_write_percent) AS 'Average Log Write Percentage',
max(avg_log_write_percent) AS 'Maximum Log Write Percentage',
--avg(avg_memory_percent) AS 'Average Memory Used Percentage',
--max(avg_memory_percent) AS 'Maximum Memory Used Percentage',
avg(active_worker_count) AS 'Average # of Workers',
max(active_worker_count) AS 'Maximum # of Workers'
FROM sys.resource_stats
WHERE database_name = 'YOUR_DATABASE_NAME' AND
start_time > DATEADD(day, -7, GETDATE())
The most useful part is that the percentages represent % of an S2 instance. According to Microsoft Support, if you're at 100%, you're using 100% of an S2, 200% would be equivalent to a P1 instance.
We're having very good luck with P1 instances now, although the price difference has been a shocker.
I am the author of the Azure SQL Database Performance Testing blog posts mentioned above.
Making IOPS to DTU comparisons is quite difficult for Azure SQL Database, which is why I focussed on row counts and throughput rates (in MB per second) in my tests.
I would be cautious about using the transaction rates quoted by Microsoft; their benchmark databases are rather small, e.g. for the Standard tier, which has a capacity of 250 GB, their benchmark databases for S1 and S2 are only 2 GB and 7 GB respectively. At these sizes, I suggest SQL Server is caching much or most of the database, and as such their benchmark avoids the worst of the read throttling that is likely to impact real-world databases.
I have added a new post regarding the new Service Tiers hitting General Availability and making some estimates of the changes in performance around S0 and S1 at GA.
http://cbailiss.wordpress.com/2014/09/16/performance-in-new-azure-sql-database-performance-tiers/
There is not really any kind of mapping between the old and new offerings, as near as I can tell. With the old offerings, the only thing that was really different between the "Web" and "Business" editions was the maximum database size.
However, the new offerings each have performance metrics associated with them. So in order to decide which offering to move your existing databases to, you need to figure out what kind of performance your application needs.
In terms of size, Web and Business appear to fall between Basic and S1. Here's a link to a chart comparing the new and old tiers. Honestly it seems a little apples-to-oranges, so there isn't a direct mapping. Here's also a link specifically addressed to people currently on the Web and Business tiers.
Comparison of Tiers
Web and Business Edition Sunset FAQ