I am analyzing Azure SQL DW and I came across the term DWU (Data warehouse units). The link on Azure site only mentions a crude definition of DWU. I want to understand how DWU is calculated and how should I scale my system accordingly.
I have also referred to the link but it does not cover my question:
In addition to the links you found it is helpful to know that Azure SQL DW stores data in 60 different parts called "distributions". If your DW is DWU100 then all 60 distributions are attached to one compute node. If you scale to DWU200 then 30 distributions are detached and reattached to a second compute node. If you scale all the way to DWU2000 then you have 20 compute nodes each with 3 distributions attached. So you see how DWU is a measure of the compute/query power of your DW. As you scale you have more compute operating on less data per compute node.
Update: For Gen2 there are still 60 distributions but the DWU math is a bit different. DWU500c is one full size node (playing both compute and control node roles) where all 60 distributions are mounted. Scales smaller than DWU500c are single nodes that are not full size (meaning fewer cores and less RAM than full size nodes on larger DWUs). DWU1000c is 2 compute nodes each with 30 distributions mounted and there is a separate control node. DWU1500c is 3 compute nodes and a separate control node. And the largest is DWU30000c which is 60 compute nodes each with one distribution mounted.
I just found this link which shows the throughput to DWU relation
You can also checkout the dwucalculator. This site walks you through the process of taking a capture for your existing workload and makes a recommendation on the number of DWUs necessary to fulfill the workload in Azure SQL DW.
http://dwucalculator.azurewebsites.net/
Depending on the amount of time and the number of tables, you may choose DWU.
For eg: If 100 DWU's are taking 15 mins of time for 3 tables and to implement the same in 3 mins you may choose 500 DWU.
Related
I've red the documentation and searched the internet for a simple explanation on Azure application gateway auto-scaling and the above quoted line but failed.
It would be really helpful if you can explain/provide a explanation link related to the same for better understanding.
Thank you!
When you enable auto scaling you need to set a minimum and maximum instance count. How do you know how many instances you need to handle the minimum amount of traffic you want to be able to handle? That is where Capacity Units play a role:
Capacity Unit is the measure of capacity utilization for an Application Gateway across multiple parameters.
A single Capacity Unit consists of the following parameters:
2500 Persistent connections
2.22-Mbps throughput
1 Compute Unit
If any one of these parameters are exceeded, then another n capacity unit(s) are necessary, even if the other two parameters don’t exceed this single capacity unit’s limits. The parameter with the highest utilization among the three above will be internally used for calculating capacity units, which is in turn billed.
When configuring the minimum and maximum number of instances you can now calculate how many instances you need because a single instance can handle up to 10 Capacity Units, so for example a maximum number of 10 * 2500 = 25.000 persistent connections.
For example: if you expect to have to deal with 6000 persistent connections you will need at least 3 instances (3 * 2500 = up to 7500 persistent connections)
We started using Prometheus and Grafana as the main tools for monitoring our Service Fabric cluster. For targeting Prometheus we use wmi_exporter, with predefined parameters: CPU, system, process, service, memory, etc. Our main goal was to start monitoring our product services on the node group each instance in Azure Service Fabric.
For instance, we are using this PQuery to calculate total CPU usage in %:
100 - (avg by (hostname) (irate(wmi_cpu_time_total{scaleset="name",mode="idle" }[5m])) * 100) and metrics +- looks realistic.
Until we started to write queries for services.
For services, sum by (process,hostname)(irate(wmi_process_cpu_time_total{scaleset="name", process=~"processes"}[5m])) * 100, and metrics seems to be not realistic time to time, especially it is obvious after you compare it with total CPU time %. I found out an article regarding multiplying to 100 for getting % from CPU time, but in this case, I get metrics around 170% or more. Perhaps I need to divide it into the number of CPU cores?
Regarding query, I'm using the sum process because I get two different metrics for one process in two modes, user and privileged.
Can anyone please help me with the correct calculation for CPU process time total metric and transforming them to perc. ?
Thank you, I would be grateful for any help!
I hope this will help!
The result is pretty much the same as the Windows performance manager.
So, for CPU % for running services (tasks, processes):
sum by (process,hostname)(irate(wmi_process_cpu_time_total{scaleset="name", process=~"processes"}[5m])) * 100 / 2 (number of CPU cores)
First, you summarize all metrics for the running process, the exporter provides results for the same process ID: user and kernel mode metrics, so it needs to be summarized. The same must be done for hostname (instance, etc.). In my case, I have Azure scale sets, from 2 to 5 instances. It must be multiplied on 100 to get % and divide on number of CPU cores.
Cheers!
I would like to know if there is a way to check sql_ids that were downgraded to either serial or lesser degree in an Oracle 4-node RAC Data warehouse, version 11.2.0.3. I want to write a script and check the queries that are downgraded.
SELECT NAME, inst_id, VALUE FROM GV$SYSSTAT
WHERE UPPER (NAME) LIKE '%PARALLEL OPERATIONS%'
OR UPPER (NAME) LIKE '%PARALLELIZED%' OR UPPER (NAME) LIKE '%PX%'
NAME VALUE
queries parallelized 56083
DML statements parallelized 6
DDL statements parallelized 160
DFO trees parallelized 56249
Parallel operations not downgraded 56128
Parallel operations downgraded to serial 951
Parallel operations downgraded 75 to 99 pct 0
Parallel operations downgraded 50 to 75 pct 0
Parallel operations downgraded 25 to 50 pct 119
Parallel operations downgraded 1 to 25 pct 2
Does it ever refresh? What conclusion can be drawn from above output? Is it for a day? month? hour? since startup?
This information is stored as part of Real-Time SQL Monitoring. But it requires licensing the Diagnostics and Tuning packs, and it only stores data for a short period of time.
Oracle 12c can supposedly store SQL Monitoring data for longer periods of time. If you don't have Oracle 12c, or if you don't have those options licensed, you'll need to create your own monitoring tool.
Real-Time SQL Monitoring of Parallel Downgrades
select /*+ parallel(1000) */ * from dba_objects;
select sql_id, sql_text, px_servers_requested, px_servers_allocated
from v$sql_monitor
where px_servers_requested <> px_servers_allocated;
SQL_ID SQL_TEXT PX_SERVERS_REQUESTED PX_SERVERS_ALLOCATED
6gtf8np006p9g select /*+ parallel ... 3000 64
Creating a (Simple) Historical Monitoring Tool
Simplicity is the key here. Real-Time SQL Monitoring is deceptively simple and you could easily spend weeks trying to recreate even a tiny portion of it. Keep in mind that you only need to sample a very small amount of all activity to get enough information to troubleshoot. For example, just store the results of GV$SESSION or GV$SQL_MONITOR (if you have the license) every minute. If the query doesn't show up from sampling every minute then it's not a performance issue and can be ignored.
For example: create a table create table downgrade_check(sql_id varchar2(100), total number), and create a job with DBMS_SCHEDULER to run insert into downgrade_check select sql_id, count(*) total from gv$session where sql_id is not null group by sql_id;. Although the count from GV$SESSION will rarely be exactly the same as the DOP.
Other Questions
V$SYSSTAT is updated pretty frequently (every few seconds?), and represents the total number of events since the instance started.
It's difficult to draw many conclusions from those numbers. From my experience, having only 2% of your statements downgraded is a good sign. You likely either have good (usually default) settings and not too many parallel jobs running at once.
However, some parallel queries run for seconds and some run for weeks. If the wrong job is downgraded even a single downgrade can be disastrous. Storing some historical session information (or using DBA_HIST_ACTIVE_SESSION_HISTORY) may help you find out if your critical jobs were affected.
The facts:
1 Azure SQL S0 instance
a few tables one of them containing ~ 8.6 Million Rows and 1 PK
Running a Count-query on this table takes nearly 30 minutes (!) to complete.
Upscaling the instance from S0 to S1 reduces the query time to 13 minutes:
Looking into Azure Portal (new version) the resource-usage-monitor shows the following:
Questions:
Does anyone else consider even 13 minutes as rediculos for a simple COUNT()?
Does the second screenshot meen that during the 100%-period my instance isn't responding to other requests?
Why are my metrics limited to 100% in both S0 and S1? (see look under "Which Service Tier is Right for My Database?" stating " These values can be above 100% (a big improvement over the values in the preview that were limited to a maximum of 100).") I'd expect the S0 to bee like on 150% or so if the quoted statement is true.
I'm interested in experiences regarding usage of databases with more than 1.000 records or so from other people. I don't see how a S*-scaled Azure SQL for 22 - 55 € per month could help me in upscaling-strategies at the moment.
Azure SQL Database editions provide increasing levels of DTUs from Basic -> Standard -> Premium levels (CPU,IO,Memory and other resources - see https://msdn.microsoft.com/en-us/library/azure/dn741336.aspx). Once your query reaches its limits of DTU (100%) in any of these resource dimensions, it will continue to receive these resources at that level (but not more) and that may increase the latency in completing the request. It looks like in your scenario above, the query is hitting its DTU limit (10 DTUs for S0 and 20 for S1). You can see the individual resource usage percentages (CPU, Data IO or Log IO) by adding these metrics to the same graph, or by querying the DMV sys.dm_db_resource_stats.
Here is a blog that provides more information on appropriately sizing your database performance levels. http://azure.microsoft.com/blog/2014/09/11/azure-sql-database-introduces-new-near-real-time-performance-metrics/
To your specific questions
1) As you have 8.6 million rows, database needs to scan the index entries to get the count back. So, it may be hitting the IO limit for the edition here.
2) If you have multiple concurrent queries running against your DB, they will be scheduled appropriately to not starve one request or the other. But latencies may increase further for all queries since you will be hitting the available resource limits.
3) For older Web/Business editions, you may be able to see the metric values going beyond 100% (they are normalized to the limits of an S2 level), as they don't have any specific limits and run in a resource-shared environment with other customer loads. For the new editions, metrics will never exceed 100%, because system guarantees you resources upto 100% of that edition's limits, but no more. This provides predictable, guaranteed amount of resources for your DB unlike Web/Business editions, where you may get very little or lot more resources at different times depending on other competing customer DB workloads running on the same machine.
Hope this helps.
-- Srini
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 8 years ago.
Improve this question
Now that Microsoft made the new SQL Azure service tiers available (Basic, Standard, Premium) we are trying to figure out how they map to the existing ones (Web and Business).
Essentially, there are six performance levels in the new tier breakdown: Basic, S1, S2, P1, P2 and P3 (details here: http://msdn.microsoft.com/library/dn741336.aspx)
Does anyone know how the old database tiers map to those six levels? For instance, is Business equivalent of an S1? S2?
We need to be able to answer this question in order to figure out what service tiers/levels to migrate our existing databases to.
We just finished a performance comparison.
I can't publish our SQL queries, but we used 3 different test cases that match our normal activity. In each test case, we performed several queries with table joins and aggregate calculations (SUM, AVG, etc) for a few thousand rows. Our test database is modest - about 5GB in size with a few million rows.
A few notes: For each, we tested the my local machine which is a 5 year old iMac running Windows/SQL Server in a virtual machine ("Local"), SQL Azure Business ("Business"), SQL Azure Premium P1, SQL Azure Standard S2, and SQL Azure Standard S1. The basic tier seemed so slow that we didn't test it. All of these tests were done with no other activity on the system. The queries did not return data so network performance was hopefully not a factor.
Here were our results:
Test One
Local: 1 second
Business: 2 seconds
P1: 2 seconds
S2: 4 seconds
S1: 14 seconds
Test Two
Local: 2 seconds
Business: 5 seconds
P1: 5 seconds
S2: 10 seconds
S1: 30 seconds
Test Three
Local: 5 seconds
Business: 12 seconds
P1: 13 seconds
S2: 25 seconds
S1: 77 seconds
Conclusions:
After working with the different tiers for a few days, our team concluded a few things:
P1 appears to perform at the same level as SQL Azure Business. (P1 is 10x the price)
Basic and S1 are way too slow for anything but a starter database.
The Business tier is a shared service so performance depends on what other users are on your server. Our database shows a max of 4.01% CPU, 0.77% Data IO ,0.14% Log IO and we're experiencing major performance problems and timeouts. Microsoft Support confirmed that we are "just on a really busy server."
The Business tier delivers inconsistent service across servers and regions. In our case, we moved to a different server in a different region and our service is back to normal. (we view that as a temporary solution)
S1, S2, P1 tiers seem to provide the same performance across regions. We tested West and North Central.
Considering the results above, we're generally worried about the future of SQL Azure. The business tier has been great for us for a few years, but it's scheduled to go out of service in 12 months. The new tiers seem over priced compared to the Business tier.
I'm sure there are 100 ways this could be more scientific, but I'm hoping those stats help others getting ready to evaluate.
UPDATE:
Microsoft Support sent us a very helpful query to assess your database usage.
SELECT
avg(avg_cpu_percent) AS 'Average CPU Percentage Used',
max(avg_cpu_percent) AS 'Maximum CPU Percentage Used',
avg(avg_physical_data_read_percent) AS 'Average Physical IOPS Percentage',
max(avg_physical_data_read_percent) AS 'Maximum Physical IOPS Percentage',
avg(avg_log_write_percent) AS 'Average Log Write Percentage',
max(avg_log_write_percent) AS 'Maximum Log Write Percentage',
--avg(avg_memory_percent) AS 'Average Memory Used Percentage',
--max(avg_memory_percent) AS 'Maximum Memory Used Percentage',
avg(active_worker_count) AS 'Average # of Workers',
max(active_worker_count) AS 'Maximum # of Workers'
FROM sys.resource_stats
WHERE database_name = 'YOUR_DATABASE_NAME' AND
start_time > DATEADD(day, -7, GETDATE())
The most useful part is that the percentages represent % of an S2 instance. According to Microsoft Support, if you're at 100%, you're using 100% of an S2, 200% would be equivalent to a P1 instance.
We're having very good luck with P1 instances now, although the price difference has been a shocker.
I am the author of the Azure SQL Database Performance Testing blog posts mentioned above.
Making IOPS to DTU comparisons is quite difficult for Azure SQL Database, which is why I focussed on row counts and throughput rates (in MB per second) in my tests.
I would be cautious about using the Transaction Rates quoted by Microsoft - their benchmark databases are rather small e.g. for Standard tier, which has a capacity of 250 GB, their benchmark databases for S1 and S2 are only 2 GB and 7 GB respectively. At these sizes I suggest SQL Server is caching much/most of the database and as such their benchmark is avoid the worst of the read throttling that is likely to impact real world databases.
I have added a new post regarding the new Service Tiers hitting General Availability and making some estimates of the changes in performance around S0 and S1 at GA.
http://cbailiss.wordpress.com/2014/09/16/performance-in-new-azure-sql-database-performance-tiers/
There is not really any kind of mapping between the old and new offerings near as I can tell. The old offerings the only thing that was really different between the "web" and "business" offering was the size the database was limited to.
However, on the new offerings each tier has performance metrics associated with them. So in order to decide what offering you need to move your existing databases to you need to figure out what type of performance needs your application has.
It appears in terms of size Web and Business fall between Basic and S1. Here's a link that has a chart with the new and old tiers compared. It seems a little apple to oranges honestly so there isn't a direct mapping. Here's also a link specifically addressed to people currently on the Web and Business Tiers.
Comparison of Tiers
Web and Business Edition Sunset FAQ