Simple select count(id) uses 100% of Azure SQL DTUs - azure

This started off as this question but now seems more appropriately asked specifically since I realised it is a DTU related question.
Basically, running:
select count(id) from mytable
EDIT: Adding a where clause does not seem to help.
Is taking between 8 and 30 minutes to run (whereas the same query on a local copy of SQL Server takes about 4 seconds).
Below is a screen shot of the MONITOR tab in the Azure portal when I run this query. Note I did this after not touching the Database for about a week and Azure reporting I had only used 1% of my DTUs.
A couple of extra things:
In this particular test, the query took 08:27s to run.
While it was running, the above chart actually showed the DTU line at 100% for a period.
The database is configured Standard Service Tier with S1 performance level.
The database is about 3.3GB and this is the largest table (the count is returning approx 2,000,000).
I appreciate it might just be my limited understanding but if somebody could clarify if this is really the expected behaviour (i.e. a simple count taking so long to run and maxing out my DTUs) it would be much appreciated.

From the query stats in your previous question we can see:
300ms CPU time
8000 physical reads
8:30 is about 500sec. We certainly are not CPU bound. 300ms CPU over 500sec is almost no utilization. We get 16 physical reads per second. That is far below what any physical disk can deliver. Also, the table is not fully cached as evidenced by the presence of physical IO.
I'd say you are throttled. S1 corresponds to
934 transactions per minute
for some definition of transaction. Thats about 15 trans/sec. Maybe you are hitting a limit of one physical IO per transaction?! 15 and 16 are suspiciously similar numbers.
Test this theory by upgrading the instance to a higher scale factor. You might find that SQL Azure Database cannot deliver the performance you want at an acceptable price.
You also should find that repeatedly scanning half of the table results in a fast query because the allotted buffer pool seems to fit most of the table (just not all of it).

I had the same issue. Updating the statistics with fullscan on the table solved it:
update statistics mytable with fullscan

select count
should perform clustered index scan if one is available and its up to date. Azure SQL should update statistics automatically, but does not rebuild indexes automatically if they are completely out of date.
if there's a lot of INSERT/UPDATE/DELETE traffic on that table I suggest manually rebuilding the indexes every once in a while.
http://blogs.msdn.com/b/dilkushp/archive/2013/07/28/fragmentation-in-sql-azure.aspx
and SO post for more info
SQL Azure and Indexes

Related

AAS: How can I optimize the memory usage of an Azure Analysis Services instance?

Context
Note: To be precise I have multiple data models on the same AAS
instance however from viewing the size of those models along with the usage graphs they don't seem to be impacting the memory usage by any significant amount. Therefore the discussion below focuses on the
"single data model" that to us seems to be most correlated to the observed
spikes.
I have a data model held in an Azure Analysis Services instance (the data model itself is a database inside the azure analysis services instance). The data model itself has been deployed using Visual Studio to the Azure Analysis Services instance. The data model is essentially created using data straight from SQL Server database (queries and stored procedures are being used to create the tables under the hood).
Note: Within this data model there are 16 tables in total. The largest 2 (as defined by % of model occupied & other metrics, which can be viewed via DAX Studio Vertipaq Analyzer) are the ones which have been partitioned day wise with 60 days partitions in total each (2022-04-11, 2022-04-12, ...) and handled via the partitioning automation procedure outlined in the Resources section below. The remaining 14 tables haven't been partitioned in that sense and are "fully processed" each time the refresh function triggers the refresh (effectively each of those 14 tables consist of 1 single large partition, i.e. the whole table).
E.g: Every hour, when our refresh functions triggers, the latest 3 partitions of our 2 large tables are re-processed and each of the remaining 14 tables are re-processed fully (since only 1 big partition which forms each of these tables).
The refreshes of the data model are performed using a function app which has functions which refresh the latest 3 daywise partitions of 2 of the largest tables in the data model while the other tables are processed whole every time during refresh.
Currently the function controlling the refresh execution is triggered to go off every hour, during which it performs a refresh of the data as described above.
The issue that we have been facing is that when we observe our memory usage dashboard (Check screenshot below) we tend to get massive spikes in the memory usage which seem to occur during this refresh phase.
In light of this observation we began trying to test out what seems to be causing these periodic spikes and observed the following interesting points:
Spikes align almost perfectly with the scheduled hourly refreshes of our data model.
Leading us to believe the spikes are related to the refresh process in some way.
In between these refreshes the memory usage drops significantly
Further making us believe the spikes is caused by some part of the refresh activity and not caused by general usage.
Increasing the number of partitions (time window of data for the 2 main tables) from 30 days to 60 days and vice versa causes significantly visible changes in the spikes
If we go up from 30 to 60 days the spikes amplitudes increase and the other way around causes it to decrease.
Performing "Defragmentation process" as outlined in the white paper viewable via the link in the Resources section temporarily reduces the usage by a little.
By nature of this process its something that would be need to be performed on a regular basis to ensure continued benefits.
The tables that are fully processed each time (all tables in main data model barring 2 which only refresh last 3 daily partitions) don't seem to cause a high impact on the memory usage spikes.
We manually processed some of the largest tables one after another in-between the refreshes and didn't notice a huge jump in the graphs.
Reducing the 3 daywise partitions to 3 hourly partitions for the 2 main tables refresh didn't seem to cause a big change either.
Noticed a small drop in the memory usage (about 1-2GB during hourly refresh) but didn't seem to have as large of an impact as we thought (proportional to the data reduction). This makes us think that the actual amount of data might not be the primary issue.
Screenshots
Here are some more details on the metrics used Definitions:
Turqoise Line: Hard memory limit max (same as the max cache size of our AAS tier).
Dark Blue Line: High memory limit max (approx 80% of our Hard Memory limit).
Orange Line: Memory Usage max. More details can be found in the following links: AAS Metrics, Memory Usage Forum Post
Questions
Based on our scenario (described above) what could be that cause of the memory usage spikes during refresh and how can we reduce and or manage them in a nice way (ideally removing entirely or as much as possible)? Basically always much below the turquoise and dark blue lines
We feel that if we can figure this out it may allow us to stay within our current pricing tier and also potentially allow us to bring in more data (90-120 days partitions) without the worry of hitting "Out of Memory" health alerts for our instance (which we have been receiving up until now with 60 days).
Note: Barring the current hourly refreshes we are well within the tier limits in terms of memory usage (orange line much lower than the threshold turquoise & blue). Thus solving this could free us to make better use of our AAS resources
Current Thoughts
We do have calculated columns in our data model. Could this be causing the issue?
What would be the best way to test this?
Resources
Will place any useful links to documentation in this section. Hopefully can aid in understanding the context.
Github Link for the repo containing Tabular model refresh logic we based our process off.
https://github.com/microsoft/Analysis-Services/tree/master/AsPartitionProcessing
In the README.md make sure to click the link to the white paper which provides more detail.
Please try setting MaxParallelism to some low value like 2 or 3 in the ModelConfiguration table. This will reduce the number of parallel tables and partitions it will process at once. This alone probably won’t solve the memory spike issue but it should lower the spike a little at the expense of longer refresh times. If you can deal with this tradeoff and it spikes memory less this may be a workaround.
Please set IsAvailableInMDX to false on any hidden columns or hidden measure columns which are not put on an axis or referenced directly in an MDX query. This should reduce your memory footprint during processing because it will not build attribute hierarchies for those columns. On high cardinality columns the savings could be significant.
The next thing to try would be to split the tables/partitions into separate ModelConfiguration rows in the database. Then configure it to process one ModelConfiguration then the other sequentially. The goal here would be to process some tables in one transaction and other tables in a separate transaction. That should cause the memory usage required for each transaction to be less. Of course this may impact users in that half of the data will be stale after the first transaction so you will have to judge whether this is feasible.
A more complex optimization would be to scale out AAS and have a dedicated processing node. You could then process clear the model before you full process it. That should reduce the memory requirements the most. Once processing is done you run the Synchronize command. You could even scale back in removing the processing node to save cost the rest of the hour.
Another option to consider would be to deploy the models to Power BI Premium Gen2. The very interesting nuance with Gen2 is that a P1 capacity allows each dataset to be up to 25GB unlike Gen1 and unlike AAS S1 where the total of all datasets must be less than 25GB. If your organization already owns Power BI Premium capacities this should be a good option. If not then the cost probably won’t make sense at the moment. Or you could license each user with a Power BI Premium Per User license and deploy the model to that Premium Per User capacity. If you have under 70 users this may be a more cost effective option for you to try.

MariaDB got much faster, and I can't find the cause?

I have a concern about my MariaDB 10.4.12 database query execution time, which is getting much faster without any update to my database schema or data. While a speed-up is always welcome, I am concerned about the root cause of this speed-up, especially since I have not rolled out any changes in the last 24 hours. This specific query has sped up 60x overnight.
I have a NodeJS web application that filters a large dataset into "reporting" pages, which typically take 10-12 seconds to load. My main table has 3.5 million rows and the base query involves many joins, date comparisons, and text comparisons. There is room for fine-tuning the query, but it worked for what it was designed to do and I could live with 10 second load times. I noticed this morning, though, that my queries were executed in less than 1 second, without any recent changes on my part.
The most recent change to the application was pushed out five days ago, which affected the amount of data being pulled into this database. A separate application on the same server reaches out to a data set every 10 minutes and replicates these rows into the same database the "reporting" application communicates with. Up until this update, the query was collecting and inserting ~80,000 rows on average, taking about 8-10 seconds to fully replicate the data into this database. My change five days ago reduced the rows being inserted to ~20,000 on average.
Other clues:
PHPMyAdmin still takes 10-12 seconds to run the query, while the MySQL command-line tool takes in less than 1 second
The MariaDB temp directory was changed to a larger partition 7 days ago
The query was tested to be slow (10-12 seconds) 24 hours ago
The query is still slow on a pre-production server that runs the same application with an identical MySQL instance running (same schema and data)
My current running theory is that the ~80,000 inserts were not being executed in the time range being reported by NodeJS (8-10 seconds for the inserts), and they were instead waiting in the MariaDB temp directory until they could be fully written to the database. That would suggest that the database was constantly bogged down by these writes, and reducing the number to ~20k allowed the database to insert faster, allowing the select queries to run faster this morning.
Should I be concerned about this speed up? Could MariaDB have found a faster way to index my data? Am I going crazy?
Thank you.
Don't worry. This kind of thing can be caused by contention (multiple database clients using the database concurrently) and all sorts of other things.
(Cherish this moment. Performance usually goes the other direction.)
You can test for correctness to increase your confidence level. Check a few older and a few newer records to see if they still contain good data.
Or a full-table-scan query, something like this
SELECT COUNT(*), AVG(some_number_column), MIN(some_text_column) FROM mytable
That will take a while but it will hit every row in the table.
You probably don't need to do this, but it's a way to double check (and tell your boss, "I double checked.)
10 seconds, then 1 second. That is "normal".
The first was run when none of the data was cached in RAM; the second was with all cached.
Run it a third time; it will be 1 second again.
Restart MariaDB and run it again; it will again take 10 seconds.
Walk away from the machine for a long time; don't touch the table. It might be back to 10 seconds. For this, look at size of RAM and innodb_buffer_pool_size. Also look for big table scans that bump everything out of cache.

Azure SQL virtual machine performance - Inserts very slow

I'm trying out different pricing tiers on SQL Server.
Im inserting 4000 rows distributed over 4 tables in 10 seconds
My problem: I don't any performance improvements from a small D2S_V3 to D8S_V3
My application need to insert many rows (bulking is not an option), and this kind of performance is not acceptable
I wonder why I dont see improvements.
So my noob question: Do I need to configure something to see improvements? My naive thinking says I should some difference :-)
What am I doing wrong?
Without knowing much about your schema, it looks like you are storage bound or network bound.
Storage:
Try to mount the database to the local (temporary disk) and see if you notice any difference, if it is faster then your bottleneck is the mounted disk.
Network bound:
Where is the client that's inserting these transaction? On same machine? On Azure?
I suggest you setup a client in the same region and do the tests.
inserting 4000 rows distributed over 4 tables in 10 seconds.I don't any performance improvements from a small D2S_V3 to D8S_V3
I would approach this problem using wait stats approach rather than throwing hardware first with out knowing problem..
For example,running below insert
insert into #t
select orderid from orders o
join
Customers c
on c.custid=o.custid
showed me below wait stats
Wait WaitType="SOS_SCHEDULER_YIELD" WaitTimeMs="1" WaitCount="167"
Wait WaitType="PAGEIOLATCH_SH" WaitTimeMs="12" WaitCount="3" />
Wait WaitType="MEMORY_ALLOCATION_EXT" WaitTimeMs="21" WaitCount="4975" />
most of the time, the query spent time on
PAGEIOLATCH_SH:
getting data from disk into memory
MEMORY_ALLOCATION_EXT :allocating memory for the query to run
based on this i will try to troubleshoot by seeing if i have memory pressure on my system,since this query is trying to get data from disk.
This is just one example,but hopefully this will give you an idea..
Further i will try to see if select is returing data fast
Performance can be directly linked to your hardware or configuration, but it's more likely that it has to do with the structures and the queries. Take a look at the execution plan for the INSERT operation to see how it is being resolved by the optimizer. Also, capture the query metrics using extended events to see how many resources are being used by the operation. These are more likely to lead to a resolution on why the query is performing slowly and enable you to scale the hardware to best serve the query.

Azure SQL Database update performance

We're migrating some databases from an Azure VM running SQL Server to Azure SQL. The current VM is a Standard DS12 v2 with two 1TB SSDs attached.
We are using an elastic pool at the P1 performance level. We're early days in this, so nothing else is really running in the pool.
At any rate, we are doing an ETL process that involves a handful of ~20M row tables. We bulk load these tables and then update some attributes to help with the rest of the process.
For example, I am currently running the following update:
UPDATE A
SET A.CompanyId = B.Id
FROM etl.TRANSACTIONS AS A
LEFT OUTER JOIN dbo.Company AS B
ON A.CO_ID = B.ERPCode
TRANSACTIONS is ~ 20M rows; Company is fewer than 50.
I'm already 30 minutes into running this update which is far beyond what will be acceptable. The usage meter on the Pool is hovering around 40%.
For reference, our Azure VM runs this in about 2 minutes.
I load this table via the bulk copy and this update is already beyond what it took to load the entire table.
Any suggestions on speeding up this (and other) updates?
We are using an elastic pool at the P1 performance level.
Not sure ,how this translates your VM performance levels and what criteria you are using to compare both
I would recommend below steps ,since there is no execution plan provided ..
1.Check if there is any wait type ,while the update is running
select
session_id,
start_time,
command,
db_name(ec.database_id) as dbname,
blocking_session_id,
wait_type,
last_wait_type,
wait_time,
cpu_time,
logical_reads,
reads,
writes,
((database_transaction_log_bytes_used +database_transaction_log_bytes_reserved)/1024)/1024 as logusageMB,
txt.text,
pln.query_plan
from sys.dm_exec_requests ec
cross apply
sys.dm_exec_sql_text(ec.sql_handle) txt
outer apply
sys.dm_exec_query_plan(ec.plan_handle) pln
left join
sys.dm_tran_database_transactions trn
on trn.transaction_id=ec.transaction_id
the wait type,provides you lot of info,which can be used to troubleshoot..
2.You can also use below query to see in parallel ,what is happening with the query
set statistics profile on
your update query
then run below query in a seperate window
select
session_id,physical_operator_name,
row_count,actual_read_row_count,estimate_row_count,estimated_read_row_count,
rebind_count,
rewind_count,
scan_count,
logical_read_count,
physical_read_count,
logical_read_count
from
sys.dm_exec_query_profiles
where session_id=your sessionid;
as per your question,there don't seems to be an issue with DTU.So i dont see much issue on that front..
Slow performance solved in one case:
I have recently had severe problems with slow Azure updates that made it nearly unusable. It was updating only 1000 rows in 1 second. So 1M rows was 1000 seconds. I believe this is due to logging in Azure, but I haven't done enough research to be certain. Opening a MS support incident went nowhere. I finally solved the issue using two techniques:
Copy the data to a temporary table and make updates in the temp table. So in the above case, try copying the 50 rows to a temp table & then back again after updates. No/Minimal logging in this case.
My copying back was still slow (I had a few 100K rows), and I create a clustered index on that table. Update duration dropped by a factor of 4-5.
I am using a S1-20DTU database. It is still about 5 times slower than a dedicated instance, but that is fantastic performance for the price.
The real answer to this issue is that SQL Azure will spill to the tempdb much faster than you would expect if you are used to using a well provisioned VM or physical machine.
You can tell that this is happening by recording the actual execution query plan. Look for the warning icon:
The popup will complain about the spill:
At any rate, if you see this, it is likely that you're trying to do too much in the statement.
The Microsoft support person suggested updating the statistics, but this did not change the situation for us.
What seems to be working is the traditional advice to break the inserts up into smaller batches.

Azure Table Storage transaction limitations

I'm running performance tests against ATS and its behaving a bit weird when using multiple virtual machines against the same table / storage account.
The entire pipeline is non blocking (await/async) and using TPL for concurrent and parallel execution.
First of all its very strange that with this setup i'm only getting about 1200 insertions. This is running on a L VM box, that is 4 cores + 800mbps.
I'm inserting 100.000 rows with unique PK and unique RK, that should leverage the ultimate distribution.
Even more deterministic behavior is the following.
When I run 1 VM i get about 1200 insertions per second.
When I run 3 VM i get about 730 on each insertions per second.
Its quite humors to read the blog post where they are specifying their targets.
https://azure.microsoft.com/en-gb/blog/windows-azures-flat-network-storage-and-2012-scalability-targets/
Single Table Partition– a table partition are all of the entities in a table with the same partition key value, and usually tables have many partitions. The throughput target for a single table partition is:
Up to 2,000 entities per second
Note, this is for a single partition, and not a single table. Therefore, a table with good partitioning, can process up to the 20,000 entities/second, which is the overall account target described above.
What shall I do to be able to utilize the 20k per second, and how would it be possible to execute more than 1,2k per VM?
--
Update:
I've now also tried using 3 storage accounts for each individual node and is still getting the performance / throttling behavior. Which i can't find a logical reason for.
--
Update 2:
I've optimized the code further and now i'm possible to execute about 1550.
--
Update 3:
I've now also tried in US West. The performance is worse there. About 33% lower.
--
Update 4:
I tried executing the code from a XL machine. Which is 8 cores instead of 4 and the double amount of memory and bandwidth and got a 2% increase in performance so clearly this problem is not on my side..
A few comments:
You mention that you are using unique PK/RK to get ultimate
distribution, but you have to keep in mind that the PK balancing is
not immediate. When you first create a table, the entire table will
be served by 1 partition server. So if you are doing inserts across
several different PKs, they will still be going to one partition
server and be bottlenecked by the scalability target for a single
partition. The partition master will only start splitting your
partitions among multiple partition servers after it has identified hot
partition servers. In your <2 minute test you will not see the
benefit of multiple partiton servers or PKs. The throughput in the
article is targeted towards a well distributed PK scheme with
frequently accessed data, causing the data to be divided amongst
multiple partition servers.
The size of your VM is not the issue as
you are not blocked on CPU, Memory, or Bandwidth. You can achieve
full storage performance from a small VM size.
Check out
http://research.microsoft.com/en-us/downloads/5c8189b9-53aa-4d6a-a086-013d927e15a7/default.aspx.
I just now did a quick test using that tool from a WebRole VM in the
same datacenter as my storage account and I acheived, from a single
instance of the tool on a single VM, ~2800 items per second upload
and ~7300 items per second download. This is using 1024 byte
entities, 10 threads, and 100 batch size. I don't know how efficient this tool is or if it disables Nagles Algorithm as I was unable to get great results (I got ~1000/second) using a batch size of 1, but at least with the 100 batch size it shows that you can achieve high items/second. This was done in US West.
Are you using Storage client library 1.7 (Microsoft.Azure.StorageClient.dll) or 2.0 (Microsoft.Azure.Storage.dll)? The 2.0 library has some performance improvements and should yield better results.
I suspect this may have to do with TCP Nagle.
See this MSDN article and this blog post.
In essence, TCP Nagle is a protocol-level optimization that batches up small requests. Since you are sending lots of small requests this is likely to negatively affect your performance.
You can disable TCP Nagle by executing this code when starting your application
ServicePointManager.UseNagleAlgorithm = false;
Are the compute instances and storage account in the same affinity group? Affinity groups ensure that network proximity between the services is optimal and should result in lower latency at the network level.
You can find affinity group configuration under the network tab.
I would tend to believe that the maximum throughput is for an optimized load. For example, I bet you that you can achieve higher performance using Batch requests than individual requests you are doing now. And of course, if you use GUIDs for your PK, you can't Batch in your current test.
So what if you changed your test to batch insert entities in groups of 100 (maximum per batch), still using GUIDs, but for which 100 entities would have the same PK?

Resources