RDS Performance Insights average latency metric not showing - amazon-rds

Does anyone know how to visualize "avg latency (ms)/call" and "rows examined/call" on the Performance Insights dashboard?
I have a MySQL RDS instance with Performance Insights enabled, but metrics like "avg latency (ms)/call" and "rows examined/call" are not being displayed.
I have already changed the performance_schema parameter in the DB parameter group from 0 to 1 and rebooted the instance, with no change.
I can see the "Load by waits" chart:
But I don't see "avg latency (ms)/call" or "rows examined/call":
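For reference, Performance Insights can only show per-statement metrics such as "avg latency (ms)/call" and "rows examined/call" for MySQL when the Performance Schema and its statement-digest instrumentation are active. A minimal check from any SQL client, assuming the standard MySQL performance_schema tables (a general sketch, not something confirmed for this particular instance):

-- Should return ON if the Performance Schema is enabled on the instance.
SHOW GLOBAL VARIABLES LIKE 'performance_schema';

-- The statement and digest consumers must be enabled (YES) for per-SQL statistics.
SELECT name, enabled
FROM performance_schema.setup_consumers
WHERE name LIKE '%statements%';

If performance_schema still reports OFF after the parameter group change, check that the modified parameter group is actually attached to the instance and that its status is no longer pending-reboot.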

Related

How to get the aggregated CPU usage in Grafana when hyperthreading is enabled

We are running Grafana/Prometheus to monitor our CPU metrics and want the aggregated CPU usage across all CPUs. The problem is that hyperthreading is enabled, and when we stress the CPU the percentage exceeds 100%. My question is how to cap the displayed CPU usage at 100%, even when the CPU is highly utilized.
P.S. I have tried setting the max and min limits in Grafana, but the graph spikes still go above that limit.
Kindly give me the right query for this problem.
The queries I have tried are given below:
sum(irate(node_cpu_seconds_total{instance="localhost",job="node", mode!="idle"}[5m]))*100
100 - avg(irate(node_cpu_seconds_total{instance="localhost",job="node", mode!="idle"}[5m]))*100
and other similar queries we have tried.
If all you want is to "cap" a variable or expression result at a maximum value (that is, 100), you can simply use the Prometheus function clamp_max.
Thus, you could do:
clamp_max(<expr>, 100)
This is probably the most helpful query:
(1 - avg(irate(node_cpu_seconds_total{instance="$instance",job="$job",mode="idle"}[5m])))*100
Replace $instance with your instance IP and $job with your node exporter job name; if you still want a hard ceiling, wrap the whole expression in clamp_max as shown above.

CPU utilisation of Azure SQL database is showing zero

I have moved an Azure SQL database from one subscription to another.
After moving the database, the CPU utilisation is showing zero.
The CPU may be dead.
Please help me fix the issue.
Screenshot attached.
Try using the below DMV (in SSMS / the Query Editor):
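-- sys.dm_db_resource_stats keeps roughly the last hour of data, one row per ~15 seconds.
-- Run it in the user database itself; the master database exposes sys.resource_stats instead.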
SELECT
AVG(avg_cpu_percent) AS 'Average CPU Utilization In Percent',
MAX(avg_cpu_percent) AS 'Maximum CPU Utilization In Percent',
AVG(avg_data_io_percent) AS 'Average Data IO In Percent',
MAX(avg_data_io_percent) AS 'Maximum Data IO In Percent',
AVG(avg_log_write_percent) AS 'Average Log Write I/O Throughput Utilization In Percent',
MAX(avg_log_write_percent) AS 'Maximum Log Write I/O Throughput Utilization In Percent',
AVG(avg_memory_usage_percent) AS 'Average Memory Usage In Percent',
MAX(avg_memory_usage_percent) AS 'Maximum Memory Usage In Percent'
FROM sys.dm_db_resource_stats;
I was able to see 76 percent CPU utilization using this query. It is still not showing on the portal and the Microsoft internal team is looking into it, but at least I could see that the CPU is not dead and is responding.
Screenshot of the result.
How to debug this:
Fire up SSMS and run a few queries; some SELECTs will be alright.
If you can retrieve data, all is good. You probably moved to a higher tier with better performance.
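For a longer history than the portal chart or sys.dm_db_resource_stats, usage can also be pulled from the logical server's master database. A minimal sketch, assuming the standard sys.resource_stats catalog view (about 14 days of 5-minute aggregates) and a placeholder database name:

-- Run while connected to the master database of the logical server.
SELECT start_time,
       end_time,
       avg_cpu_percent,
       avg_data_io_percent,
       avg_log_write_percent
FROM sys.resource_stats
WHERE database_name = 'YourDatabaseName'  -- placeholder: the moved database
ORDER BY start_time DESC;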

CPU / DTUs getting maxed out on Azure SQL Database, but top queries less than 1% and database only a few MB

I just launched an Azure SQL Database, and the DTU and CPU usage is behaving strangely. The database is only receiving about 30 requests per minute, and the CPU/DTU will be extremely low for hours, and then jump up to 100% and stay there (with no increase in the number of requests that triggers this). When I click to view the top queries, none of them are above 1% cpu usage. I started out on a 5 DTU plan, and yesterday upgraded to 20 DTUs and the same behavior is occurring. Any idea what else might cause the DTU/CPU to get maxed out? See images below:
https://i.imgur.com/LdbYTPw.png
https://i.imgur.com/jlus3FM.png
Thanks in advance for any advice!
Joe
EDIT: I'm getting closer; I found these repeated entries in the error log (about 8-10 per SECOND):
"The incoming request has too many parameters. The server supports a maximum of 2100 parameters. Reduce the number of parameters and resend the request."
The thing is, the App Service that queries the database only does simple selects, updates, and inserts, none of which use any complex WHERE IN statement. Furthermore, every query is wrapped in a try/catch block, and I never see an exception like this.
Where could these large queries be originating from?
You are only seeing the CPU component of the DTU graph; what about the "Data IO" and "Log IO" components? Look at the top 5 queries in each of the three sections, and let me know if you find a query that starts with "SELECT StatMan ...". If you see that, then the Auto Update Statistics process is creating those DTU spikes.
I would suggest installing the sp_whoisactive script so that you can see what's going on more easily:
http://whoisactive.com/
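If installing sp_whoisactive is not an option, a rough equivalent can be sketched with the built-in DMVs. This only assumes the standard sys.dm_exec_requests and sys.dm_exec_sql_text views and is a starting point, not a replacement for the script:

-- Lists statements currently executing, so a DTU spike can be matched
-- to its SQL text (for example "SELECT StatMan ..." or an oversized parameter list).
SELECT r.session_id,
       r.status,
       r.cpu_time,
       r.total_elapsed_time,
       t.text AS sql_text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id <> @@SPID
ORDER BY r.cpu_time DESC;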

CosmosDB: Minimum throughput does not make sense

We have a Cosmos DB database containing 24 containers as of now.
The throughput is provisioned at the database level.
I would expect the minimum throughput to be 2400 RU/s, but 4500 RU/s is actually required (shown in the Azure portal as well as in an error message from the .NET SDK).
Expectation:
count of containers * 100 RU/s = min. RU/s,
or 400 RU/s if the container count is less than or equal to four.
I have observed that after I delete the database and recreate it, the minimum throughput works as expected.
This behaviour only occurs after some days of working with the database.
Is there an explanation for why this is expected, or is this a bug in Cosmos DB itself?
Thanks for your help.
You have two regions configured, per your screenshot.
Each region consumes its own copy of roughly 2400 RU/s, and 2 x 2400 = 4800; the observed 4500 suggests slightly less than 2400 RU/s per region.
In any case, the two-region setup is what is doubling the expected minimum RU.

Understanding Azure SQL Performance

The facts:
1 Azure SQL S0 instance
a few tables, one of them containing ~8.6 million rows and 1 PK
Running a COUNT query on this table takes nearly 30 minutes (!) to complete.
Upscaling the instance from S0 to S1 reduces the query time to 13 minutes.
Looking at the Azure portal (new version), the resource usage monitor shows the following:
Questions:
Does anyone else consider even 13 minutes ridiculous for a simple COUNT()?
Does the second screenshot mean that during the 100% period my instance isn't responding to other requests?
Why are my metrics limited to 100% in both S0 and S1? (See the section "Which Service Tier is Right for My Database?", which states: "These values can be above 100% (a big improvement over the values in the preview that were limited to a maximum of 100).") I'd expect the S0 to be at 150% or so if the quoted statement is true.
I'm interested in other people's experiences with databases of more than 1,000 records or so. I don't see how an S*-scaled Azure SQL instance at 22-55 € per month could help me with upscaling strategies at the moment.
Azure SQL Database editions provide increasing levels of DTUs from Basic -> Standard -> Premium (CPU, IO, memory and other resources; see https://msdn.microsoft.com/en-us/library/azure/dn741336.aspx). Once your query reaches its DTU limit (100%) in any of these resource dimensions, it will continue to receive resources at that level (but not more), which may increase the latency of the request. It looks like in your scenario above, the query is hitting its DTU limit (10 DTUs for S0 and 20 for S1). You can see the individual resource usage percentages (CPU, Data IO or Log IO) by adding these metrics to the same graph, or by querying the DMV sys.dm_db_resource_stats.
Here is a blog post that provides more information on appropriately sizing your database performance level: http://azure.microsoft.com/blog/2014/09/11/azure-sql-database-introduces-new-near-real-time-performance-metrics/
To your specific questions:
1) As you have 8.6 million rows, the database needs to scan the index entries to get the count back, so it may be hitting the IO limit for the edition here.
2) If you have multiple concurrent queries running against your DB, they will be scheduled appropriately so that no single request is starved, but latencies may increase further for all queries since you will be hitting the available resource limits.
3) For the older Web/Business editions, you may see metric values going beyond 100% (they are normalized to the limits of an S2 level), as those editions don't have specific limits and run in a resource-shared environment with other customer loads. For the new editions, metrics will never exceed 100%, because the system guarantees you resources up to 100% of that edition's limits, but no more. This provides a predictable, guaranteed amount of resources for your DB, unlike Web/Business editions, where you may get very little or a lot more at different times depending on the other customer workloads running on the same machine.
Hope this helps.
-- Srini
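As a concrete starting point for the DMV mentioned above, a minimal sketch (columns as documented for sys.dm_db_resource_stats, which covers roughly the last hour at ~15-second granularity):

-- Shows which resource dimension (CPU, Data IO, Log IO, memory) is pegged
-- while the COUNT query runs.
SELECT end_time,
       avg_cpu_percent,
       avg_data_io_percent,
       avg_log_write_percent,
       avg_memory_usage_percent
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC;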
