Azure Redis timing out

We've been having problems with Redis in Azure lately, with a lot of timeout errors. I had a read of http://azure.microsoft.com/blog/2015/02/10/investigating-timeout-exceptions-in-stackexchange-redis-for-azure-redis-cache/
Looking at one of the high CPU points, I ran "INFO CPU" and here's what it tells me:
used_cpu_sys:88.72
used_cpu_user:94.69
used_cpu_avg_ms_per_sec:0
server_load:0.45
event_wait:9
event_no_wait:19
Am I right in thinking that the CPU is being eaten by the OS? Server load is almost nothing, but the CPU usage is high. Any ideas how to diagnose this further and fix it?

There are several reasons for timeout exceptions when using Redis on Azure.
Here is a good post that should help you:
http://azure.microsoft.com/blog/2015/02/10/investigating-timeout-exceptions-in-stackexchange-redis-for-azure-redis-cache/
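One thing to keep in mind when reading that output: used_cpu_sys and used_cpu_user are cumulative counters (total seconds of CPU consumed by the Redis process since it started), so a single snapshot does not tell you how busy the server is right now. You need to sample INFO CPU twice and look at the delta over the interval. A minimal sketch with redis-py, where the cache hostname and access key are placeholders for your own values:

    import time
    import redis

    # Placeholders: replace with your Azure Redis endpoint and access key.
    r = redis.StrictRedis(host="mycache.redis.cache.windows.net", port=6380,
                          password="<access-key>", ssl=True)

    def cpu_sample():
        info = r.info("cpu")
        return info["used_cpu_sys"], info["used_cpu_user"]

    sys1, user1 = cpu_sample()
    time.sleep(10)
    sys2, user2 = cpu_sample()

    # The counters are cumulative, so the delta over the 10s window is the
    # actual CPU usage (1.0 == one full core busy for the whole interval).
    print("sys  CPU: %.1f%%" % ((sys2 - sys1) / 10 * 100))
    print("user CPU: %.1f%%" % ((user2 - user1) / 10 * 100))

If the per-second delta is low, the raw used_cpu_* numbers are mostly a function of how long the server has been up, and the timeouts are more likely coming from the client, the network or bandwidth limits than from the Redis server being CPU-bound.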

Related

Debugging "mem_used_percent" on EC2 instance

We are facing a peculiar issue in our AWS service running on EC2 instances. In CloudWatch metrics, "mem_used_percent" gradually goes up over time and eventually reaches 90%, followed by system failure. We have verified that the failure is happening due to an OOM error, and restarting the hosts fixes it by bringing "mem_used_percent" back down to around 20%.
While investigating on both new and long-running EC2 instances, we see that only around 20% of the RAM usage is accounted for in the "top" command output (sorted by %MEM). We are not able to pin down the processes using the rest of the physical memory.
Is there a better way to do memory usage analysis on EC2 instances (Linux) that will add up to "mem_used_percent" in the CloudWatch metrics?
Please let me know if any other details are required.
Thanks!!
Take a look at the "procstat" plugin; it might help your investigation.
"The procstat plugin enables you to collect metrics from individual processes."
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Agent-procstat-process-metrics.html#CloudWatch-Agent-procstat-process-metrics-collected
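If you want to do the same accounting by hand on the instance, it also helps to remember that the per-process RSS shown in top does not include kernel slab, tmpfs or the page cache, which is often where the "missing" memory lives. A rough sketch, assuming a reasonably recent psutil is installed on the host (the slab and shared fields are Linux-only):

    import psutil

    # Sum of resident set sizes across all processes -- roughly what adding up
    # the %MEM column in top would give you.
    proc_rss = sum(p.info["memory_info"].rss
                   for p in psutil.process_iter(attrs=["memory_info"])
                   if p.info["memory_info"] is not None)

    mem = psutil.virtual_memory()
    gib = 1024 ** 3
    print("memory used (percent)  : %.1f%%" % mem.percent)
    print("sum of process RSS     : %.2f GiB" % (proc_rss / gib))
    print("page cache + buffers   : %.2f GiB" % ((mem.cached + mem.buffers) / gib))
    print("tmpfs / shared         : %.2f GiB" % (mem.shared / gib))
    print("kernel slab            : %.2f GiB" % (mem.slab / gib))

If the process RSS total stays around 20% while the slab or shared figures keep growing, the leak is outside of any single process (for example a kernel object or a tmpfs mount), which top will never attribute to a PID.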

Debug high CPU usage in Azure WebApp (Linux)

I have set up an Azure Web App (Linux) to run WordPress and another handmade PHP app. All works fine, but I get this weird CPU usage graph (see below).
Both apps are PHP 7.0 containers.
SSHing into the two containers and using top, I see no unusual CPU-hogging processes.
When I reset both apps the CPU goes back to normal and then slowly starts to rise again, as shown below.
The number of HTTP requests to the apps bears no relation to the CPU usage at all.
I tried to use apache2ctl to see if there are any pending requests, but that does not seem possible inside a Docker container.
Anybody got an idea how to track down the cause of this?
This is the top output. The instance has 2 cores. Lots of idle time, but still over 100% load, and none of the processes are using the CPU ...
After working with MS Support on the issue, it seems to have boiled down to the WordPress theme being too slow or inefficient. Each request took very long and hogged CPU resources, so all following requests started queuing up, thus increasing the CPU load.
Why that would not show up as %CPU in top was never explained to me.
They proposed using a different theme or scaling up to a multi-core instance.
I am unsatisfied with that answer and will keep monitoring to try to find the real culprit.
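One thing that can explain "over 100% load but the processes show no %CPU": the Linux load average counts processes in uninterruptible sleep (state D, typically blocked on disk or network I/O) as well as runnable ones, so slow PHP requests piling up can push the load well past the core count without any process burning CPU. A small sketch that lists the processes contributing to the load by reading /proc directly (Linux only, run inside the container):

    import os

    # Processes in R (runnable) or D (uninterruptible sleep) state.
    # Both count towards the load average, but only R actually uses CPU;
    # many D-state Apache/PHP workers point at requests stuck waiting on I/O.
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open("/proc/%s/stat" % pid) as f:
                data = f.read()
        except OSError:
            continue
        name = data[data.find("(") + 1:data.rfind(")")]
        state = data[data.rfind(")") + 2]
        if state in ("R", "D"):
            print(pid, state, name)

Running this during one of the slow periods and seeing a pile of apache2 or PHP workers in state D would fit the queuing explanation from support.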
I had almost exactly the same CPU percentage graph as you did, although for a Node.js app instead of PHP. Disabling Diagnostic Logs > Docker Container Logging seems to have solved the problem for me.
I do not need those logs because I am logging to Application Insights.
But in your case you might need those logs. I have no solution for that, but I am guessing that heavier log rotation or otherwise reducing the size of the logs might help.

Azure App Service Plan CPU spikes for no obvious reason

We're experiencing CPU spikes on our Azure App Service Plan for no obvious reason. It's not something that stops the service, but we'd like to understand when and how this kind of thing happens.
For example, the CPU percentage sits in the 0-1% range for days but then all of a sudden spikes to 98%, 45%, 60% and comes back to the 0-1% range very quickly. Memory stays unchanged at a comfortable 40-45% level, there are no incoming requests, no web jobs, nothing unusual in the logs, no failures, and service health is OK; nothing we could point our finger at as a reason.
We tried to find out through Kudu > Support > Analyze (metrics), but we couldn't get the request submitted; it just keeps telling us to try again later.
There is only one web app running in that App Service Plan; it's an ASP.NET Core 2.0 web API.
Could someone shed some light on this kind of behavior? Is this normal and expected? If so, why does it happen? Is there a danger that it spikes to 90% and doesn't immediately come back?
Just, what's going on?
After speaking with MS support, I got an answer that this is normal behavior coming from their monitoring tool:
We reviewed our internal tools taking as starting point 12/26 and today 12/29, and we could notice that this was majority System processes doing background tasks, which is normal for each sandbox environment. In your case, it was mostly MonAgentCore.exe fluctuating in CPU, which is our diagnostic log capturing process, and this looks like a very temporary spike and appears normal.

Application pool disabling

I have an application in the Production environment, which is Windows Server 2012/IIS 8 and load balanced.
Recently, out of nowhere, the website's app pool suddenly started getting disabled. The System Windows Logs recorded the following error message from the Resource-Exhaustion-Detector ...
Application Pool 'x' is being automatically disabled due to a series of failures in the process(es) serving that application pool.
Windows successfully diagnosed a low virtual memory condition. The following programs consumed the most virtual memory: w3wp.exe (6604) consumed 5080641536 bytes, w3wp.exe (1572) consumed 477335552 bytes, and w3wp.exe (352) consumed 431423488 bytes.
Anyone got any idea how I can figure out what is happening? We've never come across this issue before and the application has been running fine for a good couple of years.
Also, this isn't something that happens regularly; it seems to happen about once a day, and even then at a random time. The virtual memory was initially 4GB, but because of the issue above we increased it to 8GB. Recently it spiked to about 6.8GB out of 8GB, which it has no reason to do.
Any help would be really appreciated!
There are really two issues here:
1. You have a serious bug in your process/code that happens intermittently. You need to debug it to find out how and when it happens, or at least run ProcDump against the w3wp process so that it keeps listening on the server until an exception happens, then analyze the dump to find where the code gets stuck and consumes that memory. Otherwise, just debug the code and review what changes were made in the last few months (not days).
2. The application pool gets stopped because it is configured (by default) to be disabled after a certain number of failures in a short period (rapid-fail protection). That part is normal behavior; the main issue is not the application pool itself, it's inside the process.
Please let me know if you need further explanation or help on this.
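If you want something lightweight running while you wait for the next occurrence, a small script that periodically logs the commit size of every w3wp.exe will at least tell you which worker process is growing and when the growth starts, so you know where to point ProcDump. A rough sketch with psutil (the 4 GB threshold is an arbitrary illustration, and it needs to run elevated to see all worker processes):

    import time
    import psutil

    THRESHOLD = 4 * 1024 ** 3  # arbitrary: flag workers above ~4 GB of commit

    while True:
        for p in psutil.process_iter(attrs=["pid", "name", "memory_info"]):
            if (p.info["name"] or "").lower() != "w3wp.exe":
                continue
            mem = p.info["memory_info"]
            if mem is None:
                continue
            # vms on Windows is the committed (pagefile-backed) size, which is
            # what the Resource-Exhaustion-Detector event is reporting.
            flag = "  <-- investigate" if mem.vms > THRESHOLD else ""
            print("%s pid=%d commit=%.2f GB%s" % (
                time.strftime("%Y-%m-%d %H:%M:%S"), p.info["pid"],
                mem.vms / 1024 ** 3, flag))
        time.sleep(60)

Cross-referencing the PID that grows with its app pool name (for example via appcmd list wps) narrows the problem down to one application before you start digging into dumps.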

Azure Web Site CPU High at random intervals of the day

I have an Azure Web Site that has been running for 6 months, and on Friday 1st April 2016 at 09:50pm the CPU went very high, which had an impact on the performance of the web site. Stopping and restarting the web service solved the problem, but it came back at 13:00. Since then the CPU has stayed high, making the web site unusable.
I've tried all the monitoring tools (DaaS, event logs), checked for open connections, and made sure my software is closing or disposing of objects correctly.
But the CPU is still high. The only way to resolve it is to restart the web service, but I don't want to keep doing that.
Has anyone else experienced a similar problem, and what was the solution?
The only thing from the event logs that looks like an issue is the odd "A network-related or instance-specific error occurred while establishing a connection to SQL Server", which could be because SQL Azure is not available.
Please help
Hmmm, high CPU means that your web site is executing code, perhaps a runaway loop on some infrequently used code path.
The brute-force way to identify what code is being executed would be to add tracing to your solution with System.Diagnostics.Trace.WriteLine("I am here") and then check the Azure Application Log.
Another way would be to attach the Visual Studio debugger during the high CPU and see what is being executed.
The other way would be to take a dump or minidump from the Kudu site and analyze it with WinDbg:
1) Which thread is consuming CPU:
!runaway
2) What that thread is doing:
!clrstack
hth,
Aldo
