Django Rest Framework very slow on Azure - python-3.x

I migrated from Heroku to Microsoft Azure, and the speed is now really slow. My App Service (Linux) has the following specs:
P1V2
210 total ACU
3.5 GB memory
Dv2-Series compute equivalent
As for my Azure Database for PostgreSQL flexible server, these are its specs:
General Purpose (2-64 vCores) - Balanced configuration for most common workloads
This is my response time of 15 seconds with the Redis cache; sometimes it goes up to 30 seconds or beyond:
I'm sure all these specs are higher than the default Heroku specs I used to get, so why is my Django project so slow when it comes to the response time of API requests?
ADDITION :
I am using a container registry which connects to the App Service whenever there's an auto-deployment.
I also fixed the n + 1 issue on the endpoints.
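For context, the fix was along these lines (a minimal sketch; Order, customer and items are placeholder names, not my actual models):
# Minimal sketch of the kind of n+1 fix mentioned above.
from rest_framework import viewsets
from .models import Order                 # hypothetical app module
from .serializers import OrderSerializer  # hypothetical serializer

class OrderViewSet(viewsets.ReadOnlyModelViewSet):
    serializer_class = OrderSerializer

    def get_queryset(self):
        # Without these, serializing each Order fires extra queries for its
        # customer (FK) and items (reverse relation) - the classic n+1.
        return (
            Order.objects
            .select_related("customer")    # join the FK in the same query
            .prefetch_related("items")     # fetch the related set in one extra query
        )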
Always On is enabled, and I have read several posts like this one.
UPDATE :
I have run ps and top via bash in Kudu, but I don't seem to see any zombie processes. I also searched with S=Z after pressing 'o', but I didn't find any. Below is the output:
top - 16:31:58 up 1 day, 1:47, 1 user, load average: 0.36, 0.62, 0.48
Tasks: 7 total, 1 running, 6 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.9 us, 4.6 sy, 2.2 ni, 89.5 id, 2.4 wa, 0.0 hi, 0.5 si, 0.0 st
MiB Mem : 13993.7 total, 2266.7 free, 1967.4 used, 9759.6 buff/cache
MiB Swap: 2048.0 total, 2032.2 free, 15.8 used. 11719.2 avail Mem

Just to highlight: an App Service app always runs in an App Service plan. When you create an App Service plan in any region, a set of compute resources is created for that plan in that region.
Whatever apps you put into this App Service plan run on these compute resources as defined by your App Service plan. Each App Service plan defines:
Operating System (Windows, Linux)
Region (West US, East US, etc.)
Number of VM instances
Size of VM instances (Small, Medium, Large)
Pricing tier (Free, Shared, Basic, Standard, Premium, PremiumV2, PremiumV3, Isolated, IsolatedV2)
As per the diagnostic tool, it is reporting that there are too many active containers running per host and a high load average, and it is recommended to move some of your apps to another App Service plan and to consider scaling out to reduce the load.
I suggest you refer to this detailed step-by-step guide on Move an app to another App Service plan.
Please note that you can move an app to another App Service plan, as long as the source plan and the target plan are in the same resource group and geographical region.
For scaling out, I suggest you follow the detailed steps in Scale instance count manually or automatically; you can choose to run your application on more than one instance.
Scaling out not only provides you with more processing capability, but also gives you some amount of fault tolerance. If the process goes down on one instance, the other instances continue to serve requests. You can set the scaling to be Manual or Automatic.
Further, you may also consider scaling up, as the new PremiumV3 pricing tier gives you faster processors, SSD storage, and quadruple the memory-to-core ratio of the existing pricing tiers (double the PremiumV2 tier). With that performance advantage, you could save money by running your apps on fewer instances.
Check this article to learn how to create an app in the PremiumV3 tier or scale up an app to PremiumV3.
More details:
Azure App Service plan overview
Update:
I also suggest you go to App Service Diagnostics and check as below:
If Linux zombie processes are detected, this may affect performance and make the application slow. A zombie process, or defunct process, is one which has completed execution but still exists in the system process table, i.e. the parent process has not yet read the child process's exit status.
Zombie processes can be detected by looking at either top or ps output.
Recommended Action if Linux Zombie process detected:
SSH into your app container by going to
https://sitename.scm.azurewebsites.net.
Use ps to check for any <defunct> processes. Sample below.
ps -aux | grep -w defunct
root 3300 0.0 0.0 0 0 pts/24 ZN+ 18:51 0:00 [newzombie]
Use top to show any processes in a 'Z' state. Sample below (press 'o' and filter using 'S=Z')
top - 19:02:22 up 28 days, 13:35, 26 users, load average: 0.39, 0.65, 0.86
Tasks: 66 total, 1 running, 64 sleeping, 0 stopped, 1 zombie
%Cpu(s): 2.7 us, 2.0 sy, 1.0 ni, 93.9 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
KiB Mem : 1975528 total, 123776 free, 1049580 used, 802172 buff/cache
KiB Swap: 1910780 total, 769432 free, 1141348 used. 658264 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3317 root 30 10 0 0 0 Z 0.0 0.0 0:00.00 newzombie
Once the process is identified, try restarting the process or consider restarting your site.
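If you prefer to script this check from the Kudu/SSH console instead of reading ps/top output by hand, a small sketch using the psutil library could look like this (assumes psutil is available in the container, e.g. via pip install psutil):
import psutil

# List any zombie (defunct) processes: completed, but still in the process
# table because the parent has not reaped their exit status.
for proc in psutil.process_iter(["pid", "ppid", "name", "status"]):
    try:
        if proc.info["status"] == psutil.STATUS_ZOMBIE:
            print(f"zombie pid={proc.info['pid']} ppid={proc.info['ppid']} name={proc.info['name']}")
    except (psutil.NoSuchProcess, psutil.AccessDenied):
        continue  # the process may vanish or be inaccessible mid-iteration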
Also look for HTTP server errors, as an HTTP 500.0 error typically indicates an application code issue. An unhandled exception or an error in the application code is what typically causes this error.

There are a number of issues which can impact performance like:
Network requests taking a long time
Application code or database queries being inefficient
Application using high memory/CPU
Application crashing due to an exception
To isolate the issue, you may try the below troubleshooting steps:
Observe and monitor application behavior
Collect data
Mitigate the issue
I would suggest you navigate to your web app in the Azure portal, select the 'Diagnose and solve problems' blade of your web app, and click on 'Linux web app Slow' under popular troubleshooting tools; the information provided there should be helpful for fixing this.
Further, to speed up DRF, try removing the unwanted apps from INSTALLED_APPS and unused entries from MIDDLEWARE; this may help in boosting your Django REST framework performance.
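For illustration only (which apps and middleware are actually unneeded depends entirely on your project), the trimming might look like this in settings.py:
# settings.py - illustrative sketch; keep only what your API really uses.
INSTALLED_APPS = [
    "django.contrib.contenttypes",
    "django.contrib.auth",
    "django.contrib.staticfiles",
    "rest_framework",
    "myapp",                               # placeholder for your own app
    # "django.contrib.admin",              # drop if the admin is not served here
    # "debug_toolbar",                      # never ship debug tooling to production
]

MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",
    "django.contrib.sessions.middleware.SessionMiddleware",
    "django.middleware.common.CommonMiddleware",
    "django.contrib.auth.middleware.AuthenticationMiddleware",
    # "debug_toolbar.middleware.DebugToolbarMiddleware",  # remove in production
]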

There could be several causes for high response time. To isolate the issue, kindly try these steps:
If it’s not done already, turn on the Always On feature. By default, web apps are unloaded if they are idle for some period of time; this lets the system conserve resources. In Basic or Standard mode, you can enable Always On to keep the app loaded all the time.
On the App Service app, in the left navigation, click on Diagnose and solve problems – check out the tiles for “Diagnostic Tools” > “Availability and Performance” & "Best Practices".
Set the CPU utilization to 75% for the scale-out condition and 25% for the scale-in condition as a test and see if that makes any difference (to avoid a flapping condition; I understand you have already analyzed CPU usage).
Isolating/avoiding the outbound TCP limits is easier to solve, as the limits are set by the size of your worker. You can see the limits in Sandbox Cross VM Numerical Limits - TCP Connections. To avoid outbound TCP limits, you can either increase the size of your workers or scale out horizontally (see the connection-reuse sketch after this list).
Troubleshooting intermittent outbound connection errors in Azure App Service - (to isolate port exhaustion).
If there are multiple apps under a single App Service plan, distribute the apps across multiple App Service plans to obtain additional compute levels (to isolate the issue further; I shared more details on this in the ‘comment’ section below).
Review the logs to fetch more details on this issue.
Note: Linux is currently the recommended option for running Python apps in App Service and I believe you’re leveraging the App Service Linux flavor.
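On the outbound TCP point above: reusing connections from your own code also helps. A minimal sketch, assuming the app makes outbound HTTP calls with the requests library (the endpoint URL is a placeholder):
import requests

# One module-level session, created once and reused by all handlers, so
# outbound calls reuse pooled keep-alive connections instead of opening a
# new TCP connection (and consuming a SNAT port) on every request.
session = requests.Session()

def fetch_profile(user_id: int) -> dict:
    response = session.get(
        f"https://api.example.com/profiles/{user_id}",  # placeholder endpoint
        timeout=5,
    )
    response.raise_for_status()
    return response.json()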

After an engagement with the Microsoft team, the issue turned out to be that my Azure flexible server and App Service were in different regions: one was in South Africa North and the other in East US. After ensuring everything was in the same region, the issue was resolved.
Secondly, I had a field which held both text and images (base64). I was using django-summernote, which provides a WYSIWYG experience, so by default it stores all the images and text together in the same field. I optimized that, and now the speed is super fast.
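Roughly, the optimization was of this shape (a hedged sketch; Article and ArticleImage are placeholder names, and ImageField needs Pillow installed):
from django.db import models

class Article(models.Model):
    # Rich text / HTML only - no inline base64 image payloads.
    body = models.TextField()

class ArticleImage(models.Model):
    article = models.ForeignKey(Article, related_name="images", on_delete=models.CASCADE)
    # Stored as a file (disk or blob storage) and served by URL, so API
    # responses no longer carry megabytes of base64 in a single field.
    image = models.ImageField(upload_to="article_images/")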

Related

High memory consumption on Azure Function App on Linux plan

I just switched from Windows plan to Linux on Azure Function App and memory usage went up 5 times.
I didn't change how the package is built; it is just dotnet publish -c Release --no-build --no-restore. I wonder if I could do something here - build for a specific runtime?
Is there a way to decrease that consumption? I'm wondering because my plan was to switch all functions to Linux plans as they are cheaper, but not necessarily if that ends up requiring higher plans.
A few details:
dotnet 3.1
function runtime version ~3
functions run in-process
The function is rarely used, so there is no correlation between higher memory usage and bigger traffic.
Please check if my findings are helpful:
Memory Working Set is the current amount of memory used by the Function App in MB, i.e. it tracks how much of the application is currently loaded in physical memory.
If the requests are high, then the Memory working set is most likely to increase.
AFAIK, the initial start/request (cold start) of the Azure Function incurs high memory consumption, ranging from roughly 60 MiB to 180 MiB, and the net memory working set count depends on the amount of physical memory being used by your function application during request and response time.
According to the Azure Functions Plan Migration official documentation, direct migration to a Dedicated (App Service) plan is not supported currently, and this migration is not supported on Linux.
Also, you can check the cause and resolution under Azure Functions (Linux Plan) > Diagnose and Solve Problems > Availability & Performance.

How do I reduce CPU percentage in the Azure portal?

Right now my website is slow, and when I look at the xxxx-cd-hp it looks like the picture below. CPU: 90.36%. Is this still normal?
Apparently at certain times the CPU percentage increases, maybe because many users are accessing the site.
How can I solve this problem?
CPU time, or process time, is an indication of how much processing time on the CPU a process has used since it started, and CPU Percentage = Process time / Total CPU time * 100.
Suppose the process has been running for 5 hours, the CPU time is 5 hours, and it is a single-core machine; that means the process has been utilizing 100% of the resources of the CPU. This may be either a good or a bad thing depending on whether you want to keep resource consumption low or want to utilize the entire power of the system.
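A tiny worked version of that formula (hypothetical numbers; the cores parameter is my own generalization for multi-core machines):
def cpu_percentage(process_cpu_seconds: float, elapsed_seconds: float, cores: int = 1) -> float:
    # CPU % = process CPU time / (elapsed wall-clock time * cores) * 100
    return process_cpu_seconds / (elapsed_seconds * cores) * 100

print(cpu_percentage(5 * 3600, 5 * 3600))            # 100.0 - the single-core example above
print(cpu_percentage(5 * 3600, 5 * 3600, cores=2))   # 50.0  - same CPU time on two cores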
App Service Diagnostics is an intelligent and interactive experience to help you troubleshoot your app with no configuration required. When you run into issues with your app, App Service Diagnostics points out what’s wrong to guide you to the right information to more easily troubleshoot and resolve issues. To access App Service diagnostics, navigate to your App Service web app in the Azure portal. In the left navigation, click on Diagnose and solve problems.

Azure AppService Performance Issue

We have an ASP.NET Core 2.1 Web API hosted in App Service (S1) that talks to an Azure SQL DB (S1, 20 DTUs). Both are in the same region. During load testing we found that some API instances are taking too much time to return the result.
We tried to troubleshoot the performance issue and below are our observations.
API responds within 0.5 secs most of the time.
API methods are all async methods.
Sometimes it takes around 50 secs to over a minute.
CPU & Memory utilization are below 60%
Database has 20 DTU capacity, out of which 6 DTUs are used during load testing.
In the below example snapshot from Application Insights, we see the total duration of the request was 27.4 seconds, but the database dependency duration was just 97 ms. There is no activity until the database was called. Please refer to the example below.
Can someone please help me understand what was happening during these 27 seconds of wait time? What could be the reason for this?
I would recommend checking the Application Map on Application Insights resource as shown below to double check the dependencies.
Verify the CPU and Memory metrics by going to the "Diagnose and solve problems" link on App service as shown below and run the Availability and Performance report to find out if there were any issues during your load testing.
Use Async methods on your API to maximize the CPU usage. It may be that the worker process threads are hitting the limits and your app is the bottleneck. You should get some insights when you run the report mentioned in point 2 above.
The S1 tier will support no more than 900 concurrent sessions. If your requests per second (RPS rate) during the load test is very high, you may face issues.
Also, S3 and above are recommended for intensive workloads. Checking whether all the connections are closed properly also helps.
You can find details about different pricing tiers and their capabilities in the below link
https://learn.microsoft.com/en-us/azure/sql-database/sql-database-dtu-resource-limits-single-databases

Running out of memory in Azure with 50% utilization

I ran into a situation where out-of-memory exceptions were generated in our Azure App Service for a .NET Core Web API even though memory utilization topped out at around 50% in the App Service Plan (P2V2: 7 GB RAM).
I have looked at this SO article to check private bytes and other things, but still don't see where the memory exhaustion comes from. I see a max usage of 1.5 GB on the memory working set, which is well below the 7 GB.
Nothing shows up under Support + Troubleshooting -> Resource Health or App Service Advisor.
I am not sure where to look next and any help would be appreciated.
Azure App Service caps memory usage at 1.5 GB by default, but you can change this behaviour with this application setting (to be added under Configuration):
WEBSITE_MEMORY_LIMIT_MB = 3072
See also my answer here:
Is there way to determine why Azure App Service restarted?
The Metrics view on the portal can only go up to a 1 minute granularity level.
(The default is 5 minutes)
This means that each metric point is an average value over a 60-second interval.
It may be spiking up and down over 60 seconds, so you need a more real-time view.
Try the SCM console (Advanced Tools > Go), and check the Process Explorer to see the actual memory consumption.

What would cause high KUDU usage (and eventual 502 errors) on an Azure App Service Plan?

We have a number of API apps and WebApps on an Azure App Service P2v2 instance. We've been experiencing an amount of platform instability: the App Service becomes unhealthy and we get a rash of 502 errors across various of the Apps (different ones each time), attributable to very high CPU and Memory usage on the app service. We've tried scaling all the way up to P3v2, but whatever the issue is seems eventually to consume all resources available.
Whenever we've been able to trace a culprit among the apps, it has turned out not to be the app itself but the Kudu service related to it.
A sample error message is: "High physical memory usage detected on multiple occasions. The kudu process for the app [sitename]'pe-services-color' is the most common cause of high memory usage. The most common cause of high memory usage for the kudu process is web jobs.", where the actual app whose Kudu service is named changes quite frequently.
What could be causing the Kudu services to consume so much CPU/Memory, and what can we do to stabilise this app service?
Is it simply that we have too many apps running on one plan? This seems unlikely since all these apps ran previously on a single classic cloud service instance, but if so, what are the limits for apps and slots on a single plan?
(I have seen this question but the answer doesn't help)
Update
From Azure support, these are apparently the limits on Small - Medium - Large non-shared app services:
Worker Size    Max sites
Small          5
Medium         10
Large          20
with 'sites' comprising app services/api apps and their slots.
They seem ridiculously low, and make the larger App Service units highly uneconomic. Can anyone confirm these numbers?
(Incidentally, we found that turning off Always On across the board fixed the issue - it was only causing a problem on empty sites though - we haven't had a chance yet to see if performance is good with all the sites filled.)
High CPU and memory utilization is mostly caused by your program/code itself. If there are a lot of CPU-intensive tasks, or you have applied a lot of parallel programming that spawns many new threads, that can contribute to high CPU and memory utilization, so review your code and look for such instances. When the number of parallel threads increases, CPU utilization goes up and the plan starts scaling up frequently, which adds to your cost and can sometimes lead to thread loss and unexpected results. As Azure resource costs are high, you need to plan your performance accordingly.
You can monitor this using the Metrics option in the App Service plan blade.
