Azure Data Explorer concurrent queries

In Azure Data Explorer, we have run load tests against the cluster with Azure Load Testing, going through a function app (Consumption tier) that issues requests to ADX.
We have never been able to go above 2 concurrent queries. What should we do to get past this? We cannot exceed 200 req/s for a simple query.
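For context, here is a minimal sketch of the kind of ADX call such a function app typically makes with the .NET Kusto SDK; the cluster URI, database, credentials, and query below are placeholders, not details from the question. One relevant point is that the Kusto query provider is meant to be created once and reused across invocations rather than per request:

    // Hypothetical ADX query call from inside the function app.
    using Kusto.Data;
    using Kusto.Data.Common;
    using Kusto.Data.Net.Client;

    public static class AdxQuery
    {
        // Creating the client is expensive; a single shared instance lets
        // repeated invocations reuse the underlying connections.
        private static readonly ICslQueryProvider Client =
            KustoClientFactory.CreateCslQueryProvider(
                new KustoConnectionStringBuilder("https://<cluster>.<region>.kusto.windows.net")
                    .WithAadApplicationKeyAuthentication("<appId>", "<appKey>", "<tenantId>"));

        public static long CountRows()
        {
            using var reader = Client.ExecuteQuery(
                "<database>", "MyTable | count", new ClientRequestProperties());
            reader.Read();
            return reader.GetInt64(0);
        }
    }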

Related

Azure SQL DB P4 Not Scaling Properly

I upgraded my Azure SQL DB from P2 (250 DTUs) to P4 (500 DTUs).
But during heavy load we are again facing dropped connections and overall performance degradation.
My understanding is that the number of concurrent requests becomes too high and the database starts dropping connections.
What I understood is that P2 allows 400 concurrent workers whereas P4 allows 800.
https://learn.microsoft.com/en-us/azure/azure-sql/database/resource-limits-dtu-single-databases?view=azuresql
These concurrent workers do not seem to be tied to DTUs, since my DTU usage on P4 stays at 40-45% even under heavy load.
Can we get some data or logs to check the current number of concurrent workers?
Is there any other way to check it?
Is that the main reason for the dropped connections and performance degradation?
You can monitor and fetch logs and metrics of your Azure SQL DB by selecting the relevant metrics in the portal.
Here, I am checking successful connections to understand the session metrics, and worker percentage to understand whether the number of requests or queries is affecting performance.
You can also use various other metrics relevant to your DTU model by changing the selected metric.
To troubleshoot the performance degradation, you can use the built-in Azure diagnostics (Diagnose and solve problems) to get insights.
I selected High CPU utilization, which gave me a recommendation for diagnosing the issue as well as a T-SQL query that can be run in the Azure SQL query editor or directly in SSMS.
Since you have faced an issue with scaling, you can also diagnose it by selecting the corresponding scaling option.
You can connect to your Azure SQL server in SSMS and query the resource statistics directly to get the worker or session usage.
Query the sys.dm_db_resource_stats view in SSMS or the query editor and look at max_worker_percent, which gives you the maximum concurrent workers (requests) as a percentage of the limit of the database's service tier.
Refer below:
SELECT end_time, max_worker_percent FROM sys.dm_db_resource_stats ORDER BY end_time DESC;
You can also review the query execution graphs and metrics there.
You can additionally find insights and improve performance with the following options:
You can enable Azure Monitor SQL Insights to monitor your SQL server and all of its databases together and get more detail on the concurrent workers, including all the data from sys.dm_db_resource_stats, without having to log into SSMS:
Go to Azure Monitor > select SQL from the left tab > Create new profile.
You can add an Ubuntu 18.04 Linux VM as the monitoring machine to fetch the logs and data from all your SQL Server databases.
In this way all your data is managed in a centralized monitoring pane in Azure Monitor.
Reference:
sys.dm_db_resource_stats (Azure SQL Database and Azure SQL Managed Instance) - SQL Server | Microsoft Learn

Azure-based approach for sending 100,000 requests to an external service

Every night I need to get data from an external HTTP service and save it to Azure Data Lake.
Specifically, I need to get all the orders for all the customers. The problem is that there is no way to get this data in a single call; a customer id has to be provided in each separate call.
The URL format is something like /api/ordersByCustomer/{customerId}
I need to get data for 100,000 different customers, which results in 100,000 calls to the external service.
I tried to use Azure Data Factory with a ForEach activity in parallel mode, but each call takes about 4 seconds there (3 seconds of which are spent in the queue). The overall throughput was not satisfying.
What is the best (i.e. fastest) Azure-based approach for this (other than Azure Data Factory)?
Thanks
You could write some asynchronous code that hits the API/HTTP service in parallel, and execute that code with a Custom activity in ADF, which uses an Azure Batch account to get the job done. See Use custom activities in an Azure Data Factory.
Also, before doing any of this, it would be wise to contact the owner/stakeholder of the external HTTP service and find out whether there is rate limiting on that service, and whether the service can even handle such a load.
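As an illustration of the "asynchronous code that hits the service in parallel" idea, here is a generic sketch (not the custom activity itself); the base address, the degree of parallelism, and the way customer ids are supplied are assumptions:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Net.Http;
    using System.Threading;
    using System.Threading.Tasks;

    public static class OrderFetcher
    {
        // Placeholder base address for the external service.
        private static readonly HttpClient Http =
            new HttpClient { BaseAddress = new Uri("https://external-service.example.com") };

        // Cap the number of in-flight requests so the external service is not overwhelmed.
        private static readonly SemaphoreSlim Throttle = new SemaphoreSlim(100);

        public static async Task<string[]> FetchAllOrdersAsync(IEnumerable<string> customerIds)
        {
            var tasks = customerIds.Select(async id =>
            {
                await Throttle.WaitAsync();
                try
                {
                    // Mirrors the per-customer URL format described in the question.
                    return await Http.GetStringAsync($"/api/ordersByCustomer/{id}");
                }
                finally
                {
                    Throttle.Release();
                }
            });

            // All calls are issued asynchronously, at most 100 at a time.
            return await Task.WhenAll(tasks);
        }
    }

In the custom activity scenario, code like this would run on the Batch pool nodes and write each response to Data Lake rather than returning it.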

What is the best way to execute long-running and high memory usage tasks on Azure?

I need to find the best way to perform long-running tasks on Azure.
Scenario:
User picks the dataset and filters on the web app (Azure App Service)
Based on the requirements we create SQL query
Query is executed against one or more Azure SQL databases
Data is being exported in .csv format and uploaded to Azure Blob Storage
The main issues are that the execution of some SQL queries can last 2+ hours and the result set can contain 100M+ rows.
I believe that Azure Functions (and consequently Durable Functions) are not an option because of the timeout and memory usage.
The timeout limit only applies to the Consumption plan. If you want to get rid of the timeout limit and need more memory, you can use the Premium plan or an App Service plan. And because Azure Functions can scale out horizontally, you can split the work into multiple pieces and feed each piece to the function (see the sketch below).
You can also use WebJobs.
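One common way to split the work like this is a Durable Functions fan-out/fan-in orchestration. The following is only a minimal sketch under assumed details; the activity name, the fixed chunk count, and the export logic are placeholders, not part of the original answer:

    using System.Linq;
    using System.Threading.Tasks;
    using Microsoft.Azure.WebJobs;
    using Microsoft.Azure.WebJobs.Extensions.DurableTask;

    public static class ExportOrchestration
    {
        // Fan out: each chunk of the export runs as its own activity invocation,
        // so no single execution has to hold the full result set or run for hours.
        [FunctionName("ExportOrchestrator")]
        public static async Task<string[]> RunOrchestrator(
            [OrchestrationTrigger] IDurableOrchestrationContext context)
        {
            var tasks = Enumerable.Range(0, 100)
                .Select(page => context.CallActivityAsync<string>("ExportChunk", page));

            // Fan in: wait for all chunks; each returns the name of the blob it wrote.
            return await Task.WhenAll(tasks);
        }

        [FunctionName("ExportChunk")]
        public static Task<string> ExportChunk([ActivityTrigger] int page)
        {
            // Placeholder: run the paged SQL query for this chunk, upload the CSV to
            // Blob Storage, and return the blob name.
            return Task.FromResult($"export-part-{page}.csv");
        }
    }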

Azure Functions: better to access the SQL Server DB directly or call my Azure API?

I'm posting because I don't think Azure Functions can take advantage of connection pooling. Say I run one SQL query every 5 minutes in my Azure Function; the initial connection will take a long time to open because it can't benefit from connection pooling the way a C# Web API that is always running can.
Would it be better to call my C# Web API to make that data call and return the results, or is it better to connect to the DB directly? If there were 10 or so DB calls I'm sure direct access would be better, but for 1 or 2 I don't know.
This is a C# Azure Function connecting to an Azure SQL Server database.
This would only be applicable in the Consumption ("serverless") plan. On the traditional App Service plan, your function is not deprovisioned and can make use of connection pooling, since it is running on the same kind of always-on plan as an API app.
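As a rough illustration of this point (not code from the original answer): ADO.NET pools connections per process and per connection string, so as long as the function host stays warm, opening and disposing a SqlConnection on each invocation still reuses pooled connections. The timer schedule, setting name, and query below are placeholders:

    using System;
    using System.Threading.Tasks;
    using Microsoft.Azure.WebJobs;
    using Microsoft.Data.SqlClient;
    using Microsoft.Extensions.Logging;

    public static class PollDatabase
    {
        // Runs every 5 minutes; on a warm (non-deprovisioned) host the Open below
        // is cheap because ADO.NET hands back a pooled connection.
        [FunctionName("PollDatabase")]
        public static async Task Run(
            [TimerTrigger("0 */5 * * * *")] TimerInfo timer,
            ILogger log)
        {
            var connectionString = Environment.GetEnvironmentVariable("SqlConnectionString");

            using var connection = new SqlConnection(connectionString);
            await connection.OpenAsync();

            using var command = new SqlCommand("SELECT COUNT(*) FROM dbo.Orders", connection);
            var count = (int)await command.ExecuteScalarAsync();

            log.LogInformation("Order count: {Count}", count);
        }
    }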

Azure WebJobs for Aggregation

I'm trying to figure out a solution for recurring data aggregation of several thousand remote XML and JSON data files, by using Azure queues and WebJobs to fetch the data.
Basically, an input endpoint URL of some sort would be called (with a data URL as a parameter) on an Azure website/app. It should trigger a WebJobs background job (or can it run continuously and check the queue periodically for new work?), fetch the data URL, and then call back an external endpoint URL on completion.
Now the main concern is the volume and its performance/scaling/pricing overhead. There will be around 10,000 URLs to be fetched every 10-60 minutes (most URLs will be fetched once every 60 minutes). With regard to this scenario of recurring high-volume background jobs, I have a couple of questions:
Is Azure WebJobs (or Worker Roles?) the right option for background processing at this volume, and can it scale accordingly?
For this sort of volume, which Azure website tier would be most suitable (comparison at http://azure.microsoft.com/en-us/pricing/details/app-service/)? Or would only a Cloud Service or VM(s) work at this scale?
Any suggestions or tips are appreciated.
Yes, Azure WebJobs is an ideal solution for this. Azure WebJobs scale with your Web App (formerly Websites), so if you increase your web app instances, you will also increase your WebJob instances. There are ways to prevent this, but that's the default behavior. You could also set up autoscale to automatically scale your web app based on CPU or other performance rules you specify.
It is also possible to scale your web job independently of your web front end (WFE) by deploying the web job to a web app separate from the web app where your WFE is deployed. This has the benefit of not taking up machine resources (CPU, RAM) that your WFE is using while giving you flexibility to scale your web job instances to the appropriate level. Not saying this is what you should do. You will have to do some load testing to determine if this strategy is right (or necessary) for your situation.
You should consider at least the Basic tier for your web app. That would allow you to scale out to 3 instances if you needed to and also removes the CPU and Network I/O limits that the Free and Shared plans have.
As for the queue, I would definitely suggest using the WebJobs SDK and letting the JobHost (from the SDK) invoke your WebJob function for you instead of polling the queue yourself. This is a really slick solution and frees you from having to write the infrastructure code to retrieve messages from the queue, manage message visibility, delete the message, and so on. For a working example of this and a quick start on building your WebJob this way, take a look at the sample code that the Azure WebJobs SDK Queues template generates for you.
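To give an idea of what that looks like, here is a minimal sketch in the classic WebJobs SDK style (matching the JobHost mentioned above); the queue name is a placeholder, each message is assumed to carry one data URL, and the usual AzureWebJobsStorage connection string is assumed to be configured:

    using System.IO;
    using System.Net.Http;
    using System.Threading.Tasks;
    using Microsoft.Azure.WebJobs;

    public class Program
    {
        private static readonly HttpClient Http = new HttpClient();

        // The JobHost discovers ProcessQueueMessage and invokes it whenever a new
        // message lands on the queue; no manual polling code is required.
        public static void Main()
        {
            var host = new JobHost();
            host.RunAndBlock();
        }

        // "fetch-urls" is a hypothetical queue name.
        public static async Task ProcessQueueMessage(
            [QueueTrigger("fetch-urls")] string dataUrl,
            TextWriter log)
        {
            var payload = await Http.GetStringAsync(dataUrl);
            await log.WriteLineAsync($"Fetched {payload.Length} characters from {dataUrl}");
            // ...aggregate/store the payload, then call back the external endpoint...
        }
    }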
