Minimal Logging in Azure SQL Database

While analyzing the performance of an Azure SQL Database under a heavy workload (Business-Critical service tier), I noticed that Log IO percentage is hammered and sits at 100% for considerable periods, which in turn degrades overall performance. The database is populated by several Data Factory pipelines that run SSIS packages and stored procedures, using INSERT/UPDATE statements extensively.
Back in the on-premises world, I would change the database recovery model to Simple or Bulk-Logged and use the TABLOCK hint in my inserts, and minimal logging would be achieved (provided some other conditions were satisfied).
Is this kind of minimal logging (TABLOCK) also applicable to Azure SQL Database? (I read it uses the Full recovery model by default.)
How can I achieve minimal logging in the Azure SQL Database described above, using the same pipelines?

As Subbu comments, this is not currently supported. You can vote here to help progress this feature: https://feedback.azure.com/forums/217321-sql-database/suggestions/36400585-allow-recovery-model-to-be-changed-to-simple-in-az
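For reference, this is the on-premises pattern the question describes, which does not work in Azure SQL Database because the recovery model cannot be changed from Full (database and table names here are hypothetical):

```sql
-- On-premises only: switch to a minimally logged recovery model.
-- Not possible in Azure SQL Database, which is always in Full recovery.
ALTER DATABASE MyDb SET RECOVERY BULK_LOGGED;

-- With a heap (or an empty clustered-index target) and the TABLOCK hint,
-- the insert can qualify for minimal logging:
INSERT INTO dbo.TargetTable WITH (TABLOCK)
SELECT col1, col2
FROM dbo.StagingTable;

-- Restore the original recovery model and take a log backup afterwards.
ALTER DATABASE MyDb SET RECOVERY FULL;
```

In Azure SQL Database the practical alternatives are reducing per-statement log volume (batching, avoiding unnecessary UPDATEs) or scaling up so log throughput limits are higher, rather than minimal logging.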

Related

Implement a modern end-to-end reporting system based on Power BI and Azure Synapse

I am working on modernizing a reporting solution where the data sources are on-premises on the customers' SQL Servers (2014) and the reports are displayed as Power BI reports on the customer's Power BI Service portal. Today I use SSIS to build a data warehouse, as well as an on-premises data gateway to transport data up to Azure Analysis Services, which in turn is used by the Power BI reports.
I have been wondering if I could use Azure Synapse to connect to customer data, transport it to Azure in the most cost-effective way, and link it to the Power BI workspace as a shared dataset. There are many possibilities, but it is important that the customer experiences the reports as fast and stable, and if possible near real time.
I find SSIS cumbersome and expensive in Azure. Are there mechanisms that make it cheap and fast to get data into Azure? Do I need a data warehouse (Azure SQL Database), or is it better to use a data lake as storage? I need to do incremental loads too. And what if I need to do some transformations? Should I use Power BI dataflows, or do I need to create Azure Data Flows to achieve this?
Does anyone have good experience using Synapse (also with DevOps in mind) and setting up good DEV, TEST, and PROD environments for this? Or is Synapse a cost driver and a simpler implementation will do? Give me your opinions, and if you have links to good articles, please share them. Looking forward to hearing from you.
Regards, Geir
The honest answer is it depends on a lot of different things and I don't know that I can give you a solid answer. What I can do is try to focus down which services might be the best option.
It is worth noting that a Power BI dataset is essentially an Analysis Services database behind the scenes, so unless you are using a feature that is specifically only available in AAS and using a live connection, you may be able to eliminate that step. Refresh options are one of the things that are more limited in Power BI though, so the separate AAS DB might be necessary for your scenario.
There is a good chance that Power BI dataflows will work just fine for you if you can eliminate the AAS instance, and they have the added advantage of having incremental refresh as a core feature. In that case, Power BI will store the data in a data lake for you.
Synapse is an option, but probably not the best one for your scenario unless your dataset is large. SQL pools can get quite expensive, especially if you aren't making use of any of the compute options to do transformations.
Data Factory (also available as Synapse pipelines) without the SSIS integration is generally the least expensive option for moving large amounts of data. It lets you use data flows for some transformations and supports features like incremental load. Outputting to a data lake is probably fine and the most cost-effective, though in some scenarios something like an Azure SQL instance could be required if you specifically need some of its features.
If you need true real time, it can be done, but none of these tools is really built for it. In most cases the 48 refreshes per day (i.e., every 30 minutes) available on a Premium capacity are close enough to real time once you dig into the underlying purpose of a given report.
For true real-time reporting, you would look at push and/or streaming datasets in Power BI and feed them with something like a Logic App or possibly Stream Analytics. There are a lot of limitations with push datasets, though; more than likely you would want to set up a regular Power BI report and dataset and then add the real-time dataset as a separate entity in addition to that.
As far as DevOps goes, pretty much any Azure service can be integrated with a pipeline. In addition to any code, any service or service setting can be deployed via an ARM template or CLI script.
Power BI has improved in the past couple of years to have much better support for DevOps and dev/test/prod environments. Current best practices can be found in the Power BI documentation: https://learn.microsoft.com/en-us/power-bi/create-reports/deployment-pipelines-best-practices

Maintenance required for Azure SQL DB in the long term

What is the maintenance required from an organization when deploying an Azure SQL Database in the long term?
My current organization is hoping to do as little database management as possible, and have looked for products that fully manage our databases without much intervention needed from our end. Some products that are being considered includes Snowflake (for their automated partitioning of tables) and Domo (for their data warehousing, connectors, and BI tool offerings).
I'm leaning towards using Azure SQL DB for multiple reasons (products offered, transparent pricing, integration ease, available documentation, SSO, etc.), but want to first understand the skills needed and ease in maintaining it in the long run.
Will we have to manually rebuild indexes and partition out tables as we scale up? Or is Azure intelligent enough that it'll do most of the heavy lifting of performance optimization itself?
Does Azure or other vendors provide services to optimize a DB?
Sorry for the vague prompts, but any additional considerations in choosing DB vendors would be great. Thanks!
To answer your questions, you should first understand what Azure SQL Database is and what its capabilities are.
I'm leaning towards using Azure SQL DB for multiple reasons (products offered, transparent pricing, integration ease, available documentation, SSO, etc.), but want to first understand the skills needed and ease in maintaining it in the long run.
The document What is Azure SQL Database service covers most of what you want to know. SQL Database is a general-purpose relational database managed service in Microsoft Azure that supports structures such as relational data, JSON, spatial, and XML. It delivers dynamically scalable performance within two purchasing models: a vCore-based purchasing model and a DTU-based purchasing model. It also provides options such as columnstore indexes for extreme analytic analysis and reporting, and in-memory OLTP for extreme transactional processing. Microsoft handles all patching and updating of the SQL code base seamlessly and abstracts away all management of the underlying infrastructure.
Will we have to manually rebuild indexes and partition out tables as we scale up? Or is Azure intelligent enough that it'll do most of the heavy lifting of performance optimization itself?
No, you don't. Scalability is one of the most important characteristics of PaaS that enables you to dynamically add more resources to your service when needed. Azure SQL Database enables you to easily change resources (CPU power, memory, IO throughput, and storage) allocated to your databases.
You can mitigate performance issues due to increased usage of your application that cannot be fixed using indexing or query rewrite methods. Adding more resources enables you to quickly react when your database hits the current resource limits and needs more power to handle the incoming workload. Azure SQL Database also enables you to scale-down the resources when they are not needed to lower the cost.
For more details, please see Scale Up/Down.
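Scaling can be done from the portal, but also with plain T-SQL. A minimal sketch (the database name and target service objective are placeholders; pick a tier and objective appropriate to your workload):

```sql
-- Scale an Azure SQL Database to a different service objective.
-- The change is asynchronous; the database stays online during the operation.
ALTER DATABASE MyDb
MODIFY (EDITION = 'Standard', SERVICE_OBJECTIVE = 'S3');

-- Check the current service objective once the operation completes:
SELECT DATABASEPROPERTYEX('MyDb', 'ServiceObjective');
```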
Does Azure or other vendors provide services to optimize a DB?
As Woblli said, Azure SQL Database provides monitoring and tuning for you.
As a complement, you can also use Azure SQL Database automatic tuning to help optimize the database automatically.
Hope this helps.
Azure SQL DB offers the services you're asking about.
You can enable automatic tuning, which will create and drop indexes based on performance gains, and force known good query plans, again based on performance. It will roll back a change if that specific change has decreased the overall database performance level.
It will not partition or shard your database for you, however.
Official documentation:
https://learn.microsoft.com/en-us/azure/sql-database/sql-database-automatic-tuning
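As a quick sketch, automatic tuning can be enabled per database with T-SQL (run in the target database; which options you turn on is a judgment call for your workload):

```sql
-- Enable the three automatic tuning options on the current database:
ALTER DATABASE CURRENT
SET AUTOMATIC_TUNING (FORCE_LAST_GOOD_PLAN = ON, CREATE_INDEX = ON, DROP_INDEX = ON);

-- Inspect the desired vs. actual state of each option:
SELECT name, desired_state_desc, actual_state_desc
FROM sys.database_automatic_tuning_options;
```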

Monitoring Azure SQL Database

How can I monitor the following metrics for Azure SQL Database:
- Buffer Cache hit ratio.
- Page life expectancy.
- Page Splits.
- Lock waits.
- Batch requests.
- SQL compilations.
The new Azure SQL Analytics
Azure SQL Analytics is a cloud monitoring solution for monitoring the performance of Azure SQL Databases at scale across multiple elastic pools and subscriptions. It collects and visualizes important Azure SQL Database performance metrics, with built-in intelligence for performance troubleshooting on top.
Performance counters on SQL Azure only collect SQL Server counters for a specific database and do not show Windows performance counters (like Page Life Expectancy). For some performance counters you need to take a first snapshot, then a second snapshot, and then subtract the counter values between snapshots to get the actual counter value.
Please use the script provided in the following article to properly collect those counters.
Collecting performance counter values from a SQL Azure database.
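As a starting point, the raw counters can be read from `sys.dm_os_performance_counters` inside the database. Note that, as described above, per-second counters (and ratio counters like Buffer cache hit ratio, which also need their base counter) only become meaningful once you take two snapshots and compute the delta:

```sql
-- Point-in-time counter values; per-second counters are cumulative,
-- so sample twice and subtract to get a rate.
SELECT RTRIM(counter_name) AS counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE RTRIM(counter_name) IN (
    N'Page life expectancy',
    N'Page Splits/sec',
    N'Lock Waits/sec',
    N'Batch Requests/sec',
    N'SQL Compilations/sec');
```

The `RTRIM` is there because counter names in this DMV are padded with trailing spaces.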
You are probably looking for dynamic management views. A good starting point will be
Monitoring Azure SQL Database using dynamic management views.
Regarding Buffer Cache hit, Page life etc. check this blog
SQL Server memory performance metrics – Part 4 – Buffer Cache Hit Ratio and Page Life Expectancy

SQL Azure throttling information

How do I see if an SQL Azure database is being throttled?
I want to see data like: what percentage of time it was throttled, the count of throttles, the top reasons of throttles.
See https://stackoverflow.com/questions/2711868/azure-performance/13091125#13091125
Throttling is the least of your troubles. If you need performance, you would be best served by building your own DB servers using VM roles. I found that their performance is vastly improved over SQL Azure. For fault tolerance you can provision a primary and a failover in a different VM in a different region if necessary. Make sure that the DB resides on the local drive.
I don't believe that information is currently available. However, the team does share reasons why you could be throttled and how to handle it (see here).
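One thing that has since become available and gets close to what the question asks: `sys.dm_db_resource_stats` shows how near the database is running to its resource limits, which is where throttling kicks in. A minimal sketch:

```sql
-- Resource consumption for roughly the last hour, one row per ~15 seconds.
-- Sustained values near 100% indicate the workload is hitting its limits
-- and is a candidate for throttling.
SELECT end_time,
       avg_cpu_percent,
       avg_data_io_percent,
       avg_log_write_percent
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC;
```

You could aggregate over this view (e.g. the percentage of intervals where any metric exceeded a threshold) to approximate "percentage of time throttled".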

Migrating database to SQL Azure

As far as I know, the key points to migrate an existing database to SQL Azure are:
- Tables have to contain a clustered index. This is mandatory.
- Schema and data migration should be done through Data Sync, bulk copy, or the SQL Azure Migration Wizard, but not with the restore option in SSMS.
- The .NET code should handle the transient conditions related to SQL Azure.
- The creation of logins is done in the master database.
- Some T-SQL features may not be supported.

And I think that's all, am I right? Am I missing any other considerations before starting a migration?
Kind regards.
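To illustrate the first point in the checklist, every table needs a clustered index before rows can be inserted; the simplest way to satisfy this is a clustered primary key (the table here is a hypothetical example):

```sql
-- A table that satisfies the SQL Azure clustered-index requirement:
CREATE TABLE dbo.Orders
(
    OrderId    INT       NOT NULL,
    CustomerId INT       NOT NULL,
    OrderDate  DATETIME2 NOT NULL,
    -- PRIMARY KEY CLUSTERED provides the mandatory clustered index.
    CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderId)
);
```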
Update 2015-08-06
The Web and Business editions are no longer available; they have been replaced by the Basic, Standard, and Premium tiers.
CLR stored procedure support is now available.
New: SQL Server support for Linked Server and distributed queries against Windows Azure SQL Database; more info.
Additional considerations:
- Basic tier allows 2 GB
- Standard tier allows 250 GB
- Premium tier allows 500 GB
The following features are NOT supported:
- Distributed transactions, see feature request on UserVoice
- SQL Service Broker, see feature request on UserVoice
I'd add in bandwidth considerations (for initial population and on-going bandwidth). This has cost and performance considerations.
Another potential consideration is any long running processes or large transactions that could be subject to SQL Azure's rather cryptic throttling techniques.
Another key area to point out is SQL jobs. Since SQL Agent is not running, SQL jobs are not supported.
One way to migrate these jobs is to refactor them so that a worker role can kick off the tasks. The content of a job might be moved into a stored procedure to reduce re-architecture. The worker role could then be designed to wake up at the appropriate time and kick off the stored procedure.