SQL Azure Data Change Auditing Techniques

I am new to SQL Azure and have been tasked with implementing auditing on a SQL Azure database.
Can someone please explain the different techniques available for auditing data changes in SQL Azure? Any reference links would also help.
I want to audit tables that have around 40-50 columns, and I want to track changes to all of the columns. I am also interested in the reliability and performance implications.
Thanks

Auditing in SQL Azure is very easy to set up. Below is the data that will be captured:
Access to data
Schema changes (DDL)
Data changes (DML)
Accounts, roles, and permissions (DCL)
Stored procedures, logins, and transaction management
Once you set up auditing, the log files are stored in a storage account and can be downloaded as an Excel file.
Azure also gives you the option to monitor the audit logs using Power BI.
We have configured auditing on tables with heavy insert traffic (at least 1 million inserts per day) and did not see any performance degradation.
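If you are using blob-storage auditing and want to review the captured logs with T-SQL instead of downloading them, you can also read the .xel audit files straight from the storage account with sys.fn_get_audit_file. A rough sketch (the storage URL below is only a placeholder, substitute your own audit container path):

SELECT event_time, server_principal_name, action_id, statement, succeeded
FROM sys.fn_get_audit_file(
         'https://<yourstorageaccount>.blob.core.windows.net/sqldbauditlogs/',
         DEFAULT, DEFAULT);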
Updated as per comments:
Auditing is currently at the database level; if you want to audit only a single table, triggers are your best bet.
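For example, a minimal sketch of an UPDATE audit trigger (the dbo.Customer table, its Name column, and the audit table are all hypothetical, adapt them to your schema):

CREATE TABLE dbo.Customer_Audit
(
    AuditId    INT IDENTITY(1,1) PRIMARY KEY,
    CustomerId INT           NOT NULL,
    OldName    NVARCHAR(100) NULL,
    NewName    NVARCHAR(100) NULL,
    ModifiedBy SYSNAME       NOT NULL DEFAULT SUSER_SNAME(),
    ModifiedAt DATETIME2     NOT NULL DEFAULT SYSUTCDATETIME()
);
GO

CREATE TRIGGER dbo.trg_Customer_Audit
ON dbo.Customer
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    -- inserted/deleted hold the new and old row images for the statement
    INSERT INTO dbo.Customer_Audit (CustomerId, OldName, NewName)
    SELECT d.CustomerId, d.Name, i.Name
    FROM inserted i
    JOIN deleted  d ON d.CustomerId = i.CustomerId;
END;
GO

Note that with 40-50 columns to track you would need to list each column (or log the full row image), which is where temporal tables become attractive.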
A few links which may help you:
https://powerbi.microsoft.com/en-us/blog/monitor-your-azure-sql-database-auditing-activity-with-power-bi/
https://azure.microsoft.com/en-us/documentation/articles/sql-database-auditing-get-started/#subheading-1

Thanks tmullaney for the response. After deeper analysis I have started using Temporal Tables to enable auditing in SQL Azure. This feature lets you enable auditing on individual tables/entities.
The whole process is handled internally by SQL Server, and there is no need to write a single trigger to do the audit.
Here are a couple of useful links with details on Temporal Tables in SQL Server:
Channel 9 Video : Temporal in SQL Server 2016 : https://channel9.msdn.com/Shows/Data-Exposed/Temporal-in-SQL-Server-2016
Temporal Tables : https://msdn.microsoft.com/en-IN/library/dn935015.aspx
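As a minimal sketch (table and column names are illustrative, not from my actual schema), enabling system-versioning looks like this:

CREATE TABLE dbo.Customer
(
    CustomerId INT           NOT NULL PRIMARY KEY CLUSTERED,
    Name       NVARCHAR(100) NOT NULL,
    -- period columns maintained entirely by the engine
    ValidFrom  DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo    DATETIME2 GENERATED ALWAYS AS ROW END   NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.Customer_History));

-- Every UPDATE/DELETE writes the prior row version to dbo.Customer_History;
-- point-in-time queries read the history back:
SELECT *
FROM dbo.Customer
FOR SYSTEM_TIME AS OF '2016-06-01T00:00:00';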

Related

Relationships in Azure Synapse (DWH)

I'm currently working in an Azure Synapse DWH and I have some theoretical questions:
How can I create relationships between tables (dims and facts), and what implications would there be if I create those relationships?
I read that to create a primary key I would need to make it nonclustered, but what does that mean?
Azure Synapse Analytics (ASA) has three engines:
serverless SQL pools (was SQL on-demand)
dedicated SQL pools (the next step on from Azure SQL Data Warehouse)
Apache Spark pools
None of these currently supports database relationships, as of today. I suspect you mean dedicated SQL pools, and just to confirm, they do not support the FOREIGN KEY syntax. Relationships are more of an OLTP concept and not common in big data platforms, which ASA is.
Therefore your options are to enforce these relationships downstream or on import to your warehouse. A common method is to identify unknown values and substitute them with a -1 / Unknown value on import. This will ensure there are no NULLs in your key columns.
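A rough sketch of that substitution during a load (all staging, fact, and dimension table names here are hypothetical):

-- Replace missing dimension keys with -1 ("Unknown") on import;
-- the -1 rows are pre-seeded in the dimension tables.
INSERT INTO dbo.FactSales (CustomerKey, ProductKey, SalesAmount)
SELECT
    ISNULL(c.CustomerKey, -1),
    ISNULL(p.ProductKey,  -1),
    s.SalesAmount
FROM stg.Sales s
LEFT JOIN dbo.DimCustomer c ON c.CustomerId = s.CustomerId
LEFT JOIN dbo.DimProduct  p ON p.ProductId  = s.ProductId;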
Additionally, enforce your relationships downstream eg in an Azure Analysis Services tabular model or Power BI model.
If you really need relationships then depending on your data volumes you might consider Azure SQL Database which supports data volumes up to 4TB alongside columnstore indexes which give great compression.
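On the question's point about a "nonclustered" primary key: in a dedicated SQL pool, primary key and unique constraints can only be declared NONCLUSTERED and NOT ENFORCED, meaning the engine does not validate them and only uses them as optimizer hints. A sketch with an illustrative table:

CREATE TABLE dbo.DimCustomer
(
    CustomerKey  INT           NOT NULL,
    CustomerName NVARCHAR(100) NOT NULL
)
WITH (DISTRIBUTION = REPLICATE);

-- NOT ENFORCED: uniqueness is the loading process's responsibility, not the engine's
ALTER TABLE dbo.DimCustomer
ADD CONSTRAINT PK_DimCustomer PRIMARY KEY NONCLUSTERED (CustomerKey) NOT ENFORCED;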
I'm having a similar issue and cannot find an automated solution so far.
I'm importing 'entities' from D365 to the data lake, and they do NOT come with the relationships. It also will NOT suggest the "Related Tables".
One option is to introduce ETL of the 'entities' using T-SQL and Spark, with governance of PySpark, notebooks, schemas, T-SQL linting, orchestration of activities and pipelines, workflows, etc.
OR, for small datasets and projects:
1. Reverse look-up each table needed.
2. In Azure Synapse, create a new Dataflow and download the .PBIX.
3. Do your ETL: create primary fact and dimension tables (by whatever means), e.g. using a PowerPivot unique/distinct DAX expression on a Customer table.
4. Once complete, if you like, import the newly ETL'd primary tables into the data lake.
5. Repeat step 2.
6. Create the relationships with Power BI. (Ideally, if the ETL is done correctly, Power BI will auto-detect the relationships.)
7. Re-publish the .PBIX with the relationships as a "Dataflow". Note that you must create relationships for every Dataflow; Dataflows cannot be combined.
Measures and Dataflows will consume resources and require performance analysis as they grow.
At some point Dataverse may expose D365 data, making this easier. Depending on your cost/spend, cloning all of D365 still doesn't solve your relationship needs.
Two solutions I'm aware of so far:
Import the serverless DBOs into Power BI, then model and create the dataset there. You can do massive ETL, including foreign key creation and filtering of NULL values to create primary keys for dimensions, aggregate data and create fact tables, etc. It's far easier than using the Synapse GUI. The drawbacks are related to Power BI licensing.
Create a "Lake Database" (map as you go; great for 5 or fewer entities/tables). The ETL is low-code, but I'm skeptical: after 40 hours of training, I should have just learned how to script this in a notebook/Spark.
Or do both: use Power BI to develop your model and test it, then go back to Synapse and deploy the working model as a pipeline or lake database.
Points of clarity on the top-posted solution:
Do not trust Power BI's auto-relationship detection; stay away from pre-made REFID relationships in Power BI unless you know for sure that is what you want. (Step 6 from the original poster: if the ETL is correct, it's a 1:M relationship.)
Publishing with .PBIX has its limitations around sharing and the other issues the OP mentioned. A Lake Database might be the workaround if you use Tableau, Python, or Qlik.
Dataverse is coming, and Power BI analytics as well as predictive analysis with HDInsight will be embedded into D365. You will also be able to create drag-and-drop dashboards. As of 2022-08-05 this is already working in its infancy; even though they want you to go modular, with a hybrid serverless setup you can still pull the aggregate measures from D365 into Synapse and reverse-engineer them.

Near real-time ETL of Oracle data to Azure SQL

I have an Oracle DB with data that I need to load and transform into an Azure SQL Database. I have no control over either the DB or the application that updates its data.
I'm looking at Azure Data Factory, but I really need data changes in Oracle to be reflected as near to real-time as possible.
I would appreciate any suggestions / insights.
Is ADF the correct tool for the job? If so, what is a good approach to use? If not suitable, what should I consider using instead?
For real-time you don't really want an ELT/ETL tool like ADF. Consider a replication agent like Attunity or (gulp at the licensing costs) GoldenGate.
I don't think Data Factory is a good fit for you. Yes, you can copy data from Oracle to Azure SQL Database with it, but as @Thiago Custodio said, you need to do it for each table you have. That's too complicated.
Just reference: Copy data from and to Oracle by using Azure Data Factory.
As you said, you really need data changes in Oracle to be reflected as near to real-time as possible.
The migration/copy time must be very short, so that the data in Oracle and the Azure SQL database stays the same until the Oracle data changes again. I searched a lot and didn't find any real-time copy tools. Actually, I think what you want is less a copy and more something like 'data sync'.
I found this link, Sync Oracle Database with SQL Azure; hopefully it gives you some good ideas.
For the data migration or copy, you can use the following:
SQL Server Migration Assistant for Oracle (OracleToSQL)
Azure Database Migration Service (DMS)
Reference tutorial:
Migrating Oracle Databases to SQL Server (OracleToSQL): SQL Server Migration Assistant (SSMA) for Oracle is a comprehensive environment that helps you quickly migrate Oracle databases to Azure SQL database.
How to migrate Oracle to Azure SQL Database with minimum downtime:
Hope this helps.
For the record, we went with a product named Qlik Replicate (aka Attunity) and it is working very well!

Power Query refresh makes tables in Azure DWH inaccessible

I am using Azure DWH tables in a Power BI report. Whenever the report's queries are being refreshed, I am unable to execute any queries in SSMS against the same Azure DWH connection until the refresh completes.
While Power BI is refreshing, querying the same table (or any other) returns no data; see the attached screenshots of the query used in SSMS.
Here my table consists of only 29 records, but in my original scenario the table has 10 million records.
Until the refresh completes, I cannot even get the result of the following query:
SELECT GETDATE()
This is caused by concurrency limits in Azure SQL Data Warehouse. Essentially, by default your login is assigned to smallrc (the small resource class), which only has access to two concurrency slots and probably uses both for your refresh.
You can verify this is the issue by creating another user and trying to run your PowerQuery with one login and your SSMS query with another.
You can also change your resource class by running:
EXEC sp_addrolemember 'largerc', 'loaduser';  -- adds the user 'loaduser' to the large resource class
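To check which resource-class roles a user currently belongs to (reusing the illustrative 'loaduser' name from above), something like this should work:

SELECT r.name AS resource_class
FROM sys.database_role_members rm
JOIN sys.database_principals r ON r.principal_id = rm.role_principal_id
JOIN sys.database_principals m ON m.principal_id = rm.member_principal_id
WHERE m.name = 'loaduser'
  AND r.name LIKE '%rc%';   -- smallrc/mediumrc/largerc/xlargerc and staticrc classes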
If needed you can read up more on resource class and concurrency management here: https://learn.microsoft.com/en-us/azure/sql-data-warehouse/resource-classes-for-workload-management
It's a complex subject overall and may be easier to dig through that whole document versus my attempt at explanation.
Finally, just a note of advice: unless you are planning for a large OLAP workload (larger than a terabyte and heavily CPU-bound) and planning on putting some sort of semantic layer between the users running queries and the DW, I would suggest just a plain Azure SQL DB with columnstore enabled on the relevant tables.

Sitecore DMS in Azure

I've deployed Sitecore to an Azure CD environment using Sitecore Azure 3.0.0.
However, I don't get any analytics data until I manually update the "analytics" connection string to point at SQL Azure.
If anyone has already configured this, could you help me with the questions below:
Is manually setting the connection string the best solution, or am I missing a configuration setting for the Sitecore Azure deployment?
Is it possible to sync the SQL Azure analytics database to an on-premise analytics DB? We need this for disaster recovery, i.e. to deploy the web, core, and analytics databases to a different data centre in the event of a disaster.
Does DMS slow down the performance of the Sitecore CD server?
Thanks.
You can add your "analytics" connection string to the connection string patch file defined in your Sitecore Azure config. Do this via the following steps:
Navigate to /sitecore/system/modules/Azure/[Environment]/[Region]/[Farm]/[Role]/[Deployment]
In the deployment item (e.g. Staging, Production), you should see a field named "Connection Strings Patch".
Scroll down in that field until you see the connection strings for the "core", "master", and "web" databases.
Add a connection string element for your "analytics" database. Be sure to use the connection string for the deployment item you're editing, i.e. use your Analytics staging connection string for the Staging item, production connection string for the Production item.
It is not recommended to use Azure SQL Data Sync for backup/disaster recovery (this recommendation is not specific to Sitecore). It is recommended to use a combination of Azure SQL database copying and then Azure SQL database export.
Also, Azure SQL Data Sync has limitations regarding the database schemas supported. SQL Data Sync is unable to synchronize any table that does not have a Primary Key (the Sitecore Analytics database has a few tables without primary keys).
Also, SQL Data Sync synchronizes only data but not stored procedures and triggers (the Sitecore Analytics database does have stored procedures).
Lastly, as your Analytics database grows, a sync operation is likely to take a significant amount of time to complete, whereas a copy operation will still take some time but likely not as much and will place less of a burden on your SQL server.
This MSDN article provides an overview of the copy/export process: http://msdn.microsoft.com/en-US/library/hh852669.aspx#adr3
This MSDN article provides details on how to copy Azure SQL databases: http://msdn.microsoft.com/library/ff951631.aspx
Yes, Sitecore content delivery server performance is impacted when DMS is enabled. To what extent largely depends on how you're using DMS (e.g. personalization, MV testing, engagement plans) and the amount of traffic your server receives.

Azure Tables or SQL Azure?

I am at the planning stage of a web application that will be hosted in Azure with ASP.NET for the web site and Silverlight within the site for a rich user experience. Should I use Azure Tables or SQL Azure for storing my application data?
Azure Table Storage appears to be less expensive than SQL Azure. It is also more highly scalable than SQL Azure.
SQL Azure is easier to work with if you've been doing a lot of relational database work. If you were porting an application that was already using a SQL database, then moving it to SQL Azure would be the obvious choice, but that's the only situation where I would recommend it.
The main limitation on Azure Tables is the lack of secondary indexes. This was announced at PDC '09 and is currently listed as coming soon, but there hasn't been any time-frame announcement. (See http://windowsazure.uservoice.com/forums/34192-windows-azure-feature-voting/suggestions/396314-support-secondary-indexes?ref=title)
I've seen the proposed use of a hybrid system where you use table and blob storage for the bulk of your data, but use SQL Azure for indexes, searching and filtering. However, I haven't had a chance to try that solution yet myself.
Once the secondary indexes are added to table storage, it will essentially be a cloud based NoSQL system and will be much more useful than it is now.
Despite the similar names, SQL Azure tables and Table Storage have very little in common.
Here are two links that might help you:
Table Storage, a 100x cost factor
Fat Entities on Table Storage
Basically, the first question you should ask yourself is: does my app really need to scale? If not, then go for SQL Azure.
For those trying to decide between the two options, be sure to factor reporting requirements into the equation. SQL Azure Reporting and other reporting products support SQL Azure out of the box. If you need to generate complex or flexible reports, you'll probably want to avoid Table Storage.
Azure tables are cheaper, simpler, and scale better than SQL Azure. SQL Azure is a managed SQL environment, multi-tenant in nature, so you should analyze whether your performance requirements are a good fit for SQL Azure. A premium version of SQL Azure has been announced and is in preview as of this writing (see HERE).
I think the decisive factors to decide between SQL Azure and Azure tables are the following:
Do you need to do complex joins and use secondary indexes? If yes, SQL Azure is the best option.
Do you need stored procedures? If yes, SQL Azure.
Do you need auto-scaling capabilities? Azure tables is the best option.
Rows (entities) within an Azure table cannot exceed 1 MB in size. If you need to store large data within a row, it is better to store it in blob storage and reference the blob's URI in the table row.
Do you need to store massive amounts of semi-structured data? If yes, Azure tables are advantageous.
Although Azure tables are tremendously beneficial in terms of simplicity and cost, there are some limitations that need to be taken into account. Please see HERE for some initial guidance.
One other consideration is latency. There used to be a site that Microsoft ran with microbenchmarks on throughput and latency of various object sizes with table store and SQL Azure. Since that site is no longer available, I'll just give you a rough approximation from what I recall. Table store tends to have much higher throughput than SQL Azure, while SQL Azure tends to have lower latency (as little as 1/5th of table store's).
It's already been mentioned that table store is easy to scale. However, SQL Azure can scale as well with Federations. Note that Federations (effectively sharding) adds a lot of complexity to your application. I'm also not sure how much Federations affects performance, but I imagine there's some overhead.
If business continuity is a priority, consider that with Azure Storage you get cheap geo-replication by default. With SQL Azure, you can accomplish something similar but with more effort with SQL Data Sync. Note that SQL Data Sync also incurs performance overhead since it requires triggers on all of your tables to watch for data changes.
I realize this is an old question, but still a very valid one, so I'm adding my reply to it.
CoderDennis and others have pointed out some of the facts - Azure Tables is cheaper, and Azure Tables can be much larger, more efficient etc. If you are 100% sure you will stick with Azure, go with Tables.
However, this assumes you have already decided on Azure. By using Azure Tables, you are locking yourself into the Azure platform. It means writing code very specific to Azure Tables that is not simply going to port over to Amazon; you will have to rewrite those areas of your code. On the other hand, programming against a SQL database with LINQ will port much more easily to another cloud service.
This may not be an issue if you've already decided on your cloud platform.
I suggest looking at Azure Cache in combination with Azure Table. Table alone has 200-300ms latencies, with occasional spikes higher, which can significantly slow down response times / UI interactivity. Cache + Table seems to be a winning combination, for me.
For your question, I want to talk about how to decide logically between SQL tables and Azure Tables.
As we know, a SQL table lives in a relational database engine, but if you have a lot of data in one table, a SQL table may not be suitable, because SQL queries over big data are slow.
In that case you can choose Azure Table storage; Azure Table queries are much faster than SQL table queries for big data. For example, on our website users subscribe to many articles, and we turn each article into a feed item per user, so every user has a copy of the article title and description. The article table therefore holds a lot of data; if we used a SQL table, each query could take more than 30 seconds, but fetching a user's article feed from an Azure Table by PartitionKey and RowKey is very fast.
From this example you can see how to choose between a SQL table and an Azure Table.
I wonder whether we are going to end up with some "vendor independent" cloud api libraries in due course?
I think you first have to define what your application usage patterns are. Will your data model be subject to frequent changes, or is it stable? Do you need ultra-fast inserts while reads stay relatively simple? Do you need advanced, Google-like search? Are you storing BLOBs?
Those are some of the questions you have to ask and answer yourself in order to decide whether a NoSQL or a SQL approach is more suitable for storing your data.
Please consider that both approaches can easily coexist, and both can be extended with blob storage as well.
Azure Tables and SQL Azure are two different beasts, meant for different scenarios. One con of Azure Tables is that you cannot move from Azure to any other platform unless you write providers in your code that can handle such a shift.
