Scaling SQL database on Azure. Is data loss possible?

I have a SQL database on Azure. My service tier is Standard, and I have already scaled the data max size to 500 GB (the Standard tier is 200 GB by default). Now I want to scale it to 750 GB. I read the documentation but I'm not sure: how long will it take, and is any data loss possible? Also, will I have to change my configuration, or do the connection strings etc. stay the same?

Data loss is not a byproduct of changing performance tiers or storage size, no.

In most of the standard tiers, you are running on remote storage (similar to how SQL Server in a VM would run over blob-storage-backed disks). So, if all you are doing is increasing the max size on remote storage, that's morally equivalent to the same operation on-premises, and it is immediate.

If you are crossing tiers (say, standard to premium), or moving within premium tiers when there is no space available on the local nodes to satisfy your request, the operation can take time since new space needs to be provisioned and your database needs to be seeded (copied) into the new space. This is done in the background, and the time it takes is related to the size of your database, the performance tier (as IOPS are based on that), and the current transaction load (as this also has to be replicated to the N new nodes). When the replicas are seeded and up-to-date, your current connections are closed by the server (which means any active transactions are aborted) and you can reconnect and retry them on the newly seeded database replicas. This usually takes minutes to tens of minutes, but for very large databases it can take an hour or longer.
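If you prefer to script the max-size change rather than click through the portal, something like the sketch below should work. It is a minimal sketch, not your exact setup: the server name, credentials, and database name are placeholders, and (if I remember correctly) 750 GB is only a valid max size on S3 and above in the Standard tier.

```python
# A minimal sketch using pyodbc; server name, credentials and database name
# are placeholders. ALTER DATABASE cannot run inside a transaction, hence
# autocommit=True.
import time
import pyodbc

CONN_STR = (
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=tcp:myserver.database.windows.net,1433;"  # placeholder server
    "Database=mydb;Uid=myuser;Pwd=mypassword;"        # placeholder credentials
    "Encrypt=yes;Connection Timeout=30;"
)

def increase_max_size():
    # Raising MAXSIZE within the same tier is effectively immediate.
    with pyodbc.connect(CONN_STR, autocommit=True) as conn:
        conn.execute("ALTER DATABASE [mydb] MODIFY (MAXSIZE = 750 GB);")

def query_with_retry(sql, retries=5, delay_s=10):
    # If a tier change does drop your connections, just reconnect and retry;
    # the connection string does not change.
    for attempt in range(retries):
        try:
            with pyodbc.connect(CONN_STR) as conn:
                return conn.execute(sql).fetchall()
        except pyodbc.Error:
            if attempt == retries - 1:
                raise
            time.sleep(delay_s)

if __name__ == "__main__":
    increase_max_size()
    print(query_with_retry("SELECT DATABASEPROPERTYEX('mydb', 'MaxSizeInBytes');"))
```

Nothing about your connection string or configuration changes here; only the max-size property of the database moves.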

Related

Does the cheap Azure SQL Database pricing tier offload resources after a specific period of non-use, similar to how the free App Service plan works?

Question regarding pricing tiers for Azure SQL Database and whether the cheapest plan ($4.90/month) offloads resources after a specific period of non-use.
If my Azure SQL database doesn't get queried for a specific period of time, e.g. 30 minutes, will Azure offload the resources for the database until it gets a new request?
This would be similar to an App Service running the Basic (free) plan. After ~30 minutes the site's resources get offloaded from memory (I think). So when I go to the site, instead of loading immediately, it takes about 5-10 seconds; if I then hit the site again, because it's loaded back into memory, it loads immediately.
Does the same thing happen with SQL Server running in Azure on the cheap $4.90/month plan? It seems like it does! If I don't hit my app service (website), now upgraded to the S1 plan so there is no offloading, and come back a day later and hit a page that has to fetch database results to display, it takes approximately 5-10 seconds; but if I then refresh the page or hit another page that needs to query the DB, the queried data comes back instantly.
No, it should not do that. You reserve capacity in vCores or DTUs, and that is what you get. But various management operations (such as maintenance, capacity restructuring, or moving workloads off faulty hardware) will evict your workload from the underlying host it is running on and spin it up on other hardware. This would be transparent to you if it were not for the flushed cache. Of course this should not happen every time your application goes idle, but it can happen from time to time.
If you are using the DTU purchasing model, I would recommend switching to the vCore-based model, as it's the recommended one. Then take a look at the CPU Used metric. As long as the line of the metric is continuous, you have allocated CPU; when the line becomes dotted, it indicates that the database is paused, but that behavior should only apply to serverless.
You can set your Azure SQL Database to serverless. It is the only Azure SQL Database (PaaS) option that can be auto-paused after a specific period of inactivity; the minimum inactivity period is 1 hour. You also get billed by the minute instead of by the hour. Here you will find a step-by-step tutorial on how to create a serverless database.
Make sure you set the auto-pause delay setting.
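If you want to script this instead of using the portal, a sketch along these lines should do it. It assumes the azure-identity and azure-mgmt-sql packages; the subscription, resource group, server, database, region and SKU values are placeholders, and exact model/method names can differ between SDK versions.

```python
# Sketch: create or update an Azure SQL database as serverless with auto-pause.
# All names, the region and the SKU below are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.sql import SqlManagementClient
from azure.mgmt.sql.models import Database, Sku

SUBSCRIPTION_ID = "<subscription-id>"   # placeholder
RESOURCE_GROUP = "my-resource-group"    # placeholder
SERVER_NAME = "myserver"                # placeholder
DATABASE_NAME = "mydb"                  # placeholder

client = SqlManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Serverless compute (General Purpose, Gen5, 1 vCore max) with a 60-minute
# auto-pause delay; 60 minutes is the minimum, -1 disables auto-pause.
poller = client.databases.begin_create_or_update(
    RESOURCE_GROUP,
    SERVER_NAME,
    DATABASE_NAME,
    Database(
        location="westeurope",  # placeholder region
        sku=Sku(name="GP_S_Gen5_1", tier="GeneralPurpose", family="Gen5", capacity=1),
        auto_pause_delay=60,
        min_capacity=0.5,
    ),
)
print(poller.result().status)
```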

How to move from DTU to vCore in Azure SQL Database

At present we have 3 stages (Dev, QA & Prod) in our Azure resources. All three use the SQL Database 'Standard S6: 400 DTUs' tier. Because of the Dev and QA SQL databases, our monthly cost is going over 700 euros. I am planning to move from DTU to vCore serverless. Below are my queries:
Is just going into the portal -> Compute and storage -> and changing from DTU to vCore serverless the right process?
Do I need to take care of anything else before doing this operation?
Is my existing Azure SQL DB going to be affected by this operation?
If things are not fine as per my requirements, can I come back to the DTU model the same way?
Thanks in advance.
You can have a look at this MS doc for details: Migrate Azure SQL Database from the DTU-based model to the vCore-based model
Is just going into the portal -> Compute and storage -> and changing from DTU to vCore serverless the right process?

Yes! Just change to the required option from the dropdown and click Apply.

Migrating a database from the DTU-based purchasing model to the vCore-based purchasing model is similar to scaling between service objectives in the Basic, Standard, and Premium service tiers, with similar duration and a minimal downtime at the end of the migration process.
Do I need to take care of anything else before doing this operation?

Some hardware generations may not be available in every region. Check availability under Hardware generations for SQL Database.

In the vCore model, the supported maximum database size may differ depending on hardware generation. For large databases, check the supported maximum sizes in the vCore model for single databases and elastic pools.

If you have geo-replicated databases, you don't have to stop geo-replication during migration, but you must upgrade the secondary database first and then upgrade the primary. When downgrading, reverse the order. Also go through the doc once.
Is my existing Azure SQL DB going to be affected by this operation?

You can copy any database with a DTU-based compute size to a database with a vCore-based compute size without restrictions or special sequencing, as long as the target compute size supports the maximum database size of the source database. Database copy creates a transactionally consistent snapshot of the data as of a point in time after the copy operation starts. It doesn't synchronize data between the source and the target after that point in time.
If things are not fine as per my requirements, can I come back to the DTU model the same way?

A database migrated to the vCore-based purchasing model can be migrated back to the DTU-based purchasing model at any time in the same fashion, with the exception of databases migrated to the Hyperscale service tier.

Azure Cosmos DB: requests exceeding rate limit when bulk deleting records

I have one user bulk deleting some 50K documents from one container using a stored procedure.
Meanwhile another user is trying to login to the web app (connected to same cosmos db) but the request fails due to rate limit being exceeded.
What should be the best practice in this case in order to avoid service shortages like the one described?
a) Should I provision RUs by collection?
b) Can I set a cap on the RUs consumed by bulk operations from code when making a request?
c) Is there any other approach?
More details on my current (naive/newbie) implementation:
Two collections : RawDataCollection and TransformedDataCollection
Partition key values are the customer account number
RUs set at the database level (the current dev deployment has the minimum of 400 RUs)
Bulk insert/delete actions are needed in both collections
User profile data (for login purposes, etc.) is stored in RawDataCollection
Bulk actions are low priority in terms of service level, meaning it could be put on hold or something if a higher priority task comes in.
Normally, when a user logs in, they retrieve small amounts of information. This is high priority in terms of service level.
It is recommended not to use stored procedures for bulk delete operations. Stored procedures only operate on the primary replica, meaning they can only leverage 1/4 of the total RU/s provisioned. You will get better throughput usage and more efficiency doing bulk operations with the SDK client in bulk mode.
Whether you provision throughput at the database level or the container level depends on a couple of things. If you have a large number of containers that get roughly the same number of requests and storage, database-level throughput is fine. If the requests and storage are asymmetric, then provision the containers that diverge greatly from the others with their own dedicated throughput. Learn more about the differences.
You cannot throttle requests on a container directly. You will need to implement Queue-Based Load Leveling in your application.
Overall, if you've provisioned 400 RU/s and are trying to bulk delete 50K records, you are under-provisioned and need to increase throughput. In addition, if your workload is highly variable, with long periods of little to no requests and short periods of high volume, you may want to consider using serverless throughput or autoscale.
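As a rough illustration of doing the delete from the client (rather than a stored procedure), here is a Python sketch using the azure-cosmos SDK. The bulk mode mentioned above is a .NET SDK feature, so this is only a crude equivalent: it deletes one partition's documents in small, throttled batches and backs off on 429s so interactive reads (logins) keep getting RUs. The endpoint, key, container name, and batch/pause values are placeholders.

```python
# Throttled partition-scoped delete sketch; all names/values are placeholders.
import itertools
import time

from azure.cosmos import CosmosClient, exceptions

client = CosmosClient("https://myaccount.documents.azure.com:443/", credential="<key>")
container = client.get_database_client("mydb").get_container_client("RawDataCollection")

def delete_partition(account_number, batch_size=100, pause_s=1.0):
    while True:
        # Read only the ids we need, scoped to the partition key (account number).
        ids = [doc["id"] for doc in itertools.islice(
            container.query_items(query="SELECT c.id FROM c",
                                  partition_key=account_number),
            batch_size)]
        if not ids:
            return
        for doc_id in ids:
            try:
                container.delete_item(item=doc_id, partition_key=account_number)
            except exceptions.CosmosHttpResponseError as e:
                if e.status_code == 429:   # throttled: back off and retry once
                    time.sleep(2.0)
                    container.delete_item(item=doc_id, partition_key=account_number)
                else:
                    raise
        time.sleep(pause_s)                # leave RU headroom for other callers

delete_partition("customer-account-123")
```

A proper version of this would go behind a queue (the Queue-Based Load Leveling pattern above) so bulk deletes are drained in the background at whatever rate the remaining RUs allow.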

DTU utilization is not going above 56%

I'm doing a data load into Azure SQL Database using Azure Data Factory v2. I started the data load with the DB set to the Standard pricing tier with 800 DTUs. It was slow, so I increased the DTUs to 1600. (My pipeline has now been running for 7 hours.)
I then decided to change the pricing tier and switched to Premium with the DTUs set to 1000. (I didn't make any additional changes.)
The pipeline failed as it lost the connection, so I reran it.
Now, when I monitor the pipeline, it is working fine, but when I monitor the database, the average DTU usage is not going above 56%.
I am dealing with a tremendous amount of data. How can I speed up the process?
I expected the DTUs to max out, but the average utilization is around 56%.
Please follow this document: Copy activity performance and scalability guide.
This tutorial gives us the performance-tuning steps.
One way is to increase the Azure SQL Database tier to get more DTUs. You have already increased the tier to 1000 DTUs, but the average utilization is around 56%, so I don't think you need a higher pricing tier.
You need to think about other ways to improve the performance, such as setting more Data Integration Units (DIUs).
A Data Integration Unit is a measure that represents the power (a combination of CPU, memory, and network resource allocation) of a single unit in Azure Data Factory. Data Integration Units only apply to the Azure integration runtime, not the self-hosted integration runtime.
Hope this helps.
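If you want to confirm which resource dimension the load is actually hitting (the DTU percentage reflects the highest of CPU, data IO, and log write utilization), you can sample sys.dm_db_resource_stats on the target database. A minimal sketch with pyodbc; the connection details are placeholders.

```python
# Sketch: check which resource dimension is driving DTU usage.
# sys.dm_db_resource_stats reports 15-second averages for roughly the last hour.
import pyodbc

CONN_STR = (
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=tcp:myserver.database.windows.net,1433;"  # placeholder server
    "Database=mydb;Uid=myuser;Pwd=mypassword;Encrypt=yes;"
)

QUERY = """
SELECT TOP 20 end_time,
       avg_cpu_percent,
       avg_data_io_percent,
       avg_log_write_percent
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC;
"""

with pyodbc.connect(CONN_STR) as conn:
    for row in conn.execute(QUERY):
        print(row.end_time, row.avg_cpu_percent,
              row.avg_data_io_percent, row.avg_log_write_percent)
```

If log write or data IO is pinned near 100% while CPU is low, the database is the bottleneck; if nothing is near its limit, the bottleneck is likely on the Data Factory side (DIUs, parallel copies, or the source).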
The standard answer from Microsoft seems to be that you need to tune the target database or scale up to a higher tier. This suggests that Azure Data Factory is not a limiting factor in the copy performance.
However, we've done some testing on a single table with a single copy activity and ~15 GB of data. The table did not contain varchar(max) or high-precision columns, just simple and plain data.
Conclusion: it barely matters what kind of tier you choose (not too low, of course); roughly above S7 / 800 DTU / 8 vCores, the performance of the copy activity is ~10 MB/s and does not go up. The load on the target database is 50%-75%.
Our assumption is that, since we could keep throwing higher database tiers at this problem but did not see any improvement in the copy activity performance, this is Azure Data Factory related.
Our solution is, since we are loading a lot of separate tables, to scale out instead of up via a ForEach loop with the batch count set to at least 4.
The approach of increasing the DIUs is only applicable in some cases:
https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-performance#data-integration-units
Setting of DIUs larger than four currently applies only when you copy multiple files from Azure Storage, Azure Data Lake Storage, Amazon S3, Google Cloud Storage, cloud FTP, or cloud SFTP to any other cloud data stores.
In our case we are copying data from relational databases.

Storing a large amount of state in a service fabric cluster

I have a scenario where we need to store x * 100 GB of data. The data is in general a good candidate for persistent state for an actor (well-partitioned, used by the specific actors only) in the Service Fabric cluster itself.
Is the service fabric persistent state storage recommended for data of this scale? (Our compute load is going to be fairly low, so bumping up VMs just to store the state is not a desirable option.)
How does the amount of persistent state affect the latency of moving partitions between nodes in the cluster?
Well, let's look at how state is stored in a service (this applies to actors too).
The component that stores your data in your service is called a State Provider. State providers can be in-memory only or in-memory + local disk. The default state provider you get with an actor service is in-memory + local disk but it only keeps hot data in memory so your storage requirements are not memory bound. Contrast with the Reliable Collections state provider which currently stores all data both in-memory and on local disk, although in a future release it will also have an option to only keep hot data in memory and offload the rest to local disk.
Given that you are using actors, you can use the default actor state provider which means your data capacity is limited by local disk storage on your machines or VMs, which should be reasonable for storing 100s of GB. We generally don't move entire partitions around, but occasionally Service Fabric does need to rebuild a replica of your service, and the more data you have the longer it will take to build a replica. However, this doesn't really affect the latency of your service, because you have multiple replicas in a stateful service and you usually have enough replicas up that you don't need to wait for another to be rebuilt. Rebuilding a replica is usually something that happens "off to the side."
It's true that it's not economical to add VMs just for storing state, but keep in mind that you can pack as many services onto your VMs as you want. So even though your actor service isn't using much compute, you can always pack other services on those VMs to use up that compute so that you're maximizing both compute and storage on your VMs, which can in fact be very economical.
