How do I calculate the cost of Azure Synapse Analytics? - azure

I need to calculate the cost of Azure Synapse Analytics. I have used the Azure Pricing Calculator but could not figure it out. It shows close to USD 2,100.
I have the following components as part of Azure Synapse Analytics:
Synapse workspace
Self Hosted agent - Standard_B2s
Synapse SQL pool
How do I calculate the cost of Azure Synapse Analytics?

This is a very difficult question to answer, because most of the costs are consumption/runtime oriented.
The pricing calculator defaults are not great, so you'll really want to fine-tune it. For instance, you cannot remove Dedicated Pools, but you can set the Hours to 0. It also includes Data Explorer, which cannot be removed. To exclude these prices from the calculator, deselect "Auto select engine instances" and, under both Engine V-Cores and Data Management V-Cores, set the hours to 0.
The calculator will NOT include any time for Spark pools (Notebooks) or Data Flows. Both are heavily consumption-oriented, and their costs vary greatly with runtime choices such as pool size. They are billed by minutes of consumption, so good luck predicting that.

Here is a sample pricing calculator filled out to describe your situation. The assumptions are below.
you are using a Dedicated SQL pool not a Serverless SQL pool
you have scaled the dedicated SQL pool to DWU100c and left it running 24 hours a day (if you programmatically pause it then that would reduce the cost)
you do not want to commit to running it 24 hours a day for 1 or 3 years and get reserved pricing discounts
in the dedicated SQL pool you have under 1TB of data (compressed) and you have geo-redundant backups enabled
you are running under 1,000 pipeline activities per month on the self-hosted integration runtime, copy activities run less than an hour per month, and other activity hours are less than 7 hours per month.
you are not using other parts of Synapse like Spark pools, data flows, Data Explorer pools, Synapse Serverless SQL, etc.
you are in the East US Azure region
you have a B2s virtual machine, where the self-hosted IR is installed, with a 128 GB premium SSD OS disk and no other attached disks. It is running 24 hours a day. (The VM cost, but not the storage cost, could be lowered if you stop (deallocate) and start it programmatically)
on the B2s virtual machine you do not want to commit to running it 24 hours a day for 1 or 3 years to get a reserved pricing discount and you are renting the Windows license with the VM rather than bringing your license with Azure Hybrid Benefit
this is retail pricing
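As a rough cross-check of the calculator, the always-on compute line can be approximated as hourly rate times running hours. The sketch below is only an illustration: the function name is made up, a 30-day month is assumed, and the 1.51 per hour figure for a 100 DWU pool is the one quoted in another post on this page, not current East US pricing.

```python
def monthly_compute_cost(hourly_rate, hours_per_day, days_per_month=30):
    """Cost of a resource that is billed only for the hours it is running."""
    return hourly_rate * hours_per_day * days_per_month


# Assumed rate: 1.51 per hour for a 100 DWU dedicated SQL pool
# (taken from another post on this page; check the current price list).
DWU100_HOURLY_RATE = 1.51

always_on = monthly_compute_cost(DWU100_HOURLY_RATE, hours_per_day=24)
paused_overnight = monthly_compute_cost(DWU100_HOURLY_RATE, hours_per_day=10)

print(f"Running 24 h/day: {always_on:.2f} per month")
print(f"Running 10 h/day: {paused_overnight:.2f} per month")
```

Storage, backups, pipeline activity and the VM are billed separately and are not covered by this sketch.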

Related

Decision making factors for migrating SSAS from Azure VM to Azure Analysis Services

We recently upgraded our SSAS resources. Currently our SSAS runs on an Azure VM, and we are billed for the VM type 'Standard E32-8s_v3'.
I am looking for a way to save more cost by selecting a better option.
What would be a good option to save cost and at the same time improve efficiency?
What factors or differences should be considered if we move to Azure Analysis Services instead of SSAS on an Azure VM?
Our SQL server is also on Azure VM.
We have our reports on Power BI report server and SSRS.
Data is coming from different resources like SAP, external parties etc using SSIS.
Can you please advise or suggest better options for our data architecture?
Thank you.
Your VM is 8 cores and 256GB RAM.
One factor in pricing you haven’t mentioned is SQL licensing. You didn’t specify whether you are renting the SQL license with the VM or bringing your own license and what that costs. I’m going to assume you are renting it. With Azure Analysis Services the license is included in the price.
In Azure Analysis Services, 100 QPU is roughly equivalent to 5 cores. So 200 QPU (an S2) would give an equivalent amount of CPU at a similar price, but it only has 50 GB RAM.
To get an equivalent amount of RAM, the S8 would get you close (200 GB RAM), but at substantially higher cost.
If you have one large model which (at least during peak usage or processing) uses most of the 256GB RAM then it may be tough to move to Azure Analysis Services for a similar price. If you have several models on that one server then you could split them across several smaller Azure Analysis Services servers and it may be a reasonable price for you. Or you could scale up for processing when RAM is needed most and scale down for the rest of the day to save cost.

Azure Synapse Billing Model for individuals?

This is a question about the Azure Synapse pricing model: how it works, and how to understand the costs that accumulate for developers who are doing self-study and exploring the services.
For that purpose, I purchased a pay-as-you-go subscription. The first question is: is this the right kind of subscription for individuals who want hands-on practice with Azure services?
Last Sunday, i.e. 19 July 2020 (4 days ago), I provisioned 2 services (a SQL server and a Synapse SQL pool (data warehouse)).
The Synapse SQL pool was set to 100 DWU and was paused immediately after creation.
Based on the pricing, I expected to be charged only 1.510, since I had paused the service and the billing rate for 100 DWU is 1.510 per hour.
However, today, 4 days after the services were provisioned, my accumulated charges are 20.15.
Does anyone know how this works out?
I have raised an SR for this and awaiting a response from Microsoft.
Appreciate it if anyone could give me some leads.
Regards
Lokesh
Did your bill include storage, or was the $20 from compute alone? That amount would be reasonable for Synapse data warehouse compute plus storage: 1 hour of DWU100 compute and 4 days of 1 TB storage in the Central US region (1 h * $1.51 + 4 * 24 h * $0.19 = $1.51 + $18.24 = $19.75).
Storage in the "old" Azure SQL Data Warehouse comes in 1 TB slots, so you will be billed for 1 TB even if you don't use all of it. With the new Synapse on-demand you can use Azure Data Lake as your storage, and it is billed by actual usage.
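The arithmetic in this answer can be reproduced with a few lines of Python. This is only a sketch using the rates quoted above ($1.51 per compute hour and $0.19 per hour for the 1 TB storage slot); the function name is hypothetical, and actual rates depend on region and date.

```python
def synapse_charge(compute_hours, compute_rate, storage_hours, storage_rate):
    """Split a dedicated pool bill into compute and storage components."""
    compute = compute_hours * compute_rate
    storage = storage_hours * storage_rate
    return compute, storage, compute + storage


# 1 hour of DWU100 compute, then 4 days of storage while the pool is paused.
compute, storage, total = synapse_charge(
    compute_hours=1,
    compute_rate=1.51,      # rate quoted in the answer
    storage_hours=4 * 24,
    storage_rate=0.19,      # rate quoted in the answer for the 1 TB slot
)

print(f"compute={compute:.2f} storage={storage:.2f} total={total:.2f}")
```

The point of the split is that pausing the pool stops the compute term only; the storage term keeps accruing for as long as the pool exists.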

Azure Databricks pricing: B2B subscription vs official page pricing

From one company I know that 50,000 DBUs on a B2B Non-Production subscription may cost about $44,000. By contrast, on the official Databricks pricing page, the most premium layer costs $0.55/DBU ($27,500 per 50k DBUs).
Could you please explain the difference between B2B subscription DBUs and the Data Analytics Premium SKU DBUs on the official page?
Why does the pricing differ so dramatically? Is there anything else (as part of B2B) besides support/FastTrack?
I hope you won't need to publish private information to answer my question, but I need to understand the main reasons to be able to plan costs for future projects.
UPD
Databricks B2B subscription does not provide you with a choice of different usage layers (Light/Engineering/Analytics). Instead you have a single option (price) for each bundle (DBU volume). That option is significantly more expensive than the most expensive Analytics layer.
Think of it as getting a discount on $50,000 worth of tokens. The way you run your process will pull from that bucket as if you had $50,000 to spend, even though you are paying $46,000. You have 1 year or 3 years to spend them; if you don't spend them in that timeframe, you lose the remainder. If you go through them all, you will pay the pay-as-you-go price, or you can pre-buy another 1-year or 3-year bucket of units. Also, how you run your jobs and what tier you run under (Standard or Premium) determines how fast you burn through the bucket of units, so it does still matter, as the previous answer stated.
https://azure.microsoft.com/en-us/pricing/details/databricks/
Databricks Unit pre-purchase plan
You can get up to 37% savings over pay-as-you-go DBU prices when you
pre-purchase Azure Databricks Units (DBU) as Databricks Commit Units
(DBCU) for either 1 or 3 years. A Databricks Commit Unit (DBCU)
normalizes usage from Azure Databricks workloads and tiers into a
single purchase. Your DBU usage across those workloads and tiers will
draw down from the Databricks Commit Units (DBCU) until they are
exhausted, or the purchase term expires. The draw down rate will be
equivalent to the price of the DBU, as per the table above.
The purchase tiers and discounts for DBCU purchases are shown below:
1-year pre-purchase plan
DATABRICKS COMMIT UNIT (DBCU)    PRICE (WITH DISCOUNT)    DISCOUNT
25,000                           $23,500                  6%
50,000                           $46,000                  8%
100,000                          $89,000                  11%
200,000                          $172,000                 14%
350,000                          $287,000                 18%
500,000                          $400,000                 20%
750,000                          $578,000                 22%
1,000,000                        $730,000                 27%
1,500,000                        $1,050,000               30%
2,000,000                        $1,340,000               33%
Also, Analytics/Engineering/Light are not options that you choose from; they are defined by how you run your jobs. Executing a job through the notebook interface counts as an Analytics job, whereas scheduling the notebook to run counts as an Engineering job, and submitting a job from a coded library runs under the Light tier.
UPDATE - not enough room in comment section to answer OP reply
Great, thanks for your answer! I think I see my mistake, but please confirm once more. So a DBCU is about US dollars, so 50k DBCUs may be equal to, let's say, ~100k DBUs, right?
DBUs and DBCUs are exactly the same and are charged the same as far as usage goes. The only difference is that you get an up-front discount of 8% in your example of pre-buying 50,000. If you were to run everything exactly the same in two different workspaces and you spent exactly 50,000 DBU hours in one and 50,000 DBCU hours in the other, you would owe $50,000 over the course of the year in the first, or you would pay $46,000 up front in the second. Neither of these includes the actual VM base costs that you would owe to Azure. The DBU charge is Databricks' cut of the cost, so you would have to factor that in to your overall cost.
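To make the pre-purchase numbers concrete, here is a small sketch that derives the discount from the units bought and the price paid, using three rows of the 1-year table quoted above. The function name is illustrative only, and it treats one DBCU as one dollar of pay-as-you-go usage, as this answer suggests.

```python
def prepurchase_discount(dbcu_units, price_paid):
    """Fraction saved versus paying 1 dollar per unit on pay-as-you-go."""
    return 1 - price_paid / dbcu_units


# (units, discounted price) rows copied from the 1-year table in this answer.
tiers = [(25_000, 23_500), (50_000, 46_000), (100_000, 89_000)]

for units, price in tiers:
    saved = units - price
    pct = prepurchase_discount(units, price) * 100
    print(f"{units:>7} DBCU for ${price:>6}: saves ${saved} ({pct:.0f}%)")
```

The bucket still drains at the normal pay-as-you-go rate; the saving comes only from what was paid up front.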
This took me a while to figure out when I started with Databricks as well. When they say you are charged $0.55 for an Analytics job, that is per DBU-hour processed, not $0.55 per job. So if I run an Analytics job for 1 hour, I would burn 0.55 * (# of VMs * DBUs per VM per hour). If I ran that same job for only half an hour, I would be charged 0.55 * 0.5 * (# of VMs * DBUs per VM per hour), i.e. half as much. It's easier to think of DBUs and DBCUs as 1 unit = $1, where you burn the dollar value per second of compute rather than a unit count. The pricing grid that shows $0.55/DBU should be labeled $0.55/DBU-hour, in my opinion. It took me a long time, a couple of calls and a POC to figure out.
As to your second question
And scheduling jobs through the REST API is more beneficial than scheduling through ADF => Notebook, right?
Again, the question is more complicated than it seems like it should be. I initially said yes, it is better, but I didn't catch the ADF portion of the question. You can run Engineering jobs through ADF by using the job cluster option to run your notebooks. If you attach your notebooks through ADF to a pre-made Analytics cluster, you will pay the Analytics cost. Using the APIs, you could schedule your notebooks in the built-in job scheduler that Databricks provides. My understanding is that this is charged at the Engineering level for a notebook and at the Light level for a library job.
Another thing to ask for when pre-buying, if you go that route, is the ability to attach the bucket of units to both your dev/test and prod environments. We keep them on completely separate networks, so we have two workspaces, and both can pull from the same pool of units. It depends on your Azure setup. We went through Databricks sales when we set ours up, but Microsoft should be able to do the same.
Depending on the type of workload your cluster runs, you will either be charged for Data Engineering or Data Analytics workload.
For example, if the cluster runs workloads triggered by the Databricks jobs scheduler, you will be charged for the Data Engineering workload. If your cluster runs interactive features such as ad-hoc commands, you will be billed for Data Analytics workload.
Here is an example of how billing works:
If you run Premium tier cluster for 100 hours in East US 2 with 10 DS13v2 instances, the billing would be the following for Data Analytics workload:
VM cost for 10 DS13v2 instances —100 hours x 10 instances x $0.598/hour = $598
DBU cost for Data Analytics workload for 10 DS13v2 instances —100 hours x 10 instances x 2 DBU per node x $0.55/DBU = $1,100
The total cost would therefore be $598 (VM Cost) + $1,100 (DBU Cost) = $1,698.
In addition to VM and DBU charges, you may also be charged for managed disks, public IP address or any other resource such as Azure Storage, Azure Cosmos DB depending on your application.
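The worked example above follows a simple pattern: the VM charge and the DBU charge both scale with hours and instance count, and the DBU charge additionally scales with the DBUs per node and the DBU rate. A minimal sketch, with a hypothetical function name and the figures from this answer plugged in:

```python
def databricks_cost(hours, instances, vm_rate, dbu_per_node, dbu_rate):
    """Return (VM cost, DBU cost, total) for one cluster run."""
    vm_cost = hours * instances * vm_rate
    dbu_cost = hours * instances * dbu_per_node * dbu_rate
    return vm_cost, dbu_cost, vm_cost + dbu_cost


# Premium tier, Data Analytics workload, 10 DS13v2 instances for 100 hours.
vm_cost, dbu_cost, total = databricks_cost(
    hours=100,
    instances=10,
    vm_rate=0.598,     # $/hour per DS13v2 instance, as quoted above
    dbu_per_node=2,    # DBUs per DS13v2 node, as quoted above
    dbu_rate=0.55,     # $/DBU for Data Analytics, as quoted above
)

print(f"VM ${vm_cost:,.2f} + DBU ${dbu_cost:,.2f} = ${total:,.2f}")
```

Managed disks, public IP addresses and other resources mentioned above are not included in this sketch.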
Still confused about Azure Databricks pricing?
I would suggest creating a billing support ticket to get more clarity on the "Azure Databricks pricing: B2B subscription vs official page pricing" question.
Step 1: Go to "Help + Support"
Step 2: Under Support, select "+ New support request"
Step 3: Fill in the basic details: Issue type*: Billing
Step 4: Review + Create
Note: Azure provides unlimited support for subscription management, which includes billing, quota adjustments, and account transfers.
Reference: How to create an Azure support request

DTU utilization is not going above 56%

I'm doing a data load into an Azure SQL database using Azure Data Factory v2. I started the data load with the DB set to the Standard pricing tier with 800 DTUs. It was slow, so I increased the DTUs to 1600. (My pipeline has been running for 7 hours.)
I then decided to change the pricing tier to Premium with 1000 DTUs. (I didn't make any additional changes.)
The pipeline failed because it lost its connection, so I reran it.
Now, when I monitor the pipeline, it is working fine. But when I monitor the database, the average DTU usage does not go above 56%.
I am dealing with a tremendous amount of data. How can I speed up the process?
I expected the DTUs to max out, but the average utilization is around 56%.
Please follow the Copy activity performance and scalability guide.
It describes the performance tuning steps.
One way is to move the Azure SQL Database to a tier with more DTUs. You have already moved to a tier with 1000 DTUs, but the average utilization is only around 56%, so I don't think you need a higher pricing tier.
You need to look at other ways to improve performance, such as setting more Data Integration Units (DIUs).
A Data Integration Unit is a measure that represents the power (a combination of CPU, memory, and network resource allocation) of a single unit in Azure Data Factory. Data Integration Unit only applies to Azure integration runtime, but not self-hosted integration runtime.
Hope this helps.
The standard answer from Microsoft seems to be that you need to tune the target database or scale up to a higher tier. This suggests that Azure Data Factory is not a limiting factor in the copy performance.
However we've done some testing on a single table, single copy activity, ~15 GB of data. The table did not contain varchar(max), high precision, just simple and plain data.
Conclusion: the tier you choose barely matters (as long as it is not too low). Roughly above S7 / 800 DTU or 8 vCores, the copy activity performs at ~10 MB/s and does not go higher. The load on the target database is 50%-75%.
Since we kept throwing higher database tiers at the problem without seeing any improvement in copy activity performance, our assumption is that the bottleneck is in Azure Data Factory.
Since we are loading a lot of separate tables, our solution is to scale out instead of scaling up, using a ForEach loop with a batch count of at least 4.
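For illustration, the scale-out idea could look roughly like the pipeline definition below, built here as a Python dictionary and printed as JSON. This is a sketch under assumptions, not our actual pipeline: the pipeline, activity and table names are hypothetical, and the inner Copy activity is only a stub (a real one needs source, sink and dataset references). The `isSequential` and `batchCount` settings are the ForEach properties that control parallelism.

```python
import json


def build_foreach_pipeline(tables, batch_count=4):
    """Return a pipeline definition that copies tables in parallel batches."""
    copy_stub = {
        "name": "CopyOneTable",   # hypothetical activity name
        "type": "Copy",
        # Source, sink and dataset references are omitted in this sketch;
        # a real Copy activity would parameterise them with @item().
    }
    return {
        "name": "CopyTablesInParallel",   # hypothetical pipeline name
        "properties": {
            "parameters": {
                "tables": {"type": "Array", "defaultValue": tables},
            },
            "activities": [
                {
                    "name": "ForEachTable",
                    "type": "ForEach",
                    "typeProperties": {
                        "isSequential": False,       # run iterations in parallel
                        "batchCount": batch_count,   # max concurrent iterations
                        "items": {
                            "value": "@pipeline().parameters.tables",
                            "type": "Expression",
                        },
                        "activities": [copy_stub],
                    },
                }
            ],
        },
    }


# Hypothetical table names, for illustration only.
pipeline = build_foreach_pipeline(["dbo.Customers", "dbo.Orders"])
print(json.dumps(pipeline, indent=2))
```

With several copy activities running side by side, the per-activity throughput ceiling matters less, because total throughput is the sum across the concurrent copies.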
The approach to increase the DIU is only applicable in some cases:
https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-performance#data-integration-units
Setting of DIUs larger than four currently applies only when you copy
multiple files from Azure Storage, Azure Data Lake Storage, Amazon S3,
Google Cloud Storage, cloud FTP, or cloud SFTP to any other cloud data
stores.
In our case we are copying data from relational databases.

Are there free websites on Windows Azure for more than the trial period

I did a lot of searching but I guess Windows Azure's trial offers are constantly changing and there is a lot of different information over the internet. I am looking to develop a small website for learning purposes using Azure. My questions are:
1) Are there still 10 free websites after my 30-day trial ends?
If yes,
2) Can I use Table/Blob store after the trial period?
3) Can I use Azure SQL instance after the trial period?
From the horse's mouth, so to speak:
Web Sites Pricing Details
You can run up to 10 websites for Free in a shared environment.
Azure Table Storage will cost money, but it's not all that much. Storage Pricing Details gives you a rundown, but I find their Pricing Calculator to be quite useful.
As an example:
100GB of blob storage
100GB of tables and queues
10 million transactions per month
is a grand total of $9.90 USD per month.

Resources