I want to use Azure Media Services for video streaming, but the Azure docs are quite ambiguous regarding pricing. Kindly let me know:
What are Streaming Units?
How do we calculate the cost with Streaming Units?
What is the data transfer cost, and how is it associated with Streaming Units?
Taking these into consideration, if I have 10 videos uploaded and 100 watch minutes, how do I calculate the cost?
If we are using a CDN with the streaming endpoint, will the streaming endpoint and the CDN both be billed?
Thanks in advance.
Several of these questions are answered on https://learn.microsoft.com/en-us/azure/media-services/latest/stream-streaming-endpoint-concept, but for completeness I'll answer directly here. If you use a Standard streaming endpoint, Streaming Units are not needed. The endpoint autoscales up to 600 Mbps. A Premium streaming endpoint does use Streaming Units. These are dedicated to your account and allow you to scale beyond 600 Mbps.
The cost is listed on https://azure.microsoft.com/en-us/pricing/details/media-services/ (click on "Streaming" 1/3 of the way down the page). The pricing is simply related to the amount of time the streaming endpoint is in the running state. Whether a client is streaming or not doesn't matter.
The primary cost related to streaming is data egress or CDN egress; see https://azure.microsoft.com/en-us/pricing/details/bandwidth/ and https://azure.microsoft.com/en-us/pricing/details/cdn/ respectively. The cost is based on the amount of data transferred, not the length of the video, because videos can be encoded at different bitrates.
Using the pricing calculator at https://azure.microsoft.com/en-us/pricing/calculator/?service=media-services can help.
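To make the "data transferred, not video length" point concrete, here is a rough sketch of an egress estimate for the 100 watch minutes in the question. The 4 Mbps bitrate and the $0.08/GB price are placeholder assumptions, not Azure's actual rates; substitute your encoding bitrate and the current price from the bandwidth or CDN pricing page.

```python
def egress_gb(bitrate_mbps: float, watch_minutes: float) -> float:
    """Approximate data delivered to viewers, in decimal GB (1 GB = 1000 MB)."""
    megabits = bitrate_mbps * watch_minutes * 60  # Mbps * seconds watched
    return megabits / 8 / 1000                    # megabits -> megabytes -> GB


def egress_cost(bitrate_mbps: float, watch_minutes: float, price_per_gb: float) -> float:
    """Egress charge for the delivered data at a flat per-GB price."""
    return egress_gb(bitrate_mbps, watch_minutes) * price_per_gb


# Assumed inputs: 4 Mbps average bitrate, hypothetical $0.08 per GB.
gb = egress_gb(bitrate_mbps=4.0, watch_minutes=100)
cost = egress_cost(bitrate_mbps=4.0, watch_minutes=100, price_per_gb=0.08)
print(f"{gb:.2f} GB transferred, about ${cost:.2f} egress")
```

The hourly charge for a running streaming endpoint comes on top of this and does not depend on how much is watched.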
From one company I know that 50,000 DBUs for a B2B Non-Production subscription may cost about $44,000. In turn, on the official Databricks pricing page, the most premium layer costs $0.55/DBU ($27,500 per 50k DBUs).
Could you please explain the difference between B2B subscription DBUs and the official page's Data Analytics Premium SKU DBUs?
Why does the pricing differ so dramatically? Is there anything else (as part of B2B) besides support/FastTrack?
I hope you won't need to publish private information to answer my question, but I need to understand the main reasons to be able to plan costs for future projects.
UPD
Databricks B2B subscription does not provide you with a choice of different usage layers (Light/Engineering/Analytics). Instead you have a single option (price) for each bundle (DBU volume). That option is significantly more expensive than the most expensive Analytics layer.
Think of it as getting a discount on $50,000 worth of tokens. The way you run your process will pull from that bucket as if you had $50,000 to spend, even though you are paying $46,000. You have 1 year or 3 years to spend them; if you don't spend them in that timeframe, you lose the remainder. If you go through them all, you will pay the pay-as-you-go price, or you can pre-buy another 1-year or 3-year bucket of units. Also, how you run your jobs and what tier you run under (Standard or Premium) determine how fast you burn through the bucket of units, so they still matter, as the previous answer stated.
https://azure.microsoft.com/en-us/pricing/details/databricks/
Databricks Unit pre-purchase plan
You can get up to 37% savings over pay-as-you-go DBU prices when you
pre-purchase Azure Databricks Units (DBU) as Databricks Commit Units
(DBCU) for either 1 or 3 years. A Databricks Commit Unit (DBCU)
normalizes usage from Azure Databricks workloads and tiers into a
single purchase. Your DBU usage across those workloads and tiers will
draw down from the Databricks Commit Units (DBCU) until they are
exhausted, or the purchase term expires. The draw down rate will be
equivalent to the price of the DBU, as per the table above.
The purchase tiers and discounts for DBCU purchases are shown below:
1-year pre-purchase plan
DATABRICKS COMMIT UNIT (DBCU) | PRICE (WITH DISCOUNT) | DISCOUNT
25,000 | $23,500 | 6%
50,000 | $46,000 | 8%
100,000 | $89,000 | 11%
200,000 | $172,000 | 14%
350,000 | $287,000 | 18%
500,000 | $400,000 | 20%
750,000 | $578,000 | 22%
1,000,000 | $730,000 | 27%
1,500,000 | $1,050,000 | 30%
2,000,000 | $1,340,000 | 33%
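As a quick check on how the discount column relates to the price column, here is a small sketch. It assumes one DBCU corresponds to $1 of pay-as-you-go DBU spend, which is how this answer describes the bucket; that equivalence is a reading of the quoted table, not something the code can verify.

```python
def discount_pct(dbcu: int, prepaid_price: float, list_value_per_dbcu: float = 1.0) -> float:
    """Discount versus pay-as-you-go, assuming each DBCU covers $1 of usage."""
    list_value = dbcu * list_value_per_dbcu
    return (1 - prepaid_price / list_value) * 100


# Three rows of the 1-year table quoted above: DBCU volume -> discounted price.
plans = {25_000: 23_500, 50_000: 46_000, 100_000: 89_000}

for dbcu, price in plans.items():
    print(f"{dbcu:>7,} DBCU for ${price:,}: {discount_pct(dbcu, price):.0f}% off")
```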
Also, Analytics/Engineering/Light are not options that you choose from. They are defined by how you run your jobs. Executing a job through the notebook interface is defined as an Analytics job, whereas if you schedule the notebook to run, that is considered an Engineering job, and if you submit a job using a coded library, you are running under the Light tier.
UPDATE - not enough room in comment section to answer OP reply
Great, thanks for your answer! I think I see my mistake, but please confirm once again. So a DBCU is about US dollars, so 50k DBCUs may be equal to, let's say, ~100k DBUs, right?
DBUs and DBCUs are exactly the same and are charged the same as far as usage goes. The only difference is that you get an up-front discount of 8% in your example of pre-buying 50,000. If you were to run everything exactly the same in two different workspaces and you spent exactly 50,000 DBU hours in one and 50,000 DBCU hours in the other, you would owe $50,000 over the course of the year in the first case, or you would pay $46,000 up front in the second. Neither of these includes the actual VM base costs that you would owe to Azure. The DBU structure is Databricks' cut of the cost, so you would have to factor the VM costs into your overall cost.
This took me a while to figure out when I started with Databricks as well. When they say you are charged $0.55 for an Analytics job, that is per DBU-hour processed, not $0.55 per job. So if I run an Analytics job for 1 hour, I would burn 0.55 * (# of VMs * VM DBU rate per hour). If I ran that same job for only half an hour, I would be charged 0.55 * (# of VMs * VM DBU rate per hour * 0.5). It's easier to think of the DBU and DBCU units as 1 unit = $1, and you are burning the dollar value per second of compute, not the unit count. The pricing grid that shows $0.55/DBU should be labeled $0.55/DBU-hour, in my opinion. It took me a long time, a couple of calls, and a PoC to figure out.
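A minimal sketch of that burn rate, treating the prepaid bucket as a dollar balance as suggested above. The cluster size (4 VMs) and the 2 DBUs per VM-hour are made-up illustration values; only the $0.55 rate and the 50,000 bucket come from this thread.

```python
def dbu_charge(price_per_dbu_hour: float, num_vms: int,
               dbu_per_vm_hour: float, hours: float) -> float:
    """Dollar value burned by a run: rate * VMs * DBUs per VM-hour * hours."""
    return price_per_dbu_hour * num_vms * dbu_per_vm_hour * hours


# Hypothetical cluster: 4 VMs, each rated at 2 DBUs per hour.
full_hour = dbu_charge(0.55, num_vms=4, dbu_per_vm_hour=2.0, hours=1.0)
half_hour = dbu_charge(0.55, num_vms=4, dbu_per_vm_hour=2.0, hours=0.5)

# Draw the one-hour run down from a 50,000-unit prepaid bucket (1 unit = $1).
bucket_remaining = 50_000 - full_hour

print(f"1 hour: ${full_hour:.2f}, half hour: ${half_hour:.2f}")
print(f"Bucket after the 1-hour run: {bucket_remaining:,.2f}")
```

The charge is linear in run time, so halving the duration halves the amount drawn from the bucket.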
As to your second question
And scheduling jobs through the REST API is more beneficial than scheduling through ADF => Notebook, right?
Again, the question is more complicated than it seems like it should be. I initially said yes, it is better; I didn't catch the ADF portion of the question. You can run Engineering jobs through ADF by making use of the job cluster option to run your notebooks. If you attach your notebooks through ADF to a premade Analytics cluster, you will pay the Analytics cost. Using the APIs, you could schedule your notebooks in the built-in job scheduler that Databricks provides. My understanding is that this is charged at the Engineering level for a notebook and at the Light level for a job library.
Another thing to ask for when pre-buying, if you go that route, is to be able to attach the bucket of units to both your dev/test environment and your prod environment. We keep them on completely separate networks, so we have two workspaces that can both pull from the same pool of units. It depends on your Azure setup. We went through Databricks sales when we set ours up, but Microsoft should be able to do the same.
Depending on the type of workload your cluster runs, you will either be charged for Data Engineering or Data Analytics workload.
For example, if the cluster runs workloads triggered by the Databricks jobs scheduler, you will be charged for the Data Engineering workload. If your cluster runs interactive features such as ad-hoc commands, you will be billed for Data Analytics workload.
Here is an example of how billing works.
If you run a Premium tier cluster for 100 hours in East US 2 with 10 DS13v2 instances, the billing for a Data Analytics workload would be the following:
VM cost for 10 DS13v2 instances: 100 hours x 10 instances x $0.598/hour = $598
DBU cost for Data Analytics workload for 10 DS13v2 instances: 100 hours x 10 instances x 2 DBU per node x $0.55/DBU = $1,100
The total cost would therefore be $598 (VM cost) + $1,100 (DBU cost) = $1,698.
In addition to VM and DBU charges, you may also be charged for managed disks, public IP address or any other resource such as Azure Storage, Azure Cosmos DB depending on your application.
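The same arithmetic as a short script, using only the figures from the example above, so you can swap in your own instance count, hours, and rates. Extra resources such as managed disks or storage are deliberately left out.

```python
# Figures from the example: Premium tier, East US 2, DS13v2, Data Analytics.
hours = 100
instances = 10
vm_price_per_hour = 0.598   # USD per DS13v2 instance-hour
dbu_per_node = 2            # DBUs consumed per DS13v2 node per hour
price_per_dbu = 0.55        # USD per DBU, Data Analytics workload

vm_cost = hours * instances * vm_price_per_hour
dbu_cost = hours * instances * dbu_per_node * price_per_dbu
total_cost = vm_cost + dbu_cost

print(f"VM cost:  ${vm_cost:,.2f}")
print(f"DBU cost: ${dbu_cost:,.2f}")
print(f"Total:    ${total_cost:,.2f}")
```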
Still confused about Azure Databricks pricing?
I would suggest that you create a billing support ticket to get more clarity on the "Azure Databricks pricing: B2B subscription vs official page pricing" question you are asking about.
Step1: Go to “Help+Support”
Step2: Under support =>Select + New support request
Step3: Fill Basic details: Issue type*: Billing
Step4: Review + Create
Note: Azure provides unlimited support for subscription management, which includes billing, quota adjustments, and account transfers.
Reference: How to create an Azure support request
I need to know:
Will stopping an Azure Stream Analytics job stop the billing?
As per the answer from MSFT: For Azure Stream Analytics, there is no charge when the job is stopped.
But for Azure Stream Analytics on IoT Edge: Billing starts when an ASA job is deployed to devices, no matter what the job status is (running/failed/stopped).
Welcome to Stackoverflow!
Note: There are no charges for stopped jobs. Billing is based on streaming units in the cloud and on jobs/devices on the edge.
Detailed explanation:
As a cloud service, Stream Analytics is optimized for cost. There are no upfront costs involved - you only pay for the streaming units you consume, and the amount of data processed. There is no commitment or cluster provisioning required, and you can scale the job up or down based on your business needs.
If you create a Stream Analytics job with streaming units = 1, it is billed at $0.11/hour.
Pricing:
Azure Stream Analytics on Cloud: If you create a Stream Analytics job with N streaming units, it is billed at $0.11 * N/hour.
Azure Stream Analytics on Edge: Azure Stream Analytics on IoT Edge is priced by the number of jobs that have been deployed on a device. For instance, if you have two devices and the first device has one job whereas the second device has two jobs your monthly charge will be (1 job)(1 device)($1/job/device)+(2 jobs)(1 device)($1/job/device) = $1+$2 = $3 per month.
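Both pricing rules can be written down as a small sketch. The rates are the ones quoted in this answer ($0.11 per streaming unit per hour, $1 per job per device per month) and may differ by region or change over time; the function names are just illustrative.

```python
from typing import Iterable

CLOUD_RATE_PER_SU_HOUR = 0.11          # USD, as quoted in this answer
EDGE_RATE_PER_JOB_DEVICE_MONTH = 1.00  # USD, as quoted in this answer


def cloud_cost(streaming_units: int, running_hours: float) -> float:
    """Cloud job: billed per streaming unit for the hours the job is running."""
    return CLOUD_RATE_PER_SU_HOUR * streaming_units * running_hours


def edge_monthly_cost(jobs_per_device: Iterable[int]) -> float:
    """IoT Edge: billed per deployed job per device, whatever the job status."""
    return EDGE_RATE_PER_JOB_DEVICE_MONTH * sum(jobs_per_device)


# One streaming unit running for one hour in the cloud.
print(f"Cloud, 1 SU for 1 hour: ${cloud_cost(1, 1):.2f}")

# Two devices: the first has one job deployed, the second has two.
print(f"Edge, per month: ${edge_monthly_cost([1, 2]):.2f}")
```

Hours during which a cloud job is stopped are simply not counted in `running_hours`, which is why stopping the job stops the charge.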
Hope this helps. If you have any further query do let us know.
In order to write sensor data from an IoT device to a SQL database in the cloud, I use an Azure Stream Analytics job. The SA job has an IoT Hub input and a SQL database output. The query is trivial; it just passes all data through.
According to the MS price calculator, the cheapest way of accomplishing this (in western Europe) is around 75 euros per month (see screenshot).
Actually, only 1 message per minute is sent through the hub, and the price is fixed per month (regardless of the number of messages). I am surprised by the price for such a trivial task on small data. Would there be a cheaper alternative for such low capacity needs? Perhaps an Azure Function?
If you are not processing the data in real time, then SA is not needed; you could just use an Event Hub to ingest your sensor data and forward it on. There are several options to move data from the Event Hub to SQL. As you mentioned in your question, you could use an Azure Function, or if you want a no-code solution, you could use a Logic App.
https://learn.microsoft.com/en-us/azure/connectors/connectors-create-api-azure-event-hubs
https://learn.microsoft.com/en-us/azure/connectors/connectors-create-api-sqlazure
In addition to Ken's answer, the "cold path" can be your solution: Azure IoT Hub stores the telemetry data in blob storage every 720 seconds (the maximum batch frequency).
Using Azure Event Grid on the blob storage, each new blob triggers an EventGridTrigger subscriber, where we can start a streaming process for this batch (or for a group of batches within one hour). After the batch process is done, the ASA job can be stopped.
Note that the ASA job is billed based on the active processing time (the time between Start and Stop), so the cost of using an ASA job can drop significantly.
Is it possible to turn off an Azure Media Services endpoint when idle, and turn it on again on demand (either through configuration or programmatically)?
The purpose is to save costs on the endpoint when it is not used.
Several billing meters apply to Azure Media Services. See https://azure.microsoft.com/en-us/pricing/details/media-services/ for all details.
Main price components:
1. Storage - You pay based on how much space is consumed by your assets. You can delete assets that you no longer plan to use to avoid extra charges.
2. Encoding without reserved units - Usage charges are based on the size of the processed data. You do not pay while the system is idle.
3. Encoding with reserved units - Same as #2, plus you pay for the allocation of resources. Reserved units help you encode multiple jobs at the same time. Use the portal to increase or decrease the number of reserved units, or use the REST API or client SDK to configure this parameter - https://msdn.microsoft.com/en-US/library/azure/dn859236.aspx#update_EncodingReservedUnitType
4. Live channels - For all live channel types, billing is based on the amount of time the channel is in the running state, not on the incoming and processed data. You can stop or delete a channel to avoid incurring charges. See https://msdn.microsoft.com/en-us/library/azure/dn783458.aspx#stop_channels
5. Streaming units - Provide dedicated bandwidth capacity for both on-demand and live streaming. Pricing is per unit. You can start, stop, or scale a streaming endpoint through the portal or the REST API - see https://msdn.microsoft.com/en-us/library/azure/dn783468.aspx
6. Content Protection - Charges are based on licenses/keys issued through the service.