Azure Blobs billing breakdown of "All Other Operations"

I am looking at the billing for my Azure Storage Account and trying to understand how to manage its cost.
Currently most of my blob cost falls under the "All Other Operations" category. Is there a way to see which operations these are?
I would like to reduce this cost; the goal is to update my app so that these operations are performed less often, but first I need to identify what they are.
Below is the graph I get from cost analysis. (Storage accounts, Accumulated cost, grouped by meter)

After a support call with Azure, they pointed me to some of the (somewhat hidden) tracing capabilities.
The first and easiest option is to check the type of transactions.
Go to the storage account > Metrics
Select Transactions as the metric
Click Add Filter and select API Name as the property
Select the API names you think are the suspects
Unfortunately, selecting multiple API names doesn't show them separately, so you have to try each API individually and see if anything sticks out.
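If you'd rather script this than click through the portal, below is a rough sketch using the Azure Monitor query SDK (Azure.Monitor.Query). The storage account resource ID is a placeholder, and splitting the metric by the ApiName dimension via the filter is my assumption of the equivalent of the portal filter:

// Sketch: total Transactions per API name over the last day.
using System;
using System.Linq;
using Azure.Identity;
using Azure.Monitor.Query;
using Azure.Monitor.Query.Models;

var client = new MetricsQueryClient(new DefaultAzureCredential());

var result = client.QueryResource(
    "<storage-account-resource-id>",
    new[] { "Transactions" },
    new MetricsQueryOptions
    {
        TimeRange = new QueryTimeRange(TimeSpan.FromDays(1)),
        Granularity = TimeSpan.FromHours(1),
        Aggregations = { MetricAggregationType.Total },
        Filter = "ApiName eq '*'"   // split the metric by the ApiName dimension
    }).Value;

foreach (var series in result.Metrics[0].TimeSeries)
{
    series.Metadata.TryGetValue("ApiName", out var apiName);
    var total = series.Values.Sum(v => v.Total ?? 0);
    Console.WriteLine($"{apiName ?? "(none)"}: {total}");
}

This gives you one total per API name in a single run, instead of re-selecting each API in the portal.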
The second option is to enable diagnostic logging for the storage type you're interested in.
If the above doesn't yield any good results, or you're curious about the exact calls at exact times, etc., you can enable this feature and wait for logs to be collected, usually over a few days, so you have a good sample set to reason about.
Go to the storage account > Diagnostic settings (classic).
This is under Monitoring (classic) and doesn't seem to have a replacement in the new Monitoring section.
Enable logging and metrics type (hour or minute)
Click Save
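If you prefer to turn this on from code rather than the portal, here is a minimal sketch using the classic blob SDK (Microsoft.Azure.Storage.Blob); the connection string, retention period and logged operations are example values:

// Sketch: enable classic Storage Analytics logging for the Blob service.
using Microsoft.Azure.Storage;
using Microsoft.Azure.Storage.Blob;
using Microsoft.Azure.Storage.Shared.Protocol;

var client = CloudStorageAccount.Parse("<connection-string>").CreateCloudBlobClient();

// Read the current service properties so only the logging section changes.
ServiceProperties props = client.GetServiceProperties();
props.Logging.Version = "1.0";
props.Logging.LoggingOperations = LoggingOperations.All;  // read, write and delete
props.Logging.RetentionDays = 7;

client.SetServiceProperties(props);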
These logs are written to blob storage in the same account, in a container named $logs. According to the docs, this container cannot be deleted once logging is enabled, but its contents can be deleted when you're done.
Note that if your storage account gets a lot of traffic, this log can get very big very quickly. You're charged the usual rates for reads, writes and storage in this container, including the log writes the platform does while these settings are enabled.
See documentation here
After setting this up, give it some time to collect data.
Use Storage Explorer or other means to navigate to the logs, download them, and inspect them.
The logs contain every request made to storage, with details such as the timestamp, API name, result, and whether the operation was authenticated; for blobs they also show the URL, the user agent and more.
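If browsing the logs by hand gets tedious, here is a minimal sketch of scripting that inspection: it downloads the blob service logs from $logs and counts entries per operation. It assumes the classic v1.0 log format, where the operation name is the third semicolon-delimited field; adjust the index if your log version differs.

// Sketch: count classic Storage Analytics log entries per operation type.
using System;
using System.Collections.Generic;
using System.Linq;
using Azure.Storage.Blobs;

var logs = new BlobContainerClient("<connection-string>", "$logs");
var counts = new Dictionary<string, int>();

foreach (var item in logs.GetBlobs(prefix: "blob/"))   // blob service logs only
{
    var content = logs.GetBlobClient(item.Name).DownloadContent().Value.Content.ToString();
    foreach (var line in content.Split('\n', StringSplitOptions.RemoveEmptyEntries))
    {
        var fields = line.Split(';');
        if (fields.Length > 2)
            counts[fields[2]] = counts.GetValueOrDefault(fields[2]) + 1;   // operation type
    }
}

foreach (var pair in counts.OrderByDescending(p => p.Value))
    Console.WriteLine($"{pair.Key}: {pair.Value}");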
(It turns out my app made close to 100,000 calls to GetBlobProperties and GetContainerProperties per day 🎈)

The short answer to your question is yes.
Analysis:
From my observation, I get "All Other Operations" when I group by "Meter", as shown in the screenshot below.
If I then export the results by clicking "Export" and filter the exported data on the "Meter" column for "All Other Operations", I see that the "ServiceTier" column has "Tiered Block Blob" as its value (in my case). For reference, see the screenshot below.
If I instead group by "Meter subcategory", as shown in the screenshot below, I see "Tiered Block Blob" (in my case).
And if I export those results and filter the "Meter subcategory" column for "Tiered Block Blob", the "ServiceTier" column also shows "Tiered Block Blob". For reference, see the screenshot below.
So based on the above analysis, I believe we can break down the "All Other Operations" meter (in my case, to "Tiered Block Blob") with the help of "Meter subcategory" and "ServiceTier". You should be able to break down your "All Other Operations" meter the same way.
Hope this helps! Cheers!
Other related references: As per this and this Azure documentation, there are many operations on blobs besides write, read, and list operations, so in your case any such operations might have fallen under the "All Other Operations" category.

Related

New item inserted in Azure Table Storage is not immediately available

I have:
an endpoint in an Azure Function called "INSERT" that inserts a record in Table Storage using a batch operation.
an endpoint in a different Azure Function called "GET" that gets a record from Table Storage.
If I insert an item and then immediately get that same item, then the item has not appeared yet!
If I delay by one second after saving, then I find the item.
If I delay by 10ms after saving, then I don't find the item.
I see the same symptom when updating an item. I set a date field on the item; if I get it immediately after updating, then sometimes the date field is not set yet.
Is this known behavior in Azure Table Storage? I know about ETags as described here but I cannot see how it applies to this issue.
I cannot easily provide a code sample because this is distributed among multiple functions, and I think that if I reduced it to a simpler example, some mechanism would notice that I am calling from the same IP or with the same client and manage to return the recently saved item.
As mentioned in the comments, Azure Table Storage is Strongly Consistent. Data is available to you as soon as it is written to Storage.
This is in contrast with the Cosmos DB Table API, where there are multiple consistency levels and data may not be immediately available for you to read after it is written, depending on the consistency level set.
The issue was related to my code and queues running in the background.
I had shut down the Function that has queue triggers, but to my surprise I found that the Function in my staging slot was picking items off the queue. That is why it made a difference whether I delayed for a second or two.
As for the second part, why the date field seemingly wasn't set as soon as I read it back: it turns out I had restricted the query to specific columns, like this:
var operation = TableOperation.Retrieve<Entity>(partitionKey, id, new List<string> { "Content", "IsDeleted" });
To make matters worse, the class "Entity" that I deserialize into naturally had default values for its primitive properties (such as false), so it wasn't obvious that those properties were simply not being retrieved.
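For completeness, a sketch of the fix, assuming the date column is named, say, ModifiedDate (a hypothetical name): either add it to the projection, or drop the column list so all properties come back.

// Include every column you actually read ("ModifiedDate" is a hypothetical name)...
var operation = TableOperation.Retrieve<Entity>(
    partitionKey, id, new List<string> { "Content", "IsDeleted", "ModifiedDate" });

// ...or drop the column list entirely to retrieve all properties.
var fullOperation = TableOperation.Retrieve<Entity>(partitionKey, id);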
So the answer does not have much to do with the question. In summary, for anyone finding this question because they are wondering the same thing:
The answer is YES - Table Storage is in fact strongly consistent, and it doesn't matter whether you read 'very fast' or connect from another location.

Azure Stream Analytics: "Output contains multiple rows …" warning

We're using a Stream Analytics component in Azure to send data (log messages from different web apps) to a Table Storage account. The messages are retrieved from an Event Hub, but I don't think that matters here.
Within the Stream Analytics component we defined an output for the table storage account including partition and row key settings. As of now the partition key will be the name of the app that sent the log message in the first place. This might not be ideal, but I'm lacking experience in choosing the right values here. However, I think this is a whole different topic. The row key will be a unique id of the specific log message.
Now when I watch the Stream Analytics Output within the Azure portal the following warning message pops up very frequently (and sometimes disappears for a couple of seconds):
Warning: Output contains multiple rows and just one row per partition key. If the output latency is higher than expected, consider choosing a partition key that splits output into multiple partitions while maintaining about 100 records per partition.
Regarding this message I have two questions:
What does this exactly mean or why does it happen? I can see that a single new log message will always qualify as "just one row per partition key", simply because it's just one row. But looking at maybe hundreds of rows sent within a short period of time they all share just three partition keys (three apps logging to the Event Hub), pretty much equally divided. That's why I don't get the whole "Output contains multiple rows and just one row per partition key" thing.
Does this in any way affect the performance or overall functionality of the Stream Analytics component or the table storage?
I also played with the "Batch size" setting of the table storage output, but this didn't change anything.
Thanks in advance for reading and trying to help.
What does this exactly mean or why does it happen?
It is a warning, not an error. It means that each row in your output has a unique partition key.
I can see that a single new log message will always qualify as "just one row per partition key", simply because it's just one row.
The warning doesn't really apply to a single message. I suggest you post feedback on the Azure feedback site, which is used for collecting user voice items and bug reports.
https://feedback.azure.com/forums/34192--general-feedback
Does this in any way affect the performance or overall functionality of the Stream Analytics component or the table storage?
No, you could just ignore the warning.

Azure Storage Table size

Azure billing is based on the size of the used space. Now I need to know the details: what is the size of each storage object in my account (each blob container, each single table)?
It's easy to write code that enumerates all blobs and calculates the overall size per container. But what about tables? How can I get the size of a certain table in Azure storage?
If you're not interested in getting a breakdown by blob container, you don't have to write any code as far as finding the blob storage size is concerned. This information is available to you via storage analytics (http://msdn.microsoft.com/en-us/library/windowsazure/hh343270.aspx). The table of interest to you would be $MetricsCapacityBlob (http://msdn.microsoft.com/en-us/library/windowsazure/hh343264.aspx).
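As a rough sketch, assuming the documented schema where the rows with RowKey "data" hold user-data capacity in a "Capacity" property (bytes), you can read that analytics table with the classic table SDK:

// Sketch: read daily blob capacity from the $MetricsCapacityBlob analytics table.
using System;
using Microsoft.Azure.Cosmos.Table;

var table = CloudStorageAccount.Parse("<connection-string>")
    .CreateCloudTableClient().GetTableReference("$MetricsCapacityBlob");

var query = new TableQuery<DynamicTableEntity>().Where(
    TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.Equal, "data"));

foreach (var row in table.ExecuteQuery(query))
    Console.WriteLine($"{row.PartitionKey}: {row.Properties["Capacity"].Int64Value} bytes");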
Coming to tables, unfortunately no such thing is available and you would need to fetch all entities and calculate the size of each entity to find the table size. You may find this blog post useful in calculating the size of an entity: http://blogs.msdn.com/b/avkashchauhan/archive/2011/11/30/how-the-size-of-an-entity-is-caclulated-in-windows-azure-table-storage.aspx.
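A rough sketch of that approach, using the per-entity size formula from the blog post linked above (the byte counts per property type are approximations, and the table name is a placeholder):

// Sketch: estimate a table's size by summing an approximate size per entity.
using System;
using Microsoft.Azure.Cosmos.Table;

var table = CloudStorageAccount.Parse("<connection-string>")
    .CreateCloudTableClient().GetTableReference("<table-name>");

long total = 0;
foreach (var entity in table.ExecuteQuery(new TableQuery<DynamicTableEntity>()))
    total += EstimateEntitySize(entity);

Console.WriteLine($"Approximate table size: {total} bytes");

// 4 bytes of overhead + 2 bytes per key character, then per property:
// 8 bytes of overhead + 2 bytes per name character + the size of the value.
static long EstimateEntitySize(DynamicTableEntity entity)
{
    long size = 4 + (entity.PartitionKey.Length + entity.RowKey.Length) * 2;
    foreach (var prop in entity.Properties)
    {
        size += 8 + prop.Key.Length * 2;
        size += prop.Value.PropertyType switch
        {
            EdmType.String   => (prop.Value.StringValue?.Length ?? 0) * 2 + 4,
            EdmType.Binary   => (prop.Value.BinaryValue?.Length ?? 0) + 4,
            EdmType.Guid     => 16,
            EdmType.DateTime => 8,
            EdmType.Double   => 8,
            EdmType.Int64    => 8,
            EdmType.Int32    => 4,
            EdmType.Boolean  => 1,
            _                => 8
        };
    }
    return size;
}

Note this downloads every entity, so on a large table it costs transactions and time.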
HTH.
There is a tool which can get the table size or entity count for you: Azure Storage Manager.
Select a storage table in the left tree pane
Click the 'Property' button
Click the 'Calc' button on the table properties dialog
Wait a few moments, until the 'Calc' button becomes available again.
Here's the step-by-step of how to get this info:
Go into "Monitor" in Azure (it's a top-level item on the left nav by default); it looks like a speedometer, or perhaps a really fast one-handed clock.
Then select Metrics (it's below Alerts and above Logs, in the first grouping).
Then from the "Select a scope" pop-up, select your storage account and press "Apply".
Then on the empty chart there are some drop-downs; the first one will have the scope you applied. The second one, Metric Namespace, should be "Table", and the third one, Metric, should be "Table Capacity". You can leave the last one as Avg -- if you only have one table in your storage account, then the Avg will just be the exact size of that table.
If you want to calculate the average row size, you can do a simple divide -- in my case I did 1.4 GB / 2.5M entities = ~560 bytes
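A programmatic equivalent, as a sketch with the Azure Monitor query SDK; the resource ID is a placeholder, and the metric names TableCapacity and TableEntityCount on the tableServices/default scope are my assumption of what the portal chart reads:

// Sketch: read table capacity and entity count, then derive an average entity size.
using System;
using System.Linq;
using Azure.Identity;
using Azure.Monitor.Query;

var client = new MetricsQueryClient(new DefaultAzureCredential());
var result = client.QueryResource(
    "<storage-account-resource-id>/tableServices/default",
    new[] { "TableCapacity", "TableEntityCount" }).Value;

double capacity = result.Metrics[0].TimeSeries[0].Values.LastOrDefault(v => v.Average.HasValue)?.Average ?? 0;
double entities = result.Metrics[1].TimeSeries[0].Values.LastOrDefault(v => v.Average.HasValue)?.Average ?? 0;
if (entities > 0)
    Console.WriteLine($"~{capacity / entities:F0} bytes per entity");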

Access times for Windows Azure storage tables

My company is interested in using Azure storage tables. They have asked me to look into access times, but so far I have not found any information on this. I have a few questions that perhaps someone here could help answer.
Any information / links or anything on the read / write access times of Azure table storage
If I use a partition key and row key for direct access, does read time increase with the number of fields?
Is anyone aware of future plans for Azure storage, such as a decrease in price, an increase in access speed, the ability to index, or an increase in the storage size per row?
Storage is, I understand, 1 MB per row. Does this include space for the field names? I assume it does.
Is there any way to determine how much space is used by a row in Azure storage? Any API for this?
Hope someone can help answer even one or two of these questions.
PLEASE note this question only applies to TABLE STORAGE.
Thanks
Microsoft has a blog post about scalability targets.
For actual storage per row, here's an excerpt from that post:
Entity (Row) – Entities (an entity is analogous to a "row") are the basic data items stored in a table. An entity contains a set of properties. Each table has two properties, "PartitionKey and RowKey", which form the unique key for the entity. An entity can hold up to 255 properties. Combined size of all of the properties in an entity cannot exceed 1MB. This size includes the size of the property names as well as the size of the property values or their types.
You should see performance around 500 transactions per second, on a given partition.
I know of no plans to reduce storage cost. It's currently at $0.15 / GB / month.
You can optimize table storage write speed by combining writes within a single partition - this is an entity group transaction. See here for more detail.
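Below is a minimal sketch of such an entity group transaction with the classic table SDK; the connection string, table name and keys are placeholders. All entities in one batch must share the same PartitionKey, and a batch is limited to 100 operations:

// Sketch: batch (entity group transaction) insert into a single partition.
using Microsoft.Azure.Cosmos.Table;

var table = CloudStorageAccount.Parse("<connection-string>")
    .CreateCloudTableClient().GetTableReference("<table-name>");

var batch = new TableBatchOperation();
for (var i = 0; i < 100; i++)                      // max 100 operations per batch
    batch.Insert(new DynamicTableEntity("partition-1", $"row-{i:D3}"));

table.ExecuteBatch(batch);                         // executed as a single atomic transaction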
To add to David's answer: the Microsoft Extreme Computing Group has a pretty comprehensive series of performance benchmarks on all things Azure, including Azure tables.
From the above benchmarks (under read latency):
Entity size does not significantly affect the latencies
So I wouldn't be overly concerned about adding more properties.
Secondary indexes on Azure Tables have come up as a requested feature since the service was first released, and at one point it was even talked about as if it were going to be in an upcoming release. MS has since gone very quiet about it. I understand that MS is working on it (or at the very least thinking very hard about it), but there is no time frame for when/if it will be released.

Are there any limits on the number of Azure Storage Tables allowed in one account?

I'm currently trying to store a fairly large and dynamic data set.
My current design is tending towards a solution where I will create a new table every few minutes - this means every table will be quite compact, it will be easy for me to search my data (I don't need everything in one table) and it should make it easy for me to delete stale data.
I've looked and I can't see any documented limits - but I wanted to check:
Is there any limit on the number of tables allowed within one Azure storage account?
Or can I keep adding potentially thousands of tables without any concern?
There are no published limits to the number of tables, only the 500 TB limit on a given storage account. Combined with partition+row, it sounds like you'll have a direct link to your data without running into any table-scan issues.
This MSDN article explicitly calls out: "You can create any number of tables within a given storage account, as long as each table is uniquely named." Have fun!
