How to identify and manage per-user cost in Azure

I am developing a website that uses Azure AD B2C and Azure Storage (Blobs, Tables, Queues and File Shares), among other services. I want to restrict user transactions, say file uploads/downloads, to a certain number of gigabytes, and then show users a message that their quota is used up for the month.
Is it possible to keep track of individual B2C customers in Azure as the website owner? What is the best approach available to handle this?
Thanks in Advance,
Murthy

Azure Storage doesn't have any built-in feature to restrict a customer's consumption.
The only way to meet your need is to implement it yourself, in whatever language Azure supports.
In brief, the logic could be:
Create a table with the customers' information.
Set a limit for every user. Write a function that automatically updates usage and remaining storage, storing the usage value and the remaining-storage value in the table. Below, a Last field represents the remaining storage.
When the upload API is called, compare the file size with the customer's remaining storage. If the Last value exceeds the size of the file to be uploaded by at least ~10 KB, allow the upload; otherwise, deny the request.
If the upload succeeds, record the file size whenever the customer uploads/downloads a file and update the table.
The table could look like the sketch below (just an example; adjust it to your needs).
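For illustration, here is a minimal Python sketch of that logic using the azure-data-tables SDK. The Quota table and its Limit, Usage and Last fields are assumed names, not anything Azure provides:

```python
# Sketch only: assumes a "Quota" table with one entity per customer carrying
# assumed fields Limit, Usage and Last (remaining bytes).
from azure.data.tables import TableClient, UpdateMode

CONN_STR = "<storage-connection-string>"   # placeholder
HEADROOM = 10 * 1024                       # ~10 KB of slack, per the steps above

quota_table = TableClient.from_connection_string(CONN_STR, table_name="Quota")

def try_upload(customer_id: str, file_size: int) -> bool:
    """Return True if the customer still has room for this file."""
    entity = quota_table.get_entity(partition_key="customer", row_key=customer_id)
    if entity["Last"] < file_size + HEADROOM:
        return False                       # quota exceeded: deny the request
    # ... perform the actual blob upload here ...
    entity["Usage"] = entity["Usage"] + file_size
    entity["Last"] = entity["Limit"] - entity["Usage"]
    quota_table.update_entity(entity, mode=UpdateMode.MERGE)
    return True
```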

Related

How to store (and query) the MaxMind GeoIP2 database in Azure?

In an Azure Web App I need to efficiently query the MaxMind GeoIP2 City Database (due to the volume of queries and the latency requirements, we cannot use MaxMind's REST API).
I'm wondering what the best approach is for storing the DB (binary MMDB format, accessed via the official .NET API) so that it's easy to update with minimal downtime (we are going to subscribe to monthly updates) and still cost-effective with regard to Azure storage and transactions.
Apparently block blobs are the way to go, but I'm not sure about the monthly updates and the fact that the GeoIP2 API loads the whole DB into memory (I don't know if this would be a problem for the Web App, whether I need a worker to keep it up, or whether I need something else); I also don't know yet how large the file is.
What's the most cost-effective solution that preserves low latency over a huge volume of queries?
According to the API docs you must have the database available in a file system (the API doesn't know anything about Azure storage and related REST API). So, regardless where you permanently store it, you'll need to have it on a disk somewhere.
I have no idea how large the database footprint is, but Web Apps, Cloud Services (web/worker roles) and Virtual Machines (whether Linux or Windows) all have local disks. And you have read/write access to these disks. So, you'd need to copy the database binary file (or csv) to local disk from somewhere. At this point, when you initialize the SDK, you'd create a DatabaseReader and point it to your locally-downloaded copy of the database file.
You mentioned storing the database in blob storage. There's nothing stopping you from doing so and simply downloading a copy to local disk. And there's nothing stopping you from storing multiple versions in multiple blobs. Note: You may also take advantage of Azure File storage (an SMB share). Which you choose is up to you.
As far as most cost effective solution: You'll need to do the pricing workup yourself to see what's most effective. You'd also need to evaluate how much RAM is available for the given size VM/role instance/Web App you choose. You mentioned Web Apps in your question: Web App instances scale from 0.5GB to 14GB, depending on the tier you choose (again, you'll need to evaluate this).
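As a rough illustration of that flow (the question is about the .NET API, but the shape is the same in Python): download the .mmdb blob to local disk at startup, then open it with the MaxMind reader. The container and blob names here are assumptions.

```python
# Sketch: copy the MMDB from blob storage to the instance's local disk, then open it.
import geoip2.database
from azure.storage.blob import BlobClient

CONN_STR = "<storage-connection-string>"          # placeholder
LOCAL_PATH = "/tmp/GeoLite2-City.mmdb"            # instance-local disk

blob = BlobClient.from_connection_string(
    CONN_STR, container_name="geoip", blob_name="GeoLite2-City.mmdb")
with open(LOCAL_PATH, "wb") as f:
    f.write(blob.download_blob().readall())       # re-run when the monthly update lands

reader = geoip2.database.Reader(LOCAL_PATH)       # loads/maps the DB locally
city = reader.city("203.0.113.7").city.name       # example lookup
```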

Azure addon - accessing WADPerformanceCountersTable?

If I write an Azure addon, can it access the WADPerformanceCountersTable table (of the business application that provisioned this addon)? Especially in terms of security/permissions.
E.g. say I wanted my addon to monitor some performance counters and send an email alert if they pass some thresholds (regardless of whether there are already such commercial products, I'm just interested in the technical capability). What will I have to do? I'm guessing WADPerformanceCountersTable isn't publicly exposed to the entire world - so how can I make it accessible to my addon?
thanks very much
WADPerformanceCountersTable is no different from other Azure tables; it's stored in the storage account defined by Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString in the configuration file. You will need that storage account's name/key pair to read from this table.
FYI, here is an article about how to effectively fetch performance counter data from this table: http://gauravmantri.com/2012/02/17/effective-way-of-fetching-diagnostics-data-from-windows-azure-diagnostics-table-hint-use-partitionkey/
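For illustration, a hedged Python sketch of the PartitionKey trick from that article: the WAD tables use a PartitionKey of "0" followed by the UTC tick count, so you can range-scan by time instead of scanning the whole table. The connection string and the printed column names are assumptions here.

```python
# Sketch: query WADPerformanceCountersTable by time range using its PartitionKey format.
from datetime import datetime, timedelta, timezone
from azure.data.tables import TableClient

CONN_STR = "<diagnostics-storage-connection-string>"   # the Diagnostics.ConnectionString account

def ticks(dt: datetime) -> int:
    # .NET ticks: 100-nanosecond intervals since 0001-01-01
    return int((dt - datetime(1, 1, 1, tzinfo=timezone.utc)).total_seconds() * 10_000_000)

table = TableClient.from_connection_string(
    CONN_STR, table_name="WADPerformanceCountersTable")

since = datetime.now(timezone.utc) - timedelta(minutes=15)
# PartitionKey is "0" + ticks, so a string range filter selects recent rows only.
for row in table.query_entities(f"PartitionKey ge '0{ticks(since)}'"):
    print(row["CounterName"], row["CounterValue"])     # assumed column names
```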

Billing a customer for level of services used on azure

I am building a service on Azure and want to know if there is any way to find out how many resources (data downloaded or uploaded, time required for processing) a customer has used in a given session, and what level of services they have used, in order to bill them accordingly. We expose the whole framework as a service; it consists of various smaller services, such as reading data from an external FTP server, downloading it to a blob, reading the downloaded file and storing its contents in tables, performing some operations on the data in the tables, emailing some results from the service as required by the user, etc.
So, depending on which services the customer has used, we would like to bill them accordingly.
Thanks!
The only Azure-specific feature I can think of that will help with what you want to track is Azure Storage Logging, which logs each and every request to Azure Storage; I'm not sure how much that is going to help you, though.
I think you will have to decide exactly what you want to bill your customers for and then start tracking it yourself. This might be items similar to what MS charges you for (the size of incoming requests, the number of transactions, and the amount of data stored in Azure Storage), or just some arbitrary values based on some of this information.
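As an illustration of "tracking it yourself", here is a minimal sketch that accumulates per-customer counters in a table of your own. The Usage table and its field names are made up for the example:

```python
# Sketch: accumulate per-customer usage, keyed by customer (PartitionKey)
# and billing month (RowKey). Table and field names are assumptions.
from azure.core.exceptions import ResourceNotFoundError
from azure.data.tables import TableServiceClient, UpdateMode

CONN_STR = "<storage-connection-string>"   # placeholder
svc = TableServiceClient.from_connection_string(CONN_STR)
usage = svc.create_table_if_not_exists("Usage")

def record(customer_id: str, month: str, bytes_moved: int, seconds: float):
    try:
        e = usage.get_entity(partition_key=customer_id, row_key=month)
    except ResourceNotFoundError:          # first activity this month
        e = {"PartitionKey": customer_id, "RowKey": month,
             "Transactions": 0, "Bytes": 0, "ComputeSeconds": 0.0}
    e["Transactions"] += 1
    e["Bytes"] += bytes_moved
    e["ComputeSeconds"] += seconds
    usage.upsert_entity(e, mode=UpdateMode.MERGE)

# e.g. after an FTP-to-blob copy step:
record("contoso", "2016-03", bytes_moved=5_242_880, seconds=2.4)
```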

Limitations on Windows Azure Table Storage accounts

I am designing a multi-tenant web-based SaaS application that will be hosted on Windows Azure and use Table Storage.
The only limits I have found so far are:
5 storage accounts per subscription
100 TB maximum per storage account
1 MB per entity
I am deciding how to best partition my storage for multiple customers:
Option 1: Give each customer their own storage account. Not likely, considering the 5 account default limit.
Option 2: Give each customer their own set of tables. Prefix the table names with customer identifiers, such as a Books table split as "CustA_Books", "CustB_Books", etc.
Option 3: Have one set of tables, but prefix the partition keys to split the customers. So one "Books" table with partition keys of "CustA_Fiction", "CustA_NonFiction", "CustB_Fiction", "CustB_NonFiction", etc.
What are the pros and cons for options 2 and 3? Is there a limit to the number of tables in a single account that might affect option 2?
There are no limits to the number of tables you can create in Windows Azure; your only limits are the ones you have already listed. Well... I guess there are other limits if you consider that an entity attribute is always 64 KB or less, or if you consider batch operations (100 entities or 4 MB, whichever is less).
Anyhow, the thing to keep in mind here is that your PartitionKey is going to be the most important thing you make. If you create a PK with the customer name in it, you get some good partitioning benefits. The downside to this is that if you mix the customer data in the same table, you make it harder on yourself to delete data (if you ever need to delete a customer). So, you can use the table as another level of partitioning. The PK you create is scoped to the table you create it under.
What I would consider here is if you ever need to delete the data in bulk or if you ever need to query data across customers (tenants). For the first one, it makes a ton of sense to use separate tables per customer so a delete is one operation versus at best 1 per 100 entities. However, if you need to query across tenants it is harder to join this data when you have multiple tables (that would require multiple queries).
All things being equal, I would use the tables as another level of partitioning if there is no overlap in tenant functionality and make my life easier should I want to delete a tenant. So, I guess that is option 2.
HTH
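For illustration, Option 2 might look roughly like this in code (the names are assumptions, and note that Azure table names cannot contain underscores, so a plain prefix is used):

```python
# Sketch of Option 2: one table per tenant, named with a customer prefix.
from azure.data.tables import TableServiceClient

svc = TableServiceClient.from_connection_string("<storage-connection-string>")

def books_table(customer: str):
    # e.g. "CustA" -> "CustABooks" (table names must be alphanumeric)
    return svc.create_table_if_not_exists(f"{customer}Books")

books_table("CustA").create_entity(
    {"PartitionKey": "Fiction", "RowKey": "978-0141439518", "Title": "Emma"})

# Deleting a tenant is then a single operation instead of batches of 100 entities:
svc.delete_table("CustABooks")
```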
I highly suggest Option 2
We are also going this route because it adds a nice level of federation for the customer data. As the accepted answer mentions, it is easier to manage adding/deleting customers. Another benefit we have noticed is the "copy-ability" of a customer's data. This approach makes it much easier to move customer-specific data to other storage accounts, or to development environments for testing, without affecting the entire lot.
In the SaaS world it also enables customers to get a copy of their own data with little effort, which is also a concern of many SaaS users.
Another alternative:
Imagine you have N storage accounts (the limit is 100 storage accounts per subscription), and each storage account has a table per customer.
For table operations that include a partition key - Insert, Update, Delete, or a point query - calculate the hash of customer name + partition key, take it modulo N (the total number of storage accounts), find the index of the target storage account, and forward the request to the correct storage account/table.
For read requests with no partition key, such as a range query, you would need to broadcast the request to all storage accounts and merge the results.
One other thing to keep in mind, specifically around naming multiple storage accounts: avoid naming the accounts lexicographically, as that will cause them to be served from the same partition server on the Azure backend, which is against the recommended scalability best practices. If you have N storage accounts, prefix each storage account name with a 3-digit hash so they are evenly distributed.
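A sketch of that sharding scheme, assuming you hold the N connection strings in a list; a stable hash of customer + partition key picks the account:

```python
# Sketch: route each request to one of N storage accounts by a stable hash.
import hashlib
from azure.data.tables import TableServiceClient

ACCOUNT_CONN_STRS = [          # N connection strings, one per storage account
    "<conn-str-account-0>",
    "<conn-str-account-1>",
    # ...
]

def shard_for(customer: str, partition_key: str) -> TableServiceClient:
    # Use a stable hash (not Python's randomized hash()) so routing is consistent.
    digest = hashlib.sha1(f"{customer}:{partition_key}".encode()).hexdigest()
    index = int(digest, 16) % len(ACCOUNT_CONN_STRS)
    return TableServiceClient.from_connection_string(ACCOUNT_CONN_STRS[index])

# A point operation goes to exactly one account/table:
svc = shard_for("CustA", "Fiction")
svc.create_table_if_not_exists("CustABooks").upsert_entity(
    {"PartitionKey": "Fiction", "RowKey": "978-0141439518", "Title": "Emma"})

# A range query without a partition key must fan out to every account and merge results.
```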

Architecture design and role communication with Azure in file bound app

I am considering moving my web application to Windows Azure for scalability purposes but I am wondering how best to partition my application.
I expect my scenario is typical and is as follows: my application allows users to upload raw data, this is processed and a report is generated. The user can then review their raw data and view their report.
So far I’m thinking a web role and a worker role. However, I understand that a VHD can be mounted to a single instance with read/write access so really both my web role and worker role need access to a common file store. So perhaps I need a web role and two separate worker roles, one worker role for the processing and the other for reading and writing to a file store. Is this a good approach?
I am having difficulty picturing the plumbing between the roles, and I am concerned about the communication overhead introduced by this partitioning, so I would welcome any input here.
Adding to Stuart's excellent answer: Blobs can store anything, with sizes up to 200GB. If you needed / wanted to persist an entire directory structure that's durable, you can mount a VHD with just a few lines of code. It's an NTFS volume that your app can interact with, just like any other drive.
In your case, a VHD doesn't fit well, because your web app would have to mount the VHD and be the sole writer to it. And if you have more than one web role instance (which you would if you wanted the SLA and wanted to scale), you could only have one writer. In this case, individual blobs fit MUCH better.
As Stuart stated, this is a very normal and common pattern. And again, with only a few lines of code, you can call upon the storage sdk to copy a file from blob storage to your instance's local disk. Then you can process the file using regular File IO operations. When your report is complete, another few lines of code lets you copy your report into a new blob (most likely in a well-known container that the web role knows to look in).
You can take this a step further and insert rows into an Azure table that are partitioned by customer, with row key identifying the individual uploaded file, and a 3rd field representing the URI to the completed report. This makes it trivial for the web app to display a customer's completed reports.
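A tiny sketch of that table layout, with assumed names: PartitionKey = customer, RowKey = uploaded file, plus a ReportUri column pointing at the finished report blob.

```python
# Sketch: one row per uploaded file, pointing at the finished report blob.
from azure.data.tables import TableServiceClient

svc = TableServiceClient.from_connection_string("<storage-connection-string>")
reports = svc.create_table_if_not_exists("Reports")

reports.upsert_entity({
    "PartitionKey": "customer-42",                 # partitioned by customer
    "RowKey": "upload-2016-03-01-0001.csv",        # identifies the raw upload
    "ReportUri": "https://<account>.blob.core.windows.net/reports/0001.pdf",
})

# The web app lists a customer's reports with a single-partition query:
for row in reports.query_entities("PartitionKey eq 'customer-42'"):
    print(row["RowKey"], row["ReportUri"])
```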
Blob storage is the easiest place to store files which lots of roles and role instances can then access - with none of them requiring special access.
The normal pattern suggested seems to be:
allow the raw files to be uploaded using instances of a web role
these web role instances return the HTTP call without doing processing - they store the raw files in blob storage, and add a "do this work message" to a queue.
the worker role instances pick up the message from the queue, read the raw blob, do the work, store the report result, then delete the message from the queue
all the web roles can then access the report when the user asks for it
That's the "normal pattern suggested", and you can see it implemented in things like the photo upload/thumbnail generation apps from the very first Azure PDC - it's also used in this training course - follow through to the second page.
Of course, in practice you may need to build on this pattern depending on the size and type of data you are processing.
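A condensed Python sketch of that blob + queue pattern; the container names, queue name and message contents are assumptions, and the "processing" step is a stand-in:

```python
# Sketch: web tier stores the raw file and enqueues work; worker tier processes it.
from azure.storage.blob import BlobServiceClient
from azure.storage.queue import QueueClient

CONN_STR = "<storage-connection-string>"
blobs = BlobServiceClient.from_connection_string(CONN_STR)
queue = QueueClient.from_connection_string(CONN_STR, "work-items")

def handle_upload(name: str, data: bytes):
    # Web role instance: persist the raw file, then enqueue a "do this work" message.
    blobs.get_blob_client("raw-uploads", name).upload_blob(data, overwrite=True)
    queue.send_message(name)

def worker_loop():
    # Worker role instance: read the raw blob, build the report, store it,
    # then delete the message so it isn't processed twice.
    for msg in queue.receive_messages(visibility_timeout=300):
        raw = blobs.get_blob_client("raw-uploads", msg.content).download_blob().readall()
        report = raw.upper()                        # stand-in for the real processing
        blobs.get_blob_client("reports", msg.content + ".report").upload_blob(
            report, overwrite=True)
        queue.delete_message(msg)
```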
