Microsoft Cosmos DB includes DocumentDB API, Table API and others. I have about ~ 10 TB of data and would like to have a fast key-value lookup (very little updating and writing, mostly are reading). Add a link for Microsoft Cosmos DB:
https://learn.microsoft.com/en-us/azure/cosmos-db/
So how should I choose between DocumentDB API and Table API?
Or when should I choose DocumentDB API? When should I choose Table API?
Is it a good practice to use DcoumentDB API to store 10 TB of data?
The Azure Cosmos DB Table API was introduced to make Cosmos DB and its advanced indexing, geo-distribution, etc. features available to the Azure Table storage community. The idea is that someone using Azure Table storage who needs more advanced features only offered by Cosmos DB can literally just change their connection string and their existing code will work with Cosmos DB.
But if you are a greenfield customer then I would recommend using SQL API (formerly called Document DB API) which is a super set of Table API. We are constantly investing in providing more advanced features and capabilities to SQL API where as for Table API we are just looking to maintain compatibility with Azure Table storage's API which hasn't changed in many years.
How much data you have doesn't have any affect on what API you choose. They both have the same multi-model infrastructure and can handle the same sizes of data, query loads, distribution, etc.
So how should I choose between DocumentDB API and Table API?
Choosing between DocumentDB API and Table API will primarily depend on the kind of data that you're going to store. DocumentDB API provides a schema-less JSON database engine with SQL querying capabilities whereas Table API provides a key-value storage database service. Since you mentioned that your data is key-value based, recommended is that you use Table API.
Or when should I choose DocumentDB API? When should I choose Table API?
Same as above.
Is it a good practice to use DcoumentDB API to store 10 TB of data?
Both Document DB API and Table API are designed to store huge amounts of data.
However you may want to look into Azure Table Storage as well. Cosmos DB lets you fine tune the throughput that you need and robust indexing/querying support and that comes at a price. Azure Tables on the other hand comes with fixed throughput and limited indexing/querying support and is extremely cheap compared to Cosmos DB.
You may find this link helpful to explore more about Cosmos DB: https://learn.microsoft.com/en-us/azure/cosmos-db/introduction.
Please don't flag this as off-topic.
It might help for you to know in advance: if you are considering the document interface, then in fact there is a case-insensitivity that can affect how DataContract classes (and I believe all others) are transformed to and from Cosmos.
In the linked discussion below, you will see that there is a case insensitivity in Newtonsoft.Json that can have effects on your handling of objects that you pass or get directly from the API. Not that Cosmos has ANY flaws, and in fact it is totally excellent. But with a document API, you might (like me) start to simply pass DataContract objects into Cosmos (which is obviously not wrong, and in fact very much expected from the object API), but there are some serializer and naming strategy handler options that you are probably better of at least being aware of up front.
So just to add a note for you to be aware of this behavior with an object interface. The discussion is here on GitHub:
https://github.com/JamesNK/Newtonsoft.Json/issues/815
I am a new to azure.Could any one help me what is table storage in Azure and how can I do table storage deployment through VSTS?Please share your thoughts and what steps involved in this and which plugin/task I can use in VSTS to perform this?
About Azure Table storage, you can refer to this article: Azure Table storage overview.
Regarding Azure table storage with VSTS, you can manage azure tables and table entities through Azure PowerShell task.
Azure Table storage stores large amounts of structured data. The service is a NoSQL datastore which accepts authenticated calls from inside and outside the Azure cloud. Azure tables are ideal for storing structured, non-relational data. Common uses of Table storage include:
Storing TBs of structured data capable of serving web scale
applications
Storing datasets that don't require complex joins, foreign keys, or
stored procedures and can be denormalized for fast access
Quickly querying data using a clustered index
Accessing data using the OData protocol and LINQ queries with WCF
Data Service .NET Libraries
You can use Table storage to store and query huge sets of structured, non-relational data, and your tables will scale as demand increases.
You’ll have to install Azure Storage Client Library for .NET to work with Azure Storage.
For more details, refer to the documentations Get started with Azure Table storage using .NET and Get started with Azure table storage and Visual Studio Connected Services (ASP.NET) incase if you haven't checked earlier.
First time using Azure. I have a basic node js bot built with Microsofts Bot Framework, and deployed on Azure. What are my options for storage?
I will most likely just be needing simple key:value storage. Mongodb was my first though but I dont think Azure supports it nativeley.
That said, what are my options for storage on Azure? I usual shy away from MySQL just from preference, but theres no actual reason that wouldnt work either.
Take a look at Azure Table Storage for a NoSql solution
Table storage is a key/attribute store with a schemaless design. Because Table storage is schemaless, it's easy to adapt your data as the needs of your application evolve. Access to data is fast and cost-effective for all kinds of applications. Table storage is typically significantly lower in cost than traditional SQL for similar volumes of data.
I have read here and on other post and forums that the best place to save session state in Azure is AppFabric Cache, but I find that very expensive and would like to give a go to either table storage or a SQL database.
I read that a SQL database will be faster but I can't understand why it would be. Surely the SQL database will cache hot data in memory, but I would expect Table Storage to also do that (does it?). Otherwise I don't see why a SQL database would be faster at retrieving a row than Table Storage, in the end both are just retrieving data from a local disk based on a key. I would even expect that because Table Storage scales up well and automatically (vs a SQL databases that needs to be partitioned manually), it would be preferable as session state isn't a good candidate for local caching.
Does anyone have any experience or opinion on this?
thanks
Charles
You mentioned AppFabric Cache, which is a retired service. Regarding SQL vs Table: There isn't really a right answer to this. If you want to spin up a SQL Database instance (running about $2.50 monthly for a Basic-tier database), you'll have 2GB to work with. With Table storage, you'll pay about $0.15 for the same storage. Then there is Redis cache, your own cache (such as memcached), Azure Managed Cache service, etc. Performance-wise, you'd need to do some benchmarking to see how each performs. Any of these would work with Virtual Machines, Cloud Services (web/worker roles), and Web Sites, as they all have very well-defined APIs and, if using ASP.NET MVC, good provider support. Each has different capacity limits and different pricing.
One thing with Table storage: each entity (row) is limited to 1MB, so if you're attempting to cache > 1MB per cache entry, you'll need to consider another option.
#Gaurav mentioned in-role cache. This is a great way to use extra memory in your web/worker role instances. However: It's limited to web/worker Cloud Services; it doesn't help with Web Sites or Virtual Machines. For those, you really need some type of independent cache provider.
I am at the planning stage of a web application that will be hosted in Azure with ASP.NET for the web site and Silverlight within the site for a rich user experience. Should I use Azure Tables or SQL Azure for storing my application data?
Azure Table Storage appears to be less expensive than SQL Azure. It is also more highly scalable than SQL Azure.
SQL Azure is easier to work with if you've been doing a lot of relational database work. If you were porting an application that was already using a SQL database, then moving it to SQL Azure would be the obvious choice, but that's the only situation where I would recommend it.
The main limitation on Azure Tables is the lack of secondary indexes. This was announced at PDC '09 and is currently listed as coming soon, but there hasn't been any time-frame announcement. (See http://windowsazure.uservoice.com/forums/34192-windows-azure-feature-voting/suggestions/396314-support-secondary-indexes?ref=title)
I've seen the proposed use of a hybrid system where you use table and blob storage for the bulk of your data, but use SQL Azure for indexes, searching and filtering. However, I haven't had a chance to try that solution yet myself.
Once the secondary indexes are added to table storage, it will essentially be a cloud based NoSQL system and will be much more useful than it is now.
Despite similar names SQL Azure Tables and Table Storage have very little in common.
Here are a two links that might help you:
Table Storage, a 100x cost factor
Fat Entities on Table Storage
Basically, the first question should wonder about is Does my app really need to scale? If not, then go for SQL Azure.
For those trying to decide between the two options, be sure to factor reporting requirements into the equation. SQL Azure Reporting and other reporting products support SQL Azure out of the box. If you need to generate complex or flexible reports, you'll probably want to avoid Table Storage.
Azure tables are cheaper, simpler and scale better than SQL Azure. SQL Azure is a managed SQL environment, multi-tenant in nature, so you should analyze if your performance requirements are fit for SQL Azure. A premium version of SQL Azure has been announced and is in preview as of this writing (see HERE).
I think the decisive factors to decide between SQL Azure and Azure tables are the following:
Do you need to do complex joins and use secondary indexes? If yes, SQL Azure is the best option.
Do you need stored procedures? If yes, SQL Azure.
Do you need auto-scaling capabilities? Azure tables is the best option.
Rows within an Azure table cannot exceed 4MB in size. If you need to store large data within a row, it is better to store it in blob storage and reference the blob's URI in the table row.
Do you need to store massive amounts of semi-structured data? If yes, Azure tables are advantageous.
Although Azure tables are tremendously beneficial in terms of simplicity and cost, there are some limitations that need to be taken into account. Please see HERE for some initial guidance.
One other consideration is latency. There used to be a site that Microsoft ran with microbenchmarks on throughput and latency of various object sizes with table store and SQL Azure. Since that site's no longer available, I'll just give you a rough approximation from what I recall. Table store tends to have much higher throughput than SQL Azure. SQL Azure tends to have lower latency (by as much as 1/5th).
It's already been mentioned that table store is easy to scale. However, SQL Azure can scale as well with Federations. Note that Federations (effectively sharding) adds a lot of complexity to your application. I'm also not sure how much Federations affects performance, but I imagine there's some overhead.
If business continuity is a priority, consider that with Azure Storage you get cheap geo-replication by default. With SQL Azure, you can accomplish something similar but with more effort with SQL Data Sync. Note that SQL Data Sync also incurs performance overhead since it requires triggers on all of your tables to watch for data changes.
I realize this is an old question, but still a very valid one, so I'm adding my reply to it.
CoderDennis and others have pointed out some of the facts - Azure Tables is cheaper, and Azure Tables can be much larger, more efficient etc. If you are 100% sure you will stick with Azure, go with Tables.
However this assumes you have already decided on Azure. By using Azure Tables, you are locking yourself into the Azure platform. It means writing code very specific to Azure Tables that is not just going to port over to Amazon, you will have to rewrite those areas of your code. On the other hand programming for a SQL database with LINQ will port over much more easily to another cloud service.
This may not be an issue if you've already decided on your cloud platform.
I suggest looking at Azure Cache in combination with Azure Table. Table alone has 200-300ms latencies, with occasional spikes higher, which can significantly slow down response times / UI interactivity. Cache + Table seems to be a winning combination, for me.
For your question, I want to talk about how to decide with logic choose SQL Table and which need to use Azure Table.
As we know SQL Table is a relational database engine. but if you have a big data in one table the SQL Table is not applicable, because SQL query get big data is slow.
At this time you can choose Azure Table, the Azure Table query is so fast then SQL Table for big data, for example, in our website, someone subscribed many articles, we make the article as feed to user, every user have a copy of article title and description, so in the article table there are lots of data, if we use SQL Table, each query execution maybe take more than 30 seconds. But in Azure Table get users article feed by PartitionKey and RowKey is so fast.
From this example you may know how to choose between in SQL Table and Azure Table.
I wonder whether we are going to end up with some "vendor independent" cloud api libraries in due course?
I think that you have first to define what your application usage funnels are. Will your data model be subjected to frequent changes or it is a stable one? You have to be able to perform ultra fast inserts and reads are not so complicated? Do you need advance google like search? Storing BLOBS?
Those are the questions (and not only) that you have to ask and answer yourself in order to decide if you are more likely going to use NoSql or SQL approach in storing your data.
Please consider that both approaches can easily coexist and can be extended with BLOB storage as well.
Both Azure Tables and SQL Azure are two different beasts.Both are meant for different scenarios, one con to azure table is that you cannot move from azure to any other platform, unless you write providers in your code that can handle such shifts.