Selecting a partition key is a simple but important design choice in Azure Cosmos DB, because it affects both performance and cost (RUs). Azure Cosmos DB does not allow us to change the partition key of an existing container, so it is very important to select the right one.
I have gone through the Microsoft documentation, but I am still unsure how to choose a partition key.
Below is the item structure I am planning to create:
{
"id": "unique id like UUID", # just to keep some unique ID for item
"file_location": "/videos/news/finance/category/sharemarket/it-sectors/semiconductors/nvidia.mp4", # This value sometimes contains special characters such as spaces, dollar signs, capital letters and many more
"createatedby": "andrew",
"ts": "2022-01-10 16:07:25.773000",
"directory_location": "/videos/news/finance/category/sharemarket/it-sectors/semiconductors/",
"metadata": [
{
"codec": "apple",
"date_created": "2020-07-23 05:42:37",
"date_modified": "2020-07-23 05:42:37",
"format": "mp4",
"internet_media_type": "video/mp4",
"size": "1286011"
}
],
"version_id": "48ad8200-7231-11ec-abda-34519746721"
}
I am using the Azure Cosmos DB SQL API. By default, Azure Cosmos DB takes care of indexing all data, so in the case above all properties are indexed.
For reading items I use the file_location property. Can I make file_location the partition key, or is there anything else to consider?
A few notes:
file_location values contain special characters such as spaces, commas, dollar signs and many more.
A few containers hold 150 million entries and a few hold just 20 million.
My operations are:
many reads, frequent writes as new videos are added, and fewer updates when videos change.
A few things to keep in mind while selecting a partition key:
Look at the query parameters you use when reading data; they give good hints about which properties are partition key candidates.
You mentioned that a few containers hold 150 million documents and a few hold 20 million. What matters is not the number of documents stored in a container but which containers receive the most requests. If a few containers receive too many requests, that is a good indicator of poorly designed partition keys.
Try to distribute the request load as evenly as possible among containers so that it is spread evenly across the physical partitions. Otherwise you will get hot-partition issues and end up working around them by increasing throughput, which costs more.
Try to limit cross-partition queries as much as possible.
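One way to compare candidates before committing is to count how values are spread over a sample. This is only a minimal sketch: the sample items are made up, and `key_distribution` is a hypothetical helper, not part of any SDK. Feeding it a log of read requests instead of stored items would measure request load rather than storage.

```python
from collections import Counter


def key_distribution(items, key_field):
    """Return (distinct_values, largest_share) for a candidate partition key field."""
    counts = Counter(item[key_field] for item in items)
    total = sum(counts.values())
    largest_share = max(counts.values()) / total
    return len(counts), largest_share


# Hypothetical sample shaped like the item in the question.
sample = [
    {"file_location": "/videos/news/a.mp4", "directory_location": "/videos/news/"},
    {"file_location": "/videos/news/b.mp4", "directory_location": "/videos/news/"},
    {"file_location": "/videos/news/c.mp4", "directory_location": "/videos/news/"},
    {"file_location": "/videos/sport/d.mp4", "directory_location": "/videos/sport/"},
]

for field in ("file_location", "directory_location"):
    distinct, share = key_distribution(sample, field)
    print(f"{field}: {distinct} distinct values, largest holds {share:.0%} of items")
```

A field where one value holds most of the items (or requests) is a likely hot partition; a field with many distinct values and a small largest share spreads better.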
By default in Cosmos DB, all properties in documents are indexed, so why should I bother researching the partition key when searches on the index work perfectly well and cost nothing?
I have a Cosmos DB container with one million documents like this one, each containing an array; the partition key is "tankId", e.g.:
{
"id": "67acdb16-80dd-4a6c-a5b0-118d5f5fdb97",
"tankId": "67acdb16-80dd-4a6c-a5b0-118d5f5fdb97",
"UserIds": [
"905336a5-bf96-444f-bb11-3eedb65c3760",
"432270f5-780f-401b-9772-72ec96166be1",
"cfecdf7e-5067-46b1-ab4e-25ca7d597248"
]
}
If I run a query on "UserIds" over this million documents, filtering on a property that is indexed but is not the partition key, it takes only 3.32 RU!!! Wow.
SELECT *
FROM c
WHERE ARRAY_CONTAINS(c.UserIds, "905336a5-bf96-444f-bb11-3eedb65c3760")
Is it good practice to run that kind of query? I am a little worried about my design.
It starts to matter once your number of physical partitions starts growing. Using the partition key allows Cosmos DB to map the query to a logical partition that resides in a single physical partition. The query is then not a so-called 'cross-partition query', and it does not have to check the index of the other physical partitions (which would also consume RUs).
In your case you are talking about a million documents, which likely take up a lot less than 50GB (the maximum size of a physical partition), so everything is stored in the same physical partition. Therefore you won't see any noticeable effect on RU usage.
So, to answer your underlying question of whether you should make any changes: Is your database mostly read heavy? Do you have a property that is often used for querying? Are you sure that your partitions will remain under the logical partition size limit (20GB)? If yes, then you should likely consider it in your design. Even then it will only matter once your data starts to split across physical partitions.
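To get a feel for when that split might happen, a back-of-the-envelope estimate like the one below can help. It is a rough sketch only: the 1 KB average document size is an assumption, and it looks at storage alone, ignoring that provisioned throughput can also drive the number of physical partitions.

```python
import math

PHYSICAL_PARTITION_MAX_GB = 50  # physical partition size limit quoted above


def estimated_physical_partitions(doc_count, avg_doc_kb):
    """Lower-bound estimate of physical partitions needed for storage alone."""
    total_gb = doc_count * avg_doc_kb / (1024 * 1024)
    partitions = max(1, math.ceil(total_gb / PHYSICAL_PARTITION_MAX_GB))
    return partitions, total_gb


# One million documents at an assumed average of 1 KB each.
partitions, total_gb = estimated_physical_partitions(1_000_000, avg_doc_kb=1)
print(f"{total_gb:.2f} GB -> at least {partitions} physical partition(s)")
```

With these assumed numbers the data is under 1 GB, so a single physical partition is enough and the cross-partition cost does not show up yet.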
We would like to store a set of documents in Cosmos DB with a primary key of EventId. These records are evenly distributed across a number of customers. Clients need to access the latest records for a subset of customers as new documents are added. The documents are immutable, and need to be stored indefinitely.
How should we design our partition key and queries to avoid clients all hitting the same partitions and/or high RU usage?
If we use just CustomerId as the partition key, we would eventually run over the 10GB limit for a logical partition, and if we use EventId, then querying becomes inefficient (would result in a cross-partition query, and high RU usage, which we'd like to avoid).
Another idea would be to group documents into blocks. i.e. PartitionKey = int(EventId / PartitionSize). This would result in all clients hitting the latest partition(s), which presumably would result in poor performance and throttling.
If we use a combined PartitionKey of CustomerId and int(EventId / PartitionSize), then it's not clear to me how we would avoid a cross-partition query to retrieve the correct set of documents.
Edit:
Clarification of a couple of points:
Clients will access the events by specifying a list of CustomerIds, the last EventId they received, and a maximum number of records to retrieve.
For this reason, the use of EventId alone won't perform well, as it will result in a cross partition query (i.e. WHERE EventId > LastEventId).
The system will probably be writing on the order of 1GB a day, in 15 minute increments.
It's hard to know what the read volume will be, but I'd guess probably moderate, with maybe a few thousand clients polling the API at regular intervals.
So, first things first: the logical partition size limit has now been increased to 20GB, please see here.
You can use EventId as the partition key as well: there is a limit on the size of a logical partition in GB, but no limit on the number of logical partitions. So using EventId is fine, and you will get a point read, which is very fast, if you look an item up by its EventId. You mention that this way you will have to do cross-partition queries; can you explain how?
A few things to keep in mind, though. Cosmos DB is not really meant for storing this kind of log-based data, as it stores everything on SSDs, so please calculate how large one document is and how many you have to store per second, and from that how much you store per day and per month. You can use TTL to delete data from Cosmos DB when you are done with it; for long-term storage, store it in Azure Blob Storage, and for fast retrieval use Azure Search to query the data in Blob Storage by CustomerId and EventId.
How should we design our partition key and queries to avoid clients all hitting the same partitions and/or high RU usage?
I faced a similar issue some time back and a PartitionKey with customerId + datekey e.g. cust1_20200920 worked well for me.
I created the date key as 20200920 (YYYYMMDD), but you can choose to ignore the date part or even the month (cust1_202009 /cust1_2020), based on your query requirement.
Also, IMO, it is kind of a good thing if multiple known PartitionKeys are available at query time. For example, if you keep YYYYMM as the PartitionKey and want to get data for 4 months, you can run 4 queries in parallel and combine the data, which is faster if you have many clients and these PartitionKeys are distributed among multiple physical partitions.
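A minimal sketch of building such keys, assuming the customer id and a date are available when writing and querying. The helper names `partition_key` and `monthly_keys` are made up for illustration.

```python
from datetime import date


def partition_key(customer_id, day, granularity="day"):
    """Build a synthetic key such as cust1_20200920, cust1_202009 or cust1_2020."""
    fmt = {"day": "%Y%m%d", "month": "%Y%m", "year": "%Y"}[granularity]
    return f"{customer_id}_{day.strftime(fmt)}"


def monthly_keys(customer_id, start, months):
    """List the month-granularity keys covering `months` months from `start`."""
    keys = []
    year, month = start.year, start.month
    for _ in range(months):
        keys.append(partition_key(customer_id, date(year, month, 1), "month"))
        # Roll over to January of the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return keys


print(partition_key("cust1", date(2020, 9, 20)))  # cust1_20200920
print(monthly_keys("cust1", date(2020, 11, 1), 4))
```

Each key returned by `monthly_keys` can then be used for its own single-partition query, and those queries can run in parallel.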
On a separate note, Cosmos DB has recently introduced an analytical store for transactional data, which can be useful for your use case.
More about it here - https://learn.microsoft.com/en-us/azure/cosmos-db/analytical-store-introduction
One approach is using multiple Cosmos containers as "hot/cold" tiers with different partitioning. We could use two containers:
Recent: all writes and all queries for recent items go here. Partitioned by CustomerId.
Archive: all items are copied here for long term storage and access. Partitioned by CustomerId + timespan (e.g. partition per calendar month)
The Recent container would provide single partition queries by customer. Data growth per partition would be limited either by setting reasonable TTL during creation, or using a separate maintenance job (perhaps Azure Function on timer) to delete items when they are no longer candidates for recent-item queries.
A Change Feed processor, implemented by an Azure Function or otherwise, would trigger on each creation in Recent and make a copy into Archive. This copy would have partition key combining the customer ID and date range as appropriate to limit the partition size.
This scheme should provide efficient recent-item queries from Recent and safe long-term storage in Archive, with reasonable Archive query efficiency given a desired date range. The main downside is two writes for each item (one for each container) -- but that's the tradeoff for efficient polling. Whether this tradeoff is worthwhile is probably best determined by simulating the load and observing performance.
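The copy step could look roughly like the sketch below. It only shows the transformation and leaves out the actual change feed wiring: the field names `CustomerId`, `Timestamp` and `ArchiveKey` are assumptions, and `archive_writer` stands in for whatever call writes to the Archive container.

```python
from datetime import datetime


def to_archive_item(recent_item):
    """Copy an item from Recent and add the Archive partition key (customer + month)."""
    created = datetime.fromisoformat(recent_item["Timestamp"])
    archived = dict(recent_item)
    archived["ArchiveKey"] = f"{recent_item['CustomerId']}_{created:%Y-%m}"
    return archived


def handle_changes(changed_items, archive_writer):
    """Process one change feed batch: write one Archive copy per new Recent item."""
    for item in changed_items:
        archive_writer(to_archive_item(item))


# Usage with an in-memory list standing in for the Archive container.
archive = []
handle_changes(
    [{"id": "1", "CustomerId": "cust1", "Timestamp": "2020-09-20T10:15:00"}],
    archive.append,
)
print(archive[0]["ArchiveKey"])  # cust1_2020-09
```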
We’re using Cosmos DB in production to store HTTP request/response audit data. The structure of this data generally looks as follows:
{
"id": "5ff4c51d3a7a47c0b5697520ae024769",
"Timestamp": "2019-06-27T10:08:03.2123924+00:00",
"Source": "Microservice",
"Origin": "Client",
"User": "SOME-USER",
"Uri": "GET /some/url",
"NormalizedUri": "GET /SOME/URL",
"UserAgent": "okhttp/3.10.0",
"Client": "0.XX.0-ssffgg;8.1.0;samsung;SM-G390F",
"ClientAppVersion": "XX-ssffgg",
"ClientAndroidVersion": "8.1.0",
"ClientManufacturer": "samsung",
"ClientModel": "SM-G390F",
"ResponseCode": "OK",
"TrackingId": "739f22d01987470591556468213651e9",
"Response": "[ REDACTED ]", <— Usually quite long (thousands of chars)
"PartitionKey": 45,
"InstanceVersion": 1,
"_rid": "TIFzALOuulIEAAAAAACACA==",
"_self": "dbs/TIFzAA==/colls/TIFzALOuulI=/docs/TIFzALOuulIEAAAAAACACA==/",
"_etag": "\"0d00c779-0000-0d00-0000-5d1495830000\"",
"_attachments": "attachments/",
"_ts": 1561630083
}
We’re currently writing around 150,000 - 200,000 documents similar to the above per day, with /PartitionKey as the partition key path configured on the container. The value of PartitionKey is a random number between 0 and 999, generated in C#/.NET.
However, we are seeing daily hotspots where a single physical partition can hit a max of 2.5K - 4.5K RU/s while others are very low (around 200 RU/s). This has knock-on cost implications, as we need to provision throughput for our most heavily utilised partition.
The second factor is that we're storing a fair bit of data, close to 1TB of documents, and we add a few GB each day. As a result we currently have around 40 physical partitions.
Combining these two factors means we end up having to provision, at minimum, somewhere between 120,000 and 184,000 RU/s.
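The shape of that calculation, as a rough sketch: provisioned throughput is divided evenly across physical partitions, so the container needs roughly the hottest partition's peak multiplied by the partition count. The peak values below are the 2.5K and 4.5K figures mentioned above; the exact range quoted in this post comes from the observed metrics and is not reproduced here.

```python
def required_provisioned_rus(physical_partitions, hottest_partition_rus):
    """Container throughput needed so each partition gets the hottest partition's share."""
    return physical_partitions * hottest_partition_rus


for peak in (2_500, 4_500):
    total = required_provisioned_rus(40, peak)
    print(f"peak {peak} RU/s on one partition -> provision about {total:,} RU/s")
```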
I should mention that we barely ever need to query this data, apart from very occasional ad-hoc, manually constructed queries in the Cosmos DB Data Explorer.
My question is... would we be a lot better off in terms of RU/s required and distribution of data by simply using the “id” column as our partition key (or a randomly generated GUID) - and then setting a sensible TTL so we don't have a continually growing dataset?
I understand this would require us to re-create the collection.
Thanks very much.
[Image: Max throughput per physical partition]
While using the id or a GUID would give you better cardinality than the random number you have today, any query you run would be very expensive, as it would always be cross-partition and over a huge amount of data.
I think a better choice would be a synthetic key that combines multiple properties which both have high cardinality and are used to query the data. You can learn more about these here: https://learn.microsoft.com/en-us/azure/cosmos-db/synthetic-partition-keys
As for TTL, I would definitely set it to whatever retention you need for this data. Cosmos DB expires TTL'd data using leftover throughput, so it will never get in the way.
Lastly, you should also consider (if you haven't already) using a custom indexing policy that excludes any paths which are never queried, especially the "Response" property, since you say it is thousands of characters long. This can save considerable RU/s in write-heavy scenarios like yours.
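As an illustration, an indexing policy that keeps everything indexed except the long Response body might look like the following, written here as a Python dict and printed as JSON. Which paths to exclude is an assumption; it depends on what you actually query.

```python
import json

# Assumed policy: index every path except the long Response string.
indexing_policy = {
    "indexingMode": "consistent",
    "automatic": True,
    "includedPaths": [{"path": "/*"}],
    "excludedPaths": [
        {"path": "/Response/?"},    # scalar property that is never queried
        {"path": "/\"_etag\"/?"},   # excluded in the default policy as well
    ],
}

print(json.dumps(indexing_policy, indent=2))
```

If you only ever filter on a couple of properties, the reverse approach (exclude "/*" and include just those paths) may save even more on writes.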
In my experience, Cosmos DB performance tends to degrade as data grows. More data means more physical partitions, so you need more total throughput to keep each of them adequately provisioned. We are currently starting to archive old data into Blob Storage to avoid this kind of problem and keep the number of physical partitions from growing. We use Cosmos DB as hot storage, and old data goes to Blob Storage as cold storage. This lets us reduce the RUs allocated to each physical partition and save money.
I'm creating a logging system to monitor our (200-ish) main application installations, and Cosmos DB seems like a good fit due to the amount of data we'll collect and because it allows a varying schema for the log data (particularly the Tags array - see document schema below).
But, never having used Cosmos DB before, I'm slightly unsure what to use for my partition key.
If I partitioned by CustomerId, there would likely be several GB of data in each of the 200 partitions, and the data will usually be queried by CustomerId, so this was my first choice for the partition key.
However, I was planning to have a 'log stream' view in the logging system, showing logs coming in for all customers.
Would this lead to running a horribly slow / expensive cross partition query?
If so, is there an obvious way to avoid / limit the cost & speed implications of this cross partition querying? (Other than just taking out the log stream view for all customers!)
{
"CustomerId": "be806507-7cc4-4db4-881b",
"CustomerName": "Our Customer",
"SystemArea": 1,
"SystemAreaName": "ExchangeSync",
"Message": "Updated OK",
"Details": "",
"LogLevel": 2,
"Timestamp": "2018-11-23T10:59:29.7548888+00:00",
"Tags": {
"appointmentId": "109654",
"appointmentGroupId": "86675",
"exchangeId": "AAMkA",
"exchangeAlias": "customer.name#customer.com"
}
}
(Note - There isn't a defined list of SystemArea types we'll use yet, but it would be a lot fewer than the 200 customers)
Cross-partition queries should be avoided as much as possible. If your queries are likely to filter by customer ID, then CustomerId is a good logical partition key. However, keep in mind that there is a limit on the data per logical partition (10GB when this was written, since raised to 20GB).
A cross-partition query across the whole database will be a very slow and very expensive operation, but if it's not critical functionality and is only used for infrequent reporting, it's not too much of a problem.
I'm setting up our first Azure Cosmos DB. Into the first collection I will be importing the data from a table in one of our SQL Server databases. In setting up the collection, I'm having trouble understanding the meaning of, and the requirements around, the partition key, which I specifically have to name while setting up this initial collection.
I've read the documentation here: (https://learn.microsoft.com/en-us/azure/cosmos-db/documentdb-partition-data) and still am unsure how to proceed with the naming convention of this partition key.
Can someone help me understand how I should be thinking in naming this partition key? See the screenshot below for the field I'm trying to fill in.
In case it helps, the table I'm importing consists of 7 columns, including a unique primary key, a column of unstructured text, a column of URLs and several other secondary identifiers for that record's URL. Not sure if any of that information has any bearing on how I should name my Partition Key.
EDIT: I've added a screenshot of several records from the table from which I'm importing, per request from @Porschiey.
Honestly, the video here* was a MAJOR help in understanding partitioning in Cosmos DB.
But, in a nutshell:
The PartitionKey is a property that will exist on every single object that is best used to group similar objects together.
Good examples include Location (like City), Customer Id, Team, and more. Naturally, it wildly depends on your solution; so perhaps if you were to post what your object looks like we could recommend a good partition key.
EDIT: It should be noted that a PartitionKey isn't required for collections under 10GB. (thanks David Makogon)
* The video used to live on this MS docs page entitled, "Partitioning and horizontal scaling in Azure Cosmos DB", but has since been removed. A direct link has been provided, above.
The partition key defines a logical partition.
Now, what is a logical partition, you may ask? What makes a good logical partition depends on your requirements. Suppose your data can be categorized by customer: the customer "Id" then acts as the logical partition, and the data for each customer is placed according to its customer Id.
What effect does this have on the query?
When querying, you pass the partition key through the feed options instead of including it in your filter.
e.g. if your query was
SELECT * FROM T WHERE T.CustomerId = 'CustomerId';
it will now be
// Scope the query to one logical partition through FeedOptions instead of a WHERE filter.
var options = new FeedOptions { PartitionKey = new PartitionKey(CustomerId) };
var query = _client.CreateDocumentQuery(CollectionUri, "SELECT * FROM T", options).AsDocumentQuery();
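For comparison, the same idea with the Python SDK (azure-cosmos), as a hedged sketch. `query_items` with a `partition_key` argument is the real SDK method; `FakeContainer` is a made-up in-memory stand-in so the example runs without an account. Keeping the filter in the query text as well does no harm.

```python
class FakeContainer:
    """In-memory stand-in so the sketch runs without a Cosmos DB account."""

    def __init__(self, items):
        self.items = items

    def query_items(self, query, parameters, partition_key):
        # Only the partition scoping is imitated here; the SQL text is ignored.
        return iter(i for i in self.items if i["CustomerId"] == partition_key)


def query_customer(container, customer_id):
    """Query a single logical partition by passing partition_key to the SDK call."""
    return list(container.query_items(
        query="SELECT * FROM T WHERE T.CustomerId = @customerId",
        parameters=[{"name": "@customerId", "value": customer_id}],
        partition_key=customer_id,
    ))


container = FakeContainer([
    {"id": "1", "CustomerId": "c1"},
    {"id": "2", "CustomerId": "c2"},
])
print(query_customer(container, "c1"))
```

With a real container object in place of `FakeContainer`, the call stays within one logical partition instead of fanning out.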
I've put together a detailed article here: Azure Cosmos DB. Partitioning.
What is a logical partition?
Cosmos DB is designed to scale horizontally by distributing data between physical partitions (PPs; think of each as a separately deployable, self-sufficient underlying node). A logical partition (LP) is a bucket of documents that share the same partition key value and that is always stored entirely on one PP. So an LP can't have part of its data on PP1 and another part on PP2.
There are two main limitations on physical partitions:
Max throughput: 10k RU/s
Max data size (sum of the sizes of all LPs stored in this PP): 50GB
A logical partition has one: a 20GB size limit.
NOTE: The size limits have grown since the initial releases of Cosmos DB, and I won't be surprised if they increase again soon.
How do I select the right partition key for my container?
Based on the Microsoft recommendation for maintainable data growth, you should select the partition key with the highest cardinality (like the Id of the document or a composite field). The main reason:
Spread request unit (RU) consumption and data storage evenly across all logical partitions. This ensures even RU consumption and storage distribution across your physical partitions.
It is critical to analyze your application's data consumption pattern when choosing the right partition key. In very rare scenarios larger partitions might work, though such solutions should implement data archiving from the get-go to keep the DB size under control (see the example below explaining why). Otherwise you should be ready for increasing operational costs just to maintain the same DB performance, as well as potential PP data skew, unexpected "splits" and "hot" partitions.
A very granular partitioning strategy with small partitions leads to some RU overhead when consuming data distributed across a number of physical partitions (PPs): definitely not a multiplication of RUs, but rather a couple of additional RUs per request. That overhead is negligible compared to the issues that occur once data grows beyond 50, 100 or 150GB.
Why large partitions are a terrible choice in most cases, even though the documentation says "select whatever works best for you"
The main reason is that Cosmos DB is designed to scale horizontally, and the provisioned throughput per PP is limited to [total provisioned per container (or DB)] / [number of PPs].
Once a PP split occurs because a PP exceeded 50GB, the max throughput of the existing PPs as well as of the two newly created PPs will be lower than it was before the split.
So imagine the following scenario (consider days as a measure of time between actions):
[Day 1] You create a container with 10k RUs provisioned and CustomerId as the partition key (which generates one underlying PP1). Maximum throughput per PP is 10k/1 = 10k RUs
[Day 2] Gradually adding data to the container, you end up with 3 big customers with C1[10GB], C2[20GB] and C3[10GB] of invoices
[Day 3] Another customer is onboarded to the system with C4[15GB] of data, so Cosmos DB has to split the PP1 data into two newly created partitions, PP2 (30GB) and PP3 (25GB). Maximum throughput per PP is 10k/2 = 5k RUs
[Day 4] Two more customers, C5[10GB] and C6[15GB], are added to the system and both end up in PP2, which leads to another split -> PP4 (20GB) and PP5 (35GB). Maximum throughput per PP is now 10k/3 = 3.333k RUs
IMPORTANT: As a result, on [Day 2] C1 data was queried with up to 10k RUs,
but on [Day 4] with only up to 3.333k RUs, which directly impacts the execution time of your query.
This is the main thing to remember when designing partition keys in the current version of Cosmos DB (12.03.21).
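The throughput side of that scenario can be replayed in a few lines. This is a simplified sketch, assuming one step per day and taking the partition counts straight from the scenario rather than simulating the splits themselves.

```python
TOTAL_RUS = 10_000  # throughput provisioned on the container in the scenario

# (day, number of physical partitions after that day's step)
timeline = [
    ("Day 1", 1),  # container created, single PP1
    ("Day 2", 1),  # C1..C3 (40GB) still fit in PP1
    ("Day 3", 2),  # C4 pushes PP1 past 50GB: split into PP2 and PP3
    ("Day 4", 3),  # C5 and C6 push PP2 past 50GB: PP3, PP4 and PP5 remain
]


def max_rus_per_partition(total_rus, partition_count):
    """Throughput is divided evenly, so each PP gets total / count."""
    return total_rus / partition_count


for day, partitions in timeline:
    share = max_rus_per_partition(TOTAL_RUS, partitions)
    print(f"{day}: {partitions} PP -> up to {share:,.0f} RUs per PP")
```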
Cosmos DB can be used to store any amount of data. The way it does this in the back end is by using the partition key. Is it the same as a primary key? - NO
Primary key: uniquely identifies the data
Partition key: helps with sharding of the data (for example, one partition for the city New York when city is the partition key).
Logical partitions have a size limit (10GB when this was written, now 20GB), and the better we spread the data across partitions, the more we can use it. It will, however, eventually need more connections to get data from all partitions. Example: getting data from a single partition in a query will always be faster than getting data from multiple partitions.
The partition key is used for sharding: it acts as a logical partition for your data and provides Cosmos DB with a natural boundary for distributing data across partitions.
You can read more about it here: https://learn.microsoft.com/en-us/azure/cosmos-db/partition-data
Each partition on a table can store up to 10GB (the limit has since been raised to 20GB), and a single table can store as many document schema types as you like. You have to choose your partition key, though, such that all the documents stored against that key (and so falling into that partition) stay under that limit.
I'm thinking about this too right now - so should the partition key be a date range of some type? In that case, it would really depend on how much data is getting stored in a period of time.
You are defining a logical partition.
Underneath, physically the data is split into physical partitions by Azure.
Ideally a partitionKey should be a primary key, or a field with high cardinality, to ensure proper distribution. Setting the id field within that partition to the same primary key value, rather than a self-generated one, also makes fetching a document by id much faster.
You cannot change the partitionKey once the container is created.
Looking at the dataset, captureId is a good candidate for the partitionKey, with id set manually to this field rather than to an auto-generated Cosmos DB one.
Microsoft provides documentation about partition keys. In my view, you need to look at the queries or operations that you plan to perform with Cosmos DB. Are they read-heavy or write-heavy? If read-heavy, it is ideal to choose as the partition key a property used in the WHERE clause of your queries; if write-heavy, look for a key with high cardinality.
Point reads/writes are always better, since they consume far fewer RUs than running other queries.
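A point read needs both the id and the partition key value. Below is a hedged sketch in the shape of the Python SDK (azure-cosmos): `read_item` is the real SDK method, while `InMemoryContainer` is a made-up stand-in so the example runs on its own.

```python
class InMemoryContainer:
    """Stand-in for a Cosmos DB container, keyed by (partition key, id)."""

    def __init__(self, items, pk_field):
        self.items = {(i[pk_field], i["id"]): i for i in items}

    def read_item(self, item, partition_key):
        # A point read addresses exactly one item; no query is parsed or run.
        return self.items[(partition_key, item)]


def point_read(container, item_id, partition_key_value):
    """Fetch one item by id and partition key value."""
    return container.read_item(item=item_id, partition_key=partition_key_value)


store = InMemoryContainer(
    [{"id": "42", "captureId": "42", "status": "done"}],
    pk_field="captureId",
)
print(point_read(store, "42", "42"))
```

When id and the partition key hold the same value, as suggested above for captureId, the caller only needs that one value to do the read.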