What does getContext.getCollection() return in CosmosDB stored procedure? - node.js

I have written a simple stored procedure to query a collection and return the response, but when I execute it from a node.js script I get a 400 error code and the following error message:
"PartitionKey extracted from document doesn't match the one specified in the header"
When I print the value of getContext().getCollection().getSelfLink() I get dbs/3Mk0AA==/colls/3Mk0AOWMbw0=/, but my database and collection ids are different values.
Any kind of help will be much appreciated.

When you inspect the documents you created in Azure Cosmos DB, you will see several system-generated properties in addition to the id you set.
You can find the official statement in System vs. user defined resources.
{
    "id": "1",
    "statusId": "new",
    "_rid": "duUuAN3LzQEIAAAAAAAAAA==",
    "_self": "dbs/duUuAA==/colls/duUuAN3LzQE=/docs/duUuAN3LzQEIAAAAAAAAAA==/",
    "_etag": "\"0400d4ee-0000-0000-0000-5a24ac3f0000\"",
    "_attachments": "attachments/",
    "_ts": 1512352831
}
The getContext().getCollection().getSelfLink() method returns the "_self" value, not the id value you set.
PartitionKey extracted from document doesn't match the one specified in the header
This issue is most likely because you set the PartitionKey incorrectly.
Suppose your partition key path is color and there are two partition key values, red and blue, in the database. The partition key you pass should be set to red or blue, not color.
You could refer to a similar thread I answered before: How to specify NONE partition key for deleting a document in Document DB java SDK?
Hope it helps you.
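For reference, here is a minimal sketch of a stored procedure body (hypothetical names, for illustration only) that uses the collection self link the way the question describes:

function sampleSproc() {
    // getContext() exposes the server-side collection and response objects
    var collection = getContext().getCollection();
    // getSelfLink() returns the system-generated "_self" link, e.g. "dbs/3Mk0AA==/colls/3Mk0AOWMbw0=/"
    var selfLink = collection.getSelfLink();

    var accepted = collection.queryDocuments(selfLink, 'SELECT * FROM c', {}, function (err, docs) {
        if (err) throw err;
        getContext().getResponse().setBody(docs);
    });
    if (!accepted) throw new Error('The query was not accepted by the server.');
}

When the sproc is executed with the correct partition key (see the answer below), the query runs against that partition.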

Yes, thanks for the help guys!!
In case anyone else tries this, adding the partition key while executing the stored procedure worked for me. The code is:
client.executeStoredProcedure('/dbs/<database-id>/colls/<collection-id>/sprocs/<storedproc-id>', <input to the procedure(if any)>, { partitionKey: <partition-field-id> }, callback);
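Note that the partitionKey option must be the partition key value of the documents the procedure works on (for example red or blue from the answer above), not the name of the partition key field. A slightly fuller sketch with the documentdb Node SDK, using hypothetical endpoint, key and resource ids:

var DocumentClient = require('documentdb').DocumentClient;

// Hypothetical account endpoint and key, for illustration only
var client = new DocumentClient('https://myaccount.documents.azure.com:443/', { masterKey: '<account-key>' });
var sprocLink = 'dbs/mydb/colls/mycoll/sprocs/mysproc';

// Pass the partition key VALUE of the target partition, not the partition key path
client.executeStoredProcedure(sprocLink, [], { partitionKey: 'asdf' }, function (err, result) {
    if (err) throw err;
    console.log(result);
});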

Related

Converting JSON Data from S3 upload, and using Lambda function to push to DynamoDB

I've been working on an assignment recently and I feel like I'm very close to solving the problem I'm having, but I just can't seem to find anything that would help online.
As the title states, I've got some JSON data being uploaded from a webpage into an S3 bucket. When a new S3 item is created, I want to take that data and store it in a DynamoDB table.
I'm using a Lambda function and testing with some data I've already stored in my S3 bucket. I've got the data in its key-value pairs in my console.logs but I just can't work out why it isn't actually storing the data.
In my console output I have the data broken down into its key-value pairs, i.e. "artist": "Elvis Presley", using JSON.parse(JSON.stringify(data)).
What I'm wondering, is how to push this data into the table.
var params = {
    Item: JSON.parse(JSON.stringify(data)),
    ReturnConsumedCapacity: "TOTAL",
    TableName: "s3-to-dynamo-s00187306"
};

dynamo.putItem(params, dynamoResultCallback);
The above code is what I've been trying to use but it's giving me a timeout error. If I bump up the allowed time then I receive a different error relating to a missing partition key in the item, even though my partition key matches with one of the key values in every item.
Really stumped here, any advice is appreciated, thanks in advance.
[edit]
So I used what someone suggested below, the dynamo-db converter, and have some logs which provide some insight into what's going on.
I've now got the data in the correct format for dynamo-db, and each item is parsed correctly as far as I can tell.
As for what dynamo represents, I'm not 100% sure, so I'm going to add a screenshot of its declaration at the top of my code. I think it's the doc client?
[edit 2]
So my "_class" values are all the exact same, might try changing the partition key to title instead? (nevermind this didn't work)
JSON.stringify(data) returns a JSON format that does not match the DynamoDB low-level format. DynamoDB expects a format like this:
Item: {
    'CUSTOMER_ID': { N: '001' },
    'CUSTOMER_NAME': { S: 'Richard Roe' }
}
As you can see, the syntax is not the same. I think you need to use another library, maybe dynamo-converters, or look at the Node.js AWS SDK; there may be a method that can do this.
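For what it's worth, the Node.js AWS SDK (v2) already has helpers for this. The DynamoDB DocumentClient accepts plain JavaScript objects and marshals them into the attribute-value format for you; a minimal sketch, assuming the table name and data shape from the question:

var AWS = require('aws-sdk');

// DocumentClient converts native JS objects into the DynamoDB attribute-value format
var docClient = new AWS.DynamoDB.DocumentClient();

var params = {
    TableName: 's3-to-dynamo-s00187306',
    // data must be a plain object, e.g. { artist: 'Elvis Presley', ... }, and must contain the table's partition key
    Item: data
};

docClient.put(params, function (err) {
    if (err) console.error('Failed to store item', err);
    else console.log('Item stored');
});

Alternatively, AWS.DynamoDB.Converter.marshall(data) turns a plain object into the { S: ... } / { N: ... } form expected by the low-level putItem call.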

Cannot identify the correct COSMOS DB SQL SELECT syntax to check if coordinates (Point) are within a Polygon

I'm developing an app that uses the Cosmos DB SQL API. The intention is to identify whether a potential construction site is within various restricted zones, such as national parks and sites of special scientific interest. This information is very useful in obtaining all the appropriate planning permissions.
I have created a container named 'geodata' containing 15 documents that I imported using data from a GeoJSON file provided by the UK National Parks. I have confirmed that all the polygons are valid using an ST_ISVALIDDETAILED SQL statement. I have also checked that the coordinates are anti-clockwise. A few documents contain MultiPolygons. The Geospatial Configuration of the container is 'Geography'.
I am using the Azure Cosmos Data Explorer to identify the correct format of a SELECT statement to identify if given coordinates (Point) are within any of the polygons within the 15 documents.
SELECT c.properties.npark18nm
FROM c
WHERE ST_WITHIN({"type": "Point", "coordinates":[-3.139638969259495,54.595188276959284]}, c.geometry)
The embedded coordinates are within a National Park, in this case, the Lake District in the UK (it also happens to be my favourite coffee haunt).
'c.geometry' is the JSON field within the documents.
"type": "Feature",
"properties": {
"objectid": 3,
"npark18cd": "E26000004",
"npark18nm": "Northumberland National Park",
"npark18nmw": " ",
"bng_e": 385044,
"bng_n": 600169,
"long": -2.2370801,
"lat": 55.29539871,
"st_areashape": 1050982397.6985701,
"st_lengthshape": 339810.592994494
},
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
-2.182235310191206,
55.586659699934806
],
[
-2.183754259805564,
55.58706479201416
], ......
Link to the full document: https://www.dropbox.com/s/yul6ft2rweod75s/lakedistrictnationlpark.json?dl=0
I have not been able to format the SELECT query to return the name of the park successfully.
Can you help me?
Is what I want to achieve possible?
Any guidance would be appreciated.
I appreciate any help you can provide.
You haven't mentioned what error you are getting. And you have misspelt "c.geometry".
This should work
SELECT c.properties.npark18nm
FROM c
WHERE ST_WITHIN({"type": "Point", "coordinates": [-3.139638969259495,54.595188276959284]}, c.geometry)
When running the query with your sample document, I was able to get the correct response (see image).
So this particular document is fine and the query in your question works too. Can you recheck your query in the explorer again? Also, are you referring to the wrong database/collection by any chance?
Maybe a full screenshot of the Cosmos Data Explorer showing the dbs/collections, your query and the response would also help.
I have fixed this problem. Not by altering the SQL statement but by deleting the container the data was held in, recreating it and reloading the data.
The SQL statement now produces the expected results.
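For anyone who wants to run the same check from code rather than from Data Explorer, here is a minimal sketch with the @azure/cosmos Node SDK; the endpoint, key and database/container names are hypothetical:

const { CosmosClient } = require('@azure/cosmos');

async function findParkContainingPoint() {
    // Hypothetical account endpoint and key, for illustration only
    const client = new CosmosClient({ endpoint: 'https://myaccount.documents.azure.com:443/', key: '<account-key>' });
    const container = client.database('mydb').container('geodata');

    // Same query as in the question: which documents' polygons contain the point?
    const query = 'SELECT c.properties.npark18nm FROM c WHERE ST_WITHIN({"type": "Point", "coordinates": [-3.139638969259495, 54.595188276959284]}, c.geometry)';

    const { resources } = await container.items.query(query).fetchAll();
    console.log(resources); // expected to name the park containing the point, e.g. the Lake District
}

findParkContainingPoint().catch(console.error);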

Why does my Azure Cosmos DB SQL API Container Refuse Multiple Items With Same Partition Key Value?

In Azure Cosmos DB (SQL API) I've created a container whose "partition key" is set to /part_key and I am now trying to create and edit data in Data Explorer.
I created an item that looks like this:
{
    "id": "test_id",
    "value": "val000",
    "magicNumber": 32,
    "part_key": "asdf"
}
I am now trying to create an item that looks like this:
{
    "id": "frank",
    "value": "val001",
    "magicNumber": 33,
    "part_key": "asdf"
}
Based on the documentation I believe that each item within a partition key needs a distinct id, which to me implies that multiple items can in fact share a partition key, which makes a lot of sense.
However, I get an error when I try to save this second item:
{"code":409,"body":{"code":"Conflict","message":"Entity with the specified id already exists in the system...
I see that if I change the value of part_key to something else (say asdf2), then I can save this new item.
Either my expectations about this functionality are wrong, or else I'm doing this wrong somehow. What is wrong here?
Your understanding is correct. This error can happen if you are trying to insert a new document with an id equal to the id of an existing document in the same partition. That is not allowed, so the operation fails.
Before you insert the modified copy, you need to assign a new id to it. I tested the scenario and it works fine. Maybe try creating a new document and check.
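To illustrate, a minimal sketch with the @azure/cosmos Node SDK (endpoint, key and names are hypothetical): two items with distinct ids and the same part_key value are both accepted, while reusing an existing id within that partition fails with the 409 Conflict from the question.

const { CosmosClient } = require('@azure/cosmos');

async function demo() {
    const client = new CosmosClient({ endpoint: 'https://myaccount.documents.azure.com:443/', key: '<account-key>' });
    const container = client.database('mydb').container('mycontainer'); // partition key path: /part_key

    // Same partition key value, different ids: both creates succeed
    await container.items.create({ id: 'test_id', value: 'val000', magicNumber: 32, part_key: 'asdf' });
    await container.items.create({ id: 'frank', value: 'val001', magicNumber: 33, part_key: 'asdf' });

    // Same id AND same partition key value: this create fails with 409 Conflict
    await container.items.create({ id: 'test_id', value: 'val002', magicNumber: 34, part_key: 'asdf' });
}

demo().catch(function (err) { console.error(err.code, err.message); });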

Azure RUs on cosmosDB

I am trying to find out how RUs work in order to optimize the requests made to the DB.
I have a simple query where I select by id:
SELECT * FROM c WHERE c.id='cl0'
That query costs 277.08 RUs
Then I have another query where I select by another property
SELECT * FROM c WHERE c.name[0].id='35bfea78-ccda-4cc5-9539-bd7ff1dd474b'
That query costs 2.95 RUs
I can't figure out why there is such a big difference in the consumed RUs between these two queries.
The two queries return the exact same result:
[
    {
        "label": "class",
        "id": "cl0",
        "_id": "cl0",
        "name": [
            {
                "_value": "C0.Iklos0",
                "id": "35bfea78-ccda-4cc5-9539-bd7ff1dd474b"
            }
        ],
        "_rid": "6Ds6AJHyfgBfAAAAADFT==",
        "_self": "dbs/6Ds4FA==/colls/6Ds6DFewfgA=/docs/6Ds6AJHyfgBdESFAAAAAAA==/",
        "_etag": "\"00007200-0000-0000-0000-w3we73140000\"",
        "_attachments": "attachments/",
        "_ts": 1528722196
    }
]
I faced a similar issue previously, so you are not the only person facing this. I can offer two suggestions.
1. SELECT * FROM c WHERE c.id='cl0' queries documents across the whole collection. If you choose a proper field as the partition key (and use it in your queries), it will greatly improve your performance.
You could refer to this doc to learn how to choose a partition key.
2. I found the answer below in the thread: Azure DocumentDB Query by Id is very slow
Microsoft support responded and they've resolved the issue. They've added IndexVersion 2 for the collection. Unfortunately, it is not yet available from the portal and newly created accounts/collections are still not using the new version. You'll have to contact Microsoft Support to make changes to your accounts.
I suggest committing feedback here to track this announcement.
Hope it helps you.
-- Edit
To upgrade to index version 2, use the following code:
var collection = (await client.ReadDocumentCollectionAsync(string.Format("/dbs/{0}/colls/{1}", databaseId, collectionId))).Resource;
collection.SetPropertyValue("IndexVersion", 2);
var replacedCollection = await client.ReplaceDocumentCollectionAsync(collection);
RU consumption depends on your document size and your query. I highly recommend the link below on query metrics. If you want to tune your query or understand latency, check the query metrics returned in the feed details:
x-ms-documentdb-query-metrics:
totalExecutionTimeInMs=33.67;queryCompileTimeInMs=0.06;queryLogicalPlanBuildTimeInMs=0.02;queryPhysicalPlanBuildTimeInMs=0.10;queryOptimizationTimeInMs=0.00;VMExecutionTimeInMs=32.56;indexLookupTimeInMs=0.36;documentLoadTimeInMs=9.58;systemFunctionExecuteTimeInMs=0.00;userFunctionExecuteTimeInMs=0.00;retrievedDocumentCount=2000;retrievedDocumentSize=1125600;outputDocumentCount=2000;writeOutputTimeInMs=18.10;indexUtilizationRatio=1.00
x-ms-request-charge: 604.42
https://learn.microsoft.com/en-us/azure/cosmos-db/sql-api-sql-query-metrics
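If you want to check the charge from Node.js rather than by capturing headers with a proxy, here is a minimal sketch with the older documentdb SDK (endpoint, key and collection link are hypothetical); the response headers of every query page include x-ms-request-charge:

var DocumentClient = require('documentdb').DocumentClient;

var client = new DocumentClient('https://myaccount.documents.azure.com:443/', { masterKey: '<account-key>' });
var collLink = 'dbs/mydb/colls/mycoll';

var iterator = client.queryDocuments(collLink, "SELECT * FROM c WHERE c.id='cl0'", { enableCrossPartitionQuery: true });

iterator.executeNext(function (err, results, headers) {
    if (err) throw err;
    console.log('Documents returned:', results.length);
    console.log('RU charge for this page:', headers['x-ms-request-charge']);
});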

DocumentDB and Azure Search: Document removed from documentDB isn't updated in Azure Search index

When I remove a document from DocumentDB it won't be removed from the Azure Search index. The index does update if I change something in a document.
I'm not quite sure how I should use this "SoftDeleteColumnDeletionDetectionPolicy" in the data source.
My data source is as follows:
{
    "name": "mydocdbdatasource",
    "type": "documentdb",
    "credentials": {
        "connectionString": "AccountEndpoint=https://myDocDbEndpoint.documents.azure.com;AccountKey=myDocDbAuthKey;Database=myDocDbDatabaseId"
    },
    "container": {
        "name": "myDocDbCollectionId",
        "query": "SELECT s.id, s.Title, s.Abstract, s._ts FROM Sessions s WHERE s._ts > #HighWaterMark"
    },
    "dataChangeDetectionPolicy": {
        "@odata.type": "#Microsoft.Azure.Search.HighWaterMarkChangeDetectionPolicy",
        "highWaterMarkColumnName": "_ts"
    },
    "dataDeletionDetectionPolicy": {
        "@odata.type": "#Microsoft.Azure.Search.SoftDeleteColumnDeletionDetectionPolicy",
        "softDeleteColumnName": "isDeleted",
        "softDeleteMarkerValue": "true"
    }
}
I have followed this guide:
https://azure.microsoft.com/en-us/documentation/articles/documentdb-search-indexer/
What am I doing wrong? Am I missing something?
I will describe what I understand about SoftDeleteColumnDeletionDetectionPolicy in a data source. As the name suggests, it is Soft Delete policy and not the Hard Delete policy. Or in other words, the data is still there in your data source but it is somehow marked as deleted.
Essentially, the way it works is that the Search Service periodically queries the data source and checks for entries that are deleted by looking at the value of the attribute defined in SoftDeleteColumnDeletionDetectionPolicy. So in your case, it will query the DocumentDB collection and find the documents whose isDeleted attribute has the value true. It then removes the matching documents from the index.
The reason it is not working for you is that you are actually deleting the records instead of changing the value of isDeleted from false to true. Thus the indexer never finds matching values and no changes are made to the index.
One thing you could do is perform a soft delete in your DocumentDB collection to begin with, instead of a hard delete. When the Search Service re-indexes your data, the document will be removed from the index because it is soft deleted at the source. Then, to save storage costs at the DocumentDB level, you simply delete these documents through a background process some time later.
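A minimal sketch of such a soft delete with the documentdb Node SDK (the document link is hypothetical); also note that a custom data source query like yours should project the isDeleted column (e.g. SELECT s.id, s.Title, s.Abstract, s.isDeleted, s._ts ...) so the indexer can actually see the marker value:

var DocumentClient = require('documentdb').DocumentClient;

var client = new DocumentClient('https://myDocDbEndpoint.documents.azure.com:443/', { masterKey: '<account-key>' });

// Instead of deleteDocument(), mark the document as deleted so the indexer can pick it up
var docLink = 'dbs/myDocDbDatabaseId/colls/myDocDbCollectionId/docs/<document-id>';

client.readDocument(docLink, {}, function (err, doc) {
    if (err) throw err;
    doc.isDeleted = 'true'; // must match softDeleteMarkerValue in the data source
    client.replaceDocument(docLink, doc, {}, function (err) {
        if (err) throw err;
        console.log('Document soft deleted; it will be removed from the index on the next indexer run.');
    });
});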
