Azure Cosmos DB syntax for REPLACE method - node.js

I'm working with the Cosmos SDK in my Node.js app. I've been able to query the database successfully, but I'm having trouble with the replace method. I want to update a single item in the database, either by the unique 'id' field or by the built-in '_rid' field.
Here is how I currently have it written (it returns the error: Entity with the specified id does not exist in the system):
const { resource: updatedItem } = await client.database(databaseId).container(containerId).item('2INhAI1fcdkSAAAAERFAAA==', 'TX').replace(newJsonObject);
Sample item ('state' is the partition key):
{
"DateTime": "01-28-19 11:55:48",
"id": "15",
"resolved": false,
"state": "TX",
"_rid": "2INhAI1fcdkSAAAAAAAAAA==",
"_self": "dbs/2INhAA==/colls/2INhAI1fcdk=/docs/2INhAI1fcdkSAAAAAAAAAA==/",
"_etag": "\"fd03208d-0000-0700-0000-5fc68a550000\"",
"_attachments": "attachments/",
"_ts": 1606847061
}

This ended up being the correct syntax for what I needed:
const { resource: updatedItem } = await client.database(databaseId).container(containerId).item('15', 'TX').replace(newJsonObject);
The two arguments passed to item() are the value of the item's id property and the value of its partition key (state in my example), not the system-generated _rid.
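As a minimal sketch of the same idea, the two arguments can be derived from the item body itself. The helper name itemAddress is hypothetical (it is not part of the SDK), and the commented-out call assumes a container object obtained from @azure/cosmos as in the snippets above.

```javascript
// Hypothetical helper (not part of the SDK): derives the two arguments
// that container.item(id, partitionKeyValue) expects from an item body.
function itemAddress(item, partitionKeyProperty) {
  const id = item.id;
  const partitionKeyValue = item[partitionKeyProperty];
  if (typeof id !== "string" || id.length === 0) {
    throw new Error("item.id must be a non-empty string");
  }
  if (partitionKeyValue === undefined) {
    throw new Error(`item has no '${partitionKeyProperty}' property`);
  }
  return [id, partitionKeyValue];
}

// Trimmed copy of the sample item; 'state' is the partition key property.
const sampleItem = { id: "15", state: "TX", resolved: false, _rid: "2INhAI1fcdkSAAAAAAAAAA==" };
const [id, partitionKeyValue] = itemAddress(sampleItem, "state");
console.log(id, partitionKeyValue);

// With a real container object this would be used as:
//   await container.item(id, partitionKeyValue).replace({ ...sampleItem, resolved: true });
```

Note that the _rid value is ignored on purpose: only the id and the partition key value are used to address the item.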

Related

Cosmos DB SQL Query how to count sub properties?

I have these kind of json documents in a CosmosDB database.
{
"Version": 0,
"Entity": {
"ID": "xxxxxxx",
"EventHistory": {
"2020-04-28T16:30:35.6887561Z": "NEW",
"2020-04-28T16:35:21.1811993Z": "PROCESSED"
},
"SourceSystem": "xxxx",
"SourceSystemIdentifier": "xxxx",
"PCC": "xxx",
"StorageReference": "xxxxxxxxxxxx",
"SupplementaryData": {
"eTicketCount": "2"
}
}
}
The number of sub-properties within the EventHistory node is dynamic. The example has two, but there can be any number.
I couldn't find a way to count how many sub-properties the node contains. At the very least, I need to query the documents that have only one property declared.
FYI: I'm not able to change the format of the documents. I know it would be more convenient to store them as an array.
I tried the ARRAY_LENGTH and COUNT functions, but since EventHistory is not an array, neither could be applied.
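One possible approach, not taken from the original thread, is a JavaScript user-defined function (UDF) that counts the keys of the node. The sketch below assumes the UDF is registered on the container under the hypothetical id countProperties. A UDF in a WHERE clause typically cannot be served from the index, so this may be expensive on large containers.

```javascript
// Hypothetical UDF body; assumed to be registered with the id "countProperties".
function countProperties(node) {
  // Anything that is not a plain object (missing, null, array, scalar) counts as zero.
  if (node === null || typeof node !== "object" || Array.isArray(node)) {
    return 0;
  }
  return Object.keys(node).length;
}

// Query that would then select documents with exactly one EventHistory entry:
//   SELECT * FROM c WHERE udf.countProperties(c.Entity.EventHistory) = 1

// Local check against the EventHistory node of the sample document.
const eventHistory = {
  "2020-04-28T16:30:35.6887561Z": "NEW",
  "2020-04-28T16:35:21.1811993Z": "PROCESSED",
};
console.log(countProperties(eventHistory));
```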

Unable to map nested datasource field of cosmos db to a root index field of Azure indexer using REST APIs

I have a mongo db collection users with the following data format
{
"name": "abc",
"email": "abc@xyz.com",
"address": {
"city": "Gurgaon",
"state": "Haryana"
}
}
Now I'm creating a data source, an index, and an indexer for this collection using the Azure REST APIs.
Datasource
def create_datasource():
    request_body = {
        "name": 'users-datasource',
        "description": "",
        "type": "cosmosdb",
        "credentials": {
            "connectionString": "<db connection url>"
        },
        "container": {"name": "users"},
        "dataChangeDetectionPolicy": {
            "@odata.type": "#Microsoft.Azure.Search.HighWaterMarkChangeDetectionPolicy",
            "highWaterMarkColumnName": "_ts"
        }
    }
    resp = requests.post(url="<create-datasource-api-url>", data=json.dumps(request_body),
                         headers=headers)
Index for the above datasource
def create_index(config):
    request_body = {
        'name': "users-index",
        'fields': [
            {
                'name': 'name',
                'type': 'Edm.String'
            },
            {
                'name': 'email',
                'type': 'Edm.DateTimeOffset'
            },
            {
                'name': 'address',
                'type': 'Edm.String'
            },
            {
                'name': 'doc_id',
                'type': 'Edm.String',
                'key': True
            }
        ]
    }
    resp = requests.post(url="<azure-create-index-api-url>", data=json.dumps(request_body),
                         headers=config.headers)
Now the indexer for the above datasource and index
def create_interviews_indexer(config):
    request_body = {
        "name": "users-indexer",
        "dataSourceName": "users-datasource",
        "targetIndexName": "users-index",
        "schedule": {"interval": "PT5M"},
        "fieldMappings": [
            {"sourceFieldName": "address.city", "targetFieldName": "address"},
        ]
    }
    resp = requests.post("<create-indexer-api-url>", data=json.dumps(request_body),
                         headers=config.headers)
This creates the indexer without any exception, but when I check the retrieved data in the Azure portal for users-indexer, the address field is null: it is not getting any value from the address.city field mapping provided when creating the indexer.
I have also tried the following mapping, but it is not working either.
"fieldMappings": [
{"sourceFieldName": "/address/city", "targetFieldName": "address"},
]
The Azure documentation doesn't say anything about this kind of mapping either, so any help would be very much appreciated.
The container element in the data source definition lets you specify a query that flattens your JSON document (Ref: https://learn.microsoft.com/en-us/rest/api/searchservice/create-data-source). So instead of mapping fields in the indexer definition, you can write a query that returns the output in the desired format.
Your code for creating the data source would then be:
def create_datasource():
    request_body = {
        "name": 'users-datasource',
        "description": "",
        "type": "cosmosdb",
        "credentials": {
            "connectionString": "<db connection url>",
        },
        "container": {
            "name": "users",
            "query": "SELECT a.name, a.email, a.address.city as address FROM a",
        },
        "dataChangeDetectionPolicy": {
            "@odata.type": "#Microsoft.Azure.Search.HighWaterMarkChangeDetectionPolicy",
            "highWaterMarkColumnName": "_ts"
        }
    }
    resp = requests.post(url="<create-datasource-api-url>", data=json.dumps(request_body),
                         headers=headers)
Support for the MongoDB API flavor is in public preview; you need to explicitly indicate Mongo in the data source's connection string, as described in this article. Also note that, as far as I know, the custom queries suggested in the previous answer are not supported with Mongo data sources. Hopefully someone from the team can clarify the current state of this support.
It's working correctly for me with the field mapping below. The Azure Search query returns values for address properly.
"fieldMappings": [{"sourceFieldName": "address.city", "targetFieldName": "address"}]
I did make a few changes to the code you provided:
When creating the indexer, I removed the extra comma at the end of fieldMappings.
When creating the index, I declared the email field as Edm.String rather than Edm.DateTimeOffset.
Please make sure you are using the preview API version, since MongoDB API support in Azure Search is in preview.
For example: https://{azure search name}.search.windows.net/indexers?api-version=2019-05-06-Preview
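The pieces from this answer can be put together in a small sketch that assembles the indexer definition and the preview endpoint. It is written in JavaScript rather than the Python of the question, the helper names are hypothetical, and "my-search-service" is a placeholder service name. Only the field names and the API version come from the thread.

```javascript
// Hypothetical helper: assembles the indexer definition from the thread as a plain object.
// fieldMappings is a list of [sourceFieldName, targetFieldName] pairs.
function buildIndexerDefinition(fieldMappings) {
  return {
    name: "users-indexer",
    dataSourceName: "users-datasource",
    targetIndexName: "users-index",
    schedule: { interval: "PT5M" },
    fieldMappings: fieldMappings.map(([sourceFieldName, targetFieldName]) => ({
      sourceFieldName,
      targetFieldName,
    })),
  };
}

// Hypothetical helper: builds the indexers endpoint for a service name and API version.
function indexersUrl(serviceName, apiVersion) {
  return `https://${serviceName}.search.windows.net/indexers?api-version=${encodeURIComponent(apiVersion)}`;
}

// JSON.stringify never emits a trailing comma, which sidesteps the hand-written JSON issue.
const indexerBody = JSON.stringify(buildIndexerDefinition([["address.city", "address"]]), null, 2);
console.log(indexersUrl("my-search-service", "2019-05-06-Preview"));
console.log(indexerBody);
```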

Query cosmosDB: get last element in array

I have a document like this:
{ "id": ....,
"Title": "title",
"ZipCodes": [
{
"Code": "code01",
"Name": "Name01"
},
{
"Code": "code02",
"Name": "Name02"
},
{
"Code": "code03",
"Name": "Name03"
} ],
"_rid": .......,
"_self": .......,
"_etag": ......,
"_attachments": "attachments/",
"_ts": ......
}
I tried this query:
select c.id, c.ZipCodes[ARRAY_LENGTH (c.ZipCodes) -1] as ZipCodes from c
But I got an error. How can I query the last element of ZipCodes in Cosmos DB?
You can use ARRAY_SLICE for this. When passed -1 as the start index, it returns an array containing only the last element of the original array. Then index into that with [0] to get the single element it contains (i.e. the zip code object itself).
SELECT c.id,
ARRAY_SLICE(c.ZipCodes,-1)[0] AS LastZipCode
FROM c
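The same slice-then-index idea can be tried out in plain JavaScript, where Array.prototype.slice also accepts a negative start. This is only an analogy to illustrate the logic, not Cosmos DB code, and the helper name lastElement is hypothetical.

```javascript
// Hypothetical helper mirroring ARRAY_SLICE(c.ZipCodes, -1)[0] in plain JavaScript.
function lastElement(items) {
  // slice(-1) yields a one-element array (or an empty one), so [0] may be undefined.
  return items.slice(-1)[0];
}

const zipCodes = [
  { Code: "code01", Name: "Name01" },
  { Code: "code02", Name: "Name02" },
  { Code: "code03", Name: "Name03" },
];
console.log(lastElement(zipCodes).Code); // prints "code03"
```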
I don't think a plain SELECT can pick out the sub-document directly. I think you should filter with a WHERE condition and call a user-defined function (UDF), as follows:
SELECT VALUE udf.sortZipCode(c.ZipCodes)
FROM c WHERE c.id = 2 AND c.Title = 'title'
Here is a UDF that will do the trick:
function sortZipCode(zipCodes) {
function compareTimeStamps(a, b) {
return a.TimeStamp - b.TimeStamp; // implement your own ordering logic
}
return zipCodes.sort(compareTimeStamps);
}
I agree with Sajeetharan that a UDF can do this, and quite easily.
UDF code (registered with the id GetLastRecord, as used in the query below):
function userDefinedFunction(zipcodes){
return zipcodes[zipcodes.length-1];
}
SQL query:
SELECT c.id,c.Title,udf.GetLastRecord(c.ZipCodes) as ZipCodes FROM c

Azure search index not updating field

I have two indexes: index1 is the old, currently used index, and the new index2 additionally contains a new string array field, myArray1.
Azure Search is using a DocumentDB collection as a source, and myArray1 is filled out properly there. However, when querying the document in the Azure Search Explorer, myArray1 is always empty. The search explorer is set to index2. I also tried resetting index2, but without luck.
I am using a CreateDataSource.json to define the query for the DocumentDB collection. In this query I am selecting the property myArray1.
Any idea why the index is not picking up the values stored in myArray1?
Here is the data source query:
SELECT c.id AS Id, c.crew AS Crews, c['cast'] AS Casts FROM c WHERE c._ts >= @HighWaterMark
If I run it against documentdb in Azure search it works fine.
Here is the index definition:
Index definition = new Index()
{
Name = "index-docdb4",
Fields = new[]
{
new Field("Id", DataType.String, AnalyzerName.StandardLucene) { IsKey = true, IsFilterable = true },
new Field("Crews", DataType.Collection(DataType.String)) { IsFilterable = true },
new Field("Casts", DataType.Collection(DataType.String)) { IsFilterable = true }
}
};
Here is the indexer json file
{
"name": "indexer-docdb4",
"dataSourceName": "datasource-docdb",
"targetIndexName": "index-docdb4",
"schedule": {
"interval": "PT5M",
"startTime": "2015-01-01T00:00:00Z"
}
}
Here is a documentdb example file
{
"id": "300627",
"title": "Carmen",
"originalTitle": "Carmen",
"year": 2011,
"genres": [
"Music"
],
"partitionKey": 7,
"_rid": "OsZtAIcaugECAAAAAAAAAA==",
"_self": "dbs/OsZtAA==/colls/OsZtAIcaugE=/docs/OsZtAIcaugECAAAAAAAAAA==/",
"_etag": "\"0400d17e-0000-0000-0000-590a493a0000\"",
"_attachments": "attachments/",
"cast": [
"315986",
"321880",
"603325",
"484671",
"603324",
"734554",
"734555",
"706818",
"711766",
"734556",
"734455"
],
"crew": [
"58185",
"390726",
"302640",
"670953",
"28046",
"122587"
],
"_ts": 1493846327
},

Not Getting the Shape Right in DocumentDb Select

I'm trying to get only the person's membership info i.e. ID, name and committee memberships in a SELECT query. This is my object:
{
"id": 123,
"name": "John Smith",
"memberships": [
{
"id": 789,
"name": "U.S. Congress",
"yearElected": 2012,
"state": "California",
"committees": [
{
"id": 444,
"name": "Appropriations Committee",
"position": "Member"
},
{
"id": 555,
"name": "Armed Services Committee",
"position": "Chairman"
},
{
"id": 678,
"name": "Veterans' Affairs Committee",
"position": "Member"
}
]
}
]
}
In this example, John Smith is a member of the U.S. Congress and three committees in it.
The result that I'm trying to get should look like this. Again, this is the "DESIRED RESULT":
{
"id": 789,
"name": "U.S. Congress",
"committees": [
{
"id": 444,
"name": "Appropriations Committee",
"position": "Member"
},
{
"id": 555,
"name": "Armed Services Committee",
"position": "Chairman"
},
{
"id": 678,
"name": "Veterans' Affairs Committee",
"position": "Member"
}
]
}
Here's my SQL query:
SELECT m.id, m.name,
[
{
"id": c.id,
"name": c.name,
"position": c.position
}
] AS committees
FROM a
JOIN m IN a.memberships
JOIN c IN m.committees
WHERE a.id = "123"
I'm getting the following results, which contain the correct data, but the shape is not right: I'm getting the same membership 3 times. Here's what I'm getting, which is NOT the desired result:
[
{
"id": 789,
"name": "U.S. Congress",
"committees":[
{
"id": 444,
"name": "Appropriations Committee",
"position": "Member"
}
]
},
{
"id": 789,
"name": "U.S. Congress",
"committees":[
{
"id": 555,
"name": "Armed Services Committee",
"position": "Chairman"
}
]
},
{
"id": 789,
"name": "U.S. Congress",
"committees":[
{
"id": 678,
"name": "Veterans' Affairs Committee",
"position": "Member"
}
]
}
]
As you can see here, the "U.S. Congress" membership is repeated 3 times.
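The repetition follows from how JOIN works in this query language: each JOIN ... IN iterates over an array, so joining both memberships and committees yields one row per (membership, committee) pair. The plain JavaScript simulation below is only an analogy (it does not call Cosmos DB, and the function names are illustrative), but it shows the difference in shape.

```javascript
// Sample document from the question, trimmed to the fields the queries touch.
const person = {
  id: 123,
  name: "John Smith",
  memberships: [
    {
      id: 789,
      name: "U.S. Congress",
      committees: [
        { id: 444, name: "Appropriations Committee", position: "Member" },
        { id: 555, name: "Armed Services Committee", position: "Chairman" },
        { id: 678, name: "Veterans' Affairs Committee", position: "Member" },
      ],
    },
  ],
};

// Rough analogue of "JOIN m IN a.memberships JOIN c IN m.committees":
// one output row per (membership, committee) pair.
function joinBoth(doc) {
  return doc.memberships.flatMap((m) =>
    m.committees.map((c) => ({ id: m.id, name: m.name, committees: [c] }))
  );
}

// Rough analogue of joining memberships only and projecting m.committees whole:
// one output row per membership, with the full committees array attached.
function joinMembershipsOnly(doc) {
  return doc.memberships.map((m) => ({ id: m.id, name: m.name, committees: m.committees }));
}

console.log(joinBoth(person).length);            // 3
console.log(joinMembershipsOnly(person).length); // 1
```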
The following SQL query gets me exactly what I want in Azure Query Explorer but when I pass it as the query in my code -- using DocumentDb SDK -- I don't get any of the details for the committees. I simply get blank results for committee ID, name and position. I do, however, get the membership data i.e. "U.S. Congress", etc. Here's that SQL query:
SELECT m.id, m.name, m.committees AS committees
FROM c
JOIN m IN c.memberships
WHERE c.id = 123
I'm including the code that makes the DocumentDb call, along with our internal comments to help clarify its purpose:
First the ReadQuery function that we call whenever we need to read something from DocumentDb:
public async Task<IEnumerable<T>> ReadQuery<T>(string collectionId, string sql, Dictionary<string, object> parameterNameValueCollection)
{
// Prepare collection self link
var collectionLink = UriFactory.CreateDocumentCollectionUri(_dbName, collectionId);
// Prepare query
var query = getQuery(sql, parameterNameValueCollection);
// Creates the query and returns IQueryable object that will be executed by the calling function
var result = _client.CreateDocumentQuery<T>(collectionLink, query, null);
return await result.QueryAsync();
}
The following function prepares the query -- with any parameters:
protected SqlQuerySpec getQuery(string sql, Dictionary<string, object> parameterNameValueCollection)
{
// Declare query object
SqlQuerySpec query = new SqlQuerySpec();
// Set query text
query.QueryText = sql;
// Convert parameters received in a collection to DocumentDb parameters
if (parameterNameValueCollection != null && parameterNameValueCollection.Count > 0)
{
// Go through each item in the parameters collection and process it
foreach (var item in parameterNameValueCollection)
{
query.Parameters.Add(new SqlParameter($"@{item.Key}", item.Value));
}
}
return query;
}
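For comparison, here is a minimal sketch of the same parameter-building step in JavaScript. The helper name buildQuerySpec is hypothetical. The { query, parameters } shape is the query spec accepted by the @azure/cosmos Node.js SDK, which is a different SDK from the .NET one used in the question.

```javascript
// Hypothetical helper: turns a { name: value } map into a query spec
// whose parameter names carry the "@" prefix used in the query text.
function buildQuerySpec(sql, parameterValues = {}) {
  return {
    query: sql,
    parameters: Object.entries(parameterValues).map(([name, value]) => ({
      name: `@${name}`,
      value, // the value keeps its JavaScript type, so a string id stays a string
    })),
  };
}

const spec = buildQuerySpec(
  "SELECT m.id, m.name, m.committees FROM c JOIN m IN c.memberships WHERE c.id = @id",
  { id: "123" }
);
console.log(JSON.stringify(spec.parameters)); // [{"name":"@id","value":"123"}]

// With @azure/cosmos this object could be passed as:
//   const { resources } = await container.items.query(spec).fetchAll();
```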
This function makes async call to DocumentDb:
public async static Task<IEnumerable<T>> QueryAsync<T>(this IQueryable<T> query)
{
var docQuery = query.AsDocumentQuery();
// Batches give us the ability to read data in chunks in an async fashion.
// If we used the ToList<T>() LINQ method to read ALL the data, the call would be synchronous, which is why we prefer the batches approach.
var batches = new List<IEnumerable<T>>();
do
{
// Actual call is made to the backend DocumentDb database
var batch = await docQuery.ExecuteNextAsync<T>();
batches.Add(batch);
}
while (docQuery.HasMoreResults);
// Because batches are collections of collections, we use the following line to merge all into a single collection.
var docs = batches.SelectMany(b => b);
// Return data
return docs;
}
I just wrote a demo to test your query, and I get the expected result, so I think the query is correct. You mentioned that you don't get any data when you make the call in your code; would you mind sharing that code? Perhaps there is a mistake in it. In any case, here is my test for reference; I hope it helps.
Query used:
SELECT m.id AS membershipId, m.name AS membershipName, m.committees AS committees
FROM c
JOIN m IN c.memberships
WHERE c.id = "123"
The code here is very simple; sp_db.InnerText belongs to a span that I used to show the result on my test page:
var docs = client.CreateDocumentQuery("dbs/" + databaseId + "/colls/" + collectionId,
"SELECT m.id AS membershipId, m.name AS membershipName, m.committees AS committees " +
"FROM c " +
"JOIN m IN c.memberships " +
"WHERE c.id = \"123\"");
foreach (var doc in docs)
{
sp_db.InnerText += doc;
}
I think there may be some typos in the query you passed to client.CreateDocumentQuery(), which would make the result empty. It would be better to provide the code so that we can help check it.
Updates:
Just tried your code, and I still get the expected result. One thing I found is that when I write the WHERE clause as "where c.id = \"123\"", the query returns the result.
However, if you leave out the escaped quotes and just use "where c.id = 123", you get nothing. I think this could be the reason. You can check whether you have run into this scenario.
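A rough way to picture this behaviour, assuming the stored id is the string "123": the comparison does not coerce between strings and numbers, much like strict equality in JavaScript. The filter below is only an illustrative stand-in for the WHERE clause, not Cosmos DB code.

```javascript
// Illustrative stand-in for "WHERE c.id = <value>" with no type coercion.
function whereIdEquals(docs, value) {
  return docs.filter((doc) => doc.id === value);
}

// Assumption: the document's id is stored as a string.
const storedDocs = [{ id: "123", name: "John Smith" }];

console.log(whereIdEquals(storedDocs, "123").length); // 1
console.log(whereIdEquals(storedDocs, 123).length);   // 0
```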
Just updated my original post. All the code provided in the question is correct and works. I was having a problem because I was using aliases in the SELECT query, and as a result some properties were not binding to my domain object.
The code provided in the question is correct.
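To illustrate how aliases can leave properties unbound, here is a small sketch in JavaScript. The mapper toMembership is hypothetical and stands in for whatever deserialization binds query rows to the domain object; the alias names are the ones used in the test query above, not necessarily the ones from the asker's real code.

```javascript
// Hypothetical domain mapper: reads only the property names the domain object declares.
function toMembership(row) {
  return { id: row.id, name: row.name, committees: row.committees };
}

// Row shape produced by "SELECT m.id AS membershipId, m.name AS membershipName, ...".
const aliasedRow = { membershipId: 789, membershipName: "U.S. Congress", committees: [] };
// Row shape produced by "SELECT m.id, m.name, m.committees AS committees".
const plainRow = { id: 789, name: "U.S. Congress", committees: [] };

console.log(toMembership(aliasedRow).id); // undefined
console.log(toMembership(plainRow).id);   // 789
```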

Resources