Azure Search Service REST API Delete Error: "Document key cannot be missing or empty." - azure

I am seeing some intermittent and odd behavior when trying to use the Azure Search Service REST API to delete a blob storage blob/document. It works sometimes, and other times I get this:
The request is invalid. Details: actions : 0: Document key cannot be missing or empty.
Once I start getting this error, the result is the same when I try to delete any of the documents/blobs stored in that index. I do have 'metadata_storage_path' listed as my index key (see below).
I have not been able to get the query to succeed again, or I would examine the differences in Fiddler.
I have also tried the following with no luck:
Resetting and re-running the associated search indexer.
Creating a new indexer & index against the same container and deleting from that.
Creating a new container, indexer, & index and deleting from that.
Any additional suggestions or thoughts?

It was a copy/paste error on my end: the key field in the delete request was "metadata_storage_name" when it should have been "metadata_storage_path".
[Insert head-banging-on-wall emoji here.]
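For reference, here is a minimal sketch of the corrected delete call using Python and the requests library. The service name, index name, API key, and key value are placeholders; the key value must match exactly what is stored in the index (for blob indexers the key is often the base64-encoded storage path).

import requests

# Placeholder values -- substitute your own service, index, key, and document key.
service = "my-search-service"
index = "my-index"
api_key = "<admin-key>"

url = f"https://{service}.search.windows.net/indexes/{index}/docs/index?api-version=2016-09-01"
payload = {
    "value": [
        {
            "@search.action": "delete",
            # The field name must be the index's key field -- here, metadata_storage_path.
            "metadata_storage_path": "<key value as stored in the index>"
        }
    ]
}

response = requests.post(url, json=payload, headers={"api-key": api_key})
print(response.status_code, response.json())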

For those who are still searching for the solution...
Instead of id,
{
    "value": [
        {
            "@search.action": "delete",
            "id": "TDVRT0FPQXcxZGtTQUFBQUFBQUFBQT090fdf"
        }
    ]
}
use the rid of your document to delete it:
{
    "value": [
        {
            "@search.action": "delete",
            "rid": "TDVRT0FPQXcxZGtTQUFBQUFBQUFBQT090fdf"
        }
    ]
}
This is because, while creating the Search Index, you might have selected rid as your unique key column.
Note: a document can only be deleted via its unique key column.
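If you are not sure which field is the key for your index, a quick check (a sketch using the same placeholder service, index, and key names as above) is to fetch the index definition over REST and look for the field marked "key": true:

import requests

service = "my-search-service"   # placeholder
index = "my-index"              # placeholder
api_key = "<admin-key>"         # placeholder

url = f"https://{service}.search.windows.net/indexes/{index}?api-version=2016-09-01"
definition = requests.get(url, headers={"api-key": api_key}).json()

# The key field is the only field you can address documents by when deleting.
print("Key field:", [f["name"] for f in definition["fields"] if f.get("key")])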

Related

DynamoDB returns 200 statusCode for deleteItem but it still exists on the console

My table has a hash key userId (there is no range key).
I am calling the API as follows (from Node.js):
dynamo.deleteItem({
    "TableName": 'my-table',
    "Key": {
        "userId": '4ada7bbd-a8ac-4d29-94c6-e199a50430c9'
    }
});
I am calling this API and it returns a 200 status code (success), but the item still exists in the DynamoDB console even after I refresh with the refresh button.
How is this possible?
Please keep in mind that the DeleteItem operation succeeds even if you delete a non-existent item. In your case, an item with the key "4ada7bbd-a8ac-4d29-94c6-e199a50430c9" probably doesn't exist - maybe there is a typo in the name or something?
Try using GetItem to get the item instead of DeleteItem - then you'll be able to verify that the item you think exists with this key doesn't actually exist. Or, use GetItem after the DeleteItem to verify that the item is gone after the delete. Don't mix code and UI in the same test, because it's harder to know what you did wrong if you can't paste stand-alone failing code.
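A minimal sketch of that check-then-delete-then-check flow in Python with boto3 (the table name and key come from the question; credentials and region are assumed to be configured in the environment):

import boto3

# Assumes AWS credentials and region are already configured.
table = boto3.resource("dynamodb").Table("my-table")
key = {"userId": "4ada7bbd-a8ac-4d29-94c6-e199a50430c9"}

# Does the item exist at all?
print("Before delete:", table.get_item(Key=key).get("Item"))

# DeleteItem returns success whether or not a matching item exists.
table.delete_item(Key=key)

# Verify the item is gone (or was never there).
print("After delete:", table.get_item(Key=key).get("Item"))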

Why does my Azure Cosmos DB SQL API Container Refuse Multiple Items With Same Partition Key Value?

In Azure Cosmos DB (SQL API) I've created a container whose "partition key" is set to /part_key and I am now trying to create and edit data in Data Explorer.
I created an item that looks like this:
{
    "id": "test_id",
    "value": "val000",
    "magicNumber": 32,
    "part_key": "asdf"
}
I am now trying to create an item that looks like this:
{
    "id": "frank",
    "value": "val001",
    "magicNumber": 33,
    "part_key": "asdf"
}
Based on the documentation I believe that each item within a partition key needs a distinct id, which to me implies that multiple items can in fact share a partition key, which makes a lot of sense.
However, I get an error when I try to save this second item:
{"code":409,"body":{"code":"Conflict","message":"Entity with the specified id already exists in the system...
I see that if I change the value of part_key to something else (say asdf2), then I can save this new item.
Either my expectations about this functionality are wrong, or else I'm doing this wrong somehow. What is wrong here?
Your understanding is correct. This can happen if you are trying to insert a new document with an id equal to the id of an existing document under the same partition key value. That is not allowed, so the operation fails.
Before you insert the modified copy, you need to assign a new id to it. I tested the scenario and it works fine. Maybe try creating a brand-new document and check.
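For illustration, a small sketch with the Python SDK (azure-cosmos); the endpoint, key, database, and container names are placeholders, and the two items mirror the ones in the question:

from azure.cosmos import CosmosClient

# Placeholder connection details.
client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<key>")
container = client.get_database_client("<database>").get_container_client("<container>")

# Same partition key value, distinct ids -- both inserts should succeed.
container.create_item({"id": "test_id", "value": "val000", "magicNumber": 32, "part_key": "asdf"})
container.create_item({"id": "frank", "value": "val001", "magicNumber": 33, "part_key": "asdf"})

# Creating an item whose id already exists under the same partition key value
# raises a 409 Conflict.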

How to show unique keys on Cosmos DB container?

This link implies that unique keys can be seen in a Cosmos DB container by looking at the settings. However, I can't seem to find them using either the portal or the Storage Explorer. How can you view the unique keys on an existing Cosmos DB container? I have a document that fails to load due to a key violation, which should be impossible, so I need to confirm what the keys are.
A slightly easier way to view your Cosmos DB unique keys is to look at the ARM template for your resource.
On your Cosmos DB account, click Settings / Export template, let the template be generated, and view it online once complete. You will find them under the "uniqueKeyPolicy" label.
Based on the documentation, the unique key policy should be visible like below:
"uniqueKeyPolicy": {
"uniqueKeys": [
{
"paths": [
"/name",
"/country"
]
},
{
"paths": [
"/users/title"
]
}
]
}
However, I could not see it in the portal either, same as you. Maybe it's a bug.
You could use the Cosmos DB SDK as a workaround to get the unique key policy; please see my Java sample code.
// Read the collection and print the paths of each unique key in its policy.
ResourceResponse<DocumentCollection> response1 = documentClient.readCollection("dbs/db/colls/test", null);
DocumentCollection coll = response1.getResource();
UniqueKeyPolicy uniqueKeyPolicy = coll.getUniqueKeyPolicy();
Collection<UniqueKey> uniqueKeyCollections = uniqueKeyPolicy.getUniqueKeys();
for (UniqueKey uniqueKey : uniqueKeyCollections) {
    System.out.println(uniqueKey.getPaths());
}
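If you are working in Python instead, a comparable sketch with the azure-cosmos SDK (connection details and the database/container names are placeholders) reads the container properties and prints the unique key policy:

from azure.cosmos import CosmosClient

client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<key>")
container = client.get_database_client("db").get_container_client("test")

# The container properties include "uniqueKeyPolicy" if one was defined at creation time.
properties = container.read()
print(properties.get("uniqueKeyPolicy"))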
Here is the basic code that worked for me. It writes the collection out in JSON format. I think this is similar to what you see in the portal, but it skips or omits the uniqueKeyPolicy information.
As a side note, I think I found a bug or odd behavior: inserting a new document can throw a unique index constraint violation, but updates do not.
// Read connection settings from configuration.
this.EndpointUrl = ConfigurationManager.AppSettings["EndpointUrl"];
this.PrimaryKey = ConfigurationManager.AppSettings["PrimaryKey"];
string dbname = ConfigurationManager.AppSettings["dbname"];
string containername = ConfigurationManager.AppSettings["containername"];

// Connect and read the collection definition, then dump it as JSON.
this.client = new DocumentClient(new Uri(EndpointUrl), PrimaryKey);
DocumentCollection collection = await client.ReadDocumentCollectionAsync(
    UriFactory.CreateDocumentCollectionUri(dbname, containername));
Console.WriteLine("\n4. Found Collection \n{0}\n", collection);
Support for showing the unique key policy in collection properties will be added soon. Meanwhile, you can use DocumentDBStudio to see the unique keys on a collection. Once a unique key policy is set, it cannot be modified.
Regarding the odd behavior, can you please share a full, isolated repro and explain the expected and actual behavior?
You can view the ARM template in the Azure Portal, and as the accepted answer says, you will find the unique keys under the "uniqueKeyPolicy" label.

Issue with Azure Blob Indexer

I have come across a scenario where I want to index all the files that are present in blob storage.
But if a file uploaded to the blob is password protected, the indexer fails on it and is then unable to index the remaining files.
[
    {
        "key": null,
        "errorMessage": "Error processing blob 'url' with content type ''. Status:422, error: "
    }
]
Is there a way to ignore the password-protected files, or a way to continue with the indexing process even if there is an error in some file?
See the "Dealing with unsupported content types" section in "Controlling which blobs are indexed". Use the failOnUnsupportedContentType configuration setting:
PUT https://[service name].search.windows.net/indexers/[indexer name]?api-version=2016-09-01
Content-Type: application/json
api-key: [admin key]
{
    ... other parts of indexer definition
    "parameters" : { "configuration" : { "failOnUnsupportedContentType" : false } }
}
Is there a way to ignore the password protected files or a way to continue with the indexing process even if there is an error in some file?
One possible way to do it is to define metadata on the blob with the name AzureSearch_Skip and set its value to "true". In this case, Azure Search will ignore this blob and move on to the next blob in the list.
You can read more about this here: https://learn.microsoft.com/en-us/azure/search/search-howto-indexing-azure-blob-storage#controlling-which-parts-of-the-blob-are-indexed.
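As a sketch, setting that metadata with the Python storage SDK (azure-storage-blob) might look like this; the connection string, container, and blob names are placeholders:

from azure.storage.blob import BlobServiceClient

# Placeholder connection details.
service = BlobServiceClient.from_connection_string("<storage-connection-string>")
blob = service.get_blob_client(container="<container>", blob="<password-protected-file.pdf>")

# AzureSearch_Skip tells the blob indexer to skip this blob entirely.
# Note: set_blob_metadata replaces the blob's existing metadata.
blob.set_blob_metadata({"AzureSearch_Skip": "true"})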

DocumentDB and Azure Search: Document removed from documentDB isn't updated in Azure Search index

When I remove a document from DocumentDB, it won't be removed from the Azure Search index. The index will update if I change something in a document.
I'm not quite sure how I should use this "SoftDeleteColumnDeletionDetectionPolicy" in the data source.
My datasource is as follows:
{
    "name": "mydocdbdatasource",
    "type": "documentdb",
    "credentials": {
        "connectionString": "AccountEndpoint=https://myDocDbEndpoint.documents.azure.com;AccountKey=myDocDbAuthKey;Database=myDocDbDatabaseId"
    },
    "container": {
        "name": "myDocDbCollectionId",
        "query": "SELECT s.id, s.Title, s.Abstract, s._ts FROM Sessions s WHERE s._ts > @HighWaterMark"
    },
    "dataChangeDetectionPolicy": {
        "@odata.type": "#Microsoft.Azure.Search.HighWaterMarkChangeDetectionPolicy",
        "highWaterMarkColumnName": "_ts"
    },
    "dataDeletionDetectionPolicy": {
        "@odata.type": "#Microsoft.Azure.Search.SoftDeleteColumnDeletionDetectionPolicy",
        "softDeleteColumnName": "isDeleted",
        "softDeleteMarkerValue": "true"
    }
}
And I have followed this guide:
https://azure.microsoft.com/en-us/documentation/articles/documentdb-search-indexer/
What am I doing wrong? Am I missing something?
I will describe what I understand about SoftDeleteColumnDeletionDetectionPolicy in a data source. As the name suggests, it is a soft delete policy and not a hard delete policy. In other words, the data is still there in your data source, but it is marked as deleted in some way.
Essentially, the way it works is that the Search Service periodically queries the data source and checks for entries that are deleted, by checking the value of the attribute defined in SoftDeleteColumnDeletionDetectionPolicy. So in your case, it will query the DocumentDB collection and find the documents whose isDeleted attribute has the value true. It then removes the matching documents from the index.
The reason it is not working for you is that you are actually deleting the records instead of changing the value of isDeleted from false to true. Thus it never finds matching values, and no changes are made to the index.
One thing you could do instead of a hard delete is a soft delete in your DocumentDB collection to begin with. When the Search Service re-indexes your data, because the document is soft deleted from the source, it will be removed from the index. Then, to save storage costs at the DocumentDB level, you simply delete these documents through a background process some time later.
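A sketch of that soft-delete step with the Python SDK (azure-cosmos), using the database and collection names from the data source above; the account key, document id, and partition key are placeholders, and isDeleted/"true" match the deletion detection policy:

from azure.cosmos import CosmosClient

client = CosmosClient("https://myDocDbEndpoint.documents.azure.com:443/", credential="<account-key>")
container = client.get_database_client("myDocDbDatabaseId").get_container_client("myDocDbCollectionId")

def soft_delete(item_id, partition_key):
    # Mark the document as deleted instead of removing it. The update bumps _ts,
    # so the indexer picks it up, sees isDeleted == "true", and removes it from the index.
    doc = container.read_item(item=item_id, partition_key=partition_key)
    doc["isDeleted"] = "true"
    container.replace_item(item=item_id, body=doc)

# A background job can later hard-delete documents where isDeleted == "true".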
