How to set a value in a list as the key for Azure Cognitive Search - azure

The data I have is of the form
{"event": {"custom": {"dimensions": [{"Id": ....}, {},...{}]}, ...},...}
The key that I need to index by is in the list. However, Cognitive Search does not seem to let me access the value within the list. Azure Cog. Search also fails to access any content from the list while trying to index.
Are there any solutions you can think?

Not sure how you're trying, but Azure Cognitive Search supports Complex types. Take a look in the following link:
https://learn.microsoft.com/en-us/azure/search/search-howto-complex-data-types
As an Alternative, you can project the internal dimensions (assuming they have a fixed number of dimensions) to fields in your index.

When using Indexers to import the data, key fields are limited to what can be expressed in a field mapping which has some support for mapping functions but wont allow you to select a value of an object in a collection. Your only options are to pre-process and transform the data (such as a query if this is coming from Cosmos DB, or azure function trigger if coming from blobs) or use a different field as the id and put the dimension id in another field that is queryable.
To make the data queryable you can use complex types or if the dimensions are always in the same ordinal you can use output field mappings to map it to a field by collection ordinal such as /document/event/custom/dimensions/1.

Related

How to check for inequality between fields in same document in Azure Cognitive search?

We have an index set up in Azure cognitive search that has two string fields (hash1 & hash2) containing separate hashes. We would like to query the index for documents where the two hashes within a document aren't equal.
I tried applying the filter: $filter=hash1 ne hash2, expecting the query to return all documents with mismatched hashes. Instead, I was greeted with the following error message:
"Invalid expression: Comparison must be between a field, range variable or function call and a literal value.\r\nParameter name: $filter"
From what I can gather there seems to be some kind of limitation preventing comparisons between fields. Would it be possible to perform this type of query in Azure cognitive search using a different technique?
I would use content enrichment in this case. Even if comparing two hashes with a query was supported, it would be inefficient compared to pre-calculating the value using a content enrichment technique.
Introduce a new boolean property called something like HasEqualHashes
Populate that property with an appropriate boolean value
Use a $filter to filter your content as you wish
search=whatever&$filter=HasEqualHashes
Note that two different scenarios determine how you can enrich your content.
CONTENT SUBMITTED VIA SDK
When you use the SDK to submit content, you can enrich your items any way you want using regular code. Populating your HasEqualHashes property is a trivial one-liner in C#.
CONTENT SUBMITTED USING BUILT-IN INDEXERS
If you use one of the built-in indexers, you have to learn and understand the concept of skillsets.
https://learn.microsoft.com/en-us/azure/search/cognitive-search-working-with-skillsets

How do you construct an Azure Search query to return a wildcard search based solely on a specific field?

If I may have missed this in some other area of SO please redirect me but I don't think this is a duped question.
I am using Azure Search with an index field called Title which is searchable and filterable using a Standard Lucerne Analyzer.
When using the built-in explorer, if I want to return all Job Titles that are explicitly named Full Stack Developer I can achieve it this way:
$filter=Title eq 'Full Stack Developer'&$count=true
But if I want to retrieve all the Job Titles using a wildcard to return all records having Full Stack in the name this way:
$filter=Title eq 'Full Stack*'&$count=true
The first 20 or so records returned are spot on, but after that point I get a mix of records that have absolutely nothing in common with the Title I specified in the query. My initial assumption was that perhaps Azure was including my specified Title performing an inclusive keyword search on the text as well.
Though I found a few instances where that hypothesis seemed to prove out, so many more of the records returned invalidated that altogether.
Maybe I don't understand fully the mechanics under the hood of Azure Search and so though my query appears to be valid; my expectation of the result is way off.
So how should my query look to perform a wildcard resulting to guarantee the words specified in the search to be included in the Titles returned, if this should be possible? And what would be the correct syntax to condition the return to accommodate for OR operators to be inclusive?
Azure Cognitive Search allows you to perform wildcard searches limited to specific fields only. To do so, you will need to specify the name of the fields in which you want to perform the search in searchFields parameter.
Your search URL in this case would be something like:
https://accountname.search.windows.net/indexes/indexname/docs?api-version=2020-06-30&searchFields=Title&search=Full Stack*
From the link here:

Can't add Azure Search to SQL Database

I have created Azure Search resource, and also SQL Database.
I'm trying to use "Add Azure Search" option in Azure Portal.
It splited to 2 steps.
Data source creation (done)
Indexer creation
When i'm trying to create indexer, it says
Import configuration failed, error creating Index
Error creating Index: "The request is invalid."
What does it mean? There is no any details.
My Table Schema looks like this:
Did you change any of the types in the index from the defaults? Here is a mapping of what SQL types map to Azure Cognitive Search index field types: https://learn.microsoft.com/en-us/azure/search/search-howto-connecting-azure-sql-database-to-azure-search-using-indexers#mapping-between-sql-and-azure-cognitive-search-data-types From my link, nvarchar maps to Edm.String or Collection(Edm.String). In your screenshot above, it looks like you've changed several field types (to Edm.DateTimeOffset and Edm.Int64, for example). That may be causing the error when it tries to create the index.
Or, it may be that you specified a ‘suggester name’ and ‘search mode’, but none of the index fields have ‘Suggester’ checked (hard to tell if the screenshot includes all fields or not). If you need a suggester, you should mark at least one field to use it. If you don’t need it, don't fill in those fields; otherwise the index creation will fail.

How to map my json path into cosmos db from Azure Data Factory

I'm trying to add entries into my cosmosdb using Azure Data Factory - However i am not able to choose the right collection as Azure Data Factory can only see the top level of the database.
Is there any funny syntax for choosing which collection to pick from Cosmos DB SQL API? - i've tried doing, entities[0] and entities['tasks'] but none of them seem to work
The new entries are inserted as we see in the red box, how do i get the entries into the entries collection?
Update:
Original Answer:
If the requirement you mentioned in the comments is what you need, then it is possible. For example, to put JSON data into an existing ‘tasks’ item, you only need to use the upsert method, and the source json data has the same id as the ‘tasks’ item.
This is the offcial doc:
https://learn.microsoft.com/en-us/azure/data-factory/connector-azure-cosmos-db#azure-cosmos-db-sql-api-as-sink
The random letters and numbers in your red box appear because you did not specify the document id.
Have a look of this:
By the way, if the tasks have partitional key, then you also need to specify.

Can I create azure cosmos db document with a custom key other than Id

I am using azure cosmos db for saving and editing my session information. Currently i am not using ID in my document, instead i have another unique field with all docs. How can i update my query to get documents?
You can use whatever property you want, for your custom key (just make sure you don't remove its index). By default, all properties are indexed unless you explicitly set up a custom index policy that removes certain properties from being indexed.
You cannot eliminate the built-in id property though; if you don't set it explicitly, it will just be set to a guid.
If you are doing queries, this really shouldn't matter, functionality-wise. Just search on whatever properties you want. However: If you are doing point-reads (a read is more efficient, RU-wise, than a query, when retrieving a single document) you can only perform a point-read by specifying the id property, not your custom property. If you must use a custom property and you need to do point-reads, you can consider storing your custom property as id as well (as long as it's guaranteed to be unique per document).

Resources