How to check for inequality between fields in same document in Azure Cognitive search? - azure

We have an index set up in Azure cognitive search that has two string fields (hash1 & hash2) containing separate hashes. We would like to query the index for documents where the two hashes within a document aren't equal.
I tried applying the filter: $filter=hash1 ne hash2, expecting the query to return all documents with mismatched hashes. Instead, I was greeted with the following error message:
"Invalid expression: Comparison must be between a field, range variable or function call and a literal value.\r\nParameter name: $filter"
From what I can gather there seems to be some kind of limitation preventing comparisons between fields. Would it be possible to perform this type of query in Azure cognitive search using a different technique?

I would use content enrichment in this case. Even if comparing two hashes with a query was supported, it would be inefficient compared to pre-calculating the value using a content enrichment technique.
Introduce a new boolean property called something like HasEqualHashes
Populate that property with an appropriate boolean value
Use a $filter to filter your content as you wish
search=whatever&$filter=HasEqualHashes
Note that two different scenarios determine how you can enrich your content.
CONTENT SUBMITTED VIA SDK
When you use the SDK to submit content, you can enrich your items any way you want using regular code. Populating your HasEqualHashes property is a trivial one-liner in C#.
CONTENT SUBMITTED USING BUILT-IN INDEXERS
If you use one of the built-in indexers, you have to learn and understand the concept of skillsets.
https://learn.microsoft.com/en-us/azure/search/cognitive-search-working-with-skillsets

Related

How do you construct an Azure Search query to return a wildcard search based solely on a specific field?

If I may have missed this in some other area of SO please redirect me but I don't think this is a duped question.
I am using Azure Search with an index field called Title which is searchable and filterable using a Standard Lucerne Analyzer.
When using the built-in explorer, if I want to return all Job Titles that are explicitly named Full Stack Developer I can achieve it this way:
$filter=Title eq 'Full Stack Developer'&$count=true
But if I want to retrieve all the Job Titles using a wildcard to return all records having Full Stack in the name this way:
$filter=Title eq 'Full Stack*'&$count=true
The first 20 or so records returned are spot on, but after that point I get a mix of records that have absolutely nothing in common with the Title I specified in the query. My initial assumption was that perhaps Azure was including my specified Title performing an inclusive keyword search on the text as well.
Though I found a few instances where that hypothesis seemed to prove out, so many more of the records returned invalidated that altogether.
Maybe I don't understand fully the mechanics under the hood of Azure Search and so though my query appears to be valid; my expectation of the result is way off.
So how should my query look to perform a wildcard resulting to guarantee the words specified in the search to be included in the Titles returned, if this should be possible? And what would be the correct syntax to condition the return to accommodate for OR operators to be inclusive?
Azure Cognitive Search allows you to perform wildcard searches limited to specific fields only. To do so, you will need to specify the name of the fields in which you want to perform the search in searchFields parameter.
Your search URL in this case would be something like:
https://accountname.search.windows.net/indexes/indexname/docs?api-version=2020-06-30&searchFields=Title&search=Full Stack*
From the link here:

Azure Cognitive Search - return full json as SearchDocument?

I'm using Azure.Search.Documents in C# to index JSON documents in Azure blob storage. About half of the fields of each json doc are meant to be searchable or fielded. The JSON also includes some fields that I don't want evaluated by my search.
My goal is to return the entire JSON document in my search results.
It seems like my choices are to (a) add SearchField records to my SearchIndex for every aspect of the document (in which the SearchDocument results are ready for me to use) or (b) leverage metadata_storage_path / metadata_storage_name and do a separate fetch for the document itself.
Option (b) feels less efficient, considering that the SearchDocument returned is already so close to the full JSON; it seems a shame to have to make a separate fetch for each document. But for option (a) to work, I'd need to tell the SearchIndex about the extra fields without them triggering false positive search results.
For (a) is there a way to add SearchFields (or the equivalent) and have them not trigger false positives? (IsSearchable seems to affect how, but not whether, they are evaluated). Also, if (b) is the better approach, is there a way to do this using "new SearchField" as opposed to declared via attributes? Thanks.
Thank you Vince. Adding your comment as answer to help other community users.
Set IsSearchable to FALSE

Returning accented as well as normal result set via azure search filters

Does anyone know how to ensure we can return normal result as well as accented result set via the azure search filter. For e.g the below filter query in Azure search returns a name called unicorn when i check for record with name unicorn.
var result= searchServiceClient.Documents.SearchAsync<myDto>("*",new SearchParameters
{
SearchFields = new List<string> {"Name"},
Filter = "Name eq 'unicorn'"
});
This is all good but what i want is i want to write a filter such that it returns record named unicorn as well as record named únicorn (please note the first accented character) provided that both record exist.
This can be achieved when searching for such name via the search query using language or Standard ASCII folding search analyzer as mentioned in this link. What i am struggling to find out is how can we implement the same with azure filters?
Please let me know if anyone has got any solutions around this.
Filters are applied on the non-analyzed representation of the data, so I don’t think there’s any way to do any type of linguistic analysis on filters. One way to work around this is to manually create a field which only do lowercasing + asciifloding (no tokenization) and then search lucene queries that look like this:
"normal search query terms" AND customFilterColumn:"filtérValuèWithÄccents"
Basically the document would both need to match the search terms in any field AND also match the filter term in the “customFilterColumn”. This may not be sufficient for your needs, but at least you understand the art of the possible.
Using filters it won't work unless you specify in advance all the possibilities:
for example:
$filter=name eq 'unicorn' or name eq 'únicorn'
You'd better work with a different analyzer that will change accents to it's root form. As another possibility, you can try fuzzy search:
search=unicorn~&highlight=Name

How to set a value in a list as the key for Azure Cognitive Search

The data I have is of the form
{"event": {"custom": {"dimensions": [{"Id": ....}, {},...{}]}, ...},...}
The key that I need to index by is in the list. However, Cognitive Search does not seem to let me access the value within the list. Azure Cog. Search also fails to access any content from the list while trying to index.
Are there any solutions you can think?
Not sure how you're trying, but Azure Cognitive Search supports Complex types. Take a look in the following link:
https://learn.microsoft.com/en-us/azure/search/search-howto-complex-data-types
As an Alternative, you can project the internal dimensions (assuming they have a fixed number of dimensions) to fields in your index.
When using Indexers to import the data, key fields are limited to what can be expressed in a field mapping which has some support for mapping functions but wont allow you to select a value of an object in a collection. Your only options are to pre-process and transform the data (such as a query if this is coming from Cosmos DB, or azure function trigger if coming from blobs) or use a different field as the id and put the dimension id in another field that is queryable.
To make the data queryable you can use complex types or if the dimensions are always in the same ordinal you can use output field mappings to map it to a field by collection ordinal such as /document/event/custom/dimensions/1.

How to do "Not Equals" in couchdb?

Folks, I was wondering what is the best way to model document and/or map functions that allows me "Not Equals" queries.
For example, my documents are:
1. { name : 'George' }
2. { name : 'Carlin' }
I want to trigger a query that returns every documents where name not equals 'John'.
Note: I don't have all possible names before hand. So the parameters in query can be any random text like 'John' in my example.
In short: there is no easy solution.
You have four options:
sending a multi range query
filter the view response with a server-side list function
using a CouchDB plugin
use the mango query language
sending a multi range query
You can request the view with two ranges defined by startkey and endkey. You have to choose the range so, that the key John is not requested.
Unfortunately you have to find the commit request that somewhere exists and compile your CouchDB with it. Its not included in the official source.
filter the view response with a server-side list function
Its not recommended but you can use a list function and ignore the row with the key John in your response. Its like you will do it with a JavaScript array.
using a CouchDB plugin
Create an additional index with e.g. couchdb-lucene. The lucene server has such query capabilities.
use the "mango" query language
Its included in the CouchDB 2.0 developer preview. Not ready for production but will be definitely included in the stable release.

Resources