Fuzzy search on a collection of strings

I'm using Azure Search and have an index with a field named 'keywords' of type Collection(Edm.String), which holds the keywords related to a single document. I want to run fuzzy search over my documents and, as I understood from this link, all I have to do is append a '~' character to my search query. However, this doesn't seem to work in my case.
I have a few documents in my index and one of them includes "fun" in its keywords. When I search for "run" with fuzzy search, I expect to see the documents with the keyword "run", as well as those with "fun". If I understand correctly, the edit distance between "fun" and "run" is only 1, which seems to be the default distance Azure Search's fuzzy search uses. Am I doing anything wrong here?
Or does the type Collection(Edm.String) not support fuzzy search? The attributes for 'keywords' are Searchable, Filterable and Retrievable.
Edit: I'm using the Standard Lucene Analyzer for the 'keywords' field. When I send the query
https://fakename.search.windows.net/indexes/fakeindex/docs?api-version=2016-09-01&search=run~
I would expect to get the following document as its keywords contain "fun"
"keywords": [
"balloon",
"message",
"text",
"monster",
"fun",
"evil",
"mad",
"cartoons",
"funny"
]

The fuzzy search feature is only supported in Lucene query syntax in Azure Search. Please specify queryType=full in the query string.
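As a rough illustration of that fix, here is a minimal Python sketch that builds the query URL from the question with queryType=full added. The helper name fuzzy_query_url is hypothetical, and the service name, index name and api-version are simply the placeholders from the question.

```python
from urllib.parse import urlencode


def fuzzy_query_url(service, index, term, api_version="2016-09-01"):
    """Build a GET query URL that runs a fuzzy search using the full Lucene syntax."""
    params = {
        "api-version": api_version,
        "queryType": "full",   # enables the Lucene query syntax, needed for '~'
        "search": term + "~",  # a trailing '~' marks the term as fuzzy
    }
    base = f"https://{service}.search.windows.net/indexes/{index}/docs"
    return base + "?" + urlencode(params)


url = fuzzy_query_url("fakename", "fakeindex", "run")
print(url)
```

The request still needs the usual api-key header, which is left out here.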

Related

How to disable tokenization for Azure Search Autocomplete?

I've created an Azure Search suggester for the "full_name" index field to support autocomplete. When I call the autocomplete REST endpoint with the "search" parameter set to, say, "Lor", I only get back "Lorem", not "Lorem Ipsum". Is there any way to disable tokenization for the suggester so that the search term "Lor" returns the full name "Lorem Ipsum"?
The Autocomplete API is meant to suggest search terms based on the incomplete terms a user is typing into the search box (type-ahead). It supports three modes:
oneTerm – Only one term is suggested. If the query has two terms, only the last term is completed. For example:
"washington medic" -> "medicaid", "medicare", "medicine"
twoTerms – Matching two-term phrases in the index are suggested. For example:
"medic" -> "medicare coverage", "medical assistant"
oneTermWithContext – Completes the last term in a query with two or more terms, where the last two terms are a phrase that exists in the index. For example:
"washington medic" -> "washington medicaid", "washington medical"
The twoTerms mode might work for you. If you're looking for an API that suggests documents based on an incomplete query term, try the Suggestions API. It returns the entire contents of a field that has a Suggester enabled for all documents that matched the query.
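To make the twoTerms suggestion concrete, here is a small Python sketch of the request body you would POST to the index's docs/autocomplete endpoint. The suggester name "sg" is an assumption (use whatever name you gave your suggester), and autocomplete_request is just an illustrative helper, not part of any SDK.

```python
import json

ALLOWED_MODES = {"oneTerm", "twoTerms", "oneTermWithContext"}


def autocomplete_request(search_text, suggester_name, mode="twoTerms"):
    """Build the JSON body for an Autocomplete call in the given mode."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unknown autocompleteMode: {mode}")
    return {
        "search": search_text,            # the partial text typed so far
        "suggesterName": suggester_name,  # assumption: suggester defined on full_name
        "autocompleteMode": mode,
    }


body = autocomplete_request("Lor", "sg")
print(json.dumps(body, indent=2))
```

Whether "Lor" then completes to "Lorem Ipsum" depends on that two-term phrase existing in the index for the suggester's field.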

Azure Search not finding words containing search query

I have an index with nutritional information. A search for burger does not match hamburger or burgers.
What is the most appropriate & efficient way to be able to search for these with Azure Search? I can use wildcards to match burgers (i.e. burger*) but Azure Search does not support wildcards at the start of the query, so I can't figure out how to match hamburger.
You can achieve this with the Lucene query syntax in Azure Search (see the link below).
Set queryType to full (which enables the Lucene query syntax) and pass your term as a regular expression in the search parameter. For example, to find all matches containing the word burger, construct your query like this (try it in the Search explorer query window of Azure Search):
queryType=full&search=/.*burger.*/
Microsoft docs for Lucene query syntax
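To see why the leading and trailing .* matter, here is a small local illustration. It uses Python's re module as a stand-in for Lucene's regular expression engine (the two are not identical), relying only on the fact that a Lucene regex has to match a whole indexed term. The term list is made up for the example.

```python
import re


def matching_terms(pattern, terms):
    """Return the terms that the pattern matches in full, as a Lucene regex would require."""
    compiled = re.compile(pattern)
    return [term for term in terms if compiled.fullmatch(term)]


# Hypothetical indexed terms, for illustration only.
terms = ["burger", "hamburger", "burgers", "fries"]

print(matching_terms(r".*burger.*", terms))  # ['burger', 'hamburger', 'burgers']
print(matching_terms(r"burger.*", terms))    # ['burger', 'burgers']
```

The second call behaves like the prefix wildcard burger* from the question, which is why it misses hamburger.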

How can I use AzureSearch with wildcard

I want to search for a field that has the name "14009-00080300", and I want to get a hit when searching only on a part of that, for example "14009-000803".
Using this request I don't get any hits:
{
"search": "\"14009-000803\"*",
"count":true,
"top":10
}
Is there a way to make Azure Search behave like a SQL wildcard search (select * from table where col like '%abc%')?
You can get your desired result by performing a full query with Lucene syntax (as noted by Sumanth BM). The trick is to do a regex search. Modify your query params like so:
{
"queryType": "full",
"search": "/.*searchterm.*/",
"count":true,
"top":10
}
Replace 'searchterm' with what you are looking for, and Azure Search should return all matches from the searchable fields of your index.
See Doc section: MS Docs on Lucene regular expression search
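If the search term comes from user input, it probably needs escaping before it is wrapped in /.*…*/. Below is a hedged Python sketch of that step. The helper contains_query is hypothetical, and the set of characters it escapes is my assumption about which characters Lucene's regex syntax treats as operators, so check it against the docs linked above.

```python
import json

# Assumption: characters with special meaning in Lucene regular expressions,
# plus '/', which delimits the regex in the query string.
LUCENE_REGEX_SPECIALS = set('.?+*|{}[]()"\\#@&<>~/')


def contains_query(term, top=10):
    """Build a request body that looks for terms containing `term`."""
    escaped = "".join("\\" + ch if ch in LUCENE_REGEX_SPECIALS else ch for ch in term)
    return {
        "queryType": "full",               # enables Lucene syntax, including regex
        "search": "/.*" + escaped + ".*/",
        "count": True,
        "top": top,
    }


body = contains_query("searchterm")
print(json.dumps(body))
```

Keep in mind that the pattern is compared against analyzed terms, so a value that the field's analyzer splits into several tokens may not match as a single string.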
You can use generally recognized syntax for multiple (*) or single (?) character wildcard searches. Note that the Lucene query parser supports the use of these symbols with a single term, and not a phrase.
For example, to find documents containing words with the prefix "note", such as "notebook" or "notepad", specify "note*".
Note
You cannot use a * or ? symbol as the first character of a search.
No text analysis is performed on wildcard search queries. At query time, wildcard query terms are compared against analyzed terms in the search index and expanded.
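The wildcard rules quoted above (single term only, no leading * or ?) can be checked on the client before a query is sent. This is only a sketch with a hypothetical helper name; the service applies its own validation regardless.

```python
def is_valid_wildcard_term(term):
    """Check a wildcard term against the rules quoted above."""
    if not term or " " in term:
        return False  # wildcards apply to a single term, not a phrase
    if term[0] in "*?":
        return False  # a leading * or ? is not allowed
    return "*" in term or "?" in term


for candidate in ["note*", "no?e", "*note", "note book*"]:
    print(candidate, is_valid_wildcard_term(candidate))
```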
SearchMode parameter considerations
The impact of searchMode on queries, as described in Simple query syntax in Azure Search, applies equally to the Lucene query syntax. Namely, searchMode in conjunction with NOT operators can result in query outcomes that might seem unusual if you aren't clear on the implications of how you set the parameter. If you retain the default, searchMode=any, and use a NOT operator, the operation is computed as an OR action, such that "New York" NOT "Seattle" returns all cities that are not Seattle.
https://learn.microsoft.com/en-us/rest/api/searchservice/simple-query-syntax-in-azure-search
Reference: https://learn.microsoft.com/en-us/rest/api/searchservice/lucene-query-syntax-in-azure-search#bkmk_wildcard
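The searchMode behaviour with NOT is easier to see on a toy example. The sketch below is not how Azure Search evaluates queries internally. It models each document as a plain string and uses substring checks, just to contrast the OR combination (searchMode=any) with the AND combination (searchMode=all). The document ids and contents are made up.

```python
def search_with_not(docs, include, exclude, search_mode="any"):
    """Toy model of: "include" NOT "exclude" under the two searchMode values."""
    has_include = {doc_id for doc_id, text in docs.items() if include in text}
    lacks_exclude = {doc_id for doc_id, text in docs.items() if exclude not in text}
    if search_mode == "any":
        return has_include | lacks_exclude  # OR: either condition is enough
    if search_mode == "all":
        return has_include & lacks_exclude  # AND: both conditions must hold
    raise ValueError(f"unknown searchMode: {search_mode}")


# Hypothetical documents, for illustration only.
docs = {
    "d1": "New York",
    "d2": "Seattle",
    "d3": "Boston",
    "d4": "New York and Seattle",
}

print(sorted(search_with_not(docs, "New York", "Seattle", "any")))  # ['d1', 'd3', 'd4']
print(sorted(search_with_not(docs, "New York", "Seattle", "all")))  # ['d1']
```

With any, Boston comes back even though it never mentions New York, which is the "unusual" outcome the quoted passage warns about.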

Azure search - Searching for a phrase

I am using Azure Search within a .Net application to search across several fields over multiple documents. When searching for a phrase by quoting the search keyword (for example "software developer"), the results include those that seem only to contain the word "software", ahead of those with the phrase "software developer". Are we misunderstanding how phrase searching should work?
I am using SearchMode.All and QueryType.Full
I am not supplying any filters in this case

field cross search in lucene

Hi:
I have two documents:
title | body
Lucene In Action | A high-performance, full-featured text search engine library.
Lucene Practice | Use lucene in your application
Now, I search for "lucene performance" using
private String[] f = { "title", "body"};
private Occur[] should = { Occur.SHOULD, Occur.SHOULD};
Query q = MultiFieldQueryParser.parse(Version.LUCENE_29, "lucene performance", f, should,new IKAnalyzer());
Then I get two hits:
"Lucene In Action" and "Lucene Practice".
However, I do not want "Lucene Practice" in the search results.
That is to say, I only want documents that contain all of my search terms to be returned. "Lucene Practice" does not contain the term "performance", so it should not be returned.
Any ideas?
Lucene cannot match across fields. That is to say, for the query "a b", it won't match "a" in title and "b" in body. For that you need to create another field, say, all_text, which has title and body both indexed.
Also, when you are searching for "lucene performance" I suppose you are looking for documents that have both the terms - lucene as well as performance. By default, the boolean operator is OR. You need to specify default operator as AND to match all the terms in the query. (Otherwise in this case, the query "lucene performance" will start returning matches that talk about database performance.)
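The two suggestions can be illustrated without Lucene. The Python sketch below is a toy model, not Lucene code: it builds a combined all_text token set per document and compares OR matching with AND matching on the two documents from the question. The tokenizer is a simple stand-in for a real analyzer.

```python
import re

docs = [
    {"title": "Lucene In Action",
     "body": "A high-performance, full-featured text search engine library."},
    {"title": "Lucene Practice",
     "body": "Use lucene in your application"},
]


def tokens(text):
    """Very rough stand-in for an analyzer: lowercase alphanumeric runs."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(docs, query, require_all=True):
    """Match the query against a combined title+body field, with AND or OR semantics."""
    wanted = tokens(query)
    hits = []
    for doc in docs:
        all_text = tokens(doc["title"] + " " + doc["body"])  # the catch-all field
        matched = wanted <= all_text if require_all else bool(wanted & all_text)
        if matched:
            hits.append(doc["title"])
    return hits


print(search(docs, "lucene performance", require_all=False))  # both titles
print(search(docs, "lucene performance", require_all=True))   # ['Lucene In Action']
```

With AND semantics over the combined field, "Lucene In Action" still matches even though "lucene" comes from the title and "performance" from the body.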
