Can I "Exact Search" for targeted field(s) and Search across other fields as well? - azure

The "Exact Search" fields use their own custom analyzer, while the Search fields use a language specific custom analyzer (built on MicrosoftStemmingTokenizerLanguage.French, for example).
I can't seem to use $filter for the "Exact Search" field, because $filter considers the entire field, and doesn't use the custom analyzer of the field.
Azure Search docs indicate this about field scoped queries.
"You can specify a fieldname:searchterm construction to define a fielded query operation, where the field is a single word, and the search term is also a single word"
There is no clear way on how to do this in Azure. We know we can use the searchFields parameter in our Azure Search Rest API calls to target specific fields, but how do we search ALL fields for 1 term while specifically searching some fields for specific terms, basically doing an “AND” between them?

This is possible using the Lucene query syntax.
Construct your query like this, where "chair" is the term to search for in all fields, and field1 and field2 are fields where you want to search for specific terms:
chair AND field1:something AND field2:else
In terms of how you use this in the REST API, just embed it in your search parameter. If you're using GET it looks like this (imagine it URL-encoded):
search=chair AND field1:something AND field2:else
If you're using POST, it goes in the request body and looks like this:
{
"search": "chair AND field1:something AND field2:else",
... (other parameters)
}

Related

Returning accented as well as normal result set via azure search filters

Does anyone know how to ensure we can return normal result as well as accented result set via the azure search filter. For e.g the below filter query in Azure search returns a name called unicorn when i check for record with name unicorn.
var result= searchServiceClient.Documents.SearchAsync<myDto>("*",new SearchParameters
{
SearchFields = new List<string> {"Name"},
Filter = "Name eq 'unicorn'"
});
This is all good but what i want is i want to write a filter such that it returns record named unicorn as well as record named únicorn (please note the first accented character) provided that both record exist.
This can be achieved when searching for such name via the search query using language or Standard ASCII folding search analyzer as mentioned in this link. What i am struggling to find out is how can we implement the same with azure filters?
Please let me know if anyone has got any solutions around this.
Filters are applied on the non-analyzed representation of the data, so I don’t think there’s any way to do any type of linguistic analysis on filters. One way to work around this is to manually create a field which only do lowercasing + asciifloding (no tokenization) and then search lucene queries that look like this:
"normal search query terms" AND customFilterColumn:"filtérValuèWithÄccents"
Basically the document would both need to match the search terms in any field AND also match the filter term in the “customFilterColumn”. This may not be sufficient for your needs, but at least you understand the art of the possible.
Using filters it won't work unless you specify in advance all the possibilities:
for example:
$filter=name eq 'unicorn' or name eq 'únicorn'
You'd better work with a different analyzer that will change accents to it's root form. As another possibility, you can try fuzzy search:
search=unicorn~&highlight=Name

How to disable tokenization for Azure Search Autocomplete?

I've created Azure Search Suggester for "full_name" index field in order to support autocomplete functionality. Now when I use Azure autocomplete REST endpoint by using "search" parameter as a let's say "Lor" I only get back the result "Lorem" not the "Lorem Ipsum". Is there any way to disable tokenization for suggester and to get back full name like "Lorem Ipsum" for the search term "Lor" for autocomplete?
The Autocomplete API is meant to suggest search terms based on incomplete terms one is typing into to the search box (type-ahead). It supports three modes:
oneTerm – Only one term is suggested. If the query has two terms, only
the last term is completed. For example:
"washington medic" -> "medicaid", "medicare", "medicine"
twoTerms – Matching two-term phrases in the index will be suggested,
for example:
"medic" -> "medicare coverage", "medical assistant"
oneTermWithContext – Completes the last term in a query with two or
more terms, where the last two terms are a phrase that exists in the
index, for example:
"washington medic" -> "washington medicaid", "washington medical"
The twoTerms mode might work for you. If you're looking for an API that suggests documents based on an incomplete query term, try the Suggestions API. It returns the entire contents of a field that has a Suggester enabled for all documents that matched the query.

MarkLogic Text Search: Return Results Based on Matches Inside Attributes

I am using Marklogic's search:search() functions to handle searching within my application, and I have a use case where users need to be able to perform a text search that returns matches from an attribute on my document.
For example, using this document:
<document attr="foo attribute value">Some child content</document>
I want users to be able to perform a text search (not using constraints) for "foo", and to return my document based on the match within the attribute #attr. Is there some way to configure the query options to allow this?
Typing in attr:"foo" is not a workable solution, so using attribute range constraints won't help, and users still need to be able to search for other child content not in the attribute node. I'm thinking perhaps there is a way to add a cts:query OR'd into the search via the options, that allows this attribute to be searched?
Open to any and all other solutions.
Thanks!
Edit:
Some additional information, to help clarify:
I need to be able to find matches within the attribute, and elsewhere within the content. Using the example above, searches for "foo", "child content", or "foo child content" should all return my document as a result. This means that any query options that are AND'd with the search (like <additional-query>, which is intended to help constrain your search and not expand it) won't work. What I'm looking for is (likely) an additional query option that will be OR'd with the original search, so as to allow searching by child node content, attribute content, or a mix of the two.
In other words, I'd like MarkLogic to treat any attribute node content exactly the same as element text nodes, as far as searching is concerned.
Thanks!!
You could accomplish this search with a serialized element-attribute-word cts query in the additional-query options for the search API. The element attribute word query will use the universal index to match individual tokens within attributes.
In MarkLogic 9 You may be able to use the following to perform your search:
import module namespace search = "http://marklogic.com/appservices/search"
at "/MarkLogic/appservices/search/search.xqy";
search:search("",
<options xmlns="http://marklogic.com/appservices/search">
<additional-query>
<cts:element-attribute-word-query xmlns:cts="http://marklogic.com/cts">
<cts:element>document</cts:element>
<cts:attribute>attr</cts:attribute>
<cts:text>foo</cts:text>
</cts:element-attribute-word-query>
</additional-query>
</options>
)
MarkLogic has ways to parse query text and map a value to an attribute word or value query.
First, you can use cts:parse():
http://docs.marklogic.com/guide/search-dev/cts_query#id_71878
http://docs.marklogic.com/cts.parse
Second, you can use search:search() with constraints defined in an XML payload:
http://docs.marklogic.com/guide/search-dev/query-options#id_39116
http://docs.marklogic.com/guide/search-dev/appendixa#id_36346
I'd look into using the <default> option of <term>. For details see http://docs.marklogic.com/guide/search-dev/appendixa#id_31590
Alternatively, consider doing query expansion. The idea behind that is that a end user send a search string. You parse it using search:parse of cts:parse (as suggested by Erik), and instead of submitting that query as-is to MarkLogic, you process the cts:query tree, to look for terms you want to adjust, or expand. Typically used to automatically blend in synonyms, related terms, or translations, but could be used to copy individual terms, and automatically add queries on attributes for those.
HTH!

Solr Search Field Best Practices

I'm using solr for an enterprise application. So far it works well, as I am using a ngram field to search against. It works correctly for partial queries (match against indexed ngrams). But the problem I have is, how to enforce exact query matches?. For an example the query "Test 1" should match exactly the same text as it is when the user enter it with double quotation marks. Currently Since I have used some tokenizers and filters, the double quotation marks get filtered out, there's no difference in the queries "test 1", "tEst 1" or "TEST 1" (that is because of the analyzer chain I use, but it is needed to work with ngrams and partial search).
Currently I'm searching against a ngram query field. In order to enforce exact query match, what should I do? what is the best practice?. currently what I think is to identify the double quotation marks from client side and change the query field to the original field (with out ngrams). But I feel like there should be a better way of doing this, since the problem I have is generic and solr is a complete enterprise level search engine.
You can have another field for it and add string as the fieldType for the same and index it with same.
When you want to perform the exact match you can query on the above field.
And when you want to perform partial search ..you can query to the earlier field which is indexed by ngram.
OR.. Here is another way you can try.
You have defined the current field type using the ngram. In that while indexing you can define the ngram tokenizer and for the query you mention keywordTokenizer and lowercase filter factory only.
While indexing the text will be tokenized and while performing the query it will not.

sunspot solr attribute substring/wildcard searches

Is it possible to search solr attribute fields (non solr.TextField types) using a substring/wildcard/partial string match?
For instance if I have a solr.StrField field and documents that contain the string "1234567890" I want to be able to search on "456" and have that document returned.
From what I can see only textfields can be searched in this method using things like EdgeNGram and the like, but not attribute fields??
You can have the partial matches working for String as well Text fields for wildcards.
If the query parsers you are using, supports leading wildcard queries, you can easily search for *456*, and this should match 1234567890.
However, EdgeNGram would only work for solr.TextField, as solr.strField do not allow analysers to be added to it.
So you can only define fields with class as solr.TextField and have the EdgeNGram in the analysis chain, which would break down the indexed terms into shingles for partial matching.

Resources