Solr default search field for multiple fields that have different analyzers

I have a document with title, stockCode, and category fields.
Each field has its own field type (and analysis chain). For instance, title uses EdgeNGram from 2 to 20, category uses EdgeNGram from 3 to 10, and stockCode has only a lowercase filter.
I don't want to search for the keyword "sample" by building a query like title:sample OR stockCode:sample OR category:sample.
I'd like to search with just "q=sample".
I copied my fields into a text field, but that does not work, because all the copied content is analyzed the same way. I don't want to index stockCode with EdgeNGram or any other filter. I'd like to index each field as I configured it and search for a keyword across them using those per-field indexes.
I've been researching this for three days, and Solr's documentation is a bit thin.

You can use the edismax query parser, which lets you give a list of fields to query and supply the bare query on its own. Each field is searched using its own analysis chain, and you can give each field a separate weight to score it differently.
defType=edismax&q=sample&qf=title^10 stockCode category
This searches for sample in each of the three fields and boosts hits in the title field by a factor of 10.
You can find the documentation for the edismax query parser under Searching in the reference guide.
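When the request is built in code, the qf value has to be URL-encoded (the ^ and the spaces in particular). A minimal sketch in Python, assuming a local Solr core named products (the host, port, and core name are placeholders, not taken from the question):

```python
from urllib.parse import urlencode

def build_edismax_url(base_url, keyword, field_boosts):
    """Build a Solr select URL that queries several fields via edismax.

    field_boosts maps a field name to a boost, or to None for no boost.
    """
    qf = " ".join(
        name if boost is None else f"{name}^{boost}"
        for name, boost in field_boosts.items()
    )
    params = {"defType": "edismax", "q": keyword, "qf": qf}
    # urlencode percent-encodes '^' and turns spaces into '+'.
    return f"{base_url}?{urlencode(params)}"

# Hypothetical endpoint; replace with your own host and core.
url = build_edismax_url(
    "http://localhost:8983/solr/products/select",
    "sample",
    {"title": 10, "stockCode": None, "category": None},
)
print(url)
```

The same three parameters can also be set as defaults on a request handler in solrconfig.xml, so that clients only send q=sample.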


Multiple analyzers for a single field in a search index of Azure Cognitive Search

We need two types of search (chosen by user input), partial and exact, for a few of our fields, and to meet that requirement we need two different analyzers for each of those fields.
The problem is that I'm not able to configure two analyzers for a single field. The only option I see is to create two separate indexes and query the appropriate one based on the user input, but that is clearly not the right solution: it does not scale, the data is mostly redundant, and it takes almost double the space.
I'm trying to create a duplicate field in the same index with a different analyzer and use one or the other based on the user input, but I'm not sure how to configure that in the index. The field name is what is searched against at query time. Is it possible to have two fields with different names that point to the same source field but use different analyzers?
You can have two fields with different names that point to one source field and use two different analyzers. This can be done using field mappings in the indexer definition.
I created the index as shown below.
As highlighted in the screenshot above, I added two new fields named cont01 and cont02.
These two new fields point to the field merged_content, each with a different analyzer.
In the indexer definition I configured the field mappings as shown below.
I ran the indexer, and the results are as shown below.
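Since the screenshots are not reproduced here, the following is only a rough sketch of what such definitions could look like as REST API request bodies, written as Python dictionaries. The field names cont01, cont02, and merged_content come from the answer; the index, indexer, and data source names, the id key field, and the two analyzers chosen (standard.lucene and keyword) are assumptions for illustration. It also assumes merged_content is available as a source field; if it is produced by a skillset, outputFieldMappings would be used instead.

```python
import json

# Hypothetical index: two searchable fields, each with its own analyzer.
index_definition = {
    "name": "docs-index",  # assumed name
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True},
        {"name": "cont01", "type": "Edm.String", "searchable": True,
         "analyzer": "standard.lucene"},  # assumed analyzer for tokenized search
        {"name": "cont02", "type": "Edm.String", "searchable": True,
         "analyzer": "keyword"},          # assumed analyzer for exact search
    ],
}

# Hypothetical indexer: both target fields are fed from the same source field.
indexer_definition = {
    "name": "docs-indexer",               # assumed name
    "dataSourceName": "docs-datasource",  # assumed name
    "targetIndexName": index_definition["name"],
    "fieldMappings": [
        {"sourceFieldName": "merged_content", "targetFieldName": "cont01"},
        {"sourceFieldName": "merged_content", "targetFieldName": "cont02"},
    ],
}

def search_fields_for(mode):
    """Pick the field to pass as searchFields, based on the user's choice."""
    return {"partial": "cont01", "exact": "cont02"}[mode]

print(json.dumps(indexer_definition["fieldMappings"], indent=2))
print(search_fields_for("exact"))
```

At query time the application would then restrict the search to one of the two fields through the searchFields parameter.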

Azure search - how to implement multiple facet search?

For example, if we have a category facet and it returns 5 different categories, then after clicking the first category the other categories are no longer available in the response. I want to implement multi-select facet search.
I'd appreciate your response.
For more info, I am referring to the same scenario as the one below:
https://feedback.azure.com/forums/263029-azure-search/suggestions/7762452-provide-multiselect-facets
The facets in the response are limited to the selected value, and multi-select facets are not supported. I'd suggest voting for the feature here: https://feedback.azure.com/forums/263029-azure-search/suggestions/7762452-provide-multiselect-facets
A workaround is to send multiple queries to get the facets and the filtered results separately.
For example:
1. Keep all facets in the UI (or make another query to get all facets) after the first search query.
2. Make another search query after another facet is selected, provided that the application tracks which facets the user has selected.
If you want to filter results by multiple facet values, you can modify your filter as below:
$filter = search.in(country, 'USA,Canada,Mexico,Brasil,Chile,Argentina', ',')
The first parameter to the search.in function is the string field reference (or a range variable over a string collection field in the case where search.in is used inside an any or all expression). The second parameter is a string containing the list of values, separated by spaces and/or commas. If you need to use separators other than spaces and commas because your values include those characters, you can specify an optional third parameter to search.in.
This third parameter is a string where each character of the string, or subset of this string is treated as a separator when parsing the list of values in the second parameter.
For more information about OData expression syntax for filters and order-by clauses in Azure Search, please refer to this tutorial.
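As a small illustration of the delimiter rules described above, here is a sketch of a helper that builds a search.in expression from a list of selected values. The function name is hypothetical; the only OData rule it relies on beyond the text above is that a single quote inside a string literal is escaped by doubling it.

```python
def search_in_filter(field, values, delimiter=","):
    """Build an OData search.in(...) expression for a list of values.

    Raises ValueError if a value contains the delimiter, since the value
    list could not be parsed back correctly in that case.
    """
    for value in values:
        if delimiter in value:
            raise ValueError(f"value {value!r} contains delimiter {delimiter!r}")
    # Single quotes inside an OData string literal are escaped by doubling.
    joined = delimiter.join(values).replace("'", "''")
    return f"search.in({field}, '{joined}', '{delimiter}')"

# Default comma delimiter.
print(search_in_filter("country", ["USA", "Canada", "Mexico"]))
# Values that contain commas need a different delimiter.
print(search_in_filter("city", ["Washington, D.C.", "Paris"], delimiter="|"))
```

Passing the delimiter explicitly as the third argument keeps values that contain spaces from being split.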
I've recently run into this limitation, and my workaround was to run a separate query for each facet, as suggested by @rudin above.
Let's say for example that your application has facets for Colour, Brand and Size. Your primary search query includes all three filters but doesn't return any facets. You then run an additional query ignoring any selected Colours, which will give you all available colour values for the chosen brands and sizes, and you do the same for the brand and size facets.
For the additional queries it's important to set the 'Size' property to 0 so no search results are returned - just the relevant facet.
By doing this and running these queries asynchronously, the performance overhead is minimal in my case, with 6 facets.
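The approach above can be sketched as follows. This only builds the request bodies and does not call the service; the function name and the way selections are stored are assumptions. In the REST API the equivalent of the SDK's 'Size' property is top.

```python
def build_multiselect_queries(search_text, selected):
    """Return (main_query, facet_queries) for multi-select faceting.

    selected maps each facet field to the list of values the user picked.
    Each facet query applies every filter except the one on its own field
    and asks for zero documents, so only the facet counts come back.
    """
    def filter_for(fields):
        clauses = [
            f"search.in({field}, '{','.join(selected[field])}', ',')"
            for field in fields
            if selected[field]
        ]
        return " and ".join(clauses)

    def with_filter(body, fields):
        expression = filter_for(fields)
        if expression:  # omit the filter entirely when nothing is selected
            body["filter"] = expression
        return body

    main_query = with_filter({"search": search_text}, list(selected))
    facet_queries = {
        facet: with_filter(
            {"search": search_text, "facets": [facet], "top": 0},
            [field for field in selected if field != facet],
        )
        for facet in selected
    }
    return main_query, facet_queries

main_query, facet_queries = build_multiselect_queries(
    "shirt", {"Colour": ["Red"], "Brand": ["Acme"], "Size": []}
)
print(main_query)
print(facet_queries["Colour"])
```

The facet queries are independent of each other, so they can be sent concurrently with the main query.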

The implication of @search.score in Azure Search Service

I understand the reason for having scoring profiles and boosting results based on certain fields, e.g. distance, rating, etc. To me, that mostly applies to structured documents such as JSON files. The scenario I cannot make sense of is when an indexer has the search service index, say, an MS Word or PDF document in Azure Blob storage. We have two fields, "id" and "content", and I don't know how the search score applies to them.
For example, there are two documents with different contents. I searched for a keyword, the same keyword was found in both documents, and the two MS Word documents received two different scores. My question is: why should the scores differ when both documents contain the same keyword?
The score is determined by many factors, for example, the count of terms in each document, and the number of searchable fields in which query terms were found. In your example, the documents have different lengths, so naturally they'll have different scores. HTH.
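To make the effect of document length concrete, here is a small sketch that assumes a BM25-style similarity with the commonly used defaults k1=1.2 and b=0.75. This is an illustration only: the service in the question may use a different ranking function or parameters, and all the numbers below are made up.

```python
import math

def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, docs_with_term,
                    k1=1.2, b=0.75):
    """Score one query term against one document, BM25 style."""
    # Rarer terms get a higher inverse document frequency.
    idf = math.log(1 + (n_docs - docs_with_term + 0.5) / (docs_with_term + 0.5))
    # Term frequency saturates, and longer documents are penalized.
    norm = tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_doc_len))
    return idf * norm

# Two documents that each contain the keyword once, but differ in length.
common = dict(tf=1, avg_doc_len=550, n_docs=2, docs_with_term=2)
short_doc = bm25_term_score(doc_len=100, **common)
long_doc = bm25_term_score(doc_len=1000, **common)
print(short_doc > long_doc)
```

With the same term count, the shorter document scores higher, because a single occurrence makes up a larger share of its content.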

Can we specify the order of columns to search in Solr?

Can we specify the order in which columns are searched in Solr?
For example, my search string is "Test".
My result should then contain all rows matching column 1, followed by all rows matching column 2, and so on, similar to a UNION query in SQL.
I tried a custom search handler that fires multiple requests to Solr and appends the results to build the final result.
Is there any other way to get this type of search using Solr?
I am using solr-5.4.1.
Thanks
You could use eDismax and match on both fields (I assume you mean columns by that). Then, use boosting to prioritize the matches in the first field to rank higher.
As an easy example, you would search against field1^10 field2, where ^10 is the boost factor. If that works but is not perfect, you can look into the documentation for other methods to apply boost.

Lucene is not finding results that are present in the index

I'm inspecting a Lucene index with Luke.
All documents have a field 'Title' and I would like to do a search for the search expression Title:Power, by which I want to find all documents with a title containing the word Power.
In Luke, I go to the tab "Search" and enter +Title:Power
When searching, there are no results. However, when I search by another field, I do find the document: +ContentType:MyContentType
In the column Title, I can clearly see the value of the document being: Power Quality Guide.
What could be the reasons I'm not finding this document when searching on Title?
There can be a number of reasons. The most common ones:
the Title field could be stored in the index but not indexed for search (Field.Store.YES, Field.Index.NO), unlike the field for which you do find results (ContentType);
the document(s) could have been indexed using one analyzer while the query uses a different one;
the field could have been indexed with the NOT_ANALYZED option, which indexes the whole value as a single term, so Title:Power would not match the term "Power Quality Guide".
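The second and third cases can be illustrated with a toy inverted index. This is not Lucene code; the two analyzer functions below are simplified stand-ins (one lowercases and splits on whitespace, the other keeps the whole value as a single term), used only to show why the lookup comes back empty.

```python
def lowercasing_analyzer(text):
    """Stand-in for an analyzer that lowercases and splits into words."""
    return text.lower().split()

def single_term_analyzer(text):
    """Stand-in for NOT_ANALYZED: the whole value is one term."""
    return [text]

def build_index(docs, analyzer):
    """Map each term to the set of document ids that contain it."""
    index = {}
    for doc_id, title in docs.items():
        for term in analyzer(title):
            index.setdefault(term, set()).add(doc_id)
    return index

def search(index, query, analyzer):
    """Return ids of documents containing every analyzed query term."""
    results = None
    for term in analyzer(query):
        postings = index.get(term, set())
        results = postings if results is None else results & postings
    return results or set()

docs = {1: "Power Quality Guide"}

# Analyzer mismatch: indexed as "power", queried as "Power".
analyzed = build_index(docs, lowercasing_analyzer)
print(search(analyzed, "Power", single_term_analyzer))   # no hit
print(search(analyzed, "Power", lowercasing_analyzer))   # hit

# Whole value indexed as one term: only the full title matches.
not_analyzed = build_index(docs, single_term_analyzer)
print(search(not_analyzed, "Power", single_term_analyzer))                # no hit
print(search(not_analyzed, "Power Quality Guide", single_term_analyzer))  # hit
```

In Luke, the equivalent check is to look at the terms actually stored for the Title field and to make sure the analyzer selected on the Search tab matches the one used at indexing time.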
