Solr multiple word search

I would like to know how to find a row by searching for multiple words.
For example, the text is:
The quick brown fox jumps over the lazy dog
I want to search for
quick dog
and get this row in the results.
If I search for
quick elephant
I should still get this row in the results.
Now suppose there are two rows:
The quick brown fox jumps over the lazy dog
The lazy brown fox jumps over the lazy dog
If I search for brown, I should get both rows in the results.
If I search for quick brown, I should get only the first row.
Is this achievable with Solr?

You can tune the way Solr matches multiple terms by using the mm (minimum should match) parameter in the edismax query parser (it is also available in the dismax query parser). The mm parameter lets you adjust exactly how many terms need to match for a document to be considered a valid result for the search.
The second example (where the second row should be excluded) is different: with a setting loose enough for "quick elephant" to match the row, the second row will be scored lower than the first row for "quick brown", but you won't be able to exclude it.
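As a rough illustration of the mm setting, here is a minimal Python sketch that only builds the request URL. The host, the core name mycore, and the field name text are assumptions made for the example, not details from the question.

```python
from urllib.parse import urlencode

def build_edismax_url(query, min_should_match, field="text",
                      base="http://localhost:8983/solr/mycore/select"):
    """Build a Solr select URL that uses edismax with an mm setting.

    The base URL and the field name are hypothetical placeholders.
    """
    params = {
        "defType": "edismax",    # use the extended dismax query parser
        "q": query,              # the user's search words
        "qf": field,             # assumed field holding the row text
        "mm": min_should_match,  # minimum number of terms that must match
    }
    return base + "?" + urlencode(params)

# With mm=1, a row only needs one of the two words, so "quick elephant"
# can still return the row that contains "quick".
url = build_edismax_url("quick elephant", 1)
print(url)
```

Raising mm to 2 (or 100%) would require both words, which is the trade-off described above: the same setting cannot both accept "quick elephant" and reject the second row for "quick brown".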

You can use the AND operator (&&) between the two words you want to find in a single document.
For example, "quick" AND "brown" will return only the first document.
The AND operator matches documents where both terms exist anywhere in the text of a single document.
You can also use the + sign to mark a word as required in the document:
+brown +quick
The "+" (required) operator requires that the term after the "+" symbol exist somewhere in a field of a single document.
Ref : https://lucene.apache.org/core/2_9_4/queryparsersyntax.html
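If the search words come from user input, the required-operator form can be built with a small helper. This is only a sketch: require_all is a hypothetical name, and it does not escape characters that have special meaning in the Lucene query syntax.

```python
def require_all(words):
    """Prefix every whitespace-separated word with '+' so each one is required."""
    return " ".join("+" + word for word in words.split())

print(require_all("quick brown"))  # prints "+quick +brown"
```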

Related

Azure Search orders results based on the position of the matched text

I would like to sort the documents by the position of the matching text and then alphabetically.
E.g. I have the following 3 values:
PATRICK STREET WEST
MOUNT ST PATRICK ROAD
PATTI MCCULLOCH WAY
I am searching with Pat* and I want the results to be sorted by the position of the matching text and then alphabetically.
E.g. the required result is:
PATRICK STREET WEST
PATTI MCCULLOCH WAY
MOUNT ST PATRICK ROAD
But I am getting the results in the order below:
PATRICK STREET WEST
MOUNT ST PATRICK ROAD
PATTI MCCULLOCH WAY

This is a bit of a tricky requirement. Let's break it down into the two distinct asks:
1. Sort results based on the position of the text - there's no query syntax that lets you do this directly, but there are ways to get close. One option that lets you boost documents that start with Pat is to create a second field with the same text that uses a custom analyzer with a keyword_v2 tokenizer and a lowercase token filter. That new field would only match if the whole string of text started with the letters Pat. You could then use scoring profiles or term boosting to weight matches in that field more heavily, bringing the results that matched at the beginning to the top.
2. Sort alphabetically - I'd recommend sorting by search score and then by the text, like this: $orderby=search.score() desc,TextField asc. If you follow my recommendation from #1, items that matched at the beginning of the string should have higher scores than other items and can then be sorted alphabetically within that set. All items where the match wasn't at the beginning will come after them and will also be sorted alphabetically.
That should at least get you fairly close to the requirement! You could always do a bit of extra sorting on the client side if needed too.
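The client-side sorting could look something like the following Python sketch. It assumes the results have already been fetched as plain strings and that "position" means the character offset of the first word that starts with the prefix; both function names are made up for the example.

```python
def match_position(text, prefix):
    """Return the offset of the first word starting with prefix.

    Returns len(text) when no word matches, so non-matches sort last.
    """
    offset = 0
    for word in text.split(" "):
        if word.lower().startswith(prefix.lower()):
            return offset
        offset += len(word) + 1  # +1 for the space after the word
    return len(text)

def sort_by_match_position(values, prefix):
    """Sort by match position first, then alphabetically."""
    return sorted(values, key=lambda v: (match_position(v, prefix), v))

results = ["PATRICK STREET WEST", "MOUNT ST PATRICK ROAD", "PATTI MCCULLOCH WAY"]
for value in sort_by_match_position(results, "Pat"):
    print(value)
# PATRICK STREET WEST
# PATTI MCCULLOCH WAY
# MOUNT ST PATRICK ROAD
```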

Azure Search Lucene (full query type) single character

I'm using Azure Cognitive Search with QueryType = SearchQueryType.Full. It works fine, but it doesn't find words of three characters or fewer, e.g. "the", "AC", etc.
I have some specific words that contain only two characters.
Is it possible to enable searching for all words, even those with three characters or fewer?
Update: I believe the problem is not with searching but with highlighting the results.

Having QueryType = SearchQueryType.Full is not the problem.
If you are using standard.lucene, the stopword list is empty by default.
https://learn.microsoft.com/en-us/azure/search/index-add-custom-analyzers#predefined-analyzers-reference
If you are using the English language analyzer, common filler words will not be indexed. https://learn.microsoft.com/en-us/azure/search/index-add-language-analyzers#english-analyzers
If you want a "starts with" search, you need to add a wildcard at the end of each word. Ex: the* to find theatre
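As a small sketch of the wildcard suggestion, the helper below builds a search request body in which every word gets a trailing *. The function name is hypothetical, and the body is only constructed here, not sent to a service.

```python
import json

def build_prefix_search(words):
    """Return a search request body that appends '*' to every word."""
    search_text = " ".join(word + "*" for word in words.split())
    return {"search": search_text, "queryType": "full"}

body = build_prefix_search("the ac")
print(json.dumps(body))
```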

Azure Search: Prioritize closest exact match over others in a prefix search

I'm currently doing a prefix search with Azure Cognitive Search like so:
docs?api-version=2019-05-06&search=Do*
Suppose that my index contains Dog, Big Dog, and Small Dog. The result set seems to be sorted alphabetically by default and looks like:
Big Dog
Dog
Small Dog
How can I change my query string so that the closest exact match appears first and the rest is sorted alphabetically? Here's the output I want:
Dog
Big Dog
Small Dog
So, if the user types D, Do, or Dog, I want to show Dog first to help them short-circuit typing.

The results are ordered by score, which is the result of the TF-IDF formula. In other words, the results are ranked by how relevant the query term is to each of your documents.
That said, I believe you need to use NGram (through a custom analyzer) to get the most relevant term first.
More info:
https://azure.microsoft.com/en-us/blog/custom-analyzers-in-azure-search/

Can you share what your exact document looks like? As Thiago mentioned, Azure Cognitive Search returns a relevance score that shows how relevant the entire document is to the input query.
If your documents have only 1 matching field with the exact text you shared, it should return "Dog" with the highest score as it's more relevant to the query.

MongoDB: Indexing for a live search

Situation
I need to create a live search with MongoDB, but I don't know which index is better to use: a normal index or a text index. Yesterday I found the main differences between them. I have the following document:
{
title: 'What vitamins are found in blueberries'
//other fields
}
So, when the user enters blue, the system must find this document (... blueberries).
Problem
I found these differences in the article about them:
A text index on the other hand will tokenize and stem the content of the field. So it will break the string into individual words or tokens, and will further reduce them to their stems so that variants of the same word will match ("talk" matching "talks", "talked" and "talking" for example, as "talk" is a stem of all three).
So, why is a text index, and its subsequent searches, faster than a regex on a non-indexed text field? It's because text indexes work as a dictionary, a clever one that's capable of discarding words on a per-language basis (defaults to English). When you run a text search query, you run it against the dictionary, saving yourself the time that would otherwise be spent iterating over the whole collection.
That's what I need, but:
The $text operator can search for words and phrases. The query matches on the complete stemmed words. For example, if a document field contains the word blueberry, a search on the term blue will not match the document. However, a search on either blueberry or blueberries will match.
Question
I need a fast, clever dictionary, but I also need to search by substring. How can I combine these two methods?

Azure Search- Is there way to get exact match of words?

In Azure Search, is there a way to get exact-match results for multiple words?
If I search for "Coca Cola Millenials", can I get only the results that match the phrase "Coca Cola Millenials"?

Are you asking if you can search for the phrase "Coca Cola Millenials"? Yes, you can. Surround the phrase with quotes as you did in this question.
From our documentation:
The phrase operator encloses a phrase in quotation marks. For example,
while Roach Motel (without quotes) would search for documents
containing Roach and/or Motel anywhere in any order, "Roach Motel"
(with quotes) will only match documents that contain that whole
phrase together and in that order (text analysis still applies).
Hope that helps
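To make the difference between the quoted and unquoted forms concrete, here is a toy Python model of the two behaviours. It is not how the service is implemented: whitespace splitting and lowercasing stand in for real text analysis, and the function names are made up for the example.

```python
def tokens(text):
    """Lowercase and split on whitespace (a stand-in for real text analysis)."""
    return text.lower().split()

def matches_any_term(document, query):
    """Unquoted query: match if the document contains any of the query words."""
    doc = set(tokens(document))
    return any(word in doc for word in tokens(query))

def matches_phrase(document, phrase):
    """Quoted query: match only if the words appear together and in order."""
    doc, words = tokens(document), tokens(phrase)
    return any(doc[i:i + len(words)] == words
               for i in range(len(doc) - len(words) + 1))

doc = "Motel reviews mention a roach"
print(matches_any_term(doc, "Roach Motel"))                   # True
print(matches_phrase(doc, "Roach Motel"))                     # False
print(matches_phrase("The Roach Motel sign", "Roach Motel"))  # True
```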
