const searchedWord = req.body.searchTerm;
console.log(searchedWord);
db.collection('subtitle')
.find({
$text: { $search: searchedWord },
})
Here is my code. It takes a search word from the user, searches through all documents, and returns the results. The problem is that it is case sensitive and it also returns documents containing related words: if you search for "happen", documents containing "happened" and "happens" are returned as well.
I just want the search to be case insensitive and to match the exact word.
I tried a regex, but it does not work when the search term is dynamic like this.
All the MongoDB documentation I found uses a hardcoded search word.
MongoDB Text Indexes use language-specific stemming rules.
When using english, suffixes are removed and the stem word is indexed, so "happens", "happened", and "happening" are all stored in the index as "happen".
To disable stemming, explicitly specify the language as "none".
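As a rough sketch of how this could look with a dynamic search term in Node.js: the helpers below build a $text filter that sets $language to "none" for the query, plus a regex alternative that escapes the user's input and wraps it in word boundaries. The function names and the `text` field name are assumptions for illustration and are not part of the original code.

```javascript
// Escape characters that have special meaning in a JavaScript RegExp,
// so user input is matched literally.
function escapeRegExp(input) {
  return input.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// $text filter that asks for no language-specific stemming on this query.
function buildTextFilter(term) {
  return { $text: { $search: term, $language: "none" } };
}

// Regex alternative: whole-word, case-insensitive match on a single field.
// "text" is a hypothetical field name; use the field that holds the subtitle text.
function buildExactWordFilter(term, field = "text") {
  return { [field]: new RegExp(`\\b${escapeRegExp(term)}\\b`, "i") };
}

const exactFilter = buildExactWordFilter("happen");
console.log(exactFilter.text.test("It will Happen soon")); // true
console.log(exactFilter.text.test("It happened"));         // false
```

Either object can be passed to find() in place of the original filter. For the $text variant, the index itself would most likely also need to be created with default_language: "none", since an index built with english already stores stemmed terms. The regex variant does not use the text index, so it may be slower on large collections.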
I want to search for a field that has the name "14009-00080300", and I want to get a hit when searching only on a part of that, for example "14009-000803".
Using this code I don't get any hits:
{
"search": "\"14009-000803\"*",
"count":true,
"top":10
}
Is there a way to use Azure Search the way SQL uses its wildcard search (select * from table where col like '%abc%')?
You can get your desired result by performing a full query with Lucene syntax (as noted by Sumanth BM). The trick is to do a regex search. Modify your query params like so:
{
"queryType": "full",
"search": "/.*searchterm.*/",
"count":true,
"top":10
}
Replace 'searchterm' with what you are looking for and Azure Search should return all matches from the searchable fields in your index.
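If the search term comes from user input, the request body can be built programmatically. The following is only a sketch: the helper names are made up for illustration, and the set of characters being escaped is an assumption about what the Lucene regular expression syntax treats as special, so it should be checked against the docs linked below.

```javascript
// Escape characters assumed to be special in a Lucene regular expression,
// plus the forward slash that delimits the regex in the search string.
function escapeLuceneRegex(term) {
  return term.replace(/[.?+*|{}\[\]()"\\#@&<>~\/]/g, "\\$&");
}

// Build the JSON body for a "contains" style search using full Lucene syntax.
function buildContainsQuery(term, top = 10) {
  return {
    queryType: "full",
    search: `/.*${escapeLuceneRegex(term)}.*/`,
    count: true,
    top: top,
  };
}

console.log(JSON.stringify(buildContainsQuery("searchterm"), null, 2));
```

The regex is compared against the analyzed terms in the index, so whether a value containing punctuation matches depends on how the field's analyzer tokenized it.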
See Doc section: MS Docs on Lucene regular expression search
You can use generally recognized syntax for multiple (*) or single (?) character wildcard searches. Note the Lucene query parser supports the use of these symbols with a single term, and not a phrase.
For example, to find documents containing the words with the prefix "note", such as "notebook" or "notepad", specify "note*".
Note
You cannot use a * or ? symbol as the first character of a search.
No text analysis is performed on wildcard search queries. At query time, wildcard query terms are compared against analyzed terms in the search index and expanded.
SearchMode parameter considerations
The impact of searchMode on queries, as described in Simple query syntax in Azure Search, applies equally to the Lucene query syntax. Namely, searchMode in conjunction with NOT operators can result in query outcomes that might seem unusual if you aren't clear on the implications of how you set the parameter. If you retain the default, searchMode=any, and use a NOT operator, the operation is computed as an OR action, such that "New York" NOT "Seattle" returns all cities that are not Seattle.
https://learn.microsoft.com/en-us/rest/api/searchservice/simple-query-syntax-in-azure-search
Reference: https://learn.microsoft.com/en-us/rest/api/searchservice/lucene-query-syntax-in-azure-search#bkmk_wildcard
Search function
function (doc) {
  // The loop starts at 1, so the entry at index 0 is not indexed.
  for (var j = 1; j < doc.sheets[1].data.length; j++) {
    index("Name", doc.sheets[1].data[j]);
  }
}
My analyzer is standard. How would I modify this function to support search with partial matching?
You should be able to use * and other special characters as part of the query syntax to perform wildcard searches, fuzzy searches, etc.
From https://console.bluemix.net/docs/services/Cloudant/api/search.html#query-syntax:
If you want a fuzzy search, you can run a query with ~ to find terms like the search term. For instance, look~ finds the terms book and took.
and
Wildcard searches are supported, for both single (?) and multiple (*) character searches. For example, dat? would match date and data, whereas dat* would match date, data, database, and dates.
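To make the quoted syntax concrete, here is a small sketch that builds such query strings for the "Name" field from the index function above. The helper names are hypothetical, and the escaping list is an assumption based on the usual Lucene special characters.

```javascript
// Escape Lucene query-syntax special characters in user input so that only
// the wildcard or fuzzy operator appended below is interpreted.
function escapeLuceneTerm(term) {
  return term.replace(/[+\-!(){}\[\]^"~*?:\\\/&|]/g, "\\$&");
}

// Prefix (partial) match: Name:dat* matches date, data, database, ...
function prefixQuery(field, term) {
  return `${field}:${escapeLuceneTerm(term)}*`;
}

// Fuzzy match: Name:look~ matches terms similar to "look".
function fuzzyQuery(field, term) {
  return `${field}:${escapeLuceneTerm(term)}~`;
}

console.log(prefixQuery("Name", "dat")); // Name:dat*
console.log(fuzzyQuery("Name", "look")); // Name:look~
```

The resulting string would be passed as the q parameter of the search request, so the index function itself should not need to change for prefix matching.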
I am working on a search feature in which I have to perform search operation in about 300,000 documents.
For this I have created a compound index over four fields and have given them weights as well. By default, if I search for a phrase with multiple words, Mongoose combines the keywords with an OR operation. E.g., if you search for small cell lung, Mongoose returns every document that contains at least one of these words.
It is working very fast.
But my requirement is to perform an AND operation. To achieve this I split the phrase into words and put each word in double quotes, so when a user searches for a phrase with multiple words, the search is performed as an AND over the words. E.g., a search for small cell lung becomes "small" "cell" "lung" and should find only documents that contain all three words. This also works, but it is now very slow.
Is there any way to make it faster?
I will share the code if required.
Thanks
Give this a shot:
db.table.find({ $text: { $search: "\"small\" \"cell\" \"lung\"" } })
The above code will do the following:
If the search string includes phrases, the search performs an AND with any other terms in the search string; e.g. a search for "\"kiss me on\" cheeks lips" searches for "kiss me on" and ("cheeks" or "lips").
You can read it from here:
Docs
When you want to match the complete phrase, enclose the phrase in escaped double quotes:
db.table.find( { $text: { $search: "\"small cell lung\"" } } )
Also see the $text documentation on the mongodb site
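For a user-supplied phrase, the quoted search string can be generated instead of written by hand. A minimal sketch, assuming whitespace-separated words; the helper name toAndSearch is made up for illustration:

```javascript
// Turn "small cell lung" into "\"small\" \"cell\" \"lung\"" so that $text
// treats every word as a required phrase (logical AND).
function toAndSearch(phrase) {
  return phrase
    .trim()
    .split(/\s+/)
    .filter(Boolean)                               // drop empty pieces
    .map((word) => `"${word.replace(/"/g, "")}"`)  // strip stray quotes, then wrap
    .join(" ");
}

const andFilter = { $text: { $search: toAndSearch("small cell lung") } };
console.log(andFilter.$text.$search); // "small" "cell" "lung"
```

The resulting object can be passed to find() as in the examples above.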
I have stemming enabled in my Solr instance. I had assumed that to perform an exact word search without disabling stemming, it would be as simple as putting the word in quotes. This, however, does not appear to be the case.
Is there a simple way to achieve this?
There is a simple way, if what you're referring to is the "slop" (required similarity) as part of a fuzzy search (see the Lucene Query Syntax here).
For example, if I perform this search:
q=field_name:determine
I see results that contain "determine", "determining", "determined", etc. If I then modify the query like so:
q=field_name:determine~1
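I only see results that contain the word "determine". This is because I'm specifying a required similarity of 1, which means "exact match". I can specify this value anywhere from 0 to 1. (This is the older fuzzy syntax; since Lucene 4.0 the number after ~ is a maximum edit distance of 0, 1, or 2, so check how your Solr version interprets it.)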
Another thing you can do is index the same text without stemming in one field, and with stemming in another. Boost the non-stemmed field & that should prefer exact versions of words to stemmed versions. Of course you could also write your own query parser that directs quoted phrases to the non-stemmed field only.
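A sketch of what the boosted two-field query could look like, using the edismax query parser and its qf parameter. The field names title_exact and title_stemmed and the boost value are assumptions for illustration; the actual names depend on your schema.

```javascript
// Build Solr request parameters that search an unstemmed and a stemmed copy
// of the same text, boosting the unstemmed copy so exact words rank higher.
function buildSolrParams(userQuery) {
  const params = new URLSearchParams();
  params.set("defType", "edismax");
  params.set("q", userQuery);
  // Hypothetical field names: title_exact (no stemming), title_stemmed (stemming).
  params.set("qf", "title_exact^3 title_stemmed");
  return params;
}

const solrParams = buildSolrParams("determine");
console.log(solrParams.toString());
```

The string would be appended to the select handler URL. Stemmed matches are still returned with this setup; they are only ranked lower than exact ones.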
CouchDB lets you search for values starting from a startkey, for an exact key-value pair, etc.
But is there any way to search for a substring in a specified field?
The problem is like this. Our news database consists of about 40,000 news documents. Say they have title, content, and url fields. We want to find news documents which have "restaurant" in their title. Is there any way to do it?
The View Collation wiki page says nothing about this :( And it seems strange to me that there is no tool to handle this problem and that all I can do is parse the JSON results with Python, PHP, or something else. In MySQL it's simply the LOCATE() function.
Use couchdb-lucene.
Be careful here. Lucene is not always the best answer.
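If you're only searching one limited field and only searching for a word like restaurant, then Lucene, which is really meant to tokenize large texts/documents, can be way overkill. You can get the same effect by splitting the title.
function(doc){
  // Skip documents that have no string title.
  if (typeof doc.title !== "string") return;
  var words = doc.title.split(" ");
  // Emit one row per word; query with include_docs=true to get the document.
  for (var i = 0; i < words.length; i++) {
    emit(words[i], null);
  }
}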
Also, Lucene and CouchDB do not support substring search where the string is not at the beginning of a word.
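Querying such a view for a single word could look roughly like the sketch below. The server URL, database name, design document name, and view name are all hypothetical; only the key and include_docs query parameters are standard CouchDB view options.

```javascript
// Build the URL for looking up one word in a view that emits title words as keys.
// The key must be JSON-encoded, hence JSON.stringify before URL encoding.
function buildViewUrl(baseUrl, dbName, designDoc, viewName, word) {
  const params = new URLSearchParams({
    key: JSON.stringify(word),
    include_docs: "true", // return the full document with each row
  });
  return `${baseUrl}/${dbName}/_design/${designDoc}/_view/${viewName}?${params}`;
}

// All names below are placeholders for illustration.
console.log(
  buildViewUrl("http://localhost:5984", "news", "search", "by_title_word", "restaurant")
);
```

View keys are compared exactly, so "Restaurant" and "restaurant" are different keys unless the map function lowercases the words before emitting them.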