Is there a way to prevent partial word matching using Sitecore Search and Lucene?

Is there a way, when using Sitecore Search and Lucene, to not match partial words? For example, when searching for "Bos" I would like to NOT match the word "Boston". Is there a way to require the entire word to match? Here is a code snippet. I am using FieldQuery.
bool _foundHits = false;
_index = SearchManager.GetIndex("product_version_index");
using (IndexSearchContext _searchContext = _index.CreateSearchContext())
{
    QueryBase _query = new FieldQuery("title", txtProduct.Text.Trim());
    SearchHits _hits = _searchContext.Search(_query, 1000);
    ...
}

You may want to try something like this to build the query you want to run. It will put the + in (indicating a required term) and quote the term, so it should exactly match what you're looking for, provided you pass in BooleanClause.Occur.MUST. It's worked for me.
protected BooleanQuery GetBooleanQuery(string fieldName, string term, BooleanClause.Occur occur)
{
    QueryParser parser = new QueryParser(fieldName, new StandardAnalyzer());
    BooleanQuery query = new BooleanQuery();
    query.Add(parser.Parse(term), occur);
    return query;
}
Essentially, your query ends up being parsed to +title:"Bos". You could also download Luke and play around with the query syntax in there; it's easier if you know what the syntax should be and then work backwards to see which query objects will generate it.

You have to place the query in double quotes to get exact match results. Lucene supports many such operators and boolean parameters, which can be found here: http://lucene.apache.org/core/2_9_4/queryparsersyntax.html
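To make the quoting advice concrete, here is a minimal sketch in JavaScript that only builds the query string described above (a required, double-quoted phrase on one field). The helper name buildRequiredPhraseQuery is hypothetical and not part of Sitecore or Lucene; whether the resulting query excludes "Boston" still depends on how the field was analyzed at index time.

```javascript
// Hypothetical helper (not a Sitecore or Lucene API): builds a Lucene
// query string that requires an exact, double-quoted phrase in one field.
function buildRequiredPhraseQuery(fieldName, term) {
  // Inside a quoted phrase, escape backslashes and double quotes so the
  // user's text cannot terminate the phrase early.
  const escaped = term.trim().replace(/(["\\])/g, "\\$1");
  return "+" + fieldName + ':"' + escaped + '"';
}

console.log(buildRequiredPhraseQuery("title", " Bos ")); // +title:"Bos"
```

The resulting string could then be handed to a query parser, as in the GetBooleanQuery answer above.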

It depends on the field type. If you have a memo or text field, partial matching is applied. If you want exact matching, use a string field instead. You can find some details here: https://www.cmsbestpractices.com/bug-how-to-fix-solr-exact-string-matching-with-sitecore/ .

Related

Gradle: How to filter and search through text?

I'm fairly new to gradle. How do I filter text in the following manner?
Pretend that the output/result I want to filter will be the two URLs below.
"http://localhost/artifactory/appNameIwant/moreStuffHereThatsDynamic"
> I want this URL
"http://localhost/artifactory/differentAppName"
> I don't want this URL
I want to put up a "match" variable that would be something like
variable = http://localhost/artifactory/appnameIwant
So essentially, the string will not be a perfect match. I want it to filter and provide back any URLs that start with the variable listed above. It cannot be a perfect match as the characters after the /appnameIwant/ will be changing.
I want to use a for loop to cycle through an array, with an if then statement to return any matches. For instance.
for (i=0; i < results.length; i++){
if (results[i] strings matches (http://localhost/artifactory/appnameIwant) {
return results[i] }
I am just filtering the URL strings themselves, not anything complicated inside the webpages.
Let me know if further explanation would be helpful.
Thanks so much for your time and help!
I figured it out - I just used
if (string.startsWith("texthere")) { println string }
A lot easier than I thought!
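For comparison, the same prefix filter can be sketched in JavaScript, whose String.prototype.startsWith is also a case-sensitive prefix check. The array contents are the two example URLs from the question; the function name filterByPrefix is made up for illustration.

```javascript
// Example data taken from the question; filterByPrefix is a hypothetical name.
const prefix = "http://localhost/artifactory/appNameIwant";
const results = [
  "http://localhost/artifactory/appNameIwant/moreStuffHereThatsDynamic",
  "http://localhost/artifactory/differentAppName",
];

// Keep only the URLs that begin with the wanted prefix.
function filterByPrefix(urls, wanted) {
  return urls.filter((url) => url.startsWith(wanted));
}

const matches = filterByPrefix(results, prefix);
console.log(matches);
```

Note that the check is case-sensitive, so a prefix spelled "appnameIwant" would not match a URL containing "appNameIwant".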

Sitecore 7 content search Starts with function

I am working with Sitecore 7 content search.
var webIndex = ContentSearchManager.GetIndex("sitecore_web_index");
using (var context = webIndex.CreateSearchContext())
{
    var results = context.GetQueryable<SearchResultItem>()
        .Where(i => i.Content.Contains(mysearchterm));
}
Sitecore performs a contains operation on the content string. Content holds the whole content of the page, and the query does not return the results I expect: for example, searching for "hr" also returns results containing "through" in the content. I tried StartsWith, but that only matches the start of the whole content string. I tried Equals, but that matches the whole word. Is there any way to search the content for words that start with the search term?
Put '^' as the first character of the search phrase; it means "starts with". For example, to find all terms starting with "hr", just add '^' to the search keyword like this: "^hr".
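To pin down the behaviour being asked for (a word that starts with the term, rather than the term appearing anywhere), here is a small JavaScript sketch using a regular expression word boundary. It illustrates the matching semantics only and is not the Sitecore ContentSearch API; both function names are hypothetical.

```javascript
// Escape regex metacharacters so the search term is treated literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: true when some word in `content` starts with `term`.
// \b is a word boundary, so "hr" matches "HR team" but not "through".
function hasWordStartingWith(content, term) {
  const pattern = new RegExp("\\b" + escapeRegExp(term), "i");
  return pattern.test(content);
}

console.log(hasWordStartingWith("Contact the HR team", "hr"));
console.log(hasWordStartingWith("Walk through the door", "hr"));
```

The first call reports a match and the second does not, because in "through" the letters "hr" are preceded by another word character.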

How to create a search query for partial string matches in Mongoose?

I'm new to Mongoose.js and I'm wondering how to create a simple Mongoose query that returns values containing the characters in the order that they were submitted.
This will be for an autocomplete form which needs to return cities with names that contain characters input into the search field. Should I start with a .where query?
You could find by regexp, which should allow you to search in a flexible (although not extremely fast) way. The code would be something similar to:
var input = 'ln'; // the input from your auto-complete box
cities.find({name: new RegExp(input, "i")}, function(err, docs) {
...
});
Of course, you could preprocess the string to make it match from the start (prepend ^), from the end (append $), etc. Just note that matching against arbitrary parts of long strings may be slow.
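The preprocessing mentioned above might look like the following sketch. It assumes the field is called name, as in the answer; the function buildNameFilter and its mode values are hypothetical. It also escapes regex metacharacters in the user's input, which the original snippet does not do.

```javascript
// Escape regex metacharacters so user input is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: builds a filter object for the `name` field.
// mode "prefix" anchors at the start, "suffix" at the end, and anything
// else matches the input anywhere in the string.
function buildNameFilter(input, mode) {
  const safe = escapeRegExp(input);
  let source = safe;
  if (mode === "prefix") {
    source = "^" + safe;
  } else if (mode === "suffix") {
    source = safe + "$";
  }
  return { name: new RegExp(source, "i") };
}

// Usage with the model from the answer above would be along the lines of:
//   cities.find(buildNameFilter(input, "prefix"), function (err, docs) { ... });
console.log(buildNameFilter("ln", "suffix").name.test("Lincoln"));
```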

examine stripping out search words

I'm using Umbraco and I have Examine up and running; however, my query is having words stripped out.
For example:
I am searching on "man on the moon" with the following line of code, the variable "searchTerm" should contain "man on the moon":
var Searcher = ExamineManager.Instance.SearchProviderCollection["MySearcher"];
var searchCriteria = Searcher.CreateSearchCriteria();
var query = searchCriteria.Field("Name", searchTerm).Compile();
however, the query is generated as this when I debug:
{ SearchIndexType: , LuceneQuery: +Name:"man moon" }
Notice how it has removed the words "on the" from the searchTerm?
Presumably this is because they are deemed stop/reserved words. However, this means I do not get the search results I expect.
How can I get around this?
Internally the StopAnalyzer class is used by the StandardAnalyzer as part of the standard indexing process. The StopAnalyzer (http://lucenenet.apache.org/docs/3.0.3/d7/df5/_stop_analyzer_8cs_source.html#l00054) contains a method which allows you to substitute a different set of stopwords as an ISet type parameter rather than use the standard ENGLISH_STOP_WORDS_SET (line 134).
And I read here (http://webcache.googleusercontent.com/search?q=cache:sA-uyAC015UJ:our.umbraco.org/m%3Fmode%3Dtopic%26id%3D25600+&cd=2&hl=en&ct=clnk&gl=uk) that you can get Examine to use an empty set of stopwords by adding the following line to your application_start method in global.asax
Lucene.Net.Analysis.StopAnalyzer.ENGLISH_STOP_WORDS_SET = new System.Collections.Hashtable();
So with an empty set of stop words, your "man on the moon" should be back.
A bit of an odd idea, but as an alternative you could also add a StopAnalyzer to ExamineSettings.config to create an index of docs with only the stop words and then AND them with your StandardAnalyzer result set.
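The effect of the stop word set can be illustrated with a toy filter in JavaScript. This is a sketch of the idea only, not Lucene.Net or Examine code: the stop word list below is a small illustrative subset, and removeStopWords is a hypothetical name.

```javascript
// Illustrative subset of English stop words (assumption: not the full
// list used by Lucene's StandardAnalyzer).
const SAMPLE_STOP_WORDS = new Set(["a", "an", "and", "in", "of", "on", "the", "to"]);

// Hypothetical helper: lowercases, splits on whitespace, and drops any
// token found in the supplied stop word set.
function removeStopWords(text, stopWords) {
  return text
    .toLowerCase()
    .split(/\s+/)
    .filter((word) => word.length > 0 && !stopWords.has(word));
}

console.log(removeStopWords("man on the moon", SAMPLE_STOP_WORDS)); // [ 'man', 'moon' ]
console.log(removeStopWords("man on the moon", new Set())); // [ 'man', 'on', 'the', 'moon' ]
```

With the sample set, "on the" is dropped just as in the generated LuceneQuery above; with an empty set, every word survives.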

How do you do a contains query using the MongoDB node.js driver?

I'm trying to do a search on items that contain a certain substring, but I'm getting no results back (even though I know the data is right for the query, including case):
collection.find({name: "/.*" + keyword + ".*/"}).toArray(function(err, items)
Should that not match everything that contains the keyword? It just returns an empty object.
I'm just using the regular MongoDB driver in an ExpressJS app.
You need to build a regular expression first. Try something like this (note that the pattern string passed to RegExp must not include the surrounding slashes):
var regex = new RegExp(".*" + keyword + ".*");
Then pass in the variable to the query. I generally find it easier to do the query as a variable and pass that in:
var query = { FieldToSearch: new RegExp('^' + keyword) };
collection.find(query).toArray(...)
I've included the regex as a left rooted regex to take advantage of indexes (always recommended if possible for performance). For more take a look here:
http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-RegularExpressions
I did it like this:
keyword = "text_to_search";
collection.find({name: {$regex: keyword, $options: "i"}})
I used the i option to make the query case-insensitive, but you can use other options too.
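One thing the snippets above leave out is escaping: if the keyword contains regex metacharacters such as "." or "(", they will be interpreted as pattern syntax. Below is a minimal sketch that builds the $regex filter document with the keyword escaped first. The helper name buildContainsFilter is hypothetical; $regex and $options are the standard MongoDB query operators.

```javascript
// Escape regex metacharacters so the keyword is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: builds a case-insensitive "contains" filter for
// the given field using the $regex and $options operators.
function buildContainsFilter(fieldName, keyword) {
  return {
    [fieldName]: { $regex: escapeRegExp(keyword), $options: "i" },
  };
}

// Usage would be along the lines of:
//   collection.find(buildContainsFilter("name", keyword)).toArray(...)
console.log(JSON.stringify(buildContainsFilter("name", "a.b")));
```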
Try this:
var keyword = req.params.keywords;
var regex = RegExp(".*" + keyword + ".*");
Note.find({noteBody: regex, userID: userID})
I got the keywords from the request parameters and I want to search noteBody for them; here the keyword is a variable. If you want to put a variable in the database find, the format must be var regex = RegExp(".*" + keyword + ".*"). Hope this helps. Thanks
