Highlighter.net returns no matches - c#-4.0

I am using Lucene.Net 2.9.4 and Lucene.Net Contrib 2.9.4. My Lucene query looks like:
+contents:umbraco*
I get results for this query. My highlighter code to get fragments looks like:
public string GetHighlight(string value, string highlightField, IndexSearcher searcher, string luceneRawQuery)
{
    var query = GetQueryParser(highlightField).Parse(luceneRawQuery);
    var scorer = new QueryScorer(searcher.Rewrite(query));
    var highlighter = new Highlighter(HighlightFormatter, scorer);
    var tokenStream = HighlightAnalyzer.TokenStream(highlightField, new StringReader(value));
    return highlighter.GetBestFragments(tokenStream, value, MaxNumHighlights, Separator);
}
In my scorer object the termsToFind property is 0; I would expect it to be at least one. Does anyone have any ideas or suggestions on how to fix or debug this?
Regards
Ismail

OK, I figured this out: I was passing the wrong values to the highlighter function. I was passing in the query search term and the field name. What I needed to pass in was the content of the contents field for each matching document, along with the query term. All working now.

Related

Function for `fq` field of SOLR in SOLR-node-client

For various fields such as q, start, rows, etc. in SOLR, we have corresponding functions in SOLR-node-client.
So if I want to construct a query for the following:
http://host:port/solr/eposro/select?q=cats.0%3A1&start=0&rows=4&wt=json&indent=true
I can use something like this:
var query = client.createQuery()
    .q({'cats.0' : 1})
    .start(0)
    .rows(4);
However, there is a filter query field in SOLR, fq. I don't seem to find a corresponding function for this in SOLR-node-client.
The following gives me an error:
var query = client.createQuery()
    .q({'cats.0' : 1})
    .fq({'brand':'real'})
    .start(0)
    .rows(4);
I get an error saying that fq function doesn't exist.
Am I doing anything wrong or is there any other way to achieve filter query using SOLR-node-client?
createQuery() returns a Query object and it has a matchFilter method.
Example:
var query = client.createQuery()
    .q({'cats.0' : 1})
    .matchFilter('brand', 'real')
    .start(0)
    .rows(4);
HTH
I checked the source code a bit and found that the usage should be like this:
let searchQuery = solrClient.query()
searchQuery = searchQuery.fq({field:"tags",value: this.filterTag});
If anyone can update the doc, that would be great.
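For reference, whichever client method is used, the filter should end up as an fq parameter on the select URL, next to q, start and rows. Below is a minimal Java sketch of that request string. It assumes the host, port and core name from the question, and the class and method names are hypothetical helpers for illustration, not part of any Solr client.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds the raw select URL that a filter query corresponds to.
class SolrSelectUrlSketch {

    // Percent-encodes a single query-string value (":" becomes "%3A").
    static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // base is e.g. "http://host:port/solr/eposro"; q and fq use Solr's field:value syntax.
    static String buildSelectUrl(String base, String q, String fq, int start, int rows) {
        return base + "/select"
                + "?q=" + enc(q)
                + "&fq=" + enc(fq)
                + "&start=" + start
                + "&rows=" + rows
                + "&wt=json";
    }

    public static void main(String[] args) {
        // Same query as in the question, with the brand filter added as fq.
        System.out.println(
                buildSelectUrl("http://host:port/solr/eposro", "cats.0:1", "brand:real", 0, 4));
    }
}
```

Comparing the URL a client library actually sends against a string like this is a quick way to check that the filter was applied.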

How can I search on list of values using Lucene Query interface

Simplistic Problem description:
Lucene index has two fields per document: ID and NAME.
I want to make a query using the Lucene Query interface that finds all the documents where ID is 1 OR 2 OR 3, and so on. The IDs to be searched will be in a list that can potentially have up to 30 elements.
If I was using the query parser I would have done something like
ID:(1 OR 2 OR 3)
But the application is already heavily committed to the Query interface, and I want to follow the current pattern. The only way I can think of to do this with the Query interface is to create n term queries and group them with a BooleanQuery, as below:
BooleanQuery booleanQuery = new BooleanQuery();
for (String searchId : lstIds)
{
    booleanQuery.add(new TermQuery(new Term("ID", searchId)), BooleanClause.Occur.SHOULD);
}
But is there a better/more efficient way of doing this?
Combining queries together with a BooleanQuery is the correct way to reproduce a query like ID:(1 OR 2 OR 3). The query parser will generate a BooleanQuery similar to what you provided for that syntax, so you are absolutely doing the right thing here.
You might be able to make use of PrefixQuery, NumericRangeQuery or TermRangeQuery to simplify matters, if they actually suit your needs in practice, but there is nothing wrong with what you are doing already.
BooleanQuery is the way to handle the OR operator, as you have shown in your code, but if you want a simple alternative you could also use the QueryParser and pass the IDs as "1 OR 2 OR 3".
Here is a code snippet for Lucene 7:
Query query = new QueryParser("ID", analyzer).parse("1 OR 2 OR 3");
TopDocs topDocs = searcher.search(query, 10);
Or, if all the clauses are ORs, you could also use the QueryParser default operator.
Here is the code snippet for lucene 7.
QueryParser queryParser = new QueryParser("ID", analyzer);
queryParser.setDefaultOperator(QueryParser.Operator.OR);
Query query = queryParser.parse("1 2 3");
TopDocs topDocs = searcher.search(query, 10);
I hope that works for you.
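When the IDs arrive as a list, the parser string can be assembled programmatically before it is handed to QueryParser. This is only a sketch in plain Java: the class and method names are hypothetical, and it assumes the IDs are plain values with no query-syntax special characters that would need escaping.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: turns a list of IDs into the parser syntax from the question.
class IdQueryStringSketch {

    // Produces e.g. ID:(1 OR 2 OR 3) for field "ID" and ids [1, 2, 3].
    // Assumption: the ids need no escaping (plain numbers in this example).
    static String anyOf(String field, List<String> ids) {
        if (ids.isEmpty()) {
            throw new IllegalArgumentException("at least one ID is required");
        }
        return field + ":(" + String.join(" OR ", ids) + ")";
    }

    public static void main(String[] args) {
        List<String> lstIds = Arrays.asList("1", "2", "3");
        System.out.println(anyOf("ID", lstIds));
    }
}
```

The resulting string should parse to the same kind of BooleanQuery as the hand-built loop in the question, so this is a matter of convenience rather than efficiency.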

Liferay searchContext search by AssetTags and Keywords

I'm trying to implement a custom search portlet over AssetEntries. Currently AssetEntryQuery doesn't allow searching by keywords, so I'm trying to search through FacetedSearcher instead. Searching by keywords seems to be OK, but when I try to search by AssetTagNames with
searchContext.setAssetTagNames(assetTagNames)
it doesn't work at all.
Here's my piece of code
SearchContext searchContext = new SearchContext();
Facet assetEntriesFacet = new AssetEntriesFacet(searchContext);
assetEntriesFacet.setStatic(true);
searchContext.addFacet(assetEntriesFacet);
/*MultiValueFacet multiValueFacet=new MultiValueFacet(searchContext);
multiValueFacet.setFieldName("assetTagNames");
multiValueFacet.setStatic(false);
searchContext.addFacet(multiValueFacet);*/
searchContext.setCompanyId(themeDisplay.getCompanyId());
String []assetTagNames=new String[1];
assetTagNames[0]= assetTagName;
searchContext.setAssetTagNames(assetTagNames);
searchContext.setKeywords(keywords);
String[] entryClassName = {JournalArticle.class.getName()};
searchContext.setEntryClassNames(entryClassName);
Indexer indexer = FacetedSearcher.getInstance();
// searchContext.setAndSearch(true);
Hits hits = indexer.search(searchContext);
System.out.println("Hits: " + hits.getLength());
Resulted query for request
searchKeyword: key1key1
assetTagName: sometag
+(+(companyId:1) +((+(entryClassName:com.liferay.portlet.journal.model.JournalArticle) +(status:0)))) +(assetCategoryTitles:*key1key1* assetCategoryTitles_en_US:*key1key1* assetTagNames:*key1key1* comments:key1key1 content:key1key1 description:key1key1 properties:key1key1 title:key1key1 url:key1key1 userName:*key1key1* classPK:key1key1 content_en_US:key1key1 description_en_US:key1key1 entryClassPK:key1key1 title_en_US:key1key1 type:key1key1)
As you see AssetTag isn't applied to the query.
I've already tried to set it through
searchContext.setAttribute("assetTagNames",assetTagName);
and the commented-out MultiValueFacet code, but with no result.
Later I will also need to search by dateRange and Categories. Does anybody have any idea?
Fortunately solved this.
If you'd like to search through tags, you have to use a separate facet for this, e.g.
MultiValueFacet assetTagsFacet = new MultiValueFacet(searchContext);
assetTagsFacet.setFieldName(Field.ASSET_TAG_NAMES);
searchContext.addFacet(assetTagsFacet);
Also use searchContext.setAttribute("assetTagNames", assetTagName); instead of searchContext.setAssetTagNames(assetTagName);
For searching through Categories the same thing:
MultiValueFacet assetCategoriesFacet = new MultiValueFacet(searchContext);
assetCategoriesFacet.setFieldName("assetCategoryTitles");
searchContext.addFacet(assetCategoriesFacet);
searchContext.setAttribute("assetCategoryTitles", assetCategoryName);
I also wanted to search by a custom type of JournalArticle. I created a facet for this, but got "type" twice in the query. As a solution, I used MultiValueFacet instead of AssetEntriesFacet when setting entryClassName:
MultiValueFacet assetEntriesFacet = new MultiValueFacet(searchContext);
assetEntriesFacet.setFieldName("entryClassName");
searchContext.setAttribute("entryClassName",JournalArticle.class.getName());
searchContext.addFacet(assetEntriesFacet);

Lucene index searching

I am using Lucene indexing for the first time. I have some documents in Hindi and English, and I create an index on the content of each document. When I search the index I get results from all the documents: even if my query is an English word, Hindi documents are returned as well. I have added the code below. Please tell me where I am going wrong.
IndexSearcher searcher = new IndexSearcher(directory);
QueryParser parser = new QueryParser("Content", analyzer);
while (condition)
{
    Search(text, searcher, parser);
}
searcher.Close();
private static void Search(string text, IndexSearcher searcher, QueryParser parser)
{
    Query query = parser.Parse(text);
    Hits hits = searcher.Search(query);
    int results = hits.Length();
    for (int i = 0; i < results; i++)
    {
        Lucene.Net.Documents.Document doc = hits.Doc(i);
        string show = doc.ToString();
        float score = hits.Score(i);
        /* insert doc id in database table*/
    }
}
Thanks all
First, I would use Luke to check whether my query syntax was right. Then I would check whether the misbehaving English word is a homograph of a Hindi word (i.e. an English word that is spelled the same as a Hindi word).
If you want to prevent a search for English search terms from coming up with Hindi documents, you will need to mark each document as to whether it is in English or Hindi, then specify that marking in your search query. In Query Parser Syntax, this could look like:
ENGLISHSEARCHTERMS +(language:English)
(where all Hindi documents have their language field set to 'Hindi' and all English documents have their language field set to 'English').

Exact phrase search using Lucene.net

I am having trouble searching for an exact phrase using Lucene.NET 2.0.0.4.
For example, I am searching for "scope attribute sets the variable" (including quotes) but receive no matches. I have confirmed 100% that the phrase exists.
Can anyone suggest where I am going wrong? Is this even supported with Lucene.NET? As usual the API documentation is not too helpful and a few CodeProject articles I've read don't specifically touch on this.
Using the following code to create the index:
Directory dir = Lucene.Net.Store.FSDirectory.GetDirectory("Index", true);
Analyzer analyzer = new Lucene.Net.Analysis.SimpleAnalyzer();
IndexWriter indexWriter = new Lucene.Net.Index.IndexWriter(dir, analyzer,true);
//create a document, add in a single field
Lucene.Net.Documents.Document doc = new Lucene.Net.Documents.Document();
Lucene.Net.Documents.Field fldContent = new Lucene.Net.Documents.Field(
    "content", File.ReadAllText(@"Documents\100.txt"),
    Lucene.Net.Documents.Field.Store.YES,
    Lucene.Net.Documents.Field.Index.TOKENIZED);
doc.Add(fldContent);
//write the document to the index
indexWriter.AddDocument(doc);
I then search for a phrase using:
//state the file location of the index
Directory dir = Lucene.Net.Store.FSDirectory.GetDirectory("Index", false);
//create an index searcher that will perform the search
IndexSearcher searcher = new Lucene.Net.Search.IndexSearcher(dir);
QueryParser qp = new QueryParser("content", new SimpleAnalyzer());
// txtSearch.Text Contains a phrase such as "this is a phrase"
Query q=qp.Parse(txtSearch.Text);
//execute the query
Lucene.Net.Search.Hits hits = searcher.Search(q);
The target document is about 7 MB plain text.
I have seen this previous question however I don't want a proximity search, just an exact phrase search.
Shashikant Kore is correct with his answer, you need to enable term positions...
However, I would recommend not storing the text of the document in the field unless you absolutely need it to return back to you in the search results... Setting the store to 'NO' might help reduce the size of your index a bit.
Lucene.Net.Documents.Field fldContent =
    new Lucene.Net.Documents.Field("content",
        File.ReadAllText(@"Documents\100.txt"),
        Lucene.Net.Documents.Field.Store.NO,
        Lucene.Net.Documents.Field.Index.TOKENIZED,
        Lucene.Net.Documents.Field.TermVector.WITH_POSITIONS_OFFSETS);
You have not enabled the term positions. Creating the field as follows should solve your problem.
Lucene.Net.Documents.Field fldContent =
    new Lucene.Net.Documents.Field("content",
        File.ReadAllText(@"Documents\100.txt"),
        Lucene.Net.Documents.Field.Store.YES,
        Lucene.Net.Documents.Field.Index.TOKENIZED,
        Lucene.Net.Documents.Field.TermVector.WITH_POSITIONS_OFFSETS);
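To illustrate why an exact-phrase match depends on term positions at all, here is a toy positional index in plain Java. It is only a sketch of the general idea, not how Lucene.Net stores or searches its index, and the class and method names are made up for this example. The tokenizer is a rough stand-in for a simple letter-based analyzer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy single-document positional index (illustrative only, not Lucene's data structure).
class PhrasePositionsSketch {

    // term -> positions at which the term occurs in the document
    private final Map<String, List<Integer>> positions = new HashMap<>();

    PhrasePositionsSketch(String text) {
        List<String> tokens = tokenize(text);
        for (int i = 0; i < tokens.size(); i++) {
            positions.computeIfAbsent(tokens.get(i), t -> new ArrayList<>()).add(i);
        }
    }

    // Lowercases the text and splits on anything that is not a letter.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // True only if the phrase terms occur at consecutive positions, in order.
    boolean containsPhrase(String phrase) {
        List<String> terms = tokenize(phrase);
        if (terms.isEmpty()) {
            return false;
        }
        List<Integer> starts = positions.get(terms.get(0));
        if (starts == null) {
            return false;
        }
        for (int start : starts) {
            boolean match = true;
            for (int k = 1; k < terms.size() && match; k++) {
                List<Integer> p = positions.get(terms.get(k));
                match = p != null && p.contains(start + k);
            }
            if (match) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        PhrasePositionsSketch doc =
                new PhrasePositionsSketch("The scope attribute sets the variable scope.");
        // Terms adjacent and in order.
        System.out.println(doc.containsPhrase("\"scope attribute sets the variable\""));
        // Both terms present, but not in this order.
        System.out.println(doc.containsPhrase("attribute scope"));
    }
}
```

Without the position lists, the index could only say that each term occurs somewhere in the document, which is not enough to tell a phrase from a bag of words.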
