Does ArangoDB provide a way of getting the underlying scores out of text search queries, either through AQL queries against a fulltext index or via a custom search view?
One use case for this is to colour search results based on their relevance in a UI.
FOR doc IN myView
  SEARCH PHRASE(doc.abstract, "fulltext search", "text_en")
    OR PHRASE(doc.text, "fulltext search", "text_en")
  // DESC puts the highest-scoring documents first; the default order is ascending
  SORT BM25(doc) DESC
  LIMIT 10
  RETURN { id: doc._key, title: doc.title, score: BM25(doc) }
This would return the top 10 results with the BM25 score (TF/IDF is supported as well) that you can use for highlighting individual records.
I think the support of scores in return values was introduced just recently. I am currently testing 3.5.
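For the colouring use case, the returned score can be compared with the best score in the result set. A rough sketch in Java; the class name, thresholds and colour names are made up for illustration and are not part of ArangoDB:

```java
final class RelevanceColour {
    // Maps a score to a colour name relative to the best score in the result set.
    // The thresholds and colour names are arbitrary illustration values.
    static String colourFor(double score, double topScore) {
        if (topScore <= 0.0) {
            return "grey"; // nothing sensible to compare against
        }
        double ratio = score / topScore;
        if (ratio >= 0.75) {
            return "green";
        }
        if (ratio >= 0.4) {
            return "amber";
        }
        return "grey";
    }
}
```

Since the query sorts by score, the first row's score can be passed as topScore for every row.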
Related
I have multiple instances of my application. Each instance points to its own Solr for document indexing.
I am working on a unified search, where the user enters a query in the search bar and the relevant documents from all the instances should be ranked by relevance.
Right now I have implemented a round-robin solution.
For example, I have 2 instances, Ins-1 with solr-1 and Ins-2 with solr-2.
Ins-1 has 1K docs and Ins-2 has 5K docs. When I run a query, it fetches X docs from solr-1 and X docs from solr-2.
I show those 2X documents in round-robin fashion, but that is not the best way to present the search results.
I am looking for a solution where I can re-rank those 2X documents based on relevance to the search.
I think you should merge the two instances into a single instance. You can import data from one instance into another. The Solr Admin UI has a 'DataImport' tab to import data from one collection to another.
Here doc1 and doc2 are two individual responses from Solr. You can combine them in Java:
import org.apache.solr.common.SolrDocumentList;

// Concatenates two Solr result lists: all of doc1 first, then all of doc2.
SolrDocumentList appendResponse(SolrDocumentList doc1, SolrDocumentList doc2) {
    SolrDocumentList documentsList = new SolrDocumentList();
    // SolrDocumentList is an ArrayList<SolrDocument>, so addAll copies every document
    documentsList.addAll(doc1);
    documentsList.addAll(doc2);
    return documentsList;
}
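Appending the two lists does not rank them against each other. If each response also carries the score field (fl=*,score), one option is to normalize the scores per instance and sort the combined list. A minimal sketch, assuming max-score normalization is good enough for your data. ScoredHit and ScoreMerger are made-up names, not SolrJ classes, and raw scores from separate indexes are not directly comparable because each index has its own term statistics:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical holder for one hit: the document id plus the score Solr returned for it.
final class ScoredHit {
    final String id;
    final double score;

    ScoredHit(String id, double score) {
        this.id = id;
        this.score = score;
    }
}

final class ScoreMerger {
    // Divides every score by the largest score in its own list, so the best hit
    // of each instance ends up at 1.0. This is an assumed, crude normalization.
    static List<ScoredHit> normalize(List<ScoredHit> hits) {
        double max = 0.0;
        for (ScoredHit h : hits) {
            max = Math.max(max, h.score);
        }
        List<ScoredHit> out = new ArrayList<>();
        for (ScoredHit h : hits) {
            out.add(new ScoredHit(h.id, max > 0.0 ? h.score / max : 0.0));
        }
        return out;
    }

    // Merges both normalized lists and sorts by score, highest first.
    // List.sort is stable, so ties keep their original relative order.
    static List<ScoredHit> merge(List<ScoredHit> a, List<ScoredHit> b) {
        List<ScoredHit> merged = new ArrayList<>(normalize(a));
        merged.addAll(normalize(b));
        merged.sort(Comparator.comparingDouble((ScoredHit h) -> h.score).reversed());
        return merged;
    }
}
```

With this normalization the top hit of every instance gets 1.0, so a small index is not drowned out by a large one, but the ordering is still only an approximation of true cross-index relevance.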
I am using Solr for indexing some documents and then searching them. I want documents whose content starts with the search keyword to rank higher in the results. How can I achieve that?
E.g.
If the search keyword is "php"
and there are two documents with content :
php developer
ajax php
then I want to return 'php developer' first instead of 'ajax php'.
Any suggestions on how to return results in this order?
I am looking for some sort of analyzer that indexes only the first word of a field's content, so that I can give that field a lot of weight while querying. Maybe that can help. I couldn't find such an analyzer for my purposes.
You can boost the first tokens using payload. Refer to the link mentioned in Payloads
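If changing the index is not an option, a client-side fallback (this is not the payload approach) is to reorder the rows you have already fetched so that those starting with the keyword come first. A small sketch, with PrefixFirst as a made-up helper name; it only affects the fetched page, not Solr's overall ranking:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

final class PrefixFirst {
    // Returns a copy of the results with entries that start with the keyword moved
    // to the front. The sort is stable, so the engine's original order is kept
    // within each of the two groups.
    static List<String> reorder(List<String> results, String keyword) {
        String prefix = keyword.toLowerCase(Locale.ROOT);
        List<String> sorted = new ArrayList<>(results);
        // false sorts before true, so "starts with the prefix" comes first
        sorted.sort(Comparator.comparing(
                (String s) -> !s.toLowerCase(Locale.ROOT).startsWith(prefix)));
        return sorted;
    }
}
```

For the example in the question, reorder(List.of("ajax php", "php developer"), "php") puts "php developer" ahead of "ajax php".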
We are making a Solr query in which we pass a custom function (which is pretty complex) and sort the results by the value of that function. The query looks something like:
solr/select?customFunc=complexFunction(querySpecificValue1,querySpecificValue2)&sort_by=$customFunc&fq=......
Our understanding is that Solr can only return the document's fields and the Solr score. Can someone tell us if and how we can fetch the computed value of customFunc for each document? For some reason we cannot set the Solr score to be customFunc.
You should use the fl parameter to select pseudo fields, functions and so on, but this is supported only on trunk, which will be released with the 4.0 version of Solr. Have a look at the CommonQueryParameters wiki. The SOLR-2444 issue might be interesting too.
A brief example:
solr/select?q=*:*&fl=*,customFunc:complexFunction(querySpecificValue1,querySpecificValue2)
This helped me :
/solr/auction-En/select/?q=*:*_val_:"sum(x,y)"&debugQuery=true&version=2.2&start=0&rows=10&indent=on&fl=*,score
You will see the values of the function in the debug part.
I'm using Solr 3.5.0, and in the schema I have enabled LowerCaseFilterFactory on all the fields that need it. When I search for "shirts" I get results, and when I search for "SHIRTS" I also get the expected results, but when I search for "shiRTs" I get no results. I know I'm missing something in the schema.
Please help me on this.
Thanks
Jeyaprakash.
Apply the same analyzers and filters at both index and query time, so that the queries you search for match the indexed tokens.
In your case:
If you apply the lowercase filter at index time but not at query time:
The indexed token will be shirts. However, because the search query is not analyzed, SHIRTS or even Shirts will not match the indexed shirts token.
The same would apply if you are using stemmers, stopwords or other filters.
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#Analyzers
Analyzers are components that pre-process input text at index time
and/or at search time. It's important to use the same or similar
analyzers that process text in a compatible manner at index and query
time. For example, if an indexing analyzer lowercases words, then the
query analyzer should do the same to enable finding the indexed words.
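The effect can be shown with a toy model in Java. This is only an illustration of the principle, not Solr code: analyze stands in for an analyzer chain that does nothing but lowercase, and all names are invented:

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class LowercaseMatch {
    // Stand-in for an analyzer chain that only lowercases; real Solr analysis does more.
    static String analyze(String token) {
        return token.toLowerCase(Locale.ROOT);
    }

    // Builds a toy "index" by analyzing every token before storing it.
    static Set<String> index(String... tokens) {
        Set<String> indexed = new HashSet<>();
        for (String t : tokens) {
            indexed.add(analyze(t));
        }
        return indexed;
    }

    // Looks up the query term; analyzeQuery controls whether the query side
    // goes through the same analysis as the index side.
    static boolean matches(Set<String> indexed, String query, boolean analyzeQuery) {
        return indexed.contains(analyzeQuery ? analyze(query) : query);
    }
}
```

With index("Shirts"), a lookup for "shiRTs" succeeds only when analyzeQuery is true, because the stored token is the lowercased shirts.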
I am using the AdvancedDatabaseCrawler as a base for my search page. I have configured it so that I can search for what I want and it is very fast. The problem is that as soon as you want to do anything with the search results that requires accessing field values, performance degrades badly.
The main search results part is fine: even if the search returns 1000 results, I only show 10 or 20 per page, which means I only have to retrieve 10 or 20 items. However, in the sidebar I list various filtering options with the number of results associated with each option (eBay style). To retrieve these filter options I perform a relationship search based on the search results. Since the search results only contain SkinnyItems, it has to call GetItem() on every single result to get the actual item and read the value that I'm filtering by. In other words, it will call Database.GetItem(id) 1000 times! Obviously that is not terribly efficient.
Am I missing something here? Is there any way to configure Sitecore search to retrieve custom values from the search index? If I can search for the values in the index why can't I also retrieve them? If I can't, how else can I process the results without getting each individual item from the database?
Here is an idea of the functionality that I’m after: http://cameras.shop.ebay.com.au/Digital-Cameras-/31388/i.html
Klaus answered on SDN: use faceting with Apache Solr or similar.
http://sdn.sitecore.net/SDN5/Forum/ShowPost.aspx?PostID=35618
I've currently resolved this by defining dynamic fields for every field that I need to filter by or return in the search result collection. That way I can achieve the faceted searching that is required without needing to grab field values from the database. I'm assuming that adding the dynamic fields means a performance hit when rebuilding the index, but I can live with that.
In the future we'll probably look at utilizing a product like Apache Solr.
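Once the filter values come back with the search hits, the sidebar counts can be computed in memory without any database calls. A rough sketch in Java rather than the Sitecore API: each hit is modelled as a plain map of stored field name to value, which is an assumption made for illustration and not the SkinnyItem type:

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class FacetCounts {
    // Counts how many hits carry each value of the given field.
    // Hits that do not have the field are skipped.
    static Map<String, Integer> count(List<Map<String, String>> hits, String field) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted by value for display
        for (Map<String, String> hit : hits) {
            String value = hit.get(field);
            if (value != null) {
                counts.merge(value, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

This is a single pass over the result set, so 1000 hits cost 1000 map lookups instead of 1000 item fetches.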