How do solr implement filter search - search

Can anyone point me to the piece of solr source code which performs filter query (excecuting the fq=).

Solr's main façade to Lucene is SolrIndexSearcher. In particular the getDocListC function seems to do the actual passing to Lucene.
Filter queries and other parameters are passed via a QueryCommand class, you can see that filter queries is simply a list of org.apache.lucene.search.Query, just like the "main query".

Related

wildcard searches on specific elements only

I am looking for a way to do wildcard search only on specific elements when executing a search:search. Specifically, I might have documents that look like the following:
<pdbe:person-envelope xmlns:pdbe="http://schemas.abbvienet.com/people-db/envelope">
<person xmlns="http://schemas.abbvienet.com/people-db/model">
<costcenter>
<code>0000601775</code>
<name>DISC-PLAT INFORM</name>
</costcenter>
<displayName>Tj Tang</displayName>
<upi>10025613</upi>
<firstName>
<preferred>TJ</preferred>
<given>Tze-John</given>
</firstName>
<lastName>
<preferred>Tang</preferred>
<given>Tang</given>
</lastName>
<title>Principal Research Scientist</title>
</person>
<pdbe:raw/>
</pdbe:person-envelope>
When searches happen, I want the search text to be automatically wildcarded, but only for certain elements like displayName, firstName, lastName, but NOT for upi or code. As I understand it, I would have certain wildcard related indexes enabled in the database, but then I would need to have a custom query parser that rewrite the query into multiple cts:element-query and cts:element-value-query statements for each element that I want to wildcard search on, OR'd with the originally parsed search query. Or I can create field constraints, and rewrite the query to use field contraints.
Is there another way to conditionally search using wildcard on some elements but not others, when the user is entering as simple search query?, i.e. partial first and last name, "TJ Tan", but no partial hits when I search "100256".
You are on the right track. Lets take an element (or maybe field) query on "TS Tan"
With cts:tokenize, you can break this up (read about cs:tokenize - it is not just a normal tokenizer).
Then I have "TS" and "Tan"
You can the do things like apply business rules on which word should be wild-carded and which not and build the appropriate cts query (probably individual word queries in an and statement - or a near query - tuning depends on your need).
Now with search phrase tokenized, you can also consider that you may find building your results relies not on a wildcard index, but on a an element word lexicon - where you do your term-expansion with word-matches and those terms are then sent to the query.
We sometimes take that further and combine the query building with xdmp:estimate and make the query less restrictive if we don't get enough results early on.
Where to put this logic?
You mention search:search, so in this case, I would suggest you package this into a custom constraint.

Force document(s) into SOLR search results

Is there a way to force specific documents into SOLR search results?
I'm trying something like this but is not working:
/q=id:21321 OR myNormalSearchConditionsHereIncludingDismaxQUery
My goal is to have certain documents always show up in some search results, no matter what the query is.
you can use Solr MoreLikeThis component
This is just a quick thought i would make from your description.
With this you will no longer require a second query and caching the results.

Restrict facets set in apache solr

I use apache solr for my search engine. I have a schema with a field called "typology". I'd like to search inside all typologies but i need to calculate facets of many fields for only one typology. Is it possible?
Thank you
Your base query for facets does not have to be the same as your search query. facet.query parameter allows you to have a different search for that.

Can solr return function values (not solr score or document fields)?

We are making a solr query where we are giving a custom function (which is pretty complex) and sorting the results by value of that function. The query looks something like:
solr/select?customFunc=complexFunction(querySpecificValue1,querySpecificValue2)&sort_by=$customFunc&fq=......
Our understanding is that we can only get back fields on the document and solr score back from solr. Can someone tell us if and how we can fetch the computed value of customFunc for each document. For some reasons we cannot set solr score to be customFunc.
You should use the fl parameter to select pseudo fields, functions and so on, but this is supported only on trunk, which will be released with the 4.0 version of Solr. Have a look at the CommonQueryParameters wiki. The SOLR-2444 issue might be interesting too.
A brief example:
solr/select?q=*:*&fl=*,customFunc:complexFunction(querySpecificValue1,querySpecificValue2)
This helped me :
/solr/auction-En/select/?q=*:*_val_:"sum(x,y)"&debugQuery=true&version=2.2&start=0&rows=10&indent=on&fl=*,score
You will see the values of the function in the debug part.

How can I configure Sitecore search to retrieve custom values from the search index

I am using the AdvancedDatabaseCrawler as a base for my search page. I have configured it so that I can search for what I want and it is very fast. The problem is that as soon as you want to do anything with the search results that requires accessing field values the performance goes through the roof.
The main search results part is fine as even if there are 1000 results returned from the search I am only showing 10 or 20 results per page which means I only have to retrieve 10 or 20 items. However in the sidebar I am listing out various filtering options with the number or results associated with each filtering option (eBay style). In order to retrieve these filter options I perform a relationship search based on the search results. Since the search results only contain SkinnyItems it has to call GetItem() on every single result to get the actual item in order to get the value that I'm filtering by. In other words it will call Database.GetItem(id) 1000 times! Obviously that is not terribly efficient.
Am I missing something here? Is there any way to configure Sitecore search to retrieve custom values from the search index? If I can search for the values in the index why can't I also retrieve them? If I can't, how else can I process the results without getting each individual item from the database?
Here is an idea of the functionality that I’m after: http://cameras.shop.ebay.com.au/Digital-Cameras-/31388/i.html
Klaus answered on SDN: use facetting with Apache Solr or similar.
http://sdn.sitecore.net/SDN5/Forum/ShowPost.aspx?PostID=35618
I've currently resolved this by defining dynamic fields for every field that I will need to filter by or return in the search result collection. That way I can achieve the facetted searching that is required without needing to grab field values from the database. I'm assuming that by adding the dynamic fields we are taking a performance hit when rebuilding the index. But I can live with that.
In the future we'll probably look at utilizing a product like Apache Solr.

Resources