Complex SOLR query including NOT and OR - search

I have some fairly complex requirements for a SOLR search that I need to execute against my database of tagged content. I need to first filter the database to those results matching my filter tags. Any of these results which have a tag from the blacklist should be removed unless they also contain tags from the whitelist.
So let's say I want to retrieve all documents tagged "forest" or "conservation". But I want to exclude documents tagged "uk" or "europe" - unless they are also tagged with "africa" or "asia". I tried to write this in my SOLR query:
tag:((forest OR conservation) AND (africa OR asia OR !uk OR !europe))
The logic seems sound to me but the query doesn't work. I have 46 documents tagged with "forest" or "conservation" - one of which is also tagged with "asia" and none of which are tagged with "uk" or "europe" but the query returns no results.
I also saw information about the fq (filtered query) parameter to SOLR which sounded like it could be useful. But this didn't seem to help either. It seems it is the combination of NOT and OR which causes SOLR to break down.
Is what I'm trying to do possible? Is my logic flawed somewhere? This is on SOLR 1.4 btw - incase this is an issue which has been resolved somewhere....
Thanks for any advice!

try
tag:((forest OR conservation) AND (africa OR asia OR (*:* AND !uk AND !europe)))

Related

Force document(s) into SOLR search results

Is there a way to force specific documents into SOLR search results?
I'm trying something like this but is not working:
/q=id:21321 OR myNormalSearchConditionsHereIncludingDismaxQUery
My goal is to have certain documents always show up in some search results, no matter what the query is.
you can use Solr MoreLikeThis component
This is just a quick thought i would make from your description.
With this you will no longer require a second query and caching the results.

Honoring previous searches in the next search result in solr

I am using solr for searching. I wants to improve my search result quality based on previously searched terms. Suppose, I have two products in my index with names 'Jewelry Crystal'(say it belongs to Group 1) & 'Compound Crystal'(say it belongs to Group 2). Now, if we query for 'Crystal', then both the products will come.
Let say, if I had previously searched for 'Jewelry Ornament', then I searches for 'Crystal', then I expects that only one result ('Jewelry Crystal') should come. There is no point of showing 'Compound Crystal' product to any person looking for jewelry type product.
Is there any way in SOLR to honour this kind of behavior or is there any other method to achieve this.
First of all, there's nothing built-in in Solr to achive this. What you need for this is some kind of user session, which is not supported by Solr, or a client side storage like a cookie or something for the preceding query.
But to achive the upvote you can use a runtime Boost Query.
Assuming you're using the edismax QueryParser, you can add the following to your Solr query:
q=Crystal&boost=bq:(Jewelry Ornament)
See http://wiki.apache.org/solr/ExtendedDisMax#bq_.28Boost_Query.29

Can solr return function values (not solr score or document fields)?

We are making a solr query where we are giving a custom function (which is pretty complex) and sorting the results by value of that function. The query looks something like:
solr/select?customFunc=complexFunction(querySpecificValue1,querySpecificValue2)&sort_by=$customFunc&fq=......
Our understanding is that we can only get back fields on the document and solr score back from solr. Can someone tell us if and how we can fetch the computed value of customFunc for each document. For some reasons we cannot set solr score to be customFunc.
You should use the fl parameter to select pseudo fields, functions and so on, but this is supported only on trunk, which will be released with the 4.0 version of Solr. Have a look at the CommonQueryParameters wiki. The SOLR-2444 issue might be interesting too.
A brief example:
solr/select?q=*:*&fl=*,customFunc:complexFunction(querySpecificValue1,querySpecificValue2)
This helped me :
/solr/auction-En/select/?q=*:*_val_:"sum(x,y)"&debugQuery=true&version=2.2&start=0&rows=10&indent=on&fl=*,score
You will see the values of the function in the debug part.

Invalid Magento Search result

Searching Magento with fulltext search engine and like method , it will store results in catalogsearch_fulltext table in "data_index" field where it stores value in the format like
each searchable attribute is separated with |.
e.g
3003|Enabled|None||Product name|1.99|yellow|0
here it store sku,status,tax class, product name , price ,color etc etc
It stores all searchable attribute value.
Now the issue is for Configurable product , it will also store the associated products name ,price ,status in the same field like
3003|Enabled|Enabled|Enabled|Enabled|None|None|None|None|Product name|Product name|associted Product name1|associted Product name2|associted Product name3|1.99|2.00|2.99|3.99|yellow|black|yellow|green|0|0|0|0
So what happen is if i search for any word from associated product, it will also list the main configurable product as it has the word in its "data_index" field.
Need some suggestion how can i avoid associated products being included in data_index, So that i can have perfect search result.
thanks
We are looking into our search as well and it has been surprising to see the inefficiencies included in the fulltext table. We have some configurable products as well that have MANY variations and their population in the fulltext search is downright horrendous.
As for solutions, I can only offer my approach to fix the problem (not completed: but rather in the process).
I am extending Magento to include an event listener to the process of indexing the products (Because catalog search indexing is when the fulltext database is populated). Once that process occurs, I am writing my own module to remove duplicate entries from the associated products and also to add the functionality of adding additional search keyword terms as populated from a CSV file.
This should effectively increase search speed dramatically and also return more relevent search results. Because as of now, configurable products are getting "search bias" in the search results.
This isn't so much of an answer as a comment, but it was too lengthy to fit in the comments but I thought this might be beneficial to you. Once I get my module working, if you would like, I can possibly give you directions on how you could implement a similar module yourself.
Hope that helped (if only for moral support in magento's search struggle)

How to implement faceted search suggestion with number of relevant items in Solr?

Hi
I have a very specific need in my company for the system's search engine, and I can't seem to find a solution.
We have a SOLR index of items, all of them have the same fields, with one of the fields being "Type", (And ofcourse, "Title", "Text", and so on).
What I need is: I get an Item Type and a Query String, and I need to return a list of search suggestion with each also saying how meny items of the correct type will that suggested string return.
Something like, if the original string is "goo" I'll get
Goo 10
Google 52
Goolag 2
and so on.
now, How do I do it?
I don't want to re-query SOLR for each different suggestion, but if there is no other way, I just might.
Thanks in advance
you can try edge n-gram tokenization
http://search.lucidimagination.com/search/document/CDRG_ch05_5.5.6
You can try facets. Take a look at my more detailed description ('Autocompletion').
This was implemented at http://jetwick.com with Solr ... now using ElasticSearch but the Solr sources are still available and the idea is also the identical https://github.com/karussell/Jetwick
The SpellCheckComponent of Solr (that gives the suggestions) have extended results that can give the frequency of every suggestion in the index - http://wiki.apache.org/solr/SpellCheckComponent#Extended_Results.
However, the .Net component SolrNet, does not currently seem to support the extendedResults option: "All of the SpellCheckComponent parameters are supported, except for the extendedResults option" - http://code.google.com/p/solrnet/wiki/SpellChecking.
This is implemented using a facet field query with a Prefix set. You can test this using the xml handler like this:
http://localhost:8983/solr/select/?rows=0&facet=true&facet.field=type&f.type.prefix=goo

Resources