Count number of remaining results with Solr - search

I have a database of cars and I want to create filters that let users search them more quickly. I've worked with Solr before, but I've never used facets.
The filters should work like this:
Brands
BMW (4)
VW (5)
Toyota (6)
Colors
Red (4)
Blue (3)
Yellow (2)
Green (3)
White (3)
Types
Sports (5)
Sedan (10)
If the user selects one it should end up like this:
Brands
BMW (1)
VW (2)
Toyota (1)
Colors
Red (4) <-- Selected
Blue (-)
Yellow (-)
Green (-)
White (-)
Types
Sports (1)
Sedan (3)
What would be the best way to achieve something like this with Solr? Would facets be the right tool for this? I did some research, but I haven't found any documentation describing this kind of feature. Is there a name for this kind of filtering? Thanks a lot for your help.

Yes, you can achieve this with Solr facets.
Pass the name of the field you want counts for in the facet.field parameter.
Index your documents into Solr first (you can refer to guides such as this one for indexing your database data into Solr). After indexing, run your query with the facet component enabled.
Example:
http://localhost:8983/solr/collection/select?indent=on&q=*:*&wt=json&facet=true&facet.field=cat
Scroll down in the response to see the facet results, as below.
"facet_fields":{
"cat":[
"book",17,
"electronics",12,
"paperback",6,
"currency",4,
"memory",3,
"connector",2,
"graphics card",2,
"hard drive",2,
"search",2,

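To connect this to the car filters in the question, here is a minimal Python sketch. It assumes hypothetical field names (brand, color, type) and the collection URL from the example above. Each selected filter is sent as an fq parameter, which narrows the result set and therefore the counts of every facet, and the flat value/count list that Solr returns is turned into a dictionary.

```python
from urllib.parse import urlencode


def build_facet_query(base_url, facet_fields, selected=None):
    """Build a select URL that facets on several fields.

    `selected` maps a field name to the value the user picked,
    e.g. {"color": "Red"}. Field names here are assumptions.
    """
    params = [("q", "*:*"), ("wt", "json"), ("rows", "0"), ("facet", "true")]
    # One facet.field parameter per filter group shown in the UI.
    params += [("facet.field", field) for field in facet_fields]
    # Each selection becomes a filter query, so all facet counts
    # are computed over the remaining documents only.
    for field, value in (selected or {}).items():
        params.append(("fq", f'{field}:"{value}"'))
    return base_url + "?" + urlencode(params)


def facet_counts(flat):
    """Turn Solr's flat [value, count, value, count, ...] list into a dict."""
    return dict(zip(flat[0::2], flat[1::2]))


url = build_facet_query(
    "http://localhost:8983/solr/collection/select",
    ["brand", "color", "type"],
    selected={"color": "Red"},
)
```

Calling facet_counts on each entry of "facet_fields" in the response gives the numbers to show next to each brand, color and type after the selection.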
Related

how to increment google docs heading assignments while retaining relationships?

I've got a big doc with lots of headings assigned.
I'd like to select a load of text and change all Heading 1 to Heading 2, all Heading 2 to Heading 3, etc., so that the relationships stay the same and they're all just one level higher (or lower).
eg before:
*HEADING 1
some text
*Heading 2
more text
after:
*Heading 1
*Heading 2
some text
*Heading 3
more text
Basically, I want to add a new category 'above' the existing text.
Is anything like this possible?

POWER BI/EXCEL/VBA - Row Classification based on a list of contained words

I'm trying to recreate a parser in different applications (to find a good, lightweight way to do it) that, given a list of values, tries to classify each row. Here is an example:
Table1: contains items that have to be classified:
Code | Description
1 | this is a common row that belong to Jake
2 | this is a special row that belong to Thomas
Table 2: contains Keywords that have to be searched inside Table1 [Description]
Keywords | Category
common, Jake | Common Jake Row
special, belong, row, Thomas | Special Thomas Row
The result that I want to obtain is:
Code | Description | Category
1 | this is a common row that belong to Jake | Common Jake Row
2 | this is a special row that belong to Thomas | Special Thomas Row
Is there a way to have this in VBA or Excel or PowerBI?
Thanks in advance
Just do a fuzzy join in Power Query and tweak the threshold.
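For comparison, the keyword matching itself can be sketched outside Power Query. The Python below is an illustration only, not the fuzzy join: it assumes a hypothetical scoring rule (the fraction of a category's keywords found in the description), since the question does not say whether every keyword must be present.

```python
def classify(description, rules):
    """Return the category whose keywords best cover the description.

    `rules` is a list of (keywords, category) pairs. The score is the
    fraction of keywords present as whole words; this rule is an assumption.
    """
    words = set(description.lower().split())
    best_category, best_score = None, 0.0
    for keywords, category in rules:
        if not keywords:
            continue  # skip empty keyword lists to avoid dividing by zero
        hits = sum(1 for keyword in keywords if keyword.lower() in words)
        score = hits / len(keywords)
        if score > best_score:
            best_category, best_score = category, score
    return best_category  # None when no keyword matched at all


rules = [
    (["common", "Jake"], "Common Jake Row"),
    (["special", "belong", "row", "Thomas"], "Special Thomas Row"),
]
category = classify("this is a common row that belong to Jake", rules)
```

Scoring by fraction rather than by raw hit count matters for the sample data: the first row contains "belong" and "row" as well, so both rules would otherwise tie at two hits.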

Kentico Custom Lucene Index with multiple facets that are similar - how to query?

I have created a custom index that stores multiple extra fields that I use to filter by. Say for example I am storing some facets to select Kite colors. Some kites have one color, others have multiple.
Kite A color: dark blue red orange deep red
Kite B color: blue
Where kite A colors are dark blue, red, orange, and deep red.
A query as such
+color:blue
would return both kite A and kite B, even though kite A's color is dark blue, not blue. Only kite B should be returned.
My question is this (and it has probably been hard for me to find an answer because I don't know the proper terminology): how should I store the values in Lucene so that they are kept separate (with a delimiter?). In addition, how do I phrase the query so that if I search for
color:red it does not return rows that have the value color:"deep red"? And if I were to search for color:(deep red), it shouldn't return rows that have "red" but not "deep red".
Take a look at the search index analyzer types: search results depend on the analyzer type plus the search settings of the object (page type, custom table, etc.).
I think the color field is marked as tokenized in the search settings, which is why the search returns results that match individual tokens (subsets) of the field's value. If tokenization is disabled, the search only returns items where the full value of the field exactly matches the search expression.
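The difference between the two modes can be illustrated with a small Python sketch. This is plain Python rather than Lucene or Kentico code, and it assumes that each color is stored as its own separate value in the untokenized case.

```python
def tokenized_match(field_value, term):
    """Mimic a tokenized field: the value is split into words,
    so any single word of a multi-word color can match."""
    return term.lower() in field_value.lower().split()


def exact_match(values, term):
    """Mimic an untokenized field: each stored value is compared as a whole."""
    return term.lower() in [value.lower() for value in values]


kite_a = ["dark blue", "red", "orange", "deep red"]
kite_b = ["blue"]

# Tokenized: "blue" is one of the words in kite A's field, so it matches.
a_tokenized = tokenized_match(" ".join(kite_a), "blue")
# Untokenized: "blue" is not one of kite A's whole values, only kite B's.
a_exact = exact_match(kite_a, "blue")
b_exact = exact_match(kite_b, "blue")
```

With whole-value comparison, "red" and "deep red" are distinct values, so a search for one no longer matches the other by accident.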
One suggestion here: are you asking visitors to type in the color (I assume not), or do you have a filter list that they can check to filter?
If it's a filter list, then you may want to consider using "dark_blue" as the value and "dark blue" as the display text, both for content entry and for the filter. That way, the filter would be color:dark_blue.
Then your index can use "Starts with" as the analyzer type. A search for "dark blue" is then matched against the value "dark_blue", and a search for "blue" will not return it, because "dark_blue" doesn't start with "blue".

Solr edismax parser and multiple fields search

I use the edismax query parser to handle user queries against our Solr 4.10.3 server.
I configured the q.op parameter to AND and completely disabled the mm parameter in order to hit only 100% matches.
When users search for multiple terms in a single field everything works fine.
For example, the query food:(beer cola pizza) returns only those documents that contain all of the terms beer, cola and pizza in the field food, which is the expected behaviour.
But when users search in multiple fields, Solr seems to forget about the q.op configuration and behaves as if the parameter was set to OR.
For example, the query food:(beer cola pizza) AND color:(green yellow blue) returns all documents that contain any one of the terms beer, cola OR pizza in the field food and any one of the terms green, yellow OR blue in the field color, which isn't the expected behaviour.
A workaround is to explicitly prefix each term with the + operator, like this: food:(+beer +cola +pizza) AND color:(+green +yellow +blue).
But I would need to add this operator in our Java web application, which makes it a kind of hard-coded feature. If users decide to configure the q.op operator back to OR, the hard-coded + would cause problems, I think.
Is there a way to reach the expected search results by configuration?
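For reference, the workaround described above could be made to follow the configured operator instead of being fixed. The sketch below is in Python for brevity (the application in question is Java), and the helper name and its default_operator argument are hypothetical; it only builds the query string and does not address the configuration question itself.

```python
def field_clause(field, terms, default_operator="AND"):
    """Build a clause like food:(+beer +cola +pizza).

    The + prefix is added only when the configured operator is AND,
    so switching back to OR leaves the terms optional.
    """
    prefix = "+" if default_operator.upper() == "AND" else ""
    return f"{field}:(" + " ".join(prefix + term for term in terms) + ")"


query = " AND ".join([
    field_clause("food", ["beer", "cola", "pizza"]),
    field_clause("color", ["green", "yellow", "blue"]),
])
```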

Menu extracting

I am interested in extracting and structuring information about restaurant menus. What is needed is to extract the items from the menu in the form category / name / price.
For instance, we have the following website. Here we have a drinks section with a number of items. For that website I'd like to be able to extract
Drink / Cappuccino / € 1,50
SANDWICHES / filled sandwich, pistolet (round roll) or emperor roll / € 1,30
etc ...
Of course it shouldn't be limited only to this website.
The only way I can see to handle that is applying a bunch of regexps, but I don't believe listing all possible dish names is feasible.
I know that the topic might be too broad for a question, but anyway any suggestions or references to relevant articles or books will be much appreciated.
This seems quite possible. You may not be able to list all possible dishes, but you can list all possible categories.
Assuming that in every menu the dish name follows the category name and is itself followed by the price, you can identify dish names.
The algorithm will look like this:
foreach (category : category_list):
    foreach (word : document):
        if (category == word):
            dish = read next (if the data is structured as a table, read the next row or column)
            price = read next and check its format to see whether it is a currency amount or a price
The point is you will need to analyse different websites to understand how the information is structured and prepare your algorithm to deal with all possible structures.
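The algorithm above can be sketched in runnable form as follows. This is a minimal Python version under stated assumptions: the page text has already been split into one entry per line, prices look like "€ 1,50" (the PRICE pattern is an assumption and would need extending for other formats), and each dish name occupies a single line directly before its price.

```python
import re

# Assumed price format: euro sign, optional space, digits, optional decimals.
PRICE = re.compile(r"^€\s*\d+(?:[.,]\d{1,2})?$")


def extract_items(lines, categories):
    """Return (category, dish, price) tuples from a list of text lines."""
    known = {category.lower() for category in categories}
    items, category, pending = [], None, None
    for line in (raw.strip() for raw in lines):
        if not line:
            continue
        if line.lower() in known:
            # A category heading: following dishes belong to it.
            category, pending = line, None
        elif PRICE.match(line):
            # A price closes the dish name read just before it.
            if category and pending:
                items.append((category, pending, line))
            pending = None
        else:
            # Anything else is treated as a candidate dish name.
            pending = line
    return items


menu = ["Drink", "Cappuccino", "€ 1,50", "SANDWICHES", "filled sandwich", "€ 1,30"]
items = extract_items(menu, ["drink", "sandwiches"])
```

Real pages will need a structure-specific step before this one (for example reading table rows or columns), which is where the per-site analysis mentioned above comes in.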
