Complex data indexing with Lucene? - search

I need to search a database of boxes. Each box can contain multiple items. Each box has a general description, and each item has a description along with some descriptor/value pairs (e.g., size:large). I want to search this database using a similar set of information: I'm looking for a box like x, holding items a, b, and c. The search should return the box that contains the most similar items.
However, I'm confused about how I should go about indexing this data in Lucene. I know that I can index multiple values in a single field and keep them apart by overriding the getPositionIncrementGap() method of the analyzer, but how do I link descriptors to a specific item? For example, I may have 2 items in my box:
Item 1
description: pair of shoes
color: blue
Item 2
description: Jacket
color: red
and I search for
Item A
description: pair of shoes
color: red
Item B
description: Jacket
color: blue
To my knowledge, this will return the box with items 1 and 2 as a match. But I want each color to apply only to its specific item, and I need to be able to match multiple items against a box simultaneously to find the most similar box.
I'm using Lucene because this is a batch search job that runs on a regular basis but isn't live. Searches only occur in narrow windows, so a search server like Solr is not needed.

In this case you need to store the relation between the parent document (the box) and its child documents (the items). Fortunately, there is BlockJoinQuery for exactly this purpose. This article describes it in detail for Lucene 3.x, or you can use ToParentBlockJoinQuery, which is available in Lucene 4.x.
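To make the difference concrete, here is a minimal sketch in plain Python (not Lucene code) that contrasts matching against values pooled at the box level with matching each wanted item against a single child item, which is the behaviour a parent/child block join is meant to give. The data structures and the function names `flattened_match`, `per_item_match`, and `box_score` are made up for illustration.

```python
# Plain-Python illustration only; none of these names come from Lucene.
box = {
    "description": "storage box",
    "items": [
        {"description": "pair of shoes", "color": "blue"},
        {"description": "jacket", "color": "red"},
    ],
}

query_items = [
    {"description": "pair of shoes", "color": "red"},
    {"description": "jacket", "color": "blue"},
]


def flattened_match(box, wanted):
    """Match against item values pooled into multi-valued box-level fields."""
    descriptions = {item["description"] for item in box["items"]}
    colors = {item["color"] for item in box["items"]}
    return wanted["description"] in descriptions and wanted["color"] in colors


def per_item_match(box, wanted):
    """Match only if one single item satisfies every descriptor at once."""
    return any(
        all(item.get(key) == value for key, value in wanted.items())
        for item in box["items"]
    )


def box_score(box, wanted_items, match):
    """Count how many wanted items the box satisfies under a matching rule."""
    return sum(1 for wanted in wanted_items if match(box, wanted))


flattened_score = box_score(box, query_items, flattened_match)
per_item_score = box_score(box, query_items, per_item_match)
```

With the example from the question, the pooled version counts both wanted items as matches (the false positive described above), while the per-item version counts none.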

Related

Create relationship between two lookup columns in a sharepoint list

I have a Sharepoint List 'TeamValues' that defines values that should appear in another list.
TeamValues List:
Title Team Sub-Team
1 A Blue
2 A Green
3 B Yellow
4 C Silver
5 C Gold
I have a list that users can edit and in this list I want to lookup 'TeamValues' list and create a dropdown list for users to select a value from the Team column. This needs to be unique so in this case the values that should appear are: A,B,C
Once they select a value I need a second column to populate a drop down list that users can select E.g. If they select Team: A, then the second drop down should show: Blue, Green as the options.
How can I do this using Sharepoint? I don't have any code access and only have the sharepoint GUI to work with.
Is this possible?
I have created a lookup choice column, Team, that looks at my TeamValues list; however, it shows all values, e.g. A, A, B, C, C. I have also created a lookup choice column for the sub-team that shows every value. I don't know how to link these together, get a relationship between the two choice drop-downs, or remove duplicate values. I tried the following option: 'the Enforce unique values is not displayed', which did not work.
No code applicable
I am sorry to say this, but I think there is no OOB functionality to show distinct values of a lookup column or to set up this kind of relation without any code. I also did some research just to be sure, but I could not find anything like that.
If you would consider some code, you can always use JavaScript with JSLink to achieve this. It's not that hard. It's a JS file that you can add to a library on the SharePoint site and then attach to a web part manually, without any deployment. After that, you can use JavaScript to override the default behavior of any control or column and implement this kind of relation or show only distinct values.

Kentico Custom Lucene Index with multiple facets that are similar - how to query?

I have created a custom index that stores multiple extra fields that I use to filter by. Say for example I am storing some facets to select Kite colors. Some kites have one color, others have multiple.
Kite A color: dark blue red orange deep red
Kite B color: blue
Where kite A colors are dark blue, red, orange, and deep red.
A query as such
+color:blue
would return both kite A and kite B, even though kite A's color is dark blue, not blue. Only kite B should be returned.
My question is this (and it has probably been hard for me to find an answer because I don't know the proper terminology): how should I store the values in Lucene so that I can separate them (with a delimiter?). In addition, how do I phrase the query so that if I search for
color:red it does not return rows that have the value color:"deep red"? And if I were to search for color:(deep red), it should not return rows that have "red" but not "deep red".
Take a look at search index analyzer types: search results depend on the analyzer type + search settings of the object (page type, custom table, etc.).
I think the color field is marked as tokenized in the search settings, which is why the search returns results that match individual tokens (subsets) of the field's value. If tokenization is disabled, the search only returns items whose full field value exactly matches the search expression.
One suggestion here. Are you asking the visitor to type in the color (I assume not), or do you have a filter list that they can check to filter?
If it's a filter list, then you may want to consider using "dark_blue" as the value and "dark blue" as the display text, both for content entry and for the filter. That way, the filter would be color:dark_blue.
Then your index can use "Starts with" as the analyzer type, so a search for "dark blue" looks at the value "dark_blue", which a search for "blue" will not return: "dark_blue" will not show up because it doesn't start with "blue".
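The difference between a tokenized field and whole-value matching can be sketched in plain Python. This is only an illustration of the idea, not Kentico or Lucene code; the `kites` data mirrors the question, and the function names are invented for the example.

```python
# Illustration only: each kite stores its colors as separate whole values.
kites = {
    "Kite A": ["dark blue", "red", "orange", "deep red"],
    "Kite B": ["blue"],
}


def tokenized_match(values, term):
    """Behaves like a tokenized field: any single word can match."""
    tokens = {token for value in values for token in value.split()}
    return term in tokens


def exact_match(values, term):
    """Behaves like an untokenized field: only a whole value can match."""
    return term in values


def search(match, term):
    """Return the names of all kites whose colors match the term."""
    return sorted(name for name, values in kites.items() if match(values, term))


tokenized_blue = search(tokenized_match, "blue")
exact_blue = search(exact_match, "blue")
exact_deep_red = search(exact_match, "deep red")
```

Under the tokenized rule, "blue" finds both kites; under the whole-value rule it finds only Kite B, and "deep red" finds only Kite A.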

Solr edismax parser and multiple fields search

I use the edismax query parser to handle user queries against our Solr 4.10.3 server.
I configured the q.op parameter to AND and completely disabled the mm parameter in order to hit only 100% matches.
When users search for multiple terms in a single field everything works fine.
For example the query food:(beer cola pizza) returns only those documents that contains all of the terms beer, cola and pizza in the field food which is the expected behaviour.
But when users search in multiple fields Solr seems to forget about the q.op configuration and behaves as if the parameter was set to OR.
For example the query food:(beer cola pizza) AND color:(green yellow blue) returns all those documents that contains one of the terms beer, cola OR pizza in the field food and those that contains one of the terms green, yellow OR blue in the field color which isn't the expected behaviour.
A workaround is to explicitly prefix each term with the + operator, like this: food:(+beer +cola +pizza) AND color:(+green +yellow +blue).
But I would need to add this operator in our Java web application, which makes it a kind of hard-coded feature. If users decide to configure the q.op operator back to OR, I think the hard-coded + would cause problems.
Is there a way to reach the expected search results by configuration?
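If the workaround has to live in application code, one way to avoid hard-coding it is to apply the + prefix only when the configured default operator is AND. The following is a rough sketch in Python rather than Java, and the helper name `require_all_terms` and its `default_operator` argument are hypothetical. It only handles flat field:(term term ...) groups; quoted phrases and nested boolean expressions inside the parentheses are not handled.

```python
import re


def require_all_terms(query, default_operator="AND"):
    """Prefix bare terms inside field:(...) groups with '+' when AND is wanted."""
    if default_operator.upper() != "AND":
        # Leave the query untouched so an OR configuration keeps working.
        return query

    def add_plus(match):
        field = match.group(1)
        terms = match.group(2).split()
        # Keep terms that already carry an explicit + or - prefix as they are.
        required = " ".join(t if t[0] in "+-" else "+" + t for t in terms)
        return f"{field}:({required})"

    # Matches simple groups such as food:(beer cola pizza).
    return re.sub(r"(\w+):\(([^()]*)\)", add_plus, query)


original = "food:(beer cola pizza) AND color:(green yellow blue)"
rewritten = require_all_terms(original, "AND")
unchanged = require_all_terms(original, "OR")
```

Here `rewritten` has the same form as the workaround query above, and `unchanged` is the original query, so switching the configured operator back to OR would not leave stray + prefixes behind.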

SharePoint 2013 KQL each query with xrank on multi value taxonomy field

In my search query I query for items which have a multi value taxonomy field.
The publishing page where I am running the query from also has the same multi value tax field (for example "color").
Example:
List Items (have color column)
- Item 1: Red, Blue
- Item 2: Yellow, Red, White
- Item 3: Red, Blue, Green
- Item 4: Black
Publishing Page (has color column)
- defined colors are: Red, Green
All items match "red", but "Item 3" is obviously my best match, as it has both Red and Green.
I need to return all items. I do not want to filter, I want to rank items.
First attempt: The following query returns all items.
However not each single value out of my multi value field is boosted but only the whole value altogether.
ContentTypeId:0x010600C0DEB45360CF4E9EB452AEFE3A238A1CA1* XRANK(cb=100) MyColorManagedProperty:{Page.MyColorColumn}
Problem:
For example Item 4 will have the same ranking as item 1 and 2. The boost of 100 will only apply to Item 3 (which has Red AND Green). I need a solution where Item 1 and 2 are treated higher than Item 4 as they at least contain "Red".
Update 1: (in a previous version I used a multi value choice field)
Technet: http://technet.microsoft.com/en-us/library/jj683123(v=office.15).aspx - obviously multi choice fields are not supported. So in the meantime I switched to a multi value taxonomy field.
ContentTypeId:0x010600C0DEB45360CF4E9EB452AEFE3A238A1CA1* XRANK(cb=10) {|owstaxIdRsColor:{Page.RsColor}}
What I get is: XRANK (Property or Property or Property or ...)
What I want is: XRANK Property or XRANK Property or XRANK Property or ...
Update 2:
Proper bracket placing works its wonders - so I thought. I tried to reproduce the desired result recently and failed miserably. I'm still looking for a solution. The approach below does NOT work.
ContentTypeId:0x010600C0DEB45360CF4E9EB452AEFE3A238A1CA1* XRANK(cb=100) (owstaxIdRsColor:{|{Page.RsColor}})
The result is: ContentTypeId:0x010600C0DEB45360CF4E9EB452AEFE3A238A1CA1* XRANK(cb=100) (owstaxIdRsColor:((#0655a6c23-6f73-43d4-b451-d01e0400717f) OR (#0de2d6451-8825-4c4f-9b02-0b22089b6540))) which basically ranks every item with its "default" value (of 5.xxx).
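For reference, the "one XRANK per value" shape can be written by nesting XRANK expressions, in the form (match XRANK(cb=...) rank) XRANK(cb=...) rank. Below is a small Python sketch that builds such a string from a list of values. It assumes the page's values are available individually (for example in custom code rather than through the {Page...} query variable), it has not been tested against SharePoint, and the helper name `chained_xrank`, the shortened content type ID, and the quoted value format are made up for illustration.

```python
def chained_xrank(base_query, managed_property, values, boost=100):
    """Wrap the base query in one XRANK clause per value."""
    query = base_query
    for value in values:
        # Each pass boosts items matching this one value on top of the rest.
        query = f'({query}) XRANK(cb={boost}) {managed_property}:"{value}"'
    return query


kql = chained_xrank(
    "ContentTypeId:0x0106*",   # shortened, hypothetical content type filter
    "MyColorManagedProperty",
    ["Red", "Green"],
)
```

With this shape, an item matching both values would collect both boosts, an item matching one value would collect one, and an item matching none would keep its default rank.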

Solr get field where the maximum keyword match was found

I am not sure if this can be done with solr. But this is what I am doing for an online store search functionality. The main search box is a dismax parser for multiple fields :
qf: description^1.0 color^1.0 name^1.0 size^1.0
Equal weight across multiple fields for now. I also create facets on some of these fields,
e.g. color and size. The client has requested that when a user searches for a keyword that matches a value in any of the faceted fields, the corresponding filter appears selected in the front end. So if the user searches for 'red', the color facet for red should appear selected.
Since Solr is searching across multiple fields, I don't think this is possible. Or is it?
This is not really about Solr. First, the requirement is flawed at the user-experience level. Traditionally, facets (also known as guided navigation) are used to filter search results. Just having "red" match across multiple fields does not mean that all the products returned have the color red.
When you show "red" selected in the color filter, you are visually telling the user that all products in the search results are red. If that is not the case, do not do it.
If that is the case, then ideally, when the user enters "red", you should first check the user input against the color facet values (preferably against a cached list) and then add that color to the query as a filter with the fq=color:red parameter, so that it is a "true" filter and part of your search query. This can be done very quickly against all known displayed facets (color, size, etc.), activating them automatically if there is a match. Used right, that would actually make a cool feature.
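As a rough sketch of that last idea in Python: check each word of the user input against a cached set of known facet values and add an fq parameter for every hit. The `KNOWN_FACETS` contents and the function name `build_params` are assumptions for the example; in practice the cached values would come from the facet counts of your own index.

```python
from urllib.parse import urlencode

# Hypothetical cached facet values, keyed by field name.
KNOWN_FACETS = {
    "color": {"red", "blue", "green"},
    "size": {"small", "medium", "large"},
}


def build_params(user_input):
    """Keep the user's query as q and add an fq for each matching facet value."""
    params = [("q", user_input)]
    for word in user_input.lower().split():
        for field, values in KNOWN_FACETS.items():
            if word in values:
                # This is a real filter, so the UI can show the facet as selected.
                params.append(("fq", f"{field}:{word}"))
    return params


params = build_params("red shoes")
query_string = urlencode(params)
```

The front end can then mark every facet that produced an fq entry as selected, and the results are guaranteed to agree with what the selected filter claims.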
