Solr title search failing

I am indexing the title field for a few products in Solr, but when I search, those titles do not come back in the response.
For example, I am storing the following as a title: Baboons Typing Tshirt
None of the following searches return any results:
1) title:Baboons
2) title:(Baboons Typing Tshirt)
3) title:(Baboons*)
On the other hand, if I search like this, I get a lot of results:
1) title:(Tshirt)
I have indexed many titles containing the word Tshirt, but I want to find one specific title, and that search fails.
I don't know whether Solr is ignoring the first words or doing something else entirely.
My question is basically this: if I have a search title with lots of words, I would like to match it against the indexed title that shares the most terms with it.
How can I do that?
Thanks

Solr already works like that by default; you don't have to change anything for that.
What you do have to be careful about is how you set up your fields in schema.xml, i.e. how analysis is done.
You can use the Analysis page in the Solr admin interface to see exactly how your title field (at index time) and your query (at search time) are processed (tokenized, transformed).
Remember that a match only occurs when the term is identical (case and everything) on both sides, index and query.
To open your index and see how Solr has actually indexed your data, use Luke.
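
To make the "identical on both sides" point concrete, here is a minimal sketch in plain Python. It is not Solr code: the `analyze` and `matches` helpers are hypothetical stand-ins for an analysis chain that only splits on whitespace and optionally lowercases. It shows how a term can fail to match when index-time and query-time analysis disagree.

```python
def analyze(text, lowercase):
    """Toy analyzer: split on whitespace, optionally lowercase each token."""
    tokens = text.split()
    return [t.lower() for t in tokens] if lowercase else tokens


def matches(indexed_title, query_text, index_lowercase, query_lowercase):
    """Every analyzed query token must equal some analyzed index token."""
    indexed_tokens = set(analyze(indexed_title, index_lowercase))
    query_tokens = analyze(query_text, query_lowercase)
    return all(tok in indexed_tokens for tok in query_tokens)


title = "Baboons Typing Tshirt"

# Same analysis on both sides: the term is found.
same_analysis = matches(title, "Baboons", index_lowercase=True, query_lowercase=True)

# Index lowercases, query does not: "Baboons" != "baboons", so nothing matches.
mixed_analysis = matches(title, "Baboons", index_lowercase=True, query_lowercase=False)

print(same_analysis, mixed_analysis)
```

The real analysis chain for a field is whatever schema.xml defines, so the Analysis page remains the place to check what actually happens to a given title and query.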

Related

How can I easily get search context around search term with Typesense?

I currently use Typesense to search in an HTML database. When I search for a term, I would like to retrieve N characters before and N characters after the term found in search.
For example, I search for "query" and this is the sentence that matches:
Let's repeat the query we made earlier with a group_by parameter
I would like to easily retrieve a fixed number of letters (or words) before and after the term, without breaking any words, so that I can show it in the presumably small area where the search results are displayed.
For this particular example, I would be showing:
..repeat the query we made earlier..
Is there a feature like this in Typesense?
I have checked Typesense's documentation, without any luck.
The feature you're referring to is called snippets/highlights and it's enabled by default. You can control how many words are returned on either side of the matched text using the highlight_affix_num_tokens search parameter, documented under the table here: https://typesense.org/docs/0.23.1/api/search.html#results-parameters
highlight_affix_num_tokens
The number of tokens that should surround the highlighted text on each side. This controls the length of the snippet.
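
As a rough sketch of how that parameter could be passed, the Python below only builds the search URL and does not send a request. The host `http://localhost:8108`, the collection name `pages`, and the field name `content` are assumptions made for illustration; a real request would also need the `X-TYPESENSE-API-KEY` header.

```python
from urllib.parse import urlencode


def build_search_url(base_url, collection, query, query_by, affix_tokens):
    """Build a Typesense search URL whose snippets keep `affix_tokens` tokens per side."""
    params = {
        "q": query,
        "query_by": query_by,
        # Number of tokens kept on each side of the highlighted match.
        "highlight_affix_num_tokens": affix_tokens,
    }
    return f"{base_url}/collections/{collection}/documents/search?{urlencode(params)}"


# Hypothetical host, collection, and field names.
url = build_search_url("http://localhost:8108", "pages", "query", "content", 2)
print(url)
```

The snippet for each hit then comes back in the highlight section of the search response.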

ArangoSearch: how to search without specifying the document field?

I'm looking into ArangoSearch for the first time and it looks like pretty good functionality.
However, in all the tutorials, despite the ability to tell it to index all fields, one cannot do a 'blind' search across all fields of the document. Take the example below:
FOR d in myView SEARCH d.text IN ["quick", "brown"] RETURN d
I don't seem to have the ability to just search d entirely without specifying each individual field that I want to include in my search. Is that correct and if so, why is that and are there workarounds? I'm dealing with a lot of different collections with a lot of different fields that can contain a relevant term, it would be a shame if I'd have to tabulate all of them to make an expansive search.

FTSearch that looks for '-'

Does anybody know if it is possible to search for '-' using FTSearch?
Set col = db.ftsearch({ [services] = "-"}, 0)
That request does not work and instead returns:
Notes error: Full text error; see log for more information ([services] = "-")
Short answer is no.
The full text search treats most symbol characters as a white space. The exception is if the search term itself is wrapped in quotes.
The FT search engine also uses 3-grams for searching. This means that search terms shorter than 3 characters will not return the results you expect. White space is treated as part of the term in that search, but only in the context of the found text.
For example: "ce " would find "space " but not "space." or "space" or "spaced".
If you are looking for the field that only contains "-", then a better solution is to create a view with a column containing that field value, and/or filter by that field being that value.
It looks like you are trying to do a full text search in a view. If you are working with a view, you would probably get better response times and less server impact by using @Formula language.
I try to stay away from full text searches on the entire database. You can search on a view collection for faster results. There is no restriction on how many views you can have in a database, although everything has a cost. There are many little tricks that can be used to get better results, so please give us more details on what you are trying to do.

How do I penalize (downward boost) in solr?

I need to do three things and I just can't figure this out:
1) Penalize a missing facet (say "brand") in a search query. I tried &defType=dismax&qf=(*:* AND -brand:[* TO *])^1000 but it penalizes all results.
2) Up- or down-boost a particular facet if it contains a particular string, irrespective of what the query was. For example, I want to up-boost any result containing free/freebi/freebie in the title string and down-boost any result containing "pre-used" in the title string.
I tried &defType=dismax&qf=(title:[FREEBIES OR FREE])^1000 but it doesn't seem to work.
Does this work?
fq=<YOUR MAIN QUERY>&bq=(-brand:[* TO x])^9999
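
A commonly used alternative, and a different technique from the negative range query above, is to boost the documents you do want rather than trying to penalize the ones you don't. The Python sketch below only assembles request parameters. The boost values (10 and 5), the use of edismax, and the exact `bq` clauses are assumptions for illustration, and whether the title clauses match depends on how the title field is analyzed.

```python
from urllib.parse import urlencode, parse_qs


def build_boost_params(user_query):
    """Build edismax parameters that reward wanted traits instead of penalizing unwanted ones."""
    return [
        ("defType", "edismax"),
        ("q", user_query),
        ("qf", "title"),
        # Boost documents that HAVE a brand, which ranks brandless ones lower.
        ("bq", "brand:[* TO *]^10"),
        # Boost titles mentioning free/freebie, regardless of the user query.
        ("bq", "title:(free OR freebie)^5"),
        # "Down-boost" pre-used items by boosting everything that is NOT pre-used.
        ("bq", '(*:* -title:"pre-used")^5'),
    ]


query_string = urlencode(build_boost_params("tshirt"))
boost_queries = parse_qs(query_string)["bq"]
print(query_string)
print(boost_queries)
```

Because `bq` clauses only add to the score, they reorder results without filtering any of them out.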

How can I configure Sitecore search to retrieve custom values from the search index

I am using the AdvancedDatabaseCrawler as a base for my search page. I have configured it so that I can search for what I want, and it is very fast. The problem is that as soon as you want to do anything with the search results that requires accessing field values, performance degrades badly.
The main search results part is fine: even if the search returns 1000 results, I am only showing 10 or 20 results per page, which means I only have to retrieve 10 or 20 items. However, in the sidebar I am listing various filtering options with the number of results associated with each option (eBay style). To retrieve these filter options I perform a relationship search based on the search results. Since the search results only contain SkinnyItems, it has to call GetItem() on every single result to get the actual item and read the value that I'm filtering by. In other words, it will call Database.GetItem(id) 1000 times! Obviously that is not terribly efficient.
Am I missing something here? Is there any way to configure Sitecore search to retrieve custom values from the search index? If I can search for the values in the index why can't I also retrieve them? If I can't, how else can I process the results without getting each individual item from the database?
Here is an idea of the functionality that I’m after: http://cameras.shop.ebay.com.au/Digital-Cameras-/31388/i.html
Klaus answered on SDN: use faceting with Apache Solr or similar.
http://sdn.sitecore.net/SDN5/Forum/ShowPost.aspx?PostID=35618
For now I've resolved this by defining dynamic fields for every field that I need to filter by or return in the search result collection. That way I can achieve the required faceted search without grabbing field values from the database. I assume that adding the dynamic fields means taking a performance hit when rebuilding the index, but I can live with that.
In the future we'll probably look at utilizing a product like Apache Solr.
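
The idea behind that workaround can be sketched in a few lines of plain Python. This is not Sitecore or Lucene code: the `hits` list is hypothetical data standing in for search results that already carry the stored field values, so the counts are computed without any per-item database lookup.

```python
from collections import Counter


def facet_counts(hits, field):
    """Count how many hits carry each value of `field`, using only values stored in the index."""
    return Counter(hit[field] for hit in hits if field in hit)


# Hypothetical search hits: each dict stands in for the fields stored with an index entry.
hits = [
    {"id": "1", "brand": "Canon", "type": "DSLR"},
    {"id": "2", "brand": "Nikon", "type": "DSLR"},
    {"id": "3", "brand": "Canon", "type": "Compact"},
]

brand_counts = facet_counts(hits, "brand")
print(brand_counts)
```

A dedicated engine such as Apache Solr computes these counts inside the index itself, which is why it was suggested as the longer-term option.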
