Page crawler index only matches the exact word in Kentico search

I created a page crawler index with all page types in the index content, and I'm testing all the search modes. The index only matches the exact word. For example, if I'm looking for the word "test", I have to type "test" to find it; if I search for "tes", nothing is found.
How can I update this behavior?

You should take a closer look at the analyzer types and pick the one that suits you best. If you set it to 'Subset', a search for 'tes' will return results, but so will a search for 'est'. In this link, you can find more about them.
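To make the trade-off concrete, here is a rough Python sketch of the idea behind subset matching. It is not Kentico's implementation: the function names are made up for illustration, and the prefix-only variant is included purely as a contrast, on the assumption that it is closer to what the question is asking for.

```python
def exact_tokens(word):
    """Index only the whole word: 'test' is findable only as 'test'."""
    return {word}


def prefix_tokens(word):
    """Index every leading fragment: 't', 'te', 'tes', 'test'."""
    return {word[:end] for end in range(1, len(word) + 1)}


def subset_tokens(word):
    """Index every contiguous fragment, so 'est' and 'es' are included too."""
    return {
        word[start:end]
        for start in range(len(word))
        for end in range(start + 1, len(word) + 1)
    }


def matches(query, word, tokenizer):
    """True if the query equals one of the fragments indexed for the word."""
    return query.lower() in tokenizer(word.lower())


if __name__ == "__main__":
    for name, tokenizer in [
        ("exact", exact_tokens),
        ("prefix", prefix_tokens),
        ("subset", subset_tokens),
    ]:
        found = [q for q in ("test", "tes", "est") if matches(q, "test", tokenizer)]
        print(name, found)
```

The subset variant indexes many more fragments per word, so it is the most permissive of the three and also the one most likely to return matches you did not intend.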

Related

Direct link to subsections and more than one result per page in sphinx search

From this thread I understood how the search index in Sphinx is built when the docs are built.
However, I am wondering:
1. Is there a way to show all occurrences of the searched keyword within the same page (not only the first one)?
2. If the searched keyword is inside a subsection, is it possible to link directly to it in the search results, rather than to the main chapter?
Point 2 matters in particular because it is hard to find the keyword if the page is lengthy and the keyword is far from the top.
EDIT1
Just to clarify what I have now...
Take this search for instance. The searched term ("widget") is only returned once per page (i.e. chapter), although the term is repeated many times in the same page. I would like the search to return all of them and, if possible, a link to the subsection where each one occurs rather than only a link to the chapter.
I am using the "sphinx_rtd_theme" theme if that makes any difference.

Wikipedia wildcard search not working?

I'm trying to do a wildcard search on Wikipedia but the search is not behaving the way the instructions say it should. Here's the advanced search help page:
https://en.wikipedia.org/wiki/Help:Advanced_search
As an example, it says this regarding a Wildcard search:
the query *stan will match Kazakhstan or Afghanistan or Stan Kenton.
However, when I attempt to do that search (or even click on the embedded link to that search), I only get
the page *stan does not exist
and it just lists a bunch of "Stan" entries starting with "Stan Laurel filmography."
Why would this feature not work? Am I missing something?
It does work. However, because direct matches for "stan" are scored higher than words that merely contain it, Kazakhstan is way down in the results. You can try narrowing the results slightly with intitle:*stan, but that is still bad. A quick check with k*stan shows that the wildcard itself works.
Conclusion: user-written help page has a bad example.
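Here is a toy Python sketch of why the example looks broken even though the wildcard matches. It is not how Wikipedia's search engine is implemented: the title list and the "exact word first" ordering are assumptions made up for illustration.

```python
from fnmatch import fnmatchcase


def title_matches(pattern, title):
    """True if any single word of the title matches the wildcard pattern."""
    lowered = pattern.lower()
    return any(fnmatchcase(word, lowered) for word in title.lower().split())


def rank(pattern, titles):
    """Matching titles, with exact-word hits ahead of wildcard-only hits."""
    core = pattern.strip("*").lower()
    hits = [title for title in titles if title_matches(pattern, title)]
    # False sorts before True, so titles containing the bare word come first.
    return sorted(hits, key=lambda title: core not in title.lower().split())


if __name__ == "__main__":
    titles = ["Kazakhstan", "Stan Laurel filmography", "Afghanistan", "Stan Kenton"]
    print(rank("*stan", titles))
    print(rank("k*stan", titles))
```

With *stan every title matches, but the ones containing the bare word "Stan" crowd out the top of the list; k*stan excludes them, so Kazakhstan is easy to see.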

Solr behind Drupal returns too many results for specific query

We've got Solr sat behind one of our client's Drupal 7 websites, and while it's working well, it returns too many results for what should be quite specific queries. (It also has relevance/weighting problems; but I'm hoping that solving this problem will remove the - literally - irrelevant results.)
For example, searching for the phrase 'particular phrase in london' should return the node with that as its title, quite high up; I don't even think that any other content should be returned. But I find that it's returning lots of content, purely on the fact that it mentions "London"!
Frivolously, searching for the ridiculous phrase 'piecrusts in london' returns a lot of results too, apparently just because they mention London. No content on the site mentions actual piecrusts.
When I search for 'particular phrase in london', here are the parameters that end up in the catalina.out log on the server (whitespace added for clarity):
{spellcheck=false&facet=true&f.im_field_health_topic.facet.mincount=1
&facet.mincount=1&f.ds_created.facet.date.gap=%2B1YEAR
&spellcheck.q=particular+phrase+in+london
&qf=taxonomy_names^2.0&qf=path_alias^5.0&qf=content^40&qf=label^21.0
&qf=tos_content_extra^1.0&qf=ts_comments^20&qf=tm_vid_3_names^200
&facet.date=ds_created
&f.ds_created.facet.date.start=1970-01-01T00:00:00Z/YEAR
&f.bundle.facet.mincount=1&hl.fl=content,ts_comments
&json.nl=map&wt=json&rows=10&fl=id,entity_id,entity_type,bundle,bundle_name,
label,is_comment_count,ds_created,ds_changed,score,path,url,is_uid,
tos_name,tm_node,zs_entity
&start=0&facet.sort=count&f.bundle.facet.limit=50&q=particular+phrase+in+london
&f.ds_created.facet.date.end=2012-01-01T00:00:00Z%2B1YEAR/YEAR
&bf=recip(ms(NOW,ds_created),3.16e-11,1,1)^150.0
&facet.field=im_field_health_topic&facet.field=bundle
&f.im_field_health_topic.facet.limit=50&f.ds_created.facet.limit=50}
hits=1998 status=0 QTime=14
Note that these parameters have been built by Drupal's Apache Solr module; I don't believe we've got any particular custom code of our own that's doing anything to it.
This corresponds to the following URL, if entered directly in the browser:
http://example.com:8081/solr/CLIENT/select?spellcheck=false&facet=true&f.im_field_health_topic.facet.mincount=1&facet.mincount=1&f.ds_created.facet.date.gap=%2B1YEAR&spellcheck.q=particular+phrase+in+London&qf=taxonomy_names^2.0&qf=path_alias^5.0&qf=content^40&qf=label^21.0&qf=tos_content_extra^1.0&qf=ts_comments^20&qf=tm_vid_3_names^200&facet.date=ds_created&f.ds_created.facet.date.start=1970-01-01T00:00:00Z/YEAR&f.bundle.facet.mincount=1&hl.fl=content,ts_comments&json.nl=map&wt=json&rows=10&fl=id,entity_id,entity_type,bundle,bundle_name,label,is_comment_count,ds_created,ds_changed,score,path,url,is_uid,tos_name,tm_node,zs_entity&start=0&facet.sort=count&f.bundle.facet.limit=50&q=particular+phrase+in+London&f.ds_created.facet.date.end=2012-01-01T00:00:00Z%2B1YEAR/YEAR&bf=recip(ms(NOW,ds_created),3.16e-11,1,1)^150.0&facet.field=im_field_health_topic&facet.field=bundle&f.im_field_health_topic.facet.limit=50&f.ds_created.facet.limit=50
This URL returns nearly 2000 results - that's most of the content on the site! I've experimented with removing each query parameter in turn, and the only ones that seem to make any difference are qf and q: if I remove qf, I get zero results; if I remove q, I get more results back!
I guess there are two questions here:
1. Is there anything in these parameters that tells Solr "don't worry whether 'particular phrase' or 'piecrusts' appears: just collate the results for 'london'" and then order by relevancy? I would add that I think 'in' is listed in the stopwords file, so we can probably ignore the effect of that (?)
2. Or is this something in the (standard Drupal) schema that I need to change?
I appreciate that sometimes search is better for the visitor if it's inclusive; Google does return results even if it doesn't find perfect matches. But, stopwords and stemming aside, the client does require that searches return only results where all words appear in the content.
As mentioned in the post at http://drupal.org/node/1783454, the Apache Solr Search Integration module makes use of the mm ("minimum should match") param, which sets how many of the keywords a document has to match; as configured, documents that match more of the keywords rank higher. Looking through the docs, there are other ways you can use the parameter to affect rankings as well. Therefore the results produced by Apache Solr Search Integration are weighted towards AND-like behaviour, even though more results are returned as you add more keywords. The benefit of this param is that in cases where the user enters keywords that are too restrictive, results will still be returned. Displaying no results is a really quick way to guide people away from your site.
How are you displaying the search?
Maybe you could use Solr Views to limit the search range?
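To illustrate the effect of a "minimum should match" threshold, here is a small Python sketch. It only simulates the idea and is not Solr's scoring or the module's actual configuration: the documents, the stopword list, and the way the fraction is rounded are all assumptions.

```python
import math

STOPWORDS = {"in"}  # assumption: 'in' is a stopword, as the question suspects


def terms(text):
    """Lower-cased words with stopwords dropped."""
    return [word for word in text.lower().split() if word not in STOPWORDS]


def search(query, docs, mm=1.0):
    """Return docs matching at least the fraction mm of the query terms.

    Docs that match more terms come first; mm=1.0 behaves like AND.
    """
    wanted = terms(query)
    required = max(1, math.ceil(mm * len(wanted)))
    hits = []
    for doc in docs:
        words = set(terms(doc))
        matched = sum(1 for word in wanted if word in words)
        if matched >= required:
            hits.append((matched, doc))
    hits.sort(key=lambda hit: -hit[0])
    return [doc for _, doc in hits]


if __name__ == "__main__":
    docs = [
        "Particular phrase in London",
        "Clinics in London",
        "Opening hours in Leeds",
    ]
    print(search("piecrusts in london", docs, mm=0.5))
    print(search("piecrusts in london", docs, mm=1.0))
    print(search("particular phrase in london", docs, mm=1.0))
```

With a low threshold, anything mentioning London comes back for the piecrusts query; with the threshold at 100% only documents containing every non-stopword term are returned, which is the behaviour the client is asking for.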
http://drupal.org/project/apachesolr_views
thanks
Nick

Kentico CMS: Display the phrase that was searched for on a search results page

On my search results page I would like to display the phrase that the user searched for.
For example, instead of using the title Search Results: as you can see in the screenshot, I would instead like to use the title Search Results for vitae:
Is it possible to pull in the searched word/s?
At present the title is hard coded within the web part container that surrounds the search results web part.
Screenshot of search results currently:
I have had a response back from Kentico about how to do this. See solution below.
Your Search Results: text is probably set in the Container title property of your smart search web part, so please just change it to the following:
Search Results for {?searchtext?}:
This works perfectly! {?searchtext?} pulls in the searched word.
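For anyone curious what a macro like this does conceptually, here is a short Python sketch that substitutes {?name?} placeholders with values from the URL's query string. It is only an illustration and not Kentico's macro engine; the function name and the example URL are made up, and it assumes the query string parameter is called searchtext, matching the macro.

```python
import re
from urllib.parse import parse_qs, urlparse

# Matches placeholders of the form {?name?}.
QUERY_MACRO = re.compile(r"\{\?(\w+)\?\}")


def resolve_query_macros(template, url):
    """Replace each {?name?} with the first value of that query parameter."""
    params = parse_qs(urlparse(url).query)

    def replace(match):
        values = params.get(match.group(1))
        # Unknown parameters resolve to an empty string in this sketch.
        return values[0] if values else ""

    return QUERY_MACRO.sub(replace, template)


if __name__ == "__main__":
    title = "Search Results for {?searchtext?}:"
    print(resolve_query_macros(title, "https://example.com/search?searchtext=vitae"))
```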

Does SharePoint Search support range tags?

I am working on a project to digitize approximately 1 million images for which metadata will be added to facilitate search.
Each image is, for example, a page in a dictionary. But not text. Just a static scanned image. OCR is not an option :(
My objective is to emulate the current search procedure, which consists of looking through the alphabetical entries until the correct page is found. In the absence of machine-readable text, I am looking at tagging each page with a dictionary range tag, for example (Apple-Canada). So if someone searches for "Banana", it should hit the (Apple-Canada) range tag.
Is this supported in SharePoint out of the box? If not, is there an addon product which provides this functionality or am I looking at building a customized extension?
Any help will be appreciated :)
Installing the IFilter for TIF files is done with a couple of clicks and gives you free OCR along the way. Very good for scanned pages.
On your question though: No, SharePoint does not have any kind of "range" tags or fields. The only vaguely similar thing to what you are requesting is the Thesaurus of the search. There you could define acronyms and synonyms for words and it would actually search for something else. So you could enter Banana but it would actually search for Apple. Some examples here: How to: Customize the Thesaurus in SharePoint Search and Search Server.
Other than that I can only think of a custom implemented search provider giving you the flexibility you need.
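If you do go the custom route, the lookup itself is simple. Here is a minimal Python sketch of resolving a search term against range tags with a binary search. It has nothing to do with the SharePoint API: the ranges, page ids, and function name are hypothetical, and it assumes the ranges are sorted and do not overlap.

```python
from bisect import bisect_right

# Hypothetical range tags: (first entry, last entry, page id), sorted by first entry.
RANGES = [
    ("aardvark", "apparel", "page-001"),
    ("apple", "canada", "page-002"),
    ("canal", "dolphin", "page-003"),
]
STARTS = [start for start, _, _ in RANGES]


def find_page(term):
    """Return the page whose range contains the term, or None if no range does."""
    key = term.casefold()
    # Last range whose first entry sorts at or before the term.
    index = bisect_right(STARTS, key) - 1
    if index < 0:
        return None
    start, end, page = RANGES[index]
    return page if start <= key <= end else None


if __name__ == "__main__":
    for term in ("Banana", "Canada", "Zebra"):
        print(term, find_page(term))
```

Plain string comparison is used here; a real dictionary would need the collation rules of its language.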
