I am trying to use the copyField directive in Solr to copy some fields into a catch-all field for searching. Unfortunately the field does not seem to be populated via the copyField directives at all.
Here are my source fields:
<field name="firstName" type="text_general" indexed="true" stored="true" required="false" />
<field name="lastName" type="text_general" indexed="true" stored="true" required="false" />
<field name="postCode" type="text_general" indexed="true" stored="true" required="false" />
<field name="emailAddress" type="text_general" indexed="true" stored="true" required="false" />
<!-- suggest field -->
<field name="name_Search" type="textSuggest" indexed="true" stored="true" multiValued="true" />
And here are my copyField directives:
<!-- copy fields -->
<copyfield source="firstName" dest="name_Search" />
<copyfield source="lastName" dest="name_Search" />
<copyfield source="emailAddress" dest="name_Search" />
<copyfield source="postCode" dest="name_Search" />
Now running a query on the "name_Search" field does not yield any results, and the field does not appear in the schema browser.
Do I need to do anything else to get copyField working? I am running Solr v5.2.1.
EDIT
Here is the textSuggest field type used for the catch-all field:
<fieldType class="solr.TextField" name="textSuggest" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StandardFilterFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
In solrconfig.xml, I have configured the suggest handler as follows:
<searchComponent name="suggest" class="solr.SuggestComponent">
<lst name="suggester">
<str name="name">default</str>
<str name="lookupImpl">FuzzyLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">name_Search</str>
<str name="suggestAnalyzerFieldType">textSuggest</str>
<str name="buildOnStartup">true</str>
<str name="buildOnCommit">true</str>
</lst>
</searchComponent>
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy" >
<lst name="defaults">
<str name="suggest">true</str>
<str name="suggest.count">10</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
I know the suggest handler works because, if I explicitly fill the 'name_Search' field, I get results as expected.
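In your schema, use copyField instead of copyfield (with a capital F). XML element names are case-sensitive, so a lowercase <copyfield> element is not picked up as a copy directive.
Source: the Solr documentation.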
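For reference, the question's directives with the corrected element name would look roughly like this. The field names are the ones from the question; since copyField is applied at index time, existing documents need to be reindexed before name_Search is populated.

```xml
<!-- copy fields: note the capital F in copyField -->
<copyField source="firstName" dest="name_Search" />
<copyField source="lastName" dest="name_Search" />
<copyField source="emailAddress" dest="name_Search" />
<copyField source="postCode" dest="name_Search" />
```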
Related
I am very new to Apache Solr and currently trying to understand the concepts. I am using version 6.3. I have created a schema and uploaded a file with a bunch of documents. I do see that 1388 documents are available.
When I put in the q field in the Admin UI "coursetitle:biztalk", I do get the relevant results back but not when I put "biztalk". I thought that I do not need to provide the field name?
Here is the schema:
<field name="courseid" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="coursetitle" type="text_general" indexed="true" stored="true" multiValued="false"/>
<field name="coursetitlesearch" type="text_general" indexed="true" stored="true" multiValued="false"/>
<field name="durationinseconds" type="int" indexed="true" stored="true" />
<field name="releasedate" type="date" indexed="true" stored="true"/>
<field name="description" type="text_general" indexed="true" stored="true"/>
<field name="assessmentstatus" type="text_general" indexed="true" stored="true"/>
<field name="iscourseretired" type="text_general" indexed="true" stored="true"/>
<field name="tag" type="string" multiValued="true" indexed="true" stored="true"/>
<field name="course-author" type="string" multiValued="true" indexed="true" stored="true"/>
You need to specify the field unless you want to search the default field.
When you do not specify a field, Solr searches the default field, which you can configure with the following in the schema:
<defaultSearchField>coursetitle</defaultSearchField>
So if you put the above in schema.xml and then search for something like biztalk in the query param, Solr will search for it as coursetitle:biztalk.
If you want all your fields to be searched without having to specify a field name, look into copy fields.
I recommend going through https://wiki.apache.org/solr/SchemaXml to see the various field options.
Usually, the important fields are copied into a single field that Solr searches by default, so I suggest you use copyField in the same way.
Example:
<defaultSearchField>SEARCHINDEX</defaultSearchField>
<copyField source="AUTHOR" dest="SEARCHINDEX"/>
<copyField source="coursetitle" dest="SEARCHINDEX"/>
<copyField source="coursetitlesearch" dest="SEARCHINDEX"/>
<copyField source="SUBTITLE" dest="SEARCHINDEX"/>
Now you can use the SEARCHINDEX field to search the content of all the other fields.
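The copyField lines above assume that the destination field is already declared in the schema. A minimal sketch of such a declaration, assuming the text_general type used in the question's schema; multiValued is set because several source fields feed the same destination, and stored="false" is only a space-saving choice, not a requirement:

```xml
<!-- Catch-all destination for the copyField directives (type and attributes are assumptions) -->
<field name="SEARCHINDEX" type="text_general" indexed="true" stored="false" multiValued="true" />
```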
Since defaultSearchField is deprecated, note that your request handler in solrconfig.xml defines "df", which takes precedence:
<initParams path="/update/**,/query,/select,/tvrh,/elevate,/spell,/browse">
<lst name="defaults">
<str name="df">text</str>
</lst>
</initParams>
After doing a little research, it looks like with edismax we can indeed pass a space-separated list of fields to search by default. The parameter for this is qf (query fields); df itself names a single field. E.g.:
qf=courseid coursetitle course-author
This way, we do not need to use copyField!
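With edismax, the documented parameter for listing several fields to search is qf (query fields). A sketch of setting it as a handler default, assuming a /select handler and the field names from the question; the handler name and the choice to make edismax the default parser are assumptions, not something taken from the original configuration:

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Use the extended DisMax parser for queries without an explicit defType -->
    <str name="defType">edismax</str>
    <!-- Space-separated list of fields searched when the query names no field -->
    <str name="qf">courseid coursetitle course-author</str>
  </lst>
</requestHandler>
```

With this in place, a plain q=biztalk would be matched against each of the listed fields.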
I am trying to implement scoped autosuggestions like those on e-commerce websites such as Amazon.
For example, if I type Lego, the suggestions should look like:
Legolas in Names
Lego in Toys
where Names and Toys are Solr field names.
The closest help I found is this discussion:
solr autocomplete with scope is it possible?
It informed me that this isn't possible with the suggester I am currently using.
So far, using the suggester, I am able to get autosuggestions from a single Solr field (the autosuggest field, following the guidelines in the suggester documentation).
Any ideas or links that could help?
Update
I tried to achieve autosuggestions using facets. My query looks something like:
http://localhost:8983/solr/core1/select?q=*%3A*&rows=0&wt=json&indent=true&facet=true&facet.field=field1&facet.field=field2&facet.prefix=i
This gives me all the facet results starting with letter 'i' and term faceted to field1 and field2.
This gave me the idea.
Any comments?
I am assuming you are storing the Names or Toys data in a field; let's call it category.
You can configure the payloadField parameter in the searchComponent definition and pass the category data into it. Later, in the application, when you receive the suggestion results from Solr, show the first suggestion from each category, or use whichever strategy suits your use case better.
You can find more information in the Solr Suggester documentation.
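A rough sketch of such a suggester definition follows. payloadField is an option of DocumentDictionaryFactory; the suggester name, the source field autosuggest, the category field, and the text_general analyzer type are all assumptions made for illustration. The payload field needs to be stored for its value to come back with each suggestion.

```xml
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <!-- Hypothetical suggester name -->
    <str name="name">scopedSuggester</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <!-- Field the suggestions are read from (assumed name) -->
    <str name="field">autosuggest</str>
    <!-- Stored field returned alongside each suggestion, e.g. "Names" or "Toys" -->
    <str name="payloadField">category</str>
    <str name="suggestAnalyzerFieldType">text_general</str>
  </lst>
</searchComponent>
```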
The Suggester component seems useful, but the payload can only carry a single field, which may not satisfy many use cases.
With facet prefixing, you cannot get suggestions from a word in the middle. So "Lego" will suggest a product whose name field is "Legolas Sample" but not "Sample Legolas".
A third way to implement autosuggest is to use an index analyzer that includes an EdgeNGramFilterFactory and then search on the required prefix.
So, the Solr schema will look like:
<field name="names" type="string" multiValued="false" indexed="true" stored="true"/>
<field name="toys" type="string" multiValued="false" indexed="true" stored="true"/>
<field name="names_ngram" type="text_suggest_ngram" multiValued="false" indexed="true" stored="false"/>
<field name="toys_ngram" type="text_suggest_ngram" multiValued="false" indexed="true" stored="false"/>
and the field type would have a definition of
<fieldType name="text_suggest_ngram" class="solr.TextField" positionIncrementGap="100" multiValued="true">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EdgeNGramFilterFactory" maxGramSize="10" minGramSize="2"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
and these _ngram fields would be populated with copyField:
<copyField source="names" dest="names_ngram"/>
<copyField source="toys" dest="toys_ngram"/>
So, once you have reindexed your data, a query for "Lego" will return results for both "Sample Legolas" and "Legolas Sample". However, if you have to categorize these results by which of the n fields they matched, that takes n separate queries, which is usually not a problem.
You can add multiple suggester components.
Add one for each field.
E.g. :
<searchComponent name="suggest" class="solr.SuggestComponent">
<lst name="suggester">
<str name="name">namesSuggester</str>
<str name="lookupImpl">BlendedInfixLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">Names</str>
<str name="weightField">Popularity</str>
<str name="indexPath">namesSuggesterIndexDir</str>
<str name="suggestAnalyzerFieldType">suggester</str>
</lst>
<lst name="suggester">
<str name="name">toysSuggester</str>
<str name="lookupImpl">BlendedInfixLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">Toys</str>
<str name="weightField">Popularity</str>
<str name="indexPath">toysSuggesterIndexDir</str>
<str name="suggestAnalyzerFieldType">suggester</str>
</lst>
</searchComponent>
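To query both suggesters in one request, the handler can name each dictionary through the suggest.dictionary parameter. A sketch under the assumption that the handler is exposed at /suggest; the same parameter can instead be passed on the request URL, repeated once per suggester:

```xml
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <!-- One entry per suggester defined in the searchComponent above -->
    <str name="suggest.dictionary">namesSuggester</str>
    <str name="suggest.dictionary">toysSuggester</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>
```

The response then carries a separate block of suggestions per dictionary, which the application can label "in Names" and "in Toys".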
I'm trying to allow a global search across all fields defined in my solr schema.xml. I have the following field:
<field name="catchall"
type="text_en_splitting"
stored="true"
indexed="true"
multiValued="true" />
Then, I have:
<copyField source="*" dest="catchall"/>
<defaultSearchField>catchall</defaultSearchField>
However, when I search without specifying a field, it only searches this field:
<field name="text" type="text_en_splitting" multiValued="false"/>
Is my configuration missing something to search across all fields? Here's an example of the field that is not being included in the default search:
<field name="summary" type="text_en_splitting" indexed="true" stored="true" multiValued="true"/>
I think that I figured out the issue. Apparently with Solr 3.6.1, the default search field is specified in solrconfig.xml rather than in the schema.xml. In solrconfig.xml, I changed the element value from text to catchall.
<requestHandler name="/select" class="solr.SearchHandler">
<lst name="defaults">
<str name="echoParams">explicit</str>
<int name="rows">10</int>
<str name="df">catchall</str>
</lst>
</requestHandler>
I assign a custom "popularity" score for each document in my Solr database. I want search results to be ordered by this custom "score" field rather than the built-in relevancy score that is the default.
First I define my score field:
<fieldType name="sint" class="solr.SortableIntField" sortMissingLast="true" omitNorms="true"/>
<field name="score" type="sint" stored="true" multiValued="false" />
Then I rebuild the index, inserting a score for each document.
To run a query, I use something like this:
(text:hello)+_val_:"score"
Now I would expect the documents to come back sorted by the "score" field, but what I get instead is:
<doc>
<int name="score">566</int>
<str name="text">SF - You lost me at hello...</str>
</doc>
<doc>
<int name="score">41</int>
<str name="text">hello</str>
</doc>
<doc>
<int name="score">77</int>
<str name="text">
CAGE PAGE-SAY HELLO (MIKE GOLDEN's Life Is Bass Remix)-VIM
</str>
</doc>
<doc>
<int name="score">0</int>
<str name="text">Hello Hello Hello</str>
</doc>
Notice that the scores come back out of order: 566, 41, 77, 0. The weird thing is that it only sorts this way with certain queries. I'm not sure what the pattern is, but so far I've only seen the bad sorting when scores of "0" come back in the search results.
I've tried IntField instead of SortableIntField, and I've tried putting "sort=score desc" as a query parameter, with no change in behavior.
Am I doing something wrong, or just misunderstanding the meaning of using _val_:"score" in my query?
EDIT: I tried renaming the "score" field to "popularity" and got the same result.
The score field is used by Solr internally, so it's probably not good practice to define a field with the same name.
Try defining the field under a different name; both of the options you mentioned should then work fine.
Edit - This is what I have, and it works fine (Solr 3.3):
Schema -
Field Type -
<fieldType name="sint" class="solr.SortableIntField" sortMissingLast="true" omitNorms="true"/>
Field -
<field name="popularity" type="int" indexed="true" stored="true" />
Data -
<add>
<doc>
<field name="id">1007WFP</field>
<field name="popularity">566</field>
<field name="text">SF - You lost me at hello...</field>
</doc>
<doc>
<field name="id">2007WFP</field>
<field name="popularity">41</field>
<field name="text">hello</field>
</doc>
<doc>
<field name="id">3007WFP</field>
<field name="popularity">77</field>
<field name="text">
CAGE PAGE-SAY HELLO (MIKE GOLDEN's Life Is Bass Remix)-VIM
</field>
</doc>
<doc>
<field name="id">4007WFP</field>
<field name="popularity">0</field>
<field name="text">Hello Hello Hello</field>
</doc>
</add>
Query -
http://localhost:8983/solr/select?q=*:*&sort=popularity%20desc
Results :-
<result name="response" numFound="4" start="0">
<doc>
<str name="id">1007WFP</str>
<int name="popularity">566</int>
</doc>
<doc>
<str name="id">3007WFP</str>
<int name="popularity">77</int>
</doc>
<doc>
<str name="id">2007WFP</str>
<int name="popularity">41</int>
</doc>
<doc>
<str name="id">4007WFP</str>
<int name="popularity">0</int>
</doc>
</result>
The _val_ hack actually ADDS the "popularity" field to the normally computed score of solr.
So, if you have popularity=41 on document A and popularity=77 on document B, but document A scores more than 36 points better than B for the keyword "hello", then they'll get sorted with A before B.
Use the "sort" parameter (as you did), which completely overrides the normal sorting by score.
An alternative is to use a filter query (parameter fq instead of q), which filters matching documents without computing any score, and then use _val_ to define your scoring formula. Since with filter queries all retrieved documents will have a score of zero, _val_ would be unaffected and behave as you originally expected.
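If every query through a handler should be ordered by popularity, the sort can also be set as a handler default rather than on each request. A sketch, assuming the field has been renamed to popularity (as in the question's edit) and that queries go through a /select handler:

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Order results by the custom field instead of the relevancy score -->
    <str name="sort">popularity desc</str>
  </lst>
</requestHandler>
```

A sort parameter passed on an individual request still overrides this default.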
I want to configure my Solr search engine so I get an exact match for the search term I enter.
eg. 'taxes' should return documents with 'taxes' and not 'tax', 'taxation' etc.
Any help or tips would be appreciated.
I presume your field is a TextField. Text fields are analyzed at index and query time (tokenized and, depending on the field type, stemmed), so matches are not exact. If you set up the field as a string field, with no tokenizer, you'll get an exact match.
You can even combine the exact search with a fuzzy search and use DisMax to boost the relative weights.
Example (schema.xml) :
<field name="name" type="text_general" indexed="true" stored="false" required="true" />
<field name="nameString" type="string" indexed="true" stored="false" required="true" />
<copyField source="name" dest="nameString"/>
Example (solrconfig.xml) :
<requestHandler name="accounts" class="solr.SearchHandler">
<lst name="defaults">
<str name="defType">dismax</str>
<str name="qf">
nameString^10.0 name^5.0 description^1.0
</str>
<str name="tie">0.1</str>
</lst>
</requestHandler>
To turn off stemming in your schema.xml, you can define text field like this:
<types>
<!-- other fields definition -->
<fieldType name="text_no_stem" class="solr.TextField" omitNorms="false">
<analyzer>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StandardFilterFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
<!-- other fields definition -->
</types>
<fields>
<!-- other fields definition -->
<dynamicField name="*_nostem" type="text_no_stem" indexed="true" stored="true"/>
<!-- other fields definition -->
</fields>
I'm using sunspot to integrate solr with Ruby on Rails. With this in the schema.xml I define my searchable block like this:
searchable do
text(:wants, as: :wants_nostem)
end
Turn off stemming.
Use quotes for an exact-match result.
Example:
Core name: core1
Key: namestring
http://localhost:8983/solr/core1/select?q=namestring:"taxes"&wt=json&indent=true
Use the Solr string field, which does an exact value search, e.g.:
<fieldType class="solr.StrField" name="string" omitNorms="true" sortMissingLast="true" />