Search in all fields for multiple values?

I have two fields:
title
body
and I want to search for two words,
dog
OR
cat
in each of them.
I have tried q=*:dog OR cat
but it doesn't work.
How should I write the query?
PS. Could I set the default search field to ALL fields in schema.xml in some way?

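As Mauricio noted, using a copyField (see http://wiki.apache.org/solr/SchemaXml#Copy_Fields) is one way to search across multiple fields without naming each of them in the query string. You define a destination field and then declare which fields get copied into it. The destination should use a tokenized text type rather than string, so that individual words match (text_general below is the type name from Solr's example schema; use whichever text type your schema defines), and it must be multiValued because more than one source field is copied into it.
<field name="mysearchfield" type="text_general" indexed="true" stored="false" multiValued="true"/>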
...
<copyField source="title" dest="mysearchfield"/>
<copyField source="body" dest="mysearchfield"/>
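Once you've done that, you can search like this:
q=mysearchfield:dog OR mysearchfield:cat
The two clauses can be grouped so that the field name is written only once:
q=mysearchfield:(dog OR cat)
If the default operator is OR (the Solr default), the explicit OR can be dropped: q=mysearchfield:(dog cat). Note that q=mysearchfield:dog cat without the parentheses looks for cat in the default field, not in mysearchfield.
If "mysearchfield" is going to be your standard search field, you can simplify things further by making it the default search field in the schema:
<defaultSearchField>mysearchfield</defaultSearchField>
(This element is deprecated in later Solr releases, where the df request parameter does the same job.)
After that, the query becomes:
q=dog cat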
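When the query is sent over HTTP, the spaces in it have to be URL-encoded. Here is a minimal Python sketch of building such a request URL. It assumes a hypothetical endpoint (http://localhost:8983/solr/mycore/select is a placeholder, not taken from the question), assumes the search words are plain words with no query-syntax characters that need escaping, and passes the copy field through the df request parameter instead of relying on the schema default.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; replace host, port and core name with your own.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycore/select"


def build_select_url(words, default_field="mysearchfield"):
    """Build a select URL that ORs plain words against one default field."""
    params = {
        "q": " OR ".join(words),  # e.g. "dog OR cat"
        "df": default_field,      # field used for terms without a field prefix
        "wt": "json",
    }
    # urlencode escapes spaces and other reserved characters for us.
    return SOLR_SELECT_URL + "?" + urlencode(params)


url = build_select_url(["dog", "cat"])
print(url)
```

Letting urlencode do the escaping avoids hand-building the query string, which is where missing encodings tend to creep in.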

Related

How can I sort a multivalued string field with docValues in Apache Solr 6?

We are trying to sort on a multivalued field which is defined like this:
<fieldType class="org.apache.solr.schema.StrField" name="StrField"/>
<field docValues="true" indexed="true" multiValued="true" name="fieldName" type="StrField"/>
The Solr version we use is 6.0.1.1.2338
When trying to sort on the field "fieldName", we get the following error:
"msg": "can not sort on multivalued field: fieldName",
Then we tried to use field functions to enable the sorting:
&sort=field(fieldName,min)+asc
Or like:
&fl=fieldName_alias:field(fieldName,min)&sort=fieldName_alias+asc
Then we are getting the following error:
"Selecting a single value from a multivalued field is not supported for this field: fieldName(type: StrField)"
After more research, we realised that the field() function only works for numeric types. Now we are not sure how to implement the sorting. I hope it is possible to do this without reindexing.
Any help would be very much appreciated! Thank you very much in advance!

Cannot search on email id field using solr query with wildcard

I have an email ID field in my table on which Solr search with wildcards is enabled.
For an email such as abc.xyz#pqr.com:
When I search for abc.xyz* I get results, and when I search for pqr.com* I get results, but when I search for abc.xyz#pqr.com* I don't get any results.
Below is the XML configuration of the field:
<field indexed="true" multiValued="false"
name="user_email_id" stored="true" type="TextField"/>
Below is the generated query:
SELECT * FROM example WHERE
solr_query='{"q":"user_email_id:Shubha.Sao#techdata.com*","start":0}' LIMIT 50;
The problem is that your email is split into tokens: instead of the full email you most probably get two tokens, Shubha.Sao and techdata.com. You can check how the text is split by your current tokenizer on the Analysis screen of the Solr admin UI.
Instead of TextField with its default StandardAnalyzer, either use StrField or customize the analyzer so that the email is not tokenized. For example, KeywordTokenizer leaves the email intact while still letting you apply additional filters such as LowerCaseFilter. Alternatively, you can use UAX29URLEmailTokenizer.
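To see why the full-address wildcard finds nothing, it can help to imitate the two tokenization styles outside Solr. The Python sketch below is only a rough illustration: standard_like_tokens is a hypothetical regex approximation, not Solr's actual StandardTokenizer, and prefix_matches simply treats a trailing-wildcard query as "some indexed token starts with this prefix".

```python
import re


def standard_like_tokens(text):
    """Rough approximation (NOT Solr's real tokenizer): keep runs of letters
    and digits joined by single dots, and split on anything else, e.g. '#'."""
    return re.findall(r"[A-Za-z0-9]+(?:\.[A-Za-z0-9]+)*", text)


def keyword_tokens(text):
    """Imitates a keyword-style tokenizer: the whole value is one token."""
    return [text]


def prefix_matches(tokens, prefix):
    """A query 'prefix*' matches if any indexed token starts with the prefix."""
    return any(token.startswith(prefix) for token in tokens)


email = "abc.xyz#pqr.com"
split_tokens = standard_like_tokens(email)
whole_token = keyword_tokens(email)

print(split_tokens)
print(prefix_matches(split_tokens, "abc.xyz"))
print(prefix_matches(split_tokens, "abc.xyz#pqr.com"))
print(prefix_matches(whole_token, "abc.xyz#pqr.com"))
```

With the split tokens, no single token begins with the whole address, so the prefix has nothing to match against. With one intact token it does.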

Solr search for matching fields

We are trying to use Solr to search our document contents. However, I want to be able to find documents in which two fields match each other. I have looked but cannot find anything on self-referential or inner joins.
So for example:
<doc>
<field name="id">12345</field>
<field name="author">Smith</field>
<field name="last_edit">Smith</field>
...
</doc>
Obviously (author:Smith AND last_edit:Smith) would work, but I would like to be able to search for all documents where author and last_edit are the same, not necessarily a fixed value. Defining a new field is fine.

solr index for multi-valued multi-type field

I am indexing a collection of XML documents with the following structure:
<mydoc>
<id>1234</id>
<name>Some Name</name>
<experiences>
<experience years="10" type="Java"/>
<experience years="4" type="Hadoop"/>
<experience years="1" type="Hbase"/>
</experiences>
</mydoc>
Is there any way to create a Solr index that supports the following query:
find all docs with experience type "Hadoop" and years>=3
So far my best idea is to put delimited years||type into multiValued string field, search for all docs with type "Hadoop" and after that iterate through the results to select years>=3. Obviously this is very inefficient for a large set of docs.
I think there is no obvious solution for indexing data that comes from a many-to-many relationship. In this case I would go with dynamic fields: http://wiki.apache.org/solr/SchemaXml#Dynamic_fields
Field definition in schema.xml:
<dynamicField name="experience_*" type="integer" indexed="true" stored="true"/>
So, using your example you would end up with something like this:
<mydoc>
<id>1234</id>
<name>Some Name</name>
<experience_Java>10</experience_Java>
<experience_Hadoop>4</experience_Hadoop>
<experience_Hbase>1</experience_Hbase>
</mydoc>
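Then you can use a filter query such as the following, here for at least 3 years of Hadoop experience (the range keyword TO must be uppercase): fq=experience_Hadoop:[3 TO *]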
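The reshaping from nested experience elements to one dynamic field per type can be done before indexing. A minimal Python sketch follows, assuming the source XML looks exactly like the question's example and that every type value is already safe to use in a field name (no spaces or special characters). The function name to_solr_fields is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Source document in the shape shown in the question.
SOURCE_XML = """
<mydoc>
  <id>1234</id>
  <name>Some Name</name>
  <experiences>
    <experience years="10" type="Java"/>
    <experience years="4" type="Hadoop"/>
    <experience years="1" type="Hbase"/>
  </experiences>
</mydoc>
"""


def to_solr_fields(xml_text):
    """Flatten each <experience> into an experience_<type> integer field."""
    root = ET.fromstring(xml_text.strip())
    fields = {"id": root.findtext("id"), "name": root.findtext("name")}
    for exp in root.iter("experience"):
        # Assumes 'type' is a valid field-name suffix and 'years' is an integer.
        fields["experience_" + exp.get("type")] = int(exp.get("years"))
    return fields


doc = to_solr_fields(SOURCE_XML)
print(doc)
```

The resulting dictionary has one key per field to index, with each experience_* key matched by the dynamicField pattern.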

Solr query based on a string field's subset

I'd like to send a string to Solr and have it return all records that are a subset of that string.
The string I would send contains integers separated by spaces. I want Solr to give me all records where a specific string field is a subset of the numbers I provide in the request string.
An example...
Imagine I have a string field indexed in Solr that is in reality a set of integers separated by spaces. For example, let's say I have the following field values indexed in Solr:
"888110"
"888110 888120"
"888110 888120 888130"
"888110 888120 888130 888140"
"888110 888130 888140"
"888110 888140"
"888140"
"888120 888130"
I want Solr to receive a query with, for example, "888110 888140" and reply with the following records:
"888110"
"888110 888140"
"888140"
If I query by "888110 888120 888130" the retrieved records would be...
"888110"
"888110 888120"
"888110 888120 888130"
"888120 888130"
The retrieved records must be exactly a subset of the numbers provided as a string.
Is it possible to make Solr behave like this?
I'm a bit confused why in the first example "888110" is not returned, but it is in the second example.
Anyway, if I understand what you are trying to do in general, I would add a new multivalued field and use the boolean operators (AND, OR) in the query.
For example, in the schema:
<field name="code_string" ... />
<field name="codes" ... multiValued="true"/>
so you have a document like
<doc>
<arr name="codes">
<str>811001</str>
<str>811002</str>
</arr>
</doc>
and in your query
?q=codes:811001 OR codes:811002 OR ...
In my experience with Solr it is generally cleaner and more maintainable to sacrifice a little memory than to create debilitatingly complex chains of filters, etc.
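An OR query over a multivalued codes field returns every record that shares at least one code with the request, including records that also carry other codes. For the strict subset behaviour asked for, the results still need a filtering step. The Python sketch below builds the OR query and does that filtering on the client side. The helper names and the client-side filtering are assumptions for illustration, not something the answer prescribes.

```python
def build_or_query(field, codes):
    """Build 'field:a OR field:b ...' for the given codes."""
    return " OR ".join(f"{field}:{code}" for code in codes)


def keep_subsets(records, requested):
    """Keep only records whose codes are all contained in the requested codes."""
    wanted = set(requested)
    return [rec for rec in records if set(rec) <= wanted]


requested = ["888110", "888140"]
query = build_or_query("codes", requested)

# Some records an OR query could return for these codes (from the question's data).
candidates = [
    ["888110"],
    ["888110", "888120"],
    ["888110", "888140"],
    ["888140"],
    ["888110", "888130", "888140"],
]
subsets = keep_subsets(candidates, requested)

print(query)
print(subsets)
```

Filtering after the query keeps the Solr side simple, at the cost of transferring candidate records that are later discarded.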
