SOLR 6.6 eDisMax query not respecting mm parameter - search

I am using Solr 6.6.2. Until now we have been using the DisMax query parser together with the mm parameter, and it works just as expected. A small example:
defType=dismax&q=samsung+iphone&qf=name+brand&mm=1 returns the expected result set, containing both iPhones and Samsung products. However, when I do exactly the same thing and only change defType to defType=edismax (keeping mm=1), no results are returned. I have read the documentation of the eDisMax query parser in the Apache Solr reference guide, and it clearly says that eDisMax is an extension of DisMax, so I expect mm to behave the same in both. Further down the same page, the documentation also gives an example of the mm parameter that should work the way I expect.
Is this a bug, or am I missing something very obvious? I would love some help here.
EDIT: Adding the Solr params that are sent along with the request
eDisMax
{
"responseHeader":{
"zkConnected":true,
"status":0,
"QTime":0,
"params":{
"mm":"1",
"q":"samsung iphone",
"defType":"edismax",
"indent":"on",
"fl":"name, category_names, score",
"fq":"channel:outbound",
"wt":"json",
"_":"1515855895772"}},
"response":{"numFound":0,"start":0,"maxScore":0,"docs":[]
}}
DisMax
{
"responseHeader":{
"zkConnected":true,
"status":0,
"QTime":22,
"params":{
"mm":"1",
"q":"samsung iphone",
"defType":"dismax",
"indent":"on",
"fl":"name, category_names, score",
"fq":"channel:outbound",
"wt":"json",
"_":"1515855895772"}},
"response":{"numFound":2147,"start":0,"maxScore":12.172616,"docs":[
{
"name":"Apple iPhone 5s",
"score":12.172616},...]
}}

I'm not sure whether anyone else has faced this issue, but I feel it is not well documented.
Our Solr q.op defaults to AND, so a search term like "foo bar cat" was searched as foo AND bar AND cat. With mm=2 and defType=dismax the query was loosely parsed to (foo AND bar AND cat)~2. So far so good.
When I change to defType=edismax, the same query "foo bar cat" with mm=2 is parsed as (foo AND bar AND cat). Note the missing ~2.
Now if I go ahead and add q.op=OR, interesting things start to happen. The parsed query now looks like (foo bar cat)~2, which is exactly what dismax does. You can read (foo bar cat)~2 as "find me documents that match at least 2 of these terms". If I remove the mm parameter, the parsed query looks like (foo bar cat), and since q.op=OR it returns any document that contains any of the 3 terms.
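As a rough illustration of the workaround described above, here is a minimal Python sketch that builds an eDisMax request with q.op=OR set explicitly next to mm. The host, the collection name `products` and the helper `build_edismax_url` are assumptions made up for this example; qf=name+brand is taken from the question, and debugQuery=true is added only so that the parsed query can be inspected in the response.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical Solr endpoint; replace with your own host and collection.
BASE_URL = "http://localhost:8983/solr/products/select"


def build_edismax_url(terms, mm, base_url=BASE_URL):
    """Build an eDisMax request URL that sets q.op=OR explicitly alongside mm."""
    params = {
        "defType": "edismax",
        "q": " ".join(terms),
        "qf": "name brand",
        "q.op": "OR",          # override a default operator of AND
        "mm": str(mm),
        "debugQuery": "true",  # include the parsed query in the response
    }
    return base_url + "?" + urlencode(params)


url = build_edismax_url(["samsung", "iphone"], mm=1)
print(url)
```

With debugQuery enabled, the parsedquery entry of the response shows whether the minimum-should-match suffix (such as ~2) was applied.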
Either way, it doesn't really matter, since the same behavior can be achieved. It's just confusing that the behavior of DisMax and eDisMax is somewhat inconsistent.
Hope this helps someone else.

Related

Kibana and groovy scripting

I was looking for a way to calculate a ratio in Kibana. After a lot of research I found this approach:
using the "JSON Input" feature in a visualisation.
I have all my information in one index, with 2 types of documents (boots and reboots).
I am looking for a script that counts the number of documents of type boots, does the same for the reboots type, and then divides the second count by the first.
It sounds really easy, but I have not found a way to do it after my research, and I am not yet comfortable enough with Groovy to write it myself.
I found many ways to manipulate document values (doc['mydocname'].values etc.), but nothing about the type.
Thanks in advance.
EDIT: I tried this
{
"aggs" : {
"boots_count" : { "value_count" : { "_type" : "boots" } }
}
}
This is supposed to count the occurrences of a field (here the field _type) in the index. But when I put it into "JSON Input" in a visualisation, it results in an error:
Error: Request to Elasticsearch failed: {"error":"SearchPhaseExecutionException[Failed to execute phase [query], all shards failed; shardFailures {[BbXJ0O6tRxa_OcyBfYCGJQ][informationbe][0]: SearchParseException[[informationbe][0]: from[-1],size[0]: Parse Failure [Failed to parse source [{\"size\":0,\"aggs\":{\"2\":{\"terms\":{\"field\":\"#sitePoste\",\"size\":5,\"order\":{\"1\":\"desc\"}},\"aggs\":{\"1\":{\"avg\":{\"script\":\"0\",\"lang\":\"expression\",\"ratio\":{\"boots_count\":{\"value_count\":{\"_type\":\"boots\"}}}}}}}}
I am doing something wrong. But where?
EDIT2: On the other hand, I am trying scripted fields, with something like this using a Lucene expression:
doc['_type:boots'].count / doc['_type:reboots'].count
but it doesn't work either. I am fairly confident about the doc['_type:boots'] part; I guess the problem is in the .count part.
After many attempts, I understand better and better how it works. The scope of a scripted field is the single document, not the whole index, so I can't count values across the whole index from within one document.
I am looking for a workaround; I'll post it if I find something interesting.
I finally solved my problem:
I added a scripted field that equals 1 if the type of the document is boots, and 0 otherwise. Then I created a search containing only boots and reboots documents (filter _type:boots _type:reboots) and calculated the average of the scripted field in a metric.
Everything works well!
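To see what the averaged scripted field actually measures, here is a small Python sketch with invented counts (8 boots and 2 reboots documents, purely for illustration; `boots_indicator` is a hypothetical stand-in for the scripted field). The average of the 0/1 field is the share of boots documents among all boots and reboots documents, so the reboots/boots ratio from the original question would have to be derived from that share p as (1 - p) / p.

```python
from fractions import Fraction


def boots_indicator(doc_type):
    """Mimic the scripted field: 1 for a boots document, 0 otherwise."""
    return 1 if doc_type == "boots" else 0


# Made-up sample: 8 boots documents and 2 reboots documents.
doc_types = ["boots"] * 8 + ["reboots"] * 2

indicators = [boots_indicator(t) for t in doc_types]

# Average of the indicator = boots / (boots + reboots).
boots_share = Fraction(sum(indicators), len(indicators))

# Ratio asked for in the question: reboots / boots.
reboots_per_boot = (1 - boots_share) / boots_share

print(boots_share, reboots_per_boot)  # prints "4/5 1/4"
```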

couchdb futon document editor - can I customize the document validation part?

A VERY nice-to-have would be if I could edit object literals in this editor's text field instead of JSON expressions.
If I could replace the JSON parse with a simple eval, it would make editing sooooo much easier! (and help me design document structures for my projects soooo much more easily)
I mean, gosh!! It's not a protocol school, it's an editing tool.
The goal of the tool is not to teach me the protocol and comment on every petty mistake, but to help me design documents for the software.
Why must it insist on strict JSON? Can't it live with object literals, and do this for us:
JSON.stringify( eval('(' + editor_textarea.value + ')') )
Wouldn't that be cool? LOL :D
(yea yea, catching errors and feeding back to the user)
(and for whoever missed the difference: it is mainly in the quote marks around attribute names.
Strict, dry JSON requires quote marks ALWAYS, no questions asked, whereas a JS object literal requires quote marks only for attribute names that are not legal JS identifiers, and also accepts numbers without quote marks)
Strict dry JSON:
{ "attribute" : "value"
, "mapmap" :
{ "map" :
{ "attr" : "sdss"
, "123" : "ss32332"
, "val" : 23323
, "456" : "ss32332"
}
}
}
Object Literal
{ attribute: "value"
, mapmap :
{ map :
{ attr : "sdss"
, 123 : "ss32332"
, val : 23323
, 456 : "ss32332"
}
}
}
Well, it won't fix my missing commas or mismatched brackets, but it does make life easier, since quote marks are a big part of the scaffolding.
If you can point me to where I can change this, even as a patch to Futon, I'll be soooOOO grateful :)
Maybe later we can integrate an editor helper there, such as the cool one in the GitHub source editor or the one in jsfiddle, that helps you indent and color things nicely.
But let's start with a simple eval.
It will make life easier... :)
It can also let me generate complicated documents using JS code without any additional test software...
Happy coding :)
P.S
If you know the answer here - you might know the answer to this question:
couchdb futon document editor - can I customize the indentation rules?
I had a quick browse, and I believe this is where you will want to add your eval:
https://github.com/apache/couchdb/blob/master/share/www/script/futon.browse.js#L911
and here:
https://github.com/apache/couchdb/blob/master/share/www/script/futon.browse.js#L902
You can edit share/www/script/futon.browse.js in your local CouchDB instance if you want to see live changes.

Need to use Solr dismax handler but i have no q parameter???

Hi,
I am trying to make a Solr query using the dismax handler, but I have no q parameter because I have to match directly on fields.
hl.fragsize=200&mm=1&facet=on&facet.mincount=1&qf=text+&wt=json&hl=true&rows=50&fl=*+score&start=0&q=*:*&fq=jSFunT:("Fresher"+OR+"Developer+/+Programmer+/+Coder")&fq=jNMinEx:[2+TO+*]&fq=jNMaxEx:[2+TO+5]&fq=jNMinSal:[-1+TO+*]&fq=jNMaxSal:[-1+TO+-1]&bq=jSFunT:("Developer+/+Programmer+/+Coder")^1&bq=jSkill:(HTML)^2&bq=jCID:(41449)^8&bq=jJT:(Developer+)^8&bq=jLoc:(Mumbai-Thane+)^4&bq=jINDT:("IT(Software,+Dotcom,+Infra.Mgmt.%26+UI+Design)")^1
Or, to make it easier to read, one parameter per line:
&mm=1
&qf=text
&wt=json
&hl=true
&rows=50
&fl=*+score
&start=0
&q=*:*
&fq=jSFunT:("Fresher"+OR+"Developer+/+Programmer+/+Coder")
&fq=jNMinEx:[2+TO+*]
&fq=jNMaxEx:[2+TO+5]
&fq=jNMinSal:[-1+TO+*]
&fq=jNMaxSal:[-1+TO+-1]
&bq=jSFunT:("Developer+/+Programmer+/+Coder")^1
&bq=jSkill:(HTML)^2
&bq=jCID:(41449)^8
&bq=jJT:(Java Developer)^8
&bq=jLoc:(Mumbai-Thane)^4
&bq=jINDT:("IT(Software,+Dotcom,+Infra.Mgmt.%26+UI+Design)")^1
Here none of the "bq" parameters take effect because qt=dismax is not supplied; if I do supply it, the whole query fails.
Can anyone help me out? I would be very thankful for the kindness.
Have a look at the q.alt parameter, which lets you specify a fallback query:
q.alt=*:*
If you replace your q parameter with that one, dismax should play just fine.
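A minimal Python sketch of that suggestion, assuming the request is assembled on the client side: q is dropped, q.alt=*:* takes its place, and the fq and bq clauses are passed as repeated parameters. The helper `build_dismax_params` is hypothetical, only a few of the filters and boosts from the question are shown, and defType=dismax is used here instead of qt=dismax, which is an assumption about the setup.

```python
from urllib.parse import urlencode, parse_qsl


def build_dismax_params(filters, boosts):
    """Return dismax request parameters that rely on q.alt instead of q."""
    params = [
        ("defType", "dismax"),  # assumption: parser selected via defType
        ("q.alt", "*:*"),       # fallback query used when q is absent
        ("qf", "text"),
        ("mm", "1"),
        ("wt", "json"),
    ]
    params += [("fq", f) for f in filters]  # repeated filter queries
    params += [("bq", b) for b in boosts]   # repeated boost queries
    return params


params = build_dismax_params(
    filters=["jNMinEx:[2 TO *]", "jNMaxEx:[2 TO 5]"],
    boosts=["jSkill:(HTML)^2", "jCID:(41449)^8"],
)
query_string = urlencode(params)
print(query_string)
```

Passing a list of pairs rather than a dict keeps the repeated fq and bq parameters intact when the query string is encoded.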

search with startkey, endkey and array keys

I have a view which returns several elements with array keys.
Example:
{"total_rows":4,"offset":0,"rows":[
{"id":"","key":[15,"2"],"value":1,"doc":{}},
{"id":"","key":[20,"2"],"value":1,"doc":{}},
{"id":"","key":[20,"3"],"value":1,"doc":{}},
{"id":"","key":[20,"4"],"value":1,"doc":{}}
]}
I'm trying to search through those elements. So if I do the following request :
/database/_design/element/_view/all/?
startkey=[15, "2"]&
endkey=[20, "3"]&
include_docs=true&reduce=false
Live example : http://jchris.couchone.com/keyhuh/_design/Record/_view/by_CreationDate_and_BoreholeName?startkey=[1267686720,%22sp4%22]&endkey=[1267686725,%22sp4\u9999%22]&include_docs=true&reduce=false
This one doesn't work. It returns all the records, even the last one, which doesn't match the second element of the array.
Strangely enough, it works with strings only.
Example:
{"total_rows":4,"offset":0,"rows":[
{"id":"","key":["15","2"],"value":1,"doc":{}},
{"id":"","key":["20","2"],"value":1,"doc":{}},
{"id":"","key":["20","3"],"value":1,"doc":{}},
{"id":"","key":["20","4"],"value":1,"doc":{}}
]}
if I do the following request :
/database/_design/element/_view/all/?
startkey=["15", "2"]&
endkey=["20", "3"]&
include_docs=true&
reduce=false
Live Example : http://jchris.couchone.com/keyhuh/_design/Record/_view/by_Client_and_BoreholeName?startkey=[%22Test1%22,%22sp4%22]&endkey=[%22Test1%22,%22sp4\u9999%22]&include_docs=true&reduce=false
Here it works well and only returns the first three elements.
Am I missing something about how CouchDB searches array keys with integers and strings? Or have I stumbled upon a bug?
Note: it behaves the same with CouchDB 0.10 and 0.11.
This looks wrong, and there are a few things it could be. Is it possible for you to share your code with us? If the data isn't proprietary you could replicate your db to http://jchris.couchone.com/keyhuh and I'll take a look at the whole thing there.
...
Thanks for posting the live data. This is the query that is busted?
http://jchris.couchone.com/keyhuh/_design/Record/_view/by_Client_and_BoreholeName?startkey=[%22Test1%22,%22sp4%22]&endkey=[%22Test1%22,%22sp4\u9999%22]&reduce=false
Because that looks fine to me. What am I missing?
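For reference, a startkey/endkey range over array keys compares whole keys element by element; it does not filter each array element independently. The Python sketch below approximates that ordering for numbers, strings and arrays only. It is a simplification: CouchDB collates strings with ICU rather than by code point, and `collate_key` and `in_range` are hypothetical helper names. It may explain why a row whose first element lies strictly between the two bounds is returned regardless of its second element.

```python
def collate_key(value):
    """Sort key approximating view collation: numbers < strings < arrays."""
    if isinstance(value, bool) or value is None:
        raise TypeError("null and booleans are left out of this sketch")
    if isinstance(value, (int, float)):
        return (0, value)
    if isinstance(value, str):
        return (1, value)  # code-point order, not ICU collation
    if isinstance(value, list):
        return (2, [collate_key(item) for item in value])
    raise TypeError("unsupported key type: %r" % type(value))


def in_range(key, startkey, endkey):
    """True if key falls within the inclusive range [startkey, endkey]."""
    return collate_key(startkey) <= collate_key(key) <= collate_key(endkey)


keys = [[15, "2"], [20, "2"], [20, "3"], [20, "4"]]
matched = [k for k in keys if in_range(k, [15, "2"], [20, "3"])]
print(matched)

# The first element decides the comparison, so the second one is never looked at.
print(in_range([1267686722, "zz"],
               [1267686720, "sp4"],
               [1267686725, "sp4\u9999"]))
```

Under this ordering the first query keeps [15,"2"], [20,"2"] and [20,"3"], while the timestamp example keeps a row named "zz" because 1267686722 lies strictly between the two bounds.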

how to do OR search in nutch?

Say, search for results whose field is 'A' or 'B'?
It seems the default is AND.
Never worked with Nutch actively, but since it's based on Lucene, shouldn't Lucene's rules apply? That is to say, the Query Parser Syntax should be applicable. See if this helps.
I recently started working with Nutch. You need to modify Query.java in Nutch to get an OR query executed.
Add this code to Query.java:
// Adds an optional term clause on the given field; the two false flags are presumably isRequired and isProhibited.
public void addShouldTerm(String term, String field) {
  clauses.add(new Clause(new Term(term), field, false, false, this.conf));
}
// Convenience overload that targets the default field.
public void addShouldTerm(String term) {
  addShouldTerm(term, Clause.DEFAULT_FIELD);
}
and form your query like this:
Query query = new Query(conf);
// Uses the addShouldTerm methods added above.
query.addShouldTerm("A");
query.addShouldTerm("B");
You will get the results for A OR B.
Please correct me if there is another or better way of doing this.
I have never used Nutch for querying (just for indexing), but the schema.xml should contain a defaultOperator which can be set to AND or OR.

Resources