I use Elasticsearch to search news articles. If I search for "Vladimir Putin", it works, because he is in the news a lot and neither "Vladimir" nor "Putin" is a very common name. But if I search for "Raja Ram", it does not work. I have a few articles about "Raja Ram", but also some about "Raja Mohanty" and "Ram Srivastava", and these rank higher than the articles quoting "Raja Ram". Is there something wrong with my tokenizer or search functions?
es.indices.create(
index="article-index",
body={
'settings': {
'analysis': {
'analyzer': {
'my_ngram_analyzer' : {
'tokenizer' : 'my_ngram_tokenizer'
}
},
'tokenizer' : {
'my_ngram_tokenizer' : {
'type' : 'nGram',
'min_gram' : '1',
'max_gram' : '50'
}
}
}
}
},
# ignore already existing index
ignore=400
)
res = es.search(index="article-index", fields="url", body={"query": {"query_string": {"query": keywordstr, "fields": ["text", "title", "tags", "domain"]}}})
You can use the match_phrase query of Elasticsearch, which only matches documents where the terms appear together and in order.
However, match_phrase takes a single field, so you can't list multiple fields; search the _all field instead.
Your query would be:
res = es.search(index="article-index", fields="url", body={"query": {"match_phrase": {"_all": keywordstr}}})
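If the phrase search should still name the individual fields, one option is a multi_match query with type "phrase". The sketch below only builds the request body; the helper name phrase_query_body is hypothetical, and the field list is copied from the question.

```python
def phrase_query_body(phrase, fields):
    """Build a search body that matches `phrase` as a phrase in any of `fields`."""
    return {
        "query": {
            "multi_match": {
                "query": phrase,
                "type": "phrase",  # each field is searched with a phrase match
                "fields": list(fields),
            }
        }
    }


body = phrase_query_body("Raja Ram", ["text", "title", "tags", "domain"])
# With a client, assuming `es` is an Elasticsearch instance:
# res = es.search(index="article-index", body=body)
```

Building the body separately makes it easy to print and inspect before sending it to the cluster.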
I need to translate the following SQL query to ES query:
SELECT *
FROM SKILL
WHERE SKILL.name LIKE 'text' and SKILL.type = 'hard'
I have tried the following using "elasticsearch" library for python3:
query = self.__es.search(index="skills",
body={"from" : skip, "size" : limit,
"query":
{"query_string":
{"query": 'text'}
}})
and this worked well. But now, I don't know how to check that the field 'type' is equal to 'hard'.
How can I do that?
Thank you.
You have to use a bool query and, in its "must" part, put two queries: the full-text one and a term one:
{
"query": {
"bool": {
"must": [{
"match": {
"name": "this is a test"
}
}, {
"term": {
"type": "hard"
}
}]
}
}
}
Before this, you have to map the type property as a keyword field, so that the term query compares against the exact stored value.
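For the Python client used in the question, the same request could be assembled roughly like this. It is a minimal sketch: skill_query_body is a hypothetical helper, and the index and field names are taken from the question.

```python
def skill_query_body(text, skill_type, skip=0, limit=10):
    """Full-text match on `name` combined with an exact match on `type`."""
    return {
        "from": skip,
        "size": limit,
        "query": {
            "bool": {
                "must": [
                    {"match": {"name": text}},        # SKILL.name LIKE 'text'
                    {"term": {"type": skill_type}},   # SKILL.type = 'hard'
                ]
            }
        },
    }


body = skill_query_body("text", "hard", skip=0, limit=20)
# With a client, assuming `es` is an Elasticsearch instance:
# result = es.search(index="skills", body=body)
```

On versions whose bool query supports a "filter" section, the term clause could be moved there so that it does not contribute to the score.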
{
"TypeList" : [
{
"TypeName" : "Carrier"
},
{
"TypeName" : "Not a Channel Member"
},
{
"TypeName" : "Service Provider"
}
]
}
Question:
db.supplies.find("text", {search:"\"chann\" \"mem\""})
For the above query I want to display:
{
"TypeName" : "Not a Channel Member"
}
But I am unable to get this result.
What changes do I have to make to the query?
Please help me.
The query below will return your desired result:
db.supplies.aggregate([
{$unwind:"$TypeList"},
{$match:{"TypeList.TypeName":{$regex:/.*chann.*mem.*/,$options:"i"}}},
{$project:{_id:0, TypeName:"$TypeList.TypeName"}}
])
If you can accept output like this:
{
"TypeList" : [
{
"TypeName" : "Not a Channel Member"
}
]
}
then you can avoid the aggregation framework, which generally helps performance, by running the following query:
db.supplies.find(
{
"TypeList.TypeName": /chann.*mem/i
},
{ // project the list in the following way
"_id": 0, // do not include the "_id" field in the output
"TypeList": { // only include the items from the TypeList array...
$elemMatch: { //... where
"TypeName": /chann.*mem/i // the "TypeName" field matches the regular expression
}
}
})
Also see this link: Retrieve only the queried element in an object array in MongoDB collection
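To check what the regular expression selects without a running MongoDB, the same pattern can be applied to the sample document in plain Python. This is only an illustration of the matching logic, not of the driver API; matching_types is a hypothetical helper.

```python
import re

# Same pattern as /chann.*mem/i in the queries above.
CHANN_MEM = re.compile(r"chann.*mem", re.IGNORECASE)


def matching_types(document):
    """Return the TypeList items whose TypeName matches the pattern."""
    return [
        item
        for item in document.get("TypeList", [])
        if CHANN_MEM.search(item.get("TypeName", ""))
    ]


doc = {
    "TypeList": [
        {"TypeName": "Carrier"},
        {"TypeName": "Not a Channel Member"},
        {"TypeName": "Service Provider"},
    ]
}

print(matching_types(doc))  # [{'TypeName': 'Not a Channel Member'}]
```

Note that this helper returns every matching item, like the $unwind pipeline, whereas $elemMatch in a projection returns only the first matching array element.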
I'm trying to implement relevance feedback for Elastic Search (Elastic.co).
I'm aware of boosting queries, which allow for the specification of positive and negative terms, with the idea being to discount the negative terms without excluding them, as a boolean must_not would.
However, I'm trying to achieve tiered boosting, of both positive and negative terms.
That is, I want to take a list of binned positive and negative terms and generate a query such that there are different positive and negative boost tiers, each containing their own query terms.
something like (pseudo query):
query{
{
terms: [very relevant terms]
pos_boost: 3
}
{
terms: [relevant terms]
pos_boost: 2
}
{
terms: [irrelevant terms]
neg_boost: 0.6
}
{
terms: [very irrelevant terms]
neg_boost: 0.3
}
}
My question is whether or not this can be achieved with nested boosting queries, or if I'm better off with multiple should clauses.
My concern is that I'm not sure if a boost of 0.2 in the should clause of a bool query still gives the document a positive increase in the score or not, as I want to discount the document, rather than provide any increase in score.
With boosting queries, the concern is that I can't control the degree to which positive terms are weighted.
Any help, or suggestions for other implementations, would be greatly appreciated. (What I really wanted to do was create a language model for relevant documents and use that to rank, but I don't see how that can easily be achieved in elastic.)
It seems that you can combine a bool query with boosted query clauses, tweaking the boost value of each clause:
POST so/boost/ {"text": "apple computers"}
POST so/boost/ {"text": "apple pie recipe"}
POST so/boost/ {"text": "apple tree garden"}
POST so/boost/ {"text": "apple iphone"}
POST so/boost/ {"text": "apple company"}
GET so/boost/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"text": "apple"
}
}
],
"should": [
{
"match": {
"text": {
"query": "pie",
"boost": 2
}
}
},
{
"match": {
"text": {
"query": "tree",
"boost": 2
}
}
},
{
"match": {
"text": {
"query": "iphone",
"boost": -0.5
}
}
}
]
}
}
}
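Regarding the concern in the question: as far as I understand, a should clause with a boost between 0 and 1 still adds a (smaller) positive amount to the score, and newer Elasticsearch versions may reject negative boost values. A way to get real demotion tiers is to nest boosting queries, whose negative_boost multiplies the score of documents matching the negative part. The sketch below only builds the query body; tiered_boost_query, base_query and the tier lists are hypothetical names, and it assumes single-word terms.

```python
def tiered_boost_query(field, base_query, positive_tiers, negative_tiers):
    """positive_tiers: [(terms, boost)], negative_tiers: [(terms, negative_boost)]."""
    # Positive tiers: optional clauses that add score, weighted per tier.
    should = [
        {"match": {field: {"query": " ".join(terms), "boost": boost}}}
        for terms, boost in positive_tiers
    ]
    query = {"bool": {"must": [base_query], "should": should}}
    # Negative tiers: wrap the query in one boosting query per tier.
    for terms, negative_boost in negative_tiers:
        query = {
            "boosting": {
                "positive": query,
                "negative": {"match": {field: " ".join(terms)}},
                "negative_boost": negative_boost,
            }
        }
    return {"query": query}


body = tiered_boost_query(
    "text",
    {"match": {"text": "apple"}},
    positive_tiers=[(["pie"], 3), (["tree"], 2)],
    negative_tiers=[(["iphone"], 0.6), (["company"], 0.3)],
)
```

With this nesting, a document that matches both negative tiers has its score multiplied by both factors.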
Alternately, if you want to encode your language model into your collection at index-time, you can try the approach described here: Elasticsearch: Influence scoring with custom score field in document
To boost Elasticsearch documents (a priority-based search query) by a custom, variable boost value at query time, i.e. conditional boosting:
Java coding example:
customerKeySearch = QueryBuilders.constantScoreQuery(QueryBuilders.termQuery("keys.type", "xxx"));
customerTypeSearch = QueryBuilders.constantScoreQuery(QueryBuilders.termQuery("keys.keyValues.value", "xxxx"));
keyValueQuery = QueryBuilders.boolQuery().must(customerKeySearch).must(customerTypeSearch).boost(2f);
customerKeySearch = QueryBuilders.constantScoreQuery(QueryBuilders.termQuery("keys.type", "xxx"));
customerTypeSearch = QueryBuilders.constantScoreQuery(QueryBuilders.termQuery("keys.keyValues.value", "xxxx"));
keyValueQuery = QueryBuilders.boolQuery().must(customerKeySearch).must(customerTypeSearch).boost(6f);
Description and search query:
Elasticsearch has its own internal score calculation, so to let the custom boost values take effect we disable the coordination factor by calling disableCoord(true) on the bool query in Java.
The following bool query is the one actually run to boost documents in the Elasticsearch index based on the boost value.
{
"bool" : {
"should" : [ {
"bool" : {
"must" : [ {
"constant_score" : {
"query" : {
"term" : {
"keys.type" : "XXX"
}
}
}
}, {
"constant_score" : {
"query" : {
"term" : {
"keys.keyValues.value" : "XXXX"
}
}
}
} ],
"boost" : 2.0
}
}, {
"bool" : {
"must" : [ {
"constant_score" : {
"query" : {
"term" : {
"keys.type" : "XXX"
}
}
}
}, {
"constant_score" : {
"query" : {
"term" : {
"keys.keyValues.value" : "500072388315"
}
}
}
} ],
"boost" : 6.0
}
}, {
"bool" : {
"must" : [ {
"constant_score" : {
"query" : {
"term" : {
"keys.type" : "XXX"
}
}
}
}, {
"constant_score" : {
"query" : {
"term" : {
"keys.keyValues.value" : "XXXXXX"
}
}
}
} ],
"boost" : 10.0
}
} ],
"disable_coord" : true
}
}
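Since the three branches differ only in the key value and the boost, the query can be generated from a list of (value, boost) pairs. The Python sketch below mirrors the JSON above as written, including "query" inside constant_score and the disable_coord flag; whether a given Elasticsearch version accepts those two keys is an assumption, and boosted_key_query is a hypothetical helper.

```python
def boosted_key_query(key_type, value_boosts, disable_coord=True):
    """Build one boosted bool/must branch per (value, boost) pair."""

    def constant_term(field, value):
        # Constant-score wrapper so only the boost decides the ranking.
        return {"constant_score": {"query": {"term": {field: value}}}}

    should = [
        {
            "bool": {
                "must": [
                    constant_term("keys.type", key_type),
                    constant_term("keys.keyValues.value", value),
                ],
                "boost": boost,
            }
        }
        for value, boost in value_boosts
    ]
    bool_query = {"should": should}
    if disable_coord:
        bool_query["disable_coord"] = True
    return {"bool": bool_query}


query = boosted_key_query(
    "XXX", [("XXXX", 2.0), ("500072388315", 6.0), ("XXXXXX", 10.0)]
)
```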
Hi, I need to boost documents based on a particular value of a field. My documents contain a field called region, and I need to boost documents based on the value in that field.
These are my documents
{
"title":"INOX: Malleshwaram - Mantri Square",
"region":"Bangalore"
}
{
"title":"INOX: Bund Garden Road",
"region":"Pune"
}
{
"title":"INOX: Glomax Mall, Kharghar",
"region":"Mumbai"
}
I have tried to use a rescore query, which looks like this:
"rescore" : {
"query" : {
"score_mode":"total",
"query_weight" : 2.5,
"rescore_query_weight" : 0.5,
"rescore_query" : {
"match" : {
"region" : {
"query" : "mumbai",
"slop" : 2
}
}
}
}
}
}
But it's not working as required. Is there any way to solve this?
Thanks in advance!
Why rescoring? All you need is boosting. Depending on the query type you are using, boosting is possible, for example in a query_string query:
"query_string": {
"fields":["region^56"],
"use_dis_max" : true,
"query": "mumbai"
}
where ^56 is the boosting value.
You can also use a boosting query, as described here: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-boosting-query.html
If you are using a bool query, you can set boost on it like this to boost all of its clauses:
{
"bool" : {
"must" : {
"term" : { "region" : "mumbai" }
},
"boost" : 25.0
}
}
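If documents from other regions should still be returned, only ranked lower, one option is to keep the main query in "must" and put the region in a boosted "should" clause. This is a sketch under assumptions: boost_region_body and main_query are hypothetical names, and the lowercase "mumbai" assumes the region field is analyzed into lowercase terms, as in the queries above.

```python
def boost_region_body(main_query, region, boost=2.0):
    """Keep all hits of `main_query`, ranking the given region higher."""
    return {
        "query": {
            "bool": {
                "must": [main_query],
                # Optional clause: matching documents get extra score.
                "should": [
                    {"term": {"region": {"value": region, "boost": boost}}}
                ],
            }
        }
    }


body = boost_region_body({"match": {"title": "INOX"}}, "mumbai", boost=5.0)
# With a client, assuming `es` is an Elasticsearch instance:
# res = es.search(index="...", body=body)
```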
My Elasticsearch index has 10 types in it. When searching for the term "test", I want to get all the documents that match the query and a list of all the types that have at least one match for that query.
I know I can get this list by going over all results but I guess there's a better way..
Thanks!
Since facets have been deprecated (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-facets.html) and replaced with aggregations, here is the solution for aggregations:
{
"query": {
...
},
"aggs": {
"your_aggregation_name": {
"terms": {
"field": "_type"
}
}
}
}
Link to documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html
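From Python, the request body and the extraction of the matched types could look roughly like this. The helper names and the aggregation name are hypothetical, and sample_response is a hand-written stand-in for a real search response, used only to show the shape of the buckets.

```python
def types_agg_body(query, agg_name="types_with_hits"):
    """Search body that also counts hits per _type."""
    return {
        "query": query,
        "aggs": {agg_name: {"terms": {"field": "_type"}}},
    }


def matched_types(response, agg_name="types_with_hits"):
    """List the types that have at least one hit for the query."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [bucket["key"] for bucket in buckets if bucket["doc_count"] > 0]


body = types_agg_body({"query_string": {"query": "test"}})

# Hypothetical response fragment, not real data.
sample_response = {
    "aggregations": {
        "types_with_hits": {
            "buckets": [
                {"key": "article", "doc_count": 12},
                {"key": "comment", "doc_count": 3},
            ]
        }
    }
}
print(matched_types(sample_response))  # ['article', 'comment']
```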
I just managed to do that with Elasticsearch facets, as described here:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets.html#_facet_filter
In short you add this to your query:
"facets" : { "facet_name" : { "terms" : {"field" : "_type"} } }
Hope this helps someone.