Preserving the position while searching - search

I need to make an Elasticsearch query work with multiple words. I am using the edgeNGram tokenizer to support autocompletion, and query_string for searching.
Document
{
"title":"Gold digital cinema",
"region":"Mumbai"
}
{
"title":"cine park",
"region":"Mumbai"
}
{
"title":"Premier Gold",
"region":"mumbai"
}
Query
{
"query": {
"bool": {
"should": [
{
"query_string": {
"fields": [
"title",
"region"
],
"query":"gold cine"
}
},
{
"fuzzy": {
"title": {
"value":"gold cine",
"min_similarity": 0.5,
"max_expansions": 50,
"prefix_length": 0
}
}
}
]
}
}
}
When I search for gold cine, I need "title":"Gold digital cinema" to be among the top results. Instead, I am getting "title":"cine park" and "title":"Premier Gold" at the top.
Is there any way to preserve position while searching?
Thanks in advance.
UPDATE
{
"query":{
"bool":{
"should":[{
"query_string":{
"fields":["title.default_title^10",
"title.ngrams_front^2",
"title.ngrams_back"],
"query":"gold cine",
"boost":2
}
},
{
"function_score":{
"query":{
"query_string":{
"fields":["region"],
"query":"MUMBAI"
}
},
"functions":[{
"script_score":{
"script":"_score + 0.6"}
}
],
"score_mode":"max",
"boost_mode":"avg"
}
}
]
}
}
}

You can use boost to pull the relevant documents to the top:
{
"query": {
"bool": {
"should": [
{
"query_string": {
"fields": [
"title",
"region"
],
"query":"gold cine",
"boost": 3
}
},
{
"fuzzy": {
"title": {
"value":"gold cine",
"min_similarity": 0.5,
"max_expansions": 50,
"prefix_length": 0
}
}
}
]
}
}
}
If you use version 1.0.0 or above, you should use the function score query.
Hope it helps!
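As a minimal sketch of the boost idea above, the same request body can be built in Python so the boost value is easy to vary while testing relevance. The helper name boosted_should_query and its parameters are hypothetical; the clauses simply mirror the query shown in this answer, and no client library is assumed.

```python
import json


def boosted_should_query(text, fields, boost=3):
    """Build a bool/should body: a boosted query_string plus a fuzzy clause on title."""
    return {
        "query": {
            "bool": {
                "should": [
                    # The boost multiplies this clause's contribution to _score.
                    {"query_string": {"fields": list(fields), "query": text, "boost": boost}},
                    # Fuzzy clause copied from the answer above, left unboosted.
                    {"fuzzy": {"title": {"value": text,
                                         "min_similarity": 0.5,
                                         "max_expansions": 50,
                                         "prefix_length": 0}}},
                ]
            }
        }
    }


body = boosted_should_query("gold cine", ["title", "region"])
print(json.dumps(body, indent=2))
```

Changing the boost argument and re-running the search is a quick way to see how far the query_string clause has to be weighted before the expected title ranks first.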

Related

elasticsearch must OR must_not

I have this query for my elasticsearch request:
{
"query": {
"bool": {
"filter": {
"bool": {
"should" : [
{
"bool" : {
"must_not": {
"exists": {
"field": "visibility_id"
}
}
}
},
{
"bool" : {
"must": {
"terms": {
"visibility.visibility": ["visible"]
}
}
}
}
]
}
}
}
}
}
The goal is to check whether the row has a visibility_id. If it does not, the row should match because it reaches the "must_not" clause. But if the visibility_id column is present, the query needs to check that the visibility is set to "visible".
At the moment it works when visibility_id is null, but it does not check the terms: the value can be anything other than visible and the row still matches.
Can someone help me, please? I am new to Elasticsearch. (I have also tried without the filter and bool wrappers, using only the should, but that does not work either.)
Try this query; you're missing minimum_should_match: 1:
{
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"bool": {
"must_not": {
"exists": {
"field": "visibility_id"
}
}
}
},
{
"terms": {
"visibility.visibility": [
"visible"
]
}
}
]
}
}
}
If visibility is nested in your mapping, your query needs to be like this instead:
{
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"bool": {
"must_not": {
"exists": {
"field": "visibility_id"
}
}
}
},
{
"nested": {
"path": "visibility",
"query": {
"terms": {
"visibility.visibility": [
"visible"
]
}
}
}
}
]
}
}
}
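To make the intended logic concrete, here is a small in-memory sketch of what the two should clauses with minimum_should_match: 1 are meant to express: a document passes if it has no visibility_id, or if its visibility is "visible". The sample documents and the function name is_visible are made up for illustration; this only imitates the boolean logic, not how Elasticsearch evaluates the query.

```python
def is_visible(doc):
    """Return True if at least one of the two should clauses matches."""
    # Clause 1: must_not exists -> the field is absent or null.
    missing_id = doc.get("visibility_id") is None
    # Clause 2: terms on visibility.visibility with the single value "visible".
    visibility = doc.get("visibility") or {}
    term_match = visibility.get("visibility") in {"visible"}
    # minimum_should_match: 1 -> at least one clause must be true.
    return sum([missing_id, term_match]) >= 1


# Hypothetical sample documents.
docs = [
    {"id": 1},
    {"id": 2, "visibility_id": 7, "visibility": {"visibility": "visible"}},
    {"id": 3, "visibility_id": 8, "visibility": {"visibility": "hidden"}},
]

matching_ids = [d["id"] for d in docs if is_visible(d)]
print(matching_ids)
```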

How to calculate total for each token in Elasticsearch

I have the following request to Elasticsearch:
{
"query":{
"bool":{
"must":[
{
"query_string":{
"query":"something1 OR something2 OR something3",
"default_operator":"OR"
}
}
],
"filter":{
"range":{
"time":{
"gte":date
}
}
}
}
}
}
I want to calculate the document count for each token across all documents in a single Elasticsearch request, for example:
something1: 26 documents
something2: 12 documents
something3: 1 document
Assuming that the tokens are not akin to enumerations (i.e. a constrained set of specific values, like state names, which would make a terms aggregation your best bet with the right mapping), I think the closest thing to what you want would be to use a filters aggregation:
POST your-index/_search
{
"query":{
"bool":{
"must":[
{
"query_string":{
"query":"something1 OR something2 OR something3",
"default_operator":"OR"
}
}
],
"filter":{
"range":{
"time":{
"gte":date
}
}
}
}
},
"aggs": {
"token_doc_counts": {
"filters" : {
"filters" : {
"something1" : {
"bool": {
"must": { "query_string" : { "query" : "something1" } },
"filter": { "range": { "time": { "gte": date } } }
}
},
"something2" : {
"bool": {
"must": { "query_string" : { "query" : "something2" } },
"filter": { "range": { "time": { "gte": date } } }
}
},
"something3" : {
"bool": {
"must": { "query_string" : { "query" : "something3" } },
"filter": { "range": { "time": { "gte": date } } }
}
}
}
}
}
}
}
The response would look something like:
{
"took": 9,
"timed_out": false,
"_shards": ...,
"hits": ...,
"aggregations": {
"token_doc_counts": {
"buckets": {
"something1": {
"doc_count": 1
},
"something2": {
"doc_count": 2
},
"something3": {
"doc_count": 3
}
}
}
}
}
You can split your query into filters aggregation of three filters. For reference look here: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-filters-aggregation.html
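Since the filters aggregation repeats the same clause once per token, the aggregation body can be generated rather than written by hand. The following is a rough sketch; the function name token_filters_agg and the "now-7d" value standing in for the date placeholder are assumptions, not part of the original request.

```python
import json


def token_filters_agg(tokens, since):
    """Build a filters aggregation with one named bucket per token."""
    def one_filter(token):
        return {
            "bool": {
                "must": {"query_string": {"query": token}},
                "filter": {"range": {"time": {"gte": since}}},
            }
        }

    return {
        "token_doc_counts": {
            "filters": {"filters": {token: one_filter(token) for token in tokens}}
        }
    }


tokens = ["something1", "something2", "something3"]
# "now-7d" is only a placeholder for whatever date the request uses.
aggs = token_filters_agg(tokens, since="now-7d")
print(json.dumps({"aggs": aggs}, indent=2))
```

The resulting dictionary goes under the "aggs" key of the search body, next to the existing "query".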
What you need to do is create a copy_to field, with the mapping shown below.
Depending on the fields that your query_string searches, you need to add copy_to to some or all of them.
By default query_string searches all fields, so you may need to specify copy_to for every field, as in the mapping below. For simplicity I've created only three fields: title, field_2, and a third field, content, which acts as the copy_to target.
Mapping
PUT <your_index_name>
{
"mappings": {
"mydocs": {
"properties": {
"title": {
"type": "text",
"copy_to": "content"
},
"field_2": {
"type": "text",
"copy_to": "content"
},
"content": {
"type": "text",
"fielddata": true
}
}
}
}
}
Sample Documents
POST <your_index_name>/mydocs/1
{
"title": "something1",
"field_2": "something2"
}
POST <your_index_name>/mydocs/2
{
"title": "something2",
"field_2": "something3"
}
Query:
The aggregation query below returns the required document count for each token. It uses a Terms Aggregation:
POST <your_index_name>/_search
{
"size": 0,
"query": {
"query_string": {
"query": "something1 OR something2 OR something3"
}
},
"aggs": {
"myaggs": {
"terms": {
"field": "content",
"include" : ["something1","something2","something3"]
}
}
}
}
Query Response:
{
"took": 7,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0,
"hits": []
},
"aggregations": {
"myaggs": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "something2",
"doc_count": 2
},
{
"key": "something1",
"doc_count": 1
},
{
"key": "something3",
"doc_count": 1
}
]
}
}
}
Let me know if it helps!
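To see why the bucket counts come out this way, the copy_to plus terms aggregation behaviour can be imitated in plain Python on the two sample documents. This is only an approximation: lowercasing and splitting on whitespace stands in for the text analyzer, and the function name doc_counts is hypothetical.

```python
from collections import Counter

docs = [
    {"title": "something1", "field_2": "something2"},
    {"title": "something2", "field_2": "something3"},
]
include = ["something1", "something2", "something3"]


def doc_counts(docs, include):
    """Count, per token, the number of documents whose combined content has it."""
    counts = Counter()
    for doc in docs:
        # Imitate copy_to: gather the tokens of every source field into one set.
        content = set()
        for value in doc.values():
            content.update(value.lower().split())
        # Imitate terms + include: count each listed token once per document.
        for token in include:
            if token in content:
                counts[token] += 1
    return dict(counts)


counts = doc_counts(docs, include)
print(counts)
```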

ElasticSearch : How to combine nested 'AND' Not Equal

I want to build a search query that combines a nested match with a not-equal condition.
This is my elasticSearch query:
{
"from":0,"size":1000,
"query":{
"nested" : {
"path" : "data",
"query" : {
"match" : {
"data.city" : "california"
}
}
},
"filter":{
"not":{
"filter":{
"term":{
"_id":"01921asda01201"
}
}
}
}
}
}
But I get an error. Am I writing something wrong? Thanks.
You can also use a bool filter with must and must_not clauses:
{
"from": 0,
"size": 1000,
"filter": {
"bool": {
"must": [
{
"nested": {
"path": "data",
"query": {
"match": {
"data.city": "california"
}
}
}
}
],
"must_not": [
{
"term": {
"_id": "01921asda01201"
}
}
]
}
}
}
You need to use a filtered query:
GET _search
{
"query": {
"filtered": {
"query": {
"nested": {
"path" : "data",
"query" : {
"match" : {
"data.city" : "california"
}
}
}
},
"filter": {
"bool": {
"must_not": [
{
"term": {
"_id": "01921asda01201"
}
}
]
}
}
}
}
}
You should use a bool query for this, and put your two clauses in the must and must_not sections respectively.
If you don't care about scoring on the data.city field (from your example it's not clear), you might want to use the filter portion instead of the must portion.
{
  "from": 0,
  "size": 1000,
  "query": {
    "bool": {
      "filter": [
        {
          "nested": {
            "path": "data",
            "query": {
              "match": {
                "data.city": "california"
              }
            }
          }
        }
      ],
      "must_not": [
        {
          "term": {
            "_id": "01921asda01201"
          }
        }
      ]
    }
  }
}
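The choice between the filter and must sections described above can be sketched as a small Python builder. The function name nested_city_query and the score_city flag are hypothetical; the body it produces follows the bool query shown in this answer.

```python
import json


def nested_city_query(city, excluded_id, score_city=False):
    """Build a bool query: nested city match, excluding one document id."""
    nested = {
        "nested": {
            "path": "data",
            "query": {"match": {"data.city": city}},
        }
    }
    # "must" lets the city match contribute to _score; "filter" does not.
    clause = "must" if score_city else "filter"
    return {
        "from": 0,
        "size": 1000,
        "query": {
            "bool": {
                clause: [nested],
                "must_not": [{"term": {"_id": excluded_id}}],
            }
        },
    }


body = nested_city_query("california", "01921asda01201")
print(json.dumps(body, indent=2))
```

Calling it with score_city=True moves the nested clause into must, for the case where the city match should affect ranking.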

ElasticSearch filtering with geo distance

I'm attempting to filter data with both geo distance and fields like 'has_cctv' or 'has_instant_bookings'.
{
"query" : {
"filtered" : {
"filter" : {
"geo_distance": {
"distance": 10000,
"lat_lng": {
"lat": "51.5073509",
"lon": "-0.1277583"
}
}
}
}
}
}
I've tried many combinations of filtering using terms but can't seem to get past errors. For example:
{
"query" : {
"filtered" : {
"filter" : {
"terms": [
{"term": {"has_cctv": 1}}
],
"geo_distance": {
"distance": 10000,
"lat_lng": {
"lat": "51.5073509",
"lon": "-0.1277583"
}
}
}
}
}
}
This gives me '[terms] filter does not support [has_cctv] within lookup element'. Could this be a problem with my query, or a problem with the way the data is stored?
Here goes the correct query:
POST _search
{
"query": {
"filtered": {
"query": {
"term": {
"has_cctv": {
"value": 1
}
}
},
"filter": {
"geo_distance": {
"distance": 10000,
"lat_lng": {
"lat": "51.5073509",
"lon": "-0.1277583"
}
}
}
}
}
}
Just make sure that lat_lng is stored as geo_point
Thanks
Bharvi
Or you could use an and filter to group the two filters together, as in the query below. For a comparison between the bool filter and the and/or/not filters, see http://www.elasticsearch.org/blog/all-about-elasticsearch-filter-bitsets/
{
"query": {
"filtered": {
"filter": {
"and": {
"filters": [
{
"term": {
"has_cctv": "1"
}
},
{
"geo_distance": {
"distance": 10000,
"lat_lng": {
"lat": "51.5073509",
"lon": "-0.1277583"
}
}
}
]
}
}
}
}
}
There are two errors.
As you have more than one filter, you need to wrap them in a bool filter and put each one in a must clause.
Also, you need a term filter here, not a terms filter.
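A sketch of what those two fixes could look like, written as a Python dictionary in the same legacy filtered syntax used in this thread. The function name cctv_near is hypothetical, and the field names come from the question. Note that the filtered query was removed in Elasticsearch 5.0, where a bool query with a filter clause takes its place.

```python
import json


def cctv_near(lat, lon, distance=10000):
    """Build a filtered query with a bool filter holding both conditions."""
    return {
        "query": {
            "filtered": {
                "filter": {
                    "bool": {
                        "must": [
                            # term (singular), not terms, for a single value.
                            {"term": {"has_cctv": 1}},
                            {
                                "geo_distance": {
                                    "distance": distance,
                                    "lat_lng": {"lat": lat, "lon": lon},
                                }
                            },
                        ]
                    }
                }
            }
        }
    }


body = cctv_near(51.5073509, -0.1277583)
print(json.dumps(body, indent=2))
```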

Elasticsearch sort on multiple queries

I have a query like so:
{
"sort": [
{
"_geo_distance": {
"geo": {
"lat": 39.802763999999996,
"lon": -105.08748399999999
},
"order": "asc",
"unit": "mi",
"mode": "min",
"distance_type": "sloppy_arc"
}
}
],
"query": {
"bool": {
"minimum_number_should_match": 0,
"should": [
{
"match": {
"name": ""
}
},
{
"match": {
"credit": true
}
}
]
}
}
}
I want my search to always return ALL results, just sorted with those which have matching flags closer to the top.
I would like the sorting priority to go something like:
searchTerm (name, a string)
flags (credit/atm/ada/etc, boolean values)
distance
How can this be achieved?
So far, the query you see above is all I've gotten. I haven't been able to figure out how to always return all results, nor how to incorporate the additional queries into the sort.
I don't believe "sort" is the answer you are looking for, actually. I believe you need a trial-and-error approach starting with a simple "bool" query where you put all your criteria (name, flags, distance). Then you give your name criterion more weight (boost), a little less to your flags, and even less to the distance calculation.
A "bool" "should" gives you a list of documents sorted by the _score of each, and, depending on how you boost each criterion, it influences the _score more or less.
Also, returning ALL the elements is not difficult: just add a "match_all": {} to your "bool" "should" query.
This would be a starting point, from my point of view. Depending on your documents and your requirements (see my comment on your post about the confusion), you would need to adjust the "boost" values, test, adjust again, test again, and so on:
{
"query": {
"bool": {
"should": [
{ "constant_score": {
"boost": 6,
"query": {
"match": { "name": { "query": "something" } }
}
}},
{ "constant_score": {
"boost": 3,
"query": {
"match": { "credit": { "query": true } }
}
}},
{ "constant_score": {
"boost": 3,
"query": {
"match": { "atm": { "query": false } }
}
}},
{ "constant_score": {
"boost": 3,
"query": {
"match": { "ada": { "query": true } }
}
}},
{ "constant_score": {
"query": {
"function_score": {
"functions": [
{
"gauss": {
"geo": {
"origin": {
"lat": 39.802763999999996,
"lon": -105.08748399999999
},
"offset": "2km",
"scale": "3km"
}
}
}
]
}
}
}
},
{
"match_all": {}
}
]
}
}
}
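To get a feel for how the boost values shape the ordering before running anything against the index, the constant_score clauses can be approximated offline. The sketch below is a deliberate simplification: each matching clause just adds its boost, while the distance function, match_all and any score normalization are ignored. The sample documents and the name rough_score are made up.

```python
BOOSTS = {"name": 6, "credit": 3, "atm": 3, "ada": 3}


def rough_score(doc, wanted):
    """Sum the boosts of the clauses that match this document."""
    score = 0
    for field, boost in BOOSTS.items():
        if field == "name":
            # Crude stand-in for a match query: whole-word, case-insensitive.
            matched = wanted["name"].lower() in doc["name"].lower().split()
        else:
            matched = doc.get(field) == wanted[field]
        if matched:
            score += boost
    return score


wanted = {"name": "something", "credit": True, "atm": False, "ada": True}

# Hypothetical documents.
docs = [
    {"name": "Third Spot", "credit": False, "atm": True, "ada": False},
    {"name": "Other Place", "credit": True, "atm": False, "ada": False},
    {"name": "Something Cafe", "credit": True, "atm": True, "ada": False},
]

ranked = sorted(docs, key=lambda d: rough_score(d, wanted), reverse=True)
ranked_names = [d["name"] for d in ranked]
print(ranked_names)
```

Every document is still returned, as with match_all in the query; only the order changes as the values in BOOSTS are adjusted.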

Resources