ElasticSearch filtering with geo distance - search

I'm attempting to filter data with both geo distance and fields like 'has_cctv' or 'has_instant_bookings'.
{
"query" : {
"filtered" : {
"filter" : {
"geo_distance": {
"distance": 10000,
"lat_lng": {
"lat": "51.5073509",
"lon": "-0.1277583"
}
}
}
}
}
}
I've tried many combinations of filtering using terms but can't seem to get past errors. For example:
{
"query" : {
"filtered" : {
"filter" : {
"terms": [
{"term": {"has_cctv": 1}}
],
"geo_distance": {
"distance": 10000,
"lat_lng": {
"lat": "51.5073509",
"lon": "-0.1277583"
}
}
}
}
}
}
This gives me '[terms] filter does not support [has_cctv] within lookup element'. Could this be a problem with my query, or a problem with the way the data is stored?

Here goes the correct query:
POST _search
{
"query": {
"filtered": {
"query": {
"term": {
"has_cctv": {
"value": 1
}
}
},
"filter": {
"geo_distance": {
"distance": 10000,
"lat_lng": {
"lat": "51.5073509",
"lon": "-0.1277583"
}
}
}
}
}
}
Just make sure that lat_lng is mapped as a geo_point.
Thanks
Bharvi

Alternatively, you could use an and filter to group the two filters together. For a comparison between the bool filter and the and/or/not filters, see: http://www.elasticsearch.org/blog/all-about-elasticsearch-filter-bitsets/
{
"query": {
"filtered": {
"filter": {
"and": {
"filters": [
{
"term": {
"has_cctv": "1"
}
},
{
"geo_distance": {
"distance": 10000,
"lat_lng": {
"lat": "51.5073509",
"lon": "-0.1277583"
}
}
}
]
}
}
}
}
}

There are two errors.
First, since you have more than one filter, you need to wrap them in a bool filter and put each filter in a must clause.
Second, you need a term filter here, not a terms filter.
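The bool approach described above can be sketched in Python as follows. This only builds the request body as a dict and does not talk to a cluster; the helper name build_geo_flag_query is hypothetical, the field names are taken from the question, and the filtered query syntax matches the older Elasticsearch versions used in this thread.

```python
import json


def build_geo_flag_query(lat, lon, distance, flags):
    """Build a filtered query whose bool filter combines term filters
    with a geo_distance filter, one must clause per filter.

    Hypothetical helper: field names (lat_lng, has_cctv, ...) follow the question.
    """
    # One term filter per flag, e.g. {"term": {"has_cctv": 1}}
    must = [{"term": {field: value}} for field, value in flags.items()]
    # The geo_distance filter goes into the same must list
    must.append({
        "geo_distance": {
            "distance": distance,
            "lat_lng": {"lat": lat, "lon": lon},
        }
    })
    return {"query": {"filtered": {"filter": {"bool": {"must": must}}}}}


body = build_geo_flag_query(
    "51.5073509", "-0.1277583", 10000,
    {"has_cctv": 1, "has_instant_bookings": 1},
)
print(json.dumps(body, indent=2))
```

Every entry in must has to match, so adding another flag only means adding another key to the flags dict.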

Related

Is there a Group BY function for finding result with elastic search query?

I have tried to implement a group-by with Elasticsearch, but I don't get the result I expect. Please help me fix this. The indexed data is:
data = [
{ "fruit":"apple", "taste":5, "timestamp":100},
{ "fruit":"pear", "taste":5, "timestamp":110},
{ "fruit":"apple", "taste":4, "timestamp":200},
{ "fruit":"pear", "taste":8, "timestamp":90},
{ "fruit":"banana", "taste":5, "timestamp":100}]
My query is:
myQuery = {"query": {
"match_all": {}
},
"aggs": {
"group_by_fruit": {
"terms": {
"field": "fruit.keyword"
},
}
}
}
It shows all 5 documents in the output, but I need to get only 3 records. The expected result is:
[
{ "fruit":"apple", "taste":4, "timestamp":200},
{ "fruit":"pear", "taste":8, "timestamp":90},
{ "fruit":"banana", "taste":5, "timestamp":100}]
If you want to get the documents with distinct fruit fields having the largest timestamp value you should use a top_hits aggregation.
{
"query": {
"match_all": {}
},
"size": 0,
"aggs": {
"top_tags": {
"terms": {
"field": "fruit.keyword",
"size": <MAX_NUMBER_OF_DISTINCT_FRUITS>
},
"aggs": {
"group_by_fruit": {
"top_hits": {
"sort": [
{
"timestamp": {
"order": "desc"
}
}
],
"size" : 1
}
}
}
}
}
}
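To turn the aggregation result into a flat list of documents, the buckets have to be unpacked on the client side. The following is a minimal Python sketch; the function name latest_doc_per_bucket is hypothetical, and sample_response is a hand-written, trimmed stand-in for a search response rather than real output. It assumes the aggregation names top_tags and group_by_fruit from the query above.

```python
def latest_doc_per_bucket(response):
    """Return the _source of the first top_hits entry of every terms bucket."""
    buckets = response["aggregations"]["top_tags"]["buckets"]
    return [
        bucket["group_by_fruit"]["hits"]["hits"][0]["_source"]
        for bucket in buckets
    ]


# Hand-written sample shaped like a search response (not real output).
sample_response = {
    "aggregations": {
        "top_tags": {
            "buckets": [
                {
                    "key": "apple",
                    "doc_count": 2,
                    "group_by_fruit": {"hits": {"hits": [
                        {"_source": {"fruit": "apple", "taste": 4, "timestamp": 200}}
                    ]}},
                },
                {
                    "key": "banana",
                    "doc_count": 1,
                    "group_by_fruit": {"hits": {"hits": [
                        {"_source": {"fruit": "banana", "taste": 5, "timestamp": 100}}
                    ]}},
                },
            ]
        }
    }
}

docs = latest_doc_per_bucket(sample_response)
print(docs)
```

Because the top_hits size is 1 and the sort is by timestamp descending, each bucket contributes exactly one document.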

ElasticSearch : How to combine nested 'AND' Not Equal

I want to build a query that combines a nested match with a not-equal condition.
This is my Elasticsearch query:
{
"from":0,"size":1000,
"query":{
"nested" : {
"path" : "data",
"query" : {
"match" : {
"data.city" : "california"
}
}
},
"filter":{
"not":{
"filter":{
"term":{
"_id":"01921asda01201"
}
}
}
}
}
}
But I get an error. Am I writing something wrong? Thanks.
You can also use a bool filter with must and must_not clauses.
{
"from": 0,
"size": 1000,
"filter": {
"bool": {
"must": [
{
"nested": {
"path": "data",
"query": {
"match": {
"data.city": "california"
}
}
}
}
],
"must_not": [
{
"term": {
"_id": "01921asda01201"
}
}
]
}
}
}
You need to use a filtered query:
GET _search
{
"query": {
"filtered": {
"query": {
"nested": {
"path" : "data",
"query" : {
"match" : {
"data.city" : "california"
}
}
}
},
"filter": {
"bool": {
"must_not": [
{
"term": {
"_id": "01921asda01201"
}
}
]
}
}
}
}
}
You should use a bool query for this, and put your two clauses in the must and must_not sections respectively.
If you don't care about scoring on the data.city field (from your example it's not clear), you might want to use the filter portion instead of the must portion.
{
  "from": 0,
  "size": 1000,
  "query": {
    "bool": {
      "filter": [
        {
          "nested": {
            "path": "data",
            "query": {
              "match": {
                "data.city": "california"
              }
            }
          }
        }
      ],
      "must_not": [
        {
          "term": {
            "_id": "01921asda01201"
          }
        }
      ]
    }
  }
}
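The choice between must and filter can be made explicit in a small builder. This is only a sketch that constructs the request body as a Python dict; the function name build_city_query and the score_city parameter are hypothetical and introduced for illustration.

```python
def build_city_query(city, excluded_id, score_city=False):
    """Build a bool query that matches a nested city and excludes one _id.

    score_city=True puts the nested clause under must (it contributes to _score);
    score_city=False puts it under filter (no scoring).
    """
    nested = {
        "nested": {
            "path": "data",
            "query": {"match": {"data.city": city}},
        }
    }
    clause = "must" if score_city else "filter"
    return {
        "from": 0,
        "size": 1000,
        "query": {
            "bool": {
                clause: [nested],
                "must_not": [{"term": {"_id": excluded_id}}],
            }
        },
    }


unscored = build_city_query("california", "01921asda01201")
scored = build_city_query("california", "01921asda01201", score_city=True)
print(sorted(unscored["query"]["bool"]), sorted(scored["query"]["bool"]))
```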

Elasticsearch sort on multiple queries

I have a query like so:
{
"sort": [
{
"_geo_distance": {
"geo": {
"lat": 39.802763999999996,
"lon": -105.08748399999999
},
"order": "asc",
"unit": "mi",
"mode": "min",
"distance_type": "sloppy_arc"
}
}
],
"query": {
"bool": {
"minimum_number_should_match": 0,
"should": [
{
"match": {
"name": ""
}
},
{
"match": {
"credit": true
}
}
]
}
}
}
I want my search to always return ALL results, just sorted with those which have matching flags closer to the top.
I would like the sorting priority to go something like:
searchTerm (name, a string)
flags (credit/atm/ada/etc, boolean values)
distance
How can this be achieved?
So far, the query you see above is all I've gotten. I haven't been able to figure out how to always return all results, nor how to incorporate the additional queries into the sort.
I don't believe "sort" is actually the answer you are looking for. I believe you need a trial-and-error approach, starting with a simple "bool" query that contains all your criteria (name, flags, distance). Then you give your name criterion the most weight (boost), a little less to your flags, and even less to the distance calculation.
A "bool" "should" query gives you a list of documents sorted by each document's _score, and how you weight each criterion determines how much it influences the _score.
Also, returning ALL the documents is not difficult: just add a "match_all": {} clause to your "bool" "should" query.
From my point of view, this would be a starting point. Depending on your documents and your requirements (see my comment to your post about the confusion), you would need to adjust the "boost" values, test, and repeat:
{
"query": {
"bool": {
"should": [
{ "constant_score": {
"boost": 6,
"query": {
"match": { "name": { "query": "something" } }
}
}},
{ "constant_score": {
"boost": 3,
"query": {
"match": { "credit": { "query": true } }
}
}},
{ "constant_score": {
"boost": 3,
"query": {
"match": { "atm": { "query": false } }
}
}},
{ "constant_score": {
"boost": 3,
"query": {
"match": { "ada": { "query": true } }
}
}},
{ "constant_score": {
"query": {
"function_score": {
"functions": [
{
"gauss": {
"geo": {
"origin": {
"lat": 39.802763999999996,
"lon": -105.08748399999999
},
"offset": "2km",
"scale": "3km"
}
}
}
]
}
}
}
},
{
"match_all": {}
}
]
}
}
}
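Since the boost values need repeated tuning, it may help to generate the should clauses from parameters instead of editing the JSON by hand. The sketch below is an assumption-laden illustration: the helper name weighted_should and its parameters are hypothetical, it mirrors the constant_score syntax used in the query above, and it leaves out the distance function for brevity.

```python
def weighted_should(name, flags, name_boost=6, flag_boost=3):
    """Build a bool/should query with one constant_score clause per criterion.

    The distance-based function_score clause from the query above is omitted here.
    """
    clauses = [{
        "constant_score": {
            "boost": name_boost,
            "query": {"match": {"name": {"query": name}}},
        }
    }]
    for field, wanted in flags.items():
        clauses.append({
            "constant_score": {
                "boost": flag_boost,
                "query": {"match": {field: {"query": wanted}}},
            }
        })
    # match_all keeps every document in the result set, whatever else matches
    clauses.append({"match_all": {}})
    return {"query": {"bool": {"should": clauses}}}


query = weighted_should("something", {"credit": True, "atm": False, "ada": True})
boosts = [c["constant_score"]["boost"] for c in query["query"]["bool"]["should"]
          if "constant_score" in c]
print(boosts)
```

Changing name_boost or flag_boost and rerunning the search is then a one-line change per experiment.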

Elasticsearch lowercase filter search

I'm trying to search my database and be able to use upper- or lower-case filter terms. I've noticed that queries apply analyzers, but I can't figure out how to apply a lowercase analyzer to a filtered search. Here's the query:
{
"query": {
"filtered": {
"filter": {
"bool": {
"should": [
{
"term": {
"language": "mandarin" // Returns a doc
}
},
{
"term": {
"language": "Italian" // Does NOT return a doc, but will if lowercased
}
}
]
}
}
}
}
}
I have a type languages that I have lowercased using:
"analyzer": {
"lower_keyword": {
"type": "custom",
"tokenizer": "keyword",
"filter": "lowercase"
}
}
and a corresponding mapping:
"mappings": {
"languages": {
"_id": {
"path": "languageID"
},
"properties": {
"languageID": {
"type": "integer"
},
"language": {
"type": "string",
"analyzer": "lower_keyword"
},
"native": {
"type": "string",
"analyzer": "keyword"
},
"meta": {
"type": "nested"
},
"language_suggest": {
"type": "completion"
}
}
}
}
The problem is that you have a field that you have analyzed during index to lowercase it, but you are using a term filter for the query which is not analyzed:
Term Filter
Filters documents that have fields that contain a term (not analyzed).
Similar to term query, except that it acts as a filter.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-term-filter.html
I'd try using a query filter instead:
Query Filter
Wraps any query to be used as a filter. Can be placed within queries
that accept a filter.
Example:
{
"constantScore" : {
"filter" : {
"query" : {
"query_string" : {
"query" : "this AND that OR thus"
}
}
}
} }
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-filter.html#query-dsl-query-filter
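Because the term filter is not analyzed, another option is to normalize the values on the client before building the filter, as the question itself hints ("will if lowercased"). The following Python sketch assumes that str.lower() is a close enough stand-in for the lowercase token filter of the lower_keyword analyzer for the values in question; the helper name lowercase_term_filters is hypothetical.

```python
def lowercase_term_filters(field, values):
    """Build a bool/should filter of term filters with lowercased values.

    Assumption: the indexed terms were lowercased at index time, so the
    filter values must be lowercased too in order to match.
    """
    return {
        "bool": {
            "should": [{"term": {field: value.lower()}} for value in values]
        }
    }


language_filter = lowercase_term_filters("language", ["mandarin", "Italian"])
print(language_filter)
```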
This can be achieved by appending .keyword to the field name, so that the query runs against the keyword version of the field. This assumes language was defined in the mapping with type keyword.
Note that only the exact text will match now: mandarin won't match, but Italian will.
Your query would end up like this:
{
"query": {
"filtered": {
"filter": {
"bool": {
"should": [
{
"term": {
"language.keyword": "mandarin" // Returns Empty
}
},
{
"term": {
"language.keyword": "Italian" // Returns Italian.
}
}
]
}
}
}
}
}
Multiple values can be combined with a terms filter (a term filter accepts only a single value):
{
"query": {
"filtered": {
"filter": {
"bool": {
"should": [
{
"terms": {
"language.keyword":
["mandarin", "Italian"]
}
}
]
}
}
}
}
}

Preserving the position while searching

I need to make an Elasticsearch query work with multiple words. I am using the edgeNGram tokenizer to support autocompletion and query_string for searching.
Document
{
"title":"Gold digital cinema",
"region":"Mumbai"
}
{
"title":"cine park",
"region":"Mumbai"
}
{
"title":"Premier Gold",
"region":"mumbai"
}
Query
{
"query": {
"bool": {
"should": [
{
"query_string": {
"fields": [
"title",
"region"
],
"query":"gold cine"
}
},
{
"fuzzy": {
"title": {
"value":"gold cine",
"min_similarity": 0.5,
"max_expansions": 50,
"prefix_length": 0
}
}
}
]
}
}
}
When I search for gold cine, I need "title":"Gold digital cinema" to be in the top results. But I am getting "title":"cine park" and "title":"Premier Gold" at the top.
Is there any way to preserve position while searching?
Thanks in advance.
UPDATE
{
"query":{
"bool":{
"should":[{
"query_string":{
"fields":["title.default_title^10",
"title.ngrams_front^2",
"title.ngrams_back"],
"query":"gold cine",
"boost":2
}
},
{
"function_score":{
"query":{
"query_string":{
"fields":["region"],
"query":"MUMBAI"
}
},
"functions":[{
"script_score":{
"script":"_score + 0.6"}
}
],
"score_mode":"max",
"boost_mode":"avg"
}
}
]
}
}
}
You can use boost to pull the relevant documents to the top:
{
"query": {
"bool": {
"should": [
{
"query_string": {
"fields": [
"title",
"region"
],
"query":"gold cine",
"boost": 3
}
},
{
"fuzzy": {
"title": {
"value":"gold cine",
"min_similarity": 0.5,
"max_expansions": 50,
"prefix_length": 0
}
}
}
]
}
}
}
If you use version 1.0.0 or above, you should use the function score query.
Hope it helps!
