Convert strings to floats at aggregation time?

Is there any way to convert strings to floats when specifying a histogram aggregation? I have documents whose fields hold float values but are not mapped by Elasticsearch as numeric, and when I attempt a sum on such a string field it throws the following error:
ClassCastException[org.elasticsearch.index.fielddata.plain.PagedBytesIndexFieldData
cannot be cast to org.elasticsearch.index.fielddata.IndexNumericFieldData]}]"
I know I could change the mapping, but for my use case it would be handier if I could specify something like "script : _value.tofloat()" when writing the aggregation for the field.
This is my code:
{
"query" : {
"bool": {
"must": [
{"match": { "sensorId": "D14UD021808ARZC" }},
{"match": { "variableName": "CAUDAL"}}
]
}
},
"aggs" : {
"caudal_per_month" : {
"date_histogram" : {
"field" : "timestamp",
"interval" : "month"
},
"aggs": {
"totalmonth": {
"sum": {
"field": "value",
"script" : "_value*1.0"
}
}
}
}
}
}

You need to parse the string in a script instead of passing the field to the sum directly:
{
"query": {
"bool": {
"must": [
{
"match": {
"sensorId": "D14UD021808ARZC"
}
},
{
"match": {
"variableName": "CAUDAL"
}
}
]
}
},
"aggs": {
"caudal_per_month": {
"date_histogram": {
"field": "timestamp",
"interval": "month"
},
"aggs": {
"totalmonth": {
"sum": {
"script": "Float.parseFloat(doc['value'].value)"
}
}
}
}
}
}
For a field that's called value: Float.parseFloat(doc['value'].value)
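To make the intent of that script concrete, here is a rough client-side sketch in Python of what the aggregation computes: each string value is parsed to a float and summed per calendar month. The function name and the sample documents are hypothetical; only the field names timestamp and value come from the question.

```python
from collections import defaultdict
from datetime import datetime


def sum_by_month(docs):
    """Sum string-encoded 'value' fields per calendar month of 'timestamp'."""
    totals = defaultdict(float)
    for doc in docs:
        month = datetime.fromisoformat(doc["timestamp"]).strftime("%Y-%m")
        # float() plays the role of Float.parseFloat in the script above.
        totals[month] += float(doc["value"])
    return dict(totals)


# Hypothetical documents, made up for illustration.
docs = [
    {"timestamp": "2015-01-10T00:00:00", "value": "1.5"},
    {"timestamp": "2015-01-20T00:00:00", "value": "2.5"},
    {"timestamp": "2015-02-01T00:00:00", "value": "4.0"},
]
print(sum_by_month(docs))
```

This only illustrates the semantics. Doing the conversion inside the aggregation, as in the answer, avoids pulling every document to the client.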

Related

Is there a Group BY function for finding result with elastic search query?

I have tried to implement a group-by with Elasticsearch, but I don't get the result I expect. Please help me fix this. The indexed data is:
data = [
{ "fruit":"apple", "taste":5, "timestamp":100},
{ "fruit":"pear", "taste":5, "timestamp":110},
{ "fruit":"apple", "taste":4, "timestamp":200},
{ "fruit":"pear", "taste":8, "timestamp":90},
{ "fruit":"banana", "taste":5, "timestamp":100}]
My query is:
myQuery = {"query": {
"match_all": {}
},
"aggs": {
"group_by_fruit": {
"terms": {
"field": "fruit.keyword"
}
}
}
}
It shows all 5 documents in the output, but I need only 3 records. The expected result is:
[
{ "fruit":"apple", "taste":4, "timestamp":200},
{ "fruit":"pear", "taste":8, "timestamp":90},
{ "fruit":"banana", "taste":5, "timestamp":100}]

If you want to get the documents with distinct fruit fields having the largest timestamp value you should use a top_hits aggregation.
{
"query": {
"match_all": {}
},
"size": 0,
"aggs": {
"top_tags": {
"terms": {
"field": "fruit.keyword",
"size": <MAX_NUMBER_OF_DISTINCT_FRUITS>
},
"aggs": {
"group_by_fruit": {
"top_hits": {
"sort": [
{
"timestamp": {
"order": "desc"
}
}
],
"size" : 1
}
}
}
}
}
}
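As a sanity check on what "one hit per fruit, sorted by timestamp descending" should return, here is a small Python sketch of the same selection rule applied to the data from the question. The helper name latest_per_group is hypothetical and not part of any Elasticsearch client.

```python
def latest_per_group(docs, group_key, sort_key):
    """Keep, for each distinct group_key value, the doc with the largest sort_key."""
    latest = {}
    for doc in docs:
        key = doc[group_key]
        if key not in latest or doc[sort_key] > latest[key][sort_key]:
            latest[key] = doc
    return list(latest.values())


# Data copied from the question.
data = [
    {"fruit": "apple", "taste": 5, "timestamp": 100},
    {"fruit": "pear", "taste": 5, "timestamp": 110},
    {"fruit": "apple", "taste": 4, "timestamp": 200},
    {"fruit": "pear", "taste": 8, "timestamp": 90},
    {"fruit": "banana", "taste": 5, "timestamp": 100},
]
result = latest_per_group(data, "fruit", "timestamp")
for doc in result:
    print(doc)
```

Note that under this rule pear resolves to the document with timestamp 110, not the timestamp 90 document listed in the question's expected result. If the last indexed document per fruit is what you want, you would need a different sort field.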

Update elastic search doc field value for specific fields in all documents

I have documents like this.
{
"a":"test",
"b":"harry"
},
{
"a":"",
"b":"jack"
}
I need to set field a to a default value (say, null) in every document of a given index where a == "" (empty string).
Any help is appreciated. Thanks

Use update by query with an ingest pipeline.
_update_by_query can also use the Ingest Node feature by specifying a pipeline.
First, define the pipeline:
PUT _ingest/pipeline/set-foo
{
"description" : "sets foo",
"processors" : [ {
"set" : {
"field": "a",
"value": null
}
} ]
}
Then use it like this:
POST myindex/_update_by_query?pipeline=set-foo
{
"query": {
"filtered": {
"filter": {
"script": {
"script": "_source._content.length() == 0"
}
}
}
}
}
OR
POST myindex/_update_by_query?pipeline=set-foo
{
"query": {
"bool" : {
"must" : {
"script" : {
"script" : {
"inline": "doc['a'].empty",
"lang": "painless"
}
}
}
}
}
}

To query documents whose field value is an empty string (i.e. ''), I used:
"query": {
"bool": {
"must": [
{
"exists": {
"field": "a"
}
}
],
"must_not": [
{
"wildcard": {
"a": "*"
}
}
]
}
}
So the overall query to update all docs with field a == "" is:
POST test11/_update_by_query
{
"script": {
"inline": "ctx._source.a=null",
"lang": "painless"
},
"query": {
"bool": {
"must": [
{
"exists": {
"field": "a"
}
}
],
"must_not": [
{
"wildcard": {
"a": "*"
}
}
]
}
}
}

Elasticsearch: search for field null OR in list

I would like to write something like this in ElasticSearch:
SELECT *
FROM ...
WHERE name IS NULL OR name IN ("a","b","c");
I can write the "IS NULL" part using:
{
"query" :
{
"bool" : {
"must_not": {
"exists": {
"field": "name"
}
}
}
}
}
The "IN list" part:
{
"query" :
{
"bool" : {
"should" : [
{
"terms" : {
"name" : [
"a", "b", "c"
]
}
}
]
}
}
}
But I can't find a way to merge these two queries using an OR (and not an AND, of course).
Thanks

You can use bool/should to combine both:
{
"query": {
"bool": {
"should": [
{
"terms": {
"name": [
"a",
"b",
"c"
]
}
},
{
"bool": {
"must_not": {
"exists": {
"field": "name"
}
}
}
}
]
}
}
}
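If you build this request from code, the same structure can be generated for any field and value list. Below is a minimal Python sketch that only constructs the request body as a dict. The function name null_or_in is hypothetical, and sending the body to the cluster is left out.

```python
import json


def null_or_in(field, values):
    """Build a bool/should body matching docs where field is missing OR in values."""
    return {
        "query": {
            "bool": {
                "should": [
                    {"terms": {field: list(values)}},
                    {"bool": {"must_not": {"exists": {"field": field}}}},
                ]
            }
        }
    }


query = null_or_in("name", ["a", "b", "c"])
print(json.dumps(query, indent=2))
```

The must_not clause has to be wrapped in its own bool so that it becomes one of the should alternatives rather than a condition on the whole query.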

Filter parent by children aggregation in Elasticsearch

I have a parent-children relationship in my ES mapping and I want to filter the parents by the value of an aggregation (avg) on their children. That is, I only want to retrieve parents where that value is within a given range.
I tried to do it with aggs and post-filters but couldn't get it to work.
{
"apartments" : {
"mappings" : {
"apartment_availability" : {
"_parent" : {
"type" : "apartment"
},
"_routing" : {
"required" : true
},
"properties" : {
"availability_date" : {
"type" : "date"
},
"apartment_id" : {
"type" : "long"
},
"id" : {
"type" : "long"
},
"price_cents" : {
"type" : "long"
},
"status" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
},
"apartment" : {
"properties" : {
"id" : {
"type" : "long"
}
}
}
}
}
}
If our users select a period of March 24th to March 31st and a price range of €150-€300, then we want to show them all apartments that are free in that period and whose average price for that period is in the €150-€300 range.
Here's what we have so far:
{
"query": {
"bool": {
"filter": {
"bool": {
"must": [{
"has_child": {
"type": "apartment_availability",
"min_children": 8,
"max_children": 8,
"query": {
"bool": {
"must": [{
"term": {
"status": "available"
}
}, {
"range": {
"availability_date": {
"gte": "2017-03-24",
"lte": "2017-03-31"
}
}
}]
}
}
}
}]
}
}
}
}
}

My suggestion is to use a bucket_selector aggregation to filter the apartment buckets by their average price:
GET /apartments/apartment/_search
{
"query": {
"bool": {
"filter": {
"bool": {
"must": [
{
"has_child": {
"type": "apartment_availability",
"query": {
"bool": {
"must": [
{
"term": {
"status": "available"
}
},
{
"range": {
"availability_date": {
"gte": "2017-04-01",
"lte": "2017-04-03"
}
}
}
]
}
}
}
}
]
}
}
}
},
"aggs": {
"apartments_ids": {
"terms": {
"field": "id",
"size": 10
},
"aggs": {
"avails": {
"children": {
"type": "apartment_availability"
},
"aggs": {
"filter_avails": {
"filter": {
"bool": {
"must": [
{
"term": {
"status": "available"
}
},
{
"range": {
"availability_date": {
"gte": "2017-04-01",
"lte": "2017-04-03"
}
}
}
]
}
},
"aggs": {
"average": {
"avg": {
"field": "price_cents"
}
}
}
}
}
},
"avg_bucket_filter": {
"bucket_selector": {
"buckets_path": {
"avg": "avails>filter_avails.average"
},
"script": "params.avg > 150 && params.avg < 300"
}
}
}
}
}
}
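The selection rule applied by the bucket_selector script can be sketched in plain Python: compute the average child price per apartment bucket and keep only the buckets whose average falls strictly inside the range. The function name and the sample prices are hypothetical and only illustrate the logic; they are not taken from the question's index.

```python
from statistics import mean


def filter_by_average(prices_by_apartment, low, high):
    """Keep apartments whose mean price lies strictly between low and high."""
    selected = {}
    for apartment_id, prices in prices_by_apartment.items():
        if not prices:
            continue  # no matching availabilities, so no average to test
        avg = mean(prices)
        if low < avg < high:  # same test as params.avg > 150 && params.avg < 300
            selected[apartment_id] = avg
    return selected


# Hypothetical prices per apartment id for the selected period.
prices = {1: [100, 200, 300], 2: [400, 500], 3: [100, 120]}
kept = filter_by_average(prices, 150, 300)
print(kept)
```

The comparison is made in whatever unit the averaged field uses, so with a field such as price_cents the bounds in the script would presumably need to be expressed in cents as well.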

ElasticSearch : How to combine nested 'AND' Not Equal

I want to build a query that combines a nested match with a "not equal" condition.
This is my elasticSearch query:
{
"from":0,"size":1000,
"query":{
"nested" : {
"path" : "data",
"query" : {
"match" : {
"data.city" : "california"
}
}
},
"filter":{
"not":{
"filter":{
"term":{
"_id":"01921asda01201"
}
}
}
}
}
}
But I get an error. Did I write something wrong? Thanks.

You can also use a bool filter with must and must_not clauses:
{
"from": 0,
"size": 1000,
"filter": {
"bool": {
"must": [
{
"nested": {
"path": "data",
"query": {
"match": {
"data.city": "california"
}
}
}
}
],
"must_not": [
{
"term": {
"_id": "01921asda01201"
}
}
]
}
}
}

You need to use a filtered query:
GET _search
{
"query": {
"filtered": {
"query": {
"nested": {
"path" : "data",
"query" : {
"match" : {
"data.city" : "california"
}
}
}
},
"filter": {
"bool": {
"must_not": [
{
"term": {
"_id": "01921asda01201"
}
}
]
}
}
}
}
}

You should use a bool query for this, and put your two clauses in the must and must_not sections respectively.
If you don't care about scoring on the data.city field (from your example it's not clear), you might want to use the filter portion instead of the must portion.
{
  "from": 0,
  "size": 1000,
  "query": {
    "bool": {
      "filter": [
        {
          "nested": {
            "path": "data",
            "query": {
              "match": {
                "data.city": "california"
              }
            }
          }
        }
      ],
      "must_not": [
        {
          "term": {
            "_id": "01921asda01201"
          }
        }
      ]
    }
  }
}
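For completeness, here is a short Python sketch that assembles the same bool body with the nested clause in filter and the excluded _id in must_not. The helper name nested_match_excluding_id is hypothetical; it only builds the dict and does not call Elasticsearch.

```python
def nested_match_excluding_id(path, field, text, excluded_id, size=1000):
    """Build a bool query: nested match as a filter, one _id excluded via must_not."""
    return {
        "from": 0,
        "size": size,
        "query": {
            "bool": {
                "filter": [
                    {"nested": {"path": path, "query": {"match": {field: text}}}}
                ],
                "must_not": [{"term": {"_id": excluded_id}}],
            }
        },
    }


body = nested_match_excluding_id("data", "data.city", "california", "01921asda01201")
print(body)
```

Swapping "filter" for "must" in the dict is the only change needed if the match on data.city should contribute to the score.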
