I am querying for aggregate data over date ranges, like this:
"aggs": {
"range": {
"date_range": {
"field": "sold",
"ranges": [
{ "from": "2014-11-01", "to": "2014-11-30" },
{ "from": "2014-08-01", "to": "2014-08-31" }
]
}
}
}
With this query I get the following response:
"aggregations": {
"range": {
"buckets": [
{
"key": "2014-08-01T00:00:00.000Z-2014-08-31T00:00:00.000Z",
"from": 1406851200000,
"from_as_string": "2014-08-01T00:00:00.000Z",
"to": 1409443200000,
"to_as_string": "2014-08-31T00:00:00.000Z",
"doc_count": 1
},
{
"key": "2014-11-01T00:00:00.000Z-2014-11-30T00:00:00.000Z",
"from": 1414800000000,
"from_as_string": "2014-11-01T00:00:00.000Z",
"to": 1417305600000,
"to_as_string": "2014-11-30T00:00:00.000Z",
"doc_count": 2
}
]
}
}
But instead of only doc_count, I also need the complete aggregate data for the documents that fall in each range.
Is there any way to get this? Please help.
It's not clear which other fields you're looking for, so I've included a couple of examples.
By nesting another aggs inside your first one, you can ask Elasticsearch to return additional values for each bucket, e.g. averages, sums, counts, min, max, stats, etc.
This example query returns field_count, a count of the values of myfield,
and order_count, a sum based on a script:
"aggs": {
"range": {
"date_range": {
"field": "sold",
"ranges": [
{ "from": "2014-11-01", "to": "2014-11-30" },
{ "from": "2014-08-01", "to": "2014-08-31" }
]
},
"aggs": {
"field_count": { "value_count": { "field": "myfield" } },
"order_count": { "sum": { "script": "doc[\"output_msgtype\"].value == \"order\" ? 1 : 0" } }
}
}
}
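To make the nesting explicit, here is a minimal Python sketch that builds the same kind of request body as a plain dict. The helper name `date_range_with_metrics` and the `avg_price` metric are assumptions made for illustration (the `price` field is borrowed from the search example further down); nothing here talks to a cluster.

```python
import json


def date_range_with_metrics(field, ranges, metrics):
    """Build an "aggs" section: a date_range agg with metric sub-aggs per bucket."""
    return {
        "range": {
            "date_range": {"field": field, "ranges": ranges},
            # Sub-aggregations sit next to "date_range", inside the "range" agg,
            # so they are computed once for every date bucket.
            "aggs": metrics,
        }
    }


aggs = date_range_with_metrics(
    "sold",
    [
        {"from": "2014-11-01", "to": "2014-11-30"},
        {"from": "2014-08-01", "to": "2014-08-31"},
    ],
    {
        "field_count": {"value_count": {"field": "myfield"}},
        # Hypothetical extra metric, only to show that several can be listed.
        "avg_price": {"avg": {"field": "price"}},
    },
)

# "size": 0 skips the hits and returns only the aggregation results.
print(json.dumps({"size": 0, "aggs": aggs}, indent=2))
```

Each bucket in the response should then carry one entry per metric name alongside doc_count.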
If you aren't looking for any sums, counts, averages on your data - then an aggregation isn't going to help.
I would instead run a standard query once per range. e.g.:
curl -XGET 'http://localhost:9200/test/cars/_search?pretty' -d '{
"fields" : ["price", "color", "make", "sold" ],
"query":{
"filtered": {
"query": {
"match_all" : { }
},
"filter" : {
"range": {"sold": {"gte": "2014-09-21T20:03:12.963","lte": "2014-09-24T20:03:12.963"}}}
}
}
}'
Repeat this query as needed, modifying the range each time.
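A small Python sketch of that loop, assuming the same old-style `filtered` query as the curl example (later Elasticsearch versions dropped `filtered` in favour of a `bool` query with a `filter` clause). The helper name `range_search_body` is made up for this example, and the bodies are only printed, not sent.

```python
import json


def range_search_body(field, start, end, fields):
    """Build one search body that returns documents whose `field` is in [start, end]."""
    return {
        "fields": fields,
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"range": {field: {"gte": start, "lte": end}}},
            }
        },
    }


# The two ranges from the question, one search per range.
ranges = [("2014-11-01", "2014-11-30"), ("2014-08-01", "2014-08-31")]
bodies = [
    range_search_body("sold", start, end, ["price", "color", "make", "sold"])
    for start, end in ranges
]

for body in bodies:
    print(json.dumps(body))
```

Note that `lte` includes the end date, whereas the `to` value of a date_range bucket is exclusive, so the two approaches can differ by the documents that fall exactly on the boundary.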
I have the following query for fetching all products. What I'm trying to achieve is to keep the out-of-stock products, i.e. products with stock_sum = 0, at the bottom:
{
"sort": [
{
"updated_at": {
"order": "desc"
}
}
],
"size": 10,
"from": 0,
"query": {
"bool": {
"should": [
{
"range": {
"stock_sum": {
"gte": 1,
"boost": 5
}
}
}
]
}
}
}
But with the above query, sort seems to completely override should, which I guess is how it is supposed to behave. One thing I tried is changing the should to must, but in that case the out-of-stock products are left out completely (that's not what I want; I still want the out-of-stock products at the bottom).
Another approach is to remove sort, and then the should query seems to have an effect, but again, I need the sort. So my question is: how do I get sort and the bool => should query to work in tandem, i.e. sort by updated_at but also keep the stock_sum = 0 products at the bottom?
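Using match_all and a constant_score query in the same should clause, and sorting first by _score ascending and then by updated_at descending, should work for your example. Every product matches match_all, and only out-of-stock products pick up the extra constant score, so they sort last. Here is an example query: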
{
"sort": [
{
"_score": {
"order": "asc"
}
},
{
"updated_at": {
"order": "desc"
}
}
],
"query": {
"bool": {
"should": [
{
"match_all": {}
},
{
"constant_score": {
"filter": {
"term": {
"stock_sum": 0
}
},
"boost": 10
}
}
]
}
}
}
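To see why this ordering works, here is a rough local simulation in Python. It assumes the should clauses simply add up (1.0 for match_all, plus 10.0 for the boosted constant_score when stock_sum is 0); the exact numbers Elasticsearch reports may differ, but the two groups stay separated. The sample products are invented.

```python
from datetime import date

# Invented sample data, only for illustrating the ordering.
products = [
    {"id": 1, "stock_sum": 0, "updated_at": date(2015, 3, 4)},
    {"id": 2, "stock_sum": 7, "updated_at": date(2015, 3, 1)},
    {"id": 3, "stock_sum": 2, "updated_at": date(2015, 3, 3)},
    {"id": 4, "stock_sum": 0, "updated_at": date(2015, 3, 2)},
]


def simulated_score(product):
    """Approximate the summed score of the two should clauses."""
    score = 1.0  # match_all matches every document
    if product["stock_sum"] == 0:
        score += 10.0  # constant_score clause with "boost": 10
    return score


# Python's sort is stable, so sort by the secondary key first
# (updated_at descending), then by the primary key (_score ascending).
by_date = sorted(products, key=lambda p: p["updated_at"], reverse=True)
ordered = sorted(by_date, key=simulated_score)

print([p["id"] for p in ordered])  # → [3, 2, 1, 4]
```

In-stock products (3, 2) come first, newest first, followed by the out-of-stock ones (1, 4), again newest first.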
We are applying aggregation and grouping, and need pagination for the results.
let body = {
size: item_per_page,
"query": {
"bool": {
"must": [{
"terms": {
"log_action_master_id": action_type
}
}, {
"match": {
[search_by]: searchParams.user_id
}
}, {
"match": {
"unit_id": searchParams.unit_id
}
},
{
"range": {
[search_date]: {
gte: from,
lte: to
}
}
}
]
}
},
"aggs": {
"group": {
"terms": {
"field": "id",
"size": item_per_page,
"order": { "_key": sortdirction }
},
},
"types_count": {
"value_count": {
"field": "id.keyword"
}
},
},
};
You can use one of the options below:
Composite aggregation: can combine multiple sources into a single set of buckets and allows pagination and sorting over them. It can only paginate linearly using after_key, i.e. you cannot jump from page 1 to page 3. You fetch "n" records, then pass the returned after_key back and fetch the next "n" records.
GET index22/_search
{
"size": 0,
"aggs": {
"ValueCount": {
"value_count": {
"field": "id.keyword"
}
},
"pagination": {
"composite": {
"size": 2,
"sources": [
{
"TradeRef": {
"terms": {
"field": "id.keyword"
}
}
}
]
}
}
}
}
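The follow-up request is the same body with the returned after_key placed in the composite aggregation's "after" parameter. A minimal Python sketch of that step; the helper name `next_page` and the hand-written `sample_response` are assumptions for illustration, and it assumes after_key is missing once no more buckets come back.

```python
import copy


def next_page(body, response, agg_name="pagination"):
    """Return the request body for the next page, or None if there is no after_key."""
    after_key = response["aggregations"][agg_name].get("after_key")
    if after_key is None:
        return None
    following = copy.deepcopy(body)  # leave the original body untouched
    following["aggs"][agg_name]["composite"]["after"] = after_key
    return following


first_body = {
    "size": 0,
    "aggs": {
        "pagination": {
            "composite": {
                "size": 2,
                "sources": [{"TradeRef": {"terms": {"field": "id.keyword"}}}],
            }
        }
    },
}

# Hand-written stand-in for a real response to first_body.
sample_response = {
    "aggregations": {
        "pagination": {
            "after_key": {"TradeRef": "id-2"},
            "buckets": [
                {"key": {"TradeRef": "id-1"}, "doc_count": 1},
                {"key": {"TradeRef": "id-2"}, "doc_count": 1},
            ],
        }
    }
}

second_body = next_page(first_body, sample_response)
print(second_body["aggs"]["pagination"]["composite"]["after"])
```

Calling `next_page` in a loop until it returns None walks through all pages in order.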
Include partition: groups the field's values into a number of partitions at query time and processes only one partition in each request. Terms are distributed evenly across the partitions, so you must know the number of terms beforehand to choose the number of partitions. You can use a cardinality aggregation to get that count.
GET index22/_search
{
"size": 0,
"aggs": {
"TradeRef": {
"terms": {
"field": "id.keyword",
"include": {
"partition": 0,
"num_partitions": 3
}
}
}
}
}
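A sketch of how the partition count could be derived from a cardinality result, in Python. The helper name `partition_bodies`, the example numbers, and the doubled "size" (headroom, on the assumption that the partitions are only approximately equal in size) are all choices made for this illustration.

```python
import math


def partition_bodies(field, distinct_terms, terms_per_page):
    """Yield one request body per partition, each covering roughly terms_per_page terms."""
    num_partitions = max(1, math.ceil(distinct_terms / terms_per_page))
    for partition in range(num_partitions):
        yield {
            "size": 0,
            "aggs": {
                "TradeRef": {
                    "terms": {
                        "field": field,
                        "include": {
                            "partition": partition,
                            "num_partitions": num_partitions,
                        },
                        # Assumption: leave headroom, since a partition may
                        # hold somewhat more terms than the average.
                        "size": terms_per_page * 2,
                    }
                }
            },
        }


# Suppose a cardinality aggregation reported about 7 distinct ids.
bodies = list(partition_bodies("id.keyword", distinct_terms=7, terms_per_page=3))
print(len(bodies))
```

With 7 terms and 3 terms per page this gives 3 partitions, matching the num_partitions value in the request above.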
Bucket sort aggregation: sorts the buckets of its parent multi-bucket aggregation. Buckets may be sorted by _key, _count or their sub-aggregations. It only applies to the buckets returned by the parent aggregation, so you need to set the terms size to 10,000 (the maximum value) and truncate the buckets in bucket_sort. You can then paginate using from and size, just like in a query. If you have more than 10,000 terms you won't be able to use it, since it only selects from the buckets returned by the terms aggregation.
GET index22/_search
{
"size": 0,
"aggs": {
"valueCount":{
"value_count": {
"field": "TradeRef.keyword"
}
},
"TradeRef": {
"terms": {
"field": "TradeRef.keyword",
"size": 10000
},
"aggs": {
"my_bucket": {
"bucket_sort": {
"sort": [
{
"_key": {
"order": "asc"
}
}
],
"from": 2,
"size": 1
}
}
}
}
}
}
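The from/size pair maps to a page number in the usual way. A tiny Python sketch, with a made-up helper name `bucket_sort_page` and 1-based page numbers as an assumption:

```python
def bucket_sort_page(page, per_page, order="asc"):
    """Build a bucket_sort clause for a 1-based page number."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be at least 1")
    return {
        "bucket_sort": {
            "sort": [{"_key": {"order": order}}],
            "from": (page - 1) * per_page,  # number of buckets to skip
            "size": per_page,               # number of buckets to keep
        }
    }


# Page 3 with one bucket per page reproduces "from": 2, "size": 1 from above.
clause = bucket_sort_page(page=3, per_page=1)
print(clause)
```

The returned dict would go under the terms aggregation's "aggs", in place of the "my_bucket" body.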
In terms of performance, composite aggregation is the better choice.
Currently I am trying to group documents by one field and then get the sum of other fields for each group. I want to get a new value which is the division of the summed fields. Here is my current query:
In my query I am aggregating on the field "a_name" and summing "spend" and "gain". I want to get a new field which would be the ratio of the sums (spend/gain).
I tried adding a script but I am getting NaN. Also, to use scripts in aggregations I first had to enable them in the elasticsearch.yml file:
script.engine.groovy.inline.aggs: on
Query
GET /index1/table1/_search
{
"size": 0,
"query": {
"filtered": {
"query": {
"query_string": {
"query": "*",
"analyze_wildcard": true
}
},
"filter": {
"bool": {
"must": [
{
"term": {
"account_id": 29
}
}
],
"must_not": []
}
}
}
},
"aggs": {
"custom_name": {
"terms": {
"field": "a_name"
},
"aggs": {
"spe": {
"sum": {
"field": "spend"
}
},
"gained": {
"sum": {
"field": "gain"
}
},
"rati": {
"sum": {
"script": "doc['spend'].value/doc['gain'].value"
}
}
}
}
}
}
This particular query shows 'NaN' in the output. If I replace the division with multiplication, the query works.
Essentially, what I am looking for is to divide my two aggregations, "spe" and "gained".
Thanks!
It is possible that gain is 0 in some of your documents, which makes the division produce NaN. You could try changing the script to this instead:
"script": "doc['gain'].value != 0 ? doc['spend'].value / doc['gain'].value : 0"
UPDATE
If you want to compute the ratio of the results of two other metric aggregations, you can do so using a bucket_script aggregation (only available from ES 2.0 onwards, though).
{
...
"aggs": {
"custom_name": {
"terms": {
"field": "a_name"
},
"aggs": {
"spe": {
"sum": {
"field": "spend"
}
},
"gained": {
"sum": {
"field": "gain"
}
},
"ratio": {
"bucket_script": {
"buckets_path": {
"totalSpent": "spe",
"totalGained": "gained"
},
"script": "totalSpent / totalGained"
}
}
}
}
}
}
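The two approaches in this answer do not compute the same number: a sum over a per-document script adds up individual ratios, while bucket_script divides one total by the other. A short Python illustration with invented documents for a single a_name bucket:

```python
# Invented documents belonging to one "a_name" bucket.
docs = [
    {"spend": 10.0, "gain": 2.0},
    {"spend": 20.0, "gain": 5.0},
    {"spend": 5.0, "gain": 0.0},  # gain of 0 is what breaks the unguarded script
]


def safe_ratio(spend, gain):
    """Mirror the guarded script: return 0 instead of dividing by zero."""
    return spend / gain if gain != 0 else 0.0


# What the "sum" aggregation with the guarded per-document script adds up.
sum_of_ratios = sum(safe_ratio(d["spend"], d["gain"]) for d in docs)

# What bucket_script computes: sum(spend) / sum(gain) for the bucket.
total_spend = sum(d["spend"] for d in docs)
total_gain = sum(d["gain"] for d in docs)
ratio_of_sums = safe_ratio(total_spend, total_gain)

print(sum_of_ratios)  # → 9.0
print(ratio_of_sums)  # → 5.0
```

If the goal is "spe" divided by "gained", the second figure is the one you want.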
I have a query like so:
{
"sort": [
{
"_geo_distance": {
"geo": {
"lat": 39.802763999999996,
"lon": -105.08748399999999
},
"order": "asc",
"unit": "mi",
"mode": "min",
"distance_type": "sloppy_arc"
}
}
],
"query": {
"bool": {
"minimum_number_should_match": 0,
"should": [
{
"match": {
"name": ""
}
},
{
"match": {
"credit": true
}
}
]
}
}
}
I want my search to always return ALL results, just sorted with those which have matching flags closer to the top.
I would like the sorting priority to go something like:
searchTerm (name, a string)
flags (credit/atm/ada/etc, boolean values)
distance
How can this be achieved?
So far, the query you see above is all I've gotten. I haven't been able to figure out how to always return all results, nor how to incorporate the additional queries into the sort.
I don't believe "sort" is the answer you are looking for, actually. I believe you need a trial-and-error approach, starting with a simple "bool" query where you put all your criteria (name, flags, distance). Then you give your name criterion the most weight (boost), a little less to your flags, and even less to the distance calculation.
A "bool" "should" gives you a list of documents sorted by the _score of each, and how you weight each criterion determines how much it influences the _score.
Also, returning ALL the elements is not difficult: just add a "match_all": {} to your "bool" "should" query.
This would be a starting point, from my point of view. Depending on your documents and your requirements (see my comment to your post about the confusion), you would need to adjust the "boost" values and test, then adjust and test again, and so on:
{
"query": {
"bool": {
"should": [
{ "constant_score": {
"boost": 6,
"query": {
"match": { "name": { "query": "something" } }
}
}},
{ "constant_score": {
"boost": 3,
"query": {
"match": { "credit": { "query": true } }
}
}},
{ "constant_score": {
"boost": 3,
"query": {
"match": { "atm": { "query": false } }
}
}},
{ "constant_score": {
"boost": 3,
"query": {
"match": { "ada": { "query": true } }
}
}},
{ "constant_score": {
"query": {
"function_score": {
"functions": [
{
"gauss": {
"geo": {
"origin": {
"lat": 39.802763999999996,
"lon": -105.08748399999999
},
"offset": "2km",
"scale": "3km"
}
}
}
]
}
}
}
},
{
"match_all": {}
}
]
}
}
}
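As a rough way to reason about the boost values before testing against real data, the scoring can be approximated locally. This Python sketch assumes each matching constant_score clause adds its boost and match_all adds 1.0; it ignores the distance clause and any normalisation Elasticsearch applies, and the sample documents are invented.

```python
BOOSTS = {"name": 6.0, "credit": 3.0, "atm": 3.0, "ada": 3.0}
WANTED_FLAGS = {"credit": True, "atm": False, "ada": True}
SEARCH_TERM = "something"


def simulated_score(doc):
    """Approximate the summed score of the matching should clauses."""
    score = 1.0  # match_all keeps every document in the result set
    if SEARCH_TERM in doc["name"].lower().split():
        score += BOOSTS["name"]
    for flag, wanted in WANTED_FLAGS.items():
        if doc[flag] == wanted:
            score += BOOSTS[flag]
    return score


# Invented documents: A matches only the name, B only the flags, C both.
docs = [
    {"id": "A", "name": "Something Cafe", "credit": False, "atm": True, "ada": False},
    {"id": "B", "name": "Other Place", "credit": True, "atm": False, "ada": True},
    {"id": "C", "name": "Something Bank", "credit": True, "atm": False, "ada": False},
]

ranking = [d["id"] for d in sorted(docs, key=simulated_score, reverse=True)]
print(ranking)  # → ['C', 'B', 'A']
```

With these values, B (three flag matches, no name match) outranks A (name match only), because three flags at 3 each outweigh the name boost of 6. If a name match should always win, the name boost has to exceed the sum of the flag boosts, which is the kind of adjustment the trial-and-error step is for.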
I have an index of documents that look like this:
{
url: "/foo/bar",
html_blocks: [
"<h1>hi</h1>"
],
tags: [
"video",
"text"
],
title: "My title"
}
I'd like to query these documents on the title and html_blocks fields, and for any matches add a boost if they have a video tag.
So far, my query looks like this:
{
"query": {
"query_string": {
"query": "foo",
"fields": [
"title",
"html_blocks"
]
}
}
}
How do I modify it so that it continues to only return results if a match is found in the existing query, but a boost is added to any of the results which have a video tag? Thanks!
You want a custom_filters_score, which will boost only on matches. Note that filter input is not analyzed, so you might wrap it in a query if you need it analyzed. Your other options for boosting, while not really suited to this case, are the boosting query, which is good for demoting results, and the custom_score query, which is good for adding boosts based on some calculated value.
See: Custom_filters_score
{
"query": {
"custom_filters_score": {
"query": {
"query_string": {
"query": "foo",
"fields": [
"title",
"html_blocks"
]
}
},
"filters": [
{
"filter": {
"term": {
"tags": "video"
}
},
"boost": 3
}
]
}
}
}
Edit:
This is what I mean by wrapping in a query using a filter query. Trust me, once you get the hang of ES, you'll be nested so knee deep that you'll produce some of the most satisfying queries ever.
{
"query": {
"custom_filters_score": {
"query": {
"query_string": {
"query": "foo",
"fields": [
"title",
"html_blocks"
]
}
},
"filters": [
{
"filter": {
//here comes the filter query, and I changed term to match
//since match analyzes
"query":{
"match": {
"tags": "video"
}
}
},
"boost": 3
}
]
}
}
}
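Later Elasticsearch releases replaced custom_filters_score with the function_score query. As a hedged sketch, the same idea could be expressed as below, built here as a Python dict. The helper name `boost_tagged` is made up, and the `weight` key is an assumption about the version in use (early 1.x releases used `boost_factor` for this).

```python
import json


def boost_tagged(query, tag_field, tag_value, weight):
    """Wrap `query` so that hits whose tag_field contains tag_value score higher."""
    return {
        "query": {
            "function_score": {
                # The inner query still decides which documents match at all.
                "query": query,
                "functions": [
                    {
                        "filter": {"term": {tag_field: tag_value}},
                        "weight": weight,
                    }
                ],
            }
        }
    }


body = boost_tagged(
    {"query_string": {"query": "foo", "fields": ["title", "html_blocks"]}},
    "tags",
    "video",
    3,
)
print(json.dumps(body, indent=2))
```

Documents without the video tag are still returned if they match the query_string; only their score is left unboosted.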