Field Value Factor not working in Elasticsearch - search

GET test/_search?size=3
{
"query":{
"function_score":{
"query":{
"match_all":{
}
},
"functions":[
{
"field_value_factor":{
"field":"maxDocs",
"factor":1.2,
"modifier":"sqrt"
}
}
],
"score_mode":"multiply"
}
}
}
maxDocs has values 10, 20, 30, 40, 50, 60, etc. But when I run this query I am not getting the top 3 maxDocs values; I expect 40, 50, and 60 in the results.
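For reference, with a match_all query (score 1.0) and the default boost_mode of multiply, this function_score should give each document a score of sqrt(1.2 * maxDocs), which grows with maxDocs. Below is a minimal Python sketch of that arithmetic, using the sample values from the question. The helper name is hypothetical and nothing here talks to Elasticsearch.

```python
import math


def field_value_factor_sqrt(value, factor=1.2):
    """Hypothetical helper: the sqrt modifier applied to factor * field value."""
    return math.sqrt(factor * value)


# Sample maxDocs values taken from the question.
max_docs_values = [10, 20, 30, 40, 50, 60]

# match_all gives every document a query score of 1.0; the function result
# is multiplied into it (default boost_mode).
query_score = 1.0

scored = sorted(
    ((query_score * field_value_factor_sqrt(v), v) for v in max_docs_values),
    reverse=True,
)

# The three highest-scoring maxDocs values, best first.
top3 = [value for _, value in scored[:3]]
print(top3)
```

If the real results do not follow this ordering, it may be worth checking that maxDocs is mapped as a numeric field and is present on every document. That is only a guess, since the mapping is not shown.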

Related

How to calculate the amount with conditional?

I have such documents in MongoDB:
{
"_id":{
"$oid":"614e0f8fb2f4d8ea534b2ccb"
},
"userEmail":"abc#example.com",
"customId":"abc1",
"amountIn":10,
"amountOut":0,
"createdTimestamp":1632505743,
"message":"",
"status":"ERROR"
}
The amountOut field can be 0, or a positive numeric value.
I need to calculate a total that adds amountOut when it is positive and amountIn otherwise.
At the moment I am doing it like this:
query = {
    # Both bounds share one key; a repeated dict key would keep only the last value.
    'createdTimestamp': {'$gte': 1632430800, '$lte': 1632517200}
}
records = db.RecordModel.objects(__raw__=query).all()
total_amount = 0
for record in records:
    if record.amountOut > 0:
        total_amount += record.amountOut
    else:
        total_amount += record.amountIn
But this is very slow.
I know mongoengine has a sum method:
total_amount = db.PaymentModel.objects(__raw__=query).sum('amountIn')
But I don't know how to use the condition for this method.
Maybe there are some other ways to calculate the amount with the condition I need faster?
You can use mongoengine's aggregation API, which lets you run a regular MongoDB aggregation pipeline.
The following pipeline uses $cond to pick which field to add for each document:
query = {
'createdTimestamp': {'$gte': 1632430800, '$lte': 1632517200},
}
pipeline = [
{"$match": query},
{
"$group": {
"_id": None,
"total_amount": {
"$sum": {
"$cond": [
{
"$gt": [
"$amountOut",
0
]
},
"$amountOut",
"$amountIn"
]
}
}
}
}
]
records = db.RecordModel.objects().aggregate(pipeline)
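To make the $cond logic concrete, here is a small pure-Python sketch of what the $group stage computes, plus one way to read the single result document. The sample records and both function names are made up for illustration; the real pipeline runs inside MongoDB.

```python
def conditional_total(records):
    """Mirror of $sum over $cond: add amountOut when it is > 0, else amountIn."""
    return sum(
        r["amountOut"] if r["amountOut"] > 0 else r["amountIn"]
        for r in records
    )


def read_total(cursor):
    """Read total_amount from an aggregation result.

    Grouping with _id None yields at most one document; an empty
    match yields none, in which case 0 is returned.
    """
    first = next(iter(cursor), None)
    return first["total_amount"] if first else 0


# Hypothetical sample data, not taken from a real collection.
sample = [
    {"amountIn": 10, "amountOut": 0},
    {"amountIn": 5, "amountOut": 7},
    {"amountIn": 3, "amountOut": 0},
]

total = conditional_total(sample)
print(total)
```

In real code, read_total would be given the cursor returned by aggregate(pipeline); a plain list works the same way for testing.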

Considering multiple fields when searching in Elasticsearch

I want to implement a simple search query using Elasticsearch.
I have two fields, "title" and "description", that I would like to match the search term against. Currently, I use the search body shown below. How can I make the search prioritize title matches while still including description matches (with lower priority)? Thanks in advance.
body = {
size: 200,
from: 0,
query: {
prefix: {
title: searchTerm
}
}
}
You have to use a constant score query with a score of 0 for the "other" field. Other boost or function score approaches will not reliably rank one field above another, because scoring also depends on parameters such as text length; a constant boost (unless very large) cannot guarantee the behaviour you want.
By using a constant score you can control the score manually, like so:
{
size: 200,
from: 0,
query: {
bool: {
should: [
{
prefix: {
title: searchTerm
}
},
{
constant_score: {
filter: {
prefix: {
description: searchTerm
}
},
boost: 0
}
},
]
}
}
}
If you set the description boost to more than 0, the score becomes the combined score of both fields. That way you can rank documents that have the prefix in both fields above ones that have it only in the title field.
You can combine a bool/should clause with the boost parameter:
{
"query": {
"bool": {
"should": [
{
"prefix": {
"title": {
"value": "searchterm",
"boost": 4
}
}
},
{
"prefix": {
"description": {
"value": "searchterm"
}
}
}
]
}
}
}
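As a sketch of the bool/should approach, the helper below builds the request body with the larger boost on title, since the question asks for title matches to rank first. The function name and the default boost values are assumptions chosen for illustration, not recommended settings.

```python
def build_search_body(search_term, title_boost=4.0, description_boost=1.0, size=200):
    """Build a bool/should body where a title prefix match outweighs a description match.

    The boost defaults are illustrative; only their relative order matters here.
    """
    return {
        "size": size,
        "from": 0,
        "query": {
            "bool": {
                "should": [
                    {"prefix": {"title": {"value": search_term, "boost": title_boost}}},
                    {
                        "prefix": {
                            "description": {
                                "value": search_term,
                                "boost": description_boost,
                            }
                        }
                    },
                ]
            }
        },
    }


body = build_search_body("elas")
print(body["query"]["bool"]["should"])
```

Documents matching either clause are returned; the boost only affects how they are ordered.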

Multiple criteria within a single SHOULD query block

I'm hooking into an API which I have no control over and would like to extract all recipe entries which match certain criteria. For the most part, this is a simple 'does value equal N' check; however, for one of these criteria I also have to check whether another value is greater than 0.
This code works absolutely fine:
should: [
{ match: { 'ItemResult.ItemAction.Type': 853 } },
{ match: { 'ItemResult.ItemAction.Type': 1013 } },
{ match: { 'ItemResult.ItemAction.Type': 1322 } },
{ match: { 'ItemResult.ItemAction.Type': 5845 } }
]
It gives me all recipe entries whose 'ItemResult.ItemAction.Type' is either 853, 1013, 1322 or 5845, as expected. The problem comes when I add this new, more complex condition to my should array:
range: {
'ItemResult.ItemAction.Type': { gte: 5100, lte: 5300 },
'ItemResult.ItemAction.Data0': { gt: 0 }
}, ...
Each individual range property works fine, but naturally I'm getting the following error when both are combined like they are above:
"reason":"[range] query doesn\'t support multiple fields
Is there a way I can have both ranges considered within the same query without impacting the other ItemResult.ItemAction.Type values?
Obviously I can hook into the API a second time to perform the more complex criterion search, but I'm wondering if I can do it all in the one call.
{
"query": {
"bool": {
"must": [
{
"range": {
"ItemResult.ItemAction.Type": {
"gte": 5100,
"lte": 5300
}
}
},
{
"range": {
"ItemResult.ItemAction.Data0": {
"gt": 0
}
}
}
]
}
}
}
The Elasticsearch range query doesn't support multiple fields in a single clause, but you can use the query above to apply multiple range conditions: each range gets its own clause inside bool/must.
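To do it all in one call, the bool/must block can itself be placed as one entry of the should array, since bool queries can be nested. A minimal Python sketch of building such a body follows; the function name and its parameters are hypothetical, and any other clauses of the asker's real query are left out.

```python
def build_recipe_query(exact_types, type_range, data0_min):
    """Combine exact Type matches with a nested two-range condition in one should array."""
    # One match clause per exact Type value, as in the question.
    should = [{"match": {"ItemResult.ItemAction.Type": t}} for t in exact_types]

    # The complex criterion: both ranges must hold together, so they are
    # wrapped in their own bool/must and added as a single should entry.
    low, high = type_range
    should.append(
        {
            "bool": {
                "must": [
                    {"range": {"ItemResult.ItemAction.Type": {"gte": low, "lte": high}}},
                    {"range": {"ItemResult.ItemAction.Data0": {"gt": data0_min}}},
                ]
            }
        }
    )
    return {"query": {"bool": {"should": should}}}


query = build_recipe_query([853, 1013, 1322, 5845], (5100, 5300), 0)
print(len(query["query"]["bool"]["should"]))
```

Because the two ranges sit inside their own bool, they do not affect the four plain match clauses.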

Elasticsearch: aggregation with interval hour

I have ~100 documents coming in per hour. Every document has a viewers property (integer).
By the end of the day I want to aggregate an array of 24 documents, one for every hour of the day, represented by the document with the highest viewers count.
My query so far:
// query, fetch all documents of a specific day
var query = {
bool : {
filter : [
{
range : {
'created' : {
gte : day,
lte : day + (60 * 60 * 24)
}
}
}
]
}
}
// aggregation
var aggs = {
// ?
}
I think this could be achieved using Date Histogram plus Top Hits aggregations. Look at this:
{
"size": 0,
"aggs": {
"articles_over_time": {
"date_histogram": {
"field": "created",
"interval": "hour"
},
"aggs": {
"Viewers": {
"top_hits": {
"size": 1,
"sort": [
{
"viewers": {
"order": "desc"
}
}
]
}
}
}
}
}
}
The Date Histogram aggregation creates a bucket for each hour in the filtered date range, and the Top Hits aggregation returns the document with the highest viewers count in each bucket (documents are sorted by viewers in descending order and only the top hit is kept).
Let me know if this works.
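As an illustration of what the two aggregations compute together, here is a pure-Python sketch that groups documents into hourly buckets and keeps the one with the most viewers in each. It assumes created holds epoch seconds, as the arithmetic in the question suggests; the function name and sample documents are made up.

```python
def top_per_hour(docs):
    """Return, for each hourly bucket, the document with the highest viewers count."""
    best = {}
    for doc in docs:
        # Start of the hour this document falls into (epoch seconds assumed).
        bucket = doc["created"] - doc["created"] % 3600
        if bucket not in best or doc["viewers"] > best[bucket]["viewers"]:
            best[bucket] = doc
    # Buckets in chronological order, like a date histogram.
    return [best[key] for key in sorted(best)]


# Hypothetical sample: two documents in the first hour, two in the second.
docs = [
    {"created": 0, "viewers": 5},
    {"created": 1800, "viewers": 9},
    {"created": 3600, "viewers": 2},
    {"created": 7199, "viewers": 1},
]

hourly = top_per_hour(docs)
print([d["viewers"] for d in hourly])
```

Unlike a date histogram, this sketch produces no entry for an hour that has no documents.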

Query the number of elements matching a filter using elastic.js?

I'm building a leaderboard with elasticsearch. I'd like to query all documents that have points greater than a given amount using the following query:
{
"constant_score" : {
"filter" : {
"range" : {
"totalPoints" : {
"gt": 242
}
}
}
}
}
This works perfectly -- elasticsearch appropriately returns all documents with points greater than 242. However, all I really need is the count of elements matching this query. Since I'm sending the result over the network, it would be helpful if the query simply returned the count, as opposed to all of the documents matching the filter.
How do I get elasticsearch to only report the count of documents matching the filter?
EDIT: I've learned that what I'm looking for is setting search_type to count. However, I'm not sure how to do this with elastic.js. Any noders willing to pitch in their advice?
You can use the search type count for exactly that purpose:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-search-type.html#count
This is an example that should help you:
GET /mymusic/itunes/_search?search_type=count
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"range": {
"year": {
"gt": 2000
}
}
}
}
}
}
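On newer Elasticsearch versions, where search_type=count is no longer available (it was deprecated in 2.0 in favor of size=0), a similar result can be had by sending size 0 and reading hits.total from the response. Below is a minimal sketch in Python rather than elastic.js; both function names are hypothetical, and actually sending the request is left out.

```python
def build_count_body(field, greater_than):
    """Build a search body that matches the range filter but returns no documents."""
    return {
        "size": 0,  # no hits in the response, only the total
        "query": {
            "constant_score": {
                "filter": {"range": {field: {"gt": greater_than}}}
            }
        },
    }


def extract_total(response):
    """Read the match count from a search response.

    hits.total is a plain integer on older versions and an object
    with a "value" key on 7.x and later; both shapes are handled.
    """
    total = response["hits"]["total"]
    return total["value"] if isinstance(total, dict) else total


body = build_count_body("totalPoints", 242)
print(body)
```

The response then carries only the count and some metadata, so very little data crosses the network.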
