Using boolean fields in groovy script in elasticsearch - doc['field_name'].value not working - groovy

The problem
I am trying to use boolean fields in a script to score. It seems like doc['boolean_field'].value can't be manipulated as a boolean, but _source.boolean_field can (even though the Scripting documentation says "The native value of the field. For example, if its a short type, it will be short.").
What I've tried
I have a field called 'is_new'. This is the mapping:
PUT /test_index/test/_mapping
{
  "test": {
    "properties": {
      "is_new": {
        "type": "boolean"
      }
    }
  }
}
I have some documents:
PUT test_index/test/1
{
  "is_new": true
}
PUT test_index/test/2
{
  "is_new": false
}
I want to do a function_score query that will have a score of 1 if new, and 0 if not:
GET test_index/test/_search
{
  "query": {
    "function_score": {
      "query": {
        "match_all": {}
      },
      "functions": [
        {
          "script_score": {
            "script": "<<my script>>",
            "lang": "groovy"
          }
        }
      ],
      "boost_mode": "replace"
    }
  }
}
Scripts work when I use _source.is_new, but don't if I use doc['is_new'].value.
This works:
"if ( _source.is_new) {1} else {0}"
These don't work:
"if ( doc['is_new'].value) {1} else {0}" (always true)
"if ( doc['is_new'].value instanceof Boolean) {1} else {0}" (value isn't a Boolean)
"if ( doc['is_new'].value.toBoolean()) {1} else {0}" (always false)
"if ( doc['is_new']) {1} else {0}" (always true)
I've checked the value, and it thinks it is a string, but I can't do string comparison:
"if ( doc['is_new'].value instanceof String) {1} else {0}" (always true)
"if ( doc['is_new'].value == 'true') {1} else {0}" (always false)
"if ( doc['is_new'].value.equals('true')) {1} else {0}" (always false)
Is this broken, or am I doing it wrong? Apparently it is faster to use doc['field_name'].value, so if possible, it would be nice if this worked.
I am using Elasticsearch v1.4.4.
Thanks!
Isabel

I'm having the same issue on Elasticsearch v1.5.1: boolean values in the document show up as characters in my script, 'T' for true and 'F' for false:
if ( doc['is_new'].value == 'T') {1} else {0}

I've just got it!
First, it works only with _source.myField, not with doc['myField'].value.
I think there's a bug there because the toBoolean() method should return a boolean depending on the actual value, but it doesn't.
But I also needed to declare the mapping of the field explicitly as boolean and not_analyzed.
I hope it helps!
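For reference, here's a minimal sketch of the combination that worked, assuming the boolean mapping shown in the question and reading the field from _source rather than doc values:
GET test_index/test/_search
{
  "query": {
    "function_score": {
      "query": {
        "match_all": {}
      },
      "functions": [
        {
          "script_score": {
            "script": "if (_source.is_new) {1} else {0}",
            "lang": "groovy"
          }
        }
      ],
      "boost_mode": "replace"
    }
  }
}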

Related

How to use jq try with multiple conditions like a case?

I need to output in a single column the value of field A if it's not null, or the value of field B if not null, or nothing if both A and B are null.
When I just had field A to test, I wrote this, which worked OK:
try .fieldA catch ""
But now, as I need to take field B when A is null, I tried these, and nothing worked:
try .fieldA catch try .fieldB catch "" => this only ever returned empty values
try (.fieldA or .fieldB) catch "" => this one outputs true or false, 2 results instead of 1
try (.fieldA,.fieldB) catch "" => this one outputs both A and B if both are not null, so 2 results instead of 1
I'd like the try to stop evaluating whenever the first result is not null.
Thanks
Use the alternate operator //, which takes the second alternative if the first is null or false, and empty for the "nothing" result.
If accessing any field might fail, additionally use the optional operator ? on those fields.
{
  "fieldA": "A1 present",
  "fieldB": "B1 present",
  "fieldC": "irrelevant"
}
{
  "fieldB": "B2 present",
  "fieldC": "irrelevant"
}
{
  "fieldA": "A3 present",
  "fieldC": "irrelevant"
}
{
  "fieldC": "irrelevant"
}
jq '.fieldA // .fieldB // empty'
"A1 present"
"B2 present"
"A3 present"
Demo
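If some inputs might not be objects at all, the optional operator mentioned above keeps the filter from aborting. A quick sketch with a hypothetical array input such as [1, 2, 3]:
jq '.fieldA? // .fieldB? // empty'
This emits nothing for the array, whereas the same filter without ? would abort with an indexing error.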
Addressing @peak's "warning": If you want to capture a field that has the explicit boolean value false, while not capturing it if it is either missing or explicitly set to null, you can use values, which selects non-null values only, and first to retrieve the first one that exists among a given stream of alternatives:
{
  "fieldA": "A1 present",
  "fieldB": "B1 present",
  "fieldC": "irrelevant"
}
{
  "fieldA": false,
  "fieldB": "B2 present",
  "fieldC": "irrelevant"
}
{
  "fieldA": null,
  "fieldB": "B3 present",
  "fieldC": "irrelevant"
}
{
  "fieldB": "B4 present",
  "fieldC": "irrelevant"
}
{
  "fieldC": "irrelevant"
}
jq 'first(.fieldA, .fieldB | values)'
"A1 present"
false
"B3 present"
"B4 present"
Demo
Using this approach, there's no need to provide the explicit empty case. However, if you want to have a default fallback, add it as the last item in the stream, e.g. first(.fieldA, .fieldB, "" | values).
jq's // operator has quite complicated semantics (*), and in particular, when evaluating E // F, no distinction is made between E being null, false, or the empty stream. Also, given an object as input, the expression .x will evaluate to the JSON value null if the key "x" is explicitly specified as null or if the key "x" is not present at all.
Thus, assuming an object has been given as input, if we want .fieldA if .fieldA is non-null, and otherwise .fieldB if .fieldB is non-null, and otherwise the string "missing", then we would have to write something along the lines of:
if .fieldA != null then .fieldA
elif .fieldB != null then .fieldB
else "missing"
end
Simply replace "missing" by empty to achieve the objective stated in the original question.
(*) See the jq FAQ for complete details.
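To see the pitfall concretely, here's a quick sketch with a hypothetical input of {"fieldA": false, "fieldB": "B"}:
jq '.fieldA // .fieldB'
This outputs "B" even though fieldA is present, because // skips the false value; the if/elif version above outputs false instead.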

Logstash: Renaming nested fields based on some condition

I am trying to rename nested fields from Elasticsearch while migrating to Amazon Elasticsearch.
In each document, I want to make the following changes:
1. If the value field contains JSON, change the value field to value-keyword and remove "value-whitespace" and "value-standard" if present.
2. If the value field has a size of more than 15 characters, change the value field to value-standard.
"_source": {
"applicationid" : "appid",
"interactionId": "716bf006-7280-44ea-a52f-c79da36af1c5",
"interactionInfo": [
{
"value": """{"edited":false}""",
"value-standard": """{"edited":false}""",
"value-whitespace" : """{"edited":false}"""
"title": "msgMeta"
},
{
"title": "msg",
"value": "hello testing",
},
{
"title": "testing",
"value": "I have a text that can be done and changed only the size exist more than 20 so we applied value-standard ",
}
],
"uniqueIdentifier": "a21ed89c-b634-4c7f-ca2c-8be6f31ae7b3",
}
}
The end result should be:
"_source": {
  "applicationid": "appid",
  "interactionId": "716bf006-7280-44ea-a52f-c79da36af1c5",
  "interactionInfo": [
    {
      "value-keyword": """{"edited":false}""",
      "title": "msgMeta"
    },
    {
      "title": "msg",
      "value": "hello testing"
    },
    {
      "title": "testing",
      "value-standard": "I have a text that can be done and changed only the size exist more than 20 and so we applied value-standard "
    }
  ],
  "uniqueIdentifier": "a21ed89c-b634-4c7f-ca2c-8be6f31ae7b3"
}
For 2), you can do it like this:
filter {
  if [_source][interactionInfo][2][value] =~ /.{15,15}/ {
    mutate {
      rename => ["[_source][interactionInfo][2][value]", "[_source][interactionInfo][2][value-standard]"]
    }
  }
}
The regex .{15,15} matches any run of exactly 15 characters; since the match is unanchored, the condition is true for any value that is at least 15 characters long. If the field is shorter than that, the regex doesn't match and the mutate#rename isn't applied.
For 1), one possible solution would be trying to parse the field with the json filter and if there's no _jsonparsefailure tag, rename the field.
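A minimal sketch of that idea, for a single, fixed array index (index 0 here is an assumption; handling every element of interactionInfo needs a loop, such as the ruby filter below):
filter {
  json {
    source => "[interactionInfo][0][value]"
    target => "[@metadata][parsed]"
  }
  if "_jsonparsefailure" not in [tags] {
    mutate {
      rename => ["[interactionInfo][0][value]", "[interactionInfo][0][value-keyword]"]
      remove_field => ["[interactionInfo][0][value-whitespace]", "[interactionInfo][0][value-standard]"]
    }
  }
}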
Found the solution for this one. I used a ruby filter in Logstash to check each document, including its nested documents.
Here is the ruby code:
require 'json'

def register(param)
end

def filter(event)
  infoarray = event.get("interactionInfo")
  infoarray.each { |x|
    # Rename values longer than 15 characters
    if x.include?("value")
      value = x["value"]
      if value.length > 15
        apply_only_keyword(x)
      end
    end
    # Rename values that parse as JSON
    if x.include?("value")
      value = x["value"]
      if validate_json(value)
        apply_only_keyword(x)
      end
    end
  }
  event.set("interactionInfo", infoarray)
  return [event]
end

# Returns true only if the given string parses as JSON
def validate_json(value)
  if value.nil?
    return false
  end
  JSON.parse(value)
  return true
rescue JSON::ParserError => e
  return false
end

# Copies "value" to "value-keyword" and removes the old variants
def apply_only_keyword(x)
  x["value-keyword"] = x["value"]
  x.delete("value")
  if x.include?("value-standard")
    x.delete("value-standard")
  end
  if x.include?("value-whitespace")
    x.delete("value-whitespace")
  end
end
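For completeness, a sketch of how such a script file can be wired into the pipeline via the ruby filter's path option (the file path is hypothetical):
filter {
  ruby {
    # Points at the script file containing the register/filter functions above
    path => "/etc/logstash/rename_value_fields.rb"
  }
}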

id cannot be used in graphQL where clause?

{
  members {
    id
    lastName
  }
}
When I try to get the data from the members table, I get the following response:
{
  "data": {
    "members": [
      {
        "id": "TWVtYmVyOjE=",
        "lastName": "temp"
      },
      {
        "id": "TWVtYmVyOjI=",
        "lastName": "temp2"
      }
    ]
  }
}
However, when I try to update a row with an 'id' where clause, the console shows an error.
mutation {
  updateMembers(
    input: {
      values: {
        email: "testing@test.com"
      },
      where: {
        id: 3
      }
    }
  ) {
    affectedCount
    clientMutationId
  }
}
"message": "Unknown column 'NaN' in 'where clause'",
Some results from above confused me.
Why is the id returned not a numeric value? In the db, it is a number.
When updating the record, can I use a numeric id value in the where clause?
I am using nodejs, apollo-client and graphql-sequelize-crud.
TL;DR: check out my possibly not relay compatible PR here https://github.com/Glavin001/graphql-sequelize-crud/pull/30
Basically, the internal source code calls the fromGlobalId API from graphql-relay, but passes it a primitive value (e.g. your 3), causing it to return undefined. Hence I just removed the call from the source code and made a pull request.
P.S. This bug, which took me 2 hours to solve, failed in the build; I think this solution may not be consistent enough.
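For context, a small sketch of what those opaque ids are, using the graphql-relay helpers (the decoding shown follows the Relay global-ID convention):
// Relay-style global ids are base64("Type:id"), which is why the API
// returns "TWVtYmVyOjE=" rather than the numeric database id.
const { toGlobalId, fromGlobalId } = require('graphql-relay');

console.log(toGlobalId('Member', 1));
// => "TWVtYmVyOjE=" (base64 of "Member:1")

console.log(fromGlobalId('TWVtYmVyOjE='));
// => { type: 'Member', id: '1' }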
Please try this:
mutation {
  updateMembers(
    input: {
      values: {
        email: "testing@test.com"
      },
      where: {
        id: "3"
      }
    }
  ) {
    affectedCount
    clientMutationId
  }
}

DynamoDB : SET list_append not working using aws sdk

I need to append a string to a string set in a dynamodb table using the corresponding key. This is the update expression I use to do updateItem:
var params = {
  "TableName": tableName,
  "Key": {
    "ID": {
      S: "20000"
    }
  },
  "UpdateExpression": "SET #attrName = list_append(#attrName, :attrValue)",
  "ExpressionAttributeNames": {
    "#attrName": "entries"
  },
  "ExpressionAttributeValues": {
    ":attrValue": {"SS": ["000989"]}
  }
};
This works when I do updateItem() using the aws cli. But when using aws-sdk in nodejs, I am getting the error:
Invalid UpdateExpression: Incorrect operand type for operator or function; operator or function: list_append, operand type: M\n
Any help?
Thanks
list_append can be read as a "concatenate" operation. You just give it two lists.
"UpdateExpression" : "SET #attrName = list_append(#attrName, :attrValue)",
"ExpressionAttributeNames" : {
"#attrName" : "entries"
},
"ExpressionAttributeValues" : {
":attrValue" : ["000989"]
}
It's worth remembering that lists (and maps) in DynamoDB are not typed and can store arbitrary data.
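Note that the plain array form above works with the SDK's DocumentClient, which marshals native JavaScript values for you; the low-level client instead expects typed attribute values such as {"L": [{"S": "000989"}]}. A minimal sketch, assuming AWS SDK for JavaScript v2 and the table and key from the question:
var AWS = require('aws-sdk');
var docClient = new AWS.DynamoDB.DocumentClient();

var params = {
  TableName: tableName,
  Key: { ID: "20000" },
  UpdateExpression: "SET #attrName = list_append(#attrName, :attrValue)",
  ExpressionAttributeNames: { "#attrName": "entries" },
  ExpressionAttributeValues: { ":attrValue": ["000989"] }
};

// With the DocumentClient, the plain JS array is marshalled to a DynamoDB list
docClient.update(params, function(err, data) {
  if (err) console.error(err);
  else console.log(data);
});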
Side note: Armed with this knowledge, the documentation on appending to the beginning of the list now makes sense:
list_append (operand, operand)
This function evaluates to a list
with a new element added to it. You can append the new element to the
start or the end of the list by reversing the order of the operands.
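So, to prepend rather than append, the same update would become (a sketch reusing the names above):
"UpdateExpression": "SET #attrName = list_append(:attrValue, #attrName)"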
There's an accepted answer on this question which helped me with part of this issue. However, we'll typically want to update lists with additional objects, not strings. For this, I found it useful to avoid using ExpressionAttributeNames if possible.
1) Make sure the value in your item in your DynamoDB table is a list.
2) Make sure you pass in a list of objects (even if you only have one), not a simple object
UpdateExpression: "set pObj.cObj= list_append(pObj.cObj, :obj)",
ExpressionAttributeValues: {
":obj": [
{myObject: {
property1: '',
property2: '',
property3: '',
}}
]
},
I thought I'd just throw this out there as another option for adding or appending an "object" to a list. It's a map being added as an item to the list, and it worked well for me:
var upsertExpr = (obj.comments == undefined) ? " :attrValue" : "list_append(#attrName, :attrValue)";
var params = {
TableName: 'tableName',
Key: {
'id': {'S': id},
},
UpdateExpression : "SET #attrName = " + upsertExpr,
ExpressionAttributeNames : {
"#attrName" : "comments"
},
ExpressionAttributeValues : {
":attrValue" : {
"L": [
{ "M" :
{
"comment": {"S": comment},
"vote": {"N": vote.toString()}
}
}
]
}
}
};
Maybe this will help someone. I was struggling with updating a list and was getting the same error message as the original poster. I managed to solve my problem when I finally understood the documentation (see the Adding Elements To a List example here: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Expressions.UpdateExpressions.html#Expressions.UpdateExpressions.ADD)
Points to note are: 1) "list_append takes two lists as input, and appends the second list to the first," and 2) the value you bind in ExpressionAttributeValues is a list! Like this:
{
  ":vals": {
    "L": [
      { "S": "Screwdriver" },
      { "S": "Hacksaw" }
    ]
  }
}
Good luck!

remove objects from array elastic search

I need to remove objects from an array when they satisfy a condition. I am already able to update objects in the array based on a condition, as follows:
PUT twitter/twit/1
{
  "list": [
    {
      "tweet_id": "1",
      "a": "b"
    },
    {
      "tweet_id": "123",
      "a": "f"
    }
  ]
}
POST /twitter/twit/1/_update
{
  "script": "foreach (item : ctx._source.list) {
    if (item['tweet_id'] == tweet_id) {
      item['new_field'] = 'ghi';
    }
  }",
  "params": {"tweet_id": "123"}
}
This works.
For removal, I am doing this:
POST /twitter/twit/1/_update
{
  "script": "foreach (item : ctx._source.list) {
    if (item['tweet_id'] == tweet_id) {
      ctx._source.list.remove(item);
    }
  }",
  "params": {"tweet_id": "123"}
}
But this is not working, and gives this error:
ElasticsearchIllegalArgumentException[failed to execute script];
nested: ConcurrentModificationException; Error:
ElasticsearchIllegalArgumentException[failed to execute script];
nested: ConcurrentModificationException
I am able to remove whole array or whole field using
"script": "ctx._source.remove('list')"
I am also able to remove an object from the array by specifying all the keys of the object, using
"script": "ctx._source.list.remove(tag)",
"params": {
  "tag": {"tweet_id": "123", "a": "f"}
}
My elasticsearch node module version is 2.4.2, and the Elasticsearch server version is 1.3.2.
You get that because you are trying to modify a list while iterating over it: the script removes elements from the list while still looping over its items.
You instead need to do this:
POST /twitter/twit/1/_update
{
"script": "item_to_remove = nil; foreach (item : ctx._source.list) { if (item['tweet_id'] == tweet_id) { item_to_remove=item; } } if (item_to_remove != nil) ctx._source.list.remove(item_to_remove);",
"params": {"tweet_id": "123"}
}
If you have more than one item that matches the criteria, use a list instead:
POST /twitter/twit/1/_update
{
"script": "items_to_remove = []; foreach (item : ctx._source.list) { if (item['tweet_id'] == tweet_id) { items_to_remove.add(item); } } foreach (item : items_to_remove) {ctx._source.list.remove(item);}",
"params": {"tweet_id": "123"}
}
For people who need this working in Elasticsearch 2.0 and up: nil and foreach are not recognized by Groovy.
So here's an updated version, including an option to replace an item having the same id with a new object.
Also, passing the upsert will make sure the item gets added even if the document doesn't exist yet:
{
"script": "item_to_remove = null; ctx._source.delivery.each { elem -> if (elem.id == item_to_add.id) { item_to_remove=elem; } }; if (item_to_remove != null) ctx._source.delivery.remove(item_to_remove); if (item_to_add.size() > 1) ctx._source.delivery += item_to_add;",
"params": {"item_to_add": {"id": "5", "title": "New item"}},
"upsert": [{"id": "5", "title": "New item"}]
}
