Logstash - How to copy a field into an array

I am using Logstash 5.6.
In my document, I have a subfield "[emailHeaders][reingested-on]" and another field called [attributes], which contains several subfields ([string], [double]), each of which is an array:
{
"emailHeaders": {
"reingested-on": ["1613986076000"]
},
"attributes": {
"string": [
{
"name": "attributeString1",
"value": "attributeStringValue1"
},
{
"name": "attributeString2",
"value": "attributeStringValue2"
}
],
"double": [
{
"name": "attributeDouble1",
"value": 1.0
}
]
}
}
If the element [emailHeaders][reingested-on] is present in the document, I want to copy 1613986076000 (i.e. the first element of [emailHeaders][reingested-on]) into [attributes][date], as follows:
{
"emailHeaders": {
"reingested-on": ["1613986076000"]
},
"attributes": {
"string": [
{
"name": "attributeString1",
"value": "attributeStringValue1"
},
{
"name": "attributeString2",
"value": "attributeStringValue2"
}
],
"double": [
{
"name": "attributeDouble1",
"value": 1.0
}
],
"date": [
{
"name": "Reingested on",
"value": 1613986076000
}
]
}
}
Note that if [attributes][date] already exists, and already contains an array of name/value pairs, I want my new object to be appended to the array.
Also, note that [attributes][date] is an array of objects that each hold a date in their [value] attribute, as per the mapping of my Elasticsearch index:
...
"attributes": {
"properties": {
...
"date": {
"type": "nested",
"properties": {
"id": {"type": "keyword"},
"name": {"type": "keyword"},
"value": {"type": "date"}
}
},
...
}
},
...
I tried the following Logstash configuration, with no success:
filter {
# See https://stackoverflow.com/questions/30309096/logstash-check-if-field-exists : this is supposed to allow to "test" if [#metadata][reingested-on] exists
mutate {
add_field => { "[#metadata][reingested-on]" => "None" }
copy => { "[emailHeaders][reingested-on][0]" => "[#metadata][reingested-on]" }
}
if [#metadata][reingested-on] != "None" {
# See https://stackoverflow.com/questions/36127961/append-array-of-json-logstash-elasticsearch: I create a temporary [error] field, and I try to append it to [attributes][date]
mutate {
add_field => { "[error][name]" => "Reingested on" }
add_field => { "[error][value]" => "[#metadata][reingested-on]" }
}
mutate {
merge => {"[attributes][date]" => "[error]"}
}
}
}
But what I get is:
{
"emailHeaders": {
"reingested-on": ["1613986076000"]
},
"error": {
"name": "Reingested on",
"value": "[#metadata][reingested-on]"
},
"attributes": {
"string": [
{
"name": "attributeString1",
"value": "attributeStringValue1"
},
{
"name": "attributeString2",
"value": "attributeStringValue2"
}
],
"double": [
{
"name": "attributeDouble1",
"value": 1.0
}
]
}
}
My temporary [error] object is created, but its value is wrong: it should be 1613986076000 instead of the literal string [#metadata][reingested-on].
Also, it is not appended to the array [attributes][date]. In this example, this array does not exist, so I want it to be created with my temporary object as its first element, as per the expected result above.
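To make the expected behaviour precise, here is a minimal sketch of the transformation in plain Python. It only models the desired result and is not a Logstash configuration. The function name append_reingested_date is hypothetical, and the sketch assumes [emailHeaders][reingested-on] is a list of numeric strings, as in the sample document.

```python
import copy


def append_reingested_date(doc):
    """Return a copy of doc with the first [emailHeaders][reingested-on]
    value appended to [attributes][date] as a name/value object."""
    result = copy.deepcopy(doc)
    # Assumption: the field, when present, is a list of numeric strings.
    values = result.get("emailHeaders", {}).get("reingested-on")
    if not values:
        # Field absent or empty: leave the document unchanged.
        return result
    entry = {"name": "Reingested on", "value": int(values[0])}
    # setdefault creates [attributes] and [attributes][date] when missing;
    # otherwise the existing array is reused and the entry is appended.
    result.setdefault("attributes", {}).setdefault("date", []).append(entry)
    return result
```

Called on the sample document above, this returns a copy whose [attributes][date] holds a single object with name "Reingested on" and the numeric value 1613986076000. If [attributes][date] already contains entries, the new object is added after them.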

Related

How to add multiple conditions in JSON schema?

I have the below JSON schema
{
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "array",
"items": {
"type": "object",
"properties": {
"op": {
"type": "string",
"minLength": 1,
"enum": [
"add",
"remove",
"replace"
]
},
"path": {
"type": "string",
"minLength": 1,
"enum": [
"/name",
"/description",
"/prefix"
]
},
"value": {
"type": "string",
"minLength": 1
}
},
"additionalProperties": false,
"required": [
"op",
"path",
"value"
],
"minItems": 1,
"allOf": [
{
"if" : {
"properties": {
"path" : {
"const": "/name"
}
}
},
"then": {
"properties": {
"op": {
"const": "replace"
}
}
}
},
{
"if" : {
"properties": {
"path" : {
"const": "/description"
}
}
},
"then": {
"properties": {
"op": {
"const": "replace"
}
}
}
}
]
}
}
As shown above, when path is /name or /description, op must be replace. For the /prefix path all operations are allowed (add, remove and replace), but I want a special condition to apply to the remove operation.
For example, if path is xFix and op is remove, the required properties should not include the value attribute, only op and path.
I think you'll want to turn the logic of that around. Do not include "value" in the "required" list. Instead, add a condition that says, basically, "if op is not 'remove', then value is required".
{
"if" : {
"not": {
"properties": {
"op" : {
"const": "remove"
}
}
}
},
"then": {
"required": ["value"]
}
}
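The effect of that condition on individual items can be illustrated with a small hand-written check in Python. This is only a sketch of the rule, not a JSON Schema validator: the function name check_patch_item is hypothetical, and the enum and minLength constraints from the schema are deliberately left out.

```python
def check_patch_item(item):
    """Return a list of error messages for one patch item.

    Mirrors the conditional from the answer: "value" is required
    unless op is "remove"; "op" and "path" are always required.
    """
    errors = [f"missing required property: {key}"
              for key in ("op", "path") if key not in item]
    # "if": {"not": {"properties": {"op": {"const": "remove"}}}}
    # holds only when op is present and differs from "remove".
    op_is_not_remove = "op" in item and item["op"] != "remove"
    if op_is_not_remove and "value" not in item:
        errors.append("missing required property: value")
    return errors
```

With this rule, {"op": "remove", "path": "/prefix"} produces no errors, while {"op": "add", "path": "/prefix"} is reported as missing value.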

Prepare List from Different input Arrays and Objects in Jolt

Hi, I am new to JOLT transformations and I am trying to transform something like the input below.
The main goal is to build a list of objects without hard-coding constant indexes in the Jolt spec, that is, to transform the different objects into one common list.
Any help is appreciated.
The data provided here is an example of what I expect.
Input :
{
"CIT": [
{
"name": "name_CIT_1",
"desc": "desc_CIT_1"
},
{
"name": "name_CIT_2",
"desc": "desc_CIT_2"
},
{
"name": "name_CIT_3",
"desc": "desc_CIT_3"
}
],
"BIT": {
"name": "name_BIT",
"desc": "desc_BIT"
},
"NIT": {
"name": "name_NIT",
"desc": "desc_NIT"
},
"KIT": {
"name": "name_KIT",
"desc": "desc_KIT"
}
}
Jolt:
[
{
"operation": "modify-default-beta",
"spec": {
"*": {}
}
},
{
"operation": "modify-overwrite-beta",
"spec": {}
},
{
"operation": "shift",
"spec": {
"CIT": {
"*": {
"name": "CollegeList[0].name",
"desc": "CollegeList[0].desc"
}
},
"BIT": {
"name": "CollegeList[1].name",
"desc": "CollegeList[1].desc"
},
"NIT": {
"name": "CollegeList[2].name",
"desc": "CollegeList[2].desc"
},
"KIT": {
"name": "CollegeList[3].name",
"desc": "CollegeList[3].desc"
}
}
}
]
Output:
{
"CollegeList" : [ {
"name" : [ "name_CIT_1", "name_CIT_2", "name_CIT_3" ],
"desc" : [ "desc_CIT_1", "desc_CIT_2", "desc_CIT_3" ]
}, {
"name" : "name_BIT",
"desc" : "desc_BIT"
}, {
"name" : "name_NIT",
"desc" : "desc_NIT"
}, {
"name" : "name_KIT",
"desc" : "desc_KIT"
} ]
}
Expected Output:
{
"CollegeList": [
{
"name": "name_CIT_1",
"desc": "desc_CIT_1"
},
{
"name": "name_CIT_2",
"desc": "desc_CIT_2"
},
{
"name": "name_CIT_3",
"desc": "desc_CIT_3"
},
{
"name": "name_BIT",
"desc": "desc_BIT"
},
{
"name": "name_NIT",
"desc": "desc_NIT"
},
{
"name": "name_KIT",
"desc": "desc_KIT"
}
]
}
You can use two levels of shift transformations. The first level already produces the desired array, except that it sits at the root by default instead of under a key. The second level then only renames the array's key, such as
[
{
"operation": "shift",
"spec": {
"*": "&1"
}
},
{
"operation": "shift",
"spec": {
"#(0,&)": "CollegeList"
}
}
]
Another approach to the same result:
[
{
"operation": "shift",
"spec": {
"CIT": {
"*": "CollegeList[]"
},
"*": "CollegeList[]"
}
}
]
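For comparison, the flattening that these specs aim for can be written outside Jolt in a few lines of Python. This is a sketch of the target logic only, not a Jolt spec; the function name build_college_list is hypothetical, and it assumes every top-level value is either one object or a list of objects, as in the sample input.

```python
def build_college_list(data):
    """Collect every object in data into one list, in key order.

    Values that are lists contribute each of their elements;
    single objects contribute themselves.
    """
    colleges = []
    for value in data.values():
        if isinstance(value, list):
            colleges.extend(value)
        else:
            colleges.append(value)
    return {"CollegeList": colleges}
```

Applied to the sample input, this yields the three CIT entries followed by BIT, NIT and KIT, with no index written by hand.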

Using JSON descriptors how to define an array in gRPC?

I'm using JSON descriptors instead of the proto format. Everything works except for the array of Todo: I need an array of Todos.
How do I define that? I set "type": "array", but it always returns the error:
'Error: no such Type or Enum 'array' in Type .Todos'
My json file is like this:
const todo = {
"nested": {
"Services": {
"methods": {
"createTodo": {
"requestType": "Todo",
"requestStream": false,
"responseType": "Todo",
"responseStream": false
},
"readTodos": {
"requestType": "voidNoParam",
"requestStream": false,
"responseType": "Todos",
"responseStream": false
},
"readTodosStream": {
"requestType": "voidNoParam",
"requestStream": false,
"responseType": "Todo",
"responseStream": true
}
}
},
"Todo": {
"fields": {
"id": {
"type": "int32",
"id": 1
},
"text": {
"type": "string",
"id": 2
}
}
},
"Todos": {
"fields": {
"items": {
"type": "array",
"id": 1
}
}
},
"voidNoParam": {
"fields": {}
}
}
}
module.exports = todo
I found the problem, and it was really simple: the field needs "rule": "repeated", with the message type Todo as its "type".
"Todos": {
"fields": {
"items": {
"rule": "repeated",
"type": "Todo",
"id": 1
}
}
},

Token Comma expected - can not run JSON query

This is a problem I have while working in Excel's Power Query.
I have this query saved in a variable named "content", which is passed to the Web.Contents call.
I cannot run the query; I get a "Token Comma expected" error. Can somebody tell me what that is about?
`let
content = "{
"query": [
{
"code": "Region",
"selection": {
"filter": "vs:RegionKommun07",
"values": [
"1283"
]
}
},
{
"code": "Sysselsattning",
"selection": {
"filter": "item",
"values": [
"FÖRV"
]
}
},
{
"code": "Alder",
"selection": {
"filter": "item",
"values": [
"30-34"
]
}
},
{
"code": "Kon",
"selection": {
"filter": "item",
"values": [
"1"
]
}
},
{
"code": "Tid",
"selection": {
"filter": "item",
"values": [
"2015"
]
}
}
],
"response": {
"format": "px"
}
}",
Source = Json.Document(Web.Contents("http://api.scb.se/OV0104/v1/doris/sv/ssd/START/AM/AM0207/AM0207H/BefSyssAldKonK", [Content=Text.ToBinary(content)]))
in
Source`
If you want " inside a quoted string then you need to double them up like "" to escape them.
let
content = "{
""query"": [
{
""code"": ""Region"",
""selection"": {
""filter"": ""vs:RegionKommun07"",
""values"": [
""1283""
]
}
},
...
...
}"
See page 21 here: http://download.microsoft.com/download/8/1/A/81A62C9B-04D5-4B6D-B162-D28E4D848552/Power%20Query%20M%20Formula%20Language%20Specification%20(July%202019).pdf
To include quotes in a text value, the quote mark is repeated, as
follows: "The ""quoted"" text" // The "quoted" text
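When the JSON body is long, doubling the quotes by hand is error-prone, so the escaped literal can be generated instead. The following is a small Python sketch of that idea; the helper name to_m_text_literal is hypothetical, and it handles only the quote character, not M's other escape sequences.

```python
import json


def to_m_text_literal(text):
    """Quote text as a Power Query M text literal by doubling each quote."""
    return '"' + text.replace('"', '""') + '"'


# A shortened request body, used only for illustration.
body = json.dumps({"response": {"format": "px"}})
print(to_m_text_literal(body))
# prints "{""response"": {""format"": ""px""}}"
```

The printed text can be pasted after content = in the M query.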

ElasticSearch query stops working with big amount of data

The problem: I have two indexes that are identical in terms of settings and mappings.
The first index contains only 1 document.
The second index contains the same document plus 16M others.
When I run the query on the first index it returns the document, but when I run the same query on the second I receive nothing.
Index settings:
{
"tasks_test": {
"settings": {
"index": {
"analysis": {
"analyzer": {
"tag_analyzer": {
"filter": [
"lowercase",
"tag_filter"
],
"tokenizer": "whitespace",
"type": "custom"
}
},
"filter": {
"tag_filter": {
"type": "word_delimiter",
"type_table": "# => ALPHA"
}
}
},
"creation_date": "1444127141035",
"number_of_replicas": "2",
"number_of_shards": "5",
"uuid": "wTe6WVtLRTq0XwmaLb7BLg",
"version": {
"created": "1050199"
}
}
}
}
}
Mappings:
{
"tasks_test": {
"mappings": {
"Task": {
"dynamic": "false",
"properties": {
"format": "dateOptionalTime",
"include_in_all": false,
"type": "date"
},
"is_private": {
"type": "boolean"
},
"last_timestamp": {
"type": "integer"
},
"name": {
"analyzer": "tag_analyzer",
"type": "string"
},
"project_id": {
"include_in_all": false,
"type": "integer"
},
"user_id": {
"include_in_all": false,
"type": "integer"
}
}
}
}
}
The document:
{
"_index": "tasks_test",
"_type": "Task",
"_id": "1",
"_source": {
"is_private": false,
"name": "135548- test with number",
"project_id": 2,
"user_id": 1
}
}
The query:
{
"query": {
"filtered": {
"query": {
"bool": {
"must": [
[
{
"match": {
"_all": {
"query": "135548",
"type": "phrase_prefix"
}
}
}
]
]
}
},
"filter": {
"bool": {
"must": [
{
"term": {
"is_private": false
}
},
{
"terms": {
"project_id": [
2
]
}
},
{
"terms": {
"user_id": [
1
]
}
}
]
}
}
}
}
}
Also, some findings:
If I replace _all with name, everything works.
If I replace match_phrase_prefix with match_phrase, it works too.
ES version: 1.5.1
So, the question is: how can I make the query work on the second index without the hacks mentioned above?
