How to add/update data at nested level in existing Index using Logstash mutate plugin - logstash

I have multiple Logstash pipelines set up on a server that feed data into an index. Every pipeline adds a bunch of fields at the first level of the index, along with their nested levels.
I already have kpi1 and kpi2 values inside metrics => data, with metrics being a nested array. I now have a requirement to add a new pipeline that will feed the value of kpi3. Here is the filter section of the new pipeline that I created:
filter {
  ruby {
    code => "
      event.set('kpi3', event.get('scoreinvitation'))
    "
  }
  mutate {
    # Rename the properties according to the document schema.
    rename => { "kpi3" => "[metrics][data][kpi3]" }
  }
}
It overwrites the metrics section (maybe because it is an array?). Here is my mapping:
"metrics" : {
"type" : "nested",
"properties" : {
"data" : {
"properties" : {
"kpi1" : {
....
}
}
}
"name" : {
"type" : "text",
....
}
}
}
How can I keep the existing fields (and values) and still add new fields inside metrics => data? Any help is appreciated.

The Logstash pipeline looks good; however, your mapping doesn't make much sense to me, if I'm understanding your requirement correctly.
The metrics property doesn't have to be of type nested. In fact, the metrics property is just a JSON namespace that contains sub-fields/objects.
Try the following mapping instead:
"metrics": {
"properties": {
"data": {
"properties": {
"kpi1": {
# if you want to assign a value to the kpi1 field, it must have a type
}
}
},
"name": {
"type": "text"
}
}
}
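For illustration, here is a hypothetical complete version of that mapping; the index name my_index and the float type for the KPI fields are placeholders of mine, so adjust them to your data:

PUT my_index
{
  "mappings": {
    "properties": {
      "metrics": {
        "properties": {
          "data": {
            "properties": {
              "kpi1": { "type": "float" },
              "kpi2": { "type": "float" },
              "kpi3": { "type": "float" }
            }
          },
          "name": { "type": "text" }
        }
      }
    }
  }
}

With metrics as a plain object, kpi3 lands alongside the existing kpi1 and kpi2 under metrics.data.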

Related

Logstash mutate copy field not available in filter scope?

I'd like to access a field that was copied in my filter block; however, it appears the value isn't set at that point, or that I can't access it.
When the same conditional logic is in my output block, it works as expected.
Here is a sample of the "json" field after the json filter block. The original input message contains a "message" field that is correctly parsed, as shown below.
{
  "json": {
    "groups": [
      "vdos.all.hosts.virtualmachine",
      "vdos.all.compute.all"
    ],
    "itemid": 1632807,
    "name": "Memory Guest Usage Percentage[\"X001\"]",
    "clock": 1642625307,
    "ns": 723739588,
    "value": 4.992676,
    "type": 0
  }
}
Logstash config
filter {
  json {
    source => "message"
    target => "json"
  }
  mutate {
    copy => { "[json][groups]" => "host_groups" }
  }
  if "vdos.all.compute.all" not in "%{[host_groups]}" {
    drop {}
  }
}
I've tried

if "vdos.all.compute.all" not in "[host_groups]" {
  drop {}
}

as well as trying to access the json field directly:

if "vdos.all.compute.all" not in "[json][groups]" {
  drop {}
}
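One thing worth checking, though this is an assumption on my part rather than a confirmed fix: in Logstash conditionals, anything in double quotes is a literal string, so "[host_groups]" and "%{[host_groups]}" make the "in" operator do a substring test on that text (sprintf references are not expanded inside conditionals). An unquoted field reference makes "in" test membership in the array:

filter {
  json {
    source => "message"
    target => "json"
  }
  # Unquoted field reference: "in" checks whether the array
  # [json][groups] contains the given string.
  if "vdos.all.compute.all" not in [json][groups] {
    drop {}
  }
}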

Mongo text search with AND operation for multiple words partially entered

{
  "TypeList" : [
    {
      "TypeName" : "Carrier"
    },
    {
      "TypeName" : "Not a Channel Member"
    },
    {
      "TypeName" : "Service Provider"
    }
  ]
}
Query:

db.supplies.find({ $text: { $search: "\"chann\" \"mem\"" } })

For the above query I want to display:

{
  "TypeName" : "Not a Channel Member"
}

But I am unable to get this result. What changes do I have to make to the query? Please help me.
The below query will return your desired result.
db.supplies.aggregate([
  { $unwind: "$TypeList" },
  { $match: { "TypeList.TypeName": { $regex: /.*chann.*mem.*/, $options: "i" } } },
  { $project: { _id: 0, TypeName: "$TypeList.TypeName" } }
])
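Run against the sample document above, that pipeline unwinds TypeList into one document per array element, keeps only the elements whose TypeName matches the case-insensitive regex, and projects just the name, so it returns:

{ "TypeName" : "Not a Channel Member" }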
If you can accept getting an output like this:

{
  "TypeList" : [
    {
      "TypeName" : "Not a Channel Member"
    }
  ]
}

then you can avoid the aggregation framework altogether, which generally helps performance, by running the following query:
db.supplies.find(
  {
    "TypeList.TypeName": /chann.*mem/i
  },
  { // project the list in the following way
    "_id": 0,     // do not include the "_id" field in the output
    "TypeList": { // only include the items from the TypeList array...
      $elemMatch: { // ...where
        "TypeName": /chann.*mem/i // the "TypeName" field matches the regular expression
      }
    }
  }
)
Also see this link: Retrieve only the queried element in an object array in MongoDB collection

Analyzing Apache access logs with Elasticsearch Watcher

I am using the ELK Stack to analyze logs, and I need to analyze and detect anomalies in Apache access logs. What can I analyze with Apache access logs, and how should I give the conditions to Watcher with curl -XPUT?
If you haven't found it already, there's a decent tutorial at https://www.elastic.co/guide/en/watcher/watcher-1.0/watch-log-data.html. It provides a basic example of creating a log watch.
You can analyze/watch anything that you can query in Elasticsearch. It's just a matter of formatting the query with the correct JSON syntax. The guide for crafting the conditions is at https://www.elastic.co/guide/en/watcher/watcher-1.0/condition.html.
You'll also want to look at https://www.elastic.co/guide/en/watcher/watcher-1.0/actions.html to get an idea of the possible actions Watcher can take when a query meets a condition.
As for posting to Watcher: each watch is essentially a JSON object. Because watches can get pretty elaborate, I have found that it's best to create a file for each watch you want to create, and post them like this:
curl -XPUT http://my_elasticsearch:9200/_watcher/watch/my_watch_name -d @/path/to/my_watch_name.json
my_watch_name.json should have these basic elements (as described in the first link above):
{
  "trigger" : { ... },
  "input" : { ... },
  "condition" : { ... },
  "actions" : { ... }
}
The actions section is going to be specific to your use case, but here's a basic example of the other sections that I'm using successfully:
{
  "trigger" : {
    "schedule" : { "interval" : "5m" }
  },
  "input" : {
    "search" : {
      "request" : {
        "indices" : [ "logstash" ],
        "body" : {
          "query" : {
            "filtered" : {
              "query" : {
                "match" : { "message" : "error" }
              },
              "filter" : {
                "range" : { "@timestamp" : { "gte" : "now-5m" } }
              }
            }
          }
        }
      }
    }
  },
  "condition" : {
    "compare" : { "ctx.payload.hits.total" : { "gt" : 0 } }
  },
  "actions" : {
    ...
  }
}
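To make the skeleton concrete, here is one possible actions section using Watcher's logging action; the action name log_error and the message text are placeholders of mine:

"actions" : {
  "log_error" : {
    "logging" : {
      "text" : "Found {{ctx.payload.hits.total}} events matching 'error' in the last 5 minutes"
    }
  }
}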

elasticsearch predictive search solution

I am trying to get a predictive drop-down search. How can I make the search always match from left to right, like in the example "I_kimchy park", "park"?
If I search only "par", I want to get only "park" in return, but here I am getting both documents. How do I treat the empty space as a character?
POST /test1
{
  "settings": {
    "analysis": {
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [ "standard", "lowercase", "stop", "kstem", "edgeNgram", "whitespace" ]
        }
      },
      "filter": {
        "ngram": {
          "type": "edgeNgram",
          "min_gram": 2,
          "max_gram": 15,
          "token_chars": [ "letter", "digit" ]
        }
      }
    }
  }
}
PUT /test1/tweet/_mapping
{
  "tweet" : {
    "properties" : {
      "user": {
        "type": "string",
        "index_analyzer": "autocomplete",
        "search_analyzer": "autocomplete"
      }
    }
  }
}
POST /test1/tweet/1
{ "user" : "I_kimchy park" }

POST /test1/tweet/3
{ "user" : "park" }

GET /test1/tweet/_search
{
  "query": {
    "match_phrase_prefix": {
      "user": "park"
    }
  }
}
That happens because your standard tokenizer splits your user field on whitespace. You can use the keyword tokenizer to treat the whole string as a single value (a single token).
Please keep in mind that this change may affect other of your functionalities that use this field. You may have to add a dedicated "not tokenized" user field for this purpose.
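As a minimal sketch, assuming the same test1 index and the 1.x-era edgeNGram filter name from the question (later versions call it edge_ngram), the settings could look like this:

POST /test1
{
  "settings": {
    "analysis": {
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": [ "lowercase", "autocomplete_filter" ]
        }
      },
      "filter": {
        "autocomplete_filter": {
          "type": "edgeNGram",
          "min_gram": 2,
          "max_gram": 15
        }
      }
    }
  }
}

Because the keyword tokenizer emits the whole string as one token, "I_kimchy park" is indexed only as prefixes of the full string ("i_", "i_k", ...), so a search for "par" now matches only the "park" document.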

Elasticsearch two level sort in aggregation list

Currently I am sorting aggregations by document score, so the most relevant items come first in the aggregation list, like below:
{
  'aggs' : {
    'guilds' : {
      'terms' : {
        'field' : 'guilds.title.original',
        'order' : [{'max_score' : 'desc'}],
        'aggs' : {
          'max_score' : {
            'script' : 'doc.score'
          }
        }
      }
    }
  }
}
I want to add another sort option to the terms order array in my JSON, but when I do that, like this:

'order' : [{'max_score' : 'desc'}, {'_count' : 'desc'}]

the second sort does not work. For example, when all of the scores are equal, it should then fall back to the second criterion, but it does not.
As a correction to Andrei's answer: to order aggregations by multiple criteria, you MUST create an array as shown in Terms Aggregation: Order, and you MUST be using Elasticsearch 1.5 or later.
So, for Andrei's answer, the correction is:
"order" : [ { "max_score": "desc" }, { "_count": "desc" } ]
As Andrei has it, ES will not complain, but it will ONLY use the last item listed in the "order" element.
I don't know how your 'aggs' is even working, because I tried it and got parsing errors in three places: "order" is not allowed to have that array structure; your second "aggs" should be placed outside the first "terms" aggs; and, finally, the "max_score" aggs should have a "max" type of aggs. In my case, to make it work (and it does actually order properly), it should look like this:
"aggs": {
"guilds": {
"terms": {
"field": "guilds.title.original",
"order": {
"max_score": "desc",
"_count": "desc"
}
},
"aggs": {
"max_score": {
"max": {
"script": "doc.score"
}
}
}
}
}
