Unable to rename/copy the root field name in logstash - logstash

Here is my sample config (LS 7.9):
input {
  jdbc { ... }
}
filter {
  json {
    # It's a JSON field from the DB; only two keys are included for reference.
    source => "tags_json"
    # Need it as a sub-field like tags.companies, tags.geographies in ES
    add_field => {
      "[tags][companies]" => "%{companies}"
      "[tags][geographies]" => "%{geographies}"
    }
  }
}
output {
  elasticsearch { ... }
}
JSON structure in the DB field tags_json:
{
  "companies": ["ABC","XYZ"],
  "geographies": [{"Market": "Group Market", "Region": "Group Region", "Country": "my_country"}],
  "xyz": []...
}
Logstash prints the root geographies field correctly; this is what I need as a sub-field under tags.
"geographies" => [
[0] {
"Market" => "Group Market",
"Region" => "Group Region"
},
## But as sub-field under the tags, only geographies is nil
"tags" => {
"companies" => [
[0] "ABC",
[1] "XYZ"
],
"geographies" => nil
}
I tried copy and ruby as below, but they don't seem to fix it:
mutate { copy => { "%{geographies}" => "[tags][geographies]"} }
I also tried Ruby:
ruby { code => " event.set('[tags][geographies]', event.get('%{geographies}')) " }
Any help please. Thanks.

Resolved it with a ruby filter using the event API:
ruby {
  code => 'event.set("[tags][geographies]", event.get("geographies"))'
}

Related

logstash - Conditionally converts field types

I inherited a logstash config as follows. I do not want to make major changes to it because I do not want to break anything that is working. The metrics are sent as logs with JSON in the format "metric": "metricname", "value": "int". This has been working great. However, there is a requirement to have a string in value for a new metric. It is not really a metric, but indicates the state of the processing as a string. Based on the following filter, it converts everything to integer, and any string in value will be converted to 0. The requirement is that if the value is a string, it shouldn't attempt to convert. Thank you!
input {
  beats {
    port => 5044
  }
}
filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:ts} - M_%{DATA:task}_%{NUMBER:thread} - INFO - %{GREEDYDATA:jmetric}" }
    remove_field => [ "message", "ecs", "original", "agent", "log", "host", "path" ]
    break_on_match => false
  }
  if "_grokparsefailure" in [tags] {
    drop {}
  }
  date {
    match => ["ts", "ISO8601"]
    target => "@timestamp"
  }
  json {
    source => "jmetric"
    remove_field => "jmetric"
  }
  split {
    field => "points"
    add_field => {
      "metric" => "%{[points][metric]}"
      "value" => "%{[points][value]}"
    }
    remove_field => [ "points", "event", "tags", "ts", "stream", "input" ]
  }
  mutate {
    # one convert hash for both fields; this is the unconditional conversion in question
    convert => {
      "value" => "integer"
      "thread" => "integer"
    }
  }
}
You should use index mappings for this, mainly.
Even if you handle things in logstash, elasticsearch will - if configured with the defaults - do dynamic mapping, which may work against any configuration you do in logstash.
See Elasticsearch index templates:
An index template is a way to tell Elasticsearch how to configure an index when it is created.
...
Index templates can contain a collection of component templates, as well as directly specify settings, mappings, and aliases.
Mappings are per index! This means that when you apply a new mapping, you will have to create a new index. You can "rollover" to a new index, or delete / import your data again. What you do depends on your data, how you receive it, etc. ymmv...
No matter what, if your index has the wrong mapping you will need to create a new index to get the new mapping.
PS! If you have a lot of legacy data, take a look at the reindex API for elasticsearch.
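As an added example that is not part of either post above: the question asks for the integer conversion to be skipped when the value is a string, and the answer points out that Elasticsearch's dynamic mapping can undo whatever Logstash decides. The Ruby below (the language of the ruby filter) does two things. coerce_metric_value converts only integer-like values and moves everything else to a separate field, so a single Elasticsearch field is never asked to hold both a long and a string; the field name value_text is a choice made here. simulate_dynamic_mapping is a simplified, offline model of default dynamic mapping (the first value seen fixes the type; a non-numeric string sent to a long field is rejected, a number sent to a string field is stored as text), run over constructed sample events to show that converting conditionally but keeping one field still loses the third document.

```ruby
# Added example, not from the thread. Keeps the integer conversion for numeric
# metrics and routes non-numeric values to a separate string field.
INTEGER_RE = /\A-?\d+\z/

# Mirrors one event after split{}, as a plain Hash. "value_text" is a name
# chosen for this example.
def coerce_metric_value(event, field: "value", text_field: "value_text")
  v = event[field]
  return event if v.nil?
  if v.is_a?(Integer) || (v.is_a?(String) && v.match?(INTEGER_RE))
    event[field] = v.is_a?(Integer) ? v : v.to_i
  else
    event[text_field] = v.to_s       # processing-state strings land here
    event.delete(field)
  end
  event
end

# Simplified model of Elasticsearch dynamic mapping with default settings:
# the first value seen fixes a field's type. A later non-numeric string sent
# to a "long" field is rejected (mapper_parsing_exception); a later number
# sent to a string field is simply stored as text. Returns [mapping, rejected].
def simulate_dynamic_mapping(docs)
  mapping = {}
  rejected = []
  docs.each_with_index do |doc, i|
    ok = doc.all? do |k, v|
      seen = v.is_a?(Integer) ? "long" : "text"
      mapping[k] ||= seen
      mapping[k] == "text" || seen == "long" || v.to_s.match?(INTEGER_RE)
    end
    rejected << i unless ok
  end
  [mapping, rejected]
end

# Constructed sample: two numeric metrics and one processing-state string, as
# they look after split{} and before mutate{}.
RAW_METRICS = [
  { "metric" => "queue_depth", "value" => "42" },
  { "metric" => "latency_ms",  "value" => "118" },
  { "metric" => "stage",       "value" => "processing" },
].freeze

# Converting only when numeric, as the question asks, but leaving the string in
# the same field: Logstash is happy, Elasticsearch rejects the third document.
def naive_conditional_docs
  RAW_METRICS.map do |d|
    v = d["value"]
    d.merge("value" => v.match?(INTEGER_RE) ? v.to_i : v)
  end
end

# The same events through coerce_metric_value: nothing is rejected.
def split_docs
  RAW_METRICS.map { |d| coerce_metric_value(d.dup) }
end

if __FILE__ == $PROGRAM_NAME
  p simulate_dynamic_mapping(naive_conditional_docs)   # rejected: [2]
  p simulate_dynamic_mapping(split_docs)               # rejected: []
end
```

In the pipeline, coerce_metric_value would be the body of a ruby filter in place of the mutate { convert } for value, with value mapped as long and value_text as keyword in the index template, as the answer recommends; the template is still needed, because the simulation only shows what the defaults do without it.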

move json fields to root - logstash [duplicate]

I have logstash input that looks like this:
{
  "@timestamp": "2016-12-20T18:55:11.699Z",
  "id": 1234,
  "detail": {
    "foo": 1,
    "bar": "two"
  }
}
I would like to merge the content of "detail" with the root object so that the final event looks like this:
{
  "@timestamp": "2016-12-20T18:55:11.699Z",
  "id": 1234,
  "foo": 1,
  "bar": "two"
}
Is there a way to accomplish this without writing my own filter plugin?
You can do this with a ruby filter.
filter {
  ruby {
    code => "
      # Logstash 5+ event API; the older event['detail'] hash-style access is gone.
      event.get('detail').each { |k, v|
        event.set(k, v)
      }
      event.remove('detail')
    "
  }
}
There is a simple way to do that using the json_encode plugin (not included by default).
The json extractor adds fields to the root of the event. It's one of the very few extractors that can add things to the root.
filter {
  json_encode {
    source => "detail"
    target => "detail"
  }
  json {
    source => "detail"
    remove_field => [ "detail" ]
  }
}
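An added example covering both the first thread on this page (copying root fields under [tags]) and the one just above (merging a nested hash into the root). It is not code from either thread. The StubEvent class is a stand-in for the Logstash Event API, built here so the code runs outside Logstash; only get/set/remove/include? and the [a][b] field-reference syntax are mimicked. nest_under does what the accepted ruby answer to the first question does, for a list of fields; flatten_to_root is the merge from this question with two guards that neither answer has and that are my addition: it skips @timestamp, @version, @metadata and tags (a list chosen for this example), and it leaves an existing root field alone unless asked to overwrite, so a payload cannot replace the event's own id or timestamp.

```ruby
require 'json'

# Stand-in for the Logstash Event API (get/set/remove/include? with "[a][b]"
# field references), constructed here so the filter code runs outside Logstash.
# Inside a ruby { } filter the real event object is supplied instead.
class StubEvent
  def initialize(data)
    @data = data
  end

  def to_hash
    @data
  end

  # "[tags][geographies]" -> ["tags", "geographies"]; "geographies" -> ["geographies"]
  def self.path(ref)
    parts = ref.scan(/\[([^\]]+)\]/).flatten
    parts.empty? ? [ref] : parts
  end

  def get(ref)
    StubEvent.path(ref).reduce(@data) { |node, key| node.is_a?(Hash) ? node[key] : nil }
  end

  def set(ref, value)
    *parents, leaf = StubEvent.path(ref)
    node = parents.reduce(@data) do |n, key|
      n[key] = {} unless n[key].is_a?(Hash)   # create intermediate hashes on demand
      n[key]
    end
    node[leaf] = value
  end

  def remove(ref)
    *parents, leaf = StubEvent.path(ref)
    node = parents.reduce(@data) { |n, key| n.is_a?(Hash) ? n[key] : nil }
    node.delete(leaf) if node.is_a?(Hash)
  end

  def include?(ref)
    !get(ref).nil?
  end
end

# Both filter functions use only event.get/set/remove/include?, the methods
# the ruby filter's event exposes.

# First thread: move fields the json filter left at the root under [tags].
# event.get returns the structured value (arrays and hashes intact), which the
# "%{geographies}" sprintf string used in add_field cannot do.
def nest_under(event, parent, fields)
  fields.each do |f|
    next unless event.include?(f)
    event.set("[#{parent}][#{f}]", event.get(f))
    event.remove(f)
  end
  event
end

# Third thread: merge a nested hash into the root. Keys that would clobber
# fields the pipeline relies on are skipped (list chosen for this example; tags
# must stay an array for add_tag), and an existing root field wins unless
# overwrite: true is passed.
PROTECTED_ROOT_FIELDS = %w[@timestamp @version @metadata tags].freeze

def flatten_to_root(event, field, overwrite: false)
  nested = event.get(field)
  return event unless nested.is_a?(Hash)
  nested.each do |k, v|
    next if PROTECTED_ROOT_FIELDS.include?(k)
    next if event.include?(k) && !overwrite
    event.set(k, v)
  end
  event.remove(field)
  event
end

# Constructed sample for the first thread: the DB column that json {} parses.
TAGS_JSON = '{"companies": ["ABC","XYZ"], ' \
            '"geographies": [{"Market": "Group Market", "Region": "Group Region", "Country": "my_country"}], ' \
            '"xyz": []}'

def demo_nest
  ev = StubEvent.new({ "id" => 1, "tags_json" => TAGS_JSON })
  # What json { source => "tags_json" } does: every top-level key lands at the root.
  JSON.parse(ev.get("tags_json")).each { |k, v| ev.set(k, v) }
  nest_under(ev, "tags", %w[companies geographies]).to_hash
end

# Constructed sample for the third thread; "detail" carries keys that collide
# with the root so the guards are exercised.
def demo_flatten
  ev = StubEvent.new({
    "@timestamp" => "2016-12-20T18:55:11.699Z",
    "id" => 1234,
    "detail" => { "foo" => 1, "bar" => "two", "id" => 9, "@timestamp" => "1970-01-01T00:00:00Z" }
  })
  flatten_to_root(ev, "detail").to_hash
end

if __FILE__ == $PROGRAM_NAME
  puts JSON.pretty_generate(demo_nest)
  puts JSON.pretty_generate(demo_flatten)
end
```

The json_encode round trip above reaches the root the same way, but it has no say over collisions: whatever keys the nested object carries win. With untrusted producers, the guard list is the part worth keeping.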

Retrieving RESTful GET parameters in logstash

I am trying to get logstash to parse key-value pairs in an HTTP GET request from my ELB log files.
The request field looks like
http://aaa.bbb/get?a=1&b=2
I'd like there to be a field for a and b in the log line above, and I am having trouble figuring it out.
My logstash conf (formatted for clarity) is below; it does not load any additional key fields. I assume that I need to split off the address portion of the URI, but have not figured that out.
input {
  file {
    path => "/home/ubuntu/logs/**/*.log"
    type => "elb"
    start_position => "beginning"
    sincedb_path => "log_sincedb"
  }
}
filter {
  if [type] == "elb" {
    grok {
      match => [ "message", "%{TIMESTAMP_ISO8601:timestamp}
        %{NOTSPACE:loadbalancer} %{IP:client_ip}:%{NUMBER:client_port:int}
        %{IP:backend_ip}:%{NUMBER:backend_port:int}
        %{NUMBER:request_processing_time:float}
        %{NUMBER:backend_processing_time:float}
        %{NUMBER:response_processing_time:float}
        %{NUMBER:elb_status_code:int}
        %{NUMBER:backend_status_code:int}
        %{NUMBER:received_bytes:int} %{NUMBER:sent_bytes:int}
        %{QS:request}" ]
    }
    date {
      match => [ "timestamp", "ISO8601" ]
    }
    kv {
      field_split => "&?"
      source => "request"
      exclude_keys => ["callback"]
    }
  }
}
output {
  elasticsearch { host => localhost }
}
kv will take a URL and split out the params. This config works:
input {
  stdin { }
}
filter {
  mutate {
    add_field => { "request" => "http://aaa.bbb/get?a=1&b=2" }
  }
  kv {
    field_split => "&?"
    source => "request"
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
stdout shows:
{
  "request" => "http://aaa.bbb/get?a=1&b=2",
        "a" => "1",
        "b" => "2"
}
That said, I would encourage you to create your own versions of the default URI patterns so that they set fields. You can then pass the querystring field off to kv. It's cleaner that way.
UPDATE:
For "make your own patterns", I meant to take the existing ones and modify them as needed. In logstash 1.4, installing them was as easy as putting them in a new file in the 'patterns' directory; I don't know about patterns for >1.4 yet.
MY_URIPATHPARAM %{URIPATH}(?:%{URIPARAM:myuriparams})?
MY_URI %{URIPROTO}://(?:%{USER}(?::[^@]*)?@)?(?:%{URIHOST})?(?:%{MY_URIPATHPARAM})?
Then you could use MY_URI in your grok{} pattern and it would create a field called myuriparams that you could feed to kv{}.
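An added example that is not from the question or the answer: the same request URL handled in Ruby (the language of the ruby filter), with the URL parsed first and only the query string split into parameters, which is what the answer's MY_URI pattern plus kv{} achieves in grok. Two things the kv-over-the-whole-URL config does not do are shown: percent-decoding of values, and namespacing the parameters under one target field so a client sending ?type=...&host=... cannot create or overwrite top-level event fields that later filters or outputs key on. The names query and uri_path, the callback exclusion copied from the question, and the 50-parameter cap are choices made here; the sample requests are constructed.

```ruby
require 'uri'

# Added example, not from the thread. Parses the query string out of the ELB
# request URL and returns the parameters under one target field.
# "query", "uri_path" and max_params are choices made for this example.
def parse_request_params(request, target: "query", exclude: ["callback"], max_params: 50)
  uri = URI.parse(request)
  params = {}
  # decode_www_form handles &-splitting, "+" and %XX decoding; the cap bounds
  # the number of fields a single request line can create.
  URI.decode_www_form(uri.query.to_s).first(max_params).each do |k, v|
    next if k.empty? || exclude.include?(k)
    # Repeated keys become arrays, as the kv filter does.
    params[k] = params.key?(k) ? Array(params[k]) << v : v
  end
  { target => params, "uri_path" => uri.path }
rescue URI::InvalidURIError, ArgumentError
  # Malformed URL or non-ASCII query in the log line: refuse to guess.
  nil
end

# Constructed sample request fields. The second one carries a parameter named
# "type", a percent-encoded value, a repeated key and the excluded "callback".
SAMPLE_REQUESTS = [
  "http://aaa.bbb/get?a=1&b=2",
  "http://aaa.bbb/get?type=admin&a=%27%20OR%201%3D1&a=2&callback=x",
  "http://aaa.bbb/ge t?a=1",
].freeze

if __FILE__ == $PROGRAM_NAME
  SAMPLE_REQUESTS.each { |r| p parse_request_params(r) }
end
```

In a pipeline this sits in a ruby filter after the grok that extracts request, with event.set("[query]", ...) instead of the returned Hash; the kv filter reaches the same isolation with its target option, and exclude_keys already covers the callback case from the question.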

Logstash: how to add file name as a field?

I'm using Logstash + Elasticsearch + Kibana to have an overview of my Tomcat log files.
For each log entry I need to know the name of the file from which it came. I'd like to add it as a field. Is there a way to do it?
I've googled a little and I've only found this SO question, but the answer is no longer up-to-date.
So far the only solution I see is to specify a separate configuration for each possible file name with a different "add_field", like so:
input {
  file {
    type => "catalinalog"
    path => [ "/path/to/my/files/catalina**" ]
    add_field => { "server" => "prod1" }
  }
}
But then I need to reconfigure logstash each time there is a new possible file name.
Any better ideas?
Hi, I added a grok filter to do just this. I only wanted to have the filename, not the path, but you can change this to your needs.
filter {
  grok {
    match => ["path","%{GREEDYDATA}/%{GREEDYDATA:filename}\.log"]
  }
}
In case you would like to combine the message and file name in one event:
filter {
  grok {
    match => {
      message => "ERROR (?<function>[\S]*)"
    }
  }
  grok {
    match => {
      path => "%{GREEDYDATA}/%{GREEDYDATA:filename}\.log"
    }
  }
}
The result in Elasticsearch (focus on the 'filename' and 'function' fields):
{
  "_index": "logstash-2016.08.03",
  "_type": "logs",
  "_id": "AVZRyEI49-A6kyBCq6Yt",
  "_score": 1,
  "_source": {
    "message": "27/07/16 12:16:18,321 ERROR blaaaaaaaaa.internal.com",
    "@version": "1",
    "@timestamp": "2016-08-03T19:01:33.083Z",
    "path": "/home/admin/mylog.log",
    "host": "my-virtual-machine",
    "function": "blaaaaaaaaa.internal.com",
    "filename": "mylog"
  }
}

Negative regexp in logstash configuration

I cannot get negative regexp expressions working within Logstash (as described in the docs).
Consider the following positive regex which works correctly to detect fields that have been assigned a value:
if [remote_ip] =~ /(.+)/ {
mutate { add_tag => ["ip"] }
}
However, the negative expression seems to return false even when the field is blank:
if [remote_ip] !~ /(.+)/ {
mutate { add_tag => ["no_ip"] }
}
Am I misunderstanding the usage?
Update - this was fuzzy thinking on my part. There were issues with my config file. If the rest of your config file is sane, the above should work.
This was fuzzy thinking on my part - there were issues with the rest of my config file.
Based on Ben Lim's example, I came up with an input that is easier to test:
input {
  stdin { }
}
filter {
  if [message] !~ /(.+)/ {
    mutate { add_tag => ["blank_message"] }
  }
  if [noexist] !~ /(.+)/ {
    mutate { add_tag => ["tag_does_not_exist"] }
  }
}
output {
  stdout {debug => true}
}
The output for a blank message is:
{
     "message" => "",
    "@version" => "1",
  "@timestamp" => "2014-02-27T01:33:19.285Z",
        "host" => "benchmark.example.com",
        "tags" => [
    [0] "blank_message",
    [1] "tag_does_not_exist"
  ]
}
The output for a message with the content "test message" is:
test message
{
     "message" => "test message",
    "@version" => "1",
  "@timestamp" => "2014-02-27T01:33:25.059Z",
        "host" => "benchmark.example.com",
        "tags" => [
    [0] "tag_does_not_exist"
  ]
}
Thus, the "negative regex" /(.+)/ returns true only when the field is empty or the field does not exist.
The negative regex /(.*)/ will only return true when the field does not exist. If the field exists (whether empty or with values), the return value will be false.
Below is my configuration. The type field does not exist, therefore the negative expression returns true.
input {
  stdin {
  }
}
filter {
  if [type] !~ /(.+)/ {
    mutate { add_tag => ["aa"] }
  }
}
output {
  stdout {debug => true}
}
The regexp /(.+)/ means it accepts everything, including blank. So, when the "type" field exists, even if the field value is blank, it still meets the regexp. Therefore, in your example, if the remote_ip field exists, your "negative expression" will always return false.
