Conditionally convert field types in logstash

I inherited the logstash config below. I do not want to make major changes to it because I do not want to break anything that is working. The metrics are sent as logs containing JSON in the format "metric": "metricname", "value": "int". This has been working great. However, there is now a requirement to send a string in value for a new metric. It is not really a metric; it indicates the state of the processing as a string. The filter below converts everything to integer, so any string in value is converted to 0. The requirement is that if the value is a string, the filter should not attempt to convert it. Thank you!
input {
  beats {
    port => 5044
  }
}
filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:ts} - M_%{DATA:task}_%{NUMBER:thread} - INFO - %{GREEDYDATA:jmetric}"}
    remove_field => [ "message", "ecs", "original", "agent", "log", "host", "path" ]
    break_on_match => false
  }
  if "_grokparsefailure" in [tags] {
    drop {}
  }
  date {
    match => ["ts", "ISO8601"]
    target => "@timestamp"
  }
  json {
    source => "jmetric"
    remove_field => "jmetric"
  }
  split {
    field => "points"
    add_field => {
      "metric" => "%{[points][metric]}"
      "value" => "%{[points][value]}"
    }
    remove_field => [ "points", "event", "tags", "ts", "stream", "input" ]
  }
  mutate {
    convert => { "value" => "integer" }
    convert => { "thread" => "integer" }
  }
}

You should mainly use index mappings for this.
Even if you handle things in logstash, Elasticsearch will, if configured with the defaults, do dynamic mapping, which may work against any configuration you do in logstash.
See the Elasticsearch documentation on index templates:
An index template is a way to tell Elasticsearch how to configure an index when it is created.
...
Index templates can contain a collection of component templates, as well as directly specify settings, mappings, and aliases.
Mappings are per index! This means that when you apply a new mapping, you will have to create a new index. You can "rollover" to a new index, or delete and re-import your data. What you do depends on your data, how you receive it, etc. YMMV...
No matter what, if your index has the wrong mapping you will need to create a new index to get the new mapping.
PS! If you have a lot of legacy data, take a look at the reindex API for Elasticsearch.
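If the conversion also has to be guarded inside logstash, a conditional around the mutate is one way to sketch it. This is a minimal, untested example rather than a drop-in fix: the regex assumes that metric values are plain decimal numbers, and the "state" field name is hypothetical, chosen so that text values do not land in a field whose index mapping is numeric.

```
filter {
  # Convert only when the value looks like a number.
  if [value] =~ /^-?\d+(\.\d+)?$/ {
    mutate { convert => { "value" => "integer" } }
  } else {
    # Hypothetical field name: keep text states apart from numeric "value".
    mutate { rename => { "value" => "state" } }
  }
  mutate { convert => { "thread" => "integer" } }
}
```

This block would take the place of the final mutate in the question's filter. The mapping caveat above still applies: if "value" is already mapped as a number in the index, documents that keep a string in "value" are likely to be rejected, which is why the sketch moves them to a separate field.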

Related

logstash grok filter along with kv plugin is unable to process the events

I am new to ELK. When I onboarded the log file below, its events went to the dead letter queue because logstash could not process them. I have written a grok filter to parse the events, but logstash still cannot process them. Any help would be appreciated.
Below is the sample log format.
25193662345 [http-nio-8080-exec-44] DEBUG c.s.b.a.m.PerformanceMetricsFilter - method=PUT status=201 appLogicTime=1, streamInTime=0, blobStorageTime=31, totalTime=33 tenantId=b9sdfs-1033-4444-aba5-csdfsdfsf, immutableBlobId=bss_c_586331/Sample_app12-sdas-157123148464.txt, blobSize=2862, domain=abc
2519366789 [http-nio-8080-exec-47] DEBUG q.s.b.y.m.PerformanceMetricsFilter - method=PUT status=201 appLogicTime=1, streamInTime=0, blobStorageTime=32, totalTime=33 tenantId=b0csdfsd-1066-4444-adf4-ce7bsdfssdf, immutableBlobId=bss_c_586334/Sample_app15-615223-157sadas6648465.txt, blobSize=2862, domain=cde
GROK filter:
dissect { mapping => { "message" => "%{NUMBER:number} [%{thread}] %{level} %{class} - %{[@metadata][msg]}" } }
kv { source => "[@metadata][msg]" field_split => "," }
Thanks
You have basically two problems in your configuration.
1.) You are using the dissect filter, not grok. Both are used to parse messages, but grok uses regular expressions to validate the value of each field, while dissect is purely positional and does not perform any validation. If a WORD value sits in the position of a field that expects a NUMBER, grok will fail but dissect will not. For the same reason, dissect does not use grok pattern syntax such as %{NUMBER:number}; it takes only field names.
If your log lines always have the same pattern, you should continue to use dissect since it is faster and needs less CPU.
Your correct dissect mapping should be:
dissect {
  mapping => { "message" => "%{number} [%{thread}] %{level} %{class} - %{[@metadata][msg]}" }
}
2.) The field that contains the kv message is inconsistent: some pairs are separated by a space and others by a comma followed by a space, so kv with field_split set to a comma won't work.
After your dissect filter this is the content of [@metadata][msg]:
method=PUT status=201 appLogicTime=1, streamInTime=0, blobStorageTime=32, totalTime=33 tenantId=b0csdfsd-1066-4444-adf4-ce7bsdfssdf, immutableBlobId=bss_c_586334/Sample_app15-615223-157sadas6648465.txt, blobSize=2862, domain=cde
To solve this, use a mutate filter to remove the commas from [@metadata][msg], then use the kv filter with its default configuration.
This should be your filter configuration:
filter {
  dissect {
    mapping => { "message" => "%{number} [%{thread}] %{level} %{class} - %{[@metadata][msg]}" }
  }
  mutate {
    gsub => ["[@metadata][msg]",",",""]
  }
  kv {
    source => "[@metadata][msg]"
  }
}
Your output should be something like this:
{
  "number" => "2519366789",
  "@timestamp" => 2019-11-03T16:42:11.708Z,
  "thread" => "http-nio-8080-exec-47",
  "appLogicTime" => "1",
  "domain" => "cde",
  "method" => "PUT",
  "level" => "DEBUG",
  "blobSize" => "2862",
  "@version" => "1",
  "immutableBlobId" => "bss_c_586334/Sample_app15-615223-157sadas6648465.txt",
  "streamInTime" => "0",
  "status" => "201",
  "blobStorageTime" => "32",
  "message" => "2519366789 [http-nio-8080-exec-47] DEBUG q.s.b.y.m.PerformanceMetricsFilter - method=PUT status=201 appLogicTime=1, streamInTime=0, blobStorageTime=32, totalTime=33 tenantId=b0csdfsd-1066-4444-adf4-ce7bsdfssdf, immutableBlobId=bss_c_586334/Sample_app15-615223-157sadas6648465.txt, blobSize=2862, domain=cde",
  "totalTime" => "33",
  "tenantId" => "b0csdfsd-1066-4444-adf4-ce7bsdfssdf",
  "class" => "q.s.b.y.m.PerformanceMetricsFilter"
}
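As a possible alternative to the gsub step, the kv filter's trim_value option can strip the trailing comma from each value while the default space separator splits the pairs. This is an untested sketch that assumes no value legitimately ends in a comma.

```
filter {
  dissect {
    mapping => { "message" => "%{number} [%{thread}] %{level} %{class} - %{[@metadata][msg]}" }
  }
  kv {
    source => "[@metadata][msg]"
    # The default field_split is a space; trim_value removes the comma
    # that is left at the end of values such as "1," or "cde,".
    trim_value => ","
  }
}
```

The gsub variant removes every comma in the message, including any inside a value, whereas this variant only trims commas at the edges of each value.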

In Kibana, fields marked with a question mark `?` are not showing in the metric field list

In Kibana, I have fields that contain a question mark ?. The goal is to create a filter that excludes all entries containing a question mark in the field. When I try to create a metric under Aggregation with Terms, the fields marked with ? are not visible there. Please help a newbie understand.
Below is the logstash.conf with the filters I'm using, along with an attached screenshot. Please suggest what mistake I'm making and what can be done.
My ELK version is 6.2.x.
# cat logstash-syslog.conf
input {
file {
path => [ "/scratch/rsyslog/*/messages.log" ]
type => "syslog"
}
file {
path => [ "/scratch/rsyslog/Aug/messages.log" ]
type => "apic_logs"
}
}
filter {
if [type] == "syslog" {
grok {
match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp } %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
add_field => [ "received_at", "%{@timestamp}" ]
remove_field => ["@version", "host", "message", "_type", "_index", "_score", "path"]
}
syslog_pri { }
date {
match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
}
}
if [type] == "apic_logs" {
grok {
match => { "message" => "%{CISCOTIMESTAMP:syslog_timestamp} %{CISCOTIMESTAMP} %{SYSLOGHOST:syslog_hostname} (?<prog>[\w._/%-]+) %{SYSLOG5424SD:f1}%{SYSLOG5424SD:f2}%{SYSLOG5424SD:f3}%{SYSLOG5424SD:f4}%{SYSLOG5424SD:f5} %{GREEDYDATA:syslog_message}" }
add_field => [ "received_at", "%{@timestamp}" ]
remove_field => ["@version", "host", "message", "_type", "_index", "_score", "path"]
}
}
}
output {
if [type] == "syslog" {
elasticsearch {
hosts => "noida-elk:9200"
manage_template => false
index => "syslog-%{+YYYY.MM.dd}"
document_type => "messages"
}
}
}
output {
if [type] == "apic_logs" {
elasticsearch {
hosts => "noida-elk:9200"
manage_template => false
index => "apic_logs-%{+YYYY.MM.dd}"
document_type => "messages"
}
}
}
I fixed my issue!
Why do I see the symbol ? by fields in the Kibana Discover page
When you open the Discover page in Kibana, you might see a question mark ? by fields that are listed in the available fields section instead of the character t. When you reload the list of fields, the type of fields is analyzed, and the question mark ? is replaced by the character t.
Be sure to check the "include system indices" box at the far right, as shown in the screenshot below.
Rearranging field columns in the table
You can rearrange the field columns in the table. Mouse over the header of the column you want to move, and click the Move column to the left button or the Move column to the right button.
Reloading the list of fields
Complete the following steps to reload the list of fields that are displayed in Kibana:
Select the Management page, then select Index Patterns to list the indexes that are available.
Select the index pattern for your space to see every field and the field's associated core type as recorded by Elasticsearch.
Click the Reload field list button to reload the index pattern fields.
The list of fields is refreshed.
If you are using Kibana, make sure that after re-creating the index you refresh it, especially when you add new fields to it. This refresh is available in the Management section of Kibana for each index pattern.
"?" represents a column that is available but not part of the Elastic index until you do a refresh.
None of the other answers worked for me - take a look at this link though:
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-update-by-query.html#docs-update-by-query
This did work, at least some of the time! Execute (ES/OpenSearch API sandbox in Console):
POST my-index-000001/_update_by_query?conflicts=proceed
or via curl
curl -X POST "localhost:9200/my-index-000001/_update_by_query?conflicts=proceed&pretty"
Note! This did work for me (along with Index-Pattern re-creation) in AWS OpenSearch.

Retrieving RESTful GET parameters in logstash

I am trying to get logstash to parse key-value pairs in an HTTP GET request from my ELB log files.
The request field looks like this:
http://aaa.bbb/get?a=1&b=2
I'd like there to be a field for a and b in the log line above, and I am having trouble figuring it out.
My logstash conf (formatted for clarity) is below; it does not load any additional key fields. I assume that I need to split off the address portion of the URI, but I have not figured out how.
input {
file {
path => "/home/ubuntu/logs/**/*.log"
type => "elb"
start_position => "beginning"
sincedb_path => "log_sincedb"
}
}
filter {
if [type] == "elb" {
grok {
match => [ "message", "%{TIMESTAMP_ISO8601:timestamp}
%{NOTSPACE:loadbalancer} %{IP:client_ip}:%{NUMBER:client_port:int}
%{IP:backend_ip}:%{NUMBER:backend_port:int}
%{NUMBER:request_processing_time:float}
%{NUMBER:backend_processing_time:float}
%{NUMBER:response_processing_time:float}
%{NUMBER:elb_status_code:int}
%{NUMBER:backend_status_code:int}
%{NUMBER:received_bytes:int} %{NUMBER:sent_bytes:int}
%{QS:request}" ]
}
date {
match => [ "timestamp", "ISO8601" ]
}
kv {
field_split => "&?"
source => "request"
exclude_keys => ["callback"]
}
}
}
output {
elasticsearch { host => localhost }
}
kv will take a URL and split out the params. This config works:
input {
  stdin { }
}
filter {
  mutate {
    add_field => { "request" => "http://aaa.bbb/get?a=1&b=2" }
  }
  kv {
    field_split => "&?"
    source => "request"
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
stdout shows:
{
  "request" => "http://aaa.bbb/get?a=1&b=2",
  "a" => "1",
  "b" => "2"
}
That said, I would encourage you to create your own versions of the default URI patterns so that they set fields. You can then pass the querystring field off to kv. It's cleaner that way.
UPDATE:
By "make your own patterns", I meant taking the existing ones and modifying them as needed. In logstash 1.4, installing them was as easy as putting them in a new file in the 'patterns' directory; I don't know about patterns for >1.4 yet.
MY_URIPATHPARAM %{URIPATH}(?:%{URIPARAM:myuriparams})?
MY_URI %{URIPROTO}://(?:%{USER}(?::[^@]*)?@)?(?:%{URIHOST})?(?:%{MY_URIPATHPARAM})?
Then you could use MY_URI in your grok{} pattern and it would create a field called myuriparams that you could feed to kv{}.
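A sketch of how the custom patterns might be wired together, assuming the two definitions are saved in a file under a hypothetical ./patterns directory and that the grok filter's patterns_dir option points at it. The "request" source field comes from the question's config; this has not been run against real ELB logs.

```
filter {
  grok {
    # Hypothetical directory holding the file with MY_URIPATHPARAM and MY_URI.
    patterns_dir => ["./patterns"]
    match => { "request" => "%{MY_URI}" }
  }
  kv {
    # myuriparams holds only the query string, e.g. "?a=1&b=2".
    source => "myuriparams"
    field_split => "&?"
  }
}
```

Feeding kv only the query string keeps the scheme, host and path out of the key-value parsing.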

logstash : how to extract data from log4j message?

I am trying to extract data from my log4j message with logstash.
The message looks like this:
Method findAll - Start by : bokc
I would like to extract the method name : "findAll" and the user "bokc".
How can I do this?
I use logstash 1.5.2 and my config is :
input {
log4j {
mode => "server"
type => "log4j-artemis"
port => 4560
}
}
filter {
multiline {
type => "log4j-artemis"
pattern => "^\\s"
what => "previous"
}
mutate {
add_field => [ "source_ip", "%{host}" ]
}
}
Use a grok filter:
filter {
  grok {
    match => [
      "message",
      "^Method %{WORD:method} - Start by : %{USER:user}"
    ]
    tag_on_failure => []
  }
}
This extracts the two words into the fields "method" and "user". The setting of tag_on_failure makes sure that non-matching messages aren't tagged with _grokparsefailure. Since most messages aren't supposed to match the pattern it doesn't make sense to mark them as failures.
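One way to try the pattern without a log4j client is a small throwaway pipeline built on the generator input. This is a sketch only: the second sample line is made up to show a non-matching message, and the rest mirrors the filter above.

```
input {
  generator {
    # One matching line and one invented non-matching line.
    lines => [
      "Method findAll - Start by : bokc",
      "Some other log message"
    ]
    # Emit each line once, then stop.
    count => 1
  }
}
filter {
  grok {
    match => [ "message", "^Method %{WORD:method} - Start by : %{USER:user}" ]
    tag_on_failure => []
  }
}
output {
  stdout { codec => rubydebug }
}
```

The first event should carry the "method" and "user" fields, while the second should pass through without them and without a _grokparsefailure tag.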

Logstash not adding fields

I am using Logstash 1.4.2 and I have the following conf file.
I would expect to see in Kibana in the "Fields" section on the left the options for "received_at" and "received_from" and "description", but I don't.
I see
@timestamp
@version
_id
_index
_type
host
path
I do see in the _source section on the right side the following...
received_at:2015-05-11 14:19:40 UTC received_from:PGP02 description:Error1!
So how come these don't appear in the list of "Popular Fields"?
I'd like to filter the right side to not show EVERY field in the _source section on the right. Excuse the redaction blocks.
input
{
file {
path => "C:/ServerErrlogs/office-log.txt"
start_position => "beginning"
sincedb_path => "c:/tools/logstash-1.4.2/office-log.sincedb"
tags => ["product_qa", "office"]
}
file {
path => "C:/ServerErrlogs/dis-log.txt"
start_position => "beginning"
sincedb_path => "c:/tools/logstash-1.4.2/dis-log.sincedb"
tags => ["product_qa", "dist"]
}
}
filter {
grok {
match => ["path","%{GREEDYDATA}/%{GREEDYDATA:filename}\.log"]
match => [ "message", "%{TIMESTAMP_ISO8601:logdate}: %{LOGLEVEL:loglevel} (?<logmessage>.*)" ]
add_field => [ "received_at", "%{@timestamp}" ]
add_field => [ "received_from", "%{host}" ]
}
date {
match => [ "logdate", "ISO8601", "yyyy-MM-dd HH:mm:ss,SSSSSSSSS" ]
}
#logdate is now parsed into timestamp, remove original log message too
mutate {
remove_field => ['message', 'logdate' ]
add_field => [ "description", "Error1!" ]
}
}
output {
elasticsearch {
protocol => "http"
host => "0.0.0.x"
}
}
Update:
I have tried searching with a query like:
tags: data AND loglevel : INFO
then saving this query, and then reloading the page.
But still I don't see loglevel appearing as 'Popular Fields'
If the fields don't appear on the left side, it's probably a kibana caching problem. Go to Settings->Indices, select your index, and click the orange Refresh button.
I had the same issue with logstash not adding fields, and after quite a lot of searching and testing other things I suddenly had the solution. (I am using the logstash-logback-encoder, so my messages are already JSON; if yours are not, you need to transform them into JSON in the logstash input phase.)
I added a json filter plugin, which did the magic for me:
filter {
  json {
    source => "message"
  }
}
