Logstash Grok Filter key/value pairs - logstash-grok

Working on getting our ESET log files (JSON format) into Elasticsearch. I'm shipping logs to our syslog server (syslog-ng), then to Logstash, and then to Elasticsearch. Everything is flowing as it should. My problem is in trying to process the logs in Logstash: I cannot seem to separate the key/value pairs into separate fields.
Here's a sample log entry:
Jul 8 11:54:29 192.168.1.144 1 2016-07-08T15:55:09.629Z era.somecompany.local ERAServer 1755 Syslog {"event_type":"Threat_Event","ipv4":"192.168.1.118","source_uuid":"7ecab29a-7db3-4c79-96f5-3946de54cbbf","occured":"08-Jul-2016 15:54:54","severity":"Warning","threat_type":"trojan","threat_name":"HTML/Agent.V","scanner_id":"HTTP filter","scan_id":"virlog.dat","engine_version":"13773 (20160708)","object_type":"file","object_uri":"http://malware.wicar.org/data/java_jre17_exec.html","action_taken":"connection terminated","threat_handled":true,"need_restart":false,"username":"BATHSAVER\\sickes","processname":"C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe"}
Here is my logstash conf:
input {
  udp {
    type => "esetlog"
    port => 5515
  }
  tcp {
    type => "esetlog"
    port => 5515
  }
}
filter {
  if [type] == "esetlog" {
    grok {
      match => { "message" => "%{DATA:timestamp}\ %{IPV4:clientip}\ <%{POSINT:num1}>%{POSINT:num2}\ %{DATA:syslogtimestamp}\ %{HOSTNAME}\ %{IPORHOST}\ %{POSINT:syslog_pid\ %{DATA:type}\ %{GREEDYDATA:msg}" }
    }
    kv {
      source => "msg"
      value_split => ":"
      target => "kv"
    }
  }
}
output {
  elasticsearch {
    hosts => ['192.168.1.116:9200']
    index => "eset-%{+YYY.MM.dd}"
  }
}
When the data is displayed in Kibana, other than the date and time everything is lumped together in the "message" field, with no separate key/value pairs.
I've been reading and searching for a week now. I've done similar things with other log files with no problems at all, so I'm not sure what I'm missing. Any help/suggestions are greatly appreciated.

Can you try the below Logstash configuration?
grok {
  match => {
    "message" => ["%{CISCOTIMESTAMP:timestamp} %{IPV4:clientip} %{POSINT:num1} %{TIMESTAMP_ISO8601:syslogtimestamp} %{USERNAME:hostname} %{USERNAME:iporhost} %{NUMBER:syslog_pid} Syslog %{GREEDYDATA:msg}"]
  }
}
json {
  source => "msg"
}
It works and was tested at http://grokconstructor.appspot.com/do/match#result
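For reference, here is a sketch of how those two filters might sit inside the full filter block, assuming you keep the original "esetlog" type check (without a target, the json filter writes the parsed keys to the top level of the event):
filter {
  if [type] == "esetlog" {
    grok {
      match => {
        "message" => ["%{CISCOTIMESTAMP:timestamp} %{IPV4:clientip} %{POSINT:num1} %{TIMESTAMP_ISO8601:syslogtimestamp} %{USERNAME:hostname} %{USERNAME:iporhost} %{NUMBER:syslog_pid} Syslog %{GREEDYDATA:msg}"]
      }
    }
    json {
      source => "msg"
    }
  }
}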
Regards.

Related

Logstash aggregation return empty message

I have a testing environment to test some Logstash plugins before moving to production.
For now, I am using the Kiwi syslog generator to generate some syslog messages for testing.
The fields I have are as follows:
@timestamp
message
+ elastic metadata
Starting from these basic fields, I start filtering my data.
The first thing is to add a new field based on the timestamp and message, as follows:
input {
  syslog {
    port => 514
  }
}
filter {
  prune {
    whitelist_names => ["timestamp", "message", "newfield", "message_count"]
  }
  mutate {
    add_field => { "newfield" => "%{@timestamp}%{message}" }
  }
}
The prune filter is just there so unwanted data isn't processed.
And this works just fine, as I am getting a new field with those two values.
The next step was to run some aggregation based on specific content of the message, such as whether the message contains "logged in" or "logged out",
and to do this I used the aggregate filter:
grok {
  match => {
    "message" => [
      "(?<[@metadata][event_type]>logged out)",
      "(?<[@metadata][event_type]>logged in)",
      "(?<[@metadata][event_type]>workstation locked)"
    ]
  }
}
aggregate {
  task_id => "%{message}"
  code => "
    map['message_count'] ||= 0; map['message_count'] += 1;
  "
  push_map_as_event_on_timeout => true
  timeout_timestamp_field => "@timestamp"
  timeout => 60
  inactivity_timeout => 50
  timeout_tags => ['_aggregatetimeout']
}
This worked as expected, but I am having a problem here: when the aggregation times out, the only field populated for the specific aggregation is message_count.
As you can see in the above screenshot, the newfield and message (the one on the far left; sorry, it didn't fit in the screenshot) are both empty.
For demonstration and testing purposes that's absolutely fine, but it will become unmanageable if I get hundreds of syslog messages per second without knowing which message that message_count refers to.
I am struggling here and I don't know how to solve this issue. Can somebody please help me understand how I can fill newfield with the content of the message it refers to?
Here is my whole Logstash configuration, to make it easier.
input {
  syslog {
    port => 514
  }
}
filter {
  prune {
    whitelist_names => ["timestamp", "message", "newfield", "message_count"]
  }
  mutate {
    add_field => { "newfield" => "%{@timestamp}%{message}" }
  }
  grok {
    match => {
      "message" => [
        "(?<[@metadata][event_type]>logged out)",
        "(?<[@metadata][event_type]>logged in)",
        "(?<[@metadata][event_type]>workstation locked)"
      ]
    }
  }
  aggregate {
    task_id => "%{message}"
    code => "
      map['message_count'] ||= 0; map['message_count'] += 1;
    "
    push_map_as_event_on_timeout => true
    timeout_timestamp_field => "@timestamp"
    timeout => 60
    inactivity_timeout => 50
    timeout_tags => ['_aggregatetimeout']
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash_index"
  }
  stdout {
    codec => rubydebug
  }
  csv {
    path => "C:\Users\adminuser\Desktop\syslog\syslogs-%{+yyyy.MM.dd}.csv"
    fields => ["timestamp", "message", "message_count", "newfield"]
  }
}
push_map_as_event_on_timeout => true
When you use this, and a timeout occurs, it creates a new event using the contents of the map. If you want fields from the original messages to be in the new event then you have to add them to the map. For the task_id there is a shorthand notation to do this using the timeout_task_id_field option on the filter; otherwise you have to add them explicitly:
map['newfield'] ||= event.get('newfield');
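For example, here is a sketch of the aggregate block from the configuration above with those fields copied into the map (untested; the field names match the ones already in use):
aggregate {
  task_id => "%{message}"
  code => "
    # keep the original fields so they survive into the timeout event
    map['newfield'] ||= event.get('newfield');
    map['message'] ||= event.get('message');
    map['message_count'] ||= 0; map['message_count'] += 1;
  "
  push_map_as_event_on_timeout => true
  timeout_timestamp_field => "@timestamp"
  timeout => 60
  inactivity_timeout => 50
  timeout_tags => ['_aggregatetimeout']
}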

Make logstash filter by name

I have a log file called "/var/log/commands.log" that I'm trying to separate into fields with Logstash & grok, and I've got it working. Now I'm trying to make Logstash apply this only to the file "/var/log/commands.log" and not to any other input by doing "if name = commands.log", but something with the "if" statement seems wrong, as it skips over it.
input {
  file {
    path => "/var/log/commands.log"
  }
  beats {
    port => 5044
  }
}
filter {
  if [log][file][path] == "/var/log/commands.log" {
    grok {
      match => { "message" => "*very long statement*" }
    }
  }
}
output {
  elasticsearch { hosts => ["localhost:9200"] }
}
If I remove the if statement it works and the fields are visible in Kibana. I'm testing things locally. Does anyone know what's going on?
EDIT: SOLVED: In Logstash, the condition has to use just [path] instead of all the rest.
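In other words, a minimal sketch of the corrected conditional (same grok pattern as before):
filter {
  if [path] == "/var/log/commands.log" {
    grok {
      match => { "message" => "*very long statement*" }
    }
  }
}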

How to create multiple indexes in logstash.conf file with common host

I am pretty new to logstash.
In our application we are creating multiple indexes. From the below thread I could understand how to resolve that:
How to create multiple indexes in logstash.conf file?
but that results in many duplicate lines in the conf file (for host, ssl, etc.), so I wanted to check if there is a better way of doing it.
output {
  stdout { codec => rubydebug }
  if [type] == "trial" {
    elasticsearch {
      hosts => "localhost:9200"
      index => "trial_indexer"
    }
  } else {
    elasticsearch {
      hosts => "localhost:9200"
      index => "movie_indexer"
    }
  }
}
Instead of the above config, can I have something like the below?
output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => "localhost:9200"
  }
  if [type] == "trial" {
    elasticsearch {
      index => "trial_indexer"
    }
  } else {
    elasticsearch {
      index => "movie_indexer"
    }
  }
}
What you are looking for is using environment variables in the Logstash pipeline. You define these once and can reuse them for the otherwise duplicated values you mentioned, such as host, ssl, etc.
For more information, see Logstash Use Environment Variables.
e.g.,
output {
  elasticsearch {
    hosts => "${ES_HOST}"
    index => "%{type}-indexer"
  }
}
Let me know if that helps.
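Applied to the original output section, it would look roughly like this (a sketch keeping the explicit conditional rather than the %{type}-indexer shorthand; ES_HOST must be set in the environment before Logstash starts):
output {
  stdout { codec => rubydebug }
  if [type] == "trial" {
    elasticsearch {
      hosts => "${ES_HOST}"
      index => "trial_indexer"
    }
  } else {
    elasticsearch {
      hosts => "${ES_HOST}"
      index => "movie_indexer"
    }
  }
}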

Multiple identical messages with logstash/kibana

I'm running an ELK stack on my local filesystem. I have the following configuration file set up:
input {
  file {
    path => "/var/log/rfc5424"
    type => "RFC"
  }
}
filter {
  grok {
    match => { "message" => "%{SYSLOG5424LINE}" }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
I have a kibana instance running as well. I write a line to /var/log/rfc5424:
$ echo '<11>1' "$(date +'%Y-%m-%dT%H:%M:%SZ')" 'test-machine test-tag f81d4fae-7dec-11d0-a765-00a0c91e6bf6 log [nsId orgID="12 \"hey\" 345" projectID="2345[hehe]6"] this is a test message' >> /var/log/rfc5424
And it shows up in Kibana. Great! However, weirdly, it shows up six times:
As far as I can tell everything about these messages is identical, and I only have one instance of logstash/kibana running, so I have no idea what could be causing this duplication.
Check whether there is a .swp or .tmp file for your configuration under the conf directory.
Add a document ID to the documents:
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    document_id => "%{uuid_field}"
  }
}
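If the events don't already carry a unique field, one option (a sketch assuming the standard fingerprint filter plugin; the field names are illustrative, and older plugin versions may also require a key option) is to derive the document ID from the message itself, so repeated copies of the same line overwrite one document instead of creating new ones:
filter {
  fingerprint {
    # hash the raw message into a metadata field used only for the document ID
    source => "message"
    target => "[@metadata][fingerprint]"
    method => "SHA256"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    document_id => "%{[@metadata][fingerprint]}"
  }
}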

How to get metric plugin working in logstash

I am failing to understand how to print the metric.
With the following Logstash config:
input {
  generator {
    type => "generated"
  }
}
filter {
  metrics {
    type => "generated"
    meter => "events"
    add_tag => "metric"
  }
}
output {
  stdout {
    tags => "metric"
    message => "rate: %{events.rate_1m}"
  }
}
all I see is
rate: %{events.rate_1m}
rate: %{events.rate_1m}
instead of actual value.
When I enable debug in stdout I see that @fields has the data that the metrics filter is supposed to print:
"@fields" => {
  "events.count" => 114175,
  "events.rate_1m" => 6478.26368594885,
  "events.rate_5m" => 5803.767865770155,
  "events.rate_15m" => 5686.915084346328
},
How do I access @fields.events.count?
logstash version = 1.1.13
It looks like a known issue in logstash 1.1.13 and lower: one needs to escape the '.' in %{events.rate_1m} as %{events\.rate_1m}.
Details are in this logstash JIRA.
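So the output block from the question would become something like this (a sketch of just that change, not tested on 1.1.13):
output {
  stdout {
    tags => "metric"
    message => "rate: %{events\.rate_1m}"
  }
}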