Logstash http_poller only shows last log message in Kibana

I am using Logstash to fetch a log from a URL with http_poller. This works fine. The problem is that the received log does not get sent to Elasticsearch in the right way. I tried splitting the result into different events, but the only event that shows up in Kibana is the last event from the log. Since I am pulling the log every 2 minutes, a lot of log information gets lost this way.
The input is like this:
input {
  http_poller {
    urls => {
      logger1 => {
        method => get
        url => "http://servername/logdirectory/thislog.log"
      }
    }
    keepalive => true
    automatic_retries => 0
    # Check the site every 2 minutes
    interval => 120
    # Wait no longer than 110 seconds for the request to complete
    request_timeout => 110
    # Store metadata about the request in this field
    metadata_target => http_poller_metadata
    type => 'log4j'
    codec => "json"
    # important tag settings
    tags => stackoverflow
  }
}
I then use a filter to add some fields and to split the log into events:
filter {
  if "stackoverflow" in [tags] {
    split {
      terminator => "\n"
    }
    mutate {
      add_field => {
        "Application" => "app-stackoverflow"
        "Environment" => "Acceptation"
      }
    }
  }
}
The output is then sent to Redis on the Kibana server using the following output config:
output {
  redis {
    host => "kibanaserver.internal.com"
    data_type => "list"
    key => "logstash N"
  }
}
Any suggestions as to why not all the events are stored in Kibana?

Related

logstash file output not working with metadata fields

I have the following pipeline. The requirement is that I need to write "metrics" data to ONE file and EVENT data to another file. I am having two issues with this pipeline:
1. The file output is not creating a "timestamped file" every 30 seconds; instead it creates a single file named output%{[@metadata][ts]}.csv and keeps appending data to it.
2. The CSV output creates a new file with a timestamp every 30 seconds, but somehow it also creates one extra file named output%{[@metadata][ts]} and keeps appending meta info to that file.
Can someone please guide me how I can fix this?
input {
  beats {
    port => 5045
  }
}
filter {
  ruby {
    code => '
      event.set("[@metadata][ts]", Time.now.to_i / 30)
      event.set("[@metadata][suffix]", "output" + (Time.now.to_i / 30).to_s + ".csv")
    '
  }
}
filter {
  metrics {
    meter => [ "code" ]
    add_tag => "metric"
    clear_interval => 30
    flush_interval => 30
  }
}
output {
  if "metric" in [tags] {
    file {
      flush_interval => 30
      codec => line { format => "%{[code][count]} %{[code][count]}" }
      path => "C:/lgstshop/local/csv/output%{[@metadata][ts]}.csv"
    }
    stdout {
      codec => line {
        format => "rate: %{[code][count]}"
      }
    }
  }
  file {
    path => "output.log"
  }
  csv {
    fields => [ "created", "level", "code" ]
    path => "C:/lgstshop/local/output%{[@metadata][ts]}.evt"
  }
}
A metrics filter generates new events in the pipeline, and those events only go through filters that come after it. Thus the metric events do not have a [@metadata][ts] field, so the sprintf references in the output section are not substituted. Move the ruby filter so that it comes after the metrics filter.
If you do not want the metrics sent to the csv output, wrap that output in if "metric" not in [tags] { ... } or put it in an else branch of the existing conditional.
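A minimal sketch of the reordered filter section, keeping the same 30-second bucketing from the question; because the ruby filter now runs after metrics, the generated metric events also carry the [@metadata][ts] field:

```conf
filter {
  metrics {
    meter => [ "code" ]
    add_tag => "metric"
    clear_interval => 30
    flush_interval => 30
  }
  # Runs for both the original events and the events the metrics
  # filter emits, so %{[@metadata][ts]} in the output paths resolves.
  ruby {
    code => '
      event.set("[@metadata][ts]", Time.now.to_i / 30)
    '
  }
}
```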

Logstash aggregation return empty message

I have a testing environment to test some logstash plugin before to move to production.
For now, I am using kiwi syslog generator, to generate some syslog for testing.
The field I have are as follow:
@timestamp
message
plus the Elastic metadata
Starting from these basic fields, I start filtering my data.
The first thing is to add a new field based on the timestamp and message, as follows:
input {
  syslog {
    port => 514
  }
}
filter {
  prune {
    whitelist_names => ["timestamp", "message", "newfield", "message_count"]
  }
  mutate {
    add_field => { "newfield" => "%{@timestamp}%{message}" }
  }
}
The prune filter is just there to avoid processing unwanted data.
This works just fine, as I am getting a new field with those two values.
The next step was to run some aggregation based on specific content of the message, such as whether the message contains "logged in" or "logged out". To do this, I used the aggregate filter:
grok {
  match => {
    "message" => [
      "(?<[@metadata][event_type]>logged out)",
      "(?<[@metadata][event_type]>logged in)",
      "(?<[@metadata][event_type]>workstation locked)"
    ]
  }
}
aggregate {
  task_id => "%{message}"
  code => "
    map['message_count'] ||= 0; map['message_count'] += 1;
  "
  push_map_as_event_on_timeout => true
  timeout_timestamp_field => "@timestamp"
  timeout => 60
  inactivity_timeout => 50
  timeout_tags => ['_aggregatetimeout']
}
This worked as expected, but I am having a problem here: when the aggregation times out, the only field populated for the specific aggregation is message_count.
As you can see in the above screenshot, the newfield and message (the one on the far left; sorry, it didn't fit in the screenshot) are both empty.
For demonstration and testing purposes that is absolutely fine, but it will become unmanageable if I get hundreds of syslog messages per second without knowing which message each message_count refers to.
Please, I am struggling here and I don't know how to solve this issue. Can somebody please help me understand how I can fill newfield with the content of the message it refers to?
To make it easier, this is my whole Logstash configuration:
input {
  syslog {
    port => 514
  }
}
filter {
  prune {
    whitelist_names => ["timestamp", "message", "newfield", "message_count"]
  }
  mutate {
    add_field => { "newfield" => "%{@timestamp}%{message}" }
  }
  grok {
    match => {
      "message" => [
        "(?<[@metadata][event_type]>logged out)",
        "(?<[@metadata][event_type]>logged in)",
        "(?<[@metadata][event_type]>workstation locked)"
      ]
    }
  }
  aggregate {
    task_id => "%{message}"
    code => "
      map['message_count'] ||= 0; map['message_count'] += 1;
    "
    push_map_as_event_on_timeout => true
    timeout_timestamp_field => "@timestamp"
    timeout => 60
    inactivity_timeout => 50
    timeout_tags => ['_aggregatetimeout']
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash_index"
  }
  stdout {
    codec => rubydebug
  }
  csv {
    path => "C:\Users\adminuser\Desktop\syslog\syslogs-%{+yyyy.MM.dd}.csv"
    fields => ["timestamp", "message", "message_count", "newfield"]
  }
}
push_map_as_event_on_timeout => true
When you use this and a timeout occurs, a new event is created from the contents of the map. If you want fields from the original messages to be in the new event, then you have to add them to the map. For the task_id there is a shorthand notation to do this using the timeout_task_id_field option on the filter; otherwise you have to add them explicitly:
map['newfield'] ||= event.get('newfield');
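A sketch of the adjusted aggregate filter, with the field names taken from the question; the map line above goes inside the code block so the event pushed at timeout also carries newfield:

```conf
aggregate {
  task_id => "%{message}"
  code => "
    map['message_count'] ||= 0; map['message_count'] += 1;
    # Copy fields from the original event into the map so they
    # appear on the event pushed when the timeout fires.
    map['newfield'] ||= event.get('newfield');
  "
  push_map_as_event_on_timeout => true
  timeout_timestamp_field => "@timestamp"
  timeout => 60
  inactivity_timeout => 50
  timeout_tags => ['_aggregatetimeout']
}
```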

Logstash metric filter for log-level

Can someone please help me with my metric filter? I want to set up Logstash to check for log-level = ERROR every 5 s and, if the count of log-level = ERROR exceeds 1, send an email. I am using Logstash 2.2.4.
input {
  file {
    path => "/var/log/logstash/example"
    start_position => beginning
  }
}
filter {
  grok {
    match => { "message" => "\[%{TIMESTAMP_ISO8601:timestamp}\]\[%{LOGLEVEL:log-level}\s*\]" }
  }
  if [log-level] == "ERROR" {
    metrics {
      meter => [ "log-level" ]
      flush_interval => 5
      clear_interval => 5
    }
  }
}
output {
  if [log-level] == "ERROR" {
    if [log-level][count] < 1 {
      email {
        port => 25
        address => "mail.abc.com"
        authentication => "login"
        use_tls => true
        from => "alerts@logstash.com"
        subject => "logstash alert"
        to => "siya@abc.com"
        via => "smtp"
        body => "here is the event line %{message}"
        debug => true
      }
    }
  }
}
Editorial:
I am not a fan of the metrics {} filter, because it breaks assumptions. Logstash is multi-threaded, and metrics is one of the filters that only keeps state within its thread. If you use it, you need to be aware that if you're running 4 pipeline workers, you have 4 independent threads keeping their own state. This breaks the assumption that all events coming "into logstash" will be counted "by the metrics filter".
For your use-case, I'd recommend not using Logstash to issue this email, and instead rely on an external polling mechanism that hits your backing stores.
Because this is the metrics filter, I highly recommend you set your number of filter workers to 1. This is the -w command-line option when Logstash starts. You'll lose parallelism, but you'll gain the ability for a single filter to see all events. If you don't, you can get cases where, say, all 6 threads each see an ERROR event, and you will get six emails.
Your config could use some updates. It's recommended to add a tag or something to the metrics {} filter.
metrics {
  meter => [ "log-level" ]
  flush_interval => 5
  clear_interval => 5
  add_tag => "error_metric"
}
This way, you can better filter your email segment.
output {
  if "error_metric" in [tags] and [log-level][count] > 1 {
    email {
    }
  }
}
This is because the metrics {} filter creates a new event when it flushes, rather than amending an existing one. You need to catch the new event with your filters.
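Putting it together, a minimal sketch of the output section under the single-worker caveat above; the email settings are the ones from the question, and the body field is an assumption about what you might want to report:

```conf
output {
  # The metrics filter emits a *new* event tagged "error_metric";
  # match that event, not the original log lines.
  if "error_metric" in [tags] and [log-level][count] > 1 {
    email {
      port => 25
      address => "mail.abc.com"
      from => "alerts@logstash.com"
      subject => "logstash alert"
      to => "siya@abc.com"
      via => "smtp"
      body => "error count in the last flush interval: %{[log-level][count]}"
    }
  }
}
```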

Logstash Grok Filter key/value pairs

Working on getting our ESET log files (json format) into elasticsearch. I'm shipping logs to our syslog server (syslog-ng), then to logstash, and elasticsearch. Everything is going as it should. My problem is in trying to process the logs in logstash...I cannot seem to separate the key/value pairs into separate fields.
Here's a sample log entry:
Jul 8 11:54:29 192.168.1.144 1 2016-07-08T15:55:09.629Z era.somecompany.local ERAServer 1755 Syslog {"event_type":"Threat_Event","ipv4":"192.168.1.118","source_uuid":"7ecab29a-7db3-4c79-96f5-3946de54cbbf","occured":"08-Jul-2016 15:54:54","severity":"Warning","threat_type":"trojan","threat_name":"HTML/Agent.V","scanner_id":"HTTP filter","scan_id":"virlog.dat","engine_version":"13773 (20160708)","object_type":"file","object_uri":"http://malware.wicar.org/data/java_jre17_exec.html","action_taken":"connection terminated","threat_handled":true,"need_restart":false,"username":"BATHSAVER\\sickes","processname":"C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe"}
Here is my logstash conf:
input {
  udp {
    type => "esetlog"
    port => 5515
  }
  tcp {
    type => "esetlog"
    port => 5515
  }
}
filter {
  if [type] == "esetlog" {
    grok {
      match => { "message" => "%{DATA:timestamp}\ %{IPV4:clientip}\ <%{POSINT:num1}>%{POSINT:num2}\ %{DATA:syslogtimestamp}\ %{HOSTNAME}\ %{IPORHOST}\ %{POSINT:syslog_pid}\ %{DATA:type}\ %{GREEDYDATA:msg}" }
    }
    kv {
      source => "msg"
      value_split => ":"
      target => "kv"
    }
  }
}
output {
  elasticsearch {
    hosts => ['192.168.1.116:9200']
    index => "eset-%{+YYYY.MM.dd}"
  }
}
When the data is displayed in Kibana, other than the date and time everything is lumped together in the "message" field, with no separate key/value pairs.
I've been reading and searching for a week now. I've done similar things with other log files with no problems at all, so I'm not sure what I'm missing. Any help/suggestions are greatly appreciated.
Can you try the below Logstash configuration?
grok {
  match => {
    "message" => ["%{CISCOTIMESTAMP:timestamp} %{IPV4:clientip} %{POSINT:num1} %{TIMESTAMP_ISO8601:syslogtimestamp} %{USERNAME:hostname} %{USERNAME:iporhost} %{NUMBER:syslog_pid} Syslog %{GREEDYDATA:msg}"]
  }
}
json {
  source => "msg"
}
This works; it was tested at http://grokconstructor.appspot.com/do/match#result
Regards.
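For completeness, the answer's grok and json filters slotted into the asker's existing conditional (a sketch; the esetlog type comes from the question's inputs):

```conf
filter {
  if [type] == "esetlog" {
    grok {
      match => {
        "message" => ["%{CISCOTIMESTAMP:timestamp} %{IPV4:clientip} %{POSINT:num1} %{TIMESTAMP_ISO8601:syslogtimestamp} %{USERNAME:hostname} %{USERNAME:iporhost} %{NUMBER:syslog_pid} Syslog %{GREEDYDATA:msg}"]
      }
    }
    # Parse the JSON payload into top-level fields
    # instead of splitting it with kv.
    json {
      source => "msg"
    }
  }
}
```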

How to get metric plugin working in logstash

I am failing to understand how to print the metric.
With following logstash config
input {
  generator {
    type => "generated"
  }
}
filter {
  metrics {
    type => "generated"
    meter => "events"
    add_tag => "metric"
  }
}
output {
  stdout {
    tags => "metric"
    message => "rate: %{events.rate_1m}"
  }
}
all I see is
rate: %{events.rate_1m}
rate: %{events.rate_1m}
instead of the actual value.
When I enable debug in stdout, I see that @fields has the data that the metrics filter is supposed to print:
"#fields" => {
"events.count" => 114175,
"events.rate_1m" => 6478.26368594885,
"events.rate_5m" => 5803.767865770155,
"events.rate_15m" => 5686.915084346328
},
How do I access @fields.events.count?
logstash version = 1.1.13
It looks like a known issue in Logstash 1.1.13 and lower.
One needs to escape the '.' in %{events.rate_1m}, writing it as %{events\.rate_1m}.
Details are in this Logstash JIRA.
