Logstash mixed logs - multiline and normal

I use a file input to read logs, like this:
input {
  file {
    path => "/home/ec2-user/*.log"
  }
}
In one of the log files, some events are logged on a single line:
2018-12-10 10:01:30.1097|0|Services.Services|INFO| Message: test
Others are multiline, like this one:
2018-12-10 10:01:30.1097|0|Services.Services|INFO| Message: {
"account_id": "ec812648-3857-4625-9d9a-fc8ce1835493",
"name": "Player_539017",
"creation_time": "10/12/2018 10:52:52",
"hq_level": 2,
"force": 2570
} successfully dequeued |url: |action:
How can I capture both kinds of messages with a Logstash filter?

Below is an example, based on the multiline codec documentation, that uses the multiline codec so that everything from one timestamp-prefixed line up to the next becomes a single event. This will work for both of the log events shown above.
file {
  path => "/home/ec2-user/*.log"
  codec => multiline {
    # Grok pattern names are valid
    pattern => "^%{TIMESTAMP_ISO8601} "
    negate => true
    what => "previous"
  }
}
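With negate => true and what => "previous", any line that does not begin with an ISO8601 timestamp is appended to the event that precedes it, so the multi-line JSON payload in the second example gets folded into its leading timestamp line, while single-line events pass through unchanged.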

Related

Logstash: unable to filter lines from metrics

I need to collect metrics from a URL. The format of the metrics looks like this:
# HELP base:classloader_total_loaded_class_count Displays the total number of classes that have been loaded since the Java virtual machine has started execution.
# TYPE base:classloader_total_loaded_class_count counter
base:classloader_total_loaded_class_count 23003.0
I need to exclude from the collected events all lines that begin with a '#' character.
So I set up the following configuration file:
input {
  http_poller {
    urls => {
      pool_metrics => {
        method => "get"
        url => "http://localhost:10090/metrics"
        headers => {
          "Content-Type" => "text/plain"
        }
      }
    }
    request_timeout => 30
    schedule => { cron => "* * * * * UTC" }
    codec => multiline {
      pattern => "^#"
      negate => "true"
      what => previous
    }
    type => "server_metrics"
  }
}
output {
  elasticsearch {
    # An index is created for each type of metrics input
    index => "logstash-%{type}"
  }
}
Unfortunately, when I check the collected data in Elasticsearch, it's not really what I was expecting. For example:
{
  "_index" : "logstash-server_metrics",
  "_type" : "doc",
  "_id" : "2egAvWcBwbQ9kTetvX2o",
  "_score" : 1.0,
  "_source" : {
    "type" : "server_metrics",
    "tags" : [
      "multiline"
    ],
    "message" : "# TYPE base:gc_ps_scavenge_count counter\nbase:gc_ps_scavenge_count 24.0",
    "@version" : "1",
    "@timestamp" : "2018-12-17T16:30:01.009Z"
  }
},
So it seems that the lines starting with '#' aren't skipped; instead, the following metric line is appended to them.
Can you recommend any way to fix it?
The multiline codec doesn't work that way: it merges lines into a single event, appending the lines that don't match ^# to the previous event, as you have observed.
I don't think it's possible to drop messages with a codec; you'll have to use the drop filter instead.
First remove the codec from your input configuration, then add this filter section to your configuration:
filter {
  if [message] =~ "^#" {
    drop {}
  }
}
Using conditionals, if the message matches ^#, the event will be dropped by the drop filter, as you wanted.
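For reference, here is a minimal sketch of the revised pipeline, combining the poller input (codec removed) with the drop filter; the URL, schedule, and type are taken from the question:
input {
  http_poller {
    urls => {
      pool_metrics => {
        method => "get"
        url => "http://localhost:10090/metrics"
      }
    }
    request_timeout => 30
    schedule => { cron => "* * * * * UTC" }
    type => "server_metrics"
  }
}
filter {
  # Drop every comment line (e.g. "# HELP ..." / "# TYPE ...")
  if [message] =~ "^#" {
    drop {}
  }
}
output {
  elasticsearch {
    index => "logstash-%{type}"
  }
}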

Logstash not importing Apache log files

I am new to Logstash and trying to import an Apache log file into Elasticsearch. I see the error below:
[ERROR] 2017-10-28 00:38:51.085 [LogStash::Runner] agent - Cannot create pipeline {:reason=>"Expected one of #, {, } at line 4, column 19 (byte 81) after input {\nfile {\npath =>\"/home/monus/logstash-tutorial-dataset“\nstart_position =>\""}
Here is my logstash.conf file:
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
That's Logstash's way of telling you it found a syntax error on line 4, column 19 of the config file. As line 4 in your snippet is a closing bracket, and you have no input {} section at all in the snippet, I'd look in your input section for the syntax error. Note that the quoted error even shows a curly quote (“) right after /home/monus/logstash-tutorial-dataset, which would produce exactly this kind of parse failure.
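As a hedged sketch (the exact file layout is an assumption; the path and start_position are taken from the error message), a well-formed input section with plain ASCII quotes would look like this:
input {
  file {
    # note the straight double quotes around the path
    path => "/home/monus/logstash-tutorial-dataset"
    start_position => "beginning"
  }
}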

configuring the logstash-output-csv

I am pretty new to Logstash and have been trying to convert an existing log into CSV format using the logstash-output-csv plugin.
My input log string, a custom log line written by our application, looks as follows:
'128.111.111.11/cpu0/log:5988:W/"00601654e51a15472-76":687358:<9>2015/08/18 21:06:56.05: comp/45 55% of memory in use: 2787115008 bytes (change of 0)'
I wrote a quick regex and added it to the patterns_dir used by the grok plugin.
My patterns are as follows:
IP_ADDRESS [0-9,.]+
CPU [0-9]
NSFW \S+
NUMBER [0-9]
DATE [0-9,/]+\s+[0-9]+[:]+[0-9]+[:]+[0-9,.]+
TIME \S+
COMPONENT_ID \S+
LOG_MESSAGE .+
Without adding any CSV filters, I was able to get this output:
{
  "message" => "128.111.111.11/cpu0/log:5988:W/"00601654e51a15472-76":687358:<9>2015/08/18 21:06:56.05: comp/45 55% of memory in use: 2787115008 bytes (change of 0)",
  "@version" => "1",
  "@timestamp" => "2015-08-18T21:06:56.05Z",
  "host" => "hostname",
  "path" => "/usr/phd/raveesh/sample.log_20150819000609",
  "tags" => [
    [0] "_grokparsefailure"
  ]
}
This is my configuration for getting CSV as output:
input {
  file {
    path => "/usr/phd/raveesh/temporary.log_20150819000609"
    start_position => "beginning"
  }
}
filter {
  grok {
    patterns_dir => "./patterns"
    match => ["message", "%{IP_ADDRESS:ipaddress}/%{CPU:cpu}/%{NSFW:nsfw}<%{NUMBER:number}>%{DATE}:%{SPACE:space}%{COMPONENT_ID:componentId}%{SPACE:space}%{LOG_MESSAGE:logmessage}" ]
    break_on_match => false
  }
  csv {
    add_field => { "ipaddress" => "%{ipaddress}" }
  }
}
output {
  # Print each event to stdout.
  csv {
    fields => ["ipaddress"]
    path => "./logs/firmwareEvents.log"
  }
  stdout {
    # Enabling the 'rubydebug' codec on the stdout output will make Logstash
    # pretty-print the entire event as something similar to a JSON representation.
    codec => rubydebug
  }
}
The above configuration does not seem to produce the output. I am trying to print only the ipaddress in a CSV file for now, but eventually I need to print all the captured fields. So I need output like this:
128.111.111.111,cpu0,nsfw, ....
Could you please let me know the changes I need to make?
Thanks in advance
EDIT:
I fixed the regex as suggested, using the tool http://grokconstructor.appspot.com/do/match#result
Now my grok filter looks as follows:
%{IP:client}\/%{WORD:cpu}\/%{NOTSPACE:nsfw}<%{NUMBER:number}>%{YEAR:year}\/%{MONTHNUM:month}\/%{MONTHDAY:day}%{SPACE:space}%{TIME:time}:%{SPACE:space2}%{NOTSPACE:comp}%{SPACE:space3}%{GREEDYDATA:messagetext}
How do I capture the individual fields and save them as CSV?
Thanks
EDIT:
I finally resolved this using the file output plugin:
output {
  file {
    path => "./logs/sample.log"
    # message_format writes one formatted line per event
    message_format => "%{client},%{number}"
  }
}
The csv filter is for parsing the input, exploding a delimited message into key/value pairs.
In your case you are already parsing the input with grok, so I bet you don't need the csv filter.
But in the output we can see there is a grok failure:
{
  "message" => "128.111.111.11/cpu0/log:5988:W/"00601654e51a15472-76":687358:<9>2015/08/18 21:06:56.05: comp/45 55% of memory in use: 2787115008 bytes (change of 0)",
  "@version" => "1",
  "@timestamp" => "2015-08-18T21:06:56.05Z",
  "host" => "hostname",
  "path" => "/usr/phd/raveesh/sample.log_20150819000609",
  "tags" => [
    [0] "_grokparsefailure"
  ]
}
That means your grok expression cannot parse the input.
You should fix the expression to match your input, and then the csv will output properly.
Check out http://grokconstructor.appspot.com/do/match for some help
BTW, are you sure the patterns NSFW, CPU, COMPONENT_ID, ... are defined somewhere?
HTH
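As a follow-up sketch (the field names are assumed from the revised grok expression in the question's edit, not from the original answer), the csv output can list every captured field directly:
output {
  csv {
    # each captured grok field becomes one CSV column, in this order
    fields => ["client", "cpu", "nsfw", "number", "comp", "messagetext"]
    path => "./logs/firmwareEvents.csv"
  }
}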

Collect Data from log files with logstash

I'm trying to use Logstash to collect data from a NETASQ firewall log file, which contains a lot of lines, but I cannot collect my data correctly. I don't know if there is a standard to follow, but I started like this:
input {
  stdin { }
  file {
    type => "FireWall"
    path => "/var/log/file.log"
    start_position => 'beginning'
  }
}
filter {
  grok {
    match => [ "message", "%{SYSLOGTIMESTAMP:date} %{WORD:id}"]
  }
}
output {
  stdout { }
  elasticsearch {
    cluster => "logstash"
  }
}
The first line of my file.log looks like this :
Feb 27 04:02:23 id=firewall time="2015-02-27 04:02:23" fw="GVGM-NEWYORK"
tz=+0200 startime="2015-02-27 04:02:22" pri=5 confid=01 slotlevel=2 ruleid=57
srcif="Vlan2" srcifname="SSSSS" ipproto=udp dstif="Ethernet0"
dstifname="out" proto=teredo src=192.168.21.12 srcport=52469
srcportname=ephemeral_fw_udp dst=94.245.121.253 dstport=3544
dstportname=teredo dstname=teredo.ipv6.microsoft.com.nsatc.net
action=block logtype="filter"#015
Finally, how can I collect data from the other lines? Please give me a pointer just to get started. Thanks all.
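No answer was recorded for this question, but since the sample line is mostly key=value pairs, one hedged approach (an assumption, not from the original thread) is to grab the leading syslog timestamp with grok and hand the remainder to the kv filter:
filter {
  grok {
    # capture the leading timestamp; keep the rest of the line for kv parsing
    match => [ "message", "%{SYSLOGTIMESTAMP:date} %{GREEDYDATA:kv_pairs}" ]
  }
  kv {
    # splits id=firewall fw="GVGM-NEWYORK" tz=+0200 ... into individual fields
    source => "kv_pairs"
  }
}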

Logstash - grok use a field other than message

I am receiving Log4j-generated log files from remote servers using Logstash Forwarder. The log event has fields, including a field named "file", in the format /tomcat/logs/app.log, /tomcat/logs/app.log.1, etc. Of course the file path /tomcat/logs is on the remote machine, and I would like Logstash to create files on the local file system using only the file name, not the remote file path.
Locally, I would like to create a file based on the file name app.log, app.log.1, etc. How can one accomplish this?
I am unable to use grok since it appears to work only with the "message" field and not others.
Example Log Event:
{"message":["11 Sep 2014 16:29:04,934 INFO LOG MESSAGE DETAILS HERE "],"@version":"1","@timestamp":"2014-09-15T05:44:43.472Z","file":["/tomcat/logs/app.log.1"],"host":"aus-002157","offset":"3116","type":"app.log"}
Logstash configuration - what do I use to write the filter section?
input {
  lumberjack {
    port => 48080
    ssl_certificate => "/tools/LogStash/logstash-1.4.2/ssl/logstash.crt"
    ssl_key => "/tools/LogStash/logstash-1.4.2/ssl/logstash.key"
  }
}
filter {
}
output {
  file {
    #message_format => "%{message}"
    flush_interval => 0
    path => "/tmp/%{host}.%{type}.%{filename}"
    max_size => "4M"
  }
}
Figured out the pattern to be as follows:
grok {
  match => [ "file", "^(/.*/)(?<filename>(.*))$" ]
}
Thanks for the help!
Logstash grok can parse any field in a log event, not only the message field.
For example, if you want to extract the file field, you can do it like this:
filter {
  grok {
    match => [ "file", "your pattern" ]
  }
}
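Putting the two pieces together, a minimal sketch (using only the pattern and output path already shown above) looks like this:
filter {
  grok {
    # capture everything after the last slash of the remote path as "filename"
    match => [ "file", "^(/.*/)(?<filename>(.*))$" ]
  }
}
output {
  file {
    # filename was extracted above, e.g. app.log or app.log.1
    path => "/tmp/%{host}.%{type}.%{filename}"
  }
}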
