Collect Data from log files with logstash - logstash

I'm trying with logstash to collect data from a log file for a version of NETASQ Firewall which contains a lot of lines , but i can not collect correctly my data , I don't know if there is a standard to follow, but I started like this:
input {
stdin { }
file {
type => "FireWall"
path => "/var/log/file.log"
start_position => 'beginning'
}
}
filter {
grok {
match => [ "message", "%{SYSLOGTIMESTAMP:date} %{WORD:id}"]
}
}
output {
stdout { }
elasticsearch {
cluster => "logstash"
}
}
The first line of my file.log looks like this :
Feb 27 04:02:23 id=firewall time="2015-02-27 04:02:23" fw="GVGM-NEWYORK"
tz=+0200 startime="2015-02-27 04:02:22" pri=5 confid=01 slotlevel=2 ruleid=57
srcif="Vlan2" srcifname="SSSSS" ipproto=udp dstif="Ethernet0"
dstifname="out" proto=teredo src=192.168.21.12 srcport=52469
srcportname=ephemeral_fw_udp dst=94.245.121.253 dstport=3544
dstportname=teredo dstname=teredo.ipv6.microsoft.com.nsatc.net
action=block logtype="filter"#015
And finally How can I collect data from the others lines. Please give me a topic just to start. Thanks All.

Related

supporting different kind of logs with Logstash

I'm trying to collect logs using Logstash where I have different kinds of logs in the same file.
I want to extract certain fields if they exist in the log, and otherwise do something else.
input{
file {
path => ["/home/ubuntu/XXX/XXX/results/**/log_file.txt"]
start_position => "beginning"
}
}
filter {
grok{
match => { "message" => ["%{WORD:logger} %{SPACE}\-%{SPACE} %{LOGLEVEL:level} %{SPACE}\-%{SPACE} %{DATA:message} %{NUMBER:score:float}",
"%{WORD:logger} %{SPACE}\-%{SPACE} %{LOGLEVEL:level} %{SPACE}\-%{SPACE} %{DATA:message}"]}
}
}
output{
elasticsearch {
hosts => ["X.X.X.X:9200"]
}
stdout { codec => rubydebug }
}
for example, log type 1 is:
root - INFO - Best score yet: 35.732
and type 2 is:
root - INFO - Starting an experiment
One of the problems I face is that when a message doesn't contain a number, the field still exists as null in the JSON created which prevents me from using desired functionalities in Kibana.
One option, is just to add a tag on logstash when field is not defined to be able to have a way to filter easily on kibana side. The null value is only set during the insertion in elasticsearch (on logstash side, the field is not defined)
This solution looks like this in you case :
filter {
grok{
match => { "message" => ["%{WORD:logger} %{SPACE}\-%{SPACE} %{LOGLEVEL:level} %{SPACE}\-%{SPACE} %{DATA:message} %{NUMBER:score:float}",
"%{WORD:logger} %{SPACE}\-%{SPACE} %{LOGLEVEL:level} %{SPACE}\-%{SPACE} %{DATA:message}"]}
}
if ![score] {
mutate { add_tag => [ "score_not_set" ] }
}
}

Can we configure logstash input to listen onlyto paricular set of hosts

Currently my logstash input is listening to filebeat on port XXXX,my requirement is to collect log data only from a particular hosts(let's say only from Webservers). I dont want to modify the filebeat configuration directly on to the servers but I want allow only the webservers logs to listen in.
Could anyone suggest how to configure the logstash in this scenario? Following is mylogstash input configuration.
**input {
beats {
port => 50XX
}
}**
In a word, "no", you cannot configure the input to restrict which hosts it will accept input from. What you can do is drop events from hosts you are not interested in. If the set of hosts you want to accept input from is small then you could do this using a conditional
if [beat][hostname] not in [ "hosta", "hostb", "hostc" ] { drop {} }
Similarly, if your hostnames follow a fixed pattern you might be able to do it using a regexp
if [beat][hostname] !~ /web\d+$/ { drop {} }
would drop events from any host whose name did not end in web followed by a number.
If you have a large set of hosts you could use a translate filter to determine if they are in the set. For example, if you create a csv file with a list of hosts
hosta,1
hostb,1
hostc,1
then do a lookup using
translate {
field => "[beat][hostname]"
dictionary_path => "/some/path/foo.csv"
destination => "[#metadata][field]"
fallback => "dropMe"
}
if [#metadata][field] == "dropMe" { drop {} }
#Badger - Thank you for your response!
As you rightly mentioned, I have large number of hosts, and all my web servers follows naming convention(for an example xxxwebxxx). Could you please brief me the following
translate {
field => "[beat][hostname]"
dictionary_path => "/some/path/foo.csv"
destination => "[#metadata][field]"
fallback => "dropMe"
}
if [#metadata][field] == "dropMe" { drop {}
Also, please suggest how to add the above to my logstash.conf, PFB this is how my logstash.conf looks like
input {
beats {
port => 5xxxx
}
}
filter {
if [type] == "XXX" {
grok {
match => [ "message", '"%{TIMESTAMP_ISO8601:logdate}"\t%{GREEDYDATA}']
}
grok {
match => [ "message", 'AUTHENTICATION-(?<xxx_status_code>[0-9]{3})']
}
grok {
match => [ "message", 'id=(?<user_id>%{DATA}),']
}
if ([user_id] =~ "_agent") {
drop {}
}
grok {
match => [ "message", '%{IP:clientip}' ]
}
date {
match => [ "logdate", "ISO8601", "YYYY-MM-dd HH:mm:ss"]
locale => "en"
}
geoip {
source => "clientip"
}
}
}
output {
elasticsearch {
hosts => ["hostname:port"]
}
stdout { }
}

Filebeat cant send data to logstash

I have problem to send data from *.log file to logstash. This is filebeat configuration:
filebeat.prospectors:
- type: log
enabled: true
paths:
- /home/centos/logs/*.log
filebeat.config.modules:
path: ${path.config}/modules.d/*.yml
reload.enabled: false
setup.template.settings:
index.number_of_shards: 3
setup.kibana:
output.logstash:
hosts: "10.206.81.234:5044"
This is logstash configuration:
path.data: /var/lib/logstash
path.config: /etc/logstash/conf.d/*.conf
path.logs: /var/log/logstash
xpack.monitoring.elasticsearch.url: ["10.206.81.236:9200", "10.206.81.242:9200", "10.206.81.243:9200"]
xpack.monitoring.elasticsearch.username: logstash_system
xpack.monitoring.elasticsearch.password: logstash
queue.type: persisted
queue.checkpoint.writes: 10
And this is my pipeline in /etc/logstash/conf.d/test.conf
input {
beats {
port => "5044"
}
file{
path => "/home/centos/logs/mylogs.log"
tags => "mylog"
}
file{
path => "/home/centos/logs/syslog.log"
tags => "syslog"
}
}
filter {
}
output {
if [tag] == "mylog" {
elasticsearch {
hosts => [ "10.206.81.246:9200", "10.206.81.236:9200", "10.206.81.243:9200" ]
user => "Test"
password => "123456"
index => "mylog-%{+YYYY.MM.dd}"
}
}
if [tag] == "syslog" {
elasticsearch {
hosts => [ "10.206.81.246:9200", "10.206.81.236:9200", "10.206.81.243:9200" ]
user => "Test"
password => "123456"
index => "syslog-%{+YYYY.MM.dd}"
}
}
}
I tried to have two separate outputs for mylog and syslog. At first, it works like this: everything was passed to mylog-%{+YYYY.MM.dd} index even files from syslog. So I tried change second if statement to else if. It did not work so I changed it back. Now, my filebeat are not able to send data to logstash and I am receiving this errors:
2018/01/20 15:02:10.959887 async.go:235: ERR Failed to publish events caused by: EOF
2018/01/20 15:02:10.964361 async.go:235: ERR Failed to publish events caused by: client is not connected
2018/01/20 15:02:11.964028 output.go:92: ERR Failed to publish events: client is not connected
My second test was change my pipeline like this:
input {
beats {
port => "5044"
}
file{
path => "/home/centos/logs/mylogs.log"
}
}
filter {
grok{
match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
}
}
output {
elasticsearch {
hosts => [ "10.206.81.246:9200", "10.206.81.236:9200", "10.206.81.243:9200" ]
user => "Test"
password => "123456"
index => "mylog-%{+YYYY.MM.dd}"
}
}
If I add some lines to mylog.log file, filebeat will print the same ERR files but it is passed to logstash and I can see it in Kibana. Could anybody explain me why does it not work? What does those errors means?
I am using filebeat an logstash version 6.1.
Sorry if I make any mistake in english.
In the output section, you are using "tag" (note: is in singular) which doesn't exists. But changing this to "tags" will not work either because the field tags is an array and you will be comparing it to a string, so you should get the first item instead of getting the whole array and then compare. Try this:
if [tags[0]] == "mylog" { ......

Logstash not printing anything

I am using logstash for the first time and trying to setup a simple pipeline for just printing the nginx logs. Below is my config file
input {
file {
path => "/var/log/nginx/*access*"
}
}
output {
stdout { codec => rubydebug }
}
I have saved the file as /opt/logstash/nginx_simple.conf
And trying to execute the following command
sudo /opt/logstash/bin/logstash -f /opt/logstash/nginx_simple.conf
However the only output I can see is:
Logstash startup completed
Logstash shutdown completed
The file is not empty for sure. As per my understanding I should be seeing the output on my console. What am I doing wrong ?
Make sure that the character encoding of your logfile is UTF-8. If it is not, try to change it and restart the Logstash.
Please try this code as your Logstash configuration, in order to setup a simple pipeline for just printing the nginx logs.
input {
file {
path => "/var/log/nginx/*.log"
type => "nginx"
start_position => "beginning"
sincedb_path=> "/dev/null"
}
}
filter {
if [type] == "nginx" {
grok {
patterns_dir => "/home/krishna/Downloads/logstash-2.1.0/pattern"
match => {
"message" => "%{NGINX_LOGPATTERN:data}"
}
}
}
date {
match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
}
}
output {
elasticsearch {
hosts => [ "127.0.0.1:9200" ]
}
stdout { codec => rubydebug }
}

Retrieving RESTful GET parameters in logstash

I am trying to get logstash to parse key-value pairs in an HTTP get request from my ELB log files.
the request field looks like
http://aaa.bbb/get?a=1&b=2
I'd like there to be a field for a and b in the log line above, and I am having trouble figuring it out.
My logstash conf (formatted for clarity) is below which does not load any additional key fields. I assume that I need to split off the address portion of the URI, but have not figured that out.
input {
file {
path => "/home/ubuntu/logs/**/*.log"
type => "elb"
start_position => "beginning"
sincedb_path => "log_sincedb"
}
}
filter {
if [type] == "elb" {
grok {
match => [ "message", "%{TIMESTAMP_ISO8601:timestamp}
%{NOTSPACE:loadbalancer} %{IP:client_ip}:%{NUMBER:client_port:int}
%{IP:backend_ip}:%{NUMBER:backend_port:int}
%{NUMBER:request_processing_time:float}
%{NUMBER:backend_processing_time:float}
%{NUMBER:response_processing_time:float}
%{NUMBER:elb_status_code:int}
%{NUMBER:backend_status_code:int}
%{NUMBER:received_bytes:int} %{NUMBER:sent_bytes:int}
%{QS:request}" ]
}
date {
match => [ "timestamp", "ISO8601" ]
}
kv {
field_split => "&?"
source => "request"
exclude_keys => ["callback"]
}
}
}
output {
elasticsearch { host => localhost }
}
kv will take a URL and split out the params. This config works:
input {
stdin { }
}
filter {
mutate {
add_field => { "request" => "http://aaa.bbb/get?a=1&b=2" }
}
kv {
field_split => "&?"
source => "request"
}
}
output {
stdout {
codec => rubydebug
}
}
stdout shows:
{
"request" => "http://aaa.bbb/get?a=1&b=2",
"a" => "1",
"b" => "2"
}
That said, I would encourage you to create your own versions of the default URI patterns so that they set fields. You can then pass the querystring field off to kv. It's cleaner that way.
UPDATE:
For "make your own patterns", I meant to take the existing ones and modify them as needed. In logstash 1.4, installing them was as easy as putting them in a new file the 'patterns' directory; I don't know about patterns for >1.4 yet.
MY_URIPATHPARAM %{URIPATH}(?:%{URIPARAM:myuriparams})?
MY_URI %{URIPROTO}://(?:%{USER}(?::[^#]*)?#)?(?:%{URIHOST})?(?:%{MY_URIPATHPARAM})?
Then you could use MY_URI in your grok{} pattern and it would create a field called myuriparams that you could feed to kv{}.

Resources