logstash-forwarder start_position => beginning - logstash

I am doing centralized logging with Logstash. I am using logstash-forwarder on the shipper node and the ELK stack on the collector node. The issue is that I want Logstash to parse a file that lives on the shipper node from the beginning. The logstash-forwarder.conf file on the shipper has the following configuration:
{
  "network": {
    "servers": [ "XXX.XX.XX.XXX:5000" ],
    "timeout": 15,
    "ssl ca": "/etc/pki/tls/certs/logstash-forwarder.crt"
  },
  "files": [
    {
      "paths": [
        "/apps/newlogs.txt"
      ],
      "fields": { "type": "syslog" }
    }
  ]
}
And the collector configuration is :
input {
  lumberjack {
    port => 5000
    type => "logs"
    ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
    ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
  }
}
filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:logdate}\s%{LOGLEVEL:level}\s-\s%{WORD:USE_CASE}\s:\s%{WORD:STEP_DETAIL}\s:\s\[%{WORD:XXX}\]\s:\s(?<XXX>([^\s]+))\s:\s%{GREEDYDATA:MESSAGE_DETAILS}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
      add_tag => [ "level:%{level}" ]
      add_tag => [ "USE_CASE:%{USE_CASE}" ]
    }
  }
}
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
I want the file to be parsed from the beginning, not only for newly generated events. In logstash.conf this is easy to do by specifying start_position => beginning, but I am unable to find a straightforward way to do it in logstash-forwarder, since the file is on the shipper side.
Thanks.

As far as I'm aware, the default behaviour for logstash-forwarder is to start from the beginning of a file - so the shipper should already be reading from the start as intended.
You haven't said what you've tried doing to diagnose the problem. If you haven't already done so, I would temporarily bypass the collector to confirm that the shipper is working as expected and rule out potential issues with the certificates.

Related

All messages receive a "user level notice"

I'm trying to parse messages from my network devices, which send messages in a format similar to
<30>Feb 14 11:33:59 wireless: ath0 Sending auth to xx:xx:xx:xx:xx:xx. Status: The request has been declined due to MAC ACL (52).\n
<190>Feb 14 11:01:29 CCR00 user admin logged out from xx.xx.xx.xx via winbox
<134>2023 Feb 14 11:00:33 ZTE command-log:An alarm 36609 level notification occurred at 11:00:33 02/14/2023 CET sent by MCP GponRm notify: <gpon-onu_1/1/1:1> SubType:1 Pos:1 ONU Uni lan los. restore\n on \n
using this logstash.conf file
input {
  beats {
    port => 5044
  }
  tcp {
    port => 50000
  }
  udp {
    port => 50000
  }
}
## Add your filters / logstash plugins configuration here
filter {
  grok {
    match => {
      "message" => "^(?:<%{POSINT:syslog_pri}>)?%{GREEDYDATA:message_payload}"
    }
  }
  syslog_pri {
  }
  mutate {
    remove_field => [ "@version" , "message" ]
  }
}
output {
  stdout {}
  elasticsearch {
    hosts => "elasticsearch:9200"
    user => "logstash_internal"
    password => "${LOGSTASH_INTERNAL_PASSWORD}"
  }
}
which results in this output
{
"@timestamp": [
"2023-02-14T10:38:59.228Z"
],
"data_stream.dataset": [
"generic"
],
"data_stream.namespace": [
"default"
],
"data_stream.type": [
"logs"
],
"event.original": [
"<14> Feb 14 11:38:59 UBNT BOXSERV[boxs Req]: boxs.c(691) 55381193 %% Error 17 occurred reading thermal sensor 2 data\n\u0000"
],
"host.ip": [
"10.125.132.10"
],
"log.syslog.facility.code": [
1
],
"log.syslog.facility.name": [
"user-level"
],
"log.syslog.severity.code": [
5
],
"log.syslog.severity.name": [
"notice"
],
"message_payload": [
" Feb 14 11:38:59 UBNT[boxs Req]: boxs.c(691) 55381193 %% Error 17 occurred reading thermal sensor 2 data\n\u0000"
],
"syslog_pri": [
"14"
],
"_id": "UzmBT4YBAZPdbqc4m_IB",
"_index": ".ds-logs-generic-default-2023.02.04-000001",
"_score": null
}
which is mostly satisfactory, but I would expect the log.syslog.facility.name and log.syslog.severity.name fields to be processed by the syslog_pri filter so that an input of <14> results in secur/auth and Alert respectively. Instead, I keep getting the default user-level notice for all my messages, no matter what that part of the syslog message contains.
Could anyone advise, and maybe fix my .conf syntax if it's wrong?
Thank you very much!
I have Logstash configured properly to receive logs and send them to Elasticsearch, but grok/syslog_pri doesn't yield the expected results.
The fact that the syslog_pri filter is setting [log][syslog][facility][code] shows that it has ECS compatibility enabled. As a result, if you do not set the syslog_pri_field_name option on the syslog_pri filter, it will try to parse [log][syslog][priority]. If that field does not exist then it will parse the default value of 13, which is user-level/notice.
Thank you for the answer. I have adjusted the config as advised:
filter {
  grok {
    match => { "message" => "^(?:<%{POSINT:syslog_code}>)?%{GREEDYDATA:message_payload}" }
  }
  syslog_pri {
    syslog_pri_field_name => "syslog_code"
  }
  mutate {
    remove_field => [ "@version" , "message" ]
  }
}
and now it behaves as intended
"event" => {
"original" => "<30>Feb 15 18:41:04 dnsmasq-dhcp[960]: DHCPACK(eth0) 10.0.0.165 xx:xx:xx:xx:xx CZ\n"
},
"@timestamp" => 2023-02-15T17:41:04.977038615Z,
"message_payload" => "Feb 15 18:41:04 dnsmasq-dhcp[960]: DHCPACK(eth0) 10.0.0.165 xx:xx:xx:xx:xx CZ\n",
"log" => {
"syslog" => {
"severity" => {
"code" => 6,
"name" => "informational"
},
"facility" => {
"code" => 3,
"name" => "daemon"
}
}
},
"syslog_code" => "30",
"host" => {
"ip" => "xx.xx.xx.xx"
} }
I will adjust the message a bit to fit my needs, but that is out of the scope of this question.
Thank you very much!

LogStash Conf | Drop Empty Lines

The contents of Logstash's conf file look like this:
input {
  beats {
    port => 5044
  }
  file {
    path => "/usr/share/logstash/iway_logs/*"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    #ignore_older => 0
    codec => multiline {
      pattern => "^\[%{NOTSPACE:timestamp}\]"
      negate => true
      what => "previous"
      max_lines => 2500
    }
  }
}
filter {
  grok {
    match => { "message" =>
      ['(?m)\[%{NOTSPACE:timestamp}\]%{SPACE}%{WORD:level}%{SPACE}\(%{NOTSPACE:entity}\)%{SPACE}%{GREEDYDATA:rawlog}'
      ]
    }
  }
  date {
    match => [ "timestamp", "yyyy-MM-dd'T'HH:mm:ss.SSS"]
    target => "@timestamp"
  }
  grok {
    match => { "entity" => ['(?:W.%{GREEDYDATA:channel}:%{GREEDYDATA:inlet}:%{GREEDYDATA:listener}\.%{GREEDYDATA:workerid}|W.%{GREEDYDATA:channel}\.%{GREEDYDATA:workerid}|%{GREEDYDATA:channel}:%{GREEDYDATA:inlet}:%{GREEDYDATA:listener}\.%{GREEDYDATA:workerid}|%{GREEDYDATA:channel}:%{GREEDYDATA:inlet}:%{GREEDYDATA:listener}|%{GREEDYDATA:channel})']
    }
  }
  dissect {
    mapping => {
      "[log][file][path]" => "/usr/share/logstash/iway_logs/%{serverName}#%{configName}#%{?ignore}.log"
    }
  }
}
output {
  elasticsearch {
    hosts => "${ELASTICSEARCH_HOST_PORT}"
    index => "iway_"
    user => "${ELASTIC_USERNAME}"
    password => "${ELASTIC_PASSWORD}"
    ssl => true
    ssl_certificate_verification => false
    cacert => "/certs/ca.crt"
  }
}
As one can make out, the idea is to parse a custom log employing multiline extraction. The extraction does its job. The log occasionally contains an empty first line. So:
[2022-11-29T12:23:15.073] DEBUG (manager) Generic XPath iFL functions use full XPath 1.0 syntax
[2022-11-29T12:23:15.074] DEBUG (manager) XPath 1.0 iFL functions use iWay's full syntax implementation
which naturally causes Kibana to report an empty line.
In an attempt to suppress this line from being sent to ES, I added the following as the last filter item:
if ![message] {
  drop { }
}
if [message] =~ /^\s*$/ {
  drop { }
}
The resulting JSON payload to ES:
{
"@timestamp": [
"2022-12-09T14:09:35.616Z"
],
"@version": [
"1"
],
"@version.keyword": [
"1"
],
"event.original": [
"\r"
],
"event.original.keyword": [
"\r"
],
"host.name": [
"xxx"
],
"host.name.keyword": [
"xxx"
],
"log.file.path": [
"/usr/share/logstash/iway_logs/localhost#iCLP#iway_2022-11-29T12_23_33.log"
],
"log.file.path.keyword": [
"/usr/share/logstash/iway_logs/localhost#iCLP#iway_2022-11-29T12_23_33.log"
],
"message": [
"\r"
],
"message.keyword": [
"\r"
],
"tags": [
"_grokparsefailure"
],
"tags.keyword": [
"_grokparsefailure"
],
"_id": "oRc494QBirnaojU7W0Uf",
"_index": "iway_",
"_score": null
}
While this does drop the empty first line, it also unfortunately interferes with the multiline operation on other lines. In other words, the multiline operation does not work anymore. What am I doing incorrectly?
Use of the following variation resolved the issue:
if [message] =~ /\A\s*\Z/ {
drop { }
}
This solution is based on Badger's answer provided on the Logstash forums, where this question was raised as well.
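The difference between the two patterns comes down to anchors. Logstash conditionals use Ruby regular expressions, where ^ and $ match at every line boundary inside the string, so /^\s*$/ also matches any multiline event that merely contains a blank line. \A and \Z anchor to the whole string instead. The following Python sketch only emulates this: re.MULTILINE is used to mimic Ruby's line anchors, and the sample events are made up for illustration. Python's \Z is also slightly stricter than Ruby's (it never matches before a trailing newline), which does not affect these samples.

```python
import re

# Emulates Ruby's /^\s*$/: with re.MULTILINE, ^ and $ match at every line boundary.
LINE_ANCHORED = re.compile(r"^\s*$", re.MULTILINE)
# Equivalent of /\A\s*\Z/ for these samples: the WHOLE string must be whitespace.
STRING_ANCHORED = re.compile(r"\A\s*\Z")

# Hypothetical events: a joined multiline event with a blank line inside it,
# a normal single-line event, and the stray "\r" event from the question.
multiline_event = "[2022-11-29T12:23:15.073] DEBUG (manager) first line\n\n  continuation line"
single_line_event = "[2022-11-29T12:23:15.074] DEBUG (manager) one line"
blank_event = "\r"


def would_drop(pattern, message):
    """Return True if a conditional using this pattern would drop the event."""
    return pattern.search(message) is not None


if __name__ == "__main__":
    for name, event in [("multiline", multiline_event),
                        ("single", single_line_event),
                        ("blank", blank_event)]:
        print(name,
              "line-anchored:", would_drop(LINE_ANCHORED, event),
              "string-anchored:", would_drop(STRING_ANCHORED, event))
```

Only the string-anchored pattern drops the blank event while leaving the multiline event alone.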

logstash-output-file-as time based

My Logstash output is directed to a file called apache.log.
A new file needs to be generated every hour.
For example: apache-2018-04-16-10:00.log or something similar.
Here is my configuration file:
# INPUT HERE
input {
  beats {
    port => 5044
  }
}
# FILTER HERE
filter {
  if [source]=="/var/log/apache2/error.log"
  {
    mutate {
      remove_tag => [ "beats_input_codec_plain_applied" ]
      add_tag => [ "apache_logs" ]
    }
  }
  if [source]=="/var/log/apache2/access.log"
  {
    mutate {
      remove_tag => [ "beats_input_codec_plain_applied" ]
      add_tag => [ "apache_logs" ]
    }
  }
}
# OUTPUT HERE
output {
  if "apache_logs" in [tags] {
    file {
      path => "/home/ubuntu/apache/apache-%{+yyyy-mm-dd}.log"
      codec => "json"
    }
  }
}
Please help me solve this.
From the Joda-Time documentation (http://www.joda.org/joda-time/key_format.html), H is the hour of day (0~23). Note also that in Joda-Time patterns MM is the month of year, while mm is the minute of hour, so the date part should use MM. The output configuration to solve your problem would be:
output {
  if "apache_logs" in [tags] {
    file {
      path => "/home/ubuntu/apache/apache-%{+yyyy-MM-dd-HH}.log"
      codec => "json"
    }
  }
}
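As a rough illustration of how a time-based path buckets events into one file per hour, here is a small Python sketch. It uses strftime as a stand-in for Joda-Time patterns (an assumption for illustration only: %m plays the role of MM, %M the role of mm, and %H the role of HH), and the function names are hypothetical. It also shows what happens when the minute token is used where the month is meant.

```python
from datetime import datetime


def hourly_log_path(ts):
    """Path with year-month-day-hour, analogous to the Joda pattern yyyy-MM-dd-HH."""
    return ts.strftime("/home/ubuntu/apache/apache-%Y-%m-%d-%H.log")


def minute_in_month_slot_path(ts):
    """Same path, but with the minute token where the month belongs (like yyyy-mm-dd-HH)."""
    return ts.strftime("/home/ubuntu/apache/apache-%Y-%M-%d-%H.log")


if __name__ == "__main__":
    early = datetime(2018, 4, 16, 10, 5)
    late = datetime(2018, 4, 16, 10, 59)
    next_hour = datetime(2018, 4, 16, 11, 0)
    # Events within the same hour share one file; the next hour starts a new one.
    print(hourly_log_path(early))
    print(hourly_log_path(late))
    print(hourly_log_path(next_hour))
    # With the minute token in the month slot, the name changes every minute.
    print(minute_in_month_slot_path(early))
    print(minute_in_month_slot_path(late))
```

Events at 10:05 and 10:59 land in the same file, while the minute-based variant produces a different name for each of them.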

Use Logstash with HTML log

I'm new to Logstash and am trying to use it to parse an HTML log file.
I need to output only the log lines, i.e. ignore preceding JS, CSS and HTML that are also included in the file.
A log line in the file looks like this:
<tr bgcolor="tomato"><td>Jan 28<br>13:52:25.692</td><td>Jan 28<br>13:52:23.950</td><td>qtp114615276-1648 [POST] [call_id:-8009072655119858507]</td><td>REST</td><td>sa</td><td>0.0.0.0</td><td>ERR</td><td>ProjectValidator.validate(36)</td><td>Project does not exist</td></tr>
I have no problem getting all the lines, but I would like to have an output which contains only the relevant ones, without HTML tags, and looks something like that:
{
  "db_timestamp": "2015-01-28 13:52:25.692",
  "server_timestamp": "2015-01-28 13:52:25.950",
  "node": "qtp114615276-1648 [POST] [call_id:-8009072655119858507]",
  "thread": "REST",
  "user": "sa",
  "ip": "0.0.0.0",
  "level": "ERR",
  "method": "ProjectValidator.validate(36)",
  "message": "Project does not exist"
}
My Logstash configuration is:
input {
  file {
    type => "request"
    path => "<some path>/*.log"
    start_position => "beginning"
  }
  file {
    type => "log"
    path => "<some path>/*.html"
    start_position => "beginning"
  }
}
filter {
  if [type] == "log" {
    grok {
      match => [ WHAT SHOULD I PUT HERE??? ]
    }
  }
}
output {
  stdout {}
  if [type] == "request" {
    http {
      http_method => "post"
      url => "http://<some url>"
      mapping => ["type", "request", "host" ,"%{host}", "timestamp", "%{@timestamp}", "message", "%{message}"]
    }
  }
  if [type] == "log" {
    http {
      http_method => "post"
      url => "http://<some url>"
      mapping => [ ALSO WHAT SHOULD I PUT HERE??? ]
    }
  }
}
Is there a way to do that? So far I haven't found any relevant documentation or samples.
Thanks!
Finally figured out the answer.
Not sure this is the best or most elegant solution, but it works.
I changed the http output format to "message", which enabled me to override and format the whole message as JSON, instead of using mapping. Also, found out how to name parameters in the grok filter and use them in the output.
This is the new Logstash configuration file:
input {
  file {
    type => "request"
    path => "<some path>/*.log"
    start_position => "beginning"
  }
  file {
    type => "log"
    path => "<some path>/*.html"
    start_position => "beginning"
  }
}
filter {
  if [type] == "log" {
    grok {
      match => { "message" => "<tr bgcolor=.*><td>%{MONTH:db_date}%{SPACE}%{MONTHDAY:db_date}<br>%{TIME:db_date}</td><td>%{MONTH:alm_date}%{SPACE}%{MONTHDAY:alm_date}<br>%{TIME:alm_date}</td><td>%{DATA:thread}</td><td>%{DATA:req_type}</td><td>%{DATA:username}</td><td>%{IP:ip}</td><td>%{DATA:level}</td><td>%{DATA:method}</td><td>%{DATA:err_message}</td></tr>" }
    }
  }
}
output {
  stdout { codec => rubydebug }
  if [type] == "request" {
    http {
      http_method => "post"
      url => "http://<some URL>"
      mapping => ["type", "request", "host" ,"%{host}", "timestamp", "%{@timestamp}", "message", "%{message}"]
    }
  }
  if [type] == "log" {
    http {
      format => "message"
      content_type => "application/json"
      http_method => "post"
      url => "http://<some URL>"
      message => '{
        "db_date":"%{db_date}",
        "alm_date":"%{alm_date}",
        "thread": "%{thread}",
        "req_type": "%{req_type}",
        "username": "%{username}",
        "ip": "%{ip}",
        "level": "%{level}",
        "method": "%{method}",
        "message": "%{err_message}"
      }'
    }
  }
}
Note the single quote for the http message block and the double quotes for the parameters inside this block.
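To illustrate what the message template does, here is a minimal Python sketch that substitutes %{field} references into a JSON-shaped string. It is an emulation for illustration, not Logstash's implementation: the render and is_valid_json helpers are hypothetical, and leaving unknown references untouched is an assumption about the behaviour. It also shows one caveat of building JSON from a text template: a field value containing a double quote yields invalid JSON.

```python
import json
import re

# Matches %{field_name} references, as used in the message template.
FIELD_REF = re.compile(r"%\{([^}]+)\}")

TEMPLATE = '{"level": "%{level}", "method": "%{method}", "message": "%{err_message}"}'


def render(template, event):
    """Replace each %{name} with the event's value; unknown names are left as-is."""
    def substitute(match):
        name = match.group(1)
        return str(event[name]) if name in event else match.group(0)
    return FIELD_REF.sub(substitute, template)


def is_valid_json(text):
    """Return True if the text parses as JSON."""
    try:
        json.loads(text)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    ok_event = {"level": "ERR", "method": "ProjectValidator.validate(36)",
                "err_message": "Project does not exist"}
    quoted_event = dict(ok_event, err_message='Project "demo" does not exist')
    print(render(TEMPLATE, ok_event), is_valid_json(render(TEMPLATE, ok_event)))
    print(render(TEMPLATE, quoted_event), is_valid_json(render(TEMPLATE, quoted_event)))
```

If the log messages can contain quotes or backslashes, it may be worth escaping them before they reach the template.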
For anyone parsing HP ALM logs, the following Logstash filter will do the work:
grok {
  break_on_match => true
  match => [ "message", "<tr bgcolor=.*><td>%{MONTH:db_date_mon}%{SPACE}%{MONTHDAY:db_date_day}<br>%{TIME:db_date_time}<\/td><td>%{MONTH:alm_date_mon}%{SPACE}%{MONTHDAY:alm_date_day}<br>%{TIME:alm_date_time}<\/td><td>(?<thread_col1>.*?)<\/td><td>(?<request_type>.*?)<\/td><td>(?<login>.*?)<\/td><td>(?<ip>.*?)<\/td><td>(?<level>.*?)<\/td><td>(?<method>.*?)<\/td><td>(?m:(?<log_message>.*?))</td></tr>" ]
}
mutate {
  add_field => ["db_date", "%{db_date_mon} %{db_date_day}"]
  add_field => ["alm_date", "%{alm_date_mon} %{alm_date_day}"]
  remove_field => [ "db_date_mon", "db_date_day", "alm_date_mon", "alm_date_day" ]
  gsub => [
    "log_message", "<br>", "
"
  ]
  gsub => [
    "log_message", "<p>", " "
  ]
}
Tested and working fine with Logstash 2.4.0
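For readers who want to check the row structure outside Logstash, the same idea (one named capture per table cell, non-matching lines ignored) can be sketched in Python. This is an illustrative equivalent, not a translation of the grok patterns above: the regular expression is simplified, and the group names follow the output the question asked for.

```python
import re

# One named group per <td> cell of a log row; names are illustrative.
ROW = re.compile(
    r"<tr[^>]*>"
    r"<td>(?P<db_date>\w{3} \d{1,2})<br>(?P<db_time>[\d:.]+)</td>"
    r"<td>(?P<server_date>\w{3} \d{1,2})<br>(?P<server_time>[\d:.]+)</td>"
    r"<td>(?P<node>.*?)</td>"
    r"<td>(?P<thread>.*?)</td>"
    r"<td>(?P<user>.*?)</td>"
    r"<td>(?P<ip>.*?)</td>"
    r"<td>(?P<level>.*?)</td>"
    r"<td>(?P<method>.*?)</td>"
    r"<td>(?P<message>.*?)</td></tr>",
    re.DOTALL,  # lets a cell span several lines
)

SAMPLE = (
    '<tr bgcolor="tomato"><td>Jan 28<br>13:52:25.692</td>'
    "<td>Jan 28<br>13:52:23.950</td>"
    "<td>qtp114615276-1648 [POST] [call_id:-8009072655119858507]</td>"
    "<td>REST</td><td>sa</td><td>0.0.0.0</td><td>ERR</td>"
    "<td>ProjectValidator.validate(36)</td>"
    "<td>Project does not exist</td></tr>"
)


def parse_row(line):
    """Return a dict of cell values for a log row, or None for any other line."""
    match = ROW.search(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_row(SAMPLE))
    print(parse_row("<style>body { color: red; }</style>"))
```

Lines that are not log rows (CSS, JS, other HTML) return None, which mirrors dropping events that fail to match.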

Filtering specific lines from log file in logstash

I am not able to get specific lines from the log file /var/log/messages. I am using logstash-forwarder on the client server and Logstash, Elasticsearch and Kibana on the log server. I tried to install the grep filter, but it gave me an error, so I tried to implement the filtering below with grok. My original post is here. I found this, but I'm quite unsatisfied with it.
The following is the logstash-forwarder configuration on the client server (file name: logstash-forwarder):
{
  "network": {
    "servers": [ "logstashserver-ip:5000" ],
    "timeout": 15,
    "ssl ca": "xxx.crt"
  },
  "files": [
    {
      "paths": [
        "/var/log/messages"
      ],
      "fields": { "type": "syslog" }
    }
  ]
}
and the following is the Logstash configuration on the Logstash server:
file-name:input.conf
input {
  lumberjack {
    port => 5000
    type => "logs"
    ssl_certificate => "xxx.crt"
    ssl_key => "xxx.key"
  }
}
file-name:filter.conf
filter {
  grok {
    match => ["message", "\[%{WORD:messagetype}\]: %{GREEDYDATA}"]
  }
}
file-name:output.conf
output {
  elasticsearch { host => "logstashserver-ip" }
  if [messagetype] == "ERROR" {
    stdout {
      codec => "rubydebug"
    }
  }
}
Is there anything wrong?
Not sure if you're still having this problem, but I'd look at dropping the messages you don't want. On my server, I get syslog severity levels which include syslog_severity_code as defined at http://en.wikipedia.org/wiki/Syslog#Severity_levels.
If you're getting them in your indices, try something like
filter {
  if [type] == 'syslog' and [syslog_severity_code] > 5 {
    drop { }
  }
}
