How to get the content of only one field in logstash

I'm working on syslogs that I receive over the network, but they arrive in a different format because of the transport.
So the real syslog is in a field named "message", and I'd like to build a filter that extracts the content of "message", filters it, and sends it to a file.
Here is what the data actually looks like:
{"#timestamp":"2020-10-12T14:17:16.944Z","message":"<190>key1=\"value1\" key2=\"value2\"","otherKey1":"otherValue1","otherKey2":"otherValue2"}
And here is my current configuration file:
input {
  file {
    path => "/var/log/logstash/syslog.txt"
    start_position => "beginning"
  }
}
filter {
  if ("" in [message]) {
    kv {
      value_split => "="
    }
    mutate {
      add_field => { "timestamp" => "%{date} %{time}" }
    }
    date {
      match => ["timestamp", "ISO8601", "yyyy-MM-dd HH:mm:ss"]
      target => "@timestamp"
      locale => "fr"
    }
    mutate {
      remove_field => ["date", "time", "timestamp"]
    }
    geoip {
      source => "remip"
    }
  }
}
output {
  file {
    path => "/var/log/logstash/systest.txt"
  }
}
Many thanks in advance for any help or advice!

For your situation, you just need to keep the message field in the event.
You can do this with the prune filter, which provides a whitelist mechanism.
So add this to your filter:
prune {
  whitelist_names => ["^message$"]
}
After this, only that field will be written to the file.
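For illustration, here is a minimal sketch of how that could fit into the existing pipeline, with prune placed last so the other filters still run on the full event (the output path is the one from the question; the commented-out line codec is an optional way to write just the raw message text instead of the whole JSON event):

filter {
  # ... existing kv / date / mutate / geoip filters ...

  # Drop every field except "message" before the event reaches the output.
  prune {
    whitelist_names => ["^message$"]
  }
}
output {
  file {
    path => "/var/log/logstash/systest.txt"
    # codec => line { format => "%{message}" }
  }
}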

Related

Fields parsed from log path not added in logstash

I'm parsing multiple log files with logstash - and want to add fields based on the path of the files to my output. Here are the relevant parts of the config file:
input {
  file {
    path => "/mnt/logs/**/console-20200108*.log"
    type => "tomcat"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  if [type] == "tomcat" {
    grok {
      patterns_dir => "/usr/share/logstash/patterns"
      match => {
        "message" => [ "%{TOMCAT_LOG_1}", "%{TOMCAT_LOG_2}" ]
        "path" => "\/mnt\/logs\/%{DATA:site}\/%{DATA:version}\/node%{NUMBER:node}\/store\/tomcat\/%{DATA:file}\.log"
      }
    }
  }
}
Here's a sample output:
{
  "level" => "INFO",
  "type" => "tomcat",
  "data" => "Finished indexer cronjob.\r",
  "timestamp" => "08-Jan-2020 11:00:05.860",
  "qualifier1" => "[update-backofficeIndex-CronJob::ServicelayerJob]",
  "@timestamp" => 2020-01-08T11:04:47.364Z,
  "path" => "/mnt/logs/protec/qa/node1/store/tomcat/console-20200108.log",
  "qualifier3" => "[SolrIndexerJob]",
  "host" => "elk",
  "@version" => "1",
  "message" => "INFO | jvm 1 | srvmain | 08-Jan-2020 11:00:05.860 INFO [update-backofficeIndex-CronJob::ServicelayerJob] (update-backofficeIndex-CronJob) [SolrIndexerJob] Finished indexer cronjob.\r",
  "qualifier2" => "(update-backofficeIndex-CronJob)"
}
Based on this, I was expecting to get relevant fields from parsing the message and a few more fields from parsing the path. Yet none of the fields from the "path" parsing are added.
What am I missing? How do I add site, version, node and file fields?
Creating an answer from a comment that solved the problem (hence community wiki).
Apparently it might be coming from the break_on_match option, which defaults to true. From the doc:
The first successful match by grok will result in the filter being finished. So if %{TOMCAT_LOG_1} or %{TOMCAT_LOG_2} match, it won't try to match the path field
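As a sketch of that fix (based on the grok filter from the question; TOMCAT_LOG_1 and TOMCAT_LOG_2 are the custom patterns from the question's patterns_dir), set break_on_match to false so grok still attempts the path pattern after a message pattern has matched:

filter {
  if [type] == "tomcat" {
    grok {
      patterns_dir   => "/usr/share/logstash/patterns"
      # Keep matching the remaining entries in the match hash
      # even after the first pattern succeeds.
      break_on_match => false
      match => {
        "message" => [ "%{TOMCAT_LOG_1}", "%{TOMCAT_LOG_2}" ]
        "path"    => "\/mnt\/logs\/%{DATA:site}\/%{DATA:version}\/node%{NUMBER:node}\/store\/tomcat\/%{DATA:file}\.log"
      }
    }
  }
}

An alternative is to leave break_on_match alone and move the path pattern into a second, separate grok filter.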

How to combine filters with logstash?

I'm currently discovering Elasticsearch, Kibana and Logstash with Docker (version 7.1.1). The three containers are running fine.
I have some data files containing some lines like this one:
foo=bar type=alpha T=20180306174204527
My logstash.conf contains:
input {
  file {
    path => "/tmp/data/*.txt"
    start_position => "beginning"
  }
}
filter {
  kv {
    field_split => "\t"
    value_split => "="
  }
}
output {
  elasticsearch { hosts => ["elasticsearch:9200"] }
  stdout {
    codec => rubydebug
  }
}
This gives me the following event:
{
  "host" => "07f3051a3bec",
  "foo" => "bar",
  "message" => "foo=bar\ttype=alpha\tT=20180306174204527",
  "T" => "20180306174204527",
  "@timestamp" => 2019-06-17T13:47:14.589Z,
  "path" => "/tmp/data/ucL12018_03_06.txt",
  "type" => "alpha",
  "@version" => "1"
}
The first step of the job is done.
Now I want to add a filter to transform the value of the key T into a timestamp, so that the event looks like this:
{
  ...
  "T" => "2018-03-06T17:42:04.527Z",
  "@timestamp" => 2019-06-17T13:47:14.589Z,
  ...
}
I do not know how to do it. I tried adding a second filter just after the kv filter, but nothing changes when I add new files.
Add this filter after the kv filter:
date {
  match => [ "T", "yyyyMMddHHmmssSSS" ]
  target => "T"
}
The date filter will try to parse the field T using the provided pattern to create a date, which will be written to the T field (by default it would overwrite the @timestamp field).
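Putting it together with the kv filter from the question, the filter section would then look roughly like this (a sketch; the field_split value is kept as posted):

filter {
  kv {
    field_split => "\t"
    value_split => "="
  }
  date {
    # Parse e.g. 20180306174204527 and store the result back in T,
    # leaving @timestamp untouched.
    match  => [ "T", "yyyyMMddHHmmssSSS" ]
    target => "T"
  }
}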

Retrieving RESTful GET parameters in logstash

I am trying to get logstash to parse key-value pairs in an HTTP get request from my ELB log files.
The request field looks like:
http://aaa.bbb/get?a=1&b=2
I'd like there to be a field for a and b in the log line above, and I am having trouble figuring it out.
My logstash conf (formatted for clarity) is below which does not load any additional key fields. I assume that I need to split off the address portion of the URI, but have not figured that out.
input {
  file {
    path => "/home/ubuntu/logs/**/*.log"
    type => "elb"
    start_position => "beginning"
    sincedb_path => "log_sincedb"
  }
}
filter {
  if [type] == "elb" {
    grok {
      match => [ "message", "%{TIMESTAMP_ISO8601:timestamp}
        %{NOTSPACE:loadbalancer} %{IP:client_ip}:%{NUMBER:client_port:int}
        %{IP:backend_ip}:%{NUMBER:backend_port:int}
        %{NUMBER:request_processing_time:float}
        %{NUMBER:backend_processing_time:float}
        %{NUMBER:response_processing_time:float}
        %{NUMBER:elb_status_code:int}
        %{NUMBER:backend_status_code:int}
        %{NUMBER:received_bytes:int} %{NUMBER:sent_bytes:int}
        %{QS:request}" ]
    }
    date {
      match => [ "timestamp", "ISO8601" ]
    }
    kv {
      field_split => "&?"
      source => "request"
      exclude_keys => ["callback"]
    }
  }
}
output {
  elasticsearch { host => localhost }
}
kv will take a URL and split out the params. This config works:
input {
  stdin { }
}
filter {
  mutate {
    add_field => { "request" => "http://aaa.bbb/get?a=1&b=2" }
  }
  kv {
    field_split => "&?"
    source => "request"
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
stdout shows:
{
  "request" => "http://aaa.bbb/get?a=1&b=2",
  "a" => "1",
  "b" => "2"
}
That said, I would encourage you to create your own versions of the default URI patterns so that they set fields. You can then pass the querystring field off to kv. It's cleaner that way.
UPDATE:
For "make your own patterns", I meant to take the existing ones and modify them as needed. In logstash 1.4, installing them was as easy as putting them in a new file the 'patterns' directory; I don't know about patterns for >1.4 yet.
MY_URIPATHPARAM %{URIPATH}(?:%{URIPARAM:myuriparams})?
MY_URI %{URIPROTO}://(?:%{USER}(?::[^@]*)?@)?(?:%{URIHOST})?(?:%{MY_URIPATHPARAM})?
Then you could use MY_URI in your grok{} pattern and it would create a field called myuriparams that you could feed to kv{}.
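As a rough sketch of that approach (reusing the stdin test harness from above, and assuming MY_URIPATHPARAM and MY_URI have been installed in the patterns directory):

input {
  stdin { }
}
filter {
  grok {
    patterns_dir => "/usr/share/logstash/patterns"
    match => [ "message", "%{MY_URI:request}" ]
  }
  kv {
    # myuriparams holds only the query string, e.g. "?a=1&b=2",
    # so kv no longer sees the scheme/host part of the URL.
    field_split => "&?"
    source      => "myuriparams"
  }
}
output {
  stdout {
    codec => rubydebug
  }
}

Piping http://aaa.bbb/get?a=1&b=2 into this should again produce a and b as separate fields.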

Logstash not adding fields

I am using Logstash 1.4.2 and I have the following conf file.
I would expect to see in Kibana in the "Fields" section on the left the options for "received_at" and "received_from" and "description", but I don't.
I see
@timestamp
@version
_id
_index
_type
host
path
I do see in the _source section on the right side the following...
received_at:2015-05-11 14:19:40 UTC received_from:PGP02 description:Error1!
So how come these don't appear in the list of "Popular Fields"?
I'd like to filter the right side so it doesn't show EVERY field in the _source section.
input {
  file {
    path => "C:/ServerErrlogs/office-log.txt"
    start_position => "beginning"
    sincedb_path => "c:/tools/logstash-1.4.2/office-log.sincedb"
    tags => ["product_qa", "office"]
  }
  file {
    path => "C:/ServerErrlogs/dis-log.txt"
    start_position => "beginning"
    sincedb_path => "c:/tools/logstash-1.4.2/dis-log.sincedb"
    tags => ["product_qa", "dist"]
  }
}
filter {
  grok {
    match => ["path", "%{GREEDYDATA}/%{GREEDYDATA:filename}\.log"]
    match => [ "message", "%{TIMESTAMP_ISO8601:logdate}: %{LOGLEVEL:loglevel} (?<logmessage>.*)" ]
    add_field => [ "received_at", "%{@timestamp}" ]
    add_field => [ "received_from", "%{host}" ]
  }
  date {
    match => [ "logdate", "ISO8601", "yyyy-MM-dd HH:mm:ss,SSSSSSSSS" ]
  }
  # logdate is now parsed into the timestamp, remove the original log message too
  mutate {
    remove_field => ['message', 'logdate' ]
    add_field => [ "description", "Error1!" ]
  }
}
output {
  elasticsearch {
    protocol => "http"
    host => "0.0.0.x"
  }
}
Update:
I have tried searching with a query like:
tags: data AND loglevel : INFO
then saving this query, and then reloading the page.
But I still don't see loglevel appearing under 'Popular Fields'.
If the fields don't appear on the left side, it's probably a kibana caching problem. Go to Settings->Indices, select your index, and click the orange Refresh button.
I had the same issue with logstash not adding fields, and after quite a lot of searching and testing other things, I suddenly had the solution (but I'm using the logstash-logback-encoder, so I already have JSON - if you don't, then you need to transform things into JSON in the logstash "input" phase).
I added a "json" filter, and that did the magic for me:
filter {
  json {
    source => "message"
  }
}

logstash output not showing the desired timestamp

I am trying to get the desired timestamp format from the logstash output. I can't get that if I use this format in syslog.
Please share your thoughts on how to convert it to the other format that's in the _source field, i.e. the Yyyy-mm-ddThh:mm:ss.sssZ format.
filter {
grok {
match => [ "logdate", "Yyyy-mm-ddThh:mm:ss.sssZ" ]
overwrite => ["host", "message"]
}
_source: {
message: "activity_log: {"created_at":1421114642210,"actor_ip":"192.168.1.1","note":"From system","user":"4561c9d7aaa9705a25f66d","user_id":null,"actor":"4561c9d7aaa9705a25f66d","actor_id":null,"org_id":null,"action":"user.failed_login","data":{"transaction_id":"d6768c473e366594","name":"user.failed_login","timing":{"start":1422127860691,"end":14288720480691,"duration":0.00257},"actor_locatio
I am using this code in the syslog file:
filter {
  if [message] =~ /^activity_log: / {
    grok {
      match => ["message", "^activity_log: %{GREEDYDATA:json_message}"]
    }
    json {
      source => "json_message"
      remove_field => "json_message"
    }
    date {
      match => ["created_at", "UNIX_MS"]
    }
    mutate {
      rename => ["[json][repo]", "repo"]
      remove_field => "json"
    }
  }
}
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
thanks
"message" => "<134>feb 1 20:06:12 {\"created_at\":1422765535789, pid=5450 tid=28643 version=b0b45ac proto=http ip=192.168.1.1 duration_ms=0.165809 fs_sent=0 fs_recv=0 client_recv=386 client_sent=0 log_level=INFO msg=\"http op done: (401)\" code=401" }
"#version" => "1",
"#timestamp" => "2015-02-01T20:06:12.726Z",
"type" => "activity_log",
"host" => "192.168.1.1"
The pattern in your grok filter doesn't make sense. You're using a Joda-Time pattern (normally used for the date filter) and not a grok pattern.
It seems your message field contains a JSON object. That's good, because it makes it easy to parse. Extract the part that comes after "activity_log: " to a temporary json_message field,
grok {
  match => ["message", "^activity_log: %{GREEDYDATA:json_message}"]
}
and parse that field as JSON with the json filter (removing the temporary field if the operation was successful):
json {
  source => "json_message"
  remove_field => ["json_message"]
}
Now you should have the fields from the original message field at the top level of your event, including the created_at field with the timestamp you want to extract. That number is the number of milliseconds since the epoch, so you can use the UNIX_MS pattern in a date filter to extract it into @timestamp:
date {
  match => ["created_at", "UNIX_MS"]
}
