Create tag from field using regex - logstash

I am using Logstash to collect my Apache logs, and as such I have a field called request_url which contains values that look like:
POST /api_v1/services/order_service HTTP/1.1
POST /api_v2/services/user_service HTTP/1.0
I want to create separate tags containing the API version and the service name, e.g.
POST /api_v1/services/order_service HTTP/1.1 -> ["v1", "order_service"]
POST /api_v2/services/user_service HTTP/1.0 -> ["v2", "user_service"]
How do I achieve this in the Logstash configuration? Thanks for any pointers.

Using the following grok filter, you can separate out the components you need and then add the appropriate tags:
filter {
  grok {
    match   => { "message" => "%{WORD:verb} /api_%{NOTSPACE:version}/services/%{WORD:service}" }
    add_tag => ["%{version}", "%{service}"]
  }
}
The event you'll get will look like this:
{
       "message" => "POST /api_v1/services/order_service HTTP/1.1",
      "@version" => "1",
    "@timestamp" => "2016-06-07T02:52:01.136Z",
          "host" => "iMac.local",
          "verb" => "POST",
       "version" => "v1",
       "service" => "order_service",
          "tags" => [
        [0] "v1",
        [1] "order_service"
    ]
}

Related

Logstash GROK filter along with KV plugin unable to process the events

I am new to ELK. When I onboarded the log file below, the events went to the dead letter queue because Logstash couldn't process them. I have written a GROK filter to parse the events, but Logstash still can't process them. Any help would be appreciated.
Below is the sample log format.
25193662345 [http-nio-8080-exec-44] DEBUG c.s.b.a.m.PerformanceMetricsFilter - method=PUT status=201 appLogicTime=1, streamInTime=0, blobStorageTime=31, totalTime=33 tenantId=b9sdfs-1033-4444-aba5-csdfsdfsf, immutableBlobId=bss_c_586331/Sample_app12-sdas-157123148464.txt, blobSize=2862, domain=abc
2519366789 [http-nio-8080-exec-47] DEBUG q.s.b.y.m.PerformanceMetricsFilter - method=PUT status=201 appLogicTime=1, streamInTime=0, blobStorageTime=32, totalTime=33 tenantId=b0csdfsd-1066-4444-adf4-ce7bsdfssdf, immutableBlobId=bss_c_586334/Sample_app15-615223-157sadas6648465.txt, blobSize=2862, domain=cde
GROK filter:
dissect { mapping => { "message" => "%{NUMBER:number} [%{thread}] %{level} %{class} - %{[@metadata][msg]}" } }
kv { source => "[@metadata][msg]" field_split => "," }
Thanks
You have basically two problems in your configuration.
1.) You are using the dissect filter, not grok. Both are used to parse messages, but grok uses regular expressions to validate the value of a field, while dissect is purely positional and performs no validation: if you have a WORD value in the position of a field that expects a NUMBER, grok will fail, but dissect will not.
If your log lines always have the same pattern, you should keep using dissect, since it is faster and needs less CPU.
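For comparison, a grok equivalent of the same parse might look like the sketch below; the pattern names are my own choice to mirror the dissect mapping, so adjust them to your data:
grok {
  match => { "message" => "%{NUMBER:number} \[%{DATA:thread}\] %{LOGLEVEL:level} %{NOTSPACE:class} - %{GREEDYDATA:[@metadata][msg]}" }
}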
Your correct dissect mapping should be:
dissect {
  mapping => { "message" => "%{number} [%{thread}] %{level} %{class} - %{[@metadata][msg]}" }
}
2.) The field that contains the kv message is wrong: it has fields separated both by space and by comma, and kv won't work that way.
After your dissect filter, this is the content of [@metadata][msg]:
method=PUT status=201 appLogicTime=1, streamInTime=0, blobStorageTime=32, totalTime=33 tenantId=b0csdfsd-1066-4444-adf4-ce7bsdfssdf, immutableBlobId=bss_c_586334/Sample_app15-615223-157sadas6648465.txt, blobSize=2862, domain=cde
To solve this you should use a mutate filter to remove the commas from [@metadata][msg], then use the kv filter with its default configuration.
This should be your filter configuration:
filter {
  dissect {
    mapping => { "message" => "%{number} [%{thread}] %{level} %{class} - %{[@metadata][msg]}" }
  }
  mutate {
    gsub => ["[@metadata][msg]", ",", ""]
  }
  kv {
    source => "[@metadata][msg]"
  }
}
Your output should be something like this:
{
             "number" => "2519366789",
         "@timestamp" => 2019-11-03T16:42:11.708Z,
             "thread" => "http-nio-8080-exec-47",
       "appLogicTime" => "1",
             "domain" => "cde",
             "method" => "PUT",
              "level" => "DEBUG",
           "blobSize" => "2862",
           "@version" => "1",
    "immutableBlobId" => "bss_c_586334/Sample_app15-615223-157sadas6648465.txt",
       "streamInTime" => "0",
             "status" => "201",
    "blobStorageTime" => "32",
            "message" => "2519366789 [http-nio-8080-exec-47] DEBUG q.s.b.y.m.PerformanceMetricsFilter - method=PUT status=201 appLogicTime=1, streamInTime=0, blobStorageTime=32, totalTime=33 tenantId=b0csdfsd-1066-4444-adf4-ce7bsdfssdf, immutableBlobId=bss_c_586334/Sample_app15-615223-157sadas6648465.txt, blobSize=2862, domain=cde",
          "totalTime" => "33",
           "tenantId" => "b0csdfsd-1066-4444-adf4-ce7bsdfssdf",
              "class" => "q.s.b.y.m.PerformanceMetricsFilter"
}
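As an alternative to the gsub step, newer versions of the kv filter accept a field_split_pattern option, which can treat a comma followed by whitespace as a separator directly. A sketch, assuming your kv plugin version supports this option:
kv {
  source              => "[@metadata][msg]"
  # Split on a comma plus whitespace, or on whitespace alone.
  field_split_pattern => ",?\s+"
}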

How to capture a repeated pattern in Logstash (5.4.0) grok?

I would appreciate it if someone could help me out with Logstash grok.
Given a log line like the one below,
IN 192.168.11.2 IN 192.168.11.3
My goal is to put the IP addresses into an array using grok. The list of IPs is dynamic and can extend beyond 2, e.g.
tmp = [
"192.168.11.2", "192.168.11.3"
]
However, if I use a filter like the one below, everything ends up in a single field.
filter {
  grok {
    match => { "message" => "(?<tmp>(IN %{IPV4}(\s)?)*)" }
  }
}
Result:
{
          "path" => "/tmp/sample.csv",
    "@timestamp" => 2017-08-24T05:00:08.093Z,
           "tmp" => "IN 192.168.11.2 IN 192.168.11.3",
      "@version" => "1",
          "host" => "host.ywlocal.net",
       "message" => "IN 192.168.11.2 IN 192.168.11.3"
}
Would this be possible?
You can use the ruby filter for more advanced parsing:
filter {
  ruby {
    code => "event.set('ips', event.get('message').scan(/\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}/))"
  }
}
The regexp is not a 100% strict IP-address match, but it should work for your needs.
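If you would rather avoid the ruby filter, a grok-plus-mutate sketch building on your existing capture should produce the same array, assuming the "IN " prefixes and single-space separators are consistent:
grok {
  match => { "message" => "(?<tmp>(IN %{IPV4}(\s)?)*)" }
}
mutate {
  # Strip the "IN " markers, then split the remaining string into an array.
  gsub  => ["tmp", "IN ", ""]
  split => ["tmp", " "]
}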

Logstash parser error, timestamp is malformed

Can somebody tell me what I'm doing wrong, or why Logstash doesn't want to parse an ISO8601 timestamp?
The error message I get is:
Failed action ... "error"=>{"type"=>"mapper_parsing_exception",
"reason"=>"failed to parse [timestamp]",
"caused_by"=>{"type"=>"illegal_argument_exception",
"reason"=>"Invalid format: \"2017-03-24 12:14:50\" is malformed at \"17-03-24 12:14:50\""}}
Sample log file line (last octet of the IP address replaced with 000 on purpose):
2017-03-24 12:14:50 87.123.123.000 12345678.domain.com GET /smil:stream_17.smil/chunk_ctvideo_ridp0va0r600115_cs211711500_mpd.m4s - HTTP/1.1 200 750584 0.714 "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36" https://referrer.domain.com/video/2107 https fra1 "HIT, MISS" 12345678.domain.com
GROK pattern (use http://grokconstructor.appspot.com/do/match to verify):
RAW %{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{IPV4:clientip}%{SPACE}%{HOSTNAME:http_host}%{SPACE}%{WORD:verb}%{SPACE}\/(.*:)?%{WORD:stream}%{NOTSPACE}%{SPACE}%{NOTSPACE}%{SPACE}%{WORD:protocol}\/%{NUMBER:httpversion}%{SPACE}%{NUMBER:response}%{SPACE}%{NUMBER:bytes}%{SPACE}%{SECOND:request_time}%{SPACE}%{QUOTEDSTRING:agent}%{SPACE}%{URI:referrer}%{SPACE}%{WORD}%{SPACE}%{WORD:location}%{SPACE}%{QUOTEDSTRING:cache_status}%{SPACE}%{WORD:account}%{GREEDYDATA}
Logstash configuration (input side):
input {
  file {
    path => "/subfolder/logs/*"
    type => "access_logs"
    start_position => "beginning"
  }
}
filter {
  # skip first two lines in log file with comments
  if [message] =~ /^#/ {
    drop { }
  }
  grok {
    patterns_dir => ["/opt/logstash/patterns"]
    match => { "message" => "%{RAW}" }
  }
  date {
    match => [ "timestamp" , "yyyy-MM-dd HH:mm:ss" ]
    locale => "en"
  }
  # ... (rest of the config omitted for readability)
}
So I am pretty sure this is caused by the timestamp field being mapped in Elasticsearch to a type that the value doesn't parse to. If you post your index mapping, I'd be happy to look at it.
A note: you can quickly solve this by adding remove_field, because if the date filter succeeds, the value of that field is pulled into @timestamp. Right now you have the same value stored in two fields. Then you don't have to worry about the mapping for the field. :)
date {
  match => [ "timestamp" , "yyyy-MM-dd HH:mm:ss" ]
  locale => "en"
  remove_field => [ "timestamp" ]
}
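If you would rather keep the original string instead of dropping it, you could rename it after the date filter so it no longer collides with the existing [timestamp] mapping. A sketch, where timestamp_raw is an arbitrary name of my choosing:
date {
  match  => [ "timestamp", "yyyy-MM-dd HH:mm:ss" ]
  locale => "en"
}
mutate {
  # Keep the original string under a new name so it doesn't hit the
  # conflicting [timestamp] mapping in Elasticsearch.
  rename => { "timestamp" => "timestamp_raw" }
}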

Logstash. Get fields by position number

Background
I have this pipeline: logs from my app go through rsyslog to a central log server, then to Logstash and Elasticsearch.
The logs from the app are pure JSON, but rsyslog adds "timestamp", "app name" and "server name" fields, so the log becomes:
timestamp app-name server-name [JSON]
Question
How can I remove the first three fields with Logstash filters?
Can I get fields by position number (like in awk) and do something like:
filter {
  somefilter_name {
    remove_field => $1, $2, $3
  }
}
Or is my approach totally wrong and I should do this another way?
Thank you!
Use grok{} to match them (they may be useful on their own!) and put the remainder of the event back into the [message] field:
Given input like:
2015-06-16 13:37:30 myApp myServer { "jsonField": "jsonValue" }
And this config:
grok {
  match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{WORD:app} %{WORD:server} %{GREEDYDATA:message}" }
  overwrite => [ "message" ]
}
json {
  source => "message"
}
Will produce this document:
{
       "message" => "{ \"jsonField\": \"jsonValue\" }",
      "@version" => "1",
    "@timestamp" => "2015-06-16T20:38:55.658Z",
          "host" => "0.0.0.0",
     "timestamp" => "2015-06-16 13:37:30",
           "app" => "myApp",
        "server" => "myServer",
     "jsonField" => "jsonValue"
}
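If you prefer the parsed JSON nested under its own key instead of merged into the top level, the json filter also takes a target option; a sketch, where parsed is an arbitrary field name:
json {
  source => "message"
  target => "parsed"
}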

Logstash email alert

I'm trying to configure Logstash to send mail when someone logs in to my server, but it doesn't seem to work. This is my config file at /etc/logstash/conf.d/email.conf
My file:
input {
  file {
    type => "syslog"
    path => "/var/log/auth.log"
  }
}
filter {
  if [type] == "syslog" {
    grok {
      pattern => [ "%{SYSLOGBASE} Failed password for %{USERNAME:user} from %{IPORHOST:host} port %{POSINT:port} %{WORD:protocol}" ]
      add_tag => [ "auth_failure" ]
    }
  }
}
output {
  email {
    tags => [ "auth_failure" ]
    to => "<admin@gmail.com>"
    from => "<alert@abc.com>"
    options => [ "smtpIporHost", "smtp.abc.com",
                 "port", "25",
                 "domain", "abc.com",
                 "userName", "alert@abc.com",
                 "password", "mypassword",
                 "authenticationType", "plain",
                 "debug", "true"
               ]
    subject => "Error"
    via => "smtp"
    body => "Here is the event line %{@message}"
    htmlbody => "<h2>%{matchName}</h2><br/><br/><h3>Full Event</h3><br/><br/><div align='center'>%{@message}</div>"
  }
}
My Logstash log file /var/log/logstash/logstash.log contains:
{:timestamp=>"2015-03-10T11:46:41.152000+0700", :message=>"Using milestone 1 output plugin 'email'. This plugin should work, but would benefit from use by folks like you. Please let us know if you find bugs or have suggestions on how to improve this plugin. For more information on plugin milestones, see http://logstash.net/docs/1.4.1/plugin-milestones", :level=>:warn}
Can anybody please help?
You're not using the correct syntax in your grok filter. It should look like this:
grok {
  match => ["message", "..."]
}
Other minor comments:
Using tags => ["auth_failure"] for conditional filtering is deprecated. Prefer if "auth_failure" in [tags].
In the email body you're referring to the message with @message. That's deprecated too; the field is now named plain message.
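Putting those fixes together, a sketch of the corrected filter and output sections might look like this (SMTP settings omitted, and src_ip is renamed from host so the event's own host field isn't clobbered; check your email output plugin's documentation for the options your version supports):
filter {
  if [type] == "syslog" {
    grok {
      match   => ["message", "%{SYSLOGBASE} Failed password for %{USERNAME:user} from %{IPORHOST:src_ip} port %{POSINT:port} %{WORD:protocol}"]
      add_tag => ["auth_failure"]
    }
  }
}
output {
  if "auth_failure" in [tags] {
    email {
      to      => "admin@gmail.com"
      from    => "alert@abc.com"
      subject => "Error"
      body    => "Here is the event line %{message}"
    }
  }
}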