I have an nginx log file that looks similar to this one:
{ "#timestamp": "2013-09-03T14:21:51-04:00", "#fields": { "remote_addr": "xxxxxxxxxxxx", "remote_user": "-", "body_bytes_sent": "5", "request_time": "0.000", "status": "200", "request": "POST foo/bar/1 HTTP/1.1", "request_body": "{\x22id\x22: \x22460\x22, \x22source_id\x22: \x221\x22, \x22email_address\x22: \x22foo#bar.com\x22, \x22password\x22: \x2JQ6I\x22}", "request_method": "POST", "request_uri": "foo/bar/1", "http_referrer": "-", "http_user_agent": "Java/1.6.0_27" } }
I'm wondering if it's possible to use a Logstash filter to send a log that would look something like this:
{"#fields": { "request": "POST foo/bar/1 HTTP/1.1", "request_body": "{\x22id\x22: \x22460\x22, \x22source_id\x22: \x221\x22, \x22email_address\x22: \x22foo#bar.com\x22, \x22password\x22: \x2JQ6I\x22}"}
So I'm only interested in a few fields out of the whole log.
In other words, I would like to extract the necessary data out of the log and then send it to whatever output.
Yes, you can do that if you first go through a json filter.
Then you need something like this:
filter {
  json {
    # Parse the JSON in the raw message so its keys become event fields
    source => "message"
    add_tag => [ "json" ]
  }
  mutate {
    # Only run on events the json filter tagged above
    # (newer Logstash versions would use an `if "json" in [tags]` conditional instead)
    tags => [ "json" ]
    remove_field => [ "[@fields][remote_addr]", "[@fields][remote_user]", "[@fields][body_bytes_sent]", "[@fields][request_time]" ]
  }
}
I have a configuration similar to this working with version 1.2.0 of Logstash.
Hope this helps.
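Alternatively, if you'd rather whitelist the few fields you want to keep than list everything to remove, the prune filter may be worth a look (it ships as a separate plugin in some versions, and the patterns below are just an illustration):
filter {
  json {
    source => "message"
  }
  # prune keeps only the top-level fields whose names match the whitelist
  # patterns; nested keys under @fields are kept or dropped with their parent
  prune {
    whitelist_names => [ "^@timestamp$", "^@fields$" ]
  }
}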
I have a Logstash input event that looks like this:
{
  "@timestamp": "2016-12-20T18:55:11.699Z",
  "id": 1234,
  "detail": {
    "foo": 1,
    "bar": "two"
  }
}
I would like to merge the content of "detail" with the root object so that the final event looks like this:
{
  "@timestamp": "2016-12-20T18:55:11.699Z",
  "id": 1234,
  "foo": 1,
  "bar": "two"
}
Is there a way to accomplish this without writing my own filter plugin?
You can do this with a ruby filter.
filter {
  ruby {
    # Copy every key under 'detail' to the top level, then drop 'detail'
    # (this uses the legacy, pre-5.0 hash-style event API)
    code => "
      event['detail'].each { |k, v|
        event[k] = v
      }
      event.remove('detail')
    "
  }
}
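Note that the snippet above uses the old hash-style event API. On Logstash 5.x and later that API was removed in favor of get/set, so on a newer version the equivalent would be:
filter {
  ruby {
    code => "
      # Logstash 5.x+ event API: use get/set instead of hash access
      event.get('detail').each { |k, v|
        event.set(k, v)
      }
      event.remove('detail')
    "
  }
}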
There is a simple way to do that using the json_encode plugin (not included by default).
The json filter adds its parsed fields to the root of the event when no target is set; it's one of the very few filters that can add things to the root.
filter {
  # Serialize the 'detail' hash back into a JSON string...
  json_encode {
    source => "detail"
    target => "detail"
  }
  # ...then parse that string again; with no target set, the parsed
  # keys land at the root of the event, and the original field is dropped
  json {
    source => "detail"
    remove_field => [ "detail" ]
  }
}
I am using Logstash to collect my Apache logs, and as such I have a field called request_url which contains values that look like:
POST /api_v1/services/order_service HTTP/1.1
POST /api_v2/services/user_service HTTP/1.0
I want to create separate tags containing the API version and the service name, e.g.
POST /api_v1/services/order_service HTTP/1.1 -> ["v1", "order_service"]
POST /api_v2/services/user_service HTTP/1.0 -> ["v2", "user_service"]
How do I achieve this in the Logstash configuration? Thanks for any pointers.
Using the following grok filter, you can extract the components you need and then add the appropriate tags:
filter {
  grok {
    match => { "message" => "%{WORD:verb} /api_%{NOTSPACE:version}/services/%{WORD:service}" }
    add_tag => [ "%{version}", "%{service}" ]
  }
}
The event you'll get will look like this:
{
"message" => "POST /api_v1/services/order_service HTTP/1.1",
"#version" => "1",
"#timestamp" => "2016-06-07T02:52:01.136Z",
"host" => "iMac.local",
"verb" => "POST",
"version" => "v1",
"service" => "order_service",
"tags" => [
[0] "v1",
[1] "order_service"
]
}
I have a server that sends access logs over to Logstash in a custom log format, and am using Logstash to filter these logs and send them to Elasticsearch.
A log line looks something like this:
0.0.0.0 - GET / 200 - 29771 3 ms ELB-HealthChecker/1.0\n
And gets parsed using this grok filter:
grok {
  # break_on_match defaults to true, so only the first pattern that
  # matches is applied; a single grok shouldn't produce duplicate fields
  match => [
    "message", "%{IP:remote_host} %{USER:remote_user} %{WORD:method} %{URIPATHPARAM:requested_uri} %{NUMBER:status_code} - %{NUMBER:content_length} %{NUMBER:elapsed_time:int} ms %{GREEDYDATA:user_agent}",
    "message", "%{IP:remote_host} - %{WORD:method} %{URIPATHPARAM:requested_uri} %{NUMBER:status_code} - %{NUMBER:content_length} %{NUMBER:elapsed_time:int} ms %{GREEDYDATA:user_agent}",
    "message", "%{IP:remote_host} %{USER:remote_user} %{WORD:method} %{URIPATHPARAM:requested_uri} %{NUMBER:status_code} - - %{NUMBER:elapsed_time:int} ms %{GREEDYDATA:user_agent}",
    "message", "%{IP:remote_host} - %{WORD:method} %{URIPATHPARAM:requested_uri} %{NUMBER:status_code} - - %{NUMBER:elapsed_time:int} ms %{GREEDYDATA:user_agent}"
  ]
  add_field => {
    "protocol" => "HTTP"
  }
}
The final log gets parsed into this object (with real IPs stubbed out, and other fields taken out):
{
"_source": {
"message": " 0.0.0.0 - GET / 200 - 29771 3 ms ELB-HealthChecker/1.0\n",
"tags": [
"bunyan"
],
"#version": "1",
"host": "0.0.0.0:0000",
"remote_host": [
"0.0.0.0",
"0.0.0.0"
],
"remote_user": [
"-",
"-"
],
"method": [
"GET",
"GET"
],
"requested_uri": [
"/",
"/"
],
"status_code": [
"200",
"200"
],
"content_length": [
"29771",
"29771"
],
"elapsed_time": [
"3",
3
],
"user_agent": [
"ELB-HealthChecker/1.0",
"ELB-HealthChecker/1.0"
],
"protocol": [
"HTTP",
"HTTP"
]
}
}
Any ideas why I am getting multiple matches per log? Shouldn't Grok be breaking on the first match that successfully parses?
Chances are you have multiple config files being loaded: Logstash concatenates every file in its config directory into a single pipeline, so each event passes through the filters of all of them. If you look at the output, elapsed_time shows up as both an integer and a string. From the config file you've provided, that's not possible, since you have :int on anything that matches elapsed_time; a second, slightly different grok must be parsing the message as well.
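If removing the duplicate file isn't an option, one workaround is to guard the grok with a marker tag so a message is only parsed once; a minimal sketch, assuming the same filter block is what's being loaded twice (the "grokked" tag name is just an illustration):
filter {
  # Skip events that an earlier copy of this block already grokked
  if "grokked" not in [tags] {
    grok {
      match => [
        "message", "%{IP:remote_host} %{USER:remote_user} %{WORD:method} %{URIPATHPARAM:requested_uri} %{NUMBER:status_code} - %{NUMBER:content_length} %{NUMBER:elapsed_time:int} ms %{GREEDYDATA:user_agent}"
      ]
      add_tag => [ "grokked" ]
    }
  }
}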
I'm using Elasticsearch + Logstash + Kibana for Windows event log analysis, and I get the following log:
{
"_index": "logstash-2015.04.16",
"_type": "logs",
"_id": "Ov498b0cTqK8W4_IPzZKbg",
"_score": null,
"_source": {
"EventTime": "2015-04-16 14:12:45",
"EventType": "AUDIT_FAILURE",
"EventID": "4656",
"Message": "A handle to an object was requested.\r\n\r\nSubject:\r\n\tSecurity ID:\t\tS-1-5-21-2832557239-2908104349-351431359-3166\r\n\tAccount Name:\t\ts.tekotin\r\n\tAccount Domain:\t\tIAS\r\n\tLogon ID:\t\t0x88991C8\r\n\r\nObject:\r\n\tObject Server:\t\tSecurity\r\n\tObject Type:\t\tFile\r\n\tObject Name:\t\tC:\\Folders\\Общая (HotSMS)\\Test_folder\\3\r\n\tHandle ID:\t\t0x0\r\n\tResource Attributes:\t-\r\n\r\nProcess Information:\r\n\tProcess ID:\t\t0x4\r\n\tProcess Name:\t\t\r\n\r\nAccess Request Information:\r\n\tTransaction ID:\t\t{00000000-0000-0000-0000-000000000000}\r\n\tAccesses:\t\tReadData (or ListDirectory)\r\n\t\t\t\tReadAttributes\r\n\t\t\t\t\r\n\tAccess Reasons:\t\tReadData (or ListDirectory):\tDenied by\tD:(D;OICI;CCDCLCSWRPWPLOCRSDRC;;;S-1-5-21-2832557239-2908104349-351431359-3166)\r\n\t\t\t\tReadAttributes:\tGranted by ACE on parent folder\tD:(A;OICI;0x1200a9;;;S-1-5-21-2832557239-2908104349-351431359-3166)\r\n\t\t\t\t\r\n\tAccess Mask:\t\t0x81\r\n\tPrivileges Used for Access Check:\t-\r\n\tRestricted SID Count:\t0",
"ObjectServer": "Security",
"ObjectName": "C:\\Folders\\Общая (HotSMS)\\Test_folder\\3",
"HandleId": "0x0",
"PrivilegeList": "-",
"RestrictedSidCount": "0",
"ResourceAttributes": "-",
"#timestamp": "2015-04-16T11:12:45.802Z"
},
"sort": [
1429182765802,
1429182765802
]
}
I get many log messages with different EventIDs, and when I receive a log entry with EventID 4656 I want to replace the value "4656" with the string "Access Failure". Is there a way to do so?
You can do it while you are loading with Logstash -- just do something like this:
filter {
  if [EventID] == "4656" {
    mutate {
      replace => [ "EventID", "Access Failure" ]
    }
  }
}
If you have a lot of values, look at translate{}:
translate {
  dictionary => [
    "4656", "Access Failure",
    "1234", "Another Value"
  ]
  field => "EventID"
  destination => "EventName"
}
I don't think translate{} will let you replace the original field. You could remove it, though, in favor of the new field.
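For what it's worth, newer versions of the translate filter have an override option; assuming your version supports it, pointing destination back at the source field should let you replace EventID in place:
translate {
  field => "EventID"
  destination => "EventID"
  # allow overwriting the existing value of the destination field
  override => true
  dictionary => [
    "4656", "Access Failure",
    "1234", "Another Value"
  ]
}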
Use the mutate filter's replace option:
Replace a field with a new value. The new value can include %{foo} strings to help you build a new value from other parts of the event.
Example:
filter {
  if [source] == "your code like 4656" {
    mutate {
      replace => { "message" => "%{source_host}: My new message" }
    }
  }
}
I'm using Logstash + Elasticsearch + Kibana to have an overview of my Tomcat log files.
For each log entry I need to know the name of the file from which it came. I'd like to add it as a field. Is there a way to do it?
I've googled a little and I've only found this SO question, but the answer is no longer up-to-date.
So far the only solution I see is to specify separate configuration for each possible file name with different "add_field" like so:
input {
  file {
    type => "catalinalog"
    path => [ "/path/to/my/files/catalina**" ]
    add_field => { "server" => "prod1" }
  }
}
But then I need to reconfigure logstash each time there is a new possible file name.
Any better ideas?
I added a grok filter to do just this. I only wanted the filename, not the whole path, but you can change this to your needs.
filter {
  grok {
    match => [ "path", "%{GREEDYDATA}/%{GREEDYDATA:filename}\.log" ]
  }
}
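One caveat: the field holding the path depends on your input. The file input classically stores it in path, but newer ECS-aware versions may put it in [log][file][path] instead; assuming that layout, the equivalent would be:
filter {
  grok {
    # same pattern, reading the ECS-style path field
    match => { "[log][file][path]" => "%{GREEDYDATA}/%{GREEDYDATA:filename}\.log" }
  }
}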
In case you would like to combine the message and file name in one event:
filter {
  grok {
    match => {
      "message" => "ERROR (?<function>[\S]*)"
    }
  }
  grok {
    match => {
      "path" => "%{GREEDYDATA}/%{GREEDYDATA:filename}\.log"
    }
  }
}
The result in Elasticsearch (focus on the 'filename' and 'function' fields):
"_index": "logstash-2016.08.03",
"_type": "logs",
"_id": "AVZRyEI49-A6kyBCq6Yt",
"_score": 1,
"_source": {
"message": "27/07/16 12:16:18,321 ERROR blaaaaaaaaa.internal.com",
"#version": "1",
"#timestamp": "2016-08-03T19:01:33.083Z",
"path": "/home/admin/mylog.log",
"host": "my-virtual-machine",
"function": "blaaaaaaaaa.internal.com",
"filename": "mylog"
}