logstash configuration grok parse timestamp

I am trying to parse
[7/1/05 13:41:00:516 PDT]
This is the grok pattern I have written for it:
\[%{DD/MM/YY HH:MM:SS:S Z}\]
With the date filter :
input {
  file {
    path => "logstash-5.0.0/bin/sta.log"
    start_position => "beginning"
  }
}
filter {
  grok {
    match =>" \[%{DATA:timestamp}\] "
  }
  date {
    match => ["timestamp","DD/MM/YY HH:MM:SS:S ZZZ"]
  }
}
output {
  stdout{codec => "json"}
}
Above is the configuration I have used.
And consider this as my sta.log file content:
[7/1/05 13:41:00:516 PDT]
Getting this error :
[2017-01-31T12:37:47,444][ERROR][logstash.agent ] fetched an invalid config {:config=>"input {\nfile {\npath => \"logstash-5.0.0/bin/sta.log\"\nstart_position => \"beginning\"\n}\n}\nfilter {\ngrok {\nmatch =>\"\\[%{DATA:timestamp}\\]\"\n}\ndate {\nmatch => [\"timestamp\"=>\"DD/MM/YY HH:MM:SS:S ZZZ\"]\n}\n}\noutput {\nstdout{codec => \"json\"}\n}\n\n", :reason=>"Expected one of #, {, ,, ] at line 12, column 22 (byte 184) after filter {\ngrok {\nmatch =>\"\\[%{DATA:timestamp}\\]\"\n}\ndate {\nmatch => [\"timestamp\""}
Can anyone help here?

You forgot to specify the input field for your grok filter. A correct configuration would look like this:
input {
  file {
    path => "logstash-5.0.0/bin/sta.log"
    start_position => "beginning"
  }
}
filter {
  grok {
    match => { "message" => "\[%{DATA:timestamp} PDT\]" }
  }
  date {
    match => ["timestamp", "dd/MM/yy HH:mm:ss:SSS"]
  }
}
output {
  stdout { codec => "json" }
}
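If the logs can contain timezone abbreviations other than PDT, a slightly more general variant would capture the zone separately and tell the date filter which zone to assume. This is only a sketch; the %{WORD:tz} capture and the America/Los_Angeles value are assumptions, not something from the original question:
filter {
  grok {
    # capture the zone abbreviation instead of hard-coding PDT in the pattern
    match => { "message" => "\[%{DATA:timestamp} %{WORD:tz}\]" }
  }
  date {
    match => ["timestamp", "dd/MM/yy HH:mm:ss:SSS"]
    # Joda patterns cannot reliably parse zone abbreviations such as PDT,
    # so state the zone explicitly (assumes the logs are Pacific time)
    timezone => "America/Los_Angeles"
  }
}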
For further reference, check out the grok filter documentation.

Related

Logstash grok config error - while parsing file name

I am new to logstash and grok and I am trying to parse AWS ECS logs in an S3 bucket in the following format -
File Name - my-logs-s3-bucket/3d265ee3-d2ee-4029-a3d9-fd2255d69b92/ecs-fargate-container-8ff0e472-c76f-4f61-a363-64c2b80aa842/000000.gz
Sample Lines -
2019-05-09T16:16:16.983Z JBoss Bootstrap Environment
2019-05-09T16:16:16.983Z JBOSS_HOME: /app/jboss
2019-05-09T16:16:16.983Z JAVA_OPTS: -server -XX:+UseCompressedOops -Djboss.server.log.dir=/var/log/jboss -Xms128m -Xmx4096m
And logstash.conf
input {
  s3 {
    region => "us-east-1"
    bucket => "my-logs-s3-bucket"
    interval => "7200"
  }
}
filter {
  grok {
    match => ["message", "%{TIMESTAMP_ISO8601:tstamp}"]
  }
  date {
    match => ["tstamp", "ISO8601"]
  }
  mutate {
    remove_field => ["tstamp"]
    add_field => {
      "file" => "%{[@metadata][s3][key]}"
    }
    ######### NEED HELP HERE - START #########
    #grok {
    #  match => [ "file", "ecs-fargate-container-%{DATA:containerlogname}"]
    #}
    ######### NEED HELP HERE - END #########
  }
}
output {
  stdout {
    codec => rubydebug {
      #metadata => true
    }
  }
}
I am able to see all the logs parsed and the file name extracted when I run Logstash using the above configuration, and the file name in the output looks like below -
"file" => "myapp-logs/3d265ee3-d2ee-4029-a3d9-fd2255d69b92/ecs-fargate-container-8ff0e472-c76f-4f61-a363-64c2b80aa842/000000.gz",
I am trying to use grok to extract the file name as either ecs-fargate-container-8ff0e472-c76f-4f61-a363-64c2b80aa842 or 8ff0e472-c76f-4f61-a363-64c2b80aa842 by uncommenting the grok config lines between the #NEED HELP HERE markers, but it fails with the error below -
Expected one of #, => at line 21, column 10 (byte 536) after filter {\n grok {\n match => [\"message\", \"%{TIMESTAMP_ISO8601:tstamp}\"]\n }\n date {\n match => [\"tstamp\", \"ISO8601\"]\n }\n mutate {\n #remove_field => [\"tstamp\"]\n add_field => {\n \"file\" => \"%{[@metadata][s3][key]}\"\n }\n grok ", :
I am not sure where I am going wrong with this. Please advise.
Your grok filter was inside the mutate filter; try the following.
filter {
  grok {
    match => ["message", "%{TIMESTAMP_ISO8601:tstamp}"]
  }
  date {
    match => ["tstamp", "ISO8601"]
  }
  mutate {
    remove_field => ["tstamp"]
    add_field => { "file" => "%{[@metadata][s3][key]}" }
  }
  grok {
    match => [ "file", "ecs-fargate-container-%{DATA:containerlogname}"]
  }
}
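One caveat (an untested sketch, not part of the original answer): %{DATA} is non-greedy, so with nothing anchoring the end of the pattern it can end up capturing an empty string. Anchoring on the next path separator, or using the stock UUID pattern, pins down the id part:
grok {
  # capture everything between the prefix and the next "/" ...
  match => [ "file", "ecs-fargate-container-%{DATA:containerlogname}/" ]
  # ... or capture only the id with the standard UUID pattern:
  # match => [ "file", "ecs-fargate-container-%{UUID:containerlogname}" ]
}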

How to cumul filters with logstash?

I'm currently discovering Elasticsearch, Kibana and Logstash with Docker (version 7.1.1). The three containers are running well.
I have some data files containing some lines like this one:
foo=bar type=alpha T=20180306174204527
My logstash.conf contains:
input {
  file {
    path => "/tmp/data/*.txt"
    start_position => "beginning"
  }
}
filter {
  kv {
    field_split => "\t"
    value_split => "="
  }
}
output {
  elasticsearch { hosts => ["elasticsearch:9200"] }
  stdout {
    codec => rubydebug
  }
}
I get this data:
{
  "host" => "07f3051a3bec",
  "foo" => "bar",
  "message" => "foo=bar\ttype=alpha\tT=20180306174204527",
  "T" => "20180306174204527",
  "@timestamp" => 2019-06-17T13:47:14.589Z,
  "path" => "/tmp/data/ucL12018_03_06.txt",
  "type" => "alpha",
  "@version" => "1"
}
The first step of the job is done.
Now I want to add a filter to transform the value of the key T into a timestamp.
{
  ...
  "T" => "2018-03-06T17:42:04.527Z",
  "@timestamp" => 2019-06-17T13:47:14.589Z,
  ...
}
I do not know how to do it. I tried to add a second filter just after the kv filter, but nothing changes when I add new files.
Add this filter after the kv filter:
date {
  match => [ "T", "yyyyMMddHHmmssSSS" ]
  target => "T"
}
The date filter will try to parse the field T using the provided pattern to create a date, which will be written back to the T field because of the target option (by default the filter overwrites the @timestamp field).
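Putting it together with the existing kv filter, the whole filter block would look roughly like this (a sketch; if the timestamps are not in the host's local timezone you may also need the date filter's timezone option):
filter {
  kv {
    field_split => "\t"
    value_split => "="
  }
  date {
    match => [ "T", "yyyyMMddHHmmssSSS" ]
    target => "T"
  }
}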

logstash Mapping Parsing error status 400

When I try to use Logstash to read a log file through my configuration file, I get a mapping parsing error.
:response=>{"index"=>{"_index"=>"logstash-2016.06.07",
"_type"=>"txt", "_id"=>nil, "status"=>400,
"error"=>{"type"=>"mapper_parsing_exception", "reason"=>"Failed to
parse mapping [default]: Mapping definition for [data] has
unsupported parameters: [ignore_above : 1024]",
"caused_by"=>{"type"=>"mapper_parsing_exception", "reason"=>"Mapping
definition for [data] has unsupported parameters: [ignore_above :
1024]"}}}}, :level=>:warn}
I found that there is no problem in grokking my logs, but I just do not know what is causing the error.
Here is my logstash.conf
input {
  stdin {}
  file {
    type => "txt"
    path => "C:\HA\accesslog\trial.log"
    start_position => "beginning"
  }
}
filter {
  grok {
    match => {"message" => ["%{IP:ClientAddr}%{SPACE}%{NOTSPACE:access_date}%{SPACE}%{TIME:access_time}%{SPACE}%{NOTSPACE:x-eap.wlsCustomLogField.VirtualHost}%{SPACE}%{WORD:cs-method}%{SPACE}%{PATH:cs-uri-stem}%{SPACE}%{PROG:x-eap.wlsCustomLogField.Protocol}%{SPACE}%{NUMBER:sc-status}%{SPACE}%{NUMBER:bytes}%{SPACE}%{NOTSPACE:x-eap.wlsCustomLogField.RequestedSessionId}%{SPACE}%{PROG:x-eap.wlsCustomLogField.Ecid}%{SPACE}%{NUMBER:x-eap.wlsCustomLogField.ThreadId}%{SPACE}%{NUMBER:x-eap.wlsCustomLogField.EndTs}%{SPACE}%{NUMBER:time-taken}"]}
  }
  if "_grokparsefailure" in [tags] {
    drop { }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash-%{+YYYY.MM.dd}"
    template_overwrite => true
  }
  stdout { codec => rubydebug }
}
Please help. Thanks.
It turns out the solution is here: github.com/elastic/elasticsearch/issues/16283
Another problem was that some field names created for indexing were too long; shortening the names solved the issue.
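For the "too long" part, one way to shorten the generated fields is a mutate rename after the grok filter; the short names below are only illustrative, not from the original post:
mutate {
  # hypothetical shorter names; pick whatever fits your index mapping
  rename => {
    "x-eap.wlsCustomLogField.VirtualHost"        => "vhost"
    "x-eap.wlsCustomLogField.RequestedSessionId" => "session_id"
    "x-eap.wlsCustomLogField.Ecid"               => "ecid"
  }
}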

Retrieving RESTful GET parameters in logstash

I am trying to get Logstash to parse key-value pairs in an HTTP GET request from my ELB log files.
The request field looks like:
http://aaa.bbb/get?a=1&b=2
I'd like there to be a field for a and b in the log line above, and I am having trouble figuring it out.
My logstash conf (formatted for clarity) is below; it does not load any additional key fields. I assume that I need to split off the address portion of the URI, but I have not figured that out.
input {
  file {
    path => "/home/ubuntu/logs/**/*.log"
    type => "elb"
    start_position => "beginning"
    sincedb_path => "log_sincedb"
  }
}
filter {
  if [type] == "elb" {
    grok {
      match => [ "message", "%{TIMESTAMP_ISO8601:timestamp}
        %{NOTSPACE:loadbalancer} %{IP:client_ip}:%{NUMBER:client_port:int}
        %{IP:backend_ip}:%{NUMBER:backend_port:int}
        %{NUMBER:request_processing_time:float}
        %{NUMBER:backend_processing_time:float}
        %{NUMBER:response_processing_time:float}
        %{NUMBER:elb_status_code:int}
        %{NUMBER:backend_status_code:int}
        %{NUMBER:received_bytes:int} %{NUMBER:sent_bytes:int}
        %{QS:request}" ]
    }
    date {
      match => [ "timestamp", "ISO8601" ]
    }
    kv {
      field_split => "&?"
      source => "request"
      exclude_keys => ["callback"]
    }
  }
}
output {
  elasticsearch { host => localhost }
}
kv will take a URL and split out the params. This config works:
input {
  stdin { }
}
filter {
  mutate {
    add_field => { "request" => "http://aaa.bbb/get?a=1&b=2" }
  }
  kv {
    field_split => "&?"
    source => "request"
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
stdout shows:
{
  "request" => "http://aaa.bbb/get?a=1&b=2",
  "a" => "1",
  "b" => "2"
}
That said, I would encourage you to create your own versions of the default URI patterns so that they set fields. You can then pass the querystring field off to kv. It's cleaner that way.
UPDATE:
For "make your own patterns", I meant to take the existing ones and modify them as needed. In logstash 1.4, installing them was as easy as putting them in a new file the 'patterns' directory; I don't know about patterns for >1.4 yet.
MY_URIPATHPARAM %{URIPATH}(?:%{URIPARAM:myuriparams})?
MY_URI %{URIPROTO}://(?:%{USER}(?::[^@]*)?@)?(?:%{URIHOST})?(?:%{MY_URIPATHPARAM})?
Then you could use MY_URI in your grok{} pattern and it would create a field called myuriparams that you could feed to kv{}.
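For example (a sketch only; the patterns_dir path is an assumption, and the full ELB pattern is elided here), you would drop the two MY_* lines into a patterns file, point grok at it, and feed the captured querystring to kv:
grok {
  patterns_dir => ["./patterns"]        # directory containing the MY_* definitions
  match => [ "request", "%{MY_URI}" ]   # or embed MY_URI inside the full ELB pattern
}
kv {
  field_split => "&?"
  source => "myuriparams"
  exclude_keys => ["callback"]
}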

logstash output not showing the desired timestamp

I am trying to get the desired timestamp format from the Logstash output. I can't get that if I use this format in syslog.
Please share your thoughts about converting to the other format that's in the _source field, like the Yyyy-mm-ddThh:mm:ss.sssZ format.
filter {
  grok {
    match => [ "logdate", "Yyyy-mm-ddThh:mm:ss.sssZ" ]
    overwrite => ["host", "message"]
  }
_source: {
message: "activity_log: {"created_at":1421114642210,"actor_ip":"192.168.1.1","note":"From system","user":"4561c9d7aaa9705a25f66d","user_id":null,"actor":"4561c9d7aaa9705a25f66d","actor_id":null,"org_id":null,"action":"user.failed_login","data":{"transaction_id":"d6768c473e366594","name":"user.failed_login","timing":{"start":1422127860691,"end":14288720480691,"duration":0.00257},"actor_locatio
I am using this code in the syslog config file:
filter {
  if [message] =~ /^activity_log: / {
    grok {
      match => ["message", "^activity_log: %{GREEDYDATA:json_message}"]
    }
    json {
      source => "json_message"
      remove_field => "json_message"
    }
    date {
      match => ["created_at", "UNIX_MS"]
    }
    mutate {
      rename => ["[json][repo]", "repo"]
      remove_field => "json"
    }
  }
}
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
thanks
"message" => "<134>feb 1 20:06:12 {\"created_at\":1422765535789, pid=5450 tid=28643 version=b0b45ac proto=http ip=192.168.1.1 duration_ms=0.165809 fs_sent=0 fs_recv=0 client_recv=386 client_sent=0 log_level=INFO msg=\"http op done: (401)\" code=401" }
"#version" => "1",
"#timestamp" => "2015-02-01T20:06:12.726Z",
"type" => "activity_log",
"host" => "192.168.1.1"
The pattern in your grok filter doesn't make sense. You're using a Joda-Time pattern (normally used for the date filter) and not a grok pattern.
It seems your message field contains a JSON object. That's good, because it makes it easy to parse. Extract the part that comes after "activity_log: " to a temporary json_message field,
grok {
  match => ["message", "^activity_log: %{GREEDYDATA:json_message}"]
}
and parse that field as JSON with the json filter (removing the temporary field if the operation was successful):
json {
  source => "json_message"
  remove_field => ["json_message"]
}
Now you should have the fields from the original message field at the top level of your message, including the created_at field with the timestamp you want to extract. That number is the number of milliseconds since the epoch, so you can use the UNIX_MS pattern in a date filter to extract it into @timestamp:
date {
  match => ["created_at", "UNIX_MS"]
}
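If you would rather keep the time Logstash received the event in @timestamp, the date filter also accepts a target option; created_at_parsed below is just an illustrative field name:
date {
  match => ["created_at", "UNIX_MS"]
  # write the parsed date to a separate field instead of overwriting @timestamp
  target => "created_at_parsed"
}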
