How to pass hardcoded data in the logstash generator input section - logstash

How can I pass hardcoded data to the generator input? When I pass this input through the input section, Logstash executes successfully but doesn't produce any filtered output on the console. Is there any way to pass JSON in the generator input section?
Note: I have taken this input data from a log file...
input {
generator {
lines => [\"clientContextId\":\"INTERNAL-b8d19563-94a1-442d-9a09-dde36743fb7d\",\"description\":\"A N1QL EXPLAIN statement was executed\",\"id\":28673,\"isAdHoc\":true,\"metrics\":{\"elapsedTime\":\"11.921ms\",\"executionTime\":\"11.764ms\",\"resultCount\":1,\"resultSize\":649},\"name\":\"EXPLAIN statement\",\"node\":\"127.0.0.1:8091\",\"real_userid\":{\"domain\":\"builtin\",\"user\":\"Administrator\"},\"remote\":{\"ip\":\"127.0.0.1\",\"port\":44695},\"requestId\":\"958a7e12-d5a6-4d7b-bd40-ac9bb60cf4a3\",\"statement\":\"explain INSERT INTO `Guardium` (KEY, VALUE) \\nVALUES ( \\\"id::5554\\n\\\", { \\\"Emp Name\\\": \\\"Test4\\\", \\\"Emp Company\\\" : \\\"GS Lab\\\", \\\"Emp Country\\\" : \\\"India\\\"} )\\nRETURNING *;\",\"status\":\"errors\",\"timestamp\":\"2021-01-07T09:37:00.486Z\",\"userAgent\":\"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36 (Couchbase Query Workbench (6.6.1-9213-enterprise))\"]
}
}
filter{
# filter logic
}
output{
stdout { codec => rubydebug }
}

Yes, you can have JSON in a generator input. For example:
input { generator { count => 1 lines => [ '{ "id": "43fb7d", "description": "..." }' ] } }
filter { json { source => "message" } }
will result in
"description" => "...",
"id" => "43fb7d",

Related

How to change “message” value in index

In a Logstash pipeline (or index pattern), how can I change the following part of a CDN log in the "message" field, so I can separate or extract some data and then aggregate it?
<40> 2022-01-17T08:31:22Z logserver-5 testcdn[1]: {"method":"GET","scheme":"https","domain":"www.123.com","uri":"/product/10809350","ip":"66.249.65.174","ua":"Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)","country":"US","asn":15169,"content_type":"text/html; charset=utf-8","status":200,"server_port":443,"bytes_sent":1892,"bytes_received":1371,"upstream_time":0.804,"cache":"MISS","request_id":"b017d78db4652036250148216b0a290c"}
expected change:
{"method":"GET","scheme":"https","domain":"www.123.com","uri":"/product/10809350","ip":"66.249.65.174","ua":"Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)","country":"US","asn":15169,"content_type":"text/html; charset=utf-8","status":200,"server_port":443,"bytes_sent":1892,"bytes_received":1371,"upstream_time":0.804,"cache":"MISS","request_id":"b017d78db4652036250148216b0a290c"}
Because this part, "<40> 2022-01-17T08:31:22Z logserver-5 testcdn[1]:", is not parsed as JSON, I can't create a visual dashboard based on fields such as country, asn, etc.
The original log indexed by Logstash is:
{
"_index": "logstash-2022.01.17-000001",
"_type": "_doc",
"_id": "Qx8pZ34BhloLEkDviGxe",
"_version": 1,
"_score": 1,
"_source": {
"message": "<40> 2022-01-17T08:31:22Z logserver-5 testcdn[1]: {\"method\":\"GET\",\"scheme\":\"https\",\"domain\":\"www.123.com\",\"uri\":\"/product/10809350\",\"ip\":\"66.249.65.174\",\"ua\":\"Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)\",\"country\":\"US\",\"asn\":15169,\"content_type\":\"text/html; charset=utf-8\",\"status\":200,\"server_port\":443,\"bytes_sent\":1892,\"bytes_received\":1371,\"upstream_time\":0.804,\"cache\":\"MISS\",\"request_id\":\"b017d78db4652036250148216b0a290c\"}",
"port": 39278,
"#timestamp": "2022-01-17T08:31:22.100Z",
"#version": "1",
"host": "93.115.150.121"
},
"fields": {
"#timestamp": [
"2022-01-17T08:31:22.100Z"
],
"port": [
39278
],
"#version": [
"1"
],
"host": [
"93.115.150.121"
],
"message": [
"<40> 2022-01-17T08:31:22Z logserver-5 testcdn[1]: {\"method\":\"GET\",\"scheme\":\"https\",\"domain\":\"www.123.com\",\"uri\":\"/product/10809350\",\"ip\":\"66.249.65.174\",\"ua\":\"Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)\",\"country\":\"US\",\"asn\":15169,\"content_type\":\"text/html; charset=utf-8\",\"status\":200,\"server_port\":443,\"bytes_sent\":1892,\"bytes_received\":1371,\"upstream_time\":0.804,\"cache\":\"MISS\",\"request_id\":\"b017d78db4652036250148216b0a290c\"}"
],
"host.keyword": [
"93.115.150.121"
]
}
}
Thanks
Thank you, this is very useful. Your suggestion gave me an idea for this specific scenario; the following edited logstash.conf solves the problem:
input {
tcp {
port => 5000
codec => json
}
}
filter {
grok {
match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{GREEDYDATA:Junk}: %{GREEDYDATA:request}"}
}
json { source => "request" }
}
output {
stdout { codec => rubydebug }
elasticsearch {
hosts => ["elasticsearch:9200"]
manage_template => false
ecs_compatibility => disabled
index => "logs-%{[#metadata][beat]}-%{[#metadata][version]}-%{+YYYY.MM.dd}"
}
}
But my main concern is about editing config files. I'd prefer to make any changes in the Kibana web UI rather than changing logstash.conf, because we use ELK for different scenarios in the organization, and such changes tie an ELK server to one special purpose instead of keeping it multi-purpose.
How can I get such a result without changing the Logstash config files?
Add these configurations to the filter section of your Logstash config:
#To parse the message field
grok {
match => { "message" => "<%{NONNEGINT:syslog_pri}>\s+%{TIMESTAMP_ISO8601:syslog_timestamp}\s+%{DATA:sys_host}\s+%{NOTSPACE:sys_module}\s+%{GREEDYDATA:syslog_message}"}
}
#To replace message field with syslog_message
mutate {
replace => [ "message", "%{syslog_message}" ]
}
Once the message field is replaced by syslog_message, you can add the json filter below to parse the JSON into separate fields as well:
json {
source => "syslog_message"
}
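Putting the pieces together, the whole filter section would look like this (same patterns and field names as above):
filter {
# Split the syslog prefix from the JSON payload
grok {
match => { "message" => "<%{NONNEGINT:syslog_pri}>\s+%{TIMESTAMP_ISO8601:syslog_timestamp}\s+%{DATA:sys_host}\s+%{NOTSPACE:sys_module}\s+%{GREEDYDATA:syslog_message}"}
}
# Replace message with just the JSON part
mutate {
replace => [ "message", "%{syslog_message}" ]
}
# Expand the JSON into separate fields (country, asn, status, ...)
json {
source => "syslog_message"
}
}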

Grok Filter not adding new fields | Logstash

We have the grok filter below configured for our journalbeat. The same filter deployed locally for filebeat was working fine, but it isn't adding the new fields on journalbeat.
filter {
grok {
patterns_dir => ["/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-patterns-core-4.1.2/patterns"]
match => { "message" => [
'%{IPV4:client_ip} - - \[%{HTTPDATE:date}\] "%{WORD:method} %{URIPATH:request} %{URIPROTO:protocol}\/[1-9].[0-9]" (%{NUMBER:status}|-) (%{NUMBER:bytes}|-) "(%{URI:url}|-)" %{QUOTEDSTRING:client}'
]
break_on_match => false
tag_on_failure => ["failed_match"]
}
}
}
We tried adding the mutate filter below to add the new fields, but it isn't fetching the values and prints the literal placeholders themselves (example: %{client_ip}).
mutate {
add_field => {
"client_ip" => "%{client_ip}"
"date" => "%{date}"
"method" => "%{method}"
"status" => "%{status}"
"request" => "%{request}"
}
}
The log which we are trying to match is as below.
::ffff:172.65.205.3 - - [09/Jul/2020:11:32:52 +0000] "POST /v1-get-profile HTTP/1.1" 404 71 "https://mycompany.com/customer/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36"
Could someone let me know what exactly we are doing wrong? Thanks in advance.
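One thing worth checking (an assumption on my part, not something confirmed in the thread): the sample line starts with an IPv6-mapped address, ::ffff:172.65.205.3, which %{IPV4:client_ip} cannot match at the start of the line, so the whole pattern fails and the sprintf references in add_field stay literal. A sketch using the broader IP pattern instead:
filter {
grok {
# IP matches both IPv4 and IPv6, including ::ffff:-mapped addresses
match => { "message" => '%{IP:client_ip} - - \[%{HTTPDATE:date}\] "%{WORD:method} %{URIPATH:request} %{URIPROTO:protocol}\/[1-9].[0-9]" (%{NUMBER:status}|-) (%{NUMBER:bytes}|-) "(%{URI:url}|-)" %{QUOTEDSTRING:client}' }
tag_on_failure => ["failed_match"]
}
}
Note that grok adds the captured fields to the event by itself, so once the match succeeds the separate mutate/add_field block is unnecessary.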

Logstash parser error, timestamp is malformed

Can somebody tell me what I'm doing wrong, or why Logstash doesn't want to parse an ISO8601 timestamp?
The error message I get is
Failed action ... "error"=>{"type"=>"mapper_parsing_exception",
"reason"=>"failed to parse [timestamp]",
"caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"Invalid
format: \"2017-03-24 12:14:50\" is malformed at \"17-03-24
12:14:50\""}}
Sample log file line (last byte in IP address replaced with 000 on purpose)
2017-03-24 12:14:50 87.123.123.000 12345678.domain.com GET /smil:stream_17.smil/chunk_ctvideo_ridp0va0r600115_cs211711500_mpd.m4s - HTTP/1.1 200 750584 0.714 "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36" https://referrer.domain.com/video/2107 https fra1 "HIT, MISS" 12345678.domain.com
GROK pattern (use http://grokconstructor.appspot.com/do/match to verify)
RAW %{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{IPV4:clientip}%{SPACE}%{HOSTNAME:http_host}%{SPACE}%{WORD:verb}%{SPACE}\/(.*:)?%{WORD:stream}%{NOTSPACE}%{SPACE}%{NOTSPACE}%{SPACE}%{WORD:protocol}\/%{NUMBER:httpversion}%{SPACE}%{NUMBER:response}%{SPACE}%{NUMBER:bytes}%{SPACE}%{SECOND:request_time}%{SPACE}%{QUOTEDSTRING:agent}%{SPACE}%{URI:referrer}%{SPACE}%{WORD}%{SPACE}%{WORD:location}%{SPACE}%{QUOTEDSTRING:cache_status}%{SPACE}%{WORD:account}%{GREEDYDATA}
Logstash configuration (input side):
input {
file {
path => "/subfolder/logs/*"
type => "access_logs"
start_position => "beginning"
}
}
filter {
# skip first two lines in log file with comments
if [message] =~ /^#/ {
drop { }
}
grok {
patterns_dir => ["/opt/logstash/patterns"]
match => { "message" => "%{RAW}" }
}
date {
match => [ "timestamp" , "yyyy-MM-dd HH:mm:ss" ]
locale => "en"
}
# ... (rest of the config omitted for readability)
}
So I am pretty sure this is being caused by the field timestamp being mapped to a type in Elasticsearch that it doesn't parse to. If you post your index mapping, I'd be happy to look at it.
A note: you can quickly solve this by adding remove_field, because if the date filter is successful, the value of that field will be pulled into @timestamp. Right now you have the same value stored in two fields. Then you don't have to worry about the mapping for the field. :)
date {
match => [ "timestamp" , "yyyy-MM-dd HH:mm:ss" ]
locale => "en"
remove_field => [ "timestamp" ]
}

Create tag from field using regex

I am using logstash to collect my apache logs, and as such I have a field called request_url which contains values that look like:
POST /api_v1/services/order_service HTTP/1.1
POST /api_v2/services/user_service HTTP/1.0
I want to create separate tags containing the API version and the service name, e.g.
POST /api_v1/services/order_service HTTP/1.1 -> ["v1", "order_service"]
POST /api_v2/services/user_service HTTP/1.0 -> ["v2", "user_service"]
How do I achieve this in the Logstash configuration? Thanks for any pointers.
Using the following grok filter, you can separate the components you need and then add the appropriate tags
filter {
grok {
"match" => { "message" => "%{WORD:verb} /api_%{NOTSPACE:version}/services/%{WORD:service}" }
"add_tag" => ["%{version}", "%{service}"]
}
}
The event you'll get will look like this:
{
"message" => "POST /api_v1/services/order_service HTTP/1.1",
"#version" => "1",
"#timestamp" => "2016-06-07T02:52:01.136Z",
"host" => "iMac.local",
"verb" => "POST",
"version" => "v1",
"service" => "order_service",
"tags" => [
[0] "v1",
[1] "order_service"
]
}

logstash add_field and remove_field

I'm attempting to simplify my Logstash config. I want to split the program field into separate fields (as shown below); however, I would prefer to use just one grok statement (if it's at all possible!)
Of the two examples below, I get a _grokparsefailure on the second example, but not the first. Since grok has the add_field and remove_field options, I would assume that I could combine it all into one grok statement. Why is this not the case? Have I missed some ordering/syntax somewhere?
Sample log:
2016-02-16T16:42:06Z ubuntu docker/THISTESTNAME[892]: 172.16.229.1 - - [16/Feb/2016:16:42:06 +0000] "GET / HTTP/1.1" 304 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.109 Safari/537.36" "-"
Why does this work:
filter {
# Extracts the docker name, ID, image etc elements
mutate {
add_field => { "[#metadata][program]" => "%{program}" }
remove_field => "[program]"
}
grok {
patterns_dir => "/logstash/patterns_dir/docker"
match => { "[#metadata][program]" => "%{D_ID}" }
}
}
But this does not:
filter {
grok {
add_field => { "[#metadata][program]" => "%{program}" }
remove_field => "[program]"
patterns_dir => "/logstash/patterns_dir/docker"
match => { "[#metadata][program]" => "%{D_ID}" }
}
}
add_field and remove_field only run if the underlying filter succeeds. In your second example, [@metadata][program] doesn't yet exist for you to run grok{} against.
This was directly answered by @Alan; however, I found this way a little more readable, and it compressed my code even more:
grok {
patterns_dir => "/logstash/patterns_dir/docker-patterns"
match => { "program" => "%{D_ID}" }
overwrite => [ "program" ]
}
