Logstash override host with filebeat name - logstash

I have set up the Filebeat -> Logstash -> Elasticsearch -> Kibana pipeline successfully. Now, in Logstash, I want to override the host field with beat.name. However, when I try to refer to the Beat metadata, the variable is not resolved.
mutate {
  add_field => {
    "timestamp" => "%{year}-%{month}-%{day} %{time}"
  }
  replace_field => {
    "host" => "%{[@metadata][beat][name]}"
  }
}
I think I am missing some major configuration. Even when Logstash forwards the event to Elasticsearch, these references are not resolved.
output {
  elasticsearch {
    hosts => "localhost:9200"
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}
How do we refer to filebeat meta information in logstash config file correctly?

The beat.name field is not carried in the @metadata object; beat is a top-level field in the event. To refer to the value, use [beat][name], or "%{[beat][name]}" inside a string.
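As a minimal sketch of the override, assuming the event carries a top-level beat field as described above. Note that the mutate option is named replace, not replace_field:

```
filter {
  mutate {
    # Overwrite host with the shipper name reported by Filebeat.
    # Assumes [beat][name] is present on the event.
    replace => { "host" => "%{[beat][name]}" }
  }
}
```

If [beat][name] is missing from an event, the reference is left unresolved and the literal %{[beat][name]} text ends up in host, so it is worth checking a sample event with a stdout output first.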

Related

Filebeat with multiple Logstash pipelines

I have Filebeat configured to watch several different logs on a single host, e.g. Nginx and my app server. However, as I understand it, you cannot have multiple outputs in any one Beat -- so my filebeat.yml has a single output.logstash directive which points to my Logstash server.
Does Logstash have the concept of pipeline routing? I have several pipelines configured on my Logstash server but it's unclear how to leverage that from Filebeat, e.g. I would like to send Nginx logs to my Logstash pipeline for Nginx, etc.
Alternatively, is there a way to route the beats for Nginx to logstash:5044, and the beats for my app server to logstash:5045?
For each of the filebeat prospectors you can use the fields option to add a field that logstash can check to identify what type of data the prospector is collecting. Then in logstash you can use pipeline-to-pipeline communication with the distributor pattern to send different types of data to different pipelines.
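A rough sketch of the distributor pattern, assuming a Logstash version that supports pipeline-to-pipeline communication. The field name log_type and the addresses nginx and app are hypothetical; custom fields set through the Filebeat fields option arrive under [fields] unless fields_under_root is enabled. Each section below would be a separate pipeline registered in pipelines.yml:

```
# distributor pipeline (hypothetical file: distributor.conf)
input {
  beats { port => 5044 }
}
output {
  if [fields][log_type] == "nginx" {
    # Hand Nginx events to the pipeline listening on address "nginx".
    pipeline { send_to => ["nginx"] }
  } else {
    # Everything else goes to the app-server pipeline.
    pipeline { send_to => ["app"] }
  }
}

# Nginx pipeline (hypothetical file: nginx.conf)
input {
  pipeline { address => "nginx" }
}

# App-server pipeline (hypothetical file: app.conf)
input {
  pipeline { address => "app" }
}
```

The downstream pipelines then carry their own filter and output sections, so Nginx parsing stays separate from the app-server parsing.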
You can use tags on your filebeat inputs and filter on your logstash pipeline using those tags.
For example, add the tag nginx to your Nginx input in Filebeat and the tag app-server to your app server input, then use those tags in the Logstash pipeline to apply different filters and outputs. It will be a single pipeline, but it will route the events based on the tag.
If you want to send the different logs to different ports, you will need to run another instance of Filebeat.
You can use tags to distinguish multiple log files.
filebeat.yml
filebeat.inputs:
- type: log
  tags: [apache]
  paths:
    - "/home/ubuntu/data/apache.log"
- type: log
  tags: [gunicorn]
  paths:
    - "/home/ubuntu/data/gunicorn.log"
queue.mem:
  events: 4096
  flush.min_events: 512
  flush.timeout: 5s
output.logstash:
  hosts: ["****************:5047"]
conf.d/logstash.conf
input {
  beats {
    port => "5047"
    host => "0.0.0.0"
  }
}
filter {
  if "gunicorn" in [tags] {
    grok {
      match => {
        "message" => "%{USERNAME:u1} %{USERNAME:u2} \[%{HTTPDATE:http_date}\] \"%{DATA:http_verb} %{URIPATHPARAM:api} %{DATA:http_version}\" %{NUMBER:status_code} %{NUMBER:byte} \"%{DATA:external_api}\" \"%{GREEDYDATA:android_client}\""
      }
      # remove_field is a grok option, so it belongs outside the match hash
      remove_field => ["message"]
    }
    date {
      match => ["http_date", "dd/MMM/yyyy:HH:mm:ss Z"]
    }
    mutate {
      remove_field => ["agent"]
    }
  } else if "apache" in [tags] {
    grok {
      match => {
        "message" => "%{IPORHOST:client_ip} %{DATA:u1} %{DATA:u2} \[%{HTTPDATE:http_date}\] \"%{WORD:http_method} %{URIPATHPARAM:api} %{DATA:http_version}\" %{NUMBER:status_code} %{NUMBER:byte} \"%{DATA:external_api}\" \"%{GREEDYDATA:gd}\" \"%{DATA:u3}\""
      }
      remove_field => ["message"]
    }
    date {
      match => ["http_date", "dd/MMM/yyyy:HH:mm:ss Z"]
    }
    mutate {
      remove_field => ["agent"]
    }
  }
}
output {
  if "gunicorn" in [tags] {
    stdout { codec => rubydebug }
    elasticsearch {
      hosts => ["0.0.0.0:9200"]
      index => "gunicorn-sample-%{+YYYY.MM.dd}"
    }
  } else if "apache" in [tags] {
    stdout { codec => rubydebug }
    elasticsearch {
      hosts => ["0.0.0.0:9200"]
      index => "apache-sample-%{+YYYY.MM.dd}"
    }
  }
}

Data missed in Logstash?

I am losing a lot of data in Logstash version 5.0.
Is this a serious bug? I have changed the config file many times, but it makes no difference; the data loss happens again and again. How do I use Logstash to collect log events properly?
Any reply would be appreciated.
Logstash reads logs from a specified location and, based on the information you are interested in, lets you create an index in Elasticsearch or send the events to another output.
Example Logstash config:
input {
  file {
    # PLEASE SET APPROPRIATE PATH WHERE LOG FILE AVAILABLE
    #type => "java"
    type => "json-log"
    path => "d:/vox/logs/logs/vox.json"
    start_position => "beginning"
    codec => json
  }
}
filter {
  if [type] == "json-log" {
    grok {
      match => { "message" => "UserName:%{JAVALOGMESSAGE:UserName} -DL_JobID:%{JAVALOGMESSAGE:DL_JobID} -DL_EntityID:%{JAVALOGMESSAGE:DL_EntityID} -BatchesPerJob:%{JAVALOGMESSAGE:BatchesPerJob} -RecordsInInputFile:%{JAVALOGMESSAGE:RecordsInInputFile} -TimeTakenToProcess:%{JAVALOGMESSAGE:TimeTakenToProcess} -DocsUpdatedInSOLR:%{JAVALOGMESSAGE:DocsUpdatedInSOLR} -Failed:%{JAVALOGMESSAGE:Failed} -RecordsSavedInDSE:%{JAVALOGMESSAGE:RecordsSavedInDSE} -FileLoadStartTime:%{JAVALOGMESSAGE:FileLoadStartTime} -FileLoadEndTime:%{JAVALOGMESSAGE:FileLoadEndTime}" }
      add_field => ["STATS_TYPE", "FILE_LOADED"]
    }
  }
}
filter {
  mutate {
    # here converting data type
    convert => {
      "FileLoadStartTime" => "integer"
      "RecordsInInputFile" => "integer"
    }
  }
}
output {
  elasticsearch {
    # PLEASE CONFIGURE ES IP AND PORT WHERE LOG DOCs HAS TO PUSH
    document_type => "json-log"
    hosts => ["localhost:9200"]
    # action => "index"
    # host => "localhost"
    index => "locallogstashdx_new"
    # workers => 1
  }
  stdout { codec => rubydebug }
  #stdout { debug => true }
}
To learn more, you can go through the many available resources, such as:
https://www.elastic.co/guide/en/logstash/current/first-event.html

Kibana does not show fields from grok filter in filebeat

I have a log file with Apache logs that I want to show in Kibana.
The logs start with an IP address. I have debugged my pattern and it passes.
I'm trying to add fields in the beats input configuration file, but they are not shown in Kibana, even after refreshing the fields.
Here is the configuration file
filter {
  if [type] == "apache" {
    grok {
      match => { "message" => "%{HOST:log_host}%{GREEDYDATA:remaining}" }
      add_field => { "testip" => "%{log_host}" }
      add_field => { "data_left" => "%{remaining}" }
    }
  }
...
Just to add that I have restarted all the services: logstash, elasticsearch, kibana after the new configuration.
The issue could be that your grok pattern is too rigid.
Chances are that HOST should be IPORHOST, judging by the name of your testip field.
Assuming that the data is actually coming in with the type defined as apache, then it should be:
filter {
  if [type] == "apache" {
    grok {
      match => {
        message => "%{IPORHOST:log_host}%{GREEDYDATA:remaining}"
      }
      add_field => {
        testip => "%{log_host}"
        data_left => "%{remaining}"
      }
    }
  }
}
Having said that, your usage of add_field is completely unnecessary. The grok pattern itself is creating two fields: log_host and remaining, so there's no need to define extra fields called testip and data_left.
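A sketch of the same filter with the redundant add_field dropped, assuming the events really do arrive with type set to apache:

```
filter {
  if [type] == "apache" {
    grok {
      # log_host and remaining are created by the pattern itself,
      # so no add_field is needed.
      match => { "message" => "%{IPORHOST:log_host}%{GREEDYDATA:remaining}" }
    }
  }
}
```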
Perhaps even more usefully, you don't need to craft your own Apache web log grok pattern. The COMBINEDAPACHELOG pattern already exists, which gives all of the standard fields automatically.
filter {
  if [type] == "apache" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    # Set @timestamp to the log's time and drop the unneeded timestamp
    date {
      match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
      remove_field => "timestamp"
    }
  }
}
You can see this in a more complete example in the Logstash documentation.

Logstash - grok renaming field name

Here is an example of an event message:
{
"timestamp":"2016-03-29T22:35:44.770750-0400",
"flow_id":45385792,
"in_iface":"eth1",
"event_type":"alert",
"src_ip":"3.3.3.8",
"src_port":21,
"dest_ip":"2.2.2.2",
"dest_port":52934,
"proto":"TCP",
"alert":{
"action":"allowed",
"gid":1,
"signature_id":4027,
"rev":0,
"signature":"FTP Successful Login",
"category":"",
"severity":3
},
"payload":"MjU3ICIvaG9tZS9uZXd1c2VyIg0K",
"payload_printable":"257 newuser",
"stream":0,
"packet":"AFBWo0NoAFBWoxZWCABFAABJKDpAAEAGCGcDAwMIAgICAgAVzsbd4MhqOBOjfoAYAOMYcwAAAQEIChHN4EQHnwugMjU3ICIvaG9tZS9uZXd1c2VyIg0K"
}
My Logstash config file is the following:
input {
  beats {
    port => 5044
    codec => json
    type => "SuricataIDPS"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    sniffing => true
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    #document_type => "%{[@metadata][type]}"
  }
}
I'd like to be able to rename the field alert.signature.
How can I do so? It seems that Logstash does not recognize that field.
Thanks for your help!
Efrat
You have to define a mutate filter within the filter stanza:
filter {
  mutate {
    rename => [ "[alert][signature]", "[alert][signature_renamed]" ]
  }
}

Can't get logstash to geoip locate from Netflow input

I'm trying to use Logstash to parse out and geolocate IP addresses from a Netflow source, it works to get the data into Elasticsearch, but it's not putting in the geoip info. Here's my config file that I'm using in logstash
input {
  udp {
    host => localhost
    port => 5555
    codec => netflow
  }
}
filter {
  geoip {
    target => "geoip"
    source => "ipv4_dst_addr"
    add_tag => ["geoip"]
    add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}"$
    add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" $
  }
}
output {
  stdout { }
  elasticsearch { host => "127.0.0.1" }
}
More info that might help, Using Logstash 1.4.2 and Elasticsearch 1.3.4.
Any luck in figuring this one out?
If not, please note that you need to use a mutate to convert the coordinates to float.
However, the geoip filter in Logstash 1.3 and up adds a location field directly so you won't have to use add_field and you won't even have to use the converter. If you try these two solutions, please tell me how it goes. Thank you.
A side note: the recommended Elasticsearch version to use with Logstash 1.4.2 is 1.1.1.
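For reference, the add_field plus float conversion mentioned in this answer might look like the sketch below. It uses the older array syntax that matches Logstash 1.x, and the source field path is an assumption that depends on where the netflow codec stores its fields:

```
filter {
  geoip {
    source => "[netflow][ipv4_dst_addr]"   # assumed field path
    target => "geoip"
    # Build a [longitude, latitude] pair from the looked-up values.
    add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
    add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
  }
  mutate {
    # add_field produces strings, so convert them to numbers.
    convert => [ "[geoip][coordinates]", "float" ]
  }
}
```

If the geoip filter already populates a location field, as the answer suggests, the add_field lines and the mutate block can be dropped.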
I just spent some time digging into this, and it ends up being something of a bug in the Netflow codec code (specifically, in the IP4Addr class in netflow/util.rb).
You should be able to work around this with a mutate filter, like this:
filter {
  mutate {
    convert => {
      "[netflow][ipv4_src_addr]" => "string"
      "[netflow][ipv4_dst_addr]" => "string"
    }
  }
  geoip {
    source => "[netflow][ipv4_src_addr]"
    target => "src_geoip"
  }
  geoip {
    source => "[netflow][ipv4_dst_addr]"
    target => "dst_geoip"
  }
}
I've submitted a pull request to fix this properly, but in the meantime, try that config.

Resources