I am trying to process Nginx access log with the help of Logstash (Version 2.1.3)
On the basis of different endpoints found in Nginx access log, I want to send data in different queues or sometimes in different RabbitMQ Servers.
Here is my Logstash configuration:
input {
stdin {}
}
filter {
grok {
match => { "message" => "(?<status>.*?)!~~!(?<req_tm>.*?)!~~!(?<time>.*?)!~~!(?<req_method>.*?)!~~!(?<req_uri>.*)" }
tag_on_failure => ["first_grok_failed"]
}
if "/endpoint1" in [req_uri] {
mutate { add_field => { "[queue]" => "endpoint_one" } }
mutate { add_field => { "[rmqshost]" => "10.10.10.1" } }
}
else if "/endpoint2" in [req_uri] {
mutate { add_field => { "[queue]" => "endpoint_two" } }
mutate { add_field => { "[rmqshost]" => "10.10.10.2" } }
}
else {
mutate { add_field => { "[queue]" => "endpoint_other" } }
mutate { add_field => { "[rmqshost]" => "10.10.10.3" } }
}
}
output {
rabbitmq {
exchange => "%{[queue]}_exchange"
exchange_type => "direct"
host => "%{[rmqshost]}"
key => "%{[queue]}_key"
password => "mypassword"
user=>"myuser"
vhost=>"myvhost"
durable=>false
}
stdout {
codec => rubydebug
}
}
In filter section of above configuration, I add dynamic variable "queue" and "rmqshost".
In output section, I tried using those varibles inside rabbitmq plugin
block.
I am getting following error which shows that "rmqshost" variable have not got replaced.
Connection to %{[rmqshost]}:5672 refused: host unknown
{:exception=>"MarchHare::ConnectionRefused", :backtrace=>
["/opt/logstash/vendor/bundle/jruby/1.9/gems/march_hare-2.15.0-
java/lib/march_hare/session.rb:473:in `converting_rjc_exceptions_to_ruby'",
"/opt/logstash/vendor/bundle/jruby/1.9/gems/march_hare-2.15.0-java/lib/march_hare/session.rb:500:in `new_connection_impl'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/march_hare-2.15.0-java/lib/march_hare/session.rb:136:in `initialize'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/march_hare-2.15.0-java/lib/march_hare/session.rb:109:in `connect'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/march_hare-2.15.0-java/lib/march_hare.rb:20:in `connect'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-mixin-rabbitmq_connection-2.3.0-java/lib/logstash/plugin_mixins/rabbitmq_connection.rb:137:in `connect'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-mixin-rabbitmq_connection-2.3.0-java/lib/logstash/plugin_mixins/rabbitmq_connection.rb:94:in `connect!'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-rabbitmq-3.0.7-java/lib/logstash/outputs/rabbitmq.rb:40:in `register'", "org/jruby/RubyArray.java:1613:in `each'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.1.3-java/lib/logstash/pipeline.rb:192:in `start_outputs'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.1.3-java/lib/logstash/pipeline.rb:102:in `run'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.1.3-java/lib/logstash/agent.rb:165:in `execute'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.1.3-java/lib/logstash/runner.rb:90:in `run'", "org/jruby/RubyProc.java:281:in `call'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.1.3-java/lib/logstash/runner.rb:95:in `run'", "org/jruby/RubyProc.java:281:in `call'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.22/lib/stud/task.rb:24:in `initialize'"], :level=>:error}
I am running logstash as follows:
/opt/logstash/bin/logstash -f /etc/logstash/conf.d/nginx-filter.conf
with following sample data:
200!~~!0.004!~~!14/Apr/2017:05:15:27 +0000!~~!GET!~~!/endpoint1?key1=val1
200!~~!0.004!~~!14/Apr/2017:05:17:25 +0000!~~!GET!~~!/endpoint2?key1=val2
The sprintf replacement only works for the key field in the rabbitmq output plugin.
#hare_info.exchange.publish(message, :routing_key => event.sprintf(#key), :properties => symbolize(#message_properties.merge(:persistent => #persistent)))
The actual connection is established using https://github.com/logstash-plugins/logstash-mixin-rabbitmq_connection, which does not provide the option to replace the host via variables:
:hosts => #host
So this is currently not supported, only the key field might be replaced by logstash variables.
Related
Logstash 7.8.1
I'm trying to create two documents from one input with logstash. Different templates, different output indexes. Everything worked fine until I tried to change value only on the cloned doc.
I need to have one field in both documents with different values - is it possible with clone filter plugin?
Doc A - [test][event]- trn
Doc B (cloned doc) - [test][event]- spn
I thought that it will work if I use remove_field and next add_field in clone plugin, but I'm afraid that there was problem with sorting - maybe remove_field method is called after add_field (the field was only removed, but not added with new value).
Next I tried to add value to cloned document first and than to original, but it always made an array with both values (orig and cloned) and I need to have only one value in that field:/.
Can someone help me please?
Config:
input {
file {
path => "/opt/test.log"
start_position => beginning
}
}
filter {
grok {
match => {"message" => "... grok...."
}
}
mutate {
add_field => {"[test][event]" => "trn"}
}
clone {
clones => ["cloned"]
#remove_field => [ "[test][event]" ] #remove the field completely
add_field => {"[test][event]" => "spn"} #not added
add_tag => [ "spn" ]
}
}
output {
if "spn" in [tags] {
elasticsearch {
index => "spn-%{+yyyy.MM}"
hosts => ["localhost:9200"]
template_name => "templ1"
}
stdout { codec => rubydebug }
} else {
elasticsearch {
index => "trn-%{+yyyy.MM}"
hosts => ["localhost:9200"]
template_name => "templ2"
}
stdout { codec => rubydebug }
}
}
If you want to make the field that is added conditional on whether the event is the clone or the original then check the [type] field.
clone { clones => ["cloned"] }
if [type] == "cloned" {
mutate { add_field => { "foo" => "spn" } }
} else {
mutate { add_field => { "foo" => "trn" } }
}
add_field is always done before remove_field.
Data missed a lot in logstash version 5.0,
is it a serous bug ,when a config the config file so many times ,it useless,data lost happen again and agin, how to use logstash to collect log event property ?
any reply will thankness
Logstash is all about reading logs from specific location and based on you interested information you can create index in elastic search or other output also possible.
Example of logstash conf
input {
file {
# PLEASE SET APPROPRIATE PATH WHERE LOG FILE AVAILABLE
#type => "java"
type => "json-log"
path => "d:/vox/logs/logs/vox.json"
start_position => "beginning"
codec => json
}
}
filter {
if [type] == "json-log" {
grok {
match => { "message" => "UserName:%{JAVALOGMESSAGE:UserName} -DL_JobID:%{JAVALOGMESSAGE:DL_JobID} -DL_EntityID:%{JAVALOGMESSAGE:DL_EntityID} -BatchesPerJob:%{JAVALOGMESSAGE:BatchesPerJob} -RecordsInInputFile:%{JAVALOGMESSAGE:RecordsInInputFile} -TimeTakenToProcess:%{JAVALOGMESSAGE:TimeTakenToProcess} -DocsUpdatedInSOLR:%{JAVALOGMESSAGE:DocsUpdatedInSOLR} -Failed:%{JAVALOGMESSAGE:Failed} -RecordsSavedInDSE:%{JAVALOGMESSAGE:RecordsSavedInDSE} -FileLoadStartTime:%{JAVALOGMESSAGE:FileLoadStartTime} -FileLoadEndTime:%{JAVALOGMESSAGE:FileLoadEndTime}" }
add_field => ["STATS_TYPE", "FILE_LOADED"]
}
}
}
filter {
mutate {
# here converting data type
convert => { "FileLoadStartTime" => "integer" }
convert => { "RecordsInInputFile" => "integer" }
}
}
output {
elasticsearch {
# PLEASE CONFIGURE ES IP AND PORT WHERE LOG DOCs HAS TO PUSH
document_type => "json-log"
hosts => ["localhost:9200"]
# action => "index"
# host => "localhost"
index => "locallogstashdx_new"
# workers => 1
}
stdout { codec => rubydebug }
#stdout { debug => true }
}
To know more you can go throw many available websites like
https://www.elastic.co/guide/en/logstash/current/first-event.html
I know ntopng can put direct to elasticsearch but my boss want use logtash as layer to transfer log to elasticsearch.
I'm try many time but failed.
ntopng log like:
{"index": {"_type": "ntopng", "_index": "ntopng-2016.08.23"}}
{ "#timestamp": "2016-08-23T01:49:41.0Z", "type": "ntopng", "IN_SRC_MAC": "04:2A:E2:0D:62:FB", "OUT_DST_MAC": "00:16:3E:8D:B7:E4", "IPV4_SRC_ADDR": "14.152.84.14", "IPV4_DST_ADDR": "xxx.xxx.xxx", "L4_SRC_PORT": 34599, "L4_DST_PORT": 53, "PROTOCOL": 17, "L7_PROTO": 5, "L7_PROTO_NAME": "DNS", "IN_PKTS": 15, "IN_BYTES": 1185, "OUT_PKTS": 15, "OUT_BYTES": 22710, "FIRST_SWITCHED": 1471916981, "LAST_SWITCHED": 1471916981, "SRC_IP_COUNTRY": "CN", "SRC_IP_LOCATION": [ 113.250000, 23.116699 ], "DST_IP_COUNTRY": "VN", "DST_IP_LOCATION": [ 105.849998, 21.033300 ], "NTOPNG_INSTANCE_NAME": "ubuntu", "INTERFACE": "ens192", "DNS_QUERY": "cpsc.gov", "PASS_VERDICT": true }
Logstash config:
input {
tcp {
port => 5000
codec => json
}
}
filter{
json{
source => "message"
}
}
output {
elasticsearch {
hosts => ["localhost:9200"]
}
stdout{ codec => rubydebug }
}
Thanks
Since the ntopng logs are already in the bulk format expected by Elasticsearch you don't need to use the elasticsearch output but you can use the http output directly like this. No need to have Logstash parse JSON, simply forward the raw bulk commands to ES.
There's one catch, though: we need to add a newline character after the second line otherwise ES will reject the bulk call. We can achieve this with a mutate/update filter that adds a verbatim newline character after the message. Try it out, this will work.
input {
tcp {
port => 5000
codec => multiline {
pattern => "_index"
what => "next"
}
}
}
filter{
mutate {
update => {"message" => "%{message}
"}
}
}
output {
http {
http_method => "post"
url => "http://localhost:9200/_bulk"
format => "message"
message => "%{message}"
}
}
I am trying to get logstash to parse key-value pairs in an HTTP get request from my ELB log files.
the request field looks like
http://aaa.bbb/get?a=1&b=2
I'd like there to be a field for a and b in the log line above, and I am having trouble figuring it out.
My logstash conf (formatted for clarity) is below which does not load any additional key fields. I assume that I need to split off the address portion of the URI, but have not figured that out.
input {
file {
path => "/home/ubuntu/logs/**/*.log"
type => "elb"
start_position => "beginning"
sincedb_path => "log_sincedb"
}
}
filter {
if [type] == "elb" {
grok {
match => [ "message", "%{TIMESTAMP_ISO8601:timestamp}
%{NOTSPACE:loadbalancer} %{IP:client_ip}:%{NUMBER:client_port:int}
%{IP:backend_ip}:%{NUMBER:backend_port:int}
%{NUMBER:request_processing_time:float}
%{NUMBER:backend_processing_time:float}
%{NUMBER:response_processing_time:float}
%{NUMBER:elb_status_code:int}
%{NUMBER:backend_status_code:int}
%{NUMBER:received_bytes:int} %{NUMBER:sent_bytes:int}
%{QS:request}" ]
}
date {
match => [ "timestamp", "ISO8601" ]
}
kv {
field_split => "&?"
source => "request"
exclude_keys => ["callback"]
}
}
}
output {
elasticsearch { host => localhost }
}
kv will take a URL and split out the params. This config works:
input {
stdin { }
}
filter {
mutate {
add_field => { "request" => "http://aaa.bbb/get?a=1&b=2" }
}
kv {
field_split => "&?"
source => "request"
}
}
output {
stdout {
codec => rubydebug
}
}
stdout shows:
{
"request" => "http://aaa.bbb/get?a=1&b=2",
"a" => "1",
"b" => "2"
}
That said, I would encourage you to create your own versions of the default URI patterns so that they set fields. You can then pass the querystring field off to kv. It's cleaner that way.
UPDATE:
For "make your own patterns", I meant to take the existing ones and modify them as needed. In logstash 1.4, installing them was as easy as putting them in a new file the 'patterns' directory; I don't know about patterns for >1.4 yet.
MY_URIPATHPARAM %{URIPATH}(?:%{URIPARAM:myuriparams})?
MY_URI %{URIPROTO}://(?:%{USER}(?::[^#]*)?#)?(?:%{URIHOST})?(?:%{MY_URIPATHPARAM})?
Then you could use MY_URI in your grok{} pattern and it would create a field called myuriparams that you could feed to kv{}.
I have just put up an ELK stack, but I am having trouble regarding the logstash configuration in /etc/logstash/conf.d I have two input sources being forwarded from one linux server, which has a logstash forwarder installed on it with the "files" looking like:
{
"paths": ["/var/log/syslog","/var/log/auth.log"],
"fields": { "type": "syslog" }
},
{
"paths": ["/var/log/osquery/osqueryd.results.log"],
"fields": { "type": "osquery_json" }
}
As you can see, one input is an osquery output (json formatted), and the other is syslog. My current config for logstash is osquery.conf:
input {
lumberjack {
port => 5003
ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
codec => "json"
}
}
filter {
if [type] == "osquery_json" {
date {
match => [ "unixTime", "UNIX" ]
}
}
}
output {
elasticsearch { host => localhost }
stdout { codec => rubydebug }
}
Which works fine for the one input source, but I do not know how to add my other syslog input source to the same config, as the "codec" field is in the input -- I can't change it to syslog...
I am also planning on adding another input source in a windows log format that is not being forwarded by a logstash forwarder. Is there anyway to structure this differently?
It's probably better to just remove the codec from your input if you are going to be handling different codecs on the same input:
input {
lumberjack {
port => 5003
ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
}
}
filter {
if [type] == "osquery_json" {
json {
source => "field_name_the_json_encoded_data_is_stored_in"
}
date {
match => [ "unixTime", "UNIX" ]
}
}
if [type] == "syslog" {
}
}
output {
elasticsearch { host => localhost }
stdout { codec => rubydebug }
}
Then you just need to decide what you want to do with your syslog messages.
I would suggest also splitting your config into multiple files. I tend to to use 01-filename.conf - 10-filename.conf for inputs, 11-29 as filters and anything above that for outputs. These files will be loaded in to logstash in the order they are printed in an ls.