Logstash ignore missing output

I have a logstash pipeline with an output to OpenSearch.
I was hoping that Logstash (which has other outputs in the pipeline) would keep working if OpenSearch is down, but this seems not to be the case.
Is there a way to get Logstash to ignore an output when it is not available?
Thank you
Thomas

Related

communication filebeat --> logstash: messages wrong order

I am trying to feed a CSV data file to Logstash using Filebeat. Unfortunately, the messages arrive out of order. Is there any way to correct this?
Could this be caused by TCP or by the pipeline? Logstash starts with logstash.javapipeline / pipeline_id=>"main", "pipeline.workers"=>8
I tried:
filebeat - output to console - pass
filebeat - output to logstash (localhost) - logstash w/o filter; output to stdout - fail (wrong order of messages)
By default, order is not guaranteed in Logstash, as the events in a batch can be reordered during filter processing, and some events can be processed faster than others.
If you need your events in order, you will have to change pipeline.workers to 1, which means that only one CPU will be used to process your messages.
Also, set pipeline.ordered to auto in logstash.yml.
Setting pipeline.workers to 1 makes Logstash process events in the order they are received, but since it uses only one CPU, it can hurt performance if you have a high rate of events per second.
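A minimal sketch of the two settings in logstash.yml (the values are exactly the ones described above):

pipeline.workers: 1      # single worker, so batches are processed sequentially
pipeline.ordered: auto   # enforces event ordering when pipeline.workers is 1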
This is the part of the documentation about ordering events.

Kibana - add a listener

I have ELK installed, and all works fine. I have one index that always receives logs from Logstash.
Sometimes, Logstash stops working (every second month or so), and nothing comes to the index.
I was wondering whether there is a way to query the index at some interval and, if it has no new entries, produce some kind of event which I can then handle.
For example, query that index every 10 mins, and if there are no logs, then create an event.
I assume you are looking for ELK's built-in tools. There is the Elasticsearch X-Pack plugin, which provides Watcher alerts and notifications. But if that's not a requirement, you can write a Node.js server that queries the last 5 minutes or so and sends exactly the notification you need.
I hope this helps.
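As a minimal sketch of the polling approach (the index name my-logstash-index and the host localhost:9200 are assumptions -- substitute your own):

# Count documents indexed in the last 10 minutes; alert if there are none.
COUNT=$(curl -s "http://localhost:9200/my-logstash-index/_count" \
  -H 'Content-Type: application/json' \
  -d '{"query":{"range":{"@timestamp":{"gte":"now-10m"}}}}' \
  | grep -o '"count":[0-9]*' | cut -d: -f2)

if [ "$COUNT" -eq 0 ]; then
  echo "no logs in the last 10 minutes"  # replace with your own notification
fi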

Really big retrieval lag for Logstash Kafka inputs producing data irregularly

I'm using Logstash 2.4 with Kafka input 5.1.6. In my config I created a field called lag_seconds in order to monitor how long it takes Logstash to process logs:
ruby {
  # Logstash 2.x event API; @timestamp is the event's timestamp field
  code => "event['lag_seconds'] = (Time.now.to_f - event['@timestamp'].to_f)"
}
I listen to several kafka topics from a single logstash instance and for the topics that produce logs regularly everything is OK and the lag is small (several seconds). However, for the topics that produce small amount of logs irregularly I get really big lags. Sometimes it's tens of thousands of seconds.
My configuration for the Kafka input is the following:
input {
  kafka {
    bootstrap_servers => "kafka-broker1:6667,kafka-broker2:6667,kafka-broker3:6667"
    add_field => { "environment" => "my_env" }
    topics_pattern => "a|lot|of|topics|like|60|of|them"
    decorate_events => true
    group_id => "mygroup1"
    codec => json
    consumer_threads => 10
    auto_offset_reset => "latest"
    auto_commit_interval_ms => "5000"
  }
}
The Logstash instance seems healthy, as logs from other topics are being retrieved regularly. I've checked, and if I connect to Kafka using its console consumer, the delayed logs are there. I've also considered that it might be a problem with too many topics being served by a single Logstash instance, and extracted those small topics to separate Logstash instances, but the lag is exactly the same, so that's not the issue.
Any ideas? I suspect that logstash might be using some exponential delay for log retrieval, but have no idea how to confirm and fix that.
Some information is still missing:
Kafka client version?
What's the content of @timestamp?
What's the order of the filters? Is ruby the last one?
"the delayed logs are there" -- does "there" mean in Kafka?
Timestamp
If no date filter is used to change this field, @timestamp should be the time at which the log entry was read.
In this case the lag runs to tens of thousands of seconds, so I guess a date filter is used and the timestamp here is the time at which the log was generated.
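As a sketch, a date filter like the following is what would overwrite @timestamp with the log's own time (the field name log_time is hypothetical -- use whatever field carries the original event time):

filter {
  date {
    # parse the event's own time field and write it into @timestamp
    match => ["log_time", "ISO8601"]
  }
}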
Wait Before Fetch
When the Kafka input plugin consumes messages, it waits some time for the server to respond. This can be configured by two options:
fetch_max_wait_ms
poll_timeout_ms
You may check them in your config file.
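A sketch of where these options would go, alongside the existing Kafka input settings (the values are illustrative, not recommendations):

input {
  kafka {
    # ... existing options from the question's config ...
    fetch_max_wait_ms => "500"  # max time the broker waits to fill a fetch before responding
    poll_timeout_ms => 100      # how long the consumer blocks on each poll
  }
}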
Wait Before Filter
Logstash handles input logs in batches to improve performance, so if not enough logs come in, it waits some time. This is controlled by:
pipeline.batch.delay
You may check it in logstash.yml.
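A sketch of the relevant logstash.yml settings (defaults vary by Logstash version; these values are only illustrative):

pipeline.batch.size: 125  # events a worker collects before running filters
pipeline.batch.delay: 50  # ms to wait for a full batch before flushing anyway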
Metric
Logstash itself can generate metric information which, combined with Elasticsearch and Kibana, can be very handy. So I suggest you give it a try.
Ref
Kafka Input
Logstash Config
ELK Monitoring

How to speed up logstash pattern matching (grok)?

I have a 200 MB log file. I feed the log file into Logstash, and it takes a few hours to get the job done.
I am wondering if there's a way to speed things up, perhaps by running it in parallel?
You can take a look here at how to speed things up:
The default number of filter workers is 1, but you can increase this number with the '-w' flag on the agent.
For example, if your grok pattern is complex, you can use multiple filter workers (threads) to do the filter task and speed up Logstash's parsing of the logs.
Start with 10 workers like this:
`bin/logstash -f test.conf -w 10`
This will output:
Settings: User set filter workers: 10, Default filter workers: 1
Logstash startup completed

Graphite: Aggregation Rules not working

I have added many aggregation rules, like
app.email.server1.total-sent.1d.sum (86400) = sum app.email.server1.total-sent.1h.sum
I want to know whether there is any limit on the number of aggregation rules. Other aggregation rules of the same kind are working.
I also checked with tcpdump; packets containing the tag app.email.server1.total-sent.1h.sum are coming in.
Can we debug by checking logs? I tried, but the logs do not mention anything about which metrics are getting aggregated.
You want to sum all the 1h values up into 1d, so in the rule, on the right-hand side, put * instead of 1h:
app.email.server1.total-sent.1d.sum (86400) = sum app.email.server1.total-sent.*.sum
