Assign dynamic value for ttl variable in logstash memcached plugin

We are using the memcached plugin in Logstash to store some data in memory, and we added ttl as a variable with a static value.
Is there any way to assign a dynamic value to ttl based on a field in Logstash?
The present configuration is:
if [ipfix][AMZDNSTTL] == 3600 {
  memcached {
    id => "ipfix_ttl3600_1"
    hosts => ["192.168.122.3"]
    namespace => "ns_map"
    set => {
      "[event][type]" => "%{[ipfix][AMZDNSIPAddress]}"
    }
    # Expires in 1 Hr
    ttl => 3600
  }
}
We need a configuration like:
if [ipfix][AMZDNSTTL] == 3600 {
  memcached {
    id => "ipfix_ttl3600_1"
    hosts => ["192.168.122.3"]
    namespace => "ipfix_dns_map"
    set => {
      "[event][type]" => "%{[ipfix][AMZDNSIPAddress]}"
    }
    # Expires in 1 Hr
    ttl => "%{[ipfix][AMZDNSTTL]}"
  }
}
Please check and let us know. Thanks in advance.

Related

logstash mix json and plain content

I use Logstash as a syslog relay; it forwards the data to Graylog and writes the data to a file.
I use the dns filter to replace the IP with the FQDN, and after this I can't write the raw content to the file; the IP is "json-ed".
What I get:
2022-05-17T15:17:01.580175Z {ip=vm2345.lab.com} <86>1 2022-05-17T17:17:01.579496+02:00 vm2345 CRON 2057538 - - pam_unix(cron:session): session closed for user root
What I want to get:
2022-05-17T15:17:01.580175Z vm2345.lab.com <86>1 2022-05-17T17:17:01.579496+02:00 vm2345 CRON 2057538 - - pam_unix(cron:session): session closed for user root
My config:
input {
  syslog {
    port => 514
    type => "rsyslog"
  }
}
filter {
  if [type] == "rsyslog" {
    dns {
      reverse => [ "[host][ip]" ]
      action => "replace"
    }
  }
}
output {
  if [type] == "rsyslog" {
    gelf {
      host => "graylog.lab.com"
      port => 5516
    }
    file {
      path => "/data/%{+YYYY}/%{+MM}/%{+dd}/%{[host][ip]}/%{[host][ip]}_%{{yyyy_MM_dd}}.log"
      codec => "line"
    }
    stdout { }
  }
}
What's the best way to handle this?
When you use codec => line, there is no default setting for the format option, so the codec calls .to_s on the event. The to_s method for an event concatenates the @timestamp, the [host] field, and the [message] field. You want the [host][ip] field, not the [host] field (which is an object), so tell the codec that:
codec => line { format => "%{@timestamp} %{[host][ip]} %{message}" }
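Applied to the file output from the question, a minimal sketch (keeping the original path, and assuming nothing else changes) would look like this:
file {
  path => "/data/%{+YYYY}/%{+MM}/%{+dd}/%{[host][ip]}/%{[host][ip]}_%{{yyyy_MM_dd}}.log"
  # format the line explicitly instead of relying on the event's to_s
  codec => line { format => "%{@timestamp} %{[host][ip]} %{message}" }
}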

Logstash alternative to receive messages from AWS SQS and batch store in AWS S3

I need the ability to store logs as batches in AWS S3, as text files formatted appropriately for JSON-SerDe.
Here is an example of how one of the batched log files would look on S3; it is quite important that the datetime format is yyyy-MM-dd HH:mm:ss:
{"message":"Message number 1","datetime":"2020-12-01 14:37:00"}
{"message":"Message number 2","datetime":"2020-12-01 14:38:00"}
{"message":"Message number 3","datetime":"2020-12-01 14:39:00"}
Ideally these would be stored on S3 every 5 seconds or when the queued messages hit 50, but this should also be configurable.
I've almost managed to get this working with Logstash using the sqs input plugin and the s3 output plugin with the config below:
input {
  sqs {
    endpoint => "AWS_SQS_ENDPOINT"
    queue => "logs"
  }
}
output {
  s3 {
    access_key_id => "AWS_ACCESS_KEY_ID"
    secret_access_key => "AWS_SECRET_ACCESS_KEY"
    region => "AWS_REGION"
    bucket => "AWS_BUCKET"
    prefix => "audit/year=%{+YYYY}/month=%{+MM}/day=%{+dd}/"
    size_file => 128
    time_file => 5
    codec => "json_lines"
    encoding => "gzip"
    canned_acl => "private"
  }
}
The problem is that the S3 output plugin requires the @timestamp field, which isn't compatible with our query tool. If you use the mutate filter to remove @timestamp or change it to datetime, then it will not process the logs. We can't store both the datetime field and @timestamp for every record, as that drastically increases the amount of data we need to store (millions of logs).
Are there any other software alternatives for achieving this result?
Updated config, which is working with Logstash, thanks to Badger (https://stackoverflow.com/users/11792977/badger):
input {
  sqs {
    endpoint => "http://AWS_SQS_ENDPOINT"
    queue => "logs"
  }
}
filter {
  mutate {
    add_field => {
      "[@metadata][year]" => "%{+YYYY}"
      "[@metadata][month]" => "%{+MM}"
      "[@metadata][day]" => "%{+dd}"
    }
    remove_field => [ "@timestamp" ]
  }
}
output {
  s3 {
    access_key_id => "AWS_ACCESS_KEY_ID"
    secret_access_key => "AWS_SECRET_ACCESS_KEY"
    region => "AWS_REGION"
    bucket => "AWS_BUCKET"
    prefix => "audit/year=%{[@metadata][year]}/month=%{[@metadata][month]}/day=%{[@metadata][day]}"
    # 1 MB
    size_file => 1024
    # 1 Minute
    time_file => 1
    codec => "json_lines"
    encoding => "gzip"
    canned_acl => "private"
  }
}
I do not see any dependency on @timestamp in the s3 output code. You have created one by using a sprintf reference to it in prefix => "audit/year=%{+YYYY}/month=%{+MM}/day=%{+dd}/". You can move those sprintf references into a mutate+add_field filter that adds fields under [@metadata], then remove @timestamp, and then reference the [@metadata] fields in the prefix option.

Logstash pull server date as config variable

As part of my Logstash config I want to pull the current date from the server, which it uses as part of its API query via http_poller.
Is there any way to do that? I've tried something along the lines of this:
$(date +%d%m%y%H%M%S)
But it doesn't get picked up. This is the config:
input {
  http_poller {
    #proxy => { host => "" }
    proxy => ""
    urls => {
      q1 => {
        method => post
        url => ""
        headers => { .... }
        body => '{
          "rsid": "....",
          "globalFilters": [
            {
              "type": "dateRange",
              "dateRange": "%{+ddMMyyHHmmss}"
            }
            ................
        }'
      }
    }
    request_timeout => 60
    # Supports "cron", "every", "at" and "in" schedules by rufus scheduler
    schedule => { cron => "* * * * * UTC" }
    codec => "json"
    metadata_target => "http_poller_metadata"
  }
}
output {
  elasticsearch {
    hosts => ["xxxx"]
    index => "xxxx"
  }
}
There is nothing like a variable declaration that we can do in an input; sprintf references such as %{+ddMMyyHHmmss} are evaluated against an event, and an input has no event to evaluate them against.
A workaround is to define environment variables holding the dates, e.g. on Windows (PowerShell script):
$env:startDate=(Get-Date).AddDays(-1).ToString('yyyy-MM-dd')
$env:endDate=(Get-Date).AddDays(0).ToString('yyyy-MM-dd')
Then we can reference these variables as ${startDate} in the url, as sketched below.
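The same ${...} environment-variable substitution works anywhere in the Logstash config, so, as an illustration only, the dateRange part of the body above might reference them like this (the exact dateRange value your API expects is an assumption here, and the elided parts stay elided):
body => '{
  ....
  "globalFilters": [
    # substituted from the environment when the pipeline is loaded
    { "type": "dateRange", "dateRange": "${startDate}" }
  ]
  ....
}'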
However, once Logstash is started the dates remain static, so we would have to restart Logstash every day for it to pick up the new date values.
Another alternative is to write a proxy webservice, probably in Java or another language, where the date variables can be declared in the class; it then invokes the actual webservice and returns the response back to Logstash.
This issue has been pending in Logstash since 2016... not sure why it cannot be addressed!

Sending data to ES index using Logstash?

ES 2.4.1
Logstash 2.4.0
I am sending data to Elasticsearch from a local file to create an index "pica". I used the conf file below:
input {
  file {
    path => "C:\Output\Receive.txt"
    start_position => "beginning"
    codec => json_lines
  }
}
output {
  elasticsearch {
    hosts => "http://localhost:9200/"
    index => "pica"
  }
  stdout {
    codec => rubydebug
  }
}
I couldn't see any output in either the Logstash prompt or in the Elasticsearch cluster.
When I look at the .sincedb file, it has the following content:
612384816-350504-4325376 0 0 3804
May I know what the problem is here?
Thanks
I guess you're missing the square brackets [] around the hosts value, since it is an array type as per the doc. Hence it should look like:
elasticsearch {
  hosts => ["localhost:9200"]
  index => "pica"
}
or:
hosts => ["127.0.0.1"] or hosts => ["localhost"]

Logstash http_poller only shows last log message in Kibana

I am using Logstash to get the log from a URL using http_poller. This works fine. The problem I have is that the received log does not get sent to Elasticsearch in the right way. I tried splitting the result into different events, but the only event that shows in Kibana is the last event from the log. Since I am pulling the log every 2 minutes, a lot of log information gets lost this way.
The input is like this:
input {
  http_poller {
    urls => {
      logger1 => {
        method => get
        url => "http://servername/logdirectory/thislog.log"
      }
    }
    keepalive => true
    automatic_retries => 0
    # Check the site every 2 minutes
    interval => 120
    # Wait no longer than 110 seconds for the request to complete
    request_timeout => 110
    # Store metadata about the request in this field
    metadata_target => http_poller_metadata
    type => 'log4j'
    codec => "json"
    # important tag settings
    tags => stackoverflow
  }
}
I then use a filter to add some fields and to split the logs
filter {
  if "stackoverflow" in [tags] {
    split {
      terminator => "\n"
    }
    mutate {
      add_field => {
        "Application" => "app-stackoverflow"
        "Environment" => "Acceptation"
      }
    }
  }
}
The output is then sent to the Kibana server using the following output conf:
output {
  redis {
    host => "kibanaserver.internal.com"
    data_type => "list"
    key => "logstash N"
  }
}
Any suggestions why not all the events are stored in Kibana?
