Grok pattern for custom response time in ns, us, ms or s

I need to parse a response time from a custom log format so it can go through a Logstash pipeline into Elasticsearch.
An example log entry is:
2018-11-19 23:40:00-0500 avg:30.5ms max:135ms min:6.61ms reqs:20 rsps:20 errs:0 maxcon:3 99th:135ms 95th:134ms 90th:111ms 75th:22.6ms 50th:15.6ms heap:36.7% load:1.43/0.75/0.60 cpu:26.3%
The avg, max and min values can be in ns, us, ms or s.
I have started from:
%{TIMESTAMP_ISO8601:timestamp} avg:%{NUMBER:avg}ms
Of course that won't work for ns etc., so I will need something like:
%{TIMESTAMP_ISO8601:timestamp} avg:%{NUMBER:avg}(ns|us|ms|s)
However, I will lose information unless I scale all values to a common unit, say ms: divide ns values by 1e6, divide us by 1e3, leave ms as-is, and multiply s by 1e3.
What is the best approach to solve that issue?

OK, so I finally found a solution.
First, a small change to the grok pattern:
%{TIMESTAMP_ISO8601:timestamp} avg:%{NUMBER:avg:float}(?<avgUnit>[unm]?s)
That gives us two fields in the event, 'avg' and 'avgUnit', which can be passed to a Ruby script executed by the ruby filter plugin.
The script reads as follows:
# filter runs for every event
# return the list of events to be passed forward
# returning an empty list is equivalent to event.cancel
def filter(event)
  # convert operates on the event
  convert(event, "maxUnit", "max")
  convert(event, "minUnit", "min")
  convert(event, "avgUnit", "avg")
  convert(event, "99thUnit", "99th")
  return [event]
end

def convert(event, unitField, valueField)
  if event.get(valueField).nil?
    event.tag("__#{valueField}_not_found")
    return [event]
  end
  if event.get(unitField).nil?
    event.tag("__#{unitField}_not_found")
    return [event]
  end
  unit = event.get(unitField)
  value = event.get(valueField)
  fieldName = "#{valueField}InMs"
  case unit
  when "ns"
    event.set(fieldName, value / 1.0e6)
  when "us"
    event.set(fieldName, value / 1.0e3)
  when "ms"
    event.set(fieldName, value)
  when "s"
    event.set(fieldName, value * 1.0e3)
  else
    event.tag("__not_supported_unit_#{unit}")
  end
  return [event]
end
The pipeline configuration must run the script after the grok match:
grok {
  match => {
    "message" => ["%{TIMESTAMP_ISO8601:timestamp} avg:%{NUMBER:avg:float}(?<avgUnit>[unm]?s)"]
  }
}
ruby {
  path => "script.rb"
}
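Note that this match only captures avg and avgUnit, while the script also converts max, min and 99th. A possible extension of the pattern for max and min is sketched below (an assumption mirroring the avg pair, not verified against the full log line; 99th would need an analogous capture):
grok {
  match => {
    "message" => ["%{TIMESTAMP_ISO8601:timestamp} avg:%{NUMBER:avg:float}(?<avgUnit>[unm]?s) max:%{NUMBER:max:float}(?<maxUnit>[unm]?s) min:%{NUMBER:min:float}(?<minUnit>[unm]?s)"]
  }
}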

Related

Max Aggregation with Hazelcast-jet

I want to do a simple max across an entire dataset. I started with the Kafka example at: https://github.com/hazelcast/hazelcast-jet-code-samples/blob/0.7-maintenance/kafka/src/main/java/avro/KafkaAvroSource.java
I just changed the pipeline to:
p.drawFrom(KafkaSources.<Integer, User>kafka(brokerProperties(), TOPIC))
.map(Map.Entry::getValue)
.rollingAggregate(minBy(comparingInt(user -> (Integer) user.get(2))))
.map(user -> (Integer) user.get(2))
.drainTo(Sinks.list("result"));
and the code that reads the result to:
IListJet<Integer> res = jet.getList("result");
SECONDS.sleep(10);
System.out.println(res.get(0));
SECONDS.sleep(15);
System.out.println(res.get(0));
cancel(job);
to get the largest age of people in the topic. It however doesn't return 20 and seems to return different values on different runs. Any idea why?
You seem to be using rollingAggregate, which produces a new output item every time it receives some input, but all you check is the first item it emitted. You must instead find the latest item it emitted. One way to achieve it is by pushing the result into an IMap sink, using the same key every time:
p.drawFrom(KafkaSources.<Integer, User>kafka(brokerProperties(), TOPIC))
.withoutTimestamps()
.map(Map.Entry::getValue)
.rollingAggregate(minBy(comparingInt(user -> (Integer) user.get(2))))
.map(user -> entry("user", (Integer) user.get(2)))
.drainTo(Sinks.map("result"));
You can fetch the latest result with
IMap<String, Integer> result = jet.getMap("result");
System.out.println(result.get("user"));

Logstash KV plugin working

I am trying to use logstash's KV plugin. I have the following log format:
time taken for transfer for all files in seconds=23 transfer start time= 201708030959 transfer end time = 201708030959
My .conf file has the following KV plugin:
filter {
kv {
value_split => "="
}
}
When I run logstash, it parses the complete log file line by line, except for the lines containing "=". I need seconds, start time and end time to be separated out as key-value pairs. Please suggest.
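One possible approach (a sketch, not from the original thread) is to use grok instead of kv for this line, because the keys contain spaces and the "=" sometimes has surrounding whitespace, which kv's default space-based field splitting handles poorly. The field names below are invented for illustration:
filter {
  grok {
    match => {
      "message" => ["time taken for transfer for all files in seconds=%{NUMBER:transfer_seconds:int} transfer start time\s*=\s*%{NUMBER:transfer_start} transfer end time\s*=\s*%{NUMBER:transfer_end}"]
    }
  }
}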

Match an approaching date in Lua

I'm looking for a little help on a Lua script. Essentially I'm looking to match an approaching date X number of minutes prior to today. In the example below I've used 9000 minutes.
alarm.get ()
message = "Certificate Expiry Warning - Do something"
SUPPKEY = "Certificate Expiry"
SUBSYS = "1.1"
SOURCE = "SERVERNAME"
--local pattern = "(%d-%m-%Y)"
local t = os.date('*t'); -- get current date and time
print(os.date("%d-%m-%Y")); --Prints todays date
t.min = t.min - 9000; -- subtract 9000 minutes
--print(os.date("%Y-%m-%d %H:%m:%S", os.time(t))); --Original Script
print(os.date("%d-%m-%Y", os.time(t))); --Prints alerting date
if string.match ~=t.min --Match string
--if string.match(a.message, pattern)
--then print (al.message)
then print ("We have a match")
--then nimbus.alarm (1, message , SUPPKEY , SUBSYS , SOURCE) --Sends alert
else print ("Everything is fine") --Postive, no alert
--else print (al.message)
end
The alarm.get grabs a line of text that looks like this:
DOMAIN\USERNAME,Web Server (WebServer),13/01/2017 09:13,13/01/2019,COMPANY_NAME,HOSTNAME_FQDN,SITE
So the line shown above is passed in as the a.message variable, and I'm looking to match the second date (13/01/2019) against today's date with 9000 minutes subtracted from it.
The commented out parts are just me testing different things.
I'm not sure if I understood the question well, but from my perspective it seems you are trying to do two things:
Retrieve current time minus 9000 minutes in format DD/MM/YYYY.
Compare this time to the one your program reads from the file, and do something when the two dates are equal.
Here goes my sample code:
-- Settings
local ALLOWED_AGE = 9000 -- In minutes
-- Input line (for testing only)
local inputstr = "DOMAIN\\USERNAME,Web Server (WebServer),13/01/2017 09:13,13/01/2019,COMPANY_NAME,HOSTNAME_FQDN,SITE"
-- Separate line into 7 variables by token ","
local path, server, time, date, company_name, hostname, site = string.match(inputstr, "([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+)")
-- Check, if the line is ok (not necessary, but should be here to handle possible errors)
-- Also note, some additional checks should be here (eg. regex to match DD/MM/YYYY format)
if date == nil then
print("Error reading line: "..inputstr)
end
-- Get current time minus 9000 minutes (in format DD/MM/YYYY)
local target_date = os.date("%d/%m/%Y", os.time() - ALLOWED_AGE * 60)
-- Printing what we got (for testing purposes)
print("Target date: "..target_date..", Input date: "..date)
-- Testing the match
if target_date == date then
print("Dates are matched!")
else
print("Dates are not matched!")
end
Although I'm not sure whether you should instead be checking whether one date is earlier/later than the other in your case.
Then the code above should be modified to something like this:
-- Extract day, month and year from date in format DD/MM/YYYY
local d, m, y = string.match(date, "([^/]+)/([^/]+)/([^/]+)")
-- Note I'm adding one day, so the certificate will actually expire the day after its "valid until" date.
local valid_until = os.time({year = y, month = m, day = d + 1})
local expire_time = os.time() - ALLOWED_AGE * 60 -- All certificates older than this should expire.
-- Printing what we got (for testing purposes)
print("Expire time: "..expire_time..", Cert valid until: "..valid_until)
-- Is expired?
if valid_until <= expire_time then
print("Oops! Certificate expired.")
else
print("Certificate date is valid.")
end

Include monotonically increasing value in logstash field?

I know there's no built-in "line count" functionality while processing files through logstash (for various understandable and documented reasons). But there should be a mechanism, within any given logstash instance, to have a monotonically increasing variable / count for every parsed line.
I don't want to go the metrics route since it's a continuous polling mechanism (every n seconds). Alternatives include pre-processing of log files, which, given my particular use case, is unacceptable.
Again, let me reiterate: I need the ability to generate/read a monotonically increasing variable that I can store and update from within a logstash filter.
Thoughts?
There's nothing built into logstash to do it.
You can build a filter to do it pretty easily.
Just drop something like this into lib/logstash/filters/seq.rb:
# encoding: utf-8
require "logstash/filters/base"
require "logstash/namespace"
require "set"

# This filter adds a sequence number to a log entry
#
# The config looks like this:
#
#   filter {
#     seq {
#       field => "seq"
#     }
#   }
#
# The `field` is the field you want added to the event.
class LogStash::Filters::Seq < LogStash::Filters::Base
  config_name "seq"
  milestone 1

  config :field, :validate => :string, :required => false, :default => "seq"

  public
  def register
    # Nothing
  end # def register

  public
  def initialize(config = {})
    super
    @threadsafe = false
    # This filter needs to keep state.
    @seq = 1
  end # def initialize

  public
  def filter(event)
    return unless filter?(event)
    event[@field] = @seq
    @seq = @seq + 1
    filter_matched(event)
  end # def filter
end # class LogStash::Filters::Seq
This will start at 1 every time Logstash is restarted, but for most situations this would be ok. If you need something that is persistent across restarts, you need to do a bit more work to persist it somewhere.
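A rough sketch of one way to persist it, staying with the plugin above (the state-file path is an assumption, and locking/write batching are ignored), would be to replace register and filter with something like:
public
def register
  # Assumed state-file location; must be writable by the logstash user.
  @seq_file = "/var/lib/logstash/seq.state"
  @seq = File.exist?(@seq_file) ? File.read(@seq_file).to_i : 1
end # def register

public
def filter(event)
  return unless filter?(event)
  event[@field] = @seq
  @seq = @seq + 1
  # Naive write-through persistence; batch or fsync if throughput matters.
  File.write(@seq_file, @seq.to_s)
  filter_matched(event)
end # def filter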
For anyone finding this in 2018+: logstash now has a ruby filter that makes this much simpler. Put the following in a file somewhere:
# encoding: utf-8
def register(params)
  @seq = 1
end

def filter(event)
  event.set("seq", @seq)
  @seq += 1
  return [event]
end
And then configure it like this in your logstash.conf (substitute in the filename you used):
ruby {
path => "/usr/local/lib/logstash/seq.rb"
}
It would be pretty easy to make the field name configurable from logstash.conf, but I'll leave that as an exercise for the reader.
I suspect this isn't thread-safe, so I'm running only a single logstash worker.
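As a rough sketch of that exercise (assuming the ruby filter's script_params option and register(params) hook; the param name "field" is made up), the field name could be passed in from logstash.conf like this:
# encoding: utf-8
# seq.rb - field name supplied from logstash.conf via script_params (default "seq")
def register(params)
  @field = params["field"] || "seq"
  @seq = 1
end

def filter(event)
  event.set(@field, @seq)
  @seq += 1
  return [event]
end
with a matching filter entry:
ruby {
  path => "/usr/local/lib/logstash/seq.rb"
  script_params => { "field" => "my_seq" }
}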
This is another way to solve the problem; it works for me. Thanks to the previous answer for the note about thread safety. I use the seq field to sort in descending order.
This is my configuration:
logstash.conf
filter {
ruby {
code => 'event.set("seq", Time.now.strftime("%N").to_i)'
}
}
logstash.yml
pipeline.batch.size: 200
pipeline.batch.delay: 60
pipeline.workers: 1
pipeline.output.workers: 1

Twitter Search API - Searching by Time

Is there any way to send a query param to Twitter, telling it to only return search results within a specified period of time? For example, give me the results for this "keyword" tweeted between 12 pm and 3 pm ET on July 24, 2011? If Twitter doesn't allow you to search by time -- and only by date -- then is there anything in the results that will allow you to see the exact time when the user made that tweet?
As far as I can tell, there is not a way to specify time (more specific than the date). However, after getting your list of tweets, you can remove those that don't fall within the specified time range by comparing each tweet's timestamp.
This is how I would do it in ruby with the twitter gem:
require 'twitter'
require 'time'
start_time = Time.now - 3*3600
end_time = Time.now
search = Twitter::Search.new.contains('test')
search.since_date(start_time.strftime("%Y-%m-%d"))
search.until_date(end_time.strftime("%Y-%m-%d"))
tweets = search.fetch
tweets.delete_if { |t| Time.parse(t.created_at) < start_time }
tweets.delete_if { |t| Time.parse(t.created_at) > end_time }
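If this filtering is needed in more than one place, it can be pulled into a small helper; this is only a sketch built on Time.parse and the created_at field already used above, continuing from that snippet (the helper name is made up):
require 'time'

# Keep only tweets whose created_at falls inside [start_time, end_time].
def within_window(tweets, start_time, end_time)
  tweets.select do |t|
    created = Time.parse(t.created_at)
    created >= start_time && created <= end_time
  end
end

tweets = within_window(search.fetch, start_time, end_time)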
