Remove complete event and replace with string in logstash - logstash-grok

I'm working on a way to completely replace the event with a string in a Logstash filter.
Input:
{
  "a": "b",
  "c": "d"
}
Desired output: "a:b-c:d"
I tried using a ruby filter. I'm able to form the pattern, but how can I replace the original JSON event with the output string? I want to completely remove the existing event.
Any help is much appreciated.
Thanks in advance.
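For what it's worth, a minimal, untested sketch of one way to do this with a ruby filter, assuming a recent Logstash event API (event.to_hash / event.set) and the prune filter plugin; the joining format and the whitelist are illustrative:
filter {
  ruby {
    code => '
      # build "key:value" pairs from every non-metadata field
      pairs = event.to_hash
        .reject { |k, _| k.start_with?("@") }
        .map { |k, v| "#{k}:#{v}" }
      event.set("message", pairs.join("-"))
    '
  }
  # drop everything except the freshly built message field
  prune {
    whitelist_names => ["^message$", "^@timestamp$"]
  }
}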

Related

Logstash Grok regex parsing

I am trying to do a parsing on a plaintext message using Grok; my goal is to explode the plaintext to a JSON log.
The message has a quite rigid format, as follows:
<timestamp> <loglevel> <greedydata> field1=value1, field2=value2, .... fieldN=valueN
Where the number of fields is not fixed.
Is it possible to capture every field=value pair using a named capturing group, so that each field name becomes the key in the output message?
Thanks
TL;DR - use dissect instead of grok
You want something like:
{
"timestamp": <timestamp>,
"loglevel": <loglevel>,
"field1": value1,
"field2": value2,
....
"fieldN": valueN
}
Where the keys (field1, fieldN etc) are dynamic.
You cannot use grok to do this. Even using a pattern like this (then using array position indices) won't work:
( field[0-9]+=%{DATA:value})+$
You need to handle this a different way. Your options, with a sketch after the list, are:
handle this before it hits logstash
use a ruby filter
use the dissect filter
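As a rough illustration, one way to combine the dissect filter with the kv filter (kv is not mentioned above, but it is the usual tool for dynamic field=value pairs); the field names and separators here are assumptions based on the sample line, and if the timestamp itself contains a space the dissect mapping needs an extra %{} segment:
filter {
  # peel off the fixed prefix; "rest" keeps the free text plus the pairs
  dissect {
    mapping => { "message" => "%{timestamp} %{loglevel} %{rest}" }
  }
  # turn field1=value1, field2=value2, ... into dynamic keys
  kv {
    source      => "rest"
    field_split => ","
    value_split => "="
    trim_key    => " "
    trim_value  => " "
  }
}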

Parsing formatted strings in Go

The Problem
I have slice of string values wherein each value is formatted based on a template. In my particular case, I am trying to parse Markdown URLs as shown below:
- [What did I just commit?](#what-did-i-just-commit)
- [I wrote the wrong thing in a commit message](#i-wrote-the-wrong-thing-in-a-commit-message)
- [I committed with the wrong name and email configured](#i-committed-with-the-wrong-name-and-email-configured)
- [I want to remove a file from the previous commit](#i-want-to-remove-a-file-from-the-previous-commit)
- [I want to delete or remove my last commit](#i-want-to-delete-or-remove-my-last-commit)
- [Delete/remove arbitrary commit](#deleteremove-arbitrary-commit)
- [I tried to push my amended commit to a remote, but I got an error message](#i-tried-to-push-my-amended-commit-to-a-remote-but-i-got-an-error-message)
- [I accidentally did a hard reset, and I want my changes back](#i-accidentally-did-a-hard-reset-and-i-want-my-changes-back)
What do I want to do?
I am looking for ways to parse this into a value of type:
type Entity struct {
    Statement string
    URL       string
}
What have I tried?
As you can see, all the items follow the pattern: - [{{ .Statement }}]({{ .URL }}). I tried using the fmt.Sscanf function to scan each string as:
var statement, url string
fmt.Sscanf(s, "[%s](%s)", &statement, &url)
This results in:
statement = "I"
url = ""
The issue is that the scanner only stores space-separated values, but I do not understand why the url variable is not getting populated even under that rule.
How can I get the Markdown values as mentioned above?
EDIT: As suggested by Marc, I will add a couple of clarification points:
This is a general-purpose question about parsing strings based on a format. In my particular case a Markdown parser might help me, but my intention is to learn how to handle such cases in general, where a library might not exist.
I have read the official documentation before posting here.
Note: The following solution only works for "simple", non-escaped input markdown links. If this suits your needs, go ahead and use it. For full markdown-compatibility you should use a proper markdown parser such as gopkg.in/russross/blackfriday.v2.
You could use regexp to get the link text and the URL out of a markdown link.
So the general input text is in the form of:
[some text](somelink)
A regular expression that models this:
\[([^\]]+)\]\(([^)]+)\)
Where:
\[ is the literal [
([^\]]+) is for the "some text", it's everything except the closing square brackets
\] is the literal ]
\( is the literal (
([^)]+) is for the "somelink", it's everything except the closing brackets
\) is the literal )
Example:
r := regexp.MustCompile(`\[([^\]]+)\]\(([^)]+)\)`)
inputs := []string{
    "[Some text](#some/link)",
    "[What did I just commit?](#what-did-i-just-commit)",
    "invalid",
}
for _, input := range inputs {
    fmt.Println("Parsing:", input)
    allSubmatches := r.FindAllStringSubmatch(input, -1)
    if len(allSubmatches) == 0 {
        fmt.Println(" No match!")
    } else {
        parts := allSubmatches[0]
        fmt.Println(" Text:", parts[1])
        fmt.Println(" URL: ", parts[2])
    }
}
Output (try it on the Go Playground):
Parsing: [Some text](#some/link)
Text: Some text
URL: #some/link
Parsing: [What did I just commit?](#what-did-i-just-commit)
Text: What did I just commit?
URL: #what-did-i-just-commit
Parsing: invalid
No match!
You could create a simple lexer in pure-Go code for this use case. There's a great talk by Rob Pike from years ago that goes into the design of text/template which would be applicable. The implementation chains together a series of state functions into an overall state machine, and delivers the tokens out through a channel (via Goroutine) for later processing.
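For completeness, here is a minimal, untested sketch of that state-function pattern, cut down to just the two tokens needed here; all the names (lexText, lexURL, token, and so on) are made up for illustration and are not taken from text/template or any existing package:
package main

import (
    "fmt"
    "strings"
)

// token kinds are just strings here to keep the sketch short
type token struct {
    kind  string // "text" or "url"
    value string
}

type lexer struct {
    input  string
    pos    int
    tokens chan token
}

// a state function consumes input, may emit tokens, and returns the next state
type stateFn func(*lexer) stateFn

// lexText scans for the next "[...]" and emits its contents as a text token
func lexText(l *lexer) stateFn {
    start := strings.Index(l.input[l.pos:], "[")
    if start < 0 {
        return nil // no more links
    }
    l.pos += start + 1
    end := strings.Index(l.input[l.pos:], "]")
    if end < 0 {
        return nil
    }
    l.tokens <- token{"text", l.input[l.pos : l.pos+end]}
    l.pos += end + 1
    return lexURL
}

// lexURL expects "(...)" right after the link text and emits a url token
func lexURL(l *lexer) stateFn {
    if l.pos >= len(l.input) || l.input[l.pos] != '(' {
        return lexText
    }
    l.pos++
    end := strings.Index(l.input[l.pos:], ")")
    if end < 0 {
        return nil
    }
    l.tokens <- token{"url", l.input[l.pos : l.pos+end]}
    l.pos += end + 1
    return lexText
}

// lex runs the state machine in a goroutine and streams tokens on a channel
func lex(input string) chan token {
    l := &lexer{input: input, tokens: make(chan token)}
    go func() {
        defer close(l.tokens)
        for state := stateFn(lexText); state != nil; state = state(l) {
        }
    }()
    return l.tokens
}

func main() {
    for tok := range lex("- [What did I just commit?](#what-did-i-just-commit)") {
        fmt.Printf("%s: %s\n", tok.kind, tok.value)
    }
}
Each state function consumes part of the input, emits a token on the channel, and returns the next state; returning nil ends the scan.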

formatting text output in terminals

I'm currently writing a command line tool for myself that needs to print some information on the terminal, and I'm a little annoyed by the whole formatting. Here is my example:
import logging

formatter = logging.Formatter(fmt='%(message)s')
console_logger = logging.getLogger("console_logger")
console_logger.setLevel(logging.DEBUG)
console_logger_handler = logging.StreamHandler()
console_logger_handler.setFormatter(formatter)
console_logger.addHandler(console_logger_handler)
console_logger.propagate = False
Here goes some further code, and then I have the printing function:
for element in open_orders:
    console_logger.info("Type: {}, Rate: {}, amount: {}, state: {}, pair: {}/{}, creation: {}, id: {}".format(
        element.type,
        element.rate,
        element.amount,
        element.state,
        element.currency_pair.get_base_currency().upper(),
        element.currency_pair.get_quote_currency().upper(),
        creation_time,
        element.order_id))
I would rather like to have this as a column where the output is aligned at the colon. After each element, a line of underscores or minuses would be nice as well; this should respect the terminal width. I know this can be hardcoded in some manner, but isn't there a better way? Some kind of templating engine that can handle multiline output?
EDIT:
So here is an example:
Type     : buy
Rate     : 1234
amount   : 1
state    : active
pair     : usd/eur
creation : 2017.12.12
id       : 123456
I know this can be printed line by line with format, but I need to determine the length of the longest string on my own, and I was wondering if there isn't a framework or something more elegant doing this for me.
Use format, adapted to your data:
for element in open_orders:
    console_logger.info("Type: {:25s}, Rate: {:25s}, amount: {:07.2f}, state: {:25s}, pair: {:25s}/{:25s}, creation: {:25s}, id: {:25s}".format(
        element.type,
        element.rate,
        element.amount,
        element.state,
        element.currency_pair.get_base_currency().upper(),
        element.currency_pair.get_quote_currency().upper(),
        creation_time,
        element.order_id))
You can also visit this site : https://pyformat.info/
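If you want the colon-aligned, multi-line layout from the question without hardcoding widths, a small sketch along these lines (standard library only; the labels and values come from the example above, everything else, including the print_order helper, is an assumption) computes the width from the longest label and draws a separator matching the terminal width:
import shutil

def print_order(order_fields):
    """order_fields: list of (label, value) tuples, e.g. [("Type", "buy"), ...]"""
    width = max(len(label) for label, _ in order_fields)
    for label, value in order_fields:
        # pad every label to the longest one so the colons line up
        print("{:<{w}} : {}".format(label, value, w=width))
    # separator line that respects the current terminal width
    print("-" * shutil.get_terminal_size().columns)

print_order([
    ("Type", "buy"),
    ("Rate", 1234),
    ("amount", 1),
    ("state", "active"),
    ("pair", "usd/eur"),
    ("creation", "2017.12.12"),
    ("id", 123456),
])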
In addition, you could try to use Colorama.
You have to install it, typically from PyPI.
It allows you to handle cursor positioning, so you can control the position on the screen (terminal) where you want to print data, using "coordinates". You can also apply colors to the text, which can give you a cleaner and prettier look if you want to.
So what I finally found, which helps a lot at least in the case of lists and their formatting, is this:
terminaltable
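Assuming this refers to the terminaltables package on PyPI, a minimal usage sketch with the data from the example above (pip install terminaltables first):
from terminaltables import AsciiTable

table_data = [
    ["Type", "Rate", "amount", "state", "pair", "creation", "id"],
    ["buy", "1234", "1", "active", "usd/eur", "2017.12.12", "123456"],
]
table = AsciiTable(table_data)
print(table.table)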

grok filter for logstash

My log file has lines of the form:
10/13 14:05:18.192 [modulename]: [pid]: (debug level string): message string XYZ:<xyz value>
where
modulename is a string
pid is an integer number
debug level string is a string like "debug" or "info" or "error"
message string is a string
xyz value is an integer number
example:
10/13 14:05:18.192 [MyModule]: [12345]: (debug): This is my message. XYZ: 987
I searched around and tried a few things, but am getting _grokparsefailure. Can someone help show me what filter I can use in logstash to parse these logs?
First of all, %{GREEDYDATA} matches up to the end of the logging event, so all the text that comes after your dbg_lvl field will be assigned to it.
Try the following pattern instead. The problem with your filter was that it was not able to parse anything after msg. Hope this helps.
(?<date>\d\d/\d\d) %{TIME:time} \[%{WORD:module}\]: \[%{WORD:pid}\]: \(%{WORD:log_level}\): %{CISCO_REASON}. %{WORD}: %{BASE10NUM:xyz_number}
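For reference, a minimal sketch of how that pattern might be dropped into a Logstash pipeline; the filter wrapper and the "message" field name are assumptions, while the pattern itself is the one from the answer above:
filter {
  grok {
    match => {
      "message" => "(?<date>\d\d/\d\d) %{TIME:time} \[%{WORD:module}\]: \[%{WORD:pid}\]: \(%{WORD:log_level}\): %{CISCO_REASON}. %{WORD}: %{BASE10NUM:xyz_number}"
    }
  }
}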

Logstash: How to save an entry from earlier in a log for use across multiple lines later in the log?

So the format of my logs looks something like this:
02:00:30> First line of log for date of 2014-08-13
...
04:03:30> Every other line of log
My question is: how can I save the date from the first line to create the timestamp for the other lines in the files?
Is there a way to set some kind of "global" field that I can reuse for other lines?
I'm looking at historical logs so the current time isn't much use.
I posted a memorize filter that you could use to do that. It was posted here.
You'd use it like this:
filter {
  if [message] =~ /date of/ {
    grok {
      match => [ "message", "date of (?<date>\d\d\d\d-\d\d-\d\d)" ]
    }
  } else {
    # parse your log with grok or some other method that doesn't capture date
  }
  memorize {
    field => date
  }
}
So on the first line, because you extract a date, it'll memorize it... since it's not on the remaining lines, it'll add the memorized date to the events.
