Elasticsearch Logstash Kibana and Grok How do I break apart the message? - logstash-grok

I created a filter to break apart our log files and am having the following issue. I'm not able to figure out how to save the parts of the "message" to their own field or tag or whatever you call it. I'm 3 days new to logstash and have had zero luck with finding someone here who knows it.
So for an example lets say this is your log line in a log file
2017-12-05 [user:edjm1971] msg:This is a message from the system.
And what you want to do is to get the value of the user and set that into some index mapping so you can search for all logs that were by that user. Also, you should see the information from the message in their own fields in Kibana.
My pipeline.conf file for logstash is like
grok {
match => {
"message" => "%{TIMESTAMP_ISO8601:timestamp} [sid:%{USERNAME:sid} msg:%{DATA:message}"
}
add_tag => [ "foo_tag", "some_user_value_from_sid_above" ]
}
Now when I run the logger to create logs data gets over to ES and I can see the data in KIBANA but I don't see foo_tag at all with the sid value.
How exactly do I use this to create the new tag that gets stored into ES so I can see the data I want from the message?
Note: using regex tools it all appears to parse the log formats fine and the log for logstash does not spit out errors when processing.
Also for the logstash mapping it is using some auto defined mapping as the path value is nil.
I'm not clear on how to create a mapping for this either.
Guidance is greatly appreciated.

Related

How to add hash the whole content of an event in Logstash for OpenSearch?

the problem is the following: I'm investigating how to add some anti-tampering protection to events stored in OpenSearch that are parsed and sent there by Logstash. Info is composed of application logs collected from several hosts. The idea is to add a hashed field that's linked to the original content so that any modification of the fields break the hash result and can be detected.
Currently, we have in place some grok filters that extract information from the received log lines and store it into different fields using several patterns. To make it more difficult for an attacker who modifies these logs to cover their tracks, I'm thinking of adding an extra field where the whole line is hashed and salted before splitting.
Initial part of my filter config is like this. It was used primarily with ELK, but our project will be switching to OpenSearch:
filter {
grok {
match => { "message" => "%{TIMESTAMP_ISO8601:mytimestamp} (\[)*%{LOGLEVEL:loglevel}(\])* %{JAVACLASS:javaclass}(.)*(\[/])* %{DATA:component} %{DATA:version} - %{GREEDYDATA:message}"}
overwrite => [ "message" ]
overwrite => [ "version" ]
break_on_match => false
keep_empty_captures => true
}
// do more stuff
}
OpenSearch has some info on Field masking, but this is not exactly what I am after.
If any of you could help me with a pointer or an idea on how to do this. I don't know whether the hash fields available in ELK are also available in OpenSearch, or whether the Logstash plugin that does the hashing of fields would be usable without licensing issues. But maybe there are other and better options that I am not aware. I was looking for info on how to call an external script to do this during the filter execution, but I don't even know whether that is possible (apparently not, at least I couldn't find anything).
Any ideas? Thank you!

how to add bytes, session and source parameter in kibana to visualise suricata logs?

I redirected all the logs(suricata logs here) to logstash using rsyslog. I used template for rsyslog as below:
template(name="json-template"
type="list") {
constant(value="{")
constant(value="\"#timestamp\":\"") property(name="timereported" dateFormat="rfc3339")
constant(value="\",\"#version\":\"1")
constant(value="\",\"message\":\"") property(name="msg" format="json")
constant(value="\",\"sysloghost\":\"") property(name="hostname")
constant(value="\",\"severity\":\"") property(name="syslogseverity-text")
constant(value="\",\"facility\":\"") property(name="syslogfacility-text")
constant(value="\",\"programname\":\"") property(name="programname")
constant(value="\",\"procid\":\"") property(name="procid")
constant(value="\"}\n")
}
for every incoming message, rsyslog will interpolate log properties into a JSON formatted message, and forward it to Logstash, listening on port 10514.
Reference link: https://devconnected.com/monitoring-linux-logs-with-kibana-and-rsyslog/
(I have also configured logstash as mention on the above reference link)
I am getting all the column in Kibana discover( as mentioned in json-template of rsyslog) but I also require bytes, session and source column in kibana which I am not getting here. I have attached the snapshot of the column I am getting on Kibana here
Available fields(or say column) on Kibana are:
#timestamp
t #version
t _type
t facility
t host
t message
t procid
t programname
t sysloghost
t _type
t _id
t _index
# _score
t severity
Please let me know how to add bytes, session and source in the available fields of Kibana. I require these parameters for further drill down in Kibana.
EDIT: I have added how my "/var/log/suricata/eve.json" looks like (which I need to visualize in Kibana. )
For bytes, I will use (bytes_toserver+bytes_toclient) which is an available inside flow.
Session I need to calculate.
Source_IP I will use as the source.
{"timestamp":"2020-05 04T14:16:55.000200+0530","flow_id":133378948976827,"event_type":"flow","src_ip":"0000:0000:0000:0000:0000:0000:0000:0000","dest_ip":"ff02:0000:0000:0000:0000:0001:ffe0:13f4","proto":"IPv6-ICMP","icmp_type":135,"icmp_code":0,"flow":{"pkts_toserver":1,"pkts_toclient":0,"bytes_toserver":87,"bytes_toclient":0,"start":"2020-05-04T14:16:23.184507+0530","end":"2020-05-04T14:16:23.184507+0530","age":0,"state":"new","reason":"timeout","alerted":false}}
Direct answer
Read the grok docs in detail.
Then head over to the grok debugger with some sample logs, to figure out expressions. (There's also a grok debugger built in to Kibana's devtools nowadays)
This list of grok patterns might come in handy, too.
A better way
Use Suricata's JSON log instead of the syslog format, and use Filebeat instead of rsyslog. Filebeat has a Suricata module out of the box.
Sidebar: Parsing JSON logs
In Logstash's filter config section:
filter {
json {
source => "message"
# you probably don't need the "message" field if it parses OK
#remove_field => "message"
}
}
[Edit: added JSON parsing]

Is it possible to pick only error entry from logfiles in logstash

I am using logstash to monitor my production server logs, but it throws all logs from info to errors, what I want is that it can only pick errors from log file and throw it on logstash kibana view.
After parsing your log using grok you can use logstash conditionals to check if loglevel (or whatever is your field name) equals to ERROR. If its true forward it to your output plugin,
output {
if [loglevel] == "ERROR"{ # Send ERROR logs only
elasticsearch {
...
}
}
}
If you are using filebeat to ship logs, you can use Processors, to send only logs that contains ERROR.
The contains condition checks if a value is part of a field. The field
can be a string or an array of strings. The condition accepts only a
string value.
For example, the following condition checks if an error is part of the
transaction status:
contains:
status: "Specific error"
Depends on your log format, you might be able to use one of the many supported conditions by filebeat processors,
Each condition receives a field to compare. You can specify multiple
fields under the same condition by using AND between the fields (for
example, field1 AND field2).
For each field, you can specify a simple field name or a nested map,
for example dns.question.name.
You can read more about Conditions here

conversion fields kibana and logstash

I try to convert a field "tmp_reponse" in integer in the file "conf" with logstash as follows :
mutate {
convert => {"TMP_REPONSE" => "integer"}
}
,but on Kibana it shows me that he is still string. I do not understand how I can make a convertion to use my fields "tmp_response" to use it like as a metric fields on kibana
thank you help me please and if there is anyone who can explain to me how I can master the metrics on Kibana and use fields as being of metrics fields
mutate{} will change the type of the field in logstash. If you added a stdout{} output stanza, you would see that it's an integer at that point.
How elasticsearch treats it is another problem entirely. Elasticsearch usually sets the type of a field based on the first input received, so if you sent documents in before you added the mutate to your logstash config, they would have been strings and the elasticsearch index will always consider that field to be a string.
The type may also have been defined in an elasticsearch template or mapping.
The good news is that your mutate will probably set the type when a new index is created. If you're using daily indexes (the default in logstash), you can just wait a day. Or you can delete the index (losing any data so far) and let a new one be created. Or you could rebuild the index.
Good luck.

Logstash grok filter fails to match for some messages

I'm trying to parse my application's logs with logstash (version 1.4.2) and grok, but for some reason I don't understand, grok fails to parse some of the lines that should match the specified filter. I've searched Google and Stackoverflow, but most of the problems other people had seemed to be related to multiline log messages (which isn't the case for me), and I couldn't find anything that solved my problem.
My filter looks like this:
filter {
grok {
match => { "message" => "%{SYSLOGBASE} -(?<script>\w*)-: Adding item with ID %{WORD:item_id} to database."}
add_tag => ["insert_item"]
}
}
Here's the message field of a line that is parsed correctly:
May 11 16:47:55 myhost rqworker: -script-: Adding item with ID 982663745238221172_227691295 to database.
And here's the message field of a line that isn't:
May 11 16:47:55 myhost rqworker: -script-: Adding item with ID 982663772746479443_1639853260 to database.
The only thing that differs between these messages is the item's ID, and Grok Debugger parses them both correctly.
I've checked the logstash log file, but didn't see any relevant error messages.
I'm just starting out with logstash and have no idea what is happening here; any help would be much appreciated!

Resources