Logstash - add only first time value

Here's what I want; it's a bit the opposite of incremental data.
Some of the data are logs with a specific token, and I want to keep (or to show in Elasticsearch) only the first submitted data, the oldest information for each token.
I want to ignore any new log with the same token.
How can I do that? Should it be done in Logstash or Elasticsearch?
Thanks
Update 2016-05-31
I think we can look at this from a different perspective, but globally what I want is the table shown in the picture, without the red lines: I want those lines to be ignored by Logstash, or not displayed in ES queries.
I know it could be done if I were able to add some flag to the lines I want to delete, but that's not possible; the only thing that tells us they can be removed is that we already have a key first-AAA that has been logged before.
At logging time, we don't have this information.

You can achieve this using the elasticsearch filter. The filter checks in ES whether the record already exists, and if it does, we ask Logstash to simply drop the line.
Note that I'm making the assumption that the Id field (AAA) is used as the document _id and is also present in the document as the Id field. Feel free to change whatever needs to change, but this will work.
input {
  ...
}
filter {
  elasticsearch {
    hosts => ["localhost:9200"]
    query => "_type:your_type AND _id:%{[Id]}"
    fields => {"Id" => "found"}
  }
  if [found] {
    drop {}
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    ...
  }
}
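For the assumption above to hold (the Id field being used as the document _id), the elasticsearch output also needs to set the document id explicitly. A minimal sketch, assuming the field is really called Id:

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    document_id => "%{Id}"   # reuse the event's Id field as the Elasticsearch _id
  }
}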

Related

How to add a hash of the whole content of an event in Logstash for OpenSearch?

The problem is the following: I'm investigating how to add some anti-tampering protection to events stored in OpenSearch that are parsed and sent there by Logstash. The info consists of application logs collected from several hosts. The idea is to add a hashed field that's linked to the original content so that any modification of the fields breaks the hash and can be detected.
Currently, we have some grok filters in place that extract information from the received log lines and store it in different fields using several patterns. To make it more difficult for an attacker who modifies these logs to cover their tracks, I'm thinking of adding an extra field where the whole line is hashed and salted before it is split.
The initial part of my filter config looks like this. It was used primarily with ELK, but our project will be switching to OpenSearch:
filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:mytimestamp} (\[)*%{LOGLEVEL:loglevel}(\])* %{JAVACLASS:javaclass}(.)*(\[/])* %{DATA:component} %{DATA:version} - %{GREEDYDATA:message}"}
    overwrite => [ "message", "version" ]
    break_on_match => false
    keep_empty_captures => true
  }
  # do more stuff
}
OpenSearch has some info on Field masking, but this is not exactly what I am after.
I'd appreciate it if any of you could help me with a pointer or an idea on how to do this. I don't know whether the hash fields available in ELK are also available in OpenSearch, or whether the Logstash plugin that does the hashing of fields would be usable without licensing issues. But maybe there are other and better options that I am not aware of. I was looking for info on how to call an external script to do this during the filter execution, but I don't even know whether that is possible (apparently not, at least I couldn't find anything).
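What I had in mind is something along these lines, a rough sketch assuming the standard fingerprint filter can be used and placed before the grok filter so the hash covers the line before it is split (the key and target field name are placeholders):

filter {
  fingerprint {
    source => ["message"]     # hash the whole original line
    target => "line_hash"     # placeholder name for the extra field
    method => "SHA256"
    key => "my_secret_salt"   # with a key set, the hash becomes a keyed HMAC
  }
}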
Any ideas? Thank you!

Check if variable exists in Logstash

I'm trying to write a new logstash filter for events coming from Wazuh. Generally all events set a "%{[rule][description]}" variable and I write this into my alert field. I'm finding that one event is not populating this variable so when I write it to my alert field, I just get %{[rule][description]} instead of the contents.
Does anyone know how to check if a variable exists in a logstash filter? It's fairly easy for fields but not for a variable from what I can gather so far. I'd like to be able to say if the variable doesn't exist, then set it to a string of my choosing.
It's pretty strange to get a rule with the rule.description field empty; could you share the rule's id with me?
Anyway, the filter you are looking for is the following one:
filter {
  if [rule][description] == '' {
    mutate {
      add_field => [ "rule.description", "VALUE" ]
    }
  }
}
where you can replace VALUE with the string you wish.
Hope it helps.
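If the field is actually missing rather than set to an empty string, a negated field reference is closer to what the question asks for. A minimal sketch, assuming the nested [rule][description] field and an arbitrary default value:

filter {
  if ![rule][description] {
    mutate {
      # [rule][description] is the nested field syntax; "N/A" is just an example default
      add_field => { "[rule][description]" => "N/A" }
    }
  }
}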

Elasticsearch, Logstash, Kibana and Grok: How do I break apart the message?

I created a filter to break apart our log files and am having the following issue. I'm not able to figure out how to save the parts of the "message" to their own field or tag or whatever you call it. I'm 3 days new to logstash and have had zero luck with finding someone here who knows it.
So, for example, let's say this is your log line in a log file:
2017-12-05 [user:edjm1971] msg:This is a message from the system.
What you want to do is get the value of the user and set that into some index mapping so you can search for all logs that were made by that user. Also, you should see the information from the message in its own fields in Kibana.
My pipeline.conf file for Logstash looks like this:
grok {
  match => {
    "message" => "%{TIMESTAMP_ISO8601:timestamp} [sid:%{USERNAME:sid} msg:%{DATA:message}"
  }
  add_tag => [ "foo_tag", "some_user_value_from_sid_above" ]
}
Now when I run the logger to create logs, data gets over to ES and I can see it in Kibana, but I don't see foo_tag with the sid value at all.
How exactly do I use this to create the new tag that gets stored into ES so I can see the data I want from the message?
Note: using regex tools, it all appears to parse the log formats fine, and the Logstash log does not spit out errors when processing.
Also, for the Logstash mapping it is using some auto-defined mapping, as the path value is nil.
I'm not clear on how to create a mapping for this either.
Guidance is greatly appreciated.
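For what it's worth, a hedged sketch of a grok filter that does match the sample line above; the escaped brackets, the custom date capture, and the %{sid} sprintf reference in add_tag are the pieces the snippet in the question appears to be missing (names are illustrative):

filter {
  grok {
    # custom capture for the date-only prefix, then the bracketed user and the message text
    match => {
      "message" => "(?<logdate>%{YEAR}-%{MONTHNUM}-%{MONTHDAY}) \[user:%{USERNAME:sid}\] msg:%{GREEDYDATA:msg_text}"
    }
    # add_tag supports %{field} references, so the second tag is set to the captured sid value
    add_tag => [ "foo_tag", "%{sid}" ]
  }
}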

Field conversion in Kibana and Logstash

I'm trying to convert a field "tmp_reponse" to integer in the Logstash "conf" file as follows:
mutate {
  convert => {"TMP_REPONSE" => "integer"}
}
But in Kibana it shows me that it is still a string. I do not understand how to make the conversion so that I can use my field "tmp_response" as a metric field in Kibana.
Thank you for your help, and if there is anyone who can explain to me how to master metrics in Kibana and use fields as metric fields, please do.
mutate{} will change the type of the field in logstash. If you added a stdout{} output stanza, you would see that it's an integer at that point.
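For example, a minimal sketch of such an output section (the hosts value is illustrative); the rubydebug codec prints each event as Logstash sees it, so the converted field shows up as a number rather than a quoted string:

output {
  stdout { codec => rubydebug }                # debug output to the console
  elasticsearch { hosts => ["localhost:9200"] }
}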
How elasticsearch treats it is another problem entirely. Elasticsearch usually sets the type of a field based on the first input received, so if you sent documents in before you added the mutate to your logstash config, they would have been strings and the elasticsearch index will always consider that field to be a string.
The type may also have been defined in an elasticsearch template or mapping.
The good news is that your mutate will probably set the type when a new index is created. If you're using daily indexes (the default in logstash), you can just wait a day. Or you can delete the index (losing any data so far) and let a new one be created. Or you could rebuild the index.
Good luck.

Logstash: Dynamically assign template

I have read that it is possible to assign dynamic names to the indexes like this:
elasticsearch {
  cluster => "logstash"
  index => "logstash-%{clientid}-%{+YYYY.MM.dd}"
}
What I am wondering is if it is possible to assign the template dynamically as well:
elasticsearch {
  cluster => "logstash"
  template => "/etc/logstash/conf.d/%{clientid}-template.json"
}
Also where does the variable %{clientid} come from?
Thanks!
After some testing and feedback from other users (thanks, Ben Lim), it seems this is not possible so far.
The closest thing would be to do something like this:
if [type] == "redis-input" {
  elasticsearch {
    cluster => "logstash"
    index => "%{type}-logstash-%{+YYYY.MM.dd}"
    template => "/etc/logstash/conf.d/elasticsearch-template.json"
    template_name => "redis"
  }
} else if [type] == "syslog" {
  elasticsearch {
    cluster => "logstash"
    index => "%{type}-logstash-%{+YYYY.MM.dd}"
    template => "/etc/logstash/conf.d/syslog-template.json"
    template_name => "syslog"
  }
}
Full disclosure: I am a Logstash developer at Elastic
You cannot dynamically assign a template because templates are uploaded only once, at Logstash initialization. Variable substitution only happens as events flow through the pipeline, and since there is no traffic flowing during initialization, there is nothing that can "fill in the blank" for %{clientid}.
It is also important to remember that Elasticsearch index templates are only used when a new index is created, which is why templates are not uploaded every time a document reaches the Elasticsearch output block in Logstash. Can you imagine how much slower it would be if Logstash had to do that? If you intend to have multiple templates, they need to be uploaded to Elasticsearch before any data gets sent there. You can do this with a script of your own making, using curl and Elasticsearch API calls. This also permits you to update templates without having to restart Logstash. You could run the script any time before index rollover, and when the new indices get created, they'll have the new template settings.
Logstash can send data to a dynamically configured index name, just as you have above. If there is no template present, Elasticsearch will create a best-guess mapping rather than the one you wanted. Templates can and ought to be completely independent of Logstash; the template upload functionality was added to Logstash for an improved out-of-the-box experience for brand new users. The default template is less than ideal for advanced use cases, and Logstash is not a good tool for template management if you have more than one index template.
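To make the point about %{clientid} concrete: it is a sprintf reference to a field that must already exist on each event when it reaches the output, for example populated by an earlier filter. A minimal sketch, where the source of clientid is entirely hypothetical:

filter {
  mutate {
    # hypothetical: copy an existing field into clientid; in practice this could
    # come from grok, a lookup, or metadata added by the shipper
    add_field => { "clientid" => "%{customer}" }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash-%{clientid}-%{+YYYY.MM.dd}"
  }
}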
