I have written a Logstash config file to upload a CSV. The CSV holds information for multiple applicants per row, and I need the applicants to end up in the Kibana index as an array of dictionaries, instead of a dictionary of dictionaries keyed by index.
filter {
  csv {
    separator => ","
    skip_header => true
    columns => ["LoanID","Applicant_Income1","Occupation1","Time_At_Work1","Date_Of_Join1","Gender","LoanAmount","Marital_Status","Dependents","Education","Self_Employed","Applicant_Income2","Occupation2","Time_At_Work2","Date_Of_Join2","Applicant_Income3","Occupation3","Time_At_Work3","Date_Of_Join3"]
  }
  mutate {
    convert => {
      "Applicant_Income1" => "float"
      "Time_At_Work1" => "float"
      "LoanAmount" => "float"
      "Applicant_Income2" => "float"
      "Time_At_Work2" => "float"
      "Applicant_Income3" => "float"
      "Time_At_Work3" => "float"
    }
  }
  mutate {
    rename => {
      "Applicant_Income1" => "[Applicant][0][Applicant_Income]"
      "Occupation1" => "[Applicant][0][Occupation]"
      "Time_At_Work1" => "[Applicant][0][Time_At_Work]"
      "Date_Of_Join1" => "[Applicant][0][Date_Of_Join]"
      "Applicant_Income2" => "[Applicant][1][Applicant_Income]"
      "Occupation2" => "[Applicant][1][Occupation]"
      "Time_At_Work2" => "[Applicant][1][Time_At_Work]"
      "Date_Of_Join2" => "[Applicant][1][Date_Of_Join]"
      "Applicant_Income3" => "[Applicant][2][Applicant_Income]"
      "Occupation3" => "[Applicant][2][Occupation]"
      "Time_At_Work3" => "[Applicant][2][Time_At_Work]"
      "Date_Of_Join3" => "[Applicant][2][Date_Of_Join]"
    }
  }
  date {
    match => [ "Date_Of_Join1", "yyyy-MM-dd'T'HH:mm:ss.SSZZ" ]
  }
  date {
    match => [ "Date_Of_Join2", "yyyy-MM-dd'T'HH:mm:ss.SSZZ" ]
  }
  date {
    match => [ "Date_Of_Join3", "yyyy-MM-dd'T'HH:mm:ss.SSZZ" ]
  }
}
I got the Applicant field as a dictionary of dictionaries keyed by 0, 1, 2, but I need it to be an array of dictionaries. I tried add_field, but it is not working:
mutate {
  add_field => {
    "[Applicant][Applicant_Income1]" => "Applicant_Income1",
    "[Applicant][Occupation1]" => "Occupation1",
    "[Applicant][Time_At_Work1]" => "Time_At_Work1",
    "[Applicant][Date_Of_Join1]" => "Date_Of_Join1"
  }
}
The square brackets in Logstash field references do not behave like array indices as in other programming languages, e.g. Java. [Applicant][0][Applicant_Income] is not the right syntax for setting the Applicant_Income field of the first element (zero-based index) of an Applicant array. Instead, it creates sub-elements named 0, 1, 2 underneath the Applicant element.
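To illustrate the difference (field values abbreviated; this is just a sketch of the two shapes, not actual output), the rename above produces a nested object whose keys are the strings "0", "1", "2":

"Applicant" => {
    "0" => { "Applicant_Income" => ..., "Occupation" => ..., ... },
    "1" => { ... },
    "2" => { ... }
}

whereas what you want is a true array:

"Applicant" => [
    { "Applicant_Income" => ..., "Occupation" => ..., ... },
    { ... },
    { ... }
]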
To create an array of objects, you should use the ruby filter plugin (https://www.elastic.co/guide/en/logstash/current/plugins-filters-ruby.html). Since you can execute arbitrary ruby code with that filter, it gives you more control/freedom:
filter {
  csv {
    separator => ","
    skip_header => true
    columns => ["LoanID","Applicant_Income1","Occupation1","Time_At_Work1","Date_Of_Join1","Gender","LoanAmount","Marital_Status","Dependents","Education","Self_Employed","Applicant_Income2","Occupation2","Time_At_Work2","Date_Of_Join2","Applicant_Income3","Occupation3","Time_At_Work3","Date_Of_Join3"]
  }
  mutate {
    convert => {
      "Applicant_Income1" => "float"
      "Time_At_Work1" => "float"
      "LoanAmount" => "float"
      "Applicant_Income2" => "float"
      "Time_At_Work2" => "float"
      "Applicant_Income3" => "float"
      "Time_At_Work3" => "float"
    }
  }
  ruby {
    code => '
      # Build the Applicant array from the flat per-applicant columns.
      event.set("Applicant",
        [
          {
            "Applicant_Income" => event.get("Applicant_Income1"),
            "Occupation" => event.get("Occupation1"),
            "Time_At_Work" => event.get("Time_At_Work1"),
            "Date_Of_Join" => event.get("Date_Of_Join1")
          },
          {
            "Applicant_Income" => event.get("Applicant_Income2"),
            "Occupation" => event.get("Occupation2"),
            "Time_At_Work" => event.get("Time_At_Work2"),
            "Date_Of_Join" => event.get("Date_Of_Join2")
          },
          {
            "Applicant_Income" => event.get("Applicant_Income3"),
            "Occupation" => event.get("Occupation3"),
            "Time_At_Work" => event.get("Time_At_Work3"),
            "Date_Of_Join" => event.get("Date_Of_Join3")
          }
        ]
      )
    '
  }
  date {
    match => [ "Date_Of_Join1", "yyyy-MM-dd'T'HH:mm:ss.SSZZ" ]
  }
  date {
    match => [ "Date_Of_Join2", "yyyy-MM-dd'T'HH:mm:ss.SSZZ" ]
  }
  date {
    match => [ "Date_Of_Join3", "yyyy-MM-dd'T'HH:mm:ss.SSZZ" ]
  }
  mutate {
    remove_field => [
      "Applicant_Income1",
      "Occupation1",
      "Time_At_Work1",
      "Date_Of_Join1",
      "Applicant_Income2",
      "Occupation2",
      "Time_At_Work2",
      "Date_Of_Join2",
      "Applicant_Income3",
      "Occupation3",
      "Time_At_Work3",
      "Date_Of_Join3"
    ]
  }
}
With event.set you add a field to the event: the first argument is the field name, the second its value. In this case, you add the field "Applicant" to the document with an array of objects as its value.
event.get is used to read the value of a field in the document. You retrieve the value by passing the field name to the method.
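As a small standalone illustration of the two methods (this is not part of the solution above, and the Total_Income field name is made up for this example), you could derive a new field from existing ones:

ruby {
  code => '
    # Read the three income fields and store their sum in a new field.
    # to_f turns a missing (nil) value into 0.0 so the sum still works.
    total = event.get("Applicant_Income1").to_f +
            event.get("Applicant_Income2").to_f +
            event.get("Applicant_Income3").to_f
    event.set("Total_Income", total)
  '
}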
Please refer to the guide at https://www.elastic.co/guide/en/logstash/current/event-api.html for more insight into the event API.
I hope this helps.
I am using logstash 6.2.4 with the following config:
input {
  stdin { }
}
filter {
  date {
    match => [ "message", "HH:mm:ss" ]
  }
}
output {
  stdout { }
}
With the following input:
10:15:20
I get this output:
{
       "message" => "10:15:20",
      "@version" => "1",
          "host" => "DESKTOP-65E12L2",
    "@timestamp" => 2019-01-01T09:15:20.000Z
}
I have just the time information, but I would like it parsed as the current date.
Note that the current date is 1 March 2019, so I guess that 2019-01-01 is some sort of default?
How can I parse the time information and add the current date to it?
I am not really interested in replace or other blocks because, according to the documentation, parsing a time should default to the current date.
You need to add a new field that merges the current date with the field containing your time information, which in your example is the message field, and then point your date filter at this new field. (When the match pattern contains no date components, the date filter falls back to defaults, January 1 of the current year, which is the 2019-01-01 you are seeing. The sprintf reference %{+YYYY-MM-dd} renders the event's @timestamp, which for a freshly received event is the current time, as a date.) You can do this with the following configuration:
filter {
  mutate {
    add_field => { "current_date" => "%{+YYYY-MM-dd} %{message}" }
  }
  date {
    match => [ "current_date", "YYYY-MM-dd HH:mm:ss" ]
  }
}
The result will be something like this:
{
    "current_date" => "2019-03-03 10:15:20",
      "@timestamp" => 2019-03-03T13:15:20.000Z,
            "host" => "elk",
         "message" => "10:15:20",
        "@version" => "1"
}
I am trying to convert "@timestamp": "2017-08-16T15:20:07.254Z" to the "America/Vancouver" timezone.
This is the output I get with the configuration below:

 "localtimestamp" => "2017-08-16 15:20:07.254",
"localtimestamp1" => 2017-08-16T22:20:07.254Z,
mutate {
  add_field => {
    # Create a new field with the string value of the UTC event date
    "localtimestamp" => "%{@timestamp}"
  }
}
ruby {
  code => "
    event.set('localtimestamp', event.get('@timestamp').time.strftime('%Y-%m-%d %H:%M:%S.%L'))
  "
}
date {
  match => [ "localtimestamp", "yyyy-MM-dd HH:mm:ss.SSS" ]
  timezone => "America/Vancouver"
  target => "localtimestamp1"
}
Any help would be appreciated. I just need to show the @timestamp in a new field, converted to local time.
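A sketch of one possible fix, with assumptions called out: the date filter's timezone option declares which zone the input string is in, and the parsed result is always stored in UTC, which is why localtimestamp1 comes out as 22:20 UTC rather than Vancouver local time. To display local time you can format it yourself in the ruby filter; the -07:00 offset below is an assumed fixed offset for Vancouver and does not track daylight saving time:

ruby {
  code => "
    # Shift the UTC @timestamp to an assumed fixed -07:00 offset and
    # render it as a string; this does not adjust for DST.
    event.set('localtimestamp', event.get('@timestamp').time.localtime('-07:00').strftime('%Y-%m-%d %H:%M:%S.%L'))
  "
}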
I need to parse the date and timestamp in the log to show in the @timestamp field. I am able to parse the timestamp but not the date.
Input Log:
"2010-08-18","00:01:55","text"
My Filter:
grok {
  match => { "message" => '"(%{DATE})","(%{TIME})","(%{GREEDYDATA:message3})"' }
}
Here %{DATE} throws a grokparsefailure.
I am also not sure how to update the @timestamp field.
Appreciate your help.
The %{DATE} pattern is not what you want. It's looking for something in M/D/Y, M-D-Y, D-M-Y, or D/M/Y format.
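If you wanted to stick with grok, a custom capture built from the stock YEAR, MONTHNUM, and MONTHDAY patterns should match an ISO-style date. This is an untested sketch, and the log_date/log_time field names are made up for the example:

grok {
  match => { "message" => '"(?<log_date>%{YEAR}-%{MONTHNUM}-%{MONTHDAY})","(?<log_time>%{TIME})","%{GREEDYDATA:message3}"' }
}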
For a file like this, you could consider using the csv filter:
filter {
  csv {
    columns => ["date","time","message3"]
    add_field => {
      "date_time" => "%{date} %{time}"
    }
  }
  date {
    match => [ "date_time", "yyyy-MM-dd HH:mm:ss" ]
    remove_field => ["date", "time", "date_time"]
  }
}
Since the date filter sets no explicit target, the parsed value is written to @timestamp, which takes care of your second question. This configuration will also handle the case where message3 has embedded quotes that have been escaped.
I have a file containing a series of messages like this:

component+branch.job 2014-09-04_21:24:46 2014-09-04_21:24:49

Each line is a string, some whitespace, the first date and time, some whitespace, and the second date and time. Currently I'm using this filter:
filter {
  grok {
    match => [ "message", "%{WORD:componentName}\+%{WORD:branchName}\.%{WORD:jobType}\s+20%{DATE:dateStart}_%{TIME:timeStart}\s+20%{DATE:dateStop}_%{TIME:timeStop}" ]
  }
}
I would like to convert dateStart and timeStart into @timestamp for that message.
I found that there is a date filter, but I don't know how to use it on two separate fields.
I have also tried something like this as a filter:
date {
  match => [ "message", "YYYY-MM-dd_HH:mm:ss" ]
}

but it didn't work as expected.
Based on the duplicate suggested by Magnus Bäck, I created a solution for my problem. The solution was to mutate the parsed data into one field:
mutate {
  add_field => { "tmp_start_timestamp" => "20%{dateStart}_%{timeStart}" }
}
and then parse it as I suggested in my question.
So the final solution looks like this:
filter {
  grok {
    match => [ "message", "%{WORD:componentName}\+%{WORD:branchName}\.%{DATA:jobType}\s+20%{DATE:dateStart}_%{TIME:timeStart}\s+20%{DATE:dateStop}_%{TIME:timeStop}" ]
  }
  mutate {
    add_field => { "tmp_start_timestamp" => "20%{dateStart}_%{timeStart}" }
  }
  date {
    match => [ "tmp_start_timestamp", "YYYY-MM-dd_HH:mm:ss" ]
  }
}
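If you also need the stop time in its own field (the date filter writes to @timestamp by default), the same trick should work with an explicit target. This is just a sketch, and the tmp_stop_timestamp/timestampStop field names are made up for the example:

mutate {
  # Same trick as above, applied to the stop date and time captured by grok.
  add_field => { "tmp_stop_timestamp" => "20%{dateStop}_%{timeStop}" }
}
date {
  match => [ "tmp_stop_timestamp", "YYYY-MM-dd_HH:mm:ss" ]
  # Store the parsed value in its own field instead of @timestamp.
  target => "timestampStop"
}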