Logstash input line by line - logstash

How can I read files in Logstash line by line using a codec?
I tried the configuration below, but it is not working:
file {
  path => "C:/DEV/Projects/data/*.csv"
  start_position => "beginning"
  codec => line {
    format => "%{[data]}"
  }
}
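If I read the line codec docs correctly, its format option only applies when encoding (writing) events, not when decoding input, and the file input already emits one event per line by default, so a codec is usually not needed for plain line-by-line reading. A minimal sketch, not from the original question, assuming the same Windows path and that you want the files re-read on every run:
input {
  file {
    path => "C:/DEV/Projects/data/*.csv"
    start_position => "beginning"
    # NUL is the Windows equivalent of /dev/null: no read position is persisted,
    # so the files are re-read on every run (assumption for testing, not from the question)
    sincedb_path => "NUL"
  }
}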

Example configuration with Elasticsearch in the output:
input {
  file {
    path => "C:/DEV/Projects/data/*.csv"
    start_position => "beginning"
  }
}
filter {
  csv {
    columns => [
      "COLUMN_1",
      "COLUMN_2",
      "COLUMN_3",
      .
      .
      "COLUMN_N"
    ]
    separator => ","
  }
  mutate {
    convert => {
      "COLUMN_1" => "float"
      "COLUMN_4" => "float"
      "COLUMN_6" => "float"
    }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    action => "index"
    index => "test_index"
  }
}
For the csv filter documentation:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-csv.html
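As a side note, not part of the original answer: the csv filter can also do the type conversion itself through its convert option, which would let you drop the separate mutate block. A rough sketch:
filter {
  csv {
    # same columns and separator as in the example above
    convert => {
      "COLUMN_1" => "float"
      "COLUMN_4" => "float"
      "COLUMN_6" => "float"
    }
  }
}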

Related

ConfigurationError in Logstash

When I run this configuration file:
input {
  file {
    path => "/tmp/linuxServerHealthReport.csv"
    start_position => "beginning"
    sincedb_path => "/home/infra/logstash-7.14.1/snowdb/health_check"
  }
  codec => multiline {
    pattern => "\""
    negate => true
    what => previous
  }
}
filter {
  csv {
    columns => ["Report_timestamp","Hostname","OS_Relese","Server_Uptime","Internet_Status","Current_CPU_Utilization","Current_Memory_Utilization","Current_SWAP_Utilization","FS_Utilization","Inode_Utilization","FS_Read_Only_Mode_Status","Disk_Multipath_Status","Besclient_Status","Antivirus_Status","Cron_Service_Status","Nagios_Status","Nagios_Heartbest_Status","Redhat_Cluster_Status"]
    separator => ","
    skip_header => true
  }
  mutate {
    remove_field => ["path", "host"]
  }
  skip_empty_columns => true
  skip_empty_row => true
}
# quote_char => "'"
output {
  stdout { codec => rubydebug }
}
I get this error:
[2021-09-22T15:57:04,929][ERROR][logstash.agent ] Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"LogStash::ConfigurationError", :message=>"Expected one of [ \t\r\n], "#", "{" at line 7, column 9 (byte 226) after input {\n file {\n\t\tpath => "/tmp/linuxServerHealthReport.csv"\n start_position => "beginning"\n sincedb_path => "/home/imiinfra/logstash-7.14.1/snowdb/health_check"\n }\n\t\tcodec ", :backtrace=>["/home/imiinfra/logstash-7.14.1/logstash-core/lib/logstash/compiler.rb:32:in compile_imperative'", "org/logstash/execution/AbstractPipelineExt.java:187:in initialize'", "org/logstash/execution/JavaBasePipelineExt.java:72:in initialize'", "/home/imiinfra/logstash-7.14.1/logstash-core/lib/logstash/java_pipeline.rb:47:in initialize'", "/home/imiinfra/logstash-7.14.1/logstash-core/lib/logstash/pipeline_action/create.rb:52:in execute'", "/home/imiinfra/logstash-7.14.1/logstash-core/lib/logstash/agent.rb:391:in block in converge_state'"]}
You need to work on your formatting, but this is what I have reconstructed from the question. The main issue is parameter placement: the codec was put outside of the file block, and the csv options skip_empty_columns and skip_empty_row were likewise outside of the csv block (note that the csv filter option is actually called skip_empty_rows, plural).
So I did a little bit of formatting and fixed those issues, and it should work now:
input {
  file {
    path => "/tmp/linuxServerHealthReport.csv"
    codec => multiline {
      pattern => "\""
      negate => true
      what => previous
    }
    start_position => "beginning"
    sincedb_path => "/home/infra/logstash-7.14.1/snowdb/health_check"
  }
}
filter {
  csv {
    columns => ["Report_timestamp","Hostname","OS_Relese","Server_Uptime","Internet_Status","Current_CPU_Utilization","Current_Memory_Utilization","Current_SWAP_Utilization","FS_Utilization","Inode_Utilization","FS_Read_Only_Mode_Status","Disk_Multipath_Status","Besclient_Status","Antivirus_Status","Cron_Service_Status","Nagios_Status","Nagios_Heartbest_Status","Redhat_Cluster_Status"]
    separator => ","
    skip_header => true
    skip_empty_columns => true
    skip_empty_rows => true
  }
  mutate {
    remove_field => ["path", "host"]
  }
}
output {
  stdout { codec => rubydebug }
}
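One extra caveat, not raised in the thread: the multiline codec keeps the last event buffered until another matching line arrives, so the final row of a file may never be emitted. If that matters, the codec's auto_flush_interval setting can force the pending event out after a period of inactivity. A sketch of just the codec part, with an assumed timeout value:
codec => multiline {
  pattern => "\""
  negate => true
  what => previous
  # flush a buffered event if no new line arrives within 5 seconds (value is an assumption)
  auto_flush_interval => 5
}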

Grok pattern for log with pipe separator

I'm trying to figure out the grok pattern for the log line below.
01/02AVDC190001|00001|4483850000152971|DATAPREP|PREPERATION/ENRICHEMENT |020190201|20:51:52|SCHED
What I've tried so far is:
input {
  file {
    path => "C:/Elasitcity/Logbase/July10_Logs_SDC/*.*"
    start_position => "beginning"
    sincedb_path => "NUL"
  }
}
filter {
  mutate {
    gsub => ["message", "\|", " "]
  }
  grok {
    match => ["message", "%{NUMBER:LOGID} %{NUMBER:LOGPHASE} %{NUMBER:LOGID} %{WORD:LOGEVENT} %{WORD:LOGACTIVITY} %{DATE_US:DATE} %{TIME:LOGTIME}"]
  }
}
output {
  elasticsearch {
    hosts => "localhost"
    index => "grokcsv"
    document_type => "gxs"
  }
  stdout {}
}
I'm also wondering if it's possible to combine the date and time, since they are separated by a pipe character. But that's not the primary question.
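For that secondary point, one possible approach (a sketch, not from the original thread, assuming the grok above captures the date into DATE and the time into LOGTIME) is to join the two fields with mutate and then parse the result with the date filter, adjusting the pattern to the real date format in these logs:
filter {
  mutate {
    # join the separately captured date and time into a single field
    add_field => { "LOGTIMESTAMP" => "%{DATE} %{LOGTIME}" }
  }
  date {
    # the pattern below is a guess and must be adjusted to the actual date format
    match => ["LOGTIMESTAMP", "yyyyMMdd HH:mm:ss"]
  }
}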

Can't create a field with a variable from a grok match regex

I am currently using Logstash, Elasticsearch and Kibana 6.3.0.
My logs are generated at a path containing a unique id: /tmp/USER_DATA/FactoryContainer/images/(my unique id)/oar/oar_image_job(my unique id).stdout
What I want to do is match this unique id and create a field with it.
I'm a bit of a novice with Logstash filters, and I don't know why it won't use my uid: it either returns the literal %{uid} in my field or fails with this Failed to execute action error.
My filter:
input {
  file {
    path => "/tmp/USER_DATA/FactoryContainer/images/*/oar/oar_image_job*.stdout"
    start_position => "beginning"
    add_field => { "data_source" => "oar-image-job" }
  }
}
filter {
  grok {
    match => ["path", "%{UNIXPATH}%{NUMBER:uid}%{UNIXPATH}"]
  }
  mutate {
    add_field => [ "unique_id" => "%{uid}" ]
  }
}
output {
  if [data_source] == "oar-image-job" {
    elasticsearch {
      index => "oar-image-job-%{+YYYY.MM.dd}"
      hosts => ["localhost:9200"]
    }
  }
}
The data_source field is there to avoid this issue: when you put multiple config files in a directory for Logstash to use, they are all concatenated.
In the grok debugger, %{UNIXPATH}%{NUMBER:uid}%{UNIXPATH} returns the right value for my path.
Link to the solution: https://discuss.elastic.co/t/cant-create-a-field-with-a-variable-from-a-grok-match-regex/142613/7?u=thesmartmonkey
The correct filter:
input {
  file {
    path => "/tmp/USER_DATA/FactoryContainer/images/*/oar/oar_image_job*.stdout"
    start_position => "beginning"
    add_field => { "data_source" => "oar-image-job" }
  }
}
filter {
  grok {
    match => { "path" => [ "/tmp/USER_DATA/FactoryContainer/images/%{DATA:unique_id}/oar/oar_image_job%{DATA}.stdout" ] }
  }
}
output {
  if [data_source] == "oar-image-job" {
    elasticsearch {
      index => "oar-image-job-%{+YYYY.MM.dd}"
      hosts => ["localhost:9200"]
    }
  }
}
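An alternative sketch, not from the linked solution, is the dissect filter, which splits the path on literal delimiters instead of regular expressions; %{} marks a skipped piece:
filter {
  dissect {
    mapping => {
      # capture the id between the fixed path segments; the trailing %{} discards the job suffix
      "path" => "/tmp/USER_DATA/FactoryContainer/images/%{unique_id}/oar/oar_image_job%{}.stdout"
    }
  }
}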

Logstash input filename as output elasticsearch index

Is there a way of having the filename of the file being read by logstash as the index name for the output into ElasticSearch?
I am using the following config for logstash.
input {
  file {
    path => "/logstashInput/*"
  }
}
output {
  elasticsearch {
    index => "FromfileX"
  }
}
I would like to be able to put a file e.g. log-from-20.10.2016.log and have it indexed into the index log-from-20.10.2016. Does the logstash input plugin "file" produce any variables for use in the filter or output?
Yes, you can use the path field for that and grok it to extract the filename into an index field:
input {
  file {
    path => "/logstashInput/*"
  }
}
filter {
  grok {
    match => ["path", "(?<index>log-from-\d{2}\.\d{2}\.\d{4})\.log$"]
  }
}
output {
  elasticsearch {
    index => "%{index}"
  }
}
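A second configuration, apparently from another answer to the same question, derives the index name from the file path with a ruby filter and appends the date: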
input {
  file {
    path => "/home/ubuntu/data/gunicorn.log"
    start_position => "beginning"
  }
}
filter {
  grok {
    match => {
      "message" => "%{USERNAME:u1} %{USERNAME:u2} \[%{HTTPDATE:http_date}\] \"%{DATA:http_verb} %{URIPATHPARAM:api} %{DATA:http_version}\" %{NUMBER:status_code} %{NUMBER:byte} \"%{DATA:external_api}\" \"%{GREEDYDATA:android_client}\""
    }
    remove_field => ["message"]
  }
  date {
    match => ["http_date", "dd/MMM/yyyy:HH:mm:ss Z"]
  }
  ruby {
    code => "event.set('index_name', event.get('path').split('/')[-1].gsub('.log', ''))"
  }
}
output {
  elasticsearch {
    hosts => ["0.0.0.0:9200"]
    index => "%{index_name}-%{+yyyy-MM-dd}"
    user => "*********************"
    password => "*****************"
  }
  stdout { codec => rubydebug }
}
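One caveat not mentioned above: Elasticsearch index names must be lowercase, so if the file name can contain upper-case characters it is probably safer to lowercase the derived field first, for example with a mutate filter (a sketch, added here as an assumption about the data):
filter {
  mutate {
    # Elasticsearch rejects index names containing upper-case characters
    lowercase => ["index_name"]
  }
}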

How to load CSV file in logstash

I am trying to load a CSV file in Logstash, but it is not reading the file and not creating the index in Elasticsearch.
I need to read the CSV file into Elasticsearch.
I have tried a few changes in the config file.
My config file:
input {
  file {
    type => "csv"
    path => "/root/installables/*.csv"
    start_position => beginning
  }
}
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}
output {
  elasticsearch {
    hosts => localhost
    index => "client"
  }
}
Could anybody tell me how to load a CSV file in Logstash?
I think you should use a csv filter. I made it work like this:
input {
  file {
    path => "/filepath..."
    start_position => beginning
    # to read from the beginning of the file
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    columns => ["COL1", "COL2"]
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => "localhost"
    index => "csv_index"
  }
}
Also, adding stdout as an output helps you debug and confirm whether the file is being loaded.

Resources