How to load a CSV file in Logstash

I am trying to load a CSV file in Logstash, but it is not reading the file and not creating an index in Elasticsearch.
I need to read the CSV file into Elasticsearch.
I have tried a few changes in the config file.
My config file:
input {
  file {
    type => "csv"
    path => "/root/installables/*.csv"
    start_position => beginning
  }
}
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}
output {
  elasticsearch {
    hosts => localhost
    index => "client"
  }
}
Could anybody tell me how to load a CSV file in Logstash?

I think you should use a "csv" filter. I made it work like this:
input {
  file {
    path => "/filepath..."
    start_position => beginning
    # to read from the beginning of the file
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    columns => ["COL1", "COL2"]
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => "localhost"
    index => "csv_index"
  }
}
Also, adding stdout as an output helps you debug and see whether the file is actually being loaded.
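Adapting that to the original question is only a sketch: the column names below are placeholders (replace them with the CSV's real header names), and the separator assumes a comma-delimited file. The grok/COMBINEDAPACHELOG filter from the question is dropped because it is meant for Apache access logs, not CSV:
input {
  file {
    path => "/root/installables/*.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"   # forget read positions so the file is re-read while testing
  }
}
filter {
  csv {
    separator => ","
    columns => ["COL1", "COL2"]   # placeholders, replace with your real column names
  }
}
output {
  stdout { codec => rubydebug }   # print parsed events so you can verify the columns
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "client"
  }
}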

Related

Logstash config with filebeat issue when using both beats and file input

I am trying to configure Filebeat with Logstash. At the moment I have managed to successfully configure Filebeat with Logstash, but I am running into issues when creating multiple conf files in Logstash.
So currently I have one Beats input, which is something like:
input {
  beats {
    port => 5044
  }
}
filter {
}
output {
  if [@metadata][pipeline] {
    elasticsearch {
      hosts => ["localhost:9200"]
      manage_template => false
      index => "systemsyslogs"
      pipeline => "%{[@metadata][pipeline]}"
    }
  } else {
    elasticsearch {
      hosts => ["localhost:9200"]
      manage_template => false
      index => "systemsyslogs"
    }
  }
}
And a file input Logstash config, which looks like:
input {
  file {
    path => "/var/log/foldername/number.log"
    start_position => "beginning"
  }
}
filter {
  grok {
    match => { "message" => "%{WORD:username} %{INT:number} %{TIMESTAMP_ISO8601:timestamp}" }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "numberlogtest"
  }
}
The grok filter is working, as I successfully created two index patterns in Kibana and can view the data correctly.
The problem is that when I run Logstash with both configs applied, it fetches the data from number.log multiple times and the Logstash plain logs fill up with warnings, which uses a lot of computing resources and pushes CPU over 80% (this is an Oracle instance). If I remove the file config from Logstash, the system runs properly.
I managed to run Logstash with each of these config files applied individually, but not with both at once.
I already added an exclusion in the Filebeat config:
exclude_files:
  - /var/log/foldername/*.log
Logstash plain logs when running both config files:
[2023-02-15T12:42:41,077][WARN ][logstash.outputs.elasticsearch][main][39aca10fa204f31879ff2b20d5b917784a083f91c2eda205baefa6e05c748820] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"numberlogtest", :routing=>nil}, {"service"=>{"type"=>"system"}
"caused_by"=>{"type"=>"illegal_state_exception", "reason"=>"Can't get text on a START_OBJECT at 1:607"}}}}}
Fixed by creating a single Logstash config with both inputs. By default, Logstash concatenates all the config files it loads into one pipeline, so events from every input pass through every output unless conditionals keep them apart:
input {
  beats {
    port => 5044
  }
  file {
    path => "**path**"
    start_position => "beginning"
  }
}
filter {
  if [path] == "**path**" {
    grok {
      match => { "message" => "%{WORD:username} %{INT:number} %{TIMESTAMP_ISO8601:timestamp}" }
    }
  }
}
output {
  if [@metadata][pipeline] {
    elasticsearch {
      hosts => ["localhost:9200"]
      manage_template => false
      index => "index1"
      pipeline => "%{[@metadata][pipeline]}"
    }
  } else {
    if [path] == "**path**" {
      elasticsearch {
        hosts => ["localhost:9200"]
        manage_template => false
        index => "index2"
      }
    } else {
      elasticsearch {
        hosts => ["localhost:9200"]
        manage_template => false
        index => "index1"
      }
    }
  }
}
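A hedged alternative, not from the original thread: instead of comparing the literal path in every conditional, the file input can tag its events and the conditionals can test the tag, which keeps working if the path changes. The tag name numberlog below is just an example:
input {
  beats {
    port => 5044
  }
  file {
    path => "**path**"
    start_position => "beginning"
    tags => ["numberlog"]   # example tag, any name works
  }
}
filter {
  if "numberlog" in [tags] {
    grok {
      match => { "message" => "%{WORD:username} %{INT:number} %{TIMESTAMP_ISO8601:timestamp}" }
    }
  }
}
output {
  if "numberlog" in [tags] {
    elasticsearch {
      hosts => ["localhost:9200"]
      manage_template => false
      index => "index2"
    }
  } else if [@metadata][pipeline] {
    elasticsearch {
      hosts => ["localhost:9200"]
      manage_template => false
      index => "index1"
      pipeline => "%{[@metadata][pipeline]}"
    }
  } else {
    elasticsearch {
      hosts => ["localhost:9200"]
      manage_template => false
      index => "index1"
    }
  }
}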

Logstash input line by line

How can I read files in Logstash line by line using a codec?
I tried the configuration below, but it is not working:
file {
  path => "C:/DEV/Projects/data/*.csv"
  start_position => "beginning"
  codec => line {
    format => "%{[data]}"
  }
}
Here is an example configuration with Elasticsearch in the output:
input {
  file {
    path => "C:/DEV/Projects/data/*.csv"
    start_position => beginning
  }
}
filter {
  csv {
    columns => [
      "COLUMN_1",
      "COLUMN_2",
      "COLUMN_3",
      .
      .
      "COLUMN_N"
    ]
    separator => ","
  }
  mutate {
    convert => {
      "COLUMN_1" => "float"
      "COLUMN_4" => "float"
      "COLUMN_6" => "float"
    }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    action => "index"
    index => "test_index"
  }
}
For the csv filter, see:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-csv.html
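As a side note (my reading, not from the original answer): the file input already emits one event per line, and the line codec's format option only applies when encoding events on output, not when decoding input, which is likely why the original attempt did not behave as expected. If all you need is to set the character encoding, a minimal sketch would be:
input {
  file {
    path => "C:/DEV/Projects/data/*.csv"
    start_position => "beginning"
    codec => plain {
      charset => "UTF-8"   # each line still becomes one event
    }
  }
}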

Issue in renaming Json parsed field in Logstash

I am parsing a JSON log file in Logstash. There is a field named #person.name. I tried to rename this field before sending it to Elasticsearch. I also tried to remove the field, but I could not remove or delete it, and because of that my data is not getting indexed in Elasticsearch.
Error recorded in Elasticsearch:
MapperParsingException[Field name [#person.name] cannot contain '.']
at org.elasticsearch.index.mapper.object.ObjectMapper$TypeParser.parseProperties(ObjectMapper.java:276)
at org.elasticsearch.index.mapper.object.ObjectMapper$TypeParser.parseObjectOrDocumentTypeProperties(ObjectMapper.java:221)
at org.elasticsearch.index.mapper.object.ObjectMapper$TypeParser.parse(ObjectMapper.java:196)
at org.elasticsearch.index.mapper.object.ObjectMapper$TypeParser.parseProperties(ObjectMapper.java:308)
at org.elasticsearch.index.mapper.object.ObjectMapper$TypeParser.parseObjectOrDocumentTypeProperties(ObjectMapper.java:221)
at org.elasticsearch.index.mapper.object.RootObjectMapper$TypeParser.parse(RootObjectMapper.java:138)
at org.elasticsearch.index.mapper.DocumentMapperParser.parse(DocumentMapperParser.java:119)
at org.elasticsearch.index.mapper.DocumentMapperParser.parse(DocumentMapperParser.java:100)
at org.elasticsearch.index.mapper.MapperService.parse(MapperService.java:435)
at org.elasticsearch.cluster.metadata.MetaDataMappingService$PutMappingExecutor.applyRequest(MetaDataMappingService.java:257)
at org.elasticsearch.cluster.metadata.MetaDataMappingService$PutMappingExecutor.execute(MetaDataMappingService.java:230)
at org.elasticsearch.cluster.service.InternalClusterService.runTasksForExecutor(InternalClusterService.java:458)
at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:762)
My Logstash config
input {
  beats {
    port => 11153
  }
}
filter {
  if [type] == "person_get" {
    ## Parsing JSON input with the JSON filter
    json {
      source => "message"
    }
    mutate {
      rename => { "#person.name" => "#person-name" }
      remove_field => [ "#person.name" ]
    }
    fingerprint {
      source => ["ResponseTimestamp"]
      target => "fingerprint"
      key => "78787878"
      method => "SHA1"
      concatenate_sources => true
    }
  }
}
output {
  if [type] == "person_get" {
    elasticsearch {
      index => "logstash-person_v1"
      hosts => ["xxx.xxx.xx:9200"]
      document_id => "%{fingerprint}" # !!! prevent duplication
    }
    stdout {
      codec => rubydebug
    }
  }
}
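There is no accepted answer in this excerpt, so the following is only a hedged suggestion: the mapping error comes from the dot in the field name, and one documented way to handle that is the de_dot filter, which rewrites dots in field names to a separator (underscore by default). A minimal sketch, assuming the field name from the question and that the logstash-filter-de_dot plugin is installed:
filter {
  # Replace the dot in the offending field name so Elasticsearch accepts it,
  # e.g. "#person.name" becomes "#person_name".
  de_dot {
    fields => ["#person.name"]
  }
}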

Logstash input filename as output elasticsearch index

Is there a way of having the filename of the file being read by Logstash as the index name for the output into Elasticsearch?
I am using the following config for Logstash:
input {
  file {
    path => "/logstashInput/*"
  }
}
output {
  elasticsearch {
    index => "FromfileX"
  }
}
I would like to be able to drop in a file, e.g. log-from-20.10.2016.log, and have it indexed into the index log-from-20.10.2016. Does the Logstash input plugin "file" produce any variables for use in the filter or output?
Yes, you can use the path field for that and grok it to extract the filename into an index field:
input {
  file {
    path => "/logstashInput/*"
  }
}
filter {
  grok {
    match => ["path", "(?<index>log-from-\d{2}\.\d{2}\.\d{4})\.log$"]
  }
}
output {
  elasticsearch {
    index => "%{index}"
  }
}
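One caveat worth adding (not from the original answer): if the grok pattern does not match, the event ends up in a literal index named %{index}. A hedged sketch that guards against that, with a hypothetical fallback index name:
output {
  if "_grokparsefailure" not in [tags] {
    elasticsearch {
      index => "%{index}"
    }
  } else {
    elasticsearch {
      index => "unmatched-files"   # hypothetical fallback index
    }
  }
}
A separate answer below takes a different approach, deriving the index name from the path with a ruby filter: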
input {
  file {
    path => "/home/ubuntu/data/gunicorn.log"
    start_position => "beginning"
  }
}
filter {
  grok {
    match => {
      "message" => "%{USERNAME:u1} %{USERNAME:u2} \[%{HTTPDATE:http_date}\] \"%{DATA:http_verb} %{URIPATHPARAM:api} %{DATA:http_version}\" %{NUMBER:status_code} %{NUMBER:byte} \"%{DATA:external_api}\" \"%{GREEDYDATA:android_client}\""
    }
    remove_field => ["message"]
  }
  date {
    match => ["http_date", "dd/MMM/yyyy:HH:mm:ss Z"]
  }
  ruby {
    code => "event.set('index_name', event.get('path').split('/')[-1].gsub('.log',''))"
  }
}
output {
  elasticsearch {
    hosts => ["0.0.0.0:9200"]
    index => "%{index_name}-%{+yyyy-MM-dd}"
    user => "*********************"
    password => "*****************"
  }
  stdout { codec => rubydebug }
}
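Also worth hedging: Elasticsearch index names must be lowercase, so if the incoming filenames can contain uppercase characters, a mutate step on the derived field (index_name from the answer above) avoids indexing errors:
filter {
  mutate {
    lowercase => ["index_name"]   # Elasticsearch rejects index names with uppercase letters
  }
}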

Configuration with output file and codec not parsed by logstash

I'm trying a "simple" Logstash configuration and want to output to a file so I can check the result. So I took the conf from https://www.elastic.co/guide/en/logstash/current/plugins-outputs-file.html and put it in my conf:
input {
  file {
    exclude => ['*.gz']
    path => ['/var/log/*.log']
    type => 'system logs'
  }
  syslog {
    port => 5000
  }
}
output {
  elasticsearch {
    hosts => ['elasticsearch']
  }
  file {
    path => "/config/logstash_out.log"
    codec => {
      line {
        format => "message: %{message}"
      }
    }
  }
  stdout {}
}
but when I launch it (sudo docker run -it --rm --name logstash -p 514:5000 --link elasticsearch:elasticsearch -v "$PWD":/config logstash logstash -f /config/logstash.conf), I get a complaint from Logstash:
fetched an invalid config
{:config=>"input {
file {
exclude => ['*.gz']
path => ['/var/log/*.log']
type => 'system logs'
}
syslog {
port => 5000
}
}
output {
elasticsearch {
hosts => ['elasticsearch']
}
file {
path => \"/config/logstash_out.log\"
codec => {
line {
format => \"message: %{message}\"
}
}
}
stdout {}
}"
, :reason=>"Expected one of #, => at line 20, column 13 (byte 507)
after output { elasticsearch {\n hosts => ['elasticsearch']\n }
\n\n file {\n path => \"/config/logstash_out.log\"\n
codec => { \n line ", :level=>:error}
(I've reformatted a bit so it's more readable)
Any ideas why? I've seen "Logstash output to file and ignores codec", but the proposed solution there is marked as DEPRECATED, so I would like to avoid it.
Thanks!
Your codec format is wrong, and the tutorial had the same mistake. Here is the pull request.
It isn't:
codec => {
  line {
    format => "message: %{message}"
  }
}
but rather:
codec => line {
  format => "message: %{message}"
}
You don't need to add curly brackets around line.
Here is your config corrected:
input {
  file {
    exclude => ['*.gz']
    path => ['/var/log/*.log']
    type => 'system logs'
  }
  syslog {
    port => 5000
  }
}
output {
  elasticsearch {
    hosts => ['elasticsearch']
  }
  file {
    path => "/config/logstash_out.log"
    codec => line {
      format => "message: %{message}"
    }
  }
  stdout {}
}
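A small follow-up note, not from the original answer: if you leave the codec off entirely, the file output defaults to the json_lines codec, so the explicit line codec is what gives you the plain "message: ..." text. The relevant part in isolation:
output {
  file {
    path => "/config/logstash_out.log"
    codec => line {
      format => "message: %{message}"   # without this, events are written as JSON lines
    }
  }
}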
