Is it possible to fetch data from an HTTP URL to Logstash?

As the title says, I need to feed data (CSV or JSON) directly into Logstash. What I want is to set up a filter, say CSV, which reads the content directly from some http://example.com/csv.php into Logstash without involving any middleman script.

If I understand you correctly, you are trying to call an HTTP resource repeatedly and fetch the data into Logstash, so you are looking for an input rather than a filter.
Logstash has just released a new HTTP poller input plugin for exactly that purpose. After installing it with bin/plugin install logstash-input-http_poller you can set up a config like this to call your resource:
input {
  http_poller {
    urls => {
      myresource => "http://example.com/csv.php"
    }
    request_timeout => 60
    interval => 60
    codec => "json" # set this if the response is JSON formatted
  }
}
If the response contains CSV, you need to add a csv filter:
filter {
  csv { }
}
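If the column layout is known ahead of time, you can also name the columns in the filter. A minimal sketch, assuming made-up column names for csv.php:
filter {
  csv {
    # hypothetical column names; adjust to the actual header returned
    # by http://example.com/csv.php
    columns => ["id", "name", "value"]
    separator => ","
  }
}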
There are also plugins which perform an HTTP request within the filter section. However, these are meant to enrich an existing event, which doesn't seem to be what you are looking for.

Related

How to remove fields from filebeat or logstash

I'm very new to the ELK stack and I'm trying to process this log from a Spring application.
{
  "@timestamp": "2021-02-17T18:25:47.646+01:00",
  "@version": "1",
  "message": "The app is running!",
  "logger_name": "it.company.demo.TestController",
  "thread_name": "http-nio-8085-exec-2",
  "level": "INFO",
  "level_value": 20000,
  "application_name": "myAppName"
}
On the machine where the Spring application is running I set up Filebeat, which is connected to Logstash.
Right now, the Logstash configuration is this (very simple, very basic):
input {
  beats {
    port => 5044
    ssl => false
    client_inactivity_timeout => 3000
  }
}
filter {
  json {
    source => "message"
  }
}
output {
  elasticsearch {
    hosts => "localhost"
  }
}
I added the json { source => "message" } filter to extract the message field from the log (I don't know if this is correct).
Anyway, Filebeat is sending a lot of fields that are not included in the log, for example:
agent.hostname
agent.id
agent.type
and many other agent fields (version etc)
host.hostname
host.ip
and many other host fields (os.build, os.family etc)
For my purpose I don't need all these fields, maybe I need some of them, but for sure not all.
I'm asking how I can remove all these fields and select only the fields I want. How can I do that?
And, to do this, I think the right solution is to add a filter to Logstash, so all the applications (Filebeats) always send the entire payload and a single instance of Logstash parses the message, right?
Doing this on Filebeat means that I need to reproduce this configuration for all my applications, and it's not centralized. But, because I started this new adventure yesterday, I don't know if it's right.
Many thanks
Those fields under [agent] and [host] are being added by filebeat. The filebeat documentation explains how to configure them.
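If you would rather strip them centrally in Logstash instead of in Filebeat, a mutate filter can drop whole field trees. A minimal sketch, assuming you only want to get rid of [agent] and [host]:
filter {
  mutate {
    # remove the whole [agent] and [host] objects added by Filebeat;
    # list individual sub-fields (e.g. "[host][os]") instead if you
    # want to keep some of them
    remove_field => ["agent", "host"]
  }
}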

How to access available fields of @metadata in logstash

A sample Logstash instance is running and getting input data from a Filebeat running on another machine in the same network. I need to process some metadata of the files forwarded by Filebeat, for example the modified date of the input file. I found that this information may be available in the @metadata variable, and I can access some fields like this:
%{[@metadata][type]}
%{[@metadata][beat]}
but I don't know how to access all the data stored in this field so that I'll be able to extract my own data.
You can add the following configuration to your logstash.conf file:
output {
  stdout {
    codec => rubydebug {
      metadata => true
    }
  }
}
https://www.elastic.co/blog/logstash-metadata
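Once you can see in the rubydebug output which @metadata fields actually exist, you can promote the ones you need into regular event fields. A minimal sketch, using [@metadata][beat] purely as an example; substitute whichever field you are after:
filter {
  mutate {
    # copy one @metadata field into a normal field so it survives into the output
    add_field => { "beat_name" => "%{[@metadata][beat]}" }
  }
}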
But this field does not contain the metadata of the input file.

Logstash Dynamically assign template

I have read that it is possible to assign dynamic names to the indexes like this:
elasticsearch {
  cluster => "logstash"
  index => "logstash-%{clientid}-%{+YYYY.MM.dd}"
}
What I am wondering is if it is possible to assign the template dynamically as well:
elasticsearch {
  cluster => "logstash"
  template => "/etc/logstash/conf.d/%{clientid}-template.json"
}
Also where does the variable %{clientid} come from?
Thanks!
After some testing and feedback from other users (thanks, Ben Lim), it seems this is not possible so far.
The closest thing would be to do something like this:
if [type] == "redis-input" {
  elasticsearch {
    cluster => "logstash"
    index => "%{type}-logstash-%{+YYYY.MM.dd}"
    template => "/etc/logstash/conf.d/elasticsearch-template.json"
    template_name => "redis"
  }
} else if [type] == "syslog" {
  elasticsearch {
    cluster => "logstash"
    index => "%{type}-logstash-%{+YYYY.MM.dd}"
    template => "/etc/logstash/conf.d/syslog-template.json"
    template_name => "syslog"
  }
}
Full disclosure: I am a Logstash developer at Elastic
You cannot dynamically assign a template because templates are uploaded only once, at Logstash initialization. Since there is no event traffic flowing during initialization, there is nothing there that can "fill in the blank" for %{clientid}.
It is also important to remember that Elasticsearch index templates are only used when a new index is created, so templates are not uploaded every time a document reaches the Elasticsearch output block in Logstash (imagine how much slower it would be if Logstash had to do that). If you intend to have multiple templates, they need to be uploaded to Elasticsearch before any data gets sent there. You can do this with a script of your own making using curl and the Elasticsearch API. This also permits you to update templates without having to restart Logstash. You could run the script any time before index rollover, and when the new indices get created they'll have the new template settings.
Logstash can send data to a dynamically configured index name, just as you have above, but if there is no matching template present, Elasticsearch will create a best-guess mapping rather than the one you wanted. Templates can and ought to be managed completely independently of Logstash. The template-upload functionality in Logstash was added to improve the out-of-the-box experience for brand-new users; the default template is less than ideal for advanced use cases, and Logstash is not a good tool for template management if you have more than one index template.
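As a rough sketch of that approach, assuming the per-client templates have already been pushed to Elasticsearch out of band and reusing the cluster and index pattern from the question:
output {
  elasticsearch {
    cluster => "logstash"
    index => "logstash-%{clientid}-%{+YYYY.MM.dd}"
    # templates are uploaded to Elasticsearch separately (e.g. with curl),
    # so Logstash is told not to manage them itself
    manage_template => false
  }
}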

Logstash - is an output to InfluxDB available?

I want to have an output for InfluxDB from Logstash; is there any such plugin available?
The output is set to graphite. This is the InfluxDB config:
[input_plugins]
# Configure the graphite api
[input_plugins.graphite]
enabled = true
port = 2003
database = "AirAnalytics" # store graphite data in this database
# udp_enabled = true # enable udp interface on the same port as the tcp interface
This is the logstash config:
output {
  stdout {}
  graphite {
    host => "localhost"
    port => 2003
  }
}
I see the output in the console (stdout) but no other message and nothing gets posted into influx. I checked the influx logs as well, nothing.
I tried posting the same message directly via http to influx and it works, so there's no issue with the message or influx install.
Solved it. I needed to pass the already prepared InfluxDB-compatible string to InfluxDB via Logstash.
The following is the Logstash configuration snippet which did the trick:
output {
  http {
    url => "http://localhost:8086/db/<influx db name>/series?u=<user name>&p=<pwd>"
    format => "message"
    content_type => "application/json"
    http_method => "post"
    message => "%{message}"
    verify_ssl => false
  }
  stdout {}
}
Note: if you use the format "json" then Logstash wraps the body in a "message" field, which was causing a problem.
It's available via logstash-contrib as an output: https://github.com/elasticsearch/logstash-contrib/blob/master/lib/logstash/outputs/influxdb.rb
There is an influxdb output in logstash-contrib; however, it was added after 1.4.2 was released.
With logstash 1.5, there is a new plugin management system. If you're using 1.5, you can install the influxdb output with:
# assuming you're in the logstash directory
$ ./bin/plugin install logstash-output-influxdb
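Once installed, the output can be pointed at the database from the question. This is only a rough sketch; the option names are from memory and may differ between plugin versions, so treat them as assumptions and check the plugin documentation for your release:
output {
  influxdb {
    host => "localhost"
    port => 8086
    db => "AirAnalytics"          # database name taken from the question
    user => "root"                # placeholder credentials
    password => "root"
    data_points => {
      "value" => "%{some_metric}" # hypothetical event field to write
    }
  }
}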
Maybe this helps:
http://influxdb.com/docs/v0.8/api/reading_and_writing_data.html
Look at the section "Writing data through Graphite Protocol"; maybe you can use the graphite output of Logstash.
I think I am going to try that this weekend.
The accepted answer, while it works, is not very flexible because:
It requires the actual JSON payload to be in %{message} or whatever Logstash variable you end up using
It doesn't submit the data points in batches where possible (unless, of course, you already have them in the JSON payload, in which case why are you even using Logstash in the first place?)
As noted by Paul and Wilfred, there is support for InfluxDB written by Jordan Sissel himself, but it was released after 1.4.2. The good thing is that it works with 1.4.2 (I've tried it myself); all you need to do is copy the influxdb.rb file to /lib/logstash/outputs and configure Logstash accordingly. As for the documentation, you can find it here; it took me a bit more effort to find because googling "influxdb logstash" doesn't put this link on the first page of results.

logstash & kibana setting the @message property

The @message property seems to be the core property when using Logstash & Kibana. My JSON logger sends the data with the message as
{"msg":"some one did something"}
If I change it so it's
{"@message":"someone did something"}
the Logstash server picks it up as "@fields.@message".
I am a bit confused about how I can set this property to render correctly.
I suspect that the input is reading events as json and not json_event. The difference is that json will add any fields under the @fields namespace, while json_event expects the full Logstash event serialized as JSON.
The functionality you have is probably what you want. You typically don't want to be sending the full json_event if you don't have to. You can overwrite the @message field in Logstash with the mutate filter:
mutate {
  type => 'json_logger'
  replace => ["@message", "%{msg}"]
  remove => "msg"
}
