logstash output for arangodb - logstash

Has anybody already found an output package for Logstash to ArangoDB? I see that there is one for Elasticsearch, which is probably quite similar, and maybe also one for MongoDB. Unfortunately I haven't found one for ArangoDB so far, and the public Logstash documentation doesn't help me, as I'm not familiar with Ruby.

I gave it a try and found that the generic Logstash http output plugin is able to connect to ArangoDB. I wrote a blog article about using ArangoDB as a Logstash output with this plugin.
Compared with a dedicated ArangoDB plugin, this has the advantage that it is already available and is maintained as one of the standard Logstash plugins.

It works fine:
bin/logstash -e 'input { stdin {codec => "json" } } output { http { http_method => "post" url => "http://127.0.0.1:8529/_db/rest/_api/document?collection=rest" format => "json" headers => [ "Authorization", "Basic cm9vdDpwYXNzd29yZA==" ] } }'
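For a standing pipeline, the same settings can go into a config file. A rough sketch, assuming the same local ArangoDB endpoint, a database and collection both named rest, and the base64-encoded root:password credentials from the one-liner above (newer releases of the http output plugin take headers as a hash rather than an array):
input {
  stdin { codec => "json" }
}
output {
  http {
    http_method => "post"
    url         => "http://127.0.0.1:8529/_db/rest/_api/document?collection=rest"
    format      => "json"
    # base64 of "root:password" -- replace with your own credentials
    headers     => { "Authorization" => "Basic cm9vdDpwYXNzd29yZA==" }
  }
}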

Related

how to add bytes, session and source parameter in kibana to visualise suricata logs?

I redirected all the logs (Suricata logs here) to Logstash using rsyslog. I used the following rsyslog template:
template(name="json-template"
type="list") {
constant(value="{")
constant(value="\"#timestamp\":\"") property(name="timereported" dateFormat="rfc3339")
constant(value="\",\"#version\":\"1")
constant(value="\",\"message\":\"") property(name="msg" format="json")
constant(value="\",\"sysloghost\":\"") property(name="hostname")
constant(value="\",\"severity\":\"") property(name="syslogseverity-text")
constant(value="\",\"facility\":\"") property(name="syslogfacility-text")
constant(value="\",\"programname\":\"") property(name="programname")
constant(value="\",\"procid\":\"") property(name="procid")
constant(value="\"}\n")
}
For every incoming message, rsyslog will interpolate log properties into a JSON-formatted message and forward it to Logstash, which listens on port 10514.
Reference link: https://devconnected.com/monitoring-linux-logs-with-kibana-and-rsyslog/
(I have also configured Logstash as described in the reference link above.)
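For reference, the Logstash input from that guide is roughly the following (port and codec as used there; adjust if your setup differs):
input {
  udp {
    host  => "127.0.0.1"
    port  => 10514
    codec => "json"
    type  => "rsyslog"
  }
}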
I am getting all the columns in Kibana Discover (as defined in the json-template for rsyslog), but I also need bytes, session and source columns in Kibana, which I am not getting.
Available fields (columns) in Kibana are:
#timestamp
#version
_type
facility
host
message
procid
programname
sysloghost
_id
_index
_score
severity
Please let me know how to add bytes, session and source to the available fields in Kibana. I require these parameters for further drill-down in Kibana.
EDIT: I have added below what my "/var/log/suricata/eve.json" looks like (which I need to visualise in Kibana).
For bytes, I will use (bytes_toserver + bytes_toclient), which are available inside flow.
Session I need to calculate.
Source_IP I will use as the source.
{"timestamp":"2020-05 04T14:16:55.000200+0530","flow_id":133378948976827,"event_type":"flow","src_ip":"0000:0000:0000:0000:0000:0000:0000:0000","dest_ip":"ff02:0000:0000:0000:0000:0001:ffe0:13f4","proto":"IPv6-ICMP","icmp_type":135,"icmp_code":0,"flow":{"pkts_toserver":1,"pkts_toclient":0,"bytes_toserver":87,"bytes_toclient":0,"start":"2020-05-04T14:16:23.184507+0530","end":"2020-05-04T14:16:23.184507+0530","age":0,"state":"new","reason":"timeout","alerted":false}}
Direct answer
Read the grok docs in detail.
Then head over to the grok debugger with some sample logs, to figure out expressions. (There's also a grok debugger built in to Kibana's devtools nowadays)
This list of grok patterns might come in handy, too.
A better way
Use Suricata's JSON log instead of the syslog format, and use Filebeat instead of rsyslog. Filebeat has a Suricata module out of the box.
Sidebar: Parsing JSON logs
In Logstash's filter config section:
filter {
json {
source => "message"
# you probably don't need the "message" field if it parses OK
#remove_field => "message"
}
}
[Edit: added JSON parsing]
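Once the JSON is parsed, the bytes and source fields asked about above can be derived in the same filter section. A rough sketch, assuming the flow object from the eve.json sample ends up in a flow field and a Logstash version with the event.get/event.set Ruby API:
filter {
  ruby {
    # bytes = bytes_toserver + bytes_toclient, source = src_ip (names taken from the eve.json sample)
    code => '
      flow = event.get("flow")
      if flow.is_a?(Hash)
        event.set("bytes", flow["bytes_toserver"].to_i + flow["bytes_toclient"].to_i)
      end
      event.set("source", event.get("src_ip")) if event.get("src_ip")
    '
  }
}
Session has no direct counterpart in the sample event, so that one still has to come from your own logic.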

How to remove fields from filebeat or logstash

I'm very new to the ELK stack and I'm trying to process this log from a Spring application.
{
"#timestamp": "2021-02-17T18:25:47.646+01:00",
"#version": "1",
"message": "The app is running!",
"logger_name": "it.company.demo.TestController",
"thread_name": "http-nio-8085-exec-2",
"level": "INFO",
"level_value": 20000,
"application_name": "myAppName"
}
On the machine where the Spring application is running I set up Filebeat, which is connected to Logstash.
Right now, the logstash configuration is this (very simple, very basic):
input {
beats {
port => 5044
ssl => false
client_inactivity_timeout => 3000
}
}
filter {
json {
source => "message"
}
}
output {
elasticsearch {
hosts => ["localhost"]
}
}
I added the json { source => "message" } filter to extract the message field from the log (I don't know if this is correct).
Anyway, Filebeat is sending a lot of fields that are not part of the log, for example:
agent.hostname
agent.id
agent.type
and many other agent fields (version etc)
host.hostname
host.ip
and many other host fields (os.build, os.family etc)
For my purpose I don't need all these fields; maybe I need some of them, but certainly not all.
I'm asking how I can remove all these fields and keep only the ones I want. How can I do that?
To do this, I think the right solution is to add a filter to Logstash, so that all the applications (Filebeat instances) always send the entire payload and a single Logstash instance parses the messages. Right?
Doing this in Filebeat means I would have to reproduce the configuration for every application, and it wouldn't be centralized. But since I started this new adventure yesterday, I don't know if that's right.
Many thanks
Those fields under [agent] and [host] are being added by filebeat. The filebeat documentation explains how to configure them.
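If you prefer to strip them centrally in Logstash rather than per machine, a mutate filter with remove_field is one option. A minimal sketch; the field list is only an example and should be adjusted to whatever you actually want to drop:
filter {
  mutate {
    # drop Filebeat/host metadata that is not needed downstream
    remove_field => [ "agent", "ecs", "host", "input", "log" ]
  }
}
Alternatively, Filebeat itself has a drop_fields processor, which keeps events smaller on the wire but, as noted in the question, has to be configured on every machine.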

Integrate Elasticsearch with PostgreSQL while using Sails.js with Waterline ORM

I am trying to integrate Elasticsearch with Sails.js and my database isn't MongoDB: I use PostgreSQL, so this post doesn't help.
I have installed Elasticsearch on my Ubuntu box and now it's running successfully. I also installed this package on my Sails project, but I cannot create indexes on my existing models.
How can I define indexes on my models, and how can I search using Elasticsearch inside my Models?
What are the hooks which I need to define it inside models?
Here you can find a fairly straightforward package (sails-elastic). It is driven by configuration options taken directly from Elasticsearch itself.
Elasticsearch docs and index creation in particular
There are lots of approaches to solving this issue. The recommended way is to use Logstash (by Elastic), which I describe in detail below.
I will list most of the approaches I know of here:
Using Logstash
curl https://download.elastic.co/logstash/logstash/logstash-2.3.2.tar.gz > logstash.tar.gz
tar -xzf logstash.tar.gz
cd logstash-2.3.2
Install the jdbc input plugin:
bin/logstash-plugin install logstash-input-jdbc
Then download the PostgreSQL JDBC driver:
curl https://jdbc.postgresql.org/download/postgresql-9.4.1208.jre7.jar > postgresql-9.4.1208.jre7.jar
Now create a configuration file for Logstash that uses the jdbc input, e.g. input.conf:
input {
jdbc {
jdbc_driver_library => "/Users/khurrambaig/Downloads/logstash-2.3.2/postgresql-9.4.1208.jre7.jar"
jdbc_driver_class => "org.postgresql.Driver"
jdbc_connection_string => "jdbc:postgresql://localhost:5432/khurrambaig"
jdbc_user => "khurrambaig"
jdbc_password => ""
schedule => "* * * * *"
statement => 'SELECT * FROM customer WHERE "updatedAt" > :sql_last_value'
type => "customer"
}
jdbc {
jdbc_driver_library => "/Users/khurrambaig/Downloads/logstash-2.3.2/postgresql-9.4.1208.jre7.jar"
jdbc_driver_class => "org.postgresql.Driver"
jdbc_connection_string => "jdbc:postgresql://localhost:5432/khurrambaig"
jdbc_user => "khurrambaig"
jdbc_password => ""
schedule => "* * * * *"
statement => 'SELECT * FROM employee WHERE "updatedAt" > :sql_last_value'
type => "employee"
}
# add more jdbc inputs to suit your needs
}
output {
elasticsearch {
index => "khurrambaig"
document_type => "%{type}" # <- use the type from each input
document_id => "%{id}" # <- To avoid duplicates
hosts => "localhost:9200"
}
}
Now run logstash using the above file:
bin/logstash -f input.conf
For every model that you want to index as a document type (table) in an index (a database, khurrambaig here), add a jdbc input with an appropriate SQL statement (SELECT * FROM employee WHERE "updatedAt" > :sql_last_value here). I have used sql_last_value so that only updated rows are pulled in. Logstash also supports scheduling and much more; here each input runs every minute. For more details, refer to this.
To see the documents that have been inserted into the index for a particular type:
curl -XGET 'http://localhost:9200/khurrambaig/customer/_search?pretty=true'
This will list all the documents under the customer model in my case. Look into the Elasticsearch search API, or use the official Node.js client.
Using elasticsearch-jdbc (JDBC importer)
https://github.com/jprante/elasticsearch-jdbc
You can read its README; it's quite straightforward. But it doesn't provide scheduling and many of the other things that Logstash provides.
Using sails-elastic
You need to use multiple adapters, as described in its README.
But this isn't recommended, because it will slow down your requests: for every create, update and delete you will be calling two databases, Elasticsearch and PostgreSQL.
With Logstash, indexing of documents is independent of your application's requests. This approach is used by many, including Wikipedia. You also remain independent of the framework: today you are using Sails, tomorrow you might use something else, but nothing needs to change on the Logstash side as long as you still use PostgreSQL. (Even if you change databases, inputs for many of them are available, and for a move from one SQL RDBMS to another you just need to swap the JDBC driver.)
There's also ZomboDB, but it currently works only with Elasticsearch versions before 2.0 (support for ES > 2.0 is on the way).

Logstash - is an output to influx DB available?

I want to have an output to InfluxDB from Logstash. Is there any such plugin available?
The output is currently set to graphite. This is the InfluxDB config:
[input_plugins]
# Configure the graphite api
[input_plugins.graphite]
enabled = true
port = 2003
database = "AirAnalytics" # store graphite data in this database
# udp_enabled = true # enable udp interface on the same port as the tcp interface
This is the logstash config:
output {
stdout {}
graphite {
host => "localhost"
port => 2003
}
}
I see the output in the console (stdout), but no other messages, and nothing gets posted into InfluxDB. I checked the InfluxDB logs as well: nothing.
I tried posting the same message directly to InfluxDB via HTTP and it works, so there's no issue with the message or the InfluxDB install.
Solved it. I needed to pass the already prepared InfluxDB-compatible string on to InfluxDB via Logstash.
Following is the logstash configuration snippet which did the trick:
output {
http {
url => "http://localhost:8086/db/<influx db name>/series?u=<user name>&p=<pwd>"
format => "message"
content_type => "application/json"
http_method => "post"
message => "%{message}"
verify_ssl => false
}
stdout {}
}
Note: If you use format "json" then Logstash wraps the body in a "message" field, which was causing a problem.
It's available via logstash-contrib as an output: https://github.com/elasticsearch/logstash-contrib/blob/master/lib/logstash/outputs/influxdb.rb
There is an influxdb output in logstash-contrib, however, this was added after 1.4.2 was released.
With logstash 1.5, there is a new plugin management system. If you're using 1.5, you can install the influxdb output with:
# assuming you're in the logstash directory
$ ./bin/plugin install logstash-output-influxdb
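Once installed, a minimal output block might look like the sketch below. Option names follow the logstash-output-influxdb plugin (older releases used series where newer ones use measurement); the database name and the field mapping are examples only:
output {
  influxdb {
    host        => "localhost"
    port        => 8086
    db          => "AirAnalytics"
    measurement => "logstash_events"
    user        => "root"
    password    => "root"
    # map event fields onto InfluxDB fields -- adjust to your own events
    data_points => { "duration" => "%{request_time}" }
  }
}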
Maybe this helps:
http://influxdb.com/docs/v0.8/api/reading_and_writing_data.html
Look at the section: Writing data through the Graphite protocol.
Maybe you can use the graphite output of Logstash.
I think I am going to try that this weekend.
The accepted answer, while it works, is not very flexible because:
It requires the actual JSON payload to be in %{message} or whatever Logstash variable you end up using
it doesn't submit the data points in batches where possible (unless, of course, you already have that in the JSON payload... in which case, why are you even using Logstash in the first place?)
As noted by Paul and Wilfred, there is support for InfluxDB written by Jordan Sissel himself, but it was released after 1.4.2. The good thing is that it works with 1.4.2 (I've tried it myself): all you need to do is copy the influxdb.rb file into /lib/logstash/outputs and configure Logstash accordingly. As for the documentation, you can find it here... it took me a bit more effort to find because googling "influxdb logstash" doesn't put this link on the first page of results.

Using Log4J with LogStash

I'm new to LogStash. I have some logs written from a Java application in Log4J. I'm in the process of trying to get those logs into ElasticSearch. For the life of me, I can't seem to get it to work consistently. Currently, I'm using the following logstash configuration:
input {
file {
type => "log4j"
path => "/home/ubuntu/logs/application.log"
}
}
filter {
grok {
type => "log4j"
add_tag => [ "ApplicationName" ]
match => [ "message", "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level}" ]
}
}
output {
elasticsearch {
protocol => "http"
codec => "plain"
host => "[myIpAddress]"
port => "[myPort]"
}
}
This configuration seems to be hit or miss, and I'm not sure why. For instance, I have two messages: one works, and the other throws a parse failure. Here are the messages and their respective results:
Tags: ["_grokparsefailure"]
Message: 2014-04-04 20:14:11,613 TRACE c.g.w.MyJavaClass [pool-2-thread-6] message was null from https://domain.com/id-1/env-MethodName

Tags: ["ApplicationName"]
Message: 2014-04-04 20:14:11,960 TRACE c.g.w.MyJavaClass [pool-2-thread-4] message was null from https://domain.com/id-1/stable-MethodName
The one tagged ["ApplicationName"] has my custom timestamp and level fields. However, the entry tagged ["_grokparsefailure"] does NOT have my custom fields. The strange part is that the two log lines are nearly identical, as shown above. This is really confusing me, and I don't know how to figure out what the problem is or how to get past it. Does anyone know how I can import Log4J logs into Logstash and consistently get the following fields:
Log Level
Timestamp
Log message
Machine Name
Thread
Thank you for any help you can provide. Even if I can just get the log level, timestamp, and log message, that would be a HUGE help. I sincerely appreciate it!
I'd recommend using the log4j socket listener for logstash and the log4j socket appender.
Logstash conf:
input {
log4j {
mode => server
host => "0.0.0.0"
port => [logstash_port]
type => "log4j"
}
}
output {
elasticsearch {
protocol => "http"
host => "[myIpAddress]"
port => "[myPort]"
}
}
log4j.properties:
log4j.rootLogger=[myAppender]
log4j.appender.[myAppender]=org.apache.log4j.net.SocketAppender
log4j.appender.[myAppender].port=[log4j_port]
log4j.appender.[myAppender].remoteHost=[logstash_host]
There's more info in the logstash docs for their log4j input: http://logstash.net/docs/1.4.2/inputs/log4j
It looks like the SocketAppender solution that was used before is deprecated because of some security issue.
Currently the recommended solution is to use the Log4j FileAppender and then ship the file through Filebeat to Logstash, and do the filtering there.
For more information you can refer the below links:
https://www.elastic.co/blog/log4j-input-logstash
https://www.elastic.co/guide/en/logstash/current/plugins-inputs-log4j.html
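On the Logstash side, that setup boils down to a beats input plus a grok filter. A rough sketch, assuming the log line format shown in the question; the grok pattern is illustrative and may need tweaking for your exact layout:
input {
  beats {
    port => 5044
  }
}
filter {
  grok {
    # e.g. "2014-04-04 20:14:11,613 TRACE c.g.w.MyJavaClass [pool-2-thread-6] message was null from ..."
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{JAVACLASS:class} \[%{DATA:thread}\] %{GREEDYDATA:logmessage}" }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}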
On my blog (edit: the link is dead and has been removed) I described how to send JSON messages to Elasticsearch and then parse them with grok.
The post contains a description as well as a simple Maven project with an example (the complete project is on GitHub).
Hope it helps you.
