With Logstash I am trying to Extract some tables, Transform them locally on the Logstash machine, and then Load the result into Elasticsearch. The reason for this approach is the very limited computing power on the source server, a MariaDB instance.
I have tested the input{} separately and it works, so the connection to MariaDB is sound.
I have tested the jdbc_static filter against a Microsoft SQL Server, so Logstash has write privileges in its current environment.
I have tested the SQL syntax on the MariaDB server directly.
I'm running Logstash 6.8 and Java 8 (java version "1.8.0_211").
I have also tried earlier versions of the MariaDB JDBC connector (mariadb-java-client-2.4.2.jar, mariadb-java-client-2.2.6-sources, mariadb-java-client-2.3.0-sources).
My config file
input {
  jdbc {
    jdbc_driver_library => "C:/Logstash/logstash-6.8.0/plugin/mariadb-java-client-2.4.2.jar"
    jdbc_driver_class => "Java::org.mariadb.jdbc.Driver"
    jdbc_connection_string => "jdbc:mariadb://xx.xx.xx"
    jdbc_user => "me"
    jdbc_password => "its secret"
    schedule => "* * * * *"
    statement => "SELECT unqualifiedversionid__ FROM AuditEventFHIR WHERE myUnqualifiedId = '0000134b-fc7f-4c3a-b681-8150068d6dbb'"
  }
}
filter {
  jdbc_static {
    loaders => [
      {
        id => "auditevent"
        query => "SELECT
          myUnqualifiedId
          ,unqualifiedversionid__
          ,type_
          FROM AuditEventFHIR
          where myUnqualifiedId = '0000134b-fc7f-4c3a-b681-8150068d6dbb'
        "
        local_table => "l_ae"
      }
    ]
    local_db_objects => [
      {
        name => "l_ae"
        index_columns => ["myUnqualifiedId"]
        columns => [
          ["myUnqualifiedId", "varchar(256)"],
          ["unqualifiedversionid__", "varchar(24)"],
          ["type_", "varchar(256)"]
        ]
      }
    ]
    local_lookups => [
      {
        id => "rawlogfile"
        query => "
          select myUnqualifiedId from l_ae
        "
        target => "sql_output"
      }
    ]
    jdbc_driver_library => "C:/Logstash/logstash-6.8.0/plugin/mariadb-java-client-2.4.2.jar"
    jdbc_driver_class => "Java::org.mariadb.jdbc.Driver"
    jdbc_connection_string => "jdbc:mariadb://xx.xx.xx.xx"
    jdbc_user => "me"
    jdbc_password => "its secret"
  }
}
output {
  stdout { codec => rubydebug }
}
I am getting this and several other errors, but I suspect fixing the first will fix the rest. The key point is that nowhere in my code do the words "LIMIT 1" appear.
[ERROR][logstash.filters.jdbc.readonlydatabase] Exception occurred when executing loader Jdbc query count {:exception=>"Java::JavaSql::SQLSyntaxErrorException: (conn=1490) You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '\"T1\" LIMIT 1' at line 8", :backtrace=>["org.mariadb.jdbc.internal.util.exceptions.ExceptionMapper.get(org/mariadb/jdbc/internal/util/exceptions/ExceptionMapper.java:242)", "org.mariadb.jdbc.internal.util.exceptions.ExceptionMapper.getException(org/mariadb/jdbc/internal/util/exceptions/ExceptionMapper.java:171)", "org.mariadb.jdbc.MariaDbStatement.executeExceptionEpilogue(org/mariadb/jdbc/MariaDbStatement.java:248)", "org.mariadb.jdbc.MariaDbStatement.executeInternal(org/mariadb/jdbc/MariaDbStatement.java:338)", "org.mariadb.jdbc.MariaDbStatement.executeQuery(org/mariadb/jdbc/MariaDbStatement.java:512)", "java.lang.reflect.Method.invoke(java/lang/reflect/Method.java:498)", "org.jruby.javasupport.JavaMethod.invokeDirectWithExceptionHandling(org/jruby/javasupport/JavaMethod.java:425)", "org.jruby.javasupport.JavaMethod.invokeDirect(org/jruby/javasupport/JavaMethod.java:292)"]}
The jdbc_static loader makes a hidden SQL query, select count(*) from table limit 1, to get a checksum when downloading the table. This query puts double quotes (") around identifiers, such as the "T1" in the error message, and MariaDB does not accept that by default,
unless you add ANSI_QUOTES to the sql_mode.
Batch command:
SET GLOBAL sql_mode = 'ANSI_QUOTES'
Another option is to enable ANSI_QUOTES for the session only, through the connection string:
jdbc_connection_string => "jdbc:mariadb://xx.xx.xx/databasename?sessionVariables=sql_mode=ANSI_QUOTES"
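Because the error is logged by logstash.filters.jdbc.readonlydatabase, the connection that needs the session variable is the one inside the jdbc_static filter, not only the one in the jdbc input. A minimal sketch of where the line would go, keeping the placeholder host and the databasename placeholder from the line above; the loaders, local_db_objects and local_lookups settings are assumed unchanged and are left out here:

```
filter {
  jdbc_static {
    # loaders, local_db_objects and local_lookups stay as in the question
    jdbc_driver_library => "C:/Logstash/logstash-6.8.0/plugin/mariadb-java-client-2.4.2.jar"
    jdbc_driver_class => "Java::org.mariadb.jdbc.Driver"
    # sessionVariables sets sql_mode for this connection only
    jdbc_connection_string => "jdbc:mariadb://xx.xx.xx.xx/databasename?sessionVariables=sql_mode=ANSI_QUOTES"
    jdbc_user => "me"
    jdbc_password => "its secret"
  }
}
```

Note that assigning sql_mode this way replaces the session's mode list with ANSI_QUOTES alone, so any other modes the server normally runs with would need to be listed as well if the queries depend on them.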
Related
I want to load my CSV file into PostgreSQL, but whenever I try to connect it shows this error:
[ERROR][logstash.outputs.jdbc ] Unknown setting 'driver_library' for jdbc
This is my Config.conf file:
input {
  file {
    path => "C:/Users/Desktop/Input.csv"
    start_position => "beginning"
    codec => plain
  }
}
filter {
  csv {
    separator => ","
    columns => ["Column","Metric","Source_Table","Output_Column_Alias","Method"]
  }
}
output {
  jdbc {
    connection_string => "jdbc:postgresql://hostname:5432/Database"
    username => "User"
    password => "Password"
    driver_library => "C:/Users/lib/postgresql-42.5.1.jar"
    driver_class => "org.postgresql.Driver"
    statement => "INSERT INTO CSV_to_Postgresql (Column,Metric,Source_Table,Output_Column_Alias,Method) VALUES (?, ?, ?, ?, ?)"
  }
}
Use driver_jar_path, not driver_library. The Elastic-supported plugins use jdbc_driver_library as the name of this option, but the jdbc output is a third-party plugin that follows different conventions.
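Applied to the output block from the question, the change might look like the sketch below. Only the renamed setting comes from the answer. The array form of statement (SQL first, then the event fields bound to each ? in order) is an assumption based on how the third-party jdbc output plugin documents that option, so check it against the plugin version you have installed:

```
output {
  jdbc {
    connection_string => "jdbc:postgresql://hostname:5432/Database"
    username => "User"
    password => "Password"
    # driver_jar_path replaces the unknown driver_library setting
    driver_jar_path => "C:/Users/lib/postgresql-42.5.1.jar"
    driver_class => "org.postgresql.Driver"
    # Assumption: element 0 is the prepared SQL, the remaining elements
    # are event field names supplied as parameters, in order
    statement => [
      "INSERT INTO CSV_to_Postgresql (Column,Metric,Source_Table,Output_Column_Alias,Method) VALUES (?, ?, ?, ?, ?)",
      "Column", "Metric", "Source_Table", "Output_Column_Alias", "Method"
    ]
  }
}
```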
From a Python script I am running Logstash through a command inside a Docker container. The normal behavior (with Logstash installed on the server) is that the pipeline shuts down after it has fetched the data, but here the process never ends.
logstash=subprocess.call(["docker","exec", "-it", "logstash-docker_logstash_1", "/usr/share/logstash/bin/logstash","-f", "/usr/share/logstash/pipeline/site-canvas.conf","--path.data","/usr/share/logstash/config/min-data/"])
I'm using docker top to see the running processes inside the container.
What can I do to ensure that the process ends once it has finished getting the data?
This is my pipeline:
input {
  jdbc {
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://db-ip:1433;databasename=omi"
    jdbc_user => "my-user"
    jdbc_password => "my-pass"
    statement => "SELECT
      TIME_CREATED,DESCRIPTION as problem, SEVERITY as severity_mame, NODEHINTS_DNSNAME as source,CATEGORY
      FROM [omi1062event].[dbo].[ALL_EVENTS]
      WHERE STATE = 'OPEN'
      AND NODEHINTS_DNSNAME LIKE 'mju%'
      AND TIME_CREATED >= DATEADD(day, -1, GETDATE())
      ORDER BY TIME_CREATED ASC
    "
    jdbc_default_timezone => "UTC"
  }
}
filter {
  date {
    match => [ "time_created", "ISO8601", "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'","yyyy-MM-dd HH:mm:ss", "yyyy-MM-dd HH:mm:ss.SSSSSS" ]
    timezone => "Chile/Continental"
  }
}
output {
  elasticsearch {
    hosts => "my-ip:9200"
    index => "canvas"
    user => "my-user"
    password => "my-pass"
  }
}
Hi all, I have created a Logstash config file, scheduled every 5 minutes, which transports data from MS SQL Server to Elasticsearch. I run Logstash from Windows PowerShell with the following command: .\logstash-7.2.0\bin\logstash -f logstash.conf.txt
Logstash Config
input {
  jdbc {
    jdbc_driver_library => ""
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://xxxxxx\SQLEXPRESS:1433;databaseName=xxxx;"
    jdbc_user => "xxxxx"
    jdbc_password => "xxxx"
    jdbc_paging_enabled => true
    tracking_column => "modified_date"
    use_column_value => true
    clean_run => true
    tracking_column_type => "timestamp"
    # five fields: every 5 minutes (with a sixth, leading field the first one counts seconds)
    schedule => "*/5 * * * *"
    statement => "SELECT * from [xxxxxxxx] where modified_date >:sql_last_value"
  }
}
filter {
  mutate {
    remove_field => ["@version","@timestamp"]
  }
}
output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "employee"
    document_type => "_doc"
    document_id => "%{id}"
  }
  stdout { codec => rubydebug }
}
How do I deploy the same thing in a production environment? On my local machine I use Windows PowerShell to execute the commands; how do I achieve this in production?
Could anyone please advise how to deploy this as a service in a production environment?
Not sure I understand the question... Are you trying to deploy the same configuration on a Linux server in production?
If so, you should change jdbc_driver_library (the path to the driver jar), and possibly also the jdbc_connection_string and hosts parameters, to match the production server.
Check out also the following question:
Set vm.max_map_count on cluster nodes
It may be of help to you, though as I said I'm not sure.
Good luck! :-)
I have a table, e.g. QueryConfigTable, that holds a query in one column, e.g. select * from customertable. I want the query held in that column to be executed as the statement of the JDBC input in Logstash.
Instead, it is taking the query text in the column as a plain value and storing it in Elasticsearch.
input {
  jdbc {
    jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/MYDB"
    # MYDB will be set dynamically.
    jdbc_user => "mysql"
    parameters => { "favorite_artist" => "Beethoven" }
    schedule => "* * * * *"
    statement => "SELECT * from QueryConfigTable "
  }
}
# output to Elasticsearch
output {
  elasticsearch {
    hosts => ["http://my-host.com:9200"]
    index => "test"
  }
}
The final output is:
"_index": "test",
"_type": "doc",
"_source": {
  "product": "PL SALARIED AND SELF EMPLOYED",
  "@version": "1",
  "query": "select * from customertable cust where cust.isdeleted !=0"
}
But I want the query value, i.e. "select * from customertable cust where cust.isdeleted !=0", to be executed as the JDBC input to Logstash.
The jdbc input will not do this kind of indirection for you. You could write a stored procedure that fetches and executes the SQL and call that from the jdbc input.
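On the Logstash side, the stored-procedure route might look like the sketch below. The name run_configured_query is hypothetical and does not come from the question. It assumes a procedure of that name has been created in MYDB which reads the SQL text from QueryConfigTable and runs it (in MySQL, for instance, with PREPARE and EXECUTE), so that the procedure's result set is what the jdbc input receives:

```
input {
  jdbc {
    jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/MYDB"
    jdbc_user => "mysql"
    schedule => "* * * * *"
    # run_configured_query is a hypothetical procedure; it must exist in
    # MYDB and execute the query stored in QueryConfigTable
    statement => "CALL run_configured_query()"
  }
}
```

If QueryConfigTable holds more than one row, the procedure has to decide which query to run, for example through an argument passed in the CALL.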
I want to use Elasticsearch within my project. I am using Node.js and PostgreSQL.
To connect PostgreSQL with Elasticsearch I am using jdbc-importer. I followed the steps in its docs and succeeded in connecting to PostgreSQL, but only from the command line.
I want to use jdbc-importer within my project through Node.js.
Command-line code to run jdbc-importer:
bin=/Users/mac/Documents/elasticsearch-jdbc-2.3.4.1/bin
lib=/Users/mac/Documents/elasticsearch-jdbc-2.3.4.1/lib
echo '{
    "type" : "jdbc",
    "jdbc" : {
        "url" : "jdbc:postgresql://localhost:5432/development",
        "sql" : "select * from \"Products\"",
        "index" : "product",
        "type" : "product",
        "elasticsearch" : {
            "cluster" : "elasticsearch",
            "host" : "localhost",
            "port" : 9300
        }
    }
}' | java \
    -cp "${lib}/*" \
    org.xbib.tools.Runner \
    org.xbib.tools.JDBCImporter
The above command created the index product in Elasticsearch, and it also contains the data from the Products table in PostgreSQL.
Now I want to use that jdbc-importer through Node.js. If anyone knows another efficient way to manage my PostgreSQL data in Elasticsearch, answers on that are welcome too.
You can use Logstash:
https://www.elastic.co/blog/logstash-jdbc-input-plugin
With it you can transfer data from Postgres into ElasticSearch. This is my config file:
input {
  jdbc {
    # Postgres jdbc connection string to our database, mydb
    jdbc_connection_string => "jdbc:postgresql://localhost:5432/MyDB"
    # The user we wish to execute our statement as
    jdbc_user => "postgres"
    jdbc_password => ""
    # The path to our downloaded jdbc driver
    jdbc_driver_library => "postgresql-42.1.1.jar"
    # The name of the driver class for Postgresql
    jdbc_driver_class => "org.postgresql.Driver"
    # our query
    statement => "SELECT * from contacts"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "contacts"
    document_type => "contact"
    document_id => "%{uid}"
  }
}