I have configured the JMS input in Logstash to subscribe to JMS topic messages and push them to Elasticsearch.
input {
  jms {
    id => "my_first_jms"
    yaml_file => "D:\softwares\logstash-6.4.0\config\jms-amq.yml"
    yaml_section => "dev"
    use_jms_timestamp => true
    pub_sub => true
    destination => "mytopic"
    # threads => 50
  }
}
filter {
  json {
    source => "message"
  }
}
output {
  stdout { codec => json }
  elasticsearch {
    hosts => ['http://localhost:9401']
    index => "jmsindex"
  }
}
System specs:
RAM: 16 GB
Type: 64 bit
Processor: Intel i5-4570T CPU @ 2.9 GHz
This is extremely slow: roughly one message every 3-4 minutes. How should I debug this to figure out what is missing?
Note: Before this, I was doing the same thing with @JMSListener in Java, and that could easily process 200-300 records per second.
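One way to narrow down where the time goes is Logstash's monitoring API (a sketch; it assumes the API is listening on its default port 9600):

# Per-plugin event counts and durations; a slow stage shows up as a large duration_in_millis
curl -s 'http://localhost:9600/_node/stats/pipelines?pretty'

# Hot threads, useful to see whether the jms input thread is mostly waiting on the broker
curl -s 'http://localhost:9600/_node/hot_threads?pretty'

If the jms input's counters barely move while the filter and output stages sit idle, the bottleneck is on the consuming side (broker or subscription settings) rather than in Elasticsearch.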
I am trying to load test my code with 50 parallel read requests.
I am querying data based on multiple indexes that I have created. The code looks something like this:
const fetchRecords = async (predicates) => {
  // Assumes the client is already connected elsewhere, e.g.
  // const Aerospike = require('aerospike'), predexp = Aerospike.predexp,
  // aeroClient = await Aerospike.connect(config)
  let query = aeroClient.query('test', 'mySet');
  let filters = [
    predexp.stringValue(predicates.load),
    predexp.stringBin('load'),
    predexp.stringEqual(),
    predexp.stringValue(predicates.disc),
    predexp.stringBin('disc'),
    predexp.stringEqual(),
    predexp.integerBin('date1'),
    predexp.integerValue(predicates.date2),
    predexp.integerGreaterEq(),
    predexp.integerBin('date2'),
    predexp.integerValue(predicates.date2),
    predexp.integerLessEq(),
    predexp.stringValue(predicates.column3),
    predexp.stringBin('column3'),
    predexp.stringEqual(),
    predexp.and(5),
  ];
  query.where(filters);

  let records = [];
  let stream = query.foreach();
  // Collect the stream into an array; reject on stream errors instead of
  // throwing from the event handler, so the caller's try/catch can see them.
  await new Promise((resolve, reject) => {
    stream.on('data', record => records.push(record));
    stream.on('error', error => reject(error));
    stream.on('end', () => resolve());
  });
  return records;
};
This fails and I get the following error:
AerospikeError: Operation not allowed at this time.
at Function.fromASError (/Users/.../node_modules/aerospike/lib/error.js:113:21)
at QueryCommand.convertError (/Users/.../node_modules/aerospike/lib/commands/command.js:91:27)
at QueryCommand.convertResponse (/Users/.../node_modules/aerospike/lib/commands/command.js:101:24)
at asCallback (/Users/.../node_modules/aerospike/lib/commands/command.js:163:24)
My aerospike.conf content:
service {
  user root
  group root
  paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1.
  pidfile /var/run/aerospike/asd.pid
  # service-threads 6 # cpu x 5 in 4.7
  # transaction-queues 6 # obsolete in 4.7
  # transaction-threads-per-queue 4 # obsolete in 4.7
  proto-fd-max 15000
}
<...trimmed section>
namespace test {
  replication-factor 2
  memory-size 1G
  default-ttl 30d # 30 days; use 0 to never expire/evict.
  nsup-period 120
  # storage-engine memory
  # To use file storage backing, comment out the line above and use the
  # following lines instead.
  storage-engine device {
    file /opt/aerospike/data/test.dat
    filesize 4G
    data-in-memory true # Store data in memory in addition to file.
  }
}
From a similar question, I found that this happens because of low system configuration.
How can I modify these settings? Also, I believe 50 requests should have worked, given that I was able to insert around 12K records/sec.
Those are scans, I guess, rather than individual reads. To increase scan-threads-limit:
asinfo -v "set-config:context=service;scan-threads-limit=128"
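To confirm the new value is applied, and (assuming your server version also accepts it as static configuration) to keep it across restarts, something along these lines should work; 128 is just the value from the command above:

# check the value currently in effect
asinfo -v "get-config:context=service" | tr ';' '\n' | grep scan-threads-limit

# aerospike.conf, service stanza (assumption: your server version accepts this as static config)
service {
  ...
  scan-threads-limit 128
}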
I am working on the Logstash JDBC input with different databases. Primarily I am trying Cassandra, using the DBSchema JDBC driver. The driver works fine with my Java JDBC code, but when I integrate it with Logstash it fails to connect. I have been trying this for many days and couldn't find an exact solution in the forums; I have tried everything, but nothing worked. I also tried MariaDB, and it fails as well.
Here are the logs from starting Logstash:
Error: com.dbschema.CassandraJdbcDriver not loaded. Are you sure you've included the correct jdbc driver in :jdbc_driver_library?
Exception: LogStash::ConfigurationError
Stack: /opt/logstash-7.1.1/vendor/bundle/jruby/2.5.0/gems/logstash-input-jdbc-4.3.13/lib/logstash/plugin_mixins/jdbc/jdbc.rb:163:in `open_jdbc_connection'
Logstash conf file (Cassandra):
# Sample Logstash configuration for creating a simple
# Beats -> Logstash -> Elasticsearch pipeline.
input {
  jdbc {
    clean_run => true
    jdbc_connection_string => "jdbc:cassandra://localhost:9042/cloud"
    jdbc_user => "cassandra"
    jdbc_password => "cassandra"
    jdbc_driver_library => "/usr/share/logstash/logstash-core/lib/jars/cassandrajdbc1.2.jar"
    jdbc_validate_connection => true
    jdbc_driver_class => "com.dbschema.CassandraJdbcDriver"
    statement => "SELECT * FROM cloud.event_history_all"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "log_cassandra"
  }
  stdout { codec => rubydebug }
}
Cassandra version: 3.11.5
ELK version: 7.1.1
Java version: 11.0.5
Thanks in advance
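A commonly suggested workaround for the "not loaded" error is to make sure the driver's dependency jars are loadable as well, since the DBSchema Cassandra driver is distributed as a bundle of several jars rather than a single self-contained one. A sketch, where the paths and jar names are only illustrative:

input {
  jdbc {
    # Option 1: list the driver jar plus every dependency jar from the bundle, comma-separated
    jdbc_driver_library => "/opt/drivers/cassandrajdbc1.2.jar,/opt/drivers/cassandra-driver-core.jar,/opt/drivers/guava.jar,/opt/drivers/netty-all.jar"
    # Option 2: copy all of the jars into logstash-core/lib/jars/ and leave this option empty,
    # so they are picked up from Logstash's own classpath
    # jdbc_driver_library => ""
    jdbc_driver_class => "com.dbschema.CassandraJdbcDriver"
    jdbc_connection_string => "jdbc:cassandra://localhost:9042/cloud"
    jdbc_user => "cassandra"
    jdbc_password => "cassandra"
    statement => "SELECT * FROM cloud.event_history_all"
  }
}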
Currently I am processing gzip files in Logstash using the file input plugin. It consumes very high memory and keeps restarting even after I give it a large heap. Right now we process about 50 files per minute on average, and we plan to process thousands of files per minute. With 100 files, the RAM requirement reaches 10 GB. What is the best way to tune this configuration, or is there a better way to process such a high volume of data in Logstash?
Is it advisable to write a processing engine in Node.js or another language instead?
Below is the Logstash conf:
input {
  file {
    id => "client-files"
    mode => "read"
    path => [ "/usr/share/logstash/plugins/message/*.gz" ]
    codec => "json"
    file_completed_action => "log_and_delete"
    file_completed_log_path => "/usr/share/logstash/logs/processed.log"
  }
}
filter {
  ruby {
    # Derive the monitor name from the file name, then zip the "Datapoints"
    # header with each row of the monitor's array into a hash.
    code => 'monitor_name = event.get("path").split("/").last.split("_").first
             event.set("monitorName", monitor_name)
             split_field = []
             event.get(monitor_name).each do |x|
               split_field << Hash[event.get("Datapoints").zip(x)]
             end
             event.set("split_field", split_field)'
  }
  split {
    field => "split_field"
  }
  ruby {
    # Promote each key/value pair of the split hash to a top-level field,
    # then drop the intermediate fields.
    code => "event.get('split_field').each {|k,v| event.set(k,v)}"
    remove_field => ["split_field","Datapoints","%{monitorName}"]
  }
}
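One knob that directly affects memory here is the number of in-flight events, since with mode => "read" every decompressed gzip file is turned into JSON events held on the heap. A sketch of the relevant logstash.yml settings; the values are only illustrative and need tuning to your event size:

# logstash.yml
pipeline.workers: 4        # fewer workers -> fewer batches in memory at once
pipeline.batch.size: 125   # events per worker batch; large JSON documents may need less
pipeline.batch.delay: 50
# queue.type: persisted    # optionally buffer events on disk instead of the heap

Lowering these trades throughput for a flatter heap profile, which is usually the right trade when the process keeps dying and restarting on memory pressure.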
I am using Logstash 2.2.4.
My current configuration:
input {
  file {
    path => "/data/events/*/*.txt"
    start_position => "beginning"
    codec => "json"
    ignore_older => 0
    max_open_files => 46000
  }
}
filter {
  if [type] not in ["A", "B", "C"] {
    drop {}
  }
}
output {
  http {
    http_method => "post"
    workers => 3
    url => "http://xxx.amazonaws.com/event"
  }
}
In the input folder, I have about 25,000 static (never-updated) txt files.
I configured --pipeline-workers to 16. With this configuration, the Logstash process runs 1255 threads and opens about 2,560,685 file descriptors.
After some investigation, I found that Logstash keeps file descriptors open for every file in the input folder, and HTTP output traffic becomes very slow.
My question is: why doesn't Logstash close the file descriptors of already processed (transferred) files, or implement some kind of input file pagination?
Has anyone run into the same problem? If so, please share your solution.
Thanks.
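If the file input version bundled with 2.2.4 supports it (close_older appeared around the same time as ignore_older and max_open_files, so it likely does), telling the input to close handles of files it has finished reading should release the descriptors; a sketch with illustrative values:

input {
  file {
    path => "/data/events/*/*.txt"
    start_position => "beginning"
    codec => "json"
    max_open_files => 4096
    close_older => 3600   # seconds; close files that have not been read from for an hour
  }
}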
I have Ganglia set up on a cluster of servers, all of which have gmond, one of which has gmetad, and one which has Logstash and Elasticsearch. I'd like to use Logstash's ganglia input plugin to collect data directly from the monitoring daemons, but I've been unsuccessful so far. My Logstash logs always show:
{:timestamp=>"2015-07-14T14:33:25.192000+0000", :message=>"ganglia udp listener died", :address=>"10.1.10.178:8664", :exception=>#, :backtrace=>["org/jruby/ext/socket/RubyUDPSocket.java:160:in `bind'", "/opt/logstash/lib/logstash/inputs/ganglia.rb:61:in `udp_listener'", "/opt/logstash/lib/logstash/inputs/ganglia.rb:39:in `run'", "/opt/logstash/lib/logstash/pipeline.rb:163:in `inputworker'", "/opt/logstash/lib/logstash/pipeline.rb:157:in `start_input'"], :level=>:warn}
Here's the input config I've been testing with:
input {
  ganglia {
    host => "10.1.10.178" # ip of logstash node
    port => 8666
    type => "ganglia_test"
  }
}
and I have this in gmond.conf on one of the gmond nodes
udp_send_channel {
  host = 10.1.10.178 # logstash node
  port = 8666
  bind_hostname = yes
}
I've found this problem too. It looks like there's been a bug in the Ganglia listener since about version 1.2 (I know it used to work in 1.1).
I managed to work around the problem by adding an explicit udp listener. This seems to satisfy Logstash and allows the Ganglia listener to keep running.
e.g.
input {
  udp {
    port => "1112"
    type => "dummy"
  }
  ganglia {
    port => "8666"
    type => "ganglia"
  }
}