Logstash - Data from Kafka to ES - logstash

Using Logstash 5.0.0, I am taking a Kafka source as the input and writing the data to Elasticsearch as the output (Elasticsearch version 5.0.0).
Logstash conf:
input{
kafka{
bootstrap_servers => "XXX.XXX.XX.XXX:9092","XXX.XXX.XX.XXX:9092","XXX.XXX.XX.XXX:9092"
topics => ["a-data","f-data","n-data"]
group_id => "sound"
auto_offset_reset => "earliest"
consumer_threads => 2
}
}
filter{
json{
source => "message"
}
}
output {
elasticsearch {
hosts => [ "XXX.XXX.XX.XXX:9200" ]
}
}
When I run the configuration above, I get the following error:
$ ./logstash -f sound.conf
Sending Logstash logs to /logstash-5.0.0/logs which is now configured via log4j2.properties.
[2017-01-17T10:53:29,273][ERROR][logstash.agent ] fetched an invalid config {:config=>"input{\nkafka{\nbootstrap_servers => \"XX.XXX.XXX.XX:9092\",\"XXX.XXX.XX.XXX:9092\",\"XXX.XXX.XX.XXX:9092\"\ntopics => [\"a-data\",\"f-data\",\"n-data\"]\ngroup_id => \"sound\"\nauto_offset_reset => \"earliest\"\nconsumer_threads => 2\n}\n}\nfilter{\njson{\nsource => \"message\"\n}\n}\noutput {\nelasticsearch {\nhosts => [ \"XX.XX.XXX.XX:9200\" ]\n}\n}\n\n", :reason=>"Expected one of #, {, } at line 3, column 40 (byte 54) after input{\nkafka{\nbootstrap_servers => \"XX.XX.XXX.XX:9092\""}
Can anyone help me with this configuration?

Shouldn't your topic setting be topics, which takes an array, whereas you've inserted the values as a hash?
topics => ["a-data","f-data","n-data"] <-- try changing this line
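The :reason in the error log reports the parser stopping right after the first quoted broker address on line 3, which points at the bootstrap_servers line rather than at topics. As far as I know, the kafka input's bootstrap_servers option takes a single string, with the brokers separated by commas inside one pair of quotes. A minimal sketch of the input block under that assumption, keeping the masked addresses from the question:

```
input {
  kafka {
    # One string: brokers are comma-separated inside the same quotes
    bootstrap_servers => "XXX.XXX.XX.XXX:9092,XXX.XXX.XX.XXX:9092,XXX.XXX.XX.XXX:9092"
    topics => ["a-data", "f-data", "n-data"]
    group_id => "sound"
    auto_offset_reset => "earliest"
    consumer_threads => 2
  }
}
```

The filter and output sections from the question can stay as they are.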

Related

Parsing json using logstash (ELK stack)

I have created a simple JSON file like the one below:
[
{
"Name": "vishnu",
"ID": 1
},
{
"Name": "vishnu",
"ID": 1
}
]
I am holding these values in a file named simple.txt. I then used Filebeat to watch the file and send new updates to port 5043. On the other side, I started the Logstash service, which listens on this port in order to parse the JSON and pass it to Elasticsearch.
Logstash is not processing the JSON values; it hangs in the middle.
Logstash config:
input {
beats {
port => 5043
host => "0.0.0.0"
client_inactivity_timeout => 3600
}
}
filter {
json {
source => "message"
}
}
output {
stdout { codec => rubydebug }
}
filebeat config:
filebeat.prospectors:
- input_type: log
paths:
- filepath
output.logstash:
hosts: ["localhost:5043"]
Logstash output:
Sending Logstash's logs to D:/elasticdb/logstash-5.6.3/logstash-5.6.3/logs which is now configured via log4j2.properties
[2017-10-31T19:01:17,574][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"fb_apache", :directory=>"D:/elasticdb/logstash-5.6.3/logstash-5.6.3/modules/fb_apache/configuration"}
[2017-10-31T19:01:17,578][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"netflow", :directory=>"D:/elasticdb/logstash-5.6.3/logstash-5.6.3/modules/netflow/configuration"}
[2017-10-31T19:01:18,301][INFO ][logstash.pipeline ] Starting pipeline {"id"=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>250}
[2017-10-31T19:01:18,388][INFO ][logstash.inputs.beats ] Beats inputs: Starting input listener {:address=>"0.0.0.0:5043"}
[2017-10-31T19:01:18,573][INFO ][logstash.pipeline ] Pipeline main started
[2017-10-31T19:01:18,591][INFO ][org.logstash.beats.Server] Starting server on port: 5043
[2017-10-31T19:01:18,697][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
Every time, I run Logstash using the command
logstash -f logstash.conf
Since the JSON is not processed, I stop the service by pressing Ctrl+C.
Please help me find a solution. Thanks in advance.
Finally, I ended up with a config like this. It works for me.
input
{
file
{
codec => multiline
{
pattern => '^\{'
negate => true
what => previous
}
path => "D:\elasticdb\logstash-tutorial.log\Test.txt"
start_position => "beginning"
sincedb_path => "D:\elasticdb\logstash-tutorial.log\null"
exclude => "*.gz"
}
}
filter {
json {
source => "message"
remove_field => ["path","@timestamp","@version","host","message"]
}
}
output {
elasticsearch { hosts => ["localhost"]
index => "logs"
"document_type" => "json_from_logstash_attempt3"
}
stdout{}
}
JSON format:
{"name":"sachin","ID":"1","TS":1351146569}
{"name":"sachin","ID":"1","TS":1351146569}
{"name":"sachin","ID":"1","TS":1351146569}
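Since each line in this format is a complete JSON object, the multiline codec may not be needed for it. A minimal sketch of a simpler file input, assuming one JSON document per line; the forward-slash path and the NUL sincedb_path (the Windows counterpart of /dev/null) are my assumptions, not part of the config above:

```
input {
  file {
    # Assumption: same file as above, written with forward slashes
    path => "D:/elasticdb/logstash-tutorial.log/Test.txt"
    start_position => "beginning"
    # Assumption: no position tracking wanted, so discard the sincedb
    sincedb_path => "NUL"
    # Decode every line as one JSON document
    codec => json
  }
}
```

With the codec doing the decoding, the json filter would no longer need to parse the message field.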

Get JSON from file

Logstash 5.2.1
I can't read JSON documents from a local file using Logstash. There are no documents in the stdout.
I run Logstash like this:
./logstash-5.2.1/bin/logstash -f logstash-5.2.1/config/shakespeare.conf --config.reload.automatic
Logstash config:
input {
file {
path => "/home/trex/Development/Shipping_Data_To_ES/shakespeare.json"
codec => json {}
start_position => "beginning"
}
}
output {
stdout {
codec => rubydebug
}
}
Also, I tried with charset:
...
codec => json {
charset => "UTF-8"
}
...
Also, I tried with/without json codec in the input and with filter:
...
filter {
json {
source => "message"
}
}
...
Logstash console after start:
[2017-02-28T11:37:29,947][WARN ][logstash.agent ] fetched new config for pipeline. upgrading.. {:pipeline=>"main", :config=>"input {\n file {\n path => \"/home/trex/Development/Shipping_Data_To_ES/shakespeare.json\"\n codec => json {\n charset => \"UTF-8\"\n }\n start_position => \"beginning\"\n }\n}\n#filter {\n# json {\n# source => \"message\"\n# }\n#}\noutput {\n stdout {\n codec => rubydebug\n }\n}\n\n"}
[2017-02-28T11:37:29,951][WARN ][logstash.agent ] stopping pipeline {:id=>"main"}
[2017-02-28T11:37:30,434][INFO ][logstash.pipeline ] Starting pipeline {"id"=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>500}
[2017-02-28T11:37:30,446][INFO ][logstash.pipeline ] Pipeline main started
^C[2017-02-28T11:40:55,039][WARN ][logstash.runner ] SIGINT received. Shutting down the agent.
[2017-02-28T11:40:55,049][WARN ][logstash.agent ] stopping pipeline {:id=>"main"}
^C[2017-02-28T11:40:55,475][FATAL][logstash.runner ] SIGINT received. Terminating immediately..
The signal INT is in use by the JVM and will not work correctly on this platform
[trex@Latitude-E5510 Shipping_Data_To_ES]$ ./logstash-5.2.1/bin/logstash -f logstash-5.2.1/config/shakespeare.conf --config.test_and_exit
^C[trex@Latitude-E5510 Shipping_Data_To_ES]$ ./logstash-5.2.1/bin/logstash -f logstash-5.2.1/config/shakespeare.conf --confireload.automatic
^C[trex@Latitude-E5510 Shipping_Data_To_ES]$ ./logstash-5.2.1/bin/logstash -f logstash-5.2.1/config/shakespeare.conf --config.reload.aumatic
Sending Logstash's logs to /home/trex/Development/Shipping_Data_To_ES/logstash-5.2.1/logs which is now configured via log4j2.properties
[2017-02-28T11:45:48,752][INFO ][logstash.pipeline ] Starting pipeline {"id"=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>500}
[2017-02-28T11:45:48,785][INFO ][logstash.pipeline ] Pipeline main started
[2017-02-28T11:45:48,875][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
Why doesn't Logstash write my JSON documents to stdout?
Did you try including the file type within your file input?
input {
file {
path => "/home/trex/Development/Shipping_Data_To_ES/shakespeare.json"
type => "json" # <-- add this
# codec => json {} <-- commented out for the moment
start_position => "beginning"
}
}
And then have your filter as such:
filter{
json{
source => "message"
}
}
Or, if you're going with the codec plugin, make sure to use this syntax within your input:
codec => "json"
Or you might want to try out the json_lines codec as well. Hope this thread comes in handy.
It appears that sincedb_path is important to read JSON files. I was able to import the JSON only after adding this option. It is needed to maintain the current position in the file to be able to resume from that position in case the import is interrupted. I don't need any position tracking, so I just set this to /dev/null and it works.
The basic working Logstash configuration:
input {
file {
path => ["/home/trex/Development/Shipping_Data_To_ES/shakespeare.json"]
start_position => "beginning"
sincedb_path => "/dev/null"
}
}
output {
stdout {
codec => json_lines
}
elasticsearch {
hosts => ["localhost:9200"]
index => "shakespeare"
}
}

Logstash in EC2 can't send log data to AWS Elasticsearch service

On EC2 I have configured Logstash as below:
input {
# beats{
# port => 5044
# }
file {
type => "adjustlog"
path => "/etc/logstash/conf.d/sample.log"
start_position => "beginning"
sincedb_path => "/dev/null"
}
}
filter {
if[type] == 'adjustlog'{
grok {
match => {
"message" => [
"%{TIMESTAMP_ISO8601:timestamp},(%{USERNAME:userId})?,%{USERNAME:setlkey},%{USERNAME:uniqueId},%{NUMBER:providerId},%{USERNAME:itemCode},%{USERNAME:voucherCode},%{USERNAME:samsCode},(%{USERNAME:serviceType})?"
]
}
}
}else {
drop{ }
}
}
output {
elasticsearch{
hosts => ["search-*.es.amazonaws.com:80"]
index => "test"
}
stdout {codec => rubydebug}
}
However, Logstash can't create the index in AWS Elasticsearch or send the log data. (curl and wget work fine, and I can create the index with curl.)
The error log is:
Attempted to send a bulk request to Elasticsearch configured at '["http://search-*.es.amazonaws.com/"]', but an error occurred and it failed! Are you sure you can reach elasticsearch from this machine using the configuration provided? {:error_message=>"search*.es.amazonaws.com:80 failed to respond", :error_class=>"Manticore::ClientProtocolException", :backtrace=>["/opt/logstash/vendor/bundle/jruby/1.9/gems/manticore-0.6.0-java/lib/manticore/response.rb:37:in `initialize'", "org/jruby/RubyProc.java:281:in `call'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/manticore-0.6.0-java/lib/manticore/response.rb:79:in `call'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/manticore-0.6.0-java/lib/manticore/response.rb:256:in `call_once'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/manticore-0.6.0-java/lib/manticore/response.rb:153:in `code'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.17/lib/elasticsearch/transport/transport/http/manticore.rb:84:in `perform_request'", "org/jruby/RubyProc.java:281:in `call'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.17/lib/elasticsearch/transport/transport/base.rb:257:in `perform_request'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.17/lib/elasticsearch/transport/transport/http/manticore.rb:67:in `perform_request'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.17/lib/elasticsearch/transport/client.rb:128:in `perform_request'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-api-1.0.17/lib/elasticsearch/api/actions/bulk.rb:88:in `bulk'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-2.7.0-java/lib/logstash/outputs/elasticsearch/http_client.rb:53:in `non_threadsafe_bulk'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-2.7.0-java/lib/logstash/outputs/elasticsearch/http_client.rb:38:in `bulk'", "org/jruby/ext/thread/Mutex.java:149:in `synchronize'", 
"/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-2.7.0-java/lib/logstash/outputs/elasticsearch/http_client.rb:38:in `bulk'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-2.7.0-java/lib/logstash/outputs/elasticsearch/common.rb:172:in `safe_bulk'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-2.7.0-java/lib/logstash/outputs/elasticsearch/common.rb:101:in `submit'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-2.7.0-java/lib/logstash/outputs/elasticsearch/common.rb:86:in `retrying_submit'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-2.7.0-java/lib/logstash/outputs/elasticsearch/common.rb:29:in `multi_receive'", "org/jruby/RubyArray.java:1653:in `each_slice'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-2.7.0-java/lib/logstash/outputs/elasticsearch/common.rb:28:in `multi_receive'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.3-java/lib/logstash/output_delegator.rb:130:in `worker_multi_receive'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.3-java/lib/logstash/output_delegator.rb:114:in `multi_receive'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.3-java/lib/logstash/pipeline.rb:301:in `output_batch'", "org/jruby/RubyHash.java:1342:in `each'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.3-java/lib/logstash/pipeline.rb:301:in `output_batch'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.3-java/lib/logstash/pipeline.rb:232:in `worker_loop'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.3-java/lib/logstash/pipeline.rb:201:in `start_workers'"], :client_config=>{:hosts=>["http://search*.es.amazonaws.com/"], :ssl=>nil, :transport_options=>{:socket_timeout=>0, :request_timeout=>0, :proxy=>nil, :ssl=>{}}, :transport_class=>Elasticsearch::Transport::Transport::HTTP::Manticore, :logger=>nil, :tracer=>nil, 
:reload_connections=>false, :retry_on_failure=>false, :reload_on_failure=>false, :randomize_hosts=>false, :http=>{:scheme=>"http", :user=>nil, :password=>nil, :port=>80}}, :level=>:error}
What should I check to debug this?
I found this when trying to fix a similar issue. AWS has changed how it implements Elasticsearch node discovery. It will work fine until logstash tries to discover more hosts at which point it breaks. Restarting logstash temporarily but inconsistently fixes the issue. curl and wget work fine too.
:message=>"Cannot get new connection from pool.", :class=>"Elasticsearch::Transport::Transport::Error", :backtrace=>["/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.15/lib/elasticsearch/transport/transport/base.rb:193:in `perform_request'",
ElasticSearch would work for a bit but then stop ingesting data.
Old config which failed
output {
elasticsearch {
hosts => ["https://search-*.us-east-1.es.amazonaws.com"]
sniffing => true
manage_template => false
index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
document_type => "%{[@metadata][type]}"
}
}
Logstash tries to get a list of hosts from Elasticsearch, but AWS's implementation has changed the format of the data returned. For more details on the specifics, see: https://forums.aws.amazon.com/thread.jspa?threadID=222600
https://discuss.elastic.co/t/elasitcsearch-ruby-raises-cannot-get-new-connection-from-pool-error/36252/11
The working config.
output
{
elasticsearch {
hosts => ["https://search-*.us-east-1.es.amazonaws.com"]
manage_template => false
index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
document_type => "%{[@metadata][type]}"
}
}
tomwj

How to see the requests sent by LogStash to the elasticsearch output in Fiddler?

I have LS_JAVA_OPTS = -DproxySet=true -Dhttp.proxyHost=127.0.0.1 -Dhttp.proxyPort=8888
And yet, I see no traffic to my elasticsearch node from logstash in Fiddler.
I know my elasticsearch is up and running. When I curl it, Fiddler clearly shows the requests, so it is something about jruby that does not route requests through Fiddler.
I am not calling jruby directly. Rather I use the bin\logstash.bat script.
Appendix
My conf file:
input {
file {
path => 'c:/log/bje-Error.log'
sincedb_path => "NUL"
codec => plain {
charset => "ISO-8859-1"
}
codec => multiline {
pattern => "^%{TIMESTAMP_ISO8601} "
negate => true
what => previous
}
start_position => beginning
ignore_older => 0
}
}
filter {
grok {
match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} \[%{BASE10NUM:thread:int}] %{WORD:machine}:%{WORD:service} \[%{BASE10NUM:localId:int}?:%{UUID:logId}?:(?<jobKind>[^:]+)?:%{BASE10NUM:jobDefinitionId:int}? %{WORD:namespace}?:%{WORD:job}?:(?<customCtx>[^\]]*)\] %{LOGLEVEL:level} %{NOTSPACE:logger} - (?<text>(?m:.*))" }
}
}
output {
stdout { codec => rubydebug }
elasticsearch {
document_type => 'logs_bje'
hosts => ["ncesearch01"]
}
}
Testing in powershell:
PS E:\logstash-2.3.2\bin> (ConvertFrom-Json((Invoke-WebRequest "http://ncesearch01:9200/logstash-*/_count").Content)).count
24666
PS E:\logstash-2.3.2\bin> .\logstash.bat -f C:\dayforce\DayforceDEV\elk\logstach.conf
LS_JAVA_OPTS was set to [-DproxySet=true -Dhttp.proxyHost=127.0.0.1 -Dhttp.proxyPort=8888]. This will be appended to the JAVA_OPTS [ -XX:HeapDumpPath="$LS_HOME/heapdump.hprof"]
io/console not supported; tty will not be manipulated
Settings: Default pipeline workers: 12
Pipeline main started
{
"message" => "2016-05-02 16:00:05.7079 [111] CANWS212:MyBJE [2251:e2737eeb-40d6-4b0e-9608-75ee3de894d3:ScheduledInstance:16 DFUnitTest:BillingDataCollectionJob:] ERROR
SharpTop.Engine.BackgroundJobs.Billing.BillingDataCollectionJob - The client database version is not defined in DFDatabaseIdentification \r",
"@version" => "1",
"@timestamp" => "2016-05-03T03:40:50.531Z",
"path" => "c:/log/bje-Error.log",
"host" => "CANWS212",
"timestamp" => "2016-05-02 16:00:05.7079",
"thread" => 111,
"machine" => "CANWS212",
"service" => "MyBJE",
"localId" => 2251,
"logId" => "e2737eeb-40d6-4b0e-9608-75ee3de894d3",
"jobKind" => "ScheduledInstance",
"jobDefinitionId" => 16,
"namespace" => "DFUnitTest",
"job" => "BillingDataCollectionJob",
"level" => "ERROR",
"logger" => "SharpTop.Engine.BackgroundJobs.Billing.BillingDataCollectionJob",
"text" => "The client database version is not defined in DFDatabaseIdentification \r"
}
^CTerminate batch job (Y/N)? SIGINT received. Shutting down the agent. {:level=>:warn}
stopping pipeline {:id=>"main"}
Pipeline main has been shutdown
The signal HUP is in use by the JVM and will not work correctly on this platform
^CPS E:\logstash-2.3.2\bin> (ConvertFrom-Json((Invoke-WebRequest "http://ncesearch01:9200/logstash-*/_count").Content)).count
24667
PS E:\logstash-2.3.2\bin>
As you can see, http://ncesearch01:9200/logstash-*/_count returns incremented count, hence running logstash did send a request to the elasticsearch. However, it bypassed Fiddler, despite the LS_JAVA_OPTS.
I found some possible reasons for this behavior, although I have not tried them. Maybe this answer should be called a "discussion"; I'm sorry.
1. You may need Linux instead of Windows. I am not sure whether this issue has been dealt with in the latest Logstash version. You may be interested in this: Make JAVA_OPTS and LS_JAVA_OPTS work consistently on Windows
2. As far as I can see, the most likely cause is that the Logstash elasticsearch output plugin has used HTTP to send messages since Logstash 2.0. Could you be using an older version?
More info about the elasticsearch output plugin: logstash-output-plugin-elasticsearch
If anyone has any ideas, please share them.
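One more thing that may be worth trying, on the assumption that the elasticsearch output's HTTP client does not pick up the http.proxyHost and http.proxyPort JVM properties: the output plugin has its own proxy option. A sketch, assuming Fiddler listens on 127.0.0.1:8888 as in the LS_JAVA_OPTS from the question:

```
output {
  stdout { codec => rubydebug }
  elasticsearch {
    document_type => 'logs_bje'
    hosts => ["ncesearch01"]
    # Assumption: Fiddler's listener, taken from LS_JAVA_OPTS in the question
    proxy => "http://127.0.0.1:8888"
  }
}
```

If the requests then show up in Fiddler, the JVM-level proxy properties were simply not being consulted by the plugin.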

statsd not working in my Logstash

The config file:
# input are the kafka messages
input
{
kafka
{
topic_id => 'test2'
}
}
# Try to match sensor info
filter
{
json { source => "message"}
}
# StatsD and stdout output
output
{
stdout
{
codec => line
{
format => "%{[testmessage][0][key]}"
}
}
stdout { codec=>rubydebug }
statsd
{
host => "localhost"
port => 8125
increment => ["test.%{[testmessage][0][key]}"]
}
}
Input kafka message:
{"testmessage":[{"key":"key-1234"}]}
Output:
key-1234
{
"testmessage" => [
[0] {
"key" => "key-1234"
}
],
"@version" => "1",
"@timestamp" => "2015-11-09T20:11:52.374Z"
}
Log:
{:timestamp=>"2015-11-09T20:29:03.562000+0000", :message=>"Done running kafka input", :level=>:info}
{:timestamp=>"2015-11-09T20:29:03.563000+0000", :message=>"Plugin is finished", :plugin=><LogStash::Outputs::Stdout codec=><LogStash::Codecs::Line format=>"%{[testmessage][0][key]}", charset=>"UTF-8">, workers=>1>, :level=>:info}
{:timestamp=>"2015-11-09T20:29:03.564000+0000", :message=>"Plugin is finished", :plugin=><LogStash::Outputs::Statsd increment=>["test1.test", "test.%{[testmessage][0][key]}"], codec=><LogStash::Codecs::Plain charset=>"UTF-8">, workers=>1, host=>"localhost", port=>8125, namespace=>"logstash", sender=>"%{host}", sample_rate=>1, debug=>false>, :level=>:info}
{:timestamp=>"2015-11-09T20:29:03.564000+0000", :message=>"Pipeline shutdown complete.", :level=>:info}
It is very weird that statsd does not work in my Logstash. I looked at lots of examples found via Google, but I have no idea why. Any suggestions are welcome. Thanks.
I found the reason: logstash-output-statsd uses UDP by default, but my statsd server is set to use TCP.
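For anyone hitting the same mismatch, there are two ways out: switch the statsd server to UDP, or make the output speak TCP. A sketch of the second route, assuming a logstash-output-statsd version recent enough to offer a protocol option (older versions, as far as I know, only send UDP, so check the plugin documentation for your version):

```
output {
  statsd {
    host => "localhost"
    port => 8125
    # Assumption: this option exists in your plugin version; the default is UDP
    protocol => "tcp"
    increment => ["test.%{[testmessage][0][key]}"]
  }
}
```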

Resources