Insert Events with Changed Status Only using Logstash - logstash

I'm inserting data into Elasticsearch (index A) every minute for the health check of some endpoints. I want to read index A every minute for the last events it received and, if the state of any endpoint changes (from healthy to unhealthy or unhealthy to healthy), insert that event into index B.
How would I achieve that? If possible, could someone provide sample code? I tried the elasticsearch filter plugin but couldn't get the desired result.
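One possible approach, as a rough sketch rather than a tested config: in the pipeline that writes the health checks, use the elasticsearch filter to look up the most recent document in index A for the same endpoint, copy its status onto the current event, and only send the event to index B when the two differ. The endpoint and status field names and the index names below are assumptions about your documents:

filter {
  elasticsearch {
    hosts  => ["localhost:9200"]
    index  => "index-a"
    query  => "endpoint:%{endpoint}"              # assumes an 'endpoint' field
    sort   => "@timestamp:desc"                   # newest document first
    fields => { "status" => "previous_status" }   # copy its status onto this event
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "index-a"                            # always store the health check
  }
  if [previous_status] and [status] != [previous_status] {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "index-b"                          # state changed: record the transition
    }
  }
}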

Related

How to detect & act on an error in logstash elasticsearch output

When using the Logstash elasticsearch output, I'm trying to detect any errors and, if an error occurs, do something else with the message. Is this possible?
Specifically, I'm using fingerprinting to allocate a document id, and I want to use elasticsearch output action "create" to throw an error if that document id already exists - but in this case I want to push these potential duplicates elsewhere (probably another elasticsearch index) so I can verify that they are in fact duplicates.
Is this possible? It seems like the Dead Letter Queue might do what I want - except that https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html#_retry_policy states that 409 conflict errors are ignored.
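For reference, a minimal sketch of the setup being described (the source field, index name, and hosts are illustrative):

filter {
  fingerprint {
    source => "message"                        # field to hash
    target => "[@metadata][fingerprint]"
    method => "SHA256"
  }
}

output {
  elasticsearch {
    hosts       => ["localhost:9200"]
    index       => "events"
    document_id => "%{[@metadata][fingerprint]}"
    action      => "create"                    # returns a 409 conflict if the id already exists
  }
}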

Identify if a logstash pipeline is complete

I have the two queries below.
1. Is there a specific key in the Logstash API response that can identify that a pipeline (incremental) completed successfully? I checked the pipeline stats, which give event counts (in, filtered, out), but did not find anything that clearly says a pipeline completed successfully.
2. In the Logstash logs for a particular run I get the result below; what does the value in parentheses (seconds) denote here?
[2018-07-14T01:00:05,117][INFO ][logstash.inputs.jdbc ] (4.753178s) SELECT a.*,....
[2018-07-14T02:00:45,221][INFO ][logstash.inputs.jdbc ] (42.543719s) SELECT pkey_ AS job_id,.....
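For what it's worth, the per-pipeline event counters mentioned above come from the Logstash node stats API, which you can poll yourself (assuming the default API port 9600):

curl -s 'http://localhost:9600/_node/stats/pipelines?pretty'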

ELK reading and storing log4j logs

We have around 50 servers from which we get log4j logs. The folders that log4j writes to are mounted on a machine running Logstash, which pushes these logs into Elasticsearch. It creates an index in Elasticsearch called logstash-2018.06.25, where it stores all the log information. Now I have to delete the old logs. I have read on the internet that delete-by-query wouldn't be a good way to do this; rather, we should delete them using Curator (Elasticsearch). I have read that Curator can delete a whole index. How can I configure my Logstash so that it creates indices based on the date?
So it will create one index per day:
the 25-Jun-2018 index would be created on 25-Jun-2018;
similarly, the 26-Jun-2018 index would be created on 26-Jun-2018.
This way I would be able to drop the indices for older days, and with this approach I would get faster Elasticsearch performance.
How do I configure my Logstash to achieve this?
In the elasticsearch output plugin you can configure the index name as follows:
output {
  elasticsearch {
    ...
    index => "logstash-%{+YYYY.MM.dd}"
    ...
  }
}
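With daily indices in place, a Curator action file along these lines can then delete the old ones; the 30-day retention below is only an example threshold:

actions:
  1:
    action: delete_indices
    description: Delete logstash- indices older than 30 days
    options:
      ignore_empty_list: True
    filters:
    - filtertype: pattern
      kind: prefix
      value: logstash-
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 30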

AWS Lambda Function (Node) - Custom timeout logging

I'm wondering if there is any way of hijacking the standard "Task timed out after 1.00 seconds" log.
Bit of context: I'm streaming Lambda function logs into AWS Elasticsearch / Kibana, and one of the things I'm logging is whether or not the function executed successfully (good to know). I've set up a test stream to ES, and I've been able to define a pattern to map what I'm logging to fields in ES.
From the function, I console log something like:
"\"FAIL\"\"Something messed up\"\"0.100 seconds\""
and with the mapping, I get a log structure like:
Status    Message                Execution Time
FAIL      Something messed up    0.100 seconds
... Which is lovely. However if a log comes in like:
"Task timed out after 1.00 seconds"
then the mapping will obviously not apply. If it's picked up by ES it will likely dump the whole string into "Status", which is not ideal.
I thought perhaps I could query context.getRemainingTimeInMillis() and, if it gets within maybe 10 ms of the max execution time (which you can't get from the context object??), fire the custom log and ignore the default output. This, however, feels like a hack.
Does anyone have much experience with logging from AWS Lambda into ES? The point of creating these custom logs with a status etc. is so that we can monitor the activity of the (many) Lambda functions, and the default log format doesn't let us classify the result of a function.
**** EDIT ****
The solution I went with was to modify the lambda function generated by AWS for sending log lines to Elasticsearch. It would be nice if I could interface with AWS's lambda logger to set the log format, however for now this will do!
I'll share a couple of key points about this:
The work of parsing the line and setting the custom fields is done in transform(), before the call to buildSource().
The message itself (the full log line) is found in logEvent.message.
You don't reassign the message itself to the desired format (in fact, leaving it alone is probably best, since the raw line is still sent to ES). The key is to set the custom fields in logEvent.extractedFields. So once I've ripped apart the log line, I set logEvent.extractedFields.status = "FAIL", logEvent.extractedFields.message = "Whoops.", and so on.
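A rough sketch of what that parsing step can look like inside the AWS-generated streaming function; the regexes and field names here are illustrative, not the exact code:

// Called from transform() before buildSource(); logEvent.message holds the raw line.
function extractStatusFields(logEvent) {
    var line = logEvent.message;
    logEvent.extractedFields = logEvent.extractedFields || {};

    var timeout = line.match(/Task timed out after ([\d.]+) seconds/);
    if (timeout) {
        // Default timeout line: classify it ourselves instead of letting
        // the whole string land in "Status".
        logEvent.extractedFields.status = 'FAIL';
        logEvent.extractedFields.message = 'Task timed out';
        logEvent.extractedFields.execution_time = timeout[1] + ' seconds';
        return;
    }

    var custom = line.match(/"(\w+)""([^"]*)""([^"]*)"/);
    if (custom) {
        // Our own '"FAIL""Something messed up""0.100 seconds"' format.
        logEvent.extractedFields.status = custom[1];
        logEvent.extractedFields.message = custom[2];
        logEvent.extractedFields.execution_time = custom[3];
    }
}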

Sphinx search crashes when there are no indexes

I want Sphinx's searchd to start up, but no indexes have been populated yet. I have a separate cron job that pulls data from a data source and then calls the indexer to generate the indexes.
So the first time searchd starts, the cron job has not yet run, hence there are no indexes, and searchd fails with errors like:
FATAL: no valid indexes to serve
Is there any way to get around this? E.g. to start searchd even when there are no indexes, and if someone searches against it during that time, it just returns no docids. Later, when the cron job runs, the indexes will be populated and then searchd can query those indexes.
if someone searches against it during that time, it just returns no docids.
That would require an actual index to search against.
Just create an empty index. Then when the indexer runs, it recreates the index (with data this time) and notifies searchd, using the --rotate switch.
An example of a way to produce an 'empty' index, as provided by #ctx (added Dec 2014):
source force {
    type            = xmlpipe2
    xmlpipe_command = cat /tmp/test.xml
}

index force {
    source       = force
    path         = /path/to/sphinx/datadir/filename
    charset_type = utf-8
}
/tmp/test.xml:
<?xml version="1.0" encoding="utf-8"?>
<sphinx:docset>
  <sphinx:schema>
    <sphinx:field name="subject"/>
  </sphinx:schema>
</sphinx:docset>
Run indexer force, and searchd should now be able to run.
Alternatively, you can use something like sql_query = SELECT 1, '' but that does require a connection to a real database server.
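A sketch of that SQL-based alternative; this source definition would replace the xmlpipe2 one, and the connection settings are placeholders:

source force {
    type      = mysql
    sql_host  = localhost
    sql_user  = username
    sql_pass  = password
    sql_db    = database
    sql_query = SELECT 1, ''
}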
