We have set up a global timeout for an Elasticsearch (5.3.3) cluster in elasticsearch.yml:
search.default_search_timeout: 1nanos
But the responses we get for this, in most cases, have "timed_out": true. Sometimes, however, we do get the expected response from Elasticsearch with "timed_out": false. In those cases we also see "took" values greater than 30, meaning Elasticsearch spent around 30 ms, which is far more than 1 nanosecond. Ideally the query should have timed out at 1 nanosecond. Is this a bug?
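For what it's worth, here is a small sketch of how one might observe this from the Python client (cluster address and index name are placeholders); note that, as far as I understand, a per-request timeout overrides search.default_search_timeout for that request:

from elasticsearch import Elasticsearch

es = Elasticsearch(['http://localhost:9200'])  # placeholder cluster address
body = {"query": {"match_all": {}}}

# With only the cluster default (1nanos) in effect:
resp = es.search(index='my_index', body=body)  # placeholder index name
print(resp['timed_out'], resp['took'])

# With an explicit per-request timeout, which overrides the default:
resp = es.search(index='my_index', body=body, timeout='10s')
print(resp['timed_out'], resp['took'])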
I'm inserting data into Elasticsearch (index A) every minute for the health check of some endpoints. I want to read index A every minute for the latest events it received and, if the state of any endpoint changes (from healthy to unhealthy or unhealthy to healthy), insert that event into index B.
How would I achieve that? If possible, could someone provide sample code? I tried the elasticsearch filter plugin but couldn't get the desired result.
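Since sample code was requested, here is a rough Python sketch of that flow; the index names, the field names (endpoint, status, @timestamp) and the use of a terms/top_hits aggregation are all assumptions, not a tested solution:

from datetime import datetime, timezone
from elasticsearch import Elasticsearch

es = Elasticsearch(['http://localhost:9200'])  # placeholder cluster address
last_state = {}  # endpoint -> last seen status, kept in memory between runs

def check_transitions():
    # Latest event per endpoint from index A (field names are assumptions).
    body = {
        "size": 0,
        "aggs": {
            "per_endpoint": {
                "terms": {"field": "endpoint.keyword", "size": 1000},
                "aggs": {
                    "latest": {
                        "top_hits": {"size": 1, "sort": [{"@timestamp": {"order": "desc"}}]}
                    }
                }
            }
        }
    }
    resp = es.search(index='index_a', body=body)
    for bucket in resp['aggregations']['per_endpoint']['buckets']:
        endpoint = bucket['key']
        status = bucket['latest']['hits']['hits'][0]['_source']['status']
        previous = last_state.get(endpoint)
        if previous is not None and previous != status:
            # State changed: record the transition in index B
            # (doc_type is only needed on pre-7.x clusters).
            es.index(index='index_b', doc_type='transition', body={
                "endpoint": endpoint,
                "from": previous,
                "to": status,
                "@timestamp": datetime.now(timezone.utc).isoformat()
            })
        last_state[endpoint] = status

# Call check_transitions() once a minute, e.g. from cron or a scheduler loop.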
I'm using Nutch 1.14 and trying to index a small web crawl into ES v5.3.0 and I keep getting this error:
ElasticIndexWriter
elastic.cluster : elastic prefix cluster
elastic.host : hostname
elastic.port : port
elastic.index : elastic index command
elastic.max.bulk.docs : elastic bulk index doc counts. (default 250)
elastic.max.bulk.size : elastic bulk index length in bytes. (default 2500500)
elastic.exponential.backoff.millis : elastic bulk exponential backoff initial delay in milliseconds. (default 100)
elastic.exponential.backoff.retries : elastic bulk exponential backoff max retries. (default 10)
elastic.bulk.close.timeout : elastic timeout for the last bulk in seconds. (default 600)
Indexer: java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:873)
at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:147)
at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:230)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:239)
Error running:
/home/david/tutorials/nutch/apache-nutch-1.14-src/runtime/local/bin/nutch index -Delastic.server.url=http://localhost:9300/search-index/ searchcrawl//crawldb -linkdb searchcrawl//linkdb searchcrawl//segments/20180824175802
Failed with exit value 255.
I've already done this and I still get the error...
UPDATE - OK, I've made progress. Indexing seems to work now - no more errors. However, when I use _stats via Kibana to check the document count, I get 0, while Nutch is telling me this:
Segment dir is complete: crawl/segments/20180830115119.
Indexer: starting at 2018-08-30 12:19:31
Indexer: deleting gone documents: false
Indexer: URL filtering: false
Indexer: URL normalizing: false
Active IndexWriters :
ElasticRestIndexWriter
elastic.rest.host : hostname
elastic.rest.port : port
elastic.rest.index : elastic index command
elastic.rest.max.bulk.docs : elastic bulk index doc counts. (default 250)
elastic.rest.max.bulk.size : elastic bulk index length. (default 2500500 ~2.5MB)
Indexer: number of documents indexed, deleted, or skipped:
Indexer: 9 indexed (add/update)
Indexer: finished at 2018-08-30 12:19:45, elapsed: 00:00:14
I'm assuming that means ES was sent 9 documents for indexing?
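One quick way to double-check the count outside Kibana is to ask Elasticsearch directly; a small sketch with the Python client (cluster address is a placeholder, index name taken from the command above):

from elasticsearch import Elasticsearch

es = Elasticsearch(['http://localhost:9200'])  # placeholder cluster address

# Refresh first so recently indexed documents are searchable, then count them.
es.indices.refresh(index='search-index')
print(es.count(index='search-index')['count'])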
I've used Elasticsearch 6.0 with Nutch 1.14 and it worked like a charm. I was using the indexer-elastic-rest plugin with port 9200; I am attaching my nutch-site.xml for reference.
I have a Python 3 script that attempts to reindex certain documents in an existing Elasticsearch index. I can't simply update the documents because I'm changing from an autogenerated id to an explicitly assigned id.
I'm currently attempting to do this by deleting existing documents using delete_by_query and then indexing once the delete is complete:
self.elasticsearch.delete_by_query(
index='%s_*' % base_index_name,
doc_type='type_a',
conflicts='proceed',
wait_for_completion=True,
refresh=True,
body={}
)
However, the index is massive, and so the delete can take several hours to finish. I'm currently getting a ReadTimeoutError, which is causing the script to crash:
WARNING:elasticsearch:Connection <Urllib3HttpConnection: X> has failed for 2 times in a row, putting on 120 second timeout.
WARNING:elasticsearch:POST X:9200/base_index_name_*/type_a/_delete_by_query?conflicts=proceed&wait_for_completion=true&refresh=true [status:N/A request:140.117s]
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='X', port=9200): Read timed out. (read timeout=140)
Is my approach correct? If so, how can I make my script wait long enough for the delete_by_query to complete? There are two timeout parameters that can be passed to delete_by_query - search_timeout and timeout - but search_timeout defaults to no timeout (which I think is what I want), and timeout doesn't seem to do what I want. Is there some other parameter I can pass to delete_by_query to make it wait as long as the delete takes to finish? Or do I need to make my script wait some other way?
Or is there some better way to do this using the Elasticsearch API?
You should set wait_for_completion to False. In this case you'll get task details and will be able to track task progress using corresponding API: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html#docs-delete-by-query-task-api
Just to expand, with code, on the approach Random explained, for an ES/Python newbie like me:
ES = Elasticsearch(['http://localhost:9200'])
query = {'query': {'match_all': {}}}
response = ES.delete_by_query(index='index_name', doc_type='sample_doc', wait_for_completion=False, body=query, ignore=[400, 404])
task_id = response['task']  # with wait_for_completion=False the call returns {'task': '<node>:<id>'}
response_task = ES.tasks.get(task_id=task_id)  # check on the task
is_completed = response_task['completed']  # if the 'completed' key is true, the task is finished
You can write a small helper that checks in a loop, at some interval, whether the task has completed; a sketch follows below.
I have used Python 3.x and Elasticsearch 6.x.
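For example, a minimal polling helper along those lines (the function name and poll interval are just illustrative):

import time

def wait_for_task(es, task_id, poll_interval=10):
    # Poll the Tasks API until the delete-by-query task is reported as completed.
    while True:
        status = es.tasks.get(task_id=task_id)
        if status['completed']:
            return status
        time.sleep(poll_interval)

# Usage with the snippet above: result = wait_for_task(ES, task_id)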
You can use the request_timeout parameter, which every client call accepts. This overrides the connection's timeout setting for that request, as mentioned here.
For example -
es.delete_by_query(index=<index_name>, body=<query>, request_timeout=300)
Or set it at the connection level, for example:
es = Elasticsearch(**get_es_connection_parms(), timeout=60)
I'm wondering if there is any way of hijacking the standard "Task timed out after 1.00 seconds" log.
A bit of context: I'm streaming Lambda function logs into AWS Elasticsearch / Kibana, and one of the things I'm logging is whether or not the function executed successfully (good to know). I've set up a test stream to ES, and I've been able to define a pattern to map what I'm logging to fields in ES.
From the function, I console log something like:
"\"FAIL\"\"Something messed up\"\"0.100 seconds\""
and with the mapping, I get a log structure like:
Status | Message             | Execution Time
FAIL   | Something messed up | 0.100 seconds
... which is lovely. However, if a log comes in like:
"Task timed out after 1.00 seconds"
then the mapping will obviously not apply. If it's picked up by ES it will likely dump the whole string into "Status", which is not ideal.
I thought perhaps I could check context.getRemainingTimeInMillis() and, if it gets within maybe 10 ms of the maximum execution time (which you can't get from the context object??), fire the custom log and suppress the default output. This feels like a hack, though.
Does anyone have much experience with logging from AWS Lambda into ES? The point of creating these custom logs with a status etc. is so that we can monitor the activity of our (many) Lambda functions; the default log format doesn't let us classify the result of a function.
**** EDIT ****
The solution I went with was to modify the Lambda function that AWS generates for shipping log lines to Elasticsearch. It would be nice if I could interface with AWS's Lambda logger to set the log format, but for now this will do!
I'll share a couple of key points about this:
The work of parsing the line and setting the custom fields is done in transform(), before the call to buildSource().
The message itself (full log line) is found in logEvent.message.
You don't just reassign the message in the desired format (in fact, leaving it alone is probably best, since the raw line is still sent to ES). The key is to set the custom fields in logEvent.extractedFields. So once I've ripped apart the log line, I set logEvent.extractedFields.status = "FAIL", logEvent.extractedFields.message = "Whoops.", and so on.
I am trying to run an aggregation pipeline using Node.js and the MongoDB native driver on a sharded MongoDB cluster with 2 shards. The MongoDB version is 2.6.1. The operation runs for about 50 minutes and then throws the error "errmsg" : "exception: getMore: cursor didn't exist on server, possible restart or timeout?". On googling I came across this link. It looks like the issue is not resolved yet. BTW, the size of the collection is about 140 million documents.
Is there a fix/workaround for this issue?
Here is the pipeline that I am trying to run. I don't know at which stage it breaks; it runs for about 50 minutes and then the error happens. The same is the case with any aggregation pipeline that I try to run.
db.collection01.aggregate([
    {$match: {"state_cd": "CA"}},
    {$group: {"_id": "$pat_id", count: {$sum: 1}}},
    {$out: "distinct_patid_count"}
  ],
  {allowDiskUse: true}
)
My guess is you could try to lower the batch size to make the cursor more "active".
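A minimal PyMongo sketch of that suggestion (connection string, database name and batch size are placeholders; the Node.js driver has a comparable cursor batch-size option):

from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017')  # placeholder connection string
coll = client['mydb']['collection01']              # placeholder database name

pipeline = [
    {"$match": {"state_cd": "CA"}},
    {"$group": {"_id": "$pat_id", "count": {"$sum": 1}}}
]

# A smaller batchSize means each getMore returns fewer documents, so the driver
# pulls from the cursor more often and it is less likely to be reaped as idle.
for doc in coll.aggregate(pipeline, allowDiskUse=True, batchSize=100):
    pass  # process each group, e.g. write it to the output collection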
I came across this error after our server had been running for more than 2.5 months. Mongo started dropping cursors even before the timeout (I guess some sort of memory issue); restarting Mongo solved our problem.