Running elasticsearch queries on linux

What's the correct way of running elasticsearch queries on linux? I came up with the code below, but it seems incorrect given the many errors I see.
curl -X GET http://localhost:9200/INDEXED_REPOSITORY/_search?q="constant_score" : {"filter" : { "terms" : { "description" : ["heart", "cancer", and more than 10000 keywords ]}}}}

You're missing a few things; do it like this:
curl -X GET http://localhost:9200/INDEXED_REPOSITORY/_search -d '{
  "query": {
    "constant_score": {
      "filter" : {
        "terms" : {
          "description" : ["heart", "cancer", and more than 10000 keywords ]
        }
      }
    }
  }
}'
or on a single line:
curl -X GET http://localhost:9200/INDEXED_REPOSITORY/_search -d '{"query": {"constant_score": {"filter" : {"terms" : {"description" : ["heart", "cancer", and more than 10000 keywords ]}}}}}'
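Since the real body contains more than 10000 keywords, it quickly becomes unwieldy on the command line. One workaround (a sketch; the filename query.json and the two placeholder terms are mine, not from the question) is to keep the body in a file and pass it with curl's -d @file syntax:

```shell
# Write the query body to a file (only two placeholder terms shown here;
# the real list would hold the full set of keywords)
cat > query.json <<'EOF'
{
  "query": {
    "constant_score": {
      "filter": {
        "terms": { "description": ["heart", "cancer"] }
      }
    }
  }
}
EOF

# Then reference the file instead of inlining the JSON:
# curl -XGET 'http://localhost:9200/INDEXED_REPOSITORY/_search' -d @query.json
```

This also sidesteps shell-quoting problems with large inline JSON bodies.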


Remove string in file on linux

I am looking to remove the _id field and its value, which changes from line to line:
{ "_id" : { "$oid" : "54da1bee58743hd23947f493" }, "name":"david", "age":"33"}
{ "_id" : { "$oid" : "5422222222222345d9f1f493" }, "name":"Dove", "age":"33"}
{ "_id" : { "$oid" : "54da1be57a4b727669f1f493" }, "name":"man", "age":"23"}
Desired outcome:
{"name":"david","age":"33"}
{"name":"Dove", "age":"33"}
{"name":"man", "age":"23"}
I would like to use sed or any other command.
It is easy with jq.
jq -c 'del(._id)' input.txt
Output:
{"name":"david","age":"33"}
{"name":"Dove","age":"33"}
{"name":"man","age":"23"}
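If jq isn't available, a sed version can handle this particular input shape, though it's more fragile: the expression below (my own, not from the answer) assumes _id is the first field and contains exactly one nested object with no further braces.

```shell
# Strip the leading "_id" object; fragile compared to jq, since it
# relies on the field order and brace structure shown in the question
printf '%s\n' '{ "_id" : { "$oid" : "54da1bee58743hd23947f493" }, "name":"david", "age":"33"}' \
  | sed 's/"_id" : {[^}]*}, //'
```

Unlike jq -c, this keeps the original spacing of the remaining fields.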

Analyzing apache access logs with elasticsearch Watcher

I am using the ELK Stack to analyze logs, and I need to analyze apache access logs and detect anomalies in them. What can I analyze from apache access logs, and how should I specify the conditions in the curl -XPUT request to Watcher?
If you haven't found it already, there's a decent tutorial at https://www.elastic.co/guide/en/watcher/watcher-1.0/watch-log-data.html. It provides a basic example of creating a log watch.
You can analyze/watch anything that you can query in Elasticsearch. It's just a matter of formatting the query with the correct JSON syntax. The guide for crafting the conditions is at https://www.elastic.co/guide/en/watcher/watcher-1.0/condition.html.
You'll also want to look at https://www.elastic.co/guide/en/watcher/watcher-1.0/actions.html to get an idea of the possible actions Watcher can take when a query meets a condition.
As for posting to Watcher, each watch is essentially a JSON object. Because watches can get pretty elaborate, I've found it's best to create a file for each watch you want to create, and post them like this:
curl -XPUT http://my_elasticsearch:9200/_watcher/watch/my_watch_name -d @/path/to/my_watch_name.json
my_watch_name.json should have these basic elements (as described in the first link above):
{
  "trigger" : { ... },
  "input" : { ... },
  "condition" : { ... },
  "actions" : { ... }
}
The actions section is going to be specific to your use case, but here's a basic example of the other sections that I'm using successfully:
{
  "trigger" : {
    "schedule" : { "interval" : "5m" }
  },
  "input" : {
    "search" : {
      "request" : {
        "indices" : [ "logstash" ],
        "body" : {
          "query" : {
            "filtered" : {
              "query" : {
                "match" : { "message" : "error" }
              },
              "filter" : {
                "range" : { "@timestamp" : { "gte" : "now-5m" } }
              }
            }
          }
        }
      }
    }
  },
  "condition" : {
    "compare" : { "ctx.payload.hits.total" : { "gt" : 0 } }
  },
  "actions" : {
    ...
  }
}
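One possible filler for the elided actions section is Watcher's logging action; the action name log_hits and the message text below are my own invention, not part of the original answer.

```shell
# Hypothetical "actions" section; "log_hits" is an arbitrary label, and
# the mustache template echoes the hit count from the watch's search input
actions='{
  "actions" : {
    "log_hits" : {
      "logging" : {
        "text" : "Found {{ctx.payload.hits.total}} error events in the last 5 minutes"
      }
    }
  }
}'
printf '%s\n' "$actions"
```

Email and webhook actions follow the same pattern; see the actions guide linked above.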

How to get the last update of the index in elasticsearch

How can I find the datetime of the last update of an elasticsearch index?
I tried to follow the example in Elasticsearch index last update time, but nothing happened.
curl -XGET 'http://localhost:9200/_all/_mapping'
{"haystack":{"mappings":{"modelresult":{"_all":{"auto_boost":true},"_boost":{"name":"boost","null_value":1.0},"properties":{"act_name":{"type":"string","boost":1.3,"index_analyzer":"index_ngram","search_analyzer":"search_ngram"},"django_ct":{"type":"string","index":"not_analyzed","include_in_all":false},"django_id":{"type":"string","index":"not_analyzed","include_in_all":false},"hometown":{"type":"string","boost":0.9,"index_analyzer":"index_ngram","search_analyzer":"search_ngram"},"id":{"type":"string"},"text":{"type":"string","analyzer":"ngram_analyzer"}}},"mytype":{"_timestamp":{"enabled":true,"store":true},"properties":{}}}}}
curl -XPOST localhost:9200/your_index/your_type/_search -d '{
  "size": 1,
  "sort": {
    "_timestamp": "desc"
  },
  "fields": [
    "_timestamp"
  ]
}'
{"took":2,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":99,"max_score":null,"hits":[{"_index":"haystack","_type":"modelresult","_id":"account.user.96","_score":null,"sort":[-9223372036854775808]}]}}
What is wrong?
First, you need to proceed as in that linked question and enable the _timestamp field in your mapping (the sort value -9223372036854775808, i.e. Long.MIN_VALUE, in your response is a hint that your documents currently have no _timestamp stored):
{
  "modelresult" : {
    "_timestamp" : { "enabled" : true }
  }
}
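Going by the mapping output in the question, the index is haystack and the type is modelresult, so the mapping update could be applied as sketched below; treat the index/type names as assumptions drawn from that output.

```shell
# Mapping body that enables _timestamp; existing documents will NOT
# retroactively get a timestamp - only documents indexed afterwards do
mapping='{
  "modelresult" : {
    "_timestamp" : { "enabled" : true, "store" : true }
  }
}'
printf '%s\n' "$mapping"

# curl -XPUT 'localhost:9200/haystack/modelresult/_mapping' -d "$mapping"
```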
Then you can query your index for a single document with the most recent timestamp like this:
curl -XPOST localhost:9200/haystack/modelresult/_search -d '{
  "size": 1,
  "sort": {
    "_timestamp": "desc"
  },
  "fields": [
    "_timestamp"
  ]
}'

Elasticsearch mac address search/mapping

I can't get mac address searches to return proper results when I do partial searches (half an octet). I mean, if I search for the exact mac address I get results, but if I search for something partial like "00:19:9" I get nothing until I complete the octet.
Can anyone point out which mapping I should use to index it, or what kind of search query I should use?
curl -XDELETE http://localhost:9200/ap-test
curl -XPUT http://localhost:9200/ap-test
curl -XPUT http://localhost:9200/ap-test/devices/1 -d '{
  "user" : "James Earl",
  "macaddr" : "00:19:92:00:71:80"
}'
curl -XPUT http://localhost:9200/ap-test/devices/2 -d '{
  "user" : "Earl",
  "macaddr" : "00:19:92:00:71:82"
}'
curl -XPUT http://localhost:9200/ap-test/devices/3 -d '{
  "user" : "James Edward",
  "macaddr" : "11:19:92:00:71:80"
}'
curl -XPOST 'http://localhost:9200/ap-test/_refresh'
curl -XGET http://localhost:9200/ap-test/devices/_mapping?pretty
When I search for exact matches, I get them correctly:
curl -XPOST http://localhost:9200/ap-test/devices/_search -d '{
  "query" : {
    "query_string" : {
      "query" : "\"00\\:19\\:92\\:00\\:71\\:80\""
    }
  }
}'
# RETURNS:
{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.57534903,
    "hits": [
      {
        "_index": "ap-test",
        "_type": "devices",
        "_id": "1",
        "_score": 0.57534903,
        "_source": {
          "user": "James Earl",
          "macaddr": "00:19:92:00:71:80"
        }
      }
    ]
  }
}
HOWEVER, I need to be able to match partial mac addresses searches like this:
curl -XPOST http://localhost:9200/ap-test/devices/_search -d '{
  "query" : {
    "query_string" : {
      "query" : "\"00\\:19\\:9\""
    }
  }
}'
# RETURNS 0 hits instead of the expected 2:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}
So, what mapping should I use? Is there a better query string to accomplish this? By the way, what's the difference between using 'query_string' and 'text'?
It looks like you haven't defined a mapping at all, which means elasticsearch will guess your datatypes and apply the standard mappings.
The macaddr field will therefore be recognised as a string and analyzed with the standard analyzer, which breaks the string up on whitespace and punctuation, leaving tokens consisting of pairs of digits: "00:19:92:00:71:80" is tokenized to 00 19 92 00 71 80. The same tokenization happens at search time.
What you want is to define an analyzer which turns "00:19:92:00:71:80" into the tokens 00, 00:, 00:1, 00:19, and so on.
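For illustration only (the real tokenization happens inside elasticsearch, not in the shell), the two behaviours can be emulated with standard shell tools:

```shell
mac="00:19:92:00:71:80"

# Roughly what the standard analyzer does: split on punctuation
echo "$mac" | tr ':' ' '          # 00 19 92 00 71 80

# The edge n-gram prefixes (min_gram=2) the custom analyzer would index
for i in 2 3 4 5 6; do
  echo "$mac" | cut -c1-"$i"      # 00, 00:, 00:1, 00:19, 00:19:
done
```

A partial search like "00:19:9" matches because that exact prefix is one of the indexed tokens.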
Try this:
curl -XPUT http://localhost:9200/ap-test -d '{
  "settings" : {
    "analysis" : {
      "analyzer" : {
        "my_edge_ngram_analyzer" : {
          "tokenizer" : "my_edge_ngram_tokenizer"
        }
      },
      "tokenizer" : {
        "my_edge_ngram_tokenizer" : {
          "type" : "edgeNGram",
          "min_gram" : "2",
          "max_gram" : "17"
        }
      }
    }
  }
}'
curl -XPUT http://localhost:9200/ap-test/devices/_mapping -d '{
  "devices": {
    "properties": {
      "user": {
        "type": "string"
      },
      "macaddr": {
        "type": "string",
        "index_analyzer" : "my_edge_ngram_analyzer",
        "search_analyzer": "keyword"
      }
    }
  }
}'
Put the documents as before, then search with the query specifically aimed at the field:
curl -XPOST http://localhost:9200/ap-test/devices/_search -d '{
  "query" : {
    "query_string" : {
      "query" : "\"00\\:19\\:92\\:00\\:71\\:80\"",
      "fields" : ["macaddr", "user"]
    }
  }
}'
As for your last question, the text query is deprecated.
Good luck!
After some research I found an easier way to make it work.
Elasticsearch query options can be confusing because there are so many of them:
query_string: a full-fledged search with a myriad of options and wildcard support.
match: simpler, and doesn't require wildcard characters or other "advanced" features. It's great for search boxes because the chances of it failing are very small, if not non-existent.
That said, this is the one that worked best in most cases and didn't require a customized mapping:
curl -XPOST http://localhost:9200/ap-test/devices/_search -d '{
  "query" : {
    "match_phrase_prefix" : {
      "_all" : "00:19:92:00:71:8"
    }
  }
}'
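One knob worth knowing about here is max_expansions, which caps how many indexed terms the final prefix is expanded to. The value 25 below is arbitrary, chosen for illustration only:

```shell
# Same match_phrase_prefix query with an explicit expansion cap
payload='{
  "query" : {
    "match_phrase_prefix" : {
      "_all" : {
        "query" : "00:19:92:00:71:8",
        "max_expansions" : 25
      }
    }
  }
}'
printf '%s\n' "$payload"

# curl -XPOST http://localhost:9200/ap-test/devices/_search -d "$payload"
```

A low cap keeps prefix queries cheap on large indices, at the cost of possibly missing some matches.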

Search CouchDB Using ElasticSearch River

I've created a couchDB river for elasticsearch (based on this elasticsearch example) with the following code:
curl -XPUT 'localhost:9200/_river/tasks/_meta' -d '{
  "type" : "couchdb",
  "couchdb" : {
    "host" : "localhost",
    "port" : 5984,
    "db" : "tasks",
    "filter" : null
  },
  "index" : {
    "index" : "tasks",
    "type" : "tasks",
    "bulk_size" : "100",
    "bulk_timeout" : "10ms"
  }
}'
When I try to search the couchDB using elasticsearch with this command:
curl -XGET http://localhost:9200/tasks/tasks -d query{"user":"jbattle"}
I get the response:
No handler found for uri [/tasks/tasks] and method [GET][]
I've been searching but have yet to find a solution to this issue.
UPDATE:
I've discovered that the proper query is:
curl -XGET 'http://localhost:9200/_river/tasks/_search?q=user:jbattle&pretty=true'
Though, despite no longer receiving an error, I get 0 hits:
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}
Both of your queries are incorrect. The first one is missing the /_search endpoint, and the second one queries the _river index instead of the tasks index.
The _river index is where your river definition is stored, not your data. When you configured your river, you specified the index tasks.
So try this instead:
curl -XGET 'http://localhost:9200/tasks/tasks/_search?q=user:jbattle&pretty=true'
Or if that doesn't work, try searching for any docs in tasks/tasks:
curl -XGET 'http://localhost:9200/tasks/tasks/_search?q=*&pretty=true'
The example file you posted got moved to github. These guys give a decent walkthrough of getting couch and elasticsearch to work together.
Unfortunately, the currently accepted answer doesn't work for me, but if I paste something like this into my browser's address bar it works. Notice that there is only one reference to the "tasks" index in the URL, not two.
http://localhost:9200/tasks/_search?pretty=true
To do a real search you could try something like this:
http://localhost:9200/tasks/_search?q="hello"&pretty=true
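The URI searches above can also be written with a JSON request body, which scales better once the query grows. Whether match is the right query type for the user field depends on how that field is analyzed, so treat this as a sketch:

```shell
# Body-based equivalent of ?q=user:jbattle
payload='{
  "query" : {
    "match" : { "user" : "jbattle" }
  }
}'
printf '%s\n' "$payload"

# curl -XPOST 'http://localhost:9200/tasks/_search?pretty' -d "$payload"
```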
