How to get the last update of the index in Elasticsearch

How can I find the datetime of the last update of an Elasticsearch index?
I tried to follow the example in Elasticsearch index last update time, but nothing happened.
curl -XGET 'http://localhost:9200/_all/_mapping'
{"haystack":{"mappings":{"modelresult":{"_all":{"auto_boost":true},"_boost":{"name":"boost","null_value":1.0},"properties":{"act_name":{"type":"string","boost":1.3,"index_analyzer":"index_ngram","search_analyzer":"search_ngram"},"django_ct":{"type":"string","index":"not_analyzed","include_in_all":false},"django_id":{"type":"string","index":"not_analyzed","include_in_all":false},"hometown":{"type":"string","boost":0.9,"index_analyzer":"index_ngram","search_analyzer":"search_ngram"},"id":{"type":"string"},"text":{"type":"string","analyzer":"ngram_analyzer"}}},"mytype":{"_timestamp":{"enabled":true,"store":true},"properties":{}}}}}
curl -XPOST localhost:9200/your_index/your_type/_search -d '{
  "size": 1,
  "sort": {
    "_timestamp": "desc"
  },
  "fields": [
    "_timestamp"
  ]
}'
{"took":2,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":99,"max_score":null,"hits":[{"_index":"haystack","_type":"modelresult","_id":"account.user.96","_score":null,"sort":[-9223372036854775808]}]}}
What is wrong?

First, proceed as in the linked question and enable the _timestamp field in your mapping. Note that in your mapping output above, _timestamp is enabled for mytype but not for modelresult, and the sort value -9223372036854775808 (Long.MIN_VALUE) in your search result is what comes back when the sort field is missing from the documents.
{
  "modelresult" : {
    "_timestamp" : { "enabled" : true }
  }
}
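For example, a sketch of applying that (assuming the haystack index and modelresult type from your output, on a 1.x-era Elasticsearch where _timestamp is still available):
curl -XPUT 'localhost:9200/haystack/modelresult/_mapping' -d '{
  "modelresult" : {
    "_timestamp" : { "enabled" : true, "store" : true }
  }
}'
Only documents indexed after this change get a timestamp, so existing documents have to be reindexed, and you may need "store": true to get the value back via fields.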
Then you can query your index for a single document with the most recent timestamp like this:
curl -XPOST localhost:9200/haystack/modelresult/_search -d '{
  "size": 1,
  "sort": {
    "_timestamp": "desc"
  },
  "fields": [
    "_timestamp"
  ]
}'

Related

Not able to filter the record from Azure Search based on specific key within indexed json data

I have populated Azure Search data using my application, and this is what is present in Search Explorer in portal.azure.com.
{
  "@odata.context": "https://demosearch.search.windows.net/indexes('<indexname>')/$metadata#docs(*)",
  "value": [
    {
      "@search.score": 1,
      "id": "31",
      "code": "C001105",
      "title": "Demo Course Title 1",
      "creator": "FILE_UPLOAD",
      "events": [
        {
          "eventId": 97,
          "eventStatus": "PLANNING",
          "evtSession": [
            {
              "postCode": "AB10 1AB",
              "townOrCity": "Aberdeen City,",
              "dates": {
                "from": "2022-08-11T08:00:00Z",
                "to": "2022-08-11T11:00:00Z"
              }
            }
          ]
        }
      ]
    },
    {
      "@search.score": 1,
      "id": "45",
      "code": "C001125",
      "title": "Demo Course Title 2",
      "creator": "FILE_UPLOAD",
      "events": [
        {
          "eventId": 98,
          "eventStatus": "IN_PROGRESS",
          "evtSession": [
            {
              "postCode": "BA10 0AN",
              "townOrCity": "Bruton",
              "dates": {
                "from": "2022-08-11T08:00:00Z",
                "to": "2022-08-11T09:30:00Z"
              }
            }
          ]
        }
      ]
    }
  ],
  "@odata.nextLink": "https://demosearch.search.windows.net/indexes('<indexname>')/docs?api-version=2019-05-06&search=%2A&$skip=50"
}
I'm using the curl below to get documents where "townOrCity" is "Aberdeen City," from Azure Search.
curl --location --request POST 'https://demosearch.search.windows.net/indexes/<indexname>/docs/search?api-version=2019-05-06' \
--header 'api-key: XXXX' \
--header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--data-raw '{
  "count": false,
  "top": 0,
  "skip": 30,
  "search": "*",
  "orderby": "search.score() desc",
  "filter": "( events/any(evt: evt/evtSession/any(session: search.in(session/townOrCity, '\''Aberdeen City,'\'', '\'','\'') ) ) )",
  "facets": ["events/evtSession/townOrCity,count:10000"],
  "queryType": "full",
  "searchMode": "any"
}'
but I'm not getting the expected response; value comes back as an empty array:
RESPONSE
{
  "@odata.context": "https://demosearch.search.windows.net/indexes('<indexname>')/$metadata#docs(*)",
  "@search.facets": {
    "events/evtSession/townOrCity": []
  },
  "value": []
}
Please help with the correct payload I should be using to filter out the record with "townOrCity": "Aberdeen City," or am I doing something wrong with the indexing config or anything?
Edit 1:
NOTE: the comma after "Aberdeen City" causes the issue. If I try the same thing without the comma, everything works like a charm. But the requirement is to support the comma.
$filter=( events/any(evt: evt/evtSession/any(session: search.in(session/townOrCity, 'Aberdeen City,', ',') ) ) )
There is data present in the index, but the filter still doesn't apply properly; instead, no records come back in the response.
This works:
$filter=( events/any(evt: evt/evtSession/any(session: search.in(session/townOrCity, 'Aberdeen City,', '|') ) ) )
There are two overloads of the search.in function:
search.in(variable, valueList)
search.in(variable, valueList, delimiters)
Because of the delimiters parameter, the comma inside my valueList was treated as a separator, so the actual value got changed. search.in is an exact match, so an empty response was returned.
https://learn.microsoft.com/en-us/azure/search/search-query-odata-search-in-function
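Putting it together, a sketch of the corrected request (same placeholders and headers as the original curl) would be:
curl --location --request POST 'https://demosearch.search.windows.net/indexes/<indexname>/docs/search?api-version=2019-05-06' \
--header 'api-key: XXXX' \
--header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--data-raw '{
  "search": "*",
  "filter": "( events/any(evt: evt/evtSession/any(session: search.in(session/townOrCity, '\''Aberdeen City,'\'', '\''|'\'') ) ) )",
  "queryType": "full",
  "searchMode": "any"
}'
Any delimiter that cannot appear in the values themselves works; '|' is just a safe choice here.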

Elasticsearch terms stats query not grouping correctly

I have a terms stats query very similar to this one:
Sum Query in Elasticsearch
However, my key_field is a date.
I was expecting to receive results grouped by the full key_field value ["2014-01-20", "2014-01-21", "2014-01-22"], but it appears to be splitting the key field when it encounters a "-". What I received is actually grouped by ["2014", "01", "20", "21", "22"].
Why is it splitting my key?
You probably have your key_field mapped as a string using the standard analyzer.
That'll tokenize 2014-01-20 into 2014, 01, and 20.
You probably want to index your date with type date. Alternatively, you can keep it as a string but leave it not_analyzed.
Here's a runnable example you can play with: https://www.found.no/play/gist/5eb6b8d176e1cc72c9b8
#!/bin/bash
export ELASTICSEARCH_ENDPOINT="http://localhost:9200"
# Create indexes
curl -XPUT "$ELASTICSEARCH_ENDPOINT/play" -d '{
"settings": {},
"mappings": {
"type": {
"properties": {
"date_as_a_string": {
"type": "string"
},
"date_as_nonanalyzed_string": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}'
# Index documents
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_bulk?refresh=true" -d '
{"index":{"_index":"play","_type":"type"}}
{"date":"2014-01-01T00:00:00.000Z","date_as_a_string":"2014-01-01T00:00:00.000Z","date_as_nonanalyzed_string":"2014-01-01T00:00:00.000Z","x":42}
'
# Do searches
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_search?pretty" -d '
{
"facets": {
"date": {
"terms_stats": {
"key_field": "date",
"value_field": "x"
}
},
"date_as_a_string": {
"terms_stats": {
"key_field": "date_as_a_string",
"value_field": "x"
}
},
"date_as_nonanalyzed_string": {
"terms_stats": {
"key_field": "date_as_nonanalyzed_string",
"value_field": "x"
}
}
},
"size": 0
}
'
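Running this, you would expect the date and date_as_nonanalyzed_string facets to group under the full value 2014-01-01T00:00:00.000Z, while date_as_a_string is split into terms like 2014, 01, and 00, reproducing the problem from the question.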

Elasticsearch mac address search/mapping

I can't get mac address searches to return proper results when I'm doing partial searches (half an octet). I mean, if I look for the exact mac address I get results, but if I try to search for a partial match like "00:19:9", I don't get anything until I complete the octet.
Can anyone point out which mapping I should use to index it, or what kind of search query I should use?
curl -XDELETE http://localhost:9200/ap-test
curl -XPUT http://localhost:9200/ap-test
curl -XPUT http://localhost:9200/ap-test/devices/1 -d '
{
  "user" : "James Earl",
  "macaddr" : "00:19:92:00:71:80"
}'
curl -XPUT http://localhost:9200/ap-test/devices/2 -d '
{
  "user" : "Earl",
  "macaddr" : "00:19:92:00:71:82"
}'
curl -XPUT http://localhost:9200/ap-test/devices/3 -d '
{
  "user" : "James Edward",
  "macaddr" : "11:19:92:00:71:80"
}'
curl -XPOST 'http://localhost:9200/ap-test/_refresh'
curl -XGET http://localhost:9200/ap-test/devices/_mapping?pretty
When I search for exact matches, I get them correctly:
curl -XPOST http://localhost:9200/ap-test/devices/_search -d '
{
  "query" : {
    "query_string" : {
      "query":"\"00\\:19\\:92\\:00\\:71\\:80\""
    }
  }
}'
# RETURNS:
{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.57534903,
    "hits": [
      {
        "_index": "ap-test",
        "_type": "devices",
        "_id": "1",
        "_score": 0.57534903,
        "_source": {
          "user": "James Earl",
          "macaddr": "00:19:92:00:71:80"
        }
      }
    ]
  }
}
HOWEVER, I need to be able to match partial mac address searches like this:
curl -XPOST http://localhost:9200/ap-test/devices/_search -d '
{
  "query" : {
    "query_string" : {
      "query":"\"00\\:19\\:9\""
    }
  }
}'
# RETURNS 0 instead of returning 2 of them
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}
So, what mapping should I use? Is there a better query to accomplish this? BTW, what's the difference between using 'query_string' and 'text'?
It looks like you haven't defined a mapping at all, which means Elasticsearch will infer your datatypes and use the standard mappings.
The macaddr field will be recognised as a string, and the standard string analyzer will be used. This analyzer breaks up the string on whitespace and punctuation, leaving you with tokens consisting of pairs of digits: e.g. "00:19:92:00:71:80" gets tokenized to 00, 19, 92, 00, 71, and 80. The same tokenization happens when you search.
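You can check this with the analyze API (a quick sanity check, not required for the fix):
curl -XGET 'http://localhost:9200/_analyze?analyzer=standard&pretty' -d '00:19:92:00:71:80'
# returns the tokens: 00, 19, 92, 00, 71, 80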
What you want is to define an analyzer which turns "00:19:92:00:71:80" into the tokens 00, 00:, 00:1, 00:19, and so on.
Try this:
curl -XPUT http://localhost:9200/ap-test -d '
{
  "settings" : {
    "analysis" : {
      "analyzer" : {
        "my_edge_ngram_analyzer" : {
          "tokenizer" : "my_edge_ngram_tokenizer"
        }
      },
      "tokenizer" : {
        "my_edge_ngram_tokenizer" : {
          "type" : "edgeNGram",
          "min_gram" : "2",
          "max_gram" : "17"
        }
      }
    }
  }
}'
curl -XPUT http://localhost:9200/ap-test/devices/_mapping -d '
{
  "devices": {
    "properties": {
      "user": {
        "type": "string"
      },
      "macaddr": {
        "type": "string",
        "index_analyzer" : "my_edge_ngram_analyzer",
        "search_analyzer": "keyword"
      }
    }
  }
}'
Put the documents as before, then search with the query specifically aimed at the field:
curl -XPOST http://localhost:9200/ap-test/devices/_search -d '
{
  "query" : {
    "query_string" : {
      "query":"\"00\\:19\\:92\\:00\\:71\\:80\"",
      "fields": ["macaddr", "user"]
    }
  }
}'
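With that mapping in place, the partial search from your question should now match both 00:19:92... devices, e.g. (same query as yours, just targeted at macaddr):
curl -XPOST http://localhost:9200/ap-test/devices/_search -d '
{
  "query" : {
    "query_string" : {
      "query":"\"00\\:19\\:9\"",
      "fields": ["macaddr"]
    }
  }
}'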
As for your last question, the text query is deprecated.
Good luck!
After some research I found an easier way to make it work.
Elasticsearch query options are confusing sometimes because there are so many of them...
query_string: a full-fledged search with a myriad of options and wildcard uses.
match: simpler, and doesn't require wildcard characters or other "advanced" features. This one is great to use in search boxes, because the chances of it failing are very small, if not non-existent.
So, that said, this is the one that works best in most cases and didn't require a customized mapping:
curl -XPOST http://localhost:9200/ap-test/devices/_search -d '
{
  "query" : {
    "match_phrase_prefix" : {
      "_all" : "00:19:92:00:71:8"
    }
  }
}'
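This works because the standard analyzer behind _all splits the mac address into the tokens 00 19 92 00 71 8, and match_phrase_prefix treats the last token as a prefix, so no custom mapping is needed.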

Elasticsearch sort based on the number of occurrences a string appears in an array

I have an array field containing a list of strings, e.g.: ["NY", "CA"]
At search time I have a filter which matches any of the strings in the array.
I would like to sort the results based on which documents have the most occurrences of the searched string: "NY"
Results should include:
document 1: ["CA", "NY", "NY"]
document 2: ["NY", "FL"]
document 3: ["NY", "CA", "NY", "NY"]
Results should be ordered as such:
document 3, document 1, document 2
Is this possible? If so, how?
For those curious, I was not able to boost based on how many occurrences of the word appear in the array. I did, however, accomplish what I needed with the following:
curl -X POST "http://localhost:9200/index/document/1" -d '{"id":1,"states_ties":["CA"],"state_abbreviation":"CA","worked_in_states":["CA"],"training_in_states":["CA"]}'
curl -X POST "http://localhost:9200/index/document/2" -d '{"id":2,"states_ties":["CA","NY"],"state_abbreviation":"FL","worked_in_states":["NY","CA"],"training_in_states":["NY","CA"]}'
curl -X POST "http://localhost:9200/index/document/3" -d '{"id":3,"states_ties":["CA","NY","FL"],"state_abbreviation":"NY","worked_in_states":["NY","CA"],"training_in_states":["NY","FL"]}'
curl -X GET 'http://localhost:9200/index/_search?per_page=10&pretty' -d '{
  "query": {
    "custom_filters_score": {
      "query": {
        "terms": {
          "states_ties": ["CA"]
        }
      },
      "filters": [
        {
          "filter": {
            "term": {
              "state_abbreviation": "CA"
            }
          },
          "boost": 1.03
        },
        {
          "filter": {
            "terms": {
              "worked_in_states": ["CA"]
            }
          },
          "boost": 1.02
        },
        {
          "filter": {
            "terms": {
              "training_in_states": ["CA"]
            }
          },
          "boost": 1.01
        }
      ],
      "score_mode": "multiply"
    }
  },
  "sort": [
    { "_score": "desc" }
  ]
}'
Results (id: score):
1: 0.75584483
2: 0.73383
3: 0.7265643
This would be accomplished by the standard Lucene scoring implementation. If you were simply searching for "NY" without specifying an order, it would sort by relevance and assign the highest relevance to a document with more occurrences of the term, all else being equal.
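For example, a minimal sketch (assuming the states_ties field from the example above) that relies only on default relevance scoring:
curl -XPOST 'http://localhost:9200/index/_search?pretty' -d '{
  "query": {
    "match": {
      "states_ties": "NY"
    }
  }
}'
All else being equal, documents with more "NY" entries in states_ties get a higher score and therefore sort first.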

Indexing geospatial in Elasticsearch results in error?

{ title: 'abcccc',
price: 3300,
price_per: 'task',
location: { lat: -33.8756, lon: 151.204 },
description: 'asdfasdf'
}
The above is the JSON that I want to index. However, when I index it, the error is:
{"error":"MapperParsingException[Failed to parse [location]]; nested: ElasticSearchIllegalArgumentException[unknown property [lat]]; ","status":400}
If I remove the "location" field, everything works.
How do I index geo? I read the tutorial and I'm still confused about how it works. It should work like this, right...?
You are getting this error message because the field location is not mapped correctly. It's possible that at some point you tried to index a string in this field, and it's now mapped as a string. Elasticsearch cannot automatically detect that a field contains a geo_point; it has to be explicitly specified in the mapping. Otherwise, Elasticsearch maps such a field as a string, number, or object, depending on the type of geo_point representation used in the first indexed record. Once a field is added to the mapping, its type can no longer be changed. So, in order to fix the situation, you will need to delete the mapping for this type and create it again. Here is an example of specifying the mapping for a geo_point field:
curl -XDELETE "localhost:9200/geo-test/"
echo
# Set proper mapping. Elasticsearch cannot automatically detect that something is a geo_point:
curl -XPUT "localhost:9200/geo-test" -d '{
"settings": {
"index": {
"number_of_replicas" : 0,
"number_of_shards": 1
}
},
"mappings": {
"doc": {
"properties": {
"location" : {
"type" : "geo_point"
}
}
}
}
}'
echo
# Put some test data in Sydney
curl -XPUT "localhost:9200/geo-test/doc/1" -d '{
"title": "abcccc",
"price": 3300,
"price_per": "task",
"location": { "lat": -33.8756, "lon": 151.204 },
"description": "asdfasdf"
}'
curl -XPOST "localhost:9200/geo-test/_refresh"
echo
# Search, and calculate distance to Brisbane
curl -XPOST "localhost:9200/geo-test/doc/_search?pretty=true" -d '{
"query": {
"match_all": {}
},
"script_fields": {
"distance": {
"script": "doc['\''location'\''].arcDistanceInKm(-27.470,153.021)"
}
},
"fields": ["title", "location"]
}
'
echo
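If you want to double-check that the geo_point mapping was applied before indexing, the standard mapping API shows it:
curl -XGET "localhost:9200/geo-test/_mapping?pretty"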
Since you don't specify how you parse it, this question may shed some light:
Parsing through JSON in JSON.NET with unknown property names
