Searching with regex in Elasticsearch

I'm trying to run a regex search against an Elasticsearch database.
My query so far (it's not working):
#!/usr/bin/env bash
curl -XGET 'http://localhost:9200/logstash-2015.10.27/_search' -d \
'{
query: {
"regexp": {
"#timestamp": {
value: ".*"
}
}
}
}' | python -m json.tool
and the result I'm getting is:
{
"error": "SearchPhaseExecutionException[Failed to execute phase [query], all shards failed; shardFailures {[DqJwlMoTQ3e8nyl4m7amGw][logstash-2015.10.27][0]: SearchParseException[[logstash-2015.10.27][0]: from[-1],size[-1]: Parse Failure [Failed to parse source [{\n query: {\n \"regexp\": {\n \"#timestamp\": {\n value: \".*\"\n }\n }\n }\n}]]]; nested: IllegalArgumentException[Invalid format: \".*\"]; }{[DqJwlMoTQ3e8nyl4m7amGw][logstash-2015.10.27][1]: SearchParseException[[logstash-2015.10.27][1]: from[-1],size[-1]: Parse Failure [Failed to parse source [{\n query: {\n \"regexp\": {\n \"#timestamp\": {\n value: \".*\"\n }\n }\n }\n}]]]; nested: IllegalArgumentException[Invalid format: \".*\"]; }{[DqJwlMoTQ3e8nyl4m7amGw][logstash-2015.10.27][2]: SearchParseException[[logstash-2015.10.27][2]: from[-1],size[-1]: Parse Failure [Failed to parse source [{\n query: {\n \"regexp\": {\n \"#timestamp\": {\n value: \".*\"\n }\n }\n }\n}]]]; nested: IllegalArgumentException[Invalid format: \".*\"]; }{[DqJwlMoTQ3e8nyl4m7amGw][logstash-2015.10.27][3]: SearchParseException[[logstash-2015.10.27][3]: from[-1],size[-1]: Parse Failure [Failed to parse source [{\n query: {\n \"regexp\": {\n \"#timestamp\": {\n value: \".*\"\n }\n }\n }\n}]]]; nested: IllegalArgumentException[Invalid format: \".*\"]; }{[DqJwlMoTQ3e8nyl4m7amGw][logstash-2015.10.27][4]: SearchParseException[[logstash-2015.10.27][4]: from[-1],size[-1]: Parse Failure [Failed to parse source [{\n query: {\n \"regexp\": {\n \"#timestamp\": {\n value: \".*\"\n }\n }\n }\n}]]]; nested: IllegalArgumentException[Invalid format: \".*\"]; }]",
"status": 400
}
The event that I'm trying to find is this:
{
"_index": "logstash-2015.10.27",
"_type": "logs",
"_id": "AVCml4MI2xxzjEtiGou0",
"_version": 1,
"_score": null,
"_source": {
"host": "server",
"#timestamp": "2015-10-27T00:00:00.142Z",
"type_instance": "free",
"plugin": "exec",
"plugin_instance": "available_memory",
"collectd_type": "gauge",
"value": 855,
"#version": "1"
},
"sort": [
1445904000142
]
}
I've googled this, but without any luck.
======== update ==========
I managed to query my Elasticsearch with this:
#!/usr/bin/env bash
curl -XPOST 'http://localhost:9200/logstash-2015.10.27/_search' -d '
{
"query": {
"bool": {
"must": [
{ "range" : { "#timestamp" : { "gte" : "2015-10-27T00:00:01", "lte" : "2015-10-27T00:00:59"} }},
{ "regexp" : { "host": "d027.*" }}
]
}
}
}'

The regexp query works on string fields. Date fields are stored internally as numbers (milliseconds since the epoch) in Elasticsearch, so a regular expression cannot match them.
For searching by date I recommend the range filter: https://www.elastic.co/guide/en/elasticsearch/guide/current/_ranges.html#_ranges_on_dates
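A minimal sketch of how the two conditions could be combined in one request body: a range clause for the date field and a regexp clause for the string field. Python is used here only to build and print the JSON. The helper name build_query is hypothetical; the field names and values are taken from the question.

```python
import json


def build_query(date_field, start, end, text_field, pattern):
    """Build a bool query: a date range plus a regexp on a string field."""
    return {
        "query": {
            "bool": {
                # Every clause listed under "must" has to match.
                "must": [
                    {"range": {date_field: {"gte": start, "lte": end}}},
                    {"regexp": {text_field: pattern}},
                ]
            }
        }
    }


body = build_query(
    "#timestamp", "2015-10-27T00:00:01", "2015-10-27T00:00:59", "host", "d027.*"
)
print(json.dumps(body, indent=2))
```

The printed JSON can be used as the -d payload of the curl command shown in the update.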

Related

Sorting according to date not working in elastic search nodejs

I am trying to return all the data in descending order of date using Elasticsearch in Node.js, but I am not able to. Here is my GET query (it is inside app.get(/api/allAssets)):
let query={
index: 'tokens',
from: 0,
size: 100,
sort: [{"createdAt":{"order":"desc"}}]
}
// if(req.query.allTokens) query.q = `*${req.query.product}*`;
client.search(query)
.then(resp => {
return res.status(200).json({
tokens: resp.hits.hits
})
}).catch(err=>{
console.log(err);
return res.status(500).json({
msg:'Error',
err
});
})
The portion where I push to the Elasticsearch server is:
// index to elastic server
await client.index({
index:"tokens",
id: data.fingerprint,
mappings:{
properties:{
createdAt: {
type: "date",
format: "strict_date_optional_time_nanos"
}
}
},
body:{
"policyID": data.policyid,
"metadata": data.metadata,
"assetID": data.assetID,
"name": data.name,
"quantity": data.quantity,
"createdAt": data.time,
"fingerprint": data.fingerprint
}
},function (err,resp,status) {
console.log("response",resp);
console.log("--------------------------------------------------------");
})
The code works when I remove the sort portion from client.search. The response looks like this:
"tokens": [
{
"_index": "tokens",
"_type": "_doc",
"_id": "asset1aq460y056c3mm4jmuv9vg39n7w8y5gykywjwge",
"_score": 1,
"_source": {
"policyID": "3a9241cd79895e3a8d65261b40077d4437ce71e9d7c8c6c00e3f658e",
"metadata": null,
"name": "Firstcoin",
"quantity": "1",
"createdAt": "2021-03-01T21:47:37.000Z",
"fingerprint": "asset1aq460y056c3mm4jmuv9vg39n7w8y5gykywjwge"
}
},
]
But when I add sort: [{"createdAt":{"order":"desc"}}] so that the response is returned in descending order of the createdAt field, it gives an error like this:
{
"msg": "Error",
"err": {
"msg": "[query_shard_exception] No mapping found for [[object Object]] in order to sort on, with { index_uuid=\"QcKwn0wtT-6cvdZ64RqVKw\" & index=\"tokens\" }",
"path": "/tokens/_search",
"query": {
"from": 0,
"size": 100,
"sort": "[object Object]"
},
"statusCode": 400,
"response": "{\"error\":{\"root_cause\":[{\"type\":\"query_shard_exception\",\"reason\":\"No mapping found for [[object Object]] in order to sort on\",\"index_uuid\":\"QcKwn0wtT-6cvdZ64RqVKw\",\"index\":\"tokens\"}],\"type\":\"search_phase_execution_exception\",\"reason\":\"all shards failed\",\"phase\":\"query\",\"grouped\":true,\"failed_shards\":[{\"shard\":0,\"index\":\"tokens\",\"node\":\"vIZfPrAnQRWqciWZYndxtg\",\"reason\":{\"type\":\"query_shard_exception\",\"reason\":\"No mapping found for [[object Object]] in order to sort on\",\"index_uuid\":\"QcKwn0wtT-6cvdZ64RqVKw\",\"index\":\"tokens\"}}]},\"status\":400}"
}
}
Could someone help me with this? I am fairly new to Elasticsearch and am stuck on this.

MongoDB Aggregation pipeline python

I have a collection of log files, and I need to find the number of times a system shows the message "Average limit exceeded while connecting ..." in a given date range, then display the result for all the systems in that date range in descending order.
Currently the documents in my MongoDB collection look like this:
{'computerName':'APOOUTRDFG',
'datetime': 11/27/2019 10:45:23.123
'message': 'Average limit ....'
}
So, I have tried filtering my result by first matching the message string and then grouping by computer name, but this does not solve the problem:
db.collection.aggregate([
{ "$match": {
'message': re.compile(r".*Average limit.*")
} },
{ "$group": {
"_id": { "$toLower": "$computerName" },
"count": { "$sum": 1 }
} }
])
Expected results
Date: 01-01-2012 to 31-01-2012
Computer Name    Number of "Average limit exceeded"
computername1    120
computername2    83
computername3    34
Assuming you have the following data in DB:
[
{
"computerName": "APOOUTRDFG",
"datetime": "11/27/2019 10:45:23.123",
"message": "Average limit ...."
},
{
"computerName": "BPOOUTRDFG",
"datetime": "01/02/2012 10:45:23.123",
"message": "Average limit ...."
},
{
"computerName": "CPOOUTRDFG",
"datetime": "01/30/2012 10:45:23.123",
"message": "Average limit ...."
},
{
"computerName": "DPOOUTRDFG",
"datetime": "01/30/2012 10:45:23.123",
"message": "Some other message ...."
}
]
Note: 'datetime' is in the format %m/%d/%Y %H:%M:%S.%L, and the input date range is in the format %d-%m-%Y.
The following query can get you the expected output:
db.collection.aggregate([
{
$match:{
"message": /.*Average limit.*/i,
$expr:{
$and:[
{
$gte:[
{
$dateFromString:{
"dateString":"$datetime",
"format":"%m/%d/%Y %H:%M:%S.%L"
}
},
{
$dateFromString:{
"dateString":"01-01-2012",
"format":"%d-%m-%Y"
}
}
]
},
{
$lte:[
{
$dateFromString:{
"dateString":"$datetime",
"format":"%m/%d/%Y %H:%M:%S.%L"
}
},
{
$dateFromString:{
"dateString":"31-01-2012",
"format":"%d-%m-%Y"
}
}
]
}
]
}
}
},
{
$group:{
"_id":{
$toLower:"$computerName"
},
"count":{
$sum:1
}
}
},
{
$sort:{
"count":-1
}
}
]).pretty()
Recommended: It's better to store dates as ISODate or as a timestamp in the DB.
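If the dates are stored as real date values, as recommended, the $match stage can compare them directly and the $dateFromString conversions are no longer needed. The following is a rough sketch under that assumption, written for Python since the question uses it. The function name build_pipeline is hypothetical, and the half-open upper bound (midnight after the last day) is a design choice made here so that events during the last day of the range are counted.

```python
import re
from datetime import datetime, timedelta


def build_pipeline(start_str, end_str, phrase="Average limit"):
    """Count matching messages per computer within an inclusive date range.

    Assumes 'datetime' is stored as a real date value, not as a string.
    """
    start = datetime.strptime(start_str, "%d-%m-%Y")
    # Exclusive upper bound: midnight after the last day, so the whole
    # last day is included.
    end = datetime.strptime(end_str, "%d-%m-%Y") + timedelta(days=1)
    return [
        {"$match": {
            "message": re.compile(re.escape(phrase), re.IGNORECASE),
            "datetime": {"$gte": start, "$lt": end},
        }},
        {"$group": {"_id": {"$toLower": "$computerName"}, "count": {"$sum": 1}}},
        # Highest counts first, as asked for in the question.
        {"$sort": {"count": -1}},
    ]


pipeline = build_pipeline("01-01-2012", "31-01-2012")
```

With PyMongo, the resulting list would be passed to the collection's aggregate method, for example db.collection.aggregate(pipeline).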

Unable to query nested object

Elasticsearch 6.2.4
Index has mapping:
{
"watcher" : {
"aliases" : { },
"mappings" : {
"doc" : {
"properties" : {
"script" : {
"properties" : {
"body" : {
"type" : "text"
},
"description" : {
"type" : "text"
},
"title" : {
"type" : "text"
}
}
},
"super-user" : {
"properties" : {
"id" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"password" : {
"type" : "text"
},
"sha" : {
"type" : "text"
},
"username" : {
"type" : "text"
}
}
},
"watcher" : {
"properties" : {
"actions" : {
"type" : "object",
"enabled" : false
},
"condition" : {
"type" : "object",
"enabled" : false
}
}
}
}
}
}
}
}
There is a document I want to get by its _source.super-user.id value:
{
"_index" : "watcher",
"_type" : "doc",
"_id" : "sAkqs2UBN8hNgeAd6VYT",
"_score" : 1.0,
"_source" : {
"super-user" : {
"id" : "rwkTs2UBN8hNgeAd902q",
"username" : "elastic",
"sha" : "7598562076f37c7376ccf5c6ad28e00c:0fa96e2c4c0136b12ae1708940c46a52"
}
}
}
How do I get this document?
I tried a nested query:
const elasticsearch = require('elasticsearch');
const client = new elasticsearch.Client({
host: [
{
host: 'localhost',
protocol: 'http',
auth: 'elastic:password',
port: 9200
}
]
});
(async () => {
try {
const resp = await client.search({
index: 'watcher',
type: 'doc',
body: {
query: {
nested: {
path: 'super-user',
query: {
bool: {
must: [
{
match: {
'super-user.id': 'rwkTs2UBN8hNgeAd902q'
}
}
]
}
}
}
}
}
});
console.log(JSON.stringify(resp, null, 2));
} catch (err) {
console.error(err);
}
})();
But I got a "failed to create query" error:
{ Error: [query_shard_exception] failed to create query: {
"nested" : {
"query" : {
"bool" : {
"must" : [
{
"match" : {
"super-user.id" : {
"query" : "rwkTs2UBN8hNgeAd902q",
"operator" : "OR",
"prefix_length" : 0,
"max_expansions" : 50,
"fuzzy_transpositions" : true,
"lenient" : false,
"zero_terms_query" : "NONE",
"auto_generate_synonyms_phrase_query" : true,
"boost" : 1.0
}
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
},
"path" : "super-user",
"ignore_unmapped" : false,
"score_mode" : "avg",
"boost" : 1.0
}
}, with { index_uuid="5O9HfcORTjiq5SZ0c1lkQA" & index="watcher" }
at respond (/media/trex/safe/Development/private/node_modules/elasticsearch/src/lib/transport.js:307:15)
at checkRespForFailure (/media/trex/safe/Development/private/node_modules/elasticsearch/src/lib/transport.js:266:7)
at HttpConnector.<anonymous> (/media/trex/safe/Development/private/node_modules/elasticsearch/src/lib/connectors/http.js:159:7)
at IncomingMessage.bound (/media/trex/safe/Development/private/node_modules/elasticsearch/node_modules/lodash/dist/lodash.js:729:21)
at emitNone (events.js:111:20)
at IncomingMessage.emit (events.js:208:7)
at endReadableNT (_stream_readable.js:1064:12)
at _combinedTickCallback (internal/process/next_tick.js:138:11)
at process._tickCallback (internal/process/next_tick.js:180:9)
status: 400,
displayName: 'BadRequest',
message: '[query_shard_exception] failed to create query: {\n "nested" : {\n "query" : {\n "bool" : {\n "must" : [\n {\n "match" : {\n "super-user.id" : {\n "query" : "rwkTs2UBN8hNgeAd902q",\n "operator" : "OR",\n "prefix_length" : 0,\n "max_expansions" : 50,\n "fuzzy_transpositions" : true,\n "lenient" : false,\n "zero_terms_query" : "NONE",\n "auto_generate_synonyms_phrase_query" : true,\n "boost" : 1.0\n }\n }\n }\n ],\n "adjust_pure_negative" : true,\n "boost" : 1.0\n }\n },\n "path" : "super-user",\n "ignore_unmapped" : false,\n "score_mode" : "avg",\n "boost" : 1.0\n }\n}, with { index_uuid="5O9HfcORTjiq5SZ0c1lkQA" & index="watcher" }',
path: '/watcher/doc/_search',
query: {},
body:
{ error:
{ root_cause: [Array],
type: 'search_phase_execution_exception',
reason: 'all shards failed',
phase: 'query',
grouped: true,
failed_shards: [Array] },
status: 400 },
statusCode: 400,
response: '{"error":{"root_cause":[{"type":"query_shard_exception","reason":"failed to create query: {\\n \\"nested\\" : {\\n \\"query\\" : {\\n \\"bool\\" : {\\n \\"must\\" : [\\n {\\n \\"match\\" : {\\n \\"super-user.id\\" : {\\n \\"query\\" : \\"rwkTs2UBN8hNgeAd902q\\",\\n \\"operator\\" : \\"OR\\",\\n \\"prefix_length\\" : 0,\\n \\"max_expansions\\" : 50,\\n \\"fuzzy_transpositions\\" : true,\\n \\"lenient\\" : false,\\n \\"zero_terms_query\\" : \\"NONE\\",\\n \\"auto_generate_synonyms_phrase_query\\" : true,\\n \\"boost\\" : 1.0\\n }\\n }\\n }\\n ],\\n \\"adjust_pure_negative\\" : true,\\n \\"boost\\" : 1.0\\n }\\n },\\n \\"path\\" : \\"super-user\\",\\n \\"ignore_unmapped\\" : false,\\n \\"score_mode\\" : \\"avg\\",\\n \\"boost\\" : 1.0\\n }\\n}","index_uuid":"5O9HfcORTjiq5SZ0c1lkQA","index":"watcher"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"watcher","node":"OHtckm41Ts2DwDlT0A7N4w","reason":{"type":"query_shard_exception","reason":"failed to create query: {\\n \\"nested\\" : {\\n \\"query\\" : {\\n \\"bool\\" : {\\n \\"must\\" : [\\n {\\n \\"match\\" : {\\n \\"super-user.id\\" : {\\n \\"query\\" : \\"rwkTs2UBN8hNgeAd902q\\",\\n \\"operator\\" : \\"OR\\",\\n \\"prefix_length\\" : 0,\\n \\"max_expansions\\" : 50,\\n \\"fuzzy_transpositions\\" : true,\\n \\"lenient\\" : false,\\n \\"zero_terms_query\\" : \\"NONE\\",\\n \\"auto_generate_synonyms_phrase_query\\" : true,\\n \\"boost\\" : 1.0\\n }\\n }\\n }\\n ],\\n \\"adjust_pure_negative\\" : true,\\n \\"boost\\" : 1.0\\n }\\n },\\n \\"path\\" : \\"super-user\\",\\n \\"ignore_unmapped\\" : false,\\n \\"score_mode\\" : \\"avg\\",\\n \\"boost\\" : 1.0\\n }\\n}","index_uuid":"5O9HfcORTjiq5SZ0c1lkQA","index":"watcher","caused_by":{"type":"illegal_state_exception","reason":"[nested] nested object under path [super-user] is not of nested type"}}}]},"status":400}',
toString: [Function],
toJSON: [Function] }
nested is a specific data type: https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html
Your mapping doesn't declare super-user as nested, so use simple dot notation to select the right field, without the nested query.
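As an illustration, the request body without the nested wrapper might look like the sketch below. It is built as a Python dict only to print the JSON. The second variant is an assumption about what may be preferable for exact IDs: it targets the keyword sub-field that the mapping in the question declares for super-user.id.

```python
import json

# Plain object fields are addressed with dot notation; no "nested" wrapper.
match_query = {"query": {"match": {"super-user.id": "rwkTs2UBN8hNgeAd902q"}}}

# Alternative: exact match on the keyword sub-field declared in the mapping.
term_query = {
    "query": {"term": {"super-user.id.keyword": "rwkTs2UBN8hNgeAd902q"}}
}

print(json.dumps(match_query, indent=2))
print(json.dumps(term_query, indent=2))
```

Either dict corresponds to the body property passed to client.search in the question's code.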

Node.js Elasticsearch client. How to use filter_path parameter with _msearch method

I use the https://github.com/elastic/elasticsearch-js Elasticsearch client in my application.
I want to use the Multi Search API and reduce the response with the filter_path parameter.
In Kibana the request looks like:
POST _msearch?filter_path=responses.hits.total
{ "index": "first_index" }
{"query": {"term": {"status": 1}} }
{ "index": "second_index" }
{"query": {"term": {"status": 1}} }
Response:
{
"responses": [
{
"hits": {
"total": 1935
}
},
{
"hits": {
"total": 1212
}
}
]
}
But I can't find the right place for this filter_path parameter in the client.msearch method. Something like:
client.msearch({
body: [
{ "index": "first_index" },
{ "q": 'filter_path=responses.hits.total' },
{"query": {"term": {"status": 1}} },
{ "index": "second_index" },
{ "q": 'filter_path=responses.hits.total' },
{"query": {"term": {"status": 1}} }
]
})
doesn't work.
How can I send this request with the Node.js Elasticsearch client?
The filterPath key should be given at the same level as body.
client.msearch({
body: [
{ "index": "first_index" },
{"query": {"term": {"status": 1}} },
{ "index": "second_index" },
{"query": {"term": {"status": 1}} }
],
filterPath: "responses.hits.total"
})
The msearch function takes a parameter of type MSearchParams:
msearch<T>(params: MSearchParams): Promise<MSearchResponse<T>>;
I recommend installing the Elasticsearch type definitions so that you can see the type hierarchy:
export interface MSearchParams extends GenericParams {
search_type?: "query_then_fetch" | "query_and_fetch" | "dfs_query_then_fetch" | "dfs_query_and_fetch";
maxConcurrentSearches?: number;
index?: NameList;
type?: NameList;
}
export interface GenericParams {
requestTimeout?: number;
maxRetries?: number;
method?: string;
body?: any;
ignore?: number | number[];
filterPath?: string | string[];
}
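To make the distinction concrete, here is a small language-neutral sketch, in Python, of the HTTP request that the Kibana example corresponds to: filter_path travels in the URL, while the body contains only header/query line pairs. The helper build_msearch_request is hypothetical and sends nothing; it only assembles the path and the newline-delimited body.

```python
import json
from urllib.parse import urlencode


def build_msearch_request(searches, filter_path=None):
    """Return (path, ndjson_body) for a Multi Search request.

    `searches` is a list of (header, body) pairs.
    """
    path = "/_msearch"
    if filter_path:
        # filter_path is a URL parameter, never a line in the body.
        path += "?" + urlencode({"filter_path": filter_path})
    lines = []
    for header, body in searches:
        lines.append(json.dumps(header))
        lines.append(json.dumps(body))
    # The newline-delimited body must end with a newline.
    return path, "\n".join(lines) + "\n"


query = {"query": {"term": {"status": 1}}}
path, payload = build_msearch_request(
    [({"index": "first_index"}, query), ({"index": "second_index"}, query)],
    filter_path="responses.hits.total",
)
print(path)
print(payload, end="")
```

This is why an extra { "q": ... } line inside the body does not work: every line in the body is read as either a search header or a search body.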

query multiple terms in multiple fields using elasticsearch

I want to search for multiple terms in two different fields (title and description), and the operator should be OR. This means that if any record contains any of these terms (heart, cancer), that record should be returned.
Here is my code:
curl -XGET 'localhost:9200/INDEXED REPOSITORY/_search?pretty' -H 'Content-Type: application/json' -d '{"query" : {"constant_score" : {"filter" : {"terms" : {"description","title" : ["heart","cancer"]}}}}}'
But, I get this error:
"error" : "SearchPhaseExecutionException[Failed to execute phase [query],
all shards failed; shardFailures {[6hWIW7xlSbSqKi4dNg_1bg][geo_021017cde]
[0]: SearchParseException[[geo_021017cde][0]: from[-1],size[-1]: Parse
Failure [Failed to parse source [{\"query\" : {\"constant_score\" :
{\"filter\" : {\"terms\" : {\"description\",\"title\" :
[\"heart\",\"cancer\"]}}}}
Am I missing anything?
I figured out how to resolve it:
{
"query": {
"constant_score": {
"filter": {
"bool": {
"should": [
{
"terms": {
"description": [
"heart",
"cancer"
]
}
},
{
"terms": {
"title": [
"heart",
"cancer"
]
}
}
]
}
}
}
}
}
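The same structure generalizes to any number of fields: one terms clause per field, combined under bool/should so that a match in any field is enough. A minimal sketch that generates the body in Python; the function name any_term_in_any_field is hypothetical.

```python
import json


def any_term_in_any_field(fields, terms):
    """Match documents where any field contains any of the terms."""
    return {
        "query": {
            "constant_score": {
                "filter": {
                    "bool": {
                        # One "terms" clause per field; "should" ORs them.
                        "should": [{"terms": {f: list(terms)}} for f in fields]
                    }
                }
            }
        }
    }


body = any_term_in_any_field(["description", "title"], ["heart", "cancer"])
print(json.dumps(body, indent=2))
```

The original attempt failed to parse because a terms clause takes a single field name as its key, so {"description","title" : [...]} is not valid JSON.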

Resources