I'm working with Elasticsearch 6.5.3. Please find my code below:
from elasticsearch import Elasticsearch, helpers
def search(es_object, index_name, request):
    toto = es_object.search(index=index_name, body=request)
    return toto
fuzziness = 1
request1 = {
"query": {
"match" : {
"Family_name" : {
"query" : family_names,
"fuzziness": fuzziness,
}
}
},
"size" : size
}
request2 = {
"query": {
"match" : {
"First_name" : {
"query" : first_names,
"fuzziness": fuzziness,
}
}
},
"size" : size
}
result1 = search(es,ELASTICSEARCH_INDEX_NAME,request1)
result2 = search(es,ELASTICSEARCH_INDEX_NAME,request2)
I'd like to run a bool query that applies a fuzzy match to both the first name and the family name. How can I do that?
I tried the following:
request = {
"bool": {
"must": [
{
"query": {
"match" : {
"First_name" : {
"query" : family_name,
"fuzziness": fuzziness,
}
}
},
"query": {
"match" : {
"Prénom" : {
"query" : firstname,
"fuzziness": fuzziness,
}
}
}
}
]
}
}
result = search(es,ELASTICSEARCH_INDEX_NAME,request)
I got the following error, which indicates a problem in my query. It seems that I cannot combine two fuzzy match queries this way:
RequestError: RequestError(400, 'parsing_exception', 'Unknown key for a START_OBJECT in [bool].')
Your query is slightly malformed. It should look like this:
{
"query": {
"bool": {
"must": [
{
"match": {
"First_name": {
"query": "family_name",
"fuzziness": fuzziness
}
}
},
{
"match": {
"Prénom": {
"query": "firstname",
"fuzziness": fuzziness
}
}
}
]
}
}
}
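The reason is that every element of the must array is already expected to be a query clause, so you don't need to wrap each one in another "query" key. The only "query" key belongs at the top level of the request body, around the bool.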
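For the Python client used in the question, the corrected request could be built roughly as follows. This is only a sketch: the helper names fuzzy_match and build_name_query, the default size and the sample names are assumptions for illustration, while the field names Family_name and First_name are taken from the question's first two requests.

```python
def fuzzy_match(field, text, fuzziness=1):
    """Return a single fuzzy match clause for one field."""
    return {"match": {field: {"query": text, "fuzziness": fuzziness}}}


def build_name_query(family_name, first_name, fuzziness=1, size=10):
    """Combine two fuzzy match clauses under bool/must (logical AND)."""
    return {
        # The only "query" key sits at the top level of the body.
        "query": {
            "bool": {
                "must": [
                    # Each array element is a query clause, with no extra wrapper.
                    fuzzy_match("Family_name", family_name, fuzziness),
                    fuzzy_match("First_name", first_name, fuzziness),
                ]
            }
        },
        "size": size,
    }


# Sample values are made up for illustration.
request = build_name_query("Dupont", "Jean")
print(request)
```

The resulting dict can be passed as the request argument of the search() function from the question.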
I have a collection with dates stored as strings in the format YYYY-mm-DD_HH:MM:SS.UUUZ, for example 2020-10-20_12:15:22.123+0100.
My goal is to query those strings as if they were dates.
What I am doing:
I'm unwinding some header data on multiple documents:
{
"$unwind": {
"path": "$events",
"preserveNullAndEmptyArrays": true
}
}
and also
{
"$unwind": {
"path": "$events.hi2",
"preserveNullAndEmptyArrays": true
}
}
I'm adding a new field that contains the string parsed as a Date:
{
"$addFields": {
"events.hi2.ConnectTimets": {
"$dateFromString": {
"dateString": "$events.hi2.ConnectTime",
"format": "%Y-%m-%d_%H:%M:%S.%L%Z"
}
}
}}
Then, in a $match stage, I try to keep only the records with a date later than 1 June 2020:
{
"$match":{
"events.hi2.ConnectTimets": {
"$gt": {"$dateFromString": {
"dateString": "2020-06-01",
"format": "%Y-%m-%d"
}
}
}
}
}
My result is "Fetched 0 record(s) in 0ms", even though at least one document in the database has a date matching the filter:
{
"_id" : ObjectId("5f438dfbf1feb13c4352e9f4"),
"timestamp" : NumberLong(1598262779045),
"attribute1" : [
"common"
],
"events" : [
{
"eventType" : "ty1",
"timestamp" : NumberLong(1598262779018),
"docId" : NumberLong(282578800148736),
"hi2" : {
"Priority" : 3,
"ClientId" : "client1",
"ConnectTime" : "2020-08-24_09:52:58.993+0000",
"Direction" : 1
}
},
{
"eventType" : "ty2",
"timestamp" : NumberLong(1598262781071),
"docId" : NumberLong(282578800148736),
"hi2" : {
"ref" : "bbbb"
}
}
]
}
whereas I expected something like:
{
"_id" : ObjectId("5f438dfbf1feb13c4352e9f4"),
"timestamp" : NumberLong(1598262779045),
"attribute1" : [
"common"
],
"events" : [
{
"eventType" : "ty1",
"timestamp" : NumberLong(1598262779018),
"docId" : NumberLong(282578800148736),
"hi2" : {
"Priority" : 3,
"ClientId" : "client1",
"ConnectTime" : "2020-08-24_09:52:58.993+0000",
"Direction" : 1
}
}
]
}
Note: the $addFields stage itself works. If I run the pipeline without the $match stage, the output contains the new field with the string parsed as a date.
You should never store date/time values as strings; always use proper Date objects.
Then the query is much simpler:
db.logging.aggregate([
{
$addFields: {
"events.hi2.ConnectTimets": {
$dateFromString: {
dateString: "$events.hi2.ConnectTime",
format: "%Y-%m-%d_%H:%M:%S.%L%Z"
}
}
}
},
{ $match: { "events.hi2.ConnectTimets": { $gte: ISODate("2020-06-01") } } }
])
When you have to work with date/time values, I recommend moment.js. Your query could then look like one of these:
{ $match: { "events.hi2.ConnectTimets": { $gte: moment("2020-06-01").toDate() } } }
{ $match: { "events.hi2.ConnectTimets": { $gte: moment("2020-06-01").tz('Europe/Zurich').toDate() } } }
{ $match: { "events.hi2.ConnectTimets": { $lte: moment.tz('Europe/Zurich').endOf('day').toDate() } } }
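From a Python client the same idea could be sketched as below. A likely reason the original $match returned nothing is that $match uses query syntax and does not evaluate aggregation expressions such as $dateFromString unless they are wrapped in $expr, so comparing against a real date value avoids the problem. The variable names cutoff, pipeline and sample are assumptions; the single $unwind on $events is taken from the question, and the block only builds the pipeline without connecting to a server.

```python
from datetime import datetime, timezone

# Literal lower bound; drivers such as PyMongo encode datetime as a BSON date.
cutoff = datetime(2020, 6, 1, tzinfo=timezone.utc)

pipeline = [
    # One output document per element of the events array (from the question).
    {"$unwind": {"path": "$events", "preserveNullAndEmptyArrays": True}},
    {
        "$addFields": {
            "events.hi2.ConnectTimets": {
                "$dateFromString": {
                    "dateString": "$events.hi2.ConnectTime",
                    "format": "%Y-%m-%d_%H:%M:%S.%L%Z",
                }
            }
        }
    },
    # Compare against a date value, not against an expression document.
    {"$match": {"events.hi2.ConnectTimets": {"$gte": cutoff}}},
]

# Local sanity check: parse the sample ConnectTime from the question in Python.
sample = datetime.strptime(
    "2020-08-24_09:52:58.993+0000", "%Y-%m-%d_%H:%M:%S.%f%z"
)
print(sample >= cutoff)
```

With PyMongo the list would be passed to something like db.logging.aggregate(pipeline), where the collection name logging comes from the answer above.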
I have a simple Node.js application.
I want to filter the data down to documents whose Path field contains the word 'get'.
For example, my data looks like this:
"_source": {
"time": "2020-03-12T01:25:41.61836-07:00",
"level": "Info",
"info": {
"IpAddress": "0.0.0.0",
"Path": "/api/test/getTest/1",
"QueryString": "",
"UserAgent": "",
"LogDate": "2020-03-12T08:25:41.6220806Z",
"Username": "cavidan.aliyev",
"NodeId": "123456"
}
In other words, my entity object's structure is as follows:
{
time,
level,
info: {
IpAddress,
Path,
QueryString,
UserAgent,
LogDate,
Username,
NodeId
}
}
My query is like below:
client.search({
index: collectionName,
body: {
from: (params.currentPage - 1) * params.pageSize,
size: params.pageSize,
"query": {
"bool": {
"must": mustArr,
"filter": [
{
"match_all": {}
}
]
}
}
}
}, function (err, res) {
if (err) {
reject(err);
}
else {
let result = res.hits.hits.map(x => x._source);
resolve(result);
}
});
How can I filter the data by the Path field so that only documents containing the word 'get' are returned?
Please help me, thanks
You can use a Wildcard Query inside the filter clause you already have. I'm assuming that the info.Path field uses the Standard Analyzer.
Note that, for the sake of simplicity, I only show what should go inside the filter clause.
If info is of nested type:
POST <your_index_name>/_search
{
"query": {
"bool": {
"filter": { <--- Note this
"nested": {
"path": "info",
"query": {
"wildcard": {
"info.Path": {
"value": "*get*"
}
}
}
}
}
}
}
}
If info is of object type:
POST <your_index_name>/_search
{
"query": {
"bool": {
"filter": { <--- Note this
"wildcard":{
"info.Path": "*get*"
}
}
}
}
}
Important note: wildcard search slows down query performance. If you have control over the Elasticsearch index, you should definitely look at the n-gram search model, which creates n-gram tokens at index time, as mentioned in this link.
Let me know if this helps!
If you want to exclude the documents whose path contains "get", put the wildcard query inside must_not.
For example:
POST <your_index_name>/_search
{
"query": {
"bool": {
"must_not":{
"wildcard":{
"info.Path": "*get*"
}
}
}
}
}
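The variants discussed above (object or nested mapping, include or exclude) can be gathered into one small builder. This is a sketch in Python rather than the asker's Node.js; the function name path_wildcard_query and its parameters are hypothetical. The wildcard clause is placed directly under must_not here, since must_not takes query clauses itself and filter is not a query type of its own.

```python
def path_wildcard_query(pattern="*get*", nested=False, exclude=False):
    """Build a bool query that keeps (or excludes) documents whose
    info.Path matches the wildcard pattern."""
    clause = {"wildcard": {"info.Path": {"value": pattern}}}
    if nested:
        # Only needed when `info` is mapped as a nested type.
        clause = {"nested": {"path": "info", "query": clause}}
    # filter keeps matching documents, must_not drops them.
    occurrence = "must_not" if exclude else "filter"
    return {"query": {"bool": {occurrence: [clause]}}}


include_get = path_wildcard_query()
exclude_get = path_wildcard_query(exclude=True)
print(include_get)
print(exclude_get)
```

The clause inside the returned bool could equally be appended to the existing filter array of the question's query instead of replacing the whole body.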
I have the following document structure:
Movie:
{
id: int,
title: string,
language: string,
genre: string,
description: string,
cast: array[string],
directors: array[string],
(...)
}
Now, in the web interface, the user can choose (via checkboxes) the language, genre, directors, etc., and type a query into the search box.
Let's say I want to search all thrillers (genre) that are in French or English (language) and directed by James Cameron or George Lucas (directors), and I type "abc" into the search box, which should be matched against the title or description.
What I want as a result:
- only movies in French or English
- only movies directed by James Cameron or George Lucas
- only thrillers
- movies that correspond to "abc"
I'm not sure how to do the OR in the filters, but I have started with something like:
curl -XGET 'localhost:9200/movies/_search?pretty' -H 'Content-Type: application/json' -d'
{
"query" : {
"constant_score" : {
"filter" : {
"bool" : {
"should" : [
{ "term" : {"language" : "french"}},
{ "term" : {"language" : "english"}},
{ "term" : {"directors" : "James Cameron"}},
{ "term" : {"directors" : "George Lucas"}}
],
"filter" : [
{ "term" : {"genre" : "thriller"}}
]
}
}
}
}
}
'
Could you please give me some hints?
As I understand it, you need a query like this: (language is A or B) and (directors is C or D) and (genre is E) and (title or description matches F). In that case you need the following query:
{
"query": {
"bool": {
"filter": [
{
"terms": {
"language": [
"french",
"english"
]
}
},
{
"bool": {
"should": [
{
"match": {
"directors": "James Cameron"
}
},
{
"match": {
"directors": "George Lucas"
}
}
]
}
},
{
"term": {
"genre": "thriller"
}
},
{
"multi_match": {
"query": "abc",
"fields": [
"title",
"description"
]
}
}
]
}
}
}
The filter clause works as an AND condition, but it gives every matched document the same score. If you need the score to vary depending on which subqueries match, use must instead of filter. The terms query matches if the specified field contains at least one of the listed terms. The multi_match query matches if at least one of the listed fields matches the query text.
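Since the selections come from checkboxes, the request body could be assembled from whatever the user picked. The sketch below is one possible variation, not the only layout: it keeps the checkbox selections in filter (no scoring) and moves the free-text multi_match into must so that text relevance still influences the score. The function name build_movie_query and its parameters are hypothetical.

```python
def build_movie_query(text, languages, directors, genre):
    """Build a bool query from checkbox selections plus a search-box text."""
    filters = []
    if languages:
        # terms: the field must contain at least one of the listed values.
        filters.append({"terms": {"language": languages}})
    if directors:
        # A nested bool/should acts as an OR over the chosen directors.
        filters.append(
            {"bool": {"should": [{"match": {"directors": d}} for d in directors]}}
        )
    if genre:
        filters.append({"term": {"genre": genre}})
    return {
        "query": {
            "bool": {
                # must contributes to the score; filter only narrows results.
                "must": [
                    {"multi_match": {"query": text, "fields": ["title", "description"]}}
                ],
                "filter": filters,
            }
        }
    }


body = build_movie_query(
    "abc", ["french", "english"], ["James Cameron", "George Lucas"], "thriller"
)
print(body)
```

Skipping empty selections avoids sending a terms clause with an empty list, which would match no documents.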
How do I query the following document?
{
"_id" : ObjectId("58787242f7d06edbbb88f46e"),
"name" : "aah",
"values" : [
"1484196300685",
"10"
],
"attributes" : {
}
}
I need the time where value = 10; the expected result is 1484196300685.
db.collection.find(
{ values: { $elemMatch: { $eq: "10" } } }
)
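The find() call returns the whole matching document, so the time still has to be read from it on the client side. A minimal Python sketch follows, assuming the values array always has the shape [time, value] as in the sample document; the helper name time_for_value is hypothetical.

```python
def time_for_value(doc, wanted="10"):
    """Return the time string if the document's value equals `wanted`.

    Assumes doc["values"] is always a two-element [time, value] list.
    """
    timestamp, value = doc["values"]
    return timestamp if value == wanted else None


# Sample document from the question (without the _id field).
doc = {"name": "aah", "values": ["1484196300685", "10"], "attributes": {}}
print(time_for_value(doc))
```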
The query below throws the following exception with ES 2.0:
bool query does not support filter
How should I use the Exists and Missing queries?
Query:
{
"bool":{
"must":[
{
"bool":{
"should":[
{
"bool":{
"must":[
{
"range":{
"startDate":{
"lte":"2016-10-27T11:24:49.6616538+05:30"
}
}
}
],
"filter":[
{
"bool":{
"must_not":[
{
"exists":{
"field":"endDate"
}
}
]
}
}
]
}
}
]
}
}
]
}
}
Well, first off, that error generally comes from running the query on a 1.x version of Elasticsearch (in that case you need a FilteredQuery).
Next, it appears you have many levels of unnecessary nesting; maybe you stripped other things out to make a simpler example. I have rewritten your query like this (and added the outer braces):
{
"query" : {
"bool" : {
"must" : [{
"range" : {
"startDate" : { "lte" : "2016-10-27T11:24:49.6616538+05:30" }
}
}
],
"filter" : [{
"bool" : {
"must_not" : [{
"exists" : { "field" : "endDate" }
}
]
}
}
]
}
}
}
Both your original query and my rewritten one work fine on my server (v2.3.1), so I'm guessing you are really running ES 1.x?
Also, if you are not leveraging Lucene scoring and just want to return the documents (or apply your own sort), you can drop the filter altogether and simplify it further:
{
"query" : {
"bool" : {
"must" : [{
"range" : {
"startDate" : { "lte" : "2016-10-27T11:24:49.6616538+05:30"}
}
}
],
"must_not" : [{
"exists" : { "field" : "endDate" }
}
]
}
}
}
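To make the version difference concrete, here is a hedged Python sketch that builds either form of the request. The function name open_ended_query and the es_major_version parameter are hypothetical, and the 1.x branch reflects the filtered query syntax to the best of my knowledge, so check it against the documentation for your exact version.

```python
def open_ended_query(start_before, es_major_version=2):
    """Match documents that started on or before `start_before`
    and have no endDate field."""
    range_clause = {"range": {"startDate": {"lte": start_before}}}
    no_end_date = {"bool": {"must_not": [{"exists": {"field": "endDate"}}]}}
    if es_major_version < 2:
        # 1.x: bool has no filter clause, so wrap query and filter
        # in a filtered query instead.
        return {
            "query": {"filtered": {"query": range_clause, "filter": no_end_date}}
        }
    # 2.x and later: bool accepts a filter clause directly.
    return {"query": {"bool": {"must": [range_clause], "filter": [no_end_date]}}}


new_style = open_ended_query("2016-10-27T11:24:49.6616538+05:30")
old_style = open_ended_query("2016-10-27T11:24:49.6616538+05:30", es_major_version=1)
print(new_style)
print(old_style)
```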