Searching for words that contain spaces in Elasticsearch - Node.js

I have a problem searching fields whose values contain spaces in Elasticsearch.
const body = {
  query: {
    wildcard: {
      name: `${query}*`
    }
  }
}
Let's say I am searching city names. When I search for "Los", it returns all cities that have the substring "los" in their name.
But when I search for "Los ang", it doesn't return "Los angeles".

After trying many other approaches, I finally got the result I wanted without wildcard, but I still don't know whether this is the right way to do it.
const body = {
  size: 500,
  from: 0,
  query: {
    query_string: {
      fields: ["name"],
      query: `${query}*`,
      default_operator: "AND"
    }
  }
}
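A likely reason the wildcard version fails on "Los ang": wildcard is a term-level query, so the pattern is compared against single indexed terms, and an analyzed text field typically stores "Los angeles" as two separate terms. One alternative is match_phrase_prefix, which analyzes the input and treats its last word as a prefix. The sketch below only builds the request body; it assumes name is an analyzed text field, and buildCityPrefixQuery is a hypothetical helper name.

```javascript
// Hypothetical helper: builds a search body that matches multi-word prefixes.
// Assumes `field` is an analyzed `text` field (e.g. the `name` field above).
function buildCityPrefixQuery(userInput, field = "name") {
  return {
    query: {
      match_phrase_prefix: {
        // All words but the last must match as a phrase; the last word is a prefix.
        [field]: { query: userInput }
      }
    }
  };
}

const prefixBody = buildCityPrefixQuery("Los ang");
console.log(JSON.stringify(prefixBody));
```

The resulting object can be passed as the body of the search call in place of the wildcard or query_string body.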

Related

Nodejs Elasticsearch query default behaviour

On a daily basis, I'm pushing data (time series) to Elasticsearch. My indices are named myindex_*, where * is the current date (an index pattern has been set up for them). Thus after a week, I have: myindex_2022-06-20, myindex_2022-06-21... myindex_2022-06-27.
Let's assume my indices store product prices. Inside each myindex_* index, I have documents like the following.
myindex_2022-06-26 contains many product prices like this:
{
"reference_code": "123456789",
"price": 10.00
},
...
myindex_2022-06-27:
{
"reference_code": "123456789",
"price": 12.00
},
I'm using this query to get the reference code and the corresponding prices. And it works great.
const data = await elasticClient.search({
  index: "myindex_2022-06-27",
  body: {
    query: {
      match: {
        "reference_code": "123456789"
      }
    }
  }
});
However, I would like a query that, if there is no data in the index for 2022-06-27, checks the previous index (2022-06-26), and so on (up to, e.g., 10 times).
I'm not sure, but it seems to do this when I replace myindex_2022-06-27 with myindex_* (I don't know whether that is the default behaviour).
The issue is that when I do it this way, I get prices from other indices, but it seems to use the oldest one. I would like to get the newest one instead.
How should I proceed?
If you query with an index wildcard, it should return a list of documents, where every document includes some meta fields such as _index and _id.
You can sort by _index to make Elasticsearch return the latest document at position [0] of the list.
const data = await elasticClient.search({
  index: "myindex_2022-*",
  body: {
    query: {
      match: {
        "reference_code": "123456789"
      }
    },
    sort: { "_index": "desc" }
  }
});
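To cap the lookup at the last 10 daily indices rather than every index matching the wildcard, one option is to list the index names explicitly and still sort by _index. This is a minimal sketch, assuming the myindex_YYYY-MM-DD naming from the question; recentIndexNames and buildLatestPriceSearch are hypothetical helpers, and ignore_unavailable is passed so that dates without an index are skipped.

```javascript
// Hypothetical helper: names of the daily indices for the `days` days ending at `endDate` (UTC).
function recentIndexNames(prefix, endDate, days) {
  const names = [];
  for (let i = 0; i < days; i++) {
    const day = new Date(Date.UTC(
      endDate.getUTCFullYear(), endDate.getUTCMonth(), endDate.getUTCDate() - i
    ));
    names.push(prefix + day.toISOString().slice(0, 10)); // YYYY-MM-DD
  }
  return names;
}

// Hypothetical helper: search parameters that return only the newest matching document.
function buildLatestPriceSearch(referenceCode, endDate, days = 10) {
  return {
    index: recentIndexNames("myindex_", endDate, days).join(","),
    ignore_unavailable: true, // skip dates for which no index exists
    body: {
      size: 1, // keep only the hit from the newest index
      query: { match: { reference_code: referenceCode } },
      sort: [{ _index: "desc" }]
    }
  };
}

// Usage (not run here):
// const data = await elasticClient.search(buildLatestPriceSearch("123456789", new Date()));
```

Because the date suffix is zero-padded, sorting index names in descending order is the same as sorting them from newest to oldest.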

Does EdgeNGram autocomplete_filter make sense with prefix search?

I have an Elasticsearch index with around 1 million records.
I want to do a multi-field prefix search against 2 fields in the index, Name and ID (there are around 10 fields in total).
Does creating an EdgeNGram autocomplete filter make sense at all?
Or am I missing the point of EdgeNGram?
Here is the code I have for creating the index:
client.indices.create({
  index: 'testing',
  // type: 'text',
  body: {
    settings: {
      analysis: {
        filter: {
          autocomplete_filter: {
            type: 'edge_ngram',
            min_gram: 3,
            max_gram: 20
          }
        },
        analyzer: {
          autocomplete: {
            type: 'custom',
            tokenizer: 'standard',
            filter: [
              'lowercase',
              'autocomplete_filter'
            ]
          }
        }
      }
    }
  }
}, function(err, resp, status) {
  if (err) {
    console.log(err);
  }
  else {
    console.log("create", resp);
  }
});
Code for searching
client.search({
  index: 'testing',
  type: 'article',
  body: {
    query: {
      multi_match: {
        query: "87041",
        fields: ["name", "id"],
        type: "phrase_prefix"
      }
    }
  }
}, function(error, response, status) {
  if (error) {
    console.log("search error: " + error)
  }
  else {
    console.log("--- Response ---");
    console.log(response);
    console.log("--- Hits ---");
    response.hits.hits.forEach(function(hit) {
      console.log(hit);
    })
  }
});
The search returns the correct results, so my question is: does creating the edge n-gram filter and analyzer make sense in this case?
Or is this prefix functionality provided out of the box?
Thanks a lot for your info.
It depends on your use case. Let me explain.
You can use n-grams for this feature. Let's say your data is london bridge; if your min gram is 1 and max gram is 20, it will be tokenized as l, lo, lon, etc.
The advantage here is that even if you search for bridge, or any other token that is among the generated n-grams, it will match.
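To make the tokenization concrete, here is a small simulation of what an edge n-gram filter produces per word. It is a simplified illustration, not Elasticsearch's implementation: it splits on whitespace and lowercases, which only approximates the standard tokenizer plus lowercase filter, and edgeNgrams is a hypothetical function name.

```javascript
// Simplified illustration only: whitespace splitting and lowercasing approximate
// (but are not) the standard tokenizer + lowercase filter.
function edgeNgrams(text, minGram, maxGram) {
  const grams = [];
  for (const token of text.toLowerCase().split(/\s+/).filter(Boolean)) {
    const longest = Math.min(maxGram, token.length);
    // Emit every prefix of the token whose length is within [minGram, maxGram].
    for (let len = minGram; len <= longest; len++) {
      grams.push(token.slice(0, len));
    }
  }
  return grams;
}

console.log(edgeNgrams("london bridge", 1, 20).join(", "));
// l, lo, lon, lond, londo, london, b, br, bri, brid, bridg, bridge
```

With min_gram set to 3, as in the question's filter, the one- and two-character prefixes are not generated, so inputs shorter than three characters would not match any n-gram.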
There is also an out-of-the-box feature, the completion suggester. It uses an FST model to store suggestions. As the documentation says, it is faster to search but costlier to build. But the thing is, it is a prefix suggester, meaning that searching for bridge will not return london bridge by default. There are ways to make this work, though: a workaround is to index an array of tokens as inputs. Here london bridge and bridge would be the tokens.
There is one more, called the context suggester. If you know that you are going to search on name or id, it is a better fit than the completion suggester. Whereas the completion suggester works over all suggestions in the index, the context suggester narrows them down based on a context.
Since you say it is a prefix search, you can go for completion. You also mentioned that there are 10 such fields; if you know up front which field is to be suggested, then you can go for the context suggester.
One nice answer about edge ngram and completion
completion suggester for middle of the words - I used this solution, and it works like a charm.
You can refer to the documentation for the other default options available within suggesters.
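One thing worth checking: the index settings in the question define the autocomplete analyzer, but no mapping that references it is shown, and a custom analyzer only takes effect for fields whose mapping names it. Below is a sketch of such a mapping, assuming Elasticsearch 7+ (typeless mappings) and that name and id are text fields; buildAutocompleteIndexBody is a hypothetical helper.

```javascript
// Hypothetical helper: index body that applies the autocomplete analyzer at index time.
// Assumes Elasticsearch 7+ (typeless mappings) and `text` fields such as `name` and `id`.
function buildAutocompleteIndexBody(fieldNames) {
  const properties = {};
  for (const field of fieldNames) {
    properties[field] = {
      type: "text",
      analyzer: "autocomplete",   // edge n-grams are generated when indexing
      search_analyzer: "standard" // the search text itself is not n-grammed
    };
  }
  return {
    settings: {
      analysis: {
        filter: {
          autocomplete_filter: { type: "edge_ngram", min_gram: 3, max_gram: 20 }
        },
        analyzer: {
          autocomplete: {
            type: "custom",
            tokenizer: "standard",
            filter: ["lowercase", "autocomplete_filter"]
          }
        }
      }
    },
    mappings: { properties }
  };
}

console.log(JSON.stringify(buildAutocompleteIndexBody(["name", "id"]).mappings));
```

With a mapping like this, a plain match query on the field behaves like a prefix search for inputs of at least min_gram characters. The phrase_prefix query from the question, by contrast, expands the prefix at query time and does not depend on the n-gram analyzer.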

A find() statement with possible null parameters

I'm trying to figure out how Mongoose and MongoDB work... I'm really new to them, and I can't figure out how to return values from a find statement where some of the given query parameters may be null. Is there an attribute I can set for this or something?
To explain further, I have a web page with different input fields that are used to search for a company; however, they're not all mandatory.
var Company = mongoose.model('Company');
Company.find({
  companyName: req.query.companyName,
  position: req.query.position,
  areaOfExpertise: req.query.areaOfExpertise,
  zip: req.query.zip,
  country: req.query.country
}, function(err, docs) {
  res.json(docs);
});
By filling out all the input fields on the web page I get a result back, but only the specific one that matches. If I only fill out country, it returns nothing because the rest are empty, but I want it to return all rows which are, e.g., in Germany. I hope I expressed myself clearly enough.
You need to wrap the queries with the $or logical operator, for example:
var Company = mongoose.model('Company');
Company.find(
  {
    "$or": [
      { "companyName": req.query.companyName },
      { "position": req.query.position },
      { "areaOfExpertise": req.query.areaOfExpertise },
      { "zip": req.query.zip },
      { "country": req.query.country }
    ]
  }, function(err, docs) {
    res.json(docs);
  }
);
Another approach would be to construct a query that checks for empty parameters and includes only those that are not null. For example, you can use the req.query object itself as your query, assuming its keys are the same as your document's fields, as in the following:
/*
  the req.query object will only have two parameters/keys e.g.
  req.query = {
    position: "Developer",
    country: "France"
  }
*/
var Company = mongoose.model('Company');
Company.find(req.query, function(err, docs) {
  if (err) throw err;
  res.json(docs);
});
In the above, the req.query object acts as the query and has an implicit logical AND operation since MongoDB provides an implicit AND operation when specifying a comma separated list of expressions. Using an explicit AND with the $and operator is necessary when the same field or operator has to be specified in multiple expressions.
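The need for an explicit $and is easy to see in plain JavaScript, because a repeated key in an object literal silently overwrites the earlier one. The field values below ("Designer", "Germany" and so on) are illustrative sample data, not taken from a real collection.

```javascript
// A repeated key in an object literal overwrites the earlier one, so two $or
// clauses cannot both live at the top level of a single query object.
const overwritten = {
  $or: [{ position: "Developer" }, { position: "Designer" }],
  $or: [{ country: "France" }, { country: "Germany" }]
};
console.log(Object.keys(overwritten).length); // 1: only the country clause survives

// An explicit $and keeps both clauses as separate expressions.
const combined = {
  $and: [
    { $or: [{ position: "Developer" }, { position: "Designer" }] },
    { $or: [{ country: "France" }, { country: "Germany" }] }
  ]
};
console.log(combined.$and.length); // 2
```

The combined object can be passed to Company.find() like any other filter.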
If you are after a query that combines logical AND and OR, for example position AND country, OR any of the other fields, then you may tweak the query to:
var Company = mongoose.model('Company');
Company.find(
  {
    "$or": [
      { "companyName": req.query.companyName },
      {
        "position": req.query.position,
        "country": req.query.country
      },
      { "areaOfExpertise": req.query.areaOfExpertise },
      { "zip": req.query.zip }
    ]
  }, function(err, docs) {
    res.json(docs);
  }
);
But then again, this depends on which query parameters need to be combined as mandatory, etc.
I simply ended up deleting the parameters in the query in case they were empty. It seems all the text fields in the submit are submitted as "" (empty). Since there are no such values in the database, it would return nothing. So simple it never crossed my mind...
Example:
if (req.query.companyName === "") {
  delete req.query.companyName;
}
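The same idea can be applied to every field at once instead of one if per parameter. This is a minimal sketch, assuming all parameters arrive as strings as described above; buildCompanyFilter is a hypothetical helper, and the field list is taken from the question's query.

```javascript
// Hypothetical helper: copies only the allowed, non-empty parameters into a new
// filter object, leaving the original params (e.g. req.query) untouched.
function buildCompanyFilter(params, allowedFields) {
  const filter = {};
  for (const field of allowedFields) {
    const value = params[field];
    if (typeof value === "string" && value.trim() !== "") {
      filter[field] = value;
    }
  }
  return filter;
}

const companyFields = ["companyName", "position", "areaOfExpertise", "zip", "country"];
console.log(buildCompanyFilter({ companyName: "", country: "Germany" }, companyFields));
// { country: 'Germany' }

// Usage (not run here):
// Company.find(buildCompanyFilter(req.query, companyFields), function(err, docs) { ... });
```

Restricting the filter to a list of known fields also keeps unexpected keys from the request out of the query.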

How to do a nested search with a custom variable in Elasticsearch using Node.js?

This is my data saved in Elasticsearch:
{
  index: productName,
  type: 'users',
  body: {
    name: 'xyz',
    subject: {
      12: {
        id: 12,
        name: 'Maths',
        count: 3
      },
      13: {
        id: 13,
        name: 'Physics',
        count: 7
      }
    }
  }
}
Is there a way to search and get the total number of users whose count in Maths is greater than 0, where 12 will be in a variable, say subject_id?
I tried searching the docs but couldn't find an example to use.
I am new to Elasticsearch; any help would be appreciated. Thanks.
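Create an object first, like this:
var queryObj = {
  "query": {
    "range": {
    }
  }
};
// Field path built from the variable, e.g. "subject.12.count"
queryObj.query.range['subject.' + data.subject_id + '.count'] = {
  "gte": 1
};
Then pass this object as the body of the Elasticsearch request, like this:
elasticClient.search({
  index: indexName,
  type: type,
  // Pass the object itself; { queryObj } would nest the query under a "queryObj" key
  body: queryObj
}).then(promiseFunc)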
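The dynamic field path can also be written with a computed property name, which avoids mutating the object after creating it. This is a sketch under the assumption that documents follow the subject.<id>.count layout from the question; buildSubjectCountQuery is a hypothetical helper.

```javascript
// Hypothetical helper: range query on the per-subject counter, e.g. "subject.12.count".
// Assumes the document layout from the question (subject.<id>.count).
function buildSubjectCountQuery(subjectId, minCount = 1) {
  return {
    query: {
      range: {
        [`subject.${subjectId}.count`]: { gte: minCount }
      }
    }
  };
}

console.log(JSON.stringify(buildSubjectCountQuery(12)));
// {"query":{"range":{"subject.12.count":{"gte":1}}}}
```

Since only the total number of users is needed, the body could be sent with the client's count method rather than search; the exact shape of the response depends on the client version.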

Elasticsearch exact phrase matching

I am new to ES. I am having trouble finding exact phrase matches.
Let's assume my index has a field called movie_name.
Let's assume I have 3 documents with the following values
movie_name = Mad Max
movie_name = mad max
movie_name = mad max 3d
If my search query is Mad Max, I want the first 2 documents to be returned but not the 3rd.
If I do the "not_analyzed" solution I will get only document 1 but not 2.
What am I missing?
I was able to do it using the following commands. Basically, create a custom analyzer that uses the keyword tokenizer to prevent tokenization and the lowercase filter to ignore case. Then use the analyzer in the "mappings" for the desired field, in this case "movie_name".
PUT /movie
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "keylower": {
            "tokenizer": "keyword",
            "filter": "lowercase"
          }
        }
      }
    }
  },
  "mappings": {
    "search": {
      "properties": {
        "movie_name": { "type": "string", "analyzer": "keylower" }
      }
    }
  }
}
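Why this gives the desired result can be shown with a small simulation. It is a simplified stand-in, not Elasticsearch code: the keyword tokenizer keeps the whole value as one token and the lowercase filter normalizes case, so two values match only if they are equal after lowercasing. The function names are hypothetical.

```javascript
// Simplified stand-in for the "keylower" analyzer: the keyword tokenizer keeps the
// whole value as a single token and the lowercase filter normalizes its case.
function keylower(value) {
  return value.toLowerCase();
}

// A value matches only when its single analyzed token equals that of the search text.
function exactMatches(values, searchText) {
  const wanted = keylower(searchText);
  return values.filter((value) => keylower(value) === wanted);
}

console.log(exactMatches(["Mad Max", "mad max", "mad max 3d"], "Mad Max"));
// [ 'Mad Max', 'mad max' ]
```

The third document is excluded because "mad max 3d" is indexed as one token that differs from "mad max".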
Use phrase matching like this:
{
  "query": {
    "match_phrase": {
      "movie_name": "a"
    }
  }
}
