Facing difficulty using compound query with Elasticsearch JS - node.js

I am using the official Elasticsearch package from npm in my Node.js application. I was trying to search with a compound (bool) query, but the search does not work as expected.
To debug the issue, I tried passing different query bodies. I found an inconsistency: the elasticsearch library returns nothing for a query that works when sent directly to the Elasticsearch REST API. I have not been able to find this behavior documented anywhere.
I ran the same query in two ways:
1) Node using the official elastic search library
2) Over the Elasticsearch API using Postman
I> Using Elasticsearch JS
{
"index": "bank",
"type": "account",
"body": {
"query": {
"bool": {
"must": [{
"match": {
"address": "avenue"
}
}]
}
}
}
}
II> Using the Elasticsearch API
{
"query": {
"bool": {
"must": [{
"match": {
"address": "avenue"
}
}]
}
}
}
The official library returns an empty array, but the same query sent to the Elasticsearch API returns the correct set of data.
Another peculiar observation: the query below works with elasticsearch JS when "must" is a single object, but not when it is an array of objects.
{
"index": "bank",
"type": "account",
"body": {
"query": {
"bool": {
"must": {
"match": {
"address": "avenue"
}
}
}
}
}
}
I've been racking my brain over where I'm going wrong. I went through the docs, Stack Overflow, and a little of the source code, and came back empty-handed.
Any help would be appreciated.
Thanks a lot

How to Configure the Elasticsearch with fuzzy search

I have a requirement to install Elasticsearch on a Linux box and use it for fuzzy search.
How do I install and configure it?
Thanks
You do not need any extra configuration to use fuzzy search in Elasticsearch. What matters is how you write the query.
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-fuzzy-query.html
To install Elasticsearch in Linux, you can refer to this official ES documentation
There can be several types of fuzzy searches according to your use case -
1. You can use match with fuzziness parameter
2. You can use fuzzy query
Adding a working example with index data, mapping, search query, and search result
Index Mapping:
{
"mappings": {
"properties": {
"name": {
"type": "text"
}
}
}
}
Index Data:
{
"name": "breadsticks"
}
Search Query using Match Query:
Searching for breadstiks instead of breadsticks
{
"query":{
"match":{
"name":{
"query":"breadstiks",
"fuzziness":"auto"
}
}
}
}
Search Result:
"hits": [
{
"_index": "66962659",
"_type": "_doc",
"_id": "1",
"_score": 0.25891387,
"_source": {
"name": "breadsticks"
}
}
]
You can set the fuzziness value according to your use case
Search Query using Fuzzy query:
{
"query": {
"fuzzy": {
"name": {
"value": "breadstiks"
}
}
}
}
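To see why the misspelled breadstiks still matches, here is a small Python sketch of the edit-distance idea behind fuzziness. The function names are hypothetical. It uses plain Levenshtein distance (by default Elasticsearch also counts swapping two adjacent characters as a single edit), and the thresholds in auto_fuzziness are the default AUTO thresholds as I understand them from the fuzziness documentation.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character inserts, deletes, or substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def auto_fuzziness(term: str) -> int:
    """Maximum edits allowed by "fuzziness": "AUTO" with its default thresholds."""
    if len(term) <= 2:
        return 0   # very short terms must match exactly
    if len(term) <= 5:
        return 1
    return 2


def would_match(query_term: str, indexed_term: str) -> bool:
    """True if the indexed term is within the edit budget of the query term."""
    return levenshtein(query_term, indexed_term) <= auto_fuzziness(query_term)


print(levenshtein("breadstiks", "breadsticks"))  # 1 (one missing "c")
print(would_match("breadstiks", "breadsticks"))  # True
```

This only models the distance check; the real queries also have options such as prefix_length and max_expansions that are not covered here.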

ElasticSearch search with querystring and verify another field

I need to translate the following SQL query to ES query:
SELECT *
FROM SKILL
WHERE SKILL.name LIKE 'text' and SKILL.type = 'hard'
I have tried the following using "elasticsearch" library for python3:
query = self.__es.search(
    index="skills",
    body={
        "from": skip,
        "size": limit,
        "query": {"query_string": {"query": "text"}},
    },
)
and this worked well. But now, I don't know how to check that the field 'type' is equal to 'hard'.
How can I do that?
Thank you.
You have to use a bool query and in the "must" part put two queries, the full text one and a term one:
{
"query": {
"bool": {
"must": [{
"match": {
"name": "this is a test"
}
}, {
"term": {
"type": "hard"
}
}]
}
}
}
For the term query to match exactly, the type property has to be mapped as a keyword field beforehand.
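With the Python client used in the question, the request body could be assembled roughly like this. It is only a sketch: build_skill_query is a hypothetical helper, and it keeps the asker's query_string clause for the text part while adding the term clause for type.

```python
import json


def build_skill_query(text: str, skill_type: str, skip: int = 0, limit: int = 10) -> dict:
    """Build a search body: full-text query AND exact match on the `type` field."""
    return {
        "from": skip,
        "size": limit,
        "query": {
            "bool": {
                "must": [
                    {"query_string": {"query": text}},  # full-text part, as in the question
                    {"term": {"type": skill_type}},     # exact match; assumes a keyword field
                ]
            }
        },
    }


body = build_skill_query("text", "hard", skip=0, limit=20)
print(json.dumps(body, indent=2))
# The body would then be passed the same way as in the question:
#   self.__es.search(index="skills", body=body)
```

Both clauses sit inside the "must" array, so a document has to satisfy the text query and the type check at the same time, like the AND in the SQL statement.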

Elasticsearch Nest Query not returning result as expected

I'm new to Elasticsearch. I'm trying a query and when giving full name I'm getting results. When I give part of it, it's not returning any results. Below is the sample that I have been trying.
{
"query": {
"multi_match": {
"query": "recharge",
"fields": ["category.*","categoryName^3","alterNames","categoryDescription"],
"type": "best_fields"
}
},
"size": 1000
}
If I pass "rech" as the query, I don't get any results. Can anyone help me here?
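As far as I understand, you want results for an incomplete query term, so you need a wildcard. Note that multi_match analyzes the query text, so a trailing * is not treated as a wildcard there; query_string does support wildcards and accepts the same list of fields:
{
"query": {
"query_string": {
"query": "rech*",
"fields": ["category.*", "categoryName^3", "alterNames", "categoryDescription"],
"type": "best_fields"
}
}
}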
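Another option for matching on the beginning of a word is the phrase_prefix type of multi_match, which treats the last term of the query as a prefix. Here is a sketch in Python; build_prefix_search is a hypothetical helper name, the field list is taken from the question, and it assumes all of those fields are text fields.

```python
import json


def build_prefix_search(partial: str, size: int = 1000) -> dict:
    """Build a multi_match body that matches terms starting with `partial`."""
    fields = ["category.*", "categoryName^3", "alterNames", "categoryDescription"]
    return {
        "query": {
            "multi_match": {
                "query": partial,          # no "*" needed; the last term is used as a prefix
                "fields": fields,
                "type": "phrase_prefix",
            }
        },
        "size": size,
    }


print(json.dumps(build_prefix_search("rech"), indent=2))
```

If prefix search is needed often or on a large index, an edge n-gram analyzer at index time is usually mentioned as the alternative, but that requires changing the mapping.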

ElasticSearch Custom Script for Ordering Performance

I wrote a simple scoring based on a document parameter like below:
POST /_scripts/groovy/CustomScoring
{
"script": "(_source.ProductHits==null ? 0.1 : (_source.ProductHits[myval]==null ? 0.2 : _source.ProductHits[myval]))"
}
When I use this custom script to sort search results like this:
POST /ecs/product/_search
{
"query": {
"bool": {
"must": [
{
"function_score":{
"query" : {"match_all": {}}
,"script_score": {
"script_id": "CustomScoring",
"lang" : "groovy",
"params":{
"myval": "iphone"
}
}
}
}
]
}
}
}
It takes 800ms to run on 50'000 documents (vs initial run-time which was around 1ms).
How can I optimize this groovy function?
Can Elasticsearch use some kind of caching for this function?
P.S. When I tried some complex formulas based on doc.some_param.value and built-in functions like log, it took 40ms instead, which is still reasonable.
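For reference, the logic of the CustomScoring script, restated in Python for readability only (Elasticsearch does not run this). It assumes, based on the script itself, that ProductHits is a map from a term such as "iphone" to a number; custom_score is a hypothetical name.

```python
from typing import Optional


def custom_score(source: dict, myval: str) -> float:
    """Mirror of the Groovy expression: fall back to 0.1 or 0.2 when data is missing."""
    product_hits: Optional[dict] = source.get("ProductHits")
    if product_hits is None:
        return 0.1  # document has no ProductHits at all
    hits = product_hits.get(myval)
    if hits is None:
        return 0.2  # ProductHits exists but has no entry for this term
    return hits     # use the stored hit count as the score


print(custom_score({}, "iphone"))                               # 0.1
print(custom_score({"ProductHits": {}}, "iphone"))              # 0.2
print(custom_score({"ProductHits": {"iphone": 7}}, "iphone"))   # 7
```

The point to notice is that every branch reads from _source, while the faster formula mentioned in the P.S. reads from doc values.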

Query all unique values of a field with Elasticsearch

How do I search for all unique values of a given field with Elasticsearch?
I have a query like select full_name from authors, and I want to display the resulting list to users in a form.
You could make a terms facet on your 'full_name' field. But in order to do that properly you need to make sure you're not tokenizing it while indexing, otherwise every entry in the facet will be a different term that is part of the field content. You most likely need to configure it as 'not_analyzed' in your mapping. If you are also searching on it and you still want to tokenize it you can just index it in two different ways using multi field.
You also need to take into account that depending on the number of unique terms that are part of the full_name field, this operation can be expensive and require quite some memory.
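To illustrate the tokenization point, here is a toy Python sketch. The analyze function is a very rough stand-in for an analyzer (lowercase plus whitespace split), not what Elasticsearch actually does, and all names are made up for the example.

```python
from collections import Counter


def analyze(value: str) -> list:
    """Rough stand-in for an analyzer: lowercase and split on whitespace."""
    return value.lower().split()


def facet_counts(values, analyzed: bool) -> Counter:
    """Count documents per term, the way a terms facet would bucket them."""
    counts = Counter()
    for value in values:
        terms = analyze(value) if analyzed else [value]
        # A document counts once per distinct term it contains.
        counts.update(set(terms))
    return counts


names = ["Jim Gray", "Jim Gray", "Ken"]
print(facet_counts(names, analyzed=True))   # buckets per token: jim, gray, ken
print(facet_counts(names, analyzed=False))  # buckets per full name: Jim Gray, Ken
```

With the analyzed field the name "Jim Gray" never shows up as one entry, which is why the field has to be indexed unanalyzed (or as a second sub-field) for this use case.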
For Elasticsearch 1.0 and later, you can leverage terms aggregation to do this,
query DSL:
{
"aggs": {
"NAME": {
"terms": {
"field": "",
"size": 10
}
}
}
}
A real example:
{
"aggs": {
"full_name": {
"terms": {
"field": "authors",
"size": 0
}
}
}
}
Then you can get all unique values of authors field.
size=0 means the number of terms is not limited (this requires Elasticsearch 1.1.0 or later).
Response:
{
...
"aggregations" : {
"full_name" : {
"buckets" : [
{
"key" : "Ken",
"doc_count" : 10
},
{
"key" : "Jim Gray",
"doc_count" : 10
}
]
}
}
}
See the Elasticsearch terms aggregation documentation.
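Once the response comes back, the unique values are just the bucket keys. A minimal Python sketch, assuming the response has already been parsed into a dict shaped like the one above (unique_values is a hypothetical helper):

```python
def unique_values(response: dict, agg_name: str) -> list:
    """Return the bucket keys of a terms aggregation, in the order returned."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [bucket["key"] for bucket in buckets]


# Sample response copied from the answer (trimmed to the aggregation part).
response = {
    "aggregations": {
        "full_name": {
            "buckets": [
                {"key": "Ken", "doc_count": 10},
                {"key": "Jim Gray", "doc_count": 10},
            ]
        }
    }
}

print(unique_values(response, "full_name"))  # ['Ken', 'Jim Gray']
```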
Intuition:
In SQL parlance:
Select distinct full_name from authors;
is equivalent to
Select full_name from authors group by full_name;
So, we can use the grouping/aggregate syntax in ElasticSearch to find distinct entries.
Assume the following is the structure stored in elastic search :
[{
"author": "Brian Kernighan"
},
{
"author": "Charles Dickens"
}]
What did not work: Plain aggregation
{
"aggs": {
"full_name": {
"terms": {
"field": "author"
}
}
}
}
I got the following error:
{
"error": {
"root_cause": [
{
"reason": "Fielddata is disabled on text fields by default...",
"type": "illegal_argument_exception"
}
]
}
}
What worked like a charm: Appending .keyword with the field
{
"aggs": {
"full_name": {
"terms": {
"field": "author.keyword"
}
}
}
}
And the sample output could be:
{
"aggregations": {
"full_name": {
"buckets": [
{
"doc_count": 372,
"key": "Charles Dickens"
},
{
"doc_count": 283,
"key": "Brian Kernighan"
}
],
"doc_count": 1000
}
}
}
Bonus tip:
Let us assume the field in question is nested as follows:
[{
"authors": [{
"details": [{
"name": "Brian Kernighan"
}]
}]
},
{
"authors": [{
"details": [{
"name": "Charles Dickens"
}]
}]
}
]
Now the correct query becomes:
{
"aggregations": {
"full_name": {
"aggregations": {
"author_details": {
"terms": {
"field": "authors.details.name"
}
}
},
"nested": {
"path": "authors.details"
}
}
},
"size": 0
}
This works for Elasticsearch 5.2.2:
curl -XGET http://localhost:9200/articles/_search?pretty -d '
{
"aggs" : {
"whatever" : {
"terms" : { "field" : "yourfield", "size":10000 }
}
},
"size" : 0
}'
The "size":10000 means get (at most) 10000 unique values. Without this, if you have more than 10 unique values, only 10 values are returned.
The "size":0 means that in result, "hits" will contain no documents. By default, 10 documents are returned, which we don't need.
Reference: bucket terms aggregation
Also note, according to this page, facets have been replaced by aggregations in Elasticsearch 1.0, which are a superset of facets.
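Since the two size settings are easy to mix up, here is the same request body built in Python with each one labeled. This is a sketch only; distinct_values_body is a hypothetical helper and the aggregation name is kept as "whatever" from the curl example.

```python
import json


def distinct_values_body(field: str, max_values: int = 10000) -> dict:
    """Build a terms-aggregation body that returns unique values and no documents."""
    if max_values <= 0:
        # Newer versions reject a terms size of 0, so require a positive cap.
        raise ValueError("max_values must be greater than 0")
    return {
        "aggs": {
            "whatever": {
                # Inner size: cap on the number of unique values (buckets) returned.
                "terms": {"field": field, "size": max_values}
            }
        },
        # Outer size: number of documents in "hits"; 0 because only buckets are needed.
        "size": 0,
    }


print(json.dumps(distinct_values_body("yourfield"), indent=2))
```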
The existing answers did not work for me in Elasticsearch 5.X, for the following reasons:
1. I needed to tokenize my input while indexing.
2. "size": 0 failed to parse because "[size] must be greater than 0."
3. "Fielddata is disabled on text fields by default." This means that by default you cannot aggregate on the full_name text field. However, an unanalyzed keyword field can be used for aggregations.
Solution 1: use the Scroll API. It works by keeping a search context and making multiple requests, each time returning subsequent batches of results. If you are using Python, the elasticsearch module has the scan() helper function to handle scrolling for you and return all results.
Solution 2: use the Search After API. It is similar to Scroll, but provides a live cursor instead of keeping a search context. Thus it is more efficient for real-time requests.
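A rough Python sketch of Solution 2. To keep it runnable without a cluster, the search call is passed in as a function; with the real client it would be something like lambda body: es.search(index="authors", body=body), where es and the index name are assumptions. fake_search is an in-memory stand-in, and the field is assumed to be sortable (for example a keyword field).

```python
from typing import Callable, Iterator


def iter_hits(search: Callable[[dict], dict], field: str, page_size: int = 1000) -> Iterator[dict]:
    """Yield hits page by page, resuming each page with search_after."""
    body = {"size": page_size, "sort": [{field: "asc"}], "_source": [field]}
    while True:
        hits = search(body)["hits"]["hits"]
        if not hits:
            return
        yield from hits
        # Resume after the sort values of the last hit on this page.
        body["search_after"] = hits[-1]["sort"]


def unique_field_values(search: Callable[[dict], dict], field: str, page_size: int = 1000) -> list:
    """Collect distinct values of `field`; hits arrive sorted, so duplicates are adjacent."""
    values = []
    for hit in iter_hits(search, field, page_size):
        value = hit["_source"][field]
        if not values or values[-1] != value:
            values.append(value)
    return values


def fake_search(body: dict) -> dict:
    """In-memory stand-in for the cluster: sorted names, filtered by search_after."""
    docs = sorted(["Ken", "Jim Gray", "Ken", "Ada"])
    after = body.get("search_after")
    if after is not None:
        docs = [d for d in docs if d > after[0]]
    page = docs[: body["size"]]
    return {"hits": {"hits": [{"_source": {"full_name": d}, "sort": [d]} for d in page]}}


print(unique_field_values(fake_search, "full_name", page_size=2))  # ['Ada', 'Jim Gray', 'Ken']
```

Normally a unique tiebreaker field is recommended in the sort so that ties are not skipped at page boundaries. For collecting distinct values that matters less, because a skipped tie has a value that was already seen.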