How to make elastic search only match full field

I have a query looking like this:
((company_id:1) AND (candidate_tags:"designer"))
However, this also matches users where candidate_tags is interaction designer. How do I exclude these?
Here's my full search body:
{
"query": {
"filtered": {
"query": {
"query_string": {
"query":
"((company_id:1) AND (candidate_tags:\"designer\"))"
}
}
}
},
"sort": [
{
"candidate_rating": {
"order": "desc"
}
},
"candidate_tags",
"_score"
]
}
Extra info
I only realised this after an answer had come in: candidate_tags is an array of strings. Say a candidate has the tags interaction designer and talent; searching for talent should match, but searching for designer should not.

Map your candidate_tags field as not_analyzed, or analyze it with the keyword analyzer.
{
"mappings": {
"test": {
"properties": {
"candidate_tags": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
or add a raw sub-field to your existing mapping like this:
{
"mappings": {
"test": {
"properties": {
"candidate_tags": {
"type": "string",
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
}
For the first option use the same query as you use now.
For the second option use candidate_tags.raw, like this:
{
"query": {
"filtered": {
"query": {
"query_string": {
"query": "((company_id:1) AND (candidate_tags.raw:\"designer\"))"
}
}
}
}
...
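To see why the mapping change matters, here is a minimal Python sketch of the two indexing modes. It is only an approximation, not Elasticsearch code: `standard_analyze` and `keyword_analyze` are hypothetical stand-ins for an analyzed field and a not_analyzed field, and the real standard analyzer does more than lowercase and split on whitespace.

```python
def standard_analyze(text):
    """Rough stand-in for an analyzed string field: lowercase, split on whitespace."""
    return text.lower().split()


def keyword_analyze(text):
    """Stand-in for not_analyzed / keyword: the whole value is a single term."""
    return [text]


def index_terms(tags, analyze):
    """Collect every term that an array-valued field would put in the index."""
    terms = set()
    for tag in tags:
        terms.update(analyze(tag))
    return terms


candidate_tags = ["interaction designer", "talent"]

analyzed_terms = index_terms(candidate_tags, standard_analyze)
exact_terms = index_terms(candidate_tags, keyword_analyze)

print("designer" in analyzed_terms)  # True: the tag was split into two terms
print("designer" in exact_terms)     # False: only whole tags are terms
print("talent" in exact_terms)       # True
```

With whole tags stored as terms, designer no longer matches interaction designer, while talent still matches.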

Another way is to use script:
POST test/t/1
{
"q":"a b"
}
POST test/t/2
{
"q":"a c"
}
POST test/t/_search
{
"query": {
"filtered": {
"filter": {
"script": {
"script": "return _source.q=='a b'"
}
}
}
}
}

1. By filtering.
2. By making candidate_tags an exact-value field, i.e. a not_analyzed field (Andrei Stefan's solution, in the answer above).
With #2, be careful not to later mix the not_analyzed field with fields that are analyzed. More: https://www.elastic.co/guide/en/elasticsearch/guide/current/_exact_value_fields.html
With #1, your query would look something like this (written from memory; I don't have ES at hand, so I can't verify it):
{
"query": {
"filtered": {
"query": {
"query_string": {
"query":
"((company_id:1) AND (candidate_tags:\"designer\"))"
}
},
"filter" : {
"term" : {
"candidate_tags" : "designer"
}
}
}
},
"sort": [
{
"candidate_rating": {
"order": "desc"
}
},
"candidate_tags",
"_score"
]
}

Related

Update elastic search doc field value for specific fields in all documents

I have documents like this.
{
"a":"test",
"b":"harry"
},
{
"a":"",
"b":"jack"
}
I need to update every document in a given index where field a is "" (an empty string), setting it to a default value, say null.
Any help is appreciated. Thanks

Use update-by-query with an ingest pipeline.
_update_by_query can also use the Ingest Node feature by specifying a pipeline like this:
First, define the pipeline:
PUT _ingest/pipeline/set-foo
{
"description" : "sets foo",
"processors" : [ {
"set" : {
"field": "a",
"value": null
}
} ]
}
then you can use it like:
POST myindex/_update_by_query?pipeline=set-foo
{
"query": {
"filtered": {
"filter": {
"script": {
"script": "_source._content.length() == 0"
}
}
}
}
}
OR
POST myindex/_update_by_query?pipeline=set-foo
{
"query": {
"bool" : {
"must" : {
"script" : {
"script" : {
"inline": "doc['a'].empty",
"lang": "painless"
}
}
}
}
}
}

To find documents whose field value is an empty string (''), I used:
"query": {
"bool": {
"must": [
{
"exists": {
"field": "a"
}
}
],
"must_not": [
{
"wildcard": {
"a": "*"
}
}
]
}
}
So the overall query to update all docs where a == "" is:
POST test11/_update_by_query
{
"script": {
"inline": "ctx._source.a=null",
"lang": "painless"
},
"query": {
"bool": {
"must": [
{
"exists": {
"field": "a"
}
}
],
"must_not": [
{
"wildcard": {
"a": "*"
}
}
]
}
}
}
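As a rough illustration of what the update is meant to achieve, here is a plain-Python sketch over in-memory dicts. It does not call Elasticsearch, and `blank_to_null` is a hypothetical helper name; it only shows the intended before/after state of the sample documents from the question.

```python
def blank_to_null(docs, field="a"):
    """Return copies of the docs with `field` set to None where it is an empty string."""
    updated = []
    for doc in docs:
        doc = dict(doc)  # copy, so the input list is left untouched
        if field in doc and doc[field] == "":
            doc[field] = None
        updated.append(doc)
    return updated


docs = [
    {"a": "test", "b": "harry"},
    {"a": "", "b": "jack"},
]

result = blank_to_null(docs)
print(result)
```

Only the second document changes; documents with a non-empty a are left as they were.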

Startswith exact word match in elasticsearch?

I have an index containing field title having data as below.
jam bread
jamun
jamaica country
So if a user searches for jam, I don't want jamun and jamaica country to appear in the search results. Right now I am using a prefix query in Elasticsearch, but it is not giving me the results I want.
{
"query": {
"prefix" : { "title" : "jam" }
}
}

You get all of these results because the prefix query matches every term in the inverted index that starts with the given prefix (in effect jam*), so jamun and jamaica match as well.
Instead of the prefix query, you can use a term query to do an exact match on the tokenized keyword, like this:
PUT exact_index1
{
"mappings": {
"document_type" : {
"properties": {
"title" : {
"type": "text"
}
}
}
}
}
POST exact_index1/document_type
{
"title" : "jamun"
}
POST exact_index1/_search
{
"query": {
"term": {
"title": {
"value": "jam"
}
}
}
}
Hope this helps
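The difference between the two lookups can be sketched in plain Python. This is a toy inverted index built by lowercasing and splitting on whitespace, which is only an assumption about how the text field is analyzed; the function names are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(titles):
    """Map each lowercased, whitespace-separated term to the ids of docs containing it."""
    index = defaultdict(set)
    for doc_id, title in enumerate(titles):
        for term in title.lower().split():
            index[term].add(doc_id)
    return index


def prefix_search(index, prefix):
    """Match every indexed term that starts with the prefix."""
    hits = set()
    for term, doc_ids in index.items():
        if term.startswith(prefix):
            hits |= doc_ids
    return hits


def term_search(index, term):
    """Match only documents that contain exactly this term."""
    return set(index.get(term, set()))


titles = ["jam bread", "jamun", "jamaica country"]
index = build_inverted_index(titles)

prefix_hits = sorted(titles[i] for i in prefix_search(index, "jam"))
term_hits = sorted(titles[i] for i in term_search(index, "jam"))

print(prefix_hits)  # ['jam bread', 'jamaica country', 'jamun']
print(term_hits)    # ['jam bread']
```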

The completion suggester provides search-as-you-type functionality
PUT - index_name/document_type/_mapping
{
"document_type": {
"properties": {
"title": {
"type": "text"
},
"suggest": {
"type": "completion",
"analyzer": "simple",
"search_analyzer": "simple"
}
}
}
}
POST - index_name/document_type
{
"name": "jamun",
"suggest":
{
"input": "jamun"
},
"output": "jamun"
}
POST - index_name/document_type/_suggest?pretty
{"type-suggest":{"text":"jam","completion":{"field":"suggest"}}}

Elasticsearch: Searching for fields with mapping not_analyzed get no hits

I have elasticsearch running and do all my requests with nodejs.
I have the following mapping applied for my index "mastert4":
{
"mappings": {
"mastert4": {
"properties": {
"s": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
I added exactly one document to the index which looks pretty much like this:
{
"master": {
"vi": "ff155d9696818dde0627e14c79ba5d344c3ef01d",
"s": "Anne Will"
}
}
Now doing any of the following search queries will not return any hits:
{
"index": "mastert4",
"body": {
"query": {
"filtered": {
"query": {
"match"/"term": {
"s": "anne will"/"Anne Will"
}
}
}
}
}
}
But the following query will return the exact document:
{
"index": "mastert4",
"body": {
"query": {
"filtered": {
"query": {
"constant_score": {
"filter": [
{
"missing": {
"field": "s"
}
}
]
}
}
}
}
}
}
And if I search for
{
"exists": {
"field": "s"
}
}
I will get no hits again.
When analyzing the field itself I get:
{
"tokens": [
{
"token": "Anne Will",
"start_offset": 0,
"end_offset": 9,
"type": "word",
"position": 1
}
]
}
I'm really at a dead end here. Can someone tell me what I did wrong? Thanks!

You've enclosed the fields s and vi inside an outer field called master which is not declared in your mapping. That's the reason. If you query for master.s, you'll get results.
The second solution is to remove the enclosing master object in your document and that will work also:
{
"vi": "ff155d9696818dde0627e14c79ba5d344c3ef01d",
"s": "Anne Will"
}
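A small Python sketch of why the field path changes when the document is wrapped in an outer object. The `flatten` helper is hypothetical and only mimics how nested object fields end up addressed by dotted paths; it is not Elasticsearch code.

```python
def flatten(doc, prefix=""):
    """Flatten nested objects into dotted field paths, e.g. {"a": {"b": 1}} -> {"a.b": 1}."""
    fields = {}
    for key, value in doc.items():
        path = prefix + key
        if isinstance(value, dict):
            fields.update(flatten(value, path + "."))
        else:
            fields[path] = value
    return fields


nested_doc = {
    "master": {
        "vi": "ff155d9696818dde0627e14c79ba5d344c3ef01d",
        "s": "Anne Will",
    }
}
top_level_doc = {
    "vi": "ff155d9696818dde0627e14c79ba5d344c3ef01d",
    "s": "Anne Will",
}

nested_fields = flatten(nested_doc)
top_level_fields = flatten(top_level_doc)

print(sorted(nested_fields))      # ['master.s', 'master.vi']
print("s" in nested_fields)       # False: there is no top-level field s
print("s" in top_level_fields)    # True
```

This is consistent with the missing filter on s matching the nested document: from the mapping's point of view, that document has no field named s.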

Elasticsearch lowercase filter search

I'm trying to search my database with filter terms in either upper or lower case. I've noticed that queries apply analyzers, but I can't figure out how to apply a lowercase analyzer to a filtered search. Here's the query:
{
"query": {
"filtered": {
"filter": {
"bool": {
"should": [
{
"term": {
"language": "mandarin" // Returns a doc
}
},
{
"term": {
"language": "Italian" // Does NOT return a doc, but will if lowercased
}
}
]
}
}
}
}
}
I have a type languages that I have lowercased using:
"analyzer": {
"lower_keyword": {
"type": "custom",
"tokenizer": "keyword",
"filter": "lowercase"
}
}
and a corresponding mapping:
"mappings": {
"languages": {
"_id": {
"path": "languageID"
},
"properties": {
"languageID": {
"type": "integer"
},
"language": {
"type": "string",
"analyzer": "lower_keyword"
},
"native": {
"type": "string",
"analyzer": "keyword"
},
"meta": {
"type": "nested"
},
"language_suggest": {
"type": "completion"
}
}
}
}

The problem is that the field is analyzed at index time to lowercase it, but you are querying it with a term filter, which is not analyzed:
Term Filter
Filters documents that have fields that contain a term (not analyzed).
Similar to term query, except that it acts as a filter.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-term-filter.html
I'd try using a query filter instead:
Query Filter
Wraps any query to be used as a filter. Can be placed within queries
that accept a filter.
Example:
{
"constantScore" : {
"filter" : {
"query" : {
"query_string" : {
"query" : "this AND that OR thus"
}
}
}
} }
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-filter.html#query-dsl-query-filter
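The mismatch can be sketched in a few lines of Python. This is only an illustration: `lower_keyword` mimics the custom analyzer from the question (keyword tokenizer plus lowercase filter), and the stored values "Mandarin" and "Italian" are assumptions, since the original documents are not shown.

```python
def lower_keyword(value):
    """Mimic the lower_keyword analyzer: keep the whole value as one term, lowercased."""
    return value.lower()


# Terms as they would be stored at index time (source values are assumed).
indexed_terms = {lower_keyword(v) for v in ["Mandarin", "Italian"]}


def term_filter(terms, value):
    """A term filter is not analyzed: the value is compared as-is."""
    return value in terms


def analyzed_query(terms, value):
    """An analyzed query runs the same analyzer on the input before comparing."""
    return lower_keyword(value) in terms


print(term_filter(indexed_terms, "mandarin"))    # True
print(term_filter(indexed_terms, "Italian"))     # False: the index holds "italian"
print(analyzed_query(indexed_terms, "Italian"))  # True
```

Lowercasing the input on the client side before sending a term filter would have the same effect as the analyzed query in this sketch.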

This may be achieved by appending .keyword to the field name, so that you query the keyword version of the field. This assumes the mapping defines a keyword sub-field for language.
Note that only the exact text will then match: mandarin won't match and Italian will.
Your query would end up like this:
{
"query": {
"filtered": {
"filter": {
"bool": {
"should": [
{
"term": {
"language.keyword": "mandarin" // Returns Empty
}
},
{
"term": {
"language.keyword": "Italian" // Returns Italian.
}
}
]
}
}
}
}
}
Combining the term values is also allowed:
{
"query": {
"filtered": {
"filter": {
"bool": {
"should": [
{
"terms": {
"language.keyword":
["mandarin", "Italian"]
}
}
]
}
}
}
}
}

Elasticsearch wildcard search on not_analyzed field

I have an index with the following settings and mapping:
{
"settings":{
"index":{
"analysis":{
"analyzer":{
"analyzer_keyword":{
"tokenizer":"keyword",
"filter":"lowercase"
}
}
}
}
},
"mappings":{
"product":{
"properties":{
"name":{
"analyzer":"analyzer_keyword",
"type":"string",
"index": "not_analyzed"
}
}
}
}
}
I am struggling to implement wildcard search on the name field. My example data looks like this:
[
{"name": "SVF-123"},
{"name": "SVF-234"}
]
When I perform the following query:
curl -XPOST "http://localhost:9200/my_index/product/_search" -d '
{
"query": {
"filtered" : {
"query" : {
"query_string" : {
"query": "*SVF-1*"
}
}
}
}
}'
It returns SVF-123 and SVF-234. I think it still tokenizes the data. It should return only SVF-123.
Could you please help with this?
Thanks in advance

There are a couple of things going wrong here.
First, you are saying that you don't want the terms analyzed at index time. Then there's an analyzer configured (which is used at search time) that generates incompatible terms, because they are lowercased.
By default, all terms end up in the _all field, analyzed with the standard analyzer. That is where you end up searching. Since it tokenizes on "-", you end up with an OR of "*SVF" and "1*".
Try doing a terms facet on _all and on name to see what's going on.
Here's a runnable Play and gist: https://www.found.no/play/gist/3e5fcb1b4c41cfc20226 (https://gist.github.com/alexbrasetvik/3e5fcb1b4c41cfc20226)
You need to make sure the terms you index are compatible with what you search for. You probably want to disable _all, since it can muddy what's going on.
#!/bin/bash
export ELASTICSEARCH_ENDPOINT="http://localhost:9200"
# Create indexes
curl -XPUT "$ELASTICSEARCH_ENDPOINT/play" -d '{
"settings": {
"analysis": {
"text": [
"SVF-123",
"SVF-234"
],
"analyzer": {
"analyzer_keyword": {
"type": "custom",
"tokenizer": "keyword",
"filter": [
"lowercase"
]
}
}
}
},
"mappings": {
"type": {
"properties": {
"name": {
"type": "string",
"index": "not_analyzed",
"analyzer": "analyzer_keyword"
}
}
}
}
}'
# Index documents
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_bulk?refresh=true" -d '
{"index":{"_index":"play","_type":"type"}}
{"name":"SVF-123"}
{"index":{"_index":"play","_type":"type"}}
{"name":"SVF-234"}
'
# Do searches
# See all the generated terms.
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_search?pretty" -d '
{
"facets": {
"name": {
"terms": {
"field": "name"
}
},
"_all": {
"terms": {
"field": "_all"
}
}
}
}
'
# Analyzed, so no match
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_search?pretty" -d '
{
"query": {
"match": {
"name": {
"query": "SVF-123"
}
}
}
}
'
# Not analyzed according to `analyzer_keyword`, so matches. (Note: term, not match)
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_search?pretty" -d '
{
"query": {
"term": {
"name": {
"value": "SVF-123"
}
}
}
}
'
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_search?pretty" -d '
{
"query": {
"term": {
"_all": {
"value": "svf"
}
}
}
}
'
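The effect described above can be approximated in Python. This is a sketch under stated assumptions: `standard_like_tokens` is a crude stand-in for the standard analyzer (alphanumeric runs, lowercased), the wildcard clauses are assumed to be lowercased before matching, and `fnmatchcase` stands in for wildcard matching against single terms.

```python
import re
from fnmatch import fnmatchcase


def standard_like_tokens(text):
    """Crude stand-in for the standard analyzer: alphanumeric runs, lowercased."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


docs = ["SVF-123", "SVF-234"]

# Terms that would land in the _all field for each document.
all_terms = {doc: standard_like_tokens(doc) for doc in docs}


def matches_any(terms, patterns):
    """OR semantics: a document matches if any term matches any wildcard clause."""
    return any(fnmatchcase(term, pattern) for term in terms for pattern in patterns)


# "*SVF-1*" split on "-" gives two clauses (assumed lowercased here).
patterns = ["*svf", "1*"]
hits = [doc for doc in docs if matches_any(all_terms[doc], patterns)]

print(all_terms)  # {'SVF-123': ['svf', '123'], 'SVF-234': ['svf', '234']}
print(hits)       # ['SVF-123', 'SVF-234']
```

Both documents match because each contains the term svf, which satisfies the first clause on its own.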

My solution adventure
I started out as described in my question. Whenever I changed one part of my settings, one thing started to work but another stopped working. Here is the history of my solution:
1.) I indexed my data with the defaults, which means it was analyzed by default. This caused a problem on my side. For example:
When a user searched for a keyword like SVF-1, the system ran this query:
{
"query": {
"filtered" : {
"query" : {
"query_string" : {
"analyze_wildcard": true,
"query": "*SVF-1*"
}
}
}
}
}
and the results were:
SVF-123
SVF-234
This is expected, because the name field of my documents is analyzed. The query is split into the tokens SVF and 1, and SVF matches both documents even though 1 does not. I abandoned this approach and created a mapping that makes my fields not_analyzed:
{
"mappings":{
"product":{
"properties":{
"name":{
"type":"string",
"index": "not_analyzed"
},
"site":{
"type":"string",
"index": "not_analyzed"
}
}
}
}
}
but my problem continued.
2.) After a lot of research I wanted to try another way, and decided to use a wildcard query.
My query is:
{
"query": {
"wildcard" : {
"name" : {
"value" : "*SVF-1*"
}
}
},
"filter":{
"term": {"site":"pro_en_GB"}
}
}
This query worked, but there is one problem. My fields are not_analyzed now, and I am running a wildcard query, so case sensitivity becomes an issue: if I search for svf-1, it returns nothing, and users may well type the query in lowercase.
3.) I changed my document structure to:
{
"mappings":{
"product":{
"properties":{
"name":{
"type":"string",
"index": "not_analyzed"
},
"nameLowerCase":{
"type":"string",
"index": "not_analyzed"
},
"site":{
"type":"string",
"index": "not_analyzed"
}
}
}
}
}
I added one more field for the name, called nameLowerCase. When indexing, I set up my document like this:
{
name: "SVF-123",
nameLowerCase: "svf-123",
site: "pro_en_GB"
}
Here, I convert the query keyword to lowercase and search on the new nameLowerCase field, while displaying the name field.
The final version of my query is:
{
"query": {
"wildcard" : {
"nameLowerCase" : {
"value" : "*svf-1*"
}
}
},
"filter":{
"term": {"site":"pro_en_GB"}
}
}
Now it works. Another way to solve this problem is to use multi_field. My query contains a dash (-), and I ran into some problems because of it.
Many thanks to @Alex Brasetvik for his detailed explanation and effort.
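The lowercase shadow-field approach can be sketched in Python as follows. It is an in-memory illustration only: `wildcard_search` is a hypothetical helper, `fnmatchcase` stands in for a case-sensitive wildcard query on a not_analyzed field, and the site value of the second product is an assumption.

```python
from fnmatch import fnmatchcase

products = [
    {"name": "SVF-123", "site": "pro_en_GB"},
    {"name": "SVF-234", "site": "pro_en_GB"},  # site assumed for this document
]

# At index time, store a lowercased copy next to the original value.
for product in products:
    product["nameLowerCase"] = product["name"].lower()


def wildcard_search(field, user_input, site):
    """Case-sensitive *input* match on one field, restricted to one site."""
    pattern = "*" + user_input + "*"
    return [
        product["name"]
        for product in products
        if product["site"] == site and fnmatchcase(product[field], pattern)
    ]


user_input = "svf-1"
on_name = wildcard_search("name", user_input, "pro_en_GB")
on_lowercase = wildcard_search("nameLowerCase", user_input.lower(), "pro_en_GB")

print(on_name)       # []
print(on_lowercase)  # ['SVF-123']
```

Searching the lowercased copy while returning the original name keeps the displayed value unchanged.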

Adding to Hüseyin's answer: we can use AND as the default operator. SVF and 1* will then be joined with the AND operator, which gives the correct results.
"query": {
"filtered" : {
"query" : {
"query_string" : {
"default_operator": "AND",
"analyze_wildcard": true,
"query": "*SVF-1*"
}
}
}
}
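A rough Python sketch of how the operator changes the outcome, assuming the query is split into the two wildcard clauses mentioned earlier in the thread and that terms are lowercased. The term lists are written out by hand, and the third document, 199-SVF, is hypothetical; it is included only to show what AND still lets through.

```python
from fnmatch import fnmatchcase

# Hand-written analyzed terms per document; "199-SVF" is a hypothetical extra document.
doc_terms = {
    "SVF-123": ["svf", "123"],
    "SVF-234": ["svf", "234"],
    "199-SVF": ["199", "svf"],
}
clauses = ["*svf", "1*"]


def clause_matches(terms, pattern):
    """A clause matches a document if any of its terms matches the wildcard."""
    return any(fnmatchcase(term, pattern) for term in terms)


def search(operator):
    """Combine the clauses with AND (all must match) or OR (any may match)."""
    combine = all if operator == "AND" else any
    return [
        doc
        for doc, terms in doc_terms.items()
        if combine(clause_matches(terms, clause) for clause in clauses)
    ]


or_hits = search("OR")
and_hits = search("AND")

print(or_hits)   # ['SVF-123', 'SVF-234', '199-SVF']
print(and_hits)  # ['SVF-123', '199-SVF']
```

In this sketch AND removes SVF-234, but a document whose terms satisfy both clauses in a different order still matches.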

@Viduranga Wijesooriya, as you stated, "default_operator": "AND" checks for the presence of both SVF and 1, but an exact match alone is still not possible.
It does filter the results more appropriately, though, leaving every combination of SVF and 1 and sorting them by relevance, which moves SVF-1 up the order.
To pull out the exact result, use:
"settings": {
"analysis": {
"analyzer": {
"analyzer_keyword": {
"type": "custom",
"tokenizer": "keyword",
"filter": [
"lowercase"
]
}
}
}
},
"mappings": {
"type": {
"properties": {
"name": {
"type": "string",
"analyzer": "analyzer_keyword"
}
}
}
}
and the query is
{
"query": {
"bool": {
"must": [
{
"query_string" : {
"fields": ["name"],
"query" : "*svf-1*",
"analyze_wildcard": true
}
}
]
}
}
}
result
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "play",
"_type": "type",
"_id": "AVfXzn3oIKphDu1OoMtF",
"_score": 1,
"_source": {
"name": "SVF-123"
}
}
]
}
}
