Search within a single document using Elasticsearch

If I want to search an index I can use:
$curl -XGET 'X/index1/_search?q=title:ES'
If I want to search a document type I can use:
$curl -XGET 'X/index1/docType1/_search?q=title:ES'
But if I want to search a specific document, this doesn't work:
$curl -XGET 'X/index1/docType1/documentID/_search?q=title:ES'
Is there a simple workaround for this so that I can search within a single document as opposed to an entire index or an entire document type? To explain why I need this: I have to run some resource-intensive queries to find what I'm looking for. Once I find the documents I need, I don't actually need the whole document, just the highlighted portion that matches the query. But I don't want to store all the highlighted hits in memory, because I might not need them for a few hours and at times they could take up a lot of space (I would also prefer not to write them to disk). I'd rather store a list of document ids, so that when I need the highlighted portion of a document I can just run the highlighted query on that specific document and get back the highlighted portion. Thanks in advance for your help!

You can index the document's id as a field, then when you query, include the unique document id as a term to narrow the results just to that single document.
$curl -XPOST 'X/index1/docType1/_search' -d '{
"query": {
"bool": {
"must":[
{"match":{"doc":"223"}},
{"match":{"title":"highlight me please"}}
]
}
}
}'

You can use the ids query in Elasticsearch to restrict a search to a single document. By default, Elasticsearch indexes a field called _uid, which is the combination of type and id, so that it can be used for queries, aggregations, scripts, and sorting.
So the query you need is as follows:
curl -XGET 'X/index1/_search' -d '{
"query": {
"bool": {
"must": [
{
"match": {
"title": "ES"
}
},
{
"ids": {
"type" : "docType1",
"values": [
"documentID"
]
}
}
]
}
}
}'
If you need to search across several documents, list their ids in the values array of the ids query.
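Since the goal in the question is to get back only the highlighted portion of one stored document id, the ids query can be combined with a highlight section. The following is a minimal Python sketch that only builds the request body; the function name single_doc_highlight_query is hypothetical, and the optional type entry is only meaningful on Elasticsearch versions that still have mapping types.

```python
import json


def single_doc_highlight_query(doc_id, field, text, doc_type=None):
    """Build a search body limited to one document id, with highlighting on `field`."""
    ids_clause = {"values": [doc_id]}
    if doc_type is not None:
        # Assumption: the cluster still supports mapping types (as in the answer above).
        ids_clause["type"] = doc_type
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {field: text}},  # the actual search
                    {"ids": ids_clause},       # narrows the search to one document
                ]
            }
        },
        # Ask for highlighted fragments of the matched field.
        "highlight": {"fields": {field: {}}},
    }


body = single_doc_highlight_query("documentID", "title", "ES", doc_type="docType1")
print(json.dumps(body, indent=2))
```

The printed JSON can be passed to curl with -d against X/index1/_search. To highlight several stored ids in one request, put them all in the values list.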

Related

Elasticsearch return all documents of a given type

I have been searching for a solution to this question for a few days. The use case is simply to see the documents of a particular type only. Usually, after googling for a long time, I end up with some search queries with wildcards. I have gone through many SO posts and the Elastic documentation as well. I tried the URLs below, but without any luck.
curl -XGET 'localhost:9200/analytics/test/_search' -d '
{
"query" : {
"match_all" : {}
}
}'
curl -XGET 'localhost:9200/analytics/test/_search' -d '
{
"query": {
"type":{
"value": "test"
}
}
}'
Is there something like a wild card search on a document type to return all documents of that specific doc_type?
To get all documents from index analytics from type test just run:
curl -XGET localhost:9200/analytics/test/_search
or
curl -XGET localhost:9200/analytics/test/_search -d '{
"query": {"match_all" : {}}
}'
If more documents match than the default size value (10), you need to provide from/size values.

How to fuzzy query against multiple fields in elasticsearch?

Here's my query as it stands:
"query":{
"fuzzy":{
"author":{
"value":query,
"fuzziness":2
},
"career_title":{
"value":query,
"fuzziness":2
}
}
}
This is part of a callback in Node.js. Query (which is being plugged in as a value to compare against) is set earlier in the function.
What I need it to be able to do is to check both the author and the career_title of a document, fuzzily, and return any documents that match in either field. The above statement never returns anything, and whenever I try to access the object it should create, it says it's undefined. I understand that I could write two queries, one to check each field, then sort the results by score, but I feel like searching every object for one field twice will be slower than searching every object for two fields once.
https://www.elastic.co/guide/en/elasticsearch/guide/current/fuzzy-match-query.html
If you see here, in a multi match query you can specify the fuzziness...
{
"query": {
"multi_match": {
"fields": [ "text", "title" ],
"query": "SURPRIZE ME!",
"fuzziness": "AUTO"
}
}
}
Something like this. Hope this helps.
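Applied to the fields from the question, the body could be built as in the sketch below. This is only an illustration: the helper name fuzzy_multi_field_query and the sample search text are made up, and the fuzziness of 2 is taken from the question rather than being a recommendation.

```python
import json


def fuzzy_multi_field_query(text, fields, fuzziness="AUTO"):
    """Build a multi_match body that fuzzily matches `text` against any of `fields`."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                "fields": list(fields),
                "fuzziness": fuzziness,
            }
        }
    }


# "hemingway" is a placeholder for the `query` variable from the question.
body = fuzzy_multi_field_query("hemingway", ["author", "career_title"], fuzziness=2)
print(json.dumps(body, indent=2))
```

A document is returned if either field matches, so one request covers both fields instead of two separate fuzzy queries.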

Elasticsearch: searching specifically for words inside brackets

I'm trying to do an Elastic search for names that are inside brackets. I'm working with a database of names, and some of the names include maiden names within the first name field. Maiden names are indicated with brackets, like "Samantha [Murray]". My clients want our 'exact search' feature to work so that if you search for "Murray" you only get results with firstname Murray, not including maiden names; but if you search for "[Murray]", you get maiden names but NOT firstname = Murray, i.e. search for "Murray" >> "Murray Smith" but not "Samantha [Murray] Jones", search for "[Murray]" >> vice versa.
My problem so far is that elastic search seems to be ignoring the brackets entirely. Here is my query...
{
"query": {
"filtered": {
"query": {
"bool": {
"must": [
{
"query_string": {
"default_field" : "first_name",
"query" : "\\[Murray\\]"
}
}
]
}
}
}
}
}
but I get the same results that I do for "query" : "Murray" with no brackets at all. I tried a regexp but the results were even worse, the names I got weren't even close to "Murray" (I got things like "Rogers").
Is this type of request possible in Elastic? If so, what do I need to change?
Get familiar with analyzers - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-analyzers.html
If you are using the defaults, your analyzer is probably stripping the brackets.
You need to define an analyzer that doesn't remove brackets if you want to be able to search on them.
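To see why the brackets disappear, here is a rough Python approximation of the two tokenization styles, followed by index settings for a custom analyzer built on the whitespace tokenizer. The two approx_* functions are simplified stand-ins, not the real analyzers, and the analyzer name keep_brackets is hypothetical.

```python
import re


def approx_standard_tokens(text):
    """Rough stand-in for the default analyzer: lowercase, keep only word characters."""
    return re.findall(r"\w+", text.lower())


def approx_whitespace_tokens(text):
    """Rough stand-in for a whitespace tokenizer plus lowercase filter."""
    return text.lower().split()


name = "Samantha [Murray]"
print(approx_standard_tokens(name))    # ['samantha', 'murray']
print(approx_whitespace_tokens(name))  # ['samantha', '[murray]']

# Index settings sketch: a custom analyzer that splits on whitespace only,
# so "[murray]" and "murray" are indexed as different terms.
index_settings = {
    "settings": {
        "analysis": {
            "analyzer": {
                "keep_brackets": {  # hypothetical analyzer name
                    "type": "custom",
                    "tokenizer": "whitespace",
                    "filter": ["lowercase"],
                }
            }
        }
    }
}
```

With an analyzer like this assigned to first_name in the mapping, a search for "[Murray]" and a search for "Murray" hit different terms. The documents have to be reindexed after the mapping change for it to take effect.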

Elastic Search size to unlimited

I am new to Elasticsearch and am having trouble writing a search query that returns all matching records in my collection. The following is my query:
{
"size":"total no of record" // Here i need to get total no of records in collection
"query": {
"match": {
"first_name": "vineeth"
}
}
}
Running this query returns at most 10 records, and I am sure there are more than 10 matching records in my collection. I searched a lot and finally found the size parameter. But in my case I don't know the total count of records. I think giving an unlimited number to the size parameter is not good practice, so how should I handle this situation? Please help me solve this issue. Thanks.
It's not very common to display all results; it is more usual to use from and size to specify a range of results to fetch. So your query (for fetching the first 10 results) should look something like this:
{
"from": 0,
"size": 10,
"query": {
"match": {
"first_name": "vineeth"
}
}
}
This should work better than setting size to a ridiculously large value. To check how many documents matched your query you can get the hits.total (total number of hits) from the response.
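The from/size approach can be turned into a loop that keeps requesting the next page until one comes back empty. The sketch below is written against a hypothetical search callable (request body in, parsed JSON response out) so that it does not depend on any particular client library; fake_search only exists to make the example runnable.

```python
def total_hits(response):
    """Read hits.total, which is a number in older versions and an object in newer ones."""
    total = response["hits"]["total"]
    return total["value"] if isinstance(total, dict) else total


def iter_all_hits(search, query, page_size=10):
    """Yield every hit by advancing `from` one page at a time.

    `search` is a hypothetical callable: it takes a request body (dict)
    and returns the parsed JSON response (dict).
    """
    offset = 0
    while True:
        response = search({"from": offset, "size": page_size, "query": query})
        hits = response["hits"]["hits"]
        if not hits:
            break  # an empty page means there is nothing left
        yield from hits
        offset += len(hits)


def fake_search(body):
    """Stand-in for a real cluster holding 25 matching documents."""
    docs = [{"_id": str(i)} for i in range(25)]
    page = docs[body["from"]: body["from"] + body["size"]]
    return {"hits": {"total": len(docs), "hits": page}}


query = {"match": {"first_name": "vineeth"}}
all_hits = list(iter_all_hits(fake_search, query))
print(len(all_hits))  # 25
```

Keep in mind that from + size is capped by the index.max_result_window setting (10000 by default), so this kind of loop suits moderately sized result sets rather than very large ones.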
To fetch all the records you can also use the scroll API. It works like a cursor in a database.
If you use scroll, you can get the docs batch by batch, which reduces CPU and memory usage.
For more info refer
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-scroll.html
To get all records, per the docs, you should use scroll.
Here is the doc:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-scroll.html
But the idea is to specify your search and indicate that you want to scroll it:
curl -XGET 'localhost:9200/twitter/tweet/_search?scroll=1m' -d '
{
"query": {
"match" : {
"title" : "elasticsearch"
}
}
}'
In the scroll parameter you specify how long the search results should stay available.
Then you can retrieve them with the returned scroll_id and the scroll api.
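The retrieval loop looks roughly like the Python sketch below. To stay independent of any client library, the two HTTP calls are passed in as hypothetical callables: start_search stands for the initial _search?scroll=... request and continue_scroll for the follow-up request to the scroll endpoint with the scroll id in the body. FakeCluster is only there to make the example runnable.

```python
def scroll_all(start_search, continue_scroll, query, keep_alive="1m"):
    """Yield all hits batch by batch until a batch comes back empty.

    start_search(body, keep_alive) and continue_scroll(body) are hypothetical
    callables that return the parsed JSON response of each request.
    """
    response = start_search({"query": query}, keep_alive)
    while response["hits"]["hits"]:
        yield from response["hits"]["hits"]
        # Always use the scroll id from the most recent response.
        response = continue_scroll(
            {"scroll": keep_alive, "scroll_id": response["_scroll_id"]}
        )


class FakeCluster:
    """Stand-in for a real cluster that hands out documents in fixed batches."""

    def __init__(self, docs, batch):
        self.docs, self.batch, self.pos = docs, batch, 0

    def _page(self):
        page = self.docs[self.pos:self.pos + self.batch]
        self.pos += len(page)
        return {"_scroll_id": "fake-scroll-id", "hits": {"hits": page}}

    def start_search(self, body, keep_alive):
        self.pos = 0
        return self._page()

    def continue_scroll(self, body):
        return self._page()


cluster = FakeCluster([{"_id": str(i)} for i in range(7)], batch=3)
scrolled = list(
    scroll_all(cluster.start_search, cluster.continue_scroll,
               {"match": {"title": "elasticsearch"}})
)
print(len(scrolled))  # 7
```

Against a real cluster, it is also worth clearing the scroll context once the loop finishes rather than waiting for the keep-alive to expire.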
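In newer versions of Elasticsearch (e.g. 7.x), the documentation recommends search_after pagination over scroll for deep pagination:
https://www.elastic.co/guide/en/elasticsearch/reference/current/paginate-search-results.html
Note that what was deprecated in 7.0.0 is passing the scroll id in the URL path, rather than in the request body:
GET /_search/scroll/<scroll_id>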

elasticsearch prefix query for multiple words to solve the autocomplete use case

How do I get elastic search to work to solve a simple autocomplete use case that has multiple words?
Let's say I have a document with the following title - Elastic search is a great search tool built on top of lucene.
So if I use the prefix query and construct it with the form -
{
"prefix" : { "title" : "Elas" }
}
It will return that document in the result set.
However if I do a prefix search for
{
"prefix" : { "title" : "Elastic sea" }
}
I get no results.
What sort of query do I need to construct so as to present to the user that result for a simple autocomplete use case.
A prefix query made on Elastic sea would match a term like Elastic search in the index, but that doesn't appear in your index if you tokenize on whitespaces. What you have is elastic and search as two different tokens. Have a look at the analyze api to find out how you are actually indexing your text.
Using a boolean query like in your answer you wouldn't take into account the position of the terms. You would get as a result the following document for example:
Elastic model is a framework to store your Moose object and search
through them.
For auto-complete purposes you might want to make a phrase query and use the last term as a prefix. That's available out of the box using the phrase_prefix type in a match query, which was made available exactly for your use case:
{
"match" : {
"message" : {
"query" : "elastic sea",
"type" : "phrase_prefix"
}
}
}
With this query your example document would match but mine wouldn't since elastic is not close to search there.
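The matching rule can be approximated in a few lines of Python: every typed word except the last must appear as a whole term, in order and adjacent, and the last typed word only needs to be a prefix of the next term. The function phrase_prefix_matches is a simplified illustration that ignores analysis details and slop, and autocomplete_query builds the request body in the standalone match_phrase_prefix form used by later Elasticsearch versions.

```python
def phrase_prefix_matches(document, typed_text):
    """Approximate phrase-prefix matching on lowercased, whitespace-split terms."""
    doc = document.lower().split()
    typed = typed_text.lower().split()
    if not typed:
        return False
    *whole, last = typed
    n = len(typed)
    for start in range(len(doc) - n + 1):
        window = doc[start:start + n]
        # All but the last term must match exactly; the last one is a prefix.
        if window[:-1] == whole and window[-1].startswith(last):
            return True
    return False


def autocomplete_query(field, typed_text, max_expansions=50):
    """Build a match_phrase_prefix request body for `field`."""
    return {
        "query": {
            "match_phrase_prefix": {
                field: {"query": typed_text, "max_expansions": max_expansions}
            }
        }
    }


doc_a = "Elastic search is a great search tool built on top of lucene"
doc_b = "Elastic model is a framework to store your Moose object and search through them."
print(phrase_prefix_matches(doc_a, "Elastic sea"))  # True
print(phrase_prefix_matches(doc_b, "Elastic sea"))  # False
body = autocomplete_query("title", "elastic sea")
```

The max_expansions value limits how many terms the trailing prefix may expand to, which matters for short prefixes on large indexes.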
To achieve that result, you will need to use a Boolean query. The partial word needs to be in a prefix query and the complete word or phrase needs to be in a match clause. Other clauses, such as must and should, can be applied as needed to tune the query.
{
"query": {
"bool": {
"must": [
{
"prefix": {
"name": "sea"
}
},
{
"match": {
"name": "elastic"
}
}
]
}
}
}
