ElasticSearch MultiField Search Query - node.js

I have an endpoint that proxies to the ElasticSearch API for a simple user search.
/users?nickname=myUsername&email=myemail@gmail.com&name=John+Smith
Some details about these parameters:
All parameters are optional
nickname can be searched as a full text search (i.e. 'myUser' would return 'myUsername')
email must be an exact match
name can be searched as full text search for each token (i.e. 'john' would return 'John Smith')
The ElasticSearch search call should treat the parameters collectively as AND'd.
Right now I am not sure where to start: I can execute the query on each parameter alone, but not on all of them together.
client.search({
index: 'users',
type: 'user',
body: {
"query": {
//NEED TO FILL THIS IN
}
}
}).then(function(resp){
//Do something with search results
});

First you need to create the mapping for this particular use case.
curl -X PUT "http://$hostname:9200/myindex/mytype/_mapping" -d '{
"mytype": {
"properties": {
"email": {
"type": "string",
"index": "not_analyzed"
},
"nickname": {
"type": "string"
},
"name": {
"type": "string"
}
}
}
}'
By making email not_analyzed, you ensure that only an exact match works.
Once that is done, you need to build the query.
As we have multiple conditions, it is a good idea to use a bool query.
A bool query lets you combine multiple queries and control how each one is handled.
Query -
{
"query": {
"bool": {
"must": [
{
"match": {
"name": "qbox"
}
},
{
"prefix": {
"nickname": "qbo"
}
},
{
"match": {
"email": "me@qbox.io"
}
}
]
}
}
}
With the prefix query, you are telling Elasticsearch that a token merely starting with qbo qualifies as a match.
A prefix query might not be very fast; in that case you can use an ngram analyzer - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-ngram-tokenizer.html
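The query above hard-codes all three conditions. Since the question says every parameter is optional, one way to wire this into the Node.js endpoint is to add a clause only for the parameters that were actually supplied. This is a minimal sketch, not a definitive implementation: buildUserQuery is a hypothetical helper name, params is assumed to be the parsed query string, and the exact match on email uses a term query, which assumes the not_analyzed mapping shown earlier.

```javascript
// Hypothetical helper: builds the search body from whichever of the
// optional parameters (nickname, email, name) were supplied.
function buildUserQuery(params) {
  const must = [];

  if (params.name) {
    // Analyzed field: matches individual tokens, e.g. "john" -> "John Smith".
    must.push({ match: { name: params.name } });
  }
  if (params.nickname) {
    // Prefix match, e.g. "myuser" -> "myusername". Prefix queries are not
    // analyzed, so the input is lowercased here (assumes the default
    // analyzer, which lowercases tokens at index time).
    must.push({ prefix: { nickname: params.nickname.toLowerCase() } });
  }
  if (params.email) {
    // Exact match; assumes email is mapped as not_analyzed.
    must.push({ term: { email: params.email } });
  }

  // With no parameters at all, fall back to matching every document.
  if (must.length === 0) {
    return { query: { match_all: {} } };
  }
  // Everything in "must" is AND'd together.
  return { query: { bool: { must: must } } };
}
```

The returned object could then be passed as the body of the client.search call from the question, for example body: buildUserQuery(req.query) if the endpoint is an Express handler (an assumption, the question does not name the framework).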

Related

search multiple field as regexp query in elasticsearch

I am trying to search across several fields, such as title and description. When I type keywords, Elasticsearch should return a result if either the description or the title contains them. How can I achieve this?
Here is the sample code that I used for one field.
query: {
regexp: {
title: `.*${q}.*`,
},
},
I also tried the one below, but it gave a syntax error.
query: {
regexp: {
title: `.*${q}.*`,
},
regexp: {
description: `.*${q}.*`,
},
},
To do so, you need to use a bool query.
GET /<your index>/_search
{
"query": {
"bool": {
"should": [
{
"regexp": {
"title": ".*${q}.*"
}
},
{
"regexp": {
"description": ".*${q}.*"
}
}
]
}
}
}
The bool query is described in more detail in the Elasticsearch documentation.
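From Node.js, the same bool/should structure can be generated for any list of fields. The following is a sketch under stated assumptions: buildRegexpQuery and escapeRegexp are hypothetical helper names, and the escaping step assumes you want the user's keywords matched literally rather than interpreted as a regular expression.

```javascript
// Escape characters that Lucene regular expressions treat as operators,
// so that user input is matched literally.
function escapeRegexp(text) {
  return text.replace(/[.?+*|{}\[\]()"\\#@&<>~]/g, "\\$&");
}

// Hypothetical helper: one regexp clause per field, OR'd with "should".
function buildRegexpQuery(q, fields) {
  const pattern = ".*" + escapeRegexp(q) + ".*";
  return {
    query: {
      bool: {
        should: fields.map(function (field) {
          const clause = { regexp: {} };
          clause.regexp[field] = pattern;
          return clause;
        }),
        // At least one of the fields has to match.
        minimum_should_match: 1
      }
    }
  };
}
```

Keep in mind that a regexp query is matched against the individual terms stored in the index, so on an analyzed field the pattern is compared with each token rather than with the whole title or description.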

search query in elasticsearch

I have documents like the ones below in Elasticsearch
{"method":"POST","url":"/saas/services/1.0/account/*/purchase/*/subscription/cancel"}
{"method":"POST","url":"/saas/services/1.0/account/*/purchase/*/cancel"}
I am searching these documents using node.js client. Below is the query I am sending.
Example 1 type : DSL
client.search({
index: 'my_node',
body: {
query: {
bool: {
must: {
match: { method: 'POST' },
match: { url: '/saas/services/1.0/account/*/purchase/*/cancel' }
}
}
}
}
});
Output
I am getting the other record, the one with subscription cancel. The query is not matching the exact document. I tried the query below; it worked for this case but not for a few other test cases.
document
{"method":"POST","url":"/paas/service/1.0/act/*/purchase"}
{"method":"GET","url":"/paas/service/1.0/act/*/purchase"}
{"method":"PUT","url":"/paas/service/1.0/act/*/purchase"}
Example 2 type : query string
client.search({
index: 'my_node',
q: 'method: POST AND url: /saas/services/1.0/account/*/purchase'
});
Output
I get the GET purchase document regardless of the other two. I tried a few other documents too; the method argument is not being recognized.
How do I write an Elasticsearch query that matches documents on both the method and url properties? I tried query-time boosting for the url, but it doesn't seem to work.
Edit - Mapping information
{
"nodeclient": {
"mappings": {
"logs": {
"properties": {
"method": {
"type": "string" },
"url": {
"type": "string" }
} } }
}
}
Try adding quotation marks
client.search({ index: 'my_node', body: { "query": { "bool": { "must": {
"match": { "method": 'POST' }, "match": { "url":
'/saas/services/1.0/account/*/purchase/*/cancel' } } } } } })
I tried optimizing the query with Mirage plugin for elasticsearch queries. I was trying with different examples and saw the responses. Below is my final query that works for my scenario
{
"query": {
"bool": {
"must": [
{ "match_phrase": { "url": uri }},
{ "match": { "method": http } }
]}
}
}
match_phrase suited the url parameter better. I had also missed the array inside the must clause.
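Why the first attempt ignored the method condition can be seen without Elasticsearch at all: a JavaScript object literal cannot hold two match keys, so the second one silently replaces the first. The sketch below demonstrates that and then builds the working query with must as an array. buildEndpointQuery is a hypothetical helper name; the field names are the ones from the question.

```javascript
// Two "match" keys in one object literal: the later one overwrites the
// earlier, so only the url condition is ever sent to Elasticsearch.
const brokenMust = {
  match: { method: "POST" },
  match: { url: "/saas/services/1.0/account/*/purchase/*/cancel" }
};
console.log(Object.keys(brokenMust)); // only one "match" key survives

// Hypothetical helper: each condition is its own element of the must array.
function buildEndpointQuery(method, url) {
  return {
    query: {
      bool: {
        must: [
          { match_phrase: { url: url } },
          { match: { method: method } }
        ]
      }
    }
  };
}
```

Quoting the keys, as suggested in the other answer, does not change this: "match" and match are the same property name.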

mongoosastic search with two fields with and condition

I have a chat application developed with AngularJS, Node.js and MongoDB, with Elasticsearch integration. I provide search functionality for chats, where the user can enter any combination of chat message, user and date.
So I want to search Elasticsearch from Node.js with a combination of multiple fields. For example: 1) username='mohan' and date='anydate', 2) username='mohan' and chatmessage='Hi there'.
The result should contain only documents that satisfy both conditions.
How can we achieve this using mongoosastic? I tried the query below, but it returns results with an OR condition; I want an AND condition.
{
"query": {
"bool": {
"should": [{
"match": {
"feedMsg": "Hi there"
}
}, {
"match": {
"userId": 'mohan'
}
}, {
"range": {
"feedTime": {
"from": '27/10/2016',
"to": '27/10/2016'
}
}
}]
}
}
}
I found the solution. A bool query with must works for me.
{
"bool" : {
"must" : [],
"should" : [],
"must_not" : [],
"filter": []
}
}
must - All of these clauses must match. The equivalent of AND.
must_not - All of these clauses must not match. The equivalent of NOT.
should - At least one of these clauses must match. The equivalent of OR.
filter - Clauses that must match, but are run in non-scoring, filtering mode.
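Applied to the chat search from the question, moving the clauses out of should gives the AND behaviour. This is a minimal sketch, assuming the field names feedMsg, userId and feedTime from the question; buildChatQuery and the criteria property names are hypothetical, and the date strings are assumed to be in whatever format the feedTime mapping accepts.

```javascript
// Hypothetical helper: every criterion that was supplied becomes a
// required clause, so the conditions are AND'd.
function buildChatQuery(criteria) {
  const must = [];
  const filter = [];

  if (criteria.message) {
    must.push({ match: { feedMsg: criteria.message } });
  }
  if (criteria.user) {
    must.push({ match: { userId: criteria.user } });
  }
  if (criteria.from || criteria.to) {
    const range = {};
    if (criteria.from) range.gte = criteria.from;
    if (criteria.to) range.lte = criteria.to;
    // A date range needs no relevance score, so it goes in "filter".
    filter.push({ range: { feedTime: range } });
  }

  return { bool: { must: must, filter: filter } };
}
```

With the plain Elasticsearch client the returned object goes under the query key of the request body. With mongoosastic I believe it can be passed as the first argument of the model's search method, but check that against the version you use.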
You should use this format, it really works...
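// "filtered" is a query type of its own (Elasticsearch 1.x/2.x, removed in 5.0),
// so it must not be nested inside query_string
var query = {
  filtered: {
    query: {
      multi_match: {
        query: req.query.q, // or anything you will put here
      }
    },
    filter: {
      term: {
        'field': value // placeholder field name and value
      }
    }
  }
};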

Elasticsearch - Unique counts of a field, filter by the number of documents that contain a specific value in an other field

I'm new to Elasticsearch and I'm trying to do a fairly complex search, but I can't find out how to do it, or even whether it's possible.
I use Elasticsearch to store documents.
All of these documents have a userId field, and the same userId can appear in multiple documents.
They also have a documentType field which contains a string.
I want a unique count of the userIds that have at least n documents of a given documentType.
Currently, if n = 1, I'm able to retrieve the unique count with this request:
{
"aggs": {
"docType_file": {
"filters": {
"filters": {
"documentType:\"file\"": {
"query": {
"query_string": {
"query": "documentType:\"file\""
}
}
}
}
},
"aggs": {
"unique_user": {
"cardinality": {
"field": "userId"
}
}
}
}
}
}
But if I want n to be greater than 1 (or if I want another comparison: less than, equal to, or a range) I can't find a way to do it. I have looked at the different aggregations but I can't see how to combine them to retrieve this.
Is it possible to do that in Elasticsearch?
UPDATE:
I found another way to write the query, which may be more convenient for running other aggregations afterward.
{
"query": {
"filtered": {
"query": {
"query_string": {
"query": "documentType:\"file\""
}
}
}
},
"aggs": {
"unique_user": {
"cardinality": {
"field": "userId",
"precision_threshold": 40000
}
}
}
}
But I still don't know how to filter after that. I suspect I'm approaching the problem the wrong way, but I don't see any other solution.

MongoDB: Query model and check if document contains object or not, then mark / group result

I have a model called Post, which contains an array property with the ids of the users that have liked the post.
Now I need to query the Post model and mark the returned results with likedBySelf true/false for use by the client. Is this possible?
I don't have to store the likedBySelf property in the database, just modify the results to have that property.
A temporary solution I found was to do two queries, one that finds the posts liked by user x and one that finds the posts not liked by user x, then map over each result (setting likedBySelf to true/false), combine the two arrays and return the combined array. But this limits other query functions such as limit and skip.
So now my queries look like this:
var notLikedByQuery = Post.find({likedBy: {$ne: req.body.user._id}})
var likedByQuery = Post.find({likedBy: req.body.user._id})
(I'm using the Mongoose lib)
PS. A typical post can look like this (JSON):
{
"_id": {
"$oid": "55fc463c83b2d2501f563544"
},
"__t": "Post",
"groupId": {
"$oid": "55fc463c83b2d2501f563545"
},
"inactiveAfter": {
"$date": "2015-09-25T17:13:32.426Z"
},
"imageUrl": "https://hootappprodstorage.blob.core.windows.net/devphotos/55fc463b83b2d2501f563543.jpeg",
"createdBy": {
"$oid": "55c49e2d40b3b5b80cbe9a03"
},
"inactive": false,
"recentComments": [],
"likes": 8,
"likedBy": [
{
"$oid": "558b2ce70553f7e807f636c7"
},
{
"$oid": "559e8573ed7c830c0a677c36"
},
{
"$oid": "559e85bced7c830c0a677c43"
},
{
"$oid": "559e854bed7c830c0a677c32"
},
{
"$oid": "559e85abed7c830c0a677c40"
},
{
"$oid": "55911104be2f86e81d0fb573"
},
{
"$oid": "559e858fed7c830c0a677c3b"
},
{
"$oid": "559e8586ed7c830c0a677c3a"
}
],
"location": {
"type": "Point",
"coordinates": [
10.01941398718396,
60.96738099591897
]
},
"updatedAt": {
"$date": "2015-09-22T08:45:41.480Z"
},
"createdAt": {
"$date": "2015-09-18T17:13:32.426Z"
},
"__v": 8
}
@tskippe you can use a method like the following to work out whether the post is liked by the user himself, and call the function anywhere you want.
var processIsLiked = function(postId, userId, doc, next){
  var q = Post.find({post_id: postId});
  q.lean().exec(function(err, res){
    if(err) return utils.handleErr(err, res);
    // true if the likedBy array contains the user (ids compared as strings)
    doc.post.isLiked = _.some(doc.post.likedBy, function(id){
      return String(id) === String(userId);
    });
    next(doc);
  });
};
Because you are using q.lean(), you don't need to persist the data. You just process it, add the isLiked field to the post and send back the response. Note that we are manipulating doc directly. You can also tweak it to accept a doc containing an array of posts, iterate over it and attach an isLiked field to each post.
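The array variant mentioned above could look roughly like this. It is a sketch, not the answerer's code: markLiked is a hypothetical helper name, posts is assumed to be an array of plain objects as returned by a lean() query, and ids are compared by their string form.

```javascript
// Hypothetical helper: takes plain post objects (e.g. from .lean()) and
// returns copies with an isLiked flag for the given user.
function markLiked(posts, userId) {
  const wanted = String(userId);
  return posts.map(function (post) {
    // Treat a missing likedBy array as "nobody liked it".
    const likedBy = post.likedBy || [];
    const isLiked = likedBy.some(function (id) {
      return String(id) === wanted;
    });
    // Copy the post instead of mutating the query result.
    return Object.assign({}, post, { isLiked: isLiked });
  });
}
```

Since this runs on the result of a single find, limit and skip can be applied to that one query as usual, which avoids the limitation of the two-query approach from the question.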
I found that MongoDB's aggregation with the $project technique was my best bet, so I wrote an aggregation like this.
Explanation:
I want to keep the entire document, but the purpose of $project is to reshape documents, so you have to specify the properties you want to keep. A simple way of keeping all the properties is to use "$$ROOT".
So I define a $project, put all my original properties under doc: "$$ROOT", then create a new property "likedBySelf", which is true or false depending on whether the specified USERID is in the $likedBy set.
I think this is cleaner and simpler than touching every single model after a query to set a likedBySelf flag. It may not be faster, but it's cleaner.
Model.aggregate([
{ $project: {
doc: "$$ROOT",
likedBySelf: {
$cond: {
"if": { "$setIsSubset": [
[USERID],
"$likedBy"
]},
"then": true,
"else": false
}
}
}}
]);
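Because the question mentions limit and skip, here is a sketch of how the same $project stage could be combined with paging in one pipeline. buildLikedBySelfPipeline and its options are hypothetical names, and the stage order (filter, page, then project) is one reasonable choice rather than the only one. The $cond wrapper is dropped here because $setIsSubset already evaluates to true or false.

```javascript
// Hypothetical helper: returns the aggregation pipeline as a plain array.
// userId is assumed to already be an ObjectId of the same type as the
// entries in likedBy; Mongoose does not cast values inside aggregate().
function buildLikedBySelfPipeline(userId, options) {
  const opts = options || {};
  const pipeline = [];

  // Optional filtering and paging stages, applied before the projection.
  if (opts.match) pipeline.push({ $match: opts.match });
  if (opts.skip) pipeline.push({ $skip: opts.skip });
  if (opts.limit) pipeline.push({ $limit: opts.limit });

  pipeline.push({
    $project: {
      doc: "$$ROOT",
      likedBySelf: { $setIsSubset: [[userId], "$likedBy"] }
    }
  });
  return pipeline;
}
```

The result would be passed to Model.aggregate in place of the literal array above. Without a $sort stage the order of the paged documents is not guaranteed, so a real endpoint would probably add one.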
