How does MongoDB $text search work? - node.js

I have inserted the following documents into my events collection:
db.events.insert(
[
{ _id: 1, name: "Amusement Ride", description: "Fun" },
{ _id: 2, name: "Walk in Mangroves", description: "Adventure" },
{ _id: 3, name: "Walking in Cypress", description: "Adventure" },
{ _id: 4, name: "Trek at Tikona", description: "Adventure" },
{ _id: 5, name: "Trekking at Tikona", description: "Adventure" }
]
)
I've also created an index in the following way:
db.events.createIndex( { name: "text" } )
Now when I execute the following query (searching for Walk):
db.events.find({
'$text': {
'$search': 'Walk'
},
})
I get these results:
{ _id: 2, name: "Walk in Mangroves", description: "Adventure" },
{ _id: 3, name: "Walking in Cypress", description: "Adventure" }
But when I search for Trek:
db.events.find({
'$text': {
'$search': 'Trek'
},
})
I get only one result:
{ _id: 4, name: "Trek at Tikona", description: "Adventure" }
So my question is: why didn't it return
{ _id: 4, name: "Trek at Tikona", description: "Adventure" },
{ _id: 5, name: "Trekking at Tikona", description: "Adventure" }
When I searched for walk, it returned the documents containing both walk and walking. But when I searched for trek, it returned only the document containing trek, when it should have returned both trek and trekking.

MongoDB text search uses the Snowball stemming library to reduce words to an expected root form (or stem) based on common language rules. Algorithmic stemming provides a quick reduction, but languages have exceptions (such as irregular or contradicting verb conjugation patterns) that can affect accuracy. The Snowball introduction includes a good overview of some of the limitations of algorithmic stemming.
Your example of walking stems to walk and matches as expected.
However, your example of trekking stems to trekk, so it does not match your search keyword of trek.
You can confirm this by explaining your query and reviewing the parsedTextQuery information which shows the stemmed search terms used:
db.events.find({$text: {$search: 'Trekking'} }).explain().queryPlanner.winningPlan.parsedTextQuery
{
​ "terms" : [
​ "trekk"
​ ],
​ "negatedTerms" : [ ],
​ "phrases" : [ ],
​ "negatedPhrases" : [ ]
}
You can also check expected Snowball stemming using the online Snowball Demo or by finding a Snowball library for your preferred programming language.
To work around exceptions that might commonly affect your use case, you could consider adding another field to your text index with keywords to influence the search results. For this example, you would add trek as a keyword so that the event described as trekking also matches in your search results.
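For example, a minimal sketch of that workaround in the shell (the keywords field name is an assumption, and the index must be recreated because a collection can only have one text index):

// add a keyword to the document described as "trekking" (field name is assumed)
db.events.update({ _id: 5 }, { $set: { keywords: ["trek"] } })

// drop the existing text index ("name_text" is its default name) and
// recreate it over both fields
db.events.dropIndex("name_text")
db.events.createIndex({ name: "text", keywords: "text" })

// 'Trek' should now match both documents 4 and 5
db.events.find({ $text: { $search: 'Trek' } })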
There are other approaches for more accurate inflection which are generally referred to as lemmatization. Lemmatization algorithms are more complex and start heading into the domain of natural language processing. There are many open source (and commercial) toolkits that you may be able to leverage if you want to implement more advanced text search in your application, but these are outside the current scope of the MongoDB text search feature.

Related

How to create an index for partial text search on MongoDB?

I'm following the tutorial instructions: https://docs.mongodb.com/manual/core/index-text/
This is the sample data:
db.stores.insert(
[
{ _id: 1, name: "Java Hut", description: "Coffee and cakes" },
{ _id: 2, name: "Burger Buns", description: "Gourmet hamburgers" },
{ _id: 3, name: "Coffee Shop", description: "Just coffee" },
{ _id: 4, name: "Clothes Clothes Clothes", description: "Discount clothing" },
{ _id: 5, name: "Java Shopping", description: "Indonesian goods" }
]
)
Case 1: db.stores.find( { $text: { $search: "java coffee shop" } } ) => FOUND
Case 2: db.stores.find( { $text: { $search: "java" } } ) => FOUND
Case 3: db.stores.find( { $text: { $search: "coff" } } ) => NOT FOUND
I expect case 3 to be FOUND because the query matches a part of "java coffee shop".
Case 3 will not work with the $text operator, and the reason is how MongoDB creates text indexes.
MongoDB takes the values of the text-indexed fields and creates a separate index entry for each unique word in the string, not for each character(!).
So this means that in your case, for document 1:
the name field will have 2 index entries:
java
hut
and the description field will have 3 index entries:
coffee
and
cakes
The $text operator compares the $search values against these index entries, which is why "coff" will not match.
If you want to take advantage of indexes you have to use the $text operator, but it does not give you all the flexibility you would like.
Solution:
You can use $regex with the case-insensitivity option (i) and optimize your query with skip and limit.
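For example, a rough sketch against the stores collection from the question (the skip and limit values are only placeholders):

// unlike $text, $regex matches substrings, but only in the fields you query
db.stores.find({ description: { $regex: "coff", $options: "i" } })
  .skip(0)
  .limit(10)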
If you want to return all documents and the collection is large, $regex will cause performance issues.
You can also check this article https://medium.com/coding-in-depth/full-text-search-part-1-how-to-create-mongodb-full-and-partial-text-search-c09c0bae17a3 and maybe use wildcard indexes for that, but I do not know whether it is good practice or not.

MongoDB aggregation $group stage by already created values / variable from outside

Imagine I have an array of objects, available before the aggregation query:
const groupBy = [
{
realm: 1,
latest_timestamp: 1318874398, //Date.now() values, usually different to each other
item_id: 1234, //always the same
},
{
realm: 2,
latest_timestamp: 1312467986, //actually it's $max timestamp field from the collection
item_id: 1234,
},
{
realm: ..., //there are many of them
latest_timestamp: ...,
item_id: 1234,
},
{
realm: 10,
latest_timestamp: 1318874398, //but sometimes they can be the same
item_id: 1234,
},
]
And collection (example set available on MongoPlayground) with the following schema:
{
realm: Number,
timestamp: Number,
item_id: Number,
field: Number, //any other useless fields in this case
}
My problem is: how do I $group the values from the collection via the aggregation framework, using the already available set of data (from groupBy)?
What has been tried already.
Okay, let's skip the crude ideas, like:
for (const element of groupBy) {
//array of `find` queries
}
My current working aggregation query is something like this:
//first stage
{
$match: {
item: 1234,
realm: { $in: [1, 2, 3, 4, ..., 10] }
}
},
{
$group: {
_id: {
realm: '$realm',
},
latest_timestamp: {
$max: '$timestamp',
},
data: {
$push: '$$ROOT',
},
},
},
{
$unwind: '$data',
},
{
$addFields: {
'data.latest_timestamp': {
$cond: {
if: {
$eq: ['$data.timestamp', '$latest_timestamp'],
},
then: '$latest_timestamp',
else: '$$REMOVE',
},
},
},
},
{
$replaceRoot: {
newRoot: '$data',
},
},
//At last, after these stages I can do the useful work
But I find it a bit clumsy, and I have heard that using mapReduce could solve my problem a bit faster than this query (though the official docs don't sound promising about it). Is that true?
For now, I am using 4 or 5 stages before I can start working with the documents that are useful to me.
Recent update:
I have checked the $facet stage and found it interesting for this particular case. It will probably help me out.
For what it's worth:
After receiving the documents from the necessary stages, I build a representative cluster chart, which you may also know as a heatmap.
After that, I iterate over each document (or array of objects) one by one to find its correct x and y coordinates, which should be:
[
{
x: x (number, actual $price),
y: y (number, actual $realm),
value: price * quantity,
quantity: sum_of_quantity_on_price_level
}
]
For now it's old, ugly code with nested for loops, but in the future I will be using the $facet => $bucket operators for that kind of job.
So, I have found an answer to my question in a different but relevant way.
I was thinking about using the $facet operator and, to be honest, it's still an option, but using it as below is bad practice.
//building $facet query before aggregation
const ObjectQuery = {}
for (const realm of realms) {
  Object.assign(ObjectQuery, { [realm.name]: [ ... ] })
}
//mongoose query here
aggregation([{
$facet: ObjectQuery
},
...
])
So, I have chosen a $project stage with the $switch operator to filter results, much like $group does.
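A minimal sketch of that idea, assuming the groupBy array from the question is in scope when the pipeline is built (field names follow the question's schema; this is not necessarily the exact query):

{
  $project: {
    realm: 1,
    item_id: 1,
    timestamp: 1,
    // inject each realm's known latest_timestamp from the groupBy array
    latest_timestamp: {
      $switch: {
        branches: groupBy.map((g) => ({
          case: { $eq: ['$realm', g.realm] },
          then: g.latest_timestamp,
        })),
        default: -1,
      },
    },
  },
},
// keep only the documents whose timestamp equals the injected value
{
  $match: { $expr: { $eq: ['$timestamp', '$latest_timestamp'] } },
},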
Using mapReduce could also solve this problem, but for some reason the official MongoDB docs recommend avoiding it and using the aggregation $group and $merge operators instead.
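For completeness, a rough sketch of that recommended pattern (the output collection name is an assumption; $merge requires MongoDB 4.2 or later):

db.collection.aggregate([
  { $match: { item_id: 1234 } },
  { $group: { _id: '$realm', latest_timestamp: { $max: '$timestamp' } } },
  { $merge: { into: 'latest_timestamps' } },
])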

How to search cases by CompanyId in Netsuite Suitescript 2.0?

I am able to search a case by company name:
var mySearch = search.create({
type: search.Type.SUPPORT_CASE,
columns: [{
name: 'title'
}, {
name: 'company'
}],
filters: [{
name: 'company',
operator: 'is',
values: 'Test'
}]
});
return mySearch.run({
id: mySearch.id
}).getRange({
start: 0,
end: 1000
});
But I am not able to search a case by company ID.
The company ID is 115.
The following do not work:
i)
filters: [{
name: 'company',
operator: 'is',
values: 115
}]
ii)
filters: [{
name: 'companyid',
operator: 'is',
values: 115
}]
According to the Case schema company is a Text filter, meaning you would have to provide it with the precise Name of the company, not the internal ID.
Instead you may want to use the customer.internalid joined filter to provide the internal ID. Also, Internal ID fields are nearly always Select fields, meaning they do not accept the is operator, but instead require the anyof or noneof operator.
You can find the valid operators by field type on the Help page titled Search Operators.
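For example, a sketch of that joined filter using the object syntax from the question (the join and values shown here are assumptions based on the description above):

filters: [{
    name: 'internalid',
    join: 'customer',
    operator: 'anyof',
    values: ['115']
}]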
First, you can try this:
var supportcaseSearchObj = search.create({
type: "supportcase",
filters:
[
["company.internalid","anyof","100"]
],
columns:
[
search.createColumn({
name: "casenumber",
sort: search.Sort.ASC
}),
"title",
"company",
"contact",
"stage",
"status",
"profile",
"startdate",
"createddate",
"category",
"assigned",
"priority"
]
});
Second: how did I get this? The answer is a hint that will make your life easier:
Install the "NetSuite Saved Search Code Export" chrome plugin.
In the NetSuite UI, create your saved search (it is always easier than doing it in code).
After saving the search, open it again for editing.
At the top right corner (near the list and search menus on the NetSuite page), you will see a link "Export as script": click on it and you will get your code ;)
If you cannot install the Chrome plugin:
In the NetSuite UI, create your saved search (it is always easier than doing it in code).
In your code, load your saved search.
Add a log.debug call to show [loadedSearchVar].filters (see the sketch after this list).
You can then copy what you see in the log and use it as your search filters.
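A minimal sketch of those last two steps, assuming the N/search module is already imported and using a hypothetical saved search script ID:

// load the saved search created in the UI (script ID is hypothetical)
var loadedSearch = search.load({ id: 'customsearch_my_case_search' });
// dump its filters so they can be copied into code
log.debug('filters', JSON.stringify(loadedSearch.filters));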
Good luck!

Elasticsearch 6.2 - Completion Suggester for long texts

I want to be able to search and suggest through long texts.
Below is my input string:
Clinical Support Specialist Medical Staff
If I search for clin or supp or spe or med or st, it should return the above string as a result.
Searches could also be like clinical sup or specialist medi.
Below is the mapping I created for the field:
description: {
type: 'completion',
analyzer: 'simple',
preserve_separators: true,
preserve_position_increments: true,
contexts: {
name: 'company',
type: 'category',
path: 'company',
}
}
And below is the search body:
descSuggestor: {
prefix: searchTerm,
completion: {
field: 'description'
}
}
Your question does not specify the Elasticsearch version or the environment in which you are trying to write your search query. However, you could do this with a regular expression query in Kibana. For example, in the Dev Tools of Kibana, you could write something like:
GET utilization_aggregation_2018/_search
{
"query": {
"regexp" : {"name": "supp.*"}
}
}
Hope this helps!

Using mongodb, how can I remove an object from an array based on one of its properties?

I have a mongodb collection containing an array of objects, and I'd like to remove one or more of the array objects based on their properties.
Example document from my collection:
nickname: "Bob",
isLibrarian: false,
region: "South America",
favoriteBooks: [
{
title: "Treasure Island",
author: "Robert Louis Stevenson"
},
{
title: "The Great Gatsby",
author: "F. Scott Fitzgerald"
},
{
title: "Kidnapped",
author: "Robert Louis Stevenson"
}
]
In this example, how do I remove, from each of the documents in my collection, all objects within the favoriteBooks array whose author matches "Robert Louis Stevenson"?
So that afterwards this user document would be:
nickname: "Bob",
isLibrarian: false,
region: "South America",
favoriteBooks: [
{
title: "The Great Gatsby",
author: "F. Scott Fitzgerald"
}
]
and other documents in the collection would be equally trimmed of books by Stevenson.
Thank you in advance for any insight! I love MongoDB but I'm wondering if I've bitten off more than I can chew here...
The MongoDB statement below will delete the books that match a particular author:
db.collection.update(
{},
{$pull:{favoriteBooks:{author:"Robert Louis Stevenson"}}},
{multi:true}
);
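On newer MongoDB versions the same operation can be written with updateMany, which applies to all matching documents without the multi flag:

db.collection.updateMany(
  {},
  { $pull: { favoriteBooks: { author: "Robert Louis Stevenson" } } }
);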
