Mongoose.js schema description issue (array vs object) - node.js

I need to store user dictionaries (sets of words) as a property of the user. Each dictionary has only one property of its own: language.
So I can describe that property like this:
dictionaries: [{
  language: 'string',
  words: [.. word entry schema desc..]
}]
and store dictionaries like this:
dictionaries: [
  {language: 'en', words: [.. words of English dictionary..]},
  {language: 'es', words: [.. words of Spanish dictionary..]}
]
But I could also store the dictionaries in a "less nested" way, as an object rather than an array:
dictionaries: {
  en: [.. words of English dictionary..],
  es: [.. words of Spanish dictionary..]
}
But I don't see a way to describe such an object with a Mongoose schema. So the question is: which option is best (more reasonable in terms of storage and querying), considering that I use Mongoose?
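For comparison, the two shapes carry the same information and can be converted into each other. The following is a minimal sketch in plain JavaScript, not Mongoose code. The helper names `toKeyed` and `toArray` and the sample words are hypothetical, and it assumes each language occurs at most once per user. On the schema side, recent Mongoose versions offer a Map schema type (`type: Map, of: ...`) for objects with arbitrary keys, which may be worth checking against the version in use.

```javascript
// Hypothetical helpers, assuming each dictionary has a unique language code.

// Array form -> keyed form: [{language, words}] => {en: [...], es: [...]}
function toKeyed(dictionaries) {
  const keyed = {};
  for (const { language, words } of dictionaries) {
    keyed[language] = words;
  }
  return keyed;
}

// Keyed form -> array form: {en: [...]} => [{language: "en", words: [...]}]
function toArray(keyed) {
  return Object.entries(keyed).map(([language, words]) => ({ language, words }));
}

// Sample data invented for the example; real word entries would follow the word entry schema.
const asArray = [
  { language: "en", words: ["house", "tree"] },
  { language: "es", words: ["casa", "árbol"] },
];
const asObject = toKeyed(asArray);

console.log(asObject);
console.log(toArray(asObject));
```

One practical difference: with the array form the language is a value, so a fixed field path such as "dictionaries.language" can be queried, while with the keyed form the language is part of the field path ("dictionaries.en", "dictionaries.es").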

Related

Mongoose: Try finding a document with multiple conditions stored in an array

What I am trying to achieve is to find a document by its name and the language.
My current attempt looks like the following one:
Model.find({"name.language": languageIso2, "name.value": { "$regex": search, "$options": "i" }}, (error, SpecialConditons) => {...})
Unfortunately, it doesn't work because of the following case:
Let's say the language is "gb" and you use "a" as a filter for the name. This returns the model shown in the picture above, because the array "name" contains both (1) the language "gb" and (2) an "a" in the object holding the Finnish translation "Touretten syndrooma".
What I want is to look at the object whose language is "gb" and return the model only if that same object also contains an "a" in its value field. In this case the value for "gb" doesn't contain an "a", so the model should not be returned.
Any idea how to achieve this? Thanks in advance.
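Requiring both conditions to hold for the same array element is what MongoDB's `$elemMatch` operator is for. The sketch below builds such a filter as a plain object and imitates the per-element rule in memory. The helper names and the sample document are hypothetical; in particular the "gb" value is assumed, since the original picture is not available.

```javascript
// Builds a filter in which a single element of the "name" array
// must satisfy both the language and the value condition.
function buildNameFilter(languageIso2, search) {
  return {
    name: {
      $elemMatch: {
        language: languageIso2,
        value: { $regex: search, $options: "i" },
      },
    },
  };
}

// In-memory imitation of that per-element rule, for illustration only.
function matchesSameElement(doc, languageIso2, search) {
  const pattern = new RegExp(search, "i");
  return doc.name.some(
    (entry) => entry.language === languageIso2 && pattern.test(entry.value)
  );
}

// Hypothetical sample document; the "gb" value is an assumption.
const sampleDoc = {
  name: [
    { language: "gb", value: "Tourette syndrome" },
    { language: "fi", value: "Touretten syndrooma" },
  ],
};

console.log(matchesSameElement(sampleDoc, "gb", "a")); // false
console.log(matchesSameElement(sampleDoc, "fi", "a")); // true
```

The filter would then be passed to the query in place of the dotted conditions, for example `Model.find(buildNameFilter(languageIso2, search), callback)`.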

String matching keywords and key phrases in Python

I am trying to perform a smart dynamic lookup with strings in Python for an NLP-like task. I have a large number of similarly structured sentences, and I would like to parse each one and tokenize parts of it. For example, I first parse a string such as "bob goes to the grocery store".
I take this string, split it into words, and my goal is to look up matching words in a keyword list. Let's say I have a list of single keywords such as "store" and a list of keyword phrases such as "grocery store".
sample = 'bob goes to the grocery store'
keywords = ['store', 'restaurant', 'shop', 'office']
keyphrases = ['grocery store', 'computer store', 'coffee shop']
for word in sample.split():
    pass  # do dynamic length lookups
Now the issue is this: sometimes my sentences might be simply "bob goes to the store" instead of "bob goes to the grocery store".
I want to find the keyword "store" for sure but if there are descriptive words such as "grocery" or "computer" before the word store I would like to capture that as well. That is why I have the keyphrases list as well. I am trying to figure out a way to basically capture a keyword at the very least then if there are words related to it that might be a possible "phrase" I want to capture those too.
Maybe an alternative is to have some sort of adjective list instead of a phrase list of multiple words?
How could I go about doing this sort of variable-length lookup, where I look at more than just a single word once one is captured, or is there an entirely different method I should be considering?
Here is how you can use a nested for loop and a formatted string:
sample = 'bob goes to the grocery store'
keywords = ['store', 'restaurant', 'shop', 'office']
keyphrases = ['grocery', 'computer', 'coffee']
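for kw in keywords:
    for kp in keyphrases:
        # Check whether "<modifier> <keyword>" occurs in the sentence
        if f"{kp} {kw}" in sample:
            print(f"{kp} {kw}")  # do something with the matched phrase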

Get the list of documents whose field words all exist in a given string in MongoDB

I want to find the documents in MongoDB whose field value consists only of words that appear in a given string.
For example, I have a collection with these documents:
{
  "id": 1,
  "text": "go for shopping",
  "description": "you can visit this branch as well"
}
{
  "id": 2,
  "text": "check exiting discount",
  "description": "We have various discount options"
}
Now, if I pass a string like "I want to go for shopping" against the text field in a MongoDB find query, I should get the first document as output, because its text field value "go for shopping" exists in the input string.
This can be achieved with the $text operator in MongoDB, but you first have to create a text index on the "text" field in your database (or on whichever field you want to match; I would suggest renaming it in your database to avoid confusion).
db.yourCollectionName.createIndex({"text":"text"})
The first "text" here is the name of the field in your database, and the second one is the index type.
Then you can pass any query like,
db.yourCollectionName.find({$text: {$search: "I want to go for shopping"}})
The "$text" here is the mongo operator.
This returns all documents that contain any of the search terms above.
You can read more about it and adapt it to your needs.
Ref: MongoDb $text
You can do this with a regular expression. MongoDB supports matching strings against regex patterns.
In your case you could do something like:
db.yourCollectionName.find({text:{$regex:"go for shopping" }})
This returns all the documents that have the phrase "go for shopping" in the text field.
Ref: MongoDb Regex
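Note the direction of the match in the question: the stored text value should occur inside the input string, not the other way around. The following is a minimal sketch of that check in application code, over documents already loaded into memory. The function names are hypothetical, and the sample documents are copied from the question.

```javascript
// Sample documents copied from the question.
const docs = [
  { id: 1, text: "go for shopping", description: "you can visit this branch as well" },
  { id: 2, text: "check exiting discount", description: "We have various discount options" },
];

// Returns documents whose whole "text" value occurs inside the input (case-insensitive).
function findContained(documents, input) {
  const haystack = input.toLowerCase();
  return documents.filter((doc) => haystack.includes(doc.text.toLowerCase()));
}

// Looser variant: every word of "text" must appear somewhere in the input, in any order.
function findAllWordsPresent(documents, input) {
  const inputWords = new Set(input.toLowerCase().split(/\s+/).filter(Boolean));
  return documents.filter((doc) =>
    doc.text.toLowerCase().split(/\s+/).every((word) => inputWords.has(word))
  );
}

console.log(findContained(docs, "I want to go for shopping").map((d) => d.id)); // [ 1 ]
```

This scans every document, so it is only a way to pin down the intended matching rule; for a large collection, a $text query could first narrow the candidates and a check like this could then filter them.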

Data structure / data model for multi-language phrasebook

We want to create a multi-language phrasebook / dictionary for a specific area.
And now I'm thinking about the best data structure / data model for that.
Since it should be more of a phrasebook than a dictionary, we want to keep the data model / structure simple at first. It should only be used for fast translation: i.e. the user selects two languages, types a word and gets the translation. The article and description parts are just for display, not for search.
There are some specific cases I'm thinking about:
One term can be expressed with several (1..n) words in any language
Any term can also be translated into several (1..m) words in another language
In some languages the word's article could be important to know
For some words a description could be important (e.g. for words from dialects etc.)
I'm not sure about one point: am I reinventing the wheel by creating a data model myself? I couldn't find any existing solutions.
I've just created a JSON data model, but I'm not sure whether it is good enough:
[
  {
    'wordgroup-id': 1,
    en: [
      {word: 'car', plural: 'cars'},
      {word: 'auto', plural: 'autos'},
      {word: 'vehicle', plural: 'vehicles'}
    ],
    de: [
      {word: 'Auto', article: 'das', description: 'Some explanation eg. when to use this word', plural: 'Autos'},
      {word: 'Fahrzeug', article: 'das', plural: 'Fahrzeuge'}
    ],
    ru: [...],
    ...
  },
  {
    'wordgroup-id': 2,
    ...
  },
  ...
]
I also thought about some "corner" cases @triplee wrote about. I thought of solving them with some kind of redundancy. Only the word group id and the word within a language should be unique.
I would be very thankful for any feedback on this first draft of the data model.
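To see how the draft model behaves for the "select two languages, type a word" use case, here is a minimal lookup sketch. The function name `translate` is hypothetical, the data is an excerpt of the draft above, and matching is assumed to be a case-insensitive comparison on the `word` field only.

```javascript
// Excerpt of the draft model above (first word group, en and de only).
const phrasebook = [
  {
    "wordgroup-id": 1,
    en: [
      { word: "car", plural: "cars" },
      { word: "auto", plural: "autos" },
      { word: "vehicle", plural: "vehicles" },
    ],
    de: [
      { word: "Auto", article: "das", plural: "Autos" },
      { word: "Fahrzeug", article: "das", plural: "Fahrzeuge" },
    ],
  },
];

// Returns all entries in the target language from every word group
// that contains the term in the source language.
function translate(book, from, to, term) {
  const needle = term.toLowerCase();
  const results = [];
  for (const group of book) {
    const sources = group[from] || [];
    if (sources.some((entry) => entry.word.toLowerCase() === needle)) {
      results.push(...(group[to] || []));
    }
  }
  return results;
}

const hits = translate(phrasebook, "en", "de", "car");
console.log(hits.map((e) => `${e.article} ${e.word}`)); // [ 'das Auto', 'das Fahrzeug' ]
```

The scan is linear in the number of word groups. If lookups need to be fast on a large phrasebook, a precomputed index from (language, word) to word group ids would be one option.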

mongodb, finding by coordinate + query

I'm building a web application on Node.js and MongoDB which is based on geolocated points.
The document is something like this:
{
  name: "",
  keywords: [Array of strings],
  location: {lng: double, lat: double}
}
I am wondering how I could use find() to find documents that are near a coordinate but, in addition, match any of the keywords in the keywords array.
Imagine that the keywords are: ["restaurant", "bar", "coffee"]
I've looked into the 2d index, but the secondary field of the index must be a string. It can't be an array of strings.
The problem is that a document could have more than one keyword (or category), so I can't use a simple string to query them.
How would you implement this?
Thanks!
What version of mongo? It looks like this was added in 2.4.0: SERVER-8457
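Assuming a MongoDB version where that restriction is lifted, as the comment suggests, the query could combine `$near` on the location with `$in` on the keywords. The sketch below only builds the index specification and the filter as plain objects. The helper name `buildNearbyFilter` is hypothetical, and the compound index shape is an assumption to verify against the server version in use.

```javascript
// Assumed compound index: 2d on location, followed by the keywords array.
const indexSpec = { location: "2d", keywords: 1 };

// Builds a filter for documents near [lng, lat] that have
// at least one of the given keywords in their keywords array.
function buildNearbyFilter(lng, lat, keywords) {
  return {
    location: { $near: [lng, lat] },
    keywords: { $in: keywords },
  };
}

const filter = buildNearbyFilter(2.17, 41.38, ["restaurant", "bar", "coffee"]);
console.log(JSON.stringify(filter));
```

The resulting object would be passed to find() on the collection. `$in` against an array field matches a document when any element of the array equals one of the listed values, which covers documents with more than one keyword.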
