Join within the same collection using mongoose - node.js

Pardon my ignorance if it is a very basic question.
ABC collection in MongoDB has the following schema.
{
"metadata": {
"store": 5051,
"catg": "XYZ"
},
"category": {
"name": "XYZ",
"id": "CL141778"
}
}
I need to query for documents where "metadata.catg" == "category.name".
What is the best way to do it without using mongoose dbref?

With MongoDB's basic query operators you cannot compare the values of two fields with each other in a query. I expect that you need to redesign your schema. You need to look at how you query (and update), as opposed to treating MongoDB as yet another RDBMS.
Show us the real schema, some example documents and a few query types that you need to run and I can see if I can help.
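One caveat worth adding: MongoDB 3.6 and later provide the `$expr` operator, which lets a find filter compare two fields of the same document. The following is a minimal sketch, not a tested setup for this schema. The field paths come from the question, and `hasMatchingCategory` is a hypothetical plain-JavaScript helper that only illustrates what the filter expresses.

```javascript
// Filter object for MongoDB 3.6+: match documents whose metadata.catg
// equals their own category.name. "$field.path" refers to a field value.
const sameCategoryFilter = {
  $expr: { $eq: ["$metadata.catg", "$category.name"] },
};

// Hypothetical in-memory equivalent of the same condition,
// useful for checking a document that has already been loaded.
function hasMatchingCategory(doc) {
  return doc.metadata.catg === doc.category.name;
}
```

Assuming a Mongoose model for the ABC collection (called `Abc` here purely for illustration), the filter could be passed as `Abc.find(sameCategoryFilter)`. Such a comparison is evaluated per document, so it may be slower than an indexed equality match.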

Related

Query/Filter CosmosDB Core SQL by key name value

I have the following data structure (to be able to query and filter by day):
"id": "someId",
"dailyPlan": {
"2020-04-07": {
"activityCategoryOne": {
"activityOne": "someID",
"activityTwo": "someIDTwo"
}
}
}
So now I want to query only the items from the past and the next two weeks. But I have no idea how I can query the key name ("2020-04-07").
Is this possible and if so could you give me a hint or a link to the documentation as I couldn't find anything about it in my research.
Thank you very much!
Best,
Tixamaster
As the comments said, this cannot be achieved with SQL in Cosmos DB.
The best workaround is to change your schema so that the date is a value rather than a key name:
"id": "someId",
"dailyPlan": {
"date": "2020-04-07",
"activityCategoryOne": {
"activityOne": "someID",
"activityTwo": "someIDTwo"
}
}
Then use SQL like this:
SELECT * FROM c WHERE c.dailyPlan.date BETWEEN xxx AND xxx
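The `xxx` placeholders can be filled from application code. Below is a minimal sketch that assumes "the past and the next two weeks" means 14 days on either side of today, and that dates are stored as `YYYY-MM-DD` strings as in the schema above (such strings sort in date order). The function name and the `@start`/`@end` parameter names are illustrative, not part of the original answer.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Format a Date as YYYY-MM-DD (UTC), matching the stored "date" strings.
function toIsoDate(date) {
  return date.toISOString().slice(0, 10);
}

// Build a parameterized query spec covering [today - daysBack, today + daysAhead].
// The 14-day defaults are an assumption about what "two weeks" means here.
function buildDailyPlanQuery(today, daysBack = 14, daysAhead = 14) {
  const start = new Date(today.getTime() - daysBack * DAY_MS);
  const end = new Date(today.getTime() + daysAhead * DAY_MS);
  return {
    query: "SELECT * FROM c WHERE c.dailyPlan.date BETWEEN @start AND @end",
    parameters: [
      { name: "@start", value: toIsoDate(start) },
      { name: "@end", value: toIsoDate(end) },
    ],
  };
}
```

With the `@azure/cosmos` SDK, a spec of this shape could be passed to `container.items.query(spec).fetchAll()`, where `container` is whatever container object the application already holds.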

Data model for nested array of objects in Firestore

I need advice from experienced NoSQL engineers on how I should structure my data.
I want to model my SQL data structure to NoSQL for Google Cloud Firestore.
I have no prior experience with NoSQL databases but I am proficient with traditional SQL.
I use Node.js for writing queries.
So far, I converted three tables to JSON documents with example data:
{
"session": {
"userId": 99992222,
"token": "jwttoken1191891j1kj1khjjk1hjk1kj1",
"created": "timestamp"
}
}
{
"user": {
"id": 99992222,
"username": "userName",
"avatarUrl": "https://url-xxxx.com",
"lastLogin": "2019-11-23 13:59:48.884549",
"created": "2019-11-23 13:59:48.884549",
"modified": "2019-11-23 13:59:48.884549",
"visits": 1,
"profile": true,
"basketDetail": { // I get this data from a third party API
"response": {
"product_count": 2,
"products": [
{
"product_id": 111,
"usageInMinutes_recent": 0,
"usageInMinutes": 0,
"usageInMinutes_windows": 0,
"usageInMinutes_mac": 0,
"usageInMinutes_linux": 0
},
{
"product_id": 222, // no recent usage here
"usageInMinutes": 0,
"usageInMinutes_windows": 0,
"usageInMinutes_mac": 0,
"usageInMinutes_linux": 0
}
]
}
}
}
}
{
"visitor": {
"id": 999922221,
"created": "2019-11-23 13:59:48.884549"
}
}
My questions:
session.userId, user.id, visitor.id can all signify the same user. What is the Firestore equivalent to foreign keys in SQL? How would I connect/join these three collections in a query?
What do I do about the nested object basketDetail? Is it fine where it is or should I define its own collection?
I anticipate queries that:
occasionally add up all the recent usage.
frequently check if a user owns a specific product_id.
frequently replace the whole basketDetail object with new data.
occasionally update one specific product_id.
How would I connect collections user with basketDetail in a query if I separated it?
Thanks for the advice!
session.userId, user.id, visitor.id can all signify the same user. What is the Firestore equivalent to foreign keys in SQL? How would I connect/join these three collections in a query?
Unfortunately, there is no JOIN clause in Firestore. Queries in Firestore are shallow: they can only get elements from the collection that the query is run against. There is no way to get documents from two collections in a single query unless you use a collection group query, and that does not apply here because the collections in your project have different names.
If you have three collections, then three separate queries are required. There is no way you can achieve that in a single go.
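The "three separate queries" approach can be sketched as three reads issued in parallel and merged in application code. This is only an illustration under stated assumptions: `db` is a Firestore instance from the Node.js Admin SDK, the collection names `users`, `sessions` and `visitors` are hypothetical, and each document is assumed to be stored under the user id as its document id.

```javascript
// Load the three documents that belong to one user and combine them.
// Firestore does not join; the merge happens here, in application code.
async function loadUserBundle(db, userId) {
  const id = String(userId); // assumption: document ids are the stringified user id

  // Three independent reads, started together so they run in parallel.
  const [userSnap, sessionSnap, visitorSnap] = await Promise.all([
    db.collection("users").doc(id).get(),
    db.collection("sessions").doc(id).get(),
    db.collection("visitors").doc(id).get(),
  ]);

  // In the Admin SDK, `exists` is a boolean property on the snapshot.
  return {
    user: userSnap.exists ? userSnap.data() : null,
    session: sessionSnap.exists ? sessionSnap.data() : null,
    visitor: visitorSnap.exists ? visitorSnap.data() : null,
  };
}
```

Running the reads in parallel does not reduce the number of billed reads, but it keeps the total latency close to that of the slowest single read.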
What do I do about the nested object basketDetail? Is it fine where it is or should I define its own collection?
There are some limits when it comes to how much data you can put into a document. According to the official documentation regarding usage and limits:
Maximum size for a document: 1 MiB (1,048,576 bytes)
As you can see, you are limited to 1 MiB of data in total in a single document. So if you think that the nested object basketDetail can stay within this limit, then you can use that schema; otherwise, move it to a subcollection. Besides that, all those operations are permitted in Firestore. If you have a hard time implementing them, post another question so we can take a look at it.
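If basketDetail stays embedded, the two read-style operations from the question can run on the user document after it has been fetched. The sketch below uses hypothetical helper names and assumes the field layout shown in the question (`basketDetail.response.products`), treating a missing `usageInMinutes_recent` as zero.

```javascript
// Does this user own the given product? Works on an already-loaded user object.
function ownsProduct(user, productId) {
  return user.basketDetail.response.products.some(
    (product) => product.product_id === productId
  );
}

// Sum of recent usage across all products; products without the
// usageInMinutes_recent field (as in the question's second entry) count as 0.
function totalRecentUsage(user) {
  return user.basketDetail.response.products.reduce(
    (sum, product) => sum + (product.usageInMinutes_recent || 0),
    0
  );
}
```

Both helpers cost one document read per user, which fits the "frequently check" pattern as long as the document stays comfortably under the size limit.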
How would I connect collections user with basketDetail in a query if I separated it?
You cannot connect/join two collections. If you separate basketDetail into a subcollection, then two queries are required.

Mongoose and nodejs: about schema and query

I'm building a rest api that allows users to submit and retrieve data produced by surveys: questions are not mandatory and each "submit" could be different from each other. Each submit is a json with data and a "survey id":
{
id: abc123,
surveyid: 123,
array: [],
object: {}
...
}
I have to store this data and allow retrieving and querying.
First approach: going without schema and putting everything in a single collection: it works, but each json field is treated as a "String" and making queries on numeric values is problematic.
Second approach: get the question datatypes for each survey, make/save a mongoose schema in a json file, and then keep this file updated.
Something like this, where the entry "schema : {}" represents a mongoose schema used for inserting and querying/retrieving data.
[
{
"surveyid" : "123",
"schema" : {
"name": "string",
"username" : "string",
"value" : "number",
"start": "date",
"bool" : "boolean",
...
}
},
{ ... }
]
Hoping this is clear, I've some questions:
Right now I have a single collection for all "submits" and everything is treated as a string. Can I use a mongoose schema, without other modifications, in order to specify that some fields are numeric (or date or whatever)? Is it allowed, and is it even a good idea?
Are there any disadvantages to using an external json file? Are mongoose schemas loaded at run time when requested, or does the service need to be restarted when this file is updated?
How should I store data with a "schema" that could change often?
I hope it's clear!
Thank you!
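For what it is worth, the second approach can be sketched as a small translation step from the type names in the json file to the constructors Mongoose accepts as schema types. This is a hedged illustration, not a recommendation: `TYPE_MAP` and `toSchemaDefinition` are hypothetical names, and only the four type names that appear in the example above are handled.

```javascript
// Map the type names used in the json file to JavaScript constructors,
// which Mongoose accepts as schema types.
const TYPE_MAP = {
  string: String,
  number: Number,
  date: Date,
  boolean: Boolean,
};

// Turn { "value": "number", ... } into { value: { type: Number }, ... }.
// Unknown type names raise an error instead of being silently ignored.
function toSchemaDefinition(fieldTypes) {
  const definition = {};
  for (const [field, typeName] of Object.entries(fieldTypes)) {
    const type = TYPE_MAP[typeName];
    if (!type) {
      throw new Error(`Unsupported type "${typeName}" for field "${field}"`);
    }
    definition[field] = { type };
  }
  return definition;
}
```

With mongoose installed, the result could be passed to `new mongoose.Schema(definition, { strict: false })`, where `strict: false` keeps fields that the definition does not list. Because the definition is built at run time, a new schema object can be created whenever the file is reloaded.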

How do I retrieve a DocumentDB id when querying with the MongoDB API?

EDIT: we have just heard back from Microsoft that using the mongodb api against a "traditional" documentdb instance isn't supported. I'll leave this question open until someone comes up with a custom solution or Microsoft enhances the product.
We have successfully connected to our existing DocumentDB instance using the mongodb API, but the find() method does not return the id field inside the results.
The examples on the Query Tutorials page show an id field being returned, but they also show an "_id": "ObjectId(\"58f65e1198f3a12c7090e68c\")" field appended with no explanation as to how that 58f6... string is generated (it doesn't appear anywhere in the sample document at the top of the page).
For example, we can find this document using the query find({"type":"attribute", "association.Season":"Spring"}):
{
"id": "ATTR-TEST-ATT-00007",
"type": "attribute",
"name": "Spring Football",
"association": {
"Season": "Spring"
}
}
...but the mongodb API leaves out the id property from the document, so all we see is:
{
"type": "attribute",
"name": "Spring Football",
"association": {
"Season": "Spring"
}
}
Also, if we use find({"id": "ATTR-TEST-ATT-00001"}) nothing gets returned even though there is a document with that id in the database.
We have tried using the project argument to force inclusion of the id field, but it didn't work.
If we add an _id field to the document then the mongodb driver returns the _id field without needing a projection, but this is not a solution as we want to use both the DocumentDB SQL API and the mongodb API at the same time.
{
"id": "ATTR-TEST-ATT-00007",
"_id": "ATTR-TEST-ATT-00007",
"type": "attribute",
"name": "Spring Football",
"association": {
"Season": "Spring"
}
}
How do we get the mongodb API to return the document's id field in queries?

How to populate a non related field mongoose

Document Role =
{ "_id": "12345",
"Name": "Developer"
},
{ "_id": "67890",
"Name": "Manager"
}
Document Employee =
{ "_id": "00000",
"Name": "Jack",
"Roles": [{"_id": "12345"}, {"_id": "67890"}]
}
I want to select one Role and list all the employees that have that role.
How can I do that?
I want to get something like:
{ "_id": "12345",
"Name": "Developer",
"Employees": [{"_id": "00000"}]
}
Is it possible to use populate to achieve this?
Mongoose .populate() and other methods you might find are not "join magic" for MongoDB. What they all in fact do is execute "additional" queries on the database and "merge" the results "under the hood" for you, as opposed to you doing the work yourself.
So your best option, as long as you can live with it, is "embedding", which keeps the "related" information in the document it is "paired" with, such as for the "Developer" role:
{
"_id": "12345",
"name": "Developer",
"employees": [{ "_id": "00000", "name": "Jack" }]
}
This is simple, but of course comes at its own cost: you have to deal with the "embedded" entries differently when "updating" and when "reading". It's a single "read" operation, but "updates" may be more costly because the embedded information has to be updated in multiple places and in multiple documents.
If you can "live" with "referencing" and the cost it incurs then you can always do this:
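var mongoose = require('mongoose');
var Schema = mongoose.Schema;
// Each side stores an array of ObjectId references to the other model
var rolesSchema = new Schema({
"name": String,
"employees": [{ "type": Schema.Types.ObjectId, "ref": "Employee" }]
});
var employeesSchema = new Schema({
"name": String,
"roles": [{ "type": Schema.Types.ObjectId, "ref": "Role" }]
});
var Role = mongoose.model('Role',rolesSchema);
var Employee = mongoose.model('Employee',employeesSchema);
// "12345" is the placeholder id from the question; a real ObjectId is a 24-character hex string.
// exec() returns a promise; recent Mongoose versions no longer accept a callback here.
Role.find({ "_id": "12345"}).populate("employees").exec().then(function(docs) {
// populated "joined" results in here
})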
What this does behind the scenes is effectively the following (a basic JavaScript representation, and "at best"):
var roles = db.roles.find({ "_id": "12345" }).map(function(doc) {
// one extra query per role document to fetch the referenced employees
doc.employees = db.employees.find({ "_id": { "$in": doc.employees } }).toArray();
return doc;
});
Mongoose works on the concept of using the "schema" definition to "know" which collection to execute the "other query" on and then return the "joined" results to you. But it is not a single query but multiple hits to the database.
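The "other query plus merge" step can be sketched in plain JavaScript against in-memory arrays that stand in for the two collections. The data is taken from the question, where the reference lives on the Employee side. The function name is hypothetical, and this illustrates the idea only, not what Mongoose literally executes.

```javascript
// In-memory stand-ins for the two collections from the question.
const roles = [
  { _id: "12345", Name: "Developer" },
  { _id: "67890", Name: "Manager" },
];
const employees = [
  { _id: "00000", Name: "Jack", Roles: [{ _id: "12345" }, { _id: "67890" }] },
];

// First "hit": find the role. Second "hit": find employees referencing it.
// Then merge the two results into one object, as populate-style code would.
function roleWithEmployees(roleId, roleDocs, employeeDocs) {
  const role = roleDocs.find((r) => r._id === roleId);
  if (!role) return null;
  const matching = employeeDocs
    .filter((e) => e.Roles.some((r) => r._id === roleId))
    .map((e) => ({ _id: e._id }));
  return { ...role, Employees: matching };
}
```

Against a real database, the second step would correspond to a query along the lines of `Employee.find({ "Roles._id": roleId })`, assuming the field names from the question. That is still a separate round trip from the one that fetches the role.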
Other schemes might "keep" the referenced collection information in the document itself, as opposed to relying on the "model code" to get that information. But the same principle applies where you need to make another call to the database and perform some type of "merge" in the API provided.
So it all comes down to your choice. Either you "embed" the data and live with that cost, or you "reference" the data and live with the network "cost" that is associated with multiple database hits.
The key point here is "nothing is free", not even the way that SQL RDBMSs perform "joins", which has a "cost" of its own. That cost is a large part of the reason why NoSQL solutions like MongoDB work this way and "do not support joins" natively: the "cost" is too high in distributed data systems.
The main lesson here is to "do what suits you and your application", and not just choose the "coolest thing right now", but to understand what you get from choosing different storage solutions. They all have their own purposes. Horses for courses, as the saying goes.
