MongoDB: Copy a collection of referenced documents as subdocuments - node.js

I made the mistake of designing a schema with two collections, where the documents in one contain a manual reference to the other. I realize now that I should have designed it so that the parent collection contains the other collection's documents as sub-documents instead.
The problem is, I've already put this schema into a production environment where hundreds of entries have already been created. What I'd like to do is somehow scan over all of the existing data and copy each item into its referenced parent (via user_id) as a sub-document.
Here is an example of my schema:
Collection 1 - User
_id
Name
Collection 2 - Photos
_id
url
user_id
Is there a quick way to change the existing documents to be one collection like this:
Collection - User
_id
Name
Photos: [...]
Once I have the database set up correctly, I can easily modify my code to use the new one, but the problem I'm having is figuring out how to quickly/procedurally copy the documents to their parent.
Additional detail - I'm using MongoHQ.com to host my MongoDB.
Thank You.

I don't know the specifics of your environment, but this sort of change usually involves the following steps:
1. Ensure that your old code doesn't complain if there is a Photos array in the User object.
2. "Freeze" the application so that new User and Photo documents are not created.
3. Run a migration script that copies the Photo documents into the User documents. This should be pretty easy to write either in JavaScript or through app code using the driver (see example below).
4. Deploy the new version of the application that expects Photos to be embedded in the array.
5. "Unfreeze" the application to start creating new documents.
If you cannot "Freeze/Unfreeze", you will need to run a delta script after step 4 that migrates Photo documents created after the new application is deployed.
The script will look something like this (untested):
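db.User.find().forEach(function (u) {
    // Collect every Photo document that references this user
    u.Photos = [];
    db.Photo.find({user_id: u._id}).forEach(function (p) {
        u.Photos.push(p);
    });
    // Write the user back with the embedded Photos array
    db.User.save(u);
});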
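If one Photo query per user turns out to be slow, the same embedding can be done by grouping the photos by user_id in a single pass. The following is a minimal sketch of that grouping logic on plain arrays, not a tested migration: embedPhotos is a hypothetical helper name, and the two arrays stand in for the results of the User and Photo queries.

```javascript
// Hypothetical helper: returns copies of the users with their photos embedded.
// The inputs are plain arrays standing in for the two collections.
function embedPhotos(users, photos) {
  // Group photos by the user they reference (keys compared as strings,
  // so ObjectID-like values with a sensible toString() also work).
  const byUser = new Map();
  for (const p of photos) {
    const key = String(p.user_id);
    if (!byUser.has(key)) byUser.set(key, []);
    byUser.get(key).push(p);
  }
  // Users without photos get an empty Photos array.
  return users.map(u => ({ ...u, Photos: byUser.get(String(u._id)) || [] }));
}

const sampleUsers = [{ _id: 1, Name: "Ann" }, { _id: 2, Name: "Bob" }];
const samplePhotos = [
  { _id: 10, url: "a.jpg", user_id: 1 },
  { _id: 11, url: "b.jpg", user_id: 1 },
];
console.log(JSON.stringify(embedPhotos(sampleUsers, samplePhotos), null, 2));
```

Each returned object would then be written back to the User collection, as in the script above.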

Related

How to fetch all documents from a firebase collection where each document has some sub collection and in sub collection there is a document?

I am making an admin dashboard. I want to show all users' details and their orders. When I try to fetch all documents in the users collection, it returns empty. For more context: in the users collection, each document has some sub-collections. In the account sub-collection there is a document named details, where the user's account details are stored, as shown in the snapshots.
My code is
export function getUsers() {
return firebase.firestore().collection("users").get();
}
If you store the user's details directly in the document instead of in the 'account' sub-collection, then fetching the "users" collection will return all users' documents with their data. If there's no particular reason for the sub-collection, I'd recommend doing this.
The other option is to use a collectionGroup query on "account", which fetches all documents from every sub-collection named "account", i.e. gives you every user's account details.
const snap = await db.collectionGroup('account').get()
const users = snap.docs.map(d => ({id: d.ref.parent.parent.id, data: d.data()}))
Here, id is user's document ID.
Firestore queries only access a single collection, or all collections with a specific name. There is no way to query a collection based on values in another collection.
The most common options are:
Query the parent collection first, then check the subcollection for each document. This approach works best if you have relatively few false positives in the parent collection.
Query all child collections with a collection group query, then check the parent document for each result. This approach works best if you have relatively few false positives in your child collection query.
Replicate the relevant information from the child documents into the parent document, and then query the parent collection based on that. For example, you could add a hasOrders field or an orderCount in the user document. This approach always gives optimal results while querying, but requires that you modify the code that writes the data to accommodate.
The third approach is typically the best for a scalable solution. If you come from a background in relational databases, this sort of data duplication may seem unnatural, but it is actually very common in NoSQL databases, where you often have to change your data model to allow the queries your app needs.
To learn more about this, I recommend reading NoSQL data modeling and watching Getting to know Cloud Firestore.
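The third option can be sketched as follows. This is an in-memory illustration only: recordOrder and usersWithOrders are hypothetical helpers, the Map objects stand in for the users collection and the per-user orders sub-collections, and in Firestore the two writes would normally go through a batched write or transaction so the counter cannot drift from the orders.

```javascript
// In-memory stand-ins for the "users" collection and the orders sub-collections.
const userDocs = new Map([
  ["u1", { name: "Ann", orderCount: 0 }],
  ["u2", { name: "Bob", orderCount: 0 }],
]);
const orderDocs = new Map(); // userId -> array of orders

// Hypothetical write path: store the order AND update the duplicated counter.
function recordOrder(userId, order) {
  const user = userDocs.get(userId);
  if (!user) throw new Error("unknown user " + userId);
  if (!orderDocs.has(userId)) orderDocs.set(userId, []);
  orderDocs.get(userId).push(order);
  user.orderCount += 1;
}

// The query now only needs the parent documents.
function usersWithOrders() {
  return [...userDocs]
    .filter(([, user]) => user.orderCount > 0)
    .map(([id]) => id);
}

recordOrder("u1", { total: 20 });
console.log(usersWithOrders());
```

The cost is on the write side: every code path that creates or deletes an order has to keep orderCount up to date.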

Running an Azure function app each time a CosmosDB document is created and updating documents in a second collection

I have a scenario where we have items saved in one DocumentDB collection, e.g. under /items/{documentId}. The document looks similar to:
{
id: [guid],
rating: 5,
numReviews: 1
}
I have a second document collection under /user-reviews/{userIdAsPartitionKey}/{documentId}
The document will look like so:
{
id: [guid],
itemId: [guidFromItemsCollection],
userId: [userId],
rating: 4
}
When this document is uploaded, I want a trigger to fire which takes the new user rating document as input, retrieves the relevant document from the items collection, and transforms the items document based on the new data.
The crux of my problem is: how can I trigger off a document upsert, and how can I retrieve and modify a document from another collection, all within a Function App?
I've investigated the following links, which tease at the idea of Triggers being possible on the CosmosDB, but the table suggests we can't hook up a trigger to document DB upload.
https://learn.microsoft.com/en-us/azure/azure-functions/functions-triggers-bindings
https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-documentdb
If it's not possible to set up directly, my assumption is I should have a middle tier service handling the upsert (currently using DocumentClient from client side), which can kick off this processing itself, but I like the simplicity of the serverless function apps if possible.
Operations are scoped to a collection. You cannot trigger an operation in Collection B from an event in Collection A.
You'd either need to implement this in your app tier (as you suggested) or... store both types of documents in the same collection (a common scenario). You might need to add some type of doctype property to help filter your queries, but since documents are schema-free, you can store heterogeneous documents in the same collection.
Also: You mentioned an Azure Function. Within a function, there's nothing stopping you from making multiple database calls (e.g. when something happens in collection a and causes your function to be called, your function can perform an operation in collection b). Just note that this won't be transactional.
I know this is a pretty old question.
The Change Feed was built for this exact scenario.
In today's Azure Portal, there's even a menu option in the CosmosDB blade that lets you create a Function triggered by changes in one collection, so you can detect and react to those changes, e.g. by creating or updating a document in another collection.
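Whichever mechanism invokes the function, the transformation itself is small. A minimal sketch, assuming the item's rating field is a running average of all review ratings (the question does not say how it is computed); applyReview is a hypothetical name, and reading and writing the items document is left out.

```javascript
// Assumption: item.rating is the mean of all review ratings received so far.
// Returns a new item document; the input is not modified.
function applyReview(item, review) {
  const numReviews = item.numReviews + 1;
  const rating = (item.rating * item.numReviews + review.rating) / numReviews;
  return { ...item, rating, numReviews };
}

// Documents shaped like the ones in the question (ids are placeholders).
const item = { id: "item-1", rating: 5, numReviews: 1 };
const review = { id: "review-1", itemId: "item-1", userId: "user-1", rating: 4 };

console.log(applyReview(item, review)); // rating becomes 4.5, numReviews becomes 2
```

As noted above, the read-modify-write across two collections is not transactional, so concurrent reviews for the same item could overwrite each other unless the write is guarded in some way.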

Create new mongoose document without copying entire schema

I have a mongoose question that I'll try to form into something that makes sense. So, I have a User schema with a LOT of stuff in it.
Is there a way to create a new user without copying the entire schema in another file? So, just reference the schema and pass through values that are changed?
In past projects, everything in my schema also needed to be updated when creating users so that wasn't an issue.
User error on my part, I thought there was something wrong with everything except my forgetfulness. Long story short, I added a new partial template to my project and was getting the email via input[type=email]. The partial added a second one so the email parameter was never received by the backend.
Whoops.

what's the best way to bind a mongodb doc to a node.js html page

In the past, with my PHP / Rails - MySQL apps, I've used the unique ID of a table record to keep track of a record in an HTML file.
So I'd keep track of how to delete a record shown like this (15 being the ID of the record):
Delete this record
So now I'm using MongoDB. I've tried the same method but the objectID ._id attribute seems to be a loooong byte string that I can't use conveniently.
What's the most sensible way of binding a link in the view to a record (for deletion, or other purposes or whatever)?
If the answer is to create a new id that's unique for each document in the collection, then what's the best way to generate those unique id's?
Thank you.
You could use a counter instead of the ObjectID
But this could create a problem when inserting a new document after you deleted a previous one.
See this blog post for more detailed info on sequential unique identifiers with Node.js and MongoDB.
Or you could use the timestamp part of the ObjectID:
objectId.getTimestamp().toString()
See the node objectid docs
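Note that the hex string form of an ObjectID is 24 characters from 0-9 and a-f, so it can be placed in a URL as-is, and its first 4 bytes encode the creation time in seconds. Because that timestamp has only one-second resolution, it is not guaranteed to be unique on its own. A small sketch in plain JavaScript, without the driver; isObjectIdHex and timestampFromObjectIdHex are hypothetical helper names, and the /records/.../delete path is only an example.

```javascript
// True if s looks like the 24-character hex form of an ObjectID.
function isObjectIdHex(s) {
  return typeof s === "string" && /^[0-9a-fA-F]{24}$/.test(s);
}

// The first 8 hex characters (4 bytes) are seconds since the Unix epoch.
function timestampFromObjectIdHex(hex) {
  if (!isObjectIdHex(hex)) {
    throw new Error("not a 24-character hex ObjectID: " + hex);
  }
  return new Date(parseInt(hex.slice(0, 8), 16) * 1000);
}

const id = "507f1f77bcf86cd799439011";
console.log("/records/" + id + "/delete"); // example link target
console.log(timestampFromObjectIdHex(id).toISOString()); // 2012-10-17T21:13:27.000Z
```

On the server side, the incoming string can be validated with a check like isObjectIdHex before it is converted back to an ObjectID for the query.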

Mongoose: Only return one embedded document from array of embedded documents

I've got a model which contains an array of embedded documents. These embedded documents keep track of the points the user has earned in a given activity. Since a user can be part of several activities or just one, it makes sense to keep these activities in an array. Now, I want to extract the hall of fame, the top ten users for a given activity. Currently I'm doing it like this:
userModel.find({ "stats.activity": "soccer" }, ["stats", "email"])
.desc("stats.points")
.limit(10)
.run (err, users) ->
(if you are wondering about the syntax, it's CoffeeScript)
where "stats" is the array of embedded documents/activities.
Now this actually works, but currently I'm only testing with accounts that have only one activity. I assume that something will go wrong (sorting-wise) once a user has more activities. Is there any way I can tell mongoose to only return the embedded document where "activity" == "soccer" alongside the top-level document?
Btw, I realize I can do this another way, by having stats in its own collection and having a db-ref to the relevant user, but I'm wondering if it's possible to do it like this before I consider any rewrites.
Thanks!
You are correct that this won't work once you have multiple activities in your array.
Specifically, since you can't return just the matching element of an array, you'll get back the whole array, and the sort will apply across all points values, not just the ones "paired" with "activity":"soccer".
There is a pretty simple tweak that you could make to your schema to get around this though. Don't store the activity name as a value, use it as the key.
{ _id: userId,
email: email,
stats: [
{soccer : points},
{rugby: points},
{dance: points}
]
}
Now you will be able to query and sort like so:
users.find({"stats.soccer":{$gt:0}}).sort({"stats.soccer":-1})
Note that when you move to version 2.2 (currently only available as unstable development version 2.1) you would be able to use aggregation framework to get the exact results you want (only a particular subset of an array or subdocument that matches your query) without changing your schema.
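For reference, the selection the question asks for (only the matching embedded document, sorted by its own points) can be sketched in plain JavaScript on documents already loaded into memory, using the original schema with activity and points fields. topForActivity is a hypothetical helper, not a Mongoose API, and filtering in application code only makes sense for modest result sets.

```javascript
// Hypothetical helper: for each user keep only the stats entry for the given
// activity, then rank by that entry's points and keep the top `limit` users.
function topForActivity(users, activity, limit) {
  return users
    .map(u => ({ email: u.email, stat: u.stats.find(s => s.activity === activity) }))
    .filter(r => r.stat !== undefined)
    .sort((a, b) => b.stat.points - a.stat.points)
    .slice(0, limit);
}

const sample = [
  { email: "a@example.com", stats: [{ activity: "soccer", points: 5 }, { activity: "rugby", points: 90 }] },
  { email: "b@example.com", stats: [{ activity: "soccer", points: 12 }] },
  { email: "c@example.com", stats: [{ activity: "dance", points: 40 }] },
];

console.log(topForActivity(sample, "soccer", 10));
```

In the sample, the first user's 90 rugby points do not affect the soccer ranking, which is exactly the case the plain sort on "stats.points" gets wrong. With the aggregation framework, the equivalent would be a $unwind on stats followed by $match on stats.activity, $sort on stats.points and $limit.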

Resources