MongoDb integrating with external db - node.js

I have a database which contains data from two separate systems/servers. The first is generated locally [I develop and create this data] (users, activity logs, orders, ...). The second comes from a "product provider" [I only have READ access through an API]. These objects were created by MySQL and are sent as JSON. They already have an "id" property.
With NodeJS, I use request to get a product by "id" and then store it with newProduct.save(), which appends an _id.
In products, "id" is necessary to form relationships with the other collections in my database (such as products_price) and to access dynamic endpoints, such as "products/:id/promos".
Note that products are constantly being updated externally and I need to be able to update my documents by "id" not by "_id" as the external server has no knowledge about "_id." [id is unique on a collection level, as each collection is a fresh iteration]
For my first question: should I treat "product.id" as a "regular" MongoDB field and use aggregate/lookup to merge documents from my collections? Or should I overwrite ObjectID() with id? (before saving rename "id" to "_id")
At some point, Orders (local) and Products (external) need to form a relationship where Order _id and Product id (or _id) are stored together for easy retrieval.
Which id do I use in this case?

If you are pretty sure that the 'id' coming from your product provider API is unique, you had better use it as _id (overwriting _id). It will save you:
an unneeded index ('_id' is indexed anyway)
some CPU cycles that MongoDB would spend producing the ObjectID
some disk and memory space
(*) Even if you find yourself dealing with many different product providers, assuming each one uses its own unique product id, you could use a combined _id to make it unique, such as:
_id = {provider: 'foo', id: xxx}
or _id = provider_name + product_id
etc.
(Note that MongoDB does not accept an array as _id, so a pair like [provider_name, product_id] has to be stored as an embedded document or a string, as above.)
With multiple providers, the format to choose depends on how you plan to fetch those products later.
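A minimal sketch of the idea above, assuming the provider sends plain objects with a unique "id". The helper names (toDocument, toMultiProviderDocument, toUpsert) and the sample product are illustrative only and do not come from the question.

```javascript
// Map a provider product (which has its own unique `id`) to a MongoDB document
// whose `_id` is that id, so no separate ObjectId or extra index is needed.
function toDocument(product) {
  const { id, ...fields } = product;
  return { _id: id, ...fields };
}

// Variant for several providers: a compound `_id` keeps ids from different
// providers from colliding. Keep the field order consistent, because embedded
// documents with the same fields in a different order are different values.
function toMultiProviderDocument(provider, product) {
  const { id, ...fields } = product;
  return { _id: { provider, id }, ...fields };
}

// Build the arguments for an upsert keyed on `_id`, so that re-syncing a
// product overwrites the existing document instead of creating a duplicate.
function toUpsert(doc) {
  const { _id, ...fields } = doc;
  return { filter: { _id }, update: { $set: fields }, options: { upsert: true } };
}

const incoming = { id: 42, name: "Widget", price: 9.99 }; // hypothetical payload
const doc = toDocument(incoming);
const upsert = toUpsert(doc);
const multi = toMultiProviderDocument("foo", incoming);
console.log(doc, upsert, multi);
```

With the official Node.js driver, the three parts of the result could be passed to collection.updateOne(filter, update, options). An order can then simply store the product's _id next to its own _id.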

Related

How to fetch all documents from a firebase collection where each document has some sub collection and in sub collection there is a document?

I am making an Admin dashboard. I want to show all users' details and their orders. When I try to fetch all documents inside the users collection, it returns empty. For more context: in the users collection, each document has some sub-collections. In the account sub-collection, there is a document named details where the user's account details are stored, as shown in the snapshots.
My code is
export function getUsers() {
return firebase.firestore().collection("users").get();
}
If you store the user's details directly in the user document instead of in the 'account' sub-collection, then fetching the "users" collection will return all users' documents with their data. If there's no particular reason for the sub-collection, I'd recommend doing this.
The other option is to use a collectionGroup query on "account", which fetches all documents from sub-collections named "account", i.e. gives you every user's account details.
const snap = await db.collectionGroup('account').get()
const users = snap.docs.map(d => ({ id: d.ref.parent.parent.id, data: d.data() }))
Here, id is user's document ID.
Firestore queries only access a single collection, or all collections with a specific name. There is no way to query a collection based on values in another collection.
The most common options are:
Query the parent collection first, then check the subcollection for each document. This approach works best if you have relatively few false positives in the parent collection.
Query all child collections with a collection group query, then check the parent document for each result. This approach works best if you have relatively few false positives in your child collection query.
Replicate the relevant information from the child documents into the parent document, and then query the parent collection based on that. For example, you could add a hasOrders field or an orderCount in the user document. This approach always gives optimal results while querying, but requires that you modify the code that writes the data to accommodate.
The third approach is typically the best for a scalable solution. If you come from a background in relational databases, this sort of data duplication may seem unnatural, but it is actually very common in NoSQL databases, where you often have to change your data model to allow the queries your app needs.
To learn more about this, I recommend reading NoSQL data modeling and watching Getting to know Cloud Firestore.
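The third option can be sketched without Firestore at all. The example below uses in-memory Maps as stand-ins for the users collection and the orders subcollections. The names addOrder and usersWithOrders, and the sample users, are hypothetical. In a real app the two writes would normally go into one batch or transaction so the counter cannot drift.

```javascript
// In-memory stand-ins for Firestore: `users` is the parent collection,
// `orders` maps a user id to that user's order documents.
const users = new Map([
  ["alice", { name: "Alice", orderCount: 0 }],
  ["bob", { name: "Bob", orderCount: 0 }],
]);
const orders = new Map();

// Write path: store the order AND update the duplicated counter on the parent.
function addOrder(userId, order) {
  const user = users.get(userId);
  if (!user) throw new Error(`unknown user: ${userId}`);
  if (!orders.has(userId)) orders.set(userId, []);
  orders.get(userId).push(order);
  user.orderCount += 1;
}

// Read path: only the parent collection is needed to answer
// "which users have orders?".
function usersWithOrders() {
  return [...users.entries()]
    .filter(([, user]) => user.orderCount > 0)
    .map(([id]) => id);
}

addOrder("alice", { item: "book" });
addOrder("alice", { item: "pen" });
console.log(usersWithOrders());
```

The cost is on the write side: every piece of code that adds or removes an order also has to touch the parent document.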

How to model MongoDB User Schema and their corresponding data?

I'm creating a Google Keeper replica where a user can log in and the to-do lists for that user are stored.
I'm new to MongoDB, Express, and React, and I was wondering how someone would go about doing this. Would you create a User schema containing the "list objects", or create a User schema and a separate List schema?
I think creating one schema would be more efficient, but when I go to update or delete a note, I don't know how I would target a specific note without an ID, since the ID would be associated with the entire user schema.
Thank you!
You can assign a unique id to the list entries while inserting them into the database. For example, you can use a timestamp. The structure of your list items will be something like:
{ itemId: "1595488458403", value: "Do the laundry" }
As the items will be created one by one, their timestamps will be different. To create a timestamp for the present time, use new Date().getTime()
Here's the roadmap
Bring the value of the list item from your frontend to the backend. (say, "Do the laundry")
Define a variable "itemId" in the backend route:
itemId = new Date().getTime()
While inserting the item to the user's list of to-dos, insert:
{ itemId: itemId, value: "Do the laundry" }
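The roadmap above, plus the update and delete cases the question asks about, can be sketched on a plain object standing in for the user document. The helper names (addItem, updateItem, removeItem) are hypothetical, and the sketch assumes that no two items for the same user are created in the same millisecond. If that can happen, a different id generator is needed.

```javascript
// A user document with an embedded list of to-do items (the single-schema option).
const user = { name: "sam", todos: [] };

// Add an item, tagging it with a timestamp-based id as described above.
// `now` is injectable so the example is deterministic.
function addItem(user, value, now = new Date().getTime()) {
  const item = { itemId: String(now), value };
  user.todos.push(item);
  return item.itemId;
}

// Update one item by its id; returns false if no such item exists.
function updateItem(user, itemId, value) {
  const item = user.todos.find((t) => t.itemId === itemId);
  if (!item) return false;
  item.value = value;
  return true;
}

// Delete one item by its id, leaving the rest of the user document untouched.
function removeItem(user, itemId) {
  user.todos = user.todos.filter((t) => t.itemId !== itemId);
}

const laundryId = addItem(user, "Do the laundry", 1595488458403);
const dishesId = addItem(user, "Wash the dishes", 1595488458404);
updateItem(user, laundryId, "Do the laundry today");
removeItem(user, dishesId);
console.log(user.todos);
```

Because every item carries its own itemId, a note can be targeted even though the surrounding user document has only one _id.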

How to expose MongoDB documents primary keys in a REST API?

I am building a REST API with MongoDB + nodeJS. All the documents are stored and are using _id as the primary key. I've read here that we should not expose the _id and we should use another ID which is not incremental.
In the DB, a document is represented as:
{
  _id: ObjectId("5d2399b83e9148db977859ea"),
  bookName: "My book"
}
For the following endpoints, how should the documents be exposed?
GET /books
GET /books/{bookId}
Currently my API returns:
{
  _id: "5d2399b83e9148db977859ea",
  bookName: "My book"
}
but should it instead return something like:
{
  id: "some-unique-id-generated-on-creation",
  bookName: "My book"
}
Questions
Should I expose the _id so that one can make queries such as:
GET /books/5d2399b83e9148db977859ea
Should I use a UUID for my ID instead of ObjectId?
Should I keep the internal _id (but never expose it) and create another attribute id which would use UUID or another custom generated ID ?
Is it a good practice to work with _id in my backend or should I only make queries using my own custom ID? Example: find({ id: }) instead of find({ _id: })
To answer your questions.
You can expose _id so that authenticated users can make queries like GET, PUT and PATCH on that _id.
MongoDB lets you generate your own BSON ID and use it, instead of MongoDB creating its own _id during the insert.
There is no need to duplicate logic. The main purpose of _id is to identify each document uniquely, and having two id columns means you are storing redundant data. Follow the DRY (Don't Repeat Yourself) principle wherever possible.
It's not a bad practice to work with _id in your backend.
Hope this helps!
Given you're using Mongoose, you can use 'virtuals', which are essentially fake fields that Mongoose creates. They're not stored in the DB, they just get populated at run time:
// Duplicate the ID field.
Schema.virtual('id').get(function(){
return this._id.toHexString();
});
// Ensure virtual fields are serialised.
Schema.set('toJSON', {
virtuals: true
});
Any time toJSON is called on the Model you create from this Schema, it will include an 'id' field that matches the _id field Mongo generates. Likewise you can set the behaviour for toObject in the same way.
You can refer to the following docs:
1) https://mongoosejs.com/docs/api.html
2) toObject method
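Where a document does not pass through Mongoose's toJSON, the same renaming can be done by hand. This is a minimal sketch: toApiShape is a hypothetical helper, and fakeObjectId only imitates the toHexString() method of a real ObjectId so the example runs without the MongoDB driver.

```javascript
// Convert a stored document into the shape returned by the API:
// `_id` is exposed as a string `id`, everything else is copied as-is.
function toApiShape(doc) {
  const { _id, ...rest } = doc;
  const id = typeof _id.toHexString === "function" ? _id.toHexString() : String(_id);
  return { id, ...rest };
}

// Stand-in for an ObjectId (assumption: only toHexString() is needed here).
const fakeObjectId = { toHexString: () => "5d2399b83e9148db977859ea" };

const book = toApiShape({ _id: fakeObjectId, bookName: "My book" });
console.log(JSON.stringify(book));
```

Either way the API shows a single id field, while the backend keeps querying by _id.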
In my case, whether or not it's a security risk, my _id is a concatenation of the fields in my document that are semantically considered keys. For example, if First Name, Last Name, and Email are my identifier and a fourth field such as Age is an attribute, then _id is the concatenation of those three fields. It is not difficult to get and update such a record as long as I have the First Name, Last Name, and Email available.

Cloudant/Couchdb Architecture

I'm building an address-book app that uses a back-end Cloudant database. The database stores 3 types of documents:
-> User Profile document
-> Group document
-> User-to-Group Link document
As the document names suggest, there are users in my database, there are groups of users (like WhatsApp), and there is a link document for each user in a group (the link document also stores the settings/privileges of that user in that group).
On login, my client-side app queries Cloudant for the user document and for each group document, using view collation over the link documents of that user.
Then using the groups that I have identified above, I find all the other users of that group.
Now, the challenge is that I need to monitor any changes on the group and user documents. I am using pouchdb on the app side, and can invoke the 'changes' API against the ids of all the group and user documents. But the scale of this can be maybe 500 users in each group, and a logged in user being part of 10-50 groups. That multiplied to 1000s of users will become a nightmare for the back-end to support.
Is my scalability concern warranted? Or is this normal for cloudant?
If I understand your schema correctly, you have documents of this form:
{
_id: "user:glynn",
type: "user",
name: "Glynn Bird"
}
{
_id: "group:Developers",
type: "group",
name: "Software Developers"
}
{
_id: "user:glynn:developers"
}
In the above example, the primary key's sorting allows a user and all of its memberships to be retrieved by using the startkey and endkey parameters of the database's _all_docs endpoint.
This is "scalable" in the sense that it is efficient for Cloudant to retrieve data from a primary or secondary index, because the index is held in a b-tree, so data with adjacent keys is stored next to each other. A limit parameter can be used to paginate through larger data sets.
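As a rough sketch of such a range query, the helper below builds the _all_docs URL for every document whose _id starts with a given prefix. The function name prefixQueryUrl, the host and the database name are assumptions for illustration. The "\ufff0" suffix is a commonly used high-sorting sentinel for closing a prefix range.

```javascript
// Build an _all_docs URL that returns every document whose _id starts with `prefix`.
// Keys are JSON-encoded, which is what startkey and endkey expect.
function prefixQueryUrl(baseUrl, db, prefix, limit) {
  const url = new URL(`${baseUrl}/${encodeURIComponent(db)}/_all_docs`);
  url.searchParams.set("startkey", JSON.stringify(prefix));
  // prefix + "\ufff0" is intended to sort after the ids that begin with the prefix.
  url.searchParams.set("endkey", JSON.stringify(prefix + "\ufff0"));
  url.searchParams.set("include_docs", "true");
  if (limit !== undefined) url.searchParams.set("limit", String(limit));
  return url;
}

// Hypothetical account and database names.
const url = prefixQueryUrl("https://example.cloudant.com", "addressbook", "user:glynn", 50);
console.log(url.href);
```

With the example documents above, this range would cover both "user:glynn" and "user:glynn:developers" in a single request.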
Yes, the documents are more or less how you've specified.
Link documents are as follows:
{
"_id": <AutoGeneratedID>,
"type": "link",
"user": user_id,
"group": group_id
}
I've written the following view map function:
function (doc) {
  if (doc.type == "link") {
    emit(doc.user, {"_id": doc.user});
    emit([doc.user, doc.group], {"_id": doc.group});
    emit([doc.group, doc.user], {"_id": doc.user});
  }
}
Using the above 3 indexes with include_docs=true, the 1st lets me get my logged-in user document, the 2nd lets me get all group documents for my logged-in user (using start and end keys), and the 3rd lets me get all other user documents for a group (using start and end keys again).
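To make the three kinds of rows concrete, here is a small in-memory simulation of that map function. It supplies its own emit, which is only a stand-in for the one CouchDB provides, and the sample documents are invented for illustration. The filter at the end imitates a startkey=["user:glynn"] to endkey=["user:glynn",{}] range.

```javascript
// Collect rows the way a view index would, by supplying our own `emit`.
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}

// The map function from the post, with the document passed in explicitly.
function map(doc) {
  if (doc.type === "link") {
    emit(doc.user, { _id: doc.user });
    emit([doc.user, doc.group], { _id: doc.group });
    emit([doc.group, doc.user], { _id: doc.user });
  }
}

// Hypothetical sample data: two link documents and one user document.
[
  { _id: "l1", type: "link", user: "user:glynn", group: "group:Developers" },
  { _id: "l2", type: "link", user: "user:glynn", group: "group:Testers" },
  { _id: "u1", type: "user", name: "Glynn Bird" }, // ignored by the map function
].forEach(map);

// All [user, group] rows for one user, i.e. the groups that user belongs to.
const groupsOfGlynn = rows
  .filter((r) => Array.isArray(r.key) && r.key[0] === "user:glynn")
  .map((r) => r.value._id);

console.log(rows.length, groupsOfGlynn);
```

Each link document contributes three rows, so the index grows with the number of memberships rather than with users times groups.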
Fetching the documents is done, but now I need to monitor changes on the users of each group. For this, don't I need to query the changes API with an array of user ids? Is there any other way?
"Cloudant retrieve data from a primary or secondary index because the index is held in a b-tree so data with adjacent keys is store next to each other"
Sorry, I did not understand this statement.
Thanks.
Part 1.
I recommend getting rid of the "link" type here. It works well in the SQL world, but not in CouchDB.
Instead, it is better to take advantage of document storage, i.e. store a user's groups in a "Groups" property of "User", and a group's users in a "Users" property of "Group".
With this approach you can set up filtered replication to process only the changes for specific groups, and these changes will already contain all the users of the group.
Note that I am assuming the number of groups for a user and the number of groups are reasonable (hundreds at most) and don't change frequently.
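A sketch of what such a filter could look like, using the (doc, req) shape of CouchDB filter functions. The lowercase property names groups and users, the comma-separated groups query parameter and the sample documents are all assumptions. The code uses modern JavaScript for readability; a filter stored in a design document may need older syntax, depending on the server's JavaScript engine.

```javascript
// Documents shaped as suggested above: membership lives inside the documents.
const docs = [
  { _id: "user:glynn", type: "user", groups: ["group:Developers"] },
  { _id: "user:ann", type: "user", groups: ["group:Sales"] },
  { _id: "group:Developers", type: "group", users: ["user:glynn"] },
  { _id: "group:Sales", type: "group", users: ["user:ann"] },
];

// ASSUMPTION: the client passes the groups it cares about as a
// comma-separated `groups` query parameter.
function byGroups(doc, req) {
  const wanted = req.query.groups.split(",");
  if (doc.type === "group") return wanted.includes(doc._id);
  if (doc.type === "user") return doc.groups.some((g) => wanted.includes(g));
  return false;
}

// Simulate which documents would pass the filter for one client.
const req = { query: { groups: "group:Developers" } };
const replicated = docs.filter((d) => byGroups(d, req)).map((d) => d._id);
console.log(replicated);
```

The client then follows one filtered feed per session instead of passing a long list of user ids to the changes API.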
Part 2.
You can store just the ids in these properties and then use views to "join" the other data. I was also thinking about another approach (for my use case, but yours is similar):
1) Group contains only the ids of users, so no views are needed.
2) Create a view of each user's contacts, i.e. for each user get all users with whom they share groups.
3) Replicate this view to the client app.
When the user opens a group, values (such as names and pics of contacts) are taken from this local "dictionary".
This approach can save some traffic.
Please let me know what you think, because right now I'm working on designing the architecture of my solution. Thank you!

How to find particular json document from couchdb

How can I find the details of a particular JSON document in CouchDB?
For example: the database is named employee_mgmt and contains 50 JSON documents. I want to find a particular employee's JSON document (find by employee id).
CouchDB itself does not provide collections/buckets, so all your documents are peers. It's up to you to provide metadata, e.g. by having a property $doctype with a value representing what kind of document it is. This is useful if you are writing maps and, for example, want to create a view (secondary index) returning something applicable only to employees.
Now, if you just want to query by _id, you don't need the above. Just do a simple GET with a URI such as: http://host:port/databasename/documentid
More information: http://docs.couchdb.org/en/1.6.1/api/document/common.html#get--db-docid
If you want to get a batch of documents matching many _id values, use the built-in index _all_docs: http://docs.couchdb.org/en/1.6.1/api/database/bulk-api.html#post--db-_all_docs
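Both requests can be sketched as plain request descriptions, without sending anything over the network. The sketch assumes the employee id is used as the document _id. The helper names, the ids "emp-1001" and "emp-1002", and the localhost URL are hypothetical.

```javascript
// Describe the request for a single document: GET /{db}/{docid}
function getDocRequest(baseUrl, db, docId) {
  return {
    method: "GET",
    url: `${baseUrl}/${encodeURIComponent(db)}/${encodeURIComponent(docId)}`,
  };
}

// Describe the request for several documents at once:
// POST /{db}/_all_docs with a JSON body listing the wanted ids.
function getDocsRequest(baseUrl, db, docIds) {
  return {
    method: "POST",
    url: `${baseUrl}/${encodeURIComponent(db)}/_all_docs?include_docs=true`,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ keys: docIds }),
  };
}

const one = getDocRequest("http://localhost:5984", "employee_mgmt", "emp-1001");
const many = getDocsRequest("http://localhost:5984", "employee_mgmt", ["emp-1001", "emp-1002"]);
console.log(one.url);
console.log(many.url, many.body);
```

Each description can be handed to any HTTP client. If the employee id is only a field inside the document rather than its _id, a view keyed on that field is needed instead.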
