Nested databases in CouchDB - couchdb

It seems you are unable to nest databases in CouchDB. How do people work around this limitation? For example, assume I want to create a blogging engine where each domain has a separate database. Within each database I might want a Users database, an Orders database, etc. to contain the various user documents, order documents, and so forth.
The obvious way seems to be a flat structure where the database name demarcates the artificial boundary between database nesting levels with a hyphen:
myblog.com-users
myblog.com-posts
myblog.com-comments
anotherblog.com-users
anotherblog.com-posts
anotherblog.com-comments
...hundreds more...
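One caveat with this naming scheme: as far as I know, CouchDB database names must start with a lowercase letter and may only contain lowercase letters, digits, and the characters _ $ ( ) + - /, so the dot in myblog.com-users would be rejected. A minimal sketch of a name builder follows; the helper name dbNameFor and the choice of replacing dots with underscores are my own assumptions, not anything CouchDB prescribes.

```javascript
// Hypothetical helper: build a CouchDB database name from a domain and a kind.
// Assumption: dots are mapped to "_" because "." is not allowed in database names.
const VALID_DB_NAME = /^[a-z][a-z0-9_$()+\/-]*$/;

function dbNameFor(domain, kind) {
  const name = `${domain.toLowerCase().replace(/\./g, "_")}-${kind}`;
  if (!VALID_DB_NAME.test(name)) {
    throw new Error(`Invalid CouchDB database name: ${name}`);
  }
  return name;
}

console.log(dbNameFor("myblog.com", "users")); // myblog_com-users
```

A domain that starts with a digit would still fail the check, so a real scheme might also need a fixed letter prefix.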
Another solution would be to keep the lower-level databases and mark each document with the top-level value:
users database containing a document User1, with field instance="Test" or a field domain="myblog.com"

I think you're misusing the term database here. There is no reason you can't store the users, posts, and comments data in a single CouchDB database. Your CouchDB views can separate the user documents from the post documents and the comment documents.
Example map function for user documents in a CouchDB database:
function(doc) {
  if (doc.type === 'user') { // only emit user documents
    emit([doc.domain, doc.id], doc); // the returned rows will be sorted by domain
  }
}
See the View API for ways to restrict that view's results by domain using startkey and endkey with view collation.
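As a rough sketch of such a range query: with keys of the form [domain, id], one domain's rows sit between [domain] and [domain, {}], because objects sort after strings in view collation. The base URL, database name, design document name, and view name below are placeholders I made up for illustration.

```javascript
// Build a query URL for a view whose keys are [domain, id] arrays.
// All names passed in (database, design document, view) are placeholders.
function usersByDomainUrl(base, db, ddoc, view, domain) {
  const params = new URLSearchParams({
    startkey: JSON.stringify([domain]),
    // {} collates after any string, so this closes the range for the domain
    endkey: JSON.stringify([domain, {}]),
  });
  return `${base}/${db}/_design/${ddoc}/_view/${view}?${params}`;
}

console.log(
  usersByDomainUrl("http://localhost:5984", "blogs", "app", "users", "myblog.com")
);
```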

I think the best solution is to have one database per domain, each storing domain specific data.

Related

How to fetch all documents from a firebase collection where each document has some sub collection and in sub collection there is a document?

I am making an admin dashboard. I want to show all users' details and their orders. When I try to fetch all documents in the users collection, it returns empty. In the users collection, each document has some sub-collections. In the account sub-collection there is a document named details, where the user's account details are stored, as shown in the snapshots.
My code is
export function getUsers() {
return firebase.firestore().collection("users").get();
}
If you store the user's details directly in the document instead of in the 'account' sub-collection, then fetching the "users" collection will return all users' documents with their data. If there is no particular reason for the sub-collection, I'd recommend doing this.
Another option is to use a collectionGroup query on "account", which fetches all documents from every sub-collection named "account", i.e. every user's account details.
const snap = await db.collectionGroup('account').get()
const users = snap.docs.map(d => ({id: d.ref.parent.parent.id, data: d.data()}))
Here, id is user's document ID.
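To make the parent lookup easier to follow, here is a small sketch of the mapping step as a pure function. It only assumes each item has the shape of a Firestore query document snapshot (a ref whose grandparent is the user document, and a data() method); the fakeDocs array is a stand-in I invented so the snippet runs without Firebase.

```javascript
// Map collection-group results to { id, data } pairs.
// `docs` only needs the shape of Firestore query document snapshots.
function toUsers(docs) {
  return docs.map((d) => ({
    // details doc -> "account" collection -> users/{id} document
    id: d.ref.parent.parent.id,
    data: d.data(),
  }));
}

// Stand-in objects for illustration; real snapshots would come from
// db.collectionGroup('account').get() as in the snippet above.
const fakeDocs = [
  { ref: { parent: { parent: { id: "user1" } } }, data: () => ({ name: "Ann" }) },
];

console.log(toUsers(fakeDocs));
```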
Firestore queries only access a single collection, or all collections with a specific name. There is no way to query a collection based on values in another collection.
The most common options are:
Query the parent collection first, then check the subcollection for each document. This approach works best if you have relatively few false positives in the parent collection.
Query all child collections with a collection group query, then check the parent document for each result. This approach works best if you have relatively few false positives in your child collection query.
Replicate the relevant information from the child documents into the parent document, and then query the parent collection based on that. For example, you could add a hasOrders field or an orderCount in the user document. This approach always gives optimal results while querying, but requires that you modify the code that writes the data to accommodate.
The third approach is typically the best for a scalable solution. If you come from a background in relational databases, this sort of data duplication may seem unnatural, but it is actually very common in NoSQL databases, where you often have to change your data model to allow the queries your app needs.
To learn more about this, I recommend reading NoSQL data modeling and watching Getting to know Cloud Firestore.

Cloudant/Couchdb Architecture

I'm building an address-book app that uses a back-end Cloudant database. The database stores 3 types of documents:
-> User Profile document
-> Group document
-> User-to-Group Link document
As the document names suggest, there are users in my database, there are groups of users (like WhatsApp), and there is a link document for each user's membership of a group (the link document also stores the settings/privileges of that user in that group).
On login, my client-side app queries Cloudant for the user document and each group document, using view collation over the link documents of that user.
Then, using the groups that I have identified above, I find all the other users of each group.
Now, the challenge is that I need to monitor any changes to the group and user documents. I am using PouchDB on the app side, and can invoke the 'changes' API against the ids of all the group and user documents. But the scale of this could be 500 users in each group, with a logged-in user being part of 10-50 groups. That, multiplied by thousands of users, will become a nightmare for the back-end to support.
Is my scalability concern warranted? Or is this normal for cloudant?
If I understand your schema correctly, you have documents of this form:
{
_id: "user:glynn",
type: "user",
name: "Glynn Bird"
}
{
_id: "group:Developers",
type: "group",
name: "Software Developers"
}
{
_id: "user:glynn:developers"
}
In the above example, the primary key's sorting allows a user and all of its memberships to be retrieved by using startkey and endkey parameters to the database's _all_docs endpoint.
This is "scalable" in the sense that it is efficient for Cloudant to retrieve data from a primary or secondary index, because the index is held in a b-tree, so data with adjacent keys is stored next to each other. A limit parameter can be used to paginate through larger data sets.
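A minimal sketch of such a range request, assuming ids are prefixed as in the example above. The account URL and database name are placeholders, and using "\ufff0" as the upper bound is a common convention for prefix ranges rather than something this schema requires.

```javascript
// Build an _all_docs range query for every document whose _id starts with `prefix`.
// "\ufff0" is a very high Unicode character, used here as the end of the range.
function allDocsPrefixUrl(base, db, prefix, limit) {
  const params = new URLSearchParams({
    startkey: JSON.stringify(prefix),
    endkey: JSON.stringify(prefix + "\ufff0"),
    include_docs: "true",
    limit: String(limit), // page size; placeholder value chosen by the caller
  });
  return `${base}/${db}/_all_docs?${params}`;
}

// Placeholder account and database names.
console.log(
  allDocsPrefixUrl("https://example.cloudant.com", "addressbook", "user:glynn", 50)
);
```

This would return the user:glynn document followed by the user:glynn:... membership documents, up to the limit.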
Yes, the documents are more or less as you've specified.
Link documents are as follows:
{
"_id": <AutoGeneratedID>,
"type": "link",
"user": user_id,
"group": group_id
}
I've written the following view map function:
function(doc) {
  if (doc.type == "link") {
    emit(doc.user, {"_id": doc.user});
    emit([doc.user, doc.group], {"_id": doc.group});
    emit([doc.group, doc.user], {"_id": doc.user});
  }
}
Using the above 3 index entries and include_docs=true, the 1st lets me get my logged-in user document, the 2nd lets me get all group documents for my logged-in user (using startkey and endkey), and the 3rd lets me get all other user documents for a group (using startkey and endkey again).
Fetching the documents is done, but now I need to monitor changes to the users of each group. For this, don't I need to query the changes API with an array of user ids? Is there any other way?
Cloudant retrieve data from a primary or secondary index because the
index is held in a b-tree so data with adjacent keys is store next to
each other
Sorry, I did not understand this statement ?
Thanks.
Part 1.
I recommend getting rid of the "link" type here. It is good for the SQL world, but not for CouchDB.
Instead, it is better to take advantage of document storage, i.e. store a user's groups in a "Groups" property on the "User" document, and a group's users in a "Users" property on the "Group" document.
With this approach you can set up filtered replication to process only changes to specific groups, and these changes will already contain all the users of the group.
Note that I am assuming the number of groups per user and the number of groups is reasonable (hundreds at most) and doesn't change frequently.
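To illustrate the restructuring, here is a rough sketch that folds existing link documents into the user and group documents they connect. It assumes the document shapes shown earlier in the thread (type, user, group fields); the lowercase property names groups and users and the function name embedMemberships are illustrative choices of mine.

```javascript
// Fold "link" documents into the user and group documents they connect.
// Returns new user/group documents; link documents are dropped.
function embedMemberships(docs) {
  const byId = new Map();
  for (const doc of docs) {
    if (doc.type === "user") byId.set(doc._id, { ...doc, groups: [] });
    if (doc.type === "group") byId.set(doc._id, { ...doc, users: [] });
  }
  for (const link of docs.filter((d) => d.type === "link")) {
    const user = byId.get(link.user);
    const group = byId.get(link.group);
    if (user && group) {
      user.groups.push(group._id);
      group.users.push(user._id);
    }
  }
  return [...byId.values()];
}

console.log(
  embedMemberships([
    { _id: "user:glynn", type: "user", name: "Glynn Bird" },
    { _id: "group:Developers", type: "group", name: "Software Developers" },
    { _id: "abc123", type: "link", user: "user:glynn", group: "group:Developers" },
  ])
);
```

Per-membership settings or privileges from the link document would need a home too, for example as objects instead of plain ids in the arrays.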
Part 2.
You can store just the ids in these properties and then use views to "join" the other data. I was also thinking about another approach (for my use case, but yours is similar):
1) Each group contains only the ids of its users, so no views are needed.
2) You create a view of each user's contacts, i.e. for each user, all users with whom they share a group.
3) Replicate this view to the client app.
When the user opens a group, values such as names and pictures of contacts are taken from this local "dictionary".
This approach can save some traffic.
Please let me know what you think, because right now I'm working on designing the architecture of my solution. Thank you!

CouchDB - Get custom fields within _users for replication filtering

I am developing a simple client for Android which fetches data from a CouchDB database. There will be only one database for all users. The data pull-replicated is filtered by a JS function. Such function (simplified) would be like this:
function(doc,req) {
if (!doc.type || doc.type !='item') { return false; }
if (doc.foo && ... && req.userCtx.bar.indexOf(doc.foo) != -1) { return true; }
...
}
As I have read in the official documentation, _users is a perfect place to set custom fields related to the user. So I did, as you can see in the code above (see the req.userCtx.bar array).
The problem I am facing is that the object/JSON req.userCtx only contains these fields: db, name and roles.
1. What would be a good alternative to my idea? I am a little bit stuck right now at this point.
2. How can I retrieve the user's data (all fields, official and custom)?
3. Is it correct to pass a large array as a filter parameter?
NOTE
I am thinking of a messy alternative of adding an array-field in every item which will contain the list with all users allowed to pull such item although I have the feeling that there must be another way.
Saving user data in _users is interesting because only the user or an admin can read a user's document.
However, as you've found out, that doesn't mean that all user data is available to the userCtx object. All you get is the user's name and roles array. Can you make do with roles?
To retrieve all of the user's data, you should fetch the user's document from the _users database. You can do that with a GET request on http://localhost:5984/_users/org.couchdb.user:[USER].
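A small sketch of building that URL, percent-encoding the document id so that user names with special characters stay in a single path segment. The helper name userDocUrl and the user name alice are made up for illustration.

```javascript
// Build the URL of a user's document in the _users database.
// The id is "org.couchdb.user:" + name; it is percent-encoded as one path segment.
function userDocUrl(base, username) {
  return `${base}/_users/${encodeURIComponent("org.couchdb.user:" + username)}`;
}

console.log(userDocUrl("http://localhost:5984", "alice"));
// http://localhost:5984/_users/org.couchdb.user%3Aalice
```

The request must be authenticated as that user or as an admin, since nobody else can read the document.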
To know what would be an appropriate solution to your problem, we'd need quite a bit more info. For instance, looking at your code, it seems you designed that filter with the intention of restricting replication to documents listed as being visible to the user. However, you can't really lock down CouchDB in a way that replication works, and the user doesn't have read access to the entire database. You really need one db per user for this to work.

CouchDB replication strategy with dynamic groups of users

This is the situation:
We have a series of users who share some documents. The documents they can share might change throughout the day, so can the documents themselves (changes and deletions). The users can change some information on the documents.
E.g.
Users | Documents
A | X
A | Y
A | Z
B | X
B | Z
C | Y
Possible groups: A+C, A+B
The server on CouchDB is a replica of a SQL Server DB with this data, an ETL takes care of managing changes on CouchDB. However, the CouchDB database is replicated on each user phone via PouchDB.
The goal:
To replicate changes and deletions accordingly.
What we've tried:
1) We figured we'd structure our documents with a list of the users that can access them. Each document would have a "Users" array, and a filter in the design document would take care of the replication to the clients. Unfortunately, document deletions and document changes that no longer pass the filter (e.g. a user is removed from the array) are not present in the filtered _changes feed, so they cannot be replicated to the clients.
2) database per user. This is not possible, because users need to see each others work on the documents (they share them)
3) database per group of users. Pretty much the same problem as the first solution, but worse. In fact:
- groups of users can change and no longer be present: how do we reflect that client-side?
- a document can shift to a new group: it will have to be redownloaded from scratch. This greatly increases the download size
- the same document can be in more than one group! (see example above)
- each client would have to know which groups she is in every time she logs in, and replicate multiple databases. Then on the return trip you'd have to know which databases the document was present in
Is there a recipe for this situation? Am I missing an obvious solution?
EDIT
Partial solution for case 1:
localDB.sync(remoteDB, {
  live: true,
  retry: true,
  filter: 'app/by_user',
  query_params: { "agente": agent }
})
.on('paused', function(info){
  console.log("paused");
  localDB.allDocs().then(function(docs){
    console.log("allDocs");
    docs.rows.forEach(function(row){
      console.log(row);
      remoteDB.get(row.id)
      .then(function(doc){
        if(doc.Agents.indexOf(agent) < 0){
          localDB.remove(doc);
        }
      });
    });
  });
})
.on('change', function(result){
  console.log("change!");
  result.change.docs.forEach(function(change) {
    if(!change.deleted){
      $rootScope.$apply(function(){
        $rootScope.$broadcast('upsert', change);
      });
    }
  });
});
Each remove() is giving me a 409 (conflict), and rightfully so. Is there a way to tell Pouch "no longer consider this as replicable and just remove it from my DB?"
(3) Seems like the simplest solution to me, i.e. the "database per role" solution.
I think your difficulty stems from trying to manage permissions inside the documents themselves (and then using filtering replication). When you do that, you are basically trying to mirror CouchDB's permission system inside your documents, which is going to cause headaches.
Why not create a database per role, and assign roles to users using the normal _users database? If roles change, then users will lose or gain access to a set of documents. You would need to have server endpoints to handle the role-shuffling, or you would need to set up separate "admin" databases with special privileges, where users can change the roles.
Then on the client side, you can either replicate from multiple CouchDB databases into multiple PouchDB databases (and then collate the results together yourself), or into a single PouchDB (probably a bad idea if you need to sync bidirectionally). Obviously you would need an initial step where you determine which databases the user has access to, but that's a small downside in my opinion.
Then if the user loses access to a document, they will simply get normal 401 errors during replication (which will show up in the 'denied' event during live replication). No need for ddocs or filtered replication - much simpler!
We arrived at the conclusion that:
1) our use-case might not be what CouchDB is good for
2) we value our mental health. After almost a month struggling with this problem we'd rather try and fail
3) documents are relatively inexpensive, so even if they stay on the user's phone that won't cause any major distress. If the data builds up too much they can simply clear the data and start fresh
Solution:
1) Keep the architecture as in point 1.
2) After each 'paused' event, compare the local docs with the remote docs; if the remote doc doesn't pass the filter, remove it from the UI. Should there be a way to remove only the local document, we'd be very interested in upgrading to that logic.
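The comparison in step 2 could be sketched roughly as below. It assumes the Agents array from the earlier snippet and a plain object of remote documents keyed by id; treating a document that is missing remotely as hidden is my own assumption, and idsToHide is a hypothetical helper name.

```javascript
// Return the ids of local documents that should be hidden from the UI:
// those whose remote copy no longer lists `agent` in its Agents array.
// Assumption: a document that is missing remotely is hidden as well.
function idsToHide(localIds, remoteDocsById, agent) {
  return localIds.filter((id) => {
    const remote = remoteDocsById[id];
    return !remote || !Array.isArray(remote.Agents) || !remote.Agents.includes(agent);
  });
}

const remote = {
  X: { _id: "X", Agents: ["A", "B"] },
  Y: { _id: "Y", Agents: ["C"] },
};
console.log(idsToHide(["X", "Y", "Z"], remote, "A")); // [ 'Y', 'Z' ]
```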
1) still sounds like the simplest approach to me.
I don't know PouchDB very well, but in plain CouchDB, the problem with changes for deleted documents can be worked around by keeping extra attributes on the deleted document, using your own custom delete function.
I mean, a delete is just an update that sets the _deleted attribute to true.
So, instead of deleting documents directly with the normal CouchDB DELETE on the document, you can create an update function like this:
function(doc,req){
  // optional acls for deleting doc.. doc is owned by req.userCtx.name
  // doc.users are users already granted to work with this doc
  return [{
    "_id" : doc._id,
    "_rev": doc._rev,
    "_deleted":true,
    "users": doc.users
  },"Ok doc deleted"];
}
Furthermore, using rewrite rules, this update function can be called even when an HTTP DELETE request is submitted (not only on PUT or POST). This way your delete behaviour becomes totally transparent to the client, and you delete in a way that is more useful for your use case.
The Smileupps Chatty couchapp tutorial app uses this approach: extended deletes for different document types are performed within user/drop.js, profile/drop.js, chat/drop.js files

Basic CouchDB Queries

I've never worked with a database before, but I chose CouchDB because I needed a JSON database, and HTTP queries seemed kind of simple. However, the documentation assumes a level of knowledge I just don't have.
Assuming I have a database called 'subjects', it seems I can access the json by using GET on
http://localhost:5984/subjects/c6604f65029f1a6a5d565da029001f4c
However beyond that I'm stuck. Ideally I want to be able to:
Access a list of all the keys in the database (not their values)
Access an individual element by its key
Do I need to use views for this? Or can I just set fields in my GET request? Can someone give me a complete example of the request they'd use? Please don't link to the CouchDB documentation, it really hasn't helped me so far.
Views can be used to fetch the data.
1) To get all keys from the database, you can use the view below:
function(doc) {
if (doc.type=="article")
emit(doc._id,null); //emit(key,value), if you have any other field as key then specify as doc.key e.g doc.
}
You can access this view from browser using below URL
http://<ipaddress>:<port>/databasename/_design/designdocumentname/_view/viewname
e.g :
http://<ipaddress>:<port>/article/_design/articlelist/_view/articlelist
Here article is the database name, and articlelist is the name of both the design document and the view.
2) In order to access individual document by key
Below view will return all the articles belonging to a particular department
function(doc) {
if(doc.type == 'article' ) {
emit([doc.departmentname], doc);
}
}
Query this view based on the "department name"
e.g: Get all the articles belonging to "IBU3" department
http://<ipaddress>:<port>/department/_design/categoryname/_view/categoryname?key=[%22IBU3%22]
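For the first goal (listing every key without the values), a view may not even be needed: the built-in _all_docs endpoint, e.g. GET http://localhost:5984/subjects/_all_docs, returns the id and revision of every document without the bodies. A minimal sketch of pulling the ids out of such a response follows; the sample body is made-up data in the shape CouchDB returns, and idsFromAllDocs is a hypothetical helper name.

```javascript
// Extract the document ids from an _all_docs response body.
function idsFromAllDocs(body) {
  return body.rows.map((row) => row.id);
}

// Made-up sample in the shape returned by GET /subjects/_all_docs.
const sample = {
  total_rows: 2,
  offset: 0,
  rows: [
    { id: "doc-a", key: "doc-a", value: { rev: "1-abc" } },
    { id: "doc-b", key: "doc-b", value: { rev: "1-def" } },
  ],
};

console.log(idsFromAllDocs(sample)); // [ 'doc-a', 'doc-b' ]
```

Each id can then be fetched individually with a plain GET on /subjects/<id>, as in the question.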
